Multi-Turn Sessions¶
So far, each test is independent—the agent has no memory between tests. Sessions let multiple tests share conversation history, simulating real multi-turn interactions.
Why Sessions?¶
Real agents don't answer single questions. Users have conversations:
- "What's my checking account balance?"
- "Transfer $200 to savings" ← Requires remembering the accounts
- "What are my new balances?" ← Requires remembering the transfer
Without sessions, test 2 would fail—the agent doesn't know which accounts were discussed.
Defining a Session¶
Use the @pytest.mark.session marker:
import pytest
from pytest_aitest import Agent, Provider, MCPServer
banking_server = MCPServer(command=["python", "banking_mcp.py"])
banking_agent = Agent(
name="banking",
provider=Provider(model="azure/gpt-5-mini"),
mcp_servers=[banking_server],
)
@pytest.mark.session("banking-chat")
class TestBankingConversation:
"""Tests run in order, sharing conversation history."""
async def test_initial_query(self, aitest_run):
"""First message - establishes context."""
result = await aitest_run(banking_agent, "What's my checking account balance?")
assert result.success
assert result.tool_was_called("get_balance")
async def test_followup(self, aitest_run):
"""Second message - uses context from first."""
result = await aitest_run(banking_agent, "Transfer $200 to savings")
assert result.success
# Agent remembers we were talking about checking
assert result.tool_was_called("transfer")
async def test_verification(self, aitest_run):
"""Third message - builds on full conversation."""
result = await aitest_run(banking_agent, "What are my new balances?")
assert result.success
Key points:
- Tests in a session run in order (top to bottom)
- Each test sees the full conversation history from previous tests
Not compatible with pytest-xdist
Sessions require sequential test execution to maintain conversation order.
Don't use -n auto or other parallel execution with session tests.
- The session name (
"banking-chat") groups related tests
Session Context Flow¶
test_initial_query
User: "What's my checking account balance?"
Agent: "Your checking balance is $1,500.00..."
↓ context passed to next test
test_followup
[Previous messages included]
User: "Transfer $200 to savings"
Agent: "Done! Transferred $200 from checking to savings..."
↓ context passed to next test
test_verification
[All previous messages included]
User: "What are my new balances?"
Agent: "Checking: $1,300, Savings: $3,200..."
When to Use Sessions¶
| Scenario | Use Session? |
|---|---|
| Single Q&A tests | No |
| Multi-turn conversation | Yes |
| Workflow with multiple steps | Yes |
| Independent feature tests | No |
| Testing context retention | Yes |
Sessions with Parametrize¶
You can combine sessions with model comparison:
@pytest.mark.session("shopping-flow")
@pytest.mark.parametrize("model", ["gpt-5-mini", "gpt-4.1"])
class TestShoppingWorkflow:
"""Test the same conversation flow with different models."""
async def test_browse(self, aitest_run, model, shopping_server):
agent = Agent(
name=f"shop-{model}",
provider=Provider(model=f"azure/{model}"),
mcp_servers=[shopping_server],
)
result = await aitest_run(agent, "Show me running shoes")
assert result.success
async def test_select(self, aitest_run, model, shopping_server):
agent = Agent(
name=f"shop-{model}",
provider=Provider(model=f"azure/{model}"),
mcp_servers=[shopping_server],
)
result = await aitest_run(agent, "I'll take the Nike ones")
assert result.success
This creates two separate session flows:
shopping-flow[gpt-5-mini]: browse → select (with gpt-5-mini)shopping-flow[gpt-4.1]: browse → select (with gpt-4.1)
The report shows each session as a complete flow with all turns visualized.
Next Steps¶
- Comparing Configurations — Pattern for parametrized tests
- Generate Reports — Understand report output
📁 Real Example: test_sessions.py — Banking workflow with session continuity