Interactivity
Overview
In some cases you may wish to introduce user interaction into the implementation of tasks. For example, you may wish to:
- Confirm consequential actions like requests made to web services
- Prompt the model dynamically based on the trajectory of the evaluation
- Score model output with human judges
The input_screen() function provides a context manager that temporarily clears the task display for user input. Note that prompting the user is a synchronous operation that pauses other activity within the evaluation (pending model requests or subprocesses will continue to execute, but their results won't be processed until the input is complete).
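For instance, a consequential action could be guarded behind a confirmation prompt. Here's a minimal sketch using Rich's Confirm class (the send_request() function is hypothetical, standing in for whatever action you want to confirm):

```python
from inspect_ai.util import input_screen
from rich.prompt import Confirm

with input_screen() as console:
    # ask the user before performing the consequential action
    if Confirm.ask("Send request to the web service?", console=console):
        send_request()  # hypothetical: the action being confirmed
```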
Example
Before diving into the details of how to add interactions to your tasks, you might want to check out the Intervention Mode example.
Intervention mode is a prototype of an Inspect agent with human intervention, meant to serve as a starting point for evaluations which need these features (e.g. manual open-ended probing). It implements the following:
- Sets up a Linux agent with bash() and python() tools.
- Prompts the user for a starting question for the agent.
- Displays all messages and prompts to approve tool calls.
- When the model stops calling tools, prompts the user for the next action (i.e. continue generating, ask a new question, or exit the task); see the sketch below.
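As an illustration, the "next action" prompt in the last step might look roughly like the following sketch (the choice names are illustrative, not taken from the example's actual code):

```python
from inspect_ai.util import input_screen
from rich.prompt import Prompt

with input_screen() as console:
    # ask the user what to do once the model stops calling tools
    action = Prompt.ask(
        "Next action",
        choices=["continue", "ask", "exit"],
        default="continue",
    )
```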
After reviewing the example and the documentation below you’ll be well equipped to write your own custom interactive evaluation tasks.
Input Screen
You can prompt the user for input at any point in an evaluation using the input_screen() context manager, which clears the normal task display and provides access to a Console object for presenting content and asking for user input. For example:
```python
from inspect_ai.util import input_screen

with input_screen() as console:
    console.print("Some preamble text")
    input = console.input("Please enter your name: ")
```
The console object provided by the context manager is from the Rich Python library used by Inspect, and has many other capabilities beyond simple text input. Read on to learn more.
Prompts
Rich includes Prompt and Confirm classes with additional capabilities including default values, choice lists, and re-prompting. For example:
```python
from inspect_ai.util import input_screen
from rich.prompt import Prompt

with input_screen() as console:
    name = Prompt.ask(
        "Enter your name",
        choices=["Paul", "Jessica", "Duncan"],
        default="Paul"
    )
```
The Prompt class is designed to be subclassed for more specialized inputs. The IntPrompt and FloatPrompt classes are built-in, but you can also create your own more customised prompts (the Confirm class is another example of this). See the prompt.py source code for additional details.
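As a sketch of what subclassing looks like, here is a custom prompt that validates its input by overriding process_response() (the EmailPrompt class is our own illustration, not part of Rich):

```python
from inspect_ai.util import input_screen
from rich.prompt import InvalidResponse, Prompt

class EmailPrompt(Prompt):
    # illustrative subclass: re-prompts until the input looks like an email
    def process_response(self, value: str) -> str:
        value = value.strip()
        if "@" not in value:
            raise InvalidResponse("[red]Please enter a valid email address[/red]")
        return value

with input_screen() as console:
    email = EmailPrompt.ask("Enter your email", console=console)
```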
Trace Mode
When introducing interactions it's often useful to see a trace of message activity for additional context. You can do this via the --trace CLI option (or the trace parameter of the eval() function). For example:
```bash
$ inspect eval theory.py --trace
```
In trace mode, all messages exchanged with the model are printed to the terminal (tool output is truncated at 100 lines).
Note that enabling trace mode automatically sets max_tasks and max_samples to 1, as otherwise messages from concurrently running samples would be interleaved together in an incoherent jumble.
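When driving the evaluation from Python, the equivalent is to pass the trace parameter mentioned above, e.g.:

```python
from inspect_ai import eval

# run the task with trace mode enabled
eval("theory.py", trace=True)
```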
If you want to add your own trace content, use the trace_enabled() function to check whether trace mode is currently enabled and the trace_panel() function to output a panel that is visually consistent with other trace mode output. For example:
```python
from inspect_ai.util import trace_enabled, trace_panel

if trace_enabled():
    trace_panel("My Panel", content="Panel content")
```
Progress
Evaluations with user input alternate between asking for input and displaying task progress. By default, the normal task status display is shown when a user input screen is not active.
However, if your evaluation is dominated by user input with very short model interactions in between, the task display flashing on and off might prove distracting. For these cases, you can specify the transient=False option to indicate that the input screen should be shown at all times. For example:
```python
with input_screen(transient=False) as console:
    console.print("Some preamble text")
    input = console.input("Please enter your name: ")
```
This will result in the input screen staying active throughout the evaluation. A small progress indicator will be shown whenever user input isn’t being requested so that the user knows that the evaluation is still running.
Header
You can add a header to your console input via the header parameter. For example:
```python
with input_screen(header="Input Request") as console:
    input = console.input("Please enter your name: ")
```
The header option is a useful way to delineate user input requests (especially when switching between input display and the normal task display). You might also prefer to create your own heading treatments; under the hood, the header option calls console.rule() with a blue bold treatment:

```python
console.rule(f"[blue bold]{header}[/blue bold]", style="blue bold")
```
You can also use the Layout primitives (columns, panels, and tables) to present your input user interface.
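For instance, here is a sketch that frames an approval request in a Panel (the prompt text is illustrative):

```python
from inspect_ai.util import input_screen
from rich.panel import Panel

with input_screen(header="Input Request") as console:
    # frame the request so it stands out from surrounding output
    console.print(Panel("The model wants to run: ls /usr/bin",
                        title="Tool Call", border_style="blue"))
    approve = console.input("Approve? (y/n): ")
```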
Formatting
The console.print() method supports formatting using simple markup. For example:
```python
with input_screen() as console:
    console.print("[bold red]alert![/bold red] Something happened")
```
See the documentation on console markup for additional details.
You can also render markdown directly, for example:
```python
from inspect_ai.util import input_screen
from rich.markdown import Markdown

with input_screen() as console:
    console.print(Markdown('The _quick_ brown **fox**'))
```
Layout
Rich includes Columns, Table and Panel classes for more advanced layout. For example, here is a simple table:
```python
from inspect_ai.util import input_screen
from rich.table import Table

with input_screen() as console:
    table = Table(title="Tool Calls")
    table.add_column("Function", justify="left", style="cyan")
    table.add_column("Parameters", style="magenta")
    table.add_row("bash", "ls /usr/bin")
    table.add_row("python", "print('foo')")
    console.print(table)
```
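Columns work similarly. Here's a sketch that lays out several panels side by side (the panel contents are illustrative):

```python
from inspect_ai.util import input_screen
from rich.columns import Columns
from rich.panel import Panel

with input_screen() as console:
    # render one panel per option, arranged in columns
    panels = [Panel(f"Option {i}", title=f"#{i}") for i in range(1, 4)]
    console.print(Columns(panels))
```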