inspect eval-retry
Retry failed evaluation(s)
Usage
inspect eval-retry [OPTIONS] LOG_FILES...
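As a rough illustration of the usage line above, the following sketch assembles an `inspect eval-retry` command line from Python. The helper `build_eval_retry_command` and the log file path `logs/example-task.eval` are hypothetical names introduced for this example; only the command name and the `--max-connections`, `--max-retries`, and `--log-dir` options come from this page.

```python
import shlex
from typing import List, Optional


def build_eval_retry_command(
    log_files: List[str],
    max_connections: Optional[int] = None,
    max_retries: Optional[int] = None,
    log_dir: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for `inspect eval-retry` (hypothetical helper)."""
    if not log_files:
        raise ValueError("at least one log file is required")
    cmd = ["inspect", "eval-retry"]
    # Options are only emitted when set, so the CLI defaults apply otherwise.
    if max_connections is not None:
        cmd += ["--max-connections", str(max_connections)]
    if max_retries is not None:
        cmd += ["--max-retries", str(max_retries)]
    if log_dir is not None:
        cmd += ["--log-dir", log_dir]
    # LOG_FILES... are positional and go last, matching the usage line.
    return cmd + list(log_files)


# The log file path below is a placeholder, not a real log.
cmd = build_eval_retry_command(
    ["logs/example-task.eval"], max_connections=5, log_dir="./logs"
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where the `inspect` CLI is installed.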
Options
Name | Type | Description | Default |
---|---|---|---|
--max-samples | integer | Maximum number of samples to run in parallel (default is running all samples in parallel) | None |
--max-tasks | integer | Maximum number of tasks to run in parallel (default is 1) | None |
--max-subprocesses | integer | Maximum number of subprocesses to run in parallel (default is os.cpu_count()) | None |
--max-sandboxes | integer | Maximum number of sandboxes (per-provider) to run in parallel. | None |
--no-sandbox-cleanup | boolean | Do not clean up sandbox environments after the task completes | False |
--fail-on-error | float | Threshold of sample errors to tolerate (by default, evals fail when any error occurs). A value between 0 and 1 sets a proportion; a value greater than 1 sets a count. | None |
--no-fail-on-error | boolean | Do not fail the eval if errors occur within samples (instead, continue running other samples) | False |
--no-log-samples | boolean | Do not include samples in the log file. | False |
--log-images / --no-log-images | boolean | Include base64 encoded versions of filename or URL based images in the log file. | True |
--log-buffer | integer | Number of samples to buffer before writing log file. If not specified, an appropriate default for the format and filesystem is chosen (10 in almost all cases, 100 for JSON logs on remote filesystems). | None |
--no-score | boolean | Do not score model output (use the inspect score command to score output later) | False |
--no-score-display | boolean | Do not display scoring metrics in realtime. | False |
--max-connections | integer | Maximum number of concurrent connections to Model API (defaults to 10) | None |
--max-retries | integer | Maximum number of times to retry request (defaults to 5) | None |
--timeout | integer | Request timeout (in seconds). | None |
--log-level-transcript | choice (debug \| trace \| http \| info \| warning \| error \| critical) | Set the log level of the transcript (defaults to ‘info’) | info |
--log-level | choice (debug \| trace \| http \| info \| warning \| error \| critical) | Set the log level (defaults to ‘warning’) | warning |
--log-dir | text | Directory for log files. | ./logs |
--display | choice (full \| conversation \| rich \| plain \| none) | Set the display type (defaults to ‘full’) | full |
--debug | boolean | Wait to attach debugger | False |
--debug-port | integer | Port number for debugger | 5678 |
--debug-errors | boolean | Raise task errors (rather than logging them) so they can be debugged. | False |
--help | boolean | Show this message and exit. | False |
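
The `--fail-on-error` and `--no-fail-on-error` descriptions above can be read as a small decision rule. The sketch below is one possible reading, not Inspect's actual implementation: the function name `should_fail` is hypothetical, and two details are assumptions the table does not settle, namely that the eval fails only when errors strictly exceed the threshold, and that a value of exactly 1 is treated as a count.

```python
from typing import Optional


def should_fail(
    errors: int,
    total: int,
    fail_on_error: Optional[float] = None,
    no_fail_on_error: bool = False,
) -> bool:
    """Decide whether an eval counts as failed (illustrative sketch only)."""
    if no_fail_on_error:
        # --no-fail-on-error: sample errors never fail the eval.
        return False
    if fail_on_error is None:
        # Default behavior: any sample error fails the eval.
        return errors > 0
    if fail_on_error <= 0:
        raise ValueError("threshold must be positive")
    if fail_on_error < 1:
        # Proportion of samples; strict comparison is an assumption.
        return total > 0 and errors / total > fail_on_error
    # Count of samples; treating exactly 1 as a count is an assumption.
    return errors > fail_on_error


print(should_fail(3, 100))                         # default: any error fails
print(should_fail(3, 100, fail_on_error=0.05))     # 3% of samples vs. 5% tolerated
print(should_fail(6, 100, fail_on_error=5))        # 6 errors vs. 5 tolerated
print(should_fail(6, 100, no_fail_on_error=True))  # errors never fail the eval
```

Under these assumptions, `--fail-on-error 0.05` tolerates errors in up to 5% of samples, while `--fail-on-error 5` tolerates up to five errored samples regardless of the total.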