feat: add verify-bug agent for EagleEyes scanner validation
The EagleEyes scanner has a 69% false-positive rate (verified across 72 bugs in 8 apps). We need an automated way to verify scanner-flagged bugs via live API execution and, critically, to explain why false positives occur so we can improve the scanner.
- verify_bug_agent under cortex/agents/: a read-only agent that reproduces scanner-reported bugs via real API calls
- TestAndFixToolProvider (same tools as the fix agent: execute_current_action, execute_parent_action, execute_curl_action), so parent dependency resolution works out of the box
- VerifyBugResponse with:
  - verdict: REAL_BUG / FALSE_POSITIVE / INCONCLUSIVE
  - root_cause_class: CONFIRMED_BUG / FRAMEWORK_ABSORBS / API_LENIENT / CLIENT_HANDLES / SCANNER_WRONG / CODE_FLOW_MISSED
  - evidence: actual API response proving the verdict
  - explanation: why static analysis and runtime behavior disagree
  - scanner_feedback: one-line actionable fix for the scanner
- VerifyBugWorkflow registered as WorkflowKind.VerifyBug
- enable_file_edit=False, no git/PR logic

{
  "workflow": "verify-bug",
  "app_name": "gmail",
  "action_name": "GMAIL_FETCH_EMAILS",
  "bug_description": "URL encoding: label_ids containing @ character will fail",
  "scanner_category": "URL Encoding"
}
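
To make the VerifyBugResponse fields concrete, here is a minimal standard-library sketch of how the verdict and root-cause values could be modeled and parsed. The class name `VerifyBugResponseSketch`, the `parse_response` helper, and the placeholder strings are assumptions for illustration; the actual model in this PR may be defined differently (for example as a Pydantic model).

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(str, Enum):
    REAL_BUG = "REAL_BUG"
    FALSE_POSITIVE = "FALSE_POSITIVE"
    INCONCLUSIVE = "INCONCLUSIVE"


class RootCauseClass(str, Enum):
    CONFIRMED_BUG = "CONFIRMED_BUG"
    FRAMEWORK_ABSORBS = "FRAMEWORK_ABSORBS"
    API_LENIENT = "API_LENIENT"
    CLIENT_HANDLES = "CLIENT_HANDLES"
    SCANNER_WRONG = "SCANNER_WRONG"
    CODE_FLOW_MISSED = "CODE_FLOW_MISSED"


@dataclass(frozen=True)
class VerifyBugResponseSketch:
    """Hypothetical stand-in for VerifyBugResponse (field names from the PR text)."""
    verdict: Verdict
    root_cause_class: RootCauseClass
    evidence: str
    explanation: str
    scanner_feedback: str


def parse_response(payload: dict) -> VerifyBugResponseSketch:
    """Convert raw strings to enums; unknown values raise ValueError."""
    return VerifyBugResponseSketch(
        verdict=Verdict(payload["verdict"]),
        root_cause_class=RootCauseClass(payload["root_cause_class"]),
        evidence=payload["evidence"],
        explanation=payload["explanation"],
        scanner_feedback=payload["scanner_feedback"],
    )


# Placeholder text only; real evidence would be the captured API response.
example = parse_response({
    "verdict": "FALSE_POSITIVE",
    "root_cause_class": "API_LENIENT",
    "evidence": "<captured API response>",
    "explanation": "<why static analysis and runtime behavior disagree>",
    "scanner_feedback": "<one-line fix for the scanner>",
})
```

Using enums rather than free-form strings means a misspelled verdict or root-cause class fails at parse time instead of silently skewing false-positive statistics.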
FALSE_POSITIVE with API_LENIENT root cause and evidence showing Gmail accepts an unencoded @

- VerifyBugConfig extends TestAndFixWithFixInstructionConfig so TestAndFixToolProvider accepts it without type changes. fix_instruction holds the bug description, with a bug_description property alias.
- Common false-positive sources are listed for the agent (_validate_response leniency, model_dump forwarding, HTTP client auto-encoding) so it knows what to look for.
- WorkflowKind.VerifyBug enum value in CortexExecution.api_type.

🤖 Generated with Claude Code
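
The config reuse described above (subclassing so the existing tool provider accepts the new config, plus a read-only alias over fix_instruction) can be sketched as follows. This is an illustrative sketch only: `BaseConfigSketch` stands in for TestAndFixWithFixInstructionConfig, `provider_accepts` stands in for a provider typed against the base config, and every field other than `fix_instruction` is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BaseConfigSketch:
    """Hypothetical stand-in for TestAndFixWithFixInstructionConfig."""
    app_name: str
    action_name: str
    fix_instruction: str


@dataclass
class VerifyBugConfigSketch(BaseConfigSketch):
    """Hypothetical stand-in for VerifyBugConfig."""
    scanner_category: str = ""

    @property
    def bug_description(self) -> str:
        # Alias: the bug description is stored in the inherited field.
        return self.fix_instruction


def provider_accepts(config: BaseConfigSketch) -> str:
    """Stand-in for a tool provider typed against the base config."""
    return f"{config.app_name}:{config.action_name}"


cfg = VerifyBugConfigSketch(
    app_name="gmail",
    action_name="GMAIL_FETCH_EMAILS",
    fix_instruction="URL encoding: label_ids containing @ character will fail",
    scanner_category="URL Encoding",
)
```

Because the subclass adds only a defaulted field and a property, any code written against the base type keeps working unchanged, which is the trade-off the description points to: no type changes in the provider, at the cost of a field name that no longer matches its meaning.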