PR Custody

feat(create_app): dedup'd Linear tickets for auth/scope failures

@zen-agent · checks n/a · checks… · zen/auth-failure-linear-tickets-s84mxq → next · 6 files · +965 −42 · updated 2w ago

▸Description· 1 comment

Why

Auth/scope failures are the #1 drop-off in CreateApp — affecting plaid (327), twilio (253), microsoft_power_bi (63), daytona (60), highlevel (53), zendesk, paypal, neo4j, langsmith, metaads, salesforce, etc. The vast majority surface at CurlTester: any endpoint that 401s/403s there is filtered out at create_app.py:194 and never reaches ActionBuilder or TestAndFix, so the action files never get written and the PR ships without them. There's no ticket for support, no dedup across runs, and no signal that scope work would unlock dozens of dropped actions.

This PR builds the detection + ticketing half of the recovery flow: every CreateApp run that hits auth-related curl failures opens (or comments on) one Linear ticket per (toolkit, scope_signature) group, assigned to Manas for triage. The re-resolution half (re-run CurlTester on only the dropped subset after scopes are granted) is a follow-up.
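The open-or-comment behavior can be sketched roughly as follows. This is a minimal illustration against an in-memory stand-in, not the Linear API and not the code in this PR: `FakeTracker`, `FakeIssue`, `dedup_marker`, `open_or_comment`, the marker format, and the title wording are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeIssue:
    """In-memory stand-in for a tracker issue (hypothetical, not a Linear type)."""
    title: str
    description: str
    comments: list = field(default_factory=list)
    is_open: bool = True


class FakeTracker:
    """Minimal in-memory tracker used only to illustrate the dedup flow."""

    def __init__(self) -> None:
        self.issues: list = []

    def find_open_issue(self, marker: str) -> Optional[FakeIssue]:
        # Only open issues count; a closed ticket should not absorb new failures.
        for issue in self.issues:
            if issue.is_open and marker in issue.description:
                return issue
        return None


def dedup_marker(toolkit: str, scope_signature: str) -> str:
    # Assumed marker shape: an HTML comment that stays invisible when rendered.
    return f"<!-- auth-failure-dedup:{toolkit}:{scope_signature} -->"


def open_or_comment(tracker: FakeTracker, toolkit: str, scope_signature: str,
                    actions: list, run_id: str) -> FakeIssue:
    """Comment on the open ticket for this group, or create one if none exists."""
    marker = dedup_marker(toolkit, scope_signature)
    body = f"Run {run_id}: blocked actions: {', '.join(sorted(actions))}"
    existing = tracker.find_open_issue(marker)
    if existing is not None:
        existing.comments.append(body)
        return existing
    issue = FakeIssue(
        title=f"[auth-scope][{toolkit}] {len(actions)} actions blocked: {scope_signature}",
        description=f"{body}\n\n{marker}",
    )
    tracker.issues.append(issue)
    return issue
```

Because the lookup key lives in the description rather than the title, a second run with the same (toolkit, scope_signature) pair lands as a comment even if the title wording changes between runs.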

Note: We deliberately hook at CurlTester rather than TestAndFix. TestAndFix would catch the tail (write-scope-specific failures, test_all_actions=True on pre-existing actions) but miss the bulk of the signal that's already been filtered. Watchdog catches the post-merge tail anyway.

What

cortex/common/auth_failure_grouper.py — pure function that groups pre-classified auth failures (dict[action_name, error_text]) by (toolkit, normalized_scope_signature). Scope signature is extracted from the error blob via regex (handles missing_scopes: [...], "scope":"...", required scopes: ..., insufficient_scope: ..., scope=...). Falls back to auth-401 / auth-403 / auth-unknown when no scope is extractable.
cortex/common/auth_failure_reporter.py — classifies each failed CurlTesterResponse as auth-related by regex-scanning summary + curl_output for HTTP 401/403, unauthorized, forbidden, insufficient_scope, missing_scope, permission denied, invalid token, authentication failed. Looks up an existing open Linear issue by a dedup-marker (HTML comment embedded in the description), and either comments on it or creates a new one assigned to Manas (UUID 310cdff7-cdf6-400a-a695-91d43a0dfc81). Never raises — exceptions are logged so a Linear blip can't fail an otherwise-successful CreateApp run.
common/linear_utils.py — two helpers: find_open_issue_by_dedup_marker (Linear GraphQL issues query with description.contains filter, scoped to non-completed/cancelled states) and add_comment_to_issue_by_uuid.
cortex/workflows/create_app.py — single report_auth_failures(...) call right after CurlTester completes (between CurlTester and ActionBuilder).
35 unit tests across the grouper + reporter covering scope extraction patterns, fallbacks, normalization, grouping, dedup-key stability, and the CurlTester-stage classifier (positive cases for 401/403 in summary, unauthorized/forbidden keywords, insufficient_scope in curl_output; negative cases for 404/400/422/5xx and port-numbers-in-URL false positives).
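As a rough sketch of the grouping step described above: the regexes, helper names (`extract_scopes`, `scope_signature`, `group_auth_failures`), and normalization rules below are assumptions for illustration and cover only three of the listed error shapes. The patterns in `auth_failure_grouper.py` may well differ.

```python
import re
from collections import defaultdict

# Hypothetical patterns for three of the error shapes; tried in order.
_PATTERNS = [
    re.compile(r"""missing_scopes["']?\s*:\s*\[([^\]]*)\]""", re.IGNORECASE),
    re.compile(r'"scope"\s*:\s*"([^"]*)"', re.IGNORECASE),
    # Comma-separated tokens only, so trailing prose is not captured.
    re.compile(r"\bscope=([A-Za-z0-9_.:/-]+(?:,[A-Za-z0-9_.:/-]+)*)", re.IGNORECASE),
]


def extract_scopes(error_text: str) -> list:
    """Return a sorted, lowercased, de-duplicated scope list, or [] if none found."""
    for pattern in _PATTERNS:
        match = pattern.search(error_text)
        if not match:
            continue
        raw = re.split(r"[,\s]+", match.group(1))
        scopes = {token.strip("'\"").lower() for token in raw}
        scopes.discard("")
        if scopes:
            return sorted(scopes)
    return []


def scope_signature(error_text: str) -> str:
    """Normalized scope list, else a coarse auth-401 / auth-403 / auth-unknown bucket."""
    scopes = extract_scopes(error_text)
    if scopes:
        return ",".join(scopes)
    status = re.search(r"\b(401|403)\b", error_text)
    return f"auth-{status.group(1)}" if status else "auth-unknown"


def group_auth_failures(toolkit: str, failures: dict) -> dict:
    """Group {action_name: error_text} by (toolkit, scope_signature)."""
    groups = defaultdict(list)
    for action_name, error_text in failures.items():
        groups[(toolkit, scope_signature(error_text))].append(action_name)
    return {key: sorted(names) for key, names in groups.items()}
```

Sorting and lowercasing the extracted scopes is what keeps the signature stable when two endpoints report the same scopes in a different order or casing.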

How to Test

make fmt && make chk — passes (ruff format/import/lint + pyrefly all green)
uv run pytest cortex/tests/test_common/test_auth_failure_grouper.py cortex/tests/test_common/test_auth_failure_reporter.py -v — 35/35 pass
Full sweep: uv run pytest cortex/tests/test_common cortex/tests/test_workflows -q — 202 passed
E2E: triggered CreateApp on freshdesk (workflow_id=w9xoyoio) and typefully (workflow_id=gnv058sv) with integrator_branch=zen/auth-failure-linear-tickets-s84mxq. Both have historical CurlTester auth failures — expect 1-3 Linear tickets in INT assigned to Manas, titled like [auth-scope][freshdesk] N actions blocked: <scopes>.

Pre-Review Checklist

I have self-reviewed this PR

Notes

Hardcoded assignee UUID (310cdff7-cdf6-400a-a695-91d43a0dfc81 = Manas) and team key (INT). Intentional — avoids round-tripping Linear on every workflow run and prevents a Linear permissions hiccup from silently producing unassigned tickets or routing to the wrong team. Update the constants if ownership moves.
Labels (auth-failure, support-triage) — Linear skips missing labels with a warning; pre-create them in the INT team if you want them to land on the first ticket.
Auth classifier is regex-based, not LLM-based. Catches the common shapes (HTTP code + standard keywords) without burning an extra agent call. The patterns can be tuned as we see real failures; auth failures the classifier does not recognize fall through as no-ops (no ticket).
Scope-signature extraction is heuristic. The dedup mechanism is robust even when signatures fall back to coarse buckets like auth-401.
Re-resolution endpoint not in this PR. Support can still manually re-run CreateApp for a toolkit once partnership scopes are granted; the CurlTester will pick up the previously-blocked endpoints.
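For reference, a regex-only classifier along the lines of the note above might look like the sketch below. The exact patterns and the function name `is_auth_failure` are assumptions, not the reporter's actual code; the lookbehind is one possible way to avoid treating a URL port such as `:403` as a status code.

```python
import re

# 401/403 as a standalone number, not preceded by ":" (URL port), a digit, or ".".
_AUTH_STATUS = re.compile(r"(?<![:\d.])\b(?:401|403)\b")

# Keyword list taken from the description; spelling variants are an assumption.
_AUTH_KEYWORDS = re.compile(
    r"unauthori[sz]ed|forbidden|insufficient_scope|missing_scopes?"
    r"|permission denied|invalid[ _]token|authentication failed",
    re.IGNORECASE,
)


def is_auth_failure(summary: str, curl_output: str) -> bool:
    """Return True when the combined text looks like an auth or scope failure."""
    text = f"{summary}\n{curl_output}"
    return bool(_AUTH_STATUS.search(text) or _AUTH_KEYWORDS.search(text))
```

One trade-off of this particular lookbehind: a status written as `status:401` with no space would be missed unless a keyword also appears, which is consistent with unrecognized failures simply producing no ticket.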

🤖 Generated with Claude Code

@zen-agent · 3w ago

Post-PR cycle status

Codex review — clean

Ran 6 iterations of /codex-review-loop. Each of the first 5 found one P2 issue, all addressed:

| Iter | Issue | Fix commit |
|---|---|---|
| 1 | `actions_to_test` included pre-existing actions in `test_all_actions=True` mode | `db3696aa` |
| 2 | Linear issue titles could exceed the ~256-char cap with long scope lists | `d119955e` |
| 3 | Dedupe-removed duplicates kept their `is_auth_issue=True` flag | `599976ec` |
| 4 | Inline scope pattern captured trailing prose → unstable dedup keys | `c6a1bdfa` |
| 5 | `find_open_issue_by_dedup_marker` couldn't distinguish lookup-failure from no-match | `5dce8285` |

Iteration 6: "No discrete correctness issues were identified in the diff. The new auth failure grouping/reporting path is guarded against side-effect failures and does not appear to break existing workflow behavior."
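To illustrate the kind of guard iteration 2 calls for, here is one possible way to keep a title under the cap. The actual fix in `d119955e` is not shown on this page, so `build_issue_title`, the `MAX_TITLE_LEN` value, and the ellipsis strategy are assumptions for the sketch only.

```python
MAX_TITLE_LEN = 255  # assumption: stay just under the ~256-char cap


def build_issue_title(toolkit: str, action_count: int, scope_signature: str) -> str:
    """Build an '[auth-scope][toolkit] N actions blocked: scopes' title within the cap."""
    noun = "action" if action_count == 1 else "actions"
    title = f"[auth-scope][{toolkit}] {action_count} {noun} blocked: {scope_signature}"
    if len(title) <= MAX_TITLE_LEN:
        return title
    # Cut the tail, drop any dangling separator, and mark the truncation.
    return title[: MAX_TITLE_LEN - 3].rstrip(", ") + "..."
```

Truncating the title should not affect deduplication, since the lookup relies on the marker embedded in the issue description rather than on the title text.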

Cursor Bugbot

1 low-severity DRY finding on commentCreate mutation duplication — addressed in 5590b22d by extracting shared _post_comment(api_key, issue_uuid, body) helper (-22 net lines). Reply posted on the thread.

Build checks

make fmt && make chk clean. 28/28 unit tests pass (cortex/tests/test_common/test_auth_failure_grouper.py).

CI status

✅ check-lines, run-lint, detect-changes, enable-auto-merge, Analyze (python/javascript-typescript/actions)
❌ run-tests: 2 pre-existing failures (already documented in PR description). The same failures occur on unrelated PR #1426 (run 26019976649). Both hit Composio's production admin API with a hardcoded connection ID, ca_vbZ1LhakOBdV, that CI's prod API key no longer has access to. Neither touches code modified in this PR. The right fix is rotating the CI secret or updating the hardcoded ID, which is out of scope here.




Testing scope