[LOW PRIORITY] docs: Manus auth feasibility analysis - 82% of auth failures automatable
PR #942 concluded auth failures are "only 1% of production failures" and recommended LOW PRIORITY for the Manus auth management service. However, that analysis counted auth failures within the FAILED state rather than the separate FAILED_AUTH_ISSUE state.
This analysis corrects that finding:
Feasibility analysis (`docs/manus-auth-feasibility-analysis.md`). Analyzed the latest 100 production auth failures individually:
| Category | Count | Manus Solvable? |
|---|---|---|
| Invalid/Revoked Credentials | 31 | Yes |
| Code Fix Unverifiable | 28 | Yes |
| Insufficient Permissions/Scope | 13 | Yes |
| Other/Unclear | 11 | No |
| Token/Key Expired | 8 | Yes |
| Paid Plan Required | 2 | No |
| IP Whitelist Restriction | 2 | Yes |
| Code Bug (misclassified) | 2 | No |
| Rate Limiting (misclassified) | 2 | No |
| Master/Enterprise Key Required | 1 | No |
Result: 82/100 (82%) solvable by Manus service
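
As a sanity check on the 82% figure, the table can be re-tallied with a few lines of Python. This is only a sketch: the counts and Yes/No flags are copied from the table above, while the `CATEGORIES` name and the tuple layout are illustrative and not part of this PR.

```python
# Counts and "Manus Solvable?" flags copied from the category table.
# CATEGORIES and its (count, solvable) layout are illustrative only.
CATEGORIES = {
    "Invalid/Revoked Credentials": (31, True),
    "Code Fix Unverifiable": (28, True),
    "Insufficient Permissions/Scope": (13, True),
    "Other/Unclear": (11, False),
    "Token/Key Expired": (8, True),
    "Paid Plan Required": (2, False),
    "IP Whitelist Restriction": (2, True),
    "Code Bug (misclassified)": (2, False),
    "Rate Limiting (misclassified)": (2, False),
    "Master/Enterprise Key Required": (1, False),
}

# Total failures analyzed, and the subset marked solvable.
total = sum(count for count, _ in CATEGORIES.values())
solvable = sum(count for count, ok in CATEGORIES.values() if ok)

print(f"{solvable}/{total} ({solvable / total:.0%}) solvable")
```

The rows marked "Yes" sum to 82 and the rows marked "No" sum to 18, so the table is internally consistent with the stated result.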
Runbook (`infra/runbooks/auth-issues/`). Added agent-usable scripts for investigating auth failures:
| Script | Purpose |
|---|---|
| `auth-stats.sh` | Overall stats: failure rates, workflow types, tag distribution |
| `toolkit-breakdown.sh` | Per-toolkit failure counts with assessments |
| `error-details.py` | Extract detailed error messages for specific toolkits |
| `list-auth-failures.sh` | List individual failures with filters |
Added Runbooks section to infra/AI_AGENT_GUIDE.md referencing the new auth-issues runbook.
Recommendation: MEDIUM-HIGH priority (up from LOW in PR #942). Start with token refresh automation (highest volume, simplest implementation).

- `auth-stats.sh` produces correct stats
- `toolkit-breakdown.sh` shows per-toolkit breakdown
- `error-details.py` extracts detailed errors
- `list-auth-failures.sh` filters work correctly

🤖 Generated with Claude Code