Why
Prevent reintroduction of the bug that caused workflow ada5svth to hang for 7+ hours and cost $1.7K.
as_completed() blocks indefinitely if tasks hang on LLM calls - this was the root cause of #925.
What
Add a CI lint check that flags as_completed( usage in production code (cortex/).
Excludes:
/tests/ - test files
_test.py, test_ - test file patterns
/eval.py, long_eval - benchmarking/evaluation code
On failure, suggests:
Use wait(futures, timeout=60, return_when=FIRST_COMPLETED) with manual timeout checking instead.
See cortex/workflows/response_schema_modifier.py for the correct pattern.
How to test
# Should pass (all production code now uses wait())
grep -r "as_completed(" cortex/ --include="*.py" \
| grep -v "/tests/" | grep -v "_test.py" | grep -v "test_" \
| grep -v "/eval.py" | grep -v "long_eval"
Notes
This is a required CI check - any new as_completed() usage in production code will fail the lint job.
🤖 Generated with Claude Code