Skill Info
What It Does
Implements systematic self-reflection and iterative refinement for AI agents, based on peer-reviewed research:
- Core loop: GENERATE → CRITIQUE → REFINE → CHECK (from Madaan et al. 2023)
- 6 reflection dimensions: Logic, Facts, Completeness, Conciseness, Actionability, Consistency
- 4 auto-selected depth levels: Level 0-3 (0-50% token overhead)
- Cross-session Reflexion memory (from Shinn et al. 2023)
Why It Fits
- First (AFAIK) installable Agent Skill focused purely on self-refinement
- Zero API cost — the host AI acts as its own critic
- Works with Claude Code, Cursor, Copilot, Codex CLI, Gemini CLI, and 10+ more platforms
- Academic grounding: Self-Refine, Reflexion, CoVe, Self-Calibration, CRITIC, Andrew Ng Reflection
Benchmark
+21.2% average improvement across 20 test cases, 5 categories, 6 scoring dimensions.
Full report: https://github.com/Thomaszhou22/self-refine-skill/blob/main/BENCHMARK.md
Checklist