01 Mandatory evidence quoting to justify pass/fail verdicts
02 Parallel A/B subagent dispatching for consistent baseline comparison
03 0 GitHub stars
04 Built-in discipline rules to prevent context leakage and evaluation bias
05 Automatic generation of comprehensive markdown summary reports
06 Strict binary scoring system for objective performance evaluation
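
As a rough illustration of how binary scoring, mandatory evidence quoting, and a markdown summary could fit together, here is a minimal Python sketch. It is not the tool's actual implementation: `Verdict`, `make_verdict`, and `summarize` are hypothetical names, and the rule that the quote must appear verbatim in the judged output is an assumption about what "evidence quoting" means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    """One binary check (hypothetical structure, not the tool's own)."""
    criterion: str
    passed: bool   # strictly pass/fail, no partial credit
    evidence: str  # verbatim quote from the output being judged


def make_verdict(criterion: str, passed: bool, evidence: str, output: str) -> Verdict:
    """Build a verdict, rejecting it unless it quotes the judged output."""
    quote = evidence.strip()
    if not quote:
        raise ValueError(f"no evidence quoted for {criterion!r}")
    if quote not in output:
        # Assumption: evidence must be a literal substring of the output.
        raise ValueError(f"evidence for {criterion!r} not found in output")
    return Verdict(criterion, passed, quote)


def summarize(results: dict[str, list[Verdict]]) -> str:
    """Render per-variant pass counts as a markdown table."""
    lines = ["| Variant | Passed | Total |", "| --- | --- | --- |"]
    for label, verdicts in results.items():
        passed = sum(v.passed for v in verdicts)
        lines.append(f"| {label} | {passed} | {len(verdicts)} |")
    return "\n".join(lines)
```

A caller would build one list of verdicts for the baseline (A) and one for the candidate (B), then pass both to `summarize`. Because every verdict is a plain boolean backed by a quote, the two variants are compared on pass counts alone.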