Provides guardrails to prevent training crashes on false-positive spikes
Corrects percentage-based drawdown calculations for RL rewards
Configures resilient early-stop thresholds tailored for PPO training
Clamps metric ranges to ensure valid mathematical inputs for optimizers
Implements adaptive recovery logic with LR and entropy adjustments
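A minimal sketch of how these features could fit together. The function and class names (`pct_drawdown`, `clamp`, `RecoveryGuard`) and all parameter values are hypothetical, not taken from the actual project: percentage drawdown is measured from the running peak, metrics are clamped into a valid range before reaching the optimizer, and a patience counter filters false-positive loss spikes before cutting the learning rate and raising the entropy coefficient.

```python
def pct_drawdown(equity_curve):
    """Percentage drawdown from the running peak (0.0 means at peak)."""
    peak = equity_curve[0]
    dd = 0.0
    for v in equity_curve:
        peak = max(peak, v)
        dd = max(dd, (peak - v) / peak)
    return dd


def clamp(x, lo, hi):
    """Clamp a metric into a valid range before it reaches the optimizer."""
    return max(lo, min(hi, x))


class RecoveryGuard:
    """Hypothetical adaptive recovery: on a *sustained* loss spike,
    halve the learning rate and double the entropy coefficient.

    `patience` consecutive spikes are required before reacting, which
    filters out false-positive spikes instead of crashing training.
    """

    def __init__(self, lr=3e-4, entropy_coef=0.01,
                 spike_factor=3.0, patience=3):
        self.lr = lr
        self.entropy_coef = entropy_coef
        self.spike_factor = spike_factor
        self.patience = patience
        self.ema = None      # exponential moving average of the loss
        self.strikes = 0     # consecutive spike count

    def update(self, loss):
        if self.ema is None:
            self.ema = loss
        spike = loss > self.spike_factor * self.ema
        self.strikes = self.strikes + 1 if spike else 0
        self.ema = 0.99 * self.ema + 0.01 * loss
        if self.strikes >= self.patience:
            self.lr *= 0.5            # back off the learning rate
            self.entropy_coef *= 2.0  # encourage exploration again
            self.strikes = 0
            return True               # recovery action taken
        return False
```

For example, `pct_drawdown([100, 120, 90, 110])` returns `0.25` (a 25% drop from the peak of 120 to the trough of 90), and `RecoveryGuard` only fires after three consecutive loss spikes.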