mirror of
https://github.com/microsoft/SkillOpt.git
synced 2026-07-03 14:02:58 +08:00
fix(reflect): support continuous reward scores in failure filtering
not r.get("hard") treats non-zero floats as success.
Add explicit float threshold check (< 1e-9).
Backward compatible with binary hard=0/1.
This commit is contained in: