Mirror of https://github.com/microsoft/SkillOpt.git (synced 2026-07-03 14:02:58 +08:00)
fix(trainer): support continuous reward scores in bucket aggregation
int() truncates any float in [0, 1) to 0, so continuous reward scores collapse to 0 during bucket aggregation. Replace int() with float(). Also fix the falsy-float check in failure detection. Backward compatible with binary hard=0/1 scores.
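
The snippet below is a minimal sketch of the two problems described above, not the trainer's actual code. All function names and the `threshold` parameter are hypothetical, and the exact form of the original failure check is not shown in this commit message, so the truthiness test here is only one plausible shape of that bug.

```python
# Hypothetical illustration; none of these names come from the SkillOpt trainer.

def bucket_total_int(scores):
    """Aggregation with int(): every score in [0, 1) is truncated to 0."""
    return sum(int(s) for s in scores)


def bucket_total_float(scores):
    """Aggregation with float(): continuous scores keep their value."""
    return sum(float(s) for s in scores)


def failed_truthy(score):
    """Truthiness-based check: only an exact 0 counts as a failure."""
    return not score


def failed_threshold(score, threshold=1.0):
    """Explicit comparison (assumed rule): anything below threshold fails."""
    return float(score) < threshold


scores = [0.5, 0.25, 1, 0]
print(bucket_total_int(scores))    # 1
print(bucket_total_float(scores))  # 1.75

# A partial score slips past the truthiness check but not the explicit one.
print(failed_truthy(0.25), failed_threshold(0.25))  # False True
```

For binary inputs the two aggregations and the two checks agree, which is consistent with the backward-compatibility claim: `int(1) == float(1)` and `int(0) == float(0)`, and both checks flag 0 and pass 1.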