fix(trainer): support continuous reward scores in bucket aggregation

int() truncates any float in [0, 1) to 0, so fractional reward scores
were aggregated as 0. Replace int() with float().
Also fix the falsy-float check in failure detection.
Backward compatible with binary hard=0/1 scores.
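
The two problems described above can be reproduced with a minimal sketch.
The diff itself is not shown here, so every function name below is
hypothetical and the code only illustrates the behavior, not the trainer's
actual implementation. In particular, the "falsy float check" is assumed
to be a test of the form `not score`, which treats a legitimate 0.0 the
same as a missing value.

```python
# Hypothetical helpers; names and structure are assumptions, not the trainer's code.

def aggregate_truncating(scores):
    """Old behavior (assumed): int() drops the fractional part of each score."""
    return sum(int(s) for s in scores) / len(scores)


def aggregate_continuous(scores):
    """New behavior (assumed): float() keeps continuous scores intact."""
    return sum(float(s) for s in scores) / len(scores)


def is_missing_falsy(score):
    """Falsy check: a real score of 0.0 is wrongly reported as missing."""
    return not score


def is_missing_explicit(score):
    """Explicit check: only None counts as missing; 0.0 is a valid score."""
    return score is None


continuous = [0.25, 0.75, 1.0]
binary = [0, 1, 1]

# With int(), 0.25 and 0.75 both become 0, so only the 1.0 contributes.
truncated_mean = aggregate_truncating(continuous)
continuous_mean = aggregate_continuous(continuous)

# Binary 0/1 inputs give the same result either way.
binary_unchanged = aggregate_truncating(binary) == aggregate_continuous(binary)
```

Under these assumptions, `is_missing_falsy(0.0)` is true while
`is_missing_explicit(0.0)` is false, which is the kind of mismatch a
falsy check on a float score produces.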
zq
2026-05-29 19:03:52 +08:00
parent 75b5c7f31c
commit afb552008b

File diff suppressed because it is too large.