microsoft-SkillOpt

mirror of https://github.com/microsoft/SkillOpt.git synced 2026-07-03 14:02:58 +08:00

Author	SHA1	Message	Date
zq	afb552008b	fix(trainer): support continuous reward scores in bucket aggregation. int() truncates any float in [0,1) to 0; replace with float(). Also fix falsy float check in failure detection. Backward compatible with binary hard=0/1.	2026-05-29 19:03:52 +08:00
Yif Yang	75b5c7f31c	Merge pull request #16 from guilhermeleste/feat/pioneer-ai-provider-integration: Add OpenAI-compatible backend support for Pioneer.ai and other providers	2026-05-29 10:14:32 +08:00
hwq	786d57b5cf	Make rollout completion tokens configurable	2026-05-28 09:45:47 +00:00
guilhermeleste	d5c5b61830	Add OpenAI-compatible backend support for Pioneer.ai and other providers: add 'openai_compatible', 'compat', and 'openai' auth modes to azure_openai.py; modify _make_client() to use OpenAI client (not AzureOpenAI) for compatible endpoints; update type hints to support both AzureOpenAI and OpenAI clients; auto-configure API version sentinel when using compatible modes; add .env template for Pioneer.ai configuration. This allows users to use Pioneer.ai or any OpenAI-compatible API endpoint as both optimizer and target backend without requiring Azure OpenAI. Resolves: Support for non-Azure OpenAI-compatible providers	2026-05-28 05:54:43 -03:00
Cuzyoung	f55a26414e	cleanup: remove unused benchmarks, deep_probe, meta_reflect. Remove sealqa, babyvision, mathverse, mmrb, swebench envs and configs. Remove deep_probe, deep_reflect, meta_reflect modules and prompts. Remove download_babyvision script. These are not part of the core released benchmarks. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-24 19:36:48 +00:00
Cuzyoung	cff7ff6846	fix: rename remaining teacher/student refs, remove .gradio from repo: fix teacher/student in deep_reflect, meta_reflect, sealqa, babyvision, mathverse, mmrb, swebench envs and prompt templates; remove .gradio/certificate.pem from tracked files; add .gradio/ to .gitignore. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-24 19:22:20 +00:00
Cuzyoung	4a1b984d87	refactor: rename teacher/student to optimizer/target, remove best skills, fix slow update: rename teacher -> optimizer, student -> target across all code, configs, docs, prompts; CLI: --teacher_model -> --optimizer_model, --student_model -> --target_model; remove best_skill files, keep only initial skills; fix slow update gate (force write into skill); fix SLOW_UPDATE marker stripping; remove deep_reflect and meta_reflect mechanisms; update .env.example with export prefix and azure_cli docs; add endpoint empty validation in azure_openai.py. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-24 19:15:10 +00:00
CharlesYang030	244e346b83	SkillOpt v0.1.0: initial release: skill optimization framework with training loop analogy; 11 benchmarks, 4 model backends (Azure OpenAI, Claude, Codex, Qwen); WebUI for browser-based training control; pluggable architecture for extending benchmarks and backends	2026-05-21 17:22:04 +00:00

8 Commits
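
The float-handling problem described in commit afb552008b can be illustrated with a small standalone sketch. This is not the repository's trainer code: the function names, the list-of-scores input, and the `threshold=1.0` failure rule are all assumptions made for illustration. Only the behaviours named in the commit message (int() truncating floats in [0,1) to 0, a falsy check on a float score, and compatibility with binary 0/1 scores) come from the log.

```python
def mean_reward_truncating(scores):
    """Aggregation with int(): any score in [0, 1) is counted as 0."""
    return sum(int(s) for s in scores) / len(scores)


def mean_reward(scores):
    """Aggregation with float(): fractional scores are preserved."""
    return sum(float(s) for s in scores) / len(scores)


def is_failure_falsy(score):
    """Falsy check: only an exact 0 (or None) is treated as a failure."""
    return not score


def is_failure(score, threshold=1.0):
    """Explicit comparison. The threshold is a hypothetical choice, not the repo's."""
    return score is None or float(score) < threshold


continuous = [0.25, 0.5, 0.75, 1.0]
binary = [0, 1, 1, 0]

# Continuous scores: int() loses the partial credit, float() keeps it.
print(mean_reward_truncating(continuous))  # 0.25
print(mean_reward(continuous))             # 0.625

# Binary scores: both versions agree, so the change is backward compatible.
print(mean_reward_truncating(binary), mean_reward(binary))  # 0.5 0.5

# A partial score of 0.25 slips past the falsy check but not the explicit one.
print(is_failure_falsy(0.25), is_failure(0.25))  # False True
```

With binary 0/1 scores the two aggregations give the same result, which matches the commit's note on backward compatibility. The difference only appears once scores take values strictly between 0 and 1.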