Commit Graph

3 Commits

Author SHA1 Message Date
SuYao
0665ce7270 fix(anthropic): strip unsupported JSON Schema keywords from tool schemas (#16480)
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: suyao <sy20010504@gmail.com>
2026-06-27 18:22:00 +08:00
SuYao
7ddceb698f fix(deps): omit max_tokens for non-Anthropic models in @ai-sdk/anthropic (#14749)
### What this PR does

Before this PR:

When `@ai-sdk/anthropic`'s `AnthropicMessagesLanguageModel` was used as
transport for non-Anthropic models (e.g. NewAPI configured with
`endpointType: 'anthropic'` routing non-Claude models, or any gateway
proxying non-Claude models through an Anthropic-compatible API), the SDK
silently injected a default `max_tokens` value (4096 for unknown models,
model-specific for Claude). This overrode the user's intent when they
had disabled `enableMaxTokens` in assistant settings — Cherry Studio's
`getMaxTokens` deliberately returns `undefined` in that case.

After this PR:

`max_tokens` is only emitted when either (a) the user has explicitly set
it, or (b) the model is actually a Claude model (`isKnownModel ||
modelId.startsWith("claude-")`). For non-Anthropic models routed through
the Anthropic SDK, the field is omitted, respecting the user's choice.

The fix lives in `patches/@ai-sdk__anthropic.patch`:
1. Make the fallback to `maxOutputTokensForModel` Claude-only.
2. Use a conditional spread for `max_tokens` in `baseArgs` so it's
omitted when undefined.
3. Guard the thinking-budget addition (`maxTokens + thinkingBudget`) so
it doesn't compute `NaN`.

Fixes #

### Why we need it and why it was done in this way

The Anthropic API itself requires `max_tokens`, so the SDK's default
fallback is correct for actual Claude requests. But the same SDK class
is reused as a transport for non-Anthropic backends in NewAPI /
aggregator providers, and there the silent default leaks through and
caps generation.

The following tradeoffs were made:

- Patched the upstream package via the existing `pnpm patch` workflow
rather than wrapping/intercepting the SDK at the call site, because the
field is constructed deep inside `getArgs()` and forking that path would
duplicate a lot of logic.

The following alternatives were considered:

- Setting `maxOutputTokens` explicitly per-provider in `getMaxTokens`.
Rejected because we want `undefined` to mean "let the server decide" for
non-Anthropic backends, and there is no generic safe default.
- Pre-stripping `max_tokens` after the SDK builds the body. Rejected:
the SDK posts directly via `postJsonToApi`; there is no public hook
between body construction and request dispatch.

Links to places where the discussion took place: N/A

### Breaking changes

None. Behavior for real Anthropic Claude requests is unchanged (the
default is still applied). The change only affects requests where the
SDK is used against a non-Claude model and the user has not set
`maxOutputTokens` — previously they got a silent 4096 cap, now they get
whatever the upstream server defaults to.

### Special notes for your reviewer

- The patch is regenerated via `pnpm patch @ai-sdk/anthropic@3.0.71` +
`pnpm patch-commit`, so the hash in `pnpm-lock.yaml` updates
accordingly.
- Existing 28 tests in
`src/renderer/src/aiCore/prepareParams/__tests__/model-parameters.test.ts`
still pass.
- Hotfix branch per the post-2026-04-03 main-branch policy. No
refactoring included.

### Checklist

- [x] PR: The PR description is expressive enough and will help future
contributors
- [x] Code: [Write code that humans can
understand](https://en.wikiquote.org/wiki/Martin_Fowler#code-for-humans)
and [Keep it simple](https://en.wikipedia.org/wiki/KISS_principle)
- [x] Refactor: You have [left the code cleaner than you found it (Boy
Scout
Rule)](https://learning.oreilly.com/library/view/97-things-every/9780596809515/ch08.html)
- [x] Upgrade: Impact of this change on upgrade flows was considered and
addressed if required
- [ ] Documentation: A [user-guide update](https://docs.cherry-ai.com)
was considered and is present (link) or not required. Check this only
when the PR introduces or changes a user-facing feature or behavior.
- [x] Self-review: I have reviewed my own code (e.g., via
[`/gh-pr-review`](/.claude/skills/gh-pr-review/SKILL.md), `gh pr diff`,
or GitHub UI) before requesting review from others

### Release note

```release-note
fix(deps): prevent @ai-sdk/anthropic from injecting a default max_tokens when used as transport for non-Anthropic models (e.g. NewAPI with anthropic endpoint type for non-Claude models). User's choice to disable max tokens is now respected on those backends.
```

Signed-off-by: suyao <sy20010504@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 20:23:01 +08:00
SuYao
4e1e4548bb hotfix(deepseek): forward reasoning effort for DeepSeek V4+ via Claude endpoint (#14572)
### What this PR does

Before this PR:

- For DeepSeek V4+ models served through a Claude-compatible endpoint,
`getAnthropicReasoningParams` returned only `{ thinking: { type:
'enabled', budgetTokens } }`. The requested `reasoning_effort` was
dropped before the request left the client, so `high` / `xhigh` behaved
identically to default.
- `@ai-sdk/deepseek` had no `reasoning_effort` field on its Zod options
schema, so even if the effort were forwarded the SDK would strip it from
the request body.
- `@ai-sdk/anthropic` silently omitted the `thinking` field when
thinking was turned off, preventing downstream providers that require an
explicit `{ type: 'disabled' }` from honoring the disable.

After this PR:

- Non-Anthropic Claude-endpoint models (Kimi, MiniMax, DeepSeek V4+,
etc.) receive `sendReasoning: true` so reasoning output is actually
streamed back.
- DeepSeek V4+ additionally receives `effort: 'max'` when the user picks
`xhigh`, otherwise `effort: 'high'`.
- `@ai-sdk/deepseek@2.0.29` is patched to accept `reasoning_effort` on
its language-model options schema and forward it into the request body.
- `@ai-sdk/anthropic` is patched to emit an explicit `thinking: { type:
'disabled' }` when thinking is disabled.
- Updated one existing `getAnthropicReasoningParams` unit test to assert
the new `sendReasoning: true` payload, and added a new V4+ reasoning
test suite.

Fixes #

### Why we need it and why it was done in this way

Follow-up to #14551, which introduced `isDeepSeekV4PlusModel` and the
`deepseek_v4` reasoning category but did not wire the selected effort
into the outgoing request for the Claude-compatible endpoint path.
Without this PR the `high` / `xhigh` options are cosmetic for V4+.

The following tradeoffs were made:

- Patched upstream SDKs (`@ai-sdk/deepseek`, `@ai-sdk/anthropic`)
instead of waiting for upstream releases, so the fix ships on `main`'s
code freeze window. Patches are narrow (single field additions).
- Kept all non-Anthropic Claude-endpoint models on `sendReasoning: true`
rather than gating per-provider — the field is harmless for providers
that don't emit reasoning and restores visibility for those that do.

The following alternatives were considered:

- Upstreaming the `reasoning_effort` change to `@ai-sdk/deepseek` —
rejected for timing; patch is mechanical and easy to drop once upstream
lands.
- Sending effort via a custom header — rejected; DeepSeek documents
`reasoning_effort` in the request body.

Links to places where the discussion took place:

### Breaking changes

None. The added fields (`sendReasoning`, `effort`, `reasoning_effort`)
are additive and only take effect for models already routed through the
Claude-compatible path with reasoning enabled.

### Special notes for your reviewer

- Two new patch files under `patches/` are registered in
`package.json#pnpm.patchedDependencies`; they are narrow dist-only
additions.
- The implementation change is localized to
`getAnthropicReasoningParams` in
`src/renderer/src/aiCore/utils/reasoning.ts`.
- Targeting `main` via `hotfix/*` branch per the branch-strategy rule,
since this restores intended V4+ behavior shipped in #14551 and contains
no refactoring.

### Checklist

- [x] PR: The PR description is expressive enough and will help future
contributors
- [x] Code: Write code that humans can understand and Keep it simple
- [x] Refactor: You have left the code cleaner than you found it (Boy
Scout Rule)
- [x] Upgrade: Impact of this change on upgrade flows was considered and
addressed if required
- [x] Documentation: A user-guide update was considered and is present
(link) or not required. Check this only when the PR introduces or
changes a user-facing feature or behavior.
- [x] Self-review: I have reviewed my own code before requesting review
from others

### Release note

```release-note
Fix: DeepSeek V4+ reasoning effort (high / xhigh) is now actually forwarded to the API when the model is served via a Claude-compatible endpoint; reasoning output is also streamed back to the UI.
```

---------

Signed-off-by: suyao <sy20010504@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 02:12:47 +08:00