CherryHQ-cherry-studio

github/CherryHQ-cherry-studio

Fork 0

mirror of https://github.com/CherryHQ/cherry-studio.git synced 2026-07-05 21:50:46 +08:00

Commit Graph

Author	SHA1	Message	Date
SuYao	0665ce7270	fix(anthropic): strip unsupported JSON Schema keywords from tool schemas (#16480 ) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com> Signed-off-by: suyao <sy20010504@gmail.com>	2026-06-27 18:22:00 +08:00
SuYao	7ddceb698f	fix(deps): omit max_tokens for non-Anthropic models in @ai-sdk/anthropic (#14749 ) ### What this PR does Before this PR: When `@ai-sdk/anthropic`'s `AnthropicMessagesLanguageModel` was used as transport for non-Anthropic models (e.g. NewAPI configured with `endpointType: 'anthropic'` routing non-Claude models, or any gateway proxying non-Claude models through an Anthropic-compatible API), the SDK silently injected a default `max_tokens` value (4096 for unknown models, model-specific for Claude). This overrode the user's intent when they had disabled `enableMaxTokens` in assistant settings — Cherry Studio's `getMaxTokens` deliberately returns `undefined` in that case. After this PR: `max_tokens` is only emitted when either (a) the user has explicitly set it, or (b) the model is actually a Claude model (`isKnownModel \|\| modelId.startsWith("claude-")`). For non-Anthropic models routed through the Anthropic SDK, the field is omitted, respecting the user's choice. The fix lives in `patches/@ai-sdk__anthropic.patch`: 1. Make the fallback to `maxOutputTokensForModel` Claude-only. 2. Use a conditional spread for `max_tokens` in `baseArgs` so it's omitted when undefined. 3. Guard the thinking-budget addition (`maxTokens + thinkingBudget`) so it doesn't compute `NaN`. Fixes # ### Why we need it and why it was done in this way The Anthropic API itself requires `max_tokens`, so the SDK's default fallback is correct for actual Claude requests. But the same SDK class is reused as a transport for non-Anthropic backends in NewAPI / aggregator providers, and there the silent default leaks through and caps generation. The following tradeoffs were made: - Patched the upstream package via the existing `pnpm patch` workflow rather than wrapping/intercepting the SDK at the call site, because the field is constructed deep inside `getArgs()` and forking that path would duplicate a lot of logic. The following alternatives were considered: - Setting `maxOutputTokens` explicitly per-provider in `getMaxTokens`. Rejected because we want `undefined` to mean "let the server decide" for non-Anthropic backends, and there is no generic safe default. - Pre-stripping `max_tokens` after the SDK builds the body. Rejected: the SDK posts directly via `postJsonToApi`; there is no public hook between body construction and request dispatch. Links to places where the discussion took place: N/A ### Breaking changes None. Behavior for real Anthropic Claude requests is unchanged (the default is still applied). The change only affects requests where the SDK is used against a non-Claude model and the user has not set `maxOutputTokens` — previously they got a silent 4096 cap, now they get whatever the upstream server defaults to. ### Special notes for your reviewer - The patch is regenerated via `pnpm patch @ai-sdk/anthropic@3.0.71` + `pnpm patch-commit`, so the hash in `pnpm-lock.yaml` updates accordingly. - Existing 28 tests in `src/renderer/src/aiCore/prepareParams/__tests__/model-parameters.test.ts` still pass. - Hotfix branch per the post-2026-04-03 main-branch policy. No refactoring included. ### Checklist - [x] PR: The PR description is expressive enough and will help future contributors - [x] Code: [Write code that humans can understand](https://en.wikiquote.org/wiki/Martin_Fowler#code-for-humans) and [Keep it simple](https://en.wikipedia.org/wiki/KISS_principle) - [x] Refactor: You have [left the code cleaner than you found it (Boy Scout Rule)](https://learning.oreilly.com/library/view/97-things-every/9780596809515/ch08.html) - [x] Upgrade: Impact of this change on upgrade flows was considered and addressed if required - [ ] Documentation: A [user-guide update](https://docs.cherry-ai.com) was considered and is present (link) or not required. Check this only when the PR introduces or changes a user-facing feature or behavior. - [x] Self-review: I have reviewed my own code (e.g., via [`/gh-pr-review`](/.claude/skills/gh-pr-review/SKILL.md), `gh pr diff`, or GitHub UI) before requesting review from others ### Release note ```release-note fix(deps): prevent @ai-sdk/anthropic from injecting a default max_tokens when used as transport for non-Anthropic models (e.g. NewAPI with anthropic endpoint type for non-Claude models). User's choice to disable max tokens is now respected on those backends. ``` Signed-off-by: suyao <sy20010504@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 20:23:01 +08:00
SuYao	4e1e4548bb	hotfix(deepseek): forward reasoning effort for DeepSeek V4+ via Claude endpoint (#14572 ) ### What this PR does Before this PR: - For DeepSeek V4+ models served through a Claude-compatible endpoint, `getAnthropicReasoningParams` returned only `{ thinking: { type: 'enabled', budgetTokens } }`. The requested `reasoning_effort` was dropped before the request left the client, so `high` / `xhigh` behaved identically to default. - `@ai-sdk/deepseek` had no `reasoning_effort` field on its Zod options schema, so even if the effort were forwarded the SDK would strip it from the request body. - `@ai-sdk/anthropic` silently omitted the `thinking` field when thinking was turned off, preventing downstream providers that require an explicit `{ type: 'disabled' }` from honoring the disable. After this PR: - Non-Anthropic Claude-endpoint models (Kimi, MiniMax, DeepSeek V4+, etc.) receive `sendReasoning: true` so reasoning output is actually streamed back. - DeepSeek V4+ additionally receives `effort: 'max'` when the user picks `xhigh`, otherwise `effort: 'high'`. - `@ai-sdk/deepseek@2.0.29` is patched to accept `reasoning_effort` on its language-model options schema and forward it into the request body. - `@ai-sdk/anthropic` is patched to emit an explicit `thinking: { type: 'disabled' }` when thinking is disabled. - Updated one existing `getAnthropicReasoningParams` unit test to assert the new `sendReasoning: true` payload, and added a new V4+ reasoning test suite. Fixes # ### Why we need it and why it was done in this way Follow-up to #14551, which introduced `isDeepSeekV4PlusModel` and the `deepseek_v4` reasoning category but did not wire the selected effort into the outgoing request for the Claude-compatible endpoint path. Without this PR the `high` / `xhigh` options are cosmetic for V4+. The following tradeoffs were made: - Patched upstream SDKs (`@ai-sdk/deepseek`, `@ai-sdk/anthropic`) instead of waiting for upstream releases, so the fix ships on `main`'s code freeze window. Patches are narrow (single field additions). - Kept all non-Anthropic Claude-endpoint models on `sendReasoning: true` rather than gating per-provider — the field is harmless for providers that don't emit reasoning and restores visibility for those that do. The following alternatives were considered: - Upstreaming the `reasoning_effort` change to `@ai-sdk/deepseek` — rejected for timing; patch is mechanical and easy to drop once upstream lands. - Sending effort via a custom header — rejected; DeepSeek documents `reasoning_effort` in the request body. Links to places where the discussion took place: ### Breaking changes None. The added fields (`sendReasoning`, `effort`, `reasoning_effort`) are additive and only take effect for models already routed through the Claude-compatible path with reasoning enabled. ### Special notes for your reviewer - Two new patch files under `patches/` are registered in `package.json#pnpm.patchedDependencies`; they are narrow dist-only additions. - The implementation change is localized to `getAnthropicReasoningParams` in `src/renderer/src/aiCore/utils/reasoning.ts`. - Targeting `main` via `hotfix/*` branch per the branch-strategy rule, since this restores intended V4+ behavior shipped in #14551 and contains no refactoring. ### Checklist - [x] PR: The PR description is expressive enough and will help future contributors - [x] Code: Write code that humans can understand and Keep it simple - [x] Refactor: You have left the code cleaner than you found it (Boy Scout Rule) - [x] Upgrade: Impact of this change on upgrade flows was considered and addressed if required - [x] Documentation: A user-guide update was considered and is present (link) or not required. Check this only when the PR introduces or changes a user-facing feature or behavior. - [x] Self-review: I have reviewed my own code before requesting review from others ### Release note ```release-note Fix: DeepSeek V4+ reasoning effort (high / xhigh) is now actually forwarded to the API when the model is served via a Claude-compatible endpoint; reasoning output is also streamed back to the UI. ``` --------- Signed-off-by: suyao <sy20010504@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-25 02:12:47 +08:00

Author

SHA1

Message

Date

SuYao

0665ce7270

fix(anthropic): strip unsupported JSON Schema keywords from tool schemas (#16480 )

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: fullex <106392080+0xfullex@users.noreply.github.com>
Signed-off-by: suyao <sy20010504@gmail.com>

2026-06-27 18:22:00 +08:00

SuYao

7ddceb698f

fix(deps): omit max_tokens for non-Anthropic models in @ai-sdk/anthropic (#14749 )

### What this PR does

Before this PR:

When `@ai-sdk/anthropic`'s `AnthropicMessagesLanguageModel` was used as
transport for non-Anthropic models (e.g. NewAPI configured with
`endpointType: 'anthropic'` routing non-Claude models, or any gateway
proxying non-Claude models through an Anthropic-compatible API), the SDK
silently injected a default `max_tokens` value (4096 for unknown models,
model-specific for Claude). This overrode the user's intent when they
had disabled `enableMaxTokens` in assistant settings — Cherry Studio's
`getMaxTokens` deliberately returns `undefined` in that case.

After this PR:

`max_tokens` is only emitted when either (a) the user has explicitly set
it, or (b) the model is actually a Claude model (`isKnownModel ||
modelId.startsWith("claude-")`). For non-Anthropic models routed through
the Anthropic SDK, the field is omitted, respecting the user's choice.

The fix lives in `patches/@ai-sdk__anthropic.patch`:
1. Make the fallback to `maxOutputTokensForModel` Claude-only.
2. Use a conditional spread for `max_tokens` in `baseArgs` so it's
omitted when undefined.
3. Guard the thinking-budget addition (`maxTokens + thinkingBudget`) so
it doesn't compute `NaN`.

Fixes #

### Why we need it and why it was done in this way

The Anthropic API itself requires `max_tokens`, so the SDK's default
fallback is correct for actual Claude requests. But the same SDK class
is reused as a transport for non-Anthropic backends in NewAPI /
aggregator providers, and there the silent default leaks through and
caps generation.

The following tradeoffs were made:

- Patched the upstream package via the existing `pnpm patch` workflow
rather than wrapping/intercepting the SDK at the call site, because the
field is constructed deep inside `getArgs()` and forking that path would
duplicate a lot of logic.

The following alternatives were considered:

- Setting `maxOutputTokens` explicitly per-provider in `getMaxTokens`.
Rejected because we want `undefined` to mean "let the server decide" for
non-Anthropic backends, and there is no generic safe default.
- Pre-stripping `max_tokens` after the SDK builds the body. Rejected:
the SDK posts directly via `postJsonToApi`; there is no public hook
between body construction and request dispatch.

Links to places where the discussion took place: N/A

### Breaking changes

None. Behavior for real Anthropic Claude requests is unchanged (the
default is still applied). The change only affects requests where the
SDK is used against a non-Claude model and the user has not set
`maxOutputTokens` — previously they got a silent 4096 cap, now they get
whatever the upstream server defaults to.

### Special notes for your reviewer

- The patch is regenerated via `pnpm patch @ai-sdk/anthropic@3.0.71` +
`pnpm patch-commit`, so the hash in `pnpm-lock.yaml` updates
accordingly.
- Existing 28 tests in
`src/renderer/src/aiCore/prepareParams/__tests__/model-parameters.test.ts`
still pass.
- Hotfix branch per the post-2026-04-03 main-branch policy. No
refactoring included.

### Checklist

- [x] PR: The PR description is expressive enough and will help future
contributors
- [x] Code: [Write code that humans can
understand](https://en.wikiquote.org/wiki/Martin_Fowler#code-for-humans)
and [Keep it simple](https://en.wikipedia.org/wiki/KISS_principle)
- [x] Refactor: You have [left the code cleaner than you found it (Boy
Scout
Rule)](https://learning.oreilly.com/library/view/97-things-every/9780596809515/ch08.html)
- [x] Upgrade: Impact of this change on upgrade flows was considered and
addressed if required
- [ ] Documentation: A [user-guide update](https://docs.cherry-ai.com)
was considered and is present (link) or not required. Check this only
when the PR introduces or changes a user-facing feature or behavior.
- [x] Self-review: I have reviewed my own code (e.g., via
[`/gh-pr-review`](/.claude/skills/gh-pr-review/SKILL.md), `gh pr diff`,
or GitHub UI) before requesting review from others

### Release note

```release-note
fix(deps): prevent @ai-sdk/anthropic from injecting a default max_tokens when used as transport for non-Anthropic models (e.g. NewAPI with anthropic endpoint type for non-Claude models). User's choice to disable max tokens is now respected on those backends.
```

Signed-off-by: suyao <sy20010504@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-30 20:23:01 +08:00

SuYao

4e1e4548bb

hotfix(deepseek): forward reasoning effort for DeepSeek V4+ via Claude endpoint (#14572 )

### What this PR does

Before this PR:

- For DeepSeek V4+ models served through a Claude-compatible endpoint,
`getAnthropicReasoningParams` returned only `{ thinking: { type:
'enabled', budgetTokens } }`. The requested `reasoning_effort` was
dropped before the request left the client, so `high` / `xhigh` behaved
identically to default.
- `@ai-sdk/deepseek` had no `reasoning_effort` field on its Zod options
schema, so even if the effort were forwarded the SDK would strip it from
the request body.
- `@ai-sdk/anthropic` silently omitted the `thinking` field when
thinking was turned off, preventing downstream providers that require an
explicit `{ type: 'disabled' }` from honoring the disable.

After this PR:

- Non-Anthropic Claude-endpoint models (Kimi, MiniMax, DeepSeek V4+,
etc.) receive `sendReasoning: true` so reasoning output is actually
streamed back.
- DeepSeek V4+ additionally receives `effort: 'max'` when the user picks
`xhigh`, otherwise `effort: 'high'`.
- `@ai-sdk/deepseek@2.0.29` is patched to accept `reasoning_effort` on
its language-model options schema and forward it into the request body.
- `@ai-sdk/anthropic` is patched to emit an explicit `thinking: { type:
'disabled' }` when thinking is disabled.
- Updated one existing `getAnthropicReasoningParams` unit test to assert
the new `sendReasoning: true` payload, and added a new V4+ reasoning
test suite.

Fixes #

### Why we need it and why it was done in this way

Follow-up to #14551, which introduced `isDeepSeekV4PlusModel` and the
`deepseek_v4` reasoning category but did not wire the selected effort
into the outgoing request for the Claude-compatible endpoint path.
Without this PR the `high` / `xhigh` options are cosmetic for V4+.

The following tradeoffs were made:

- Patched upstream SDKs (`@ai-sdk/deepseek`, `@ai-sdk/anthropic`)
instead of waiting for upstream releases, so the fix ships on `main`'s
code freeze window. Patches are narrow (single field additions).
- Kept all non-Anthropic Claude-endpoint models on `sendReasoning: true`
rather than gating per-provider — the field is harmless for providers
that don't emit reasoning and restores visibility for those that do.

The following alternatives were considered:

- Upstreaming the `reasoning_effort` change to `@ai-sdk/deepseek` —
rejected for timing; patch is mechanical and easy to drop once upstream
lands.
- Sending effort via a custom header — rejected; DeepSeek documents
`reasoning_effort` in the request body.

Links to places where the discussion took place:

### Breaking changes

None. The added fields (`sendReasoning`, `effort`, `reasoning_effort`)
are additive and only take effect for models already routed through the
Claude-compatible path with reasoning enabled.

### Special notes for your reviewer

- Two new patch files under `patches/` are registered in
`package.json#pnpm.patchedDependencies`; they are narrow dist-only
additions.
- The implementation change is localized to
`getAnthropicReasoningParams` in
`src/renderer/src/aiCore/utils/reasoning.ts`.
- Targeting `main` via `hotfix/*` branch per the branch-strategy rule,
since this restores intended V4+ behavior shipped in #14551 and contains
no refactoring.

### Checklist

- [x] PR: The PR description is expressive enough and will help future
contributors
- [x] Code: Write code that humans can understand and Keep it simple
- [x] Refactor: You have left the code cleaner than you found it (Boy
Scout Rule)
- [x] Upgrade: Impact of this change on upgrade flows was considered and
addressed if required
- [x] Documentation: A user-guide update was considered and is present
(link) or not required. Check this only when the PR introduces or
changes a user-facing feature or behavior.
- [x] Self-review: I have reviewed my own code before requesting review
from others

### Release note

```release-note
Fix: DeepSeek V4+ reasoning effort (high / xhigh) is now actually forwarded to the API when the model is served via a Claude-compatible endpoint; reasoning output is also streamed back to the UI.
```

---------

Signed-off-by: suyao <sy20010504@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-25 02:12:47 +08:00

3 Commits