🛰️ feat: Add GPT-5.5 + Frontier OpenAI Models, Drop Deprecated Defaults #13636

Merged: danny-avila merged 8 commits into dev from feat/openai-gpt-5.5-models on Jun 10, 2026
Conversation

danny-avila (Owner) commented Jun 9, 2026

Summary

I refreshed the default OpenAI model catalog to match OpenAI's current API lineup (verified June 2026 against the official models, pricing, and deprecations pages), adding the GPT-5.5 family and frontier models while removing dead and end-of-life entries, and aligned token windows and pricing for the additions. Scope is intentionally limited to the model catalog — endpoint/handler behavior for these models (pro-model streaming constraints, non-streaming aggregation) is deferred to the OpenAI-compat surface moving into the agents SDK.

  • Added gpt-5.5, gpt-5.5-pro, and chat-latest (GPT-5.5 Instant) to sharedOpenAIModels, alongside gpt-5.4-mini, gpt-5.3-codex, gpt-5.2, and gpt-5.1-codex-max — all confirmed live API slugs.
  • Removed models OpenAI has already shut down: gpt-4.5-preview (+ snapshot), gpt-4-0314, gpt-4-32k-0314, gpt-4-0125-preview, gpt-4-1106-preview, gpt-4-turbo-preview, and the gpt-3.5-turbo-0613/-16k variants.
  • Removed models with announced shutdowns that would not outlive a release: the remaining gpt-4/gpt-4-turbo/gpt-3.5-turbo legacy block (retires 2026-10-23) and gpt-5-chat-latest/gpt-5.1-chat-latest (retire 2026-07-23, superseded by GPT-5.5 Instant).
  • Removed gpt-5.4-thinking, which does not exist as an API slug ("Thinking" is ChatGPT-only branding), resolving the standing TODO on that entry; skipped gpt-5.2-codex (deprecated) and gpt-5.3-codex-spark (not in the API).
  • Added pricing for gpt-5.5 ($5/$30), gpt-5.5-pro ($30/$180), gpt-5.4-mini ($0.75/$4.50), and chat-latest ($5/$30) with matching cache-read rates; pro models correctly receive no cache discount.
  • Corrected gpt-5.4-pro pricing from the placeholder $5/$30 to the now-published official $30/$180, resolving the TODO and fixing a 6x underbilling.
  • Added long-context premium tiers (>272K input tokens, 2x input / 1.5x output per OpenAI's published rates) for gpt-5.5 and gpt-5.4 via premiumTokenValues.
  • Pinned explicit pricing keys for the still-live legacy gpt-5.x-chat-latest aliases so the new bare chat-latest key cannot out-match their cheaper rates in the longest-substring lookup.
  • Added gpt-5.4-mini (400K) and chat-latest (400K) context windows and 128K output caps in tokens.ts; without explicit entries the matcher resolved them incorrectly.
  • Allowed chat-latest through the official-OpenAI fetched-model filter in fetchOpenAIModels, which otherwise drops any model ID not matching gpt-/o\d+.
  • Updated stale models.spec default-list assertions and covered chat-latest filter retention.
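To make the gpt-5.4-pro correction concrete, here is a minimal sketch of the arithmetic, assuming the quoted rates are USD per 1M tokens. The `costUSD` helper and the request size are illustrative only and are not taken from the codebase.

```typescript
// Rates in USD per 1M tokens, as quoted in the summary.
interface Rate {
  prompt: number;
  completion: number;
}

const placeholderRate: Rate = { prompt: 5, completion: 30 }; // old gpt-5.4-pro placeholder
const publishedRate: Rate = { prompt: 30, completion: 180 }; // corrected gpt-5.4-pro rate

/** Hypothetical helper: request cost in USD at a per-1M-token rate. */
function costUSD(rate: Rate, promptTokens: number, completionTokens: number): number {
  return (rate.prompt * promptTokens + rate.completion * completionTokens) / 1_000_000;
}

// Made-up request size: 1M prompt tokens and 100K completion tokens.
const costAtPlaceholder = costUSD(placeholderRate, 1_000_000, 100_000);
const costAtPublished = costUSD(publishedRate, 1_000_000, 100_000);

// Both rates scale by the same factor, so the ratio is 6 for any request mix.
const underbillingFactor = costAtPublished / costAtPlaceholder;
```

Because both the input and output rates changed by the same multiple, the underbilling factor does not depend on the prompt/completion split.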

Change Type

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Testing

  • Verified every added/removed slug against OpenAI's models, pricing, and deprecations documentation (June 2026).
  • Ran packages/data-schemas Jest suite for transaction pricing: 210 tests pass, including premium-tier and chat-latest alias key coverage.
  • Ran the full packages/data-provider Jest suite: 1201 tests pass.
  • Ran packages/api models spec covering the fetched-model filter and default-list fallbacks.
  • Type-checked all touched packages with tsc --noEmit; prettier, eslint, and import-order checks pass on all changed files.

Test Configuration:

  • Node.js v24.16.0, Jest run per-workspace, MongoDB not required for the affected suites.

Checklist

  • My code adheres to this project's style guidelines
  • I have performed a self-review of my own code
  • My changes do not introduce new warnings
  • I have written tests demonstrating that my changes are effective or that my feature works
  • Local unit tests pass with my changes

Copilot AI review requested due to automatic review settings June 9, 2026 21:59
@danny-avila

Owner Author

@codex review

Copilot AI left a comment

Contributor

Pull request overview

This PR updates LibreChat’s default OpenAI model allowlist and aligns token/pricing configuration with the newly added GPT-5.5 family and refreshed GPT-5.x lineup, while removing deprecated/default legacy entries.

Changes:

  • Updated OpenAI token pricing and cache token pricing tables to add GPT-5.5 (+ GPT-5.4-mini) and correct GPT-5.4-pro rates.
  • Refreshed the default OpenAI model list used by librechat-data-provider to include GPT-5.5 and remove deprecated legacy defaults.
  • Added explicit token context and max-output mappings for gpt-5.4-mini to avoid fallback matching to gpt-5.4.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| packages/data-schemas/src/methods/tx.ts | Adds/updates per-model token pricing and cache pricing for new GPT-5.x entries (incl. GPT-5.5) and corrects GPT-5.4-pro pricing. |
| packages/data-provider/src/config.ts | Updates the shared OpenAI default model allowlist to include GPT-5.5 family and remove deprecated legacy defaults. |
| packages/api/src/utils/tokens.ts | Adds explicit context/max-output entries for gpt-5.4-mini to ensure correct token limit resolution. |

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 549fb38e40

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/data-provider/src/config.ts Outdated
const sharedOpenAIModels = [
'gpt-5.5',
'gpt-5.5-pro',
'gpt-5.5-chat-latest',

P1 Badge Use the actual ChatGPT latest model id

OpenAI's current model catalog lists the ChatGPT Instant alias as chat-latest, not gpt-5.5-chat-latest (https://developers.openai.com/api/docs/models/chat-latest). With this default exposed in sharedOpenAIModels, users who select it will send an unsupported model id to OpenAI and get request failures; replace this entry with chat-latest and make sure token/pricing maps handle that alias.

Owner Author

Confirmed and fixed in 3968115: chat-latest is the correct slug (the gpt-5.5-chat-latest model page 404s). Registered it in the token maps (400K context, 128K output) and pricing maps ($5/$30, cached $0.50). Also pinned explicit pricing keys for the still-live legacy aliases (gpt-5-chat-latest, gpt-5.1-chat-latest, gpt-5.2-chat-latest, gpt-5.3-chat-latest). Without them, the new bare chat-latest key would become their longest substring match and overbill them at GPT-5.5 rates.
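The alias collision described in this reply can be sketched as follows. This is a minimal illustration that assumes the pricing lookup picks the longest key contained in the model name; `longestMatch` is a hypothetical stand-in for the real lookup, and the legacy rate is a placeholder, not a published price.

```typescript
interface Rate {
  prompt: number;
  completion: number;
}

/** Hypothetical lookup: the longest table key that is a substring of the model name. */
function longestMatch(model: string, table: Record<string, Rate>): string | undefined {
  return Object.keys(table)
    .filter((key) => model.includes(key))
    .sort((a, b) => b.length - a.length)[0];
}

// Placeholder only: stands in for the cheaper legacy rate, not a real price.
const LEGACY_PLACEHOLDER: Rate = { prompt: 1, completion: 2 };

const unpinnedTable: Record<string, Rate> = {
  'gpt-5.1': LEGACY_PLACEHOLDER,
  'chat-latest': { prompt: 5, completion: 30 }, // GPT-5.5 Instant rate from this PR
};

// Pinning the full alias gives it a key longer than the bare 'chat-latest'.
const pinnedTable: Record<string, Rate> = {
  ...unpinnedTable,
  'gpt-5.1-chat-latest': LEGACY_PLACEHOLDER,
};

const unpinnedKey = longestMatch('gpt-5.1-chat-latest', unpinnedTable); // 'chat-latest'
const pinnedKey = longestMatch('gpt-5.1-chat-latest', pinnedTable); // 'gpt-5.1-chat-latest'
```

In the unpinned table, 'chat-latest' (11 characters) beats 'gpt-5.1' (7 characters), which is the overbilling case the explicit keys prevent.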

- 'gpt-5.4-pro': { prompt: 5, completion: 30 },
+ 'gpt-5.4-pro': { prompt: 30, completion: 180 },
+ 'gpt-5.4-mini': { prompt: 0.75, completion: 4.5 },
+ 'gpt-5.5': { prompt: 5, completion: 30 },

P2 Badge Account for GPT-5.5 long-context pricing

For GPT-5.5, OpenAI publishes separate long-context rates ($10/$45, cached $1) in addition to the short-context rates added here (https://developers.openai.com/api/docs/pricing). Since this commit also gives gpt-5.5 a ~1M context window, any request that enters the long-context tier will still be billed internally at the short-context $5/$30 rates because there is no premiumTokenValues entry keyed by inputTokenCount, causing under-accounting for large prompts.

Owner Author

Confirmed and fixed in 3968115 — added premiumTokenValues entries: gpt-5.5 at $10/$45 and gpt-5.4 at $5/$22.50 (its published long-context tier follows the same 2x input / 1.5x output structure), both with a 272K threshold per the model page: "prompts with >272K input tokens are priced at 2x input and 1.5x output for the full session." Cached-read tiering isn't representable in the current cache map structure, so long-context cached reads still bill at the standard rate — same limitation as the existing gemini entry.
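A minimal sketch of the tier selection described in this reply, using the gpt-5.5 figures quoted in the thread. The `rateFor` function and the map names are hypothetical, and the strict "greater than" comparison is an assumption based on the quoted ">272K input tokens" wording; the real `premiumTokenValues` handling may differ.

```typescript
interface Rate {
  prompt: number;
  completion: number;
}

interface PremiumRate extends Rate {
  threshold: number; // input-token count above which the premium tier applies
}

// Standard and long-context rates (USD per 1M tokens) quoted in this thread.
const standardRates: Record<string, Rate> = {
  'gpt-5.5': { prompt: 5, completion: 30 },
};

const premiumRates: Record<string, PremiumRate> = {
  'gpt-5.5': { threshold: 272_000, prompt: 10, completion: 45 },
};

/** Hypothetical selector: premium rate when the prompt exceeds the threshold. */
function rateFor(model: string, inputTokenCount: number): Rate | undefined {
  const tier = premiumRates[model];
  if (tier !== undefined && inputTokenCount > tier.threshold) {
    return { prompt: tier.prompt, completion: tier.completion };
  }
  return standardRates[model];
}

const shortPrompt = rateFor('gpt-5.5', 272_000); // at the threshold: standard tier
const longPrompt = rateFor('gpt-5.5', 272_001); // above the threshold: premium tier
```

As the reply notes, cached reads are not tiered in this scheme, so the sketch leaves cache rates out entirely.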


const sharedOpenAIModels = [
'gpt-5.5',
'gpt-5.5-pro',

P2 Badge Don't expose GPT-5.5 Pro with streaming enabled

OpenAI documents gpt-5.5-pro with streaming as unsupported (https://developers.openai.com/api/docs/models/gpt-5.5-pro), while getOpenAIConfig defaults streaming = true for OpenAI requests. Adding this model to the default list means a user who selects it in the normal OpenAI flow will send a streaming request by default and hit an API failure unless they know to disable streaming first.

Owner Author

Confirmed and fixed in 3968115: getOpenAILLMConfig now forces streaming: false for pro reasoning models (o1-pro, o3-pro, gpt-5-pro, gpt-5.x-pro), with spec coverage for both the pro and non-pro paths. This also covers gpt-5.4-pro, which was already in the default list before this PR with the same constraint.

Owner Author

Update: the endpoint/handler fix for this was reverted in 1c4d1fd — this PR is intentionally trimmed to model-catalog changes only (lists, token windows, pricing, fetched-model filter). The OpenAI-compat endpoint/handler surface is moving out of LibreChat into the agents SDK, where this behavior belongs.

'gpt-5.1',
'gpt-5.1-chat-latest',
'gpt-5.1-codex',
'gpt-5.1-codex-max',

P2 Badge Remove deprecated Codex model from defaults

OpenAI now marks gpt-5.1-codex-max as deprecated in its model page (https://developers.openai.com/api/docs/models/gpt-5.1-codex-max). Adding it to the default OpenAI model list exposes a deprecated option to new configurations, so users can select a model that is already on the removal path instead of the current gpt-5.3-codex entry added next to it.

Owner Author

Keeping this one — the model page only marks a dated snapshot as deprecated; the default gpt-5.1-codex-max listing carries no deprecation marking and the page still describes it as the current default for long-running agentic coding tasks, with no recommended replacement. The deprecations page doesn't list it either. (For contrast, gpt-5.2-codex is marked deprecated at the model level, which is why it was deliberately left out of this PR.)

- Replace nonexistent gpt-5.5-chat-latest with the actual chat-latest
  alias; register its context window, output cap, pricing, and cache
  rates, and pin explicit rates for legacy gpt-5.x-chat-latest aliases
  so the new chat-latest key cannot out-match their cheaper pricing
- Add long-context premium tiers (>272K input) for gpt-5.5 and gpt-5.4
- Disable streaming for pro reasoning models (o1-pro, gpt-5.x-pro),
  which OpenAI does not support, with spec coverage
@danny-avila

Owner Author

@codex review

Pushed fixes for the review feedback in 3968115: swapped in the real chat-latest slug with full token/pricing registration (plus explicit keys so legacy gpt-5.x-chat-latest aliases keep their cheaper rates), added >272K long-context premium pricing tiers for gpt-5.5 and gpt-5.4, and disabled streaming for pro reasoning models in getOpenAILLMConfig with spec coverage. Kept gpt-5.1-codex-max — only its dated snapshot is deprecated, not the model (details in the thread replies).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3968115670

const sharedOpenAIModels = [
'gpt-5.5',
'gpt-5.5-pro',
'chat-latest',

P2 Badge Include chat-latest in fetched OpenAI model filtering

Adding chat-latest to the default list does not make it available for the common server-key path: getOpenAIModels calls fetchOpenAIModels, and when the OpenAI /models response is non-empty it filters official OpenAI results with /(text-davinci-003|gpt-|o\d+)/ in packages/api/src/endpoints/models.ts:276-279. Because chat-latest does not match that regex and fallback defaults are only used on an empty response, deployments that fetch models from OpenAI will drop this new slug from the model picker.

Owner Author

Confirmed and fixed in ea432bc — added chat-latest to the official-OpenAI filter regex in fetchOpenAIModels, with a spec covering that it survives filtering while dall-e-3/gpt-realtime-2 are still excluded.
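The effect of the filter change can be sketched as below. The first pattern is the one quoted in the review comment; the second is an assumed form of the fix (the exact regex in ea432bc is not shown in this thread). Other exclusions the real code applies, such as the one that drops gpt-realtime-2, are outside this sketch.

```typescript
// Pattern quoted in the review comment for official OpenAI model IDs.
const filterBefore = /(text-davinci-003|gpt-|o\d+)/;

// Assumed shape of the fix: the same pattern with the bare alias added.
const filterAfter = /(text-davinci-003|gpt-|o\d+|chat-latest)/;

// Made-up sample of IDs a /models response might contain.
const fetchedIds = ['gpt-5.5', 'o3', 'chat-latest', 'dall-e-3'];

const keptBefore = fetchedIds.filter((id) => filterBefore.test(id));
const keptAfter = fetchedIds.filter((id) => filterAfter.test(id));
```

With the original pattern, chat-latest is dropped because it contains neither `gpt-` nor an `o` followed by a digit; the extended pattern keeps it while dall-e-3 is still filtered out.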

/** Pro reasoning models (o1-pro, gpt-5.x-pro) do not support streaming */
if (
typeof modelOptions.model === 'string' &&
/\b(?:o[13]|gpt-5(?:\.\d+)?)-pro\b/.test(modelOptions.model)

P2 Badge Drop sampling params for versioned pro models

This identifies gpt-5.4-pro/gpt-5.5-pro as pro reasoning models only for streaming, but those same IDs fall through the later sampling-parameter removal because the regex at packages/api/src/endpoints/openai/llm.ts:668 explicitly excludes versioned gpt-5.x models. When a user selects one of the new pro defaults with the normal temperature/topP settings still present, LibreChat will send unsupported sampling parameters to a reasoning model and the OpenAI request can fail instead of just disabling streaming.

Owner Author

Confirmed and fixed in ea432bc — extracted the pro-model check into an exported isProReasoningModel helper and added it to the sampling-exclusion condition, so versioned pro models (gpt-5.4-pro, gpt-5.5-pro) now drop temperature/topP/penalties like other reasoning models. Both are added to the reasoning-model spec matrix.

Owner Author

Update: the endpoint/handler fix for this was reverted in 1c4d1fd — this PR is intentionally trimmed to model-catalog changes only (lists, token windows, pricing, fetched-model filter). The OpenAI-compat endpoint/handler surface is moving out of LibreChat into the agents SDK, where this behavior belongs.
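For reference, the helper discussed in this thread can be sketched as follows. The regex is the one shown in the diff excerpt; the function body and the `resolveStreaming` wrapper are assumptions for illustration, and the actual change was reverted from this PR in 1c4d1fd.

```typescript
// Regex from the diff excerpt: o1-pro, o3-pro, gpt-5-pro, and versioned gpt-5.x-pro.
const PRO_REASONING_PATTERN = /\b(?:o[13]|gpt-5(?:\.\d+)?)-pro\b/;

/** Sketch of the helper named in the thread; the real signature is not shown there. */
function isProReasoningModel(model: unknown): boolean {
  return typeof model === 'string' && PRO_REASONING_PATTERN.test(model);
}

/** Hypothetical wrapper: force non-streaming for pro reasoning models. */
function resolveStreaming(model: unknown, requested: boolean): boolean {
  return isProReasoningModel(model) ? false : requested;
}

const proStreams = resolveStreaming('gpt-5.5-pro', true); // false
const baseStreams = resolveStreaming('gpt-5.5', true); // true
```

Sharing one predicate between the streaming guard and the sampling-parameter exclusion is what lets the versioned pro models follow the same path as o1-pro and o3-pro.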

typeof modelOptions.model === 'string' &&
/\b(?:o[13]|gpt-5(?:\.\d+)?)-pro\b/.test(modelOptions.model)
) {
llmConfig.streaming = false;

P2 Badge Propagate pro streaming disable to agent routes

Setting only llmConfig.streaming = false does not disable SSE handling for agent runs: both agent controllers decide whether to stream from model_parameters.disableStreaming (packages/api/src/agents/openai/service.ts:462-464 and api/server/controllers/agents/openai.js:431-432). When a user runs one of these pro models through the agents endpoint with stream: true, the HTTP route still enters the streaming path even though the initialized LLM config says not to stream, so the new guard is not honored in that context.

Owner Author

Confirmed and fixed in ea432bc — both agent chat-completions routes (the TS service and the legacy JS controller) now factor isProReasoningModel(model_parameters.model) into their streamingDisabled decision alongside the explicit disableStreaming flag, so pro models take the aggregator/non-SSE path even when the request asks for streaming.

Owner Author

Update: the endpoint/handler fix for this was reverted in 1c4d1fd — this PR is intentionally trimmed to model-catalog changes only (lists, token windows, pricing, fetched-model filter). The OpenAI-compat endpoint/handler surface is moving out of LibreChat into the agents SDK, where this behavior belongs.

- Allow chat-latest through the official OpenAI fetched-model filter
- Export isProReasoningModel and drop unsupported sampling parameters
  for versioned pro models (gpt-5.4-pro, gpt-5.5-pro), which the
  versioned-model exemption previously let through
- Honor the pro-model streaming disable in both agent chat-completions
  routes, which decide SSE from model_parameters before llmConfig exists
- Update models.spec default-list assertions for the refreshed defaults
  and cover chat-latest filter retention
@danny-avila

Owner Author

@codex review

Round-2 fixes pushed in ea432bc: chat-latest now passes the fetched-model filter, versioned pro models drop unsupported sampling parameters via a shared isProReasoningModel helper, and both agent chat-completions routes honor the pro-model streaming disable. Also updated the stale models.spec default-list assertions that failed CI shard 4.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ea432bc56b

const sharedOpenAIModels = [
'gpt-5.5',
'gpt-5.5-pro',
'chat-latest',

P2 Badge Handle max-token conversion for chat-latest

When users select the new chat-latest default and set a max token limit, getOpenAILLMConfig still leaves it on llmConfig.maxTokens because the GPT-5 conversion guard only matches /\bgpt-[5-9]/ in packages/api/src/endpoints/openai/llm.ts. Since this alias is being treated as the latest GPT-5 chat model, those requests will send the legacy max-token field instead of max_completion_tokens/max_output_tokens, unlike the other GPT-5 defaults, and can fail only for conversations that configure a max token limit.

Owner Author

Confirmed and fixed in 13aa046 — the max-token conversion guard now also matches chat-latest, so it sends max_completion_tokens (or max_output_tokens on the Responses API) like the rest of the GPT-5 family. Spec added.

Owner Author

Update: the endpoint/handler fix for this was reverted in 1c4d1fd — this PR is intentionally trimmed to model-catalog changes only (lists, token windows, pricing, fetched-model filter). The OpenAI-compat endpoint/handler surface is moving out of LibreChat into the agents SDK, where this behavior belongs.
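The max-token conversion discussed in this thread might look roughly like the sketch below. The `/\bgpt-[5-9]/` guard is quoted from the review comment; how the reverted fix actually matched chat-latest is not shown, so the exact-string check here is an assumption, as are the function names.

```typescript
// Guard quoted in the review comment: GPT-5 and later model IDs.
const GPT5_PLUS_PATTERN = /\bgpt-[5-9]/;

/** Assumed extension of the guard: also accept the bare chat-latest alias. */
function needsTokenFieldConversion(model: string): boolean {
  return GPT5_PLUS_PATTERN.test(model) || model === 'chat-latest';
}

/** Hypothetical builder: choose the request field that carries the token limit. */
function tokenLimitField(
  model: string,
  maxTokens: number,
  useResponsesApi: boolean,
): Record<string, number> {
  if (!needsTokenFieldConversion(model)) {
    return { maxTokens }; // legacy field left unchanged
  }
  return useResponsesApi
    ? { max_output_tokens: maxTokens }
    : { max_completion_tokens: maxTokens };
}

const chatLatestField = tokenLimitField('chat-latest', 1024, false);
const responsesField = tokenLimitField('gpt-5.5', 1024, true);
const legacyField = tokenLimitField('some-legacy-model', 1024, false); // made-up ID
```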

Comment on lines +466 to +467
const streamingDisabled =
!!initializedParams?.disableStreaming || isProReasoningModel(initializedParams?.model);

P2 Badge Don't force pro models onto an empty nonstream path

In this TypeScript chat-completions service, setting streamingDisabled for pro models makes a streamed request take the non-streaming branch, but that branch never installs aggregator handlers (eventHandlers is {} unless isStreaming is true), so the later aggregator.getText() response stays empty even though the run produced output. This only affects pro-model requests through this route that ask for stream: true; the JS controller has separate non-streaming aggregation handlers, but this service does not.

Owner Author

Confirmed and fixed in 13aa046 — added createOpenAIAggregatorHandlers (message/reasoning deltas, tool-call accumulation, usage capture feeding the content aggregator) and wired it into the service's non-streaming branch. Worth noting this path was returning empty responses for ALL stream: false requests through this service, not just the new pro-model case — the aggregator existed but no handlers ever fed it. Covered with a spec that drives the handlers through a mocked run and asserts the aggregated body and usage, plus one asserting pro models take the JSON path when stream: true is requested.

Owner Author

Update: the endpoint/handler fix for this was reverted in 1c4d1fd — this PR is intentionally trimmed to model-catalog changes only (lists, token windows, pricing, fetched-model filter). The OpenAI-compat endpoint/handler surface is moving out of LibreChat into the agents SDK, where this behavior belongs.

  modelOptions.model &&
- /\b(o[13]|gpt-5)(?!\.|-chat)(?:-|$)/.test(modelOptions.model as string)
+ (/\b(o[13]|gpt-5)(?!\.|-chat)(?:-|$)/.test(modelOptions.model as string) ||
+ isProReasoningModel(modelOptions.model))

P2 Badge Drop snake_case sampling params for pro models

When endpoint settings or the UI provide OpenAI parameters, the sampling fields use snake_case names such as top_p and logit_bias; this new pro-model branch still only adds the camelCase keys (topP, logitBias) to reasoningExcludeParams, so gpt-5.4-pro/gpt-5.5-pro requests can still carry unsupported sampling parameters and be rejected whenever those options are configured.

Owner Author

Confirmed and fixed in 13aa046 — added top_p, logit_bias, frequency_penalty, and presence_penalty to the reasoning exclusion list, so addParams-sourced snake_case values are stripped from modelKwargs for pro models (and the o-series/gpt-5 base models sharing that branch, which had the same latent gap). Spec covers addParams-provided snake_case params being dropped for gpt-5.5-pro.

Owner Author

Update: the endpoint/handler fix for this was reverted in 1c4d1fd — this PR is intentionally trimmed to model-catalog changes only (lists, token windows, pricing, fetched-model filter). The OpenAI-compat endpoint/handler surface is moving out of LibreChat into the agents SDK, where this behavior belongs.
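The gap this thread describes, where only camelCase keys were excluded, can be illustrated with a small sketch. The snake_case names and `topP`/`logitBias` come from the thread; the camelCase penalty names and the `stripSamplingParams` helper are assumptions, and the real exclusion list (reverted in 1c4d1fd) may contain other keys.

```typescript
// Exclusion list covering both spellings. The camelCase penalty names are assumed.
const reasoningExcludeParams = new Set<string>([
  'temperature',
  'topP',
  'logitBias',
  'frequencyPenalty',
  'presencePenalty',
  'top_p',
  'logit_bias',
  'frequency_penalty',
  'presence_penalty',
]);

/** Hypothetical helper: copy params, dropping any excluded sampling key. */
function stripSamplingParams(params: Record<string, unknown>): Record<string, unknown> {
  const kept: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(params)) {
    if (!reasoningExcludeParams.has(key)) {
      kept[key] = value;
    }
  }
  return kept;
}

// Made-up request mixing both spellings, as addParams-sourced values might.
const strippedParams = stripSamplingParams({
  model: 'gpt-5.5-pro',
  topP: 0.9,
  top_p: 0.9,
  logit_bias: {},
  max_output_tokens: 100,
});
```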

- Convert max_tokens for chat-latest, which the gpt-[5-9] guard missed
- Drop snake_case sampling params (top_p, logit_bias, penalties) in the
  reasoning-model exclusion list so addParams-sourced values are removed
- Add createOpenAIAggregatorHandlers and wire them into the agent
  chat-completions service's non-streaming branch, which previously ran
  with no handlers and always returned an empty aggregated response
@danny-avila

Owner Author

@codex review

Round-3 fixes pushed in 13aa046: chat-latest now gets the max-token field conversion, snake_case sampling params are stripped for reasoning/pro models, and the agent chat-completions service has proper non-streaming aggregation handlers (fixing empty responses on that path generally, not just for pro models).

- Sort type import first in service.spec.ts per import-order convention
- Register isProReasoningModel in the openai controller spec's
  @librechat/api mock factory, whose enumerated exports left the new
  helper undefined and broke the non-streaming flow under test

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 13aa04677a

constructor(private aggregator: OpenAIContentAggregator) {}

handle(_event: string, data: MessageDeltaData): void {
const content = data?.content;

P1 Badge Read agent deltas from delta.content

In the non-streaming agent path, the new aggregator ignores the shape emitted by the existing agent event handlers: both the legacy OpenAI controller and the Responses service read message/reasoning chunks from data.delta.content, and the stream integration tests use that same envelope. When a normal non-streaming request (and now every forced non-streaming pro reasoning request) receives those events, data.content is undefined here, so all text chunks are dropped and the final chat.completion can be returned with content: null despite the model producing output.

Owner Author

Correct finding — the aggregator handlers did assume the streaming-handler shape rather than the delta.content envelope the agent graph actually emits. However, the aggregator handlers have been removed in 1c4d1fd: the PR scope is trimmed to the model catalog only, and the OpenAI-compat handler surface is moving out of LibreChat into the agents SDK, where this (and the equivalent shape mismatch in the existing streaming handler classes) should be addressed.

const handlers: Record<string, EventHandler> = {
[GraphEvents.ON_MESSAGE_DELTA]: new OpenAIAggregatorMessageDeltaHandler(aggregator),
[GraphEvents.ON_REASONING_DELTA]: new OpenAIAggregatorReasoningDeltaHandler(aggregator),
[GraphEvents.ON_RUN_STEP_DELTA]: new OpenAIAggregatorRunStepDeltaHandler(aggregator),

P2 Badge Track tool starts for non-streaming calls

For non-streaming agent requests that surface a tool call, this registers only on_run_step_delta, but the existing controller and Responses handlers derive the tool id/name from on_run_step.stepDetails.tool_calls and then append only args from later deltas. With the normal two-event shape, this aggregator never creates the tool call entry unless a delta happens to repeat id, so the final OpenAI-compatible response can omit tool_calls and report finish_reason: "stop" for a tool-call turn.

Owner Author

Correct on the two-event shape (on_run_step carries id/name via stepDetails, deltas append args) — but the aggregator handlers were removed in 1c4d1fd as part of trimming this PR to model-catalog scope. To be handled in the agents SDK migration of this surface.

Revert the OpenAI endpoint and agent handler changes (pro-model
streaming, sampling exclusions, non-streaming aggregation) — that
surface is moving out of LibreChat into the agents SDK and belongs
in its own change. Keep the model list, token windows, pricing, and
the fetched-model filter for chat-latest.
@danny-avila

Owner Author

@codex review

Scope trimmed in 1c4d1fd: this PR is now model-catalog only — default model lists, token windows, pricing (incl. gpt-5.4-pro correction, long-context tiers, chat-latest alias keys), and the fetched-model filter. All endpoint/handler changes (pro-model streaming, sampling exclusions, non-streaming aggregation) were reverted; that surface is moving into the agents SDK and the related findings are deferred there. PR description updated to match.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1c4d1fd297

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/api/src/utils/tokens.ts Outdated
Comment on lines 58 to 59
'gpt-5.4': 272000, // standard context; 1M experimental available via API opt-in (2x rate)
'gpt-5.4-pro': 272000, // same window as gpt-5.4

P2 Badge Use the full GPT-5.4 context window

For gpt-5.4/gpt-5.4-pro, this value is used by getModelMaxTokens() and then by clients as maxContextTokens, so any prompt above 272K tokens gets truncated before it can use the long-context tier this PR adds. OpenAI’s model/pricing docs list GPT-5.4 and GPT-5.4 Pro with a 1.05M context window and a pricing breakpoint at 272K (https://developers.openai.com/api/docs/models/gpt-5.4), so the max-token map should represent the 1.05M capacity rather than the billing threshold.

Owner Author

Confirmed and fixed in 70c841a — both gpt-5.4 and gpt-5.4-pro now map to the documented 1,050,000 window. The 272K value was carried over from the pre-release note treating long context as an opt-in; the current model page lists 1.05M as standard with 272K as the pricing breakpoint only.

Comment on lines +365 to +366
'gpt-5.4': { threshold: 272000, prompt: 5, completion: 22.5 },
'gpt-5.5': { threshold: 272000, prompt: 10, completion: 45 },

P1 Badge Apply long-context pricing to pro models

When a gpt-5.4-pro or gpt-5.5-pro request exceeds 272K input tokens, getValueKey() resolves to the exact pro key added above, but premiumTokenValues only contains the non-pro keys, so getPremiumRate() never applies the long-context pro rates. OpenAI’s pricing table includes long-context columns for both pro models (https://developers.openai.com/api/docs/pricing), so these sessions would be billed at the short-context $30/$180 rate instead of $60/$270.

Owner Author

Partially confirmed, fixed in 70c841a — added gpt-5.4-pro long-context premium rates ($60/$270 above 272K), whose model page states the 2x input / 1.5x output rule verbatim. gpt-5.5-pro was left out deliberately: its model page documents flat $30/$180 pricing with no long-context tier or breakpoint (checked twice), so applying $60/$270 there would overbill. Happy to add it if OpenAI publishes a tier for it.

'gpt-5.3': { write: 1.75, read: 0.175 },
'gpt-5.4': { write: 2.5, read: 0.25 },
'gpt-5.4-mini': { write: 0.75, read: 0.075 },
'gpt-5.5': { write: 5, read: 0.5 },

P2 Badge Charge cached GPT-5.5 long context at the long rate

For gpt-5.5 requests above the 272K-token long-context breakpoint, cached input is billed at $1.00/1M rather than $0.50/1M in OpenAI’s pricing table (https://developers.openai.com/api/docs/pricing), but getCacheMultiplier() has no input-token-count path and will always return this short-context read value. That undercounts any cached portion of long-context GPT-5.5 sessions even though prompt/output long-context rates are handled separately in premiumTokenValues.

Owner Author

Real but deferred — getCacheMultiplier() has no input-token-count path: its callers in the transaction layer don't carry prompt size, so tiered cache pricing isn't representable without extending the billing mechanism (the same limitation applies to the existing gemini premium entry). That's beyond this catalog-scoped PR; flagging it for the billing-mechanism follow-up. The long-context prompt/output rates themselves are handled via premiumTokenValues.
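To size the gap for the follow-up, here is a rough worked example using the two cached-input rates quoted in the finding above ($0.50 and $1.00 per 1M tokens for gpt-5.5). The helper `cachedReadCost` and the 200,000-token figure are illustrative assumptions, not part of the billing code.

```typescript
// Rates quoted in the review finding, in USD per 1M cached input tokens.
const CACHE_READ_STANDARD = 0.5; // at or below the 272K breakpoint
const CACHE_READ_LONG_CONTEXT = 1.0; // above the 272K breakpoint

// Hypothetical helper: cost of cached reads at a given per-million rate.
function cachedReadCost(cachedTokens: number, ratePerMillion: number): number {
  return (cachedTokens / 1e6) * ratePerMillion;
}

// Assumed scenario: a long-context request in which 200,000 input tokens are cache hits.
const cachedTokens = 200000;
const billedToday = cachedReadCost(cachedTokens, CACHE_READ_STANDARD);
const billedIfTiered = cachedReadCost(cachedTokens, CACHE_READ_LONG_CONTEXT);

// The difference is what a threshold-unaware cache multiplier leaves uncharged.
const undercount = billedIfTiered - billedToday;
console.log({ billedToday, billedIfTiered, undercount });
```

Under these assumptions the cached portion is charged at half the long-context rate, so the undercount grows linearly with the number of cached tokens.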

- Set gpt-5.4 and gpt-5.4-pro context to the documented 1,050,000
  window — 272K is the long-context pricing breakpoint, not the cap,
  and using it truncated prompts before they could reach that tier
- Add gpt-5.4-pro long-context premium rates ($60/$270 above 272K)
  per its model page; gpt-5.5-pro documents no long-context tier
@danny-avila

Owner Author

@codex review

Round-5 fixes pushed in 70c841a: gpt-5.4/gpt-5.4-pro context windows corrected to the documented 1.05M (272K is the pricing breakpoint, not the cap), and gpt-5.4-pro long-context premium rates added ($60/$270). gpt-5.5-pro intentionally has no premium entry — its model page documents no long-context tier. Tiered cached-input pricing is deferred: the cache multiplier path carries no token counts, so it needs a billing-mechanism change outside this catalog PR.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 70c841a946

// TODO: gpt-5.4-pro pricing not yet officially published — verify before release
'gpt-5.4-pro': { prompt: 5, completion: 30 },
'gpt-5.4-pro': { prompt: 30, completion: 180 },
'gpt-5.4-mini': { prompt: 0.75, completion: 4.5 },

P2 Badge Add the missing gpt-5.4-nano pricing key

When OpenAI returns gpt-5.4-nano from /v1/models or an admin configures it manually, this table has no explicit key for it even though the current pricing page lists it as part of the GPT-5.4 family at much lower rates ($0.20 input / $1.25 output, with cached input). Because getValueKey uses the longest substring match over tokenValues, gpt-5.4-nano resolves to the generic gpt-5.4 entry here and is billed at $2.50/$15 instead, so the new catalog refresh will substantially overcharge nano usage.

Owner Author

Confirmed and fixed in ffa6359: gpt-5.4-nano is registered in the default list, pricing ($0.20/$1.25, cached $0.02), and token maps (400K context; an explicit entry was also needed there since the substring match would have given it gpt-5.4's 1.05M window). Good catch: this model hadn't surfaced in any of the catalog sweeps.
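A minimal sketch of why the explicit key matters, assuming a longest-substring lookup as the review describes. `longestMatchKey` is a hypothetical stand-in for `getValueKey`, not its actual implementation, and the table holds only the two rates quoted in this thread.

```typescript
interface Rate { prompt: number; completion: number; }

// Illustrative subset of the pricing table (USD per 1M tokens), from this thread.
const tokenValues: Record<string, Rate> = {
  'gpt-5.4': { prompt: 2.5, completion: 15 },
  'gpt-5.4-nano': { prompt: 0.2, completion: 1.25 },
};

// Hypothetical stand-in for getValueKey: return the longest key contained in the model name.
function longestMatchKey(model: string, table: Record<string, unknown>): string | undefined {
  let best: string | undefined;
  for (const key of Object.keys(table)) {
    if (model.includes(key) && (best === undefined || key.length > best.length)) {
      best = key;
    }
  }
  return best;
}

// Without a nano key, the name falls through to the generic family entry.
const withoutNano: Record<string, Rate> = { 'gpt-5.4': tokenValues['gpt-5.4'] };
console.log(longestMatchKey('gpt-5.4-nano', withoutNano)); // 'gpt-5.4'
console.log(longestMatchKey('gpt-5.4-nano', tokenValues)); // 'gpt-5.4-nano'
```

The same fall-through applies to any map keyed this way, which is why the token map needed its own gpt-5.4-nano entry as well.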


const sharedOpenAIModels = [
'gpt-5.5',
'gpt-5.5-pro',

P2 Badge Hide gpt-5.5-pro until streaming is handled

Exposing gpt-5.5-pro in the default OpenAI model list lets users select it even though the OpenAI model page marks streaming as unsupported, while this endpoint still defaults to streaming = true in packages/api/src/endpoints/openai/config.ts. In the default setup (no explicit OPENAI_MODELS override and streaming left enabled), selecting this newly listed model will send a streaming chat request that the model rejects; either omit it from defaults or add the non-streaming/pro-model handling before advertising it.

Owner Author

Keeping it listed, deliberately — the maintainer descoped endpoint/handler behavior from this PR: the OpenAI-compat surface (including pro-model streaming handling) is moving into the agents SDK, and the catalog should reflect the current frontier lineup. The same constraint class already ships today via gpt-5.4-pro, which predates this PR in the default list; users can disable streaming via model parameters until the SDK-side handling lands.

'gemini-3.1': { threshold: 200000, prompt: 4, completion: 18 },
'gpt-5.4': { threshold: 272000, prompt: 5, completion: 22.5 },
'gpt-5.4-pro': { threshold: 272000, prompt: 60, completion: 270 },
'gpt-5.5': { threshold: 272000, prompt: 10, completion: 45 },

P2 Badge Apply long-context rates to gpt-5.5-pro

For gpt-5.5-pro requests over the 272K-token long-context breakpoint, getMultiplier only switches rates when the model has an entry in premiumTokenValues; the current pricing table lists gpt-5.5-pro long-context standard rates as $60 input / $270 output, but this block only adds the tier for gpt-5.5. As a result, any >272K gpt-5.5-pro session falls back to the standard $30/$180 rates from tokenValues and underbills long-context pro usage.

Owner Author

Confirmed and fixed in ffa6359: gpt-5.5-pro now has the $60/$270 long-context premium entry. This also corrects my reply on the earlier thread: the pricing page's long-context table does list gpt-5.5-pro, even though its model page presents only flat $30/$180 with no breakpoint mention, so the pricing table is the canonical source here.
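For readers following the tier logic, a minimal sketch of threshold-based rate selection using the values from this thread. `resolveRate` is a hypothetical helper, not the actual `getMultiplier`/`getPremiumRate` code, and treating the breakpoint as strictly greater than 272,000 input tokens is an assumption.

```typescript
interface Rate { prompt: number; completion: number; }
interface PremiumRate extends Rate { threshold: number; }

// Standard rates (USD per 1M tokens) quoted in this PR.
const standardRates: Record<string, Rate> = {
  'gpt-5.5': { prompt: 5, completion: 30 },
  'gpt-5.5-pro': { prompt: 30, completion: 180 },
};

// Long-context premium rates, applied above the input-token threshold.
const premiumRates: Record<string, PremiumRate> = {
  'gpt-5.5': { threshold: 272000, prompt: 10, completion: 45 },
  'gpt-5.5-pro': { threshold: 272000, prompt: 60, completion: 270 },
};

// Hypothetical helper: pick the premium tier only when an entry exists
// and the input size exceeds its threshold (assumed strict comparison).
function resolveRate(model: string, inputTokens: number): Rate | undefined {
  const premium: PremiumRate | undefined = premiumRates[model];
  if (premium !== undefined && inputTokens > premium.threshold) {
    return { prompt: premium.prompt, completion: premium.completion };
  }
  return standardRates[model];
}

console.log(resolveRate('gpt-5.5-pro', 100000)); // standard tier
console.log(resolveRate('gpt-5.5-pro', 300000)); // long-context tier
```

A model with no premium entry always resolves to its standard rate regardless of input size, which is how the missing gpt-5.5-pro key led to underbilling.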

- Register gpt-5.4-nano ($0.20/$1.25, cached $0.02, 400K context) in
  the model list, pricing, cache, and token maps — the longest-match
  fallback billed it at gpt-5.4's $2.50/$15
- Add gpt-5.5-pro long-context premium rates ($60/$270 above 272K);
  the pricing table lists the tier even though the model page omits it
@danny-avila

Owner Author

@codex review

Round-6 fixes in ffa6359: added gpt-5.4-nano across the catalog (list, pricing, cache, token maps) and the gpt-5.5-pro long-context premium tier ($60/$270 — the pricing table lists it even though the model page omits the breakpoint, correcting my earlier reply). gpt-5.5-pro stays in the default list per the maintainer's descope decision; pro-model streaming handling lands with the agents SDK migration.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ffa635927f

Comment on lines +367 to +370
'gpt-5.4': { threshold: 272000, prompt: 5, completion: 22.5 },
'gpt-5.4-pro': { threshold: 272000, prompt: 60, completion: 270 },
'gpt-5.5': { threshold: 272000, prompt: 10, completion: 45 },
'gpt-5.5-pro': { threshold: 272000, prompt: 60, completion: 270 },

P2 Badge Apply long-context pricing to cached input

For GPT-5.4/5.5 requests above the 272K-token breakpoint, these new premium entries only affect uncached prompt and completion multipliers; structured cached reads still go through getCacheMultiplier, which has no inputTokenCount and returns the standard cacheTokenValues read rate. In a long-context request with cached input (e.g. read tokens plus total input >272K), the prompt and completion are billed at the premium tier but cached input remains at the below-threshold rate, undercharging those runs unless the cache multiplier is also made threshold-aware.

Owner Author

Duplicate of the earlier cached-input thread on this file — same answer: getCacheMultiplier has no input-token-count path (its transaction-layer callers don't carry prompt size), so threshold-aware cache rates require a billing-mechanism change that is deferred beyond this catalog-scoped PR. The gemini premium entry shares the same limitation.

'gpt-5.1',
'gpt-5.1-chat-latest',
'gpt-5.1-codex',
'gpt-5.1-codex-max',

P2 Badge Remove deprecated Codex-Max from defaults

When the OpenAI model fetch is skipped or fails, this static list is what users can select, so adding gpt-5.1-codex-max reintroduces a model that the current OpenAI model page marks as deprecated (source). That works against the "drop deprecated defaults" change and can leave fallback/user-provided-key setups presenting a model that may fail as soon as OpenAI disables the alias; it should be omitted from defaults unless it is returned by the live models endpoint.

Owner Author

Re-checked the model page just now: the Deprecated label still applies only to a dated snapshot row in the Snapshots section — the model itself carries no deprecation marking, the page describes it as current ("purpose-built for agentic coding") with active pricing, no replacement is recommended, and it is absent from the deprecations page. Same conclusion as the earlier thread on this entry; keeping it. If OpenAI deprecates the model proper (as with gpt-5.2-codex, which this PR deliberately excludes), it should come out in a catalog follow-up.

@danny-avila danny-avila merged commit ca26a2d into dev Jun 10, 2026
27 of 28 checks passed
@danny-avila danny-avila deleted the feat/openai-gpt-5.5-models branch June 10, 2026 00:12
fuuuzzy pushed a commit to fuuuzzy/LibreChat that referenced this pull request Jun 18, 2026
…ts (danny-avila#13636)

* 🛰️ feat: Add GPT-5.5 + Frontier OpenAI Models, Drop Deprecated Defaults

* 🛰️ fix: Address Codex Review on OpenAI Model Refresh

- Replace nonexistent gpt-5.5-chat-latest with the actual chat-latest
  alias; register its context window, output cap, pricing, and cache
  rates, and pin explicit rates for legacy gpt-5.x-chat-latest aliases
  so the new chat-latest key cannot out-match their cheaper pricing
- Add long-context premium tiers (>272K input) for gpt-5.5 and gpt-5.4
- Disable streaming for pro reasoning models (o1-pro, gpt-5.x-pro),
  which OpenAI does not support, with spec coverage

* 🛰️ fix: Address Codex Round-2 Review and CI Spec Failure

- Allow chat-latest through the official OpenAI fetched-model filter
- Export isProReasoningModel and drop unsupported sampling parameters
  for versioned pro models (gpt-5.4-pro, gpt-5.5-pro), which the
  versioned-model exemption previously let through
- Honor the pro-model streaming disable in both agent chat-completions
  routes, which decide SSE from model_parameters before llmConfig exists
- Update models.spec default-list assertions for the refreshed defaults
  and cover chat-latest filter retention

* 🛰️ fix: Address Codex Round-3 Review

- Convert max_tokens for chat-latest, which the gpt-[5-9] guard missed
- Drop snake_case sampling params (top_p, logit_bias, penalties) in the
  reasoning-model exclusion list so addParams-sourced values are removed
- Add createOpenAIAggregatorHandlers and wire them into the agent
  chat-completions service's non-streaming branch, which previously ran
  with no handlers and always returned an empty aggregated response

* 🛰️ ci: Fix Import Order Drift and Controller Spec Mock

- Sort type import first in service.spec.ts per import-order convention
- Register isProReasoningModel in the openai controller spec's
  @librechat/api mock factory, whose enumerated exports left the new
  helper undefined and broke the non-streaming flow under test

* 🛰️ chore: Trim Scope to Model Catalog Changes

Revert the OpenAI endpoint and agent handler changes (pro-model
streaming, sampling exclusions, non-streaming aggregation) — that
surface is moving out of LibreChat into the agents SDK and belongs
in its own change. Keep the model list, token windows, pricing, and
the fetched-model filter for chat-latest.

* 🛰️ fix: Correct GPT-5.4 Context Windows and Pro Long-Context Pricing

- Set gpt-5.4 and gpt-5.4-pro context to the documented 1,050,000
  window — 272K is the long-context pricing breakpoint, not the cap,
  and using it truncated prompts before they could reach that tier
- Add gpt-5.4-pro long-context premium rates ($60/$270 above 272K)
  per its model page; gpt-5.5-pro documents no long-context tier

* 🛰️ fix: Add gpt-5.4-nano and gpt-5.5-pro Long-Context Pricing

- Register gpt-5.4-nano ($0.20/$1.25, cached $0.02, 400K context) in
  the model list, pricing, cache, and token maps — the longest-match
  fallback billed it at gpt-5.4's $2.50/$15
- Add gpt-5.5-pro long-context premium rates ($60/$270 above 272K);
  the pricing table lists the tier even though the model page omits it