🚧 feat: Support Guardrail Config Option streamProcessingMode#12815

Merged
danny-avila merged 1 commit into danny-avila:dev from newjersey:dlew/guardrail-stream-processing-mode on May 13, 2026

Conversation

dlew commented Apr 24, 2026

Contributor

Summary

`streamProcessingMode` controls how the guardrail processes the model's response stream. In "sync" mode, the guardrail buffers the response into chunks and evaluates each chunk before returning it to the user. In "async" mode, each chunk is evaluated and sent to the user at the same time, which allows smoother streaming. The cost is that, in some cases, the guardrail can only react after offending content has already started to stream.
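
To make the trade-off concrete, here is a minimal, self-contained sketch of the two modes. It is not LibreChat or Bedrock code: `is_blocked`, `stream_sync`, `stream_async`, `check_lag`, and the keyword-based check are hypothetical stand-ins, and a real guardrail evaluates buffered text with a policy service rather than matching a single word.

```python
from typing import Iterable, Iterator, List

BLOCKED_MESSAGE = "[blocked by guardrail]"


def is_blocked(text: str) -> bool:
    """Hypothetical policy check standing in for a real guardrail evaluation."""
    return "forbidden" in text.lower()


def stream_sync(chunks: Iterable[str]) -> Iterator[str]:
    """Sync mode: every chunk is checked before it is released to the user."""
    for chunk in chunks:
        if is_blocked(chunk):
            yield BLOCKED_MESSAGE
            return
        yield chunk


def stream_async(chunks: Iterable[str], check_lag: int = 1) -> Iterator[str]:
    """Async mode: chunks are released immediately, and the verdict for a chunk
    only arrives `check_lag` chunks later (an assumed, simplified delay model)."""
    pending: List[str] = []
    for chunk in chunks:
        yield chunk  # sent to the user right away
        pending.append(chunk)
        if len(pending) > check_lag and is_blocked(pending.pop(0)):
            yield BLOCKED_MESSAGE
            return
    # The stream has ended: finish checking anything still pending.
    for chunk in pending:
        if is_blocked(chunk):
            yield BLOCKED_MESSAGE
            return


if __name__ == "__main__":
    response = ["Hello ", "forbidden ", "text ", "more"]
    print(list(stream_sync(response)))
    print(list(stream_async(response)))
```

In this sketch the sync stream stops before the offending chunk is shown, while the async stream has already emitted it (plus one more chunk) by the time the block arrives. That is the "only reacting after offending content starts to stream" case described above.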

Related documentation change here: LibreChat-AI/librechat.ai#564

Change Type

Testing

I added streamProcessingMode to an existing test to make sure it's processed by the guardrail configuration code.

I also verified that setting it to `async` in my `librechat.yaml` results in a faster streaming response from Bedrock.

Checklist

Please delete any irrelevant options.

  • My code adheres to this project's style guidelines
  • I have performed a self-review of my own code
  • I have made pertinent documentation changes
  • My changes do not introduce new warnings
  • I have written tests demonstrating that my changes are effective or that my feature works
  • Local unit tests pass with my changes
  • A pull request for updating the documentation has been submitted (here: docs: guardrail config's streamProcessingMode LibreChat-AI/librechat.ai#564)

@dlew dlew force-pushed the dlew/guardrail-stream-processing-mode branch from 6193d0d to 9182d9e Compare April 29, 2026 16:21
@dlew dlew force-pushed the dlew/guardrail-stream-processing-mode branch from 9182d9e to 4ce64d8 Compare May 11, 2026 16:12
@dlew dlew force-pushed the dlew/guardrail-stream-processing-mode branch from 4ce64d8 to 19e6753 Compare May 13, 2026 14:28
@danny-avila danny-avila changed the title feat: support guardrail config option streamProcessingMode 🚧 feat: Support Guardrail Config Option streamProcessingMode May 13, 2026
@danny-avila

Owner

@dlew lgtm, can we maybe make async the default behavior? seems better for most use cases

dlew commented May 13, 2026

Contributor Author

> @dlew lgtm, can we maybe make async the default behavior? seems better for most use cases

@danny-avila The issue is that it's inherently less safe to use async, since it could (briefly) let through content that should've been blocked. So for something like content filtering, I think the default should be whatever is safest (and let people opt into a less safe mode).

Though if you feel strongly about it, I'm fine setting the default the other way.
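
The "safest default" argument can be sketched like this. It is an illustration only, not the merged code: `resolve_stream_processing_mode`, `DEFAULT_MODE`, and the dict-shaped config are hypothetical, and only the option name `streamProcessingMode` and its "sync"/"async" values come from this PR.

```python
from typing import Any, Mapping

VALID_MODES = ("sync", "async")
# Assumption for this sketch: fall back to the mode that never shows unchecked text.
DEFAULT_MODE = "sync"


def resolve_stream_processing_mode(guardrail_config: Mapping[str, Any]) -> str:
    """Return the configured mode, defaulting to the safer "sync" when unset."""
    mode = guardrail_config.get("streamProcessingMode", DEFAULT_MODE)
    if mode not in VALID_MODES:
        raise ValueError(
            f"streamProcessingMode must be one of {VALID_MODES}, got {mode!r}"
        )
    return mode


if __name__ == "__main__":
    print(resolve_stream_processing_mode({}))
    print(resolve_stream_processing_mode({"streamProcessingMode": "async"}))
```

With this shape, an admin who wants smoother streaming has to opt in explicitly, and a config that omits the option keeps the stricter behavior.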

@danny-avila danny-avila merged commit 83ea3ef into danny-avila:dev May 13, 2026
10 checks passed
@danny-avila

Owner

fair enough, merged!

@dlew dlew deleted the dlew/guardrail-stream-processing-mode branch May 13, 2026 19:20
fuuuzzy pushed a commit to fuuuzzy/LibreChat that referenced this pull request May 15, 2026
…-avila#12815)
patricia2510 pushed a commit to lexaeon-org/libre-chat that referenced this pull request May 21, 2026
…-avila#12815)