
StreamerBase: add write tokens vector #1769

Open · wants to merge 17 commits into base: master
Conversation

@as-suvorov (Contributor) commented Feb 19, 2025

  • adds StreamingStatus write(const std::vector<int64_t>& tokens) to StreamerBase
  • implements StreamingStatus write(const std::vector<int64_t>& tokens) for TextStreamer
  • switches whisper pipelines to the StreamerBase interface and deprecates the ChunkStreamerBase interface
  • adds tests

@github-actions bot added labels category: LLM (LLM pipeline, stateful/static), category: whisper (Whisper pipeline), category: sampling (Sampling / Decoding algorithms), category: Python API (Python API for GenAI), category: GenAI C++ API (Changes in GenAI C++ public headers), no-match-files on Feb 19, 2025
@as-suvorov changed the title from "StreamerBase: add write vector" to "StreamerBase: add write tokens vector" on Feb 19, 2025
@as-suvorov added this to the 2025.1 milestone on Feb 19, 2025
@as-suvorov requested review from Wovchena, ilya-lavrenov, pavel-esir and apaniukov and removed the request for Wovchena on February 19, 2025 at 16:47
@as-suvorov marked this pull request as ready for review on February 19, 2025 at 16:48
@as-suvorov requested a review from sbalandi on February 19, 2025 at 16:50
"""
Write is called every time new token is decoded. Returns a StreamingStatus flag to indicate whether generation should be stopped or cancelled
Put is called every time new token or vector of tokens is decoded. Returns a bool flag to indicate whether generation should be stopped, if return true generation stops
Contributor:

Why does it return bool?

Contributor Author:

It was copy-pasted; thanks, fixed.

@sbalandi (Contributor) left a comment:

LGTM

@github-actions bot added the category: samples (GenAI samples) label on Feb 20, 2025
public:
    StreamingStatus write(int64_t token) override;
    StreamingStatus write(const std::vector<int64_t>& tokens) override;
Contributor:

FYI @mzegla, as you requested such a method.

Collaborator:

Yes, that should remove the need for our workaround for streaming the prompt back. Thanks.

Labels
  • category: GenAI C++ API (Changes in GenAI C++ public headers)
  • category: LLM (LLM pipeline, stateful/static)
  • category: Python API (Python API for GenAI)
  • category: samples (GenAI samples)
  • category: sampling (Sampling / Decoding algorithms)
  • category: whisper (Whisper pipeline)
  • no-match-files
5 participants