Releases · LMCache/lmcache-vllm
v0.6.2.3
What's Changed
- Fix async store problem by @YaoJiayi in #36
- Drop special tokens. by @XbzOnGit in #33
- Add Docker Build Files by @qyy2003 in #35
- Support Turing GPU by @Second222None in #40
- Fix online multi-turn decode cache saving by @YaoJiayi in #41
- Bump version number to 0.6.2.3 by @ApostaC in #46
New Contributors
- @qyy2003 made their first contribution in #35
- @Second222None made their first contribution in #40
Full Changelog: v0.6.2.2...v0.6.2.3
v0.6.2.2
New version: v0.6.2.2
Compatibility:
- vLLM: 0.6.1.post2, 0.6.2
- LMCache: 0.1.3
Key Features
- Support for chunked prefill in vLLM (see the usage sketch after this list)
- Faster KV loading for multi-turn conversation by saving KV at the decoding time
- Experimental KV blending feature to enable reusing non-prefix KV caches
- New model support: llama-3.1 and qwen-2
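The sketch below illustrates how these features come together in an offline-inference script. It is a hedged illustration, not an official example: it assumes the `lmcache_vllm` package exposes a drop-in `lmcache_vllm.vllm` module mirroring vLLM's `LLM`/`SamplingParams` API, and the model name and prompts are placeholders.

```python
# Minimal sketch (assumptions: lmcache_vllm provides a drop-in vLLM wrapper
# under lmcache_vllm.vllm, compatible vLLM 0.6.2 and LMCache 0.1.3 are
# installed, and the model name is only illustrative).
from lmcache_vllm.vllm import LLM, SamplingParams

# enable_chunked_prefill is a standard vLLM engine option; this release adds
# LMCache compatibility with chunked prefill and also saves KV produced at
# decode time so multi-turn conversations load faster.
llm = LLM(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    enable_chunked_prefill=True,
)

sampling_params = SamplingParams(temperature=0.0, max_tokens=64)

shared_context = "A long shared document or system prompt...\n"

# Turn 1: the KV cache for the shared context is computed and stored.
print(llm.generate([shared_context + "Question 1: ..."], sampling_params))

# Turn 2: the stored KV cache is loaded instead of being recomputed.
print(llm.generate([shared_context + "Question 2: ..."], sampling_params))
```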
What's Changed
- typo fix on retrive --> retrieve by @KuntaiDu in #14
- bugfix by @YaoJiayi in #15
- Bugfix by @YaoJiayi in #16
- Support saving decode cache by @YaoJiayi in #17
- vllm internal prefix cache compatibility by @XbzOnGit in #19
- Support chunk prefill by @YaoJiayi in #18
- [Refactor] Add support for "dtype" (KV cache storage data type) in LMCacheEngineMetadata by @Alex-q-z in #10
- Fix TP by @YaoJiayi in #22
- Add model specific patches by @YaoJiayi in #23
- Fix store's compatibility with suffix prefill by @YaoJiayi in #24
- Cacheblend integration by @ApostaC in #13
- Reduce memory copy on store by @XbzOnGit in #21
- Fix bug for chunk prefill and vllm internal prefix caching by @YaoJiayi in #27
- Optimize chunk prefill performance by @YaoJiayi in #30
- Bump version number to 0.6.2.2, working with vllm 0.6.2 + lmcache 0.1.3 by @ApostaC in #31
New Contributors
Full Changelog: v0.6.2.post1...v0.6.2.2
v0.6.2.post1
v0.1.1-alpha
Tagging with LMCache v0.1.1-alpha