[Docs] Update Supported Matrix (#679)
* update supported matrix

* change the default shard size when saving quantized weights

* baichuan2 kv8
pppppM authored Nov 13, 2023
1 parent ab1767c commit e641dd8
Showing 3 changed files with 7 additions and 7 deletions.
README.md (6 changes: 3 additions & 3 deletions)
```diff
@@ -66,10 +66,10 @@ LMDeploy is a toolkit for compressing, deploying, and serving LLM, developed by
 | SOLAR | Yes | Yes | Yes | Yes | No |
 | InternLM-7B | Yes | Yes | Yes | Yes | No |
 | InternLM-20B | Yes | Yes | Yes | Yes | No |
-| QWen-7B | Yes | Yes | Yes | No | No |
-| QWen-14B | Yes | Yes | Yes | No | No |
+| QWen-7B | Yes | Yes | Yes | Yes | No |
+| QWen-14B | Yes | Yes | Yes | Yes | No |
 | Baichuan-7B | Yes | Yes | Yes | Yes | No |
-| Baichuan2-7B | Yes | Yes | No | No | No |
+| Baichuan2-7B | Yes | Yes | Yes | Yes | No |
 | Code Llama | Yes | Yes | No | No | No |
 
 ### Pytorch
```
README_zh-CN.md (6 changes: 3 additions & 3 deletions)
```diff
@@ -67,10 +67,10 @@ LMDeploy 由 [MMDeploy](https://github.com/open-mmlab/mmdeploy) 和 [MMRazor](ht
 | SOLAR | Yes | Yes | Yes | Yes | No |
 | InternLM-7B | Yes | Yes | Yes | Yes | No |
 | InternLM-20B | Yes | Yes | Yes | Yes | No |
-| QWen-7B | Yes | Yes | Yes | No | No |
-| QWen-14B | Yes | Yes | Yes | No | No |
+| QWen-7B | Yes | Yes | Yes | Yes | No |
+| QWen-14B | Yes | Yes | Yes | Yes | No |
 | Baichuan-7B | Yes | Yes | Yes | Yes | No |
-| Baichuan2-7B | Yes | Yes | No | No | No |
+| Baichuan2-7B | Yes | Yes | Yes | Yes | No |
 | Code Llama | Yes | Yes | No | No | No |
 
 ### Pytorch
```
lmdeploy/lite/apis/auto_awq.py (2 changes: 1 addition & 1 deletion)
```diff
@@ -83,7 +83,7 @@ def auto_awq(model: str,
     smooth_layers(layers, fc2fcs, norm2fcs, act_scales, w_group_size, device)
     quant_weights(model, fcs, w_bits, w_sym, w_group_size, device)
 
-    model.save_pretrained(work_dir)
+    model.save_pretrained(work_dir, max_shard_size='2GB')
     tokenizer.save_pretrained(work_dir)
```
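The one-line code change caps the checkpoint shard size when the quantized weights are written out. Hugging Face's `PreTrainedModel.save_pretrained` splits the state dict across multiple weight files once it exceeds `max_shard_size`, so passing `'2GB'` yields smaller, easier-to-download shards for the 4-bit models. A minimal sketch of the resulting call pattern, with a placeholder model ID and output directory standing in for the `model` and `work_dir` arguments that `auto_awq` actually receives (the quantization steps are elided):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder identifiers for illustration only; auto_awq takes these as
# its `model` and `work_dir` arguments.
model_path = 'internlm/internlm-chat-7b'
work_dir = './internlm-chat-7b-w4'

model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# ... smoothing and weight quantization would happen here ...

# Weights beyond max_shard_size are split into numbered shard files plus
# an index JSON mapping each tensor to its shard.
model.save_pretrained(work_dir, max_shard_size='2GB')
tokenizer.save_pretrained(work_dir)
```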
