Download lmdeploy-v0.2.5:
cd ~/
git clone https://github.com/InternLM/lmdeploy.git
cd lmdeploy
git checkout c5f4014 # 确保为0.2.5版本
Activate conda environment:
conda activate lmdeploy
Install dependencies.
cd ~/lmdeploy
pip install -r requirements/build.txt
Create a new file named generate_jetson.sh
under ~/lmdeploy
, and fill in the following content:
#!/bin/sh
builder="-G Ninja"
if [ "$1" == "make" ]; then
builder=""
fi
cmake ${builder} .. \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_EXPORT_COMPILE_COMMANDS=1 \
-DCMAKE_INSTALL_PREFIX=./install \
-DBUILD_PY_FFI=ON \
-DBUILD_MULTI_GPU=OFF \
-DCMAKE_CUDA_FLAGS="-lineinfo" \
-DUSE_NVTX=ON
Modify file permissions.
chmod +x generate_jetson.sh
Install Ninja.
sudo apt-get install ninja-build
Create a new build folder.
cd ~/lmdeploy
mkdir build && cd build
Compile LMDeploy.
../generate_jetson.sh
ninja install
During the compilation process, the memory may run out and result in Killed
. You can expand the swap capacity as follows, and then execute ninja install again.
# Create a 6 GB swap area. The size can be customized and combined with the disk capacity
sudo fallocate -l 6G /var/swapfile
# Modify file permissions.
sudo chmod 600 /var/swapfile
# Make swap area
sudo mkswap /var/swapfile
# Setup swap area
sudo swapon /var/swapfile
# Setup swap area automatically
sudo bash -c 'echo "/var/swapfile swap swap defaults 0 0" >> /etc/fstab'
Attention: Use vim to edit requirements/runtime.txt
, and delete the lines containing torch<=2.1.2,>=2.0.0
and triton>=2.1.0,<2.2.0
.
Note: To simplify dependencies, we have removed triton
. This also means that when deploying models using lmdeploy, they can only be invoked through the turbomind method, and not through the API method.
Install lmdeploy-v0.2.5 locally.
cd ~/lmdeploy
pip install -e .[serve]