深海游弋的鱼 – 默默的点滴

前置条件

AMD Ryzen™ 9 5900X × 24
128.0 GB内存、8TB SSD
NVIDIA GeForce RTX™ 3060 12GB
Ubuntu 24.04.4 LTS
Linux 6.8.0-106-generic
NVIDIA driver (open kernel) metapackage nvidia-driver-590-open
nvidia-cuda-toolkit 12.0.140

执行步骤

1. 启用内存压缩 zram ，增加部分 CPU 占用，节约部分宝贵的内存

$ sudo apt install zram-config

1	$ sudo apt install zram-config

2. 安装 NVIDIA CUDA

$ sudo apt install nvidia-cuda-toolkit

1	$ sudo apt install nvidia-cuda-toolkit

3. 编译 llama.cpp

$ apt-get update

$ apt-get install pciutils build-essential cmake curl libcurl4-openssl-dev -y

$ git clone https://github.com/ggml-org/llama.cpp

$ cmake llama.cpp -B llama.cpp/build \
    -DBUILD_SHARED_LIBS=OFF -DGGML_CUDA=ON -DLLAMA_CURL=ON

$ cmake --build llama.cpp/build --config Release -j --clean-first --target llama-cli llama-mtmd-cli llama-server llama-gguf-split

$ cp llama.cpp/build/bin/llama-* llama.cpp

$ apt-get update

$ apt-get install pciutils build-essential cmake curl libcurl4-openssl-dev -y

$ git clone https://github.com/ggml-org/llama.cpp

$ cmake llama.cpp -B llama.cpp/build \

-DBUILD_SHARED_LIBS=OFF -DGGML_CUDA=ON -DLLAMA_CURL=ON

$ cmake --build llama.cpp/build --config Release -j --clean-first --target llama-cli llama-mtmd-cli llama-server llama-gguf-split

$ cp llama.cpp/build/bin/llama-* llama.cpp

4. 下载模型文件，国内用户去 ModelScope 魔搭社区搜索下载，国外的 Hugging Face下载比较艰难。

5. 启动模型

./llama-server \
    --model unsloth/NVIDIA-Nemotron-3-Super-120B-A12B-GGUF/UD-Q4_K_XL/NVIDIA-Nemotron-3-Super-120B-A12B-UD-Q4_K_XL-00001-of-00003.gguf \
    --ctx-size 16384 \
    --seed 3407 \
    --prio 2 \
    --temp 0.6 \
    --top-p 0.95 \
    --port 8080 \
    --host 0.0.0.0 \
    --fit on \
    --api-key-file api-keys.txt

./llama-server \

--model unsloth/NVIDIA-Nemotron-3-Super-120B-A12B-GGUF/UD-Q4_K_XL/NVIDIA-Nemotron-3-Super-120B-A12B-UD-Q4_K_XL-00001-of-00003.gguf \

--ctx-size 16384 \

--seed 3407 \

--prio 2 \

--temp 0.6 \

--top-p 0.95 \

--port 8080 \

--host 0.0.0.0 \

--fit on \

--api-key-file api-keys.txt

一	二	三	四	五	六	日
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30

ubuntu 22.04 本地部署大语言模型

前置条件

执行步骤

参考链接

发布者

默默

发表回复取消回复

前置条件

执行步骤

参考链接

发布者

默默

发表回复 取消回复

发表回复取消回复