Command-line arguments for llama-server.
See https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md for the full list of options.
Declarations
Type
open submodule of (attribute set)
Default
{ }
Example
{
batch-size = 512;
ctx-size = 252144;
flash-attn = "on";
host = "0.0.0.0";
model = "/mnt/llms/Foo3.6-27B-UD-Q4_K_XL.gguf";
port = 1337;
spec-draft-n-max = 2;
spec-type = "draft-mtp";
temp = 0.6;
top-k = 20;
top-p = 0.95;
ubatch-size = 256;
}
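
As a rough illustration of how an attribute set like the one above could end up as llama-server arguments, the sketch below renders each attribute as a `--name value` pair. This mapping is an assumption made for illustration; the module's actual rendering is not shown here and may differ (for example in how booleans or lists are handled). The names `args` and `toFlag` are hypothetical, and the snippet assumes `<nixpkgs>` is available on `NIX_PATH`.

```nix
# Sketch only. Evaluate with: nix-instantiate --eval --strict ./args.nix
let
  lib = import <nixpkgs/lib>;

  # Hypothetical subset of the example above.
  args = {
    host = "0.0.0.0";
    port = 1337;
    flash-attn = "on";
  };

  # Assumption: every attribute becomes `--<name> <value>`,
  # with the value converted to a string and shell-escaped.
  toFlag = name: value: "--${name} ${lib.escapeShellArg (toString value)}";
in
# Join the program name and the rendered flags into one command line.
lib.concatStringsSep " " ([ "llama-server" ] ++ lib.mapAttrsToList toFlag args)
```

Because the type is an open submodule, attributes that are not declared explicitly are still accepted, so any option from the linked README can in principle be written this way.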