llama.cpp

LLM inference in C/C++

Quick start (llama.cpp)

The quick start for plain llama.cpp follows the original repository. IGNITE's main focus is on-device inference: it is based on llama-cli, guided by llama-completion.

Quick start (IGNITE)

Model download

python downloader.py

This script downloads models pre-selected for evaluation on IGNITE. If none of them fits your needs, you can also download and run your own GGUF models.
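
To use a model that is not in the pre-selected list, you can fetch any GGUF into models/ yourself. A sketch using the Hugging Face CLI; the repo and file names below are assumptions matching the Qwen model used later in this guide:

# Fetch a GGUF from the Hugging Face Hub into models/
# (repo and file names are assumptions; substitute your own).
pip install -U huggingface_hub
huggingface-cli download Qwen/Qwen1.5-0.5B-Chat-GGUF \
    qwen1_5-0_5b-chat-q4_k_m.gguf --local-dir models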

Build (on-device)

cd scripts && sh build-android.sh && cd ..
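
build-android.sh presumably wraps a CMake cross-compile against the Android NDK. A minimal equivalent sketch, following upstream llama.cpp's Android build instructions; the NDK path, ABI, and API level are assumptions:

# Cross-compile for arm64 Android; $ANDROID_NDK must point at an installed NDK.
cmake -B build-android \
    -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
    -DANDROID_ABI=arm64-v8a \
    -DANDROID_PLATFORM=android-28
cmake --build build-android --config Release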

Run (on-device)

chmod +x scripts-termux/run.sh
su -c "sh scripts-termux/run.sh"
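
If the binary was cross-compiled on a host machine rather than built inside Termux, it and a model have to be copied onto the device first. A sketch using adb; the target paths are assumptions and may differ from what run.sh expects:

# Copy the binary and a model onto the device (target paths are assumptions).
adb push build-android/bin/ignite /data/local/tmp/
adb push models/qwen-1.5-0.5b-chat-q4k.gguf /data/local/tmp/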

Build (Linux)

cd scripts && sh build.sh && cd ..
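
build.sh presumably wraps the standard llama.cpp CMake build. If you prefer to invoke CMake directly, the plain upstream equivalent is:

# Standard llama.cpp CMake build (an equivalent sketch, not the script itself).
cmake -B build
cmake --build build --config Release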

Run (Linux)

# Greedy, single-threaded decoding (--temp 0, --top-k 1) in conversation mode.
# --json-path points at the evaluation dataset (here a 30-question HotpotQA
# subset) and --output-path at the CSV where results are written.
./build/bin/ignite \
    -m models/qwen-1.5-0.5b-chat-q4k.gguf \
    -cnv \
    --temp 0 \
    --top-k 1 \
    --threads 1 \
    --output-path outputs/hotpot_0_0.csv \
    --json-path dataset/hotpot_qa_30.json
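
To chat with the same model interactively, outside the evaluation harness, the upstream llama-cli binary should work if the build produced it (a sketch; the binary's presence alongside ignite is an assumption):

# Interactive greedy chat via upstream llama-cli.
./build/bin/llama-cli \
    -m models/qwen-1.5-0.5b-chat-q4k.gguf \
    -cnv \
    --temp 0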

This guide is still being filled in; more documentation is on the way.
