Pull requests: ggerganov/llama.cpp
Pull requests list
Add Fedora CUDA Guide for Development in Toolbox Environment
documentation
Improvements or additions to documentation
#11135
opened Jan 8, 2025 by
teihome
vulkan: optimize coopmat2 q2_k dequant function
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#11130
opened Jan 7, 2025 by
jeffbolznv
llama-bench : add test measuring token generation rate at given prompt length
examples
#11126
opened Jan 7, 2025 by
fairydreaming
SYCL: Refactor ggml_sycl_compute_forward
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#11121
opened Jan 7, 2025 by
qnixsynapse
gguf-py: moved scripts directory
python
python script changes
#11116
opened Jan 7, 2025 by
VJHack
feat(ci): add visionOS build workflow
devops
improvements to build systems and github actions
ggml
changes relating to the ggml tensor library for machine learning
#11103
opened Jan 6, 2025 by
ggerganov
vulkan: scale caching for k quants + misc fixes
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#11081
opened Jan 5, 2025 by
netrunnereve
Remove obsolete HIP workaround
build
Compilation issues
devops
improvements to build systems and github actions
ggml
changes relating to the ggml tensor library for machine learning
nix
Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment
Nvidia GPU
Issues specific to Nvidia GPUs
#11080
opened Jan 5, 2025 by
sARY77
server : POC OAI-compat TTS using OuteTTS
examples
server
#11070
opened Jan 3, 2025 by
ngxson
feat(ci): add visionOS build workflow
devops
improvements to build systems and github actions
#11065
opened Jan 3, 2025 by
sinkingsugar
llama : remove notion of CLS token
python
python script changes
#11064
opened Jan 3, 2025 by
ggerganov
android : Apply chat template
android
Issues specific to Android
examples
#11059
opened Jan 3, 2025 by
Dhruvanand24
CUDA Graph Compute Function Refactor (precursor for performance improvements)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
Add VisionOS compatibility by adding missing type definitions
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#11019
opened Dec 30, 2024 by
sinkingsugar
model: Add support for PhiMoE arch
documentation
Improvements or additions to documentation
enhancement
New feature or request
model
Model specific
python
python script changes
#11003
opened Dec 28, 2024 by
phymbert
Add support for QRWKV6 hybrid models & slight optimization for RWKV6
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#11001
opened Dec 28, 2024 by
MollySophia
Vulkan: Destroy Vulkan instance on exit
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#10989
opened Dec 26, 2024 by
0cc4m
Removed unnecessary iteration of batch n_tokens on sequence embedding…
examples
#10972
opened Dec 25, 2024 by
Emreerdog
Allow user to compile with any cuda version using github actions
devops
improvements to build systems and github actions
#10928
opened Dec 21, 2024 by
jianlins
ASCII/Romanization for OuteTTS Multilingual Processing
demo
Demonstrate some concept or idea, not intended to be merged
examples
#10894
opened Dec 19, 2024 by
edwko