ggml - 技术专题深度解读 | GitHub 中文社区

Here are 146 public repositories matching this topic...

ggml-org / llama.cpp

Star

LLM inference in C/C++

ggml

Updated Feb 19, 2026
C++

LostRuins / koboldcpp

Star

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.

llama language-model gemma mistral koboldai llm llamacpp ggml koboldcpp gguf

Updated Feb 19, 2026
C++

Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.

Updated Feb 15, 2026
Python

rustformers / llm

Star

[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models

rust ai ml llm ggml

Updated Jun 24, 2024
Rust

leejet / stable-diffusion.cpp

Star

Diffusion model(SD,Flux,Wan,Qwen Image,Z-Image,...) inference in pure C/C++

flux ai cplusplus image-generation wan diffusion text2image image2image img2img txt2img latent-diffusion stable-diffusion ggml videogeneration flux-dev flux-schnell qwen-image z-image-turbo z-image

Updated Feb 10, 2026
C++

guinmoon / LLMFarm

Star

llama and other large language models on iOS and MacOS offline using GGML library.

macos swift ios ai llama gpt-2 rwkv ggml gptneox starcoder

Updated Jan 30, 2026
C

sammcj / gollama

Sponsor

Star

Go manage your Ollama models

macos linux ai models tui llm ggml ollama gguf

Updated Dec 30, 2025
Go

RWKV / rwkv.cpp

Star

INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model

machine-learning deep-learning quantization language-model llm rwkv ggml

Updated Mar 23, 2025
C++

RahulSChand / gpu_poor

Star

Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization

gpu pytorch llama quantization language-model huggingface llm llamacpp ggml llama2

Updated Dec 3, 2024
JavaScript

PABannier / bark.cpp

Star

Suno AI's Bark model in C/C++ for fast text-to-speech generation

machine-learning text-to-speech inference tts ggml

Updated Nov 16, 2024
C++

pollockjj / ComfyUI-MultiGPU

Star

This custom_node for ComfyUI adds one-click "Virtual VRAM" for any UNet and CLIP loader as well MultiGPU integration in WanVideoWrapper, managing the offload/Block Swap of layers to DRAM *or* VRAM to maximize the latent space of your card. Also includes nodes for directly loading entire components (UNet, CLIP, VAE) onto the device you choose

pytorch unet-pytorch stable-diffusion comfyui ggml comfyui-workflow comfyui-nodes gguf-models wanvideowrapper