#gguf

I built a Rust LLM inference engine with custom WGSL GPU kernels, here's what I learned!

DEV Community: rust·saripalli shanmukha kiran sagar·3 days ago

#dev #rust #gguf #wgpu #project #aether

I've been working on a side project called aether, a Rust LLM inference engine that can load GGUF...

GGUF Quantization Explained: Q4_K_M vs Q5_K_M vs Q8 — Which to Pick (2026)

DEV Community·Patrick Hughes·20 days ago

#gguf #llamacpp #quantization #model #q4_k_m #quality

Q4_K_M cuts model size 75% with minimal quality loss — but when should you use Q5, Q6, or Q8 instead? We benchmarked every quant level on real hardware and measured the actual accuracy tradeoffs.

Build a Local AI Chatbot with Python (No Internet Needed)

DEV Community·Jeffrey.Feillp·30 days ago

#python #ai #machinelearning #opensource #fullscreen #llama

Meet Tian AI: Your Completely Offline AI Assistant for Android

DEV Community·Jeffrey.Feillp·about 1 month ago

#android #ai #opensource #software #tian #gguf

Meet Tian AI — the open-source, completely offline AI assistant for Android that runs entirely on your phone via Termux. No cloud, no data leaks, no subscriptions.

How to Deploy Llama 3.2 7B with GGUF Quantization on a $5/Month DigitalOcean Droplet: Sub-1GB Memory Inference

DEV Community·RamosAI·about 1 month ago

#programming #tutorial #ai #fullscreen #llama #ollama
