LLM inference with Rust


Reddit r/rust·u/ramzeez88·about 1 month ago



Hi all, I have been vibe playing with Candle to run some inference with Qwen 3.5 4B Q4 GGUFs on CPU only. The speed I get is mind-blowing: 3.5 to 6 tok/s with some optimizations. Does anyone have any tips or tricks to gain more t/s?
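For anyone comparing optimizations, it helps to measure tok/s the same way on every run. Below is a minimal sketch that uses only the Rust standard library. The names `tokens_per_second` and `time_generation` are hypothetical helpers, and the `step` closure is a stand-in for one decode step (forward pass plus sampling). It is not Candle's API and not necessarily how the numbers above were measured.

```rust
use std::time::{Duration, Instant};

/// Tokens per second for `tokens` generated over `elapsed`.
/// Returns 0.0 when no time has elapsed, to avoid dividing by zero.
fn tokens_per_second(tokens: usize, elapsed: Duration) -> f64 {
    let secs = elapsed.as_secs_f64();
    if secs > 0.0 {
        tokens as f64 / secs
    } else {
        0.0
    }
}

/// Calls `step` once per token and returns the produced token ids
/// together with the wall-clock time the whole loop took.
/// `step` is a hypothetical placeholder for a real decode step.
fn time_generation<F: FnMut() -> u32>(n_tokens: usize, mut step: F) -> (Vec<u32>, Duration) {
    let start = Instant::now();
    let mut out = Vec::with_capacity(n_tokens);
    for _ in 0..n_tokens {
        out.push(step());
    }
    (out, start.elapsed())
}

fn main() {
    // Dummy step: sleeps briefly instead of running a model.
    let mut next = 0u32;
    let (tokens, elapsed) = time_generation(32, || {
        std::thread::sleep(Duration::from_millis(5));
        next += 1;
        next
    });
    println!(
        "{} tokens in {:.2?} ({:.2} tok/s)",
        tokens.len(),
        elapsed,
        tokens_per_second(tokens.len(), elapsed)
    );
}
```

To use it with a real model, replace the dummy closure with the actual decode step and time prompt processing separately from generation, since the two usually run at different speeds. Timings are only meaningful in an optimized build (`cargo run --release`); debug builds of numeric Rust code are typically far slower.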
