LLM inference with Rust

Reddit r/rust·u/ramzeez88·about 1 month ago

Hi all, I have been vibe playing with Candle to run some inference with Qwen 3.5 4B Q4 GGUFs on CPU only. The speed I get is mind-blowing: 3.5 to 6 tok/s with some optimizations. Does anyone have any tips or tricks to gain more t/s?
