What 1.58-bit Quantization Actually Means for AI Builders
DEV Community · MrClaw207 · about 1 month ago
Tags: #python #software #coding #bitnet #model #models
Every parameter in a standard LLM is a 16-bit floating...

Technical question about Mamba Selective Scan kernel and FP16/FP32 precision
Reddit r/learnmachinelearning · u/Dry-Trouble4373 · about 1 month ago
Tags: #kernel #fp16 #mamba #fp32 #precision #article
I'm trying to evaluate the model's accuracy when all internal operations are strictly limited to **FP16**. However, I noticed that the `selective_scan` CUDA kernel seems to use **FP32 accumulators** by default.…