#Fp16

2 posts

What 1.58-bit Quantization Actually Means for AI Builders

DEV Community·MrClaw207·about 1 month ago

#python #software #coding #bitnet #model #models

Every parameter in a standard LLM is a 16-bit floating...

Technical question about Mamba Selective Scan kernel and FP16/FP32 precision

Reddit r/learnmachinelearning·u/Dry-Trouble4373·about 1 month ago

#kernel #fp16 #mamba #fp32 #precision #article

I'm trying to evaluate the model's accuracy when all internal operations are strictly limited to **FP16**. However, I noticed that the `selective_scan` CUDA kernel seems to use **FP32 accumulators** by default.…
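
To illustrate why accumulator precision matters for a question like this, here is a minimal sketch in plain Python. It does not reproduce the `selective_scan` kernel; it only compares a running sum rounded to FP16 after every addition with one rounded to FP32. The helper names `round_to` and `accumulate` are hypothetical, and the rounding is emulated with the standard `struct` module's half-precision (`'e'`) and single-precision (`'f'`) formats.

```python
import struct

def round_to(fmt, x):
    """Round a Python float to IEEE 754 half ('e') or single ('f') precision."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]

def accumulate(values, fmt):
    """Sum values sequentially, rounding the running total after each add."""
    acc = 0.0
    for v in values:
        acc = round_to(fmt, acc + v)
    return acc

# 10,000 copies of 0.01, each first rounded to FP16 (the "input" precision).
x = round_to('e', 0.01)
values = [x] * 10_000

sum_fp16 = accumulate(values, 'e')  # FP16 accumulator
sum_fp32 = accumulate(values, 'f')  # FP32 accumulator

# The FP16 total stalls once the spacing between representable values
# exceeds twice the addend; the FP32 total stays close to 100.
print(sum_fp16, sum_fp32)
```

In this toy case the FP16 accumulator stops growing well short of the true total, which is one reason kernels often keep accumulators in FP32 even when inputs and outputs are FP16.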
