Technical question about Mamba Selective Scan kernel and FP16/FP32 precision

Reddit r/learnmachinelearning · u/Dry-Trouble4373 · about 1 month ago

I'm trying to evaluate the model's accuracy when all internal operations are strictly limited to **FP16**. However, I noticed that the `selective_scan` CUDA kernel seems to use **FP32 accumulators** by default.…
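
To get a feel for why the accumulator precision matters here, the following is a rough sketch in plain Python, not the `selective_scan` CUDA kernel itself. It assumes a scalar recurrence of the general form h_t = a_t * h_{t-1} + bx_t and emulates FP16 or FP32 state by rounding after every multiply and add with the standard `struct` module. The helper names `round_to` and `scan`, the sequence length, and the input distributions are all made up for illustration.

```python
import random
import struct

def round_to(x, fmt):
    """Round a Python float to IEEE half ('e') or single ('f') precision."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]

def scan(a, bx, fmt=None):
    """Run h_t = a_t * h_{t-1} + bx_t and return every state.

    If fmt is given, the product and the sum are each rounded to that
    format, emulating a state kept at that precision. With fmt=None the
    state stays in Python's native double precision (the reference).
    """
    h = 0.0
    states = []
    for a_t, bx_t in zip(a, bx):
        if fmt is None:
            h = a_t * h + bx_t
        else:
            prod = round_to(a_t * h, fmt)
            h = round_to(prod + bx_t, fmt)
        states.append(h)
    return states

random.seed(0)
seq_len = 4096  # illustrative length, not taken from the post

# Inputs are stored in FP16 in all three runs; only the state precision differs.
a = [round_to(random.uniform(0.95, 0.999), "e") for _ in range(seq_len)]
bx = [round_to(random.gauss(0.0, 0.1), "e") for _ in range(seq_len)]

h_ref = scan(a, bx)         # double-precision reference
h_fp32 = scan(a, bx, "f")   # emulated FP32 state
h_fp16 = scan(a, bx, "e")   # emulated FP16 state

err_fp32 = max(abs(x - r) for x, r in zip(h_fp32, h_ref))
err_fp16 = max(abs(x - r) for x, r in zip(h_fp16, h_ref))
print(f"max abs error with FP32 state: {err_fp32:.3e}")
print(f"max abs error with FP16 state: {err_fp16:.3e}")
```

Because each step feeds the rounded state back into the next one, rounding error in the state compounds over the sequence. That is the usual motivation for keeping accumulators at higher precision than the inputs. Whether the gap matters for end-to-end model accuracy is a separate question that this toy recurrence does not answer.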