Menu

Post image 1
Post image 2
Post image 3
1 / 3
0

Controlling Floating-Point Determinism in NVIDIA CCCL

NVIDIA Technical Blog·Nader Al Awar·about 1 month ago
#1xYheCaE
Reading 0:00
15s threshold

A computation is considered deterministic if multiple runs with the same input data produce the same bitwise result. While this may seem like a simple property to guarantee, it can be difficult to achieve in practice, especially in parallel programming and floating-point arithmetic. This is because floating-point addition and multiplication aren’t strictly associative—that is, (a + b) + c may not equal a + (b + c)—due to rounding that occurs when intermediate results are stored with finite precision . With NVIDIA CUDA Core Compute Libraries (CCCL) 3.1, CUB—a low-level CUDA library for speed-of-light parallel device algorithms —added a new single-phase API that accepts an execution environment, enabling users to customize algorithm behavior. We can use this environment to configure the reduce algorithm’s determinism property. This can only be done through the new single-phase API, since the two-phase API doesn’t accept an execution environment.…

Continue reading — create a free account

Join HashtagPLUS to read full articles, follow hashtags, vote, and join the conversation.

Read More