How to Process Azure Cosmos DB Change Streams in Parallel with Java (and Stop Leaving Throughput on the Table)

DEV Community·Ankit Sood·about 1 month ago

You have a Cosmos DB collection with dozens of physical partitions and millions of documents. You need to migrate them or stream changes in real time to another system. You open a single change stream cursor and watch it crawl through one partition at a time, burning hours that should take minutes.

The bottleneck isn't Cosmos DB. It's the single-threaded cursor you're reading with. Cosmos DB shards data across physical partitions, but a vanilla change stream reads them sequentially. Every partition waits its turn. Your provisioned RUs sit idle while one thread does all the work.

This post walks through a Java implementation that opens one change stream per physical partition, processes them concurrently, and handles the operational details that tutorials usually skip: batching, RU consumption tracking, resume token checkpointing, and retry with exponential backoff.

TL;DR

Cosmos DB's GetChangeStreamTokens custom action returns one resume token per physical partition.…
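To make the shape of that fan-out concrete, here is a minimal sketch of "one reader per partition token, run concurrently, retried with exponential backoff". It is not the article's implementation and does not call the Cosmos DB or MongoDB driver at all: `PartitionTask` is a hypothetical stand-in for "open a change stream from this resume token and drain one batch", the tokens are plain strings, and the retry limits and delays are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelPartitionReader {

    /** Hypothetical stand-in: read one batch for a partition, return the document count. */
    @FunctionalInterface
    public interface PartitionTask {
        int readBatch(String resumeToken) throws Exception;
    }

    /** Delay before retry `attempt` (0-based): base * 2^attempt, capped at maxMillis. */
    public static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        // Cap the exponent so the multiplication stays small for reasonable base values.
        long delay = baseMillis * (1L << Math.min(attempt, 20));
        return Math.min(delay, maxMillis);
    }

    /** Runs the task, sleeping with exponential backoff between failed attempts. */
    public static int withRetry(PartitionTask task, String token,
                                int maxAttempts, long baseMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return task.readBatch(token);
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts - 1) {
                    Thread.sleep(backoffMillis(attempt, baseMillis, 5_000));
                }
            }
        }
        throw last; // every attempt failed
    }

    /** Submits one task per token to a thread pool and sums the per-partition counts. */
    public static int processAll(List<String> tokens, PartitionTask task) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tokens.size()));
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String token : tokens) {
                // Assumed limits: up to 5 attempts, 10 ms base delay.
                futures.add(pool.submit(() -> withRetry(task, token, 5, 10)));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get(); // rethrows a partition's final failure
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

In a real reader, the body of `PartitionTask` would be where the driver call, batching, and resume token checkpointing live; sizing the pool to the number of partition tokens is one simple choice, not the only one.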
