How to Process Azure Cosmos DB Change Streams in Parallel with Java (and Stop Leaving Throughput on the Table)

DEV Community·Ankit Sood·about 1 month ago

You have a Cosmos DB collection with dozens of physical partitions and millions of documents. You need to migrate them or stream changes in real time to another system. You open a single change stream cursor and watch it crawl through one partition at a time, burning hours that should take minutes.

The bottleneck isn't Cosmos DB. It's the single-threaded cursor you're reading with. Cosmos DB shards data across physical partitions, but a vanilla change stream reads them sequentially. Every partition waits its turn. Your provisioned RUs sit idle while one thread does all the work.

This post walks through a Java implementation that opens one change stream per physical partition, processes them concurrently, and handles the operational details that tutorials usually skip: batching, RU consumption tracking, resume token checkpointing, and retry with exponential backoff.

TL;DR

Cosmos DB's GetChangeStreamTokens custom action returns one resume token per physical partition.…
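
As a rough illustration of the approach described above, the sketch below runs one worker per resume token and wraps each read in retry with exponential backoff. It uses only the Java standard library. `ParallelPartitionSketch`, `PartitionReader`, `processAll`, and `withBackoff` are hypothetical names introduced here, not the article's code or any driver API, and the actual change stream read is left behind the interface. Batching, RU tracking, and checkpointing are omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelPartitionSketch {

    /** Hypothetical stand-in for "read one batch from one partition's change stream". */
    public interface PartitionReader {
        /** Returns the number of documents read for the given resume token. */
        int readBatch(String resumeToken) throws Exception;
    }

    /**
     * Calls the task, retrying on any exception with exponential backoff:
     * waits baseDelayMs, then 2x, then 4x, and so on, up to maxAttempts calls.
     */
    public static <T> T withBackoff(Callable<T> task, int maxAttempts, long baseDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delayMs = baseDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts: surface the last failure
                }
                Thread.sleep(delayMs);
                delayMs *= 2; // double the wait before the next attempt
            }
        }
    }

    /**
     * Starts one worker per resume token (assumed: one token per physical partition)
     * and returns the total number of documents read across all partitions.
     */
    public static int processAll(List<String> resumeTokens, PartitionReader reader)
            throws Exception {
        if (resumeTokens.isEmpty()) {
            return 0;
        }
        ExecutorService pool = Executors.newFixedThreadPool(resumeTokens.size());
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String token : resumeTokens) {
                // Retry settings (5 attempts, 100 ms base delay) are illustrative only.
                Callable<Integer> work =
                        () -> withBackoff(() -> reader.readBatch(token), 5, 100L);
                futures.add(pool.submit(work));
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get(); // blocks until that partition's worker finishes
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

With a real driver, `readBatch` would open the change stream for the given token and drain a batch from it. One thread per partition is an assumption that suits dozens of partitions; a bounded pool may be a better fit for much larger counts.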
