Stop Using Spark for Your Small Data - Why Azure Functions is the Right Tool for the Job

DEV Community·Luca Liu·27 days ago

As a data analyst, my job is to get data from A to B, cleaned and ready for use. A common workflow for my team involves users uploading Excel files to a OneDrive folder. A Power Automate flow then syncs these files daily to a container in our Azure Storage Account. From there, my responsibility begins:

1. Read the new Excel file from Blob Storage using Python.
2. Process the data (clean, transform, apply business logic).
3. Write the final data to an Azure SQL Database.

I needed this to run on two triggers: a time schedule (e.g., every morning at 7 AM) and an event-driven trigger (i.e., as soon as a new file lands in the container).

My first thought was to use the "big data" tools I'd heard of: Azure Databricks or Azure Synapse Analytics.

The "Big Tool" Trap

On the surface, Databricks and Synapse are perfect. They let me write Python in a Notebook, which I'm very comfortable with. They have easy-to-use trigger and monitoring tools. I set up a proof-of-concept, and it worked. But I quickly realized a problem.…
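To make the read, process, write workflow described above concrete, here is a minimal Python sketch. It is not the author's code: the function names (`clean_rows`, `run_pipeline`, `read_excel_from_blob`), the cleaning rules, and the connection parameters are all hypothetical, and the blob reader assumes the `azure-storage-blob`, `pandas`, and `openpyxl` packages are installed. The write step is passed in as a plain callable because the article does not say how the SQL write is done.

```python
from typing import Callable, Iterable, List


def clean_rows(rows: Iterable[dict]) -> List[dict]:
    """Process step. Hypothetical rules: trim whitespace in column names
    and string values, then drop rows whose values are all empty."""
    cleaned = []
    for row in rows:
        trimmed = {
            str(key).strip(): (value.strip() if isinstance(value, str) else value)
            for key, value in row.items()
        }
        if any(value not in (None, "") for value in trimmed.values()):
            cleaned.append(trimmed)
    return cleaned


def run_pipeline(
    read: Callable[[], Iterable[dict]],
    write: Callable[[List[dict]], None],
    process: Callable[[Iterable[dict]], List[dict]] = clean_rows,
) -> int:
    """Read -> process -> write. Returns the number of rows written."""
    rows = process(read())
    write(rows)
    return len(rows)


def read_excel_from_blob(conn_str: str, container: str, blob_name: str) -> List[dict]:
    """Read step. Assumes azure-storage-blob and pandas are available;
    imports are local so the rest of the sketch runs without them.
    Note: pandas represents empty cells as NaN, which clean_rows does
    not treat as empty; real code would need to handle that."""
    import io

    import pandas as pd
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(conn_str)
    blob = service.get_blob_client(container=container, blob=blob_name)
    data = blob.download_blob().readall()
    return pd.read_excel(io.BytesIO(data)).to_dict(orient="records")
```

Keeping the three steps behind plain callables means the same `run_pipeline` call could be invoked from either trigger (the schedule or the new-file event) and can be exercised locally with in-memory rows before any Azure resource is involved.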
