Real-Time Data Streaming and Processing using Synapse Analytics
Authors: Hari Prasad Bomma
DOI: https://doi.org/10.5281/zenodo.14762564
Short DOI: https://doi.org/g83jcm
Country: USA
Abstract: Extract, Transform, Load (ETL) is a traditional, widely used method for data integration: it extracts data from various sources, transforms it to meet operational needs, and loads it into a target data warehouse. Regular ETL processes, typically scheduled at intervals such as daily or weekly, offer advantages such as simplifying data processing and reducing resource usage during off-peak hours. However, they also present significant drawbacks, including latency and difficulty scaling with large data volumes, which can lead to processing delays and potential system failures. The paper explores these challenges and the increasing demand for real-time data processing in the era of Big Data, driven by the proliferation of IoT devices. It discusses modern data processing requirements, focusing on high-throughput, low-latency data streams and the need for scalable and reliable infrastructure. The paper presents Microsoft's Synapse Analytics as a comprehensive solution, detailing its unified capabilities for data engineering, warehousing, and exploration to meet contemporary data processing needs.
Keywords: Data Streaming, Real-Time Data Processing, Synapse, Apache Spark, Event Hub, Blob Storage, IoT
Paper Id: 232067
Published On: 2024-11-12
Published In: Volume 12, Issue 6, November-December 2024