Parallelizing Complex Data Pipelines Using ZIO with Spark
Designing reusable, extensible, and modular data pipelines to process multiple independent datasets is challenging. It becomes even harder when those pipelines must also produce metrics that drive critical data-driven decisions under an ambitious SLA.
Fortunately, functional programming gives us a nice set of constructs to make this possible. Small, pure functions can be composed to form complex data pipelines. Not only does this provide modularity and reuse, but it also makes data pipelines easier to test and debug.
In this talk, Kunal will present this use case and show how it was solved using ZIO (a functional library for concurrent and asynchronous programming in Scala) together with Spark to build a complex analytics pipeline.
We’ll explore why these turned out to be great technology choices, and share what we learned along the way.