Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing

Description
Streaming data is a big deal in big data these days, and for good reason. Businesses crave ever more timely data, and streaming is a good way to achieve lower latency. Plus, streaming is a much easier way to tame the massive, unbounded data sets that are increasingly common today.

Expanded from co-author Tyler Akidau’s popular series of blog posts "Streaming 101" and "Streaming 102", this practical book shows data engineers, data scientists, and developers how to work with streaming or event-time data in a conceptual and platform-agnostic way. You’ll go from a "101"-level understanding of stream processing to a nuanced grasp of the what, where, when, and how of processing real-time data streams. Dive deep into topics including watermarks and windowing, as well as state and timers in the context of stream processing. Although the book uses Apache Beam code snippets to make examples concrete, it presents a general and broad explanation of streaming that's not tied to a specific framework.

About the Author
Tyler Akidau is a staff software engineer at Google Seattle. He is the author of the 2015 Dataflow Model paper and the Streaming 101 and Streaming 102 articles on the O’Reilly website. Though deeply passionate and vocal about the capabilities and importance of stream processing, he is also a firm believer in batch and streaming as two sides of the same coin, with the real endgame for data processing systems being the seamless merging of the two. His preferred mode of transportation is by cargo bike, with his two young daughters in tow. He leads technical infrastructure’s internal data processing teams (MillWheel & Flume), is a founding member of the Apache …