Apache Spark

Advanced Analytics Using Apache Spark

With big data expert and author Jeffrey Aven. The third module in the “Big Data Development Using Apache Spark” series, this course provides the practical knowledge needed to perform statistical, machine learning and graph analysis operations at scale using Apache Spark. It enables data scientists and statisticians with experience in other frameworks to extend their knowledge to the Spark runtime environment with its specific APIs and libraries designed to implement machine learning and statistical analysis in a distributed and scalable processing environment.

Data Transformation and Analysis Using Apache Spark

With big data expert and author Jeffrey Aven. The first module in the “Big Data Development Using Apache Spark” series, this course provides a detailed overview of the Spark runtime and application architecture, processing patterns, functional programming using Python, fundamental API concepts, and basic programming skills, with deep dives into additional constructs including broadcast variables, accumulators, and storage and lineage options. Attendees will learn the Spark framework and runtime architecture and the fundamentals of programming for Spark, gain mastery of basic transformations, actions, and operations, and be prepared for advanced topics in Spark including streaming and machine learning.

Stream and Event Processing Using Apache Spark

With big data expert and author Jeffrey Aven. The second module in the “Big Data Development Using Apache Spark” series, this course provides the knowledge needed to develop real-time, event-driven or event-oriented processing applications using Apache Spark. It covers using Spark with NoSQL systems and popular messaging platforms like Apache Kafka and Amazon Kinesis, examines the Spark Streaming architecture in depth, and uses practical hands-on exercises to reinforce the use of transformations and output operations, as well as more advanced stream-processing patterns.