This course provides the conceptual and practical knowledge of Spark Streaming required to develop real-time and event-driven (or event-oriented) processing applications using Apache Spark. The course covers using Spark with NoSQL systems as well as popular messaging platforms such as Apache Kafka and Amazon Kinesis.
The Spark Streaming architecture is covered in depth using both the DStream API and Structured Streaming. Practical hands-on exercises reinforce the use of transformations and output operations in Spark Streaming, as well as more advanced stream processing patterns such as stateful stream processing and sliding window operations.
Topics covered include:
- Introduction to NoSQL systems
- Using Spark with:
  - HBase
  - DynamoDB
  - Apache Kafka
  - Amazon Kinesis
- Introducing Spark Streaming and the DStream API
- DStream sources, transformations and output operations
- Stateful stream processing
- Sliding window operations
- Event sourcing using Spark
- Structured Streaming with Spark
Developed by Jeffrey Aven, author of SAMS Teach Yourself Apache Spark in 24 Hours and Data Analytics with Spark using Python, this course provides the core knowledge and skills required to develop streaming and event processing applications using Apache Spark.
Apache Spark Training Series
This Stream and Event Processing Using Apache Spark module is the second of three modules in the Big Data Development Using Apache Spark series. It follows the Data Transformation and Analysis using Apache Spark module and precedes the Advanced Analytics Using Apache Spark module.
See what former trainees are saying about AlphaZetta courses.
Additional Information – Stream and Event Processing using Apache Spark
Audience | Expert. This course is suitable for developers and analysts who will be developing real-time or event-oriented applications using Spark, such as IoT or real-time fraud detection applications. Attendees should have a basic level of competence with Spark programming using the RDD and DataFrame APIs. |
Prerequisites |
|
Objective / outcomes | Attendees should, by the end of the course:
|
Format | Class |
Duration | 2 days |
Trainer | Courses are taught by Jeffrey Aven, a big data, open source software, and cloud computing consultant, author and instructor based in Melbourne, Australia. He has extensive experience as a technical instructor, having taught courses on Hadoop and HBase for Cloudera (awarded Cloudera Hadoop Instructor of the Year for APAC in 2013) and courses on Apache Kafka for Confluent, in addition to delivering his own courses. Jeffrey is also the author of several big data books, including SAMS Teach Yourself Hadoop in 24 Hours, SAMS Teach Yourself Apache Spark in 24 Hours and Data Analytics with Spark using Python. He has over thirty years of industry experience and has held key roles in several major big data and cloud implementations in recent years. |
Delivery Method | In-person at AlphaZetta Academy locations or on-premise for corporate groups |
Private and Corporate Training
In addition to our public seminars, workshops and courses, AlphaZetta Academy can provide this training for your organisation in a private setting at your location or ours, or online. Please enquire to discuss your needs.