The second module in the “Big Data Development Using Apache Spark” series, this course provides the Spark Streaming knowledge needed to develop real-time, event-driven or event-oriented processing applications using Apache Spark. It covers using Spark with NoSQL systems and popular messaging platforms like Apache Kafka and Amazon Kinesis. It also examines the Spark Streaming architecture in depth and uses practical hands-on exercises to reinforce the use of transformations and output operations, as well as more advanced stream-processing patterns. With big data expert and author Jeffrey Aven.
This course is an introduction to the highly celebrated area of neural networks, popularised as “deep learning” and “AI”. The course will cover the key concepts underlying neural network technology, as well as the unique capabilities of a number of advanced deep learning technologies, including convolutional neural networks for image recognition, recurrent neural networks for time series and text modelling, and new artificial intelligence techniques including generative adversarial networks and reinforcement learning. Practical exercises will present these methods in some of the most popular deep learning packages available in Python, including Keras and TensorFlow. Trainees are expected to be familiar with the basics of machine learning from the Fundamentals course, as well as the Python language.
Text analytics is a crucial skill set in nearly all contexts where data science has an impact, whether that be customer analytics, fraud detection, automation or fintech. In this course, you will learn a toolbox of skills and techniques, starting from effective data preparation and stretching right through to advanced modelling with deep-learning and neural-network approaches such as word2vec.
This class builds on the introductory Python class. It covers advanced use and customisation of Jupyter Notebook, as well as configuring multiple environments and kernels. The NumPy package is introduced for working with arrays and matrices, and deeper coverage of Pandas data analysis and manipulation methods is provided, including working with time series data. Data exploration and advanced visualisations are taught using the Plotly and Seaborn libraries.
This course presents a process and methods for agile analytics delivery. Agile Insights reflects the capabilities required by any organisation to develop insights from data and validate potential business value. The content describes the process, how it is executed and how it can be deployed as a standard process inside an organisation. The course will also share best practices, highlight potential tripwires to watch out for, and outline the roles and resources required.
This class builds on “Intro to R (+data visualisation)” by providing students with powerful, modern R tools including pipes, the tidyverse, and many other packages that make coding for data analysis easier, more intuitive and more readable. The course will also provide a deeper view of functional programming in R, which also allows cleaner and more powerful coding, as well as R Markdown, R Notebooks, and the shiny package for interactive documentation, browser-based dashboards and GUIs for R code.
This course goes deeper into the tidyverse family of packages, with a focus on advanced data handling, as well as advanced data structures such as list columns in tibbles, and their application to model management. Another key topic is advanced functional programming with the purrr package, and advanced use of the pipe operator. Optional topics may include dplyr on databases, and use of rmarkdown and RStudio notebooks.
In the Advanced Python 2 course, you will learn advanced methods and packages for working with "big data" with Pandas. The course also covers using Dask for parallel computation. Machine learning is demonstrated with [...]
This course is for experienced machine-learning practitioners who want to take their skills to the next level by using R to hone their abilities as predictive modellers. Trainees will learn essential techniques for real machine-learning model development, helping them to build more accurate models. In the masterclass, participants will work to deploy, test, and improve their models.
Blockchain is one of the most disruptive and least understood technologies to emerge over the past decade. This course gives participants an intuitive understanding of blockchain in both public and private contexts, allowing them to distinguish genuine use cases from hype. We explore public cryptocurrencies, smart contracts and consortium chains, interspersing theory with case studies from areas such as financial markets, health care, trade finance, and supply chain. The course does not require a technical background.