Data Science Level 3

Advanced Python 2

This class builds on the introductory Python class. Jupyter Notebook advanced use and customisation is covered as well as configuring multiple environments and kernels. The Numpy package is introduced for working with arrays and matrices and a deeper coverage of Pandas data analysis and manipulation methods is provided including working with time series data. Data exploration and advanced visualisations are taught using the Plotly and Seaborne libraries.

Advanced R 2

This course goes deeper into the tidyverse family of packages, with a focus on advanced data handling, as well as advanced data structures such as list columns in tibbles, and their application to model management. Another key topic is advanced functional programming with the purrr package, and advanced use of the pipe operator. Optional topics may include dplyr on databases, and use of rmarkdown and Rstudio notebooks.

Advanced Fraud and Anomaly Detection

The detection of anomalies is one of the most eclectic and difficult activities in data analysis. This course builds on the basics introduced in the earlier course, and provides more advanced methods including supervised and unsupervised learning, advanced use of Benford’s Law, and more on statistical anomaly detection. Optional topics may include anomalies in time series, deception in text and the use of social network analysis to detect fraud and other undesirable behaviours.

Advanced Deep Learning

This course provides a more rigorous, mathematically based view of modern neural networks, their training, applications, strengths and weaknesses, focusing on key architectures such as convolutional nets for image processing and recurrent nets for text and time series. This course will also include use of dedicated hardware such as GPUs and multiple computing nodes on the cloud. There will also be an overview of the most common available platforms for neural computation. Some topics touched in the introduction will be revisited in more thorough detail. Optional advanced topics may include Generative Adversarial Networks, Reinforcement Learning, Transfer Learning and probabilistic neural networks.

Real-Time Analytics Development and Deployment

Real-time analytics is rapidly changing the landscape for deployment of decision support capability. The challenges of supporting extreme service levels in the areas of performance, availability, and data freshness demand new methods for data warehouse construction. Particular attention is paid to architectural topologies for successful implementation and the role of frameworks for Microservices deployment. In this workshop we will discuss evolution of data warehousing technology and new methods for meeting the associated service levels with each stage of evolution.