Data Engineering Level 1

Best Practices in Enterprise Information Management

The effective management of enterprise information for analytics deployment requires best practices in the areas of people, processes, and technology. In this workshop we will share both successful and unsuccessful practices in these areas. The scope of the workshop covers five key areas of enterprise information management: (1) metadata management, (2) data quality management, (3) data security and privacy, (4) master data management, and (5) data integration.

Intro to Python for Data Analysis

Python is a high-level, general-purpose language used by a thriving community of millions. Data-science teams often use it in their production environments and analysis pipelines, and it’s the tool of choice for elite data-mining competition winners and deep-learning innovations. This course provides a foundation for using Python in exploratory data analysis and visualisation, and as a stepping stone to machine learning.

Stars, Flakes, Vaults and the Sins of Denormalisation

Performance and flexibility are often seen as contradictory goals in designing large-scale data implementations. In this talk we will discuss techniques for denormalisation and provide a framework for understanding the performance and flexibility implications of various design options. We will examine a variety of logical and physical design approaches and evaluate the trade-offs between them. Specific recommendations are made for guiding the translation from a normalised logical data model to an engineered-for-performance physical data model. The roles of dimensional modelling and various physical design approaches are discussed in detail. Best practices in the use of surrogate keys are also discussed. The focus is on understanding the benefit (or not) of various denormalisation approaches commonly taken in analytic database designs.
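As a small illustration of the kind of design choice this talk covers, the sketch below flattens a normalised customer/region pair into a single dimension keyed by an integer surrogate key, then loads a fact table that references it. This is a minimal sketch under stated assumptions, not the talk's own method: the tables, column names, and helper functions are all hypothetical.

```python
from itertools import count

# Hypothetical normalised source data: customers reference a separate region table.
regions = {"R1": "North", "R2": "South"}
customers = [
    {"customer_code": "C-100", "name": "Acme", "region_id": "R1"},
    {"customer_code": "C-200", "name": "Birch", "region_id": "R2"},
]
sales = [
    {"customer_code": "C-100", "amount": 250.0},
    {"customer_code": "C-200", "amount": 100.0},
    {"customer_code": "C-100", "amount": 75.0},
]


def build_customer_dimension(customers, regions):
    """Denormalise customer + region into one dimension row per customer.

    Each row gets an integer surrogate key that carries no business meaning.
    Returns the dimension and a lookup from natural key to surrogate key.
    """
    next_key = count(start=1)
    dimension = {}
    key_lookup = {}
    for row in customers:
        surrogate_key = next(next_key)
        dimension[surrogate_key] = {
            "customer_code": row["customer_code"],  # natural key kept as an attribute
            "name": row["name"],
            "region_name": regions[row["region_id"]],  # region folded into the dimension
        }
        key_lookup[row["customer_code"]] = surrogate_key
    return dimension, key_lookup


def build_sales_fact(sales, key_lookup):
    """Replace the natural key on each sale with the dimension's surrogate key."""
    return [
        {"customer_sk": key_lookup[s["customer_code"]], "amount": s["amount"]}
        for s in sales
    ]


def sales_by_region(fact, dimension):
    """Aggregate sales by region using a single fact-to-dimension lookup."""
    totals = {}
    for row in fact:
        region = dimension[row["customer_sk"]]["region_name"]
        totals[region] = totals.get(region, 0.0) + row["amount"]
    return totals


customer_dim, customer_keys = build_customer_dimension(customers, regions)
sales_fact = build_sales_fact(sales, customer_keys)
print(sales_by_region(sales_fact, customer_dim))
```

The trade-off shown here is the usual one: the region name is repeated on every customer row, which costs some flexibility when regions change, in exchange for answering the regional query with one lookup instead of two.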

Data Ethics DRAFT

Data ethics is rapidly becoming the most critical aspect of engaging in a data-driven, digital world. Significant backlash against industry giants like Facebook and Google for their data practices has pushed data ethics into mainstream society. With the ACCC signalling its intention to focus on data practices, and with a host of new legislation, led by GDPR in Europe, the open data movement and the Consumer Data Right in Australia, data ethics has become a key concern for digital consumers and the companies that serve them. The course covers the practical issues involved in implementing data ethics and uses real-world illustrations and cases. We start with high-profile data ethics cases and cover the essentials of the new legislation. We then walk through a data ethics policy. Day 2 focuses on a toolkit for implementing data trust and privacy by design, then covers consent and transparency requirements. It closes with a real-world framework for the governance required and an overview of the practical implementation steps.

Data Governance 1

This two-day course provides an informed, realistic and comprehensive foundation for establishing best-practice Data Governance in your organisation. Suitable for every level from CDO to executive to data steward, this highly practical course will equip you with the tools and strategies needed to successfully create and implement a Data Governance strategy and roadmap.

Leadership and Resilience Skills for Data Professionals

Many people today developed emotionally and mentally for an era that no longer really exists. This has created a critical soft-skills gap between current workforce ability and business requirements. In this course participants learn to ‘readapt’ their soft skills so that they are aligned with a thriving 21st-century business. They are also given a simple framework from which to continue their self-development, so that the training instigates sustainable change.

The Future of Analytics

This full-day workshop examines trends in analytics deployment and developments in advanced technology. The implications of these technology developments for data foundation implementations will be discussed with examples in future architecture and deployment. This workshop presents best practices for deploying a next-generation data management implementation that delivers analytic capability for mobile devices and consumer intelligence. We will also explore emerging trends related to big data analytics using content from Web 3.0 applications and other non-traditional data sources such as sensors and rich media.

Agile Data Management Architecture

This full-day workshop examines the trends in analytic technologies, methodologies, and use cases. The implications of these developments for deployment of analytic capabilities will be discussed with examples in future architecture and implementation. This workshop also presents best practices for deployment of next-generation analytics.

Innovating with Best Practices to Modernise Delivery Architecture and Governance

Organisations often struggle with the conflicting goals of delivering highly reliable production reporting while at the same time creating new value propositions from their data assets. Gartner has observed that organisations that focus only on mode one (predictable) deployment of analytics in the construction of reliable, stable, and high-performance capabilities will very often lag the marketplace in delivering competitive insights because the domain is moving too fast for traditional SDLC methodologies. Explorative analytics requires a very different model for identifying analytic opportunities, managing teams, and deploying into production. Rapid progress in the areas of machine learning and artificial intelligence exacerbates the need for bimodal deployment of analytics. In this workshop we will describe best practices in both architecture and governance necessary to modernise an enterprise to enable participation in the digital economy.

Modernising Your Data Warehouse and Analytic Ecosystem

This full-day workshop examines the emergence of new trends in data warehouse implementation and the deployment of analytic ecosystems. We will discuss new platform technologies such as columnar databases, in-memory computing, and cloud-based infrastructure deployment. We will also examine the concept of a “logical” data warehouse, including an ecosystem of both commercial and open-source technologies. Real-time analytics and in-database analytics will also be covered. The implications of these developments for deployment of analytic capabilities will be discussed with examples in future architecture and implementation. This workshop also presents best practices for deployment of next-generation analytics using AI and machine learning.

Optimising Your Big Data Ecosystem

Big Data exploitation has the potential to revolutionise the analytic value proposition for organisations that are able to successfully harness these capabilities. However, the architectural components necessary for success in Big Data analytics are different from those used in traditional data warehousing. This workshop will provide a framework for Big Data exploitation along with recommendations for architectural deployment of Big Data solutions.

Capacity Planning for Enterprise Data Deployment

This workshop describes a framework for capacity planning in an enterprise data environment. We will propose a model for defining service level agreements (SLAs) and then use these SLAs to drive the capacity planning and configuration for enterprise data solutions. Guidelines will be provided for capacity planning in a mixed workload environment involving both strategic and tactical decision support. Performance implications related to technology trends in multi-core CPU deployment, large memory deployment, and high-density disk drives will be described. In addition, the capacity planning implications of different approaches to data acquisition will be considered.
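To give a flavour of how SLAs can drive a sizing estimate for a mixed workload, the sketch below applies the standard utilisation law (CPU demand equals throughput multiplied by CPU time per query) and adds headroom through a target utilisation. It is an illustrative assumption, not the model presented in the workshop: the `WorkloadSLA` class, the `required_cores` function, the 0.6 utilisation target, and the workload figures are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class WorkloadSLA:
    """Hypothetical description of one workload class in a mixed environment."""
    name: str
    queries_per_second: float      # peak arrival rate the SLA must support
    cpu_seconds_per_query: float   # average CPU time consumed by one query


def required_cores(workloads, target_utilisation=0.6):
    """Estimate CPU cores needed to carry all workloads at peak.

    Uses the utilisation law: busy cores = sum(rate * CPU seconds per query).
    Dividing by the target utilisation leaves headroom so that response
    times do not degrade as the system approaches saturation.
    """
    if not 0 < target_utilisation <= 1:
        raise ValueError("target_utilisation must be in the range (0, 1]")
    busy_cores = sum(
        w.queries_per_second * w.cpu_seconds_per_query for w in workloads
    )
    return math.ceil(busy_cores / target_utilisation)


# Illustrative figures only: many short tactical lookups, few long strategic queries.
mixed_workload = [
    WorkloadSLA("tactical", queries_per_second=50, cpu_seconds_per_query=0.02),
    WorkloadSLA("strategic", queries_per_second=0.1, cpu_seconds_per_query=30),
]
print(required_cores(mixed_workload))
```

A real plan would also account for memory, I/O, concurrency limits, and data acquisition load; this estimate covers CPU only.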

Real-Time Analytics Development and Deployment

Real-time analytics is rapidly changing the landscape for deployment of decision support capability. The challenges of supporting extreme service levels in the areas of performance, availability, and data freshness demand new methods for data warehouse construction. Particular attention is paid to architectural topologies for successful implementation and the role of frameworks for microservices deployment. In this workshop we will discuss the evolution of data warehousing technology and new methods for meeting the service levels associated with each stage of evolution.