We are progressively releasing DPS 2020 session recordings to the community, free of charge. Keep watching this space for more content each week. Over time, this repository will host 150+ session recordings. Also note that Data Platform Virtual Summit 2021 has been announced.
|Sasha Nosov||Architecture||Azure Arc Enabled SQL Server||Even if you cannot migrate or modify your SQL Server application, you can still leverage Azure. Whether your existing SQL Server instances are deployed to your private infrastructure, AWS, or GCP, you can use Azure Arc to manage your global inventory, protect SQL Server instances with Azure Security Center, or periodically assess and tune the health of your SQL Server configurations.|
|Bob Ward||Data Administration||Inside Waits, Latches, and Spinlocks Returns||This session marks the return of a popular deep dive into the internals of waits in SQL Server, including latches and spinlocks. In this session, you will learn how SQL Server implements waits and how you can monitor and troubleshoot them, followed by a deep dive into specific common wait types, including new wait types specific to Azure SQL. The session includes plenty of demos and, back by popular demand, the use of the Windows Debugger to peek inside how waits are truly implemented in SQL Server.|
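Wait troubleshooting of the kind this session covers typically starts by comparing two snapshots of SQL Server's cumulative wait statistics and ranking the wait types that grew the most. Here is a minimal Python sketch of that delta computation; the snapshot data and wait-type names are illustrative, not the session's own material:

```python
# Compute wait-time growth between two snapshots of cumulative wait
# statistics (the kind of data sys.dm_os_wait_stats exposes), the
# usual first step when analyzing SQL Server waits.

def wait_deltas(before, after):
    """Return per-wait-type growth in wait time (ms), largest first."""
    deltas = {
        wait_type: after[wait_type] - before.get(wait_type, 0)
        for wait_type in after
    }
    # Ignore wait types that did not grow between snapshots.
    return sorted(
        ((w, d) for w, d in deltas.items() if d > 0),
        key=lambda pair: pair[1],
        reverse=True,
    )

# Illustrative sample snapshots (values in milliseconds).
before = {"PAGELATCH_EX": 1200, "CXPACKET": 800, "SOS_SCHEDULER_YIELD": 50}
after = {"PAGELATCH_EX": 4200, "CXPACKET": 900, "SOS_SCHEDULER_YIELD": 50}

for wait_type, delta_ms in wait_deltas(before, after):
    print(wait_type, delta_ms)
```

Ranking by growth between snapshots, rather than raw cumulative values, avoids being misled by waits accumulated since instance startup.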
|Warner Chaves||Architecture||Global Analytics with Azure Cosmos DB and Synapse Analytics||Cosmos DB is Azure's NoSQL database as a service, born in the cloud and designed to take advantage of the flexibility, elasticity, and global reach of cloud computing.|
Synapse Analytics is Azure's data analytics service that integrates on-demand SQL querying, Spark big data processing, and Data Lake Store Gen2, along with an integrated authoring experience.
Together, these two services can be used to develop, simply and elegantly, solutions that would previously have been incredibly complex. The most ambitious capability is Global Analytics: running analytical queries over your live operational data coming from anywhere on the planet, all without having to manage a single piece of infrastructure yourself.
In this demo-heavy session, we will look at the C# code, features, and configuration of Cosmos DB and Synapse, and see Global Analytics in action live.
#CosmosDB #NoSQL #Synapse #BigData #ADLS2 #Analytics
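Conceptually, the Global Analytics scenario described above is an analytical aggregation over operational documents arriving from every region. A rough Python sketch of that idea follows; the document shape and field names are invented for illustration (the session itself works in C# against Cosmos DB and Synapse):

```python
# Illustrative: aggregate order totals per region from operational
# documents, the kind of query Synapse can run over Cosmos DB's
# analytical store without impacting the operational workload.
from collections import defaultdict

# Hypothetical operational documents replicated from multiple regions.
orders = [
    {"region": "eu", "total": 120.0},
    {"region": "us", "total": 80.0},
    {"region": "eu", "total": 40.0},
    {"region": "apac", "total": 60.0},
]

revenue_by_region = defaultdict(float)
for order in orders:
    revenue_by_region[order["region"]] += order["total"]

print(dict(revenue_by_region))
```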
|Peter Myers||Business Intelligence & Advanced Analytics||Working With Different Power BI Data Model Architectures||Power BI provides you with different data model architectures. It's all controlled by setting the storage mode of model tables, as either Import, DirectQuery, or Dual. In this presentation, learn why and how to develop the model that best fits your data and circumstances. Also, learn how you can extend an existing Power BI data model with new data and calculations.|
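The Import/DirectQuery/Dual trade-off the abstract mentions can be summarized as a simple decision rule. This is an illustrative Python sketch of the usual guidance, not Power BI functionality; the inputs and the rule itself are simplifying assumptions:

```python
# Toy decision helper reflecting common guidance for choosing a
# Power BI table storage mode. The rules are illustrative assumptions,
# not an official algorithm.

def choose_storage_mode(needs_realtime, fits_in_memory, shared_dimension=False):
    """Pick a storage mode for a model table."""
    if shared_dimension and needs_realtime:
        # Dimensions related to both Import and DirectQuery fact
        # tables are typically set to Dual.
        return "Dual"
    if needs_realtime or not fits_in_memory:
        # Fresh data or data too large to cache favors DirectQuery.
        return "DirectQuery"
    # Otherwise Import gives the best query performance.
    return "Import"

print(choose_storage_mode(needs_realtime=False, fits_in_memory=True))
```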
|Alicia Moniz||Data Science (AI/ML)||Data Stewardship In An AI-Driven Ecosystem: InterpretML, FairLearn, WhiteNoise||At the core of Microsoft's AI are the principles of fairness, reliability & safety, privacy & security, inclusiveness, transparency & accountability. As AI capabilities increase along with adoption, it is important that we also leverage tools that enable us to practice AI responsibly.|
Responsible ML provides us with tools to ensure that as practitioners we
Understand machine learning models - Are we able to interpret and explain model behavior? Are we able to assess and mitigate model unfairness?
Protect people and their data - Are we actively working to prevent data exposure with differential privacy?
Control the end-to-end machine learning process - Are we documenting the machine learning life cycle?
Announced at Build this year were multiple Responsible ML open source packages. The accessibility of these freely available tools enables every machine learning developer to consider incorporating Responsible ML into the development cycle of their AI projects.
InterpretML - An open source package that enables developers to understand their models' behavior and the reasons behind individual predictions.
FairLearn - A Python package that enables developers to assess fairness and mitigate observed unfairness within their models.
WhiteNoise - An open source library that enables developers to review and validate the differential privacy of their data set and analysis. Also included are components for data access allowing data consumers to dynamically inject 'noise' directly into their queries.
Datasheets for Models - A Python SDK that enables developers to document assets within a model, providing easier access to metadata about models.
It is important that we design sustainable AI systems with ethics in mind. Join us for an overview and demo of these packages!
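The fairness assessment described above can be made concrete with a single metric. Below is a hand-rolled, stdlib-only Python sketch of demographic parity difference, the kind of group-fairness metric FairLearn exposes (as `fairlearn.metrics.demographic_parity_difference`); the sample predictions and group labels are illustrative:

```python
# Demographic parity difference: the largest gap in positive-prediction
# rate between any two groups. A value of 0 means all groups receive
# positive predictions at the same rate.

def demographic_parity_difference(predictions, groups):
    """Max difference in positive-prediction rate across groups."""
    rates = {}
    for group in set(groups):
        preds = [p for p, g in zip(predictions, groups) if g == group]
        rates[group] = sum(preds) / len(preds)
    return max(rates.values()) - min(rates.values())

# Illustrative binary predictions and sensitive-group labels.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(demographic_parity_difference(preds, groups))  # 0.5
```

Here group "a" receives positive predictions 75% of the time and group "b" only 25%, so the metric flags a 0.5 disparity worth investigating.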
|Anthony Nocentino||Architecture||Containers - What's Next?||You've been working with containers in development for a while, benefiting from the ease and speed of the deployments. Now it's time to extend your container-based data platform's capabilities for your production scenarios.|
In this session, we'll look at how to build custom containers, enabling you to craft a container image for your production system's needs. We'll also dive deeper into operationalizing your container-based data platform and learn how to provision advanced disk topologies, seed larger databases, implement resource control and understand performance concepts.
By the end of this session, you will learn what it takes to build containers and make them production ready for your environment.
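A custom container image of the kind this session describes is typically defined in a Dockerfile that layers configuration onto a vendor base image. The sketch below is a hedged illustration, not the session's material; the base image tag is Microsoft's published SQL Server 2019 image, while the `mssql.conf` file is assumed to exist alongside the Dockerfile:

```dockerfile
# Hypothetical custom SQL Server image: start from Microsoft's base
# image and layer in our own server configuration.
FROM mcr.microsoft.com/mssql/server:2019-latest

# Copy a custom mssql.conf (assumed to exist next to this Dockerfile)
# into SQL Server's standard configuration location.
COPY mssql.conf /var/opt/mssql/mssql.conf

# Accept the EULA and pin the edition so the container runs
# non-interactively.
ENV ACCEPT_EULA=Y \
    MSSQL_PID=Developer
```

Building on the vendor image keeps the customization surface small: only the configuration and assets you add need to be rebuilt when the base image is updated.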
|Abhishek Narain||Architecture||Architecting enterprise-grade data pipelines with Azure Data Factory||Azure Data Factory is the modern data integration service (hybrid, serverless, cloud-scale) that enables customers to build ETL/ELT pipelines for their Modern Data Warehouse (MDW) from Big Data. What truly makes Azure Data Factory an enterprise-ready ETL service is its built-in security features. In this talk, we will cover security fundamentals and best practices for building data pipelines in Azure Data Factory.|
|Torsten Strauss||Architecture||Microsoft SQL Server - In-Memory OLTP Design Principles||In this session, we will look at the design principles of the in-memory OLTP engine. We will understand how the in-memory engine optimizes data storage for main memory, eliminates latches and locks, and uses native compilation to reduce CPU overhead.|
For this, we will compare the traditional on-disk engine with the in-memory engine to decide when it makes sense to use In-Memory OLTP.
|Gilbert Quevauvilliers||Business Intelligence & Advanced Analytics||How I Reduced My Power BI Dataset By 60%||Learn how to optimize Power BI datasets so they run as fast as possible with the smallest possible memory footprint.|
The session will cover the following topics:
Data Modelling with the Star Schema
Looking at columns in the dataset
Data types and how they affect the size of your dataset
Using DAX Studio for analysis of datasets
Real world optimizations that I put into practice
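Much of the column-level analysis listed above comes down to finding high-cardinality columns, which compress poorly in Power BI's columnar storage and tend to dominate dataset size. Here is a stdlib-only Python sketch of that check; the sample rows and the 0.9 threshold are illustrative assumptions:

```python
# Flag high-cardinality columns in a row set. In Power BI's VertiPaq
# engine, columns with many distinct values are usually the biggest
# contributors to dataset size.

def high_cardinality_columns(rows, threshold=0.9):
    """Return columns whose distinct-value ratio exceeds threshold."""
    flagged = []
    for column in rows[0]:
        values = [row[column] for row in rows]
        if len(set(values)) / len(values) > threshold:
            flagged.append(column)
    return flagged

# Illustrative rows: order_id and amount are nearly unique per row,
# while country repeats heavily and compresses well.
rows = [
    {"order_id": i, "country": "US" if i % 2 else "CA", "amount": i * 1.5}
    for i in range(100)
]
print(high_cardinality_columns(rows))  # ['order_id', 'amount']
```

Columns flagged this way are candidates for removal, rounding, or splitting (e.g. separating date and time), which is the kind of real-world optimization the session walks through.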