Cloud Native Data Strategy with Cosmos DB

This month we’re talking about cloud-native data strategy, the unique features of Azure Cosmos DB, and what’s new and exciting with Microsoft’s cloud-native NoSQL database offering in 2020. We’re joined by special guests Dave Judd, Application Development Practice Lead at ObjectSharp, and Mark Brown, Principal Program Manager at Microsoft for Cosmos DB.

Minutes

0:15 – Intro to the show
1:13 – Dave Judd introduces himself and his work as Application Development Practice Lead at ObjectSharp
1:45 – Mark Brown introduces himself and his work at Microsoft as Principal Program Manager for Cosmos DB
3:07 – Jeff kicks off the discussion with background on Cosmos DB: what it is and why it’s important
3:50 – Mark Brown describes Cosmos DB, a NoSQL data store built on and for Azure, and what sets it apart. It has the usual NoSQL benefits (schema agnostic, multi-model with support for a number of APIs), it is horizontally partitioned (scale out rather than scale up), and it is fully managed (provision an account in the portal, with a script, or with an ARM template, set throughput, and you’re up and running). The main use cases are high availability and global distribution: Azure can replicate your data globally, allowing seamless failover to other regions if needed. Azure also provides an SLA on latency, with a guarantee of less than 10 ms, and Cosmos DB is the only data service in Azure with five nines (99.999%) of availability. It is a great fit for customers that need low latency and/or high resiliency.
7:15 – Dave Judd talks about why Cosmos DB has become a preferred choice for his client work at ObjectSharp: ease of use, very low latency and strong performance, and global replication, which brings the data closer to end-users everywhere for faster round trips
9:20 – Dave Judd talks about using the Cosmos DB Change Feed to stream data into other places for reporting and analytics
9:50 – Mark Brown talks about customers using Cosmos DB as a cache
10:00 – Jeff discusses the practical significance of Cosmos DB: it solves real business and architecture problems, is simple to adopt, and is not just tech for tech’s sake
11:07 – Dave Judd comments that the barrier to entry is low unless you’re new to NoSQL, and notes that the multiple APIs available lower it further, depending on your background
12:48 – Mark Brown talks about MongoDB support on Cosmos DB and efforts to make the MongoDB experience on Cosmos DB even better
13:25 – Nick asks Mark Brown to talk about what’s new and exciting for Cosmos DB in 2020, and about the announcements made at Microsoft Ignite 2019
14:00 – Mark Brown discusses “autopilot”, Cosmos DB’s new autoscale capability, which solves the problem of customers not knowing how much throughput to provision for their database (e.g. when spikes in traffic hit an app or site). Autopilot is available in preview. Customers set a maximum level of throughput for their container, and Azure automatically scales up and down as required. Mark notes that this matters not just for traffic spikes but also for testing.
17:00 – Dave Judd talks about how ObjectSharp has historically written its own autoscale technology for its clients, and how using Microsoft’s solution will now save that time
18:00 – Mark Brown talks about Azure Synapse, Microsoft’s next-generation data warehouse solution. The vision is to use Cosmos DB with Synapse so that you can run analytics on operational data from the Synapse portal. Any time you ingest or write data into Cosmos DB, it is automatically ETL’d for you, and you can write Spark queries against it in Synapse. You no longer need to create your own pipelines.
20:23 – Mark Brown talks about Notebooks support – which you can use to do analytics all within Cosmos
20:57 – Mark Brown talks about bulk operations support in version 3 of the SDK. Previously you had to use a separate library, but now it’s built into the SDK.
21:30 – Mark Brown talks about the Cosmos DB team’s work on private endpoints and their relevance for protecting and securing your data in the Azure cloud: they prevent data exfiltration by ensuring everything is on private IPs.
23:00 – Jeff and Mark Brown talk about customer success stories and real-world examples involving Cosmos DB. Mark notes that Cosmos DB is very effective in microservice architectures because of its change feed. You can subscribe to the change feed using Azure Functions and the Cosmos DB bindings, which makes it easy to set up a pub/sub, asynchronous, event-driven architecture. That matters for use cases in IoT and retail.
26:40 – Jeff talks about the relevance of Cosmos DB not just for big companies but also smaller companies that want to use modern cloud native architecture to succeed
27:00 – Dave Judd talks about ObjectSharp’s use of Cosmos DB and its change feed for client projects.
28:00 – Dave Judd talks about his work on a project for a global mining company, and how he’s using Cosmos DB to give the company real-time visibility into data coming from many disparate sources (e.g. IoT) so that it can better plan and make decisions. Cosmos DB plays an important role because of its global replication capabilities, which allow the data to be replicated and delivered to end-users at multiple locations around the world with low latency.
29:00 – Mark talks about the “speed of light” problem and its relevance for modern distributed systems
33:00 – The team talks about how feature development speeds up as the cloud provides more capabilities that developers can use out of the box
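
The autoscale behaviour discussed at 14:00 (set a maximum throughput and let Azure scale up and down beneath it) can be illustrated with a small model. This is only a sketch, not the Cosmos DB SDK or the service's actual algorithm: the function names are hypothetical, and the assumption that throughput never drops below 10% of the configured maximum comes from the generally documented autoscale behaviour, not from the episode.

```python
# Toy model of autoscale: clamp demand between a floor and a customer-set maximum.
# ASSUMPTION: the floor is 10% of the maximum (not stated in the episode).
# All names here are hypothetical and exist only for illustration.

def autoscale_throughput(max_ru_per_sec: int, demand_ru_per_sec: int) -> int:
    """Return the throughput (RU/s) the container would run at for a given demand."""
    floor = max_ru_per_sec // 10          # assumed minimum: 10% of the maximum
    return min(max(demand_ru_per_sec, floor), max_ru_per_sec)


def peak_throughput(max_ru_per_sec: int, demand_samples: list[int]) -> int:
    """Return the highest throughput reached across a series of demand samples."""
    return max(autoscale_throughput(max_ru_per_sec, d) for d in demand_samples)


if __name__ == "__main__":
    # A quiet period, a normal load, and a traffic spike above the configured maximum.
    for demand in (100, 2500, 9000):
        print(demand, "->", autoscale_throughput(4000, demand))
```

With a maximum of 4000 RU/s, quiet periods settle at the assumed floor, normal load is served as requested, and a spike is capped at the maximum, which is the trade-off the customer chooses when setting that ceiling.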