Tristan Handy has been curating the Analytics Engineering Roundup newsletter since 2015, pulling together the internet’s best data science & analytics articles. Tristan and co-host Julia Schottenstein now bring the Roundup to real life, hosting biweekly conversations with data practitioners inventing the future of analytics engineering. You can view full episode summaries and read back issues of the Roundup newsletter at https://roundup.getdbt.com. The podcast is sponsored by dbt Labs, makers of the data transformation framework dbt. To reach our team, drop a note to [email protected].
The rapid experimentation of AI agents (w/ Yohei Nakajima)
Yohei Nakajima is an investor by day and coder by night. In particular, one of his projects, an AI agent framework called BabyAGI that creates a plan-execute loop, got a ton of attention in the past year. The truth is that AI agents are an extremely experimental space, and depending on how strict you want to be with your definition, there aren't a lot of production use cases today. Yohei discusses the current state of AI agents and where they might take us. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.
6/9/2024 • 45 minutes, 55 seconds
The Personal Data Warehouse (w/ Jordan Tigani of MotherDuck)
Jordan Tigani is an expert in large-scale data processing, having spent a decade+ in the development and growth of BigQuery, and later SingleStore. Today, Jordan and his team at MotherDuck are in the early days of working on commercial applications for the open source DuckDB OLAP database. In this conversation with Tristan and Julia, Jordan dives into the origin story of BigQuery, why he thinks we should do away with the concept of working in files, and how truly performant “data apps” will require bringing data to an end user’s machine (rather than requiring them to query a warehouse directly).
7/1/2022 • 51 minutes, 39 seconds
Making Sense of the Last 2 Years in Data
Matt Bornstein and Jennifer Li (and their co-author Martin Casado) of a16z have compiled arguably the most nuanced diagram of the data ecosystem ever made. They recently refreshed their classic 2020 post, "Emerging Architectures for Modern Data Infrastructure," and in this conversation, Tristan attempts to pin down: what does all of this innovation in tooling mean for data people + the work we're capable of doing? When will the glorious future come to our laptops?
6/17/2022 • 47 minutes, 13 seconds
What’s The Role Of AI in BI?
Amit Prakash is Co-founder and CTO at ThoughtSpot. He has a deep background in search, having previously led the AdSense engineering team at Google and served on the early Bing team at Microsoft. In this conversation with Tristan and Julia, Amit gets real about the promise of AI in data: which applications are being widely used today, and which are still a few years out?
5/6/2022 • 44 minutes, 43 seconds
Automating Away Your Work w/ Configuration-as-Code (w/ Sarah Krasnik)
Most recently leading a data engineering team at Perpay, Sarah has built and managed data platforms end to end by working closely with internal engineering, product, and operational teams. She recently left her role to pursue a wide variety of endeavors, including writing on her Substack (https://sarahsnewsletter.substack.com/). In this conversation with Tristan and Julia, Sarah dives into how configuration-as-code can automate away data work, why you might want to consider adding a data lake to your architecture, and how those looking to build a self-serve data culture can look to self-serve frozen yogurt shops for inspiration.
4/22/2022 • 43 minutes, 52 seconds
The Hard Problems™️ of Data Observability w/ Kevin Hu of Metaplane
As a PhD candidate at MIT, Kevin (and friends) published Sherlock, a data type detection engine (a surprisingly bedeviling problem) for data cleaning + data discovery. Now as co-founder and CEO of Metaplane, a data observability startup, Kevin applies these same automated data discovery methods to help data teams keep their data healthy. In this conversation with Tristan & Julia, Kevin wins the coveted award for “most crystal-clear explanations of complex technical concepts through physics analogy.”
4/8/2022 • 43 minutes, 10 seconds
The Bundling vs Unbundling Debate w/ Tristan, Benn Stancil and David Jayatillake
A debate has erupted on data Twitter and data Substack: should the modern data stack remain unbundled, or should it consolidate? In this conversation, Benn Stancil (Mode), David Jayatillake (Avora) and our host Tristan Handy try to make sense of this debate and play with various future scenarios for the modern data stack.
3/25/2022 • 43 minutes, 29 seconds
One Database to Rule All Workloads? With Jon "Natty" Natkins of dbt Labs
Will the dream of a mythical database to handle all workloads (transactional + analytical) ever become a reality, or does it violate the laws of physics? This question sparked a hearty debate internally at dbt Labs, and Jon "Natty" Natkins joins Julia here to continue the conversation. Natty knows databases, and this episode will take you on a historical romp through the rise and fall of Hadoop, the transition to cloud data warehouses, and what's waiting for us next in database-land.
3/11/2022 • 36 minutes, 24 seconds
Ashley Sherwood (AE @ HubSpot): Permissionless Innovation for Data Teams
Ashley is a Principal Analytics Engineer at HubSpot, and has helped lead their implementation of dbt. Ashley makes unique connections in her writing and work. On her Substack, "syntax error at or near ❤️," Ashley might be found comparing growing companies to butterflies, or going deep on how to accommodate sensitive people in the workplace. In this conversation with Tristan & Julia, Ashley dives into the nuts and bolts of her trajectory pushing data innovation forward at HubSpot.