The Python Podcast.__init__

English, Technology, 1 season, 389 episodes, 6 days, 8 hours, 9 minutes

About

The weekly podcast about the Python programming language, its ecosystem, and its community. Tune in for engaging, educational, and technical discussions about the broad range of industries, individuals, and applications that rely on Python.

Update Your Model's View Of The World In Real Time With Streaming Machine Learning Using River

Preamble This is a cross-over episode from our new show The Machine Learning Podcast, the show about going from idea to production with machine learning. Summary The majority of machine learning projects that you read about or work on are built around batch processes. The model is trained, and then validated, and then deployed, with each step being a discrete and isolated task. Unfortunately, the real world is rarely static, leading to concept drift and model failures. River is a framework for building streaming machine learning projects that can constantly adapt to new information. In this episode Max Halford explains how the project works, why you might (or might not) want to consider streaming ML, and how to get started building with River. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery. Building good ML models is hard, but testing them properly is even harder. At Deepchecks, they built an open-source testing framework that follows best practices, ensuring that your models behave as expected. Get started quickly using their built-in library of checks for testing and validating your model’s behavior and performance, and extend it to meet your specific needs as your model evolves. Accelerate your machine learning projects by building trust in your models and automating the testing that you used to do manually. Go to themachinelearningpodcast.com/deepchecks today to get started! Your host is Tobias Macey and today I’m interviewing Max Halford about River, a Python toolkit for streaming and online machine learning Interview Introduction How did you get involved in machine learning? Can you describe what River is and the story behind it? What is "online" machine learning? What are the practical differences with batch ML? Why is batch learning so predominant? What are the cases where someone would want/need to use online or streaming ML? 
The prevailing pattern for batch ML model lifecycles is to train, deploy, monitor, repeat. What does the ongoing maintenance for a streaming ML model look like? Concept drift is typically due to a discrepancy between the data used to train a model and the actual data being observed. How does the use of online learning affect the incidence of drift? Can you describe how the River framework is implemented? How have the design and goals of the project changed since you started working on it? How do the internal representations of the model differ from batch learning to allow for incremental updates to the model state? In the documentation you note the use of Python dictionaries for state management and the flexibility offered by that choice. What are the benefits and potential pitfalls of that decision? Can you describe the process of using River to design, implement, and validate a streaming ML model? What are the operational requirements for deploying and serving the model once it has been developed? What are some of the challenges that users of River might run into if they are coming from a batch learning background? What are the most interesting, innovative, or unexpected ways that you have seen River used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on River? When is River the wrong choice? What do you have planned for the future of River? Contact Info Email @halford_max on Twitter MaxHalford on GitHub Parting Question From your perspective, what is the biggest barrier to adoption of machine learning today? Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. 
If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Links River scikit-multiflow Federated Machine Learning Hogwild! Google Paper Chip Huyen concept drift blog post Dan Crankshaw Berkeley Clipper MLOps Robustness Principle NY Taxi Dataset RiverTorch River Public Roadmap Beaver tool for deploying online models Prodigy ML human in the loop labeling The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

12/12/2022 • 1 hour, 16 minutes, 22 seconds

Declarative Machine Learning For High Performance Deep Learning Models With Predibase

Preamble This is a cross-over episode from our new show The Machine Learning Podcast, the show about going from idea to production with machine learning. Summary Deep learning is a revolutionary category of machine learning that accelerates our ability to build powerful inference models. Along with that power comes a great deal of complexity in determining what neural architectures are best suited to a given task, engineering features, scaling computation, etc. Predibase is building on the successes of the Ludwig framework for declarative deep learning and Horovod for horizontally distributing model training. In this episode CTO and co-founder of Predibase, Travis Addair, explains how they are reducing the burden of model development even further with their managed service for declarative and low-code ML and how they are integrating with the growing ecosystem of solutions for the full ML lifecycle. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great! When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! 
Your host is Tobias Macey and today I’m interviewing Travis Addair about Predibase, a low-code platform for building ML models in a declarative format Interview Introduction How did you get involved in machine learning? Can you describe what Predibase is and the story behind it? Who is your target audience and how does that focus influence your user experience and feature development priorities? How would you describe the semantic differences between your chosen terminology of "declarative ML" and the "autoML" nomenclature that many projects and products have adopted? Another platform that launched recently with a promise of "declarative ML" is Continual. How would you characterize your relative strengths? Can you describe how the Predibase platform is implemented? How have the design and goals of the product changed as you worked through the initial implementation and started working with early customers? The operational aspects of the ML lifecycle are still fairly nascent. How have you thought about the boundaries for your product to avoid getting drawn into scope creep while providing a happy path to delivery? Ludwig is a core element of your platform. What are the other capabilities that you are layering around and on top of it to build a differentiated product? In addition to the existing interfaces for Ludwig you created a new language in the form of PQL. What was the motivation for that decision? How did you approach the semantic and syntactic design of the dialect? What is your vision for PQL in the space of "declarative ML" that you are working to define? Can you describe the available workflows for an individual or team that is using Predibase for prototyping and validating an ML model? Once a model has been deemed satisfactory, what is the path to production? How are you approaching governance and sustainability of Ludwig and Horovod while balancing your reliance on them in Predibase? 
What are some of the notable investments/improvements that you have made in Ludwig during your work of building Predibase? What are the most interesting, innovative, or unexpected ways that you have seen Predibase used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Predibase? When is Predibase the wrong choice? What do you have planned for the future of Predibase? Contact Info LinkedIn tgaddair on GitHub @travisaddair on Twitter Parting Question From your perspective, what is the biggest barrier to adoption of machine learning today? Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Links Predibase Horovod Ludwig Podcast.__init__ Episode Support Vector Machine Hadoop Tensorflow Uber Michelangelo AutoML Spark ML Lib Deep Learning PyTorch Continual Data Engineering Podcast Episode Overton Kubernetes Ray Nvidia Triton Whylogs Data Engineering Podcast Episode Weights and Biases MLFlow Comet Confusion Matrices dbt Data Engineering Podcast Episode Torchscript Self-supervised Learning The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

12/5/2022 • 59 minutes, 22 seconds

Build Better Machine Learning Models With Confidence By Adding Validation With Deepchecks

Preamble This is a cross-over episode from our new show The Machine Learning Podcast, the show about going from idea to production with machine learning. Summary Machine learning has the potential to transform industries and revolutionize business capabilities, but only if the models are reliable and robust. Because of the fundamental probabilistic nature of machine learning techniques it can be challenging to test and validate the generated models. The team at Deepchecks understands the widespread need to easily and repeatably check and verify the outputs of machine learning models and the complexity involved in making it a reality. In this episode Shir Chorev and Philip Tannor explain how they are addressing the problem with their open source deepchecks library and how you can start using it today to build trust in your machine learning applications. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery. Do you wish you could use artificial intelligence to drive your business the way Big Tech does, but don’t have a money printer? Graft is a cloud-native platform that aims to make the AI of the 1% accessible to the 99%. Wield the most advanced techniques for unlocking the value of data, including text, images, video, audio, and graphs. No machine learning skills required, no team to hire, and no infrastructure to build or maintain. For more information on Graft or to schedule a demo, visit themachinelearningpodcast.com/graft today and tell them Tobias sent you. Predibase is a low-code ML platform without low-code limits. Built on top of our open source foundations of Ludwig and Horovod, our platform allows you to train state-of-the-art ML and deep learning models on your datasets at scale. Our platform works on text, images, tabular, audio and multi-modal data using our novel compositional model architecture. 
We allow users to operationalize models on top of the modern data stack, through REST and PQL – an extension of SQL that puts predictive power in the hands of data practitioners. Go to themachinelearningpodcast.com/predibase today to learn more and try it out! Data powers machine learning, but poor data quality is the largest impediment to effective ML today. Galileo is a collaborative data bench for data scientists building Natural Language Processing (NLP) models to programmatically inspect, fix and track their data across the ML workflow (pre-training, post-training and post-production) – no more Excel sheets or ad-hoc Python scripts. Get meaningful gains in your model performance fast, dramatically reduce data labeling and procurement costs, while seeing 10x faster ML iterations. Galileo is offering listeners a free 30-day trial and a 30% discount on the product thereafter. This offer is available until Aug 31, so go to themachinelearningpodcast.com/galileo and request a demo today! Your host is Tobias Macey and today I’m interviewing Shir Chorev and Philip Tannor about Deepchecks, a Python package for comprehensively validating your machine learning models and data with minimal effort. Interview Introduction How did you get involved in machine learning? Can you describe what Deepchecks is and the story behind it? Who is the target audience for the project? What are the biggest challenges that these users face in bringing ML models from concept to production and how does DeepChecks address those problems? In the absence of DeepChecks how are practitioners solving the problems of model validation and comparison across iterations? What are some of the other tools in this ecosystem and what are the differentiating features of DeepChecks? What are some examples of the kinds of tests that are useful for understanding the "correctness" of models?
What are the methods by which ML engineers/data scientists/domain experts can define what "correctness" means in a given model or subject area? In software engineering the categories of tests are tiered as unit -> integration -> end-to-end. What are the relevant categories of tests that need to be built for validating the behavior of machine learning models? How do model monitoring utilities overlap with the kinds of tests that you are building with deepchecks? Can you describe how the DeepChecks package is implemented? How have the design and goals of the project changed or evolved from when you started working on it? What are the assumptions that you have built up from your own experiences that have been challenged by your early users and design partners? Can you describe the workflow for an individual or team using DeepChecks as part of their model training and deployment lifecycle? Test engineering is a deep discipline in its own right. How have you approached the user experience and API design to reduce the overhead for ML practitioners to adopt good practices? What are the interfaces available for creating reusable tests and composing test suites together? What are the additional services/capabilities that you are providing in your commercial offering? How are you managing the governance and sustainability of the OSS project and balancing that against the needs/priorities of the business? What are the most interesting, innovative, or unexpected ways that you have seen DeepChecks used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on DeepChecks? When is DeepChecks the wrong choice? What do you have planned for the future of DeepChecks? Contact Info Shir LinkedIn shir22 on GitHub Philip LinkedIn @philiptannor on Twitter Parting Question From your perspective, what is the biggest barrier to adoption of machine learning today? Closing Announcements Thank you for listening! 
Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Links DeepChecks Random Forest Talpiot Program SHAP Podcast.__init__ Episode Airflow Great Expectations Data Engineering Podcast Episode The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

11/28/2022 • 47 minutes, 36 seconds

Build A Full Stack ML Powered App In An Afternoon With Baseten

Preamble This is a cross-over episode from our new show The Machine Learning Podcast, the show about going from idea to production with machine learning. Summary Building an ML model is getting easier than ever, but it is still a challenge to get that model in front of the people that you built it for. Baseten is a platform that helps you quickly generate a full stack application powered by your model. You can easily create a web interface and APIs powered by the model you created, or a pre-trained model from their library. In this episode Tuhin Srivastava, co-founder of Baseten, explains how the platform empowers data scientists and ML engineers to get their work in production without having to negotiate for help from their application development colleagues. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host is Tobias Macey and today I’m interviewing Tuhin Srivastava about Baseten, an ML Application Builder for data science and machine learning teams. Interview Introduction How did you get involved in machine learning? Can you describe what Baseten is and the story behind it?
Who are the target users for Baseten and what problems are you solving for them? What are some of the typical technical requirements for an application that is powered by a machine learning model? In the absence of Baseten, what are some of the common utilities/patterns that teams might rely on? What kinds of challenges do teams run into when serving a model in the context of an application? There are a number of projects that aim to reduce the overhead of turning a model into a usable product (e.g. Streamlit, Hex, etc.). What is your assessment of the current ecosystem for lowering the barrier to product development for ML and data science teams? Can you describe how the Baseten platform is designed? How have the design and goals of the project changed or evolved since you started working on it? How do you handle sandboxing of arbitrary user-managed code to ensure security and stability of the platform? How did you approach the system design to allow for mapping application development paradigms into a structure that was accessible to ML professionals? Can you describe the workflow for building an ML powered application? What types of models do you support? (e.g. NLP, computer vision, timeseries, deep neural nets vs. linear regression, etc.) How do the monitoring requirements shift for these different model types? What other challenges are presented by these different model types? What are the limitations in size/complexity/operational requirements that you have to impose to ensure a stable platform? What is the process for deploying model updates? For organizations that are relying on Baseten as a prototyping platform, what are the options for taking a successful application and handing it off to a product team for further customization? What are the most interesting, innovative, or unexpected ways that you have seen Baseten used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Baseten? 
When is Baseten the wrong choice? What do you have planned for the future of Baseten? Contact Info @tuhinone on Twitter LinkedIn Parting Question From your perspective, what is the biggest barrier to adoption of machine learning today? Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Links Baseten Gumroad scikit-learn Tensorflow Keras Streamlit Podcast.__init__ Episode Retool Hex Podcast.__init__ Episode Kubernetes React Monaco Huggingface Airtable Dall-E 2 GPT-3 Weights and Biases The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

11/21/2022 • 45 minutes, 22 seconds

Skip Straight To The Fun Part Of Your Project With PyScaffold

Summary Starting a new project is always exciting and full of possibility, until you have to set up all of the repetitive boilerplate. Fortunately there are useful project templates that eliminate that drudgery. PyScaffold goes above and beyond simple template repositories, and gives you a toolkit for different application types that are packed with best practices to make your life easier. In this episode Florian Wilhelm shares the story behind PyScaffold, how the templates are designed to reduce friction when getting a new project off the ground, and how you can extend it to suit your needs. Stop wasting time with boring boilerplate and get straight to the fun part with PyScaffold! Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great! When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Florian Wilhelm about PyScaffold, a Python project template generator with batteries included Interview Introductions How did you get introduced to Python? Can you describe what PyScaffold is and the story behind it? What is the main goal of the project? 
There are a huge number of templates and starter projects available (both in Python and other languages). What are the aspects of PyScaffold that might encourage someone to adopt it? What are the different types/categories of applications that you are focused on supporting with the scaffolding? For each category, what is your selection process for which dependencies to include? How do you approach the work of keeping the various components up to date with community "best practices"? Can you describe how PyScaffold is implemented? How have the design and goals of the project changed since you first started it? What is the user experience for someone bootstrapping a project with PyScaffold? How can you adapt an existing project into the structure of a PyScaffold template? Are there any facilities for updating a project started with PyScaffold to include patches/changes in the source template? What are the most interesting, innovative, or unexpected ways that you have seen PyScaffold used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on PyScaffold? When is PyScaffold the wrong choice? What do you have planned for the future of PyScaffold? Keep In Touch Website LinkedIn FlorianWilhelm on GitHub @florianwilhelm on Twitter Picks Tobias Daredevil TV series Florian The Peripheral Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links PyScaffold Innovex SAP Cookiecutter Pytest Podcast Episode Sphinx pre-commit Podcast Episode Black Flake8 Podcast Episode Poetry Setuptools mkdocs ReStructured Text Markdown Setuptools-SCM Hatch Flit Versioneer Gource git visualization MyPy Compiler Rust Cargo The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/7/2022 • 57 minutes, 46 seconds

Add Configuration Best Practices To Your Application In An Afternoon With Dynaconf

Summary Application configuration is a deceptively complex problem. Everyone who is building a project that gets used more than once will end up needing to add configuration to control aspects of behavior or manage connections to other systems and services. At first glance it seems simple, but can quickly become unwieldy. Bruno Rocha created Dynaconf in an effort to provide a simple interface with powerful capabilities for managing settings across environments with a set of strong opinions. In this episode he shares the story behind the project, how its design allows for adapting to various applications, and how you can start using it today for your own projects. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great! When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Bruno Rocha about Dynaconf, a powerful and flexible framework for managing your application’s configuration settings Interview Introductions How did you get introduced to Python? Can you describe what Dynaconf is and the story behind it? What are your main goals for Dynaconf? What kinds of projects (e.g. web, devops, ML, etc.) 
are you focused on supporting with Dynaconf? Settings management is a deceptively complex and detailed aspect of software engineering, with a lot of conflicting opinions about the "right way". What are the design philosophies that you lean on for Dynaconf? Many engineers end up building their own frameworks for managing settings as their use cases and environments get increasingly complicated. What are some of the ways that those efforts can go wrong or become unmaintainable? Can you describe how Dynaconf is implemented? How have the design and goals of the project evolved since you first started it? What is the workflow for getting started with Dynaconf on a new project? How does the usage scale with the complexity of the host project? What are some strategies that you recommend for integrating Dynaconf into an existing project that already has complex requirements for settings across multiple environments? Secrets management is one of the most frequently under- or over-engineered aspects of application configuration. What are some of the ways that you have worked to strike a balance of making the "right way" easy? What are some of the more advanced or under-utilized capabilities of Dynaconf? What are the most interesting, innovative, or unexpected ways that you have seen Dynaconf used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Dynaconf? When is Dynaconf the wrong choice? What do you have planned for the future of Dynaconf? Keep In Touch rochacbruno on GitHub @rochacbruno on Twitter Website LinkedIn Picks Tobias SOPS Bruno Severance tv series Learn Rust Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. 
If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Links Dynaconf Dynaconf GitHub Org Ansible Bash Perl 12 Factor Applications TOML Hashicorp Vault Pydantic Airflow Hydroconf The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/30/2022 • 1 hour, 3 minutes, 59 seconds

Take A Tour Of The Hidden Language Of Hardware And How It Powers Your Code

Summary Software is eating the world, but that code has to have hardware to execute the instructions. Most people, and many software engineers, don’t have a proper understanding of how that hardware functions. Charles Petzold wrote the book "Code: The Hidden Language of Computer Hardware and Software" to make this a less opaque subject. In this episode he discusses what motivated him to revise that work in the second edition and the additional details that he packed in to explore the functioning of the CPU. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great! When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Charles Petzold about his work on the second edition of Code: The Hidden Language of Computer Hardware and Software Interview Introductions How did you get introduced to Python? Can you start by describing the focus and goal of "Code" and the story behind it? Who is the target audience for the book? The sequencing of the topics parallels the curriculum of a computer engineering course of study. 
Why do you think that it is useful/important for a general audience to understand the electrical engineering principles that underlie modern computers? What was your process for determining how to segment the information that you wanted to address in the book to balance the pacing of the reader with the density of the information? Technical books are notoriously challenging to write due to the constantly changing subject matter. What are some of the ways that the first edition of "Code" was becoming outdated? What are the most notable changes in the foundational elements of computing that have happened in the time since the first edition was published? One of the concepts that I have found most helpful as a software engineer is that of "mechanical sympathy". What are some of the ways that a better understanding of computer hardware and electrical signal processing can influence and improve the way that an engineer writes code? What are some of the insights that you gained about your own use of computers and software while working on this book? What are the most interesting, unexpected, or challenging lessons that you have learned while writing "Code" and revising it for the second edition? Once the reader has finished with your book, what are some of the other references/resources that you recommend? Keep In Touch Website Picks Tobias The Imitation Game movie Charles The Annotated Turing book by Charles Petzold Confidence Man: The Making of Donald Trump and the Breaking of America by Maggie Haberman Links Code: The Hidden Language of Computer Hardware and Software Fortran PL/I BASIC C# Z80 Intel 8080 PC Magazine Assembly Language Logic Gates C Language ASCII == American Standard Code for Information Interchange SkiaSharp Algol Code first edition bibliography The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/23/2022 • 41 minutes, 49 seconds

Take Control Of Your Electrical Systems With The Open Source FlexMeasures Energy Management System

Summary The generation, distribution, and consumption of energy is one of the most critical pieces of infrastructure for the modern world. With the rise of renewable energy there is an accompanying need for systems that can respond in real time to the availability and demand for electricity. FlexMeasures is an open source energy management system that is designed to integrate a variety of inputs and intelligently allocate energy resources to reduce waste in your home or grid. In this episode Nicolas Höning explains how the project is implemented, how it is being used in his startup Seita, and how you can try it out for your own energy needs. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great! When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Nicolas Höning about FlexMeasures, an open source project designed to manage energy resources dynamically to improve efficiency Interview Introductions How did you get introduced to Python? Can you describe what FlexMeasures is and the story behind it? What are the primary goals/objectives of the project? The energy sector is huge. Where can FlexMeasures be used? 
Energy systems are typically governed by a marketplace system. What are the benefits that FlexMeasures can provide for each side of that market? How do renewable sources of energy confuse/complicate the role that the different stakeholders represent? What are the different points of interaction that producers/consumers might have with the FlexMeasures platform? What are some examples of the types of decisions/recommendations that FlexMeasures might generate and how do they manifest in the energy systems? What are the types of information that FlexMeasures relies on for driving those decisions? Can you describe how FlexMeasures is implemented? How have the design and goals of the system changed/evolved since you started working on it? What are the interfaces that you provide for integrating with and extending the functionality of a FlexMeasures installation? What are the operating scales that FlexMeasures is designed for? What are the most interesting, innovative, or unexpected ways that you have seen FlexMeasures used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on FlexMeasures? When is FlexMeasures the wrong choice? What do you have planned for the future of FlexMeasures? Keep In Touch Website @nhoening on Twitter LinkedIn Picks Tobias She-Hulk Nicolas Kleo on Netflix Altair Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links FlexMeasures: GitHub Linux Energy Foundation Mailing List Twitter EyeQuant Energy Management System OpenEMS ICT == Information and Communications Technology HomeAssistant Podcast Episode FlexMeasures HomeAssistant Plugin Universal Smart Energy Framework PostgreSQL Data Engineering Podcast Episode TimescaleDB Data Engineering Podcast Episode OpenWeatherMap Timely-Beliefs library Flask Click Pyomo scikit-learn sktime LF Energy Flake8 MyPy Podcast Episode Black ARIMA Model Random Forest The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/16/2022 • 49 minutes, 16 seconds

How And Why To Build Effective Teams As An Engineering Leader

Summary Your ability to build and maintain a software project is tempered by the strength of the team that you are working with. If you are in a position of leadership, then you are responsible for the growth and maintenance of that team. In this episode Jigar Desai, currently the SVP of engineering at Sisu Data, shares his experience as an engineering leader over the past several years and the useful insights he has gained into how to build effective engineering teams. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great! When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! The biggest challenge with modern data systems is understanding what data you have, where it is located, and who is using it. Select Star’s data discovery platform solves that out of the box, with a fully automated catalog that includes lineage from where the data originated, all the way to which dashboards rely on it and who is viewing them every day. Just connect it to your dbt, Snowflake, Tableau, Looker, or whatever you’re using and Select Star will set everything up in just a few hours. 
Go to pythonpodcast.com/selectstar today to double the length of your free trial and get a swag package when you convert to a paid plan. Your host as usual is Tobias Macey and today I’m interviewing Jigar Desai about building effective engineering teams Interview Introductions How did you get introduced to Python? What have you found to be the central challenges involved in building an effective engineering team? What are the measures that you use to determine what "effective" means for a given team? How to establish mutual trust in an engineering team. Challenges introduced at different levels of team size/organizational complexity. Establishing and managing career ladders. You have mostly worked in heavily tech-focused companies. How do industry verticals impact the ways that you think about formation and structure of engineering teams? What are some of the different roles that you might focus on hiring/team compositions in industries that aren’t purely software? (e.g. fintech, logistics, etc.) Notable evolutions in engineering practices/paradigm shifts in the industry. What are some of the predictions that you have about how the future of engineering will look? What impact do you think low-code/no-code solutions will have on the types of projects that code-first developers will be tasked with? What are the most interesting, innovative, or unexpected ways that you have seen organizational leaders address the work of building and scaling engineering capacity? What are the most interesting, unexpected, or challenging lessons that you have learned while working in engineering leadership? What are the most informative mistakes that you would like to share? What are some resources and reference material that you recommend for anyone responsible for the success of their engineering teams? Keep In Touch LinkedIn Picks Tobias Bullet Train movie Jigar Top Gun Maverick movie Closing Announcements Thank you for listening! Don’t forget to check out our other shows. 
The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Sisu Data OpenStack Java The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/10/2022 • 1 hour, 4 minutes, 50 seconds

Complete Your Hardware "Weekend Projects" In An Actual Weekend With Belay

Summary Working on hardware projects often has significant friction involved when compared to pure software. Brian Pugh enjoys tinkering with microcontrollers, but his "weekend projects" often took longer than a weekend to complete, so he created Belay. In this episode he explains how Belay simplifies the interactions involved in developing for MicroPython boards and how you can use it to speed up your own experimentation. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great! When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! The biggest challenge with modern data systems is understanding what data you have, where it is located, and who is using it. Select Star’s data discovery platform solves that out of the box, with a fully automated catalog that includes lineage from where the data originated, all the way to which dashboards rely on it and who is viewing them every day. Just connect it to your dbt, Snowflake, Tableau, Looker, or whatever you’re using and Select Star will set everything up in just a few hours. Go to pythonpodcast.com/selectstar today to double the length of your free trial and get a swag package when you convert to a paid plan. 
Your host as usual is Tobias Macey and today I’m interviewing Brian Pugh about Belay, a Python library that enables the rapid development of projects that interact with hardware via a MicroPython-compatible board. Interview Introductions How did you get introduced to Python? Can you describe what Belay is and the story behind it? Who are the target users for Belay? What are some of the points of friction involved in developing for hardware projects? What are some of the features of Belay that make that a smoother process? What are some of the ways that simplifying the develop/debug cycles can improve the overall experience of developing for hardware platforms? What are some of the inherent limitations of constrained hardware that Belay is unable to paper over? Can you describe how Belay is implemented? What does the workflow look like when using Belay as compared to using MicroPython directly? What are some of the ways that you are using Belay in your own projects? What are the most interesting, innovative, or unexpected ways that you have seen Belay used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Belay? When is Belay the wrong choice? What do you have planned for the future of Belay? Keep In Touch BrianPugh on GitHub LinkedIn Picks Tobias Gunnar Computer Glasses Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Belay Geomagical PIC Microcontroller AVR Microcontroller Matlab MicroPython Podcast Episode CircuitPython Podcast Episode Celery Potentiometer Raspberry Pi Raspberry Pi Pico ADC Converter Thonny Podcast Episode Adafruit Pyboard Python Inspect Module Python Tokenize Magnetometer Project Lidar The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/3/2022 • 48 minutes, 29 seconds

Catching Up With Pyre, A Fast Type Checker For Python

Summary Static typing versus dynamic typing is one of the oldest debates in software development. In recent years a number of dynamic languages have worked toward a middle ground by adding support for type hints. Python’s type annotations have given rise to an ecosystem of tools that use that type information to validate the correctness of programs and help identify potential bugs. At Instagram they created the Pyre project with a focus on speed to allow for scaling to huge Python projects. In this episode Shannon Zhu discusses how it is implemented, how to use it in your development process, and how it compares to other type checkers in the Python ecosystem. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Shannon Zhu about Pyre, a type checker for Python 3 built from the ground up to support gradual typing and deliver responsive incremental checks Interview Introductions How did you get introduced to Python? Can you describe what Pyre is and the story behind it? There have been a number of tools created to support various aspects of typing for Python. 
How would you describe the various goals that they support and how Pyre fits in that ecosystem? What are the core goals and notable features of Pyre? Can you describe how Pyre is implemented? How have the design and goals of the project changed/evolved since you started working on it? What are the different ways that Pyre is used in the development workflow for a team or individual? What are some of the challenges/roadblocks that people run into when adopting type definitions in their Python projects? How has the evolution of type annotations and overall support for them affected your work on Pyre? As someone who is working closely with type systems, what are the strongest aspects of Python’s implementation and opportunities for improvement? What are the most interesting, innovative, or unexpected ways that you have seen Pyre used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Pyre? When is Pyre the wrong choice? What do you have planned for the future of Pyre? Keep In Touch shannonzhu on GitHub Picks Tobias Lord Of The Rings: The Rings of Power on Amazon Video Shannon King’s Dilemma board game Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Pyre MyPy Podcast Episode Pyright Pytype MonkeyType Podcast Episode Java C PEP 484 Flow Hack Continuous Integration OCaml PEP 675 – Arbitrary literal strings Gradual Typing AST == Abstract Syntax Tree Language Server Protocol Tensor Type Arithmetic PyCon: Securing Code With The Python Type System PyCon: Type Checked Python In The Real World PyCon: Łukasz Langa 2022 Keynote The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/19/2022 • 51 minutes, 45 seconds

Standardizing On Python For All Software Projects At Ascend.io

Summary Every software project is subject to a series of decisions and tradeoffs. One of the first decisions to make is which programming language to use. For companies where their product is software, this is a decision that can have significant impact on their overall success. In this episode Sean Knapp discusses the languages that his team at Ascend use for building a service that powers complex and business critical data workflows. He also explains his motivation to standardize on Python for all layers of their system to improve developer productivity. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Sean Knapp about his motivations and experiences standardizing on Python for development at Ascend Interview Introductions How did you get introduced to Python? Can you describe what Ascend is and the story behind it? How many engineers work at Ascend? What are their different areas of focus? What are your policies for selecting which technologies (e.g. languages, frameworks, dev tooling, deployment, etc.) are supported at Ascend? 
What does it mean for a technology to be supported? You recently started standardizing on Python as the default language for development. How has Python been used up to now? What other languages are in common use at Ascend? What are some of the challenges/difficulties that motivated you to establish this policy? What are some of the tradeoffs that you have seen in the adoption of Python in place of your other adopted languages? How are you managing ongoing maintenance of projects/products that are not written in Python? What are some of the potential pitfalls/risks that you are guarding against in your investment in Python? What are the most interesting, innovative, or unexpected ways that you have seen Python used where it was previously a different technology? What are the most interesting, unexpected, or challenging lessons that you have learned while working on aligning all of your development on a single language? When is Python the wrong choice? What do you have planned for the future of engineering practices at Ascend? Keep In Touch LinkedIn @seanknapp on Twitter Picks Tobias Delver Lens app for scanning Magic: The Gathering cards Sean Typer DuckDB Amp It Up book (affiliate link) Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Ascend Data Engineering Podcast Episode Perl Google Sawzall Technical Debt Ruby gRPC Go Language Java PySpark Apache Arrow Thrift SQL Scala Snowflake runtime for Python Snowpark Typer CLI framework Pydantic Podcast Episode Pulumi Podcast Episode PyInfra Podcast Episode Packer Plot.ly Dash DuckDB The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/13/2022 • 50 minutes, 25 seconds

Exploring The Process And Practice Of Building Better Software Through Code Reviews

Summary Writing code is only one piece of creating good software. Code reviews are an important step in the process of building applications that are maintainable and sustainable. In this episode On Freund shares his thoughts on the myriad purposes that code reviews serve, as well as exploring some of the patterns and anti-patterns that grow up around a seemingly simple process. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing On Freund about the intricacies and importance of code reviews Interview Introductions How did you get introduced to Python? Can you start by giving us your description of what a code review is? What is the purpose of the code review? At face value a code review appears to be a simple task. What are some of the subtleties that become evident with time and experience? What are some of the ways that code reviews can go wrong? What are some common anti-patterns that get applied to code reviews? What are the elements of code review that are useful to automate? 
What are some of the risks/bad habits that can result from overdoing automated checks/fixes or over-reliance on those tools in code reviews? Identifying who can/should do a review for a piece of code. How to use code reviews as a teaching tool for new/junior engineers. How to use code reviews for avoiding siloed experience/promoting cross-training. PR templates for capturing relevant context. What are the most interesting, innovative, or unexpected ways that you have seen code reviews used? What are the most interesting, unexpected, or challenging lessons that you have learned while leading and supporting engineering teams? What are some resources that you recommend for anyone who wants to learn more about code review strategies and how to use them to scale their teams? Keep In Touch LinkedIn @onfreund on Twitter Picks Tobias The Girl Who Drank The Moon On Better Call Saul Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Wilco Code Review Home Assistant Podcast Episode Trunk-based Development Git Flow Pair Programming Feature Flags Podcast Episode KPI == Key Performance Indicator MIT Open Learning Engineering Handbook PEP Repository The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/5/2022 • 57 minutes, 24 seconds

Ship With Confidence By Automating Quality Assurance

Summary Quality assurance in the software industry has become a shared responsibility in most organizations. Given the rapid pace of development and delivery it can be challenging to ensure that your application is still working the way it’s supposed to with each release. In this episode Jonathon Wright discusses the role of quality assurance in modern software teams and how automation can help. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Jonathon Wright about the role of automation in your testing and QA strategies Interview Introductions How did you get introduced to Python? Can you share your relationship with software testing/QA and automation? What are the main categories of how companies and software teams address testing and validation of their applications? What are some of the notable tradeoffs/challenges among those approaches? With the increased adoption of agile practices and the "shift left" mentality of DevOps, who is responsible for software quality? What are some of the cases where a discrete QA role or team becomes necessary? (or is it always necessary?) 
With testing and validation being a shared responsibility, competing with other priorities, what role does automation play? What are some of the ways that automation manifests in software quality and testing? How is automation distinct from software tests and CI/CD? For teams who are investing in automation for their applications, what are the questions they should be asking to identify what solutions to adopt? (what are the decision points in the build vs. buy equation?) At what stage(s) of the software lifecycle does automation live? What is the process for identifying which capabilities and interactions to target during the initial application of automation for QA and validation? One of the perennial challenges with any software testing, particularly for anything in the UI, is that it is a constantly moving target. What are some of the patterns and techniques, both from a developer and tooling perspective, that increase the robustness of automated validation? What are the most interesting, innovative, or unexpected ways that you have seen automation used for QA? What are the most interesting, unexpected, or challenging lessons that you have learned while working on QA and automation? When is automation the wrong choice? What are some of the resources that you recommend for anyone who wants to learn more about this topic? 
Keep In Touch LinkedIn @Jonathon_Wright on Twitter Website Picks Tobias The Sandman Netflix series and Graphic Novels by Neil Gaiman Jonathon House of the Dragon HBO series Mythic Quest TV series It’s Always Sunny in Philadelphia Links Haskell Idris Esperanto Klingon Planguage Lisp Language TDD == Test Driven Development BDD == Behavior Driven Development Gherkin Format Integration Testing Chaos Engineering Gremlin Chaos Toolkit Podcast Episode Requirements Engineering Keysight QA Lead Podcast Cognitive Learning TED Talk OpenTelemetry Podcast Episode Quality Engineering Selenium Swagger XPath Regular Expression Test Guild The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/28/2022 • 1 hour, 9 minutes, 4 seconds

Remove Roadblocks And Let Your Developers Ship Faster With Self-Serve Infrastructure

Summary The goal of every software team is to get their code into production without breaking anything. This requires establishing a repeatable process that doesn’t introduce unnecessary roadblocks and friction. In this episode Ronak Rahman discusses the challenges that development teams encounter when trying to build and maintain velocity in their work, the role that access to infrastructure plays in that process, and how to build automation and guardrails for everyone to take part in the delivery process. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Ronak Rahman about how automating the path to production helps to build and maintain development velocity Interview Introductions How did you get introduced to Python? Can you describe what Quali is and the story behind it? What are the problems that you are trying to solve for software teams? How does Quali help to address those challenges? What are the bad habits that engineers fall into when they experience friction with getting their code into test and production environments? 
How do those habits contribute to negative feedback loops? What are signs that developers and managers need to watch for that signal the need for investment in developer experience improvements on the path to production? Can you describe what you have built at Quali and how it is implemented? How have the design and goals shifted/evolved from when you first started working on it? What are the positive and negative impacts that you have seen from the evolving set of options for application deployments? (e.g. K8s, containers, VMs, PaaS, FaaS, etc.) Can you describe how Quali fits into the workflow of software teams? Once a team has established patterns for deploying their software, what are some of the disruptions to their flow that they should guard against? What are the most interesting, innovative, or unexpected ways that you have seen Quali used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Quali? When is Quali the wrong choice? What do you have planned for the future of Quali? Keep In Touch @OfRonak on Twitter Picks Tobias The Terminal List on Amazon Ronak Midnight Gospel on Amazon Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Quali Torque Visual Studio Plugin Subversion IaC == Infrastructure as Code DevOps Terraform Pulumi Podcast Episode Cloudformation Flask The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/14/2022 • 1 hour, 1 minute, 48 seconds

The Benefits Of Python And Django For Going From Zero To MVP At Speed

Summary Every startup begins with an idea, but that won’t get you very far without testing the feasibility of that idea. A common practice is to build a Minimum Viable Product (MVP) that addresses the problem that you are trying to solve and working with early customers as they engage with that MVP. In this episode Tony Pavlovych shares his thoughts on Python’s strengths when building and launching that MVP and some of the potential pitfalls that businesses can run into on that path. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Tony Pavlovych about Python’s strengths for startups and the steps to building an MVP (minimum viable product) Interview Introductions How did you get introduced to Python? Can you describe what PLANEKS is and the story behind it? One of the services that you offer is building an MVP. What are the goals and outcomes associated with an MVP? What is the process for identifying the product focus and feature scope? What are some of the common misconceptions about building and launching MVPs that you have dealt with in your work with customers? 
What are the common pitfalls that companies encounter when building and validating an MVP? Can you describe the set of tools and frameworks (e.g. Django, Poetry, cookiecutter, etc.) that you have invested in to reduce the overhead of starting and maintaining velocity on multiple projects? What are the configurations that are most critical to keep constant across projects to maintain familiarity and sanity for your developers? (e.g. linting rules, build toolchains, etc.) What are the architectural patterns that you have found most useful to make MVPs flexible for adaptation and extension? Once the MVP is built and launched, what are the next steps to validate the product and determine priorities? What benefits do you get from choosing Python as your language for building an MVP/launching a startup? What are the challenges/risks involved in that choice? What are the most interesting, unexpected, or challenging lessons that you have learned while working on MVPs for your clients at PLANEKS? When is an MVP the wrong choice? What are the developments in the Python and broader software ecosystem that you are most interested in for the work you are doing for your team and clients? Keep In Touch LinkedIn Picks Tobias datamodel-code-generator Tony Screw It, Let’s Do It by Richard Branson (affiliate link) Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links PLANEKS Minimum Viable Product Django Cookiecutter Django Boilerplate OCR == Optical Character Recognition Tesseract OCR framework The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/31/2022 • 47 minutes, 6 seconds

Powering The Next Generation Of Application Architectures With Web Assembly And The Fermyon Platform

Summary Application architectures have been in a constant state of evolution as new infrastructure capabilities are introduced. Virtualization, cloud, containers, mobile, and now web assembly have each introduced new options for how to build and deploy software. Recognizing the transformative potential of web assembly, Matt Butcher and his team at Fermyon are investing in tooling and services to improve the developer experience. In this episode he explains the opportunity that web assembly offers to all language communities, what they are building to power lightweight server-side microservices, and how Python developers can get started building and contributing to this nascent ecosystem. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Need to automate your Python code in the cloud? Want to avoid the hassle of setting up and maintaining infrastructure? Shipyard is the premier orchestration platform built to help you quickly launch, monitor, and share python workflows in a matter of minutes with 0 changes to your code. 
Shipyard provides powerful features like webhooks, error-handling, monitoring, automatic containerization, syncing with Github, and more. Plus, it comes with over 70 open-source, low-code templates to help you quickly build solutions with the tools you already use. Go to dataengineeringpodcast.com/shipyard to get started automating with a free developer plan today! Your host as usual is Tobias Macey and today I’m interviewing Matt Butcher about Fermyon and the impact of WebAssembly on software architecture and deployment across language boundaries Interview Introductions How did you get introduced to Python? For anyone who isn’t familiar with WebAssembly can you give your elevator pitch for why it matters? What is the current state of language support for Python in the WASM ecosystem? Can you describe what Fermyon is and the story behind it? What are your goals with Fermyon and what are the products that you are building to support those goals? There has been a steady progression of technologies aimed at better ways to build, deploy, and manage software (e.g. virtualization, cloud, containers, etc.). What are the problems with the previous options and how does WASM address them? What are some examples of the types of applications/services that work well in a WASM environment? Can you describe how you have architected the Fermyon platform? How did you approach the design of the interfaces and tooling to support developer ergonomics? How have the design and goals of the platform changed or evolved since you started working on it? Can you describe what a typical workflow is for an application team that is using Spin/Fermyon to build and deploy a service? What are some of the architectural patterns that WASM/Fermyon encourage? What are some of the limitations that WASM imposes on services using it as a runtime? (e.g. system access, threading/multiprocessing, library support, C extensions, etc.) 
What are the new and emerging topics and capabilities in the WASM ecosystem that you are keeping track of? With Spin as the core building block of your platform, how are you approaching governance and sustainability of the open source project? What are your guiding principles for when a capability belongs in the OSS vs. commercial offerings? What are the most interesting, innovative, or unexpected ways that you have seen Fermyon used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Fermyon? When is Fermyon the wrong choice? What do you have planned for the future of Fermyon? Keep In Touch LinkedIn @technosophos on Twitter technosophos on GitHub Picks Tobias Thor: Love & Thunder movie Matt Remembrance of Earth’s Past trilogy ("Three Body Problem" is the first) by Cixin Liu Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Fermyon Our Python entry for the Wasm Language Matrix SingleStore’s WASI-Python Great notes about Wasm support in CPython Pyodide for Python in the Browser SlashDot Web Assembly (WASM) Rust AssemblyScript Grain WASM language SingleStore Data Engineering Podcast Episode WASI PyO3 PyOxidizer RustPython Drupal OpenStack Deis Helm RedPanda Data Engineering Podcast Episode Envoy Proxy Fastly Functions as a Service CloudEvents Finicky Whiskers Fermyon Spin Nomad Tree Shaking Zappa Chalice OpenFaaS CNCF Bytecode Alliance Finicky Whiskers Minecraft Kotlin The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/25/2022 • 1 hour, 10 minutes, 39 seconds

Gain A Deeper Understanding Of What Your Code Is Doing And Where It Spends Its Time With VizTracer

Summary As your code scales beyond a trivial level of complexity and sophistication it becomes difficult or impossible to know everything that it is doing. The flow of logic and data through your software and which parts are taking the most time are impossible to understand without help from your tools. VizTracer is the tool that you will turn to when you need to know all of the execution paths that are being exercised and which of those paths are the most expensive. In this episode Tian Gao explains why he created VizTracer and how you can use it to gain a deeper familiarity with the code that you are responsible for maintaining. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Need to automate your Python code in the cloud? Want to avoid the hassle of setting up and maintaining infrastructure? Shipyard is the premier orchestration platform built to help you quickly launch, monitor, and share python workflows in a matter of minutes with 0 changes to your code. Shipyard provides powerful features like webhooks, error-handling, monitoring, automatic containerization, syncing with Github, and more. 
Plus, it comes with over 70 open-source, low-code templates to help you quickly build solutions with the tools you already use. Go to dataengineeringpodcast.com/shipyard to get started automating with a free developer plan today! Your host as usual is Tobias Macey and today I’m interviewing Tian Gao about VizTracer, a low-overhead logging/debugging/profiling tool that can trace and visualize your python code execution Interview Introductions How did you get introduced to Python? Can you describe what VizTracer is and the story behind it? What are the main goals that you are focused on with VizTracer? What are some examples of the types of bugs that profiling can help diagnose? How does profiling work together with other debugging approaches? (e.g. logging, breakpoint debugging, etc.) There are a number of profiling utilities for Python. What feature or combination of features were missing that motivated you to create VizTracer? Can you describe how VizTracer is implemented? How have the design and goals changed since you started working on it? There are a number of styles of profiling, what was your process for deciding which approach to use? What are the most complex engineering tasks involved in building a profiling utility? Can you describe the process of using VizTracer to identify and debug errors and performance issues in a project? What are the options for using VizTracer in a production environment? What are the interfaces and extension points that you have built in to allow developers to customize VizTracer? What are some of the ways that you have used VizTracer while working on VizTracer? What are the most interesting, innovative, or unexpected ways that you have seen VizTracer used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on VizTracer? When is VizTracer the wrong choice? What do you have planned for the future of VizTracer? 
Keep In Touch gaogaotiantian on GitHub LinkedIn Picks Tobias Travelers show on Netflix Tian objprint Lincoln Lawyer bilibili – Tian’s coding sessions in Chinese Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Viztracer Python cProfile Sampling Profiler Perfetto Coverage.py Podcast Episode Python setprofile hook Circular Buffer Catapult Trace Viewer py-spy psutil gdb Flame graph The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
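
The links for this episode mention Python's profile hook and a circular buffer. As a rough illustration of how those two ideas can fit together, here is a standard-library-only sketch. It is not VizTracer's implementation or API; the helper name `trace_calls`, the `buffer_size` parameter, and the demo function `fib` are all hypothetical, chosen only for this example.

```python
import sys
import time
from collections import deque


def trace_calls(func, *args, buffer_size=1000, **kwargs):
    """Run func and record Python-level call/return events.

    Hypothetical helper for illustration only: events are stored in a
    bounded deque, so once buffer_size is reached the oldest entries
    are dropped (the circular-buffer behaviour).
    """
    events = deque(maxlen=buffer_size)

    def profiler(frame, event, arg):
        # Ignore C-level events (c_call, c_return, c_exception).
        if event in ("call", "return"):
            events.append((event, frame.f_code.co_name, time.perf_counter()))

    previous = sys.getprofile()  # remember any profiler already installed
    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(previous)  # always restore the earlier hook
    return result, list(events)


def fib(n):
    """Small recursive demo function to generate nested calls."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    value, recorded = trace_calls(fib, 3)
    for kind, name, timestamp in recorded:
        print(f"{timestamp:.6f} {kind:>6} {name}")
```

A bounded buffer trades completeness for predictable memory use: a long-running program keeps only its most recent events instead of growing without limit.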

7/17/2022 • 48 minutes, 33 seconds

Stream Processing In Real Time And At Scale In Pure Python With Bytewax

Summary Analysis of streaming data in real time has long been the domain of big data frameworks, predominantly written in Java. Taking advantage of those capabilities from Python requires using client libraries that suffer from impedance mismatches that make the work harder than necessary. Bytewax is a new open source platform for writing stream processing applications in pure Python that don’t have to be translated into foreign idioms. In this episode Bytewax founder Zander Matheson explains how the system works and how to get started with it today. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! The biggest challenge with modern data systems is understanding what data you have, where it is located, and who is using it. Select Star’s data discovery platform solves that out of the box, with a fully automated catalog that includes lineage from where the data originated, all the way to which dashboards rely on it and who is viewing them every day. Just connect it to your dbt, Snowflake, Tableau, Looker, or whatever you’re using and Select Star will set everything up in just a few hours. 
Go to pythonpodcast.com/selectstar today to double the length of your free trial and get a swag package when you convert to a paid plan. Need to automate your Python code in the cloud? Want to avoid the hassle of setting up and maintaining infrastructure? Shipyard is the premier orchestration platform built to help you quickly launch, monitor, and share python workflows in a matter of minutes with 0 changes to your code. Shipyard provides powerful features like webhooks, error-handling, monitoring, automatic containerization, syncing with Github, and more. Plus, it comes with over 70 open-source, low-code templates to help you quickly build solutions with the tools you already use. Go to dataengineeringpodcast.com/shipyard to get started automating with a free developer plan today! Your host as usual is Tobias Macey and today I’m interviewing Zander Matheson about Bytewax, an open source Python framework for building highly scalable dataflows to process ANY data stream. Interview Introductions How did you get introduced to Python? Can you describe what Bytewax is and the story behind it? Who are the target users for Bytewax? What is the problem that you are trying to solve with Bytewax? What are the alternative systems/architectures that you might replace with Bytewax? Can you describe how Bytewax is implemented? What are the benefits of Timely Dataflow as a core building block for a system like Bytewax? How have the design and goals of the project changed/evolved since you first started working on it? What are the axes available for scaling Bytewax execution? How have you approached the design of the Bytewax API to make it accessible to a broader audience? Can you describe what is involved in building a project with Bytewax? What are some of the stream processing concepts that engineers are likely to run up against as they are experimenting and designing their code? 
What is your motivation for providing the core technology of your business as an open source engine? How are you approaching the balance of project governance and sustainability with opportunities for commercialization? What are the most interesting, innovative, or unexpected ways that you have seen Bytewax used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Bytewax? When is Bytewax the wrong choice? What do you have planned for the future of Bytewax? Keep In Touch Slack Twitter LinkedIn Picks Tobias Alta Racks Zander Atherton Bikes Links Bytewax GitHub Flink Data Engineering Podcast Episode Spark Streaming Kafka Connect Faust Podcast Episode Ray Podcast Episode Dask Data Engineering Podcast Episode Timely Dataflow PyO3 Materialize Data Engineering Podcast Episode HyperLogLog Python River Library Shannon Entropy Calculation The blog post using incremental shannon entropy NATS waxctl Prometheus Grafana Streamz The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
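
The links for this episode point to a Shannon entropy calculation and a blog post on computing it incrementally. As a rough illustration of what "incremental" can mean here, the sketch below keeps a running entropy over a stream using only the standard library. It does not use the Bytewax or River APIs and is not taken from the linked blog post; the class name `StreamingEntropy` and its methods are hypothetical. It relies on the identity H = log2(n) - (1/n) * sum(c_i * log2(c_i)), where c_i is the count of symbol i and n is the total number of items, so each update only adjusts one term of the sum.

```python
import math
from collections import Counter


class StreamingEntropy:
    """Running Shannon entropy (in bits) over a stream of hashable items.

    Hypothetical class for illustration only. Each update costs O(1)
    because only the term for the updated symbol changes.
    """

    def __init__(self):
        self.counts = Counter()
        self.total = 0
        self._sum_c_log_c = 0.0  # running value of sum(c_i * log2(c_i))

    def update(self, item):
        count = self.counts[item]  # 0 for unseen items
        if count > 0:
            # Remove the old contribution of this symbol.
            self._sum_c_log_c -= count * math.log2(count)
        count += 1
        # Add the new contribution (zero when count == 1).
        self._sum_c_log_c += count * math.log2(count)
        self.counts[item] = count
        self.total += 1
        return self.entropy

    @property
    def entropy(self):
        if self.total == 0:
            return 0.0
        return math.log2(self.total) - self._sum_c_log_c / self.total


if __name__ == "__main__":
    tracker = StreamingEntropy()
    for symbol in "aabb":
        print(symbol, round(tracker.update(symbol), 4))
```

Because the state is just a counter and two numbers, a structure like this could be carried as per-key state inside a stream processor, though how that state is wired up depends on the framework.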

7/10/2022 • 42 minutes, 32 seconds

Tetra: A Full Stack Web Framework That Doesn't Make You Write Everything Twice

Summary Building a fully functional web application has been growing in complexity along with the growing popularity of JavaScript UI frameworks such as React, Vue, Angular, etc. Users have grown to expect interactive experiences with dynamic page updates, which leads to duplicated business logic and complex API contracts between the server-side application and the JavaScript front-end. To reduce the friction involved in writing and maintaining a full application Sam Willis created Tetra, a framework built on top of Django that embeds the JavaScript logic into the Python context where it is used. In this episode he explains his design goals for the project, how it has helped him build applications more rapidly, and how you can start using it to build your own projects today. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! So now your modern data stack is set up. How is everyone going to find the data they need, and understand it? Select Star is a data discovery platform that automatically analyzes & documents your data. 
For every table in Select Star, you can find out where the data originated, which dashboards are built on top of it, who’s using it in the company, and how they’re using it, all the way down to the SQL queries. Best of all, it’s simple to set up, and easy for both engineering and operations teams to use. With Select Star’s data catalog, a single source of truth for your data is built in minutes, even across thousands of datasets. Try it out for free and double the length of your free trial today at pythonpodcast.com/selectstar. You’ll also get a swag package when you continue on a paid plan. Need to automate your Python code in the cloud? Want to avoid the hassle of setting up and maintaining infrastructure? Shipyard is the premier orchestration platform built to help you quickly launch, monitor, and share python workflows in a matter of minutes with 0 changes to your code. Shipyard provides powerful features like webhooks, error-handling, monitoring, automatic containerization, syncing with Github, and more. Plus, it comes with over 70 open-source, low-code templates to help you quickly build solutions with the tools you already use. Go to dataengineeringpodcast.com/shipyard to get started automating with a free developer plan today! Your host as usual is Tobias Macey and today I’m interviewing Sam Willis about Tetra, a full stack component framework for your Django applications Interview Introductions How did you get introduced to Python? Can you describe what Tetra is and the story behind it? What are the problems that you are aiming to solve with this project? What are some of the other ways that you have addressed those problems? What are the shortcomings that you encountered with those solutions? What was missing in the existing landscape of full-stack application development patterns that prompted you to build a new meta-framework? 
What are some of the sources of inspiration (positive and negative) that you looked to while deciding on the component selection and implementation strategy? Can you describe how Tetra is implemented? What are the core principles that you are relying on to drive your design of APIs and developer experience? What is the process for building a full component in Tetra? What are some of the application design challenges that are introduced by combining the JavaScript and Django logic and attributes? (e.g. reusing JS logic/CSS styles across components) A perennial challenge with combining the syntax across multiple languages in a single file is editor support. How are you thinking about that with Tetra’s implementation? What is your grand vision for Tetra and how are you working to make it sustainable? What are the most interesting, innovative, or unexpected ways that you have seen Tetra used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Tetra? When is Tetra the wrong choice? What do you have planned for the future of Tetra? Keep In Touch @samwillis on Twitter Website LinkedIn samwillis on GitHub Picks Tobias The Machine Learning Podcast Sam Slow Horses TV Show Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Tetra Framework Django PHP ASP Alpine.js HTMX Ruby Ruby on Rails Flutterbox Vue.js Laravel Livewire Python Import Hooks python-inline-source Tailwind CSS PostCSS Pickle Fernet esbuild Webpack Rich The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/3/2022 • 53 minutes, 6 seconds

Design Real-World Objects In Python With CadQuery

Summary Virtually everything that you interact with on a daily basis and many other things that make modern life possible were designed and modeled in software called CAD or Computer-Aided Design. These programs are advanced suites with graphical editing environments tailored to domain experts in areas such as mechanical engineering, electrical engineering, architecture, etc. While the UI-driven workflow is more accessible, it isn’t scalable which opens the door to code-driven workflows. In this episode Jeremy Wright discusses the design, uses, and benefits of the CadQuery framework for building 3D CAD models entirely in Python. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! So now your modern data stack is set up. How is everyone going to find the data they need, and understand it? Select Star is a data discovery platform that automatically analyzes & documents your data. For every table in Select Star, you can find out where the data originated, which dashboards are built on top of it, who’s using it in the company, and how they’re using it, all the way down to the SQL queries. 
Best of all, it’s simple to set up, and easy for both engineering and operations teams to use. With Select Star’s data catalog, a single source of truth for your data is built in minutes, even across thousands of datasets. Try it out for free and double the length of your free trial today at pythonpodcast.com/selectstar. You’ll also get a swag package when you continue on a paid plan. Need to automate your Python code in the cloud? Want to avoid the hassle of setting up and maintaining infrastructure? Shipyard is the premier orchestration platform built to help you quickly launch, monitor, and share python workflows in a matter of minutes with 0 changes to your code. Shipyard provides powerful features like webhooks, error-handling, monitoring, automatic containerization, syncing with Github, and more. Plus, it comes with over 70 open-source, low-code templates to help you quickly build solutions with the tools you already use. Go to dataengineeringpodcast.com/shipyard to get started automating with a free developer plan today! Your host as usual is Tobias Macey and today I’m interviewing Jeremy Wright about CadQuery, an easy-to-use Python module for building parametric 3D CAD models Interview Introductions How did you get introduced to Python? Can you start by explaining what CAD is and some of the real-world applications of it? Can you describe what CadQuery is and the story behind it? How did you get involved with it and what keeps you motivated? What are the different methods that are in common use for building CAD models? Are there approaches that are more common for models used in different industries? What was missing in other projects for programmatically generating CAD models that motivated you to build CadQuery? Can you describe how the CadQuery library is implemented? How have the design and goals of the project changed or evolved since you started working on it? 
How would you characterize the rate of change/evolution in the CAD ecosystem, and how has that factored into your work on CadQuery? How did you approach the process of API design? How do you balance accessibility for non-professionals with domain-related nomenclature? Can you describe some example workflows for going from idea to finished product with CadQuery? How are you using CadQuery in your own work? What are the most interesting, innovative, or unexpected ways that you have seen CadQuery used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on CadQuery? When is CadQuery the wrong choice? What do you have planned for the future of CadQuery? Keep In Touch Discord Twitter GitHub GitLab Picks Tobias Doctor Strange: In The Multiverse of Madness Jeremy Star Trek: Strange New Worlds Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected]) with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links CadQuery CAD == Computer Assisted Design 3D Printer Jeremy’s CNC Router jQuery Blender Fusion 360 Open Cascade (OCCT) Fluent API FreeCAD KiCAD Semblage cq-editor jupyter-cadquery cq-kit FX Bricks Voxels cq_warehouse The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/27/2022 • 45 minutes, 4 seconds

Intelligent Dependency Resolution For Optimal Compatibility And Security With Project Thoth

Summary Building any software project is going to require relying on dependencies that you and your team didn’t write or maintain, and many of those will have dependencies of their own. This has led to a wide variety of potential and actual issues ranging from developer ergonomics to application security. In order to provide a higher degree of confidence in the optimal combinations of direct and transitive dependencies a team at Red Hat started Project Thoth. In this episode Fridolín Pokorný explains how the Thoth resolver uses multiple signals to find the best combination of dependency versions to ensure compatibility and avoid known security issues. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Need to automate your Python code in the cloud? Want to avoid the hassle of setting up and maintaining infrastructure? Shipyard is the premier orchestration platform built to help you quickly launch, monitor, and share python workflows in a matter of minutes with 0 changes to your code. Shipyard provides powerful features like webhooks, error-handling, monitoring, automatic containerization, syncing with Github, and more. 
Plus, it comes with over 70 open-source, low-code templates to help you quickly build solutions with the tools you already use. Go to dataengineeringpodcast.com/shipyard to get started automating with a free developer plan today! Your host as usual is Tobias Macey and today I’m interviewing Fridolín Pokorný about Project Thoth, a resolver service that computes the optimal combination of versions for your dependencies Interview Introductions How did you get introduced to Python? Can you describe what Project Thoth is and the story behind it? What are some examples of the types of problems that can be introduced by mismanaged dependency versions? The Python ecosystem has seen a number of dependency management tools introduced recently. What are the capabilities that Thoth offers that make it stand out? How does it compare to e.g. pip, Poetry, pip-tools, etc.? How do those other tools approach resolution of dependencies? Can you describe how Thoth is implemented? How have the scope and design of the project evolved since it was started? What are the sources of information that it relies on for generating the possible solution space? What are the algorithms that it relies on for finding an optimal combination of packages? Can you describe how Thoth fits into the workflow of a developer while selecting a set of dependencies and keeping them up to date over the life of a project? What are the opportunities for expanding Thoth’s application to other language ecosystems? What are the interfaces available for extending or integrating with Thoth? What are the most interesting, innovative, or unexpected ways that you have seen Thoth used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Thoth? When is Thoth the wrong choice? What do you have planned for the future of Thoth? 
Keep In Touch LinkedIn Website Picks Tobias Brass Against Fridolin micropipenv Links Redhat Emerging Technologies Group Project Thoth Thamos CLI PyPA Advisory Database Project2Vec Thoth Prescriptions Thoth: Egyptian God The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/15/2022 • 31 minutes, 31 seconds

Take A Deep Dive On How Code Completion Works And How To Customize It

Summary Most developers have encountered code completion systems and rely on them as part of their daily work. They allow you to stay in the flow of programming, but have you ever stopped to think about how they work? In this episode Meredydd Luff takes us behind the scenes to dig into the mechanics of code completion engines and how you can customize them to fit your particular use case. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Meredydd Luff about how code completion works and what it takes to build your own Interview Introductions How did you get introduced to Python? Most programmers are familiar with the idea of code completion, but can you just give the elevator pitch to get us all on the same page? You gave a presentation recently at PyCon about how to build a code completion system. What was your approach to identifying what fundamental concepts needed to be addressed and how to fit that lesson into the available time? In the presentation you mentioned that you had built a more full-featured completion engine into Anvil. Can you describe what possessed you to build your own code completion tool? What are the core components required to build a completion engine? 
What are the benefits that can be realized by customizing the completion engine for a given language or task? Can you describe the feature set and implementation details of the full-fledged completion engine that is available in Anvil? Beyond the toy example, there are a number of considerations to address if you want to make the completion engine "production grade". Can you talk through some of the obvious edge cases and how to solve for them? (e.g. handling parsing of incomplete code) What are the inputs that you use to build up the list of candidate tokens for completion? Once you have a functioning baseline for offering completions, what are some of the signals that you hook into for ranking suggestions? In your presentation you leaned on the machinery available in the Python standard library. What are some of the ways that you might think about generalizing across languages vs. coupling to a given language? What design/architectural advice do you have for compartmentalizing logic in a full-featured completion engine? What are some of the complexities that become a factor when you are trying to scale across an entire code base? Beyond just being able to parse and process a body of code, there is also the question of integrating with the development environment. What are some of the challenges that get introduced when trying to access the appropriate set(s) of files and code through the editor interface(s)? What are the most interesting, innovative, or unexpected ways that you have seen code completion applied to developer experience? What are the most interesting, unexpected, or challenging lessons that you have learned while working on code completion for Anvil? When is code completion more effort than it’s worth? What do you have planned for the future of the Anvil code completion functionality? 
Keep In Touch LinkedIn meredydd on GitHub @meredydd on Twitter Picks Tobias "Weird Al" Yankovic Meredydd TimescaleDB Data Engineering Podcast Episode Promscale Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected]) with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links PyCon presentation about building a completion engine Anvil Podcast Episode Nano Language Server Protocol Jedi Podcast Episode Skulpt Parser Abstract Syntax Tree OpenAPI GitHub Copilot Halting Problem Parser Generator Python Language Grammar Definition Lezer Parser Generator Tree-sitter PyScript Grafana Tempo Tracing Service The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/30/2022 • 1 hour, 11 seconds

Hunting Black Swans With Bees: Catching Up With The Inimitable Russell Keith-Magee

Summary Russell Keith-Magee is an accomplished engineer and a fixture of the Python community. His work on the Beeware suite of projects is one of the most ambitious undertakings in the ecosystem and unfailingly forward-looking. With his recent transition to working for Anaconda he is now able to dedicate his full focus to the effort. In this episode he reflects on the journey that he has taken so far, how Beeware is helping to address some of the threats to Python’s long term viability, and how he envisions its future in light of the recent release of PyScript, an in-browser runtime for Python. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Russell Keith-Magee about the latest status of the Beeware project, the state of Python’s black swans, and how the PyScript project ties into his ambitions for world domination Interview Introductions How did you get introduced to Python? For anyone who hasn’t been graced with the BeeWare vision, can you give the elevator pitch of what it is and why it matters? At PyCon US 2019 you presented a keynote about the various potential threats to the Python language community and its future viability. With the clarity of 3 years hindsight, how has the landscape shifted? 
What is PyScript and how does it fit into the venn diagram of BeeWare’s objectives and the portents of black swan events (and what is your involvement with it)? How does it differ from the dozens of other "Python in the browser" and "Python transpiled to Javascript" projects that have sprouted over the years? Now that you have been granted the opportunity to dedicate your full attention to BeeWare and build a team to support it, what new potential does that unlock? What are the current areas of focus/challenges that you are spending your time on for the BeeWare project? What are some of the efforts in the BeeWare suite that proved to be dead-ends? What are the most interesting, innovative, or unexpected ways that you have seen the BeeWare suite/PyScript used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on BeeWare? When is BeeWare the wrong choice? What do you have planned for the future of BeeWare/PyScript/Python/world domination? Keep In Touch LinkedIn Website @freakboy3742 on Twitter Picks Tobias Joby Gorillapod Russell PyScript The Great TV Show Links Black Swans Episode BeeWare Episode BeeWare Django Cordova Black Swan Apple II Altair Briefcase Web Assembly (WASM) Gary Bernhardt PyScript Pyodide Toga Kotlin Swift Gaffer Tape Repl.it Brython Transcrypt Python Anywhere Batavia Anaconda Conda Voc Maestral Eddington GUI The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/24/2022 • 56 minutes, 11 seconds

Take Control Of Your Digital Photos By Running Your Own Smart Library Manager With LibrePhotos

Summary Digital cameras and the widespread availability of smartphones have allowed us all to generate massive libraries of personal photographs. Unfortunately, now we are all left to our own devices when it comes to managing them. While cloud services such as iPhotos and Google Photos are convenient, they aren’t always affordable and they put your pictures under the control of large companies with their own agendas. LibrePhotos is an open source and self-hosted alternative to these services that puts you in control of your digital memories. In this episode the maintainer of LibrePhotos, Niaz Faridani-Rad, explains how he got involved with the project, the capabilities that it offers for managing your image library, and how to get your own instance set up to take back control of your pictures. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This episode is sponsored by Mergify. It’s an amazing tool to make you and your team way more productive with GitHub. Mergify is all about leveling up your pull requests with useful features that eliminate busy work. Automatic merges allow you to define the conditions for acceptance and Mergify will take care of merging the pull request as soon as it’s ready. 
Automatic updates take care of merging your pull requests serially on top of each other, so there is no way to introduce a regression. With a merge queue you can merge your urgent pull request first, organize your PRs as you wish and Mergify will merge them in that order. Mergify’s backports feature will even copy the pull request into another branch once the pull request has been merged, shipping your bug fixes on multiple branches automatically. By saving time you and your team can focus on projects that matter. Mergify is coordinated with any CI and fully integrated into GitHub. They have a Startup Program that offers a 12-month credit to leverage Mergify (up to $21,000 of value). Start saving time; visit pythonpodcast.com/mergify today to sign up for a demo and get started! Or just click the link in the show notes. Your host as usual is Tobias Macey and today I’m interviewing Niaz Faridani-Rad about LibrePhotos, an open source, self-hosted application for managing your personal photo collection Interview Introductions How did you get introduced to Python? Can you describe what LibrePhotos is and the story behind it? What are the core objectives of the project? What kind of users are you focused on? What are some of the major features of LibrePhotos? There are a number of open source and commercial options for different photo oriented use cases. What are the main capabilities that influence someone’s decision to use one over the other? Many people’s baseline expectations will be around services such as Google Photos or iPhotos. What are some of the challenges that you face in trying to provide a comparable experience? One of the features that users rely on with these services is backup/disaster recovery of their photo library. What is the recommended approach for users of LibrePhotos? Can you describe how LibrePhotos is architected? How have the design and goals evolved since you first started working on it? 
How have recent advances in machine learning algorithms and related tooling improved the availability and quality of advanced features in LibrePhotos? How much improvement of accuracy in face/object recognition do you see as users invest in cataloging and organizing their collections? Is there a minimum quantity of images/individual people that are necessary to start using the ML powered features? What kinds of storage locations are supported? What are the interfaces available for extending/enhancing/integrating with LibrePhotos? What are the most interesting, innovative, or unexpected ways that you have seen LibrePhotos used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on LibrePhotos? When is LibrePhotos the wrong choice? What do you have planned for the future of LibrePhotos? Keep In Touch derneuere on GitHub @der_neuere on Twitter Website LinkedIn Picks Tobias Uncharted movie Niaz Steam Deck Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected]) with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links LibrePhotos Self-hosted Sub-Reddit OwnPhotos Google Photos Google Takeout Digikam x265 HEIC Files RAW Image Format ImageMagick Panorama Photograph Lytro light field cameras rq asynchronous task library Typescript Redux Toolkit MobileNet v3 DLib ARM Processor Docker Compose LibrePhotos Comparison Page The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/16/2022 • 45 minutes, 14 seconds

Making Investment Data Easy To Access And Analyze With The OpenBB Terminal

Summary Investing effectively is largely a game of information access and analysis. This can involve a substantial amount of research and time spent on finding, validating, and acquiring different information sources. In order to reduce the barrier to entry and provide a powerful framework for amateur and professional investors alike Didier Rodrigues Lopes created the OpenBB Terminal. In this episode he explains how a pandemic project that started as an experiment has led to him founding a new company and dedicating his time to growing and improving the project and its community. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Didier Rodrigues Lopes about the OpenBB Terminal, a modern Python-based integrated environment for investment research Interview Introductions How did you get introduced to Python? Can you describe what OpenBB is and the story behind it? What is the problem that you are trying to address by creating the OpenBB project and providing it as open source? What are some of the use cases where someone might need to use this project? The elephant in the room for financial data research is the Bloomberg Terminal. What are the other tools or services available for that purpose? 
What are the differentiating features of the OpenBB Terminal? Can you describe how the OpenBB Terminal is implemented? How have the design and goals/scope of the project changed since you started working on it? Can you describe a typical workflow for someone who is using the OpenBB Terminal? How have you approached the user experience design, and what are you optimizing for? What kinds of utilities do you offer beyond raw data access? What are some examples of data sources that you rely on? What is involved in integrating a new data source? What are the extension points and integration capabilities for expanding the functionality of the tool? What are the most interesting, innovative, or unexpected ways that you have seen OpenBB Terminal used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on OpenBB Terminal? When is OpenBB Terminal the wrong choice? What do you have planned for the future of OpenBB Terminal? Keep In Touch DidierRLopes on GitHub LinkedIn @didier_lopes on Twitter Picks Tobias Vikings: Valhalla show on Netflix Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected]) with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links OpenBB Matlab Papermill Bloomberg Terminal Robinhood Coinbase The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/10/2022 • 47 minutes, 13 seconds

Accelerate Your Machine Learning Experimentation With Automatic Checkpoints Using FLOR

Summary The experimentation phase of building a machine learning model requires a lot of trial and error. One of the limiting factors of how many experiments you can try is the length of time required to train the model which can be on the order of days or weeks. To reduce the time required to test different iterations Rolando Garcia Sanchez created FLOR which is a library that automatically checkpoints training epochs and instruments your code so that you can bypass early training cycles when you want to explore a different path in your algorithm. In this episode he explains how the tool works to speed up your experimentation phase and how to get started with it. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Rolando Garcia about FLOR, a suite of machine learning tools for hindsight logging that lets you speed up model experimentation by checkpointing training data Interview Introductions How did you get introduced to Python? Can you describe what FLOR is and the story behind it? What is the core problem that you are trying to solve for with FLOR? What are the fundamental challenges in model training and experimentation that make it necessary? 
How do machine learning researchers and engineers address this problem in the absence of something like FLOR? Can you describe how FLOR is implemented? What were the core engineering problems that you had to solve for while building it? What is the workflow for integrating FLOR into your model development process? What information are you capturing in the log structures and epoch checkpoints? How does FLOR use that data to prime the model training to a given state when backtracking and trying a different approach? How does the presence of FLOR change the costs of ML experimentation and what is the long-range impact of that shift? Once a model has been trained and optimized, what is the long-term utility of FLOR? What are the opportunities for supporting e.g. Horovod for distributed training of large models or with large datasets? What does the maintenance process for research-oriented OSS projects look like? What are the most interesting, innovative, or unexpected ways that you have seen FLOR used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on FLOR? When is FLOR the wrong choice? What do you have planned for the future of FLOR? Keep In Touch rlnsanz on GitHub @rogarcia_sanz on Twitter Picks Tobias The Batman Rolando Severance GitHub Codespaces Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected]) with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links FLOR UC Berkeley Joe Hellerstein MLOps Data Engineering Podcast Episode RISE Lab AMP Lab Clipper Model Serving Ground Data Context Service Context: The Missing Piece Of The Machine Learning Lifecycle Airflow Copy on write ASTor Green Tree Snakes: Python AST Documentation MLFlow Amazon Sagemaker Cloudpickle Horovod Podcast Episode Ray Anyscale PyTorch Tensorflow The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/2/2022 • 46 minutes, 31 seconds

Automatically Enforce Software Structures With Powerful Code Modifications Powered By LibCST

Summary Programmers love to automate tedious processes, including refactoring your code. In order to support the creation of code modifications for your Python projects Jimmy Lai created LibCST. It provides a richly typed and high level API for creating and manipulating concrete syntax trees of your source code. In this episode Jimmy Lai and Zsolt Dollenstein explain how it works, some of the linting and automatic code modification utilities that you can build with it and how to get started with using it to maintain your own Python projects. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Zsolt Dollenstein and Jimmy Lai about LibCST, a concrete syntax tree parser and serializer library for Python Interview Introductions How did you get introduced to Python? Can you describe what LibCST is and the story behind it? How does a concrete syntax tree differ from an abstract syntax tree? What are some of the situations where the preservation of the exact structure is necessary? There are a few other libraries in Python for creating concrete syntax trees. What was missing in the available options that made it necessary to create LibCST? 
What are the use cases that LibCST is focused on supporting? Can you describe how LibCST is implemented? How have the design and goals of the project changed or evolved since you started working on it? How might I use LibCST for something like restructuring a set of modules to move a function definition while maintaining proper imports? How do the capabilities of LibCST for codemodding compare to the Rope framework? What are some other workflows that someone might build with LibCST? What are some of the ways that LibCST is being used in your own work? What are the most interesting, innovative, or unexpected ways that you have seen LibCST used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on LibCST? When is LibCST the wrong choice? What do you have planned for the future of LibCST? Keep In Touch Zsolt zsol on GitHub LinkedIn Jimmy jimmylai on GitHub LinkedIn Picks Tobias Osprey Manta Backpack Zsolt Autotransform Glean Jimmy Paying down technical debt Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links LibCST Carta lib2to3 Abstract Syntax Tree Concrete Syntax Tree Pyre Parso Cython Podcast Episode mypyc Rope Flake8 Podcast Episode Pylint ESLint Fixit MonkeyType Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
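One of the interview questions above asks how a concrete syntax tree differs from an abstract one. The difference can be illustrated with the standard library alone. The sketch below is not LibCST code: it assumes Python 3.9 or newer (for `ast.unparse`), and the source string and helper names are made up for illustration. It shows that a round trip through an abstract syntax tree discards comments and spacing, which is the detail a concrete syntax tree is meant to preserve.

```python
import ast
import io
import tokenize

# Hypothetical snippet with a comment, redundant parentheses, and odd spacing.
SOURCE = "x = (1 +  2)  # keep this comment\n"


def ast_round_trip(source: str) -> str:
    """Parse to an abstract syntax tree and regenerate source text from it."""
    return ast.unparse(ast.parse(source))


def comments_in(source: str) -> list[str]:
    """Collect comment tokens, which the tokenizer keeps but the AST drops."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.COMMENT]


regenerated = ast_round_trip(SOURCE)
print(repr(regenerated))         # the comment and original spacing are gone
print(comments_in(SOURCE))       # the comment exists in the original text
print(comments_in(regenerated))  # and is absent after the AST round trip
```

A codemod built on the abstract tree would therefore rewrite formatting and drop comments in every file it touches, which is why tools that modify source in place work from a representation that retains that information.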

4/25/2022 • 56 minutes, 47 seconds

Cloud Native Networking For Developers With The Gloo Platform

Summary Communication is a fundamental requirement for any program or application. As the friction involved in deploying code has gone down, the motivation for architecting your system as microservices goes up. This shifts the communication patterns in your software from function calls to network calls. In this episode Idit Levine explains how the Gloo platform that she and her team at Solo have created makes it easier for you to configure and monitor the network topologies for your microservice environments. She also discusses what developers need to know about networking in cloud native environments and how a combination of API gateways and service mesh technologies allow you to more rapidly iterate on your systems. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Idit Levine about what developers need to know about service-oriented networking and her work at Solo on the Gloo project Interview Introductions How did you get introduced to Python? Can you describe what Solo is and the story behind it? How much should developers need to know about the ways that their applications and services are communicating? 
What is the current state of networking for applications across physical, cloud, and containerized environments? How do service mesh features influence the architectural decisions that software teams make while building their applications? What operational capabilities do they unlock? What are the aspects of application networking that are simplified or enhanced by service mesh platforms? In what ways has service mesh introduced new complexity to operating software systems? How can developers mirror the network topologies for production environments while working on new features? What are the most interesting, innovative, or unexpected ways that you have seen Gloo used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Gloo? When is Gloo the wrong choice? What do you have planned for the future of Gloo? Keep In Touch LinkedIn @Idit_Levine on Twitter Picks Tobias Shadow and Bone on Netflix Idit Elizabeth Holmes HBO Documentary Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Solo Computational Biology Microservices Kubernetes Service Mesh Istio LinkerD Envoy Proxy API Gateway CRD == Custom Resource Definition Gloo Edge Bazel Build System GraphQL mTLS GitOps Dagger WASM == Web Assembly Kubernetes Gateway API Consul Connect eBPF The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/19/2022 • 50 minutes, 33 seconds

Accelerate And Simplify Cloud Native Development For Kubernetes Environments With Gefyra

Summary Cloud native architectures have been gaining prominence for the past few years due to the rising popularity of Kubernetes. This introduces new complications to development workflows due to the need to integrate with multiple services as you build new components for your production systems. In order to reduce the friction involved in developing applications for cloud native environments Michael Schilonka created Gefyra. In this episode he explains how it connects your local machine to a running Kubernetes environment so that you can rapidly iterate on your software in the context of the whole system. He also shares how the Django Hurricane plugin lets your applications work closely with the Kubernetes process model. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! So now your modern data stack is set up. How is everyone going to find the data they need, and understand it? Select Star is a data discovery platform that automatically analyzes & documents your data. For every table in Select Star, you can find out where the data originated, which dashboards are built on top of it, who’s using it in the company, and how they’re using it, all the way down to the SQL queries. Best of all, it’s simple to set up, and easy for both engineering and operations teams to use. 
With Select Star’s data catalog, a single source of truth for your data is built in minutes, even across thousands of datasets. Try it out for free and double the length of your free trial today at pythonpodcast.com/selectstar. You’ll also get a swag package when you continue on a paid plan. Your host as usual is Tobias Macey and today I’m interviewing Michael Schilonka about Gefyra and what is involved with developing applications for Kubernetes environments Interview Introductions How did you get introduced to Python? Can you describe what Gefyra is and the story behind it? What are the challenges that Kubernetes introduces to the development process? What are some of the strategies that developers might use for developing and testing applications that are deployed to Kubernetes environments? What are the use cases that Gefyra is focused on enabling? What are some of the other tools or platforms that Gefyra might replace or supplement? What are the services that need to be present in the K8s cluster to enable Gefyra’s functionality? Can you describe how Gefyra is implemented? How have the design and goals of the project changed since you first started working on it? What is the process for getting Gefyra set up between a K8s cluster and a developer’s laptop? Can you describe what the developer’s workflow looks like when using Gefyra? How do you avoid collisions/resource contention among a team of developers who are working on the same project? What are some of the ways that developing for Kubernetes influences the architectural and design decisions for a project? What are some of the additional practices or systems that you have found to be beneficial for accelerating development in cloud-native environments? What are the most interesting, innovative, or unexpected ways that you have seen Gefyra used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Gefyra? When is Gefyra the wrong choice? 
What do you have planned for the future of Gefyra? Keep In Touch LinkedIn Schille on GitHub Picks Tobias kubernetes.el – Kubernetes interface for Emacs Michael It’s Fermentation Friday, perfect for baking sourdough bread or brewing beer Two of my favorite YouTube channels: Kurzgesagt – In a Nutshell and LockPickingLawyer For entrepreneurial spirits: Reddit community research with GummySearch (https://gummysearch.com/) Links Kopf framework PyOxidizer Tuna Wireguard-go https://k3d.io/ kind Django Hurricane Blueshoe Django Kubernetes K3d Telepresence Unikube Sidecar Pattern Docker-compose Kubernetes Patterns book O’Reilly Platform Amazon (affiliate link) CodeZero CoreDNS Nginx Cookiecutter Tornado Podcast Episode uWSGI Podcast Episode 12 Factor App Pycloak Keycloak Kubernetes Operator Kubernetes CRD (Custom Resource Definition) The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/11/2022 • 38 minutes, 14 seconds

Building A Community And Technology Stack For Scalable Big Data Geoscience At Pangeo

Summary Science is founded on the collection and analysis of data. For disciplines that rely on data about the earth the ability to simulate and generate that data has been growing faster than the tools for analysis of that data can keep up with. In order to help scale that capacity for everyone working in geosciences the Pangeo project compiled a reference stack that combines powerful tools into an out-of-the-box solution for researchers to be productive in short order. In this episode Ryan Abernathy and Joe Hamman explain what the Pangeo project really is, how they have integrated a combination of XArray, Dask, and Jupyter to power these analytical workflows, and how it has helped to accelerate research on multidimensional geospatial datasets. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! So now your modern data stack is set up. How is everyone going to find the data they need, and understand it? Select Star is a data discovery platform that automatically analyzes & documents your data. For every table in Select Star, you can find out where the data originated, which dashboards are built on top of it, who’s using it in the company, and how they’re using it, all the way down to the SQL queries. 
Best of all, it’s simple to set up, and easy for both engineering and operations teams to use. With Select Star’s data catalog, a single source of truth for your data is built in minutes, even across thousands of datasets. Try it out for free and double the length of your free trial today at pythonpodcast.com/selectstar. You’ll also get a swag package when you continue on a paid plan. Your host as usual is Tobias Macey and today I’m interviewing Ryan Abernathy and Joe Hamman about Pangeo, a community platform for Big Data geoscience Interview Introductions How did you get introduced to Python? Can you describe what Pangeo is and the story behind it? What is your role in the project/community and how did you get involved? What are the goals of the project and community? What are the areas of effort and how are they organized? What are the scientific domains that Pangeo is focused on supporting? What are the primary challenges associated with data management and analysis in these scientific communities? What are the forms that these data take and how have they been evolving? (e.g. formats/sources) What are some of the challenges introduced by the widespread adoption of cloud resources and the associated architectural patterns? Can you describe the technical components that fall under the Pangeo umbrella? How do they come together to form a functional workflow for geo sciences? How has the scope of the Pangeo project changed or evolved since it started? What are the most interesting, innovative, or unexpected ways that you have seen Pangeo used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Pangeo? When is Pangeo the wrong choice? What do you have planned for the future of Pangeo? 
Keep In Touch Joe @HammanHydro on Twitter Ryan @rabernat on Twitter rabernat on GitHub Website Picks Tobias Mountain Biking Ryan Klara And The Sun by Kazuo Ishiguro Joe Range by David Epstein Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Pangeo Pangeo Forge CarbonPlan M2LInES LEAP Columbia University XArray MIT MatLab PHP Ruby Java NumPy SciPy Matplotlib C Fortran Perl Dask Data Engineering Podcast Episode Jupyter IDL HDF5 Unidata NetCDF CF Metadata Conventions Intake Podcast Episode FSSpec Parquet Data Engineering Podcast Episode Zarr Data Engineering Podcast Pangeo Forge Airbyte Data Engineering Podcast Episode Fivetran Data Engineering Podcast Episode Stitch TileDB Data Engineering Podcast Episode Pythia The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/28/2022 • 52 minutes, 8 seconds

Automating Application Lifecycles For Developer Happiness At Wayfair

Summary A common piece of advice when starting anything new is to "begin with the end in mind". In order to help the engineers at Wayfair manage the complete lifecycle of their applications Joshua Woodward runs a team that provides tooling and assistance along every step of the journey. In this episode he shares some of the lessons and tactics that they have developed while assisting other engineering teams with starting, deploying, and sunsetting projects. This is an interesting look at the inner workings of large organizations and how they invest in the scaffolding that supports their myriad efforts. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! So now your modern data stack is set up. How is everyone going to find the data they need, and understand it? Select Star is a data discovery platform that automatically analyzes & documents your data. For every table in Select Star, you can find out where the data originated, which dashboards are built on top of it, who’s using it in the company, and how they’re using it, all the way down to the SQL queries. Best of all, it’s simple to set up, and easy for both engineering and operations teams to use. With Select Star’s data catalog, a single source of truth for your data is built in minutes, even across thousands of datasets. 
Try it out for free and double the length of your free trial today at pythonpodcast.com/selectstar. You’ll also get a swag package when you continue on a paid plan. Your host as usual is Tobias Macey and today I’m interviewing Joshua Woodward about how the application lifecycle team at Wayfair uses Python to Interview Introductions Josh Woodward, for the past year have been managing the application lifecycle team at Wayfair. Prior to that, IC on python platforms team. Embed with teams looking to decouple from monolith. See pain points first hand. How did you get introduced to Python? High school physics class, TI84 Calculator, friend wrote a program to solve vector problems, I thought it was amazing. Used TI-Basic to solve specific physics problems for me. (Give fixed inputs, run through equation, get outputs) Approaching college, thinking about student loans. Heard about python and decided to give it a shot. Wrote program to simulate various payback / interest scenarios. Went to college for ME, switched to SE when I found out my dorm neighbors were using python + turtle to draw cool images Can you describe what the role of the application lifecycle team is and the story behind it? Story behind it: Around 2018, in a state where we had deploy congestion, challenging to iterate and ship changes. tech org invested in containerization and decoupling to directly combat this problem. Teams incentivized to decouple. While on python platforms, the team had already been experimenting with code templating. Standard cookiecutter template for flask apps. Wayfair experimenting with Kubernetes late 2017. Spent 1 year embedding with 4 different teams to help knowledge transfer re: k8s, containers, application setup, python best practices, testing, linting, etc – through that we got a lot of great feedback on our tooling. Took senior engineers weeks to get something setup. 
Know who to contact, click the right buttons, file the right ticket Approach: Counted manual steps. Something like 60 distinct / atomic activities that had to be performed to get a "hello world" response from a basic flask app in production. Focus on reduce manual steps Released product (Mamba, on theme of snakes) Initially, supporting one main user story. User story: "As an engineer, I would like to create a production ready application in 10 minutes so that I can have a reliable and standardized application setup that follows best practices." grew out of python platforms, created own team with own scope, that was about 1.5 years ago. What is your team’s scope now? Team Scope is to facilitate the creation, maintenance, and decommissioning of decoupled applications at Wayfair. What are the interfaces that your team has to the rest of the organization? People Interfaces: We value getting feedback on our work to build strong products. Make assumptions, Willing to be wrong. Validate assumptions with customers. Software Interfaces: for mamba, CLI at first Backstage (open sourced from spotify) Lots of Github What is your method of determining what projects to work on? (See above). Known pain points. Intuition, Free day fridays. Being comfortable taking risk (using friday time). Vet solution with customers. How do you measure the impact of your work on the rest of the organization? We don’t force use of our products. Adoption of tooling. Number of microservices being spun up. Number of automated pull requests being created, merged. DORA metrics throughput (deployment frequency, lead time for changes) and stability (change failure rate, mean time to recovery) What is the role of Python in your work? we use it and love it! 
existing skillset from incubation phase within python platforms right tool for the job lightweight automation hitting lots of APIs define lots of user facing specifications (json, yaml) pydantic has been great for creating descriptive, human and machine specifications. open source (we rely on it, we also have some presence) cookiecutter -> columbo gitpython -> pygitops Can you tell me more about your application creation solution. Who can use it, and what does it actually do? Written in python, though it templates out code for any language. Runs automation to onboard an application to production git repo, build pipeline, calling out to various APIs to signal a new app is present Wayfair has a variety of applications (python, java, .net, php, javascript, some go) Team interested in integrating with our solution will create a github repository containing 1..* cookiecutter template(s) Provide a specification for what questions to ask users. Limitation with cookiecutter where the approach to ask questions isn’t dynamic. lack of validation. Pat Lannigan -> Columbo (open sourced). Python DSL to describe the set of questions to ask users. python fastapi application will have a completely different set of questions than a java library for example. You had mentioned that another part of your team scope is to facilitate the maintenance of applications. Can you tell me more about that? Reduce engineering toil around keeping applications up to date. Average engineer owns several, dozens of repos Create automated pull requests: Versioned dependencies (Renovate) Propagating platform changes (Gator) Ex1: python apps use "black" to format code and our python platform team would like to prescribe a line length. Our tooling can be used to declare desired changes. yaml specification -> pr automation at scale. Ex2: shared library, new version released, breaking interface change. Code instructions for performing AST manipulation and resolving breaking change for people. 
Shift from "We need you to do this" to "I am proactively letting you know that something needs to change, and I also made the change for you!" How do you actually go about creating automated pull requests? manual steps would involve cloning, checking out feature branch, applying code changes, staging / committing, pushing up branch, creating the PR gitpython is an existing and extremely powerful tool, but its api is fairly involved and (by design) doesn’t provide the type of high level abstractions that we need. created pygitops (open sourced), built completely on top of gitpython high level abstractions for the workflow I described. coolest / most pythonic part about it is the "feature branch" context manager. code changes are made in the context of a feature branch when you intentionally or accidentally leave the context of a feature branch, we want certain things to be true (default / main branch, clean workdir, no unstaged changes) when writing PR automation, don’t have to worry about this! Can you describe some of the more technical details about how your change propagation system (Gator) works? heavily inspired by kubernetes resource model (resources are defined via a declarative specification) Kubernetes itself ships with resources that implement behaviors of common resources (pods, services, etc) Gator’s execution model is broken up into two parts: what repos to act on (Source) what are the changes that need to be applied. (Output) Ex: Source to proxy github search. write github search query to get back list of repos Output to scan a repo for regex pattern at specified paths and replace with some fixed term. Very popular, engineers love find and replace. What are the most interesting, innovative, or unexpected ways that you have seen mamba / gator used? resource model of gator supports the idea of we don’t know, what we don’t know reference k8s, CRDs, resource model. 
container execution log4j identification and remediation automate some of the work for identifying vulnerabilities java platform team was able to use java native tooling in the environment of their choosing to identify vulnerable apps. What are the most interesting, unexpected, or challenging lessons that you have learned while working on application lifecycle concerns? What do you have planned for the future of application lifecycle management/developer experience improvements at Wayfair? Hope to start open sourcing interesting aspects of our change propagation tool (Gator) As someone who maintains many open source projects, or even at the enterprise level, we think that some of our patterns and approaches can be shared! yaml -> code changes Keep In Touch Email GitHub LinkedIn Picks Tobias Nocciolata hazelnut spread Joshua Cities Skylines Game Cities Skylines – Cities Planner Plays: Verde Beach Links pygitops columbo backstage renovate DORA metrics TI-84 Calculator TI BASIC Wayfair Python Platforms Team Podcast Episode Pydantic Podcast Episode Helm PyUp GitPython The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
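The notes above describe a Gator Output that scans a repo for a regex pattern at specified paths and replaces it with a fixed term. A minimal sketch of that idea follows, using only the standard library. Gator itself is not open source according to these notes, so the function name, its parameters, and the `pyproject.toml` example are all hypothetical and do not reflect Wayfair's actual implementation.

```python
import re
from pathlib import Path


def replace_in_repo(
    repo_root: Path, path_globs: list[str], pattern: str, replacement: str
) -> list[Path]:
    """Apply a regex substitution to files matching the given globs.

    Hypothetical stand-in for a find-and-replace "Output": returns the
    files whose contents actually changed, so a caller could decide
    whether a pull request is worth opening.
    """
    regex = re.compile(pattern)
    changed: list[Path] = []
    seen: set[Path] = set()
    for glob in path_globs:
        for path in sorted(repo_root.glob(glob)):
            # Skip directories and files already handled by an earlier glob.
            if path in seen or not path.is_file():
                continue
            seen.add(path)
            original = path.read_text(encoding="utf-8")
            updated = regex.sub(replacement, original)
            if updated != original:
                path.write_text(updated, encoding="utf-8")
                changed.append(path)
    return changed
```

For the line-length example mentioned in the notes, a call might look like `replace_in_repo(Path("some-repo"), ["pyproject.toml"], r"line-length = \d+", "line-length = 100")`, where the repository path and the value 100 are placeholders. Running it a second time returns an empty list, which keeps repeated automation runs from producing empty changes.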

3/20/2022 • 46 minutes, 11 seconds

Run Your Applications Reliably On Kubernetes Without Losing Sleep With Robusta

Summary Kubernetes is a framework that aims to simplify the work of running applications in production, but it forces you to adopt new patterns for debugging and resolving issues in your systems. Robusta is aimed at making that a more pleasant experience for developers and operators through pre-built automations, easy debugging, and a simple means of creating your own event-based workflows to find, fix, and alert on errors in production. In this episode Natan Yellin explains how the project got started, how it is architected and tested, and how you can start using it today to keep your Python projects running reliably. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! So now your modern data stack is set up. How is everyone going to find the data they need, and understand it? Select Star is a data discovery platform that automatically analyzes & documents your data. For every table in Select Star, you can find out where the data originated, which dashboards are built on top of it, who’s using it in the company, and how they’re using it, all the way down to the SQL queries. Best of all, it’s simple to set up, and easy for both engineering and operations teams to use. 
With Select Star’s data catalog, a single source of truth for your data is built in minutes, even across thousands of datasets. Try it out for free and double the length of your free trial today at pythonpodcast.com/selectstar. You’ll also get a swag package when you continue on a paid plan. Your host as usual is Tobias Macey and today I’m interviewing Natan Yellin about Robusta, Interview Introductions How did you get introduced to Python? Can you describe what Robusta is and the story behind it? What are some of the challenges that teams face when running their systems in Kubernetes? How does Robusta help address those difficulties? How does Robusta compare to e.g. Rookout? What are some of the ways that Robusta is able to provide specific insights for Python applications? Can you describe how Robusta is implemented? What are some of the most challenging engineering tasks that you have had to work through while building Robusta? How have the capabilities and components evolved from when you started working on it? What is the workflow for integrating Robusta into a Kubernetes environment and a team’s maintenance processes? What are some examples of the kinds of questions that Robusta can help answer out of the box? What are some tasks that Robusta facilitates which require manual exploration? What are the interfaces available for customizing and extending the functionality of Robusta? What is involved in adding a new automation capability to Robusta? How have you approached the design of the tool to make it ergonomic and intuitive so that it doesn’t contribute to the stresses of dealing with errors in production? Given that it is a tool to help resolve problems in production infrastructure, how have you worked to ensure its reliability and resilience? What is the governance and sustainability model for Robusta? What are the most interesting, innovative, or unexpected ways that you have seen Robusta used? 
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Robusta? When is Robusta the wrong choice? What do you have planned for the future of Robusta? Keep In Touch LinkedIn @aantn on Twitter aantn on GitHub Website Picks Tobias Kubernetes: Up And Running (affiliate link) Natan Kubernetes for SysAdmins YouTube video by Kelsey Hightower Learn to delegate Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Robusta GHOP Objective C Snyk Heroku Google AppEngine OOM Killer Bin Packing/Knapsack Problem Prometheus Kubernetes Pods PySpy tracemalloc Pyrasite VSCode Debugger Pydantic Podcast Episode Helm – Kubernetes package manager Why Profiler The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/14/2022 • 53 minutes, 43 seconds

Accelerate The Development And Delivery Of Your Machine Learning Applications Using Ray And Deploy It At Anyscale

Summary Building a machine learning application is inherently complex. Once it becomes necessary to scale the operation or training of the model, or introduce online re-training, the process becomes even more challenging. In order to reduce the operational burden of AI developers, Robert Nishihara helped to create the Ray framework, which handles the distributed computing aspects of machine learning operations. To support the ongoing development and simplify adoption of Ray, he co-founded Anyscale. In this episode he re-joins the show to share how the project, its community, and the ecosystem around it have grown and evolved over the intervening two years. He also explains how the techniques and adoption of machine learning have influenced the direction of the project. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Robert Nishihara about his work at Anyscale and the Ray distributed execution framework Interview Introductions How did you get introduced to Python? Can you describe what Anyscale is and the story behind it? How has the Ray project and ecosystem evolved since we last spoke? (2 years ago) How has the landscape of AI/ML technologies and techniques shifted in that time? 
What are the main areas where organizations are trying to apply ML/AI? What are some of the issues that teams encounter when trying to move from prototype to production with ML/AI applications? What are the features of Ray that help to mitigate those challenges? With the introduction of more widely available streaming/real-time technologies the viability of reinforcement learning has increased. What new challenges does that approach introduce? What are some of the operational complexities associated with managing a deployment of Ray? What are some of the specialized utilities that you have had to develop to maintain a large and multi-tenant platform for your customers? What is the governance model around the Ray project and how does the work at Anyscale influence the roadmap? What are the most interesting, innovative, or unexpected ways that you have seen Anyscale/Ray used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Ray and Anyscale? When is Anyscale/Ray the wrong choice? What do you have planned for the future of Anyscale/Ray? Keep In Touch robertnishihara on GitHub @robertnishihara on Twitter Website LinkedIn Picks Tobias The Edge Chronicles: Beyond The Deepwoods Robert Production RL Summit Project Hail Mary by Andy Weir Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Ray Podcast Episode Anyscale UC Berkeley Matlab Deep Learning Pandas NumPy Horovod Podcast Episode XGBoost Modin Podcast Episode Dask Ray Datasets Reinforcement Learning Production Reinforcement Learning Summit AlphaGo Databricks Snowflake Data Engineering Podcast Episode TPU == Tensor Processing Unit Weights and Biases MLFlow RLLib Ray Serve The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/6/2022 • 45 minutes, 58 seconds

See The Structure Of Your Software At A Glance With Call Graphs From Code2Flow

Summary As software projects grow and change it can become difficult to keep track of all of the logical flows. By visualizing the interconnections of function definitions, classes, and their invocations you can speed up the time to comprehension for newcomers to a project, or help yourself remember what you worked on last month. In this episode Scott Rogowski shares his work on Code2Flow as a way to generate a call graph of your programs. He explains how it got started, how it works, and how you can start using it to understand your Python, Ruby, and PHP projects. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Subsurface Live is the cloud data lake conference, a virtual conference where data engineers, data scientists, data architects, and data analysts can gather and hear about cloud data lakes and the data ecosystem. Subsurface Live Winter 2022 includes keynote talks from Bill Inmon, the father of the data warehouse, Author of Deep Work Cal Newport, and several more from companies such as Dremio, AWS, dbt, and more. Subsurface will also have many breakout sessions featuring Pandas creator Wes McKinney, Apache Superset & Airflow creator Maxime Beauchemin, and engineers from Apple, Uber, Adobe, Bloomberg, and more. 
Meet other data professionals and learn about the data technologies and practices helping companies meet their current and future data needs. Register today at pythonpodcast.com/subsurface Your host as usual is Tobias Macey and today I’m interviewing Scott Rogowski about Code2Flow, a utility for generating "pretty good" call graphs for dynamic languages Interview Introductions How did you get introduced to Python? Can you describe what Code2Flow is and the story behind it? What are some of the ways that a program’s call graph might be used? How does the visual representation generated by Code2Flow help with exploring the structure of a project? What are some of the alternative approaches/tools that might be used to gain similar insights? What do you see as the overlap in utility between Code2Flow and e.g. SourceGraph? Can you describe how the Code2Flow project is implemented? How have the design and goals of the project changed since you first began working on it? Given that Code2Flow is implemented in Python, how have you managed the parsing/processing of the other languages that you support? Visualizing a complex program can quickly become very messy. How have you approached the layout of the output to enhance comprehension? What are some of the situations where Code2Flow will be unable to provide a full picture of a program’s call graph? What are some of the pieces of information that are unavailable due to the static analysis approach that you have taken? Can you describe the process of applying Code2Flow to a project? Once the structure is on display, what are some next steps that an individual or team might take to analyze and act on the information? Given the static nature of the output, how might Code2Flow be incorporated in a CI/CD system to provide insight into the evolution of a project’s structure? What are the most interesting, innovative, or unexpected ways that you have seen Code2Flow used? 
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Code2Flow? When is Code2Flow the wrong choice? What do you have planned for the future of Code2Flow? Keep In Touch Website scottrogowski on GitHub Picks Tobias Taking Vacation Universal Studios, Florida Scott Service work Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Code2Flow Colombia Mongita TI-83 Ruby PHP AST == Abstract Syntax Tree Graphviz Pylint Robert Frost The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/28/2022 • 45 minutes, 34 seconds
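
For listeners curious what a static call graph looks like in practice, here is a minimal sketch of the general idea discussed in the Code2Flow episode above, built on Python's standard `ast` module. It is not Code2Flow's implementation; the `build_call_graph` helper and the `SOURCE` sample are hypothetical names made up for illustration, and the sketch only resolves calls made through a plain name.

```python
import ast
from collections import defaultdict

# Hypothetical sample program to analyze (never executed, only parsed).
SOURCE = '''
def load():
    return [1, 2, 3]

def transform(items):
    return [x * 2 for x in items]

def main():
    data = load()
    print(transform(data))
'''


def build_call_graph(source):
    """Map each top-level function name to the plain names it calls."""
    tree = ast.parse(source)
    graph = defaultdict(set)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            graph[node.name]  # register the function even if it calls nothing
            for child in ast.walk(node):
                # Only calls of the form name(...) are recorded; attribute
                # calls such as obj.method(...) are ignored in this sketch.
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    graph[node.name].add(child.func.id)
    return dict(graph)


graph = build_call_graph(SOURCE)
for caller in sorted(graph):
    print(caller, "->", sorted(graph[caller]))
```

Because the source is parsed and never run, anything decided at runtime (for example a function looked up with `getattr`) does not appear in the graph, which is the kind of gap the interview questions about static analysis refer to.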

Scaling Knowledge Management For Technical Teams With Knowledge Repo

Summary One of the most persistent challenges faced by organizations of all sizes is the recording and distribution of institutional knowledge. In technical teams this is exacerbated by the need to incorporate technical review feedback and manage access to data before publishing. When faced with this problem as an early data scientist at Airbnb, Chetan Sharma helped create the Knowledge Repo project as a solution. In this episode he shares the story behind its creation and growth, how and why it was released as open source, and the features that make it a compelling option for your own team’s knowledge management journey. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Chetan Sharma about Knowledge Repo, an open source framework for managing documentation for technical users Interview Introductions How did you get introduced to Python? EE + CS/AI + Stats degrees; Airbnb, working on ML models; Knowledge Repo itself. Can you describe what Knowledge Repo is and the story behind it? 
We started seeing interviewees use IPython notebooks and thought they were great. We wanted to push more people to use notebooks, but they weren’t very shareable or vettable. Existing notebook hosting services weren’t very good, and weren’t built for people who aren’t data stakeholders; they were especially poor with images and had annoying cell blocks. We made a simple post processor to remove cell blocks, push the images to S3, and host on Flask. Once we were pushing notebooks into a GitHub repo for hosting on a Flask app, so many things became possible: review cycles, shareability / collaboration features, indexing / searching. Concurrently, great work was happening on developing internal R packages / Python libraries to provide consistent, branded aesthetics. What are some of the approaches that teams typically take for recording and sharing institutional knowledge? Copy and paste to Google Docs and slides. Facebook was using Facebook photo albums. These are untrustworthy, not discoverable, and divorced from the code. What are the unique requirements that are introduced when attempting to record and distribute learnings related to data such as A/B experiments, analytical methods, data sets, etc.? Reproducibility is a big one. Making sure the learnings are trustworthy (good data? no bugs?). Distributing widely, across the org and across time. Experimentation: experimentation is at the end of a research-design-build-measure cycle, while strategic analysis is often before. Capturing all of the context. Can you describe how the Knowledge Repo project is architected? 
Repositories: a store of posts, most commonly a GitHub repo. Markdown as the original lingua franca, eventually a KR-specific “KR post” concept (which is still basically Markdown). Post processors: convert whatever upstream file to Markdown / KR post (Jupyter notebook, R Markdown, and Markdown were the original ones); handle images and other large assets, usually pushing them to cloud storage; evolved to handle PDFs, Google Docs, and Keynotes. What were the motivating factors for making it available as an open source project? It was such a common problem. Even incredibly sophisticated data teams at Uber, Facebook, etc. were begging us to share the system. What is the workflow for creating, sharing, and discovering information in an installation of Knowledge Repo? Create a GitHub repo for hosting strategic analysis. Use the KR script to create a stub/template for whatever format you’re working in. Do your work in Jupyter, etc. Instead of using Git commands (git add), use knowledge scripts (knowledge add), which are basically the Git commands with postprocessors. Do typical GitHub workflows. See the result in the hosted Knowledge Repo app. What are some of the options available for extending or customizing an installation of Knowledge Repo? More postprocessors! Google Docs, presentations, UX research: anything can be done in KR with a simple postprocessor to turn it into Markdown/images/PDF. Tying the system to your internal data tools, for example an experimentation system like Eppo or whatever you use for marketing campaigns. If you were to start over today, what are some of the ways that you might approach the solution to knowledge management differently? Think of it more holistically: What are the most interesting, innovative, or unexpected ways that you have seen Knowledge Repo used? 
UX research. Writing up a guide for acquihiring. Demonstration of capabilities, data framework. What are the most interesting, unexpected, or challenging lessons that you have learned while working on Knowledge Repo? Strategic analysis needs to be elevated; this leads to paradigm changes. Organization problems are helped by tools like KR, e.g. promotions. Meeting people’s tools/workflows where they are is powerful. When is Knowledge Repo the wrong choice? Keep In Touch LinkedIn @chesharma87 Picks Tobias Learning Guitar Chetan Underrated cooking ingredients: chickpea flour, butter fried kimchi (in grilled cheese, nachos) Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Eppo Data Engineering Podcast Episode Knowledge Repo IPython Jupyter Flask The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/21/2022 • 39 minutes, 34 seconds
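
The post processor idea from the Knowledge Repo episode above (drop the code cells, pull the images out, keep Markdown) can be sketched with nothing but the standard library. This is an illustrative assumption about how such a step might look, not Knowledge Repo's actual code: the `notebook_to_post` function, the file naming scheme, and the demo notebook are all hypothetical, and the upload of the extracted images to cloud storage is left out.

```python
import base64
import json


def notebook_to_post(notebook_json, post_name):
    """Return (markdown_text, images) for a notebook given as a JSON string.

    Markdown cells are kept, code cell input is dropped, and PNG outputs are
    extracted so they could be uploaded elsewhere (upload not shown here).
    """
    nb = json.loads(notebook_json)
    parts, images = [], {}
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        if isinstance(source, list):  # the notebook format allows a list of lines
            source = "".join(source)
        if cell.get("cell_type") == "markdown":
            parts.append(source)
        elif cell.get("cell_type") == "code":
            for output in cell.get("outputs", []):
                png = output.get("data", {}).get("image/png")
                if png is not None:
                    name = f"{post_name}-{len(images)}.png"  # hypothetical naming
                    images[name] = base64.b64decode(png)
                    parts.append(f"![figure]({name})")
    return "\n\n".join(parts), images


# Tiny made-up notebook: one Markdown cell and one code cell with a fake image.
demo = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Retention analysis\n", "Key finding."]},
        {"cell_type": "code", "source": "plot()", "outputs": [
            {"output_type": "display_data",
             "data": {"image/png": base64.b64encode(b"fake-png-bytes").decode()}},
        ]},
    ]
})
markdown, images = notebook_to_post(demo, "retention")
print(markdown)
```

Keeping the images in a separate mapping mirrors the split described in the notes: the text goes into the repository as Markdown, while large assets are handled separately.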

Simplify And Scale Your Software Development Cycles By Putting On Pants (Build Tool)

Summary Software development is a complex undertaking due to the number of options available and choices to be made in every stage of the lifecycle. In order to make it more scalable it is necessary to establish common practices and patterns and introduce strong opinions. One area that can have a huge impact on the productivity of the engineers engaged with a project is the tooling used for building, validating, and deploying changes introduced to the software. In this episode, maintainers of the Pants build tool, Eric Arellano, Stu Hood, and Andreas Stenius, discuss the recent updates that add support for more languages, efforts made to simplify its adoption, and the growth of the community that uses it. They also explore how using Pants as the single entry point for all of your routine tasks allows you to spend your time on the decisions that matter. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Building data integration workflows is time consuming and tedious, requiring an unpleasant amount of boilerplate code to do it right. 
Rivery is a managed platform for building your ELT pipelines that offers the industry’s first native integration with Python, allowing you to seamlessly load and export Pandas dataframes to and from all of your databases, services, and data warehouses with a few clicks and no extra code. Rivery is hosting a live demo of their first class Python support on February 22nd, and when you use the promo code "Python" during registration you will be entered to win a brand new Series 7 Apple Watch. Go to pythonpodcast.com/rivery today to learn more and register. Your host as usual is Tobias Macey and today I’m interviewing Eric Arellano, Stu Hood, and Andreas Stenius about the Pants build tool and all of the work that has gone into it recently Interview Introductions How did you get introduced to Python? Can you describe what Pants is and the story behind it? What is the scope of concerns that Pants is focused on addressing? What are some of the notable changes in the project and its ecosystem over the past 1 1/2 years? How do you approach the work of defining the target scope of the Pants toolchain? What are some of your guiding principles to decide when a feature request belongs in the core vs as a plugin? What are some of the ergonomic improvements that you have added to simplify the work of getting started with Pants and adopting it across teams? What are some of the challenges that teams run into as they start to scale the size of their monorepos? (e.g. project design, boilerplate reduction, etc.) How are you managing the work of growing and supporting the community as you move beyond early adopters/experts into newcomers to Pants and programming? How are you handling support for multiple language ecosystems? What are some of the challenges involved with making Pants feel idiomatic for such a range of communities? How does the use of Python as the plugin/extension syntax work for teams that don’t use it as their primary language? 
What are the architectural changes that needed to be made for you to be capable of integrating with the different execution environments? How would you characterize the level of feature coverage across the different supported languages? Now that you have laid the foundation, how much effort is required to add new language targets? What are the most interesting, innovative, or unexpected ways that you have seen Pants used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Pants? When is Pants the wrong choice? What do you have planned for the future of Pants? Keep In Touch Eric LinkedIn Eric-Arellano on GitHub @earellanoaz on Twitter Stu LinkedIn @stuhood on Twitter stuhood on GitHub Andreas @andreasstenius on Twitter kaos on GitHub Picks Tobias Last Kingdom on Netflix Eric Getting Curious Stu Checks and Balance Podcast Andreas The Pragmatic Programmer Links Pants Make Earthly Podcast Episode MyPy Podcast Episode PyRight Pylint Flake8 Podcast Episode Bazel pre-commit Podcast Episode Underpants library PyOxidizer Podcast Episode Eric PyCon Talk The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/14/2022 • 58 minutes, 14 seconds

Achieve Repeatable Builds Of Your Software On Any Machine With Earthly

Summary It doesn’t matter how amazing your application is if you are unable to deliver it to your users. Frustrated with the rampant complexity involved in building and deploying software Vlad A. Ionescu created the Earthly tool to reduce the toil involved in creating repeatable software builds. In this episode he explains the complexities that are inherent to building software projects and how he designed the syntax and structure of Earthly to make it easy to adopt for developers across all language environments. By adopting Earthly you can use the same techniques for building on your laptop and in your CI/CD pipelines. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Vlad A. Ionescu about Earthly, a syntax and runtime for software builds to reduce friction between development and delivery Interview Introductions How did you get introduced to Python? Can you describe what Earthly is and the story behind it? What are the core principles that engineers should consider when designing their build and delivery process? What are some of the common problems that engineers run into when they are designing their build process? What are some of the challenges that are unique to the Python ecosystem? 
What is the role of Earthly in the overall software lifecycle? What are the other tools/systems that a team is likely to use alongside Earthly? What are the components that Earthly might replace? How is Earthly implemented? What were the core design requirements when you first began working on it? How have the design and goals of Earthly changed or evolved as you have explored the problem further? What is the workflow for a Python developer to get started with Earthly? How can Earthly help with the challenge of managing JavaScript and CSS assets for web application projects? What are some of the challenges (technical, conceptual, or organizational) that an engineer or team might encounter when adopting Earthly? What are some of the features or capabilities of Earthly that are overlooked or misunderstood that you think are worth exploring? What are the most interesting, innovative, or unexpected ways that you have seen Earthly used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Earthly? When is Earthly the wrong choice? What do you have planned for the future of Earthly? Keep In Touch LinkedIn @VladAIonescu on Twitter Website Picks Tobias Shape Up book Vlad High Output Management by Andy Grove Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Earthly Bazel Pants Podcast Episode ARM AWS Graviton Apple M1 CPU Qemu Phoenix web framework for Elixir language The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/6/2022 • 54 minutes, 1 second

Building A Detailed View Of Your Software Delivery Process With The Eiffel Protocol

Summary The process of getting software delivered to an environment where users can interact with it requires many steps along the way. In some cases the journey can require a large number of interdependent workflows that need to be orchestrated across technical and organizational boundaries, making it difficult to know what the current status is. Faced with such a complex delivery workflow the engineers at Ericsson created a message based protocol and accompanying tooling to let the various actors in the process provide information about the events that happened across the different stages. In this episode Daniel Ståhl and Magnus Bäck explain how the Eiffel protocol allows you to build a tooling agnostic visibility layer for your software delivery process, letting you answer all of your questions about what is happening between writing a line of code and your users executing it. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Daniel Ståhl and Magnus Bäck about Eiffel, an open protocol for platform agnostic communication for CI/CD systems Interview Introductions How did you get introduced to Python? Can you describe what Eiffel is and the story behind it? What are the goals of the Eiffel protocol and ecosystem? 
What is the role of Python in the Eiffel ecosystem? What are some of the types of questions that someone might ask about their CI/CD workflow? How does Eiffel help to answer those questions? Who are the personas that you would expect to interact with an Eiffel system? Can you describe the core architectural elements required to integrate Eiffel into the software lifecycle? How have the design and goals of the Eiffel protocol/architecture changed or evolved since you first began working on it? What are some example workflows that an engineering/product team might build with Eiffel? What are some of the challenges that teams encounter when integrating Eiffel into their delivery process? What are the most interesting, innovative, or unexpected ways that you have seen Eiffel used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Eiffel? When is Eiffel the wrong choice? What do you have planned for the future of Eiffel? Keep In Touch Daniel d-stahl-ericsson on GitHub LinkedIn Magnus LinkedIn magnusbaeck on GitHub Picks Tobias Red Notice Daniel The Witcher Magnus Lego Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Eiffel Ericsson Axis Communications Hudson CI framework Spinnaker Jenkins Tekton Gradle Artifactory JSON Schema RabbitMQ Prometheus Continuous Delivery Foundation CD Events XKCD Competing Standards Python Eiffel SDK Pydantic The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/31/2022 • 49 minutes, 54 seconds

Improve Your Productivity By Investing In Developer Experience Design For Your Projects

Summary When we are creating applications we spend a significant amount of effort on optimizing the experience of our end users to ensure that they are able to complete the tasks that the system is intended for. A similar effort that we should all consider is optimizing the developer experience for ourselves and other engineers who contribute to the projects that we work on. Adam Johnson recently wrote a book on how to improve the developer experience for Django projects and in this episode he shares some of the insights that he has gained through that project and his work with clients to help you improve the experience that you and your team have when collaborating on software development. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Adam Johnson about optimizing your developer experience Interview Introductions How did you get introduced to Python? Can you describe what you mean by the term "developer experience"? How does it compare to the concept of user experience design? What are the main goals that you aim for through improving DX? When considering DX, what are the categories of focus for improvement? (e.g. 
the experience of a given software project, the developer’s physical environment, their editing environment, etc.) What are some of the most high impact optimizations that a developer can make? What are some of the areas of focus that have the most variable impact on a developer’s experience of a project? What are some of the most helpful tools or practices that you rely on in your own projects? How does the size of a development team or the scale of an organization impact the decisions and benefits around DX improvements? One of the perennial challenges with selecting a given tool or architectural pattern is the continually changing landscape of software. How have your choices for DX strategies changed or evolved over the years? What are the most interesting, innovative, or unexpected developer experience tweaks that you have encountered? What are the most interesting, unexpected, or challenging lessons that you have learned while working on your book? What are some of the potential pitfalls that individuals and teams need to guard against in their quest to improve developer experience for their projects? What are some of the new tools or practices that you are considering incorporating into your own work? Keep In Touch @AdamChainz on Twitter Website adamchainz on GitHub Picks Tobias Eternals movie Adam Fan of Eternals, enjoyed the Neil Gaiman series; also a general MCU fan, watched it all in lockdown; Moon Knight trailer Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Boost Your Django DX Rust Ripgrep Factory Boy Mimesis Podcast Episode Language Server Protocol EditorConfig Starship Command Prompt Pre-Commit Podcast Episode Flake8 Podcast Episode DevDocs Dash library documentation search tool pyupgrade StandardJS Cython Podcast Episode The Phoenix Project The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/24/2022 • 42 minutes, 53 seconds

An Exploration Of Effective Pandas Practices With Matt Harrison

Summary Pandas has grown to be a ubiquitous tool for working with data at every stage. It has become so well known that many people learn Python solely for the purpose of using Pandas. With all of this activity and the long history of the project it can be easy to find misleading or outdated information about how to use it. In this episode Matt Harrison shares his work on the book "Effective Pandas" and some of the best practices and potential pitfalls that you should know for applying Pandas in your own work. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Matt Harrison about best practices for using Pandas for data exploration, manipulation, and analysis Interview Introductions How did you get introduced to Python? What motivated you to write a book about Pandas? There are a number of books available that cover some aspect of the Pandas framework or its application. What was missing from the available literature? Who is your target audience for this book? What are some of the most surprising things that you have learned about Pandas while working on this book? What are the sharp edges that you see newcomers to pandas run into most frequently? It is easy to use Pandas in a naive manner and get things done. 
What are some of the bad habits that you have seen people form in their work with Pandas? How and when do those habits become harmful? What are the most interesting, innovative, or unexpected ways that you have seen Pandas used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on this book? What are some of the projects that you are planning to work on in the near/medium term? Keep In Touch Website @__mharrison__ on Twitter Blog mattharrison on GitHub Picks Tobias MSR Snowshoes Matt Telemark Skiing 22 Designs Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Effective Pandas Book (affiliate link with 20% discount code applied) Discount code INIT TCL Perl Pandas Podcast Episode Pandas Extension Arrays Podcast Episode Koalas Dask Data Engineering Podcast Episode Modin Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/15/2022 • 49 minutes, 57 seconds

Generate Your Text Files With Python Using Cog

Summary Developers hate wasting effort on manual processes when we can write code to do it instead. Cog is a tool to manage the work of automating the creation of text inside another file by executing arbitrary Python code. In this episode Ned Batchelder shares the story of why he created Cog in the first place, some of the interesting ways that he uses it in his daily work, and the unique challenges of maintaining a project with a small audience and a well defined scope. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Ned Batchelder about Cog, a tool for generating files or text from embedded Python logic Interview Introductions How did you get introduced to Python? Can you describe what Cog is and the story behind it? What are the use cases that you initially created Cog to address? What were the shortcomings or extraneous overhead that you encountered in tools such as Jinja, Mako, Genshi, etc. that led you to create a new tool? What was your path from a quick and dirty script that suited your own purposes to turning it into a niche open source project that was general and stable enough for the broader community? One of your claims to fame is your role as the maintainer for coverage.py. 
How has your experience managing such a widely used project translated to the relatively small and low traffic project like Cog? Can you describe how Cog is implemented? How did you approach the design of the syntactic elements for embedding Python code into a host file? What is the workflow for someone using Cog to generate all or parts of a file? How does the introduction of third party dependencies impact the viability and utility of Cog as compared to other templating systems? What are the most interesting, innovative, or unexpected ways that you have seen Cog used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Cog? When is Cog the wrong choice? What do you have planned for the future of Cog? Keep In Touch Website nedbat on GitHub @nedbat on Twitter LinkedIn Picks Tobias Samson Q9U Microphone Ned McFly Command Line History Tool Go for a walk Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Cog Boston Python Lotus Lotus Notes Zope Cheetah Template Engine Coverage.py Podcast Episode Unix Philosophy Hungarian Notation Jupyter Notebooks GitHub Profile ReadMe Ned’s GitHub Profile Raw Markdown The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/13/2022 • 50 minutes, 32 seconds

A Friendly Approach To Regression Models For Programmers

Summary Statistical regression models are a staple of predictive forecasts in a wide range of applications. In this episode Matthew Rudd explains the various types of regression models, when to use them, and his work on the book "Regression: A Friendly Guide" to help programmers add regression techniques to their toolbox. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Matthew Rudd about the applications of statistical modeling and regression, and how to start using it for your work Interview Introductions How did you get introduced to Python? Can you start by describing some use cases for statistical regression? What was your motivation for writing a book to explain this family of algorithms to programmers? What are your goals for the book? Who is the target audience? What are some of the different categories of regression algorithms? What are some heuristics for identifying which regression to use? How have you approached the balance of using software principles for explaining the work of building the models with the mathematical underpinnings that make them work? What are some of the concepts that are most challenging for people who are first working with regression models? 
What are the most interesting, innovative, or unexpected ways that you have seen statistical regression models used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on your book? What are some of the resources that you recommend for folks who want to learn more about the inner workings and applications of regression models after they finish your book? Keep In Touch LinkedIn @MatthewBRudd on Twitter Picks Tobias The Argument podcast from the NY Times Matthew Primus Claypool Lennon Delirium South of Reality Links Regression: A Friendly Guide Sewanee University of the South Sewanee Data Lab Mark Lutz Python books Elements of Statistical Learning Linear Regression Logistic Regression Modeling Binary Data Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/2/2022 • 45 minutes, 15 seconds

Fast, Flexible, and Incremental Task Automation With doit

Summary Every software project needs a tool for managing the repetitive tasks that are involved in building, running, and deploying the code. Frustrated with the limitations of tools like Make, Scons, and others, Eduardo Schettino created doit to handle task automation in his own work and released it as open source. In this episode he shares the story behind the project, how it is implemented under the hood, and how you can start using it in your own projects to save you time and effort. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Eduardo Schettino about doit, a flexible and low overhead task automation tool Interview Introductions How did you get introduced to Python? Can you describe what doit is and the story behind it? What are the main goals and use cases of doit? Can you describe how you approached the implementation of doit? How has the design changed or evolved since you first began working on it? The realm of task automation tools for developers is an exceedingly crowded one, with each tool prioritizing certain use cases. How would you characterize the position of doit in the current ecosystem? How does it compare to e.g. Click, Invoke, Typer, etc.? 
What is your guiding philosophy for when and how to add new features? You have been running the project for ~13 years now. How has the evolution of the Python language and ecosystem influenced your approach to the development and maintenance of doit? What is the workflow for getting started with doit and integrating it into your development process? For every project there are some tasks that are identical and some that are bespoke for that application. What are the options for maintaining a standard set of tasks across repositories and composing them with per-project activities? What are some of the useful patterns that you and the community have established for designing tasks and execution graphs? How do you use doit in your own work? What are the most interesting, innovative, or unexpected ways that you have seen doit used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on doit? When is doit the wrong choice? What do you have planned for the future of doit? Keep In Touch LinkedIn schettino72 on GitHub Picks Tobias The Matrix series Eduardo John Pilger Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links doit Zope Twisted Django Pyflakes scons Make Nikola Podcast Episode Nose Pytest Podcast Episode Click Typer Invoke Puppet Ansible Chef Sphinx Snakemake Airflow Luigi pytest-incremental import-deps dbm MetalK8s The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/27/2021 • 39 minutes, 27 seconds

The Technological, Business, and Sales Challenges Of Building The Ethical Ads Network

Summary Whether we like it or not, advertising is a common and effective way to make money on the internet. In order to support the work being done at Read The Docs they decided to include advertisements on the documentation sites they were hosting, but they didn’t want to alienate their users or collect unnecessary information. In this episode David Fischer explains how they built the Ethical Ads network to solve their problem, the technical and business challenges that are involved, and the open source application that they built to power their network. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing David Fischer about the Ethical Ads marketplace and the technology that runs it Interview Introductions How did you get introduced to Python? Can you describe what the Ethical Ads project is and the story behind it? What are the technical and organizational requirements involved in running an ad network? How have you approached the problem of kickstarting the flywheel for the two-sided marketplace? What are some of the challenges that you face in building an accurate profile of your audience without using detailed tracking methods? 
What are the benefits that you see in focusing exclusively on developers in your publisher relationships? Can you describe the design and implementation of the ad server? How has the architecture evolved since you first began working on it? If you were to start over today what might you do differently? How have you approached scaling for performance and geographic distribution? What mechanisms do you use for tracking impressions/measuring ad effectiveness? How can advertisers experiment with A/B testing of ad copy? If someone wants to run their own advertisements with the ethical ads server, what is involved in getting it deployed and integrated into their sites? What are the integration and extension points available for customizing the behavior of the platform? What are some of the most notable lessons that you have learned about online advertising since you first started working on the Ethical Ads project? What are the most interesting, innovative, or unexpected ways that you have seen Ethical Ads used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on the Ethical Ads platform? What do you have planned for the future of the Ethical Ads platform? Keep In Touch davidfischer on GitHub @djfische on Twitter LinkedIn Picks Tobias Ship It! Podcast David Local Python Meetup Click CLI framework useragents library TLD for parsing internet domains Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Ethical Ads Network Ethical Ads Server San Diego Python Read The Docs Podcast Episode CodeFund CPM == Cost Per Mille The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/20/2021 • 55 minutes, 48 seconds

Accidentally Building A Business With Python At Listen Notes

Summary Podcasts are one of the few mediums in the internet era that are still distributed through an open ecosystem. This has a number of benefits, but it also brings the challenge of making it difficult to find the content that you are looking for. Frustrated by the inability to pick and choose single episodes across various shows for his listening, Wenbin Fang started the Listen Notes project to fulfill his own needs. He ended up turning that project into his full time business which has grown into the most full featured podcast search engine on the market. In this episode he explains how he built the Listen Notes application using Python and Django, his work to turn it into a sustainable business, and the various ways that you can build other applications and experiences on top of his API. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Wenbin Fang about the technology powering the Listen Notes podcast discovery platform Interview Introductions How did you get introduced to Python? Can you describe what Listen Notes is and the story behind it? What are some of the main goals that listeners have when searching for a podcast? 
What are the challenges that they commonly encounter when looking for information in a podcast? What are the different sources of information that you can use to extract useful details about a podcast? How do you identify and prioritize new features or product enhancements? Can you describe how the Listen Notes platform is architected? How has it changed or evolved since you first began working on it? How did you approach the technology selection for the initial version of Listen Notes? If you were to start over today, what might you do differently? What are the technical challenges that are posed by the ecosystem around podcasts? What are the biggest changes that have happened in the methods of production and consumption for podcasts since you first became involved in the space? How do you approach the design and contracts of the Listen Notes web API given how core that is to your platform? What are the most complex or complicated engineering projects that you have done for Listen Notes? What are the pieces of the infrastructure for podcasts that you would like to see improved, changed, or replaced? What are some of the kinds of projects that developers can build with the Listen Notes API? What, if any, impact has the introduction of podcasts to closed platforms such as Spotify, Amazon Music, etc. had on your business? What are some of the most surprising things that you have learned about podcasts and their consumption while building Listen Notes? What are the most interesting, innovative, or unexpected ways that you have seen Listen Notes used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Listen Notes? What do you have planned for the future of Listen Notes? 
Keep In Touch Website LinkedIn wenbinf on GitHub @wenbinf on Twitter Picks Tobias Wheel of Time TV Series Wenbin Superhuman email client Links Listen Notes Graphviz NextDoor PostgreSQL Elasticsearch Redis RabbitMQ Celery ReactJS Django Bootstrap CSS Digital Ocean Tailwind CSS Entity Resolution Clickhouse Data Engineering Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/12/2021 • 43 minutes, 28 seconds

Making Orbital Mechanics More Accessible With Poliastro

Summary Outer space holds a deep fascination for people of all ages, and the key principle in its exploration both near and far is orbital mechanics. Poliastro is a pure Python package for exploring and simulating orbit calculations. In this episode Juan Luis Cano Rodriguez shares the story behind the project, how you can use it to learn more about space travel, and some of the interesting projects that have used it for planning planetary and interplanetary missions. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Juan Luis Cano Rodriguez about Poliastro, an open source library for interactive Astrodynamics and Orbital Mechanics, with a focus on ease of use, speed, and quick visualization. Interview Introductions How did you get introduced to Python? Can you describe what Poliastro is and the story behind it? What are some of the simulations that Poliastro is designed to be used for? How much knowledge of orbital mechanics is necessary to get started with Poliastro? Can you describe how the project is implemented? How have the goals and design of the project changed or evolved since you first started it? 
What are some of the design philosophies that you focus on to make the package accessible to the range of users that you support? Can you talk through the workflow of using Poliastro to do something like track the path of the ISS and its traversal of the debris field from the recent satellite destruction? What are some of the other libraries or frameworks that are commonly used with Poliastro? How are you using Poliastro in your own work? What are some overlooked or underused aspects of the project that you would like to highlight? What are the most interesting, innovative, or unexpected ways that you have seen Poliastro used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Poliastro? When is Poliastro the wrong choice? What do you have planned for the future of Poliastro? Keep In Touch LinkedIn GitHub Email Twitter Picks Tobias Josh Blue (comedian) Juan Luis DJ Cotts DJ Weaver Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Poliastro Fortran 90 (if only this community existed back then! 
https://ondrejcertik.com/blog/2021/03/resurrecting-fortran/) Satellogic Read the Docs Wolfram Alpha Mathematica SageMath 2-Body Problem AstroPy Podcast Episode Numba Import Linter Vallado "Fundamentals of Astrodynamics" International Space Station Starlink Satellites Planetary Ephemerides Data Satellite Data Kerbal Space Program NumFOCUS Open Collective Python SGP4 Libre Space Foundation The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/27/2021 • 58 minutes, 59 seconds

Build Better Analytics And Models With A Focus On The Data Experience

Summary A lot of time and energy goes into data analysis and machine learning projects to address various goals. Most of the effort is focused on the technical aspects and validating the results, but how much time do you spend on considering the experience of the people who are using the outputs of these projects? In this episode Benn Stancil explores the impact that our technical focus has on the perceived value of our work, and how taking the time to consider what the desired experience will be can lead us to approach our work more holistically and increase the satisfaction of everyone involved. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Benn Stancil about the perennial frustrations of working with data and thoughts on how to improve the experience Interview Introductions How did you get introduced to Python? Can you start by discussing your perspective on the most frustrating elements of working with data in an organization? How might that compound when working with machine learning? What are the sources of the disconnect between our level of technical sophistication and our ability to produce meaningful insights from our data? 
There have been a number of formulations about a "hierarchy of needs" pertaining to data. When the goal is to bring ML/AI methods to bear on an organization’s processes or products how can thinking about the intended experience act to improve the end result? What are some failure modes or suboptimal outcomes that might be expected when building from a tooling/technology/technique first mindset? What are some of the design elements that we can incorporate into our development environments/data infrastructure/data modeling that can incentivize a more experience driven process for building data products/analyses/ML models? How do the design and capabilities of the Mode platform allow teams to progress along the journey from data discovery to descriptive analytics, to ML experiments? What are the most interesting, innovative, or unexpected approaches that you have seen for encouraging the creation of positive data experiences? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Mode and data analysis? When is a data experience the wrong approach? What do you have planned for the future of Mode to support this ideal? Keep In Touch LinkedIn @bennstancil on Twitter Picks Tobias Venture Unlocked Podcast Benn Wrap Text by Bobby Pinero Counting Stuff by Randy Au Ray Data Co by Mr Ben Modern Data Democracy By JP Monteiro Bad Blood Podcast Bad Blood Book Links Mode Analytics Tidyverse Airflow Fivetran Data Engineering Podcast Episode dbt Data Engineering Podcast Episode Conway’s Law Cinchy Data Engineering Podcast Episode Reverse ETL The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/22/2021 • 59 minutes, 27 seconds

Declarative Deep Learning From Your Laptop To Production With Ludwig and Horovod

Summary Deep learning frameworks encourage you to focus on the structure of your model ahead of the data that you are working with. Ludwig is a tool that uses a data oriented approach to building and training deep learning models so that you can experiment faster based on the information that you actually have, rather than spending all of your time manipulating features to make them match your inputs. In this episode Travis Addair explains how Ludwig is designed to improve the adoption of deep learning for more companies and a wider range of users. He also explains how the Horovod framework plugs in easily to allow for scaling your training workflow from your laptop out to a massive cluster of servers and GPUs. The combination of these tools allows for a declarative workflow that starts off easy but gives you full control over the end result. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Travis Addair about building and training machine learning models with Ludwig and Horovod Interview Introductions How did you get introduced to Python? Can you describe what Horovod and Ludwig are? How do the projects work together? What was your path to being involved in those projects and what is your current role? 
There are a number of AutoML libraries available for frameworks such as scikit-learn, etc. What are the challenges that are introduced by applying that workflow to deep learning architectures? What are the use cases that Ludwig is designed to enable? Who are the target users of Ludwig? How do the workflows change/progress for the different personas? How is the underlying framework architected? What are the available extension points to provide a progressive exposure of complexity? How have the goals and design of the project changed or evolved as it has gained more widespread adoption beyond Uber? What was the motivation for migrating the core of Ludwig from Tensorflow to Pytorch? Can you describe the workflow of building a model definition with Ludwig? How much knowledge of neural network architectures and their relevant characteristics is necessary to use Ludwig effectively? What are the motivating factors for adding Horovod to the process? What is involved in moving from a single machine/single process training loop to a multi-core or multi-machine distributed training process? The combination of Ludwig and Horovod provides a shallower learning curve for building and scaling model training. What do you see as their potential impact on the availability and adoption of more sophisticated ML capabilities across organizations of varying scale? What do you see as other significant barriers to widespread use of ML functionality? What are the most interesting, innovative, or unexpected ways that you have seen Ludwig and/or Horovod used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Ludwig and Horovod? When is Ludwig and/or Horovod the wrong choice? What do you have planned for the future of both projects? Keep In Touch LinkedIn @TravisAddair on Twitter tgaddair on GitHub Picks Tobias Zeal and Ardor Travis Opeth Agaloch Closing Announcements Thank you for listening! 
Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Ludwig Horovod Predibase Uber Michelangelo Tensorflow PyTorch Podcast Episode Gradient Boosted Trees XGBoost CatBoost LightGBM PyCaret HyperBand scikit-optimize Keras Vision Transformer Architecture HuggingFace Jax DeepSpeed AllReduce Nvidia Collective Communications Library (NCCL) Training Epoch ElasticDL Raft Consensus Algorithm TorchScript Transfer Learning Gordon Bell Prize Anyscale Ray Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/22/2021 • 1 hour, 4 minutes, 48 seconds

Building Conversational AI to Augment Sales Teams at Structurely

Summary The true power of artificial intelligence is its ability to work collaboratively with humans. Nate Joens co-founded Structurely to create a conversational AI platform that augments human sales teams to help guide potential customers through the initial steps of the funnel. In this episode he discusses the technical and social considerations that need to be combined for a seamless conversational experience and how he and his team are tackling the problem. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Nate Joens about his work at Structurely to build conversational AI utilities that augment human sales interactions Interview Introductions How did you get introduced to Python? Can you describe what Structurely is and the story behind it? What are the elements that comprise a "conversational AI"? How is it distinct from the wave of chatbots that were popular in recent years? What lessons from that approach can we take forward into AI enabled conversational platforms? How are you applying AI to the sales process? How much domain expertise is necessary to make an effective and engaging conversational AI? (e.g. knowledge of sales techniques vs. knowledge of real estate, etc.) 
Can you describe how you have designed the Structurely platform? What are the biggest engineering challenges that you have had to work through? What challenges or complexities have been most persistent? What are the design complexities that you have to work through to make the AI accessible for end users? What are some of the advancements in AI/NLP/transfer learning that have been most beneficial for teams building conversational AI? What are the signals that you emphasize when monitoring the performance of your models? What is your approach for feeding real-world customer interactions back into your model development and training loop? What are the most active areas of research in conversational AI applications and techniques? What are the most interesting, innovative, or unexpected ways that you have seen Structurely and/or conversational AI used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on conversational AI at Structurely? When is conversational AI the wrong choice? What do you have planned for the future of Structurely? Keep In Touch @whonatejoens on Twitter LinkedIn Picks Tobias Vantage AWS Cost Management Nate VideoForm Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Structurely GIS Generative AI GPT-3 Sankey Diagram PyTorch Podcast Episode Allen Institute for AI F Score Snorkel Podcast Episode Few-Shot Learning Zero Shot Learning Voxable The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/6/2021 • 50 minutes, 59 seconds

Build Composable And Reusable Feature Engineering Pipelines with Feature-Engine

Summary Every machine learning model has to start with feature engineering. This is the process of combining input variables into a more meaningful signal for the problem that you are trying to solve. Many times this process can lead to duplicating code from previous projects, or introducing technical debt in the form of poorly maintained feature pipelines. In order to make the practice more manageable Soledad Galli created the feature-engine library. In this episode she explains how it has helped her and others build reusable transformations that can be applied in a composable manner with your scikit-learn projects. She also discusses the importance of understanding the data that you are working with and the domain in which your model will be used to ensure that you are selecting the right features. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Soledad Galli about feature-engine, a Python library to engineer features for use in machine learning models Interview Introductions How did you get introduced to Python? Can you describe what feature-engine is and the story behind it? What are the complexities that are inherent to feature engineering? 
What are the problems that are introduced due to incidental complexity and technical debt? What was missing in the available set of libraries/frameworks/toolkits for feature engineering that you are solving for with feature-engine? What are some examples of the types of domain knowledge that are needed to effectively build features for an ML model? Given the fact that features are constructed through methods such as normalizing data distributions, imputing missing values, combining attributes, etc. what are some of the potential risks that are introduced by incorrectly applied transformations or invalid assumptions about the impact of these manipulations? Can you describe how feature-engine is implemented? How have the design and goals of the project changed or evolved since you started working on it? What (if any) difference exists in the feature engineering process for frameworks like scikit-learn as compared to deep learning approaches using PyTorch, Tensorflow, etc.? Can you describe the workflow of identifying and generating useful features during model development? What are the tools that are available for testing and debugging of the feature pipelines? What do you see as the potential benefits or drawbacks of integrating feature-engine with a feature store such as Feast or Tecton? What are the most interesting, innovative, or unexpected ways that you have seen feature-engine used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on feature-engine? When is feature-engine the wrong choice? What do you have planned for the future of feature-engine? Keep In Touch LinkedIn @Soledad_Galli on Twitter solegalli on GitHub Picks Tobias Dune Movie Dune Series Soledad The Social Dilemma Don’t Be Evil by Rana Foroohar Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. 
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links feature-engine Feature Engineering Python Feature Engineering Cookbook scikit-learn Feature Stores Podcast Episode Pandas Podcast Episode PyTorch Podcast Episode Tensorflow Feast Tecton Data Engineering Podcast Episode Kaggle Dask Data Engineering Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/31/2021 • 53 minutes, 29 seconds

Speed Up Your Python Data Applications By Parallelizing Them With Bodo

Summary The speed of Python is a subject of constant debate, but there is no denying that for compute heavy work it is not the optimal tool. Rather than rewriting your data oriented applications, or having to rearchitect them, the team at Bodo wrote a compiler that will do the optimization for you. In this episode Ehsan Totoni explains how they are able to translate pure Python into massively parallel processes that are optimized for high performance compute systems. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Ehsan Totoni about Bodo, an inferential compiler for Python that automatically parallelizes your data oriented projects Interview Introductions How did you get introduced to Python? Can you describe what Bodo is and the story behind it? What are some of the use cases that it is being applied to? What are the motivating factors for something like Dask or Ray as compared to Bodo? What are the software patterns that contribute to slowdowns in data processing code? What are some of the ways that the compiler is able to optimize those operations? Can you describe how Bodo is implemented? How does Bodo process the Python code for compiling to the optimized form? 
What are the compilation techniques for understanding the semantics of the code being processed? How do you manage packages that rely on C extensions? What do you use as an intermediate representation for translating into the optimized output? What is the workflow for applying Bodo to a Python project? What debugging utilities does it provide for identifying any errors that occur due to the added parallelism? What kind of support does Bodo have for optimizing a machine learning project? (e.g. using PyTorch/Tensorflow/MxNet/etc.) When working with a workflow orchestrator such as Dagster or Airflow, what would the integration process look like for being able to take advantage of the optimized Bodo output? What are the most interesting, innovative, or unexpected ways that you have seen Bodo used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Bodo? When is Bodo the wrong choice? What do you have planned for the future of Bodo? Keep In Touch LinkedIn @EhsanTn on Twitter ehsantn on GitHub Picks Tobias Paracord Crafts Ehsan Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Bodo Data Engineering Podcast Episode University of Illinois Urbana-Champaign HPC MPI Elastic Fabric Adapter All-to-All Communication Dask Data Engineering Podcast Episode Ray Podcast Episode Pandas Extension Arrays Podcast Episode GeoPandas Numba LLVM scikit-learn Horovod Dagster Podcast.__init__ Episode Data Engineering Podcast Episode Airflow Podcast Episode IPython Parallel Parquet The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/25/2021 • 58 minutes, 6 seconds

An Exploration Of Financial Exchange Risk Management Strategies

Summary The world of finance has driven the development of many sophisticated techniques for data analysis. In this episode Paul Stafford shares his experiences working in the realm of risk management for financial exchanges. He discusses the types of risk that are involved, the statistical methods that he has found most useful for identifying strategies to mitigate that risk, and the software libraries that have helped him most in his work. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Paul Stafford about building risk models to guard against financial exchange rate volatility Interview Introductions How did you get introduced to Python? What are the principles involved in risk management, and how are statistical methods used? How did you get involved in financial markets? In what ways did your background in science and engineering prepare you for work in finance and risk management? What are the tools that you have found most useful in your career in finance? How have recent trends such as the widespread adoption of deep learning impacted the capabilities and risks present in foreign exchange strategies? 
What are the challenges that you face in obtaining and validating the input data that you are relying on for building financial and statistical models? How has the volatility of the pandemic impacted the robustness and resilience of your predictive capabilities? What are the areas where the available tools are typically insufficient? What are the most interesting, innovative, or unexpected strategies or techniques that you have seen applied to risk management? What are the most interesting, unexpected, or challenging lessons that you have learned while working in risk management? What are the economic and industry trends that you are keeping a close eye on for your work at Deaglo and your own personal projects? Keep In Touch LinkedIn Picks Tobias The Vault (movie) Paul Motorcycle Trip of the Grand Canyon Links Deaglo Partners, LLC. Value At Risk (VaR) Black-Scholes Equation Linear Algebra Principal Component Analysis Eigenvectors and Eigenvalues Markov Chain Monte Carlo Violin Plot Kurtosis PyMC3 Podcast Episode Bayesian Regression Constrained Optimization Ethereum Smart Contracts Behavioral Finance Black Swan by Nassim Nicholas Taleb (affiliate link) SciPy Convention RealPython 3Blue1Brown Sentiment Analysis The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/16/2021 • 34 minutes, 30 seconds

Build Better Machine Learning Models By Understanding Their Decisions With SHAP

Summary Machine learning and deep learning techniques are powerful tools for a large and growing number of applications. Unfortunately, it is difficult or impossible to understand the reasons for the answers that they give to the questions they are asked. In order to help shine some light on what information is being used to provide the outputs to your machine learning models Scott Lundberg created the SHAP project. In this episode he explains how it can be used to provide insight into which features are most impactful when generating an output, and how that insight can be applied to make more useful and informed design choices. This is a fascinating and important subject and this episode is an excellent exploration of how to start addressing the challenge of explainability. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Scott Lundberg about SHAP, a library that implements a game theoretic approach to explain the output of any machine learning model Interview Introductions How did you get introduced to Python? Can you describe what SHAP is and the story behind it? What are some of the contexts that create the need to explain the reasoning behind the outputs of an ML model? 
How do different types of models (deep learning, CNN/RNN, bayesian vs. frequentist, etc.) and different categories of ML (e.g. NLP, computer vision) influence the challenge of understanding the meaningful signals in their reasoning? Taking a step back, how do you define "explainability" when discussing inferences produced by ML models? What are the degrees of specificity/accuracy when seeking to understand the decision processes involved? Can you describe how SHAP is implemented? What are the signals that you are tracking to understand what features are being used to determine a given output? What are the assumptions that you had as you started this project that have been challenged or updated as you explored the problem in greater depth? Can you describe the workflow for someone using SHAP? What are the challenges faced by practitioners in interpreting the visualizations generated from SHAP? How much domain knowledge and context is necessary to use SHAP effectively? What are the ongoing areas of research around tracking of ML decision processes? How are you using SHAP in your own work? What are the most interesting, innovative, or unexpected ways that you have seen SHAP used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on SHAP? When is SHAP the wrong choice? What do you have planned for the future of SHAP? Keep In Touch slundberg on GitHub Website LinkedIn Picks Tobias Reminiscence Scott Augustine’s Confessions Links SHAP Microsoft Research Matlab Game Theory Computational Biology LIME Shapley Values Julia Language ResNet CNN == Convolutional Neural Network RNN == Recurrent Neural Network A* Algorithm CFPB == Consumer Financial Protection Bureau NP Hard Huggingface Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations Numba Log Odds InterpretML Polyjuice The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/9/2021 • 1 hour, 4 minutes, 54 seconds

Accelerating Drug Discovery Using Machine Learning With TorchDrug

Summary Finding new and effective treatments for disease is a complex and time consuming endeavor, requiring a high degree of domain knowledge and specialized equipment. Combining his expertise in machine learning and graph algorithms with his interest in drug discovery Jian Tang created the TorchDrug project to help reduce the amount of time needed to find new candidate molecules for testing. In this episode he explains how the project is being used by machine learning researchers and biochemists to collaborate on finding effective treatments for real-world diseases. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Jian Tang about TorchDrug Interview Introductions How did you get introduced to Python? Can you describe what TorchDrug is and the story behind it? What are the goals of the TorchDrug project? Who are the target users of the project? What are the main ways that it is being used? What are the challenges faced by biologists and chemists working on development and discovery of pharmaceuticals? What are some of the other tools/techniques that they would use (in isolation or combination with TorchDrug)? Can you describe how TorchDrug is implemented? 
How have you approached the design of the project and its APIs to make it accessible to engineers that don’t possess domain expertise in drug discovery research? How do graph structures help when modeling and experimenting with chemical structures for drug discovery? What are the formats and sources of data that you are working with? What are some of the complexities/challenges that you have had to deal with to integrate with up or downstream systems to fit into the overall research process? Can you talk through the workflow of using TorchDrug to build and validate a model? What is involved in determining and codifying a goal state for the model to optimize for? What are the biggest open questions in the area of drug discovery and research? How is TorchDrug being used to assist in the exploration of those problems? What are the most interesting, unexpected, or challenging lessons that you have learned while working on TorchDrug? When is TorchDrug the wrong choice? What do you have planned for the future of TorchDrug? Keep In Touch tangjianpku on GitHub @tangjianpku on Twitter Website LinkedIn Picks Tobias Rope refactoring library Jian Attending conferences once the pandemic is over Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links TorchDrug Mila Yoshua Bengio Alphafold Few-shot learning Metalearning PyTorch Geometric DeepGraph Library NetworKit Podcast Episode graph-tool Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/30/2021 • 44 minutes, 30 seconds

An Exploration Of Automated Speech Recognition

Summary The overwhelming growth of smartphones, smart speakers, and spoken word content has corresponded with increasingly sophisticated machine learning models for recognizing speech content in audio data. Dylan Fox founded Assembly to provide access to the most advanced automated speech recognition models for developers to incorporate into their own products. In this episode he gives an overview of the current state of the art for automated speech recognition, the varying requirements for accuracy and speed of models depending on the context in which they are used, and what is required to build a special purpose model for your own ASR applications. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Dylan Fox about the challenges of training and deploying large models for automated speech recognition Interview Introductions How did you get introduced to Python? What is involved in building an ASR model? How does the complexity/difficulty compare to models for other data formats? (e.g. computer vision, NLP, NER, etc.) How have ASR models changed over the last 5, 10, 15 years? What are some other categories of ML applications that work with audio data? How does the level of complexity compare to ASR applications? 
What is the typical size of an ASR model that you are deploying at Assembly? What are the factors that contribute to the overall size of a given model? How does accuracy compare with model size? How does the size of a model contribute to the overall challenge of deploying/monitoring/scaling it in a production environment? How can startups effectively manage the time/cost that comes with training large models? What are some techniques that you use/attributes that you focus on for feature definitions in the source audio data? Can you describe the lifecycle stages of an ASR model at Assembly? What are the aspects of ASR which are still intractable or impractical to productionize? What are the most interesting, innovative, or unexpected ways that you have seen ASR technology used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on ASR? What are the trends in research or industry that you are keeping an eye on? Keep In Touch LinkedIn @YouveGotFox on Twitter Picks Tobias The Hitman’s Wife’s Bodyguard Dylan Inspiration 4 Documentary Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Learn Python The Hard Way DeepSpeech Wav2Letter BERT GPT-3 Convolutional Neural Network (CNN) Recurrent Neural Network (RNN) Mycroft Podcast Episode CMU Sphinx Pocket Sphinx Gaussian Mixture Model (GMM) Hidden Markov Model (HMM) DeepSpeech Paper Transformer Architecture Audio Analytic Sound Recognition Podcast Episode Horovod distributed training library Knowledge Distillation Libre Speech Data Set Lambda Labs Wav2Vec The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/26/2021 • 54 minutes, 1 second

Experimenting With Reinforcement Learning Using MushroomRL

Summary Reinforcement learning is a branch of machine learning and AI that has a lot of promise for applications that need to evolve with changes to their inputs. To support the research happening in the field, including applications for robotics, Carlo D’Eramo and Davide Tateo created MushroomRL. In this episode they share how they have designed the project to be easy to work with, so that students can use it in their study, as well as extensible so that it can be used by businesses and industry professionals. They also discuss the strengths of reinforcement learning, how to design problems that can leverage its capabilities, and how to get started with MushroomRL for your own work. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Davide Tateo and Carlo D’Eramo about MushroomRL, a library for building reinforcement learning experiments Interview Introductions How did you get introduced to Python? Can you start by describing what reinforcement learning is and how it differs from other approaches for machine learning? What are some example use cases where reinforcement learning might be necessary? Can you describe what MushroomRL is and the story behind it? Who are the target users of the project? 
What are its main goals? What are your suggestions to other developers for implementing a successful library? What are some of the core concepts that researchers and/or engineers need to understand to be able to effectively use reinforcement learning techniques? Can you describe how MushroomRL is architected? How have the goals and design of the project changed or evolved since you began working on it? What is the workflow for building and executing an experiment with MushroomRL? How do you track the states and outcomes of experiments? What are some of the considerations involved in designing an environment and reward functions for an agent to interact with? What are some of the open questions that are being explored in reinforcement learning? How are you using MushroomRL in your own research? What are the most interesting, innovative, or unexpected ways that you have seen MushroomRL used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on MushroomRL? When is MushroomRL the wrong choice? What do you have planned for the future of MushroomRL? How can the open-source community contribute to MushroomRL? What kind of support are you willing to provide to users? Keep In Touch Davide boris-il-forte on GitHub Website Carlo carloderamo on GitHub Website Picks Tobias Britannia TV Series Davide 1984 by George Orwell Carlo Twin Peaks TV Series Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links MushroomRL TU Darmstadt MuJoCo PyBullet iGibson Habitat OpenAI Gym PyTorch Podcast Episode RLLib Ray Podcast Episode OpenAI Baselines Stable Baselines ROS The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/19/2021 • 54 minutes, 18 seconds

Doing Dask Powered Data Science In The Saturn Cloud

Summary A perennial problem of doing data science is that it works great on your laptop, until it doesn’t. Another problem is being able to recreate your environment to collaborate on a problem with colleagues. Saturn Cloud aims to help with both of those problems by providing an easy to use platform for creating reproducible environments that you can use to build data science workflows and scale them easily with a managed Dask service. In this episode Julia Signell, head of open source at Saturn Cloud, explains how she is working with the product team and PyData community to reduce the points of friction that data scientists encounter as they are getting their work done. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Julia Signell about building distributed processing workflows in Python through the power of Dask Interview Introductions How did you get introduced to Python? Can you describe what you are building at Saturn Cloud? Who are your target users and how does that inform the features and priorities that you build into your platform? What are the road blocks that data scientists typically encounter when working on their laptop/workstation? How does open source factor into the Saturn product? 
What are some of the projects that you are collaborating with/contributing to as part of your work at Saturn? How has your experience at Anaconda informed your work at Saturn? Can you describe how the Saturn Cloud platform is architected? How has it changed or evolved since it was first launched? Can you describe the learning curve that data scientists go through when adopting Dask? What are some examples of projects or workflows that Dask enables which are not possible/practical to do locally? How would you characterize the overall awareness/adoption of Dask in the Python data science community? What are the most interesting, innovative, or unexpected ways that you have seen Saturn Cloud used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Saturn Cloud? When is Saturn Cloud the wrong choice? What do you have planned for the future of Saturn Cloud? Keep In Touch @jsignell on Twitter jsignell on GitHub Picks Tobias Peter Rabbit 2 Julia PawPaw Fruit Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Saturn Cloud Dask Podcast Episode Pangeo XArray Conda Mamba Holoviz Dash Anaconda Podcast Episode Kubernetes Tornado Podcast Episode Prefect Podcast Episode Dagster Podcast Episode Airflow Ray Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/10/2021 • 38 minutes

Monitor The Health Of Your Machine Learning Products In Production With Evidently

Summary You’ve got a machine learning model trained and running in production, but that’s only half of the battle. Are you certain that it is still serving the predictions that you tested? Are the inputs within the range of tolerance that you designed? Monitoring machine learning products is an essential step of the story so that you know when it needs to be retrained against new data, or parameters need to be adjusted. In this episode Emeli Dral shares the work that she and her team at Evidently are doing to build an open source system for tracking and alerting on the health of your ML products in production. She discusses the ways that model drift can occur, the types of metrics that you need to track, and what to do when the health of your system is suffering. This is an important and complex aspect of the machine learning lifecycle, so give it a listen and then try out Evidently for your own projects. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Emeli Dral about monitoring machine learning models in production with Evidently Interview Introductions How did you get introduced to Python? Can you describe what Evidently is and the story behind it? 
What are the metrics that are useful for determining the performance and health of a machine learning model? What are the questions that you are trying to answer with those metrics? How does monitoring of machine learning models compare to monitoring of infrastructure or "traditional" software projects? What are the failure modes for a model? Can you describe the design and implementation of Evidently? How has the architecture changed or evolved since you started working on it? What categories of model is Evidently designed to work with? What are some strategies for making models conducive to monitoring? What is involved in monitoring a model on a continuous basis? What are some considerations when establishing useful thresholds for metrics to alert on? Once an alert has been triggered what is the process for resolving it? If the training process takes a long time, how can you mitigate the impact of a model failure until the new/updated version is deployed? What are the most interesting, innovative, or unexpected ways that you have seen Evidently used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Evidently? When is Evidently the wrong choice? What do you have planned for the future of Evidently? Keep In Touch LinkedIn @EmeliDral on Twitter emeli-dral on GitHub Picks Tobias The Suicide Squad Emeli Airflow Links Evidently AI Open Source Yandex Grafana The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/3/2021 • 50 minutes, 59 seconds

Making Automated Machine Learning More Accessible With EvalML

Summary Building a machine learning model is a process that requires a lot of iteration and trial and error. For certain classes of problem a large portion of the searching and tuning can be automated. This allows data scientists to focus their time on more complex or valuable projects, as well as opening the door for non-specialists to experiment with machine learning. Frustrated with some of the awkward or difficult to use tools for AutoML, Angela Lin and Jeremy Shih helped to create the EvalML framework. In this episode they share the use cases for automated machine learning, how they have designed the EvalML project to be approachable, and how you can use it for building and training your own models. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Angela Lin and Jeremy Shih about EvalML, an AutoML library which builds, optimizes, and evaluates machine learning pipelines Interview Introductions How did you get introduced to Python? Can you describe what EvalML is and the story behind it? What do we mean by the term AutoML? What are the kinds of problems that are best suited to applications of automated ML? What does the landscape for AutoML tools look like? 
What was missing in the available offerings that motivated you and your team to create EvalML? Who is the target audience for EvalML? How is the EvalML project implemented? How has the project changed or evolved since you first began working on it? What is the workflow for building a model with EvalML? Can you describe the preprocessing steps that are necessary and the input formats that it is expecting? What are the supported algorithms/model architectures? How does EvalML explore the search space for an optimal model? What decision functions does it employ to determine an appropriate stopping point? What is involved in operationalizing an AutoML pipeline? What are some challenges or edge cases that you see users of EvalML run into? What are the most interesting, innovative, or unexpected ways that you have seen EvalML used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on EvalML? When is EvalML the wrong choice? When is AutoML the wrong approach? What do you have planned for the future of EvalML? Keep In Touch Angela angela97lin on GitHub LinkedIn Jeremy jeremyliweishih on GitHub LinkedIn Picks Tobias Gloryhammer Angela Sarma mediterranean restaurant Jeremy Crucial Conversations by Stephen Covey (affiliate link) Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links EvalML FeatureLabs Alteryx Scheme NetLogo Flask AutoML Woodwork FeatureTools Compose Random Forest XGBoost Prophet GreyKite Shap The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/25/2021 • 45 minutes, 53 seconds

Growing And Supporting The Data Science Community At Anaconda

Summary Data scientists are tasked with answering challenging questions using data that is often messy and incomplete. Anaconda is on a mission to make the lives of data professionals more manageable through creation and maintenance of high quality libraries and frameworks, the distribution of an easy to use Python distribution and package ecosystem, and high quality training material. In this episode Kevin Goldsmith, CTO of Anaconda, discusses the technical and social challenges faced by data scientists, the ways that the Python ecosystem has evolved to help address those difficulties, and how Anaconda is engaging with the community to provide high quality tools and education for this constantly changing practice. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Kevin Goldsmith about Anaconda’s contributions to the Python ecosystem for data science Interview Introductions How did you get introduced to Python? Can you start by describing what Anaconda focuses on solving for? What was your path into the CTO position? From your perspective as the CTO of Anaconda, what are the biggest challenges facing data scientists today? What is the breakdown between technical and organizational sources for those difficulties? 
How is the Anaconda product suite architected to help address some of those problems? Where are you spending your focus to allow Anaconda to address the current and future needs of data scientists? Python has been a dominant force in the data and analytics ecosystem for several years now. What do you see as the future of the space? (e.g. monoglot vs. polyglot workflows) What are the most interesting, innovative, or unexpected ways that you have seen the Anaconda platform used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Anaconda and data science tooling? Keep In Touch LinkedIn @KevinGoldsmith on Twitter Website Picks Tobias Perdido Street Station The Scar Iron Council Kevin Lego Typewriter Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Anaconda Spotify Lisp Scheme C# Anaconda Nucleus PyData AnacondaCon Grid Computing PyTorch Podcast Episode Tensorflow Pyston Podcast Episode Dask Podcast Episode Numba Panel dashboard framework Datashader Jupyter R Julia AstroPy Podcast Episode Arrow Data Teams by Jesse Anderson Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/19/2021 • 55 minutes, 48 seconds

Network Analysis At The Speed Of C With The Power Of Python Using NetworKit

Summary Analysing networks is a growing area of research in academia and industry. In order to be able to answer questions about large or complex relationships it is necessary to have fast and efficient algorithms that can process the data quickly. In this episode Eugenio Angriman discusses his contributions to the NetworKit library to provide an accessible interface for these algorithms. He shares how he is using NetworKit for his own research, the challenges of working with large and complex networks, and the kinds of questions that can be answered with data that fits on your laptop. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Eugenio Angriman about NetworKit, an open-source toolkit for large-scale network analysis Interview Introductions How did you get introduced to Python? Can you describe what NetworKit is and the story behind it? A core focus of the project is for use with graphs containing millions to billions of nodes. What are some of the situations where you might encounter networks of that scale? There are a number of network analysis libraries in Python. How would you characterize NetworKit’s position in the ecosystem? 
What are the algorithmic challenges that graph structures pose when aiming for scalability and performance? How do you approach building efficient algorithms for complex network analysis? Can you describe how NetworKit is architected? What are the design principles that you focus on for the library? How have the design and goals of the project changed or evolved since you have been working on it? NetworKit’s code base has now reached a considerable size and several developers have contributed to it. Are there any minimum quality requirements that new code needs to fulfill before it can be merged into NetworKit? How do you ensure that such requirements are met? What are some of the active areas of research for networked data analysis? How are you using NetworKit for your own work? What kind of background knowledge in graph analysis is necessary for users of NetworKit? What are some of the underutilized or overlooked aspects of NetworKit that you think should be highlighted? What are the most interesting, innovative, or unexpected ways that you have seen NetworKit used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on NetworKit? When is NetworKit the wrong choice? What do you have planned for the future of NetworKit? Keep In Touch angriman on GitHub LinkedIn Picks Tobias Edgar Allan Poe NetworKit The Spinoza Problem Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links NetworKit Humboldt University Berlin graph-tool Podcast Episode NetworkX Adjacency List Cython Podcast Episode Node Embeddings Centrality Score NetworKit In The Cloud Gunrock Hornet The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/15/2021 • 37 minutes, 7 seconds

Delivering Deep Learning Powered Speech Recognition As A Service For Developers At AssemblyAI

Summary Building a software-as-a-service (SaaS) business is a fairly well understood pattern at this point. When the core of the service is a set of machine learning products it introduces a whole new set of challenges. In this episode Dylan Fox shares his experience building Assembly AI as a reliable and affordable option for automatic speech recognition that caters to a developer audience. He discusses the machine learning development and deployment processes that his team relies on, the scalability and performance considerations that deep learning models introduce, and the user experience design that goes into building for a developer audience. This is a fascinating conversation about a unique cross-section of considerations and how Dylan and his team are building an impressive and useful service. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Dylan Fox about AssemblyAI, a powerful and easy to use speech recognition API designed for developers Interview Introductions How did you get introduced to Python? Can you describe what Assembly AI is and the story behind it? Speech recognition is a service that is being added to every cloud platform, video service, and podcast product. 
What do you see as the motivating factors for the current growth in this industry? How would you characterize your overall position in the market? What are the core goals that you are focused on with AssemblyAI? Can you describe the different ways that you are using Python across the company? How is the AssemblyAI platform architected? What are the complexities that you have to work around to maintain high uptime for an API powered by a deep learning model? What are the scaling challenges that crop up, whether in training or serving? What are the axes for improvement for a speech recognition model? How do you balance tradeoffs of speed and accuracy as you iterate on the model? What is your process for managing the deep learning workflow? How do you manage CI/CD for your deep learning models? What are the open areas of research in speech recognition? What are the most interesting, innovative, or unexpected ways that you have seen AssemblyAI used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on AssemblyAI? When is AssemblyAI the wrong choice? What do you have planned for the future of AssemblyAI? Keep In Touch LinkedIn @YouveGotFox on Twitter Picks Tobias H.P. Lovecraft Dylan Project Hail Mary by Andy Weir Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links AssemblyAI Two Scoops of Django Nuance Dragon Natural Speaking PyTorch Podcast Episode Tensorflow FastAPI Flask Tornado Podcast Episode Neural Magic Podcast Episode The Martian The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/4/2021 • 52 minutes, 20 seconds

Taking Aim At The Legacy Of SQL With The Preql Relational Language

Summary SQL has gone through many cycles of popularity and disfavor. Despite its longevity it is objectively challenging to work with in a collaborative and composable manner. In order to address these shortcomings and build a new interface for your database oriented workloads Erez Shinan created Preql. It is based on the same relational algebra that inspired SQL, but brings in more robust computer science principles to make it more manageable as you scale in complexity. In this episode he shares his motivation for creating the Preql project, how he has used Python to develop a new language for interacting with database engines, and the challenges of taking on the legacy of SQL as an individual. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Erez Shinan about Preql, an interpreted, relational programming language that specializes in database queries Interview Introductions How did you get introduced to Python? Can you describe what Preql is and the story behind it? What are the goals and target use cases for the project? There have been numerous projects that aim to make SQL more maintainable and composable. What is it about the language and syntax that makes it so challenging? 
How does Preql approach this problem that is different from other efforts? (e.g. ORMs, dbt-style Jinja, PyPika) How did you approach the design of the syntax to make it familiar to people who know SQL? Can you describe how Preql is implemented? How has the design and architecture changed or evolved since you began working on it? What is a typical workflow for someone using Preql to build a library of analytical queries? Beyond strict compilation to SQL, what are some of the other features that you have incorporated into Preql? How does a Preql program get executed against a target database, particularly when using capabilities that can’t be directly translated to SQL? What are the main difficulties/challenges of compiling to SQL? What are some of the features or use cases that are not immediately obvious or prone to be overlooked that you think are worth mentioning? What are the most interesting, innovative, or unexpected ways that you have seen Preql used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Preql? When is Preql the wrong choice? What do you have planned for the future of Preql? Keep In Touch erezsh on GitHub erezsh on Twitter Picks Tobias Counterpart Erez Bansko, Bulgaria Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Preql Lark Postgres Data Engineering Podcast Episode MySQL Relational Algebra Pandas Podcast Episode ORM == Object Relational Mapper dbt Data Engineering Podcast Episode PyPika GraphQL Julia runtype Rich terminal UI library prompt-toolkit DuckDB Askgit BigQuery Snowflake The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/28/2021 • 36 minutes, 38 seconds

Unleash The Power Of Dataframes At Any Scale With Modin

Summary When you start working on a data project there are always a variety of unknown factors that you have to explore. One of those is the volume of total data that you will eventually need to handle, and the speed and scale at which it will need to be processed. If you optimize for scale too early then it adds a high barrier to entry due to the complexities of distributed systems, but if you invest in a lot of engineering up front then it can be challenging to refactor for scale. Modin is a project that aims to remove that decision by letting you seamlessly replace your existing Pandas code and scale across CPU cores or across a cluster of machines. In this episode Devin Petersohn explains why he started working on solving this problem, how Modin is architected to allow for a smooth escalation from small to large volumes of data and compute, and how you can start using it today to accelerate your Pandas workflows. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Devin Petersohn about Modin, a Pandas compatible dataframe library for datasets from 1MB to 1TB+ Interview Introductions How did you get introduced to Python? Can you describe what Modin is and the story behind it? Why study dataframes? 
How do dataframes compare to databases? What can you do in a dataframe that you couldn’t in a database? What are your overall goals for the Modin project? Who are the target users of Modin and how does that influence your prioritization of features? What are some of the API inconsistencies that you have had to abstract and work around between Pandas, Ray, and Dask to give users a seamless experience? What are some of the considerations in terms of capabilities or user experience that will influence whether to use Ray or Dask as the execution engine? Can you describe how Modin is implemented? How has the constraint of replicating the Pandas API influenced your architectural choices? What are the most complex or challenging Pandas APIs to replicate in Modin? In addition to the core Pandas API you have also added experimental features such as SQL support and a spreadsheet interface. How have those capabilities affected the range of potential use cases and end users? What are some of the complexities that come from acting as a middleware between the Pandas API and the Ray and Dask frameworks? What are some of the initial ideas or assumptions that you had about the design or utility of Modin that have been challenged as you worked through building and releasing it? What are the most interesting, innovative, or unexpected ways that you have seen Modin used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Modin? When is Modin the wrong choice? What do you have planned for the future of Modin? Keep In Touch devin-petersohn on GitHub LinkedIn Picks Tobias xxh Devin Lux Podcast Episode Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. 
If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Modin UC Berkeley RISELAB XArray Pandas Podcast Episode Dask Podcast Episode Ray Podcast Episode Spark The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/22/2021 • 38 minutes, 53 seconds

Exploring The SpeechBrain Toolkit For Speech Processing

Summary With the rising availability of computation in everyday devices, there has been a corresponding increase in the appetite for voice as the primary interface. To accommodate this desire we need high quality libraries for processing and generating audio data that can make sense of human speech. To facilitate research and industry applications for speech data Mirco Ravanelli and Peter Plantinga are building SpeechBrain. In this episode they explain how it works under the hood, the projects that they are using it for, and how you can get started with it today. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Mirco Ravanelli and Peter Plantinga about SpeechBrain, an open-source and all-in-one speech toolkit powered by PyTorch Interview Introductions How did you get introduced to Python? Can you describe what SpeechBrain is and the story behind it? What are the goals and target use cases of the SpeechBrain project? What are some of the ways that processing audio with a focus on speech differs from more general audio processing? 
What are some of the other libraries/frameworks/services that are available to work with speech data and what are the unique capabilities that SpeechBrain offers? How is SpeechBrain implemented? What was your decision process for determining which framework to build on top of? What are some of the original ideas and assumptions that you had for SpeechBrain which have been changed or invalidated as you worked through implementing it? Can you talk through the workflow of using SpeechBrain? What would be involved in developing a system to automate transcription with speaker recognition and diarization? In the documentation it mentions that SpeechBrain is built to be used for research purposes. What are some of the kinds of research that it is being used for? What are some of the features or capabilities of SpeechBrain which might be non-obvious that you would like to highlight? What are the most interesting, innovative, or unexpected ways that you have seen SpeechBrain used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on SpeechBrain? When is SpeechBrain the wrong choice? What do you have planned for the future of SpeechBrain? Keep In Touch Mirco mravanelli on GitHub LinkedIn @mirco_ravanelli on Twitter Peter pplantinga on GitHub @ComPeterScience on Twitter Website LinkedIn Picks Tobias x.ai Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links SpeechBrain Mila Speech Processing Speech Enhancement NumPy SciPy Theano PyTorch Podcast Episode Speech Recognition NeMo ESPNet Sequence to Sequence (Seq2Seq) HyperParameters TorchAudio PyTorch Lightning Keras HuggingFace Generative Adversarial Network Snorkel Data Engineering Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/14/2021 • 37 minutes, 26 seconds

Fast And Educational Exploration And Analysis Of Graph Data Structures With graph-tool

Summary If you are interested in a library for working with graph structures that will also help you learn more about the research and theory behind the algorithms then look no further than graph-tool. In this episode Tiago Peixoto shares his work on graph algorithms and networked data and how he has built graph-tool to help in that research. He explains how it is implemented, how it evolved from a simple command line tool to a full-fledged library, and the benefits that he has found from building a personal project in the open. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Tiago Peixoto about graph-tool, an efficient Python module for manipulation and statistical analysis of graphs Interview Introductions How did you get introduced to Python? Can you describe what graph-tool is and the story behind it? What are some scenarios where someone might encounter a graph oriented data set? In what ways are those graphs typically represented? In your experience, what is the overlap of people who are working with networked data, and the use of graph-native databases? (e.g. Neo4J, DGraph, etc.) What kinds of analysis or manipulation might someone need to perform on a graph structure? 
There are a few different tools in Python for working with networked data. How would you characterize the current ecosystem and why someone might choose graph-tool? Can you describe how graph-tool is implemented? How have the goals and design of the package changed or evolved since you first began working on it? Who are your target users and what are the guiding principles that you use to inform the API design for the package? How much knowledge of graph theory or algorithms is required to make effective use of graph-tool? Can you talk through an example workflow of using graph-tool to load, process, and analyze a graph? What are some of the overlooked or underutilized aspects of graph-tool that you think more people should know about? What are some systems/applications that you have seen which would be simplified by adopting a graph model for their data? What is your impression of the overall awareness of the benefits of graphs for simplifying aspects of data processing and analysis? What are some cases where a graph structure adds unnecessary complexity? What are the most interesting, innovative, or unexpected ways that you have seen graph-tool used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on graph-tool? When is graph-tool the wrong choice? What do you have planned for the future of graph-tool? Keep In Touch Website graph-tool Picks Tobias 97 Things Every Data Engineer Should Know Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Central European University NetworkX GML GraphML Neo4J DGraph Data Engineering Podcast Episode NetworKit igraph Matplotlib C++ Templates Boost Graph Library OpenMP Maximum Matching The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/7/2021 • 41 minutes, 59 seconds

Lightening The Load For Deep Learning With Sparse Networks Using Neural Magic

Summary Deep learning has largely taken over the research and applications of artificial intelligence, with some truly impressive results. The challenge that it presents is that for reasonable speed and performance it requires specialized hardware, generally in the form of a dedicated GPU (Graphics Processing Unit). This raises the cost of the infrastructure, adds deployment complexity, and drastically increases the energy requirements for training and serving of models. To address these challenges Nir Shavit combined his experiences in multi-core computing and brain science to co-found Neural Magic where he is leading the efforts to build a set of tools that prune dense neural networks to allow them to execute on commodity CPU hardware. In this episode he explains how sparsification of deep learning models works, the potential that it unlocks for making machine learning and specialized AI more accessible, and how you can start using it today. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Nir Shavit about Neural Magic and the benefits of using sparsification techniques for deep learning models Interview Introductions How did you get introduced to Python? 
Can you describe what Neural Magic is and the story behind it? What are the attributes of deep learning architectures that influence the bias toward GPU hardware for training them? What are the mathematical aspects of neural networks that have biased the current generation of software tools toward that architectural style? How does sparsifying a network architecture allow for improved performance on commodity CPU architectures? What is involved in converting a dense neural network into a sparse network? Can you describe the components of the Neural Magic architecture and how they are used together to reduce the footprint of deep learning architectures and accelerate their performance on CPUs? What are some of the goals or design approaches that have changed or evolved since you first began working on the Neural Magic platform? For someone who has an existing model defined, what is the process to convert it to run with the DeepSparse engine? What are some of the options for applications of deep learning that are unlocked by enabling the models to train and run without GPU or other specialized hardware? The current set of components for Neural Magic is either open source or free to use. What is your long-term business model, and how are you approaching governance of the open source projects? What are the most interesting, innovative, or unexpected ways that you have seen Neural Magic and model sparsification used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Neural Magic? When is Neural Magic or sparse networks the wrong choice? What do you have planned for the future of Neural Magic? Keep In Touch Research Overview LinkedIn Picks Tobias The Tick TV show Nir Bauhaus documentary Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. 
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Neural Magic MIT Computational Neurobiology 6.006 MIT Course FLOPS == FLoating point OPerations per Second Perceptron Convolutional Neural Network Lisp Quantization of ML YOLO ML Model Federated Learning Podcast Episode Reinforcement Learning GPT-3 OpenAI Transfer Learning Podcast Episode about Transfer Learning for NLP Tensor Columns Neural Magic DeepSparse Engine ONNX CUDA Sparse Zoo Tab9 The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/30/2021 • 48 minutes, 32 seconds

Finding The Core Of Python For A Bright Future With Brett Cannon

Summary Brett Cannon has been a long-time contributor to the Python language and community in many ways. In this episode he shares some of his work and thoughts on modernizing the ecosystem around the language. This includes standards for packaging, discovering the true core of the language, and how to make it possible to target mobile and web platforms. Announcements Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Are you bored with writing scripts to move data into SaaS tools like Salesforce, Marketo, or Facebook Ads? Hightouch is the easiest way to sync data into the platforms that your business teams rely on. The data you’re looking for is already in your data warehouse and BI tools. Connect your warehouse to Hightouch, paste a SQL query, and use their visual mapper to specify how data should appear in your SaaS systems. No more scripts, just SQL. Supercharge your business teams with customer data using Hightouch for Reverse ETL today. Get started for free at pythonpodcast.com/hightouch. Your host as usual is Tobias Macey and today I’m interviewing Brett Cannon about improvements in the packaging ecosystem, the promise of WebAssembly, and his recent explorations of CPython’s interpreter Interview Introductions How did you get introduced to Python? 
As a core contributor to CPython, a member of the Steering Council, and the team lead for VSCode’s Python extension, what are your current areas of focus for the language? One of the PEPs that you were involved with recently introduced the pyproject.toml file for simplifying the work of building Python packages. Can you share some of the background behind that work and the goals that you had for it? Since its introduction a lot of people have co-opted that file for other project configuration. What was your reaction to that, and if you had foreseen that usage what might you have changed or added in the PEP to account for it? What are the long term impacts on the packaging ecosystem that you anticipate with the standardization efforts that are happening? Another area where there is a lot of attention right now is being able to target additional deployment environments such as the browser, with WebAssembly, and mobile devices, with projects like Briefcase and Kivy. You had a recent post where you posed some questions about the true nature of Python and the possibility of removing pieces of it to simplify building for these other runtimes. What is your personal sense of the minimal set of features that we need for something to still be Python? How have projects such as MicroPython and Pyodide influenced your thinking on the matter? You have also recently been writing a series of articles about the implementation details of different syntactic elements of Python. What was your inspiration for that? What are some of the interesting or surprising details that you encountered while unwrapping the way that the interpreter handles those syntactic elements? How have those explorations helped you in your efforts to identify the core of Python? Recent releases of Python have brought in some substantial changes to the interpreter and new language features (e.g. PEG parser, pattern matching). What are some of the other large initiatives that you are keeping track of? 
What are your personal goals for the near to medium term future of Python? What are the most interesting, unexpected, or challenging lessons that you have learned while working on the Python language and related tooling? If you were to redesign Python today, what are some of the things that you would do differently? Keep In Touch brettcannon on GitHub @brettsky on Twitter Blog Picks Tobias Cold Brew Iced Tea Loki on Disney+ Brett Rich Textual The physics facts included in all of the Python 3.10 release announcements, e.g. you will never see a green star Links Brett’s Blog Python VSCode Extension Python Steering Council Python Package Authority UC Berkeley Vancouver, BC Squamish, Musqueam, Tsleil-Waututh First Nations Pascal Python C O’Reilly PyCon US 2021 Steering Council Keynote Python Developer-In-Residence PSF Visionary Sponsorship Setuptools Pip Python Wheels PyPI PEP 518 PEP 517 PEP 621 pyproject.toml Flit Enscons PyPA Build PyOxidizer Pex Shiv cx_Freeze cibuildwheel Thomas Kluyver Poetry Vaults of Parnassus MicroPython Podcast Episode CircuitPython Podcast Episode Desugaring Python Blog Series JupyterHub Pyodide JupyterLite ANSI C99 PyPy Jython IPython ncurses Kivy Briefcase Toga PEP 401 The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/23/2021 • 1 hour, 3 minutes, 18 seconds

Traversing The Challenges And Promise Of Graph Machine Learning

Summary The foundation of every ML model is the data that it is trained on. In many cases you will be working with tabular or unstructured information, but there is a growing trend toward networked, or graph data sets. Benedek Rozemberczki has focused his research and career around graph machine learning applications. In this episode he discusses the common sources of networked data, the challenges of working with graph data in machine learning projects, and describes the libraries that he has created to help him in his work. If you are dealing with connected data then this interview will provide a wealth of context and resources to improve your projects. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Are you bored with writing scripts to move data into SaaS tools like Salesforce, Marketo, or Facebook Ads? Hightouch is the easiest way to sync data into the platforms that your business teams rely on. The data you’re looking for is already in your data warehouse and BI tools. Connect your warehouse to Hightouch, paste a SQL query, and use their visual mapper to specify how data should appear in your SaaS systems. No more scripts, just SQL. Supercharge your business teams with customer data using Hightouch for Reverse ETL today. 
Get started for free at pythonpodcast.com/hightouch. Your host as usual is Tobias Macey and today I’m interviewing Benedek Rozemberczki about his work on machine learning for graph data, including a variety of libraries to support his efforts Interview Introductions How did you get introduced to Python? Can you start by giving an overview of when you might want to do machine learning on networked/graph data? How do networked data sets change the way that you approach machine learning tasks? Can you describe the current state of the ecosystem for machine learning on graphs? You have created a number of libraries to address different aspects of machine learning on graphs. Can you list them and share some of the stories behind their creation? How do the different tools relate to each other? Can you talk through some of the structural and user experience design principles that you lean on when building these libraries? When you are working with networked data sets, what is your current workflow from idea to completion? What are the most difficult aspects of working with networked data sets for machine learning applications? What are the most interesting, innovative, or unexpected ways that you have seen graph ML used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on graph ML problems? What are some examples of when you would choose not to use some or all of your own libraries? What do you have planned for the future of your libraries/what new libraries do you anticipate needing to build? Keep In Touch benedekrozemberczki on GitHub @benrozemberczki on Twitter LinkedIn Picks Tobias Wrath of Man Benedek Hunt for the Wilderpeople Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. 
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Karate Club PyTorch Geometric Temporal AstraZeneca Budapest University of Edinburgh Matlab R Bipartite Graph Node Classification Graph Classification PyTorch Podcast Episode PyTorch Geometric DGL (Deep Graph Library) Parametric Machine Learning graph-tool Jax NetworkX Little Ball of Fur GCN == Graph Convolutional Network NetworKit Gensim Podcast Episode Nvidia cuGraph Random Walk scikit-learn MalNet Graph Representation Learning by William Hamilton The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/16/2021 • 47 minutes, 47 seconds

Keep Your Analytics Lint Free With SQLFluff

Summary The growth of analytics has accelerated the use of SQL as a first class language. It has also grown the amount of collaboration involved in writing and maintaining SQL queries. With collaboration comes the inevitable variation in how queries are written, both structurally and stylistically, which can lead to a significant amount of wasted time and energy during code review and employee onboarding. Alan Cruickshank was feeling the pain of this wasted effort first-hand, which led him down the path of creating SQLFluff as a linter and formatter to enforce consistency and find bugs in the SQL code that he and his team were working with. In this episode he shares the story of how SQLFluff evolved from a simple hackathon project to an open source linter that is used across a range of companies and fosters a growing community of users and contributors. He explains how it has grown to support multiple dialects of SQL, as well as integrating with projects like DBT to handle templated queries. This is a great conversation about the long detours that are sometimes necessary to reach your original destination and the powerful impact that good tooling can have on team productivity. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! 
We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial. Your host as usual is Tobias Macey and today I’m interviewing Alan Cruickshank about SQLFluff, a dialect-flexible and configurable SQL linter Interview Introductions How did you get introduced to Python? Can you describe what SQLFluff is and the story behind it? SQL is one of the oldest programming languages that is still in regular use. Why do you think that there are so few linters for it? Who are the target users of SQLFluff and how do those personas influence the design and user experience of the project? What are some of the characteristics of SQL and how it is used that contribute to readability/comprehension challenges? What are some of the additional difficulties that are introduced by templating in the queries? How is SQLFluff implemented? How have the goals and design of the project changed since you first began working on it? How do you handle support of varying SQL dialects without undue maintenance burdens? What are some of the stylistic elements and strategies for making SQL code more maintainable? What are some strategies for making queries self-documenting? What are some signs that you should document it anyway? What are some of the kinds of bugs that you are able to identify with SQLFluff? What are some of the resources/references that you relied on for identifying useful linting rules? 
What are some methods for measuring code quality in SQL? What are the most interesting, innovative, or unexpected ways that you have seen SQLFluff used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on SQLFluff? When is SQLFluff the wrong choice? What do you have planned for the future of SQLFluff? Keep In Touch alanmcruickshank on GitHub Website LinkedIn Picks Tobias The Nevers Alan Lost Connections: Uncovering the Real Causes of Depression – and the Unexpected Solutions by Johann Hari (affiliate link) The Wim Hof Method by Wim Hof Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links SQLFluff Tails.com Hypothesis Podcast Episode Project Euler Flake8 Podcast Episode Black dbt Data Engineering Podcast Episode Snowflake Data Engineering Podcast Episode BigQuery SQL Window Functions ANSI SQL PostgreSQL MS SQL Server Oracle DB Airflow SQL Subquery Common Table Expression (CTE) The Rise Of The Data Engineer blog post The Downfall Of The Data Engineer blog post Object-Relational Mapper (ORM) Tableau Fishtown Analytics SQL Styleguide Mozilla SQL Styleguide The Zen of Python dbt Packages yapf Set Theory Flake8 SQL Plugin The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/9/2021 • 1 hour, 13 minutes, 13 seconds

Exploring The Patterns And Practices For Deep Learning With Andrew Ferlitsch

Summary Deep learning is gaining an immense amount of popularity due to the incredible results that it is able to offer with comparatively little effort. Because of this there are a number of engineers who are trying their hand at building machine learning models with the wealth of frameworks that are available. Andrew Ferlitsch wrote a book to capture the useful patterns and best practices for building models with deep learning to make it more approachable for newcomers to the field. In this episode he shares his deep expertise and extensive experience in building and teaching machine learning across many companies and industries. This is an entertaining and educational conversation about how to build maintainable models across a variety of applications. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. 
With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial. Scaling your data infrastructure is hard. Maintaining data quality standards as you scale is harder. Databand solves this. Their Unified Data Observability platform gives data engineers visibility over their stack without changing existing pipeline code. Get end-to-end visibility on your pipelines, and identify the root cause of issues before bad data is delivered. Seamlessly integrate with over 20 tools like Apache Airflow, Spark, Snowflake, and more. Use customizable dashboards to see where pipelines are broken and how that impacts delivery downstream. Get alerts on leading indicators of pipeline failure. Open up your pipeline and see exactly which code strings are broken – so you can fix the issue immediately. Create more reliable data products. Go to pythonpodcast.com/databand today to start your free trial! Your host as usual is Tobias Macey and today I’m interviewing Andrew Ferlitsch about the patterns and practices for deep learning applications Interview Introductions How did you get introduced to Python? Can you start by describing the major elements of a model architecture? What is the relationship between the specific learning task being addressed and the architecture of the learning network? In your experience, what is the level of awareness of a typical ML engineer or data scientist with respect to the most current design patterns in deep learning? You’re currently working on a book about deep learning patterns and practices. What was your motivation for starting that project? What are your goals for the book? How have advancements in the operability of machine learning influenced the ways that the models are designed and trained? 
How do recent approaches such as transfer learning impact the needs of the supporting tools and infrastructure? Can you describe the different design patterns that you cover in your book and the selection process for when and how to apply them? What are the aspects of bringing deep learning to production that continue to be a challenge? What are some of the emerging practices that you are optimistic about? What are some of the industry trends or areas of current research that you are most excited about? What are the most interesting, innovative, or unexpected patterns that you have encountered? What are the most interesting, unexpected, or challenging lessons that you have learned while working on the book? What are some of the other resources that you recommend for listeners to learn more about how to build production ready models? Keep In Touch LinkedIn @AndrewFerlitsch on Twitter andrewferlitsch on GitHub Picks Tobias Designing Data Intensive Applications (affiliate link) Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Google Cloud AI Sharp Corporation Deep Learning Patterns and Practices (affiliate link) use the code podinit21 at checkout for 35% off all books at Manning! 
CID Bioscience Latent Space AI Winter Numerical Stability Surrogate Model GAN == Generative Adversarial Network Gradient Descent The Gang of 4 – Design Patterns: Elements of Reusable Object-Oriented Software (affiliate link) The Lottery Hypothesis Manning Publications (affiliate link) The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/2/2021 • 44 minutes, 19 seconds

Automatically Generate Your Unit Tests From Scratch With Pynguin

Summary Unit tests are an important tool to ensure the proper functioning of your application, but writing them can be a chore. Stephan Lukasczyk wants to reduce the monotony of the process for Python developers. As part of his PhD research he created the Pynguin project to automate the creation of unit tests. In this episode he explains the complexity involved in generating useful tests for a dynamic language, how he has designed Pynguin to address the challenges, and how you can start using it today for your own work. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial. 
Are you bored with writing scripts to move data into SaaS tools like Salesforce, Marketo, or Facebook Ads? Hightouch is the easiest way to sync data into the platforms that your business teams rely on. The data you’re looking for is already in your data warehouse and BI tools. Connect your warehouse to Hightouch, paste a SQL query, and use their visual mapper to specify how data should appear in your SaaS systems. No more scripts, just SQL. Supercharge your business teams with customer data using Hightouch for Reverse ETL today. Get started for free at pythonpodcast.com/hightouch. Your host as usual is Tobias Macey and today I’m interviewing Stephan Lukasczyk about Pynguin, the PYthoN General UnIt test geNerator Interview Introductions How did you get introduced to Python? Can you describe what Pynguin is and the story behind it? What are the problems that Pynguin is designed to solve? What other projects are you drawing inspiration from? What are some of the use cases for automatic test generation? How is Pynguin implemented? What are the challenges that the dynamic nature of Python introduces? What are some of the packages and libraries that have been most helpful while building Pynguin? Can you talk through the workflow of using Pynguin to generate tests for a project? What are some of the limitations on what kinds of projects Pynguin can be used for? What are some design or implementation strategies in the code that you are generating tests for that will help make Pynguin’s job easier? Once a test suite has been created, what are the next steps? What are some of the initial assumptions or goals of the project that have been revised or challenged once you began implementing it? What are the most interesting, innovative, or unexpected ways that you have seen Pynguin used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Pynguin? When is Pynguin the wrong choice? 
What do you have planned for the future of Pynguin? Keep In Touch Related to Pynguin: best via GitHub Find me on Twitter Picks Tobias Concourse CI Stephan Cycling Take care of your health Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Pynguin University of Passau Passau, Germany Evosuite Hypothesis Podcast Episode Astor Walrus Operator MyPy Podcast Episode Pytest Podcast Episode UnitTest Bytecode library Pytype Monkeytype Podcast Episode Atheris from Google – coverage-guided fuzzing Blog series about “Python behind the scenes”: Ten thousand meters by Victor Skvortsov The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/25/2021 • 57 minutes, 40 seconds

Leveling Up Natural Language Processing with Transfer Learning

Summary Natural language processing is a powerful tool for extracting insights from large volumes of text. With the growth of the internet and social platforms, and the increasing number of people and communities conducting their professional and personal activities online, the opportunities for NLP to create amazing insights and experiences are endless. In order to work with such a large and growing corpus it has become necessary to move beyond purely statistical methods and embrace the capabilities of deep learning, and transfer learning in particular. In this episode Paul Azunre shares his journey into the application and implementation of transfer learning for natural language processing. This is a fascinating look at the possibilities of emerging machine learning techniques for transforming the ways that we interact with technology. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? 
Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial. Are you bored with writing scripts to move data into SaaS tools like Salesforce, Marketo, or Facebook Ads? Hightouch is the easiest way to sync data into the platforms that your business teams rely on. The data you’re looking for is already in your data warehouse and BI tools. Connect your warehouse to Hightouch, paste a SQL query, and use their visual mapper to specify how data should appear in your SaaS systems. No more scripts, just SQL. Supercharge your business teams with customer data using Hightouch for Reverse ETL today. Get started for free at pythonpodcast.com/hightouch. Your host as usual is Tobias Macey and today I’m interviewing Paul Azunre about using transfer learning for natural language processing Interview Introductions How did you get introduced to Python? Can you start by explaining what transfer learning is? How is transfer learning being applied to natural language processing? What motivated you to write a book about the application of transfer learning to NLP? What are some of the applications of NLP that are impractical or intractable without transfer learning? At a high level, what are the steps for building a new language model via transfer learning? There have been a number of base models created recently, such as BERT and ERNIE, ELMo, GPT-3, etc. What are the factors that need to be considered when selecting which model to build from? If there are multiple models that contain the seeds for different aspects of the end goal that you are trying to obtain, what is the feasibility of extracting the relevant capabilities from each of them and combining them in the final model? 
What are some of the tools or frameworks that you have found most useful while working with NLP and transfer learning? How would you characterize the current state of the ecosystem for transfer learning and deep learning techniques applied to NLP problems? What are the most interesting, innovative, or unexpected applications of transfer learning with NLP that you have seen? What are the most interesting, unexpected, or challenging lessons that you have learned while working on the book? When is transfer learning the wrong choice for an NLP project? What are the trends or techniques that you are most excited for? Keep In Touch LinkedIn Website @pazunre on Twitter Picks Tobias Infected Mushroom Paul Tenet Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Transfer Learning for Natural Language Processing by Paul Azunre (affiliate link) Use the code podinit21 at checkout for 35% off all books at Manning! 
Low Resource Languages Fortran C++ MatLab MIT 6.003 Transfer Learning Computer Vision Deep Neural Network Convolutional Neural Network (CNN) Recurrent Neural Network (RNN) GLUE == General Language Understanding Evaluation NLP SuperGLUE NLP Encoder Named Entity Recognition ImageNet Mathematical Optimization Gradient Descent Yonder AI ELMo language model from Allen NLP Ghana ArXiv BERT language model TF-IDF == Term Frequency – Inverse Document Frequency Word2Vec GPT-3 Ghana NLP Automatic Speech Recognition ULM Fit Keras Tensorflow Huggingface Transformers Multi-Task Learning Fast.ai OpenAI AWS SageMaker Kaggle Kernels Colab Notebooks Azure ML Studio BLEU Score Khaya application Android iOS The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/18/2021 • 46 minutes, 34 seconds

Federated Learning For All With Flower

Summary Machine learning is a tool that has typically been performed on large volumes of data in one place. As more computing happens at the edge on mobile and low power devices, the learning is being federated which brings a new set of challenges. Daniel Beutel co-created the Flower framework to make federated learning more manageable. In this episode he shares his motivations for starting the project, how you can use it for your own work, and the unique challenges and benefits that this emerging model offers. This is a great exploration of the federated learning space and a framework that makes it more approachable. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. 
Go to pythonpodcast.com/census today to get a free 14-day trial. Are you bored with writing scripts to move data into SaaS tools like Salesforce, Marketo, or Facebook Ads? Hightouch is the easiest way to sync data into the platforms that your business teams rely on. The data you’re looking for is already in your data warehouse and BI tools. Connect your warehouse to Hightouch, paste a SQL query, and use their visual mapper to specify how data should appear in your SaaS systems. No more scripts, just SQL. Supercharge your business teams with customer data using Hightouch for Reverse ETL today. Get started for free at pythonpodcast.com/hightouch. Your host as usual is Tobias Macey and today I’m interviewing Daniel Beutel about Flower, a framework for building federated learning systems Interview Introductions How did you get introduced to Python? Can you start by describing what federated learning is? What is Flower and what’s the story behind it? What are the trade-offs between federated and centralized models of machine learning? What are some of the types of use cases or workloads that federated learning is used for? Federated learning appears to be a growing area of interest. How would you characterize the current state of the ecosystem? What are the most complex or challenging aspects of federating model training? How does Flower simplify the process of distributing the model training process? Can you describe how Flower is implemented? How have the goals and/or design of Flower changed or evolved since you first began working on it? One of the design principles that you list is "understandability". What are some of the ways that that manifests in the project? It also mentions extensibility. What are the interfaces that Flower exposes for integration or extending its capabilities? For someone who has an existing project that runs in a centralized manner, what are some indicators that a federated approach would be beneficial? 
What is involved in translating the existing project to run in a federated fashion using Flower? What is involved in building a production ready system with Flower? How does your work at Adap inform the design and product direction for Flower? What are some of the most interesting, innovative, or unexpected ways that you have seen Flower used? What are the most interesting, unexpected, or challenging lessons that you have learned from your work on and with Flower? When is Flower the wrong choice? What do you have planned for the future of the project? Keep In Touch LinkedIn danieljanes on GitHub @daniel_janes on Twitter Picks Tobias Rummy Card Game Daniel Stand Up Paddling Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Flower Adap Hyperparameter Optimization Federated Learning University of Oxford University of Cambridge Nvidia Jetson PyTorch Podcast Episode Tensorflow Lite Tensorflow Federated PySyft Flower Summit Jax CNN == Convolutional Neural Network Keras gRPC MQTT NumPy NDArray AWS Device Farm Ray Framework Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/11/2021 • 1 hour, 1 minute, 28 seconds

Data Exploration and Visualization Made Effortless with Lux

Summary Data exploration is an important step in any analysis or machine learning project. Visualizing the data that you are working with makes that exploration faster and more effective, but having to remember and write all of the code to build a scatter plot or histogram is tedious and time consuming. In order to eliminate that friction Doris Lee helped create the Lux project, which wraps your Pandas data frame and automatically generates a set of visualizations without you having to lift a finger. In this episode she explains how Lux works under the hood, what inspired her to create it in the first place, and how it can help you create a better end result. The Lux project is a valuable addition to the toolbox of anyone who is doing data wrangling with Pandas. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. 
With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial. Are you bored with writing scripts to move data into SaaS tools like Salesforce, Marketo, or Facebook Ads? Hightouch is the easiest way to sync data into the platforms that your business teams rely on. The data you’re looking for is already in your data warehouse and BI tools. Connect your warehouse to Hightouch, paste a SQL query, and use their visual mapper to specify how data should appear in your SaaS systems. No more scripts, just SQL. Supercharge your business teams with customer data using Hightouch for Reverse ETL today. Get started for free at pythonpodcast.com/hightouch. Your host as usual is Tobias Macey and today I’m interviewing Doris Lee about Lux, a Python library that facilitates fast and easy data exploration by automating the visualization and data analysis process Interview Introductions How did you get introduced to Python? Can you start by describing what Lux is and how the project got started? What is the role of visualization in a data science workflow? What are the challenges that data scientists face in the exploratory phase of their analysis? There are a wide variety of data visualization tools in the Python ecosystem with differing areas of focus. What is the role of Lux in that ecosystem? How does Lux compare to tools such as scikit-yb? What is the workflow for someone using Lux in their analysis and what problems does it solve for them? Can you talk through how Lux is architected? How have the goals and design of Lux changed or evolved since you first began working on it? Data visualization is a broad field. How do you determine which kinds of charts or plots are best suited to a particular data set or exploration? What are some of the capabilities of Lux that are often overlooked or underutilized? 
How has Lux impacted your own work in data analysis/data science? What are some of the other gaps that you see in the available tooling for data science? What are some of the most interesting, innovative, or unexpected ways that you have seen Lux used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on and with Lux? When is Lux the wrong choice? What do you have planned for the future of the project? Keep In Touch dorisjlee on GitHub Website LinkedIn Picks Tobias Pirates of the Caribbean movies Doris Snake Wrangling for Kids Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Lux UC Berkeley RISE Lab School of Information Pandas Podcast Episode Bokeh Podcast Episode Seaborn Altair Podcast Episode Matplotlib Grammar of Graphics Plotly Scikit YellowBrick Podcast Episode D3.js Vega Numpy xarray Tensorflow Jupyter Widget Choropleth Map G10 Countries Ray Podcast Episode Modin Dask Data Engineering Podcast Episode Podcast Interview About Coiled The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/4/2021 • 51 minutes, 5 seconds

Extensible Open Source Authorization For Your Applications With Oso

Summary Any project that is used by more than one person will eventually need to handle permissions for each of those users. It is certainly possible to write that logic yourself, but you’ll almost certainly do it wrong at least once. Rather than waste your time fighting with bugs in your authorization code it makes sense to use a well-maintained library that has already made and fixed all of the mistakes so that you don’t have to. In this episode Sam Scott shares the Oso framework to give you a clean separation between your authorization policies and your application code. He explains how you can call a simple function to ask if something is allowed, and then manage the complex rules that match your particular needs as a separate concern. He describes the motivation for building a domain specific language based on logic programming for policy definitions, how it integrates with the host language (such as Python), and how you can start using it in your own applications today. This is a must listen even if you never use the project because it is a great exploration of all of the incidental complexity that is involved in permissions management. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! 
We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial. Are you bored with writing scripts to move data into SaaS tools like Salesforce, Marketo, or Facebook Ads? Hightouch is the easiest way to sync data into the platforms that your business teams rely on. The data you’re looking for is already in your data warehouse and BI tools. Connect your warehouse to Hightouch, paste a SQL query, and use their visual mapper to specify how data should appear in your SaaS systems. No more scripts, just SQL. Supercharge your business teams with customer data using Hightouch for Reverse ETL today. Get started for free at pythonpodcast.com/hightouch. Your host as usual is Tobias Macey and today I’m interviewing Sam Scott about Oso, an open source library for managing authorization in your applications Interview Introductions How did you get introduced to Python? Can you start by describing what Oso is and the story behind it? What was missing from the ecosystem of authorization libraries/frameworks that motivated you to create a new one? What are some of the most common mistakes that you see developers make when implementing authorization logic? At a high level, what is the process of using Oso to add access control policies to a piece of software? What is the motivation for using a DSL for defining policies as opposed to writing those definitions in pure Python? 
How have you approached the design of the policy language, particularly deciding what constraints to impose? What other policy frameworks or dialects have you drawn inspiration from? How is the Oso framework implemented? How have the goals and design of Oso changed or evolved since you first began working on it? What are some useful design patterns for integrating Oso into an application? How does the type of application (e.g. web app vs. system daemon, etc.) affect the ways that Oso is used? Given that Oso supports multiple language runtimes, what is involved in defining and enforcing policies that span multiple processes? (e.g. Python backend and Javascript frontend, Python microservice communicating with Go microservice, etc.) What are some of the common mistakes or areas of confusion for users who are getting started with Oso and Polar? What are some of the capabilities of Oso that are often overlooked or misunderstood? I noticed that you’re backed by some venture firms. What is your current product vision and how does that relate to your current open source goals? What are some of the most interesting, innovative, or unexpected ways that you have seen Oso used? What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on and with Oso? When is Oso the wrong choice? What do you have planned for the future of the project? Keep In Touch LinkedIn samscott89 on GitHub @samososos on Twitter Picks Tobias Chaos Walking Sam Hades video game Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Links Oso Oso Authorization Academy Number Theory Sage Math RBAC == Role-Based Access Control ABAC == Attribute-Based Access Control Polar Policy Language Prolog Logic Programming Open Policy Agent AWS IAM XACML Google Zanzibar Rust Web Assembly (Wasm) OAuth Scopes The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/27/2021 • 51 minutes, 49 seconds

Teaching Geeks The Value And Skills Of Public Speaking

Summary Being able to present your ideas is one of the most valuable and powerful skills to have as a professional, regardless of your industry. For software engineers it is especially important to be able to communicate clearly and effectively because of the detail-oriented nature of the work. Unfortunately, many people who work in software are more comfortable in front of the keyboard than a crowd. In this episode Neil Thompson shares his story of being an accidental public speaker and how he is helping other engineers start down the road of being effective presenters. He discusses the benefits for your career, how to build the skills, and how to find opportunities to practice them. Even if you never want to speak at a conference, it’s still worth your while to listen to Neil’s advice and find ways to level up your presentation and speaking skills. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? 
Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial. Are you bored with writing scripts to move data into SaaS tools like Salesforce, Marketo, or Facebook Ads? Hightouch is the easiest way to sync data into the platforms that your business teams rely on. The data you’re looking for is already in your data warehouse and BI tools. Connect your warehouse to Hightouch, paste a SQL query, and use their visual mapper to specify how data should appear in your SaaS systems. No more scripts, just SQL. Supercharge your business teams with customer data using Hightouch for Reverse ETL today. Get started for free at pythonpodcast.com/hightouch. Your host as usual is Tobias Macey and today I’m interviewing Neil Thompson about the value of public speaking skills as a developer and how to gain them Interview Introductions How did you get into engineering? Can you start by discussing the different types of public speaking that we are talking about and some of the different venues where it might take place? How did you get into public speaking? What are some of the ways that our speaking abilities can impact the value that we provide and the trajectory of our career as engineers? What were some of the methods and resources that you used to improve your own public speaking skills? What are the common mistakes that people make when speaking to a group? What are some of the non-obvious ways that speaking skills can be useful as an engineer? What was your approach to learning how to be an effective speaker? What are some of the mis-steps or dead ends that you encountered? What are the different skills or capabilities that are necessary for being an effective presenter? What are some ways that engineers can practice their presentation skills? 
How do different audiences/venues influence the way that you prepare for a presentation? How has your experience in public speaking factored into the work you do for your podcast? What are some of the most interesting, innovative, or unexpected presentations or speaking techniques that you have seen or used/created? What are the most interesting, unexpected, or challenging lessons that you have learned from speaking and teaching others to speak in a professional context? What resources do you recommend for engineers who want to improve their speaking and presenting skills? Keep In Touch LinkedIn @neil_i_thompson on Twitter Picks Tobias Falcon and the Winter Soldier Neil Teach The Geek To Speak Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Links Materials Science Toastmasters Teach The Geek To Speak Teach The Geek Podcast Developer Advocate The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/20/2021 • 42 minutes, 54 seconds

Let The Robots Do The Work Using Robotic Process Automation with Robocorp

Summary One of the great promises of computers is that they will make our work faster and easier, so why do we all spend so much time manually copying data from websites, or entering information into web forms, or any of the other tedious tasks that take up our time? As developers our first inclination is to "just write a script" to automate things, but how do you share that with your non-technical co-workers? In this episode Antti Karjalainen, CEO and co-founder of Robocorp, explains how Robotic Process Automation (RPA) can help us all cut down on time-wasting tasks and let the computers do what they’re supposed to. He shares how he got involved in the RPA industry, his work with Robot Framework and RPA framework, how to build and distribute bots, and how to decide if a task is worth automating. If you’re sick of spending your time on mind-numbing copy and paste then give this episode a listen and then let the robots do the work for you. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? 
But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial. Software is read more than it is written, so complex and poorly organized logic slows down everyone who has to work with it. Sourcery makes those problems a thing of the past, giving you automatic refactoring recommendations in your IDE or text editor while you write (I even have it working in Emacs). It isn’t just another linting tool that nags you about issues. It’s like pair programming with a senior engineer, finding and applying structural improvements to your functions so that you can write cleaner code faster. Best of all, listeners of Podcast.__init__ get 6 months of their Pro tier for free if you go to pythonpodcast.com/sourcery today and use the promo code INIT when you sign up. Your host as usual is Tobias Macey and today I’m interviewing Antti Karjalainen about the RPA Framework for automating your daily tasks and his work at Robocorp to manage your robots in production Interview Introductions How did you get introduced to Python? Can you start by giving an overview of what Robotic Process Automation is? What are some of the ways that RPA might be used? What are the advantages over writing a custom library or script in Python to automate a given task? How does the functionality of RPA compare to automation services like Zapier, IFTTT, etc.? What are you building at Robocorp and what was your motivation for starting the business? Who is your target customer and how does that inform the products that you are building? Can you give an overview of the state of the ecosystem for RPA tools and products and how Robocorp and RPA framework fit within it? 
How does the RPA Framework relate to Robot Framework? What are some of the challenges that developers and end users often run into when trying to build, use, or implement an RPA system? How is the RPA framework itself implemented? How has the design of the project evolved since you first began working on it? Can you talk through an example workflow for building a robot? Once you have built a robot, what are some of the considerations for local execution or deploying it to a production environment? How can you chain together multiple robots? What is involved in extending the set of operations available in the framework? What are the available integration points for plugging a robot written with RPA Framework into another Python project? What are the dividing lines between RPA Framework and Robocorp? How are you handling the governance of the open source project? What are some of the most interesting, innovative, or unexpected ways that you have seen RPA Framework and the Robocorp platform used? What are the most interesting, unexpected, or challenging lessons that you have learned while building and growing RPA Framework and the Robocorp business? When are RPA and RPA Framework the wrong choice for automation? What do you have planned for the future of the framework and business? Keep In Touch aikarjal on GitHub @aikarjal on Twitter LinkedIn Picks Tobias WandaVision Antti Tenet Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Links Robocorp RPA Framework RCC Robotic Process Automation Zapier IFTTT (If This Then That) Robot Framework Selenium Playwright Conda Micro Mamba PyOxidizer Podcast Episode XKCD "Is It Worth The Time?" XKCD Automation Curve The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/13/2021 • 45 minutes, 33 seconds

Keep Your Code Clean And Maintainable Using Static Analysis With Flake8

Summary When you are writing code it is all too easy to introduce subtle bugs or leave behind problems such as unused variables, unused imports, and overly complex logic. If you are careful and diligent you can find these problems yourself, but isn’t that what computers are supposed to help you with? Thankfully Python has a wealth of tools that will work with you to keep your code clean and maintainable. In this episode Anthony Sottile explores Flake8, one of the most popular options for identifying those problematic lines of code. He shares how he became involved in the project and took over as maintainer and explains the different categories of code quality tooling and how Flake8 compares to other static analyzers. He also discusses the ecosystem of plugins that has grown up around it, including some detailed examples of how you can write your own (and why you might want to). Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? 
But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial. Your host as usual is Tobias Macey and today I’m interviewing Anthony Sottile about Flake8 Interview Introductions How did you get introduced to Python? Can you start by giving an overview of what Flake8 is and how you got involved with the project? There are a variety of tools available for checking or enforcing code quality. How would you characterize Flake8 in comparison to the other options? What do you see as the motivating factors for individuals or teams to integrate static analysis/linting in their toolchain and workflow? What are some of the challenges that might prevent someone from adopting something like Flake8? How can developers add Flake8 to an existing project without spending hours or days fixing all of the violations? Can you describe the overall design and implementation of Flake8? How has the design and goals of the project changed or evolved? There are a wide array of plugins for Flake8. What is involved in adding new functionality or linting rules? What capabilities does Flake8 provide that make it a viable platform for building plugins? What are some of the limitations of Flake8 as a platform? What do you see as the factors that have contributed to the widespread usage of Flake8 and the large number of available plugins? What challenges does that pose as a maintainer of Flake8? What are some of the other tools that you see developers use alongside Flake8 to help manage code quality and style enforcement? What are some of the most interesting, innovative, or unexpected ways that you have seen Flake8 and its plugin ecosystem used? 
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Flake8? When is Flake8 the wrong choice? What do you have planned for the future of Flake8? Keep In Touch @codewithanthony on Twitter asottile on GitHub LinkedIn Picks Tobias SEVENEVES by Neal Stephenson Anthony pre-commit CI Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Links Flake8 PyFlakes PyCodestyle McCabe pre-commit Podcast Episode PEP 484 MyPy Pylance Pyright Pylint Black yapf autopep8 pyupgrade isort reorder-python-imports Static Analysis pydocstyle autoflake pyproject.toml Abstract Syntax Tree Concrete Syntax Tree Dagster The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/6/2021 • 49 minutes, 31 seconds

Make Your Code More Readable With The Magic Of Refactoring Using Sourcery

Summary Writing code that is easy to read and understand will have a lasting impact on you and your teammates over the life of a project. Sometimes it can be difficult to identify opportunities for simplifying a block of code, especially if you are early in your journey as a developer. If you work with senior engineers they can help by pointing out ways to refactor your code to be more readable, but they aren’t always available. Brendan Maginnis and Nick Thapen created Sourcery to act as a full time pair programmer sitting in your editor of choice, offering suggestions and automatically refactoring your Python code. In this episode they share their journey of building a tool to automatically find opportunities for refactoring in your code, including how it works under the hood, the types of refactoring that it supports currently, and how you can start using it in your own work today. It always pays to keep your tool box organized and your tools sharp and Sourcery is definitely worth adding to your repertoire. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? 
Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial. Your host as usual is Tobias Macey and today I’m interviewing Nick Thapen and Brendan Maginnis about Sourcery, an advanced refactoring engine that cleans up your code as you work Interview Introductions How did you get introduced to Python? Can you start by giving an overview of what Sourcery is? What was your motivation for building a system for performing automated refactoring? What are your goals and priorities with Sourcery? There are a number of services that aim to automate portions of the developer workflow, such as code completions, quality checks, refactoring, etc. What was lacking in the existing tooling that made Sourcery a necessary project? How does Sourcery compare with some of the other services that offer AI or ML powered assistance? (e.g. Kite, Tab9, Codata(?)) What was your reasoning for focusing solely on Python for your refactoring, rather than trying to support multiple language targets? Can you give some examples of the types of refactoring that you are able to automate? Can you describe how Sourcery is implemented? What are some of the ways that the system has changed or evolved in design and/or scope? What are some examples of the types of refactorings that Sourcery is ill-suited for and which still require manual intervention? What is involved in adding support for a new editor? How much variation is there in terms of implementation or available functionality across editors? How has the introduction of the Language Server Protocol influenced your approach to editor integration? 
What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on Sourcery? When is Sourcery the wrong choice? What do you have planned for the future of Sourcery? Keep In Touch Nick LinkedIn @nthapen on Twitter Brendan LinkedIn @brendan_m6s on Twitter brendanator on GitHub Picks Tobias The Croods: New Age Nick The Magicians TV Series Brendan David Copperfield by Charles Dickens Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Links Sourcery IBM RPG Delphi Java Scala PyTorch Podcast Episode NLP == Natural Language Processing Tensorflow Language Server Protocol Kent Beck Martin Fowler MyPy Clojure Lisp Abstract Syntax Tree ASTroid Podcast Episode Rope Sans I/O pre-commit framework Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/30/2021 • 1 hour, 58 seconds

Be Data Driven At Any Scale With Superset

Summary Becoming data driven is the stated goal of a large and growing number of organizations. In order to achieve that mission they need a reliable and scalable method of accessing and analyzing the data that they have. While business intelligence solutions have been around for ages, they don’t all work well with the systems that we rely on today and a majority of them are not open source. Superset is a Python powered platform for exploring your data and building rich interactive dashboards that gets the information that your organization needs in front of the people that need it. In this episode Maxime Beauchemin, the creator of Superset, shares how the project got started and why it has become such a widely used and popular option for exploring and sharing data at companies of all sizes. He also explains how it functions, how you can customize it to fit your specific needs, and how to get it up and running in your own environment. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? 
But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial. Your host as usual is Tobias Macey and today I’m interviewing Max Beauchemin about Superset, an open source platform for data exploration and visualization Interview Introductions How did you get introduced to Python? Can you start by giving an overview of what Superset is and what it might be used for? What problem were you trying to solve when you created it? What tools or platforms did you consider before deciding to build something new? There are a few different ways that someone might categorize Superset, such as business intelligence, data exploration, dashboarding, data visualization. How would you characterize it and how it fits in the current state of the industry and ecosystem? What are some of the lessons that you have learned from your work on Airflow that you applied to Superset? Can you give an overview of how Superset is implemented? How have the goals, design and architecture evolved since you first began working on it? Given its origin as a hackathon project the choice of Python seems natural. What are some of the challenges that choice has posed over the life of the project? If you were to start the whole project over today what might you do differently? Can you describe what’s involved in getting started with a new setup of Superset? What are the available interfaces and integration points for someone who wants to extend it or add new functionality? What are some of the most often overlooked, misunderstood, or underused capabilities of Superset? 
One of the perennial challenges with a tool that allows users to build data visualizations is the potential to build dashboards or charts that are visually appealing but ultimately meaningless or wrong. How much guidance does Superset provide in helping to select a useful representation of the data? In addition to being the original author and a project maintainer you have also started a company to offer Superset as a service. What are your goals with that business and what is the opportunity that it provides? What are some of the most interesting, innovative, or unexpected ways that you have seen Superset used? What are the most interesting, unexpected, or challenging lessons that you have learned while building and growing the Superset project and community? When is Superset the wrong choice? What do you have planned for the future of Superset and Preset? Keep In Touch LinkedIn @mistercrunch on Twitter mistercrunch on GitHub Picks Tobias SOPS Max Frank Zappa Documentary Accelerate: The Science of Lean Software and DevOps Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Links Superset Preset Blog Airflow Podcast Episode AirBnB Lyft Django Flask CRUD == Create, Read, Update, Delete Business Intelligence Apache Druid Presto Trino (formerly known as Presto SQL) Redash Podcast Episode Looker Data Engineering Podcast Episode Metabase Data Engineering Podcast Episode Flask App Builder React Redux Typescript GraphQL Celery Redis RabbitMQ S3 AirBnB Superset Blog Post D3 The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/22/2021 • 47 minutes, 33 seconds

Practical Advice On Using Python To Power A Business

Summary Python is a language that is used in almost every imaginable context and by people from an amazing range of backgrounds. A lot of the people who use it wouldn’t even call themselves programmers, because that is not the primary focus of their job. In this episode Chris Moffitt shares his experience writing Python as a business user. In order to share his insights and help others who have run up against the limits of Excel he maintains the site Practical Business Python where he publishes articles that help introduce newcomers to Python and explain how to perform tasks such as building reports, automating Excel files, and doing data analysis. This is a great conversation that illustrates how useful it is to learn Python even if you never intend to write software professionally. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. 
With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to pythonpodcast.com/census today to get a free 14-day trial. Your host as usual is Tobias Macey and today I’m interviewing Chris Moffitt about how Python is used to help manage business needs and processes and his work to share advice on this topic at Practical Business Python Interview Introductions How did you get introduced to Python? Can you start by giving an overview of your mission at Practical Business Python? What was your inspiration for starting the site and what keeps you motivated? What are some of the kinds of problems that a business user is looking to solve for themselves? Why is Python a viable tool for a business user to become familiar with? How would you characterize the difference between the ways that a software engineer and a business user approach Python? What do you see as the tipping point of complexity or time investment past which a business user will pass a given project on to a software engineer? How much familiarity with adjacent concerns such as version control, software design, etc. do you consider useful for a business user? What are some of the ways that you use Python in your day-to-day? What are some of the onramps for integrating Python into a user’s workflow? What are some common stumbling blocks that business users run into when getting started with Python? What are some of the most interesting, innovative, or impressive ways that you have seen Python employed by business users? What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on the Practical Business Python site? What are some cases where you would advocate for a tool other than Python for a business use case? What do you have planned for the future of the site? 
Keep In Touch LinkedIn chris1610 on GitHub @chris1610 on Twitter Picks Tobias The Data Science Roundup Newsletter This Week In Data Newsletter Chris Moffitt Line Of Duty BBC Series Out Of The Dark by David Weber Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Practical Business Python blog Electrical Engineering Unix Perl Data Science Django Raspberry Pi Pandas Excel VBA == Visual Basic for Applications VSCode Excel PowerFX Pathlib Conda Python Wheels PEP 582 SAP Salesforce Tableau Prophet library for timeseries forecasting Talk Python Course Moving From Excel To Python The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/16/2021 • 49 minutes, 30 seconds

Analyzing The Ecosystem of Python Data Companies With Tony Liu

Summary There are a large and growing number of businesses built by and for data science and machine learning teams that rely on Python. Tony Liu is a venture investor who is following that market closely and betting on its continued success. In this episode he shares his own journey into the role of an investor and discusses what he is most excited about in the industry. He also explains what he looks at when investing in a business and gives advice on what potential founders and early employees of startups should be thinking about when starting on that journey. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Tony Liu about his perspectives on the landscape of Python in the data ecosystem from his role as an investor Interview Introductions How did you get introduced to Python? Can you start by sharing your background in the data ecosystem? What led you to your current role as a venture investor? What is your current area of focus in your investments? What do you see as the major strengths of Python in the current landscape for data and analytics? What are the areas where the ecosystem is still lacking? Where are you seeing growth in the space and what do you see as the motivating factors? 
As an investor, what are the qualities that you look for in a startup that is trying to compete in the data ecosystem? What is your process for learning about and identifying companies that demonstrate the potential to succeed? Do you focus on a particular problem domain and research a grouping of companies that are focused on that problem, or do you start from a given company to determine where to place your bets? How has COVID changed the competitive landscape? Can you share some of the companies that you have invested in? What was notable about their respective businesses that provided you with the confidence that they were worth investing in? What are some of the most interesting, unexpected, or challenging lessons that you have learned from your experience as a venture investor? What are some of the companies that you are keeping a close eye on, whether as potential investments or as competitors to your existing portfolio? What are some of the problem spaces that you would like to see companies try to tackle? What advice do you have for engineers who might be considering building a new business? Do you have any advice for engineers who are working at a startup as to how best to compete in the current market? Keep In Touch LinkedIn Picks Tobias The Sleepover movie What do ya do with a Bernie Sanders? music video Tony Uncut Gems Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Costanoa Ventures Sports Analytics Turo Databricks Koalas DataRobot Faust Podcast Episode Oozie Azkaban Airflow Podcast Episode Prefect Data Engineering Podcast Episode Dagster Podcast Episode Data Engineering Podcast Episode Kubeflow MLFlow Metaflow Podcast Episode Pandas Podcast Episode Spark Data Engineering Podcast Episode DBT Data Engineering Podcast Episode SnowflakeDB Data Engineering Podcast Episode Coiled Podcast Episode Noteable Dask Data Engineering Podcast Episode Data Engineering Podcast Episode About Notebooks at Netflix The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/9/2021 • 39 minutes, 30 seconds

Go From Notebook To Pipeline For Your Data Science Projects With Orchest

Summary Jupyter notebooks are a dominant tool for data scientists, but they lack a number of conveniences for building reusable and maintainable systems. For machine learning projects in particular there is a need for being able to pivot from exploring a particular dataset or problem to integrating that solution into a larger workflow. Rick Lamers and Yannick Perrenet were tired of struggling with one-off solutions when they created the Orchest platform. In this episode they explain how Orchest allows you to turn your notebooks into executable components that are integrated into a graph of execution for running end-to-end machine learning workflows. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Rick Lamers and Yannick Perrenet about Orchest, a development environment designed for building data science pipelines from notebooks and scripts. Interview Introductions How did you get introduced to Python? Can you start by giving an overview of what Orchest is and the story behind it? Who are the users that you are building Orchest for and what are their biggest challenges? What are some examples of the types of tools or workflows that they are using now? 
What are some of the other tools or strategies in the data science ecosystem that Orchest might replace? (e.g. MLFlow, Metaflow, etc.) What problems does Orchest solve? Can you describe how Orchest is implemented? How have the design and goals of the project changed since you first started working on it? What is the workflow for someone who is using Orchest? What are some of the sharp edges that they might run into? What is the deployable unit once a pipeline has been created? How do you handle verification and promotion of pipelines across staging and production environments? What are the interfaces available for integrating with or extending Orchest? How might an organization incorporate a pipeline defined in Orchest with the rest of their data orchestration workflows? How are you approaching governance and sustainability of the Orchest project? What are the most interesting, innovative, or unexpected ways that you have seen Orchest used? What are the most interesting, unexpected, or challenging lessons that you have learned while building Orchest? When is Orchest the wrong choice? What do you have planned for the future of the project and company? 
Keep In Touch Rick ricklamers on GitHub LinkedIn @RickLamers on Twitter Yannick yannickperrenet on GitHub LinkedIn Picks Tobias Fresh Bagels Rick Vaex Yannick Cookiecutter Pyenv Links Orchest Geoffrey Hinton Yann LeCun CoffeeScript Vim GAN == Generative Adversarial Network Git SQL BigQuery Software Carpentry Podcast Episode Google Colab Airflow Podcast Episode Kedro Data Engineering Podcast Episode nbdev Podcast Episode Papermill Data Engineering Podcast Episode MLFlow Metaflow Podcast Episode DVC Podcast Episode Andrew Ng Kubeflow Lua Caddy Traefik DAG == Directed Acyclic Graph Jupyter Enterprise Gateway Streamlit Kubernetes Dagster Podcast.__init__ Episode Data Engineering Podcast Episode DBT Data Engineering Podcast Episode GitLab Spark ETL The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/2/2021 • 44 minutes, 24 seconds

Write Your Python Scripts In A Flow Based Visual Editor With Ryven

Summary When you are writing a script it can become difficult to follow how the logic and data flow through the program. To make this easier to follow you can use a flow-based approach to building your programs. Leon Thomm created the Ryven project as an environment for visually constructing a flow-based program. In this episode he shares his inspiration for creating the Ryven project, how it changes the way you think about program design, how Ryven is implemented, and how to get started with it for your own programs. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Leon Thomm about Ryven, a flow-based visual scripting environment for Python Interview Introductions How did you get introduced to Python? Can you start by giving an overview of what Ryven is and what inspired you to create it? What is flow-based visual scripting? What are other popular flow-based visual scripting systems out there and have they been inspiring to the project? What problem(s) do these try to solve? What are some of the places where you are drawing inspiration for Ryven? What are the kinds of projects that someone might build with Ryven? How are you using Ryven in your personal projects? 
How does structuring a project as a set of nodes in a flow graph influence the way that you think about how to design the solution to a problem? Can you describe how Ryven is implemented? How has the design or goals of the project changed or evolved since you first began working on it? For someone who wants to use Ryven to build a project can you describe their workflow? How do you handle things like code quality and tests for a Ryven project? How do you manage collaboration for a Ryven project? (e.g. version control) What are some of the most interesting, innovative, or unexpected ways that you have seen Ryven used? What are the most interesting, unexpected, or challenging lessons that you have learned while building Ryven? When is Ryven the wrong choice? What do you have planned for the future of the project? Keep In Touch leon-thomm on GitHub Picks Tobias PyInfra Leon A Universe from Nothing! by Lawrence M. Krauss Links Ryven Switzerland Qt C++ framework Flow-based Scripting Unreal Engine Node-RED IFTTT == IF This Then That DAG == Directed Acyclic Graph Mind Map Literate Programming nbdev Podcast Episode Org Mode OpenCV scikit-learn Unreal Python The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/23/2021 • 47 minutes, 21 seconds

CrossHair: Your Automatic Pair Programmer

Summary One of the perennial challenges in software engineering is to reduce the opportunity for bugs to creep into the system. Some of the tools in our arsenal that help in this endeavor include rich type systems, static analysis, writing tests, well defined interfaces, and linting. Phillip Schanely created the CrossHair project in order to add another ally in the fight against broken code. It sits somewhere between type systems, automated test generation, and static analysis. In this episode he explains his motivation for creating it, how he uses it for his own projects, and how to start incorporating it into yours. He also discusses the utility of writing contracts for your functions, and the differences between property based testing and SMT solvers. This is an interesting and informative conversation about some of the more nuanced aspects of how to write well-behaved programs. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Phillip Schanely about CrossHair, an analysis tool for Python that blurs the line between testing and type systems. Interview Introductions How did you get introduced to Python? Can you start by giving an overview of what the CrossHair project is and how it got started? 
What are some examples of the types of tools that CrossHair might augment or replace? (e.g. Pydantic, Doctest, etc.) What are the categories of bugs or problems in your code that CrossHair can help to identify or discover? Can you explain the benefits of implementing contracts in your software? What are the limitations of contract implementations? What are the available interfaces for creating and validating contracts? How does the use of contracts in your software influence the overall design of the system? How does CrossHair compare to type systems in terms of use cases or capabilities? Can you describe how CrossHair is implemented? How has the design or goal of CrossHair changed or evolved since you first began working on it? What are some of the other projects that you have gained inspiration or ideas from while working on CrossHair? (inside or outside of the Python ecosystem) For someone who wants to get started with CrossHair, can you talk through the developer workflow? I noticed that you recently added support for validating the functional equivalency of different method implementations. What was the inspiration for that capability? What kinds of use cases does that enable? How much of CrossHair are you able to dogfood while developing CrossHair? What are some of the most interesting, innovative, or unexpected ways that you have seen CrossHair used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on CrossHair? When is CrossHair the wrong choice? What do you have planned for the future of the project? Keep In Touch pschanely on GitHub @pschanely on Twitter LinkedIn Picks Tobias The War With Grandpa Phillip Hammock chairs! (affiliate link) Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. 
If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links CrossHair NLTK == Natural Language ToolKit ACL2 Liquid Haskell SMT Solver Doctest Property Based Testing Hypothesis Podcast Episode Halting Problem Pydantic PEP 316 icontract Eiffel programming language Design By Contract Metamorphic Testing Higher Order Types Fuzz Testing The Fuzzing Book Python Audit Hooks GitHub Scientist Laboratory Python implementation of GitHub Scientist Podcast Episode Taint Analysis The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/16/2021 • 42 minutes, 53 seconds

Giving Your Data Science Projects And Teams A Home At DagsHub

Summary Collaborating on software projects is largely a solved problem, with a variety of hosted or self-managed platforms to choose from. For data science projects, collaboration is still an open question. There are a number of projects that aim to bring collaboration to data science, but they are all solving a different aspect of the problem. Dean Pleban and Guy Smoilovsky created DagsHub to give individuals and teams a place to store and version their code, data, and models. In this episode they explain how DagsHub is designed to make it easier to create and track machine learning experiments, and serve as a way to promote collaboration on open source data science projects. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Dean Pleban and Guy Smoilovsky about DagsHub, a platform to track experiments, and version data, models & pipelines for your data science and machine learning projects. Interview Introduction How did you first get introduced to Python? Can you start by describing what the DagsHub platform is and why you built it? There are a number of projects and platforms that aim to support collaboration among data scientists. 
What are the distinguishing features of DagsHub and how does it compare to the other options in the ecosystem? What are the biggest opportunities for improvement that you still see in the space of collaboration on data projects? What do you see as the biggest points of friction for building experiments and managing source data collaboratively? Can you describe how the DagsHub platform is implemented? How have the design and goals of the system changed or evolved since you first began working on it? How have your own understanding and practices of working on data science/ML projects changed? GitHub has a number of convenience features beyond just storing a git repository. What are the capabilities that you are focusing on to add value to the data science workflow within DagsHub? How are you approaching the bootstrapping problem of building a critical mass of users to be able to generate a beneficial network effect? Are there any conventions that make it easier or more familiar for newcomers to a given project? (e.g. code layout, data labeling/tagging formats, etc.) What are your recommendations for managing ownership/licensing of data assets in public projects? What are some of the most interesting, innovative, or unexpected ways that you have seen DagsHub used? What are the most interesting, unexpected, or challenging lessons that you have learned while building DagsHub? When is DagsHub the wrong choice? What do you have planned for the future of the platform and business? Keep In Touch Follow us on Twitter or LinkedIn, join our Discord, sign up to DAGsHub @DeanPlbn @Guy_T_Sky @TheRealDAGsHub DagsHub Discord Picks Tobias The Remarkable Journey of Prince Jen by Lloyd Alexander Dean Quantum Computing Since Democritus by Scott Aaronson The Expanse TV Series Guy Try to consume only the very best of available content, not the things that are coming out right now. 
Applies to textbooks, TV shows, movies Less Wrong blog Slate Star Codex / Astral Codex Ten Avatar: The Last Airbender 3 Blue 1 Brown YouTube Channel Haskell Clojure Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links DagsHub DVC Podcast Episode Data Science Cookiecutter Jupyter Notebooks Papers With Code Connected Papers The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/9/2021 • 59 minutes, 20 seconds

Exploring Literate Programming For Python Projects With nbdev

Summary Creating well designed software is largely a problem of context and understanding. The majority of programming environments rely on documentation, tests, and code being logically separated despite being contextually linked. In order to weave all of these concerns together there have been many efforts to create a literate programming environment. In this episode Jeremy Howard of fast.ai fame and Hamel Husain of GitHub share the work they have done on nbdev. They explain how it allows you to weave together documentation, code, and tests in the same context so that it is more natural to explore and build understanding when working on a project. It is built on top of the Jupyter environment, allowing you to take advantage of the other great elements of that ecosystem, and it provides a number of excellent out of the box features to reduce the friction in adopting good project hygiene, including continuous integration and well designed documentation sites. Regardless of whether you have been programming for 5 days, 5 years, or 5 decades you should take a look at nbdev to experience a different way of looking at your code. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! 
Your host as usual is Tobias Macey and today I’m interviewing Jeremy Howard and Hamel Husain about nbdev, a library for turning Jupyter notebooks into Python libraries. Interview Introductions How did you get introduced to Python? Can you start by describing what nbdev is and the goals of the project? What is the story behind how and why it got started? Who is the target audience for the nbdev project? How does that focus influence the features and design of nbdev? What do you see as the primary challenges of building and collaborating on projects written in notebooks? What are some of the other projects that are working to simplify or improve the experience of using notebooks? How does nbdev compare to or complement those other tools? Can you describe how nbdev is implemented? How has the design and goals of the project evolved since it was first started? What is the workflow of someone who is using nbdev? At what point in the lifecycle of a notebook oriented project should someone start integrating nbdev? How does nbdev scale when working on a project that spans multiple notebooks/modules? How does working in a notebook environment change your approach to software development and project design? What are the most interesting, innovative, or unexpected ways that you have seen nbdev used? What are the most interesting, unexpected, or challenging lessons that you have learned from working on nbdev? When is nbdev the wrong choice? What do you have planned for the future of the project? Keep In Touch Jeremy LinkedIn @jeremyphoward on Twitter jph00 on GitHub Hamel hamelsmu on GitHub Website @HamelHusain on Twitter LinkedIn Picks Tobias Rivals! Frenemies Who Changed The World Jeremy Chess Hamel Moonwalking With Einstein by Joshua Foer (affiliate link) Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. 
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links nbdev fast.ai GitHub Perl Fastmail R Studio R Markdown Literate Programming fastcore JupyterLab nteract Jupyter Voilà GitHub Actions Sphinx Google Colab Working In Public by Nadia Eghbal (affiliate link) Jekyll Hugo Cython Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/2/2021 • 51 minutes, 38 seconds

Making The Sans I/O Ideal A Reality For The Websockets Library

Summary Working with network protocols is a common need for software projects, particularly in the current age of the internet. As a result, there are a multitude of libraries that provide interfaces to the various protocols. The problem is that implementing a network protocol properly and handling all of the edge cases is hard, and most of the available libraries are bound to a particular I/O paradigm which prevents them from being widely reused. To address this shortcoming there has been a movement towards "sans I/O" implementations that provide the business logic for a given protocol while remaining agnostic to whether you are using async I/O, Twisted, threads, etc. In this episode Aymeric Augustin shares his experience of refactoring his popular websockets library to be I/O agnostic, including the challenges involved in how to design the interfaces, the benefits it provides in simplifying the tests, and the work needed to add back support for async I/O and other runtimes. This is a great conversation about what is involved in making an ideal a reality. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! 
Your host as usual is Tobias Macey and today I’m interviewing Aymeric Augustin about his work on the websockets library and the work involved in making it sans I/O Interview Introductions How did you get introduced to Python? Can you start by giving an overview of your work on the websockets library and how the project got started? What does "sans I/O" mean and what are the goals associated with it? Can you share the history of your work on the websockets project? What was your motivation for starting down the path of rearchitecting a project that is already production ready? Can you talk through how the websockets library is architected currently? How has the design of the project evolved since you first began working on it? At a high level, what were the changes required to make it functionally sans i/o? What do you see as the primary challenges associated with making network related libraries sans i/o? In your experience of porting websockets to be purely protocol oriented, what are the technical and design challenges that you faced? One of the goals of the Sans I/O approach is to support reusability and composability of network protocol implementations. What has your experience been as to the viability of those goals in practice? What is your current perspective on the cost/benefit of the sans i/o conversion? Who are the primary consumers of the websockets library? How do you foresee the target audience changing once you have completed extracting the protocol logic? What are some of the most interesting, innovative, or unexpected ways that you have seen the websockets project used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on the websockets project and sans i/o conversion? What do you have planned for the future of the project? Keep In Touch LinkedIn @aymericaugustin on Twitter Website Picks Tobias Jigsaw Puzzles Aymeric Inside Qonto interview Closing Announcements Thank you for listening! 
Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Sans I/O: When The Rubber Meets The Road Websockets library Websockets Protocol Qonto Tulip Asyncio CERN Particle Accelerator Sans I/O Cory Benfield HTTP/2 Twisted Curio Trio Inversion of Control ohneio helper library for implementing sans I/O network protocols SOCKS Proxy Sanic The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
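
The "sans I/O" idea described in this episode's summary can be illustrated with a minimal sketch. This is not code from the websockets library: the LineProtocol class, its method names, and the newline-delimited format are all hypothetical, chosen only to show protocol logic that takes bytes in and hands bytes out without ever touching a socket.

```python
class LineProtocol:
    """Hypothetical newline-delimited protocol with no I/O of its own."""

    def __init__(self):
        # Bytes received so far that do not yet form a complete message.
        self._buffer = bytearray()

    def receive_data(self, data):
        """Feed raw bytes in; return every complete message decoded so far."""
        self._buffer.extend(data)
        messages = []
        while True:
            newline = self._buffer.find(b"\n")
            if newline == -1:
                break  # the remainder is a partial message; keep it buffered
            line = bytes(self._buffer[:newline])
            del self._buffer[: newline + 1]
            messages.append(line.decode("utf-8"))
        return messages

    def send(self, message):
        """Return the bytes the caller should write to its own transport."""
        return message.encode("utf-8") + b"\n"


# The caller owns the transport (asyncio, threads, Twisted, ...).
# Here plain byte strings stand in for the network.
protocol = LineProtocol()
wire = protocol.send("hello") + protocol.send("world")
first = protocol.receive_data(wire[:8])   # one full message plus a fragment
second = protocol.receive_data(wire[8:])  # the rest of the second message
```

Because the class never reads or writes a socket, the same logic could in principle sit behind any runtime, and it can be tested with plain byte strings, which is the kind of test simplification the summary mentions.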

1/26/2021 • 38 minutes, 4 seconds

Driving Toward A Faster Python Interpreter With Pyston

Summary One of the common complaints about Python is that it is slow. There are languages and runtimes that can execute code faster, but they are not as easy to be productive with, so many people are willing to make that tradeoff. There are some use cases, however, that truly need the benefit of faster execution. To address this problem Kevin Modzelewski helped to create the Pyston interpreter that is focused on speeding up unmodified Python code. In this episode he shares the history of the project, discusses his current efforts to optimize a fork of the CPython interpreter, and his goals for building a business to support the ongoing work to make Python faster for everyone. This is an interesting look at the opportunities that exist in the Python ecosystem and the work being done to address some of them. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Kevin Modzelewski about his work on Pyston, an interpreter for Python focused on compatibility and speed. Interview Introductions How did you get introduced to Python? Can you start by describing what Pyston is and how it got started? Can you share some of the history of the project and the recent changes? 
What is your motivation for focusing on Pyston and Python optimization? What are the use cases that you are primarily focused on with Pyston? Why do you think Python needs another performance project? Can you describe the technical implementation of Pyston? How has the project evolved since you first began working on it? What are the biggest challenges that you face in maintaining compatibility with CPython? How does the approach to Pyston compare to projects like PyPy and Pyjion? How are you approaching sustainability and governance of the project? What are some of the most interesting, innovative, or unexpected uses for Pyston that you have seen? What have you found to be the most interesting, unexpected, or challenging lessons that you have learned while working on Pyston? When is Pyston the wrong choice? What do you have planned for the future of the project? Keep In Touch kmod on GitHub Blog LinkedIn Picks Tobias Last Week In AWS Newsletter Kevin Meditation Calm App Headspace Links Pyston Discord Chat Dropbox CPython PyPy Pyjion Podcast Episode Jython hpy Podcast Episode JIT Compiler Python Software Foundation Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/19/2021 • 44 minutes, 6 seconds

Project Scaffolding That Evolves With Your Software Using Copier

Summary Every software project has a certain amount of boilerplate to handle things like linting rules, test configuration, and packaging. Rather than recreate everything manually every time you start a new project you can use a utility to generate all of the necessary scaffolding from a template. This allows you to extract best practices and team standards into a reusable project that will save you time. The Copier project is one such utility that goes above and beyond the bare minimum by supporting project evolution, letting you bring in the changes to the source template after you already have a project that you have dedicated significant work on. In this episode Jairo Llopis explains how the Copier project works under the hood and the advanced capabilities that it provides, including managing the full lifecycle of a project, composing together multiple project templates, and how you can start using it for your own work today. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Jairo Llopis about Copier, a library for managing project templates Interview Introductions How did you get introduced to Python? Can you start by describing what the Copier project is? How did you get involved in the project? 
Can you share some of the history of the project? What do you see as the most common uses for a project templating tool? There are a variety of different tools for scaffolding projects across a wide range of languages. What are the distinguishing features of Copier that might lead someone to choose it over the alternatives? Can you describe how the Copier project is implemented? How has the design and feature set evolved over time? What is the workflow for someone building a template with Copier? What are some of the edge cases or complexities that they might run into? What are the options for extensibility or integration with Copier? What are some of the capabilities or use cases for Copier that are often overlooked? What are some of the most interesting, innovative, or unexpected ways that you have seen Copier used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on and with Copier? When is Copier the wrong choice? What do you have planned for the future of the project? Keep In Touch Yajo on GitHub __yajo on Twitter Website Picks Tobias Playing Cards Jairo Mozilla Hubs Links Copier Tecnativa Odoo Open Source ERP Cookiecutter Yeoman Jinja Cookiecutter, Yeoman, and Copier Blog Post doodba-copier-template Copier Templates A Story of Duplicate Code Traefik The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/12/2021 • 57 minutes, 56 seconds

How Python's Evolution Impacts Your Fluency With Luciano Ramalho

Summary On its surface Python is a simple language which is what has contributed to its rise in popularity. As you move to intermediate and advanced usage you will find a number of interesting and elegant design elements that will let you build scalable and maintainable systems and design friendly interfaces. Luciano Ramalho is best known as the author of Fluent Python which has quickly become a leading resource for Python developers to increase their facility with the language. In this episode he shares his journey with Python and his perspective on how the recent changes to the interpreter and ecosystem are influencing who is adopting it and how it is being used. Luciano has an interesting perspective on how the feedback loop between the community and the language is driving the current and future priorities of the features that are added. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Luciano Ramalho about the recent and upcoming changes in the Python language Interview Introductions How did you get introduced to Python? Can you start by giving an overview of the role that Python has played in your career? What other languages do you work with on a regular basis? 
How has that experience influenced the ways that you use Python? What do you see as the biggest changes that have been added to Python in recent years? How have the changes in Python changed the way that you approach program design? How has your work on Fluent Python influenced your perspective on the language and its utility? What do you find to be the most confusing aspects of Python, whether for newcomers or experienced developers? How would you characterize the types of features that have been added to Python in recent years? What, if any, trends have you observed in the types of features that are proposed and included in Python and what do you see as the motivating factors for them? What changes to the language are you tracking? Which are you personally invested in? What new features or capabilities would you like to see included in Python? Keep In Touch @ramalhoorg on Twitter ramalho on GitHub LinkedIn Picks Tobias Magic: The Gathering: Arena Luciano The Queen’s Gambit Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Fluent Python Library and Information Sciences Thoughtworks São Paulo, Brazil Perl PHP Object Oriented Programming Dunder Methods Python Essential Reference Python In A Nutshell Python Typing Module Pytype Pyre MyPy AsyncIO Typing Protocols Duck Typing Static Typing Where Possible, Dynamic Typing Where Needed TypeScript Ruby 3 Type Annotations C# Go Language KotlinJS Matrix Multiplication Operator Walrus Operator == Assignment Expressions CPython PEG Parser Podcast Episode PEP 3099: Things that will Not Change in Python 3000 Elixir Pattern Matching Erlang Prolog Python Pattern Matching PEP SWIG Symbolic Computation Python Descriptors Beeware The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/5/2021 • 1 hour, 13 seconds

Making Content Management A Smooth Experience With A Headless CMS

Summary Building a web application requires integrating a number of separate concerns into a single experience. One of the common requirements is a content management system to allow product owners and marketers to make the changes needed for them to do their jobs. Rather than spend the time and focus of your developers to build the end to end system a growing trend is to use a headless CMS. In this episode Jake Lumetta shares why he decided to spend his time and energy on building a headless CMS as a service, when and why you might want to use one, and how to integrate it into your applications so that you can focus on the rest of your application. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Python has become the default language for working with data, whether as a data scientist, data engineer, data analyst, or machine learning engineer. Springboard has launched their School of Data to help you get a career in the field through a comprehensive set of programs that are 100% online and tailored to fit your busy schedule. With a network of expert mentors who are available to coach you during weekly 1:1 video calls, a tuition-back guarantee that means you don’t pay until you get a job, resume preparation, and interview assistance there’s no reason to wait. 
Springboard is offering up to 20 scholarships of $500 towards the tuition cost, exclusively to listeners of this show. Go to pythonpodcast.com/springboard today to learn more and give your career a boost to the next level. Your host as usual is Tobias Macey and today I’m interviewing Jake Lumetta about Butter CMS and the role of a headless CMS in the modern web ecosystem. Interview Introductions How did you get introduced to Python? Can you start by describing what a headless CMS is? How does the use case and user experience differ from working with a traditional CMS (e.g. WordPress, etc.)? How does a headless CMS compare to using a framework such as Django CMS or Wagtail? Can you describe what you have built at ButterCMS? What was your motivation for starting a business to provide a CMS as a service? How would you characterize the current state of the CMS ecosystem? How does ButterCMS compare to the available open source and commercial options? What are the trends in the web ecosystem that have made a headless CMS necessary or useful? What types of information are people managing in a CMS? How are people integrating headless CMS systems into their Python applications? Can you describe the architecture for Butter? How has the system changed or evolved since you first began working on it? What was your decision process for determining what language(s) and technology stack to use for building the platform? What are the aspects of building and maintaining a CMS that are most complex? What are some of the most interesting, innovative, or unexpected ways that you have seen ButterCMS used? What have you found to be the most interesting, unexpected, or challenging lessons that you have learned while building ButterCMS? When is ButterCMS the wrong choice? What do you have planned for the future of ButterCMS? Keep In Touch LinkedIn @jakelumetta on Twitter Picks Tobias The Arrow TV Show Jake Ghost In The Wires by Kevin Mitnick Closing Announcements Thank you for listening! 
Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links ButterCMS Hiring: Dir of Engineering PHP Django MVC == Model, View, Controller Headless CMS WordPress Django CMS Wagtail Podcast Episode SEO == Search Engine Optimization JAM (Javascript, APIs, and Markup) Stack Netlify Vercel Cloudflare Pages Vue.js React.js Django Rest Framework Fastly CDN == Content Delivery Network AWS Cloudfront Ionic React Native The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/28/2020 • 48 minutes, 50 seconds

Turning Notebooks Into Collaborative And Dynamic Data Applications With Hex

Summary Notebooks have been a useful tool for analytics, exploratory programming, and shareable data science for years, and their popularity is continuing to grow. Despite their widespread use, there are still a number of challenges that inhibit collaboration and use by non-technical stakeholders. Barry McCardel and his team at Hex have built a platform to make collaboration on Jupyter notebooks a first class experience, as well as allowing notebooks to be parameterized and exposing the logic through interactive web applications. In this episode Barry shares his perspective on the state of the notebook ecosystem, why it is such a powerful tool for computing and analytics, and how he has built a successful business around improving the end to end experience of working with notebooks. This was a great conversation about an important piece of the toolkit for every analyst and data scientist. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Do you want to get better at Python? Now is an excellent time to take an online course. Whether you’re just learning Python or you’re looking for deep dives on topics like APIs, memory management, async and await, and more, our friends at Talk Python Training have a top-notch course for you. 
If you’re just getting started, be sure to check out the Python for Absolute Beginners course. It’s like the first year of computer science that you never took compressed into 10 fun hours of Python coding and problem solving. Go to pythonpodcast.com/talkpython today and get 10% off the course that will help you find your next level. That’s pythonpodcast.com/talkpython, and don’t forget to thank them for supporting the show. Python has become the default language for working with data, whether as a data scientist, data engineer, data analyst, or machine learning engineer. Springboard has launched their School of Data to help you get a career in the field through a comprehensive set of programs that are 100% online and tailored to fit your busy schedule. With a network of expert mentors who are available to coach you during weekly 1:1 video calls, a tuition-back guarantee that means you don’t pay until you get a job, resume preparation, and interview assistance there’s no reason to wait. Springboard is offering up to 20 scholarships of $500 towards the tuition cost, exclusively to listeners of this show. Go to pythonpodcast.com/springboard today to learn more and give your career a boost to the next level. Your host as usual is Tobias Macey and today I’m interviewing Barry McCardel about Hex, a managed platform to turn your notebooks into collaborative, interactive data apps and stories Interview Introductions How did you get introduced to Python? Can you start by describing what you have built at Hex and your motivation for starting the business? Who are the primary users of the Hex platform? How has that focus influenced your product direction and the features that you prioritize? What are the biggest roadblocks that you see data analysts and data consumers running into? How have those roadblocks shifted in recent years? What is it about the concept of a notebook that has caused them to see such a massive rise in usage and popularity? 
What are the barriers to productivity and accessibility that still exist in the notebook ecosystem? What are the pieces for working in and with notebooks that are still missing? What does Hex add to the experience of working with notebooks? Can you describe how the Hex platform is implemented? How has the design of the platform changed or evolved since you first began working on it? Where does Hex sit in the lifecycle of notebook creation and usage? How does it compare to other services built to support users of notebooks such as Zepl, Saturn Cloud, Noteable, etc.? You focus on the Jupyter platform, but there are a number of other notebook frameworks that have sprung up in recent years. What do you see as being the relative strengths of the available options? What are the trends in the tooling, capabilities, and use cases for notebooks that you are keeping an eye on? What are the most interesting, innovative, or unexpected ways that you have seen the Hex platform used? What are the most interesting, unexpected, or challenging lessons that you have learned while building Hex? When is Hex the wrong choice? What do you have planned for the future of the Hex business and product? Keep In Touch LinkedIn @TheRealBarryM on Twitter Picks Tobias Flakehell DC Extended Universe Movies Barry Wingspan Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Hex Palantir IPython Podcast Episode Jupyter Mathematica IDE == Integrated Development Environment nbconvert Observable Javascript Notebooks React BlueprintJS Papermill Streamlit Podcast Episode Shiny Redshift Snowflake Data Engineering Podcast Episode BigQuery PostgreSQL Data Engineering Podcast Episode Noteable Saturn Cloud Zepl Zeppelin Notebooks JupyterHub Binder Kubeflow The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/21/2020 • 42 minutes, 39 seconds

Add Anomaly Detection To Your Time Series Data With Luminaire

Summary When working with data it’s important to understand when it is correct. If there is a time dimension, then it can be difficult to know when variation is normal. Anomaly detection is a useful tool to address these challenges, but a difficult one to do well. In this episode Smit Shah and Sayan Chakraborty share the work they have done on Luminaire to make anomaly detection easier to work with. They explain the complexities inherent to working with time series data, the strategies that they have incorporated into Luminaire, and how they are using it in their data pipelines to identify errors early. If you are working with any kind of time series then it’s worth giving Luminaire a look. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Python has become the default language for working with data, whether as a data scientist, data engineer, data analyst, or machine learning engineer. Springboard has launched their School of Data to help you get a career in the field through a comprehensive set of programs that are 100% online and tailored to fit your busy schedule. 
With a network of expert mentors who are available to coach you during weekly 1:1 video calls, a tuition-back guarantee that means you don’t pay until you get a job, resume preparation, and interview assistance there’s no reason to wait. Springboard is offering up to 20 scholarships of $500 towards the tuition cost, exclusively to listeners of this show. Go to pythonpodcast.com/springboard today to learn more and give your career a boost to the next level. Your host as usual is Tobias Macey and today I’m interviewing Smit Shah and Sayan Chakraborty about Luminaire, a machine learning based package for anomaly detection on timeseries data Interview Introductions How did you get introduced to Python? Can you start by describing what Luminaire is and how the project got started? Where does the name come from? How does Luminaire compare to other frameworks for working with timeseries data such as Prophet? What are the main use cases that Luminaire is powering at Zillow? What are some of the complexities inherent to anomaly detection that are non-obvious at first glance? How are you addressing those challenges in Luminaire? Can you describe how Luminaire is implemented? How has the design of the project evolved since it was first started? What was the motivation for releasing Luminaire as open source? For someone who is using Luminaire, what is the process for training and deploying a model with it? What are some common ways that it is used within a larger system? How do sustained anomalies such as the current pandemic affect the work of identifying other sources of meaningful outliers? What are some of the most interesting, innovative, or unexpected ways that you have seen Luminaire being used? What are some of the most interesting, unexpected, or challenging lessons that you have learned while building and using Luminaire? When is Luminaire the wrong choice? What do you have planned for the future of the project? 
Keep In Touch Smit LinkedIn shahsmit14 on GitHub Sayan LinkedIn Website @tweettosayan on Twitter Picks Tobias Flakehell Smit Apache Ranger Sayan Prediction Machines: The Simple Economics Of Artificial Intelligence Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Luminaire Zillow Anomaly Detection Facebook Prophet IEEE Big Data Conference Unsupervised Learning ARIMA (Autoregressive Integrated Moving Average) Model Airflow The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/15/2020 • 54 minutes, 23 seconds

Building Big Data Pipelines For Audio With Klio

Summary Technologies for building data pipelines have been around for decades, with many mature options for a variety of workloads. However, most of those tools are focused on processing of text based data, both structured and unstructured. For projects that need to manage large numbers of binary and audio files the list of options is much shorter. In this episode Lynn Root shares the work that she and her team at Spotify have done on the Klio project to make that list a bit longer. She discusses the problems that are specific to working with binary data, how the Klio project is architected to allow for scalable and efficient processing of massive numbers of audio files, why it was released as open source, and how you can start using it today for your own projects. If you are struggling with ad-hoc infrastructure and a medley of tools that have been cobbled together for analyzing large or numerous binary assets then this is definitely a tool worth testing out. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Do you want to get better at Python? Now is an excellent time to take an online course. 
Whether you’re just learning Python or you’re looking for deep dives on topics like APIs, memory management, async and await, and more, our friends at Talk Python Training have a top-notch course for you. If you’re just getting started, be sure to check out the Python for Absolute Beginners course. It’s like the first year of computer science that you never took compressed into 10 fun hours of Python coding and problem solving. Go to pythonpodcast.com/talkpython today and get 10% off the course that will help you find your next level. That’s pythonpodcast.com/talkpython, and don’t forget to thank them for supporting the show. Python has become the default language for working with data, whether as a data scientist, data engineer, data analyst, or machine learning engineer. Springboard has launched their School of Data to help you get a career in the field through a comprehensive set of programs that are 100% online and tailored to fit your busy schedule. With a network of expert mentors who are available to coach you during weekly 1:1 video calls, a tuition-back guarantee that means you don’t pay until you get a job, resume preparation, and interview assistance there’s no reason to wait. Springboard is offering up to 20 scholarships of $500 towards the tuition cost, exclusively to listeners of this show. Go to pythonpodcast.com/springboard today to learn more and give your career a boost to the next level. Your host as usual is Tobias Macey and today I’m interviewing Lynn Root about Klio, an open source pipeline for processing audio and binary data Interview Introductions How did you get introduced to Python? Can you start by describing what Klio is and how it got started? What are some of the challenges that are unique to processing audio data as compared to text? What use cases does Klio enable? What are some of the alternative options available for working with binary data? 
What capabilities were lacking in other solutions that made it worthwhile to build a new system from scratch? Can you describe the design and architecture of Klio? What was the motivation for implementing Klio as a Python framework, rather than building on top of the Scio project? How much of a challenge has it been to interface to the Beam framework from Python? (Java <-> Python impedance mismatch) One of the interesting optimizations in Klio is the option for bottom up execution of a job to avoid processing a given file unless absolutely necessary. What are some of the other useful or interesting capabilities that are built into Klio? What was the motivation and process for releasing Klio as open source? For someone who is building a pipeline with Klio, can you talk through the workflow? What are the extension and integration points that are exposed? How does Klio handle third party dependencies for a given job? What are some of the challenges, misunderstandings, or edge cases that users of Klio should be aware of? What are some of the most interesting, unexpected, or challenging lessons that you have learned while building and growing the Klio project? What are some of the most interesting, innovative, or unexpected ways that you have seen Klio used? What do you have planned for the future of the project? Keep In Touch GitHub Twitter LinkedIn Picks Tobias PSF Fundraiser Lynn Roam note-taking tool Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Klio Announcement Blog Post Docs GitHub Spotify PyLadies SF Luigi RAML ramlfications Interrogate Apache Beam Librosa PyAudio Pillow Podcast Episode FFMPeg ImageMagick Music Information Retrieval Machine Hearing Data Engineering Podcast Episode Scio Microsoft Azure Google Cloud Platform Google Cloud Dataflow Protocol Buffers Apache Spark PySpark DAG == Directed Acyclic Graph ISMIR Conference Digital Signal Processing (DSP) Python Pickle Research paper on separating vocals from instrumentals of a song New York Times: Why songs of the summer sound the same Microsoft’s Rocket Platform for video analytics The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/7/2020 • 53 minutes, 36 seconds

Open Sourcing The Anvil Full Stack Python Web App Platform

Summary Building a complete web application requires expertise in a wide range of disciplines. As a result it is often the work of a whole team of engineers to get a new project from idea to production. Meredydd Luff and his co-founder built the Anvil platform to make it possible to build full stack applications entirely in Python. In this episode he explains why they released the application server as open source, how you can use it to run your own projects for free, and why developer tooling is the sweet spot for an open source business model. He also shares his vision for how the end-to-end experience of building for the web should look, and some of the innovative projects and companies that were made possible by the reduced friction that the Anvil platform provides. Give it a listen today to gain some perspective on what it could be like to build a web app. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Do you want to get better at Python? Now is an excellent time to take an online course. Whether you’re just learning Python or you’re looking for deep dives on topics like APIs, memory management, async and await, and more, our friends at Talk Python Training have a top-notch course for you. 
If you’re just getting started, be sure to check out the Python for Absolute Beginners course. It’s like the first year of computer science that you never took compressed into 10 fun hours of Python coding and problem solving. Go to pythonpodcast.com/talkpython today and get 10% off the course that will help you find your next level. That’s pythonpodcast.com/talkpython, and don’t forget to thank them for supporting the show. Python has become the default language for working with data, whether as a data scientist, data engineer, data analyst, or machine learning engineer. Springboard has launched their School of Data to help you get a career in the field through a comprehensive set of programs that are 100% online and tailored to fit your busy schedule. With a network of expert mentors who are available to coach you during weekly 1:1 video calls, a tuition-back guarantee that means you don’t pay until you get a job, resume preparation, and interview assistance there’s no reason to wait. Springboard is offering up to 20 scholarships of $500 towards the tuition cost, exclusively to listeners of this show. Go to pythonpodcast.com/springboard today to learn more and give your career a boost to the next level. Your host as usual is Tobias Macey and today I’m interviewing Meredydd Luff about the process and motivations for releasing the Anvil platform as open source Interview Introductions How did you get introduced to Python? Can you start by giving an overview of what Anvil is and some of the story behind it? What is new or different in Anvil since we last spoke in June of 2019? What are the most common or most impressive use cases for Anvil that you have seen? On your website you mention Anvil being used for deploying models and productionizing notebooks. How does Anvil help in those use cases? How much of the adoption of Anvil do you attribute to the use of Skulpt and providing a way to write Python for the browser? 
What are some of the complications that users might run into when trying to integrate with the broader Javascript ecosystem? How does the release of the Anvil App Server affect your business model? How does the workflow for users of the Anvil platform change if they decide to run their own instance? What is involved in getting it deployed to production? What other tools or companies did you look to for positive and negative examples of how to run a successful business based on open source? What was your motivation for open sourcing the core runtime of Anvil? What was involved in getting the code cleaned up and ready for a public release? What are the other ways that your business relies on or contributes to the open source ecosystem? What do you see as the primary threats to open source business models? What are some of the most interesting, unexpected, or challenging lessons that you have learned while building and growing Anvil? What do you have planned for the future of the platform and business? Keep In Touch LinkedIn @meredydd on Twitter meredydd on GitHub Picks Tobias Magic: The Gathering Meredydd Anvil Advent Calendar Anvil Podcast Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Anvil Podcast Episode Visual Basic Skulpt Streamlit Podcast Episode Plot.ly Dash Anvil Uplink DOM == Document Object Model SQLAlchemy Brython Transcrypt Podcast Episode Comparison of Python in the browser implementations Blog post about Anvil object serializer Create React App Webpack Jetbrains Traefik Let’s Encrypt Corey Quinn WebAssembly Pyodide The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/1/2020 • 51 minutes, 23 seconds

Pants Has Got Your Python Monorepo Covered

Summary In a software project writing code is just one step of the overall lifecycle. There are many repetitive steps such as linting, running tests, and packaging that need to be run for each project that you maintain. In order to reduce the overhead of these repeat tasks, and to simplify the process of integrating code across multiple systems the use of monorepos has been growing in popularity. The Pants build tool is purpose built for addressing all of the drudgery and for working with monorepos of all sizes. In this episode core maintainers Eric Arellano and Stu Hood explain how the Pants project works, the benefits of automatic dependency inference, and how you can start using it in your own projects today. They also share useful tips for how to organize your projects, and how the plugin oriented architecture adds flexibility for you to customize Pants to your specific needs. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Python has become the default language for working with data, whether as a data scientist, data engineer, data analyst, or machine learning engineer. Springboard has launched their School of Data to help you get a career in the field through a comprehensive set of programs that are 100% online and tailored to fit your busy schedule. 
With a network of expert mentors who are available to coach you during weekly 1:1 video calls, a tuition-back guarantee that means you don’t pay until you get a job, resume preparation, and interview assistance there’s no reason to wait. Springboard is offering up to 20 scholarships of $500 towards the tuition cost, exclusively to listeners of this show. Go to pythonpodcast.com/springboard today to learn more and give your career a boost to the next level. Feature flagging is a simple concept that enables you to ship faster, test in production, and do easy rollbacks without redeploying code. Teams using feature flags release new software with less risk, and release more often. ConfigCat is a feature flag service that lets you easily add flags to your Python code, and 9 other platforms. By adopting ConfigCat you and your manager can track and toggle your feature flags from their visual dashboard without redeploying any code or configuration, including granular targeting rules. You can roll out new features to a subset of your users for beta testing or canary deployments. With their simple API, clear documentation, and pricing that is independent of your team size you can get your first feature flags added in minutes without breaking the bank. Go to pythonpodcast.com/configcat today to get 35% off any paid plan with code PYTHONPODCAST or try out their free forever plan. Your host as usual is Tobias Macey and today I’m interviewing Eric Arellano and Stu Hood about Pants, a flexible build system that works well with monorepos. Interview Introductions How did you get introduced to Python? Can you start by describing what Pants is and how it got started? What’s the story behind the name? What is a monorepo and why might I want one? What are the challenges caused by working with a monorepo? Why are monorepos so uncommon in Python projects? What is the workflow for a developer or team who is managing a project with Pants? 
How does Pants integrate with the broader ecosystem of Python tools for dependency management and packaging (e.g. Poetry, Pip, pip-tools, Flit, Twine, Pex, Shiv, etc.)? What is involved in setting up Pants for working with a new Python project? What complications might developers encounter when trying to implement Pants in an existing project? How is Pants itself implemented? How have the design, goals, or architecture evolved since Pants was first created? What are the major changes in the v2 release? What was the motivation for the major overhaul of the project? How do you recommend developers lay out their projects to work well with Python? How can I handle code shared between different modules or packages, and reducing the third party dependencies that are built into the respective packages? What are some of the most interesting, unexpected, or innovative ways that you have seen Pants used? What have you found to be the most interesting, unexpected, or challenging aspects of working on Pants? What are the cases where Pants is the wrong choice? What do you have planned for the future of the pants project? Keep In Touch Eric LinkedIn Eric-Arellano on GitHub @EArellanoAZ on Twitter Stu stuhood on GitHub @stuhood on Twitter LinkedIn Picks Tobias Cursed TV show Eric Turtle Graphics Stu Faster Than Lime blog Links Pants Foursquare Twitter Toolchain Bazel build tool Ant build tool Monorepo isort Tox Poetry distutils setuptools mypy Bandit Flake8 Sample Python Pants Project gRPC Protocol Buffers Rust GIL == Global Interpreter Lock PEP 420 Blog post about using Pants to migrate from Python 2 to 3 Pex Shiv PyOxidizer The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/23/2020 • 51 minutes, 38 seconds

Scale Your Data Science Teams With Machine Learning Operations Principles

Summary Building a machine learning model is a process that requires well curated and cleaned data and a lot of experimentation. Doing it repeatably and at scale with a team requires a way to share your discoveries with your teammates. This has led to a new set of operational ML platforms. In this episode Michael Del Balso shares the lessons that he learned from building the platform at Uber for putting machine learning into production. He also explains how the feature store is becoming the core abstraction for data teams to collaborate on building machine learning models. If you are struggling to get your models into production, or scale your data science throughput, then this interview is worth a listen. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Do you want to get better at Python? Now is an excellent time to take an online course. Whether you’re just learning Python or you’re looking for deep dives on topics like APIs, memory management, async and await, and more, our friends at Talk Python Training have a top-notch course for you. If you’re just getting started, be sure to check out the Python for Absolute Beginners course. It’s like the first year of computer science that you never took compressed into 10 fun hours of Python coding and problem solving. 
Go to pythonpodcast.com/talkpython today and get 10% off the course that will help you find your next level. That’s pythonpodcast.com/talkpython, and don’t forget to thank them for supporting the show. Python has become the default language for working with data, whether as a data scientist, data engineer, data analyst, or machine learning engineer. Springboard has launched their School of Data to help you get a career in the field through a comprehensive set of programs that are 100% online and tailored to fit your busy schedule. With a network of expert mentors who are available to coach you during weekly 1:1 video calls, a tuition-back guarantee that means you don’t pay until you get a job, resume preparation, and interview assistance there’s no reason to wait. Springboard is offering up to 20 scholarships of $500 towards the tuition cost, exclusively to listeners of this show. Go to pythonpodcast.com/springboard today to learn more and give your career a boost to the next level. Your host as usual is Tobias Macey and today I’m interviewing Mike Del Balso about what is involved in operationalizing machine learning, and his work at Tecton to provide that platform as a service Interview Introductions How did you get introduced to Python? Can you start by describing what is encompassed by the term "Operational ML"? What other approaches are there to building and managing machine learning projects? How do these approaches differ from operational ML in terms of the use cases that they enable or the scenarios where they can be employed? How would you characterize the current level of maturity for the average organization or enterprise in terms of their capacity for delivering ML projects? What are the necessary components for an operational ML platform? You helped to build the Michelangelo platform at Uber. How did you determine what capabilities were necessary to provide a unified approach for building and deploying models? 
How did your work on Michelangelo inform your work on Tecton? How does the use of a feature store influence the structure and workflow of a data team? In addition to the feature store, what are the other necessary components of a full pipeline for identifying, training, and deploying machine learning models? Once a model is in production, what signals or metrics do you track to feed into the next iteration of model development? One of the common challenges in data science and machine learning is managing collaboration. How do tools such as feature stores or the Michelangelo platform address that problem? What are the most interesting, unexpected, or challenging lessons that you have learned while building operational ML platforms? What advice or recommendations do you have for teams who are trying to work with machine learning? What do you have planned for the future of Tecton? Keep In Touch LinkedIn Picks Tobias Sandman graphic novel series by Neil Gaiman Mike At Home: A Short History of Private Life by Bill Bryson Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Tecton Michelangelo sklearn Pandas Data Engineering Podcast Episode About StreamSQL Feature Store Master Data Management Amundsen Data Engineering Podcast Episode Jupyter Algorithmia Unix philosophy Feast feature store Kubeflow Andreessen Horowitz Post On Emerging Data Architectures What is a feature store? 
post on the Tecton blog The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/17/2020 • 51 minutes, 58 seconds

Making The Case For A (Semi) Formal Specification Of CPython

Summary The CPython implementation has grown and evolved significantly over the past ~25 years. In that time there have been many other projects to create compatible runtimes for your Python code. One of the challenges for these other projects is the lack of a fully documented specification of how and why everything works the way that it does. In the most recent Python language summit Mark Shannon proposed implementing a formal specification for CPython, and in this episode he shares his reasoning for why that would be helpful and what is involved in making it a reality. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Do you want to get better at Python? Now is an excellent time to take an online course. Whether you’re just learning Python or you’re looking for deep dives on topics like APIs, memory management, async and await, and more, our friends at Talk Python Training have a top-notch course for you. If you’re just getting started, be sure to check out the Python for Absolute Beginners course. It’s like the first year of computer science that you never took compressed into 10 fun hours of Python coding and problem solving. Go to pythonpodcast.com/talkpython today and get 10% off the course that will help you find your next level. 
That’s pythonpodcast.com/talkpython, and don’t forget to thank them for supporting the show. Your host as usual is Tobias Macey and today I’m interviewing Mark Shannon about his efforts to create a formal specification for the CPython interpreter Interview Introductions How did you get introduced to Python? Can you start by describing the current state of how the Python language and the CPython runtime are defined? What is your motivation in advocating for a specification? After ~25 years of the language, why is now the time to pursue this effort? How does the history of the language and the scope of the ecosystem and community impact the effort required to make this a reality? What is involved in creating the specification and where would it be located once complete? What are some examples of languages that are formally specified? What are the possible benefits of creating a specification for the CPython virtual machine? What is the distinction between a specification for the VM as opposed to a specification for the language? What are some potential downsides to having a (semi-)formal specification become part of the definition of the interpreter? Can you describe the process of doing the work to create the specification? How are you approaching the actual definition of the specification (e.g. prose vs programmatic)? What are the tradeoffs of prose vs. an executable specification (e.g. TLA+, Alloy)? How does this work tie into your goals of improving the speed of the CPython interpreter? What are some of the most interesting, unexpected, or challenging aspects of your efforts to bring this specification to CPython? How can the community contribute to this effort? Keep In Touch markshannon on GitHub Website Picks Tobias American Gods book and TV series Mark Roadside Picnic In Death (VR game) –On Steam Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. 
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links CPython PyPy PEP 380 yield from Language Summit RustPython Jython C++ ML programming language Java Python Formal Semantics git repository CPython PEG Parser Episode with Pablo Galindo and Lysandros Nikolaou IETF RFCs The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/10/2020 • 36 minutes, 41 seconds

Bringing Artificial Intelligence Projects From Idea To Production

Summary Artificial intelligence applications can provide dramatic benefits to a business, but only if you can bring them from idea to production. Henrik Landgren was behind the original efforts at Spotify to leverage data for new product features, and in his current role he works on an AI system to evaluate new businesses to invest in. In this episode he shares advice on how to identify opportunities for leveraging AI to improve your business, the capabilities necessary to enable a successful project, and some of the pitfalls to watch out for. If you are curious about how to get started with AI, or what to consider as you build a project, then this is definitely worth a listen. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Do you want to get better at Python? Now is an excellent time to take an online course. Whether you’re just learning Python or you’re looking for deep dives on topics like APIs, memory management, async and await, and more, our friends at Talk Python Training have a top-notch course for you. If you’re just getting started, be sure to check out the Python for Absolute Beginners course. It’s like the first year of computer science that you never took compressed into 10 fun hours of Python coding and problem solving. 
Go to pythonpodcast.com/talkpython today and get 10% off the course that will help you find your next level. That’s pythonpodcast.com/talkpython, and don’t forget to thank them for supporting the show. Equalum’s end to end data ingestion platform is relied upon by enterprises across industries to seamlessly stream data to operational, real-time analytics and machine learning environments. Equalum combines streaming Change Data Capture, replication, complex transformations, batch processing and full data management using a no-code UI. Equalum also leverages open source data frameworks by orchestrating Apache Spark, Kafka and others under the hood. Tool consolidation and linear scalability without the legacy platform price tag. Go to pythonpodcast.com/equalum today to start a free 2 week test run of their platform, and don’t forget to tell them that we sent you. Your host as usual is Tobias Macey and today I’m interviewing Henrik Landgren about his experiences building AI platforms to transform business capabilities. Interview Introductions How did you get introduced to Python? Can you start by sharing your thoughts on when, where, and how AI/ML are useful tools for a business? What has been your experience in building AI platforms? For organizations who are considering investing in AI capabilities, what are some alternative strategies that they might consider first? What are the cases where AI is likely to be a wasted effort, or will fail to create a return on investment? In order to be successful in bringing AI products to production, what are the foundational capabilities that are necessary? What have you found to be a useful composition of roles and skills for building AI products? There are various statistics that all point to a remarkably low success rate for bringing AI into production. What are some of the pitfalls that organizations and engineers should be aware of when undertaking such a project? 
What is your strategy for identifying opportunities for a successful AI product? Once you have determined the possible utility for such a project, how do you approach the work of making it a reality? What are the common factors in what you built at Spotify and EQT ventures? Where do the two efforts diverge? Your work on Motherbrain is interesting because of the fact that it is dealing in what seems to be intangible or unpredictable forces. What kinds of input are you relying on to generate useful predictions? What are some of the most interesting, innovative, or unexpected uses of AI that you have seen? What are some of the biggest failures of AI that you are aware of? In your work at Spotify and EQT ventures, what are the most interesting, unexpected, or challenging lessons that you have learned? What advice or recommendations do you have for anyone who wants to learn more about the potential for AI and the work involved in bringing it to production? Keep In Touch LinkedIn @hlandgren on Twitter Picks Tobias Whale bat Henrik Observable Dataform Data Engineering Podcast Episode Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links EQT Ventures Stockholm Sweden Motherbrain Accenture Spotify Basic C# ASP.NET Javascript Hadoop McKinsey Deep Learning Data Engineer Data Scientist Machine Learning Engineer Discover Weekly Spotify Playlist GPT-3 Deep Fakes DBT Data Engineering Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/3/2020 • 47 minutes, 49 seconds

Power Up Your Java Using Python With JPype

Summary Python and Java are two of the most popular programming languages in the world, and have both been around for over 20 years. In that time there have been numerous attempts to provide interoperability between them, with varying methods and levels of success. One such project is JPype, which allows you to use Java classes in your Python code. In this episode the current lead developer, Karl Nelson, explains why he chose it as his preferred tool for combining these ecosystems, how he and his team are using it, and when and how you might want to use it for your own projects. He also discusses the work he has done to enable use of JPype on Android, and what is in store for the future of the project. If you have ever wanted to use a library or module from Java, but the rest of your project is already in Python, then this episode is definitely worth a listen. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. 
Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Karl Nelson about JPype, a language bridge that lets you use Java classes in your Python programs Interview Introductions How did you get introduced to Python? Can you start by giving an overview of what JPype is? What was your motivation for becoming such a regular contributor to the project? Why might someone want to be able to call into the Java ecosystem from a Python program? There have been a number of other projects aiming to combine the capabilities of Java and Python, such as Jython and PyJNIus. What are the relative tradeoffs between the different options? Many of those other projects have stalled or stopped altogether. What about JPype has allowed it to survive for so long? Can you explain how JPype is implemented? How has the design and implementation of the project evolved since it was first implemented? How do the relative language versions influence the compatibility of programs on either side of the bridge? What is involved in creating a project that uses JPype? How are dependencies, packaging, distribution, etc. handled across the Java and Python portions of the code? What are some of the ways that JPype can be used for Android applications? What are some of the sharp edges or pitfalls that users of JPype should be aware of? What are some of the most interesting, innovative, or unexpected ways that you have seen JPype used? What have you found to be the most interesting or challenging aspects of building JPype? When is JPype the wrong choice? What is in store for the future of the project? 
Keep In Touch Thrameos on GitHub LinkedIn Picks Tobias Hiking All Trails The Hiking Project Karl Summoner’s Rift Links JPype Java Overview of Python to Java bridges Lawrence Livermore National Lab GTK– Gnome Perl C++ Matlab Java Native Interface (JNI) SciPy NumPy Matplotlib Jython PyJNIus Py4J Jep Ruby Reflection Ivy Maven JDBC Kivy Android Python Slots PyPy Java ASM Arrow Columnar Memory Format Protocol Buffers GraalVM The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/26/2020 • 48 minutes, 39 seconds

The Journey To Replace Python's Parser And What It Means For The Future

Summary The release of Python 3.9 introduced a new parser that paves the way for brand new features. Every programming language has its own specific syntax for representing the logic that you are trying to express. The way that the rules of the language are defined and validated is with a grammar definition, which in turn is processed by a parser. The parser that the Python language has relied on for the past 25 years has begun to show its age through mounting technical debt and a lack of flexibility in defining new syntax. In this episode Pablo Galindo and Lysandros Nikolaou explain how, together with Python’s creator Guido van Rossum, they replaced the original parser implementation with one that is more flexible and maintainable, why now was the time to make the change, and how it will influence the future evolution of the language. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. 
Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Pablo Galindo and Lysandros Nikolaou about their work on replacing the parser in CPython and what that means for the language Interview Introductions How did you get introduced to Python? Can you start by discussing the role of the parser in the lifecycle of a Python program? What were the limitations of the previous parser, and how did that contribute to complexity and technical debt in the CPython runtime? What are the options for styles of parsers, and what are the benefits of using a PEG style grammar? How does the new parser impact the approachability of the CPython code for new contributors? What was the process for reimplementing the parser and guarding against regressions in the syntax? As developers switch to the 3.9 release, what potential edge cases/bugs might they see from introducing the new parser? What new syntax options does this parser provide for the Python language? Are there any specific features that are planned for implementation in the 3.10 release that are enabled by the new parser grammar? As the language evolves due to new capabilities offered by the updated parser, how will that impact other implementations such as PyPy? What were the most interesting, unexpected, or challenging aspects of this project? What other aspects of the CPython code do you think should be reconsidered or reimplemented in light of the changes in computing and the usage of the language? 
Keep In Touch Pablo pablogsal on GitHub @pyblogsal on Twitter LinkedIn Lysandros LinkedIn lysnikolaou on GitHub @lysnikolaou on Twitter Picks Tobias Annual Python Developer Survey Jessica Jones TV show Pablo Raised By Wolves TV Series Lysandros Afterlife TV show Links PEP 617 – New PEG Parser for CPython Podcast Episode About Parsers CPython Bloomberg PEG Parsers Seafair LL(1) Parsers Łukasz Langa Parser Generator Concrete Syntax Tree Abstract Syntax Tree PyPy RustPython Podcast Episode IronPython Structural Pattern Matching – PEP 622 Pylint ASTroid Podcast Episode Hy Podcast Episode Walrus Operator/Assignment Expressions C99 Reference Counting Cycle Hunting/Generational Garbage Collection The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/19/2020 • 1 hour, 5 minutes, 48 seconds

Cloud Native Application Delivery Using GitOps

Summary The way that applications are being built and delivered has changed dramatically in recent years with the growing trend toward cloud native software. As part of this movement toward the infrastructure and orchestration that powers your project being defined in software, a new approach to operations is gaining prominence. Commonly called GitOps, the main principle is that all of your automation code lives in version control and is executed automatically as changes are merged. In this episode Viktor Farcic shares details on how that workflow brings together developers and operations engineers, the challenges that it poses, and how it influences the architecture of your software. This was an interesting look at an emerging pattern in the development and release cycle of modern applications. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Tree Schema is a data catalog that is making metadata management accessible to everyone. With Tree Schema you can create your data catalog and have it fully populated in under five minutes when using one of the many automated adapters that can connect directly to your data stores. 
Tree Schema includes essential cataloging features such as first class support for both tabular and unstructured data, data lineage, rich text documentation, asset tagging and more. Built from the ground up with a focus on the intersection of people and data, your entire team will find it easier to foster collaboration around your data. With the most transparent pricing in the industry – $99/mo for your entire company – and a money-back guarantee for excellent service, you’ll love Tree Schema as much as you love your data. Go to pythonpodcast.com/treeschema today to get your first month free, and mention this podcast to get 50% off your first three months after the trial. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Viktor Farcic about using GitOps practices to manage your application and your infrastructure in the same workflow Interview Introductions How did you get introduced to Python? Can you start by giving an overview of what GitOps is? What are the architectural or design elements that developers need to incorporate to make their applications work well in a GitOps workflow? What are some of the tools that facilitate a GitOps approach to managing applications and their target environments? What are some useful strategies for managing local developer environments to maintain parity with how production deployments are architected? 
As developers acquire more responsibility for building the automation to provision the production environment for their applications, what are some of the operations principles that they need to understand? What are some of the development principles that operators and systems administrators need to acquire to be effective in contributing to an environment that is managed by GitOps? What are the areas for collaboration and dividing lines of responsibility between developers and platform engineers in a GitOps environment? Beyond the application development and deployment, what are some of the additional concerns that need to be built into an application in order for it to be manageable and maintainable once it is in production? What are some of the organizational principles that contribute to a successful implementation of GitOps? What are some of the most interesting, innovative, or unexpected ways that you have seen GitOps employed? What have you found to be the most challenging aspects of creating a scalable and maintainable GitOps practice? When is GitOps the wrong choice, and what are the alternatives? What resources do you recommend for anyone who wants to dig deeper into this subject? Keep In Touch LinkedIn Blog @vfarcic on Twitter Picks Tobias Pulumi Podcast Episode Viktor Loki Links GitOps CodeFresh Kubernetes DevOps Paradox Podcast Perl Cloud Native ArgoCD Flux Observability Prometheus Helm KNative MiniKube Viktor’s Udemy Books and Courses Viktor’s YouTube channel The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/12/2020 • 53 minutes, 43 seconds

Threading The Needle Of Interesting And Informative While You Learn To Code

Summary Learning to code is a neverending journey, which is why it’s important to find a way to stay motivated. A common refrain is to just find a project that you’re interested in building and use that goal to keep you on track. The problem with that advice is that as a new programmer, you don’t have the knowledge required to know which projects are reasonable, which are difficult, and which are effectively impossible. Steven Lott has been sharing his programming expertise as a consultant, author, and trainer for years. In this episode he shares his insights on how to keep readers, students, and colleagues interested enough to learn the fundamentals without losing sight of the long term gains. He also uses his own difficulties in learning to maintain, repair, and captain his sailboat as relatable examples of the learning process and how the lessons he has learned can be translated to the process of learning a new technology or skill. This was a great conversation about the various aspects of how to learn, how to stay motivated, and how to help newcomers bridge the gap between what they want to create and what is within their grasp. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. 
Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Steven F. Lott about finding a project that you care about to aid in learning to program Interview Introductions How did you get introduced to Python? Can you start by outlining your experiences working with and teaching Python? Does your day-to-day experience at work suggest ways to help newcomers learn about Python? How have your experiences as an author influenced your perspective on how to help newcomers become motivated to learn programming? One of the common pieces of advice that I and others have given to people learning Python or other languages is to find a project that they want to build, but that’s not necessarily a practical approach. What are some of the difficulties that might come of that approach? What are some strategies that you have tried for helping learners identify what kinds of project are possible and practical? 
Beyond the difficulty of understanding what is possible and what is going to require a dedicated team of engineers to even attempt, there is the question of remaining motivated for long enough to follow through on a project in the face of syntax errors and design challenges. What can language developers and ecosystems do to improve the newcomer experience in exploring possibilities? How can we make syntax errors educational and recoverable, rather than needing accrued knowledge, or hours of web searches? As an author, there are complementary goals that may lead to conflict in the form of wanting to provide structured guidance and progression while allowing for creativity and experimentation. How have you approached those objectives in your books? What are some of the projects that have motivated you to learn new skills? What advice do you have for anyone who is working on or considering writing a book to teach a technical skill? What advice do you have for anyone who is trying to learn programming or acquire a skill in a new language, platform, or framework? Why are both of your movie picks black and white? Are you a film noir fan? Keep In Touch Website Blog LinkedIn slott56 on GitHub @s_lott on Twitter Picks Tobias The Hobbit Trilogy: Extended Edition (affiliate link) The Lord Of The Rings Trilogy: Extended Edition (affiliate link) Steven Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb The General Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Modern Python Cookbook Packt Publishing Eiffel Modula 3 COBOL Stack Overflow Capital One The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/6/2020 • 56 minutes, 29 seconds

Solving Python Package Creation For End User Applications With PyOxidizer

Summary Python is a powerful and expressive programming language with a vast ecosystem of incredible applications. Unfortunately, it has always been challenging to share those applications with non-technical end users. Gregory Szorc set out to solve the problem of how to put your code on someone else’s computer and have it run without having to rely on extra systems such as virtualenvs or Docker. In this episode he shares his work on PyOxidizer and how it allows you to build a self-contained Python runtime along with statically linked dependencies and the software that you want to run. He also digs into some of the edge cases in the Python language and its ecosystem that make this a challenging problem to solve, and some of the lessons that he has learned in the process. PyOxidizer is an exciting step forward in the evolution of packaging and distribution for the Python language and community. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. 
You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Gregory Szorc about his work on PyOxidizer, a revolutionary new approach to building and distributing self-contained Python applications Interview Introductions How did you get introduced to Python? Can you start by giving an overview on the shortcomings of the current state of the art for distributing Python projects, both for deployment and end-user consumption? What is PyOxidizer and what motivated you to create it? How does PyOxidizer differ from projects such as CxFreeze, Py2Exe, or Shiv? What are the characteristics of CPython and the packaging ecosystem that make it so challenging to easily distribute self-contained applications? For someone using PyOxidizer, what is their workflow for building an executable that they can share with end users? What are some of the edge cases or special considerations that they need to be aware of? How is PyOxidizer implemented? How has the design or direction evolved since you first began working on it? 
From your experience in working on PyOxidizer, what changes would you like to see in the Python language or the CPython reference implementation? What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on PyOxidizer? What do you have planned for the future of PyOxidizer? What are the ways that listeners can contribute to PyOxidizer? Keep In Touch Website indygreg on GitHub Picks Tobias Carlos Santana Gregory Home Air Quality Monitor Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links PyOxidizer Mercurial Podcast Episode Mozilla Virtualenv Pip Docker Py2Exe CXFreeze Beeware Shiv FPM Python Build Standalone Importlib Rust Russell Keith-Magee Black Swans Keynote Followup Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/29/2020 • 49 minutes, 39 seconds

Flexible Network Security Detection And Response With Grapl

Summary Servers and services that have any exposure to the public internet are under a constant barrage of attacks. Network security engineers are tasked with discovering and addressing any potential breaches to their systems, which is a never-ending task as attackers continually evolve their tactics. In order to gain better visibility into complex exploits Colin O’Brien built the Grapl platform, using graph database technology to more easily discover relationships between activities within and across servers. In this episode he shares his motivations for creating a new system to discover potential security breaches, how its design simplifies the work of identifying complex attacks without relying on brittle rules, and how you can start using it to monitor your own systems today. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. 
Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Colin O’Brien about Grapl, an open source platform for detection and response of system security incidents Interview Introductions How did you get introduced to Python? Can you start by describing what Grapl is and the problem that you are trying to solve with it? What was your original motivation to create it? What were the existing options for security detection and response, and how is Grapl differentiated from them? Who is the target audience for the Grapl project? How is the Grapl system architected? How has the design of the system evolved since you first began working on it? How much effort would it be to separate the Grapl architecture from AWS to migrate it to other environments? What have you found to be the benefits of splitting the implementation of the system between Rust for the system and Python for the exploration? What challenges have you faced as a result of working across those languages? What data sources does Grapl use to build its graph of events within a system? Can you talk through the overall workflow for someone using Grapl? What are some examples of the types of exploits that you can identify with Grapl? 
What are some of the most interesting, unexpected, or innovative ways that you have seen Grapl used? What are some of the most interesting, unexpected, or challenging lessons that you have learned while building it? When is Grapl the wrong choice? What do you have planned for the future of Grapl? Keep In Touch insanitybit on GitHub LinkedIn @InsanityBit on Twitter Picks Tobias Artemis Fowl book series by Eoin Colfer Artemis Fowl Movie Colin PyO3 Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Grapl Grapl Security SIEM == Security Information and Event Management Rapid7 Metasploit Insight IDR Erlang DGraph Splunk Elasticsearch AWS Lambda Sysdig Sysmon AWS CloudTrail Guard Duty OpenFaaS AWS SQS DynamoDB PyO3 Dropper Malware SSH Session Hijacking Vagrant The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/22/2020 • 53 minutes, 32 seconds

Simplified Data Extraction And Analysis For Current Events With Newspaper

Summary News media is an important source of information for understanding the context of the world. To make it easier to access and process the contents of news sites Lucas Ou-Yang built the Newspaper library that aids in automatic retrieval of articles and prepares them for analysis. In this episode he shares how the project got started, how it is implemented, and how you can get started with it today. He also discusses how recent improvements in the utility and ease of use of deep learning libraries open new possibilities for future iterations of the project. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. 
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Lucas Ou-Yang about Newspaper, a framework for easily extracting and processing online articles. Interview Introductions How did you get introduced to Python? Can you start by describing what the Newspaper project is and your motivations for creating it? What are the main use cases that Newspaper is built for? What are some libraries or tools that Newspaper might replace? What are the common structures in news sites that allow you to abstract across them for content extraction? What are some ways of determining whether a site will be a good candidate for using with Newspaper? Can you talk through the developer workflow of someone using Newspaper? What are some of the other libraries or tools that are commonly used alongside Newspaper? How is Newspaper implemented? How has the design of the project evolved since you first began working on it? What are some of the most complex or challenging aspects of building an automated article extraction tool? What are some of the most interesting, unexpected, or innovative projects that you have seen built with Newspaper? What keeps you interested in the ongoing support and maintenance of the project? What do you have planned for the future of Newspaper? Keep In Touch LinkedIn @LucasOuYang on Twitter Website codelucas on GitHub Picks Tobias Million Bazillion Podcast Lucas Hackers and Painters: Big Ideas from the Computer Age by Paul Graham Closing Announcements Thank you for listening! 
Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Newspaper Los Angeles Reddit Django NLP == Natural Language Processing Web Scraping Podcast Episode Requests Wintria Python Goose Diffbot Heuristics Stop Words RSS SpaCy Podcast Episode Gensim Podcast Episode PyTorch Podcast Episode NLTK LXML Beautiful Soup The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/15/2020 • 43 minutes, 27 seconds

Digging Into Dagster: An Opinionated Open Source Framework For Data Orchestration

Summary Data applications are complex and continually evolving, often requiring collaboration across multiple teams. In order to keep everyone on the same page a high level abstraction is needed to facilitate a cross-cutting view of the data orchestration across integration, transformation, analytics, and machine learning. Dagster is an innovative new framework that leans on the power and flexibility of Python to provide an extensible interface to the complete lifecycle of data projects. In this episode Nick Schrock explains how he designed the Dagster project to allow for integration with the entire data ecosystem while providing an opinionated structure for connecting the different stages of computation. He also discusses how he is working to grow an open ecosystem around the Dagster project, and his thoughts on building a sustainable business on top of it without compromising the integrity of the community. This was a great conversation about playing the long game when building a business while providing a valuable utility to a complex problem domain. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? 
Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at pythonpodcast.com/datadog. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Nick Schrock about Dagster, an open source data orchestrator for powering data engineering, analytics, and machine learning Interview Introductions How did you get introduced to Python? Can you start by describing what Dagster is and how it got started? What are the most common difficulties that organizations face when working with data projects? How does Dagster help in addressing those challenges? There are a number of workflow orchestration platforms, spanning a few generations of tooling. What do you see as the defining characteristics of the various options, and how does Dagster fit in that ecosystem? What are the assumptions that you made at the start of building Dagster and how have they been challenged, updated, or invalidated over the past year of working with end users? How are the internals of Dagster implemented? How has the design changed or evolved since you first began working on it? 
For someone who is building on top of Dagster, what is their workflow from first steps through to production? What are your guiding principles for designing the user facing API? What are the available extension points for Dagster? What was your reason for implementing Dagster as a Python framework? With the benefit of hindsight, would you make the same decision today? What are some of the most interesting, innovative, or unexpected ways that you have seen Dagster used? What are the most interesting, unexpected, or challenging lessons that you have learned while building Dagster and working to grow its ecosystem? When is Dagster the wrong choice? As you continue to build Dagster, what is your vision for it and its ecosystem? What are the next steps that you are taking to achieve that vision? Keep In Touch @schrockn on Twitter schrockn on GitHub LinkedIn Picks Tobias Caddy web server Nick Black code formatter Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Dagster Elementl IronPython Fluent Python GraphQL Maslow’s Hierarchy of Needs Hierarchy of Data Needs DAG == Directed Acyclic Graph Informatica Airflow Luigi Dagster Config Schema Dask Data Engineering Podcast Episode Coiled Episode gRPC MyPy Podcast Episode Data Lineage Pandas Podcast Episode Amundsen Podcast Episode DataHub Podcast Episode Gatsby.js Panama Papers Mode Analytics Podcast Episode Papermill Podcast Episode DBT Podcast Episode Databricks Tobias’ Dagster Repository The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/7/2020 • 59 minutes, 28 seconds

When, Why, and How To Use Web Scraping In A Nutshell

Summary The internet is a rich source of information, but a majority of it isn’t accessible programmatically through APIs or databases. To address that shortcoming there are a variety of web scraping frameworks that aid in extracting structured data from web pages. In this episode Attila Tóth shares the challenges of web data extraction, the ways that you can use it, and how Scrapy and ScrapingHub can help you with your projects. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. 
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Attila Tóth about doing data extraction with web scraping. Interview Introductions How did you get introduced to Python? Can you start by explaining what web scraping is and when you might want to use it? How did you first get started with web scraping? There are a number of options for web scraping tools in Python, as well as other languages. What are the characteristics of the Scrapy project and community that have made it stand out and retain such widespread popularity? One of the perpetual questions with web scraping is that of copyright and content ownership. What should we all be aware of when scraping a given website? What are some of the most challenging aspects of crawling and scraping the web? What are some of the features of Scrapy that aid in those challenges? Once you have retrieved the content from a site, what are some of the considerations for storing and processing the data that we should be thinking about? How can we guard against a scraper breaking due to changes in the layout of a site, or simple updates that weren’t accounted for in the initial implementation? What are some of the most complicated aspects of scaling web scrapers? For someone who is interested in using Scrapy, what are some of the common pitfalls that they should be aware of? What are some of the most interesting, innovative, or unexpected projects that are built with Scrapy and ScrapingHub? 
What are the most interesting, unexpected, or challenging lessons that you have learned while working with web scrapers and ScrapingHub? What resources would you recommend to anyone who is looking to learn more about web scraping? Keep In Touch LinkedIn Picks Tobias Gov’t Mule Attila Awesome Web Scraping Awesome Scrapy Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Web Scraping ScrapingHub Java Android Scrapy JSoup HTMLUnit Selenium Pandas robots.txt Puppeteer Splash The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/1/2020 • 41 minutes, 51 seconds

Working In The Code Mines: Mining Software Repositories With PyDriller

Summary A large portion of the software industry has standardized on Git as the version control system of choice. But have you thought about all of the information that you are generating with your branches, commits, and code changes? Davide Spadini created the PyDriller framework to simplify the work of mining software repositories to perform research on the technical and social aspects of software engineering. In this episode he shares some of the insights that you can gain by exploring the history of your code, the complexities of building a framework to interact with Git, and some of the interesting ways that PyDriller can be used to inform your own development practices. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! 
Your host as usual is Tobias Macey and today I’m interviewing Davide Spadini about PyDriller, a framework for mining software repositories Interview Introductions How did you get introduced to Python? Can you start by describing what PyDriller is and how the project got started? How is PyDriller different from other Git frameworks? What kinds of information can you discover by mining a software repository? Where and how might the collected information be used? What are the limitations of the capabilities offered by Git for investigating the repository? What are the additional metrics that you are able to extract using PyDriller? Can you describe how PyDriller itself is implemented? How has the project evolved since you first began working on it? I noticed that for testing PyDriller you crafted a set of repositories to serve as test cases. What has been the most complex or challenging aspect of writing meaningful tests to ensure a reasonable coverage of this problem domain? What would be required to add support for other version control systems? How have you used PyDriller in your own research? What are some of the most interesting, unexpected, or innovative ways that you have seen PyDriller used? What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on and with PyDriller? What do you have planned for the future of PyDriller? Keep In Touch Website ishepard on GitHub @DavideSpadini on Twitter Picks Tobias pre-commit Davide Fall Guys Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links PyDriller Delft Git GitPython PyGit2 RepoDriller Mining Software Repositories Conference Lizard Hadoop Mercurial Podcast Episode Subversion CVS Neo4J GraphRepo The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/25/2020 • 40 minutes, 3 seconds

Building The Open Data Ecosystem For Music And More At Metabrainz

Summary The Musicbrainz project was an early entry in the movement to build an open data ecosystem. In recent years, the Metabrainz Foundation has fostered a growing ecosystem of projects to support the contribution of, and access to, metadata, listening habits, and review of music. The majority of those projects are written in Python, and in this episode Param Singh explains how they are built, how they fit together, and how they support the goals of the Metabrainz Foundation. This was an interesting exploration of the work involved in building an ecosystem of open data, the challenges of making it sustainable, and the benefits of building for the long term rather than trying to achieve a quick win. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Before you put your code into production you need to make sure that it passes all of the tests, that it has been packaged with all of the dependencies, and that you haven’t introduced any security issues. Instead of running all of that on your laptop, let Codefresh handle it automatically with their continuous integration and continuous delivery platform. Built for the modern era of cloud-native computing, they make publishing to Kubernetes, serverless platforms, and virtual machines fast and seamless. 
With a growing library of pre-made steps, a flexible pipeline definition, and unlimited scale Codefresh lets you ship faster and safer than ever. Go to pythonpodcast.com/codefresh today to get unlimited builds on your free account. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Param Singh about the ways that Python is being used across the various Metabrainz projects Interview Introductions How did you get introduced to Python? Can you start by giving an overview of what the Metabrainz organization is and the various projects that it encompasses? What are the motivations for creating those projects and some of the origin story for Metabrainz? The Musicbrainz server is the longest running project and is written in Perl. What was the reason for switching to Python for all of the other *brainz projects? How does the MetaBrainz Foundation sustain itself? Where do the funds come from? How do you determine where and how to allocate the funding that you receive? Which of the *brainz projects is the most complex or challenging to build, whether due to technical or sociological reasons? How do you source and manage the information that powers all of the Metabrainz projects? How is development of the various projects organized? How does that influence the amount of code sharing that is possible between them? Of the projects that you have been involved in, how are they architected? What are the main ways that the projects differ in how they are implemented? 
What are some of the ways that you are using Python in support of the various projects that you work on? What are some of the most interesting, innovative, or unexpected ways that you have seen the projects or data built by Metabrainz being used? What are some of the most interesting, unexpected, or challenging lessons that you have learned while working as a contributor and maintainer of the Metabrainz projects? What is in store for the future of the existing Metabrainz projects? What are the next domains that are being considered for building a Metabrainz platform for? Keep In Touch LinkedIn paramsingh on GitHub Website Picks Tobias Beets music library organizer Podcast Episode Param Prateek Kuhad Links Metabrainz Musicbrainz Listenbrainz Acousticbrainz Bookbrainz Critiquebrainz Picard Stripe The Himalayas Dublin Ireland XKCD Import Antigravity Antigravity Python Module Last.fm Google Summer of Code CDDB Perl Flask SQLAlchemy 3rd anniversary cake Redis PostgreSQL RabbitMQ Spark Music Technology Group Splunk Artist Origins Map on ListenBrainz The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/17/2020 • 48 minutes, 6 seconds

Growing Dask To Make Scaling Python Data Science Easier At Coiled

Summary Python is a leading choice for data science due to the immense number of libraries and frameworks readily available to support it, but it is still difficult to scale. Dask is a framework designed to transparently run your data analysis across multiple CPU cores and multiple servers. Using Dask lifts a limitation for scaling your analytical workloads, but brings with it the complexity of server administration, deployment, and security. In this episode Matthew Rocklin and Hugo Bowne-Anderson discuss their recently formed company Coiled and how they are working to simplify the use and maintenance of Dask in production. They share the goals for the business, their approach to building a profitable company based on open source, and the difficulties they face while growing a new team during a global pandemic. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. 
Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Matthew Rocklin and Hugo Bowne-Anderson about their work building a business around the Dask ecosystem at Coiled Interview Introductions How did you get introduced to Python? Can you give a quick overview of what Dask is and your motivations for creating it? How has Dask changed or evolved in the past 3 1/2 years since we last talked about it? How has the rest of the ecosystem changed in that time? After working on Dask for the past few years, what led you to the decision to build a business around it? What are the sharp edges of programming for Dask that users are looking for help on solving? What are the difficulties that users face in deploying and maintaining a production installation of Dask? What are the limitations of Dask when scaling both up and down? What are you building at Coiled to improve the user experience for users of Python and Dask? What are your thoughts on the pros and cons of orienting your messaging around the scalability of Python, as opposed to focusing on a specific industry or problem domain? What are the challenges that you are facing in managing the tensions between the open source and proprietary work that you are doing? 
How are you handling the ongoing governance of the Dask project? What are some of the most interesting, unexpected, or challenging lessons that you have learned while building and launching a company based on an open source project? What do you have planned for the future of both Coiled and Dask? Keep In Touch Matt Website @mrocklin on Twitter mrocklin on GitHub Hugo LinkedIn @hugobowne on Twitter Website Picks Tobias The Hobbit Audiobook Audible Free Trial (affiliate link) Matt Prefect Hugo Race After Technology by Ruha Benjamin Ruha Benjamin on deep learning: Computational depth without sociological depth is ‘superficial learning’ Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Sign up for the Coiled Beta! 
Coiled Dask Data Engineering Podcast Interview About Dask PyData NumPy SciPy Cell Biology Datacamp Dataframed Matthew Rocklin on Podcast.__init__ about functional programming with Toolz IPython Notebook PyTorch Podcast Episode Airflow Prefect XGBoost Tornado Coiled Blog Post About The Goals of Dask Spark AsyncIO Concurrent.futures Pangeo Xarray RAPIDS Nvidia Cuda Prefect Data Engineering Podcast Episode Celery Life Sciences Tensorflow Snorkel Data Engineering Podcast Episode Dagster Data Engineering Podcast Episode DevOps Docker Kubernetes Metaflow Podcast Episode Ray Podcast Episode Anyscale Yarn Gartner Hype Cycle Travis Oliphant Postgres Amazon ECS Django Django Allauth Quansight Wes McKinney Podcast Interview Ursa Labs The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/10/2020 • 52 minutes, 7 seconds

Supporting The Full Lifecycle Of Machine Learning Projects With Metaflow

Summary Netflix uses machine learning to power every aspect of their business. To do this effectively they have had to build extensive expertise and tooling to support their engineers. In this episode Savin Goyal discusses the work that he and his team are doing on the open source machine learning operations platform Metaflow. He shares the inspiration for building an opinionated framework for the full lifecycle of machine learning projects, how it is implemented, and how they have designed it to be extensible to allow for easy adoption by users inside and outside of Netflix. This was a great conversation about the challenges of building machine learning projects and the work being done to make it more achievable. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. 
Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Savin Goyal about Netflix’s infrastructure for machine learning Interview Introductions How did you get introduced to Python? Can you start by describing the work you are doing at Netflix to support their machine learning workloads? How are you addressing the impedance mismatch of machine learning/data science work between local experimentation and production deployment? What was the motivation for building Metaflow? How does Metaflow compare to other tools in the ecosystem such as MLFlow? What was missing in the other available tools that made Metaflow necessary? workflow for someone using Metaflow How do you approach the design of the developer interface to make it approachable to machine learning engineers? level of coupling with overall Netflix data stack How is Metaflow implemented? How has the architecture and design of the system evolved since you first began working on it? supporting infrastructure/integration points motivation/benefits of releasing it as open source What are some of the most interesting, unexpected, or challenging lessons that you have learned while building infrastructure and tooling for machine learning? When is Metaflow the wrong choice? 
What do you have planned for the future of Metaflow and Keep In Touch LinkedIn @savingoyal on Twitter savingoyal on GitHub Picks Tobias vdist Savin Repairing Vintage Watches Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Metaflow OCaml EC2 S3 Data Lake PyTorch Tensorflow Netflix Data Stack Spinnaker Chaos Engineering Chaos Toolkit Podcast Episode Chaos Monkey Netflix Simian Army Netflix Titus AWS Batch Netflix Meson Dataflow Programming DAG == Directed Acyclic Graph MLFlow DVC (Data Version Control) Podcast Episode CML (Continuous Machine Learning) AWS Step Functions The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/4/2020 • 44 minutes, 45 seconds

Learning To Program By Building Tiny Python Projects

Summary One of the best methods for learning programming is to just build a project and see how things work first-hand. With that in mind, Ken Youens-Clark wrote a whole book of Tiny Python Projects that you can use to get started on your journey. In this episode he shares his inspiration for the book, his thoughts on the benefits of teaching testing principles and the use of linting and formatting tools, as well as the benefits of trying variations on a working program to see how it behaves. This was a great conversation about useful strategies for supporting new programmers in their efforts to learn a valuable skill. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. 
If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Ken Youens-Clark about his book Tiny Python Projects Interview Introductions How did you get introduced to Python? What is your goal with your book of Tiny Python Projects? What motivated you to start writing it? Who is the target audience that you wrote the book for? One of the notable aspects of the book is the fact that you introduce linting and testing in the first chapter. Why is that a useful subject for the first steps of someone getting started in Python? What are some of the problems that users experience if they are introduced to these tools after they have already established a set of habits? How did you approach the structure of the book to be approachable by newcomers to Python? What was your process for deciding on the scope of the information to include in the book? What are some of the challenges that you faced in identifying self-contained projects that could fit into a single chapter? As a book that is intended to serve as a learning resource, what was your process for soliciting feedback to determine if your tone and structure is effective in teaching the reader? What elements of the Python language and ecosystem did you consciously leave out to avoid overwhelming the readers? What are some of the most interesting, unexpected, or challenging lessons that you learned while working on the book? 
What are your thoughts on useful resources and next steps for readers who are interested in progressing in their use of Python? Keep In Touch kyclark on GitHub Website @kycl4rk on Twitter Picks Tobias Marvel Cinematic Universe Ken Parks & Recreation TV Show Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Tiny Python Projects University of Arizona BioInformatics Perl BioPython Podcast Episode Seq Podcast Episode Pytest Podcast Episode Windows Subsystem for Linux Pylint Podcast Episode YAPF Black Python Formatter Mad Libs Boolean Algebra Object Oriented Programming Delphi OmniGraffle Kent Beck Test Driven Development Clojure Regular Expression The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/28/2020 • 54 minutes, 59 seconds

Idiomatic Functional Programming With DRY Python

Summary Python is an intuitive and flexible language, but that versatility can also lead to problematic designs if you’re not careful. Nikita Sobolev is the CTO of Wemake Services where he works on open source projects that encourage clean coding practices and maintainable architectures. In this episode he discusses his work on the DRY Python set of libraries and how they provide an accessible interface to functional programming patterns while maintaining an idiomatic Python interface. He also shares the story behind the wemake Python styleguide plugin for Flake8 and the benefits of strict linting rules to engender good development habits. This was a great conversation about useful practices to build software that will be easy and fun to work on. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. 
Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Nikita Sobolev about his work with DRY Python and Wemake Services Interview Introductions How did you get introduced to Python? Can you start by sharing your overarching philosophies or design aesthetics for writing maintainable software? What is your process for starting a new project, beginning at the design phase? What are some of the challenges or shortcomings that you see in the "default" way that most developers write Python? What is DRY Python and how does it help in addressing those concerns? What was your motivation for creating these projects? There are a number of different projects that are being built under the DRY Python umbrella. Can you list the ones that are currently active and outline how they fit together? What are some of the initial challenges that newcomers to the DRY Python libraries encounter? How do you approach the design of the API and developer experience to make these development approaches more accessible? What have you seen in terms of real world impact on the maintainability and extensibility of projects that you have built on top of the DRY Python components? In addition to DRY Python you are also involved with development of the wemake-python-styleguide. Can you describe that project’s goal and how it got started? 
If you make the linting too restrictive then developers are likely to just ignore or disable it. What have you found to be the right balance to which rules will fail a build and which are just informational? Why do you push the responsibility for things like formatting onto the developer, rather than an autoformatter such as YAPF or Black? What are some of the other supporting technologies that you rely on during your development workflow? What are some of the elements that you think are missing in the common toolbox for Python developers? What tools are we lacking entirely? What are the cases where DRY Python is the wrong choice? What are your goals and plans for the future of DRY Python and the various Wemake libraries? Keep In Touch Blog sobolevn on GitHub Picks Tobias The Map To Everywhere Nikita Russian Python Week Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links DRY Python Wemake Services wemake-python-styleguide Turbogears 2 Dotenv Linter Returns Wemake Python Package Cookiecutter Template Test Driven Development Requirements Analysis RESTs Django Rest Framework Classes Monads Functors Scala Kotlin Haskell Punq dependency injection library Flake8 Wemake Django Template Flake8 Baseline isort Nitpick Mypy Darglint Poetry Pip Dependency Resolver Podcast Episode Hypothesis Podcast Episode Schemathesis Pytest Auto Hypothesis Typescript Rust Elixir Zio Scala GitHub Sponsors Do Not Log blog post The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/21/2020 • 47 minutes, 42 seconds

The Past, Present, And Future Of The FLUFL: Barry Warsaw Shares His History With Python

Summary Barry Warsaw has been a member of the Python community since the very beginning. His contributions to the growth of the language and its ecosystem are innumerable and diverse, earning him the title of Friendly Language Uncle For Life. In this episode he reminisces on his experiences as a core developer, a member of the Python Steering Council, and his roles at Canonical and LinkedIn supporting the use of Python at those companies. In order to know where you are going it is always important to understand where you have been and this was a great conversation to get a sense of the history of how Python has gotten to where it is today. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This episode of Python Podcast is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. 
If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Barry Warsaw about his role in the Python community, past, present, and future. Interview Introductions How did you get introduced to Python? For anyone who isn’t familiar with you, how would you characterize your role in the Python language and community? What have been your main areas of focus in your role as a core developer? What are some of the other forms that your contributions to the language and community have taken? What are the contributions to Python that you are most proud of? Looking back at the past 25 years of Python, what do you find most interesting/surprising/exciting? How has the focus of the community changed or evolved since you first began using it? What are you currently focused on in your role in the steering council? What are the aspects of the language and community that you think need greater attention? What are the core strengths of the language and community that you believe will carry it through the next 25 years? In your current and previous roles you acted as a guiding force for Python. What are the main use cases for Python at LinkedIn? What kinds of projects are you involved with to support the other engineers in their use of Python? How much of an impact has the invisible hand of the PSU had on the overall trajectory of Python? 
Outside of Python, what are the programming languages or communities that you look to for inspiration? What are your personal goals for the future of Python? Keep In Touch Website warsaw on GitHub warsaw on GitLab Blog @pumpichank on Twitter Picks Tobias Hanna TV Series Barry Midnight Gospel The Expanse TV Series Audio Books Free 30 Day Audible Trial (Affiliate Link) Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links FLUFL PEP 401 Python Steering Council The PEP Talk episode Usenet BBS == Bulletin Board System comp.lang.python NIST == National Institute of Standards and Technology CNRI == Corporation for National Research Initiatives BayPIGgies Tcl/Tk PEP 572 := The Walrus Operator "The Grand Renaming" IETF == Internet Engineering Task Force RFC WebAssembly Python Software Foundation Podcast Episode Python Black Swans keynote by Russell Keith-Magee Followup Podcast Episode Ewa Jodlowska Canonical Launchpad Mypy Podcast Episode Python Type Annotations Iris Event Paging System OnCall Pager Rotation System Shiv PyOxidizer Rust Flake8 isort Black Sphinx Read The Docs Podcast Episode Sybil Manuel Doctest Pytest Coverage.py Cargo package system Tai Chi Python Core Mentorship The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/13/2020 • 51 minutes, 40 seconds

Pure Python Configuration Management With PyInfra

Summary Building and managing servers is a challenging task. Configuration management tools provide a framework for handling the various tasks involved, but many of them require learning a specific syntax and toolchain. PyInfra is a configuration management framework that embraces the familiarity of Pure Python, allowing you to build your own integrations easily and package it all up using the same tools that you rely on for your applications. In this episode Nick Barrett explains why he built it, how it is implemented, and the ways that you can start using it today. He also shares his vision for the future of the project and how you can get involved. If you are tired of writing mountains of YAML to set up your servers then give PyInfra a try today. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! This portion of Podcast.__init__ is brought to you by Datadog. Do you have an app in production that is slower than you like? Is its performance all over the place (sometimes fast, sometimes slow)? Do you know why? With Datadog, you will. You can troubleshoot your app’s performance with Datadog’s end-to-end tracing and in one click correlate those Python traces with related logs and metrics. Use their detailed flame graphs to identify bottlenecks and latency in that app of yours. 
Start tracking the performance of your apps with a free trial at datadog.com/pythonpodcast. If you sign up for a trial and install the agent, Datadog will send you a free t-shirt. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host as usual is Tobias Macey and today I’m interviewing Nick Barrett about PyInfra, a pure Python framework for agentless configuration management Interview Introductions How did you get introduced to Python? Can you start by describing what PyInfra is and its origin story? There are a number of options for configuration management of various levels of complexity and language options. What are the features of PyInfra that might lead someone to choose it over other systems? What do you see as the major pain points in dealing with infrastructure today? For someone who is using PyInfra to manage their servers, what is the workflow for building and testing deployments? How do you handle enforcement of idempotency in the operations being performed? Can you describe how PyInfra is implemented? How has its design or focus evolved since you first began working on it? What are some of the initial assumptions that you had at the outset which have been challenged or updated as it has grown? The library of available operations seems to have a good baseline for deploying and managing services. What is involved in extending or adding operations to PyInfra? 
With the focus of the project being on its use of pure Python and the easy integration of external libraries, how do you handle execution of Python functions on remote hosts that require external dependencies? What are some of the other options for interfacing with or extending PyInfra? What are some of the edge cases or points of confusion that users of PyInfra should be aware of? What has been the community response from developers who first encounter and trial PyInfra? What have you found to be the most interesting, unexpected, or challenging aspects of building and maintaining PyInfra? When is PyInfra the wrong choice for managing infrastructure? What do you have planned for the future of the project? Keep In Touch Fizzadar on GitHub Website @Fizzadar on Twitter LinkedIn Picks Tobias My Spy Nick Das Keyboard Ultimate Korean Short Ribs Kimchi Fried Rice Links PyInfra Oxygem WordPress Lua Garry’s Mod Java Ansible SaltStack Chef Puppet EC2 Boto 3 Hashicorp Vault Vagrant Docker Testinfra SaltStack Testinfra Plugin Dockerfile Idempotence Nginx POSIX gevent Jinja2 Click Zero Tier BSD AST Module RedBaron The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/6/2020 • 43 minutes, 8 seconds

Build Your Own Domain Specific Language in Python With textX

Summary Programming languages are a powerful tool and can be used to create all manner of applications, however sometimes their syntax is more cumbersome than necessary. For some industries or subject areas there is already an agreed upon set of concepts that can be used to express your logic. For those cases you can create a Domain Specific Language, or DSL to make it easier to write programs that can express the necessary logic with a custom syntax. In this episode Igor Dejanović shares his work on textX and how you can use it to build your own DSLs with Python. He explains his motivations for creating it, how it compares to other tools in the Python ecosystem for building parsers, and how you can use it to build your own custom languages. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to pythonpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! 
Your host as usual is Tobias Macey and today I’m interviewing Igor Dejanović about textX, a meta-language for building domain specific languages in Python Interview Introductions How did you get introduced to Python? Can you start by describing what a domain specific language is and some examples of when you might need one? What is textX and what was your motivation for creating it? There are a number of other libraries in the Python ecosystem for building parsers, and for creating DSLs. What are the features of textX that might lead someone to choose it over the other options? What are some of the challenges that face language designers when constructing the syntax of their DSL? Beyond being able to parse and process an arbitrary syntax, there are other concerns for consumers of the definition in terms of tooling. How does textX provide support to those end users? How is textX implemented? How has the design or goals of textX changed since you first began working on it? What is the workflow for someone using textX to build their own DSL? Once they have defined the grammar, how do they distribute the generated interpreter for others to use? What are some of the common challenges that users of textX face when trying to define their DSL? What are some of the cases where a PEG parser is unable to unambiguously process a defined grammar? What are some of the most interesting/innovative/unexpected ways that you have seen textX used? What have you found to be the most interesting, unexpected, or challenging lessons that you have learned while building and maintaining textX and its associated projects? While preparing for this interview I noticed that you have another parser library in the form of Parglare. How has your experience working with textX informed your designs of that project? What lessons have you taken back from Parglare into textX? 
When is textX the wrong choice, and someone might be better served by another DSL library, different style of parser, or just hand-crafting a simple parser with a regex? What do you have planned for the future of textX? Keep In Touch Website igordejanovic on GitHub @dejanovicigor on Twitter Picks Tobias wemake-python-styleguide Igor Interactive Fiction genre Awesome Interactive Fiction The Interactive Fiction Database TADS Inform 7 Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links textX U of Novi Sad Serbia DSL course Secondary Notation Django Xtext Eclipse PLY SLY PyParsing Lark PEG Grammar Language Workbench Language Server Protocol Visual Studio Code textX-LS Arpeggio Parser Context-Free Grammar pyTabs Guitar Tablatures Parglare GLR parsing TEP 1 Evennia Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/30/2020 • 54 minutes, 18 seconds

Adding Observability To Your Python Applications With OpenTelemetry

Summary Once you release an application into production it can be difficult to understand all of the ways that it is interacting with the systems that it integrates with. The OpenTelemetry project and its accompanying ecosystem of technologies aim to make observability of your systems more accessible. In this episode Austin Parker and Alex Boten explain how the correlation of tracing and metrics collection improves visibility of how your software is behaving, how you can use the Python SDK to automatically instrument your applications, and their vision for the future of observability as the OpenTelemetry standard gains broader adoption. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Austin Parker and Alex Boten about the OpenTelemetry project and its efforts to standardize the collection and analysis of observability data for your applications Interview Introductions How did you get introduced to Python? Can you start by describing what OpenTelemetry is and some of the story behind it? How do you define observability and in what ways is it separate from the "traditional" approach to monitoring? What are the goals of the OpenTelemetry project? 
For someone who wants to begin using OpenTelemetry clients in their Python application, what is the process of integrating it into their application? How does the definition and adoption of a cross-language standard for telemetry data benefit the broader software community? How do you avoid the trap of limiting the whole ecosystem to the lowest common denominator? What types of information are you focused on collecting and analyzing to gain insights into the behavior of applications and systems? What are some of the challenges that are commonly faced in interpreting the collected data? With so many implementations of the specification, how are you addressing issues of feature parity? For the Python SDK, how is it implemented? What are some of the initial designs or assumptions that have had to be revised or reconsidered as it gains adoption? What is your approach to integration with the broader ecosystem of tools and frameworks in the Python community? What are some of the interesting or unexpected challenges that you have faced or lessons that you have learned while working on instrumentation of Python projects? Once an application is instrumented, what are the options for delivering and storing the collected data? What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on and with the OpenTelemetry ecosystem? What are some of the most interesting, innovative, or unexpected ways that you have seen components in the OpenTelemetry ecosystem used? When is OpenTelemetry the wrong choice? What is in store for the future of the OpenTelemetry project? Keep In Touch Austin @austinlparker on Twitter austinlparker on GitHub Alex LinkedIn @codeboten on Twitter codeboten on GitHub Picks Tobias Pulumi Podcast Episode Austin Helm 3 Alex Algorithms To Live By: The Computer Science Of Everyday Decisions by Brian Christian and Tom Griffiths Closing Announcements Thank you for listening! 
Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links OpenTelemetry Lightstep OpenTracing OpenCensus Distributed Tracing Jaeger Zipkin Observability Kubernetes Spring Flask gRPC Structlog Filebeat W3C Trace Context OpenTelemetry Python SDK OpenTelemetry Django OpenTelemetry Flask OpenTelemetry Collector OTLP == Open Telemetry Protocol The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/23/2020 • 53 minutes, 44 seconds

Build A Personal Knowledge Store With Topic Modeling In Contextualize

Summary Our thought patterns are rarely linear or hierarchical, instead following threads of related topics in unpredictable directions. Topic modeling is an approach to knowledge management which allows for forming a graph of associations to make capturing and organizing your thoughts more natural. In this episode Brett Kromkamp shares his work on the Contextualize project and how you can use it for building your own topic models. He explains why he wrote a new topic modeling engine, how it is architected, and how it compares to other systems for organizing information. Once you are done listening you can take Contextualize for a test run for free with his hosted instance. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Brett Kromkamp about Contextualise, a topic modeling application that helps you build a mind map for information-heavy projects Interview Introductions How did you get introduced to Python? Can you start by describing what Contextualize is and some of the types of projects that it can be used for? What was your motivation for creating it? How do you use topic maps in your own work and creative endeavors? The space of personal note-taking and knowledge management is vast and varied. 
What does Contextualize do well that you have been unable to find or implement in other tools? For someone using Contextualize, what does that workflow look like? How are you approaching integration with different creative contexts (e.g. text editors, graphics editors, word processing, etc.)? Can you describe how Contextualize is implemented? How has the design evolved since you first began working on it? In the documentation for Contextualize it mentions that this is the latest in a string of topic mapping platforms that you have built. What are some of the lessons that you have learned from previous efforts that have influenced the design of this one? One of the challenges with many knowledge management tools is that they are prescriptive in how to work with them. In what ways has your own preference for how to interact with information influenced the direction of Contextualize? Being an open source application, how has its exposure to the public directed your software and user design? How do you approach the challenge of reducing friction in adding content and relations while allowing for flexibility and context management? What are some of the projects that you are using Contextualize for? What are your thoughts on the utility of something like Contextualize for capturing and organizing the collective knowledge of a team of collaborators, whether in a work or casual context? What have you found to be the most interesting, complex, or complicated aspects of building a topic mapping platform? When is Contextualize the wrong choice? What do you have planned for the future of the project? Keep In Touch Website @brettkromkamp on Twitter brettkromkamp on GitHub Picks Tobias Pydantic Podcast Episode MyPy Podcast Episode Brett Black Lives Matter Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. 
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Contextualise GitHub Repository Norway IBM Rexx Java Semantic Web Topic Map ISO standard for topic maps RDF Spain Knowledge Management Graph Database Worldbuilding Roam Research TopicDB Twitter Bootstrap Hypergraph Digital Gardening Notion TiddlyWiki The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/15/2020 • 58 minutes, 6 seconds

Open Source Product Analytics With PostHog

Summary You spend a lot of time and energy on building a great application, but do you know how it’s actually being used? Using a product analytics tool lets you gain visibility into what your users find helpful so that you can prioritize feature development and optimize customer experience. In this episode PostHog CTO Tim Glaser shares his experience building an open source product analytics platform to make it easier and more accessible to understand your product. He shares the story of how and why PostHog was created, how to incorporate it into your projects, the benefits of providing it as open source, and how it is implemented. If you are tired of fighting with your user analytics tools, or unwilling to entrust your data to a third party, then have a listen and then test out PostHog for yourself. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! You listen to this show because you love Python and want to keep your skills up to date, and machine learning is finding its way into every aspect of software engineering. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. 
In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to pythonpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll. Your host as usual is Tobias Macey and today I’m interviewing Tim Glaser about PostHog, an open source platform for product analytics Interview Introductions How did you get introduced to Python? Can you start by describing what PostHog is and what motivated you to build it? What are the goals of PostHog and who are the target audience? In the description of PostHog it mentions being a product focused analytics platform, as opposed to session based. What are the meaningful differences between the two? Customer analytics is a rather crowded market, with a large number of both commercial and open source offerings (e.g. Google Analytics, Heap, Matomo, Snowplow, etc.). How does PostHog fit in that landscape and what are the differentiating factors that would lead someone to select it over the alternatives? For anyone interested in using PostHog, do you offer a migration path from other platforms? necessary features for a customer analytics tool privacy and security issues around analytics How is PostHog implemented and how has its design evolved since you first began building it? 
reason for choosing Python benefits of Django thoughts on introducing Channels option to include it as a pluggable Django app integration points data lake integration challenges of providing understandable statistics and exposing options for detailed analysis Having data about how users are interacting with your site or application is interesting, but how does it help in determining the useful actions to drive success? business model and project governance What are the most complex, complicated, or misunderstood aspects of building a product analytics platform? What have you found to be the most interesting, unexpected, or challenging lessons that you have learned in the process of building PostHog? When is PostHog the wrong choice? What do you have planned for the future of PostHog? Keep In Touch timgl on GitHub LinkedIn @timgl on Twitter Picks Tobias Hitchhiker’s Guide To The Galaxy Tim Triumph Of The City by Edward Glaeser Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links PostHog MixPanel Amplitude Heap Data Engineering Podcast Episode Snowplow Data Engineering Podcast Episode Looker Data Engineering Podcast Episode SnowflakeDB Data Engineering Podcast Episode Tableau DOM == Document Object Model for web pages Django Django Rest Framework React.js Kea state management for React.js Redux TypeScript Django Stubs Django Channels Sentry Podcast Episode Pluggable Django App PostgreSQL ELT Data Lake Optimizely Feature Flags Podcast Episode PostHog Roadmap PostHog Employee Handbook Matomo (formerly Piwik) The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/8/2020 • 49 minutes, 8 seconds

Extending The Life Of Python 2 Projects With Tauthon

Summary The divide between Python 2 and 3 lasted a long time, and in recent years all of the new features were added to version 3. To help bridge the gap and extend the viability of version 2 Naftali Harris created Tauthon, a fork of Python 2 that backports features from Python 3. In this episode he explains his motivation for creating it, the process of maintaining it and backporting features, and the ways that it is being used by developers who are unable to make the leap. This was an interesting look at how things might have been if the elusive Python 2.8 had been created as a more gentle transition. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! You listen to this show because you love Python and want to keep your skills up to date, and machine learning is finding its way into every aspect of software engineering. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. 
You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to pythonpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll. Your host as usual is Tobias Macey and today I’m interviewing Naftali Harris about his work on Tauthon, a fork of Python 2 that backports features from Python 3 Interview Introductions How did you get introduced to Python? Can you start by describing what Tauthon is and your motivations for creating it? What’s the story behind the name? What types of applications and environments are you using Tauthon in? How much adoption of Tauthon have you seen? What are some of the different ways that your users are employing it? Is this the missing "2.8" release? In other words, is this intended to be a bridge for simplifying the migration of existing Python 2 code to Python 3, or as an extended support window for Python 2? What features have you backported from Python 3? What is your process for identifying and prioritizing features to bring into Tauthon? What is your workflow for implementing the backported functionality in Tauthon? What are some of the cases where you have had to compromise on the functionality or syntax of a feature that you have backported in order to fit into Python 2? What is your governing philosophy for how to manage syntax or behavior differences between Python 2 and 3? What have been the most challenging features to backport and maintain? What are some of the ways that Tauthon might break existing Python 2 code? 
What is the story for compatibility with libraries that are Python 3 only? What have you seen in terms of adoption of Tauthon? Do you have any sense of the commonalities among those users? What are some of the ecosystem challenges that face users of Tauthon? (e.g. Pip support, package compatibility, etc.) What are some of the most interesting, unexpected, or challenging lessons that you have learned in the process of creating and maintaining Tauthon? What are your long-term plans for Tauthon, and how have they changed since you first started working on it? Keep In Touch Website @naftaliharris on Twitter naftaliharris on GitHub Picks Tobias Dagster PyCon 2020 Online Naftali Sentilink Timsort Tim Peters Links Tauthon Function Annotations Tau Nick Coghlan MyPy Podcast Episode Matrix Multiplier Operator Python 3.9 PEG Parser lazysorted nonlocal keyword Valgrind The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/2/2020 • 33 minutes, 7 seconds

Dependency Management Improvements In Pip's Resolver

Summary Dependency management in Python has taken a long and winding path, which has led to the current dominance of Pip. One of the remaining shortcomings is the lack of a robust mechanism for resolving the package and version constraints that are necessary to produce a working system. Thankfully, the Python Software Foundation has funded an effort to upgrade the dependency resolution algorithm and user experience of Pip. In this episode the engineers working on these improvements, Pradyun Gedam, Tzu-Ping Chung, and Paul Moore, discuss the history of Pip, the challenges of dependency management in Python, and the benefits that surrounding projects will gain from a more robust resolution algorithm. This is an exciting development for the Python ecosystem, so listen now and then provide feedback on how the new resolver is working for you. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show because you love Python and want to keep your skills up to date, and machine learning is finding its way into every aspect of software engineering. 
Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to pythonpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll. Your host as usual is Tobias Macey and today I’m interviewing Tzu-ping Chung, Pradyun Gedam, and Paul Moore about their work to improve the dependency resolution capabilities of Pip and its user experience Interview Introductions How did you get introduced to Python? Can you start by describing the focus of the work that you are doing? What is the scope of the work, and what is the established criteria for when it is considered complete? What is your history with working on the Pip source code and what interests you most about this project? What are the main sources or manifestations of technical debt that exist in Pip as of today? How does it currently handle dependency resolution? What are some of the workarounds that developers have had to resort to in the absence of a robust dependency resolver in Pip? How is the new dependency resolver implemented? How has your initial design evolved or shifted as you have gotten further along in its implementation? 
What are the pieces of information that the resolver will rely on for determining which packages and versions to install? (e.g. will it install setuptools > 45.x in a Python 2 virtualenv?) What are the new capabilities in Pip that will be enabled by this upgrade to the dependency resolver? What projects or features in the encompassing ecosystem will be unblocked with the introduction of this upgrade? What are some of the changes that users will need to make to adopt the updated Pip? How do you anticipate the changes in Pip impacting the viability or adoption of Python and its ecosystem within different communities or industries? What are some of the additional changes or improvements that you would like to see in Pip or other core elements of the Python landscape? What are some of the most interesting, unexpected, or challenging lessons that you have learned while working on these updates to Pip? Keep In Touch Pradyun Website pradyunsg on GitHub @pradyunsg on Twitter Paul pfmoore on GitHub Tzu-Ping uranusjr on GitHub Website @uranusjr on Twitter Picks Tzu-ping Python Launcher Joe Abercrombie author The Shattered Sea Trilogy Anime PipX Standalone Paul pipx Black nox tox scoop Neil Gaiman Good Omens Book TV Series Pradyun because my picks can be anything: things that have kept me sane in this lockdown world Music: Chris Daughtry Video Game: Parkitect Tobias Language Server Protocol Emacs lsp-mode Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Pip Podcast interview with Donald Stufft Macdown Taiwan Pipenv PyPI Podcast Episode TOML Python Package Metadata Standards iBook G4 Acorn Computer distutils easy_install Python Eggs setuptools Python Wheels CPAN Conda Inside The Cheeseshop Google Summer of Code Zazo PEP517 pip-tools Poetry resolvelib SAT Solver Trove Classifiers PyPA pyproject.toml The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
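For readers new to the problem this episode covers, here is a minimal sketch of dependency resolution by backtracking. It is a toy illustration, not Pip's resolver or the resolvelib API: the `INDEX` data, the package names, and the `resolve` function are all hypothetical, and versions are plain integers to keep the example short.

```python
# Toy backtracking resolver. Everything here is hypothetical illustration;
# it does not use or mirror Pip's or resolvelib's real interfaces.

# Hypothetical in-memory index:
# package -> {version: [(dependency name, set of acceptable versions)]}
INDEX = {
    "app": {1: [("lib", {1, 2}), ("util", {2})]},
    "lib": {1: [], 2: [("util", {1})]},
    "util": {1: [], 2: []},
}


def resolve(requirements, pinned=None):
    """Return a {package: version} dict satisfying every requirement, or None."""
    pinned = dict(pinned or {})
    if not requirements:
        return pinned  # nothing left to satisfy
    (name, allowed), rest = requirements[0], requirements[1:]
    if name in pinned:
        # A version was already chosen; it must also satisfy this constraint.
        return resolve(rest, pinned) if pinned[name] in allowed else None
    # Try the newest acceptable version first and back out on conflict.
    for version in sorted(allowed & set(INDEX[name]), reverse=True):
        result = resolve(rest + INDEX[name][version], {**pinned, name: version})
        if result is not None:
            return result
    return None  # no candidate worked; the caller will try its next option


print(resolve([("app", {1})]))
```

In this data the newest `lib` (version 2) needs `util` 1, which conflicts with the `util` 2 that `app` requires, so the sketch backs out and settles on `lib` 1. An installer that picks the first matching version without revisiting earlier choices has no way to recover from that kind of conflict.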

5/25/2020 • 1 hour, 16 minutes, 31 seconds

Easy Data Validation For Your Python Projects With Pydantic

Summary One of the most common causes of bugs is incorrect data being passed throughout your program. Pydantic is a library that provides runtime checking and validation of the information that you rely on in your code. In this episode Samuel Colvin explains why he created it, the interesting and useful ways that it can be used, and how to integrate it into your own projects. If you are tired of unhelpful errors due to bad data then listen now and try it out today. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show because you love Python and want to keep your skills up to date. Machine learning is finding its way into every aspect of software engineering. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. 
You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. Podcast.__init__ is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to pythonpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll. Your host as usual is Tobias Macey and today I’m interviewing Samuel Colvin about Pydantic, a library for enforcing type hints at runtime Interview Introductions How did you get introduced to Python? Can you start by describing what Pydantic is and what motivated you to create it? What are the main use cases that benefit from Pydantic? There are a number of libraries in the Python ecosystem to handle various conventions or "best practices" for settings management. How does pydantic fit in that category and why might someone choose to use it over the other options? There are also a number of libraries for defining data schemas or validation such as Marshmallow and Cerberus. How does Pydantic compare to the available options for those cases? What are some of the challenges, whether technical or conceptual, that you face in building a library to address both of these areas? The 3.7 release of Python added built in support for dataclasses as a means of building containers for data with type validation. What are the tradeoffs of pydantic vs the built in dataclass functionality? How much overhead does pydantic add for doing runtime validation of the modelled data? In the documentation there is a nuanced point that you make about parsing vs validation and your choices as to what to support in pydantic. Why is that a necessary distinction to make? 
What are the limitations in terms of usage that you are accepting by choosing to allow for implicit conversion or potentially silent loss of precision in the parsed data? What are the benefits of punting on the strict validation of data out of the box? What has been your design philosophy for constructing the user facing API? How is Pydantic implemented and how has the overall architecture evolved since you first began working on it? What have you found to be the most challenging aspects of building a library for managing the consistency of data structures in a dynamic language? What are some of the strengths and weaknesses of Python’s type system? What is the workflow for a developer who is using Pydantic in their code? What are some of the pitfalls or edge cases that they might run into? What is involved in integrating with other libraries/frameworks such as Django for web development or Dagster for building data pipelines? What are some of the more advanced capabilities or use cases of Pydantic that are less obvious? What are some of the features or capabilities of Pydantic that are often overlooked which you think should be used more frequently? What are some of the most interesting, innovative, or unexpected ways that you have seen Pydantic used? What are some of the most interesting, challenging, or unexpected lessons that you have learned through your work on or with Pydantic? When is Pydantic the wrong choice? What do you have planned for the future of the project? Keep In Touch samuelcolvin on GitHub Website LinkedIn @samuel_colvin on Twitter Picks Tobias Devil Sticks Samuel Flash Boys by Michael Lewis Algorithms To Live By by Brian Christian and Tom Griffiths NGrok.com Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. 
If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Pydantic Matlab C# FastAPI Podcast Episode Marshmallow Podcast Episode Cerberus 12 Factor App Django Python Type Hints Cython Podcast Episode MyPy Podcast Episode Duck Typing Haskell Higher Order Types PyCharm Pydantic Plugin Django Rest Framework Avro Parquet Dagster Data Engineering Podcast Episode Starlette Flask Ludwig Deep Pavlov Fast MRI Reagent Pynt Open Source Has Failed article The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
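The interview questions above contrast parsing with validation and mention implicit conversion and silent loss of precision. The sketch below illustrates those ideas using only the standard library. It does not use Pydantic and is not how Pydantic is implemented; the class names `Plain`, `Coerced`, and `Strict` are made up for this example.

```python
from dataclasses import dataclass, fields


@dataclass
class Plain:
    # A standard dataclass records the annotation but does not check it.
    port: int


@dataclass
class Coerced:
    # "Parsing": convert whatever arrives into the annotated type.
    port: int

    def __post_init__(self):
        for f in fields(self):
            # f.type is the annotation itself (here the class int),
            # because this module does not use postponed annotations.
            setattr(self, f.name, f.type(getattr(self, f.name)))


@dataclass
class Strict:
    # "Validation": reject anything that is not already the annotated type.
    port: int

    def __post_init__(self):
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(f"{f.name} must be {f.type.__name__}")


print(Plain(port="8080"))    # the string is stored unchanged
print(Coerced(port="8080"))  # the string is converted to an int
print(Coerced(port=3.7))     # the float is truncated to 3
```

The last call is the tradeoff in miniature: coercion accepts more inputs, but `3.7` silently becomes `3`, while `Strict(port="8080")` raises a `TypeError` for input that the coercing version handles without complaint.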

5/18/2020 • 47 minutes, 14 seconds

Managing Distributed Teams In The Age Of Remote Work

Summary More of us are working remotely than ever before, many with no prior experience with a remote work environment. In this episode Quinn Slack discusses his thoughts and experience of running Sourcegraph as a fully distributed company. He covers the lessons that he has learned in moving from partially to fully remote, the practices that have worked well in managing a distributed workforce, and the challenges that he has faced in the process. If you are struggling with your remote work situation then this conversation has some useful tips and references for further reading to help you be successful in the current environment. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You monitor your website to make sure that you’re the first to know when something goes wrong, but what about your data? Tidy Data is the DataOps monitoring platform that you’ve been missing. With real time alerts for problems in your databases, ETL pipelines, or data warehouse, and integrations with Slack, Pagerduty, and custom webhooks you can fix the errors before they become a problem. Go to pythonpodcast.com/tidydata today and get started for free with no credit card required. 
Your host as usual is Tobias Macey and today I’m interviewing Quinn Slack about his experience managing a fully remote company and useful tips for remote work Interview Introductions How did you get introduced to Python? Can you start by giving an overview of the team structure at Sourcegraph? You recently moved to being fully remote. What was the motivating factor and how has it changed your personal workflow? What is your prior history with working remote? team practices for visibility of progress impact of remote teams on how code is written and organized reducing review burden by writing clearer code structuring meetings when remote points of friction for remote developer teams benefits of being fully remote incentivizing documentation compensation structure Keep In Touch LinkedIn @sqs on Twitter sqs on GitHub Picks Tobias Joplin App Quinn Skunkworks by Ben Rich Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Sourcegraph Quinn’s Python Search Engine Sourcegraph Employee Handbook Gitlab Gitlab Handbook Zapier Zapier Guide To Remote Work Automattic Automattic Blog On Distributed Work Comments Showing Intent The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/11/2020 • 48 minutes, 45 seconds

Maintainable Infrastructure As Code In Pure Python With Pulumi

Summary After you write your application, you need a way to make it available to your users. These days, that usually means deploying it to a cloud provider, whether that’s a virtual server, a serverless platform, or a Kubernetes cluster. To manage the increasingly dynamic and flexible options for running software in production, we have turned to building infrastructure as code. Pulumi is an open source framework that lets you use your favorite language to build scalable and maintainable systems out of cloud infrastructure. In this episode Luke Hoban, CTO of Pulumi, explains how it differs from other frameworks for interacting with infrastructure platforms, the benefits of using a full programming language for treating infrastructure as code, and how you can get started with it today. If you are getting frustrated with switching contexts when working between the application you are building and the systems that it runs on, then listen now and then give Pulumi a try. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You monitor your website to make sure that you’re the first to know when something goes wrong, but what about your data? Tidy Data is the DataOps monitoring platform that you’ve been missing. 
With real time alerts for problems in your databases, ETL pipelines, or data warehouse, and integrations with Slack, Pagerduty, and custom webhooks you can fix the errors before they become a problem. Go to pythonpodcast.com/tidydata today and get started for free with no credit card required. Your host as usual is Tobias Macey and today I’m interviewing Luke Hoban about building and maintaining infrastructure as code with Pulumi Interview Introductions How did you get introduced to Python? Can you start by describing the concept of "infrastructure as code"? What is Pulumi and what is the story behind it? Where does the name come from? How does Pulumi compare to other infrastructure as code frameworks, such as Terraform? What are some of the common challenges in managing infrastructure as code? How does use of a full programming language help in addressing those challenges? What are some of the dangers of using a full language to manage infrastructure? How does Pulumi work to avoid those dangers? Why is maintaining a record of the provisioned state of your infrastructure necessary, as opposed to relying on the state contained by the infrastructure provider? What are some of the design principles and constraints that developers should be considering as they architect their infrastructure with Pulumi? Can you describe how Pulumi is implemented? How does Pulumi manage support for multiple languages while maintaining feature parity across them? How do you manage testing and validation of the different providers? The strength of any tool is largely measured in the ecosystem that exists around it, which is one of the reasons that Terraform has been so successful. How are you approaching the problem of bootstrapping the community and prioritizing platform support? Can you talk through the workflow of working with Pulumi to build and maintain a proper infrastructure? What are some of the ways to approach testing of infrastructure code? 
What does the CI/CD lifecycle for infrastructure look like? What are the limitations of infrastructure as code? How do configuration management tools fit with frameworks such as Pulumi? The core framework of Pulumi is open source, and your business model is focused around a managed platform for tracking state. How are you approaching governance of the project to ensure its continued viability and growth? What are some of the most interesting, innovative, or unexpected design patterns that you have seen your users include in their infrastructure projects? When is Pulumi the wrong choice? What do you have planned for the future of Pulumi? Keep In Touch LinkedIn lukehoban on GitHub @lukehoban on Twitter Picks Tobias Bookshelf App Luke GoBinaries.com Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Pulumi Terraform IronPython HCL == Hashicorp Config Language Kubernetes TypeScript DevOps CloudFormation ARM == Azure Resource Manager AWSx GCP == Google Cloud Platform Pulumi SaaS SaltStack Podcast Episode Ansible Elastic Beanstalk The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/4/2020 • 1 hour, 54 seconds

Teaching Python Machine Learning

Summary Python has become a major player in the machine learning industry, with a variety of widely used frameworks. In addition to the technical resources that make it easy to build powerful models, there is also a sizable library of educational resources to help you get up to speed. Sebastian Raschka’s contribution of the Python Machine Learning book has come to be widely regarded as one of the best references for newcomers to the field. In this episode he shares his experiences as an author, his views on why Python is the right language for building machine learning applications, and the insights that he has gained from teaching and contributing to the field. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Sebastian Raschka about his experiences writing the popular Python Machine Learning book Interview Introductions How did you get introduced to Python? How did you get started in machine learning? What were the concepts that you found most difficult in your career with statistics and machine learning? One of your notable contributions to the field is your book "Python Machine Learning". What inspired you to write the initial version? 
How did you approach the challenge of striking the right balance of depth, breadth, and accessibility for the content? What was your process for determining which aspects of machine learning to include? You have made 3 editions of the book from 2015 through December of 2019. In what ways has the book changed? What are the biggest changes to the ecosystem and approaches to ML in that timeframe? What are the fundamental challenges of developing machine learning projects that continue to present themselves? What new difficulties have arisen with the introduction of new technologies and the rise of deep learning? What are some of the ways that the Python language lends itself to analytical work? What are its shortcomings and how has the community worked around them? What do you see as the biggest risks to the popularity of Python in the data and analytics space? What are some of the common pitfalls that your readers and students face while learning about different aspects of machine learning? What are some of the industries that can benefit most from applications of machine learning? What are you most excited about in the applications or capabilities of machine learning? What are you most worried about? Keep In Touch Website @rasbt on Twitter rasbt on GitHub LinkedIn Picks Tobias Trolls World Tour Sebastian FFMPeg Normalize Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Python Machine Learning (Packt) Buy On Amazon (affiliate link) UW Madison Pascal Delphi R Perl Bioinformatics Seq Podcast Episode BioPython Podcast Episode CodeCademy Udacity CS101 Andrew Ng Coursera Support-Vector Machine Bayesian Statistics Matlab scikit-learn NumPy Pandas Podcast Episode Sebastian’s Blog Perceptron Heatmaps In R The Hundred Page Machine Learning Book by Andriy Burkov ImageNet Random Forest Logistic Regression XGBoost Theano Generative Adversarial Networks Is This Person Real / This Person Does Not Exist Reinforcement Learning AlphaGo AlphaStar Ray RLlib Open AI Google DeepMind Google Colab CUDA Julia Sebastian Raschka, Joshua Patterson, and Corey Nolet (2020). Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence. Information 2020, 11, 193 Swift Language Swift for TensorFlow Matplotlib Differential Privacy PrivacyNet YouTube recordings of Stat453: Introduction to Deep Learning and Generative Models (Spring 2020) ffmpeg-normalize The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/28/2020 • 49 minutes, 24 seconds

Build The Next Generation Of Python Web Applications With FastAPI

Summary Python has an embarrassment of riches when it comes to web frameworks, each with its own particular strengths. FastAPI is a new entrant that has been quickly gaining popularity as a performant and easy to use toolchain for building RESTful web services. In this episode Sebastián Ramirez shares the story of the frustrations that led him to create a new framework, how he put in the extra effort to make the developer experience as smooth and painless as possible, and how he embraces extensibility with lightweight dependency injection and a straightforward plugin interface. If you are starting a new web application today then FastAPI should be at the top of your list. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Sebastián Ramirez about FastAPI, a framework for building production ready APIs in Python 3 Interview Introductions How did you get introduced to Python? Can you start by describing what FastAPI is? What are the main frustrations that you ran into with other frameworks that motivated you to create an entirely new one? What are some of the main use cases that FastAPI is designed for? 
Many web frameworks focus on managing the end-to-end functionality of a website, including the UI. Why did you focus on just API capabilities? What are the benefits of building an API only framework? If you wanted to integrate a presentation layer, what would be involved in that effort? What API formats does FastAPI support? What would be involved in adding support for additional specifications such as GraphQL or JSON-LD? There are a huge number of web frameworks available just in the Python ecosystem. How does FastAPI fit into that landscape and why might someone choose it over the other options? Can you share your design philosophy for the project? What are your main sources of inspiration for the framework? You have also built the Typer CLI library which you refer to as the little sibling of FastAPI. How have your experiences building these two projects influenced their counterpart’s evolution? What are the benefits of incorporating type annotations into a web framework and in what ways do they manifest in its functionality? What is the workflow for a developer building a complex application in FastAPI? Can you describe how FastAPI itself is architected and how its design has evolved since you first began working on it? What are the extension points that are available for someone to build plugins for FastAPI? What are some of the challenges that you have faced in building an async framework that is leveraging the new ASGI specification? What are some sharp edges that users should keep an eye out for? What are some unique or underutilized features of FastAPI that users might not be aware of? What are some of the most interesting, unexpected, or innovative ways that you have seen FastAPI used? When is FastAPI the wrong choice? What are some of the most interesting, unexpected, or challenging lessons that you have learned in the process of building and maintaining FastAPI? What do you have planned for the future of the project? Keep In Touch @tiangolo on Twitter. 
@tiangolo on GitHub. Picks Tobias Once Upon A Time TV Show Sebastián Cloud Atlas Movie Isaac Asimov’s robot short stories Python devtools debug function async compatible requests with HTTPX RescueTime for automatic time tracking Joplin for Notes Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links FastAPI Typer Typer CLI FastAPI Alternatives, Inspiration and Comparisons Explosion’s spaCy Explosion’s Prodigy Starlette Pydantic Uvicorn Hypercorn fastapi-utils Class Based Views GraphQL Ariadne Coronavirus Tracker API Terminals from browser: termpair XPublish Uber’s Ludwig Netflix Dispatch Colombia Berlin Germany Explosion AI Python Type Annotations Django Rest Framework Flask Swagger/OpenAPI Sanic NodeJS JSON Schema OAuth2 Swagger UI ReDoc React VueJS Angular REST == REpresentational State Transfer JSON-LD Go Language Hug API framework Click CLI Framework Flask Blueprints Tom Christie Podcast Interview Dependency Injection ASGI Podcast Episode WSGI Thread Local Variables Context Vars OAUTH2 Scopes PipX XArray JAM Stack NextJS Hugo GatsbyJS FastAPI Project Templates The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/20/2020 • 58 minutes, 34 seconds

Distributed Computing In Python Made Easy With Ray

Summary Distributed computing is a powerful tool for increasing the speed and performance of your applications, but it is also a complex and difficult undertaking. While performing research for his PhD, Robert Nishihara ran up against this reality. Rather than cobbling together another single purpose system, he built what ultimately became Ray to make scaling Python projects to multiple cores and across machines easy. In this episode he explains how Ray allows you to scale your code easily, how to use it in your own projects, and his ambitions to power the next wave of distributed systems at Anyscale. If you are running into scaling limitations in your Python projects for machine learning, scientific computing, or anything else, then give this a listen and then try it out! Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Your host as usual is Tobias Macey and today I’m interviewing Robert Nishihara about Ray, a framework for building and running distributed applications and machine learning Interview Introductions How did you get introduced to Python? Can you start by describing what Ray is and how the project got started? How did the environment of the RISE lab factor into the early design and development of Ray? 
What are some of the main use cases that you were initially targeting with Ray? Now that it has been publicly available for some time, what are some of the ways that it is being used which you didn’t originally anticipate? What are the limitations for the types of workloads that can be run with Ray, or any edge cases that developers should be aware of? For someone who is building on top of Ray, what is involved in either converting an existing application to take advantage of Ray’s parallelism, or creating a greenfield project with it? Can you describe how Ray itself is implemented and how it has evolved since you first began working on it? How does the clustering and task distribution mechanism in Ray work? How does the increased parallelism that Ray offers help with machine learning workloads? Are there any types of ML/AI that are easier to do in this context? What are some of the additional layers or libraries that have been built on top of the functionality of Ray? What are some of the most interesting, challenging, or complex aspects of building and maintaining Ray? You and your co-founders recently announced the formation of Anyscale to support the future development of Ray. What is your business model and how are you approaching the governance of Ray and its ecosystem? What are some of the most interesting or unexpected projects that you have seen built with Ray? What are some cases where Ray is the wrong choice? What do you have planned for the future of Ray and Anyscale? Keep In Touch Website @robertnishihara on Twitter robertnishihara on GitHub Picks Tobias D&D Castle Ravenloft board game One Deck Dungeon Robert The Everything Store Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. 
If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Ray Anyscale UC Berkeley RISELab MATLAB Deep Learning Theano Tensorflow PyTorch Podcast Episode Philip Moritz Reinforcement Learning Hyperparameter Tuning IPython Parallel AMPLab Apache Spark Data Engineering Podcast Episode Actor Model Horovod(?) Flink Data Engineering Podcast Episode Spark Streaming Dask Data Engineering Podcast Episode gRPC Tune Rust C++ C Apache Arrow Wes McKinney Podcast Interview DataBricks MongoDB Elastic Data Engineering Podcast Episode Confluent Embarrassingly Parallel Ant Financial Flame Graph The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/14/2020 • 40 minutes, 59 seconds

Building The Seq Language For Bioinformatics

Summary Bioinformatics is a complex and computationally demanding domain. The intuitive syntax of Python and extensive set of libraries make it a great language for bioinformatics projects, but it is hampered by the need for computational efficiency. Ariya Shajii created the Seq language to bridge the divide between the performance of languages like C and C++ and the ecosystem of Python with built-in support for commonly used genomics algorithms. In this episode he describes his motivation for creating a new language, how it is implemented, and how it is being used in the life sciences. If you are interested in experimenting with sequencing data then give this a listen and then give Seq a try! Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on great conferences. And now, the events are coming to you, with no travel necessary! We have partnered with organizations such as ODSC, and Data Council. 
Upcoming events include the Observe 20/20 virtual conference on April 6th and ODSC East which has also gone virtual starting April 16th. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Ariya Shajii about Seq, a programming language built for bioinformatics and inspired by Python Interview Introductions How did you get introduced to Python? Can you start by describing what Seq is and your motivation for creating it? What was lacking in other languages or libraries for your use case that is made easier by creating a custom language? If someone is already working in Python, possibly using BioPython, what might motivate them to consider migrating their work to Seq? Can you give an impression of the scope and nature of the tasks or projects that a biologist or geneticist might build with Seq? What was your process for identifying and prioritizing features and algorithms that would be beneficial to the target audience? For someone using Seq can you describe their workflow and how it might differ from performing the same task in Python? How is Seq implemented? What are some of the features that are included to simplify the work of bioinformatics? What was your process of designing the language and runtime? How has the scope or direction of the project evolved since it was first conceived? What impact do you anticipate Seq having on the domain of bioinformatics and genomics? What have you found to be the most interesting, unexpected, and/or challenging aspects of building a language for this problem domain? What is in store for the future of Seq? Keep In Touch arshajii on GitHub Website Picks Tobias Board Games Labyrinth Boardgame Board Game Geek Ariya Breakthrough documentary Closing Announcements Thank you for listening! 
Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Seq MIT CSAIL Bioinformatics LLVM Intermediate Representation MatLab Moore’s Law BioPython Smith Waterman Algorithm Hamming Distance Pattern Matching in Functional Programming SIMD == Single Instruction Multiple Data Computational Genomics Phylogenetics Sequence Read Archive public data set Google Cloud Life Sciences The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/7/2020 • 36 minutes, 25 seconds

An Open Source Toolchain For Natural Language Processing From Explosion AI

Summary The state of the art in natural language processing is a constantly moving target. With the rise of deep learning, previously cutting edge techniques have given way to robust language models. Through it all the team at Explosion AI have built a strong presence with the trifecta of SpaCy, Thinc, and Prodigy to support fast and flexible data labeling to feed deep learning models and performant and scalable text processing. In this episode founder and open source author Matthew Honnibal shares his experience growing a business around cutting edge open source libraries for the machine learning development process. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on great conferences. And now, the events are coming to you, with no travel necessary! We have partnered with organizations such as ODSC, and Data Council. Upcoming events include the Observe 20/20 virtual conference on April 6th and ODSC East which has also gone virtual starting April 16th. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Matthew Honnibal about the Thinc and Prodigy tools and an update on SpaCy Interview Introductions How did you get introduced to Python? Can you start by giving an overview of your mission at Explosion? We spoke previously about your work on SpaCy. What has changed in the past 3 1/2 years? How have recent innovations in language models such as BERT and GPT-2 influenced the direction or implementation of the project? When I last looked SpaCy only supported English and German, but you have added several new languages. What are the most challenging aspects of building the additional models? What would be required for supporting symbolic or right-to-left languages? How has the ecosystem for language processing in Python shifted or evolved since you first introduced SpaCy? Another project that you have released is Prodigy to support labelling of datasets. Can you talk through the motivation for creating it and describe the workflow for someone using it? What was lacking in the other annotation tools that you have worked with that you are trying to solve for in Prodigy? What are some of the most challenging or problematic aspects of labelling data sets for use in machine learning projects? What is a typical scale of data that can be reasonably handled by an individual or small team working with Prodigy? At what point do you find that it makes sense to use a labeling service rather than generating the labels yourself? Your most recent project is Thinc for building and using deep learning models. What was the motivation for creating it and what problem does it solve in the ecosystem? How does its design and usage compare to other deep learning frameworks such as PyTorch and Tensorflow? 
How does it compare to projects such as Keras that abstract across those frameworks? How do the SpaCy, Prodigy, and Thinc libraries work together? What are some of the biggest challenges that you are facing in building open source tools to meet the needs of data scientists and machine learning engineers? What are some of the most interesting or impressive projects that you have seen built with the tools your team is creating? What do you have planned for the future of Explosion, SpaCy, Prodigy, and Thinc? Keep In Touch LinkedIn @honnibal on Twitter honnibal on GitHub Picks Tobias Onward movie Matthew Coronavirus Preparedness Ray Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Explosion AI SpaCy Podcast Episode Thinc Prodigy Natural Language Processing Perl NLTK GPU == Graphics Processing Unit TPU == Tensor Processing Unit Transfer Learning Airflow Luigi Perceptron PyTorch Tensorflow Functional Programming MxNet Keras Cuda C Language Continuous Integration Blackstone Allen AI Institute SciSpaCy Holmes Sense2Vec FastAPI The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/30/2020 • 51 minutes, 19 seconds

A Flexible Open Source ERP Framework To Run Your Business

Summary Running a successful business requires some method of organizing the information about all of the processes and activity that take place. Tryton is an open source, modular ERP framework that is built for the flexibility needed to fit your organization, rather than requiring you to model your workflows to match the software. In this episode core developers Nicolas Évrard and Cédric Krier are joined by avid user Jonathan Levy to discuss the history of the project, how it is being used, and the myriad ways that you can adapt it to suit your needs. If you are struggling to keep a consistent view of your business and ensure that all of the necessary workflows are being observed then listen now and give Tryton a try. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Nicolas Évrard, Cédric Krier, and Jonathan Levy about Tryton Interview Introductions How did you get introduced to Python? Can you start by describing what Tryton is and how it got started? What kinds of businesses is Tryton most suited to? What kinds of businesses is Tryton not a good fit for? Within a business, who are the primary users of Tryton? Can you talk through a typical workflow for interacting with Tryton? What are some of the most complex or challenging aspects of modeling a business while maintaining a high degree of customizability? Can you describe how Tryton is architected and how its design has evolved since it was first started? If you were to start over today, what would you do differently? There are a number of plugins for Tryton. What kinds of functionality can be customized using the available interfaces? What is the process for building a custom module for Tryton? How do you manage sustainability of the Tryton project? Given the criticality of the Tryton platform, how do you approach ongoing stability and security of the project? What is involved in deploying and maintaining an installation of Tryton? What are some of the most interesting, innovative, or unexpected ways that you have seen Tryton used? What is in store for the future of Tryton? 
Keep In Touch Nicolas nicoe on GitHub @nicoe on Twitter Cédric @cedrickrier on Twitter cedk on GitHub Jonathan LinkedIn Picks Tobias Audio Books Audible free trial (Affiliate Link) Overdrive – ebooks and audiobooks from your local library Public Domain Audiobooks Nicolas Civilization VI FreeCiv The 3 Body Problem Cédric Valérian and Laureline Jonathan Roil.com Links Tryton B2CK Tryton Foundation Advocate Consulting Legal Group Scheme Lisp Belgium EuroPython Conference Plone Zope VBA (Visual Basic for Applications) Django Odoo ERP == Enterprise Resource Planning Small/Medium Enterprise (SME) GTK (Gnome ToolKit) 3-Tier Application Cookiecutter Tryton Module Cookiecutter Tryton Repository Docker GNU Health Nereid The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/23/2020 • 1 hour, 7 minutes, 33 seconds

Getting A Handle On Portable C Extensions With hpy

Summary One of the driving factors of Python’s success is the ability for developers to integrate with performant languages such as C and C++. The challenge is that the interface for those extensions is specific to the main implementation of the language. This contributes to difficulties in building alternative runtimes that can support important packages such as NumPy. To address this situation a team of developers are working to create the hpy project, a new interface for extension developers that is standardized and provides a uniform target for multiple runtimes. In this episode Antonio Cuni discusses the motivations for creating hpy, how it benefits the whole ecosystem, and ways to contribute to the effort. This is an exciting development that has the potential to unlock a new wave of innovation in the ways that you can run your Python code. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! As a developer, maintaining a state of flow is key to your productivity. Don’t let something as simple as the wrong function ruin your day. Kite is the smartest completions engine available for Python, featuring a machine learning model trained by the brightest stars of GitHub. 
Featuring ranked suggestions sorted by relevance, offering up to full lines of code, and a programming copilot that offers up the documentation you need right when you need it. Get Kite for free today at getkite.com with integrations for top editors, including Atom, VS Code, PyCharm, Spyder, Vim, and Sublime. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Antonio Cuni about hpy, a project aiming to reimagine the C API for Python Interview Introductions How did you get introduced to Python? Can you start by describing what the hpy project is and how it got started? What are the goals for the project? Who else is involved? How much engagement have you had with CPython core contributors or the steering council? Who are the consumers of the current C API for the CPython implementation? What are some of the pain points or shortcomings for those consumers? What impact does that have for users of a given library that leverages C extensions? Can you talk through the structure of the hpy project? What are some of the design challenges that you are facing for determining the external API? What is involved in integrating the hpy interface into alternate runtimes such as PyPy or RustPython? 
What is the potential or observed performance impact for libraries that currently rely on the existing C API? How has the vision and scope of this project been updated as you have gotten further along in the implementation? What are the downstream impacts that you anticipate in projects such as PyPy and Cython? What have you found to be the most challenging or contentious aspects of implementing hpy so far? What are some of the most interesting/unexpected/useful lessons that you have learned while working on hpy? What do you have planned for the near to medium term for hpy? Keep In Touch antocuni on GitHub Website @antocuni on Twitter Picks Tobias Poetry Antonio Collapse: How Societies Choose To Fail Or Succeed by Jared Diamond Links hpy PyPy Alex Martelli Podcast Interview Python C Extensions EuroPython Victor Stinner Cython Podcast Episode Armin Rigo NumPy ultrajson GIL == Global Interpreter Lock RustPython Podcast Episode GraalPython hpy-rust The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/16/2020 • 35 minutes, 14 seconds

Open Source Machine Learning On Quantum Computers With Xanadu AI

Summary Quantum computers promise the ability to execute calculations at speeds several orders of magnitude faster than what we are used to. Machine learning and artificial intelligence algorithms require fast computation to churn through complex data sets. At Xanadu AI they are building libraries to bring these two worlds together. In this episode Josh Izaac shares his work on the Strawberry Fields and PennyLane projects that provide both high and low level interfaces to quantum hardware for machine learning and deep neural networks. If you are itching to get your hands on the coolest combination of technologies, then listen now and then try it out for yourself. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, fast object storage, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! As a developer, maintaining a state of flow is key to your productivity. Don’t let something as simple as the wrong function ruin your day. Kite is the smartest completions engine available for Python, featuring a machine learning model trained by the brightest stars of GitHub. Featuring ranked suggestions sorted by relevance, offering up to full lines of code, and a programming copilot that offers up the documentation you need right when you need it. 
Get Kite for free today at getkite.com with integrations for top editors, including Atom, VS Code, PyCharm, Spyder, Vim, and Sublime. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Josh Izaac about the work that he is doing at Xanadu AI to make it easier to build applications for quantum processors Interview Introductions How did you get introduced to Python? Can you start by describing what you are working on at Xanadu AI? How do the specifics of your quantum hardware influence the way in which developers need to build their algorithms? (e.g. as compared to DWave) What are some of the underlying principles that developers need to understand in order to take full advantage of the capabilities provided by quantum processors? Can you outline the different components and libraries that you are building to simplify the work of building machine learning/AI projects for quantum processors? What’s the story behind all of the Beatles references? How do the different libraries fit together? What are some of the workloads and use cases that you and your customers are focused on? What are some of the most challenging aspects of designing a library that is accessible to developers while being able to take advantage of the underlying hardware? 
How does the workflow for machine learning on quantum computers differ from what is being done in classical environments? Given the magnitude of computational power and data processing that can be achieved in a quantum processor it seems that there is a potential for small bugs to have disproportionately large impacts. How can developers identify and mitigate potential sources of error in their algorithms? For someone who is building an application or algorithm to be executed on a Xanadu processor, what does their workflow look like? What are some of the common errors or misconceptions that you have seen in customer code? Can you describe the design and implementation of the PennyLane and Strawberry Fields libraries and how they have evolved since you first began working on them? What are some of the most ambitious or exciting use cases for quantum systems that you have seen? How are you using the computational capabilities of your platform to feed back into the research and design of successive generations of hardware? What are some useful heuristics for determining whether it is worthwhile to build for a quantum processor rather than leveraging classical hardware? What are some of the most interesting/unexpected/useful lessons that you have learned while working on quantum algorithms and the libraries to support them? What is in store for the future of the Xanadu software ecosystem? What are your predictions for the near to medium term of quantum computing? Keep In Touch josh146 on GitHub Website LinkedIn Picks Tobias Knives Out movie Josh Baking Sourdough Bread Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Xanadu AI Strawberry Fields PennyLane Quantum Physics ASIC == Application Specific Integrated Circuit FPGA == Field Programmable Gate Array GPU == Graphics Processing Unit Quantum Photonics Qubit Trapped Ions Quantum Optics Coherent Light Heisenberg’s Uncertainty Principle Wave/Particle Duality Continuous Variable Quantum Computation NetworkX Tensorflow The Walrus Rigetti Computing PyTorch Podcast Episode The Walrus Operator (Assignment Expressions) Fortran NumPy SciPy IPython Podcast Episode Jax Quantum Machine Learning Xanadu User Discussion Forum The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/10/2020 • 57 minutes, 21 seconds

The Advanced Python Task Scheduler

Summary Most long-running programs have a need for executing periodic tasks. APScheduler is a mature and open source library that provides all of the features that you need in a task scheduler. In this episode the author, Alex Grönholm, explains how it works, why he created it, and how you can use it in your own applications. He also digs into his plans for the next major release and the forces that are shaping the improved feature set. Spare yourself the pain of triggering events at just the right time and let APScheduler do it for you. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Alex Grönholm about APScheduler, a library for scheduling tasks in your Python projects Interview Introductions How did you get introduced to Python? Can you start by describing what APScheduler is and the main use cases that APScheduler is designed for? What was your motivation for creating it? What is the workflow for integrating APScheduler into an application? The documentation says not to run more than one instance of the scheduler. What are some strategies for scaling schedulers? What are some common architectures for applications that take advantage of APScheduler? What are some potential pitfalls that developers should be aware of? Can you describe how APScheduler is implemented and how its design has evolved since you first began working on it? What have you found to be the most complex or challenging aspects of building or using a scheduling framework? What are some of the most interesting/innovative/unexpected ways that you have seen APScheduler used? What are some of the features or capabilities that you have consciously left out? What design strategies or features of APScheduler are often overlooked or underappreciated? What are some of the most useful or interesting lessons that you have learned while building and maintaining APScheduler? When is APScheduler the wrong choice for managing task execution? What do you have planned for the future of the project? 
Keep In Touch agronholm on GitHub Picks Tobias The Data Exchange Podcast Alex Tenacity Links APScheduler PHP Java ECMAScript Celery ERP == Enterprise Resource Planning Cron Daemon RPyC Zookeeper Data Engineering Podcast Episode RethinkDB Daylight Saving Time Falsehoods Programmers Believe About Time PyTZ Celery Beats Asphalt Framework Podcast Episode AnyIO Twisted Podcast Episode Py2EXE PyInstaller The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/2/2020 • 33 minutes, 15 seconds

Reducing The Friction Of Embedded Software Development With PlatformIO

Summary Embedded software development is a challenging endeavor due to a fragmented ecosystem of tools. Ivan Kravets experienced the pain of programming for different hardware platforms when embroiled in a home automation project. As a result he built the PlatformIO ecosystem to reduce the friction encountered by engineers working with multiple microcontroller architectures. In this episode he describes the complexities associated with targeting multiple platforms, the tools that PlatformIO offers to simplify the workflow, and how it fits into the development process. If you are feeling the pain of working with different editing environments and build toolchains for various microcontroller vendors then give this interview a listen and then try it out for yourself. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. 
Upcoming events include Strata Data in San Jose and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Ivan Kravets about PlatformIO, an open source ecosystem for IoT development including a cross-platform IDE, unified debugger, remote unit testing, and firmware updates. Interview Introductions How did you get introduced to Python? Can you start by describing what PlatformIO is? What was your motivation for creating it? What are the aspects of embedded development that keep you interested and engaged in this space? What are some of the types of projects that someone might use PlatformIO to build? What are some of the common challenges that a developer might encounter when working on embedded systems? What are the additional complexities that get introduced as more hardware targets get added to a project? What is the workflow for someone using PlatformIO for embedded systems development? What are the different elements of PlatformIO and how do they simplify the work of building embedded systems projects? How is PlatformIO implemented and how has the system design evolved since you first began working on it? What was your reason for selecting Python as the implementation language? If you were to start over today what would you do differently? How has the embedded hardware and software landscape changed since you first started work on PlatformIO? How has that impacted your product direction? How do developers handle testing and validation of their applications? How does PlatformIO help with updating deployed devices with new firmware? What have been some of the most interesting/unexpected/innovative projects that you have seen built with PlatformIO? What have been some of the most interesting/unexpected/challenging aspects of building and maintaining PlatformIO? 
How are you approaching sustainability of the project and business? What do you have planned for the future of PlatformIO? Keep In Touch LinkedIn Website ivankravets on GitHub @ikravets on Twitter Picks Tobias UMass Amherst Making Electricity From Thin Air Ivan Don’t focus on the money side of your project, just focus on building a great product. Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links PlatformIO Ukraine Home Automation Home Assistant Podcast Episode Twisted Podcast Episode Zigbee Radio Serial I/O RS-232 ARM CPU Architecture RISC-V AVR Microcontrollers Arduino Texas Instruments Launchpad Eclipse IDE MCU == MicroController Unit VSCode PlatformIO Extension SCons Make Raspberry Pi ESP8266 Marlin 3D Printer Firmware ESP Home Zephyr Realtime Operating System Western Digital The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/25/2020 • 46 minutes, 49 seconds

APIs, Sustainable Open Source and The Async Web With Tom Christie

Summary Tom Christie is probably best known as the creator of Django REST Framework, but his contributions to the state of the web in Python extend well beyond that. In this episode he shares his story of getting involved in web development, his work on various projects to power the asynchronous web in Python, and his efforts to make his open source contributions sustainable. This was an excellent conversation about the state of asynchronous frameworks for Python and the challenges of making a career out of open source. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, node balancers, a 40 Gbit/s public network, and a brand new managed Kubernetes platform, all controlled by a convenient API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they’ve got dedicated CPU and GPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Tom Christie about the Encode organization and the work he is doing to drive the state of the art in async for Python Interview Introductions How did you get introduced to Python? Can you start by describing what the Encode organization is and how it came to be? What are some of the other approaches to funding and sustainability that you have tried in the past? What are the benefits to the developers provided by an organization which you were unable to achieve through those other means? What benefits are realized by your sponsors as compared to other funding arrangements? What projects are part of the Encode organization? How do you determine fund allocation for projects and participants in the organization? What is the process for becoming a member of the Encode organization and what benefits and responsibilities does that entail? A large number of the projects that are part of the organization are focused on various aspects of asynchronous programming in Python. Is that intentional, or just an accident of your own focus and network? For those who are familiar with Python web programming in the context of WSGI, what are some of the practices that they need to unlearn in an async world, and what are some new capabilities that they should be aware of? Beyond Encode and your recent work on projects such as Starlette you are also well known as the creator of Django Rest Framework. How has your experience building and growing that project influenced your current focus on a technical, community, and professional level? Now that Python 2 is officially unsupported and asynchronous capabilities are part of the core language, what future directions do you foresee for the community and ecosystem? 
What are some areas of potential focus that you think are worth more attention and energy? What do you have planned for the future of Encode, your own projects, and your overall engagement with the Python ecosystem? Keep In Touch Website tomchristie on GitHub @_tomchristie on Twitter Picks Tobias Maleficent: Mistress of Evil Abominable Tom The Lobster The Master And His Emissary by Iain McGilchrist Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Encode Django Rest Framework Starlette Zope Django Django Piston Django Tastypie Andrew Godwin ASGI Django Channels Podcast Episode Flask Pyramid Sentry Podcast Episode Tidelift Uvicorn HTTPX Tidelift Open Collective Stripe GitHub Sponsors Python Software Foundation Podcast Episode Firebase Databases ORM HTTP3 The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/18/2020 • 43 minutes, 45 seconds

Learning To Program Python By Building Video Games With Arcade

Summary Video games have been a vehicle for learning to program since the early days of computing. Continuing in that tradition, Paul Craven created the Arcade library as a modern alternative to PyGame for use in his classroom. In this episode he explains his motivations for starting a new framework for video game development, his view on the benefits of games in computer education, and how his students and the broader community are using it to build interesting and creative projects. If you are looking for a way to get new programmers engaged, or just want to experiment with building your own games, then this is the conversation for you. Give it a listen and then give Arcade a try for yourself. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. 
Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Paul Craven about Arcade, an easy-to-learn Python library for creating 2D video games Interview Introductions How did you get introduced to Python? Can you start by describing what Arcade is? What inspired you to begin working on it? Who is your primary audience? As an educator, what have you found to be most effective about using games as a vehicle for teaching programming? What elements of programming or computer science do you have difficulty in addressing within the context of a video game? For someone who wants to move on from working on games to something like web development or data analytics, what elements of software design and structure are easily translated to other domains? Can you describe how Arcade is implemented and how the architecture has evolved since you first began working on it? If you were to start over today, what would you do differently? What have you found to be the most interesting/unexpected/challenging aspects of building and maintaining Arcade? What are some of the most interesting/innovative/unexpected ways that you have seen Arcade used? When is Arcade the wrong platform, or at what point does someone need to move on from Arcade? What do you have planned for the future of Arcade? Keep In Touch @professorcraven on Twitter pvcraven on GitHub Faculty Page Picks Tobias Ori And The Blind Forest Paul Fahrenheit 451 by Ray Bradbury “Mistakes can be profited by. Man, when I was young I shoved my ignorance in people’s faces. They beat me with sticks. By the time I was forty my blunt instrument had been honed to a fine cutting point for me. 
If you hide your ignorance, no one will hit you and you’ll never learn.” Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Arcade Simpson College PyGame SDL OpenGL Unity Unreal Engine Godot Automate The Boring Stuff With Python Minesweeper Pyglet Spatial Hashing Tiled Map Editor Python Type Hints F Strings Data Classes PyMunk FFMPEG PyWeek Podcast Episode Python Discord Arcade Enhancement Requests The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/11/2020 • 41 minutes, 42 seconds

Build Your Own Personal Data Repository With Nostalgia

Summary The companies that we entrust our personal data to are using that information to gain extensive insights into our lives and habits while not always making those findings accessible to us. Pascal van Kooten decided that he wanted to have the same capabilities to mine his personal data, so he created the Nostalgia project to integrate his various data sources and query across them. In this episode he shares his motivation for creating the project, how he is using it in his day-to-day, and how he is planning to evolve it in the future. If you’re interested in learning more about yourself and your habits using the personal data that you share with the various services you use then listen now to learn more. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. 
Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Pascal van Kooten about his nostalgia project, a nascent framework for taking control of your personal data Interview Introductions How did you get introduced to Python? Can you start by describing your mission with the nostalgia project? How did the topic of personal data management come to be a focus for you? What other options exist for users to be able to collect and manage their own data? What capabilities were lacking in those options that made you feel the need to build Nostalgia? What is your target audience for this set of projects? How are you using Nostalgia in your own life? What are some of the insights that you have been able to gain as a result of integrating your data with Nostalgia? Can you describe the current architecture of the Nostalgia platform and how it has evolved since you began work on it? What are some of the assumptions that you are using to direct the focus of your development and interaction design? What are the minimum number of data sources needed to make this useful? What are some of the challenges that you are facing in collating and integrating different data sources? What are some of the drawbacks of using something like Nostalgia for managing your personal data? What are some of the most interesting/challenging/unexpected aspects of your work on Nostalgia so far? What do you have planned for the future of the project? Keep In Touch Website LinkedIn @kootenpv on Twitter kootenpv on GitHub Picks Tobias Jumanji: The Next Level Jumanji Pascal Bup Closing Announcements Thank you for listening! 
Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links timeliner qs_ledger Nostalgia Shrynk Whereami R Language Duck Duck Go Caddy Perkeep Dark Programming Language Pandas Podcast Episode Neo4J Pandas Extension Arrays Podcast Episode Parquet Data Engineering Podcast Episode ElectronJS Zincbase The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/4/2020 • 32 minutes, 57 seconds

Simplifying Social Login For Your Web Applications

Summary A standard feature in most modern web applications is the ability to log in or register using accounts that you already own on other sites such as Google, Facebook, or Twitter. Building your own integrations for each service can be complex and time consuming, distracting you from the features that you and your users actually care about. Fortunately the Python social auth library makes it easy to support third party authentication with a large and growing number of services with minimal effort. In this episode Matías Aguirre discusses his motivation for creating the library, how he has designed it to allow for flexibility and ease of use, and the benefits of delegating identity and authentication to third parties rather than managing passwords yourself. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. 
Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Matías Aguirre about Python social auth and the complexities of third-party authentication Interview Introductions How did you get introduced to Python? Can you start by describing what the Python social auth project is and your motivation for starting it? Why might someone want to integrate with or rely on a third-party identity provider in their projects? What are some of the tradeoffs or drawbacks of implementing Can you describe the current architecture of the library and how it has evolved since you first began working on it? There are a number of pre-built integrations with different web frameworks in the social auth github organization, but Django is the only one that has seen any commits recently. What are the contributing factors for that state of affairs? There are a number of authentication protocols that you support. What are the common capabilities that they each support and what are some of the more challenging differences between them? How have you implemented the interface for plugging different authentication mechanisms to allow for the variation between them while keeping the library code maintainable? What is involved in adding support for a new authentication provider or protocol? Many times authorization and authentication are conflated or used interchangeably. How does Python social auth address those concerns and what are the limitations of different mechanisms for defining permissions? For someone who is using Python social auth, what is the workflow for integrating it with their application as a consumer? 
What are some of the most interesting/unexpected/innovative ways that you have seen Python social auth used? What are some of the most interesting/useful/unexpected lessons that you have learned in the process of building and maintaining Python social auth? When is Python social auth more effort than it’s worth? What do you have planned for the future of the project? Keep In Touch omab on GitHub Website @linuxaddict on Twitter LinkedIn Picks Tobias Joker movie Matías Sanic asynchronous web framework Star Trek Picard TV series Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Python Social Auth Uruguay Django Ruby on Rails MonkeyLearn Social Authentication Django Social Auth Salted and hashed passwords Magic Link Authentication OAuth OpenID SAML FastAPI Sanic ASGI WSGI AsyncIO The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/27/2020 • 34 minutes, 5 seconds

Building A Business On Building Data Driven Businesses

Summary In order for an organization to be data driven they need easy access to their data and a simple way of sharing it. Arik Fraimovich built Redash as a way to address that need by connecting to any data source and building attractive dashboards on top of them. In this episode he shares the origin story of the project, his experiences running a business based on open source, and the challenges of working with data effectively. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. 
Your host as usual is Tobias Macey and today I’m interviewing Arik Fraimovich about Redash, an open source business intelligence platform that helps you make sense of your data. Interview Introductions How did you get introduced to Python? Can you start by describing what Redash is and its origin story? What are the primary ways that it is used? The business intelligence market is quite mature and has many commercial and open source projects to choose from. What are the aspects of Redash that have allowed you to be successful? What would you consider to be your closest competitors? What was your background with data before starting on Redash? What are some of the most notable lessons that you have learned about business intelligence since starting the project? How has the landscape for business intelligence and data analysis changed since you began the project? Beyond just accessing data, Redash focuses on enabling visualization of the results. What types of visualizations do you support and how do you support users in choosing the most effective ways to represent the information? What are some of the common challenges that your users and customers encounter when communicating with data? One of the critical aspects of enabling data access in an organization is the ability to collaborate on asking and answering questions. How do you approach that challenge in Redash? How is Redash implemented and how has the overall design and architecture evolved since you first started working on it? How do you manage the complexity of supporting so many different data sources? If you were to start over today, what would you do differently? Beyond the code of Redash, you also have a business around providing it as a hosted service. What are some of the most interesting, challenging, or unexpected lessons that you have learned in the process of building and growing that service? 
How do you approach the direction and governance of the open source project and balance that against the wants and needs of the community? What are some of the most interesting, innovative, or unexpected ways that you have seen Redash used? When is Redash the wrong platform to use? What do you have planned for the future of the Redash business and project? Keep In Touch arikfr on GitHub Website @arikfr on Twitter Picks Tobias Data Engineering Podcast Arik Peewee ORM Amazon ECS Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Redash Google App Engine EverythingMe RedShift Metabase Data Engineering Podcast Interview Apache Superset Elasticsearch Data Engineering Podcast Interview Tableau Looker Data Engineering Podcast Interview PowerBI Data Warehouse Data Lake Athena Spark Data Engineering Podcast Interview Redash Funnel Visualization Stephen Few Flask SQLAlchemy Redis PostgreSQL Data Engineering Podcast Interview Celery RQ Tornado Django ORM AngularJS ReactJS NodeJS Redash Query Results Data Source IBM DB2 Retool Forest Admin Grafana The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/20/2020 • 41 minutes, 26 seconds

Using Deliberate Practice To Level Up Your Python

Summary An effective strategy for teaching and learning is to rely on well structured exercises and collaboration for practicing the material. In this episode long time Python trainer Reuven Lerner reflects on the lessons that he has learned in the 5 years since his first appearance on the show, how his teaching has evolved, and the ways that he has incorporated more hands-on experiences into his lessons. This was a great conversation about the benefits of being deliberate in your approach to ongoing education in the field of technology, as well as having some helpful references for ways to keep your own skills sharp. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m pleased to welcome back Reuven Lerner to talk about the benefits of deliberate practice for learning and improving programming skills Interview Introductions How did you get introduced to Python? In your first appearance on the show back in episode 2 we talked about your experience as a Python trainer. How has your teaching style evolved in the past 5 years? How has the focus and scope of your training changed in that time period? What have you found to be some of the most helpful and effective tactics in your training? From the learner perspective, what are some strategies that you recommend for retaining information, particularly in the context of gaining technical knowledge? In-person training vs. real-time online training vs. recorded videos, advantages and disadvantages of each. Blended learning, in which we combine aspects of the above Beyond in-person training, what are your preferred methods for learning and maintaining new skills? What is deliberate practice and how does it differ from the habits that many of us might default to? What are some of the resources that you provide for students of your trainings for practicing? What are some of the outside resources which you have found most useful or effective? 
Keep In Touch Website Blog @reuvenmlerner on Twitter Picks Tobias The Manager’s Path by Camille Fournier Reuven Lab Rats: How Silicon Valley Made Work Miserable For The Rest Of Us by Dan Lyons Links Deliberate Practice Reuven On Episode 2 CGI == Common Gateway Interface Language Phrasebook Jupyter Notebook Walrus Operator PyCon 2019 Presentation Python Bytes List Comprehension Weekly Python Exercise Python Morsels PyBites Practice Your Python Python Workout book by Reuven Lerner PyTest Brian Okken The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/13/2020 • 48 minutes, 39 seconds

Checking Up On Python's Role in DevOps

Summary Python has been part of the standard toolkit for systems administrators since it was created. In recent years there has been a shift in how servers are deployed and managed, and how code gets released due to the rise of cloud computing and the accompanying DevOps movement. The increased need for automation and speed of iteration has been a perfect use case for Python, cementing its position as a powerful tool for operations. In this episode Moshe Zadka reflects on his experiences using Python in a DevOps context and the book that he wrote on the subject. He also discusses the difference in what aspects of the language are useful as an introduction for system operators and where they can continue their learning. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. 
Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Moshe Zadka about his recent book DevOps In Python Interview Introductions How did you get introduced to Python? How did you gain experience in managing systems with Python? What is DevOps? What makes Python a good fit for managing systems? What is unique to the devops/sysadmin domain in terms of what software is used and what aspects of the language are useful? What are the main ways that Python is used for managing servers and infrastructure? What are some of the most notable changes in the ways that Python is used for server administration over the past several years? How has Python 3 impacted the lives of operators? What was your motivation for writing a book about Python focused specifically on DevOps and server automation? What are some of the tools that have been replaced in your own workflow over the years? Keep In Touch Website LinkedIn @moshezadka on Twitter Picks Tobias SaltStack Podcast Episode Moshe Automat Podcast Episode Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links DevOps In Python SurveyMonkey Twisted Episode DevOps B=hive CI/CD Amoeba OS Python OS module Requests Canary Deployments Post Mortem Bash Shell Z Shell Linux Unix AWS Boto3 GitHub GitLab Debian Ubuntu CentOS Pip Poetry Pipenv pip-tools dh-virtualenv Docker Hynek Schlawack Presentation On Building Docker Images Ansible SaltStack Chef Puppet The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/6/2020 • 33 minutes, 35 seconds

Python's Built In IDE Isn't Just Sitting IDLE

Summary One of the first challenges that new programmers are faced with is figuring out what editing environment to use. For the past 20 years, Python has had an easy answer to that question in the form of IDLE. In this episode Tal Einat helps us explore its history, the ways it is used, how it was built, and what is in store for its future. Even if you have never used the IDLE editor yourself, it is still an important piece of Python’s strength and history, and this conversation helps to highlight why that is. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. 
Your host as usual is Tobias Macey and today I’m interviewing Tal Einat about the IDLE editor for Python, its history, and what is in store for its future Interview Introductions How did you get introduced to Python? For anyone who hasn’t used it, can you start by explaining what IDLE is? IDLE has been part of the standard library for Python for quite some time now. What was the motivation for adding it to the core of Python? How has the evolution of our computing environment changed the motivation for maintaining IDLE and the use cases that it is most beneficial for? What are the benefits of including a basic editor in the default distribution of Python? What are some of the ways in which it is often used? What are the limiting factors that lead users to other IDEs or text editors? What role do you think IDLE has played in the growth of Python? What was your motivation for getting involved as a Python contributor and working on the implementation of IDLE? How is IDLE implemented and what are some of the ways that it has evolved since its initial introduction? How well has the code for IDLE aged as new features and capabilities are added to the language? What are some of the integration points available for extending IDLE? What are some of the most interesting or innovative ways that you have seen IDLE used and extended? What is planned for the future of the IDLE module? Keep In Touch LinkedIn @TalEinat on Twitter taleinat on GitHub Picks Tobias Mr. Robot Tal Captain Fantastic The Lesson To Unlearn article by Paul Graham Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links IDLE FullProof Israel Mandatory Military Service Eric Idle Monty Python Visual Studio IDLE-fork Vi Emacs Sublime Text Visual Studio Code REPL == Read Eval Print Loop Tcl/Tk Tkinter RPC == Remote Procedure Call IDLEx VPython Podcast Episode Python Turtle SVN (Subversion) The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/23/2019 • 36 minutes, 33 seconds

Riding The Rising Tides Of Python

Summary The past two decades have seen massive growth in the language, community, and ecosystem of Python. The career of Pete Fein has occurred during that same period and his use of the language has paralleled some of the major shifts in focus that have occurred. In this episode he shares his experiences moving from a trader writing scripts, through the rise of the web, to the current renaissance in data. He also discusses how his engagement with the community has evolved, why he hasn’t needed to use any other languages in his career, and what he is keeping an eye on for the future. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in NYC. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Pete Fein about his voyage on the rising tide of Python Interview Introductions How did you get introduced to Python? I understand that you have used Python exclusively in your professional life. What other languages have you been exposed to and taken inspiration from? What are some of the projects that you have been involved with which you are most proud of? How has the community and your involvement with it changed over the years? In your experience, how has the growth in the size and breadth of the community impacted its accessibility to newcomers? You have been using Python and participating in the community for quite some time now, and there have been significant changes in both within that period. What are some of the most significant technological shifts that you have noticed and been a part of? How have those shifts influenced the direction of your career? As you have moved through the different phases of your career with different areas of focus, what are some of the aspects of the work which have remained constant? What have been the biggest differences across the different problem domains? What are some of the aspects of the language or its ecosystem which you feel are lacking or don’t get enough attention? What are some of the industry trends which you are keeping a close eye on and how do you anticipate them influencing the direction of the community and your career in the upcoming years? Keep In Touch Consulting Website Personal Website @wearpants on Twitter LinkedIn wearpants on GitHub Picks Tobias Matomo Analytics Pete FastAPI PyDantic Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. 
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Chicago Scheme Structure and Interpretation of Computer Programs David Beazley Podcast Episode Twiggy logging library Jesse Noller Log4J Debian RedHat StructLog Elliot Podcast Episode Logbook Armin Ronacher Podcast Episode Pittsburgh Python Meetup Boltons library Elixir ChiPy Chicago Python user group Subversion Ruby On Rails Django Data Engineering Data Engineering Podcast Internet of Things Pittsburgh Artificial Pancreas Project Eric Holscher Read The Docs Podcast Episode Circuit Playground Express CircuitPython Podcast Episode Rust Language PyOhio PyGotham The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/16/2019 • 44 minutes, 16 seconds

Debugging Python Projects With PySnooper

Summary Debugging is a painful but necessary practice in software development. The tools that are available in Python range from the built-in debugger, to tools integrated with your coding environment, to the trusty print function. In this episode Ram Rachum describes his work on PySnooper and how it can be used to speed up your problem solving in complex or legacy applications. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, or running your build servers, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media and the Python Software Foundation. Upcoming events include the Software Architecture Conference in NYC and PyCon US in Pittsburgh. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. 
Your host as usual is Tobias Macey and today I’m interviewing Ram Rachum about PySnooper, an alternative approach to debugging your python projects Interview Introductions How did you get introduced to Python? How do developers normally debug their code, and what need does PySnooper address that isn’t addressed by the established methods? What is the workflow for using PySnooper for investigating or debugging a project? (This will probably be answered in the answer to the question above) What are some of the pieces of information that it surfaces and how do they aid the developer in directing their investigation? What were some of the projects that you were testing it with and how did they influence the direction that you took PySnooper? Can you describe how PySnooper is implemented and some of the ways that it has evolved since you first began working on it? What are some of the initial goals that you had for the project which you have since abandoned as either not useful or too challenging to implement? What are some of the edge cases or technical challenges that you have encountered while working on PySnooper, either in Python itself or in the tool? There is another project called Snoop which builds on top of your work on PySnooper to add some extra functionality and developer ergonomics. What, if anything, was your reaction to it and how has it influenced your work on PySnooper? One of the notable aspects of your work on PySnooper is the amount of attention that it garnered shortly after you published it. How has that visibility affected the long-term popularity and use of PySnooper? What have been some of the most interesting, unexpected, or difficult aspects of creating, maintaining, and promoting PySnooper? What do you have planned for the future of the project? 
Keep In Touch cool-RR on GitHub Personal Website Consulting Website Picks Tobias PyCon US Call for proposals Registration Ram Nonviolent communication Links PySnooper Ram’s Python workshops The PyWeb-IL meetup BlueVine’s career page Submit your CV to Ram’s email [email protected] Tel Aviv Israel Paul Graham Y Combinator startup accelerator Wing IDE PyCharm sys.settrace Python f_trace coverage.py Podcast.init Interview PEP == Python Enhancement Proposal Podcast Episode snoop project Alex Hall pdb pudb pdb++ The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
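The links above mention sys.settrace, the standard-library hook that line-level tracing tools can build on. As a rough illustration of the idea of surfacing variable changes while a function runs, here is a minimal sketch. It is not PySnooper's implementation or API; the helper name trace_locals and its return shape are assumptions made up for this example.

```python
import sys


def trace_locals(func, *args, **kwargs):
    """Hypothetical helper: run func and record which locals changed, line by line.

    Returns (result, events), where each event is (line_number, changed_locals).
    A "line" event fires before the line runs, so a change made by one line is
    reported at the next line that executes.
    """
    events = []
    previous = {}
    target = func.__code__

    def tracer(frame, event, arg):
        # Only follow frames that execute the target function's code.
        if frame.f_code is not target:
            return None
        if event in ("line", "return"):
            current = dict(frame.f_locals)
            changed = {
                name: value
                for name, value in current.items()
                if name not in previous or previous[name] != value
            }
            if changed:
                events.append((frame.f_lineno, changed))
            previous.clear()
            previous.update(current)
        return tracer

    old_trace = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        # Always restore whatever trace function was active before.
        sys.settrace(old_trace)
    return result, events


def add_up(n):
    total = 0
    for i in range(n):
        total += i
    return total


result, events = trace_locals(add_up, 3)
for line_number, changed in events:
    print(line_number, changed)
```

The sketch compares locals with `!=`, which is fine for plain values but not for every object type, and it does not separate the frames of a recursive function. A real tool has to deal with both.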

12/9/2019 • 45 minutes, 30 seconds

Making Complex Software Fun And Flexible With Plugin Oriented Programming

Summary Starting a new project is always exciting because the scope is easy to understand and adding new features is fun and easy. As it grows, the rate of change slows down and the amount of communication necessary to introduce new engineers to the code increases along with the complexity. Thomas Hatch, CTO and creator of SaltStack, didn’t want to accept that as an inevitable fact of software, so he created a new paradigm and a proof-of-concept framework to experiment with it. In this episode he shares his thoughts and findings on the topic of plugin oriented programming as a way to build and scale complex projects while keeping them fun and flexible. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. 
Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in NYC. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Thomas Hatch about his work on the POP library and how he is using plugin oriented programming in his work at SaltStack Interview Introductions How did you get introduced to Python? Can you start by giving your definition of Plugin Oriented Programming and your thoughts on what benefits it provides? You created the POP library as a framework for enabling developers to incorporate this pattern into their own projects. What capabilities does that framework provide and what was your motivation for creating it? How has your work on Salt influenced your thinking on how to implement plugins for software projects? How does POP fit into the future of the SaltStack project? What are some of the advanced patterns or paradigms that the POP model allows for? Can you describe how the POP library itself is implemented and some of the ways that its design has evolved since you first began experimenting with it? What are some of the languages or libraries that you have looked at for inspiration in your design and philosophy around this development pattern? For someone who is building a project on top of POP what does their workflow look like and what are some of the up-front design considerations they should be thinking of? How do you define and validate the contract exposed by or expected from a plugin subsystem? One of the interesting capabilities that you highlight in the documentation is the concept of merging applications. What are your thoughts on the challenges that an engineer might face when merging library or microservice applications built with POP into a single deployable artifact? 
What would be involved in going the other direction to split a single application into independently runnable microservices? When extracting common functionality from a group of existing applications, what are the relative merits of creating a plugin subsystem vs writing a library? How does the system design of a POP application impact the available range of communication patterns for software and the teams building it? What are some antipatterns that you anticipate for teams building their projects on top of POP? In the documentation you mention that POP is just an example implementation of the broader pattern and that you hope to see other languages and developer communities adopt it. What are some of the barriers to adoption that you foresee? What are some of the limitations of POP or cases where you would recommend against following this paradigm? What are some of the most interesting, innovative, or unexpected ways that you have seen POP used? What have been some of the most interesting, unexpected, or challenging aspects of building POP? What do you have planned for the future of the POP library, or any applications where you plan to employ this pattern? Keep In Touch thatch45 on GitHub @thatch45 on Twitter Picks Tobias The Man In The High Castle TV series Thomas Jack Ryan TV Series Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Episode 1 POP SaltStack Ruby Microservices Linus Torvalds SaltConf SaltStack Thorium Salt Beacons Salt Reactors Salt Grains Idem AsyncIO Nim OCaml Julia LLVM Object Oriented Programming Go Language Rust RBAC == Role Based Access Control The Mythical Man Month Linux Kernel Heist Umbra Flow Programming Magic The Gathering The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/3/2019 • 1 hour, 2 minutes, 37 seconds

Faster And Safer Software Development With Feature Flags

Summary Any software project that is worked on or used by multiple people will inevitably reach a point where certain capabilities need to be turned on or off. In this episode Pete Hodgson shares his experience and insight into when, how, and why to use feature flags in your projects as a way to enable that practice. In addition to the simple on and off controls for certain logic paths, feature toggles also allow for more advanced patterns such as canary releases and A/B testing. This episode has something useful for anyone who works on software in any language. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. 
Your host as usual is Tobias Macey and today I’m interviewing Pete Hodgson about the concept of feature flags and how they can benefit your development workflow Interview Introductions How did you get introduced to Python? Can you start by describing what a feature flag is? What was your first experience with feature flags and how did it affect your approach to software development? What are some of the ways that feature flags are used? What are some antipatterns that you have seen for teams using feature flags? What are some of the alternative development practices that teams will employ to achieve the same or similar outcomes to what is possible with feature flags? Can you describe some of the different approaches to implementing feature flags in an application? What are some of the common pitfalls or edge cases that teams run into when building an in-house solution? What are some useful considerations when making a build vs. buy decision for a feature toggling service? What are some of the complexities that get introduced by feature flags for maintaining application code over the long run? What have you found to be useful or effective strategies for cataloging and documenting feature toggles in an application, particularly if they are long lived or for open source applications where there is no institutional context? Can you describe some of the lifecycle considerations for feature flags, and how the design, implementation, or use of them changes for short-lived vs long-lived use cases? What are some cases where the overhead of implementing and maintaining a feature flag infrastructure outweighs the potential benefit? What advice or references do you recommend for anyone who is interested in using feature flags for their own work? Keep In Touch Website @ph1 on Twitter moredip on GitHub Picks Tobias Circuit Playground Express CircuitPython Episode Pete Accelerate by Nicole Forsgren, Jez Humble, and Gene Kim Closing Announcements Thank you for listening! 
Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Perl Ruby Django Feature Flag Pete’s Blog Post On Feature Flags Thoughtworks Continuous Delivery Continuous Delivery Book Trunk Based Development Branch By Abstraction Technical Debt Strategy Pattern Polymorphism The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
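The summary for this episode describes simple on and off controls for logic paths as well as more advanced patterns such as canary releases. A minimal in-memory sketch of both ideas might look like the following. The FeatureFlags class, the flag names, and the hashing scheme are all assumptions chosen for illustration; they are not taken from the episode or from any particular feature flag library.

```python
import hashlib


class FeatureFlags:
    """Hypothetical in-memory flag store, for illustration only."""

    def __init__(self):
        # Flag name -> rollout percentage (0 = off, 100 = on for everyone).
        self._rollout = {}

    def set_flag(self, name, enabled):
        """Simple on/off toggle."""
        self._rollout[name] = 100 if enabled else 0

    def set_rollout(self, name, percent):
        """Canary-style release: enable the flag for roughly `percent` of users."""
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout[name] = percent

    def is_enabled(self, name, user_id=""):
        # Unknown flags default to off.
        percent = self._rollout.get(name, 0)
        # Hash the flag name with the user id so each user gets a stable
        # bucket in the range 0..99 for a given flag.
        digest = hashlib.sha256(f"{name}:{user_id}".encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < percent


flags = FeatureFlags()
flags.set_flag("new_checkout", True)   # fully on
flags.set_rollout("new_search", 10)    # canary: about 10% of users


def checkout(user_id):
    # The flag selects between two logic paths at runtime.
    if flags.is_enabled("new_checkout", user_id):
        return "new checkout flow"
    return "old checkout flow"
```

Hashing the user id instead of drawing a random number keeps each user on the same side of the flag between requests, which matters when comparing behavior during a gradual rollout.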

11/26/2019 • 1 hour, 1 minute, 28 seconds

From Simple Script To Beautiful Web Application With Streamlit

Summary Building well designed and easy to use web applications requires a significant amount of knowledge and experience across a range of domains. This can act as an impediment to engineers who primarily work in so-called back-end technologies such as machine learning and systems administration. In this episode Adrien Treuille describes how the Streamlit framework empowers anyone who is comfortable writing Python scripts to create beautiful applications to share their work and make it accessible to their colleagues and customers. If you have ever struggled with hacking together a simple web application to make a useful script self-service then give this episode a listen and then go experiment with how Streamlit can level up your work. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Having all of your logs and event data in one place makes your life easier when something breaks, unless that something is your Elastic Search cluster because it’s storing too much data. CHAOSSEARCH frees you from having to worry about data retention, unexpected failures, and expanding operating costs. 
They give you a fully managed service to search and analyze all of your logs in S3, entirely under your control, all for half the cost of running your own Elastic Search cluster or using a hosted platform. Try it out for yourself at pythonpodcast.com/chaossearch and don’t forget to thank them for supporting the show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in NYC. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Adrien Treuille about Streamlit, an open source app framework built for machine learning and data science teams Interview Introductions How did you get introduced to Python? Can you start by explaining what Streamlit is and its origin story? What are some of the types of applications that are commonly built by data teams and who are the typical consumers of those projects? What are some of the challenges or complications that are unique to this problem space? What are some of the complications or challenges that you have faced to integrate Streamlit with so many different machine learning frameworks? Can you describe the technical implementation of Streamlit and how it has evolved since you began working on it? How did you approach the design of the API and development workflow to tailor it for the needs and capabilities of machine learning engineers? 
If you were to start the project from scratch today what would you do differently? What is a typical workflow for someone working on a machine learning application and how does Streamlit fit in? What are some of the types of tools or processes that it replaces? What are some of the most interesting or unexpected ways that you have seen Streamlit used? What have you found to be some of the most challenging or unexpected aspects of building and evolving Streamlit? How do you see Python evolving in light of Streamlit and other work in the machine learning space? What do you have in store for the future of Streamlit or any adjacent products and services? How are you approaching the governance and sustainability of the Streamlit open source project? Keep In Touch Website LinkedIn @myelbows on Twitter treuille on GitHub Picks Tobias The Book Of Why by Judea Pearl Adrien No Self, No Problem by Anam Thubten Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Streamlit Forum GitHub Twitter Carnegie Mellon University Google X Zoox IBM Cornell University NumPy SciPy Machine Learning Engineer Jupyter DeckGL Matplotlib Plotly Seaborn Altair PyTorch Tensorflow Protocol Buffers Streamlit for teams Heroku EC2 React JS Awesome Streamlit Flask Plotly Dash Voila NeurIPS The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/18/2019 • 49 minutes, 1 second

Automate Your Server Security With GrapheneX

Summary The internet is rife with bots and bad actors trying to compromise your servers. To counteract these threats it is necessary to diligently harden your systems to improve server security. Unfortunately, the hardening process can be complex or confusing. In this week’s episode 18 year old Orhun Parmaksiz shares the story of how he and his friends created the GrapheneX framework to simplify the process of securing and maintaining your servers using the power and flexibility of Python. If you run your own software then this is definitely worth a listen. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Having all of your logs and event data in one place makes your life easier when something breaks, unless that something is your Elastic Search cluster because it’s storing too much data. CHAOSSEARCH frees you from having to worry about data retention, unexpected failures, and expanding operating costs. They give you a fully managed service to search and analyze all of your logs in S3, entirely under your control, all for half the cost of running your own Elastic Search cluster or using a hosted platform. Try it out for yourself at pythonpodcast.com/chaossearch and don’t forget to thank them for supporting the show! 
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in NYC. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Orhun Parmaksiz about GrapheneX, a framework for simplifying the process of hardening your servers Interview Introductions How did you get introduced to Python? Can you start by explaining what we mean when we talk about hardening of servers? What are the common ways of hardening a system, which techniques can we use for this purpose? What are some of the high level categories of threats that operators should be considering? What is GrapheneX and what was your motivation for creating it? How does GrapheneX aid users in the process of increasing the security of their infrastructure? Is any extra operating system knowledge required for using GrapheneX? Can you talk through the workflow for someone using GrapheneX to harden their systems? What options does it support for managing deployment across a fleet of servers? Some security controls can actually prevent proper operation of the applications and services that are deployed on a server. How do you approach preventing those scenarios or educating the users in determining which controls are appropriate? Why did you choose Python for a project like GrapheneX? How is GrapheneX implemented? 
How has the design evolved since you first began working on it? If you were to start the project over today, what would you do differently? Do you accept contributions to the framework? If so, what kind of contributions are needed for improving GrapheneX? For someone who is interested in adding a new module to the framework, what is involved? What have you found to be the most interesting or challenging aspects of your work on GrapheneX? What, if any, aspects of server security have you consciously avoided implementing in GrapheneX? What are your future plans for GrapheneX? Keep In Touch Orhun GitHub Twitter LinkedIn Picks Tobias Chess Orhun Creeping in My Soul by Cryoshell Gravity Hurts by Cryoshell Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links GrapheneX GitHub Website PyPI Twitter Trello Graphene New Modules for GNU/Linux & Windows (Issue) Flask Flask-SocketIO React trimstray/linux-hardening-checklist The Windows Server Hardening Checklist Firewall Windows Firewall Linux iptables PCI-DSS 2.2 requirement- server hardening standards CIS Benchmarks The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/11/2019 • 35 minutes, 41 seconds

Accelerating The Adoption Of Python At Wayfair

Summary Large companies often have a variety of programming languages and technologies being used across departments to keep the business running. Python has been gaining ground in these environments because of its flexibility, ease of use, and developer productivity. In order to accelerate the rate of adoption at Wayfair this week’s guest Jonathan Biddle started a team to work with other engineering groups on their projects and show them how best to take advantage of the benefits of Python. In this episode he explains their operating model, shares their success stories, and provides advice on the pitfalls to avoid if you want to follow in his footsteps. This is definitely worth a listen if you are using Python in your work or would like to aid in its adoption. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Having all of your logs and event data in one place makes your life easier when something breaks, unless that something is your Elastic Search cluster because it’s storing too much data. CHAOSSEARCH frees you from having to worry about data retention, unexpected failures, and expanding operating costs. 
They give you a fully managed service to search and analyze all of your logs in S3, entirely under your control, all for half the cost of running your own Elastic Search cluster or using a hosted platform. Try it out for yourself at pythonpodcast.com/chaossearch and don’t forget to thank them for supporting the show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in NYC. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Jonathan Biddle about his work to encourage and empower Wayfair engineers in their use of Python Interview Introductions How did you get introduced to Python? Can you start by describing the mission statement for you and your team at Wayfair? What is the origin story for how your group got started? How and where was Python being used within Wayfair at the time? What are the primary languages that are used throughout Wayfair? What is involved in the selection process for a language and technology stack for new projects within Wayfair? Can you describe how and why you work with different groups throughout Wayfair? What are some of the common misconceptions or barriers that you encounter when working with other engineering and product teams about how and where Python will be useful? How large is your team currently and what is the length of a typical engagement? 
How have the scale and scope of your work changed since your group was first formed? How many different product teams have you worked with at this point and what are some of the notable outcomes? What are some of the most challenging aspects, both technical and organizational, of educating other engineers on when and how to use Python? Can you share some examples of engagements that you would classify as a failure? What lessons have you learned from those situations? What advice do you have for other groups or organizations who may be considering or actively launching similar initiatives? Keep In Touch Website LinkedIn @jonbiddle on Twitter Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Picks Tobias Learning Bayesian Statistics Podcast Jonathan Pydantic FastAPI MkDocs Links Wayfair Zope Django PHP Java JavaScript .NET Kafka Jack Diederich – Stop Writing Classes The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/3/2019 • 42 minutes, 2 seconds

Building Quantum Computing Algorithms In Python

Summary Quantum computers are the biggest jump forward in processing power that the industry has seen in decades. As part of this revolution it is necessary to change our approach to algorithm design. D-Wave is one of the companies who are pushing the boundaries in quantum processing and they have created a Python SDK for experimenting with quantum algorithms. In this episode Alexander Condello explains what is involved in designing and implementing these algorithms, how the Ocean SDK helps you in that endeavor, and what types of problems are well suited to this approach. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in NYC. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Alex Condello about the Ocean SDK from D-Wave for building quantum algorithms in Python. Interview Introductions How did you get introduced to Python? Can you start by giving a high-level overview of quantum computing? What is the Ocean SDK and how does it fit into the business model for D-Wave? What are some of the problem types that a quantum processor is uniquely well suited for? How does the overall system design for a quantum computer compare to that of the Von Neumann architecture that is common for the machines that we are all familiar with? What are some of the differences in algorithm design when programming for a quantum processor? Is there any specialized background knowledge that is necessary for making effective use of the QPU’s capabilities? What are some of the common difficulties that you have seen users struggle with? How does the Ocean SDK assist the developer in implementing and understanding the patterns necessary for quantum algorithms? What was the motivation for choosing Python as the target language for an SDK to attract developers to experiment with quantum algorithms? Can you describe how the SDK is implemented and some of the integrations that are necessary for being able to operate on a quantum processor? What have you found to be some of the most interesting, challenging, or unexpected aspects of your work on the Ocean software stack? How do you handle the abstraction of the execution context to allow for replicating the program behavior on CPU/GPU vs QPU? Is there any potential for quantum computing to impact research in previously intractable computer science research, such as the P vs NP problem? What are your current scaling limits in terms of providing compute to customers for their problems? 
What are some of the most interesting, innovative, or unexpected ways that you have seen developers use the Ocean SDK and quantum processors? What are you most excited for as you look to the future capabilities of quantum systems? What are some of the upcoming challenges that you anticipate for the quantum computing industry? Keep In Touch arcondello on GitHub Picks Tobias QuTiP Podcast Interview Alex Cython Podcast Interview Links Ocean SDK D-Wave Quantum Computing Quantum Annealing Quantum Superposition Qubit D-Wave Leap Von Neumann Architecture CUDA Linear Programming D-Wave ML Papers D-Wave NetworkX Maximum Cut Problem Ising Problem Los Alamos National Laboratory Vertex Cover Problem D-Wave Hybrid The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/29/2019 • 36 minutes, 14 seconds

Illustrating The Landscape And Applications Of Deep Learning

Summary Deep learning is a phrase that is used more often as it continues to transform the standard approach to artificial intelligence and machine learning projects. Despite its ubiquity, it is often difficult to get a firm understanding of how it works and how it can be applied to a particular problem. In this episode Jon Krohn, author of Deep Learning Illustrated, shares the general concepts and useful applications of this technique, as well as sharing some of his practical experience in using it for his work. This is definitely a helpful episode for getting a better comprehension of the field of deep learning and when to reach for it in your own projects. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. 
Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in NYC. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Jon Krohn about his recent book, Deep Learning Illustrated. Interview Introductions How did you get introduced to Python? Can you start by giving a brief description of what we’re talking about when we say deep learning and how you got involved with the field? How does your background in neuroscience factor into your work on designing and building deep learning models? What are some of the ways that you leverage deep learning techniques in your work? What was your motivation for writing a book on the subject? How did the idea of including illustrations come about and what benefit do they provide as compared to other books on this topic? While planning the contents of the book what was your thought process for determining the appropriate level of depth to cover? How would you characterize the target audience and what level of familiarity and proficiency in employing deep learning do you wish them to have at the end of the book? How did you determine what to include and what to leave out of the book? The sequencing of the book follows a useful progression from general background to specific uses and problem domains. What were some of the biggest challenges in determining which domains to highlight and how deep in each subtopic to go? Because of the continually evolving nature of the field of deep learning and the associated tools, how have you guarded against obsolescence in the content and structure of the book? Which libraries did you focus on for your examples and what was your selection process? Now that it is published, is there anything that you would have done differently? 
One of the critiques of deep learning is that the models are generally single purpose. How much flexibility and code reuse is possible when trying to repurpose one model pipeline for a slightly different dataset or use case? I understand that deployment and maintenance of models in production environments is also difficult. What has been your experience in that regard, and what recommendations do you have for practitioners to reduce their complexity? What is involved in actually creating and using a deep learning model? Can you go over the different types of neurons and the decision making that is required when selecting the network topology? In terms of the actual development process, what are some useful practices for organizing the code and data that goes into a model, given the need for iterative experimentation to achieve desired levels of accuracy? What is your personal workflow when building and testing a new model for a new use case? What are some of the limitations of deep learning and cases where you would recommend against using it? What are you most excited for in the field of deep learning and its applications? What are you most concerned by? Do you have any parting words or closing advice for listeners and potential readers? Keep In Touch Website @jonkrohnlearns on Twitter jonkrohn on GitHub Picks Tobias Spurious Correlations Jon Data Elixir Newsletter Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Untapt Deep Learning Illustrated Pearson Columbia University New York City Data Science Academy NIH (National Institutes of Health) Oxford University Matlab R Language Neuroscience Artificial Neural Network Deep Learning Natural Language Processing Computer Vision Generative Adversarial Networks Deep Learning by Ian Goodfellow, et al. Hands On Machine Learning by Aurélien Géron O’Reilly Online Learning Transfer Learning Keras Tensorflow PyTorch Gary Marcus Judea Pearl Artificial General Intelligence Explainable AI Yuval Noah Harari Sapiens Homo Deus Wait But Why? The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/22/2019 • 56 minutes, 21 seconds

Andrew's Adventures In Coderland

Summary Software development is a unique profession in many ways, and it has given rise to its own subculture due to the unique sets of challenges that face developers. Andrew Smith is an author who is working on a book to share his experiences learning to program, and understand the impact that software is having on our world. In this episode he shares his thoughts on programmer culture, his experiences with Python and other language communities, and how learning to code has changed his views on the world. It was interesting getting an anthropological perspective from a relative newcomer to the world of software. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, Data Council in Barcelona, and the Data Orchestration Summit. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Andrew Smith about his anthropological study of software engineering culture in his upcoming book Adventures In Coderland. Interview Introductions How did you get introduced to Python? Can you start by describing the scope and intent of your work on Adventures In Coderland? What was your motivation for embarking on this particular project? Prior to the start of your research for this book, what was your level of familiarity with software development as a discipline and a cultural phenomenon? How are you approaching the research for this book and to what level of detail are you trying to address the problem space? What are some of the most striking contrasts that you have identified between software engineers and coding culture as it compares to that of a layperson? We met at the most recent PyCon US, which I understand you attended as a means of conducting research for your book. What are some of the notable aspects of the Python community that you discovered while you were attending? What are some of the other programming communities that you have engaged with? What are some of the differentiating factors that you have noticed between the communities that you have interacted with? What are some of the most surprising discoveries that you have made in the process of writing this book? What is your metric for determining when you have gathered enough raw material to complete the book? Now that you have delved into the peculiarities of "coderland", how has it changed your own outlook on both the software industry, and society at large? What advice do you have for the engineers who are listening as it pertains to your experiences in writing your book? 
Keep In Touch Website @wiresmith on Twitter Picks Tobias Throughline Podcast Andrew 20 Thousand Hertz Podcast Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Adventures In Coderland https://us.pycon.org Nicholas Tollervey 1843 Magazine The Economist Free Code Camp Code Golf Moon Dust book about the astronauts who first landed on the moon The Face magazine The Observer The Guardian Charlie Duke Totally Wired Code For America Supercollider programming environment SonicPi George Boole FMRI (Functional Magnetic Resonance Imaging) Ruby Language The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/14/2019 • 1 hour, 26 seconds

Network Automation At Enterprise Scale With Python

Summary Designing and maintaining enterprise networks and the associated hardware is a complex and time consuming task. Network automation tools allow network engineers to codify their workflows and make them repeatable. In this episode Antoine Fourmy describes his work on eNMS and how it can be used to automate enterprise grade networks. He explains how his background in telecom networking led him to build an open source platform for network engineers, how it is architected, and how you can use it for creating your own workflows. This is definitely worth listening to as a way to gain some appreciation for all of the work that goes on behind the scenes to make the internet possible. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, Alluxio, and Data Council. 
Upcoming events include the combined events of the Data Architecture Summit and Graphorum, the Data Orchestration Summit, and Data Council in NYC. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Antoine Fourmy about eNMS, an enterprise-grade vendor-agnostic network automation platform. Interview Introductions How did you get introduced to Python? Can you start by explaining what eNMS is? What was your motivation for creating it? Who are the target users of eNMS and how much background knowledge of network management is required to be effective with it? What are some of the alternative tools that exist in this space and why might a network operator choose to use eNMS in their place? What are some of the most challenging aspects of network creation and maintenance and how does eNMS assist with them? What are some of the mundane and/or error-prone tasks that can be replaced or automated with eNMS? What are some of the additional features that come into play for more complex networking tasks? Can you describe the system architecture of eNMS and how it has evolved since you first began working on it? eNMS is an impressive project that looks to have a substantial amount of polish. How large is the overall community of users and contributors? For someone who wants to get involved in contributing to eNMS what are some of the types of skills and background that would be helpful? What are some of the most innovative/unexpected ways that you have seen eNMS used? When is eNMS the wrong choice? What do you have planned for the future of the project? Keep In Touch Website LinkedIn afourmy on GitHub Picks Tobias Tedeschi Trucks Band Antoine CheckIO Podcast Episode Closing Announcements Thank you for listening! 
Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links eNMS Orange Netmiko NAPALM Podcast Episode Paramiko Ansible Requests OpenNMS LibreNMS Ansible Tower Rundeck SaltStack Podcast Episode StackStorm Podcast Episode SaltStack Proxy Minions Hashicorp Vault VirtualBox Flask Django SQLAlchemy APScheduler Docker Podcast Episode Redis Celery The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/8/2019 • 34 minutes, 37 seconds

Building A Modern Discussion Forum In Python To Support Healthy Communities

Summary Building and sustaining a healthy community requires a substantial amount of effort, especially online. The design and user experience of the digital space can impact the overall interactions of the participants and guide them toward respectful conversation. In this episode Rafał Pitoń shares his experience building the Misago platform for creating community forums. He explains his motivation for creating the project, the lessons he has learned in the process, and how it is being used by himself and others. This was a great conversation about how technology is just a means, and not the end in itself. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, and Data Council. Upcoming events include the combined events of the Data Architecture Summit and Graphorum, Data Council in Barcelona, and the Data Orchestration Summit. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Rafał Pitoń about Misago, a fully featured modern forum application that is fast, scalable, and responsive Interview Introductions How did you get introduced to Python? Can you start by explaining what Misago is and your motivation for creating it? How does it compare to other modern forum options such as Discourse and Flarum? How did you generate and prioritize the set of features that you have implemented and what are the main capabilities that are still on your roadmap? Is Misago intended to be run in isolation, or does it allow for integrating into a larger Django project? Is there any support for multi-tenancy? How is Misago itself implemented and how has the architecture evolved since you first began working on it? If you were to start it today, what are some of the choices that you would make differently? What are the extension points that developers can hook into for adding custom functionality? In addition to the technical challenges, managing a forum involves a fair amount of social challenges. How does Misago help with management of a healthy community? How do different design elements factor into promoting healthy conversation and sustainable engagement? What are some of the aspects of community management and the accompanying platform features that enable them which aren’t initially obvious? For someone who wants to use Misago, what is involved in deploying and configuring it? What are some of the routine maintenance tasks that they should be aware of? What are some of the most interesting or unexpected ways that you have seen Misago used? What have you found to be the most interesting, unexpected, and challenging aspects of building and maintaining a forum platform? What do you have planned for the future of Misago? 
Keep In Touch rafalp on GitHub @RafalPiton on Twitter Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Picks Tobias Fear Inoculum by Tool Rafał github.com/encode Ariadne GraphQL Library Links Misago Poland Mirumee Saleor Episode PHP Discourse Flarum MySQL PostgreSQL Data Engineering Podcast Interview jQuery Django REST Framework EmberJS MithrilJS AngularJS ReactJS phpBB Celery GDPR == General Data Protection Regulation Docker misago_docker VPS == Virtual Private Server Nginx Starlette Async API framework Ariadne GraphQL Library The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/1/2019 • 52 minutes, 22 seconds

Exploratory Data Analysis Made Easy At The Command Line

Summary There are countless tools and libraries in Python for data scientists to perform powerful analyses, but they often have a setup cost that acts as a barrier to ad-hoc exploration of data. Visidata is a command line application that eliminates the friction involved with starting the discovery process. In this episode Saul Pwanson explains his motivation for creating it, why a terminal environment is a useful place for this work, and how you can use Visidata for your own work. If you have ever avoided looking at a data set because you couldn’t be bothered with the boilerplate for a Jupyter notebook, then Visidata is the perfect addition to your toolbox. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, and Data Council. 
Upcoming events include the Strata Data conference, the combined events of the Data Architecture Summit and Graphorum, and Data Council in Barcelona. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Saul Pwanson about Visidata, a terminal oriented interactive multitool for tabular data Interview Introductions How did you get introduced to Python? Can you start by describing what Visidata is and how the project got started? What are the main use cases for Visidata? What are some tools that it has replaced in your workflow? Can you talk through a typical workflow for data exploration and analysis with Visidata? One of the capabilities that you mention on the website is quickly opening large files. What are some strategies that you have used to enable performant access for files that might crash a typical editor (e.g. Vim, Emacs)? Can you describe how Visidata is implemented and how it has evolved since you started working on it (including the upcoming 2.0 release)? What libraries or language features have proven most useful? Why did you choose to implement Visidata as a terminal only tool and what constraints does that bring with it? What are some of the most challenging aspects of building a terminal UI for data exploration and analysis? Because of its manifestation as a terminal/CLI application it relies heavily on keyboard bindings. How do you approach key assignments to ensure a consistent and intuitive user experience? What are some of the types of analysis that Visidata can be used for out of the box? What are some of the most interesting/unexpected/innovative ways that you have seen Visidata used? How much community adoption have you seen and how do you approach project governance as a solo developer? What do you have planned for the future of Visidata? 
Keep In Touch Website saulpw on GitHub @saulfp on Twitter LinkedIn Picks Tobias Data Is Plural newsletter Saul TMate Mosh – The Mobile Shell Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Visidata F5 Networks HDF5 PyTables Podcast Interview vgit vping Jeremy Singer-Vine data.boston.gov Recurse Center Curses dateutil decorators Electron OpenRefine Tmux Visicalc Windows Subsystem For Linux Saul’s Lightning Talk The Book of Visidata Where In The World Is Carmen Sandiego Oh My Zsh The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/23/2019 • 52 minutes, 50 seconds

Cultivating The Python Community In Argentina

Summary The Python community in Argentina is large and active, thanks largely to the motivated individuals who manage and organize it. In this episode Facundo Batista explains how he helped to found the Python user group for Argentina and the work that he does to make it accessible and welcoming. He discusses the challenges of encompassing such a large and distributed group, the types of events, resources, and projects that they build, and his own efforts to make information free and available. He is an impressive individual with a substantial list of accomplishments, as well as exhibiting the best of what the global Python community has to offer. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, and Data Council. 
Upcoming events include the O’Reilly AI conference, the Strata Data conference, the combined events of the Data Architecture Summit and Graphorum, and Data Council in Barcelona. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Facundo Batista about his experiences founding and fostering the Argentinian Python community, working as a core developer, and his career in Python Interview Introductions How did you get introduced to Python? What was your motivation for organizing a Python user group in Argentina? How does the geography and culture of Argentina influence the focus of the community? Argentina is a fairly large country. What is the reasoning for having the user group encompass the whole nation and how is it organized to provide access to everyone? What are some notable projects that have been built by or for members of PyAr? What are some of the challenges that you faced while building CDPedia and what aspects of it are you most proud of? How did you get started as a core developer? What areas of the language and runtime have you been most involved with? As a core developer, what are some of the most interesting/unexpected/challenging lessons that you have learned? What other languages do you currently use and what is it about Python that has motivated you to spend so much of your attention on it? What are some of the shortcomings in Python that you would like to see addressed in the future? Outside of CPython, what are some of the projects that you are most proud of? How has your involvement with core development and PyAr influenced your life and career? Keep In Touch @facundobatista on Twitter Blog Picks Tobias Dictionary of Difficult Words Facundo Fades Closing Announcements Thank you for listening! 
Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected]) with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links PyAr Argentina PyAr Mailing List PyAr Telegram PyCon Argentina Buenos Aires Cordoba Rosario Mendoza CDPedia PyCamp PSF == Python Software Foundation Wikipedia Internet Archive Decimal Module PEP 327 Tim Peters Canonical Tennis Fades The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/18/2019 • 41 minutes, 46 seconds

Python Powered Journalistic Freedom With SecureDrop

Summary The internet has made it easier than ever to share information, but at the same time it has increased our ability to track that information. In order to ensure that news agencies are able to accept truly anonymous material submissions from whistleblowers, the Freedom of the Press Foundation has supported the ongoing development and maintenance of the SecureDrop platform. In this episode core developers of the project explain what it is, how it protects the privacy and identity of journalistic sources, and some of the challenges associated with ensuring its security. This was an interesting look at the amount of effort that is required to avoid tracking in the modern era. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, and Data Council. 
Upcoming events include the O’Reilly AI conference, the Strata Data conference, the combined events of the Data Architecture Summit and Graphorum, and Data Council in Barcelona. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Jen Helsby and Kushal Das about SecureDrop, a secure platform for submitting and receiving documents anonymously Interview Introductions How did you get introduced to Python? Can you start by describing what SecureDrop is and how it got started? How did you get involved in the project? Can you give some background on where and why it is useful? For someone using a running instance, what does their workflow look like? What are some of the ways that you minimize user experience hurdles to prevent them from circumventing the security through laziness or apathy? I was a bit surprised to see the references to the messaging system that is included. Why is that an important feature? What form do the submissions generally take and what are the limits on formats that you can accept? How is the system itself architected and how has the design evolved since the first implementation? In terms of the security protocols and technologies that are implemented, what factors are you considering as you develop the project? What are the weak points or edge cases that could lead to compromise and how do you guard against them? In terms of the deployment and maintenance of a SecureDrop instance, how much technological sophistication is necessary for the organization running it, and how much effort do you put into simplifying it? What are some of the notable uses of a SecureDrop deployment and what motivates you to continue working on it? What are the most interesting/innovative/unexpected uses of SecureDrop that you have seen? How do you approach the sustainability of the platform? 
What have you found most challenging/interesting/unexpected in your work on SecureDrop? What is in store for the future of the project? Keep In Touch Jen @redshiftzero on Twitter redshiftzero on GitHub Blog Kushal Website @kushaldas on Twitter kushaldas on GitHub Picks Tobias Laser Tag Kushal Permanent Record by Edward Snowden Jen Permanent Record by Edward Snowden Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected]) with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links SecureDrop Aaron Swartz Freedom Of The Press Foundation SecureDrop Directory TOR Browser TOR == The Onion Router Tails OS Ubuntu IDS == Intrusion Detection System Ansible DEF CON Mozilla Open Source Support (MOSS) Testinfra Flask Molecule unit test library for Ansible Bandit Safety Qubes OS Qt The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/10/2019 • 38 minutes, 22 seconds

Combining Python And SQL To Build A PyData Warehouse

Summary The ecosystem of tools and libraries in Python for data manipulation and analytics is truly impressive, and continues to grow. There are, however, gaps in their utility that can be filled by the capabilities of a data warehouse. In this episode Robert Hodges discusses how the PyData suite of tools can be paired with a data warehouse for an analytics pipeline that is more robust than either can provide on their own. This is a great introduction to what differentiates a data warehouse from a relational database and ways that you can think differently about running your analytical workloads for larger volumes of data. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Taking a look at recent trends in the data science and analytics landscape, it’s becoming increasingly advantageous to have a deep understanding of both SQL and Python. A hybrid model of analytics can achieve a more harmonious relationship between the two languages. Read more about the Python and SQL Intersection in Analytics at mode.com/init. Specifically, we’re going to be focusing on their similarities, rather than their differences. 
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, and Data Council. Upcoming events include the O’Reilly AI conference, the Strata Data conference, the combined events of the Data Architecture Summit and Graphorum, and Data Council in Barcelona. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Robert Hodges about how the PyData ecosystem can play nicely with data warehouses Interview Introductions How did you get introduced to Python? To start with, can you give a quick overview of what a data warehouse is and how it differs from a "regular" database for anyone who isn’t familiar with them? What are the cases where a data warehouse would be preferable and when are they the wrong choice? What capabilities does a data warehouse add to the PyData ecosystem? For someone who doesn’t yet have a warehouse, what are some of the differentiating factors among the systems that are available? Once you have a data warehouse deployed, how does it get populated and how does Python fit into that workflow? For an analyst or data scientist, how might they interact with the data warehouse and what tools would they use to do so? What are some potential bottlenecks when dealing with the volumes of data that can be contained in a warehouse within Python? What are some ways that you have found to scale beyond those bottlenecks? How does the data warehouse fit into the workflow for a machine learning or artificial intelligence project? 
What are some of the limitations of data warehouses in the context of the Python ecosystem? What are some of the trends that you see going forward for the integration of the PyData stack with data warehouses? What are some challenges that you anticipate the industry running into in the process? What are some useful references that you would recommend for anyone who wants to dig deeper into this topic? Keep In Touch LinkedIn hodgesrm on GitHub Picks Tobias Foundations Of Architecting Data Solutions: Managing Successful Data Projects by Ted Malaska & Jonathan Seidman Robert Reading old academic papers such as CStore Python Machine Learning by Sebastian Raschka Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected]) with your story. 
To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links Altinity Clickhouse Data Engineering Podcast Interview MySQL Data Warehouse Column Oriented Database SIMD == Single Instruction Multiple Data PostgreSQL Data Engineering Podcast Episode Microsoft SQL Server Pandas NumPy Tensorflow Jupyter Data Sampling Dask Data Engineering Podcast Ray Map/Reduce Vertica Sharding Hadoop SnowflakeDB Delta Lake Data Engineering Podcast Episode BigQuery RedShift Snowflake Data Sharing OracleDB Kubernetes DBT Data Engineering Podcast Episode CSV Parquet Data Engineering Podcast Episode Kafka UC Davis Web Scraping Clickhouse Python Driver SQLAlchemy Altinity Blog Post Materialized View PyTorch Podcast Interview scikit-learn Spark Data Engineering Podcast Interview BigQuery ML Apache Arrow Wes McKinney Podcast Interview User Defined Function KDB CStore Paper by Dr. Michael Stonebraker, et al Kinetica MapD/OmniSci The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/2/2019 • 43 minutes, 44 seconds

AI Driven Automated Code Review With DeepCode

Summary Software engineers are frequently faced with problems that have been fixed by other developers in different projects. The challenge is how and when to surface that information in a way that increases their efficiency and avoids wasted effort. DeepCode is an automated code review platform that was built to solve this problem by training a model on a massive array of open sourced code and the history of their bug and security fixes. In this episode their CEO Boris Paskalev explains how the company got started, how they build and maintain the models that provide suggestions for improving your code changes, and how it integrates into your workflow. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, and Data Council. 
Upcoming events include the O’Reilly AI conference, the Strata Data conference, the combined events of the Data Architecture Summit and Graphorum, and Data Council in Barcelona. Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host as usual is Tobias Macey and today I’m interviewing Boris Paskalev about DeepCode, an automated code review platform for detecting security vulnerabilities in your projects Interview Introductions Can you start by explaining what DeepCode is and the story of how it got started? How is the DeepCode platform implemented? What are the current languages that you support and what was your guiding principle in selecting them? What languages are you targeting next? What is involved in maintaining support for languages as they release new versions with new features? How do you ensure that the recommendations that you are making are not using language features that are not available in the runtimes that a given project is using? For someone who is using DeepCode, how does it fit into their workflow? Can you explain the process that you use for training your models? How do you curate and prepare the project sources that you use to power your models? How much domain expertise is necessary to identify the faults that you are trying to detect? What types of labelling do you perform to ensure that the resulting models are focusing on the proper aspects of the source repositories? How do you guard against false positives and false negatives in your analysis and recommendations? Does the code that you are analyzing and the resulting fixes act as a feedback mechanism for a reinforcement learning system to update your models? How do you guard against leaking intellectual property of your scanned code when surfacing recommendations? 
What have been some of the most interesting/unexpected/challenging aspects of building the DeepCode product? What do you have planned for the future of the platform and business? Keep In Touch LinkedIn Picks Tobias Redwall Series by Brian Jacques Boris Artificial Intelligence Get outside Closing Announcements Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected]) with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Links DeepCode Zurich, Switzerland BigCode ETH Zurich Datalog F Strings Data Classes DeepCode Research The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/26/2019 • 33 minutes, 15 seconds

Security, UX, and Sustainability For The Python Package Index

Summary PyPI is a core component of the Python ecosystem that most developers have interacted with as either a producer or a consumer. But have you ever thought deeply about how it is implemented, who designs those interactions, and how it is secured? In this episode Nicole Harris and William Woodruff discuss their recent work to add new security capabilities and improve the overall accessibility and user experience. It is a worthwhile exercise to consider how much effort goes into making sure that we don’t have to think much about this piece of infrastructure that we all rely on. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, Corinium Global Intelligence, and Data Council. Upcoming events include the O’Reilly AI conference, the Strata Data conference, the combined events of the Data Architecture Summit and Graphorum, and Data Council in Barcelona. 
Go to pythonpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Your host as usual is Tobias Macey and today I’m interviewing Nicole Harris and William Woodruff about the work they are doing on the PyPI service to improve the security and utility of the package repository that we all rely on Interview Introductions How did you get introduced to Python? Can you start by sharing how you each got involved in working on PyPI? What was the state of the system at the time that you first began working on it? Once you committed to working on PyPI how did you each approach the process of identifying and prioritizing the work that needed to be done? What were the most significant issues that you were faced with at the outset? How often have the issues that you each focused on overlapped at the cross section of UX and security? How do you balance the tradeoffs that exist at that boundary? What is the surface area of the domains that you are each working in? (e.g. web UI, system API, data integrity, platform support, etc.) What are some of the pain points or areas of confusion from a user perspective that you have dealt with in the process of improving the platform? What have been the most notable features or improvements that you have each introduced to PyPI? What were the biggest challenges with implementing or integrating those changes? 
How do you approach introducing changes to PyPI given the volume of traffic that it needs to support and the level of importance that it serves in the community? What are some examples of attack vectors that exist as a result of the nature of the PyPI platform and what are you most concerned by? How does poor accessibility or user experience impact the utility of PyPI and the community members who interact with it? What have you found to be the most interesting/challenging/unexpected aspects of working on Warehouse? What are some of the most useful lessons that you have learned in the process? What do you have planned for future improvements to the platform? How can the listeners get involved and help out? How was this work funded? Keep In Touch Nicole @nlhkabu on Twitter Website If you’re using CI to upload to PyPI and would like to speak with Nicole please book a time here If you’re using assistive technology and would like to speak with Nicole please book a time here William @8x5clPW2 Website Email Please get in touch if you’d like to work with Trail of Bits on your next security project! Picks Tobias The Expanse TV Series Nicole The Great Hack documentary William Abraham Lincoln Autobiography by Carl Sandburg Links PyPI Warehouse Issue Tracker Good First Issues PeopleDoc Trail of Bits OSQuery Django Ruby Python Software Foundation Python Packaging Working Group Podcast Episode Donald Stufft Podcast Episode UX (User Experience) Design OTF == Open Technology Fund Bootstrap TOTP WebauthN Yubikey Changeset Consulting Sumana Harihareswara WCAG (Web Content Accessibility Guidelines) 2.0 Macaroon Security Tokens Docker Compose MOSS = Mozilla Open Source Support The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/19/2019 • 51 minutes, 38 seconds

Learning To Program In Python With CodeGrades

Summary With the increasing role of software in our world there has been an accompanying focus on teaching people to program. There are numerous approaches that have been attempted to achieve this goal with varying levels of success. Nicholas Tollervey has begun a new effort that blends the approach adopted by musicians and martial artists that uses a series of grades to provide recognition for the achievements of students. In this episode he explains how he has structured the study groups, syllabus, and evaluations to help learners build projects based on their interests and guide their own education while incorporating useful skills that are necessary for a career in software. If you are interested in learning to program, teach others, or act as a mentor then give this a listen and then get in touch with Nicholas to help make this endeavor a success. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. 
We have partnered with organizations such as O’Reilly Media, Dataversity, and Corinium Global Intelligence. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Your host as usual is Tobias Macey and today Nicholas Tollervey is back to talk about his work on CodeGrades, a new effort that he is building to blend his backgrounds in music, education, and software to help teach kids of all ages how to program. Interview Introductions How did you get introduced to Python? Can you start by describing what CodeGrades is and what motivated you to start this project? How does it differ from other approaches to teaching software development that you have encountered? Is there a particular age or level of background knowledge that you are targeting with the curriculum that you are developing? What are the criteria that you are measuring against and how do those criteria change as you progress in grade levels? For someone who completes the full set of levels, what level of capability would you expect them to have as a developer? Given your affiliation with the Python community it is understandable that you would target that language initially. 
What would be involved in adapting the curriculum, mentorship, and assessments to other languages? In what other ways can this idea and platform be adapted to accommodate other engineering skills? (e.g. system administration, statistics, graphic design, etc.) What interesting/exciting/unexpected outcomes and lessons have you found while iterating on this idea? For engineers who would like to be involved in the CodeGrades platform, how can they contribute? What challenges do you anticipate as you continue to develop the curriculum and mentor networks? How do you envision the future of CodeGrades taking shape in the medium to long term? Keep In Touch ntoll on GitHub Website @ntoll on Twitter Picks Tobias Parsy Nevermoor: The Trials of Morrigan Crow Nicholas Kivy Wittgenstein: The Duty Of Genius The Hitchhiker’s Guide To The Galaxy by Douglas Adams Links CodeGrades Blog Post C# .NET London IronPython Musical Grades Autodidact Lambda School How To Draw An Owl Dunder (double underscore) methods Duck Typing Impostor Syndrome Django Girls Mu Editor Baroque Music Chamber Music PyData Adafruit CircuitPython Podcast Interview PyPortal Hypercard Pypercard Kivy Podcast Interview Alan Turing The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/12/2019 • 1 hour, 4 minutes, 2 seconds

Build Your Own Knowledge Graph With Zincbase

Summary Computers are excellent at following detailed instructions, but they have no capacity for understanding the information that they work with. Knowledge graphs are a way to approximate that capability by building connections between elements of data that allow us to discover new connections among disparate information sources that were previously unknown. In our day-to-day work we encounter many instances of knowledge graphs, but building them has long been a difficult endeavor. In order to make this technology more accessible Tom Grek built Zincbase. In this episode he explains his motivations for starting the project, how he uses it in his daily work, and how you can use it to create your own knowledge engine and begin discovering new insights of your own. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. 
With such an intuitive tool it’s easy to make sure that everyone in the business is on the same page. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Your host as usual is Tobias Macey and today I’m interviewing Tom Grek about knowledge graphs, when they’re useful, and his project Zincbase that makes them easier to build Interview Introductions How did you get introduced to Python? Can you start by explaining what a knowledge graph is and some of the ways that they are used? How did you first get involved in the space of knowledge graphs? You have built the Zincbase project for building and querying knowledge graphs. 
What was your motivation for creating this project and what are some of the other tools that are available to perform similar tasks? Can you describe how Zincbase is implemented and some of the ways that it has evolved since you first began working on it? What are some of the assumptions that you had at the outset of the project which have been challenged or updated in the process of working on and with it? What are some of the common challenges when building or using knowledge graphs? How has the domain of knowledge graphs changed in recent years as new approaches to entity resolution and data processing have been introduced? Can you talk through a use case and workflow for using Zincbase to design and populate a knowledge graph? What are some of the ways that you are using Zincbase in your own projects? What have you found to be the most challenging/interesting/unexpected lessons that you have learned in the process of building and maintaining Zincbase? What do you have planned for the future of the project? 
Keep In Touch tomgrek on GitHub Website @tomgrek on Twitter LinkedIn Picks Tobias Banana Blueberry Oat Bars Tom Pickled Habañero Links Zincbase Commodore 64 Electronic Engineering Artificial Intelligence Primer.ai Artificial General Intelligence Matlab IPython NumPy Excel Jupyter Pandas Knowledge Graph Data Engineering Podcast Episode About Enigma Knowledge Graph The Matrix Keanu Reeves Ontology Semantic Web Word2Vec SparQL Neo4J Graph Database Data Engineering Podcast Episode About DGraph AWS Neptune PostgreSQL Data Engineering Podcast Episode Dask Data Engineering Podcast Episode BBC Micro BASIC Prolog NLP ELMO BERT GPT-2 Winograd Schema Challenge PyTorch BigGraph Ampligraph SpaCy Podcast.__init__ Episode AI Winter PyTorch Podcast Episode scikit-learn NetworkX SciPy CircleCI Read The Docs Podcast Episode Project Gutenberg Allen NLP Doctest Reinforcement Learning Metacognition The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/5/2019 • 48 minutes, 44 seconds

Docker Best Practices For Python In Production

Summary Docker is a useful technology for packaging and deploying software to production environments, but it also introduces a different set of complexities that need to be understood. In this episode Itamar Turner-Trauring shares best practices for running Python workloads in production using Docker. He also explains some of the security implications to be aware of and digs into ways that you can optimize your build process to cut down on wasted developer time. If you are using Docker, thinking about using it, or just heard of it recently then it is worth your time to listen and learn about some of the cases you might not have considered. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! To connect with the startups that are shaping the future and take advantage of the opportunities that they provide, check out Angel List where you can invest in innovative business, find a job, or post a position of your own. Sign up today at pythonpodcast.com/angel and help support this show. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. 
For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Your host as usual is Tobias Macey and today I’m interviewing Itamar Turner-Trauring about what you need to know about running Python workloads in Docker Interview Introductions How did you get introduced to Python? For anyone who is unfamiliar with it, can you describe what Docker is and the benefits that it can provide? What was your motivation for dedicating so much time and energy to the specific area of using Docker for Python production usage? What are some of the common issues that developers and operations engineers run into when dealing with Docker and its build system? What are some of the issues that are specific to Python that you have run into when using Docker? How does the ecosystem for Python in containers compare to other languages that you are familiar with? 
What are some of the security issues that engineers are likely to run into when using some of the advice and pre-existing containers that are publicly available? One of the issues that you call out is the speed of container builds. What are some of the contributing factors that lead to such slow packaging times? Can you talk through some of the aspects of multi-layer packages and useful ways to take proper advantage of them? There have been some recent projects that attempt to work around the shortcomings of the Dockerfile itself. What are your thoughts on that overall effort and any specific tools that you have experimented with? When is Docker the wrong choice for a production environment? What are some useful alternatives to Docker, for Python specifically and for software distribution in general that you have had good luck with? Keep In Touch Website @itamarst on Twitter itamarst on GitHub Picks Tobias Shazam Movie Itamar Veronica Mars Links Itamar’s Best Practices Guide Docker Zope GitLab CI Heresy In The Church Of Docker Poetry Pipenv Dockerfile 40 Years of DSL Disasters (Slides) Ubuntu Debian Docker Layers Bitnami Alpine Linux PodMan Nix Heroku Buildpacks Itamar’s Docker Template Hashicorp Packer Rkt Solaris Zones BSD Jails PyInstaller Snap FlatPak Conda The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/29/2019 • 44 minutes

Protecting The Future Of Python By Hunting Black Swans

Summary The Python language has seen exponential growth in popularity and usage over the past decade. This has been driven by industry trends such as the rise of data science and the continued growth of complex web applications. It is easy to think that there is no threat to the continued health of Python, its ecosystem, and its community, but there are always outside factors that may pose a threat in the long term. In this episode Russell Keith-Magee reprises his keynote from PyCon US in 2019 and shares his thoughts on potential black swan events and what we can do as engineers and as a community to guard against them. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to grow your professional network and find opportunities with the startups that are changing the world then Angel List is the place to go. Go to pythonpodcast.com/angel to sign up today. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. 
Upcoming events include the O’Reilly AI Conference, the Strata Data Conference, and the combined events of the Data Architecture Summit and Graphorum. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Your host as usual is Tobias Macey and today I’m interviewing Russell Keith-Magee about potential black swans for the Python language, ecosystem, and community and what we can do about them Interview Introductions How did you get introduced to Python? Can you start by explaining what a Black Swan is in the context of our conversation? You were the opening keynote for PyCon this year, where you talked about some of the potential challenges facing Python. What motivated you to choose this topic for your presentation? What effect did your talk have on the overall tone and focus of the conversations that you experienced during the rest of the conference? What were some of the most notable or memorable reactions or pieces of feedback that you heard? What are the biggest potential risks for the Python ecosystem that you have identified or discussed with others? What is your overall sentiment about the potential for the future of Python? As developers and technologists, does it really matter if Python continues to be a viable language? What is your personal wish list of new capabilities or new directions for the future of the Python language and ecosystem? 
For listeners to this podcast and members of the Python community, what are some of the ways that we can contribute to the long-term success of the language? Keep In Touch BeeWare freakboy3742 on GitHub @freakboy3742 on Twitter Website Picks Tobias Jethro Tull Russell pytest-tldr pdbpp Links PyCon 2019 Keynote Presentation Perth Western Australia Django BeeWare RedHat Emacs Vim Lisp Glyph Twisted Cal Henderson Flickr Slack Black Swan Animal Book Metaphor Nassim Nicholas Taleb PyCon US Ewa Jodlowska Python Software Foundation Podcast Interview Swift JavaScript Django Girls Briefcase packaging tool PyPy Web Assembly (WASM) COBOL Tidelift Cricket unit test runner The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/22/2019 • 54 minutes, 35 seconds

A Modern Open Source Project Management Platform

Summary Project management is a discipline that has been through many incarnations, spawning an entire industry of businesses and tools. The challenge is to build a platform that is sufficiently powerful and adaptable to fit the workflow of your teams, while remaining opinionated enough to be useful. It also helps to have an open and extensible platform that can be customized as needed. In this episode Pablo Ruiz Múzquiz explains the motivation for creating the open source tool Taiga, how it compares to the other options in the market, and how you can use it for your own projects. He also discusses the challenges inherent to project management tools, his philosophies on what makes a project successful, and how to manage your team workflows to be most effective. It was helpful learning from Pablo’s long experience in the software industry and managing teams of various sizes. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. 
We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Your host as usual is Tobias Macey and today I’m interviewing Pablo Ruiz Múzquiz about Taiga, a project management platform for agile developers & designers and project managers who want a beautiful tool that makes work truly enjoyable Interview Introductions How did you get introduced to Python? Can you start by explaining what Taiga is and the reason for building it? Project management platforms have been available for a long time. Can you describe how Taiga fits into that market and what makes it stand out? Can you describe how you view project management and some of the unique challenges that it poses when building a tool for it? How do the requirements differ between project management for software teams vs other disciplines? How is Taiga implemented and how has the system design evolved since it was first started? For someone who is using Taiga can you talk through the features of the platform and how it fits into a typical workflow? 
How do you maintain a balance between usability and structure in managing project workflows against flexibility and customization? Within an engineering team how do you view the responsibility for driving and maintaining the lifecycle of a project? What are the most common points of friction within a project management workflow and how are you working to address them in Taiga? Onboarding and discovery for a new contributor in a given project is often painful. What are some steps that a project manager or product team can take to make that process more palatable? How has the landscape of project management practices and tools changed since you first began working on Taiga and how has that influenced your roadmap? What have been the most challenging or difficult aspects of building and growing the Taiga project and community? What lessons have you learned in the process that have been particularly valuable or unexpected? What are some of the most interesting/unexpected/innovative ways that you have seen Taiga used? When is Taiga the wrong choice for a given project or team? What do you have planned for the future of Taiga? Added by Pablo Why did you choose AGPLv3 for a license? How can Taiga integrate itself with other platforms that are typically used by teams? Keep In Touch @diacritica on Twitter LinkedIn Website Picks Tobias Marchway Hydration Pack Pablo Archery 3D Archery Links Taiga Madrid, Spain Traditional Archery Kaleidos Perl Monty Python Blender Agile Project Management Redmine Trac Agile Manifesto REST Django AngularJS Django REST Framework Scrum Kanban Taiga Mobile App Webhooks AGPLv3 FOSDEM Iocaine The Princess Bride Taiga Tribe Fedora Atlassian Jira Trello The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/15/2019 • 1 hour, 9 minutes, 5 seconds

Domain Driven Design For Python

Summary When your software projects start to scale it becomes a greater challenge to understand and maintain all of the pieces. In this episode Harry Percival shares his experiences working with domain driven design in large Python projects. He explains how it is helpful, and how you can start using it for your own applications. This was an informative conversation about software architecture patterns for large organizations and how they can be used by Python developers. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. With such an intuitive tool it’s easy to make sure that everyone in the business is on the same page. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. 
You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Your host as usual is Tobias Macey and today I’m interviewing Harry Percival about domain driven design and enterprise application architecture in Python Interview Introductions How did you get introduced to Python? Can you start by explaining what "application architecture" is and how it compares to the types of application designs that Python developers and teams typically rely on? How does it contrast with "enterprise architecture"? What are the influences that tend to lead engineers into sub-optimal architectures and how can they guard against them? One of the core concepts in this problem space is that of "domain driven design". 
Can you unpack that term and explain the benefits that it provides to software architecture? What are some of the other concepts that are common among application architecture patterns? What are some of the common points of confusion among engineers who are first working with DDD? Is there any particular size or scope of project and organization that merits the approach of domain driven design or is it applicable even at small scales of complexity and team size? Now that we’ve convinced everyone that they should be using DDD can you talk through the steps involved in identifying and encapsulating the various implementation details that they will need to work through? How does that process change when dealing with an existing application as opposed to a "greenfield" project? How do Python language constructs and libraries impact the approach to implementation of application architecture patterns as compared to more traditional "enterprise" languages such as Java and C#? What are some of the architectural anti-patterns to watch out for when implementing DDD? On any given team, who is responsible for identifying and ensuring adherence to proper architectural principles? Are there any publicly visible projects that implement DDD which listeners can look at and learn from? To help Python developers in their efforts to learn and implement DDD and other aspects of enterprise architecture you have been working on a book. Can you talk about your motivation for that undertaking, what listeners can expect to learn when they read it, and any challenges that you have encountered in the process? What are some trends in terms of system design and architecture, or technology influences, that you are keeping an eye on? 
Keep In Touch @hjwp on Twitter hjwp on GitHub Website LinkedIn Picks Tobias Dragon Pearl by Yoon Ha Lee Harry Tremé Why We Sleep: Unlocking The Power Of Sleep and Dreams by Matthew Walker PhD Links MADE Obey The Testing Goat Python Anywhere XP (eXtreme Programming) Django Dive Into Python Domain Driven Design Design Patterns Gang Of Four Book MVC (Model View Controller) Microservices µCon "Uncle" Bob Martin Clean Architecture book Python LEAP Book Dependency Injection Inversion Of Control Test Pyramid Gary Bernhardt Podcast Interview Functional Core, Imperative Shell Harry’s Blog The "Blue" Book by Eric Evans Gartner Hype Cycle The Clean Architecture In Python by Leonardo Giordani DRY Python The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/8/2019 • 55 minutes, 41 seconds

Open Source Automated Machine Learning With MindsDB

Summary Machine learning is growing in popularity and capability, but for a majority of people it is still a black box that we don’t fully understand. The team at MindsDB is working to change this state of affairs by creating an open source tool that is easy to use without a background in data science. By simplifying the training and use of neural networks, and making their logic explainable, they hope to bring AI capabilities to more people and organizations. In this interview George Hosu and Jorge Torres explain how MindsDB is built, how to use it for your own purposes, and how they view the current landscape of AI technologies. This is a great episode for anyone who is interested in experimenting with machine learning and artificial intelligence. Give it a listen and then try MindsDB for yourself. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. 
With such an intuitive tool it’s easy to make sure that everyone in the business is on the same page. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Your host as usual is Tobias Macey and today I’m interviewing George Hosu and Jorge Torres about MindsDB, a framework for streamlining the use of neural networks Interview Introductions How did you get introduced to Python? Can you start by explaining what MindsDB is and the problem that it is trying to solve? What was the motivation for creating the project? Who is the target audience for MindsDB? 
Before we go deep into MindsDB can you explain what a neural network is for anyone who isn’t familiar with the term? For someone who is using MindsDB can you talk through their workflow? What are the types of data that are supported for building predictions using MindsDB? How much cleaning and preparation of the data is necessary before using it to generate a model? What are the lower and upper bounds for volume and variety of data that can be used to build an effective model in MindsDB? One of the interesting and useful features of MindsDB is the built in support for explaining the decisions reached by a model. How do you approach that challenge and what are the most difficult aspects? Once a model is generated, what is the output format and can it be used separately from MindsDB for embedding the prediction capabilities into other scripts or services? How is MindsDB implemented and how has the design changed since you first began working on it? What are some of the assumptions that you made going into this project which have had to be modified or updated as it gained users and features? What are the limitations of MindsDB and what are the cases where it is necessary to pass a task on to a data scientist? In your experience, what are the common barriers for individuals and organizations adopting machine learning as a tool for addressing their needs? What have been the most challenging, complex, or unexpected aspects of designing and building MindsDB? What do you have planned for the future of MindsDB? 
Keep In Touch George Blog George3d6 on GitHub @Cerebralab2 on Twitter LinkedIn Jorge LinkedIn MindsDB Website @mindsdb on Twitter mindsdb on GitHub Picks Tobias Bose QuietComfort 25 noise cancelling headphones George Open CourseWare – Brain And Cognitive Sciences Cerebralab Blog Jorge Lightwood MKDocs with Google Material Templates Links MindsDB GitHub 3Blue1Brown – Neural Networks Think Bayes Backpropagation Reverse Automatic Differentiation Ludwig deep learning toolbox Lightwood Tensorflow PyTorch Podcast Interview Aerospike scikit-learn The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/1/2019 • 58 minutes, 10 seconds

Behind The Scenes At The Python Software Foundation

Summary One of the secrets of the success of Python the language is the tireless efforts of the people who work with and for the Python Software Foundation. They have made it their mission to ensure the continued growth and success of the language and its community. In this episode Ewa Jodlowska, the executive director of the PSF, discusses the history of the foundation, the services and support that they provide to the community and language, and how you can help them succeed in their mission. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. With such an intuitive tool it’s easy to make sure that everyone in the business is on the same page. Podcast.init listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Bots and automation are taking over whole categories of online interaction. 
Discover.bot is an online community designed to serve as a platform-agnostic digital space for bot developers and enthusiasts of all skill levels to learn from one another, share their stories, and move the conversation forward together. They regularly publish guides and resources to help you learn about topics such as bot development, using them for business, and the latest in chatbot news. For newcomers to the space they have the Beginners Guide To Bots that will teach you the basics of how bots work, what they can do, and where they are developed and published. To help you choose the right framework and avoid the confusion about which NLU features and platform APIs you will need they have compiled a list of the major options and how they compare. Go to pythonpodcast.com/discoverbot today to get started and thank them for their support of the show. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing Ewa Jodlowska about the Python Software Foundation and the role that it serves in the language and community. Interview Introductions How did you get introduced to Python? Can you start by explaining what the PSF is for anyone who isn’t familiar with it? How did you get involved with the PSF and what is your current role? What was the motivation for creating the PSF? What are the primary responsibilities of the PSF? How has the scope and scale of the responsibilities for the PSF shifted in the years since its foundation? What is the relationship between the PSF and the language core developers? What are some reasons that someone would want to become a member of the PSF and what is involved in gaining membership? What are the challenges confronted by you and the PSF, currently and in the recent past? What are you most worried about and most proud of in the PSF, the core language, or the community? What challenges or changes do you foresee for the PSF in the near to medium future? What are some of the most interesting/unexpected/challenging lessons that you have learned while working with the PSF? How are the PSF and the PSU (Python Secret Underground) related? Outside of the PSF, how can the community contribute to the health and longevity of the language, its ecosystem, and its community? 
Keep In Touch Ewa @ewa_jodlowska on Twitter Email The Python Software Foundation Website @thepsf on Twitter Blog Picks Tobias Russell Keith-Magee PyCon 2019 Keynote Ewa Donate To The PSF Links The PSF Informix PHP PyCon PyLadies PyPI Denmark PSF Mission Statement ChiPy Brett Cannon PyCon 2018 Keynote Mozilla Open Source Support Fund The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/24/2019 • 37 minutes, 31 seconds

Algorithmic Trading In Python Using Open Tools And Open Data

Summary Algorithmic trading is a field that has grown in recent years due to the availability of cheap computing and platforms that grant access to historical financial data. QuantConnect is a business that has focused on community engagement and open data access to grant opportunities for learning and growth to their users. In this episode CEO Jared Broad and senior engineer Alex Catarino explain how they have built an open source engine for testing and running algorithmic trading strategies in multiple languages, the challenges of collecting and serving current and historical financial data, and how they provide training and opportunity to their community members. If you are curious about the financial industry and want to try it out for yourself then be sure to listen to this episode and experiment with the QuantConnect platform for free. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. 
Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. With such an intuitive tool it’s easy to make sure that everyone in the business is on the same page. Podcast.init listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. The Python Software Foundation is the lifeblood of the community, supporting all of us who want to run workshops and conferences, run development sprints or meetups, and ensuring that PyCon is a success every year. They have extended the deadline for their 2019 fundraiser until June 30th and they need help to make sure they reach their goal. Go to pythonpodcast.com/psf today to make a donation. If you’re listening to this after June 30th of 2019 then consider making a donation anyway! Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing Jared Broad and Alex Catarino about QuantConnect, a platform for building and testing algorithmic trading strategies on open data and cloud resources. Interview Introductions How did you get introduced to Python? Can you start by explaining what QuantConnect is and how the business got started? What is your mission for the company? I know that there are a few other entrants in this market. Can you briefly outline how you compare to the other platforms and maybe characterize the state of the industry? What are the main ways that you and your customers use Python? For someone who is new to the space can you talk through what is involved in writing and testing a trading algorithm? Can you talk through how QuantConnect itself is architected and some of the products and components that comprise your overall platform? I noticed that your trading engine is open source. What was your motivation for making that freely available and how has it influenced your design and development of the project? I know that the core product is built in C# and offers a bridge to Python. Can you talk through how that is implemented? How do you address latency and performance when bridging those two runtimes given the time sensitivity of the problem domain? What are the benefits of using Python for algorithmic trading and what are its shortcomings? How useful and practical are machine learning techniques in this domain? Can you also talk through what Alpha Streams is, including what makes it unique and how it benefits the users of your platform? I appreciate the work that you are doing to foster a community around your platform. 
What are your strategies for building and supporting that interaction and how does it play into your product design? What are the categories of users who tend to join and engage with your community? What are some of the most interesting, innovative, or unexpected tactics that you have seen your users employ? For someone who is interested in getting started on QuantConnect what is the onboarding process like? What are some resources that you would recommend for someone who is interested in digging deeper into this domain? What are the trends in quantitative finance and algorithmic trading that you find most exciting and most concerning? What do you have planned for the future of QuantConnect? Keep In Touch Jared LinkedIn @jaredbroad on Twitter Alex AlexCatarino on GitHub LinkedIn @AlexCatx on Twitter QuantConnect @QuantConnect on Twitter Website Picks Tobias Good Omens book and miniseries Jared Chernobyl HBO Series Alex The 100 Links QuantConnect LEAN algorithm engine Alpha Streams Google Spanner PyCharm Visual Studio Code IronPython NumPy SymPy Pandas PythonNet Tensorflow Keras Udemy The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/17/2019 • 50 minutes, 43 seconds

Web Application Development Entirely In Python With Anvil

Summary The knowledge and effort required for building a fully functional web application has grown at an accelerated rate over the past several years. This introduces a barrier to entry that excludes large numbers of people who could otherwise be producing valuable and interesting services. To make the onramp easier Meredydd Luff and Ian Davies created Anvil, a platform for full stack web development in pure Python. In this episode Meredydd explains how the Anvil platform is built and how you can use it to build and deploy your own projects. He also shares some examples of people who were able to create profitable businesses themselves because of the reduced complexity. It was interesting to get Meredydd’s perspective on the state of the industry for web development and hear his vision of how Anvil is working to make it available for everyone. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. 
Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. With such an intuitive tool it’s easy to make sure that everyone in the business is on the same page. Podcast.init listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Bots and automation are taking over whole categories of online interaction. Discover.bot is an online community designed to serve as a platform-agnostic digital space for bot developers and enthusiasts of all skill levels to learn from one another, share their stories, and move the conversation forward together. They regularly publish guides and resources to help you learn about topics such as bot development, using them for business, and the latest in chatbot news. For newcomers to the space they have the Beginners Guide To Bots that will teach you the basics of how bots work, what they can do, and where they are developed and published. To help you choose the right framework and avoid the confusion about which NLU features and platform APIs you will need they have compiled a list of the major options and how they compare. Go to pythonpodcast.com/discoverbot today to get started and thank them for their support of the show. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit. 
The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. The Python Software Foundation is the lifeblood of the community, supporting all of us who want to run workshops and conferences, run development sprints or meetups, and ensuring that PyCon is a success every year. They have extended the deadline for their 2019 fundraiser until June 30th and they need help to make sure they reach their goal. Go to pythonpodcast.com/psf today to make a donation. If you’re listening to this after June 30th of 2019 then consider making a donation anyway! Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing Meredydd Luff about Anvil, a platform for building full stack web applications entirely in Python. Interview Introductions How did you get introduced to Python? Can you start by explaining what Anvil is and the story of how and why you created it? Web applications come in a vast array of styles. What are the primary formats of web applications that Anvil supports building and what are its limitations? Are there certain categories of users that tend to gravitate toward Anvil? How do you approach user experience design and overall usability given the varied backgrounds of your customers? 
For someone who wants to use Anvil can you talk through a typical workflow and highlight the different components of the platform? Can you describe how Anvil itself is implemented and how it has evolved since you first began working on it? For the JavaScript transpilation, are you using an existing project such as Transcrypt or PyJS, or did you develop your own? Given that the Python dependencies on your servers are managed by you, how do you approach version upgrades to avoid breaking your customers’ applications? What are the main assumptions that you had going into the project and how have those assumptions been challenged or updated in the process of growing the business? What have been some of the biggest challenges that you have faced in the process of building and growing Anvil? What are some of the edge cases that you have run into while developing Anvil? (e.g. browser APIs, JavaScript <-> Python impedance mismatch, etc.) Can you talk through how you manage deployments of your customers’ applications? What are some of the features of Anvil that are often overlooked, under-utilized, or misunderstood which you think users would benefit from knowing about? What are some of the most interesting/innovative/unexpected ways that you have seen Anvil used? What are the limitations of Anvil and when is it the wrong choice? What do you have planned for the future of Anvil? 
Keep In Touch @meredydd on Twitter LinkedIn Website meredydd on GitHub Picks Tobias Pipx Meredydd Skulpt Python in the Browser implementations generally Links Anvil Delphi Visual Basic Human-Computer Interaction Amazon RDS (Relational Database Service) Bokeh Podcast Interview Plotly Raspberry Jam by the Raspberry Pi Foundation PyCharm Websockets Skulpt Comparing implementations of Python in the Browser on Python Tips Brython The Matrix Pyodide How Skulpt works (PyCon 2017 Lightning Talk) How Anvil’s autocompleter works (PyCon UK 2017 Lightning Talk) The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/10/2019 • 57 minutes, 30 seconds

Building A Business On Serverless Technology

Summary Serverless computing is a recent category of cloud service that provides new options for how we build and deploy applications. In this episode Raghu Murthy, founder of DataCoral, explains how he has built his entire business on these platforms. He explains how he approaches system architecture in a serverless world, the challenges that it introduces for local development and continuous integration, and how the landscape has grown and matured in recent years. If you are wondering how to incorporate serverless platforms in your projects then this is definitely worth your time to listen to. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. With such an intuitive tool it’s easy to make sure that everyone in the business is on the same page. Podcast.init listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. 
Bots and automation are taking over whole categories of online interaction. Discover.bot is an online community designed to serve as a platform-agnostic digital space for bot developers and enthusiasts of all skill levels to learn from one another, share their stories, and move the conversation forward together. They regularly publish guides and resources to help you learn about topics such as bot development, using them for business, and the latest in chatbot news. For newcomers to the space they have the Beginners Guide To Bots that will teach you the basics of how bots work, what they can do, and where they are developed and published. To help you choose the right framework and avoid the confusion about which NLU features and platform APIs you will need they have compiled a list of the major options and how they compare. Go to pythonpodcast.com/discoverbot today to get started and thank them for their support of the show. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall is the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. 
The Python Software Foundation is the lifeblood of the community, supporting all of us who want to run workshops and conferences, run development sprints or meetups, and ensuring that PyCon is a success every year. They have extended the deadline for their 2019 fundraiser until June 30th and they need help to make sure they reach their goal. Go to pythonpodcast.com/psf2019 today to make a donation. If you’re listening to this after June 30th of 2019 then consider making a donation anyway! Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing Raghu Murthy from DataCoral about his experience building and deploying a personalized SaaS platform on top of serverless technologies. Interview Introductions How did you get introduced to Python? Can you start by giving a brief overview of DataCoral? Before we get too deep can you share your definition of what types of technologies fall under the umbrella of "serverless"? How are you using serverless technologies at DataCoral? How has your usage evolved as your business and the underlying technologies have evolved? How do serverless technologies impact your approach to application architecture? What are some of the main benefits for someone to target services such as Lambda? What is your litmus test for determining whether a given project would be a good fit for a Function as a Service platform? What are the most challenging aspects of running code on Lambda? What are some of the major design differences between running on Lambda vs the more familiar server-oriented paradigms? 
What are some of the other services that are most commonly used alongside Function as a Service (e.g. Lambda) to build full featured applications? With serverless function platforms there is the cold start problem, can you explain what that means and some application design patterns that can help mitigate it? When building on cloud-based technologies, especially proprietary ones, local development can be a challenge. How are you handling that issue at DataCoral? In addition to development this new deployment paradigm upends some of the traditional approaches to CI/CD. How are you approaching testing and deployment of your services? How do you identify and maintain dependency graphs between your various microservices? In addition to deployment, it is also necessary to track performance characteristics and error events across service boundaries. How are you managing observability and alerting in your product? What are you most excited for in the serverless space that listeners should know about? Keep In Touch LinkedIn Medium Picks Tobias Avengers Endgame Raghu Golden State Warriors Links DataCoral Data Engineering Podcast Interview Perl Airflow Podcast Interview Serverless Computing DynamoDB Aurora SNS SQS Lambda S3 API Gateway EMR Apache Hive AWS Glue RedShift SnowflakeDB Hadoop Function As A Service Distributed Systems Conway’s Law SRE == Site Reliability Engineer Rollbar AWS Batch The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/4/2019 • 47 minutes, 13 seconds

A Data Catalog For Your PyData Projects

Summary One of the biggest pain points when working with data is dealing with the boilerplate code to load it into a usable format. Intake encapsulates all of that and puts it behind a single API. In this episode Martin Durant explains how to use the Intake data catalogs for encapsulating source information, how it simplifies data science workflows, and how to incorporate it into your projects. It is a lightweight way to enable collaboration between data engineers and data scientists in the PyData ecosystem. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing Martin Durant about Intake, a lightweight package for finding, investigating, loading and disseminating data Interview Introductions How did you get introduced to Python? Can you start by explaining what Intake is and the story behind its creation? Can you outline some of the other projects and products that intersect with the functionality of Intake and describe where it fits in terms of use case and capabilities? (e.g. Quilt Data, Arrow, Data Retriever) Can you describe the workflows for using Intake, both from the data scientist and the data engineer perspective? One of the persistent challenges in working with data is that of cataloging and discovery of what already exists. In what ways does Intake address that problem? Does it have any facilities for capturing and exposing data lineage? For someone who needs to customize their usage of Intake, what are the extension points and what is involved in building a plugin? Can you describe how Intake is implemented and how it has evolved since it first started? What are some of the most challenging, complex, or novel aspects of the Intake implementation? Intake focuses primarily on integrating with the PyData ecosystem (e.g. NumPy, Pandas, SciPy, etc.). What are some other communities that are, or could be, benefiting from the work being done on Intake? What are some of the assumptions that are baked into Intake that would need to be modified to make it more broadly applicable? What are some of the assumptions that were made going into this project that have needed to be reconsidered after digging deeper into the problem space?
What are some of the most interesting/unexpected/innovative ways that you have seen Intake leveraged? What are your plans for the future of Intake? Keep In Touch martindurant on GitHub Website @martin_durant_ on Twitter Picks Tobias Ubersuggest SEO tool Links Intake Anaconda Dask Data Engineering Podcast Interview Fast Parquet IDL Space Telescope Institute Blaze Quilt Data Podcast Interview Arrow Data Retriever Podcast Interview Parquet Data Engineering Podcast Interview DataFrame Apache Spark Dremio Data Engineering Podcast Interview Dat Project – distributed peer-to-peer data sharing Data Engineering Podcast Interview GeoPandas XArray Solr Streamz PyViz S3FS The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/27/2019 • 50 minutes, 1 second

Hardware Hacking Made Easy With CircuitPython

Summary Learning to program can be a frustrating process, because even the simplest code relies on a complex stack of other moving pieces to function. When working with a microcontroller you are in full control of everything so there are fewer concepts that need to be understood in order to build a functioning project. CircuitPython is a platform for beginner developers that provides easy to use abstractions for working with hardware devices. In this episode Scott Shawcroft explains how the project got started, how it relates to MicroPython, some of the cool ways that it is being used, and how you can get started with it today. If you are interested in playing with low cost devices without having to learn and use C then give this a listen and start tinkering! Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. 
Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing Scott Shawcroft about CircuitPython, the easiest way to program microcontrollers Interview Introductions How did you get introduced to Python? Can you start by explaining what CircuitPython is and how the project got started? I understand that you work at Adafruit and I know that a number of their products support CircuitPython. What other runtimes do you support? Microcontrollers have typically been the domain of C because of the resource and performance constraints. What are the benefits of using Python to program hardware devices? With the wide availability of powerful computing platforms, what are the benefits of experimenting with microcontrollers and their peripherals? I understand that CircuitPython is a friendly fork of MicroPython. What have you changed in your version? How do you structure your development to avoid conflicts with the upstream project? What are some changes that you have contributed back to MicroPython? What are some of the features of CircuitPython that make it easier for users to interact with sensors, motors, etc.? CircuitPython provides an easy on-ramp for experimenting with hardware projects. Is there a point where a user will outgrow it and need to move to a different language or framework? What are some of the most interesting/innovative/unexpected projects that you have seen people build using CircuitPython?
Are there any cases of someone building and shipping a production grade project in CircuitPython? What have been some of the most interesting/challenging/unexpected aspects of building and maintaining CircuitPython? What is in store for the future of the project? Keep In Touch @tannewt on Twitter Website tannewt on GitHub Picks Tobias Wings Of Fire book series Scott Brandon Sanderson The Wheel Of Time Series Mistborn Links Adafruit CircuitPython MicroPython Podcast Interview Microcontroller Arduino Microsoft MakeCode NodeBots Espruino I2C Hackspace Magazine Adafruit Blinka learn.adafruit.com Scott Hanselman Blog Post Reflow Oven Adafruit Crickit – Creative Robotics Platform Adabox Moore’s Law SparkFun DigiKey The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/20/2019 • 54 minutes, 5 seconds

Building A Privacy Preserving Voice Assistant

Summary Being able to control a computer with your voice has rapidly moved from science fiction to science fact. Unfortunately, the majority of platforms that have been made available to consumers are controlled by large organizations with little incentive to respect users’ privacy. The team at Snips are building a platform that runs entirely off-line and on-device so that your information is always in your control. In this episode Adrien Ball explains how the Snips architecture works, the challenges of building a speech recognition and natural language understanding toolchain that works on limited resources, and how they are tackling issues around usability for casual consumers. If you have been interested in taking advantage of personal voice assistants, but wary of using commercially available options, this is definitely worth a listen. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. 
We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing Adrien Ball about SNIPS, a set of technologies to make voice controlled systems that respect users’ privacy Interview Introductions How did you get introduced to Python? Can you start by explaining what Snips is and how it got started? For someone who wants to use Snips can you talk through the onboarding process? One of the interesting features of your platform is the option for automated training data generation. Can you explain how that works? Can you describe the overall architecture of the Snips platform and how it has evolved since you first began working on it? Two of the main components that can be used independently are the ASR (Automated Speech Recognition) and NLU (Natural Language Understanding) engines. Each of those has a number of competitors in the market, both open source and commercial. How would you describe your overall position in the market for each of those projects? I know that one of the biggest challenges in conversational interfaces is maintaining context for multi-step interactions. How is that handled in Snips? For the NLU engine, you recently ported it from Python to Rust. What was your motivation for doing so and how would you characterize your experience between the two languages?
Are you continuing to maintain both implementations and if so how are you maintaining feature parity? How do you approach the overall usability and user experience, particularly for non-technical end users? How is discoverability handled (e.g. finding out what capabilities/skills are available)? One of the compelling aspects of Snips is the ability to deploy to a wide variety of devices, including offline support. Can you talk through that deployment process, both from a user perspective and how it is implemented under the covers? What is involved in updating deployed models and keeping track of which versions are deployed to which devices? What is involved in adding new capabilities or integrations to the Snips platform? What are the limitations of running everything offline and on-device? When is Snips the wrong choice? In the process of building and maintaining the various components of Snips, what have been some of the most useful/interesting/unexpected lessons that you have learned? What have been the most challenging aspects? What are some of the most interesting/innovative/unexpected ways that you have seen the Snips technologies used? What is in store for the future of Snips? Keep In Touch LinkedIn adrienball on GitHub @adrien_ball on Medium @adrien_ball on Twitter Picks Tobias Chrome OS Adrien Google I/O Facebook F8 User Privacy Links Snips 2048 Game Smart Cities Raspberry Pi WikiData MQTT Google Assistant Amazon Alexa Microsoft Cortana Mozilla Common Voice Rust Language Snips Hermes messaging library The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/13/2019 • 56 minutes, 27 seconds

Hacking The Government With The USDS

Summary The U.S. government has a vast quantity of software projects across the various agencies, and many of them would benefit from a modern approach to development and deployment. The U.S. Digital Services Agency has been tasked with making that happen. In this episode the current director of engineering for the USDS, David Holmes, explains how the agency operates, how they are using Python in their efforts to provide the greatest good to the largest number of people, and why you might want to get involved. Even if you don’t live in the U.S.A. this conversation is worth listening to so you can see an interesting model of how to improve government services for everyone. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Bots and automation are taking over whole categories of online interaction. Discover.bot is an online community designed to serve as a platform-agnostic digital space for bot developers and enthusiasts of all skill levels to learn from one another, share their stories, and move the conversation forward together. They regularly publish guides and resources to help you learn about topics such as bot development, using them for business, and the latest in chatbot news. 
For newcomers to the space they have the Beginners Guide To Bots that will teach you the basics of how bots work, what they can do, and where they are developed and published. To help you choose the right framework and avoid the confusion about which NLU features and platform APIs you will need they have compiled a list of the major options and how they compare. Go to pythonpodcast.com/discoverbot today to get started and thank them for their support of the show. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing David Holmes about his work at the US Digital Services organization Interview Introductions How did you get introduced to Python? Can you start by explaining what the USDS is and how you got involved with it? The terminology that is used around "Tours of Service" is interesting. Can you explain what that entails? What about relocation, for example if you have a house and career? Can you explain the model of how the USDS works? What is involved in staffing a new project?
What is your typical toolkit, and how does that vary with the specific departments that you are working with? What are some of the most interesting projects that you and the team at USDS have worked on? What are some of the most challenging projects that you have been involved with? What are some projects that you hope to be asked to work on? Keep In Touch davideholmes on GitHub Picks Tobias Captain Marvel movie David Avengers: Endgame Game Of Thrones television series Links US Digital Services US Digital Services Job Application US Digital Services Projects 18F The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/7/2019 • 34 minutes, 3 seconds

Probabilistic Modeling In Python (And What That Even Means)

Summary Most programming is deterministic, relying on concrete logic to determine the way that it operates. However, there are problems that require a way to work with uncertainty. PyMC3 is a library designed for building models to predict the likelihood of certain outcomes. In this episode Thomas Wiecki explains the use cases where Bayesian statistics are necessary, how PyMC3 is designed and implemented, and some great examples of how it is being used in real projects. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing Thomas Wiecki about PyMC3, a project for probabilistic programming in Python Interview Introductions How did you get introduced to Python? Can you start by explaining what probabilistic programming is? What is the PyMC3 project and how did you get involved with it? The opening line for the project README is packed with a slew of terms that are rather opaque to the lay-person. Can you unpack that a bit and discuss some of the ways that PyMC3 is used in real-world projects? How much knowledge of statistical modeling and Bayesian statistics is necessary to make effective use of PyMC3? Can you talk through an example use case for PyMC3 to illustrate how you would use it in a project? How does it compare to the way that you would approach the same problem in a deterministic or frequentist modeling framework? Can you describe how PyMC3 is implemented? There are a number of other projects that build on top of PyMC3, what are some that you find particularly interesting or noteworthy? What do you find to be the most useful features of PyMC3 and what are some areas that you would like to see it improved? What have been the most interesting/unexpected/challenging lessons that you have learned in the process of building and maintaining PyMC3? What is in store for the future of PyMC3?
Keep In Touch PyMC GitHub Discourse Forum Thomas twiecki on GitHub @twiecki on Twitter Website Picks Tobias Fantastic Beasts And Where To Find Them Fantastic Beasts: The Crimes Of Grindelwald Thomas Hyperion by Dan Simmons The Mind Illuminated Links PyMC3 Quantopian University of Tubingen MatLab Probabilistic Modeling Probability Distribution A/B Testing Bayesian Statistics Beta Distribution Bernoulli Distribution P-Value Hamiltonian Monte Carlo sampling algorithm Metropolis Hastings Inference Algorithm Theano Bayesian Methods For Hackers by Cameron Davidson-Pilon Bayesian Analysis With Python by Osvaldo Martin Tensorflow MXNet deep learning framework PyTorch Tensorflow Probability BAMBI package to build generalized linear models PMProphet PyMC3 implementation of Facebook’s Prophet for timeseries prediction Exoplanet BEAT (Bayesian Earthquake Analysis Tool) PyMC3 in Google Summer of Code The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/29/2019 • 54 minutes, 49 seconds

Exploring Indico: A Full Featured Event Management Platform

Summary Managing an event is rife with inherent complexity that scales as you move from scheduling a meeting to organizing a conference. Indico is a platform built at CERN to handle their efforts to organize events such as the Computing in High Energy Physics (CHEP) conference, and now it has grown to manage booking of meeting rooms. In this episode Adrian Mönnich, core developer on the Indico project, explains how it is architected to facilitate this use case, how it has evolved since its first incarnation two decades ago, and what he has learned while working on it. The Indico platform is definitely a feature rich and mature platform that is worth considering if you are responsible for organizing a conference or need a room booking system for your office. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Bots and automation are taking over whole categories of online interaction. Discover.bot is an online community designed to serve as a platform-agnostic digital space for bot developers and enthusiasts of all skill levels to learn from one another, share their stories, and move the conversation forward together. 
They regularly publish guides and resources to help you learn about topics such as bot development, using them for business, and the latest in chatbot news. For newcomers to the space they have the Beginners Guide To Bots that will teach you the basics of how bots work, what they can do, and where they are developed and published. To help you choose the right framework and avoid the confusion about which NLU features and platform APIs you will need they have compiled a list of the major options and how they compare. Go to pythonpodcast.com/discoverbot today to get started and thank them for their support of the show. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing Adrian Mönnich about Indico, the effortless open-source tool for event organisation, archival and collaboration Interview Introductions How did you get introduced to Python? Can you start by describing what Indico is and how the project got started?
What are some other projects which target a similar use case and what were they lacking that led to Indico being necessary? Can you talk through an example workflow for setting up and managing an event in Indico? How does the lifecycle change when working with larger events, such as PyCon? Can you describe how Indico is architected and how its design has evolved since it was first built? What are some of the most complex or challenging portions of Indico to implement and maintain? There are a lot of areas for exercising constraint resolution algorithms. Can you talk through some of the business logic of how that operates? Most of Indico is highly configurable and flexible. How do you approach managing sane defaults to prevent users getting overwhelmed when onboarding? What is your approach to testing given how complex the project is? What are some of the most interesting or unexpected ways that you have seen Indico used? What are some of the most interesting/unexpected lessons that you have learned in the process of building Indico? What do you have planned for the future of the project? Keep In Touch Indico Website GitHub IRC Adrian ThiefMaster on GitHub Picks Tobias Mortal Engines movie Adrian Virtual Reality Portal VR Links Indico Tornado Podcast Interview CERN High Energy Physics CHEP (Computing in High Energy Physics) conference ZODB PostgreSQL Data Engineering Podcast Interview SQLAlchemy Flask WSGI == Web Server Gateway Interface Mako Templates Jinja ReactJS Stripe Paypal Indico Introduction Video Reveal.js Mod_Python Zope Doodle LDAP == Lightweight Directory Access Protocol Daylight Saving Time Indico User Guide Py.Test Podcast Episode Selenium Flask Plugin Engine CERN Indico Plugins Linux Plumber’s Conference Open SUSE F-Strings The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/22/2019 • 53 minutes, 46 seconds

Exploring Python's Internals By Rewriting Them In Rust

Summary The CPython interpreter has been the primary implementation of the Python runtime for over 20 years. In that time other options have been made available for different use cases. The most recent entry to that list is RustPython, written in the memory safe language Rust. One of the added benefits is the option to compile to WebAssembly, offering a browser-native Python runtime. In this episode core maintainers Windel Bouwman and Adam Kelly explain how the project got started, their experience working on it, and the plans for the future. Definitely worth a listen if you are curious about the inner workings of Python and how you can get involved in a relatively new project that is contributing to new options for running your code. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. 
Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host is Tobias Macey and today I’m interviewing Adam Kelly and Windel Bouwman about RustPython, a project to implement a new Python interpreter in Rust Interview Introduction How did you get introduced to Python? Can you start by explaining what Rust is for anyone who isn’t familiar with it? How did RustPython get started and what are your goals for the project? Can you discuss what is involved in implementing a fully compliant Python interpreter? What are some of the challenges that you face in replicating the capabilities of the CPython interpreter? Are you attempting to maintain bug parity? How much of the stdlib needs to be reimplemented? Can you compare and contrast the benefits of Rust vs C? Will the end result be compatible with libraries that rely on C extensions such as NumPy? What is the current state of the project? What are some of the notable missing features? Can you talk through your vision of how the WebAssembly support will manifest and the types of applications that it will enable? How much effort have you put into size optimization for the WebAssembly target to reduce client-side load time? Are there any existing options for minification of Python code so that it can be delivered to users with less bandwidth? What have been some of the most interesting/challenging/unexpected aspects of implementing a Python runtime? What do you have planned for the future of the project?
What are the risks that you anticipate which could derail the project before it becomes production ready? Contact Info Windel windelbouwman on GitHub Website @windelbouwman on Twitter @[email protected] on Mastodon Adam cthulahoops on GitHub @cthulahoops on Twitter Picks Tobias Oysterhead Adam FZF fuzzy finder Windel TQDM Python progress bar Links RustPython Windel Presentation EuroPython Rust C++ Rust Memory Safety MicroPython Podcast Episode PyPy Ouroboros – Pure Python standard library WebAssembly lalrpop – Rust parser generator Rust Crates PickItUp in-browser Python game engine QuickSilver Game Engine PEP 441 JIT (Just-In-Time) Compilation The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

4/15/2019 • 40 minutes, 28 seconds

Version Control For Your Machine Learning Projects

Summary Version control has become table stakes for any software team, but for machine learning projects there has been no good answer for tracking all of the data that goes into building and training models, and the output of the models themselves. To address that need Dmitry Petrov built the Data Version Control project known as DVC. In this episode he explains how it simplifies communication between data scientists, reduces duplicated effort, and simplifies concerns around reproducing and rebuilding models at different stages of the project’s lifecycle. If you work as part of a team that is building machine learning models or other data intensive analysis then make sure to give this a listen and then start using DVC today. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Bots and automation are taking over whole categories of online interaction. Discover.bot is an online community designed to serve as a platform-agnostic digital space for bot developers and enthusiasts of all skill levels to learn from one another, share their stories, and move the conversation forward together. They regularly publish guides and resources to help you learn about topics such as bot development, using them for business, and the latest in chatbot news. 
For newcomers to the space they have the Beginners Guide To Bots that will teach you the basics of how bots work, what they can do, and where they are developed and published. To help you choose the right framework and avoid the confusion about which NLU features and platform APIs you will need they have compiled a list of the major options and how they compare. Go to pythonpodcast.com/discoverbot today to get started and thank them for their support of the show. You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Your host as usual is Tobias Macey and today I’m interviewing Dmitry Petrov about DVC, an open source version control system for machine learning projects Interview Introductions How did you get introduced to Python? Can you start by explaining what DVC is and how it got started? How do the needs of machine learning projects differ from other software applications in terms of version control? Can you walk through the workflow of a project that uses DVC? 
What are some of the main ways that it differs from your experience building machine learning projects without DVC? In addition to the data that is used for training, the code that generates the model, and the end result there are other aspects such as the feature definitions and hyperparameters that are used. Can you discuss how those factor into the final model and any facilities in DVC to track the values used? In addition to version control for software applications, there are a number of other pieces of tooling that are useful for building and maintaining healthy projects such as linting and unit tests. What are some of the adjacent concerns that should be considered when building machine learning projects? What types of metrics do you track in DVC and how are they collected? Are there specific problem domains or model types that require tracking different metric formats? In the documentation it mentions that the data files live outside of git and can be managed in external storage systems. I’m wondering if there are any plans to integrate with systems such as Quilt or Pachyderm that provide versioning of data natively and what would be involved in adding that support? What was your motivation for implementing this system in Python? If you were to start over today what would you do differently? Being a venture backed startup that is producing open source products, what is the value equation that makes it worthwhile for your investors? What have been some of the most interesting, challenging, or unexpected aspects of building DVC? What do you have planned for the future of DVC? 
Keep In Touch dmpetrov on GitHub Blog @fullstackml on Twitter LinkedIn Picks Tobias Otter.ai Dmitry Go outside and get some fresh air Links DVC Iterative.ai Linear Regression Logistic Regression C++ Perl Git Version Control System Uber Michelangelo Domino Data Lab Git LFS AUC == Area Under Curve metric for evaluating machine learning model performance Wes McKinney Interview PyTorch Podcast Interview Tensorflow TensorBoard MLFlow Quilt Data Data Engineering Podcast Episode Pachyderm Data Engineering Podcast Episode Apache Airflow Podcast Interview The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/8/2019 • 44 minutes, 39 seconds

Building Scalable Ecommerce Sites On Saleor

Summary Ecommerce is an industry that has largely faded into the background due to its ubiquity in recent years. Despite that, there are new trends emerging and room for innovation, which is what the team at Mirumee focuses on. To support their efforts, they build and maintain the open source Saleor framework for Django as a way to make the core concerns of online sales easy and painless. In this episode Mirek Mencel and Patryk Zawadzki discuss the projects that they work on, the current state of the ecommerce industry, how Saleor fits with their technical and business strategy, and their predictions for the near future of digital sales. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Check out the Practical AI podcast from our friends at Changelog Media to learn and stay up to date with what’s happening in AI You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Your host as usual is Tobias Macey and today I’m interviewing Mirek Mencel and Patryk Zawadzki about their work at Mirumee, building ecommerce applications in Python, based on their open source framework Saleor Interview Introductions How did you get introduced to Python? Can you start by describing the types of projects that you work on at Mirumee and how the company got started? There are a number of libraries and frameworks that you build and maintain. What is your motivation for providing these components freely and how does that play into your overall business strategy? The most substantial project that you maintain is Saleor. Can you describe what it is and the story behind its creation? How does it compare to other ecommerce implementations in the Python space? If someone is agnostic to language and web framework, what would make them choose Saleor over other options that would be available to them? What are some of the most challenging aspects of building a successful ecommerce platform? 
How do the technical needs of an ecommerce site differ as it grows from small to medium and large scale? Which components of an online store are often overlooked? One of the common features of ecommerce sites that can drive substantial revenue is a well-built recommender system. What are some best practice strategies that you have discovered during your client work? What are some projects that you have seen built with Saleor that were particularly interesting, innovative, or unexpected? What are your predictions for the future of the ecommerce industry? What do you have planned for the future of the Saleor framework and the Mirumee business? Keep In Touch Mirumee Website Github @mirumeelabs on Twitter Mirek @mirekmencel on Twitter mirekm on GitHub Patryk patrys on GitHub @patrys on Twitter Website Picks Tobias Wreck It Ralph: Ralph Breaks The Internet Mirek A Guide To The Good Life: The Ancient Art Of Stoic Joy by William B. Irvine Patryk Release It: Design And Deploy Production Ready Software by Michael Nygard Links Mirumee Saleor Django PHP Pyramid web framework Pylons Magento eCommerce platform Ecommerce Satchmo Satchless Prices library for handling price data French National Assembly Django Oscar Podcast Interview David Winterbottom Ebay Amazon Etsy Shopify Ariadne GraphQL framework for Python Graphene GraphQL framework for Python Podcast Interview Apollo JavaScript GraphQL framework PWA == Progressive Web Apps SKU == Stock Keeping Unit Collective Intelligence Elasticsearch Data Engineering Podcast Interview A/B Testing Room Lab store built on Saleor Augmented Reality WebGL Saleor Cloud ASGI Podcast Interview The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/1/2019 • 58 minutes, 2 seconds

A Quick Python Check-in With Naomi Ceder

Summary Naomi Ceder was fortunate enough to learn Python from Guido himself. Since then she has contributed books, code, and mentorship to the community. Currently she serves as the chair of the board of the Python Software Foundation, leads an engineering team, and has recently completed a new draft of the Quick Python Book. In this episode she shares her story, including a discussion of her experience as a technical author and a detailed account of the role that the PSF plays in supporting and growing the community. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Check out the Practical AI podcast from our friends at Changelog Media to learn and stay up to date with what’s happening in AI You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Dataversity, and the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Your host as usual is Tobias Macey and today I’m interviewing Naomi Ceder about her career and contributions in the Python community Interview Introductions How did you get introduced to Python? How are you using Python in your current day-to-day? You have been working with Python for a long time at this point, and you have become very involved in supporting and growing the community. What is your motivation for dedicating so much of your time and energy into work that isn’t directly related to paying the bills? You have been the chair of the PSF for a few years now. What are your responsibilities in that position? What do you find to be the most under-rated, misunderstood, or overlooked activities of the PSF? How much of the success of the Python language and its community can be attributed to the presence and support of the PSF? In addition to the work you do with the PSF, other community activities, and your day job, you have also written the 2nd and 3rd editions of the Quick Python Book. 
Can you give a synopsis of what the book covers and the audience that it is intended for? In the process of writing the book and updating it between revisions, what are some of the features of the language or standard library that you discovered or learned more about which you have been able to use in your work? What are some of the other language communities that you have been involved with and what lessons have you learned from them that you would like to see reflected in Python? What are some of the other projects that you have been involved with that you are most proud of, whether technical or otherwise? What are you most excited about in the near to medium future? Keep In Touch @NaomiCeder on Twitter Web Quick Python Book Get 40% off everything at Manning with code podinit19 at checkout Enter to win a free copy Picks Tobias Inkscape vector graphics editor Naomi La Casa de las Flores (House of Flowers) (Netflix) Links The Quick Python Book Dick Blick Art Materials The PSF @ThePSF on Twitter Manning Publishers PEP8 ETL Collections Module Turtle Library PyCon Hatchery PyCon Charlas Podcast Episode The GIL The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/25/2019 • 38 minutes, 32 seconds

Wes McKinney's Career In Python For Data Analysis

Summary Python has become one of the dominant languages for data science and data analysis. Wes McKinney has been working for a decade to make tools that are easy and powerful, starting with the creation of Pandas, and eventually leading to his current work on Apache Arrow. In this episode he discusses his motivation for this work, what he sees as the current challenges to be overcome, and his hopes for the future of the industry. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Check out the Practical AI podcast from our friends at Changelog Media to learn and stay up to date with what’s happening in AI You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. 
For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with O’Reilly Media for the Strata conference in San Francisco on March 25th and the Artificial Intelligence conference in NYC on April 15th. Here in Boston, starting on May 17th, you still have time to grab a ticket to the Enterprise Data World, and from April 30th to May 3rd is the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Your host as usual is Tobias Macey and today I’m interviewing Wes McKinney about his contributions to the Python community and his current projects to make data analytics easier for everyone Interview Introductions How did you get introduced to Python? You have spent a large portion of your career on building tools for data science and analytics in the Python ecosystem. What is your motivation for focusing on this problem domain? Having been an open source author and contributor for many years now, what are your current thoughts on paths to sustainability? What are some of the common challenges pertaining to data analysis that you have experienced in the various work environments and software projects that you have been involved in? What area(s) of data science and analytics do you find are not receiving the attention that they deserve? Recently there has been a lot of focus and excitement around the capabilities of neural networks and deep learning. In your experience, what are some of the shortcomings or blind spots to that class of approach that would be better served by other classes of solution? Your most recent work is focused on the Arrow project for improving interoperability across languages. What are some of the cases where a Python developer would want to incorporate capabilities from other runtimes? 
Do you think that we should be working to replicate some of those capabilities into the Python language and ecosystem, or is that wasted effort that would be better spent elsewhere? Now that Pandas has been in active use for over a decade and you have had the opportunity to get some space from it, what are your thoughts on its success? With the perspective that you have gained in that time, what would you do differently if you were starting over today? You are best known for being the creator of Pandas, but can you list some of the other achievements that you are most proud of? What projects are you most excited to be working on in the near to medium future? What are your grand ambitions for the future of the data science community, both in and outside of the Python ecosystem? Do you have any parting advice for active or aspiring data scientists, or resources that you would like to recommend? Keep In Touch wesm on GitHub Website @wesmckinn on Twitter Picks Tobias Roald Dahl Wes The Soul Of A New Machine by Tracy Kidder Links Ursa Labs Pandas Podcast Interview with Jeff Reback Pandas Extension Arrays Interview with Tom Augspurger AQR Capital Management Distributed Computing SQL Excel Duke University AppNexus Chang She Ibis Open Source Governance Apache Software Foundation Paul Graham Schlep Blindness Big Data File Formats Avro Parquet ORC Data Engineering Podcast Episode Apache Arrow Hadoop Spark Data Engineering Podcast Episode Apache Impala R Language Ruby Rust Pandas 2.0 Design Docs Apache Arrow and the 10 Things I Hate About Pandas GeoPandas Statsmodels Python For Data Analysis by Wes McKinney Two Sigma RStudio The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/18/2019 • 51 minutes, 44 seconds

The Past, Present, and Future of Deep Learning In PyTorch

Summary The current buzz in data science and big data is around the promise of deep learning, especially when working with unstructured data. One of the most popular frameworks for building deep learning applications is PyTorch, in large part because of its focus on ease of use. In this episode Adam Paszke explains how he started the project, how it compares to other frameworks in the space such as Tensorflow and CNTK, and how it has evolved to support deploying models into production and on mobile devices. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Check out the Practical AI podcast from our friends at Changelog Media to learn and stay up to date with what’s happening in AI You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with O’Reilly Media for the Strata conference in San Francisco on March 25th and the Artificial Intelligence conference in NYC on April 15th. Here in Boston, starting on May 17th, you still have time to grab a ticket to the Enterprise Data World, and from April 30th to May 3rd is the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Your host as usual is Tobias Macey and today I’m interviewing Adam Paszke about PyTorch, an open source deep learning platform that provides a seamless path from research prototyping to production deployment Interview Introductions How did you get introduced to Python? Can you start by explaining what deep learning is and how it relates to machine learning and artificial intelligence? Can you explain what PyTorch is and your motivation for creating it? Why was it important for PyTorch to be open source? There is currently a large and growing ecosystem of deep learning tools built for Python. Can you describe the current landscape and how PyTorch fits in relation to projects such as Tensorflow and CNTK? What are some of the ways that PyTorch is different from Tensorflow and CNTK, and what are the areas where these frameworks are converging? 
How much knowledge of machine learning, artificial intelligence, or neural network topologies is necessary to make use of PyTorch? What are some of the foundational topics that are most useful to know when getting started with PyTorch? Can you describe how PyTorch is architected/implemented and how it has evolved since you first began working on it? You recently reached the 1.0 milestone. Can you talk about the journey to that point and the goals that you set for the release? What are some of the other components of the Python ecosystem that are most commonly incorporated into projects based on PyTorch? What are some of the most novel, interesting, or unexpected uses of PyTorch that you have seen? What are some cases where PyTorch is the wrong choice for a problem? What is the process for incorporating these new techniques and discoveries into the PyTorch framework? What are the areas of active research that you are most excited about? What are some of the most interesting/useful/unexpected/challenging lessons that you have learned in the process of building and maintaining PyTorch? What do you have planned for the future of PyTorch? Keep In Touch apaszke on GitHub @apaszke on Twitter LinkedIn Picks Tobias Un Lun Dun by China Miéville Adam In Praise Of Copying by Marcus Boon Links PyTorch University of Warsaw Poland Polish Olympiad In Informatics Deep Learning Automatic Differentiation Torch 7 Lua Tensorflow CNTK Tensorflow 2 Caffe2 EPFL (Ecole polytechnique fédérale de Lausanne) Fast.ai TorchScript ONNX Transfer Learning C++ Reinforcement Learning NumPy SciPy MatPlotLib The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/10/2019 • 42 minutes, 12 seconds

How To Include Redis In Your Application Architecture

Summary The Redis database recently celebrated its 10th birthday. In that time it has built a well-earned reputation for speed, reliability, and ease of use. Python developers are fortunate to have a well-built client in the form of redis-py to leverage it in their projects. In this episode Andy McCurdy and Dr. Christoph Zimmerman explain the ways that Redis can be used in your application architecture, how the Python client is built and maintained, and how to use it in your projects. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes and tell your friends and co-workers Join the community in the new Zulip chat workspace at pythonpodcast.com/chat You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with O’Reilly Media for the Strata conference in San Francisco on March 25th and the Artificial Intelligence conference in NYC on April 15th. Here in Boston, starting on May 17th, you still have time to grab a ticket to the Enterprise Data World, and from April 30th to May 3rd is the Open Data Science Conference. Go to pythonpodcast.com/conferences to learn more and take advantage of our partner discounts when you register. Your host as usual is Tobias Macey and today I’m interviewing Andy McCurdy and Christoph Zimmerman about the Redis database, and some of the various ways that it is used by Python developers Interview Introductions How did you get introduced to Python? Can you start by explaining what Redis is and how you got involved in the project? How does the redis-py project relate to the Redis database and what motivated you to create the Python client? What are some of the main use cases that Redis enables? Can you describe how Redis-py is implemented and some of the primitives that it provides for building applications on top of? How do the release cycles of redis-py and the Redis database relate to each other? How closely does redis-py match the features of the Redis database? What are some of the convenience methods or features that you have added to make the client more Pythonic? Redis is often used as a key/value cache for web applications, in some cases replacing Memcached. 
What are the characteristics of Redis that lend themselves well to this purpose? What are some edge cases or gotchas that users should be aware of? What are some of the common points of confusion or difficulties when storing and retrieving values in Redis? What have been some of the most challenging aspects of building and maintaining the Redis Python client? What are some of the anti-patterns that you have seen around how developers build on top of Redis? What are some of the most interesting or unexpected ways that you have seen Redis used? What are some of the least used or most misunderstood features of Redis that you think developers should know about? What are some of the recent and near-future improvements or features in Redis that you are most excited by? Keep In Touch Andy @andymccurdy on Twitter andymccurdy on GitHub Christoph chrisAtRedis on GitHub LinkedIn Picks Tobias Rowan Atkinson Andy The Food Lab: Better Home Cooking Through Science by J. Kenji Lopez-Alt Dota 2 Auto Chess (Community Mod) Christoph IPA Infused With Grapefruit Juice redis-py Selenium Python client Daniel Suarez Influx Links redis-py Redis DB Redis Labs PHP Django Reflective Operating System Architectures TCL Perl Linux Memcached NextCloud C programming language uWSGI Flask Gevent PyPy re-json redis-graph Redis-search MongoDB Bloom Filter hiredis Redis Sentinel HA plugin Lua programming language OpenWRT LuCI MicroPython Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/4/2019 • 1 hour, 1 minute, 10 seconds

Marshmallow Data Validation Library

Summary Any time that your program needs to interact with other systems it will have to deal with serializing and deserializing data. To prevent duplicate code and provide validation of the data structures that your application is consuming Steven Loria created the Marshmallow library. In this episode he explains how it is built, how to use it for rendering data objects to various serialization formats, and some of the interesting and unique ways that it is incorporated into other projects. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss the Strata conference in San Francisco on March 25th and the Artificial Intelligence conference in NYC on April 15th, both run by our friends at O’Reilly Media. Go to pythonpodcast.com/stratacon and pythonpodcast.com/aicon to register today and get 20% off. Your host as usual is Tobias Macey and today I’m interviewing Steven Loria about Marshmallow, a Python serialization library that is agnostic to your framework and object mapper of choice. Interview Introductions How did you get introduced to Python? Can you start by describing what Marshmallow is and the history of the project? What are some of the capabilities that make it unique from other similar projects in the Python ecosystem? What are some of the main use cases for schematized serialization and deserialization? Can you walk through how a user would get started with Marshmallow, particularly for complex or nested schemas? Can you describe how Marshmallow is implemented? How has that design evolved since you first began working on it? How have the changes in the Python language and ecosystem impacted the requirements and use cases for Marshmallow? What are some of the most interesting or unexpected ways that you have seen Marshmallow used? What have been some of the most interesting, complex, or challenging aspects of building the Marshmallow project and community? What are lessons you’ve learned from maintaining marshmallow? 
What have been some of the benefits and drawbacks of keeping Marshmallow agnostic to any frameworks or object mappers? What are some of the edge cases that users of Marshmallow should be aware of? What are some of the little-known features of Marshmallow that you find most useful? What do you have planned for the future of Marshmallow? Keep In Touch Email Website @sloria1 on Twitter Picks Tobias Sherlock BBC tv series Steven Greater Than Code podcast Links Marshmallow Butterfly Network Biology ORM (Object Relational Mapper) ODM (Object Document Mapper) Webargs Avro Swagger/OpenAPI REST (REpresentational State Transfer) JSON-Schema Environs Django Rest Framework WTForms DynamoDB MongoDB Etsy’s boundary-layer for building Airflow DAGs from config files Airflow Podcast Episode Toasted Marshmallow Lyft Blog Post The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/25/2019 • 34 minutes, 4 seconds

Unpacking The Python Toolkit For Chaos Engineering

Summary Chaos engineering is the practice of injecting failures into your production systems in a controlled manner to identify weaknesses in your applications. In order to build, run, and report on chaos experiments Sylvain Hellegouarch created the Chaos Toolkit. In this episode he explains his motivation for creating the toolkit, how to use it for improving the resiliency of your systems, and his plans for the future. He also discusses best practices for building, running, and learning from your own experiments. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing Sylvain Hellegouarch about Chaos Toolkit, a framework for building and automating chaos engineering experiments. Interview Introductions How did you get introduced to Python? Can you start by explaining what Chaos Engineering is? What is the Chaos Toolkit and what motivated you to create it? How does it compare to the Gremlin platform? What is the workflow for using Chaos Toolkit to build and run an experiment? What are the best practices for building a useful experiment? Once you have an experiment created, how often should it be executed? When running an experiment, what are some strategies for identifying points of failure, particularly if they are unexpected? What kinds of reporting and statistics are captured during a test run? Can you describe how Chaos Toolkit is implemented and how it has evolved since you began working on it? What are some of the most challenging aspects of ensuring that the experiments run via the Chaos Toolkit are safe and have a reliable rollback available? What have been some of the most interesting/useful/unexpected lessons that you have learned in the process of building and maintaining the Chaos Toolkit project and community? What do you have planned for the future of the project? 
Keep In Touch lawouach on GitHub Blog @lawouach on Twitter LinkedIn Picks Tobias Time Trap Sylvain Playing Guitar Step away from the computer Links Chaos Toolkit Chaos IQ Gremlin chaos engineering service Russ Miles Chaos IQ co-founder Zope CherryPy minimalist Python web framework Cherrypy Essentials book Chaos Engineering Chaos Engineering Book DevOps SRE (Site Reliability Engineering) Dark Debt Netflix Simian Army Chaos Monkey Terraform Kubecon Istio service mesh Chaos Platform PyInstaller Composition vs Inheritance Open Chaos Initiative CNCF The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/18/2019 • 59 minutes, 39 seconds

Computational Musicology For Python Programmers

Summary Music is a part of every culture around the world and throughout history. Musicology is the study of that music from a structural and sociological perspective. Traditionally this research has been done in a manual and painstaking manner, but the advent of the computer age has enabled an increase of many orders of magnitude in the scope and scale of analysis that we can perform. The music21 project is a Python library for computer aided musicology that is written and used by MIT professor Michael Scott Cuthbert. In this episode he explains how the project was started, how he is using it personally, professionally, and in his lectures, as well as how you can use it for your own exploration of musical analysis. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. 
And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing Michael Cuthbert about music21, a toolkit for computer aided musicology. Interview Introductions How did you get introduced to Python? Can you start by explaining what computational musicology is? What is music21 and what motivated you to create it? What are some of the use cases that music21 supports, and what are some common requests that you purposefully don’t support? How much knowledge of musical notation, structure, and theory is necessary to be able to work with music21? Can you talk through a typical workflow for doing analysis of one or more pieces of existing music? What are some of the common challenges that users encounter when working with it (either on the side of Python or musicology/musical theory)? What about for doing exploration of new musical works? As a professor at MIT, what are some of the ways that music21 has been incorporated into your classroom? What have they enjoyed most about it? How is music21 implemented, and how has its structure evolved since you first started it? What have been the most challenging aspects of building and maintaining the music21 project and community? What are some of the most interesting, unusual, or unexpected ways that you have seen music21 used? What are some analyses that you have performed which yielded unexpected results? What do you have planned for the future of music21? Beyond computational analysis of musical theory, what are some of the other ways that you are using Python in your academic and professional pursuits? 
Keep In Touch mscuthbert on GitHub @mscuthbert on Twitter Picks Tobias Mozart’s Requiem performed by the Berlin Philharmonic and conducted by Claudio Abbado Michael von Karajan Institute – Karajan was a major conductor of the 60s; his Institute now sponsors research into new projects in music technology and is a big advocate of using Python for its data analysis. Ruth Crawford Seeger, String Quartet (1931) performed by The Playground Ensemble Links music21 Studies in Western Music History: Quantitative and Computational Approaches to Music History on MIT Open Courseware MIT Perl National Bureau of Economic Research Zen of Python Musicology Matplotlib Orange Podcast Episode scikit-learn Abjad Python Package SciPy numpy Pandas Podcast Episode PyLevenshtein Levenshtein Distance PyGame AVL Tree Subversion (SVN) Bach Chorales Artusi.xyz Interactive Music Theory VexFlow MIT Digital Humanities NLTK Flask Fortran Django Humdrum The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/11/2019 • 47 minutes, 48 seconds

Classic Computer Science For Pythonistas

Summary Software development is a career that attracts people from all backgrounds, and Python in particular helps to make it an approachable occupation. Because of the variety of paths that can be taken it is becoming increasingly common for practitioners to bypass the traditional computer science education. In this episode David Kopec discusses some of the classic problems that he has found most useful to understand in his work as a professor and practitioner of software engineering. He shares his motivation for writing the book "Classic Computer Science Problems In Python", the practical approach that he took, and an overview of how the contents can be used in your day-to-day work. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. 
And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing David Kopec about his recent book "Classic Computer Science Problems In Python". Interview Introductions How did you get introduced to Python? Can you start by discussing your motivation for creating this book and the subject matter that it covers? How do you define a "classic" computer science problem and what were your criteria for selecting the specific cases that you included in the book? What are your favorite features of the Python language, and which of them did you learn as part of the process of writing the examples for this book? Which classes of problems have you found to be most difficult for your readers and students to master? Which do you consider to be most relevant/useful to professional software engineers? I was pleasantly surprised to see introductory aspects of artificial intelligence included in the subject matter that you covered. How did you approach the challenge of making the underlying principles accessible to readers who don’t necessarily have a background in the related fields of mathematics? What are some of the most interesting or unexpected changes that you had to make in the process of adapting your examples from Swift to Python in order to make them appropriately idiomatic? By aiming for an intermediate audience you free yourself of the need to incorporate fundamental aspects of programming, but there can be a wide variety of backgrounds at that level of experience. How did you approach the challenge of making the text accessible while still being accurate and engaging? 
What are some of the resources that you would recommend to readers who would like to continue learning about computer science after completing your book? Keep In Touch @davekopec on Twitter Website Book Discount And Giveaway Use code podinit19 to get 40% off all Manning products Picks Tobias Elementor David nesdev The Curse of Oak Island Links Classic Computer Science Problems in Python Classic Computer Science Problems in Swift Dart For Absolute Beginners Dart Swift Manning Publications Apress Python Data Classes Python Type Hints Recursion A* Search Algorithm Neural Network Champlain College Burlington, VT, USA HyperLoop Data Structures And Algorithms In Python by Michael T. Goodrich, Roberto Tamassia, Michael H. Goldwasser MyPy Podcast Interview PyTorch Minimax Dartmouth College Big O Notation The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/4/2019 • 47 minutes, 28 seconds

What You Need To Know About Open Source Licenses And Intellectual Property

Summary As a developer and user of open source code, you interact with software and digital media every day. What is often overlooked are the rights and responsibilities conveyed by the intellectual property that is implicit in all creative works. Software licenses are a complicated legal domain in their own right, and they can often conflict with each other when you factor in the web of dependencies that your project relies on. In this episode Luis Villa, Co-Founder of Tidelift, explains the categories of software licenses, how to select the right one for your project, and what to be aware of when you contribute to someone else’s code. Announcements Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing Luis Villa about software licensing and intellectual property rules that developers need to know. Interview Introductions How did you get started as a programmer? Intellectual property law and licensing of software, data, and media are complicated topics that are often poorly understood by developers. Can you start off by giving an overview of categories of intellectual property that we should be thinking of? Most of us who have created or used software, whether it is open or closed source, have at some point come across various licenses. What may not be immediately obvious is that there are degrees of compatibility between these licenses. What are some guiding principles for determining which licenses are in conflict? In an organization, who is responsible for ensuring compliance with software and content licensing within a given project? When introducing new dependencies into a project or system what steps should be taken to evaluate license compatibility and compliance? When creating a new project, one of the steps in the process is to select a license. What are some useful guidelines or questions to determine which license to use? Another aspect of software licensing that developers might run into is when contributing to an open source project where a contributor license agreement might be necessary. What should we be thinking about when deciding whether to sign such an agreement? In addition to software libraries, developers might need to use content such as images, audio, or video in their projects which have their own copyright and licensing considerations. 
What are some of the things that we should be looking for in those situations? Another component of our systems that has grown in its importance with the rise of advanced analytics is data. We may need to use open data sources, pay for access to data repositories, or provide access to data that is under our control. What are some common approaches to licensing or terms of use for these contexts? What should we be wary of when using or providing data in our applications? How much of the work that you do at Tidelift is spent on educating developers and customers on the finer points of intellectual property management? What are some of the most common difficulties or points of confusion that you encounter? What are some useful resources that you would recommend to anyone who is interested in learning more about intellectual property and software licensing? Keep In Touch Website @luis_in_140 on Twitter LinkedIn Picks Tobias Spider-Man: Into the Spider-Verse Luis The Good Place Twitter and Tear Gas by Zeynep Tufekci Links Intellectual Property and Open Source: A Practical Guide To Protecting Code by Van Lindberg Tidelift BASIC Apple //e Copyright Trademark Patent Copyleft OSI Approved Licenses Permissive Licenses Strong and Weak Copyleft SSPL (Server Side Public License) OSI (Open Source Initiative) Contributor License Agreement FSF (Free Software Foundation) DCO (Developer Certificate of Origin) Creative Commons Noun Project Free Music Archive Wikimedia Commons TL;DR Legal The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/28/2019 • 1 hour, 2 minutes, 58 seconds

Counteracting Code Complexity With Wily

Summary As we build software projects, complexity and technical debt are bound to creep into our code. To counteract these tendencies it is necessary to calculate and track metrics that highlight areas of improvement so that they can be acted on. To aid in identifying areas of your application that are breeding grounds for incidental complexity Anthony Shaw created Wily. In this episode he explains how Wily traverses the history of your repository and computes code complexity metrics over time and how you can use that information to guide your refactoring efforts. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing Anthony Shaw about Wily, a command-line application for tracking and reporting on complexity of Python tests and applications. Interview Introductions How did you get introduced to Python? Can you start by describing what Wily is and what motivated you to create it? What is software complexity and why should developers care about it? What are some methods for measuring complexity? I know that Python has the McCabe tool, but what other methods are there for determining complexity, both in Python and for other languages? What kinds of useful signals can you derive from evaluating historical trends of complexity in a codebase? What are some other useful metrics for tracking and maintaining the health of a software project? Once you have established the points of complexity in your software, what are some strategies for remediating it? What are your favorite tools for refactoring? What are some of the aspects of developer-oriented tools that you have found to be most important in your own projects? What are your plans for the future of Wily, or any other tools that you have in mind to aid in producing healthy software? 
Keep In Touch anthonywritescode on GitHub @anthonypjshaw on Twitter Website Medium Picks Tobias Baobab Impractical Jokers Anthony Line Of Duty Fierce Girls Links Wily Dimension Data Pluralsight Real Python Seattle C# Cyclomatic Complexity McCabe Git C Assembly Halstead Radon The Zen Of Python Vocabulary Metric Java Anti Patterns God Object Pre-Commit Codeclimate Glom ASQ PyCharm PyDocStyle PyLint Black Sunburst Chart Visual Studio Code The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/21/2019 • 36 minutes, 17 seconds

Teaching Digital Archaeology With Jupyter Notebooks

Summary Computers have found their way into virtually every area of human endeavor, and archaeology is no exception. To aid his students in their exploration of digital archaeology Shawn Graham helped to create an online, digital textbook with accompanying interactive notebooks. In this episode he explains how computational practices are being applied to archaeological research, how the Online Digital Archaeology Textbook was created, and how you can use it to get involved in this fascinating area of research. Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat. Your host as usual is Tobias Macey and today I’m interviewing Shawn Graham about his work on the Online Digital Archaeology Textbook. Interview Introductions How did you get introduced to Python? Can you start by explaining what digital archaeology is? To facilitate your teaching you have collaborated on the O-DATE textbook and associated Jupyter notebooks. Can you describe what that resource covers and how the project got started? What have you found to be the most critical lessons for your students to help them be effective archaeologists? What are the most useful aspects of leveraging computational techniques in an archaeological context? Can you describe some of the sources and formats of data that would commonly be encountered by digital archaeologists? The notebooks that accompany the text have a mixture of R and Python code. What are your personal guidelines for when to use each language? How have the skills and tools of software engineering influenced your views and approach to research and education in the realm of archaeology? What are some of the most novel or engaging ways that you have seen computers applied to the field of archaeology? What are your goals and aspirations for the O-DATE project? 
Keep In Touch Blog @electricarchaeo on Twitter Picks Tobias TaoTronics Noise Cancelling Earbuds Shawn Ian Rankin In A House Of Lies Links O-DATE Textbook Carleton University Ottawa Canada Simulation Modeling Agent Based Modeling NetLogo Complexity Theory Archaeology Digital Archaeology The Programming Historian University of Western Ontario Historical GIS ArcGIS QGIS Digital Humanities Project Jupyter Podcast Episode Binder – Service for hosting Jupyter notebooks E-Campus Ontario Graph Databases SPARQL OpenContext.org TDAR (The Digital Archaeology Record) R Language R OpenSci Arrow Pandas Podcast Episode Neural Networks Generative Adversarial Networks Computer Vision Archaeogaming Alamogordo Atari Excavation Leiden University Interactive Pasts Conference Photogrammetry LIDAR Palmyran Arch Ben Marwick Matt Harris Jolene Smith Sara Perry Rachel Opitz Colleen Morgan Patrick Burns Ethan Watrall Andrew Reinhard Neha Gupta Katherine Cook Value Foundation The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/14/2019 • 49 minutes, 35 seconds

Analyzing Satellite Image Data Using PyTroll

Summary Every day there are satellites collecting sensor readings and imagery of our Earth. To help make sense of that information, developers at the meteorological institutes of Sweden and Denmark worked together to build a collection of Python packages that simplify the work of downloading and processing satellite image data. In this episode one of the core developers of PyTroll explains how the project got started, how that data is being used by the scientific community, and how citizen scientists like you are getting involved. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. 
Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Your host as usual is Tobias Macey and today I’m interviewing Martin Raspaud about PyTroll, a suite of projects for processing earth observing satellite data Interview Introductions How did you get introduced to Python? Can you start by explaining what PyTroll is and how the overall project got started? What is the story behind the name? What are the main use cases for PyTroll? (e.g. types of analysis, research domains, etc.) What are the primary types of data that would be processed and analyzed with PyTroll? (e.g. images, sensor readings, etc.) When retrieving the data, are you communicating directly with the satellites, or are there facilities that fetch the information periodically which you can then interface with? How do you locate and select which satellites you wish to retrieve data from? What are the main components of PyTroll and how do they fit together? For someone processing satellite data with PyTroll, can you describe the workflow? What are some of the main data formats that are used by satellites? What tradeoffs are made between data density/expressiveness and bandwidth optimization? What are some of the common issues with data cleanliness or data integration challenges? Once the data has been retrieved, what are some of the types of analysis that would be performed with PyTroll? Are there other tools that would commonly be used in conjunction with PyTroll? What are some of the unique challenges posed by working with satellite observation data? How has the design and capability of the various PyTroll packages evolved since you first began working on it? What are some of the most interesting or unusual ways that you have seen PyTroll used? What are some of the lessons that you have learned while building PyTroll that you have found to be most useful or unexpected? What do you have planned for the future of PyTroll? 
Keep In Touch Martin mraspaud on GitHub @MartinRaspaud on Twitter Pytroll Website Slack Mailing List @PyTroll on Twitter Picks Tobias Tool A Perfect Circle Martin Vulfpeck Links PyTroll Swedish Meteorological and Hydrological Institute Common Lisp Danish Meteorological Institute Trolls in Scandinavian Lore NumPy KISS (Keep It Simple Stupid) Spectroscopy Radiance Polar Orbiting Satellite Geostationary Satellite EUMETSAT SatPy PyResample Cartographic Projection Proj4 GOES16 GOES17 Dask Data Engineering Podcast Episode NetCDF HDF5 PySpectral PyCoast SupervisorD TrollCast European Space Agency The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/7/2019 • 43 minutes, 57 seconds

Building GraphQL APIs in Python Using Graphene with Syrus Akbary

Summary The web has spawned numerous methods for communicating between applications, including protocols such as SOAP, XML-RPC, and REST. One of the newest entrants is GraphQL which promises a simplified approach to client development and reduced network requests. To make implementing these APIs in Python easier, Syrus Akbary created the Graphene project. In this episode he explains the origin story of Graphene, how GraphQL compares to REST, how you can start using it in your applications, and how he is working to make his efforts sustainable. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Your host as usual is Tobias Macey and today I’m interviewing Syrus Akbary about Graphene, a Python library for building your APIs with GraphQL Interview Introductions How did you get introduced to Python? What is GraphQL and what is the benefit vs a REST-based API? How does it compare to specifications such as OpenAPI (formerly Swagger) or RAML? Can you explain what Graphene is and your motivation for building it? In addition to the Python implementation there is also a JavaScript library. Is that primarily for use as a client or can it also be used in Node for serving APIs? What is involved in building a GraphQL API? What does Graphene do to simplify this process? How is Graphene implemented and how has that evolved since you first started working on it? Is there a set of tests for verifying the compliance of Graphene or a specific API with the GraphQL specification? What are some of the most complex or confusing aspects of building a GraphQL API? What are some of the unique capabilities that are offered by building an application with GraphQL as the communication interface? While reading through documentation in preparation for our conversation I noticed the Quiver project. Can you explain what that is and how it fits with the other Graphene projects? What is it doing under the hood to optimize serving of the API? For someone who is interested in adding a GraphQL interface to an existing application, what would be involved? The documentation mentions creation of a schema, as well as defining queries. Is it possible for a client to craft queries that don’t match directly with those defined in the server layer? 
What are some of the most interesting or surprising uses of Graphene and GraphQL that you have seen? What are some cases where it would be more practical to implement an API using REST instead of GraphQL? What are some references that you would recommend for anyone who wants to learn more about GraphQL and its ecosystem? What are your plans for the future of Graphene? Keep In Touch syrusakbary on GitHub Website @syrusakbary on Twitter Picks Tobias Audible Syrus Web Assembly Links Graphene GraphQL REST (REpresentational State Transfer) OpenAPI RAML PHP Facebook Engineering Graphene-SQLAlchemy Graphene-Django GraphiQL PyJade Django Rest Framework How To GraphQL Python 3.7 Dataclasses Graphene GitHub Issue The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/31/2018 • 52 minutes, 48 seconds

AIORTC: An Asynchronous WebRTC Framework with Jeremy Lainé

Summary Real-time communication over the internet is an amazing feat of modern engineering. The protocol that powers a majority of video calling platforms is WebRTC. In this episode Jeremy Lainé explains why he wrote a Python implementation of this protocol in the form of AIORTC. He also discusses how it works, how you can use it in your own projects, and what he has planned for the future. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. 
Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Your host as usual is Tobias Macey and today I’m interviewing Jeremy Lainé about AIORTC, an asynchronous implementation of the WebRTC and ObjectRTC protocols in Python Interview Introductions How did you get introduced to Python? Can you start by explaining what the WebRTC and ObjectRTC protocols are? What are some of the main use cases for these protocols? What is AIORTC and what was your motivation for creating it? How does it compare to other implementations of the RTC protocols? Why do you think there haven’t been any other Python implementations? What are some of the benefits of having a Python implementation of the RTC protocol? How is AIORTC implemented? What have been some of the most difficult or challenging aspects of implementing a WebRTC compliant library? What are some of the most interesting or useful lessons that you have learned in the process? What is involved in building an application on top of AIORTC? What would be required to integrate AIORTC into an existing application built with something such as Flask or Django? What are some of the most interesting uses of AIORTC that you have seen? What are some of the projects that you would like to build with AIORTC? What are some cases where it would make more sense to use a different library or framework for your WebRTC projects? What are your plans for the future of AIORTC? 
Keep In Touch jlaine on GitHub Website @JeremyLaine on Twitter Picks Tobias Tengger Cavalry Jeremy PyAV Mike Boers Links AIORTC WebRTC Electrical Engineering C C++ PHP Ruby STUN (Session Traversal Utilities for NAT) TURN (Traversal Using Relays around NAT) ICE (Interactive Connectivity Establishment) TLS (Transport Layer Security) RTP (Real-time Transport Protocol) Zencastr Jitsi RawRTC AsyncIO AIOICE Cryptography Podcast.init Episode OpenCV PyAV FFMPEG Edge Detection Asterisk Raspberry Pi Datagram Transport Layer Security Mozilla Augmented Reality The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/24/2018 • 40 minutes, 50 seconds

Polyglot: Multi-Lingual Natural Language Processing with Rami Al-Rfou

Summary Using computers to analyze text can produce useful and inspirational insights. However, when working with multiple languages the capabilities of existing models are severely limited. In order to help overcome this limitation Rami Al-Rfou built Polyglot. In this episode he explains his motivation for creating a natural language processing library with support for a vast array of languages, how it works, and how you can start using it for your own projects. He also discusses current research on multi-lingual text analytics, how he plans to improve Polyglot in the future, and how it fits in the Python ecosystem. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Your host as usual is Tobias Macey and today I’m interviewing Rami Al-Rfou about Polyglot, a natural language pipeline with support for an impressive number of languages Interview Introductions How did you get introduced to Python? Can you start by describing what Polyglot is and your reasons for starting the project? What are the types of use cases that Polyglot enables which would be impractical with something such as NLTK or SpaCy? A majority of NLP libraries have a limited set of languages that they support. What is involved in adding support for a given language to a natural language tool? What is involved in adding a new language to Polyglot? Which families of languages are the most challenging to support? What types of operations are supported and how consistently are they supported across languages? How is Polyglot implemented? Is there any capacity for integrating Polyglot with other tools such as SpaCy or Gensim? How much domain knowledge is required to be able to effectively use Polyglot within an application? What are some of the most interesting or unique uses of Polyglot that you have seen? What have been some of the most complex or challenging aspects of building Polyglot? What do you have planned for the future of Polyglot? What are some areas of NLP research that you are excited for? Keep In Touch Picks Tobias Duolingo Rami The Wizard and the Prophet: Two Remarkable Scientists and Their Dueling Visions to Shape Tomorrow’s World by Charles C. 
Mann Links Polyglot Polyglot-NER Jordan NLP (Natural Language Processing) Stony Brook University Arabic Sentiment Analysis Assembly Language C .NET Stack Overflow Deep Learning Word Embedding Wikipedia Word2Vec NLTK (Python Natural Language Toolkit) SpaCy Podcast Episode Gensim Podcast Episode Morphology Morpheme Transfer Learning Read The Docs BERT (Bidirectional Encoder Representations from Transformers) FastText data.world Data Engineering Podcast Episode Quilt package management for data Data Engineering Podcast Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/17/2018 • 43 minutes, 41 seconds

Gnocchi: A Scalable Time Series Database For Your Metrics with Julien Danjou

Summary Do you know what your servers are doing? If you have a metrics system in place then the answer should be “yes”. One critical aspect of that platform is the timeseries database that allows you to store, aggregate, analyze, and query the various signals generated by your software and hardware. As the size and complexity of your systems scale, so does the volume of data that you need to manage which can put a strain on your metrics stack. Julien Danjou built Gnocchi during his time on the OpenStack project to provide a time oriented data store that would scale horizontally and still provide fast queries. In this episode he explains how the project got started, how it works, how it compares to the other options on the market, and how you can start using it today to get better visibility into your operations. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. 
And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at pythonpodcast.com/chat Your host as usual is Tobias Macey and today I’m interviewing Julien Danjou about Gnocchi, an open source time series database built to handle large volumes of system metrics Interview Introductions How did you get introduced to Python? Can you start by describing what Gnocchi is and how the project got started? What was the motivation for moving Gnocchi out of the OpenStack organization and into its own top level project? The spaces of time series databases and metrics-as-a-service platforms are both fairly crowded. What are the unique features of Gnocchi that would lead someone to deploy it in place of other options? What are some of the tools and platforms that are popular today which hadn’t yet gained visibility when you first began working on Gnocchi? How is Gnocchi architected? How has the design changed since you first started working on it? What was the motivation for implementing it in Python and would you make the same choice today? One of the interesting features of Gnocchi is its support of resource history. Can you describe how that operates and the types of use cases that it enables? Does that factor into the multi-tenant architecture? What are some of the drawbacks of pre-aggregating metrics as they are being written into the storage layer (e.g. loss of fidelity)? Is it possible to maintain the raw measures after they are processed into aggregates? One of the challenging aspects of building a scalable metrics platform is support for high-cardinality data. What sort of labelling and tagging of metrics and measures is available in Gnocchi? 
For someone who wants to implement Gnocchi for their system metrics, what is involved in deploying, maintaining, and upgrading it? What are the available integration points for extending and customizing Gnocchi? Once metrics have been stored, aggregated, and indexed, what are the options for querying and analyzing the collected data? When is Gnocchi the wrong choice? What do you have planned for the future of Gnocchi? Keep In Touch jd on GitHub Website @juldanjou on Twitter Picks Tobias Marketplace Podcast Julien Mergify Links Gnocchi RedHat OpenStack Object Oriented Programming O’Reilly Debian Ceilometer Prometheus Time Series MySQL Gerrit Zuul Podcast Episode GitHub GitLab Graphite Podcast Episode DataDog RabbitMQ InfluxDB Ceph Podcast Episode S3 OpenStack Swift Cassandra Honeycomb Observability Service Podcast Episode AMQP Redis DSL (Domain Specific Language) Golang RBAC (Role-Based Access Control) CollectD StatsD Gnocchi Client Telegraf Grafana TimescaleDB Podcast Episode OpenStack Heat The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/10/2018 • 39 minutes, 16 seconds

Keeping Up With The Python Community For Fun And Profit with Dan Bader

Summary Keeping up with the work being done in the Python community can be a full time job, which is why Dan Bader has made it his! In this episode he discusses how he went from working as a software engineer, to offering training, to now managing both the Real Python and PyCoders properties. He also explains his strategies for tracking and curating the content that he produces and discovers, how he thinks about building products, and what he has learned in the process of running his businesses. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Dan Bader about finding, filtering, and creating resources for Python developers at Real Python, PyCoders, and his own trainings Interview Introductions How did you get introduced to Python? Let’s start by discussing your primary job these days and how you got to where you are. In the past year you have also taken over management of the Real Python site. 
How did that come about and what are your responsibilities? You just recently took over management of the PyCoders newsletter and website. Can you describe the events that led to that outcome and the responsibilities that came along with it? What are the synergies that exist between your various roles and projects? What are the areas of conflict? (e.g. time constraints, conflicts of interest, etc.) Between PyCoders, Real Python, your training materials, your Python tips newsletter, and your coaching you have a lot of incentive to keep up to date with everything happening in the Python ecosystem. What are your strategies for content discovery? With the diversity in use cases, geography, and contributors to the landscape of Python how do you work to counteract any bias or blind spots in your work? There is a constant stream of information about any number of topics and subtopics that involve the Python language and community. What is your process for filtering and curating the resources that are ultimately included in the various media properties that you oversee? In my experience with the podcast one of the most difficult aspects of maintaining relevance as a content creator is obtaining feedback from your audience. What do you do to foster engagement and facilitate conversations around the work that you do? You have also built a few different product offerings. Can you discuss the process involved in identifying the relevant opportunities and the creation and marketing of them? Creating, collecting, and curating content takes a significant investment of time and energy. What are your avenues for ensuring the sustainability of your various projects? What are your plans for the future growth and development of your media empire? As someone who is so deeply involved in the conversations flowing through and around Python, what do you see as being the greatest threats and opportunities for the language and its community? 
Keep In Touch @dbaderorg on Twitter Website dbader on GitHub Picks Tobias Data Engineering Podcast Dan Black code formatter Łukasz Langa Links Dan Bader Nerd Lettering Real Python PyCoders Computer Science Vancouver, BC Django Raymond Hettinger Data Science Flask Pythonista Cafe Python Tricks The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/3/2018 • 57 minutes, 56 seconds

Using Calibre To Keep Your Digital Library In Order with Kovid Goyal

Summary Digital books are convenient and useful ways to have easy access to large volumes of information. Unfortunately, keeping track of them all can be difficult as you gain more books from different sources. Keeping your reading device synchronized with the material that you want to read is also challenging. In this episode Kovid Goyal explains how he created the Calibre digital library manager to solve these problems for himself, how it grew to be the most popular application for organizing ebooks, and how it works under the covers. Calibre is an incredibly useful piece of software with a lot of hidden complexity and a great story behind it. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Kovid Goyal about Calibre, the powerful and free ebook management tool Interview Introductions How did you get introduced to Python? Can you start by explaining what Calibre is and how the project got started? 
How are you able to keep up to date with device support in Calibre, given the continual release of new devices and platforms that a user can read ebooks on? What are the main features of Calibre? What are some of the most interesting and most popular plugins that have been created for Calibre? Can you describe the software architecture for the project and how it has evolved since you first started working on it? You have been maintaining and improving Calibre for a long time now. What is your motivation to keep working on it? How has the focus of the project and the primary use cases changed over the years that you have been working on it? In addition to its longevity, Calibre has also become a de facto standard for ebook management. What is your opinion as to why it has gained and kept its popularity? What are some of the competing options and how does Calibre differentiate from them? In addition to the myriad devices and platforms, there is a significant amount of complexity involved in supporting the different ebook formats. What have been the most challenging or complex aspects of managing and converting between the formats? One of the challenges around maintaining a private library of electronic resources is the prevalence of DRM restricted content available through major publishers and retailers. What are your thoughts on the current state of digital book marketplaces? What was your motivation for implementing Calibre in Python? If you were to start the project over today would you make the same choice? Are there any aspects of the project that you would implement differently if you were starting over? What are your plans for the future of Calibre? Keep In Touch kovidgoyal on GitHub Website Patreon Picks Tobias American Gods by Neil Gaiman Kovid Into Thin Air by Jon Krakauer About how an expedition to climb Everest went wrong. Wonderful account of the difficulties of high altitude mountaineering and the determination it needs. 
The Steerswoman’s Road by Rosemary Kirstein About the spirit of scientific enquiry in a fallen civilization on an alien planet with partial terraforming that is slowly failing. Links Calibre KDE Caltech Sony PRS500 Linux Kindle Kobo ePUB Calibre Recipes RapydScript NG Goodreads Qt PyQt build-calibre Kitty DRM (Digital Rights Management) The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/26/2018 • 43 minutes, 25 seconds

Entity Extraction, Document Processing, And Knowledge Graphs For Investigative Journalists with Friedrich Lindenberg

Summary Investigative reporters have a challenging task of identifying complex networks of people, places, and events gleaned from a mixed collection of sources. Turning those various documents, electronic records, and research into a searchable and actionable collection of facts is an interesting and difficult technical challenge. Friedrich Lindenberg created the Aleph project to address this issue and in this episode he explains how it works, why he built it, and how it is being used. He also discusses his hopes for the future of the project and other ways that the system could be used. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode today to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Registration for PyCon US, the largest annual gathering across the community, is open now. Don’t forget to get your ticket and I’ll see you there! 
Your host as usual is Tobias Macey and today I’m interviewing Friedrich Lindenberg about Aleph, a tool to perform entity extraction across documents and structured data Interview Introductions How did you get introduced to Python? Can you start by explaining what Aleph is and how the project got started? What is investigative journalism? How does Aleph fit into their workflow? What are some other tools that would be used alongside Aleph? What are some ways that Aleph could be useful outside of investigative journalism? How is Aleph architected and how has it evolved since you first started working on it? What are the major components of Aleph? What are the types of documents and data formats that Aleph supports? Can you describe the steps involved in entity extraction? What are the most challenging aspects of identifying and resolving entities in the documents stored in Aleph? Can you describe the flow of data through the system from a document being uploaded through to it being displayed as part of a search query? What is involved in deploying and managing an installation of Aleph? What have been some of the most interesting or unexpected aspects of building Aleph? Are there any particularly noteworthy uses of Aleph that you are aware of? What are your plans for the future of Aleph? 
Keep In Touch Website @pudo on Twitter pudo on GitHub Picks Tobias Mechanical Soup Friedrich phonenumbers – because it’s useful pyicu – super nerdy but amazing sqlalchemy – my all-time favorite python package Links Aleph Organized Crime and Corruption Reporting Project OCR (Optical Character Recognition) Jorge Luis Borges Buenos Aires Investigative Journalism Azerbaijan Signal Open Corporates Open Refine Money Laundering E-Discovery CSV SQL Entity Extraction (Named Entity Recognition) Apache Tika Polyglot SpaCy Podcast.__init__ Episode LibreOffice Tesseract followthemoney Elasticsearch Knowledge Graph Neo4J Gephi Edward Snowden Document Cloud Overview Project Veracrypt Qubes OS I2 Analyst Notebook The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/19/2018 • 39 minutes, 12 seconds

Bringing Python To The Spanish Language Community with Maricela Sanchez

Summary The Python community is large and growing; however, a majority of articles, books, and presentations are still in English. To increase the accessibility for Spanish language speakers, Maricela Sanchez helped to create the Charlas track at PyCon US, and is an organizer for Python Day Mexico. In this episode she shares her motivations for getting involved in community building, her experiences working on Python Day Mexico and PyCon Charlas, and the lessons that she has learned in the process. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Maricela Sanchez Miranda about her work in organizing PyCon Charlas, the Spanish language track at PyCon US, as well as Python Day Mexico Interview Introductions How did you get introduced to Python? Can you briefly describe PyCon Charlas and Python Day Mexico? What has been your motivation for getting involved with organizing these community events? What do you find to be the unique characteristics of the Python community in Mexico? 
What kind of feedback have you gotten from the Charlas track at PyCon? What are your goals for fostering these Spanish language events? What are some of the lessons that you have learned from PyCon Charlas that were useful in organizing Python Day Mexico? What have been the most challenging or complicated aspects of organizing Python Day Mexico? How many attendees do you anticipate? How has that affected your planning and preparation? Are there any aspects of the geography, infrastructure, or culture of Mexico that you have found to be either beneficial or challenging for organizing a conference? Do you anticipate PyCon Charlas and Python Day Mexico becoming annual events? What is your advice for anyone who is interested in organizing a conference in their own region or language? Keep In Touch mayela on GitHub @mayela0x14 on Twitter Picks Tobias CardLine Dinosaurs Maricela Links Python Day Mexico PyCon Charlas PyCon Hatchery PyCon Latin America Mexico City Guadalajara The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/29/2018 • 19 minutes, 28 seconds

Of Checklists, Ethics, and Data with Emily Miller and Peter Bull

Summary As data science becomes more widespread and has a bigger impact on the lives of people, it is important that those projects and products are built with a conscious consideration of ethics. Keeping ethical principles in mind throughout the lifecycle of a data project helps to reduce the overall effort of preventing negative outcomes from the use of the final product. Emily Miller and Peter Bull of Driven Data have created Deon to improve the communication and conversation around ethics among and between data teams. It is a Python project that generates a checklist of common concerns for data oriented projects at the various stages of the lifecycle where they should be considered. In this episode they discuss their motivation for creating the project, the challenges and benefits of maintaining such a checklist, and how you can start using it today. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. 
Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Emily Miller and Peter Bull about Deon, an ethics checklist for data projects Interview Introductions How did you get introduced to Python? Can you start by describing what Deon is and your motivation for creating it? Why a checklist, specifically? What’s the advantage of this over an oath, for example? What is unique to data science in terms of the ethical concerns, as compared to traditional software engineering? What is the typical workflow for a team that is using Deon in their projects? Deon ships with a default checklist but allows for customization. What are some common addendums that you have seen? Have you received pushback on any of the default items? How does Deon simplify communication around ethics across team boundaries? What are some of the most often overlooked items? What are some of the most difficult ethical concerns to comply with for a typical data science project? How has Deon helped you at Driven Data? What are the customer facing impacts of embedding a discussion of ethics in the product development process? Some of the items on the default checklist coincide with regulatory requirements. Are there any cases where regulation is in conflict with an ethical concern that you would like to see practiced? What are your hopes for the future of the Deon project? 
Keep In Touch Emily LinkedIn ejm714 on GitHub Peter LinkedIn @pjbull on Twitter pjbull on GitHub Driven Data @drivendataorg on Twitter drivendataorg on GitHub Website Picks Tobias Richard Bond Glass Art Emily Tandem Coffee in Portland, Maine Peter The Model Bakery in Saint Helena and Napa, California Links Deon Driven Data International Development Brookings Institution Stata Econometrics Metis Bootcamp Pandas Podcast Episode C# .NET Podcast.__init__ Episode On Software Ethics Jupyter Notebook Podcast Episode Word2Vec cookiecutter data science Logistic Regression The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/22/2018 • 45 minutes, 16 seconds

How Python Is Used To Build A Startup At Wanderu with Chris Kirkos and Matt Warren

Summary The breadth of use cases that Python supports, coupled with the level of productivity that it provides through its ease of use, have contributed to the incredible popularity of the language. To explore the ways that it can contribute to the success of a young and growing startup, two of the lead engineers at Wanderu discuss their experiences in this episode. Matt Warren, the technical operations lead, explains the ways that he is using Python to build and scale the infrastructure that Wanderu relies on, as well as the ways that he deploys and runs the various Python applications that power the business. Chris Kirkos, the lead software architect, describes how the original Django application has grown into a suite of microservices, where they have opted to use a different language and why, and how Python is still being used for critical business needs. This is a great conversation for understanding the business impact of the Python language and ecosystem. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. 
Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Matt Warren and Chris Kirkos about the ways that they are using Python at Wanderu Interview Introductions How did you get introduced to Python? Can you start by describing what Wanderu does? How is the platform architected? What are the broad categories of problems that you are addressing with Python? What are the areas where you chose to use a different language or service? What ratio of new projects and features are implemented using Python? How much of that decision process is influenced by the fact that you already have so much pre-existing Python code? For the projects where you don’t choose Python, what are the reasons for going elsewhere? What are some of the limitations of Python that you have encountered while working at Wanderu? What are some of the places that you were surprised to find Python in use at Wanderu? What have you enjoyed most about working with Python? What are some of the sharp edges that you would like to see smoothed over in future versions of the language? What is the most challenging bug that you have dealt with at Wanderu that was attributable in some sense to the fact that the code was written in Python? If you were to start over today on any of the pieces of the Wanderu platform, are there any that you would write in a different language? Which libraries have been the most useful for your work at Wanderu? Which ones have caused you the most pain? 
Keep In Touch Matt @matthewwwarren on Twitter LinkedIn Chris LinkedIn Picks Tobias DataGrip Matt Chacarero Chris PDB IPDB PUDB VSCode Links Wanderu Northeastern University C++ Perl Microservices PostgreSQL Data Engineering Podcast Episode MongoDB Django Node.js Go-lang AWS ETL (Extract, Transform, and Load) Data Warehouse Graph Database Twisted Podcast Episode Gevent Scrapy Virtualenv Ruby Rbenv Boto3 PyMongo Ansible Pip TLS Cryptography Podcast Episode Setuptools Openstack Requests PyCountry SOAP (Simple Object Access Protocol) XML Jinja OpenSSL pytest Bandit Podcast Episode Gang of Four The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/15/2018 • 34 minutes, 22 seconds

Building A Game In Python At PyWeek with Daniel Pope

Summary Many people learn to program because of their interest in building their own video games. Once the necessary skills have been acquired, it is often the case that the original idea of creating a game is forgotten in favor of solving the problems we confront at work. Game jams are a great way to get inspired and motivated to finally write a game from scratch. This week Daniel Pope discusses the origin and format for PyWeek, his experience as a participant, and the landscape of options for building a game in Python. He also explains how you can register and compete in the next competition. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Daniel Pope about PyWeek, a one week challenge to build a game in Python Interview Introductions How did you get introduced to Python? Can you start by describing what PyWeek is and how the competition got started? What is your current role in relation to PyWeek and how did you get involved? What are the strengths of the Python language and ecosystem for developing a game? 
What are some of the common difficulties encountered by participants in the challenge? What are some of the most commonly used libraries and tools for creating and packaging the games? What are some shortcomings in the available tools or libraries for Python when it comes to game development? What are some examples of libraries or tools that were created and released as a result of a team’s efforts during PyWeek? How often do games that get started during PyWeek continue to be developed and improved? Have there ever been games that went on to be commercially viable? What are some of the most interesting or unusual games that you have seen submitted to PyWeek? Can you describe your experience as a competitor in PyWeek? How do you structure your time during the competition week to ensure that you can complete your game? What are the benefits and difficulties of the one week constraint for development? How has PyWeek changed over the years that you have been involved with it? What are your hopes for the competition as it continues into the future? Keep In Touch @lordmauve on Twitter Blog lordmauve on GitHub Picks Tobias The Architecht Show Dan Red Blob Games Designing Virtual Worlds by Richard Bartle Links PyWeek Two Sigma Game Jam Richard Jones PyGame Pyglet SDL PyGame Zero Cocos 2D Doctor Corovich’s Flying Atomic Squid Mortimer The Lepidopterist Ludum Dare The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/9/2018 • 30 minutes, 6 seconds

Managing Application Secrets with Brian Kelly

Summary Any application that communicates with other systems or services will at some point require a credential or sensitive piece of information to operate properly. The question then becomes how best to securely store, transmit, and use that information. The world of software secrets management is vast and complicated, so in this episode Brian Kelly, engineering manager at Conjur, aims to help you make sense of it. He explains the main factors for protecting sensitive information in your software development and deployment, ways that information might be leaked, and how to get the whole team on the same page. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Brian Kelly about how to store, deploy, and use sensitive information in your applications Interview Introductions How did you get introduced to Python? To begin with, how do you define a secret in the context of an application? What are the broad categories for solutions to secrets management? 
What are the different aspects of secrets management in the lifecycle of developing, deploying, and maintaining an application? How does the scale of a project or organization impact the strategies that are reasonable for secrets management? What are some of the most challenging aspects of secrets management at the different stages of usage? What are some of the common reasons that secrets management strategies fail? What are some of the vulnerabilities or attack vectors that development teams should be thinking about when working with credentials? What are your thoughts on versioning of secrets? Beyond storing and deploying sensitive information, what are some of the secondary concerns around secrets management that development teams should be thinking about? How does the use of multiple environments (e.g. dev, QA, production, etc.) affect the strategies used for secrets management? What are some of the most useful resources that you have found for anyone looking to learn more about this subject? Keep In Touch @brikelly on Twitter Blog brikelly on GitHub Picks Tobias The Inheritance Cycle Brian Donegal Ireland Links Conjur CyberArk Datawire Transpiler IDL CSRF (Cross-Site Request Forgery) Hashicorp Vault Continuous Integration Continuous Delivery TLS (Transport Layer Security) RBAC (Role Based Access Control) Terraform SQL Injection Secretless MFA Duo Security Kubernetes Summon OWASP Top 10 Configuration Management Puppet Chef Ansible SaltStack Immutable Infrastructure Conjur Blog Krebs On Security The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/2/2018 • 39 minutes, 3 seconds

Django, Channels, And The Asynchronous Web with Andrew Godwin

Summary Once upon a time the web was a simple place with one main protocol and a predictable sequence of request/response interactions with backend applications. This is the era when Django began, but in the intervening years there has been an explosion of complexity with new asynchronous protocols and single page JavaScript applications. To help bridge the gap and bring the most popular Python web framework into the modern age, Andrew Godwin created Channels. In this episode he explains how the first version of the asynchronous layer for Django applications was created, how it has changed in the jump to version 2, and where it will go in the future. Along the way he also discusses the challenges of async development, his work on designing ASGI as the spiritual successor to WSGI, and how you can start using all of this in your own projects today. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. 
Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Andrew Godwin about Django Channels 2.x and the ASGI specification for modern, asynchronous web protocols Interview Introductions How did you get introduced to Python? Can you start with an overview of the problem that Channels is aiming to solve? Asynchronous frameworks have existed in Python for a long time. What are the tradeoffs in those frameworks that would lead someone to prefer the combination of Django and Channels? For someone who is familiar with traditional Django or working on an existing application, what are the steps involved in integrating Channels? Channels is a project that you have been working on for a significant amount of time and which you recently re-architected. What were the shortcomings in the 1.x release that necessitated such a major rewrite? How is the current system architected? What have you found to be the most challenging or confusing aspects of managing asynchronous web protocols both as an author of Channels/ASGI and someone building on top of them? While reading through the documentation there were mentions of the synchronous nature of the Django ORM. What are your thoughts on asynchronous database access and how important that is for future versions of Django and Channels? As part of your implementation of Channels 2.x you introduced a new protocol for asynchronous web applications in Python in the form of ASGI. How does this differ from the WSGI standard and what was your process for developing this specification? What are your hopes for what the Python community will do with ASGI? What are your plans for the future of Channels? What are some of the most interesting or unexpected uses of Channels and/or ASGI? 
Keep In Touch @andrewgodwin on Twitter Website andrewgodwin on GitHub Picks Tobias Nobody Listens To Paula Poundstone Andrew Literary Appreciation Of The Olson Timezones Database Links Channels ASGI Django South Django Migrations PHP Turbogears WSGI Websockets Eventlet HTTP WebRTC IPFS Twisted Tornado Podcast Episode Daphne Redis Uvicorn Heisenbugs Deadlock CherryPy Flask WSGI 2 Podcast Episode Starlette Django REST Framework Tom Christie PEP Process Episode The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
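
For listeners who have not seen ASGI before, the sketch below gives a rough idea of what an application written against the specification can look like. It is an illustration only, not code from the episode or from Channels: it uses the single-callable form found in later revisions of the spec (Channels 2.x at the time of this episode used an earlier two-step form), and the names `app` and `call_app` are made up for the example.

```python
import asyncio


async def app(scope, receive, send):
    """Hypothetical minimal ASGI application: plain-text reply to any HTTP request."""
    assert scope["type"] == "http"
    # An HTTP response is sent as two messages: the start (status, headers), then the body.
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"Hello, ASGI"})


async def call_app():
    """Drive the app by hand, without a server, and collect the messages it sends."""
    sent = []

    async def receive():
        # A real server would deliver the request body here; this stub sends an empty one.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": "/"}, receive, send)
    return sent


messages = asyncio.run(call_app())
```

In practice an ASGI server such as Daphne or Uvicorn (both in the links above) would supply `scope`, `receive`, and `send`; the hand-written driver is there only so the sketch runs on its own.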

9/24/2018 • 41 minutes, 46 seconds

The Business Of Technical Authoring With William Vincent

Summary There are many aspects of learning how to program and at least as many ways to go about it. This is multiplicative with the different problem domains and subject areas where software development is applied. In this episode William Vincent discusses his experiences learning web development mid-career and then writing a series of books to make the learning curve for Django newcomers shallower. This includes his thoughts on the business aspects of technical writing and teaching, the challenges of keeping content up to date with the current state of software, and the ever-present lack of sufficient information for new programmers. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing William Vincent about his experience learning to code mid-career and then writing a series of books to bring you along on his journey from beginner to advanced Django developer Interview Introductions How did you get introduced to Python? 
How has your experience as someone who began working as a developer mid-career influenced your approach to software? How do you compare Python options for web development (Django/Flask) to others such as Ruby on Rails or Node/Express in the JavaScript world? What was your motivation for writing a beginner guide to Django? What was the most difficult aspect of determining the appropriate level of depth for the content? At what point did you decide to publish the tutorial you were compiling as a book? In the posts that you wrote about your experience authoring the books you give a detailed description of the economics of being an author. Can you discuss your thoughts on that? Focusing on a library or framework, such as Django, increases the maintenance burden of a book, versus one that is written about fundamental principles of computing. What are your thoughts on the tradeoffs involved in selecting a topic for a technical book? Challenges of creating useful intermediate content (lots of beginner tutorials and deep dives, not much in the middle) After your initial foray into technical authoring you decided to follow it with two more books. What other topics are you covering with those? Once you are finished with the third do you plan to continue writing, or will you shift your focus to something else? Translating content to reach a larger audience What advice would you give to someone who is considering writing a book of their own? What alternative avenues do you think would be more valuable for themselves and their audience? 
Alternative avenues for providing useful training to developers Keep In Touch Email wsvincent on GitHub Website Picks Tobias Practical AI William awesome-django The Digital Doctor by Robert Wachter Links Quizlet Django Learn Python The Hard Way Invent Your Own Computer Games with Python Ruby on Rails Node.js Express LearnBoost David Heinemeier Hansson Meteor.js Class-Based Views Rails Tutorial Leanpub Gumroad Stack Overflow Egghead.io Frontend Masters Gatsby.js Jekyll Pachyderm Data Engineering Podcast Pachyderm Interview The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/17/2018 • 49 minutes, 38 seconds

Keep Your Code Clean Using pre-commit with Anthony Sottile

Summary Maintaining the health and well-being of your software is a never-ending responsibility. Automating away as much of it as possible makes that challenge more achievable. In this episode Anthony Sottile describes his work on the pre-commit framework to simplify the process of writing and distributing functions to make sure that you only commit code that meets your definition of clean. He explains how it supports tools and repositories written in multiple languages, enforces team standards, and how you can start using it today to ship better software. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Anthony Sottile about pre-commit, a framework for managing and maintaining hooks for multiple languages Interview Introductions How did you get introduced to Python? Can you start by describing what a pre-commit hook is and some of the ways that they are useful for developers? What was your motivation for creating a framework to manage your pre-commit hooks? How does it differ from other projects built to manage these hooks? 
What are the steps for getting someone started with pre-commit in a new project? Which other event hooks would be most useful to implement for maintaining the health of a repository? What types of operations are most useful for ensuring the health of a project? What types of routines should be avoided as a pre-commit step? Installing the hooks into a user’s local environment is a manual step, so how do you ensure that all of your developers are using the configured hooks? What factors have you found that lead to developers skipping or disabling hooks? How is pre-commit implemented and how has that design evolved from when you first started? What have been the most difficult aspects of supporting multiple languages and package managers? What would you do differently if you started over today? Would you still use Python? For someone who wants to write a plugin for pre-commit, what are the steps involved? What are some of the strangest or most unusual uses of pre-commit hooks that you have seen? What are your plans for the future of pre-commit? Keep In Touch asottile on GitHub @codewithanthony on Twitter anthonywritescode on twitch anthonywritescode on YouTube Picks Tobias Tag Anthony Yes Theory Links pre-commit List of hooks Lyft Careers Git Git hooks https://githooks.com/?utmsource=rss&utmmedium=rss Flake8 Make Tox Type Annotations xargs Bash shlex The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/10/2018 • 24 minutes, 52 seconds

Infection Monkey Vulnerability Scanner with Daniel Goldberg

Summary How secure are your servers? The best way to be sure that your systems aren’t being compromised is to do it yourself. In this episode Daniel Goldberg explains how you can use his project Infection Monkey to run a scan of your infrastructure to find and fix the vulnerabilities that can be taken advantage of. He also discusses his reasons for building it in Python, how it compares to other security scanners, and how you can get involved to keep making it better. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Daniel Goldberg about Infection Monkey, an open source system breach simulation tool for evaluating the security of your network Interview Introductions How did you get introduced to Python? What is Infection Monkey and what was the reason for building it? What was the reasoning for building it in Python? If you were to start over today what would you do differently? Penetration testing is typically an endeavor that requires a significant amount of knowledge and experience of security practices. 
What have been some of the most difficult aspects of building an automated vulnerability testing system? How does a deployed instance keep up to date with recent exploits and attack vectors? How does Infection Monkey compare to other tools such as Nessus and Nexpose? What are some examples of the types of vulnerabilities that can be discovered by Infection Monkey? What kinds of information can Infection Monkey discover during a scan? How does that information get reported to the user? How much security experience is necessary to understand and address the findings in a given report generated from a scan? What techniques do you use to ensure that the simulated compromises can be safely reverted? What are some aspects of network security and system vulnerabilities that Infection Monkey is unable to detect and/or analyze? For someone who is interested in using Infection Monkey what are the steps involved in getting it set up? What is the workflow for running a scan? Is Infection Monkey intended to be run continuously, or only with the interaction of an operator? What are your plans for the future of Infection Monkey? Keep In Touch danielguardicore on GitHub Guardicore Blog Picks Tobias Darkest Hour Daniel How Complex Systems Fail Links Infection Monkey Guardicore Stack Overflow Metasploit AsyncIO React Nessus Nexpose Shellshock Wannacry Simian Army Chaos Engineering Capuchin Monkey Google Summer of Code The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/3/2018 • 34 minutes, 24 seconds

Fast Stream Processing In Python Using Faust with Ask Solem

Summary The need to process unbounded and continually streaming sources of data has become increasingly common. One of the popular platforms for implementing this is Kafka along with its streams API. Unfortunately, this requires all of your processing or microservice logic to be implemented in Java, so what’s a poor Python developer to do? If that developer is Ask Solem of Celery fame then the answer is, help to re-implement the streams API in Python. In this episode Ask describes how Faust got started, how it works under the covers, and how you can start using it today to process your fast moving data in easy to understand Python code. He also discusses ways in which Faust might be able to replace your Celery workers, and all of the pieces that you can replace with your own plugins. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Ask Solem about Faust, a library for building high performance, high throughput streaming systems in Python Interview Introductions How did you get introduced to Python? 
What is Faust and what was your motivation for building it? What were the initial project requirements that led you to use Kafka as the primary infrastructure component for Faust? Can you describe the architecture for Faust and how it has changed from when you first started writing it? What mechanism does Faust use for managing consensus and failover among instances that are working on the same stream partition? What are some of the lessons that you learned while building Celery that were most useful to you when designing Faust? What have you found to be the most common areas of confusion for people who are just starting to build an application on top of Faust? What have been the most interesting/unexpected/difficult aspects of building and maintaining Faust? What have you found to be the most challenging aspects of building streaming applications? What was the reason for releasing Faust as an open source project rather than keeping it internal to Robinhood? What would be involved in adding support for alternate queue or stream implementations? What do you have planned for the future of Faust? Keep In Touch @asksol on Twitter ask on GitHub Picks Tobias Super Troopers 2 Ask Microsound by Curtis Roads Links Faust RobinHood Kafka Streams RabbitMQ AsyncIO Celery Kafka Confluent Write-Ahead Log RocksDB Redis Pulsar KSQL Exactly Once Semantics The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/27/2018 • 28 minutes, 45 seconds

Don't Just Stand There, Get Programming! with Ana Bell

Summary Writing a book is hard work, especially when you are trying to teach such a broad concept as programming. In this episode Ana Bell discusses her recent work in writing Get Programming: Learn To Code With Python, including her views on how to separate the principles from the implementation, making the book evergreen in its appeal, and how her experience as a lecturer at MIT has helped her maintain the perspectives of beginners. She also shares her views on the values of learning about programming, even when you have no intention of doing it as a career and ways to take the next steps if that is your goal. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. As you know, Python has become one of the most popular programming languages in the world, due to the size, scope, and friendliness of the language and community. But, it can be tough learning it when you’re just starting out. Luckily, there’s an easy way to get involved. Written by MIT lecturer Ana Bell and published by Manning Publications, Get Programming: Learn to code with Python is the perfect way to get started working with Python. Ana’s experience as a teacher of Python really shines through, as you get hands-on with the language without being drowned in confusing jargon or theory. Filled with practical examples and step-by-step lessons to take on, Get Programming is perfect for people who just want to get stuck in with Python. Get your copy of the book with a special 40% discount for Podcast.__init__ listeners at podcastinit.com/get-programming using code: Bell40! 
Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Ana Bell about her book, Get Programming: Learn to code with Python, and her approach to teaching how to code Interview Introductions How did you get introduced to Python? Can you start by describing your motivation for writing a book about learning to program? Who is the target audience for this book? What level of competence do you want the reader to have when they have completed it? What were the most challenging aspects of writing a book for beginning programmers? What did you do to recapture the “beginner mind” while writing? There are a large variety of books on learning to program and at least as many approaches. Can you describe the techniques that you use in your book to help readers grasp the concepts that you cover? One of the problems of writing a book about technology is that there is no stationary target to aim for due to the constant advancement of the industry. How do you reconcile that reality with the need for a book to remain relevant for an extended period of time? How do you decide what to include and what to leave out when writing about learning how to program? What advice do you have for people who have read your book and want to continue on to a career in development? 
Keep In Touch MIT Bio @anabellphd on Twitter Picks Tobias Netdata Ana Star Realms Between Two Cities Links Get Programming by Ana Bell edX MIT Machine Learning Github The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/20/2018 • 35 minutes, 7 seconds

The Masonite Web Framework With Joe Mancuso

Summary Masonite is an ambitious new web framework that draws inspiration from many other successful projects in other languages. In this episode Joe Mancuso, the primary author and maintainer, explains his goal of unseating Django from its position of prominence in the Python community. He also discusses his motivation for building it, how it is architected, and how you can start using it for your own projects. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Joe Mancuso about Masonite, the modern and developer centric Python web framework. Interview Introductions How did you get introduced to Python? What is Masonite and what was the motivation for creating it? How does it fit in the current landscape of Python web frameworks? Why might someone choose to use Masonite over other Python frameworks? If someone isn’t already decided on using Python, what are some reasons that they might choose Masonite over frameworks in other languages? 
Can you describe the framework architecture and how it has evolved over the lifetime of the project? What are some examples of projects that have been built with Masonite and what aspects of the framework are they leveraging? For someone who is starting a new project with Masonite what are some of the concepts that they should be familiar with? What is their workflow for starting their project? How does that workflow change when working with an existing application? What are some of the plans that you have for the future of Masonite? Keep In Touch Joe Blog @masoniteproject on Twitter josephmancuso on GitHub Masonite MasoniteFramework on GitHub Docs Slack Picks Tobias Yeti Mugs Joe Gitbook.io Dev.to Links Masonite on GitHub Codecademy PHP Django Laravel Dependency Injection Inversion of Control WSGI Gunicorn Waitress Nexmo Masonite Slack Mathias Johansson Trello @masoniteproject Masonite Repo Masonite Documentation The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/13/2018 • 43 minutes, 20 seconds

Helping Teachers Bring Python Into The Classroom With Nicholas Tollervey

Summary There are a number of resources available for teaching beginners to code in Python and many other languages, and numerous endeavors to introduce programming to educational environments. Sometimes those efforts yield success and others can simply lead to frustration on the part of the teacher and the student. In this episode Nicholas Tollervey discusses his work as a teacher and a programmer, his work on the micro:bit project and the PyCon UK education summit, as well as his thoughts on the place that Python holds in educational programs for teaching the next generation. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Nicholas Tollervey about his efforts to improve the accessibility of Python for educators Interview Introductions How did you get introduced to Python? How has your experience as a teacher influenced your work as a software engineer? What are some of the ways that practicing software engineers can be most effective in supporting the efforts of teachers and students to become computationally literate? 
What are your views on the reasons that computational literacy is important for students? What are some of the most difficult barriers that need to be overcome for students to engage with Python? How important is it, in your opinion, to expose students to text-based programming, as opposed to the block-based environment of tools such as Scratch? At what age range do you think we should be trying to engage students with programming? When the teacher’s day was introduced as part of the education summit for PyCon UK what was the initial reception from the educators who attended? How has the format for the teacher’s portion of the conference changed in the subsequent years? What have been some of the most useful or beneficial aspects for the teachers and how much engagement occurs between the conferences? What was your involvement in the initiative that brought the BBC micro:bit to UK classrooms? What kinds of feedback have you gotten from students who have had an opportunity to use them? What are some of the most interesting or unexpected uses of the micro:bit that you have seen? Keep In Touch @ntoll on Twitter ntoll on GitHub Website Picks Tobias The Dark Materials Trilogy Audiobooks by Philip Pullman Nicholas Moon Dust by Andrew Smith Totally Wired by Andrew Smith Links ntoll.org Tuba Royal College of Music Fry IT MicroPython Podcast Interview With Damien George MicroPython Book Mu Scratch Jupyter John Pinner London Python Code Dojo Alan Turing Tim Berners-Lee Charles Babbage REPL (Read-Eval-Print Loop) Daniel Pope PyGame Raspberry Pi Foundation PyGame Zero Network Zero GPIO Zero Computing At School BBC PSF TouchDevelop TypeScript Damien George ARM Code Kingdoms micro:bit Barclays PyCon US Education Summit Raspberry Pi Foundation Code Club Qumisha Goss Keynote Adafruit CircuitPython NeoPixel PyBoard The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/6/2018 • 59 minutes, 19 seconds

Continuous Delivery For Complex Systems Using Zuul with Monty Taylor

Summary Continuous integration systems are important for ensuring that you don’t release broken software. Some projects can benefit from simple, standardized platforms, but as you grow or factor in additional projects the complexity of checking your deployments grows. Zuul is a deployment automation and gating system that was built to power the complexities of OpenStack so it will grow and scale with you. In this episode Monty Taylor explains how he helped start Zuul, how it is designed for scale, and how you can start using it for your continuous delivery systems. He also discusses how Zuul has evolved and the directions it will take in the future. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Monty Taylor about Zuul, a platform that drives continuous integration, delivery, and deployment systems with a focus on project gating and interrelated projects. Interview Introductions How did you get introduced to Python? Can you start by explaining what Zuul is and how the project got started? 
How do you view Zuul in the broader landscape of CI/CD systems (e.g. GoCD, Jenkins, Travis, etc.)? What is the workflow for someone who is defining a pipeline in Zuul? How are the pipelines tested and promoted? One of the problems that are often encountered in CI/CD systems is the difficulty of testing changes locally. What kind of support is available in Zuul for that? Can you describe the project architecture? What aspects of the architecture enable it to scale to large projects and teams? How difficult would it be to swap the Ansible integration for another orchestration tool? What would be involved in adding support for additional version control systems? What are your plans for the future of the project? Keep In Touch emonty on GitHub Website @emonty on Twitter Picks Tobias Hitchhiker’s Guide To The Galaxy Monty Bojack Horseman Links Red Hat Zuul OpenStack Jim Blair Perl SNPP Rackspace NASA Drizzle Sun Microsystems MySQL Continuous Integration Continuous Delivery Launchpad Bzr Jenkins Jess Frazelle Graphite StatsD graphite.openstack.org grafana.openstack.org subunit Ansible Helm Software Factory Gerrit Git Perforce Subversion Zookeeper Gearman The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/30/2018 • 1 hour, 7 minutes, 1 second

Michael Foord On Testing, Mock, TDD, And The Python Community

Summary Michael Foord has been working on building and testing software in Python for over a decade. One of his most notable and widely used contributions to the community is the Mock library, which has been incorporated into the standard library. In this episode he explains how he got involved in the community, why testing has been such a strong focus throughout his career, the uses and hazards of mocked objects, and how he is transitioning to freelancing full time. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Michael Foord mockingly, about his career in Python Interview Introductions How did you get introduced to Python? One of the main threads in your career appears to be software testing. What aspects of testing do you find so interesting and how did you first get exposed to that aspect of building software? How has the language and ecosystem support for testing evolved over the course of your career? What are some of the areas that you find it to still be lacking? 
Mock is one of your projects that has been widely adopted and ultimately incorporated into the standard library. What was your reason for starting it in the first place? Mocking can be a controversial topic. What are your current thoughts on how and when to use mocks, stubs, and fixtures? How do you view the state of the art for testing in Python as it compares to other languages that you have worked in? You were fairly early in the move to supporting Python 2 and 3 in a single project with Mock. How has that overall experience changed in the intervening years since Python 2.4 and 3.2? What are some of the notable evolutions in Python and the software industry that you have experienced over your career? You recently transitioned to acting as a software trainer and consultant full time. Where are you focusing your energy currently and what are your grand plans for the future? Keep In Touch Email Website Twitter Picks Tobias -Ology Books Michael Imaginary Authors Falling Into The Sea City On Fire Links IronPython London IronPython in Action Mock UnitTest Play By Email Smalltalk Regular Expression Dijkstra’s Algorithm urllib2 Resolver Systems TDD (Test-Driven Development) PyCon Trent Nelson Fractals Unicode Joel Spolsky (Unicode) OOP (Object-Oriented Programming) End-to-end Testing Unit Testing Canonical Selenium Ansible Ansible Tower AWX (Open Source Tower Codebase) Continuous Integration Continuous Delivery Agile Software Development GitHub GitLab Jenkins Nightwatch.js py.test Martin Fowler Monkey Patching Decorator Context Manager autospec Golang 2to3 Six Instagram Keynote Trans-code Django Girls PyLadies Agile Abstractions David Beazley The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/23/2018 • 55 minutes, 11 seconds

The Past, Present, and Future of Twisted with Moshe Zadka

Summary Twisted is one of the earliest frameworks for developing asynchronous applications in Python and it has yet to fulfill its original purpose. It can be used to build network servers that integrate a multitude of protocols, increase the performance of your I/O bound applications, serve as the full web stack for your WSGI projects, and anything else that needs a battle tested and performant foundation. In this episode long time maintainer Moshe Zadka discusses the history of Twisted, how it has evolved over the years, the transition to Python 3, some of its myriad use cases, and where it is headed in the future. Try it out today and then send some thanks to all of the people who have dedicated their time to building it. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Join the community in the new Zulip chat workspace at podcastinit.com/chat Your host as usual is Tobias Macey and today I’m interviewing Moshe Zadka about Twisted, the original multi-function tool for asynchronous operations and network protocols in Python Interview Introductions How did you get introduced to Python? For anyone who isn’t familiar with Twisted can you share a brief overview of what it is? What was the original motivation for creating it? How did you get involved with the project and what is your current role in the team? How can people learn to use Twisted? What are some of the common difficulties that new users encounter? What did you learn working on Twisted? Who uses Twisted? When is Twisted the wrong choice? What are some examples of systems that aren’t using Twisted but should be? What are some of the ways that Twisted has evolved and changed over the years? What are some of the ways people can support Twisted? What are some of the plans for the future of Twisted? Keep In Touch Moshe Zadka Twisted Mailing List IRC Picks Tobias Leatherman Wave+ Moshe Unsong Book Links Twisted Glyph Lefkowitz IRC async/await Pyvideo PyCon 2017 Tutorial asyncio GTK SNMP Gunicorn uWSGI WSGI Nginx Supervisor asynchat asyncore Ncolony The Ultimate Quality Development System Moshe’s article on UQDS Unicode prefix 2to3 Six Unit Tests Automat TLA+ PyCon CA Presentation Sans IO Cory Benfield’s talk Tubes Hyper H2 H11 Apple Calendar Server Github Duo Security using Cyclone Matrix — Used by French government AIOHTTP The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/16/2018 • 34 minutes, 42 seconds

Mike Driscoll And His Career In Python

Summary Mike Driscoll has been writing blogs and books for the Python community for years, including his popular series on the Python Module Of The Week. In his daily work he uses Python to test graphical interfaces written in C++ and Qt for embedded platforms. In this episode he explains his work, how he got involved in writing as a regular exercise, and an overview of his recent books. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. 
Your host as usual is Tobias Macey and today I’m interviewing Mike Driscoll about using Python to test QT UIs for embedded platforms, his experience running a popular Python blog, and being a self-published author Technically, I am testing a C++ Qt app that is deployed to an embedded system Interview Introductions How did you get introduced to Python? Can you start by describing the way in which you are using Python for your work? What benefits does Python provide for writing and running tests for projects written in other languages? What are the drawbacks or limitations? What are some of the tools or techniques that you have found most useful for your work? How much of that was hard-earned knowledge vs finding it in reference material or prior art? What are some of the most interesting and/or difficult aspects of testing graphical interfaces? What are some of the most surprising or unexpected aspects of the problem space that you have discovered through your work? What are some of the other ways in which you have worked with the Python language and community? What are you most interested in working toward in the future? Keep In Touch Blog @driscollis on Twitter driscollis on GitHub Books Picks Tobias Draw.io Mike Qt for Python Jupyter Notebook Links Mouse vs. Python C++ Qt Ag Leader Squish CFFI Ctypes Tcl Javascript Ruby Froglogic Selenium Pillow OpenCV WxPython PSF PyCon Brett Cannon Carol Willing ReportLab PDFRW Brett Cannon PyCon 2018 Keynote The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/8/2018 • 23 minutes, 31 seconds

The Pulp Artifact Repository with Bihan Zhang and Austin Macdonald

Summary Hosting your own artifact repositories can have a huge impact on the reliability of your production systems. It reduces your reliance on the availability of external services during deployments and ensures that you have access to a consistent set of dependencies with known versions. Many repositories only support one type of package, thereby requiring multiple systems to be maintained, but Pulp is a platform that handles multiple content types and is easily extendable to manage everything you need for running your applications. In this episode maintainers Bihan Zhang and Austin Macdonald explain how the Pulp project works, the exciting new changes coming in version 3, and how you can get it set up to use for your deployments today. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Austin Macdonald and Bihan Zhang about Pulp, a platform for hosting and managing software package repositories Interview Introductions How did you get introduced to Python? What is Pulp and how did the project get started? What are the use cases/benefits for hosting your own artifact repository? What is the high level architecture of the platform? Pulp 3 appears to be a fairly substantial change in architecture and design. What will be involved in migrating an existing installation to the new version when it is released? What is involved in adding support for a new type of artifact/package? How does Pulp compare to other artifact repositories? What are the major pieces of work that are required before releasing Pulp 3? What have been some of the most interesting/unexpected/challenging aspects of building and maintaining Pulp? What are your plans for the future of Pulp? Keep In Touch Austin asmacdo on GitHub @asmacdo on Twitter Bihan LinkedIn Pulp Project Email GitHub Website #pulp on freenode Picks Tobias Soonish Austin Shostakovitch String Quartet #8 Bihan AOPA: Air Safety Institute YouTube Channel Links Pulp RedHat French Horn XKCD RPM Debian PyPI Center For Open Science SciPy Ansible Django Project Django Storages Artifactory Warehouse OCI (Open Container Initiative) Crane Docker Twine Maven Read-through Cache The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/2/2018 • 30 minutes, 43 seconds

Bringing Africa Online At Ascoderu with Clemens Wolff

Summary The future is here, it’s just not evenly distributed. One of the places where this is especially true is in sub-Saharan Africa which is a vast region with little to no reliable internet connectivity. To help communities in this region leapfrog infrastructure challenges and gain access to opportunities for education and market information the Ascoderu non-profit has built Lokole. In this episode one of the lead engineers on the project, Clemens Wolff, explains what it is, how it is built, and how the venerable e-mail protocols can continue to provide access cheaply and reliably. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. 
Your host as usual is Tobias Macey and today I’m interviewing Clemens Wolff about how Ascoderu is using Python to help communities in sub-Saharan Africa gain access to the digital age Interview Introductions How did you get introduced to Python? What is the mission of Ascoderu and how did the organization get started? How did you get involved? The primary project that you build and maintain is Lokole. What is it and how does it help you in achieving the goals of the organization? What are the limitations of using e-mail as the only interface to the broader internet? What are some of the most interesting or unexpected uses of email in isolation that you have seen? From the user perspective, can you describe the overall experience of interacting with Lokole? What is happening in the background? Did you consider using a binary message format such as Avro, protocol buffers, or msgpack in place of JSON? What kind of fault tolerance techniques are built into the overall information flow? What are the most challenging or unexpected aspects of building Lokole and interacting with the user communities? What projects do you have planned for the future? Keep In Touch Email GitHub LinkedIn Picks Tobias Hubspot CRM Clemens Ali Farka Toure Links Ascoderu Lokole NLTK Haskell DRC Lokole client Lokole server Ali Express Raspberry Pi Orange Pi Uganda Tanzania JSON Avro msgpack gzip Gmail Lingala wvdial USB Modeswitch Gnome SIM database Benin Agricultural Engineer Outernet Internet In A Box mkvvconf Azure for non-profits Kubernetes Connexion Zalando Open API Sendgrid Azure Service Bus Ambassador Container Pillow United Nations Sustainable Development Goals The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/25/2018 • 42 minutes, 33 seconds

Understanding Machine Learning Through Visualizations with Benjamin Bengfort and Rebecca Bilbro

Summary Machine learning models are often inscrutable and it can be difficult to know whether you are making progress. To improve feedback and speed up iteration cycles Benjamin Bengfort and Rebecca Bilbro built Yellowbrick to easily generate visualizations of model performance. In this episode they explain how to use Yellowbrick in the process of building a machine learning project, how it aids in understanding how different parameters impact the outcome, and the improved understanding among teammates that it creates. They also explain how it integrates with the scikit-learn API, the difficulty of producing effective visualizations, and future plans for improvement and new features. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Rebecca Bilbro and Benjamin Bengfort about Yellowbrick, a scikit extension to use visualizations for assisting with model selection in your data science projects. Interview Introductions How did you get introduced to Python? Can you describe the use case for Yellowbrick and how the project got started? What is involved in visualizing scikit-learn models? What kinds of information do the visualizations convey? How do they aid in understanding what is happening in the models? How much direction does yellowbrick provide in terms of knowing which visualizations will be helpful in various circumstances? What does the workflow look like for someone using Yellowbrick while iterating on a data science project? What are some of the common points of confusion that your students encounter when learning data science and how has yellowbrick assisted in achieving understanding? How is Yellowbrick implemented and how has the design changed over the lifetime of the project? What would be required to integrate with other visualization libraries and what benefits (if any) might that provide? What about other ML frameworks? What are some of the most challenging or unexpected aspects of building and maintaining Yellowbrick? What are the limitations or edge cases for yellowbrick? What do you have planned for the future of yellowbrick? Beyond visualization, what are some of the other areas that you would like to see innovation in how data science is taught and/or conducted to make it more accessible? 
Keep In Touch Rebecca Bilbro Github Twitter Benjamin Bengfort Github Twitter Picks Tobias Poutine Rebecca The color yellow Benjamin ALL CAPS Links Hadoop Natural Language Processing Machine Learning scikit-learn Model Selection Triple the machine learning workflow scikit-yb Yellowbrick Visualizer API Visual Tests Jupyter Matplotlib Tensorflow Hyperparameter Parallel Coordinates Radviz Rank2D Prediction Error Plot Residuals Plot Validation Curves Alpha Selection Frequency Distribution Plot Bayes Theorem Seaborn Stop Words N-gram Craig – Bias and Fairness of Algorithms Shiny Bokeh Keras StatsModels Tensorboard PyTorch NumPy Voxel Wizard of Oz The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/17/2018 • 55 minutes, 13 seconds

Modern Database Clients On The Command Line with Amjith Ramanujam

Summary The command line is a powerful and resilient interface for getting work done, but the user experience is often lacking. This can be especially pronounced in database clients because of the amount of information being transferred and examined. To help improve the utility of these interfaces Amjith Ramanujam built PGCLI, quickly followed by MyCLI with the Prompt Toolkit library. In this episode he describes his motivation for building these projects, how their popularity led him to create even more clients, and how these tools can help you in your command line adventures. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. 
Your host as usual is Tobias Macey and today I’m interviewing Amjith Ramanujam about DBCLI, an umbrella project for command line database clients with autocompletion and syntax highlighting. Interview Introductions How did you get introduced to Python? What is the DBCLI project? Which of the clients was the first to be created and what was your motivation for starting it? At what point did you decide to create the DBCLI umbrella for the different projects and what benefits does it provide? How much functionality is shared between the different clients? What additional functionality do the different clients provide over those that are distributed with their respective engines? How do you optimize for cases where large volumes of data are returned from a query? What are some of the most interesting or surprising things that you have learned about database engines in the process of building client interfaces for them? What are the most challenging aspects of building the different database clients? What are some unexpected hardships that you encountered through this open source project? What are some unexpected pleasant surprises that you encountered through this project? Why did you hand over the project leadership for pgcli and mycli to other devs? Was it a hard decision? Why do you optimize on being nice over being right? How did Microsoft get involved with dbcli? mssql-cli What’s been the reception for the projects? What are your plans for upcoming releases of the various clients? Which database engines are you planning to target next? 
Keep In Touch amjith on GitHub @amjithr on Twitter Blog Picks Tobias Downsizing Amjith Dosas Sarasate Links DBCLI Haskell Learn You a Haskell List Comprehension PGCLI MyCLI MSSQL-CLI Prompt Toolkit Podcast.__init__ Interview BPython DjangoCon EU CLI Helpers Python Generators PGSpecial Longboarding Irina Truong Thomas Roten PostgreSQL MySQL Microsoft SQL Server SQLite Oracle DB Cassandra DB The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/11/2018 • 30 minutes, 39 seconds

Pandas Extension Arrays with Tom Augspurger

Summary Pandas is a Swiss Army knife for data processing in Python but it has long been difficult to customize. In the latest release there is now an extension interface for adding custom data types with namespaced APIs. This allows for building and combining domain specific use cases and alternative storage mechanisms. In this episode Tom Augspurger describes how the new ExtensionArray works, how it came to be, and how you can start building your own extensions today. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. 
Your host as usual is Tobias Macey and today I’m interviewing Tom Augspurger about the extension interface for Pandas data frames and the use cases that it enables Interview Introductions How did you get introduced to Python? Most people are familiar with Pandas, but can you describe at a high level the new extension interface? What is the story behind the implementation of this functionality? Prior to this interface what was the option for anyone who wanted to extend Pandas? What are some of the new data types that are available as external packages? What are some of the unique use cases that they enable? How is the new interface implemented within Pandas? What were the most challenging or difficult aspects of building this new functionality? What are some of the more interesting possibilities that you are aware of for new extension types? What are the limitations of the interface for libraries that add new array functionality? What is the next major change or improvement that you would like to add in Pandas? Keep In Touch tomaugspurger on GitHub @TomAugspurger on Twitter Picks Tobias Black Panther Tom Dask-ML Links Pandas ExtensionArray Original IP Address proposal Mid-implementation blog post Dataframe Numpy Cyberpandas Geopandas GIS Arrow CuPy JQ Wes McKinney Array ufunc Matplotlib Altair Seaborn Bokeh Podcast.__init__ Interview Dask Data Engineering Interview The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/4/2018 • 33 minutes, 26 seconds

Making A Difference Through Software With Eric Schles

Summary Software development is a skill that can create value and reduce drudgery in a wide variety of contexts. Sometimes the causes that are most in need of software expertise are also the least able to pay for it. By volunteering our time and abilities to causes that we believe in, we can help make a tangible difference in the world. In this episode Eric Schles describes his experiences working on social justice initiatives and the types of work that proved to be the most helpful to the groups that he was working with. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. 
Your host as usual is Tobias Macey and today I’m interviewing Eric Schles about how to get involved with social justice causes as an engineer Interview Introductions How did you get introduced to Python? What are some ways that engineers can create real-world impact with their skills? What are some of the common roadblocks to contribution that people should be aware of? What are some of the types of projects or tools that can provide the most value compared to the amount of effort? Do you have any advice for picking an organization or cause that will benefit the most from technical expertise? Many of the tools and systems that get built for public or non-profit organizations require some amount of data for them to be useful. Do you have any advice on methods for identifying, locating, or collecting the necessary information for feeding into these projects? What are some of the design factors that should be considered when building tools for these organizations to allow them to be maintainable and sustainable in the absence of an experienced engineer? Keep In Touch EricSchles on GitHub @EricSchles on Twitter Picks Tobias Shoes without laces Eric Catboost Pomegranate Links USDS 18F OCW Python Course SAS R Machine Learning Version Control GitHub Agile OCR (Optical Character Recognition) Eric Schles Interview On Podcast.__init__ Excel ETL (Extract Transform Load) Automate The Boring Stuff Web Scraping Thomas Levine Elasticsearch Trello The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/27/2018 • 43 minutes, 13 seconds

Asking Questions From Data Using Active Learning with Tivadar Danka

Summary One of the challenges of machine learning is obtaining large enough volumes of well labelled data. An approach to mitigate the effort required for labelling data sets is active learning, in which outliers are identified and labelled by domain experts. In this episode Tivadar Danka describes how he built modAL to bring active learning to bioinformatics. He is using it for doing human in the loop training of models to detect cell phenotypes with massive unlabelled datasets. He explains how the library works, how he designed it to be modular for a broad set of use cases, and how you can use it for training models of your own. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. 
Your host as usual is Tobias Macey and today I’m interviewing Tivadar Danka about modAL, a modular active learning framework for Python3 Interview Introductions How did you get introduced to Python? What is active learning? How does it differ from other approaches to machine learning? What is modAL and what was your motivation for starting the project? For someone who is using modAL, what does a typical workflow look like to train their models? How do you avoid oversampling and causing the human in the loop to become overwhelmed with labeling requirements? What are the most challenging aspects of building and using modAL? What do you have planned for the future of modAL? Keep In Touch @TivadarDanka on Twitter cosmic-cortex on GitHub https://www.tivadardanka.com for anything else Picks Tobias Peter Rabbit Movie Tivadar Uri Alon: An Introduction to Systems Biology – Design Principles of Biological Circuits, book and online lectures Links modAL homepage modAL on GitHub modAL paper Bioinformatics Hungary Phenotypes Active Learning Supervised Learning Unsupervised Learning Snorkel Active Feature-Value Acquisition scikit-learn Entropy PyTorch Tensorflow Keras Jupyter Notebooks Bayesian Optimization Hyperparameters The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/21/2018 • 27 minutes, 51 seconds

Great Expectations For Your Data Pipelines with Abe Gong and James Campbell

Summary Testing is a critical activity in all software projects, but one that is often neglected in data pipelines. The complexities introduced by the inherent statefulness of the problem domain and the interdependencies between systems contribute to make pipeline testing difficult to manage. To make this endeavor more manageable Abe Gong and James Campbell have created Great Expectations. In this episode they discuss how you can use the project to create tests in the exploratory phase of building a pipeline and leverage those to monitor your systems in production. They also discuss how Great Expectations works, the difficulties associated with pipeline testing and managing associated technical debt, and their future plans for the project. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Finding a bug in production is never a fun experience, especially when your users find it first. Airbrake error monitoring ensures that you will always be the first to know so you can deploy a fix before anyone is impacted. With open source agents for Python 2 and 3 it’s easy to get started, and the automatic aggregations, contextual information, and deployment tracking ensure that you don’t waste time pinpointing what went wrong. Go to podcastinit.com/airbrake today to sign up and get your first 30 days free, and 50% off 3 months of the Startup plan. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. 
You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected] Your host as usual is Tobias Macey and today I’m interviewing James Campbell and Abe Gong about Great Expectations, a tool for testing the data in your analytics pipelines Interview Introduction How did you first get introduced to Python? What is Great Expectations and what was your motivation for starting it? What are some of the complexities associated with testing analytics pipelines? What types of tests can be executed to ensure data integrity and accuracy? What are some examples of the potential impact of pipeline debt? What is Great Expectations and how does it simplify the process of building and executing pipeline tests? What are some examples of the types of tests that can be built with Great Expectations? For someone getting started with Great Expectations what does the workflow look like? What was your reason for using Python for building it? How does the choice of language benefit or hinder the contexts in which Great Expectations can be used? What are some cases where Great Expectations would not be usable or useful? What have been some of the most challenging aspects of building and using Great Expectations? What are your hopes for Great Expectations going forward? 
Contact Info James jpcampb2 on GitHub Abe abegong on GitHub Website @AbeGong on Twitter Picks Tobias Fitbit Versa James Unplug and spend some time away from the computer Abe Superconductive Health Slack: Getting Past Burnout, Busy Work, and the Myth of Total Efficiency Links Superconductive Health Laboratory for Analytical Sciences Great Expectations Medium Post DAG (Directed Acyclic Graph) SLA (Service Level Agreement) Integration Testing Data Engineering Histogram Pandas SQLAlchemy Tutorial Videos Jupyter Notebooks Dataframe Airflow Luigi Spark Oozie Azkaban JSON XML The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

5/13/2018 • 50 minutes, 42 seconds

Exploring Color Theory In Python With Thomas Mansencal

Summary We take it for granted every day, but creating and displaying vivid colors in our digital media is a complicated and often difficult process. There are different ways to represent color, the ways in which they are displayed can cause them to look different, and translating between systems can cause losses of information. To simplify the process of working with color information in code, Thomas Mansencal wrote the Colour project. In this episode we discuss his motivation for creating and sharing his library, how it works to translate and manage color representations, and how it can be used in your projects. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Finding a bug in production is never a fun experience, especially when your users find it first. Airbrake error monitoring ensures that you will always be the first to know so you can deploy a fix before anyone is impacted. With open source agents for Python 2 and 3 it’s easy to get started, and the automatic aggregations, contextual information, and deployment tracking ensure that you don’t waste time pinpointing what went wrong. Go to podcastinit.com/airbrake today to sign up and get your first 30 days free, and 50% off 3 months of the Startup plan. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. 
Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected] Your host as usual is Tobias Macey and today I’m interviewing Thomas Mansencal about Colour, a Python library for working with algorithms and transformations to explore color theory Interview Introductions How did you get introduced to Python? What is color theory? How does Colour assist in the process of working with some of the practical applications of colour science? What was your motivation for creating Colour? What are some example use cases for Colour? One of the aspects of color in digital environments that is often confusing is the number of different ways that it can be represented. What are the relative benefits of things like RGB, HSV, CMYK, etc.? How is the Colour library architected and how has that evolved over time? Are there new developments in the area of color theory that need to be periodically incorporated into the library? What have you found to be some of the most often misunderstood aspects of color? What have been some of the most difficult or frustrating aspects of building, maintaining, and promoting Colour? What are some of the most interesting or unexpected uses of Colour that you have seen? What are your plans for the future of Colour? 
Keep In Touch Website Picks Tobias Beasts of Olympus by Lucy Coates Thomas Coursera Mathematics Machine Learning Course Links Colour Color Theory Color Science Weta Digital Wingnut AR Visual Effects Artist Allegro AutoDesk Maya PyQT Isaac Newton Color Wheel Colorimetry CIE VY Canis Majoris (Red Hypergiant) Rigel (Blue-White Supergiant) Kelvin Temperature Scale Black Body Radiation HDRI (High Dynamic Range Imaging) Adobe DNG SDK ICC OpenColorIO MERCK Group Color Space RGB HSV CMYK CIE XYZ CIE RGB CIE Lab CIE Luv sRGB Gamma Correction Additive Color Space Subtractive Color Space Color Blindness Gustavo Machado Rods and Cones Dichromacy Color Appearance Model Uniform Color Spaces JOSS ArXiv CIECAM02 Color Appearance Model Cinematic Color Jeremy Selan (Author of OpenColorIO) Academy Color Encoding System Color Appearance Models by Mark D. Fairchild The Reproduction of Colour by Dr. R.W.G. Hunt Color Science: Concepts and Methods, Quantitative Data and Formulae, 2nd Edition by Günther Wyszecki and W. S. Stiles Katherine Crowson Google Colab The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/6/2018 • 57 minutes, 40 seconds

Destroy All Software With Gary Bernhardt

Summary Many developers enter the market from backgrounds that don’t involve a computer science degree, which can lead to blind spots in how to approach certain types of problems. Gary Bernhardt produces screencasts and articles that aim to teach these principles with code to make them approachable and easy to understand. In this episode Gary discusses his views on the state of software education, both in academia and bootcamps, the theoretical concepts that he finds most useful in his work, and some thoughts on how to build better software. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Finding a bug in production is never a fun experience, especially when your users find it first. Airbrake error monitoring ensures that you will always be the first to know so you can deploy a fix before anyone is impacted. With open source agents for Python 2 and 3 it’s easy to get started, and the automatic aggregations, contextual information, and deployment tracking ensure that you don’t waste time pinpointing what went wrong. Go to podcastinit.com/airbrake today to sign up and get your first 30 days free, and 50% off 3 months of the Startup plan. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. 
Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected] Your host as usual is Tobias Macey and today I’m interviewing Gary Bernhardt about teaching and learning Python in the current software landscape Interview Introductions How did you get introduced to Python? As someone who makes a living from teaching aspects of programming what is your view on the state of software education? What are some of the ways that we as an industry can improve the experience of new developers? What are we doing right? You spend a lot of time exploring some of the fundamental aspects of programming and computation. What are some of the lessons that you have learned which transcend software languages? Utility of graphs in understanding software Mechanical sympathy What are the benefits of ‘from scratch’ tutorials that explore the steps involved in building simple versions of complex topics such as compilers or web frameworks? 
Keep In Touch @garybernhardt on Twitter garybernhardt on GitHub Picks Tobias Terry Pratchett Gary Destroy All Software Deconstruct Conference Out Of The Tarpit Algorithms + Data Structures = Programs by Niklaus Wirth Dan Grossman Programming Languages Course (click the “Videos” links under “course materials”) U of W John Carmack post reconsidering some earlier positions Links Wat Birth and Death of Javascript Destroy All Software Deconstruct Data Structures Computer Science Compilers Programming Bootcamps Graph Theory Julia Evans @b0rk on Twitter Allen Downey Jupyter Notebook Halting Problem Idris Visual Basic 3.0 Set Theory ML Family of Languages SML, a simple dialect of ML SML/NJ, a compiler for SML OCaml, a more modern dialect of ML F#, an even newer dialect of ML Clojure, a modern Lisp-like language Lua Grammar (scroll to the very bottom for the full grammar) John Carmack Twitter Thread Explaining Episode Context The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/30/2018 • 52 minutes, 6 seconds

Scaling Deep Learning Using Polyaxon with Mourad Mourafiq

Summary With libraries such as TensorFlow, PyTorch, scikit-learn, and MXNet being released it is easier than ever to start a deep learning project. Unfortunately, it is still difficult to manage scaling and reproduction of training for these projects. Mourad Mourafiq built Polyaxon on top of Kubernetes to address this shortcoming. In this episode he shares his reasons for starting the project, how it works, and how you can start using it today. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Finding a bug in production is never a fun experience, especially when your users find it first. Airbrake error monitoring ensures that you will always be the first to know so you can deploy a fix before anyone is impacted. With open source agents for Python 2 and 3 it’s easy to get started, and the automatic aggregations, contextual information, and deployment tracking ensure that you don’t waste time pinpointing what went wrong. Go to podcastinit.com/airbrake today to sign up and get your first 30 days free, and 50% off 3 months of the Startup plan. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. 
Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected] Your host as usual is Tobias Macey and today I’m interviewing Mourad Mourafiq about Polyaxon, a platform for building, training and monitoring large scale deep learning applications. Interview Introductions How did you get introduced to Python? Can you give a quick overview of what Polyaxon is and your motivation for creating it? What is a typical workflow for building and testing a deep learning application? How is Polyaxon implemented? How has the internal architecture evolved since you first started working on it? What is unique to deep learning workloads that makes it necessary to have a dedicated tool for deploying them? What does Polyaxon add on top of the existing functionality in Kubernetes? It can be difficult to build a Docker container that holds all of the necessary components for a complex application. What are some tips or best practices for creating containers to be used with Polyaxon? What are the relative tradeoffs of the various deep learning frameworks that you support? For someone who is getting started with Polyaxon what does the workflow look like? What is involved in migrating existing projects to run on Polyaxon? What have been the most challenging aspects of building Polyaxon? What are your plans for the future of Polyaxon? 
Keep In Touch Website @mmourafiq on Twitter mouradmourafiq on GitHub Picks Tobias Kubernetes Kubernetes Up And Running Kelsey Hightower Food Fight Show With Kelsey Hightower Mourad Schopenhauer Links Polyaxon Investment Banking Luxembourg Matlab Text Mining Tensorflow Docker Kubernetes Deep Learning Free Deep Learning Textbook Machine Learning Engineer Hyperparameters Continuous Integration PyTorch MXNet Scikit-Learn Helm Mesos Spark SparkML The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/23/2018 • 35 minutes, 59 seconds

Electricity Map: Real Time Visibility of Power Generation with Olivier Corradi

Summary One of the biggest issues facing us is the availability of sustainable energy sources. As individuals and energy consumers it is often difficult to understand how we can make informed choices about energy use to reduce our impact on the environment. Electricity Map is a project that provides up-to-date and historical information about the balance of how the energy we are using is being produced. In this episode Olivier Corradi discusses his motivation for creating Electricity Map, how it is built, and his goals for the project and his other work at Tomorrow Co. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Finding a bug in production is never a fun experience, especially when your users find it first. Airbrake error monitoring ensures that you will always be the first to know so you can deploy a fix before anyone is impacted. With open source agents for Python 2 and 3 it’s easy to get started, and the automatic aggregations, contextual information, and deployment tracking ensure that you don’t waste time pinpointing what went wrong. Go to podcastinit.com/airbrake today to sign up and get your first 30 days free, and 50% off 3 months of the Startup plan. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. 
Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected] Your host as usual is Tobias Macey and today I’m interviewing Olivier Corradi about Electricity Map and using Python to analyze data of global power generation Interview Introductions How did you get introduced to Python? What was your motivation for creating Electricity Map? How can an average person use or benefit from the information that is available in the map? What sources are you using to gather the information about how electricity is generated and distributed in various geographic regions? Is there any standard format in which this data is produced? What are the biggest difficulties associated with collecting and consuming this data? How much confidence do you have in the accuracy of the data sources? Is there any penalty for misrepresenting the fuel consumption or waste generation for a given plant? Can you describe the architecture of the system and how it has evolved? What are some of the most interesting uses of the data in your database and API that you are aware of? How do you measure the impact or effectiveness of the information that you provide through the different interfaces to the data that you have aggregated? How have you built a community around the project? How has the community helped in building and growing Electricity Map? What are some of the most unexpected things that you have learned in the process of building Electricity Map? What are your plans for the future of Electricity Map? 
Keep In Touch @corradio on Twitter LinkedIn corradio on GitHub Picks Tobias Rollerblading Olivier Deep Mind AlphaGo Documentary Consumer’s Guide To Climate Change Impact Links Electricity Map Machine Learning YouTube Climate Change Fossil Fuels Carbon Intensity Greenhouse Gas Equivalencies Calculations Open Data Electricity Map Project Source Lignite Marginal Carbon Intensity Electricity Map Forecast API IPCC (Intergovernmental Panel on Climate Change) Redis D3.js Spark TensorFlow Spatiotemporal Data MongoDB Matrix Inversion PyGRIB Tomorrow Co. The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/15/2018 • 47 minutes, 53 seconds

Building And Growing Nylas with Christine Spang

Summary Email is one of the oldest methods of communication that is still in use on the internet today. Despite many attempts at building a replacement and predictions of its demise we are sending more email now than ever. Recognizing that the venerable inbox is still an important repository of information, Christine Spang co-founded Nylas to integrate your mail with the rest of your tools, rather than just replacing it. In this episode Christine discusses how Nylas is built, how it is being used, and how she has helped to grow a successful business with a strong focus on diversity and inclusion. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. Finding a bug in production is never a fun experience, especially when your users find it first. Airbrake error monitoring ensures that you will always be the first to know so you can deploy a fix before anyone is impacted. With open source agents for Python 2 and 3 it’s easy to get started, and the automatic aggregations, contextual information, and deployment tracking ensure that you don’t waste time pinpointing what went wrong. Go to podcastinit.com/airbrake today to sign up and get your first 30 days free, and 50% off 3 months of the Startup plan. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. 
Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected] To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Christine Spang about Nylas and the modern era of email Interview Introductions How did you get introduced to Python? Can you explain what Nylas is and some of its history? What do you think it is about email as a protocol and a means of communication that has made it so resilient in the face of technological evolution? What lessons did you learn from your initial offering of the N1 mail client and how has that informed your current focus? Nylas as a company appears to have a strong focus on diversity and inclusion. Can you speak to how you encourage that type of environment and how it manifests at work? What are some of the ways that Python is used at Nylas? Can you share some examples of services that you have written in other languages and why you felt that Python was not the right choice? What are some of the use cases that Nylas enables? What are some of the most interesting or innovative uses of the Nylas platform that you have seen? How do you manage privacy and security in your sync service given the sensitivity of the data that you are handling? What are some of the biggest challenges that you are currently facing at Nylas? What do you think will be the future of email? 
Keep In Touch LinkedIn @spang on Twitter Website GitHub Picks Tobias Trello Christine Founders For Change Links Nylas MIT KSplice Debian Lisp REST Email N1 Mail Client Mailspring Nylas Employee Handbook Hackbright Academy Code2040 TextIO Key Values IMAP OAuth MySQL Gevent React CRM (Customer Relationship Management) SendGrid MailGun MailChimp GDPR (General Data Protection Regulation) SOC2 OWASP Top 10 Principle of Least Privilege The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/8/2018 • 43 minutes, 29 seconds

Synthetic Data Generation Using Mimesis with Nikita Sobolev

Summary Most applications require data to operate on in order to function, but sometimes that data is hard to come by, so why not just make it up? Mimesis is a library for randomly generating data of different types, such as names, addresses, and credit card numbers, so that you can use it for testing, anonymizing real data, or for placeholders. This week Nikita Sobolev discusses how the project got started, the challenges that it has posed, and how you can use it in your applications. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected] Your host as usual is Tobias Macey and today I’m interviewing Nikita Sobolev about Mimesis, a library for quickly generating synthetic data Interview Introductions How did you get introduced to Python? What is Mimesis and how does it compare to other projects such as faker and factory_boy? What was the motivation for creating it? One of the features that is advertised is the speed of Mimesis. 
What techniques are used to ensure that the data is generated quickly? What are the built in mechanisms for generating data? What options do users have for customizing the types of data that can get generated? What are some of the most complicated providers to write and maintain? What are some of the use cases outside of unit or integration tests where Mimesis could be beneficial? How would you use Mimesis to anonymize data from a production environment to be used for testing? What are the most challenging aspects of maintaining the Mimesis project? What are some of the plans that you have for the future of Mimesis? Keep In Touch sobolevn on GitHub @sobolevn on Twitter Email Picks Tobias Coco Nikita I Am A Mediocre Developer Links Mimesis Django Faker Factory Boy Internationalization (I18N) Unicode Enum Pipfile GeoJSON Mimesis Cloud Sanic GraphQL Impostor Syndrome Imposter Syndrome Disclaimer: Add this to all of your projects! Jacob Kaplan-Moss PyCon Keynote The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/1/2018 • 32 minutes, 37 seconds

Luminoth: AI Powered Computer Vision for Python with Joaquin Alori

Summary Making computers identify and understand what they are looking at in digital images is an ongoing challenge. Recent years have seen notable increases in the accuracy and speed of object detection due to deep learning and new applications of neural networks. In order to make it easier for developers to take advantage of these techniques Tryo Labs built Luminoth. In this interview Joaquín Alori explains how Luminoth works, how it can be used in your projects, and how it compares to API oriented services for computer vision. Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. For complete visibility into your application stack, deployment tracking, and powerful alerting, DataDog has got you covered. With their monitoring, metrics, and log collection agent, including extensive integrations and distributed tracing, you’ll have everything you need to find and fix bugs in no time. Go to podcastinit.com/datadog today to start your free 14 day trial and get a sweet new T-Shirt. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected] Your host as usual is Tobias Macey and today I’m interviewing Joaquín Alori about Luminoth, a deep learning toolkit for computer vision in Python Interview Introductions How did you get introduced to Python? What is Luminoth and what was your motivation for creating it? Computer vision has been a focus of AI research for decades. How do current approaches with deep learning compare to previous generations of tooling? What are some of the most difficult problems in visual processing that still need to be solved? What are the limitations of Luminoth for building a computer vision application and how do they differ from the capabilities of something built with a prior generation of tooling such as OpenCV? For someone who is interested in using Luminoth in their project what is the current workflow? How do the capabilities of Luminoth compare with some of the various service based options such as Rekognition for Amazon or the Cloud Vision API from Google? What are some of the motivations for using Luminoth in place of these services? What are some of the highest priority features that you are focusing on implementing in Luminoth? When is Luminoth the wrong choice for a computer vision application and what are some of the strongest alternatives at the moment? 
Keep In Touch @JoaquinAlori on Twitter LinkedIn Picks Tobias PyCon US Joaquin 3Blue1Brown Links Luminoth Luminoth Release Announcement Tryo Labs Uruguay Industrial Engineering Manufacturing Engineering Elon Musk Artificial Intelligence Deep Learning Neural Networks Object Detection Image Segmentation Convolutional Neural Network Recurrent Neural Network Back Propagation Geoff Hinton Capsule Networks Generative Adversarial Networks SVM (Support Vector Machine) Haar Classifiers OpenCV Drones GPU (Graphics Processing Unit) Rekognition Cloud Vision API TensorFlow Object Detection API Sonnet DeepMind Caffe The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/25/2018 • 21 minutes, 27 seconds

Thonny: The IDE For Beginning Programmers with Aivar Annamaa

Summary Learning to program is a rewarding pursuit, but is often challenging. One of the roadblocks on the way to proficiency is getting a development environment installed and configured. In order to simplify that process Aivar Annamaa built Thonny, a Python IDE designed for beginning programmers. In this episode he discusses his initial motivations for starting Thonny and how it helps newcomers to Python learn and understand how to write software. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute. For complete visibility into your application stack, deployment tracking, and powerful alerting, DataDog has got you covered. With their monitoring, metrics, and log collection agent, including extensive integrations and distributed tracing, you’ll have everything you need to find and fix bugs in no time. Go to podcastinit.com/datadog today to start your free 14 day trial and get a sweet new T-Shirt. To get worry-free releases download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons. Visit podcastinit.com to subscribe to the show, sign up for the newsletter, and read the show notes. Your host as usual is Tobias Macey and today I’m interviewing Aivar Annamaa about Thonny, a Python IDE for beginning programmers Interview Introductions How did you get introduced to Python? 
What was your motivation for building an IDE focused on beginning programmers? What are the features of Thonny that make it easier for users to understand what is happening in their programs? What have you found to be the types of issues that users most frequently struggle with and how does Thonny help overcome those gaps in understanding? What kinds of tutorials or supporting material have you found to be the most useful for teaching students the principles that they need to be able to take advantage of the environment that Thonny provides? How is Thonny built and what have been the most challenging aspects of writing an IDE in Python? What are some of the interface design choices that you have made to avoid confusing or overwhelming beginning users? Once a user becomes more proficient in Python is there a point where it no longer makes sense to continue using Thonny for development? I noticed that Thonny has a plugin architecture and there is an extension for interacting with the BBC micro:bit. What are some of the other types of extensions that you would like to see built for Thonny? Keep In Touch Aivar @aivarannamaa on Twitter aivarannamaa on GitHub Google Scholar Page Thonny Website Forum @thonnyide on Twitter Source repository and wiki Picks Tobias Data Engineering Podcast Kubo and the Two Strings Aivar MicroPython Podcast.__init__ Interview How to Talk So Kids Will Listen & Listen So Kids Will Talk Links Thonny University of Tartu Estonia Recursion TKinter Aivar Estonian Textbook Pascal MyPy Podcast.__init__ Interview BBC Micro:bit Version Control GitHub GitLab Elm Compiler Messages The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/18/2018 • 29 minutes, 50 seconds

Keeping The Beets with Adrian Sampson

Summary Maintaining a consistent taxonomy for your music library is a challenging and time consuming endeavor. Eventually you end up with a mess of folders and files with inconsistent names and missing metadata. Beets is built to solve this problem by programmatically managing the tags and directory structure for all of your music files and providing a fast lookup when you are trying to find that perfect song to play. Adrian Sampson began the project because he was trying to clean up his own music collection and in this episode he discusses how the project was built, how streaming media is affecting our relationship to digital music, and how he envisions Beets’ position in the ecosystem in the future. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podcastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters. If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcastinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Adrian Sampson about Beets, the Swiss Army knife for managing your music library. Interview Introductions How did you get introduced to Python? What is Beets and what was your reason for creating it? What was your reason for using Python and if you were to start over today would you make the same choice? If I have a directory with inconsistent naming conventions, poor organization, and some random folders full of mixed MP3 files how can Beets help me and what does the workflow look like? How is Beets architected to allow for interactively processing a large volume of media files and how has the design evolved over the time that you have been working on it? What are your thoughts on the current trend toward streaming music services replacing local media files? What have been some of the most challenging aspects of building Beets? What are some of the most interesting uses for Beets that you have seen? What are some of the other projects for managing a music library and how does Beets compare to them? Are there any features that you have planned for the future of Beets, or any new functionality that you would like to see contributed? Keep In Touch sampsyo on GitHub Website @samps on Twitter Picks Tobias Mozart’s Requiem Wikipedia YouTube Gov’t Mule Darkest Hour Adrian Spiralizer Spiralized Beats With Pesto Links Beets SQLite Mutagen ID3 Tags Musicbrainz Bandcamp Free Music Archive Cornell AcoustID Chromaprint Musicbrainz Picard iTunes Spotify Amazon Music DLNA UPnP AURA The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/12/2018 • 39 minutes, 23 seconds

Salabim: Logistics Simulation with Ruud van der Ham

Summary Determining the best way to manage the capacity and flow of goods through a system is a complicated issue and can be exceedingly expensive to get wrong. Rather than experimenting with the physical objects to determine the optimal algorithm for managing the logistics of everything from global shipping lanes to your local bank, it is better to do that analysis in a simulation. Ruud van der Ham has been working in this area for the majority of his professional life at the Dutch port of Rotterdam. Using his acquired domain knowledge he wrote Salabim as a library to assist others in writing detailed simulations of their own and make logistical analysis of real world systems accessible to anyone with a Python interpreter. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podcastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters. If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcastinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Ruud van der Ham about Salabim, a Python library for conducting discrete event simulations Interview Introductions How did you get introduced to Python? Can you start by explaining what Discrete Event Simulation is and how Salabim helps with that? Can you explain how you chose the name? What was your motivation for creating Salabim and how does it compare to other tools for discrete event simulation? How does discrete event simulation compare with state machines? How is Salabim implemented and how has the design evolved over the time that you have been working on it? I understand that a majority of Salabim was written on an iPad. Can you speak about why you have chosen that as your development environment and your experience working in that manner? What are some examples of the types of models that you can model with Salabim? What would an implementation of one of these models look like for someone using Salabim? What options does a user have to verify the accuracy of a simulation created with Salabim? One of the nice aspects of Salabim is the fact that it provides a visual output as a simulation runs. Can you describe the workflow for someone who wants to use Salabim for modeling and visualizing a system? At what point does a system become too complex to encapsulate in a simulation and what techniques can you use to modularize it to make a simulation useful? When is Salabim not the right tool to use and what would you suggest for people who find themselves in that situation? What have been some of the most complicated or difficult aspects of building and maintaining Salabim? 
What are some of the new features or improvements that you have planned for the future of Salabim? Keep In Touch Email Picks Tobias Cuisinart Burr Mill Coffee Grinder Ruud Pythonista (Python for iOS) Python Notes for Professionals Fluent Python by Luciano Ramalho Links Salabim GitHub Dining Philosophers Animation Elevator Animation Rotterdam Discrete Event Simulation Container Terminal Automation Basic Algol Pascal Operations Research Continuous Simulation Simula Coroutines SymPy Another DES in Python: SimPy DES in Julia: SimJulia DES in R: Simmer DES in Delphi/Pascal: Tomas Pillow PyPy Delphi PyGame PyQT TkInter Inspect Module OpenCV Blender The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/4/2018 • 51 minutes, 38 seconds

Laboratory: Safer Refactoring with Joe Alcorn

Summary Every piece of software that has been around long enough ends up with some piece of it that needs to be redesigned and refactored. Often the code that needs to be updated is part of the critical path through the system, increasing the risks associated with any change. One way around this problem is to compare the results of the new code against the existing logic to ensure that you aren’t introducing regressions. This week Joe Alcorn shares his work on Laboratory, how the engineers at GitHub inspired him to create it as an analog to the Scientist gem, and how he is using it for his day job. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podcastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters. If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcastinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. A brief announcement before we start the show: If you work with data or want to learn more about how the projects you have heard about on the show get used in the real world then join me at the Open Data Science Conference in Boston from May 1st through the 4th. It has become one of the largest events for data scientists, data engineers, and data driven businesses to get together and learn how to be more effective. To save 60% off your tickets go to podcastinit.com/odsc-east-2018 and register. Your host as usual is Tobias Macey and today I’m interviewing Joe Alcorn about using Laboratory as a safety net for your refactoring. Interview Introductions How did you get introduced to Python? Can you start by explaining what Laboratory is and what motivated you to start the project? How much of the design and implementation were directly inspired by the Scientist project from GitHub and how much of it did you have to figure out from scratch due to differences in the target languages? What have been some of the most challenging aspects of building and maintaining Laboratory, and have you had any opportunities to use it on itself? For someone who would like to use Laboratory in their project, what does the workflow look like and what potential pitfalls should they watch out for? In the documentation you mention that portions of code that perform I/O and create side effects should be avoided. Have you found any strategies to allow for stubbing out the external interactions while still executing the rest of the logic? How do you keep track of the results for active experiments and what sort of reporting is available? What are some examples of the types of routines that would be good candidates for conducting an experiment? 
What are some of the most complicated or difficult pieces of code that you have refactored with the help of Laboratory? Given the fact that Laboratory is intended to be run in production and provide a certain measure of safety, what methods do you use to ensure that users of the library will not suffer from a drastic increase in overhead or unintended aberrations in the functionality of their software? Are there any new features or improvements that you have planned for future releases of Laboratory? Keep In Touch joealcorn on GitHub Website Picks Tobias Chronicles of Narnia Joe Why We Sleep: Unlocking The Power of Sleep and Dreams by Matthew Walker, PhD Links Marvel App GitHub: Move Fast and Fix Things GitHub Scientist: Measure Twice, Cut Over Once Scientist Laboratory Sure Footed Refactoring Graphite StatsD The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
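The approach described in this episode, running the new code next to the existing logic and comparing the results, can be sketched in plain Python. This is only an illustration of the pattern under stated assumptions: it does not use Laboratory's actual API, and `run_experiment`, `legacy_total`, and `refactored_total` are hypothetical names invented for the example.

```python
from typing import Any, Callable, List, Tuple


def run_experiment(
    control: Callable[..., Any],
    candidate: Callable[..., Any],
    *args: Any,
    mismatches: List[Tuple[tuple, Any, Any]],
    **kwargs: Any,
) -> Any:
    """Run both implementations, record disagreements, return the control result."""
    control_result = control(*args, **kwargs)
    try:
        candidate_result = candidate(*args, **kwargs)
    except Exception as exc:
        # A failing candidate must never break callers, so the exception
        # is stored as the candidate's "result" instead of being raised.
        candidate_result = exc
    if candidate_result != control_result:
        mismatches.append((args, control_result, candidate_result))
    # Callers always get the behaviour of the existing (control) code.
    return control_result


def legacy_total(prices):
    # Hypothetical existing implementation.
    total = 0
    for price in prices:
        total += price
    return total


def refactored_total(prices):
    # Hypothetical rewrite with a deliberate bug: it skips the first item.
    return sum(prices[1:])


mismatches: List[Tuple[tuple, Any, Any]] = []
result = run_experiment(
    legacy_total, refactored_total, [1, 2, 3], mismatches=mismatches
)
```

Because the caller always receives the control result, the buggy rewrite cannot change production behaviour; the disagreement simply lands in `mismatches` for later review. Both functions here are free of side effects, which matches the caution raised in the interview about code that performs I/O.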

2/26/2018 • 21 minutes, 53 seconds

Software Architecture For Developers with Neal Ford

Summary Whether it is intentional or accidental, every piece of software has an existing architecture. In this episode Neal Ford discusses the role of a software architect, methods for improving the design of your projects, pitfalls to avoid, and provides some resources for continuing to learn about how to design and build successful systems. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podcastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters. If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcastinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. A few announcements before we start the show: There is still time to register for the O’Reilly Software Architecture Conference in New York. 
Use the link podcastinit.com/sacon-new-york to register and save 20% If you work with data or want to learn more about how the projects you have heard about on the show get used in the real world then join me at the Open Data Science Conference in Boston from May 1st through the 4th. It has become one of the largest events for data scientists, data engineers, and data driven businesses to get together and learn how to be more effective. To save 60% off your tickets go to podcastinit.com/odsc-east-2018 and register. With many thanks to O’Reilly Media, I have two items to give away. To sign up you just need to subscribe to the mailing list at podcastinit.com and you will have the chance to win either a copy of Neal’s book, Building Evolutionary Architectures, or a Bronze ticket to the O’Reilly Software Architecture Conference in New York. I will be picking the winners on February 21st. Your host as usual is Tobias Macey and today I’m interviewing Neal Ford about principles of software architecture for developers Interview Introductions How did you get introduced to Python? A majority of your work has been focused on software architectures and how that can be used to facilitate delivery of working systems. Can you start by giving a high level description of what software architecture is and how it fits into the overall development process? One of the difficulties that arise in long-lived projects is that technical debt accrues to the point that forward progress stagnates due to fear that any changes will cause the system to stop functioning. What are some methods that developers can use to either guard against that eventuality, or address it when it happens? What are some of the broad categories of architectural patterns that developers should be aware of? Are there aspects of the language that a system or application is being implemented in which influence the style of architecture that is commonly used? 
What are some architectural anti-patterns that you have found to be the most commonly occurring? Software is useless if there is no way to deliver it to the end user. What are some of the challenges that are most often overlooked by engineering teams and how do you solve for them? Beyond the purely technological aspects, what other elements of software production and delivery are necessary for a successful architecture? What resources can you recommend for someone who is interested in learning more about software architecture, whether as an individual contributor or in a full time architect role? Keep In Touch Website @neal4d on Twitter Picks Tobias Jumanji: Welcome to the Jungle Lost City of Z Neal DeveloperToArchitect.com EvolutionaryArchitecture.com Broken Earth Series Links Thoughtworks Neal’s Blog Lisp Thoughtworks Technology Radar Martin Fowler: Who Needs an Architect? O’Reilly Software Architecture Conference Soft Skills Microservices Building Evolutionary Architectures Github: Move Fast and Fix Things Continuous Delivery Github Scientist Laboratory (Scientist in Python) Agile Development The Accidental Architect System Quality Attributes Pipes and Filters MapReduce Hadoop Service Oriented Architecture Linux DevOps Configuration Management React Alibaba Open Source Baidu Open Source Pragmatic Programmer Trunk Based Development PlantUML Visio Mermaid Diagrams Graphviz Evernote Software Architecture Fundamentals Enterprise Integration Patterns Architectural Katas The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/18/2018 • 50 minutes, 28 seconds

ZimboPy

Summary Learning to code is one of the most effective ways to be successful in the modern economy. To that end, Marlene Mhangami and Ronald Maravanyika created the ZimboPy organization to teach women and girls in Zimbabwe how to program in Python. In this episode they are joined by Mike Place to discuss how ZimboPy got started, the projects that their students have worked on, and how the community can get involved. Preface – Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. – I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. – When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podcastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters. – If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcastinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. – Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. – To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. 
– Your host as usual is Tobias Macey and today I’m interviewing Marlene Mhangami, Mike Place, and Ronald Maravanyika about ZimboPy, an organization that teaches women and girls in Zimbabwe how to program using Python Interview Introductions How did you get introduced to Python? Can you start by explaining what the mission of ZimboPy is and how it got started? Which languages did you consider using for your lessons and what was your reason for choosing Python? What subject matter do you cover in addition to pure programming concepts? What are some of the types of projects that the students have completed as part of their work with ZimboPy? What have been the most challenging aspects of running ZimboPy? How is ZimboPy supported and what are your plans to ensure future sustainability? Can you share some success stories for the women and girls that you have worked with? For anyone who is interested in replicating your work for other communities what advice do you have? Keep In Touch Mike cachedout on GitHub @cachedout on Twitter cachedout on Keybase Ronald Rmaravanyika on GitHub @Rmaravanyika on Twitter Marlene @marlenezw on Twitter LinkedIn Picks Tobias Click Ronald Odoo formerly OpenERP Links ZimboPy Unilever Django Girls Thomas Hatch SaltStack Zimbabwe Mechatronics Raspberry Pi OpenCV ZimboPy Curriculum ZimboPy Storefront Oxfam Open Collective ZimboPy Mentorship Registration The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/11/2018 • 29 minutes, 20 seconds

PyRay: Pure Python 3D Rendering with Rohit Pandey

Summary Using a rendering library can be a difficult task due to dependency issues and complicated APIs. Rohit Pandey wrote PyRay to address these issues in a pure Python library. In this episode he explains how he uses it to gain a more thorough understanding of mathematical models, how it compares to other options, and how you can use it for creating your own videos and GIFs. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podcastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters. If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcastinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. A few announcements before we start the show: There’s still time to get your tickets for PyCon Colombia, happening February 9th and 10th. Go to pycon.co to learn more and register. 
There is also still time to register for the O’Reilly Software Architecture Conference in New York. Use the link podcastinit.com/sacon-new-york to register and save 20%. If you work with data or want to learn more about how the projects you have heard about on the show get used in the real world then join me at the Open Data Science Conference in Boston from May 1st through the 4th. It has become one of the largest events for data scientists, data engineers, and data driven businesses to get together and learn how to be more effective. To save 60% off your tickets go to podcastinit.com/odsc-east-2018 and register. Your host as usual is Tobias Macey and today I’m interviewing Rohit Pandey about PyRay, a 3D rendering library written completely in Python Interview Introductions How did you get introduced to Python? Can you start by explaining what PyRay is and what motivated you to create it? [rohit] PyRay is an open source library written completely in Python that lets you render three and higher dimensional objects and scenes. Development on it has been ongoing and new features have so far come about from videos for my YouTube channel. What does the internal architecture of PyRay look like and how has that design evolved since you first started working on it? What capabilities are unlocked by having a pure Python rendering library which would otherwise be impractical or impossible for Python developers to do with existing options? [rohit] Having a pure Python library makes it accessible with minimal fixed cost to Python users. The tradeoff is you lose on speed, but for many applications that isn’t an issue. I haven’t seen a library coded completely in Python that lets you manipulate 3D and higher dimensional objects. The core use case right now is mathematical artwork. Google geometric GIFs and you’ll see some fascinating, mesmerizing results. But those are created for the most part using tools that are not Python. 
Which is a pity since Python has a very extensive library of Mathematical functions. What have been some of the most challenging aspects of building and maintaining PyRay? [rohit] 3d objects – getting mesh plots. I have to develop routines from scratch for almost everything – shading objects, etc. Animated routines for characters. What are some of the most interesting or unexpected uses of PyRay that you are aware of? [rohit] Physical simulations. Ex: Testing if a solid is a fair die, getting lower bounds for space packing efficiencies of solids. Creating interactive demos where a user can draw to provide input. For someone who wanted to contribute to PyRay are there any particular skills or experience that would be most helpful? Basic linear algebra and python What are some of the features or improvements that you have planned for the future of PyRay? Keep In Touch pyray repo – https://github.com/ryu577/pyray?utmsource=rss&utmmedium=rss – Email – GitHub – LinkedIn Picks Tobias Berserker Series by Fred Saberhagen Rohit Samurai Math Youtube Channel 3 Blue 1 Brown Youtube Channel Isaac Arthur Youtube Channel Links PyRay PyRay Youtube Videos Microsoft Azure Data Science Columbia University R Nielsen 3Blue1Brown – Music and Measure Theory Manim Python Subreddit Maya Blender Panda3D POVRay Pillow NumPy SciPy Support Vector Machine Logistic Regression Geometric GIFs Vapory RGB vs HSL Color Scales FFMPEG Quaternions The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/5/2018 • 42 minutes, 41 seconds

MonkeyType with Carl Meyer and Matt Page

Summary One of the draws of Python is how dynamic and flexible the language can be. Sometimes, that flexibility can be problematic if the format of variables at various parts of your program is unclear or the descriptions are inaccurate. The growing middle ground is to use type annotations as a way of providing some verification of the format of data as it flows through your application and enforcing gradual typing. To make it simpler to get started with type hinting, Carl Meyer and Matt Page, along with other engineers at Instagram, created MonkeyType to analyze your code as it runs and generate the type annotations. In this episode they explain how that process works, how it has helped them reduce bugs in their code, and how you can start using it today. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podcastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters. If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcastinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. A few announcements before we start the show: There’s still time to get your tickets for PyCon Colombia, happening February 9th and 10th. Go to pycon.co to learn more and register. There is also still time to register for the O’Reilly Software Architecture Conference in New York Feb 25-28. Use the link podcastinit.com/sacon-new-york to register and save 20%. If you work with data or want to learn more about how the projects you have heard about on the show get used in the real world then join me at the Open Data Science Conference in Boston from May 1st through the 4th. It has become one of the largest events for data scientists, data engineers, and data driven businesses to get together and learn how to be more effective. To save 60% off your tickets go to podcastinit.com/odsc-east-2018 and register. Your host as usual is Tobias Macey and today I’m interviewing Carl Meyer and Matt Page about MonkeyType, a system to collect type information at runtime for your Python 3 code Interview Introductions How did you get introduced to Python? What is MonkeyType and how did the project get started? How much overhead does the MonkeyType tracing add to the running system, and what techniques have you used to minimize the impact on production systems? Given that the type information is collected from call traces at runtime, and some functions may accept multiple different types for the same arguments (e.g. add), do you have any logic that will allow for combining that information into a higher-order type that gets set as the annotation? How does MonkeyType function internally and how has the implementation evolved over the time that you have been working on it? 
Once the type annotations are present in your code base, what other tooling are you using to take advantage of that information? It seems as though using MonkeyType to trace your running production systems could be a way to inadvertently identify dead sections of code that aren’t being executed. Have you investigated ways to use the collected type information to perform that analysis? What have been some of the most challenging aspects of building, using, and maintaining MonkeyType? What have been some of the most interesting or noteworthy things that you have learned in the process of working on and with MonkeyType? What have you found to be the most useful and most problematic aspects of the typing capabilities provided in recent versions of Python? For someone who wants to start using MonkeyType today, what is involved in getting it set up and using it in a new or existing codebase? What features or improvements do you have planned for future releases of MonkeyType? Keep In Touch Carl Email @carljm on Twitter Matt Email @voidstar on Twitter Picks Tobias LyxPro HAS-30 Headphones Carl Broadchurch Happy Valley Matt Anova Sous Vide Links MonkeyType Instagram Dive Into Python Python 3 Typing Module MyPy Project Page Podcast.init Interview Mike Krieger PyAnnotate Type Annotations Type Stubs PEP 523 frame evaluation api Scuba Haskell Rust PEP 563 Postponed Evaluation of Annotations Gary Bernhardt – Ideology coverage.py The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
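The question in this episode about a function such as `add` receiving several different argument types can be made concrete with a small sketch. It records the types seen at call time and merges them into a suggested annotation. This is an assumption-laden illustration and not MonkeyType's implementation or API: it uses an explicit decorator, and `record_types`, `observed`, and `suggest_annotation` are hypothetical names.

```python
import functools
import inspect
from collections import defaultdict
from typing import Any, Callable, Dict, Set

# Maps "function.argument" (or "function.return") to the type names seen so far.
observed: Dict[str, Set[str]] = defaultdict(set)


def record_types(func: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap func so that every call records its argument and return types."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        # Bind the call to parameter names so each type is stored per argument.
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            observed[f"{func.__name__}.{name}"].add(type(value).__name__)
        result = func(*args, **kwargs)
        observed[f"{func.__name__}.return"].add(type(result).__name__)
        return result

    return wrapper


def suggest_annotation(key: str) -> str:
    """Combine the recorded type names for key into one annotation string."""
    names = sorted(observed.get(key, set()))
    if not names:
        return "Any"  # nothing was recorded for this key
    if len(names) == 1:
        return names[0]
    return f"Union[{', '.join(names)}]"


@record_types
def add(a, b):
    return a + b


add(1, 2)
add("x", "y")
print(suggest_annotation("add.a"))  # Union[int, str]
```

A decorator keeps the example short, but it only sees functions that were wrapped by hand, and the suggestions are only as complete as the calls that happened to run.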

1/28/2018 • 48 minutes, 25 seconds

Learn Leap Fly: Using Python To Promote Global Literacy with Kjell Wooding

Summary Learning how to read is one of the most important steps in empowering someone to build a successful future. In developing nations, access to teachers and classrooms is not universally available so the Global Learning XPRIZE serves to incentivize the creation of technology that provides children with the tools necessary to teach themselves literacy. Kjell Wooding helped create Learn Leap Fly in order to participate in the competition and used Python and Kivy to build a platform for children to develop their reading skills in a fun and engaging environment. In this episode he discusses his experience participating in the XPRIZE competition, how he and his team built what is now Kasuku Stories, and how Python and its ecosystem helped make it possible. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters. If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcatinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Kjell Wooding about Learn Leap Fly, a startup using Python on mobile devices to facilitate global learning Interview Introductions How did you get introduced to Python? Can you start by describing what Learn Leap Fly does and how the company got started? What was your motivation for using Kivy as the primary technology for your mobile applications as opposed to the platform native toolkits or other multi-platform frameworks? What are some of the pedagogical techniques that you have incorporated into the technological aspects of your mobile application and are there any that you were unable to translate to a purely technical implementation? How do you measure the effectiveness of the work that you are doing? How has the framework of the XPRIZE influenced the way in which you have approached the design and development of your work? What have been some of the biggest challenges that you faced in the process of developing and deploying your submission for the XPRIZE? What are some of the features that you have planned for future releases of your platform? Keep In Touch Learn Leap Fly Website @learnleapfly on Twitter Kjell llfkj on GitHub @pdokj on Twitter Picks Tobias Yamaha YHT-4930UBL Home Theater System Kjell Instant Pot Anova Sous Vide Modernist Cooking at Home Links Programming Python (O’Reilly) Learn Leap Fly Tim Ferriss Peter Diamandis Global Learning XPRIZE Kasuku Beta Program XPRIZE Foundation Kivy Kivy Flappy Bird Podcast.init Kivy Interview Deliberate Practice Google Pixel C Bayesian Learning SciPy NumPy Keras The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/21/2018 • 43 minutes, 7 seconds

Healthchecks.io: Open Source Alerting For Your Cron Jobs with Pēteris Caune

Summary Your backups are running every day, right? Are you sure? What about that daily report job? We all have scripts that need to be run on a periodic basis and it is easy to forget about them, assuming that they are working properly. Sometimes they fail and in order to know when that happens you need a tool that will let you know so that you can find and fix the problem. Pēteris Caune wrote Healthchecks to be that tool and made it available both as an open source project and a hosted version. In this episode he discusses his motivation for starting the project, the lessons he has learned while managing the hosting for it, and how you can start using it today. 
Your host as usual is Tobias Macey and today I’m interviewing Pēteris Caune about Healthchecks, a Django app which serves as a watchdog for your cron tasks Interview Introductions How did you get introduced to Python? Can you start by explaining what Healthchecks is and what motivated you to build it? How does Healthchecks compare with other cron monitoring projects such as Cronitor or Dead Man’s Snitch? Your pricing on the hosted service for Healthchecks.io is quite generous so I’m curious how you arrived at that cost structure and whether it has proven to be profitable for you? How is Healthchecks functionality implemented and how has the design evolved since you began working on and using it? What have been some of the most challenging aspects of working on Healthchecks and managing the hosted version? For someone who wants to run their own instance of the service what are the steps and services involved? What are some of the most interesting or unusual uses of Healthchecks that you are aware of? Given that Healthchecks is intended to be used as part of an operations management and alerting system, what are the considerations that users should be aware of when deploying it in a highly available configuration? What improvements or features do you have planned for the future of Healthchecks? 
Keep In Touch cuu508 on GitHub Blog @cuu508 on Twitter Picks Tobias LG 55UJ6300 Pēteris Zwift TrainerRoad Links Healthchecks.io GitHub Riga Latvia Cross Country Cycling Semantic Web Django Flask Cron Cronitor.io Dead Man’s Snitch IPv6 Load Balancing PostgreSQL MySQL Fabric Ansible Dokku Kubernetes Hetzner CloudFlare PGPool II Streaming Replication Citus Data Website Data Engineering Podcast Interview Heroku Fork the Evolution of healthchecks.io Hosting Setup The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/14/2018 • 27 minutes, 24 seconds

Bonobo: Lightweight ETL Toolkit for Python 3 with Romain Dorgueil

Summary A majority of the work that we do as programmers involves data manipulation in some manner. This can range from large scale collection, aggregation, and statistical analysis across distributed systems to something as simple as making a graph in a spreadsheet. In the middle of that range is the general task of ETL (Extract, Transform, and Load) which has its own range of scale. In this episode Romain Dorgueil discusses his experiences building ETL systems and the problems that he routinely encountered that led him to creating Bonobo, a lightweight, easy to use toolkit for data processing in Python 3. He also explains how the system works under the hood, how you can use it for your projects, and what he has planned for the future. 
Your host as usual is Tobias Macey and today I’m interviewing Romain Dorgueil about Bonobo, a data processing toolkit for modern Python Interview Introductions How did you get introduced to Python? What is Bonobo and what was your motivation for creating it? What is the story behind the name? How does Bonobo differ from projects such as Luigi or Airflow? [RD] After I explain why those are totally different things, maybe a good follow up would be to ask about differences from other data streaming solutions, like Apache Beam or Spark. How is Bonobo implemented and how has its architecture evolved since you began working on it? What have been some of the most challenging aspects of building and maintaining Bonobo? What are some extensions that you would like to have but don’t have the time to implement? What are some of the most interesting or creative uses of Bonobo that you are aware of? What do you have planned for the future of Bonobo? Keep In Touch Bonobo Project Bonobo ETL Slack GitHub Romain Website @rdorgueil on Twitter hartym on GitHub Picks Tobias Data Skeptic: Quantum Computing Romain Medikit, or how to manage hundreds of projects at the same time, still being able to sleep at night. Rocker, a better builder for docker images. Links Bonobo RedHat Anaconda Installer ETL Pentaho RDC.ETL DAG (Directed Acyclic Graph) Luigi Airflow NamedTuple Jupyter OAuth Graphviz Dask Data Engineering Podcast Dask Interview Selenium Zapier IFTTT (If This Then That) FPGA The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/7/2018 • 53 minutes, 57 seconds

Orange: Visual Data Mining Toolkit with Janez Demšar and Blaž Zupan

Summary Data mining and visualization are important skills to have in the modern era, regardless of your job responsibilities. In order to make it easier to learn and use these techniques and technologies Blaž Zupan and Janez Demšar, along with many others, have created Orange. In this episode they explain how they built a visual programming interface for creating data analysis and machine learning workflows to simplify the work of gaining insights from the myriad data sources that are available. They discuss the history of the project, how it is built, the challenges that they have faced, and how they plan on growing and improving it in the future. 
Your host as usual is Tobias Macey and today I’m interviewing Blaž Zupan and Janez Demšar about Orange, a toolbox for interactive machine learning and data visualization in Python Interview Introductions How did you get introduced to Python? What is Orange and what was your motivation for building it? Who is the target audience for this project? How is the graphical interface implemented and what kinds of workflows can be implemented with the visual components? What are some of the most notable or interesting widgets that are available in the catalog? What are the limitations of the graphical interface and what options do users have when they reach those limits? What have been some of the most challenging aspects of building and maintaining Orange? What are some of the most common difficulties that you have seen when users are just getting started with data analysis and machine learning, and how does Orange help overcome those gaps in understanding? What are some of the most interesting or innovative uses of Orange that you are aware of? What are some of the projects or technologies that you consider to be your competition? Under what circumstances would you advise against using Orange? What are some widgets that you would like to see in future versions? What do you have planned for future releases of Orange? Keep In Touch Blaž University Bio @bzupan on Twitter BlazZupan on GitHub Google Scholar Janez University Bio @jademsar on Twitter janezd on GitHub Google Scholar Picks Tobias Data Stories: What’s Going On In This Graph? 
Blaž How I Built This Janez Advent of Code Links University of Ljubljana Data Explorer Silicon Graphics Visual Programming PyQT Linear Regression t-SNE K-Means TCL/TK Numpy Scikit-Learn SciPy Textable.io RapidMiner Single Cell Genomics Transfer Learning Orange Video Tutorials The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/31/2017 • 49 minutes, 5 seconds

Dramatiq: Distributed Task Queue For Python 3 with Bogdan Popa

Summary A majority of projects will eventually need some way of managing periodic or long-running tasks outside of the context of the main application. This is where a distributed task queue becomes useful. For many in the Python community the standard option is Celery, though there are other projects to choose from. This week Bogdan Popa explains why he was dissatisfied with the current landscape of task queues and the features that he decided to focus on while building Dramatiq, a new, opinionated distributed task queue for Python 3. He also describes how it is designed, how you can start using it, and what he has planned for the future. 
Your host as usual is Tobias Macey and today I’m interviewing Bogdan Popa about Dramatiq, a distributed task processing library for Python with a focus on simplicity, reliability and performance Interview Introductions How did you get introduced to Python? What is Dramatiq and what was your motivation for creating it? How does Dramatiq compare to other task queues in Python such as Celery or RQ? How is Dramatiq implemented and how has the internal architecture evolved? What have been some of the most difficult aspects of building Dramatiq? What are some of the features that you are most proud of? For someone who is interested in integrating Dramatiq into an application, can you describe the steps involved and the API? Do you provide any form of migration path or compatibility layer for people who are currently using Celery or RQ? Can you describe the licensing structure for the project and your reasoning? How did you determine the price point for commercial licenses? Have you been successful in selling licenses for commercial use? What are some of the features that you have planned for future releases? Keep In Touch Project Website Personal Website Bogdanp on GitHub @Bogdanp on Twitter Picks Tobias The Anybodies by N.E. Bode Bogdan Pipenv Links Dramatiq LeadPages Lisp Celery RQ Billiard Kombu Google App Engine GAE Task Queue RabbitMQ APScheduler Redis Memcached LRU (Least Recently Used) Middleware Gevent Pika SQS (Amazon Simple Queue Service) Google Cloud PubSub Django API* Bundler Cargo The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/24/2017 • 38 minutes, 13 seconds

Jake Vanderplas: Data Science For Academic Research

Summary Jake Vanderplas is an astronomer by training and a prolific contributor to the Python data science ecosystem. His current role is using Python to teach principles of data analysis and data visualization to students and researchers at the University of Washington. In this episode he discusses how he got started with Python, the challenges of teaching best practices for software engineering and reproducible analysis, and how easy to use tools for data visualization can help democratize access to, and understanding of, data. 
Your host as usual is Tobias Macey and today I’m interviewing Jake Vanderplas about data science best practices, and applying them to academic sciences Interview Introductions How did you get introduced to Python? How has your astronomy background informed and influenced your current work? In your work at the University of Washington, what are some of the most common difficulties that students face when learning data science? How does that list differ for professional scientists who are learning how to apply data science to their work? Where is the tooling still lacking in terms of enabling consistent and repeatable workflows? One of the projects that you are spending time on now is Altair, which is a library for generating visualizations from Pandas dataframes. How does that work factor into your teaching? What are some of the most novel applications of data science that you have been involved with? What are some of the trends in data analysis that you are most excited for? Keep In Touch Website @jakevdp jakevdp on GitHub Picks Tobias The Redwall Cookbook Jake Kevin M. Kruse White Flight by Kevin Kruse Links UW eScience Institute NumPy SciPy SciPy Conference PyCon Pandas Sloan Digital Sky Survey Spectroscopy Software Carpentry Data Carpentry Git Mercurial Matplotlib Altair Conda Xonsh Jupyter Jupyter Lab Vega Vega-lite Interactive Data Lab D3 Mike Bostock Brian Granger Bokeh Grammar of Graphics ggplot2 Holoviews Wikimedia AstroPy Podcast.__init__ Interview About AstroPy LIGO Wes McKinney Feather The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/17/2017 • 49 minutes, 27 seconds

Kenneth Reitz

Summary Kenneth Reitz has contributed many things to the Python community, including projects such as Requests, Pipenv, and Maya. He also started the community written Hitchhiker’s Guide to Python, and serves on the board of the Python Software Foundation. This week he talks about his career in the Python community and digs into some of his current work. Your host as usual is Tobias Macey and today I’m interviewing Kenneth Reitz about his career in Python Interview Introductions How did you get introduced to Python? 
An overarching theme of your open source projects is the idea of making them “For Humans”. Can you elaborate on how that came to be a focus for you and how that informs the way that you design and write your code? What are the projects that you are most proud of and which do you think have had the biggest impact on the Python community? A: Requests, Hitchhiker’s Guide to Python, and Pipenv (yet to come to full fruition). Which projects have you authored which are relatively unknown but you think people would benefit from using more often? A: Maya: Datetime for Humans, and Records: SQL for Humans. Outside of the code that you write, what are some of your personal missions for the software industry in general and the Python community in particular? A: I consider myself a “spiritual alchemist”, which means “transformation of dark into light”. I seek to do “the great work”, however it manifests, outside of the programming world, as well as within it. What do you think is the biggest gap in the tool chest for Python developers? A: I seek to fill all the voids that I see, and I’ve done my best to do that to the best of my ability. I think we have a lot of work to do in the area of single-file executable builds (a-la Go). What are your ambitions for future projects? A: At the moment, I have no current plans for future projects, but I’m sure something will come along at some point. If you weren’t working with Python what would you be doing instead? A: I’d have a lot less money and I’d be a lot less fulfilled. 
Keep In Touch Website @kennethreitz on Twitter kennethreitz on GitHub Picks Tobias Algorithms to Live By Kenneth The Linux Programming Interface Links Heroku Salesforce PSF Board of Directors Caldera Linux C Pascal Basic Groovy Java PHP Ruby The Design of Everyday Things Requests Hitchhiker’s Guide Pipenv Pipfile The Update Framework Falsehoods Programmers Believe About Time PEP20 Py2EXE Cxfreeze Briefcase The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/10/2017 • 42 minutes, 49 seconds

Asphalt: A Framework For Asynchronous Network Applications with Alex Grönholm

Summary As we rely more on small, distributed processes for building our applications, being able to take advantage of asynchronous I/O is increasingly important for performance. This week Alex Grönholm explains how the Asphalt Framework was created to make it easier to build these network oriented software stacks and the technical challenges that he faced in the process. 
Your host as usual is Tobias Macey and today I’m interviewing Alex Grönholm about the Asphalt Framework, a Python microframework for network oriented applications Interview Introductions How did you get introduced to Python? What is Asphalt and what was your reason for building it? How does Asphalt compare to Twisted? What are the most challenging parts of writing asynchronous and event-based applications and how does Asphalt help simplify that process? When building an Asphalt application it can be easy to accidentally block an async loop by pulling in third party libraries that don’t support asynchronous execution. What are some of the techniques for identifying and resolving blocking portions of your application? What does the internal architecture of Asphalt look like and how has that evolved from when you first started working on it? What have been some of the most difficult aspects of building and evolving Asphalt? What are some of the most interesting or unexpected uses of Asphalt that you have seen? What are some of the new features or improvements that you have planned for the future of Asphalt? Keep In Touch Gitter IRC GitHub agronholm on GitHub @agronholm on Twitter Picks Tobias Thor: Ragnarok Alex Two Steps From Hell Links Asphalt ERP Asyncio Tornado Twisted SQLAlchemy PEP 550 Sanic WAMP Podcast.init Interview About Crossbar Tee FlexGet APScheduler BitTorrent uvloop Tokio The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/3/2017 • 34 minutes, 44 seconds

Golem: End-To-End Test Automation Framework with Luciano Renzi

Summary The importance of testing your software is widely talked about and well understood. What is not as often discussed is the different types of testing, and how end-to-end tests can benefit your team to ensure proper functioning of your application when it gets released to production. This week Luciano Renzi shares the work that he has done on Golem, a framework for building and executing an automation suite to exercise the entire system from the perspective of the user. He discusses his reasons for creating the project, how he thinks about testing, and where he plans on taking Golem in the future. Give it a listen and then take it for a test drive. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]) To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Luciano Renzi about Golem, a framework and automation tool for end-to-end testing in Python Interview Introductions How did you get introduced to Python? What is golem and what motivated you to create it? What was your inspiration for the name? Why did you choose to use Python for Golem and if you were to start over today would you make the same choice? For someone who is unfamiliar with the concept, can you describe what end-to-end testing is and the reasons for making it part of their development process? What is the main goal of Golem What does the internal architecture and implementation of Golem look like and how has that evolved from when you first started the project? How does Golem compare to other Python libraries for automated browser testing and what was lacking in the existing solutions when you created it? What are the differences between golem and robot framework? What about projects written in other languages such as protractor? One of the intriguing features of Golem is the web interface for constructing tests. What are the benefits of codeless automation & record-playback functionality? What are some of the most challenging aspects of building and maintaining Golem? It seems that every browser automation library is ultimately a wrapper around Selenium. Why is a wrapper necessary and why haven’t any strong alternatives been created? What are the advantages of making Golem a framework for test automation, rather than a library? What are some of the most interesting or unexpected uses for Golem that you have seen? What do you have planned for the future of Golem? What is the current state of end to end automation and how do you see it evolving in the future? 
How do you think machine learning and AI will be used in test automation? Keep In Touch luciano-renzi on GitHub @lucianorenzi on Twitter Picks Tobias Weapons of Math Destruction Links Golem Elementum Pascal Watir JUnit Selenium Page Object Pattern Selenium Grid Sauce Labs py.test Podcast.init Interview About Py.Test Robot Framework Mechanize Acceptance Tests Protractor Webdriver.io Appium The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/25/2017 • 54 minutes, 3 seconds

Graphite Metrics Stack with Jason Dixon and Dan Cech

Summary Do you know what is happening in your production systems right now? If you have a comprehensive metrics platform then the answer is yes. If your answer is no, then this episode is for you. Jason Dixon and Dan Cech, core maintainers of the Graphite project, talk about how Graphite is architected to capture your time series data and give you the ability to use it for answering questions. They cover the challenges that have been faced in evolving the project, the strengths that have let it stand the test of time, and the features that will be coming in future releases. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters. If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcatinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. 
Now is a good time to start planning your conference schedule for 2018. To help you out with that, guest Jason Dixon is offering a $100 discount for Monitorama in Portland, OR on June 4th – 6th and guest Dan Cech is offering a €50 discount to Grafanacon in Amsterdam, Netherlands March 1st and 2nd. There is also still time to get your tickets to PyCascades in Vancouver, BC Canada January 22nd and 23rd. All of the details are in the show notes. Your host as usual is Tobias Macey and today I’m interviewing Jason Dixon and Dan Cech about Graphite Interview Introductions How did you get introduced to Python? What is Graphite and how did you each get involved in the project? Why should developers be thinking about collecting and reporting on metrics from their software and systems? How do you think the Graphite project has contributed to or influenced the overall state of the art in systems monitoring? There are a number of different projects that comprise a fully working Graphite deployment. Can you list each of them and describe how they fit together? What are some of the early design choices that have proven to be problematic while trying to evolve the project? What are some of the challenges that you have been faced with while maintaining and improving the various Graphite projects? What will be involved in porting Graphite to run on Python 3? If you were to start the project over would you still use Python? What are the options for scaling Graphite and making it highly available? Given the level of importance to a company’s visibility into their systems, what development practices do you use to ensure that Graphite can operate reliably and fail gracefully? What are some of the biggest competitors to Graphite? When is Graphite not the right choice for tracking your system metrics? What are some of the most interesting or unusual uses of Graphite that you are aware of? What are some of the new features and enhancements that are planned for the future of Graphite? 
Keep In Touch Jason @obfuscurity on Twitter Website obfuscurity on GitHub Dan @dancech on Twitter Website DanCech on GitHub Picks Tobias Archery Jason Rocket League Monitorama $100 Discount (Limited Quantity) Dan Home Assistant Podcast.__init__ Interview GrafanaCon €50 discount with PODCASTINIT2018 Links Graphite Sensu Monitorama RainTank Grafana Labs Librato GitHub Dyn Telemetry Perl PHP React O’Reilly Graphite Book Time Series RRDTool InfluxDB Adrian Cockcroft NVMe Prometheus CNCF ASAP Smoothing PyCascades The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/19/2017 • 1 hour, 14 minutes, 17 seconds

Surprise! Recommendation Algorithms with Nicolas Hug

Summary A relevant and timely recommendation can be a pleasant surprise that will delight your users. Unfortunately it can be difficult to build a system that will produce useful suggestions, which is why this week’s guest, Nicolas Hug, built a library to help with developing and testing collaborative recommendation algorithms. He explains how he took the code he wrote for his PhD thesis and cleaned it up to release as an open source library and his plans for future development on it. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters. If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcatinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. 
Your host as usual is Tobias Macey and today I’m interviewing Nicolas Hug about Surprise, a scikit library for building recommender systems Interview Introductions How did you get introduced to Python? What is Surprise and what was your motivation for creating it? What are the most challenging aspects of building a recommender system and how does Surprise help simplify that process? What are some of the ways that a user or company can bootstrap a recommender system while they accrue data to use a collaborative algorithm? What are some of the ways that a recommender system can be used, outside of the typical ecommerce example? Once an algorithm has been deployed how can a user test the accuracy of the suggestions? How is Surprise implemented and how has it evolved since you first started working on it? What have been the most difficult aspects of building and maintaining Surprise? competitors? What are the attributes of the system that can be modified to improve the relevance of the recommendations that are provided? For someone who wants to use Surprise in their application, what are the steps involved? What are some of the new features or improvements that you have planned for the future of Surprise? Keep In Touch Website @hug_nicolas on Twitter nicolashug on GitHub Picks Tobias Silk profiler for Django Links Surprise Gridsearch Cold Start Problem Content-Based Recommendation Ensemble Learning Spotlight Lightfm Pandas The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/11/2017 • 30 minutes, 22 seconds

Rasa: Build Your Own AI Chatbot with Joey Faulkner

Summary With the proliferation of messaging applications, there has been a growing demand for bots that can understand our wishes and perform our bidding. The rise of artificial intelligence has brought the capacity for understanding human language. Combining these two trends gives us chatbots that can be used as a new interface to the software and services that we depend on. This week Joey Faulkner shares his work with Rasa Technologies and their open sourced libraries for understanding natural language and how to conduct a conversation. We talked about how the Rasa Core and Rasa NLU libraries work and how you can use them to replace your dependence on API services and own your data. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters. If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcatinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Joey Faulkner about Rasa Core and Rasa NLU for adding conversational AI to your projects. Interview Introductions How did you get introduced to Python? Can you start by explaining the goals of Rasa as a company and highlighting the projects that you have open sourced? What are the differences between the Rasa Core and Rasa NLU libraries and how do they relate to each other? How does the interaction model change when going from state machine driven bots to those which use Rasa Core and what capabilities does it unlock? How is Rasa NLU implemented and how has the design evolved? What are the motivations for someone to use Rasa Core or NLU as a library instead of available API services such as wit.ai, LUIS, or Dialogflow? What are some of the biggest challenges in gathering and curating useful training data? What is involved in supporting multiple languages for an application using Rasa? What are the biggest challenges that you face, past, present, and future, building and growing the tools and platform for Rasa? What would be involved for projects such as OpsDroid, Kalliope, or Mycroft to take advantage of Rasa and what benefit would that provide? On the comparison page for the hosted Rasa platform it mentions a feature of collaborative model training, can you describe how that works and why someone might want to take advantage of it? What are some of the most interesting or unexpected uses of the Rasa tools that you have seen? What do you have planned for the future of Rasa? 
Keep In Touch Gitter Twitter @joeymfaulkner @Rasa_HQ Email GitHub Picks Tobias Information Architecture Joey Dog Spotting Rasa NLU Trainer Links Rasa Technologies Rasa NLU Rasa Core SpaCy Podcast.__init__ Interview with SpaCy Creator yt-project Podcast.__init__ Interview with yt-project Chatbot Word2Vec State Machine Podcast.__init__ Episode About Automat with Glyph Recursive Neural Network MITIE Support Vector Machine Scikit Learn wit.ai LUIS Dialogflow Keras Reinforcement Learning The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/4/2017 • 49 minutes

Eliot: Effective Logging with Itamar Turner-Trauring

Summary Understanding what is happening in a software system can be difficult, especially when you have inconsistent log messages. Itamar Turner-Trauring created Eliot to make it possible for your project to tell you a story about how transactions flow through your program. In this week’s episode we go deep on proper logging practices, anti-patterns, and how to improve your ability to debug your software with log messages. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters. If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcatinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. 
Your host as usual is Tobias Macey and today I’m interviewing Itamar Turner-Trauring about Eliot, a library for managing complex logs across multiple processes. Interview Introductions How did you get introduced to Python? What is Eliot and what problem were you trying to solve by creating it? How is Eliot implemented and how has the design evolved since you first started working on it? Why is it so important to have a standardized format for your application logs? What are some of the anti-patterns that you consider to be the most harmful when developers are setting up logging in their projects? What have been the most challenging aspects of building and maintaining Eliot? How does Eliot compare to some of the other third party logging libraries available such as structlog or logbook? What are some of the improvements or additional features that you have planned for the future of Eliot? Keep In Touch Website @itamarst on Twitter Picks Tobias Moonshot Podcast Itamar Middlemarch by George Eliot Links Eliot Zope PHP OpenTracing Zipkin Carl De Marcken Sentry Elasticsearch Logstash Kibana Eliot-Tree Daniel Lebrero Flocker Context Local Variables PEP (PEP 550) Flamegraph Brendan Gregg DAG Structlog The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/29/2017 • 49 minutes, 48 seconds

Donkey: Building Self Driving Cars with Will Roscoe

Summary Do you wish that you had a self-driving car of your own? With Donkey you can make that dream a reality. This week Will Roscoe shares the story of how he got involved in the arena of self-driving car hobbyists and ended up building a Python library to act as his pilot. We talked about the hardware involved, how he has evolved the code to meet unexpected challenges, and how he plans to improve it in the future. So go build your own self driving car and take it for a spin! Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters. If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcatinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. 
Your host as usual is Tobias Macey and today I’m interviewing Will Roscoe about Donkey, a Python library for building DIY self driving cars. Interview Introductions How did you get introduced to Python? What is Donkey and what was your reason for creating it? What is the story behind the name? What was your reason for choosing Python as the language for implementing Donkey and if you were to start over today would you make the same choice? How is Donkey implemented and how has its software architecture evolved? Is the library built in a way that you can process inputs from additional sensor types, such as proximity detectors or LIDAR? For training the autopilot what are the input features that the model is testing against for the input data, and is it possible to change the features that it will try to detect? Do you have plans to incorporate any negative reinforcement techniques for training the pilot models so that errors in data collection can be identified as undesirable outcomes? What have been some of the most interesting or humorous successes and failures while testing your cars? What are some of the challenges involved with getting such a sophisticated stack of software running on a Raspberry Pi? What are some of the improvements or new features that you have planned for the future of Donkey? 
Media Donkey Car Photos Keep In Touch Donkey Slack Channel Will’s Twitter – @dataduce #donkeycar on social Picks Tobias Orgzly Org Mode for Sublime Org Mode for VSCode Org Mode for Vim Will Algorithms to Live By The Structure of Scientific Revolutions A song I can’t stop nodding my head to Links Donkey Car DIY Robocars Tornado [Tornado on Podcast.init](https://www.pythonpodcast.com/episode-40-ben-darnell-on-tornado/?utmsource=rss&utmmedium=rss) Raspberry Pi TensorFlow Convolutional Neural Network Adafruit LIDAR ROS (Robot Operating System) Unity Udacity self driving car nano-degree SparkFun Beagleboard Adam Conway The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/22/2017 • 33 minutes, 49 seconds

Event Sourcing with John Bywater

Summary The way that your application handles data and the way that it is represented in your database don’t always match, leading to a lot of brittle abstractions to reconcile the two. In order to reduce that friction, instead of overwriting the state of your application on every change you can log all of the events that take place and then render the current state from that sequence of events. John Bywater joins me this week to discuss his work on the Event Sourcing library, why you might want to use it in your applications, and how it can change the way that you think about your data. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports the show on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters. If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcatinit.com/gocd. Professional support and enterprise plugins are available for added peace of mind. Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at @Podcast__init__ or email [email protected]. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing John Bywater about event sourcing, an architectural approach to make your data layer easier to scale and maintain. Interview Introductions How did you get introduced to Python? Can you start by describing the concept of event sourcing and the benefits that it provides? What is the event sourcing library and what was your reason for starting it? What are some of the reasons that someone might not want to implement an event sourcing approach in their persistence layer? Given that you are storing a record for each event that occurs on a domain object, how does that affect the amount of storage necessary to support an event sourced application? What is the impact on performance and latency from an end user perspective when the application is using event sourcing to render the current state of the system? What does the internal architecture and design of your library look like and how has that evolved over time? In the case where events are delivered out of order, how can you ensure that the present view of an object is reflected accurately? For someone who wants to incorporate an event sourcing design into an existing application, how would they do that? How do you manage schema changes in your domain model when you need to reconstruct present state from the beginning of an object’s event sequence? What are some of the most interesting uses of event sourcing that you have seen? What are some of the features or improvements that you have planned for the future of your event sourcing library? 
Keep In Touch John johnbywater on GitHub @johnbywater on Twitter Picks Tobias Heresy In The Church Of Docker John QuantDSL Links CKAN Data.gov Patterns of Enterprise Application Architecture Object Relational Impedance Mismatch Event Sourcing (Pattern) Event Sourcing (Library) N-Tiered Architecture Domain Driven Design Event Storming ORM, The Vietnam of Computer Science Vaughn Vernon, Implementing Domain Driven Design Active Record Pattern Optimistic Concurrency Control Paxos DynamoDB Martin Fowler Eric Evans The Dark Side of Event Sourcing The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
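The idea in the summary above, logging every event and rendering the current state from the sequence instead of overwriting it, can be sketched in a few lines of plain Python. This is a minimal illustration only: it does not use the Event Sourcing library discussed in the episode, and the names `Deposited`, `Withdrawn`, `apply_event`, and `current_balance` are hypothetical, chosen just for this example.

```python
from dataclasses import dataclass
from functools import reduce


# Hypothetical event types for a toy bank account (not from any library).
@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def apply_event(balance, event):
    """Fold a single event into the running balance."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise TypeError(f"unknown event: {event!r}")


def current_balance(events):
    """Rebuild the state by replaying the events, starting from zero."""
    return reduce(apply_event, events, 0)


# The log is append-only: changes are recorded, never overwritten.
log = [Deposited(100), Withdrawn(30), Deposited(5)]

print(current_balance(log))      # 75
print(current_balance(log[:2]))  # 70, the state as of the second event
```

Because the state is derived rather than stored, replaying a prefix of the log gives the state at an earlier point in time, which is one of the reasons the pattern is attractive for auditing and debugging.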

10/15/2017 • 1 hour, 8 minutes, 26 seconds

Kalliope with Nicolas Marcq and Thibaud Buffet

Summary Wouldn’t it be nice to have a personal assistant to answer your questions, help you remember important tasks, and control your environment? Meet Kalliope, a Python powered, modular, voice controlled automation platform. This week Nicolas Marcq and Thibaud Buffet explain how they started the project, what makes it stand out from other open source and commercial options, and how you can start using it today. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Need to learn more about how to scale your apps or learn new techniques for building them? Pluralsight has the training and mentoring you need to level up your skills. Go to www.pythonpodcast.com/pluralsight?utm_source=rss&utm_medium=rss to start your free trial today. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. If you work with data for your job or want to learn more about how open source is powering the latest innovations in data science then make your way to the Open Data Science Conference, happening in London in October and San Francisco in November. Follow the links in the show notes to register and help support the show in the process. Your host as usual is Tobias Macey and today I’m interviewing Nicolas Marcq and Thibaud Buffet about Kalliope, a modular always-on voice controlled personal assistant designed for home automation. 
Interview Introductions How did you get introduced to Python? What is the Kalliope project and how did it get started? How does Kalliope compare to commercial options such as Amazon Alexa and Google Home, as well as other open source projects such as Mycroft or Jasper? The majority of voice assistant projects that I have seen default to interacting in English, whereas Kalliope is multi-lingual. What led you to that design choice and how is that implemented? One of the perennial questions around voice assistants is privacy, so how does Kalliope work to mitigate the issues associated with having an always on device listening in people’s homes? How is Kalliope architected internally and how has the design evolved over time? What are some of the most difficult or challenging aspects of building Kalliope and its associated projects? What are some of the most interesting uses of Kalliope that you are aware of? What are some of the most notable features or improvements that you have planned for the future of Kalliope? How has the choice of Python as the implementation worked for you, and if you were to start over today do you think you would make the same decision? Keep In Touch Nicolas @Sispheor on Twitter Sispheor on GitHub Website Thibaud @TibTac on Twitter LaMonF on GitHub Picks Tobias Kiwi Crate Nicolas Raspberry Pi Speaker Thibaud ReactiveX in Python Links Snowboy Mycroft Mycroft Interview Amazon Alexa Google Home Jasper Kalliope TTS STT CMU Sphinx Abstract Base Class MQTT RxPy Interview The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/8/2017 • 32 minutes, 33 seconds

Modoboa with Antoine Nguyen

Summary Email has long been the most commonly used means of communication on the internet. This week Antoine Nguyen talks about his work on the Modoboa project to make hosting your own mail server easier to manage. He discusses how the project got started, the tools that it ties together, and how he used Django to build a webmail and admin interface to make it more approachable. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Need to learn more about how to scale your apps or learn new techniques for building them? Pluralsight has the training and mentoring you need to level up your skills. Go to www.pythonpodcast.com/pluralsight?utm_source=rss&utm_medium=rss to start your free trial today. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. If you work with data for your job or want to learn more about how open source is powering the latest innovations in data science then make your way to the Open Data Science Conference, happening in London in October and San Francisco in November. Follow the links in the show notes to register and help support the show in the process. Your host as usual is Tobias Macey and today I’m interviewing Antoine Nguyen about Modoboa, a project to make mail hosting simple. Interview Introductions How did you get introduced to Python? 
What is Modoboa and what is the problem that you were trying to solve when you started it? Where does the name come from? Self-hosting an email server was a common activity during the early stages of the internet, what are some of the reasons that someone should consider running their own mail server now that there are so many options for third-party hosting such as Gmail and Outlook? Email hosting has become more complicated in recent years with the need to jump through a lot of hoops to maintain a sufficient reputation to keep your messages from being flagged as spam. Are there any utilities in Modoboa to assist with that process? There are a lot of components that you have brought together for running an email server. Can you describe how the different pieces fit together and what layers you have built on top to help make the overall system more manageable? What does the scaling strategy look like for Modoboa? What is the most challenging aspect of building and maintaining Modoboa? What are some of the features that you have planned for the future of Modoboa? Keep In Touch Email @antngu on Twitter Picks Tobias Dropbox Paper Antoine Capoeira Links PyTk Postfix Dovecot Nextcloud Owncloud SPF Records DKIM DMARC SMTP IMAP Apache Libcloud Amavis Mail Transfer Agent Radicale Ansible Docker Gentoo Packer Synology Drobo Prosody Lua XMPP The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/1/2017 • 33 minutes, 18 seconds

QuTiP with Paul Nation

Summary The future of computation and our understanding of the world around us are driven by the quantum world. This week Paul Nation explains how the Quantum Toolbox in Python (QuTiP) is being used in research projects that are expanding our knowledge of the physical universe. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Need to learn more about how to scale your apps or learn new techniques for building them? Pluralsight has the training and mentoring you need to level up your skills. Go to www.pythonpodcast.com/pluralsight?utm_source=rss&utm_medium=rss to start your free trial today. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. If you work with data for your job or want to learn more about how open source is powering the latest innovations in data science then make your way to the Open Data Science Conference, happening in London in October and San Francisco in November. Follow the links in the show notes to register and help support the show in the process. Your host as usual is Tobias Macey and today I’m interviewing Paul Nation about QuTiP, the quantum toolbox in Python. Interview Introductions How did you get introduced to Python? Before we start talking about QuTiP, can you provide us with a baseline definition of what quantum mechanics is? What is QuTiP and how did the project get started? 
Is QuTiP used purely in academia, or are there other users? What are some of the practical innovations that have been created as a result of research into different areas of quantum optics? How do you foresee the advent of practical quantum computers impacting the state of quantum mechanical research? Given the inherent complexity of the subject matter that you are dealing with, how do you approach the challenge of trying to present a usable API to users of QuTiP while not inhibiting their ability to operate at a low level when necessary? What is the process for incorporating new understandings of quantum mechanical theory into the QuTiP package? What are some of the most difficult aspects of simulating quantum systems in a standard computational environment? What is the most enjoyable aspect of working on QuTiP, and what is the least enjoyable? What are some of the most notable research results that you are aware of which used QuTiP as part of their studies? What are some resources that you can recommend for anyone who wants to learn more about quantum mechanics? Keep In Touch QuTiP QuSTaR Picks Tobias edx.org Paul Cython Matplotlib Cheyenne Mountain Zoo Links Quantum Optics 2 Level System Complex Numbers Qubit Quantum Computing Harmonic Oscillator Nature Scientific Journal IBM Quantum Experience D-Wave Rigetti Quantum Computing Quantum Supremacy Hamiltonian Sparse Matrix Richard Feynman Dask Project Q Quantum State Transfer via Noisy Photonic and Phononic Waveguides paper by Peter Zoller Extending the lifetime of a quantum bit with error correction in superconducting circuits paper by Rob Schoelkopf (Yale) QuTiP Documentation The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/24/2017 • 36 minutes, 31 seconds

Lego Robotics with David Lechner and Denis Demidov

Summary Do you like Legos, robots, and Python? This week I am joined by David Lechner and Denis Demidov to talk about the ev3dev project and how you can program your Lego Mindstorms with Python! Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Need to learn more about how to scale your apps or learn new techniques for building them? Pluralsight has the training and mentoring you need to level up your skills. Go to www.pythonpodcast.com/pluralsight?utm_source=rss&utm_medium=rss to start your free trial today. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. If you work with data for your job or want to learn more about how open source is powering the latest innovations in data science then make your way to the Open Data Science Conference, happening in London in October and San Francisco in November. Follow the links in the show notes to register and help support the show in the process. Your host as usual is Tobias Macey and today I’m interviewing David Lechner and Denis Demidov about using Python with the ev3dev platform for programming LEGO robots Interview Introductions How did you get introduced to Python? Can you explain what the ev3dev project is and some of the story about how and why it got started? What is LEGO’s opinion of the ev3dev project? 
For anyone who isn’t familiar with the MINDSTORMS EV3 product from LEGO, can you give a brief overview of the hardware that they come with? Other than allowing users to program in environments other than the block-based editor that LEGO provides, what capabilities does the ev3dev project add to the MINDSTORMS EV3 platform? How are the language bindings generated and how do the different implementations compare to each other? What are the most challenging aspects of building and maintaining the ev3dev distribution and various language bindings? One of the things that my son is curious about is the possibility for integrating his MINDSTORMS with projects such as Kalliope or Mycroft to allow for voice controlled robots. Are you aware of anyone having done so or how you would approach something like that? What are some of the most interesting or innovative projects that you have seen people make with the MINDSTORMS platform running ev3dev? Why would someone want to use MINDSTORMS instead of any of the other robotics platforms that are available? For someone who is interested in learning more about intermediate and advanced robotics, what are some resources that you would recommend? Keep In Touch Denis @denis_demidov on Twitter ddemidov on Github David dlech on Github Website Picks Tobias Raspberry Pi Kalliope Denis pybind11 David Local food LocalHarvest Links ev3dev Lego MINDSTORMS BeagleBone Lego Mindstorms Community C++ Jupyter Notebooks Ralph Hempel Forth RCX NXT EV3 ARMv5 Debian PiStorms BrickPi EVB UART EV3 Schematics Look for “EV3 Hardware Developer Kit” in “Advanced Users” section. I2C RPyC Laurens Valk Liquid Templates Delta Robot Quest For Space Lego Technic Mindsensors.com Cool robots built with ev3dev Micropython The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/17/2017 • 44 minutes, 1 second

Cloud-Init with Scott Moser

Summary Server administration is a complex endeavor, but there are some tools that can make life easier. If you are running your workload in a cloud environment then cloud-init is here to help. This week Scott Moser explains what cloud-init is, how it works, and how it became the de-facto tool for configuring your Linux servers at boot. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Need to learn more about how to scale your apps or learn new techniques for building them? Pluralsight has the training and mentoring you need to level up your skills. Go to www.pythonpodcast.com/pluralsight?utm_source=rss&utm_medium=rss to start your free trial today. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. If you work with data for your job or want to learn more about how open source is powering the latest innovations in data science then make your way to the Open Data Science Conference, happening in London in October and San Francisco in November. Follow the links in the show notes to register and help support the show in the process. Your host as usual is Tobias Macey and today I’m interviewing Scott Moser about cloud-init, a set of python scripts and utilities to make your cloud images be all they can be! Interview Introductions How did you get introduced to Python? What is cloud-init and how did the project get started? 
Why was Python chosen as the language for implementing cloud-init? How has cloud-init come to be the de-facto utility for managing cloud instances across vendors and distributions? Are there any viable competitors to cloud-init? coreos-cloudinit, some others. How much overlap is there between cloud-init and configuration management tools such as SaltStack, Ansible, Chef, etc.? How have you architected cloud-init to allow for compatibility across operating system distributions? What is the most difficult or complex aspect of building and maintaining cloud-init? [os integration, networking, goal of “do stuff without reboot”] Given that it is used as a critical component of the production deployment mechanics for a large number of people, how do you ensure an appropriate level of stability and security while developing cloud-init? How do you think the status of cloud-init as a Canonical project has affected the level of contributions that you receive? How much of the support and roadmap is contributed by individual vs corporate users such as AWS and Azure? What are some of the most unexpected or creative uses of cloud-init that you have seen? [https://wiki.ubuntu.com/OpenCompute?utm_source=rss&utm_medium=rss “disposable use os”] In your experience, what has been the biggest stumbling block for new users of cloud-init? Do you have any notable features or improvements planned for the future of cloud-init, or do you feel that it has reached a state of feature-completeness? Keep In Touch smoser on GitHub Picks Tobias mu4e isync Scott LXD Links IBM – Linux Technology Center Cloud-Init Ubuntu Canonical CoreOS EC2 OpenStack CentOS RHEL coreos-cloudinit JuJu Puppet SystemV Upstart SystemD Joyent SmartOS Digital Ocean IPv4 IPv6 Canonical MaaS+ JSON-Schema LXD Launchpad Bzr Git SUSE FreeBSD KVM Go-lang Pretty Table RAID ZFS LVM The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/10/2017 • 49 minutes, 50 seconds

Biopython with Peter Cock, Wibowo Arindrarto, and Tiago Antão

Summary Advances in the techniques used for genome sequencing are providing us with more information to unlock the secrets of biology. But how does that data get processed and analyzed? With Python of course! This week I am joined by some of the core maintainers of Biopython to discuss what bioinformatics is, how Python is used to help power the research in the field, and how Biopython helps to tie everything together. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Need to learn more about how to scale your apps or learn new techniques for building them? Pluralsight has the training and mentoring you need to level up your skills. Go to www.pythonpodcast.com/pluralsight?utm_source=rss&utm_medium=rss to start your free trial today. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. If you work with data for your job or want to learn more about how open source is powering the latest innovations in data science then make your way to the Open Data Science Conference, happening in London in October and San Francisco in November. Follow the links in the show notes to register and help support the show in the process. Your host as usual is Tobias Macey and today I’m interviewing Peter Cock, Wibowo Arindrarto, and Tiago Antão about biopython, a suite of python tools for computational molecular biology. 
Interview Introductions How did you get introduced to Python? Can you start by explaining what bioinformatics is and highlight some of the different areas of research? What is biopython and how did it get started? Biopython has a long history behind it. How has the project evolved over that time to meet the changing needs in terms of both research and computation? How does Biopython compare to the sibling Bio* projects in other programming languages? What does a common workflow look like for someone who is working with biological data? What are some of the most interesting or innovative uses of Biopython that you are aware of? What are some of the most challenging aspects of developing and supporting Biopython? What are some of the most exciting developments in bioinformatics, either recently or coming up? How much domain knowledge is necessary for someone who wants to contribute to the project? What are some of the most problematic limitations of Biopython and how do you work around them? Keep In Touch Peter Website Wibowo Website @_bow_ on Twitter Tiago Website @tiagoantao on Twitter Biopython GitHub Picks Tobias Keep it Low Conf Peter Jupyter Notebooks (formerly IPython) for producing notebooks combining code, graphical output and descriptive text. Can be seen as a modern take on Donald Knuth’s Literate programming? Wibowo Conda for installing software, including BioConda for community packaged software in bioinformatics. Tiago Brython project for writing Python 3 in your browser using JavaScript Glacier National Park in North West Montana Links BioJava BioRuby BioPerl BioJS Open Bioinformatics Foundation Software In The Public Interest Oxford Nanopore Technology (for sequencing in the field etc) The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/3/2017 • 45 minutes, 29 seconds

opsdroid with Jacob Tomlinson

Summary Server administration is an activity that often happens in an isolated context in a terminal. ChatOps is a way of bringing that work into a shared environment and unlocking more collaboration. This week Jacob Tomlinson talks about the work he has done on opsdroid, a new bot framework targeted at tying together the various services and environments that modern production systems rely on. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Need to learn more about how to scale your apps or learn new techniques for building them? Pluralsight has the training and mentoring you need to level up your skills. Go to www.pythonpodcast.com/pluralsight?utm_source=rss&utm_medium=rss to start your free trial today. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. If you work with data for your job or want to learn more about how open source is powering the latest innovations in data science then make your way to the Open Data Science Conference, happening in London in October and San Francisco in November. Follow the links in the show notes to register and help support the show in the process. Your host as usual is Tobias Macey and today I’m interviewing Jacob Tomlinson about opsdroid Interview Introductions How did you get introduced to Python? 
What is opsdroid and what was the problem that you were trying to solve when you started the project? What led you to choose Python as the language for implementing opsdroid? What did you find lacking in the multitude of other chat bots that necessitated starting a new project? (e.g. Hubot, Errbot, Lita) One of the main features that you list in the documentation is the ease of installation. Why is that such an important aspect of the project and how is that implemented? What has been the most interesting and the most challenging aspect of implementing opsdroid? On the opsdroid organisation on GitHub there are many repositories for plugin modules. Do you see this being a management issue in the long term? How is opsdroid architected and what were the system requirements that led to the current system design? How do you manage authorization and authentication for performing commands against your production infrastructure in a group chat environment? What are some of the other security implications that users should be aware of when deploying a bot for interfacing with their deployment environment? How does a chat-oriented bot framework differ from those that are being created for voice-oriented interaction? What do you have planned for the future of opsdroid? Keep In Touch Website @JacobTomlinson on Twitter jacobtomlinson on GitHub Picks Tobias Rough Translation Podcast Jacob Home Assistant Podcast Links Iron Man Movie Puppet Hubot ChatOps asyncio Home Assistant Podcast.init Interview api.ai Luis Lex Slack Mycroft Kalliope Amazon Alexa opsdroid audio Snowboy Google Home Wit.ai The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/26/2017 • 45 minutes, 40 seconds

Ergonomica with Liam Schumm

Summary As developers we spend a lot of our work day in a terminal window, using shells that were designed 30 years ago. This week Liam Schumm joins me to explain why he decided to write a new, more ergonomic shell environment to simplify his workflow. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Need to learn more about how to scale your apps or learn new techniques for building them? Pluralsight has the training and mentoring you need to level up your skills. Go to www.pythonpodcast.com/pluralsight?utm_source=rss&utm_medium=rss to start your free trial today. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. If you work with data for your job or want to learn more about how open source is powering the latest innovations in data science then make your way to the Open Data Science Conference, happening in London in October and San Francisco in November. Follow the links in the show notes to register and help support the show in the process. Your host as usual is Tobias Macey and today I’m interviewing Liam Schumm about Ergonomica Interview Introductions How did you get introduced to Python? What is Ergonomica and what was your reason for creating it? What are some of the most difficult aspects of the project that you have experienced? How is Ergonomica implemented? 
What was your reason for using a dialect of Lisp as the interface for a terminal environment as opposed to iterating on the idioms in shells such as Bash? How does Ergonomica’s implementation differ from traditional shells such as Bash, Csh, and Powershell? How does Ergonomica’s implementation differ from other alternative shells such as Xonsh, ZSH, and Fish? Why did you choose to implement Ergonomica in Python? What’s your target group for Ergonomica? What do you have planned for the future of Ergonomica? Reading through your website you are fairly well accomplished. How does your age factor into the kinds of projects that you are engaged in? Keep In Touch Liam’s GitHub Email @liamschumm on Twitter Picks Tobias Magic The Gathering: Arena of the Planeswalkers Liam GitLab CE Python-prompt-toolkit Thriftbooks – 15% off your first order Links PyGame Minecraft Beeware ChiPy Chi Hack Night XKCD Tar Comic POSIX Colorama PLY Peter Norvig How to write a lisp interpreter in python ZSH Fish Xonsh PyVim sh Homebrew The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/20/2017 • 42 minutes, 4 seconds

Data Retriever with Henry Senyondo

Summary Analyzing and interpreting data is a large portion of the work involved in scientific research. Getting to that point can be a lot of work on its own because of all of the steps required to download, clean, and organize the data prior to analysis. This week Henry Senyondo talks about the work he is doing with Data Retriever to make data preparation as easy as retriever install. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Henry Senyondo about Data Retriever, the package manager for public data sets. Interview Introductions How did you get introduced to Python? Can you explain what data retriever is and the problem that it was built to solve? Are there limitations as to the types of data that can be managed by data retriever? What kinds of data sets are currently available and who are the target users? What is involved in preparing a new dataset to be available for installation? How much of the logic for installing the data is shared between the R and Python implementations of Data Retriever and how do you ensure that the two packages evolve in parallel? How is the project designed and what are some of the most difficult technical aspects of building it? What is in store for the future of data retriever? 
Keep In Touch GitHub @henrykironde on Twitter Picks Tobias Otium Bluetooth Receiver Panasonic Ergofit Headphones Nite Ize Adhesive Pocket Clip Henry The Three Idiots Links Weecology Lab University of Florida Data Retriever LG R Julia Open Knowledge Foundation Frictionless Data Format Data Weaver The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/12/2017 • 17 minutes, 55 seconds

Coverage.py with Ned Batchelder

Summary We write tests to make sure that our code is correct, but how do you make sure the tests are correct? This week Ned Batchelder explains how coverage.py fills that need, how he became the maintainer, and how it works under the hood. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Ned Batchelder about coverage.py, the ubiquitous tool for measuring your test coverage. Interview Introductions How did you get introduced to Python? What is coverage.py and how did you get involved with the project? The coverage project has become the de facto standard for measuring test coverage in Python. Why do you think that is? What is the utility of measuring test coverage? What are the downsides to measuring test coverage? One of the notable capabilities that was introduced recently was the plugin for measuring coverage of Django templates. Why is that an important capability and how did you manage to make that work? How does coverage conduct its measurements and how has that algorithm evolved since you first started work on it? What are the most challenging aspects of building and maintaining coverage.py? While I was looking at the bug tracker I was struck by the vast array of contexts in which coverage is used. 
Do you find it overwhelming trying to support so many operating systems and Python implementations? What might be added to coverage in the future? Keep In Touch @nedbat on Twitter Website Picks Tobias Org-Journal Ned Hypothesis The Infinite Monkey Cage Links edX Lotus Notes Zope Coverage.py Gareth Rees Trace in stdlib Fig Leaf State Machines CodeCov Coveralls Cobertura Turing Completeness Django Templates Jinja2 Mako Hy-lang GCov Jython Code Triage Service Who Tests What The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/6/2017 • 51 minutes, 54 seconds

Yosai with Darin Gordon

Summary For any program that is used by more than one person you need a way to control identity and permissions. There are myriad solutions to that problem, but most of them are tied to a specific framework. Yosai is a flexible, general purpose framework for managing role-based access to your applications that has been decoupled from the underlying platform. This week the author of Yosai, Darin Gordon, joins us to talk about why he started it, his experience porting it from Java, and where he hopes to take it in the future. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Darin Gordon about Yosai, a security framework for Python applications Interview Introductions How did you get introduced to Python? What is Yosai and what is the problem that you were trying to solve when you started it? How does Yosai compare to existing libraries for web frameworks such as Flask-Security or Django Guardian and why might someone choose Yosai instead? In the documentation it mentions that Yosai is a port of the Apache Shiro framework from Java to Python. What was most difficult about exposing a Pythonic interface while maintaining the core principles of the original? 
Authentication and authorization are difficult problem domains and can cause significant issues if they are not implemented in a secure fashion. How do you ensure an appropriate level of quality in Yosai to be confident having people use it? To start, can you describe how the framework is architected and what is involved in integrating it with a project? Outside of the context of web applications, what are some situations where someone should consider integrating authentication and authorization into their project? What have been some of the most challenging aspects of building the Yosai project? Tell us about the Rust extension you wrote earlier this year. What do you have planned for the future of Yosai? Keep In Touch Website GitHub @darin_gordon on Twitter Picks Tobias Brains On! podcast Darin The Asphalt Framework. Asphalt is an asyncio-based microframework for network oriented applications. Links Yosai Project Web Page Github Repo RBAC Apache Shiro TOTP Pyramid SOLID Builder Pattern POJO Cory Benfield Hyper HTTP/2 Library Passlib Hugo MKDocs YAML Middleware IoT Authz in Rust PyO3 Snaek PyCon Canada PyCascades JSON Web Tokens The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/30/2017 • 41 minutes, 59 seconds

Moving to MongoDB with Michael Kennedy

Summary There are dozens of decisions that need to be made when building an application. Sometimes this can lead to analysis paralysis and prevent you from making progress, so don’t let the perfect be the enemy of the good. This week Michael Kennedy shares his experience with evolving his application architecture when his business needs outgrew his initial designs. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Mike Kennedy about his work scaling his apps and his business Interview Introductions How did you get introduced to Python? In some of your recent episodes you have mentioned the work that you did to migrate your applications to run on MongoDB. Can you start by describing the business case for these applications and how you arrived at the initial design? What was the limiting factor that led you to consider such a drastic shift in how you store and manage your data and what benefits did you gain when the work was complete? If the issue was with scaling, how did you identify the choke points? Why go from relational (SQLite) to document (Mongo) instead of what would seem a more obvious choice of a production grade relational engine such as PostGreSQL or MySQL? 
Are there any particular synergies that arise from using a document as opposed to a relational store when working with Python and what are some of the main considerations when deciding between them? What was happening in your business that precipitated the need for this work? How are you talking to MongoDB from Python? Directly (via pymongo) or ORM-style? Why did you make that choice? How well is that working out? Advantages / drawbacks? In addition to podcasting you have also been working to create a number of successful courses to teach people how to use Python. Is there anything specific to the language that translates into how you design the material? For anyone who wants to learn more about the benefits and tradeoffs of using a document store with their Python applications, what are some resources that you recommend? Keep In Touch Michael @mkennedy on Twitter Websites Talk Python Python Bytes Picks Tobias Org Mode Levar Burton Reads Mike Newspaper Robomongo (now Robo 3T) The Dark Secret at the Heart of AI Haibike SDURO Cross 4.0 Links SQLAlchemy SQLIte MySQL PostGreSQL NoSQL MongoDB Database Normalization Foreign Keys Document Database Rollbar MongoEngine Mongo Security Checkup MLab MongoDB Atlas MongoDB World O’Reilly Python Mongo Book MongoDB For Python Developers Momentum Dash The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/22/2017 • 47 minutes, 58 seconds

Zulip Chat with Tim Abbott

Summary In modern work environments email is being edged out by group chat as the preferred method of communication. The majority of the platforms used are commercial and closed source, but there is one project that is working to change that. Zulip is a project that aims to redefine how effective teams communicate and it is already gaining ground. This week Tim Abbott shares the story of how Zulip got started, how it is built, and why you might want to start using it. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Tim Abbott about Zulip, a powerful open source group chat platform Interview Introductions How did you get introduced to Python? What is Zulip and what was the initial inspiration for creating it? Where does the name come from? My understanding is that the project was initially intended to be a commercial product. Can you share some of the history of the acquisition by Dropbox and the journey to open sourcing it? How has your experience at Dropbox influenced the evolution and implementation of the Zulip project? There are a large number of group chat platforms available, both commercial and open source. How does Zulip differentiate itself from other options such as Slack or Mattermost? 
Typically real-time communication is difficult to achieve in a WSGI application. How is Zulip architected to allow for interactive communication? What have been the most challenging aspects of building and maintaining the Zulip project? What is involved in installing and running a Zulip server? For a large installation, what are the options for scaling it out to handle greater load? There is a large and healthy community that has built up around the Zulip project. What are some of the methods that you and others have used to foster that growth and engagement? What has been the most unexpected aspect of working on Zulip, whether technically or in terms of the community around it? What do you have planned for the future of Zulip? Keep In Touch Zulip Chat @zuliposs on Twitter Tim @tabbott3 on Twitter Website Picks Tobias Lego Mindstorms EV3 Tim Checklist Manifesto Tim’s Recipe Wiki Links Zulip Ksplice Electron React Native IFTTT Zapier Zephyr Barn Owl MyPy Tornado Django Zulip Tornado Documentation MySQL PostGreSQL ElasticSearch Code Triage Emoji Podcast.__init__ Zulip Chat The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/16/2017 • 1 hour, 39 seconds

NAPALM with David Barroso and Mircea Ulinic

Summary Routers and switches are the stitches in the invisible fabric of the internet which we all rely on. Managing that hardware has traditionally been a very manual process, but the NAPALM (Network Automation and Programmability Abstraction Layer with Multivendor support) is helping to change that. This week David Barroso and Mircea Ulinic explain how Python is being used to make sure that you can watch those cat videos. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing David Barroso and Mircea Ulinic about NAPALM (Network Automation and Programmability Abstraction Layer with Multivendor support), the library for managing programmable network devices Interview Introductions How did you get introduced to Python? [david] 2012 trying to use django 1.4 to store data I had on confluence. [mircea] August 2008, when I bought the Learning Python, Mark Lutz, 2nd edition Can you start by explaining what NAPALM is and the problem that you were solving when you started working on it? [david] trying to remove all the if vendora do this, elif vendorb do this other thing instead [mircea] only if I will feel there’s anything to add What led you to choose Python as the language for implementing it? 
[david] It’s what I knew best, and vendors were starting to provide libraries to interact with their platforms, so Python seemed like a natural evolution: we could just provide an abstraction on top of those libraries that already existed. [mircea] I didn’t implement NAPALM; I was first a user, then a contributor, and now I’m one of the maintainers. When working with network equipment it is easy to apply the wrong settings and bring down a large number of systems or lock yourself out entirely. Are there any tools in NAPALM to help prevent this from happening? [david] We provide mechanisms to ensure proper peer reviewing; we let operators propose a configuration and get a diff. We have a rollback mechanism, so if you detect an issue you can immediately roll back, and we also added support for the auto-rollback feature some vendors have. How have you architected the library to allow for easy integration of new classes of network devices? [david] Very simple architecture. Trying to avoid complex features like abstract classes, metaprogramming or decorators. The main reason is that I figured my main user base wasn’t going to be very Python savvy, so I wanted something simple. What I ended up doing was simulating interfaces with a base class that described the supported methods and how they were supposed to behave, plus an extensive testing framework that ensures the method signatures and the behaviors match the expectations. Designing and building a consistent API for such a wide variety of hardware and software platforms is a daunting task. How do you determine the lowest common set of functionality that you are going to expose as part of the core library vs delegating to the underlying dependencies? [david] We don’t necessarily go with the lowest common denominator. Sometimes we try to emulate features. For example, if a platform doesn’t support atomic changes we might simulate it by trying to send the configuration as a block and rolling back immediately if there is an error. 
Obviously a feature like this is clearly documented so people are aware that this might happen. What we try to avoid, though, is implementing things that are very specific to a single vendor. In any case, the way it has worked so far falls into two categories. The first is configuration management. These are primitives like loading a candidate configuration for merging or replacing into the device, getting a diff back, committing, discarding or rolling back configuration. These primitives were designed at the very beginning of the project based on the NETCONF protocol and they have changed very little since then. When a primitive is not natively supported by a device we try to emulate it, as with the atomicity example I gave before, or we don’t implement it at all if it’s not possible. The second category is what we call getters, which are methods that retrieve information from the devices. Things like interface counters, BGP neighbors, etc. These are basically community driven. Someone opens an issue on GitHub explaining the data that he or she needs, we discuss it, we define a model and then we work on it. Not all getters are supported on all platforms. People mostly implement them as they need them. Now there is a third category, though. It is actually funny, but I presented napalm for the first time a couple of years ago at NANOG64. It turns out the day after, at the same venue, Google was presenting OpenConfig. OpenConfig is an effort to design a common set of models to operate the network. So, for example, they have models for BGP neighbors, for interfaces, VLANs, etc. Those models try to be vendor agnostic and you should, in theory, be able to use them to configure any device or to retrieve information from it consistently. The problem is that, of course, vendors are slow in implementing them, and they don’t even have plans for all of them or for all the platforms. So the sad truth is that two years later support for OpenConfig is extremely limited. 
However, in the last few months I have been working on integrating napalm with OpenConfig, so now we have a beta version of napalm where you can use Python bindings that can translate native data from a device into an OpenConfig object and vice versa. That has two direct implications. First, we are now not only operating all vendors with the same tool but also operating them with the same data structures. This means that I can get the configuration of a Cisco device and translate it directly to Junos configuration. Second, because we are now dealing with objects, I can do smart things like having an object that represents the candidate configuration, another object that represents a certain running state, and simulate merges myself without having to rely on the device itself. I can even generate the exact commands to do the merge without having to rely on the device doing the actual merge. I can also simulate the changes offline; I don’t even need access to the device anymore. I could be building the objects from a backup or from the resulting configuration after merging different branches on GitHub. I have seen a few posts recently discussing the use of NAPALM in conjunction with configuration management platforms such as SaltStack and Ansible. What are the tradeoffs of using the library directly vs integrated with these other tools? [david] napalm is a library in the strict sense. There is no business logic, no workflows, very little tooling embedded. Instead we try to implement as many primitives as possible and be as flexible as possible so other tools can build on napalm to implement their workflows. What this means is that using napalm directly is great if you are writing a script to do backups or to solve a specific issue, but if you want to build a whole framework for event-driven automation or a configuration management system you are probably better off using napalm’s integration with salt/ansible/st2. 
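To make the configuration primitives David describes a little more concrete (load a candidate, get a diff, commit, discard, roll back), here is a minimal in-memory sketch. It also imitates the "base class as interface" idea from his architecture answer. The class names BaseDriver and InMemoryDriver and the single load_candidate method are hypothetical, invented for this illustration; this is not NAPALM's actual code, and the "device" is just a string held in memory.

```python
import difflib


class BaseDriver:
    """Hypothetical base class that only documents what a driver must support."""

    def load_candidate(self, config):
        raise NotImplementedError

    def compare_config(self):
        raise NotImplementedError

    def commit_config(self):
        raise NotImplementedError

    def discard_config(self):
        raise NotImplementedError

    def rollback(self):
        raise NotImplementedError


class InMemoryDriver(BaseDriver):
    """Toy driver: the 'device' configuration is plain text kept in memory."""

    def __init__(self, running=""):
        self.running = running    # configuration currently active
        self.candidate = None     # proposed configuration, not yet applied
        self.previous = None      # last committed configuration, for rollback

    def load_candidate(self, config):
        self.candidate = config

    def compare_config(self):
        # Return a unified diff between running and candidate, or "" if
        # no candidate is loaded.
        if self.candidate is None:
            return ""
        diff = difflib.unified_diff(
            self.running.splitlines(),
            self.candidate.splitlines(),
            fromfile="running",
            tofile="candidate",
            lineterm="",
        )
        return "\n".join(diff)

    def commit_config(self):
        if self.candidate is None:
            return
        self.previous = self.running
        self.running = self.candidate
        self.candidate = None

    def discard_config(self):
        self.candidate = None

    def rollback(self):
        # Restore the configuration that was active before the last commit.
        if self.previous is not None:
            self.running, self.previous = self.previous, None
```

In this sketch an operator would call load_candidate, read the output of compare_config for peer review, and then choose between commit_config and discard_config, with rollback available after a commit that turns out to be wrong.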
I noticed in the documentation that merging configuration is supported. How do you manage conflicts and priority of nested data structures? [david] We try to make changes atomic. So if you make a change and trigger a conflict, or you are missing some data structure, or some configuration is invalid, the configuration won’t be applied and the user will get an error. For platforms where changes can’t be atomic we try to apply the configuration changes in bulk and revert immediately if there is an error. How does declarative modeling of network devices differ from general purpose operating systems and what unique challenges do they pose? [david] Lack of tooling like sed/awk/etc. Lots of state. Configuration is state itself and in most cases you can’t even reload it, which means you have to type the exact commands to go from state A to state B. It is like trying to configure the network stack of Linux with only the iproute2 tooling available. What are the most technically challenging aspects of managing different network hardware programmatically? [david] Inconsistencies and buggy code. Not even inconsistencies across different platforms but across minor revisions of the same platforms. Small API changes that are not backwards compatible, small differences in command output that break regular expressions, and APIs that break every second call. What are some of the most interesting or unusual uses of NAPALM that you have seen? [david] I have seen people replacing their SNMP-based monitoring system with napalm. I have myself built what you could call “immutable infrastructure for the network”. So for example, when you have to do a configuration change you don’t apply just that configuration change. What you do instead is compile a full configuration for the device and fully reload the state of the device. That ensures you are always in a known state. 
So if a user connects to the device and makes a change outside the change control system, because you are fully deploying state you can be certain that the manual change will be wiped out. So there is no way out of the automation. We also have validate functionality integrated into napalm. With this functionality you can define a desired state, for example that certain BGP neighbors have to be up and that I must be receiving N prefixes from them. Napalm can then read those rules, figure out which data to retrieve and validate that the retrieved data complies. I know some people use this state validation instead of the traditional time-series type of monitoring, where you keep retrieving data constantly and alert when you reach certain thresholds. I guess you could call this test-driven monitoring? [mircea] SNMP thing For someone who is interested in learning more about network management, what resources do you recommend? [david] networktocode.com has some resources and labs, and the Slack community behind the organization is very active as well. ipspace.net has some good resources as well. pynet.twb-tech.com is another great place to check for courses. O’Reilly has a book on Network Programmability and Automation which I haven’t read, but I know the authors are very good so I am confident the content will be of high quality. 
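The "test-driven monitoring" idea above, where you declare a desired state and check retrieved data against it, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the validate function, the ">=N" rule syntax and the sample data layout are all made up for this sketch and are not NAPALM's validation format or API.

```python
def validate(desired, actual):
    """Check retrieved data against a desired-state mapping.

    Rules assumed for this sketch only:
    - a nested dict is checked key by key (extra keys in `actual` are ignored);
    - a string such as ">=10" means the actual number must be at least 10;
    - any other value must match exactly.
    Returns a tuple (complies, problems).
    """
    problems = []

    def check(path, want, got):
        if isinstance(want, dict):
            if not isinstance(got, dict):
                problems.append(f"{path}: expected a mapping, got {got!r}")
                return
            for key, sub_want in want.items():
                if key not in got:
                    problems.append(f"{path}/{key}: missing")
                else:
                    check(f"{path}/{key}", sub_want, got[key])
        elif isinstance(want, str) and want.startswith(">="):
            if not isinstance(got, (int, float)) or got < float(want[2:]):
                problems.append(f"{path}: expected {want}, got {got!r}")
        elif want != got:
            problems.append(f"{path}: expected {want!r}, got {got!r}")

    check("", desired, actual)
    return (not problems, problems)


# Hypothetical desired state: this neighbor must be up and send >= 10 prefixes.
desired_state = {
    "bgp_neighbors": {
        "192.0.2.1": {"is_up": True, "received_prefixes": ">=10"},
    }
}

# Hypothetical data as it might come back from a getter.
retrieved = {
    "bgp_neighbors": {
        "192.0.2.1": {"is_up": True, "received_prefixes": 4, "uptime": 120},
    }
}

complies, problems = validate(desired_state, retrieved)
```

Here complies would be False because the neighbor sends fewer prefixes than required. Unlike threshold alerts on a time series, the check is a single pass/fail assertion about state.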
[mircea] I blog about NAPALM & generally networking and network automation on my personal space: mirceaulinic.net packetpushers.net Keep In Touch David @dbarrosop on LinkedIn, GitHub and Twitter Blog Mircea Blog @mirceaulinic on LinkedIn, GitHub and Twitter NAPALM @napalmauto on Twitter Documentation Picks Tobias The Twelve Networking Truths Falsehoods Programmers Believe About Networking David The fear saga VR Mircea Daily Zen Links Juniper Arista Paramiko netmiko Cisco IOS Vagrant Netconf Protocol BGP OSPF SNMP TCP IP ZTP (Zero Touch Provisioning) PXE (Preboot eXecution Environment) Boot SaltStack Ansible StackStorm Trigger NAPALM Logs OpenConfig NANOG YANG Data Plane NTP(network time protocol) SSH Networking Resources PacketPushers.net O’Reilly – Network Programmability and Automation networktocode.com ipspace.net pynet.twb-tech.com The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/9/2017 • 58 minutes, 9 seconds

Automat State Machines with Glyph Lefkowitz

Summary The venerable ‘if’ statement is a cornerstone of program flow and business logic, but sometimes it can grow unwieldy and lead to unmaintainable software. One alternative that can result in cleaner and easier to understand code is a state machine. This week Glyph explains how Automat was created and how it has been used to upgrade portions of the Twisted project. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Glyph about Automat, a library that provides self-service finite-state machines for the programmer on the go. Interview Introductions How did you get introduced to Python? What is a state machine and when might you want to use one? There are a number of libraries available on PyPI that facilitate the creation of state machines. Why did you feel the need to build a new option and how does it differ from what was already available? Why do you think developers fall into the trap of complicated conditional structures rather than reaching for a state machine? For someone who wants to integrate Automat into their project how would they go about that and what are some of the gotchas that they should keep in mind? What do the internals of Automat look like and how did you approach the overall design of the project? 
What are some of the more difficult aspects of designing and implementing state machines properly? What are some of the technical hurdles that you have been faced with in the process of building a library for implementing state machines? What do you have planned for the future of Automat? What are some of the most interesting use cases of Automat that you have seen? Keep In Touch Email @glyph on Twitter Glyph on GitHub Picks Tobias Commercial Electric color changing LED puck lights Glyph OmniFocus GTD Links Automat Glyph Interview About Software Ethics Finite State Automaton Yacc Bison Flex Parser Generator PyPI State Machine Pure Mealy Machine Moore Machine Mealy vs. Moore Machines Leaky Abstraction The Law of Leaky Abstraction Twisted Python Descriptor GraphViz Hypothesis PyCon Talk – TLS State Machine The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
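For listeners who have not met state machines before, the following sketch shows the general idea in plain Python: every (state, input) pair is listed in one table instead of being spread across nested ‘if’ statements. It does not use Automat and does not reflect Automat's API; the turnstile example and all names in it are invented for illustration. Because the output depends on both the current state and the input, it is a small Mealy-style machine.

```python
class InvalidTransition(Exception):
    """Raised when an input is not defined for the current state."""


# (current state, input) -> (next state, output)
TRANSITIONS = {
    ("locked", "coin"): ("unlocked", "unlock the arm"),
    ("locked", "push"): ("locked", "stay locked"),
    ("unlocked", "push"): ("locked", "lock the arm"),
    ("unlocked", "coin"): ("unlocked", "return the coin"),
}


class Turnstile:
    def __init__(self):
        self.state = "locked"

    def handle(self, event):
        # Look up the transition; an unknown pair is an explicit error
        # rather than a silently ignored branch.
        try:
            self.state, output = TRANSITIONS[(self.state, event)]
        except KeyError:
            raise InvalidTransition(
                f"{event!r} is not valid in state {self.state!r}"
            ) from None
        return output
```

The table makes it easy to see at a glance which combinations are handled, and an unexpected input fails loudly instead of falling through a forgotten ‘else’ branch.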

7/2/2017 • 49 minutes, 27 seconds

Nuclear Engineering with Dr. Katy Huff

Summary Access to affordable and consistent electricity is one of the big challenges facing our modern society. Nuclear energy is one answer because of its reliable output and carbon-free operation. To make this energy accessible to a larger portion of the global population further research and innovation in reactor design and fuel sources are necessary, and that is where Python can help. This week Dr. Katy Huff talks about the research that she is doing, the problems facing the nuclear industry, and how she uses Python to make it happen. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Dr. Katy Huff about using Python for nuclear engineering Interview Introductions How did you get introduced to Python? Can you start by explaining what nuclear engineering is and give some examples of current research in the field? The most widely used and recognized form of nuclear plant is the light water reactor, which, to my understanding, is also the most susceptible to melt-downs and release of radioactive material carried by escaped steam. What are some of the reactor types that are currently being researched to improve safety and efficiency? 
One of the major policy and logistics issues regarding nuclear power plants is the problem of how to handle spent fuel rods. What are some of the methods that are being researched to solve this problem? In your PyCon presentation you mentioned the Cyclus and PyNE projects as tools that you use in your research. Can you provide a brief overview of each and explain how you use them? What are some of the most pressing issues in nuclear engineering and how are you leveraging Python to help with addressing them? How does open source software relate to open science, and how do they impact the ways that research is performed? What are some of the current or future developments in nuclear engineering that you are most excited about? Keep In Touch Website Twitter Research Picks Tobias Ryobi Tools Katy Atomic Awakening Atomic Accidents Atomic Adventures Links Plasma Nuclear Energy Thorium Uranium Molten Salt Reactor Spent fuel rods Yucca Mountain Nuclear Fuel Reprocessing Sodium Cooled Fast Reactor PyCon Keynote PyNE Cyclus Anthony Scopatz Moose Framework Partial Differential Equations REPL (Read Eval Print Loop) Stellarator Toroidal Fusion Device Journal of Open Source Software (JOSS) American Nuclear Society NEI IAEA The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/24/2017 • 38 minutes, 15 seconds

Industrial Automation with Jonas Neubert

Summary We all use items that are produced in factories, but do you ever stop to think about the code that powers that production? This week Jonas Neubert takes us behind the scenes and talks about the systems and software that power modern facilities, the development workflows, and how Python gets used to tie everything together. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Jonas Neubert about using Python for industrial automation Interview Introductions How did you get introduced to Python? How did you get involved in factory automation? What are some of the technical challenges that are unique to a factory environment and the physical computing needs associated with it? When developing new capabilities for your factory, how do you manage proper testing of your software given the need to interoperate with the hardware? Which languages are most frequently used for command and control of industrial systems and how does Python interface with them? How do you manage the problem of interfacing with the various different protocols and data formats that are presented by the different hardware instruments? In your PyCon presentation you commented on the fact that security in industrial automation systems is lacking. 
What are some of the most common issues that you have seen? Why is it that security is such an issue in industrial systems? How are production releases of your software managed and how does it differ from other types of products such as web applications? Aside from manufacturing facilities, what are some other types of environments or industries that require similar levels of hardware automation? What are some of the most interesting or challenging projects that you have worked on? What are some of the packages on PyPI that you find most useful in your day-to-day work? For someone who wants to get involved in industrial automation what kind of experience should they have and what are some of the resources that you recommend? What are some of the innovations in industrial automation that you are most excited about? Keep In Touch @jonemo on Twitter Website Jobs at Tempo Automation Picks Tobias Opeth Jonas Pycon 2017 Talks Eric Evenchick – Hacking Cars with Python Building a wireless speedometer with MicroPython Python from space by Katherine Scott Łukasz Langa – Unicode what is the big deal Morgan Wahl – Text is More Complicated Than You Think Comparing and Sorting Unicode The Prepared Newsletter by Spencer Wright Long Distance Amtrak rides! Links Tempo Automation Palm webOS Infinion Technologies DRAM Service Oriented Architecture Singleton Light Curtain Factory Acceptance Testing Site Acceptance Testing Testing Pyramid Protocol Analyzer Multimeter GCode IEC-61131 Pascal Ladder Logic OPC Standards OPC DA C# Factory Control Systems Stuxnet Industroyer IEC 61850 Industrial Internet of Things Counsyl PySerial FactoryBoy Parameterized Freezegun Struct XMLRPC Factory Tours How It’s Made McMaster.com Mass Customization Life Sciences CRISPR PyCon – Reprogramming the human genome Transcriptic Autodesk Life Sciences The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/18/2017 • 1 hour, 2 minutes, 6 seconds

Jedi Code Completion with David Halter

Summary When you’re writing python code and your editor offers some suggestions, where does that suggestion come from? The most likely answer is Jedi! This week David Halter explains the history of how the Jedi auto completion library was created, how it works under the hood, and where he plans on taking it. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing David Halter about Jedi, an awesome autocompletion and static analysis library for Python Interview Introductions How did you get introduced to Python? Can you explain what Jedi is and what problem you were trying to solve when you created it? What is the story behind the name? While reading through the documentation I noticed that there is alpha support for linting with Jedi. Can you compare the linting approach and capabilities with those found in other tools such as pylint and flake8? What does the internal architecture and design look like? From the research that I did for the show it seems that, rather than use the AST to determine the structure of the code being completed you built your own parser and recursive evaluation of the other methods that you use for determining accurate completion? What was lacking in existing parsers that led you to build your own? 
What are some of the difficulties that you have encountered building and maintaining the grammar definitions and higher level API for parsing multiple versions of Python, including the 2 vs 3 split? What are some of the biggest challenges associated with introspecting user code? What are some of the ways that Jedi can be confounded by a user’s project? What are some of the most difficult technical hurdles that you have been faced with while building Jedi? What are some unusual or unexpected uses of Jedi that you have seen? What do you have planned for the future of Jedi? Keep In Touch davidhalter on GitHub @jedidjahch on Twitter Picks Tobias Patch utility David Bears Den Soccer Singing Dancing DocOpt OpenStack Links Cloudscale.ch Vim Youcompleteme Neocomplete pyflakes pycodestyle pylint Parser Generator Parser Error Recovery lib2to3 Python grammar file Finite state automata Type inference yapf AST module MyPy IPython The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/11/2017 • 42 minutes, 55 seconds

Coconut with Evan Hubinger

Summary Functional programming is gaining in popularity as we move to an increasingly parallel world. Sometimes you want access to purely functional syntax and capabilities but you don’t want to have to learn an entirely new language. Coconut is here to help! This week Evan Hubinger explains how Coconut is a functional language that compiles to Python and can be mixed and matched with the rest of your program. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Evan Hubinger about Coconut, a functional language implemented as a superset of Python Interview Introductions How did you get introduced to Python? Can you start by explaining what Coconut is and what problem you were trying to solve when you created it? Where did the name come from? How is Coconut implemented and what does the compilation process for Coconut code look like? How will I be able to debug my Python if I’m not the one writing it? The documentation mentions that Coconut itself is compatible with both Python 2 and 3, are there any caveats to be aware of in terms of mixing in standard Python syntax? 
Are there any performance optimizations that you have had to perform in order to make things like recursion and pattern matching work at reasonable speeds in the Python VM? Which functional languages have you taken inspiration from during the creation of Coconut? What are some of the most interesting or unexpected uses of Coconut that you have seen? What are some resources that you recommend for people who are interested in learning more about functional programming? Keep In Touch Coconut Website GitHub Tutorial Documentation FAQ Chat room Evan GitHub LinkedIn Picks Tobias ElementTree Evan pyparsing is an awesome PyPI package you should check out The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/4/2017 • 33 minutes, 31 seconds

Cauldron with Scott Ernst

Summary The notebook format that has been exemplified by the IPython/Jupyter project has gained in popularity among data scientists. While the existing formats have proven their value, they are still susceptible to difficulties in collaboration and maintainability. Scott Ernst created the Cauldron notebook to be testable, production ready, and friendly to version control. This week we explore the capabilities, use cases, and architecture of Cauldron and how you can start using it today! Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Scott Ernst about Cauldron, a new notebook format built with software engineering best practices in mind. Interview Introductions How did you get introduced to Python? Can you start by explaining what Cauldron is and what problem you were trying to solve when you created it? In the documentation it mentions that you can use any editor for creating the content of the notebook. Can you describe a typical workflow of authoring the various files and cells and viewing the output? How does Cauldron compare to the Jupyter notebook format and what factors would lead someone to choose one over the other? Does Cauldron support running languages other than Python? 
If not then what would be involved in adding that capability? Cauldron notebooks support unit tests of individual cells. How does that process work and what are the limitations? The option for running the notebook in the context of a task workflow tool appears to be a powerful capability. What are some of the considerations that are necessary when writing a notebook to be run in that manner? What are some of the most interesting or unexpected projects that you have seen people using Cauldron for? What do you have planned for the future of Cauldron? Keep In Touch @swernst on Twitter Website Picks Tobias Tiffany Aching Adventures Scott Apache Big Data Conference Links When I Work IPython Interview Spark R2Py Bokeh Website Podcast.init Interview Luigi Airflow Website Podcast.init Interview Digital Paleontology A16 Project The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/28/2017 • 37 minutes, 51 seconds

Tech Debt and Refactoring at Yelp! with Andrew Mason

Summary Healthy code makes for happy coders, and there are many ways to measure the health of a project. This week Andrew Mason talks about the Undebt project from Yelp!, as well as some of the other tools and practices that have been developed to make sure that the balance on their technical debt card stays low. Give it a listen to learn how and why to measure and address the painful parts of your software. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Andrew Mason about technical debt and refactoring with Undebt. Interview Introductions How did you get introduced to Python? How do you define technical debt and why is it an important aspect of a project to keep track of? How would you characterize refactoring in general and when you might want to do it? What is Undebt and what was the problem that you were facing at Yelp when it was created? For someone who wants to get started with using Undebt what does that process look like and how does it work under the covers? What are some of the other tools and techniques available for refactoring Python code and how do they differ from what is possible in Undebt? What are some of the other tools and methods that you use to maintain the overall health of your codebase? 
What are some of the limitations and edge cases that you have experienced working with Undebt? It is often a difficult balancing act when working in a team to determine how much time to spend paying down technical debt and building tools that will act as force multipliers vs doing feature work that will be visible to end-users. In your experience, what are some ways to manage that tension? Keep In Touch Andrew GitHub Website @andrew_mason1 on Twitter Picks Tobias Continuous Delivery by Jez Humble and David Farley Andrew XI Editor The Circle by Dave Eggers Links Martin Fowler “Uncle” Bob Martin git-code-debt Undebt PyParsing Podcast.init Episode About Parsing Rope Pre-Commit PyLint The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/20/2017 • 34 minutes, 26 seconds

LBRY with Jeremy Kauffman

Summary Content discovery and delivery and how it works in the digital realm is one of the most critical pieces of our modern economy. The blockchain is one of the most disruptive and transformative technologies to arrive in recent years. This week Jeremy Kauffman explains how the company and platform of LBRY are combining the two in an attempt to redefine how content creators and consumers interact by creating a new distributed marketplace for all kinds of media. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Jeremy Kauffman about LBRY, a new marketplace for media built on peer to peer storage and blockchain technologies. Interview Introductions How did you get introduced to Python? What is LBRY and how did the idea for it get started? What, if any, mechanisms are there for content owners to address piracy? Is the LBRY blockchain purpose built for the protocol and application or is it using something like Ethereum under the covers? In order to support a large scale distributed marketplace, the crypto coin that you are using will need to be able to support large transaction volumes so how have you architected it in order to achieve that capability? 
What technologies are you leveraging to facilitate the content distribution mechanism? One of the current problems with Bitcoin mining is that as the complexity of the proofs has increased and dedicated operations have moved to ASICs it has become less feasible for an individual to take part. Is there any provision for that situation built into the LBRY blockchain or does it not matter due to the capabilities for individual users to earn coins by participating as part of the storage network? What led to the decision to use Python for the initial implementation? For people who are participating in the LBRY network, what is the mechanism for them to convert their earned LBC into fiat currency? How much of the overall LBRY stack is using Python and what other languages are you taking advantage of? What is the business plan for LBRY the company and what do you have planned for the future of LBRY? Keep In Touch Jeremy @jeremykauffman on Twitter Email LBRY Website @LBRYio on Twitter Picks Tobias Neurotribes Jeremy Crystals and Mud in Property Law Links LBRY BitTorrent BitCoin Blockchain Distributed Hash Tables The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/14/2017 • 39 minutes, 39 seconds

Python Goes To The Movies with Dhruv Govil

Summary Movies are magic, and Python is part of what makes that magic possible. We go behind the curtain this week with Dhruv Govil to learn about how Python gets used to bring a movie from concept to completion. He shares the story of how he got started in film, the tools that he uses day to day, and some resources for further learning. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and this week I am joined by Dhruv Govil to talk about how Python is used for making movies. Interview Introductions How did you get introduced to Python? How did you get started in the film-making business? What are some of the ways that Python is used in the process of bringing a movie to completion? How much of the overall pipeline processing happens in Python vs just being used as a means of wiring together other programs? How much of the code that gets written is reusable between different projects? What is involved in testing data assets when they are submitted to the pipeline for the open format conversion process? What are some of the libraries that you have found to be most useful in your day-to-day work? Why do you think that Python is so widely used in the film industry and are there any other languages that you see being used in a similar manner? 
What are some of the areas where Python is used that you were most surprised by? Are there any portions of the process where you would like to be able to use Python but are unable due to performance or platform constraints? What are some of the most interesting projects that you have worked on and which are you most proud of? How does the work that is done by developers and technical contributors get reflected in the final credits? For anyone who is interested in working in the film industry as a technical contributor what advice do you have? Keep In Touch Dhruv Website @DhruvGovil on Twitter dgovil on GitHub Picks Tobias Firefox on Android Dhruv Google Earth VR Links Udemy: Python for Maya Vancouver Film School Guardians of the Galaxy Cloudy with a Chance of Meatballs 2 Blog Post: Python For Feature Film PyQT PySide Autodesk Maya Katana Nuke Cython Rez Alembic Geometry Storage Format Pixar Universal Scene Description Pyblish Open Color IO Edge of Tomorrow PyOpenGL Kraken Fabric Engine SIGGRAPH Convention Ray Tracing In A Weekend Mathematics for Computer Graphics Blender The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/6/2017 • 41 minutes, 41 seconds

Scapy with Guillaume Valadon

Summary Network protocols are often inscrutable, but if you have an effective way to experiment with them then they expose a lot of power. This week Guillaume Valadon explains how Scapy can be used to inspect your network traffic, test the security of your systems, and develop brand new protocols, all in Python! Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Get a shirt and support the show! Go to https://teespring.com/podcastinit?utm_source=rss&utm_medium=rss and get a mug to go with it. Your host as usual is Tobias Macey and today I am interviewing Guillaume Valadon about Scapy, the Swiss Army knife for packet manipulation in Python Interview Introductions How did you get introduced to Python? Can you explain what Scapy is and what problem it was created to solve? How has the decision to build Scapy in Python benefited the project? How has the 10 year history of the project affected your ability to maintain and evolve the code? How has the project evolved from the initial prototypes by Philippe Biondi through to its current incarnation as Scapy 2? I understand that the project was originally hosted on Bitbucket and then moved to GitHub. What prompted that decision and how has it played out? 
Who is the target audience and what are some of the primary intended use cases for Scapy? How is the implementation of packet layering architected in order to allow for such flexibility and composability? What are some of the most interesting and unexpected ways that you have seen Scapy used? What protocols have been the most problematic to implement and maintain? What have been some of the most challenging aspects of developing Scapy? What do you have planned for the future of Scapy? Contact Info Guillaume Website Email @guedou on Twitter Picks Tobias Buckethead Guillaume Rust Links Six UTScapy CodeCov Appveyor Jython OpenBSD MicroPython NSA Extra Bacon SNMP ASN.1 X509 TLS IPSec DNS HTTP2 PEP8 Scapy 3 The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/29/2017 • 31 minutes, 58 seconds

yt-project with Nathan Goldbaum and John Zuhone

Summary Astrophysics and cosmology are fields that require working with complex multidimensional data to simulate the workings of our universe. The yt project was created to make working with this data and providing useful visualizations easy and fun. This week Nathan Goldbaum and John Zuhone share the story of how yt got started, how it works, and how it is being used right now. Announcements The Open Data Science Conference is coming to Boston May 3rd-5th. Get your ticket now so you don’t miss out on your chance to learn more about the state of the art for data science and data engineering. Now you can get T-shirts, sweatshirts, mugs, and a tote bag to let the world know about Podcast.init, and you can support the show at the same time! Go to teespring.com/podcastinit and load up! Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I’m interviewing Nathan Goldbaum and John Zuhone about the YT project for multi-dimensional data analysis. Interview Introductions How did you get introduced to Python? What is yt and how did it get started? Where does the name come from? How does yt compare to other projects such as AstroPy for astronomical data analysis? What are the domains in which yt is most widely used? 
One of the main use cases of yt is for visualizing multidimensional data. What are some of the design challenges in trying to represent such complicated domains via a visual model? Some of the sample datasets for the examples are rather large. What are some of the biggest challenges associated with running analyses on such substantial amounts of information? How has the project evolved and what are some of the biggest challenges that it is facing going forward? Contact Nathan @njgoldbaum on Twitter John @astrojaz on Twitter Picks Tobias Scout2 Nathan The Expanse Novels John Visual Studio Code Links HDF5Py Matt Turk Seismodome Computational Fluid Dynamics AstroPy Website Podcast Interview SymPy Website Podcast Interview Magnetohydrodynamics Numerical Relativistic Hydrodynamics MPI4Py Matplotlib The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/22/2017 • 38 minutes, 9 seconds

Scikit-Image with Stefan van der Walt and Juan Nunez-Iglesias

Summary Computer vision is a complex field that spans industries with varying needs and implementations. Scikit-Image is a library that provides tools and techniques for people working in the sciences to process the visual data that is critical to their research. This week Stefan van der Walt and Juan Nunez-Iglesias, co-authors of Elegant SciPy, talk about how the project got started, how it works, and how they are using it to power their experiments. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media. Your host as usual is Tobias Macey and today I am interviewing Stefan van der Walt and Juan Nunez-Iglesias, co-authors of Elegant SciPy, about scikit-image Interview Introduction How did you get introduced to Python? What is scikit-image and how did the project get started? How does its focus differ from projects like SimpleCV/OpenCV or Pillow? What are some of the common use cases for which the scikit-image package is typically employed? What are some of the ways in which images can exhibit higher dimensionality and what are some of the kinds of operations that scikit-image can perform in those situations? How is scikit-image designed and what are some of the biggest challenges associated with its development, whether in the past, present, or future? 
What are some of the most interesting use cases for scikit-image that you have seen? What do you have planned for the future of scikit-image? Contact Information Stefan Email @stefanvdwalt on Twitter Website Juan Email @jnuneziglesias on Twitter Website jni on GitHub Picks Tobias Set Stefan Monkey Island Thimbleweed Park Aqua Notes Juan Matilda the Musical Water Rower Rowing Machine Bored Elon Musk OMG: “News app that connects to a blood pressure monitor and adjusts your feed accordingly.” Links scikits.appspot.com Sphinx Gallery SciPy Conference Minimum Cost Paths Image Stitching Tutorial Elegant SciPy The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/16/2017 • 41 minutes, 53 seconds

Oscar Ecommerce with David Winterbottom and Michael van Tellingen

Summary If you have a product to sell, whether it is a physical good or a subscription service, then you need a way to manage your transactions. The Oscar ecommerce framework for Django is a flexible, extensible, and well built way for you to add that functionality to your website. This week David Winterbottom and Michael van Tellingen talk about how the project got started, how it works under the covers, and how you can start using it today. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers. Before we start the show I have a couple of announcements. I started a new Slack channel for guests and listeners of the show. Go to www.pythonpodcast.com/slack?utm_source=rss&utm_medium=rss to join in the conversation! If you are interested in how open source powers innovations in data then you should check out the Open Source Data Science conference. It is coming to Boston, Massachusetts on March 3rd through the 5th so don’t miss out on your chance to level up and meet some new friends! Your host as usual is Tobias Macey and today I’m interviewing David Winterbottom and Michael van Tellingen about the Oscar framework for building ecommerce applications in Django. Interview Introductions How did you get introduced to Python? What is Oscar and what problem were you trying to solve when you created it? 
At face value ecommerce seems like a fairly straightforward problem domain but there is a lot of incidental complexity involved. What are some of the most challenging aspects of building and managing a web store? The documentation states in a number of places that Oscar takes a ‘domain driven’ approach to building ecommerce applications. Can you explain what you mean by that and how it manifests in the project? What does the internal design of Oscar look like and how would someone get started with building a site with it? There can be a benefit to having an opinionated approach when building a framework because it simplifies the implementation for the user. What is the reasoning for choosing to expose and allow for complexity in Oscar? What are some of the most interesting and unexpected projects that you have seen built with Oscar? How has ecommerce changed in the time since Oscar was first created, and how has that impacted its evolution? What is in store for the future of Oscar? Contact David Website @codeinthehole on Twitter GitHub Michael Website Picks David Destroy All Software by Gary Bernhardt Michael PyCharm Zeep (SOAP Library) Links Shopify Tangent Domain Driven Design by Eric Evans (book) Entity, Attribute, Value Pattern Home Assistant Interview Spree Commerce Magento Saleor Wagtail Wagtail Interview Django CMS Kivy Garden Awesome Wagtail SaltStack Formulas Pelican Plugins DjangoPackages.org Django Treebeard TDD (Test Driven Development) The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/8/2017 • 53 minutes, 37 seconds

Duplicity with Kenneth Loafman

Summary Everyone who uses a computer on a regular basis knows the importance of backups. Duplicity is one of the most widely used backup technologies, and it’s written in Python! This week Kenneth Loafman shares how Duplicity got started, how it works, and why you should be using it every day. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Your host as usual is Tobias Macey and today I’m interviewing Kenneth Loafman about Duplicity, the Python based backup tool Interview Introduction How did you get introduced to Python? Can you share some of the history of Duplicity? What is duplicity and how does it differ from other available backup tools? Many backup solutions are written in Java or lower level languages such as C, what is the motivation for using Python as the language for implementing Duplicity? At face value backing up files seems like a straightforward task but there is a lot of incidental complexity. Can you describe the architecture and internals of Duplicity that allow for it to handle a wide variety of use cases? It has been shown in a number of contexts that people will generally use the default settings, so by forcing people to opt out of encrypting their backups you are promoting security best practices in Duplicity. 
Why is it so important to have the archive encrypted, even if the storage medium is fully under the control of the person doing the backup? Given that backups need to be highly reliable what are the steps that you take during the development process to ensure that there are no regressions? What mechanisms are built into duplicity to prevent data corruption? What are some of the most difficult or complex aspects of the problem space that Duplicity is dealing with? I noticed that you have a proposal for a new archive format to replace Tar. Can you describe the motivation for that and the design choices that have been made? Contact Kenneth Loafman Email @FirstPrime on Twitter Picks Tobias Passengers Kenneth NCIS Plan 9 From Outer Space Links rsync librsync deja-dup duply ECC duplicity The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/1/2017 • 35 minutes, 16 seconds

Digital Identity, Privacy, and Security with Brian Warner

Summary As the internet and digital technologies continue to infiltrate our way of life, we are forced to consider how our concepts of identity and security are reflected in these spaces. Brian Warner joins me this week to discuss the privacy-focused projects that he has worked on, including Tahoe LAFS, Firefox Sync, and Magic Wormhole. He also has some intriguing ideas about how we can replace passwords and what it means to have an online identity. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Your host as usual is Tobias Macey and today I’m interviewing Brian Warner about digital identity, privacy, and security Interview Please introduce yourself How did you get introduced to Python? How did you get involved in the area of cryptography and digital privacy? You have created or made significant contributions to a number of projects that are focused on making secure communications and storage more accessible, including Tahoe LAFS (Least Authority File System), Magic Wormhole, and Petmail. Can you provide a brief overview of these projects and any others that you would like to mention? What problem were you trying to solve when you created or began contributing to each of them and how satisfied are you with their current state? What have you found to be the biggest barriers to adoption for these projects? 
How do Tahoe and Magic Wormhole benefit an average user and what are your plans for their future development? One of the most ubiquitous issues with our modern security infrastructure leading to compromise is the humble password. What are some technologies that you foresee replacing the need for passwords? As technologists we are fairly well aware of the weaknesses in the systems that we use day-to-day. How can we make digital privacy and security more accessible? Contact Info warner on GitHub @lotharrr on Twitter Picks Brian Ra on Things of Interest The Golden Age by John C. Wright Links Tor Petmail (v1, ca 2003) Petmail (new) Mojo Nation Tahoe-LAFS Magic-Wormhole Erasure Coding Firefox Sync JPAKE SPAKE2 PyCon 2016 Presentation on Magic Wormhole (video) (slides) Versioneer Keybase File System Least Authority Enterprises Foolscap SpiderOak Object Capability Pattern Shamir’s Secret Sharing AutoCrypt Signal WhatsApp Simply Secure

3/25/2017 • 46 minutes, 43 seconds

Crossbar.io with Tobias Oberstein and Alexander Gödde

Summary As our system architectures and the Internet of Things continue to push us towards distributed logic we need a way to route the traffic between those various components. Crossbar.io is the original implementation of the Web Application Messaging Protocol (WAMP) which combines Remote Procedure Calls (RPC) with Publish/Subscribe (PubSub) communication patterns into a single communication layer. In this episode Tobias Oberstein describes the use cases and design patterns that become possible when you have event-based RPC in a high-throughput and low-latency system. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Your host as usual is Tobias Macey and today I’m interviewing Tobias Oberstein and Alexander Gödde about Crossbar.io, a high throughput asynchronous router for the WAMP protocol Interview Introductions How did you get introduced to Python? What is Crossbar and what is the problem that you were trying to solve when you created it? What is the status of the IETF WAMP protocol proposal? Why have an open protocol – and how do you see the ecosystem? Python isn’t typically considered to be a high-performance language so what led you to use it for building Crossbar? How is Crossbar architected for proxying requests from a highly distributed set of clients with low latency and high throughput? 
How do you handle authorization between the various clients of the router so that potentially sensitive messages don’t get published to the wrong component? Does Crossbar encapsulate any business logic or is that all pushed to the edges of the system? What are some of the typical kinds of applications that Crossbar is designed for? What are some common design paradigms that would be better suited for a WAMP implementation? What are some of the most interesting or surprising uses of Crossbar that you have seen? What do you have planned for the future of Crossbar? Keep In Touch Mailing Lists https://groups.google.com/forum/#!forum/autobahnws https://groups.google.com/forum/#!forum/wampws https://groups.google.com/forum/#!forum/crossbario #autobahn on IRC Picks Tobias Logan Alex Pivotal Tracker Tobias PyPy Brian Warner Click prompt-toolkit Links Autobahn WAMP PyPy API Gateway The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
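
The summary for this episode describes WAMP as combining Remote Procedure Calls and Publish/Subscribe in a single communication layer. As a rough illustration of that idea only, the sketch below keeps both patterns behind one in-process object. The class name ToyRouter and the com.example.* URIs are hypothetical, invented for this sketch; this is not the Crossbar.io or Autobahn API, and it involves no networking, serialization, or authorization.

```python
from collections import defaultdict


class ToyRouter:
    """In-process stand-in for a router (hypothetical, not Crossbar.io)."""

    def __init__(self):
        self._procedures = {}                  # URI -> callable (RPC side)
        self._subscribers = defaultdict(list)  # topic -> handlers (PubSub side)

    def register(self, uri, func):
        """A callee offers a procedure under a URI."""
        if uri in self._procedures:
            raise ValueError(f"procedure already registered: {uri}")
        self._procedures[uri] = func

    def call(self, uri, *args, **kwargs):
        """A caller invokes a registered procedure and gets its result."""
        return self._procedures[uri](*args, **kwargs)

    def subscribe(self, topic, handler):
        """A subscriber asks to receive events for a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, *args, **kwargs):
        """A publisher sends an event; returns how many handlers received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(*args, **kwargs)
        return len(handlers)


router = ToyRouter()
received = []
router.subscribe("com.example.on_reading", received.append)


def store_reading(value):
    # A procedure can publish an event as a side effect of being called.
    router.publish("com.example.on_reading", value)
    return value * 2


router.register("com.example.store_reading", store_reading)
result = router.call("com.example.store_reading", 21)
```

Because calls and events pass through the same object, a single component can act as caller, callee, publisher, and subscriber at once, which is the design point the summary attributes to WAMP.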

3/18/2017 • 52 minutes, 47 seconds

MetPy: Taming The Weather With Python

Summary What’s the weather tomorrow? That’s the question that meteorologists are always trying to get better at answering. This week the developers of MetPy discuss how their project is used in that quest and the challenges that are inherent in atmospheric and weather research. It is a fascinating look at dealing with uncertainty and using messy, multidimensional data to model a massively complex system. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Your host as usual is Tobias Macey and today I’m interviewing Ryan May, Sean Arms, and John Leeman about MetPy, a collection of tools and notebooks for analyzing meteorological data in Python. Interview Introductions How did you get introduced to Python? What is MetPy and what is the problem that prompted you to create it? Can you explain the problem domain for Meteorology and how it compares to other domains such as the physical sciences? How do you deal with the inherent uncertainty of atmospheric and weather data? What are some of the data sources and data formats that a meteorologist works with? To what degree is machine learning or artificial intelligence employed when modelling climate and local weather patterns? The MetPy documentation has a number of examples of how to use the library and a number of them produce some fairly complex plots and graphs. 
How prevalent is the need to interact with meteorological data visually to properly understand what it is trying to tell you? I read through your developer guide and watched your SciPy talk about development automation in MetPy. My understanding is that individuals with a pure science background tend to eschew formal code styles and software engineering practices so I’m curious what your experience has been when interacting with your user community. What are some of the interesting innovations in weather science that you are looking forward to? Keep In Touch MetPy @MetPy on Twitter Documentation GitHub Ryan @dopplershift on Twitter dopplershift on GitHub Picks Tobias Drill To Detail Podcast Data Capital Episode Ryan pytest-mpl Sean Trolls John Embedded.fm Links Unidata University of Oklahoma – College of Atmospheric and Geographic Sciences University Corporation for Atmospheric Research NetCDF GEMPACK XArray The Climate Corporation GOES-16 LDM Goes16 on Twitter Don’t Panic Geocast The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/11/2017 • 52 minutes, 22 seconds

The Update Framework: Securing Your Software Updates with Justin Cappos

Summary If you write software then there’s a good probability that you have had to deal with installing dependencies, but did you stop to ask whether you’re installing what you think you are? My guest this week is Professor Justin Cappos from the Secure Systems Lab at New York University and he joined me to discuss his work on The Update Framework which was built to guarantee that you never install a compromised package in your systems. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Your host as usual is Tobias Macey and today I’m interviewing Justin Cappos about The Update Framework, an open spec and reference implementation for mitigating attacks on software update systems. Interview Introduction How did you first get introduced to Python? Please start by explaining what The Update Framework (TUF) is and the problem that you were trying to solve when you created it. How is TUF architected and what led you to choose Python for the reference implementation? TUF addresses the problem of ensuring that the packages that get installed are created by the right developers, but how do you properly establish trust in the first place? Why are consistent and auditable dependencies important for the security of a system and how does TUF help with that goal? 
What are some of the known attack vectors for a software update system and how do Python and other systems attempt to mitigate these vulnerabilities? One of the perennial problems with any dependency management system is that of transitive dependencies. How does TUF handle this extra complexity of ensuring that all of the secondary, tertiary, etc. dependencies are also properly pinned and trusted? For someone who wants to start using TUF what are the steps to get it set up with pip? How would a project that wants to use TUF go about doing so? Who is using TUF and when will it be used with PyPI? Keep In Touch https://ssl.engineering.nyu.edu/ https://ssl.engineering.nyu.edu/personalpages/jcappos/ Picks Tobias The Enchanted Forest Chronicles Justin Hand Pulled Noodles Lam Zhou Links When the Going Gets Tough, Get TUF Going – PyCon 2016 RPM Apt Stork Package Manager Yubikey Distribution Packages Considered Insecure Notary Flynn Uptane in-toto The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/4/2017 • 37 minutes, 21 seconds

Pandas with Jeff Reback

Summary Pandas is one of the most versatile and widely used tools for data manipulation and analysis in the Python ecosystem. This week Jeff Reback explains why that is, how you can use it to make your life easier, and what you can look forward to in the months to come. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. When you’re writing Python you need a powerful editor to automate routine tasks, maintain effective development practices, and simplify challenging things like refactoring. Our sponsor JetBrains delivers the perfect solution for you in the form of PyCharm, providing a complete set of tools for productive Python, Web, Data Analysis and Scientific development, available in 2 editions. The free and open-source PyCharm Community Edition is perfect for pure Python coding. PyCharm Professional Edition is a full-fledged tool, designed for professional Python, Web and Data Analysis developers. Today JetBrains is offering a 3-month free PyCharm Professional Edition individual subscription. Don’t miss this chance to use the best-in-class tool with intelligent code completion, automated testing, and integration with modern tools like Docker – go to www.pythonpodcast.com/pycharm and use the promo code podcastinit during checkout. Visit the site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers. Your host as usual is Tobias Macey and today I’m interviewing Jeff Reback about Pandas, the Swiss Army knife of data analysis in Python. Interview Introductions How did you get introduced to Python? To start off, what is Pandas and what is its origin story? How did you get involved in the project’s development? For someone who is just getting started with Pandas what are the fundamental ideas and abstractions in the library that are necessary to understand how to use it for working with data? Pandas has quite an extensive API and I noticed that the most recent release includes a nice cheat sheet. How do you balance the power and flexibility of such an expressive API with the usability issues that can be introduced by having so many options of how to manipulate the data? There is a strong focus on use in science and data analytics, but there are a number of other areas where Pandas is useful as well. What are some of the most interesting or unexpected uses that you have seen or heard of? What are some of the biggest challenges that you have encountered while working on Pandas? Do you find the constraint of only supporting two dimensional arrays to be limiting, or has it proven to be beneficial for the success of pandas? What’s coming for pandas? Pandas 2.0! Keep In Touch @jreback on Twitter jreback on GitHub Picks Tobias http://standards.mousepawgames.com/index.html Jeff Travis CI Appveyor Circle CI Links Continuum Analytics Myths Programmers Believe About Time Jupyter Notebook XArray Dask Website Interview NumFocus PyLint Interview The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/26/2017 • 49 minutes, 22 seconds

PyTables with Francesc Alted

Summary HDF5 is a file format that supports fast and space efficient analysis of large datasets. PyTables is a project that wraps and expands on the capabilities of HDF5 to make it easy to integrate with the larger Python data ecosystem. Francesc Alted explains how the project got started, how it works, and how it can be used for creating sharable and archivable data sets. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Linode has announced new plans, including a 1GB for $5 plan, high memory plans starting at 16GB for $60/mo and an upgrade in storage from 24GB to 30GB on their 2GB for $10 plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers. Your host as usual is Tobias Macey and today I’m interviewing Francesc Alted about PyTables. Interview Introductions How did you get introduced to Python? To start with, what is HDF5 and what was the problem that motivated you to wrap Python around it to create PyTables? Who are the most significant contributors to PyTables? How have you interacted with them? How is the project architected and what are some of the design decisions that you are most proud of? What are some of the typical use cases for PyTables and how does it tie into the broader Python data ecosystem? How common is it to use an HDF5 file as a data interchange format to be shared between researchers or between languages?
Given the ability to create custom node types, does that inhibit the ability to interact with the stored data using other libraries? What are some of the capabilities of HDF5 and PyTables that can’t be reasonably replicated in other data storage systems? One of the more intriguing capabilities that I noticed while reading the documentation is the ability to perform undo and redo operations on the data. How might that be leveraged in a real-world use case? What are some of the most interesting or unexpected uses of PyTables that you are aware of? Keep In Touch @FrancescAlted on Twitter FrancescAlted on GitHub Picks Tobias The Accountant Francesc Blosc, a high-speed compressor designed especially for binary data The Lego Batman Movie Links PyTables PyTables – Optimization Presentations and Videos about PyTables Part of the story behind PyTables HDF5 Pandas SIMD NumFOCUS The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/18/2017 • 49 minutes, 15 seconds

SKIDL with Dave Vandenbout

Summary As circuits and electronic components become more complex, visual circuit building tools are more difficult to use effectively. If you wish that you could just write your circuits in Python then you’re in luck! Dave Vandenbout created a library called SKIDL that brings the power and flexibility of Python to the realm of Electrical Engineering and he tells us all about it in this week’s show. Preamble Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers. Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey and today I’m interviewing Dave Vandenbout about SKIDL, a library for designing and validating circuit layouts. Interview Introductions How did you get introduced to Python? Can you describe what SKIDL is and the problem that you were trying to solve when you first started it? Most of my experience designing circuits has been done using a graphical tool. If you are using Python for the entire layout does it become difficult to understand the overall circuit without the visual representation? Is there a way to generate a circuit diagram from the SKIDL code for a visual reference? It seems that there is a substantial amount of electrical knowledge required to be able to design and build schematics in code.
For someone who is more of a hobbyist or is just starting to work with circuit design are there any facilities of SKIDL to assist with that understanding? What does the testing and validation process of a generated circuit look like? What does the internal architecture of SKIDL look like and what are some of the biggest challenges that you have faced while building it? For the generated netlist does SKIDL take into account voltage losses due to the lengths of the traces in the final PCB and does it have any facilities to optimize the overall layout for space and efficiency? Sometimes a circuit board is meant to be accessible for maintenance or even display purposes. Is it possible to specify the arrangement of components to make them more aesthetically pleasing or to space them so that physical interface ports (e.g. GPIO pins or I2C buses) are easier to access? What are some of the most interesting or surprising uses of SKIDL that you have seen? Keep In Touch Website Documentation Picks Tobias Samsonite Tectonic Backpack Dave Ball 4 by Jim Bouton Links KiCad Gerber Files ASIC FPGA PHDL MyHDL VHDL SPICE Simulator The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/11/2017 • 40 minutes, 49 seconds

Parsing and Parsers with Dave Beazley and Erik Rose

Summary If you have ever found yourself frustrated by a complicated regular expression or wondered how you can build your own dialect of Python then you need a parser. Dave Beazley and Erik Rose talk about what parsers are, how some of them work, and what you can do with them in this episode. Preface Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey and today I’m interviewing Erik Rose and Dave Beazley about what parsing is, why you might want to use it, and how their respective libraries Parsimonious and PLY make it easy. Interview Introductions How did you get introduced to Python? Can you each start by talking a bit about your respective libraries and what problem you were trying to solve when they were first created? In what ways does a full-fledged parser differ from what a regular expression engine is capable of? What are some of the different high-level approaches to building a parser and when might you want to choose one over the others? I’m sure that when most people hear the term parsing they associate it with reading in a data interchange format such as JSON or CSV. 
What are some of the more interesting or broadly applicable uses of parsing that might not be as obvious? One term that kept coming up while I was doing research for this interview was “Grammars”. How would you explain that concept for someone who is unfamiliar with it? Once an input has been parsed, what does the resulting data look like and how would a developer interact with it to do something useful? For someone who wants to build their own domain specific language (DSL) what are some of the considerations that they should be aware of to create the grammar? What are some of the most interesting or innovative uses of parsers that you have seen? Keep In Touch Dave Beazley @dabeaz on Twitter Website Erik Rose @ErikRose on Twitter Website Picks Tobias Terminix Erik Riven ScummVM Dave iTerm2 Kerbal Space Program Links Python Cookbook Python Essential Reference Fathom SWIG Windows Scripting Host PEG (Bryan Ford) Parsing Techniques by Grune and Jacobs The Dragon Book Stack Overflow HTML regex parsing Earley parsing SPARK Hy-lang Docs Interview Trampolining Lisp NLTK SLY DXR LLVM Numba The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
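
The questions for this episode ask how a parser differs from a regular expression, what a grammar is, and what the parsed result looks like. As a rough illustration that is not taken from the episode and uses neither Parsimonious nor PLY, the sketch below hand-writes a recursive-descent parser for a small arithmetic grammar invented for this example. A regular expression splits the input into tokens, while the grammar rules handle arbitrarily nested parentheses, which a single classical regular expression cannot match in general.

```python
import operator
import re

# Hypothetical grammar, written for this sketch only:
#   expr   := term (("+" | "-") term)*
#   term   := factor (("*" | "/") factor)*
#   factor := NUMBER | "(" expr ")"
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def tokenize(text):
    """Split the input into number, operator, and parenthesis tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input near position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens):
    """Return a syntax tree of nested (operator, left, right) tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        token = advance()
        if token == "(":
            node = expr()  # recursion is what allows nesting
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        node = factor()
        while peek() in ("*", "/"):
            node = (advance(), node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            node = (advance(), node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


def evaluate(node):
    """Walk the tree produced by parse() and compute its value."""
    if isinstance(node, int):
        return node
    op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))


tree = parse(tokenize("2 * (3 + 4)"))
value = evaluate(tree)
```

Here the tree is a plain nested tuple, so a developer can walk it with ordinary recursion; libraries such as those discussed in the episode generate the parsing code from a grammar description instead of having it written by hand.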

2/4/2017 • 50 minutes

Home Assistant with Paulus Schoutsen

Summary Don’t you wish you could make all of your devices talk to each other? Check out Home Assistant, the Python 3 platform for unified automation. Paulus Schoutsen shares the story of how the project got started, what makes it tick, and how you can use it today! Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers. Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey and today I’m interviewing Paulus Schoutsen about Home Assistant, the Python 3 platform for unifying your home automation. Interview Introductions How did you get introduced to Python? What is Home Assistant and what was the initial frustration that inspired you to create it? How useful would Home Assistant be for someone who doesn’t have a lot of the so-called ‘smart home’ technology? Given the fact that the intended context for Home Assistant is in the user’s house or apartment, how do you ensure that their data and privacy are safe? Reading through the documentation for installing and configuring Home Assistant, it seems prohibitively complex for someone who is not technically inclined. Has any work been done to try to package the project in a way that is more friendly to a casual user?
What are some of the most difficult challenges that you have faced while building Home Assistant? Why did you choose Python 3 as the technology for building this platform? The list of supported services and integrations is quite impressive. How does the current architecture allow for that kind of growth? How has the architecture of Home Assistant evolved from when you first started it? What are some of the products or platforms that you consider to be competitors of Home Assistant and how do you differentiate yourself? What are some of the most interesting or unexpected uses of Home Assistant that you have seen? What do you see as some of the most promising and the most troubling trends in the future of home automation? Keep In Touch Gitter Chatroom Forum Picks Tobias Miss Peregrine’s Home for Peculiar Children Paulus Read a Newspaper Links Mycroft Interview Project Homepage Let’s Encrypt Voluptuous JSON-Schema Home Assistant PyCon Presentation asyncio openHAB Mirai Botnet The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/28/2017 • 41 minutes, 46 seconds

Cryptography with Paul Kehrer

Summary Sooner or later you will need to encrypt or hash some data. Thankfully we have the Cryptography library, along with the other projects maintained by the Python Cryptographic Authority, to make sure that your crypto is done right. In this episode Paul Kehrer talks about how the PyCA got started, the projects that they maintain, and how you can start using cryptography in your programs today. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your app or experimenting with something you hear about in this episode. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey and today I’m interviewing Paul Kehrer about cryptography and encryption in Python Interview with Paul Kehrer Introductions How did you get introduced to Python? Can you share a bit of the background behind the Python Cryptographic Authority and how you got involved? There is an adage that you should never roll your own crypto because if there are bugs or exploits in your implementation then it can have potentially serious side effects. What problem was the Cryptography library created to solve that was important enough to proceed despite that risk? 
Given the sensitive nature of the libraries that you are working on, what development practices are you relying on to prevent the introduction of vulnerabilities? While reading through the documentation I noticed that Cryptography links against OpenSSL. Is it possible to swap that out for alternative implementations such as LibreSSL or S2N? What are some of the testing techniques that you use to ensure the accuracy of the algorithms that you are using? What are some of the factors that a developer should keep in mind when selecting which cryptographic library to use in their projects? When might someone want to use the capabilities found in the cryptography library and what do they need to be aware of while writing their application? For someone who wants to incorporate the cryptography library into their project what are some of the potential pitfalls that they should be aware of and how much knowledge of encryption should they possess? In what ways does the security landscape in Python differ from that of other languages that you are familiar with and what unique challenges do we face? What are some of the fundamental aspects of encryption and cryptography that you feel every developer should at least be aware of? If anyone wants to learn more about security and encryption, what resources do you recommend? Keep In Touch Twitter – @reaperhulk Picks Tobias Migadu Castle Panic Paul Frinkiac.com Morbotron Links S2N LibreSSL Cryptography 101 General Number Field Sieve Lattice Based Crypto Google New Hope Cryptography Hypothesis Mersenne Twister CryptoPals Crypto Challenges The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/21/2017 • 42 minutes

Translate House with Dwayne Bailey and Ryan Northey

Summary What is internationalization, when should you add it to your program, and how do you get started? This week Dwayne Bailey and Ryan Northey tell us about their work with Translate House and the different projects that they have built to make translating your software easier. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey and today I’m interviewing Dwayne Bailey and Ryan Northey about Translate House and the process of internationalization and localization for software projects. Interview with Dwayne Bailey and Ryan Northey Introductions How did you get introduced to Python? Why did you get involved in localisation, what got you started? How would you describe the difference between internationalization and localization? Are there cases where it makes sense to only do one of those things? Why should people localise software into other languages? Translate House is an organization focused on localizing and internationalizing software projects. To that end there are a collection of projects that you develop and maintain. Can you briefly introduce each of them and describe their purpose? 
What was the first project that was created in that list and how did it lead to the creation of the other tools? At what point did you decide that creating an organization to own and support the tools that you were building was the right choice to make? You run a distributed organisation, how do you manage that? I was recently speaking with Michal Čihař about the Weblate project and he mentioned that he uses the Translate Toolkit for handling the low level aspects of managing the translation files. What are some of the architectural and design challenges that arise from needing to support so many different systems for managing source text and translations? How do Pootle and Virtaal compare to other tools for web or desktop based translation? Are they primarily used for translating software or do they get used for other sources of text as well? Given that Virtaal is intended for use on desktop systems by people who aren’t necessarily technically adept how have you approached the packaging and deployment aspects of it? What are some of the challenges that you have had to overcome? Given the fact that multi-lingual translation requires interacting with a large quantity of text in numerous alphabets, what kind of impact has the unicode handling in Python 3 had on your projects? What do you have planned for the future of your projects? Keep In Touch Github Gitter Ryan Github Picks Tobias Google Chromecast Dwayne Jitsi Meet Ryan Gitter Links XLIFF Gettext PO Format CLDR (Common Locale Data Repository) The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/14/2017 • 58 minutes, 52 seconds

Morepath with Martijn Faassen

Summary Python has a wide and growing variety of web frameworks to choose from, but if you want one with super powers then you need Morepath. This week Martijn Faassen shares the story of how Morepath was created, how it differentiates itself from the other available options, and how you can use it to power your next project. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey and today I’m interviewing Martijn Faassen about the Morepath web framework. Interview with Martijn Faassen Introductions How did you get introduced to Python? What is Morepath and what problem were you trying to solve when you created it? The tag line for the Morepath project is that it’s a web microframework with superpowers. What is special or different about it that sets it apart from the other options in the Python ecosystem? 
It can be difficult to convince someone to migrate to a new framework, particularly if there is a lack of supporting ecosystem. What are some of the motivating factors for a developer to switch to Morepath if they already have experience with one of the more widely used frameworks? What does the internal architecture for Morepath look like and what are some of the challenges that you have faced while building it? One of the features is the automatic link generation for ensuring that you don’t end up with dead links. Is there any support for permalinks or redirects so that if you refactor your site people won’t end up at a path that no longer exists? In the documentation you make a number of references to the fact that Morepath is a routing based framework. Can you explain what you mean by that and how it differs from a traversal based framework? Part of the core elements of Morepath are your libraries Reg and Dectate. Can you describe each of them and explain some of how they came to be created? Morepath has a different conception of models than most frameworks that I’ve dealt with in that they aren’t necessarily associated with any form of database. Can you explain why that is and some of the patterns that it allows for? The method for extending and reusing applications built in Morepath is through subclassing the objects and overriding specific methods. What is it about this approach that you found to be more flexible than the alternatives exhibited by other frameworks? What are some of the most interesting or unexpected uses of Morepath that you have seen? What do you have planned for the future of Morepath? Keep In Touch Blog Twitter GitHub Email Picks Tobias IMDB Gyroscopes Martijn Ken And Robin Talk About Stuff Viili Links 13th Age JSON API JSON-LD Hydra (REST standard) GraphQL Falcor aiohttp Zope Pyramid Grok OneGov Martijn – My Exit From Zope LXML ElementTree The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/7/2017 • 1 hour, 5 minutes, 50 seconds

ERPNext with Rushabh Mehta

Summary If you need to track all of the pieces of a business and don’t want to use 15 different tools then you should probably be looking at an ERP (Enterprise Resource Planning) system. Unfortunately, a lot of them are big, clunky, and difficult to manage, so Rushabh Mehta decided to build one that isn’t. ERPNext is an open-source, web-based, easy to use ERP platform built with Python. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey and today I’m interviewing Rushabh Mehta about ERPNext Interview with Rushabh Mehta Introductions How did you get introduced to Python? What does ERP stand for and what kinds of businesses require that kind of software? What problem were you trying to solve when you created ERPNext and what factors led to the decision to write it in Python? 
How is ERPNext architected and what are some of the biggest challenges that were faced during its creation? While researching the project I noticed that you created your own framework which is used for building ERPNext. What was lacking in the existing options that made building a new framework appealing? What are some of the projects that you consider to be your competitors and what are the features that would convince a user to choose ERPNext? For someone who wants to self-host ERPNext what are the system requirements and what does the scaling strategy look like? On the marketing site for ERPNext it is advertised as being for small and medium businesses. What are the characteristics of larger businesses that might not make them a good fit for the features or structure of ERPNext? What are some of the most interesting or unexpected ways that you have seen ERPNext put to use? Are there any interesting projects or features that you are working on for release in the near future? Keep In Touch Rushabh Twitter ERPNext Forum GitHub Website Picks Tobias WordPress Rushabh Ready Player One Links 8088 PC XT Odoo The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/31/2016 • 30 minutes, 33 seconds

Jackie Kazil

Summary Jackie Kazil has led a distinguished and varied career with a strong focus on providing information and tools that empower others. This includes her work in data journalism, as a presidential innovation fellow, co-founding 18F, co-authoring a book, and being elected to the board of the Python Software Foundation. In this episode she shares these stories and more with us and how Python has helped her along the way. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your application. You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com to join other listeners of the show and share ideas for how to make it better. Your host as usual is Tobias Macey and today I’m interviewing Jackie Kazil about her work with 18F, writing Data Wrangling with Python, and her career with Python. Interview with Jackie Kazil Introductions How did you get introduced to Python? Looking at your background it shows that you got your start in Journalism and that you are now working on an additional degree in Computational Social Science. 
Can you share a bit about that journey and what set you on that path? What is computational social science and what has your particular focus been within that field? How has your work in news media prepared you for your current role? One of your many notable achievements is co-founding 18F. Can you start by explaining what that organization is and how you got involved in the efforts to build it? What are some of the notable uses of Python at 18F? In what ways did your experience working with 18F differ from the work you have done at companies outside of government? You recently co-wrote and published Data Wrangling with Python through O’Reilly Media. What kind of subject matter do you cover in the book and who is the target audience? There are a number of resources available to learn the various tools for working with data in Python. What is the gap that this book is aiming to fill and how did you get started with it? What are some of the most interesting things that you learned while working on the book? Keep In Touch Twitter Email Picks Tobias Jason Bourne Movies Jackie Czech Dumpling Dough Links Byteback NetworkX Project Mesa GeoQ openFOIA OpenFEC API The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/24/2016 • 39 minutes, 47 seconds

Weblate with Michal Čihař

Summary Adding translations to our projects makes them usable in more places by more people which, ultimately, makes them more valuable. Managing the localization process can be difficult if you don’t have the right tools, so this week Michal Čihař tells us about the Weblate project and how it simplifies the process of integrating your translations with your source code. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey and today I’m interviewing Michal Čihař about Weblate Interview with Michal Čihař Introductions How did you get introduced to Python? Can you explain what Weblate is and the problem that you were trying to solve by creating it? What are the benefits of using Weblate over other tools for localization and internationalization? 
One of the advertised features of Weblate is integration with Git and Mercurial. Can you explain how that works and what a typical translation workflow looks like both for a developer and a translator? Given that part of the focus for the tool is to allow for community translation, how do you simplify the experience for first time contributors? I understand that Weblate is written as a Django application. Is it possible to use Weblate with other Web frameworks or non-web projects? Can this be used with projects implemented in other programming languages? Are there any capabilities that are lost in this scenario? Why should developers and product managers be concerned with localizing an application? How does Weblate help to reduce the level of investment necessary for such an undertaking? What are some of the biggest difficulties that you have encountered while building and maintaining Weblate? What are the most common problems that you see people encounter on both the translator and developer side when dealing with internationalization and localization? Keep In Touch Weblate.org Facebook Twitter GitHub Picks Tobias War Dogs Michal Jordi’s Chocolate Links L20N The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/17/2016 • 32 minutes, 34 seconds

SpaCy with Matthew Honnibal

Summary As the amount of text available on the internet and in businesses continues to increase, the need for fast and accurate language analysis becomes more prominent. This week Matthew Honnibal, the creator of SpaCy, talks about his experiences researching natural language processing and creating a library to make his findings accessible to industry. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey and today I’m interviewing Matthew Honnibal about SpaCy and Explosion.AI Interview with Matthew Honnibal Introductions How did you get introduced to Python? Can you start by sharing what SpaCy is and what problem you were trying to solve when you created it? Another project for natural language processing that has been part of the Python ecosystem for a number of years is the Natural Language Tool Kit (NLTK). 
How does SpaCy differ from the NLTK and are there any cases where that would be the better choice? How much knowledge of NLP and computational linguistics is necessary to be able to use SpaCy? What does the internal design and architecture of SpaCy look like and what are the biggest challenges associated with its development to date and into the future? One of the projects that you have built around SpaCy which I think is really cool and caught my attention when I first found your project is the displaCy visualization tool. Can you explain what that is and why you think it is important? What are some kinds of applications where SpaCy would be useful which might not be obvious candidates for it? Why is speed such an important focus for an NLP library? One of the ways that you have been able to gain a speed boost is through releasing the GIL and allowing for true parallelism via Cython. How have you managed to ensure that this doesn’t lead to data races and program failures? Building on the success of SpaCy you founded a company called Explosion AI. Can you explain what your goals are for this endeavor and the kinds of services that you are offering? What are some of the most interesting uses of SpaCy that you have seen? What do you have planned for the future of SpaCy? Keep In Touch Twitter Matthew SpaCy Explosion AI Mailing List Explosion AI Contact Form Picks Tobias Zoom H4N Pro Shure SM58 Links Reddit sense2vec demo DisplaCy DisplaCy Entity Visualizer SpaCy Showcase NLTK Chartbeat Cytora The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/11/2016 • 36 minutes, 47 seconds

Kinto with Alexis Metaireau and Mathieu Leplatre

Summary Are you looking for a backend as a service offering where you have full control of your data? Look no further than Kinto! This week Alexis Metaireau and Mathieu Leplatre share the story of how Kinto was created, how it works under the covers, and some of the ways that it is being used at Mozilla and around the web. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey and today I’m interviewing Alexis Metaireau and Mathieu Leplatre about Kinto Interview with Alexis and Mathieu Introductions How did you get introduced to Python? What is Kinto and how did it get started? What does the internal architecture of Kinto look like? Given that the primary data format being stored is JSON, why did you choose PostgreSQL as your storage backend instead of a NoSQL document database such as CouchDB? 
Synchronization of transactions from multiple users, including offline first support, is a difficult problem. How have you approached that in Kinto and what are some of the alternate solutions that were considered? Designing usable APIs is a complicated subject. What features did you prioritize while creating the interfaces to Kinto? What are some of the most innovative uses of Kinto that you have seen? What are some of the biggest challenges that you have faced while building Kinto? What do you have planned for the future of Kinto? Keep In Touch Kinto Github Mailing List Alexis Email Mathieu Twitter Email Picks Tobias What are you working on this week with Python? Alexis Miles Davis – Bitches Brew Mathieu Sigal Subliminal Links Pocket CouchDB OpenAPI WebCrypto Formbuilder Firebase Kinto Comparison Table Mozilla Persona Portier The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/4/2016 • 56 minutes, 1 second

Plone with Eric Steele

Summary Plone is one of the first CMS projects to be built using Python and it is still being actively developed. This week Eric Steele, the release manager for Plone, tells us about how it got started, how it is architected, and how the community is one of its greatest strengths Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey and today I’m interviewing Eric Steele about the Plone CMS. Interview with Eric Steele Introductions How did you get introduced to Python? Can you start by explaining a bit about what Plone is and how you got involved with it? How did the Plone project get started and how has it evolved over the years? What makes Plone unique among the myriad CMS tools that are available and which of them do you consider to be direct competitors? 
Plone has managed to keep an impressive track record of security. What are some of the key features that enable that? I know that for much of its history, the default data storage for Plone was the ZODB (Zope Object DataBase). How would you describe its benefits and drawbacks for someone who is familiar with a relational database? Plone is one of the most long-lived Python projects that I am aware of. What are some of the most difficult maintenance challenges that you have encountered over the years of its existence? What does the internal architecture of Plone look like? One of the major tenets of the project is the ability to install extensions. What are some of the most interesting plugins that you are aware of? What kinds of projects are Plone best suited for? What does the workflow look like for a user of Plone? What are some of the most interesting uses of Plone that you have seen? What are the biggest challenges facing the Plone project and community as development and deployment paradigms continue to change? Keep In Touch Plone Website Forum IRC: #plone on freenode.net Eric Twitter E-mail Picks Tobias The Inquiry (podcast) PyCon US Eric Really Bad Chess Home Assistant Links Zope ZEO PloneFormGen Rapido CastleCMS Plumi Bika LIMS Quaive (Plone Intranet) Open Advice The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/26/2016 • 50 minutes, 26 seconds

Retrospective

Summary In this episode Chris and I look back at the past 83 episodes of the show and talk about what we learned, what we’ve enjoyed, and some of the highlights. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing each other about the past year and a half of the show. Interview with Tobias and Chris Introductions What have been some of the most unexpected or surprising aspects of the show for you during the past year and a half? – Tobias What are your top three favorite shows so far and why? – Chris If you could have a longer conversation with any of the past guests, who would you pick? – Tobias What has doing the show meant to you? – Chris What have you learned while doing the show that you wish you had known at the start? 
– Tobias How has the production process evolved since the beginning of the show? – Chris Chris Leaving the Show – Chris Tobias and I started new jobs (at MIT Office of Digital Learning and Amazon Web Services, respectively). We’re much, much busier these days, making coordination difficult. Tobias is ready to take the show solo and I (Chris) support him in this. Chris still plans to support the show as an avid fan. Keep In Touch Chris’s Contact Info Picks Tobias Locust Chris StaSh – Shell for Pythonista Producing a Podcast The Python Community The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/19/2016 • 37 minutes, 29 seconds

HouseCanary with Travis Jungroth

Summary Housing is something that we all have experience with, but many don’t understand the complexities of the market. This week Travis Jungroth talks about how HouseCanary uses data to make the business of real estate more transparent. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey and today I’m interviewing Travis Jungroth about HouseCanary, a company that is using Python and machine learning to help you make real estate decisions. Interview with Travis Jungroth Introductions How did you get introduced to Python? What is HouseCanary and what problem is it trying to solve? Who are your customers? Is it possible to get data and predictions at the neighborhood level for individual homebuyers to use in their purchasing decisions? What do you use for your data sources and how do you validate their accuracy? 
What are some of the sources of bias that are present in your data and what strategies are you using to account for them? Can you describe where Python is leveraged in your environment? What are some of the biggest software design and architecture challenges that you are facing while you continue to grow? What are the areas where Python isn’t the right choice and which languages are used in its place? What are the biggest predictors of future value for residential real estate? Can your system be used to identify risks associated with the housing market, similar to those seen in the bubble that triggered the 2008 economic failure? What are some of the most interesting details that you have discovered about real estate and housing markets while working with HouseCanary? Keep In Touch HouseCanary Website Twitter Travis Twitter Github Picks Tobias Railsea by China Miéville Kraken by China Miéville Travis DDT On Writing Well by William Zinsser Links Hacking Secret Ciphers with Python The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/12/2016 • 39 minutes, 45 seconds

Mycroft with Steve Penrod

Summary Speech is the most natural interface for communication, and yet we force ourselves to conform to the limitations of our tools in our daily tasks. As computation becomes cheaper and more ubiquitous and artificial intelligence becomes more capable, voice becomes a more practical means of controlling our environments. This week Steve Penrod shares the work that is being done on the Mycroft project and the company of the same name. He explains how he met the other members of the team, how the project got started, what it can do right now, and where they are headed in the future. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com to talk to previous guests and other listeners of the show. Your host as usual is Tobias Macey and today I’m interviewing Steve Penrod about the company and project Mycroft, a voice controlled, AI powered personal assistant written in Python. 
Interview with Steve Penrod Introductions How did you get introduced to Python? Can you start by describing what Mycroft is and how the project and business got started? How is Mycroft architected and what are the biggest challenges that you have encountered while building this project? What are some of the possible applications of Mycroft? Why would someone choose to use Mycroft in place of other platforms such as Amazon’s Alexa or Google’s personal assistant? What kinds of machine learning approaches are being used in Mycroft and do they require a remote system for execution or can they be run locally? What kind of hardware is needed for someone who wants to build their own Mycroft and what does the install process look like? It can be difficult to run a business based on open source. What benefits and challenges are introduced by making the software that powers Mycroft freely available? What are the mechanisms for extending Mycroft to add new capabilities? What are some of the most surprising and innovative uses of Mycroft that you have seen? What are the long term goals for the Mycroft project and the business that you have formed around it? Keep In Touch Website Picks Tobias yip Myths and Legends Podcast Steve Ethiopian Cuisine Blue Nile in KC Kansas City Barbecue Joe’s KC Links Google Home Tom Waits – Heart Attack & Vine mycroft.ai FLITE Vocalid Vocalid TED Talk PocketSphinx GE FirstBuild Sonar GNU Linux The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/5/2016 • 1 hour, 5 minutes, 12 seconds

Annapoornima Koppad

Summary Annapoornima Koppad is a director of the PSF, founder of the Bangalore chapter of PyLadies, and is a Python instructor at the Indian Institute of Science. In this week’s episode she talks about how she got started with Python, her experience running the PyLadies meetup, and working with the PSF. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Annapoornima Koppad about her career with Python and her experiences running the PyLadies chapter in Bangalore, India and being a director for the Python Software Foundation. Interview with Annapoornima Koppad Introductions How did you get introduced to Python? – Tobias I noticed that you have been freelancing for several years now. How much of that has been in Python and how has that fed back into your other activities? 
– Tobias While preparing for this interview I came across the book that you self-published on Amazon. What was your motivation for writing it and who is the target audience? – Tobias Can you tell us about your experience with starting the PyLadies group in Bangalore? What were some of the biggest challenges that you encountered and how have you approached the task of growing awareness and membership of the group? – Tobias You recently started teaching Python at the Indian Institute of Science. What kinds of subject matter do you cover in your lessons? – Tobias What is it about Python and its community that has inspired you to dedicate so much of your time to contributing back to it? – Tobias In what ways would you like to see the Python ecosystem improve? – Tobias You were voted in as a director of the Python Software Foundation in the most recent election. Can you share what responsibilities that entails? – Tobias What would you like to achieve with your time in the PSF? – Tobias Keep In Touch PyLadies Bangalore Meetup Blog Email Twitter Picks Tobias Fluentd Annapoornima The Lord of The Rings by J.R.R. Tolkien Storks Food Street The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/29/2016 • 19 minutes, 23 seconds

Python for GIS with Sean Gillies

Summary Location is an increasingly relevant aspect of software systems as we have more internet connected devices with GPS capabilities. GIS (Geographic Information Systems) are used for processing and analyzing this data, and fortunately Python has a suite of libraries to facilitate these endeavors. This week Sean Gillies, an author and contributor of many of these tools, shares the story of his career and contributions, and the work that he is doing at MapBox. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey Today I’m interviewing Sean Gillies about writing Geographic Information Systems in Python. Interview with Sean Gillies Introductions How did you get introduced to Python? Can you start by describing what Geographic Information Systems are and what kinds of projects might take advantage of them? 
How did you first get involved in the area of GIS and location-based computation? What was the state of the Python ecosystem like for writing these kinds of applications? You have created and contributed to a number of the canonical tools for building GIS systems in Python. Can you list at least some of them and describe how they fit together for different applications? What are some of the unique challenges associated with trying to model geographical features in a manner that allows for effective computation? How does the complexity of modeling and computation scale with increasing land area? Mapping and cartography have an incredibly long history with an ever-evolving set of tools. What does our digital age bring to this time-honored discipline that was previously impossible or impractical? To build accurate and effective representations of our physical world there are a number of domains involved, such as geometry and geography. What advice do you have for someone who is interested in getting started in this particular niche? What level of expertise would you advise for someone who simply wants to add some location-aware features to their application? I know that you joined Mapbox a little while ago. Which parts of their stack are written in Python? What are the areas where Python still falls short and which languages or tools do you turn to in those cases? Keep In Touch Email Twitter Picks Tobias Roku Streaming Stick Sean The Tacopedia Stromae Links GDAL SWIG QGIS Shapefiles Shapely Fiona Raster File GEOS Rasterio PostGIS RTree GeoPandas GeoJSON Orthorectification Mapbox SCONS Mapnik The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/22/2016 • 37 minutes, 49 seconds

K Lars Lohn

Summary K Lars Lohn has had a long and varied career, spending his most recent years at Mozilla. This week he shares some of his stories about getting involved with Python, his work with Mozilla, and his inspiration for the closing keynote at PyCon US 2016. He also elaborates on the intricate mazes that he draws and his life as an organic farmer in Oregon. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We also have a new sponsor this week. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey Today we’re interviewing K Lars Lohn about his career, his art, and his work with Mozilla Interview with K Lars Lohn Introductions How did you get introduced to Python? You have an interesting pair of articles on your website that attempt to detail how you perceive code and why you think that formatting should be configured in a manner analogous to CSS. 
Can you explain a bit about how your particular perception affects the way that you program? On your website you have some images of incredibly detailed artwork that are actually mazes. Can you describe some of your creation process for those? What is it about mazes that keeps you interested in them and how did you first start using them as a form of visual art? At Mozilla you have helped to create a project called Socorro which utilizes complexity analysis for correlating stacktraces. How did you conceive of that approach to error monitoring? Can you describe how Socorro is architected and how it works under the covers? At this year’s PyCon US you presented the closing keynote and it was one of the most engaging talks that I’ve seen. Where did you get the inspiration for the content and the mixed media approach? For anyone who hasn’t seen it, you managed to weave together a very personal story with a musical performance, and some applications of complexity analysis into a seamless experience. How much did you have to practice before you felt comfortable delivering that in front of an audience? In addition to your technical career you are also very focused on living in a manner that is sustainable and in tune with your environment. What kinds of synergies and conflicts exist between your professional and personal philosophies? Keep In Touch Website Twitter Picks Tobias Terry Pratchett Lars Bach’s Toccata & Fugue in D Minor Links Functional Geekery Episode 65 – Morten Kromberg talks about APL K Lars Lohn’s Portfolio The Well Tempered API Temple Grandin The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/15/2016 • 42 minutes, 21 seconds

Lorena Mesa

Summary One of the great strengths of the Python community is the diversity of backgrounds that our practitioners come from. This week Lorena Mesa talks about how her focus on political science and civic engagement led her to a career in software engineering and data analysis. In addition to her professional career she founded the Chicago chapter of PyLadies, helps teach women and kids how to program, and was voted onto the board of the PSF. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. Check out our sponsor Linode for running your awesome new Python apps. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project You want to make sure your apps are error-free so give our other sponsor, Rollbar, a look. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. By leaving a review on iTunes, or Google Play Music it becomes easier for other people to find us. Join our community! Visit discourse.pythonpodcast.com to help us grow and connect our wonderful audience. Your host as usual is Tobias Macey Today we’re interviewing Lorena Mesa about what inspires her in her work as a software engineer and data analyst. Interview with Lorena Mesa Introductions How did you get introduced to Python? How did your original interests in political science and community outreach lead to your current role as a software engineer? 
You dedicate a lot of your time to organizations that help teach programming to women and kids. What are some of the most meaningful experiences that you have been able to facilitate? Can you talk a bit about your work getting the PyLadies chapter in Chicago off the ground and what the reaction has been like? Now that you are a member of the board for the PSF, what are your goals in that position? What is it about software development that made you want to change your career path? What are some of the most interesting projects that you have worked on, whether for your employer or for fun? Do you think that the bootcamp you attended did a good job of preparing you for a position in industry? What is your view on the concept that software development is the modern form of literacy? Do you think that everyone should learn how to program? Keep In Touch Twitter Picks Tobias Zencastr Lorena Weapons of Math Destruction What I Talk About When I talk About Running Links idealist.org Schemas For The Real World The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/8/2016 • 42 minutes, 22 seconds

Podbuzzz with Kyle Martin

Summary Podcasts are becoming more popular now than they ever have been. Podbuzzz is a service for helping podcasters to track their reviews and improve SEO to reach a wider audience. In this episode we spoke with Kyle Martin about his experience using Python to build Podbuzzz and manage it in production. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. You need a place to run your awesome new Python apps, so check out our sponsor Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project. You want to make sure your apps are error-free so give our next sponsor, Rollbar, a look. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. By leaving a review on iTunes, or Google Play Music it becomes easier for other people to find us. Join our community! Visit discourse.pythonpodcast.com to help us grow and connect our wonderful audience. Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Kyle Martin about Podbuzzz Interview with Kyle Martin Introductions How did you get introduced to Python? – Chris Can you start by explaining what Podbuzzz is? – Tobias Why did you end up choosing Python as the language for building this service? – Tobias What have been the biggest engineering challenges in building Podbuzzz? – Tobias How did you conceive of the idea to build Podbuzzz and what inspired you to provide it as a service? 
– Tobias Part of the service that you are building is a widget that encourages listeners to rate a podcast on iTunes. Why is that important and what are some of the techniques that you have leveraged to determine the most effective messaging? – Tobias What are some of the features that you plan on adding to your service? – Tobias Do you intend to run Podbuzzz as a side project or do you envision it becoming a company with its own staff? – Tobias In addition to your work with Podbuzzz as a way for podcasters to gain visibility for their shows, you’re also working on an analytics platform for the same target audience. Can you explain a bit about that and the problems that you’ve had to overcome? – Tobias What is it about podcasting that makes it hard to gain useful metrics and what is your strategy for overcoming some of those obstacles? – Tobias Keep In Touch Twitter Email Picks Tobias Thank You Scientist Chris Hell or High Water Kyle Udacity Self-Driving Car Engineering Nanodegree Startups For The Rest of Us Zero To Scale Speechmatics Links Canva Internet Business Mastery Podcast The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/1/2016 • 38 minutes, 36 seconds

PsychoPy with Jonathan Peirce

Summary We’re delving into the complex workings of your mind this week on Podcast.__init__ with Jonathan Peirce. He tells us about how he started the PsychoPy project and how it has grown in utility and popularity over the years. We discussed the ways that it has been put to use in myriad psychological experiments, the inner workings of how to design and execute those experiments, and what is in store for its future. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. Hired is sponsoring us this week. If you’re looking for a job as a developer or designer then Hired will bring the opportunities to you. Sign up at hired.com/podcastinit to double your signing bonus. Once you land a job you can check out our other sponsor Linode for running your awesome new Python apps. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project You want to make sure your apps are error-free so give our last sponsor, Rollbar, a look. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. By leaving a review on iTunes, or Google Play Music it becomes easier for other people to find us. Join our community! Visit discourse.pythonpodcast.com to help us grow and connect our wonderful audience. 
Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Jonathan Peirce about PsychoPy, an open source application for the presentation and collection of stimuli for psychological experimentation Interview with Jonathan Peirce Introductions How did you get introduced to Python? – Chris Can you start by telling us what PsychoPy is and how the project got started? – Tobias How does PsychoPy compare feature wise against some of the proprietary alternatives? – Chris In the documentation you mention that this project is useful for the fields of psychophysics, cognitive neuroscience and experimental psychology. Can you provide some insight into how those disciplines differ and what constitutes an experiment? – Tobias Do you find that your users who have no previous formal programming training come up to speed with PsychoPy quickly? What are some of the challenges there? – Chris Can you describe the internal architecture of PsychoPy and how you approached the design? – Tobias How easy is it to extend PsychoPy with new types of stimulus? – Chris What are some interesting challenges you faced when implementing PsychoPy? – Chris I noticed that you support a number of output data formats, including pickle. What are some of the most popular analysis tools for users of PsychoPy? – Tobias Have you investigated the use of the new Feather library? – Tobias How is data input typically managed? Does PsychoPy support automated readings from test equipment or is that the responsibility of those conducting the experiment? – Tobias What are some of the most interesting experiments that you are aware of having been conducted using PsychoPy? – Chris While reading the docs I found the page describing the integration with the OSF (Open Science Framework) for sharing and validating an experiment and the collected data with other members of the field. 
Can you explain why that is beneficial to the researchers and compare it with other options such as GitHub for use within the sciences? – Tobias Do you have a roadmap of features that you would like to add to PsychoPy or is it largely driven by contributions from practitioners who are extending it to suit their needs? – Tobias Keep In Touch PsychoPy Discourse Forum Picks Tobias Hackers: Heroes of the Computer Revolution by Steven Levy Chris Castro 2 Jon Discourse Links Feather Pyglet HDF5 Open Science Framework The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/25/2016 • 1 hour, 12 minutes, 10 seconds

Sandstorm.io with Asheesh Laroia

Summary Sandstorm.io is an innovative platform that aims to make self-hosting applications easier and more maintainable for the average individual. This week we spoke with Asheesh Laroia about why running your own services is desirable, how they have made security a first priority, how Sandstorm is architected, and what the installation process looks like. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Rollbar. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Hired has also returned as a sponsor this week. If you’re looking for a job as a developer or designer then Hired will bring the opportunities to you. Sign up at hired.com/podcastinit to double your signing bonus. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would also like to mention that the organizers of PyCon Zimbabwe are looking to the global Python community for help in supporting their event. If you would like to donate the link will be in the show notes. 
Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Asheesh Laroia about Sandstorm.io, a project that is trying to make self-hosted applications easy and secure for everyone. Interview with Asheesh Laroia Introductions How did you get introduced to Python? – Tobias Can you start by telling everyone about the Sandstorm project and how you got involved with it? – Tobias What are some of the reasons that an individual would want to self-host their own applications rather than using comparable services available through third parties? – Tobias How does Sandstorm try to make the experience of hosting these various applications simple and enjoyable for the broadest variety of people? – Tobias What does the system architecture for Sandstorm look like? – Tobias I notice that Sandstorm requires a very recent Linux kernel version. What motivated that choice and how does it affect adoption? – Chris One of the notable aspects of Sandstorm is the security model that it uses. Can you explain the capability-based authorization model and how it enables Sandstorm to ensure privacy for your users? – Tobias What are some of the most difficult challenges facing you in terms of software architecture and design? – Tobias What is involved in setting up your own server to run Sandstorm and what kinds of resources are required for different use cases? – Tobias You have a number of different applications available for users to install. What is involved in making a project compatible with the Sandstorm runtime environment? Are there any limitations in terms of languages or application architecture for people who are targeting your platform? – Tobias How much of Sandstorm is written in Python and what other languages does it use? – Tobias Keep In Touch Twitter Blog Email Picks Tobias OpsGenie Chris Viking Godfather Safety Razor Who Killed Sherlock Holmes? 
by Paul Cornell Petrus Aged Red Asheesh Amtrak The Master Switch by Tim Wu Rocket Chat Links North Star Post Contact Otter Hacker Slides Permanote Radicale Media Goblin IPython Notebook The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/17/2016 • 59 minutes, 35 seconds

Python at Zalando

Summary Open source has proven its value in many ways over the years. In many companies that value is purely in terms of consuming available projects and platforms. In this episode Zalando describes their recent move to creating and releasing a number of their internal projects as open source and how that has benefited their business. We also discussed how they are leveraging Python and a couple of the libraries that they have published. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project Rollbar is also sponsoring us this week. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Hired has also returned as a sponsor this week. If you’re looking for a job as a developer or designer then Hired will bring the opportunities to you. Sign up at hired.com/podcastinit to double your signing bonus. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. 
Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Jie Bao and João Santos about their use of Python at Zalando Interview with Zalando Introductions How did you get introduced to Python? – Tobias Can you start by telling us a bit about what Zalando does and some of the technologies that you use? – Tobias What role does Python play in your environment? – Tobias Is the use of Python for a particular project governed by any particular operational guidelines or is it largely a matter of developer choice? – Tobias Given that you have such a variety of platforms to support, how do you architect your systems to keep them easy to maintain and reason about? – Tobias One of the projects that you have open sourced is Connexion. Can you explain a bit about what that is and what it is used for at Zalando? – Tobias What made you choose to standardize on Swagger/OpenAPI vs RAML or some of the other API standards? – Tobias Did Connexion start its life as open source or was it extracted from another project? – Tobias ExpAn is another one of your projects that is written in Python. What do you use that for? – Tobias Can you describe the internal implementation of ExpAn and what it takes to get it set up? – Tobias Given the potential complexity of and the need for statistical significance in the data for proper A/B testing, how did you design ExpAn to satisfy those requirements? – Tobias Given the laws in Germany around digital privacy, were there any special considerations that needed to be made in the collection strategy for the data that gets used in ExpAn? – Tobias Keep In Touch João Twitter Jie Twitter Laurie Twitter Picks Tobias Hacker’s Keyboard Jie Shah of Shahs by Ryszard Kapuściński João Serendipity Laurie Flow The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/10/2016 • 40 minutes, 26 seconds

Alex Martelli

Summary Alex Martelli has dedicated a large part of his career to teaching others how to work with software. He has the highest number of Python questions answered on Stack Overflow, he has written and co-written a number of books on Python, and presented innumerable times at conferences in multiple countries. We spoke to him about how he got started in software, his work with Google, and the trends in development and design patterns that are shaping modern software engineering. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We also have a returning sponsor this week. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Hired is sponsoring us this week. If you’re looking for a job as a developer or designer then Hired will bring the opportunities to you. Sign up at hired.com/podcastinit to double your signing bonus. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers. Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. 
Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Alex Martelli Interview with Alex Martelli Introductions How did you get introduced to Python? – Chris You have achieved a number of honors and recognitions throughout your career for significant technical achievements. What kind of learning strategies do you use to enable you to achieve mastery of technical topics? – Tobias How do you keep the Python In A Nutshell book current as aspects of the core language and its libraries change? – Chris You are known for your prolific contributions to Stack Overflow, particularly on topics pertaining to Python. Was that a specific goal that you had set for yourself or did it happen organically? – Tobias When answering Stack Overflow questions, do you usually already know the answers or do you treat it as a learning opportunity? – Tobias What are some of the most difficult Python questions that you have been faced with? – Tobias You have presented quite a number of times at various Python conferences. What are some of your favorite talks? – Tobias Design patterns and idiomatic code are common themes in a number of your presentations. Why is it important for developers to understand these concepts and what are some of your favorite resources on the topic? – Tobias What do you see as the most influential trends in software development and design, both currently and heading into the future? – Tobias As a long-time computer engineer, are there any features or ideas from other languages that you would like to see incorporated into Python? 
Picks Tobias The Great Gatsby Movie Chris Stone Ruination Double IPA Ghost Soldiers Alex Alexander Hamilton by Ron Chernow Hamilton Musical Links Permission or Forgiveness Good enough is good enough Modern Python Patterns and Idioms Handling Errors and Exceptions in Modern Python Microservices Google SRE Book Python In A Nutshell use code AUTHD for a discount The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

9/3/2016 • 1 hour, 4 minutes, 49 seconds

Dave Beazley

Summary Dave Beazley has been using and teaching Python since the early days of the language. He has also been instrumental in spreading the gospel of asynchronous programming and the many ways that it can improve the performance of your programs. This week I had the pleasure of speaking with him about his history with the language and some of his favorite presentations and projects. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com and use the code podcastinit at signup to get a $50 credit! Hired has also returned as a sponsor this week. If you’re looking for a job as a developer or designer then Hired will bring the opportunities to you. Sign up at hired.com/podcastinit to double your signing bonus. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Dave Beazley about his career with Python Interview with Dave Beazley Introductions How did you get introduced to Python? 
– Tobias How have Python and its community helped to shape your career? – Tobias What are some of the major themes that you have focused on in your work? – Tobias One of the things that you are known for is doing live-coding presentations, many of which are fairly advanced. What is it about that format that appeals to you? – Tobias What are some of your favorite stories about a presentation that didn’t quite go as planned? – Tobias You have given a large number of talks at various conferences. What are some of your favorites? – Tobias What impact do you think that asynchronous programming will have on the future of the Python language and ecosystem? – Tobias Are there any features that you see in other languages that you would like to have incorporated in Python? – Tobias On the about page for your website you talk about some of the low-level code and hardware knowledge that you picked up by working with computers as a kid. Do you think that people who are getting started with programming now are missing out by not getting exposed to the kinds of hardware and software that were present before computing became mainstream? You have had the opportunity to work on a large variety of projects, both on a hobby and professional level. What are some of your favorites? – Tobias What is it about Python that has managed to hold your interest for so many years? – Tobias Keep In Touch Twitter Picks Tobias Criminal Dave Samuel Beckett Plays Links Python Concurrency From The Ground Up XKCD compiling Clifford Stoll Superboard talk Curio PyOhio async talk The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/27/2016 • 45 minutes, 42 seconds

GenSim with Radim Řehůřek

Summary Being able to understand the context of a piece of text is generally thought to be the domain of human intelligence. However, topic modeling and semantic analysis can be used to allow a computer to determine whether different messages and articles are about the same thing. This week we spoke with Radim Řehůřek about his work on GenSim, which is a Python library for performing unsupervised analysis of unstructured text and applying machine learning models to the problem of natural language understanding. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com and use the code podcastinit at signup to get a $50 credit on your account. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Radim Řehůřek about Gensim, a library for topic modeling and semantic analysis of natural language. Interview with Radim Řehůřek Introductions How did you get introduced to Python? 
– Chris Can you start by giving us an explanation of topic modeling and semantic analysis? – Tobias What is Gensim and what inspired you to create it? – Tobias What facilities does Gensim provide to simplify the work of this kind of language analysis? – Tobias Can you describe the features that set it apart from other projects such as the NLTK or Spacy? – Tobias What are some of the practical applications that Gensim can be used for? – Tobias One of the features that stuck out to me is the fact that Gensim can process corpora on disk that would be too large to fit into memory. Can you explain some of the algorithmic work that was necessary to allow for this streaming process to be possible? – Tobias Given that it can handle streams of data, could it also be used in the context of something like Spark? – Tobias Gensim also supports unsupervised model building. What kinds of limitations does this have and when would you need a human in the loop? – Tobias Once a model has been trained, how does it get saved and reloaded for subsequent use? – Tobias What are some of the more unorthodox or interesting uses people have put Gensim to that you’ve heard about? – Chris In addition to your work on Gensim, and partly due to its popularity, you have started a consultancy for customers who are interested in improving their data analysis capabilities. How does that feed back into Gensim? – Tobias Are there any improvements in Gensim or other libraries that you have made available as a result of issues that have come up during client engagements? – Tobias Is it difficult to find contributors to Gensim because of its advanced nature? – Tobias Are there any resources you’d like to recommend our listeners explore to get a more in depth understanding of topic modeling and related techniques? 
– Chris Keep In Touch RaRe Technologies Twitter Email Github Mailing List Picks Tobias Dark Matter and the Dinosaurs by Lisa Randall Chris m-cli Radim 1177 BC: The Year Civilization Collapsed Links Nadia Eghbal Gensim SQL Addict NLTK Spacy Latent Dirichlet Allocation (LDA) LSI Keynote in Italy on distributed processing Google Scholar references for Gensim Stylometric analysis On Writing Well Student Incubator Wikipedia on topic modeling The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/20/2016 • 53 minutes, 28 seconds

Python on Windows with Steve Dower

Summary In order for Python to continue to attract new users, we need to have an easy way for people to get started with it, and Windows is still the most widely used operating system among computers. Steve Dower is the build maintainer for the Windows installers of Python and this week we spoke with him about his work in that role. He told us about the changes that he has made to the installer to make it easier for new users to get started and how modern updates to the packaging ecosystem for libraries have simplified dependency management. He also told us about how the Visual Studio team is building a set of tools to make development of Python code more enjoyable and how Microsoft’s adoption of open source is making Windows a more attractive platform for developers. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com and use the code podcastinit at signup to get a $50 credit on your account! Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. 
Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Steve Dower about Python on Windows Interview with Steve Dower Introductions How did you get introduced to Python? – Chris You are currently the release manager for Python on Windows. How did you end up with that responsibility? – Tobias While Python has supported Windows for a long time, the overall experience has historically been rather poor. Can you give a bit of the background of why that was and tell us about some of the work that you and others have been doing to make it better? – Tobias Given that a large percentage of users are still on Windows, having a good story for getting started with Python on that platform is important for adoption of the language. What are some of the areas where the current situation needs to be improved? – Tobias What is the most difficult part of building a distribution of Python for a Windows environment? Has it gotten easier in recent years? – Tobias When we were speaking at PyCon you mentioned that the most frequently downloaded version of Python from the python.org site is the 32 bit version for Windows. Do you think that is an accurate and useful metric? What other statistics do you wish you could capture or improve? – Tobias How does Python Tools for Visual Studio compare with other Python IDEs like Pycharm? – Chris What are some unique features that Python Tools for Visual Studio offers that other tools don’t? – Chris Are there any compelling aspects of developing Python on Windows that could convince users on other platforms to make the switch? – Tobias Could you give our listeners a whirlwind tour of the underlying implementation of PTVS? How does Visual Studio provide such in depth introspection for your Python code? 
– Chris Keep In Touch Twitter Github Microsoft Azure steve.dower Picks Tobias Kdiff3 SpyderCo Triangle Sharpmaker Chris Audible Steve Sandisk Extreme Portable SSD SMBC Random Encounters Links Windows compilers Visual C++ Build Tools (for Python 3.5 and later) Visual C++ Compiler for Python 2.7 PEP 514 The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/13/2016 • 54 minutes, 23 seconds

PyCon Canada with Francis Deslauriers and Peter McCormick

Summary Aside from the national Python conferences such as PyCon US and EuroPython there are a number of regional conferences that operate at a smaller scale to service their local communities. This week we interviewed Peter McCormick and Francis Deslauriers about their work organizing PyCon Canada to provide a venue for Canadians to talk about how they are using the language. If you happen to be near Toronto in November then you should get a ticket and help contribute to their success! Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com and use the code podcastinit at signup to get a $50 credit! Hired has also returned as a sponsor this week. If you’re looking for a job as a developer or designer then Hired will bring the opportunities to you. Sign up at hired.com/podcastinit to double your signing bonus. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. 
Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Peter McCormick and Francis Deslauriers about their experiences organizing PyCon Canada Interview with Peter McCormick and Francis Deslauriers Introductions How did you get introduced to Python? – Chris How did you get involved as an organizer of PyCon Canada? – Tobias How does PyCon Canada, and other regional conferences, differ from PyCon US, both in terms of scale and overall experience? – Tobias How do the audience and presenters differ from the US conferences? Is there perhaps a different mix of industry versus academia, or maybe different disciplines? – Chris Are you thinking of trying to hold the conference in different cities across Canada, similarly to how PyCon US moves venues every two years? – Tobias In addition to the national and regional conferences, there are a number of special interest Python conferences that take place (e.g. SciPy, PyData, etc.). What kind of relationship do you have with organizers of those events and how do they impact the kinds of talk submissions that you are likely to receive? – Tobias There has been a lot of focus in recent years on trying to increase the diversity of conference speakers. What are some of the methods that you have used to encourage speakers of various backgrounds to submit talks? – Tobias Organizing a conference involves a lot of moving parts. How do you structure the process to ensure a safe and enjoyable experience for the attendees? – Tobias What are some of the biggest logistical challenges you face as conference organizers? – Chris Given that PyCon Canada is a regional conference, how has that affected your focus in terms of marketing and the general theme? – Tobias Tell our listeners about your favorite PyCon Canada moments. – Chris What has been the most surprising part of organizing the conference? 
– Tobias Keep In Touch PyCon Canada Twitter Website Email for sponsorship enquiries Peter Email Twitter Website Francis Email Twitter Picks Tobias Juice SSH Chris Chinese Man Stiletto Amazon Echo Peter DjangoCon US documentation Francis Spam Nation Links PSF Calendar of Events Symposion The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

8/6/2016 • 46 minutes

Test Engineering with Cris Medina

Summary We all know that testing is an important part of software and systems development. The problem is that as our systems and applications grow, the amount of testing necessary increases at an exponential rate. Cris Medina joins us this week to talk about some of the problems and approaches associated with testing these complex systems and some of the ways that Python can help. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com Hired has also returned as a sponsor this week. If you’re looking for a job as a developer or designer then Hired will bring the opportunities to you. Sign up at hired.com/podcastinit to double your signing bonus. The O’Reilly Velocity conference is coming to New York this September and we have a free ticket to give away. If you would like the chance to win it then just sign up for our newsletter at pythonpodcast.com To help other people find the show you can leave a review on iTunes, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Cris Medina about test engineering for large and complex systems. 
Interview with Cris Medina Introductions How did you get introduced to Python? – Chris To get us started can you share your definition of test engineering and how it differs from the types of testing that your average developer is used to? – Tobias What are some common industries or situations where this kind of test engineering becomes necessary? – Tobias How and where does Python fit into the kind of testing that becomes necessary when dealing with these complex systems? – Tobias How do you determine which areas of a system to test and how can Python help in that discovery process? – Tobias What are some of your favorite tools and libraries for this kind of work? – Tobias What are some of the areas where the existing Python tooling falls short? – Tobias Given the breadth of concerns that are encompassed with testing the various components of these large systems, what are some ways that a test engineer can get a high-level view of the overall state? – Tobias How can that information be distilled for presentation to other areas of the business? – Tobias Could that information be used to provide a compelling business case for the resources required to test properly? – Chris Given the low-level nature of this kind of work I imagine that proper visibility of the work being done can be difficult. How do you make sure that management can properly see and appreciate your efforts? – Tobias Keep In Touch Twitter Picks Tobias Samsung Galaxy Tab S2 Anker SoundCore Bluetooth Speaker Chris On Writing Well This Episode Was Written by an AI The Three Rs Cris CherryPy Etcd Thinking Fast And Slow by Daniel Kahneman Spain Links Behave Pytest BDD Hypothesis Episode XX – Hypothesis Flask CherryPy Django Pandas NumPy Celery Bokeh Vincent Toga D3 Sunburst D3 Chord Diagrams The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/30/2016 • 58 minutes, 9 seconds

Crossing The Streams - Talk Python with Michael Kennedy

Summary The same week that we released our first episode of Podcast.__init__, Michael Kennedy was publishing the very first episode of Talk Python To Me. The years-long drought of podcasts about Python has been quenched with a veritable flood of quality content as we have both continued to deliver the stories of the wonderful people who make our community such a wonderful place. This week we interviewed Michael about what inspired him to get started, his process and experience as Talk Python continues to evolve, and how that has led him to create online training courses alongside the podcast. He also interviewed us, so check out this week’s episode of Talk Python To Me for a mirror image of this show! Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. 
Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Michael Kennedy about his work with Talk Python to Me, another podcast about Python and its community, and on-demand Python trainings. Michael has also offered to give away one of each of his Python courses to our listeners. If you would like the chance to win, then sign up for our newsletter at pythonpodcast.com, or our forum at discourse.pythonpodcast.com. If you want to double your chances, then sign up for both! Interview with Michael Kennedy Introductions How did you get into programming? How did you get introduced to Python? – Chris What is the craziest piece of software you’ve ever written? – Tobias You’ve taken some pretty drastic steps around Python and your career lately. What inspired you to do that and how’s it going? (yes, quit my job, focus only on podcast and online courses). You are basically self-taught as a developer, how did you get into this teaching / mentor role? Why did you first get started with Talk Python to Me? – Tobias Did you know when you started that it would turn into a full-time endeavor? – Tobias For a while there weren’t any podcasts available that focused on Python and now we’re each producing one. What’s it like to run a successful podcast? – Tobias What have been your most popular episodes? Tell us a bit about each – Tobias In your excellent episode with Kate Heddleston you talked about how we tend to bash other programming languages. We’ve done a fair bit of Java bashing here. How can we help get ourselves and others in our community out of this bad habit? – Chris How do you select the guests and topics for your show? – Tobias What topics do you have planned for the next few episodes? How do you prepare the questions for each episode? – Tobias What is the most significant thing you’ve learned from the podcasting experience? What do you wish you did differently and how are you looking to improve? 
– Tobias I had a great time hanging out with you at PyCon this year. What was your impression of the conference? What were your favorite sessions and do you have any shows scheduled to follow up on them? – Tobias Your sites are 100% “hand-crafted” as they say. Can you give us a look inside? What are the moving parts in there? So you stirred things up with Stitcher this week. What’s up with that? Can you recommend some podcasts? What’s in your playlist? Final call to action? Keep In Touch Twitter Podcast Web Github Picks Tobias Batman v Superman: Dawn of Justice Lego Brickumentary Hashicorp Consul Chris Yarn Apple Magic Mouse 2 Remembering Stonewall Michael PyPI passlib Python 2016 Youtube Channel K Lars Lohn – Closing Keynote Links Thinking Fast and Slow by Daniel Kahneman Trello Recommended podcasts: Test and Code Podcast Partially Derivative Exponent Podcast Mixergy Startup Podcast (season 1 & 2) Away from the keyboard Developer On Fire Michael’s courses: Python Jumpstart by Building 10 Apps Write Pythonic Code Like a Seasoned Developer The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/23/2016 • 1 hour, 17 minutes, 47 seconds

Zorg with Gunther Cox and Kevin Brown

Summary Everyone loves to imagine what they would do if they had their own robot. This week we spoke with Gunther Cox and Kevin Brown about their work on Zorg, which is a Python library for building a robot of your own! We discussed how the project got started, what platforms it supports, and some of the projects that have been built with it. Give it a listen and then get building! Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey Today we’re interviewing Gunther Cox and Kevin Brown about Zorg, a Python framework for robotics and physical computing Interview with Gunther Cox and Kevin Brown Introductions How did you get introduced to Python? – Tobias What is Zorg and what is its origin story? – Tobias How would you define and differentiate the concepts of robotics, physical computing, and the internet of things? 
– Tobias I noticed in the documentation that Zorg is based on the Cylon.js project. How closely does the implementation of Zorg stick to that of Cylon and how much needs to be changed due to differences in the language? – Tobias Is Zorg useful for production applications or is it primarily intended for educational purposes and hobby projects? – Tobias Zorg currently only supports the Intel Edison, with plans for Raspberry Pi and Arduino Firmata support in the works. What is involved in adding compatibility with other platforms? – Tobias What are some of the most interesting projects that you have seen created using Zorg? – Tobias How does Zorg compare to other Python robotics projects such as ROSPy? – Tobias Robotics is a large and complex problem space. What are some of the other features and projects in Python that are often used when building robots? – Tobias Keep In Touch GitHub Newsletter Picks Tobias Padlock Password Manager Vault Gunther Robot Builder’s Bonanza Kevin Facial Recognition with OpenCV in Python Links RS232 The Hybrid Group Gobot Artoo Cylon.js Salvius ROSPy The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/17/2016 • 25 minutes, 18 seconds

Mypy with David Fisher and Greg Price

Summary As Python developers we are fond of the dynamic nature of the language. Sometimes, though, it can get a bit too dynamic and that’s where having some type information would come in handy. Mypy is a project that aims to add that missing level of detail to function and variable definitions so that you don’t have to go hunting 5 levels deep in the stack to understand what shape that data structure is supposed to be. This week we spoke with David Fisher and Greg Price about their work on Mypy and its use within Dropbox and the broader community. They explained how it got started, how it works under the covers, and why you should consider adding it to your projects. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing David Fisher and Greg Price about Mypy, a library for adding optional static types to your Python code. 
Interview with David Fisher and Greg Price Introductions How did you get introduced to Python? – Chris Can you explain a bit about what Mypy is and its origin story? – Tobias What are the benefits of using Mypy for both new and existing projects? – Tobias How does the Mypy compilation step work? – Tobias What are the biggest technical challenges in implementing Mypy? – Chris Are there any limitations imposed by the syntax of Python that prevented you from implementing any features or syntax that you would have liked to include in Mypy? – Tobias In Guido’s keynote from this year’s PyCon he mentioned some tentative plans for adding variable type declarations to the Python syntax in one of the next major releases. How much of that idea was inspired by Mypy? – Tobias Type theory is a large and complex problem domain. Can you explain where Mypy falls in this space? – Tobias Which language(s) had the biggest influence on the particular syntax and semantics used in Mypy? – Tobias What kinds of type definitions and guarantees can be encoded using Mypy? – Tobias Can you talk a bit about user defined types as implemented in Mypy? – Chris How has the inclusion of the typing module in the Python standard library influenced the evolution of Mypy? – Tobias Did the inclusion of multiple inheritance add any implementation complexity to Mypy? – Chris Do you know of any formal studies that have been performed to research the ergonomics or efficiency gains of static or gradual type systems? – Tobias What does the future roadmap for Mypy look like? 
– Tobias Keep In Touch David GitHub Greg web page GitHub $ pip3 install mypy-lang Bug reports, feature requests, questions welcome on issue tracker: github.com/python/mypy Picks Tobias Functional Geekery – Andreas Stefik episode about studies performed on the human factors of development Soft Skills Engineering Podcast Chris Grimm Artisanal Ales Lucky Cloud jq – JSON Swiss Army knife David fzf – a fuzzy finder Thinking, Fast And Slow by Daniel Kahneman Ringworld Greg On Proof and Progress in Mathematics, essay by Bill Thurston Axiomatic by Greg Egan Links GitHub repo, and CONTRIBUTING file PEP 484 PyCon 2016 workshop slides Typeshed shared repo for stubs Other tools (PyCharm, pylint, pytype, …) using PEP 484 types The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/10/2016 • 1 hour, 20 seconds

BeeWare with Russell Keith-Magee

Summary When you have good tools it makes the work you do even more enjoyable. Russell Keith-Magee has been building up a set of tools that are aiming to let you write graphical interfaces in Python and run them across all of your target platforms. Most recently he has been working on a capstone project called Toga that targets the Android and iOS platforms with the same set of code. In this episode we explored his journey through programming and how he has built and designed the BeeWare suite. Give it a listen and then try out some or all of his excellent projects! Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com and use the code podcastinit to get a $50 credit! Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Russell Keith-Magee about the BeeWare project, which is a collection of tools and libraries that are meant to be composed together for building up your Python development environment. 
Interview with Russell Keith-Magee Introductions How did you get introduced to Python? – Chris What is the BeeWare project and what goals do you have for it? – Tobias What kinds of projects are contained under the BeeWare umbrella and what inspired you to start creating these kinds of tools? – Tobias Did each project arise from a particular need that you had at the time or has there been a logical progression from one tool to the next? – Tobias At PyCon US of this year (2016) you made a presentation about the work that you have been doing to bring Python to the iOS and Android platforms. Can you provide a high-level overview for anyone who hasn’t seen that talk yet? – Tobias Let’s talk about Toga – how does Toga differ from some of the other cross platform UI framework efforts for various languages like Kivy or Shoes? – Chris What are some of the biggest challenges that you had to overcome in order to get Python to run on both iOS and Android? – Tobias How does runtime performance for applications written in Python compare with the same program running in the languages that are natively supported on those platforms? – Tobias Can you walk us through the low level flow of a single Toga API request? – Chris Do you view your work on Toga and the associated libraries as a hobby project or do you think that it will turn into a production ready tool set that people will use for shipping applications? – Tobias IDEs like Android Studio and Xcode have a lot of features that simplify the development and UI creation process. Do you have to forego those niceties when developing a mobile app in Python? – Tobias Shipping Python applications is a problem that tends to pose a host of issues for people, which you are addressing with the Briefcase project. What are some of the biggest hurdles and design choices that you have encountered while working on that? 
– Tobias Do you think that there will ever be a release of iOS or Android, or even a brand new mobile platform, that will ship with native Python support? – Tobias Keep In Touch Twitter Website GitHub Picks Tobias Japanese cast iron tea set Chris Bantam Cider Pythonista 3 Russell MHPrompt Open Sourcing Mental Illness Blue Hackers Beyond Blue Black Dog institute Mental Health.gov Links A Tale of Two Cellphones Python interpreter in 500 lines of code The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/2/2016 • 1 hour, 10 minutes, 35 seconds

Armin Ronacher

Summary Armin Ronacher is a prolific contributor to the Python software ecosystem, creating such widely used projects as Flask and Jinja2. This week we got the opportunity to talk to him about how he got his start with Python and what has inspired him to create the various tools that have made our lives easier. We also discussed his experiences working in Rust and how it can interface with Python. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Armin Ronacher about his contributions to the Python community. Interview with Armin Ronacher Introductions How did you get introduced to Python? – Chris What was the first open source project that you created in Python? – Tobias What is your view of the responsibility for open source project maintainers and how do you manage a smooth handoff for projects that you no longer wish to be involved in? 
– Tobias You have created a large number of successful open source libraries and tools during your career. What are some of the projects that may be less well known that you think people might find interesting? – Tobias (e.g. logbook) I notice that you recently worked on the pipsi project. Please tell us about it! – Chris Following on from the last question, where would you like to see the Python packaging infrastructure go in the future? – Chris You have had some strong opinions of Python 2 vs Python 3. How has your position on that subject changed over time? – Tobias Let’s talk about Lektor – what differentiates it from the pack, and what keeps you coming back to CMS projects? – Chris How has your blogging contributed to the work that you do and the success you have achieved? – Tobias Lately you have been doing a fair amount of work with Rust. What was your reasoning for learning that language and how has it influenced your work with Python? – Tobias In addition to the code you have written, you also helped to form the Pocoo organization. Can you explain what Pocoo is and what it does? What has inspired the rebranding to the Pallets project? – Tobias Keep In Touch Twitter Picks Tobias Radical Candor Chris Loverbeer BeerBrugna The Human Resource Machine Armin Biermanufaktur Loncium Matakustix – Hai Hai Haibodn Links PHPbb Pocoo Pallets Project The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/26/2016 • 1 hour, 20 seconds

Bandit with Tim Kelsey, Travis McPeak, and Eric Brown

Summary Making sure that your code is secure is a difficult task. In this episode we spoke to Eric Brown, Travis McPeak, and Tim Kelsey about their work on the Bandit library, which is a static analysis engine to help you find potential vulnerabilities before your application reaches production. We discussed how it works, how to make it fit your use case, and why it was created. Give the show a listen and then go start scanning your projects! Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project. And they just doubled the RAM for their introductory level servers, so that $20 will get you even more performance. We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com and use the code podcastinit at signup to get a $50 credit! Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Tim Kelsey and Eric Brown about Bandit which is a static analysis engine for finding security vulnerabilities in your Python code. 
Interview with Eric Brown, Travis McPeak and Tim Kelsey Introductions How did you get introduced to Python? – Chris What is Bandit and what was the inspiration for creating it? – Tobias How did you each get involved with the Bandit project? – Tobias At what stage of the development process would you want to use Bandit? – Tobias What kinds of analysis does Bandit do on the source code that it is run against? – Tobias How does it determine whether a particular segment of code is introducing a vulnerability and what means does it use to determine the severity? – Tobias What does the generated report include and what can be done with that information? – Tobias What are some of the biggest design and implementation difficulties that have been encountered in the process of creating Bandit? – Tobias How does Bandit compare to similar tools in other languages such as Ruby’s Brakeman? – Tobias What are some of the most interesting extensions that you have seen for Bandit? – Tobias What is on the roadmap for the future of Bandit? – Tobias Keep In Touch OpenStack Security IRC OpenStack Security Weekly Meeting Tim Twitter Travis Twitter Picks Tobias Toggl Listener Review of Toggl Any.do Tim IFTTT (If This Then That) Eric Slack Travis Brilliance Trilogy Uncharted 4 Risky Business Podcast The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/18/2016 • 28 minutes, 48 seconds

Sentry with David Cramer

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary As developers we all have to deal with bugs sometimes, but we don’t have to make our users deal with them too. Sentry is a project that automatically detects errors in your applications and surfaces the necessary information to help you fix them quickly. In this episode we interviewed David Cramer about the history of Sentry and how he has built a team around it to provide a hosted offering of the open source project. We covered how the Sentry project got started, how it scales, and how to run a company based on open source. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show, subscribe, join our newsletter, check out the show notes, and get in touch you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com and use the code podcastinit at signup to get a $50 credit! Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing David Cramer about Sentry which is an open source and hosted service for capturing and tracking exceptions in your applications. Interview with David Cramer Introductions How did you get introduced to Python? 
– Chris What is Sentry and how did it get started? – Tobias What led you to choose Python for writing Sentry and would you make the same choice again? – Tobias Error reporting needs to be super light weight in order to be useful. What were some implementation challenges you faced around this issue? – Chris Why would a developer want to use a project like Sentry and what makes it stand out from other offerings? – Tobias When would someone want to use a different error tracking service? – Tobias Can you describe the architecture of the Sentry project both in terms of the software design and the infrastructure necessary to run it? – Tobias What made you choose Django versus another Python web framework, and would you choose it today? – Chris What languages and platforms does Sentry support and how does a developer integrate it into their application? – Tobias One of the big discussions in open source these days is around maintainability and a common approach is to have a hosted offering to pay the bills for keeping the project moving forward. How has your experience been with managing the open source community around the project in conjunction with providing a stable and reliable hosted service for it? – Tobias Are there any benefits to using the hosted offering beyond the fact of not having to manage the service on your own? – Tobias Have you faced any performance challenges implementing Sentry’s server side? – Chris What advice can you give to people who are trying to get the most utility out of their usage of Sentry? – Tobias What kinds of challenges have you encountered in the process of adding support for such a wide variety of languages and runtimes? – Tobias Capturing the context of an error can be immensely useful in finding and solving it effectively. Can you describe the facilities in Sentry and Raven that assist developers in providing that information? 
– Tobias It’s challenging to create an effective method for aggregating incoming issues so that they are sufficiently visible and useful while not hiding or discarding important information. Can you explain how you do that and what the evolution of that system has been like? – Tobias I notice a lot of from __future__ import in Sentry. Does it support Python 3 and/or what’s the plan for getting there? – Chris Looking back to the beginning of the project, what are some of the most interesting and surprising changes that have happened during its lifetime? How does it differ from its original vision? – Tobias Keep In Touch Twitter Picks Tobias BPython Chris Developer on Fire Song Exploder David React Webpack Alpine Climbing Percy.io Red Rising Trilogy The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/12/2016 • 1 hour, 9 minutes, 27 seconds

Mercurial with Augie Fackler

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary As developers, one of the most important tools that we use daily is our version control system. Mercurial is one such tool that is written in Python, making it eminently flexible, customizable, and incredibly powerful. This week we spoke with Augie Fackler to learn about the history, features, and future of Mercurial. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Sentry this week. Stop hoping your users will report bugs. Sentry’s real-time tracking gives you insight into production deployments and information to reproduce and fix crashes. Check them out at getsentry.com and use the code podcastinit at signup to get a $50 credit! Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your hosts as usual are Tobias Macey and Chris Patti Today we’re interviewing Augie Fackler about the Mercurial version control system Interview with Augie Fackler Introductions How did you get introduced to Python? – Chris Can you describe what Mercurial is and how the project got started? – Tobias How did you get involved with working on Mercurial? 
– Tobias What are some of the features that can be found in Mercurial which are lacking in similar tools such as Git or Bazaar? – Tobias One of the common complaints with Git is that its human interface could use some work. How is Mercurial’s UX an improvement over Git? – Chris For someone who is using Mercurial to work with a Git or other VCS repository, what are some of the edge cases that they should watch out for? Are there certain operations that could be performed in Mercurial which would break that compatibility layer? – Tobias How is Mercurial architected and what are some of the design choices that allow for it to be so flexible and extensible? – Tobias One of the core goals of Mercurial is for it to be safe. Can you explain what safety means in this context and how it is architected to achieve that goal? – Tobias One of the noteworthy aspects of Mercurial is the strong focus on making extensions a first-class concern in the project, so much so that a number of the core functions are written as extensions. Can you describe why that is and how the extensions plug into the core execution engine? – Tobias What are some of the most notable extensions that are available for use with Mercurial? – Tobias For someone who is familiar with Git, what are some of the concepts that they would need to learn about in order to use Mercurial in an idiomatic way? – Tobias A large part of the reason that Git has seen such large adoption is due to the prevalence of GitHub. There is the option of using BitBucket when using Mercurial. Are there any other noteworthy Mercurial hosting options? Do you think that the dearth of open source mercurial servers is partially due to the fact that Mercurial ships with a functional server built in? – Tobias Can you share some of the most recent features that have been added to Mercurial? – Tobias What do you have planned for the future of Mercurial? 
– Tobias How do you think current day DVCS systems like Mercurial, Git and Darcs might evolve in the future? – Chris Keep In Touch Twitter Picks Tobias Sapiens: A Brief History of Humankind by Yuval Noah Harari Cultures of Continuous Learning Keynote by Vanessa Hurst Chris Intro to Django Video Series Transistor Podcast Embedded Podcast Augie Leviathan Wakes Three Body Problem Prometheus Links Mercurial: The Definitive Guide Online Print Revsets Git Pickaxe Facebook Mercurial Post Remote File Log Gerrit Kallithea Reviewboard Mozilla Review Board A Case of Computational Thinking: The Subtle Effect of Hidden Dependencies on the User Experience of Version Control The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/5/2016 • 55 minutes, 11 seconds

Pillow with Alex Clark

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary If you need to work with images then Pillow is the library to use. The Python Imaging Library (PIL) has long been the gold standard for resizing, analyzing, and processing pictures in Python. Pillow is the modern fork that is bringing the PIL into the future so that we can all continue to use it moving forward. This week I spoke with Alex Clark about what first led him to fork the project and his experience maintaining it, including the migration to Python 3. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We also have a new sponsor this week. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your host as usual is Tobias Macey Today we’re interviewing Alex Clark about the Pillow project Interview with Alex Clark Introductions How did you get introduced to Python? 
– Tobias What were you working on that led you to forking the Python Imaging Library (PIL)? – Tobias What does Fredrik Lundh (author of PIL) think of Pillow? When you first forked the PIL project did you think that you would still be maintaining and updating that fork by now? – Tobias Who else works on the project with you and how did they get involved? – Tobias What kinds of special knowledge or experience have you found to be necessary for understanding and extending the routines in the library and for adding new capabilities? – Tobias Can you describe what PIL and now Pillow are and what kinds of use cases they support? – Tobias How does Pillow compare to libraries with a similar purpose such as ImageMagick? – Tobias I have seen Pillow used in computer vision contexts. What are some of the capabilities of the library that lend themselves to this purpose? – Tobias What architectural patterns does Pillow use to make image operations fast and flexible? Have you found the need to do any significant refactorings of the original code to make it compatible with modern uses and execution environments? – Tobias Have you kept up to date with newer image formats, such as WebP? Are there any image formats that Pillow does not support that you would like to see added to the project? – Tobias What are some of the most interesting or innovative uses of Pillow that you have seen? – Tobias What do you have planned for the future of Pillow? – Tobias Keep In Touch Website Picks Tobias Minimalist Baker Bisect module Alex Muse – Uprising Fanstatic Links Image-SIG Random (Psychedelic) Art The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/28/2016 • 20 minutes, 1 second

Wagtail with Tom Dyson

Visit our site to sign up for the newsletter, explore past episodes, subscribe to the show, and help support our work. Summary If you are operating a website that needs to publish and manage content on a regular basis, a CMS (Content Management System) becomes the obvious choice for reducing your workload. There are a plethora of options available, but if you are looking for a solution that leverages the power of Python and exposes its flexibility then you should take a serious look at Wagtail. In this episode Tom Dyson explains how Wagtail came to be created, what sets it apart from other options, and when you should implement it for your projects. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We also have a new sponsor this week. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Tom Dyson about Wagtail, a modern and sophisticated CMS for Django. 
Interview with Tom Dyson Introductions How did you get introduced to Python? – Chris Can you start by explaining what a content management system is and why they are useful? – Tobias How did the Wagtail project get started and what makes it stand out from other comparable offerings? – Tobias What made you choose Django as the basis for the project as opposed to another framework or language such as Pyramid, Flask, or Rails? – Tobias What is your target user and are there any situations in which you would encourage someone to use a different CMS? – Tobias Can you explain the software design approach that was taken with Wagtail and describe the challenges that have been overcome along the way? – Tobias How did you approach the project in a way to make the CMS feel well integrated into the other apps in a given Django project so that it doesn’t feel like an afterthought? – Tobias For someone who wants to get started with using Wagtail, what does that experience look like? – Tobias What are some of the features that are unique to Wagtail? – Tobias Given that Wagtail is such a flexible tool, what are some of the gotchas that people should watch out for as they are working on a new site? – Tobias Does Wagtail have any built-in support for multi-tenancy? – Tobias Does Wagtail have a plugin system to allow developers to create extensions to the base CMS? – Tobias Having built such a sizable plugin with deep integrations to Django, what are some of the shortcomings in the framework that you would like to see improved? – Tobias Keep In Touch Twitter Site GitHub Picks Tobias Pumpkin Pie Tom Hasbean Ethiopian Coffee Hario V60 Links Royal College of Arts Simon Willison’s Blog Vagrant Willow project Django Model Cluster Divio The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/21/2016 • 52 minutes, 32 seconds

Buildbot with Pierre Tardy

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary As technology professionals, we need to make sure that the software we write is reliably bug free and the best way to do that is with a continuous integration and continuous deployment pipeline. This week we spoke with Pierre Tardy about Buildbot, which is a Python framework for building and maintaining CI/CD workflows to keep our software projects on track. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show, subscribe, join our newsletter, check out the show notes, and get in touch you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Rollbar this week. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Pierre Tardy about the Buildbot continuous integration system. Interview with Pierre Tardy Introductions How did you get introduced to Python? – Chris For anyone who isn’t familiar with it can you explain what Buildbot is? – Tobias What was the original inspiration for creating the project? – Tobias How did you get involved in the project? 
– Tobias Can you describe the internal architecture of Buildbot and outline how a typical workflow would look? – Tobias There are a number of packages out on PyPI for doing subprocess invocation and control, in addition to the functions in the standard library. Which does buildbot use and why? – Chris What makes Buildbot stand out from other CI/CD options that are available today? – Tobias Scaling a large CI/CD system can become a challenge. What are some of the limiting factors in the Buildbot architecture and in what ways have you seen people work to overcome them? – Tobias Are there any design or architecture choices that you would change in the project if you were to start it over? – Tobias If you were starting from scratch on implementing buildbot today, would you still use Python? Why? – Chris What are some of the most difficult challenges that have been faced in the creation and evolution of the project? – Tobias What are some of the most notable uses of Buildbot and how do they uniquely leverage the capabilities of the framework? – Tobias What are some of the biggest challenges that people face when beginning to implement Buildbot in their architecture? – Tobias Does buildbot support the use of docker or public clouds as a part of the build process? – Chris I know that the execution engine for Buildbot is written in Twisted. What benefits does that provide and how has that influenced any efforts for providing Python 3 support? – Tobias Does buildbot support build parallelization at all? For instance splitting one very long test run up into 3 instances each running a section of tests to cut build time? – Chris What are some of the most requested features for the project and are there any that would be unreasonably difficult to implement due to the current design of the project? – Tobias Does buildbot offer a plugin system like Jenkins does, or is there some other approach it uses for custom extensions to the base buildbot functionality? 
– Chris Managing a reliable build pipeline can be operationally challenging. What are some of the thorniest problems for Buildbot in this regard and what are some of the mechanisms that are built in to simplify the operational characteristics? – Tobias What were some of the challenges around supporting slaves running on platforms with very different environmental characteristics like Microsoft Windows? – Chris What is on the roadmap for Buildbot? – Tobias Keep In Touch Buildbot Website GitHub Picks Tobias Viking Safety Razor Chris Lifeline Suzaku Sake Links Crossbar.io The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/14/2016 • 1 hour, 25 minutes, 7 seconds

Onion IoT with Lazar and Zheng

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary One of the biggest new trends in technology is the Internet of Things and one of the driving forces is the wealth of new sensors and platforms that are being continually introduced. In this episode we spoke with the founder and head engineer of one such platform named Onion. The Omega board is a new hardware platform that runs OpenWRT and lets you configure it using a number of languages, not least of which is Python. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We are also sponsored by Rollbar this week. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan. Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. The Open Data Science Conference in Boston is happening on May 21st and 22nd. If you use the code EP during registration you will save 20% off of the ticket price. If you decide to attend then let us know, we’ll see you there! 
Your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Lazar and Zheng about the Onion IoT platform Interview with Lazar and Zheng Introductions How did you get introduced to Python? – Chris What is the Onion platform and how does it leverage Python? – Tobias Can you compare and contrast the Python support you provide for Onion as compared with Raspberry Pi? – Chris I noticed that you are using the OpenWRT distribution of Linux in order to provide support for multiple languages. What was the driving intent behind choosing it and why is multiple language support so important for an IoT product? – Tobias Do you provide any libraries for using with the Omega to abstract away some of the hardware level tasks? What are some of the design considerations that were involved when developing that? – Tobias What are some of the most interesting projects you have seen people build with Python on your platform? – Tobias Keep In Touch Forum Twitter Picks Tobias Now You See Me Chris Portrait / Landscape Phone / Tablet Stand Tom Bihn Bags Lazar Ex Machina The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/7/2016 • 35 minutes, 51 seconds

LibCloud with Anthony Shaw

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary More and more of our applications are running in the cloud and there are increasingly more providers to choose from. The LibCloud project is a Python library to help us manage the complexity of our environments from a uniform and pleasant API. In this episode Anthony Shaw joins us to explain how LibCloud works, the community that builds and supports it, and the myriad ways in which it can be used. We also got a peek at some of the plans for the future of the project. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project The Open Data Science Conference in Boston is happening on May 21st and 22nd. If you use the code EP during registration you will save 20% off of the ticket price. If you decide to attend then let us know, we’ll see you there! Your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Anthony Shaw about the Apache LibCloud project Interview with Anthony Shaw Introductions How did you get introduced to Python? – Chris What is LibCloud and how did it get started? 
– Tobias How much overhead does using libcloud impose versus native SDKs for performance-sensitive APIs like block storage? – Chris What are some of the design patterns and abstractions in the library that allow for supporting such a large number of cloud providers with a mostly uniform API? – Tobias Given that there are such differing services provided by the different cloud platforms, do you face any difficulties in exposing those capabilities? – Tobias How does LibCloud compare to similar projects such as the Fog gem in Ruby? – Tobias What inspired the choice of Python as the language for creating the LibCloud project? Would you make the same choice again? – Tobias Which versions of Python are supported and what challenges has that created? – Tobias What is your opinion on the state of PyPI as a package maintainer? What statistics are most useful to you and what else do you wish you could track? – Tobias Could you walk our listeners through the under-the-covers process of instantiating a compute instance in, say, Azure using libcloud? – Chris Does LibCloud have any native support for parallelization, such as for the purpose of launching a large number of compute instances simultaneously? – Tobias What does it mean to be an Apache project and what benefits does it provide? – Tobias What are some of the most notable projects that leverage LibCloud for interacting with platform and infrastructure service providers? – Tobias Could you describe how libcloud could be extended to abstract away a new type of service that’s not yet supported – e.g. a database? – Chris Would you suggest that libcloud users extend libcloud to cover ‘native’ services they might use like AWS Lambda, or should they mix libcloud and ‘native’ SDKs in cases like this? – Chris Could you talk a little bit about the cloud-oriented network services that libcloud supports? Is it possible to create AWS VPCs, subnets, etc. using libcloud? 
– Chris Do you know if people use LibCloud for abstracting the APIs of a single cloud provider, even if they don’t have any intention of using a different platform? – Tobias Do you think that people are more likely to use LibCloud for bridging across multiple public cloud platforms, or is it more commonly used in a hybrid cloud type of environment? – Tobias What is on the roadmap for LibCloud that people should keep an eye out for? – Tobias Keep In Touch Twitter GitHub GitHub Picks Tobias Blue Yeti Microphone Diablo Swing Orchestra Chris Rosewill RK Keycaps Enki Catch 22 Anthony Hidden Brain Podcast PyKwalify Doing Nothing Links Dimension Data Austin Bingham and Robert Smallshire Pluralsight Python Training CloudKick PyPI Ranking website Apache JClouds SaltStack Scalr Apache Software Foundation Mist.io StackStorm The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/1/2016 • 1 hour, 24 minutes, 34 seconds

Pip and the Python Package Authority with Donald Stufft

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary As Python developers we have all used pip to install the different libraries and projects that we need for our work, but have you ever wondered about who works on pip and how the package archive we all know and love is maintained? In this episode we interviewed Donald Stufft who is the primary maintainer of pip and the Python Package Index about how he got involved with the projects, what kind of work is involved, and what is on the roadmap. Give it a listen and then give him a big thank you for all of his hard work! Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Google Play Music just launched support for podcasts, so now you can check us out there and subscribe to the show. Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project We also have a new sponsor this week. Rollbar is a service for tracking and aggregating your application errors so that you can fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcatinit to get 90 days and 300,000 errors for free on their bootstrap plan. The Open Data Science Conference in Boston is happening on May 21st and 22nd. 
If you use the code EP during registration you will save 20% off of the ticket price. If you decide to attend then let us know, we’ll see you there! Your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Donald Stufft about Pip and the Python Packaging Authority Interview with Donald Stufft Introductions How did you get introduced to Python? – Chris How did you get involved with the Pip project? – Tobias What is the Python Packaging Authority and what does it do? – Tobias How is PyPI / the Python Packaging Authority funded? – Chris What is your opinion on the current state of Python packaging? Are there lessons from other languages and package managers that you think should be adopted by Python? – Tobias What was involved in getting pip into the standard Python distribution? Was there any controversy around this? – Chris Can you describe some of the mechanics of Pip and how it differs from the other packaging systems that Python has used in the past? – Tobias Does pip interact at all with virtualenv, pyenv and the like? – Chris The newest package format for Python is the wheel system. Can you describe what that is and what its benefits are? – Tobias What are the biggest challenges that you have encountered while working on Pip? – Tobias What does the infrastructure for the Python Package Index look like? – Tobias What have been some of the challenges around scaling PyPI’s infrastructure to meet demand? – Chris You’re currently working on a replacement for the PyPI site with the Warehouse project. Can you explain your motivation for that and how it improves on the current system? – Tobias Where do you see the future of dependency management in Python headed? – Chris A few days ago there was a big story about how an NPM library was removed from the index, breaking a large number of dependent projects and applications. Do you think that anything like that could happen in the Python ecosystem? – Tobias What’s on the roadmap for Pip? 
– Tobias Keep In Touch GitHub DistUtils Special Interest Group Email @dstufft on Twitter Picks Tobias Xiki Chris Agar.io Culprate TCP/IP Illustrated Volume I: The Protocols Donald Linux on Windows 10 Links Bandersnatch Wheel Warehouse pypa/warehouse PyPI Sponsors56 DevPI The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/23/2016 • 52 minutes, 59 seconds

StackStorm with Tomaž Muraus and Patrick Hoolboom

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary If you are responsible for managing any amount of servers, then you know that automation is critical for maintaining your sanity. This week we spoke with Tomaž Muraus and Patrick Hoolboom about their work on StackStorm, which is a platform for tracking and reacting to events in your infrastructure. By allowing you to register actions with event triggers it frees you from having to worry about a whole class of concerns so that you can focus on building new capabilities rather than babysitting what you already have. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project I would also like to thank Hired, a job marketplace for developers and designers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. ODSC East in Boston is happening on May 21st – 22nd. Use the discount code EP for 20% off when you register Your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Tomaž Muraus and Patrick Hoolboom about the StackStorm project, which is an event-driven system automation framework. 
Interview Introductions How did you get introduced to Python? – Chris What is StackStorm and what problems does it solve? – Tobias What was your inspiration for creating StackStorm and what were some of the biggest architectural and design challenges? – Tobias What made you choose Python for StackStorm’s implementation rather than another language like Go? – Chris Can you describe the architecture of StackStorm and what the setup looks like? – Tobias Other than chat-driven events, what types of event sources does StackStorm support, and what use cases do those alternate event streams enable? – Chris The home page describes StackStorm as being an event-driven framework for automating the user’s infrastructure. What kinds of capabilities are made possible by this and do you think that it simplifies or complicates the work of operations engineers? – Tobias Is there a minimum or maximum size of infrastructure for which it would make sense to use StackStorm? – Tobias It looks like StackStorm is made up of a number of discrete components. What do the components use to communicate, and how did those choices influence the design of StackStorm’s overall architecture? – Chris I use SaltStack in my work which is a tool that also focuses on event-driven architecture. Can you compare and contrast the capabilities and focus of StackStorm with the features of SaltStack? Would it make sense to use both frameworks in the same infrastructure? – Tobias One of the advertised features of StackStorm is a strong focus on ChatOps. Can you explain that concept for people who might not be familiar with it and describe why it is such a useful paradigm? – Tobias Extensibility is a critical capability for an operations platform due to the wide variety of environments that people are inclined to build. In StackStorm the unit of extensibility is a pack. Can you describe what a pack is and how you arrived at that abstraction? 
– Tobias Have you encountered any situations in which the concept of a pack has been the wrong abstraction and made something more difficult than it may have been otherwise? – Tobias In very large scale environments like Netflix, how would one build a StackStorm cluster to handle the immense load? More specifically, how does one determine what kinds of machine resources each component needs? – Chris Management of credentials is always a difficult problem in operations. Does StackStorm attempt to tackle that issue or does it defer that responsibility to other systems, such as the user’s configuration management platform? – Tobias Does StackStorm interface with Kibana, Splunk or other log / metric aggregation packages? – Chris What are some of the most surprising uses that you have heard of from people using the platform? – Tobias Keep In Touch Tomaž Twitter website/blog Patrick Twitter Picks Tobias SAWS Bill Peet Chris Grimm Brewing Subliminal Message Sour Red Ale Lobste.rs Medium Tomaž Understanding Air France 447 Aviation Herald Patrick True Nutrition JP Cycles The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/16/2016 • 59 minutes, 22 seconds

Hypothesis with David MacIver

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary Writing tests is important for the stability of our projects and our confidence when making changes. One issue that we must all contend with when crafting these tests is whether or not we are properly exercising all of the edge cases. Property-based testing is a method that attempts to find all of those edge cases by generating randomized inputs to your functions until a failing combination is found. This approach has been popularized by libraries such as Quickcheck in Haskell, but now Python has an offering in this space in the form of Hypothesis. This week, the creator and maintainer of Hypothesis, David MacIver, joins us to tell us about his work on it and how it works to improve our confidence in the stability of our code. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project The Open Data Science Conference is happening on May 21st – 22nd in Boston. Use the discount code EP for 20% off when you register Your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing David MacIver about the Hypothesis project which is an advanced Quickcheck implementation for Python. 
Interview with David MacIver Introductions How did you get introduced to Python? – Chris Can you provide some background on what Quickcheck is and what inspired you to write an implementation in Python? – Tobias Are there any ways in which Hypothesis improves on the original design of Quickcheck? – Tobias Can you walk us through the execution of a simple Hypothesis test to give our listeners a better sense for what Hypothesis does? – Chris Have you had trouble getting people to use Hypothesis? How has adoption been? – David What does this sort of testing get you that conventional testing doesn’t? – David Why do you think this sort of testing hasn’t caught on in the Python world before? – David Are there any facilities of the Python language that make your job easier? Are there aspects of the language that make this style of testing more difficult? – Tobias What are some of the design challenges that you have been presented with while working on Hypothesis and how did you overcome them? – Tobias Given that testing is an important part of the development process for ensuring the reliability and correctness of the system under test, how do you make sure that Hypothesis doesn’t introduce uncertainty into this step? – Tobias Given the sophisticated nature of the internals of Hypothesis, do you find it difficult to attract contributors to the project? – Tobias A few months ago you went through some public burnout with regards to open source and Hypothesis in particular, but circumstances have brought you back to it with a more focused plan for making it sustainable. Can you provide some background and detail about your experiences and reasoning? – Tobias What’s next for Hypothesis? – Chris Keep In Touch Twitter Blog NewsLetter Picks Tobias TypeForm Listener Survey CI Survey Chris Seashine CheckIO Mike Coutermarsh’s Jr. 
Developer series David Make It Stick by Peter Brown Beeminder Vorkosigan Saga by Lois McMaster Bujold The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
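
The summary above describes property-based testing as generating randomized inputs until a failing combination is found. The snippet below is a minimal standard-library sketch of that idea only; it is not Hypothesis's API or implementation. Every name in it (buggy_sort, keeps_length_and_order, find_counterexample, shrink) is hypothetical and invented for illustration, and the naive "shrink" step is an assumption about how a failing input might be reduced to a smaller one.

```python
import random


def buggy_sort(xs):
    """Deliberately flawed sort (hypothetical): it silently drops duplicates."""
    return sorted(set(xs))


def keeps_length_and_order(xs, out):
    """Property: output has the same length as the input and is ascending."""
    ascending = all(a <= b for a, b in zip(out, out[1:]))
    return len(out) == len(xs) and ascending


def find_counterexample(func, prop, trials=1000, seed=0):
    """Try random integer lists until the property fails; None if it never does."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-5, 5) for _ in range(rng.randint(0, 8))]
        if not prop(xs, func(list(xs))):
            return xs
    return None


def shrink(xs, func, prop):
    """Greedily drop single elements for as long as the property keeps failing."""
    changed = True
    while changed:
        changed = False
        for i in range(len(xs)):
            candidate = xs[:i] + xs[i + 1:]
            if not prop(candidate, func(list(candidate))):
                xs = candidate
                changed = True
                break
    return xs


if __name__ == "__main__":
    failing = find_counterexample(buggy_sort, keeps_length_and_order)
    print("failing input:", failing)
    print("shrunk input:", shrink(failing, buggy_sort, keeps_length_and_order))
```

Fixing the seed keeps the run reproducible, and the shrinking loop turns a noisy random failure into a short input that is easier to reason about. Real tools in this space do considerably more than this sketch.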

4/9/2016 • 47 minutes, 1 second

Pyjion with Dino Viehland and Brett Cannon

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary In an attempt to improve the performance characteristics of the CPython implementation, Dino Viehland began work on a patch to allow for a pluggable interface to a JIT (Just In Time) compiler. His employer, Microsoft, decided to sponsor his efforts and the result is the Pyjion project. In this episode we spoke with Dino Viehland and Brett Cannon about the goals of the project, the progress they have made so far, and the issues they have encountered along the way. We also made an interesting detour to discuss the general state of performance in the Python ecosystem and why the GIL isn’t the bogeyman it’s made out to be. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project I would also like to thank Hired, a job marketplace for developers and designers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. 
Your hosts as usual are Tobias Macey and Chris Patti Open Data Science Conference, Boston MA May 21st – 22nd, use the discount code EP at registration for 20% off Today we are interviewing Brett Cannon and Dino Viehland about their work on Pyjion, a CPython extension that provides an API to allow for plugging a JIT compilation engine into the CPython runtime. Interview with Brett Cannon and Dino Viehland Introductions How did you get introduced to Python? – Chris What was the inspiration for the Pyjion project and what are its goals? – Tobias The FAQ mentions that Pyjion could easily be made cross platform, but this being a Microsoft project it was bootstrapped on Windows. Have any of the discrete tasks required to get Pyjion running under OSX or Linux been laid out even in outline form? – Chris Given that this is a Microsoft backed project it makes sense that the first JIT engine to be implemented is for the CoreCLR. What would an alternative implementation provide and in what ways can a JIT framework be tuned for particular workloads? – Tobias What kinds of use cases and problem domains that were previously impractical will be enabled by this? – Tobias Does Microsoft’s recent acquisition of Xamarin and the Mono project change things for the Pyjion project at all? – Chris What are the challenges associated with your work on Pyjion? Are there certain aspects of the Python language and the CPython implementation that make the work more difficult than it might be otherwise? – Tobias When I think of Microsoft and programming languages I generally think of C++ and C#. Did your team have to go through an approval process in order to utilize Python, and further to open source your work on Pyjion? – Chris How does Pyjion hook into the CPython runtime and what kinds of primitives does it expose to JIT engines for them to be able to work with? 
– Tobias Would an entire project be run through the JIT engine during runtime or is it possible to target a subset of the code being executed? – Tobias In what ways can a JIT compiler implementation be purpose-built for a given workload and how would someone go about creating one? – Tobias Could a JIT plugin be designed with different trade-offs, like no C API compatibility, but that worked around the GIL to provide real concurrency in Python? – Chris One of the most notable benefits of having a JIT implementation for the CPython runtime is the fact that modules with C extensions can be used, such as NumPy. Does that pose any difficulties in the compilation methods used for optimizing the Python portion of the code? – Tobias What kinds of performance improvements have you seen in your experimentation? – Tobias Which release of Python do you hope to have Pyjion incorporated into? – Tobias Has any thought been given to making Python a first class citizen in Visual Studio Code? – Chris What areas of the project could use some help from our listeners? – Chris Keep In Touch Dino GitHub Brett Twitter Blog Python Engineering @ Microsoft Blog Picks Tobias Logitech Wave MK550 SaltStack TestInfra SaltStack Formula Cookiecutter Chris Anchor – Public Radio for the People The Magicians Portal is a Feminist Masterpiece – PBS Gameshow Brett Breville Tea Maker Bodom Mugs Alto’s Adventure Dino Come Dine With Me The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/1/2016 • 1 hour, 10 minutes, 26 seconds

Transcrypt with Jacques de Hooge

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary Any programmer who has dealt with a website for any length of time knows that writing JavaScript isn’t always the most enjoyable. Wouldn’t you rather write that code in Python and just have it work on your website? In this episode we learn about Transcrypt with its creator Jacques de Hooge. Transcrypt is a Python to JavaScript transpiler that embraces the JavaScript ecosystem while letting you use the familiar syntax of Python for writing your logic, rather than trying to shoehorn a Python runtime into your browser. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project I would also like to thank Hired, a job marketplace for developers and designers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. ODSC East in Boston is happening on May 21st – 22nd. Use the discount code EP for 20% off when you register Your host today is Tobias Macey Today I am interviewing Jacques de Hooge about his work on the Transcrypt Project Interview with Jacques de Hooge Introductions How did you get introduced to Python? 
– Tobias What is Transcrypt and what inspired you to create it? – Tobias As you mention in the documentation, there are a number of projects that attempt to shoehorn Python into the browser. What makes Transcrypt different? – Tobias I like that you decided to embrace the web environment by calling into JavaScript libraries. What are some of the challenges that you encountered while creating that functionality? – Tobias How is the transpilation performed and what are some of the methods that you used to get the build size as small as it is? – Tobias Given the nature of JavaScript’s prototypal inheritance and differences in class semantics, I imagine that adding support for multiple inheritance and reflecting the structure of Python classes must have been challenging. Can you describe that process and how you arrived at your current solution? – Tobias Which aspects of the language were most difficult to translate to JavaScript? – Tobias Is Transcrypt complete and stable enough to be used in production? – Tobias Keep in Touch Transcrypt.org Forum Email Picks Tobias Cookiecutter Jacques Programming The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/26/2016 • 42 minutes, 10 seconds

VPython with Ruth Chabay and Bruce Sherwood

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary Wouldn’t it be nice to be able to generate interactive 3D visualizations of physical systems in a declarative manner with Python? In this episode we spoke with Ruth Chabay and Bruce Sherwood about the VPython project which does just that. They tell us about how they use VPython in their classrooms, how the project got started, and the work they have done to bring it into the browser. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project I would also like to thank Hired, a job marketplace for developers and designers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Ruth Chabay and Bruce Sherwood about their work on VPython Interview Introductions How did you get introduced to Python? – Chris What is VPython and how did it get started? – Tobias What problems inspired you to create VPython? 
– Chris How do you design an API that allows for such powerful 3D visualization while still making it accessible to students who are focusing on learning new concepts in mathematics and physics so that they don’t get overwhelmed by the tool? – Tobias I know many schools have embraced the open curriculum idea, have any of your physics courses using VPython been made available to the non-matriculating public? – Chris How does VPython perform its rendering? If you were to reimplement it would you do anything differently? – Tobias One of the remarkable points about VPython is its ability to execute the simulations in a browser environment. Can you explain the technologies involved to make that work? – Tobias Given the real-time rendering capabilities in VPython I’m sure that performance is a core concern for the project. What are some of the methods that are used to ensure an appropriate level of speed and does the cross-platform nature of the package pose any additional challenges? – Tobias How does collision detection work in VPython, and does it handle more complex assemblies of component objects? – Chris Can you talk a little bit about VPython’s design, and perhaps walk us through how a simple scene is rendered, say the results of the sphere() call? – Chris Keep In Touch VPython Forum Glowscript Forum Github Picks Tobias Land of Lisp by Conrad Barski, M.D. Chris The Magicians Swift Atari Logo Bruce VPython.org Glowscript.org Ruth matterandinteractions.org/student NetLogo Links Coursera GATech Intro to Physics Alice Project glowscript.org Jupyter VPython RapydScript The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/18/2016 • 1 hour, 3 minutes, 2 seconds

PyData London with Ian Ozsvald and Emlyn Clay

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary Ian Ozsvald and Emlyn Clay are co-chairs of the London chapter of the PyData organization. In this episode we talked to them about their experience managing the PyData conference and meetup, what the PyData organization does, and their thoughts on using Python for data analytics in their work. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project I would also like to thank Hired, a job marketplace for developers and designers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Ian Ozsvald and Emlyn Clay about their work with PyData London, a group within the PyData organization. PyData London represents the largest Python group in London at ~2850 members, they hold regular monthly meetups for ~200 members at AHL near Bank and a yearly conference for around ~300 members. Last year, they and their sponsors raised over £26,000 to sponsor the development of core numerical libraries in Python. 
Use the promo code podcastinit20 to get a $20 credit when you sign up! On Hired software engineers & designers can get 5+ interview requests in a week and each offer has salary and equity upfront. With full time and contract opportunities available, users can view the offers and accept or reject them before talking to any company. Work with over 2,500 companies from startups to large public companies hailing from 12 major tech hubs in North America and Europe. Hired is totally free for users and if you get a job you’ll get a $2,000 “thank you” bonus. If you use our special link to sign up, then that bonus will double to $4,000 when you accept a job. If you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus when they accept a job. Interview Introductions How did you get introduced to Python? – Chris What is the PyData organization, how does PyData London fit into it and what is your relationship with it? – Tobias In what ways does a PyData conference differ from a PyCon? – Tobias Does PyData do anything in particular to encourage users from disciplines that might not be aware of how much our community has to offer to choose the Python suite of data analysis tools? – Chris You have both spent a good portion of your careers using Python for working with and analyzing data from various domains. How has that experience evolved over the past several years as newer tools have become available? – Tobias For someone who is just getting started in the data analytics space, what advice can you give? – Tobias How can conferences like PyData help strengthen the bonds and synergies between the Python software community and the sciences? – Chris There are a number of different subtopics within the blanket categorization of data science. Is it difficult to balance the subject matter in PyData conferences and meetups to keep members of the audience from being alienated? 
– Tobias Data science is a young field and we’ve yet to see lots of examples of the successful use of data. How are London-based companies using data with Python? – Ian Is there a Python data science library you think needs a little love? – Emlyn Keep In Touch Ian Blog Twitter Emlyn Twitter Picks Tobias xcape Keybase Filesystem Chris The Player of Games Undertale The Big Short Ian Seaborn: Python visualisation tool Mastering Predictive Analytics with R: Rui Miguel Forte Allergic Rhinitis research using ML London Unreal City Audio Tour Emlyn ipython nbconvert --template flag Damian Avila’s Blog post on making slides with IPython Notebook The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/12/2016 • 1 hour, 3 minutes, 11 seconds

Efene with Mariano Guerra

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary Efene is a language that runs on the Erlang Virtual Machine (BEAM) and is inspired by the Zen of Python. It is intended as a bridge language that serves to ease the transition into the Erlang ecosystem for people who are coming from languages like Python. In this episode I spoke with Mariano Guerra, the creator of Efene, about how Python influenced his design choices, why you might want to use it, and when Python is the better tool. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Your host today is Tobias Macey Today we are interviewing Mariano Guerra about his work on the Efene language. Interview with Mariano Guerra Introductions How did you get introduced to Python? – Chris So Efene is a language that runs on the BEAM VM which you say was at least partially inspired by the Zen of Python. 
Can you explain in greater detail in what form that inspiration manifested and some of the process involved in the creation of Efene? – Tobias What inspired you to create Efene and what problems does it solve? – Tobias How does Efene compare to other BEAM based languages such as Elixir? – Tobias When would a Python developer want to consider using Efene? – Tobias What benefits does the BEAM provide that can’t be easily replicated in the Python ecosystem? – Tobias Does the Efene language ease the transition to a more functional mindset for developers who are already familiar with Python paradigms? – Tobias I understand that you are experimenting with another language implementation that runs on the BEAM. Can you describe that project and compare it to Efene? What were your inspirations? – Tobias Keep In Touch Twitter GitHub Blog Efene Emesene Python Argentina Picks Tobias Dotphiles The Unreasonable Effectiveness of Dynamic Typing for Practical Programs Mariano Om Next David Nolen on Om Next ClojureScript Things Network Links Erlang Elixir Lisp Flavored Erlang Joxa Rebar3 Erlang MK Hex Interfix The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

3/4/2016 • 59 minutes, 35 seconds

Functional Python with Matthew Rocklin and Alexander Schepanovski

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary What is functional programming, why would you want to use it, and how can you get started with it in Python? Our guests this week, Matthew Rocklin and Alexander Schepanovski, help us understand all of that and more. Matthew and Alexander have each created their own Python libraries to make it easier to employ functional paradigms in your Python code. In this episode they help us understand the benefits that functional styles can have and what you can gain by trying them out for yourself. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project I would also like to thank Hired, a job marketplace for developers and designers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Your host today is Tobias Macey Today we are interviewing Matthew Rocklin and Alexander Schepanovski about their work on functional libraries for Python. Interview with Alexander and Matthew Introductions How did you get introduced to Python? 
– Chris Can you first explain what functional programming is and how it differs from the procedural or object oriented programming that most Pythonistas are familiar with? – Tobias How did you get started with functional programming? – Tobias What are the benefits of functional programming and when might someone want to use functional paradigms in their projects? – Tobias What is it about functional programming that people find so intimidating and what do you think has led to its recent rise in popularity? – Tobias What aspects of the Python language lend themselves to being used in a functional manner and where does it fall down? – Tobias Can you each describe what your respective libraries provide in terms of functional capabilities and what their particular focus is? Are they distinct enough from each other that it would make sense to use them both in a single project? – Tobias What inspired each of you to create your respective libraries? – Tobias There is a functools module in the Python standard library that provides some methods that enable functional paradigms. Where does that module fall short and how do your respective libraries augment or replace the functionality in that module? – Tobias There is also a library named fn.py which provides functional paradigms for use in Python. Can you each compare and contrast it with your own work? – Tobias There are a number of concepts involved in functional programming such as currying, function composition, immutable data, and pure functions. Can you describe some of those concepts and then explain which of them you tried to incorporate into your libraries? – Tobias What are some of the resources that you have found to be most helpful when trying to learn and apply functional principles to your programs? 
– Tobias Keep In Touch Alexander Twitter Blog Matthew Website Toolz Twitter GitHub Picks Tobias DataDog Alexander The Expanse Revolut Matthew Riemann Five Dances Distributed Links Rosetta Code PyToolz Funcy Fn.py MacroPy Code Transformer Simple Made Easy by Rich Hickey The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
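For listeners who want a concrete picture of the concepts named in the questions above (partial application as a stand-in for currying, function composition, immutable data, and pure functions), here is a minimal sketch using only the standard library's functools module. The helper names `compose`, `add`, `double`, and `add_five` are hypothetical and chosen for illustration; this is not code from the episode or from the guests' libraries, which offer their own helpers for these ideas.

```python
from functools import partial, reduce


def compose(*funcs):
    """Compose functions right to left, so compose(f, g)(x) == f(g(x))."""
    # Start from the identity function and wrap one function at a time.
    return reduce(lambda f, g: lambda x: f(g(x)), funcs, lambda x: x)


def add(a, b):
    """Pure function: the result depends only on the arguments."""
    return a + b


def double(n):
    """Pure function with no side effects."""
    return n * 2


# Partial application fixes the first argument of add. This is close in
# spirit to currying, although functools.partial is not strict currying.
add_five = partial(add, 5)

# Composition builds a new function out of existing ones.
double_then_add_five = compose(add_five, double)

# A tuple is immutable, so the input data cannot be changed in place.
prices = (3, 4, 5)
adjusted = tuple(map(double_then_add_five, prices))
```

Because every function here is pure and the input is a tuple, `prices` is untouched after the call and `adjusted` is a new value, which is the property that makes this style easy to test and reason about.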

2/29/2016 • 1 hour, 20 minutes, 2 seconds

Cython with Craig Citro and Robert Bradshaw

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary Do you find yourself reaching for a different language when you need some extra speed? With Cython you can get the best of both worlds by writing your code in Python and executing it as compiled code. In this episode we were joined by Craig Citro and Robert Bradshaw from the Cython project to discuss how and when you might want to incorporate it into your applications. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Craig Citro and Robert Bradshaw Interview with Craig Citro and Robert Bradshaw Introductions How did you get introduced to Python? – Chris What is Cython and how did the project get started? – Tobias My understanding is that Cython can improve the performance of a Python program without even having to provide any type annotations. How does it manage to do that? 
– Tobias Can a Cython module be used as a way to sidestep the GIL? What are some of the pitfalls that can be caused by doing so? – Tobias Can you give some examples of how Cython can be used to improve the performance of Python programs? – Tobias How does Cython work under the covers? – Tobias What were some of the challenges during the creation of Cython and what design decisions were made to overcome them? – Tobias Does Python’s cross platform nature create any unique challenges when compiling down to the C level? – Chris What processor and system architectures does Cython support and are there plans to expand that support? – Tobias How do generators and list comprehensions map to C, and did those higher level language constructs pose any special challenges in Cython’s design? – Chris Would Rust ever be a potential compile target for performance and safety optimized modules? – Tobias Keep In Touch Craig Twitter GitHub Website Robert Email Picks Tobias Certificates, Reputation, and the Blockchain Craig Curious Kids Science Book by Asia Citro dplyr magrittr Everything Is Obvious: How Common Sense Fails Us by Duncan Watts Robert Mo Willems Philips Hue Lights Sage Math Cloud Links Sage (Math) Pyrex The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/19/2016 • 52 minutes, 2 seconds

Airflow with Maxime Beauchemin

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary Are you struggling with trying to manage a series of related, interdependent batch jobs? Then you should check out Airflow. In this episode we spoke with the project’s creator Maxime Beauchemin about what inspired him to create it, how it works, and why you might want to use it. Airflow is a data pipeline management tool that will simplify how you build, deploy, and monitor your complex data processing tasks so that you can focus on getting the insights you need from your data. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project I would also like to thank Hired, a job marketplace for developers and designers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Maxime Beauchemin about his work on the Airflow project. Interview with Maxime Beauchemin Introductions How did you get introduced to Python? – Chris What is Airflow and what are some of the kinds of problems it can be used to solve? 
– Chris What are some of the biggest challenges that you have seen when implementing a data pipeline with a workflow engine? – Tobias What are some of the signs that a workflow engine is needed? – Tobias Can you share some of the design and architecture of Airflow and how you arrived at those decisions? – Tobias How does Airflow compare to other workflow management solutions, and why did you choose to write your own? – Chris One of the features of Airflow that is emphasized in the documentation is the ability to dynamically generate pipelines. Can you describe how that works and why it is useful? – Tobias For anyone who wants to get started with using Airflow, what are the infrastructure requirements? – Tobias Airflow, like a number of the other tools in the space, supports interoperability with Hadoop and its ecosystem. Can you elaborate on why JVM technologies have become so prevalent in the big data space and how Python fits into that overall problem domain? – Tobias Airflow comes with a web UI for visualizing workflows, as do a few of the other Python workflow engines. Why is that an important feature for this kind of tool and what are some of the tasks and use cases that are supported in the Airflow web portal? – Tobias One problem with data management is tracking the provenance of data as it is manipulated and shuttled between different systems. Does Airflow have any support for maintaining that kind of information and if not do you have recommendations for how practitioners can approach the issue? – Tobias What other kinds of metadata can Airflow track as it executes tasks and what are some of the interesting uses you have seen or created for that information? – Tobias With all the other languages competing for mindshare, what made you choose Python when you built Airflow? – Chris I notice that Airflow supports Kerberos. It’s an incredibly capable security model but that comes at a high price in terms of complexity. 
What were the challenges and was it worth the additional implementation effort? – Chris When does the data pipeline/workflow management paradigm break down and what other approaches or tools can be used in those cases? – Tobias So, you wrote another tool recently called Panoramix. Can you describe what it is and maybe explain how it fits in the data management domain in relation to Airflow? – Tobias Keep In Touch Google Group Gitter GitHub Picks Tobias Empire of the East by Fred Saberhagen The Book of Swords by Fred Saberhagen Chris Buraka Som Sistema Star Wars – Despecialized Edition The Iron Druid Chronicles Maxime Flask App Builder The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

2/13/2016 • 1 hour, 3 minutes, 17 seconds

WSGI 2

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary The Web Server Gateway Interface, or WSGI for short, is a long-standing pillar of the Python ecosystem. It has enabled a vast number of web frameworks to proliferate by not having to worry about how exactly to interact with the HTTP protocol and focus instead on building a library that is robust, extensible, and easy to use. With recent evolutions to how we interact with the web, it appears that WSGI may be in need of an update and that is what our guests on this episode came to discuss. Cory Benfield is leading an effort to determine what if any modifications should be made to the WSGI standard or if it is time to retire it in favor of something new. Andrew Godwin has been hard at work building the Channels framework for Django to allow for interoperability with websockets. They bring their unique perspectives to bear on how and why we may want to consider bringing WSGI into the current state of the web. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. 
Use the link hired.com/podcastinit to double your signing bonus. Your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Cory Benfield and Andrew Godwin about a proposed update to the WSGI specification. Interview with Cory Benfield and Andrew Godwin Introductions How did you get introduced to Python? – Chris First off, what is WSGI? – Tobias What are some of the ways the current WSGI spec has fallen out of step with the needs of the modern developer? – Chris How did you come to be involved with the new WSGI specification? What brought you into this process? – Chris Do you think the WSGI name itself brings a lot of expectation, or is it good to keep it as a well-recognised Python landmark? – Tobias Would it be better to make a clean break and implement an entirely new set of APIs and style of interaction? – Tobias What kind of compatibility guarantees should be made between the current spec and the proposed upgrade? What would the impact be if the new specification was incompatible? – Tobias How has the response been to your call for comments? What are some of the most frequently raised concerns or suggestions? – Tobias What are some of the proposed changes to the specification? – Tobias Are there any future directions you think WSGI should take that perhaps haven’t been considered yet? – Chris Has your opinion or vision of the proposed update changed as you reviewed responses to the conversation on the mailing list? – Tobias Do you have any ideas of how to design the new specification in order to avoid a similar situation of needing to deprecate the current standards in order to accommodate new web protocols? – Tobias What are some of the points of contention or rigorous debate that have kept previous WSGI 2 attempts from succeeding? 
– Chris Keep In Touch Andrew Twitter GitHub Cory Twitter GitHub Picks Tobias Discourse Chris The Expanse Puerto Rico for iOS Dominion for iOS Splendor for iOS Cory Wusthof Knives Australian Football XCOM 2 Andrew Archery Tromsø Norway The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
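As background for the "what is WSGI?" question in this episode, the sketch below shows roughly what the current interface looks like from a framework author's side: an application is a callable that receives a request environment and a `start_response` callback, and returns an iterable of bytes. The names `hello_app` and `call_app` are hypothetical and exist only for this illustration; the example reflects the existing WSGI interface, not any of the changes proposed in the discussion.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A minimal WSGI application (hypothetical name, for illustration)."""
    # The server hands the app a dict of CGI-style request data.
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from {path}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # Status and headers are reported through the callback...
    start_response("200 OK", headers)
    # ...and the body is returned as an iterable of bytes.
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app without a network server, as a test harness might."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = response_headers

    chunks = app(environ, start_response)
    return captured["status"], captured["headers"], b"".join(chunks)
```

The same `hello_app` callable could be served for local experimentation with `wsgiref.simple_server.make_server("", 8000, hello_app).serve_forever()`. The interface is synchronous and built around one request producing one response, which helps explain why long-lived connections such as websockets come up in the conversation.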

2/7/2016 • 1 hour, 4 minutes, 46 seconds

SymPy With Aaron Meurer

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary Looking for an open source alternative to Mathematica or MATLAB for solving algebraic equations? Look no further than the excellent SymPy project. It is a well built and easy to use Computer Algebra System (CAS) and in this episode we spoke with the current project maintainer Aaron Meurer about its capabilities and when you might want to use it. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community at discourse.pythonpodcast.com to follow up with the guests and help us make the show better! I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit and double your signing bonus to $4,000. We are recording today on January 18th, 2016 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Aaron Meurer about SymPy Interview with Aaron Meurer Introductions How did you get introduced to Python? – Chris What is SymPy and what kinds of problems does it aim to solve? – Chris How did the SymPy project get started? – Tobias How did you get started with the SymPy project? – Chris Are there any limits to the complexity of the equations SymPy can model and solve? 
– Chris How does SymPy compare to similar projects in other languages? – Tobias How does SymPy render results using such beautiful mathematical symbols when the inputs are simple ASCII? – Chris What are some of the challenges in creating documentation for a project like SymPy that is accessible to non-experts while still having the necessary information for professionals in the fields of mathematics? – Tobias Which fields of academia and business seem to be most heavily represented in the users of SymPy? – Tobias What are some of the uses of SymPy in education outside of the obvious like students checking their homework? – Chris How does SymPy integrate with the Jupyter Notebook? – Chris Is SymPy generally used more as an interactive mathematics environment or as a library integrated within a larger application? – Tobias What were the challenges moving SymPy from Python 2 to Python 3? – Chris Are there features of Python 3 that simplify your work on SymPy or that make it possible to add new features that would have been too difficult previously? – Tobias Were there any performance bottlenecks you needed to overcome in creating SymPy? – Chris What are some of the interesting design or implementation challenges you’ve found when creating and maintaining SymPy? – Chris Are there any new features or major updates to SymPy that are planned? – Tobias How is the evolution of SymPy managed from a feature perspective? Have there been any occasions in recent memory where a pull request had to be rejected because it didn’t fit with the vision for the project? – Tobias Which of the features of SymPy do you find yourself using most often? 
– Tobias Picks Tobias Functional Geekery Nekrogoblikon Heavy Meta Marble Fun Run Chris Surprisingly Awesome All Watched Over by Machines of Loving Grace Pizzicato 5 Mayflower Hoppy Brown Ale Aaron Fermat’s Library catimg iTerm2 Keep In Touch Twitter Mailing List Gitter Channel Links Project Euler Richardson’s Theorem Doing Math With Python by Amit Saha (and Aaron’s book review) The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/31/2016 • 1 hour, 3 minutes, 6 seconds

RPython with Maciej Fijalkowski

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list. Summary RPython is a subset of Python that is used for writing high performance interpreters for dynamic languages. The most well-known product of this tooling is the PyPy interpreter. In this episode we had the pleasure of speaking with Maciej Fijalkowski about what RPython is, what it isn’t, what kinds of projects it has been used for, and what makes it so interesting. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. We are recording today on December 17th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Maciej Fijalkowski on RPython Interview with Maciej Fijalkowski Introductions How did you get introduced to Python? – Chris What is RPython and how does it differ from CPython? – Tobias Can you share some of the history of RPython in terms of the major improvements and design choices? 
– Tobias In the documentation it says that RPython is able to generate a Just In Time compiler for dynamic languages. Can you explain why that is significant and some of the ways that it does that? – Tobias The most well-known use of RPython is the PyPy interpreter for Python. Can you share some of the other languages that have been ported to the RPython runtime and how their performance has been improved or altered in the process? – Tobias Are there any languages that have been designed entirely for use with RPython, rather than translating an existing language to run on it? – Tobias Do you know of any cases where an application has been written to run directly on RPython? – Tobias What are the computer architecture and operating system platforms that RPython supports and do you have any plans to expand that support? – Tobias Are there any minimum hardware specifications that are necessary to be able to effectively run a language written against the RPython platform? – Tobias Is RPython similar in concept to other efforts like Parrot in the Perl world? – Chris Are there any particular areas of the project that you need help with and how can people get involved with the project? – Tobias Picks Tobias PyCoders 2015 Recap Shape Up Xbox One Xbox One Kinect Selfless Chris Skunk Bear Category 6 Environments Maciej PyCon South Africa Keep In Touch IRC Mailing List PyPy consultancy Links Psyco (Python JIT) Truffle HippyVM Topaz Pycket Pyxie-lang The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/22/2016 • 35 minutes, 34 seconds

Ben Darnell on Tornado

Visit our site to listen to past episodes, support the show, join our Discourse community, and sign up for our mailing list. Summary If you are trying to build a web application in Python that can scale to a high number of concurrent users, or you want to leverage the power of websockets, then Tornado just may be the library you need. In this episode we interview Ben Darnell about his work as the maintainer of the Tornado project and how it can be used in a number of ways to power your next high traffic site. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ We are also running a listener survey to get feedback about the show. You can find it at bit.do/podcastinit-survey. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project I would also like to thank Hired, a job marketplace for developers and designers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus to $4,000. Your hosts as usual are Tobias Macey and Chris Patti We recently launched a new Discourse forum for the show which you can find at discourse.pythonpodcast.com. Join us to discuss the show, the episodes, and ideas for future interviews. Today we are interviewing Ben Darnell about his work on Tornado Interview with Ben Darnell Introductions How did you get introduced to Python? – Chris What is Tornado and what sets it apart from other HTTP servers? 
– Chris How did you get involved with Tornado? – Ben What was the inspiration for the name? – Tobias Tornado was created before the recent focus on asynchronous applications. What prompted that design choice and when might someone care about using async in their development? – Tobias What is involved in creating an event loop and what are some of the specific design decisions that you made when implementing one for Tornado? – Tobias How does Tornado’s event loop compare to other packages such as Twisted or the asyncio module in the standard library? – Tobias The web module appears to provide a minimal framework for developing web apps. How scalable are those capabilities and is there a recommended architecture for people using Tornado to develop web applications? – Tobias What are some use cases in which a developer might choose Tornado over other similar options? – Chris Could you please give our listeners an overview of Tornado’s concurrency options including coroutines? – Chris I see that Tornado supports interoperability with the WSGI protocol and one of the use cases mentioned is for running a Django application alongside a Tornado app. Is that a common way for providing websocket capabilities alongside an existing web app? – Tobias I noticed that Tornado provides non-blocking versions of bare sockets and TCP connections. Are there any add-on packages available to simplify the use of various network protocols along the lines of what Twisted includes? – Tobias Please tell us about the transition of Tornado to Python 3. What obstacles did you face and how did you overcome them? – Chris Based on your issue tracker it looks like http2 support is definitely on the roadmap. Could you please detail your future plans in this area? – Chris What are some of the common “gotchas” for people who are just starting to use Tornado? 
– Tobias Picks Tobias Adventures of Riley Dayworld Trilogy by Philip José Farmer Chris Sense8 Habits of a Happy Brain Ethereum Ben The Memory Palace Newsblur Keep In Touch Mailing List Links Motor The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/16/2016 • 1 hour, 6 minutes, 27 seconds

Yves Hilpisch on Quantitative Finance

Visit our site to listen to past episodes, join our community Discourse, support the show, and sign up for our mailing list. Summary Yves Hilpisch is a founder of The Python Quants, a consultancy that offers services in the space of quantitative financial analysis. In addition, they have created open source libraries to help with that analysis. In this episode we spoke with him about what quantitative finance is, how Python is used in that domain, and what kinds of knowledge are necessary to do these kinds of analysis. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus to $4,000. We are recording today on December 30th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Yves Hilpisch about Quantitative Finance On Hired software engineers & designers can get 5+ interview requests in a week and each offer has salary and equity upfront. With full time and contract opportunities available, users can view the offers and accept or reject them before talking to any company. Work with over 2,500 companies from startups to large public companies hailing from 12 major tech hubs in North America and Europe. Hired is totally free for users and if you get a job you’ll get a $2,000 “thank you” bonus. If you use our special link to sign up, then that bonus will double to $4,000 when you accept a job. 
If you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus when they accept a job. Interview with Yves Hilpisch Introductions How did you get introduced to Python? – Chris Can you explain what Quantitative Finance is? – Tobias How common is it for Python to be used in an investment bank or hedge fund? – Tobias What factors contribute to the choice of whether or not to use Python in a Quantitative Finance role? – Tobias Are there any performance bottlenecks or other considerations inherent in using Python for quantitative finance? – Chris What kind of background is necessary for getting started in Quantitative Finance? – Tobias What kinds of libraries or algorithms in Python are useful for the day-to-day work of a quant? – Tobias Is Python actually used to enact the trades? What protocols, APIs, and libraries are used in this process? – Chris Could you please walk us through how a simple analysis using DXAnalytics might work? – Chris You work for a company called ‘The Python Quants’. What kinds of services do you provide and what kinds of organizations typically hire you? – Tobias Picks Tobias Kraken by China Miéville Heroes in Training series Olympians Graphic Novels Data Elixir Newsletter Chris Hill Farmstead – Edward Long Trail – Brush & Barrel Series – Culmination Chocolate Porter Long Trail – Spaaaaaace Juice Double IPA Flask-RESTLess Yves The Willpower Instinct The Way of the Seal Sapiens: A Brief History of Humankind Python High Performance Computing Keep In Touch Twitter Website Links Quandl Yahoo Finance Market Data Ravenpack DX Analytics DataPark.io Python for Finance Derivatives Analytics With Python Python Quants Conference Open Source for Quant Finance The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/8/2016 • 1 hour, 10 minutes, 30 seconds

Scott Sanderson on Algorithmic Trading

Visit our site to listen to past episodes, support the show, and sign up for our mailing list. Summary Because of its easy learning curve and broad extensibility Python has found its way into the realm of algorithmic trading at Quantopian. In this episode we spoke with Scott Sanderson about what algorithmic trading is, how it differs from high frequency trading, and how they leverage Python for empowering everyone to try their hand at it. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com We are recording today on December 16th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Scott Sanderson on Algorithmic Trading Interview with Scott Sanderson Introductions How did you get introduced to Python? – Chris Can you explain what algorithmic trading is and how it differs from high frequency trading? – Tobias What kinds of algorithms and libraries are commonly leveraged for algorithmic trading? – Tobias Quantopian aims to make algorithmic trading accessible to everyone. What do people need to know in order to get started? Is it necessary to have a background in mathematics or data analysis? – Tobias Does the Quantopian platform build in any safeguards to prevent users’ algorithms from spiraling out of control and creating or contributing to a market crash? – Chris How is Python used within Quantopian and when do you leverage other languages? – Tobias What PyPI packages does Quantopian leverage in its platform? 
– Chris How do the financial returns compare between algorithmic vs human trading on the stock market? – Tobias Can you speak about any trends you see in the trading algorithms people are creating for the Quantopian platform? – Chris Picks Tobias Kinetic Sand Trivium Thrift Books Chris Threes Jessica Jones Serial Scott Dota 2 Philosophical Investigations Logicomix Infinite Jest Keep In Touch Twitter Email GitHub Links QGrid SlickGrid Jupyter Hub Light Table CodeMirror Cython PyData NYC Talk by Scott Blaze Dask Theano TensorFlow Zipline Pyfolio PGContents SQLAlchemy Gevent quantopian.com/lectures The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

1/3/2016 • 1 hour, 27 minutes, 53 seconds

The PEP Talk

Visit our site to listen to past episodes, support the show, and sign up for our mailing list. Summary The Python language is built by and for its community. In order to add a new feature, change the specification, or create a new policy the first step is to submit a proposal for consideration. Those proposals are called PEPs, or Python Enhancement Proposals. In this episode we had the great pleasure of speaking with three of the people who act as stewards for this process to learn more about how it got started, how it works, and what impacts it has had. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com This episode is sponsored by Zato – Microservices, ESB, SOA, REST, API, and Cloud Integrations in Python. Visit zato.io to learn more about how to integrate smarter in the modern world. I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Searching for Pythonistas with Disabilities We are recording today on December 7th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing some of the PEP editors Interview with PEP editors Introductions How did you get introduced to Python? – Chris For anyone who isn’t familiar with them, can you explain what a PEP is and how they influence the Python language? – Tobias What are the requirements for a PEP to be considered for approval and what does the overall process look like to get it finalized? 
– Tobias How has the PEP process evolved to meet challenges posed by changes in the Python community? – Chris How many reviewers are there and how did each of you end up in that role? Is there a set number of editors that must be maintained and if so how did you arrive at that number? – Tobias What mistakes have other communities made when creating similar processes, and how has PEP learned from those mistakes? – Chris There are different categories for PEPs. Can you describe what those are and how you arrived at that ontology? – Tobias Is there any significance to the numbering system used for identifying different PEPs? – Tobias How does the PEP process maintain its sense of humor (e.g. PEP 20) while being sure to be taken seriously where it really counts? – Chris Along the lines of humorous PEPs, can you share the story of PEP 401? – Tobias How does the PEP process strive to prevent an undesirable level of control by any one company or other special interest group? – Chris How much control does Guido have over the PEP process? Has a PEP ever directly countered Guido’s wishes? How did it turn out? – Chris What is your favorite PEP and why? – Tobias Barry: PEP 20 Chris: PEP 479 David: PEP 20 What, in your opinion, has been the most important or far-reaching PEP, whether it was approved or not? – Tobias David: PEP 20 Chris: PEP 466 Barry: PEP 8 What was the strangest / most extreme PEP proposal you’ve ever seen? 
– Chris Chris: PEP 501 Barry: PEP 507 David: PEP 666 Picks Tobias Wagtail CMS Inside Out Spark Podcast Hymn for Atheists Chris Trumbo Kivy Crash Course Jihadology Podcast Barry Tox Nose2 Jessica Jones The Joy of Science Chris The Git Manpage Generator Daily MTG David Tim’s Vermeer Ready Player One The Aristocrats Scientific Songs of Praise Hollywood Babble On Keep In Touch Barry Blog Chris Blog GitHub David Website Blog Links Monty Python – All the Words Monty Python – On YouTube PEP 404 PEP 666 Raymond Hettinger PyCon 2015 PEP8 talk Python Dev Mailing List Python Ideas Mailing List Python Bug Mailing List The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/27/2015 • 1 hour, 45 minutes, 41 seconds

Eric Holscher on Documentation and Read The Docs

Visit our site to listen to past episodes, support the show, and sign up for our mailing list. Summary The first place we all go for learning about new libraries is the documentation. Lack of effective documentation can limit the adoption of an otherwise excellent project. In this episode we spoke with Eric Holscher, co-creator of Read The Docs, about why documentation is important and how we can all work to make it better. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $10 credit to try out their fast and reliable Linux virtual servers for your next project We are recording today on November 30th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Eric Holscher about Documentation Use the promo code podcastinit10 to get a $10 credit when you sign up! On Hired software engineers & designers can get 5+ interview requests in a week and each offer has salary and equity upfront. With full time and contract opportunities available, users can view the offers and accept or reject them before talking to any company. Work with over 2,500 companies from startups to large public companies hailing from 12 major tech hubs in North America and Europe. 
Hired is totally free for users and if you get a job you’ll get a $2,000 “thank you” bonus. If you use our special link to sign up, then that bonus will double to $4,000 when you accept a job. If you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus when they accept a job. Interview with Eric Holscher Introductions How did you get introduced to Python? – Chris You are one of the people behind the Read The Docs project. What was your inspiration for creating that platform and why is documentation so important in software? – Tobias What makes Read The Docs different from other static sources for documentation? – Chris The Python community seems to have a stronger focus on well-documented projects than some other languages. Do you have any theories as to why that is the case? – Tobias Can you outline the landscape of projects that leverage the documentation capabilities that are built in to the Python language? – Tobias Can you estimate the overall user base for Read The Docs? – Chris Do you have any advice around methods or approaches that can help developers create and maintain effective documentation? – Tobias Can you list some projects that you have found to provide the best documentation and what was remarkable about them? – Tobias Newcomers to open source are often encouraged to submit improvements to a project’s documentation as a way to get started and become involved with the community. Do you have any general advice on how to find and understand undocumented features? – Tobias Do you have any statistics on the languages represented among the projects that host their documentation with you? – Tobias What are some of the challenges you’ve faced and overcome in maintaining such a large repository of documentation from so many projects? – Chris How can our listeners contribute to the project? 
– Chris Picks Tobias The Man from Uncle Minute Physics Chris SigAvdi Black Flags: The Rise of ISIS Veritasium Eric Khao Soi Climate Change Gardening & healthy eating – Classic Keep In Touch Twitter @ericholscher @readthedocs @writethedocs Links Stripe docs Django Girls Tutorial Write The Docs Write The Docs Meetup Talk Write The Docs Slack Channel The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/20/2015 • 1 hour, 5 minutes, 33 seconds

Sylvain Thénault on ASTroid

Visit our site to listen to past episodes, support the show, and sign up for our mailing list. Summary The Python AST (Abstract Syntax Tree) is a powerful abstraction that allows for a number of innovative projects. ASTroid is a library that provides additional convenience methods to simplify working with the AST. In this episode we spoke with Sylvain Thénault from Logilab about his work on ASTroid and how it is used to power the popular PyLint static analysis tool. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $10 credit to try out their fast and reliable Linux virtual servers for your next project We are recording today on November 23rd, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Sylvain Thénault about ASTroid On Hired software engineers & designers can get 5+ interview requests in a week and each offer has salary and equity upfront. With full time and contract opportunities available, users can view the offers and accept or reject them before talking to any company. Work with over 2,500 companies from startups to large public companies hailing from 12 major tech hubs in North America and Europe. Hired is totally free for users and if you get a job you’ll get a $2,000 “thank you” bonus. 
If you use our special link to sign up, then that bonus will double to $4,000 when you accept a job. If you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus when they accept a job. Use the promo code podcastinit10 to get a $10 credit when you sign up! Interview with Sylvain Thénault Introductions How did you get introduced to Python? – Chris Can you explain what an Abstract Syntax Tree is and why it is a useful language feature? – Tobias What was your inspiration for creating ASTroid? – Chris What features does ASTroid offer over Python’s standard AST package, and what makes those features important? – Chris I know that the ASTroid package is used in Pylint which is also maintained by Logilab. How does the AST facilitate static analysis of Python projects and are there cases where you have to fall back to text parsing? – Tobias Beyond static analysis, what are some of the other possible uses for the Python AST? – Tobias The documentation for the AST package in Python mentions that the specific syntax objects in the tree are subject to change between releases. Does the ASTroid package provide any abstractions to maintain a consistent API between versions or does it just provide a pass-through? – Tobias Have you encountered any challenges in testing ASTroid given that it operates at such a low level in the language? – Chris Do you have trouble attracting contributors given the great understanding of Python’s inner workings required? – Chris Does the implementation or representation of the AST differ between different distributions of Python such as CPython, PyPy and Jython? – Tobias What are some of the most interesting applications ASTroid has been used in? 
– Chris Picks Tobias Pre-Commit Existential Comics htmlPy Chris Pretty Things – Fluffy White Rabbits Fallout 4 Sylvain PyReverse CubicWeb Keep In Touch Code Quality Mailing List PyLint Dev Mailing List Twitter @sythenault @logilab Logilab Links Visitor pattern Pylint The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/12/2015 • 47 minutes, 28 seconds

Stuart Mumford on SunPy

Visit our site to listen to past episodes, support the show, and sign up for our mailing list. Summary What is Solar Physics? How does it differ from AstroPhysics? What does this all have to do with Python? In this episode we answer all of those questions when we interview Stuart Mumford about his work on SunPy. So put on your sunglasses and learn about how to use Python to decipher the secrets of our closest star. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $10 credit to try out their fast and reliable Linux virtual servers for your next project We are recording today on November 17th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Stuart Mumford about SunPy Use the promo code podcastinit10 to get a $10 credit when you sign up! On Hired software engineers & designers can get 5+ interview requests in a week and each offer has salary and equity upfront. With full time and contract opportunities available, users can view the offers and accept or reject them before talking to any company. Work with over 2,500 companies from startups to large public companies hailing from 12 major tech hubs in North America and Europe. Hired is totally free for users and if you get a job you’ll get a $2,000 “thank you” bonus. 
If you use our special link to sign up, then that bonus will double to $4,000 when you accept a job. If you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus when they accept a job. Interview with Stuart Mumford Introductions How did you get introduced to Python? – Chris Can you explain what the research and applications of solar physics are and how SunPy facilitates those activities? – Tobias What was your inspiration for the SunPy project and what are you using it for in your research? – Tobias Can you tell us what SunPy’s map and light curve classes are and how they might be used? – Chris Are there any considerations that you need to be aware of when writing software libraries for practitioners of the hard sciences that would be different if the target audience were software engineers? – Tobias Can SunPy consume data directly from telescopes and other observational apparatus? – Chris I noticed on the project site that SunPy leverages AstroPy internally. Can you describe the relationship between the two projects and why someone might want to use SunPy in place of or in addition to AstroPy? – Tobias Looking at the documentation I got the impression that there is a fair amount of visual representation of data for analysis. Can you describe some of the challenges that has posed? Is there integrated support for project Jupyter and are there other graphical environments that SunPy supports? – Tobias What are some of the most interesting applications that SunPy has been used for? – Chris Picks Tobias Elm Avro Common Sense Media Chris Massdrop 21st Amendment Fireside Chat Extra Creditz Stuart Live ISS Stream with space-to-ground radio Live ISS HD video stream 24/7 yt Calf Studio – Live Audio Processing Keep In Touch Twitter(@sunpyproject) SunPy.org GitHub IRC The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

12/4/2015 • 40 minutes, 38 seconds

Maneesha Sane on Software and Data Carpentry

Visit our site to listen to past episodes, support the show, and sign up for our mailing list. Summary The Software and Data Carpentry organizations have a mission of making it easier for scientists and data analysts in academia to replicate and review each other’s work. In order to achieve this goal they conduct training and workshops that teach modern best practices in software and data engineering, including version control and proper data management. In this episode we had the opportunity to speak with Maneesha Sane, the program coordinator for both organizations, so that we could learn more about how these projects are related and how they approach their mission. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com This episode is sponsored by Zato – Microservices, ESB, SOA, REST, API, and Cloud Integrations in Python. Visit zato.io to learn more about how to integrate smarter in the modern world. I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $10 credit to try out their fast and reliable Linux virtual servers for your next project We are recording today on November 10th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Maneesha Sane about Software Carpentry and Data Carpentry Interview with Maneesha Sane Introductions How did you get introduced to Python? 
Can you explain what the Software and Data Carpentry organizations are and what their respective goals are? What is the history of these organizations and how are they related? What does a typical Software Carpentry or Data Carpentry workshop look like? What is the background of your instructors? Can you explain why Python was chosen as the language for your workshops and why it is such a good language to use for teaching proper software engineering practices to scientists? In what ways do the lessons taught by both groups differ and what parts are common between the two organizations? What are some of the most important tools and lessons that you teach to scientists in academia? Do you tend to focus mostly on procedural development or do you also teach object oriented programming in Software Carpentry? What is the target audience for Data Carpentry and what are some of the most important lessons and tools taught to them? Do you teach any particular method of pre-coding design like flowcharting, pseudocode, or top down decomposition in software carpentry? What scientific domains are most commonly represented among your workshop participants for Software Carpentry? What are some specific things the Python community and the Python core team could do to make it easier to adopt for your students? What are the most common concepts students have trouble with in software & data carpentry? How can our audience help support the goals of these organizations? Picks Tobias Vivaldi Browser vyte.in Pocket Casts Chris Chiptunes = Win ESM – Electronic Study Music Supergalactic Expansive Maneesha QPython New Boston Lunar Baboon Keep In Touch Twitter @swcarpentry @datacarpentry @maneeshasane Blog Software Carpentry Data Carpentry Links NumFocus Software Carpentry GitHub – Training Courses Instructor Training Discussion Mailing List The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/25/2015 • 44 minutes, 28 seconds

Erik Tollerud on AstroPy

Visit our site to listen to past episodes, support the show, and subscribe to our mailing list. Summary Erik Tollerud is an astronomer with a background in software engineering. He leverages these backgrounds to help build and maintain the AstroPy framework and its associated modules. AstroPy is a set of Python libraries that provide useful mechanisms for astronomers and astrophysicists to perform analyses on the data that they receive from observational equipment such as the mountain observatory that Erik was preparing to visit when we talked to him about his work. If you like Python and space then you should definitely give this episode a listen! Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $10 credit to try out their fast and reliable Linux virtual servers for your next project We are recording today on November 2nd, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Erik Tollerud about AstroPy On Hired software engineers & designers can get 5+ interview requests in a week and each offer has salary and equity upfront. With full time and contract opportunities available, users can view the offers and accept or reject them before talking to any company. 
Work with over 2,500 companies from startups to large public companies hailing from 12 major tech hubs in North America and Europe. Hired is totally free for users and If you get a job you’ll get a $2,000 “thank you” bonus. If you use our special link to signup, then that bonus will double to $4,000 when you accept a job. If you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus when they accept a job. Use the promo code podcastinit10 to get a $10 credit when you sign up! Interview with Erik Tollerud Introductions How did you get introduced to Python? What was the inspiration to create AstroPy and what kinds of astronomical research can it be used for? Can you tell us what AstroPy’s modeling functions are and give us examples of where they might be used? Are there any considerations that you need to be aware of when writing software libraries for practitioners of the hard sciences that would be different if the target audience were software engineers? What are some of the most interesting applications that AstroPy has been used for? Are there open data sets that are available for people outside of academia to do analysis of astronomical data using AstroPy? Have there been any useful discoveries made in this way? Could you please tell us about AstroPy’s Virtual Observatory capabilities? What are some interesting use cases for AstroPy’s Cosmological calculations? Are there other libraries available that provide similar capabilities, perhaps in other languages? What makes AstroPy unique among them? Can AstroPy consume data directly from telescopes and other observational apparatus? The amount of data generated from observing astronomical phenomena must be immense. What are some of the tools used to manage that data and how does AstroPy interface with them? How might AstroPy be used to prove or disprove the cold dark matter hypothesis? 
What are some of the architectural choices that have been made to allow for the AstroPy library to serve as the core for a number of other add-ons? Does AstroPy provide a common data format to allow for easy interoperability between the various addons? I noticed that AstroPy adheres to the PSF code of conduct, as well as having adopted an enhancement proposal process modelled after PEPs. Can you explain why that is important and what kind of an impact it has had on the community around AstroPy? Picks Tobias Citizen Ex piprot Open Culture Chris The Allusionist Criminal Hardcore History Erik HubbleSite Great Courses – History of the Ancient World Keep In Touch astropy.org AstroPy User Mailing List AstroPy Dev Mailing List Links tutorials.astropy.org AstroQuery Cython The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/20/2015 • 49 minutes, 18 seconds

Dariusz Suchojad on Zato

Visit our site to listen to past episodes, support the show, and sign up for our mailing list. Summary Service integration platforms have traditionally been the realm of Java projects. Zato is a project that shows Python is a great choice for systems integration due to its flexibility and wealth of useful libraries. In this episode we had the opportunity to speak with Dariusz Suchojad, the creator of Zato about why he decided to make it and what makes it interesting. Listen to the episode and then take it for a spin. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email, leave us a message on Google+, or leave a comment on our show notes I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Linode is also sponsoring us this week. Check them out at linode.com/podcastinit and get a $10 credit to try out their fast and reliable Linux virtual servers for your next project. We are recording today on October 27th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Dariusz Suchojad about Zato Interview with Dariusz Suchojad Introductions How did you get introduced to Python? Can you explain what Zato is and what motivated you to create it? What makes Zato stand out from other service bus implementations? What are some signs that someone should consider incorporating Zato into their software architecture? Does zato perform well in restricted resource environments like ec2? 
What performance bottlenecks are common when using zato? It seems that most other ESB projects are written in Java. What advantages does Python have over Java for this kind of project and in what ways is it inferior? The architectural nature of ESBs is such that they form the central backbone of a software system. How have you been able to ensure an appropriate level of reliability and stability in Zato while still delivering new features and improvements? What are the scalability and high availability characteristics of Zato? Does zato run well using pypy? For anyone wanting to use Zato, what are the infrastructure requirements for deployment? What are some of the security ramifications you took into account in zato’s design? What are some of the most novel uses for Zato that you have seen or heard about? Picks Tobias SPY Eric Royer’s One Man Band pip-tools Chris Rational Security New Rustacean Podcast Johan Goes to Mexico Dariusz Sublime Text Editor Keep In Touch zato.io Twitter Github The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/13/2015 • 42 minutes, 26 seconds

Tom Rothamel on Ren’Py

Visit our site to listen to past episodes, support the show, and sign up for our mailing list. Summary Tom Rothamel is an embedded systems engineer who spends his free time working on Ren’Py, a visual novel engine written in Python. Ren’Py allows you to write interactive fiction experiences and deploy them across desktop and mobile platforms. By creating a purpose-built DSL for describing the interactions, users of Ren’Py can focus on crafting polished experiences without fighting through the vagaries of programming languages, while still providing access to the internals when necessary. Listen to our interview with Tom to learn more about this long-running project and what makes it so interesting. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Linode is also sponsoring us this week. Check them out at linode.com/podcastinit and get a $10 credit to try out their fast and reliable Linux virtual servers for your next project. We are recording today on October 19th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Tom Rothamel about RenPy On Hired software engineers & designers can get 5+ interview requests in a week and each offer has salary and equity upfront. With full time and contract opportunities available, users can view the offers and accept or reject them before talking to any company. 
Work with over 2,500 companies from startups to large public companies hailing from 12 major tech hubs in North America and Europe. Hired is totally free for users and If you get a job you’ll get a $2,000 “thank you” bonus. If you use our special link to signup, then that bonus will double to $4,000 when you accept a job. If you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus when they accept a job. Interview with Tom Rothamel Introductions How did you get introduced to Python? What is Ren’Py and what was your inspiration for starting it? I noticed that Ren’Py supports a number of different styles of gameplay. Can you explain the differences between interactive fiction, kinetic fiction and RPGs? I notice that RenPy has clearly been around a while (Some of the games for OSX are PowerPC binaries!) – what problems have you encountered maintaining such a long lived project and keeping it current? What libraries does Ren’Py leverage and how did you go about selecting them to allow for cross-platform development and deployment? What underlying Python graphics toolkit does RenPy use for display, and how did that choice affect RenPy’s design? While reading through the quickstart in the documentation I noticed that there is a special syntax that you have created for defining the dialog and narratives. Can you explain how you created the DSL for building the storylines? It feels to me like RenPy was heavily inspired by the JRPG genre and as such there are games where sex plays a prominent role (I noticed a mention of Hentai in the docs), which is less readily accepted in the west. Have you ever encountered any pushback on this issue? I noticed that some of the games that were created with Ren’Py are available on the Steam platform. What elements of the Ren’Py project lend themselves to producing games with enough polish to be published on such a mainstream platform? 
If you were just starting out today implementing RenPy, would you still use Python? Why? Picks Tobias DJ Logic git-extras Radon Chris Narcos The Rust Programming Language Kent Falls Brewing Shower Beer Tom Cython NPR One The Seinfeld Method Keep In Touch renpy.org Twitter Links Long Live The Queen Moonlight Walks The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

11/6/2015 • 58 minutes, 52 seconds

Anthony Scopatz on Xonsh

Visit our site to listen to past episodes, support the show, and sign up for our mailing list. Summary Anthony Scopatz is the creator of the Python shell Xonsh in addition to his work as a professor of nuclear physics. In this episode we talked to him about why he created Xonsh, how it works, and what his goals are for the project. It is definitely worth trying out Xonsh as it greatly simplifies the day-to-day use of your terminal environment by adding easily accessible python interoperability. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Linode is also sponsoring us this week. Check them out at linode.com/podcastinit and get a $10 credit to try out their fast and reliable Linux virtual servers for your next project We are recording today on October 12th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Anthony Scopatz about Xonsh On Hired software engineers & designers can get 5+ interview requests in a week and each offer has salary and equity upfront. With full time and contract opportunities available, users can view the offers and accept or reject them before talking to any company. Work with over 2,500 companies from startups to large public companies hailing from 12 major tech hubs in North America and Europe. 
Hired is totally free for users and If you get a job you’ll get a $2,000 “thank you” bonus. If you use our special link to signup, then that bonus will double to $4,000 when you accept a job. If you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus when they accept a job. Use the promo code podcastinit10 to get a $10 credit when you sign up! Interview with Anthony Scopatz Introductions How did you get introduced to Python? Can you explain what Xonsh is and your motivation for creating it? For people transitioning to Xonsh from a shell like Bash or Zsh, what are some of the biggest differences that they will see? What are some really powerful one-liners that showcase Xonsh’s capabilities? What is it about Python that lends itself to this kind of a project and what are your thoughts on building something like Xonsh in another language such as Ruby or Node.js? If you had to single out one killer feature that Xonsh brings to the table, what would that be? Is it possible to specify which shell, such as bash or zsh, gets used in subprocess mode? I started using the Xonsh shell as my daily terminal recently and have been enjoying it so far. One of the things that I have been wondering is how to hook into the completion system to provide eldoc style completion from parsing the output of help flags. Do you have any advice on where to start? Perhaps using the docopt library to handle parsing of help output and generate completions from that? What are your thoughts on adding a section to the project documentation for people to list various extension modules that people can take advantage of? Or perhaps creating something along the lines of Oh my Xonsh? How do bash function definitions interoperate with the Xonsh environment and functions defined in Python? 
It seems as though there could be some potential path or compatibility issues when moving between virtual environments and having access to extension modules loaded into Xonsh. Can you shed some light on that? Do you have any suggestions for people who may not have the privileges to set their own login shell but who want to try Xonsh? What are some of the most interesting uses of Xonsh that you have seen? What does the future hold for the Xonsh project and how can our audience help? Picks Tobias Mortdecai Alembic SQLAlchemy population.io Chris Consider Phlebas The Martian – Movie Fantastic Planet Anthony The Worst Journey In The World Keep In Touch Mailing List xonsh.org #xonsh on OFTC GitHub Twitter: @scopatz Links Effective Computation in Physics Python Prompt Toolkit The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/31/2015 • 57 minutes, 53 seconds

Kay Hayen on Nuitka

Visit our site to listen to past episodes, support the show, and sign up for our mailing list. Summary Kay Hayen is a systems engineer from Germany who has dedicated his spare time to the creation of Nuitka, a library that will compile your Python project to C++. In this episode we talked to Kay about what inspired him to create the project, how it operates, and some of the challenges he has faced. It is a very interesting project and it has the potential to let you run your Python code in a whole new way! Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email, leave us a message on Google+, or leave a comment on our show notes I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at pythonpodcast.com I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. Linode has also sponsored this episode and you can get a $10 credit using the link linode.com/podcastinit to try out their fast and reliable linux virtual servers. We are recording today on October 6th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Kay Hayen about the Nuitka project On Hired software engineers & designers can get 5+ interview requests in a week and each offer has salary and equity upfront. With full time and contract opportunities available, users can view the offers and accept or reject them before talking to any company. Work with over 2,500 companies from startups to large public companies hailing from 12 major tech hubs in North America and Europe. 
Hired is totally free for users and If you get a job you’ll get a $2,000 “thank you” bonus. If you use our special link to signup, then that bonus will double to $4,000 when you accept a job. If you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus when they accept a job. Use the promo code podcastinit10 to get a $10 credit when you sign up! Interview with Kay Hayen Introductions German, family with 2 kids, one cat Working in ATM (Air Traffic Management), tracker product Systems Engineer Nuitka as a hobbyist How did you get introduced to Python? Once was Perl “Guru”. Python was getting a lot of positive press Team decision to want to use readable stuff CPAN was still more complete, but Python was making inroads Can you describe how to pronounce the name of your project? Wife Anna, Russian, Annuitka -> Nuitka Can you briefly describe what Nuitka is and what your motivation was for creating it? I was thinking a fully integrated and compatible compiler should be possible. Why is nobody doing it? I can do it. I am doing it. Take Python beyond current use cases. Everybody currently using Python needs no compiler, or wouldn’t use it Less need for time consuming C++/Python hybrid coding Simple code should compile to fast code by default Complex code should still work On the project web site it says that Nuitka does a lot of clever things after being fed a Python project. Can you provide some details as to what some of that cleverness is? Re-formulations of Python into simpler Python No “class” No “assert” No complex assignments SSA tracing Attaching uses to assignments properly Despite try/finally Loops Avoids checks for known defined/undefined values Function inlining (coming) Constant propagation Closure variable removal What is libpython and how is it used in both Nuitka and CPython? 
Core of the Python interpreter With Python VM and C interface Nuitka can fall back to it Avoiding it as often as we can, key to performance Is there any way to provide hints to Nuitka to generate more optimized output? Nuitka is yet to make a difference based on type information Not yet there, but coming soonish. SSA was pre-requisite PEP 484 will be unreliable type information, mostly useless I want type hints that are checked at Python run time What are some of the biggest challenges in generating statically compiled code from a language as dynamic as Python? Python is compiled to .pyc files Compatible Frame stack, cached Exception handling of Python is terrible CPython type system designed to be extensible Extension types for functions, bound/unbound methods, generators, etc. Many details to get right Are there any particular Python constructs that Nuitka is unable to translate and as a corollary to that is the compilation step lossy at all or do you have some way of ensuring that the functionality of the program remains unaltered? Big point, no price attached Except for not having bytecode, there is nothing missing No pdb support Edit / run cycle is not accelerated That said: PyQt (integrated), PySide (available, unmerged), wxPython (available, maybe merged) needed patches to take compiled function/method objects for function objects too Are there any particular types of programs that benefit the most from Nuitka’s compilation? Bindings with ctypes or cffi compile into zero overhead C calls (planned) Scientific programs are the most obvious goal (float type inference) CPU bound or low latency programs Is it possible to feed an entire project with multiple modules into Nuitka all at once or is the standard use to perform compilation one source file or submodule at a time? 
You give it the main program and it recurses imports according to “PYTHONPATH” nuitka --recurse-all “/usr/bin/hg” supposed to work Might have to give directories with program plug-ins I’m curious about what led you to choose compilation to C++ for Nuitka rather than making Nuitka an LLVM back end like Numba? When I started Nuitka, I was using C++0x and variadic templates Wanted to make a proof of concept that compatibility and integration is feasible From there, code generation got less high level to goto ridden C How does Nuitka compare to projects like Numba or Cython? Graceful degradation goal Complete compatibility with Python whole stack How does Nuitka compare to PyPy? – Kay PyPy is the coolest project ever Pure Python goals shared How can users evaluate the performance of Nuitka – Kay They currently cannot Developing a tool to compare CPython and Nuitka runs Based on vmprof from PyPy people Identify parts of program where Nuitka is slower Links to source code To be done, help needed. Nuitka is only starting to get to serious performance Compatibility is such a high bar to take C++ to C took a year (avoiding C++ exceptions) SSA literally took forever Picks Tobias Forbidden Island Forbidden Desert Otto Project Chris Grimm Super Symmetry Are You Listening To?: Boston Ripple Kay Learn being skeptic, Atheist Experience MicroPython Keep In Touch Nuitka Homepage Google+ Email The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/24/2015 • 1 hour, 34 minutes, 35 seconds

Trent Nelson on PyParallel

Visit our site to listen to past episodes, support the show, and sign up for our mailing list. Summary Trent Nelson is a software engineer working with Continuum Analytics and a core contributor to CPython. He started experimenting with a way to sidestep the restrictions of the Global Interpreter Lock without discarding its benefits and that has become the PyParallel project. We had the privilege of discussing the details around this innovative experiment with Trent and learning more about the challenges he has experienced, what motivated him to start the project, and what it can offer to the community. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. We are recording today on September 7th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Trent Nelson about PyParallel On Hired software engineers & designers can get 5+ interview requests in a week and each offer has salary and equity upfront. With full time and contract opportunities available, users can view the offers and accept or reject them before talking to any company. Work with over 2,500 companies from startups to large public companies hailing from 12 major tech hubs in North America and Europe. Hired is totally free for users and If you get a job you’ll get a $2,000 “thank you” bonus. 
If you use our special link to signup, then that bonus will double to $4,000 when you accept a job. If you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus when they accept a job. Interview with Trent Nelson Introductions How did you get introduced to Python? For our listeners who may not be aware, can you give us an overview of what Pyparallel is and what makes it different from other Python implementations? How did PyParallel come about? What are some of the biggest technical hurdles that you have been faced with during your work on PyParallel? I understand that PyParallel currently only works on Windows. What was the motivation for that and what would be required for enabling PyParallel to run on a Linux or BSD style operating system? How does Pyparallel get around the limitations of the global interpreter lock without removing it? Is there any special syntax required to take advantage of the parallelism offered by PyParallel? How does it interact with the threading module in the standard library? In the abstract for the Pyparallel paper, you cite a simple rule – “Don’t persist parallel objects” – how easy is this to do with currently available concurrency paradigms and APIs, and would it make sense to add such support? For instance, how would one be sure to follow this rule when using Twisted or asyncio? Are there any operations that are not supported in parallel threads? What drove the decision to fork Python 3.3 as opposed to the 2.X series? In the documentation you mention that the long term goal for PyParallel is to merge it back into Python mainline, possibly within 5 years. Has anything changed with that goal or timeline? What milestones do you need to hit before that becomes a realistic possibility? Can you compare PyParallel to PyPy-STM and Go with Goroutines in terms of performance and user implementation? What are some particular problem areas that you are looking for help with? 
Assuming that it does get merged in as Python 4, how do you think that would affect the features and experiments that went into Python 5? To be continued… Picks Tobias Testinfra Software Engineering Daily Chris Hello Webapp – Intermediate Concepts Grimm Rainbow Dome PBS Idea Channel Trent Show Stopper by G. Pascal Zachary Keep In Touch GitHub Twitter @PyParallel @TrentNelson

10/14/2015 • 1 hour, 12 minutes, 43 seconds

Dag Brattli on RxPy

Visit our site to listen to past episodes, support the show, and sign up for our newsletter! Summary Dag Brattli is an engineer with Microsoft and in his spare time he ported the Reactive Extensions framework to Python in the form of the RxPy library. In this episode we had the opportunity to speak with Dag and learn more about what ReactiveX is, why it is useful and how you can use it in your Python programs. It is definitely a very powerful programming pattern when manipulating data streams, which is becoming increasingly common in modern software architectures. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.__init__. Use the link hired.com/podcastinit to double your signing bonus. We are recording today on October 2nd, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Dag Brattli about the RxPy project On Hired software engineers & designers can get 5+ interview requests in a week and each offer has salary and equity upfront. With full time and contract opportunities available, users can view the offers and accept or reject them before talking to any company. Work with over 2,500 companies from startups to large public companies hailing from 12 major tech hubs in North America and Europe. Hired is totally free for users and If you get a job you’ll get a $2,000 “thank you” bonus. If you use our special link to signup, then that bonus will double to $4,000 when you accept a job. 
If you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus when they accept a job. Interview with Dag Brattli Introductions How did you get introduced to Python? For our listeners who haven’t heard of it before, can you describe what RxPy is and why someone might want to use it? What problem domains are best suited for using the Reactive X approach? What is involved in integrating RxPy into an existing code base? When should we use RxPy over asyncio or asynchronous workers like Celery? What resources or tutorials do you recommend people use when trying to understand how and when to use the Reactive X tools? What in particular about Python lends itself to the ReactiveX pattern, and what features of the language does RxPy leverage in particular in its implementation? In what ways does the Python implementation of the Reactive X framework differ from those of other languages? The project description references the use of LINQ for querying the various data streams that RxPy enables consumption of. I had always heard of LINQ in the context of traditional database queries. What makes LINQ a good choice for stream processing? I mostly hear about ReactiveX in terms of UI design, but the project description seemed to indicate it was much more generally useful. What are some of the less common and more interesting problems that RxPy lends itself to solving? Picks Tobias icdiff Timeline card game Griatch’s Digital Art Chris elpy sshuttle Chimay Grand Reserve Dag ASTor How To Bake Pi – A book about the mathematics of mathematics Keep In Touch GitHub Links Main ReactiveX Site rxjava site for documentation rxmarbles MSDN Channel 9 Function Overloading in Python 3 The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

10/9/2015 • 33 minutes, 1 second

uWSGI Core Developers

Visit our site to listen to past episodes, join the mailing list and support the show. Summary uWSGI is one of the most versatile application servers available. It was originally written for running Python applications and has since gained functionality to support Perl, Ruby, PHP, and more in addition to the incredible feature set. In this episode Tobias got to interview three of the core developers of this project and find out more about how the different pieces of it fit together and what its future holds. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at I would also like to thank Hired, a job marketplace for developers, for sponsoring this episode of Podcast.init. Sign up at hired.com/podcastinit to double your signing bonus. We are recording today on September 22nd, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing the core developers of uWSGI (Adriano Di Luzio, Riccardo Magliocchetti, and Roberto De Ioris) Interview with uWSGI core developers Introductions How did you get introduced to Python? For anyone who hasn’t come across the project before, can you explain what uWSGI is and what makes it unique? How did you architect uWSGI in order to allow for supporting so many different languages? The feature set of uWSGI is truly incredible. Does this make the code complicated to understand and modify? Can you describe some of your favorite features in uWSGI? What have you found to be the most overlooked or underutilized features of uWSGI? 
Can you briefly describe how Emperor mode works and how that can be used to handle routing between microservices? Could you discuss some of the particular features UWSGI provides around load balancing? Is connection draining supported? Can nodes be dynamically added and removed from the pool or does the config need to be rewritten and UWSGI restarted? The configuration syntax looks like it provides a very rich set of capabilities. Is it based on a general purpose programming language or is it a DSL? What might be some common use cases for using UWSGI in tandem with another web server like NGINX? I have read that WSGI does not get along with http/2. Are there any plans to look towards supporting that protocol in some way? What new capabilities can we look forward to in the future of uWSGI? Picks Tobias Manjaro Linux Kontact Blackhat Riccardo Building Microservices book Django-Denis Adriano Paxos Algorithm Roberto The Brink Keep In Touch Mailing List #uWSGI on IRC GitHub latest docs Roberto Twitter GitHub Adriano GitHub Twitter Riccardo GitHub Twitter

10/3/2015 • 34 minutes, 59 seconds

Griatch on Evennia (Making MUDs with Python)

Visit our site to listen to past episodes, sign up for our mailing list and support the show. Summary Griatch is an incredibly talented digital artist, professional astronomer and the maintainer of the Evennia project for creating MUDs in Python. We got the opportunity to speak with him about what MUDs are, why they’re interesting and how Evennia simplifies the process of creating and extending them. If you’re interested in building your own virtual worlds, this episode is a great place to start. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at We are recording today on September 15th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Griatch about the Evennia project Interview with Griatch Introductions How did you get introduced to Python? Can you explain what MUDs are and what that has to do with Evennia? What is it about MUDs that keeps them interesting long after the technical restrictions that led to their creation are no longer present, especially in light of 3D multiplayer games like WoW and EVE Online? Can you give us a rundown of the various parts of Evennia (MUD engine, web interface, etc.) and how they fit together? How does Evennia handle the fact that a MUD world is comprised of many hundreds of objects containing various properties, maintaining consistent, persistent state as players interact with them? What concurrency tools or paradigms does Evennia use? 
During the height of MUDs’ popularity, one highly sought after feature was the idea of being able to have players travel from one MUD instance to another. Would it be possible to implement this in Evennia? Has the Evennia core team given any thought to adding features to support a richer client interface? Graphical maps or the like? How difficult would it be to use Evennia to interface with something like Slack or Hipchat for a company-wide MUD? Have you ever heard of someone doing something like that? Are there any fully fledged running MUDs built with Evennia out in the wild? Picks Tobias libraries.io jsonapi.org Marshmallow Marshalling Library Chris The End of All Things David’s Tea Steeper Hello Webapp – Intermediate Concepts Griatch F2Py Designing Virtual Worlds Imaginary Realities Optional Realities Keep In Touch Evennia Website Evennia Github Freenode IRC Channel #Evennia Links roll20

9/29/2015 • 1 hour, 14 minutes, 3 seconds

Hylang Core Developers

Visit our site to listen to past episodes, support the show, and sign up for our mailing list Summary We got the chance to talk to some of the core developers of Hylang, which is a Lisp dialect that runs on the Python VM! We talked about how it got started, how it works and why you should try it. Of particular interest is our discussion about using Hylang to backport language features, or create entirely new ones due to the power of Lisp and the Python AST (Abstract Syntax Tree). If you need to level up your Lisp knowledge, they gave us a great list of references to help out. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at We are recording today on August 27, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Paul Tagliamonte, Tuukka Turto, and Morten Linderud Interview with Hylang Developers Introductions How did you get introduced to Python? Before we get too far along can you explain what Hy is? What inspired you to create Hy? What do you recommend as reference material for Python developers to gain familiarity with idiomatic Lisp? What are some of the problem domains where implementation becomes easier or more elegant as a result of Hy’s LISP syntax? Given the ability to create powerful macros in Lisp, could Hy be used as a way of prototyping or backporting new language features in Python? What are some of the most challenging and interesting problems you encountered bringing an alternate syntax to the Python runtime? 
While playing around with the Hy REPL I noticed that it does visual matching of parentheses when closing an expression. What other niceties have been included in the REPL? What are your thoughts on adding autocompletion to the REPL as a way of encouraging discovery and exploration of the Hy language? Which LISP variant is Hy most similar to, and why? How does garbage collection work in Hy, and why? How hard would it be to port existing LISP packages to Hy like MACSYMA or CLOS? What kind of overhead in terms of runtime performance and memory usage does Hy impose? Has this been a challenge in Hy’s development? What are some of the most innovative uses for Hy that you have seen or created? What does the future hold for Hy? I noticed that there are a large number of core contributors to Hylang and I’m curious how you determine what features to work on? Picks Tobias Displacy The Golem and the Jinni by Helene Wecker – Read it on Scribd Safari Online Chris Dash and Zeal Reasonably sound (podcast) PBS Idea Channel (Youtube) Paul Reproducible Build Project Model View Culture Tuukka SICP Lecture F# ReactiveX 1 Game Per Month (#1GAM) Morten Hackers Mr. Robot Keep In Touch Paul Twitter paultag on IRC Website Tuukka Twitter Morten Twitter Links Core features of Hylang Adderall – miniKanren in Hylang Books Joy of Clojure Let over Lambda Land of Lisp Clojure programming Herculeum – Tuukka’s DSL for roguelikes Pixie – Lisp in RPython Dogelang BPython Github trending repos with Hylang Pineal hydiomatic – Algernon

9/19/2015 • 55 minutes, 48 seconds

Bryan Van de Ven on Bokeh

Visit our site to listen to past episodes, subscribe to our mailing list, and donate to the show. Summary Bryan Van de Ven is the project maintainer for Bokeh, a plotting and visualization toolkit that allows Python developers to easily create attractive interactive visualizations for the web. We talked about the project’s history, some interesting use cases for it, and what its near future looks like. Bryan also told us about how Bokeh compares to some of the other visualization libraries in both Python and Javascript, as well as how to use Bokeh from other languages such as Scala and Lua. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at There is a new Python podcast that just started up recently! It’s called the Python Test Podcast and covers the world of testing in Python, so go ahead and give it a listen. You can find it at We are recording today on Aug 18th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Bryan Van de Ven about the Bokeh project Interview with Bryan Van de Ven Introductions How did you get introduced to Python? For our listeners who aren’t familiar with what Bokeh is, can you describe it? What inspired you to create Bokeh? Bokeh has integrations with some of the other Python graphing libraries such as matplotlib and seaborn. I can see how this would be useful to easily update existing code to publish visualizations on the web. Are there other use cases for these integrations? I noticed that Bokeh has bindings for some languages other than Python. 
R and Julia are obvious candidates due to their strong focus on analytics work, I’m curious what made you choose Scala and Lua as languages worth targeting? Do you lose any capabilities using the javascript library by itself? Other than the sample data sets that come with Bokeh, can you suggest a good publicly available data set with accompanying tutorial for people who want to get started with data visualization using Bokeh? Can you provide some comparisons between D3.js and the Bokeh javascript library in terms of capabilities and performance? The Bokeh project has a server component that allows for streaming data to clients. Can you describe the architecture of that and some example uses for it? Why was the server written as a Flask blueprint as opposed to making it a component of another framework such as Django or Pyramid and how difficult would it be to port the functionality to another system? What’s the most interesting use of Bokeh you’ve seen? Are you aware of any projects in other languages that are comparable to Bokeh? Picks Tobias wappalyzer The Graveyard Book by Neil Gaiman Chris Edward Snowden Meets the IETF Between the World and Me Untapp’d Bryan Audiobooks Scribd – Subscription service for ebooks and audio books with a great selection Try Audible and Get Two Free Audiobooks Cartographies of Time The Post-Modern Jukebox Keep In Touch Twitter Mailing List Bokeh Web Site Links vispy Vincent vega D3.js nbviewer.org bokeh page million song dataset data.gov ggplot / ggvis mathematica

9/8/2015 • 57 minutes, 18 seconds

Jessica McKellar

Visit our site to listen to past episodes, support the show and sign up for our mailing list. Summary We got the chance to talk to Jessica McKellar about her work in the Python community. She told us about her experience as a director for the PSF, working as the diversity outreach manager for PyCon, and being a champion for improving the on-boarding experience for new users of Python. We also discussed perceptions around the performance of Python and some of the work being done to improve concurrency, as well as her work with OpenHatch. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at We are recording today on Aug 12, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Jessica McKellar Interview with Jessica McKellar Introductions How did you get introduced to Python? Attended MIT, originally for Chemistry Had friends pursuing CS degrees Toolset and skills seemed worth investigating Led to BA and MS MIT was in transition from LISP to Python Can you describe what your responsibilities are as a director of the PSF? A lot of outreach and investment in the community Do you think the PSF does a good job of making people aware of what it is, what it does for the community, and how they can help? 
Struggled with this historically but has gotten better in recent years Website re-design has helped A large focus of your work in the community has been around improving the experience of users who are new to Python and programming in general and I noticed that you just received the Frank Willison Memorial Award for your contributions to outreach and education in the Python community. What is your motivation behind this particular focus? Great deal of empathy for newcomers due to personal history Knowing how to program changes how you think about the world Has the situation for newcomers running Windows who wish to try Python gotten any better since your keynote at Kiwi PyCon? Some vagaries of setup have gotten better with recent versions (e.g. setting path variables) Ruby has in-browser tutorial to get people hooked Do “Batteries Included” distributions like Anaconda help or is it the same problem of visibility you discussed in your talk? Information flow / what are your default options question We could be much more opinionated about this You have presented a number of times about the future of Python and how we can all help to make sure that story is a happy one. How has the material for that talk changed over the past few years? As a largely volunteer community, how to maximize the impact of the bandwidth that we have Focus on the ‘top of the funnel’ to win over new users Python has the steepest positive curve of any language Community should invest in AP high school Python curriculum What do you anticipate will be the talking points for this topic over the next few years? We need to be smart about which areas we invest in to ensure success e.g. mobile, web, desktop. If you could grade the Python community on how well they have listened to and acted on the calls to action in your talks over the past few years, what would you give them? 
Rallying large groups of volunteers is a hard problem We need to think about commercial partnerships in key areas In your Kiwi PyCon talk you mentioned Kivy as an example of a great way to do mobile software development in Python. It feels to me like the Kivy team are still not getting the community involvement and buy in they should. How can we help make Kivy the mobile app development platform of choice for beginners? This will be a tough battle because Python is not the default platform for mobile compared to Java for Android, Objective C, Swift Users vote with their feet depending on what provides the most value to them Opportunity for a virtuous cycle here Game development as an entree to programming has been a recurring theme on our podcast. Has the Python game dev scene improved at all since 2013? And do you still see the same pitfalls holding people back (like app packaging), or have we moved on to different problems? The problems are largely the same Status quo still feels pretty broken Creative experiments around this definitely make sense for the community KivEnt could be a win here because Kivy apps are free standing binaries and require no dependencies. What do you view as the biggest threats to the popularity of Python currently and what can we do to address them? Other languages gaining popularity where Python has historically been strong (e.g. server-side development) A lot of this may be a perception issue May be largely a marketing problem I understand that you were involved in the formation of the Open Hatch organization. Can you describe what Open Hatch does and how our listeners can get involved? Non-profit dedicated to lowering barriers to entry for open source contribution Host workshops in colleges, underserved communities, etc. 
Picks Tobias F.lux Lightyear.fm PEP 0401 Chris The Alex Verus Series by Benedict Jacka Rick Dillon’s Org-mode structure manipulation tutorial Dominion Jessica Reply All Podcast RFC 959 – original FTP RFC Go read some RFCs! Think Stats Keep In Touch Google for “Jesstess” Conference Presentations https://www.youtube.com/watch?v=CIRPSbsRw8&utmsource=rss&utmmedium=rss https://www.youtube.com/watch?v=2p-FecWnyQ&utmsource=rss&utmmedium=rss https://www.youtube.com/watch?v=lH9KJBrR1Q&utmsource=rss&utmmedium=rss https://www.youtube.com/watch?v=d1a4Jbjc-vU&utmsource=rss&utmmedium=rss

9/1/2015 • 51 minutes, 23 seconds

Static Site Generators with Justin Mayer and Roberto Alsina

Visit our site to listen to past episodes, comment on the show or find out more about us. Summary In this episode we had the opportunity to discuss the world of static site generators with Roberto Alsina of the Nikola project and Justin Mayer of the Pelican project. They explained what static site generators are and why you might want to use one. We asked about why you should choose a Python based static site generator, theming and markup support as well as metadata formats and documentation. We also debated what makes Pelican and Nikola so popular compared to other projects. Brief Introduction Welcome to Podcast.__init__ the podcast about Python and the people who make it great Follow us on iTunes, Stitcher or TuneIn Give us feedback on iTunes, Twitter, email or Disqus We donate our time to you because we love Python and its community. If you would like to return the favor you can send us a donation. Everything that we don’t spend on producing the show will be donated to the PSF to keep the community alive. Date of recording – August 08, 2015 Hosts Tobias Macey and Chris Patti Today we are interviewing the core developers of Nikola and Pelican about static site generators Interview Introductions Monitorial.net <- Justin Upriise <- Justin Works for Canonical <- Roberto How did you get introduced to Python? Justin: Needed a way to get order data to payment processor for commerce company Roberto: 1996 got involved with Linux Found XForms Wrote Python bindings For our listeners who might not know, what are static site generators and what are some of the advantages they bring to the table over other similar systems that perform the same function? 
Roberto Remove all the effort from the computer that serves the website Server runs no code Smaller surface area for security purposes Justin Better performance – important for responsiveness and uptime Easier deployment and maintenance Easier versioning and migration Can version both input and output There are a number of static site generators available in virtually every language. Why would a user want to leverage a Python solution vs Ruby, javascript, Go, etc.? reStructuredText is best supported in Python Good language for supporting various markup syntaxes Most static site generators seem to have a primary focus on blogging. What is it about these tools that lend themselves so well to that use case? The authors of the tools shape the purpose of the tool Most popular among programmers which is a demographic that is likely to have a blog Workflow is similar to what programmers are used to Still useful for non-chronological pages due to templating system Something that struck me comparing the two systems is that they have largely the same kinds of data going into the metadata block for each post, but it’s expressed in a different / incompatible way in each. Have you ever considered agreeing on a standard and even advertising it as such so all static site generators could make use of it? Challenging because of the idiosyncratic way problems are solved in each system Wouldn’t end up with the same site even if metadata were identical Roberto & Justin are talking, this may happen! The themes in Pelican and Nikola have very different feels and one of the things that initially drew me to Pelican is the larger catalog of themes available. What are some of the challenges involved in creating a theme for a static site generator? Many programmers who write SSGs aren’t amazing at HTML Pelican and Nikola seem to be the most widely used projects for creating static sites using Python. What do you think is the key to that popularity? 
Frequent updates, good documentation and large community Easy to get up and running Need to be productive inside of 2 minutes Good first impressions are key Importance of extensibility Core modularity and availability of plugins A lot of people have written about the importance (and difficulty) of writing and maintaining good documentation in open source projects. Nikola’s documentation is excellent. How did Nikola manage this in its development process and what can other open source projects learn from this? No secrets – just do it and keep it updated. Need to look at the tool as if using it for the first time What are some specific examples of unique and interesting uses your site generators have been put to? Justin: kernel.org, Debian, Chicago Linux Users, TransFX (translation house) all use Pelican Embedding Jupyter notebooks and MathML rendering in posts Site search plugin Nikola: Big adoption in the sciences (Jupyter notebook embedding supported in core) Output is forever Plugin to trigger internet archive to reindex site Nikola’s flexible deployment architecture (e.g. the use of doit tasks) seems to lend itself to some interesting use cases. What was the inspiration for this? Build was taking 1 1/2 hours, doit allowed for incremental generation Doit is a generic task system. Nikola has no “main” it’s a collection of doit tasks. Is there any specific help that you would like to ask of the audience? Contribute themes Help with reviewing issues and pull requests Picks Tobias Termux Magic Wormhole Arrow Chris Emacs Lisp Introduction 3D Cellular Automata in Minecraft Prompt 2 Justin Monitorial.net Upriise Ergodox Jarvis Bamboo Sit/Stand Desk Talky.io Fish shell Tacklebox iTerm v3.0 beta Brother Thelonious Belgian Ale Frog’s Leap Winery PyCon Italia and Italy in general Roberto Neal Stephenson Docopt Fried Pickles PyAr Python Argentina User Group PyCon Argentina in Mendoza PyCamp Keep In Touch Justin Personal Pelican Roberto Nikola Forums and mailing list

8/25/2015 • 1 hour, 32 minutes, 35 seconds

Al Sweigart on Python for Non-Programmers

Visit our site to listen to past episodes, learn more about us, and support the show. Summary We got the opportunity to speak with Al Sweigart about his work on books like ‘Automate The Boring Stuff With Python’ and ‘Invent With Python’. We discussed how Python can be useful to people who don’t work as software engineers, why coding literacy is important for the general populace and how that will affect the ways in which we interact with software. Brief Introduction Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Subscribe on iTunes, Stitcher, TuneIn or RSS Follow us on Twitter or Google+ Give us feedback! Leave a review on iTunes, Tweet to us, send us an email or leave us a message on Google+ I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show you can visit our site at We are recording today on July 27th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Al Sweigart about Python for non-programmers Interview with Al Sweigart Introductions How did you get introduced to Python? Started in PHP/Perl, introduced to Python in 2006 Lack of curly braces took some getting used to Clarity of standard library was refreshing What inspired you to start writing books for non-programmers? Friend who took care of 10 year old interested in programming Lack of coherent introductory material Started writing a tutorial which grew to book length All books published under Creative Commons license You have written a few books about teaching Python to people who have never programmed, can you share your thoughts on the best order in which to introduce the various aspects of programming? 
Blog post driven development – http://blog.estimote.com/post/119525082855/user-stories-on-steroids-how-estimote-uses-blog?utmsource=rss&utmmedium=rss Where does software testing come in when teaching new coders how to program? Use the logger, debugger, and assertions effectively In Invent with Python you use games as the vehicle to discuss the principles involved with writing code. What is it about computer games that makes them so popular as a means to introduce programming to newcomers? Something everyone is familiar with Easy to make a simple game to get started Good way to get creative with programming For Automate the Boring Stuff with Python you focused on explaining how programming can be useful even if it is not someone’s occupation. How did you determine which kinds of activities to focus on for the book? Got the idea at a meetup talking to someone who works in an office doing repetitive tasks A lot of office jobs that involve tedious computer work which could be automated What are your thoughts on the need for software literacy among the general population? How much programming knowledge do you think is sufficient for a member of our modern society? You also wrote about using Python to decrypt simple ciphers as a means to learn about code. What was the inspiration for this approach to software education? One of the projects in Invent with Python was a simple cipher, inspired further interest in the subject In episode 7 with Jacob Kaplan-Moss we talked about how we define what a programmer is. Can you share your opinions on what separates someone who can understand code from someone who is a programmer? 
Barriers to entry have been significantly lowered, making the distinction very fuzzy Definition of programmer is becoming much wider Books available at: Automate the Boring Stuff Invent With Python Picks Tobias Logbook Emacs Psychotherapist Ex Machina Mining the social web Chris Emacs Rocks Working Copy Feedly Tom Collins Al PyCon Selenium Python Module Seveneves by Neal Stephenson Keep In Touch Twitter Email

8/16/2015 • 52 minutes, 51 seconds

Liza Avramenko on CheckIO and Empire of Code

Visit our site to listen to past episodes, find additional content, sign up for our newsletter or learn about the hosts. Summary In this episode we talked to Liza Avramenko, the CEO of CheckIO, about Empire of Code and CheckIO. We discussed what differentiates them from each other and from the other coding games that have been spreading on the internet. One of the main differentiators for CheckIO in particular is the strong focus on community. The bottom line is that if you use Python then you should check out CheckIO and Empire of Code as a great way to practice your skills. Brief Intro Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great. Follow us on iTunes, Stitcher or TuneIn Give us feedback on iTunes, Twitter, email or Disqus We donate our time to you because we love Python and its community. If you would like to return the favor you can send us a donation. Everything that we don’t spend on producing the show will be donated to the PSF to keep the community alive. We are recording today on July 27th, 2015 and your hosts as usual are Tobias Macey and Chris Patti Today we are interviewing Liza Avramenko about CheckIO Interview Please introduce yourself How did you get introduced to Python? Learned about it from Co-Founder Alex For anyone not familiar with CheckIO, can you explain what it is? What was the inspiration for creating the CheckIO platform? Alex was bored working in a bank and wanted to create a place for sharing practice problems What is your goal with this platform? Become global community for most popular coding languages Remain open and supportive How do you deal with the question of ownership and licensing in CheckIO? Was this a tricky hurdle to get past in the site’s creation? Being willing to share solutions publicly is a core part of the site. This had to be more explicitly stated due to some users’ confusion early on. Growing a community is difficult because of the chicken and egg problem. 
How did you kickstart the growth of the CheckIO community? Community always number one priority Started organically Initially had 24/7 live chat to help new users Openness was attractive, led to critical mass As community grew, need for live chat decreased Nature of Python community lends itself well to a collaborative, open community Guido provided advice on how to grow and foster community Guido himself has participated in a number of conversations on your platform to critique submissions. Have you received any feedback from him directly about his impressions of the system? How does diversity play into CheckIO? Are there aspects of the site’s design that are purposefully meant to attract a diverse audience? CheckIO has always targeted people with basic coding experience Early live chat feedback focused around very new coders wishing there was more material for them These early challenges resulted in the development of Empire of Code There are a number of other online programming-oriented games available. What makes CheckIO and Empire of Code stand out from them? Priority of community Others are more about gaming, showcasing talent How did you design the gamification aspects of CheckIO, and how important do you think they are to the site’s success? CheckIO was never a game, more of a library of challenges that have game elements Empire of Code is all about gamification, code and algo improvement are baked into the gameplay You choose Python or Javascript “legions” at character creation time, this is a one time choice. Buildings, troop movements, materials, etc. 
are all based in code Players can steal code and algorithms from other players Incredible innovation Great adoption story for new users – can start playing without writing any code But in order to really excel you will WANT to start writing code So many people have their original motivations for coding come from playing games Cooperative play in the form of training missions with other players This is an opportunity to learn how people on the other side are solving the same problem New languages are planned – Ruby, maybe Java? Do you think that there is something about the Python language or community that inspires adoption of this kind of gamified practice? You recently released the beta of a new experience called Empire of Code which is more akin to the type of video game that many people are familiar with. What inspired that evolution? As part of the new experience, you also added JavaScript as an available language. Do you intend to add new languages in the future? Is there a particular demographic or set of demographics that you are targeting with Empire of Code vs CheckIO? What’s the monetization strategy for Empire of Code or CheckIO? For Empire, you can play for free but you might keep losing your resources until you can learn to code more effectively, OR you can buy a shield which will protect your resources for a time. In CheckIO, how do you label the difficulty level of the individual puzzles, is there a set of guidelines for that or is it up to the puzzle writer / submitter? CheckIO trusts its community The community rates each challenge Part of the CheckIO platform is the ability for users to submit their own problems. How much vetting is involved before these submissions are available to users of the site? Where do you see CheckIO and Empire of Code going in the future? 
Want to have Empire of Code known as the best online game that blends in programming by the end of 2016 In ~5 years want to see people saying the CheckIO/Empire of Code inspired people to program as a career In ~10 years want to see all major languages represented Aiming to become a major game publisher Picks Tobias JSON Web Tokens Source Code Pro DirEnv Chappie Chris Prune Nikola Warday’s Cocktail Liza Kiev, Ukraine Bulletproof Coffee Keep In Touch Twitter: @avrliza

8/6/2015 • 48 minutes, 15 seconds

Glyph on Ethics in Software

Visit our site for past episodes and extra content. Summary In this episode we had a nice long conversation with Glyph Lefkowitz of Twisted fame about his views on the need for an established code of ethics in the software industry. Some of the main points that were covered include the need for maintaining a proper scope in the ongoing discussion, the responsibilities of individuals and corporations, and how any such code might compare with those employed by other professions. This is something that every engineer should be thinking about and the material that we cover will give you a good starting point when talking to your compatriots. Brief Introduction Welcome to Podcast.__init__ the podcast about Python and the people who make it great Date of recording – July 21, 2015 Hosts Tobias Macey and Chris Patti Follow us on iTunes, Stitcher, TuneIn, Google+ and Twitter Give us feedback! (iTunes, Twitter, email, Disqus comments) We donate our time to you because we love Python and its community. If you would like to return the favor you can send us a donation. Everything that we don’t spend on producing the show will be donated to the PSF to keep the community alive. Interview with Glyph Introductions How did you get introduced to Python? – Chris 2000 – large scale collaborative gaming system in Java Asynchronous IO Twisted Let’s start with the bad news What are some of the potential widespread implications of less than ethical software that you were referring to in your Pycon talk? 
– Chris Robot Apocalypse (Not really) Much of the discussion around this derails into unrealistic nightmare scenarios THERAC 25 radiation machine Toyota unintended acceleration scandal Real worry – gradual erosion of trust in programmers and computers First requirement for a code of ethics – a clear understanding of the reality you’re trying to litigate The search for ethics will likely begin in academia where this aspect of software dev is more like psychology. In your talk you commented on the training courses that lawyers are required to take as part of their certification. Do you think the fact that there is no standardized certification body for software development contributes to a lack of widely held ethical principles in software engineering? – Tobias Do you think that it is necessary to form such a certification mechanism for developers as part of the effort to establish a recognized ethical code? – Tobias If we were to create a certification to indicate proper training in the software engineer’s code of ethics, how do you think that would affect the rate at which people enter the industry? – Tobias Assuming we can all agree on a set of relatively strict professional ethics that would prevent the above from happening, how would we enforce those ethics? Or do you advocate an honor system? – Chris Ethics are by definition an honor system Enforcement would be straightforward – professional organizations to maintain a record and deviations from that record Need better laws & better jurisprudence We need an Underwriters Laboratory seal for software development ethics Code of software ethics will not and should not tell you how to be a decent human being. Devs / companies can create software that could be used for evil – “We are merchants of death and these are lethal weapons” – could conceivably earn the ethical software developer’s seal of approval. Where does accessibility of the software we make fit into a code of ethics? 
Do you think there should be a minimum level of support for technologies such as screen readers or captioning for audio content in the software that we build? – Tobias Minimum levels of knowledge required Minimum levels of content in curriculum In your talk you mentioned how Rackspace’s stance on user support matches the ideals you’d previously laid out, can you flesh that out a bit for us? What does that mean to individual Rackers in their day to day work lives? – Chris In your talk you mentioned that availability of the software source should be mandatory for compliance with a properly defined ethical framework. What mechanisms for providing that access do you think would be acceptable? Should there be a central repository for housing and providing access to that source? – Tobias Would the list of acceptable mechanisms change according to the intended audience of the software? – Tobias What responsibility do you think producers of software should have to maintain an archive of the source for past versions? – Tobias How should we define what level of access is provided? In the case of commercial software should the source only be available to paying customers, perhaps delivered along with the product? This also poses an interesting quandary for SaaS providers. Should they provide the source to their systems only to paying customers, or to potential customers as well? – Tobias This question of transparency and availability of source is especially interesting in the light of a number of stories that have come out recently about patients who have been provided with prostheses and other medical devices. In a number of cases, shortly after receiving the device, the company who made it, which are increasingly startups, goes out of business, leaving the patient with no way of obtaining support for something that they are dependent on for their health and well-being. Having the source for those devices available would help mitigate the impact of such a situation. 
– Tobias You brought up an interesting aspect of the trust equation and its relevance to the need for an ethical code. Because what we do as software engineers is effectively viewed as sorcery by a vast majority of the public, they must therefore wholly place their trust in us as part of using the products that we create. As you mentioned with the demise of the scribe with the rise of literacy, increasing the overall awareness of how software works at a basic level partially reduces that dependency of trust. At what level of aptitude do you think our relationship with our users becomes more equitable? How does the concept of source availability play into this topic of general education? – Tobias What can the Python community in particular do to start the ball rolling towards defining a set of professional ethics, and what has it already done in this area? – Chris PSF Code of Conduct is a starting point PSF is an organization of individuals Corporations are cagey about getting involved for fear of it becoming a legally binding contract Django Code of Conduct more specific Picks Tobias Philips SHP9500 keybase.io – Tweet us with your favorite thing about the show to get an invite Paul Blart: Mall Cop 2 Chris Don’t Starve for iOS Want to understand Python’s comprehensions? Think in Excel or SQL. Barr Hill Gin Glyph Py2App Blog post PyObjC Sensair Sous Vide immersion circulator Keep In Touch Twitter Keybase.io email Glyph everywhere on the internet

8/3/2015 • 1 hour, 19 minutes, 23 seconds

Holger Krekel on Py.Test

Visit our site to listen to past episodes, learn more about the show and sign up for our mailing list. Summary In this episode we talked to Holger Krekel about the py.test library. We discussed the various styles of testing that it supports, the plugin system and how it compares to the unittest library. We also reviewed some of the challenges around packaging and releasing Python software and our thoughts on some ways that they can be improved. Brief Introduction Welcome to Podcast.__init__ the podcast about Python and the people who make it great Date of recording – July 8th, 2015 Hosts Tobias Macey and Chris Patti Follow us on iTunes, Stitcher or TuneIn Give us feedback on iTunes, Twitter, email or Disqus We donate our time to you because we love Python and its community. If you would like to return the favor you can send us a donation. Everything that we don’t spend on producing the show will be donated to the PSF to keep the community alive. Overview – Interview with Holger Krekel about his work on Pytest Interview with Holger Krekel Introductions Programming for 25 years Runs a consultancy Been to almost every EuroPyCon and PyCon US How did you get introduced to Python? – Chris Wanted to write an HTTP proxy and Java I/O was too confusing. Jython took less than a day to get it working after 2-3 days on it with Java. What inspired you to create Pytest, and how did the existing unittest framework play into the story? – Chris Introduced to agile methods through the Zope community Zope used unittest – didn’t like the boilerplate Not in the spirit of Python Only took ~200 lines of code to get a testing tool working Original name was ‘utest’ – 2003 Pytest name came in 2004 on Pypy project Huge number of tests on that project (20,000) – distributed test runner – xdist helped solve this. 
There are many different styles of testing, such as BDD, unit testing, integration testing, functional testing, what attributes of py.test make it suitable or unsuitable for these different approaches? – Tobias What are your views on black box testing and how would someone use py.test to implement this approach? – Tobias Pytest’s plugin architecture enables you to hook into the various phases of test execution enabling you to extend Pytest in all kinds of ways beyond the original design. I have been hearing a lot about property based testing which was popularized by the Quickcheck module in Haskell. Does py.test support anything like that? – Tobias hypothesis-pytest Do you think the characteristics and nature of the unit testing framework being used have any effect on the number and quality of the tests developers write? – Chris Developers find writing tests in Pytest to be fun compared to unittest Which will help people write better tests Encourages refactoring Is there ever a time when you would advise against writing tests? – Tobias When exploring a problem, writing tests first doesn’t make sense When getting feedback on a potential approach, writing tests first can be a waste of time What are some signs that you watch out for when writing tests that tell you that a particular feature needs to be refactored? – Tobias When the test code is fragile it should be refactored Requires experience to really understand when to refactor When it’s not fun anymore or the tests are repetitive For someone who is converting their existing unit tests from UnitTest/Nose style to use py.test in an idiomatic manner, what are some of the biggest differences to be aware of? – Tobias Generator/yield based testing should move to property based testing If py.test can’t run a UnitTest/Nose style test it is considered a bug and gets fixed Has the strict backwards compatibility policy presented any interesting technical challenges thus far? 
– Chris Yes it definitely makes more work However breaking the API in a large project like this will cause too many problems for users py.test supports execution of tests written with other frameworks, how much ongoing maintenance does this feature require as changes are made to the other implementations? – Tobias The web page says that Pytest is designed to work with domain specific and non Python tests, and in fact a coworker is using it to test a node.js project – how did Pytest’s design enable this? – Chris Pytest uses a collection tree model to represent your project This is not Python specific All classes and functions are just mapped into this tree, not directly on the Python function There are few Python specific hooks for fixtures etc. People have written plugins so they can express their tests in YAML, Microsoft Excel Tests are represented as items All plugins are written in Python What are some of the most interesting applications of py.test that you have seen? – Tobias Plugins! Pytest-BDD Pytest-C++ Pytest-sugar Py.test plugin list Speaking about adoption, do you have any sense of the relative adoption of Pytest versus unittest or other tools? – Tobias Very hard to actually know Download numbers are not a clear indicator due to robots, CI systems, etc. Quantifying market share is hard to do Popularity is not a useful heuristic in determining a good fit for technology adoption But popularity is an indicator for the level of support you might receive Tech can be popular but very poorly maintained Are there any features of py.test that would make it suitable for use with configuration management tools and infrastructure testing? – Tobias Example driven testing Run py.test from a blackbox approach Largest benefit would be from having one testing tool used across the organization Where do you see Pytest and more generally test frameworks headed in the future? 
– Chris No big changes for Pytest – lots of incremental things Plugins will add functionality Holger is also the author of Tox Integration testing and testing in more complex environments are a direction that test management tools will likely go Tools like Jenkins can be a real headache in trying to have a good testing story for your company https://devpi.net/hpk/dev/devpi-server/2.2.0/+toxresults/devpi-server-2.2.0.tar.gz Any questions we didn’t ask? Pytest is a very healthy project! There are 10 regular contributors – this is exceptional among OSS projects Picks Tobias python-future six The Way Back Rosewill BK-500A or BK-500i pipdeptree pundler Chris Crop Bavarian Weizen Dutch Pancakes Prophet Holger The Utopia of Rules IPFS.io – The interplanetary file system A New Way to Look at Networking Keep In Touch Twitter Blog The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
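Example: the boilerplate difference discussed in this episode can be sketched roughly as follows. This is only an illustration, not code from the show; `slugify` is a hypothetical helper invented for the example. The same check is written once in unittest style and once as the plain function with a bare assert that pytest collects by name.

```python
import unittest


def slugify(title):
    """Hypothetical helper under test: lowercase a title and join its words with hyphens."""
    return "-".join(title.lower().split())


# unittest style: a TestCase subclass and assert* methods are required.
class SlugifyTest(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(slugify("Holger Krekel on Py.Test"), "holger-krekel-on-py.test")


# pytest style: a plain function named test_* with a bare assert.
# pytest discovers it by name, so no base class or runner code is needed.
def test_basic():
    assert slugify("Holger Krekel on Py.Test") == "holger-krekel-on-py.test"
```

Saved in a file such as test_slugify.py (the file name is an assumption), running pytest in that directory should collect both the plain function and the TestCase class, which is the compatibility with unittest-style tests that the notes describe.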

7/24/2015 • 1 hour, 11 minutes

Damien George Talks To Us About MicroPython

Visit our site for more news, information and past episodes of Podcast.__init__! Summary We talked to Damien George about his work on the Micro Python interpreter and the PyBoard SOC (System on a Chip). The combination of the interpreter and SOC allows Python developers to get involved in hardware hacking, as well as letting electronics aficionados try their hand at development. Damien explained to us where this fits in with the expanding landscape of low cost embedded devices and why you should get one to start playing with it. Brief Introduction Date of recording – June 29th, 2015 Hosts – Tobias Macey and Chris Patti Follow us on iTunes, Stitcher or TuneIn Give us feedback! (iTunes, Twitter, email, Disqus comments) You can donate (if you want)! Overview – Interview with Damien George from the Micro Python project Interview with Damien George Introductions Postdoc in Theoretical Physics How did you get introduced to Python? What problem were you trying to solve when you first had the idea to create the Micro Python board and interpreter? Not really Python lets you get things done quickly Abstracts the hardware really well In the Kickstarter video you mention that Micro Python is a complete re-implementation of Python optimized to run on a micro-controller. How hard was it to create an alternative Python implementation? Did you have hard decisions to make as to what to include given the limitations of the hardware? To start with, was it even possible? Proof of Concept: Get a REPL running on the board Lots of tricks to get things to fit into RAM Stuffing integers into pointers Optimizing RAM at various points Runs the parser 4 times, looking for different things each time Lots of things are stored in ROM in the built-in Flash Very fine efficiency trade off between code size, memory usage, speed. REPL runs in 1K of RAM! 
Most of this is the parse tree 20 line script might take ~5K RAM 128K RAM on the Micro Python board Not 100% Python – but 90% – the most useful parts I know that people who have developed alternative Ruby implementations have run into issues due to the lack of a formal specification. Has the fact that there is a specification for Python made your job easier? Definitely, Python is very well defined Well documented Already multiple implementations The WiPy chip seems like an interesting device. What are some ways in which it could be put to use? A Micro Python cluster for instance? Small, cheap, low power little wireless chip that also runs Python You can telnet in and have a Python REPL Part of the Internet of Things What changes did you have to make to get the Python interpreter to run without an underlying operating system? When you were designing the hardware, what were some of the requirements that you were targeting in terms of performance or peripherals? Wanted the best chip for the least money Didn’t know ahead of time how many resources were required What level of hardware knowledge is required to start working with the Micro Python board? Virtually none Just need to plug into USB and login with a terminal program to get a Python prompt Can change frequency of CPU, turn on/off LEDs, etc. Connecting peripherals requires some hardware knowledge Module namespace to make hardware management easier For anyone who is interested in writing libraries, what kinds of restrictions do they need to be aware of? Be aware of RAM size limitations Pretty much anything that will fit will work Libraries with C extensions won’t work because they rely on the CPython API What license is used for the Micro Python interpreter and the PyBoard? Are they compatible with commercial uses? 
MIT License Hardware schematics are open source as well, open and accessible design What are some of the most interesting/innovative projects that you have seen people make with the Micro Python board or runtime? Damien attempted to make a quadcopter – not completely finished Micro Python controlled guitar – PyBoard connected to actuators to play guitar How does the experience of using Micro Python compare to some of the other hardware projects that are popular right now such as Arduino, Raspberry Pi or Tessel? PyBoard in between Arduino and Raspberry Pi More approachable than Arduino Not a full OS like Raspberry Pi Tessel similar to Micro Python but runs Javascript EU Space Agency (Europe’s version of NASA) interested in Micro Python Prepared to fund Micro Python development to explore possibilities of space based applications Code needs to be well written and with few bugs See if it can be used for real-time systems Picks Tobias Machine Gun Preacher – Real life story of Sam Childers’ work in Southern Sudan Pocket Book Android App – E-Book app with good UI/UX and solid feature set Online access to digital media through local library memberships Hoopla Digital Overdrive Chris Real Ramen RedHat Summit The SELinux Coloring Book Damien MOSH – Mobile shell, resilient SSH that allows for resuming sessions across networks, computer sleeps, etc. Keep in Touch Twitter @micropython @damienpgeorge GitHub – micropython The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
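Example: the “stuffing integers into pointers” trick mentioned in this episode is a general technique often called pointer tagging. The sketch below illustrates the idea in plain Python and is not taken from the Micro Python source; the 32-bit word size and all function names are assumptions made for the example. Because aligned pointers have a zero in their lowest bit, a word whose lowest bit is 1 can carry a small integer directly, with no heap allocation.

```python
WORD_BITS = 32                      # assumed machine word size for this sketch
WORD_MASK = (1 << WORD_BITS) - 1


def box_small_int(value):
    """Pack a signed integer into a word whose lowest bit is set to 1."""
    limit = 1 << (WORD_BITS - 2)    # one bit is spent on the tag, one on the sign
    if not -limit <= value < limit:
        raise OverflowError("value does not fit in a tagged word")
    return ((value << 1) | 1) & WORD_MASK


def is_small_int(word):
    """Aligned pointers end in a 0 bit, so a 1 bit marks an inline integer."""
    return (word & 1) == 1


def unbox_small_int(word):
    """Recover the signed integer stored in a tagged word."""
    if word >= 1 << (WORD_BITS - 1):
        word -= 1 << WORD_BITS      # reinterpret the word as a signed value
    return word >> 1                # arithmetic shift drops the tag bit
```

Under these assumptions, box_small_int(5) gives the word 11, and unbox_small_int turns it back into 5; a word-aligned address such as 0x20001000 fails the is_small_int check and would be treated as a pointer.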

7/16/2015 • 49 minutes, 17 seconds

Allen Downey on Teaching Computer Science with Python

Find past episodes and more information about the show at iTunes, Stitcher or TuneIn Give us feedback! (iTunes, Twitter, email, Disqus comments) You can donate (if you want) Overview – Interview with Allen Downey, Prolific Author and Professor of Computer Science Interview with Allen Downey Introductions How did you get introduced to Python? – Chris Wrote a Java book with an open license to allow anyone to make changes Jeff Elkner translated it to Python What attributes of Python make it well suited for use in teaching computer science principles? Syntax is simple, makes a difference for beginners Good error messages Batteries included One of the things I found very compelling about Think Like a Computer Scientist is its use of interactive turtle graphics early on. What makes the turtle continue to be a compelling educational tool and what made you choose it for this book in particular? Everything you do has a visible effect, makes it easier to see what’s happening and debug Used to introduce functional decomposition because of no return value in turtle graphics Great way to explore complex geometric concepts Did the structure of your courses change when you started using Python as the language used in the classroom? Were you able to cover more material as a result? Able to make material more interesting Less time spent fighting with syntax As a professor of computer science, do you attempt to incorporate the realities of software development in a business environment, such as unit testing and working with legacy code, into your lesson plans? Unit tests useful as a teaching tool Version control getting introduced earlier A number of your books are written around the format of ‘Think X’. Can you describe what a reader can expect from this approach and how you came up with it? 
Learning how to program can be used as a lever to learn everything else You can understand what a thing is by understanding what it does What are some of the more common stumbling blocks students and developers encounter when trying to learn about statistics and modeling, and how can they be overcome? Traditional analytic methods for statistical computation – get in the way and impede understanding P-values are a great example What test should I do? is the wrong question I’ve heard you refer to yourself as a ‘bayesian’. Can you elaborate on what that means and how bayesian statistics fits into the larger landscape of data science? Frustration with frequentist approach to statistics Wasted time over debate of objectivity vs subjectivity Bayesian approach takes modeling ideas and makes them explicit Can directly compare and contrast results of competing models Classical approaches don’t answer the most interesting questions We’re big fans of IPython notebook which you’ve used in at least one of your books already – can you describe some of the ways you have implemented it in an educational context, as well as some of the benefits and drawbacks? Started using about 2 years ago Appreciated usefulness for books and teaching because of synthesis of text, code and results Working on DSP really highlighted the usefulness of IPython notebooks Picks Tobias IMAPy – IMAP for humans ScudCloud – Linux desktop Slack client Thrive – Online purchasing club for healthy and organic foods Floobits – remote pair programming Chris Testament of Youth Mastering Emacs – The Website / Blog StayFocused Fallout Shelter Keep in Touch Twitter Blog The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

7/9/2015 • 37 minutes, 42 seconds

Jacob Kovac on KivEnt

Listen to past episodes and find out more about the show at our website pythonpodcast.com Synopsis In this episode we talked to Jacob Kovac, creator of the KivEnt game engine and one of the Kivy core developers. He told us about what inspired him to create the KivEnt project, some of the ways that he has managed to optimize rendering time and some of the problems that he has encountered as part of his work on the project. We also discussed what the use cases and limitations of the KivEnt engine are and he shared some of the projects that have been made with it. Brief Introduction Date of recording – June 17th, 2015 Hosts – Tobias Macey and Chris Patti Follow us on iTunes, Stitcher or TuneIn Give us feedback! (iTunes, Twitter, email, Disqus comments) We don’t have any corporate sponsorship or advertisements in the show because we are making it for the community and we respect our listeners and value your time. If you would like to help support the show and keep it ad-free you can find out how by visiting our website Overview – Interview with Jacob Kovac about the KivEnt Game Engine, based off of Kivy Interview with Jacob Kovac Introductions How did you get introduced to Python? – Chris Could you please give us a high level overview of KivEnt and how it differs from other game builder frameworks like Unity or Unreal? Manages memory for game objects and stores them contiguously in memory for greater efficiency Real-time focused rendering engine for Kivy Cython interface to provide performant game objects with Python API Increased speed of main render loop by 38X by removing a single Python list lookup Kivent is mainly 2D focused, vs 3D for Unity/Unreal Python all the way down Cython and pointer magic for optimization purposes Made to be familiar to Pythonistas Aiming for “A” level games Bringing modern advancements in making games to Python – GPU awareness Built with constraints in mind The Pacman Dossier What inspired you to create the KivEnt engine? 
Tried to create an Android infinite runner in Kivy, performance was unacceptable Looking for how to build games in Python with large amounts of data Is there a particular kind of game KivEnt is particularly suited for versus any of the other popular frameworks? Focuses mainly on 2D, agnostic as to ‘type’ of game Jacob’s interests largely focused on procedurally generated environments Could KivEnt be used to create networked multiplayer games and what challenges might that bring to the table for the aspiring KivEnt game developer? Multiplayer thought to be largely out of scope This doesn’t mean KivEnt is bad for multiplayer games, but that KivEnt in and of itself doesn’t wholly solve this problem. Plenty of other frameworks to draw on for handling the multi-player server or pulling data from it, KivEnt solves the client side problems germane to making a game in Python Does the fact that KivEnt games need to run on so many platforms present any unique difficulties in KivEnt’s development? Kivy has solved most of the cross-platform problems Difference in GPU vendors has proved the most difficult I hear game developers talk a lot about assets and asset formats. What kinds of assets can be used with KivEnt? 2D assets are simple – especially as compared to 3D KivEnt supports any image format that Kivy does for your platform Coming next release – you can specify the vertex format for your model https://youtu.be/qe9fWC-2e3M I have heard that unit testing games is difficult and rarely done for reasons of time pressure, as well as lack of determinism in the interactions. Does KivEnt provide any utilities to make this easier? Not currently well tested, but targeting that for next release Trying to add tooling to make testing games easier, though still somewhat difficult Platform Biased Podcast – by a bunch of Microsoft Studios SDETs How does KivEnt handle input and what kinds of input devices are supported? 
Input handled entirely by Kivy, so any inputs supported by Kivy are accessible in KivEnt Rumors of using Kinect camera with Kivy/KivEnt applications Is there a built in physics engine or is that something that is pluggable? Mostly pluggable Chipmunk 2D integration provided via a module Particle Panda – one of the major inspirations for KivEnt New Particle engine coming in the next version of KivEnt How does KivEnt handle collision tracking? Mathematically difficult, very hard to get right Don’t do it! Use the physics engine – Chipmunk 2D is also a collision detection engine Kivy enables devs to use C, C++, Java and Objective C code in their games Game development has been democratized Entity / Component architecture enables great modularity Game objects that appear on the screen (Gun, ball, etc.) are not represented as such in the system Can you tell us about some of the projects that you have seen built in KivEnt which you are most excited by? https://github.com/chozabu/KivEntEd https://play.google.com/store/apps/details?id=org.chozabu.boardzfree&hl=en What are some ways in which our listeners could help contribute to the project? Would like to see more people build games in KivEnt Give feedback about the experience and what can be improved If you have Apple hardware, try out KivEnt and file issues with any errors that occur Picks Tobias EIN (Emacs IPython Notebook) Pip 7.x RESTful Web APIs Chris The Killing Data Science on the iPad with RethinkDB Left Hand Nitro Milk Stout Jacob Pelican Static Site Generator Terraria 1.3 Amorone Homemade Red Wine Keep in Touch E-Mail – kovac Blog – chaosbuffalogames.com/blog IRC – #kivy The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
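Example: the entity / component idea and the contiguous storage of game objects described in this episode can be sketched in plain Python as below. This is not the KivEnt API; `PositionStore` and its methods are hypothetical names invented for the illustration, and the standard library `array` module stands in for the Cython-managed memory the notes mention. An entity is just an index, and its position lives in one packed buffer rather than in a per-object Python instance.

```python
from array import array


class PositionStore:
    """Hypothetical component store: x/y pairs packed into one flat array of doubles."""

    def __init__(self):
        self._data = array("d")  # layout: x0, y0, x1, y1, ...

    def add(self, x, y):
        """Append a position and return the entity id (its slot index)."""
        self._data.extend((x, y))
        return len(self._data) // 2 - 1

    def get(self, entity_id):
        """Return the (x, y) pair stored for one entity."""
        i = entity_id * 2
        return self._data[i], self._data[i + 1]

    def move_all(self, dx, dy):
        """A 'system': one pass over the packed data, with no per-object lookups."""
        data = self._data
        for i in range(0, len(data), 2):
            data[i] += dx
            data[i + 1] += dy
```

A game loop in this style would call something like store.move_all(dx, dy) once per frame; the gun or ball on screen exists only as an id plus rows in stores like this one, which matches the point that game objects are not represented as such in the system.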

7/3/2015 • 1 hour, 8 minutes, 41 seconds

Eric Schles on Fighting Human Trafficking with Python

Listen to past episodes, read about the hosts or donate to the show at podcastinit.com Brief Introduction Date of recording – June 10th, 2015 Hosts Tobias Macey and Chris Patti Follow us on iTunes, Stitcher or TuneIn Give us feedback! (iTunes, Twitter, email, Disqus comments) You can donate (if you want)! Overview – Interview with Eric Schles Interview with Eric Schles Introductions How did you get introduced to Python? What inspired you to take up the fight against slavery? Is there a personal story behind this choice? Some of your work touches on the “Deep Web”. Can you provide listeners with some context around what that term means and the role it plays in what you do? Tor .onion sites (Hidden Services) are examples Anonymous Web Experience Anonymity allows for illegal, immoral things like buying and selling people Conceptually very important idea Bruce Schneier – Web technologies need to be more privacy aware Like a really scary version of “The Internet of the Old Days” Photos of young, exploited men and women Pedophiles are building communities, having parties through these hidden services Eric feels that Tor is an extreme Feels there had to be a way to protect the rights of legitimate while protecting against pedophiles Maybe a voting system? The Tor project feels that any compromise lessens the that’s so important for people in embattled or countries (Worded that poorly -Chris) No metrics on the amount of pedophilia that actually happens Tor – probably a lot Sexually abused victims of trafficking grow up damaged unable to do anything else Consumers of this type of porn were often themselves victims of sexual abuse Structural dissonance which exists to create this problem society needs to be addressed Google puts the number to the anti-trafficking hotline at top of any trafficking search results Darren (Derek?) 
Hayes – redirect to trafficking resources when viewing advertisements for victims of trafficking Why did you choose Python as opposed to any other tool for your search engine? Needed solutions quickly with the ability to evolve as needed Able to rapidly develop and incorporate new features Easy to scale as needed Flask is easier to prototype and iterate with Python data science tools make the analysis easy Able to finish a 2 year C++ project in 3 weeks using Python Doing data science in Ruby is challenging Pandas Dataframe galvanized the creation of a lot of other useful tools Vincent – write Python which compiles to D3 Can you provide a high level description of the technical details of the search engine that you created, and what it’s like to work with Tor through Python? Directed search engine “It would be like if you went to Google but everything watched was Porn which you were uncomfortable seeing and you sad” Get most case information through regular old detective work Person arrested / in holding yields phone number, other attributes that can feed the search engine Google can’t scrape the deep web Memex tool indexes the deep web – Eric’s search engine uses that Eric does design work for the Memex project Developed by the amazing Chris White Eric’s search engine uses the Tor driver in Selenium to .onion sites What are some of the technical and legal challenges that you experienced in the course of your work? Most of the technical challenges are around automated processing Legal structure provides some limits on what can be worked on Does your search engine try to infer who might be engaged in work voluntarily as opposed to those being forced into it against their will? 
No, because they get all their case referrals from detective work You have to have been hospitalized or in some other way come to the attention of the authorities for being deprived of rights Trafficking looks very different in different cultures Global similarities Afraid to say why if hurt Forced into having sex against your will Clear patterns of indication Urban versus Suburban versus Rural Fracking towns Demographics are very different – mostly men, very few women, LOTS of ads for sex workers Only helping people that want to be helped What was the most surprising fact you uncovered as part of your research? Imagery of exploited children is so depressing and sad Without revealing anything you shouldn’t, are you aware of anyone being set free as a result of your work? “Not my work, our work” Not an individual effort lawyers, analysts, larger DAs office Given the complicated socio-economic aspects of human trafficking and prosecution of those who are responsible, can you discuss some of the moral and ethical considerations that you have been confronted with while building these tools? Privacy is the biggest concern Open source book to teach colleagues at the DA’s office how to program in Python Sometimes Eric works at Civic Hall Are there any projects out there that you consider similar to what you are working on? Thorn’s Spotlight tool Memex Project Polaris Project Datakind Anti Trafficking dosomething.org – more broadly focused – help center for teens RescueForensics – stage startup What would it take for other municipalities and law enforcement agencies to get started with using your tools? 
Go to https://github.com/EricSchles?utmsource=rss&utmmedium=rss Alert System and investagator Contact Eric at [email protected] to collaborate How can our listeners get involved and help you with this Chris Tweet at @EricSchles or E-mail Eric Volunteer for any of the non profit anti-trafficking groups Message to the community: There is a world of good waiting to happen Picks Tobias @accidentalaRt tldrlegal.com Rishloo Chris Neil Gaiman’s Sandman Overture Alchemist Brewing’s Heady Topper Hen of the Wood Eric James Powell’s Blog Julia Nunes XKCD Explain XKCD Keep in Touch Twitter: @EricSchles Eric’s About.me page More From Eric He presented at PyGotham 2014 He also talked at the Open Data Science Conference 2015 Boston The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/25/2015 • 1 hour, 13 minutes, 9 seconds

Naomi Ceder, Lynn Root and Tracy Osborn on Diversity in the Python Community

Listen to past episodes, read about the show and check out our donations section at podcastinit.com Brief Introduction Date of recording – Jun-10th, 2015 Hosts Tobias Macey and Chris Patti Follow us on iTunes, Stitcher or TuneIn Give us feedback! (iTunes, Twitter, email, Disqus comments) You can donate (if you want)! Overview – Interview with Tracy Osborn, Naomi Ceder, Lynn Root Interview with Prominent PyLadies Introductions Tracy Osborn Naomi Ceder Lynn Root How did you get introduced to Python? In what ways do you think the Python community has succeeded in making itself more friendly and welcoming to women and other under represented minorities, and where could it do better? Python community leadership takes a positive stance on diversity Codes of conduct are taken very seriously Financial diversity needs more focus What can you tell us about PyLadies and Django Girls? PyLadies started in a coffee shop in LA pip install PyLadies Over 70 locations on almost every continent – half on meetup.com What are some of the challenges you still face in being a part of the Python community, and how can our listeners help? Don’t be disparaging about women-focused events I had to read up to page 17 of the top authors list on PyPI to find a woman. Can you provide some insight into what may be contributing to this state of affairs and how we can help to improve it? PyPI is confusing and intimidating Process and tools are tough to use Maybe PyLadies should host a “make your own package” night Mentorship and easy HOWTOs are needed You have all gained some notoriety in the Python community through work that you have done. Do you feel that you were faced with greater adversity than your peers in the course of your careers? Startup community more hostile than Python community We are talking to each of you because of your involvement in the Python community. Have you worked with and been involved in other language communities? 
If so, can you provide some comparisons between that and Python in how they manage the subject of diversity, gender and otherwise? Design community – lots of conferences with “all dude” conference speaker line up Startups very focused on males for employees and customers What effect do you think job descriptions play in excluding women and other minorities from roles in development positions? (In reference to https://blog.safaribooksonline.com/2015/06/08/on-recruiting-inclusiveness-and-crafting-better-job-descriptions/?utmsource=rss&utmmedium=rss) Discourage more appropriate term than exclude Women less likely to apply for roles that they are not completely qualified for Spotify experimenting with blind resume review and cross-checking of job descriptions Result is more women applying and having better results For any women and young girls who may be considering a career in technology, do you have any words of advice? Go for it, but be aware that it’s hard Do you have any advice for the men in the Python community and technology as a whole? Actually listen when somebody tells you that it’s not the same for them (race, economics, gender) Have some compassion and empathy Men should educate themselves Old habits die hard but getting over them is important Is there anything we haven’t discussed that any of you would like to bring up? 
Picks Tobias The Banned and the Banished series by James Clemens Cool Hand Luke with Paul Newman Chris Baxter Stowaway IPA Mastering Emacs 99% Invisible – The Nutshell Studies Naomi Ceder Korey Schrum – Dying for a Living Into the Brambles – by “PyDanny – Danny Greenfeld” Lynn Root Jupyter – tmpnb – Kyle Kelly blog post Knit Your Own Zoo Bechdel Test The Good Wife Passes the Bechdel Test Inspiration for women being awesome in a male dominated industry Tracy Osborn EasyPost – Simplifies generating shipping labels for USPS Keep in Touch Naomi Ceder @naomiceder Lynn Root @roguelynn Tracy Osborn @limedaring Blog Hello Webapp The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/18/2015 • 49 minutes, 14 seconds

Brian Granger and Fernando Perez of the IPython Project

You can find past episodes and other information about the show at podcastinit.com Brief Introduction Date of recording – June 3rd, 2015 Hosts – Tobias Macey and Chris Patti Overview – Interview with Fernando Perez and Brian Granger, core developers of IPython/Project Jupyter Follow us on iTunes, Stitcher or TuneIn Give us feedback! (iTunes, Twitter, email, Disqus comments) You can donate (if you want)! Interview with Brian Granger and Fernando Perez Introductions How did you get introduced to Python? – Chris For anyone who may not have heard of or used IPython, can you describe what it is? How challenging was it to port IPython to Python 3? Thomas Kluyver What prompted the name change from IPython to Project Jupyter and were there any associated changes in the project itself? Name inspired by Julia, Python and R – the three programming languages of data science Data scientists have adopted the use of IPython notebooks in their work on a large scale, what is it about notebooks that lend themselves to this particular problem domain? Bayesian methods for Hackers – Cameron Davidson-Pilon Signal processing in Python O’Reilly added support for notebooks into Atlas publishing platform IPython Notebook seems like an incredible tool for educators in advanced fields. Have you seen widespread adoption in this area and is it a focus for the project? NBGrader – notebook grader Github recently added the ability to render notebooks in a repo. Did you work with them to build that integration? What are some of the most interesting uses of IPython notebooks that you have seen? 
Gallery of interesting notebooks on the wiki Reproducible academic publications Couple of dozen scientific papers, some very high profile Educational notebooks on various subjects Great learning resource, as well as entertaining MOOC taught between distributed team on Open EdX using IPython notebooks about numerical computing with Python Peter Norvig collection of IPython notebooks Includes analysis of traveling salesman problem notebooks.codeneuro.org– time series data analysis <- Couldn’t get this to work. -Chris Are there any notable projects that use IPython as one of their components? KBase for computational biology Sage – Open source mathematics project written in Python Created by number theorist William Stein Custom parser to allow for non-python syntax Quantopian – Collaborative platform for financial modeling. Runs on top of IPython Wakari from Continuum Analytics – hosted IPython with computing environment Rackspace hosts TempNB and other IPython services Where do you see Project Jupyter going in the future? Are there any particular new features you’d like to see added? – Tobias One of the biggest targeted features is real-time collaboration Prototyped by engineers from Google More modular UI and architecture Multi-user deployments with Jupyter Hub A few weeks ago we interviewed Jonathan Slenders who wrote ptpython, which brings IDE like capabilities to interactive Python. Have you ever considered including this in IPython? What are some of the features that an average user might not know about? Is there anything in particular that you would like to ask our listeners for help with? 
Pitch in with the development effort Organize community events on behalf of IPython/Jupyter Be patient while documentation improves Picks Tobias Dayworld trilogy by Philip José Farmer ReadRuler.com Chris RubyTapas by Avdi Grimm CodeNewbies Tweetbot Brian Granger Data Science from Scratch – Joel Grus Elements of Graphing Data – William Cleveland Fernando Perez Republic Lost – Lawrence Lessig Alvaro Mutis Keep in Touch Twitter @projectjupyter, @ipythondev, @ellisonbg, @fperezorg The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

6/13/2015 • 1 hour, 21 minutes, 48 seconds

David Baumgold on Flask-Dance, WebhookDB and Open EdX

You can find out more about us and view previous episodes at podcastinit.com. Brief Introduction Date of recording – 2015-06-02 Hosts – Tobias Macey and Chris Patti Follow us on – iTunes, Stitcher or TuneIn Give us feedback on iTunes, Twitter, email or Disqus Interview with David Baumgold Introduction How did you get introduced to Python? What problem does Flask-Dance solve that wasn’t covered by other libraries? What were some of the technical issues that you encountered while building Flask-Dance? What are some of the design considerations that you had when building Flask-Dance? You also built webhookdb for replicating GitHub’s information to be queryable. What are some use cases for which you would want to do that? What is Open EdX (https://open.edx.org/) and what is its intended audience? What are some of the challenges implementing a system like Open EdX, and what can Python developers learn from the implementation of the project? Picks Tobias Evil mode Forgotify Wolf of Wall Street pipreqs Chris Dark Horse Brewing – “Smells Like a Safety Meeting” Medium Modern GNU Emacs David Homebrew for OSX Homebrew Cask Arrow Moment.js The Imitation Game Keep in touch Twitter: @singingwolfboy GitHub Website Email

6/7/2015 • 32 minutes, 21 seconds

Mark Baggett on Python for InfoSec

Read all of our show notes and find more information about us at podcastinit.com Brief Introduction Date of recording – May 28th, 2015 Hosts – Tobias Macey and Chris Patti Overview – Interview with Mark Baggett Follow us on iTunes, Stitcher or TuneIn Give us feedback! (iTunes, Twitter, email, Disqus comments) You can donate (if you want)! Interview with Mark Baggett Introductions How were you first introduced to Python? – Chris Started using it for automating tasks while working as a sysadmin Found code that launched an attack on FTP server – in Python What are some of the tasks in your job that you use Python for? -Tobias Trusted command & control backdoor for Windows Mostly not used by malware authors – thus far (at least Mark hasn’t seen it used that way) Flame virus – 5MB payload – incredibly advanced Lua interpreter bundled along with the scripts Vale framework – Python framework that takes payloads out of penetration testing executables What is it about Python that makes it useful for penetration testing and other information security tasks? Same thing that makes it useful for anything else impacket from Core Security What are some of the more useful Python penetration testing tools? OFFENSE Beautiful Soup scapy Volatility DEFENSE Counter dictionary from collections Pandas IPython matplotlib We’ve noticed that a lot of the literature around information security and penetration testing focuses on targeting Windows. Can you enlighten us as to why that is? 
Windows event tracing logman event trace providers – implement packet sniffing (Can turn every browser into a key logger) Primary attack surface – Where most attacks are targeted Fewer purely Linux systems Very few ports open – maybe 80, 22 Very likely no user just sitting there waiting to run an executable you send More freedom on Linux – less formalized patching process, more variable tools = more exploits Will write code to only use built in modules for Python that will run in customer target environments What are some of the legal considerations that you have to deal with on a regular basis as a penetration tester? There have recently been a number of attacks based on hijacking the TCP/IP stack. Is Python being used for any of these exploits or tools to defend against them? Data analytics Detect repeated sequence numbers – Man in the Middle Attack As simple as 5 lines of Python code import scapy, start sniffing packets, pull together all packets – make list of associated packets Can pull together all packets inside of stream Time specific source communicates with specific destination Bro – intrusion detection suite Built into Security Onion – Doug Burks FLOSS Weekly episode 296 with Bro developers What are some activities that you do on a regular basis for which you would turn to another language or toolchain, rather than using Python? PowerShell – The Python of Windows Whitelisted and ubiquitous Password cracking – compiled language like C or assembly For anyone who is interested in getting involved in the security industry, and penetration testing in particular, what resources or tools would you recommend? Developers make the best InfoSec professionals Lots of jobs and opportunities Developer -> Systems Administration -> Information Security Security conferences – BSides, Defcon, Black Hat Online capture the flag challenges (google it) – good practice for critical thinking and using code for security exercises Get involved in the industry – Meetups, etc. 
SANS Institute course, Python for Penetration Testers, SEC573 by Mark Baggett – sans.org Lots of free online resources Violent Python PicoCTF Counter Hack Challenges Picks Tobias Authy OpenWRT TP-Link Archer C7 Schemas For The Real World by Carina C. Zona The Soul of Software by Avdi Grimm China Miéville Chris Rapscallion Munich Dark Write Marginal Way Frankie and Johnny’s pyenv Mark Baggett Corelabs impacket Google Labs – Rekall Adams peanut butter cup fudge ripple cheesecake BSides security conference Keep in Touch Twitter: @markbaggett In Depth Defense The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA
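The show notes above mention detecting repeated TCP sequence numbers with a few lines of Python, and list the `Counter` class from `collections` among the defensive tools. A minimal sketch of that idea follows. It is not the code discussed in the episode: the packet tuples and the helper name `repeated_sequence_numbers` are hypothetical, and the capture step (which the notes attribute to scapy) is replaced by hard-coded data so the example runs without extra dependencies.

```python
from collections import Counter

# Hypothetical captured TCP segments as (source, destination, sequence number).
# In practice these would come from a packet capture; plain tuples are used
# here so the sketch has no third-party dependencies.
packets = [
    ("10.0.0.5", "10.0.0.9", 1000),
    ("10.0.0.5", "10.0.0.9", 1460),
    ("10.0.0.5", "10.0.0.9", 1460),  # same sequence number seen again
    ("10.0.0.7", "10.0.0.9", 500),
    ("10.0.0.5", "10.0.0.9", 2920),
]


def repeated_sequence_numbers(segments):
    """Return {(src, dst, seq): count} for every segment seen more than once."""
    counts = Counter(segments)
    return {key: n for key, n in counts.items() if n > 1}


suspicious = repeated_sequence_numbers(packets)
for (src, dst, seq), n in suspicious.items():
    print(f"{src} -> {dst}: sequence number {seq} seen {n} times")
```

Repeated sequence numbers also occur with ordinary TCP retransmissions, so a hit from a check like this is a lead to investigate rather than proof of a man-in-the-middle attack.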

6/3/2015 • 1 hour, 14 minutes, 30 seconds

Jacob Kaplan-Moss on Addressing Cultural Issues in Tech

Read all of our show notes and find more information about us at podcastinit.com Brief Introduction Date of recording – May 18th, 2015 Hosts – Tobias Macey and Chris Patti Follow us on iTunes, Stitcher or TuneIn Give us feedback! (iTunes, Twitter, email, Disqus comments) Overview – Interview with Jacob Kaplan-Moss Interview with Jacob Kaplan-Moss Introductions How were you first introduced to Python? So, we wanted to invite you on the show to discuss the keynote that you gave at this year’s PyCon. Can you tell us what you mean when you say that you’re a mediocre programmer and why that is such an important admission to make? What are some ways that we can change the tone of the conversation around programming skill? What do we gain by admitting to ourselves and others that we are not all phenomenal engineers? Where does the myth of exceptional vs terrible programmers come from? Can you provide some examples of times that you came in contact with this narrative? How do you think hiring tactics in technology companies contribute to this misconception and how can they be more accepting of average programmers? What are some ways that we can work toward eradicating the myth of the 10x programmer? Thinking about our industry’s problems retaining women and other undervalued groups, do you think the way many managers do performance reviews plays a role? If so, how can we do better? What Works For Women At Work Can you tell us about some other ongoing narratives in the technology industry that you find equally as damaging as our misconceptions around skills and knowledge? – Tobias indie.vc Picks Tobias True Ability Manjaro Linux Vultr VPS Mage Wars Chris K is for Kriek Trello Dan Carlin’s Hardcore History Jacob Kaplan-Moss Hello Web App What Works For Women At Work Why Women Leave Tech: What the Research Says Library Extension for Chrome and Firefox Keep In Touch @jacobian

5/26/2015 • 49 minutes, 35 seconds

Jonathan Slenders Talks About Prompt Toolkit

Visit our site at podcastinit.com for more show notes and news. Brief Introduction Date of recording – May 17th, 2015 Hosts – Tobias Macey and Chris Patti Follow us on iTunes, Stitcher or TuneIn Give us feedback! (iTunes, Twitter, email, Disqus comments) Overview – Interview with Jonathan Slenders Interview with Jonathan Slenders Introductions How were you first introduced to Python? -Chris What inspired you to create the python-prompt-toolkit? What are some design considerations that you made when building prompt-toolkit? Make minimal use of inheritance Overly strong coupling Better clarity for the API of your library Completely event driven / asynchronous No global state ptpython completion benefits from asynchrony – The jedi completion library is too slow – completion happens in its own thread You have built a number of projects that use the prompt-toolkit as a core component, did you have them in mind from the beginning, or are they experiments to test the capabilities of the toolkit? tmux rewrite in Python, abandoned, original motivation for prompt-toolkit ptpython pgcli ptpdb pyvim Do you intend to bring PyVim to feature parity with Vim, or is it just intended for experimentation? Short answer: Don’t know – but will probably never be in full parity with Vim What inspired you to create ptpython and why did you choose to make it a stand-alone project rather than extending IPython? How difficult was it to integrate with IPython and what were the benefits? IPython has its own event loop – this presented difficulties as prompt-toolkit has its own as well What are some of the most interesting uses that you have seen of the prompt-toolkit? PyVim – really challenged the design pgcli Picks Tobias vimsert Johnny Cash Project Interstellar Chris Grimm Telekinesis pandoc vimpager Homebrew Cask Jonathan Slenders Belgian Beer Rochefort Western European Folk Dancing Keep in touch Twitter – @jonathans GitHub – jonathanslenders

5/19/2015 • 40 minutes, 53 seconds

Ned Batchelder

Visit podcastinit.com for information about the show and links to our iTunes and Stitcher feeds. Brief Introduction Date of recording – May 4th, 2015 Hosts – Tobias Macey and Chris Patti Overview – Interview with Ned Batchelder Follow us on iTunes, Stitcher or TuneIn Give us feedback! (iTunes, Twitter, email, Disqus comments) You can donate (if you want)! Interview with Ned Batchelder Introductions How did you get introduced to Python? Zope … Implemented in Python How did you get started as the organizer for Boston Python Meetup? History is long and varied Started – 6 people sitting around a coffee table 5 or 6 years Co-organizer Jessica McKellar Built structures to help keep the community going Weekend Python Workshop People ‘adjacent’ to the male members – wives, mothers, etc. “What comes next” from weekend workshops – became Project Night How much of your time ends up being dedicated to the Python community? Also maintainer of coverage.py Active on Freenode IRC #python 20 hours a week What are your goals for the Boston Python community? Continue to grow More events, different events? chipy – Chicago UG very active – 1 on 1 mentoring program Smaller events – 5 person events – study groups All levels not just beginners Computational Biologists – study genomics Three user groups PyLadies Boston Django Boston Boston Python Meetup What do you find to be the most important thing(s) for building a healthy community (particularly in reference to programming)? Consistency – good to know what to expect Pick a cadence – don’t burn out Speakers aren’t superheroes, they’re just people. ‘Everyone has at least one talk in them’. Value in having a blog, twitter stream – people talk back to you and by correcting your mistakes everyone benefits. How do you keep people engaged outside of the monthly meetings? 
Meetup.com – requires moderation python.org mailing lists – unmoderated – low traffic Need to do more in that regard What do you like the most/least about the Python community? Communities can improve – IRC has gotten better Turmoil on PSF mailing list over election for directors How do you strike a balance between sponsors and the rest of the community? Do you have policies around sponsored presentations / talks? Tend not to do sponsored talks Microsoft NERD – great benefit to Boston Python Provides monthly space for the group 1 minute slots for sponsors No sales pitches What are the steps I can take to start my own tech community? How can you get the word out? Meetup.com is useful People like free food and beer Be predictable. Pick something sustainable What is the State of Python, from your perspective? No signs of slowing down Ruby people are moving to other environments Python people are still using Python Python 2 to 3 conflict is unfortunate – transition could have been handled more smoothly Python 3 ecosystem is getting much better Next big drama – type hinting proposal Appears to be contrary to one of the basic tenets of the language at first blush Do you feel that Boston will ever have its own regional Python conference? Toyed with bid to bring PyCon to Boston Would require someone stepping up to do it Not sure how a regional conference ‘feels’ as a local event Try to have Boston Python be like a year-long conference Huge undertaking Picks Tobias Scribd Konch DupeGuru Chris The River Cafe Pythonista Rototo – iOS Game Stone Brewing Arrogant Bastard Ned Tox Pythonz Spell Tower Richard Feynman’s Cornell Lectures Keep in Touch Twitter: @nedbat and @bostonpython IRC: nedbat nedbatchelder.com bostonpython.com

5/12/2015 • 1 hour, 15 minutes, 55 seconds

Travis Oliphant

For show notes and other content, visit our site at http://www.pythonpodcast.com?utm_source=rss&utm_medium=rss Brief Introduction Date of recording – Apr 28th 2015 Hosts – Tobias Macey and Chris Patti Overview – Interview with Travis Oliphant Interview with Travis Oliphant Introductions How did you get introduced to Python? I’m curious what inspired you to create NumPy and SciPy? Why did you choose Python for those libraries? Numeric, Jim Hugunin Morphology library in NumArray For those of us who aren’t in the know, can you provide a brief definition of what data science is and how you got involved in it? Term coined by DJ Patil Answer: Anybody who takes data and tries to derive insights from it Nobody really knows what this means Can you tell us the story of how Continuum Analytics came to be? What are some interesting projects that you have worked on with Continuum Analytics? Bokeh Wakari Anaconda Numba Blaze Can you explain a bit about what NumFocus is and how it got started? How can our audience get involved with NumFocus? For someone just starting out in the data science and data analytics space, what advice would you give? Download Anaconda, learn as much Python as you can Google search “Data Analysis in Python” iPython Notebooks in data analysis R community Meetups Online classes R Community can be helpful Of your myriad achievements, what are you most proud of? Picks Tobias Used bookstores The Book Barn Cloudy with a Chance of Meatballs Kickin’ it Old School Chris Kids In The Hall MFA Boston Art in Bloom CodeNewbies Apple 27″ Retina iMac 5K Travis Oliphant Data Carpentry Tracy Teal (@tracykteal) Patterned on Software Carpentry Brain Science Podcast – Ginger Campbell, MD Money, Bank Credit and Economic Cycles Travis Contacts Twitter: Travis – @teoliphant NumFocus – @numfocus Continuum Analytics – @ContinuumIO The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

5/4/2015 • 52 minutes, 16 seconds

Kivy Core Developers

You can view all of the show notes for every episode at http://podcastinit.com?utm_source=rss&utm_medium=rss Brief Introduction Date of recording – Apr 21st 2015 Hosts – Tobias Macey and Chris Patti Overview – Interview with members of the Kivy core development team Interview with Kivy Core Developers Introductions How did you get introduced to Python? How did the Kivy project get started? What made you choose Python as the basis for Kivy? What were some influences on and inspirations for Kivy’s design? Raymond Hettinger – Beyond Pep 8 One of the amazing things about Kivy is that it’s comparatively simple to learn and get started with. Did this ease of use occur by design or accident? What were some of the biggest challenges to designing or implementing Kivy? If you could start the project over, what would you do differently? What are some of the most interesting things you’ve seen Kivy used for? Gabriel Pettier – http://www.tangibledisplay.com/en/?utmsource=rss&utmmedium=rss Mathieu Virbel – https://www.digital-stories.fr/?utmsource=rss&utmmedium=rss and https://vimeo.com/80051846?utmsource=rss&utmmedium=rss What are some changes/features that you are particularly excited about for the future of Kivy? Wiki for roadmap to 2.0 PyJnius PyObjus Kivy-iOS Buildozer Kivy Remote Shell Plyer Are there any platforms/operating systems that you are trying to add support for (e.g. Sailfish OS, Ubuntu Phone, Firefox OS)? Is there anything in particular that you would like to ask for our listeners to help with? Google Summer of Code – If you didn’t get accepted, DO it anyway! Start small – documentation fixes Fix issues Huge backlog – help answering questions Maintainers for subprojects – like PyJnius Sponsors – Kivy core team looking for new hardware Increase unit test coverage If you find a bug submit a test case Picks Tobias Zeal CommitStrip Chris Jack’s Abbey Smoke & Dagger Woman in Gold Mathieu Virbel YAPF Yet Another Python Formatter Learn Chinese With Cats! 
Rince Cochon Akshay Aurora Mangoes! Tic-Tac-Toe machine controlled by Kivy Ryan Pessa E-Cigarettes – The MilkMan by Vaping Rabbit Gabriel Pettier I3WM Tiling window manager Boulet Corp SMBC Contacting the Kivy Core Team Kivy.org – About Us page The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

4/27/2015 • 1 hour, 30 minutes, 33 seconds

Reuven Lerner

Full show notes can be found at http://podcastinit.com/episode-2-reuven-lerner.html?utm_source=rss&utm_medium=rss Episode 2 Brief intro Recording date/time Hosts Overview Reuven Lerner Interview Please introduce yourself How did you get introduced to Python? How did you break into the field of providing Python trainings? What are the most common languages that your students are coming from? What are some of the biggest obstacles that people encounter when learning Python? Where does Python draw the inspiration for its object system from? In what way(s) does learning Python differ from learning other languages? What sorts of materials/mediums do you use for training people in Python? Python Tutor Do you use your book (Practice Makes Python) as follow up material for your trainings? In your freelance work, what portion of your projects use Python? Ruby is Oscar, Python is Felix Have you seen a change in the demand for Python skills in the time between when you first started using it and now? What types of projects would cause you to choose something other than Python? Picks Reuven Lerner Daily Tech Video Mindless Eating: Why We Eat More Than We Think by Brian Wansink Age of Ambition: Chasing Fortune, Truth, and Faith in the New China by Evan Osnos Chris Patti Spencer Trappist Ale Rich Hickey’s The Value of Values YouCompleteMe – Vim auto-completion SizeUp for OSX Tobias Macey CheckIO – Gamified practice programming Snap Circuits Nvidia Shield Tablet Samson Go Mic Portable USB Condenser Microphone Zoho Apps Closing remarks Reuven Contact: Website blog Twitter: @reuvenmlerner

4/23/2015 • 1 hour, 7 minutes, 31 seconds

Thomas Hatch

Full show notes can be found at http://podcastinit.com/episode-1-thomas-hatch.html?utm_source=rss&utm_medium=rss Brief Intro Hosts Overview Python at Chefconf! Plug for Talk Python To Me Thomas Hatch Interview Picks Thomas Hatch Flow Based Programming IOFlo Imagine Dragons Chris Patti Stone Imperial Russian Stout Python One Liner Games Boston Python User Group Tobias Macey Noisli CopyQ Pelican Moving From Heroku to AWS With Salt Part 1 Moving From Heroku to AWS With Salt Part 2 Closing Remarks

4/11/2015 • 1 hour, 6 minutes, 50 seconds

Podcast.init - Introduction

Welcome to the first episode of a new podcast focused on bringing you the stories of the people who make the Python language and ecosystem great. Outline Introduction Brief Host Biographies Why We’re Doing This Why We Love Python & Favorite Tools Thank You Picks! Picks Tobias Summoner Wars Dbeaver KDE Connect Playerctl Chris ptpython Duchesse de Bourgogne The intro and outro music is from Requiem for a Fish (The Freak Fandango Orchestra) / CC BY-SA 3.0

3/21/2015 • 27 minutes, 23 seconds