
The Data Scientist Show

English, Technology, 1 season, 89 episodes, 5 days, 7 hours, 12 minutes
About
A deep dive into data scientists' day-to-day work, the tools and models they use, how they tackle problems, and their career journeys. This podcast helps you grow a successful career in data science. Listening to an episode is like having lunch with an experienced mentor. Guests are data science practitioners from various industries, AI researchers, economists, and CTOs of AI companies. The show is hosted by Daliana Liu, a senior data scientist. Follow Daliana on Twitter (https://twitter.com/DalianaLiu) for more updates on data science, career, and this podcast.

Why data scientists are tired, six real data scientists' frustrations - The Data Scientist Show #089

Daliana interviewed 6 data scientists from her meetup in New York City. It's a unique episode where you get to hear the real frustrations of data scientists. We talked about struggles working in healthcare, finance, data quality and AI, how to advocate for yourself, and how to align with your managers. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
4/17/2024 - 42 minutes, 22 seconds

Why 80% of A/B tests fail, how to 10X your experimentation velocity - Kristi Angel - The Data Scientist Show #088

Most experiments fail. Kristi Angel shares her expertise on scaling experimentation and avoiding common A/B testing pitfalls. Learn five things that can help boost test velocity, how to design impactful experiments, and how to leverage knowledge repos. (Chapters below) Kristi Angel’s LinkedIn: https://www.linkedin.com/in/kristiangel/ Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ (00:00:00) Intro (00:01:26) Why do most experiments fail? (00:07:05) Mistakes in choosing metrics (00:10:05) Is revenue a good metric? (00:13:18) Split metrics in three ways (00:15:10) Daliana's story with too many category breakdowns (00:16:59) What makes the best data science team? (00:19:24) Data scientists working in a silo vs in a data science team (00:21:15) Building a knowledge center (00:23:40) Example of a knowledge center; nuances of experimentation (00:26:09) How many metrics and variants? (00:30:56) How to reduce noise - CUPED (00:33:01) Future of A/B testing (00:38:33) Q&A: Low statistical power
4/8/2024 - 44 minutes, 7 seconds

From physics PhD to data science leader, unexpected challenges in survey data, Python vs R, EDA best practices, building MLOps toolkit - Julia Silge - The Data Scientist Show #087

Julia Silge is an engineering manager at Posit PBC, formerly known as RStudio, where she leads a team of developers building open source software for MLOps. Before Posit, she finished a PhD in astrophysics, worked for several years in the nonprofit space, and was a data scientist at Stack Overflow, where some of her most public work involved the annual developer survey. We talked about MLOps tools, challenges in survey data, text analysis, and balancing her interests in data science and engineering. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ (00:00:00) Introduction (00:00:56) Getting into data science (00:04:50) Transition from data centers to engineering manager (00:14:04) Common challenges in tool development (00:17:38) Challenges with survey data (00:26:47) Engineering skills for data scientists (00:28:59) Balancing roles (00:34:49) Developing skills in Exploratory Data Analysis (EDA) (00:39:19) Python vs. R for data analysis (00:44:40) Exciting aspects in career and personal life
3/30/2024 - 46 minutes, 21 seconds

Why he created Pandas, the future of data systems, why he left his CTO role to become a chief architect - Wes McKinney - The Data Scientist Show #086

Wes McKinney is the co-creator of the pandas library and a cofounder of Voltron Data. He is currently a Principal Architect at Posit and an investor in data systems. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ Wes' LinkedIn: https://www.linkedin.com/in/wesmckinn/ (00:00:00) Introduction (00:00:44) How Pandas Started (00:06:40) Voltron Data (00:10:03) Benefits of Easy-to-Use Data Tools (00:13:20) The Rise of New Data Tools (00:18:07) Choosing Tools: Vertical or Flexible? (00:23:01) Big Models and Data Tools (00:29:29) Challenges in Building a Product (00:31:28) Becoming a Top Architect (00:34:55) Missed Aspects of Previous Roles (00:39:04) A Busy Week: Advising, Designing, Investing (00:43:42) Improving Open Source (00:45:24) How to Decide What to Work On (00:46:28) What he’s learning now (00:47:56) Excitement in Career and Life (00:48:29) Using ChatGPT for Learning (00:50:27) Future Impact Goals
3/22/2024 - 52 minutes, 28 seconds

From financial analyst to director of analytics, how to get promoted quickly, 7 elements of influence - Christopher Fricker - The Data Scientist Show #085

Christopher Fricker is a senior director of analytics and BI at Renaissance Learning. He started his career in finance and later became a data science consultant working with Meta, Netflix, and pre-IPO tech companies doing analytics. We talked about the mental models that helped him grow from a finance analyst to an analytics leader. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Chris’ LinkedIn: https://www.linkedin.com/in/christopherfricker/ Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ (00:00:00) Introduction (00:01:45) How to get promoted quickly (00:08:40) Power vs authority (00:11:21) First principles thinking (00:32:38) ROI of a data team (00:41:01) How to be persuasive (00:55:27) All data is wrong (00:56:57) How he audits the data (01:01:28) How to make someone help you at work
3/15/2024 - 1 hour, 15 minutes, 2 seconds

Adapters: the game changer for fine-tuning - Geoffrey Angus - The Data Scientist Show #084

I interviewed Geoffrey Angus, ML team lead at Predibase, to talk about why adapter-based training is a game changer. We started with an overview of fine-tuning and then discussed five reasons why adapters are the future of LLMs. Later we also shared a demo and answered questions from the live audience. Try fine-tuning for free: https://pbase.ai/GetStarted Geoffrey’s LinkedIn: https://www.linkedin.com/in/geoffreyangus Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ (00:00:00) Intro (00:01:19) What is Fine-tuning? (00:08:18) Utilizing Adapters for Fine-tuning Enhancement (00:09:50) 5 reasons why adapters are the future of LLMs (00:26:34) Common Mistakes in Adapter Usage (00:28:34) Training Your Own Adapter (00:32:23) Behind the Scenes of the Adapter Training Process (00:37:51) Config File Guidance for Fine-Tuning (00:39:41) Debugging Strategies for Suboptimal Fine-Tuning Results (00:42:23) User Queries: Creating a LoRA Adapter and Future Support (00:51:06) Key Takeaways and Recap
3/8/2024 - 52 minutes, 45 seconds

Landing a job by analyzing Seattle's crime data, from data scientist to founder of Interview Query, building a lifestyle business - Jay Feng - The Data Scientist Show #083

Jay Feng created a viral project using Seattle crime data and later got into data science. He then founded "Interview Query", which helps data scientists get jobs. We'll talk about how he landed his data science job through his blog, and his journey from data scientist to founder. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ Jay Feng's LinkedIn: https://www.linkedin.com/in/jay-feng-ab66b049/ Jay Feng's YouTube: https://www.youtube.com/c/DataScienceJay (00:00:00) Introduction (00:01:33) From engineer to data scientist (00:03:32) Got a job through a project (00:05:59) Daliana's portfolio project with Zillow (00:09:39) From data scientist to entrepreneur (00:13:40) "Tinder" for jobs (00:15:31) How he chose companies to work for (00:16:22) Why he became an entrepreneur (00:18:02) How many hours does he work (00:19:19) Challenges when building "Interview Query" (00:20:44) Speed vs scale (00:22:36) Growth hacks he used (00:24:48) YouTube vs newsletter (00:27:46) Lessons he learned as a CEO (00:29:42) How to grow from tech employee to founder (00:32:26) How he defines success (00:35:05) If you have a business idea for Jay
2/29/2024 - 36 minutes, 6 seconds

Case studies from the GenAI frontier, scaling ML teams, from biologist to machine learning consultant - Erik Gafni - The Data Scientist Show #082

Erik Gafni builds AI systems and teams. He founded Eventum AI (https://bit.ly/eventum-ai), an ML consulting company working with high-growth startups. We talked about GenAI projects he worked on, how he built production ML systems, how to scale ML teams, and his journey from biologist to ML researcher. Interested in working with Erik: https://bit.ly/erik-consulting Erik's LinkedIn: https://bit.ly/erik-gafni-LI (00:00:00) Introduction (00:01:18) Is GenAI overhyped? (00:03:48) Ascent translation with AI (00:11:17) Social media app with AI (00:13:19) Stable diffusion model evaluation (00:15:16) "Consult-to-hire" model (00:16:54) AI in biotech (00:22:06) Self-supervised learning (00:30:43) How he hires people (00:32:40) Research vs production (00:35:19) Is AGI coming? (00:36:51) New trends in GenAI (00:41:07) Data quality in GenAI (00:42:19) Philosophy in LLMs (00:47:38) OpenAI vs Open Source (00:53:20) Mistakes he made (00:57:02) How did he get into ML
2/24/2024 - 1 hour, 3 minutes, 2 seconds

Data science job market in 2024, soft skills for interviews, AI engineering - Jay Feng - The Data Scientist Show #081

Jay Feng is the CEO of Interview Query, a service that helps data scientists get jobs. Previously he worked as a data scientist at Nextdoor and Monster. We talked about the data science job market, the rise of AI engineering, and the soft skills people overlook during interviews. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ Jay Feng's LinkedIn: https://www.linkedin.com/in/jay-feng-ab66b049/ Jay Feng's YouTube: https://www.youtube.com/c/DataScienceJay (00:00:00) Introduction (00:01:11) Data science job market in 2024 (00:09:13) Build projects with AI (00:16:19) Soft skills in interviews (00:23:18) Daliana's story on "socializing ideas" (00:28:38) Common mistakes in interviews (00:35:30) Product DS vs ML interviews (00:36:27) Product analytics interview questions (00:39:18) Career transition in DS (00:43:04) Jay's career journey (00:45:38) Is there a principal data analyst? (00:51:52) AI engineer (00:54:28) New roles vs obsolete roles in DS (01:04:46) Is data science dead?
2/16/2024 - 1 hour, 6 minutes, 39 seconds

How to handle being laid off (as data scientists), severance negotiation, full-time employment vs independent consultant - The Data Scientist Show #080

We are joined by two data scientists who have firsthand experience with layoffs. We’ll talk about how to negotiate severance packages, how to handle stress, strategies for job hunting post-layoff, and how to reduce risks in full-time employment. Working with Daliana on personal branding: https://forms.gle/heNuZzaHjaAMQwLu6 Her email: [email protected] Guests: Susan Shu Chang (LinkedIn: https://www.linkedin.com/in/susan-shu-chang/ Newsletter: susanshu.substack.com) and Sundar Swaminathan (LinkedIn: https://www.linkedin.com/in/sswamina3/ Website: https://www.sundarswaminathan.com/)
2/9/2024 - 1 hour, 6 minutes, 33 seconds

From data analyst to sales engineer, personality-based career design, sales skills for data people - Jenny Wu - The Data Scientist Show #079

Jenny Wu is a data analyst turned sales engineer for data products at Hex. We talked about sales engineer vs data analyst, how to design a career based on your personality, and how to transition into a customer-facing role. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Jenny’s LinkedIn: https://www.linkedin.com/in/jenny-wu-... Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ (00:00:00) Introduction (00:01:34) What is a Sales Engineer? (00:09:35) Sales Engineering Day-to-Day (00:13:09) Challenge in sales (00:21:37) Traits of Successful Salespeople (00:30:32) Stakeholder Engagement (00:36:24) Getting into customer-facing roles (00:43:55) Quitting her job to travel the world (00:48:05) Advice on Career Breaks (00:50:39) Embedding Career and Personal Goals (00:51:57) How do you achieve happiness?
2/1/2024 - 57 minutes, 26 seconds

The future of data science teams, integrating AI into data science workflows, building data apps for stakeholders - Barry McCardel - The Data Scientist Show #078

Barry McCardel is the cofounder and CEO of Hex (free trial: hex.tech/dsshow), a collaborative data workspace. Their customers include Fivetran, Notion, and Anthropic. We talked about what the future of data teams looks like, how to tackle the challenges of data team collaboration, and how to leverage AI in data science workflows. 60-day Free Trial: hex.tech/dsshow Barry’s LinkedIn: https://www.linkedin.com/in/barrymccardel (00:00:00) Introduction (00:01:25) Is AI replacing data scientists? (00:06:08) Are data science teams getting smaller? (00:09:54) What is Hex? (00:11:24) How to communicate with stakeholders (00:24:29) Should data scientists be full stack? (00:31:23) How data teams measure ROI (00:33:35) Quantitative vs qualitative analysis (00:35:33) When you shouldn't use data? Data vs product intuition (00:41:39) How to hire your first data team? (00:48:59) Is the modern data stack dead? (00:53:55) GenAI in data science workflows (00:59:03) Future of data scientists (01:02:30) New features in Hex
1/21/2024 - 1 hour, 4 minutes, 50 seconds

Product data science for Microsoft AI, the data scientist's role in GenAI, how to deal with burnout - Sid Sharan - The Data Scientist Show #077

Siddhartha Sharan is a Senior Data and Applied Scientist at Microsoft, helping product teams make data-driven decisions. Currently he is working on an AI product built with OpenAI APIs for sentiment analysis. We talked about how he evaluates AI products built with large language models at Microsoft, product data science, and how he went from a business background to data science. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Sid’s LinkedIn: https://www.linkedin.com/in/siddharthasharan/ Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ (00:00:00) Introduction (00:05:20) How Microsoft evaluates AI products (00:16:17) Using OpenAI API for sentiment analysis (00:25:29) Microsoft data science team culture (00:26:52) DS, PM collaboration (00:28:29) Three steps to build trust in data science (00:30:13) How he got into Microsoft (00:34:09) Level up in Genetech (00:36:09) ML engineer vs Product DS (00:37:43) Core skills in product DS (00:40:20) Hiring (00:42:47) How to deal with burnout (00:45:03) Should you overwork to earn trust? (00:45:44) Daliana's story about her first day at Amazon (00:49:54) Will AI replace data scientists? (00:51:32) Data scientist's role in GenAI (00:54:32) How to keep up with GenAI
1/15/2024 - 58 minutes, 57 seconds

How she doubled her salary in a year as a data analyst, SQL in the real world, is job hopping bad? - Jess Ramos - The Data Scientist Show #076

Jess Ramos is a Senior Data Analyst at Crunchbase, a LinkedIn Learning Instructor, and a content creator in the data space. She has a bachelor's degree in Math, Spanish, and Business from Berry University and a master's in Business Analytics from the University of Georgia. Today we’ll talk about SQL in the real world, data analyst vs data scientist, whether job hopping is bad, and how she negotiated her salary. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ Jess’ LinkedIn: https://www.linkedin.com/in/jessramosmsba/ (00:00:00) Introduction (00:01:24) Why Jess left her job at Freddie Mac (00:03:25) Is job hopping bad (00:04:42) How to explain short job stints when interviewing (00:06:49) Jess's day-to-day work and tech stack (00:09:15) SQL in the real world (00:12:10) How to talk data to stakeholders (00:18:33) How Jess prepares for SQL interviews (00:28:11) Data analysts vs data scientists (00:32:11) Choosing a career path (00:47:19) How to ask recruiter questions (00:50:15) Jess's LinkedIn content creation journey (00:59:03) The future of Jess's career (01:03:42) Jess's favorite books
1/5/2024 - 1 hour, 7 minutes, 48 seconds

How we went from "enemies" to allies while working at Amazon, from civil engineering to machine learning and generative AI at AWS - Mehdi Noori - The Data Scientist Show #075

Mehdi Noori is an applied science manager at the Generative AI Innovation Center at Amazon. I used to work with Mehdi while we were at the Machine Learning Solutions Lab at AWS. Before Amazon, Mehdi was a data scientist working on marketing intelligence. Mehdi has a PhD in civil engineering and sustainability from the University of Central Florida. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Mehdi Noori: https://www.linkedin.com/in/mehdi-noori/ Predicting Soccer Goals: https://aws.amazon.com/blogs/machine-learning/predicting-soccer-goals-in-near-real-time-using-computer-vision/
12/6/2023 - 1 hour, 31 minutes, 53 seconds

Why she quit her finance job to become a farmer, exploring a different path from the modern life - Misty Arnold - The Data Scientist Show #074

My friend Misty moved to a farm in Portugal after a 20-year career in finance. We talked about her experience moving from busy corporate life to farm life, where she does a lot of manual work: whether it was challenging, how her finances work, and what advice she has for other people who want to explore a different path outside of modern city life. I hope this episode gives you a different perspective on your career. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ (00:00:00) Introduction (00:11:41) Life on the farm (00:15:46) Her finance plans (00:22:55) Her career journey (00:27:14) What do accountants do (00:32:29) I thought I would be happy (00:41:25) Daliana's personal view about finance; when it's enough for you (00:44:41) Does she feel lonely on a farm? (00:48:39) What if she didn't leave the corporate world? (00:54:07) Does she regret her decision
11/29/2023 - 1 hour, 10 minutes, 28 seconds

Why he left his MLE job for product data science at Meta, data science at Uber, LinkedIn, and TrueCar - Pan Wu - The Data Scientist Show #073

Pan Wu is a senior manager of data science at Meta. We talked about why he moved from machine learning to product data science, projects he worked on at Uber, LinkedIn, and Meta, and how he transitioned from IC to manager. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Pan’s LinkedIn: https://www.linkedin.com/in/panwu/ Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ (00:00:00) Introduction (00:01:30) Why he transitioned from MLE to product DS (00:07:38) Meta data scientists' skill sets (00:15:49) When his interest shifted from MLE to product DS (00:18:04) Is MLE more respected? (00:25:46) A/B testing deep dives in 3 steps (00:28:21) Built a tool at LinkedIn (00:35:52) How to sell your project (00:41:07) Junior vs senior data scientist (00:43:24) From staff data scientist to manager (00:45:18) Explore being a manager (00:46:24) Cultures in Uber, LinkedIn, TrueCar (00:52:09) Data science over the past 10 years (00:55:06) MLE vs DS fun and frustration (00:57:26) Product DS reality (00:59:10) Learning new skills (01:01:39) Mistakes he made (01:06:34) Future of data science (01:08:04) Will data scientists be replaced by AI (01:09:42) Three skills he looks for when hiring
11/19/2023 - 1 hour, 13 minutes, 1 second

Machine learning in cybersecurity, computer vision in sports, from business analyst to ML engineer - Betty Zhang - The Data Scientist Show #072

Betty Zhang is a data scientist currently working at a cloud security company. Previously she was a data scientist at Amazon Web Services. Today we’ll talk about her computer vision projects in sports, data science use cases in cybersecurity, her path from business major to data scientist, and her experience working in startups vs big tech companies. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Betty’s LinkedIn: https://www.linkedin.com/in/betty-zhang-0bb63731/ Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ (00:00:00) Introduction (00:01:21) Computer Vision Project in Sports at AWS (00:12:28) Challenges in computer vision (00:14:02) Time allocation for ML projects (00:15:22) 3 key skills for computer vision (00:17:20) From business analyst to ML engineer (00:18:14) How she got her data scientist job through LinkedIn (00:21:32) How she got into Amazon (00:22:17) Three tech skills needed during Amazon interviews (00:26:11) Why she joined a cybersecurity startup (00:27:22) Three cybersecurity use cases (00:29:47) Anomaly detection (00:30:40) ML for cybersecurity (00:34:43) Tech stacks Amazon vs Startups (00:39:35) Startups vs big tech (00:45:56) Balance learning and impact (00:48:35) Advice for new data scientists
11/12/2023 - 55 minutes, 12 seconds

Stop abusing A/B testing, toxic experimentation culture, how to run A/B tests with rigor - Che Sharma - The Data Scientist Show #071

Che Sharma came back to discuss toxic behaviors in experimentation culture and provide actionable advice on how to handle those situations and how to bring rigor and integrity to designing and analyzing A/B tests. Che was the 4th data scientist at Airbnb and later joined Webflow as an early employee. In 2021 he founded Eppo, a next-gen A/B experimentation platform designed for modern data and product teams to run more trustworthy and advanced experiments. We talked about A/B testing best practices, A/B testing for ML models, and Che’s career journey. Reach out to Che: https://www.linkedin.com/in/chetanvsharma/
11/4/2023 - 1 hour, 3 minutes, 42 seconds

Academia vs. Industry for Machine Learning, Research at Uber AI Labs, ML for Wind Farms - Jason Yosinski - The Data Scientist Show #070

Jason Yosinski was a founding member of Uber AI Labs. He is also a co-founder of Windscape AI, a company dedicated to using custom sensor networks and machine learning to increase the efficiency and sustainability of wind farms. Jason holds a PhD in computer science from Cornell University. We talked about his experience at Uber AI, his research in deep learning, and ML for wind farms. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Jason’s Website: https://yosinski.com/ Jason’s LinkedIn: https://www.linkedin.com/in/jasonyosinski/ Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu (00:00:00) Introduction (00:06:06) His advice for Uber ML teams (00:16:03) From research to industry (00:20:24) ML for wind farms (00:25:40) Metrics for wind energy prediction (00:29:23) Start with a small dataset (00:32:00) ML in academia vs. the industry (00:33:24) Do you need a PhD for ML? (00:38:14) Daliana's story about grad school (00:41:37) The value of a PhD (00:43:13) ML Collective (00:48:36) Technical communication (00:57:21) ML Skillsets (00:59:45) Future of machine learning (01:05:23) Personal development: Hoffman process (01:15:13) Do things that excite you
10/23/2023 - 1 hour, 16 minutes, 9 seconds

Ads forecasting at Netflix and Spotify, how to build your personal moat - Jeff Li - The Data Scientist Show #069

Jeff Li is a senior data scientist at Netflix, focusing on ads forecasting. Previously he was a data science manager at Spotify, where he worked on supply forecasting, demand forecasting, and data infrastructure. He studied business at the University of Southern California. We talked about ads forecasting, the career path of a manager vs an IC, and culture at Spotify vs Netflix vs DoorDash. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Jeff Li’s LinkedIn: https://www.linkedin.com/in/lijeffrey/ Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu
9/14/2023 - 1 hour, 26 minutes, 29 seconds

A/B testing at Airbnb, building next-gen experimentation platform at Eppo - Che Sharma - The Data Scientist Show #068

Che Sharma was the 4th data scientist at Airbnb and later joined Webflow as an early employee. In 2021 he founded Eppo, a next-gen A/B experimentation platform designed for modern data and product teams to run more trustworthy and advanced experiments. We talked about A/B testing best practices, A/B testing for ML models, and Che’s career journey. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Che’s LinkedIn: https://www.linkedin.com/in/chetanvsharma/ Try Eppo for A/B testing: https://www.geteppo.com/ Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu (00:00:00) Introduction (00:01:26) Getting started in data science at Airbnb (00:03:08) Keys to successful A/B testing (00:06:53) Interpreting and communicating A/B test results (00:15:00) A/B testing best practices testing machine learning models (00:41:39) Centralizing experiment analysis (00:53:46) Preparing data scientists for the future (00:59:33) Developing communication skills as a data scientist (01:08:43) Transitioning from individual contributor to manager (01:12:28) The future of experimentation
8/25/2023 - 1 hour, 14 minutes, 15 seconds

From data scientist@Meta to full-time YouTuber (500k+ sub), AI engineering, future of work - Tina Huang - The Data Scientist Show #067

We talked about self-learning, productivity, how Tina navigates her career change and how she thinks AI could change the future of work. Tina's YouTube: www.youtube.com/@TinaHuang1 Lonely Octopus: www.lonelyoctopus.com Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Tina Huang is a data scientist turned YouTube creator with 500k subscribers. She is the founder of Lonely Octopus, an online program helping people gain data science, AI, and freelancing skills. She originally studied pharmacology before transitioning into tech, completing a master's degree in computer science at UPenn. (00:02:38) Transitioning from Data Science to Content Creation (00:06:29) Preparing for Data Science Interviews (00:10:59) Starting a YouTube Channel (00:14:18) Building Multiple Income Streams (00:17:35) Getting Started with AI Skills (00:29:29) Advice for Starting YouTube (00:34:47) Improving Storytelling Skills (00:36:58) Overcoming Procrastination (00:42:33) The Future of Work (01:47:08) Looking to the Future (01:26:49) Income Breakdown
8/10/2023 - 1 hour, 54 minutes, 52 seconds

Making LLMs hallucinate less, how to diagnose ML models, from PM in Google AI to CEO of Galileo - Vikram Chatterji - The Data Scientist Show #066

Vikram is the co-founder of Galileo, an AI diagnostics and explainability platform used by data science teams building NLP, LLM, and computer vision models across the Fortune 500 and high-growth startups. Prior to Galileo, Vikram led Product Management at Google AI, where his team built models for the Fortune 2000 across retail, financial services, healthcare, and contact centers. He has a master's degree from the School of Computer Science at Carnegie Mellon University. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Vikram Chatterji’s LinkedIn: https://www.linkedin.com/in/vikram-chatterji/ "The Mom Test": https://www.amazon.com/The-Mom-Test-Rob-Fitzpatrick-audiobook/dp/B07RJZKZ7F Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu (00:00:00) Introduction (00:04:24) How he got into machine learning (00:06:53) Diagnosing large language models (00:09:56) Addressing model hallucination (00:12:46) Metrics for measuring hallucination (00:17:30) From Google AI to starting Galileo (00:24:08) Developing LLMs and putting them into production (00:32:51) Galileo's diagnostics and explainability platform (00:43:16) Advice for data scientists when joining a startup
8/1/2023 - 1 hour, 26 minutes, 50 seconds

Data Science "Mixed Martial Arts", applied reinforcement learning, scaling AI workloads using Ray - Max Pumperla - The Data Scientist Show #065

Max Pumperla designed his own career path in data science. He is a freelance software engineer at Anyscale, and also a data science professor. We talked about reinforcement learning, open source contributions, Ray for data scientists, and his view on the data scientist's role. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Max’s LinkedIn: https://www.linkedin.com/in/max-pumperla-a8099354/ Max's GitHub: https://github.com/maxpumperla Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu (00:00:00) Introduction (00:09:19) How he got a remote job through Twitter (00:14:06) Introduction to Ray (00:18:52) Reinforcement learning (00:23:56) Key lessons on integrating customer feedback (00:35:12) Flaws in data science job titles (00:45:51) How to be irreplaceable as a data scientist (00:48:55) An unconventional career path as a data scientist (01:12:24) Productivity and work-life balance (01:28:10) Advice for building a personal brand
7/28/2023 - 1 hour, 53 minutes, 28 seconds

Uber's ML Systems (Uber Eats, Customer Support), Declarative Machine Learning - Piero Molino - The Data Scientist Show #064

Piero Molino was one of the founding members of Uber AI Labs. He worked on several deployed ML systems, including an NLP model for Customer Support and the Uber Eats Recommender System. He is the author of Ludwig, an open source declarative deep learning framework. In 2021 he co-founded Predibase, the low-code declarative machine learning platform built on top of Ludwig. Piero's LinkedIn: https://www.linkedin.com/in/pieromolino Predibase free access: bit.ly/3PCeqqw Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu (00:00:00) Introduction (00:01:54) Journey to machine learning (00:03:51) Recommender system at Uber Eats (00:04:13) Projects at Uber AI (00:09:34) Uber's customer obsession ticket system (00:16:01) How to evaluate online-offline business and model performance metrics (00:17:16) Customer Satisfaction (00:28:38) When do you know whether a project is good enough (00:41:50) Declarative machine learning and Ludwig (00:45:32) Ludwig vs AutoML (00:54:44) Working with Professor Chris Re (00:58:32) Why he started Predibase (01:07:56) LLM and GenAI (01:10:17) Challenges for LLMs (01:22:36) Advice for data scientists (01:34:29) Career advice to his younger self
7/4/2023 1 hour, 50 minutes, 5 seconds
Episode Artwork

Data science in transportation, the intersection of operations research and ML - Holger Teichgraeber - The Data Scientist Show #063

Holger Teichgraeber is a Data Science Manager at Archer Aviation. Previously, he worked at Convoy as a Research Scientist on their trucking marketplace, and at various companies in the energy space. Holger has a Bachelor's degree in Mechanical Engineering from Aachen, Germany, and a Master's and Ph.D. from Stanford University, with a research focus on machine learning and optimization applied to energy systems. He regularly writes on LinkedIn, with the goal of showing how to build valuable products at the intersection of machine learning and optimization in production. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Holger's LinkedIn: https://www.linkedin.com/in/holgerteichgraeber/ Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu (00:00:00) Introduction (00:01:28) How he got into operations research (00:02:39) Operations research vs data science (00:04:37) Trucking optimization at Convoy (00:08:42) Optimization problem (00:10:18) Strategic planning on air mobility at Archer (00:13:50) Using simulation and solving a problem (00:16:45) Big data science work vs smaller data science work (00:21:23) Stakeholder management (00:29:28) IC vs Manager (00:32:04) Advice on promotion (00:39:12) Work cultures in Germany and the US (00:41:16) How to handle tight deadlines (00:43:21) Important feedback from his work (00:44:14) How to plan projects (00:44:45) Next big challenge for data science teams (00:45:40) Career growth in the next few years (00:46:01) Connect with Holger
6/26/2023 46 minutes, 53 seconds
Episode Artwork

Tackling data quality issues, 5 pillars of data observability, from management consultant to CEO of Monte Carlo - Barr Moses - The Data Scientist Show #062

Barr Moses is a consultant turned CEO & Co-Founder of Monte Carlo, a data reliability company. She started her career as a management consultant at Bain & Company and a research assistant at the Statistics Department at Stanford University. Later, she became VP of Customer Operations at customer success company Gainsight, where she built the data and analytics team. She also served in the Israeli Air Force as a commander of an intelligence data analyst unit. Barr graduated from Stanford with a B.Sc. in Mathematical and Computational Science. Today, we’ll talk about Barr’s career journey, data reliability and observability, and what it means for data teams. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science. Barr's LinkedIn: https://www.linkedin.com/in/barrmoses/ Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu (00:00:00) Introduction (00:01:24) How she got into data science (00:08:26) Frameworks for data-driven decisions (00:11:20) Is a customer support ticket always bad? (00:15:20) How to quickly find out what is true (00:20:17) Struggles in the data team (00:23:37) Daliana’s story about lineage (00:28:00) People stressed about data (00:28:09) Netflix was down because of wrong data (00:30:40) Common issues with data quality (00:33:14) 5 pillars of data observability (00:39:14) How does Monte Carlo help data scientists (00:43:08) Build in-house vs adopt tools (00:45:48) How Daliana fixed a data quality issue (01:02:44) How to measure the impact of the data team (01:09:09) Mistakes she made (01:15:28) Beat the odds
5/18/2023 1 hour, 21 minutes, 31 seconds
Episode Artwork

Is search dead? Google vs ChatGPT, from Google Search to enterprise search at Glean, machine learning in search, tech layoffs - Deedy Das - The Data Scientist Show #061

Deedy Das is a founding engineer at Glean, an enterprise search startup. Previously, he was a Tech Lead at Google Search working on query understanding and the sports product in New York, Tel Aviv, and Bangalore. Before that, he was an engineer at Facebook New York and graduated from Cornell University. Outside of work, Deedy writes on his blog. He published a viral resume template and his work on exposing grading flaws in the Indian education system. He also enjoys running marathons, road cycling, and playing cricket. Today we’ll talk about the search projects he worked on at Google, why he left Google, his current work at Glean, and his thoughts on whether Google is doomed because of ChatGPT. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science. Deedy's Twitter: https://twitter.com/debarghya_das?s=20 Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu (00:00:00) Introduction (00:01:52) What is search (00:04:33) Query understanding (00:12:46) Google vs ChatGPT (00:18:24) Fixing a bug for Sundar Pichai (00:27:33) Why he left Google (00:30:32) How to get into search (00:34:38) Enterprise search at Glean (00:46:55) Advice for people who got laid off (00:48:41) What do search engineers do (00:51:37) How he evaluates candidates (00:53:58) Future of search (00:57:16) Why the web is declining (00:59:25) Copilot and AI-powered developer tools (01:03:46) Indian startup ecosystem (01:07:45) India vs Silicon Valley (01:09:48) How he grew 30k followers on Twitter (01:13:28) Daliana and Deedy’s challenge with social media (01:19:31) Career mistakes he made
2/21/2023 1 hour, 27 minutes, 6 seconds
Episode Artwork

The 100-hour work week of a self-taught machine learning researcher, how he got into Google Brain, why he started Omni - Jeremy Nixon - The Data Scientist Show #060

Jeremy Nixon is a machine learning researcher, software engineer, and startup founder. Previously he was a software engineer at Google Brain working on deep learning. Now, he is the co-founder and CEO of Omni, building an immersive information retrieval system for you and your team. He studied applied math at Harvard University. Today we’ll talk about how he got into Google Brain, his 3-month self-learning plan to learn machine learning, his startup, and how he has pursued his goal relentlessly since 2016. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science. Jeremy's Twitter: https://twitter.com/JvNixon Jeremy's Blog: https://jeremynixon.github.io/ Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu Jeremy's LinkedIn: https://www.linkedin.com/in/jeremyvnixon (00:00:00) Introduction (00:01:50) Research in Google Brain (00:03:37) How he got into Google Brain (00:07:56) His 3-month plan to learn ML (00:17:55) The 100-hour workweek (00:33:26) What if he is tired (00:39:59) Why he founded Omni (00:44:24) Data science problems in Omni (00:54:42) Future of machine learning (00:57:51) Silicon Valley is very accessible (00:59:47) The golden handcuffs (01:06:58) From data scientist to full-stack engineer (01:09:06) Close-minded data scientists (01:24:10) Advice to ML learners (01:29:41) Something he wished that he did when he was younger (01:37:25) The future of his career (01:42:17) Connect with Jeremy
2/20/2023 1 hour, 42 minutes, 52 seconds
Episode Artwork

The power of error analysis, tree models for search relevancy, what ChatGPT means for data scientists - Sergey Feldman - The Data Scientist Show #059

Sergey Feldman is the head of AI at Alongside, providing mental health support for students. He is also a Lead Applied Research Scientist at Allen Institute for AI, where he built an ML model that improved search relevancy for scientific literature. Sergey has a PhD in Electrical and Electronics Engineering from the University of Washington. Today we’ll talk about machine learning for search, his consulting project for the Gates Foundation, AI for mental health, and career lessons. Make sure you listen till the end. If you like the show, subscribe, leave a comment, and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Sergey's LinkedIn: https://www.linkedin.com/in/sergey-feldman-6b45074b/ Data Cowboys: http://www.data-cowboys.com/ Sergey Feldman: You Should Probably Be Doing Nested Cross-Validation | PyData Miami 2019: https://www.youtube.com/watch?v=DuDtXtKNpZs December 4th, 2018 - Breakfast with WACh with Dr. Sergey Feldman, PhD: https://www.youtube.com/watch?v=vA_czRcCpvQ (00:00:00) Introduction (00:01:24) Machine learning skeptic (00:03:02) Tree-based models for search relevance (00:14:34) How to do error analysis (00:19:20) Nested cross-validation (00:21:34) Model evaluation (00:30:43) Error analysis common mistakes (00:33:37) How to avoid overfitting (00:35:56) Consulting project with Gates Foundation (00:41:16) Tree-based models vs linear models (00:45:19) Working with non-tech stakeholders (00:50:20) Chatbot for teens’ mental health (00:54:32) Can ChatGPT provide therapy? (00:58:12) How he got into machine learning (01:02:12) How to not have a boss (01:03:46) Feelings vs Facts (01:09:02) Future of machine learning (01:11:30) How to prepare for the future (01:13:39) AutoML (01:17:12) His passion for large language models
1/24/2023 1 hour, 19 minutes, 43 seconds
Episode Artwork

How to build data science muscle memory, DeepChecks -- an open source ML testing suite - Philip Tannor - The Data Scientist Show #058

Philip Tannor is the Co-Founder and CEO of Deepchecks, a Python package to run checks for machine learning models. Previously, he was the head of a data science group at the Israel Defense Forces. He has a master's degree in engineering from Tel Aviv University; his thesis was about a new algorithm that combines neural networks with gradient-boosting decision trees. Today we’ll talk about his career journey, how to build your data science muscle memory, the algorithm he worked on, and how to check ML models. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science and career. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Philip’s LinkedIn: https://www.linkedin.com/in/philip-tannor-a6a910b7/?originalSubdomain=il Augboost: https://medium.com/@ptannor/augboost-like-xgboost-but-with-few-twists-e4df4017a5c4 (00:00:00) Introduction (00:01:17) How did he get into ML (00:02:52) Data science in the military (00:08:15) How to take feedback (00:13:24) Handling criticism (00:15:12) What he worked on (00:18:18) Testing deployment (00:21:28) How to build the data science muscle memory (00:27:09) Improving the skills of data scientists (00:30:42) His thesis in grad school (00:36:59) Combine NN and gradient boosting (00:40:05) Aug boost (00:41:15) Tools he uses (00:45:58) Deepchecks (00:50:46) Most challenging part of building Deepchecks (00:52:05) How can people contribute (00:53:40) Behind the scenes (00:56:09) Deciding how to fix or improve the model (01:00:49) Advice for those who want to create open-source projects (01:04:07) Features to add for the enterprise product (01:06:57) About his life and career right now (01:08:27) Connect with Philip
12/7/2022 1 hour, 8 minutes, 51 seconds
Episode Artwork

The Daliana Special: how I got into data science, 5 things only experienced data scientists know, and why I started "The Data Scientist Show" - Daliana Liu #057

Who is Daliana? This is a conversation I had in 2021 with Harpreet Sahota. I talked about my unexpected journey to data science all the way back in high school, things I wish I had known earlier about my career, the projects I worked on, what it is like to be a quote-unquote influencer on LinkedIn, and more. If you want more content from me, I write about data science, career, and nerdy jokes on my LinkedIn, and you can subscribe to my very infrequent newsletter at dalianaliu.com. I’m curious what you think about this episode, so leave a comment on YouTube or send a DM on LinkedIn. Hope you enjoy the Daliana special! Daliana's Newsletter: https://dalianaliu.com Daliana's Twitter: https://twitter.com/DalianaLiu Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Harpreet's LinkedIn: https://www.linkedin.com/in/harpreetsahota204/ The Artists of Data Science podcast: https://theartistsofdatascience.fireside.fm/ (00:00:00) Introduction (00:02:52) Where did Daliana grow up (00:05:19) Daliana in high school (00:07:11) How she got into data science (00:11:36) Why is writing important for data scientists (00:15:51) How to write better (00:20:56) Career lessons you didn't learn in school (00:27:40) Imposter syndrome (00:31:29) Day-to-day work as a data scientist (00:36:16) Most common mistakes data scientists make (00:39:41) Data Analyst vs. Data Scientist (00:42:30) What is the science in data science? (00:44:51) Can everyone be a data scientist (00:49:21) LinkedIn profile tips for job search (00:52:59) How she creates content (00:54:11) Being a data scientist "influencer" (00:56:04) Why she started "the data scientist show" (01:01:16) Women in data science (01:06:39) What's her legacy (01:09:43) What is she reading (01:14:21) Connect with Daliana
11/24/2022 1 hour, 15 minutes, 20 seconds
Episode Artwork

How he carved his own path at Airbnb, from data engineer to CEO of Mage - Tommy Dang - the data scientist show #056

Tommy Dang is the Co-founder and CEO of Mage, a data ingestion and transformation pipeline for data engineers (https://github.com/mage-ai/mage-ai). Previously, he worked on data engineering and machine learning engineering at Airbnb. He has a Bachelor of Science degree from UC Berkeley, where he studied economics, history, and sociology. Today we’ll talk about how he learned engineering and machine learning after college, data tools and ML tools he built at Airbnb, performance review, and how he navigates his career. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science and career. Tommy’s LinkedIn: https://www.linkedin.com/in/dangtommy/ Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu (00:00:00) Introduction (00:01:28) Get into computer science from non-tech background (00:03:08) How he started his first project (00:04:07) Projects at Airbnb (00:06:09) Speed vs Quality when building data pipelines (00:16:34) How to deal with ad hoc requests (00:21:00) How did he learn machine learning (00:24:04) How he convinced data scientists to teach him ML (00:25:15) Performance review (00:27:11) Don’t let your job title limit your career (00:28:29) Why he started his company (00:31:38) Build your own tool vs use open source solutions (00:33:12) Transitioning from an engineer to a CEO (00:34:50) Earn trust from internal stakeholders (00:36:27) Career advice (00:41:31) How he carved his own path at Airbnb (00:46:00) How did he learn to be a good engineer (00:47:10) Best advice for data scientists or engineers (00:48:41) Most important quality of data scientists or engineers (00:51:51) Design principles (00:58:51) Future of tools (01:01:00) What does he think about his future career (01:05:05) Inspiration of Tommy
11/8/2022 1 hour, 8 minutes, 2 seconds
Episode Artwork

How to effectively test and debug machine learning models, from ML engineer@Apple to startup founder - Gabriel Bayomi - the data scientist show #055

Gabriel Bayomi is the Co-Founder at OpenLayer, a tool that tests & debugs machine learning models. OpenLayer was in Y Combinator’s 2021 batch, building tools for machine learning model testing. Previously he was a machine learning engineer at Apple working on Siri. He has a master's degree in computer science from Carnegie Mellon. He is passionate about Natural Language Processing, Machine Learning, and Computational Social Science. We talked about how to test and debug machine learning models, his experience at Apple, and career lessons. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science and career. Gabriel’s LinkedIn: https://www.linkedin.com/in/gbayomi Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu (0:00) Intro (00:01:39) How he got into machine learning (00:06:43) His experience at Apple, Siri (00:15:55) How to validate the solution (00:19:39) Benefits of using external error analysis framework (00:21:30) How to build a model evaluation pipeline (00:28:26) Don’t overfit the subset of data (00:33:19) Your validation set shouldn’t be fixed (00:41:03) Become one with data (00:44:05) Three model interpretability libraries you should use (00:50:47) Common mistakes people made in model validation (00:53:33) How to create an adversarial test (00:55:43) How to check data quality (01:06:46) Transition from engineer to executive (01:10:04) Things he learnt from his favorite coworker (01:17:57) How job roles would evolve
10/24/2022 1 hour, 24 minutes, 1 second
Episode Artwork

From Amazon research scientist to head of data product at Vestiaire Collective, why data science projects fail, how to be a good communicator - Alisa Kim - the data scientist show #054

Alisa Kim is the head of data product at Vestiaire Collective. Previously, she was a research scientist at Amazon Web Services. We used to work on the same team in the Machine Learning Solutions Lab at Amazon Web Services, where we collaborated on projects. Before that, she was a consultant and worked in analytics and investment banking. She has a Ph.D. in Econ AI and has worked in various industries and on multiple continents. She's someone I really enjoyed working with. We talked about her journey, the projects she worked on, and the lessons she learnt. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Alisa's LinkedIn: https://de.linkedin.com/in/alisakolesnikova Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu (0:00) Intro (00:01:38) how she got into data science (00:04:38) day-to-day at AWS ML Solutions Lab (00:08:00) AWS leadership principles (00:16:34) challenges the consultant faces when working with external customers (00:23:36) from AWS to Vestiaire Collective (00:37:54) how to build a better data product (00:44:17) how data scientists can align with business stakeholders (00:57:52) from tech to business (01:01:33) how to develop communication skills (01:09:17) increase visibility of the data science team (01:17:22) being proactive vs being passive in chasing opportunities (01:24:06) get feedback from your "nearest neighbors" (01:25:37) how to set boundaries at work (01:38:48) mistakes she made in her career (01:48:25) how to manage disagreement (01:57:53) future of data science
10/19/2022 2 hours, 12 minutes, 17 seconds
Episode Artwork

The lessons from almost losing a million dollars for his company, how to build good data assets and get buy-in from the leadership - Mark Freeman - the data scientist show #053

Mark Freeman is a community health advocate turned data scientist. His mission is to improve the well-being of people, especially among those marginalized. He is currently a senior data scientist at Humu where he builds data tools that drive behavior change to make work better. He has a master's degree from the Stanford School of Medicine in clinical research, experimental design and statistics. He also has a certificate in entrepreneurship from the Business School of Stanford. In his free time, he volunteers with a Bay Area Community Health Advisory Council. He also plays Men's Division III Rugby. We talked about building data tools, data engineering skills for data scientists, how to pitch a project, and his career journey. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Mark's LinkedIn: https://www.linkedin.com/in/mafreeman2/ Chapters: (0:00) Intro (00:03:05) Our experience using R - 1000 lines of code (00:09:22) Entrepreneurship within a company (00:16:25) DBT and modern data stack (00:20:15) Tools don’t matter (in interviews) (00:21:09) Things DE enjoys but DS doesn’t (00:24:55) How to work with different stakeholders (00:30:32) Common SQL mistakes (00:33:34) SQL vs Python vs R (00:35:26) T.R.I.B.E framework for projects (00:40:43) Meet the stakeholders where they are at (00:42:40) Use feedback to get buy-in from collaborators (00:46:36) How to pitch a new idea (00:49:45) Don’t lead with solution, lead with the problem (00:51:03) How to get buy-in from the leadership (00:57:56) Present an idea as if the audience came up with it (00:58:41) How to iterate a project (01:00:27) How he almost lost 1 million dollars for his company (01:02:07) Things he learned from his manager (01:04:19) Things that help people make changes effectively (01:06:05) Things he learned from mentoring (01:12:19) Mental Health and anxiety (01:17:12) Web3 (01:20:14) Why he cares about community health (01:25:40) "Soul - searching" on his future (01:28:36) Why he writes on LinkedIn (01:30:04) Future of data science
10/15/2022 1 hour, 32 minutes, 31 seconds
Episode Artwork

From deep learning architect at AWS to PM in AI product - Abhi Sharma - the data scientist show #052

Abhi Sharma started his career as a software engineer at Amazon Lab 126, building cloud services for Alexa. Later he transferred to Amazon Web Services as a deep learning architect. We used to work on the same team at the Machine Learning Solutions Lab in AWS. Currently, he is a product manager, responsible for machine learning products like the chatbot at Chime. We talked about how he transitioned his career from software engineer to deep learning architect and then to product manager, cool projects he worked on, and our shared experiences at Amazon. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Abhi's LinkedIn: https://www.linkedin.com/in/abhivs/ Highlights: (0:00) Intro (00:01:48) from SWE to deep learning architect to product manager (00:12:44) day-to-day as a product manager at Chime (00:19:46) how he collaborates with different data personas (00:27:21) how to negotiate for more time for projects with leaders (00:33:59) some timelines are negotiable (00:38:00) most impactful project he worked on (00:44:22) how to evaluate KPI, and not game the system (00:48:02) think about development in the beginning (00:50:29) data scientists need to educate the business and demystify the buzz words (00:54:19) Amazon’s Think Big Challenge (00:57:09) Never solve the problem twice (01:00:25) How to transition to a product manager (01:07:48) why he wanted to become a PM (01:25:35) How data scientists can learn from PMs
10/4/2022 1 hour, 30 minutes, 45 seconds
Episode Artwork

What data scientists need to know about MLOps principles, from GPA 2.6 to Sr. MLOps Engineer@Intuit - Mikiko Bazeley - the data scientist show #051

Mikiko Bazeley is a senior software engineer working on MLOps at Intuit. Previously, she worked as a growth hacker and a data analyst in finance, then became a data scientist, and later transitioned into machine learning. She has a bachelor's degree in economics and biological anthropology, and did a data science bootcamp at Springboard. She is a tech writer for NVIDIA and she’s working on a course on MLOps. Her goal is to demystify MLOps & show how to develop high-quality ML products from scratch. You can find her content on LinkedIn and YouTube. Today, we’ll talk about useful engineering principles for data scientists, MLOps, and her career journey. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science and career. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Mikiko's LinkedIn: https://www.linkedin.com/in/mikikobazeley/ Highlights: (0:00) Intro (00:02:00) from GPA 2.6 to data scientist (00:05:27) her experience at Mailchimp (00:11:44) her frustrations on Cookiecutter project (00:14:09) the pain point of a data scientist working with engineering (00:21:01) 2 MLOps patterns (00:25:52) challenges about her work (00:29:49) the basic engineering skills a data scientist should have (00:32:46) the tests a data scientist should write (00:37:42) how an MLOps engineer collaborates with a data scientist (00:45:28) what makes a good MLOps engineer (00:52:33) AWS vs GCP vs Azure (00:58:59) how a data scientist collaborates with an MLOps engineer (01:05:19) suggestions for building a model on a large scale (01:09:11) how she learnt MLOps on her own within 6 months (01:17:32) learn from code review (01:19:17) MLOps books and resources she recommended (01:24:13) mistakes she made earlier in her career (01:31:29) common mistakes people make during career change (01:38:22) "Start with the end in mind" (01:41:16) the future of MLOps (01:46:23) how she sees her career growth (01:56:40) how she continues learning new skills (02:00:09) what she is excited about in her career and life
9/27/2022 2 hours, 4 minutes, 50 seconds
Episode Artwork

Bayesian thinking in work and life, ad attribution models and A/B testing, machine learning@Foursquare - Max Sklar - the data scientist show #050

Max Sklar is an independent engineer and researcher. Previously, he was an Engineering and Innovation Labs Advisor at Foursquare after 7 years at the company as a machine learning engineer. He has worked on ad attribution, recommendation engines, and ratings. He is the host of The Local Maximum podcast. Max studied CS at Yale, and holds a master's degree in information systems from New York University. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Max's LinkedIn: https://www.linkedin.com/in/max-sklar-b638464/ Max’s website: localmaxradio.com/about Interviews he mentioned during the podcast: Andrew Gelman, Statistics at Columbia University; Shirin Mojarad on Causality; Johnny Nelson on Free Speech and Moderation online; Stephanie Yang talking about Foursquare's Venue Rating System; Dennis Crowley: on Labs, on Innovation; Sophie Carr (Bayesian Mathematician); Will Kurt (Bayesian); Marsbot for Airpods. Other Episodes Mentioned: Bayesian Thinking; P-Hacking; Interview on Learn Bayesian Statistics. Highlights: (0:00) Intro (00:01:23) from computer science to machine learning (00:05:35) Bayesian methods in rating system (00:14:53) how to choose a Bayesian prior (00:20:10) how to deal with p-hacking (00:26:57) causality model in ad attribution (00:35:20) Bias-correction methods (00:45:43) negative lift in advertising (00:51:05) unexpected consumer behaviors (00:52:08) why he decided not to climb the "engineer ladder" (00:56:46) the challenges of having 5 managers in a year (01:01:38) using the 3rd-party software vs building his own (01:04:18) how he approaches ML problems (01:07:51) his tech stack (01:09:25) his advice on learning machine learning (01:12:40) projects he is working on (01:17:10) Bayesian for his life decisions (01:22:00) how writing helps him (01:23:48) the confusion, stress and excitement in his career
9/13/2022 1 hour, 30 minutes, 25 seconds
Episode Artwork

Why he quit a $500k+ machine learning job at Meta (Facebook): a candid review of his experience, mistakes, and ML best practices - Damien Benveniste - the data scientist show #049

(timestamps below) Damien Benveniste is a data scientist and software engineer. Previously, he was a machine learning tech leader and mentor. He has worked for almost ten years in different machine learning roles in different industries such as AdTech, market research, e-commerce and health care. He has a Ph.D. in physics from Johns Hopkins University and is now working towards co-founding his own startup in the employee engagement space. We talked about his career journey, how he solved challenging problems, and his advice for new data scientists and engineers. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Damien's LinkedIn: https://www.linkedin.com/in/damienbenveniste/ (00:00) Intro (00:01:17) from quantitative trading to machine learning (00:07:52) his experience at Meta (00:21:16) automated machine learning (00:28:52) model paradigm (00:32:47) the productivity-oriented culture at Meta (00:41:42) short-term gain vs long-term goal (00:44:38) things he liked at Meta (00:51:54) the project that shaped his career (01:03:56) the importance of having a baseline for ML models (01:09:12) why he time-boxed everything (01:16:25) test the model in production (01:20:05) experimental design for ML (01:23:25) the most challenging project he worked on (01:37:07) best practices for machine learning (01:48:44) how he sees himself (02:00:52) lessons he learnt from being laid off (02:06:45) frustration he had in his previous job (02:16:14) what he is working on (02:29:18) the future of machine learning (02:39:52) things he is excited about
9/6/2022 2 hours, 44 minutes, 26 seconds
Episode Artwork

Time series modeling in supply chain, how to master business communication, save the environment with data science - Sunishchal Dev - the data scientist show #048

Sunishchal Dev is a lead data scientist at Booster. He's helping to decarbonize the transportation industry by optimizing last mile delivery of renewable fuels. Previously, he was a management consultant. On the side, he volunteers with Project Drawdown to model the most effective solutions to climate change. He also mentors future data scientists at Springboard, guiding them through real-world projects. We talked about his career journey, supply chain optimization, and how data science can help the environment. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu (0:00) Intro (00:01:24) from business to data science (00:06:36) the big impact of a small improvement (00:08:50) data engineering vs predictive modeling (00:11:48) routing optimization (00:16:27) time series model (00:21:32) use upsampling to simulate intermittent time series problem (00:26:20) his modern data stack (00:28:29) collaborate with engineers (00:30:06) common mistakes people made in building time series models (00:37:02) collaborate with truck drivers (00:40:17) how to become a good communicator (00:46:30) his experience in mentoring data scientists (00:51:14) things people cannot learn at school (00:53:16) the mistakes he made and the things he learnt from his mentor (00:56:07) how data science can help the environment Books recommended: The Pyramid Principle: Logic in Writing and Thinking; The Book of Why: The New Science of Cause and Effect; Influence, New and Expanded: The Psychology of Persuasion
8/31/2022, 1 hour, 3 minutes, 9 seconds

Product data science@Spotify, from management consultant to data scientist, salary negotiation, managing ADHD - Felicia Rutberg - the data scientist show #047

Felicia Rutberg is a product strategy and analytics manager at Snap. Previously, she was a product data scientist at Spotify. She started her career as a management consultant at Accenture. She studied mathematics and cognitive psychology at Vanderbilt University. Felicia reached out to me on LinkedIn because she wanted to share how she became a data scientist while having ADHD. Today we’ll talk about product analytics at Spotify and Snap, her career journey, and ADHD. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Felicia's LinkedIn: https://www.linkedin.com/in/feliciarutberg/ Highlights: (00:01:29) from management consulting to data science (00:12:20) financial data analyst at Spotify (00:20:06) how to do internal job transition (00:25:57) product data scientist at Spotify in the econometrics team (00:29:33) how she became more vocal on the creative process (00:33:48) how to get the last 1% of the work done (00:38:53) how to ensure the quality of the analysis (00:50:19) propensity score matching at Spotify (00:57:09) how to validate causal inference outcomes (01:00:51) lessons from working with economists (01:19:16) from Spotify to Snap (01:27:35) salary negotiation (01:34:02) day-to-day at Snap (01:38:33) Spotify vs Snap (01:44:35) lessons from management consulting that helped her data science journey (01:47:37) ADHD and self-compassion (02:02:52) the books she recommended (02:08:26) her future career
8/18/2022, 2 hours, 12 minutes, 57 seconds

Data science interviews trends, from being laid off to landing a data scientist job at Airbnb - Emma Ding - the data scientist show #046

Emma Ding is a data scientist turned career coach. Previously she was a data scientist and software engineer at Airbnb. I first discovered her through a viral Medium blog called “how I got 4 data science offers and doubled my income 2 months after being laid off". Today, her mission is to help data scientists land their dream offers by being strategic and efficient in their interview preparation at https://www.datainterviewpro.com/. Among the 80 clients she worked with, 90% received data scientist job offers from top tech companies, such as Meta, LinkedIn, DoorDash, Robinhood, etc. In the first half of the podcast, we talked about how she doubled her salary and got into Airbnb after she was laid off and her experience at Airbnb; then we dive into new trends in data science interviews and her best strategy to get a data scientist job. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Emma's YouTube: https://www.youtube.com/c/DataInterviewPro Free product case class: https://www.datainterviewpro.com/product-case-masterclass-registration Books on causal inference: Mostly Harmless Econometrics and Mastering Metrics: The Path from Cause to Effect.
Emma's LinkedIn: https://www.linkedin.com/in/emmading001/ (00:00) Intro (00:04:24) her strategy to get the data scientist offer after the layoff (00:07:00) advice for preparing for interviews (00:14:04) her day-to-day at Airbnb (00:16:46) things she learnt from her mentor (00:18:07) from a data scientist to an SDE to a data interview pro (00:22:12) trends of data science interview (00:26:48) data scientist tracks: analytics-driven vs algorithms-driven (00:32:56) SQL interviews: readability and proficiency (00:35:06) make a study plan, execute it and keep the confidence (00:41:29) what she teaches in her datainterview.com (00:43:45) how to tackle take-home challenges (00:45:41) how to negotiate salaries (00:46:56) how to build confidence in the job search process (00:50:23) how to study different subjects efficiently (00:54:26) how to transition to data science (01:00:05) how to remedy mistakes during the interview (01:03:37) is data scientist still in demand? (01:08:43) advice for getting ready for the new career
8/2/2022, 1 hour, 19 minutes, 35 seconds

Using ML to tackle disruptive behaviors in gaming@Activision, data science in the metaverse, cyber security - Carly Taylor - the data scientist show #045

Carly Taylor is a senior manager at Activision, leading a team of machine learning engineers to tackle disruptive behaviors in the game ‘Call of Duty’. Previously, she held various roles including machine learning engineer, data scientist, product analyst, and analytical chemist. She has a master's degree in computational chemistry from the University of Colorado. She’s passionate about video games and cyber security. She shares her insights on machine learning, gaming, and career with 33k LinkedIn followers. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Carly's LinkedIn: https://www.linkedin.com/in/carly-taylor0017/ Highlights: (00:00) Intro (00:01:14) from chemistry major to data scientist in gaming (00:05:46) how she tackles disruptive behavior using machine learning (00:11:38) feature engineering and model drift in fraud detection (00:16:49) the challenge of dealing with the large scale of data (00:27:10) data science in the Metaverse (00:36:08) signal processing and anomaly detection (00:40:31) dealing with the outliers (00:45:49) getting buy-in from leadership (00:49:56) from an IC to a manager (00:53:36) mentorship, mistakes, and other things she learnt from work (00:58:48) Python or R? (01:05:30) how she sees herself grow and how she deals with struggles (01:07:56) the future of data science in gaming
7/29/2022, 1 hour, 15 minutes, 41 seconds

From lawyer to senior data scientist at Amazon, data science in devices, HR, and real estate, how to 're-invent' yourself - Pauline Chow - the data scientist show #044

Pauline Chow is a data scientist, former attorney, and active transportation advocate. She has worked in banking, fashion and education start-ups, and at Amazon. Currently, she is the data engineering lead for Thrackle, a blockchain research and modeling company. She has a master's degree in computer science (machine learning) from the Georgia Institute of Technology and a law degree (JD) from the University of Wisconsin. She is also a certified yoga teacher and published writer. We talked about her projects in three different teams at Amazon: devices, HR, and real estate; how her law degree helped her become a better data scientist; and how she 're-invented' herself. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Visit her author website www.paulinechowstories.com or connect with her on Twitter @itspaulinechow. Pauline's LinkedIn: https://www.linkedin.com/in/paulinec/ A More Beautiful Question: The Power of Inquiry to Spark Breakthrough Ideas -- examples of the purpose of questioning. The Four Tendencies by Gretchen Rubin (quiz, book). An interesting framework for considering how different people respond to internal and external expectations and pressures. Why only rewarding high-performers can be detrimental to an organization? Wharton People Analytics Conference. Case Studies: Network Analysis. (2015, December 13). https://www.youtube.com/watch?v=0fM6JYC2zfQ
7/13/2022, 1 hour, 31 minutes, 11 seconds

From chemical engineer to data scientist@ExxonMobil, why he left to do data science freelancing, data career jumpstart, Avery Smith - the data scientist show #043

Avery Smith is a data science consultant and career coach at Data Career Jumpstart, and a TA at MIT Professional Education. Previously, he worked on optimization and predictive analytics at ExxonMobil. We talked about his journey from chemical engineer to data analytics, optimization problems in the energy sector, why he left ExxonMobil, and his best advice for people to get into data science. Follow Daliana on Twitter (https://twitter.com/DalianaLiu) for more on data science and this podcast. If you like the show, subscribe and give me a 5-star review :) Topics: his first data science projects; his experience with ExxonMobil; why he left ExxonMobil; data science consulting; challenges when working with clients; why he built his own career coaching program; how LinkedIn helped his career; TA at MIT and MIT's data engineering curriculum; how to build a data science portfolio. Avery's LinkedIn: https://www.linkedin.com/in/averyjsmith/
7/6/2022, 1 hour, 31 minutes, 59 seconds

Applied machine learning research methods, human-machine team, AI strategies, trends in machine learning, how to earn trust - Vin Vashishta - The data scientist show #042

(Highlights below) Vin Vashishta is a chief data officer and AI strategist at V Squared, a company he founded in 2012 that  provides AI strategy, transformation, and data organizational build-out services. He teaches data professionals about strategy, communications, business acumen, and applied machine learning research methods. Vin has 130k+ followers on Linkedin talking about AI, analytics, and strategy. His website: https://www.datascience.vin/ If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Highlights: (0:00) Intro  (00:03:37) "ML strategy" with 'pricing' as an example  (00:09:45) what is a good metric for ML  (00:13:16) how to translate a business problem into a data problem  (00:23:42) leverage users in the "Human Machine Teaming"  (00:48:22) how he earned the trust  (01:17:31) data science evolution from 2012 to 2022  (01:31:06) how he learns new domain knowledge (01:36:25) the mistakes he made  (01:42:15) what he learnt from his mentor
6/29/2022, 1 hour, 50 minutes, 1 second

Retail store forecasting with video and audio, ML in high frequency trading, from tech to politics, ML in Web3 - Greg Tanaka, the data scientist show #041

(Highlights below) Greg Tanaka is a computer scientist turned CEO of an AI company. He started coding when he was 6, studied computer science at UC Berkeley, and has built many machine learning applications. He is the founder and CEO of Percolata, developing “Forecast as a Service”. He is also a council member of Palo Alto, California, and just finished his campaign for Congress. Today we’ll talk about his career journey, forecasting, and machine learning in blockchain and political campaigns. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Greg's LinkedIn: https://www.linkedin.com/in/gltanaka/, Twitter: https://twitter.com/GregTanaka Greg's DAO: https://www.gregtanaka.org/dao Highlights: (00:02:10) use computer vision, audio, and Wi-Fi fingerprints to forecast the retail store traffic (00:21:55) why time series forecast is hard (00:26:39) how he made the forecasting more stable (00:28:46) how he troubleshot the spikes and drops in data (00:36:04) human trading vs algorithmic trading (00:47:36) his vision of machine learning in blockchain (00:54:57) why he got into politics (01:05:57) advice for people who are interested in Web3 (01:11:04) AutoML and the future of machine learning (01:15:36) things he wished he could learn earlier
6/23/2022, 1 hour, 30 minutes, 41 seconds

Weather forecasting with AI, Kaggle tips and tricks, dealing with missing data, deep learning with Jesper Dramsch, The Data Scientist Show #040

(Highlights below) Jesper Dramsch is a scientist for machine learning at the European Centre for Medium-Range Weather Forecasts. They have a PhD in applied machine learning for geoscience from the Technical University of Denmark. They are a Kaggle Kernels Expert and TPU star, ranking in the top 81/100k worldwide. We talked about weather forecasting, things they learned from Kaggle, how to deal with missing data and outliers, deep learning, Keras vs PyTorch, XGBoost, their struggles as a PhD student, and working in the EU vs the US. Follow @DalianaLiu for more updates on data science and this show. (00:01:27) how they got into ML (00:09:10) how they handled missing data (00:28:34) Transformers are eating the world (00:49:36) Huber loss is a fantastic metric to deal with extreme values (00:54:48) their experience with Kaggle competitions (01:02:59) Kaggle tricks that helped their models perform better (01:08:18) PyTorch vs Keras (01:30:30) working in different countries and cultures Resources shared by Jesper: The newsletter with missing data: https://buttondown.email/jesper/archive/towels-have-quite-a-dry-sense-of-humor/ The paper by Gael about missing data: https://academic.oup.com/gigascience/article/doi/10.1093/gigascience/giac013/6568998 The Huber Loss: https://en.wikipedia.org/wiki/Huber_loss Skill Scores: https://en.wikipedia.org/wiki/Forecast_skill Brier Skill in Weather: https://www.dwd.de/EN/ourservices/seasonals_forecasts/forecast_reliability.html CRPS Continuous Ranked Probability Score https://datascience.stackexchange.com/questions/63919/what-is-continuous-ranked-probability-score-crps ConvNext, Convnets for the 2020s: https://arxiv.org/abs/2201.03545 Transformers for ensemble forecasts: https://arxiv.org/abs/2106.13924 Books I recommend: https://www.amazon.com/shop/jesperdramsch/list/2DYS5KVR5TX0E Blog posts I wrote about these books: https://dramsch.net/tags/books/ Short I made about Test-Time Augmentation https://www.youtube.com/shorts/w4sAh9lKyls Their links:
https://dramsch.net/links Their open PhD thesis: https://dramsch.net/phd Newsletter: https://dramsch.net/newsletter Twitter: https://dramsch.net/twitter Youtube: https://dramsch.net/youtube Linkedin: https://dramsch.net/linkedin Kaggle: https://dramsch.net/
6/16/2022, 1 hour, 58 minutes, 11 seconds

Reinforcement learning common use cases, recommendation engine, productivity - Susan Shu Chang - the data scientist show #039

(Highlights below) Susan Shu Chang is a principal data scientist at Clearco, helping ecommerce founders by building machine learning-powered investing. In her previous role, she developed the company’s very first ML-powered website recommender system, deployed to millions of customers, and created a custom OpenAI Gym environment for a reinforcement learning project in production. She is also the founder and developer of Quill Game Studios, which sold ~10k copies of its debut game in 6 months. She has given talks at PyCon Canada, Toronto Machine Learning Summit (TMLS), and more. She writes about her career journey and learning on https://www.susanshu.com/ If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Highlights (00:00) Intro (00:01:29) from economics to data science (00:07:23) reinforcement learning (RL) (00:20:00) recent reinforcement learning use cases (00:27:28) reinforcement learning for social media's recommender system (01:04:42) common mistakes when productionizing models (01:08:30) principal data scientist's day-to-day (01:14:05) what productivity really means (01:21:04) productivity tips (01:41:48) books and blogs on productivity
6/8/2022, 1 hour, 53 minutes, 5 seconds

User-centric data science, design thinking, from UX researcher to data science manager@Visa - Laura Gabrysiak - the data scientist show #038

(Highlights below) Laura Gabrysiak is a senior manager of data products and solutions at Visa. Previously, she was a data scientist, building machine learning models and decision tools to enable Visa clients. She has a college degree in computational and linguistics and a master's in design thinking. She's building the local data science community in Miami and is a co-founder of R-Ladies. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Laura's LinkedIn: https://www.linkedin.com/in/lauragabrysiak/ (00:02:43) her journey into data science (00:20:28) anecdotes vs big data (00:27:05) the power of small data (00:30:41) design thinking key elements (00:47:25) mindset shift from a user researcher to a data scientist (01:00:51) how to improve customer engagement (01:02:10) how to make data visualization effective (01:27:21) mindset shift from an individual contributor to a manager (01:40:43) advice for people who are on a PIP
5/31/2022, 2 hours, 1 minute, 54 seconds

A/B testing and growth analytics at Airbnb, building data science tools and metrics store with Nick Handel, the data scientist show #037

(Highlights below) Nick Handel was a senior data scientist at Airbnb, leading the launch of the data side of Airbnb Trips, and later built a team that designed Airbnb’s end-to-end machine learning platform, Bighead. Currently, he is the cofounder and CEO of Transform, the first centralized 'metrics store' that empowers data analysts to deliver insights. He was recognized as 30 under 30 by Forbes in 2018. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Nick's LinkedIn: https://www.linkedin.com/in/nicholashandel/ Highlights: (00:00) intro and career journey (00:10:58) common mistakes in A/B testing (00:25:48) how to do A/B testing deep dives (00:27:32) surprising A/B testing results (00:29:18) facts vs opinions (00:33:55) A/B testing best practices (00:55:01) how he built a new data schema for Airbnb Trips (01:00:43) how to collect data when building data science tools (01:38:53) trend of data science tools
5/24/2022, 2 hours, 10 minutes, 7 seconds

Becoming a superforecaster, decision science for better human predictions - Pavel Atanasov - the data scientist show #036

(Highlights below) Pavel is a decision scientist and co-founder at Pytho, using decision science to measure and improve human judgment & prediction. He has a PhD in psychology and decision science from the University of Pennsylvania, focusing on crowd predictions. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Pavel's Twitter: https://twitter.com/PavelDAtanasov Superforecasting book, based on the Good Judgment Project: https://www.amazon.com/Superforecasting-Science-Prediction-Philip-Tetlock/dp/0804136718 Blogs about forecasting: Vox's Future Perfect series: https://www.vox.com/future-perfect Astral Codex Ten: https://astralcodexten.substack.com/ Highlights: (00:01:10) how he got into decision science (00:14:38) what makes someone a super forecaster (00:16:20) three elements of becoming a super forecaster (00:24:37) how to effectively update our opinions (00:30:05) how he designed experiments to find out what was a better system (00:48:27) why humans sometimes are better than algorithms (01:14:50) how to collect data and information better (01:33:25) why you should quit (01:42:30) the future of decision science
5/17/2022, 1 hour, 51 minutes, 29 seconds

Using AI to detect online abuse, from physics PhD to staff ML engineer@Linkedin, persuasion at work with James Verbus - the data scientist show #035

(Timestamps below) James Verbus is a Staff Machine Learning Engineer at LinkedIn. He has a PhD in Physics from Brown University. He is the tech lead of the Anti-Scraping and Automation AI Team, working on protecting LinkedIn's members from bots and abusive scripted behavior, and pioneering the use of deep learning to detect abusive automated sequences of user activity (blog post). If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu (00:01:14) from physics to data science (00:16:37) background of online abuse detection (00:24:40) Isolation Forest Algorithm (00:42:59) his day-to-day as a staff ML Engineer (00:52:57) how to persuade stakeholders (00:58:17) how to build influence at work (01:00:22) how he grew to staff engineer (01:13:48) what he learned from his mentor
5/10/2022, 1 hour, 35 minutes, 55 seconds

The golden age of AI and neuroscience, brain computer interface (BCI), from academia to FAANG with Patrick Mineault - The Data Scientist Show #034

(Timestamps below) Patrick Mineault is a neural data scientist. He worked at Google and Facebook after a postdoc at UCLA. He worked on Brain Computer Interface (BCI) at Facebook Reality Labs, building a BCI that allows you to type with your brain. He tweets about neuro-AI @patrickmineault, and writes a blog (https://xcorr.net) sharing his career journey and learnings along the way. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu How he got into data science (00:02:41); his work at Google on A/B testing (00:04:17); how he joined Facebook Reality Labs (00:23:53); projects on neuro-AI and brain computer interface (BCI) (00:27:13); skills needed for BCI research (00:34:37); how AI influences neuroscience (01:34:28); computer vision vs human vision (01:39:57); model vs data, nature vs nurture (01:45:32)
5/5/2022, 2 hours, 46 minutes, 25 seconds

From biostatistician to the 'artist of data science', how he turned his life around, philosophy - Harpreet Sahota - The Data Scientist Show #033

Harpreet Sahota is a data scientist and ML developer advocate. He is also the host of “The Artists of Data Science” podcast and weekly data science happy hours, and the principal data science mentor at Data Science Dream Job. He is also a philosophy nerd. He had some struggles when he tried to get into data science, and today we’ll talk about his experience as a biostatistician and data scientist, lessons he learned from his journey and from mentoring other people, and how he turned his life around. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Harpreet's LinkedIn: https://www.linkedin.com/in/harpreetsahota204/?originalSubdomain=ca The Artists of Data Science podcast: https://theartistsofdatascience.fireside.fm/
4/6/2022, 1 hour, 25 minutes, 8 seconds

How he built the best Covid forecasting model, lessons learned and how to improve model performance with Youyang Gu - The Data Scientist Show #032

Youyang Gu is the creator of http://covid19-projections.com. In 2020, while most Covid prediction models failed, he created a forecasting model, without any experience in medicine, that outperformed almost all medical experts. Yann LeCun, Facebook's chief AI scientist and professor, stated that Gu's model "is the most accurate to predict deaths from COVID-19", surpassing the accuracy of the well-funded Institute for Health Metrics and Evaluation COVID model. It was cited by the Centers for Disease Control (CDC) in its estimates for U.S. recovery. Currently, he is a member of the Technical Advisory Group at the World Health Organization, laying the groundwork for a comprehensive, global study to document and analyze differences in levels of mortality attributable to COVID-19 between and within countries. Today we talked about how he built the model, lessons he learned, his advice for data scientists, and what he's working on today. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Youyang's blog: https://youyanggu.com/ Youyang's Twitter: https://twitter.com/youyanggu
3/31/2022, 2 hours, 3 minutes, 43 seconds

Feature engineering, ML models in production, new trend for ML tools, day-to-day of a principal engineer with Willem Pienaar - The Data Scientist Show #031

Willem is the creator of Feast, an open-source feature store (feast.dev), building tools at the intersection of engineering, data, and ML. Currently, he works as a principal engineer at Tecton, leading the development of Feast. Previously, he worked in South Africa, Thailand, and Singapore before he moved to San Francisco in the US. Today we’ll talk about machine learning in production, cool projects he worked on, machine learning in startups, and how to pick the right data science track for your career. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Willem's LinkedIn: https://www.linkedin.com/in/willempienaar/
3/24/2022, 1 hour, 36 minutes, 24 seconds

Machine learning in healthcare, how to scale ML solutions, from ML researcher to product leader at Microsoft with Muazma Zahid - The Data Scientist Show #030

Muazma Zahid is a leader in data and AI, speaker and researcher in Biomedical Engineering with several international publications and awards. We talked about machine learning in healthcare, how to scale data science solutions, her journey from a ML researcher to data engineer to engineering manager to a product leader.  She joined Microsoft in 2018 as a data engineer, later became a senior manager in software engineering, and now she is a principal product manager. She won the mentor of the year award in 2020 by Women Tech Network. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Muazma's Linkedin: https://www.linkedin.com/in/muazmazahid/
3/20/2022, 1 hour, 40 minutes, 31 seconds

Hands-on time series analysis, open source projects, R packages, MLOps common mistakes with Rami Krispin - The Data Scientist Show #029

Rami leads the data science and engineering team at Apple Finance Decision Support. He uses advanced statistical and machine learning models to help leadership make better decisions. He is also an open-source contributor and the author of Hands-On Time Series Analysis with R and several R packages for time series analysis and machine learning applications. He has a master's degree in applied econometrics. We talked about time series, open source, MLOps, and his career journey. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Rami's LinkedIn: https://www.linkedin.com/in/rami-krispin/ Rami's GitHub: https://github.com/RamiKrispin Rami's Twitter: https://twitter.com/Rami_Krispin Rami's Blog: https://ramikrispin.github.io/
3/11/2022, 1 hour, 31 minutes, 3 seconds

Becoming a deep learning researcher without a PhD, graph neural network (GNN), time series, recommender system with Kyle Kranen - The Data Scientist Show #028

Kyle Kranen is a Deep Learning Software Engineer at Nvidia, researching, implementing, and optimizing state-of-the-art distributed deep learning models, mainly using PyTorch and TensorFlow. He has a unique combination of hardware and software engineering skills. We talked about Graph Neural Network (GNN), Temporal Fusion Transformer (TFT), time series, other deep learning research topics, and his career journey. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Kyle's LinkedIn: https://www.linkedin.com/in/kyle-kranen/
3/3/2022, 1 hour, 57 minutes, 10 seconds

How to 'predict' the past, geospatial data's use cases, Data-as-a-Service (DaaS), out-of-the-box career advice with the CEO of SafeGraph, Auren Hoffman - The Data Scientist Show #027

Auren Hoffman is CEO of SafeGraph: the place for data about physical places. We talked about how to use analytics and machine learning to find truth in data, geospatial data and their use cases, the impact of DaaS, and what he looks for when he develops talents. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Auren's Twitter: @auren.
2/24/2022, 1 hour, 26 minutes, 27 seconds

Telling compelling stories with data, people skills for analytical thinkers with Gilbert Eijkelenboom - The Data Scientist Show #026

Gilbert Eijkelenboom is the founder of Mindspeaking, a training program helping data & analytics professionals improve their business understanding, persuasion, and storytelling skills. He wrote the best-selling book “People Skills for Analytical Thinkers”. We talked about how to get buy-in from stakeholders, how to build work relationships as introverts, how to earn trust, how to tell compelling stories with data, and lessons he learned from playing poker. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Gilbert offers various free materials: Free self-assessment of your data communication skills (3 min): mindspeaking.com/maturity-model Free preview of People Skills for Analytical Thinkers: mindspeaking.com/book Free email course: mindspeaking.com/conversation
2/17/2022, 1 hour, 28 minutes, 9 seconds
Episode Artwork

Sports analytics and personal branding for data scientists, Ken Jee - The Data Scientist Show #025

Ken Jee is the head of data science at Scouts Consulting Group and a YouTube creator with over 180k followers. Today we talked about sports analytics, how to grow your career and get promoted, how to explain complex concepts to stakeholders, and how to build a personal brand as a data scientist. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Ken Jee's LinkedIn, YouTube
2/9/2022 1 hour, 32 minutes, 17 seconds
Episode Artwork

From Apple store specialist to ML engineer at Apple, build a portfolio through open source projects, Julia Language, with Logan Kilpatrick - The Data Scientist Show #024

Logan Kilpatrick is a machine learning engineer at Apple and Developer Community Advocate of Julia. He is a teaching fellow at Harvard Extension School and is currently doing a Master of Science in Law program. Today we’ll talk about how he became a machine learning engineer, the internship he did at NASA, why you should care about open source communities, Julia, and what the future of machine learning looks like. Make sure you stay till the end. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Logan's Twitter: https://twitter.com/OfficialLoganK?s=20 Logan's LinkedIn: https://www.linkedin.com/in/logankilpatrick/
2/3/2022 1 hour, 42 minutes, 54 seconds
Episode Artwork

Tackling complex ML problems with small steps, MLOps best practices, pre-model analysis, from marketing analyst to principal ML researcher with Nathan Landi - The Data Scientist Show #023

Nathan Landi is a principal quantitative researcher at TEKSystems. He is on the advisory board of MLOps World. We talk about pre-model analysis using information value, MLOps best practices, multi-stage modeling, tackling complex problems with simple models, interview tips, and how he grew his career from marketing analyst to principal ML researcher. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu The mentorship service mentioned is SharpestMinds; it's free to sign up here. Nathan's LinkedIn: https://www.linkedin.com/in/nathanielglandi/
1/27/2022 1 hour, 44 minutes, 27 seconds
Episode Artwork

Data-driven sales strategies, sales metrics, how to collaborate with business leaders with Dennis Yu - The Data Scientist Show #022

Dennis Yu is a Revenue and Strategy Leader. He is currently the Merchant Success Team Lead at Shopify, serves on the advisory board of the USC startup accelerator, and leads Talent and Professional Development for the Asian ERG at Shopify. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Dennis's LinkedIn: https://www.linkedin.com/in/dennisyyu/ Today we’ll talk about what business leaders look for in data science projects and his career journey: - Sales metrics: LTV, GMV band, etc. - Example of how he would grow sales revenue - How to tell stories with data - How data scientists and business leaders should collaborate - His career journey - How growing up as an Asian American shaped his perspectives
1/20/2022 1 hour, 19 minutes, 51 seconds
Episode Artwork

Economic thinking and a must-listen mini MBA for data scientists with Airbnb VP and Wharton Professor, Amit Gandhi - The Data Scientist Show #021

Amit Gandhi is a technical fellow and VP at Airbnb. He is a professor of economics at the Wharton School of the University of Pennsylvania. He gave us a master class on economic thinking and a mini-MBA tailored for data scientists. We also talk about his career journey, decision-making, machine learning, economics, and his advice to data scientists. Make sure you stick around to the end. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Amit's LinkedIn: https://www.linkedin.com/in/amitgandhiecon/
1/13/2022 1 hour, 40 minutes, 17 seconds
Episode Artwork

Translating ML model’s output into financial impact, fraud detection, financial modeling at Google, interview preparation with Dan Lee - The Data Scientist Show #020

Dan Lee is an ex-Google data scientist turned founder of DataInterview, an interview prep platform for data scientists. We talked about how to translate model results into dollar amounts, fraud detection models, quantitative thinking, data storytelling, best practices in exploratory data analysis (EDA), and interview prep tips. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu His LinkedIn: https://www.linkedin.com/in/danleedata/ Interview Prep: https://dataInterview.com
1/5/2022 1 hour, 19 minutes, 1 second
Episode Artwork

Unlocking the power of emotional intelligence for your career success, how to handle toxic relationships and how to regulate negative emotions with Marc Brackett - The Data Scientist Show #019

Marc is a Yale Professor and the founding director of the Yale Center for Emotional Intelligence. He wrote the best-selling book “Permission to Feel”. Today we’ll talk about how we can use emotional intelligence to empower our careers: how to regulate negative emotions, how to deal with toxic relationships at work, and how to influence big stakeholders. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu
12/30/2021 1 hour, 13 minutes, 30 seconds
Episode Artwork

The ultimate data science interview landscape, three shifts in DS job search, common mistakes in interviews with Andrew Berry - The Data Scientist Show #018

Andrew Berry is a data science educator at Lighthouse Labs. He has worked with 100+ students from various backgrounds trying to transition into data science. He teaches data science, coaches aspiring data scientists, and designs courses. We talked about the shift in data science interviews, how to tackle coding interviews, the future of job search, how to build your portfolio, interview tips for big companies vs small companies, behavioral interviews vs technical interviews, how to write cold emails, and more. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Andrew's LinkedIn: https://www.linkedin.com/in/berrya/
12/23/2021 1 hour, 49 minutes, 58 seconds
Episode Artwork

From unemployed to chief data scientist of multiple startups, machine learning prototyping, how to read people, overcoming life struggles with Matt Kirk - The Data Scientist Show #017

Matt Kirk is Daliana's mentor, so it's a very special episode! Matt has been many things in his life: data scientist, software engineer, research analyst (quant), founder, C-level executive, and so on. We talked about Matt’s unique career adventure, machine learning solutions he built for startups, how to read people and influence stakeholders, how to understand yourself, how to be productive, and how he overcame his life struggles. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu You can reach out to him: matt[at]matthewkirk[dot]com
12/15/2021 2 hours, 2 minutes, 53 seconds
Episode Artwork

The unique algorithm for compact and accurate machine learning models, no-code ML use cases and its impact on the future of data scientists with Blair Newman - The Data Scientist Show #016

Blair Newman is the CTO of Neuton AI. Neuton is a zero-code cloud platform that empowers users of any tech level to apply the best machine learning practices for solving real-world challenges faster. We’ll talk about Neuton AI’s patented deep learning algorithm that doesn’t use backpropagation, his career journey, no-code ML use cases, and how no-code ML impacts the future of data science. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Blair’s LinkedIn: https://www.linkedin.com/in/blairnewman/ Neuton AI: https://neuton.ai/
12/9/2021 1 hour, 18 minutes, 52 seconds
Episode Artwork

Build successful end-to-end machine learning systems, ML engineers day-to-day and stakeholder management with Eugene Yan - The Data Scientist Show #015

Eugene Yan is a machine learning engineer at Amazon. He designs, builds, and operates machine learning systems that serve customers at scale. In his free time, he writes and speaks about data science on www.eugeneyan.com with 2,000+ subscribers. We talked about how to build an end-to-end ML project successfully, machine learning best practices, his approach to tackling challenging problems, high-impact projects he worked on, how to communicate effectively with stakeholders, why writing documents is important, and how to get to the next level. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Eugene's Twitter: https://twitter.com/eugeneyan?s=20
12/2/2021 1 hour, 50 minutes, 11 seconds
Episode Artwork

From data engineer to data scientist at Google, transition into DS from non-tech degree, salary negotiation, how to manage up with Sundas Khalid - The Data Scientist Show #014

Sundas Khalid is a senior analytics lead at Google. She started her career as a data engineer and transitioned into data science through self-learning. I met Sundas when we worked together at Amazon. She helped women of color negotiate $1.4M in incremental salaries. She talks about careers in data science, personal finance, and salary negotiation on YouTube and Instagram. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Sundas' website: https://sundaskhalid.com/about-me Sundas' YouTube: https://www.youtube.com/c/sundaskhalid Instagram: https://www.instagram.com/sundaskhalidd/?hl=en • We talked about how she transitioned from data engineer to data scientist • Data engineer vs data scientist pros and cons • How to grow to a senior data scientist • How to build a data science tool that has impact for the business • 3 mistakes people make when negotiating salary • How to build wealth using your salary (00:00:00) Introduction (00:01:23) Overview of her career journey (00:04:52) High-impact projects she worked on (00:06:59) Tools she uses (00:07:42) To be successful in DE (00:09:16) Transitioning into data science (00:12:06) DE skills that give an edge as a data scientist (00:13:11) Her expectations of data science (00:15:49) Data engineering vs data science (00:17:33) Her day-to-day as a data scientist (00:19:42) The struggles in her day-to-day work (00:21:31) Is data science going away?
(00:22:45) Automation tools and reports (00:25:53) Growing her career as a data scientist (00:27:53) Communicating better with people (00:30:06) Mistakes she made in her career (00:34:16) Daliana joining the Weblab team (00:37:16) Tips for negotiating salary (00:41:57) Tools to use for researching salaries (00:43:19) Mistakes when negotiating a salary (00:44:03) Importance of investing early in a career (00:49:09) Tips about investing and building wealth (00:51:15) Things that people should know more (00:52:51) Tips for increasing productivity (00:54:46) Future of data science and analytics (00:56:08) How to keep updated on new information (00:57:52) What is she excited about now
11/25/2021 59 minutes, 39 seconds
Episode Artwork

Develop product sense to uplevel your data science career, how to influence product managers with data, crack product sense interview questions with Peter Knudson - The Data Scientist Show #013

Peter Knudson is a product manager of 10 years who focuses on innovative new experiences that help drive engagement in the ever-evolving landscape of mobile and console games. He is also the author of the Amazon best-selling book “Product Sense.” We talk about what product sense is, how data scientists develop product sense, product managers' frustrations when working with data scientists, how data scientists can influence product managers better, misconceptions about product management, and common mistakes in product management. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Peter’s best-selling book “Product Sense”: https://www.amazon.com/Product-Sense-Problems-Interviews-Management-ebook/dp/B0998SRN37 Website: ProductSenseBook.com Peter's LinkedIn: https://www.linkedin.com/in/thisispeterk/
11/17/2021 1 hour, 11 minutes, 6 seconds
Episode Artwork

The secret to improve mental health, future of data engineering, work life balance with Zach Wilson - The Data Scientist Show #012

Zach Wilson is a tech lead at Airbnb building data pipelines; previously he worked at Netflix and Facebook. Zach graduated from college at the age of 20 with degrees in math and computer science. He has over 80k followers on LinkedIn. We talked about mental health, terminal level, promotions, work-life balance, building an audience on LinkedIn, and the future of data engineering. For data engineering best practices, Zach's career journey, and working in FAANG, please go to the previous episode "Demystify data engineering". If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Zach's YouTube: https://www.youtube.com/channel/UCAq9f7jFEA7Mtl3qOZy2h1A (00:00:00) Introduction (00:00:41) Starting his career at 20 (00:06:38) Change of mindset (00:11:35) His ambition (00:15:57) Hobbies he developed (00:19:28) Maintaining work-life balance (00:21:31) His thoughts about the terminal level (00:26:55) Career advice about job burnout (00:33:11) His daily routine (00:39:34) Good projects to learn (00:46:22) Doing projects using JavaScript (00:47:56) What are the lessons he can share (00:52:26) Growing a large audience (01:00:13) His goal on sharing (01:04:12) Advice for people who want to grow an audience on LinkedIn (01:09:28) Something he is excited about in the future (01:14:04) Future of data engineering (01:21:01) Connect with Zach
11/9/2021 1 hour, 21 minutes, 57 seconds
Episode Artwork

Demystify data engineering, 3 common mistakes, FAANG's culture, how to say no at work with Zach Wilson - The Data Scientist Show #011

Zach Wilson is a tech lead at Airbnb building data pipelines; previously he worked at Netflix and Facebook. Zach graduated from college at the age of 20 with degrees in math and computer science. He has over 80k followers on LinkedIn. We talked about common data engineering mistakes, best practices, soft skills, how to say no at work, and his work experience at Facebook, Netflix, and Airbnb. This is part one of our conversation; please go to next week’s episode for part two. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Zach's LinkedIn: https://www.linkedin.com/in/eczachly/ (00:00:00) Introduction (00:00:43) How did he get into DE (00:01:23) Data Infrastructure vs Data Engineer (00:02:44) Day-to-day at Airbnb (00:05:57) How much data science DEs need to know (00:10:07) Common mistakes of DEs (00:14:53) Good questions to ask stakeholders (00:18:10) Communicating with data scientists and software engineers (00:20:39) Frustrations when working with data scientists (00:24:34) Setting up processes (00:26:22) High-quality pipeline (00:28:42) High-impact data engineering project (00:33:13) Mistakes he made early in his career (00:38:14) Core DE skills that juniors must know (00:40:50) How to go to the next level (00:44:15) Meeting his mentor (00:46:02) Some advice from mentors (00:48:00) Best advice about influencing without a title (00:49:45) Working at Facebook, Netflix, and Airbnb
11/4/2021 1 hour, 9 seconds
Episode Artwork

Build a killer analytics dashboard for your CEO; data visualization best practices with Kate Strachnyi - The Data Scientist Show #010

Kate Strachnyi is the founder of DATAcated – delivering training on data visualization, data storytelling, and dashboard best practices. She has over 150k followers on Linkedin. We talked about how she got into data analytics without a background in math, what makes a good dashboard, how to work with executives, how to tell stories with data, what she’s looking for when hiring a data analyst, and the psychology of color! If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu DATAcated: https://datacated.com/
10/28/2021 1 hour, 24 minutes, 26 seconds
Episode Artwork

Ace the data science interview; build kick-ass portfolio projects with Nick Singh - The Data Scientist Show #009

Nick Singh is a career coach and the co-author of "Ace the Data Science Interview". He has over 60k followers on LinkedIn, and previously worked at Facebook and Google. We talked about how to prepare for data science interviews, how to build a portfolio, what makes a candidate stand out, how to write cold emails to recruiters, and his career journey. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu You can find his book on Amazon: https://www.amazon.com/dp/0578973839?&linkCode=sl1&tag=nicksingh03-20&linkId=4fa541a539320e8936926cb3a5167881&language=en_US&ref_=as_li_ss_tl Nick's LinkedIn: https://www.linkedin.com/in/nipun-singh/
10/21/2021 56 minutes, 25 seconds
Episode Artwork

Solving the brain with machine learning; the secret to a successful career with Konrad Kording - The Data Scientist Show #008

Konrad Kording is a neuroscientist and professor at the University of Pennsylvania. Konrad is trying to understand how the world and the brain work using data. He is known for his research in computational neuroscience. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Konrad's Twitter: https://twitter.com/KordingLab The online community of computational neuroscientists he's working on: http://neuromatch.io/ We talked about: - Is evolution gradient descent? - What makes a data scientist competitive? - His three principles of doing good science - Why do we need causal inference in AI? - Should we optimize our brain's 'loss function' to make us happier? - The secret to a good career - Three rules he follows for doing good science - Is deep learning a bubble? - How did he get to where he's at today
10/14/2021 1 hour, 38 minutes, 39 seconds
Episode Artwork

How do data scientists get into blockchain? How to build a career by networking online, Greg Osuri - The Data Scientist Show #007

A seasoned open-source developer of 25+ years, Greg Osuri is the CEO and co-Founder of Akash Network, an open-source decentralized cloud that provides fast, efficient, and low-cost application deployment. Prior to Akash Network, Greg founded AngelHack, the world’s largest hackathon organization with over 200,000 developers in 164 cities across the globe. At AngelHack, he helped launch several developer companies including Firebase, which was acquired by Google in 2014. Greg launched his career at IBM and later designed Kaiser Permanente’s first cloud architecture. As an expert in open-source, distributed systems, and blockchain development, and an applied economist, Greg is a featured international speaker and has spoken recently at events including Kong Summit, Block-Con, and Block to the Future. His work has been featured in top-tier publications including BeInCrypto, CoinDesk, Cointelegraph, Forbes, TechCrunch, and Yahoo! Finance. Greg was instrumental in the passing of California’s first Blockchain law, providing the first expert-witness testimony at the Senate. About Akash Network: Akash Network, the world's first decentralized and open-source cloud, accelerates deployment, scale, efficiency and price performance for high-growth industries like blockchain and machine learning/AI. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Greg's Twitter: https://twitter.com/gregosuri
10/8/2021 1 hour, 20 minutes, 29 seconds
Episode Artwork

Human-centered design for AI; working with Fei-Fei Li; human first design for AI, Andrew Kondrich - The Data Scientist Show #006

Andrew Kondrich is a machine learning engineer at Scale. We talked about his career journey, human first design for AI, how to get into machine learning, and what kind of candidates companies are looking for. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu
10/1/2021 41 minutes, 24 seconds
Episode Artwork

From history major to data science manager, when you shouldn't use data, Bryan Davis - The Data Scientist Show #005

Bryan is a data science manager; previously he worked at Facebook and Indeed as a senior data scientist. Bryan specializes in ad system design, ad ranking, and A/B testing platforms. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu We talked about: - how he got into data science as a history major - when not to use data science to make decisions - how data scientists should influence the company's culture - how data scientists can have a competitive edge in the future - what is the ad ranking problem - data science books for game theory - how to use game theory in real life
9/24/2021 1 hour, 23 minutes, 25 seconds
Episode Artwork

How to get your dream job without applying online, with Jerry Lee - The Data Scientist Show #004

Jerry is the COO/Founder of Wonsulting and an ex-Senior Strategy & Operations Manager at Google who used to lead Product Strategy at Lucid. After graduating from Babson College, Jerry was hired as the youngest analyst in his organization and was promoted multiple times in 2 years to his position at Google. With Wonsulting, Jerry partners with universities & organizations (220+ to date) to help others land their dream careers. He's amassed 250,000+ followers across LinkedIn, TikTok & Instagram and has reached 40M+ professionals. In addition, his work has been featured on Forbes, Newsweek, Business Insider, Yahoo! News, and LinkedIn, and he was elected as the 2020 LinkedIn Top Voice for Tech. Jerry shared his expert advice on how to network effectively, how to send messages to recruiters, and how he used data analytics to solve million-dollar business problems. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu
9/17/2021 1 hour, 7 minutes, 45 seconds
Episode Artwork

Transition into machine learning as an engineer, two mistakes ML scientists should avoid. Alexey Grigorev - The Data Scientist Show #003

Alexey Grigorev is a principal data scientist at OLX Group. He is also the founder of Data Talks Club with 4,100 members. He wrote a book called "Machine Learning Bookcamp" to help people learn machine learning by doing projects. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu We talked about: - How Alexey transitioned into machine learning - What kind of project helped him get his job - 2 mistakes new data scientists often make - Why do you need to know the baseline - What makes you stand out as a candidate - His free machine learning course #datascience #machinelearning #ai #ml #career
9/8/2021 1 hour, 6 minutes, 50 seconds
Episode Artwork

The future of data scientists; network like a champion with Jim Zheng - The Data Scientist Show #002

Jim Zheng is an engineering manager at Flexport, building the data platform; he was a data scientist at Salesforce, worked at Yahoo as a UX designer, and was a researcher in computer science at Stanford. He is also the cofounder of Senpai, an audio platform for experts to share domain knowledge.  If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science. Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu Ask Jim a question on Senpai: https://beta.senpai.so/ataki12 Jim's article on how to send cold emails: https://www.linkedin.com/pulse/hiring...  We talked about: - Jim's career path in engineering and data science - What makes a great data scientist  - What's the future of data scientists  - What's a 'human cloud'  - How to network like a champion  - Best way to work with mentors  - Career advice to his younger self   - Life lessons he learned from playing chess  
9/3/2021 1 hour, 20 minutes, 55 seconds
Episode Artwork

Build resilient machine learning models; advice for ML careers - Gerald Friedland - The Data Scientist Show #001

Gerald Friedland is the CTO of an AI company (Brainome) and a professor at UC Berkeley. Listen to his advice on how to build more resilient machine learning models and get inspired by his career journey!  If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science and career. Gerald’s LinkedIn: https://www.linkedin.com/in/geraldfriedland/ Gerald’s Class: https://www2.eecs.berkeley.edu/Courses/CS294_3438/ Gerald’s Youtube Lectures: https://www.youtube.com/playlist?list=PL17CtGMLr0XzOsLydB0jik4UpEyoW5SOx Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/ Daliana's Twitter: https://twitter.com/DalianaLiu (00:00:00) Introduction (00:01:06) How did he get into machine learning (00:12:13) How to Reduce Overfitting (00:22:15) Technology in Brainome (00:24:13) Brainome vs auto ML (00:27:08) Measurement of a data-centric approach (00:27:32) Data Drift (00:32:28) Courses to take (00:34:33) Information theory (00:38:06) Advice for students in grad school (00:42:43) Dealing with failure and stress (00:44:05) The underdog story (00:49:12) Who is Gerald Friedland (00:50:00) Inspiration of Gerald (00:51:08) Future of machine learning (00:52:36) Tips for asking the right questions (00:53:40) Question your assumptions (00:54:10) Favorite books and courses (00:56:34) Advice on machine learning (00:57:52) Ideal qualities of an ML engineer
8/30/2021 1 hour, 3 minutes, 24 seconds