A deep dive into data scientists' day-to-day work, tools and models they use, how they tackle problems, and their career journeys. This podcast helps you grow a successful career in data science. Listening to an episode is like having lunch with an experienced mentor. Guests are data science practitioners from various industries, AI researchers, economists, and CTOs of AI companies. This is hosted by Daliana Liu, senior data scientist. Follow Daliana on Twitter (https://twitter.com/DalianaLiu) for more updates on data science, career, and this podcast.
Why data scientists are tired, six real data scientists' frustrations - The Data Scientist Show #089
Daliana interviewed six data scientists from her meetup in New York City. It's a unique episode where you get to hear the real frustrations of data scientists. We talked about the struggles of working in healthcare and finance, data quality and AI, how to advocate for yourself, and how to align with your managers.
Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
4/17/2024 • 42 minutes, 22 seconds
Why 80% of A/B tests fail, how to 10X your experimentation velocity - Kristi Angel - The Data Scientist Show #088
Most experiments fail. Kristi Angel shares her expertise on scaling experimentation and avoiding common A/B testing pitfalls. Learn five things that can help you boost test velocity, design impactful experiments, and leverage knowledge repos. (Chapters below)
Kristi Angel’s LinkedIn: https://www.linkedin.com/in/kristiangel/
Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Intro
(00:01:26) Why do most experimentations fail?
(00:07:05) Mistakes in choosing metrics
(00:10:05) Is revenue a good metric?
(00:13:18) Split metrics in three ways
(00:15:10) Daliana's story with too many category breakdowns
(00:16:59) What makes the best data science team?
(00:19:24) Data scientist work in silo vs in a data science team
(00:21:15) Building a knowledge center
(00:23:40) Example of knowledge center; nuance of experimentations
(00:26:09) How many metrics and variants?
(00:30:56) How to reduce noise - CUPED
(00:33:01) Future of A/B testing
(00:38:33) Q&A: Low statistical power
4/8/2024 • 44 minutes, 7 seconds
From physics PhD to data science leader, unexpected challenges in survey data, Python vs R, EDA best practices, building MLOps toolkit - Julia Silge - The Data Scientist Show #087
Julia Silge is an engineering manager at Posit PBC, formerly known as RStudio, where she leads a team of developers building open source software for MLOps. Before Posit, she finished a PhD in astrophysics, worked for several years in the nonprofit space, and was a data scientist at Stack Overflow, where some of her most public work involved the annual developer survey. We talked about MLOps tools, challenges in survey data, text analysis, and balancing her interests in data science and engineering.
Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:00:56) Getting into data science
(00:04:50) Transition from data centers to engineering manager
(00:14:04) Common challenges in tool development
(00:17:38) Challenges with survey data
(00:26:47) Engineering skills for data scientists
(00:28:59) Balancing roles
(00:34:49) Developing skills in Exploratory Data Analysis (EDA)
(00:39:19) Python vs. R for data analysis
(00:44:40) Exciting aspects in career and personal life
3/30/2024 • 46 minutes, 21 seconds
Why he created Pandas, the future of data systems, why he left his CTO role to become a chief architect - Wes McKinney - The Data Scientist Show #086
Wes McKinney is the co-creator of the pandas library and a cofounder of Voltron Data. Currently he is a Principal Architect at Posit and an investor in data systems.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
Wes' LinkedIn: https://www.linkedin.com/in/wesmckinn/
(00:00:00) Introduction
(00:00:44) How Pandas Started
(00:06:40) Voltron Data
(00:10:03) Benefits of Easy-to-Use Data Tools
(00:13:20) The Rise of New Data Tools
(00:18:07) Choosing Tools: Vertical or Flexible?
(00:23:01) Big Models and Data Tools
(00:29:29) Challenges in Building a Product
(00:31:28) Becoming a Top Architect
(00:34:55) Missed Aspects of Previous Roles
(00:39:04) A Busy Week: Advising, Designing, Investing
(00:43:42) Improving Open Source
(00:45:24) How to Decide What to Work On
(00:46:28) What he’s learning now
(00:47:56) Excitement in Career and Life
(00:48:29) Using ChatGPT for Learning
(00:50:27) Future Impact Goals
3/22/2024 • 52 minutes, 28 seconds
From financial analyst to director of analytics, how to get promoted quickly, 7 elements of influence - Christopher Fricker - The Data Scientist Show #085
Christopher Fricker is a senior director in analytics and BI at Renaissance Learning. He started his career in finance and later became a data science consultant working with Meta, Netflix, and pre-IPO tech companies doing analytics. We talked about the mental models that helped him grow from a finance analyst to an analytics leader.
Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Chris’ LinkedIn: https://www.linkedin.com/in/christopherfricker/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:01:45) How to get promoted quickly
(00:08:40) Power vs authority
(00:11:21) First-principles thinking
(00:32:38) ROI of a data team
(00:41:01) How to be persuasive
(00:55:27) All Data is wrong
(00:56:57) How he audits the data
(01:01:28) How to make someone help you at work
3/15/2024 • 1 hour, 15 minutes, 2 seconds
Adapters: the game changer for fine-tuning - Geoffrey Angus - The Data Scientist Show #084
I interviewed Geoffrey Angus, ML team lead at Predibase, to talk about why adapter-based training is a game changer. We started with an overview of fine-tuning and then discussed five reasons why adapters are the future of LLMs. Later we also shared a demo and answered questions from the live audience. Try fine-tuning for free: https://pbase.ai/GetStarted
Geoffrey’s LinkedIn: https://www.linkedin.com/in/geoffreyangus
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Intro
(00:01:19) What is Fine-tuning?
(00:08:18) Utilizing Adapters for Finetuning Enhancement
(00:09:50) 5 reasons why adapters are the future of LLMs
(00:26:34) Common Mistakes in Adapters Usage
(00:28:34) Training Your Own Adapter
(00:32:23) Behind the Scenes of the Adapter Training Process
(00:37:51) Config File Guidance for Fine-Tuning
(00:39:41) Debugging Strategies for Suboptimal Fine-Tuning Results
(00:42:23) User Queries: Creating a LoRA Adapter and Future Support
(00:51:06) Key Takeaways and Recap
3/8/2024 • 52 minutes, 45 seconds
Landing a job by analyzing Seattle's crime data, from data scientist to founder of interview query, building a lifestyle business - Jay Feng - The Data Scientist Show #083
Jay Feng created a viral project using Seattle crime data, which led him into data science. He later founded Interview Query, which helps data scientists get jobs. We talked about how he landed his data science job through his blog and his journey from data scientist to founder.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
Jay Feng's LinkedIn: https://www.linkedin.com/in/jay-feng-ab66b049/
Jay Feng's YouTube: https://www.youtube.com/c/DataScienceJay
(00:00:00) Introduction
(00:01:33) From engineer to data scientist
(00:03:32) Got a job through a project
(00:05:59) Daliana's portfolio project with Zillow
(00:09:39) From data scientist to entrepreneur
(00:13:40) "Tinder" for jobs
(00:15:31) How he chose companies to work for
(00:16:22) Why he became an entrepreneur
(00:18:02) How many hours does he work
(00:19:19) Challenges when building Interview Query
(00:20:44) Speed vs scale
(00:22:36) Growth hacks he used
(00:24:48) YouTube vs newsletter
(00:27:46) Lessons he learned as a CEO
(00:29:42) How to grow from tech employee to founder
(00:32:26) How he defines success
(00:35:05) If you have a business idea for Jay
2/29/2024 • 36 minutes, 6 seconds
Case studies from the GenAI frontier, scaling ML teams, from biologist to machine learning consultant - Erik Gafni - The Data Scientist Show #082
Erik Gafni builds AI systems and teams. He founded Eventum AI (https://bit.ly/eventum-ai), an ML consulting company working with high-growth startups. We talked about GenAI projects he worked on, how he built production ML systems, how to scale ML teams, and his journey from biologist to ML researcher.
Interested in working with Erik: https://bit.ly/erik-consulting
Erik's LinkedIn: https://bit.ly/erik-gafni-LI
(00:00:00) Introduction
(00:01:18) Is GenAI overhyped?
(00:03:48) Accent translation with AI
(00:11:17) Social media app with AI
(00:13:19) Stable diffusion model evaluation
(00:15:16) "Consult-to-hire" model
(00:16:54) AI in biotech
(00:22:06) Self-supervised learning
(00:30:43) How he hires people
(00:32:40) Research vs production
(00:35:19) Is AGI coming?
(00:36:51) New trends in GenAI
(00:41:07) Data quality in GenAI
(00:42:19) Philosophy in LLMs
(00:47:38) OpenAI vs Open Source
(00:53:20) Mistakes he made
(00:57:02) How did he get into ML
2/24/2024 • 1 hour, 3 minutes, 2 seconds
Data science job market in 2024, softskills for interviews, AI engineering - Jay Feng - The Data Scientist Show #081
Jay Feng is the CEO of Interview Query, a service that helps data scientists get jobs. Previously he worked as a data scientist at Nextdoor and Monster. We talked about the data science job market, the rise of AI engineering, and the soft skills people overlook during interviews. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
Jay Feng's LinkedIn: https://www.linkedin.com/in/jay-feng-ab66b049/
Jay Feng's YouTube: https://www.youtube.com/c/DataScienceJay
(00:00:00) Introduction
(00:01:11) Data science job market in 2024
(00:09:13) Build projects with AI
(00:16:19) Soft skills in interviews
(00:23:18) Daliana's story on "socializing ideas"
(00:28:38) Common mistakes in interviews
(00:35:30) Product DS vs ML interviews
(00:36:27) Product analytics interview questions
(00:39:18) Career transition in DS
(00:43:04) Jay's career journey
(00:45:38) Is there a principal data analyst?
(00:51:52) AI engineer
(00:54:28) New roles vs obsolete roles in DS
(01:04:46) Is data science dead?
2/16/2024 • 1 hour, 6 minutes, 39 seconds
How to handle being laid off (as data scientists), severance negotiation, full-time employment vs independent consultant - The Data Scientist Show #080
We are joined by two data scientists who have firsthand experience with layoffs. We’ll talk about how to negotiate severance packages, how to handle stress, strategies for job hunting post-layoff, and how to reduce risks in full-time employment.
Working with Daliana on personal branding: https://forms.gle/heNuZzaHjaAMQwLu6
Guests:
Susan Shu Chang:
LinkedIn: https://www.linkedin.com/in/susan-shu-chang/
Newsletter: susanshu.substack.com
Sundar Swaminathan:
LinkedIn: https://www.linkedin.com/in/sswamina3/
Website: https://www.sundarswaminathan.com/
2/9/2024 • 1 hour, 6 minutes, 33 seconds
From data analyst to sales engineer, personality-based career design, sales skills for data people - Jenny Wu - The Data Scientist Show #079
Jenny Wu is a data analyst turned sales engineer for data products at Hex. We talked about sales engineer vs data analyst, how to design a career based on your personality, and how to transition into a customer-facing role. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Jenny’s LinkedIn: https://www.linkedin.com/in/jenny-wu-...
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:01:34) What is a Sales Engineer?
(00:09:35) Sales Engineering Day-to-Day
(00:13:09) Challenge in sales
(00:21:37) Traits of Successful Salespeople
(00:30:32) Stakeholder Engagement
(00:36:24) Getting into customer-facing roles
(00:43:55) Quitting her job to travel the world
(00:48:05) Advice on Career Breaks
(00:50:39) Embedding Career and Personal Goals
(00:51:57) How do you achieve happiness?
2/1/2024 • 57 minutes, 26 seconds
The future of data science teams, integrating AI into data science workflows, building data apps for stakeholders - Barry McCardel - The Data Scientist Show #078
Barry McCardel is the cofounder and CEO of Hex (free trial: hex.tech/dsshow), a collaborative data workspace. Their customers include Fivetran, Notion, and Anthropic. We talked about what the future of data teams looks like, how to tackle the challenges of data team collaboration, and how to leverage AI in data science workflows.
60-day Free Trial: hex.tech/dsshow
Barry’s LinkedIn: https://www.linkedin.com/in/barrymccardel
(00:00:00) Introduction
(00:01:25) Is AI replacing data scientists?
(00:06:08) Are data science teams getting smaller?
(00:09:54) What is Hex?
(00:11:24) How to communicate with stakeholders
(00:24:29) Should data scientists be full stack?
(00:31:23) How data teams measure ROI
(00:33:35) Quantitative vs qualitative analysis
(00:35:33) When you shouldn't use data? Data vs product intuition
(00:41:39) How to hire your first data team?
(00:48:59) Is the modern data stack dead?
(00:53:55) GenAI in data science workflows
(00:59:03) Future of data scientist
(01:02:30) New features in Hex
1/21/2024 • 1 hour, 4 minutes, 50 seconds
Product data science for Microsoft AI, data scientist's role of GenAI, how to deal with burn out - Sid Sharan - The Data Scientist Show #077
Siddhartha Sharan is a Senior Data and Applied Scientist at Microsoft, helping product teams make data-driven decisions. Currently he is working on an AI product built with OpenAI APIs for sentiment analysis. We talked about how he evaluates AI products built with large language models at Microsoft, product data science, and how he went from a business background to data science. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Sid’s LinkedIn: https://www.linkedin.com/in/siddharthasharan/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:05:20) How does Microsoft evaluate AI products
(00:16:17) Using OpenAI API for sentiment analysis
(00:25:29) Microsoft data science team culture
(00:26:52) DS, PM collaboration
(00:28:29) Three steps to build trust in data science
(00:30:13) How did he get into Microsoft
(00:34:09) Level up in Genetech
(00:36:09) ML engineer vs Product DS
(00:37:43) Core skills in product DS
(00:40:20) Hiring
(00:42:47) How to deal with burnout
(00:45:03) Should you overwork to earn trust?
(00:45:44) Daliana's story about first day at Amazon
(00:49:54) Will AI replace data scientists?
(00:51:32) Data scientist's role in GenAI
(00:54:32) How to keep up with GenAI
1/15/2024 • 58 minutes, 57 seconds
How she doubled her salary in a year as a data analyst, SQL in the real world, is job hopping bad? - Jess Ramos - The Data Scientist Show #076
Jess Ramos is a Senior Data Analyst at Crunchbase, a LinkedIn Learning Instructor, and a content creator in the data space. She has a bachelor's degree in Math, Spanish, and Business from Berry University and a master's in Business Analytics from the University of Georgia. We talked about SQL in the real world, data analyst vs data scientist, whether job hopping is bad, and how she negotiated her salary. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
Jess’ LinkedIn: https://www.linkedin.com/in/jessramosmsba/
(00:00:00) Introduction
(00:01:24) Why Jess left her job at Freddie Mac
(00:03:25) Is job hopping bad
(00:04:42) How to explain short job stints when interviewing
(00:06:49) Jess's day-to-day work and tech stack
(00:09:15) SQL in the real world
(00:12:10) How to talk data to stakeholders
(00:18:33) How Jess prepares for SQL interviews
(00:28:11) Data analysts vs data scientists
(00:32:11) Choosing a career path
(00:47:19) How to ask recruiter questions
(00:50:15) Jess's LinkedIn content creation journey
(00:59:03) The future of Jess's career
(01:03:42) Jess's favorite books
1/5/2024 • 1 hour, 7 minutes, 48 seconds
How we went from "enemies" to allies while working at Amazon, from civil engineering to machine learning and generative AI at AWS - Mehdi Noori - The Data Scientist Show #075
Mehdi Noori is an applied science manager at the Generative AI Innovation Center at Amazon. I used to work with Mehdi at the Machine Learning Solutions Lab at AWS. Before Amazon, Mehdi was a data scientist working on marketing intelligence. Mehdi has a PhD in civil engineering and sustainability from the University of Central Florida. Subscribe to Daliana's newsletter for more on data science and career: www.dalianaliu.com
Mehdi Noori: https://www.linkedin.com/in/mehdi-noori/
Predicting Soccer Goals: https://aws.amazon.com/blogs/machine-learning/predicting-soccer-goals-in-near-real-time-using-computer-vision/
12/6/2023 • 1 hour, 31 minutes, 53 seconds
Why she quit her finance job to become a farmer, exploring a different path from the modern life - Misty Arnold - The Data Scientist Show #074
My friend Misty moved to a farm in Portugal after a 20-year career in finance. We talked about her experience moving from busy corporate life to farm life, where she does a lot of manual work: whether it was challenging, how her finances work, and her advice for other people who want to explore a different path outside modern city life. I hope this episode gives you a different perspective on your career.
Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:11:41) Life on the farm
(00:15:46) Her finance plans
(00:22:55) Her career journey
(00:27:14) What do accountants do
(00:32:29) I thought I would be happy
(00:41:25) Daliana's personal view about finance; when it's enough for you
(00:44:41) Does she feel lonely on a farm?
(00:48:39) What if she didn't leave the corporate world?
(00:54:07) Does she regret her decision
11/29/2023 • 1 hour, 10 minutes, 28 seconds
Why he left his MLE job for product data science at Meta, data science at Uber, LinkedIn, and TrueCar - Pan Wu - The Data Scientist Show #073
Pan Wu is a senior manager of data science at Meta. We talked about why he moved from machine learning to product data science, projects he worked on at Uber, LinkedIn, and Meta, and how he transitioned from IC to manager. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Pan’s LinkedIn: https://www.linkedin.com/in/panwu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:01:30) Why he transitioned from MLE to product DS
(00:07:38) Meta data scientists skill sets
(00:15:49) When his interest shifted from MLE to product DS
(00:18:04) Is MLE more respected?
(00:25:46) A/B testing deep dives in 3 steps
(00:28:21) Built a tool at LinkedIn
(00:35:52) How to sell your project
(00:41:07) Junior vs senior data scientist
(00:43:24) From staff data scientist to manager
(00:45:18) Explore being a manager
(00:46:24) Cultures at Uber, LinkedIn, TrueCar
(00:52:09) Data science over the past 10 years
(00:55:06) MLE vs DS fun and frustration
(00:57:26) Product DS reality
(00:59:10) Learning new skills
(01:01:39) Mistakes he made
(01:06:34) Future of data science
(01:08:04) Will data scientists be replaced by AI
(01:09:42) Three skills he looks for when hiring
11/19/2023 • 1 hour, 13 minutes, 1 second
Machine learning in cybersecurity, computer vision in sports, from business analyst to ML engineer - Betty Zhang - The Data Scientist Show #072
Betty Zhang is a data scientist at a cloud security company. Previously she was a data scientist at Amazon Web Services. We talked about her computer vision projects in sports, data science use cases in cybersecurity, her path from business major to data scientist, and her experience working at startups vs big tech companies. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Betty’s LinkedIn: https://www.linkedin.com/in/betty-zhang-0bb63731/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/
(00:00:00) Introduction
(00:01:21) Computer Vision Project in Sports at AWS
(00:12:28) Challenges in computer vision
(00:14:02) Time allocation for ML projects
(00:15:22) 3 key skills for computer vision
(00:17:20) From business analyst to ML engineer
(00:18:14) How she got her data scientist job through LinkedIn
(00:21:32) How she got into Amazon
(00:22:17) Three tech skills needed during Amazon interviews
(00:26:11) Why she joined a cybersecurity startup
(00:27:22) Three cybersecurity use cases
(00:29:47) Anomaly detection
(00:30:40) ML for cybersecurity
(00:34:43) Tech stacks Amazon vs Startups
(00:39:35) Startups vs big tech
(00:45:56) Balance learning and impact
(00:48:35) Advice for new data scientists
11/12/2023 • 55 minutes, 12 seconds
Stop abusing A/B testing, toxic experimentation culture, how to run A/B tests with rigor - Che Sharma - The Data Scientist Show #071
Che Sharma came back to discuss toxic behaviors in experimentation culture and to give actionable advice on how to handle those situations and how to maintain rigor and integrity when designing and analyzing A/B tests.
Che was the 4th data scientist at Airbnb, later he joined Webflow as an early employee. In 2021 he founded Eppo, a next-gen A/B experimentation platform designed for modern data and product teams to run more trustworthy and advanced experiments. We talked about A/B testing best practices, A/B testing for ML models, and Che’s career journey.
Reach out to Che: https://www.linkedin.com/in/chetanvsharma/
11/4/2023 • 1 hour, 3 minutes, 42 seconds
Academia vs. Industry for Machine Learning, Research at Uber AI Labs, ML for Wind Farms - Jason Yosinski - The Data Scientist Show #070
Jason Yosinski was a founding member of Uber AI Labs. He is also a co-founder of Windscape AI, a company dedicated to using custom sensor networks and machine learning to increase the efficiency and sustainability of wind farms. Jason holds a PhD in computer science from Cornell University. We talked about his experience at Uber AI, his research in deep learning, and ML for wind farms. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Jason’s Website: https://yosinski.com/
Jason’s LinkedIn: https://www.linkedin.com/in/jasonyosinski/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu
(00:00:00) Introduction
(00:06:06) His advice for Uber ML teams
(00:16:03) From research to industry
(00:20:24) ML for wind farms
(00:25:40) Metrics for wind energy prediction
(00:29:23) Start with a small dataset
(00:32:00) ML in academia vs. the industry
(00:33:24) Do you need a PhD for ML?
(00:38:14) Daliana's story about grad school
(00:41:37) The value of a PhD
(00:43:13) ML Collective
(00:48:36) Technical communication
(00:57:21) ML Skillsets
(00:59:45) Future of machine learning
(01:05:23) Personal development: Hoffman process
(01:15:13) Do things that excite you
10/23/2023 • 1 hour, 16 minutes, 9 seconds
Ads forecasting at Netflix and Spotify, how to build your personal moat - Jeff Li - The Data Scientist Show #069
Jeff Li is a senior data scientist at Netflix, focusing on ads forecasting. Previously he was a data science manager at Spotify, where he worked on supply forecasting, demand forecasting, and data infrastructure. He studied business at the University of Southern California. We talked about ads forecasting, the manager vs IC career path, and the cultures at Spotify, Netflix, and DoorDash.
Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Jeff Li’s LinkedIn: https://www.linkedin.com/in/lijeffrey/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu
9/14/2023 • 1 hour, 26 minutes, 29 seconds
A/B testing at Airbnb, building next-gen experimentation platform at Eppo - Che Sharma - The Data Scientist Show #068
Che Sharma was the 4th data scientist at Airbnb, later he joined Webflow as an early employee. In 2021 he founded Eppo, a next-gen A/B experimentation platform designed for modern data and product teams to run more trustworthy and advanced experiments. We talked about A/B testing best practices, A/B testing for ML models, and Che’s career journey. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Che’s LinkedIn: https://www.linkedin.com/in/chetanvsharma/
Try Eppo for A/B testing: https://www.geteppo.com/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu
(00:00:00) Introduction
(00:01:26) Getting started in data science at Airbnb
(00:03:08) Keys to successful A/B testing
(00:06:53) Interpreting and communicating A/B test results
(00:15:00) A/B testing best practices; testing machine learning models
(00:41:39) Centralizing experiment analysis
(00:53:46) Preparing data scientists for the future
(00:59:33) Developing communication skills as a data scientist
(01:08:43) Transitioning from individual contributor to manager
(01:12:28) The future of experimentation
8/25/2023 • 1 hour, 14 minutes, 15 seconds
From data scientist@Meta to full-time YouTuber (500k+ sub), AI engineering, future of work - Tina Huang - The Data Scientist Show #067
We talked about self-learning, productivity, how Tina navigates her career change and how she thinks AI could change the future of work.
Tina's YouTube: www.youtube.com/@TinaHuang1
Lonely Octopus: www.lonelyoctopus.com
Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Tina Huang is a data scientist turned YouTube creator with 500k subscribers. She is the founder of Lonely Octopus, an online program helping people gain data science, AI, and freelancing skills. She originally studied pharmacology before transitioning into tech, completing a master's degree in computer science at UPenn.
(00:02:38) Transitioning from Data Science to Content Creation
(00:06:29) Preparing for Data Science Interviews
(00:10:59) Starting a YouTube Channel
(00:14:18) Building Multiple Income Streams
(00:17:35) Getting Started with AI Skills
(00:29:29) Advice for Starting YouTube
(00:34:47) Improving Storytelling Skills
(00:36:58) Overcoming Procrastination
(00:42:33) The Future of Work
(01:26:49) Income Breakdown
(01:47:08) Looking to the Future
8/10/2023 • 1 hour, 54 minutes, 52 seconds
Making LLMs hallucinate less, how to diagnose ML models, from PM in Google AI to CEO of Galileo - Vikram Chatterji - The Data Scientist Show #066
Vikram is the co-founder of Galileo, an AI diagnostics and explainability platform used by data science teams building NLP, LLM, and computer vision models across the Fortune 500 and high-growth startups. Prior to Galileo, Vikram led Product Management at Google AI, where his team built models for the Fortune 2000 across retail, financial services, healthcare, and contact centers. He has a master's degree from the School of Computer Science at Carnegie Mellon University. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Vikram Chatterji’s LinkedIn: https://www.linkedin.com/in/vikram-chatterji/
"The Mom Test": https://www.amazon.com/The-Mom-Test-Rob-Fitzpatrick-audiobook/dp/B07RJZKZ7F
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu
(00:00:00) Introduction
(00:04:24) How he got into machine learning
(00:06:53) Diagnosing large language models
(00:09:56) Addressing model hallucination
(00:12:46) Metrics for measuring hallucination
(00:17:30) From Google AI to starting Galileo
(00:24:08) Developing LLMs and putting them into production
(00:32:51) Galileo's diagnostics and explainability platform
(00:43:16) Advice for data scientists when joining a startup
8/1/2023 • 1 hour, 26 minutes, 50 seconds
Data Science "Mixed Martial Arts", applied reinforcement learning, scaling AI workloads using Ray - Max Pumperla - The Data Scientist Show #065
Max Pumperla designed his own career path in data science. He is a freelance software engineer at Anyscale and also a data science professor. We talked about reinforcement learning, open source contributions, Ray for data scientists, and his view on the data scientist's role. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Max’s LinkedIn: https://www.linkedin.com/in/max-pumperla-a8099354/
Max's GitHub: https://github.com/maxpumperla
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu
(00:00:00) Introduction
(00:09:19) How he got a remote job through Twitter
(00:14:06) Introduction to Ray
(00:18:52) Reinforcement learning
(00:23:56) Key lessons on integrating customer feedback
(00:35:12) Flaws in data science job titles
(00:45:51) How to be irreplaceable as a data scientist
(00:48:55) An unconventional career path as a data scientist
(01:12:24) Productivity and work-life balance
(01:28:10) Advice for building a personal brand
7/28/2023 • 1 hour, 53 minutes, 28 seconds
Uber's ML Systems (Uber Eats, Customer Support), Declarative Machine Learning - Piero Molino - The Data Scientist Show #064
Piero Molino was one of the founding members of Uber AI Labs. He worked on several deployed ML systems, including an NLP model for Customer Support and the Uber Eats Recommender System. He is the author of Ludwig, an open source declarative deep learning framework. In 2021 he co-founded Predibase, the low-code declarative machine learning platform built on top of Ludwig.
Piero's LinkedIn: https://www.linkedin.com/in/pieromolino
Predibase free access: bit.ly/3PCeqqw
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu
(00:00:00) Introduction
(00:01:54) Journey to machine learning
(00:03:51) Recommender system at Uber Eats
(00:04:13) Projects at Uber AI
(00:09:34) Uber's customer obsession ticket system
(00:16:01) How to evaluate online-offline business and model performance metrics
(00:17:16) Customer Satisfaction
(00:28:38) When do you know whether a project is good enough
(00:41:50) Declarative machine learning and Ludwig
(00:45:32) Ludwig vs AutoML
(00:54:44) Working with Professor Chris Re
(00:58:32) Why he started Predibase
(01:07:56) LLM and GenAI
(01:10:17) Challenges for LLMs
(01:22:36) Advice for data scientists
(01:34:29) Career advice to his younger self
7/4/2023 • 1 hour, 50 minutes, 5 seconds
Data science in transportation, the intersection of operational research and ML - Holger Teichgraeber - The Data Scientist Show #063
Holger Teichgraeber is a Data Science Manager at Archer Aviation. Previously, he worked at Convoy as a Research Scientist on their trucking marketplace, and at various companies in the energy space. Holger has a Bachelor's degree in Mechanical Engineering from Aachen, Germany, and a Master's and Ph.D. from Stanford University, with a research focus on machine learning and optimization applied to energy systems. He regularly writes on LinkedIn, with the goal of showing how to build valuable products at the intersection of machine learning and optimization in production. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.
Holger's LinkedIn: https://www.linkedin.com/in/holgerteichgraeber/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu
(00:00:00) Introduction
(00:01:28) How he got into operations research
(00:02:39) Operations research vs data science
(00:04:37) Trucking optimization at Convoy
(00:08:42) Optimization problem
(00:10:18) Strategic planning on air mobility at Archer
(00:13:50) Using simulation and solving a problem
(00:16:45) Big data science work vs smaller data science work
(00:21:23) Stakeholder management
(00:29:28) IC vs Manager
(00:32:04) Advice on promotion
(00:39:12) Work cultures in Germany and the US
(00:41:16) How to handle tight deadlines
(00:43:21) Important feedback from his work
(00:44:14) How to plan projects
(00:44:45) Next big challenge for data science teams
(00:45:40) Career growth in the next few years
(00:46:01) Connect with Holger
6/26/2023 • 46 minutes, 53 seconds
Tackling data quality issues, 5 pillars of data observability, from management consultant to CEO of Monte Carlo - Barr Moses - The Data Scientist Show #062
Barr Moses is a consultant turned CEO & Co-Founder of Monte Carlo, a data reliability company. She started her career as a management consultant at Bain & Company and a research assistant at the Statistics Department at Stanford University. Later, she became VP of Customer Operations at customer success company Gainsight, where she built the data and analytics team. She also served in the Israeli Air Force as a commander of an intelligence data analyst unit. Barr graduated from Stanford with a B.Sc. in Mathematical and Computational Science. Today, we’ll talk about Barr’s career journey, data reliability and observability, and what it means for data teams. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science.
Barr's LinkedIn: https://www.linkedin.com/in/barrmoses/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu
(00:00:00) Introduction
(00:01:24) How she got into data science
(00:08:26) Frameworks for data-driven decisions
(00:11:20) Is a customer support ticket always bad?
(00:15:20) How to quickly find out what is true
(00:20:17) Struggles in the data team
(00:23:37) Daliana’s story about lineage
(00:28:00) People stressed about data
(00:28:09) Netflix was down because of wrong data
(00:30:40) Common issues with data quality
(00:33:14) 5 pillars of data observability
(00:39:14) How does Monte Carlo help data scientists
(00:43:08) Build in-house vs adopt tools
(00:45:48) How Daliana fixed a data quality issue
(01:02:44) How to measure the impact of the data team
(01:09:09) Mistakes she made
(01:15:28) Beat the odds
5/18/2023 • 1 hour, 21 minutes, 31 seconds
Is search dead? Google vs ChatGPT, from Google Search to enterprise search at Glean, machine learning in search, tech layoffs - Deedy Das - The Data Scientist Show #061
Deedy Das is a founding engineer at Glean, an enterprise search startup. Previously, he was a Tech Lead at Google Search working on query understanding and the sports product in New York, Tel Aviv, and Bangalore. Before that, he was an engineer at Facebook New York and graduated from Cornell University. Outside of work, Deedy writes on his blog. He published a viral resume template and his work on exposing grading flaws in the Indian education system. He also enjoys running marathons, road cycling, and playing cricket. Today we’ll talk about the search projects he worked on at Google, why he left Google, his current work at Glean, and his thoughts on whether Google is doomed because of ChatGPT. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science.
Deedy's Twitter: https://twitter.com/debarghya_das?s=20
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu
(00:00:00) Introduction
(00:01:52) What is search
(00:04:33) Query understanding
(00:12:46) Google vs ChatGPT
(00:18:24) Fixing a bug for Sundar Pichai
(00:27:33) Why he left Google
(00:30:32) How to get into search
(00:34:38) Enterprise search at Glean
(00:46:55) Advice for people who got laid off
(00:48:41) What do search engineers do
(00:51:37) How he evaluates candidates
(00:53:58) Future of search
(00:57:16) Why the web is declining
(00:59:25) Copilot and AI-powered developer tools
(01:03:46) Indian startup ecosystem
(01:07:45) India vs Silicon Valley
(01:09:48) How he grew 30k followers on Twitter
(01:13:28) Daliana and Deedy’s challenge with social media
(01:19:31) Career mistakes he made
2/21/2023 • 1 hour, 27 minutes, 6 seconds
The 100-hour work week of a self-taught machine learning researcher, how he got into Google Brain, why he started Omni - Jeremy Nixon - The Data Scientist Show #060
Jeremy Nixon is a machine learning researcher, software engineer, and startup founder. Previously he was a software engineer at Google Brain working on deep learning. Now, he is the co-founder and CEO of Omni, building an immersive information retrieval system for you and your team. He studied applied math at Harvard University. Today we’ll talk about how he got into Google Brain, his 3-month self-learning plan to learn machine learning, his startup, and how he has executed his goal relentlessly since 2016. If you enjoy the show, subscribe to the channel and leave a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science.
Jeremy's Twitter: https://twitter.com/JvNixon
Jeremy's Blog: https://jeremynixon.github.io/
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu
Jeremy's LinkedIn: https://www.linkedin.com/in/jeremyvnixon
(00:00:00) Introduction
(00:01:50) Research in Google Brain
(00:03:37) How he got into Google Brain
(00:07:56) His 3-month plan to learn ML
(00:17:55) The 100-hour workweek
(00:33:26) What if he is tired
(00:39:59) Why he founded Omni
(00:44:24) Data science problems in Omni
(00:54:42) Future of machine learning
(00:57:51) Silicon Valley is very accessible
(00:59:47) The golden handcuffs
(01:06:58) From data scientist to full-stack engineer
(01:09:06) Close-minded data scientists
(01:24:10) Advice to ML learners
(01:29:41) Something he wished that he did when he was younger
(01:37:25) The future of his career
(01:42:17) Connect with Jeremy
2/20/2023 • 1 hour, 42 minutes, 52 seconds
The power of error analysis, tree models for search relevancy, what ChatGPT means for data scientists - Sergey Feldman - The Data Scientist Show #059
Sergey Feldman is the head of AI at Alongside, providing mental health support for students. He is also a Lead Applied Research Scientist at Allen Institute for AI, where he built an ML model that improved search relevancy for scientific literature. Sergey has a PhD in Electrical and Electronics Engineering from the University of Washington. Today we’ll talk about machine learning for search, his consulting project for the Gates Foundation, AI for mental health, and career lessons. Make sure you listen till the end. If you like the show, subscribe, leave a comment, and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Sergey's LinkedIn: https://www.linkedin.com/in/sergey-feldman-6b45074b/
Data Cowboys: http://www.data-cowboys.com/
Sergey Feldman: You Should Probably Be Doing Nested Cross-Validation | PyData Miami 2019: https://www.youtube.com/watch?v=DuDtXtKNpZs
December 4th, 2018 - Breakfast with WACh with Dr. Sergey Feldman, PhD: https://www.youtube.com/watch?v=vA_czRcCpvQ
(00:00:00) Introduction
(00:01:24) Machine learning skeptic
(00:03:02) Tree-based models for search relevance
(00:14:34) How to do error analysis
(00:19:20) Nested cross-validation
(00:21:34) Model evaluation
(00:30:43) Error analysis common mistakes
(00:33:37) How to avoid overfitting
(00:35:56) Consulting project with Gates Foundation
(00:41:16) Tree-based models vs linear models
(00:45:19) Working with non-tech stakeholders
(00:50:20) Chatbot for teens’ mental health
(00:54:32) Can ChatGPT provide therapy?
(00:58:12) How he got into machine learning
(01:02:12) How to not have a boss
(01:03:46) Feelings vs Facts
(01:09:02) Future of machine learning
(01:11:30) How to prepare for the future
(01:13:39) AutoML
(01:17:12) His passion for large language models
1/24/2023 • 1 hour, 19 minutes, 43 seconds
How to build data science muscle memory, DeepChecks -- an open source ML testing suite - Philip Tannor - The Data Scientist Show #058
Philip Tannor is the Co-Founder and CEO of Deepchecks, a Python package to run checks for machine learning models. Previously, he was the head of a data science group at the Israel Defense Forces. He has a master's degree in engineering from Tel Aviv University; his thesis was about a new algorithm that combines neural networks with gradient-boosting decision trees. Today we’ll talk about his career journey, how to build your data science muscle memory, the algorithm he worked on, and how to check ML models. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science and career.
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Philip’s LinkedIn: https://www.linkedin.com/in/philip-tannor-a6a910b7/?originalSubdomain=il
Augboost: https://medium.com/@ptannor/augboost-like-xgboost-but-with-few-twists-e4df4017a5c4
(00:00:00) Introduction
(00:01:17) How did he get into ML
(00:02:52) Data science in the military
(00:08:15) How to take feedback
(00:13:24) Handling criticism
(00:15:12) What he worked on
(00:18:18) Testing deployment
(00:21:28) How to build the data science muscle memory
(00:27:09) Improving the skills of data scientists
(00:30:42) His thesis in grad school
(00:36:59) Combine NN and gradient boosting
(00:40:05) Aug boost
(00:41:15) Tools he uses
(00:45:58) Deepchecks
(00:50:46) Most challenging part of building Deepchecks
(00:52:05) How can people contribute
(00:53:40) Behind the scenes
(00:56:09) Deciding how to fix or improve the model
(01:00:49) Advice for those who want to create open-source projects
(01:04:07) Features to add for the enterprise product
(01:06:57) About his life and career right now
(01:08:27) Connect with Philip
12/7/2022 • 1 hour, 8 minutes, 51 seconds
The Daliana Special: how I got into data science, 5 things only experienced data scientists know, and why I started "The Data Scientist Show" - Daliana Liu #057
Who is Daliana? This is a conversation I had in 2021 with Harpreet Sahota. I talked about my unexpected journey to data science all the way back in high school, things I wish I had known earlier in my career, the projects I worked on, what it is like to be a quote-unquote influencer on LinkedIn, and more. If you want more content from me, I write about data science, career, and nerdy jokes on my LinkedIn, and you can subscribe to my very infrequent newsletter at dalianaliu.com. I’m curious what you think about this episode: leave a comment on YouTube or send a DM on LinkedIn. Hope you enjoy the Daliana special!
Daliana's Newsletter: https://dalianaliu.com
Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Harpreet's LinkedIn: https://www.linkedin.com/in/harpreetsahota204/
The Artists of Data Science podcast: https://theartistsofdatascience.fireside.fm/
(00:00:00) Introduction
(00:02:52) Where did Daliana grow up
(00:05:19) Daliana in high school
(00:07:11) How she got into data science
(00:11:36) Why writing is important for data scientists
(00:15:51) How to write better
(00:20:56) Career lessons you didn't learn in school
(00:27:40) Imposter syndrome
(00:31:29) Day-to-day work as a data scientist
(00:36:16) Most common mistakes data scientists make
(00:39:41) Data Analyst vs. Data Scientist
(00:42:30) What is the science in data science?
(00:44:51) Can everyone be a data scientist
(00:49:21) Linkedin profile tips for job search
(00:52:59) How she creates content
(00:54:11) Being a data scientist "influencer"
(00:56:04) Why she started "the data scientist show"
(01:01:16) Women in data science
(01:06:39) What's her legacy
(01:09:43) What is she reading
(01:14:21) Connect with Daliana
11/24/2022 • 1 hour, 15 minutes, 20 seconds
How he carved his own path at Airbnb, from data engineer to CEO of Mage - Tommy Dang - the data scientist show #056
Tommy Dang is the Co-founder and CEO of Mage, a data ingestion and transformation pipeline for data engineers (https://github.com/mage-ai/mage-ai). Previously, he worked on data engineering and machine learning engineering at Airbnb. He has a Bachelor of Science degree from UC Berkeley, where he studied economics, history, and sociology. Today we’ll talk about how he learned engineering and machine learning after college, the data tools and ML tools he built at Airbnb, performance reviews, and how he navigates his career. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science and career.
Tommy’s LinkedIn: https://www.linkedin.com/in/dangtommy/
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
(00:00:00) Introduction
(00:01:28) Get into computer science from non-tech background
(00:03:08) How he started his first project
(00:04:07) Projects at Airbnb
(00:06:09) Speed vs Quality when building data pipelines
(00:16:34) How to deal with ad hoc requests
(00:21:00) How did he learn machine learning
(00:24:04) How he convinced data scientists to teach him ML
(00:25:15) Performance review
(00:27:11) Don’t let your job title limit your career
(00:28:29) Why he started his company
(00:31:38) Build your own tool vs use open source solutions
(00:33:12) Transitioning from an engineer to a CEO
(00:34:50) Earn trust from internal stakeholders
(00:36:27) Career advice
(00:41:31) How he carved his own path at Airbnb
(00:46:00) How did he learn to be a good engineer
(00:47:10) Best advice for data scientists or engineers
(00:48:41) Most important quality of data scientists or engineers
(00:51:51) Design principles
(00:58:51) Future of tools
(01:01:00) What does he think about his future career
(01:05:05) Inspiration of Tommy
11/8/2022 • 1 hour, 8 minutes, 2 seconds
How to effectively test and debug machine learning models, from ML engineer@Apple to startup founder - Gabriel Bayomi - the data scientist show #055
Gabriel Bayomi is the Co-Founder of OpenLayer, a tool that tests & debugs machine learning models. OpenLayer was in a Y Combinator batch in 2021, building tools for machine learning model testing. Previously he was a machine learning engineer at Apple working on Siri. He has a master's degree in computer science from Carnegie Mellon. He is passionate about Natural Language Processing, Machine Learning, and Computational Social Science. We talked about how to test and debug machine learning models, his experience at Apple, and career lessons. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science and career.
Gabriel’s LinkedIn: https://www.linkedin.com/in/gbayomi
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
(0:00) Intro
(00:01:39) How he got into machine learning
(00:06:43) His experience at Apple, Siri
(00:15:55) How to validate the solution
(00:19:39) Benefits of using external error analysis framework
(00:21:30) How to build a model evaluation pipeline
(00:28:26) Don’t overfit the subset of data
(00:33:19) Your validation set shouldn’t be fixed
(00:41:03) Become one with data
(00:44:05) Three model interpretability libraries you should use
(00:50:47) Common mistakes people make in model validation
(00:53:33) How to create an adversarial test
(00:55:43) How to check data quality
(01:06:46) Transition from engineer to executive
(01:10:04) Things he learnt from his favorite coworker
(01:17:57) How job roles would evolve
10/24/2022 • 1 hour, 24 minutes, 1 second
From Amazon research scientist to head of data product at Vestiaire Collective, why data science projects fail, how to be a good communicator - Alisa Kim - the data scientist show #054
Alisa Kim is the head of data product at Vestiaire Collective. Previously, she was a research scientist at Amazon Web Services. We used to work on the same team in the Machine Learning Solutions Lab at Amazon Web Services, and we have collaborated on projects before. Previously, she was a consultant and worked in analytics and investment banking. She has a Ph.D. in Econ AI and has worked in various industries and on multiple continents. She's someone I really enjoyed working with. We talked about her journey, the projects she worked on, and the lessons she learnt. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Alisa's LinkedIn: https://de.linkedin.com/in/alisakolesnikova
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's twitter: https://twitter.com/DalianaLiu
(0:00) Intro
(00:01:38) how she got into data science
(00:04:38) day-to-day at AWS ML Solutions Lab
(00:08:00) AWS leadership principles
(00:16:34) challenges the consultant faces when working with external customers
(00:23:36) from AWS to Vestiaire Collective
(00:37:54) how to build a better data product
(00:44:17) how data scientists can align with business stakeholders
(00:57:52) from tech to business
(01:01:33) how to develop communication skills
(01:09:17) increase visibility of the data science team
(01:17:22) being proactive vs being passive in chasing opportunities
(01:24:06) get feedback from your "nearest neighbors"
(01:25:37) how to set boundaries at work
(01:38:48) mistakes she made in her career
(01:48:25) how to manage disagreement
(01:57:53) future of data science
10/19/2022 • 2 hours, 12 minutes, 17 seconds
The lessons from almost losing a million dollars for his company, how to build good data assets and get buy-in from the leadership - Mark Freeman - the data scientist show #053
Mark Freeman is a community health advocate turned data scientist. His mission is to improve the well-being of people, especially among those marginalized. He is currently a senior data scientist at Humu, where he builds data tools that drive behavior change to make work better. He has a master's degree from the Stanford School of Medicine in clinical research, experimental design and statistics. He also has a certificate in entrepreneurship from the Business School of Stanford. In his free time, he volunteers with a Bay Area Community Health Advisory Council. He also plays Men's Division III Rugby. We talked about building data tools, data engineering skills for data scientists, how to pitch a project, and his career journey. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Mark's LinkedIn: https://www.linkedin.com/in/mafreeman2/
Chapters:
(0:00) Intro
(00:03:05) Our experience using R - 1000 lines of code
(00:09:22) Entrepreneurship within a company
(00:16:25) DBT and modern data stack
(00:20:15) Tools don’t matter (in interviews)
(00:21:09) Things DE enjoys but DS doesn’t
(00:24:55) How to work with different stakeholders
(00:30:32) Common SQL mistakes
(00:33:34) SQL vs Python vs R
(00:35:26) T.R.I.B.E framework for projects
(00:40:43) Meet the stakeholders where they are
(00:42:40) Use feedback to get buy-in from collaborators
(00:46:36) How to pitch a new idea
(00:49:45) Don’t lead with solution, lead with the problem
(00:51:03) How to get buy-in from the leadership
(00:57:56) Present an idea as if the audience came up with it
(00:58:41) How to iterate a project
(01:00:27) How he almost lost 1 million dollars for his company
(01:02:07) Things he learned from his manager
(01:04:19) Things that help people make changes effectively
(01:06:05) Things he learned from mentoring
(01:12:19) Mental Health and anxiety
(01:17:12) Web3
(01:20:14) Why he cares about community health
(01:25:40) "Soul-searching" on his future
(01:28:36) Why he writes on LinkedIn
(01:30:04) Future of data science
10/15/2022 • 1 hour, 32 minutes, 31 seconds
From deep learning architect at AWS to PM in AI product - Abhi Sharma - the data scientist show #052
Abhi Sharma started his career as a software engineer at Amazon Lab 126, building cloud services for Alexa. Later he transferred to Amazon Web Services as a deep learning architect. We used to work on the same team at the Machine Learning Solutions Lab at AWS. Currently, he is a product manager responsible for machine learning products like the chatbot at Chime. We talked about how he transitioned his career from software engineer to deep learning architect and then to product manager, cool projects he worked on, and our shared experiences at Amazon. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Abhi's LinkedIn: https://www.linkedin.com/in/abhivs/
Highlights:
(0:00) Intro
(00:01:48) from SWE to deep learning architect to product manager
(00:12:44) day-to-day as a product manager at Chime
(00:19:46) how he collaborates with different data personas
(00:27:21) how to negotiate for more time for projects with leaders
(00:33:59) some timelines are negotiable
(00:38:00) most impactful project he worked on
(00:44:22) how to evaluate KPI, and not game the system
(00:48:02) think about development in the beginning
(00:50:29) data scientists need to educate the business and demystify the buzz words
(00:54:19) Amazon’s Think Big Challenge
(00:57:09) Never solve the problem twice
(01:00:25) How to transition to a product manager
(01:07:48) why he wanted to become a PM
(01:25:35) What data scientists can learn from PMs
10/4/2022 • 1 hour, 30 minutes, 45 seconds
What data scientists need to know about MLOps principles, from GPA 2.6 to Sr. MLOps Engineer@Intuit - Mikiko Bazeley - the data scientist show #051
Mikiko Bazeley is a senior software engineer working on MLOps at Intuit. Previously, she worked as a growth hacker and a data analyst in finance, then became a data scientist, and later transitioned into machine learning. She has a bachelor's degree in economics and biological anthropology and did a data science bootcamp at Springboard. She is a tech writer for NVIDIA and she’s working on a course on MLOps. Her goal is to demystify MLOps & show how to develop high-quality ML products from scratch. You can find her content on LinkedIn and YouTube. Today, we’ll talk about useful engineering principles for data scientists, MLOps, and her career journey. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science and career.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Mikiko's LinkedIn: https://www.linkedin.com/in/mikikobazeley/
Highlights:
(0:00) Intro
(00:02:00) from GPA 2.6 to data scientist
(00:05:27) her experience at Mailchimp
(00:11:44) her frustrations on Cookiecutter project
(00:14:09) the pain point of a data scientist working with engineering
(00:21:01) 2 MLOps patterns
(00:25:52) challenges about her work
(00:29:49) the basic engineering skills a data scientist should have
(00:32:46) the tests a data scientist should write
(00:37:42) how an MLOps engineer collaborates with a data scientist
(00:45:28) what makes a good MLOps engineer
(00:52:33) AWS vs GCP vs Azure
(00:58:59) how a data scientist collaborates with an MLOps engineer
(01:05:19) suggestions for building a model on a large scale
(01:09:11) how she learnt MLOps on her own within 6 months
(01:17:32) learn from code review
(01:19:17) MLOps books and resources she recommended
(01:24:13) mistakes she made earlier in her career
(01:31:29) common mistakes people make during career change
(01:38:22) "Start with the end in mind"
(01:41:16) the future of MLOps
(01:46:23) how she sees her career growth
(01:56:40) how she continues learning new skills
(02:00:09) what she is excited about her career and life
9/27/2022 • 2 hours, 4 minutes, 50 seconds
Bayesian thinking in work and life, ad attribution models and A/B testing, machine learning@Foursquare - Max Sklar - the data scientist show #050
Max Sklar is an independent engineer and researcher. Previously, he was an Engineering and Innovation Labs Advisor at Foursquare, after 7 years at the company as a machine learning engineer. He has worked on ad attribution, recommendation engines, and ratings. He is the host of The Local Maximum podcast. Max studied CS at Yale and holds a master's degree in information systems from New York University. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Max's LinkedIn: https://www.linkedin.com/in/max-sklar-b638464/
Max’s website: localmaxradio.com/about
Interviews he mentioned during the podcast:
Andrew Gelman, Statistics at Columbia University
Shirin Mojarad on Causality
Johnny Nelson on Free Speech and Moderation online
Stephanie Yang talking about Foursquare's Venue Rating System
Dennis Crowley: on Labs, on Innovation
Sophie Carr (Bayesian Mathematician)
Will Kurt (Bayesian)
Marsbot for Airpods
Other Episodes Mentioned
Bayesian Thinking
P-Hacking
Interview on Learn Bayesian Statistics
Highlights:
(0:00) Intro
(00:01:23) from computer science to machine learning
(00:05:35) Bayesian methods in rating system
(00:14:53) how to choose a Bayesian prior
(00:20:10) how to deal with p-hacking
(00:26:57) causality model in ad attribution
(00:35:20) Bias-correction methods
(00:45:43) negative lift in advertising
(00:51:05) unexpected consumer behaviors
(00:52:08) why he decided not to climb the "engineer ladder"
(00:56:46) the challenges of having 5 managers in a year
(01:01:38) using the 3rd-party software vs building his own
(01:04:18) how he approaches ML problems
(01:07:51) his tech stack
(01:09:25) his advice on learning machine learning
(01:12:40) projects he is working on
(01:17:10) Bayesian for his life decisions
(01:22:00) how writing helps him
(01:23:48) the confusion, stress and excitement in his career
9/13/2022 • 1 hour, 30 minutes, 25 seconds
Why he quit a $500k+ machine learning job at Meta (Facebook): a candid review of his experience, mistakes, and ML best practices - Damien Benveniste - the data scientist show #049
(timestamps below) Damien Benveniste is a data scientist and software engineer. Previously, he was a machine learning tech leader and mentor. He has worked for almost ten years in different machine learning roles in different industries such as AdTech market research, e-commerce and health care. He has a Ph.D. in physics from Johns Hopkins University and is now working towards co-founding his own startup in the employee engagement space. We talked about his career journey, how he solved challenging problems, and his advice for new data scientists and engineers. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Damien's LinkedIn: https://www.linkedin.com/in/damienbenveniste/
(00:00) Intro
(00:01:17) from quantitative trading to machine learning
(00:07:52) his experience at Meta
(00:21:16) automated machine learning
(00:28:52) model paradigm
(00:32:47) the productivity-oriented culture at Meta
(00:41:42) short-term gain vs long-term goal
(00:44:38) things he liked at Meta
(00:51:54) the project that shaped his career
(01:03:56) the importance of having a baseline for ML models
(01:09:12) why he time-boxed everything
(01:16:25) test the model in production
(01:20:05) experimental design for ML
(01:23:25) the most challenging project he worked on
(01:37:07) best practices for machine learning
(01:48:44) how he sees himself
(02:00:52) lessons he learnt from being laid off
(02:06:45) frustration he had in his previous job
(02:16:14) what he is working on
(02:29:18) the future of machine learning
(02:39:52) things he is excited about
9/6/2022 • 2 hours, 44 minutes, 26 seconds
Time series modeling in supply chain, how to master business communication, save the environment with data science - Sunishchal Dev - the data scientist show #048
Sunishchal Dev is a lead data scientist at Booster. He's helping to decarbonize the transportation industry by optimizing last mile delivery of renewable fuels. Previously, he was a management consultant. On the side, he volunteers with Project Drawdown to model the most effective solutions to climate change. He also mentors future data scientists at Springboard by guiding them through real-world projects. We talked about his career journey, supply chain optimization, and how data science can help the environment. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
(0:00) Intro
(00:01:24) from business to data science
(00:06:36) the big impact of a small improvement
(00:08:50) data engineering vs predictive modeling
(00:11:48) routing optimization
(00:16:27) time series model
(00:21:32) use upsampling to simulate intermittent time series problem
(00:26:20) his modern data stack
(00:28:29) collaborate with engineers
(00:30:06) common mistakes people make in building time series models
(00:37:02) collaborate with truck drivers
(00:40:17) how to become a good communicator
(00:46:30) his experience in mentoring data scientists
(00:51:14) things people cannot learn at school
(00:53:16) the mistakes he made and the things he learnt from his mentor
(00:56:07) how data science can help the environment
Books recommended:
The Pyramid Principle: Logic in Writing and Thinking
The Book of Why: The New Science of Cause and Effect
Influence, New and Expanded: The Psychology of Persuasion
8/31/2022 • 1 hour, 3 minutes, 9 seconds
Product data science@Spotify, from management consultant to data scientist, salary negotiation, managing ADHD - Felicia Rutberg - the data scientist show047
Felicia Rutberg is a product strategy and analytics manager at Snap. Previously, she was a product data scientist at Spotify. She started her career as a management consultant at Accenture. She studied mathematics and cognitive psychology at Vanderbilt University. Felicia reached out to me on LinkedIn because she wanted to share how she became a data scientist while having ADHD. Today we’ll talk about product analytics at Spotify and Snap, her career journey, and ADHD. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Felicia's Linkedin: https://www.linkedin.com/in/feliciarutberg/
Highlights:
(00:01:29) from management consulting to data science
(00:12:20) financial data analyst at Spotify
(00:20:06) how to do internal job transition
(00:25:57) product data scientist at Spotify in the econometrics team
(00:29:33) how she became more vocal on the creative process
(00:33:48) how to get the last 1% of the work done
(00:38:53) how to ensure the quality of the analysis
(00:50:19) propensity score matching at Spotify
(00:57:09) how to validate causal inference outcomes
(01:00:51) lessons from working with economists
(01:19:16) from Spotify to Snap
(01:27:35) salary negotiation
(01:34:02) day-to-day at Snap
(01:38:33) Spotify vs Snap
(01:44:35) lessons from management consulting that helped her data science journey
(01:47:37) ADHD and self-compassion
(02:02:52) the books she recommended
(02:08:26) her future career
8/18/2022 • 2 hours, 12 minutes, 57 seconds
Data science interviews trends, from being laid off to landing a data scientist job at Airbnb - Emma Ding - the data scientist show #046
Emma Ding is a data scientist turned career coach. Previously, she was a data scientist and software engineer at Airbnb. I first discovered her through a viral Medium blog post called "How I got 4 data science offers and doubled my income 2 months after being laid off". Today, her mission is to help data scientists land their dream offers by being strategic and efficient in their interview preparation at https://www.datainterviewpro.com/. Of the 80 clients she has worked with, 90% received data scientist job offers from top tech companies such as Meta, LinkedIn, DoorDash, and Robinhood. In the first half of the podcast, we talk about how she doubled her salary and got into Airbnb after being laid off, and about her experience at Airbnb. Then we dive into new trends in data science interviews and her best strategy for getting a data scientist job. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Emma's YouTube: https://www.youtube.com/c/
DataInterviewPro Free product case class: https://www.datainterviewpro.com/product-case-masterclass-registration
Books on causal inference: Mostly Harmless Econometrics and Mastering 'Metrics: The Path from Cause to Effect.
Emma's Linkedin: https://www.linkedin.com/in/emmading001/
(00:00) Intro
(00:04:24) her strategy to get the data scientist offer after the layoff
(00:07:00) advice for preparing for interviews
(00:14:04) her day-to-day at Airbnb
(00:16:46) things she learnt from her mentor
(00:18:07) from data scientist to SDE to data interview pro
(00:22:12) trends of data science interview
(00:26:48) data scientist tracks: analytics-driven vs algorithms-driven
(00:32:56) SQL interviews: readability and proficiency
(00:35:06) make a study plan, execute it and keep the confidence
(00:41:29) what she teaches at datainterviewpro.com
(00:43:45) how to tackle take-home challenges
(00:45:41) how to negotiate salaries
(00:46:56) how to build confidence in the job search process
(00:50:23) how to study different subjects efficiently
(00:54:26) how to transition to data science
(01:00:05) how to remedy mistakes during the interview
(01:03:37) are data scientists still in demand?
(01:08:43) advice for getting ready for the new career
8/2/2022 • 1 hour, 19 minutes, 35 seconds
Using ML to tackle disruptive behaviors in gaming@Activision, data science in the metaverse, cyber security - Carly Taylor - the data scientist show #045
Carly Taylor is a senior manager at Activision, leading a team of machine learning engineers to tackle disruptive behaviors in the game ‘Call of Duty’. Previously, she held various roles, including machine learning engineer, data scientist, product analyst, and analytical chemist. She has a master's degree in computational chemistry from the University of Colorado. She’s passionate about video games and cyber security. She shares her insights on machine learning, gaming, and career with her 33k LinkedIn followers. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Carly's Linkedin: https://www.linkedin.com/in/carly-taylor0017/
Highlights:
(00:00) Intro
(00:01:14) from chemistry major to data scientist in gaming
(00:05:46) how she tackles disruptive behavior using machine learning
(00:11:38) feature engineering and model drift in fraud detection
(00:16:49) the challenge of dealing with the large scale of data
(00:27:10) data science in the Metaverse
(00:36:08) signal processing and anomaly detection
(00:40:31) dealing with the outliers
(00:45:49) getting buy-in from leadership
(00:49:56) from an IC to a manager
(00:53:36) mentorship, mistakes, and other things she learnt from work
(00:58:48) Python or R?
(01:05:30) how she sees herself grow and how she deals with struggles
(01:07:56) the future of data science in gaming
7/29/2022 • 1 hour, 15 minutes, 41 seconds
From lawyer to senior data scientist at Amazon, data science in devices, HR, and real estate, how to 're-invent' yourself - Pauline Chow - the data scientist show #044
Pauline Chow is a data scientist, former attorney, and active transportation advocate. She has worked in banking, fashion and education start-ups, and at Amazon. Currently, she is the data engineering lead for Thrackle, a blockchain research and modeling company. She has a master's degree in computer science, specializing in machine learning, from the Georgia Institute of Technology, and a law degree (JD) from the University of Wisconsin. She is also a certified yoga teacher and published writer.
We talked about her projects in three different teams in Amazon: devices, HR, and real estate; how her law degree helped her become a better data scientist; how she 're-invented' herself. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Her author website www.paulinechowstories.com or connect with her on twitter @itspaulinechow. Pauline's Linkedin: https://www.linkedin.com/in/paulinec/
A More Beautiful Question: The Power of Inquiry to Spark Breakthrough Ideas -- examples of the purpose of questioning.
The Four Tendencies by Gretchen Rubin (quiz, book). An interesting framework for considering how different people respond to internal and external expectations and pressures.
Why only rewarding high-performers can be detrimental to an organization? Wharton People Analytics Conference. Case Studies: Network Analysis. (2015, December 13). https://www.youtube.com/watch?v=0fM6JYC2zfQ
7/13/2022 • 1 hour, 31 minutes, 11 seconds
From chemical engineer to data scientist@ExxonMobil, why he left to do data science freelancing, data career jumpstart, Avery Smith - the data scientist show#043
Avery Smith is a data science consultant and career coach at Data Career Jumpstart, and a TA at MIT Professional Education. Previously, he worked on optimization and predictive analytics at ExxonMobil. We talked about his journey from chemical engineer to data analytics, optimization problems in the energy sector, why he left ExxonMobil, and his best advice for people who want to get into data science. Follow Daliana on Twitter (https://twitter.com/DalianaLiu) for more on data science and this podcast. If you like the show, subscribe and give me a 5-star review :)
Topics:
His first data science projects
His experience with ExxonMobil
Why he left ExxonMobil
Data science consulting
Challenges when working with clients
Why he built his own career coaching program
How Linkedin helped his career
TA at MIT, MIT's data engineering curriculum
How to build a data science portfolio
Avery's Linkedin: https://www.linkedin.com/in/averyjsmith/
7/6/2022 • 1 hour, 31 minutes, 59 seconds
Applied machine learning research methods, human-machine team, AI strategies, trends in machine learning, how to earn trust - Vin Vashishta - The data scientist show #042
(Highlights below) Vin Vashishta is a chief data officer and AI strategist at V Squared, a company he founded in 2012 that provides AI strategy, transformation, and data organizational build-out services.
He teaches data professionals about strategy, communications, business acumen, and applied machine learning research methods. Vin has 130k+ followers on Linkedin talking about AI, analytics, and strategy. His website: https://www.datascience.vin/ If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Highlights:
(0:00) Intro
(00:03:37) "ML strategy" with 'pricing' as an example
(00:09:45) what is a good metric for ML
(00:13:16) how to translate a business problem into a data problem
(00:23:42) leverage users in the "Human Machine Teaming"
(00:48:22) how he earned the trust
(01:17:31) data science evolution from 2012 to 2022
(01:31:06) how he learns new domain knowledge
(01:36:25) the mistakes he made
(01:42:15) what he learnt from his mentor
6/29/2022 • 1 hour, 50 minutes, 1 second
Retail store forecasting with video and audio, ML in high frequency trading, from tech to politics, ML in Web3 - Greg Tanaka, the data scientist show #041
(Highlights below) Greg Tanaka is a computer scientist turned CEO of an AI company. He started coding when he was 6, studied computer science at UC Berkeley, and has built many machine learning applications. He is the founder and CEO of Percolata, which is developing “Forecast as a Service”. He is also a council member of Palo Alto, California, and just finished his campaign for Congress. Today we’ll talk about his career journey, forecasting, and machine learning in blockchain and political campaigns. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Greg's Linkedin: https://www.linkedin.com/in/gltanaka/, Twitter: https://twitter.com/GregTanaka
Greg's DAO: https://www.gregtanaka.org/dao
Highlights:
(00:02:10) use computer vision, audio, and Wi-Fi fingerprints to forecast the retail store traffic
(00:21:55) why time series forecast is hard
(00:26:39) how he made the forecasting more stable
(00:28:46) how he troubleshot the spikes and drops in data
(00:36:04) human trading vs algorithmic trading
(00:47:36) his vision of machine learning in blockchain
(00:54:57) why he got into politics
(01:05:57) advice for people who are interested in Web3
(01:11:04) AutoML and the future of machine learning
(01:15:36) things he wished he could learn earlier
6/23/2022 • 1 hour, 30 minutes, 41 seconds
Weather forecasting with AI, Kaggle tips and tricks, dealing with missing data, deep learning with Jesper Dramsch, The Data Scientist Show #040
(Highlights below) Jesper Dramsch is a scientist for machine learning at the European Centre for Medium-Range Weather Forecasts. They have a PhD in machine learning applied to geoscience from the Technical University of Denmark. They are a Kaggle Kernels Expert and TPU star, ranked in the top 81/100k worldwide. We talked about weather forecasting, things they learned from Kaggle, how to deal with missing data and outliers, deep learning, Keras vs PyTorch, XGBoost, their struggles as a PhD student, and working in the EU vs the US. Follow @DalianaLiu for more updates on data science and this show.
(00:01:27) how they got into ML
(00:09:10) how they handled missing data
(00:28:34) Transformers are eating the world
(00:49:36) Huber Loss is a fantastic metric to deal with extreme values
(00:54:48) their experience with Kaggle competitions
(01:02:59) Kaggle tricks that helped their models perform better
(01:08:18) PyTorch vs Keras
(01:30:30) working in different countries and cultures
Resources shared by Jesper:
The newsletter with missing data:
https://buttondown.email/jesper/archive/towels-have-quite-a-dry-sense-of-humor/
The paper by Gael about missing data:
https://academic.oup.com/gigascience/article/doi/10.1093/gigascience/giac013/6568998
The Huber Loss:
https://en.wikipedia.org/wiki/Huber_loss
Skill Scores:
https://en.wikipedia.org/wiki/Forecast_skill
Brier Skill in Weather:
https://www.dwd.de/EN/ourservices/seasonals_forecasts/forecast_reliability.html
CRPS (Continuous Ranked Probability Score):
https://datascience.stackexchange.com/questions/63919/what-is-continuous-ranked-probability-score-crps
ConvNext, Convnets for the 2020s:
https://arxiv.org/abs/2201.03545
Transformers for ensemble forecasts:
https://arxiv.org/abs/2106.13924
Books I recommend:
https://www.amazon.com/shop/jesperdramsch/list/2DYS5KVR5TX0E
Blog posts I wrote about these books:
https://dramsch.net/tags/books/
Short I made about Test-Time Augmentation:
https://www.youtube.com/shorts/w4sAh9lKyls
Their links: https://dramsch.net/links
Their open PhD thesis: https://dramsch.net/phd
Newsletter: https://dramsch.net/newsletter
Twitter: https://dramsch.net/twitter
Youtube: https://dramsch.net/youtube
Linkedin: https://dramsch.net/linkedin
Kaggle: https://dramsch.net/
6/16/2022 • 1 hour, 58 minutes, 11 seconds
Reinforcement learning common use cases, recommendation engine, productivity - Susan Shu Chang the data scientist show#039
(Highlights below) Susan Shu Chang is a principal data scientist at Clearco, helping e-commerce founders by building machine learning-powered investing. In her previous role, she developed the company’s very first ML-powered website recommender system, deployed to millions of customers, and created a custom OpenAI Gym environment for a reinforcement learning project in production. She is also the founder and developer of Quill Game Studios, which sold ~10k copies of its debut game in 6 months. She has given talks at PyCon Canada, Toronto Machine Learning Summit (TMLS), and more. She writes about her career journey and learning on https://www.susanshu.com/ If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Highlights
(00:00) Intro
(00:01:29) from economics to data science
(00:07:23) reinforcement learning (RL)
(00:20:00) recent reinforcement learning use cases
(00:27:28) reinforcement learning for social media's recommender system
(01:04:42) common mistakes when productionizing models
(01:08:30) principal data scientist's day-to-day
(01:14:05) what productivity really means
(01:21:04) productivity tips
(01:41:48) books and blogs on productivity
6/8/2022 • 1 hour, 53 minutes, 5 seconds
User-centric data science, design thinking, from UX researcher to data science manager@Visa - Laura Gabrysiak - the data scientist show #038
(Highlights below) Laura Gabrysiak is a senior manager of data products and solutions at Visa. Previously, she was a data scientist, building machine learning models and decision tools to enable Visa clients. She has a college degree in computational and linguistics and a master's in design thinking. She's building the local data science community in Miami and is a co-founder of R-Ladies. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Laura's Linkedin: https://www.linkedin.com/in/lauragabrysiak/
(00:02:43) her journey into data science
(00:20:28) anecdotes vs big data
(00:27:05) the power of small data
(00:30:41) design thinking key elements
(00:47:25) mindset shift from a user researcher to a data scientist
(01:00:51) how to improve customer engagement
(01:02:10) how to make data visualization effective
(01:27:21) mindset shift from an individual contributor to a manager
(01:40:43) advice for people who are on a PIP
5/31/2022 • 2 hours, 1 minute, 54 seconds
A/B testing and growth analytics at Airbnb, building data science tools and metrics store with Nick Handel, the data scientist show#037
(Highlights below) Nick Handel was a senior data scientist at Airbnb, where he led the launch of the data side of Airbnb Trips and later built a team that designed Airbnb’s end-to-end machine learning platform, Bighead. Currently, he is the cofounder and CEO of Transform, the first centralized 'metrics store' that empowers data analysts to deliver insights. He was named to the Forbes 30 Under 30 list in 2018. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Nick's Linkedin: https://www.linkedin.com/in/nicholashandel/
Highlights:
(00:00) intro and career journey
(00:10:58) common mistakes in A/B testing
(00:25:48) how to do A/B testing deep dives
(00:27:32) surprising A/B testing results
(00:29:18) facts vs opinions
(00:33:55) A/B testing best practices
(00:55:01) how he built a new data schema for Airbnb Trips
(01:00:43) how to collect data when building data science tools
(01:38:53) trend of data science tools
5/24/2022 • 2 hours, 10 minutes, 7 seconds
Becoming a superforecaster, decision science for better human predictions - Pavel Atanasov-the data scientist show#036
(Highlights below) Pavel is a decision scientist and co-founder at Pytho, using decision science to measure and improve human judgment and prediction. He has a PhD in psychology and decision science from the University of Pennsylvania, focusing on crowd predictions. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Pavel's twitter: https://twitter.com/PavelDAtanasov
Superforecasting book, based on the Good Judgment Project: https://www.amazon.com/Superforecasting-Science-Prediction-Philip-Tetlock/dp/0804136718
Blogs about forecasting:
Vox's Future Perfect series: https://www.vox.com/future-perfect
Astral Codex Ten: https://astralcodexten.substack.com/
Highlights:
(00:01:10) how he got into decision science
(00:14:38) what makes someone a super forecaster
(00:16:20) three elements of becoming a super forecaster
(00:24:37) how to effectively update our opinions
(00:30:05) how he designed experiments to find out which system was better
(00:48:27) why humans are sometimes better than algorithms
(01:14:50) how to collect data and information better
(01:33:25) why you should quit
(01:42:30) the future of decision science
5/17/2022 • 1 hour, 51 minutes, 29 seconds
Using AI to detect online abuse, from physics PhD to staff ML engineer@Linkedin, persuasion at work with James Verbus - the data scientist show #035
(Timestamps below) James Verbus is a Staff Machine Learning Engineer at LinkedIn. He has a PhD in Physics from Brown University. He is the tech lead of the Anti-Scraping and Automation AI Team, working on protecting LinkedIn's members from bots and abusive scripted behavior, and pioneering the use of deep learning to detect abusive automated sequences of user activity (blog post). If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
(00:01:14) from physics to data science
(00:16:37) background of online abuse detection
(00:24:40) Isolation Forest Algorithm
(00:42:59) his day-to-day as a staff ML Engineer
(00:52:57) how to persuade stakeholders
(00:58:17) how to build influence at work
(01:00:22) how he grew to staff engineer
(01:13:48) what he learned from his mentor
5/10/2022 • 1 hour, 35 minutes, 55 seconds
The golden age of AI and neuroscience, brain computer interface (BCI), from academia to FAANG with Patrick Mineault - The Data Scientist Show #034
(Timestamps below) Patrick Mineault is a neural data scientist. He has worked at Google and Facebook after he did a postdoc at UCLA. He worked on Brain Computer Interface (BCI) at Facebook Reality Labs, building a BCI that allows you to type with your brain. He tweets about neuro-AI @patrickmineault, and writes a blog (https://xcorr.net) sharing his career journey and learnings along the way. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
How he got into data science (00:02:41)
His work at Google on A/B testing (00:04:17)
How he joined Facebook Reality Labs (00:23:53)
Projects on neuro-AI and brain computer interface (BCI) (00:27:13)
Skills needed for BCI research (00:34:37)
How AI influences neuroscience (01:34:28)
Computer vision vs human vision (01:39:57)
Model vs data, nature vs nurture (01:45:32)
5/5/2022 • 2 hours, 46 minutes, 25 seconds
From biostatistician to the 'artist of data science', how he turned his life around, philosophy - Harpreet Sahota - The Data Scientist Show#033
Harpreet Sahota is a data scientist and ML developer advocate. He is also the host of “The Artists of Data Science” podcast and weekly data science happy hours, and the principal data science mentor at Data Science Dream Job. He is also a philosophy nerd. He had some struggles when he tried to get into data science, and today we’ll talk about his experience as a biostatistician and data scientist, lessons he learned from his journey and from mentoring other people, and how he turned his life around. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Harpreet's Linkedin: https://www.linkedin.com/in/harpreetsahota204/?originalSubdomain=ca
The artist of data science podcast: https://theartistsofdatascience.fireside.fm/
4/6/2022 • 1 hour, 25 minutes, 8 seconds
How he built the best Covid forecasting model, lessons learned and how to improve model performance with Youyang Gu - The Data Scientist Show#032
Youyang Gu is the creator of http://covid19-projections.com. In 2020, while most Covid prediction models failed, he created a forecasting model, without any experience in medicine, that outperformed almost all medical experts. Yann LeCun, Facebook's chief AI scientist and professor, stated that Gu's model "is the most accurate to predict deaths from COVID-19", surpassing the accuracy of the well-funded Institute for Health Metrics and Evaluation COVID model. It was cited by the Centers for Disease Control (CDC) in its estimates for U.S. recovery.
Currently, he is a member of the Technical Advisory Group at the World Health Organization, working on laying the groundwork for a comprehensive, global study to document and analyze differences in levels of mortality attributable to COVID-19 between and within countries.
Today we talked about how he built the model, lessons he learned, his advice for data scientists, and what he's working on today. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Youyang's blog: https://youyanggu.com/
Youyang's Twitter: https://twitter.com/youyanggu
3/31/2022 • 2 hours, 3 minutes, 43 seconds
Feature engineering, ML models in production, new trend for ML tools, day-to-day of a principal engineer with Willem Pienaar - The Data Scientist Show #031
Willem is the creator of Feast, an open-source feature store (feast.dev), building tools at the intersection of engineering, data, and ML. Currently, he works as a principal engineer at Tecton, leading the development of Feast. Previously, he worked in South Africa, Thailand, and Singapore before moving to San Francisco in the US. Today we’ll talk about machine learning in production, cool projects he worked on, machine learning in startups, and how to pick the right data science track for your career. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Willem's Linkedin: https://www.linkedin.com/in/willempienaar/
3/24/2022 • 1 hour, 36 minutes, 24 seconds
Machine learning in healthcare, how to scale ML solutions, from ML researcher to product leader at Microsoft with Muazma Zahid - The Data Scientist Show #030
Muazma Zahid is a leader in data and AI, and a speaker and researcher in biomedical engineering with several international publications and awards. We talked about machine learning in healthcare, how to scale data science solutions, and her journey from ML researcher to data engineer to engineering manager to product leader.
She joined Microsoft in 2018 as a data engineer, later became a senior manager in software engineering, and is now a principal product manager. She won the Mentor of the Year award from Women Tech Network in 2020. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Muazma's Linkedin: https://www.linkedin.com/in/muazmazahid/
3/20/2022 • 1 hour, 40 minutes, 31 seconds
Hands-on time series analysis, open source projects, R packages, MLOps common mistakes with Rami Krispin - The Data Scientist Show #029
Rami leads the data science and engineering team at Apple Finance Decision Support. He uses advanced statistical and machine learning models to help leadership make better decisions. He is also an open-source contributor and the author of Hands-On Time Series Analysis with R and several R packages for time series analysis and machine learning applications. He has a master's degree in applied econometrics. We talked about time series, open source, MLOps, and his career journey. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Rami's Linkedin: https://www.linkedin.com/in/rami-krispin/
Rami's Github: https://github.com/RamiKrispin
Rami's Twitter: https://twitter.com/Rami_Krispin
Rami's Blog: https://ramikrispin.github.io/
3/11/2022 • 1 hour, 31 minutes, 3 seconds
Becoming a deep learning researcher without a PhD, graph neural network(GNN), time series, recommender system with Kyle Kranen - The Data Scientist Show#028
Kyle Kranen is a Deep Learning Software Engineer at Nvidia, researching, implementing, and optimizing state-of-the-art distributed deep learning models, mainly using PyTorch and TensorFlow. He has a unique combination of hardware and software engineering skills. We talked about Graph Neural Networks (GNN), the Temporal Fusion Transformer (TFT), time series, other deep learning research topics, and his career journey. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Kyle's Linkedin: https://www.linkedin.com/in/kyle-kranen/
3/3/2022 • 1 hour, 57 minutes, 10 seconds
How to 'predict' the past, geospatial data's use cases, Data-as-a-Service (DaaS), out-of-the-box career advice with the CEO of SafeGraph, Auren Hoffman - The Data Scientist Show #027
Auren Hoffman is CEO of SafeGraph: the place for data about physical places. We talked about how to use analytics and machine learning to find truth in data, geospatial data and their use cases, the impact of DaaS, and what he looks for when he develops talents. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Auren's Twitter: @auren.
2/24/2022 • 1 hour, 26 minutes, 27 seconds
Telling compelling stories with data, people skills for analytical thinkers with Gilbert Eijkelenboom - The Data Scientist Show #026
Gilbert Eijkelenboom is the founder of Mindspeaking, a training program helping data and analytics professionals improve their business understanding, persuasion, and storytelling skills. He wrote the best-selling book “People Skills for Analytical Thinkers”. We talked about how to get buy-in from stakeholders, how to build work relationships as introverts, how to earn trust, how to tell compelling stories with data, and lessons he learned from playing poker. If you like the show, subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Gilbert offers various free materials:
Free self-assessment of your Data communication skills (3 min): mindspeaking.com/maturity-model
Free preview of People Skills for Analytical Thinkers: mindspeaking.com/book
Free email course: mindspeaking.com/conversation
2/17/2022 • 1 hour, 28 minutes, 9 seconds
Sports analytics and personal branding for data scientists, Ken Jee - The Data Scientist Show #025
Ken Jee is the head of data science@Scouts Consulting Group and a YouTube creator with over 180k followers. Today we talked about sports analytics, how to grow your career and get promoted, how to explain complex concepts to stakeholders, how to build personal brands as data scientists.
If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Ken Jee's Linkedin, YouTube
2/9/2022 • 1 hour, 32 minutes, 17 seconds
From Apple store specialist to ML engineer at Apple, build a portfolio through open source projects, Julia Language, with Logan Kilpatrick - The Data Scientist Show #024
Logan Kilpatrick is a machine learning engineer at Apple and Developer Community Advocate for Julia. He is a teaching fellow at Harvard Extension School and is currently doing a Master of Science in Law program. Today we’ll talk about how he became a machine learning engineer, the internship he did at NASA, why you should care about open source communities, Julia, and what the future of machine learning looks like. Make sure you stay till the end. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Logan's Twitter: https://twitter.com/OfficialLoganK?s=20
Logan's LinkedIn: https://www.linkedin.com/in/logankilpatrick/
2/3/2022 • 1 hour, 42 minutes, 54 seconds
Tackling complex ML problems with small steps, MLOps best practices, pre-model analysis, from marketing analyst to principal ML researcher with Nathan Landi, The Data Scientist Show #023
Nathan Landi is a principal quantitative researcher at TEKSystems. He is on the advisory board of MLOps World. We talk about pre-model analysis using information value, MLOps best practices, multi-stage modeling, tackling complex problems with simple models, interview tips, and how he grew his career from marketing analyst to principal ML researcher. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
The mentorship service mentioned is SharpestMinds; it's free to sign up.
Nathan's Linkedin: https://www.linkedin.com/in/nathanielglandi/
1/27/2022 • 1 hour, 44 minutes, 27 seconds
Data-driven sales strategies, sales metrics, how to collaborate with business leaders with Dennis Yu - The Data Scientist Show #022
Dennis Yu is a Revenue and Strategy Leader. He is currently the Merchant Success Team Lead at Shopify, he’s on the advisory board of the USC startup accelerator, and he is also leading Talent and Professional Development for the Asian ERG at Shopify. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Dennis's Linkedin: https://www.linkedin.com/in/dennisyyu/
Today we’ll talk about what business leaders look for in data science projects and his career journey.
Sales metrics: LTV, GMV band, etc.
Example of how he would grow sales revenue
How to tell the stories with data
How data scientists and business leaders should collaborate
His career journey
How growing up as an Asian American shaped his perspectives
1/20/2022 • 1 hour, 19 minutes, 51 seconds
Economic thinking and a must-listen mini MBA for data scientists with Airbnb VP and Wharton Professor, Amit Gandhi - The Data Scientist Show #021
Amit Gandhi is a technical fellow and VP at Airbnb. He is a professor of economics at the Wharton School of the University of Pennsylvania. He gave us a master class on economic thinking and a mini-MBA tailored for data scientists. We also talk about his career journey, decision-making, machine learning, economics, and his advice to data scientists. Make sure you stick to the end. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Amit's Linkedin: https://www.linkedin.com/in/amitgandhiecon/
1/13/2022 • 1 hour, 40 minutes, 17 seconds
Translating ML model’s output into financial impact, fraud detection, financial modeling at Google, interview preparation with Dan Lee - The Data Scientist Show #020
Dan Lee is an ex-Google data scientist turned founder of DataInterview, an interview prep platform for data scientists. We talked about how to translate model results into dollar amounts, fraud detection models, quantitative thinking, data storytelling, best practices in exploratory data analysis (EDA), and interview prep tips. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
His Linkedin: https://www.linkedin.com/in/danleedata/
Interview Prep: https://dataInterview.com
1/5/2022 • 1 hour, 19 minutes, 1 second
Unlocking the power of emotional intelligence for your career success, how to handle toxic relationships and how to regulate negative emotions with Marc Brackett - The Data Scientist Show #019
Marc is a Yale professor and the founding director of the Yale Center for Emotional Intelligence. He wrote the best-selling book “Permission to Feel”. Today we’ll talk about how we can use emotional intelligence to empower our careers:
how to regulate negative emotions
how to deal with toxic relationships at work
how to influence big stakeholders
If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
12/30/2021 • 1 hour, 13 minutes, 30 seconds
The ultimate data science interview landscape, three shifts in DS job search, common mistakes in interviews with Andrew Berry - The Data Scientist Show #018
Andrew Berry is a data science educator at Lighthouse Labs. He has worked with over 100 students from various backgrounds trying to transition into data science. He teaches data science, coaches aspiring data scientists, and designs courses.
We talked about the shift in data science interviews, how to tackle coding interviews, the future of job search, how to build your portfolio, interview tips for big companies vs small companies, behavioral interviews vs technical interviews, how to write cold emails and more. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Andrew's Linkedin: https://www.linkedin.com/in/berrya/
12/23/2021 • 1 hour, 49 minutes, 58 seconds
From unemployed to chief data scientist of multiple startups, machine learning prototyping, how to read people, overcoming life struggles with Matt Kirk - The Data Scientist Show #017
Matt Kirk is Daliana's mentor, so it's a very special episode! Matt has been many things in his life: data scientist, software engineer, research analyst (quant), founder, C-level executive, and so on. We talked about Matt’s unique career adventure, machine learning solutions he built for startups, how to read people and influence stakeholders, how to understand yourself, how to be productive, and how he overcame his life struggles. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
You can reach out to him: matt[at]matthewkirk[dot]com
12/15/2021 • 2 hours, 2 minutes, 53 seconds
The unique algorithm for compact and accurate machine learning models, no-code ML use cases and its impact on the future of data scientists with Blair Newman - The Data Scientist Show #016
Blair Newman is the CTO of Neuton AI. Neuton is a zero-code cloud platform that empowers users of any tech level to apply the best machine learning practices for solving real-world challenges faster.
We’ll talk about Neuton AI’s patented deep learning algorithm that doesn’t use backpropagation, his career journey, no-code ML use cases, and how no-code ML impacts the future of data science. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Blair’s Linkedin: https://www.linkedin.com/in/blairnewman/
Neuton AI: https://neuton.ai/
12/9/2021 • 1 hour, 18 minutes, 52 seconds
Build successful end-to-end machine learning systems, ML engineers day-to-day and stakeholder management with Eugene Yan - The Data Scientist Show #015
Eugene Yan is a machine learning engineer at Amazon. He designs, builds, and operates machine learning systems that serve customers at scale. In his free time, he writes and speaks about data science on www.eugeneyan.com with 2,000+ subscribers.
We talked about how to build an end-to-end ML project successfully, machine learning best practices, his approach to tackle challenging problems, high-impact projects he worked on, how to communicate effectively with stakeholders, why writing documents is important, and how to get to the next level. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Eugene's Twitter: https://twitter.com/eugeneyan?s=20
12/2/2021 • 1 hour, 50 minutes, 11 seconds
From data engineer to data scientist at Google, transition into DS from non-tech degree, salary negotiation, how to manage up with Sundas Khalid - The Data Scientist Show #014
Sundas Khalid is a senior analytics lead at Google. She started her career as a data engineer and transitioned into data science through self-learning. I met Sundas when we worked together at Amazon. She helped women of color negotiate $1.4M in incremental salaries. She talks about careers in data science, personal finance, and salary negotiation on YouTube and Instagram. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Sundas' website: https://sundaskhalid.com/about-me
Sundas' YouTube: https://www.youtube.com/c/sundaskhalid
Sundas' Instagram: https://www.instagram.com/sundaskhalidd/?hl=en
• We talked about how she transitioned from data engineer to data scientist
• Data engineer vs data scientist pros and cons
• How to grow to a senior data scientist
• How to build a data science tool that has impact for the business
• 3 mistakes people make when negotiating salary
• How to build wealth using your salary
(00:00:00) Introduction
(00:01:23) Overview of her career journey
(00:04:52) High-impact projects she worked on
(00:06:59) Tools she uses
(00:07:42) To be successful in DE
(00:09:16) Transitioning into data science
(00:12:06) DE skills that give an edge as a data scientist
(00:13:11) Her expectations of data science
(00:15:49) Data engineering vs data science
(00:17:33) Her day-to-day as a data scientist
(00:19:42) The struggles in her day-to-day work
(00:21:31) Is data science going away?
(00:22:45) Automation tools and reports
(00:25:53) Growing her career as a data scientist
(00:27:53) Communicating better with people
(00:30:06) Mistakes she made in her career
(00:34:16) Daliana joining the Weblab team
(00:37:16) Tips for negotiating salary
(00:41:57) Tools to use for researching salaries
(00:43:19) Mistakes when negotiating a salary
(00:44:03) Importance of investing early in a career
(00:49:09) Tips about investing and building wealth
(00:51:15) Things that people should know more
(00:52:51) Tips for increasing productivity
(00:54:46) Future of data science and analytics
(00:56:08) How to keep updated on new information
(00:57:52) What is she excited about now
11/25/2021 • 59 minutes, 39 seconds
Develop product sense to uplevel your data science career, how to influence product managers with data, crack product sense interview questions with Peter Knudson - The Data Scientist Show #013
Peter Knudson is a product manager of 10 years who focuses on innovative new experiences that help drive engagement in the ever-evolving landscape of mobile and console games. He is also the author of the Amazon best-selling book “Product Sense.”
We talk about what product sense is, how data scientists develop product sense, what frustrates product managers when working with data scientists, how data scientists can influence product managers better, misconceptions about product management, and common mistakes in product management. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Peter’s best selling book “Product Sense”: https://www.amazon.com/Product-Sense-Problems-Interviews-Management-ebook/dp/B0998SRN37
Website: ProductSenseBook.com
Peter's Linkedin: https://www.linkedin.com/in/thisispeterk/
11/17/2021 • 1 hour, 11 minutes, 6 seconds
The secret to improve mental health, future of data engineering, work life balance with Zach Wilson - The Data Scientist Show #012
Zach Wilson is a tech lead at Airbnb building data pipelines; previously he worked at Netflix and Facebook. Zach graduated from college at the age of 20 with degrees in math and computer science. He has over 80k followers on LinkedIn.
We talked about mental health, terminal level, promotions, work-life balance, building an audience on LinkedIn, and the future of data engineering.
For data engineering best practices, Zach's career journey, and working in FAANG, please go to the previous episode "Demystify data engineering". If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Zach's YouTube: https://www.youtube.com/channel/UCAq9f7jFEA7Mtl3qOZy2h1A
(00:00:00) Introduction
(00:00:41) Starting his career at 20
(00:06:38) Change of mindset
(00:11:35) His ambition
(00:15:57) Hobbies he developed
(00:19:28) Maintaining work-life balance
(00:21:31) His thoughts about the terminal level
(00:26:55) Career advice about job burnout
(00:33:11) His daily routine
(00:39:34) Good projects to learn
(00:46:22) Doing projects using JavaScript
(00:47:56) What are the lessons he can share
(00:52:26) Growing a large audience
(01:00:13) His goal on sharing
(01:04:12) Advice for people who want to grow an audience on LinkedIn
(01:09:28) Something he is excited about in the future
(01:14:04) Future of data engineering
(01:21:01) Connect with Zach
11/9/2021 • 1 hour, 21 minutes, 57 seconds
Demystify data engineering, 3 common mistakes, FAANG's culture, how to say no at work with Zach Wilson - The Data Scientist Show #011
Zach Wilson is a tech lead at Airbnb building data pipelines; previously he worked at Netflix and Facebook. Zach graduated from college at the age of 20 with degrees in math and computer science. He has over 80k followers on LinkedIn.
We talked about common data engineering mistakes, best practices, soft skills, how to say no at work, and work experience at Facebook, Netflix, and Airbnb. This is part one of our conversation, and please go to next week’s episode for part two. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Zach's Linkedin: https://www.linkedin.com/in/eczachly/
(00:00:00) Introduction
(00:00:43) How did he get into DE
(00:01:23) Data Infrastructure vs Data Engineer
(00:02:44) Day-to-day at Airbnb
(00:05:57) How much data science DEs need to know
(00:10:07) Common mistakes of DEs
(00:14:53) Good questions to ask stakeholders
(00:18:10) Communicating with data scientists and software engineers
(00:20:39) Frustrations when working with data scientists
(00:24:34) Setting up processes
(00:26:22) High-quality pipeline
(00:28:42) High-impact data engineering project
(00:33:13) Mistakes he made early in his career
(00:38:14) Core DE skills that juniors must know
(00:40:50) How to go to the next level
(00:44:15) Meeting his mentor
(00:46:02) Some advice from mentors
(00:48:00) Best advice about influencing without a title
(00:49:45) Working at Facebook, Netflix, and Airbnb
11/4/2021 • 1 hour, 9 seconds
Build a killer analytics dashboard for your CEO; data visualization best practices with Kate Strachnyi - The Data Scientist Show #010
Kate Strachnyi is the founder of DATAcated – delivering training on data visualization, data storytelling, and dashboard best practices. She has over 150k followers on Linkedin. We talked about how she got into data analytics without a background in math, what makes a good dashboard, how to work with executives, how to tell stories with data, what she’s looking for when hiring a data analyst, and the psychology of color! If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
DATAcated: https://datacated.com/
10/28/2021 • 1 hour, 24 minutes, 26 seconds
Ace the data science interview; build kick-ass portfolio projects with Nick Singh - The Data Scientist Show #009
Nick Singh is a career coach and the co-author of "Ace the Data Science Interview". He has over 60k followers on LinkedIn, and previously worked at Facebook and Google. We talked about how to prepare for data science interviews, how to build a portfolio, what makes a candidate stand out, how to write cold emails to recruiters, and his career journey. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
You can find his book on Amazon: https://www.amazon.com/dp/0578973839?&linkCode=sl1&tag=nicksingh03-20&linkId=4fa541a539320e8936926cb3a5167881&language=en_US&ref_=as_li_ss_tl
Nick's Linkedin: https://www.linkedin.com/in/nipun-singh/
10/21/2021 • 56 minutes, 25 seconds
Solving the brain with machine learning; the secret to a successful career with Konrad Kording - The Data Scientist Show #008
Konrad Kording is a neuroscientist and professor at the University of Pennsylvania. Konrad is trying to understand how the world and the brain work using data. He is known for his research in computational neuroscience. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Konrad's Twitter: https://twitter.com/KordingLab
The online community of computational neuroscientists he's working on: http://neuromatch.io/
We talked about:
- Is evolution gradient descent?
- What makes a data scientist competitive?
- His three principles of doing good science
- Why do we need causal inference in AI?
- Should we optimize our brain's 'loss function' to make us happier?
- The secret to a good career
- Three rules he follows for doing good science
- Is deep learning a bubble?
- How did he get to where he's at today
10/14/2021 • 1 hour, 38 minutes, 39 seconds
How do data scientists get into blockchain? How to build a career by networking online, Greg Osuri - The Data Scientist Show #007
A seasoned open-source developer of 25+ years, Greg Osuri is the CEO and co-founder of Akash Network, an open-source decentralized cloud that provides fast, efficient, and low-cost application deployment.
Prior to Akash Network, Greg founded AngelHack, the world’s largest hackathon organization with over 200,000 developers across 164 cities across the globe. At AngelHack, he helped launch several developer companies including Firebase, which was acquired by Google in 2014. Greg launched his career at IBM and later designed Kaiser Permanente’s first cloud architecture. As an expert in open-source, distributed systems, and blockchain development, and an applied economist, Greg is a featured international speaker and has spoken recently at events including Kong Summit, Block-Con, and Block to the Future. His work has been featured in top-tier publications including BeInCrypto, CoinDesk, Cointelegraph, Forbes, TechCrunch, and Yahoo! Finance. Greg was instrumental in the passing of California’s first Blockchain law, providing the first expert-witness testimony at the Senate.
About Akash Network: Akash Network, the world's first decentralized and open-source cloud, accelerates deployment, scale, efficiency and price performance for high-growth industries like blockchain and machine learning/AI. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Greg's Twitter: https://twitter.com/gregosuri
10/8/2021 • 1 hour, 20 minutes, 29 seconds
Human-centered design for AI; working with Fei-Fei Li; human first design for AI, Andrew Kondrich - The Data Scientist Show #006
Andrew Kondrich is a machine learning engineer at Scale. We talked about his career journey, human first design for AI, how to get into machine learning, and what kind of candidates companies are looking for. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
10/1/2021 • 41 minutes, 24 seconds
From history major to data science manager, when you shouldn't use data, Bryan Davis - The Data Scientist Show #005
Bryan is a data science manager; previously he worked at Facebook and Indeed as a senior data scientist. Bryan specializes in ad system design, ad ranking, and A/B testing platforms. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
We talked about:
- how he got into data science as a history major
- when not to use data science to make decisions
- how data scientists should influence the company's culture
- how data scientists can have a competitive edge in the future
- what is the ad ranking problem
- data science books for game theory
- how to use game theory in real life
9/24/2021 • 1 hour, 23 minutes, 25 seconds
How to get your dream job without applying online, with Jerry Lee - The Data Scientist Show #004
Jerry is the COO/Founder of Wonsulting and an ex-Senior Strategy & Operations Manager at Google who also used to lead Product Strategy at Lucid. After graduating from Babson College, Jerry was hired as the youngest analyst in his organization and was promoted multiple times in 2 years to his position at Google. With Wonsulting, Jerry partners with universities & organizations (220+ to date) to help others land their dream careers. He's amassed 250,000+ followers across LinkedIn, TikTok & Instagram and has reached 40M+ professionals. In addition, his work has been featured on Forbes, Newsweek, Business Insider, Yahoo! News, and LinkedIn, and he was selected as a 2020 LinkedIn Top Voice for Tech.
Jerry shared his expert advice on how to network effectively, how to send messages to recruiters, and how he used data analytics to solve million dollar business problems. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
9/17/2021 • 1 hour, 7 minutes, 45 seconds
Transition into machine learning as an engineer, two mistakes ML scientists should avoid. Alexey Grigorev - The Data Scientist Show #003
Alexey Grigorev is a principal data scientist at OLX Group. He is also the founder of Data Talks Club with 4,100 members. He wrote a book called "Machine Learning Bookcamp" to help people learn machine learning by doing projects. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
We talked about:
- How Alexey transitioned into machine learning
- What kind of project helped him get his job
- 2 mistakes new data scientists often make
- Why do you need to know the baseline
- What makes you stand out as a candidate
- His free machine learning course
#datascience #machinelearning #ai #ml #career
9/8/2021 • 1 hour, 6 minutes, 50 seconds
The future of data scientists; network like a champion with Jim Zheng - The Data Scientist Show #002
Jim Zheng is an engineering manager at Flexport, building the data platform; he was a data scientist at Salesforce, worked at Yahoo as a UX designer, and was a researcher in computer science at Stanford. He is also the cofounder of Senpai, an audio platform for experts to share domain knowledge. If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science.
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
Ask Jim a question on Senpai: https://beta.senpai.so/ataki12
Jim's article on how to send cold emails: https://www.linkedin.com/pulse/hiring...
We talked about:
- Jim's career path in engineering and data science
- What makes a great data scientist
- What's the future of data scientists
- What's a 'human cloud'
- How to network like a champion
- Best way to work with mentors
- Career advice to his younger self
- Life lessons he learned from playing chess
9/3/2021 • 1 hour, 20 minutes, 55 seconds
Build resilient machine learning models; advice for ML careers - Gerald Friedland - The Data Scientist Show #001
Gerald Friedland is the CTO of an AI company (Brainome) and a professor at UC Berkeley. Listen to his advice on how to build more resilient machine learning models and get inspired by his career journey! If you like the show subscribe to the channel and give us a 5-star review. Subscribe to Daliana's newsletter on www.dalianaliu.com/ for more on data science and career.
Gerald’s LinkedIn: https://www.linkedin.com/in/geraldfriedland/
Gerald’s Class: https://www2.eecs.berkeley.edu/Courses/CS294_3438/
Gerald’s Youtube Lectures: https://www.youtube.com/playlist?list=PL17CtGMLr0XzOsLydB0jik4UpEyoW5SOx
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Daliana's Twitter: https://twitter.com/DalianaLiu
(00:00:00) Introduction
(00:01:06) How did he get into machine learning
(00:12:13) How to Reduce Overfitting
(00:22:15) Technology in Brainome
(00:24:13) Brainome vs auto ML
(00:27:08) Measurement of a data-centric approach
(00:27:32) Data Drift
(00:32:28) Courses to take
(00:34:33) Information theory
(00:38:06) Advice for students in grad school
(00:42:43) Dealing with failure and stress
(00:44:05) The underdog story
(00:49:12) Who is Gerald Friedland
(00:50:00) Inspiration of Gerald
(00:51:08) Future of machine learning
(00:52:36) Tips for asking the right questions
(00:53:40) Question your assumptions
(00:54:10) Favorite books and courses
(00:56:34) Advice on machine learning
(00:57:52) Ideal qualities of an ML engineer