DataTalks.Club

By DataTalks.Club

DataTalks.Club - the place to talk about data!
Available on Apple Podcasts, Pocket Casts, and Spotify
The Future of AI Agents - Aditya Gautam

In this talk, Aditya, an experienced AI researcher and engineer, shares his technical evolution, from his roots in embedded systems to building complex, large-scale AI agent architectures. We explore the practical challenges of enterprise AI adoption, the shifting economics of LLMs, and the infrastructure required to deploy reliable multi-agent systems.

You’ll learn about:

- The ROI of Fine-Tuning: How to decide between specialized small models and general-purpose APIs based on cost and latency.
- Agent MLOps Stack: The essential roles of guardrails, data lineage, and auditability in AI workflows.
- Reliability in High-Stakes Verticals: Navigating the unique AI deployment challenges in the legal and healthcare sectors.
- Evaluation Frameworks: How to design robust evals for multi-tenant systems at scale.
- Human-in-the-Loop: Strategies for aligning "LLM as a judge" with human-labeled ground truth to eliminate bias.
- The Future of AGI: What to expect from the next wave of multimodal agents and autonomous systems.

TIMECODES:

00:00 Aditya’s journey from embedded systems to AI
08:52 Enterprise AI research and adoption gaps
13:13 AI reliability in legal and healthcare
19:16 Specialized models and agent governance
24:58 LLM economics: Fine-tuning vs. API ROI
30:26 Agent MLOps: Guardrails and data lineage
36:55 Iterating on agents with user feedback
43:30 AI evals for multi-tenancy and scale
50:18 Aligning LLM judges with human labels
56:40 Agent infrastructure and deployment risks
1:02:35 Future of AGI and multimodal agents

This talk is designed for Machine Learning Engineers, Data Scientists, and Technical Product Managers who are moving beyond AI prototypes and into production-grade agentic workflows. It is especially relevant for those working in regulated industries or managing high-volume API budgets.

Connect with Aditya:

- Linkedin - https://www.linkedin.com/in/aditya-gautam-68233a30/

Connect with DataTalks.Club:

- Join the community - https://datatalks.club/slack.html
- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
- Check other upcoming events - https://lu.ma/dtc-events
- GitHub: https://github.com/DataTalksClub
- LinkedIn - https://www.linkedin.com/company/datatalks-club/
- Twitter - https://twitter.com/DataTalksClub
- Website - https://datatalks.club/

Mar 06, 2026 · 01:08:40
Foundations of Analytics Engineer Role: Skills, Scope, and Modern Practices - Juan Manuel Perafan

In this talk, Juan, Analytics Engineer and author of Fundamentals of Analytics Engineering, shares his professional journey from studying psychological research in Colombia to becoming one of the first analytics engineers in the Netherlands. We explore the evolution of the role, the shift toward engineering rigor in data modeling, and how the landscape of tools like dbt and Databricks is changing the way teams work.



You’ll learn about:

  • The fundamental differences between traditional BI engineering and modern analytics engineering.
  • How to bridge the gap between business stakeholders and technical data infrastructure.
  • The technical "glue" that connects Python and SQL for robust data pipelines.
  • The importance of automated testing (generic vs. singular tests) to prevent "silent" data failures.
  • Strategies for modeling messy, fragmented source data into a unified "business reality."
  • The current state of the "Lakehouse" paradigm and how it impacts storage and compute costs.
  • Expert advice on navigating the dbt ecosystem and its emerging competitors.



Links:

  • DE Course: https://github.com/DataTalksClub/data-engineering-zoomcamp
  • Luma: https://luma.com/0uf7mmup



TIMECODES:

0:00 Juan’s psychological research and transition to data

4:36 Riding the wave: The early days of analytics engineering

7:56 Breaking down the gap between analysts and engineers

11:03 The art of turning business reality into clean data

16:25 Why data engineering is about safety, not just speed

20:53 Reimagining data modeling in the modern era

26:53 To split or not to split: Finding the right team roles

30:35 Python, SQL, and the technical toolkit for success

38:41 How to stop manually testing your data dashboards

46:34 Bringing software engineering rigor to data workflows

49:50 Must-read books and resources for mastering the craft

55:42 The future of dbt and the shifting tool landscape

1:00:29 Deciphering the lakehouse: Warehousing in the cloud

1:11:16 Pro-tips for starting your data engineering journey

1:14:40 The big debate: Databricks vs. Snowflake

1:18:28 Why every data professional needs a local community



This talk is designed for data analysts looking to level up their engineering skills, data engineers interested in the business-logic layer, and data leaders trying to structure their teams more effectively. It is particularly valuable for those preparing for the Data Engineering Zoomcamp or anyone looking to transition into an Analytics Engineering role.


Connect with Juan

  • Linkedin - https://www.linkedin.com/in/jmperafan/
  • Website - https://juanalytics.com/


Connect with DataTalks.Club:

  • Join the community - https://datatalks.club/slack.html
  • Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
  • Check other upcoming events - https://lu.ma/dtc-events
  • GitHub: https://github.com/DataTalksClub
  • LinkedIn - https://www.linkedin.com/company/datatalks-club/
  • Twitter - https://twitter.com/DataTalksClub
  • Website - https://datatalks.club/
Feb 27, 2026 · 01:23:57
AI Engineering: Skill Stack, Agents, LLMOps, and How to Ship AI Products - Paul Iusztin

In this episode of DataTalks.Club, Paul Iusztin, founding AI engineer and author of the LLM Engineer’s Handbook, breaks down the transition from traditional software development to production-grade AI engineering.

We explore the essential skill stack for 2026, the shift from "PoC purgatory" to shipping real products, and why the future of the field belongs to the full-stack generalist.



You’ll learn about:

- Why the role is evolving into the "new software engineer" and how to own the full product lifecycle.

- Identifying when to use traditional ML (like XGBoost) over LLMs to avoid over-engineering.

- The architectural shift from fine-tuning to mastering data pipelines and semantic search.

- Reliable agentic workflows: how to use coding assistants like Claude and Cursor to act as an architect rather than just a coder.

- Why human-in-the-loop evaluation is the most critical bottleneck in shipping reliable AI.

- How to build a "Second Brain" portfolio project that proves your end-to-end engineering value.


Links:

- Course link: https://academy.towardsai.net/courses/agent-engineering?ref=b3ab31

- Decoding AI Magazine: https://www.decodingai.com/



TIMECODES:

00:00 From code to cars: Paul’s journey to AI

07:08 Deep learning and the autonomous driving challenge

12:09 The transition to global product engineering

15:13 Survival guide: Data science vs. AI engineering

22:29 The full-stack AI engineer skill stack

29:12 Mastering RAG and knowledge management

32:27 The generalist edge: Learning with AI

42:21 Technical pillars for shipping AI products

54:05 Portfolio secrets and the "second brain"

58:01 The future of the LLM engineer’s handbook



This talk is designed for software engineers, data scientists, and ML engineers looking to move beyond proof-of-concepts and master the engineering rigors of shipping AI products in a production environment.

It is particularly valuable for those aiming for founding or lead AI roles in startups.



Connect with Paul

- Linkedin - https://www.linkedin.com/in/pauliusztin/

- Website - https://www.pauliusztin.ai/



Connect with DataTalks.Club:

- Join the community - https://datatalks.club/slack.html

- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ

- Check other upcoming events - https://lu.ma/dtc-events

- GitHub: https://github.com/DataTalksClub

- LinkedIn - https://www.linkedin.com/company/datatalks-club/

- Twitter - https://twitter.com/DataTalksClub

- Website - https://datatalks.club/

Feb 06, 2026 · 01:07:15
Applying ML: An Ongoing Personal Journey

In this talk, Rileen, a Senior Computational Biologist and Cancer Data Scientist, shares his professional journey from physics and computer science to cutting-edge cancer genomics and applied machine learning. From his early work in alternative splicing models to deep learning in medical imaging, Rileen explains how biology, data science, and AI intersect to transform cancer research.

TIMECODES:

00:00 Rileen's Career Journey and Education
06:14 Understanding Alternative Splicing in Computational Biology
10:56 Modeling Alternative Splicing with Machine Learning
14:52 Model Error Analysis and Transition to Cancer Research
18:37 What Is Cancer? Mutational Theory Explained
21:45 Cancer Treatments and Causes
24:57 Cancer Genomics and Tumor Models
28:59 Comparing Cell Lines and Tumor Samples (Multi-omics Analysis)
32:32 Machine Learning Applications in Cancer Research
35:38 Deep Learning for Medical Imaging and Pathology
39:17 Data Privacy and Applied ML Course Projects
42:50 Learning Outcomes and Future Plans
46:36 Industry Experience in Pharmaceutical Research
50:14 Day in the Life of a Computational Biologist
55:02 Advice for Current ML Students
58:40 Project Management and Challenges in Genomics
1:02:23 Public Data Sets and Cancer Research in Germany

Connect with Rileen:

- Twitter - https://x.com/RileenSinha
- Linkedin - https://www.linkedin.com/in/rileen-sinha-a644692/
- Github - https://github.com/Optimistix

Connect with DataTalks.Club:

- Join the community - https://datatalks.club/slack.html
- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
- Check other upcoming events - https://lu.ma/dtc-events
- GitHub: https://github.com/DataTalksClub
- LinkedIn - https://www.linkedin.com/company/datatalks-club/
- Twitter - https://twitter.com/DataTalksClub
- Website - https://datatalks.club/

Jan 09, 2026 · 01:04:31
Building Pet Health Tech: ML, Sensors, and Dog Behavior Data

In this session, Sofya shares her journey building a pet-tech startup that blends machine learning, sensor data, and canine behavior analytics. She walks through her path from early programming explorations to launching a health monitoring device designed around anomaly detection and long-term behavioral baselines.


TIMECODES:

00:00 Sofya's pet tech startup with machine learning, sensor data, and behavior pattern analytics

10:00 Journey from programming hobby to full-time software development career

17:20 Career growth after skipping university and building practical experience

24:07 Puppy adoption story and family influence on pet-focused innovation

32:16 Dog health monitoring framed as anomaly detection in real-world machine learning

37:05 Collecting canine data with emphasis on sleep patterns and cycle tracking

43:35 Establishing a dog's normal baseline through long-term data observation

49:34 Startup funding through personal savings and early-stage bootstrapping

55:28 Finding cofounders and collaborators through meetups and coworking communities

59:48 Closing insights on Sofya's educational path and early device prototypes


Connect with Sofya

- Website - https://www.fit-tails.com/

- Linkedin - https://www.linkedin.com/in/sofya-yulpatova/


Connect with DataTalks.Club:

- Join the community - https://datatalks.club/slack.html

- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ

- Check other upcoming events - https://lu.ma/dtc-events

- GitHub: https://github.com/DataTalksClub

- LinkedIn - https://www.linkedin.com/company/datatalks-club/

- Twitter - https://twitter.com/DataTalksClub

- Website - https://datatalks.club/



Dec 12, 2025 · 01:01:15
From Full-Time Mom to Head of Data and Cloud - Xia He-Bleinagel

In this talk, Xia He-Bleinagel, Head of Data & Cloud at NOW GmbH, shares her remarkable journey from studying automotive engineering across Europe to leading modern data, cloud, and engineering teams in Germany.

We dive into her transition from hands-on engineering to leadership, how she balanced family with career growth, and what it really takes to succeed in today’s cloud, data, and AI job market.


TIMECODES:

00:00 Studying Automotive Engineering Across Europe

08:15 How Andrew Ng Sparked a Machine Learning Journey

11:45 Import–Export Work as an Unexpected Career Boost

17:05 Balancing Family Life with Data Engineering Studies

20:50 From Data Engineer to Head of Data & Cloud

27:46 Building Data Teams & Tackling Tech Debt

30:56 Learning Leadership Through Coaching & Observation

34:17 Management vs. IC: Finding Your Best Fit

38:52 Boosting Developer Productivity with AI Tools

42:47 Succeeding in Germany’s Competitive Data Job Market

46:03 Fast-Track Your Cloud & Data Career

50:03 Mentorship & Supporting Working Moms in Tech

53:03 Cultural & Economic Factors Shaping Women’s Careers

57:13 Top Networking Groups for Women in Data

1:00:13 Turning Domain Expertise into a Data Career Advantage


Connect with Xia

- Linkedin - https://www.linkedin.com/in/xia-he-bleinagel-51773585/

- Github - https://github.com/Data-Think-2021

- Website - https://datathinker.de/


Connect with DataTalks.Club:

- Join the community - https://datatalks.club/slack.html

- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ

- Check other upcoming events - https://lu.ma/dtc-events

- GitHub: https://github.com/DataTalksClub

- LinkedIn - https://www.linkedin.com/company/datatalks-club/

- Twitter - https://twitter.com/DataTalksClub

- Website - https://datatalks.club/


Nov 28, 2025 · 01:02:14
From Black-Box Systems to Augmented Decision-Making - Anusha Akkina

In this talk, Anusha Akkina, co-founder of Auralytix, shares her journey from working as a Chartered Accountant and Auditor at Deloitte to building an AI-powered finance intelligence platform designed to augment, not replace, human decision-making. Together with host Alexey from DataTalks.Club, she explores how AI is transforming finance operations beyond spreadsheets—from tackling ERP limitations to creating real-time insights that drive strategic business outcomes.


TIMECODES:

00:00 Building trust in AI finance and introducing Auralytix

02:22 From accounting roots to auditing at Deloitte and Parexel

08:20 Moving to Germany and pivoting into corporate finance

11:50 The data struggle in strategic finance and the need for change

13:23 How Auralytix was born: bridging AI and financial compliance

17:15 Why ERP systems fail finance teams and how spreadsheets fill the gap

24:31 The real cost of ERP rigidity and lessons from failed transformations

29:10 The hidden risks of spreadsheet dependency and knowledge loss

37:30 Experimenting with ChatGPT and coding the first AI finance prototype

43:34 Identifying finance’s biggest pain points through user research

47:24 Empowering finance teams with AI-driven, real-time decision insights

50:59 Developing an entrepreneurial mindset through strategy and learning

54:31 Essential resources and finding the right AI co-founder


Connect with Anusha

- Linkedin - https://www.linkedin.com/in/anusha-akkina-acma-cgma-56154547/

- Website - https://aurelytix.com/


Connect with DataTalks.Club:

- Join the community - https://datatalks.club/slack.html

- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ

- Check other upcoming events - https://lu.ma/dtc-events

- GitHub: https://github.com/DataTalksClub

- LinkedIn - https://www.linkedin.com/company/datatalks-club/

- Twitter - https://twitter.com/DataTalksClub

- Website - https://datatalks.club/

Nov 28, 2025 · 01:02:48
Qdrant 2025 Conference Interviews

At Qdrant Conference, builders, researchers, and industry practitioners shared how vector search, retrieval infrastructure, and LLM-driven workflows are evolving across developer tooling, AI platforms, analytics teams, and modern search research.


Andrey Vasnetsov (Qdrant) explained how Qdrant was born from the need to combine database-style querying with vector similarity search—something he first built during the COVID lockdowns. He highlighted how vector search has shifted from an ML specialty to a standard developer tool and why hosting an in-person conference matters for gathering honest, real-time feedback from the growing community.


Slava Dubrov (HubSpot) described how his team uses Qdrant to power AI Signals, a platform for embeddings, similarity search, and contextual recommendations that support HubSpot’s AI agents. He shared practical use cases like look-alike company search, reflected on evaluating agentic frameworks, and offered career advice for engineers moving toward technical leadership.


Marina Ariamnova (SumUp) presented her internally built LLM analytics assistant that turns natural-language questions into SQL, executes queries, and returns clean summaries—cutting request times from days to minutes. She discussed balancing analytics and engineering work, learning through real projects, and how LLM tools help analysts scale routine workflows without replacing human expertise.


Evgeniya (Jenny) Sukhodolskaya (Qdrant) discussed the multi-disciplinary nature of DevRel and her focus on retrieval research. She shared her work on sparse neural retrieval, relevance feedback, and hybrid search models that blend lexical precision with semantic understanding—contributing methods like Mini-COIL and shaping Qdrant’s search quality roadmap through end-to-end experimentation and community education.


Speakers


Andrey Vasnetsov

Co-founder & CTO of Qdrant, leading the engineering and platform vision behind a developer-focused vector database and vector-native infrastructure.

Connect: https://www.linkedin.com/in/andrey-vasnetsov-75268897/


Slava Dubrov

Technical Lead at HubSpot working on AI Signals—embedding models, similarity search, and context systems for AI agents.

Connect: https://www.linkedin.com/in/slavadubrov/


Marina Ariamnova

Data Lead at SumUp, managing analytics and financial data workflows while prototyping LLM tools that automate routine analysis.

Connect: https://www.linkedin.com/in/marina-ariamnova/


Evgeniya (Jenny) Sukhodolskaya

Developer Relations Engineer at Qdrant specializing in retrieval research, sparse neural methods, and educational ML content.

Connect: https://www.linkedin.com/in/evgeniya-sukhodolskaya/

Nov 28, 2025 · 51:59
How to Build and Evaluate AI systems in the Age of LLMs - Hugo Bowne-Anderson

In this talk, Hugo Bowne-Anderson, an independent data and AI consultant, educator, and host of the podcasts Vanishing Gradients and High Signal, shares his journey from academic research and curriculum design at DataCamp to advising teams at Netflix, Meta, and the US Air Force. Together, we explore how to build reliable, production-ready AI systems—from prompt evaluation and dataset design to embedding agents into everyday workflows.


You’ll learn about:

  • How to structure teams and incentives for successful AI adoption
  • Practical prompting techniques for accurate timestamp and data generation
  • Building and maintaining evaluation sets to avoid “prompt overfitting”
  • Cost-effective methods for LLM evaluation and monitoring
  • Tools and frameworks for debugging and observing AI behavior (Logfire, Braintrust, Arize Phoenix)
  • The evolution of AI agents—from simple RAG systems to proactive, embedded assistants
  • How to escape “proof of concept purgatory” and prioritize AI projects that drive business value
  • Step-by-step guidance for building reliable, evaluable AI agents


This session is ideal for AI engineers, data scientists, ML product managers, and startup founders looking to move beyond experimentation into robust, scalable AI systems. Whether you’re optimizing RAG pipelines, evaluating prompts, or embedding AI into products, this talk offers actionable frameworks to guide you from concept to production.


LINKS

  • Escaping POC Purgatory: Evaluation-Driven Development for AI Systems - https://www.oreilly.com/radar/escaping-poc-purgatory-evaluation-driven-development-for-ai-systems/
  • Stop Building AI Agents - https://www.decodingai.com/p/stop-building-ai-agents
  • How to Evaluate LLM Apps Before You Launch - https://www.youtube.com/watch?si=90fXJJQThSwGCaYv&v=TTr7zPLoTJI&feature=youtu.be
  • My Vanishing Gradients Substack - https://hugobowne.substack.com/
  • Building LLM Applications for Data Scientists and Software Engineers - https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=datatalksclub

TIMECODES:

00:00 Introduction and Expertise

04:04 Transition to Freelance Consulting and Advising

08:49 Restructuring Teams and Incentivizing AI Adoption

12:22 Improving Prompting for Timestamp Generation

17:38 Evaluation Sets and Failure Analysis for Reliable Software

23:00 Evaluating Prompts: The Cost and Size of Gold Test Sets

27:38 Software Tools for Evaluation and Monitoring

33:14 Evolution of AI Tools: Proactivity and Embedded Agents

40:12 The Future of AI is Not Just Chat

44:38 Avoiding Proof of Concept Purgatory: Prioritizing RAG for Business Value

50:19 RAG vs. Agents: Complexity and Power Trade-Offs

56:21 Recommended Steps for Building Agents

59:57 Defining Memory in Multi-Turn Conversations


Connect with Hugo

  • Twitter - https://x.com/hugobowne
  • Linkedin - https://www.linkedin.com/in/hugo-bowne-anderson-045939a5/
  • Github - https://github.com/hugobowne
  • Website - https://hugobowne.github.io/


Connect with DataTalks.Club:

  • Join the community - https://datatalks.club/slack.html
  • Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
  • Check other upcoming events - https://lu.ma/dtc-events
  • GitHub: https://github.com/DataTalksClub
  • LinkedIn - https://www.linkedin.com/company/datatalks-club/
  • Twitter - https://twitter.com/DataTalksClub
  • Website - https://datatalks.club/
Oct 24, 2025 · 01:01:41
From Biotechnology to Bioinformatics Software - Sebastian Ayala Ruano

In this talk, Sebastian, a bioinformatics researcher and software engineer, shares his inspiring journey from wet lab biotechnology to computational bioinformatics. Hosted by DataTalks.Club, this session explores how data science, AI, and open-source tools are transforming modern biological research, from DNA sequencing to metagenomics and protein structure prediction.


You’ll learn about:

- The difference between wet lab and dry lab workflows in biotechnology

- How bioinformatics enables faster insights through data-driven modeling

- The MicW2Graph project and its role in studying wastewater microbiomes

- Using co-abundance networks and the CCLasso algorithm to map microbial interactions

- How AlphaFold revolutionized protein structure prediction

- Building scientific knowledge graphs to integrate biological metadata

- Open-source tools like VueGen and VueCore for automating reports and visualizations

- The growing impact of AI and large language models (LLMs) in research and documentation

- Key differences between R (BioConductor) and Python ecosystems for bioinformatics


This talk is ideal for data scientists, bioinformaticians, biotech researchers, and AI enthusiasts who want to understand how data science, AI, and biology intersect. Whether you work in genomics, computational biology, or scientific software, you’ll gain insights into real-world tools and workflows shaping the future of bioinformatics.


Links:

- MicW2Graph: https://zenodo.org/records/12507444

- VueGen: https://github.com/Multiomics-Analytics-Group/vuegen

- Awesome-Bioinformatics: https://github.com/danielecook/Awesome-Bioinformatics


TIMECODES:

00:00 Sebastian’s Journey into Bioinformatics
06:02 From Wet Lab to Computational Biology
08:23 Wet Lab vs Dry Lab Explained
12:35 Bioinformatics as Data Science for Biology
15:30 How DNA Sequencing Works
19:29 MicW2Graph and Wastewater Microbiomes
23:10 Building Microbial Networks with CCLasso
26:54 Protein–Ligand Simulation Basics
29:58 Predicting Protein Folding in 3D
33:30 AlphaFold Revolution in Protein Prediction
36:45 Inside the MicW2Graph Knowledge Graph
39:54 VueGen: Automating Scientific Reports
43:56 VueCore: Visualizing Omics Data
47:50 Using AI and LLMs in Bioinformatics
50:25 R vs Python in Bioinformatics Tools
53:17 Closing Thoughts from Ecuador

Connect with Sebastian

  • Twitter - https://twitter.com/sayalaruano
  • Linkedin - https://linkedin.com/in/sayalaruano
  • Github - https://github.com/sayalaruano
  • Website - https://sayalaruano.github.io/


Connect with DataTalks.Club:

  • Join the community - https://datatalks.club/slack.html
  • Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
  • Check other upcoming events - https://lu.ma/dtc-events
  • GitHub: https://github.com/DataTalksClub
  • LinkedIn - https://www.linkedin.com/company/datatalks-club/
  • Twitter - https://twitter.com/DataTalksClub
  • Website - https://datatalks.club/
Oct 24, 2025 · 55:36
Lessons from Applied AI: Tesla, Waymo, and Beyond - Aishwarya Jadhav

In this episode, we talked with Aishwarya Jadhav, a machine learning engineer whose career has spanned Morgan Stanley, Tesla, and now Waymo. Aishwarya shares her journey from big data in finance to applied AI in self-driving, gesture understanding, and computer vision. She discusses building an AI guide dog for the visually impaired, contributing to malaria mapping in Africa, and the challenges of deploying safe autonomous systems. We also explore the intersection of computer vision, NLP, and LLMs, and what it takes to break into the self-driving AI industry.

TIMECODES:

00:51 Aishwarya’s career journey from finance to self-driving AI
05:45 Building AI guide dog for the visually impaired
12:03 Exploring LiDAR, radar, and Tesla’s camera-based approach
16:24 Trust, regulation, and challenges in self-driving adoption
19:39 Waymo, ride-hailing, and gesture recognition for traffic control
24:18 Malaria mapping in Africa and AI for social good
29:40 Deployment, safety, and testing in self-driving systems
37:00 Transition from NLP to computer vision and deep learning
43:37 Reinforcement learning, robotics, and self-driving constraints
51:28 Testing processes, evaluations, and staged rollouts for autonomous driving
52:53 Can multimodal LLMs be applied to self-driving?
55:33 How to get started in self-driving AI careers

Connect with Aishwarya

- Linkedin - https://www.linkedin.com/in/aishwaryajadhav8/

Connect with DataTalks.Club:

- Join the community - https://datatalks.club/slack.html
- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
- Check other upcoming events - https://lu.ma/dtc-events
- GitHub: https://github.com/DataTalksClub
- LinkedIn - https://www.linkedin.com/company/datatalks-club/
- Twitter - https://twitter.com/DataTalksClub
- Website - https://datatalks.club/

Oct 10, 2025 · 59:18
Building reliable AI products in the era of Gen AI and Agents - Ranjitha Kulkarni

In this episode, we talked with Ranjitha Kulkarni, a machine learning engineer with a rich career spanning Microsoft, Dropbox, and now NeuBird AI. Ranjitha shares her journey into ML and NLP, her work building recommendation systems, early AI agents, and cutting-edge LLM-powered products. She offers insights into designing reliable AI systems in the new era of generative AI and agents, and how context engineering and dynamic planning shape the future of AI products.

TIMECODES:

00:00 Career journey and early curiosity
04:25 Speech recognition at Microsoft
05:52 Recommendation systems and early agents at Dropbox
07:44 Joining NeuBird AI
12:01 Defining agents and LLM orchestration
16:11 Agent planning strategies
18:23 Agent implementation approaches
22:50 Context engineering essentials
30:27 RAG evolution in agent systems
37:39 RAG vs agent use cases
40:30 Dynamic planning in AI assistants
43:00 AI productivity tools at Dropbox
46:00 Evaluating AI agents
53:20 Reliable tool usage challenges
58:17 Future of agents in engineering

Connect with Ranjitha

- Linkedin - https://www.linkedin.com/in/ranjitha-gurunath-kulkarni

Connect with DataTalks.Club:

- Join the community - https://datatalks.club/slack.html
- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
- Check other upcoming events - https://lu.ma/dtc-events
- GitHub: https://github.com/DataTalksClub
- LinkedIn - https://www.linkedin.com/company/datatalks-club/
- Twitter - https://twitter.com/DataTalksClub
- Website - https://datatalks.club/

Oct 10, 2025 · 59:45
From Theme Parks to Tesla: Building Data Products That Work

In this episode, we talked with Abouzar Abbaspour, a data engineer whose career spans software engineering in Iran, building crowd and recommendation systems at a Dutch theme park, deploying large-scale ML models at Bol.com, and now working at Tesla. Abouzar shares how he bridged diverse industries, tackled real-world data challenges, and adapted to new roles while keeping a hands-on approach to machine learning and engineering.

TIMECODES:

00:00 Career journey and early motivations
06:17 Moving to Europe for data science
12:18 Working with theme parks and crowd modeling
18:29 Lessons from ride and visitor data
23:06 Building recommendation systems at Efteling
27:26 Joining Bol.com and the Dutch e-commerce industry
32:49 Product and brand recommendation logic
36:09 Experimenting with "Tinder for brands"
40:26 Engagement metrics and product validation
43:02 From ML engineering to data engineering roles
52:04 Hands-on skills at Tesla and industry expectations
57:43 Career growth, learning, and advice

Connect with Abouzar

Linkedin - / abouzar-abbaspour

Website - https://www.abouzar-abbaspour.com/

Connect with DataTalks.Club:

Oct 10, 2025 - 01:00:45

From Semiconductors to Machine Learning: A Career in Data and Teaching

In this episode, we chat with Dashel Ruiz, whose journey spans semiconductors, machine learning, and teaching. Dashel shares how he transitioned from hardware to data science, navigated complex projects in diverse industries, and now combines technical expertise with a passion for teaching. Tune in to hear insights on building a career in data, mastering new technologies, and making an impact both in the lab and the classroom.


TIMECODES

00:00 Dashel's unique career path from music to semiconductors

06:16 The transition into data and software engineering at Microchip

11:44 Discovering machine learning to solve real problems in semiconductor manufacturing

20:40 How Dashel found the Machine Learning Zoomcamp and his experience with it

29:33 The practical advantages of DataTalks.Club courses over other platforms

39:52 Overcoming challenges and the value of the learning community

48:10 Hands-on project experience: From image classification to Kaggle competitions

54:12 Staying motivated throughout the long-term course

59:55 The importance of deployment and full-stack ML skills

1:07:36 Closing thoughts on teaching and future courses


Connect with Dashel

  • Linkedin - https://www.linkedin.com/in/dashel-ruiz-perez-2b036172/


Connect with DataTalks.Club:

  • Join the community - https://datatalks.club/slack.html
  • Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
  • Check other upcoming events - https://lu.ma/dtc-events
  • GitHub: https://github.com/DataTalksClub
  • LinkedIn - https://www.linkedin.com/company/datatalks-club/
  • Twitter - https://twitter.com/DataTalksClub
  • Website - https://datatalks.club/
Oct 10, 2025 - 01:13:26

Lessons from Two Decades of AI - Micheal Lanham

In this episode, we talk with Micheal Lanham, an AI and software innovator with over two decades of experience spanning game development, fintech, oil and gas, and agricultural tech. Micheal shares his journey from building neural network-based games and evolutionary algorithms to writing influential books on AI agents and deep learning. He offers insights into the evolving AI landscape, practical uses of AI agents, and the future of generative AI in gaming and beyond.


TIMECODES

00:00 Micheal Lanham’s career journey and AI agent books

05:45 Publishing journey: AR, Pokémon Go, sound design, and reinforcement learning

10:00 Evolution of AI: evolutionary algorithms, deep learning, and agents

13:33 Evolutionary algorithms in prompt engineering and LLMs

18:13 AI agent books second edition and practical applications

20:57 AI agent workflows: minimalism, task breakdown, and collaboration

26:25 Collaboration and orchestration among AI agents

31:24 Tools and reasoning servers for agent communication

35:17 AI agents in game development and generative AI impact

38:57 Future of generative AI in gaming and immersive content

41:42 Coding agents, new LLMs, and local deployment

45:40 AI model trends and data scientist career advice

53:36 Cognitive testing, evaluation, and monitoring in AI

58:50 Publishing details and closing remarks


Connect with Micheal

  • Linkedin - https://www.linkedin.com/in/micheal-lanham-189693123/


Connect with DataTalks.Club:


Sep 26, 2025 - 59:59

Berlin PyData 2025 Conference Interviews

At PyData Berlin, community members and industry voices highlighted how AI and data tooling are evolving across knowledge graphs, MLOps, small-model fine-tuning, explainability, and developer advocacy.


- Igor Kvachenok (Leuphana University / ProKube) combined knowledge graphs with LLMs for structured data extraction in the polymer industry, and noted how MLOps is shifting toward LLM-focused workflows.

- Selim Nowicki (Distill Labs) introduced a platform that uses knowledge distillation to fine-tune smaller models efficiently, making model specialization faster and more accessible.

- Gülsah Durmaz (Architect & Developer) shared her transition from architecture to coding, creating Python tools for design automation and volunteering with PyData through PyLadies.

- Yashasvi Misra (Pure Storage) spoke on explainable AI, stressing accountability and compliance, and shared her perspective as both a data engineer and active Python community organizer.

- Mehdi Ouazza (MotherDuck) reflected on developer advocacy through video, workshops, and branding, showing how creative communication boosts adoption of open-source tools like DuckDB.



Igor Kvachenok

Master’s student in Data Science at Leuphana University of Lüneburg, writing a thesis on LLM-enhanced data extraction for the polymer industry. Builds RDF knowledge graphs from semi-structured documents and works at ProKube on MLOps platforms powered by Kubeflow and Kubernetes.


Connect: https://www.linkedin.com/in/igor-kvachenok/



Selim Nowicki

Founder of Distill Labs, a startup making small-model fine-tuning simple and fast with knowledge distillation. Previously led data teams at Berlin startups like Delivery Hero, Trade Republic, and Tier Mobility. Sees parallels between today’s ML tooling and dbt’s impact on analytics.


Connect: https://www.linkedin.com/in/selim-nowicki/



Gülsah Durmaz

Architect turned developer, creating Python-based tools for architectural design automation with Rhino and Grasshopper. Active in PyLadies and a volunteer at PyData Berlin, she values the community for networking and learning, and aims to bring ML into architecture workflows.


Connect: https://www.linkedin.com/in/gulsah-durmaz/


Yashasvi (Yashi) Misra

Data Engineer at Pure Storage, community organizer with PyLadies India, PyCon India, and Women Techmakers. Advocates for inclusive spaces in tech and speaks on explainable AI, bridging her day-to-day in data engineering with her passion for ethical ML.


Connect: https://www.linkedin.com/in/misrayashasvi/



Mehdi Ouazza

Developer Advocate at MotherDuck, formerly a data engineer, now focused on building community and education around DuckDB. Runs popular YouTube channels ("mehdio DataTV" and "MotherDuck") and delivered a hands-on workshop at PyData Berlin. Blends technical clarity with creative storytelling.


Connect: https://www.linkedin.com/in/mehd-io/

Sep 26, 2025 - 49:21

From Astronomy to Applied ML - Daniel Egbo

In this episode, we talk with Daniel, an astrophysicist turned machine learning engineer and AI ambassador. Daniel shares his journey bridging astronomy and data science, how he leveraged live courses and public knowledge sharing to grow his skills, and his experiences working on cutting-edge radio astronomy projects and AI deployments. He also discusses practical advice for beginners in data and astronomy, and insights on career growth through community and continuous learning.


TIMECODES

00:00 Lunar eclipse story and Daniel’s astronomy career

04:12 Electromagnetic spectrum and MeerKAT data explained

10:39 Data analysis and positional cross-correlation challenges

15:25 Physics behind radio star detection and observation limits

16:35 Radio astronomy’s advantage and machine learning potential

20:37 Radio astronomy progress and Daniel’s ML journey

26:00 Python tools and experience with ZoomCamps

31:26 Intel internship and exploring LLMs

41:04 Sharing progress and course projects with orchestration tools

44:49 Setting up Airflow 3.0 and building data pipelines

47:39 AI startups, training resources, and NVIDIA courses

50:20 Student access to education, NVIDIA experience, and beginner astronomy programs

57:59 Skills, projects, and career advice for beginners

59:19 Starting with data science or engineering

1:00:07 Course sponsorship, data tools, and learning resources


Connect with Daniel


Connect with DataTalks.Club:

Sep 26, 2025 - 01:03:55

Berlin Buzzwords 2025 Conference Interviews

At Berlin Buzzwords, industry voices highlighted how search is evolving with AI and LLMs.


- Kacper Łukawski (Qdrant) stressed hybrid search (semantic + keyword) as core for RAG systems and promoted efficient embedding models for smaller-scale use.

- Manish Gill (ClickHouse) discussed auto-scaling OLAP databases on Kubernetes, combining infrastructure and database knowledge.

- André Charton (Kleinanzeigen) reflected on scaling search for millions of classifieds, moving from Solr/Elasticsearch toward vector search, while returning to a hands-on technical role.

- Filip Makraduli (Superlinked) introduced a vector-first framework that fuses multiple encoders into one representation for nuanced e-commerce and recommendation search.

- Brian Goldin (Voyager Search) emphasized spatial context in retrieval, combining geospatial data with AI enrichment to add the “where” to search.

- Atita Arora (Voyager Search) highlighted geospatial AI models, the renewed importance of retrieval in RAG, and the cautious but promising rise of AI agents.


Together, their perspectives show a common thread: search is regaining center stage in AI—scaling, hybridization, multimodality, and domain-specific enrichment are shaping the next generation of retrieval systems.


Kacper Łukawski

Senior Developer Advocate at Qdrant, he educates users on vector and hybrid search. He highlighted Qdrant’s support for dense and sparse vectors, the role of search with LLMs, and his interest in cost-effective models like static embeddings for smaller companies and edge apps.

Connect: https://www.linkedin.com/in/kacperlukawski/


Manish Gill

Engineering Manager at ClickHouse, he spoke about running ClickHouse on Kubernetes, tackling auto-scaling and stateful sets. His team focuses on making ClickHouse scale automatically in the cloud. He credited its speed to careful engineering and reflected on the shift from IC to manager.

Connect: https://www.linkedin.com/in/manishgill/


André Charton

Head of Search at Kleinanzeigen, he discussed shaping the company’s search tech—moving from Solr to Elasticsearch and now vector search with Vespa. Kleinanzeigen handles 60M items, 1M new listings daily, and 50k requests/sec. André explained his career shift back to hands-on engineering.

Connect: https://www.linkedin.com/in/andrecharton/


Filip Makraduli

Founding ML DevRel engineer at Superlinked, an open-source framework for AI search and recommendations. Its vector-first approach fuses multiple encoders (text, images, structured fields) into composite vectors for single-shot retrieval. His Berlin Buzzwords demo showed e-commerce search with natural-language queries and filters.

Connect: https://www.linkedin.com/in/filipmakraduli/


Brian Goldin

Founder and CEO of Voyager Search, which began with geospatial search and expanded into documents and metadata enrichment. Voyager indexes spatial data and enriches pipelines with NLP, OCR, and AI models to detect entities like oil spills or windmills. He stressed adding spatial context (“the where”) as critical for search and highlighted Voyager’s 12 years of enterprise experience.

Connect: https://www.linkedin.com/in/brian-goldin-04170a1/


Atita Arora

Director of AI at Voyager Search, with nearly 20 years in retrieval systems, now focused on geospatial AI for Earth observation data. At Berlin Buzzwords she hosted sessions, attended talks on Lucene, GPUs, and Solr, and emphasized retrieval quality in RAG systems. She is cautiously optimistic about AI agents and values the event as both learning hub and professional reunion.

Connect: https://www.linkedin.com/in/atitaarora/


Sep 12, 2025 - 01:07:42

From Medicine to Machine Learning: How Public Learning Turned into a Career - Pastor Soto

In this episode, we talked with Pastor, a medical doctor who built a career in machine learning while studying medicine. Pastor shares how he balanced both fields, leveraged live courses and public sharing to grow his skills, and found opportunities through freelancing and mentoring.

TIMECODES

00:00 Pastor’s background and early programming journey

06:05 Learning new tools and skills on the job while studying medicine

11:44 Balancing medical studies with data science work and motivation

13:48 Applying medical knowledge to data science and vice versa

18:44 Starting freelance work on Upwork and overcoming language challenges

24:03 Joining the machine learning engineering course and benefits of live cohorts

27:41 Engaging with the course community and sharing progress publicly

35:16 Using LinkedIn and social media for career growth and interview opportunities

41:03 Building reputation, structuring learning, and leveraging course projects

50:53 Volunteering and mentoring with DeepLearning.AI and Stanford Coding Place

57:00 Managing time and staying productive while studying medicine and machine learning


Connect with Pastor

Connect with DataTalks.Club:


Aug 22, 2025 - 59:32

How to Rebuild Data Trust? Mindful Data Strategy and Maintenance vs Innovation - Lior Barak

Struggling with data trust issues, dashboard drama, or constant pipeline firefighting? In this deep‑dive interview, Lior Barak shows you how to shift from a reactive “fix‑it” culture to a mindful, impact‑driven practice rooted in Zen/Wabi‑Sabi principles.

You’ll learn:

Why 97 % of CEOs say they use data, but only 24 % call themselves data‑driven

The traffic‑light dashboard pattern (green / yellow / red) that instantly tells execs whether numbers are safe to use

A practical rule for balancing maintenance, rollout, and innovation—and avoiding team burnout

How to quantify ROI on data products, kill failing legacy systems, and handle ad‑hoc exec requests without derailing roadmaps

Turning “imperfect” data into business value with mindful communication, root‑cause logs, and automated incident review loops


🕒 TIMECODES

00:00 Community and mindful data strategy

04:06 Career journey and product management insights

08:03 Wabi-sabi data and the trust crisis

11:47 AI, data imperfection, and trust challenges

20:05 Trust crisis examples and root cause analysis

25:06 Regaining trust through mindful data management

30:47 Traffic light system and effective communication

37:41 Communication gaps and team workload balance

39:58 Maintenance stress and embracing Zen mindset

49:29 Accepting imperfection and measuring impact

56:19 Legacy systems and managing executive requests

01:00:23 Role guidance and closing reflections


🔗 Connect with Lior

LinkedIn - https://www.linkedin.com/in/liorbarak

Website - https://cookingdata.substack.com/

Cooking Data newsletter: https://cookingdata.substack.com/

Data product lifecycle manager: https://app--data-product-lifecycle-manager-c81b10bb.base44.app/


🔗 Connect with DataTalks.Club

Join the community - https://datatalks.club/slack.html

Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/u/0/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ

Check other upcoming events - https://lu.ma/dtc-events

GitHub: https://github.com/DataTalksClub

LinkedIn - https://www.linkedin.com/company/datatalks-club/

Twitter - https://x.com/DataTalksClub

Website - https://datatalks.club/


🔗 Connect with Alexey

Twitter - https://x.com/Al_Grigor

Linkedin - https://www.linkedin.com/in/agrigorev/



Aug 15, 2025 - 01:01:31

From Simulations to Freelance Data Engineering: Orell's Journey Out of Academia and Into Consulting - Orell Garten

In this episode, we talk with Orell about his journey from electrical engineering to freelancing in data engineering. We explore lessons from startup life, working with messy industrial data, the realities of freelancing, and how to stay up to date with new tools.


Topics covered:

  • Why Orell left a PhD and a simulation‑focused start‑up after Covid hit
  • What he learned trying (and failing) to commercialise medical‑imaging simulations
  • The first freelance project and the long, quiet months that followed
  • How he now finds clients, keeps projects small and delivers value quickly
  • Typical work he does for industrial companies: parsing messy machine logs, building simple pipelines, adding structure later
  • Favorite everyday tools (Python, DuckDB, a bit of C++) and the habit of blocking time for learning
  • Advice for anyone thinking about freelancing: cash runway, networking, and focusing on problems rather than “perfect” tech choices


A practical conversation for listeners who are curious about moving from research or permanent roles into freelance data engineering.


🕒 TIMECODES

0:00 Orell’s career and move to freelancing

9:04 Startup experience and data engineering lessons

16:05 Academia vs. startups and starting freelancing

25:33 Early freelancing challenges and networking

34:22 Freelance data engineering and messy industrial data

43:27 Staying practical, learning tools, and growth

50:33 Freelancing challenges and client acquisition

58:37 Tools, problem-solving, and manual work


🔗 CONNECT WITH ORELL

Bluesky - https://bsky.app/profile/orgarten.bsk...

LinkedIn - / ogarten

Github - https://github.com/orgarten

Website - https://orellgarten.com


🔗 CONNECT WITH DataTalksClub

Join the community - https://datatalks.club/slack.html

Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...

Check other upcoming events - https://lu.ma/dtc-events

GitHub: https://github.com/DataTalksClub

LinkedIn - / datatalks-club

Twitter - / datatalksclub

Website - https://datatalks.club/


🔗 CONNECT WITH ALEXEY

Twitter - / al_grigor

Linkedin - / agrigorev

Aug 01, 2025 - 58:22

Can You Quit Your Job and Still Succeed as a Data Freelancer?

Thinking about swapping your 9‑to‑5 for client work, but worried that a long German-style notice period will kill your chances? In this live interview, seven‑year data‑freelance veteran Dimitri walks through his experience of taking his freelance career to the next level.


About the Speaker:

Dimitri Visnadi is an independent data consultant with a focus on data strategy. He has consulted for companies leading the marketing data space, such as Unilever, Ferrero, Heineken, and Red Bull.


He has lived and worked in 6 countries across Europe in both corporate and startup organizations. He was part of data departments at Hewlett-Packard (HP) and a Google-partnered consulting firm, where he worked on data products and strategy.


Having received a Master's in Business Analytics with Computer Science from University College London and a Bachelor's in Business Administration from John Cabot University, Dimitri still has close ties to academia and holds a mentor position in entrepreneurship at both institutions.

🕒 TIMECODES

00:00 Dimitri’s journey from corporate to freelance data specialist

05:41 Job tenure trends, tech career shifts, and freelance types

10:50 Freelancing challenges, success, and finding clients

17:33 Freelance market trends and Dimitri’s job board

23:51 Starting points, top freelance skills, and market insights

32:48 Building a lifestyle business: scaling and work-life balance

45:30 Data Freelancer course and marketing for freelancers

48:33 Subscription services and managing client relationships

56:47 Pricing models and transitioning advice

1:01:02 Notice periods, networking, and risks in freelancing transition

🔗 CONNECT WITH DataTalksClub

Join the community - https://datatalks.club/slack.html

Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...

Check other upcoming events - https://lu.ma/dtc-events

LinkedIn - / datatalks-club

Twitter - / datatalksclub

Website - https://datatalks.club/

🔗 CONNECT WITH DIMITRI

Linkedin - https://www.linkedin.com/in/visnadi/

Jul 25, 2025 - 58:14

From Hackathons to Developer Advocacy - Will Russell

In this podcast episode, we talked with Will Russell about his path from hackathons to developer advocacy.


About the Speaker:

Will Russell is a Developer Advocate at Kestra, known for his videos on workflow orchestration. Previously, Will built open source education programs to help up-and-coming developers make their first contributions in open source. With a passion for developer education, Will creates technical video content and documentation that makes technologies more approachable for developers.

In this episode, we sit down with Will—developer advocate, content creator, and passionate community builder. We’ll hear about his unique path through tech, the lessons he’s learned, and his approach to making complex topics accessible and engaging. Whether you’re curious about open source, hackathons, or what it’s like to bridge the gap between developers and the broader tech community, this conversation is full of insights and inspiration.


🕒 TIMECODES

0:00 Introduction, career journeys, and video setup and workflow

10:41 From hackathons to open source: Early experiences and learning

16:04 Becoming a hackathon organizer and the value of soft skills

23:18 How to organize a hackathon, memorable projects, and creativity

33:39 Major League Hacking: Building community and scaling student programs

41:16 Mentorship, development environments, and onboarding in open source

49:14 Developer advocacy, content strategy, and video tips

57:16 Will’s current projects and future plans for content creation


🔗 CONNECT WITH DataTalksClub

Join the community - https://datatalks.club/slack.html

Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ

Check other upcoming events - https://lu.ma/dtc-events

LinkedIn - https://www.linkedin.com/company/datatalks-club/

Twitter - https://twitter.com/DataTalksClub

Website - https://datatalks.club/


🔗 CONNECT WITH WILL

LinkedIn - https://www.linkedin.com/in/wrussell1999/

Twitter - https://x.com/wrussell1999

GitHub - https://github.com/wrussell1999

Website - https://wrussell.co.uk/

May 26, 2025 - 57:11

Build a Strong Career in Data - Lavanya Gupta

In this podcast episode, we talked with Lavanya Gupta about building a strong career in data.

About the Speaker:

Lavanya is a Carnegie Mellon University (CMU) alumna of the Language Technologies Institute (LTI). She works as a Sr. AI/ML Applied Associate at JPMorgan Chase in their specialized Machine Learning Center of Excellence (MLCOE) vertical. Her latest research on long-context evaluation of LLMs was published in EMNLP 2024.


In addition to having a strong industrial research background of 5+ years, she is also an enthusiastic technical speaker. She has delivered talks at events such as Women in Data Science (WiDS) 2021, PyData, Illuminate AI 2021, TensorFlow User Group (TFUG), and MindHack! Summit. She also serves as a reviewer at top-tier NLP conferences (NeurIPS 2024, ICLR 2025, NAACL 2025). Additionally, through her collaborations with various prestigious organizations, like Anita Borg and Women in Coding and Data Science (WiCDS), she is committed to mentoring aspiring machine learning enthusiasts.


In this episode, we talk about Lavanya Gupta’s journey from software engineer to AI researcher. She shares how hackathons sparked her passion for machine learning, her transition into NLP, and her current work benchmarking large language models in finance. Tune in for practical insights on building a strong data career and navigating the evolving AI landscape.


🕒 TIMECODES

00:00 Lavanya’s journey from software engineer to AI researcher

10:15 Benchmarking long context language models

12:36 Limitations of large context models in real domains

14:54 Handling large documents and publishing research in industry

19:45 Building a data science career: publications, motivation, and mentorship

25:01 Self-learning, hackathons, and networking

33:24 Community work and Kaggle projects

37:32 Mentorship and open-ended guidance

51:28 Building a strong data science portfolio

🔗 CONNECT WITH LAVANYA

LinkedIn - / lgupta18


🔗 CONNECT WITH DataTalksClub

Join the community - https://datatalks.club/slack.html

Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...

Check other upcoming events - https://lu.ma/dtc-events

LinkedIn - / datatalks-club

Twitter - / datatalksclub

Website - https://datatalks.club/

May 09, 2025 - 51:60

From Supply Chain Management to Digital Warehousing and FinOps - Eddy Zulkifly

In this podcast episode, we talked with Eddy Zulkifly about his path from supply chain management to digital warehousing and FinOps.


About the Speaker:

  • Eddy Zulkifly is a Staff Data Engineer at Kinaxis, building robust data platforms across Google Cloud, Azure, and AWS. With a decade of experience in data, he actively shares his expertise as a Mentor on ADPList and Teaching Assistant at Uplimit. Previously, he was a Senior Data Engineer at Home Depot, specializing in e-commerce and supply chain analytics. Currently pursuing a Master’s in Analytics at the Georgia Institute of Technology, Eddy is also passionate about open-source data projects and enjoys watching/exploring the analytics behind the Fantasy Premier League.


    In this episode, we dive into the world of data engineering and FinOps with Eddy Zulkifly, Staff Data Engineer at Kinaxis. Eddy shares his unconventional career journey—from optimizing physical warehouses with Excel to building digital data platforms in the cloud.


    🕒 TIMECODES

    0:00 Eddy’s career journey: From supply chain to data engineering

    8:18 Tools & learning: Excel, Docker, and transitioning to data engineering

    21:57 Physical vs. digital warehousing: Analogies and key differences

    31:40 Introduction to FinOps: Cloud cost optimization and vendor negotiations

    40:18 Resources for FinOps: Certifications and the FinOps Foundation

    45:12 Standardizing cloud cost reporting across AWS/GCP/Azure

    50:04 Eddy’s master’s degree and closing thoughts


    🔗 CONNECT WITH EDDY

    Twitter - https://x.com/eddarief

    Linkedin - https://www.linkedin.com/in/eddyzulkifly/

    Github: https://github.com/eyzyly/eyzyly

    ADPList: https://adplist.org/mentors/eddy-zulkifly


    🔗 CONNECT WITH DataTalksClub

    Join the community - https://datatalks.club/slack.html

    Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ


    Check other upcoming events - https://lu.ma/dtc-events

    LinkedIn - https://www.linkedin.com/company/datatalks-club/

    Twitter - https://twitter.com/DataTalksClub

    Website - https://datatalks.club/

  • Apr 04, 2025 - 52:08

    Data Intensive AI - Bartosz Mikulski

    In this podcast episode, we talked with Bartosz Mikulski about Data Intensive AI.


    About the Speaker:

    Bartosz is an AI and data engineer. He specializes in moving AI projects from the good-enough-for-a-demo phase to production by building a testing infrastructure and fixing the issues detected by tests. On top of that, he teaches programmers and non-programmers how to use AI. He contributed one chapter to the book 97 Things Every Data Engineer Should Know, and he was a speaker at several conferences, including Data Natives, Berlin Buzzwords, and Global AI Developer Days. 


    In this episode, we discuss Bartosz’s career journey, the importance of testing in data pipelines, and how AI tools like ChatGPT and Cursor are transforming development workflows. From prompt engineering to building Chrome extensions with AI, we dive into practical use cases, tools, and insights for anyone working in data-intensive AI projects. Whether you’re a data engineer, AI enthusiast, or just curious about the future of AI in tech, this episode offers valuable takeaways and real-world experiences.


    0:00 Introduction to Bartosz and his background

    4:00 Bartosz’s career journey from Java development to AI engineering

    9:05 The importance of testing in data engineering

    11:19 How to create tests for data pipelines

    13:14 Tools and approaches for testing data pipelines

    17:10 Choosing Spark for data engineering projects

    19:05 The connection between data engineering and AI tools

    21:39 Use cases of AI in data engineering and MLOps

    25:13 Prompt engineering techniques and best practices

    31:45 Prompt compression and caching in AI models

    33:35 Thoughts on DeepSeek and open-source AI models

    35:54 Using AI for lead classification and LinkedIn automation

    41:04 Building Chrome extensions with AI integration

    43:51 Comparing Cursor and GitHub Copilot for coding

    47:11 Using ChatGPT and Perplexity for AI-assisted tasks

    52:09 Hosting static websites and using AI for development

    54:27 How blogging helps attract clients and share knowledge

    58:15 Using AI to assist with writing and content creation


    🔗 CONNECT WITH Bartosz

    LinkedIn: https://www.linkedin.com/in/mikulskibartosz/

    Github: https://github.com/mikulskibartosz

    Website: https://mikulskibartosz.name/blog/


    🔗 CONNECT WITH DataTalksClub

    Join the community - https://datatalks.club/slack.html

    Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ

    Check other upcoming events - https://lu.ma/dtc-events

    LinkedIn - https://www.linkedin.com/company/datatalks-club/

    Twitter - https://twitter.com/DataTalksClub

    Website - https://datatalks.club/

    Mar 21, 2025 - 54:55

    MLOps in Corporations and Startups - Nemanja Radojkovic

    In this podcast episode, we talked with Nemanja Radojkovic about MLOps in Corporations and Startups.


    About the Speaker:

    Nemanja Radojkovic is Senior Machine Learning Engineer at Euroclear.


    In this event, we’re diving into the world of MLOps, comparing life in startups versus big corporations. Joining us again is Nemanja, a seasoned machine learning engineer with experience spanning Fortune 500 companies and agile startups. We explore the challenges of scaling MLOps on a shoestring budget, the trade-offs between corporate stability and startup agility, and practical advice for engineers deciding between these two career paths. Whether you’re navigating legacy frameworks or experimenting with cutting-edge tools, this conversation is for you.


    1:00 MLOps in corporations versus startups

    6:03 The agility and pace of startups

    7:54 MLOps on a shoestring budget

    12:54 Cloud solutions for startups

    15:06 Challenges of cloud complexity versus on-premise

    19:19 Selecting tools and avoiding vendor lock-in

    22:22 Choosing between a startup and a corporation

    27:30 Flexibility and risks in startups

    29:37 Bureaucracy and processes in corporations

    33:17 The role of frameworks in corporations

    34:32 Advantages of large teams in corporations

    40:01 Challenges of technical debt in startups

    43:12 Career advice for junior data scientists

    44:10 Tools and frameworks for MLOps projects

    49:00 Balancing new and old technologies in skill development

    55:43 Data engineering challenges and reliability in LLMs

    57:09 On-premise vs. cloud solutions in data-sensitive industries

    59:29 Alternatives like Dask for distributed systems


    🔗 CONNECT WITH NEMANJA

    LinkedIn -   / radojkovic  

    Github - https://github.com/baskervilski


    🔗 CONNECT WITH DataTalksClub

    Join the community - https://datatalks.club/slack.html

    Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...

    Check other upcoming events - https://lu.ma/dtc-events 

    LinkedIn -   / datatalks-club   

    Twitter -   / datatalksclub   

    Website - https://datatalks.club/ 

    Mar 14, 2025 · 58:04
    Trends in Data Engineering – Adrian Brudaru

    In this podcast episode, we talked with Adrian Brudaru about the past, present, and future of data engineering.


    About the speaker:

    Adrian Brudaru studied economics in Romania but soon got bored with how creative the industry was, and chose to go instead for the more factual side. He ended up in Berlin at the age of 25 and started a role as a business analyst. At the age of 30, he had enough of startups and decided to join a corporation, but quickly found out that it did not provide the challenge he wanted.

    As going back to startups was not a desirable option either, he decided to postpone his decision by taking freelance work and has never looked back since. Five years later, he co-founded a company in the data space to try new things. This company is also looking to release open source tools to help democratize data engineering.


    0:00 Introduction to DataTalks.Club

    1:05 Discussing trends in data engineering with Adrian

    2:03 Adrian's background and journey into data engineering

    5:04 Growth and updates on Adrian's company, dltHub

    9:05 Challenges and specialization in data engineering today

    13:00 Opportunities for data engineers entering the field

    15:00 The "Modern Data Stack" and its evolution

    17:25 Emerging trends: AI integration and Iceberg technology

    27:40 DuckDB and the emergence of portable, cost-effective data stacks

    32:14 The rise and impact of dbt in data engineering

    34:08 Alternatives to dbt: SQLMesh and others

    35:25 Workflow orchestration tools: Airflow, Dagster, Prefect, and GitHub Actions

    37:20 Audience questions: Career focus in data roles and AI engineering overlaps

    39:00 The role of semantics in data and AI workflows

    41:11 Focusing on learning concepts over tools when entering the field

    45:15 Transitioning from backend to data engineering: challenges and opportunities

    47:48 Current state of the data engineering job market in Europe and beyond

    49:05 Introduction to Apache Iceberg, Delta, and Hudi file formats

    50:40 Suitability of these formats for batch and streaming workloads

    52:29 Tools for streaming: Kafka, SQS, and related trends

    58:07 Building AI agents and enabling intelligent data applications

    59:09 Closing discussion on the place of tools like dbt in the ecosystem


    🔗 CONNECT WITH ADRIAN BRUDARU

    LinkedIn -  / data-team

    Website - https://adrian.brudaru.com/


    🔗 CONNECT WITH DataTalksClub

    Join the community - https://datatalks.club/slack.html

    Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...

    Check other upcoming events - https://lu.ma/dtc-events

    LinkedIn -  /datatalks-club

    Twitter -  /datatalksclub

    Website - https://datatalks.club/

    Mar 07, 2025 · 56:59
    Competitive Machine Learning And Teaching – Alexander Guschin

    In this podcast episode, we talked with Alexander Guschin about launching a career off Kaggle.


    About the Speaker:

    Alexander Guschin is a Machine Learning Engineer with 10+ years of experience, a Kaggle Grandmaster ranked 5th globally, and a teacher to 100K+ students. He leads DS and SE teams and contributes to open-source ML tools.

    0:00 Starting with Machine Learning: Challenges and Early Steps

    13:05 Community and Learning Through Kaggle Sessions

    17:10 Broadening Skills Through Kaggle Participation

    18:54 Early Competitions and Lessons Learned

    21:10 Transitioning to Simpler Solutions Over Time

    23:51 Benefits of Kaggle for Starting a Career in Machine Learning

    29:08 Teamwork vs. Solo Participation in Competitions

    31:14 Schoolchildren in AI Competitions

    42:33 Transition to Industry and MLOps

    50:13 Encouraging teamwork in student projects

    50:48 Designing competitive machine learning tasks

    52:22 Leaderboard types for tracking performance

    53:44 Managing small-scale university classes

    54:17 Experience with Coursera and online teaching

    59:40 Convincing managers about Kaggle's value

    61:38 Secrets of Kaggle competition success

    63:11 Generative AI's impact on competitive ML

    65:13 Evolution of automated ML solutions

    66:22 Reflecting on competitive data science experience


    🔗 CONNECT WITH ALEXANDER GUSCHIN

    LinkedIn - https://www.linkedin.com/in/1aguschin/

    Website - https://www.aguschin.com/


    🔗 CONNECT WITH DataTalksClub

    Join DataTalks.Club: https://datatalks.club/slack.html

    Our events: https://datatalks.club/events.html

    Datalike Substack - https://datalike.substack.com/

    LinkedIn:  / datatalks-club

    Feb 14, 2025 · 53:27
    Redefining AI Infrastructure: Open-Source, Chips, and the Future Beyond Kubernetes – Andrey Cheptsov

    In this podcast episode, we talked with Andrey Cheptsov about the future of AI infrastructure.


    About the Speaker:

    Andrey Cheptsov is the founder and CEO of dstack, an open-source alternative to Kubernetes and Slurm, built to simplify the orchestration of AI infrastructure. Before dstack, Andrey worked at JetBrains for over a decade helping different teams make the best developer tools.

    During the event, Andrey discussed the complexities of AI infrastructure. We explored topics like the challenges of using Kubernetes for AI workloads, the need to rethink container orchestration, and the future of hybrid and cloud-only infrastructures. Andrey also shared insights into the role of on-premise and bare-metal solutions, edge computing, and federated learning.

    00:00 Andrey's Career Journey: From JetBrains to dstack

    5:00 The Motivation Behind dstack

    7:00 Challenges in Machine Learning Infrastructure

    10:00 Transitioning from Cloud to On-Prem Solutions

    14:30 Reflections on OpenAI's Evolution

    17:30 Open Source vs Proprietary Models: A Balanced Perspective

    21:01 Monolithic vs. Decentralized AI businesses

    22:05 The role of privacy and control in AI for industries like banking and healthcare

    30:00 Challenges in training large AI models: GPUs and distributed systems

    37:03 DeepSpeed's efficient training approach vs. brute force methods

    39:00 Challenges for small and medium businesses: hosting and fine-tuning models

    47:01 Managing Kubernetes challenges for AI teams

    52:00 Hybrid vs. cloud-only infrastructure

    56:03 On-premise vs. bare-metal solutions

    58:05 Exploring edge computing and its challenges


    🔗 CONNECT WITH ANDREY CHEPTSOV

    Twitter -  / andrey_cheptsov  

    Linkedin -  / andrey-cheptsov  

    GitHub - https://github.com/dstackai/dstack/

    Website - https://dstack.ai/


    🔗 CONNECT WITH DataTalksClub

    Join DataTalks.Club: https://datatalks.club/slack.html

    Our events: https://datatalks.club/events.html

    Datalike Substack - https://datalike.substack.com/

    LinkedIn:  / datatalks-club

    Jan 31, 2025 · 56:55
    Linguistics and Fairness - Tamara Atanasoska

    In this podcast episode, we talked with Tamara Atanasoska about building fair AI systems.


    About the Speaker:

    Tamara works on ML explainability, interpretability and fairness as an Open Source Software Engineer at probable. She is a maintainer of fairlearn and a contributor to scikit-learn and skops. Tamara has both a computer science/software engineering and a computational linguistics (NLP) background.

    During the event, the guest discussed their career journey from software engineering to open-source contributions, focusing on explainability in AI through Scikit-learn and Fairlearn. They explored fairness in AI, including challenges in credit loans, hiring, and decision-making, and emphasized the importance of tools, human judgment, and collaboration. The guest also shared their involvement with PyLadies and encouraged contributions to Fairlearn.

    00:00 Introduction to the event and the community

    01:51 Topic introduction: Linguistic fairness and socio-technical perspectives in AI

    02:37 Guest introduction: Tamara’s background and career

    03:18 Tamara’s career journey: Software engineering, music tech, and computational linguistics

    09:53 Tamara’s background in language and computer science

    14:52 Exploring fairness in AI and its impact on society

    21:20 Fairness in AI models

    26:21 Automating fairness analysis in models

    32:32 Balancing technical and domain expertise in decision-making

    37:13 The role of humans in the loop for fairness

    40:02 Joining Probable and working on open-source projects

    46:20 skops library and its integration with Hugging Face

    50:48 PyLadies and community involvement

    55:41 The ethos of Scikit-learn and Fairlearn


    🔗 CONNECT WITH TAMARA ATANASOSKA

    LinkedIn - https://www.linkedin.com/in/tamaraatanasoska

    GitHub - https://github.com/TamaraAtanasoska


    🔗 CONNECT WITH DataTalksClub

    Join DataTalks.Club: https://datatalks.club/slack.html

    Our events: https://datatalks.club/events.html

    Datalike Substack - https://datalike.substack.com/

    LinkedIn:  / datatalks-club


    Jan 17, 2025 · 53:12
    Career choices, transitions and promotions in and out of tech - Agita Jaunzeme

    In this podcast episode, we talked with Agita Jaunzeme about Career choices, transitions and promotions in and out of tech.


    About the Speaker:

    Agita has designed a career spanning DevOps/DataOps engineering, management, community building, education, and facilitation. She has worked on projects across corporate, startup, open source, and non-governmental sectors. Following her passion, she founded an NGO focusing on the inclusion of expats and locals in Porto. Embodying the values of innovation, automation, and continuous learning, Agita provides practical insights on promotions, career pivots, and aligning work with passion and purpose.


    During this event, the guest discussed their career journey, starting with their transition from art school to programming and later into DevOps, eventually taking on leadership roles. They explored the challenges of burnout and the importance of volunteering, founding an NGO to support inclusion, gender equality, and sustainability. The conversation also covered key topics like mentorship, the differences between data engineering and data science, and the dynamics of managing volunteers versus employees. Additionally, the guest shared insights on community management, developer relations, and the importance of product vision and team collaboration.

    0:00 Introduction and Welcome

    1:28 Guest Introduction: Agita’s Background and Career Highlights

    3:05 Transition to Tech: From Art School to Programming

    5:40 Exploring DevOps and Growing into Leadership Roles

    7:24 Burnout, Volunteering, and Founding an NGO

    11:00 Volunteering and Mentorship Initiatives

    14:00 Discovering Programming Skills and Early Career Challenges

    15:50 Automating Work Processes and Earning a Promotion

    19:00 Transitioning from DevOps to Volunteering and Project Management

    24:00 Managing Volunteers vs. Employees and Building Organizational Skills

    31:07 Personality traits in engineering vs. data roles

    33:14 Differences in focus between data engineers and data scientists

    36:24 Transitioning from volunteering to corporate work

    37:38 The role and responsibilities of a community manager

    39:06 Community management vs. developer relations activities

    41:01 Product vision and team collaboration

    43:35 Starting an NGO and legal processes

    46:13 NGO goals: inclusion, gender equality, and sustainability

    49:02 Community meetups and activities

    51:57 Living off-grid in a forest and sustainability

    55:02 Unemployment party and brainstorming session

    59:03 Unemployment party: the process and structure


    🔗 CONNECT WITH AGITA JAUNZEME

    LinkedIn - /agita


    🔗 CONNECT WITH DataTalksClub

    Join DataTalks.Club: https://datatalks.club/slack.html

    Our events: https://datatalks.club/events.html

    Datalike Substack - https://datalike.substack.com/

    LinkedIn:  / datatalks-club

    Jan 10, 2025 · 55:21
    Career advice, learning, and featuring women in ML and AI - Isabella Bicalho

    In this podcast episode, we talked with Isabella Bicalho about Career advice, learning, and featuring women in ML and AI.


    About the Speaker:

    Isabella is a Machine Learning Engineer and Data Scientist with three years of hands-on AI development experience. She draws upon her early computational research expertise to develop ML solutions. While contributing to open-source projects, she runs a newsletter dedicated to showcasing women's accomplishments in data science.


    During this event, the guest discussed her transition into machine learning, her freelance work in AI, and the growing AI scene in France. She shared insights on freelancing versus full-time work, the value of open-source contributions, and developing both technical and soft skills. The conversation also covered career advice, mentorship, and her Substack series on women in data science, emphasizing leadership, motivation, and career opportunities in tech.

    0:00 Introduction

    1:23 Background of Isabella Bicalho

    2:02 Transition to machine learning

    4:03 Study and work experience

    5:00 Living in France and language learning

    6:03 Internship experience

    8:45 Focus areas of Inria

    9:37 AI development in France

    10:37 Current freelance work

    11:03 Freelancing in machine learning

    13:31 Moving from research to freelancing

    14:03 Freelance vs. full-time data science

    17:00 Finding first freelance client

    18:00 Involvement in open-source projects

    20:17 Passion for open-source and teamwork

    23:52 Starting new projects

    25:03 Community project experience

    26:02 Teaching and learning

    29:04 Contributing to open-source projects

    32:05 Open-source tools vs. projects

    33:32 Importance of community-driven projects

    34:03 Learning resources

    36:07 Green space segmentation project

    39:02 Developing technical and soft skills

    40:31 Gaining insights from industry experts

    41:15 Understanding data science roles

    41:31 Project challenges and team dynamics

    42:05 Turnover in open-source projects

    43:05 Managing expectations in open-source work

    44:50 Mentorship in projects

    46:17 Role of AI tools in learning

    47:59 Overcoming learning challenges

    48:52 Discussion on Substack

    49:01 Interview series on women in data

    50:15 Insights from women in data science

    51:20 Impactful stories from Substack

    53:01 Leadership challenges in projects

    54:19 Career advice and opportunities

    56:07 Motivating others to step out of comfort zone

    57:06 Contacting for Substack story sharing

    58:00 Closing remarks and connections


    🔗 CONNECT WITH ISABELLA BICALHO

    GitHub: https://github.com/bellabf

    LinkedIn:  / isabella-frazeto


    🔗 CONNECT WITH DataTalksClub

    Join DataTalks.Club: https://datatalks.club/slack.html

    Our events: https://datatalks.club/events.html

    Datalike Substack - https://datalike.substack.com/

    LinkedIn:  / datatalks-club

    Dec 13, 2024 · 54:41
    AI in Industry: Trust, Return on Investment and Future - Maria Sukhareva

    Reflection on an Almost Two-Year Journey of Generative AI in Industry – Maria Sukhareva

    About the speaker:

    Maria Sukhareva is a principal key expert in Artificial Intelligence at Siemens with over 15 years of experience at the forefront of generative AI technologies. Known for her keen eye for technological innovation, Maria excels at transforming cutting-edge AI research into practical, value-driven tools that address real-world needs. Her approach is both hands-on and results-focused, with a commitment to creating scalable, long-term solutions that improve communication, streamline complex processes, and empower smarter decision-making. Maria's work reflects a balanced vision, where the power of innovation is met with ethical responsibility, ensuring that her AI projects deliver impactful and production-ready outcomes.


    We talked about:

    00:00 DataTalks.Club intro

    02:13 Career journey: From linguistics to AI

    08:02 The Evolution of AI Expertise and its Future

    13:10 AI vulnerabilities: Bypassing bot restrictions

    17:00 Non-LLM classifiers as a more robust solution

    22:56 Risks of chatbot deployment: Reputational and financial

    27:13 The role of AI as a tool, not a replacement for human workers

    31:41 The role of human translators in the age of AI

    34:49 Evolution of English and its Germanic roots

    38:44 Beowulf and Old English

    39:43 Impact of the Norman occupation on English grammar

    42:34 Identifying mushrooms with AI apps and safety precautions

    45:08 Decoding ancient languages like Sumerian

    49:43 The evolution of machine translation and multilingual models

    53:01 Challenges with low-resource languages and inconsistent orthography

    57:28 Transition from academia to industry in AI


    Join our Slack: https://datatalks.club/slack.html

    Our events: https://datatalks.club/events.html

    Dec 06, 2024 · 52:59
    Large Hadron Collider and Mentorship – Anastasia Karavdina

    We talked about:

    00:00 DataTalks.Club intro

    00:00 Large Hadron Collider and Mentorship

    02:35 Career overview and transition from physics to data science

    07:02 Working at the Large Hadron Collider

    09:19 How particles collide and the role of detectors

    11:03 Data analysis challenges in particle physics and data science similarities

    13:32 Team structure at the Large Hadron Collider

    20:05 Explaining the connection between particle physics and data science

    23:21 Software engineering practices in particle physics

    26:11 Challenges during interviews for data science roles

    29:30 Mentoring and offering advice to job seekers

    40:03 The STAR method and its value in interviews

    50:32 Paid vs unpaid mentorship and finding the right fit


    About the speaker:

    Anastasia is a particle physicist turned data scientist, with experience in large-scale experiments like those at the Large Hadron Collider. She also worked at Blue Yonder, scaling AI-driven solutions for global supply chain giants, and at Kaufland e-commerce, focusing on NLP and search. Anastasia is a mentor for ML/AI, dedicated to helping her mentees achieve their goals. She is passionate about growing the next generation of data science elite in Germany: from Data Analysts up to ML Engineers.


    Join our Slack: https://datatalks.club/slack.html

    Nov 22, 2024 · 54:14
    MLOps as a Team - Raphaël Hoogvliets

    We talked about:

    00:00 DataTalks.Club intro

    02:34 Career journey and transition into MLOps

    08:41 Dutch agriculture and its challenges

    10:36 The concept of "technical debt" in MLOps

    13:37 Trade-offs in MLOps: moving fast vs. doing things right

    14:05 Building teams and the role of coordination in MLOps

    16:58 Key roles in an MLOps team: evangelists and tech translators

    23:01 Role of the MLOps team in an organization

    25:19 How MLOps teams assist product teams

    27:56 Standardizing practices in MLOps

    32:46 Getting feedback and creating buy-in from data scientists

    36:55 The importance of addressing pain points in MLOps

    39:06 Best practices and tools for standardizing MLOps processes

    42:31 Value of data versioning and reproducibility

    44:22 When to start thinking about data versioning

    45:10 Importance of data science experience for MLOps

    46:06 Skill mix needed in MLOps teams

    47:33 Building a diverse MLOps team

    48:18 Best practices for implementing MLOps in new teams

    49:52 Starting with CI/CD in MLOps

    51:21 Key components for a complete MLOps setup

    53:08 Role of package registries in MLOps

    54:12 Using Docker vs. packages in MLOps

    57:56 Examples of MLOps success and failure stories

    1:00:54 What MLOps is in simple terms

    1:01:58 The complexity of achieving easy deployment, monitoring, and maintenance


    Join our Slack: https://datatalks.club/slack.html

    Nov 08, 2024 · 55:36
    Using Data to Create Liveable Cities - Rachel Lim

    We talked about:

    00:00 DataTalks.Club intro

    01:56 Using data to create livable cities

    02:52 Rachel's career journey: from geography to urban data science

    04:20 What does a transport scientist do?

    05:34 Short-term and long-term transportation planning

    06:14 Data sources for transportation planning in Singapore

    08:38 Rachel's motivation for combining geography and data science

    10:19 Urban design and its connection to geography

    13:12 Defining a livable city

    15:30 Livability of Singapore and urban planning

    18:24 Role of data science in urban and transportation planning

    20:31 Predicting travel patterns for future transportation needs

    22:02 Data collection and processing in transportation systems

    24:02 Use of real-time data for traffic management

    27:06 Incorporating generative AI into data engineering

    30:09 Data analysis for transportation policies

    33:19 Technologies used in text-to-SQL projects

    36:12 Handling large datasets and transportation data in Singapore

    42:17 Generative AI applications beyond text-to-SQL

    45:26 Publishing public data and maintaining privacy

    45:52 Recommended datasets and projects for data engineering beginners

    49:16 Recommended resources for learning urban data science


    About the speaker:

    Rachel is an urban data scientist dedicated to creating liveable cities through the innovative use of data. With a background in geography and a master's in urban data science, she blends qualitative and quantitative analysis to tackle urban challenges. Her aim is to integrate data-driven techniques with urban design to foster sustainable and equitable urban environments.


    Links:

    - https://datamall.lta.gov.sg/content/datamall/en/dynamic-data.html

    Join our slack: https://datatalks.club/slack.html

    Nov 01, 2024 · 45:36
    DataTalks.Club 4th Anniversary AMA Podcast – Alexey Grigorev and Johanna Bayer

    We talked about:

    00:00 DataTalks.Club intro

    00:00 DataTalks.Club anniversary "Ask Me Anything" event with Alexey Grigorev

    02:29 The founding of DataTalks.Club

    03:52 Alexey's transition from Java work to DataTalks.Club

    04:58 Growth and success of DataTalks.Club courses

    12:04 Motivation behind creating a free-to-learn community

    24:03 Staying updated in data science through pet projects

    26:37 Hosting a second podcast and maintaining programming skills

    28:56 Skepticism about LLMs and their relevance

    31:53 Transitioning to DataTalks.Club and personal reflections

    33:32 Memorable moments and the first event's success

    36:19 Community building during the pandemic

    38:31 AI's impact on data analysts and future roles

    42:24 Discussion on AI in healthcare

    44:37 Age and reflections on personal milestones

    47:54 Building communities and personal connections

    49:34 Future goals for the community and courses

    51:18 Community involvement and engagement strategies

    53:46 Ideas for competitions and hackathons

    54:20 Inviting guests to the podcast

    55:29 Course updates and future workshops

    56:27 Podcast preparation and research process

    58:30 Career opportunities in data science and transitioning fields

    1:01:10 Book recommendations and personal reading experiences


    About the speaker:

    Alexey Grigorev is the founder of DataTalks.Club.


    Join our slack: https://datatalks.club/slack.html

    Oct 26, 2024 · 53:41
    Human-Centered AI for Disordered Speech Recognition - Katarzyna Foremniak

    We talked about:

    00:00 DataTalks.Club intro

    08:06 Background and career journey of Katarzyna

    09:06 Transition from linguistics to computational linguistics

    11:38 Merging linguistics and computer science

    15:25 Understanding phonetics and morpho-syntax

    17:28 Exploring morpho-syntax and its relation to grammar

    20:33 Connection between phonetics and speech disorders

    24:41 Improvement of voice recognition systems

    27:31 Overview of speech recognition technology

    30:24 Challenges of ASR systems with atypical speech

    30:53 Strategies for improving recognition of disordered speech

    37:07 Data augmentation for training models

    40:17 Transfer learning in speech recognition

    42:18 Challenges of collecting data for various speech disorders

    44:31 Stammering and its connection to fluency issues

    45:16 Polish consonant combinations and pronunciation challenges

    46:17 Use of Amazon Transcribe for generating podcast transcripts

    47:28 Role of language models in speech recognition

    49:19 Contextual understanding in speech recognition

    51:27 How voice recognition systems analyze utterances

    54:05 Personalization of ASR models for individuals

    56:25 Language disorders and their impact on communication

    58:00 Applications of speech recognition technology

    1:00:34 Challenges of personalized and universal models

    1:01:23 Voice recognition in automotive applications

    1:03:27 Humorous voice recognition failures in cars

    1:04:13 Closing remarks and reflections on the discussion


    About the speaker:

    Katarzyna is a computational linguist with over 10 years of experience in NLP and speech recognition. She has developed language models for automotive brands like Audi and Porsche and specializes in phonetics, morpho-syntax, and sentiment analysis.

    Kasia also teaches at the University of Warsaw and is passionate about human-centered AI and multilingual NLP.

    Join our slack: https://datatalks.club/slack.html

    Oct 10, 2024 · 48:01
    DataOps, Observability, and The Cure for Data Team Blues - Christopher Bergh

    0:00

    Hi everyone, welcome to our event. This event is brought to you by DataTalks.Club, which is a community of people who love

    0:06

    data. We have weekly events, and today is one of such events. I guess we

    0:12

    are also a community of people who like to wake up early if you're from the States, right, Christopher? Or maybe not so

    0:19

    much, because this is the time we usually have our events. For our guests

    0:27

    and presenters from the States we usually do it in the evening of Berlin time, but unfortunately it kind of

    0:34

    slipped my mind. Anyway, we have a lot of events; you can check them in the

    0:41

    description, there's a link. I don't think there are a lot of them right now on that link, but we will be

    0:48

    adding more and more. I think we have five or six interviews scheduled, so keep an eye on that. Do not forget

    0:56

    to subscribe to our YouTube channel; this way you will get notified about all our future streams, which will be as awesome

    1:02

    as the one today. And, very important, do not forget to join our community, where you can hang out with

    1:09

    other data enthusiasts. During today's interview you can ask any question: there's a pinned link in the live chat, so click

    1:18

    on that link, ask your question, and we will be covering these questions during the interview. Now I will stop sharing my

    1:27

    screen. There's a message, and, Christopher, it's from

    1:34

    you. We actually have this on YouTube, so they have not seen what you wrote,

    1:39

    but there is a message to anyone who's watching this right now from Christopher, saying "Hello everyone." Can I

    1:46

    call you Chris? Okay, I should look on YouTube then. Okay, yeah, but anyway, you don't

    1:53

    need to; you'll need to focus on answering questions, and

    1:58

    I'll be keeping an eye on all the questions. So,

    2:04

    if you're ready, we can start. I'm ready. And you prefer Christopher,

    2:10

    not Chris, right? Chris is fine. Chris is fine, it's a bit shorter.

    2:18

    okay so this week we'll talk about data Ops again maybe it's a tradition that we talk about data Ops every like once per

    2:25

    year but we actually skipped one year so because we did not have we haven't had

    2:31

    Chris for some time so today we have a very special guest Christopher Christopher is the co-founder CEO and

    2:37

    head chef or hat cook at data kitchen with 25 years of experience maybe this

    2:43

    is outdated uh cuz probably now you have more and maybe you stopped counting I

    2:48

    don't know but like with tons of years of experience in analytics and software engineering Christopher is known as the

    2:55

    co-author of the data Ops cookbook and data Ops Manifesto and it's not the

    3:00

    first time we have Christopher here on the podcast we interviewed him two years ago also about data Ops and this one

    3:07

    will be about DataOps so we'll catch up and see what actually changed in

    3:13

    these two years and yeah so welcome to the interview well thank you for having

    3:19

    me I'm I'm happy to be here and talking all things related to DataOps and why

    3:24

    why why bother with DataOps and happy to talk about the company or or what's changed

    3:30

    excited yeah so let's dive in so the questions for today's interview are prepared by Johanna Bayer as always

    3:37

    thanks Johanna for your help so before we start with our main topic for today

    3:42

    DataOps uh let's start with your background can you tell us about your career journey so far and also for those who

    3:50

    have not heard have not listened to the previous podcast maybe you can um talk

    3:55

    about yourself and also for those who did listen to the previous you can also maybe give a summary of what has changed

    4:03

    in the last two years so we'll do yeah so um my name is Chris so I guess I'm

    4:09

    a sort of an engineer so I spent about the first 15 years of my career in

    4:15

    software sort of working and building some AI systems some non- AI systems uh

    4:21

    at uh the US's NASA and MIT Lincoln Lab and then some startups and then um

    4:30

    Microsoft and then about 2005 I got I got the data bug uh I think you know my

    4:35

    kids were small and I thought oh this data thing was easy and I'd be able to go home uh for dinner at 5 and life

    4:41

    would be fine um because I was a big you started your own company right and uh it didn't work out that way

    4:50

    and um and what was interesting is is for me it the problem wasn't doing the

    4:57

    data like I we had smart people who did data science and data engineering the act of creating things it was like the

    5:04

    systems around the data that were hard um things it was really hard to not have

    5:11

    errors in production and I would sort of driving to work and I had a Blackberry at the time and I would not look at my

    5:18

    Blackberry all all morning I had this long drive to work and I'd sit in the parking lot and take a deep breath and

    5:24

    look at my Blackberry and go uh oh is there going to be any problems today and I'd be and if there wasn't I'd walk in

    5:30

    very happy um and if there was I'd have to like brace myself um and you know and

    5:36

    then the second problem is the team I worked for we just couldn't go fast enough the customers were super

    5:42

    demanding they didn't care they all they always thought things should be faster and we are always behind and so um how

    5:50

    do you you know how do you live in that world where things are breaking left and right you're terrified of making errors

    5:57

    um and then second you just can't go fast enough um and it's pre-Hadoop era

    6:02

    right it's like before all this big data Tech yeah before this was we were using

    6:08

    uh SQL Server um and we actually you know we had smart people so we we we

    6:14

    built an engine in SQL Server that made SQL Server a columnar

    6:20

    database so we built a columnar database inside of SQL Server um so uh

    6:26

    in order to make certain things fast and and uh yeah it was it was really uh it's not

    6:33

    bad I mean the principles are the same right before Hadoop it's it's still a database there's still indexes there's

    6:38

    still queries um things like that we we uh at the time uh you would use OLAP

    6:43

    engines we didn't use those but you those reports you know are for models it's it's not that different um you know

    6:50

    we had a rack of servers instead of the cloud um so yeah and I think so what what I

    6:57

    took from that was uh it's just hard to run a team of people to do do data and analytics and it's not

    7:05

    really I I took it from a manager perspective I started to read Deming and

    7:11

    think about the work that we do as a factory you know and in a factory that produces insight and not automobiles um

    7:18

    and so how do you run that factory so it produces things that are good of good

    7:24

    quality and then second since I had come from software I've been very influenced

    7:29

    by the DevOps movement how you automate deployment how you run in an agile way how you

    7:35

    produce um how you how you change things quickly and how you innovate and so

    7:41

    those two things of like running you know running a really good solid production line that has very low errors

    7:47

    um and then second changing that production line very very often they're kind of opposite right um and so

    7:55

    how do you how do you as a manager how do you technically approach that and

    8:00

    then um 10 years ago when we started DataKitchen um we've always been a profitable company and so we started off

    8:07

    uh with some customers we started building some software and realized that we couldn't work any other way and that

    8:13

    the way we work wasn't understood by a lot of people so we had to write a book and a Manifesto to kind of share our our

    8:21

    methods and then so yeah we've been in so we've been in business now about a little over 10

    8:28

    years oh that's cool and uh like what

    8:33

    uh so let's talk about DataOps and you mentioned DevOps and how you were inspired by that and by the way like do

    8:41

    you remember roughly when DevOps as a thing started to appear like when did people start calling these principles

    8:49

    and like tools around them as DevOps yeah so Agile Manifesto well first of all the I

    8:57

    mean I had a boss in 1990 at NASA who had this idea build a

    9:03

    little test a little learn a lot right that was his Mantra and then which made

    9:09

    a lot of sense um and so and then the sort of Agile software manifesto

    9:14

    came out which is very similar in 2001 and then um the sort of first real

    9:22

    DevOps was a guy at Twitter started to do automated deployment you know

    9:27

    push a button and that was like 2009-ish and so the first I think DevOps

    9:33

    Meetup was around then so it's it's it's been 15 years I guess 6 like I was

    9:39

    trying to so I started my career in 2010 so I my first job was a Java

    9:44

    developer and like I remember for some things like we would just uh SFTP to the

    9:52

    machine and then put the jar archive there and then like keep our fingers crossed that it doesn't break uh uh like

    10:00

    it was not really the I wouldn't call it this way right you were deploying you

    10:06

    had a deploy process I put it yeah

    10:11

    right was that so that was documented too it was like put the jar on production cross your

    10:17

    fingers I think there was uh like a page on uh some internal wiki uh yeah that

    10:25

    describes like with passwords and don't like what you should do yeah that was and and I think what's interesting is

    10:33

    why that changed right and and we laugh at it now but that was why didn't you

    10:38

    invest in automating deployment or a whole bunch of automated regression

    10:44

    tests right that would run because I think in software now that would be rare

    10:49

    that people wouldn't use CI/CD they wouldn't have some automated tests you know functional

    10:56

    regression tests that would be the exception whereas that was the norm at the beginning of your career and so that's

    11:03

    what's interesting and I think you know if we if we talk about what's changed in the last two three years I I think it is

    11:10

    getting more standard there are um there's a lot more companies who are

    11:15

    talking DataOps or data observability um there's a lot more tools that are a lot more people are

    11:22

    using Git in data and analytics than ever before I think thanks to dbt um and

    11:29

    there's a lot of tools that are I think getting more code Centric right that

    11:35

    they're not treating their configuration like a black box there there's several

    11:41

    BI tools that tout the fact that they're uh you know they're Git-centric you know and and so and that

    11:49

    they're testable and that they have APIs so things like that I think people maybe let's take a step back and just do a

    11:57

    quick summary of what DataOps is and then we can talk about like what changed in the last two years sure so I

    12:06

    guess it starts with a problem and that it's it sort of

    12:11

    admits some dark things about data and analytics and that we're not really successful and we're not really happy um

    12:19

    and if you look at the statistics on sort of projects and problems and even

    12:25

    the psychology like I think about a year or two ago we did a survey of

    12:31

    data Engineers 700 data engineers and 78% of them wanted their job to come with a therapist and 50% were thinking

    12:38

    of leaving the career altogether and so why why is everyone sort of unhappy well I I I think what happens is

    12:46

    teams either fall into two buckets they're sort of heroic teams who

    12:52

    are doing their they're working night and day they're trying really hard for their customer um and then they get

    13:01

    burnt out and then they quit honestly and then the second team have wrapped

    13:06

    their projects up in so much process and proceduralism and steps that doing

    13:12

    anything is sort of so slow and boring that they again leave in frustration um

    13:18

    or or live in cynicism and and that like the only outcome is quit and

    13:24

    start uh woodworking yeah the only outcome really is quit and start woodworking

    13:29

    and um as a as a manager I always hated that right because when when your team

    13:35

    is either full of heroes or proceduralism you always have people who have the whole system in their head

    13:42

    they're certainly key people and then when they leave they take all that knowledge with them and then that

    13:48

    creates a bottleneck and so both of which are aren aren't and I think the

    13:53

    main idea of DataOps is there's a balance between fear and heroism

    14:00

    that you can live you don't you know you don't have to be fearful 95% of the time maybe one or two percent it's good to be

    14:06

    fearful and you don't have to be a hero again maybe one or two percent it's good to be a hero but there's a balance um and

    14:13

    and in that balance you actually are much more prod

Aug 15, 2024 · 53:47
    Working as a Core Developer in the Scikit-Learn Universe - Guillaume Lemaître

    In this podcast episode, we talked with Guillaume Lemaître about navigating scikit-learn and imbalanced-learn. 🔗 CONNECT WITH Guillaume Lemaître LinkedIn - https://www.linkedin.com/in/guillaume-lemaitre-b9404939/ Twitter - https://x.com/glemaitre58 Github - https://github.com/glemaitre Website - https://glemaitre.github.io/ 🔗 CONNECT WITH DataTalksClub Join the community - https://datatalks-club.slack.com/join/shared_invite/zt-2hu0sjeic-ESN7uHt~aVWc8tD3PefSlA#/shared-invite/email Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/u/0/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ Check other upcoming events - https://lu.ma/dtc-events LinkedIn - https://www.linkedin.com/company/datatalks-club/ Twitter - https://twitter.com/DataTalksClub Website - https://datatalks.club/ 🔗 CONNECT WITH ALEXEY Twitter - https://twitter.com/Al_Grigor Linkedin - https://www.linkedin.com/in/agrigorev/ 🎙 ABOUT THE PODCAST At DataTalksClub, we organize live podcasts that feature a diverse range of guests from the data field. Each podcast is a free-form conversation guided by a prepared set of questions, designed to learn about the guests’ career trajectories, life experiences, and practical advice. These insightful discussions draw on the expertise of data practitioners from various backgrounds. We stream the podcasts on YouTube, where each session is also recorded and published on our channel, complete with timestamps, a transcript, and important links. 
You can access all the podcast episodes here - https://datatalks.club/podcast.html 📚Check our free online courses ML Engineering course - http://mlzoomcamp.com Data Engineering course - https://github.com/DataTalksClub/data-engineering-zoomcamp MLOps course - https://github.com/DataTalksClub/mlops-zoomcamp Analytics in Stock Markets - https://github.com/DataTalksClub/stock-markets-analytics-zoomcamp LLM course - https://github.com/DataTalksClub/llm-zoomcamp Read about all our courses in one place - https://datatalks.club/blog/guide-to-free-online-courses-at-datatalks-club.html 👋🏼 GET IN TOUCH If you want to support our community, use this link - https://github.com/sponsors/alexeygrigorev If you're a company and want to support us, contact at alexey@datatalks.club

    Jul 26, 2024 · 52:31
    Building a Domestic Risk Assessment Tool - Sabina Firtala

    Links:

    • LinkedIn: https://www.linkedin.com/company/frontline100/
    • Ba Linh Le's LinkedIn: https://www.linkedin.com/in/ba-linh-le-/
    • Sabina's LinkedIn: https://www.linkedin.com/in/sabina-firtala/
    • Twitter: https://x.com/frontline_100?mx=2
    • Website: https://www.frontline100.com/

    Free LLM course: https://github.com/DataTalksClub/llm-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

    Jul 13, 2024 · 49:36
    Berlin Buzzwords 2024

    We stream the podcasts on YouTube, where each session is also recorded and published on our channel, complete with timestamps, a transcript, and important links. You can access all the podcast episodes here - https://datatalks.club/podcast.html 📚Check our free online courses ML Engineering course - http://mlzoomcamp.com Data Engineering course - https://github.com/DataTalksClub/data-engineering-zoomcamp MLOps course - https://github.com/DataTalksClub/mlops-zoomcamp Analytics in Stock Markets - https://github.com/DataTalksClub/stock-markets-analytics-zoomcamp LLM course - https://github.com/DataTalksClub/llm-zoomcamp Read about all our courses in one place - https://datatalks.club/blog/guide-to-free-online-courses-at-datatalks-club.html 👋🏼 GET IN TOUCH If you want to support our community, use this link - https://github.com/sponsors/alexeygrigorev If you’re a company, support us at alexey@datatalks.club

    Jul 06, 2024 · 37:33
    Community Building and Teaching in AI & Tech - Erum Afzal

    We talked about:

    • Erum's Background
    • Omdena Academy and Erum’s Role There
    • Omdena’s Community and Projects
    • Course Development and Structure at Omdena Academy
    • Student and Instructor Engagement
    • Engagement and Motivation
    • The Role of Teaching in Community Building
    • The Importance of Communities for Career Building
    • Advice for Aspiring Instructors and Freelancers
    • DS and ML Talent Market Saturation
    • Resources for Learning AI and Community Building
    • Erum’s Resource Recommendations


    Links:

    • LinkedIn: https://www.linkedin.com/in/erum-afzal-64827b24/

    • Twitter: https://twitter.com/Erum55449739

    Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

    May 10, 2024 · 50:01
    Working in Open Source - Probabl.ai and sklearn - Vincent Warmerdam

    We talked about:

    • Vincent’s Background
    • Scikit-learn’s History and Company Formation
    • Maintaining and Transitioning Open Source Projects
    • Teaching and Learning Through Open Source
    • Role of Developer Relations and Content Creation
    • Teaching Through Calm Code and The Importance of Content Creation
    • Current Projects and Future Plans for Calm Code
    • Data Processing Tricks and The Importance of Innovation
    • Learning the Fundamentals and Changing the Way You See a Problem
    • Dev Rel and Core Dev in One
    • Why :probabl. Needs a Dev Rel
    • Exploration of Skrub and Advanced Data Processing
    • Personal Insights on Scikit-learn and Industry Trends
    • Vincent’s Upcoming Projects

    Links:

    • probabl. YouTube channel: https://www.youtube.com/@UCIat2Cdg661wF5DQDWTQAmg
    • Calmcode website: https://calmcode.io/
    • probabl. website: https://probabl.ai/


    May 03, 2024 · 52:02
    AI for Ecology, Biodiversity, and Conservation - Tanya Berger-Wolf

    Links:

    • Biodiversity and Artificial Intelligence pdf: https://www.gpai.ai/projects/responsible-ai/environment/biodiversity-and-AI-opportunities-recommendations-for-action.pdf


    Apr 26, 2024 · 51:48
    Knowledge Graphs and LLMs Across Academia and Industry - Anahita Pakiman

    We talked about:

    • Anahita's Background
    • Mechanical Engineering and Applied Mechanics
    • Finite Element Analysis vs. Machine Learning
    • Optimization and Semantic Reporting
    • Application of Knowledge Graphs in Research
    • Graphs vs Tabular Data
    • Computational graphs
    • Graph Data Science and Graph Machine Learning
    • Combining Knowledge Graphs and Large Language Models (LLMs)
    • Practical Applications and Projects
    • Challenges and Learnings
    • Anahita’s Recommendations


    Links:

    • GitHub repo: https://github.com/antahiap/ADPT-LRN-PHYS/tree/main

    Apr 05, 2024 · 53:15
    Inclusive Data Leadership Coaching - Tereza Iofciu

    We talked about:

    • Tereza’s background
    • Switching from an Individual Contributor to Lead
    • Python Pizza and the pizza management metaphor
    • Learning to figure things out on your own and how to receive feedback
    • Tereza as a leadership coach
    • Podcasts
    • Tereza’s coaching framework (selling yourself vs bragging)
    • The importance of retrospectives
    • The importance of communication and active listening
    • Convincing people you don’t have power over
    • Building relationships and empathy
    • Inclusive leadership


    Links:

    • LinkedIn: https://www.linkedin.com/in/tereza-iofciu/
    • Twitter: https://twitter.com/terezaif
    • Github: https://github.com/terezaif
    • Website: https://terezaiofciu.com


    Mar 29, 2024 · 48:17
    Building Production Search Systems - Daniel Svonava
    Mar 22, 2024 · 58:26
    Building Machine Learning Products - Reem Mahmoud

    We talked about:


    • Reem’s background
    • Context-aware sensing and transfer learning
    • Shifting focus from PhD to industry
    • Reem’s experience with startups and dealing with prejudices towards PhDs
    • AI interviewing solution
    • How candidates react to getting interviewed by an AI avatar
    • End-to-end overview of a machine learning project
    • The pitfalls of using LLMs in your process
    • Mitigating biases
    • Addressing specific requirements for specific roles
    • Reem’s resource recommendations


    Links:

    • LinkedIn: https://www.linkedin.com/in/reemmahmoud/recent-activity/all/
    • Website: https://topmate.io/reem_mahmoud


    Mar 16, 2024 · 56:48