Keith Galli
  • 86
  • 15 039 991
Solving 100 Python Pandas Problems! (from easy to very difficult)
In this tutorial, you'll get hands-on practice with the Python pandas library and build the data manipulation and analysis skills that matter in data science. You'll learn how to create, modify, and analyze DataFrames, handle missing data (NaNs), clean messy data, and generate some visualizations. By tackling a variety of problems, from basic data handling to advanced DataFrame techniques, you'll build a solid foundation in managing and interpreting real-world datasets using pandas.
Repo we're working off of (credit to Alex Riley, who put the repo together):
github.com/ajcr/100-pandas-puzzles
My code solutions (use repo above for blank starting template):
github.com/KeithGalli/100-pandas-puzzles
Hope that you enjoy this video. If you do, make sure to like it and subscribe so you don't miss future videos like this!
Video Timeline!
0:00 - Intro & Setup
2:14 - Problems (1-3) Initial pandas setup
4:42 - Problems (4-10) DataFrame operations
4:52 - 4) Create a dataframe from dictionary
5:24 - 5) Display dataframe summary
5:41 - 6) First 3 rows of the dataframe
6:02 - 7) Select ‘animal’ and ‘age’ columns
7:42 - 8) Data in specific rows and columns
9:06 - 9) Rows with visits greater than 3
9:57 - 10) Rows with NaN in age
10:56 - 11) Cats younger than 3 years
11:35 - 12) Age between 2 and 4
12:45 - 13) Change age in row ‘f’
15:56 - 14) Sum of all visits
16:41 - 15) Average age by animal
20:21 - 16) Modify and revert rows
24:06 - 17) Count by animal type
25:28 - Quick review
26:17 - 18) Sort by age and visits
28:07 - 19) Convert 'priority' to boolean
29:42 - 20) Replace 'snake' with 'python'
30:53 - 21) Mean age by animal and visits
33:49 - Advanced DataFrame techniques
33:57 - 22) Filter duplicate integers
43:18 - 23) Subtract row mean
45:42 - 24) Column with smallest sum
50:39 - 25) Count unique rows
53:17 - 26) Column with third NaN
1:10:27 - Solution review for 26
1:17:13 - 27) Sum of top three values
1:24:01 - 28) Sum by column condition
1:40:11 - Recent problem review
1:42:53 - 29) Count differences since last zero
1:56:19 - 30) Locate largest values
2:08:38 - 31) Replace negatives with mean
2:17:43 - 32) Rolling mean over groups
2:23:10 - Series and DatetimeIndex
2:23:12 - 33) DatetimeIndex for 2015
2:27:56 - 34) Sum values on Wednesdays
2:45:04 - 35) Monthly mean values
2:46:16 - 36) Best value in four-month groups
2:50:26 - 37) DatetimeIndex of third Thursdays
2:59:03 - Cleaning Data
2:59:40 - 38) Fill missing FlightNumber
3:02:45 - 39) Split column by delimiter
3:06:47 - 40) Fix city name capitalization
3:08:30 - 41) Reattach columns
3:13:11 - 42) Fix airline name punctuation
3:17:45 - 43) Expand RecentDelays into columns
3:27:31 - MultiIndexes in Pandas
3:27:34 - 44) Construct a MultiIndex
3:30:37 - Solution review
3:32:44 - 45) Lexicographically sorted check
3:32:58 - 46) Select specific MultiIndex labels
3:34:23 - 47) Slice Series with MultiIndex
3:35:24 - 48) Sum by first level
3:37:47 - 49) Alternative sum method
3:40:08 - Additional solution insights
3:41:22 - 50) Swap MultiIndex levels
3:45:27 - Minesweeper problems
3:45:44 - 51) Generate coordinate grid
4:00:28 - 52) Add 'safe' or 'mine' column
4:03:04 - 53) Count adjacent mines
4:27:33 - Review solution to 53
4:33:02 - Skipped problems 54 & 55
4:33:11 - Plotting
4:33:12 - 56) Scatter plot with black x markers
4:41:26 - 57) Plot four data types
4:52:50 - 58) Overlay multiple graphs
5:03:11 - 59) Hourly stock data summary
5:14:12 - 60) Candlestick plot
------------------
Practice your Python Pandas data science skills with problems on StrataScratch!
stratascratch.com/?via=keith
Views: 37,592

Videos

Solving Real-World Data Science Problems with LLMs! (Historical Document Analysis)
13K views · 1 month ago
In this video we walk through the process of analyzing historical documents using Python & Large Language Models. We start by setting up LLMs using both closed-source (OpenAI API) and open-source (Llama 2 via Ollama) options. Next, we walk through how we can leverage the LLMs to parse out entities from text. After this we actually start playing around with our data, loading in a specific subcat...
How to make your GitHub more impressive to Employers! (5 simple tips)
3.9K views · 2 months ago
In this video, we look into how you can enhance your GitHub profile to help catch the eye of potential employers. With insights from my experience in hiring for data science and programming roles, we walk through five actionable tips to elevate your GitHub presence. These range from showcasing your coding projects effectively to optimizing your profile's appearance and readability. Whether you'...
Can You Solve These 3 Data Analysis Puzzles? (AnalystBuilder & Python Pandas)
3.2K views · 4 months ago
Check out AnalystBuilder! www.analystbuilder.com/?via=keith Join me as I dive into the Analyst Builder platform created by fellow YouTuber @AlexTheAnalyst! In this video, we tackle a series of Python programming challenges, demonstrating real-time problem-solving and coding skills. We navigate through various tasks, from identifying high-risk heart attack patients using data analysis to manipula...
Python Project: Implement a REST API with Flask & Flasgger Libraries!
2.7K views · 5 months ago
We continue where we left off in the last video and implement two REST endpoints for a book review API. We implement a GET request to retrieve all our reviews and a POST request to add a new review to our Airtable database. We use the Flasgger library to create interactive documentation for our endpoints. We use ChatGPT to help us write our endpoints' Python code. We use Render.com to d...
How to create & deploy an API in Python! (with interactive documentation)
5K views · 5 months ago
In this video, we work through the process of creating and deploying a Python3 based API using libraries such as Flask, Flask-Restful, & Flasgger. Link to code: github.com/KeithGalli/python-api-example Part 2: ua-cam.com/video/rCrDYRBOuNw/v-deo.htmlsi=bBeT9orOpjr-089S In this first video, we start with the basics, setting up GET endpoints and learning how to deploy our API to the cloud (using R...
Complete Regular Expressions Tutorial! (with exercises for practice)
17K views · 1 year ago
Practice your Python Pandas data science skills with problems on StrataScratch! stratascratch.com/?via=keith In this video we go through all the fundamentals of using regular expressions (regexes) to match patterns in programming. In this video we cover the following: - Character Sets [a-zA-Z0-9] - Quantifiers *, +, ?, {3,5} - Metacharacters ^ . | $ - Character Classes \b \s \w \d - Groups - Loo...
Full Data Science Mock Interview! (featuring Kylie Ying)
14K views · 1 year ago
Check out Mobile Pixels! bit.ly/3WKUC55 In this video we walk through a full-length data science interview. The task in the video is to develop a model to identify bots on a social media platform. In the video we cover topics including feature vectorization, one-hot encodings, dataset building, and more! Check out Kylie's channel: @KylieYYing Follow me on social media! Instagram | instagram.com...
Full Python Portfolio Project! Create a smart program to download & transcribe top podcasts.
14K views · 1 year ago
Check out www.assemblyai.com/? to start transcribing as many podcasts as your heart desires! In this video we create a Python program that can automatically scrape the RSS feeds of your favorite podcasters, pulling out the episodes you’ll find most interesting, and downloading and transcribing them. This project leverages a wide range of Python skills, making it a good portfolio project. In it you’l...
Solving Real-World Data Science Interview Questions! (with Python Pandas)
105K views · 1 year ago
Visit brilliant.org/KeithGalli/ to get started learning STEM for free, and the first 200 people will get 20% off their annual premium subscription. In this video we solve a series of Data Science Interview questions on Stratascratch. We start with easy problems using Python Pandas and then progressively get more difficult. At the end of the video we do five non-coding interview questions that fo...
5 Jupyter Notebook Tips & Tricks to Improve your Data Science Workflow!
54K views · 1 year ago
Visit brilliant.org/KeithGalli/ to get started learning STEM for free, and the first 200 people will get 20% off their annual premium subscription. In this video we walk through some of my favorite tips & tricks for doing data science with Jupyter Notebooks. Many of these tips have helped me become more efficient writing Python code for my data science projects. Topics covered: - Running bash co...
Solving real world data science problems with Python! (computer vision edition)
39K views · 2 years ago
Practice your Python Pandas data science skills with problems on StrataScratch! stratascratch.com/?via=keith In this video we work on a real world computer vision problem using Python. The problem task is to create a model that can distinguish a flower known as “La Eterna” from other types of flowers. To do this we create convolutional neural networks (CNNs) using the Tensorflow/Keras libraries...
Want to land your first data science role??... I have a BIG announcement! (no degree required)
17K views · 2 years ago
Complete Natural Language Processing (NLP) Tutorial in Python! (with examples)
127K views · 2 years ago
Solving real-world data analysis problems with Python Pandas! (Lego dataset analysis)
84K views · 2 years ago
The Future of Data Science! (Snickerdoodle AI/ML Framework)
22K views · 2 years ago
How I became an unemployed MIT grad still living with my parents.
57K views · 2 years ago
The data science machine!! (Lenovo ThinkPad P15 unboxing)
35K views · 3 years ago
I wrote a program to automatically post a video if I hit 1000 subscribers!
9K views · 3 years ago
Learn Python, Programming, & Data Science | Channel Trailer
33K views · 3 years ago
Solving Coding Interview Questions in Python on LeetCode (easy & medium problems)
172K views · 3 years ago
How to Schedule & Automatically Run Python Code!
136K views · 3 years ago
How to Generate an Analytics Report (pdf) in Python!
151K views · 3 years ago
Solving real world data science tasks with Python Beautiful Soup! (movie dataset creation)
282K views · 3 years ago
How to Make a High Quality Tutorial Video! (workflow, camera equipment, and software tools)
19K views · 3 years ago
Comprehensive Python Beautiful Soup Web Scraping Tutorial! (find/find_all, css select, scrape table)
301K views · 3 years ago
Real-World Python Neural Nets Tutorial (Image Classification w/ CNN) | Tensorflow & Keras
83K views · 3 years ago
Introduction to Neural Networks in Python (what you need to know) | Tensorflow/Keras
86K views · 4 years ago
Python Data Science Project Ideas! (for all skill levels)
71K views · 4 years ago
Professional Code Refactor! (Cleaning Python Code & Rewriting it to use Classes)
41K views · 4 years ago

COMMENTS

  • @souravbarua3991
    @souravbarua3991 31 minutes ago

    Your videos are really helpful. Please make this kind of project video with pyspark also. Thank you.

  • @ryandavis280
    @ryandavis280 9 hours ago

    OMG keith you are a lifesaver! thank you!

  • @franciscoortega104
    @franciscoortega104 1 day ago

    Thanks Keith for this video! I'm new on data science I'm using your videos to practice and learn a lot more. Really thanks!

  • @DJdopeMike
    @DJdopeMike 1 day ago

    Thank you for the great overview of NLP!

  • @overnights2572
    @overnights2572 2 days ago

    It is still very helpful... straight to the point, many thanks

  • @andreogimenes
    @andreogimenes 2 days ago

    49 seconds theres a disgusting sound!

  • @user-fq5kb9nx3c
    @user-fq5kb9nx3c 2 days ago

    I just enter data analysis area and amazing this videos made 4 years before already! thanks for made this, learnt your skills and problem solving as talents, appreciated!

  • @tdcode
    @tdcode 2 days ago

    Man, you're crazy 🤣🤣🤣🤣🤣🤣🤣🤣🤣. This is awesome! Thanks for a colossal and great video!!!🎉🎉

  • @ArshiaFajar1234
    @ArshiaFajar1234 3 days ago

    Just a quick question... can I submit this as my Intro to Data Science course semester project?

    • @KeithGalli
      @KeithGalli 2 days ago

      My recommendation would be to try to take the skills that you learn in this project and apply them to a different dataset. You can find a lot of interesting datasets on Kaggle.

    • @ArshiaFajar1234
      @ArshiaFajar1234 2 days ago

      @KeithGalli gonna try and I'll let everyone know what happened. Since my professor expects a project worthy of a PhD from a third-semester Data Science student 😭.

  • @panth5501
    @panth5501 4 days ago

    Great content, solution of 23 I believe is wrong.

  • @SleepyOrca-hj4sf
    @SleepyOrca-hj4sf 4 days ago

    My fiance is acrewed!

  • @nehasinha3861
    @nehasinha3861 4 days ago

    Thank you for this :)

  • @AnthonyRonaldBrown
    @AnthonyRonaldBrown 4 days ago

    The A.R.B Strongest Games Programs in the World! :) ua-cam.com/video/Xg36QswuC_U/v-deo.html

  • @vishalcrazy5121
    @vishalcrazy5121 5 days ago

    Thank you for this Keith .

  • @abdulbasitnisar
    @abdulbasitnisar 5 days ago

    Thank you such much!! whatever you are doing actually is life changing for people like me who is self learning these! Thank you!!!

  • @lordvoldemort1985
    @lordvoldemort1985 7 days ago

    in the last example you could bin the prices into bins of less than 5, 5-20, 20-50, 50-100, etc and use hue
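The binning idea in this comment can be sketched with `pd.cut`. The `price` column and its values below are made up for illustration, and the bin edges simply follow the ones the commenter lists; the resulting `price_bin` column is the kind of categorical column that could then be passed as `hue` to a seaborn plot.

```python
import pandas as pd

# Hypothetical price data; the real dataset from the video is not reproduced here.
prices = pd.DataFrame({"price": [3.0, 12.5, 20.0, 45.0, 99.0, 250.0]})

# Bin edges taken from the comment: under 5, 5-20, 20-50, 50-100, and above.
bins = [0, 5, 20, 50, 100, float("inf")]
labels = ["<5", "5-20", "20-50", "50-100", "100+"]

# pd.cut uses right-inclusive intervals by default, so 20.0 falls in "5-20"
# and a price of exactly 5 would fall in the "<5" bin.
prices["price_bin"] = pd.cut(prices["price"], bins=bins, labels=labels)

print(prices)
```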

  • @amiralekperov6283
    @amiralekperov6283 8 days ago

    On the final steps I get the error - agg function failed (using groupby). What might it be?

  • @saikumar7247
    @saikumar7247 8 days ago

    sir could u make same like numpy video

  • @user-en1on9ix8t
    @user-en1on9ix8t 9 days ago

    What I see is you should increase your testosterones replace running with building muscles.

    • @user-en1on9ix8t
      @user-en1on9ix8t 9 days ago

      And please don't take anti depression pils.....read antifragile by Nassim Taleb.

  • @sushibooshi
    @sushibooshi 9 days ago

    More machine learning content! This is awesome stuff Keith!

  • @myc0301
    @myc0301 10 days ago

    Man, I have a comment here: to run the groupby part, I had to do this:
        numeric_columns = df.select_dtypes(include='number')
        result = numeric_columns.groupby(df['Type 1']).mean()
        result
    The TypeError indicates that the kind of operation you were doing in the video is no longer available. Hope this helps anyone!

  • @NuanceWebsites
    @NuanceWebsites 10 days ago

    Bro, you are a genius!!!

  • @premkumarramanathan8848
    @premkumarramanathan8848 10 days ago

    !ls is not recognized as an internal or external command, operable program or batch file. I'm using Anaconda in Windows 10. How to solve this problem, please help.

  • @akashtribhuvan8124
    @akashtribhuvan8124 11 days ago

    Hi Keith, can you check this out and pin @_Nelyen's comment? Instead of writing:
        df.groupby(["Type 1"]).mean()
    try writing:
        df.groupby(["Type 1"]).mean(numeric_only=True)
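The two groupby fixes suggested in these comments can be compared side by side. This is a minimal sketch, assuming a small made-up DataFrame with a `Type 1` column and one string column; it is not the dataset from the video.

```python
import pandas as pd

# Hypothetical stand-in for the dataset used in the video;
# the column names and values are illustrative only.
df = pd.DataFrame({
    "Name": ["Bulbasaur", "Ivysaur", "Charmander", "Squirtle"],
    "Type 1": ["Grass", "Grass", "Fire", "Water"],
    "HP": [45, 60, 39, 44],
    "Attack": [49, 62, 52, 48],
})

# In pandas 2.x, mean() raises a TypeError on the string column "Name"
# unless non-numeric columns are excluded explicitly.
by_type = df.groupby("Type 1").mean(numeric_only=True)

# Alternative from the other comment: select the numeric columns first,
# then group them by the "Type 1" values of the original frame.
numeric_columns = df.select_dtypes(include="number")
by_type_alt = numeric_columns.groupby(df["Type 1"]).mean()

print(by_type)
```

Both approaches give the same per-type averages here; `numeric_only=True` is the shorter of the two.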

  • @h3arty
    @h3arty 11 days ago

    Fleet Battle - my id is Miaaaow... Fleet me betchez!!!

  • @101touchapps
    @101touchapps 11 days ago

    a couple of years ago (5 years) i did a pokemon pandas tutorial from you and it totally got me into the world of data science. i came back to say thanks for that tutorial. It really helped me. now am a python instructor.

    • @KeithGalli
      @KeithGalli 10 days ago

      Love it!! Thanks for the message. Glad I could play a small part in your journey. Congrats on being an instructor now.

  • @peaceful420
    @peaceful420 12 days ago

    Thank you so much brilliant guy, answered so many important questions 🙏

  • @Crocodile645
    @Crocodile645 12 days ago

    Your smile and body language made me laugh... So hard... you're a born comedian, thank you for such great content man 😘

  • @Crocodile645
    @Crocodile645 12 days ago

    Hahahahahaha hahahahahaha hahahahahaha hahahahahaha hahahahahaha hahahahahaha hahahahahaha 😂😂😂 you're so funny... Don't know why I laugh a lot everytime I watch you.. thanks for making boring information so enjoyable

  • @aronisalberg
    @aronisalberg 12 days ago

    thanks a lot bro so good please make more

  • @StarkTsorian
    @StarkTsorian 13 days ago

    Thank you for this. I recently graduated with a masters in data science and finding a job has been brutal. Thank you for the tips.

  • @Vutra99
    @Vutra99 15 days ago

    Thailand bar girls here I come

  • @renatolippi
    @renatolippi 15 days ago

    Excellent! Thank you very much for this video!! Please more with this format 👏

  • @vijaykumar-od7kx
    @vijaykumar-od7kx 15 days ago

    Excellent tutorial to learn the fundamentals of SCI-Kit

  • @javaluvawithjeremystones6315
    @javaluvawithjeremystones6315 16 days ago

    Really amazing video! Thanks for the great content! For the part where it was allowing you to not escape the period in the square brackets, it's not sublime, you don't need to escape characters inside square brackets, although it won't complain if you do escape them. The only exception I can currently think of is the ^ symbol and only if you put it at the beginning.
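The point about escaping inside square brackets can be checked with Python's `re` module. This is a small sketch using made-up test strings; note that besides a leading `^`, characters such as `]`, `\`, and `-` also keep a special meaning inside a character class.

```python
import re

# Inside a character class, "." is a literal period, escaped or not.
unescaped = re.findall(r"[.]", "a.b.c")
escaped = re.findall(r"[\.]", "a.b.c")

# A leading "^" negates the class; once escaped it is a literal caret.
negated = re.findall(r"[^.]", "a.b")
literal_caret = re.findall(r"[\^.]", "a^b.")

print(unescaped, escaped, negated, literal_caret)
```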

  • @Hamsters_Rage
    @Hamsters_Rage 16 days ago

    29:26 - he starts writing some code

  • @sarveshpadav2881
    @sarveshpadav2881 16 days ago

    24:27 We can use groupby to count the animals in the following way... df.groupby('animal')['animal'].count()
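The groupby count from this comment can be set next to `value_counts()`, which is another common way to count by category. This is a sketch on a made-up `animal` column, not the exact puzzle data.

```python
import pandas as pd

# Hypothetical sample of the 'animal' column; values are illustrative.
df = pd.DataFrame({"animal": ["cat", "cat", "snake", "dog", "dog", "cat"]})

# Approach from the comment: group by animal and count non-null entries.
counts_groupby = df.groupby("animal")["animal"].count()

# Alternative: value_counts() returns the same counts, sorted by frequency.
counts_value_counts = df["animal"].value_counts()

print(counts_groupby)
print(counts_value_counts)
```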

  • @senthilkumarradhakrishnan744
    @senthilkumarradhakrishnan744 17 days ago

    Are check and cheque meant to be similar?

  • @DataScienceMAHAMAT
    @DataScienceMAHAMAT 17 days ago

    This is the most practical Python tutorial video I've ever watched. Thanks for sharing!

  • @APP-ld6jf
    @APP-ld6jf 17 days ago

    Looking forward to numpy puzzles now!!

  • @michaelgross1248
    @michaelgross1248 17 days ago

    Loved the video, inspired to try, but no matter what I do I cannot get files to upload.

  • @hata3128
    @hata3128 18 days ago

    Thank you for sharing this! I hope you are feeling better these days!

  • @StalkedHuman
    @StalkedHuman 18 days ago

    What do you think is a good method to concatenate a string value from one DataFrame column to another DataFrame's column by index key? For example, df_1 rows 10, 20, 26, 30, 40 column 5 (string) concatenated to df_2 rows 9, 19, 25, 29, 39 column 1?

  • @aishwaryapattnaik3082
    @aishwaryapattnaik3082 18 days ago

    Such a great tutorial Keith. Please keep uploading such high quality videos on Pandas and many more

  • @ryanmugo4206
    @ryanmugo4206 18 days ago

    i would give this guy a 10/10...truly understood everything

  • @Kira-vs4np
    @Kira-vs4np 18 days ago

    just a note, at 1:19:21 the format = "mixed" isn't really working for me, and it fills the date_born column with NaT values. So, I tried format = "%d %B %Y" and it works
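The explicit format string from this comment can be tried on a couple of sample values. The strings below are assumptions about what the `date_born` column looks like, inferred from the `"%d %B %Y"` pattern (day, full English month name, four-digit year); the real column is not shown here.

```python
import pandas as pd

# Hypothetical date strings in the day / full month name / year layout.
date_born = pd.Series(["12 March 1990", "3 July 1985"])

# An explicit format tells pandas exactly how to parse each string,
# instead of inferring a format per value.
parsed = pd.to_datetime(date_born, format="%d %B %Y")

print(parsed)
```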

  • @willgordon5737
    @willgordon5737 19 days ago

    06:24 when that burrito you eat fights back

  • @chenjackson6001
    @chenjackson6001 19 days ago

    Thank you for your hard work.

  • @Soiboyyy
    @Soiboyyy 20 days ago

    Bruh I’m here cuz I was playing this girl im dating and she clapped me like 10 times in a row. 😅 it was annoying cuz she never gave any tips, she literally just laughed the whole time.. I was literally getting annoyed cuz obviously I suck and she is just having a blast laughing at my misery lol.. And she always went first and in the middle. 🙄

  • @AnasM24
    @AnasM24 20 days ago

    Thank you man