Description
DATA WRANGLING PRACTICE PROJECT WITH PANDAS
INSTRUCTOR
Anastasia Migunova is a data scientist with extensive experience at large consulting firms, among other organizations. Currently based in Germany, she holds a Ph.D. in Applied Mathematics and an M.A. in Computer Science.
PROJECT DESCRIPTION
The aim of this project is to replicate the tasks that data analysts and data scientists routinely perform to clean and analyze a dataset. The project is broken down into multiple exercises so that you practice the most commonly used pandas functions and tools. It also includes a visualization and data analysis part, as well as some general Python coding.
Once you finish it, you will have the basic knowledge of Python and its data libraries needed to work with real-world data.
DOWNLOAD / CONTENT
You will receive an email with a password-protected ZIP file and the password to open it. If you are a registered user, the download is always available in your account.
The downloadable ZIP contains:
1) One PDF with the instructions and guidelines, including the project broken down into assignments so that you can finish the challenge step by step.
2) A link to the datasets you will use in the project. The datasets are public data.
3) A Notebook file with the project solved. It contains not only the source code but also detailed explanations and comments about how the code works.
IMPORTANT: to open the solutions (Notebook) you need Jupyter or the Anaconda distribution installed on your machine. Both are free to download.
WHAT YOU WILL PRACTICE
– Libraries: you will install and work with the following Python libraries: pandas, numpy, matplotlib and seaborn, plus the datetime module from the Python standard library.
The project includes more than 40 assignments related to:
– Import/export of gzip, CSV and Excel files.
– Remove, select, rename and filter columns and rows.
– Nulls.
– Data types.
– Groupby.
– Filters.
– Outliers and duplicates.
– Convert long to wide format.
– Basic Regex.
– Date formats.
– Data engineering.
– Numpy.
– One hot encoding.
– Merge and joins.
– Loops (for).
– Visualization: boxplots, bar charts, count plots, scatter plots, heatmaps, pair plots, etc.
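To give a feel for the kind of work these topics involve, here is a minimal sketch that touches a few of them (duplicates, nulls, groupby, long to wide, merge and one hot encoding). It is not taken from the project: the two small tables, their column names and their values are invented purely for illustration, and the actual assignments use the public datasets linked in the download.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format sales records (invented for this sketch):
# the last row is an exact duplicate and one revenue value is missing.
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B", "B"],
    "month": ["Jan", "Feb", "Jan", "Feb", "Feb"],
    "revenue": [100.0, np.nan, 80.0, 120.0, 120.0],
})
# Hypothetical lookup table with one row per store.
stores = pd.DataFrame({"store": ["A", "B"], "region": ["north", "south"]})

# Duplicates: drop exact repeated rows and rebuild a clean index.
clean = sales.drop_duplicates().reset_index(drop=True)

# Nulls: fill the missing revenue with the mean revenue of the same store.
clean["revenue"] = clean["revenue"].fillna(
    clean.groupby("store")["revenue"].transform("mean")
)

# Groupby: total revenue per store.
totals = clean.groupby("store")["revenue"].sum()

# Long to wide: one row per store, one column per month.
wide = clean.pivot(index="store", columns="month", values="revenue")

# Merge: attach each store's region with a left join on the key column.
merged = clean.merge(stores, on="store", how="left")

# One hot encoding: one indicator column per region value.
encoded = pd.get_dummies(merged, columns=["region"])

print(totals)
print(wide)
print(encoded.columns.tolist())
```

Filling nulls with a per-store mean is only one possible choice; dropping the rows or using a median are equally common, and the right option depends on the dataset.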
VERSION
Python: 3.7
Pandas: 0.24.2
Numpy: 1.16.4
CONTACT
If you need additional information, do not hesitate to contact us.
Additional information
Specification: Data Wrangling Assignments with pandas
Reviews (2)
2 reviews for Data Wrangling Assignments with pandas
nick_boy (verified owner) –
Good documentation and easy to follow.
More than 40 problems from easy functions to more complex stuff.
I would like less visualization work and a bit more about merge and joins.
Akir Van (verified owner) –
If you've done your online courses, this is your homework. It's focused on the pandas library and on visualization with matplotlib and seaborn.
It's challenging for a new pandas user, but I would still recommend it. There are plenty of exercises, so you really dive into the real use of the most important functions and methods. Not only do you learn about pandas, but you also practice the data cleaning all data scientists talk about.
The code is clean and easy to follow.