Note
This is the English 🇬🇧🇺🇸 version of the README. A French 🇫🇷 version is also available.
This GitHub repository stores the source files used to build the site https://pythonds.linogaliana.fr/.
It contains the entire Python for Data Science course that I teach in the second year (Master 1) at ENSAE.
The syllabus is available on the ENSAE website and on the course website.
Overall, the course is comprehensive enough to serve both beginners in data science and readers looking for more advanced material:
- Data Manipulation: standard data manipulation (Pandas), geographical data (Geopandas), data retrieval (web scraping, API)...
- Data Visualization: classic visualizations (Matplotlib, Seaborn), cartography, interactive visualizations (Plotly, Folium)
- Modeling: machine learning (Scikit), econometrics
- Text Data Processing (NLP): introduction to tokenization with NLTK and SpaCy, modeling...
- Introduction to Modern Data Science: cloud computing, ElasticSearch, continuous integration...
The content of this site is based on open data, either French (mainly from the central platform data.gouv or the Insee website) or American.
A good complement to the website's content is the course I teach with Romain Avouac (@avouacr) in the final year at ENSAE, which focuses more on putting data science projects into production: https://ensae-reproductibilite.github.io/website/