
aetperf/LeTourDataSet


LeTourDataSet

A data set on riders in the Tour de France.

TL;DR

If you use pandas, just get the data via:

import pandas as pd 
df = pd.read_csv("https://raw.githubusercontent.com/camminady/LeTourDataSet/master/data/TDF_Riders_History.csv")

Disclaimer

For known issues with this data set, see the Issues tab. Some entries are incorrect. So far, the errors appear to stem from wrong data on the letour.fr website itself. In hindsight, I should probably have scraped a different website.
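Since some entries are known to be wrong, a quick structural check after loading can be useful. The following is a minimal sketch, not part of the repository: the helper name `quality_summary` is hypothetical, and it only reports duplicate rows and missing values. It cannot detect values that were already wrong on the source website.

```python
import pandas as pd


def quality_summary(df: pd.DataFrame) -> dict:
    """Summarize basic structural problems in a DataFrame.

    Hypothetical helper for illustration; it makes no assumptions
    about the column names of the data set.
    """
    return {
        "rows": len(df),
        # Rows that are exact copies of an earlier row.
        "duplicate_rows": int(df.duplicated().sum()),
        # Number of missing (NaN/None) values in each column.
        "missing_per_column": df.isna().sum().to_dict(),
    }
```

Passing the DataFrame from the TL;DR snippet to `quality_summary` gives a first impression of how clean the scraped data is.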

Data

Every cyclist of the Tour de France is listed in a single CSV file, data/TDF_Riders_History.csv. Data on every single stage of the Tour de France is stored in data/TDF_Stages_History.csv.
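Both files can be loaded the same way. The sketch below reuses the base URL from the TL;DR snippet. The URL for the stages file is an assumption derived from the path data/TDF_Stages_History.csv, and the helper names `data_url` and `load_tables` are hypothetical.

```python
import pandas as pd

# Base URL taken from the TL;DR snippet. That the stages file lives
# under the same base URL is an assumption, not confirmed by the README.
BASE_URL = "https://raw.githubusercontent.com/camminady/LeTourDataSet/master/data/"


def data_url(filename: str) -> str:
    """Return the raw GitHub URL for a file in the data/ directory."""
    return BASE_URL + filename


def load_tables() -> dict[str, pd.DataFrame]:
    """Download both CSV files (requires network access)."""
    return {
        "riders": pd.read_csv(data_url("TDF_Riders_History.csv")),
        "stages": pd.read_csv(data_url("TDF_Stages_History.csv")),
    }
```

After calling `load_tables()`, inspecting each frame's `shape` and `columns` shows what is available. No column names are assumed here.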

How to run

To regenerate the data/TDF_Riders_History.csv and data/TDF_Stages_History.csv files, execute src/main.py. This might take a couple of minutes.

Analysis

The script src/analysis.py contains some basic analysis and visualizations of the data. For example, the distance and the winner's average pace are shown below.

Distance and winner average pace

Legacy code

This code has been completely rewritten. The previous code, including its output, is in the legacy repository. In particular, read legacy/README.txt.

About

Scraping the Tour de France history website
