ErgastF1

Ergast.com is a web service that provides a database of Formula 1 races from the 1950 season onward. This dataset covers all Formula 1 races from 1950 to 2017 and includes information such as the time taken in each lap, the time taken for pit stops, and the performance in the qualifying rounds. The task is to predict the winner (or tied winners) of a race using only the data available up to the start of the race (e.g., the list of race entrants and their qualifying times are known, but their lap times in the race are not).

Original source: ergast.com

Versions

  • ErgastF1 (by Jan Motl)

Dataset details

Associated task:
Classification
Domain:
Sport
Data types:
Size:
60.4 MB
Count of tables:
14
Count of rows:
544,056
Count of columns:
98
Missing values:
Yes
Compound keys:
No
Loops:
Yes
Type:
Real
Instance count:
31,313
Target table:
target
Target column:
win
Target ID:
targetId
Target timestamp:
raceId
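
The details above name raceId as the target timestamp, so an evaluation that respects the "data available up to the start of the race" constraint would presumably keep training races and test races apart. Below is a minimal sketch in Python, assuming the rows of the target table are already loaded as dictionaries and that raceId orders races chronologically (the page does not confirm this). The function name split_by_race and the sample rows are hypothetical and serve only as illustration.

```python
def split_by_race(rows, cutoff_race_id):
    """Split target rows into (train, test) by raceId.

    Assumption: a smaller raceId means an earlier race. Rows with
    raceId below the cutoff go to train, all others go to test.
    """
    train = [r for r in rows if r["raceId"] < cutoff_race_id]
    test = [r for r in rows if r["raceId"] >= cutoff_race_id]
    return train, test


# Hypothetical rows using the column names listed above (not real data).
sample_rows = [
    {"targetId": 1, "raceId": 10, "win": 1},
    {"targetId": 2, "raceId": 10, "win": 0},
    {"targetId": 3, "raceId": 11, "win": 0},
    {"targetId": 4, "raceId": 12, "win": 1},
]

train_rows, test_rows = split_by_race(sample_rows, cutoff_race_id=12)
```

Splitting on raceId rather than on individual rows keeps all entrants of one race on the same side of the split, which seems the safer choice when the label is defined per race.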

Algorithms

Dataset version | Target | Algorithm | Author text | Measure | Value
ErgastF1 | win | FastProp | getML: Feature Learning with AutoML to build end-to-end prediction pipelines | ROC AUC | 0.9242
ErgastF1 | win | Deep Feature Synthesis | featuretools | ROC AUC | 0.9202
ErgastF1 | win | FastProp | getML: Feature Learning with AutoML to build end-to-end prediction pipelines | Accuracy | 0.9727
ErgastF1 | win | Deep Feature Synthesis | featuretools | Accuracy | 0.9724

How to download the dataset

The datasets are publicly available directly from a MariaDB database.

  1. Open your favourite MariaDB client (MySQL Workbench works, but see FAQ)
  2. Use the following credentials:
    • hostname: db.relational-data.org
    • port: 3306
    • username: guest
    • password: relational
  3. Export the "ErgastF1" database (or another version of the dataset, if available) in your favourite format (e.g. CSV or SQL dump).
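
The steps above can also be scripted. The following is a minimal sketch in Python, assuming the third-party PyMySQL package is installed and the server accepts the guest credentials listed above. The helper names export_table and export_database and the output directory ergastf1_csv are hypothetical choices, not part of the original instructions.

```python
import csv
from pathlib import Path

# Connection details copied from the steps above.
DB_CONFIG = {
    "host": "db.relational-data.org",
    "port": 3306,
    "user": "guest",
    "password": "relational",
}


def export_table(conn, table, out_dir):
    """Write all rows of one table to <out_dir>/<table>.csv; return the path.

    Works with any DB-API connection. NULL values end up as empty fields.
    """
    out_path = Path(out_dir) / f"{table}.csv"
    cursor = conn.cursor()
    try:
        # Backticks quote the table name in MariaDB/MySQL syntax.
        cursor.execute(f"SELECT * FROM `{table}`")
        header = [col[0] for col in cursor.description]
        with open(out_path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(header)
            writer.writerows(cursor.fetchall())
    finally:
        cursor.close()
    return out_path


def export_database(database="ErgastF1", out_dir="ergastf1_csv"):
    """Export every table of the given database to CSV files."""
    import pymysql  # assumption: installed, e.g. via `pip install pymysql`

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    conn = pymysql.connect(database=database, **DB_CONFIG)
    try:
        cursor = conn.cursor()
        cursor.execute("SHOW TABLES")
        tables = [row[0] for row in cursor.fetchall()]
        cursor.close()
        return [export_table(conn, name, out_dir) for name in tables]
    finally:
        conn.close()
```

Calling export_database() should then produce one CSV file per table. Each table is fetched into memory before it is written, which should be acceptable for a database of this size, but a chunked fetch may be preferable on constrained machines.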