Dev Genius

Coding, Tutorials, News, UX, UI and much more related to development

Getting started with F1 statistics and Python

12 min read · Aug 23, 2022

Data preparation in Python for the analysis of F1 statistics with the Ergast dataset.

Photo by author

This tutorial describes how to use historic Formula One data for analysis. It covers obtaining the data, cleaning it, and two first analyses made with it (more will follow!). The main focus of this article is preparing the data set for analysis. It may feel like the dirty work, but good data preparation easily pays for itself.

The data is retrieved from the Ergast Developer API. This is an API providing historical data on F1 races, starting in 1950, though not all data is complete. Data is available up to the current season, containing all planned races and results for all completed races.
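As a rough illustration of what retrieving this data can look like, here is a minimal sketch using only the Python standard library. The base URL, the `limit`/`offset` paging parameters and the `MRData` → `RaceTable` → `Races` → `Results` layout are my assumptions about the Ergast JSON responses, and the service may no longer be reachable at this address. The helper names (`build_url`, `fetch`, `flatten_results`) are hypothetical, and the sample payload is hand-written with placeholder IDs, not real race data.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the Ergast API; adjust it if the service has moved.
BASE_URL = "http://ergast.com/api/f1"


def build_url(path, limit=30, offset=0):
    """Build a JSON endpoint URL, e.g. .../2021/results.json?limit=30&offset=0."""
    query = urlencode({"limit": limit, "offset": offset})
    return f"{BASE_URL}/{path.strip('/')}.json?{query}"


def fetch(path, limit=30, offset=0):
    """Download one page and return its 'MRData' envelope (needs network access)."""
    with urlopen(build_url(path, limit, offset), timeout=30) as response:
        return json.load(response)["MRData"]


def flatten_results(mrdata):
    """Turn the nested race/result structure into one flat dict per result."""
    rows = []
    for race in mrdata["RaceTable"]["Races"]:
        for result in race.get("Results", []):
            rows.append({
                "season": int(race["season"]),
                "round": int(race["round"]),
                "driverId": result["Driver"]["driverId"],
                "constructorId": result["Constructor"]["constructorId"],
                "position": int(result["position"]),
                "points": float(result["points"]),
            })
    return rows


# Hand-written sample in the assumed response shape (placeholder IDs, not real data).
SAMPLE = {
    "RaceTable": {
        "Races": [{
            "season": "2021",
            "round": "1",
            "Results": [
                {"position": "1", "points": "25",
                 "Driver": {"driverId": "driver_a"},
                 "Constructor": {"constructorId": "team_x"}},
                {"position": "2", "points": "18",
                 "Driver": {"driverId": "driver_b"},
                 "Constructor": {"constructorId": "team_y"}},
            ],
        }]
    }
}

rows = flatten_results(SAMPLE)
print(rows[0])
```

With network access, `flatten_results(fetch("2021/results", limit=100))` would be the equivalent call against the live service. Note that the values arrive as strings, which is why the sketch converts them to numbers while flattening.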

The available data contains the following tables:

  • Drivers — Information on all current and previous drivers
  • Constructors — Information on all current and previous constructors
  • Race results — Results of all races, for both constructors and drivers
  • Qualifying results — Results of all qualifying sessions, including the separate Q1, Q2 and Q3 sessions
  • Lap times — Lap times of all completed laps by all drivers in all events
  • Pit stops — All pit stops made, with the time of each stop and its duration (pit in to pit out)
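Lap times and pit stop durations from the list above are a typical preparation step: to calculate with them, they first have to become numbers. The sketch below assumes they arrive as text such as `1:34.567` (minutes, seconds, milliseconds) or `23.456` (seconds only). That format is an assumption about the data, and `to_seconds` is a hypothetical helper name.

```python
def to_seconds(text):
    """Convert 'm:ss.fff' or 'ss.fff' (also 'h:mm:ss.fff') text to seconds.

    Returns None for missing or empty values so that gaps in the data
    stay visible instead of silently turning into zeros.
    """
    if text is None or text.strip() == "":
        return None
    seconds = 0.0
    # Each ':' moves one unit up (seconds -> minutes -> hours), so multiply by 60.
    for part in text.strip().split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


# Illustrative values only, not taken from the dataset.
lap = to_seconds("1:34.567")
pit = to_seconds("23.456")
print(round(lap, 3), round(pit, 3))
```

Keeping durations as plain floats in seconds makes later steps such as averaging lap times or comparing pit stops straightforward.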


Published in Dev Genius

Written by Leo van der Meulen

Dutch open data and public transportation enthusiast. Working in public transport for over 15 years. LinkedIn: https://www.linkedin.com/in/leovandermeulen/