Football (soccer) data analysis: A pedagogic introduction

Fri September 10, 06:00 PM–06:30 PM
Session Type Pre-Recorded
Start time 18:00
End time 18:30

Organisers' note: This talk was a backup presentation that was not streamed during the event but is being made available afterwards.

Data is now central to solving problems in most fields, from astronomy and applied mathematics to health care and sports. It pays to use these data to build models, make sense of them, and tell stories. Football (soccer) in particular benefits from data analysis, which helps us understand the game better and predict outcomes. Many people want to get hands-on with football data analysis. This talk aims to help them by introducing, step by step, the basic analytical methods needed to get past the initial barriers of the field and start producing analyses and visualizations of football data that lead to interesting results.

  1. I will start by showing how to get open-access football event data from the StatsBomb API using Python [3 min],

  2. Next, I will show how to draw a football pitch with the mplsoccer Python module, so that we can build most of our football data visualizations on top of this pitch [3 min],

  3. I will then cover simple data visualizations such as shot maps, pass maps, and their corresponding heat maps [7 min],

  4. Next, I will teach how to visualize the passing network of a particular team during a particular game on the pitch. We will then analyze this pass network with the NetworkX Python module, which is commonly used for complex network analysis in mathematics. We will learn how to calculate the pass degree distribution of each player, find out which player was the most central in the pass network by calculating the "centrality" of each player node, and so on [8 min],

  5. After that, I will teach how to apply computational geometry concepts such as convex hulls, Voronoi diagrams, and Delaunay triangulations, using the Python packages scipy.spatial and mplsoccer on open-access football tracking data, so that we can analyze how many passes were available to a player at a particular instant of a game, how a group of players broke down space on the pitch at a particular instant, etc. [7 min], and

  6. I will end by pointing the audience to the references I used when starting out with football (soccer) data analysis [2 min].
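
As a rough illustration of steps 4 and 5, the sketch below builds a small pass network with NetworkX and computes a convex hull, a Delaunay triangulation, and a Voronoi diagram with scipy.spatial. It is not the talk's own code: the pass list, the player labels, and the coordinates are made-up toy values, and degree centrality is only one of several centrality measures that could be used. Real input would come from event or tracking data.

```python
from collections import Counter

import networkx as nx
import numpy as np
from scipy.spatial import ConvexHull, Delaunay, Voronoi

# Hypothetical completed passes as (passer, receiver) pairs.
# Real pairs would be extracted from football event data.
passes = [
    ("GK", "CB"), ("CB", "CM"), ("CM", "ST"), ("CM", "LW"),
    ("LW", "ST"), ("CB", "CM"), ("CM", "CB"), ("ST", "CM"),
]

# Directed, weighted pass network: edge weight = number of passes.
G = nx.DiGraph()
for (passer, receiver), count in Counter(passes).items():
    G.add_edge(passer, receiver, weight=count)

# Weighted out-degree = passes made, weighted in-degree = passes received.
passes_made = dict(G.out_degree(weight="weight"))
passes_received = dict(G.in_degree(weight="weight"))

# Degree centrality (an assumed choice; betweenness or eigenvector
# centrality are alternatives) and the most central player node.
centrality = nx.degree_centrality(G)
most_central = max(centrality, key=centrality.get)

# Hypothetical (x, y) positions of six players at one instant.
# Real positions would come from football tracking data.
positions = np.array([
    [10.0, 30.0], [25.0, 10.0], [25.0, 50.0],
    [40.0, 30.0], [55.0, 15.0], [55.0, 45.0],
])

hull = ConvexHull(positions)  # in 2-D, hull.volume is the enclosed area
tri = Delaunay(positions)     # tri.simplices lists triangles as point indices
vor = Voronoi(positions)      # vor.point_region maps each player to a cell

print("most central player:", most_central)
print("area covered by the group:", hull.volume)
print("number of Delaunay triangles:", len(tri.simplices))
```

The hull area gives a simple measure of how much space a group of players occupies, while the Delaunay edges can be read as candidate passing lanes between neighbouring players. Drawing any of these on a pitch would be a separate plotting step.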

Indranil Ghosh He/him

I am a first-year Ph.D. student in applied mathematics at the School of Fundamental Sciences, Massey University. My research is on dynamical systems and robust chaos. I have a master's degree in physics from Jadavpur University. My main interests include dynamical systems, computational mathematics, optimization, and soccer data analysis. I am fascinated by open source software development and write code mostly in Python and R, and sometimes in Fortran.