Exploring data using Python (Seaborn and Matplotlib)

Introduction

In this blog post, I explore the NFL's Next Gen Stats tracking data for running plays.

The code for the “NFL plays” analysis is public on my GitHub here.

Data Set

For this blog, I have used the Kaggle data set — NFL Big Data Bowl.

The dataset contains 65k rows and 48 columns of NFL data. Each row in the file corresponds to a single player’s involvement in a single play. The dataset was intentionally joined (i.e. denormalized) to make the API simple. All the columns are contained in one large data frame which is grouped and provided by .

Data Visualization

Relationship between week of the season and speed (yards/second), split by ‘home’ and ‘away’

Speed is fairly constant across the season. It peaks around week 5 for away games and around week 12 for home games.

Are the acceleration and speed constant throughout the seasons?

The highest acceleration and speed were recorded in the 2018 season. Otherwise, both variables are about the same across seasons.
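One way to check a season-level comparison like this is to average both variables per season and plot them side by side. This is a rough sketch on made-up numbers; the column names Season, S, and A are assumptions, and I use the mean here although the maximum would be an equally valid reading of "highest".

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import numpy as np
import pandas as pd

# Synthetic stand-in data. Assumed columns: Season, S (speed), A (acceleration).
rng = np.random.default_rng(1)
n = 3000
df = pd.DataFrame({
    "Season": rng.choice([2017, 2018, 2019], size=n),
    "S": rng.normal(2.5, 1.0, size=n).clip(min=0),
    "A": rng.normal(1.7, 0.9, size=n).clip(min=0),
})

# One row per season, one column per variable.
season_mean = df.groupby("Season")[["S", "A"]].mean()

# Grouped bars: two bars (S and A) for each season.
ax = season_mean.plot(kind="bar", rot=0, figsize=(6, 4))
ax.set_ylabel("Mean value")
ax.legend(["Speed (S)", "Acceleration (A)"])

# Season with the highest mean for each variable.
top_season = season_mean.idxmax()
```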

How are player weights distributed?

Player weights range from 160 lbs to 350 lbs. They cluster around 175, 225, and 325 lbs for light, medium, and heavy builds respectively.

Temperature variations for various seasons

The temperature was slightly higher in the 2019 season; it was approximately constant across the other two seasons.

Is the performance any better during home games?

Performance is approximately the same whether the team plays at home or away.
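"Performance" can be measured in several ways. As one possible sketch, the snippet below uses yards gained per play and labels each play by whether the team in possession is the home team. Everything here is an assumption for illustration: the data is synthetic, and the columns PlayId, PossessionTeam, HomeTeamAbbr, VisitorTeamAbbr, and Yards are names I expect rather than ones confirmed above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic play-level table (one row per play). With the real data this
# would first need something like df.drop_duplicates("PlayId").
rng = np.random.default_rng(4)
n_plays = 1500
teams = np.array(["KC", "NE", "DAL", "SEA"])
home_idx = rng.integers(0, 4, n_plays)
visitor_idx = (home_idx + rng.integers(1, 4, n_plays)) % 4  # never the home team
home, visitor = teams[home_idx], teams[visitor_idx]

plays = pd.DataFrame({
    "PlayId": np.arange(n_plays),
    "HomeTeamAbbr": home,
    "VisitorTeamAbbr": visitor,
    "PossessionTeam": np.where(rng.random(n_plays) < 0.5, home, visitor),
    "Yards": rng.normal(4.2, 6.0, n_plays).round(),
})

# Label each play by where the team with the ball is playing.
plays["Offense"] = np.where(
    plays["PossessionTeam"] == plays["HomeTeamAbbr"], "home", "away"
)

# Numeric summary plus a box plot of the two distributions.
summary = plays.groupby("Offense")["Yards"].agg(["mean", "median", "count"])
fig, ax = plt.subplots(figsize=(5, 4))
sns.boxplot(data=plays, x="Offense", y="Yards", ax=ax)
ax.set_ylabel("Yards gained per play")
```

If the two boxes overlap almost completely and the means in summary are close, that supports the "approximately the same" reading.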

Relationship between Humidity and Temperature

This wonderful two-dimensional KDE plot shows the joint distribution of humidity and temperature during the playoff season.

Yards vs Week into the season

Yards gained increased around week 15 of the season. The highest yards came in quarters 2, 3, and 4, and the lowest in quarter 5 (overtime).
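Week and quarter can be looked at together by pivoting the plays into a quarter-by-week grid of mean yards and drawing it as a heatmap. The sketch below is an assumption-laden illustration: the data is synthetic, the column names Week, Quarter, and Yards are guesses, and "highest" is read as the highest mean.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic play-level data; quarter 5 is deliberately rare.
rng = np.random.default_rng(5)
n = 5000
plays = pd.DataFrame({
    "Week": rng.integers(1, 18, size=n),
    "Quarter": rng.choice([1, 2, 3, 4, 5], size=n,
                          p=[0.24, 0.25, 0.24, 0.25, 0.02]),
    "Yards": rng.normal(4.2, 6.0, size=n).round(),
})

# Marginal views: mean yards per week and per quarter.
by_week = plays.groupby("Week")["Yards"].mean()
by_quarter = plays.groupby("Quarter")["Yards"].mean()
best_week = by_week.idxmax()

# Joint view: rows are quarters, columns are weeks, cells are mean yards.
grid = plays.pivot_table(index="Quarter", columns="Week",
                         values="Yards", aggfunc="mean")

fig, ax = plt.subplots(figsize=(10, 3))
sns.heatmap(grid, cmap="viridis", cbar_kws={"label": "Mean yards"}, ax=ax)
```

Cells built from very few plays, which is likely for quarter 5, are noisy, so it helps to check the play counts before reading much into them.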

Yardline range

The yardline ranges from 0 to 50, with most values around 28.

Conclusions

  • The highest acceleration and speed were recorded in the 2018 season. Otherwise, both variables are about the same across seasons.
  • The temperature was slightly higher in the 2019 season.
  • Player weights range from 160 lbs to 350 lbs. They cluster around 175, 225, and 325 lbs for light, medium, and heavy builds respectively.
  • The yardline ranges from 0 to 50, with most values around 28.
  • Performance is approximately the same whether the team plays at home or away.

If you have any questions or comments, or need any clarification, please don’t hesitate to contact me at aditimukerjee33@gmail.com or reach me at 403–671–7296. If you are interested in collaborating on a project, feel free to reach out.

If you enjoyed this story, please click the 👏 button and share to help others find it! Feel free to leave a comment below.

Engineer. Data Analyst. Machine Learning enthusiast