NFL & Python: Pull NFL Next Gen Stats
Introduction and Overview
In this section, Tim Bryan introduces himself and explains that he will demonstrate how to pull NFL Next Gen Stats using the nfl_data_py package. He mentions that these stats provide insight into player performance beyond what a simple box score shows.
Introduction to Next Gen Stats
- Tim Bryan introduces himself as a creator of videos on sports analytics and coding.
- He explains that NFL Next Gen Stats are displayed during football broadcasts and are sponsored by AWS and Amazon.
- The nfl_data_py package allows users to access and analyze this data.
- Next Gen Stats cover metrics such as separation, speed, and completion percentage above expectation.
Dependencies for the Project
In this section, Tim Bryan discusses the dependencies required for the project.
Dependencies
- The project requires the following dependencies:
- pandas: a standard data science library.
- nfl_data_py: used to pull NFL Next Gen Stats.
- matplotlib: used for visualizing the data.
- style library from matplotlib: used for styling the visualizations.
- Tim mentions that he sets the pandas display.max_columns option to None, which removes the column display limit so that every column is visible when the data is printed.
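The setup described above can be sketched roughly as follows. The style sheet name is an assumption, since the video summary does not say which one is used, and the Agg backend line is only there so the sketch runs without a display. The nfl_data_py package itself is installed separately (for example with pip).

```python
import pandas as pd
import matplotlib

matplotlib.use("Agg")  # optional: headless backend so this runs without a display
import matplotlib.pyplot as plt
from matplotlib import style

# Remove the column display limit so wide DataFrames print every column.
pd.set_option("display.max_columns", None)

# Assumption: the summary does not name a style sheet; "ggplot" ships with matplotlib.
style.use("ggplot")
```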
Importing Data and Setting Parameters
In this section, Tim Bryan explains how to import Next Gen Stats data using the nfl_data_py package and sets the parameters for the analysis.
Importing Data
- To import Next Gen Stats data, Tim uses the import_ngs_data function from the nfl_data_py package.
- The function takes a parameter called stat_type, which can be set to "passing", "receiving", or "rushing" depending on the desired data.
- Tim sets the stat_type parameter to "passing" for this demonstration.
Setting Parameters
- Tim defines the year as 2022, indicating that he will be working with data from that year.
- He mentions that the next gen stats data is available starting from 2016.
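A minimal sketch of the import step, assuming nfl_data_py is installed and a network connection is available. The wrapper name load_ngs and the two validation checks are illustrative additions, not part of the package; only import_ngs_data with its stat_type and years arguments comes from nfl_data_py.

```python
VALID_STAT_TYPES = ("passing", "receiving", "rushing")
YEAR = 2022              # season used in the walkthrough
FIRST_NGS_SEASON = 2016  # earliest season mentioned in the video


def load_ngs(stat_type="passing", years=(YEAR,)):
    """Hypothetical helper: validate inputs, then fetch Next Gen Stats."""
    if stat_type not in VALID_STAT_TYPES:
        raise ValueError(f"stat_type must be one of {VALID_STAT_TYPES}")
    if any(y < FIRST_NGS_SEASON for y in years):
        raise ValueError(f"Next Gen Stats start in {FIRST_NGS_SEASON}")
    # Imported lazily so the checks above can run without the package installed.
    import nfl_data_py as nfl
    return nfl.import_ngs_data(stat_type=stat_type, years=list(years))
```

Calling load_ngs("passing", [2022]) would then return a pandas DataFrame of passing stats for the 2022 season.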
Exploring the Data
In this section, Tim Bryan explores the next gen stats data and highlights some interesting metrics.
Data Exploration
- Tim mentions that the first season available in the data is 2016.
- He points out metrics such as average time to throw, completed air yards, intended air yards, and aggressiveness.
- Tim notes that there are many more metrics available for receiving and rushing data.
- He explains that rows where the week equals zero hold full-season totals, so filtering down to week zero gives full season data for a specific year.
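The week-zero filter can be illustrated on a small made-up table. The column names (season, week, player_display_name, avg_time_to_throw) are assumed to match the passing data, and the values are invented for the example.

```python
import pandas as pd

# Hypothetical rows mimicking the passing table; the numbers are made up.
df = pd.DataFrame({
    "season": [2022, 2022, 2022],
    "week": [0, 1, 2],
    "player_display_name": ["QB A", "QB A", "QB A"],
    "avg_time_to_throw": [2.80, 2.65, 2.95],
})

# Earliest season present in the table.
first_season = df["season"].min()

# Week 0 rows are the full-season aggregates; other weeks are single games.
season_totals = df[df["week"] == 0]
print(season_totals)
```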
Metrics of Interest
In this section, Tim Bryan discusses two specific metrics of interest: average time to throw and completion percentage above expectation.
Metrics of Interest
- Average time to throw reflects how much help a quarterback gets from his offensive line, or how well he extends plays with his mobility outside the pocket.
- In Tim's reading, completion percentage above expectation indicates how well receivers are helping quarterbacks by catching difficult passes.
- These metrics provide insights into player performance beyond traditional statistics.
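For reference, completion percentage above expectation is conventionally the actual completion percentage minus the expected completion percentage. The short sketch below shows that arithmetic; the function name is hypothetical and both inputs are assumed to be in percentage points.

```python
def completion_pct_above_expectation(completion_pct, expected_completion_pct):
    """Actual minus expected completion percentage, in percentage points."""
    return completion_pct - expected_completion_pct


# A quarterback completing 66.0% of passes when 63.5% was expected.
print(completion_pct_above_expectation(66.0, 63.5))
```

A positive value means more passes were completed than the model expected, and a negative value means fewer.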
Filtering Data and Visualizing
In this section, Tim Bryan demonstrates how to filter the data and visualize it using matplotlib.
Filtering Data
- To filter the data, Tim keeps only the rows whose season equals his chosen year (2022).
- He resets the index to ensure it starts at zero after filtering.
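The filter and index reset can be sketched as below, again on invented rows with assumed column names. Passing drop=True to reset_index discards the old index rather than keeping it as a column, which is one reasonable choice here.

```python
import pandas as pd

# Hypothetical season-total rows; names and seasons are made up.
df = pd.DataFrame({
    "season": [2021, 2022, 2022],
    "week": [0, 0, 0],
    "player_display_name": ["QB A", "QB B", "QB C"],
})

season = 2022

# Keep the chosen season, then renumber the rows from zero.
df_season = df[df["season"] == season].reset_index(drop=True)
print(df_season)
```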
Visualizing Data
- Tim defines a figure size for the matplotlib visualization.
Visualizing Quarterback Performance
In this section, the speaker discusses how to visualize quarterback performance using average time to throw and completion percentage above expectation.
Appending Data and Centering Axes
- Append the average time to throw for each quarterback to the X data.
- Subtract the overall average time to throw from each quarterback's average time to throw. This centers the axis at zero.
- Repeat the same process for completion percentage above expectation (Y data).
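The centering step can be sketched as follows. The video appends values one at a time; this version uses vectorized pandas subtraction instead, which gives the same result. Column names are assumed and the numbers are invented.

```python
import pandas as pd

# Hypothetical season totals for three quarterbacks.
qbs = pd.DataFrame({
    "player_display_name": ["QB A", "QB B", "QB C"],
    "avg_time_to_throw": [2.5, 2.8, 3.1],
    "completion_percentage_above_expectation": [-2.0, 1.0, 4.0],
})

# Subtracting the column mean puts the league average at zero on each axis.
x_col = "avg_time_to_throw"
y_col = "completion_percentage_above_expectation"
x_data = (qbs[x_col] - qbs[x_col].mean()).tolist()
y_data = (qbs[y_col] - qbs[y_col].mean()).tolist()
```

After this step a positive x value means more time to throw than average, and a positive y value means a higher completion percentage above expectation than average.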
Creating Scatter Plot
- Create a scatter plot using the X and Y data.
- Increase the size of the points on the chart for better visibility.
- Adjust colors for better visualization.
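A minimal scatter plot along these lines might look like the sketch below. The figure size, marker size, and colors are placeholder choices, since the summary does not give the exact values used in the video.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

# Hypothetical centered values (x: time to throw, y: completion % above expectation).
x_data = [-0.3, 0.0, 0.3]
y_data = [-3.0, 0.0, 3.0]

fig, ax = plt.subplots(figsize=(10, 8))  # assumed figure size

# s sets the marker area; larger values make points easier to see.
points = ax.scatter(x_data, y_data, s=120, color="tab:blue", edgecolors="black")
```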
Centering Axes and Setting Up Ticks
- Use the set_position method on the axes' spines, with the position set to zero, to center both the x and y axes at zero.
- Set up ticks for easier counting on the chart.
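One way to center the axes and set ticks in matplotlib is sketched below. The tick values are illustrative, and hiding the top and right spines is an extra assumption made so that only the two centered axes remain.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([-0.3, 0.3], [-3.0, 3.0])

# Move the left and bottom spines so they cross at the origin.
ax.spines["left"].set_position("zero")
ax.spines["bottom"].set_position("zero")

# Hide the other two spines (an assumption; the summary does not mention this).
ax.spines["right"].set_visible(False)
ax.spines["top"].set_visible(False)

# Evenly spaced ticks, skipping zero where the axes cross.
ax.set_xticks([-0.4, -0.2, 0.2, 0.4])
ax.set_yticks([-4, -2, 2, 4])
```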
Annotating Data Points
- Set the X and Y limits based on the range of the data being plotted.
- Annotate each player's name and year next to their respective data point.
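The limits and labels can be sketched like this. The padding amounts and the label offset are arbitrary illustrative values, and the player names are placeholders.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

# Hypothetical centered values and names.
names = ["QB A", "QB B"]
x_data = [-0.3, 0.3]
y_data = [-3.0, 3.0]
year = 2022

fig, ax = plt.subplots()
ax.scatter(x_data, y_data)

# Derive the limits from the data, with a little padding on each side.
pad_x, pad_y = 0.1, 1.0
ax.set_xlim(min(x_data) - pad_x, max(x_data) + pad_x)
ax.set_ylim(min(y_data) - pad_y, max(y_data) + pad_y)

# Label each point with the player's name and season, offset slightly.
labels = [
    ax.annotate(f"{name} {year}", (x, y), xytext=(5, 5), textcoords="offset points")
    for name, x, y in zip(names, x_data, y_data)
]
```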
Interpretation of Scatter Plot
- Most quarterbacks are centered around the average of both metrics.
- Outliers like Daniel Jones and Marcus Mariota have significantly more time to throw than average quarterbacks.
- Quarterbacks in quadrant one have a lot of time to throw, with receivers making above-average catches.
- Quarterbacks in quadrant two have limited time but still achieve better-than-expected completion percentages.
- Quarterbacks in quadrant three have limited time with below-average catches by receivers.
- Quarterbacks in quadrant four have a lot of time to throw but still complete fewer passes than expected, with receivers not making the difficult catches.
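The quadrant reading above can be expressed as a small helper, assuming x is the centered time to throw and y is the centered completion percentage above expectation. The function name is hypothetical, and placing points that fall exactly on an axis into the higher-value side is an arbitrary choice.

```python
def quadrant(x, y):
    """Return the quadrant number (1 to 4) for a centered (x, y) point."""
    if x >= 0 and y >= 0:
        return 1  # more time than average, completions above expectation
    if x < 0 and y >= 0:
        return 2  # less time than average, completions above expectation
    if x < 0 and y < 0:
        return 3  # less time than average, completions below expectation
    return 4      # more time than average, completions below expectation


print(quadrant(0.3, -3.0))
```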
Conclusion
The speaker concludes by summarizing how different quarterbacks perform based on their time to throw and receiver performance. He also mentions future videos that will dive into NFL Next Gen Stats and machine learning.