Friday, July 7, 2023

NBA Shot Clustering Analysis

Article courtesy of Avery Caraway

The game of basketball has changed dramatically over the last few decades. Teams are scoring at a much higher rate than in the past. In the 2005-2006 season, NBA teams averaged 97.0 points per game [1]. By the 2020-2021 season, the league average had risen to 112.1 points per game [2]. One reason for this growth is the increase in 3-point shots taken. In the 2007-2008 season, teams averaged 18.04 3-point attempts per game, but that number rose to 28.98 attempts per game in the 2017-2018 season [3]. Even if shooting percentage remains the same, each made 3-point shot is worth more than a made 2-point shot, so attempting more of them means more points scored and, in part, a better chance to win the game.

Many players in today’s NBA have built their game around the ability to make 3-point shots. Although a team is more than any one player, the growing reliance on the 3-point shot has brought success to these teams. For example, let’s take a look at Stephen Curry of the Golden State Warriors. Now widely regarded as the best 3-point shooter of all time, he did not start off that way in the NBA. In his first season in the league, 2009-2010, Curry attempted 4.8 3-point shots per game and was successful 43.7% of the time. In the 2015-2016 season, Curry led his team to 73 wins, the highest regular-season total of all time. During that season, Curry attempted 11.2 3-point shots per game and was successful 45.4% of the time. Although his 3-point percentage barely improved, the number of shots taken increased by over 6 attempts per game [4]. There are many characteristics of a great team, but teams that take advantage of the 3-point line have been successful in recent years.
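To make the Curry comparison concrete, here is a minimal sketch that converts the attempts and percentages quoted above into expected points per game from 3-point shots alone. The function name is purely illustrative, and the calculation ignores free throws, 2-point shots, and everything else that contributes to scoring.

```python
def three_point_points_per_game(attempts_per_game, make_rate):
    """Expected points per game from 3-point shots alone."""
    return attempts_per_game * make_rate * 3

# Figures quoted above: Curry's first season vs. the 73-win season.
first_season = three_point_points_per_game(4.8, 0.437)
record_season = three_point_points_per_game(11.2, 0.454)

print(f"First season:   {first_season:.1f} points per game from 3-pointers")
print(f"73-win season:  {record_season:.1f} points per game from 3-pointers")
```

Under these numbers, the output from 3-pointers goes from roughly 6.3 to roughly 15.3 points per game, and almost all of that gain comes from volume rather than accuracy.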

To see how much shot locations have changed in the NBA, we can look at the Houston Rockets and San Antonio Spurs from the 2007 and 2017 seasons. In 2007, the Rockets had 7304 shot attempts, and in 2017 that total increased to 8698. The Spurs had 8075 shot attempts in 2007, and in 2017 that total increased to 8742. The pace of the game has increased over those 10 years: compared to their 2007 season statistics, the 2017 Rockets took approximately 17 more shots a game and the 2017 Spurs took approximately 8 more shots a game.
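The per-game figures follow directly from the season totals. The short sketch below reproduces them, assuming a standard 82-game regular season (the game count is an assumption; it is not stated in the text above).

```python
GAMES_PER_SEASON = 82  # assumption: standard regular-season length

# Season shot-attempt totals quoted in the paragraph above.
attempts = {
    "Rockets": {2007: 7304, 2017: 8698},
    "Spurs": {2007: 8075, 2017: 8742},
}

def extra_shots_per_game(team):
    """Increase in shot attempts per game from 2007 to 2017."""
    totals = attempts[team]
    return (totals[2017] - totals[2007]) / GAMES_PER_SEASON

for team in attempts:
    print(f"{team}: {extra_shots_per_game(team):.1f} more shots per game")
```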


Does this change of pace and increased number of attempts have an impact on the locations where shots are taken? Three clustering techniques were used to analyze the shooting data: K-means, Gaussian mixture, and DBSCAN. These methods show differences in the shooting patterns of both teams.

K-means is a distance-based algorithm that clusters points based on closeness. It requires the user to specify the number of clusters in advance. The initial centroid of each cluster is chosen randomly, and the centroids are then adjusted at every iteration until they settle on the centers of the clustered data. However, it cannot create irregularly shaped clusters.
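The assign-then-update loop described above can be sketched in plain Python. This is an illustrative implementation, not the code used for the analysis: the function name, the choice to start from randomly picked data points, and the toy shot coordinates (in feet) are all assumptions.

```python
import math
import random

def kmeans(points, k, iterations=50, seed=0):
    """Basic K-means on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: use k randomly chosen data points as the initial centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append((sum(x for x, _ in members) / len(members),
                                      sum(y for _, y in members) / len(members)))
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # centroids stopped moving
        centroids = new_centroids
    return centroids, labels

# Hypothetical shot locations: three near the rim, three beyond the arc.
shots = [(0.0, 1.0), (1.0, 2.0), (-1.0, 1.5),
         (0.0, 24.0), (2.0, 25.0), (-2.0, 24.5)]
centroids, labels = kmeans(shots, k=2)
print(labels)
```

Because every point is assigned to whichever centroid is closest, the resulting clusters are roughly round regions around each centroid, which is why a long, curved band of shots is hard for this method to isolate.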



Gaussian mixtures are probabilistic models that use a soft clustering approach. The model assumes that the data come from a certain number of Gaussian distributions, each representing one cluster with its own mean and variance. The values of the means and variances are estimated with the Expectation-Maximization (EM) technique. This method requires the user to specify the number of clusters.
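To illustrate how Expectation-Maximization estimates each cluster's mean and variance, here is a deliberately simplified sketch that fits a mixture to one-dimensional data (shot distance from the basket, in feet) rather than to two-dimensional shot locations. The function names, the initialization scheme, and the sample distances are assumptions made for the example.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, k, iterations=100, min_var=1e-3):
    """Fit a k-component 1-D Gaussian mixture with EM.

    Returns (weights, means, variances, responsibilities), where the
    responsibilities come from the final E-step.
    """
    lo, hi = min(data), max(data)
    # Assumption: spread the initial means evenly over the data range.
    means = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    overall_mean = sum(data) / len(data)
    overall_var = sum((x - overall_mean) ** 2 for x in data) / len(data)
    variances = [max(overall_var, min_var)] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iterations):
        # E-step: probability that each point belongs to each component.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)  # floor keeps the variance positive
            weights[j] = nj / len(data)
    return weights, means, variances, resp

# Hypothetical shot distances: close-range attempts and 3-point attempts.
distances = [2.0, 3.0, 4.0, 23.0, 24.0, 25.0]
weights, means, variances, resp = fit_gmm_1d(distances, k=2)
print([round(m, 1) for m in means])
```

The responsibilities are what make the clustering "soft": each shot gets a probability of belonging to every component instead of a single hard label.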


DBSCAN is a density-based clustering algorithm that works on the assumption that clusters are dense regions in space separated by regions of lower density. It is very robust to outliers and can create irregularly shaped clusters. The user does not need to specify the number of clusters for this method. However, two parameters must be provided: the radius of the neighborhood around each data point and the minimum number of points that must fall inside that neighborhood. The number of clusters emerges as the algorithm runs.
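A minimal plain-Python sketch of the density-based idea follows. The parameter names `eps` (neighborhood radius) and `min_samples` (minimum points in a neighborhood, counting the point itself) are labels chosen for this example, and the shot coordinates are hypothetical: an arc of shots 23.75 feet from the basket, a small group near the rim, and one isolated mid-range shot.

```python
import math

NOISE = -1

def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_samples:
                queue.extend(nbrs)  # core point: keep growing the cluster
    return labels

# Hypothetical shots along the arc, every 5 degrees at 23.75 ft.
arc = [(23.75 * math.cos(math.radians(a)), 23.75 * math.sin(math.radians(a)))
       for a in range(0, 181, 5)]
rim = [(0.0, 1.0), (1.0, 1.0), (0.0, 2.0), (1.0, 2.0)]
outlier = [(0.0, 12.0)]
labels = dbscan(arc + rim + outlier, eps=2.5, min_samples=3)
arc_labels, rim_labels = labels[:len(arc)], labels[len(arc):-1]
print(sorted(set(labels)))
```

In this toy data set the curved band of arc shots ends up as a single cluster because neighboring shots chain together, while the isolated mid-range shot is left as noise.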


Both the Gaussian mixture and DBSCAN models were able to create separate clusters of shots along the 3-point line. K-means was unable to do so for the shooting data because it assigns each point to its nearest centroid, which naturally forms spherical clusters and cannot capture irregular shapes. K-means is better suited to tasks such as grouping players that have similar styles or roles. It is less successful when clustering shots, since it does not create separate clusters around the 3-point line as DBSCAN and Gaussian mixture do. Gaussian mixture was also able to show more shooting density in 2017 for both the Rockets and the Spurs under the 3-point line when compared to the 2007 season. Overall, the Rockets shot more 3-pointers and fewer mid-range shots in 2017 than in 2007. The San Antonio Spurs, on the other hand, had a shot distribution in 2017 similar to that of 2007.

Recommendations:
The next step in this research would be to see how much effect this increase in 3-point attempts has on overall winning percentage. As mentioned above, the Golden State Warriors had the best regular-season record of all time, largely due to their 3-point shooting ability. However, is this one team an anomaly, or will there be a true shift in the NBA game over the next 10 years? Should teams invest time and resources into putting 5 highly efficient shooters on the court, as opposed to the traditional Guard-Guard-Forward-Forward-Center lineup that fans are used to seeing? Many factors contribute to a team's overall winning percentage, but it would be interesting to see how important the 3-point shot is compared to other statistics. An analysis of playoff teams vs. non-playoff teams over the last 5 years may give insight into this topic.

REFERENCES
1. “NBA league average points per game 2006”. StatMuse. 2006. https://www.statmuse.com/nba/ask/nba-league-average-points-per-game-2006
2. “NBA league average points per game 2021”. StatMuse. 2021. https://www.statmuse.com/nba/ask/nba-league-average-points-per-game-2021
3. Andrew Lisa. “25 ways the NBA has changed in the last 50 years”. Stacker. 2019. https://stacker.com/basketball/25-ways-nba-has-changed-last-50-years
4. “Stephen Curry”. Basketball Reference. https://www.basketball-reference.com/players/c/curryst01.html