# On Entropy in NBA Basketball

EE376A (Winter 2019)
##### Authors: Travis Chen and Rifath Rashid

Introduction

We begin with a quote from Kevin Durant of the Golden State Warriors,  2-time NBA Champion and Finals MVP:

“The beauty of the game is the movement and the way you play off of each other and I think we do such a good job of not being too predictable out there… we might hit Draymond for a dunk or get Andre in there for a slip or Steph and Klay hitting 3’s or I’m coming off shooting the mid-range so we’re just trying to be unpredictable and use everybody on our team and I think that’s to our advantage when we move the ball that’s beautiful basketball. That’s why we play the game. We know we always got that in our back pocket along with good talent so we wanna try to have a good balance between both, of giving it to our best players but also playing with our team concepts…”

http://www.espn.com/video/clip?id=23719638

Durant’s commentary yields a few immediate insights:

1. A notion of entropy is useful for keeping opponents off balance, in both scheme and on-court movement: a team that can run multiple plays on any given possession is much harder to defend than one that runs the same action every play.
2. This notion of entropy must be balanced against game-theoretic notions of strict dominance: an entropic action may be suboptimal when it takes possessions away from a team’s best play-calls involving its best players.

Literature Review

We begin with a review of one of the seminal works analyzing entropy in an NBA context, D’Amour et al., and in particular how it fails to address the second point. D’Amour begins by condensing the exponentially large state space of basketball plays down to a simplified Markov model, where each state is defined by the following key attributes:

• The current ball carrier
• Location in one of 7 different court regions
• A boolean representing whether or not a defender is within 5 feet of the ball carrier

Each transition s → s’ has an associated transition probability, and every state can transition to a special end state if a shot attempt or turnover occurs within a one-second window. Furthermore, D’Amour introduces two key concepts: immediate opportunity and open look probability. Immediate opportunity is simply the expected points earned from attempting a shot in a particular state, while open look probability is the probability that the current state will eventually transition into a state in which some player has an undefended shot attempt.

While D’Amour acknowledges the simplifying assumptions of this model, these assumptions should already raise flags for the reader. Namely, such a model will inevitably possess skewed transition probabilities, given that not all state transitions are possible. An undefended power forward near the rim is unlikely to transition to any state but the end state. On the other hand, a point guard at the center of the arc might transition almost uniformly to any other state. Very quickly, we realize that the entropic properties of a state do not necessarily define its strategic utility; there must be more complexities at play. As D’Amour’s findings show, the team with the highest entropy in play-calling, the Philadelphia 76ers, had the second-worst record in the league in the year the study was conducted. The most successful teams, Miami and San Antonio, exhibited a blend of high entropy and high immediate opportunity, hinting that a successful strategy should balance both. That is to say, the correlation of entropy with winning is more asserted than empirically verified.

Moreover, D’Amour’s analysis operates under the Markov assumption that “the future evolution of the possession only depends on its current state.” The most intuitive issue with this assumption is that play-calling requires a notion of memory between states: the beginning of a possession will strongly inform what transpires n > 1 states later, even if the intermediate states carry less mutual information. More formally, we can think of this in terms of the data processing inequality, which states that any Markov chain X → Y → Z satisfies I(X; Z) ≤ I(X; Y). Let X be “Steph Curry is dribbling the ball on the left wing with a defender on him.” This could lead to 4 different outcomes Y, each with equal probability: Klay Thompson receives the ball at the top of the key with a defender on him, Draymond Green receives the ball at the top of the key wide open, Kevin Durant receives the ball on the right wing with a defender on him, or Steph Curry moves to the right wing. However, we know from the starting state of the play, in which Steph Curry dribbled the ball up to the left wing, that he will with certainty end up with the ball in the right corner (Z). Under the state space as D’Amour has defined it, this gives I(X; Z) > I(X; Y), which contradicts the data processing inequality. The inequality itself always holds for a true Markov chain, so the conclusion is that X → Y → Z cannot be a Markov chain here.
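
The argument above can be checked numerically on a toy version of the possession. This is a minimal sketch under assumptions that are not in the original: X is made random by adding a hypothetical second starting option (the play may start on either wing with equal probability), Y is uniform over the four intermediate outcomes regardless of X, and Z is fully determined by X. All state names are illustrative labels.

```python
import math
from collections import defaultdict

def mutual_information(joint):
    """I(A;B) in bits from a dict mapping (a, b) pairs to probabilities."""
    pa, pb = defaultdict(float), defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return sum(
        p * math.log2(p / (pa[a] * pb[b]))
        for (a, b), p in joint.items()
        if p > 0
    )

# Hypothetical toy possession (assumption, not from the original study):
# X = wing where the play starts, Y = intermediate state, Z = final corner.
starts = ["left_wing", "right_wing"]
mids = ["klay_top", "draymond_top", "durant_wing", "curry_wing"]
final_corner = {"left_wing": "right_corner", "right_wing": "left_corner"}

# Y is uniform over 4 outcomes no matter where the play started.
joint_xy = {(x, y): 0.5 * 0.25 for x in starts for y in mids}
# Z is a deterministic function of the starting state X.
joint_xz = {(x, final_corner[x]): 0.5 for x in starts}

i_xy = mutual_information(joint_xy)
i_xz = mutual_information(joint_xz)
print(f"I(X;Y) = {i_xy:.3f} bits, I(X;Z) = {i_xz:.3f} bits")
```

In this toy setup the intermediate state tells us nothing about the start while the final state reveals it completely, so the chain cannot be Markov in the sense the model assumes.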

An Empirical Examination of Basketball Play-Calling Entropy in 2018

We examine a dataset containing frequency and efficiency statistics for each of the 7 main playtypes (Catch and Shoot, Drives, Handoffs, Isolations, Picks, Posts, Transition), for each team this season. Given our lack of location tracking data, we use this as a heuristic for the space of basketball plays. Moreover, this approach scraps the Markov Assumption altogether by modelling play-calling as a prior on the set of plays a coach will call rather than a continuous-time sequence of states that transition from one to another.

For example, the breakdown for the Golden State Warriors on a per-game basis is as follows:

• Catch and Shoot: 49.0
• Drives: 22.4
• Handoffs: 9.4
• Isolations: 11.6
• Picks: 30.9
• Posts: 12.2
• Transition: 18.1

We normalize this to a probability distribution:

• Catch and Shoot: 0.319
• Drives: 0.146
• Handoffs: 0.061
• Isolations: 0.076
• Picks: 0.201
• Posts: 0.079
• Transition: 0.118

And compute entropy as:

$H(\text{playcalling}) = -\sum_{u \in \text{playcalls}} p(u) \log p(u) = -0.319 \log(0.319) - 0.146 \log(0.146) - \dots \approx 1.787$ (using the natural logarithm, so the value is in nats)
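
As a sanity check, the normalization and entropy computation can be sketched in a few lines of Python. The per-game frequencies are the Warriors figures listed above; the function names are illustrative, and the natural logarithm is assumed because it is the base that reproduces the reported value.

```python
import math

# Per-game play-type frequencies for the Golden State Warriors (from the list above).
warriors = {
    "Catch and Shoot": 49.0,
    "Drives": 22.4,
    "Handoffs": 9.4,
    "Isolations": 11.6,
    "Picks": 30.9,
    "Posts": 12.2,
    "Transition": 18.1,
}

def normalize(counts):
    """Turn raw frequencies into a probability distribution."""
    total = sum(counts.values())
    return {play: n / total for play, n in counts.items()}

def entropy(probs, base=math.e):
    """Shannon entropy; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

dist = normalize(warriors)
h_nats = entropy(dist.values())          # natural log (assumed base)
h_bits = entropy(dist.values(), base=2)  # same quantity expressed in bits

for play, p in dist.items():
    print(f"{play}: {p:.3f}")
print(f"entropy: {h_nats:.3f} nats = {h_bits:.3f} bits")
```

The maximum possible value for 7 play types is `math.log(7)` nats, which gives a natural scale for comparing teams.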

Critically, we evaluate whether this notion of entropy is predictive of the percentage of games won.

We notice that there exists only a weak correlation between our notion of play-calling entropy and win rate, with an $R^2$ value of 0.073. There are also some clear outliers: several very successful teams, including the Utah Jazz (UTA) and the Houston Rockets (HOU), employ strategies with relatively low play-calling entropy. For example, the Houston Rockets have a play-calling distribution that is heavily weighted towards isolation plays (#1 in attempts), yet they simultaneously rank #1 in isolation efficiency, that is, points per shot taken out of isolation.

This suggests that while our play-calling entropy metric is somewhat predictive of team success, it is incomplete: it may be strategic to have a lower-entropy distribution of plays if those plays are strictly more efficient than the others.

Tying in Efficient Opportunity with Play-Calling Entropy

Efficiency ties in with game theoretic notions of strict dominance: that is, maximizing the usage of a single type of play-call may be advantageous over assigning additional probability to any other if that play-call will always perform better (in terms of maximizing expected points for a team).

At the same time, it is both empirically and intuitively apparent that entropy can lead to more efficiency as well. For example, if an opposing defense knows that an offensive team will always drive and rarely if ever catch and shoot, then it can clog the restricted area of the court, knowing that there is no threat of a shooter farther out on the perimeter. This lends itself to an analysis of equilibria.

To simplify our problem, consider a version of basketball where team Entroballers  can only drive or shoot. Consider the following parameters:

• $p$ = Probability that Entroballers calls a drive play
• $E_{Drive}$ = Points per Drive Play
• $1 - p$ = Probability that Entroballers calls a catch and shoot play
• $E_{Shoot}$ = Points per Catch and Shoot Play

To incorporate the idea that increasing the frequency of a play-call may reduce its efficiency  (due to teams being more prepared for it), we introduce a discount factor D. For every unit of probability increase for either play, we can model the efficiency of such a play as decreasing by D units. Conversely, when we decrease the probability of a certain play by a unit, the efficiency of such a play will increase by D units.

To give a concrete example suppose our Entroballers parameters are:

• $p = \frac{2}{3}$
• $E_{Drive} = 1.5$
• $1-p = \frac{1}{3}$
• $E_{Shoot} = 0.9$
• $D = 0.5$

It is clear that to optimize performance, Entroballers should probably be calling more drive plays than they currently are, and thus fewer shot plays. Their current overall expected value for a given play is:

$p \cdot E_{Drive} + (1-p) \cdot E_{Shoot} = \frac{2}{3}(1.5) + \frac{1}{3}(0.9) = 1.3$

At the same time, we cannot just call a drive play 100% of the time, given that this will heavily discount our expected drive efficiency to the point of being ineffective.

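Thus, we want to increase our drive frequency by the right amount, $\epsilon$. That is, we wish to find:

$\epsilon^* = \arg\max_{\epsilon} \left[ (p+\epsilon)(E_{Drive} - D\epsilon) + (1-p-\epsilon)(E_{Shoot} + D\epsilon) \right]$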

Taking the derivative of this objective with respect to $\epsilon$ and setting it to 0, we get that $\epsilon$ is optimized at 13/60, or around 0.22, so that we call drives with probability $p + \epsilon = 53/60$, roughly 88 to 89% of the time rather than 67%.
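
For reference, one way to write out this derivation is sketched below. It assumes the objective is the expected points per play after moving $\epsilon$ of probability from shots to drives, with each play's efficiency shifting linearly by $D$ per unit of probability as described above; the name $f$ is introduced here only for illustration.

$$
\begin{aligned}
f(\epsilon) &= (p+\epsilon)(E_{Drive} - D\epsilon) + (1-p-\epsilon)(E_{Shoot} + D\epsilon) \\
f'(\epsilon) &= E_{Drive} - E_{Shoot} + D(1-2p) - 4D\epsilon \\
\epsilon^* &= \frac{E_{Drive} - E_{Shoot} + D(1-2p)}{4D} = \frac{0.6 - \tfrac{1}{6}}{2} = \frac{13}{60}
\end{aligned}
$$

Since $f''(\epsilon) = -4D < 0$, this stationary point is a maximum.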

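Our new optimized expected value is now:

$\frac{53}{60}\left(1.5 - 0.5 \cdot \frac{13}{60}\right) + \frac{7}{60}\left(0.9 + 0.5 \cdot \frac{13}{60}\right) \approx 1.347$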

an upgrade over our original 1.3!
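
The optimization can also be checked numerically. The following is a minimal sketch that assumes the linear discount model described above (each unit of probability moved toward drives lowers drive efficiency by D and raises shot efficiency by D); the function name and the grid resolution are illustrative choices.

```python
def expected_points(eps, p=2/3, e_drive=1.5, e_shoot=0.9, d=0.5):
    """Expected points per play after shifting `eps` of probability from shots to drives."""
    drive = (p + eps) * (e_drive - d * eps)
    shoot = (1 - p - eps) * (e_shoot + d * eps)
    return drive + shoot

# Grid search over the feasible shifts 0 <= eps <= 1/3 in steps of 1/6000.
grid = [i / 6000 for i in range(2001)]
best_eps = max(grid, key=expected_points)

print(f"baseline expected points: {expected_points(0.0):.4f}")
print(f"best eps: {best_eps:.4f}")
print(f"optimized expected points: {expected_points(best_eps):.4f}")
```

Calling drives 100% of the time corresponds to `eps = 1/3`, which this model scores below the optimum because the drive efficiency is discounted too heavily.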

Note that this is an entropy-decreasing modification to our playcall distribution. Measured in bits (log base 2), our original entropy was:

$-\frac{2}{3} \log_2 \frac{2}{3} - \frac{1}{3} \log_2 \frac{1}{3} \approx 0.918$

But it is now:

$-0.89 \log_2 (0.89) - 0.11 \log_2 (0.11) \approx 0.500$

yet our expectation for points scored has increased, suggesting a counter-example to the notion that entropy is perfectly predictive of success.
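
The two entropy values can be reproduced with a short binary-entropy helper. This is only a sketch: it assumes log base 2, and it uses the rounded drive probability of 0.89 quoted in the text.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a two-outcome distribution (p, 1 - p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

h_before = binary_entropy(2 / 3)  # original drive probability
h_after = binary_entropy(0.89)    # drive probability after the shift (rounded)

print(f"before: {h_before:.3f} bits, after: {h_after:.3f} bits")
```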

Given this crude notion that both raw efficiency and entropy are important to creating an optimal, equilibrium-based strategy, we seek to create a modified objective for teams: maximize a weighted sum of play-calling entropy and efficiency.

For now, we give equal weights (0.5) to both as a proof of concept. We obtain a much stronger correlation with win rate than we did when using entropy alone.
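
To illustrate how such a weighted objective behaves, here is a sketch applied to the two-play Entroballers model rather than to real team data. Everything about the combination is an assumption made for illustration: entropy is measured in bits, efficiency is the expected points per play under the linear discount model, and the two are added without any rescaling. The metric used for the actual team-level analysis may differ.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a two-outcome distribution (p, 1 - p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def expected_points(p, p0=2/3, e_drive=1.5, e_shoot=0.9, d=0.5):
    """Expected points when drives are called with probability p.

    Efficiencies are quoted at the baseline p0 and shift linearly by d
    per unit of probability moved away from that baseline.
    """
    shift = p - p0
    return p * (e_drive - d * shift) + (1 - p) * (e_shoot + d * shift)

def score(p, w=0.5):
    """Hypothetical objective: w * entropy + (1 - w) * efficiency."""
    return w * binary_entropy(p) + (1 - w) * expected_points(p)

# Search drive probabilities strictly between 0 and 1.
grid = [i / 1000 for i in range(1, 1000)]
best_p = max(grid, key=score)
print(f"drive probability maximizing the weighted score: {best_p:.3f}")
```

With equal weights, the maximizer falls between the maximum-entropy mix (0.5) and the efficiency-only optimum (53/60), which is the balancing behavior the modified objective is meant to capture.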

Conclusion

We investigate the notion of entropy in a basketball context and conclude that the seminal approach of D’Amour et al. may be inadequate for predicting team success because of two modelling issues: it does not account for notions of strict dominance, and it relies on the Markov assumption. We propose a game-theory-inspired alternative that balances entropy and efficiency and correlates more strongly with team success than entropy alone.

Outreach

Given how relevant our topic might be to young aspiring ballers, we immediately knew that our outreach activity needed to be interactive.  What better way to teach kids and prospective NBA players entropy than to get them draining hoops?

We created an animation that allowed students to pick a difficulty level of 0, 1, or 2. The animation would then flash a green dot on the screen, indicating the spot the student must shoot from. These levels correspond to the entropy of shot selection. If a student picked level 0, their shot selection would be deterministic. However, if they challenged themselves and picked level 2, they would soon find themselves hopping all over our makeshift half-court to meet the demands of the flashing green dots and to be unpredictable to their imaginary opponents. In particular, this version of the game had an entropy of 2 bits, given that each of 4 states had uniform probability.

The real beauty of this activity was the gradual transformation in the students’ mentalities around challenging themselves. At first, most students only wanted to try level 0. With all their peers watching, they cared more about not missing than about being unpredictable. However, as the outreach event progressed, we noticed the same students running back to our station, struggling to contain their excitement about attempting level 2. The same students, with a newfound air of confidence, would bring their friends and ask if they could tackle the challenge as a group, and very soon we were servicing groups of 3 or 4 (something we had not planned for, but were more than happy to accommodate).

We did have a huge bag of KitKats, and our policy was that each participant would get 1 KitKat for being brave enough to try the activity and an additional KitKat for each shot made. To our humbling surprise, parents were not fond of this KitKat policy after they realized their children were future Division 1 athletes making all 10 of their shots.