Thursday, April 03, 2014

Spurs' 19-Game Winning Streak on Line Tonight in OKC

Less than an hour from now, the San Antonio Spurs will put their 19-game winning streak on the line in Oklahoma City against the Thunder.

Last week, as the Philadelphia 76ers were plummeting toward a tie for the NBA's longest losing streak in history (26 games), I posted an analysis of whether teams were ever able to end long losing streaks against really good opponents. Occasionally this happened, as we discovered. More likely, however, was that a skid would end against very weak opposition.

With tonight's Spurs-Thunder game starting shortly, I have put together a graphic that turns last week's question on its head. How often does a long winning streak end against poor opposition? Or, does it nearly always take a high-caliber opponent to end a team's long winning streak? Except for now looking at all-time great NBA winning streaks instead of losing streaks, my methodology today is the same as last week's.

What we find in the graph below (on which you can click to enlarge) is that some of the greatest winning streaks, such as the Lakers' record 33-gamer, were ended only when a stellar opponent came up on the schedule. A few times, however, a hot team was embarrassed by an opponent playing at or below a .300 clip!

Time is short, so I'll end here. I may come back and add more commentary later...


UPDATE: The Spurs' winning streak ended at 19, with a loss to the Thunder.

Saturday, March 29, 2014

Michigan's 3PT Shooting: An Illustration of Regression to the Mean

Despite holding a 60-45 lead over Tennessee with 10:57 left in last night's NCAA Sweet Sixteen game, the Michigan men's basketball team had to sweat things out for a 73-71 win (play-by-play sheet). One reason the Wolverines were unable to coast to a blow-out win over the Volunteers was a drop in Michigan's three-point shooting percentage from .778 (7-of-9) in the first half to .364 (4-of-11) in the second.

Whereas there could be substantive reasons for the Wolverines' second-half decline from behind the arc (e.g., fatigue, better Tennessee defense), the phenomenon of regression toward the mean almost certainly contributed as well. Regression toward the mean refers to the tendency of performers who exhibit extreme values on an initial set of measurements -- on either the high or low end -- to perform closer to an average level on later measurements. According to the Social Research Methods website, regression toward the mean:

will happen anytime you measure two measures! It will happen forwards in time (i.e., from pretest to posttest). It will happen backwards in time (i.e., from posttest to pretest)! It will happen across measures collected at the same time (e.g., height and weight)! It will happen even if you don't give your program or treatment. 

Using box scores from all of Michigan's 2013-14 games to date (contained in UM's game notes in advance of Sunday's Elite Eight match-up with Kentucky), I plotted the Wolverines' team three-point shooting percentages for each first half and second half played this season. Each line in the graph links the two halves of the same game, with the Tennessee game depicted in orange, as one example (there were too many games, 36, to label each line). You may click on the graph to enlarge it.


Regression to the mean is indicated by lines that slope from very high to the middle, and lines that slope from very low to the middle. Also shown in the graph is Michigan's .402 three-point success rate for the season to this point. The Wolverines' pattern is a textbook example of regression toward the mean, as can be seen by comparing the above graph to this diagram from a textbook (Campbell and Kenny's A Primer on Regression Artifacts).

When Michigan (or any team) hits close to 80% of its treys in a half of one game, it is unlikely that it can match or exceed that rate in the other half. It is also true that a team shooting .100 or worse for a half will rarely* match or drop below that level in the other half.
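For readers who want to see regression toward the mean emerge from chance alone, here is a minimal simulation sketch. It is not the analysis behind the graph. It assumes every three-point attempt is an independent trial at Michigan's season rate of .402, and that the team takes 10 attempts per half (a round number near the 9 and 11 attempts in the Tennessee game). The function names, the number of simulated games, and the random seed are all illustrative choices of mine.

```python
import random

TRUE_RATE = 0.402        # Michigan's season three-point rate, from the post
ATTEMPTS_PER_HALF = 10   # assumption: round number near the 9 and 11 attempts vs. Tennessee
N_GAMES = 20000          # assumption: enough simulated games for stable averages


def made_threes(rng, attempts, rate):
    """Count makes in one half, treating each attempt as an independent trial."""
    return sum(rng.random() < rate for _ in range(attempts))


def simulate(n_games, attempts, rate, seed=2014):
    """Return a list of (first-half makes, second-half makes) pairs."""
    rng = random.Random(seed)
    return [
        (made_threes(rng, attempts, rate), made_threes(rng, attempts, rate))
        for _ in range(n_games)
    ]


def mean_second_half_pct(games, keep):
    """Average second-half percentage over games whose first half passes `keep`."""
    second = [s / ATTEMPTS_PER_HALF for f, s in games if keep(f)]
    return sum(second) / len(second)


games = simulate(N_GAMES, ATTEMPTS_PER_HALF, TRUE_RATE)

# Hot first halves (7 or more makes in 10) and cold ones (0 or 1 make in 10).
after_hot = mean_second_half_pct(games, lambda made: made >= 7)
after_cold = mean_second_half_pct(games, lambda made: made <= 1)

print(f"Second-half pct after a hot first half:  {after_hot:.3f}")
print(f"Second-half pct after a cold first half: {after_cold:.3f}")
```

Under these assumptions, both averages should land near .402, no matter how extreme the first half was. That is the pattern the sloping lines in the graph depict, produced here without any fatigue or defensive adjustment.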

As noted above, regression to the mean is virtually certain to occur anytime multiple measurements are obtained. The above depiction for Michigan is probably more dramatic than would be the case for most other teams, as most teams presumably are not as capable as the Wolverines of exceeding three-point shooting percentages of .600 or .700 within a half. Out of 351 NCAA Division I men's basketball teams, Michigan finished the regular season tied for seventh nationally in three-point shooting percentage.

---
*I inadvertently omitted the word "rarely" from the original version of this posting.

Thursday, March 27, 2014

Note to 76ers' Fans: Losing Streaks Usually End Against Bad Teams

With college basketball's March Madness dominating the U.S. hoops scene, it may have escaped some that the Philadelphia 76ers are on the verge of tying and possibly breaking the NBA record for longest losing streak. As shown on Wikipedia's list of the longest NBA losing streaks, Philadelphia has been "deep-sixed" 25 straight times during its current streak, one loss shy of the record 26 consecutive defeats suffered by the 2010–11 Cleveland Cavaliers.

A record-tying 26th straight loss likely awaits the Sixers tonight at Houston. Even the disparity in the teams' records this season -- 48-22 for the Rockets, compared to 15-56 for Philly -- probably doesn't capture the full difference in the teams' abilities. After all, the Western Conference, in which Houston plays, has been much tougher this year than the Eastern Conference, in which the Sixers play, and teams play most of their games within conference. Also, after Philly recently traded Evan Turner, one of its better players, an article from Examiner.com contended that:

On paper this is a bad deal for the Sixers, but they have no intention on trying to win games. The team is tanking like no other in hopes of winning the 2014 NBA Draft Lottery.

Getting back to tonight's game, Carl Bialik, the former Wall Street Journal "Numbers Guy" who now writes for the newly relaunched FiveThirtyEight, calculates only a 4% chance of the 76ers winning.

As the Sixers' losing streak was building in recent weeks, I began trying to come up with a statistical angle on it. One line of thinking is that, contrary to the idea of other teams taking the struggling team lightly, opponents will play even harder against a team in free-fall in an attempt to avoid being "that team" -- the one against whom the losing streak ended. Thus, for Philly, ending its losing streak against a strong team such as Houston, on the road no less, would seem unlikely.

The question then came to me: what is the profile of an opposing squad against which a team ends its long losing streak? Presumably, such an opponent is likely to be a bad team. If you lose to a team that has lost its last 20 or 25 games, you can't be that good yourself. Ultimately, though, it's an empirical question.

I consulted the aforementioned Wikipedia list of the longest losing streaks in NBA history. The list included 30 losing streaks: one each of 26, 25, and 24 straight losses, three of length 23, one of 21 games, four of length 20, seven of length 19, four of length 18, and eight 17-game losing streaks. The list also included the date and opponent when the streak ended. I then went to Basketball Reference, which has extensive season logs for all teams in NBA history. For example, seeing on the Wikipedia list that the 2010-11 Cleveland Cavaliers (holder of the league record) ended their 26-game losing streak on February 11, 2011 against the L.A. Clippers, I could go to the Clippers' log for that season and see that they brought a 20-32 (.385) record into the game with the Cavs. Taking advantage of this weak opposition, Cleveland ended its losing streak.

I tried to make the same inquiry into the ending of all 30 of the NBA's longest losing streaks. However, streaks that carried over from one season to the next often ended early the next season, when teams may have played only a few games. To ensure relatively large samples of games, therefore, I limited my analysis to situations in which teams against whom a long losing streak ended had played at least 20 games during the season. There were 18 such situations, which I depict in the following graph. Unless you have some unbelievably strong eyesight, you'll want to click on the graphic to enlarge it.


The data points, represented by little basketballs, are arranged left-to-right from lowest to highest opponents' winning percentages entering games in which long losing streaks ended. A description of each streak-ending appears vertically by each ball. On the far left, the game in question is one in which the 1997-98 Denver Nuggets ended their 23-game losing streak by beating a 10-32 (.238) Clippers outfit. In another seven games, a team ended a long losing streak by beating a team whose winning percentage was in the .300s entering the game.

Contrary to my expectation that most games to end long losing streaks would have featured a really weak opposing team, seven of the games featured opponents with incoming winning percentages from .483-.614. And, most surprising of all, in three games, teams ended their long losing streaks against top-quality opposition (depicted in blue text on the graphic):
  • The 1964-65 then-San Francisco Warriors ended their 17-game losing streak by beating the 34-16 (.680) Cincinnati Royals (now the Sacramento Kings).
  • The 1972-73 Sixers, a squad that won only nine games all season, snapped their 20-game losing streak by beating the 42-18 (.700) Milwaukee Bucks. This was during the Kareem Abdul-Jabbar era in Milwaukee, in which the Bucks won the 1971 NBA title and lost a seven-game final in 1974. Kareem did miss the fourth quarter of the Sixers' streak-busting game, due to a back injury.
  • The 1967-68 then-San Diego Rockets ended their 17-game losing streak by beating the 48-16 (.750) 76ers. This was toward the end of Wilt Chamberlain's time in Philly, with the Sixers having won the 1967 NBA title.  
So yes, there is some precedent for teams ending their long losing streaks against opposing teams with winning percentages in the vicinity of .700. Perhaps you noticed another pattern, though. All three instances of teams ending their losing streaks against such lofty opposition occurred more than 40 years ago! It may be just a coincidence. However, another possibility is that the greater scrutiny of sports contests now than in the past (e.g., via the Internet, 24-hour sports cable networks, and radio talk shows) has made the top teams extra sensitive to becoming "that team" when they face an opponent on a long losing streak.
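The bookkeeping behind the graph is simple enough to sketch in a few lines. The example below uses only the five opponent records quoted in this post, plus one clearly hypothetical early-season record added to show the 20-game cutoff at work. The variable and function names are my own, and this is an illustration rather than the spreadsheet I actually used.

```python
MIN_GAMES = 20  # cutoff from the post: opponent must have played at least 20 games

# (opponent description, wins, losses) entering the streak-ending game
records = [
    ("Clippers (beaten by 1997-98 Nuggets)", 10, 32),
    ("Clippers (beaten by 2010-11 Cavaliers)", 20, 32),
    ("Royals (beaten by 1964-65 Warriors)", 34, 16),
    ("Bucks (beaten by 1972-73 Sixers)", 42, 18),
    ("76ers (beaten by 1967-68 Rockets)", 48, 16),
    ("Hypothetical early-season opponent", 3, 2),  # made up, to show the filter
]


def win_pct(wins, losses):
    """Winning percentage as a fraction of games played."""
    return wins / (wins + losses)


# Drop small samples, then order from weakest to strongest opponent.
eligible = sorted(
    (
        (name, win_pct(wins, losses))
        for name, wins, losses in records
        if wins + losses >= MIN_GAMES
    ),
    key=lambda item: item[1],
)

for name, pct in eligible:
    print(f"{pct:.3f}  {name}")
```

The hypothetical 3-2 opponent is dropped by the filter, and the remaining five sort from .238 up to .750, matching the left-to-right ordering described above.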

UPDATE: After losing to Houston to tie the NBA record of 26 straight losses, the 76ers beat Detroit to end the streak.

Friday, March 14, 2014

Team Scoring Runs in College Basketball -- Revisited

ESPN The Magazine's annual "Analytics Issue" (March 3, 2014) includes an article by Ken Pomeroy on when team scoring runs are most likely to occur in college basketball. Pomeroy focuses on runs of at least 10-0 (i.e., one team scoring 10 straight points without any scoring by the opponent), although other analysts might differ either on the minimum number of points by the "hot" team needed to constitute a run or on whether the shutout element is necessary for a run (i.e., some would consider outscoring a team by a margin of 15-2, for example, during a stretch to be a run).

Using the 10-0 criterion and voluminous data from recent seasons, Pomeroy examined, among other things, the probability of teams going on a run, depending on whether they were winning or losing (and by how much) or tied. He found small, but steady, differences, comprising an unmistakable trend. The more a team was behind, the higher its probability of going on a 10-0 run, and the more a team was ahead, the smaller its probability. A team trailing by 10 points had approximately a 1.86% chance of going on such a run, a team trailing by 9 points had roughly a 1.76% chance, a team 8 points behind had roughly a 1.72% chance, and so forth (the reason these percentages are approximate is that the exact values are not listed and I am estimating them visually from the heights of bars on a graph). In a tie game, a team has about a 1.24% chance of a 10-0 run. A team with a 1-point lead had around a 1.20% chance, one with a 2-point lead had roughly a 1.16% chance, and so forth. Finally, a team up 10 had approximately a 0.88% chance.

In my 2012 book Hot Hand, I also examined team runs, in this case in the 2004 NCAA men's basketball tournament. My aim at the time was simply to document the number of runs in the tourney, using the criterion of a 10-point margin, but not requiring a shutout during the run. A margin such as 12-2 or 16-3 would have sufficed, for example. As I wrote in the book, "Nearly three-quarters of the games (47 out of 64) featured at least one major run" (p. 27). Many of the games included multiple runs, so the number of total runs was 67. Unlike Pomeroy, I did not initially seek to correlate the occurrence of runs with whether the team that went on the run was ahead or behind (and by how much) at the time of the run. However, play-by-play sheets from the 2004 tournament are still available online (by going to a given team's schedule page at ESPN.com, as in this example, and then selecting 2003-04). Thus, I could go back and try to replicate Pomeroy's analysis.*

Whereas my horizontal axis was structured the same as Pomeroy's, depicting how many points behind (negative values) or ahead (positive values) a team was right before launching its run, I used a different measure of run intensity on the vertical axis. As I noted above, he looked at probability of a 10-0 run. Instead, I plotted the margin of a given run (e.g., outscoring an opponent 16-3 during a run would be recorded as +13). The graph appears below, the background ranging left-to-right from darker red for larger deficits to darker green for larger leads. You may click on the graph to enlarge it.


Each dot represents a particular scoring run. Three of the dots are annotated to provide examples of what all the dots represent. The downward-trending line, known as the best-fit line because it is close to as many of the dots as possible, shows the same pattern as Pomeroy's findings. The further behind a team was, the greater its tendency to outscore the opponent by a huge margin. For those of you with some statistical training, the correlation between initial deficit/lead and margin of outscoring the opponent during the run was r = -.24, on the cusp of statistical significance at p = .051.
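For those who would like to check the significance level, the following sketch converts a correlation into a t statistic and a two-tailed p-value using only the Python standard library. It assumes all 67 runs enter the correlation (so n = 67 and 65 degrees of freedom) and uses the rounded value r = -.24. The function names and the Simpson's-rule integration are my own illustrative choices, not the software I used for the post.

```python
import math


def t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log(1 + x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value; integrates the density from 0 to |t| by Simpson's rule."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    return 1 - 2 * central_area


r, n = -0.24, 67  # r from the post; n assumes all 67 runs were included
t = t_statistic(r, n)
p = two_tailed_p(t, n - 2)

print(f"t = {t:.3f}, two-tailed p = {p:.3f}")
```

With these inputs, t comes out near -1.99 and p sits right around .05, in line with the "cusp of significance" description. Any small gap from the reported p = .051 would presumably reflect r having been rounded to two decimals.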

Teams that were way ahead rarely went on a big scoring run. One exception, noted in the graph, is that Kansas, already leading 85-64, went on a 12-0 run against the University of Alabama-Birmingham to expand the Jayhawks' lead to 97-64. One reason teams with big leads rarely go on new scoring runs presumably is that they often take out their top players, both to rest them and to avoid the appearance of "running up the score" on the opponent.

A second, more statistically based reason for why trailing teams are more likely than leading teams to go on runs is the concept of regression toward the mean. Regression toward the mean tells us that, even absent any intervention, both extreme low performers (i.e., the trailing team) and extreme high performers (i.e., the leading team) tend to return toward more average performances. An extreme low performer has nowhere to go but up, and an extreme high performer has nowhere to go but down.

In conclusion, Pomeroy's and my investigations are very clear that trailing teams are far more likely to go on scoring runs than are leading teams. Psychological factors suggested by Pomeroy (e.g., motivation on the part of the trailing team and the desire to conserve energy by the leading team) and regression toward the mean are likely explanations of the basic finding, but it is difficult to know the relative importance of the two explanations.

---
*In revisiting the play-by-play sheets from the 2004 NCAA tournament, I noticed a few slight discrepancies with what appeared in my book. For example, a game article on the Mississippi State-Monmouth contest stated that, "The 15th-seeded [Monmouth] Hawks shot their way within four points late in the first half, but Mississippi State pulled away by controlling both ends of the floor. The Bulldogs tore off a 22-5 run in less than 10 minutes, and cruised to their largest margin of victory of the season." Based on the game article, I listed Mississippi State's run as 22-5 in my book. However, the article's qualifier "in less than 10 minutes" was more important than I realized at the time. If one looks at the play-by-play sheet, one sees that Mississippi State indeed outscored Monmouth 22-5 in the roughly 10:00 window of time from 4:47 left in the first half (MSU up 36-32) to 15:23 left in the second half (MSU up 58-37). However, what I did not notice until revisiting the play-by-play sheet for today's analysis is that Mississippi State added 6 more unanswered points beyond the 10-minute window, making the full run really 28-5. Small discrepancies such as this were corrected for today's analysis.

Saturday, December 28, 2013

Kyle Korver Approaches 100 Straight Games with At Least One Made Three-Pointer

Kyle Korver, a 6-foot-7 shooting guard for the Atlanta Hawks, has now hit at least one three-point basket in 98 straight games, a new record. To reach an even 100, Korver must hit a three tonight at home against Charlotte and Sunday night at Orlando (Hawks' schedule). As reported in this USA Today article from late November, the previous record was 89 consecutive games with a trey. The Top 5 longest such streaks in NBA history include Korver's and four that go back to the 1980s and '90s:

Kyle Korver 98 (still going)
Dana Barros 89
Michael Adams 79
Dennis Scott 78
Reggie Miller 68

The "at least one" aspect of Korver's streak brings to mind baseball hitting streaks (i.e., number of games with at least one hit). Let's refer to a made three-pointer in basketball or a hit in baseball as a "success." To calculate the probability of a streak of games with at least one success, one needs to know the probability of at least one success in a given game, and to determine that, one must know the probability of a success in a single trial (a shot in basketball or an official at-bat in baseball).

The above-linked USA Today article, published when Korver's streak was at 87 games, noted that he had shot 46.9% (226-for-482) on treys during the streak. Consulting Korver's game-by-game log on ESPN.com, we see that he has made an additional 40 treys on 82 attempts in his last 11 games (note that he missed four games between Nov. 27 and Dec. 2 due to injury). Thus, the last 11 games have raised Korver's three-point shooting percentage during the streak to 47.2% or .472 (based on 266/564). We will use Korver's .472 shooting percentage on threes to represent his probability of success in a single trial.

During Joe DiMaggio's Major League Baseball-record streak of getting at least one hit in 56 consecutive games, set in 1941, his batting average was .409. That would be DiMaggio's probability of success in a single trial.

The number of trials or opportunities an athlete gets per game is also a crucial factor. In basketball, of course, a player can shoot as often as the coach wants him or her to, whereas in baseball, teams follow a batting order that limits each player to one at-bat for every nine taken by his team. With that distinction between basketball and baseball noted, dividing Korver's 564 three-point attempts by the 98 games in which they have occurred yields an average of 5.76 attempts per game. DiMaggio averaged 3.98 at-bats per game.* Because Korver has had a higher success rate in his task (making three-point shots) than DiMaggio had at his (getting hits), and because Korver has had more attempts per game than DiMaggio, it is not surprising that the length of Korver's streak has greatly exceeded DiMaggio's.

To deal in round numbers, let's say Korver takes six shots per game from behind the arc. With his .472 success rate on treys, his likelihood of missing any given attempt is 1 - .472, or .528. Korver's probability of missing all six of his (typical) three-point attempts in a game is then .528 to the 6th power, or .022. Anything other than an all-miss night means at least one made three-pointer. Thus, we take 1 - .022, which yields .978 as Korver's probability of making at least one shot from downtown in a game. (This formula assumes the outcomes of shots from one attempt to the next are independent of each other, like successive tosses of a coin.)

With roughly a 98% chance in each game of making at least one three-pointer (as long as he gets six attempts and has a .472 likelihood of success on each one), it is no wonder that Korver has been able to maintain such a long streak of games with a trey. To complete the calculation of Korver's probability of hitting at least one three-pointer in 98 consecutive games, we would raise .978 to the 98th power. This yields .113, for a roughly 1-in-9 chance that Korver (or other players whose three-point attempts and shooting percentages are similar to Korver's) would hit a trey in 98 consecutive games.
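The arithmetic in the last two paragraphs can be reproduced with a short sketch. It assumes, as the text does, six attempts in every game, a .472 success rate on each attempt, and independence from shot to shot. The function names are mine, for illustration only.

```python
def prob_at_least_one(p_success, attempts):
    """Chance of at least one success in `attempts` independent tries."""
    return 1 - (1 - p_success) ** attempts


def prob_streak(p_game, games):
    """Chance of at least one success in every one of `games` independent games."""
    return p_game ** games


P_THREE = 0.472   # Korver's three-point rate during the streak (266/564)
ATTEMPTS = 6      # round-number assumption used in the text
GAMES = 98

per_game = prob_at_least_one(P_THREE, ATTEMPTS)

# The post rounds the per-game figure to .978 before raising it to the 98th power.
streak_rounded = prob_streak(round(per_game, 3), GAMES)
# Carrying full precision through gives a slightly different answer.
streak_full = prob_streak(per_game, GAMES)

print(f"Per-game probability of a made three: {per_game:.3f}")
print(f"98-game streak (per-game value rounded to .978): {streak_rounded:.3f}")
print(f"98-game streak (full precision): {streak_full:.3f}")
```

Rounding the per-game probability to .978 reproduces the .113 figure. Carrying the unrounded per-game value through gives about .117, still in the neighborhood of 1-in-9, which shows how sensitive a 98th power is to the third decimal place.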

As discussed in my book Hot Hand (pp. 54-55), researchers David Rockoff and Phil Yates offered the useful reminder to streak analysts that, even though a player averages a certain number of opportunities per game, he or she may get more opportunities than average or fewer opportunities than average in any particular game. The above calculation for Korver, assuming a consistent six shots per game, may thus have to be taken with a grain of salt.

If, for example, Korver launched only two three-point attempts in a given game, he would not have a 97.8% probability of making at least one three in the game. Instead, we would take his miss rate (.528) and raise it to the second power (corresponding to two shots) to estimate the probability of an all-miss game; the calculation yields .279. With only two shots taken, therefore, Korver would have a 1 - .279, or .721, probability of making at least one trey in the game. That's still pretty high, but not as near-definitive as .978. With more than six shots in a game, Korver's probability of making at least one three-pointer would be even higher than .978. With 10 shots, for example, we would take 1 - (.528 to the 10th power), which yields .998.
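To see how much the number of attempts matters, here is a small sketch that tabulates the probability of at least one made three for different shot totals, again assuming independent attempts at a .472 success rate. The final part uses a deliberately artificial mix of games (half with two attempts, half with ten, averaging six) that I made up purely to illustrate Rockoff and Yates's caution. It is not Korver's actual distribution.

```python
P_THREE = 0.472  # Korver's three-point rate during the streak


def prob_at_least_one(p_success, attempts):
    """Chance of at least one make in `attempts` independent shots."""
    return 1 - (1 - p_success) ** attempts


# Per-game probability for the shot totals discussed in the text.
by_attempts = {n: prob_at_least_one(P_THREE, n) for n in (2, 6, 10)}
for n, prob in by_attempts.items():
    print(f"{n:2d} attempts: {prob:.3f}")

# Hypothetical mix: half of the games with 2 attempts, half with 10 (mean of 6).
hypothetical_mix = {2: 0.5, 10: 0.5}
mean_attempts = sum(n * share for n, share in hypothetical_mix.items())
mixed_prob = sum(
    share * prob_at_least_one(P_THREE, n) for n, share in hypothetical_mix.items()
)

print(f"Average attempts in the hypothetical mix: {mean_attempts:.0f}")
print(f"Average per-game probability in the mix: {mixed_prob:.3f}")
```

The table matches the .721, .978, and .998 values above. In the made-up mix, the average per-game probability is about .860, well below the .978 obtained by plugging in a flat six attempts. In this example, at least, game-to-game variation in attempts makes a long streak harder than the constant-attempts calculation implies.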

To explore this issue further, I've plotted the number of three-point attempts taken by Korver in each game of his 98-game streak.


The most common scenario (evident in 21 games) is that Korver attempts exactly six treys per game. Games with fewer than six attempts (46 total) are somewhat more common than games with more than six (31). Still, he's had no games with just one attempt and only five with two attempts.

Another interesting question (to me at least) is, how many times has Korver's streak been in danger? In other words, how many times has he gotten late into a game without a made three? I decided to examine in depth the play-by-play sheets for games in which he ended up with only one made three, of which there were 24 games (roughly a quarter of the games in the streak). In those 24 games, Korver made his lone three:

  • 10 times in the first quarter;
  • 4 times in the second quarter;
  • 5 times in the third quarter; and
  • 5 times in the fourth quarter.

Of the five games in which Korver got his lone three-pointer in the fourth quarter, four times there was 5:50 or more left in the game. The closest the streak has come to ending, it appears, was in what became the 54th game of the streak, March 8, 2013 in Boston. On that night, Korver made his only three of the game with just 1:31 remaining in the fourth quarter! (It is possible he went deeper into a game with no threes, then made two or more quick ones, although it seems unlikely. I did not examine play-by-play sheets for games in which he made more than one trey.)

I think I've provided several statistical angles on Korver's streak. However, the USA Today article has even more, including how many miles he runs on the court per 48 minutes, trying to get himself open for threes! Here is the link again to that article.

---
*The DiMaggio statistics and a fairly straightforward discussion of how to calculate streak probabilities are available in this article.

Wednesday, November 27, 2013

Looking Back on the Dodgers' 42-8 Spurt this Past Season

I've done a lot of tweeting about streak-related developments over the past few months (see link to my Twitter feed in the right-hand column), rather than writing deeper analyses. However, I've now conducted a detailed analysis of the L.A. Dodgers' 42-8 spurt midway through the 2013 season (one of the best 50-game stretches in MLB history). My article is available at the baseball website Seamheads.com.

Tuesday, September 10, 2013

It's Official! Pirates Get First Winning Season Since 1992

The Pittsburgh Pirates won their 82nd game of the season last night, 1-0 over the Texas Rangers, to ensure the Steel City franchise's first winning (above .500) season since 1992. The Pirates' streak of 20 straight losing seasons was the longest such streak in any of the four major sports leagues in North America (MLB, NFL, NBA, and NHL).

I created the following graphic to indicate how many wins the Pirates attained from 1993 (the year the string of losing seasons began) to the present. The break-even point of 81 wins is highlighted. Unless you have amazing eyesight, you'll probably want to click on the graphic to enlarge it.


Congratulations to the Pirates and their fans. Not only has this year's team ended the franchise's record streak of losing seasons; it is a virtual cinch to make the playoffs (Pittsburgh's probability of making the playoffs is currently listed as 98.7% on ESPN's standings). I like that the Pirates have achieved a winning record relatively early, rather than taking things down to the final days of the season.