The Iowa Hawkeyes men's basketball team has seen its fair share of ups and downs over the past 20 seasons. From NCAA tournament appearances to NIT rejections, there have been plenty of triumphs and turmoil to examine. This analysis investigates trends in the team's performance over this time frame (1999-2019). Specifically, it seeks to shed light on a variety of hunches the average fan may hold, including the belief that the team performs its worst on Sunday afternoons (Iowa City is the #1 party school after all). More substantively, this report dissects the following broader questions: How has the team performed over time? At home versus on the road? Against different conferences? Are there any other trends of note? All of these questions will be answered with the fairly crude, but not meaningless, method of using point margins (Iowa points scored minus opponent points scored) and win rates (after all, winning isn't everything, it's the only thing).
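To make the two metrics concrete, here is a minimal sketch in Python of how a point margin and a win rate could be computed. The `Game` type, the function names, and the scores are hypothetical and purely illustrative; they are not taken from the data set or from the code behind this report.

```python
from typing import NamedTuple


class Game(NamedTuple):
    """One game result (hypothetical structure, for illustration only)."""
    iowa_points: int
    opponent_points: int


def point_margin(game: Game) -> int:
    # Point margin as defined in the text: Iowa points minus opponent points.
    return game.iowa_points - game.opponent_points


def win_rate(games: list[Game]) -> float:
    # A game counts as a win when the point margin is positive.
    wins = sum(1 for g in games if point_margin(g) > 0)
    return wins / len(games)


# Made-up scores, not real results.
sample_games = [Game(80, 70), Game(65, 72), Game(90, 88), Game(60, 75)]
margins = [point_margin(g) for g in sample_games]
print(margins)                 # [10, -7, 2, -15]
print(win_rate(sample_games))  # 0.5
```

A positive margin means an Iowa win, so the win rate is simply the share of games with a margin above zero.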
One thing that we were interested in was how the day of the week affects performance. Unlike football, there is a lot of variability in what day of the week a basketball game is played on. One of our hypotheses was that the team plays poorly on Sundays compared to any other day of the week. The plot below shows how the number of wins and losses changes based on the day of the week.
This graph allows an easy comparison between the number of wins and losses, and it also illustrates how many games are played on each day of the week. Saturday clearly has the most games and Monday the fewest. This graph is good for initial exploration, but it is difficult to compare win percentages from one day of the week to the next, so a different plot is needed. We created a graph that allows us to accurately compare win percentages on different days.
The horizontal red line across the graph indicates the overall win percentage of the Iowa Hawkeyes. Not only was our initial hunch incorrect, but the team actually plays at an above-average level on Sundays. The two worst days of the week are Wednesday and Thursday. Based on this plot, we think the day of the week a game is played on could influence a bet on the Hawkeyes.
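The comparison behind this plot can be sketched as follows: compute the win percentage for each day of the week and set it against the overall win percentage (the red reference line). This is an illustrative Python sketch under assumed inputs, not the code used to produce the figure; the function names and the sample games are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Fixed English names so the result does not depend on the system locale.
# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def win_pct_by_weekday(games):
    """games: list of (game_date, won) pairs. Returns {weekday name: win fraction}."""
    tally = defaultdict(lambda: [0, 0])  # weekday name -> [wins, games played]
    for game_date, won in games:
        name = WEEKDAYS[game_date.weekday()]
        tally[name][0] += int(won)
        tally[name][1] += 1
    return {day: wins / played for day, (wins, played) in tally.items()}


def overall_win_pct(games):
    """Win fraction across all games: the value the reference line marks."""
    return sum(int(won) for _, won in games) / len(games)


# Made-up results, not real games: two Sundays and two Wednesdays.
sample = [
    (date(2018, 1, 7), True),
    (date(2018, 1, 14), True),
    (date(2018, 1, 3), False),
    (date(2018, 1, 10), True),
]
by_day = win_pct_by_weekday(sample)
baseline = overall_win_pct(sample)
print(by_day)    # {'Sunday': 1.0, 'Wednesday': 0.5}
print(baseline)  # 0.75
```

Days with few games (Monday, in this data) produce noisy percentages, so the game counts from the first plot are worth keeping in mind when reading the second.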
After looking at days of the week, we wanted to explore how the month of the season affects the Hawkeyes' performance. The process is very similar to how we looked at day of the week. Our initial hypothesis was that there is a "March Slump" in which the Hawkeyes tend to do worse in March than in any other month. We started with a bar plot showing the number of wins and losses in each month. Remember, a basketball season typically runs from November to March.
Based on this graph, we see that the number of wins is very high at the start of the season and slowly decreases. We can also see that the number of losses is small at the start of the season and increases drastically in January and February. Only two games have been played in April over the past 20 years, which explains why the bar for that month is so small. The largest number of games is played in January and February. After looking at this plot, we want to explore win percentages by month more closely.
Again, the horizontal red line across the graph represents the average win percentage across all games. It looks like there is not just a "March Slump" but an end-of-season slump. The Iowa Hawkeyes tend to start the season strong and finish below average. Knowing a little bit about the data, though, this is in part because the Hawkeyes play more competitive opponents in the latter half of the season.
After examining how day of the week and month of the year affect performance, we wanted to create a visualization that would include both. We thought this could best be done with a calendar plot using average point differential across all years as the response variable. This posed some difficulty. For example, January 1st may fall on a Monday one year and on a Tuesday the next. Furthermore, a leap year occurs every 4 years, except that years divisible by 100 are not leap years unless they are also divisible by 400. We noticed that the point differential on "leap day" was zero, so we felt comfortable not representing it in the data. As for which day-of-week layout to use, we decided to go with the calendar from 2018. By averaging all 20 years, we thought this would create a sort of block design, washing out the day-of-week effect while still providing a powerful visualization.
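The two calendar details above, the leap-year rule and the mapping of every game onto a single reference calendar, can be sketched in Python as follows. This is an illustration under stated assumptions, not the code behind the plot: the function names, the `REFERENCE_YEAR` constant, and the sample margins are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# The text uses the 2018 calendar for the day-of-week layout (2018 is not a leap year).
REFERENCE_YEAR = 2018


def is_leap_year(year: int) -> bool:
    # Every 4 years, except century years, unless the year is divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def average_margin_by_calendar_day(games):
    """games: iterable of (game_date, point_margin) pairs.

    Returns {date in REFERENCE_YEAR: mean point margin across all seasons}.
    February 29 is skipped, mirroring the decision described in the text.
    """
    by_day = defaultdict(list)
    for game_date, margin in games:
        if (game_date.month, game_date.day) == (2, 29):
            continue  # leap day has no slot in the reference calendar
        slot = date(REFERENCE_YEAR, game_date.month, game_date.day)
        by_day[slot].append(margin)
    return {day: mean(margins) for day, margins in by_day.items()}


# Made-up margins, not real results.
sample = [
    (date(2010, 1, 5), 10),
    (date(2015, 1, 5), -4),
    (date(2016, 2, 29), 0),
]
averaged = average_margin_by_calendar_day(sample)
print(averaged)  # {datetime.date(2018, 1, 5): 3}
```

Because each calendar date lands on a different weekday from year to year, averaging by month and day mixes weekdays within every cell, which is the washing-out effect described above.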
This plot further confirms that the Iowa Hawkeyes do well at the start of the season but then finish poorly. Point differentials are generally positive in November and December but decrease in the latter half of the season. We liked this plot because it demonstrated what we noticed earlier while also providing some insight into the variability of point differential. Negative point differentials are rare in November except around Thanksgiving, and rare in December except before Christmas, around finals time. January and February have many negative average point differentials, but we do see a sprinkling of white and green, which gives us some insight into the variability in how the Iowa Hawkeyes perform.