Report overview

This report summarises player data from football matches played in 2014 and 2015. The dataset contains 368 observations.

Sprint distance by playing position

First, clean up the data by converting any text variables (e.g., home/away) to factors (categorical variables). The “phase_of_season” variable is coded as a number (1-4); convert it to a factor as well.
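The cleaning step might look something like the sketch below, written in base R. It assumes the data sit in a data frame called players and that the home/away column is called location; both names are hypothetical, as are the position labels, and the toy rows are made up purely so the example runs on its own.

```r
# Toy stand-in for the real dataset: values and the names `players`
# and `location` are assumptions made for illustration only.
players <- data.frame(
  position        = c("Defender", "Midfielder", "Forward", "Defender"),
  location        = c("Home", "Away", "Home", "Away"),
  phase_of_season = c(1, 2, 3, 4),
  sprint_distance = c(210, 340, 415, 190),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor.
is_text <- vapply(players, is.character, logical(1))
players[is_text] <- lapply(players[is_text], factor)

# phase_of_season is stored as a number (1-4), so convert it explicitly.
players$phase_of_season <- factor(players$phase_of_season, levels = 1:4)

# Inspect the result: text columns and phase_of_season should now be factors.
str(players)
```

Setting levels = 1:4 keeps the phases in season order even if one phase happens to be missing from a subset of the data.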

Then, create a boxplot of the sprint distance (sprint_distance) covered during matches by each playing position (position). Which playing position tends to cover the greatest sprint distance during matches? Hint: set the fill to “position” to produce different coloured boxplots for each playing position.
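One possible way to draw this plot with ggplot2 is sketched below. The data frame name players, the position labels, and the sprint distances are all made up for illustration; only the column names position and sprint_distance come from the task description.

```r
library(ggplot2)

# Toy stand-in for the real dataset (made-up values, hypothetical labels).
players <- data.frame(
  position = factor(rep(c("Defender", "Midfielder", "Forward"), each = 4)),
  sprint_distance = c(180, 210, 240, 205,
                      300, 340, 310, 365,
                      390, 415, 370, 430)
)

# One box per playing position, coloured by position via the fill aesthetic.
p <- ggplot(players, aes(x = position, y = sprint_distance, fill = position)) +
  geom_boxplot() +
  labs(x = "Playing position", y = "Sprint distance") +
  theme(legend.position = "none")  # legend repeats the x-axis, so hide it

p
```

Comparing the medians (the thick line inside each box) is a reasonable way to judge which position tends to cover the most sprint distance.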

High speed running over the season by playing position

Next, create a boxplot showing how the high-speed running distance (high_speed_distance_5_5_7m_s) covered by each playing position changed across the phases of the season (phase_of_season). Does it look as though any of the playing positions were getting more fatigued as the season went on? Hint: use the facet_wrap() function to produce a separate plot for each level of the “position” variable.
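A faceted version of the plot could be sketched as follows. Again, the data frame name players and the position labels are assumptions, and the distances are random numbers generated only so the example runs; they say nothing about the real data.

```r
library(ggplot2)

# Toy stand-in: 3 hypothetical positions x 4 phases x 5 made-up matches each.
set.seed(1)
players <- expand.grid(
  position        = c("Defender", "Midfielder", "Forward"),
  phase_of_season = factor(1:4),
  match_id        = 1:5
)
players$high_speed_distance_5_5_7m_s <- runif(nrow(players), min = 300, max = 900)

# One panel per position, with one box per phase of the season in each panel.
p <- ggplot(players,
            aes(x = phase_of_season,
                y = high_speed_distance_5_5_7m_s,
                fill = position)) +
  geom_boxplot() +
  facet_wrap(~ position) +
  labs(x = "Phase of season", y = "High-speed running distance") +
  theme(legend.position = "none")

p
```

Because phase_of_season is a factor, ggplot2 draws a separate box for each phase; left as a number, it would be treated as a continuous axis and collapse into a single box per panel. A downward drift in the boxes across phases within a panel would be consistent with fatigue, although a boxplot alone cannot confirm the cause.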

Match outcomes for home and away games

Generate a two-way frequency table (cross-tabulation) of match outcomes (i.e., draw, loss, win) by match location (i.e., home or away). Format your table using the kableExtra package. Is the total number of losses greater at home or away games?

Table 1. Match outcomes for home and away games.
Outcome   Away   Home
Draw        24     43
Loss        70     59
Win         86     86
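A table like Table 1 could be produced along the lines of the sketch below. The column names outcome and location and the data frame name matches are hypothetical; the rows are rebuilt from the counts in Table 1 so that the example is self-contained, whereas in practice the cross-tabulation would be computed directly from the cleaned dataset.

```r
library(kableExtra)

# Rebuild one row per observation from the counts reported in Table 1
# (180 away rows followed by 188 home rows, 368 in total).
matches <- data.frame(
  outcome  = rep(c("Draw", "Loss", "Win", "Draw", "Loss", "Win"),
                 times = c(24, 70, 86, 43, 59, 86)),
  location = rep(c("Away", "Home"), times = c(180, 188))
)

# Two-way frequency table: outcomes in rows, match location in columns.
outcome_by_location <- table(matches$outcome, matches$location)
print(outcome_by_location)

# Convert to a wide data frame and style it with kableExtra.
outcome_df <- as.data.frame.matrix(outcome_by_location)
styled <- kable_styling(
  kbl(outcome_df, caption = "Match outcomes for home and away games."),
  full_width = FALSE
)
styled
```

Reading across the Loss row of Table 1 answers the question: there were more losses away (70) than at home (59).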