Soccer - uwaggs/usportspy GitHub Wiki

Soccer Functions

The usportspy package provides functions for retrieving soccer data: schedules, team and player box scores, and play-by-play data.

soccer_get_schedule

Fetches the soccer game schedule for the men's or women's league.

Parameters:

  • gender (str): Must be "m" (men's soccer) or "w" (women's soccer).

Returns:

  • pd.DataFrame: A DataFrame containing the schedule of soccer games, with the columns: league, season, game_id, date, exhibition, conference, playoffs, championship, home_team, away_team, home_score, away_score, game_link.

Example:

from usportspy import soccer_get_schedule

# Get the schedule for men's soccer
schedule_male = soccer_get_schedule("m")
print(schedule_male.head())

# Get the schedule for women's soccer
schedule_female = soccer_get_schedule("w")
print(schedule_female.head())

Expected output:

  league   season game_id        date  exhibition  conference  playoffs  championship          home_team          away_team  home_score  away_score game_link
0   msoc  2010-11     NaN  2010-08-18        True       False     False         False       saskatchewan  mountroyalcollege         1.0         0.0       NaN
1   msoc  2010-11     NaN  2010-08-18        True       False     False         False    seattle-pacific     trinitywestern         0.0         1.0       NaN
2   msoc  2010-11     NaN  2010-08-20        True       False     False         False  westernwashington                ubc         0.0         2.0       NaN
3   msoc  2010-11     NaN  2010-08-21        True       False     False         False                ubc       ubc-okanagan         4.0         1.0       NaN
4   msoc  2010-11     NaN  2010-08-21        True       False     False         False                ufv           capilano         7.0         2.0       NaN
  league   season game_id        date  exhibition  conference  playoffs  championship          home_team       away_team  home_score  away_score game_link
0   wsoc  2010-11     NaN  2010-08-15        True       False     False         False            seattle  trinitywestern         2.0         1.0       NaN
1   wsoc  2010-11     NaN  2010-08-15        True       False     False         False        ubcokanagan             ubc         0.0         6.0       NaN
2   wsoc  2010-11     NaN  2010-08-16        True       False     False         False  westernwashington  trinitywestern         2.0         1.0       NaN
3   wsoc  2010-11     NaN  2010-08-18        True       False     False         False     trinitywestern    saintmartins         3.0         1.0       NaN
4   wsoc  2010-11     NaN  2010-08-18        True       False     False         False                ubc  langaracollege         6.0         0.0       NaN
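Once the schedule is loaded, a common next step is to narrow it to games that count. The sketch below is a minimal illustration, assuming the boolean `conference` column and the numeric score columns behave as in the output above. The helper name `conference_results`, the `goal_diff` column, and the sample rows are made up for the example and are not part of usportspy.

```python
import pandas as pd


def conference_results(schedule: pd.DataFrame) -> pd.DataFrame:
    """Keep conference games with recorded scores and add a goal difference."""
    has_score = schedule["home_score"].notna() & schedule["away_score"].notna()
    played = schedule[schedule["conference"] & has_score].copy()
    # goal_diff is a hypothetical derived column: positive means a home win.
    played["goal_diff"] = played["home_score"] - played["away_score"]
    return played


# Hand-made rows that reuse the documented column names (not real results).
sample = pd.DataFrame({
    "home_team": ["a", "b", "c"],
    "away_team": ["b", "c", "a"],
    "conference": [True, False, True],
    "home_score": [2.0, 1.0, float("nan")],
    "away_score": [0.0, 1.0, float("nan")],
})
results = conference_results(sample)
print(results)
```

In practice the same helper could be applied to the DataFrame returned by soccer_get_schedule("m") or soccer_get_schedule("w").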

soccer_get_team_box_score

Fetches team box scores for men's or women's soccer games, optionally restricted to specific seasons.

Parameters:

  • gender (str): Must be "m" (men's soccer) or "w" (women's soccer).
  • seasons (list of int, optional): Seasons to include, each given by its starting year (for example, 2019 for the 2019-20 season). If omitted, data for all seasons is returned.

Returns:

  • pd.DataFrame: A DataFrame containing the team box scores for the specified seasons, with the columns: team, half, shots, saves, corner_kicks, fouls, team_status, date, sport, link, game_id, season, path.
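The seasons argument takes starting years, while the season column in the returned data holds labels such as 2019-20. The sketch below shows one way to translate between the two when filtering or joining by season. The helper `season_label` is hypothetical and not part of usportspy, and it assumes every label follows the "YYYY-YY" pattern seen in the example output.

```python
def season_label(start_year: int) -> str:
    """Turn a starting year such as 2019 into a label such as '2019-20'."""
    end_two_digits = (start_year + 1) % 100
    return f"{start_year}-{end_two_digits:02d}"


# Labels for the two seasons requested in the example on this page.
labels = [season_label(year) for year in [2019, 2021]]
print(labels)
```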

Example:

from usportspy import soccer_get_team_box_score

# Get team box scores for men's soccer for the 2019 and 2021 seasons
team_box_scores_male = soccer_get_team_box_score("m", [2019, 2021])
print(team_box_scores_male.head())

# Get team box scores for women's soccer for the 2019 and 2021 seasons
team_box_scores_female = soccer_get_team_box_score("w", [2019, 2021])
print(team_box_scores_female.head())

Expected output:

              team  half  shots  saves  corner_kicks  fouls team_status        date sport                                               link        game_id   season path
0  Thompson Rivers   1.0      1      1             0      4        away  2019-10-26  msoc  https://en.usports.ca/sports/msoc/2019-20p/box...  20191026_d3p9  2019-20  NaN
1  Thompson Rivers   2.0      3      3             1      6        away  2019-10-26  msoc  https://en.usports.ca/sports/msoc/2019-20p/box...  20191026_d3p9  2019-20  NaN
2      Mount Royal   1.0      7      0             1      8        home  2019-10-26  msoc  https://en.usports.ca/sports/msoc/2019-20p/box...  20191026_d3p9  2019-20  NaN
3      Mount Royal   2.0      8      2             2      5        home  2019-10-26  msoc  https://en.usports.ca/sports/msoc/2019-20p/box...  20191026_d3p9  2019-20  NaN
4         McMaster   1.0      0      0             0      0        away  2019-10-27  msoc  https://en.usports.ca/sports/msoc/2019-20p/box...  20191027_i7yt  2019-20  NaN
         date sport                                               link        game_id  ... corner_kicks fouls  team_status                                          path
0  2019-11-01  wsoc  https://en.usports.ca/sports/wsoc/2019-20p/box...  20191101_mddg  ...            1     2         away  data/wsoc_team_box/wsoc_team_box_2019-20.csv
1  2019-11-01  wsoc  https://en.usports.ca/sports/wsoc/2019-20p/box...  20191101_mddg  ...            2     0         away  data/wsoc_team_box/wsoc_team_box_2019-20.csv
2  2019-11-01  wsoc  https://en.usports.ca/sports/wsoc/2019-20p/box...  20191101_mddg  ...            1     2         home  data/wsoc_team_box/wsoc_team_box_2019-20.csv
3  2019-11-01  wsoc  https://en.usports.ca/sports/wsoc/2019-20p/box...  20191101_mddg  ...            1     2         home  data/wsoc_team_box/wsoc_team_box_2019-20.csv
4  2019-11-03  wsoc  https://en.usports.ca/sports/wsoc/2019-20p/box...  20191103_ssk2  ...            4     2         away  data/wsoc_team_box/wsoc_team_box_2019-20.csv

[5 rows x 13 columns]
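The output above has one row per team per half, so full-game totals require an aggregation. The following sketch sums the counting statistics for each team in each game. It assumes every row is a single-half record (no pre-computed total rows), which the sample output suggests but does not guarantee. The helper name `game_totals` and the sample rows are illustrative only.

```python
import pandas as pd

STAT_COLUMNS = ["shots", "saves", "corner_kicks", "fouls"]


def game_totals(team_box: pd.DataFrame) -> pd.DataFrame:
    """Sum the per-half rows into one row per team per game."""
    keys = ["game_id", "team", "team_status"]
    return team_box.groupby(keys, as_index=False)[STAT_COLUMNS].sum()


# Hand-made rows that reuse the documented column names (not real data).
sample = pd.DataFrame({
    "game_id": ["g1", "g1", "g1", "g1"],
    "team": ["Away U", "Away U", "Home U", "Home U"],
    "team_status": ["away", "away", "home", "home"],
    "half": [1.0, 2.0, 1.0, 2.0],
    "shots": [1, 3, 7, 8],
    "saves": [1, 3, 0, 2],
    "corner_kicks": [0, 1, 1, 2],
    "fouls": [4, 6, 8, 5],
})
totals = game_totals(sample)
print(totals)
```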

soccer_get_player_box_score

Fetches player box scores for men's or women's soccer games, optionally restricted to specific seasons.

Parameters:

  • gender (str): Must be "m" (men's soccer) or "w" (women's soccer).
  • seasons (list of int, optional): Seasons to include, each given by its starting year (for example, 2019 for the 2019-20 season). If omitted, data for all seasons is returned.

Returns:

  • pd.DataFrame: A DataFrame containing the player box scores for the specified seasons, with the columns: jersey, position, name, team, shots, shots_on_goal, goals, assists, shots_on_goal_against, goals_against, saves, minutes_played, card_type, card_time_received, player_links, team_status, date, sport, link, game_id, season, card_two_type, card_two_time_received, card_three_type, card_three_time_received, path. In the example output below, the women's data also includes three extra columns named _43, _44, and _45, for 29 columns in total.

Example:

from usportspy import soccer_get_player_box_score

# Get player box scores for men's soccer for the 2019 and 2021 seasons
player_box_scores_male = soccer_get_player_box_score("m", [2019, 2021])
print(player_box_scores_male.head())

# Get player box scores for women's soccer for the 2019 and 2021 seasons
player_box_scores_female = soccer_get_player_box_score("w", [2019, 2021])
print(player_box_scores_female.head())

Expected output:

   jersey position              name             team  shots  shots_on_goal  ...   season  card_two_type  card_two_time_received  card_three_type  card_three_time_received  path
0     1.0       gk   Jackson Gardner  Thompson Rivers      0              0  ...  2019-20            NaN                     NaN              NaN                       NaN   NaN
1     2.0        d      Jan Pirretas  Thompson Rivers      0              0  ...  2019-20            NaN                     NaN              NaN                       NaN   NaN
2     4.0        d       Josh Banton  Thompson Rivers      0              0  ...  2019-20            NaN                     NaN              NaN                       NaN   NaN
3     7.0        m  Justin Donaldson  Thompson Rivers      0              0  ...  2019-20            NaN                     NaN              NaN                       NaN   NaN
4     8.0        d     Callum Etches  Thompson Rivers      0              0  ...  2019-20            NaN                     NaN              NaN                       NaN   NaN

[5 rows x 26 columns]
   jersey position             name     team  shots  shots_on_goal  goals  ...  card_two_time_received  card_three_type  card_three_time_received  _43  _44 _45 path
0     NaN      NaN             Team  MacEwan      0              0      0  ...                     NaN              NaN                       NaN  NaN  NaN NaN  NaN
1     2.0      NaN    Megan Lemoine  MacEwan      1              0      0  ...                     NaN              NaN                       NaN  NaN  NaN NaN  NaN
2     3.0      NaN      Anna McPhee  MacEwan      1              0      0  ...                     NaN              NaN                       NaN  NaN  NaN NaN  NaN
3     5.0      NaN   Jamie Erickson  MacEwan      1              1      0  ...                     NaN              NaN                       NaN  NaN  NaN NaN  NaN
4     6.0      NaN  Kaylin Hermautz  MacEwan      0              0      0  ...                     NaN              NaN                       NaN  NaN  NaN NaN  NaN

[5 rows x 29 columns]
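The women's output above contains columns named _43, _44, and _45 and a row whose name is "Team". A possible cleanup step is sketched below. It assumes that underscore-plus-digits columns are empty placeholders and that "Team" rows are team-level entries rather than players. Neither assumption is confirmed by the package documentation, and the helper `tidy_player_box` is hypothetical.

```python
import re

import pandas as pd

# Matches placeholder-style column names such as "_43".
PLACEHOLDER = re.compile(r"^_\d+$")


def tidy_player_box(player_box: pd.DataFrame) -> pd.DataFrame:
    """Drop placeholder columns and rows that describe a team, not a player."""
    keep = [col for col in player_box.columns if not PLACEHOLDER.match(str(col))]
    tidy = player_box[keep]
    # Assumption: a name of "Team" marks a team-level row.
    return tidy[tidy["name"] != "Team"].reset_index(drop=True)


# Hand-made rows that mimic the shape of the output above (not real data).
sample = pd.DataFrame({
    "name": ["Team", "Player A", "Player B"],
    "shots": [0, 1, 1],
    "_43": [float("nan")] * 3,
    "_44": [float("nan")] * 3,
    "path": [float("nan")] * 3,
})
tidy = tidy_player_box(sample)
print(tidy)
```

Inspecting the dropped columns before discarding them is advisable, since they may hold values in other seasons.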

soccer_get_pbp

Fetches play-by-play (PBP) data for men's or women's soccer games, optionally restricted to specific seasons.

Parameters:

  • gender (str): Must be "m" (men's soccer) or "w" (women's soccer).
  • seasons (list of int, optional): Seasons to include, each given by its starting year (for example, 2019 for the 2019-20 season). If omitted, data for all seasons is returned.

Returns:

  • pd.DataFrame: A DataFrame containing the play-by-play data for the specified seasons, with the columns: halves, time, player, substitution, assist_player, save, team_status, score_away, score_home, description, event, action, result, assist, card_type, date, sport, link, game_id, season, path.

Example:

from usportspy import soccer_get_pbp

# Get play-by-play data for men's soccer for the 2019 and 2021 seasons
pbp_male = soccer_get_pbp("m", [2019, 2021])
print(pbp_male.head())

# Get play-by-play data for women's soccer for the 2019 and 2021 seasons
pbp_female = soccer_get_pbp("w", [2019, 2021])
print(pbp_female.head())

Expected output:

   halves  time           player substitution assist_player save  ...        date  sport                                               link        game_id   season path
0     1.0  0:00  Jackson Gardner          NaN           NaN  NaN  ...  2019-10-26   msoc  https://en.usports.ca/sports/msoc/2019-20p/box...  20191026_d3p9  2019-20  NaN
1     1.0  0:00     Kyran Valley          NaN           NaN  NaN  ...  2019-10-26   msoc  https://en.usports.ca/sports/msoc/2019-20p/box...  20191026_d3p9  2019-20  NaN
2     1.0   NaN              NaN          NaN           NaN  NaN  ...  2019-10-26   msoc  https://en.usports.ca/sports/msoc/2019-20p/box...  20191026_d3p9  2019-20  NaN
3     1.0   NaN              NaN          NaN           NaN  NaN  ...  2019-10-26   msoc  https://en.usports.ca/sports/msoc/2019-20p/box...  20191026_d3p9  2019-20  NaN
4     1.0  7:02  Miguel Da Rocha          NaN           NaN  NaN  ...  2019-10-26   msoc  https://en.usports.ca/sports/msoc/2019-20p/box...  20191026_d3p9  2019-20  NaN

[5 rows x 21 columns]
   halves   time                                     player substitution assist_player  ... sport                                               link        game_id   season path
0     1.0  45:00  Bianca Castillo is starting as the Goalie          NaN           NaN  ...  wsoc  https://en.usports.ca/sports/wsoc/2019-20/boxs...  20190819_lhca  2019-20  NaN
1     1.0  45:00           Sydney is starting as the Goalie          NaN           NaN  ...  wsoc  https://en.usports.ca/sports/wsoc/2019-20/boxs...  20190819_lhca  2019-20  NaN
2     1.0  44:23                                        NaN          NaN           NaN  ...  wsoc  https://en.usports.ca/sports/wsoc/2019-20/boxs...  20190819_lhca  2019-20  NaN
3     1.0  40:43                         Missed Charly goes          NaN           NaN  ...  wsoc  https://en.usports.ca/sports/wsoc/2019-20/boxs...  20190819_lhca  2019-20  NaN
4     1.0  39:41                     Megan Lemoine Mac Kami          NaN           NaN  ...  wsoc  https://en.usports.ca/sports/wsoc/2019-20/boxs...  20190819_lhca  2019-20  NaN

[5 rows x 21 columns]
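The time column above holds clock strings such as 7:02 along with missing values, which makes sorting and arithmetic awkward. The sketch below converts those strings to seconds, assuming every non-missing value has the "minutes:seconds" form shown in the output. The helper `clock_to_seconds` is hypothetical. It does not reconcile the direction of the clock, which appears to count up in the men's sample and down from 45:00 in the women's sample.

```python
import pandas as pd


def clock_to_seconds(clock):
    """Convert a 'minutes:seconds' string to seconds; missing values stay NaN."""
    if pd.isna(clock):
        return float("nan")
    minutes, seconds = str(clock).split(":")
    return int(minutes) * 60 + int(seconds)


# Clock values copied from the shape of the output above, plus a missing one.
sample = pd.DataFrame({"time": ["0:00", "7:02", None, "45:00"]})
sample["seconds"] = sample["time"].apply(clock_to_seconds)
print(sample)
```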