Soccer - uwaggs/usportspy GitHub Wiki
Soccer Functions
The usportspy package provides functions for retrieving soccer data: schedules, team and player box scores, and play-by-play data.
soccer_get_schedule
Fetches the schedule of men's or women's soccer games.
Parameters:
gender (str): Must be "m" or "w".
Returns:
pd.DataFrame: A DataFrame containing the schedule for soccer games. The columns of the returned DataFrame are: league, season, game_id, date, exhibition, conference, playoffs, championship, home_team, away_team, home_score, away_score, game_link.
Example:
from usportspy import soccer_get_schedule
# Get the schedule for men's soccer
schedule_male = soccer_get_schedule("m")
print(schedule_male.head())
# Get the schedule for women's soccer
schedule_female = soccer_get_schedule("w")
print(schedule_female.head())
Expected output:
league season game_id date exhibition conference playoffs championship home_team away_team home_score away_score game_link
0 msoc 2010-11 NaN 2010-08-18 True False False False saskatchewan mountroyalcollege 1.0 0.0 NaN
1 msoc 2010-11 NaN 2010-08-18 True False False False seattle-pacific trinitywestern 0.0 1.0 NaN
2 msoc 2010-11 NaN 2010-08-20 True False False False westernwashington ubc 0.0 2.0 NaN
3 msoc 2010-11 NaN 2010-08-21 True False False False ubc ubc-okanagan 4.0 1.0 NaN
4 msoc 2010-11 NaN 2010-08-21 True False False False ufv capilano 7.0 2.0 NaN
league season game_id date exhibition conference playoffs championship home_team away_team home_score away_score game_link
0 wsoc 2010-11 NaN 2010-08-15 True False False False seattle trinitywestern 2.0 1.0 NaN
1 wsoc 2010-11 NaN 2010-08-15 True False False False ubcokanagan ubc 0.0 6.0 NaN
2 wsoc 2010-11 NaN 2010-08-16 True False False False westernwashington trinitywestern 2.0 1.0 NaN
3 wsoc 2010-11 NaN 2010-08-18 True False False False trinitywestern saintmartins 3.0 1.0 NaN
4 wsoc 2010-11 NaN 2010-08-18 True False False False ubc langaracollege 6.0 0.0 NaN
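Once a schedule is loaded, a common next step is to label each game's result. The following is a minimal sketch rather than part of usportspy: it rebuilds the five men's rows shown above as a small DataFrame (a subset of the columns) so it runs without network access, and the `add_result` helper and `winner` column are hypothetical names introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Rows copied from the men's expected output above (subset of columns).
schedule = pd.DataFrame(
    {
        "date": ["2010-08-18", "2010-08-18", "2010-08-20", "2010-08-21", "2010-08-21"],
        "home_team": ["saskatchewan", "seattle-pacific", "westernwashington", "ubc", "ufv"],
        "away_team": ["mountroyalcollege", "trinitywestern", "ubc", "ubc-okanagan", "capilano"],
        "home_score": [1.0, 0.0, 0.0, 4.0, 7.0],
        "away_score": [0.0, 1.0, 2.0, 1.0, 2.0],
    }
)


def add_result(df):
    """Return a copy with a hypothetical `winner` column: "home", "away", or "draw".

    Games with a missing score keep a missing value in `winner`.
    """
    out = df.copy()
    diff = out["home_score"] - out["away_score"]
    # np.sign gives 1.0, -1.0, 0.0, or NaN; NaN has no key below, so it stays NaN.
    out["winner"] = np.sign(diff).map({1.0: "home", -1.0: "away", 0.0: "draw"})
    return out


result = add_result(schedule)

# All games involving one team, whether it played at home or away.
ubc_games = result[(result["home_team"] == "ubc") | (result["away_team"] == "ubc")]
print(ubc_games[["date", "home_team", "away_team", "winner"]])
```

The same helper should work on the full DataFrame returned by soccer_get_schedule, assuming the score columns are numeric as in the sample output.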
soccer_get_team_box_score
Fetches team box scores for men's or women's soccer games, optionally filtered by season.
Parameters:
gender (str): Must be "m" or "w".
seasons (list of int, optional): Seasons to include, each given by its starting year (for example, 2019 for the 2019-20 season). If omitted, data for all seasons is returned.
Returns:
pd.DataFrame: A DataFrame containing the team box scores for the specified seasons. The columns of the returned DataFrame are: team, half, shots, saves, corner_kicks, fouls, team_status, date, sport, link, game_id, season, path.
Example:
from usportspy import soccer_get_team_box_score
# Get team box scores for men's soccer for the 2019 and 2021 seasons
team_box_scores_male = soccer_get_team_box_score("m", [2019, 2021])
print(team_box_scores_male.head())
# Get team box scores for women's soccer for the 2019 and 2021 seasons
team_box_scores_female = soccer_get_team_box_score("w", [2019, 2021])
print(team_box_scores_female.head())
Expected output:
team half shots saves corner_kicks fouls team_status date sport link game_id season path
0 Thompson Rivers 1.0 1 1 0 4 away 2019-10-26 msoc https://en.usports.ca/sports/msoc/2019-20p/box... 20191026_d3p9 2019-20 NaN
1 Thompson Rivers 2.0 3 3 1 6 away 2019-10-26 msoc https://en.usports.ca/sports/msoc/2019-20p/box... 20191026_d3p9 2019-20 NaN
2 Mount Royal 1.0 7 0 1 8 home 2019-10-26 msoc https://en.usports.ca/sports/msoc/2019-20p/box... 20191026_d3p9 2019-20 NaN
3 Mount Royal 2.0 8 2 2 5 home 2019-10-26 msoc https://en.usports.ca/sports/msoc/2019-20p/box... 20191026_d3p9 2019-20 NaN
4 McMaster 1.0 0 0 0 0 away 2019-10-27 msoc https://en.usports.ca/sports/msoc/2019-20p/box... 20191027_i7yt 2019-20 NaN
date sport link game_id ... corner_kicks fouls team_status path
0 2019-11-01 wsoc https://en.usports.ca/sports/wsoc/2019-20p/box... 20191101_mddg ... 1 2 away data/wsoc_team_box/wsoc_team_box_2019-20.csv
1 2019-11-01 wsoc https://en.usports.ca/sports/wsoc/2019-20p/box... 20191101_mddg ... 2 0 away data/wsoc_team_box/wsoc_team_box_2019-20.csv
2 2019-11-01 wsoc https://en.usports.ca/sports/wsoc/2019-20p/box... 20191101_mddg ... 1 2 home data/wsoc_team_box/wsoc_team_box_2019-20.csv
3 2019-11-01 wsoc https://en.usports.ca/sports/wsoc/2019-20p/box... 20191101_mddg ... 1 2 home data/wsoc_team_box/wsoc_team_box_2019-20.csv
4 2019-11-03 wsoc https://en.usports.ca/sports/wsoc/2019-20p/box... 20191103_ssk2 ... 4 2 away data/wsoc_team_box/wsoc_team_box_2019-20.csv
[5 rows x 13 columns]
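The sample output has one row per team per half, so game-level totals need an aggregation step. The sketch below is an illustration, not part of usportspy. It rebuilds the first four men's rows from the output above and sums the halves. It assumes each row holds only that half's figures (the sample suggests this, but the source does not state it) and that no separate full-game row is present.

```python
import pandas as pd

# First four rows of the men's expected output above (subset of columns).
team_box = pd.DataFrame(
    {
        "team": ["Thompson Rivers", "Thompson Rivers", "Mount Royal", "Mount Royal"],
        "half": [1.0, 2.0, 1.0, 2.0],
        "shots": [1, 3, 7, 8],
        "saves": [1, 3, 0, 2],
        "corner_kicks": [0, 1, 1, 2],
        "fouls": [4, 6, 8, 5],
        "team_status": ["away", "away", "home", "home"],
        "game_id": ["20191026_d3p9"] * 4,
    }
)

stat_cols = ["shots", "saves", "corner_kicks", "fouls"]

# Collapse the per-half rows into one row per team per game.
game_totals = team_box.groupby(
    ["game_id", "team", "team_status"], as_index=False
)[stat_cols].sum()

print(game_totals)
```

If the real data turns out to contain cumulative or full-game rows, filter on the half column before summing.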
soccer_get_player_box_score
Fetches player box scores for men's or women's soccer games, optionally filtered by season.
Parameters:
gender (str): Must be "m" or "w".
seasons (list of int, optional): Seasons to include, each given by its starting year (for example, 2019 for the 2019-20 season). If omitted, data for all seasons is returned.
Returns:
pd.DataFrame: A DataFrame containing the player box scores for the specified seasons. The columns of the returned DataFrame are: jersey, position, name, team, shots, shots_on_goal, goals, assists, shots_on_goal_against, goals_against, saves, minutes_played, card_type, card_time_received, player_links, team_status, date, sport, link, game_id, season, card_two_type, card_two_time_received, card_three_type, card_three_time_received, path. Some results may also contain extra unnamed columns, such as _43, _44, and _45 in the women's sample output below.
Example:
from usportspy import soccer_get_player_box_score
# Get player box scores for men's soccer for the 2019 and 2021 seasons
player_box_scores_male = soccer_get_player_box_score("m", [2019, 2021])
print(player_box_scores_male.head())
# Get player box scores for women's soccer for the 2019 and 2021 seasons
player_box_scores_female = soccer_get_player_box_score("w", [2019, 2021])
print(player_box_scores_female.head())
Expected output:
jersey position name team shots shots_on_goal ... season card_two_type card_two_time_received card_three_type card_three_time_received path
0 1.0 gk Jackson Gardner Thompson Rivers 0 0 ... 2019-20 NaN NaN NaN NaN NaN
1 2.0 d Jan Pirretas Thompson Rivers 0 0 ... 2019-20 NaN NaN NaN NaN NaN
2 4.0 d Josh Banton Thompson Rivers 0 0 ... 2019-20 NaN NaN NaN NaN NaN
3 7.0 m Justin Donaldson Thompson Rivers 0 0 ... 2019-20 NaN NaN NaN NaN NaN
4 8.0 d Callum Etches Thompson Rivers 0 0 ... 2019-20 NaN NaN NaN NaN NaN
[5 rows x 26 columns]
jersey position name team shots shots_on_goal goals ... card_two_time_received card_three_type card_three_time_received _43 _44 _45 path
0 NaN NaN Team MacEwan 0 0 0 ... NaN NaN NaN NaN NaN NaN NaN
1 2.0 NaN Megan Lemoine MacEwan 1 0 0 ... NaN NaN NaN NaN NaN NaN NaN
2 3.0 NaN Anna McPhee MacEwan 1 0 0 ... NaN NaN NaN NaN NaN NaN NaN
3 5.0 NaN Jamie Erickson MacEwan 1 1 0 ... NaN NaN NaN NaN NaN NaN NaN
4 6.0 NaN Kaylin Hermautz MacEwan 0 0 0 ... NaN NaN NaN NaN NaN NaN NaN
[5 rows x 29 columns]
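The women's sample above starts with a row named "Team" that has no jersey number, which looks like a team-level entry rather than a player. The sketch below is an illustration, not part of usportspy. It rebuilds those five rows and drops that entry before ranking players by shots. Treating every row named "Team" as a non-player entry is an assumption based on this sample only.

```python
import pandas as pd

# Rows copied from the women's expected output above (subset of columns).
player_box = pd.DataFrame(
    {
        "jersey": [None, 2.0, 3.0, 5.0, 6.0],
        "name": ["Team", "Megan Lemoine", "Anna McPhee", "Jamie Erickson", "Kaylin Hermautz"],
        "team": ["MacEwan"] * 5,
        "shots": [0, 1, 1, 1, 0],
        "shots_on_goal": [0, 0, 0, 1, 0],
        "goals": [0, 0, 0, 0, 0],
    }
)

# Assumption: a row whose name is exactly "Team" is a team-level entry.
players = player_box[player_box["name"] != "Team"]

# Per-player totals, most shots first (ties keep alphabetical order).
totals = (
    players.groupby(["team", "name"], as_index=False)[["shots", "shots_on_goal", "goals"]]
    .sum()
    .sort_values("shots", ascending=False, kind="stable")
)

print(totals)
```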
soccer_get_pbp
Fetches play-by-play (PBP) data for men's or women's soccer games, optionally filtered by season.
Parameters:
gender (str): Must be "m" or "w".
seasons (list of int, optional): Seasons to include, each given by its starting year (for example, 2019 for the 2019-20 season). If omitted, data for all seasons is returned.
Returns:
pd.DataFrame: A DataFrame containing the play-by-play data for the specified seasons. The columns of the returned DataFrame are: halves, time, player, substitution, assist_player, save, team_status, score_away, score_home, description, event, action, result, assist, card_type, date, sport, link, game_id, season, path.
Example:
from usportspy import soccer_get_pbp
# Get play-by-play data for men's soccer for the 2019 and 2021 seasons
pbp_male = soccer_get_pbp("m", [2019, 2021])
print(pbp_male.head())
# Get play-by-play data for women's soccer for the 2019 and 2021 seasons
pbp_female = soccer_get_pbp("w", [2019, 2021])
print(pbp_female.head())
Expected output:
halves time player substitution assist_player save ... date sport link game_id season path
0 1.0 0:00 Jackson Gardner NaN NaN NaN ... 2019-10-26 msoc https://en.usports.ca/sports/msoc/2019-20p/box... 20191026_d3p9 2019-20 NaN
1 1.0 0:00 Kyran Valley NaN NaN NaN ... 2019-10-26 msoc https://en.usports.ca/sports/msoc/2019-20p/box... 20191026_d3p9 2019-20 NaN
2 1.0 NaN NaN NaN NaN NaN ... 2019-10-26 msoc https://en.usports.ca/sports/msoc/2019-20p/box... 20191026_d3p9 2019-20 NaN
3 1.0 NaN NaN NaN NaN NaN ... 2019-10-26 msoc https://en.usports.ca/sports/msoc/2019-20p/box... 20191026_d3p9 2019-20 NaN
4 1.0 7:02 Miguel Da Rocha NaN NaN NaN ... 2019-10-26 msoc https://en.usports.ca/sports/msoc/2019-20p/box... 20191026_d3p9 2019-20 NaN
[5 rows x 21 columns]
halves time player substitution assist_player ... sport link game_id season path
0 1.0 45:00 Bianca Castillo is starting as the Goalie NaN NaN ... wsoc https://en.usports.ca/sports/wsoc/2019-20/boxs... 20190819_lhca 2019-20 NaN
1 1.0 45:00 Sydney is starting as the Goalie NaN NaN ... wsoc https://en.usports.ca/sports/wsoc/2019-20/boxs... 20190819_lhca 2019-20 NaN
2 1.0 44:23 NaN NaN NaN ... wsoc https://en.usports.ca/sports/wsoc/2019-20/boxs... 20190819_lhca 2019-20 NaN
3 1.0 40:43 Missed Charly goes NaN NaN ... wsoc https://en.usports.ca/sports/wsoc/2019-20/boxs... 20190819_lhca 2019-20 NaN
4 1.0 39:41 Megan Lemoine Mac Kami NaN NaN ... wsoc https://en.usports.ca/sports/wsoc/2019-20/boxs... 20190819_lhca 2019-20 NaN
[5 rows x 21 columns]
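The time column in the sample output is a "minutes:seconds" string and is sometimes missing, so sorting or filtering by game clock needs a numeric version. The sketch below is an illustration, not part of usportspy. It rebuilds the five men's rows above and adds a hypothetical `seconds` column using a hypothetical `clock_to_seconds` helper. It assumes every non-missing value has exactly one colon. Note that the women's sample appears to count down from 45:00 while the men's counts up, so the values may not be directly comparable across games.

```python
import pandas as pd

# Rows copied from the men's expected output above (subset of columns).
pbp = pd.DataFrame(
    {
        "halves": [1.0, 1.0, 1.0, 1.0, 1.0],
        "time": ["0:00", "0:00", None, None, "7:02"],
        "player": ["Jackson Gardner", "Kyran Valley", None, None, "Miguel Da Rocha"],
    }
)


def clock_to_seconds(clock):
    """Convert a "minutes:seconds" string to seconds; missing values stay NaN."""
    if pd.isna(clock):
        return float("nan")
    minutes, seconds = str(clock).split(":")
    return int(minutes) * 60 + int(seconds)


# Hypothetical helper column, not returned by soccer_get_pbp itself.
pbp["seconds"] = pbp["time"].apply(clock_to_seconds)

print(pbp[["time", "seconds", "player"]])
```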