Annexures | MySQL - SampathVSS/DataAnalysis-IPL-Datasets-2008-to-2017 GitHub Wiki

This section will provide all the MySQL queries used to achieve the desired results

1. Quick Insights

1.1 Number of Matches conducted in each year and the period of the tournament

image

Query :

SELECT
Season_Year,
count(*) as total_matches,
min(match_date) as start_date,
max(match_date) as end_date,
DATEDIFF(max(match_date),min(match_date)) as period_of_tournament
FROM matches
GROUP BY season_year

Explanation :

  • Season_Year: The year of the IPL season.
  • COUNT(*): Counts the total number of matches in each season.
  • MIN(match_date): Finds the earliest match date in the season.
  • MAX(match_date): Finds the latest match date in the season.
  • DATEDIFF(MAX(match_date), MIN(match_date)): Calculates the duration of the tournament in days.
  • GROUP BY Season_Year: Groups the results by season year.

1.2 Number of matches conducted in each year and in each stadium

image

Query :

This query uses a Common Table Expression (CTE) to list the number of matches conducted in each venue for each season year. The final output will summarize the matches conducted in each stadium across different years.

WITH cte AS
(SELECT
venue_name,
SUM(CASE WHEN season_year = 2008 THEN 1 ELSE 0 END) AS 2008,
SUM(CASE WHEN season_year = 2009 THEN 1 ELSE 0 END) AS 2009,
SUM(CASE WHEN season_year = 2010 THEN 1 ELSE 0 END) AS 2010,
SUM(CASE WHEN season_year = 2011 THEN 1 ELSE 0 END) AS 2011,
SUM(CASE WHEN season_year = 2012 THEN 1 ELSE 0 END) AS 2012,
SUM(CASE WHEN season_year = 2013 THEN 1 ELSE 0 END) AS 2013,
SUM(CASE WHEN season_year = 2014 THEN 1 ELSE 0 END) AS 2014,
SUM(CASE WHEN season_year = 2015 THEN 1 ELSE 0 END) AS 2015,
SUM(CASE WHEN season_year = 2016 THEN 1 ELSE 0 END) AS 2016,
SUM(CASE WHEN season_year = 2017 THEN 1 ELSE 0 END) AS 2017,
count(*) as total_matches
FROM ipl.matches
GROUP BY season_year,venue_name
ORDER BY total_matches DESC)

With referring to the above CTE table another grouping will be done to summarize the table between stadium name and season years

SELECT
venue_name,
SUM(2008) as 2008,
SUM(2009) as 2009,
SUM(2010) as 2010,
SUM(2011) as 2011,
SUM(2012) as 2012,
SUM(2013) as 2013,
SUM(2014) as 2014,
SUM(2015) as 2015,
SUM(2016) as 2016,
SUM(2017) as 2017,
SUM(total_matches) as total_matches
FROM cte
GROUP BY venue_name
ORDER BY total_matches DESC

Explanation:

WITH cte AS (...): Defines a Common Table Expression to calculate matches per venue per season.
venue_name: The name of the venue where the matches were conducted.
SUM(CASE WHEN season_year = YYYY THEN 1 ELSE 0 END): Calculates the number of matches for the given year.
COUNT(*): Counts the total number of matches in each venue.
GROUP BY season_year, venue_name: Groups the results by season year and venue name.
ORDER BY total_matches DESC: Orders the results by total matches in descending order.

In the final SELECT statement:

SUM(YYYY): Sums the matches for each year for each venue.
SUM(total_matches): Sums the total matches for each venue.
GROUP BY venue_name: Groups the final results by venue name.
ORDER BY total_matches DESC: Orders the final results by total matches in descending order.

1.3 Number of matches played by each team in each year in each stadium

Query :

image

Match table will be considered for creating CTE tables for both team 1 and team 2

-- Step 1: Define CTE for Team 1
WITH team_1 AS (
SELECT
Team1 AS team_name,
SUM(CASE WHEN Season_year = 2008 THEN 1 ELSE 0 END) AS 2008, -- Matches in 2008
SUM(CASE WHEN Season_year = 2009 THEN 1 ELSE 0 END) AS 2009, -- Matches in 2009
SUM(CASE WHEN Season_year = 2010 THEN 1 ELSE 0 END) AS 2010, -- Matches in 2010
SUM(CASE WHEN Season_year = 2011 THEN 1 ELSE 0 END) AS 2011, -- Matches in 2011
SUM(CASE WHEN Season_year = 2012 THEN 1 ELSE 0 END) AS 2012, -- Matches in 2012
SUM(CASE WHEN Season_year = 2013 THEN 1 ELSE 0 END) AS 2013, -- Matches in 2013
SUM(CASE WHEN Season_year = 2014 THEN 1 ELSE 0 END) AS 2014, -- Matches in 2014
SUM(CASE WHEN Season_year = 2015 THEN 1 ELSE 0 END) AS 2015, -- Matches in 2015
SUM(CASE WHEN Season_year = 2016 THEN 1 ELSE 0 END) AS 2016, -- Matches in 2016
SUM(CASE WHEN Season_year = 2017 THEN 1 ELSE 0 END) AS 2017, -- Matches in 2017
COUNT(match_id) AS number_of_matches -- Total matches for Team1
FROM matches
GROUP BY team_name
),

-- Step 2: Define CTE for Team 2
team_2 AS (
SELECT
Team2 AS team_name,
SUM(CASE WHEN Season_year = 2008 THEN 1 ELSE 0 END) AS 2008, -- Matches in 2008
SUM(CASE WHEN Season_year = 2009 THEN 1 ELSE 0 END) AS 2009, -- Matches in 2009
SUM(CASE WHEN Season_year = 2010 THEN 1 ELSE 0 END) AS 2010, -- Matches in 2010
SUM(CASE WHEN Season_year = 2011 THEN 1 ELSE 0 END) AS 2011, -- Matches in 2011
SUM(CASE WHEN Season_year = 2012 THEN 1 ELSE 0 END) AS 2012, -- Matches in 2012
SUM(CASE WHEN Season_year = 2013 THEN 1 ELSE 0 END) AS 2013, -- Matches in 2013
SUM(CASE WHEN Season_year = 2014 THEN 1 ELSE 0 END) AS 2014, -- Matches in 2014
SUM(CASE WHEN Season_year = 2015 THEN 1 ELSE 0 END) AS 2015, -- Matches in 2015
SUM(CASE WHEN Season_year = 2016 THEN 1 ELSE 0 END) AS 2016, -- Matches in 2016
SUM(CASE WHEN Season_year = 2017 THEN 1 ELSE 0 END) AS 2017, -- Matches in 2017
COUNT(match_id) AS number_of_matches -- Total matches for Team2
FROM matches
GROUP BY team_name
),

-- Step 3: Combine the CTE tables
combined_team AS (
SELECT * FROM team_1
UNION ALL
SELECT * FROM team_2
)

-- Step 4: Summarize the results by team name
SELECT
team_name,
SUM(2008) AS 2008, -- Sum of matches in 2008 for each team
SUM(2009) AS 2009, -- Sum of matches in 2009 for each team
SUM(2010) AS 2010, -- Sum of matches in 2010 for each team
SUM(2011) AS 2011, -- Sum of matches in 2011 for each team
SUM(2012) AS 2012, -- Sum of matches in 2012 for each team
SUM(2013) AS 2013, -- Sum of matches in 2013 for each team
SUM(2014) AS 2014, -- Sum of matches in 2014 for each team
SUM(2015) AS 2015, -- Sum of matches in 2015 for each team
SUM(2016) AS 2016, -- Sum of matches in 2016 for each team
SUM(2017) AS 2017, -- Sum of matches in 2017 for each team
SUM(number_of_matches) AS total_matches -- Sum of total matches for each team
FROM combined_team
GROUP BY team_name
ORDER BY total_matches DESC; -- Order by total matches in descending order

Explanation :

WITH team_1 AS (...): Creates a CTE for Team 1 with the number of matches played in each year.
Team1 AS team_name: Renames the Team1 column to team_name.
SUM(CASE WHEN Season_year = YYYY THEN 1 ELSE 0 END): Calculates the number of matches for each year.
COUNT(match_id): Counts the total number of matches for Team1.
GROUP BY team_name: Groups the results by team name.
WITH team_2 AS (...): Creates a CTE for Team 2 with the number of matches played in each
Team2 AS team_name: Renames the Team2 column to team_name.
SUM(CASE WHEN Season_year = YYYY THEN 1 ELSE 0 END): Calculates the number of matches for each
COUNT(match_id): Counts the total number of matches for Team2.
GROUP BY team_name: Groups the results by team name.
combined_team AS (...): Combines the results of team_1 and team_2 using UNION ALL.
UNION ALL: Combines the rows from both CTEs, keeping duplicates.
SELECT ... FROM combined_team: Summarizes the results by team name.
SUM(YYYY): Sums the number of matches for each year for each team.
SUM(number_of_matches): Sums the total matches for each team.
GROUP BY team_name: Groups the final results by team name.
ORDER BY total_matches DESC: Orders the final results by total matches in descending order.

Second Summary image

-- Use the IPL database
USE ipl;

-- Step 1: Define CTE for Team 1
WITH team_1 AS (
SELECT
Team1 AS team_name,
venue_name,
SUM(CASE WHEN Season_year = 2008 THEN 1 ELSE 0 END) AS 2008, -- Matches in 2008
SUM(CASE WHEN Season_year = 2009 THEN 1 ELSE 0 END) AS 2009, -- Matches in 2009
SUM(CASE WHEN Season_year = 2010 THEN 1 ELSE 0 END) AS 2010, -- Matches in 2010
SUM(CASE WHEN Season_year = 2011 THEN 1 ELSE 0 END) AS 2011, -- Matches in 2011
SUM(CASE WHEN Season_year = 2012 THEN 1 ELSE 0 END) AS 2012, -- Matches in 2012
SUM(CASE WHEN Season_year = 2013 THEN 1 ELSE 0 END) AS 2013, -- Matches in 2013
SUM(CASE WHEN Season_year = 2014 THEN 1 ELSE 0 END) AS 2014, -- Matches in 2014
SUM(CASE WHEN Season_year = 2015 THEN 1 ELSE 0 END) AS 2015, -- Matches in 2015
SUM(CASE WHEN Season_year = 2016 THEN 1 ELSE 0 END) AS 2016, -- Matches in 2016
SUM(CASE WHEN Season_year = 2017 THEN 1 ELSE 0 END) AS 2017, -- Matches in 2017
COUNT(match_id) AS number_of_matches -- Total matches for Team1
FROM matches
GROUP BY team_name, venue_name
),

-- Step 2: Define CTE for Team 2
team_2 AS (
SELECT
Team2 AS team_name,
venue_name,
SUM(CASE WHEN Season_year = 2008 THEN 1 ELSE 0 END) AS 2008, -- Matches in 2008
SUM(CASE WHEN Season_year = 2009 THEN 1 ELSE 0 END) AS 2009, -- Matches in 2009
SUM(CASE WHEN Season_year = 2010 THEN 1 ELSE 0 END) AS 2010, -- Matches in 2010
SUM(CASE WHEN Season_year = 2011 THEN 1 ELSE 0 END) AS 2011, -- Matches in 2011
SUM(CASE WHEN Season_year = 2012 THEN 1 ELSE 0 END) AS 2012, -- Matches in 2012
SUM(CASE WHEN Season_year = 2013 THEN 1 ELSE 0 END) AS 2013, -- Matches in 2013
SUM(CASE WHEN Season_year = 2014 THEN 1 ELSE 0 END) AS 2014, -- Matches in 2014
SUM(CASE WHEN Season_year = 2015 THEN 1 ELSE 0 END) AS 2015, -- Matches in 2015
SUM(CASE WHEN Season_year = 2016 THEN 1 ELSE 0 END) AS 2016, -- Matches in 2016
SUM(CASE WHEN Season_year = 2017 THEN 1 ELSE 0 END) AS 2017, -- Matches in 2017
COUNT(match_id) AS number_of_matches -- Total matches for Team2
FROM matches
GROUP BY team_name, venue_name
),

-- Step 3: Combine the CTE tables
combined_team AS (
SELECT * FROM team_1
UNION ALL
SELECT * FROM team_2
),

-- Step 4: Create summary for each team and venue
team_summary AS (
SELECT
team_name,
venue_name,
DENSE_RANK() OVER (PARTITION BY team_name ORDER BY SUM(number_of_matches) DESC) AS ranking, -- Rank venues by number of matches per team
SUM(2008) AS 2008, -- Sum of matches in 2008 for each team
SUM(2009) AS 2009, -- Sum of matches in 2009 for each team
SUM(2010) AS 2010, -- Sum of matches in 2010 for each team
SUM(2011) AS 2011, -- Sum of matches in 2011 for each team
SUM(2012) AS 2012, -- Sum of matches in 2012 for each team
SUM(2013) AS 2013, -- Sum of matches in 2013 for each team
SUM(2014) AS 2014, -- Sum of matches in 2014 for each team
SUM(2015) AS 2015, -- Sum of matches in 2015 for each team
SUM(2016) AS 2016, -- Sum of matches in 2016 for each team
SUM(2017) AS 2017, -- Sum of matches in 2017 for each team
SUM(number_of_matches) AS total_matches -- Sum of total matches for each team
FROM combined_team
GROUP BY team_name, venue_name
)

-- Step 5: Select the top venue for each team
SELECT * FROM team_summary
WHERE ranking = 1
ORDER BY total_matches DESC; -- Order by total matches in descending order

Explanation : WITH team_1 AS (...): Creates a CTE for Team 1 with the number of matches played in each year at each venue.
WITH team_2 AS (...): Creates a CTE for Team 2 with the number of matches played in each year at each venue.
combined_team AS (...): Combines the results of team_1 and team_2 using UNION ALL.
team_summary AS (...): Summarizes the results by team name and venue, ranking venues based on the number of matches played by each team.
SELECT * FROM team_summary WHERE ranking = 1 ORDER BY total_matches DESC; Selects the top-ranked venue for each team and orders the results by the total number of matches in descending order.

1.4 Demographics of the Players

Query :

image

-- Step 1: Create a CTE for team 1
WITH team_1_sub AS (
SELECT
Match_Id,
Player_team AS team_name,
SUM(CASE WHEN Season_year = 2008 THEN IF(Country_Name <> "India", ROUND(1/11, 3) * 100, 0) ELSE 0 END) AS 2008,
SUM(CASE WHEN Season_year = 2009 THEN IF(Country_Name <> "India", ROUND(1/11, 3) * 100, 0) ELSE 0 END) AS 2009,
SUM(CASE WHEN Season_year = 2010 THEN IF(Country_Name <> "India", ROUND(1/11, 3) * 100, 0) ELSE 0 END) AS 2010,
SUM(CASE WHEN Season_year = 2011 THEN IF(Country_Name <> "India", ROUND(1/11, 3) * 100, 0) ELSE 0 END) AS 2011,
SUM(CASE WHEN Season_year = 2012 THEN IF(Country_Name <> "India", ROUND(1/11, 3) * 100, 0) ELSE 0 END) AS 2012,
SUM(CASE WHEN Season_year = 2013 THEN IF(Country_Name <> "India", ROUND(1/11, 3) * 100, 0) ELSE 0 END) AS 2013,
SUM(CASE WHEN Season_year = 2014 THEN IF(Country_Name <> "India", ROUND(1/11, 3) * 100, 0) ELSE 0 END) AS 2014,
SUM(CASE WHEN Season_year = 2015 THEN IF(Country_Name <> "India", ROUND(1/11, 3) * 100, 0) ELSE 0 END) AS 2015,
SUM(CASE WHEN Season_year = 2016 THEN IF(Country_Name <> "India", ROUND(1/11, 3) * 100, 0) ELSE 0 END) AS 2016,
SUM(CASE WHEN Season_year = 2017 THEN IF(Country_Name <> "India", ROUND(1/11, 3) * 100, 0) ELSE 0 END) AS 2017,
ROUND(
(COUNT(DISTINCT Player_Id) * 100) /
(SELECT COUNT(DISTINCT Player_Id) FROM ipl.player_match WHERE Country_Name <> "India"), 2
) AS Contribution_foreign_players
FROM ipl.player_match
GROUP BY team_name, Match_Id
),

-- Step 2: Calculate the average contribution for each year and overall
team_1_final AS (
SELECT
team_name,
ROUND(AVG(2008), 3) AS 2008,
ROUND(AVG(2009), 3) AS 2009,
ROUND(AVG(2010), 3) AS 2010,
ROUND(AVG(2011), 3) AS 2011,
ROUND(AVG(2012), 3) AS 2012,
ROUND(AVG(2013), 3) AS 2013,
ROUND(AVG(2014), 3) AS 2014,
ROUND(AVG(2015), 3) AS 2015,
ROUND(AVG(2016), 3) AS 2016,
ROUND(AVG(2017), 3) AS 2017,
ROUND(AVG(Contribution_foreign_players), 3) AS Contribution_foreign_players
FROM team_1_sub
GROUP BY team_name
)

-- Step 3: Select the results from team_1_final
SELECT * FROM team_1_final;

Explanation : WITH team_1_sub AS (...): Creates a CTE for team 1 to calculate the foreign player contribution percentage for each match and each year.
team_1_final AS (...): Calculates the average foreign player contribution percentage for each year and overall for each team.
SELECT * FROM team_1_final: Selects the results from team_1_final.

Second Summary

image

-- Step 1: Count the total number of players from each country and calculate the percentage of players for each country

SELECT
Country_Name,
COUNT(Country_Name) AS total_players,
ROUND(COUNT(Country_Name) * 100 / (SELECT total FROM (SELECT COUNT(Country_Name) AS total FROM ipl.player) AS sub_table), 2) AS percentage
FROM ipl.player
GROUP BY Country_Name
ORDER BY total_players DESC;

Explanation :

SELECT COUNT(Country_Name) AS total_players: Counts the total number of players from each country.
SELECT total FROM (SELECT COUNT(Country_Name) AS total FROM ipl.player) AS sub_table: Subquery to calculate the total number of players in the dataset.
ROUND(..., 2) AS percentage: Calculates the percentage of players for each country and rounds it to two decimal places.
GROUP BY Country_Name: Groups the results by country name.
ORDER BY total_players DESC: Orders the results by the total number of players in descending order.

2. Key Performance Metrics

2.1 Tournament wise Key Metrics

Query :

image

WITH match_details AS (
SELECT
match_id,
Striker,
SUM(Runs_Scored) AS total_runs,
(CASE WHEN (SUM(Runs_Scored) / 50 > 1 AND SUM(Runs_Scored) / 50 < 2) THEN 1 ELSE 0 END) AS half_century,
(CASE WHEN (SUM(Runs_Scored) / 100 >= 1 AND SUM(Runs_Scored) / 100 <= 2) THEN 1 ELSE 0 END) AS century,
(CASE WHEN (SUM(Runs_Scored) / 200 >= 2 AND SUM(Runs_Scored) / 200 <= 3) THEN 1 ELSE 0 END) AS double_century,
(CASE WHEN Runs_Scored = 6 THEN 1 ELSE 0 END) as sixes,
(CASE WHEN Runs_Scored = 4 THEN 1 ELSE 0 END) as fours,
(CASE WHEN Runs_Scored = 0 THEN 1 ELSE 0 END) as dot_balls,
count(Runs_Scored) as balls_faced,
(SUM(Runs_Scored) / count(Runs_Scored)) * 100 as strike_rate,
sum(Extra_runs) as extra_runs,
SUM(Caught) as catches,
(SUM(Bowled) + sum(Bowler_Wicket)) as bowled,
SUM(Run_out) as run_out
FROM
ipl.ball_by_ball
GROUP BY
match_id,
Striker
ORDER BY
total_runs DESC
)

SELECT
m.Season_year,
sum(md.total_runs) as total_runs,
sum(md.balls_faced) as total_balls_faced,
ROUND((sum(md.total_runs) / sum(md.balls_faced)) * 100, 2) as avg_strike_rate,
sum(md.half_century) as total_half_centuries,
sum(md.century) as total_centuries,
SUM(md.sixes) as total_sixes,
SUM(md.fours) as total_fours,
SUM(md.dot_balls) as total_dot_balls,
sum(md.extra_runs) as total_extra_runs,
sum(md.catches) as total_catches,
sum(md.bowled) as total_bowled,
sum(md.run_out) as total_run_out
FROM
match_details md
JOIN
matches m
ON
md.match_id = m.match_id
GROUP BY
m.Season_year

Explanation :

m.Season_year: The year of the IPL season.
sum(md.total_runs): Total runs scored across all matches in the season.
sum(md.balls_faced): Total number of balls faced across all matches in the season.
ROUND((sum(md.total_runs) / sum(md.balls_faced)) * 100, 2): Average strike rate of batsmen across all matches in the season.
**sum(md.half_century): **Total number of half-centuries scored in the season.
sum(md.century): Total number of centuries scored in the season.
SUM(md.sixes): Total number of sixes hit in the season.
SUM(md.fours): Total number of fours hit in the season.
SUM(md.dot_balls): Total number of dot balls faced in the season.
sum(md.extra_runs): Total number of extra runs in the season.
sum(md.catches): Total number of catches taken in the season.
sum(md.bowled): Total number of batsmen bowled out in the season.
sum(md.run_out): Total number of run-outs in the season.
GROUP BY m.Season_year: Groups the results by season year.

2.2 Venue wise Key Metrics

image

Query :

USE ipl;

-- Step 1: Create a Common Table Expression (CTE) to calculate match and striker specific statistics
WITH match_details AS (
SELECT
Team_Batting as team_id, -- Team that is batting
match_id, -- Unique identifier for each match
Striker, -- Batsman who faced the ball
SUM(Runs_Scored) AS total_runs, -- Total runs scored by the batsman in the match
-- Flag indicating if the batsman scored a half-century (50-99 runs)
(CASE WHEN (SUM(Runs_Scored) / 50 > 1 AND SUM(Runs_Scored) / 50 < 2) THEN 1 ELSE 0 END) AS half_century,
-- Flag indicating if the batsman scored a century (100-199 runs)
(CASE WHEN (SUM(Runs_Scored) / 100 >= 1 AND SUM(Runs_Scored) / 100 <= 2) THEN 1 ELSE 0 END) AS century,
-- Flag indicating if the batsman scored a double century (200-299 runs)
(CASE WHEN (SUM(Runs_Scored) / 200 >= 2 AND SUM(Runs_Scored) / 200 <= 3) THEN 1 ELSE 0 END) AS double_century,
-- Number of sixes hit by the batsman
(CASE WHEN Runs_Scored = 6 THEN 1 ELSE 0 END) as sixes,
-- Number of fours hit by the batsman
(CASE WHEN Runs_Scored = 4 THEN 1 ELSE 0 END) as fours,
-- Number of dot balls (balls with zero runs) faced by the batsman
(CASE WHEN Runs_Scored = 0 THEN 1 ELSE 0 END) as dot_balls,
COUNT(Runs_Scored) as balls_faced, -- Total number of balls faced by the batsman
-- Strike rate of the batsman (calculated as total runs per 100 balls faced)
(SUM(Runs_Scored) / COUNT(Runs_Scored)) * 100 as strike_rate,
SUM(Extra_runs) as extra_runs, -- Total extra runs (wides, no-balls, etc.) contributed by the batsman
SUM(Caught) as catches, -- Number of catches taken by the batsman
(SUM(Bowled) + SUM(Bowler_Wicket)) as bowled, -- Total number of batsmen bowled out by the bowler
SUM(Run_out) as run_out -- Number of run-outs involving the batsman
FROM
ipl.ball_by_ball
GROUP BY
team_id,
match_id,
Striker
),

-- Step 2: Calculate team performance based on the match details
team_performance AS (
SELECT
m.venue_name, -- Venue of the match
SUM(md.total_runs) as total_runs, -- Total runs scored by the team
SUM(md.balls_faced) as total_balls_faced, -- Total balls faced by the team
-- Average strike rate of the team
ROUND((SUM(md.total_runs) / SUM(md.balls_faced)) * 100, 2) as avg_strike_rate,
SUM(md.half_century) as total_half_centuries, -- Total half-centuries scored by the team
SUM(md.century) as total_centuries, -- Total centuries scored by the team
SUM(md.sixes) as total_sixes, -- Total sixes hit by the team
SUM(md.fours) as total_fours, -- Total fours hit by the team
SUM(md.dot_balls) as total_dot_balls -- Total dot balls faced by the team
FROM
match_details md
JOIN
matches m ON md.match_id = m.match_id -- Join with matches to get match details
JOIN
team t ON md.team_id = t.Team_Id -- Join with team to get team details
GROUP BY
t.Team_Name, -- Group by team name to get team performance
m.venue_name -- Group by venue to get venue specific performance
)

-- Step 3: Rank the teams based on their total runs
SELECT
*,
RANK() OVER (ORDER BY total_runs DESC) as ranking -- Rank teams based on total runs scored
FROM
team_performance
ORDER BY
ranking ASC; -- Order the results by ranking in ascending order

Explanation :

Ranking the teams: The RANK() function ranks teams based on the total runs scored.
Ordering the results: The results are ordered by the ranking in ascending order, so the team with the most runs is ranked first.

2.3 Team wise Key Metrics

**Query : **

image

USE ipl;

-- Step 1: Create a Common Table Expression (CTE) to calculate match and striker specific statistics
WITH match_details AS (
SELECT
Team_Batting AS team_id, -- Team that is batting
match_id, -- Unique identifier for each match
Striker, -- Batsman who faced the ball
SUM(Runs_Scored) AS total_runs, -- Total runs scored by the batsman in the match
-- Flag indicating if the batsman scored a half-century (50-99 runs)
(CASE WHEN (SUM(Runs_Scored) / 50 > 1 AND SUM(Runs_Scored) / 50 < 2) THEN 1 ELSE 0 END) AS half_century,
-- Flag indicating if the batsman scored a century (100-199 runs)
(CASE WHEN (SUM(Runs_Scored) / 100 >= 1 AND SUM(Runs_Scored) / 100 <= 2) THEN 1 ELSE 0 END) AS century,
-- Flag indicating if the batsman scored a double century (200-299 runs)
(CASE WHEN (SUM(Runs_Scored) / 200 >= 2 AND SUM(Runs_Scored) / 200 <= 3) THEN 1 ELSE 0 END) AS double_century,
-- Number of sixes hit by the batsman
(CASE WHEN Runs_Scored = 6 THEN 1 ELSE 0 END) AS sixes,
-- Number of fours hit by the batsman
(CASE WHEN Runs_Scored = 4 THEN 1 ELSE 0 END) AS fours,
-- Number of dot balls (balls with zero runs) faced by the batsman
(CASE WHEN Runs_Scored = 0 THEN 1 ELSE 0 END) AS dot_balls,
COUNT(Runs_Scored) AS balls_faced, -- Total number of balls faced by the batsman
-- Strike rate of the batsman (calculated as total runs per 100 balls faced)
(SUM(Runs_Scored) / COUNT(Runs_Scored)) * 100 AS strike_rate, SUM(Extra_runs) AS extra_runs, -- Total extra runs (wides, no-balls, etc.) contributed by the batsman
SUM(Caught) AS catches, -- Number of catches taken by the batsman
(SUM(Bowled) + SUM(Bowler_Wicket)) AS bowled, -- Total number of batsmen bowled out by the bowler
SUM(Run_out) AS run_out -- Number of run-outs involving the batsman
FROM
ipl.ball_by_ball
GROUP BY
team_id,
match_id,
Striker
),

-- Step 2: Calculate team performance based on the match details
team_performance AS (
SELECT
t.Team_Name, -- Name of the team
SUM(md.total_runs) AS total_runs, -- Total runs scored by the team
SUM(md.balls_faced) AS total_balls_faced, -- Total balls faced by the team
-- Average strike rate of the team
ROUND((SUM(md.total_runs) / SUM(md.balls_faced)) * 100, 2) AS avg_strike_rate,
SUM(md.half_century) AS total_half_centuries, -- Total half-centuries scored by the team
SUM(md.century) AS total_centuries, -- Total centuries scored by the team
SUM(md.sixes) AS total_sixes, -- Total sixes hit by the team
SUM(md.fours) AS total_fours, -- Total fours hit by the team
SUM(md.dot_balls) AS total_dot_balls -- Total dot balls faced by the team
FROM
match_details md
JOIN
matches m ON md.match_id = m.match_id -- Join with matches to get match details
JOIN
team t ON md.team_id = t.Team_Id -- Join with team to get team details
GROUP BY
t.Team_Name -- Group by team name to get team performance
)

-- Step 3: Rank the teams based on their total runs
SELECT
*,
RANK() OVER (ORDER BY total_runs DESC) AS ranking -- Rank teams based on total runs scored
FROM
team_performance
ORDER BY
ranking ASC; -- Order the results by ranking in ascending order

image

USE ipl;

-- Step 1: Create a Common Table Expression (CTE) to calculate match and striker specific statistics
WITH match_details AS (
SELECT
Team_Batting AS team_id, -- Team that is batting
match_id, -- Unique identifier for each match
Striker, -- Batsman who faced the ball
SUM(Runs_Scored) AS total_runs, -- Total runs scored by the batsman in the match
-- Flag indicating if the batsman scored a half-century (50-99 runs)
(CASE WHEN (SUM(Runs_Scored) / 50 > 1 AND SUM(Runs_Scored) / 50 < 2) THEN 1 ELSE 0 END) AS half_century,
-- Flag indicating if the batsman scored a century (100-199 runs)
(CASE WHEN (SUM(Runs_Scored) / 100 >= 1 AND SUM(Runs_Scored) / 100 <= 2) THEN 1 ELSE 0 END) AS century,
-- Flag indicating if the batsman scored a double century (200-299 runs)
(CASE WHEN (SUM(Runs_Scored) / 200 >= 2 AND SUM(Runs_Scored) / 200 <= 3) THEN 1 ELSE 0 END) AS double_century,
-- Number of sixes hit by the batsman
(CASE WHEN Runs_Scored = 6 THEN 1 ELSE 0 END) AS sixes,
-- Number of fours hit by the batsman
(CASE WHEN Runs_Scored = 4 THEN 1 ELSE 0 END) AS fours,
-- Number of dot balls (balls with zero runs) faced by the batsman
(CASE WHEN Runs_Scored = 0 THEN 1 ELSE 0 END) AS dot_balls,
COUNT(Runs_Scored) AS balls_faced, -- Total number of balls faced by the batsman
-- Strike rate of the batsman (calculated as total runs per 100 balls faced)
(SUM(Runs_Scored) / COUNT(Runs_Scored)) * 100 AS strike_rate,
SUM(Extra_runs) AS extra_runs, -- Total extra runs (wides, no-balls, etc.) contributed by the batsman
SUM(Caught) AS catches, -- Number of catches taken by the batsman
(SUM(Bowled) + SUM(Bowler_Wicket)) AS bowled, -- Total number of batsmen bowled out by the bowler
SUM(Run_out) AS run_out -- Number of run-outs involving the batsman
FROM
ipl.ball_by_ball
GROUP BY
team_id,
match_id,
Striker
),

-- Step 2: Calculate team performance based on the match details
team_performance AS (
SELECT
t.Team_Name, -- Name of the team
SUM(md.bowled) AS total_bowled, -- Total number of batsmen bowled out by the team
SUM(md.catches) AS total_catches, -- Total number of catches taken by the team
SUM(md.run_out) AS total_run_outs, -- Total number of run-outs involving the team
SUM(md.extra_runs) AS total_extras, -- Total extra runs (wides, no-balls, etc.) contributed by the team
SUM(md.dot_balls) AS total_dot_balls -- Total number of dot balls faced by the team
FROM
match_details md
JOIN
matches m ON md.match_id = m.match_id -- Join with matches to get match details
JOIN
team t ON md.team_id = t.Team_Id -- Join with team to get team details
GROUP BY
t.Team_Name -- Group by team name to get team performance
)

-- Step 3: Rank the teams based on their total number of batsmen bowled out
SELECT
*,
RANK() OVER (ORDER BY total_bowled DESC) AS ranking -- Rank teams based on total number of batsmen bowled out
FROM
team_performance
ORDER BY
ranking ASC; -- Order the results by ranking in ascending order

2.4 Player wise Key Metrics

Query :

image

WITH match_details AS ( SELECT match_id, Striker, SUM(Runs_Scored) AS total_runs,

    (CASE WHEN SUM(Runs_Scored) >= 50 AND SUM(Runs_Scored) < 100 THEN 1 ELSE 0 END) AS half_century,
    (CASE WHEN SUM(Runs_Scored) >= 100 AND SUM(Runs_Scored) < 200 THEN 1 ELSE 0 END) AS century,
    (CASE WHEN SUM(Runs_Scored) >= 200 AND SUM(Runs_Scored) < 300 THEN 1 ELSE 0 END) AS double_century,

    SUM(CASE WHEN Runs_Scored = 6 THEN 1 ELSE 0 END) AS sixes,
    SUM(CASE WHEN Runs_Scored = 4 THEN 1 ELSE 0 END) AS fours,
    SUM(CASE WHEN Runs_Scored = 0 THEN 1 ELSE 0 END) AS dot_balls,

    COUNT(Runs_Scored) AS balls_faced, 
    (SUM(Runs_Scored) / COUNT(Runs_Scored)) * 100 AS strike_rate,

    SUM(Extra_runs) AS extra_runs, 
    SUM(Caught) AS catches, 
    SUM(Bowled + Bowler_Wicket) AS bowled, 
    SUM(Run_out) AS run_out 
FROM
    ipl.ball_by_ball
GROUP BY
    match_id,
    Striker

),

team_performance AS ( SELECT p.Player_Name, SUM(md.total_runs) AS total_runs, SUM(md.balls_faced) AS total_balls_faced, ROUND((SUM(md.total_runs) / SUM(md.balls_faced)) * 100, 2) AS avg_strike_rate, SUM(md.century) AS total_centuries, SUM(md.sixes) AS total_sixes, SUM(md.fours) AS total_fours, SUM(md.dot_balls) AS total_dot_balls FROM match_details md JOIN matches m ON md.match_id = m.match_id JOIN player p ON md.Striker = p.Player_Id GROUP BY p.Player_Name )

SELECT *, RANK() OVER (ORDER BY total_runs DESC) AS ranking FROM team_performance ORDER BY ranking ASC

image

USE ipl;

-- Step 1: Create a Common Table Expression (CTE) to calculate bowler specific statistics
WITH match_details AS (
SELECT
Bowler, -- Bowler who delivered the ball
SUM(Runs_Scored) AS total_runs, -- Total runs conceded by the bowler
SUM((CASE WHEN Runs_Scored = 0 THEN 1 ELSE 0 END)) AS dot_balls, -- Number of dot balls (balls with zero runs)
COUNT(Runs_Scored) AS balls_bowled, -- Total number of balls bowled
SUM(Extra_runs) AS extra_runs, -- Total extra runs (wides, no-balls, etc.) conceded by the bowler
SUM(Caught) AS catches, -- Number of catches taken by fielders off the bowler's delivery
(SUM(Bowled) + SUM(Bowler_Wicket)) AS bowled, -- Total number of batsmen bowled out by the bowler
SUM(Run_out) AS run_out -- Number of run-outs involving the bowler
FROM
ipl.ball_by_ball
GROUP BY
Bowler
),

-- Step 2: Calculate player performance based on the match details for bowlers
team_performance AS (
SELECT
p.Player_Name, -- Name of the player
SUM(md.balls_bowled) AS total_balls_bowled, -- Total balls bowled by the player
SUM(md.total_runs) AS total_runs, -- Total runs conceded by the player
SUM(md.dot_balls) AS total_dot_balls, -- Total dot balls delivered by the player
SUM(md.extra_runs) AS total_extra_runs, -- Total extra runs conceded by the player
SUM(md.catches) AS total_catches, -- Total catches taken off the player's delivery
SUM(md.bowled) AS total_bowled -- Total number of batsmen bowled out by the player
FROM
match_details md
JOIN
player p ON md.Bowler = p.Player_Id -- Join with player to get player details
GROUP BY
p.Player_Name -- Group by player name to get player performance
)

-- Step 3: Rank the bowlers based on the total number of batsmen bowled out
SELECT
*,
DENSE_RANK() OVER (ORDER BY total_bowled DESC) AS ranking -- Rank players based on total batsmen bowled out
FROM
team_performance
ORDER BY
ranking ASC; -- Order the results by ranking in ascending order

image

USE ipl;

-- Step 1: Create a Common Table Expression (CTE) to calculate batsman-specific statistics per match
WITH match_details AS (
SELECT
match_id, -- Match identifier
Striker, -- Batsman identifier
SUM(Runs_Scored) AS total_runs, -- Total runs scored by the batsman in the match
(CASE WHEN (SUM(Runs_Scored) / 50 > 1 AND SUM(Runs_Scored) / 50 < 2) THEN 1 ELSE 0 END) AS half_century, -- Indicator for half-century
(CASE WHEN (SUM(Runs_Scored) / 100 >= 1 AND SUM(Runs_Scored) / 100 <= 2) THEN 1 ELSE 0 END) AS century, -- Indicator for century
(CASE WHEN (SUM(Runs_Scored) / 200 >= 2 AND SUM(Runs_Scored) / 200 <= 3) THEN 1 ELSE 0 END) AS double_century, -- Indicator for double-century
(CASE WHEN Runs_Scored = 6 THEN 1 ELSE 0 END) AS sixes, -- Number of sixes hit by the batsman
(CASE WHEN Runs_Scored = 4 THEN 1 ELSE 0 END) AS fours, -- Number of fours hit by the batsman
(CASE WHEN Runs_Scored = 0 THEN 1 ELSE 0 END) AS dot_balls, -- Number of dot balls faced by the batsman
COUNT(Runs_Scored) AS balls_faced, -- Total number of balls faced by the batsman
(SUM(Runs_Scored) / COUNT(Runs_Scored)) * 100 AS strike_rate, -- Strike rate of the batsman
SUM(Extra_runs) AS extra_runs, -- Total extra runs scored by the batsman
SUM(Caught) AS catches, -- Number of times the batsman was caught out
(SUM(Bowled) + SUM(Bowler_Wicket)) AS bowled, -- Number of times the batsman was bowled out
SUM(Run_out) AS run_out -- Number of times the batsman was run out
FROM
ipl.ball_by_ball
GROUP BY
match_id,
Striker
),

-- Step 2: Calculate performance based on the match details for batsmen from different countries
team_performance AS (
SELECT
p.Country_Name, -- Country of the batsman
SUM(md.total_runs) AS total_runs, -- Total runs scored by batsmen from the country
SUM(md.balls_faced) AS total_balls_faced, -- Total balls faced by batsmen from the country
ROUND((SUM(md.total_runs) / SUM(md.balls_faced)) * 100, 2) AS avg_strike_rate, -- Average strike rate of batsmen from the country
SUM(md.half_century) AS total_half_centuries, -- Total half-centuries scored by batsmen from the country
SUM(md.century) AS total_centuries, -- Total centuries scored by batsmen from the country
SUM(md.sixes) AS total_sixes, -- Total sixes hit by batsmen from the country
SUM(md.fours) AS total_fours, -- Total fours hit by batsmen from the country
SUM(md.dot_balls) AS total_dot_balls -- Total dot balls faced by batsmen from the country
FROM
match_details md
JOIN
matches m ON md.match_id = m.match_id
JOIN
player p ON md.Striker = p.Player_Id -- Join with player to get player details including country
GROUP BY
p.Country_Name -- Group by country name to get performance metrics per country
)

-- Step 3: Rank the countries based on total runs scored by their batsmen
SELECT
*,
DENSE_RANK() OVER (ORDER BY total_runs DESC) AS ranking -- Rank countries based on total runs scored
FROM
team_performance
ORDER BY
ranking ASC; -- Order the results by ranking in ascending order

image

USE ipl;

-- Step 1: Create a Common Table Expression (CTE) to calculate bowler-specific statistics
WITH match_details AS (
SELECT
Bowler, -- Bowler identifier
SUM(Runs_Scored) AS total_runs, -- Total runs conceded by the bowler
SUM((CASE WHEN Runs_Scored = 0 THEN 1 ELSE 0 END)) AS dot_balls, -- Number of dot balls bowled by the bowler
COUNT(Runs_Scored) AS balls_bowled, -- Total number of balls bowled by the bowler
SUM(Extra_runs) AS extra_runs, -- Total extra runs conceded by the bowler
SUM(Caught) AS catches, -- Number of catches taken by the bowler
(SUM(Bowled) + SUM(Bowler_Wicket)) AS bowled, -- Number of wickets taken by the bowler
SUM(Run_out) AS run_out -- Number of run-outs effected by the bowler
FROM
ipl.ball_by_ball
GROUP BY
Bowler
),

-- Step 2: Calculate performance metrics for bowlers from different countries
team_performance AS (
SELECT
p.Country_Name, -- Country of the bowler
SUM(md.balls_bowled) AS total_balls_bowled, -- Total balls bowled by bowlers from the country
SUM(md.total_runs) AS total_runs, -- Total runs conceded by bowlers from the country
SUM(md.dot_balls) AS total_dot_balls, -- Total dot balls bowled by bowlers from the country
SUM(md.extra_runs) AS total_extra_runs, -- Total extra runs conceded by bowlers from the country
SUM(md.catches) AS total_catches, -- Total catches taken by bowlers from the country
SUM(md.bowled) AS total_bowled -- Total wickets taken by bowlers from the country
FROM
match_details md
JOIN
player p ON md.Bowler = p.Player_Id -- Join with player to get player details including country
GROUP BY
p.Country_Name -- Group by country name to get performance metrics per country
)

-- Step 3: Rank the countries based on total wickets taken by their bowlers
SELECT
*,
DENSE_RANK() OVER (ORDER BY total_bowled DESC) AS ranking -- Rank countries based on total wickets taken
FROM
team_performance
ORDER BY
ranking ASC; -- Order the results by ranking in ascending order

3. Analysis

3.1 Chasing Vs Defending

image

**Query : **

-- CTE to calculate statistics for matches where the team is 'Team1'
WITH team_1 AS (
SELECT
Team1 AS team_name,
-- Count matches won by chasing
SUM(CASE WHEN Team1 = match_winner AND Toss_Name = 'bat' THEN 1 ELSE 0 END) AS won_by_chasing,
-- Count matches won by defending
SUM(CASE WHEN Team1 = match_winner AND Toss_Name = 'field' THEN 1 ELSE 0 END) AS won_by_defending,
-- Count total matches won
SUM(CASE WHEN Team1 = match_winner THEN 1 ELSE 0 END) AS match_status,
-- Count total matches played
COUNT(Team1) AS matches_played
FROM ipl.matches
GROUP BY team_name
),

-- CTE to calculate statistics for matches where the team is 'Team2'
team_2 AS (
SELECT
Team2 AS team_name,
-- Count matches won by chasing
SUM(CASE WHEN Team1 = match_winner AND Toss_Name = 'bat' THEN 1 ELSE 0 END) AS won_by_chasing,
-- Count matches won by defending
SUM(CASE WHEN Team1 = match_winner AND Toss_Name = 'field' THEN 1 ELSE 0 END) AS won_by_defending,
-- Count total matches won
SUM(CASE WHEN Team1 = match_winner THEN 1 ELSE 0 END) AS match_status,
-- Count total matches played
COUNT(Team2) AS matches_played
FROM ipl.matches
GROUP BY team_name
),

-- CTE to consolidate the statistics from team_1 and team_2
consolidated AS (
SELECT * FROM team_1
UNION ALL
SELECT * FROM team_2
),

-- CTE to calculate overall team statistics
team_stats AS (
SELECT
team_name,
-- Sum of matches played
SUM(matches_played) AS total_matches_played,
-- Sum of matches won
SUM(match_status) AS matches_won,
-- Calculate win percentage
ROUND((SUM(match_status) / SUM(matches_played)) * 100, 2) AS win_percentage,
-- Calculate percentage of matches won by chasing
ROUND((SUM(won_by_chasing) / SUM(match_status)) * 100, 2) AS won_by_chasing,
-- Calculate percentage of matches won by defending
ROUND((SUM(won_by_defending) / SUM(match_status)) * 100, 2) AS won_by_defending
FROM consolidated
GROUP BY team_name
),

-- CTE to calculate match details for each team
match_details AS (
SELECT
Team_Batting AS team_id,
match_id,
-- Sum of runs scored
SUM(Runs_Scored) AS total_runs,
-- Count of balls faced
COUNT(Runs_Scored) AS balls_faced,
-- Calculate strike rate
(SUM(Runs_Scored) / COUNT(Runs_Scored)) * 100 AS strike_rate
FROM ipl.ball_by_ball
GROUP BY team_id, match_id
),

-- CTE to calculate performance details for each team
team_performance AS (
SELECT
t.Team_Name,
-- Sum of total runs
SUM(md.total_runs) AS total_runs,
-- Sum of total balls faced
SUM(md.balls_faced) AS total_balls_faced,
-- Calculate average strike rate
ROUND((SUM(md.total_runs) / SUM(md.balls_faced)) * 100, 2) AS avg_strike_rate
FROM
match_details md
JOIN matches m ON md.match_id = m.match_id
JOIN team t ON md.team_id = t.Team_Id
GROUP BY t.Team_Name
),

-- CTE to combine team statistics and performance details
final_stats AS (
SELECT
ts.team_name,
ts.total_matches_played,
ts.matches_won,
ts.win_percentage,
ts.won_by_chasing,
ts.won_by_defending,
tp.avg_strike_rate
FROM
team_stats ts
JOIN team_performance tp ON ts.team_name = tp.Team_Name
)

-- Select final statistics for each team
SELECT
*
FROM
final_stats;

Explanation :

team_1 and team_2 CTEs: Calculate the statistics for teams when they appear as Team1 and Team2 in the matches table. This includes the number of matches won by chasing, defending, total matches won, and matches played.
consolidated CTE: Combines the results from team_1 and team_2.
team_stats CTE: Aggregates the consolidated data to calculate total matches played, matches won, win percentage, percentage of matches won by
chasing, and by defending.
match_details CTE: Computes the total runs scored, balls faced, and strike rate for each team in each match from the ball_by_ball table.
team_performance CTE: Aggregates match details to calculate the total runs, total balls faced, and average strike rate for each team.
final_stats CTE: Combines the statistics from team_stats and team_performance to generate the final statistics for each team.
Final SELECT statement: Retrieves the complete set of statistics for each team.

3.2 Most Consistent Player in each team by their contribution

-- CTE to calculate the performance of each player in each match
WITH player_wise_performance AS (
SELECT
Team_Batting AS team_name,
Striker AS player_name,
Match_id,
-- Sum of runs scored by the player in each match
SUM(Runs_Scored) AS runs_scored
FROM ball_by_ball
GROUP BY team_name, player_name, Match_id
),

-- CTE to calculate the total runs scored in the first innings of each match
match_wise_performance AS (
SELECT
Match_id,
-- Sum of runs scored in the first innings of each match
SUM(Runs_Scored) AS runs_scored
FROM ball_by_ball
WHERE Innings_No = 1
GROUP BY Match_id
),

-- CTE to calculate each player's contribution to their team's total runs in the first innings
player_wise_contribution AS (
SELECT
t.Team_Name,
p.Player_Name,
p.Country_Name,
pp.runs_scored AS player_runs,
mp.runs_scored AS match_runs,
-- Calculate player's contribution as a percentage of the total match runs
(pp.runs_scored / mp.runs_scored) * 100 AS player_wise_contribution
FROM player_wise_performance pp
LEFT JOIN match_wise_performance mp
ON pp.Match_id = mp.Match_id
LEFT JOIN player p
ON pp.player_name = p.Player_Id
LEFT JOIN team t
ON pp.team_name = t.Team_ID
),

-- CTE to calculate the average contribution of each player across all matches
sub_table AS (
SELECT
Team_Name,
Player_Name,
Country_Name,
-- Calculate the average contribution of the player
ROUND(AVG(player_wise_contribution), 2) AS average_player_contribution
FROM player_wise_contribution
GROUP BY Team_Name, Player_Name, Country_Name
ORDER BY average_player_contribution DESC
),

-- CTE to rank players within each team based on their average contribution
for_filtering AS (
SELECT
*,
-- Rank players within each team by their average contribution
DENSE_RANK() OVER (PARTITION BY Team_Name ORDER BY average_player_contribution DESC) AS ranking
FROM sub_table
)

-- Select the top-ranked player from each team who is not from India
SELECT *
FROM for_filtering
WHERE ranking <= 1 AND Country_Name <> 'India';

Explanation :

player_wise_performance CTE: Aggregates the runs scored by each player in each match.
match_wise_performance CTE: Aggregates the total runs scored in the first innings of each match.
player_wise_contribution CTE: Calculates each player's contribution to their team's total runs in the first innings as a percentage.
sub_table CTE: Calculates the average contribution of each player across all matches.
for_filtering CTE: Assigns a ranking to players within each team based on their average contribution.
Final SELECT statement: Retrieves the top-ranked player from each team who is not from India, based on their average contribution.