KZSScraper - StackUnderflowProject/Scraper GitHub Wiki

getTeamUrlMap(): Map<String, String>

Function Overview:

  • Description: This Kotlin function getTeamUrlMap fetches a map of basketball team names to their corresponding URLs from the Eurobasket website.

  • Return Type: Map<String, String>: A map where keys are team names and values are the URLs corresponding to each team.

Description:

The getTeamUrlMap function is designed to extract basketball team names along with their URLs from the Eurobasket website's Slovenian basketball league page.

Steps:

  1. Request Setup:

    • Sets up an HTTP request to the Eurobasket page containing the list of basketball teams.
  2. Data Extraction:

    • Parses the HTML response and extracts specific elements (div.BasketBallTeamDetailsLine) that represent individual basketball team entries.
  3. Iterative Processing:

    • Iterates over each team entry and retrieves the team name (div.BasketBallTeamName) and its corresponding URL (div.BasketBallTeamName > a).
  4. Map Creation:

    • Constructs a map (teamUrlMap) where each team name is mapped to its respective URL.
  5. Return:

    • Returns the populated teamUrlMap containing basketball team names as keys and their URLs as values.
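
The extraction and map-building steps above can be sketched with the standard library alone. This is a minimal illustration, not the scraper's actual code: it runs a regular expression over an inline, hypothetical HTML sample instead of requesting the page and using an HTML parser, and the names sampleTeamListHtml, teamLinkPattern, and extractTeamUrlMap are assumptions introduced here.

```kotlin
// Hypothetical sample markup; the real Eurobasket page may be structured differently.
val sampleTeamListHtml = """
    <div class="BasketBallTeamDetailsLine">
      <div class="BasketBallTeamName"><a href="https://example.com/team/1">Team One</a></div>
    </div>
    <div class="BasketBallTeamDetailsLine">
      <div class="BasketBallTeamName"><a href="https://example.com/team/2">Team Two</a></div>
    </div>
""".trimIndent()

// Captures the href and the link text of the anchor inside each div.BasketBallTeamName.
val teamLinkPattern = Regex(
    """<div class="BasketBallTeamName">\s*<a href="([^"]+)">([^<]+)</a>"""
)

// Builds the team name -> URL map, one entry per matched team link.
fun extractTeamUrlMap(html: String): Map<String, String> =
    teamLinkPattern.findAll(html).associate { match ->
        val (url, name) = match.destructured
        name.trim() to url
    }
```

Calling extractTeamUrlMap(sampleTeamListHtml) yields a map with "Team One" and "Team Two" as keys. A regular expression is used only to keep the sketch dependency-free; a real HTML parser is more robust against markup changes.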

Example:

fun main() {
    // Fetch the map of team names to their Eurobasket URLs
    val teamUrlMap = getTeamUrlMap()

    // Print the extracted team URLs
    println("Basketball Team URLs:")
    teamUrlMap.forEach { (teamName, url) ->
        println("$teamName: $url")
    }
}

getTeams(teamUrlMap: Map<String, String>): Teams

Function Overview:

  • Description: This Kotlin function getTeams fetches basketball team data (names, logos) from the Eurobasket website for the Slovenian basketball league.

  • Parameters:

    • teamUrlMap (Optional): A map of basketball team names to their corresponding URLs on the Eurobasket website. If not provided, it defaults to the map fetched by getTeamUrlMap.
  • Return Type: Teams: An object containing all the fetched basketball teams.

Description:

The getTeams function is responsible for retrieving basketball team data (names and logos) from the specified Eurobasket webpage. It leverages web scraping techniques to extract relevant information and populate a Teams object.

Steps:

  1. Input Validation:

    • Checks if a teamUrlMap is provided. If not, it fetches the map using the getTeamUrlMap function.
  2. HTTP Request:

    • Sends an HTTP request to the Eurobasket webpage containing basketball team details.
  3. Data Extraction:

    • Parses the HTML response and extracts elements (div.BasketBallTeamDetailsLine) representing individual basketball team entries.
  4. Iterative Processing:

    • Iterates over each team entry and retrieves the team's name (div.BasketBallTeamName) and logo URL (img tag's src attribute).
    • Creates BasketballTeam objects using the extracted data and adds them to the Teams collection.
  5. Logging:

    • Logs the number of teams fetched.
  6. Coaches Data Fetching:

    • Calls the getCoaches function to retrieve and assign coach information to the fetched teams using the provided teamUrlMap.
  7. Return:

    • Returns the populated Teams object containing all the fetched basketball teams.
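
The optional-parameter behaviour in step 1 maps naturally onto a Kotlin default argument. The sketch below assumes that approach; the source does not show how the check is implemented. The BasketballTeam fields, fetchTeamUrlMapStub, and getTeamsSketch are hypothetical, and the stub returns fixed data instead of making an HTTP request. Unlike the real function, the sketch derives teams from the map keys purely to stay self-contained.

```kotlin
// Assumed shape of a team record; the real BasketballTeam class may differ.
data class BasketballTeam(val name: String, val logoUrl: String, var coach: String? = null)

// Stand-in for getTeamUrlMap(): returns fixed data, no network access.
fun fetchTeamUrlMapStub(): Map<String, String> =
    mapOf("Team One" to "https://example.com/team/1")

// The default expression is evaluated only when the caller omits teamUrlMap.
fun getTeamsSketch(
    teamUrlMap: Map<String, String> = fetchTeamUrlMapStub()
): List<BasketballTeam> {
    val teams = teamUrlMap.keys.map { name -> BasketballTeam(name, logoUrl = "") }
    println("Fetched ${teams.size} teams") // mirrors the logging step
    return teams
}
```

With this shape, getTeamsSketch() falls back to the stub, while getTeamsSketch(existingMap) reuses a map that was already fetched and avoids a second request.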

Example:

fun main() {
    // Fetch the basketball teams from the Eurobasket website
    val teams = getTeams()

    // Display the fetched teams
    println("Fetched Basketball Teams:")
    teams.forEachIndexed { index, team ->
        println("${index + 1}. ${team.name} - Coach: ${team.coach}")
    }
}

getArenas(teams: Teams, teamUrlMap: Map<String, String>): Stadiums

Function Overview:

  • Description: This Kotlin function getArenas fetches arena data (names, capacities, images, locations) associated with basketball teams from the Eurobasket website.

  • Parameters:

    • teams (Optional): A Teams object containing basketball team data. If not provided, it defaults to the teams fetched by getTeams.
    • teamUrlMap (Optional): A map of basketball team names to their corresponding URLs on the Eurobasket website. If not provided, it defaults to the URLs fetched by getTeamUrlMap.
  • Return Type: Stadiums: An object containing all the fetched arena data.

Description:

The getArenas function is responsible for retrieving arena information (names, capacities, images, locations) associated with basketball teams from Eurobasket webpages. It uses web scraping techniques to extract relevant details and populate a Stadiums object.

Steps:

  1. Input Validation:

    • Checks if teams and teamUrlMap are provided. If not, fetches them using getTeams and getTeamUrlMap respectively.
  2. HTTP Requests & Data Extraction:

    • Iterates over each team's URL in the teamUrlMap.
    • Sends HTTP requests to retrieve webpage content for each team.
    • Parses the HTML response to extract arena-related details such as stadium name, capacity, image path, and address.
    • Uses regex (arenaPattern) to extract arena-specific data from the webpage.
  3. Populating Stadiums Object:

    • Creates Stadium objects using the extracted data and adds them to the Stadiums collection.
    • Assigns the team ID to each stadium based on matching team names.
  4. Error Handling & Logging:

    • Handles exceptions gracefully and logs relevant messages during the data retrieval process.
  5. Return:

    • Returns the populated Stadiums object containing all the fetched arena data associated with basketball teams.
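
As a rough illustration of the regex step, the sketch below pulls an arena name and capacity out of a line of text. The source names arenaPattern but does not show it, so the pattern, the "Home Court: Name (capacity)" input format, and the names ArenaInfo, arenaPatternSketch, and parseArena are all assumptions.

```kotlin
// Hypothetical result type; the real Stadium class has more fields (image, location).
data class ArenaInfo(val name: String, val capacity: Int)

// Assumed text format: "Home Court: <name> (<capacity>)".
val arenaPatternSketch = Regex("""Home Court:\s*(.+?)\s*\((\d+)\)""")

// Returns null when the text does not contain a recognisable arena entry.
fun parseArena(text: String): ArenaInfo? {
    val match = arenaPatternSketch.find(text) ?: return null
    val (name, capacity) = match.destructured
    val parsedCapacity = capacity.toIntOrNull() ?: return null
    return ArenaInfo(name, parsedCapacity)
}
```

Returning null instead of throwing lets the caller skip a team whose page has no arena entry and continue with the rest.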

Example:

fun main() {
    // Fetch the basketball teams and their arenas from the Eurobasket website
    val teams = getTeams()
    val arenas = getArenas(teams)

    // Display the fetched arenas
    println("Fetched Arenas:")
    arenas.forEachIndexed { index, arena ->
        println("${index + 1}. ${arena.name} - Capacity: ${arena.capacity}, Location: ${arena.location}")
    }
}

getStandings(teams: Teams): Standings

Function Overview:

  • Description: This Kotlin function getStandings retrieves basketball team standings from the Eurobasket website.

  • Parameters:

    • teams (Optional): A Teams object containing basketball team data. If not provided, it defaults to the teams fetched by getTeams.
  • Return Type: Standings: An object containing all the fetched standings data.

Description:

The getStandings function is responsible for fetching basketball team standings from the Eurobasket webpage and populating a Standings object with the retrieved data.

Steps:

  1. HTTP Requests & Data Extraction:

    • Sends an HTTP request to retrieve the webpage content containing team standings.
    • Parses the HTML response to extract team standings data such as team name, place, games played, wins, losses, goals scored, and goals conceded.
  2. Populating Standings Object:

    • Creates BasketballStanding objects using the extracted data and adds them to the Standings collection.
    • Matches the team names to their corresponding IDs from the provided Teams object.
  3. Error Handling & Logging:

    • Handles exceptions gracefully and logs relevant messages during the data retrieval process.
  4. Return:

    • Returns the populated Standings object containing all the fetched standings data.
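
The name-to-ID matching in step 2 could look roughly like the following. This is a sketch under stated assumptions: TeamRef, StandingRow, and buildStandings are hypothetical names, each row is assumed to arrive as the cells place, team name, wins, losses, and names are compared case-insensitively after trimming.

```kotlin
// Hypothetical, simplified stand-ins for the scraper's team and standing types.
data class TeamRef(val id: String, val name: String)
data class StandingRow(val place: Int, val team: String, val wins: Int, val losses: Int)

// Converts raw table rows into standings, replacing each team name with its ID.
// Rows with too few cells, unknown teams, or non-numeric values are skipped.
fun buildStandings(rows: List<List<String>>, teams: List<TeamRef>): List<StandingRow> {
    val idByName = teams.associate { it.name.trim().lowercase() to it.id }
    return rows.mapNotNull { cells ->
        if (cells.size < 4) return@mapNotNull null
        val teamId = idByName[cells[1].trim().lowercase()] ?: return@mapNotNull null
        val place = cells[0].toIntOrNull() ?: return@mapNotNull null
        val wins = cells[2].toIntOrNull() ?: return@mapNotNull null
        val losses = cells[3].toIntOrNull() ?: return@mapNotNull null
        StandingRow(place, teamId, wins, losses)
    }
}
```

Building the lookup map once keeps the matching linear in the number of rows instead of searching the team list for every row.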

Example:

fun main() {
    // Fetch the basketball team standings from the Eurobasket website
    val teams = getTeams()
    val standings = getStandings(teams = teams)

    // Display the fetched standings
    println("Fetched Standings:")
    standings.forEachIndexed { index, standing ->
        val teamName = teams.find { it.id == standing.team }?.name ?: "Unknown"
        println("${index + 1}. $teamName - Place: ${standing.place}, Wins: ${standing.wins}, Losses: ${standing.losses}")
    }
}

getMatches(teams: Teams, arenas: Stadiums): Matches

Function Overview:

  • Description: This Kotlin function getMatches retrieves basketball match data from the Eurobasket website.

  • Parameters:

    • teams (Optional): A Teams object containing basketball team data. If not provided, it defaults to the teams fetched by getTeams.
    • arenas (Optional): A Stadiums object containing arena data. If not provided, it defaults to the arenas fetched by getArenas.
  • Return Type: Matches: An object containing all the fetched match data.

Description:

The getMatches function is responsible for fetching basketball match data from the Eurobasket webpage and populating a Matches object with the retrieved data.

Steps:

  1. Data Preparation:

    • Sets up necessary data structures such as dateMap to handle date parsing.
    • Initializes date formatters and regex patterns for parsing dates and match details.
  2. HTTP Requests & Data Extraction:

    • Sends an HTTP request to retrieve the webpage content containing basketball match data.
    • Parses the HTML response to extract match data such as match date, home team, away team, result, etc.
  3. Populating Matches Object:

    • Creates Match objects using the extracted data and adds them to the Matches collection.
    • Matches team and arena names to their corresponding IDs from the provided Teams and Stadiums objects.
  4. Error Handling & Logging:

    • Handles exceptions gracefully and logs relevant messages during the data retrieval process.
  5. Return:

    • Returns the populated Matches object containing all the fetched match data.
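
The date-handling part of step 1 can be illustrated as follows. The source mentions dateMap without describing its contents, so this sketch assumes it maps English month abbreviations to month numbers and that dates appear in a form like "Oct. 7, 2023". The names dateMapSketch, matchDatePattern, and parseMatchDate are hypothetical.

```kotlin
import java.time.LocalDate

// Assumed contents: month abbreviation -> month number.
val dateMapSketch = mapOf(
    "Jan" to 1, "Feb" to 2, "Mar" to 3, "Apr" to 4, "May" to 5, "Jun" to 6,
    "Jul" to 7, "Aug" to 8, "Sep" to 9, "Oct" to 10, "Nov" to 11, "Dec" to 12
)

// Assumed format: three-letter month, optional dot, day, optional comma, four-digit year.
val matchDatePattern = Regex("""(\w{3})\.?\s*(\d{1,2}),?\s*(\d{4})""")

// Returns null for text without a date, an unknown month, or an impossible calendar date.
fun parseMatchDate(text: String): LocalDate? {
    val (month, day, year) = matchDatePattern.find(text)?.destructured ?: return null
    val monthNumber = dateMapSketch[month] ?: return null
    return runCatching { LocalDate.of(year.toInt(), monthNumber, day.toInt()) }.getOrNull()
}
```

Wrapping LocalDate.of in runCatching turns an invalid date such as February 30 into null rather than an exception, which fits the logging-and-continue style described in step 4.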

Example:

fun main() {
    // Fetch the basketball matches from the Eurobasket website
    val teams = getTeams()
    val matches = getMatches(teams = teams)

    // Display the fetched matches
    println("Fetched Matches:")
    matches.forEachIndexed { index, match ->
        val homeTeam = teams.find { it.id == match.home }?.name ?: "Unknown"
        val awayTeam = teams.find { it.id == match.away }?.name ?: "Unknown"
        println("${index + 1}. ${homeTeam} vs ${awayTeam} - Date: ${match.date}, Result: ${match.score}")
    }
}

saveAllData(fileType: FileType)

Function Overview:

  • Description: This Kotlin function saveAllData fetches various basketball-related data from the Eurobasket website and saves it in a specified file format.

  • Parameters:

    • fileType (Optional): A FileType enum indicating the format in which to save the data. Defaults to JSON.
  • Return Type: Unit: The function does not return a value; it saves data to files.

Description:

The saveAllData function is responsible for fetching and saving all relevant basketball data (teams, coaches, arenas, standings, and matches) from the Eurobasket website into files of a specified format.

Steps:

  1. Fetching Data:

    • Retrieves a map of team names to their corresponding URLs using getTeamUrlMap.
    • Fetches basketball teams, arenas, standings, and matches using helper functions (getTeams, getArenas, getStandings, getMatches).
  2. Saving Data:

    • Depending on the specified fileType (JSON, XML, CSV), saves each type of data (teams, standings, arenas, matches) to corresponding files.
  3. Error Handling:

    • Handles any exceptions that might occur during data fetching or saving.
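
Dispatching on fileType is a natural fit for a Kotlin when expression. The sketch below assumes the FileType enum has exactly the three values named above. The functions renderTeamNames and outputFileName are hypothetical, and the rendering is deliberately naive (no escaping, team names only) to keep the example short. It returns strings rather than writing files.

```kotlin
// Enum values taken from the formats listed above; the real enum may differ.
enum class FileType { JSON, XML, CSV }

// Renders a list of team names in the requested format (no escaping; illustration only).
fun renderTeamNames(names: List<String>, fileType: FileType = FileType.JSON): String =
    when (fileType) {
        FileType.JSON -> names.joinToString(prefix = "[", postfix = "]") { "\"$it\"" }
        FileType.XML -> names.joinToString(
            separator = "", prefix = "<teams>", postfix = "</teams>"
        ) { "<team>$it</team>" }
        FileType.CSV -> (listOf("name") + names).joinToString("\n")
    }

// Derives a file name such as "teams.json" from the dataset name and the format.
fun outputFileName(dataset: String, fileType: FileType): String =
    "$dataset.${fileType.name.lowercase()}"
```

Because when over an enum must be exhaustive when used as an expression, adding a new FileType value causes a compile error until every renderer handles it.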

Example:

fun main() {
    // Save all basketball-related data from Eurobasket to JSON files
    saveAllData(FileType.JSON)

    println("Data saved successfully!")
}