Exploring online data - abraker95/ultimate_osu_analyzer GitHub Wiki

In this tutorial we will explore the data that WebBeatmapset, WebBeatmap, and WebScore objects contain. We will also learn how to gather specific information. For instance, you can collect all plays that are below a certain accuracy or combo, search for a specific player, and much more.


JSON to Object Converter

The first thing that needs to be said about these objects is that they are generated from JSON data. They are extended from the base class located in misc/util/json_obj.py which converts JSON data into a Python object, albeit shallowly. This means that JSON data such as:

{
    "var1" : "string",
    "var2" : 1234,
    "var3" : [ 2, 3, 4, 5, 6 ],
    "var4" : {
         "subvar1" : "more string",
         "subvar2" : 12345
    }
}

is converted into an object that would be equivalent to:

class Obj():
    def __init__(self):
        self.var1 = "string"
        self.var2 = 1234
        self.var3 = [ 2, 3, 4, 5, 6 ]
        # Nested JSON objects are not converted; they stay plain dictionaries
        self.var4 = {
            "subvar1" : "more string",
            "subvar2" : 12345
        }
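To make the idea of a shallow conversion concrete, here is a minimal sketch. The class name JsonObjSketch is hypothetical and is not the actual code in misc/util/json_obj.py; it only assumes that top-level JSON keys become attributes while nested values are left as they are.

```python
import json


class JsonObjSketch:
    """Hypothetical stand-in for the base class in misc/util/json_obj.py."""

    def __init__(self, data):
        # Copy only the top-level keys onto the instance.
        # Nested dictionaries and lists are stored unchanged (shallow conversion).
        self.__dict__.update(data)


raw = '''
{
    "var1" : "string",
    "var2" : 1234,
    "var3" : [ 2, 3, 4, 5, 6 ],
    "var4" : {
         "subvar1" : "more string",
         "subvar2" : 12345
    }
}
'''

obj = JsonObjSketch(json.loads(raw))

print(obj.var1)             # top-level keys are read as attributes
print(obj.var4['subvar1'])  # nested objects are read as dictionaries
```

Under this assumption, nested fields are accessed with dictionary syntax rather than attribute syntax.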

The WebBeatmapset, WebBeatmap, and WebScore objects we will be talking about are located in osu/online/structs/web_structs.py.

WebBeatmapset

The function from the first tutorial, CmdOnline.get_latest_beatmapsets(gamemode), returns a list of 50 WebBeatmapset objects. The following is a sample of WebBeatmapset data in JSON format:

{
    "id"              : 950955,
    "title"           : "Cristalisia",
    "artist"          : "onoken",
    "play_count"      : 2538,
    "favourite_count" : 7,
    "has_favourited"  : false,
    "submitted_date"  : "2019-04-05T09:26:52+00:00",
    "last_updated"    : "2019-05-20T18:39:11+00:00",
    "ranked_date"     : "2019-05-28T11:20:01+00:00",
    "creator"         : "Mir",
    "user_id"         : 8688812,
    "bpm"             : 134,
    "source"          : "Cytus II",
    "covers" : 
    {
        "cover"       : "https:\/\/assets.ppy.sh\/beatmaps\/950955\/covers\/cover.jpg?1558377565",
        "cover@2x"    : "https:\/\/assets.ppy.sh\/beatmaps\/950955\/covers\/cover@2x.jpg?1558377565",
        "card"        : "https:\/\/assets.ppy.sh\/beatmaps\/950955\/covers\/card.jpg?1558377565",
        "card@2x"     : "https:\/\/assets.ppy.sh\/beatmaps\/950955\/covers\/card@2x.jpg?1558377565",
        "list"        : "https:\/\/assets.ppy.sh\/beatmaps\/950955\/covers\/list.jpg?1558377565",
        "list@2x"     : "https:\/\/assets.ppy.sh\/beatmaps\/950955\/covers\/list@2x.jpg?1558377565",
        "slimcover"   : "https:\/\/assets.ppy.sh\/beatmaps\/950955\/covers\/slimcover.jpg?1558377565",
        "slimcover@2x": "https:\/\/assets.ppy.sh\/beatmaps\/950955\/covers\/slimcover@2x.jpg?1558377565"
    },
    "preview_url"       : "\/\/b.ppy.sh\/preview\/950955.mp3",
    "tags"              : "testimony 2 deemo",
    "video"             : false,
    "storyboard"        : false,
    "ranked"            : 1,
    "status"            : "ranked",
    "has_scores"        : true,
    "discussion_enabled": true,
    "discussion_locked" : false,
    "can_be_hyped"      : false,
    "hype": 
    {
        "current"  : 5,
        "required" : 5
    },
    "nominations": 
    {
        "current"  : 2,
        "required" : 2
    },
    "legacy_thread_url" : "https:\/\/osu.ppy.sh\/forum\/t\/890463",
    "beatmaps" : [ 
    {
        "id"                : 1985946,
        "beatmapset_id"     : 950955,
        "mode"              : "osu",
        "mode_int"          : 0,
        "convert"           : null,
        "difficulty_rating" : 5.21877,
        "version"           : "Fragments",
        "total_length"      : 127,
        "hit_length"        : 126,
        "cs"                : 4,
        "drain"             : 5,
        "accuracy"          : 8.5,
        "ar"                : 9.2,
        "playcount"         : 873,
        "passcount"         : 90,
        "count_circles"     : 467,
        "count_sliders"     : 215,
        "count_spinners"    : 2,
        "count_total"       : 903,
        "last_updated"      : "2019-05-20T18:39:11+00:00",
        "ranked"            : 1,
        "status"            : "ranked",
        "url"               : "https:\/\/osu.ppy.sh\/beatmaps\/1985946",
        "deleted_at"        : null
    }, ... ]
}

We can use this data to filter the beatmapsets we want to look at. For example, beatmapsets with large play counts usually have more than 50 scores. Since we can only get the top 50 scores, only perfect or near-perfect plays are available to us for those maps. It can be hard to analyze which parts of a map are hard if players play them perfectly. So if we want beatmaps with plays where the players are more likely to mess up, we need beatmaps with a lower play count. If we want to get all beatmapsets with a play count below 2000, for example, we can do:

filtered_beatmapsets = [ beatmapset for beatmapset in beatmapsets if beatmapset.play_count < 2000 ]
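The same pattern extends to the other fields in the sample. The sketch below combines several conditions, including a date comparison on ranked_date. The objects are stand-ins built with SimpleNamespace so the snippet runs on its own; every entry except the first is made-up illustration data, and the assumption that ranked_date can be None for unranked sets is mine.

```python
from datetime import datetime, timezone
from types import SimpleNamespace

# Stand-ins for WebBeatmapset objects; only the first entry comes from the sample above
beatmapsets = [
    SimpleNamespace(id=950955, play_count=2538, status='ranked', ranked_date='2019-05-28T11:20:01+00:00'),
    SimpleNamespace(id=1, play_count=1200, status='ranked', ranked_date='2019-05-10T00:00:00+00:00'),
    SimpleNamespace(id=2, play_count=800, status='ranked', ranked_date='2019-04-01T00:00:00+00:00'),
    SimpleNamespace(id=3, play_count=500, status='ranked', ranked_date='2019-05-20T00:00:00+00:00'),
    SimpleNamespace(id=4, play_count=100, status='pending', ranked_date=None),
]

cutoff = datetime(2019, 5, 1, tzinfo=timezone.utc)


def ranked_after(beatmapset, cutoff):
    # Timestamps are ISO 8601 strings with a UTC offset, so they parse into aware datetimes
    if beatmapset.ranked_date is None:
        return False
    return datetime.fromisoformat(beatmapset.ranked_date) >= cutoff


filtered_beatmapsets = [
    beatmapset for beatmapset in beatmapsets
    if beatmapset.status == 'ranked'
    and beatmapset.play_count < 2000
    and ranked_after(beatmapset, cutoff)
]

print([ beatmapset.id for beatmapset in filtered_beatmapsets ])
```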

The WebBeatmapset object has one function, get_beatmaps, which returns the list of beatmaps/difficulties the beatmapset contains. It basically converts the "beatmaps" JSON data into a list of WebBeatmap objects.

WebBeatmap

The following is a sample of WebBeatmap data in JSON format:

{
    "id"                : 1985946,
    "beatmapset_id"     : 950955,
    "mode"              : "osu",
    "mode_int"          : 0,
    "convert"           : null,
    "difficulty_rating" : 5.21877,
    "version"           : "Fragments",
    "total_length"      : 127,
    "hit_length"        : 126,
    "cs"                : 4,
    "drain"             : 5,
    "accuracy"          : 8.5,
    "ar"                : 9.2,
    "playcount"         : 873,
    "passcount"         : 90,
    "count_circles"     : 467,
    "count_sliders"     : 215,
    "count_spinners"    : 2,
    "count_total"       : 903,
    "last_updated"      : "2019-05-20T18:39:11+00:00",
    "ranked"            : 1,
    "status"            : "ranked",
    "url"               : "https:\/\/osu.ppy.sh\/beatmaps\/1985946",
    "deleted_at"        : null
}

The WebBeatmap object has 3 functions: get_beatmap_data, download_beatmap, and get_scores. We used the get_beatmap_data function in the first tutorial. This function returns the contents of the beatmap as you would see them in an *.osu file. Refer to the documentation here for more information about the data in an *.osu file.

The download_beatmap function downloads and saves the beatmap data as an *.osu file in the specified directory. The name and extension of the beatmap file are automatically filled in.

beatmap.download_beatmap('C:\\Users\\abraker\\Documents\\osu_maps')

We used the get_scores function in the last tutorial. This function returns a list of WebScore objects.

A thing to note about the get_beatmap_data and get_scores functions is that they cache the data they fetch to reduce redundant requests to osu! servers. To fetch the latest data and overwrite the cached data, pass refresh=True to the function like so:

beatmap.get_beatmap_data(refresh=True)
beatmap.get_scores(refresh=True)

Another thing to note about those functions is that they are programmatically rate limited. This means the functions will sleep for a certain period between consecutive requests. This is to reduce the probability of error 429, "Too Many Requests". get_beatmap_data and download_beatmap are rate limited to one request per 500 ms, and get_scores is rate limited to one request per 3 seconds. Of course, when cached data is being read, these rate limits don't apply.
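The interaction between caching, refresh=True, and rate limiting can be illustrated with a small sketch. RateLimitedCache and fake_fetch are hypothetical names made up for this example; this is not how the analyzer implements it, and a real limiter would likely be shared across all objects rather than kept per instance. A short 50 ms interval is used so the demo runs quickly.

```python
import time


class RateLimitedCache:
    """Hypothetical helper: caches one fetch result and spaces out real requests."""

    def __init__(self, fetch, min_interval):
        self._fetch = fetch                # function that performs the real request
        self._min_interval = min_interval  # minimum seconds between real requests
        self._last_request = None
        self._cached = None
        self._has_cache = False

    def get(self, refresh=False):
        # Cached data is returned immediately; the rate limit does not apply
        if self._has_cache and not refresh:
            return self._cached

        # Sleep for whatever remains of the interval since the last real request
        if self._last_request is not None:
            remaining = self._min_interval - (time.monotonic() - self._last_request)
            if remaining > 0:
                time.sleep(remaining)

        self._cached = self._fetch()
        self._has_cache = True
        self._last_request = time.monotonic()
        return self._cached


calls = []

def fake_fetch():
    # Stands in for a request to the osu! servers
    calls.append(time.monotonic())
    return 'response %d' % len(calls)


scores = RateLimitedCache(fake_fetch, min_interval=0.05)
first  = scores.get()              # real request
second = scores.get()              # served from cache, no request and no wait
third  = scores.get(refresh=True)  # real request, waits out the interval first
```

Here only the first and third calls reach fake_fetch; the second call returns the cached value without sleeping.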

We can extend the idea of filtering beatmapsets to filtering beatmap difficulties. Say we want all beatmaps above 4 stars from beatmapsets with a play count under 2000. Then we can take the filtered beatmapsets from the previous section and do:

beatmap_lists     = [ beatmapset.get_beatmaps() for beatmapset in filtered_beatmapsets ]
beatmaps          = [ beatmap for beatmap_list in beatmap_lists for beatmap in beatmap_list ]
filtered_beatmaps = [ beatmap for beatmap in beatmaps if beatmap.difficulty_rating > 4.0 ]
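Since the goal is to find maps where players are likely to mess up, the passcount and playcount fields can also be combined into a pass rate. The following sketch flattens the lists with itertools.chain and keeps hard difficulties that few players pass. The objects are SimpleNamespace stand-ins; the 'Fragments' and 'Farewell' values come from the samples in this tutorial, while the other two entries and the 15% threshold are made up for illustration.

```python
from itertools import chain
from types import SimpleNamespace

# Stand-ins for what beatmapset.get_beatmaps() would return for two beatmapsets
beatmap_lists = [
    [
        SimpleNamespace(version='Fragments', difficulty_rating=5.21877, playcount=873, passcount=90),
        SimpleNamespace(version='Hypothetical Easy', difficulty_rating=1.8, playcount=400, passcount=300),
    ],
    [
        SimpleNamespace(version='Farewell', difficulty_rating=4.38393, playcount=1701, passcount=153),
        SimpleNamespace(version='Hypothetical Hard', difficulty_rating=4.5, playcount=200, passcount=150),
    ],
]

# Flatten the list of lists into a single list of beatmaps
beatmaps = list(chain.from_iterable(beatmap_lists))


def pass_rate(beatmap):
    # Fraction of plays that were passes; guard against maps with no plays
    if beatmap.playcount == 0:
        return 0.0
    return beatmap.passcount / beatmap.playcount


hard_beatmaps = [
    beatmap for beatmap in beatmaps
    if beatmap.difficulty_rating > 4.0 and pass_rate(beatmap) < 0.15
]

print([ beatmap.version for beatmap in hard_beatmaps ])
```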

WebScore

The following is a sample of WebScore data in JSON format:

{
    "id"         : 2815500743,
    "user_id"    : 651307,
    "accuracy"   : 0.9880129091747349,
    "mode"       : "osu",
    "mode_int"   : 0,
    "mods"       : [ "NC", ... ],
    "score"      : 19401465,
    "max_combo"  : 977,
    "perfect"    : false,
    "replay"     : true,
    "statistics" : 
    {
         "count_50"   : 0,
         "count_100"  : 13,
         "count_300"  : 710,
         "count_geki" : 88,
         "count_katu" : 10,
         "count_miss" : 0
    },
    "pp"        : 284.966,
    "rank"      : "S",
    "created_at": "2019-05-28T04:23:24+00:00",
    "beatmap" : 
    {
        "id"                : 1964610,
        "beatmapset_id"     : 940681,
        "mode"              : "osu",
        "mode_int"          : 0,
        "convert"           : null,
        "difficulty_rating" : 4.38393,
        "version"           : "Farewell",
        "total_length"      : 271,
        "hit_length"        : 262,
        "cs"                : 4,
        "drain"             : 6,
        "accuracy"          : 7,
        "ar"                : 8,
        "playcount"         : 1701,
        "passcount"         : 153,
        "count_circles"     : 478,
        "count_sliders"     : 245,
        "count_spinners"    : 0,
        "count_total"       : 968,
        "last_updated"      : "2019-05-20T18:45:57+00:00",
        "ranked"            : 1,
        "status"            : "ranked",
        "url"               : "https:\/\/osu.ppy.sh\/beatmaps\/1964610",
        "deleted_at"        : null
    },
    "user" : 
    {
        "id"              : 651307,
        "username"        : "Prophet",
        "profile_colour"  : null,
        "avatar_url"      : "https:\/\/a.ppy.sh\/651307?1501639769.jpg",
        "country_code"    : "CN",
        "default_group"   : "default",
        "is_active"       : true,
        "is_bot"          : false,
        "is_online"       : true,
        "is_supporter"    : true,
        "last_visit"      : "2019-05-28T17:52:00+00:00",
        "pm_friends_only" : false,
        "country" : 
        {
            "code" : "CN",
            "name" : "China"
        }
    }
}
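These fields are enough to collect plays below a certain accuracy or combo, or to search for a specific player. The sketch below uses SimpleNamespace stand-ins for WebScore objects; make_score and the two hypothetical players are made up for illustration. It assumes the nested "statistics" and "user" data stay dictionaries, as the shallow conversion described earlier suggests. It also recomputes accuracy from the hit counts with the standard osu! formula, which reproduces the "accuracy" value in the sample.

```python
from types import SimpleNamespace


def accuracy_from_statistics(stats):
    # osu!standard accuracy: weighted hits divided by 300 points per hit object judged
    hits = stats['count_50'] + stats['count_100'] + stats['count_300'] + stats['count_miss']
    if hits == 0:
        return 0.0
    weighted = 50*stats['count_50'] + 100*stats['count_100'] + 300*stats['count_300']
    return weighted / (300*hits)


def make_score(username, max_combo, count_300, count_100, count_50, count_miss):
    # Hypothetical helper that builds a stand-in for a WebScore object
    stats = {
        'count_300': count_300, 'count_100': count_100,
        'count_50': count_50, 'count_miss': count_miss,
    }
    return SimpleNamespace(
        accuracy=accuracy_from_statistics(stats),
        max_combo=max_combo,
        statistics=stats,
        user={'username': username},
    )


scores = [
    make_score('Prophet', 977, 710, 13, 0, 0),                # values from the sample above
    make_score('hypothetical_player_a', 412, 650, 50, 10, 13),
    make_score('hypothetical_player_b', 820, 700, 20, 3, 0),
]

low_accuracy = [ score for score in scores if score.accuracy < 0.95 ]
short_combo  = [ score for score in scores if score.max_combo < 500 ]
by_player    = [ score for score in scores if score.user['username'] == 'Prophet' ]

print(scores[0].accuracy)
```

If the library does convert the nested "user" data into an object, the last filter would use score.user.username instead.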

The WebScore object has functions for getting replay data and downloading it. There are two ways provided for doing that: one via the API and one via the web. The API method requires you to have your API key filled in. The web method requires you to have your username and password filled in. These can be filled in in the osu/online/login.py file.

In the previous tutorial we used the get_replay_data_web function. This function returns the contents of the replay as you would see them in an *.osr file. Refer to the documentation here for more information about the data in an *.osr file. The get_replay_data_api function does the same thing; however, I've noticed that replays are sometimes not available when using it. Both functions cache the replay data after fetching.

The download_replay_web and download_replay_api functions download and save the replay data as an *.osr file in the specified directory. The name and extension of the replay file are automatically filled in.

score.download_replay_web('C:\\Users\\abraker\\Documents\\osu_replays')
score.download_replay_api('C:\\Users\\abraker\\Documents\\osu_replays')

Getting replays via the API and via the web is rate limited to one request per 3 seconds.