Query Setup. YouTube - strohne/Facepager GitHub Wiki
In the Getting Started section you find a quick introduction. For detailed information on specific options, have a look at the reference of the API provided by YouTube: https://developers.google.com/youtube/v3/docs/ The documentation may be confusing at first. Look out for methods called "list". You will want to work with these methods most of the time.
The starting points for working with the YouTube API usually are users, channels, playlists or videos:
- The difference between a user and a channel can be confusing at first sight. In the browser they look the same but have different URLs. For example, the URL for the user Oscars is "https://www.youtube.com/user/Oscars". The channel of Oscars has the URL "https://www.youtube.com/channel/UCb-vZWBeWA5Q2818JmmJiqQ".
  Channels belong to users, but you can't query user data directly. If you want to fetch basic information about a user, you have to use the resource "channels" and specify the parameter "forUsername".
  Please note: This screenshot was taken on 25 June 2018. If you try this out yourself, keep in mind that APIs change constantly, so this approach may be outdated by the time you read this text.
  If you start with the channel ID, you can use the preset "Channel information".
- Every channel contains playlists such as “Foreign Language Oscar Winners” (https://www.youtube.com/playlist?list=PLJ8RjvesnvDMgHc17AJU0Pgw3vnLKQnpM). Every channel has at least an 'uploads' playlist containing the videos listed on the channel page. You find the playlist ID in the channel data. To get the videos in a playlist, you fetch playlist items and inspect the result.
- Videos can be contained within playlists and can be identified from playlist items. Every video has an ID. The playlist items contain the ID in the key 'contentDetails.videoId'. When you browse the web, the ID is usually the last part of a video URL; for example, the video "https://www.youtube.com/watch?v=nvExda62GxI" has the ID "nvExda62GxI".
  Sometimes the ID is found in the middle of the URL; look at the v parameter: https://www.youtube.com/watch?v=nvExda62GxI&list=PLJ8RjvesnvDMgHc17AJU0Pgw3vnLKQnpM.
- Videos have comments. The first level of comments is called a comment thread and can be requested from the commentThreads endpoint. If you request data for a comment thread, you get the top-level comments and the first replies to each of the top-level comments.
- Replies are comments on comments and can be fetched using the comments endpoint.
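If you collect video URLs from the browser, the ID in the v parameter can be read programmatically instead of by eye. The following is a minimal sketch using only the Python standard library; the function name `extract_video_id` is made up for this illustration and is not part of Facepager.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the value of the v parameter of a YouTube watch URL, or None."""
    query = parse_qs(urlparse(url).query)
    values = query.get("v")
    return values[0] if values else None


# Both URL forms mentioned above yield the same ID.
print(extract_video_id("https://www.youtube.com/watch?v=nvExda62GxI"))
print(extract_video_id(
    "https://www.youtube.com/watch?v=nvExda62GxI&list=PLJ8RjvesnvDMgHc17AJU0Pgw3vnLKQnpM"
))
```

The resulting IDs can then be added as nodes in Facepager.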
Thus, a typical pipeline for collecting comments would be:
- Fetch channel information.
- Fetch playlist items in the uploads playlist. You find the ID of the uploads playlist in the channel data (<contentDetails.relatedPlaylists.uploads>).
- Fetch details for the videos referenced in the playlist items. You find the ID of the video in the key <contentDetails.videoId>.
- Fetch top-level comments, also known as comment threads. Again, you need the video ID as a starting node.
- Fetch replies.
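To make the pipeline more concrete, the sketch below builds the request URL for each step outside of Facepager. It sends no requests. The helper names (`build_url`, `get_path`), the shortened sample responses, the placeholder IDs and the chosen values of the part parameter are assumptions made for this illustration; check the API reference for the parts you actually need.

```python
from urllib.parse import urlencode

BASE_PATH = "https://www.googleapis.com/youtube/v3"


def build_url(resource, **params):
    """Combine base path, resource and parameters into a request URL."""
    return f"{BASE_PATH}/{resource}?{urlencode(params)}"


def get_path(data, dotted_key):
    """Follow a dotted key such as 'contentDetails.videoId' through nested dicts."""
    for part in dotted_key.split("."):
        data = data[part]
    return data


# Hypothetical, heavily shortened response items (placeholder IDs).
channel_item = {"contentDetails": {"relatedPlaylists": {"uploads": "UPLOADS_PLAYLIST_ID"}}}
playlist_item = {"contentDetails": {"videoId": "nvExda62GxI"}}

uploads_id = get_path(channel_item, "contentDetails.relatedPlaylists.uploads")
video_id = get_path(playlist_item, "contentDetails.videoId")

steps = [
    # 1. Channel information, starting from a channel ID.
    build_url("channels", part="contentDetails", id="UCb-vZWBeWA5Q2818JmmJiqQ"),
    # 2. Playlist items of the uploads playlist.
    build_url("playlistItems", part="contentDetails", playlistId=uploads_id),
    # 3. Details for a video referenced in the playlist items.
    build_url("videos", part="snippet,statistics", id=video_id),
    # 4. Top-level comments (comment threads) of the video.
    build_url("commentThreads", part="snippet,replies", videoId=video_id),
    # 5. Replies to one comment thread (placeholder parent ID).
    build_url("comments", part="snippet", parentId="COMMENT_THREAD_ID"),
]

# Alternative first step when only the user name is known.
user_lookup = build_url("channels", part="contentDetails", forUsername="Oscars")

for url in steps:
    print(url)
```

In Facepager, each step corresponds to one query on the nodes returned by the previous step.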
Query settings | Explanation |
---|---|
Base path | The Base path should match the current version of the API. So check the documentation regularly. In case the API is updated, change the path, e. g. from https://www.googleapis.com/youtube/v2.9 to https://www.googleapis.com/youtube/v3 . |
Resource | The most basic resource is "search" to get a list of results for a specific search term. |
Parameters | Some parameters are mandatory, depending on the selected resource. For example, you have to provide the parameter q when searching videos. Simply add search terms as nodes and set the parameter to <Object ID> . One of the most common parameters is the part parameter. With this parameter you say which data fields you want to get back. See the documentation of the specific resource, e.g. regarding channels see https://developers.google.com/youtube/v3/docs/channels/list#parameters. |
Access Token | Crawling data from YouTube requires a login with your user credentials. Before starting your data collection, you have to complete the login via the corresponding button. A login page directly served by Google will open and once the login is successful, Facepager gets an Access Token. With this Access Token Facepager acts on behalf of the logged in user. Note: The Access Token is stored locally on your computer. No personal data is submitted to the developers or any other authority. |
Scope | For some endpoints, Facepager needs permission to manage your account. For example, if you want to get comments, you should add "https://www.googleapis.com/auth/youtube.force-ssl" and log in to Google again. |
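As an illustration of how base path, resource and parameters fit together, the following sketch assembles search requests for a list of search terms, roughly what happens when each node's Object ID is inserted into the q parameter. The function name `search_url`, the example terms and the extra `type` parameter are assumptions chosen for this example.

```python
from urllib.parse import urlencode

BASE_PATH = "https://www.googleapis.com/youtube/v3"
RESOURCE = "search"


def search_url(term):
    """Build a search request; the search term plays the role of <Object ID>."""
    params = {"part": "snippet", "q": term, "type": "video"}
    return f"{BASE_PATH}/{RESOURCE}?{urlencode(params)}"


# Example search terms, one per node.
for term in ["oscars", "foreign language film"]:
    print(search_url(term))
```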
Be aware that the number of requests is limited. The more data you request, the faster you run into the rate limit, see https://developers.google.com/youtube/v3/getting-started#quota. You can register your own app at Google; see the first steps in Getting Started with Google Cloud Platform. Just add the YouTube Data API to your app. Instead of configuring OAuth2, you can use an API key (clear the access token field and manually define the key parameter).
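The difference between the two ways of authenticating can be sketched as follows: an API key travels as the key parameter in the URL, while an OAuth2 access token is usually sent in an Authorization header. The function `build_request` and the credential values are placeholders invented for this example; the code only assembles the request and does not contact Google.

```python
from urllib.parse import urlencode

BASE_PATH = "https://www.googleapis.com/youtube/v3"


def build_request(resource, params, api_key=None, access_token=None):
    """Return (url, headers) using either an API key or an access token."""
    params = dict(params)  # copy so the caller's dict is not modified
    headers = {}
    if api_key:
        # API key: passed as the "key" query parameter.
        params["key"] = api_key
    elif access_token:
        # OAuth2: token sent as a bearer token in the request header.
        headers["Authorization"] = f"Bearer {access_token}"
    else:
        raise ValueError("Provide either api_key or access_token")
    return f"{BASE_PATH}/{resource}?{urlencode(params)}", headers


url, headers = build_request(
    "videos", {"part": "snippet", "id": "nvExda62GxI"}, api_key="MY_API_KEY"
)
print(url)
print(headers)
```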
Login to Google: If the error 404 occurs while trying to log in, add a YouTube channel to your Google account and try again.
Please read the following terms of service and policies from YouTube to get more information about:
- the developer policies: https://developers.google.com/youtube/terms/developer-policies
- the YouTube API services terms of service (here you have to select your region: American, Asia-Pacific (APAC), European-Middle East (EMEA) and Russia): https://developers.google.com/youtube/terms/api-services-terms-of-service
- the content manager policies: https://support.google.com/youtube/answer/9142671
- the community guidelines: https://www.youtube.com/howyoutubeworks/policies/community-guidelines/#community-guidelines
Searching for more answers and explanations? Just read our FAQ.