Metadata API
The Metadata API can be accessed via our GraphQL API.
How to access our GraphQL API
The Serlo GraphQL API is a powerful tool that allows you to fetch data of our educational resources. It is accessible via a POST request to https://api.serlo.org/graphql. The body of the request should contain a GraphQL query as described in the official GraphQL documentation.
To test and refine your queries, you can use our interactive GraphQL playground at https://api.serlo.org/___graphql. This tool provides a user-friendly interface for building queries, exploring the schema, and viewing the returned data.
Understanding the Request Payload and Pagination
The MetadataQuery endpoint is part of the GraphQL API and allows you to fetch metadata about our educational resources and about Serlo itself. It provides a few fields for querying data: resources, publisher, and version (the current version of the API).
query {
  metadata {
    resources(first: 10, instance: "de", modifiedAfter: "2023-05-17T00:00:00Z") {
      nodes
      pageInfo {
        hasNextPage
        endCursor
      }
    }
    publisher
    version
  }
}
The most important one is resources, which serves the actual metadata of our educational content. It is structured as follows:
type Resources {
  first: Int!
  after: String
  instance: Instance
  modifiedAfter: String
}
Here's a detailed explanation of each field:
- first: This is the number of records you want to fetch in one request. The maximum value for this field is 1000 at the time of writing. This field is crucial for implementing pagination in your queries. By adjusting the first parameter, you can control the number of records fetched in each request.
- after: This is an optional field. If provided, the API will return records with an id higher than the one specified by the after property. This value is also known as a cursor, and in conjunction with first it lets you implement so-called cursor-based pagination. For example, to fetch the first 10 records, you would set first to 10. To fetch the next 10 records, you would leave first unchanged and set after to the cursor of the last record you fetched (see the endCursor parameter of the pageInfo property).
- instance: This is also an optional field. If provided, the API will return records for the specified instance. The instance field is of type Instance as defined in our GraphQL schema. Specify one of "de", "en", "es", "ta", "hi", or "fr".
- modifiedAfter: This is another optional field. If provided, the API will return records that were modified after the specified date and time. The string must be in the ISO 8601 datetime format YYYY-MM-DDTHH:MM:SSZ.
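As a minimal sketch of how these arguments could be assembled on the client side, the following Python helpers format a timestamp for modifiedAfter and build the variables object. The helper names to_modified_after and build_variables are hypothetical and not part of the API; the limit of 1000 is the value quoted above and may change.

```python
from datetime import datetime, timezone

# Limit quoted in this documentation "at the time of writing"; it may change.
MAX_FIRST = 1000


def to_modified_after(moment: datetime) -> str:
    """Format a datetime as YYYY-MM-DDTHH:MM:SSZ (UTC, second precision)."""
    if moment.tzinfo is None:
        # Assumption: naive datetimes are already expressed in UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_variables(first, after=None, instance=None, modified_after=None):
    """Build the GraphQL variables, leaving out optional arguments that are unset."""
    if not 1 <= first <= MAX_FIRST:
        raise ValueError(f"first must be between 1 and {MAX_FIRST}")
    variables = {"first": first}
    if after is not None:
        variables["after"] = after
    if instance is not None:
        variables["instance"] = instance
    if modified_after is not None:
        variables["modifiedAfter"] = to_modified_after(modified_after)
    return variables
```

For example, build_variables(10, instance="de", modified_after=datetime(2023, 5, 17)) produces the same variables as the query shown earlier in this section.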
The GraphQL API returns a pageInfo object that helps you navigate through the data. It has the following interface:
type HasNextPageInfo {
  hasNextPage: Boolean!
  endCursor: String
}
This object tells you whether there are more records to fetch and provides a cursor to the last fetched record (endCursor). You can use this cursor as the after parameter in your next query to continue fetching records from where you left off. This way, you can efficiently paginate through large sets of data while respecting the limit of first and without overloading the server or the client.
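The pagination loop described above could be sketched in Python as follows. This is an illustration under assumptions, not an official client: the names post_graphql and iter_resources are hypothetical, the request uses only the standard library, and the transport is passed in as a parameter so the loop can be exercised without network access.

```python
import json
import urllib.request

API_URL = "https://api.serlo.org/graphql"

QUERY = """
query ($first: Int, $after: String, $instance: Instance, $modifiedAfter: String) {
  metadata {
    resources(first: $first, after: $after, instance: $instance, modifiedAfter: $modifiedAfter) {
      nodes
      pageInfo {
        hasNextPage
        endCursor
      }
    }
  }
}
"""


def post_graphql(query, variables):
    """Send one GraphQL request and return the decoded JSON body."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    request = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_resources(first=100, instance=None, modified_after=None, post=post_graphql):
    """Yield every record, following endCursor until hasNextPage is false."""
    after = None
    while True:
        candidates = {
            "first": first,
            "after": after,
            "instance": instance,
            "modifiedAfter": modified_after,
        }
        # Optional arguments that are unset are left out of the request.
        variables = {key: value for key, value in candidates.items() if value is not None}
        payload = post(QUERY, variables)
        if payload.get("errors"):
            raise RuntimeError(f"GraphQL errors: {payload['errors']}")
        resources = payload["data"]["metadata"]["resources"]
        yield from resources["nodes"]
        page_info = resources["pageInfo"]
        if not page_info["hasNextPage"]:
            break
        after = page_info["endCursor"]
```

Because iter_resources is a generator, a consumer can process records page by page, for instance with `for record in iter_resources(first=500, instance="de"): ...`, without holding the full result set in memory.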
Querying the API
You can query the API using a GraphQL query. Here's an example of how you can do it using curl:
curl --location 'https://api.serlo.org/graphql' \
--header 'Content-Type: application/json' \
--data '{ '\
' "query": "query($first: Int, $after: String, '\
' $instance: Instance, $modifiedAfter: String) { '\
' metadata { '\
' resources(first: $first, after: $after, '\
' instance: $instance, '\
' modifiedAfter: $modifiedAfter) { '\
' nodes '\
' pageInfo { '\
' hasNextPage '\
' endCursor '\
' } '\
' } '\
' } '\
' }", '\
' "variables": { '\
' "first": 10, '\
' "after": "NjI4Mw==", '\
' "instance": "de", '\
' "modifiedAfter": "2023-05-17T00:00:00Z" '\
' } '\
'}'
In this example, we're fetching the first 10 records after the record with ID NjI4Mw== for the de instance that were modified after 2023-05-17T00:00:00Z.
Querying in Node.js
Note that the following code example requires Node.js version 18 or greater to have the fetch function available in the global scope without third-party packages. If you're on a lower Node.js version, consider using node-fetch or similar libraries. Here's how you can query the API:
const gql = require("graphql-tag");
const { print } = require("graphql/language/printer");

const query = gql`
  query ($first: Int, $instance: Instance, $modifiedAfter: String) {
    metadata {
      resources(
        first: $first
        instance: $instance
        modifiedAfter: $modifiedAfter
      ) {
        nodes
      }
      publisher
      version
    }
  }
`;

const variables = {
  first: 10,
  instance: "de",
  modifiedAfter: "2023-05-17T00:00:00Z",
};

// Top-level await is not available in CommonJS modules (which use require),
// so the request runs inside an async function.
async function main() {
  const response = await fetch("https://api.serlo.org/graphql", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
    },
    // print() turns the parsed query document back into a query string.
    body: JSON.stringify({
      query: print(query),
      variables,
    }),
  });
  const metadata = await response.json();
  console.log(metadata);
}

main().catch(console.error);
Querying in Python
The following shows an example of what a request in Python could look like:
import requests

payload = {
    "first": 10,
}

response = requests.post(
    "https://api.serlo.org/graphql",
    headers={"Content-Type": "application/json"},
    json={
        "query": """
            query($first: Int) {
                metadata {
                    resources(first: $first) {
                        nodes
                    }
                }
            }
        """,
        "variables": {
            "first": payload["first"],
        },
    },
)

print(response.json())
Tips for API consumers
- Start with a small value for first to test your query and then gradually increase it.
- Use the after field to paginate through the records.
- Use the instance and modifiedAfter fields to filter the records.
- Ensure you don't have duplicated records. Use the provided id for each record to identify the resources.
- Store the date of your last query and have a cron job run every few weeks with modifiedAfter set to that date. That way, you'll only get entities that have changed or were added since your last request.
- Always check the API response for errors. If there's an error, the API will return an errors field in the response.
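Two of these tips, checking for errors and avoiding duplicates, can be sketched in a few lines of Python. The function names unwrap_nodes and merge_by_id are hypothetical, and the in-memory dictionary stands in for whatever local store you use.

```python
def unwrap_nodes(payload):
    """Return the list of records, or raise if the response has an errors field."""
    if payload.get("errors"):
        raise RuntimeError(f"GraphQL errors: {payload['errors']}")
    return payload["data"]["metadata"]["resources"]["nodes"]


def merge_by_id(store, records):
    """Insert or overwrite records in store, keyed by their id.

    Returns the number of ids that were not in the store before.
    """
    added = 0
    for record in records:
        if record["id"] not in store:
            added += 1
        # A record seen again (for example after a modification) replaces
        # the stored copy instead of creating a duplicate.
        store[record["id"]] = record
    return added
```

Keying the store by id means that re-running a query with an overlapping modifiedAfter window updates existing entries rather than duplicating them.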
Metadata format and response
As we wanted to base the Metadata API on a good standard that can be widely adopted by the education community, we helped draft the "Allgemeines Metadatenprofil für Bildungsressourcen" AMB standard. It translates to "general metadata profile for educational resources" and is based on schema.org and JSON-LD. In summary, it is a metadata standard for learning resources and designed to provide a structured, universal way to describe and categorize learning resources, making them easier to find, parse and use. The following is a brief summary and description of each property in the AMB standard and schema.org that we are returning.
@context This property provides the context for interpreting the JSON-LD document. It includes the language, vocabulary, and definitions for the type and id properties.
id This property is a unique identifier for the learning resource. It is a URL that points to the resource.
type This property specifies the type of the learning resource. It can be an array of types. Its vocabulary is defined by LearningResource and the classes of CreativeWork (https://schema.org/CreativeWork) from schema.org.
creator This property describes the creator(s) of the learning resource. Each creator is an object with properties for id, name, type, and affiliation. The affiliation is another object that describes the organization the creator is affiliated with (containing id, type and name).
dateCreated This property specifies the date the learning resource was created.
dateModified This property specifies the date the learning resource was last modified.
headline This schema.org property provides a headline/title for the learning resource.
identifier This schema.org property provides an additional identifier for the learning resource. It is an object with properties for type, propertyID, and value.
isAccessibleForFree This property indicates whether the learning resource is accessible for free. For Serlo content, this will always be true!
isFamilyFriendly This property indicates whether the learning resource is family-friendly. For Serlo content, this will also always be true.
inLanguage This property specifies the language(s) of the learning resource as an array.
learningResourceType This property describes the type of learning resource. It is an array of objects, each with an id property that points to a definition of the resource type. The vocabulary for it is defined in the OpenEduHub resource type.
license This property describes the license under which the learning resource is distributed.
mainEntityOfPage This property contains a description of our metadata, with information about the publisher of the metadata (Serlo Education e.V.) and when it was created.
maintainer This property describes the maintainer of the learning resource. In our API, it always has the same structure and content as the affiliation of the creator field seen above.
name This property provides a name for the learning resource.
isPartOf This property describes the larger resource(s) that the learning resource is part of. It is an array of objects, each with an id property that points to the larger resource.
publisher This property describes the publisher(s) of the learning resource. Each publisher is an object with properties for id, type, and name. In our API, it always has the same structure and content as the affiliation of the creator field seen above.
version This property provides a version identifier for the learning resource.
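To illustrate how a consumer might read these properties, here is a small Python sketch that reduces one resource record to a flat summary. The function name summarize_resource and the keys of the returned dictionary are hypothetical choices for this example; only the property names on the input side come from the description above.

```python
def summarize_resource(resource):
    """Pick a few commonly needed properties out of one resource record."""
    types = resource.get("type", [])
    if isinstance(types, str):
        # type "can be an array of types", so a single string is normalized.
        types = [types]
    license_info = resource.get("license") or {}
    return {
        "id": resource["id"],
        "title": resource.get("headline") or resource.get("name"),
        "types": list(types),
        "languages": list(resource.get("inLanguage", [])),
        "license": license_info.get("id"),
        "creators": [creator.get("name") for creator in resource.get("creator", [])],
        "parents": [parent["id"] for parent in resource.get("isPartOf", [])],
    }
```

Optional properties are read with .get so that a record lacking one of them still yields a summary instead of raising a KeyError.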
Sample response
The following shows a complete example of what the response of a query for an article on Addition could look like.
{
  "@context": [
    "https://w3id.org/kim/amb/context.jsonld",
    {
      "@language": "de",
      "@vocab": "http://schema.org/",
      "type": "@type",
      "id": "@id"
    }
  ],
  "id": "https://serlo.org/1495",
  "type": [ "LearningResource", "Article" ],
  "creator": [
    {
      "id": "https://serlo.org/324",
      "name": "122d486a",
      "type": "Person",
      "affiliation": {
        "id": "https://serlo.org/#organization",
        "type": "Organization",
        "name": "Serlo Education e.V."
      }
    },
    {
      "id": "https://serlo.org/15491",
      "name": "125f4a84",
      "type": "Person",
      "affiliation": {
        "id": "https://serlo.org/#organization",
        "type": "Organization",
        "name": "Serlo Education e.V."
      }
    },
    {
      "id": "https://serlo.org/22573",
      "name": "12600e93",
      "type": "Person",
      "affiliation": {
        "id": "https://serlo.org/#organization",
        "type": "Organization",
        "name": "Serlo Education e.V."
      }
    },
    {
      "id": "https://serlo.org/1",
      "name": "admin",
      "type": "Person",
      "affiliation": {
        "id": "https://serlo.org/#organization",
        "type": "Organization",
        "name": "Serlo Education e.V."
      }
    },
    {
      "id": "https://serlo.org/6",
      "name": "12297c72",
      "type": "Person",
      "affiliation": {
        "id": "https://serlo.org/#organization",
        "type": "Organization",
        "name": "Serlo Education e.V."
      }
    },
    {
      "id": "https://serlo.org/677",
      "name": "124902c9",
      "type": "Person",
      "affiliation": {
        "id": "https://serlo.org/#organization",
        "type": "Organization",
        "name": "Serlo Education e.V."
      }
    },
    {
      "id": "https://serlo.org/15473",
      "name": "125f3e12",
      "type": "Person",
      "affiliation": {
        "id": "https://serlo.org/#organization",
        "type": "Organization",
        "name": "Serlo Education e.V."
      }
    },
    {
      "id": "https://serlo.org/15478",
      "name": "125f467c",
      "type": "Person",
      "affiliation": {
        "id": "https://serlo.org/#organization",
        "type": "Organization",
        "name": "Serlo Education e.V."
      }
    },
    {
      "id": "https://serlo.org/27689",
      "name": "1268a3e2",
      "type": "Person",
      "affiliation": {
        "id": "https://serlo.org/#organization",
        "type": "Organization",
        "name": "Serlo Education e.V."
      }
    }
  ],
  "dateCreated": "2014-03-01T20:36:44+00:00",
  "dateModified": "2014-10-31T15:56:50+00:00",
  "headline": "Addition",
  "identifier": {
    "type": "PropertyValue",
    "propertyID": "UUID",
    "value": 1495
  },
  "isAccessibleForFree": true,
  "isFamilyFriendly": true,
  "inLanguage": [ "de" ],
  "learningResourceType": [
    { "id": "http://w3id.org/openeduhub/vocabs/learningResourceType/text" },
    { "id": "http://w3id.org/openeduhub/vocabs/learningResourceType/worksheet" },
    { "id": "http://w3id.org/openeduhub/vocabs/learningResourceType/course" },
    { "id": "http://w3id.org/openeduhub/vocabs/learningResourceType/web_page" },
    { "id": "http://w3id.org/openeduhub/vocabs/learningResourceType/wiki" }
  ],
  "license": { "id": "https://creativecommons.org/licenses/by-sa/4.0/" },
  "mainEntityOfPage": [
    {
      "id": "https://serlo.org/metadata-api",
      "provider": {
        "id": "https://serlo.org/#organization",
        "type": "Organization",
        "name": "Serlo Education e.V."
      }
    }
  ],
  "maintainer": {
    "id": "https://serlo.org/#organization",
    "type": "Organization",
    "name": "Serlo Education e.V."
  },
  "name": "Addition",
  "isPartOf": [
    { "id": "https://serlo.org/1292" },
    { "id": "https://serlo.org/16072" },
    { "id": "https://serlo.org/16174" },
    { "id": "https://serlo.org/33119" },
    { "id": "https://serlo.org/34743" },
    { "id": "https://serlo.org/34744" }
  ],
  "publisher": [
    {
      "id": "https://serlo.org/#organization",
      "type": "Organization",
      "name": "Serlo Education e.V."
    }
  ],
  "version": { "id": "https://serlo.org/32614" }
}
License: CC BY-SA 4.0
The metadata API uses the CC BY-SA 4.0 license. Note that the content itself has its own license (which is also CC BY-SA in most cases), which can be accessed via the license property. This is a human-readable summary of the license:
You are free to
- Share — copy and redistribute the metadata in any medium or format
- Adapt — remix, transform, and build upon the metadata for any purpose, even commercially.
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms
- Attribution — You must give appropriate credit to Serlo Education e.V., provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
For API consumers, this means that you can use and adapt the data from the API for any purpose, including commercial purposes, as long as you provide appropriate credit and distribute your contributions under the same license. You also cannot apply any additional legal or technological restrictions that would prevent others from doing anything the license permits.
Publisher API Documentation
Overview
The Publisher API is a part of the Metadata API that provides information about the publisher of the content. You can use it to retrieve information about Serlo.
Querying the Publisher API
To query the Publisher API endpoint, use the publisher field like this:
query {
  metadata {
    publisher
  }
}
This query will return an object with metadata about us. You can see the example response below.
Response
The response from the Publisher API looks like the following:
{
  "@context": [
    "https://w3id.org/kim/lrmi-profile/draft/context.jsonld",
    { "@language": "de" }
  ],
  "id": "https://serlo.org/",
  "type": ["EducationalOrganization", "NGO"],
  "name": "Serlo Education e.V.",
  "alternateName": "Serlo",
  "url": "https://de.serlo.org/",
  "description": "Serlo.org bietet einfache Erklärungen, Kurse, Lernvideos, Übungen und Musterlösungen mit denen Schüler*innen und Studierende nach ihrem eigenen Bedarf und in ihrem eigenen Tempo lernen können. Die Lernplattform ist komplett kostenlos und werbefrei.",
  "image": "https://assets.serlo.org/5ce4082185f5d_5df93b32a2e2cb8a0363e2e2ab3ce4f79d444d11.jpg",
  "logo": "https://de.serlo.org/_assets/img/serlo-logo.svg",
  "address": {
    "type": "PostalAddress",
    "streetAddress": "Daiserstraße 15 (RGB)",
    "postalCode": "81371",
    "addressLocality": "München",
    "addressRegion": "Bayern",
    "addressCountry": "Germany"
  },
  "email": "[email protected]"
}
Handling Deleted Data
When data is deleted from our API, it can impact the state of your local data store or application. To ensure your application data remains consistent with the API, it's crucial to handle these deletions appropriately.
Refetching and Reconciliation
If data is deleted from the API, the current recommended approach is to refetch all data and reconcile it with your local database. This process involves comparing the newly fetched data with the data in your database and making necessary updates.
Here's a general outline of the steps you'd have to perform:
- Fetch All Data: Make a request to the API to fetch all available data. This data represents the current state of the API after the deletions. You can use a transformation to just keep the entity ids in memory and discard all the other data we are serving.
- Compare with Local Data: Compare the fetched data with the data in your local database. For each item in your database, check if it exists in the fetched data.
- Handle Deletions: If an item in your database does not exist in the fetched data, it means that the item has been deleted in the API. You should then delete this item from your database to keep it in sync with our API.
- Update Database: After all deletions have been handled, don't forget to update your database with the newly fetched data or make an extra request utilizing the modifiedAfter parameter. This will ensure that your database has parity with the current state of our API.
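The comparison step in the outline above boils down to a set difference over entity ids. The following Python sketch shows one way to compute it; the function name plan_sync and the shape of its return value are assumptions made for this example.

```python
def plan_sync(local_ids, remote_ids):
    """Compare locally stored ids with the ids currently served by the API.

    Returns the ids to delete locally (no longer served) and the ids that
    are served but not stored yet.
    """
    local = set(local_ids)
    remote = set(remote_ids)
    return {
        "delete": sorted(local - remote),
        "add": sorted(remote - local),
    }
```

Only the ids need to be kept in memory for this step, which matches the suggestion to discard all other data while fetching. Records that exist on both sides may still have changed, which is what the extra request with modifiedAfter covers.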
Future outlook
We are evaluating other alternatives to this somewhat cumbersome process of handling entity deletions. We are considering having a distinct API to fetch individual entities based on their id, or an easy way to list recently deleted entities.
Please get in contact with us and let us know if this is something of interest to you and your team. You can find the email to reach us on the metadata website.
Changelog
Changelog 1.0.0
Breaking Changes
Our Metadata API has undergone significant changes from the old version to version 1.0.0. Here's a summary of these changes:
Changes to the GraphQL payload
The property entities was renamed to resources and is the primary way to fetch metadata from our GraphQL API.
Changes to the @context Property
- The previous context "https://w3id.org/kim/lrmi-profile/draft/context.jsonld" has been replaced by "https://w3id.org/kim/amb/context.jsonld".
- Alongside the existing "@language": "de", three additional attributes have been introduced:
  - "@vocab": "http://schema.org/"
  - "type": "@type"
  - "id": "@id"
Changes to Entity Descriptions
- The description property is no longer universally available. From now on, it will only be present in entities where a description exists.
Introduction of the creator Property
- A new creator property has been added, representing an array of objects. Each object in this array corresponds to a different author and includes an id, name, type, and affiliation. The affiliation object always refers to the Serlo organization, represented as follows:
{
  "id": "https://serlo.org/organization",
  "type": "Organization",
  "name": "Serlo Education e.V."
}
Changes to the learningResourceType Property
- The learningResourceType property, previously a string, is now an array of objects. Each object has an id property. The value of this property maps to a vocabulary term defined in the AMB standard.
Changes to the maintainer and publisher Properties
- The maintainer and publisher properties have been expanded from simple strings to objects with id, type, and name fields. Both now link to the Serlo organization.
Changes to the version Property
- The version property has transitioned from a simple string to an object containing an id field. This change supports serving and versioning the most current revision of an entity.
Other Changes
Introduction of the about Property
- The new about property is an array of subjects the resource belongs to.
Introduction of the isPartOf Property
- The new isPartOf property is an array of objects. Each object includes an id property, which is a URL pointing to the taxonomy of the entity.
Introduction of the mainEntityOfPage Property
- A new mainEntityOfPage property has been added, which includes an array of objects. Each object in the array contains an id and a provider property. The id links to "https://serlo.org/metadata", while the provider links to the Serlo organization.
Changelog 1.1.0
- Add type of WebContent for mainEntityOfPage (see the discussion in https://github.com/dini-ag-kim/amb/issues/218 of AMB).
- Add the vocabulary of https://vocabs.openeduhub.de/w3id.org/openeduhub/vocabs/new_lrt/index.html to learningResourceType.
Changelog 1.2.0
- Add property image to all resources, pointing to a thumbnail for the learning resource (based on the subject).
- Fix a bug where, for some CC BY-SA resources, the URL of the original author was returned as the license URL. In those cases the API now correctly returns a URL pointing to the CC license, and the URL of the original author has been added to the list of creators.
- Links in learningResourceType to the deprecated vocabulary https://vocabs.openeduhub.de/w3id.org/openeduhub/vocabs/learningResourceType/index.html have been deleted.
Changelog 1.3.0
- We have added some additional filters to exclude some content, such as pages from our documentation or articles under construction, from our Metadata API.
Changelog 2.0.0
Breaking Changes
- We deleted the edges property in the return type of resources.
- We renamed EntityMetadataConnection to ResourceMetadataConnection (this is the return type of resources).