Getting Started with Pipelines - strohne/Facepager GitHub Wiki


Data collection often involves multiple steps. For example, on platforms such as YouTube, you may first collect channel information, then the videos, and finally the comments. Some APIs even follow standards such as Hydra and explicitly implement multi-stage collection processes. In Hydra, you first fetch collections that contain URLs, then you follow those URLs to get the detail data.

The pipeline feature of Facepager supports multi-stage data collection by grouping together multiple presets that run one after another.
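
The two-stage pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Facepager's implementation: the `FAKE_API` dictionary, the `example.org` URLs, the `member` key, and the `fetch` and `collect` functions are all hypothetical stand-ins, and no network request is made.

```python
# Hypothetical stand-in for an API: maps a URL to the JSON it would return.
FAKE_API = {
    "https://example.org/items": {
        "member": [
            "https://example.org/items/1",
            "https://example.org/items/2",
        ]
    },
    "https://example.org/items/1": {"title": "First item"},
    "https://example.org/items/2": {"title": "Second item"},
}


def fetch(url):
    """Return the canned response for a URL (no network access)."""
    return FAKE_API[url]


def collect(collection_url):
    """Stage 1: fetch the collection. Stage 2: follow each member URL."""
    collection = fetch(collection_url)
    return [fetch(member_url) for member_url in collection["member"]]


details = collect("https://example.org/items")
print(details)
```

In Facepager, each stage corresponds to a preset, and the pipeline runs the presets in order so that you do not have to chain them by hand.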

  1. Create a database: click New Database in the Menu Bar.
  2. Add nodes: Click Add Nodes in the Menu Bar and add a year, for example "1851", as a seed node. Select the node.
  3. Set up query: Open the presets and in the category "Knowledge Graph" click on the pipeline "Culture Knowledge Graph". This is a pipeline because it groups two presets together.
  4. Fetch data: To run multiple presets one after another, you start fetching data directly from the preset window. Click the "Run pipeline" button.
  5. Inspect data: Expand your node or click Expand nodes in the Menu Bar to open all nodes. Select one of the new child nodes. The raw data is shown in the Data View to the right. Note how the results are nested: your seed node on the first level, results from the first preset on the second level and the data collected using the second preset on the last level.
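
The nesting described in the last step can be pictured as a small tree with three levels. The following is a minimal sketch under stated assumptions: the `Node` class, `run_pipeline`, `depth`, and the two preset functions are hypothetical and return made-up names. They do not reflect Facepager's internal data model or the data that the "Culture Knowledge Graph" pipeline returns.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node in the result tree (hypothetical, for illustration only)."""
    name: str
    children: list = field(default_factory=list)


def run_pipeline(seed, presets):
    """Apply each preset to the nodes created by the previous preset."""
    current_level = [seed]
    for preset in presets:
        next_level = []
        for node in current_level:
            for child_name in preset(node.name):
                child = Node(child_name)
                node.children.append(child)
                next_level.append(child)
        current_level = next_level
    return seed


def depth(node):
    """Number of levels in the tree, counting the node itself."""
    return 1 + max((depth(child) for child in node.children), default=0)


# Hypothetical presets: each one turns a node name into child names.
def first_preset(year):
    return [f"{year}-item-a", f"{year}-item-b"]


def second_preset(item):
    return [f"{item}-detail"]


tree = run_pipeline(Node("1851"), [first_preset, second_preset])
print(depth(tree))
```

The seed node "1851" sits on the first level, the two children created by `first_preset` on the second, and the children created by `second_preset` on the third.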

What is next?