Demonstration Script - OCLC-Developer-Network/devconnect_2018_wikibase GitHub Wiki
This script was used in the OCLC DevConnect Online webinar to demonstrate how to install and launch Wikibase and related applications using Docker. Steps 1 through 5 were not part of the demonstration, but are included in these notes for those who want to replicate the process in the Amazon EC2 cloud and/or are looking for guidance on installing Docker, git, and Python.
1. Create an Amazon Web Service (AWS) Elastic Compute Cloud (EC2) instance
- Use this image: Amazon Linux AMI 2018.03.0 (HVM), SSD Volume Type - ami-0b59bfac6be064b78.
- Configure it as a “t2.large” instance (to get sufficient memory for Wikibase and the Query engine), with the default 8GB of storage (which should be plenty until working with larger datasets)
- In the instance’s security group, enable SSH port 22, and Custom http ports 8181 and 8282 (the Wikibase and SPARQL Query UIs), for access from the address ranges you want to support
Connect to your EC2 instance via SSH with a private key, and log on with the username ec2-user.
The following steps should work with any Linux or Mac machine, and may work in a virtual Linux machine under Windows.
2. Install Docker
- Update installed packages with
sudo yum update -y
- Install the most recent Docker Community Edition package with
sudo yum install -y docker
- Start the Docker service with
sudo service docker start
- Add the ec2-user to the docker group so you can execute Docker commands without using sudo with
sudo usermod -a -G docker ec2-user
- Log out and log back in again to pick up the new docker group permissions.
- Verify that the ec2-user can run Docker commands without sudo, with
docker info
3. Install docker-compose
sudo curl -L "https://github.com/docker/compose/releases/download/1.22.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
4. Install git
sudo yum install git
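Before moving on, it can be handy to confirm that the tools from steps 2 through 4 are actually on your PATH. The following is a minimal sketch, assuming Python 3 is already available (shutil.which does not exist in Python 2); the helper name missing_tools is ours, not part of any of these tools.

```python
import shutil

# Command names installed in steps 2 through 4 of this walkthrough.
REQUIRED_TOOLS = ("docker", "docker-compose", "git")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("docker, docker-compose, and git are all on the PATH")
```

This only checks that the commands can be found; it does not verify that the Docker service is running or that ec2-user has picked up the docker group.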
5. Install Python
Pywikibot, which this demonstration uses for batch loading data to Wikibase, is a Python library. If a supported version of Python (2.7.4+ or 3.4+; check your version with python --version) isn't yet available on your machine, download and install it.
Pywikibot depends on the Python "requests" package. If you are running Python 2.7.9+ or 3.4+, pip should already be available, and you can install the package with pip install requests.
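If you'd rather check both requirements in one go, here is a small sketch that should run on either Python 2 or Python 3. The function names are our own, and the version thresholds are simply the ones quoted above.

```python
import sys


def python_supported(version):
    """Return True if `version` (a tuple such as sys.version_info) meets the
    requirement quoted above: 2.7.4+ on Python 2, or 3.4+ otherwise."""
    major, minor, micro = version[0], version[1], version[2]
    if major == 2:
        return (minor, micro) >= (7, 4)
    return (major, minor) >= (3, 4)


def has_requests():
    """Return True if the "requests" package can be imported."""
    try:
        import requests  # noqa: F401
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    print("Python version OK: {}".format(python_supported(sys.version_info)))
    print("requests installed: {}".format(has_requests()))
```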
6. Clone the Wikibase docker image repository
cd ~
git clone https://github.com/wmde/wikibase-docker.git
Change to the directory of the Wikibase docker image repo, and start up the wikibase-docker images using docker-compose:
cd wikibase-docker
docker-compose up
If all goes well, the Wikibase UI should be available at your EC2 site’s IP address and public DNS on port 8181, and the SPARQL UI should be running on port 8282. An updater script should be checking for Wikibase data changes every 10 seconds, synchronizing data in the SPARQL server’s Blazegraph triplestore. Go to your IP address (or localhost if running on a notebook) with those two ports and see if Wikibase is up and running.
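The same check can be done from the command line instead of a browser. This is a rough sketch, assuming Python 3 and the two ports used in this walkthrough; service_urls and is_up are hypothetical helper names, and "up" here only means that something answered an HTTP request on that port.

```python
import urllib.error
import urllib.request

WIKIBASE_PORT = 8181  # Wikibase UI port used in this walkthrough
SPARQL_PORT = 8282    # SPARQL Query UI port used in this walkthrough


def service_urls(host="localhost"):
    """Build the two URLs to probe for a given host name or IP address."""
    return {
        "wikibase": "http://{}:{}/".format(host, WIKIBASE_PORT),
        "sparql": "http://{}:{}/".format(host, SPARQL_PORT),
    }


def is_up(url, timeout=5):
    """Return True if an HTTP request to `url` gets any HTTP response."""
    try:
        urllib.request.urlopen(url, timeout=timeout).close()
        return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or host not resolvable.
        return False
```

For example, calling is_up(url) for each value in service_urls("localhost") should return True for both once docker-compose has finished starting the containers, which can take a minute or two.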
7. Install Pywikibot
Pywikibot is a Python library that provides support for creating and editing entity data in Wikibase. Pywikibot Documentation
Create a "pywikibot" directory for Pywikibot, change to that directory, and clone the Pywikibot core repository there. Then change to the cloned "core" directory for the next step of configuring Pywikibot.
cd ~
mkdir pywikibot
cd pywikibot
git clone --recursive https://gerrit.wikimedia.org/r/pywikibot/core.git
cd core
8. Configuring Pywikibot
Generate a definition for our Wikibase "family"
Pywikibot is designed to work with any one of a number of "families" of wikis. Out of the box, it's configured to work with Wikipedia, Wikidata, and assorted other wikis. We're launching our own brand new wiki, which it doesn't yet know about. So we'll want to create a family configuration. Run this script:
python pwb.py generate_family_file.py
and supply a URL for your Wikibase host (since we're running Pywikibot on the same machine as our Wikibase host, http://localhost:8181/w/ will work) and supply the family name devnetdemo. This will create a file called devnetdemo_family.py in the pywikibot/core/pywikibot/families path.
We want to make a small addition to that family file definition, to tell Pywikibot how to handle adding statements for geo-locations. It needs to know the URL for a "globe" to associate the coordinates with. We'll just add a way for it to understand the "earth" globe, for now. To make this change, navigate to the directory where the family files are located (cd ~/pywikibot/core/pywikibot/families), open devnetdemo_family.py, add the following method to the end of the file, indented so that it belongs to the family class defined there, and save your changes:
    def globes(self, code):
        return {'earth': 'http://www.wikidata.org/entity/Q2'}
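If you expect to regenerate the family file more than once, the edit can be scripted. The sketch below is one way to do it, not something Pywikibot provides: it assumes the generated file ends with the body of its family class, with methods indented four spaces, and the helper names add_globes and patch_family_file are ours.

```python
# Text of the method to append. ASSUMPTION: methods of the generated
# family class are indented four spaces and the class is last in the file.
GLOBES_METHOD = (
    "\n"
    "    def globes(self, code):\n"
    "        return {'earth': 'http://www.wikidata.org/entity/Q2'}\n"
)


def add_globes(source):
    """Return `source` with the globes method appended, unless one exists."""
    if "def globes(" in source:
        return source
    if not source.endswith("\n"):
        source += "\n"
    return source + GLOBES_METHOD


def patch_family_file(path):
    """Patch the family file at `path` in place; return True if it changed."""
    with open(path) as handle:
        original = handle.read()
    patched = add_globes(original)
    if patched != original:
        with open(path, "w") as handle:
            handle.write(patched)
    return patched != original


# Example call (path taken from the step above):
# patch_family_file("/home/ec2-user/pywikibot/core/pywikibot/families/devnetdemo_family.py")
```

Running it twice is harmless, since the second run sees the existing method and leaves the file alone.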
Generate a user configuration, including a bot password
Generate a user-config.py file for your new wiki family by entering the command
python pwb.py generate_user_files.py
You should see your new family name in the prompted list of families. Select it, select the default language ('en'), enter "Admin" as your username (Wikibase is pre-configured with that username and a default password), and respond with "No" for the prompt for adding more projects. Respond with "Yes" for the prompt "Do you want to add a BotPassword for Admin?". Enter "bot" as the bot name. You'll then be prompted for a bot password. That needs to be created using the Wikibase user interface.
Here's what that interaction looks like:
Select family of sites we are working on, just enter the number or name (default: wikipedia): 2
The only known language: en
The language code of the site we're working on (default: en):
Username on en:devnetdemo: Admin
Do you want to add any other projects? ([y]es, [N]o): n
Do you want to add a BotPassword for Admin? ([y]es, [N]o, [q]uit): y
See https://www.mediawiki.org/wiki/Manual:Pywikibot/BotPasswords to
know how to get codes.Please note that plain text in
/home/ec2-user/pywikibot/core/user-password.py and anyone with read
access to that directory will be able read the file.
BotPassword's "bot name" for Admin: bot
BotPassword's "password" for "bot" (no characters will be shown):
The Wikibase site is automatically configured with a username (Admin) and password (adminpass). Log in to the Wikibase UI with those credentials.
To create a bot password, find the "Special pages" link on the left-hand side of the Wikibase screen, select that, and scroll down to its "Users and Rights" section, and select the link to "Bot passwords". Enter the bot name "bot", and click "Create". Check the boxes for "High-volume editing", "Edit existing pages", and "Create, edit, and move pages", then click Create.
You should see a new bot password, something like "rer5mtkvfvhui36ata36ac7nrmfaclrn". Copy that to the clipboard, and then back at the command line where you're setting up the user-config.py file, paste in that value as the bot password.
There is another change to make to the user-config.py file, and unfortunately it needs to be made manually. Open it up with a text editor. Look for the line that reads put_throttle = 10 and change it to put_throttle = 0. The setting for "put_throttle" is the number of seconds Pywikibot will wait between commands sent to its target wiki. So as not to swamp other shared systems like Wikidata, it is set by default to 10 seconds. But for our instance we can send commands without pausing, so we set it to 0 seconds.
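For anyone who would rather not open an editor, the same change can be sketched in a few lines of Python. This is our own helper, not a Pywikibot feature: it rewrites an uncommented put_throttle assignment if it finds one, and otherwise appends one to the end of the text (user-config.py is itself Python, so a later assignment takes effect).

```python
import re

# Matches an uncommented "put_throttle = <number>" at the start of a line.
_THROTTLE = re.compile(
    r"^([ \t]*put_throttle[ \t]*=[ \t]*)\d+(?:\.\d+)?", re.MULTILINE
)


def set_put_throttle(config_text, seconds=0):
    """Return `config_text` with put_throttle set to `seconds`."""
    new_text, count = _THROTTLE.subn(
        lambda match: match.group(1) + str(seconds), config_text
    )
    if count == 0:
        # No existing assignment: add one at the end of the file.
        if new_text and not new_text.endswith("\n"):
            new_text += "\n"
        new_text += "put_throttle = {}\n".format(seconds)
    return new_text
```

To apply it, read ~/pywikibot/core/user-config.py into a string, pass it through set_put_throttle, and write the result back to the same file.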
9. Loading sample data for properties and items
For this presentation, a git repository includes a Python script that reads in pre-assembled data for Wikibase items and properties from tab-delimited text files, and uses Pywikibot to create the entities in our Wikibase instance. By convention, user-developed scripts that interact with Pywikibot are stored in its core/scripts/userscripts path. These steps will clone the git repository, navigate to its "app" directory, and copy its data and python script files to that pywikibot path:
cd ~
git clone https://github.com/OCLC-Developer-Network/devconnect_2018_wikibase.git
cd devconnect_2018_wikibase/app
cp * ~/pywikibot/core/scripts/userscripts/.
From the command line in the pywikibot/core directory, load the item and property data into your Wikibase with these commands:
cd ~/pywikibot/core
python pwb.py scripts/userscripts/load.py
(If you named your wikibase's pywikibot family something other than "devnetdemo", you'll have to hunt down that line in the load.py file and update it. Sorry about that!)
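To give a feel for what a loader of this kind does, here is a simplified sketch. It is not the load.py from the repository: the "label" and "description" column names and both function names are hypothetical, and it only creates bare items. The Pywikibot calls it relies on (Site, data_repository, ItemPage, editEntity) are part of Pywikibot, which is imported inside the function so that the parsing half runs without it.

```python
import csv


def read_rows(handle):
    """Yield one dict per data row of a tab-delimited file with a header row."""
    for row in csv.DictReader(handle, delimiter="\t"):
        yield dict(row)


def create_items(rows, family="devnetdemo", lang="en"):
    """Create one Wikibase item per row and return the saved ItemPage objects.

    ASSUMPTION: each row has hypothetical "label" and "description" columns.
    """
    import pywikibot  # imported here so read_rows works without Pywikibot

    site = pywikibot.Site(lang, family)
    repo = site.data_repository()
    created = []
    for row in rows:
        item = pywikibot.ItemPage(repo)  # no id given: saved as a new item
        data = {
            "labels": {lang: row["label"]},
            "descriptions": {lang: row["description"]},
        }
        item.editEntity(data, summary="Load sample item")
        created.append(item)
    return created
```

A script like this would be run the same way as load.py, through pwb.py from the pywikibot/core directory, so that it picks up user-config.py and the bot password.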
10. Testing the SPARQL Query Service
After data has been successfully loaded, you can use the SPARQL Query Service running on port 8282 to query the structured data in Wikibase. (An updater script has been watching as Wikibase changes are made, and importing that data into its triplestore.)
For example, to see the subclass/class relationships in the initial set of items that have been added, try a SPARQL query like:
SELECT ?sLabel ?oLabel WHERE {
?s wdt:P8 ?o.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
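The query can also be sent from a script rather than the query UI. The sketch below uses only the Python 3 standard library and the standard SPARQL protocol (a GET request with a query parameter). The endpoint path is an assumption about how the query UI on port 8282 proxies requests to the query service, so check it against your own setup. It also assumes the endpoint already knows the wdt:, wikibase:, and bd: prefixes; if it does not, PREFIX lines would need to be added to the query.

```python
import json
import urllib.parse
import urllib.request

# ASSUMPTION: proxy path exposed by the query UI container; verify locally.
ENDPOINT = "http://localhost:8282/proxy/wdqs/bigdata/namespace/wdq/sparql"

QUERY = """
SELECT ?sLabel ?oLabel WHERE {
  ?s wdt:P8 ?o.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request asking for SPARQL results in JSON."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def label_pairs(results):
    """Extract (sLabel, oLabel) pairs from a SPARQL JSON results document."""
    return [
        (binding["sLabel"]["value"], binding["oLabel"]["value"])
        for binding in results["results"]["bindings"]
    ]


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return the decoded JSON results."""
    with urllib.request.urlopen(build_request(query, endpoint)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the containers running, label_pairs(run_query(QUERY)) should give one pair per subclass/class statement. Note that the query UI normally fills in [AUTO_LANGUAGE] in the browser; sent from a script it is passed through unchanged, so labels fall back to "en".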
11. Customizing Wikibase
There are many ways to customize the Wikibase user experience, by modifying its settings, changing the default user interface "skin", and adding other Wikibase extensions. The default configuration is set in a file called LocalSettings.php.template, found in the wikibase-docker/wikibase/1.30/base path. When the wikibase docker image is started, that file is used to create /var/www/html/LocalSettings.php in the docker container.
To override the default and make modifications to the LocalSettings.php file, copy wikibase-docker/wikibase/1.30/base/LocalSettings.php.template to your wikibase-docker path (the path that includes docker-compose.yml).
cd ~/wikibase-docker
cp wikibase/1.30/base/LocalSettings.php.template .
Then edit the docker-compose.yml file, and in the services:wikibase: section, change the volumes: settings to override the container's LocalSettings.php.template with your local file system copy, adding a new entry after the existing volume entry for mediawiki-images-data:
volumes:
- mediawiki-images-data:/var/www/html/images
- ./LocalSettings.php.template:/LocalSettings.php.template
Now, modify your local copy of LocalSettings.php.template to adjust its settings for who can edit the site. By default, anyone can edit and create accounts, without being logged on. We'll change that by adding these two new lines to the local settings, after the default Site Settings, around line 38:
${DOLLAR}wgGroupPermissions['*']['edit'] = false;
${DOLLAR}wgGroupPermissions['*']['createaccount'] = false;
The wikibase container will need to be restarted for this change to take effect. This command should do the trick:
docker-compose up --no-deps -d wikibase
After Wikibase restarts, without logging in try to create a new item or account. The system should respond to those attempts with an error message, indicating that you do not have permissions.
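The result can also be confirmed without clicking through the UI, by asking the MediaWiki API which rights an anonymous (not logged in) client has. This is a sketch, assuming Python 3 and that api.php is served under the /w path on port 8181; the function names are our own.

```python
import json
import urllib.request

# ASSUMPTION: the MediaWiki API lives under the /w script path.
API_URL = "http://localhost:8181/w/api.php"


def rights_url(api_url=API_URL):
    """URL of a userinfo query that lists the caller's rights as JSON."""
    return api_url + "?action=query&meta=userinfo&uiprop=rights&format=json"


def anonymous_rights(api_url=API_URL):
    """Fetch the set of rights granted to a client that is not logged in."""
    with urllib.request.urlopen(rights_url(api_url)) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return set(payload["query"]["userinfo"]["rights"])


def is_locked_down(rights):
    """True if neither editing nor account creation is permitted."""
    return "edit" not in rights and "createaccount" not in rights
```

After the restart, is_locked_down(anonymous_rights()) should return True; before the change it should return False, since anonymous users start out with both rights.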
Related Resources
View other resources describing Wikibase and the Docker deployment.