How to install CKAN 2.7.2 on CentOS 7.4 - ckan/ckan GitHub Wiki

Install python package

The WSGI file created in step 9 needs the paste.deploy and paste.script Python packages. You can verify that they are installed with the following commands:

# python
>>> from paste.deploy import loadapp
>>> from paste.script.util.logging_config import fileConfig

If neither import raises an error, there is nothing more to do. If one does and you have pip installed, install the two packages with the following commands:

pip install PasteDeploy
pip install PasteScript

Hint: paste.deploy and paste.script are separate from the paste package. You can leave the Python interpreter with Ctrl+D or the exit() command.
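The same check can also be scripted instead of typed into the interpreter. The following is only a minimal sketch: the helper name missing_modules is made up for illustration and is not part of CKAN or Paste. It tries to import each module and returns the names that fail.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # The two modules that the WSGI script in step 9 imports.
    required = ["paste.deploy", "paste.script.util.logging_config"]
    print(missing_modules(required) or "all required modules are importable")
```

An empty result means both packages are already available to that Python interpreter.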

Now you can start to install CKAN!

1. Install the required packages

Install and activate the CentOS Release Repository

# yum install centos-release

Update and reboot your system

# yum update
# shutdown -r now

Install wget and policycoreutils-python, which we'll need later.

# yum install wget policycoreutils-python

Install and activate the Extra Packages for Enterprise Linux (EPEL) Repository (it may already be installed):

# rpm -Uvh http://dl.fedoraproject.org/pub/epel/7/x86_64/Packages/e/epel-release-7-11.noarch.rpm

Install the required packages:

# yum install xml-commons git subversion mercurial postgresql-server postgresql-devel \
postgresql python-devel libxslt libxslt-devel libxml2 libxml2-devel python-virtualenv \
gcc gcc-c++ make java-1.7.0-openjdk-devel java-1.7.0-openjdk tomcat tomcat-webapps \
tomcat-admin-webapps xalan-j2 unzip policycoreutils-python mod_wsgi httpd 

2. Install CKAN

First, create a CKAN User. The ckan user is created with a shell of /sbin/nologin and a home directory of /usr/lib/ckan to mirror what is shown in the CKAN Deployment documentation.

# useradd -m -s /sbin/nologin -d /usr/lib/ckan -c "CKAN User" ckan

Open the newly created directory up for read access so that the content will eventually be able to be served out via httpd.

# chmod 755 /usr/lib/ckan

Switch to the ckan user.

# su -s /bin/bash - ckan

Install an isolated Python environment, called default, to host CKAN from.

# virtualenv --no-site-packages default

Activate the newly installed Python environment.

# . default/bin/activate

Install the recommended setuptools version into the activated environment.

(default)# pip install setuptools==36.1

Download and install version 2.7.2 of CKAN.

(default)# pip install --ignore-installed -e git+https://github.com/okfn/ckan.git@ckan-2.7.2#egg=ckan

Download and install the Python modules that CKAN needs into the isolated Python environment.

(default)# pip install --ignore-installed -r default/src/ckan/pip-requirements-docs.txt

Return to the root user with

(default)# exit

or by pressing Ctrl+D.

3. Configure PostgreSQL

NOTE: From version 2.7.0, CKAN requires at least PostgreSQL 9.3. I installed PostgreSQL 9.6, so the commands below use postgresql-9.6 instead of postgresql.

Enable PostgreSQL to start on system boot

# chkconfig postgresql-9.6 on

Initialize the PostgreSQL database

# /usr/pgsql-9.6/bin/postgresql96-setup initdb

Edit /var/lib/pgsql/9.6/data/pg_hba.conf so it will accept passwords for login while still allowing the local postgres user to manage via ident login. The relevant changes to pg_hba.conf are as follows:

local   all         postgres                          ident
local   all         all                                md5
# IPv4 local connections:
host    all         all         127.0.0.1/32           md5
# IPv6 local connections:
host    all         all         ::1/128                md5
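PostgreSQL uses the first pg_hba.conf record that matches a connection, which is why the postgres line must come before the catch-all line. The sketch below illustrates this for local (Unix socket) connections only. It is a deliberately simplified model, not PostgreSQL's actual matching code: the function names are invented for illustration, and host records, comma-separated lists and group names are ignored.

```python
def parse_hba(text):
    """Parse pg_hba.conf text into a list of field lists, skipping comments."""
    records = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            records.append(line.split())
    return records


def local_auth_method(records, database, user):
    """Return the auth method of the first `local` record matching database and user."""
    for fields in records:
        if fields[0] != "local":
            continue
        rec_db, rec_user, method = fields[1], fields[2], fields[3]
        if rec_db in ("all", database) and rec_user in ("all", user):
            return method
    return None


HBA = """
local   all         postgres                          ident
local   all         all                                md5
host    all         all         127.0.0.1/32           md5
host    all         all         ::1/128                md5
"""

records = parse_hba(HBA)
print(local_auth_method(records, "ckan_default", "postgres"))      # ident
print(local_auth_method(records, "ckan_default", "ckan_default"))  # md5
```

If the two local lines were swapped, the postgres user would also be asked for a password.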

Start PostgreSQL

# service postgresql-9.6 start

Switch to postgres user

# su - postgres

List existing databases:

# psql -l

Check that the encoding of the databases is UTF8; if it is not, internationalisation may be a problem. Since changing the encoding of PostgreSQL may mean deleting existing databases, it is suggested that this is fixed before continuing with the CKAN install.

Next you’ll need to create a database user if one doesn’t already exist. Create a new PostgreSQL database user called ckan_default, and enter a password for the user when prompted. You’ll need this password later.

# createuser -S -D -R -P ckan_default

Create a new PostgreSQL database, called ckan_default, owned by the database user you just created.

# createdb -O ckan_default ckan_default -E utf-8

Exit the postgres user environment with Ctrl + D or

# exit

4. Create a CKAN Configuration

Switch back to root user and create a directory to contain the site’s config files:

# mkdir -p /etc/ckan/default
# chown -R ckan /etc/ckan/

Switch to ckan user and create a CKAN config file:

# su -s /bin/bash - ckan
# . default/bin/activate
(default)# cd /usr/lib/ckan/default/src/ckan
(default)# paster make-config ckan /etc/ckan/default/development.ini

Edit the development.ini file in a text editor, changing the following options:

sqlalchemy.url = postgresql://ckan_default:pass@localhost/ckan_default
ckan.site_url = http://default.yourdomain.com
ckan.site_id = default
solr_url = http://127.0.0.1:8983/solr/ckan

Replace pass with the password that you created in 3. Configure PostgreSQL above.
Replace http://default.yourdomain.com with the URL at which your CKAN site will be reached. CKAN uses ckan.site_url when it generates links, for example to uploaded files.
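After editing, it can be handy to confirm that none of the four options was left out or empty. The following is a rough sketch under a few assumptions: it is a standalone Python 3 script (not meant to run inside the Python 2 CKAN virtualenv), it assumes the options live in the [app:main] section of the file, and the name unset_options is made up for illustration.

```python
import configparser

REQUIRED = ("sqlalchemy.url", "ckan.site_url", "ckan.site_id", "solr_url")


def unset_options(ini_text, section="app:main"):
    """Return the required options that are missing or empty in `section`."""
    # interpolation=None keeps literal % signs in values from being expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section(section):
        return list(REQUIRED)
    return [key for key in REQUIRED
            if not parser.get(section, key, fallback="").strip()]


SAMPLE = """
[app:main]
sqlalchemy.url = postgresql://ckan_default:pass@localhost/ckan_default
ckan.site_url = http://default.yourdomain.com
ckan.site_id = default
"""

print(unset_options(SAMPLE))  # ['solr_url']
```

To check the real file, read /etc/ckan/default/development.ini into a string and pass it to unset_options.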

Exit from running as the ckan user with Ctrl+D or exit.

5. Install and Setup Apache SOLR

Download and extract Apache SOLR

# wget http://archive.apache.org/dist/lucene/solr/6.2.1/solr-6.2.1.zip

When the download is finished, unzip the package and go to the bin directory

# unzip solr-6.2.1.zip
# cd solr-6.2.1/bin

As root, create the install directory if it does not exist yet and run install_solr_service.sh, passing the path to the downloaded Solr archive as the first argument. The script accepts options in case the daemon should listen on a different port or run as another user.

Usage: install_solr_service.sh path_to_solr_distribution_archive OPTIONS

  The first argument to the script must be a path to a Solr distribution archive, such as solr-6.2.1.tgz
    (only .tgz or .zip are supported formats for the archive)

  Supported OPTIONS include:

    -d     Directory for live / writable Solr files, such as logs, pid files, and index data; defaults to /var/solr

    -i     Directory to extract the Solr installation archive; defaults to /opt/
             The specified path must exist prior to using this script.

    -p     Port Solr should bind to; default is 8983

    -s     Service name; defaults to solr

    -u     User to own the Solr files and run the Solr process as; defaults to solr
             This script will create the specified user account if it does not exist.

 NOTE: Must be run as the root user

Create and configure the core

Switch to the solr user and go to the bin directory

sudo su solr
cd /opt/solr/bin

Now create the ckan core

./solr create -c ckan

Solr creates all the configuration files and directories. At this point, the core is listed in the Solr admin UI at http://localhost:8983/solr, and we can proceed to edit the configuration files

cd /var/solr/data/ckan/conf

Open solrconfig.xml and insert the following line into the root <config> element:

<schemaFactory class="ClassicIndexSchemaFactory"/>

Delete this element:

<initParams path="/update/**">
  <lst name="defaults">
    <str name="update.chain">add-unknown-fields-to-the-schema</str>
  </lst>
</initParams>

and also delete this element:

<processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
  <str name="defaultFieldType">strings</str>
  <lst name="typeMapping">
    <str name="valueClass">java.lang.Boolean</str>
    <str name="fieldType">booleans</str>
  </lst>
  <lst name="typeMapping">
    <str name="valueClass">java.util.Date</str>
    <str name="fieldType">tdates</str>
  </lst>
  <lst name="typeMapping">
    <str name="valueClass">java.lang.Long</str>
    <str name="valueClass">java.lang.Integer</str>
    <str name="fieldType">tlongs</str>
  </lst>
  <lst name="typeMapping">
    <str name="valueClass">java.lang.Number</str>
    <str name="fieldType">tdoubles</str>
  </lst>
</processor>
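The three solrconfig.xml edits described above can also be applied programmatically. This is only a sketch, assuming Python 3 and a file that parses as plain XML; the function name adapt_solrconfig is invented for illustration. Note that ElementTree drops XML comments when parsing, so work on a copy if you want to keep the comments in the original file.

```python
import xml.etree.ElementTree as ET


def adapt_solrconfig(xml_text):
    """Apply the three solrconfig.xml edits described above to an XML string."""
    root = ET.fromstring(xml_text)
    # ElementTree elements do not know their parent, so build a lookup first.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    for elem in list(root.iter()):
        is_init_params = elem.tag == "initParams" and any(
            s.get("name") == "update.chain"
            and (s.text or "").strip() == "add-unknown-fields-to-the-schema"
            for s in elem.iter("str")
        )
        is_add_fields = (
            elem.tag == "processor"
            and elem.get("class") == "solr.AddSchemaFieldsUpdateProcessorFactory"
        )
        if is_init_params or is_add_fields:
            parent_of[elem].remove(elem)

    # Select the classic, schema.xml-based schema factory.
    factory = root.find("schemaFactory")
    if factory is None:
        factory = ET.SubElement(root, "schemaFactory")
    factory.set("class", "ClassicIndexSchemaFactory")
    return ET.tostring(root, encoding="unicode")
```

Read solrconfig.xml into a string, pass it to adapt_solrconfig, and compare the result with the original before writing it back.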

Next, remove the managed-schema file

rm managed-schema

and copy or symlink the schema.xml from CKAN:

cp /usr/lib/ckan/default/src/ckan/ckan/config/solr/schema.xml .

or

ln -s /usr/lib/ckan/default/src/ckan/ckan/config/solr/schema.xml schema.xml

Finally, restart solr

/etc/init.d/solr restart

or

sudo service solr restart

6. Create the Database Tables

Switch back to running as the ckan user, activate the isolated Python environment, and change to the CKAN source directory.

# su -s /bin/bash - ckan
# . default/bin/activate
(default)# cd default/src/ckan

Initialize the CKAN database.

(default)# paster db init -c /etc/ckan/default/development.ini

You should see the output:

Initialising DB: SUCCESS.

This line should be the only output. If anything else is printed before it, find the error message in that output and fix its cause before continuing.

7. Setup the Datastore (Optional)

Follow the instructions in Setting up the DataStore to create the required databases and users, set the right permissions and set the appropriate values in your CKAN config file.
Note: You'll need to run the paster --plugin=ckan datastore set-permissions -c /etc/ckan/default/development.ini command as root user, since we've not set a sudo password for the ckan user.
Note: Setting up the DataStore is optional.

8. Link to who.ini

You should still be in the Python virtualenv for this step. If not, do the following:

# su -s /bin/bash - ckan
# . default/bin/activate
(default)# cd default/src/ckan

who.ini (the Repoze.who configuration file) needs to be accessible in the same directory as your CKAN config file, so create a symlink to it:

(default)# ln -s /usr/lib/ckan/default/src/ckan/who.ini /etc/ckan/default/who.ini

9. Create a WSGI file

Create your site’s WSGI script file /etc/ckan/default/apache.wsgi with the following contents:

import os
activate_this = os.path.join('/usr/lib/ckan/default/bin/activate_this.py')
execfile(activate_this, dict(__file__=activate_this))

from paste.deploy import loadapp
config_filepath = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'development.ini')
from paste.script.util.logging_config import fileConfig
fileConfig(config_filepath)
application = loadapp('config:%s' % config_filepath)

The mod_wsgi Apache module will redirect requests to your web server to this WSGI script file. The script file then handles those requests by directing them on to your CKAN instance (after first configuring the Python environment for CKAN to run in).
Exit the ckan user with Ctrl+D or exit.
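To make the role of the application name in the WSGI file more concrete: mod_wsgi looks for a callable with that name and calls it once per request. The sketch below uses a stand-in application instead of CKAN and calls it by hand; the helper call_app and the stand-in itself are illustrative only and not part of CKAN or mod_wsgi.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A stand-in WSGI app; in apache.wsgi, loadapp() returns the real CKAN app."""
    body = ("requested path: " + environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_app(app, path):
    """Invoke a WSGI app roughly the way a server would, returning (status, body)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


print(call_app(application, "/dataset"))
```

In the real apache.wsgi the only difference is where application comes from: it is the object returned by loadapp() for development.ini.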

10. Create the Apache config file

Create your site’s Apache config file at /etc/httpd/conf.d/ckan_default.conf, with the following contents:

WSGISocketPrefix /var/run/wsgi
<VirtualHost 0.0.0.0:80>
	ServerName default.yourdomain.com
	ServerAlias www.default.yourdomain.com
	WSGIScriptAlias / /etc/ckan/default/apache.wsgi

	# Pass authorization info on (needed for rest api).
	WSGIPassAuthorization On

	# Deploy as a daemon (avoids conflicts between CKAN instances).
	WSGIDaemonProcess ckan_default display-name=ckan_default processes=2 threads=15

	WSGIProcessGroup ckan_default

	# Add this to avoid the Apache error:
	# "AH01630: client denied by server configuration: /etc/ckan/default/apache.wsgi"
	<Directory /etc/ckan/default>
		Options Indexes FollowSymLinks
		AllowOverride All
		Order allow,deny
		Allow from all
		# Required for Apache 2.4.6
		Require all granted
	</Directory>

	ErrorLog /var/log/httpd/ckan_default.error.log
	CustomLog /var/log/httpd/ckan_default.custom.log combined
</VirtualHost>

Replace default.yourdomain.com and www.default.yourdomain.com with the domain names for your site. ServerAlias takes host names only, without the http:// prefix.

This tells the Apache mod_wsgi module to redirect any requests to the web server to the WSGI script that you created above. Your WSGI script in turn directs the requests to your CKAN instance.

Then edit /etc/hosts:

# vi /etc/hosts

Append the following line to the end of the file:

127.0.0.1    default.yourdomain.com

Replace default.yourdomain.com with the domain name that you set in /etc/httpd/conf.d/ckan_default.conf.
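A quick way to confirm that the hosts entry is in place is to scan the file for it. This is a small illustrative sketch; the function name hosts_maps is made up, and it only looks at the text of the file rather than performing a real name lookup.

```python
def hosts_maps(hosts_text, hostname, address="127.0.0.1"):
    """Return True if `hosts_text` maps `hostname` to `address`."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()
        # The first field is the address, the rest are names for it.
        if len(fields) >= 2 and fields[0] == address and hostname in fields[1:]:
            return True
    return False


SAMPLE_HOSTS = """
127.0.0.1   localhost localhost.localdomain
127.0.0.1    default.yourdomain.com
"""

print(hosts_maps(SAMPLE_HOSTS, "default.yourdomain.com"))  # True
```

For the real check, read /etc/hosts into a string and pass your own domain name.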

11. Configure Apache

Enable httpd to make network connections

# setsebool -P httpd_can_network_connect 1

Enable httpd to start on system boot

# chkconfig httpd on

Start httpd

# service httpd start 

12. Configure iptables

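Edit the file /etc/sysconfig/iptables by inserting the following line before the first REJECT rule of the INPUT chain (rules are evaluated in order, so a line placed after a REJECT rule has no effect):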

-A INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT

Restart iptables

# service iptables restart

13. Connect to CKAN

You can now use the Paste development server to serve CKAN from the command-line. This is a simple and lightweight way to serve CKAN that is useful for development and testing:

   # cd /usr/lib/ckan/default/src/ckan
   # paster serve /etc/ckan/default/development.ini

Open http://127.0.0.1:5000/ in a web browser, and you should see the CKAN front page.
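Besides opening the front page in a browser, the instance can be checked from a script through CKAN's action API. The sketch below queries the status_show action and reads the reported version. It assumes Python 3, the development server address used above, and a running server; the function names are made up for illustration.

```python
import json
from urllib.request import urlopen


def parse_status(payload):
    """Return (success, ckan_version) from a status_show JSON response body."""
    data = json.loads(payload)
    return bool(data.get("success")), data.get("result", {}).get("ckan_version")


def fetch_status(base_url="http://127.0.0.1:5000"):
    """Query the running CKAN instance; requires the server to be up."""
    with urlopen(base_url + "/api/3/action/status_show", timeout=10) as response:
        return parse_status(response.read().decode("utf-8"))
```

Calling fetch_status() while paster serve is running should return a success flag together with the CKAN version string.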

14. Create sysadmin account

Return to your terminal, enter the Python virtualenv, and add a sysadmin account, replacing <username> with the account name you want.

# su -s /bin/bash - ckan
# . default/bin/activate
(default)# paster --plugin=ckan sysadmin add <username> -c /etc/ckan/default/development.ini