How to Install CKAN 2.9 on Amazon Linux 2 - ckan/ckan GitHub Wiki
Install Dependencies First
$ sudo apt-get install redis-server [for AL2: sudo amazon-linux-extras install redis4.0]
$ sudo vim /etc/redis/redis.conf
- Comment out the bind line so that it reads **#bind 127.0.0.1 ::1**, which lets Redis listen on all interfaces
- Disable protected mode: **protected-mode no**
$ sudo /etc/init.d/redis-server restart
- Test from CKAN instance
$ telnet <REDIS-IP> 6379
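If telnet is not installed on the CKAN instance, the same reachability test can be sketched in Python. This is a minimal sketch using only the standard library; the function name `port_is_open` and the 3-second default timeout are illustrative choices, not part of CKAN or Redis.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `port_is_open("<REDIS-IP>", 6379)` should return True from the CKAN instance once Redis accepts remote connections. A successful connection only shows that the port is reachable, not that the service behind it is healthy.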
- Install & Configure Solr [On the same Ubuntu 20 node]
$ sudo apt install -y solr-tomcat
- Change the default port Tomcat runs on (8080) to the one expected by CKAN. To do so change the following line in the /etc/tomcat9/server.xml file (tomcat8 in older Ubuntu versions):
From:
<Connector port="8080" protocol="HTTP/1.1"
To:
<Connector port="8983" protocol="HTTP/1.1"
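After editing, one way to confirm the change is to parse server.xml and list the ports of its HTTP connectors. The sketch below assumes a standard Tomcat server.xml layout; `http_connector_ports` is a hypothetical helper name, not a Tomcat or CKAN tool.

```python
import xml.etree.ElementTree as ET


def http_connector_ports(server_xml):
    """Return the ports of all HTTP <Connector> elements in a Tomcat server.xml.

    server_xml may be the file content as str or bytes.
    """
    root = ET.fromstring(server_xml)
    ports = []
    for connector in root.iter("Connector"):
        # Tomcat treats a missing protocol attribute as HTTP/1.1;
        # AJP connectors (protocol="AJP/1.3") are skipped.
        if connector.get("protocol", "HTTP/1.1").startswith("HTTP"):
            ports.append(int(connector.get("port")))
    return ports
```

Calling it on the content of /etc/tomcat9/server.xml (read in binary mode) should return a list containing 8983 once the edit above has been made.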
-
Replace the default schema.xml file with the CKAN schema file included in the sources. Because Solr and CKAN run on separate machines here, the file is copied rather than symlinked. [On Solr Machine]
$ sudo mv /etc/solr/conf/schema.xml /etc/solr/conf/schema.xml.bak
-
Copy schema.xml file from CKAN machine to this Solr Machine. This can only be done once CKAN is installed (later steps)
-
Copy /usr/lib/ckan/default/src/ckan/ckan/config/solr/schema.xml [From CKAN-Machine]
to /etc/solr/conf/schema.xml [on Solr Machine]
-
Restart Solr
sudo service tomcat9 restart
-
Check that Solr is running at http://<SOLR-IP>:8983/solr/
-
Install and configure PostgreSQL (On the same Ubuntu 20.04 OS)
$ sudo apt update
$ sudo apt install -y postgresql net-tools
$ sudo service postgresql start
-
Verify that PostgreSQL is working:
$ sudo -u postgres psql -l
Check that the encoding of the databases is UTF8.
-
As the Ubuntu user, create a new PostgreSQL user called ckan_default and enter a password for it when prompted:
$ sudo -u postgres createuser -S -D -R -P ckan_default
-
As the Ubuntu user, create a new PostgreSQL database for CKAN, called ckan_default and owned by the database user you just created:
$ sudo -u postgres createdb -O ckan_default ckan_default -E utf-8
-
Edit postgresql.conf and pg_hba.conf. On Ubuntu, these files are located in /etc/postgresql/{Postgres version}/main.
- cd /etc/postgresql/12/main/
- vim postgresql.conf and set:
listen_addresses = '*'
-
Add a line similar to the line below to the bottom of pg_hba.conf to allow the machine running the web server to connect to PostgreSQL. Please change the IP address as desired according to your network settings.
-
vim pg_hba.conf
host all all <CKAN-IP>/32 md5
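Since the address in this line has to be adapted to each network, a small helper can validate the IP before it goes into pg_hba.conf. This is only a sketch: `pg_hba_host_line` is a hypothetical name, and the defaults simply mirror the line shown above (all databases, all users, md5 authentication).

```python
import ipaddress


def pg_hba_host_line(client_ip, database="all", user="all", method="md5"):
    """Build a pg_hba.conf 'host' entry that admits exactly one client address."""
    # ip_address raises ValueError on malformed input such as "10.0.1.300".
    address = ipaddress.ip_address(client_ip)
    # A single host is /32 for IPv4 and /128 for IPv6.
    cidr = f"{address}/{address.max_prefixlen}"
    return f"host {database} {user} {cidr} {method}"
```

The returned string can be appended to the bottom of pg_hba.conf as described above. A typo in the address raises ValueError here instead of surfacing later as a connection failure.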
-
-
Restart PostgreSQL and verify that it is now listening on all interfaces:
$ sudo service postgresql restart
$ netstat -tulpn | grep 5432
-
Set-up Datastore database & user
-
Create a database_user called datastore_default.
Using 'Ubuntu user':
$ sudo -u postgres createuser -S -D -R -P -l ckan_default ## if not already created above during CKAN settings
$ sudo -u postgres createuser -S -D -R -P -l datastore_default
-
Create the database (owned by ckan_default), which we’ll call datastore_default:
$ sudo -u postgres createdb -O ckan_default datastore_default -E utf-8
$ sudo -u postgres psql -l
-
Set permissions [Important]
############ This Step will be performed once CKAN is setup with datastore and database is initialized from CKAN machine ############
Once the DataStore database and the users are created, the permissions on the DataStore and CKAN databases have to be set. CKAN provides a ckan command to help you set these permissions correctly. Since CKAN and PostgreSQL are running on separate machines, we need to generate the permission statements on the CKAN instance and run them on the PostgreSQL server:
- Generate the datastore permissions on CKAN machine (ckan -c /etc/ckan/default/ckan.ini datastore set-permissions) and paste it here:
$ sudo -u postgres psql
postgres=#
-
-
Installing the required packages
-
Install dependencies
$ sudo yum install python37 postgresql-devel python3-devel -y
$ sudo yum install wget policycoreutils-python python3-pip git-core java-1.8.0-openjdk maven lsof gcc gcc-c++ cmake automake gmp-devel boost -y
-
Create Virtual environment
mkdir -p ~/ckan/lib
sudo ln -s ~/ckan/lib /usr/lib/ckan
mkdir -p ~/ckan/etc
sudo ln -s ~/ckan/etc /etc/ckan
sudo mkdir -p /usr/lib/ckan/default
sudo chown $(whoami) /usr/lib/ckan/default
python3 -m venv /usr/lib/ckan/default
. /usr/lib/ckan/default/bin/activate
####### EVERYTHING FROM NOW ON will be executed on virtual env #######
-
Install setuptools:
(default) pip install setuptools==44.1.0
(default) pip install --upgrade pip
-
Install stable release
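(default) pip install -e 'git+https://github.com/ckan/ckan.git@<CKAN-2.9-TAG>#egg=ckan[requirements]'
Replace <CKAN-2.9-TAG> with the 2.9 release tag you are installing (tags follow the form ckan-2.9.x).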
-
-
Create folder for FileStore and file uploads.
-
Create it as ec2-user so that ec2-user has access to this folder.
(default) mkdir -p /var/lib/ckan/default
-
-
Generate & Configure CKAN.ini
(default) sudo mkdir -p /etc/ckan/default
(default) sudo chown -R $(whoami) /etc/ckan/
(default) ckan generate config /etc/ckan/default/ckan.ini
################ ENSURE YOU HAVE SETUP REDIS, SOLR & PostgreSQL before you configure them here and initialise the DB & set Datastore permissions. ################
## Database Settings
sqlalchemy.url = postgresql://ckan_default:<password>@<POSTGRES-IP>/ckan_default
ckan.datastore.write_url = postgresql://ckan_default:<password>@<POSTGRES-IP>/datastore_default
ckan.datastore.read_url = postgresql://datastore_default:<password>@<POSTGRES-IP>/datastore_default

## Site Settings
ckan.site_url = http://ec2-1x-2xx-2xx-1xx.ap-southeast-2.compute.amazonaws.com

## Search Settings
# ckan.site_id has to be unique
ckan.site_id = default-me
solr_url = http://<SOLR-IP>:8983/solr

## Redis Settings
# URL to your Redis instance, including the database to be used.
ckan.redis.url = redis://<REDIS-IP>:6379/0

# Enable FileStore and file uploads.
# When enabled, CKAN’s FileStore allows users to upload data files to CKAN resources, and to upload logo images for groups and organizations. Users will see an upload button when creating or updating a resource, group or organization.
ckan.storage_path = /var/lib/ckan/default

# Enable the datastore plugin
# Add/Append the datastore plugin
ckan.plugins = stats text_view image_view recline_view datastore
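Before initialising the database, it can help to confirm that none of the settings above is missing or still contains a `<...>` placeholder. The following is a minimal sketch, not a CKAN tool: `check_ckan_ini` and `REQUIRED_KEYS` are hypothetical names, and it assumes the settings live in the `[app:main]` section of the generated ckan.ini.

```python
import configparser
import re

# Settings edited in this guide; adjust the list to your deployment.
REQUIRED_KEYS = [
    "sqlalchemy.url",
    "ckan.datastore.write_url",
    "ckan.datastore.read_url",
    "ckan.site_url",
    "ckan.site_id",
    "solr_url",
    "ckan.redis.url",
    "ckan.storage_path",
    "ckan.plugins",
]


def check_ckan_ini(ini_text, section="app:main"):
    """Return a list of problems found in the given ckan.ini content."""
    # interpolation=None keeps '%' characters (e.g. in logging formats) literal.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    settings = parser[section]
    problems = []
    for key in REQUIRED_KEYS:
        value = settings.get(key)
        if value is None:
            problems.append(f"{key}: missing")
        elif re.search(r"<[^>]+>", value):
            problems.append(f"{key}: placeholder not replaced")
    return problems
```

Passing the text of /etc/ckan/default/ckan.ini should return an empty list once every value has been filled in. The check only looks at presence and placeholders; it does not verify that the hosts or passwords are correct.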
-
Link to who.ini
(default) ln -s /usr/lib/ckan/default/src/ckan/who.ini /etc/ckan/default/who.ini
-
DB Initialization and Create tables:
(default) ckan -c /etc/ckan/default/ckan.ini db init
2020-12-15 23:24:30,051 INFO [ckan.cli] Using configuration file /etc/ckan/default/ckan.ini
2020-12-15 23:24:30,051 INFO [ckan.config.environment] Loading static files from public
2020-12-15 23:24:30,099 INFO [ckan.config.environment] Loading templates from /home/ec2-user/ckan/lib/default/src/ckan/ckan/templates
2020-12-15 23:24:30,398 INFO [ckan.config.environment] Loading templates from /home/ec2-user/ckan/lib/default/src/ckan/ckan/templates
2020-12-15 23:24:30,695 INFO [ckan.cli.db] Initialize the Database
2020-12-15 23:24:30,844 INFO [ckan.model] CKAN database version remains as: ccd38ad5fced (head)
2020-12-15 23:24:30,844 INFO [ckan.model] Database initialised
Initialising DB: SUCCESS
- Test Run your ckan
(default) cd /usr/lib/ckan/default/src/ckan
(default) ckan -c /etc/ckan/default/ckan.ini run
2020-12-16 01:08:11,782 INFO [ckan.cli] Using configuration file /etc/ckan/default/ckan.ini
2020-12-16 01:08:11,783 INFO [ckan.config.environment] Loading static files from public
2020-12-16 01:08:11,825 INFO [ckan.config.environment] Loading templates from /home/ec2-user/ckan/lib/default/src/ckan/ckan/templates
2020-12-16 01:08:12,155 INFO [ckan.config.environment] Loading templates from /home/ec2-user/ckan/lib/default/src/ckan/ckan/templates
2020-12-16 01:08:12,505 INFO [ckan.cli.server] Running server localhost on port 5000
2020-12-16 01:08:13,510 INFO [ckan.cli] Using configuration file /etc/ckan/default/ckan.ini
.... ....
-
Access it on http://localhost:5000
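Besides opening the page in a browser, the running instance can be checked through CKAN's `status_show` API action. The sketch below assumes the default API path /api/3/action/status_show; `ckan_status` is a hypothetical helper name and the 5-second timeout is an arbitrary choice.

```python
import json
import urllib.request


def ckan_status(site_url, timeout=5.0):
    """Return the 'result' dict of the status_show action for a CKAN site."""
    url = site_url.rstrip("/") + "/api/3/action/status_show"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    # CKAN action responses carry a boolean 'success' flag.
    if not payload.get("success"):
        raise RuntimeError(f"CKAN reported failure: {payload!r}")
    return payload["result"]
```

With the test server from the previous step still running, `ckan_status("http://localhost:5000")` should return a dictionary describing the instance. A connection error at this point usually means the server is not running or is bound to a different address.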
-
Set Datastore permissions
(It ensures that the datastore read-only user will only be able to select from the datastore database but has no create/write/edit permission or any permissions on other databases. You must execute this script as a database superuser on the PostgreSQL server that hosts your datastore database.)
(default) ckan -c /etc/ckan/default/ckan.ini datastore set-permissions
-- Most of the following commands apply to an explicit database or to the whole
-- 'public' schema, and could be executed anywhere. But ALTER DEFAULT
-- PERMISSIONS applies to the current database, and so we must be connected to
-- the datastore DB:
\connect "datastore_default"

-- revoke permissions for the read-only user
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
REVOKE USAGE ON SCHEMA public FROM PUBLIC;

GRANT CREATE ON SCHEMA public TO "ckan_default";
GRANT USAGE ON SCHEMA public TO "ckan_default";

GRANT CREATE ON SCHEMA public TO "ckan_default";
GRANT USAGE ON SCHEMA public TO "ckan_default";

-- take connect permissions from main db
REVOKE CONNECT ON DATABASE "ckan_default" FROM "datastore_default";

-- grant select permissions for read-only user
GRANT CONNECT ON DATABASE "datastore_default" TO "datastore_default";
GRANT USAGE ON SCHEMA public TO "datastore_default";

-- grant access to current tables and views to read-only user
GRANT SELECT ON ALL TABLES IN SCHEMA public TO "datastore_default";

-- grant access to new tables and views by default
ALTER DEFAULT PRIVILEGES FOR USER "ckan_default" IN SCHEMA public
    GRANT SELECT ON TABLES TO "datastore_default";

-- a view for listing valid table (resource id) and view names
CREATE OR REPLACE VIEW "_table_metadata" AS
    SELECT DISTINCT
        substr(md5(dependee.relname || COALESCE(dependent.relname, '')), 0, 17) AS "_id",
        dependee.relname AS name,
        dependee.oid AS oid,
        dependent.relname AS alias_of
    FROM
        pg_class AS dependee
        LEFT OUTER JOIN pg_rewrite AS r ON r.ev_class = dependee.oid
        LEFT OUTER JOIN pg_depend AS d ON d.objid = r.oid
        LEFT OUTER JOIN pg_class AS dependent ON d.refobjid = dependent.oid
    WHERE
        (dependee.oid != dependent.oid OR dependent.oid IS NULL) AND
        -- is a table (from pg_tables view definition)
        -- or is a view (from pg_views view definition)
        (dependee.relkind = 'r'::"char" OR dependee.relkind = 'v'::"char")
        AND dependee.relnamespace = (
            SELECT oid FROM pg_namespace WHERE nspname='public')
    ORDER BY dependee.oid DESC;
ALTER VIEW "_table_metadata" OWNER TO "ckan_default";
GRANT SELECT ON "_table_metadata" TO "datastore_default";

-- _full_text fields are now updated by a trigger when set to NULL
CREATE OR REPLACE FUNCTION populate_full_text_trigger() RETURNS trigger
AS $body$
    BEGIN
        IF NEW._full_text IS NOT NULL THEN
            RETURN NEW;
        END IF;
        NEW._full_text := (
            SELECT to_tsvector(string_agg(value, ' '))
            FROM json_each_text(row_to_json(NEW.*))
            WHERE key NOT LIKE '\_%');
        RETURN NEW;
    END;
$body$ LANGUAGE plpgsql;
ALTER FUNCTION populate_full_text_trigger() OWNER TO "ckan_default";

-- migrate existing tables that don't have full text trigger applied
DO $body$
    BEGIN
        EXECUTE coalesce(
            (SELECT string_agg(
                'CREATE TRIGGER zfulltext BEFORE INSERT OR UPDATE ON ' ||
                quote_ident(relname) || ' FOR EACH ROW EXECUTE PROCEDURE ' ||
                'populate_full_text_trigger();', ' ')
            FROM pg_class
            LEFT OUTER JOIN pg_trigger AS t
                ON t.tgrelid = relname::regclass AND t.tgname = 'zfulltext'
            WHERE relkind = 'r'::"char" AND t.tgname IS NULL
                AND relnamespace = (
                    SELECT oid FROM pg_namespace WHERE nspname='public')),
            'SELECT 1;');
    END;
$body$;
Copy the SQL statements above and paste them into the psql prompt on the PostgreSQL server (opened with sudo -u postgres psql).
Once you’ve installed CKAN from source, you need to deploy your CKAN site using a production web server, because the development server started with ckan run is only meant for testing. CKAN uses WSGI, a standard interface between web servers and Python web applications, so it can be used with a number of different web server and deployment configurations. However, the CKAN project has now standardized on one: NGINX with uWSGI.
-
Install Nginx
Install NGINX (a web server) which will proxy the content from one of the WSGI Servers and add a layer of caching:
$ sudo amazon-linux-extras install nginx1 -y
-
Create the WSGI script file
$ sudo cp /usr/lib/ckan/default/src/ckan/wsgi.py /etc/ckan/default/
-
Create the WSGI Server (using python virtual env)
(default) pip install uwsgi
(default) sudo cp /usr/lib/ckan/default/src/ckan/ckan-uwsgi.ini /etc/ckan/default/
-
We will not use Supervisor but will run ckan as Systemd service:
(default) vim /etc/ckan/default/ckan-uwsgi.ini
[uwsgi]
http = 127.0.0.1:8080
uid = ec2-user
gid = ec2-user
wsgi-file = /etc/ckan/default/wsgi.py
virtualenv = /usr/lib/ckan/default
module = wsgi:application
master = true
pidfile = /tmp/%n.pid
harakiri = 50
max-requests = 5000
vacuum = true
callable = application
buffer-size = 32768
(default) vi /etc/systemd/system/ckan.service
[Unit]
Description=CKAN 2.9
After=syslog.target

[Service]
ExecStart=/usr/lib/ckan/default/bin/uwsgi --ini /etc/ckan/default/ckan-uwsgi.ini
# Requires systemd version 211 or newer
RuntimeDirectory=ckan
Restart=always
KillSignal=SIGQUIT
Type=notify
StandardError=syslog
NotifyAccess=all

[Install]
WantedBy=multi-user.target
sudo systemctl enable ckan
sudo systemctl start ckan
sudo systemctl status ckan -l
-
Create the NGINX config file
(default) vim /etc/nginx/conf.d/ckan.conf
proxy_cache_path /tmp/nginx_cache levels=1:2 keys_zone=cache:30m max_size=250m;
proxy_temp_path /tmp/nginx_proxy 1 2;

server {
    client_max_body_size 100M;
    location / {
        proxy_pass http://127.0.0.1:8080/;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header Host $host;
        proxy_cache cache;
        proxy_cache_bypass $cookie_auth_tkt;
        proxy_no_cache $cookie_auth_tkt;
        proxy_cache_valid 30m;
        proxy_cache_key $host$scheme$proxy_host$request_uri;
        # In emergency comment out line to force caching
        # proxy_ignore_headers X-Accel-Expires Expires Cache-Control;
    }
}
sudo systemctl restart nginx
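NGINX can only reach CKAN if its proxy_pass target matches the address uWSGI listens on (127.0.0.1:8080 in this guide). The sketch below compares the two files. The helper names are hypothetical, and the regular expression is a simplification that only recognises plain `proxy_pass http://host:port` directives.

```python
import configparser
import re


def uwsgi_http_address(ini_text):
    """Return the 'http' address from the [uwsgi] section of ckan-uwsgi.ini."""
    # interpolation=None keeps uWSGI placeholders such as %n literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser["uwsgi"]["http"].strip()


def nginx_proxy_targets(conf_text):
    """Return the host:port part of every 'proxy_pass http://...' directive."""
    return re.findall(r"proxy_pass\s+http://([^/;\s]+)", conf_text)


def proxy_matches_uwsgi(ini_text, conf_text):
    """True if NGINX proxies to the address uWSGI is configured to listen on."""
    return uwsgi_http_address(ini_text) in nginx_proxy_targets(conf_text)
```

Feeding it the contents of /etc/ckan/default/ckan-uwsgi.ini and /etc/nginx/conf.d/ckan.conf should return True for the configuration shown above. A False result points to a port or address mismatch between the two files.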