Installation instructions
These detailed installation instructions are intended for RHEL6.
As outlined in the README, w3act requires Java, Play, and by default a PostgreSQL database to connect to.
Java installation is carried out manually and uses the Oracle jdk-7u45-linux-x64.tar.gz, which means that our 'java -version' is "build 1.7.0_45-b18". However, it is assumed that any Java v7 JDK is appropriate. This version should be the system default Java service.
Alternatively use the following commands:
# cd /home/ait/Downloads/
# wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F" "http://download.oracle.com/otn-pub/java/jdk/7u45-b18/jdk-7u45-linux-x64.tar.gz"
Open the java JDK archive:
# tar xzf jdk-7u45-linux-x64.tar.gz\?AuthParam\=1389315645_23d791afdb9b9b12eac71a5702d01ed1
Set up this java version as the main java version:
# alternatives --install /usr/bin/java java /home/ait/Downloads/jdk1.7.0_45/bin/java 2
# alternatives --config java
Check the result via java version:
# java -version
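The version check can also be scripted. The following is a minimal sketch and not part of w3act; the helper names parse_java_version and is_java7 are hypothetical. It extracts the quoted version string from the output of 'java -version' and tests whether it belongs to the 1.7 line, relying on the assumption stated above that any Java 7 JDK is appropriate.

```shell
# Extract the quoted version string from `java -version` output read on stdin.
# (Hypothetical helper, not shipped with w3act.)
parse_java_version() {
  sed -n 's/^.* version "\([^"]*\)".*$/\1/p' | head -n 1
}

# Succeed if the given version string belongs to the Java 7 line (1.7.x).
is_java7() {
  case "$1" in
    1.7.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (java -version writes to stderr, hence 2>&1):
#   is_java7 "$(java -version 2>&1 | parse_java_version)" && echo "Java 7 is the default"
```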
Play is manually installed, using 'activator-dist-1.3.5.zip' from http://www.playframework.com/download. Note this play version is 2.3.x and should be the system default Play service.
To include the bootstrap directory (a git submodule), use the commands:
# git submodule init
# git submodule update
Our PostgreSQL installation is internal, but principally:
# yum install http://yum.postgresql.org/9.3/redhat/rhel-6-x86_64/pgdg-redhat93-9.3-1.noarch.rpm
# yum install postgresql93-server
# chkconfig postgresql-9.3 on
PostgreSQL needs to be initialised:
# service postgresql-9.3 initdb
# service postgresql-9.3 start
- this should start PostgreSQL on default port 5432
PostgreSQL user authentication also needs to be amended. Edit '(postgres installation directory)/pg_hba.conf':
local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 trust
After editing, a service restart is required:
# service postgresql-9.3 restart
Check that PostgreSQL is running on port 5432:
# netstat -ant | grep 5432
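A plain grep for 5432 also matches other ports that contain those digits (for example 15432) and connections that are merely established. As a stricter alternative, the sketch below filters the netstat output for a socket in LISTEN state on exactly the given port. The function name port_is_listening is hypothetical, and the column layout assumed is that of 'netstat -ant' on Linux (local address in column 4, state in the last column).

```shell
# Read `netstat -ant` output on stdin; succeed only if a socket is in
# LISTEN state on exactly port $1. (Hypothetical helper.)
port_is_listening() {
  awk -v port="$1" '
    $NF == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Usage:
#   netstat -ant | port_is_listening 5432 && echo "PostgreSQL is listening"
```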
Once PostgreSQL is installed, the user 'training' should be created and then the 'w3act' database created, for which the 'training' user should be defined as owner. Run:
$ su - postgres -c "createuser --superuser training"
$ su - postgres -c "createdb --owner=training --username=training w3act"
$ su - postgres -c "psql -c 'grant all on database w3act to training' "
There is an additional requirement for this w3act service, which is a Maxmind GeoIP2 database. However, this is pre-packaged in this repository and there is no further configuration required.
Whois lookup is a service for mapping between domain name and country. It is pre-packaged in this repository in the "lib" folder and there is no further configuration required. If necessary, installation guidelines can be found in README.
Configuration settings for sending e-mail via SMTP are required. They should be defined in the project configuration file 'w3act.properties'. These settings are:
# host=0.0.0.0 // The IP address of SMTP connection
# user=Domain\\User // The username for login in SMTP
# password=1234 // The password for SMTP connection
# [email protected] // The e-mail address of the sender
# port=25 // The port for SMTP connection
# server_name=www.webarchive.org.uk // The server name e.g. for crawl permission request
The settings for the queue endpoint for the RabbitMQ library should be defined in the project configuration file 'w3act.properties':
# queue_host=www.webarchive.org.uk
# queue_port=5762
# queue_name=w3actqueue
# routing_key=w3actroutingkey
# exchange_name=w3actexchange
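Before starting the application it can help to confirm that the expected keys are actually set in w3act.properties. The following sketch is not part of w3act; get_prop and missing_props are hypothetical helpers that assume simple 'key=value' lines without line continuations.

```shell
# Print the value of key $2 from properties file $1 (last assignment wins).
# Commented-out lines are ignored because the key must start the line.
get_prop() {
  sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Print every key among the remaining arguments that is missing or empty in file $1.
missing_props() {
  file=$1
  shift
  for key in "$@"; do
    [ -n "$(get_prop "$file" "$key")" ] || printf '%s\n' "$key"
  done
}

# Usage:
#   missing_props conf/w3act.properties queue_host queue_port queue_name routing_key exchange_name
```

An empty result means that all listed keys have a value.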
In order to create a development instance of the application, clone sources from the Github repository:
# git clone https://github.com/ukwa/w3act.git
# cd w3act
Make amendments to the code if necessary:
# change parameter in configuration files
# create configuration file for production mode
# modify initial-data.yml file for roles and permissions configuration
# modify templates.yml file for templates configuration
# modify users.yml file for users configuration
# modify contact-persons.yml file for configuration of contact persons
# modify flags.yml file for configuration of using predefined list (submitted with default version)
# modify tags.yml file for configuration of open tags
Modify the w3act.properties configuration file parameters (e.g. for Drupal access):
# drupal_user=username
# drupal_password=pwd
The file initial-data.yml is a text file that contains data for different database tables like e.g. user table, roles and permissions.
The installation-specific Users with associated Roles should be defined in users.yml file in the manner:
roles:
- !!models.Role &sys_admin
name: sys_admin
permissions: create_roles_and_organisations, create_user, administer_user
- !!models.Role &archivist
name: archivist
permissions: create_user, administer_user, administer_collections
users:
- !!models.User
email: [email protected]
name: Max Mustermann
password: secret
field_affiliation: COM
role_to_user: [*sys_admin, *archivist]
The file templates.yml is a text file that contains data about different email templates. The installation specific templates should be defined in this file in the manner:
- !!models.MailTemplate
name: General
type: Permission Request
subject: Our Archive
placeHolders: url, name
fromEmail: [email protected]
text: default.txt
where the "text" field defines a path to the file containing the email template text.
The file flags.yml contains a predefined list of attention flags:
- !!models.Flag
name: PRIORITY_PERMISSION
description: This flag marks priority permission
- !!models.Flag
name: PRIORITY_CRAWL_AND_QA
description: This flag marks priority crawl and QA
- !!models.Flag
name: PRIORITY_QA
description: This flag marks priority QA
- !!models.Flag
name: QA_ISSUE_APPEARANCE
description: This flag marks QA issue appearance
- !!models.Flag
name: QA_ISSUE_FUNCTIONALITY
description: This flag marks QA issue functionality
- !!models.Flag
name: QA_ISSUE_CONTENT
description: This flag marks QA issue content
- !!models.Flag
name: FOLLOW_UP_PEMISSION
description: This flag marks follow up permission
- !!models.Flag
name: GENERAL_CHANGE_REQUEST
description: This flag marks general change request
The file tags.yml contains an initial set of open tags, defined in the manner:
- !!models.Tag
name: science
description: This site is related to science
- !!models.Tag
name: sport
description: This site is related to sport
In order to run the application in development mode, use the run command:
# activator run
or for debugging
# activator debug run
If the sbt cache causes problems, use:
# activator clean-all
or one of the required scripts:
# ./cleanup.sh - to remove previous compiled code and DB data
# ./cleanup-evolutions.sh - to remove the evolution DB table creation/destruction SQL
Switching on/off data import in application.conf
# application.data.import=true|false
Running data import
# ./data_import.sh - found in root project
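Switching the import flag by hand is easy to get wrong when it is done repeatedly. As an illustration only, the sketch below rewrites the value of application.data.import in a given configuration file. The function name set_data_import is hypothetical, and it assumes the key is already present on a line of its own.

```shell
# Set application.data.import to true or false in configuration file $1.
# (Hypothetical helper; assumes the key already exists in the file.)
set_data_import() {
  conf=$1
  value=$2
  case "$value" in
    true|false) ;;
    *) echo "value must be true or false" >&2; return 1 ;;
  esac
  tmp=$(mktemp) || return 1
  # Rewrite through a temporary file, then copy back to keep the file's permissions.
  sed "s/^\([[:space:]]*application\.data\.import[[:space:]]*=\).*/\1$value/" "$conf" > "$tmp" \
    && cat "$tmp" > "$conf"
  status=$?
  rm -f "$tmp"
  return $status
}

# Usage:
#   set_data_import conf/application.conf false
```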
For testing use command:
# cd w3act
# activator test
Various tests are implemented in this project and can be found in the test/ directory:
# Integration tests are necessary to check content of created internet pages or to start browser. For this task we also employ Selenium WebDriver in order to automatically run different W3ACT pages in a browser starting with the login page.
# Application tests are employed to evaluate general functionality and HTML page contents.
# Models testing is used to test the created domain model and its connection to the database.
# Additionally, [Travis](https://travis-ci.org/ukwa/w3act/) was set up for automated testing combined with GitHub submissions of this project.
In order to separate your development application from production mode, you can create a configuration file for production mode, e.g. conf/prod.conf. In this file you include the values from conf/application.conf and override the fields that should be different in production mode, e.g. the database name or the flag for evolutions ("-DapplyDownEvolutions.=true") that is required to start the production application. In production mode the database evolution scripts must be in place before starting, e.g. \w3act\conf\evolutions\default\1.sql
# include "application.conf"
# db.default.driver=org.postgresql.Driver
# created database 'w3act' with user 'training'
# db.default.url="postgres://training:(password)@127.0.0.1/w3act"
# applyDownEvolutions.w3act=true
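For repeatable deployments, the production configuration shown above could be generated rather than typed. This is a sketch only: write_prod_conf is a hypothetical helper, and it assumes the database 'w3act', the user 'training' and the host 127.0.0.1 used elsewhere in this guide. The password is taken as an argument so that it does not need to be stored in the script.

```shell
# Write a minimal production configuration to file $1, using $2 as the
# database password. (Hypothetical helper; settings copied from the listing above.)
write_prod_conf() {
  cat > "$1" <<EOF
include "application.conf"
db.default.driver=org.postgresql.Driver
# created database 'w3act' with user 'training'
db.default.url="postgres://training:$2@127.0.0.1/w3act"
applyDownEvolutions.w3act=true
EOF
}

# Usage:
#   write_prod_conf conf/prod.conf 'the-database-password'
```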
Download the project sources from Github and use the stage command to prepare your application to be run in place:
# git clone https://github.com/ukwa/w3act.git
# cd w3act
Use activator command if it is in path:
# activator clean stage
or directly from play installation:
# /home/ait/Downloads/activator-dist-1.3.5/w3act/activator clean stage
This cleans and compiles your application and copies it to the target/universal/stage directory. It also creates a service start-up script within target/universal/stage/bin/w3act where 'w3act' is the project’s name. Run created script:
# target/universal/stage/bin/w3act -Dconfig.file=/home/ait/projects/w3act/conf/prod.conf -Dlogger.file=/home/ait/projects/w3act/conf/prod-logger.conf
or for Windows:
# target/universal/stage/bin/w3actprod.bat
When running this script you can specify your configuration file as a parameter. The default is application.conf. For production you could use either -Dconfig.file or, if you prefer, -Dconfig.resource=prod.conf, which has essentially the same effect and looks in the conf/ directory of the project for the given file. A third possibility is to use e.g. "-Dconfig.url=http://www.webarchive.org.uk/conf/prod.conf", but then you must provide this URL.
Switching on/off data import in application.conf
# application.data.import=true|false
Running data import
# ./data_import.sh - found in root project
Setting Wayback URL
# application.wayback.url="http://www.webarchive.org.uk/wayback/archive/xmlquery.jsp?url="
Switching off importing accounts/user.yml
# use.accounts=false
There are two possibilities for configuring logging. The first one is to configure the logging level using the logger key in your conf/application.conf file. Play defines a default application logger for your application, which is automatically used when you use the default Logger operations.
# Root logger:
logger=ERROR
# Logger used by the framework:
logger.play=INFO
# Logger provided to your application:
logger.application=DEBUG
Another possibility is to use logback configuration. The default configuration file (logger.xml) comes with play in the production mode and defines two appenders, one dispatched to the standard out stream, and the other to the logs/application.log file. If you want to fully customize logback, just create an alternative logback config file called e.g. prod-logger.xml and copy that to the conf/ directory of your application. In this file you can specify your logging output e.g. /var/log/w3act.log:
# <appender name="FILE" class="ch.qos.logback.core.FileAppender">
# <file>/var/log/w3act.log</file>
# <encoder>
# <pattern>%date - [%level] - from %logger in %thread %n%message%n%xException%n</pattern>
# </encoder>
# </appender>
Using the "-Dlogger.file" property you can specify another logback configuration file to be loaded from the file system, e.g.
# target/universal/stage/bin/w3actprod -Dconfig.file=/home/ait/projects/w3act/conf/prod.conf -Dlogger.file=/home/ait/projects/w3act/conf/prod-logger.xml
If you want to deploy your application to the server without any dependency on Play itself, you can do this with the dist task. This task builds a binary version of your application and produces the ZIP file target/universal/w3act-1.0.zip, which contains all JAR files needed to run your application.
# activator dist
For Windows users a start script will be produced with a .bat file extension. On Linux you will need to add Unix file permissions, because after the archive is expanded the start script must be made executable:
$ unzip target/universal/w3act-1.0.zip
$ chmod +x /path/to/bin/w3act
where 'w3act' is a project name. A w3act-1.0 directory will be created that contains a bin/ folder with start scripts.
To learn how to create a proper binary distribution, see Production Configuration, Production Distribution and the native packager
Run the package created above with the generated w3act.BAT start script on Windows or the shell script on Linux. Pass any necessary parameters, e.g. for evolutions:
$ cd w3act-1.0
$ ./bin/w3act -DapplyDownEvolutions.w3act=true
where 'w3act' in the applyDownEvolutions parameter stands for the database name.
The RHEL scripts are located in folder conf/sysv.
In order to deploy the project application into the /opt/ folder on RHEL without the need for internet access, use the 'create-distribution-package' script, which documents and supports the use of the 'play dist' command in the root directory of the project. This script creates a RHEL deployment package for the W3ACT project with the package structure:
$ sysv
$ w3act-1.0
$ |_bin
$ |_conf
$ |_lib
$ |_share
The resulting distribution package 'w3act-dist.zip' should contain:
1. Project sources (a ZIP file resulting from “play dist” command) e.g. w3act-1.0.zip
2. Configuration files (*.yml, *.conf …) in folder "conf/" of the zip
3. SysV init scripts in folder "sysv/"
This script should be executed in the root directory of the project. We assume that the zip and unzip programs are installed and that the Play software is installed in /etc/default/play-2.2.1.
Main definitions:
$ PROJECT_NAME=w3act
$ VERSION="$PROJECT_NAME"-1.0
$ SOURCES_ZIP="$VERSION".zip
$ PLAY_DIR=/etc/default/activator-dist-1.3.5
$ SOURCE_DIR=target/universal
$ DIST_ZIP="$PROJECT_NAME"-dist.zip
$ SYSV_DIR=sysv
In this script we first clean up the old distribution package. In a second step we build a binary version of the application in order to deploy it to the server without any dependency on Play itself using 'play dist'. Then we extract created sources ZIP. We add the SysV init scripts and create a distribution package as a ZIP. This package can be copied to the opt folder on RHEL and unzipped there.
$ Usage: ./create-deployment-package
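The steps described above can be sketched roughly as follows. This is not the script shipped in the repository: the run helper and its DRY_RUN switch are hypothetical, the use of 'activator dist' in place of 'play dist' is an assumption based on the Activator installation described earlier, and copying the init scripts from conf/sysv is assumed from the stated location of the RHEL scripts.

```shell
PROJECT_NAME=w3act
VERSION="$PROJECT_NAME"-1.0
SOURCES_ZIP="$VERSION".zip
SOURCE_DIR=target/universal
DIST_ZIP="$PROJECT_NAME"-dist.zip
SYSV_DIR=sysv

# Run a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Sketch of the packaging steps; to be called from the project root.
create_distribution_package() {
  run rm -f "$DIST_ZIP" &&                          # 1. clean up the old distribution package
  run activator dist &&                             # 2. build the binary version (assumed command)
  run unzip -o "$SOURCE_DIR/$SOURCES_ZIP" &&        # 3. extract the created sources ZIP
  run cp -r "conf/$SYSV_DIR" "$SYSV_DIR" &&         # 4. add the SysV init scripts (assumed source path)
  run zip -r "$DIST_ZIP" "$VERSION" "$SYSV_DIR"     # 5. create the distribution ZIP
}

# Usage, printing the commands without executing them:
#   DRY_RUN=1 create_distribution_package
```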
All services are managed via SysV scripts from 'sysv' folder in distribution package. The SysV init script inside the main distribution will be installed to /etc/init.d/w3act and will support services like /etc/init.d/w3act [start|stop].
The script 'w3act-rhel-deployment' supports W3ACT RHEL deployment with the default run level 3 (/etc/inittab). The location of the W3ACT project after unzipping of the distribution ZIP 'w3act-dist.zip' should be /opt/w3act-1.0. The structure of the W3ACT project related files under the "opt" folder should be the same as described above.
We assume that PostgreSQL, Java and Play Framework are already installed in folder /etc/default. The distribution package contains:
1. Project sources (a ZIP file resulting from “play dist” command) e.g. w3act-1.0.zip
2. Configuration files (*.yml, *.conf …) in folder "conf/"
3. SysV init script in folder "sysv/"
4. This script that extracts supporting software and configuration files in required directories like (/etc/init.d, /etc/default/ and /etc/sysconfig)
This script cleans up old configuration files and copies new configuration files to the folder /etc/sysconfig/w3act/ in order to isolate installation settings from code. Then we clean up old SysV init script and copy new file to the /etc/init.d. The last step is creation of the run level link.
$ Usage: ./w3act-rhel-deployment
This SysV init script 'w3act' manages w3act services on RHEL
- If it is necessary to change the Linux run level to e.g. 3 use the command
$ init 3
- Copy this script from /opt/w3act/sysv/ folder to the /etc/init.d/ folder using command
$ cp /opt/sysv/w3act /etc/init.d/
- Add rights using command
$ chmod 755 /etc/init.d/w3act
- Create a symlink to the /etc/init.d/w3act script in the /etc/rc<N>.d folder of the required run level, e.g. for level 3 use the commands
$ cd /etc/rc3.d
$ ln -s /etc/init.d/w3act S99w3act
- To start the W3ACT application use e.g.
$ service w3act start
For application start we provide two parameters:
a. Database name -DapplyDownEvolutions.<databasename>=true e.g. w3act
b. Location for the file that contains the process id of the started application e.g. -Dpidfile.path=/var/run/play.pid
- To stop the W3ACT application use e.g.
$ service w3act stop
Expected locations for project and for play framework are:
$ PROJECT_ROOT=/opt/w3act-1.0
$ PLAY_DIR=/etc/default/activator-dist-1.3.5
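To illustrate how the start and stop actions fit together with the two start parameters, here is a minimal sketch of the dispatch logic such an init script might contain. It is not the script from conf/sysv: the function name w3act_service, the use of nohup with background execution, and the stop-by-pidfile approach are all assumptions.

```shell
PROJECT_ROOT=/opt/w3act-1.0
PIDFILE=/var/run/play.pid

# Dispatch a service action. (Hypothetical sketch, not the shipped init script.)
w3act_service() {
  case "$1" in
    start)
      # Start in the background with the two parameters described above.
      nohup "$PROJECT_ROOT/bin/w3act" \
        -DapplyDownEvolutions.w3act=true \
        -Dpidfile.path="$PIDFILE" >/dev/null 2>&1 &
      ;;
    stop)
      # Stop the process recorded in the pid file, if there is one.
      if [ -f "$PIDFILE" ]; then
        kill "$(cat "$PIDFILE")"
      fi
      ;;
    *)
      echo "Usage: w3act {start|stop}" >&2
      return 2
      ;;
  esac
}

# In an actual init script the last line would dispatch on the first argument:
#   w3act_service "$1"
```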
Logging in a single-line format with a datestamp per line is defined in the Play configuration file.
For password encryption we employ the secure hashing with random salt method proposed by Taylor Hornby.
$ Password Hashing With PBKDF2 (http://crackstation.net/hashing-security.htm).
$ Copyright (c) 2013, Taylor Hornby
$ All rights reserved.
On Linux, 'get-last-version.sh' should be executed at the beginning of deployment. This will create a file 'last-version.txt' in the root of the project. The About page will retrieve the last version from this file.
Problem: Console prints only a few info messages and drupal data is not imported into local database
possible cause: the path to PostgreSQL\9.3\bin\psql in cleanup.bat may be wrong and should be adjusted according to the settings on your machine
java.net.UnknownHostException: www.webarchive.org.uk
possible cause: Login information in conf/w3act.properties is not correct
drupal_user=... drupal_password=...
After correcting the login information it may be necessary to wait 30 minutes before another attempt to login can succeed.
Configuration error Cannot connect to database [default]
possible causes:
- database user name or password does not match with information from conf/application.conf
- database w3act does not exist or is not owned by the database user given in conf/application.conf
Unexpected exception NoSuchElementException: key not found: SOURCE
solution: This is a problem rooted in the play framework. Stop the server and start it again.