This repository has been archived by the owner on Sep 12, 2023. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 5
/
Copy pathREADME.j2
208 lines (158 loc) · 17.5 KB
/
README.j2
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
STATUS: ALPHA
# hyrax-ansible
Ansible playbook for installing a Hyrax-based institutional repository.
## Introduction and Assumptions
This repository is a starting point for teams planning on running Hyrax-based IRs in production.
The idea is similar to https://github.com/Islandora-Devops/claw-playbook
These roles install the dependences for Hyrax (Solr, Ruby, Redis, PostreSQL, Node.js, Nginx, FITS, FFmpeg, and Fedora 4) then installs a Hyrax based application from a Github repository using Ansistrano.
This playbook is being tested against CentOS 7, Debian 9, Debian 10, Ubuntu 18.04 and Ubuntu 20.04. The Vagrantfile used is available at https://github.com/cu-library/hyrax-ansible-testvagrants.
All roles assume services (Nginx, Fedora 4, PostgreSQL) are installed on one machine and communicate using UNIX sockets or the loopback interface. For large deployments, you'll want to edit these roles so that services are deployed on different machines and that communication between them is encrypted.
These roles also do not create admin users. See https://github.com/samvera/hyrax/wiki/Making-Admin-Users-in-Hyrax. But it does run the following to set up basic data structures needed for Hyrax / Hyku.
`bundle exec rails hyrax:default_collection_types:create hyrax:default_admin_set:create hyrax:workflow:load`
These roles assume that an external SMTP server will be used. Some environments might want to use use a local mail transfer agent. More information can be found in the Hyrax management guide: https://github.com/samvera/hyrax/wiki/Hyrax-Management-Guide#mailers.
`install_hyrax_on_localhost.yml` is a test playbook which runs the provided roles against localhost.
## Setup, Usage and Deployment
Initial provisioning of Linux boxes is left outside the scope of this project. We assume a working CentOS 7, Debian 9/10 or Ubuntu 18.04/20.04 system. If you are installing this repository on the box itself, check out the repository in to a local directory, verify the variables in `vars/common.yml`, then do the following:
```sh
ansible-galaxy install -r requirements.yml
ansible-playbook install_hyrax_on_localhost.yml
```
After initial setup you can use `ansible-playbook deploy.yml` and `ansible-playbook rollback.yml` to deploy and rollback various versions of the application code
### Carleton University Library's Use Of This Repo
Carleton University Library forked https://github.com/samvera/hyku and this repository into two private Github repositories. In those forks, we changed variables in the `common.yml` file including `ansistrano_git_repo` to point to our fork of Hyku. We moved the sensitive variables to an Ansible vault. Finally, we included the inventory and Ansible configuration needed to deploy on virtual machines in our environment.
## "Production Ready"
What does production ready mean? A production ready instance of Hyrax should be secure. It should be regularly backed-up. It should be easy to update. Finally, it should have good performance.
### Security
INCOMPLETE
The services running within the VM should ensure data is protected from unauthorized access and modification. If one were to gain access to a nonprivileged Linux user account, one should not be able to access or modify any IR data. *Currently, this is not true of these roles. Some services are available to any Linux user.*
### Backups
NEEDS TESTING, FEEDBACK WELCOME
The `hyrax` role has three backup scripts. They are copied to the /etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly directories, so the exact time they run is dependent on the distribution and system configuration.
Daily, Fedora 4 repository data (using the `fcr:backup` REST endpoint), PostgreSQL data (using `pg_dumpall`), the Redis `/var/lib/redis/dump.rdb` file and the Hyrax root directory are copied, tar'd together, and compressed. The resulting backup file has a datestamp and a MD5 checksum added to the filename. The daily backups reside in the `hyrax_backups_directory`/daily directory. As well as performing the backup, the daily backup script deletes any files in the daily directory that are older than 7 days old.
Weekly, a script moves the oldest daily backup to the `hyrax_backups_directory`/weekly directory. The weekly script also deletes files in the weekly directory that are older than 31 days old.
Monthly, a script moves the oldest weekly backup to the `hyrax_backups_directory`/monthly directory. The weekly script also deletes files in the monthly directory that are older than 365 days old.
For large deployments, this amount of backups might overwhelm local storage. It is recommended that the backup schedule and retention periods be tailored for your deployment.
It is up to local system administrators to copy backup data to a NAS or SAN, tape, or to cloud storage like Amazon Glacier or OLRC.
### Updates
NEEDS TESTING, FEEDBACK WELCOME
Where possible, it should be easy to keep a production server up-to-date. This means these roles should utilize well-known package repositories and package management tools when possible. Roles should be as idempotent and 'low impact' as possible, to encourage system administrators to run them regularly. Local modifications to community code should be minimal. These roles should not overwrite local changes and customizations.
### Performance
NEEDS TESTING, FEEDBACK WELCOME
These roles should install Hyrax so that it has good performance (max 500ms for most requests) on a "medium sized" server (4 core, 4GB RAM) when being used for typical workloads (documents and images, some multimedia, less than 10,000 digital objects). Community recommended software versions and configuration should be used.
## Software versions
Some software in the provided roles is installed using yum or apt from the default distribution repositories. The software version will depend on the distribution and release. For example, version 7 or 8 of Tomcat might be installed. For Java, OpenJDK version 8 is used.
Nginx is installed using that project's pre-built packages for the stable version, and not the default distribution repositories.
Node.js latest version 10.x is installed using the NodeSource repositories.
Some software is installed at a specific version:
* Fedora Repository {{ fedora4_version }} (Set using `fedora4_version` variable.)
* Solr {{ solr_version }} (Set using `solr_version` variable.)
* Ruby {{ ruby_version }} (Set using `ruby_version` variable.)
* FFmpeg {{ ffmpeg_version }} (Set using `ffmpeg_version` variable.)
* FITS {{ fits_version }} (Set using `fits_version` variable.)
FFmpeg is built with:
* cmake: {{ cmake_version }} (Set using `cmake_version` variable.)
* NASM {{ nasm_version }} (Set using `nasm_version` variable.)
* Yasm {{ yasm_version }} (Set using `yasm_version` variable.)
* x264: {{ x264_version }} (Set using `x264_version` variable.)
* x265: {{ x265_version }} (Set using `x265_version` variable.)
* fdk-aac: {{ fdk_aac_version }} (Set using `fdk_aac_version` variable.)
* lame: {{ lame_version }} (Set using `lame_version` variable.)
* opus: {{ opus_version }} (Set using `opus_version` variable.)
* libogg: {{ libogg_version }} (Set using `libogg_version` variable.)
* libvorbis: {{ libvorbis_version }} (Set using `libvorbis_version` variable.)
* aom: {{ aom_version }} (Set using `aom_version` variable.) Falling back to using a commit instead of a tagged release. The tarballs from https://aomedia.googlesource.com/aom/ are generated when requested for a particular tag. They are not stable releases, and as such do not have stable checksums. A checksum is not provided.
* libvpx: {{ libvpx_version }} (Set using `libvpx_version` variable.)
* libass: {{ libass_version }} (Set using `libass_version` variable.)
## Variables
|Variable|Notes|Default|
|---|---|---|
|`ansistrano_deploy_to` | Path where Hyrax app is deployed. | `{{ ansistrano_deploy_to }}` |
|`ansistrano_keep_releases` | Number of old releases to keep. | `{{ ansistrano_keep_releases }}` |
|`ansistrano_shared_paths` | Array of shared paths that are symlinked to each release. | `{{ ansistrano_shared_paths | join(', ') }}` |
|`ansistrano_shared_files` | Array of shared files that are symlinked to each release. | `{{ ansistrano_shared_files | join(', ') }}`|
|`ansistrano_deploy_via` | Ansistrano tool for depoyment, can by rsync or git. | `{{ ansistrano_deploy_via }}` |
|`ansistrano_git_repo` | Repo of the application to deploy | `{{ ansistrano_git_repo }}` |
|`ansistrano_git_branch` | What version of the repository to check out. This can be the full 40-character SHA-1 hash, the literal string HEAD, a branch name, or a tag name. | `{{ ansistrano_git_branch }}` |
|`ansistrano_git_identity_key_path` | If specified this file is copied over and used as the identity key for the git commands, path is relative to the playbook in which it is used ||
|`aom_version` | The version of aom to download. Used to build FFmpeg. | `{{ aom_version }}` |
|`bundler_version` | The version of the gem bundler to install. | `{{ bundler_version }}` |
|`cmake_checksum` | Verify the cmake-`{{ cmake_version }}`.tar.gz file, used by `get_url`. Format: `<algorithm>:<checksum>` | `{{ cmake_checksum }}` |
|`cmake_version` | The version of cmake to download. Used to build aom library for FFmpeg. | `{{ cmake_version }}` |
|`fdk_aac_checksum` | Verify the fdk-aac-`{{ fdk_aac_version }}`.tar.gz file, used by `get_url`. Format: `<algorithm>:<checksum>` | `{{ fdk_aac_checksum }}` |
|`fdk_aac_version` | The version of fdk-aac to download. Used to build FFmpeg. | `{{ fdk_aac_version }}` |
|`fedora4_checksum` | Verify the fcrepo-webapp-`{{ fedora4_version }}`.war file, used by `get_url` module. Format: `<algorithm>:<checksum>` | `{{ fedora4_checksum }}` |
|`fedora4_postgresqldatabase_user_password` | **Secure.** The password used by fedora4 to connect to Postgresql. | `{{ fedora4_postgresqldatabase_user_password }}` |
|`fedora4_version` | The version of Fedora 4 to download. | `{{ fedora4_version }}` |
|`ffmpeg_checksum` | Verify the ffmpeg-`{{ ffmpeg_version }}`.tar.bz2 file, used by `get_url`. Format: `<algorithm>:<checksum>` | `{{ ffmpeg_checksum }}` |
|`ffmpeg_compile_dir` | The directory where ffmpeg sources will be downloaded, unarchived, and compiled. | `{{ ffmpeg_compile_dir }}` |
|`ffmpeg_version` | The version of FFmpeg to download. | `{{ ffmpeg_version }}` |
|`fits_checksum` | Verify the fits-`{{ fits_version }}`.zip file, used by `get_url`. Format: `<algorithm>:<checksum>` | `{{ fits_checksum }}` |
|`fits_version` | The version of FITS to download. | `{{ fits_version }}` |
|`hyrax_backups_directory` | The location where backup files will be created. | `{{ hyrax_backups_directory }}` |
|`hyrax_database_pool_size` | The size of the database pool. | `{{ hyrax_database_pool_size }}` |
|`hyrax_from_email_address ` | The email address to use for the from field when sending emails from Hyrax. | `{{ hyrax_from_email_address }}` |
|`hyrax_postgresqldatabase_user_password` | **Secure.** The password used by hyrax to connect to Postgresql. | `{{ hyrax_postgresqldatabase_user_password }}` |
|`hyrax_secret_key_base` | **Secure.** The secret used by Rails for sessions etc. | `{{ hyrax_secret_key_base }}` |
|`hyrax_smtp_address` | Rails smtp address. | `{{ hyrax_smtp_address }}` |
|`hyrax_smtp_port` | Rails smtp port. | `{{ hyrax_smtp_port }}` |
|`hyrax_smtp_domain` | Rails smtp domain for mailer. | `{{ hyrax_smtp_domain }}` |
|`hyrax_smtp_user_name` | **Secure.** Rails smtp username for mailer. | `{{ hyrax_smtp_user_name }}` |
|`hyrax_smtp_password` | **Secure.** Rails smtp password for mailer. | `{{ hyrax_smtp_password }}` |
|`hyrax_smtp_authentication` | Rails authentication method for mailer. | `{{ hyrax_smtp_authentication }}` |
|`hyrax_smtp_enable_starttls_auto` | Rails starttls setting for mailer. | `{{ hyrax_smtp_enable_starttls_auto }}` |
|`hyrax_smtp_openssl_verify_mode` | Rails openssl verify mode for mailer. | `{{ hyrax_smtp_openssl_verify_mode }}` |
|`hyrax_smtp_ssl` | Rails ssl setting for mailer. | `{{ hyrax_smtp_ssl }}` |
|`hyrax_smtp_tls` | Rails tls setting for mailer. | `{{ hyrax_smtp_tls }}` |
|`imagemagick_package` | **Per-Distro** The name used by the `package` module when installing ImageMagick. ||
|`java_openjdk_package` | **Per-Distro** The name used by the `package` module when installing the Java JDK. ||
|`lame_checksum` | Verify the lame-`{{ lame_version }}`.tar.gz file, used by `get_url`. Format: `<algorithm>:<checksum>` | `{{ lame_checksum }}` |
|`lame_version` | The version of lame to download. Used to build FFmpeg. | `{{ lame_version }}` |
|`libass_checksum` | Verify the libass-`{{ yasm_version }}`.tar.gz file, used by `get_url`. Format: `<algorithm>:<checksum>` | `{{ libass_checksum }}` |
|`libass_version` | The version of libass to download. Used to build FFmpeg. | `{{ libass_version }}` |
|`libogg_checksum` | Verify the libogg-`{{ libogg_version }}`.tar.gz file, used by `get_url`. Format: `<algorithm>:<checksum>` | `{{ libogg_checksum }}` |
|`libogg_version` | The version of libogg to download. Used to build FFmpeg. | `{{ libogg_version }}` |
|`libvorbis_checksum` | Verify the libvorbis-`{{ libvorbis_version }}`.tar.gz file, used by `get_url`. Format: `<algorithm>:<checksum>` | `{{ libvorbis_checksum }}` |
|`libvorbis_version` | The version of libvorbis to download. Used to build FFmpeg. | `{{ libvorbis_version }}` |
|`libvpx_checksum` | Verify the libvpx-`{{ libvpx_version }}`.tar.gz file, used by `get_url`. Format: `<algorithm>:<checksum>` | `{{ libvpx_checksum }}` |
|`libvpx_version` | The version of libvpx to download. Used to build FFmpeg. | `{{ libvpx_version }}` |
|`make_jobs` | Sets an environment variable MAKEFLAGS to '-j X' in the test playbook. | `{{ make_jobs }}` |
|`nasm_checksum` | Verify the nasm-`{{ nasm_version }}`.tar.bz2 file, used by `get_url`. Format: `<algorithm>:<checksum>` | `{{ nasm_checksum }}` |
|`nasm_version` | The version of NASM to download. Used to build FFmpeg. | `{{ nasm_version }}` |
|`opus_checksum` | Verify the opus-`{{ opus_version }}`.tar.gz file, used by `get_url`. Format: `<algorithm>:<checksum>` | `{{ opus_checksum }}` |
|`opus_version` | The version of opus to download. Used to build FFmpeg. | `{{ opus_version }}` |
|`postgresql_contrib_package` | **Per-Distro** The name used by the `package` module when installing Postgresql's additional features. ||
|`postgresql_devel_package` | **Per-Distro** The name used by the `package` module when installing the Postgresql C headers and other development libraries. ||
|`postgresql_server_package` | **Per-Distro** The name used by the `package` module when installing the Postgresql server. ||
|`puma_web_concurency` | Number of Puma processes to start. | `{{ puma_web_concurency }}` |
|`python_psycopg2_package` | **Per-Distro** The name used by the `package` module when installing the Python Postgresql library (used by Ansible). ||
|`redis_package` | **Per-Distro** The name used by the `package` module when installing Redis. ||
|`ruby_install_version` | The version of ruby-install to download and install. | `{{ ruby_install_version }}` |
|`ruby_install_checksum` | Verify the ruby-instal tarball. | `{{ ruby_install_checksum }}` |
|`ruby_tarbz2_sha256_checksum` | Verify the ruby-`{{ ruby_version }}`.tar.bz2 file, used by `ruby-install`. Format: `<checksum>` | `{{ ruby_tarbz2_sha256_checksum }}` |
|`ruby_version` | The version of Ruby to download and install. | `{{ ruby_version }}` |
|`sidekiq_threads` | Tune the number of sidekiq threads that will be started. | `{{ sidekiq_threads }}` |
|`solr_checksum` | Verify the solr-`{{ solr_version }}`.tgz file, used by `get_url` module. Format: `<algorithm>:<checksum>` | `{{ solr_checksum }}` |
|`solr_mirror` | The mirror to use when downloading Solr. | `{{ solr_mirror }}` |
|`solr_version` | The version of Solr to download. | `{{ solr_version }}` |
|`tomcat_admin_package` | **Per-Distro** The name used by the `package` module when installing the tomcat manager webapps. ||
|`tomcat_fedora4_conf_path` | **Per-Distro** The path for the configuration file which sets JAVA_OPTS for Fedora4. ||
|`tomcat_fedora4_war_path` | **Per-Distro** The path at which the fedora4 war file will be copied. ||
|`tomcat_group` | **Per-Distro** The primary group for the tomcat service user. ||
|`tomcat_package` | **Per-Distro** The name used by the `package` module when installing tomcat. ||
|`tomcat_service_name` | **Per-Distro** The name of the tomcat service. ||
|`tomcat_user_password` | **Secure.** The password used to build the tomcat-users.xml file. | `{{ tomcat_user_password }}` |
|`tomcat_user` | **Per-Distro** The user which runs the tomcat service. ||
|`tomcat_users_conf_path` | **Per-Distro** The path for tomcat-users.xml. ||
|`x264_checksum` | Verify the x264-snapshot-`{{ x264_version }}`.tar.bz2 file, used by `get_url`. Format: `<algorithm>:<checksum>` | `{{ x264_checksum }}` |
|`x264_version` | The version of x264 to download. Used to build FFmpeg. | `{{ x264_version }}` |
|`x265_version` | The version of x265 to download. Used to build FFmpeg. | `{{ x265_version }}` |
|`yasm_checksum` | Verify the yasm-`{{ yasm_version }}`.tar.gz file, used by `get_url`. Format: `<algorithm>:<checksum>` | `{{ yasm_checksum }}` |
|`yasm_version` | The version of Yasm to download. Used to build FFmpeg. | `{{ yasm_version }}` |
**Per-Distro**: Different value for different OSs. The test playbook uses
```
vars_files:
- "vars/common.yml"
- "vars/{{ '{{' }} ansible_distribution {{ '}}' }}.yml"
```
and per-distribution variable files to provide different variables for different distributions.
**Secure**: Variables which should be different per-host and stored securely using Ansible Vaults or a tool like Hashicorp Vault. The test playbook insecurely puts these variables in `vars/common.yml`.