Running your own copy
Probably the most reliable way to run a copy of ConceptNet is to use the machine image that runs it on Amazon Web Services.
It's also possible to run ConceptNet on a computer that's not owned by Amazon, of course; see Build process for the instructions for setting it up from scratch. However, using an AMI lets us ensure the system is in the right configuration and skip a bunch of steps.
- Go to https://aws.amazon.com/ec2/, sign up if necessary, and log in if necessary.
- The ConceptNet image is hosted in the "US East (N. Virginia)" region, also known as us-east-1. You may be in a different region by default. If the black bar in the upper right shows a different region, click the drop-down and choose "US East (N. Virginia)".
- Click "Launch Instance".
- Choose "Community AMIs", search for "conceptnet-5.6", and select that image (ami-9c6dcfe1).
- Choose a machine type to launch.
  - You can run the API on a t2.medium or better (currently less than 5 cents per hour).
  - We run the real server on an m4.large so that it has the capacity to respond to a reasonable number of API requests.
  - If you want to be able to modify and rebuild the data, you'll need an r4.xlarge or better, so that you have access to at least 30 GB of RAM.
- Proceed to "Configure Instance Details". Set "Auto-assign Public IP" to "Enable".
- Proceed to "Add Storage". The defaults should be fine.
- Proceed to "Configure Security Group". Add a rule allowing HTTP. The default IP range of 0.0.0.0/0, ::/0 (all addresses) is probably what you want.
- "Review and Launch". Download the security key that you'll need to log into the system.
- On your EC2 instances list, take note of the public IP of your new machine. Let's call it YOUR.IP.ADDR.
- Connect to the machine over ssh, using the security key you downloaded, by following Amazon's instructions.
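For reference, the SSH step usually comes down to a single command. The following is a minimal sketch, not a replacement for Amazon's instructions: the function name connect_conceptnet is made up for illustration, and the key file path is whatever you saved the downloaded key as.

```shell
# Open an SSH session on the new machine as the "ubuntu" user.
# connect_conceptnet is a hypothetical helper name, not part of ConceptNet.
#   $1: path to the private key file downloaded at launch
#   $2: public IP address of the instance (YOUR.IP.ADDR)
connect_conceptnet() {
    local key_file="$1"
    local host="$2"
    # ssh refuses to use a private key that other users can read.
    chmod 400 "$key_file" || return 1
    ssh -i "$key_file" "ubuntu@$host"
}
```

Calling connect_conceptnet with the path to your key file and YOUR.IP.ADDR should leave you at a shell prompt as the ubuntu user.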
The system has two users you need to care about:
- ubuntu is the user you connect to using your SSH key. This user has the ability to run super-user commands with sudo. Its home directory contains the conceptnet-puppet repository, containing the scripts for configuring a machine to run ConceptNet.
- conceptnet is the user that runs the ConceptNet code. It does not have super-user privileges. Its home directory contains the ConceptNet code and data.
To become the conceptnet user, run sudo su conceptnet. For example, as this user, you can connect to ConceptNet's PostgreSQL database by running psql conceptnet5, or get a Python prompt where the conceptnet5 package is installed by running ipython.
You should make sure that the machine is up to date, both with security fixes to Ubuntu packages and with bug fixes to the ConceptNet code. To update Ubuntu packages, run:
sudo apt update
sudo apt dist-upgrade
You should also re-run the Puppet setup script, which will make sure that the ConceptNet code is up to date. (If you don't run this, you may encounter server errors due to running an outdated version of the code.)
cd conceptnet-puppet
./puppet-apply.sh
ConceptNet is running as a systemd service. Something has probably changed, so you should restart the service to run the new code:
sudo systemctl restart conceptnet
If something goes wrong, you'll want to look at the logs:
journalctl -u conceptnet
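The restart and the log check can be combined, so that the logs are only shown when the service did not come back up. This is a minimal sketch run as the ubuntu user; the function name restart_and_check and the choice of 50 log lines are assumptions made for illustration, not part of the ConceptNet scripts.

```shell
# Restart a systemd service and report whether it is active afterwards.
# restart_and_check is a hypothetical helper name.
#   $1: service name (defaults to "conceptnet", the service described above)
restart_and_check() {
    local service="${1:-conceptnet}"
    sudo systemctl restart "$service"
    if systemctl is-active --quiet "$service"; then
        echo "$service is running"
    else
        # Show the most recent log lines to help diagnose the failure.
        echo "$service failed to start; recent log lines:" >&2
        journalctl -u "$service" -n 50 --no-pager >&2
        return 1
    fi
}
```

Note that systemd may report a service as active for a moment before it crashes, so it is still worth looking at journalctl -u conceptnet if requests fail later.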
The ConceptNet API is now running on your server. While logged into the server, you can run:
curl http://localhost/
For an actually interesting response:
curl http://localhost/c/en/example
To access the API externally, you can go to http://YOUR.IP.ADDR/ (the IP address that you took note of earlier) from another machine. This should also get you the ConceptNet API.
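If you want to check from a script that the API is answering, for example right after a restart, you can poll it with curl until a request succeeds. This is a minimal sketch: the function name wait_for_api and its default values (10 attempts, 5 seconds apart, a 30-second limit per request) are assumptions chosen for illustration.

```shell
# Poll a URL until it returns a successful HTTP response.
# wait_for_api is a hypothetical helper name.
#   $1: URL to request (defaults to the example query shown above)
#   $2: maximum number of attempts (default 10)
#   $3: seconds to sleep between attempts (default 5)
wait_for_api() {
    local url="${1:-http://localhost/c/en/example}"
    local attempts="${2:-10}"
    local delay="${3:-5}"
    local i=1
    while [ "$i" -le "$attempts" ]; do
        # --fail makes curl exit non-zero on HTTP error statuses.
        if curl --silent --fail --max-time 30 --output /dev/null "$url"; then
            echo "API responded after $i attempt(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "API did not respond at $url" >&2
    return 1
}
```

From another machine, the same function can be pointed at http://YOUR.IP.ADDR/c/en/example to confirm that the security group allows HTTP.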
The server is also serving the Web frontend, which it will use if your hostname is conceptnet.io (which it isn't, because that's ours) or www.conceptnet.localhost. The latter hostname is configured on the machine as an alias for localhost, so you can test the Web frontend with:
curl http://www.conceptnet.localhost
Here's something frustrating we learned about AMIs: when you start a machine from an image, its disk starts out in some sort of cold storage, and every region of the disk that you access for the first time has to be "warmed up" (Amazon's documentation describes this as initializing a volume created from a snapshot). Unfortunately, this tends to happen in the middle of a ConceptNet API query, and you end up waiting for so much of the disk to warm up that the request times out.
Here's a straightforward command to warm up the entire disk, by accessing every byte of data on the disk:
sudo dd if=/dev/xvda of=/dev/null bs=16M
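The device name /dev/xvda matches this image on the instance types listed above, but other instance types may expose the root disk under a different name (for example /dev/nvme0n1), which you can check with lsblk. The sketch below wraps the same dd command with a check that the device exists; the function name warm_up is made up for illustration.

```shell
# Read every block of a disk once and discard the data.
# warm_up is a hypothetical helper name.
#   $1: device to read (defaults to /dev/xvda, as in the command above)
warm_up() {
    local device="${1:-/dev/xvda}"
    if [ ! -e "$device" ]; then
        echo "$device not found; list the available disks with lsblk" >&2
        return 1
    fi
    # Same command as above: read in 16 MB chunks, write nowhere.
    sudo dd if="$device" of=/dev/null bs=16M
}
```

Reading the whole disk can take a while, so it may be worth running this inside a terminal multiplexer such as screen or tmux in case the SSH session drops.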