
How do I get and install it?

Justin Gawrilow edited this page Jan 27, 2014 · 3 revisions

What do I need?

In short, you'll need a cloud with Hadoop and ZooKeeper running on it. Cloudera's Distribution with Apache Hadoop (CDH) is recommended, but if you don't have an instance you can use, there are plenty of cloud-based VMs out there. You can also check out our XDATA VM, an Ubuntu VM pre-loaded with CDH 4.5.0 and other great software, with Louvain Modularity already installed!

I have a cloud. Now what?

My cloud is CDH 4.5.0

We have just the thing for you: a pre-built release for CDH 4.5.0 with Giraph 1.0.0 that you can get started with immediately!

My cloud is some other CDH 4.x.x

Not to worry! You'll have a few minor edits to make and a couple of build processes to endure, but you'll be up and running in no time. Following these instructions will get you there!

My cloud is CDH 5.x.x

You're really keeping up with the Joneses! This is our latest and greatest build, and it uses the latest Giraph, 1.1.0. If you follow the build instructions here you should be good to go!

I think it's installed. How do I know if it's working?

My cloud is CDH 4.5.0 or CDH 4.x.x

Before testing the install, there is a single configuration option to set, in louvain.py at line 8. If you're using the XDATA VM, this property is already set to your local ZooKeeper server and you're good to go. If you're using another cloud, you'll need to specify the server/port list here. Once this is set, you're ready to run against the included sample data to test your install and also get a feel for the process and what's happening!
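If you're not sure what a server/port list should look like, here is a minimal sketch. It assumes the usual ZooKeeper convention of comma-separated host:port pairs, with 2181 as the default client port. The variable name `ZOOKEEPER_LIST` and both helper functions are hypothetical and are not taken from louvain.py, so check line 8 of your copy for the real property name. The optional health check sends ZooKeeper's `ruok` four-letter command, which some newer ZooKeeper setups disable.

```python
import socket

# Hypothetical name: the actual property at line 8 of louvain.py may differ.
# Assumed format: comma-separated host:port pairs (2181 is ZooKeeper's default client port).
ZOOKEEPER_LIST = "localhost:2181"


def parse_server_list(server_list):
    """Split 'host1:2181,host2:2181' into [(host, port), ...]."""
    servers = []
    for entry in server_list.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError("expected host:port, got %r" % entry)
        servers.append((host, int(port)))
    return servers


def zookeeper_is_ok(host, port, timeout=3.0):
    """Send the 'ruok' four-letter command; a healthy server answers 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"ruok")
            return conn.recv(16) == b"imok"
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    for host, port in parse_server_list(ZOOKEEPER_LIST):
        status = "ok" if zookeeper_is_ok(host, port) else "unreachable"
        print("%s:%d %s" % (host, port, status))
```

Running a check like this before the sample job can help tell a mistyped server list apart from a problem with the install itself.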

My cloud is CDH 5.x.x