K-Means clustering

Luca Todaro and Rasha Zieni, University of Pavia
This repository contains the code for our Advanced Computer Architecture project work: serial and parallel implementations of the k-means clustering algorithm, written in C++. Oh, and by the way, we got an A+. :D

Project description

We run the k-means algorithm both in C++ (our own serial and parallel implementations) and in Python, using the serial implementation from Scikit-Learn. The aim of the project is to demonstrate the performance increase when switching from serial to parallel programming. We also decided to spice things up by comparing the speed of our serial C++ implementation against the Python one from a well-known library (which we assume is a state-of-the-art implementation!).
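To give an idea of where the parallel speed-up can come from, here is a minimal sketch of one k-means iteration (assignment step plus centroid update). It is an illustration under our own assumptions, not the code in this repository: the `Point` type, the function names and the use of an OpenMP pragma are all hypothetical choices made for the example. Without OpenMP enabled the pragma is ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical 2-D point type; the project's real data layout may differ.
struct Point {
    double x;
    double y;
};

// Squared Euclidean distance between two points.
double squaredDistance(const Point& a, const Point& b) {
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Assignment step: label each point with the index of its nearest centroid.
// Every point is handled independently, so the outer loop can be split
// across threads. The pragma is an assumption made for this sketch.
std::vector<int> assignClusters(const std::vector<Point>& points,
                                const std::vector<Point>& centroids) {
    assert(!centroids.empty());
    std::vector<int> labels(points.size(), 0);
    const long n = static_cast<long>(points.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        const std::size_t idx = static_cast<std::size_t>(i);
        double best = std::numeric_limits<double>::max();
        for (std::size_t c = 0; c < centroids.size(); ++c) {
            const double d = squaredDistance(points[idx], centroids[c]);
            if (d < best) {
                best = d;
                labels[idx] = static_cast<int>(c);
            }
        }
    }
    return labels;
}

// Update step: move each centroid to the mean of its assigned points.
// A centroid with no assigned points keeps its previous position.
std::vector<Point> updateCentroids(const std::vector<Point>& points,
                                   const std::vector<int>& labels,
                                   const std::vector<Point>& centroids) {
    std::vector<Point> sums(centroids.size(), Point{0.0, 0.0});
    std::vector<std::size_t> counts(centroids.size(), 0);
    for (std::size_t i = 0; i < points.size(); ++i) {
        const std::size_t c = static_cast<std::size_t>(labels[i]);
        sums[c].x += points[i].x;
        sums[c].y += points[i].y;
        ++counts[c];
    }
    std::vector<Point> result = centroids;
    for (std::size_t c = 0; c < centroids.size(); ++c) {
        if (counts[c] > 0) {
            const double m = static_cast<double>(counts[c]);
            result[c] = Point{sums[c].x / m, sums[c].y / m};
        }
    }
    return result;
}
```

Calling `assignClusters` and then `updateCentroids` in a loop until the labels stop changing gives the usual iterative scheme. The assignment step is the natural candidate for parallelization because it dominates the cost and has no dependencies between points.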

Because the algorithm is intrinsically non-deterministic (its result depends on the initial centroids, which are typically chosen at random), we decided to run it on pre-defined datasets to decrease the variance between experiments. Please check the input folder for more information about the datasets we used.
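As a small illustration of how run-to-run variance can be tamed, the sketch below picks the initial centroids with a fixed seed, so the same seed always selects the same starting points. This is an assumption made for the example, not necessarily how this repository initializes its centroids, and the function name `pickInitialCentroids` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Returns k distinct indices into a dataset of n points, chosen pseudo-randomly.
// With the same (n, k, seed) the result is identical on a given toolchain,
// which makes repeated experiments start from the same centroids.
std::vector<std::size_t> pickInitialCentroids(std::size_t n,
                                              std::size_t k,
                                              unsigned seed) {
    assert(k <= n);
    std::vector<std::size_t> indices(n);
    std::iota(indices.begin(), indices.end(), std::size_t{0});  // 0, 1, ..., n-1
    std::mt19937 rng(seed);                                     // fixed seed
    std::shuffle(indices.begin(), indices.end(), rng);
    indices.resize(k);                                          // keep the first k
    return indices;
}
```

Two calls with the same seed return the same indices, while a different seed will generally return a different selection. Note that `std::shuffle` is not guaranteed to produce the same permutation across different standard library implementations, so results are reproducible per toolchain rather than universally.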

Run the code on Google Cloud Platform

We included a simple script (prepare-project.sh) that helps you configure the environment on your virtual machine. All you need to do is install git with

sudo apt-get install git

and then clone this repo:

git clone https://github.com/A7F/aca-kmeans.git

Move to the newly created folder with cd aca-kmeans and make the script prepare-project.sh executable with

chmod +x ./prepare-project.sh

Please remember to set the correct working directory inside the script you want to run first! For example, if you want to run the serial script, paste your path into the variable base_dir. After that, the bash script we provided automatically builds the project and places the executable in the same directory, so all you need to do is run ./aca_kmeans.

Some useful resources...

Here is a small section of useful links and resources to keep at hand, you know... just in case...

Project presentation and project report

Course links

ACA project work page and course index