Skip to content

fandahao17/K-means-with-dimension-reduction

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

32 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

K-means-with-dimension-reduction

This is the repository for our final project of the Algorithms for Big Data course at University of Science and Technology of China.

Based on Christos Boutsidis's papers Random Projections for k-means Clustering and Unsupervised Feature Selection for the k-means Clustering Problem, we looked for ways to apply classical dimensionality reduction methods, i.e. JL-transform and SVD, to the K-means clustering problem.

We found some ways to improve the performance of the original methods proposed by Christos and conducted some experiments to evaluated the effectiveness of our improvements. These improvements include:

  • Fast JL-transorm to speed up the random projection process.
  • Mailman algorithm to make matrix multiplication faster.
  • A randomized SVD algorithm to get a highly accurate approximation within little time
  • Some thoughts and hypothesis to improve the rescaling process in random feature selection
  • Fast and accurate PCA-like feature extraction with only K constructed features

This repository contains code and evaluation results for our project. Feel free to check them out and discuss with us!

For more information, please check our technical report.

About

Reproduction of Boutsdis et al. 's NIPS'09 and NIPS'10 papers

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published