Data Mining Project

This repository contains the Data Mining project for the academic year 2022/23 at the University of Pisa.

The project is divided into four parts:
Part 1 - Data Preparation
Exploration of the data using various dimensionality reduction, outlier detection and imbalanced learning techniques.
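As a rough illustration of what an outlier detection step can look like, here is a minimal sketch of the interquartile range (IQR) rule. It is an assumption chosen for brevity, not necessarily one of the techniques used in this project, and the function name `iqr_outliers` and the sample values are hypothetical.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Illustrative helper (hypothetical name), not taken from the project code.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Made-up sample: five ordinary values and one extreme value.
sample = [1, 2, 3, 4, 5, 100]
flagged = iqr_outliers(sample)
```

The multiplier `k=1.5` is the conventional default for this rule; a larger value flags fewer points.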
Part 2 - Advanced Classification and Regression
Use of multiple classification algorithms, such as Logistic Regression, Support Vector Machines, Neural Networks and Ensemble Methods, to solve a multi-class classification problem, plus the implementation of several regression algorithms.
Part 3 - Time Series Analysis
Starting from audio files, the data was extracted, preprocessed and analyzed as time series. We performed clustering with different algorithms and distance metrics, as well as classification tasks using different kinds of approximations.
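To show why the choice of distance metric matters for time series, the sketch below compares the Euclidean distance with Dynamic Time Warping (DTW). DTW is assumed here as a typical example of a time series distance; the README does not state which metrics the project actually used, and the function names are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Point-by-point distance; requires series of equal length."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Dynamic Time Warping distance with absolute difference as local cost."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matches an earlier point of b
                cost[i][j - 1],      # b[j-1] also matches an earlier point of a
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


# The second series is the first one with its initial value repeated.
shifted = dtw_distance([0, 1, 2], [0, 0, 1, 2])
```

Unlike the Euclidean distance, DTW can compare series of different lengths and tolerates shifts along the time axis, at the price of a quadratic number of operations in the series length.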
Part 4 - Explainability
Finally, a focus on explainability, using both a local explainer (counterfactuals) and a global one (TREPAN), to gain insight into the black-box models used throughout the project.
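The idea behind a counterfactual explanation can be sketched as follows: given an instance and a black-box prediction, look for a small change to the instance that flips the prediction. The code below is a toy brute-force search under stated assumptions. `black_box` is a hypothetical stand-in for a trained model, and `find_counterfactual` is an illustrative helper, not the explainer used in the project.

```python
def black_box(x):
    """Hypothetical stand-in for a trained classifier (fixed linear rule)."""
    return 1 if 0.8 * x[0] - 0.5 * x[1] - 1.0 > 0 else 0


def find_counterfactual(predict, x, step=0.1, max_steps=100):
    """Search for a single-feature change that flips the prediction.

    Perturbations grow in increments of `step`, so the first hit is the
    smallest change found on this grid. Returns None if nothing flips.
    """
    original = predict(x)
    for k in range(1, max_steps + 1):
        for i in range(len(x)):
            for sign in (1, -1):
                candidate = list(x)
                candidate[i] += sign * k * step
                if predict(candidate) != original:
                    return candidate
    return None


instance = [1.0, 1.0]                 # made-up instance, predicted as class 0
counterfactual = find_counterfactual(black_box, instance)
```

The returned point reads as a local explanation: it shows which feature had to change, and by how much, for the model to decide differently on this instance.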