Data Mining Project

This repository holds the contents of the Data Mining project for the academic year 2022/23 at the University of Pisa.

The project is divided into four parts:

Part 1 - Data Preparation

Exploration of the data using dimensionality reduction, outlier detection, and imbalanced learning techniques.
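
As an illustration of the imbalanced learning step, here is a minimal sketch of random oversampling in plain Python. It is not the code used in the project: the function name `random_oversample` and the toy data are assumptions made for this example only.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Candidate rows to duplicate for this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (made up): four samples of class "a", one of class "b".
rows = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = ["a", "a", "a", "a", "b"]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling should normally be applied to the training split only, so that duplicated rows do not leak into the evaluation data.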

Part 2 - Advanced Classification and Regression

Several classification algorithms, including logistic regression, Support Vector Machines, neural networks, and ensemble methods, are applied to a multi-class classification problem. Different regression algorithms are also implemented.

Part 3 - Time Series Analysis

Starting from audio files, the data was extracted, preprocessed, and analyzed as time series. We performed clustering with different algorithms and distance metrics, and ran classification tasks using different kinds of approximations.
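
To show why the choice of distance metric matters for time series, the sketch below compares Euclidean distance with dynamic time warping (DTW) on two toy series. DTW is used here only as a common example of a time series distance; the README does not state which metrics the project used, and the functions and data are illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Point-by-point distance; both series must have the same length."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Dynamic time warping distance with absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Toy series (made up): s2 is s1 shifted by one time step.
s1 = [0, 0, 1, 2, 1, 0]
s2 = [0, 1, 2, 1, 0, 0]
print(euclidean(s1, s2))  # 2.0
print(dtw(s1, s2))        # 0.0
```

Because DTW can stretch the time axis, the shifted copy is at distance zero, while Euclidean distance penalizes the misalignment. A clustering run on the same data can therefore group series differently depending on the metric.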

Part 4 - Explainability

Finally, a focus on explainability, using both a local explainer (counterfactuals) and a global one (TREPAN), to gain insight into the black-box models used throughout the project.
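
The idea behind a counterfactual explanation can be sketched as follows: given an instance and a black-box prediction, look for a small change to the input that flips the prediction. The classifier `black_box` and the brute-force search `find_counterfactual` are hypothetical stand-ins written for this example; they are not the explainer used in the project.

```python
def black_box(x):
    """Hypothetical stand-in classifier: predicts 1 when a weighted
    sum of the two features exceeds a threshold."""
    return 1 if 2.0 * x[0] + 1.0 * x[1] > 3.0 else 0


def find_counterfactual(predict, x, step=0.1, max_steps=100):
    """Search for the smallest single-feature change (in multiples of
    `step`) that flips the prediction. Returns None if nothing is found."""
    original = predict(x)
    for k in range(1, max_steps + 1):
        for i in range(len(x)):
            for sign in (1, -1):
                candidate = list(x)
                candidate[i] = x[i] + sign * k * step
                if predict(candidate) != original:
                    return candidate
    return None


x = [1.0, 0.5]  # weighted sum is 2.5, so the prediction is 0
cf = find_counterfactual(black_box, x)
print(x, "->", cf, "flips prediction to", black_box(cf))
```

The counterfactual reads as an explanation of the form "had the first feature been about 0.3 higher, the model would have predicted class 1". Practical counterfactual explainers add constraints such as plausibility and sparsity, which this sketch leaves out.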