deyaberger/DSLR

DSLR

Goal: Implement logistic regression on a given dataset (see subject_dslr.pdf).

Prerequisites:

If you do not have python3, run:
apt-get install python3
To create a virtual environment, run:
python3 -m venv [your_env_name]
Then: source [your_env_name]/bin/activate
Finally: pip install -r requirements.txt
All the commands below are run from the src directory: cd src

I. Data Analysis

Run python3 describe.py [a_dataset.csv] to get the description of a dataset.
Use -h to display the usage and the options
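
For readers who want to see what kind of statistics such a description involves, here is a minimal sketch in plain Python. It is not the code of describe.py: the function names percentile and describe_column are hypothetical, and the choices of sample standard deviation (division by n - 1) and linear interpolation for the quartiles are assumptions.

```python
import math


def percentile(sorted_vals, p):
    """Percentile (p between 0 and 1) of a pre-sorted list, by linear interpolation."""
    k = (len(sorted_vals) - 1) * p
    lo, hi = math.floor(k), math.ceil(k)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def describe_column(values):
    """Summary statistics for one numeric column; None entries are skipped."""
    vals = sorted(v for v in values if v is not None)
    n = len(vals)
    if n == 0:
        raise ValueError("column has no numeric values")
    mean = sum(vals) / n
    # Assumption: sample variance (n - 1 in the denominator).
    var = sum((v - mean) ** 2 for v in vals) / (n - 1) if n > 1 else float("nan")
    return {
        "count": n,
        "mean": mean,
        "std": math.sqrt(var),
        "min": vals[0],
        "25%": percentile(vals, 0.25),
        "50%": percentile(vals, 0.50),
        "75%": percentile(vals, 0.75),
        "max": vals[-1],
    }
```

Calling describe_column on each numeric column of the CSV file, after converting empty cells to None, would give one block of statistics per feature.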

II. Data Visualization

Use -h to display the usage and the options for the following scripts:

  1. Run python3 histogram.py ../datasets/dataset_train.csv to display the histogram that answers the question:
    Which Hogwarts class has a homogeneous distribution of grades across the four houses?

  2. Run python3 scatter_plot.py ../datasets/dataset_train.csv to display a scatter plot that answers the following question:
    Which two features are similar?

  3. Run python3 pair_plot.py ../datasets/dataset_train.csv to display a pair plot that answers the following question:
    Which features are we going to use for training?

III. Logistic Regression

Use -h to display the usage and the options for the following scripts:

  1. Run python3 logreg_train.py ../datasets/dataset_train.csv to train the model. It should create a file called "weights.pkl" that is used by the prediction program.

  2. Run python3 logreg_predict.py ../datasets/dataset_test.csv weights.pkl to predict the houses of the students in the test dataset. It should create a CSV file called "house.csv" where all the predictions are saved.
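
The two steps above can be illustrated with a minimal one-vs-all logistic regression sketch in plain Python. This is an illustration under stated assumptions, not the repository's implementation: the function names train_one_vs_all and predict, the learning rate, the number of epochs, and the use of batch gradient descent on already-normalized features are all hypothetical choices.

```python
import math
import pickle


def sigmoid(z):
    # Clamp z so that math.exp cannot overflow on extreme inputs.
    z = max(-500.0, min(500.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def probability(theta, row):
    """Probability given by one binary classifier; theta[0] is the bias."""
    return sigmoid(theta[0] + sum(w * x for w, x in zip(theta[1:], row)))


def train_one_vs_all(X, y, labels, lr=0.1, epochs=1000):
    """Train one binary classifier per label with batch gradient descent.

    X is a list of feature rows (assumed already normalized), y the list of labels.
    """
    m, n = len(X), len(X[0])
    weights = {}
    for label in labels:
        theta = [0.0] * (n + 1)
        targets = [1.0 if yi == label else 0.0 for yi in y]
        for _ in range(epochs):
            grad = [0.0] * (n + 1)
            for row, target in zip(X, targets):
                err = probability(theta, row) - target
                grad[0] += err
                for j, x in enumerate(row):
                    grad[j + 1] += err * x
            theta = [w - lr * g / m for w, g in zip(theta, grad)]
        weights[label] = theta
    return weights


def predict(weights, row):
    """Return the label whose classifier gives the highest probability."""
    return max(weights, key=lambda label: probability(weights[label], row))
```

In this sketch the weights are a plain dictionary, so they could be saved with pickle.dump(weights, f) after training and restored with pickle.load(f) before prediction, which is one plausible way a file such as "weights.pkl" could be produced and consumed.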

Visual examples:
