Read research_guide.doc first, then next_step.doc.
This repository contains a Jupyter Notebook that performs statistical analysis on a House Listing Dataset. The analysis includes inference on features like property type, dimensions, age, locality, etc.
The input CSV file, train.csv, contains the columns described in data_description.txt.
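As a minimal sketch of a first look at the input file, the snippet below loads a CSV with pandas and reports its shape, column names, and the columns that contain missing values. The helper name summarize_csv and the inline three-row sample (with hypothetical column names) are assumptions for illustration only; they are not taken from train.csv or data_description.txt.

```python
import io

import pandas as pd


def summarize_csv(source):
    """Load a CSV and return its shape, column names, and missing-value counts.

    `source` can be a file path such as "train.csv" or any file-like object.
    """
    df = pd.read_csv(source)
    missing = df.isna().sum()
    return {
        "shape": df.shape,
        "columns": list(df.columns),
        # Keep only the columns that actually have missing values.
        "missing": {col: int(n) for col, n in missing.items() if n > 0},
    }


# Hypothetical three-row sample standing in for train.csv.
sample = io.StringIO(
    "Id,PropertyType,AreaSqFt,Price\n"
    "1,House,1200,250000\n"
    "2,Flat,,180000\n"
    "3,House,1500,310000\n"
)
summary = summarize_csv(sample)
print(summary)
```

To run it against the real data, pass the path instead of the sample, for example summarize_csv("train.csv"), and compare the reported columns with data_description.txt.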
- main.ipynb: The Jupyter Notebook that performs data preprocessing, analysis, and visualization.
- train.csv: The input CSV dataset file containing the information used for analysis.
- Exploratory Data Analysis (EDA):
  - Descriptive statistics of the dataset.
  - Visualization of trends and correlations.
- Statistical Inference:
  - Hypothesis testing.
  - Analysis of property-related features.
- Recommendations:
  - Suggestions based on findings.
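The steps listed above could look roughly like the following sketch, which computes descriptive statistics, a correlation matrix, and one hypothesis test. It is not the notebook's actual code: the helper compare_group_means, the toy data, and the column names PropertyType, AreaSqFt, and Price are all hypothetical, and Welch's t-test is just one reasonable choice for comparing two group means.

```python
import pandas as pd
from scipy import stats


def compare_group_means(df, group_col, value_col, group_a, group_b):
    """Run Welch's t-test on value_col between two levels of group_col."""
    a = df.loc[df[group_col] == group_a, value_col].dropna()
    b = df.loc[df[group_col] == group_b, value_col].dropna()
    # equal_var=False selects Welch's test, which does not assume equal variances.
    result = stats.ttest_ind(a, b, equal_var=False)
    return {
        "mean_a": float(a.mean()),
        "mean_b": float(b.mean()),
        "t_stat": float(result.statistic),
        "p_value": float(result.pvalue),
    }


# Hypothetical toy data; the column names are placeholders, not the train.csv schema.
listings = pd.DataFrame({
    "PropertyType": ["House"] * 5 + ["Flat"] * 5,
    "AreaSqFt": [1500, 1600, 1450, 1700, 1550, 900, 950, 1000, 880, 920],
    "Price": [300, 320, 290, 350, 310, 180, 190, 200, 175, 185],
})

print(listings.describe())                     # descriptive statistics
print(listings[["AreaSqFt", "Price"]].corr())  # Pearson correlation matrix
outcome = compare_group_means(listings, "PropertyType", "Price", "House", "Flat")
print(outcome)
```

A small p-value here only says the two toy groups differ in mean price; with real data the test choice should follow from the distribution and sample sizes of the groups being compared.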
- Clone this repository:
    git clone https://github.com/samar-080301/house_price_EDA
- Navigate to the repository folder:
    cd house_price_EDA
- Install the required dependencies:
    pip install -r requirements.txt
- Open the Jupyter Notebook:
    jupyter notebook main.ipynb
- Run the notebook cells sequentially.
Ensure the following Python libraries are installed:
- pandas
- numpy
- matplotlib
- seaborn
- scipy
You can install them using:
pip install pandas numpy matplotlib seaborn scipy
The notebook generates:
- Summary statistics of the dataset.
- Graphical visualizations (e.g., bar charts, scatter plots).
- Insights into property features and trends.
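For a sense of the kind of charts mentioned above, here is a small sketch that draws a bar chart and a scatter plot with matplotlib. The data and the column names PropertyType, AreaSqFt, and Price are hypothetical, and the notebook's own plots may be built differently (for example with seaborn).

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch also runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data; not taken from train.csv.
listings = pd.DataFrame({
    "PropertyType": ["House", "House", "House", "Flat", "Flat", "Flat"],
    "AreaSqFt": [1500, 1600, 1450, 900, 950, 1000],
    "Price": [300, 320, 290, 180, 190, 200],
})

fig, (ax_bar, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

# Bar chart: mean price per property type.
mean_price = listings.groupby("PropertyType")["Price"].mean()
ax_bar.bar(list(mean_price.index), list(mean_price.values))
ax_bar.set_xlabel("PropertyType")
ax_bar.set_ylabel("Mean Price")

# Scatter plot: area against price.
ax_scatter.scatter(listings["AreaSqFt"], listings["Price"])
ax_scatter.set_xlabel("AreaSqFt")
ax_scatter.set_ylabel("Price")

fig.tight_layout()
```

Inside Jupyter, the matplotlib.use("Agg") line can be dropped so the figure renders inline.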
Samar Pratap Singh
MBA Student | Data Analysis Enthusiast
This project is licensed under the MIT License. See the LICENSE file for details.