An association of home owners are seeking to know what features in a house could be remodelled to increase the house prices and what features they should add as they list their houses in the market. I will be providing insights on the best remodeling options for them and what features to highlight when selling their houses.
- The main goal of the project is to find the features that should be considered when building, renovating or selling a house.
The project used the King County House which can be found under data and the description of the columns is as below:
id
- Unique identifier for a housedate
- Date house was soldprice
- Sale price (prediction target)bedrooms
- Number of bedroomsbathrooms
- Number of bathroomssqft_living
- Square footage of living space in the homesqft_lot
- Square footage of the lotfloors
- Number of floors (levels) in housewaterfront
- Whether the house is on a waterfront- Includes Duwamish, Elliott Bay, Puget Sound, Lake Union, Ship Canal, Lake Washington, Lake Sammamish, other lake, and river/slough waterfronts
view
- Quality of view from house- Includes views of Mt. Rainier, Olympics, Cascades, Territorial, Seattle Skyline, Puget Sound, Lake Washington, Lake Sammamish, small lake / river / creek, and other
condition
- How good the overall condition of the house is. Related to maintenance of house.- Rated 1-5 from poor to very good
grade
- Overall grade of the house. Related to the construction and design of the house.- Rated 1-13 from poor to excellent
sqft_above
- Square footage of house apart from basementsqft_basement
- Square footage of the basementyr_built
- Year when house was builtyr_renovated
- Year when house was renovatedzipcode
- ZIP Code used by the United States Postal Servicelat
- Latitude coordinatelong
- Longitude coordinatesqft_living15
- The square footage of interior housing living space for the nearest 15 neighborssqft_lot15
- The square footage of the land lots of the nearest 15 neighbors
Conditions were rated as: "Poor": 0, "Fair": 1, "Average": 2, "Good": 3, "Very Good": 4 Views were rated as: "NONE": 0, "FAIR": 1, "AVERAGE": 2, "GOOD": 3, "EXCELLENT": 4 The method employed was as follows: - The datasets were imported and the dataset preparation was done. - Missing were handled by dropping affected columns or filling with the mode of the data depending on the scenario - Duplicated data were dropped and only the first instance of the duplicates was kept - The cleaned data was used to analyze and answer the problem statement - The models were used to answer the problem statement - A recommendation was made
The price distribution was
It is seen that the price is skewed The correlation between different features was
The features highlighted are
The model distribution of variables was
The homoscedasticity check was
It was seen that as grade and view were seen to affect the prices greatly
- The overall model explains about 65.4% of the variance in the prices.
- The model is off by about $141,119 in price
- All coefficients are statistically significant
- Compared to grade 3, grades 4,5,6 and 7 have a price decrease
- Compared to grade 3, grades 9,11,12 and 13 have a price increase with grade 13 having the most increase
- Compared to view 0, views 1,3 and 4 have a price increase with view 4 having the most
- Houses with a waterfront are set to increase the price by $517,800
- The types of view, compared to having no views would be having a house having a view quality of 1 which is considered fair as it would increase the price by $122,500 and should one want the best view at wuality of 4, they would see an increase in price by $286,500
- When it comes to building materials, one should go with the rating that is more than average to increase the sale as excellent grade increases price by $1,923,000
The model fits about 65% of the data and additional tests to check for normality and scaling of features should be considered. Also, for future research, deployment of other models besides the regression would be best and the data needs to be analysed while zoning in on specific zipcodes so as to get more accurate and fine tuned results
See the full analysis in the Jupyter Notebook or review this presentation.
For additional info, contact Medrine Waeni at [email protected]
├── data <- Both sourced externally
├── images <- Both sourced externally and generated from code
├── README.md <- The top-level README for reviewers of this project
├── King County Price Analysis.pdf <- PDF version of the Tableau work up
├── house_analysis.ipynb <- Narrative documentation of analysis in Jupyter notebook
├── notebook.pdf <- PDF version of Jupyter notebook
├── presentation.pdf <- PDF version of project presentation