This project analyzes a vehicle dataset from CarDekho. It computes key statistics, visualizes trends in the data, and examines which factors drive vehicle prices and depreciation. The project uses Python, with Pandas for data processing and Seaborn and Matplotlib for visualization.
The dataset includes information about various vehicles, such as:
- Manufacturing year
- Vehicle name
- Price
- Original price
- Distance driven
- Fuel type
- Seller type
- Transmission type
- Number of owners
Data Analysis:
- Range of manufacturing years.
- Lowest and highest selling prices.
- Total number of records.
- Number of missing records.
- Number of different vehicles.
- Most sold vehicle.
- Number of CNG vehicles.
- Vehicles sold by individuals.
- Automatic transmission vehicles.
- Single-owner vehicles.
- Most and least depreciated vehicles.
- Brands less affected by cost depreciation.
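
As a rough illustration of how the statistics above could be computed with Pandas, here is a minimal sketch. The column names (`name`, `year`, `selling_price`, `original_price`, and so on) and the five sample rows are assumptions made up for this example; they are not taken from the project's actual CSV. Depreciation is defined here as the fraction of the original price lost, which is one reasonable choice and may differ from the definition used in the project.

```python
import pandas as pd

# Made-up sample rows for illustration only; column names are assumed.
df = pd.DataFrame({
    "name": ["ritz", "sx4", "ciaz", "ritz", "city"],
    "year": [2014, 2013, 2017, 2011, 2015],
    "selling_price": [3.0, 4.0, 8.0, 3.0, 6.0],
    "original_price": [5.0, 10.0, 10.0, 4.0, 8.0],
    "fuel_type": ["Petrol", "Diesel", "Petrol", "CNG", "Petrol"],
    "seller_type": ["Dealer", "Dealer", "Individual", "Individual", "Dealer"],
    "transmission": ["Manual", "Manual", "Automatic", "Manual", "Automatic"],
    "owners": [1, 1, 1, 2, 1],  # assumed: total number of owners so far
})

summary = {
    "year_range": (int(df["year"].min()), int(df["year"].max())),
    "price_range": (float(df["selling_price"].min()),
                    float(df["selling_price"].max())),
    "records": len(df),
    # Rows with at least one missing value
    "missing_records": int(df.isna().any(axis=1).sum()),
    "distinct_vehicles": int(df["name"].nunique()),
    "most_sold": df["name"].value_counts().idxmax(),
    "cng_count": int((df["fuel_type"] == "CNG").sum()),
    "individual_sales": int((df["seller_type"] == "Individual").sum()),
    "automatic_count": int((df["transmission"] == "Automatic").sum()),
    "single_owner_count": int((df["owners"] == 1).sum()),
}

# Depreciation as the fraction of the original price that was lost (assumed definition)
df["depreciation"] = (
    (df["original_price"] - df["selling_price"]) / df["original_price"]
)
most_depreciated = df.loc[df["depreciation"].idxmax(), "name"]
least_depreciated = df.loc[df["depreciation"].idxmin(), "name"]

for key, value in summary.items():
    print(f"{key}: {value}")
print("most depreciated:", most_depreciated)
print("least depreciated:", least_depreciated)
```

To compare brands rather than individual listings, the same `depreciation` column could be averaged per brand with `groupby`, provided a brand column is available or can be derived from the vehicle name.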
Visualizations:
- Correlation matrix to identify factors affecting cost depreciation.
- Scatter plots of selling price vs. age and distance driven.
- Boxplots and violin plots to visualize price distributions.
- Histograms of vehicle prices and ages.
- Pairplot of numeric features.
- Price comparison between individual and dealer sales.
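
A few of the plots listed above can be sketched with Seaborn and Matplotlib as follows. This is only an illustration under stated assumptions: the column names, the sample rows, the `REFERENCE_YEAR` used to derive vehicle age, and the output filename `price_overview.png` are all hypothetical and not taken from the project.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up sample rows for illustration only; column names are assumed.
df = pd.DataFrame({
    "year": [2014, 2013, 2017, 2011, 2015, 2016],
    "selling_price": [3.0, 4.0, 8.0, 2.5, 6.0, 7.0],
    "distance_driven": [27000, 43000, 6900, 52000, 20000, 15000],
    "seller_type": ["Dealer", "Dealer", "Individual",
                    "Individual", "Dealer", "Individual"],
})

REFERENCE_YEAR = 2020  # assumed year used to convert manufacturing year to age
df["age"] = REFERENCE_YEAR - df["year"]

# Correlation matrix over the numeric features of interest
corr = df[["selling_price", "age", "distance_driven"]].corr()

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1, ax=axes[0])
axes[0].set_title("Correlation matrix")

sns.scatterplot(data=df, x="age", y="selling_price", ax=axes[1])
axes[1].set_title("Selling price vs. age")

sns.boxplot(data=df, x="seller_type", y="selling_price", ax=axes[2])
axes[2].set_title("Selling price by seller type")

fig.tight_layout()
fig.savefig("price_overview.png")
```

The remaining plots follow the same pattern: `sns.violinplot`, `sns.histplot`, and `sns.pairplot` accept the same DataFrame and column names.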
The analysis produces the statistics and visualizations listed above, which together show which factors affect vehicle prices and depreciation.
Contributions are welcome! Please feel free to submit a Pull Request or open an issue for any improvements or suggestions.
This project is licensed under the MIT License.
For any questions or inquiries, please contact [email protected].