Course materials for the Data Science: Principles and Practice course at the University of Cambridge.
Course homepage: https://www.cl.cam.ac.uk/teaching/2021/DataSciII/
To run the notebooks on your machine, check that Python 3 is installed. In addition, you will need the following libraries (the notebooks were tested with the versions indicated in brackets; for older versions, consider running the notebooks from https://github.com/ekochmar/cl-datasci-pnp):
Pandas (v 1.0.1): for easy data uploading and manipulation. Check installation instructions at https://pandas.pydata.org/pandas-docs/stable/getting_started/install.html
Matplotlib (v 3.1.3): for visualisations. Check installation instructions at https://matplotlib.org/users/installing.html
NumPy (v 1.18.1) and SciPy (v 1.4.1): for scientific programming. Check installation instructions at https://www.scipy.org/install.html
Scikit-learn (v 0.22.1): for machine learning algorithms. Check installation instructions at http://scikit-learn.org/stable/install.html
TensorFlow (v 2; I'm using v 2.2.0): for deep learning algorithms. Check installation instructions at https://www.tensorflow.org/install
Alternatively, a number of these libraries can be installed in one go through the Anaconda distribution.
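
Whichever installation route you choose, it can help to compare what is installed against the versions listed above before opening the notebooks. The following is a minimal sketch, not part of the course materials: it assumes Python 3.8 or later (for `importlib.metadata`), and the names `TESTED_VERSIONS`, `installed_versions` and `report` are made up for this example. It also assumes each library was installed under its usual distribution name; a variant such as a CPU-only TensorFlow build may be registered under a different name and would show up as not installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions the notebooks were tested with, copied from the list above.
# The keys are the usual distribution names (an assumption of this sketch).
TESTED_VERSIONS = {
    "pandas": "1.0.1",
    "matplotlib": "3.1.3",
    "numpy": "1.18.1",
    "scipy": "1.4.1",
    "scikit-learn": "0.22.1",
    "tensorflow": "2.2.0",
}


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def report(packages=TESTED_VERSIONS):
    """Print the tested and installed version of each package side by side."""
    for name, installed in installed_versions(packages).items():
        status = installed if installed is not None else "not installed"
        print(f"{name:<14} tested with {packages[name]:<8} found: {status}")


if __name__ == "__main__":
    report()
```

The script only reads package metadata, so it does not import the libraries themselves and runs quickly even when TensorFlow is present. A mismatch with the tested version is not necessarily a problem; it is just a hint about where to look if a notebook fails.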