forked from jakevdp/sklearn_tutorial
Showing 9 changed files with 6,093 additions and 5,722 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,206 +1,183 @@ | ||
{ | ||
"metadata": { | ||
"name": "", | ||
"signature": "sha256:e15002059f80bd12a6b29c25b2179198b517b5b98da80bcfa018729797f25aea" | ||
}, | ||
"nbformat": 3, | ||
"nbformat_minor": 0, | ||
"worksheets": [ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"<small><i>This notebook was put together by [Jake Vanderplas](http://www.vanderplas.com). Source and license info is on [GitHub](https://github.com/jakevdp/sklearn_tutorial/).</i></small>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# An Introduction to scikit-learn: Machine Learning in Python" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Goals of this Tutorial" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"- **Introduce the basics of Machine Learning**, and some skills useful in practice.\n", | ||
"- **Introduce the syntax of scikit-learn**, so that you can make use of the rich toolset available." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Schedule:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"**Preliminaries: Setup & introduction** (15 min)\n", | ||
"* Making sure your computer is set up\n", | ||
"\n", | ||
"**Basic Principles of Machine Learning and the Scikit-learn Interface** (45 min)\n", | ||
"* What is Machine Learning?\n", | ||
"* Machine learning data layout\n", | ||
"* Supervised Learning\n", | ||
" - Classification\n", | ||
" - Regression\n", | ||
" - Measuring performance\n", | ||
"* Unsupervised Learning\n", | ||
" - Clustering\n", | ||
" - Dimensionality Reduction\n", | ||
" - Density Estimation\n", | ||
"* Evaluation of Learning Models\n", | ||
"* Choosing the right algorithm for your dataset\n", | ||
"\n", | ||
"**Supervised learning in-depth** (1 hr)\n", | ||
"* Support Vector Machines\n", | ||
"* Decision Trees and Random Forests\n", | ||
"\n", | ||
"**Unsupervised learning in-depth** (1 hr)\n", | ||
"* Principal Component Analysis\n", | ||
"* K-means Clustering\n", | ||
"* Gaussian Mixture Models\n", | ||
"\n", | ||
"**Model Validation** (1 hr)\n", | ||
"* Validation and Cross-validation" | ||
] | ||
}, | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"<small><i>This notebook was put together by [Jake Vanderplas](http://www.vanderplas.com). Source and license info is on [GitHub](https://github.com/jakevdp/sklearn_tutorial/).</i></small>" | ||
] | ||
}, | ||
{ | ||
"cell_type": "heading", | ||
"level": 1, | ||
"metadata": {}, | ||
"source": [ | ||
"An Introduction to scikit-learn: Machine Learning in Python" | ||
] | ||
}, | ||
{ | ||
"cell_type": "heading", | ||
"level": 2, | ||
"metadata": {}, | ||
"source": [ | ||
"Goals of this Tutorial" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"- **Introduce the basics of Machine Learning**, and some skills useful in practice.\n", | ||
"- **Introduce the syntax of scikit-learn**, so that you can make use of the rich toolset available." | ||
] | ||
}, | ||
{ | ||
"cell_type": "heading", | ||
"level": 2, | ||
"metadata": {}, | ||
"source": [ | ||
"Schedule:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"**10:00 - 10:15** Preliminaries: Setup & introduction\n", | ||
"* Making sure your computer is set up\n", | ||
"\n", | ||
"**10:15 - 11:00** Basic Principles of Machine Learning and the Scikit-learn Interface\n", | ||
"* What is Machine Learning?\n", | ||
"* Machine learning data layout\n", | ||
"* Supervised Learning\n", | ||
" - Classification\n", | ||
" - Regression\n", | ||
" - Measuring performance\n", | ||
"* Unsupervised Learning\n", | ||
" - Clustering\n", | ||
" - Dimensionality Reduction\n", | ||
" - Density Estimation\n", | ||
"* Evaluation of Learning Models\n", | ||
"* Choosing the right algorithm for your dataset\n", | ||
"\n", | ||
"**11:00 - 12:00** Supervised learning in-depth\n", | ||
"* Support Vector Machines\n", | ||
"* Decision Trees and Random Forests\n", | ||
"\n", | ||
"*The tutorial repository contains additional material that we will not cover here. I hope you will find it useful to read through on your own if you want to go deeper!*" | ||
] | ||
}, | ||
{ | ||
"cell_type": "heading", | ||
"level": 2, | ||
"metadata": {}, | ||
"source": [ | ||
"Preliminaries" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"This tutorial requires the following packages:\n", | ||
"\n", | ||
"- Python version 2.6-2.7 or 3.3-3.4\n", | ||
"- `numpy` version 1.5 or later: http://www.numpy.org/\n", | ||
"- `scipy` version 0.10 or later: http://www.scipy.org/\n", | ||
"- `matplotlib` version 1.3 or later: http://matplotlib.org/\n", | ||
"- `scikit-learn` version 0.14 or later: http://scikit-learn.org\n", | ||
"- `ipython` version 2.0 or later, with notebook support: http://ipython.org\n", | ||
"- `seaborn`: version 0.5 or later, used mainly for plot styling\n", | ||
"\n", | ||
"The easiest way to get these is to use the [conda](http://store.continuum.io/) environment manager.\n", | ||
"I suggest downloading and installing [miniconda](http://conda.pydata.org/miniconda.html).\n", | ||
"\n", | ||
"The following command will install all required packages:\n", | ||
"```\n", | ||
"$ conda install numpy scipy matplotlib scikit-learn ipython-notebook\n", | ||
"```\n", | ||
"\n", | ||
"Alternatively, you can download and install the (very large) Anaconda software distribution, found at https://store.continuum.io/." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### Checking your installation\n", | ||
"\n", | ||
"You can run the following code to check the versions of the packages on your system:\n", | ||
"\n", | ||
"(in IPython notebook, press `shift` and `return` together to execute the contents of a cell)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"collapsed": false, | ||
"input": [ | ||
"from __future__ import print_function\n", | ||
"\n", | ||
"import IPython\n", | ||
"print('IPython:', IPython.__version__)\n", | ||
"\n", | ||
"import numpy\n", | ||
"print('numpy:', numpy.__version__)\n", | ||
"\n", | ||
"import scipy\n", | ||
"print('scipy:', scipy.__version__)\n", | ||
"\n", | ||
"import matplotlib\n", | ||
"print('matplotlib:', matplotlib.__version__)\n", | ||
"\n", | ||
"import sklearn\n", | ||
"print('scikit-learn:', sklearn.__version__)\n", | ||
"\n", | ||
"import seaborn\n", | ||
"print('seaborn', seaborn.__version__)" | ||
], | ||
"language": "python", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"output_type": "stream", | ||
"stream": "stdout", | ||
"text": [ | ||
"IPython: 2.4.1\n", | ||
"numpy:" | ||
] | ||
}, | ||
{ | ||
"output_type": "stream", | ||
"stream": "stdout", | ||
"text": [ | ||
" 1.9.2\n", | ||
"scipy: 0.15.1\n", | ||
"matplotlib: 1.4.3\n", | ||
"scikit-learn:" | ||
] | ||
}, | ||
{ | ||
"output_type": "stream", | ||
"stream": "stdout", | ||
"text": [ | ||
" 0.15.2\n", | ||
"seaborn" | ||
] | ||
}, | ||
{ | ||
"output_type": "stream", | ||
"stream": "stdout", | ||
"text": [ | ||
" 0.5.1\n" | ||
] | ||
} | ||
], | ||
"prompt_number": 1 | ||
}, | ||
{ | ||
"cell_type": "heading", | ||
"level": 2, | ||
"metadata": {}, | ||
"source": [ | ||
"Useful Resources" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"- **scikit-learn:** http://scikit-learn.org (see especially the narrative documentation)\n", | ||
"- **matplotlib:** http://matplotlib.org (see especially the gallery section)\n", | ||
"- **IPython:** http://ipython.org (also check out http://nbviewer.ipython.org)" | ||
] | ||
} | ||
], | ||
"metadata": {} | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Preliminaries" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"This tutorial requires the following packages:\n", | ||
"\n", | ||
"- Python version 2.7 or 3.4+\n", | ||
"- `numpy` version 1.8 or later: http://www.numpy.org/\n", | ||
"- `scipy` version 0.15 or later: http://www.scipy.org/\n", | ||
"- `matplotlib` version 1.3 or later: http://matplotlib.org/\n", | ||
"- `scikit-learn` version 0.15 or later: http://scikit-learn.org\n", | ||
"- `ipython`/`jupyter` version 3.0 or later, with notebook support: http://ipython.org\n", | ||
"- `seaborn`: version 0.5 or later, used mainly for plot styling\n", | ||
"\n", | ||
"The easiest way to get these is to use the [conda](http://store.continuum.io/) environment manager.\n", | ||
"I suggest downloading and installing [miniconda](http://conda.pydata.org/miniconda.html).\n", | ||
"\n", | ||
"The following command will install all required packages:\n", | ||
"```\n", | ||
"$ conda install numpy scipy matplotlib scikit-learn ipython-notebook seaborn\n", | ||
"```\n", | ||
"\n", | ||
"Alternatively, you can download and install the (very large) Anaconda software distribution, found at https://store.continuum.io/." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### Checking your installation\n", | ||
"\n", | ||
"You can run the following code to check the versions of the packages on your system:\n", | ||
"\n", | ||
"(in IPython notebook, press `shift` and `return` together to execute the contents of a cell)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from __future__ import print_function\n", | ||
"\n", | ||
"import IPython\n", | ||
"print('IPython:', IPython.__version__)\n", | ||
"\n", | ||
"import numpy\n", | ||
"print('numpy:', numpy.__version__)\n", | ||
"\n", | ||
"import scipy\n", | ||
"print('scipy:', scipy.__version__)\n", | ||
"\n", | ||
"import matplotlib\n", | ||
"print('matplotlib:', matplotlib.__version__)\n", | ||
"\n", | ||
"import sklearn\n", | ||
"print('scikit-learn:', sklearn.__version__)\n", | ||
"\n", | ||
"import seaborn\n", | ||
"print('seaborn:', seaborn.__version__)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Useful Resources" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"- **scikit-learn:** http://scikit-learn.org (see especially the narrative documentation)\n", | ||
"- **matplotlib:** http://matplotlib.org (see especially the gallery section)\n", | ||
"- **IPython:** http://ipython.org (also check out http://nbviewer.ipython.org)" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.5.1" | ||
} | ||
] | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 0 | ||
} |
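
The "Checking your installation" cell in the notebook above only prints version strings. A minimal sketch of how those strings could be compared against the minimum versions from the updated requirements list is shown here. The helper names (`MINIMUM_VERSIONS`, `parse_version`, `meets_minimum`, `check_versions`) are hypothetical and not part of the notebook, and the version parsing is deliberately simple: it keeps only the leading integer components, so a pre-release such as `1.0rc1` is treated the same as `1.0`.

```python
import importlib
import re

# Minimum versions copied from the requirements list in the updated notebook.
# The dictionary itself is an illustrative assumption, not part of the tutorial.
MINIMUM_VERSIONS = {
    "IPython": "3.0",
    "numpy": "1.8",
    "scipy": "0.15",
    "matplotlib": "1.3",
    "sklearn": "0.15",
    "seaborn": "0.5",
}


def parse_version(text):
    """Convert a string such as '0.15.2' into a tuple of integers, (0, 15, 2).

    Parsing stops at the first dot-separated piece that does not start with
    a digit, and any suffix such as 'rc1' is ignored.
    """
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '1' compares equal to '1.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_versions(minimums=MINIMUM_VERSIONS):
    """Return a dict mapping each module name to 'ok', 'too old', 'unknown' or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        installed = getattr(module, "__version__", None)
        if installed is None:
            report[name] = "unknown"
        elif meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = "too old"
    return report
```

Calling `check_versions()` with no arguments would report on the six packages listed above; passing a different dictionary checks other modules. A production check would more likely rely on a dedicated version-parsing library, which this sketch avoids to stay dependency-free.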