update all notebooks
jakevdp committed Mar 29, 2016
1 parent 288b26c commit 786424b
Showing 9 changed files with 6,093 additions and 5,722 deletions.
383 changes: 180 additions & 203 deletions notebooks/01-Preliminaries.ipynb
@@ -1,206 +1,183 @@
{
"metadata": {
"name": "",
"signature": "sha256:e15002059f80bd12a6b29c25b2179198b517b5b98da80bcfa018729797f25aea"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<small><i>This notebook was put together by [Jake Vanderplas](http://www.vanderplas.com). Source and license info is on [GitHub](https://github.com/jakevdp/sklearn_tutorial/).</i></small>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# An Introduction to scikit-learn: Machine Learning in Python"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Goals of this Tutorial"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- **Introduce the basics of Machine Learning**, and some skills useful in practice.\n",
"- **Introduce the syntax of scikit-learn**, so that you can make use of the rich toolset available."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Schedule:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Preliminaries: Setup & introduction** (15 min)\n",
"* Making sure your computer is set up\n",
"\n",
"**Basic Principles of Machine Learning and the Scikit-learn Interface** (45 min)\n",
"* What is Machine Learning?\n",
"* Machine learning data layout\n",
"* Supervised Learning\n",
" - Classification\n",
" - Regression\n",
" - Measuring performance\n",
"* Unsupervised Learning\n",
" - Clustering\n",
" - Dimensionality Reduction\n",
" - Density Estimation\n",
"* Evaluation of Learning Models\n",
"* Choosing the right algorithm for your dataset\n",
"\n",
"**Supervised learning in-depth** (1 hr)\n",
"* Support Vector Machines\n",
"* Decision Trees and Random Forests\n",
"\n",
"**Unsupervised learning in-depth** (1 hr)\n",
"* Principal Component Analysis\n",
"* K-means Clustering\n",
"* Gaussian Mixture Models\n",
"\n",
"**Model Validation** (1 hr)\n",
"* Validation and Cross-validation"
]
},
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<small><i>This notebook was put together by [Jake Vanderplas](http://www.vanderplas.com). Source and license info is on [GitHub](https://github.com/jakevdp/sklearn_tutorial/).</i></small>"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"An Introduction to scikit-learn: Machine Learning in Python"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Goals of this Tutorial"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- **Introduce the basics of Machine Learning**, and some skills useful in practice.\n",
"- **Introduce the syntax of scikit-learn**, so that you can make use of the rich toolset available."
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Schedule:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**10:00 - 10:15** Preliminaries: Setup & introduction\n",
"* Making sure your computer is set-up\n",
"\n",
"**10:15 - 11:00** Basic Principles of Machine Learning and the Scikit-learn Interface\n",
"* What is Machine Learning?\n",
"* Machine learning data layout\n",
"* Supervised Learning\n",
" - Classification\n",
" - Regression\n",
" - Measuring performance\n",
"* Unsupervised Learning\n",
" - Clustering\n",
" - Dimensionality Reduction\n",
" - Density Estimation\n",
"* Evaluation of Learning Models\n",
"* Choosing the right algorithm for your dataset\n",
"\n",
"**11:00 - 12:00** Supervised learning in-depth\n",
"* Support Vector Machines\n",
"* Decision Trees and Random Forests\n",
"\n",
"*The tutorial repository contains additional material which we will not cover here. My hope is that you will find it useful to read-through on your own if you want to go deeper!*"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Preliminaries"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This tutorial requires the following packages:\n",
"\n",
"- Python version 2.6-2.7 or 3.3-3.4\n",
"- `numpy` version 1.5 or later: http://www.numpy.org/\n",
"- `scipy` version 0.10 or later: http://www.scipy.org/\n",
"- `matplotlib` version 1.3 or later: http://matplotlib.org/\n",
"- `scikit-learn` version 0.14 or later: http://scikit-learn.org\n",
"- `ipython` version 2.0 or later, with notebook support: http://ipython.org\n",
"- `seaborn`: version 0.5 or later, used mainly for plot styling\n",
"\n",
"The easiest way to get these is to use the [conda](http://store.continuum.io/) environment manager.\n",
"I suggest downloading and installing [miniconda](http://conda.pydata.org/miniconda.html).\n",
"\n",
"The following command will install all required packages:\n",
"```\n",
"$ conda install numpy scipy matplotlib scikit-learn ipython-notebook\n",
"```\n",
"\n",
"Alternatively, you can download and install the (very large) Anaconda software distribution, found at https://store.continuum.io/."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Checking your installation\n",
"\n",
"You can run the following code to check the versions of the packages on your system:\n",
"\n",
"(in IPython notebook, press `shift` and `return` together to execute the contents of a cell)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from __future__ import print_function\n",
"\n",
"import IPython\n",
"print('IPython:', IPython.__version__)\n",
"\n",
"import numpy\n",
"print('numpy:', numpy.__version__)\n",
"\n",
"import scipy\n",
"print('scipy:', scipy.__version__)\n",
"\n",
"import matplotlib\n",
"print('matplotlib:', matplotlib.__version__)\n",
"\n",
"import sklearn\n",
"print('scikit-learn:', sklearn.__version__)\n",
"\n",
"import seaborn\n",
"print('seaborn', seaborn.__version__)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"IPython: 2.4.1\n",
"numpy:"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
" 1.9.2\n",
"scipy: 0.15.1\n",
"matplotlib: 1.4.3\n",
"scikit-learn:"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
" 0.15.2\n",
"seaborn"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
" 0.5.1\n"
]
}
],
"prompt_number": 1
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Useful Resources"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- **scikit-learn:** http://scikit-learn.org (see especially the narrative documentation)\n",
"- **matplotlib:** http://matplotlib.org (see especially the gallery section)\n",
"- **IPython:** http://ipython.org (also check out http://nbviewer.ipython.org)"
]
}
],
"metadata": {}
"cell_type": "markdown",
"metadata": {},
"source": [
"## Preliminaries"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This tutorial requires the following packages:\n",
"\n",
"- Python version 2.7 or 3.4+\n",
"- `numpy` version 1.8 or later: http://www.numpy.org/\n",
"- `scipy` version 0.15 or later: http://www.scipy.org/\n",
"- `matplotlib` version 1.3 or later: http://matplotlib.org/\n",
"- `scikit-learn` version 0.15 or later: http://scikit-learn.org\n",
"- `ipython`/`jupyter` version 3.0 or later, with notebook support: http://ipython.org\n",
"- `seaborn`: version 0.5 or later, used mainly for plot styling\n",
"\n",
"The easiest way to get these is to use the [conda](http://store.continuum.io/) environment manager.\n",
"I suggest downloading and installing [miniconda](http://conda.pydata.org/miniconda.html).\n",
"\n",
"The following command will install all required packages:\n",
"```\n",
"$ conda install numpy scipy matplotlib scikit-learn ipython-notebook\n",
"```\n",
"\n",
"Alternatively, you can download and install the (very large) Anaconda software distribution, found at https://store.continuum.io/."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Checking your installation\n",
"\n",
"You can run the following code to check the versions of the packages on your system:\n",
"\n",
"(in IPython notebook, press `shift` and `return` together to execute the contents of a cell)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from __future__ import print_function\n",
"\n",
"import IPython\n",
"print('IPython:', IPython.__version__)\n",
"\n",
"import numpy\n",
"print('numpy:', numpy.__version__)\n",
"\n",
"import scipy\n",
"print('scipy:', scipy.__version__)\n",
"\n",
"import matplotlib\n",
"print('matplotlib:', matplotlib.__version__)\n",
"\n",
"import sklearn\n",
"print('scikit-learn:', sklearn.__version__)\n",
"\n",
"import seaborn\n",
"print('seaborn:', seaborn.__version__)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Useful Resources"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- **scikit-learn:** http://scikit-learn.org (see especially the narrative documentation)\n",
"- **matplotlib:** http://matplotlib.org (see especially the gallery section)\n",
"- **IPython:** http://ipython.org (also check out http://nbviewer.ipython.org)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.1"
}
]
}
},
"nbformat": 4,
"nbformat_minor": 0
}
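
The version-check cell in the notebook above only prints what is installed; it does not compare the result against the minimum versions listed under "Preliminaries". A minimal sketch of such a comparison follows. The helper names (`version_tuple`, `meets_minimum`, `check_environment`) and the `MINIMUMS` table are hypothetical and not part of the commit; the version numbers are copied from the updated requirements list, and the parsing is deliberately simple (it keeps only the leading digits of each dotted component).

```python
import importlib
import re

# Minimum versions copied from the updated requirements list in the notebook.
# Keys are import names, so scikit-learn appears as "sklearn".
MINIMUMS = {
    "numpy": "1.8",
    "scipy": "0.15",
    "matplotlib": "1.3",
    "sklearn": "0.15",
    "IPython": "3.0",
    "seaborn": "0.5",
}


def version_tuple(version, width=3):
    """Turn '1.9.2' or '0.15.0rc1' into a comparable tuple such as (1, 9, 2)."""
    parts = []
    for piece in version.split(".")[:width]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad with zeros so that '0.15' and '0.15.0' compare equal.
    return tuple(parts + [0] * (width - len(parts)))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment(minimums=MINIMUMS):
    """Return {import name: (installed version or None, ok flag)}."""
    report = {}
    for name, minimum in sorted(minimums.items()):
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is missing entirely.
            report[name] = (None, False)
            continue
        installed = getattr(module, "__version__", "0")
        report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in sorted(check_environment().items()):
        status = "OK" if ok else "needs attention"
        print("{0}: {1} ({2})".format(name, installed or "not installed", status))
```

Run as a script, this prints one line per package with its installed version and whether it satisfies the listed minimum. Tuple comparison is used instead of string comparison so that, for example, `1.10` is correctly treated as newer than `1.8`.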