initial commit of tutorial materials
jakevdp committed Sep 8, 2015
1 parent 3badfb6 commit 288b26c
Showing 24 changed files with 6,957 additions and 2 deletions.
7 changes: 7 additions & 0 deletions .gitignore
@@ -55,3 +55,10 @@ docs/_build/

# PyBuilder
target/

# IPython
.ipynb_checkpoints
notebooks/.ipynb_checkpoints

# Emacs
*~
58 changes: 56 additions & 2 deletions README.md
@@ -1,2 +1,56 @@
# sklearn_tutorial
Materials for my scikit-learn tutorial
# Scikit-learn Tutorial

*Jake VanderPlas*

- email: <[email protected]>
- twitter: [@jakevdp](https://twitter.com/jakevdp)
- github: [jakevdp](http://github.com/jakevdp)

This repository contains notebooks and other files associated with my
[Scikit-learn](http://scikit-learn.org) tutorial.

## Installation Notes
This tutorial requires the following packages:

- Python version 2.6-2.7 or 3.3+
- `numpy` version 1.5 or later: http://www.numpy.org/
- `scipy` version 0.10 or later: http://www.scipy.org/
- `matplotlib` version 1.3 or later: http://matplotlib.org/
- `scikit-learn` version 0.14 or later: http://scikit-learn.org
- `ipython` version 2.0 or later, with notebook support: http://ipython.org
- `seaborn` version 0.5 or later

The easiest way to get these is to use the [conda](https://store.continuum.io/) environment manager.
I suggest downloading and installing [miniconda](http://conda.pydata.org/miniconda.html).

Once this is installed, the following command will install all required packages in your Python environment:
```
$ conda install numpy scipy matplotlib scikit-learn ipython-notebook seaborn
```

Alternatively, you can download and install the (very large) Anaconda software distribution, found at https://store.continuum.io/.
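
To confirm that an existing environment meets the minimums listed above, something like the following may help. It is a minimal sketch, assuming each package exposes a `__version__` attribute; the names `MINIMUMS`, `version_tuple`, and `check_requirements` are illustrative and not part of this repository.

```python
import importlib
import re

# Minimum versions copied from the requirements list above.
# Keys are import names, so scikit-learn appears as "sklearn".
MINIMUMS = {
    "numpy": "1.5",
    "scipy": "0.10",
    "matplotlib": "1.3",
    "sklearn": "0.14",
    "IPython": "2.0",
    "seaborn": "0.5",
}


def version_tuple(text):
    """Convert a version string into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def check_requirements(minimums=MINIMUMS):
    """Return a dict mapping each module name to a short status string."""
    report = {}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        # Assumption: the package stores its version in __version__.
        found = getattr(module, "__version__", "unknown")
        ok = version_tuple(found) >= version_tuple(minimum)
        verdict = "ok" if ok else "older than " + minimum
        report[name] = "{0} ({1})".format(found, verdict)
    return report
```

Calling `check_requirements()` and printing the result shows which packages are missing or too old. Versions are compared as integer tuples rather than strings so that `0.10` correctly ranks above `0.9`.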

## Downloading the Tutorial Materials
I would highly recommend using git, not only for this tutorial, but for the
general betterment of your life. Once git is installed, you can clone the
material in this tutorial by using the git address shown above:

git clone git://github.com/jakevdp/sklearn_tutorial.git

If you can't or don't want to install git, there is a link above to download
the contents of this repository as a zip file. I may make minor changes to
the repository in the days before the tutorial, however, so cloning the
repository is a much better option.


## Notebook Listing
You can [view the tutorial materials](http://nbviewer.ipython.org/github/jakevdp/sklearn_tutorial/blob/master/notebooks/Index.ipynb) using the excellent nbviewer service.

Note, however, that you cannot modify or run the contents within nbviewer.
To modify them, first download the tutorial repository, change to the notebooks directory, and run ``ipython notebook``.
You should see the list of notebooks on the IPython notebook launch page in your web browser.
For more information on the IPython notebook, see http://ipython.org/notebook.html

Note also that some of the code in these notebooks will not work outside the
directory structure of this tutorial, so it is important to clone the full
repository if possible.
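
If you are unsure whether you are inside a full clone, a small helper can look for the notebooks directory before you launch. This is only a sketch: it assumes the layout implied by the nbviewer link above (a `notebooks/Index.ipynb` file at the repository root), and `find_notebooks_dir` is a hypothetical name, not something shipped with the tutorial.

```python
import os


def find_notebooks_dir(start="."):
    """Walk upward from `start` until `notebooks/Index.ipynb` is found.

    Returns the absolute path of the `notebooks` directory, or None
    if no enclosing directory contains one.
    """
    current = os.path.abspath(start)
    while True:
        candidate = os.path.join(current, "notebooks")
        if os.path.isfile(os.path.join(candidate, "Index.ipynb")):
            return candidate
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent
```

If `find_notebooks_dir()` returns `None`, the current directory is probably not part of a full clone, and the relative paths used by some notebooks may not resolve.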
206 changes: 206 additions & 0 deletions notebooks/01-Preliminaries.ipynb
@@ -0,0 +1,206 @@
{
"metadata": {
"name": "",
"signature": "sha256:e15002059f80bd12a6b29c25b2179198b517b5b98da80bcfa018729797f25aea"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<small><i>This notebook was put together by [Jake Vanderplas](http://www.vanderplas.com). Source and license info is on [GitHub](https://github.com/jakevdp/sklearn_tutorial/).</i></small>"
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"An Introduction to scikit-learn: Machine Learning in Python"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Goals of this Tutorial"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- **Introduce the basics of Machine Learning**, and some skills useful in practice.\n",
"- **Introduce the syntax of scikit-learn**, so that you can make use of the rich toolset available."
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Schedule:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**10:00 - 10:15** Preliminaries: Setup & introduction\n",
"* Making sure your computer is set up\n",
"\n",
"**10:15 - 11:00** Basic Principles of Machine Learning and the Scikit-learn Interface\n",
"* What is Machine Learning?\n",
"* Machine learning data layout\n",
"* Supervised Learning\n",
" - Classification\n",
" - Regression\n",
" - Measuring performance\n",
"* Unsupervised Learning\n",
" - Clustering\n",
" - Dimensionality Reduction\n",
" - Density Estimation\n",
"* Evaluation of Learning Models\n",
"* Choosing the right algorithm for your dataset\n",
"\n",
"**11:00 - 12:00** Supervised learning in-depth\n",
"* Support Vector Machines\n",
"* Decision Trees and Random Forests\n",
"\n",
"*The tutorial repository contains additional material that we will not cover here. I hope you will find it useful to read through on your own if you want to go deeper!*"
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Preliminaries"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This tutorial requires the following packages:\n",
"\n",
"- Python version 2.6-2.7 or 3.3-3.4\n",
"- `numpy` version 1.5 or later: http://www.numpy.org/\n",
"- `scipy` version 0.10 or later: http://www.scipy.org/\n",
"- `matplotlib` version 1.3 or later: http://matplotlib.org/\n",
"- `scikit-learn` version 0.14 or later: http://scikit-learn.org\n",
"- `ipython` version 2.0 or later, with notebook support: http://ipython.org\n",
"- `seaborn`: version 0.5 or later, used mainly for plot styling\n",
"\n",
"The easiest way to get these is to use the [conda](http://store.continuum.io/) environment manager.\n",
"I suggest downloading and installing [miniconda](http://conda.pydata.org/miniconda.html).\n",
"\n",
"The following command will install all required packages:\n",
"```\n",
"$ conda install numpy scipy matplotlib scikit-learn ipython-notebook seaborn\n",
"```\n",
"\n",
"Alternatively, you can download and install the (very large) Anaconda software distribution, found at https://store.continuum.io/."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Checking your installation\n",
"\n",
"You can run the following code to check the versions of the packages on your system:\n",
"\n",
"(In the IPython notebook, press `shift` and `return` together to execute the contents of a cell.)"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from __future__ import print_function\n",
"\n",
"import IPython\n",
"print('IPython:', IPython.__version__)\n",
"\n",
"import numpy\n",
"print('numpy:', numpy.__version__)\n",
"\n",
"import scipy\n",
"print('scipy:', scipy.__version__)\n",
"\n",
"import matplotlib\n",
"print('matplotlib:', matplotlib.__version__)\n",
"\n",
"import sklearn\n",
"print('scikit-learn:', sklearn.__version__)\n",
"\n",
"import seaborn\n",
"print('seaborn', seaborn.__version__)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"IPython: 2.4.1\n",
"numpy:"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
" 1.9.2\n",
"scipy: 0.15.1\n",
"matplotlib: 1.4.3\n",
"scikit-learn:"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
" 0.15.2\n",
"seaborn"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
" 0.5.1\n"
]
}
],
"prompt_number": 1
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Useful Resources"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- **scikit-learn:** http://scikit-learn.org (see especially the narrative documentation)\n",
"- **matplotlib:** http://matplotlib.org (see especially the gallery section)\n",
"- **IPython:** http://ipython.org (also check out http://nbviewer.ipython.org)"
]
}
],
"metadata": {}
}
]
}
587 changes: 587 additions & 0 deletions notebooks/02.1-Machine-Learning-Intro.ipynb

Large diffs are not rendered by default.
