Creating a Course

Ryan Holbrook edited this page Oct 14, 2020 · 15 revisions

TODO Automate this as much as possible.

Let's say we're going to create a new course on Kaggle Learn called Data Science.

Notebooks

  1. On the command line, navigate to the learntools/notebooks/ directory.
  2. Create a new branch off master with a name like ds-course. First check that there isn't already a branch with that name.
  3. Decide on a "track name" like data_science. This will be the name of the directory where your course files will live. Check that there isn't already a directory with that name.
  4. This directory should contain a Bash script called new_track.sh. Run ./new_track.sh data_science.
  5. Stage the new files: git add data_science.
  6. Commit the changes: git commit -m "Create track ds-course."
  7. Create a pull request on GitHub named [Data Science] New course.
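
Steps 2 and 3 both ask you to check that a name is free. A minimal sketch of those checks, assuming the remote is named origin and that you run it from learntools/notebooks/. The function names are hypothetical helpers, not part of learntools:

```shell
#!/bin/bash
# Hypothetical helpers (not part of learntools) for the checks in steps 2 and 3.

# Succeeds if a branch with this name exists locally or on the remote "origin"
# (assumed remote name; remote-tracking refs are only as fresh as your last fetch).
branch_exists() {
    git show-ref --verify --quiet "refs/heads/$1" ||
        git show-ref --verify --quiet "refs/remotes/origin/$1"
}

# Succeeds if a directory for this track already exists in the current directory.
track_exists() {
    [ -d "$1" ]
}
```

For example, running branch_exists ds-course || git checkout -b ds-course master creates the branch only when the name is free.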

Checking Code

  1. Navigate to the learntools root directory learntools/ (the directory containing setup.py).
  2. Install an editable version of learntools: pip install --editable . (note the period).
  3. Navigate to learntools/learntools.
  4. Create a directory for your course: mkdir data_science.
  5. Create an initialization file: touch data_science/__init__.py.
  6. Commit the changes.
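
Steps 3 through 5 can be sketched as one small function. This is only an illustration; create_track_package is a hypothetical name, and the editable install from step 2 is left as a manual step:

```shell
#!/bin/bash
# Hypothetical helper (not part of learntools): create the package directory
# for a track, together with its __init__.py.
#   $1 = path to the learntools root (the directory containing setup.py)
#   $2 = track name, e.g. data_science
create_track_package() {
    local root="$1" track="$2"
    local pkg="$root/learntools/$track"
    mkdir -p "$pkg" && touch "$pkg/__init__.py"
}
```

Called from the root directory as create_track_package . data_science, it creates learntools/data_science/__init__.py. With the editable install in place, python -c "import learntools.data_science" should then succeed.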

Datasets

Create a folder to contain local copies of the course data: mkdir learntools/notebooks/input. These copies are only for your own use while developing and won't be committed to the repository (the input directory is listed in notebooks/.gitignore).
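
To confirm that git really ignores the data folder before you fill it, something like the following may help. It is a sketch; path_is_ignored is a hypothetical helper built on git check-ignore:

```shell
#!/bin/bash
# Hypothetical helper (not part of learntools): succeeds if git ignores the
# given path. Run it from inside the repository.
path_is_ignored() {
    git check-ignore -q "$1"
}
```

From learntools/notebooks/, path_is_ignored input should succeed once the directory exists. If it fails, check notebooks/.gitignore before adding data.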

Jenkins

Add the track name data_science to both TRACKS and TESTABLE_NOTEBOOK_TRACKS in learntools/notebooks/test.sh.

Create a new file setup_data.sh in learntools/notebooks/data_science/:

#!/bin/bash
# Download the datasets used in the course notebooks to the correct relative paths (../input/...)

mkdir -p input

# Space-separated list of Kaggle dataset slugs (owner/name)
DATASETS="ryanholbrook/ds-course-data ryanholbrook/some-other-data"

for slug in $DATASETS
do
    # Use the dataset name (the part after the slash) as the directory name
    name=$(echo "$slug" | cut -d '/' -f 2)
    dest="input/$name"
    mkdir -p "$dest"
    kaggle d download -p "$dest" --unzip "$slug"
done

You'll need to keep the list in DATASETS in sync with the datasets your course uses (that is, those defined in track_meta.py).
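
To see where each dataset will end up without calling the kaggle CLI, the naming rule from setup_data.sh can be sketched on its own. dataset_dest is a hypothetical helper; it uses shell parameter expansion instead of cut, which gives the same result for owner/name slugs:

```shell
#!/bin/bash
# Hypothetical helper: print the local directory a dataset slug is unpacked
# into, following the naming used in setup_data.sh.
dataset_dest() {
    local slug="$1"
    # Strip everything up to and including the last "/" (the owner part).
    echo "input/${slug##*/}"
}

for slug in ryanholbrook/ds-course-data ryanholbrook/some-other-data
do
    dataset_dest "$slug"
done
# prints:
# input/ds-course-data
# input/some-other-data
```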