From ff0f20ca434f0dccb2143fa9ef87d1793227c697 Mon Sep 17 00:00:00 2001
From: Matthew Seal
Date: Wed, 1 Aug 2018 15:16:02 -0700
Subject: [PATCH] Added black linter rules

---
 .pre-commit-config.yaml |   5 +
 README.md               | 187 +++++++++++++++++++++++++++++++++++++
 README.rst              | 201 ----------------------------------------
 convert.sh              |   7 ++
 pyproject.toml          |  29 ++++++
 requirements-dev.txt    |   2 +
 6 files changed, 230 insertions(+), 201 deletions(-)
 create mode 100644 .pre-commit-config.yaml
 create mode 100644 README.md
 delete mode 100644 README.rst
 create mode 100755 convert.sh
 create mode 100644 pyproject.toml

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
new file mode 100644
index 00000000..087d0de6
--- /dev/null
+++ b/.pre-commit-config.yaml
@@ -0,0 +1,5 @@
+repos:
+- repo: https://github.com/ambv/black
+  rev: stable
+  hooks:
+  - id: black
diff --git a/README.md b/README.md
new file mode 100644
index 00000000..300cf606
--- /dev/null
+++ b/README.md
@@ -0,0 +1,187 @@
+[![Papermill](https://user-images.githubusercontent.com/836375/27929844-6bb34e62-6249-11e7-9a2a-00849a64940c.png)](https://github.com/nteract/papermill)
+=======================================================================================================================================================================
+
+[![image](https://travis-ci.org/nteract/papermill.svg?branch=master)](https://travis-ci.org/nteract/papermill)
+[![image](https://codecov.io/github/nteract/papermill/coverage.svg?branch=master)](https://codecov.io/github/nteract/papermill?branch=master)
+[![Documentation Status](https://readthedocs.org/projects/papermill/badge/?version=latest)](http://papermill.readthedocs.io/en/latest/?badge=latest)
+[![image](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/nteract/papermill/master?filepath=papermill%2Ftests%2Fnotebooks%2Fbinder.ipynb)
+[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)
+
+**Papermill** is a tool for parameterizing, executing, and analyzing
+Jupyter Notebooks.
+
+Papermill lets you:
+
+- **parametrize** notebooks
+- **execute** and **collect** metrics across the notebooks
+- **summarize collections** of notebooks
+
+This opens up new opportunities for how notebooks can be used. For
+example:
+
+- Perhaps you have a financial report that you wish to run with
+  different values on the first or last day of a month or at the
+  beginning or end of the year, **using parameters** makes this task
+  easier.
+- Do you want to run a notebook and depending on its results, choose a
+  particular notebook to run next? You can now programmatically
+  **execute a workflow** without having to copy and paste from
+  notebook to notebook manually.
+- Do you have plots and visualizations spread across 10 or more
+  notebooks? Now you can choose which plots to programmatically
+  display a **summary** **collection** in a notebook to share with
+  others.
+
+Installation
+------------
+
+From the commmand line:
+
+``` {.sourceCode .bash}
+pip install papermill
+```
+
+Installing In-Notebook bindings
+-------------------------------
+
+- [Python](PythonBinding) (included in this repo)
+- [R](https://github.com/nteract/papermillr) (**experimentally** available in the
+  **papermillr** project)
+
+Other language bindings welcome if someone would like to maintain parallel implementations!
+
+Usage
+-----
+
+### Parametrizing a Notebook
+
+To parametrize your notebook designate a cell with the tag ``parameters``.
+
+Papermill looks for the ``parameters`` cell and treats this cell as defaults for the parameters passed in at execution time. Papermill will add a new cell tagged with ``injected-parameters`` with input parameters in order to overwrite the values in ``parameters``. If no cell is tagged with ``parameters`` the injected cell will be inserted at the top of the notebook.
+
+Additionally, if you rerun notebooks through papermill and it will reuse the ``injected-parameters`` cell from the prior run. In this case papermill will replace the old ``injected-parameters`` cell with the new run's inputs.
+
+![image](docs/img/parameters.png)
+
+### Executing a Notebook
+
+The two ways to execute the notebook with parameters are: (1) through
+the Python API and (2) through the command line interface.
+
+#### Execute via the Python API
+
+``` {.sourceCode .python}
+import papermill as pm
+
+pm.execute_notebook(
+    'path/to/input.ipynb',
+    'path/to/output.ipynb',
+    parameters = dict(alpha=0.6, ratio=0.1)
+)
+```
+
+#### Execute via CLI
+
+Here's an example of a local notebook being executed and output to an
+Amazon S3 account:
+
+``` {.sourceCode .bash}
+$ papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1
+```
+
+Python In-notebook Bindings
+---------------------------
+
+### Recording Values to the Notebook
+
+Users can save values to the notebook document to be consumed by other
+notebooks.
+
+Recording values to be saved with the notebook.
+
+``` {.sourceCode .python}
+"""notebook.ipynb"""
+import papermill as pm
+
+pm.record("hello", "world")
+pm.record("number", 123)
+pm.record("some_list", [1, 3, 5])
+pm.record("some_dict", {"a": 1, "b": 2})
+```
+
+Users can recover those values as a Pandas dataframe via the
+`read_notebook` function.
+
+``` {.sourceCode .python}
+"""summary.ipynb"""
+import papermill as pm
+
+nb = pm.read_notebook('notebook.ipynb')
+nb.dataframe
+```
+
+![image](docs/img/nb_dataframe.png)
+
+### Displaying Plots and Images Saved by Other Notebooks
+
+Display a matplotlib histogram with the key name `matplotlib_hist`.
+
+``` {.sourceCode .python}
+"""notebook.ipynb"""
+import papermill as pm
+from ggplot import mpg
+import matplotlib.pyplot as plt
+
+# turn off interactive plotting to avoid double plotting
+plt.ioff()
+
+f = plt.figure()
+plt.hist('cty', bins=12, data=mpg)
+pm.display('matplotlib_hist', f)
+```
+
+![image](docs/img/matplotlib_hist.png)
+
+Read in that above notebook and display the plot saved at
+`matplotlib_hist`.
+
+``` {.sourceCode .python}
+"""summary.ipynb"""
+import papermill as pm
+
+nb = pm.read_notebook('notebook.ipynb')
+nb.display_output('matplotlib_hist')
+```
+
+![image](docs/img/matplotlib_hist.png)
+
+### Analyzing a Collection of Notebooks
+
+Papermill can read in a directory of notebooks and provides the
+`NotebookCollection` interface for operating on them.
+
+``` {.sourceCode .python}
+"""summary.ipynb"""
+import papermill as pm
+
+nbs = pm.read_notebooks('/path/to/results/')
+
+# Show named plot from 'notebook1.ipynb'
+# Accepts a key or list of keys to plot in order.
+nbs.display_output('train_1.ipynb', 'matplotlib_hist')
+```
+
+![image](docs/img/matplotlib_hist.png)
+
+``` {.sourceCode .python}
+# Dataframe for all notebooks in collection
+nbs.dataframe.head(10)
+```
+
+![image](docs/img/nbs_dataframe.png)
+
+Documentation
+-------------
+
+We host the [papermill documentation](http://papermill.readthedocs.io)
+on ReadTheDocs.
diff --git a/README.rst b/README.rst
deleted file mode 100644
index 58a11c6c..00000000
--- a/README.rst
+++ /dev/null
@@ -1,201 +0,0 @@
-|Logo|
-======
-
-.. image:: https://travis-ci.org/nteract/papermill.svg?branch=master
-   :target: https://travis-ci.org/nteract/papermill
-.. image:: https://codecov.io/github/nteract/papermill/coverage.svg?branch=master
-   :target: https://codecov.io/github/nteract/papermill?branch=master
-.. image:: https://readthedocs.org/projects/papermill/badge/?version=latest
-   :target: http://papermill.readthedocs.io/en/latest/?badge=latest
-   :alt: Documentation Status
-.. image:: https://mybinder.org/badge.svg
-   :target: https://mybinder.org/v2/gh/nteract/papermill/master?filepath=papermill%2Ftests%2Fnotebooks%2Fbinder.ipynb
-
-
-**Papermill** is a tool for parameterizing, executing, and analyzing Jupyter
-Notebooks.
-
-Papermill lets you:
-
-* **parametrize** notebooks
-* **execute** and **collect** metrics across the notebooks
-* **summarize collections** of notebooks
-
-This opens up new opportunities for how notebooks can be used. For example:
-
-- Perhaps you have a financial report that you wish to run with different
-  values on the first or last day of a month or at the beginning or end
-  of the year, **using parameters** makes this task easier.
-- Do you want to run a notebook and depending on its results,
-  choose a particular notebook to run next? You can now programmatically
-  **execute a workflow** without having to copy and paste from notebook to
-  notebook manually.
-- Do you have plots and visualizations spread across 10 or more notebooks?
-  Now you can choose which plots to programmatically display a **summary**
-  **collection** in a notebook to share with others.
-
-Installation
-------------
-
-From the commmand line:
-
-.. code-block:: bash
-
-    pip install papermill
-
-Installing In-Notebook bindings
--------------------------------
-
-* `Python `_ (included in this repo)
-* `R`_ (available in the **papermillr** project)
-
-.. _`R`: https://github.com/nteract/papermillr
-
-Usage
------
-
-Parametrizing a Notebook
-~~~~~~~~~~~~~~~~~~~~~~~~
-
-To parametrize your notebook designate a cell with the tag ``parameters``.
-
-Papermill looks for the ``parameters`` cell and treats this cell as defaults for the parameters passed in at execution time. Papermill will add a new cell tagged with ``injected-parameters`` with input parameters in order to overwrite the values in ``parameters``. If no cell is tagged with ``parameters`` the injected cell will be inserted at the top of the notebook.
-
-Additionally, if you rerun notebooks through papermill and it will reuse the ``injected-parameters`` cell from the prior run. In this case papermill will replace the old ``injected-parameters`` cell with the new run's inputs.
-
-.. image:: docs/img/parameters.png
-
-Executing a Notebook
-~~~~~~~~~~~~~~~~~~~~
-
-The two ways to execute the notebook with parameters are: (1) through the
-Python API and (2) through the command line interface.
-
-Execute via the Python API
-^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-.. code-block:: python
-
-    import papermill as pm
-
-    pm.execute_notebook(
-        'path/to/input.ipynb',
-        'path/to/output.ipynb',
-        parameters = dict(alpha=0.6, ratio=0.1)
-    )
-
-Execute via CLI
-^^^^^^^^^^^^^^^
-
-Here's an example of a local notebook being executed and output to an
-Amazon S3 account:
-
-.. code-block:: bash
-
-    $ papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1
-
-
-.. _PythonBinding:
-
-Python In-notebook Bindings
----------------------------
-
-Recording Values to the Notebook
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Users can save values to the notebook document to be consumed by other
-notebooks.
-
-Recording values to be saved with the notebook.
-
-.. code-block:: python
-
-    """notebook.ipynb"""
-    import papermill as pm
-
-    pm.record("hello", "world")
-    pm.record("number", 123)
-    pm.record("some_list", [1, 3, 5])
-    pm.record("some_dict", {"a": 1, "b": 2})
-
-Users can recover those values as a Pandas dataframe via the
-``read_notebook`` function.
-
-.. code-block:: python
-
-    """summary.ipynb"""
-    import papermill as pm
-
-    nb = pm.read_notebook('notebook.ipynb')
-    nb.dataframe
-
-.. image:: docs/img/nb_dataframe.png
-
-Displaying Plots and Images Saved by Other Notebooks
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Display a matplotlib histogram with the key name ``matplotlib_hist``.
-
-.. code-block:: python
-
-    """notebook.ipynb"""
-    import papermill as pm
-    from ggplot import mpg
-    import matplotlib.pyplot as plt
-
-    # turn off interactive plotting to avoid double plotting
-    plt.ioff()
-
-    f = plt.figure()
-    plt.hist('cty', bins=12, data=mpg)
-    pm.display('matplotlib_hist', f)
-
-.. image:: docs/img/matplotlib_hist.png
-
-Read in that above notebook and display the plot saved at ``matplotlib_hist``.
-
-.. code-block:: python
-
-    """summary.ipynb"""
-    import papermill as pm
-
-    nb = pm.read_notebook('notebook.ipynb')
-    nb.display_output('matplotlib_hist')
-
-.. image:: docs/img/matplotlib_hist.png
-
-Analyzing a Collection of Notebooks
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Papermill can read in a directory of notebooks and provides the
-``NotebookCollection`` interface for operating on them.
-
-.. code-block:: python
-
-    """summary.ipynb"""
-    import papermill as pm
-
-    nbs = pm.read_notebooks('/path/to/results/')
-
-    # Show named plot from 'notebook1.ipynb'
-    # Accepts a key or list of keys to plot in order.
-    nbs.display_output('train_1.ipynb', 'matplotlib_hist')
-
-.. image:: docs/img/matplotlib_hist.png
-
-.. code-block:: python
-
-    # Dataframe for all notebooks in collection
-    nbs.dataframe.head(10)
-
-.. image:: docs/img/nbs_dataframe.png
-
-Documentation
--------------
-
-We host the `papermill documentation `_ on ReadTheDocs.
-
-.. |Logo| image:: https://user-images.githubusercontent.com/836375/27929844-6bb34e62-6249-11e7-9a2a-00849a64940c.png
-   :width: 200px
-   :target: https://github.com/nteract/papermill
-   :alt: Papermill
diff --git a/convert.sh b/convert.sh
new file mode 100755
index 00000000..08956cc3
--- /dev/null
+++ b/convert.sh
@@ -0,0 +1,7 @@
+FILES=*.rst
+for f in $FILES
+do
+  filename="${f%.*}"
+  echo "Converting $f to $filename.md"
+  `pandoc $f -f rst -t markdown -o $filename.md`
+done
diff --git a/pyproject.toml b/pyproject.toml
new file mode 100644
index 00000000..8bfffb83
--- /dev/null
+++ b/pyproject.toml
@@ -0,0 +1,29 @@
+# Example configuration for Black.
+
+# NOTE: you have to use single-quoted strings in TOML for regular expressions.
+# It's the equivalent of r-strings in Python. Multiline strings are treated as
+# verbose regular expressions by Black. Use [ ] to denote a significant space
+# character.
+
+[tool.black]
+line-length = 100
+include = '\.pyi?$'
+exclude = '''
+/(
+    \.git
+  | \.hg
+  | \.mypy_cache
+  | \.tox
+  | \.venv
+  | _build
+  | buck-out
+  | build
+  | dist
+
+  # The following are specific to Black, you probably don't want those.
+  | blib2to3
+  | tests/data
+  | profiling
+)/
+'''
+skip-string-normalization = true
diff --git a/requirements-dev.txt b/requirements-dev.txt
index 3247e986..30f344ba 100644
--- a/requirements-dev.txt
+++ b/requirements-dev.txt
@@ -7,3 +7,5 @@ pytest-cov
 pytest-mock
 moto==1.2.0
 check-manifest
+black
+pre-commit