Guess params by introspecting the _parameters_ cell (nteract#531)
* Add Translator.inspect

* Apply changes following review

Co-authored-by: Frédéric Collonval <[email protected]>
fcollonval and Frédéric Collonval authored Sep 6, 2020
1 parent 33b3e1c commit 82b5c5d
Showing 19 changed files with 674 additions and 37 deletions.
20 changes: 18 additions & 2 deletions docs/usage-cli.rst
Original file line number Diff line number Diff line change
@@ -6,7 +6,7 @@ options:

.. code-block:: bash
Usage: papermill [OPTIONS] NOTEBOOK_PATH OUTPUT_PATH
Usage: papermill [OPTIONS] NOTEBOOK_PATH [OUTPUT_PATH]
This utility executes a single notebook in a subprocess.
@@ -24,6 +24,9 @@ options:
stdin and write it out to stdout.
Options:
--help-notebook Display parameters information for the given
notebook path.
-p, --parameters TEXT... Parameters to pass to the parameters cell.
-r, --parameters_raw TEXT... Parameters to be read as raw string.
-f, --parameters_file TEXT Path to YAML file containing parameters.
@@ -32,34 +35,47 @@ options:
--inject-input-path Insert the path of the input notebook as
PAPERMILL_INPUT_PATH as a notebook
parameter.
--inject-output-path Insert the path of the output notebook as
PAPERMILL_OUTPUT_PATH as a notebook
parameter.
--inject-paths Insert the paths of input/output notebooks
as
PAPERMILL_INPUT_PATH/PAPERMILL_OUTPUT_PATH
as notebook parameters.
--engine TEXT The execution engine name to use in
evaluating the notebook.
--request-save-on-cell-execute / --no-request-save-on-cell-execute
Request save notebook after each cell
execution
--autosave-cell-every INTEGER How often in seconds to autosave the
notebook during long cell executions (0 to
disable)
--prepare-only / --prepare-execute
Flag for outputting the notebook without
execution, but with parameters applied.
-k, --kernel TEXT Name of kernel to run.
--cwd TEXT Working directory to run notebook in.
--progress-bar / --no-progress-bar
Flag for turning on the progress bar.
--log-output / --no-log-output Flag for writing notebook output to the
configured logger.
--stdout-file FILENAME File to write notebook stdout output to.
--stderr-file FILENAME File to write notebook stderr output to.
--log-level [NOTSET|DEBUG|INFO|WARNING|ERROR|CRITICAL]
Set log level
--start-timeout INTEGER Time in seconds to wait for kernel to start.
--start-timeout, --start_timeout INTEGER
Time in seconds to wait for kernel to start.
--execution-timeout INTEGER Time in seconds to wait for each cell before
failing execution (default: forever)
--report-mode / --no-report-mode
Flag for hiding input.
--version Flag for displaying the version.
56 changes: 56 additions & 0 deletions docs/usage-inspect.rst
@@ -0,0 +1,56 @@
Inspect
=======

There are two ways to inspect a notebook and discover its parameters: (1) through the
Python API and (2) through the command line interface.

Inspect via the Python API
~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``inspect_notebook`` function can be called to inspect a notebook:

.. code-block:: python

   inspect_notebook(<notebook path>)

.. code-block:: python

   import papermill as pm

   pm.inspect_notebook('path/to/input.ipynb')

.. note::
   If your path is parametrized, you can pass those parameters in a dictionary
   as the second argument:

   ``inspect_notebook('path/to/input_{month}.ipynb', parameters={'month': 'Feb'})``
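
As a rough illustration of what a parametrized path means here, the sketch below fills a path template with Python's ``str.format``. That papermill resolves the template this way is an assumption on our part, and ``fill_path_template`` is a hypothetical helper, not part of papermill.

```python
def fill_path_template(path, parameters=None):
    """Fill ``{name}`` placeholders in ``path`` from ``parameters``.

    Hypothetical helper for illustration only; not part of papermill.
    """
    # Assumption: placeholders follow str.format syntax, as in 'input_{month}.ipynb'.
    return path.format(**(parameters or {}))


resolved = fill_path_template('path/to/input_{month}.ipynb', {'month': 'Feb'})
print(resolved)  # path/to/input_Feb.ipynb
```

A path without placeholders comes back unchanged, so the same call works for plain and parametrized paths.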

Inspect via CLI
~~~~~~~~~~~~~~~

To inspect a notebook using the CLI, run ``papermill --help-notebook`` in the
terminal with the notebook path and, optionally, any path parameters.

.. seealso::

   :doc:`CLI reference <./usage-cli>`

Inspect a notebook
^^^^^^^^^^^^^^^^^^

Here's an example of inspecting a local notebook, followed by the resulting output:

.. code-block:: bash

   papermill --help-notebook ./papermill/tests/notebooks/complex_parameters.ipynb

   Usage: papermill [OPTIONS] NOTEBOOK_PATH [OUTPUT_PATH]

   Parameters inferred for notebook './papermill/tests/notebooks/complex_parameters.ipynb':
    msg: Unknown type (default None)
    a: float (default 2.25)          Variable a
    b: List[str] (default ['Hello','World'])
                                     Nice list
    c: NoneType (default None)
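
The layout of the output above appears to follow a simple rule, visible in ``display_notebook_help``: a definition longer than 30 characters pushes its help text onto the next line, indented by 34 spaces, and shorter definitions are padded to column 34. The sketch below reproduces that rule in isolation. ``format_parameter`` is a hypothetical name, and the single leading space in the format string is copied from the listing as displayed here, which may have lost whitespace.

```python
from collections import namedtuple

# Same fields as the Parameter namedtuple added in papermill/models.py.
Parameter = namedtuple('Parameter', ['name', 'inferred_type_name', 'default', 'help'])


def format_parameter(param, help_column=34, max_inline=30):
    """Render one parameter line (hypothetical helper mirroring the CLI layout)."""
    type_repr = param.inferred_type_name
    if type_repr == "None":
        type_repr = "Unknown type"

    definition = " {}: {} (default {})".format(param.name, type_repr, param.default)
    if len(definition) > max_inline:
        if param.help:
            # Long definition: help goes on its own line, aligned to help_column.
            return definition + "\n" + " " * help_column + param.help
        return definition
    # Short definition: pad so the help text starts at help_column.
    return definition.ljust(help_column) + param.help


params = [
    Parameter('msg', 'None', 'None', ''),
    Parameter('a', 'float', '2.25', 'Variable a'),
    Parameter('b', 'List[str]', "['Hello','World']", 'Nice list'),
]
for p in params:
    print(format_parameter(p))
```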
1 change: 1 addition & 0 deletions docs/usage-workflow.rst
@@ -21,6 +21,7 @@ a collection of notebooks.
:maxdepth: 2

usage-parameterize
usage-inspect
usage-execute
usage-store

1 change: 1 addition & 0 deletions papermill/__init__.py
@@ -2,3 +2,4 @@

from .exceptions import PapermillException, PapermillExecutionError
from .execute import execute_notebook
from .inspection import inspect_notebook
28 changes: 23 additions & 5 deletions papermill/cli.py
@@ -17,6 +17,7 @@

from .execute import execute_notebook
from .iorw import read_yaml_file, NoDatesSafeLoader
from .inspection import display_notebook_help
from . import __version__ as papermill_version

click.disable_unicode_literals_warning = True
@@ -37,8 +38,15 @@ def print_papermill_version(ctx, param, value):


@click.command(context_settings=dict(help_option_names=['-h', '--help']))
@click.pass_context
@click.argument('notebook_path', required=not INPUT_PIPED)
@click.argument('output_path', required=not (INPUT_PIPED or OUTPUT_PIPED))
@click.argument('output_path', default="")
@click.option(
'--help-notebook',
is_flag=True,
default=False,
help='Display parameters information for the given notebook path.',
)
@click.option(
'--parameters', '-p', nargs=2, multiple=True, help='Parameters to pass to the parameters cell.'
)
@@ -140,8 +148,10 @@ def print_papermill_version(ctx, param, value):
help='Flag for displaying the version.',
)
def papermill(
click_ctx,
notebook_path,
output_path,
help_notebook,
parameters,
parameters_raw,
parameters_file,
@@ -181,11 +191,16 @@ def papermill(
from stdin and write it out to stdout.
"""
if not help_notebook:
required_output_path = not (INPUT_PIPED or OUTPUT_PIPED)
if required_output_path and not output_path:
raise click.UsageError("Missing argument 'OUTPUT_PATH'")

if INPUT_PIPED and notebook_path and not output_path:
input_path = '-'
output_path = notebook_path
notebook_path = '-'
else:
notebook_path = notebook_path or '-'
input_path = notebook_path or '-'
output_path = output_path or '-'

if output_path == '-':
@@ -204,7 +219,7 @@ def papermill(
# Read in Parameters
parameters_final = {}
if inject_input_path or inject_paths:
parameters_final['PAPERMILL_INPUT_PATH'] = notebook_path
parameters_final['PAPERMILL_INPUT_PATH'] = input_path
if inject_output_path or inject_paths:
parameters_final['PAPERMILL_OUTPUT_PATH'] = output_path
for params in parameters_base64 or []:
@@ -218,9 +233,12 @@ def papermill(
for name, value in parameters_raw or []:
parameters_final[name] = value

if help_notebook:
sys.exit(display_notebook_help(click_ctx, notebook_path, parameters_final))

try:
execute_notebook(
input_path=notebook_path,
input_path=input_path,
output_path=output_path,
parameters=parameters_final,
engine_name=engine,
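
The path handling in this hunk can be read as a small pure function. The sketch below restates the new branch logic outside of click so it can be exercised directly. ``resolve_paths`` and its keyword arguments are hypothetical names, and it raises ``ValueError`` where the CLI raises ``click.UsageError``, only to keep the example free of third-party imports.

```python
def resolve_paths(notebook_path, output_path, input_piped=False, output_piped=False,
                  help_notebook=False):
    """Return (input_path, output_path), using '-' for stdin/stdout.

    Hypothetical restatement of the CLI logic; not a papermill function.
    """
    if not help_notebook:
        # OUTPUT_PATH is only mandatory when neither stream is piped.
        required_output_path = not (input_piped or output_piped)
        if required_output_path and not output_path:
            # The CLI raises click.UsageError here.
            raise ValueError("Missing argument 'OUTPUT_PATH'")

    if input_piped and notebook_path and not output_path:
        # With piped input, a single positional argument is the output path.
        return '-', notebook_path
    return notebook_path or '-', output_path or '-'


print(resolve_paths('in.ipynb', 'out.ipynb'))             # ('in.ipynb', 'out.ipynb')
print(resolve_paths('out.ipynb', '', input_piped=True))   # ('-', 'out.ipynb')
print(resolve_paths('in.ipynb', '', help_notebook=True))  # ('in.ipynb', '-')
```

Keeping ``input_path`` separate from ``notebook_path`` is what lets ``--help-notebook`` still receive the path the user typed, even when the input is piped.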
4 changes: 2 additions & 2 deletions papermill/execute.py
@@ -5,10 +5,10 @@

from .log import logger
from .exceptions import PapermillExecutionError
from .iorw import load_notebook_node, write_ipynb, get_pretty_path, local_file_io_cwd
from .iorw import get_pretty_path, local_file_io_cwd, load_notebook_node, write_ipynb
from .engines import papermill_engines
from .utils import chdir
from .parameterize import parameterize_notebook, parameterize_path, add_builtin_parameters
from .parameterize import add_builtin_parameters, parameterize_notebook, parameterize_path


def execute_notebook(
119 changes: 119 additions & 0 deletions papermill/inspection.py
@@ -0,0 +1,119 @@
# -*- coding: utf-8 -*-
"""Deduce parameters of a notebook from the parameters cell."""
import click

from .iorw import get_pretty_path, load_notebook_node, local_file_io_cwd
from .log import logger
from .parameterize import add_builtin_parameters, parameterize_path
from .translators import papermill_translators
from .utils import any_tagged_cell, find_first_tagged_cell_index


def _open_notebook(notebook_path, parameters):
    path_parameters = add_builtin_parameters(parameters)
    input_path = parameterize_path(notebook_path, path_parameters)
    logger.info("Input Notebook: %s" % get_pretty_path(input_path))

    with local_file_io_cwd():
        return load_notebook_node(input_path)


def _infer_parameters(nb):
    """Infer the notebook parameters.

    Parameters
    ----------
    nb : nbformat.NotebookNode
        Notebook

    Returns
    -------
    List[Parameter]
        List of parameters (name, inferred_type_name, default, help)
    """
    params = []

    parameter_cell_idx = find_first_tagged_cell_index(nb, "parameters")
    if parameter_cell_idx < 0:
        return params
    parameter_cell = nb.cells[parameter_cell_idx]
    kernel_name = nb.metadata.kernelspec.name
    language = nb.metadata.kernelspec.language

    translator = papermill_translators.find_translator(kernel_name, language)
    try:
        params = translator.inspect(parameter_cell)
    except NotImplementedError:
        logger.warning(
            "Translator for '{}' language does not support parameter introspection.".format(
                language
            )
        )

    return params


def display_notebook_help(ctx, notebook_path, parameters):
    """Display help on notebook parameters.

    Parameters
    ----------
    ctx : click.Context
        Click context
    notebook_path : str
        Path to the notebook to be inspected
    parameters : dict
        Parameters used to resolve a parametrized notebook path
    """
    nb = _open_notebook(notebook_path, parameters)
    click.echo(ctx.command.get_usage(ctx))
    pretty_path = get_pretty_path(notebook_path)
    click.echo("\nParameters inferred for notebook '{}':".format(pretty_path))

    if not any_tagged_cell(nb, "parameters"):
        click.echo("\n No cell tagged 'parameters'")
        return 1

    params = _infer_parameters(nb)
    if params:
        for param in params:
            p = param._asdict()
            type_repr = p["inferred_type_name"]
            if type_repr == "None":
                type_repr = "Unknown type"

            definition = " {}: {} (default {})".format(p["name"], type_repr, p["default"])
            if len(definition) > 30:
                if len(p["help"]):
                    param_help = "".join((definition, "\n", 34 * " ", p["help"]))
                else:
                    param_help = definition
            else:
                param_help = "{:<34}{}".format(definition, p["help"])
            click.echo(param_help)
    else:
        click.echo(
            "\n Can't infer anything about this notebook's parameters. "
            "It may not have any parameter defined."
        )

    return 0


def inspect_notebook(notebook_path, parameters=None):
    """Return the inferred notebook parameters.

    Parameters
    ----------
    notebook_path : str
        Path to notebook
    parameters : dict, optional
        Parameters used to resolve a parametrized notebook path

    Returns
    -------
    Dict[str, Parameter]
        Mapping of (parameter name, {name, inferred_type_name, default, help})
    """
    nb = _open_notebook(notebook_path, parameters)

    params = _infer_parameters(nb)
    return {p.name: p._asdict() for p in params}
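
The translator's ``inspect`` method is not shown in this part of the diff, so the following is only a sketch of how parameters might be guessed from a Python parameters cell. It handles single-line assignments with an optional annotation and trailing comment. ``inspect_cell_source`` and the ``ASSIGNMENT`` pattern are hypothetical, and this is not papermill's actual implementation. It assumes that unannotated parameters get the type name ``"None"``, matching the check in ``display_notebook_help``.

```python
import re
from collections import namedtuple

Parameter = namedtuple('Parameter', ['name', 'inferred_type_name', 'default', 'help'])

# name [: annotation] = value [# comment], all on one line (hypothetical pattern).
# Limitation: a '#' inside a string literal is mistaken for a comment.
ASSIGNMENT = re.compile(
    r"^(?P<name>[A-Za-z_]\w*)\s*"      # variable name
    r"(?::\s*(?P<type>[^=]+?))?\s*"    # optional annotation
    r"=\s*(?P<value>[^#]*?)\s*"        # default value, kept as source text
    r"(?:#\s*(?P<help>.*))?$"          # optional trailing comment
)


def inspect_cell_source(source):
    """Guess parameters from single-line assignments in a cell's source."""
    params = []
    for line in source.splitlines():
        match = ASSIGNMENT.match(line.strip())
        if match is None:
            continue  # comments, blank lines, and multi-line statements are skipped
        params.append(
            Parameter(
                name=match.group('name'),
                inferred_type_name=match.group('type') or 'None',
                default=match.group('value'),
                help=match.group('help') or '',
            )
        )
    return params


cell = "# Parameters\nmsg = None\na: float = 2.25  # Variable a\n"
for param in inspect_cell_source(cell):
    print(param)
```

A regular expression keeps the sketch short. A fuller version would more likely parse the cell with the ``ast`` and ``tokenize`` modules so that multi-line values and ``#`` characters inside strings are handled correctly.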
6 changes: 4 additions & 2 deletions papermill/iorw.py
@@ -16,14 +16,14 @@
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

from . import __version__
from .log import logger
from .utils import chdir
from .exceptions import (
PapermillException,
PapermillRateLimitException,
missing_dependency_generator,
missing_environment_variable_generator,
)
from .log import logger
from .utils import chdir

try:
from .s3 import S3
@@ -411,6 +411,7 @@ def load_notebook_node(notebook_path):

if not hasattr(nb.metadata, 'papermill'):
nb.metadata['papermill'] = {
'default_parameters': dict(),
'parameters': dict(),
'environment_variables': dict(),
'version': __version__,
@@ -422,6 +423,7 @@ def load_notebook_node(notebook_path):

if not hasattr(cell.metadata, 'papermill'):
cell.metadata['papermill'] = dict()

return nb


9 changes: 9 additions & 0 deletions papermill/models.py
@@ -0,0 +1,9 @@
"""Models used by papermill."""
from collections import namedtuple

Parameter = namedtuple('Parameter', [
    'name',
    'inferred_type_name',  # string of type
    'default',  # string representing the default value
    'help',
])
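
To show what callers of ``inspect_notebook`` get back, the sketch below builds the same mapping from a hand-written list of ``Parameter`` tuples. The sample values are made up for illustration. Only the namedtuple fields and the dictionary comprehension come from the diff.

```python
from collections import namedtuple

Parameter = namedtuple('Parameter', ['name', 'inferred_type_name', 'default', 'help'])

# Illustrative values; in papermill these would come from the translator.
params = [
    Parameter('a', 'float', '2.25', 'Variable a'),
    Parameter('c', 'NoneType', 'None', ''),
]

# Same comprehension as the return statement of inspect_notebook.
by_name = {p.name: p._asdict() for p in params}

print(by_name['a']['inferred_type_name'])  # float
print(by_name['a']['default'])             # 2.25 (a string, not a float)
```

Because ``default`` is kept as a string representation, callers that need the actual value have to evaluate or parse it themselves.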