Commit

Updated dianne overview
sbohez committed Aug 20, 2016
1 parent 060088f commit cafb9d3
Showing 3 changed files with 30 additions and 21 deletions.
2 changes: 1 addition & 1 deletion doc/dashboard.md
@@ -26,7 +26,7 @@ In order to submit a job you need to enter the following details:
* *ACT*: Used for reinforcement learning. This requires an idle `Agent` service to interact with an environment to generate experience. **Note**: reinforcement learning in DIANNE is still in major development; have a look at the `master` branch for the latest features.
* **Neural Network**: The network that needs to be deployed for the job. You can select any available network from the `Repository` from the dropdown list. *Tip*: you can easily switch between the Builder & Dashboard by clicking the DIANNE logo!
* **Dataset**: The dataset that will be used for the job. Any `Dataset` service available on the cluster will be listed here. To add your own dataset, or use data augmentation using adapters, please see the [dataset documentation on GitHub](https://github.com/ibcn-cloudlet/dianne/blob/v0.4.0/doc/datasets.md).
- * **Options**: Allows you to specify the options for the job in a key=value manner. Each type of job will have different options available. For a complete list of options and their default values, please see the [configuration documentation on GitHub](https://github.com/ibcn-cloudlet/dianne/blob/v0.4.0/doc/configuration.md).
+ * **Options**: Allows you to specify the options for the job in a `key=value` manner. Each type of job will have different options available. For a complete list of options and their default values, please see the [configuration documentation on GitHub](https://github.com/ibcn-cloudlet/dianne/blob/v0.4.0/doc/configuration.md).

Once you have set all the required details, just click the `Submit` button and your job will be submitted to the job queue. If the necessary requirements for the job are met (i.e. there is an idle `Learner` service for a learning job), the job will be started. When the job is finished, you can click on it to view the result:

49 changes: 29 additions & 20 deletions doc/dianne.md
@@ -4,59 +4,68 @@ DIANNE overview
Modules
-------

- In DIANNE, the central entity is the **Module**. A Module is a basic building blocks of a neural network, which provides two flows of information: a forward pass and a backward pass. In the forward pass, required for neural network execution, the input data is transformed in some way to give an output. In the backward pass, required for training neural networks, the gradient on the output of the previous forward pass is taken in and the corresponding gradient on the input is calculated. Each module can have one (or more) next modules to forward its output to, and one (or more) previous modules to propagate the gradient on the input to.
+ In DIANNE, the central entity is the **Module**. A Module is a basic building block of a neural network, which provides two flows of information: a forward pass and a backward pass. In the forward pass, required for model evaluation, the input data is transformed in some way to give an output. In the backward pass, required for model training, the gradient of an error criterion with respect to the output of the previous forward pass is taken in, and the corresponding gradients on the input and weights are calculated. Each module can have one or more subsequent modules to forward its output to, and one or more preceding modules to propagate the gradient on the input to.
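
To make this concrete, here is a minimal sketch of what such a module contract might look like. The names and signatures are illustrative only, not DIANNE's actual API, and a `Tensor` type for n-dimensional arrays is assumed (see the Tensors section below).

```java
// Illustrative sketch only: hypothetical names, not DIANNE's actual API.
public interface Module {
    // Forward pass: transform the input data into an output.
    Tensor forward(Tensor input);

    // Backward pass: take the gradient on the output of the previous
    // forward pass and return the corresponding gradient on the input,
    // accumulating gradients on any internal weights along the way.
    Tensor backward(Tensor gradOutput);
}
```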

- A neural network can be constructed by chaining a number of modules. Starting with a special **Input** module, which simply forwards the input data, and ending with another special **Output** module, which collects the output. Besides the Input and the Output module, DIANNE supports a number of other types to build up neural networks. A **Linear** module calculates a weighted sum of the input and can be combined with an activation module (currently **Sigmoid**, **Tanh** and (Parameterized) Rectified Linear Units (**(P)ReLU**) modules are supported) to model a fully connected neural network layer. For classification, a **Softmax** module is added before the Output which converts the output to classification probabilities.
+ A neural network can be constructed by chaining a number of modules, starting with an **Input** module, the entry point through which data is pushed into the model, and ending with an **Output** module, which collects the output. Besides the Input and the Output module, DIANNE supports a number of other types to build up neural networks. For example, a **Linear** module calculates a weighted sum of the input and can be combined with an activation module (such as a **Sigmoid** or **ReLU**) to model a fully connected neural network layer. For classification, just add a **Softmax** module before the Output to convert the output to class probabilities.
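
As an illustration of such a chain (hypothetical constructors again, not DIANNE's actual API), a small fully connected classifier could be evaluated by passing a tensor through each module in turn:

```java
// Illustrative sketch only: hypothetical constructors, not DIANNE's actual API.
// A small fully connected classifier as an ordered chain of modules.
Module[] network = {
    new Input(),           // entry point: pushes the input data into the model
    new Linear(784, 128),  // weighted sum: 784 input values -> 128 hidden units
    new Sigmoid(),         // activation, completing one fully connected layer
    new Linear(128, 10),   // second layer: 128 hidden units -> 10 classes
    new Softmax(),         // converts the output to class probabilities
    new Output()           // collects the result
};

// Forward pass: each module forwards its output to the next one.
Tensor activation = sample; // sample: an input tensor, e.g. a 28x28 image
for (Module m : network) {
    activation = m.forward(activation);
}
```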

- Besides modules for fully connected neural networks, we provide **Convolution** and **MaxPooling** modules which are frequently used in state-of-the-art convolutional neural networks. Various split and combine modules allow to create parallel paths in a neural network. These special split and combine modules are the only modules which can have multiple next and previous modules. Finally, a number of preprocessing modules are provided, for example to normalize the input samples. By combining these modules one can build popular neural networks, such as for example the [Overfeat neural network](http://cilvr.nyu.edu/doku.php?id=software:overfeat:start):
+ DIANNE provides a range of different modules to build up your models, such as **Convolution** and **MaxPooling** modules to design state-of-the-art convolutional neural networks. Various split and combine modules allow you to create parallel paths in a neural network. These special split and combine modules can have multiple next and previous modules, effectively constructing a neural network as a directed graph of modules. Preprocessing modules are also provided, for example to normalize the input samples. By combining these modules one can rebuild popular neural networks, such as the [Overfeat neural network](http://cilvr.nyu.edu/doku.php?id=software:overfeat:start):

![Overfeat neural network modeled in DIANNE](figures/overfeat.png)

- In DIANNE, the Module is not only a functional block, but also a unit of deployment. This means that each individual module can be deployed on any device in a distributed computing environment. This allows fine grained distribution of neural network modules, enabling offloading parts of a neural network in case of small and embedded devices, or model-parallellism when training large neural network models.
+ In DIANNE, the Module is not only a functional block, but also a unit of deployment, meaning that each individual module can be deployed on any device in a distributed computing environment. This allows fine-grained distribution of neural network modules, enabling offloading parts of a neural network from small and embedded devices, or model parallelism when training large neural network models. For a list of all the available modules in DIANNE, have a look at the [module catalog on GitHub](https://github.com/ibcn-cloudlet/dianne/blob/v0.4.0/doc/modules.md).

Tensors
-------

- The parameters and return values of the module's forward and backward passes are Tensors. DIANNE provides its own Tensor API providing various operations on n-dimensional arrays. Different implementations are available: one pure Java implementation, one based on the [Torch 7 Tensor library](https://github.com/torch/TH), and a Cuda implementation.
+ While DIANNE may be based on Java, this is by no means a hit on performance! All data arrays, such as the parameters and return values of the module's forward and backward passes, are abstracted as **Tensor** objects. DIANNE provides its own Tensor API offering various operations on n-dimensional arrays. By keeping the underlying data and operations in native rather than JVM memory, we can not only use powerful algebraic libraries, but also perform these operations on the GPU. We are currently using native implementations based on the [torch and cutorch Tensor libraries](https://github.com/torch).
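
For illustration, working with such tensors could look as follows; the method names are hypothetical, not the actual DIANNE Tensor API:

```java
// Illustrative sketch only: hypothetical method names.
// Tensors are n-dimensional arrays whose data lives in native memory,
// so operations can run in optimized native code or on the GPU.
Tensor a = new Tensor(3, 4);  // a 3x4 matrix
Tensor b = new Tensor(4, 2);  // a 4x2 matrix
a.fill(1.0f);                 // set all elements to 1
b.randn();                    // fill with normally distributed random values

Tensor c = a.mm(b);           // matrix-matrix multiplication, giving a 3x2 result
c.add(0.5f);                  // element-wise addition in place
```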

Runtime
-------

- The DIANNE runtime runs on top of AIOLOS and OSGi in order to provide runtime modularity and distributed deployment. On each device running the DIANNE stack any neural network module can be deployed. At runtime, all next and previous modules of the neural network are wired together and a processing pipeline is constructed. When two modules are deployed on different devices, they will communicate via the remote procedure call mechanism offered by AIOLOS.
+ The DIANNE **Runtime** runs on top of AIOLOS and OSGi in order to provide runtime modularity and distributed deployment. On each device running the DIANNE stack, any neural network module can be deployed. At runtime, all subsequent and preceding modules of the neural network are wired together and a processing graph is constructed. When two modules are deployed on different devices, they communicate via the remote procedure call mechanism offered by AIOLOS.

![DIANNE runtime](figures/runtime.png)

Repository
----------

- The DIANNE repository contains a collection of configured neural networks, as well as a collection of trained weights for these neural networks. Neural networks are stored in a file called modules.txt in JSON format, containing a map of all modules with their configuration parameters. Besides the modules descriptions, a layout.txt can provide information of the neural network layout in the GUI builder. Trained weights of a module are saved in a binary format.
+ The DIANNE **Repository** service maintains a collection of configured neural networks, as well as a collection of trained weights for these neural networks. The neural network structure is stored in a file in JSON format, containing a map of all modules with their configuration parameters, as well as their connectivity to other modules. Trained weights of a module are saved in a binary format. Weights can be tagged, allowing the user to maintain different sets of weights, for example when training with different hyperparameters.

- When a training procedure is started, newly trained weights will be stored into the repository. This way, the DIANNE repository can also be used as parameter server for parallel training on multiple devices.
+ During training, the DIANNE repository acts as a parameter server. At the start of the training procedure, newly randomized weights are stored in the repository. As training advances, updates to those weights are pushed to the repository, which will merge them. This way, the DIANNE repository can also be used for asynchronous gradient descent when running multiple training routines in parallel.

Datasets
--------

- In order to train and evaluate your neural network, DIANNE provides a Dataset interface that offers input samples and corresponding outputs for supervised learning. DIANNE has built-in adapters for various 2D image datasets, such as MNIST, ImageNet, CIFAR-10/100 and STL-10.
+ In order to train and evaluate your neural network, DIANNE provides a **Dataset** interface that offers input samples and optionally corresponding outputs for either supervised or unsupervised learning. Similar to the model repository, datasets are added using JSON descriptor files. DIANNE has built-in adapters for various well-known image datasets, such as MNIST, ImageNet, CIFAR-10/100 and STL-10, for which you can download the configuration files on the [datasets page](http://dianne.intec.ugent.be/datasets/). You can also add your own dataset by simply creating a new descriptor file; see how in the [dataset documentation on GitHub](https://github.com/ibcn-cloudlet/dianne/blob/v0.4.0/doc/datasets.md).
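
Conceptually, such a dataset only needs to expose its samples. A minimal sketch of the idea (hypothetical signatures, not DIANNE's actual interface):

```java
// Illustrative sketch only: hypothetical signatures, not DIANNE's actual API.
public interface Dataset {
    // Total number of samples in the dataset.
    int size();

    // The input sample at the given index, e.g. an image as a tensor.
    Tensor getInput(int index);

    // The corresponding target output (e.g. a one-hot class label) for
    // supervised learning, or null when no labels are available.
    Tensor getOutput(int index);
}
```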

![DIANNE datasets](figures/datasets.png)

+ Learners & Evaluators
+ ---------------------

+ In order to train your models, DIANNE provides different **Learner** services. These provide the training routines that will pull data from a dataset, feed it into the network, calculate the gradients and push weight updates to the repository. Various well-known optimization and regularization techniques are readily available, ranging from vanilla Stochastic Gradient Descent to Adam.
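
The simplest of these, vanilla Stochastic Gradient Descent, boils down to repeatedly nudging each weight against its gradient. A sketch with the hypothetical tensor operations used above, assuming arrays of weight tensors and of the gradients accumulated during the backward pass:

```java
// Illustrative sketch only: hypothetical tensor operations.
// One vanilla SGD step: w := w - learningRate * gradient
float learningRate = 0.01f;
for (int i = 0; i < weights.length; i++) {
    // scale the accumulated gradient and subtract it from the weights in place
    weights[i].sub(gradients[i].mul(learningRate));
}
```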

+ The **Evaluator** service allows you to assess the performance of your model, not only in terms of average loss or classification accuracy, but also in terms of system performance, e.g. required computation time. This is useful for scenarios where you want to deploy your model on devices with limited compute power.

+ Coordinator
+ -----------

+ Once you start to scale up your training, either to train multiple models at the same time, or to train models that are too large for a single machine, you can use the DIANNE **Coordinator** service to keep things manageable. After connecting multiple DIANNE runtimes to one another, the coordinator allows you to manage learning and evaluation tasks using a job-scheduling system. Once you specify your job as a number of `key=value` configuration options, the coordinator will go to work by checking if suitable Learner and Evaluator services are available, configuring those services, deploying the required modules, starting the training procedure and monitoring the training progress.
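
For example, a learning job could be described by a handful of such options before being submitted. The option names below are purely illustrative; see the configuration documentation for the real ones.

```java
// Illustrative sketch only: hypothetical option names.
// A job is described by a simple set of key=value options.
// (uses java.util.Map and java.util.HashMap)
Map<String, String> options = new HashMap<>();
options.put("batchSize", "10");        // number of samples per weight update
options.put("learningRate", "0.01");   // step size for gradient descent
options.put("maxIterations", "10000"); // stop criterion for the learn job
```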

UI
--

- DIANNE comes with a web-based UI to build, deploy, train and evaluate a neural network. During the build phase modules can be drag-and-dropped onto the canvas to create a neural network. Next, each module can be deployed individually to a DIANNE runtime. Once deployed, one can attach a Dataset to train and/or evaluate the neural network. In the run tab, one can connect actual sensor devices (i.e. a camera) to the neural network Input module and visualize the output.
+ The fastest way to get started with DIANNE is to use the two provided graphical user interfaces, which are available through the browser:

- ![DIANNE UI](figures/ui.png)
+ **Builder**: The Builder allows you to design, deploy, train and evaluate a (single) neural network model. During the build phase, modules can be dragged and dropped onto a canvas to create a neural network. Next, each module can be deployed individually to any connected DIANNE runtime. Once deployed, one can attach a Dataset to train and/or evaluate the neural network. In the run tab, one can connect actual sensor devices (e.g. a camera) to the neural network Input module and visualize the output!

- Besides the graphical user interface, DIANNE can also be controlled via a command line interface, or programmatically via the various APIs.

- Training
- --------
+ **Dashboard**: The Dashboard is the front-end for the Coordinator and allows you to submit jobs through the browser, see which jobs are queued, running or finished, inspect the progress or results of a job, and monitor your infrastructure.

- At the moment DIANNE only has a basic Stochastic Gradient Descent training algorithm implemented, using mean square error or negative log likelihood as optimization criterion, and with configurable momentum and regularization parameters. In the future also other training strategies will be supported. The training procedures can also be started via a JSON RPC interface.
+ ![DIANNE UI](figures/ui.png)

- Reinforcement Learning
- ----------------------
+ Besides the graphical user interfaces, DIANNE can also be controlled via a command line interface, JSON RPC, or programmatically via the various APIs.

- DIANNE also has basic support for deep Q learning, using temporal difference learning. DIANNE comes with a basic reinforcement learning agent, as well as interfaces for implementing an environment and experience pools.
+ Upcoming
+ --------

+ DIANNE is still very much a project in development, and we are constantly adding new features, ranging from new modules to even faster native backends. We are currently working hard on adding new learning paradigms, such as backpropagation through time for recurrent models and reinforcement learning, as well as making it easier for users to define their own training procedures. Have a look at our [GitHub page](https://github.com/ibcn-cloudlet/dianne) for the latest features!
Binary file modified doc/figures/runtime.png
