# Squared Non-monotonic Probabilistic Circuits

This repository contains the official implementation of _squared non-monotonic PCs_, as well as relevant information to reproduce the experiments of our paper "Subtractive Mixture Models via Squaring: Representation and Learning".

## How to Run Experiments?

### Download the Data

Each data set should be downloaded and placed in the ```./datasets``` directory, which is the default location.

#### UCI Datasets

The continuous UCI data sets commonly used in the _normalizing flows_ literature, i.e., Power, Gas, Hepmass, MiniBoone, as well as BSDS300, can be downloaded from [zenodo](https://zenodo.org/record/1161203#.Wmtf_XVl8eN).
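
For convenience, here is a minimal sketch for fetching and extracting the data; it assumes the Zenodo record provides a single archive named ```data.tar.gz``` (the actual file name may differ):
```shell
# Download the archive from the Zenodo record (file name assumed)
wget https://zenodo.org/record/1161203/files/data.tar.gz
# Extract the data sets into the default ./datasets directory
mkdir -p ./datasets
tar -xzf data.tar.gz -C ./datasets
```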

#### Sentences Sampled from GPT2

The sentences sampled from GPT2 for our experiments on model distillation can be downloaded from []().
After downloading the archive, you need to decompress it in the ```./datasets/language/gpt2_commongen``` directory.
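
For instance, assuming the downloaded archive is a tarball named ```gpt2_commongen.tar.gz``` (a hypothetical name, since the download link above is unspecified), the following sketch decompresses it in the expected location:
```shell
# Create the expected directory and extract the archive there (archive name assumed)
mkdir -p ./datasets/language/gpt2_commongen
tar -xzf gpt2_commongen.tar.gz -C ./datasets/language/gpt2_commongen
```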

### Run the Same Hyperparameter Grid Searches

The directory ```econfigs/``` contains the configuration files of the hyperparameter grid searches we performed for all our experiments.
See the section below about running grids of experiments for details.

### Run Simple Experiments

Simple experiments can be run by executing the Python module ```scripts.experiment```.
For a complete overview of the parameters it accepts, we suggest reading its code.

For example, to run an experiment on the synthetic data set ```cosine``` with a ```MonotonicPC``` whose input layers compute splines over 32 knots, you can execute
```shell
python -m scripts.experiment --dataset cosine --model MonotonicPC --num-components 8 --splines --spline-knots 32 \
  --optimizer Adam --learning-rate 1e-3 --batch-size 128 --verbose
```
The ```--num-components``` argument specifies the number of components of each sum unit in the tensorized circuit architecture that is built.

Note that the ```--verbose``` flag enables terminal logging (e.g., to show the loss).
All models are learned by minimizing the negative log-likelihood on the training data with gradient descent.

Similarly, to run an experiment with a squared non-monotonic PC -- a ```BornPC``` -- on the synthetic data set ```cosine```, you can execute
```shell
python -m scripts.experiment --dataset cosine --model BornPC --num-components 4 \
  --optimizer Adam --learning-rate 1e-3 \
  --batch-size 128 --verbose --num-samples 10000
```
The ```--num-samples``` argument specifies the number of samples to draw from the artificial distribution to construct the training split.
An additional 10%/20% of samples will be drawn to construct the validation/test split, respectively.

#### Logging Metrics and Models

To log metrics locally, such as the average test log-likelihood, or to observe training curves, one can use ```tensorboard```.
For ```tensorboard``` it is sufficient to specify ```--tboard-path /path/to/tboard-directory``` with an arbitrarily chosen path that will contain the Tensorboard files.
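
For instance, a plausible invocation that extends the first example above with Tensorboard logging (the path below is arbitrary) is:
```shell
python -m scripts.experiment --dataset cosine --model MonotonicPC --num-components 8 \
  --splines --spline-knots 32 --optimizer Adam --learning-rate 1e-3 \
  --batch-size 128 --verbose --tboard-path ./tboard-runs/cosine-monotonic
# Inspect the logged metrics and training curves
tensorboard --logdir ./tboard-runs/cosine-monotonic
```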

It is also possible to save the best checkpoint of the model, which is updated only upon an improvement of the loss on the validation data.
To enable this, you can specify ```--save-checkpoint``` and ```--checkpoint-path /path/to/checkpoints``` with a path that will contain the model's weights in the ```.pt``` PyTorch format.
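
For example, the following sketch saves the best ```BornPC``` checkpoint of the earlier example (the checkpoint path is arbitrary):
```shell
python -m scripts.experiment --dataset cosine --model BornPC --num-components 4 \
  --optimizer Adam --learning-rate 1e-3 --batch-size 128 --verbose \
  --save-checkpoint --checkpoint-path ./checkpoints/cosine-born
```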

Additional information will also be saved in the checkpoint path, e.g., the (quantized) probability density/mass functions estimated by the models on the artificial continuous/discrete 2D data sets (named ```pdf.npy``` or ```pmf.npy```, respectively).

### Run a Grid of Experiments

To run a batch of experiments, e.g., to perform a hyperparameter grid search, you can use the ```scripts.grid``` module and specify a grid configuration JSON file.
The directory ```./econfigs``` contains some examples of such configuration files.

The fields to specify are the following (see the minimal sketch after this list):

- ```common``` contains parameters to pass to ```scripts.experiment``` that are common to every experiment of the batch.
- ```datasets``` contains the list of data sets on which each experiment instance will be executed.
- ```grid.common``` contains a grid of hyperparameters. Each hyperparameter is a pair ```"name": value```, where the value can be either a single value or a list of values. A Cartesian product over the lists is computed so as to retrieve all possible hyperparameter configurations in the grid.
- ```grid.models``` contains additional hyperparameter configurations that are specific to some data set or model. Each entry in ```grid.models``` maps a single data set or a list of data sets to a set of maps from model names to hyperparameter configurations. The semantics is that the hyperparameters specified in ```grid.models``` overwrite the ones in ```grid.common``` for the given combinations of data sets and models.
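
A minimal configuration sketch illustrating this structure is shown below; the field names mirror the configuration files shipped in ```./econfigs```, while the values are purely illustrative:
```json
{
  "common": { "num-epochs": 100, "device": "cpu" },
  "datasets": ["cosine", "banana"],
  "grid": {
    "common": { "batch-size": 256, "learning-rate": [1e-3, 5e-3] },
    "models": {
      "cosine|banana": {
        "MonotonicPC": { "init-method": "uniform" }
      }
    }
  }
}
```
Here, both learning rates are tried for every combination of data set and model, while the ```init-method``` of ```MonotonicPC``` is fixed for both ```cosine``` and ```banana```.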

To run a batch of experiments, you can execute
```shell
python -m scripts.grid path/to/config.json
```
You can also use the ```--dry-run``` flag to just print the list of generated commands, without running them.
This is particularly useful in combination with job schedulers on clusters, e.g., Slurm.

Additionally, one can specify the number of experiments to dispatch in parallel with ```--num-jobs k```, where ```k``` is the maximum number of experiments that are "alive" at any time (by default, only one experiment is run at a time).

Instead of running parallel jobs on the same device, you can also specify multiple devices over which the experiments will be dispatched.
This can be done with ```--multi-devices```.
For instance, ```--multi-devices cuda:0 cuda:2 cpu``` will dispatch three experiments at a time, on the devices ```cuda:0```, ```cuda:2``` and ```cpu```, respectively.

Finally, you can specify independent repetitions of each experiment in the batch, which will append a different ```--seed``` argument to each experiment command to launch.
This can be done with, for instance, ```--num-repetitions 5```.
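
For instance, the following invocations illustrate these flags one at a time (the configuration path is a placeholder):
```shell
# Print the generated commands without running them
python -m scripts.grid path/to/config.json --dry-run
# Run up to four experiments in parallel
python -m scripts.grid path/to/config.json --num-jobs 4
# Repeat each experiment five times, each with a different seed
python -m scripts.grid path/to/config.json --num-repetitions 5
```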

Disclaimer: in the case of repeated runs, the saved checkpoints are not reliable, as they can be overwritten by subsequent repetitions.

#### Run a Grid of Experiments (on Slurm)

To run a grid of experiments on a Slurm cluster, we first need to configure some constants in the ```slurm/launch.sh``` utility script, such as the Slurm partition to use, the maximum number of parallel jobs, the required resources, and the path to a directory that is local to the nodes, where model checkpoints and Tensorboard logs will be saved.

Then, we need to generate the commands to dispatch and save them to a text file.
For this purpose, it is possible to use the ```scripts.grid``` module (see above) with the ```--dry-run``` argument.
For instance, to generate the commands for the experiments on the UCI data sets, it suffices to run
```shell
python -m scripts.grid econfigs/uci-data-splines.json --dry-run > exps-uci-data-splines.txt
```
Finally, the Bash script ```slurm/launch.sh``` will automatically dispatch an array of Slurm jobs that execute them.
```shell
EXPS_ID=uci-data bash slurm/launch.sh exps-uci-data-splines.txt
```
The Slurm jobs should now appear in the queue, which can be viewed by running ```squeue```.
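
For instance, to list only your own jobs in the queue, you can run:
```shell
squeue -u $USER
```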

{
  "common": {
    "tboard-path": "tboard-runs/artificial-continuous",
    "checkpoint-path": "checkpoints/artificial-continuous",
    "save-checkpoint": true,
    "num-epochs": 1200,
    "device": "cpu",
    "num-workers": 2,
    "patience-threshold": 1e-3,
    "early-stop-patience": 50
  },
  "datasets": ["mring", "cosine", "funnel", "banana"],
  "grid": {
    "common": {
      "num-components": [4, 8, 12],
      "optimizer": "Adam",
      "compute-layer": "cp",
      "batch-size": 256,
      "splines": true,
      "spline-order": 2,
      "spline-knots": 32,
      "verbose": true,
      "standardize": true,
      "init-scale": 1.0,
      "learning-rate": 1e-3
    },
    "models": {
      "banana|cosine|mring|funnel": {
        "MonotonicPC": {
          "init-method": "uniform"
        },
        "BornPC": {
          "non-monotonic": {
            "init-method": "uniform",
            "exp-reparam": false
          },
          "monotonic": {
            "init-method": "uniform",
            "exp-reparam": true
          }
        }
      }
    }
  }
}

{
  "common": {
    "tboard-path": "tboard-runs/artificial-discrete-binomials",
    "checkpoint-path": "checkpoints/artificial-discrete-binomials",
    "save-checkpoint": true,
    "num-epochs": 1200,
    "device": "cpu",
    "num-workers": 2,
    "patience-threshold": 1e-3,
    "early-stop-patience": 50
  },
  "datasets": ["mring", "banana", "funnel", "cosine"],
  "grid": {
    "common": {
      "num-components": [32, 64, 96],
      "optimizer": "Adam",
      "compute-layer": "cp",
      "batch-size": 256,
      "binomials": true,
      "verbose": true,
      "standardize": true,
      "init-scale": 1.0,
      "learning-rate": 1e-3,
      "discretize": true,
      "discretize-bins": 32
    },
    "models": {
      "banana|cosine|mring|funnel": {
        "MonotonicPC": {
          "init-method": "uniform"
        },
        "BornPC": {
          "non-monotonic": {
            "init-method": "uniform",
            "exp-reparam": false
          },
          "monotonic": {
            "init-method": "uniform",
            "exp-reparam": true
          }
        }
      }
    }
  }
}

{
  "common": {
    "tboard-path": "tboard-runs/artificial-discrete-categoricals",
    "checkpoint-path": "checkpoints/artificial-discrete-categoricals",
    "save-checkpoint": true,
    "num-epochs": 1200,
    "device": "cpu",
    "num-workers": 2,
    "patience-threshold": 1e-3,
    "early-stop-patience": 50
  },
  "datasets": ["mring", "banana", "funnel", "cosine"],
  "grid": {
    "common": {
      "num-components": [4, 8, 12],
      "optimizer": "Adam",
      "compute-layer": "cp",
      "batch-size": 256,
      "binomials": false,
      "verbose": true,
      "standardize": true,
      "init-scale": 1.0,
      "learning-rate": 1e-3,
      "discretize": true,
      "discretize-bins": 32
    },
    "models": {
      "banana|cosine|mring|funnel": {
        "MonotonicPC": {
          "init-method": "uniform"
        },
        "BornPC": {
          "non-monotonic": {
            "init-method": "uniform",
            "exp-reparam": false
          },
          "monotonic": {
            "init-method": "uniform",
            "exp-reparam": true
          }
        }
      }
    }
  }
}

{
  "common": {
    "tboard-path": "tboard-runs/gaussian-ring",
    "checkpoint-path": "checkpoints/gaussian-ring",
    "save-checkpoint": true,
    "num-epochs": 1200,
    "device": "cpu",
    "num-workers": 0,
    "early-stop-patience": 100
  },
  "datasets": ["ring"],
  "grid": {
    "common": {
      "optimizer": "Adam",
      "batch-size": 64,
      "learning-rate": 5e-3,
      "verbose": true,
      "standardize": true,
      "init-scale": 1.0
    },
    "models": {
      "ring": {
        "MonotonicPC": { "num-components": [2, 16], "init-method": "normal" },
        "BornPC": { "num-components": 2, "init-method": "normal" }
      }
    }
  }
}

{
  "common": {
    "tboard-path": "tboard-runs/gpt2-commongen",
    "checkpoint-path": "checkpoints/gpt2-commongen",
    "save-checkpoint": true,
    "num-epochs": 200,
    "device": "cpu",
    "num-workers": 2,
    "early-stop-patience": 3
  },
  "datasets": ["gpt2_commongen"],
  "grid": {
    "common": {
      "num-components": [32, 64, 128, 256, 512, 1024],
      "optimizer": "Adam",
      "region-graph": "linear-vtree",
      "batch-size": 4096,
      "verbose": true,
      "init-scale": 1.0,
      "dtype": "float32"
    },
    "models": {
      "gpt2_commongen": {
        "MonotonicPC": { "init-method": ["log-normal", "uniform", "dirichlet"], "learning-rate": [5e-3, 1e-2, 5e-2] },
        "BornPC": { "init-method": ["normal", "positive-skewed-normal", "uniform"], "learning-rate": [5e-3, 1e-2, 5e-2] }
      }
    }
  }
}

{
  "common": {
    "tboard-path": "tboard-runs/image-binomials",
    "checkpoint-path": "checkpoints/image-binomials",
    "save-checkpoint": false,
    "num-epochs": 250,
    "device": "cpu",
    "num-workers": 2,
    "patience-threshold": 1e-3,
    "early-stop-patience": 3
  },
  "datasets": ["MNIST"],
  "grid": {
    "common": {
      "num-components": [64, 128, 256],
      "optimizer": "Adam",
      "compute-layer": "cp",
      "batch-size": 256,
      "binomials": true,
      "verbose": true,
      "init-scale": 1.0,
      "learning-rate": [5e-3, 1e-2, 5e-2],
      "region-graph": "quad-tree",
      "input-mixture": true,
      "dtype": "float32"
    },
    "models": {
      "MNIST": {
        "MonotonicPC": {
          "init-method": "uniform"
        },
        "BornPC": {
          "non-monotonic": {
            "init-method": "uniform",
            "exp-reparam": false
          }
        }
      }
    }
  }
}