Skip to content

Policy tree optimization: Heuristic policy search for control of dynamic systems. Uses genetic programming to develop binary trees relating observed indicator variables to actions, either real-valued or discrete.

License

Notifications You must be signed in to change notification settings

dumontgoulart/ptreeopt

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

86 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Policy Tree Optimization

Heuristic policy search for control of dynamic systems. Uses genetic programming to develop binary trees relating observed indicator variables to actions, either real-valued or discrete. A simulation model serves as the objective function.

Requirements: NumPy, PyGraphviz (optional). The example model also uses pandas and Matplotlib but these are not strictly required.

Tested with Python 2.7 and 3.x. Still in active development, especially the multiobjective optimization and documentation. Contributions and bug reports welcome!

Citation: Link to paper

Herman, J.D. and Giuliani, M. Policy tree optimization for threshold-based water resources 
management over multiple timescales, Environmental Modelling and Software, 99, 39-51, 2018.

The full set of experiments and data are available in the paper branch. Note the API has since changed in the master branch.

Quick Start

This example develops a control policy based on a simulation model of Folsom Reservoir.

import numpy as np
from folsom import Folsom
from ptreeopt import PTreeOpt
import logging

np.random.seed(17) # to be able to replicate the results

model = Folsom('folsom/data/folsom-daily-w2016.csv', 
                sd='1995-10-01', ed='2016-09-30', use_tocs = False)

algorithm = PTreeOpt(model.f, 
                    feature_bounds = [[0,1000], [1,365], [0,300]],
                    feature_names = ['Storage', 'Day', 'Inflow'],
                    discrete_actions = True,
                    action_names = ['Release_Demand', 'Hedge_90', 'Hedge_80', 
                    'Hedge_70', 'Hedge_60', 'Hedge_50', 'Flood_Control'],
                    mu = 20, # number of parents per generation
                    cx_prob = 0.70, # crossover probability
                    population_size = 100,
                    max_depth = 7
                    )

logging.basicConfig(level=logging.INFO,format='[%(processName)s/%(levelname)s:%(filename)s:%(funcName)s] %(message)s')

# With only 1000 function evaluations this will not be very good
best_solution, best_score, snapshots = algorithm.run(max_nfe=1000, 
                                                 log_frequency=100,
                                                 snapshot_frequency=100)

The logging module will print to the console every log_frequency function evaluations. snapshots is a dictionary containing keys 'nfe', 'time', 'best_f', 'best_P' (number of function evaluations, elapsed time, best objective function value, and best policy tree). Each key points to a list of length max_nfe/snapshot_frequency. The snapshots are used for convergence information, and are typically saved to a file for later analysis:

import pickle
pickle.dump(snapshots, open('snapshots.pkl', 'wb'))

model.f is a simulation model that will be evaluated many times. The simulation model must be set up to receive a policy and return an objective function value:

def f(P):
  # ...
  for t in range(T):
    # observe indicators x1,x2,x3
    action = P.evaluate([x1,x2,x3]) # returns a string from `action_names`

    if action == 'something':
      # ...
    if action == 'something_else':
      # ...
  
  return objective # assumes minimization

The Folsom simulation model (link) gives a more detailed example.

Functions for plotting results are described in a separate readme.

Parallelization

The example above runs in serial. Function evaluations can also be parallelized using either multiprocessing or mpi4py:

from ptreeopt import MultiprocessingExecutor

with MultiprocessingExecutor(processes=4) as executor:
    best_solution, best_score, snapshots = algorithm.run(max_nfe=1000, 
                                                 log_frequency=100,
                                                 snapshot_frequency=100,
                                                 executor=executor)
from ptreeopt import MPIExecutor

with MPIExecutor() as executor:
    best_solution, best_score, snapshots = algorithm.run(max_nfe=1000,
                                                 log_frequency=100,
                                                 snapshot_frequency=100,
                                                 executor=executor)

Then run mpirun -n 4 python -m mpi4py.futures mpi_example.py on the command line or in a cluster job script. See the examples for more details. This is generational parallelization, meaning that the number of processors should not exceed the population size.

License

Copyright (C) 2017-19 Contributors. Released under the MIT license.

About

Policy tree optimization: Heuristic policy search for control of dynamic systems. Uses genetic programming to develop binary trees relating observed indicator variables to actions, either real-valued or discrete.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 97.6%
  • Shell 2.4%