Releases: mad-lab-fau/tpcp

v0.25.1 - Fixed Documentation-tests-mixin

25 Oct 09:20

[0.25.1] - 2023-10-25

Fixed

  • Names listed as ignored in the testing mixin are now correctly ignored in both directions.
    I.e., the mixin now allows documenting additional parameters, not just leaving parameters out.
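
A minimal sketch of what "ignored in both directions" can mean for a docstring check. This is not the code of the tpcp testing mixin; the helper check_documented_params and the example function run are hypothetical and only use the standard library.

```python
import inspect


def check_documented_params(func, documented, ignored=()):
    """Compare documented parameter names with the signature, skipping ignored names."""
    ignored = set(ignored)
    actual = set(inspect.signature(func).parameters) - ignored
    documented = set(documented) - ignored
    # First list: parameters missing from the docs.
    # Second list: documented names without a matching parameter.
    return sorted(actual - documented), sorted(documented - actual)


def run(data, threshold=0.5, verbose=False):
    return data


# "verbose" is deliberately left undocumented, "extra" is documented in addition.
# Ignoring both names makes the check pass in both directions.
undocumented, superfluous = check_documented_params(
    run, documented=["data", "threshold", "extra"], ignored=["verbose", "extra"]
)
```

Without the ignore list, the same call would report "verbose" as undocumented and "extra" as superfluous.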

v0.25 - End of Py3.8 and new validate method

24 Oct 16:24

[0.25.0] - 2023-10-24

Added

  • The Scorer class now has the ability to score datapoints in parallel.
    This can be enabled by setting the n_jobs parameter of the Scorer class to something larger than 1.
    (#95)
  • The PyTestSnapshotTest class now supports comparing dataframes with datetime columns.
    (#97)
  • The validate function was introduced to enable validation of an algorithm on arbitrary data without parameter
    optimization.
    (#99)
  • Fixed a bug that caused the functions optimize and cross_validate to crash when progress_bar was deactivated.
  • New example about caching.
    (#98)
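
To illustrate the idea behind scoring datapoints in parallel, here is a concept sketch that only uses the standard library. It is not the Scorer implementation and does not use the tpcp API; score_datapoint and score_all are hypothetical names, and a thread pool stands in for whatever backend tpcp uses internally.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def score_datapoint(datapoint):
    # Hypothetical per-datapoint score: fraction of positive samples.
    return sum(1 for x in datapoint if x > 0) / len(datapoint)


def score_all(datapoints, n_jobs=1):
    """Score every datapoint and return (aggregated score, single scores)."""
    if n_jobs > 1:
        with ThreadPoolExecutor(max_workers=n_jobs) as pool:
            # map preserves the input order of the datapoints.
            single = list(pool.map(score_datapoint, datapoints))
    else:
        single = [score_datapoint(d) for d in datapoints]
    return mean(single), single


datapoints = [[1, -1, 2, 3], [-1, -2, 5, 6], [1, 1, 1, 1]]
agg_serial, single_serial = score_all(datapoints, n_jobs=1)
agg_parallel, single_parallel = score_all(datapoints, n_jobs=2)
```

The serial and the parallel path return the same per-datapoint scores in the same order, which is the property a parallel scorer has to keep.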

Changed

  • In line with numpy and some other packages, we dropped support for Python 3.8.

v0.24.0 - Dataset Improvements

08 Sep 10:55

[0.24.0] - 2023-09-08

For all changes in this release see: #85

Deprecated

  • The properties group and groups of the Dataset class are deprecated and will be removed in a future
    release.
    They are replaced by the group_label and group_labels properties of the Dataset class.
    This renaming was done to make it more clear that these properties return the labels of the groups and not the
    groups themselves.
  • The create_group_labels method of the Dataset class is deprecated and will be removed in a future release.
    It is replaced by the create_string_group_labels method of the Dataset class.
    This renaming was done to avoid confusion with the new names for groups and group.

Added

  • Added index_as_tuples method to the Dataset class.
    It returns the full index of the dataset as a list of named tuples regardless of the current grouping.
    This can be helpful for extracting the label information of a datapoint when using group would require handling
    multiple cases, because your code expects the dataset in different grouped versions.

Changed

  • BREAKING CHANGE (with Deprecation): The group property of the Dataset class is now called group_label.
  • BREAKING CHANGE: The group_label property now always returns named tuples of strings
    (even for single groups where it used to return strings!).
  • BREAKING CHANGE (with Deprecation): The groups property of the Dataset class is now called group_labels.
  • BREAKING CHANGE: The group_labels property always returns a list of named tuples of strings
    (even for single groups where it used to return a list of strings!).
  • BREAKING CHANGE: The parameter groups of the get_subset method of the Dataset class is now called
    group_labels and always expects a list of named tuples of strings.
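
The practical effect of the "always named tuples" change can be sketched with the standard library alone. The GroupLabel type and its participant field are hypothetical; in tpcp the field names come from the dataset's own index columns.

```python
from collections import namedtuple

# Hypothetical single-level grouping over a "participant" column.
GroupLabel = namedtuple("GroupLabel", ["participant"])

# Before 0.24.0, a single-level grouping yielded plain strings.
old_style_labels = ["p1", "p2"]
# From 0.24.0 on, every label is a named tuple, even for a single level.
new_style_labels = [GroupLabel(p) for p in old_style_labels]

first = new_style_labels[0]
equals_plain_string = first == "p1"      # comparisons against plain strings now fail
equals_tuple = first == ("p1",)          # comparisons against tuples work
value_by_name = first.participant        # access by field name
value_by_position = first[0]             # access by position
```

Code that compared a label with a plain string, or passed plain strings as a subset selection, therefore needs to wrap the values in tuples or access the field by name.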

v0.23.0 - Testing Utils

30 Aug 08:58

[0.23.0] - 2023-08-30

Added

  • We migrated some testing utilities from other libraries to tpcp and exposed some algorithm test helpers
    that previously only existed in the tests folder via the actual tpcp API.
    This should make testing algorithms and pipelines developed with tpcp easier.
    These new features are now available in the tpcp.testing module.
    (#89)

v0.22.1 - Fixed `safe_optimize` for GridSearchCV

30 Aug 08:03

[0.22.1] - 2023-08-30

Fixed

  • The safe_optimize parameter of GridSearchCV is now correctly used during reoptimization.
    Before, it was only forwarded to the Optimize wrapper during the actual Grid-Search, but not during the final
    reoptimization.

v0.22.0 - Tensorflow support

25 Aug 17:16

[0.22.0] - 2023-08-25

Added

  • Official support for tensorflow/keras. The custom hash function now manages tensorflow models explicitly.
    This makes it possible again to use the make_action_safe and make_optimize_safe decorators with algorithms and
    pipelines that have tensorflow/keras models as parameters.
    (#87)
  • Added a new example for tensorflow/keras models.
    (#87)

v0.20.1: Fix cross-validation regression

26 Jul 08:36

[0.20.1] - 2023-07-25

Fixed

  • Fixed a regression introduced in 0.19.0, which resulted in optimizers not being correctly cloned per fold.
    As a result, each CV fold would overwrite the optimizer object of the previous fold.
    This did not affect the reported results, but the returned optimizer object was not the one that was used to calculate
    the results.
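
The failure mode described here can be reproduced in a few lines. This is a simplified sketch, not the tpcp cross-validation code: DummyOptimizer is a hypothetical stand-in, and copy.deepcopy stands in for the cloning that tpcp performs.

```python
import copy


class DummyOptimizer:
    """Hypothetical stand-in for an optimizer that stores its result on itself."""

    def __init__(self):
        self.best_value_ = None

    def optimize(self, train_data):
        self.best_value_ = max(train_data)
        return self


folds = [[1, 2, 3], [10, 20, 30]]
optimizer = DummyOptimizer()

# Without cloning, every fold returns the same object,
# so the second fold overwrites the result of the first.
shared = [optimizer.optimize(fold) for fold in folds]

# With a fresh copy per fold, each returned optimizer keeps its own result.
cloned = [copy.deepcopy(optimizer).optimize(fold) for fold in folds]
```

In the shared case both list entries point to one object holding the result of the last fold; in the cloned case the first entry still holds the result of the first fold.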

v0.20.0 - BREAKING CHANGE: Fix optuna multiprocessing

24 Jul 13:04

[0.20.0] - 2023-07-24

Changed

  • BREAKING CHANGE: The way all Optuna-based optimizers work has changed.
    Instead of passing a function that returns a study, you now need to pass a function that returns the parameters of a
    study.
    Creating the study is now handled by tpcp internally to avoid issues with multiprocessing.
    This results in two changes.
    The parameter name for all optuna pipelines has changed from create_study to get_study_params.
    Further, the expected call signature changed, as get_study_params now gets a seed as argument.
    This seed should be used to initialize the random number generator of the sampler and pruner of a study to ensure
    that each process gets a different seed and sampling process.
    (#80)

    To migrate your code, you need to change the following:

    OLD:

    def create_study():
        return optuna.create_study(sampler=RandomSampler(seed=42))
    
    OptunaSearch(..., create_study=create_study, ...)

    NEW:

    def get_study_params(seed: int):
        return dict(sampler=RandomSampler(seed=seed))
    
    OptunaSearch(..., get_study_params=get_study_params, random_seed=42, ...)
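
    Why the seed is passed in rather than hard-coded can be sketched without Optuna. The example below is an illustration under assumptions: random.Random stands in for an Optuna sampler, and derive_worker_seeds is a hypothetical scheme, since the notes above do not state how tpcp derives the per-process seeds from random_seed.

```python
import random


def get_study_params(seed: int):
    # Stand-in for dict(sampler=RandomSampler(seed=seed));
    # uses the stdlib RNG instead of an Optuna sampler.
    return {"sampler": random.Random(seed)}


def derive_worker_seeds(random_seed: int, n_workers: int):
    # Hypothetical scheme: one distinct, reproducible seed per worker.
    rng = random.Random(random_seed)
    return rng.sample(range(2**31), n_workers)


seeds = derive_worker_seeds(42, n_workers=3)
# Each worker builds its own sampler from its own seed.
samplers = [get_study_params(s)["sampler"] for s in seeds]
first_draws = [s.random() for s in samplers]
```

    With a hard-coded seed, as in the OLD version, every process would create an identical sampler and propose the same parameters. Deriving the seeds from a single random_seed keeps the run reproducible while giving each process its own sampling sequence.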

v0.19.0 - Joblib Fixes and better errors

06 Jul 10:49

[0.19.0] - 2023-07-06

Added

  • All optimization methods that do complicated loops (over parameters or CV-Folds) now raise new custom error messages
    (OptimizationError and TestError) if they encounter an error.
    These new errors include information about the iteration of the loop in which the error occurred and should make
    it easier to debug issues.
  • When a scorer fails, we now print the name (i.e. the group) of the datapoint that caused the error.
    This should make it easier to debug issues with the scorer.
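
The general pattern behind such errors is to catch the original exception inside the loop and re-raise it with the iteration attached. The sketch below shows that pattern with the standard library only; FoldError and run_folds are hypothetical names and not part of tpcp, whose own error classes are OptimizationError and TestError.

```python
class FoldError(Exception):
    """Hypothetical error type that carries the failing iteration in its message."""


def run_folds(folds, func):
    results = []
    for i, fold in enumerate(folds):
        try:
            results.append(func(fold))
        except Exception as e:
            # Chain the original exception so its traceback is preserved.
            raise FoldError(f"Error in fold {i}: {e}") from e
    return results


message = None
cause = None
try:
    # The empty second fold triggers a ZeroDivisionError.
    run_folds([[1, 2], [], [3]], lambda fold: sum(fold) / len(fold))
except FoldError as err:
    message = str(err)
    cause = err.__cause__
```

The message names the failing fold, and the original exception stays available via __cause__.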

Changed

  • We dropped support for joblib<0.13.0 due to some changes in the API. We only support the new API now, which allowed
    us to simplify some of the multiprocessing code.

v0.18.0 - Some more validation

13 Apr 16:30

[0.18.0] - 2023-04-13

Fixed

  • When super().__init__() is called before all parameters of the child class are initialized, we don't get an error
    anymore.
    Now all classes remember their parameters when they are defined and don't try to access parameters that are not
    defined in their own init.
    (#69)

Changed

  • Validation is now performed recursively on all subclasses. Note that, as before, validation is still only performed
    once per class.
    But with this change, we can also validate base classes that are not used directly.
    (#70)

Added

  • We now validate whether a child class implements all the parameters of its parent class.
    While implementing them is not strictly necessary, omitting them is a sign of bad design.
    It could also lead to issues with tpcp's validation logic.
    (#70)
  • It is now possible to hook into the validation and perform custom validation of classes.
    (#70)
  • The Dataset class now actively triggers validation and checks if the dataset subclass implements groupby_cols and
    subset_index.
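
A check of this kind can be sketched with inspect.signature. This is an illustration of the idea only, not tpcp's validation logic; missing_parent_params and the example classes are hypothetical.

```python
import inspect


def missing_parent_params(child, parent):
    """Return the parent __init__ parameters that the child __init__ does not expose."""

    def names(cls):
        params = inspect.signature(cls.__init__).parameters
        return {n for n in params if n != "self"}

    return sorted(names(parent) - names(child))


class Base:
    def __init__(self, window=10):
        self.window = window


class GoodChild(Base):
    # Exposes the parent parameter "window" in its own signature.
    def __init__(self, window=10, threshold=0.5):
        self.threshold = threshold
        super().__init__(window=window)


class BadChild(Base):
    # Hides "window", so it can no longer be set from the outside.
    def __init__(self, threshold=0.5):
        self.threshold = threshold
        super().__init__()


good_missing = missing_parent_params(GoodChild, Base)
bad_missing = missing_parent_params(BadChild, Base)
```

GoodChild passes the check, while BadChild is reported as missing "window".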