diff --git a/.github/workflows/codecov-CI.yml b/.github/workflows/codecov-CI.yml index ef2ef259..c20c96c5 100644 --- a/.github/workflows/codecov-CI.yml +++ b/.github/workflows/codecov-CI.yml @@ -27,7 +27,7 @@ jobs: mamba install --quiet --yes --file requirements.txt coverage pytest-cov && python -m coverage run -m pytest --cov=./ --cov-report=xml - name: Upload Coverage to Codecov - uses: codecov/codecov-action@v3 + uses: codecov/codecov-action@v4 with: token: ${{ secrets.CODECOV_TOKEN }} fail_ci_if_error: true diff --git a/.zenodo.json b/.zenodo.json index 8d6ff6c8..65805358 100644 --- a/.zenodo.json +++ b/.zenodo.json @@ -23,7 +23,7 @@ { "name": "Kukulies, Julia", - "affiliation": "University of Gothenburg (Sweden)", + "affiliation": "NSF National Center for Atmospheric Research", "orcid": "0000-0001-6084-0069" }, { diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 6664863a..98aee8b8 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -2,16 +2,14 @@ __Welcome! We are very happy that you are interested in our project and thanks for taking time to contribute! :)__ -We are currently reorganizing our project. So, please check our modifications later. - ## Getting Started ### Installation & Environment details -You will find them in the [README.md](https://github.com/climate-processes/tobac/blob/master/README.md). +You will find them in the [README.md](https://github.com/tobac-project/tobac/blob/master/README.md). ### Tutorials -Tutorials have been prepared to provide you further inside to `tobac`s functionality. Please visit the separate [tobac-tutorials](https://github.com/climate-processes/tobac-tutorials) repository here on Github. - +Tutorials have been prepared to provide you further insight into `tobac`'s functionality. Please have a look in the +[examples folder](https://github.com/tobac-project/tobac/tree/main/examples). ### Documentation You will find our documentation at [https://tobac.readthedocs.io](https://tobac.readthedocs.io). 
@@ -19,56 +17,33 @@ You will find our documentation at [https://tobac.readthedocs.io](https://tobac. ### Testing The tests are located in the [tests folder](https://github.com/climate-processes/tobac/tree/master/tobac/tests). - ## Reporting Bugs -Please create a new issue on [GitHub](https://github.com/climate-processes/tobac/issues) if it is not listed there, yet. +Please create a new issue on [GitHub](https://github.com/tobac-project/tobac/issues) if it is not listed there, yet. ### How to write a good Bug Report? * Give it a clear descriptive title. * Copy and paste the error message. * Describe the steps for reproducing the problem and give an specific example. -* Optional: Make a suggestion to fix it. - +* Optional: Make a suggestion to fix it. ## How to Submit Changes -* Please read the [README.md](https://github.com/climate-processes/tobac/blob/master/README.md) first, to learn about our project goals and check the [changelog.md](). -* Before you start a pull request, please make sure that you added [numpydoc docstrings](#docstringExample) to your functions. This way the api documentation will be parsed properly. -* If it is a larger change or an newly added feature or workflow, please place an example of use in the [tobac-tutorials](https://github.com/climate-processes/tobac-tutorials) repository or adapt the existing examples there. -* If necessary add a folder or modify a file. +* Have a look at [our roadmap](https://github.com/tobac-project/tobac-roadmap/blob/master/tobac-roadmap-main.md) first, +to learn about our project goals and check the +[changelog.md](https://github.com/tobac-project/tobac/blob/main/CHANGELOG.md). +* More details on the code structure and further help for code contributions can be found in our [developer +guide](https://tobac.readthedocs.io/code_structure.html) +* Before you start a pull request, please make sure that you added [numpydoc +docstrings](https://numpydoc.readthedocs.io/en/latest/format.html) to your +functions. 
See [docstring example in the developer guide](https://tobac.readthedocs.io/contributing.html). This way the +API documentation will be parsed properly. +* If it is a larger change or a newly added feature or workflow, please add an example in the [example +folder](https://github.com/tobac-project/tobac/tree/main/examples) or adapt the existing examples there. * The code should be PEP 8 compliant, as this facilitates our collaboration. Please use the first stable version (22.6.0) of [black](https://black.readthedocs.io/en/stable/) to format your code. When you submit a pull request, all files are checked for formatting. * The tobac repository is set up with pre-commit hooks to automatically format your code when commiting changes. Please run the command "pre-commit install" in the root directory of tobac to set up pre-commit formatting. -We aim to respond to all new issues/pull requests as soon as possible, however at times this is not possible due to work commitments. - -### Numpydoc Example -```python - - ''' - calculate centre of gravity and mass forech individual tracked cell in the simulation - - - Parameters - ---------- - tracks : pandas.DataFram - DataFrame containing trajectories of cell centres - - param mass : iris.cube.Cube - cube of quantity (need coordinates 'time', 'geopotential_height','projection_x_coordinate' and - 'projection_y_coordinate') - - param mask : iris.cube.Cube - cube containing mask (int > where belonging to cloud volume, 0 everywhere else ) - +We aim to respond to all new issues/pull requests as soon as possible, however sometimes this is not possible due to work commitments. 
- Returns - ------- - track_out : pandas.DataFrame - Dataframe containing t,x,y,z positions of centre of gravity and total cloud mass each tracked cells - at each timestep - - ''' -``` ## Slack In addition to the workflow here on Github, there's a tobac workspace on Slack [tobac-dev.slack.com](tobac-dev.slack.com) that we use for some additional communication around the project. Please join us there to stay updated about all things tobac that go beyond the detailed work on the code. diff --git a/doc/code_reviews.rst b/doc/code_reviews.rst new file mode 100644 index 00000000..da8c2c90 --- /dev/null +++ b/doc/code_reviews.rst @@ -0,0 +1,36 @@ +Code reviews +------------------ + +Before anything is merged into the release branch (:code:`RC_*`), we require that two reviewers accept the code changes of a pull request. + +============================ +How to do a code review +============================ + +* Check out the pull request locally (`how to check out a pull request locally `_) + +* Run tests locally + +* Go through the code and see if it is readable and easy to understand + +* Not required, but often useful: test new features with your own data + + +============================ +Tips and expectations +============================ + + +Doing a code review can be very challenging if you are unfamiliar with the process. Here is a set of documents which might provide a good resource on how to get started: + +https://github.com/google/eng-practices + + +========================= +Conventional comments +========================= + +The comments in a code review should be clear and constructive. + +A useful way of highlighting the intention of specific comments is to label them according to `conventional comments `_. 
+ diff --git a/doc/code_structure.rst b/doc/code_structure.rst new file mode 100644 index 00000000..cbc1b23e --- /dev/null +++ b/doc/code_structure.rst @@ -0,0 +1,72 @@ +Code structure and key design concepts +-------------------------------------- + +================================== +Modules +================================== + +**tobac** aims to provide a flexible and modular framework which can be seen as a toolbox to create tracking algorithms according to the user's specific research needs. + +The **tobac** package currently consists of three **main modules**: + +1. The :py:mod:`tobac.feature_detection` module contains methods to identify objects (*features*) in 2D or 3D (3D or 4D when including the time dimension) gridded data. This is done by identifying contiguous regions above or below one or multiple user-defined thresholds. The module makes use of :py:mod:`scipy.ndimage.label`, a generic image processing method that labels features in an array. The methods in :py:mod:`tobac.feature_detection` are high-level functions that enable fast and effective feature detection and create easy-to-use output in the form of a :py:mod:`pandas.DataFrame` that contains the coordinates and some basic information on each detected feature. The high-level method most commonly used is :py:func:`tobac.feature_detection_multithreshold`. + +2. The :py:mod:`tobac.segmentation` module contains methods to define the extent of the identified feature areas or volumes. This step is needed to create a mask of the identified features because the feature detection currently only saves the center points of the features. The segmentation procedure is performed by using the watershedding method, but more methods are to be implemented in the future. Just like the feature detection, this module can handle both 2D and 3D data. + +3. The :py:mod:`tobac.tracking` module is responsible for linking identified features over time.
This module primarily makes use of the Python package :py:mod:`trackpy`. Note that the linking using :py:mod:`trackpy` is based on particle tracking principles, which means that only the feature center positions (not the entire area or volume associated with each feature) are needed to link features over time. Other methods such as tracking based on overlapping areas from the segmented features are to be implemented. + +In addition to the main modules, there are three **postprocessing modules**: + +4. The :py:mod:`tobac.merge_split` module provides functionality to identify mergers and splitters in the tracking output and to add labels such that one can reconstruct the parent and child tracks of each cell. + +5. The :py:mod:`tobac.analysis` module contains methods to analyze the tracking output and derive statistics about individual tracks as well as summary statistics of the entire populations of tracks or subsets of the latter. + +6. The :py:mod:`tobac.plotting` module provides methods to visualize the tracking output, for example for creating maps and animations of identified features, segmented areas and tracks. + + +Finally, there are two modules that are primarily **important for developers**: + +7. The :py:mod:`tobac.utils` module is a collection of smaller, not necessarily tracking-specific methods that facilitate and support the methods of the main modules. This module has multiple submodules. We separate methods that are rather generic and could also be practical for tobac users who build their own tracking algorithms (:py:mod:`tobac.utils.general`) and methods that mainly facilitate the development of **tobac** (:py:mod:`tobac.utils.internal`). Sometimes, new features come with the need of a whole set of new methods, so it could make sense to save these in their own submodule (see e.g. :py:mod:`tobac.periodic_boundaries`). + +8. The :py:mod:`tobac.testing` module provides support for writing unit tests.
This module contains several methods to create simplified test data sets on which the various methods and parameters for feature detection, segmentation, and tracking can be tested. + +For more information on each submodule, refer to the respective source code documentation. + +One thing to note is that **tobac** as of now is purely functional. The plan is, however, to move towards a more object-oriented design with base classes for the main operations such as feature detection and tracking. + + +======== +Examples +======== + +To help users get started with **tobac** and to demonstrate the various functionalities, **tobac** hosts several detailed and **illustrated examples** in the form of Jupyter notebooks. They are hosted under the directory `examples/` and can be executed by the user. Our readthedocs page also hosts a rendered version of our examples as a `gallery `_. + + +============================ +Migrating to xarray and dask +============================ + +Currently, **tobac** uses `iris cubes `_ as the +primary data container. However, we are working on migrating the source code to +`xarray `_ such that all internal functions are based on `xr.DataArray +objects `_. + +To ensure a robust transition from **iris** to **xarray**, we make use of various decorators that convert input and +output data for the main functions without changing their actual code. These decorators are located in the `decorator +submodule `_. + +In addition, one of our main goals for the future is to fully support `dask `_, in order to scale +to large datasets and enable parallelization.
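The decorator idea described above can be sketched roughly as follows. This is not the code from tobac's decorator submodule: ``LegacyField``, ``NewField``, ``accepts_legacy`` and ``threshold_mask`` are hypothetical names, and the two small dataclasses only stand in for an iris cube and an ``xr.DataArray`` so that the sketch runs without either library.

```python
import functools
from dataclasses import dataclass


@dataclass
class LegacyField:
    """Hypothetical stand-in for the old container (think: iris cube)."""

    values: list
    name: str


@dataclass
class NewField:
    """Hypothetical stand-in for the new container (think: xr.DataArray)."""

    data: list
    name: str


def to_new(field):
    """Convert the legacy container to the new one."""
    return NewField(data=list(field.values), name=field.name)


def to_legacy(field):
    """Convert the new container back to the legacy one."""
    return LegacyField(values=list(field.data), name=field.name)


def accepts_legacy(func):
    """Let a function written for NewField also accept and return LegacyField."""

    @functools.wraps(func)
    def wrapper(field, *args, **kwargs):
        was_legacy = isinstance(field, LegacyField)
        if was_legacy:
            field = to_new(field)  # convert the input on the way in
        result = func(field, *args, **kwargs)
        if was_legacy and isinstance(result, NewField):
            result = to_legacy(result)  # convert the output on the way out
        return result

    return wrapper


@accepts_legacy
def threshold_mask(field, threshold):
    """Internal function that only knows about NewField."""
    mask = [value > threshold for value in field.data]
    return NewField(data=mask, name=field.name + "_mask")
```

With this pattern the body of ``threshold_mask`` never has to change: callers that still pass the legacy container get the legacy container back, while callers that already use the new container are passed straight through.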
+ + + + + + + + + + + + + diff --git a/doc/conf.py b/doc/conf.py index 801eb65b..02a7c5b6 100644 --- a/doc/conf.py +++ b/doc/conf.py @@ -29,6 +29,7 @@ html_static_path = ["_static"] exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"] + project = "tobac" master_doc = "index" diff --git a/doc/contributing.rst b/doc/contributing.rst new file mode 100644 index 00000000..c458a69d --- /dev/null +++ b/doc/contributing.rst @@ -0,0 +1,168 @@ +.. + How to contribute to the tobac project + +How to contribute +------------------------- + +Step-by-step overview of the most important points: https://github.com/tobac-project/tobac/blob/main/CONTRIBUTING.md + +========================= +Code of conduct +========================= + +We are a multi-institutional and international community that aims to maintain and increase our diversity. We acknowledge that we all come with different experiences and capacities. Therefore, we strive to foster an inclusive and respectful environment where we help and support each other. We welcome any type of contribution and believe that together we can create accessible, reusable, and maintainable code that empowers researchers and enables groundbreaking science. + +We would like to refer to the `Python code of conduct `_ as we follow the same principles for communication and working with each other! + +========================= +git basics +========================= + +* **Create a Github account**: The first thing you need to do is to `create a GitHub account `_ if you do not already have one. + +* **Get familiar with the basics of GitHub and git**: + * Getting started with the `basics `_ + * Learn about `branches `_ + * Learn about `forks `_ + * Learn about `pull requests `_ + * Learn about `how to commit and push changes from your local repository `_ + +* **Create an issue**: If you have an idea for a new feature or a suggestion for any kind of code changes, please create an issue for this.
We sort `our issues `_ into `milestones `_ to prioritize work and manage our workflow, i.e. the different versions of **tobac** to come. + + The issues act, therefore, not only as a place for reporting bugs, but also as a collection of *to do* points. + +* **Work on an issue**: You can also work on any issue that was created by somebody else and is already out there. A tip is to look for the **good first issue** label if you are a new developer. These issues are usually fairly easy to address and can be good to practice our GitHub workflow. + + +* **Create a pull request from your fork:** We use our personal forks of the tobac repository to create pull requests. This means that you have to first commit and push your local changes to your personal fork and then create a pull request from that fork: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork + +=================================== +Writing proper documentation +=================================== + +Please provide **Numpy Docstrings** for all new functions. + +**Example**: + +.. code:: + + ''' + Calculates centre of gravity and mass for each individually tracked cell in the simulation.
+ + + Parameters + ---------- + tracks : pandas.DataFrame + DataFrame containing trajectories of cell centres + + mass : iris.cube.Cube + cube of quantity (need coordinates 'time', 'geopotential_height','projection_x_coordinate' and + 'projection_y_coordinate') + + mask : iris.cube.Cube + cube containing mask (int > 0 where belonging to cloud volume, 0 everywhere else) + + + Returns + ------- + track_out : pandas.DataFrame + Dataframe containing t,x,y,z positions of centre of gravity and total cloud mass of each tracked cell + at each timestep + + ''' + + + +=================================== +Tips on working on your local code +=================================== + +* Install tobac package with :code:`pip install -e .` + * This allows you to directly test your local code changes as you run tobac. Instead of using the **tobac** version of the latest release, your local version of tobac will be used when you import **tobac** in a Python script or IDE. + * *Note that* this way of installing a local package will use the code of the checked-out branch, so this allows you also to test code while switching between branches. + +* You can locally **build the documentation page**: + * see :doc:`testing_sphinx-based_rendering` + +* Writing `meaningful commit messages `_ can be very helpful for you and people who review your code to better understand the code changes. + + +========================= +Our branching strategy +========================= + +While you can use any type of branching strategy and naming as you work in your personal fork, we have three branches in the tobac repository: + +* :code:`RC_*` +* :code:`dev_*` +* :code:`hotfix` + +:code:`RC_*` is the release candidate of the next tobac version. The asterisk stands here for the specific tobac version: RC_vx.x.x (e.g. RC_v1.5.0). Pull requests to this branch need two reviews to be accepted before it can be merged into main. + +:code:`dev_*` is the development branch where we experiment with new features.
This branch is perfectly suited to collaboratively work on a feature together with other **tobac** developers (see :doc:`mentoring`). In general, this branch is used for long-term, comprehensive code changes that might not be covered by a single pull request and where it is not yet foreseeable which future **tobac** version will include the changes. There are no branch protection rules for this branch, which means that collaborators of our GitHub organization can directly push changes to this branch. Note that :code:`dev_*` can never be merged directly into main; it has to be merged into the release candidate branch :code:`RC_*` first! There can be more than one `dev_*` branch, so we recommend describing the feature being worked on in the respective branch name (e.g. :code:`dev_xarray_transition`). + +:code:`hotfix` is the branch we use for hotfixes, i.e. bug fixes that need to be released as fast as possible because they affect people's code. This branch needs only one review before it can be merged directly into :code:`main`. + +In brief: **Unless you are collaboratively working on a comprehensive feature or on a hotfix, the branch to submit your pull request to is the next release candidate RC_vx.x.x** + + +========================= +GitHub workflow +========================= + +We use several `GitHub actions `_ to +ensure continuous integration and to enable an efficient code development and release process.
Our workflow +configuration can be found in +`.github/workflows `_ and encompasses: + +* check that code is formatted using the latest stable version of black +* linting of the latest code changes that checks the code quality and results in a score compared to the most recent released version +* check of the zenodo JSON file that ensures that the citation is correct +* check that all unit tests pass (including testing on multiple operating systems) and report test coverage +* check that the example jupyter notebooks run without problems + +========================= +Writing unit tests +========================= + +We use unit tests that ensure that the functions of each module and submodule work properly. If you add a new +functionality, you should also add a unit test. All tests are located in the `test +folder `_. The module :py:mod:`tobac.testing` may help to +create simple, idealized cases where objects can be tracked to test if the new features result in the expected outcome. + +If you are unsure how to construct tests and run tests locally, you can find additional documentation on +`pytest `_ and `pytest +fixtures `_. + +You will also notice that we report the test coverage, i.e. how much of our current code is triggered and thus tested by +the unit tests. When you submit a pull request, you will see if your code changes have increased or decreased the test +coverage. Ideally, test coverage should not decrease, so please make sure to add appropriate unit tests that cover +all newly added functions. + +========================= +Add examples +========================= + +In addition to the unit tests, we aim to provide examples on how to use all functionalities and how to choose different +tracking parameters. These `examples `_ are in the form of Jupyter +notebooks and can be based on simple, idealized test cases or real data.
We strongly encourage the use of real data that +is publicly accessible, but another option for new examples with real data is to either upload the data to our `zenodo +repository `_ or create your own data upload on zenodo. Please include the name "tobac" in the data title for the latter. + +========================= +Releasing a new version +========================= + +This is the checklist of steps for a release of a new **tobac** version: + +* Bump version in :code:`__init__.py` in :code:`RC_vXXX` +* Add changelog in :code:`RC_vXXX` +* Regenerate example notebooks with the new version +* Merge :code:`RC_vXXX` into :code:`main` +* Merge updated :code:`main` branch back into release and dev branches +* Delete :code:`RC_vXXX` branch +* Create release +* Push release to conda-forge: https://github.com/tobac-project/tobac-notes/blob/main/uploading_to_conda-forge.md +* Create new tag +* E-mail tobac mailing list + diff --git a/doc/index.rst b/doc/index.rst index d008e946..1b215f20 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -67,12 +67,20 @@ The project is currently being extended by several contributors to include addit .. toctree:: + :caption: Developer guide + :maxdepth: 3 + + code_structure + contributing + code_reviews + mentoring + +.. toctree:: :caption: Compute bulk statistics :maxdepth: 2 bulk_statistics/index - .. toctree:: :caption: API Reference :maxdepth: 3 diff --git a/doc/mentoring.rst b/doc/mentoring.rst new file mode 100644 index 00000000..c956af5e --- /dev/null +++ b/doc/mentoring.rst @@ -0,0 +1,25 @@ +Mentoring and Collaboration +---------------------------- + +============================ +Writing code collaboratively +============================ + +We firmly believe that code can only get better if more than two eyes and one brain work on it. Therefore, we aim to write code collaboratively, in particular, when comprehensive refactoring or enhancements of the code are done. In practice, this can be done by creating a **draft pull request**. 
This makes it really easy to iteratively improve a pull request with the feedback from others until the pull request is ready for review. + + +When you work on a comprehensive feature with multiple developers, it is recommended to create a draft pull request on the :code:`dev_*` branch. As explained in :doc:`our branching strategy `, this branch is not subject to any branch protection rules. It is meant for experimenting with new code, and all collaborators of the `tobac-project organization `_ can directly push to this branch. Creating a draft pull request has the advantage of facilitating the communication with other developers who contribute to the same new feature. You can directly see which changes they make, comment on these and discuss ways to go forward. + +============== +Get a mentor +============== + +**Is this your first time contributing to an open-source project?** + +Reach out to the **tobac** developer group and get a mentor! One of our developers will help you get started and explain how our workflow works. You are, of course, always free to post any questions to GitHub discussions, our Slack channel, or write an email. But sometimes it can also be nice to have a specific person to refer to when things seem overwhelming in the beginning. + +=============== +Pair reviews +=============== + +Another great way to collaborate is a pair review, which means that you review code together with another developer. You can, for example, reach out to us when you have submitted a pull request and would like to talk through the review points with one of the reviewers in order to collaboratively come up with creative solutions to remaining issues. If you are a reviewer, you can offer a pair review to the person who created the pull request and help them address certain review points. 
diff --git a/doc/testing_sphinx-based_rendering.rst b/doc/testing_sphinx-based_rendering.rst new file mode 100644 index 00000000..f0a7443e --- /dev/null +++ b/doc/testing_sphinx-based_rendering.rst @@ -0,0 +1,154 @@ +How to check the Sphinx-based rendering +--------------------------------------- + + +The workflow has been tested on a Linux system. We aim to build a static +website out of the documentation material present in ``tobac``. + +================================== +1. Preparing the Local Environment +================================== + +- **choose a separate place for your testing** + + I will use the temporary directory ``/tmp/website-testing`` which I + need to create. You can use a dedicated place of your choice … + + .. code:: bash + + > mkdir /tmp/website-testing + > cd /tmp/website-testing + + I will indicate my position now with the ``/tmp/website-testing>`` + prompt. + +- **get the official repository** + + .. code:: bash + + /tmp/website-testing> git clone https://github.com/tobac-project/tobac + + You might like to test a certain remote branch ```` then do: + + .. code:: bash + + /tmp/website-testing/tobac> git fetch --all + /tmp/website-testing/tobac> git checkout -t origin/ + +- **Python environment** + + - create a python virtual env + + .. code:: bash + + /tmp/website-testing> python -m venv .python3-venv + + + - and install requirements + + .. code:: bash + + # deactivating conda is only necessary if you loaded conda before … + /tmp/website-testing> conda deactivate + + # activate the new env and upgrade ``pip`` + /tmp/website-testing> source .python3-venv/bin/activate + /tmp/website-testing> pip install --upgrade pip + + # now everything is installed into the local python env! + /tmp/website-testing> pip install -r tobac/doc/requirements.txt + + # and also install RTD theme + /tmp/website-testing> pip install sphinx_rtd_theme + + `pip`-based installation takes a bit of time, but is much faster than `conda`.
+ + +If the installation runs without problems, you are ready to build the website. + + +================================== +2. Building the Website +================================== + +Only a few steps are needed to build the website, i.e. + +- **running sphinx for rendering** + + .. code:: bash + + /tmp/website-testing> cd tobac + + /tmp/website-testing/tobac> sphinx-build -b html doc doc/_build/html + + If no severe error appeared, you can + +- **view the HTML content** + + .. code:: bash + + /tmp/website-testing/tobac> firefox doc/_build/html/index.html + +================================== +3. Parsing Your Local Changes +================================== + +Now, we connect to your locally hosted ``tobac`` repository and your +development branch. + +- **connect to your local repo**: Assume your repo is located at + ``/tmp/tobac-testing/tobac``, then add a new remote alias and fetch + all content with + + .. code:: bash + + /tmp/website-testing/tobac> git remote add local-repo /tmp/tobac-testing/tobac + /tmp/website-testing/tobac> git fetch --all + +- **check your development branch out**: Now, assume your + development branch is called ``my-devel``, then do + + .. code:: bash + + # to get a first overview on available branches + /tmp/website-testing/tobac> git branch --all + + # and then actually get your development branch + /tmp/website-testing/tobac> git checkout -b my-devel local-repo/my-devel + + You should see your developments, now … + +- **build and view website again** + + .. 
code:: bash + + /tmp/website-testing/tobac> sphinx-build -M clean doc doc/_build + /tmp/website-testing/tobac> sphinx-build -b html doc doc/_build/html + /tmp/website-testing/tobac> firefox doc/_build/html/index.html + + +========================================== +Option: Check Rendering of a Pull Request +========================================== + +- **check the pull request out**: Now, assume the PR has the ID ```` and you define the branch name ``BRANCH_NAME`` as you like + + .. code:: bash + + # to get PR shown as dedicated branch + /tmp/website-testing/tobac> git fetch origin pull/ID/head:BRANCH_NAME + + # and then actually get this PR as branch + /tmp/website-testing/tobac> git checkout BRANCH_NAME + + You should see the PR now ... + +- **build and view website again** + + .. code:: bash + + /tmp/website-testing/tobac> sphinx-build -M clean doc doc/_build + /tmp/website-testing/tobac> sphinx-build -b html doc doc/_build/html + /tmp/website-testing/tobac> firefox doc/_build/html/index.html + +
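The repeated clean, build and view steps could also be wrapped in a small Python helper. The following is only a sketch under stated assumptions: it assumes it is run from the root of the cloned repository with ``sphinx-build`` on the ``PATH``, the function names ``build_commands``, ``index_page`` and ``rebuild_and_view`` are hypothetical, and the standard-library ``webbrowser`` module is swapped in for the explicit ``firefox`` call so that the default browser is used.

```python
import subprocess
import webbrowser
from pathlib import Path


def build_commands(doc_dir="doc"):
    """Return the clean and build invocations as argument lists."""
    build_dir = str(Path(doc_dir) / "_build")
    html_dir = str(Path(doc_dir) / "_build" / "html")
    return [
        ["sphinx-build", "-M", "clean", doc_dir, build_dir],
        ["sphinx-build", "-b", "html", doc_dir, html_dir],
    ]


def index_page(doc_dir="doc"):
    """Return the path of the rendered start page."""
    return Path(doc_dir) / "_build" / "html" / "index.html"


def rebuild_and_view(doc_dir="doc"):
    """Clean, rebuild, then open the result in the default browser."""
    for command in build_commands(doc_dir):
        # check=True stops at the first failing sphinx-build call
        subprocess.run(command, check=True)
    webbrowser.open(index_page(doc_dir).resolve().as_uri())
```

Calling ``rebuild_and_view()`` from ``/tmp/website-testing/tobac`` would then replace the three shell commands above; nothing is executed on import.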