The code presented in this repository is divided into two sections, separated into two folders.
- Handling different spatial and spectral resolutions of image datasets while maintaining the placement of annotations. (Python)
- Adapting annotation dataset categories. Use segment-based annotations to create other AI employable datasets. (R_Markdown)
- Interacting with CVAT (Python)
1. Handling different spatial and spectral resolutions of image datasets while maintaining the placement of annotations. (Python)
First you use the Harmonize Data Notebook to create your scaled annotation files. Then you can use the Combine Data Notebook to combine all the single annotation files to one combined annotation file.
This Notebook contains two functions.
write_new_annotations(annotations: list, scale: float)
: Write annotation data to json formatget_filename(path: str):
: Get the base file name form a file path
The code generates a list of all paths to the JSON files located within a specified directory. For each JSON file, the code prints the file name and loads its content into a dictionary. The annotations and images from the COCO data set are then extracted and a list of image file names is generated. For each scale in a predefined list, new annotations are generated using a specific function, and the COCO data dictionary is updated. A new file name is generated based on the current time, the original file name, and the scale in question, with the objective of ensuring that the resulting name is compatible with the file system. Subsequently, the updated COCO data is written to a new JSON file, and a confirmation message is displayed.
This Notebook contains two functions:
get_filename(path: str):
: Get the base file name form a file pathmerge_coco_json(json_files: list, output_file: str)
: Merge and write COCO annotation files
The code performs a loop through each annotation file. The file name is extracted without the extension. Subsequently, the corresponding scaled files are collated, and the original annotation file is appended to this list. A list of paths to COCO JSON files for merging is then prepared. An output file name is generated based on the current time and the original file name, ensuring that the name is compatible with the file system. The COCO JSON files are then merged and the resulting file is saved to the specified output path. The final step is to display a confirmation message indicating the location where the merged file is saved.
Checkout the Requirement file in this directory. You can use it with the following command:
pip install -r requirements.txt
2. Adapting annotation dataset categories. Use segment-based annotations to create other AI employable datasets. (R_Markdown)
Please open the folder R_Markdown to see the example files and the R Markdown Script. The Script is designed to work with annotation files based (Json COCO format) on multiple images. The example file, as extracted in the folder with the script, demonstrates the compiling and execution in general, while providing explanation and visualizations of the performed steps and generated outputs.
- R version 4.3.2: GPL (≥ 3)
- RStudio 2023.12.1: Posit End User License Agreement
R Packages
- rmarkdown: GPL-3
- dplyr: MIT
- knitr: GPL-2 | GPL-3
- rjson: GPL-2
- sf: GPL-2 | MIT
- stringr: MIT
- terra: GPL (≥ 3)
The goal was to upload several files to our on-site CVAT instance without having to do it manually. For our project, we had four people working there, so we have four slots in the scripts, but this number can vary depending on your needs.
We've got three different Python scripts. All of these scripts need to be set up with the same environment settings from the .env file. One just sends the image folders to your CVAT instance, one assigns tasks to your team members, and one assigns jobs to your team members.
The .env file contains:
- Username
- Password
With the python script send_images_to_cvat.py you can send images to your CVAT instance. The script expects the following parameters in the script:
- taskpreset: the name for your tasks
- images_folder: where your subfolders are for each task
- host: The ip and port of your CVAT instance
- Project_id: The Id of your Project
- slug: The shortname (slug) of your organization if the project exists in an organization
It's basically the same as in [Upload Images] (#upload-images), but now you can also assign your team members with their usernames and how many tasks you want to give them. The tasks are then automatically assigned while they are uploaded to CVAT. The script can be found here.
Checkout the Requirement file in this directory. You can use it with the following command:
pip install -r requirements.txt
The example data and code in this repository is released under the BSD-3 license.
Funded by the German Federal Ministry for the Environment, Nature Conservation, Nuclear Safety and Consumer Protection (BMUV) based on a resolution of the German Bundestag (Grant No. 67KI21014A).
- Felix Becker ([email protected]) - Python Notebooks (Method 1)
- Robert Retting ([email protected]) - R_Markdown Script (Method 2)