GSoC 2023 Project Ideas
Please ask questions through issues on the respective project's repo.
Tags available @henrykironde,...
- Preferred names
(Henry, Sergio, Ethan)
- Preferred_greeting
(Hi|Hello|Dear|Thanks|Thank you [First_name])
The code of conduct should be your first read.
The National Ecological Observatory Network (NEON) collects and provides long-term, open-access ecological data, which is available through the NEON Data API. Users need to query this data efficiently. The NeonVegWrangleR package helps retrieve a targeted sample of the data, clean it, and deliver it to researchers in a format ready for ecological analyses.
Monitoring where and why trees die is key to enacting conservation measures for forest ecosystems quickly. By integrating open data from the National Ecological Observatory Network (NEON), it is possible to link ground-truth data on sick or dead trees to airborne remote sensing data (RGB or hyperspectral). This project aims to build a baseline model that detects sick or dead trees at either the pixel level or the individual tree level. Field and imagery data will be extracted using the neonwranglerpy package. The field data provide individual tree coordinates as stem locations. Individual tree crowns can be detected using the DeepForest package or any other approach the student finds best suited. The student has full flexibility in choosing the baseline model (the prospective deliverable), as long as it fits the scope of the problem (multi-class classification).
Source Code: neonwranglerpy
- Intermediate, long (350 hours)
- Python and Python package deployment
- git/GitHub
- Machine learning
- Software testing
- A baseline model that integrates NEON field data with airborne remote sensing for classifying pixels (or individual tree objects) that are dead, sick, or healthy according to the plantStatus classes provided by NEON data.
- @MarconiS
- @henrysenyondo
- @ethanwhite
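As a rough illustration of what a multi-class baseline for the dead/sick/healthy problem above could look like, the sketch below implements a nearest-centroid classifier on per-pixel reflectance values using only the Python standard library. It is a minimal sketch under stated assumptions, not the project's method: the function names `fit_centroids` and `predict`, the three-band feature layout, the reflectance numbers, and the simplified labels `healthy`, `sick`, and `dead` are all made up for illustration and are not taken from neonwranglerpy, DeepForest, or the NEON plantStatus vocabulary.

```python
from collections import defaultdict
from math import dist


def fit_centroids(features, labels):
    """Return the mean feature vector (centroid) for each class label."""
    grouped = defaultdict(list)
    for vector, label in zip(features, labels):
        grouped[label].append(vector)
    return {
        label: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(centroids, vector):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


# Made-up (red, green, near-infrared) reflectance values, for illustration only.
train_features = [
    (0.05, 0.10, 0.60), (0.06, 0.12, 0.55),  # labelled healthy
    (0.12, 0.11, 0.35), (0.14, 0.12, 0.30),  # labelled sick
    (0.25, 0.20, 0.22), (0.28, 0.22, 0.20),  # labelled dead
]
train_labels = ["healthy", "healthy", "sick", "sick", "dead", "dead"]

centroids = fit_centroids(train_features, train_labels)
print(predict(centroids, (0.07, 0.11, 0.58)))
```

A classifier this simple mainly serves as a reference point: any more elaborate model proposed for the project should at least outperform it on held-out field data.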
Portal Project: Standardize portal forecast artifacts using EFI community conventions.
The Portal Project uses the Portal Predictions package to create forecasts from the Portal rodent census data. The goal of this project is to standardize the Portal forecast artifacts using the EFI community conventions for forecast file formats, forecast metadata, and forecast archiving. The EFI community conventions are specified in the article https://doi.org/10.32942/osf.io/9dgtq. The Portal Project relies on several packages, including Portal Data, Portal Predictions, and Portalcasting.
- Intermediate, long (350 hours)
- R
- git/GitHub
- Forecast output files, forecast metadata, and input and forecasted variables in the formal EFI-specified format.
- @henrysenyondo
- @ethanwhite
DeepForest is an open source Python package for detecting trees (and other organisms) in remote sensing (RGB) imagery from airplanes and drones. The underlying model structure supports classification as well as detection, so DeepForest can be used to identify trees to species or to distinguish between alive and dead trees, but support for the multi-class aspects of the package needs further development. This project would combine software engineering to improve the UI for working with multi-class models with the development of pre-trained models that provide features useful for transfer learning in species classification and alive/dead classification.
Source Code: https://github.com/weecology/DeepForest
- Intermediate, long (350 hours)
- Python
- Deep learning using PyTorch
- git/GitHub
- An improved UI for working with multi-class models and developing models that are pre-trained to provide features that are useful for transfer learning for species classification and alive/dead classification.
- b
- @henrysenyondo
- @ethanwhite
Portalcasting is an open source R package that supports ecological forecasting of biodiversity for a long-term ecological research program that has been studying desert biodiversity for 45 years. The package provides automated data integration and modular models to produce forecasts for a range of ecological outcomes. Although the forecasting system makes large numbers of forecasts, it currently runs them sequentially rather than in parallel. This project would involve parallelizing the code base so that it can run on multiple cores, both on individual machines and on HPC systems.
Source Code: https://github.com/weecology/portalcasting
- Intermediate, short (175 hours)
- R
- Parallel programming for embarrassingly parallel problems (i.e., the simple end of parallel programming)
- git/GitHub
- A parallelized version of the package that reduces forecast runtime.
- J
- @henrysenyondo
- @ethanwhite
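The "embarrassingly parallel" pattern mentioned for Portalcasting means that each forecast can run without waiting on any other, so the tasks can simply be handed to a pool of workers. The sketch below shows that pattern in Python rather than R, purely to illustrate the idea; the Portalcasting code base itself is R and would rely on R's own parallel tooling. The names `fit_and_forecast` and `run_all`, and the placeholder return value, are hypothetical and do not correspond to anything in the package.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def fit_and_forecast(model_name):
    """Hypothetical stand-in for fitting one model and producing its forecast."""
    # Placeholder result; a real task would fit the model and predict.
    return f"forecast from {model_name}"


def run_all(model_names, use_processes=True, max_workers=None):
    """Run every independent forecast task and collect results by model name."""
    # Processes give multi-core parallelism for CPU-bound work; threads share
    # the same executor interface and are a lighter option when debugging.
    executor_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with executor_cls(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with model_names.
        return dict(zip(model_names, pool.map(fit_and_forecast, model_names)))
```

When using the process-based pool from a script, `run_all` should be called from inside an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Because no task depends on another, the same structure carries over to a cluster scheduler, where each task becomes a separate job.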