GSoC 2025 projects

Osvaldo A Martin edited this page Feb 8, 2025 · 8 revisions

ArviZ

ArviZ is a project dedicated to promoting and building tools for exploratory analysis of Bayesian models. It currently has a Python and a Julia interface. All projects listed below are for the Python interface.

ArviZ aims to seamlessly integrate with established probabilistic programming languages like PyStan, PyMC, Turing, Soss, emcee, and Pyro and to be easily integrated with novel or bespoke Bayesian analyses. Where the probabilistic programming languages aim to make it easy to build and solve Bayesian models, the ArviZ libraries aim to make it easy to process and analyze the results from those Bayesian models.

Timeline

The timeline of the GSoC internships is available on the GSoC website.

Projects

Below is a list of possible topics for your GSoC project. We are also open to other topics; contact us on Gitter to discuss them (we won't accept proposals on topics outside this idea list from people who haven't contacted us beforehand). Keep in mind that these are only ideas and that some of them can't be solved entirely in a single GSoC project. When writing your proposal, choose some specific tasks and make sure the scope fits the GSoC time commitment. We expect all projects to be 350h projects; if you'd like to be considered for a 175h project, you must reach out to us on Gitter. We will not accept 175h applications from people who haven't discussed their time commitment with us before applying.

Each project also lists some specific requirements needed to complete it successfully; general requirements are listed below.

Note that these requirements can be learned while writing the proposal and during the community bonding period. You should feel confident to work on any project whose requirements interest you and that you would like to learn about; they are not skills that you are expected to have before writing your proposal. We aim for GSoC to provide a win-win scenario where you benefit from an inclusive and thriving environment in which to learn and the library benefits from your contributions.

The ArviZ refactoring projects require an understanding of the relations between the three main modules: plots, stats, and base. However, unless specified otherwise, no specific knowledge of inference libraries or of the internals of the from_xyz converter functions is needed.

Students should be familiar with Python, numpy, and scipy. They should also be able to write unit tests for the added functionality using pytest and be able to enforce development conventions and use black, pylint, and pydocstyle for code style and linting.
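As a rough idea of the kind of unit test meant here, the sketch below shows a small function together with pytest-style tests for it. The `standardize` helper is hypothetical and is not part of ArviZ; it only uses the standard library so that the example stays self-contained. pytest collects functions whose names start with `test_` and reports any failing `assert`.

```python
import math
import statistics


def standardize(values):
    """Hypothetical helper (not an ArviZ function).

    Rescale values to zero mean and unit population standard deviation.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot standardize constant data")
    return [(v - mean) / std for v in values]


def test_standardize_has_zero_mean_and_unit_std():
    # The rescaled data should have mean 0 and standard deviation 1.
    result = standardize([1.0, 2.0, 3.0, 4.0])
    assert math.isclose(statistics.fmean(result), 0.0, abs_tol=1e-12)
    assert math.isclose(statistics.pstdev(result), 1.0, rel_tol=1e-12)


def test_standardize_rejects_constant_input():
    # Constant data has zero spread, so the helper should refuse it.
    try:
        standardize([2.0, 2.0, 2.0])
    except ValueError:
        return
    raise AssertionError("expected ValueError for constant input")
```

Saved in a file such as `test_standardize.py` (an illustrative name), these tests would be picked up by running `pytest` in the same directory.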

Expected benefits of working on ArviZ

Students who work on ArviZ can expect their skillset to grow in:

  • Bayesian inference libraries
  • Bayesian modeling workflow and model criticism
  • Matplotlib, bokeh, plotly usage (depending on the project)
  • Xarray usage (depending on the project)
  • Numba or Dask optimization (depending on the project)

Plotting refactoring

We are refactoring the plotting module for better composability and extensibility; see arviz-plots.

Expected output

The expected output is API improvements, extensive testing, and documentation of the PlotCollection class and the batteries-included plots.

Required skills

People working on this project should be familiar with plot faceting and the grammar of graphics, and be comfortable with xarray. Basic familiarity with plotting libraries like matplotlib, bokeh, and plotly is needed.

Info

  • Expected size: 350h
  • Difficulty rating: hard
  • Potential mentors: Osvaldo Martin

Stats and diagnostic refactoring

We are refactoring the stats module; see arviz-stats.

Expected output

The expected output is API improvements, extensive testing, and documentation of the stats functions.

Required skills

People working on this project should be familiar with Bayesian statistics and the basics of prior/posterior predictive checks, model comparison, and sensitivity checks, and be willing to learn about new methods for these tasks. They should also be comfortable with xarray.

Info

  • Expected size: 350h
  • Difficulty rating: hard
  • Potential mentors: Osvaldo Martin

Prior elicitation

PreliZ currently supports elicitation on the parameter space (unidimensional) and a few experimental functions on the observed space (predictive elicitation). The objective is to expand these features and make them more robust.
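To give a feel for what unidimensional elicitation involves, here is a minimal sketch that turns a statement such as "I am 90% sure the value lies between -1 and 3" into a normal prior. It uses only the Python standard library; `normal_from_interval` is a hypothetical helper written for this illustration and is not PreliZ's API, and the choice of a normal distribution centred on the interval is an assumption.

```python
from statistics import NormalDist


def normal_from_interval(lower, upper, mass=0.9):
    """Hypothetical helper (not part of PreliZ).

    Return a normal distribution whose central interval [lower, upper]
    contains the requested probability mass.
    """
    if not lower < upper:
        raise ValueError("lower must be smaller than upper")
    if not 0 < mass < 1:
        raise ValueError("mass must be strictly between 0 and 1")
    # Standard normal quantile that leaves (1 - mass) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + mass / 2)
    mu = (lower + upper) / 2
    sigma = (upper - lower) / (2 * z)
    return NormalDist(mu, sigma)


prior = normal_from_interval(-1.0, 3.0, mass=0.9)
# Probability that the elicited prior assigns to the stated interval.
covered = prior.cdf(3.0) - prior.cdf(-1.0)
print(f"mu={prior.mean:.3f}, sigma={prior.stdev:.3f}, mass in interval={covered:.3f}")
```

Predictive elicitation works in the other direction: statements are made about the observable data rather than the parameter, and the prior is found indirectly, which is one reason it is harder to make robust.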

Required skills

People working on this project will need to be familiar with Bayesian statistics, PreliZ and possibly also ipywidgets.

Expected outcome

The expected outcome of this project will be the implementation of new features and accompanying documentation that demonstrates how they can be effectively integrated into a Bayesian workflow.

Info

  • Expected size: 350h
  • Difficulty rating: hard
  • Potential mentors: Osvaldo Martin