MAISI VAE latent space #1937
Unanswered
marceljhuber asked this question in Q&A
I trained the VAE part of MAISI on a grayscale OCT dataset and generated some samples. The results did not meet my expectations. To check whether the problem was with my checkpoint, I repeated the experiment with a checkpoint from the model zoo and got similar results.
The generations are just latent mess. I can't find a way to sample from this model that produces anything meaningful. When I use the z vectors obtained by encoding an image, I get perfect reconstructions. Does this mean the latent space is so complex that random sampling (almost) never yields usable results, and that is why a diffusion process is needed to make sense of it? I expected to get something that at least roughly has the outlines of a real image rather than this latent mess. Interpolating between two latent points also produces unintuitive-looking images.
Which part of the architecture leads to this different behaviour of the latent space compared to the original implementation, where latent interpolations work much more intuitively?
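For reference, this is roughly what I am doing (a minimal sketch; `encode`/`sampling`/`decode` follow MONAI's AutoencoderKL-style interface, and the model/image variables are placeholders, so the exact MAISI VAE calls may differ):

```python
# Sketch of the three experiments above: reconstruction, random sampling,
# and interpolation. Assumes a MONAI AutoencoderKL-style interface:
# encode -> (z_mu, z_sigma), sampling(z_mu, z_sigma) -> z, decode(z) -> image.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
vae = ...  # trained MAISI VAE checkpoint, loaded elsewhere (placeholder)
vae = vae.to(device).eval()

# Placeholder grayscale volumes; in my case these are OCT images.
image_a = torch.randn(1, 1, 64, 64, 64, device=device)
image_b = torch.randn(1, 1, 64, 64, 64, device=device)

with torch.no_grad():
    # 1) Reconstruction: encode an image, decode its own latent -> near-perfect
    z_mu, z_sigma = vae.encode(image_a)
    z = vae.sampling(z_mu, z_sigma)
    recon = vae.decode(z)

    # 2) Random sampling: decode a latent drawn from N(0, I) -> "latent mess"
    z_rand = torch.randn_like(z_mu)
    sample = vae.decode(z_rand)

    # 3) Interpolation: linearly blend two encoded latents -> unintuitive images
    z_b_mu, _ = vae.encode(image_b)
    interp = [vae.decode((1 - t) * z_mu + t * z_b_mu)
              for t in torch.linspace(0, 1, 8, device=device)]
```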
VAE_generations.pdf