Request for a Vision Transformer Model for Digital Image Segmentation #35477

Open
2 tasks done
hanshengzhu0001 opened this issue Jan 2, 2025 · 1 comment

Comments

@hanshengzhu0001

Model description

I would like to request the addition of a Vision Transformer (ViT) model fine-tuned specifically for digital image segmentation. The model should use a transformer-based architecture to capture spatial relationships within images, improving performance in tasks such as medical image analysis, satellite image segmentation, and scene segmentation for autonomous driving.

The Vision Transformer architecture has proven effective across vision tasks because self-attention captures long-range dependencies in an image. This model would be particularly valuable for pixel-level classification, where convolutional neural networks (CNNs), with their local receptive fields, often struggle to capture global context.
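To make the pixel-level classification point concrete, here is a minimal pure-Python sketch of how a ViT's patch-level predictions could be mapped back to a per-pixel mask. It assumes a square image, a patch size that divides the image size, and plain nearest-neighbour upsampling. The helper names `patch_grid` and `upsample_patch_labels` are hypothetical, and real segmentation heads typically use learned decoders rather than this simple repetition.

```python
def patch_grid(image_size: int, patch_size: int) -> int:
    """Return the number of patches along one side of a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    return image_size // patch_size


def upsample_patch_labels(patch_labels, patch_size):
    """Nearest-neighbour upsampling (an assumed, simplified decoder).

    Each patch label is repeated over its patch_size x patch_size pixel block,
    turning a grid of per-patch class ids into a per-pixel mask.
    """
    mask = []
    for row in patch_labels:
        # Repeat every label horizontally across the width of its patch.
        pixel_row = [label for label in row for _ in range(patch_size)]
        # Repeat the expanded row vertically across the height of the patch.
        mask.extend(list(pixel_row) for _ in range(patch_size))
    return mask


# A 224x224 image with 16x16 patches yields a 14x14 grid of patch tokens.
grid = patch_grid(224, 16)
num_tokens = grid * grid

# Toy example: a 2x2 grid of patch-level class ids, patch size 2.
toy_mask = upsample_patch_labels([[0, 1], [2, 1]], patch_size=2)
for line in toy_mask:
    print(line)
```

Every pixel inside a patch receives the same label here, which is why practical ViT-based segmentation models add a decoder to recover detail finer than the patch size.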

The ViT model should include the following key features:

  • Pretrained on a large, diverse image segmentation dataset.
  • Fine-tuned for pixel-level classification tasks such as medical image segmentation or semantic segmentation of everyday objects.
  • Support for standard ViT variants such as ViT-B (Base), ViT-L (Large), and ViT-H (Huge) depending on the task's computational budget.
  • Open-source weights and implementation.
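As a rough guide to the computational budget of each variant, the sketch below estimates the encoder parameter count for ViT-B, ViT-L, and ViT-H. The hyperparameters are assumed to be those reported in the original ViT paper, and the helper `encoder_params` is hypothetical. It counts only the transformer blocks, leaving out the patch embedding, position embeddings, and any segmentation head.

```python
# Encoder hyperparameters assumed from the original ViT paper.
VIT_VARIANTS = {
    "ViT-B": {"layers": 12, "hidden": 768, "mlp": 3072, "heads": 12},
    "ViT-L": {"layers": 24, "hidden": 1024, "mlp": 4096, "heads": 16},
    "ViT-H": {"layers": 32, "hidden": 1280, "mlp": 5120, "heads": 16},
}


def encoder_params(cfg: dict) -> int:
    """Estimate the parameter count of the transformer blocks only.

    The head count does not change the total, because the heads split the
    hidden dimension rather than adding to it.
    """
    hidden, mlp = cfg["hidden"], cfg["mlp"]
    # Query, key, value, and output projections, each with a bias vector.
    attention = 4 * (hidden * hidden + hidden)
    # Two-layer feed-forward network with biases.
    feed_forward = (hidden * mlp + mlp) + (mlp * hidden + hidden)
    # Two layer norms per block, each with a scale and a shift.
    layer_norms = 2 * 2 * hidden
    return cfg["layers"] * (attention + feed_forward + layer_norms)


for name, cfg in VIT_VARIANTS.items():
    print(f"{name}: ~{encoder_params(cfg) / 1e6:.1f}M encoder parameters")
```

Under these assumptions the estimates come out at roughly 85M, 302M, and 630M parameters, so each step up in variant multiplies the memory and compute budget several times over.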

Open source status

  • The model implementation is available
  • The model weights are available

Provide useful links for the implementation

@amyeroberts, @qubvel

@Uvi-12
Contributor

Uvi-12 commented Jan 3, 2025

@amyeroberts @qubvel I’m interested in working on this task. Could you please confirm whether this model is needed and would be a valuable addition?
