sejal-0502/Computer-Vision

Computer Vision

Project Title

This project implements semantic segmentation, image captioning, and image-text retrieval using self-supervised learning and image-text captioning. The code was run on Linux inside a conda virtual environment.

Project Description

Goal of the project

1. Self-supervised learning with the DINOv2 model:

1.1 Nearest Neighbor Features with Negative Feature Space Distance as Score

1.2 Visualization of Nearest Neighbors and Calculating Crop Error with Feature Distance Ranking

1.3 Nearest Neighbor Features with Negative Cycle Distance as Score

1.4 Visualization of Nearest Neighbors and Calculating Crop Error with Cycle Distance Ranking

1.5 Nearest Neighbor matching with One-to-Many Frames

1.6 Visualization of Nearest Neighbors Matches for One-to-Many Frames
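The two scoring schemes in 1.1 and 1.3 can be illustrated with a small, dependency-free sketch. It is not the project's implementation: the feature vectors below are toy values standing in for DINOv2 patch features, and the definition of "cycle distance" used here (match a patch from frame A to frame B, match back to A, and measure how far the returned patch lies from the starting patch) is an assumption, since this README does not define it. All function names are hypothetical.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbor(query, candidates):
    """Return (index, score) of the closest candidate.
    The score is the negative feature-space distance, so higher is better."""
    dists = [l2(query, c) for c in candidates]
    idx = min(range(len(dists)), key=dists.__getitem__)
    return idx, -dists[idx]

def cycle_scores(feats_a, pos_a, feats_b):
    """Assumed cycle-distance scoring: A -> B -> A.
    The score is the negative spatial distance between the patch we
    started from and the patch we land on after the round trip."""
    results = []
    for i, feat in enumerate(feats_a):
        j, _ = nearest_neighbor(feat, feats_b)        # forward match into B
        k, _ = nearest_neighbor(feats_b[j], feats_a)  # backward match into A
        results.append((j, -l2(pos_a[i], pos_a[k])))
    return results

# Toy stand-ins for patch features and patch grid positions (hypothetical values).
feats_a = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.3]]
pos_a = [(0, 0), (0, 1), (1, 0)]
feats_b = [[0.0, 0.9], [1.0, 0.1]]

feature_matches = [nearest_neighbor(f, feats_b) for f in feats_a]
cycle_matches = cycle_scores(feats_a, pos_a, feats_b)

# Rank patches from most to least reliable match under the cycle score.
ranking = sorted(range(len(cycle_matches)),
                 key=lambda i: cycle_matches[i][1], reverse=True)
```

In this toy example the third patch of frame A does not return to itself after the round trip, so it gets the lowest cycle score and ranks last, even though its forward feature distance is small.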

2. Image-text captioning, with complete captions generated by:

2.1 Greedy Search

2.2 Sampling
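The difference between the two decoding strategies can be shown with a minimal sketch. It assumes a hypothetical `toy_logits` function in place of a real captioning model's next-token scores, and a four-word vocabulary invented for illustration; only the decoding loop is the point.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities (numerically stable)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_step(logits):
    """Greedy search: always pick the highest-scoring token."""
    return max(range(len(logits)), key=logits.__getitem__)

def sample_step(logits, rng, temperature=1.0):
    """Sampling: draw the next token from the softmax distribution."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

def decode(next_logits, pick, eos_id, max_len=10):
    """Generate tokens one at a time until EOS or max_len is reached."""
    tokens = []
    for _ in range(max_len):
        tok = pick(next_logits(tokens))
        if tok == eos_id:
            break
        tokens.append(tok)
    return tokens

# Hypothetical vocabulary and next-token scores, standing in for a real model.
vocab = ["a", "dog", "runs", "<eos>"]
EOS = 3
table = [[3, 1, 0, -2], [0, 3, 1, -2], [0, 0, 3, 1], [-2, -2, -2, 3]]

def toy_logits(tokens):
    return table[min(len(tokens), 3)]

greedy_tokens = decode(toy_logits, greedy_step, EOS)
greedy_caption = " ".join(vocab[t] for t in greedy_tokens)

rng = random.Random(0)  # seeded so repeated runs give the same draw
sampled_tokens = decode(toy_logits, lambda l: sample_step(l, rng), EOS)
sampled_caption = " ".join(vocab[t] for t in sampled_tokens)
```

Greedy search is deterministic and here yields "a dog runs". Sampling can produce a different caption for each seed, which trades some fluency for diversity.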

Image-text retrieval is performed using the BLIP model.
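As a rough illustration of the retrieval step, the sketch below ranks items by cosine similarity between image and text embeddings. This is an assumption about how retrieval could be scored, not a description of the project's code: the embeddings are toy values, whereas in practice they would come from the model's image and text encoders, and the function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query, gallery, top_k=1):
    """Return indices of the top_k gallery items most similar to the query."""
    scored = [(cosine(query, g), i) for i, g in enumerate(gallery)]
    scored.sort(reverse=True)  # highest similarity first
    return [i for _, i in scored[:top_k]]

# Toy embeddings (hypothetical values, not model outputs).
image_embeddings = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.9]]
text_embeddings = [[0.0, 0.1, 1.0], [1.0, 0.0, 0.1]]

# Text-to-image: best image for each caption.
text_to_image = [retrieve(t, image_embeddings)[0] for t in text_embeddings]
# Image-to-text: best caption for each image.
image_to_text = [retrieve(im, text_embeddings)[0] for im in image_embeddings]
```

The same `retrieve` function serves both directions, since only the roles of query and gallery are swapped.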

Acknowledgement

The project was built using Python libraries, with Anaconda used for package installation and environment management. Special thanks to the open-source community for their contributions and for making them freely available.

A detailed report for the project has been attached.
