This project trains a large language model (LLM) paired with an image encoder to classify fish species from input images. The model is fine-tuned not only to identify the species but also to provide a detailed explanation for its classification decision. You might ask why this approach was chosen over a standard CNN, which is much faster and easier to train. The answer is that the LLM can explain why it chose a given species and suggest plausible alternatives; if you only need classification, a CNN is still faster and probably the better option.
Nonetheless, the fine-tuned model is available on Hugging Face.
- Train an LLM with an image encoder to classify fish species.
- Fine-tune the model to generate explanations for its predictions.
- Use image data scraped from the internet and caption data curated from the Australian Museum and augmented with an LLM.
- Fish Species Identification: Predicts the fish species based on a given image.
- Explainability: Provides a detailed textual explanation for its predictions.
- Fine-Tuning: Trained for 4 hours on an A100 GPU.
- Architecture: LLaVA (Large Language and Vision Assistant) v1.6
- Model Backbone: Mistral-7B
- Fine-Tuning Dataset: Sydney Fish Dataset
- Image Data: Scraped from the internet.
- Caption Data: Curated from the Australian Museum and enhanced using an LLM.
- Training Duration: 4 hours
- Hardware: A100 GPU
- Optimizer: AdamW
- Learning Rate Scheduler: Cosine Annealing
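The fine-tuning code itself is not included here, but the optimizer and scheduler listed above correspond to a standard PyTorch setup along these lines. This is only a sketch: the hyperparameter values, step count, and the stand-in model/loss are placeholders, not the actual training configuration.

```python
import torch
from torch import nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

# Stand-in for the trainable parameters of the real model (e.g. projector / adapter layers).
model = nn.Linear(16, 16)
num_training_steps = 1000  # placeholder; the actual step count is not stated above

optimizer = AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
scheduler = CosineAnnealingLR(optimizer, T_max=num_training_steps)

for step in range(num_training_steps):
    # In the real loop this would be the model's language-modeling loss on a batch.
    loss = model(torch.randn(4, 16)).pow(2).mean()
    loss.backward()
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```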
The trained model is hosted on Hugging Face:
https://huggingface.co/Declan1/llava-v1.6-mistral-7b-sydneyfish-a100
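To try the model, a minimal inference sketch with the Hugging Face `transformers` LLaVA-NeXT classes might look like the following. This assumes the checkpoint loads with the standard `LlavaNextProcessor`/`LlavaNextForConditionalGeneration` classes and follows the base model's Mistral prompt format; the image path and prompt wording are placeholders.

```python
import torch
from PIL import Image
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

model_id = "Declan1/llava-v1.6-mistral-7b-sydneyfish-a100"

# Load the processor and the fine-tuned model (fp16, automatic device placement).
processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Placeholder image; use any photo of a fish.
image = Image.open("fish.jpg")

# Prompt format of the base llava-v1.6-mistral-7b model.
prompt = "[INST] <image>\nWhat fish species is this? Explain your reasoning. [/INST]"

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0], skip_special_tokens=True))
```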
- Image Data:
  - Source: Scraped from various online sources.
  - Format: JPEG/PNG images.
- Caption Data:
  - Source: Australian Museum's fish species database, with an LLM used for data augmentation.
  - Format: JSON with mappings between image IDs and captions.
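As a rough illustration of how a caption file like this could be consumed, the snippet below pairs captions with image files. The file name, key names, and directory layout are assumptions for illustration, not the dataset's actual schema.

```python
import json
from pathlib import Path

# Assumed layout: captions.json maps image IDs to caption strings, e.g.
# {"img_0001": "<caption describing the species in this image>", ...}
with open("captions.json") as f:
    captions = json.load(f)

# Assumed image directory with files named <image_id>.jpg
image_dir = Path("images")
pairs = [
    (image_dir / f"{image_id}.jpg", caption)
    for image_id, caption in captions.items()
    if (image_dir / f"{image_id}.jpg").exists()
]
print(f"Loaded {len(pairs)} image-caption pairs")
```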