Question Regarding Training Hyperparameters for DINOv2 + BoQ Model #21

Open

SeunghanYu opened this issue Dec 26, 2024 · 0 comments

SeunghanYu commented Dec 26, 2024

Hello, @amaralibey

I have a question about the training hyperparameters for the DINOv2 + BoQ model.
In the OpenVPRLab repository, the boq_dinov2.yaml configuration file specifies the following hyperparameters for training:

#---------------------------------------------------
# Datamodule Configuration
#---------------------------------------------------
datamodule:
  train_set_name: "gsv-cities" # use "gsv-cities" if you have downloaded the full dataset
  train_image_size: 
    - 280
    - 280
  img_per_place: 4
  batch_size: 160
  num_workers: 8
  val_image_size: 
    - 322
    - 322
  val_set_names:
    - "msls-val"
    - "pitts30k-val"

#---------------------------------------------------
# VPR Model Configuration
#---------------------------------------------------
backbone:
  module: src.models.backbones
  class: DinoV2
  params:
    backbone_name: "dinov2_vitb14" # name of the vit backbone (see DinoV2.AVAILABLE_MODELS)
    num_unfrozen_blocks: 2

aggregator:
  module: src.models.aggregators # module path
  class: BoQ    # class name in the __init__.py file in the aggregators directory
  params:
    in_channels:  # if left blank we will use backbone.out_channels.
    proj_channels: 384
    num_queries: 64
    num_layers: 2
    row_dim: 32

#---------------------------------------------------
# Loss Function Configuration
#---------------------------------------------------
loss_function: 
  # check src/losses/vpr_losses.py for available loss functions, we are using pytorch_metric_learning library
  # if you want to develop your own loss function, you can add it to the vpr_losses.py file
  # or create a new file in the losses directory and import it into the __init__.py file
  module: src.losses
  name: vpr_losses
  class: VPRLossFunction
  params:
    loss_fn_name: "MultiSimilarityLoss"   # other possible values: "SupConLoss", "ContrastiveLoss", "TripletMarginLoss"
    miner_name: "MultiSimilarityMiner"    # other possible values: "TripletMarginMiner", "PairMarginMiner"


#---------------------------------------------------
# Trainer Configuration
#---------------------------------------------------
trainer:
  optimizer: adamw
  lr: 0.0002      # learning rate
  wd: 0.001       # weight decay
  warmup: 3900    # linear warmup steps
  max_epochs: 40
  milestones:
    - 10
    - 20
    - 30
  lr_mult: 0.1 # learning rate multiplier at each milestone
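
As an aside, my understanding is that these module/class/params blocks are resolved by importing the module and instantiating the class with the given params. Here is a minimal sketch of that mechanism, just to state my reading of the config; this is my own illustration, not OpenVPRLab's actual loader, and the in_channels value is assumed from dinov2_vitb14's 768-dim embeddings:

import importlib

def build_from_config(cfg: dict):
    # Hypothetical helper: import cfg["module"], look up cfg["class"],
    # and instantiate it with cfg["params"] as keyword arguments.
    module = importlib.import_module(cfg["module"])
    cls = getattr(module, cfg["class"])
    # Params left blank in the YAML parse as None; dropping them lets the
    # framework fill them in later (e.g. in_channels from backbone.out_channels).
    params = {k: v for k, v in (cfg.get("params") or {}).items() if v is not None}
    return cls(**params)

aggregator = build_from_config({
    "module": "src.models.aggregators",
    "class": "BoQ",
    "params": {"in_channels": 768, "proj_channels": 384,
               "num_queries": 64, "num_layers": 2, "row_dim": 32},
})  # requires OpenVPRLab on PYTHONPATH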

However, in the train.py file of the Bag-of-Queries repository, I noticed that some hyperparameters are different.

For example:

## Training config:
self.batch_size: int = 128           # batch size is the number of places per batch
self.img_per_place: int = 4          # number of images per place
self.max_epochs: int = 40
self.warmup_epochs: int = 10         # number of linear warmup epochs (not iterations)
self.lr: float = 1e-4                # learning rate
self.weight_decay: float = 1e-4
self.lr_mul: float = 0.1
self.milestones: list = [10, 20]
self.num_workers: int = 8

def train(hparams, dev_mode=False):
    seed_everything(hparams.seed, workers=True)
    
    # Instantiate the backbone and define the image size for training and validation
    if "dinov2" in hparams.backbone_name:
        backbone = DinoV2(backbone_name=hparams.backbone_name, unfreeze_n_blocks=hparams.unfreeze_n_blocks)
        train_img_size = (224, 224)
        val_img_size = (322, 322)
        hparams.backbone_name = backbone.backbone_name # in case the user passed dinov2 without the version
        hparams.train_img_size = train_img_size
        hparams.val_img_size = val_img_size
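
If I read both configurations correctly, they describe the same schedule shape: a linear warmup followed by multiplying the learning rate by lr_mult at each milestone. Here is a minimal PyTorch sketch of that shape, just to make sure I understand it; the steps_per_epoch value and the step/epoch bookkeeping are my assumptions, not code from either repo:

import torch
from torch.optim.lr_scheduler import LambdaLR

# Stand-in model; the real one is the DinoV2 backbone + BoQ aggregator.
model = torch.nn.Linear(768, 384)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4, weight_decay=1e-3)

steps_per_epoch = 390      # placeholder value, see the estimate further down
warmup_steps = 3900        # OpenVPRLab config; the BoQ repo uses 10 epochs instead
milestones = [10, 20, 30]  # epochs at which the lr is multiplied by lr_mult
lr_mult = 0.1

def lr_lambda(step: int) -> float:
    # Linear warmup from ~0 to the base lr, then step decay at each milestone.
    if step < warmup_steps:
        return (step + 1) / warmup_steps
    return lr_mult ** sum(step >= m * steps_per_epoch for m in milestones)

scheduler = LambdaLR(optimizer, lr_lambda)

for step in range(40 * steps_per_epoch):  # max_epochs = 40
    optimizer.step()       # forward/backward elided
    scheduler.step()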

Key differences include:

  • Training image size: (280, 280) -> (224, 224)
  • Batch size: 160 -> 128
  • Learning rate: 0.0002 -> 0.0001
  • Weight decay: 0.001 -> 0.0001
  • Warmup: 3900 iterations -> 10 epochs (see the back-of-envelope check after this list)
  • Milestones: [10, 20, 30] -> [10, 20]
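
Regarding the warmup in particular: if GSV-Cities contains roughly 62k places (my assumption, taken from the GSV-Cities paper) and each epoch samples every place once, the two warmup settings may actually be nearly equivalent:

places = 62_000                          # assumed size of GSV-Cities
batch_size = 160                         # places per batch (OpenVPRLab config)
steps_per_epoch = places // batch_size   # 387 steps per epoch
warmup_steps = 3900
print(warmup_steps / steps_per_epoch)    # ~10.08, i.e. about 10 epochs

If that arithmetic holds, the warmup difference may be cosmetic, and the substantive differences would be the image size, batch size, learning rate, weight decay, and milestones.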

These discrepancies leave me wondering which hyperparameter configuration to follow.

Could you kindly clarify the exact set of hyperparameters recommended for training the DINOv2 + BoQ model? Specifically:

1. What are the correct values for training image size, batch size, learning rate, weight decay, warmup strategy, and milestones?
2. Are there additional settings or considerations that I should be aware of when training this model?

Thank you for your assistance! I greatly appreciate your work on these repositories and look forward to your guidance.
