COYO-Labeled-300M is a dataset of 300M machine-labeled image-multi-label pairs. We labeled a subset of COYO-700M with a large model (EfficientNetV2-XL) trained on ImageNet-21K, following the same evaluation pipeline as EfficientNetV2. The labels are the 50 most likely classes out of the 21,841 classes in ImageNet-21K. The label probabilities are provided alongside the labels, so users can apply a threshold of their choice for multi-label classification or take the top-1 class for single-label classification.
In other words, COYO-Labeled-300M is an ImageNet-like dataset: instead of 1.25 million human-labeled samples, it contains 300 million machine-labeled samples. It is similar to JFT-300M, which has not been released to the public.
We found that our ViT implementation trained on COYO-Labeled-300M achieves performance similar to the numbers reported in the ViT paper for models trained on JFT-300M.
We also provide weights for the ViT model pretrained on COYO-Labeled-300M, as well as its training and fine-tuning code.
The basic instructions, licenses, and contributors are the same as for COYO-700M.
name | type | description |
---|---|---|
id | long | Unique 64-bit integer ID generated by monotonically_increasing_id(); it is the same value used for the corresponding sample in COYO-700M. |
url | string | The image URL extracted from the `src` attribute of the `<img>` tag |
imagehash | string | The perceptual hash (pHash) of the image |
labels | sequence[integer] | Inference results of EfficientNetV2-XL model trained on ImageNet-21K dataset (Top 50 indices among 21,841 classes) |
label_probs | sequence[float] | Inference results of EfficientNetV2-XL model trained on ImageNet-21K dataset (Top 50 probabilities among 21,841 classes) |
width | integer | The width of the image |
height | integer | The height of the image |
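
The `labels` and `label_probs` columns can be turned into either a multi-label or a single-label target. The following is a minimal sketch, assuming the two sequences are aligned index by index (the i-th probability belongs to the i-th label). The helper names `select_labels` and `top1_label` and the toy record are hypothetical and only serve as illustration; a real record holds 50 entries per sequence.

```python
from typing import List, Sequence, Tuple


def select_labels(
    labels: Sequence[int], label_probs: Sequence[float], threshold: float = 0.1
) -> List[Tuple[int, float]]:
    """Return the (label, probability) pairs whose probability is >= threshold."""
    return [(l, p) for l, p in zip(labels, label_probs) if p >= threshold]


def top1_label(labels: Sequence[int], label_probs: Sequence[float]) -> int:
    """Return the label with the highest probability (no ordering is assumed)."""
    best = max(range(len(label_probs)), key=lambda i: label_probs[i])
    return labels[best]


# Hypothetical toy record, shortened to 3 entries for readability.
record = {"labels": [417, 1203, 88], "label_probs": [0.62, 0.21, 0.04]}

multi = select_labels(record["labels"], record["label_probs"], threshold=0.1)
single = top1_label(record["labels"], record["label_probs"])
print(multi)   # pairs kept for multi-label use
print(single)  # class index for single-label use
```

A record with no probability at or above the threshold yields an empty list from `select_labels`, so such samples can simply be skipped when building a multi-label training set.
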
- Statistics for threshold-based label distribution
Threshold | Labels per Image | Unique Labels | Sampling Ratio |
---|---|---|---|
0.00 | 50.00 | 21,841 | 100.00% |
0.05 | 3.16 | 18,922 | 96.69% |
0.10 | 1.98 | 18,471 | 85.83% |
0.15 | 1.58 | 18,122 | 74.49% |
0.20 | 1.37 | 17,781 | 64.30% |
0.25 | 1.26 | 17,479 | 55.60% |
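
To choose a threshold for a custom subset, similar statistics can be recomputed on a sample of records. The sketch below rests on assumptions, since the table does not define its columns: "Sampling Ratio" is taken to be the fraction of images with at least one label at or above the threshold, and "Labels per Image" is averaged over those retained images only. The function name `threshold_stats` and the toy records are hypothetical.

```python
from typing import Dict, List


def threshold_stats(records: List[dict], threshold: float) -> Dict[str, float]:
    """Compute per-threshold statistics over a list of records.

    Assumed definitions (not confirmed by the dataset authors):
    - an image is retained if at least one probability is >= threshold
    - labels_per_image is the mean number of kept labels over retained images
    """
    kept_counts = []
    unique_labels = set()
    for rec in records:
        kept = [
            l for l, p in zip(rec["labels"], rec["label_probs"]) if p >= threshold
        ]
        if kept:
            kept_counts.append(len(kept))
            unique_labels.update(kept)
    n_kept = len(kept_counts)
    return {
        "labels_per_image": sum(kept_counts) / n_kept if n_kept else 0.0,
        "unique_labels": len(unique_labels),
        "sampling_ratio": n_kept / len(records) if records else 0.0,
    }


# Hypothetical toy records with 3 labels each instead of 50.
records = [
    {"labels": [1, 2, 3], "label_probs": [0.60, 0.30, 0.02]},
    {"labels": [2, 4, 5], "label_probs": [0.50, 0.04, 0.03]},
    {"labels": [6, 7, 8], "label_probs": [0.04, 0.03, 0.02]},
    {"labels": [1, 9, 3], "label_probs": [0.90, 0.01, 0.01]},
]

stats = threshold_stats(records, threshold=0.05)
print(stats)
```

With a threshold of 0.00 every label is kept and every image is retained, which is consistent with the first row of the table.
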
- You can download the dataset from Hugging Face Datasets.
- For more information on downloading the image dataset, please refer to download/README.md.