Ind-2M Dataset

Introduction

To address the challenges of pre-training models for industrial applications, we have curated a dataset called Ind-2M, collected specifically from industrial scenarios. Ind-2M comprises 2.2 million industrial images, sourced from publicly available industrial datasets and from web crawling. It contains 1.6 million images of non-defective industrial products and 0.6 million images of defects found in industrial settings. With Ind-2M, our aim is to advance industrial image representation through model pre-training.

This project open-sources the crawled portion of the Ind-2M dataset. Due to copyright restrictions, the portion drawn from publicly available datasets should be downloaded separately, using the indexes provided in the paper. The crawled portion is called Ind-2M-Crawling and includes 221,062 industrial product images and 614,002 industrial defect images. Downloading and using this dataset must comply with the dataset's license, and the dataset paper should be cited.
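As a quick sanity check on the figures above, the following sketch adds up the two Ind-2M-Crawling counts and relates them to the full dataset. The constant and function names are illustrative only, and the 2.2 million total is the rounded figure from the introduction, so the resulting share is approximate.

```python
# Counts stated in this README for the crawled subset (Ind-2M-Crawling).
CRAWLED_PRODUCT_IMAGES = 221_062
CRAWLED_DEFECT_IMAGES = 614_002

# Rounded size of the full Ind-2M dataset as stated in the introduction.
FULL_DATASET_APPROX = 2_200_000


def crawled_summary() -> dict:
    """Summarize the crawled subset using only the numbers given in the README."""
    total = CRAWLED_PRODUCT_IMAGES + CRAWLED_DEFECT_IMAGES
    return {
        "crawled_total": total,
        # Share of the crawled subset that shows defects.
        "defect_fraction_of_crawled": CRAWLED_DEFECT_IMAGES / total,
        # Approximate share of the full dataset covered by the crawled subset.
        "crawled_fraction_of_full": total / FULL_DATASET_APPROX,
    }


summary = crawled_summary()
print(f"Crawled images: {summary['crawled_total']:,}")
print(f"Defect share of crawled subset: {summary['defect_fraction_of_crawled']:.1%}")
print(f"Crawled share of Ind-2M (approx.): {summary['crawled_fraction_of_full']:.1%}")
```

Under these numbers, Ind-2M-Crawling holds 835,064 images, roughly 38% of the full dataset, and most of them are defect images.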

Downloading

https://drive.google.com/drive/folders/19bUh_S114CPiFQQH1_ezMT4A5xgAi3y0?usp=sharing
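The folder can be downloaded through the browser. For scripted downloads, one option is the third-party `gdown` package (`pip install gdown`); this is a minimal sketch and not part of this project. The helper names and the `Ind-2M-Crawling` output directory are assumptions, and `gdown` may fail or stop early on large folders because of Google Drive quotas and per-folder file limits.

```python
from urllib.parse import urlparse

# Shared folder link from this README.
IND2M_CRAWLING_URL = (
    "https://drive.google.com/drive/folders/"
    "19bUh_S114CPiFQQH1_ezMT4A5xgAi3y0?usp=sharing"
)


def folder_id_from_url(url: str) -> str:
    """Return the Google Drive folder ID: the path segment after 'folders'."""
    parts = urlparse(url).path.strip("/").split("/")
    return parts[parts.index("folders") + 1]


def download_ind2m_crawling(output_dir: str = "Ind-2M-Crawling") -> None:
    """Download the shared folder into output_dir (hypothetical default name)."""
    # Imported lazily so the helper above works without gdown installed.
    import gdown

    gdown.download_folder(url=IND2M_CRAWLING_URL, output=output_dir, quiet=False)
```

Calling `download_ind2m_crawling()` starts the download; `folder_id_from_url(IND2M_CRAWLING_URL)` is handy when another tool expects the folder ID instead of the full link.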

License

DATASET LICENSE.

Citation

If you use Ind-2M in your research, please cite it using the following BibTeX entry.

@article{zhu2024pixel,
  title={Pixel-level Contrastive Pre-Trainer for Industrial Image Representation},
  author={Zhu, Bingke and Chen, Yingying and Tang, Ming and Wang, Jinqiao},
  journal={IEEE Transactions on Instrumentation and Measurement},
  year={2024},
  publisher={IEEE}
}
