For the single-image super-resolution task, SRGAN is one of the best-performing models. Its architecture is based on the GAN framework and uses a VGG-based perceptual loss to produce upscaled images.
In the notebook contained in this repository, we re-implemented the SRGAN architecture and trained it on the DIV2K dataset.
Starting from the SRGAN architecture, we designed an alternative version, SRGAN-Canny, that uses edge detection to refine contours and enhance details. Following the multi-task learning paradigm, we trained the model to reproduce not only the upscaled image but also the result of the Canny edge detection operator on the input image.
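The multi-task idea can be illustrated with a minimal sketch. It is not the notebook's training code: the function names, the Canny thresholds, the `edge_weight` value, and the use of plain MSE for both terms are assumptions made for illustration (the actual model relies on a VGG-based perceptual loss and an adversarial term).

```python
import numpy as np


def canny_edge_map(image_uint8, low=100, high=200):
    """Return a Canny edge map scaled to [0, 1] for an 8-bit grayscale image.

    The thresholds are illustrative assumptions, not the project's values.
    """
    import cv2  # imported lazily so the loss below can be used without OpenCV

    edges = cv2.Canny(image_uint8, low, high)  # uint8 array with values 0 or 255
    return edges.astype(np.float32) / 255.0


def multitask_loss(sr_pred, sr_target, edge_pred, edge_target, edge_weight=0.1):
    """Weighted sum of an image term and an edge term (both plain MSE here).

    edge_weight is a hypothetical hyperparameter balancing the two tasks.
    """
    image_term = np.mean((sr_pred - sr_target) ** 2)
    edge_term = np.mean((edge_pred - edge_target) ** 2)
    return image_term + edge_weight * edge_term
```

The auxiliary edge term penalizes predictions whose contours differ from the Canny target, which is one way to push a generator toward sharper edges.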
The training pipeline for both SRGAN-Canny and the original SRGAN is available on Colab.
The file Presentation.pdf contains a presentation of the work and of the challenges encountered during the development of this project.
This section presents some of the results obtained by the model. In each pair, the low-resolution image is on the left and the upscaled version produced by the SRGAN-Canny model is on the right.
The following table reports the average MSE, SSIM and PSNR on the datasets used as test sets (Set5, Set14, BSD100) for two models: the baseline SRGAN architecture proposed in the original paper, retrained on our dataset, and our SRGAN-Canny version enhanced by the edge detection task.
Model | AVG MSE | AVG SSIM | AVG PSNR (dB) |
---|---|---|---|
SRGAN-VGG54 | 0.01971 | 0.59896 | 23.87128 |
SRGAN-Canny | 0.01808 | 0.61142 | 24.25129 |
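As a rough illustration of how two of these metrics are defined, the sketch below computes MSE and PSNR with NumPy. It assumes pixel values scaled to [0, 1] (hence `max_val=1.0`) and is not the evaluation code used in the notebook; the function names are hypothetical. SSIM is omitted for brevity.

```python
import numpy as np


def mse(reference, estimate):
    """Mean squared error between two images of the same shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    return float(np.mean((reference - estimate) ** 2))


def psnr(reference, estimate, max_val=1.0):
    """Peak signal-to-noise ratio in dB; max_val is the assumed pixel range."""
    error = mse(reference, estimate)
    if error == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / error))
```

If metrics are averaged per image, the average PSNR is generally not equal to the PSNR computed from the average MSE, because the logarithm is applied before averaging.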
In this project, the following libraries for Python were used:
- TensorFlow
- OpenCV
- NumPy
- Matplotlib for plotting
- Weights and Biases in order to track the experiments