This paper presents the results of the Sixth Thermal Image Super-Resolution Challenge held within the Perception Beyond the Visible Spectrum (PBVS) workshop at CVPR 2025. The challenge maintains the same cross-spectral benchmark dataset as the previous year, consisting of 1000 thermal images each paired with corresponding high-resolution RGB images. Track 1 focuses on single thermal image super-resolution, enhancing low-resolution infrared images by a factor of ×8, whereas Track 2 addresses guided thermal image super-resolution, performing super-resolution at scale factors of ×8 and ×16 by leveraging high-resolution RGB images as auxiliary inputs. The 2025 edition attracted increased participation, with 128 teams competing in Track 1 and 86 teams in Track 2. The paper describes methodologies employed by the top participating teams, emphasizing innovations in transformer-based and hybrid architectures, and provides a detailed comparative analysis of the results between the 2024 and 2025 challenges. This analysis reveals significant progress in thermal image reconstruction accuracy, showcasing notable advances achieved by the leading methodologies.
This paper outlines the advancements and results of the Fifth Thermal Image Super-Resolution challenge, hosted at the Perception Beyond the Visible Spectrum CVPR 2024 workshop. The challenge employed a novel cross-spectral benchmark dataset consisting of 1000 thermal images, each paired with its corresponding registered RGB image. The challenge featured two tracks: Track-1 focused on Single Thermal Image Super-Resolution with an ×8 upscale factor, while Track-2 extended its evaluation to include both ×8 and ×16 scaling factors, utilizing high-resolution RGB images to guide the super-resolution process for low-resolution thermal images. The participation of over 175 teams highlights the research community’s strong engagement and dedication to enhancing image resolution techniques across both single and cross-spectral methodologies. This year’s challenge sets new benchmarks and provides valuable insights into future directions for research in thermal image super-resolution.
This paper presents the results of two tracks from the fourth Thermal Image Super-Resolution (TISR) challenge, held at the Perception Beyond the Visible Spectrum (PBVS) 2023 workshop. Track-1 uses the same thermal image dataset as previous challenges, with 951 training images and 50 validation images at each resolution. In this track, two evaluations were conducted: the first consists of generating an SR image from a noisy HR thermal image downsampled by a factor of four, and the second consists of generating an SR image from a mid-resolution image and comparing it with its semi-registered HR image (acquired with another camera). The results of Track-1 outperformed those from last year’s challenge. Track-2, on the other hand, uses a newly acquired dataset consisting of 160 registered visible and thermal images of the same scenario for training and 30 validation images. This year, more than 150 teams participated in the challenge tracks, demonstrating the community’s ongoing interest in this topic.
This chapter reviews state-of-the-art approaches generally present in the pipeline of video analytics on urban scenarios. A typical pipeline is used to cluster approaches in the literature, including image preprocessing, object detection, object classification, and object tracking modules. Then, a review of recent approaches for each module is given. Additionally, applications and datasets generally used for training and evaluating the performance of these approaches are included. This chapter is not intended to be an exhaustive review of state-of-the-art video analytics in urban environments but rather an illustration of some of the different recent contributions. The chapter concludes by presenting current trends in video analytics for urban scenarios.
This paper presents results from the third Thermal Image Super-Resolution (TISR) challenge organized in the Perception Beyond the Visible Spectrum (PBVS) 2022 workshop. The challenge uses the same thermal image dataset as the first two challenges, with 951 training images and 50 validation images at each resolution. A set of 20 images was kept aside for testing. The evaluation tasks were to measure the PSNR and SSIM between the SR image and the ground truth (HR thermal noisy image downsampled by four), and also to measure the PSNR and SSIM between the SR image and the semi-registered HR image (acquired with another camera). The results outperformed those from last year’s challenge, improving both evaluation metrics. This year, almost 100 teams registered for the challenge, showing the community’s interest in this hot topic.
This paper proposes a novel CNN architecture for the multi-thermal image super-resolution problem. In the proposed scheme, the multiple images are synthetically generated by downsampling and slightly shifting the given image; noise is also added to each of these synthesized images. The proposed architecture uses two attention-block paths to extract high-frequency details, taking advantage of the large amount of information extracted from multiple images of the same scene. Experimental results are provided, showing that the proposed scheme outperforms state-of-the-art approaches.
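The synthetic multi-image generation described in this abstract (shift, downsample, add noise) could be sketched roughly as follows. This is only an illustration under assumptions that are not in the abstract: the order of operations (shift first, then downsample), block-average downsampling, Gaussian noise, and all function names and parameter values (`n_frames`, `factor`, `max_shift`, `sigma`) are hypothetical.

```python
import random

def shift_image(img, dx, dy):
    """Shift a 2D list of pixel values by (dx, dy), replicating edge pixels."""
    h, w = len(img), len(img[0])
    return [[img[min(max(y - dy, 0), h - 1)][min(max(x - dx, 0), w - 1)]
             for x in range(w)] for y in range(h)]

def downsample(img, factor):
    """Downsample by averaging non-overlapping factor x factor blocks (assumed kernel)."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [[sum(img[y * factor + j][x * factor + i]
                 for j in range(factor) for i in range(factor)) / factor ** 2
             for x in range(w)] for y in range(h)]

def add_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise (assumed noise model) and clip to [0, 255]."""
    return [[min(max(v + rng.gauss(0.0, sigma), 0.0), 255.0) for v in row]
            for row in img]

def synthesize_frames(img, n_frames=4, factor=2, max_shift=1, sigma=2.0, seed=0):
    """Build several shifted, downsampled, noisy low-resolution views of one image."""
    rng = random.Random(seed)
    frames = []
    for _ in range(n_frames):
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        low_res = downsample(shift_image(img, dx, dy), factor)
        frames.append(add_noise(low_res, sigma, rng))
    return frames
```

Each call returns `n_frames` low-resolution views of the same scene, which is the kind of input a multi-image super-resolution network would consume.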
This paper presents a transfer domain strategy to tackle the limitations of low-resolution thermal sensors and generate higher-resolution images of reasonable quality. The proposed technique employs a CycleGAN architecture and uses a ResNet as an encoder in the generator along with an attention module and a novel loss function. The network is trained on a multi-resolution thermal image dataset acquired with three different thermal sensors. Results show better performance than state-of-the-art methods on the benchmark of the 2nd CVPR-PBVS-2021 thermal image super-resolution challenge. The code of this work is available online.
This paper presents results from the second Thermal Image Super-Resolution (TISR) challenge organized in the framework of the Perception Beyond the Visible Spectrum (PBVS) 2021 workshop. For this second edition, the same thermal image dataset considered during the first challenge has been used; only mid-resolution (MR) and high-resolution (HR) sets have been considered. The dataset consists of 951 training images and 50 testing images for each resolution. A set of 20 images for each resolution is kept aside for evaluation. The two evaluation methodologies proposed for the first challenge are also considered in this edition. The first evaluation task consists of measuring the PSNR and SSIM between the obtained SR image and the corresponding ground truth (i.e., the HR thermal image downsampled by four). The second evaluation also consists of measuring the PSNR and SSIM, but in this case considers the ×2 SR image obtained from the given MR thermal image; this evaluation is performed between the SR image and the semi-registered HR image, which has been acquired with another camera. The results outperformed those from the first challenge, thus showing an improvement in both evaluation metrics.
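As a rough illustration of the PSNR part of the evaluation mentioned in this abstract, the sketch below uses the standard definition PSNR = 10 log10(MAX² / MSE). It assumes 8-bit images stored as 2D lists; the function name `psnr` and the `max_value` default are illustrative choices, not the challenge's actual evaluation code, and SSIM is not covered.

```python
import math

def psnr(sr, gt, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between a super-resolved image and its reference."""
    if len(sr) != len(gt) or any(len(a) != len(b) for a, b in zip(sr, gt)):
        raise ValueError("images must have the same shape")
    n_pixels = sum(len(row) for row in gt)
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2
              for row_sr, row_gt in zip(sr, gt)
              for a, b in zip(row_sr, row_gt)) / n_pixels
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value indicates a reconstruction closer to the reference; identical images yield infinity.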
This paper proposes a novel CycleGAN architecture for thermal image super-resolution, together with a large dataset consisting of thermal images at different resolutions. The dataset has been acquired using three thermal cameras at different resolutions, which acquire images from the same scenario at the same time. The thermal cameras are mounted on a rig that minimizes the baseline distance between them in order to ease the registration problem. The proposed architecture uses ResNet6 as the generator and PatchGAN as the discriminator. The proposed unsupervised super-resolution training (CycleGAN) is made possible by the aforementioned thermal images, that is, images of the same scenario at different resolutions. The proposed approach is evaluated on the dataset and compared with classical bicubic interpolation. The dataset and the network are available.
This paper summarizes the top contributions to the first challenge on thermal image super-resolution (TISR), which was organized as part of the Perception Beyond the Visible Spectrum (PBVS) 2020 workshop. In this challenge, a novel thermal image dataset is considered together with state-of-the-art approaches evaluated under a common framework. The dataset used in the challenge consists of 1021 thermal images, obtained from three distinct thermal cameras at different resolutions (low-resolution, mid-resolution, and high-resolution), resulting in a total of 3063 thermal images. From each resolution, 951 images are used for training and 50 for testing while the 20 remaining images are used for two proposed evaluations. The first evaluation consists of downsampling the low-resolution, mid-resolution, and high-resolution thermal images by ×2, ×3, and ×4 respectively, and comparing their super-resolution results with the corresponding ground truth images. The second evaluation consists of obtaining the ×2 super-resolution from a given mid-resolution thermal image and comparing it with the corresponding semi-registered high-resolution thermal image. Out of 51 registered participants, 6 teams reached the final validation phase.
This paper proposes the use of a CycleGAN architecture for thermal image super-resolution under a transfer domain strategy, where mid-resolution images from one camera are transferred to the higher-resolution domain of another camera. The proposed approach is trained with a large dataset acquired using three thermal cameras at different resolutions. An unsupervised learning process is followed to train the architecture. An additional loss function is proposed with the aim of improving on the results of state-of-the-art approaches. Evaluations are performed following the first thermal image super-resolution challenge (PBVS-CVPR2020). A comparison with previous works shows that the proposed approach achieves the best results.
Function-as-a-Service (FaaS) platforms enable users to execute user-defined functions without worrying about operational issues such as the management of infrastructure resources. To improve performance, different FaaS platforms are implementing optimizations and improvements, but it is not clear how effective these implementations are. In this work, the Apache OpenWhisk platform is evaluated using an approach that makes it possible to determine and characterize its performance under different configuration options; it was found that, under certain premises, cold-boot latency can be improved by up to 38%.
Due to the lack of thermal image datasets, a new thermal image dataset has been acquired and is used in a proposed super-resolution approach based on a deep convolutional neural network. Different experiments have been carried out: first, the proposed architecture was trained using only visible-spectrum images; later, it was trained with thermal-spectrum images. The results show that the network trained with thermal images produces better image enhancement, preserving image details and perspective. The thermal dataset is available at http://www.cidis.espol.edu.ec/es/dataset.
This paper presents details of a distributed platform intended for data acquisition, evaluation, storage and visualization, which is fully implemented under the crowdsourcing paradigm. The proposed platform is the result of a collaboration between computer science and petrology researchers and is intended for academic purposes. The platform follows an MTV (Model, Template and View) architecture and is designed for the collaborative storage and management of rock data by multiple readers and writers, taking advantage of the ubiquity of web applications and of the neutrality of researchers from different communities to validate the data. The platform is being used and validated by students and academics from our university; in the near future it will be opened to other users interested in this topic.
This work aims to implement a web application by jointly using the MDA (Model Driven Architecture) and MERODE (Model-driven, Existence-dependency Relation, Object-oriented DEvelopment) methodologies, in order to facilitate the administration of an event within a research group. It is worth mentioning that a product called AppVlir8, which is also a web portal for event administration, was previously implemented as a thesis project. Our idea is to revisit the original design of this software and modify it, correcting defects and adding improvements that have been identified while the software has been in production. The first chapter presents the background for revisiting the AppVlir8 project; the justification for developing the modules with the MERODE methodology, which helps to produce a domain-independent design and is fully compatible with MDA; and the objectives expected to be achieved. The second chapter sets out the theoretical foundations on which this degree project is based, justifying the use of each element included in its architecture. The third chapter documents the functional and non-functional requirements gathered to improve the design of the previous system, as well as the specification of the PIM and the PSM prior to the code transformation phase. The fourth chapter describes the changes made to the architecture of the previous system and the new features of the current one, introduced to satisfy some key requirements.
This dataset contains 101 images with a resolution of 640x512.
This dataset contains 200 cross-spectral images (visible and thermal images) taken at the same time of the same scene. They are registered using the Elastix library.
This dataset contains 1021 images (split into 951 for training, 50 for testing, and 20 for validation). It includes three resolutions, 160x120, 320x240, and 640x480, categorized as low, medium, and high.
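The 951/50/20 partition of the 1021 images described above could be reproduced along the following lines. This is a minimal sketch that assumes a seeded random split; how the published subsets were actually chosen is not stated, and the function name `split_dataset` and its parameters are hypothetical.

```python
import random

def split_dataset(image_ids, n_train=951, n_test=50, n_val=20, seed=0):
    """Partition image identifiers into disjoint train/test/validation subsets."""
    if len(image_ids) != n_train + n_test + n_val:
        raise ValueError("subset sizes must add up to the number of images")
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # assumed: random assignment with a fixed seed
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    val = ids[n_train + n_test:]
    return train, test, val
```

To keep the three resolutions aligned, the same identifier lists would be applied to each resolution folder, so that every subset holds the same scenes at 160x120, 320x240, and 640x480.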