Workshop 1

Place: Recife, Brazil

Date: November 26 - December 02, 2017

Attendees:

  • George DC Cavalcanti

  • Tsang Ing Ren

  • Thiago José Marques Moura

  • Rene Nobrega de Sousa Gadelha

  • Laurent Heutte

  • Roger Trullo

Activities

Monday, November 27, 2017

Tuesday, November 28, 2017

  • Working meetings: presentation of past and current research activities of NormaSTIC/LITIS and Centro de Informática/UFPE.

Wednesday, November 29, 2017

  • Dayvid VR Oliveira (PhD Student)

Title: Frienemy Indecision Region for Dynamic Ensemble Selection

Abstract: Dynamic Ensemble Selection (DES) techniques aim to select one or more competent classifiers for the classification of each new test sample. Most DES techniques estimate the competence of classifiers using a given criterion over the region of competence of the test sample, usually defined as the set of nearest neighbors of the test sample in the validation set. Despite being very effective in several classification tasks, DES techniques can select classifiers that classify all samples in the region of competence as being from the same class. To tackle this issue, we propose the Frienemy Indecision REgion DES (FIRE-DES) framework. If the test sample is located in an indecision region (a region with samples from different classes), FIRE-DES pre-selects the classifiers that correctly classify at least one pair of samples from different classes in the region of competence. We also propose an enhanced version of the framework, FIRE-DES++, in which the validation set is filtered using prototype selection, reducing noise sensitivity, and the region of competence is defined using the K-Nearest Neighbors Equality (KNNE) rule, which ensures that all classes are represented in the region of competence. The results show that FIRE-DES improves the performance of several DES techniques and that FIRE-DES++ statistically outperforms state-of-the-art DES frameworks.
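The frienemy pre-selection step described above can be sketched as follows. This is a minimal illustration of the idea only (function and class structure are ours, not from the original work), assuming each classifier in the pool exposes a `predict` method:

```python
import numpy as np

def fire_pre_select(classifiers, X_roc, y_roc):
    """Illustrative FIRE-DES-style pre-selection: given the region of
    competence (X_roc, y_roc) of a test sample, keep only classifiers
    that correctly classify at least one pair of samples from different
    classes ("frienemies")."""
    # If the region of competence is not an indecision region
    # (only one class present), keep the whole pool.
    if len(np.unique(y_roc)) == 1:
        return list(classifiers)
    selected = []
    for clf in classifiers:
        preds = np.asarray(clf.predict(X_roc))
        correct = preds == y_roc
        # classes on which this classifier has at least one correct hit
        hit_classes = np.unique(y_roc[correct])
        # a correctly classified cross-class pair exists iff the
        # classifier is correct on samples of two or more classes
        if len(hit_classes) >= 2:
            selected.append(clf)
    # fall back to the full pool if no classifier crosses the border
    return selected if selected else list(classifiers)
```

A classifier that predicts a single class for the whole region is filtered out, while one whose decision border actually crosses the region survives.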

  • Fidel Alejandro Guerrero Pena (PhD Student)

Title: Image segmentation with shape prior learning

Abstract: Modern semantic segmentation approaches based on Deep Neural Networks perform pixel-wise classification. The problem with these methods is that they perform poorly at contour extraction when only weakly textured or low-contrast images are available. Moreover, in Fully Convolutional Networks (FCNs), each pixel is classified based on its relationship with its neighbors at different scales. This kind of local-feature approach fails to learn global features, such as object shape, when the object scale exceeds the depth of the encoder. The objective of this work is to develop an FCN with shape prior learning capability for semantic segmentation.

  • Hector NB Pinheiro (PhD Student)

Title: Robust feature extraction techniques for speaker recognition

Abstract: Personal identification is an essential task in several sectors of our present society. Financial transactions, claims of social benefits, access to restricted environments and resources, and credit card purchases are just a few of the many operations in which personal identification is necessary. Individual physical or behavioral characteristics have been increasingly used for personal identification. Some examples are fingerprints, face, iris, handwriting, and voice. Speaker recognition is a biometric modality that performs personal identification using only the information in the individual's voice. The major challenge in developing such systems comes from the many factors that may influence the acquisition of the voice signals. The distortions caused by these factors are commonly referred to as session mismatches. The type of microphone, the acoustic background noise, and the speaker's physical condition are just a few examples of sources of such mismatches. The process of mitigating mismatches is called compensation. This work proposes the combination of different approaches for feature extraction and compensation for the text-independent speaker verification task, in which the system authenticates an individual regardless of the words pronounced. In the experiments, thousands of speakers are used in a comparative analysis of combinations of the main techniques in the literature. The evaluation is performed using so-called i-vector modeling, which maps a given utterance, described by feature vectors, into a fixed-length representation that preserves the useful information about the speaker. Probabilistic models are then estimated for each speaker using Probabilistic Linear Discriminant Analysis (PLDA). In this work, we also propose a new method for i-vector comparison based on Missing Data Theory, in which only a subset of the i-vector components is used for the comparison. The choice of components is performed with a linear-complexity search algorithm that maximizes the comparison score between the two i-vectors. This work also describes the main trends for feature extraction and modeling using the Deep Learning approach. Finally, the main ideas for the conclusion of this work are listed. These ideas suggest the development of neural networks capable of learning robust characteristics by incorporating the compensation process into network training.
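As an illustration only (the thesis's actual search algorithm is not described here), a linear-time component selection for a dot-product-style comparison score might look like the following, where a component is kept only when it contributes positively to the score:

```python
import numpy as np

def masked_score(iv_a, iv_b):
    """Hypothetical sketch of missing-data-style i-vector comparison:
    select, in linear time, the subset of components that maximizes a
    dot-product comparison score between two i-vectors."""
    contrib = iv_a * iv_b      # per-component contribution to the score
    mask = contrib > 0         # linear-time component selection
    return float(contrib[mask].sum()), mask
```

For a dot-product score this greedy per-component rule is exactly optimal; for normalized scores such as cosine similarity the actual selection procedure would be more involved.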

  • Leonardo Valeriano Neri (PhD Student)

Title: Singular Features for Speaker Segmentation

Abstract: This work proposes a speaker segmentation method that applies Singular Value Decomposition (SVD) to the scaled Mel-Frequency Cepstral Coefficients (MFCCs) as a feature extractor. The left singular vectors form a speaker representation containing properties of utterance generation, such as the style and intonation of the spoken content, and we define a feature vector from that representation. A sequential speaker segmentation method uses this feature to detect multiple speaker turns in a speech segment. Experiments using the AMI dataset show an improvement in precision and F1 measure compared to approaches that use Gaussian Mixture Models (GMMs) and identity vectors (i-vectors).
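A minimal sketch of the feature extraction step, assuming MFCCs arranged as a coefficients-by-frames matrix (the scaling and the exact feature construction used in the work may differ):

```python
import numpy as np

def singular_speaker_features(mfcc, k=3):
    """Illustrative sketch: take the top-k left singular vectors of a
    scaled MFCC matrix as a fixed-size speaker representation.
    mfcc: array of shape (n_coeffs, n_frames)."""
    # scale each cepstral coefficient to zero mean / unit variance
    mu = mfcc.mean(axis=1, keepdims=True)
    sd = mfcc.std(axis=1, keepdims=True) + 1e-8
    scaled = (mfcc - mu) / sd
    # left singular vectors capture coefficient-space structure
    U, s, Vt = np.linalg.svd(scaled, full_matrices=False)
    # stack the leading left singular vectors into one feature vector
    return U[:, :k].ravel()
```

A sequential segmenter would compute this feature over sliding windows and flag a speaker turn where consecutive feature vectors diverge.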

  • Luis FA Pereira (PhD Student)

Title: Development of a fast and cost-effective Computed Tomography system for industrial environments by incorporating priors into the image reconstruction workflow

Abstract: Conventional X-ray radiography has been extensively used for inspection and quality assurance of industrial products. However, 2-D X-ray radiography cannot provide quantitative three-dimensional information about the scanned object. In order to obtain such depth information, X-ray Computed Tomography (CT) must be applied. Nevertheless, conventional CT systems (in which the X-ray source and detector rotate around the target object) are cost-ineffective, inflexible, and suffer from long acquisition times. The deployment of such technology is therefore infeasible for many industrial environments, where high throughput is required along with a good cost-benefit ratio. The main goal of this research is to design a simple, cost-effective, high-throughput X-ray CT imaging system for industrial environments. To this end, software-based improvements to the CT workflow are evaluated to deal with the under-sampled scenarios created by a fast industrial scanning setup. The basic strategy is to incorporate prior knowledge about the scanned objects into distinct stages of the CT workflow: pre-processing, reconstruction, and post-processing.

  • Mariana de Araújo Souza (MSc Student)

Title: An Online Local Pool Generation Method for Dynamic Classifier Selection

Abstract: Dynamic Classifier Selection (DCS) techniques have difficulty selecting the most competent classifier in a pool, even when its presence is assured. Since DCS techniques rely only on local data to estimate a classifier's competence, the manner in which the pool is generated can affect the choice of the best classifier for a given instance. That is, the global perspective in which pools are generated may not help the DCS techniques select a competent classifier for instances that are likely to be misclassified. Thus, we propose an online pool generation method that produces a locally accurate pool for test samples in overlap regions of the feature space. By using classifiers generated in a local scope, it becomes easier for the DCS techniques to select the best one for the instances they would most probably misclassify. For instances that are far from the class borders, the proposed method uses a simple nearest-neighbors rule. Experimental results show that, with a local perspective for generating the pool of classifiers, not only do the DCS techniques select the best classifier more often, but their recognition rates also increase considerably.
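The decision logic described above might be sketched as follows, assuming scikit-learn is available. The local pool scheme (bagged shallow trees) and the majority vote are our stand-ins for the method's locally generated pool and DCS selection step, not the actual algorithm:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def local_predict(x, X_train, y_train, k=7, n_local=5):
    """Sketch of the online local pool idea: if the test sample's
    neighborhood contains a single class, use a plain nearest-neighbors
    rule; otherwise (overlap region) train a small pool of classifiers
    on the local region and combine their outputs."""
    dists = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(dists)[:k]
    if len(np.unique(y_train[idx])) == 1:
        # far from class borders: nearest-neighbors rule suffices
        return y_train[idx][0]
    # overlap region: build a locally generated pool (bagged trees here)
    rng = np.random.default_rng(0)
    votes = []
    for _ in range(n_local):
        boot = rng.choice(idx, size=len(idx), replace=True)
        clf = DecisionTreeClassifier(max_depth=2)
        clf.fit(X_train[boot], y_train[boot])
        votes.append(clf.predict(x.reshape(1, -1))[0])
    vals, counts = np.unique(votes, return_counts=True)
    return vals[np.argmax(counts)]
```

The key point mirrored from the abstract is the branch: a cheap global rule away from borders, a locally trained pool inside overlap regions.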

  • Pedro Diamel Marrero Fernandez (PhD Student)

Title: Data augmentation for rigid non-deformable objects detection

Abstract: Small training datasets are a common problem when training Deep Neural Networks. The objective of this work is to propose a new method for focal training that uses rendering for data augmentation and generation, applied to the detection of rigid, non-deformable objects. Networks are trained solely on synthetic datasets generated from a few examples of the objects of interest. The loss function guides the renderer in the process of generating new objects. The render function adds object-based transformations such as illumination, blur, shadow, and background changes, in addition to classical geometric transformations. Experiments with the ColorChecker and plumbing-object detection datasets suggest that a good approximation of the input data distribution can be obtained.
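For illustration, a classical (non-render-based) augmentation step of the kind mentioned can be sketched as below; the render pipeline's shadow and background changes are not reproduced, and the parameter ranges are arbitrary:

```python
import numpy as np

def augment(image, rng):
    """Minimal sketch of photometric + geometric augmentation on a
    grayscale image: illumination scaling, random horizontal flip, and
    mild additive noise (a crude stand-in for blur)."""
    out = image.astype(float)
    out *= rng.uniform(0.7, 1.3)           # illumination change
    if rng.random() < 0.5:
        out = out[:, ::-1]                 # horizontal flip
    out += rng.normal(0.0, 2.0, out.shape) # mild noise
    return np.clip(out, 0, 255)            # keep a valid pixel range
```

In the talk's setting these perturbations are applied at render time and steered by the detection loss, rather than sampled blindly as here.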

Thursday, November 30, 2017

  • Prof. Laurent Heutte

Title: Random Forests for Biomedical Data Classification

Abstract: Learning robust machine learning models is still a challenging issue when classifying biomedical data. In order to deal with high dimensionality, low sample size, and imbalanced classes, Random Forests (RF), which consist of building a classifier ensemble with randomization to produce a diverse pool of tree-based classifiers, have been widely adopted in this field. In this talk, I will illustrate the use of RF on two medical applications: the classification of endomicroscopic images of the lungs, and cancer stage/patient prediction with Radiomics, a domain that is increasingly attracting attention. When dealing with medical data, it may happen that only data of one class (e.g., healthy patients) is available for training. This is typically the case for endomicroscopic images of the lungs, and we have proposed an original approach to deal with outliers in medical image classification, namely One-Class Random Forests, which has been shown to be effective for our problem and competitive with other state-of-the-art one-class classifiers. The second application of RF is Radiomics, a concept introduced in 2012 that refers to the analysis of large amounts of quantitative tumor features, extracted from multimodal medical images together with other information such as clinical data and gene or protein data, to predict the patient's evolution and/or survival rate. In this case, the data are both high-dimensional and heterogeneous. As part of ongoing work, we have proposed a dissimilarity-based multi-view learning model with random forests, in which each data view (or group of features) is processed separately so that the data dimension is smaller in each view. By combining different views, we can take advantage of the heterogeneity between views while avoiding conventional feature selection methods for reducing the high dimensionality of the data.

  • Roger Trullo (PhD Student)

Title: Segmentation of Multiple Organs at Risk in Thoracic CT Images using Deep Anatomical Learning

Abstract: Cancer is one of the leading causes of death worldwide. Radiotherapy is a standard treatment for this condition, and the first step of the radiotherapy process is to identify the target volumes to be treated and the healthy organs at risk (OARs) to be protected, especially in cancers such as lymphoma, esophageal cancer, or lung cancer. Computed Tomography (CT) is a standard imaging technique for radiotherapy planning. However, due to low contrast, multi-organ segmentation is a challenge. In this work, we propose a novel framework for the automatic delineation of OARs in thoracic imaging. Different from previous works on OAR segmentation, where each organ is segmented separately, we propose two collaborative deep architectures to jointly segment all organs, including the esophagus, heart, aorta, and trachea. Since most of the organ borders are ill-defined, we believe spatial relationships between the organs must be taken into account to overcome the lack of contrast. The aim of combining two networks is to learn anatomical constraints with the first network and to use them in the second network to segment each OAR. As the first network, we use a deep SharpMask architecture, which provides an effective combination of low-level representations with deep high-level features. Spatial relationships between organs are taken into account by Conditional Random Fields (CRFs). The second deep network then refines the segmentation of each organ by using the maps obtained from the first network to learn anatomical constraints for guiding and refining the segmentations. Experimental results show favorable performance on 60 CT scans compared with other state-of-the-art methods.

Title: Medical Image Synthesis with Deep Convolutional Adversarial Networks

Abstract: Medical imaging plays a critical role in various clinical applications. However, due to considerations such as cost and radiation dose, the acquisition of certain image modalities may be limited. Thus, medical image synthesis can be of great benefit by estimating a desired imaging modality without incurring an actual scan. In this work, we propose a generative adversarial approach to address this challenging problem. Specifically, we train a fully convolutional network (FCN) to generate a target image given a source image. To better model the nonlinear mapping from source to target and to produce more realistic target images, we use an adversarial training strategy to refine the FCN. Moreover, the FCN incorporates an image-gradient-difference loss function to avoid generating blurry target images. A long-term residual unit is also explored to help the training of the network. We further apply the Auto-Context Model (ACM) to implement a context-aware deep convolutional adversarial network. Experimental results show that our method is accurate and robust for synthesizing target images from the corresponding source images. In particular, we evaluate our method on three datasets, addressing the tasks of generating CT from MRI and generating 7T MRI from 3T MRI images. Our method outperforms the state-of-the-art methods under comparison on all datasets and tasks.
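The image-gradient-difference idea can be illustrated with a simple L1 form on 2-D images (the exact loss formulation used in the work may differ):

```python
import numpy as np

def gradient_difference_loss(pred, target):
    """Sketch of an image-gradient-difference loss (assumed L1 form):
    penalizes differences between the spatial gradients of the
    synthesized and target images, discouraging blurry outputs whose
    edges are weaker than the target's."""
    # finite-difference gradients along each image axis
    dy_p, dx_p = np.gradient(pred)
    dy_t, dx_t = np.gradient(target)
    return float(np.mean(np.abs(dy_p - dy_t)) +
                 np.mean(np.abs(dx_p - dx_t)))
```

Note that a constant intensity offset leaves the loss at zero: the term compares edge structure, not absolute intensity, which is why it is combined with a pixel-wise loss in practice.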

Friday, December 01, 2017

  • Definition of the schedule and program for the next meeting that will take place in Rouen-France (January/2018).

Saturday, December 02, 2017

  • Trip back home