Completed Research

This project focuses on visual attention as an approach to solving captioning tasks with computer vision. We have studied how different hyperparameter configurations affect a state-of-the-art visual attention architecture composed of a pre-trained residual neural network (ResNet) encoder and a long short-term memory (LSTM) decoder. Results show that the selection of both the cost function and the gradient-based optimizer has a significant impact on the captioning results. Our study considers the cross-entropy, Kullback-Leibler divergence, mean squared error, and negative log-likelihood loss functions, as well as the adaptive momentum (Adam), AdamW, RMSprop, stochastic gradient descent, and Adadelta optimizers. Based on the performance metrics, the combination of cross-entropy with Adam is identified as the best alternative, returning a Top-5 accuracy of 73.092 and a BLEU-4 score of 0.201. With cross-entropy fixed as the loss function, Adam and AdamW deliver the best performance, both reaching a BLEU-4 score of 0.201. In terms of inference loss, Adam outperforms AdamW with 3.413 versus 3.418, and with a Top-5 accuracy of 73.092 versus 72.989.
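
As an illustration of the best-performing configuration, a minimal PyTorch sketch of a ResNet encoder / LSTM decoder trained with cross-entropy and Adam is shown below. The class names, the ResNet-101 depth, the vocabulary size, and the training loop are illustrative assumptions, not the project's actual code.

```python
# Hypothetical sketch: pre-trained ResNet encoder + LSTM decoder trained with
# cross-entropy and Adam (the best-performing combination reported above).
# The ResNet-101 depth, vocabulary size, and all class names are assumptions.
import torch
import torch.nn as nn
import torchvision.models as models


class EncoderCNN(nn.Module):
    """Pre-trained ResNet backbone that maps an image to a feature embedding."""
    def __init__(self, embed_size=256):
        super().__init__()
        resnet = models.resnet101(weights=models.ResNet101_Weights.DEFAULT)
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])  # drop final FC
        self.fc = nn.Linear(resnet.fc.in_features, embed_size)

    def forward(self, images):
        return self.fc(self.backbone(images).flatten(1))


class DecoderLSTM(nn.Module):
    """LSTM decoder that predicts the next caption token at each step."""
    def __init__(self, embed_size=256, hidden_size=512, vocab_size=10_000):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        # Prepend the image feature as the first "token", then teacher-force.
        inputs = torch.cat([features.unsqueeze(1), self.embed(captions[:, :-1])], dim=1)
        hidden, _ = self.lstm(inputs)
        return self.fc(hidden)


encoder, decoder = EncoderCNN(), DecoderLSTM()
criterion = nn.CrossEntropyLoss()              # best-performing loss in the study
params = list(encoder.fc.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)  # best-performing optimizer

def train_step(images, captions):
    """One optimization step on a batch of images and tokenized captions."""
    optimizer.zero_grad()
    logits = decoder(encoder(images), captions)                  # (batch, seq, vocab)
    loss = criterion(logits.reshape(-1, logits.size(-1)), captions.reshape(-1))
    loss.backward()
    optimizer.step()
    return loss.item()
```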

The team:

Results presented on:

Source code:

Slides:

Computer vision to identify the incorrect use of face masks for COVID-19 awareness

Face mask detection has become a major challenge in computer vision, demanding that technology be combined with COVID-19 awareness. Researchers have proposed deep learning models to detect the use of face masks. However, the incorrect use of a face mask can be as harmful as not wearing any protection at all. In this thesis, we propose a compound convolutional neural network (CNN) architecture based on two computer vision tasks: object localization to discover faces in images and videos, followed by an image classification CNN that categorizes each face as wearing a face mask correctly, wearing it incorrectly, or not wearing a mask at all. The first CNN is built upon RetinaFace, a model to detect faces in images, whereas the second CNN uses a ResNet-152 architecture as its classification backbone. Our model enables accurate identification of people who are not correctly following the COVID-19 healthcare recommendations on face mask use. We have released our proposed computer vision model to the public and optimized it for deployment on embedded systems, enabling global use of our technology.
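
A minimal sketch of the two-stage pipeline is shown below. The detect_faces helper is only a stand-in for the RetinaFace detector, and the three-class head, preprocessing, and weights handling are assumptions rather than the released model.

```python
# Hypothetical sketch of the two-stage pipeline: face detection, then three-way
# classification (correct mask / incorrect mask / no mask). detect_faces() is a
# stand-in for the RetinaFace detector; weights and preprocessing are assumptions.
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

CLASSES = ["mask_correct", "mask_incorrect", "no_mask"]

# Stage 2: ResNet-152 backbone with a three-class head.
classifier = models.resnet152(weights=None)
classifier.fc = nn.Linear(classifier.fc.in_features, len(CLASSES))
classifier.eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def detect_faces(image):
    """Stage 1 placeholder: return face bounding boxes (x1, y1, x2, y2).
    In the actual project this role is played by RetinaFace."""
    raise NotImplementedError

def classify_faces(image_path):
    """Detect every face in the image and label its mask usage."""
    image = Image.open(image_path).convert("RGB")
    results = []
    for (x1, y1, x2, y2) in detect_faces(image):
        face = preprocess(image.crop((x1, y1, x2, y2))).unsqueeze(0)
        with torch.no_grad():
            label = CLASSES[classifier(face).argmax(dim=1).item()]
        results.append(((x1, y1, x2, y2), label))
    return results
```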

The team:

Results presented on:

Source code:

Slides:

Ongoing Research

Decentralized-based blockchain system for enhancing the management of scientific publications

The peer review process that scientific research must go through before being published or rejected has both strengths and weaknesses. On the positive side, other scientists evaluate the quality of the work to guarantee rigor and consistency, and the identity of the reviewers is not revealed, which helps keep the evaluation unbiased. However, the review process may delay publication for more than a year, and in some cases researchers must pay article processing charges. This proposal aims to improve the system by creating a decentralized, blockchain-based alternative in which the review of research is managed by the scientific community through smart contracts. The new system will provide stronger anonymity guarantees while keeping every step of the review process open and transparent. Reviewers will be rewarded with system tokens, and the authors of the manuscripts with the highest impact will also be rewarded with tokens. In this way, the review process is expected to be faster, fairer, safer, and more transparent. The goal of this graduation project is to write a white paper documenting the proposed system.
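
As a rough illustration of the intended reward workflow, the sketch below models reviews and token rewards in plain Python. The real system is meant to run as blockchain smart contracts; all names, token amounts, and rules here are assumptions for illustration only.

```python
# Hypothetical sketch of the review/reward workflow, modeled in plain Python.
# The actual system is intended to run as blockchain smart contracts; the class
# names, token amounts, and impact rule below are assumptions for illustration.
from dataclasses import dataclass, field

REVIEW_REWARD = 10   # tokens granted per completed review (assumed value)
IMPACT_REWARD = 50   # tokens granted to authors of high-impact manuscripts (assumed)

@dataclass
class Manuscript:
    title: str
    author: str
    reviews: list = field(default_factory=list)  # every review step stays openly recorded
    impact_score: float = 0.0

@dataclass
class TokenLedger:
    balances: dict = field(default_factory=dict)

    def reward(self, account: str, amount: int):
        self.balances[account] = self.balances.get(account, 0) + amount

def submit_review(ledger: TokenLedger, manuscript: Manuscript, reviewer_id: str, report: str):
    """Record a review transparently and reward the (pseudonymous) reviewer."""
    manuscript.reviews.append({"reviewer": reviewer_id, "report": report})
    ledger.reward(reviewer_id, REVIEW_REWARD)

def reward_high_impact(ledger: TokenLedger, manuscript: Manuscript, threshold: float = 0.8):
    """Reward the author once the manuscript's measured impact passes a threshold."""
    if manuscript.impact_score >= threshold:
        ledger.reward(manuscript.author, IMPACT_REWARD)
```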

The team:

Results presented on:

  • To be submitted

Source code:

  • Available soon

Natural language processing for collective learning curation

The team:

Hypothesis:

  • Human interaction empowers collective intelligence.

Resources:

Results:

  • To be submitted

Source code:

  • Available soon

Deep reinforcement learning for dynamic electromagnetic spectrum access

Required knowledge:

The team:

Results presented on:

  • To be submitted

Source code:

  • Available soon

Deep learning hyperparameter optimization via genetic algorithms

The team:

Results presented on:

  • To be submitted

Source code:

  • Available soon

Routing algorithm optimization for wireless sensor networks

Potential benchmarks:

The team:

Results presented on:

  • To be submitted

Source code:

  • Available soon

Human pose estimation: the impact of image processing in computer vision

The team:

Results presented on:

  • To be submitted

Source code:

  • Available soon

UAV control for object goal navigation

The team:

Results presented on:

  • To be submitted

Source code:

  • Available soon

Keypoint estimation and stereo vision for salmon size measurement

The team:

Results presented on:

  • To be submitted

Source code:

  • Available soon

Conditional adversarial networks for rhinoplasty results prediction

Technical requirements:

The student is expected to use a dataset of images from nose surgeries and create an image classification algorithm based on cGANs (conditional generative adversarial networks).
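
A minimal conditional GAN sketch in PyTorch is shown below as a starting point. The MLP architecture, image size, number of classes, and hyperparameters are illustrative assumptions, not the project's design.

```python
# Hypothetical conditional GAN sketch (MLP-based, 64x64 RGB images, 3 classes).
# All layer sizes, the class count, and the training data are illustrative assumptions.
import torch
import torch.nn as nn

IMG_DIM, N_CLASSES, Z_DIM = 64 * 64 * 3, 3, 100

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.label_embed = nn.Embedding(N_CLASSES, N_CLASSES)
        self.net = nn.Sequential(
            nn.Linear(Z_DIM + N_CLASSES, 512), nn.ReLU(),
            nn.Linear(512, IMG_DIM), nn.Tanh(),
        )

    def forward(self, z, labels):
        # Condition the generator on the class label by concatenating its embedding.
        return self.net(torch.cat([z, self.label_embed(labels)], dim=1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.label_embed = nn.Embedding(N_CLASSES, N_CLASSES)
        self.net = nn.Sequential(
            nn.Linear(IMG_DIM + N_CLASSES, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1), nn.Sigmoid(),
        )

    def forward(self, images, labels):
        return self.net(torch.cat([images, self.label_embed(labels)], dim=1))

G, D = Generator(), Discriminator()
bce = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(real_images, labels):
    """One adversarial update; real_images has shape (batch, 3, 64, 64)."""
    batch = real_images.size(0)
    real = real_images.view(batch, -1)
    fake = G(torch.randn(batch, Z_DIM), labels)

    # Discriminator: real images should score 1, generated images 0.
    opt_d.zero_grad()
    loss_d = bce(D(real, labels), torch.ones(batch, 1)) + \
             bce(D(fake.detach(), labels), torch.zeros(batch, 1))
    loss_d.backward()
    opt_d.step()

    # Generator: try to make the discriminator score generated images as real.
    opt_g.zero_grad()
    loss_g = bce(D(fake, labels), torch.ones(batch, 1))
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```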

The team:

Results presented on:

  • To be submitted

Source code:

  • Available soon

Open Research Calls for Yachay Students

We are looking for Final Graduation Project students. The students will be the main researchers supervised by a DeepARC team.

Online vs. offline computing: comparison, applications and use-cases

Required knowledge:

Technical requirements:

  • The student will compare the training complexity of machine learning models on offline hardware (e.g., NVIDIA Jetson Nano, Intel Neural Compute Stick 2, etc.) against currently available cloud computing hardware (e.g., Google Colab, Kaggle Kernels, Azure Notebooks, etc.); see the timing sketch below.
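
As a starting point, the same small training job could be timed unchanged on each platform and the wall-clock time per epoch compared. The toy model, synthetic data, and hyperparameters in the sketch below are assumptions for illustration.

```python
# Hypothetical timing harness: run the same small training job on each platform
# (e.g., Jetson Nano vs. a cloud notebook) and compare wall-clock time per epoch.
# The toy model, synthetic data, and hyperparameters are illustrative assumptions.
import time
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Synthetic stand-in data so the benchmark is identical on every platform.
inputs = torch.randn(10_000, 784)
targets = torch.randint(0, 10, (10_000,))

def time_epoch(batch_size=64):
    start = time.perf_counter()
    for i in range(0, len(inputs), batch_size):
        x = inputs[i:i + batch_size].to(device)
        y = targets[i:i + batch_size].to(device)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    return time.perf_counter() - start

print(f"{device}: one epoch took {time_epoch():.2f} s")
```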

The team:

Results presented on:

  • To be submitted

Source code:

  • Available soon

Artificial intelligence-based eye-tracking algorithm as a neuro-scientific instrument

Technical requirements:

  • Develop and optimize an eye-tracking algorithm using artificial intelligence.

  • Analyze the social impact of the developed technology.

Potential applications:

  • Marketing targeting on websites.

  • Eye-based computer control for people with disabilities.

  • Cognitive aptitude and perception enhancement in children.

  • Political message analysis.

The team:

Results presented on:

  • To be submitted

Source code:

  • Available soon

Social media data to empower an AI-based sentiment analysis model

Technical requirements:

  • Use information from a mobile app API (e.g., Twitter, Spotify, YouTube, Facebook, etc.) to develop an AI-based sentiment analysis model (see the sketch after this list).

  • Analyze the social impact of the developed technology.
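
As a minimal sketch of the modeling side, an off-the-shelf pretrained sentiment classifier can be applied to collected posts; the example texts below are placeholders, and real input would come from the chosen platform's API.

```python
# Hypothetical sketch: off-the-shelf sentiment analysis with Hugging Face transformers.
# The example texts are placeholders; real input would come from the platform's API.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")  # downloads a default pretrained model

posts = [
    "I love how easy this app is to use!",
    "The latest update keeps crashing on my phone.",
]

for post, result in zip(posts, sentiment(posts)):
    print(f"{result['label']:>8} ({result['score']:.2f})  {post}")
```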

Potential applications:

  • Political tendency prediction.

  • Recognition of social behavior patterns.

The team:

Results presented on:

  • To be submitted

Source code:

  • Available soon

Feel free to contact us if you are interested in being part of any of our teams. If you are an external professor/researcher, we are happy to collaborate. You can bring your own ideas and we will build a tailored research team for you to work with. If you are a Yachay Tech student, you are required to have passed a related lecture with a grade of 8/10 or higher. The maximum number of supervised students per semester is 4 (not including co-supervision). Student applications are evaluated until the first week of the semester (as long as the maximum number of students has not been reached).