3rd International Conference on Cloud and Big Data (CLBD 2022)

November 12 ~ 13, 2022, Chennai, India

Accepted Papers

Generic and Accessible Gesture-Controlled Augmented Reality Platform

Arya Rajiv Chaloli, K Anjali Kamath, Divya T Puranam and Prof. Preet Kanwal, Department of Computer Science, PES University, Bangalore

ABSTRACT

Augmented Reality (AR) is one of the most popular trends in technology today. It has become more accessible as new smartphones and other devices equipped with depth-sensing cameras and other AR-related technologies are introduced into the market. AR lets the user view the real-world environment with virtual objects augmented into it. With advances in AR technology, Augmented Reality is being used in a variety of applications, from medicine to games like Pokémon Go, and to retail and shopping applications that allow us to try on clothes and accessories from the comfort of our homes. At present, touch-free interaction is especially important due to the widespread COVID-19 pandemic, so an application that removes the need to touch a screen is highly relevant. In such a scenario, AR comes into play: it converts the contents of a physical screen into a virtual object that is seamlessly augmented into reality, and the user interacts only with the virtual object, avoiding any need to touch the actual physical screen and making the whole system touch-free. Hence, this paper proposes an Augmented Reality application system that can be integrated into our day-to-day life. With the advent of technology and the digitization of almost all interfaces around us, the opportunity to augment these digitized resources has grown. To demonstrate a generic application that can be used with any digitized resource, the project implements an AR system that can control a personal computer (PC) remotely through hand gestures made by the user on the augmented interface. The proposed methodology aims to build a system that is both generic and accessible. The application is made generic by enabling the AR system to provide an AR interface to any application that can run on a regular PC. Accessibility is improved by making the system work on any normal smartphone with a regular camera; there is no need for a depth-sensing camera, which popular AR toolkits such as ARCore and ARKit require.
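
The abstract does not specify the gesture-recognition pipeline; purely as an illustration of how hand gestures can be detected with a regular RGB camera (no depth sensor), the following Python sketch uses the open-source MediaPipe Hands and OpenCV libraries. The fingertip-to-cursor mapping is a hypothetical simplification, not the authors' implementation.

    # Illustrative sketch: detect the index fingertip from a regular webcam feed
    # using MediaPipe Hands; the coordinates would drive the augmented interface.
    import cv2
    import mediapipe as mp

    hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.6)
    cap = cv2.VideoCapture(0)

    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.multi_hand_landmarks:
            # Landmark 8 is the index fingertip in MediaPipe's hand model.
            tip = result.multi_hand_landmarks[0].landmark[8]
            h, w, _ = frame.shape
            x, y = int(tip.x * w), int(tip.y * h)
            print("index fingertip at", x, y)  # hypothetical cursor position
        cv2.imshow("camera", frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()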

KEYWORDS

Augmented Reality, Gesture Recognition, Generic, Application, Technology, Accessibility.


Performance Analysis of Supervised Learning Algorithms on Different Applications

Vijayalakshmi Sarraju1, Jaya Pal2 and Supreeti Kamilya3, 1Department of Computer Science and Engineering, BIT Extension Centre Lalpur, 2Department of Computer Science and Engineering, BIT Extension Centre Lalpur, 3Department of Computer Science and Engineering, BIT Mesra

ABSTRACT

In the current era of computation, machine learning is the most commonly used technique for finding patterns in highly complex datasets. The present paper examines some existing applications, such as stock data mining, undergraduate admission, and breast lesion detection, where different supervised machine learning algorithms are used to classify various patterns. A performance analysis, in terms of accuracy, precision, sensitivity, and specificity, is given for all three applications. It is observed that the support vector machine (SVM) is the most commonly used supervised learning method and shows good performance on these metrics. A comparative analysis of SVM classifiers on the above-mentioned applications is presented in the paper.
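
For readers unfamiliar with the four reported metrics, a minimal sketch of how they can be computed from a binary confusion matrix is shown below (using scikit-learn and NumPy; the labels are placeholders, not data from the paper).

    # Accuracy, precision, sensitivity (recall) and specificity from a confusion matrix.
    import numpy as np
    from sklearn.metrics import confusion_matrix

    y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])   # placeholder ground-truth labels
    y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])   # placeholder predictions

    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    precision   = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    print(accuracy, precision, sensitivity, specificity)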

KEYWORDS

Supervised learning algorithms, stock data mining, undergraduate admission scheme, breast lesion detection, performance analysis.


Classifying Celeste as NP-Complete

Zeeshan Ahmed, Alapan Chaudhuri, Kunwar Grover, Ashwin Rao, Kushagra Garg and Pulak Malhotra, International Institute of Information Technology, Hyderabad

ABSTRACT

We analyze the computational complexity of the video game "CELESTE" and prove that solving a generalized level in it is NP-Complete. Further, we show that introducing a small change in the game makes it harder (PSPACE-complete, to be precise).

KEYWORDS

Complexity analysis, NP completeness, algorithmic analysis, game analysis.


Breast Cancer Prediction and Early Diagnosis Using Machine Learning Techniques

Vanita Parmar1 and Saket Swarndeep2, 1Post Graduate Scholar, Post Graduate Department, L.J. Institute of Engineering and Technology, L J University, 2Assistant Professor, L.J. Institute of Engineering and Technology, L J University

ABSTRACT

Breast cancer is one of the diseases that cause a high number of deaths every year. Breast cancer occurs in women and, rarely, in men. Symptoms of breast cancer can include bloody discharge from the nipple, a change in the shape or composition of the nipple or breast, and a lump in the breast. Here we use classification and machine learning methods to classify data into different categories for predicting breast cancer. The main purpose of this study is to use machine learning algorithms such as Support Vector Machine (SVM), Logistic Regression, Random Forest, k-Nearest Neighbours (k-NN) and Naïve Bayes to predict breast cancer with better accuracy, precision, sensitivity and specificity. When training the model, we remove unnecessary duplicate data from the dataset for greater accuracy, and we use a large dataset to build a more effective model. We conduct experiments showing that this technique substantially increases the prediction time on large datasets.
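
As an illustration only (the paper's own dataset and preprocessing are not described here), the classifier comparison could be prototyped with scikit-learn's bundled Wisconsin breast cancer data roughly as follows.

    # Minimal sketch: compare several classifiers on the Wisconsin breast cancer dataset.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC
    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.naive_bayes import GaussianNB
    from sklearn.metrics import accuracy_score

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    models = {
        "SVM": make_pipeline(StandardScaler(), SVC()),
        "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        "Random Forest": RandomForestClassifier(random_state=0),
        "k-NN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
        "Naive Bayes": GaussianNB(),
    }
    for name, model in models.items():
        model.fit(X_train, y_train)
        print(name, accuracy_score(y_test, model.predict(X_test)))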

KEYWORDS

SVM, Logistic Regression, Decision Tree, Random Forest, KNN, Naïve Bayes.


Survey on Online Learning Analysis with Students' Emotions Using Machine Learning and Learning Analytics

Sophiya Mathews1 and Dr. D. John Aravindhar2, 1Assistant Professor, Department of Computer Science, SNMIMT, 2Professor, Department of CSE, HITS, Padur, Chennai

ABSTRACT

The COVID-19 pandemic has disrupted many aspects of life, including university training programs all around the world. As a result, many universities now employ online learning as a viable option. However, not all training institutions, particularly in low-resource developing nations, have adequate facilities, resources, or experience to conduct online learning. Many universities are therefore grappling with the challenge of designing conventional (face-to-face), e-learning, or blended learning courses under limited circumstances that still suit students' requirements. On this basis, this survey concentrates on predicting students' level of understanding in online learning through machine learning and learning analytics. The paper recognizes students' emotions through facial expression detection and then applies learning analytics in order to predict students' level of understanding and to support decisions about the online learning system.

KEYWORDS

Online learning, Learning analytics, Machine learning, Educational data mining, facial emotions, E-learning.


Comparison of Sequence Models for Text Narration from Tabular Data

Mayank Lohani1, Rohan Dasari2, Praveen Thenraj Gunasekaran3, Selvakuberan Karuppasamy4, Subhashini Lakshminarayanan5, 1Data and AI, Advance Technology Centers in India, Accenture, Gurugram, India, 2Data and AI, Advance Technology Centers in India, Accenture, Hyderabad, India, 3Data and AI, Advance Technology Centers in India, Accenture, Chennai, India, 4Data and AI, Advance Technology Centers in India, Accenture, Chennai, India, 5Data and AI, Advance Technology Centers in India, Accenture, Chennai, India

ABSTRACT

This paper presents a survey of sequence models for generating text from tabular data. Narrating meaningful information from tables or other data sources is an integral part of the daily routine of understanding context. The manual approach, in which we eyeball the source and interpret the content, costs time and human effort. In this era of the internet, where data is growing exponentially and technology is improving massively, we propose an NLP-based approach that generates meaningful text from tables without human intervention. We propose transformer-based models with the goal of generating natural, human-interpretable text from input tables: pre-trained transformer models are fine-tuned on structured, context-rich tables and their respective summaries. We present a comprehensive comparison between different transformer-based models and conclude with key points and a future research roadmap.
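
As a hedged illustration of the table-to-text setup (the specific models and training data compared in the paper are not reproduced here), a table can be linearized into a source string and passed to a pre-trained sequence-to-sequence transformer such as T5 via the Hugging Face transformers library; the prompt format and checkpoint below are assumptions.

    # Sketch: linearize a small table and generate a narration with a T5 model.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    table = {"Player": "R. Sharma", "Runs": "264", "Opponent": "Sri Lanka"}
    source = "summarize table: " + " | ".join(f"{k}: {v}" for k, v in table.items())

    tokenizer = AutoTokenizer.from_pretrained("t5-small")
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

    inputs = tokenizer(source, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Without fine-tuning on table-summary pairs the output is of course rough; the paper's contribution is precisely the comparison of such models after training on structured, context-rich tables.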

KEYWORDS

Survey, NLP, Transformers, Table to Text.


Optimizing the Parameters of Machine Learning Techniques Using Bat Algorithm in the Analysis of Skin Cancer

P. Gokila Brindha1, Dr. R. Rajalaxmi2, N. Azhaku Mohana1 and R. Preethika1, 1Department of Computer Technology - UG, Kongu Engineering College, Perundurai, 2Department of Computer Science and Engineering, Kongu Engineering College, Perundurai

ABSTRACT

The growth of abnormal cells in sun-exposed areas of the skin leads to skin cancer. It is diagnosed by physical examination and biopsy. Generally, there are three types of skin cancer, namely basal cell carcinoma, melanoma and squamous cell carcinoma. It mostly appears on areas of the body such as the ears, lips, face, arms, chest, scalp, palms, toenails and fingernails. The symptoms are lesions, mole color changes, redness, darkening of the skin, or an irregular border on a skin mole. Many machine learning and deep learning methods have been applied in the analysis of skin cancer. In this research work, VGG16, a pretrained CNN model, is used to extract features from the images, and these features are fed to machine learning techniques for classification. Machine learning techniques such as Decision Tree, K-Nearest Neighbor (KNN), Logistic Regression, Random Forest and Support Vector Machine (SVM) are used. The Bat algorithm is used to find the optimal hyperparameters of these machine learning models so as to obtain accurate predictions. The proposed method helps in finding the cancer at an early stage, and the performance of the algorithms is compared based on accuracy.
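
A minimal sketch of the feature-extraction step, assuming Keras's pre-trained VGG16 with the classification head removed; the downstream classifier shown (an SVM) stands in for any of the listed techniques, the image paths are placeholders, and the Bat-algorithm hyperparameter search itself is not reproduced here.

    # Sketch: extract VGG16 features from skin-lesion images, then train an SVM.
    import numpy as np
    from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
    from tensorflow.keras.preprocessing import image
    from sklearn.svm import SVC

    extractor = VGG16(weights="imagenet", include_top=False, pooling="avg")  # 512-d features

    def features(path):
        img = image.load_img(path, target_size=(224, 224))
        x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
        return extractor.predict(x, verbose=0)[0]

    paths = ["benign_01.jpg", "malignant_01.jpg"]   # placeholder image paths
    labels = [0, 1]                                  # 0 = benign, 1 = malignant
    X = np.array([features(p) for p in paths])

    clf = SVC(C=1.0, gamma="scale")   # hyperparameters the Bat algorithm would tune
    clf.fit(X, labels)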

KEYWORDS

Machine Learning Techniques, Benign, Malignant, Augmentation, Bat algorithm, VGG-16.


Applying Named Entity Recognition on Management Literature: Identifying Construct Entities and Proposing Better Recommendations for Research

Abhijeet Bhattacharya and Ahmed Doha, Sprott School of Management, Carleton University, Ottawa, Canada

ABSTRACT

Finding useful information in the vast amounts of unstructured text in online corpora of management and other academic papers is becoming a more pressing problem as these collections grow over time. This study aims to address this issue by utilising Named Entity Recognition (NER) to identify and extract the various construct entities and mentions found in research paper abstracts. The paper presents an NER model based on the SCIBERT transformer architecture, fine-tuned on a database of over 50,000 research paper abstracts compiled from the top ABDC management journals. Based on our labelled training dataset and annotation scheme, our SCIBERT model achieves an F1 score of 73.5% and outperforms baseline models such as BERT and BiLSTM-CNN. The study offers a perspective on using Natural Language Processing (NLP) in management research by providing a framework for better recommendations in the future and by automating the knowledge extraction process.
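
A rough sketch of how such a token-classification model could be set up with the Hugging Face transformers library and the public SCIBERT checkpoint; the label set shown (a simple BIO scheme for construct mentions) is hypothetical, and the authors' annotation scheme and training data are not reproduced.

    # Sketch: SCIBERT token-classification setup for construct-entity NER (labels are hypothetical).
    from transformers import AutoTokenizer, AutoModelForTokenClassification

    labels = ["O", "B-CONSTRUCT", "I-CONSTRUCT"]   # illustrative BIO tag set
    tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
    model = AutoModelForTokenClassification.from_pretrained(
        "allenai/scibert_scivocab_uncased",
        num_labels=len(labels),
        id2label=dict(enumerate(labels)),
    )

    text = "Absorptive capacity moderates the effect of supply chain integration on performance."
    inputs = tokenizer(text, return_tensors="pt")
    logits = model(**inputs).logits           # shape: (1, num_tokens, num_labels)
    predictions = logits.argmax(dim=-1)       # per-token label ids (model untrained here)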

KEYWORDS

Named Entity Recognition, Transformers, BERT, SCIBERT, Management Knowledge Discovery.


Screening Deep Learning Inference Accelerators at the Production Lines

Ashish Sharma, Puneesh Khanna and Jaimin Maniyar, AI Group, Intel, Bangalore, India

ABSTRACT

Artificial Intelligence (AI) accelerators can be divided into two main categories, one for training and another for inference over trained models. The computation results of AI inference chipsets are expected to be deterministic for a given input. There are different compute engines on the inference chip that help accelerate arithmetic operations. The inference output results are compared with a golden reference output for accuracy measurement. Many errors can occur during inference execution. These errors could be due to faulty hardware units, and such units should be thoroughly screened on the assembly line before customers deploy them in the data centre. This paper describes a generic inference application developed to execute inferences over multiple inputs for various real inference models and to stress all the compute engines of the inference chip. Inference outputs from a specific inference unit are stored, assumed to be golden, and further confirmed as golden statistically. Once the golden reference outputs are established, the inference application is deployed in pre- and post-production environments to screen out defective units whose actual outputs do not match the reference. This strategy of comparing the product against itself at mass scale achieved the Defects Per Million target for the customers.
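
The screening idea of comparing a unit's inference outputs against stored golden references can be illustrated with a short NumPy sketch; the tolerance, file names and tensor layout here are assumptions, not the production criteria used by the authors.

    # Sketch: flag a unit as defective if its inference outputs deviate from golden outputs.
    import numpy as np

    def screen_unit(unit_outputs, golden_outputs, atol=1e-3):
        """Return True if every output tensor matches its golden reference within atol."""
        for name, golden in golden_outputs.items():
            actual = unit_outputs[name]
            if not np.allclose(actual, golden, atol=atol):
                print(f"mismatch in {name}: max abs diff = {np.abs(actual - golden).max()}")
                return False
        return True

    golden = {"resnet50_batch0": np.load("golden_resnet50_batch0.npy")}   # stored reference
    actual = {"resnet50_batch0": np.load("unit_resnet50_batch0.npy")}     # unit under test
    print("PASS" if screen_unit(actual, golden) else "FAIL")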

KEYWORDS

Artificial Intelligence, Deep Learning, Inference, Neural Network Processor for Inference (NNP-I), ICE, DELPHI, DSP, SRAM, ICEBO, IFMs, OFMs, DPMO.


Resource Efficient TCAM Implementation using SRAM

Madhu Ennampelli, Kuruvilla Varghese, Senior Member, IEEE, Department of Electronic Systems Engineering, IISc, Bangalore, India

ABSTRACT

Although Ternary Content Addressable Memories (TCAMs) are faster in operation than Static Random Access Memories (SRAMs), TCAMs have disadvantages such as high power consumption, low bit density, high cost and complex circuitry. This paper presents a novel approach to bringing SRAM advantages to TCAMs using the technique of hybrid partitioning. In hybrid partitioning, the TCAM table is divided both horizontally and vertically, and the partitioned blocks are directly mapped to their corresponding SRAM cells. To validate the functionality and performance of the design, a 64 x 32 SRAM-based efficient TCAM is designed on a Xilinx Field-Programmable Gate Array (FPGA) using Verilog HDL. The design parameters improved across the board compared to the performance reported in [1]: speed by 10.52%, power by 7.62%, and resource utilization by 50%. An application-specific integrated circuit (ASIC) implementation of the architecture is also designed on the 45-nm CMOS technology node to check the feasibility of the design on chip, and, as with the FPGA-based architecture, the performance of the ASIC implementation is also improved.
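
The hardware itself is written in Verilog, but the underlying SRAM-based matching idea (each vertical slice of the search key indexes an SRAM word holding a bit-vector of matching TCAM entries, and the per-slice vectors are ANDed) can be sketched behaviorally in Python. This only models the vertical-partitioning part of the technique, and the 8-bit/2-partition sizing is illustrative, not the paper's 64 x 32 configuration.

    # Behavioral sketch of SRAM-based TCAM matching with vertical partitioning.
    # Each partition's SRAM row holds a bit-vector of TCAM entries matching that sub-key.

    def build_srams(tcam_entries, key_bits=8, part_bits=4):
        parts = key_bits // part_bits
        srams = [[0] * (1 << part_bits) for _ in range(parts)]
        for idx, (value, mask) in enumerate(tcam_entries):       # mask bit 1 = "care"
            for p in range(parts):
                shift = key_bits - (p + 1) * part_bits
                v = (value >> shift) & ((1 << part_bits) - 1)
                m = (mask >> shift) & ((1 << part_bits) - 1)
                for word in range(1 << part_bits):
                    if (word & m) == (v & m):                     # don't-care bits match anything
                        srams[p][word] |= 1 << idx
        return srams

    def lookup(srams, key, key_bits=8, part_bits=4):
        match = ~0
        for p, sram in enumerate(srams):
            shift = key_bits - (p + 1) * part_bits
            match &= sram[(key >> shift) & ((1 << part_bits) - 1)]
        return match                                              # bit i set => entry i matches

    # Two illustrative entries: 1010xxxx (lower nibble don't-care) and 10101111.
    srams = build_srams([(0b10100000, 0b11110000), (0b10101111, 0b11111111)])
    print(bin(lookup(srams, 0b10100011)))   # 0b1 -> only entry 0 matches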

KEYWORDS

Ternary content addressable memory (TCAM), Application-specific integrated circuit (ASIC), field-programmable gate array (FPGA), static random access memory (SRAM).


A Solution for Car Embedded Systems Security

Hamed AOUADI, Computer sciences Department, Faculty of Sciences of Tunis, University of Tunis El Manar, Campus Universitaire El Manar, Tunisia

ABSTRACT

Modern cars are equipped with multiple electronic embedded systems. These systems make the car more efficient and the passengers more comfortable. However, they open up new security problems, ranging from counterfeiting to remotely taking control of the vehicle. In this paper, we first give an overview of the car security domain, then present the most frequently referenced related works, and finally present our security solution, which consists of defining a physical separation between two sets of embedded systems.

KEYWORDS

Car embedded system, security, hardware protection, car area network security.


A Comparative Analysis of Prediction of Student Results Using Decision Trees and Random Forest

Narayan Prasad Dahal and Prof. Dr. Subarna Shakya, Pulchowk Campus, Institute of Engineering, Tribhuvan University, Lalitpur, Nepal

ABSTRACT

There is a large body of research that uses students' past data to predict their performance, and many data mining techniques have been applied to analyze such data. This research project predicts higher secondary students' results based on their academic background, family details, and previous examination results using three decision tree algorithms, ID3, C4.5 (J48) and CART (Classification and Regression Tree), together with other classification algorithms: Random Forest (RF), K-Nearest Neighbors (KNN), Support Vector Machine (SVM) and Artificial Neural Network (ANN). The project analyzes the performance and accuracy of these algorithms based on the results obtained, and identifies some common differences between the achieved output and previous research work.
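
As a hedged sketch (the actual student dataset and feature encoding are not public here), scikit-learn's CART-style DecisionTreeClassifier and RandomForestClassifier could be compared as follows; ID3/C4.5 behavior is only approximated by switching the split criterion to entropy, since scikit-learn does not ship those exact algorithms.

    # Sketch: compare decision-tree variants and a random forest on placeholder student data.
    import pandas as pd
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import RandomForestClassifier

    # Placeholder features: past GPA, attendance %, parents' education (encoded), pass/fail label.
    df = pd.DataFrame({
        "gpa":        [2.1, 3.4, 3.9, 1.8, 2.9, 3.1, 3.7, 2.2],
        "attendance": [60, 85, 95, 55, 75, 80, 90, 65],
        "parent_edu": [1, 2, 3, 1, 2, 2, 3, 1],
        "result":     [0, 1, 1, 0, 0, 1, 1, 0],
    })
    X, y = df.drop(columns="result"), df["result"]

    models = {
        "CART (gini)":          DecisionTreeClassifier(criterion="gini", random_state=0),
        "Entropy tree (~C4.5)": DecisionTreeClassifier(criterion="entropy", random_state=0),
        "Random Forest":        RandomForestClassifier(n_estimators=100, random_state=0),
    }
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=4)
        print(name, scores.mean())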

KEYWORDS

Data Mining, Decision Tree, Random Forest.


A Formal Composition of Multi-Agent Organization based on Category Theory

Abdelghani Boudjidj1 and Mohammed El Habib Souidi2, 1Ecole nationale Supérieure d’Informatique (ESI), BP 68M, 16270, Oued-Smar Algiers, Algeria, ICOSI Lab University, Abbes Laghrour khenchela BP 1252 El Houria 40004 Khenchela, Algeria, 2University of Khenchela, Algeria, ICOSI Lab University, Abbes Laghrour khenchela BP 1252 El Houria 40004 Khenchela, Algeria

ABSTRACT

The application of organizational multi-agent systems (MAS) makes it possible to solve complex distributed problems such as task grouping mechanisms, supply chain management, and air traffic control. The composition of MAS organizational models can be considered an effective solution for grouping different organizational multi-agent systems into a single one. The main objective of this paper is to provide a MAS organizational model based on the composition of two organizational models, Agent Group Role (AGR) and Yet Another Multi Agent Model (YAMAM), with the aim of obtaining a new MAS model that combines the concepts of the composed models. Category theory provides the mathematical formalism for studying and modeling different organizations categorically. This paper is mainly based on the idea of modeling the AGR and YAMAM multi-agent organizations categorically in order to obtain formal semantic models describing these MAS organizations, and then composing them using category theory, a sophisticated mathematical toolbox built around composition.
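
For readers less familiar with the categorical vocabulary the paper builds on, the standard notions of composition and structure-preserving maps can be stated (in LaTeX) as below; reading the categorical models of AGR and YAMAM as the categories $\mathcal{C}$ and $\mathcal{D}$ is only an illustrative framing, not the authors' exact construction.

    A category $\mathcal{C}$ has objects, morphisms, identities and an associative
    composition: for $f : A \to B$ and $g : B \to C$ there is $g \circ f : A \to C$ with
    \[
      h \circ (g \circ f) = (h \circ g) \circ f, \qquad
      \mathrm{id}_B \circ f = f = f \circ \mathrm{id}_A .
    \]
    A structure-preserving map between two categories $\mathcal{C}$ and $\mathcal{D}$
    (for instance, categorical models of two MAS organizations) is a functor
    $F : \mathcal{C} \to \mathcal{D}$ satisfying
    \[
      F(g \circ f) = F(g) \circ F(f), \qquad F(\mathrm{id}_A) = \mathrm{id}_{F(A)} .
    \]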

KEYWORDS

Multi-agent systems, Organizational models, Category theory, composition.


The Challenges of Internet of Things Adoption in Developing Countries: An Overview Based on the Technical Context

Ayman Altameem, Department of Computer and Engineering Sciences, College of Applied Studies and Community Services, King Saud University, Riyadh, Saudi Arabia

ABSTRACT

The Internet of Things (IoT) has the potential to change the way we engage with our environments. Its prevalence has spread to various areas of industrial and manufacturing systems in addition to other sectors. However, many organizations are finding it increasingly difficult to navigate IoT. To unleash its full potential and create real economic value, it is essential to learn about the obstacles to IoT delivery. There is high potential for IoT implementation and usage in developing countries, and major barriers must be addressed for IoT delivery. This paper explores the challenges that impact the adoption of IoT in developing countries based on the technical context. It also presents a general conclusion in the form of recommendations to capture the maximum benefits of IoT adoption.

KEYWORDS

Internet of Things adoption, Obstacles of IoT in developing countries, IoT Technical Context.


Study of Consistency and Performance Trade-Off in Cassandra

Kena Vyas and PM Jat, DAIICT, Gandhinagar, Gujarat, India

ABSTRACT

Cassandra is a distributed database with great scalability and performance that can manage massive amounts of unstructured data. The experiments performed as part of this paper analyse the Cassandra database by investigating the trade-off between data consistency and performance. The primary objective is to track performance under different consistency settings. The setup includes a replicated cluster deployed using VMware. The paper shows how different consistency settings affect Cassandra's performance under varying workloads, with results measured in terms of latency and throughput. Based on the results, a regression formula for the consistency setting is identified such that delays are minimized, performance is maximized, and strong data consistency is guaranteed. One of our primary results is that by coordinating consistency settings for both read and write requests, it is possible to minimize Cassandra's delays while still ensuring high data consistency.
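
The consistency knob being varied can be illustrated with the DataStax Python driver; the contact point, keyspace, table and levels below are placeholders (loosely following YCSB's default schema), and the paper's actual benchmarking is done with YCSB rather than this snippet.

    # Sketch: issue writes/reads at explicit consistency levels with the DataStax driver.
    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    cluster = Cluster(["127.0.0.1"])          # placeholder contact point
    session = cluster.connect("ycsb")          # placeholder keyspace

    write = SimpleStatement(
        "INSERT INTO usertable (y_id, field0) VALUES (%s, %s)",
        consistency_level=ConsistencyLevel.QUORUM,   # stronger write, higher latency
    )
    session.execute(write, ("user1", "value1"))

    read = SimpleStatement(
        "SELECT field0 FROM usertable WHERE y_id = %s",
        consistency_level=ConsistencyLevel.ONE,      # weaker read, lower latency
    )
    print(session.execute(read, ("user1",)).one())

    cluster.shutdown()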

KEYWORDS

NoSQL, Cassandra, Consistency, Latency, YCSB, and Performance.


An Analysis of Phrase-based SMT for English to Manipuri Language

Maibam Indika Devi1 and Bipul Syam Purkayastha2, 1Department of Computer Science, IGNTU-RCM, Kangpokpi, Manipur, India, 2Department of Computer Science, Assam University, Silchar, Assam, India

ABSTRACT

Statistical Machine Translation (SMT) is one of the dominant approaches adopted for developing major translation systems today, yet very little machine translation work has been done for the Manipuri language. Here, a machine translation system from English to Manipuri is reported. The differences in structure and morphology between English and Manipuri, together with the scarcity of resources for Manipuri, pose a significant challenge in developing an MT system for this language pair: English has poor morphology, an SVO structure, and belongs to the Indo-European family, whereas Manipuri has richer morphology, an SOV structure, and belongs to the Sino-Tibetan family. Manipuri has two scripts, the Bengali script and the Meitei script; here, the Bengali script is used for developing the system. In this paper, the phrase-based SMT technique is adopted using the Moses toolkit, and a corpus from the tourism, agriculture and entertainment domains is used for training the system. The evaluation uses the BLEU metric.
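
As an illustration of the evaluation step only (the Moses training pipeline itself is a separate toolchain), BLEU can be computed over system outputs and references, for example with sacreBLEU; the sentences below are placeholders, not from the English-Manipuri corpus.

    # Sketch: corpus-level BLEU for machine translation output using sacreBLEU.
    import sacrebleu

    hypotheses = ["the market opens at nine in the morning"]            # system output (placeholder)
    references = [["the market opens at nine o'clock in the morning"]]  # one reference stream

    bleu = sacrebleu.corpus_bleu(hypotheses, references)
    print(f"BLEU = {bleu.score:.2f}")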

KEYWORDS

Phrase-based SMT, English- Manipuri, Moses, BLEU.


Application of Bayesian Optimization and Stacking Integration in Personal Credit Delinquency Prediction

Jicong Yang and Hua Yin, Guangdong University of Finance and Economics, China

ABSTRACT

How to effectively predict personal credit risk is a key problem in the financial field. This study proposes a Stacking model optimized by an improved Bayesian optimization algorithm to predict personal credit delinquency. Firstly, the desensitized dataset provided by UnionPay Commerce is selected and preprocessed. Secondly, XGBoost, Random Forest and GBDT, each optimized by the improved Bayesian optimization algorithm, are used as base models to establish the optimized Stacking credit delinquency prediction model, with F1-score and AUC used as evaluation metrics. Experiments comparing single models, optimization approaches and datasets verify that the performance of the optimized Stacking model is better than that of the other models.
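
A hedged sketch of the general construction (the UnionPay dataset and the specific improvement to the Bayesian optimizer are not reproduced): Optuna's TPE sampler stands in for the improved Bayesian optimization algorithm, tuning one base learner before the stack is assembled with scikit-learn.

    # Sketch: Bayesian-style tuning of XGBoost, then a Stacking ensemble over three base models.
    # The dataset is a synthetic placeholder, not the UnionPay Commerce data.
    import optuna
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier, StackingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from xgboost import XGBClassifier

    X, y = make_classification(n_samples=2000, n_features=20, weights=[0.8], random_state=0)

    def objective(trial):
        params = {
            "n_estimators": trial.suggest_int("n_estimators", 100, 400),
            "max_depth": trial.suggest_int("max_depth", 3, 8),
            "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
        }
        model = XGBClassifier(**params)
        return cross_val_score(model, X, y, cv=3, scoring="f1").mean()

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=20)

    stack = StackingClassifier(
        estimators=[
            ("xgb", XGBClassifier(**study.best_params)),
            ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
            ("gbdt", GradientBoostingClassifier(random_state=0)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
    )
    print(cross_val_score(stack, X, y, cv=3, scoring="roc_auc").mean())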

KEYWORDS

Improved Bayesian optimization algorithm, optimized Stacking credit overdue prediction model, UnionPay business data set.


Enhanced Accident Detection and Rescue System

Karthick Myilvahanan.J and Krishnaveni A, New Horizon College of Engineering, Karnataka, India

ABSTRACT

These days, the incidence of road accidents is at an all-time high, and timely medical assistance can help save lives. This approach aims to make the nearby medical centre aware of an incident so that early medical assistance can be provided. An accelerometer attached to the vehicle detects the tilt of the vehicle and identifies an accident. The system then sends an SOS message with the location to the registered numbers via a GSM/GPS module.
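
A rough behavioral sketch of the detection-and-alert logic (the tilt threshold, phone number and serial port are assumptions, and the authors' embedded implementation is not reproduced): the GSM module is driven with standard AT commands over a serial link.

    # Sketch: trigger an SOS SMS when the measured tilt angle crosses a threshold.
    import time
    import serial   # pyserial

    TILT_THRESHOLD_DEG = 60.0               # illustrative threshold for an overturned vehicle
    REGISTERED_NUMBER = "+911234567890"     # placeholder registered number

    def send_sos(gsm, latitude, longitude):
        message = f"Accident detected at https://maps.google.com/?q={latitude},{longitude}"
        gsm.write(b'AT+CMGF=1\r')                              # set text mode
        time.sleep(0.5)
        gsm.write(f'AT+CMGS="{REGISTERED_NUMBER}"\r'.encode())
        time.sleep(0.5)
        gsm.write(message.encode() + b"\x1a")                  # Ctrl+Z terminates the SMS

    def on_sample(tilt_deg, latitude, longitude, gsm):
        if abs(tilt_deg) > TILT_THRESHOLD_DEG:                 # vehicle tilted beyond threshold
            send_sos(gsm, latitude, longitude)

    gsm_port = serial.Serial("/dev/ttyUSB0", baudrate=9600, timeout=1)   # placeholder port
    on_sample(tilt_deg=75.0, latitude=12.9716, longitude=77.5946, gsm=gsm_port)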

KEYWORDS

Accident Detection, Alert System, Accelerometer.