International Conference on Data Science and Applications (DSA 2020)

November 28 ~ 29, 2020, London, United Kingdom

Accepted Papers


Using Machine Learning Image Recognition for Code Reviews

Michael Dorin1, 2, Trang Le3, Rajkumar Kolakaluri4, Sergio Montenegro5, 1Aerospace Information Technology, Universität Würzburg, Würzburg, Germany, 2Engineering, University of St. Thomas, St. Paul, MN, USA, 3Engineering, University of St. Thomas, St. Paul, MN, USA, 4Engineering, University of St. Thomas, St. Paul, MN, USA, 5Aerospace Information Technology, Universität Würzburg, Würzburg, Germany

ABSTRACT

It is commonly understood that code reviews are a cost-effective way of finding faults early in the development cycle. However, many modern software developers are too busy to do them. Skipping code reviews means a loss of opportunity to detect expensive faults prior to software release. Software engineers can be pushed in many directions and reviewing code is very often considered an undesirable task, especially when time is wasted reviewing programs that are not ready. In this study, we wish to ascertain the potential for using machine learning and image recognition to detect immature software source code prior to a review. We show that it is possible to use machine learning to detect software problems visually and allow code reviews to focus on application details. The results are promising and are an indication that further research could be valuable.
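
The abstract gives no implementation details; as a rough, hypothetical sketch of the core idea of treating source code as an image, the snippet below rasterises a source file into a binary character-occupancy grid (indentation, line length and blank-line structure become visual features) that a small image classifier such as a CNN could then label. The function name and grid size are illustrative assumptions, not the authors' code.

    import numpy as np

    def code_to_image(source: str, width: int = 120, height: int = 200) -> np.ndarray:
        """Rasterise source code into a binary 'character occupancy' grid.

        Each cell is 1 where a non-whitespace character sits, 0 elsewhere, so
        indentation, line length and blank-line structure become visual features
        that an image classifier could learn from.
        """
        img = np.zeros((height, width), dtype=np.uint8)
        for row, line in enumerate(source.splitlines()[:height]):
            for col, ch in enumerate(line[:width]):
                if not ch.isspace():
                    img[row, col] = 1
        return img

    sample = "def f(x):\n    return x*2\n\nprint(f(21))\n"
    print(code_to_image(sample).sum(), "non-whitespace cells")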

KEYWORDS

Code Reviews, Machine Learning, Image Recognition, Coding Style.


MCA-IICD: Mobile and Cooperative Agent based Approach for Intelligent Integration of Complex Data

Karima GOUASMIA and Wafa MEFTEH, University of Gabès, Tunisia

ABSTRACT

For many years, data integration has been a delicate task in the data-warehousing process. Indeed, the collected data (from various applications and existing in different forms) must be homogenized to meet several needs, such as analytical activities. Today, organizations collect a huge mass of data which becomes more and more complex. Collected data have different types (text, video, image...) and are located in heterogeneous and dispersed sources. The complexity and the dispersion of this data make their integration a difficult task that necessitates efficient techniques and high-performance tools in order to provide a unified data source. Our objective is to take advantage of agent software technology, in particular cooperative agents and mobile agents, to perform the integration phase of complex data. This paper gives an overview of related work and presents a new approach (MCA-IICD) for the intelligent integration of complex data based on cooperative and mobile agents.

KEYWORDS

Cooperative agent, Mobile agent, Intelligent data integration.


A Hybrid Harmony Search and Simulated Annealing Algorithm for Job Shop Scheduling Problem

FengLian Yuan, Huihui Wang, Bo Huang, Yubing Guo, and Hang Zhu, School of Computer Science & Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

ABSTRACT

This paper proposes a new scheduling approach that combines harmony search (HS) and simulated annealing (SA) to efficiently solve the job shop scheduling problem (JSSP). This method uses the SA algorithm to generate initial solutions and new solutions for the HS algorithm. Since SA can explore an extensive solution space for optimal schedules by accepting worse solutions, it makes HS run with diverse solutions and more likely to reach a better solution for JSSP. This method can improve global search abilities and convergence in JSSP. Some experiments are carried out on a set of well-known benchmark instances to show the efficiency and effectiveness of the proposed method.
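
As a rough illustration of the hybrid described above (not the authors' implementation), the sketch below refines harmony-memory members with a simulated-annealing move and accepts a new harmony only if it beats the worst member; the toy sum-of-completion-times objective is an assumption standing in for a real JSSP makespan evaluator.

    import math, random

    proc = [4, 7, 2, 5, 9, 3]                     # processing times of 6 jobs (toy data)
    def cost(perm):
        t, total = 0, 0
        for j in perm:
            t += proc[j]
            total += t                            # sum of completion times
        return total

    def sa_move(perm, T, iters=50):
        """Simulated-annealing refinement: swap two jobs, accept worse moves with prob exp(-d/T)."""
        cur, cur_c = perm[:], cost(perm)
        for _ in range(iters):
            a, b = random.sample(range(len(cur)), 2)
            cand = cur[:]; cand[a], cand[b] = cand[b], cand[a]
            d = cost(cand) - cur_c
            if d < 0 or random.random() < math.exp(-d / T):
                cur, cur_c = cand, cost(cand)
            T *= 0.95
        return cur

    # Harmony memory initialised with SA-refined random permutations.
    HM = [sa_move(random.sample(range(6), 6), T=10.0) for _ in range(8)]
    HMCR = 0.9                                     # harmony memory considering rate
    for _ in range(200):
        new = [random.choice(HM)[i] if random.random() < HMCR else random.randrange(6)
               for i in range(6)]
        # repair to a valid permutation, then refine the new harmony with SA
        new = sorted(range(6), key=lambda i: (new[i], random.random()))
        new = sa_move(new, T=5.0)
        worst = max(range(len(HM)), key=lambda k: cost(HM[k]))
        if cost(new) < cost(HM[worst]):
            HM[worst] = new
    print("best cost:", min(cost(h) for h in HM))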

KEYWORDS

Harmony Search, Hybrid Strategy, Job Shop Scheduling Problem & Simulated Annealing.


Explainable AI for Interpretable Credit Scoring

Lara Marie Demajo, Vince Vella and Alexiei Dingli, Department of Artificial Intelligence, University of Malta, Msida, Malta

ABSTRACT

With the ever-growing achievements in Artificial Intelligence (AI) and the recent boosted enthusiasm in FinTech, applications such as credit scoring have gained substantial academic interest. Credit scoring helps financial experts make better decisions regarding whether or not to accept a loan application, such that loans with a high probability of default are not accepted. Apart from the noisy and highly imbalanced data challenges faced by such credit scoring models, recent regulations such as the 'right to explanation' introduced by the General Data Protection Regulation (GDPR) and the Equal Credit Opportunity Act (ECOA) have added the need for model interpretability to ensure that algorithmic decisions are understandable and coherent. An interesting concept that has been recently introduced is eXplainable AI (XAI), which focuses on making black-box models more interpretable. In this work, we present a credit scoring model that is both accurate and interpretable. For classification, state-of-the-art performance on the HELOC and Lending Club datasets is achieved using the XGBoost model. The model is then further enhanced with a 360-degree explanation framework, which provides the different explanations (i.e. global, local feature-based and local instance-based) that are required by different people in different situations. Evaluation through functionally-grounded, application-grounded and human-grounded analysis shows that the explanations provided are simple and consistent, and satisfy the six predetermined hypotheses testing for correctness, effectiveness, easy understanding, detail sufficiency and trustworthiness.
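
A minimal sketch of the accurate-model-plus-local-explanation idea, assuming the publicly available xgboost and shap packages; the synthetic data below merely stands in for the HELOC and Lending Club datasets, and this is not the authors' 360-degree framework.

    import numpy as np
    import shap
    import xgboost
    from sklearn.datasets import make_classification

    # Synthetic stand-in for a credit scoring dataset (assumption).
    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    model = xgboost.XGBClassifier(n_estimators=100, max_depth=3)
    model.fit(X, y)

    # Local feature-based explanation for a single applicant via SHAP values.
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X[:1])
    print("predicted default prob:", model.predict_proba(X[:1])[0, 1])
    print("per-feature SHAP contributions:", np.round(shap_values[0], 3))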

KEYWORDS

Credit Scoring, Explainable AI, BRCG, XGBoost, GIRP, SHAP, Anchors, ProtoDash, HELOC, Lending Club.


An Intelligent and Data-driven Mobile Platform for Youth Volunteer Management using Machine Learning and Predictive Analytics

Alyssa Huang1 and Yu Sun2, 1Irvine, CA 92602, 2California State Polytechnic University, Pomona, CA, 91768

ABSTRACT

Volunteering is very important to high school students because it not only allows teens to apply the knowledge and skills they have acquired to real-life scenarios, but also enables them to make an association between helping others and their own joy of fulfillment. Choosing the right volunteering opportunities can influence how teens interact with a cause and how well they can serve the community through their volunteering services. However, high school students who look for volunteer opportunities often do not have enough information about the opportunities around them, so they tend to take whatever opportunity comes their way. On the other hand, organizations that look for volunteers usually lack effective ways to evaluate and select the volunteers that best fit the jobs, so they simply take volunteers on a first-come, first-served basis. Therefore, there is a need to build a platform that serves as a bridge to connect volunteers and the organizations that offer volunteer opportunities. In this paper, we focus on creating an intelligent platform that can effectively evaluate volunteer performance and predict best-fit volunteer opportunities by using machine learning algorithms to study 1) the correlation between volunteer profiles (e.g. demographics, preferred jobs, talents, previous volunteering events, etc.) and predicted volunteer performance in specific events and 2) the correlation between volunteer profiles and future volunteer opportunities. The two highest-scoring machine learning algorithms are proposed to make predictions on volunteer performance and event recommendations, and we demonstrate that they are able to make the best prediction for each query. Alongside the work on the algorithms, a mobile application, which can run on both iPhone and Android platforms, is also created to provide a convenient and effective way for volunteers and event supervisors to plan and manage their volunteer activities. As a result of this research, volunteers and the organizations that look for them can both benefit from this data-driven platform for a more positive overall experience.

KEYWORDS

Machine learning, Predictive Analytics, Flutter, Volunteer Management, Scikit-learn.


A New Framework of Feature Engineering for Machine Learning in Financial Fraud Detection

Chie Ikeda1, Karim Ouazzane2, Qicheng Yu3, 1School of Computing and Digital Media, London Metropolitan University, London, UK, 2School of Computing and Digital Media, London Metropolitan University, London, UK, 3School of Computing and Digital Media, London Metropolitan University, London, UK

ABSTRACT

Financial fraud activities have soared despite the advancement of fraud detection models empowered by machine learning (ML). To address this issue, we propose a new framework of feature engineering for ML models. The framework consists of feature creation that combines feature aggregation and feature transformation, and feature selection that accommodates a variety of ML algorithms. To illustrate the effectiveness of the framework, we conduct an experiment using an actual financial transaction dataset and show that the framework significantly improves the performance of ML fraud detection models. Specifically, all the ML models complemented by a feature set generated from our framework surpass the same models without such a feature set by nearly 40% on the F1-measure and 20% on the AUC value.
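
The following hedged sketch illustrates the general pattern of feature aggregation, transformation and filter-based selection on transaction data using pandas and scikit-learn; the column names and the mutual-information criterion are assumptions for illustration, not the paper's exact framework.

    import pandas as pd
    from sklearn.feature_selection import mutual_info_classif

    tx = pd.DataFrame({
        "account": [1, 1, 1, 2, 2, 3],
        "amount":  [20.0, 250.0, 30.0, 15.0, 18.0, 900.0],
        "hour":    [10, 3, 14, 11, 12, 2],
        "is_fraud": [0, 1, 0, 0, 0, 1],
    })

    # Feature creation: aggregate each transaction's context over its account history.
    agg = tx.groupby("account")["amount"].agg(["mean", "max", "count"]).add_prefix("acct_amount_")
    feats = tx.join(agg, on="account")
    feats["amount_vs_acct_mean"] = feats["amount"] / feats["acct_amount_mean"]   # transformation
    feats["night_tx"] = (feats["hour"] < 6).astype(int)

    # Feature selection: rank engineered features by mutual information with the fraud label.
    X = feats.drop(columns=["account", "is_fraud"])
    scores = mutual_info_classif(X, feats["is_fraud"], random_state=0)
    print(dict(zip(X.columns, scores.round(3))))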

KEYWORDS

Financial Fraud Detection, Feature Engineering, Feature Creation, Feature Selection, Machine Learning.


3-D Offline Signature Verification with Convolutional Neural Network

Na Tyrer1, Fan Yang1, Gary C. Barber1, Guangzhi Qu1, Bo Pang1 and Bingxu Wang1,2, 1Automotive Tribology Center, Department of Mechanical Engineering, School of Engineering and Computer Science, Oakland University, Rochester, Michigan, 48309, USA, 2Faculty of Mechanical Engineering and Automation, Zhejiang Sci-Tech University, Hangzhou, Zhejiang, 310018, P.R. China

ABSTRACT

Signature verification is essential to prevent the forgery of documents in financial, commercial, and legal settings. Utilizing the 3-D information presented by a signature captured with a 3D optical profilometer is a relatively new idea, and the convolutional neural network is a powerful tool for image recognition. The present research focused on using the three dimensions of offline signatures in combination with a convolutional neural network to verify signatures. It was found that the accuracy for offline signature verification was over 90%, suggesting that this is a promising novel method for signature verification.

KEYWORDS

Signature Verification, 3D Optical Profilometer, Convolutional Neural Network.


A Network for Detecting Facial Features During the Covid-19 Epidemic

Bin Lin, YanLin Mu, ZiLi Fu, Chaochao Li and Xuliang Duan, College of Information Engineering, Sichuan Agricultural University, Ya’an, China

ABSTRACT

The new coronavirus can spread through respiratory droplets, and wearing a mask correctly can effectively prevent the virus from spreading. However, current detection algorithms are based on unobstructed faces, which hinders the detection task when a mask is worn. To solve these problems, a facial feature detection algorithm based on Mtcnn+Mobilenet+GDBT for complex scenes is proposed. First, it can detect whether a mask is worn and the fatigue state of the face. Second, it can set different thresholds according to the facial characteristics of different people, and initialize the characteristics of different frames within 5 seconds. A dataset of masks and feature points containing 708 images is then used for training. The experimental results show that, compared with the traditional detection network, the new network can effectively detect facial features in the context of the epidemic. The detection loss of the network is 0.01.

KEYWORDS

MTCNN, Mobilenet, GDBT, Fatigue detecting, Facial feature points.


Negative Sampling in Knowledge Representation Learning: A Mini-review

Jing Qian1, 2, Katie Atkinson2, Yong Yue1 and Gangmin Li1*, 1School of Advanced Technology, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu Province, China, 2Department of Computer Science, University of Liverpool, Liverpool, United Kingdom

ABSTRACT

Knowledge representation learning (KRL) aims at encoding components of the knowledge graph (KG) into low-dimensional continuous space, and has achieved considerable success. For space efficiency, most well-known KGs contain only positive instances. Nevertheless, typical KRL techniques, especially translational distance-based models, are trained by discriminating between positive and negative samples. Negative sampling is therefore a non-trivial step in KG embedding. The quality of generated negative samples can directly influence the performance of the final knowledge representations in downstream tasks, such as link prediction and triple classification. This review summarizes current negative sampling methods in KRL and roughly categorizes them into three sorts: fixed distribution-based, generative adversarial net (GAN)-based and cluster sampling. Two further novel approaches are also mentioned.
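
For readers unfamiliar with the step being reviewed, a minimal fixed-distribution (uniform) negative sampling routine might look like the sketch below: corrupt the head or tail of a positive triple with a random entity and reject corruptions that are already known positives. The entity and relation names are illustrative, not drawn from any particular KG.

    import random

    entities = ["paris", "france", "berlin", "germany", "rome"]
    positives = {("paris", "capital_of", "france"), ("berlin", "capital_of", "germany")}

    def corrupt(triple, n=4):
        """Uniformly corrupt the head or tail of a positive triple n times."""
        h, r, t = triple
        negs = []
        while len(negs) < n:
            e = random.choice(entities)
            cand = (e, r, t) if random.random() < 0.5 else (h, r, e)
            if cand not in positives and cand != triple:
                negs.append(cand)
        return negs

    print(corrupt(("paris", "capital_of", "france")))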

KEYWORDS

Knowledge Representation Learning, Negative Sampling, Generative Adversarial Nets.


Image Wafer Inspection based on Template Matching

Massimiliano Barone, STMicroelectronics, Agrate Brianza, Milano, Italy

ABSTRACT

This paper presents a template matching technique for detecting defects in VLSI wafer images. The method is based on traditional techniques of image analysis and image registration, but it combines the prior art of wafer image inspection in a new way, using prior knowledge such as the design layout of the VLSI wafer manufacturing process. A golden template of the patterned wafer image under inspection can be obtained from the wafer image itself combined with the prior knowledge mentioned above. First, a mapping between physical space and pixel space is needed. Then template matching is applied for a more accurate alignment between the wafer device and the template. Finally, a segmented comparison is used to find possible defects. Results of the proposed method are presented in terms of visual quality of defect detection, misalignment at the topology level, and the number of correctly detected defective devices.
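
A compressed sketch of the alignment-and-compare step, assuming standard OpenCV calls; the file names and the difference threshold are placeholders, and the real pipeline additionally builds the golden template from the design layout as described above.

    import cv2
    import numpy as np

    wafer = cv2.imread("wafer_die.png", cv2.IMREAD_GRAYSCALE)        # image under inspection (placeholder path)
    golden = cv2.imread("golden_template.png", cv2.IMREAD_GRAYSCALE)  # defect-free reference patch (placeholder path)

    # Locate the best alignment of the golden template inside the wafer image.
    scores = cv2.matchTemplate(wafer, golden, cv2.TM_CCOEFF_NORMED)
    _, best_score, _, best_loc = cv2.minMaxLoc(scores)
    x, y = best_loc

    # Segmented comparison: absolute difference between the aligned region and the template.
    region = wafer[y:y + golden.shape[0], x:x + golden.shape[1]]
    diff = cv2.absdiff(region, golden)
    defect_mask = (diff > 40).astype(np.uint8)                         # threshold is an assumption
    print("alignment score:", best_score, "suspect pixels:", int(defect_mask.sum()))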

KEYWORDS

Wafer inspection, template matching, image registration, pattern recognition, VLSI wafer images, golden template, segmented comparison, space mapping.


An Efficient Language-independent Multi-font OCR for Arabic Script

Hussein Osman, Karim Zaghw, Mostafa Hazem and Seifeldin Elsehely, Computer Engineering Department, Faculty of Engineering, Cairo University, Egypt

ABSTRACT

Optical Character Recognition (OCR) is the process of extracting digitized text from images of scanned documents. While OCR systems have already matured in many languages, they still have shortcomings in cursive languages with overlapping letters such as Arabic. This paper proposes a complete Arabic OCR system that takes a scanned image of Arabic Naskh script as input and generates a corresponding digital document. Our Arabic OCR system consists of the following modules: Pre-processing, Word-level Feature Extraction, Character Segmentation, Character Recognition, and Post-processing. This paper also proposes an improved font-independent character segmentation algorithm that outperforms the state-of-the-art segmentation algorithms. Lastly, the paper proposes a neural network model for the character recognition task. The system has been evaluated on several open Arabic corpora with an average character segmentation accuracy of 98.06%, character recognition accuracy of 99.89%, and overall system accuracy of 97.94%, achieving outstanding results compared to state-of-the-art Arabic OCR systems.

KEYWORDS

Arabic OCR, Word Segmentation, Character Segmentation, Character Recognition, Neural Network.


Minimum Probability Conversion Circuits for Stochastic Computing

Chris Collinsworth, Sabiheh Salehi and Sayed Ahmad Salehi, Department of Electrical and Computer Engineering, University of Kentucky, USA

ABSTRACT

Stochastic number generators (SNGs) are used to convert binary numbers to bit-streams in stochastic circuits. While the hardware complexity for arithmetic operations in stochastic computing is significantly less than binary computing, the hardware overhead for SNGs is substantial. An SNG consists of two components: a random number source (RNS) and a probability conversion circuit (PCC). In this paper, we propose two PCC designs with minimum logic, in terms of 2-input gates, that reduce the hardware overhead of SNGs. The proposed PCCs generate bit-streams with very low correlation when they are used for two SNGs that share one RNS. Compared to prior work, the proposed designs can reduce hardware overhead for a 12-bit SNG by as much as 67%.
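
As background for the component being optimised, the sketch below is a behavioural software model (not the proposed hardware) of a conventional SNG: an LFSR random number source feeding a comparator-style probability conversion, which is the baseline the paper's minimum-logic PCC designs aim to shrink.

    def lfsr_stream(seed=0b101101, taps=(5, 4), width=6):
        """Fibonacci LFSR over `width` bits, yielding successive register states."""
        state = seed
        while True:
            yield state
            fb = 0
            for t in taps:
                fb ^= (state >> t) & 1
            state = ((state << 1) | fb) & ((1 << width) - 1)

    def sng(value, length=64, width=6):
        """Comparator-style SNG: emit a bit-stream whose fraction of 1s approximates value / 2**width."""
        rns = lfsr_stream(width=width)
        return [1 if next(rns) < value else 0 for _ in range(length)]

    bits = sng(value=24, length=64)          # encodes probability 24/64 = 0.375
    print(sum(bits) / len(bits))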

KEYWORDS

Stochastic computing, stochastic number generator, linear feedback shift register sharing, area-efficiency, cost reduction.


Incremental Automatic Correction for Digital VLSI Circuits

Lamya Gaber1, Aziza I. Hussein2 and Mohammed Moness1, 1Department of Computers and Systems Engineering, Minia University, Minia, Egypt, 2Department of Electrical and Computer Engineering, Effat University, Jeddah, KSA

ABSTRACT

Recently, the growing complexity of digital VLSI circuits has had a high impact on verification methodologies. Many advances in verification and debugging techniques for digital VLSI circuits have arisen through Computer-Aided Design (CAD). On the other hand, there is a lack of research on automatically correcting faulty digital circuits. In addition, existing techniques depend heavily on a given set of test patterns, whose number increases with the complexity of VLSI circuits. A second problem is the large size of the circuit injected for correction and the large number of SAT solver calls, both of which have a negative impact on running time. Therefore, our first goal is to avoid dependence on a given set of test patterns by incrementally generating compact test patterns corresponding to design errors during rectification. The second goal is to reduce the size of the in-circuit mutation circuit used in the error-fixing process. The third improvement is that the distribution of test patterns can be performed in parallel, which benefits digital VLSI circuits with large numbers of inputs and outputs. The experimental results illustrate that the proposed incremental correction algorithm can fix design bugs of the gate-replacement type in several digital VLSI circuits from ISCAS'85 with high speed and full accuracy. The proposed auto-correction mechanism is around 4.8x faster than the latest existing methods on the ISCAS'85 benchmarks. The parallel distribution of test patterns on digital VLSI circuits while generating new compact test patterns achieves a speed-up of around 1.2x compared to the latest methods.

KEYWORDS

Auto-correction, ATPG, Fault detection, Verification.


FPGA Routing Acceleration by Extracting Unsatisfiable Subformulas

Tiejun Li and Kefan Ma, National University of Defense Technology, China

ABSTRACT

Explaining the causes of infeasibility of Boolean formulas has practical applications in various fields. A small unsatisfiable subset can provide a succinct explanation of infeasibility and is valuable for applications such as FPGA routing. The Boolean-based FPGA detailed routing formulation expresses the routing constraints as a Boolean function which is satisfiable if and only if the layout is routable. The unsatisfiable subformulas can help the FPGA routing tool diagnose and eliminate the causes of unroutability. For this typical application, a resolution-based local search algorithm for extracting unsatisfiable subformulas is integrated into the Boolean-based FPGA routing method. The fastest algorithm for deriving minimum unsatisfiable subformulas, the branch-and-bound algorithm, is adopted for comparison with the local search algorithm. On the standard FPGA routing benchmark, the results show that the local search algorithm outperforms the branch-and-bound algorithm on runtime. It is also concluded that unsatisfiable subformulas play a very important role in real FPGA routing applications.

KEYWORDS

FPGA routing, Boolean satisfiability, unsatisfiable subformula, local search.


Design of a Low Power Phase Locked Loop for Ultra Wide Band (UWB) Applications using 180nm CMOS Technology

Akash P, Harini S, Sumanth P, Rashmi S, Department of Electronics and Communication, Don Bosco Institute of Technology, Bengaluru, Karnataka, India

ABSTRACT

The Phase-Locked Loop (PLL) is a fundamental system for the generation of RF and microwave signals. It generates a versatile output frequency with a constancy similar to that of a crystal oscillator, through the use of feedback. This paper showcases the design of a Type-II phase-locked loop (PLL) with reduced error rate and reduced phase noise, which has five essential blocks: 1) Phase-Frequency Detector (PFD), 2) Charge-Pump (CP), 3) Loop-Filter (LF), 4) Voltage-Controlled Oscillator (VCO), and 5) Frequency-Divider (FD), used for Wi-Max implementations in the 3 GHz Ultra-Wide-Band (UWB) range. Being classified as a mixed-signal circuit, the PLL involves design challenges at high frequencies. Recent advances in integrated circuit technologies have given rise to the employment of high-performance PLLs, which have become more efficient and dependable. The detailed design, description and simulation are done using the Cadence Tool (License 6.1). The general purpose directory kit (GPDK) used is 180nm technology. This paper showcases a PLL operating at 3 GHz, where the phase noise of the overall PLL is -125 dBc/Hz and the power consumption is 3.17 mW from a 1.8 V supply. It is designed and presented in 180nm technology, aiming to provide advantageous power, cost and performance at its best.

KEYWORDS

Capture Range, Dead Zone, Loop Bandwidth, Phase Locked Loop (PLL), Phase Noise.


Controlled Machine Text Generation of Football Articles

Tomasz Garbus, University of Warsaw, Poland

ABSTRACT

Among other benefits of the rapid development in deep learning, language modelling (LM) systems have excelled at producing relatively long text samples that are (almost) indistinguishable from human-written text. This work categorizes conditional text generation systems into three paradigms: generation with placeholders, prompted generation, and adversarial/reinforcement learning, and provides an overview of each paradigm along with experiments – either machine- or human-judged. Example corpora of football news are used to discuss how a fast, domain-specific named entity recognition (NER) system can be built without much manual labour for English and Polish. The NER module is evaluated on manually labelled texts in both languages. It is then used not only to build fine-tuning sets for the language model, but also to aid its generation procedure, resulting in samples more compliant with the provided control codes. Finally, a simple tool, EDGAR, for prompt-driven generation is presented. Two demos are made for the reader to experiment with and to compare the proposed solutions with a simply finetuned GPT-2 model.

KEYWORDS

text generation, Transformer, machine learning, named entity recognition, reinforcement learning, natural language generation.


Parallel Data Extraction Using Word Embeddings

Pintu Lohar and Andy Way, ADAPT Centre, Dublin City University, Ireland

ABSTRACT

Building a robust MT system requires a sufficiently large parallel corpus to be available as training data. In this paper, we propose to automatically extract parallel sentences from comparable corpora without using any MT system or even any parallel corpus at all. Instead, we use cross-lingual information retrieval (CLIR), average word embeddings, text similarity and a bilingual dictionary, thus saving a significant amount of time and effort as no MT system is involved in this process. We conduct experiments on two different kinds of data: (i) formal texts from the news domain, and (ii) user-generated content (UGC) from hotel reviews. The automatically extracted sentence pairs are then added to the already available parallel training data and extended translation models are built from the concatenated data sets. Finally, we compare the performance of our new extended models against the baseline models built from the available data. The experimental evaluation reveals that our proposed approach is capable of improving the translation outputs for both the formal texts and UGC.
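
A toy sketch of the sentence-matching idea, under the assumption of averaged word embeddings, a bilingual dictionary lookup and a cosine-similarity threshold; the embeddings, dictionary entries and threshold below are illustrative values, not the paper's resources.

    import numpy as np

    emb = {"hotel": [0.9, 0.1], "clean": [0.2, 0.8], "room": [0.7, 0.3]}   # toy English embeddings
    bilingual = {"hôtel": "hotel", "propre": "clean", "chambre": "room"}    # toy FR -> EN dictionary

    def sent_vec(tokens):
        """Average the word vectors of the tokens that have an embedding."""
        vecs = [emb[t] for t in tokens if t in emb]
        return np.mean(vecs, axis=0) if vecs else np.zeros(2)

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    english = ["the hotel room".split(), "clean room".split()]
    french = ["la chambre de l' hôtel".split(), "chambre propre".split()]

    for fr in french:
        fr_translated = [bilingual.get(t, t) for t in fr]          # map FR words through the dictionary
        best = max(english, key=lambda en: cosine(sent_vec(en), sent_vec(fr_translated)))
        score = cosine(sent_vec(best), sent_vec(fr_translated))
        if score > 0.95:                                            # similarity threshold is an assumption
            print(" ".join(fr), "<->", " ".join(best), round(score, 3))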

KEYWORDS

Machine Translation, Parallel Data, User-generated Content, Word Embeddings.


Evaluating Dutch Named Entity Recognition and De-identification Methods in the Human Resource Domain

Chaïm van Toledo, Friso van Dijk and Marco Spruit, Utrecht University, Utrecht, the Netherlands

ABSTRACT

The human resource (HR) domain contains various types of privacy-sensitive textual data, such as e-mail correspondence and performance appraisals. Doing research on these documents brings several challenges, one of them being anonymisation. In this paper, we evaluate current Dutch text de-identification methods for the HR domain. We also update one of these methods with the latest named entity recognition (NER) models. The result is that the NER model based on the CoNLL 2002 corpus in combination with the BERTje transformer gives the best combination for suppressing persons (recall 0.94) and locations (recall 0.82). For suppressing gender, DEDUCE performs best (recall 0.53). We evaluate NER based on both strict de-identification of entities (a person must be suppressed as a person) and a loose sense of de-identification (no matter how a person is suppressed, as long as it is suppressed).

KEYWORDS

Named Entity Recognition, Dutch, NER, BERT, evaluation, de-identification.


Text Generation with Long Short-term Memory Networks and Generative Probabilistic Networks

Akwarandu Ugo Nwachuku1 and Xavier-Lewis Palmer2, 1Northeastern University, Vancouver BC V6B 5A7, Canada, 2Old Dominion University, Norfolk, VA 23529, USA

ABSTRACT

Text generation is an interesting area of study with Long short-term memory (LSTM) networks. In this work, we present a process engineered to improve them. The method presented here does not deal directly with the workings of the LSTM network but instead controls the quality of the text accepted from the LSTM. The research study aims to answer the question of how an LSTM can work to generate contextually significant texts based on prior inputs. The semantic information extracted in this paper is classified as sentiment, subjectivity, and verbs extracted from all sentences in the training data. A solution is proffered with a generative probabilistic model that predicts the semantic information expected in upcoming text or sentences, based on a given sentence input. The semi-artificially intelligent process explored in this paper is applicable to any domain involving sequential data. The research study presents a semantically inclined text-generation method.

KEYWORDS

Machine Learning, Artificial Intelligence, LSTM, Long short-term memory, Natural Language Processing.


An Optimized Cleaning Robot Path Generation and Execution System using Cellular Representation of Workspace

Qile He1 and Yu Sun2, 1Webber Academy, Calgary, Alberta, Canada, 2California State Polytechnic University, Pomona, USA

ABSTRACT

Many robot applications depend on solving the Complete Coverage Path Planning problem, or CCPP. Specifically, robot vacuum cleaners have seen increased use in recent years, and some models offer room mapping capability using sensors such as LiDAR. With the addition of room mapping, applied robotic cleaning has begun to transition from random walks and heuristic path planning to an environment-aware approach. In this paper, a novel solution for pathfinding and navigation of indoor robot cleaners is proposed. The proposed solution plans a path from an a priori cellular decomposition of the work environment. The planned path achieves complete coverage of the map and reduces duplicate coverage. The solution is implemented inside the ROS framework and validated with Gazebo simulation. Metrics to evaluate the performance of the proposed algorithm measure efficiency in terms of speed, duplicate coverage and distance travelled.
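
As a simplified illustration of coverage over a cellular decomposition (not the paper's planner), the sketch below sweeps a small occupancy grid boustrophedon-style, visiting each free cell once; it ignores detours around obstacles, which a full CCPP solution must handle.

    # 0 = free cell, 1 = obstacle (toy map).
    grid = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]

    def coverage_path(grid):
        """Back-and-forth sweep: left-to-right on even rows, right-to-left on odd rows."""
        path = []
        for r, row in enumerate(grid):
            cols = range(len(row)) if r % 2 == 0 else reversed(range(len(row)))
            for c in cols:
                if grid[r][c] == 0:
                    path.append((r, c))
        return path

    path = coverage_path(grid)
    free_cells = sum(row.count(0) for row in grid)
    print(path)
    print("covered", len(path), "of", free_cells, "free cells, duplicates:",
          len(path) - len(set(path)))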

KEYWORDS

Complete Coverage Path Planning, Mobile Robots, Graph Theory.


Digiprescription: An Intelligent System to Enable Paperless Prescription using Mobile Computing and Natural-language Processing

Richard Zhang1, Yucheng Jiang1, Sophadeth Rithya2, Yu Sun2, 1Walnut Ave Irvine, CA 92604, 2California State Polytechnic University, Pomona, CA, 91768

ABSTRACT

Our app aims to teach patients, especially the elderly, how to use their medication properly, reducing the chances of putting their lives in danger. It also delivers these instructions efficiently while saving a great deal of paper. By law, every drug dispensed from the pharmacy to the user includes a receipt listing the patient’s information, drug information, insurance information, directions for taking the medicine (including any black box warning issued by the FDA), details on how the medication works, side effects, storage rules, and so on. These pieces of information are crucial to patients, as they explain how to use the drug properly, yet most people throw these receipts away, which is both a risk and a waste. Through this app, patients can efficiently access information on how to use their medication properly. The application is also helpful in that users can choose to set reminders for when to take their medication each week or month.

KEYWORDS

Reminder System, HTML, Python, Firebase, Bootstrap.


Intrusion Detection in Computer Systems by Using Artificial Neural Networks with Deep Learning Approaches

Sergio Hidalgo-Espinoza, Kevin Chamorro-Cupuerán and Oscar Chang-Tortolero, School of Mathematics and Computer Science, University of Yachay Tech, Ecuador

ABSTRACT

Intrusion detection in computer networks has become one of the most important issues in cybersecurity. Attackers keep researching and coding to discover new vulnerabilities to penetrate information security systems. In consequence, computer systems must be upgraded daily using up-to-date techniques to keep hackers at bay. This paper focuses on the design and implementation of an intrusion detection system based on Deep Learning architectures. As a first step, a shallow network is trained with labelled log-in [into a computer network] data taken from the CICIDS2017 dataset. The internal behaviour of this network is carefully tracked and tuned by using plotting and exploring codes until it reaches a functional peak in intrusion prediction accuracy. As a second step, an autoencoder, trained with big unlabelled data, is used as a middle processor which feeds compressed information and abstract representations to the original shallow network. It is shown that the resultant deep architecture performs better than any version of the shallow network alone. The resultant functional code scripts, written in MATLAB, represent a retrainable system which has been tested using real data, producing good precision and fast responses.

KEYWORDS

Artificial Neural Networks, Information Security, Deep Learning, Intrusion Detection & Hacking Attacks.


Towards a Risk Assessment for Big Data in Cloud Computing Environment

Saadia Drissi and Soukaina Elhasnaoui, (EAS-LRI) Systems Architecture Team, ENSEM, Hassan II University, Morocco

ABSTRACT

Cloud computing provides relevant and adaptable support for Big Data through ease of use, access to resources, low-cost use of resources, and the use of powerful equipment to process big data. Cloud and big data both centre on developing the value of a business while reducing capital costs. Big data and cloud computing both benefit companies, and because of these benefits the use of big data in the cloud is growing rapidly. With this serious increase, there are several emerging security risk concerns. Big data has more vulnerabilities in comparison to classical databases, as the data are stored on servers owned by the cloud provider. The varied usage of data makes securing big data in the cloud infeasible with traditional security measures. The security of big data in the cloud needs to be examined and discussed. In this paper, the authors present and discuss the risk assessment of big-data applications in cloud computing environments and present some ideas for assessing these risks.

KEYWORDS

Cloud computing, risk assessment, big data, security.


RAPDAC: Role Centric Attribute based Policy Driven Access Control Model

Jamil Ahmed, AvantureBytes, Canada

ABSTRACT

Access control models aim to decide whether a user should be denied or granted access to the user’s requested activity. Various access control models have been established and proposed. The most prominent of these include role-based, attribute-based and policy-based access control models, as well as the role-centric attribute-based access control model. In this paper, a novel access control model called the “Role centric Attribute based Policy Driven Access Control (RAPDAC) model” is presented. RAPDAC incorporates the concept of “policy” into the role-centric attribute-based access control model. It leverages the concept of policy by precisely combining the evaluation of conditions, attributes, permissions and roles in order to authorize access. This approach allows the access control policy of a real-time application to be captured in a well-defined manner. The RAPDAC model allows access decisions to be made at a much finer granularity, as illustrated by the case study of a real-time library information system.
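
A toy sketch of the combined evaluation described above, with the policy structure, roles and attribute conditions invented for illustration: access is granted only when role, action, resource and attribute condition all line up with a policy entry.

    from dataclasses import dataclass

    @dataclass
    class Request:
        role: str
        attributes: dict     # e.g. {"clearance": 2, "fees_due": 0}
        action: str
        resource: str

    POLICIES = [
        # (role, action, resource, condition over attributes) -- illustrative entries only
        ("librarian", "checkout", "book", lambda a: a.get("clearance", 0) >= 1),
        ("member",    "read",     "book", lambda a: a.get("fees_due", 0) == 0),
    ]

    def authorize(req: Request) -> bool:
        """Grant access only if some policy matches role, action, resource and attribute condition."""
        return any(role == req.role and action == req.action and resource == req.resource
                   and cond(req.attributes)
                   for role, action, resource, cond in POLICIES)

    print(authorize(Request("member", {"fees_due": 0}, "read", "book")))   # True
    print(authorize(Request("member", {"fees_due": 5}, "read", "book")))   # False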

KEYWORDS

Authorization, Access control model, RBAC.


Testing Security Requirements

Wolfgang Prentner and Harry Sneed, ZTP – Civil Technical Quality Assurance Company, Vienna, Austria

ABSTRACT

Requirement-based testing is an approach to testing software systems against their documented requirements. A system is considered to be tested when it fulfills all of its requirements. That implies that the test cases are taken from the requirement document. Security requirements are a special type of software requirement intended to protect the software against unauthorized usage and to safeguard its data against unauthorized access. They are usually derived from laws and regulations governing the use of IT systems and the protection of data from misuse. As such, they are formulated in the juristic terms of a natural language. Converting these terms into executable and validatable test cases is the task of the security tester. This contribution uses a case study to describe how this can be done. The starting point is a natural language requirement document depicting what has to be prevented – the security measures to be implemented – in German das Lastenheft – and how it is to be prevented – in German das Pflichtenheft. From these two complementary documents, logical test cases, i.e. test conditions, are generated, which are then joined with the appropriate test data and converted into physical test cases that can be fed to an automated test driver. The final result is a set of security test cases and a security test plan.

KEYWORDS

Software Security, Data Protection, Authorization, Authentication, Monitoring, Access Control, Security Test Cases, Security Test Planning, Security Test Design, Security Test Execution, Security Test Evaluation.

A Context-aware and Geo-based Mobile Application to Automate the Notification of Public Health Issues using Big Data Analysis

Angela Xiang1 and Yu Sun2, 1Irvine, CA, USA, 2California State Polytechnic University, CA, USA

ABSTRACT

Coronavirus disease 2019 (COVID-19) is causing an ongoing pandemic. Social distancing and quarantine are among the few effective methods to reduce the risk of the coronavirus spreading among people. As businesses start to open up and quarantine policies become looser, the risk of COVID-19 spreading becomes greater [1]. This paper describes the development of a computer application to track the user’s surroundings and calculate the exposure risk at a specific geographic location. Our application uses other users’ data and online databases with information on COVID-19 cases to calculate a percentage revealing the user’s possible risk at that location.

KEYWORDS

Flutter, Python Flask, Firebase, iOS, Android.


An Interactive Application to Assist Biology Learning using Augmented Reality

1Wenxi Li, 2Marisabel Chang, 2Yu Sun, 1Irvine, CA 92620, 2Department of Computer Science, California State Polytechnic University, Pomona

ABSTRACT

As students learn biology in terms of molecules, cells, or proteins, the cross-section in a 2D image is a traditional method for studying their details. However, this method cannot bridge the gap between reality and students’ imagination based on 2D images. This paper proposes a tool which can assist students in learning biology more effectively through augmented reality. The experiments on the accuracy and performance of the image pattern recognition indicate that the FAST algorithm is currently the best choice. It reaches the highest accuracy of 94.5%.
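
A minimal sketch of the FAST keypoint-detection step, assuming standard OpenCV usage; the image file name and detector threshold are placeholders rather than the application's actual configuration.

    import cv2

    # Detect FAST keypoints on the printed 2D diagram that the AR overlay is anchored to.
    target = cv2.imread("cell_diagram.png", cv2.IMREAD_GRAYSCALE)   # placeholder path
    fast = cv2.FastFeatureDetector_create(threshold=25, nonmaxSuppression=True)
    keypoints = fast.detect(target, None)
    print("FAST keypoints found:", len(keypoints))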

KEYWORDS

Augmented reality, Vuforia, Blender, Unity.


U-Mentalism Utility Patent: An Overview

Luís Homem, Centre for the Philosophy of Sciences of the University of Lisbon, University of Lisbon, Portugal

ABSTRACT

This paper discloses in synthesis a super-computation computer architecture (CA) model, presently a provisional Patent Application at INPI (nº 116408). The outline is focused on a method to perform computation at or near the speed of light, resorting to an inversion of the Princeton CA. It expands from isomorphic binary/RGB (typical) digital “images”, in a network of (UTM)s over Turing-machines (M)s. From the binary/RGB code, an arithmetic theory of (typical) digital images permits fully synchronous/orthogonal calculus in parallelism, wherefrom an exponential surplus is achieved. One such architecture depends on any “cell”-like exponential-prone basis such as the “pixel”, or rather the RGB “octet-byte”, limited as it may be, once it is congruent with any wave-particle duality principle in observable objects under the electromagnetic spectrum and designed to be reprogrammable. Well-ordered instructions in binary/RGB modules are, further, composed through programming to alter the structure of the Internet, in virtual/virtuous eternal recursion/recurrence, under a man-machine/machine-machine communication ontology.

KEYWORDS

U-Mentalism, Super-computation, Computer Architecture, Cybernetics, Programming Languages Design.


Malady Detection on Plant Leaves Using an Unsupervised Learning Algorithm

Pushya Chaparala1, Siva Shankar Rao. S2, 1Department of Computer Science and Engineering, Vignan’s Foundation of Science and Research (Deemed to be University), Vadlamudi, 522213, India, 2Department of Computer Science and Engineering, Guru Nanak University, Ibrahimpatnam, 501506

ABSTRACT

Plant pathology plays an important role in agriculture. We propose a solution, drawing on many existing approaches, that uses machine learning to detect the “late blight” pathogen on leaves. For this, we created a database to train our system. A plant leaf image is taken as the input for this proposal; the image is filtered, enhanced, and segmented to produce a binary image. To acquire good classification results we use K-means, whereas the co-occurrence distribution method is used for texture analysis.
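
A hedged sketch of the K-means segmentation step using OpenCV and scikit-learn; the image path, the number of clusters and the rule for picking the lesion cluster are illustrative assumptions, not the proposal's exact pipeline.

    import cv2
    import numpy as np
    from sklearn.cluster import KMeans

    leaf = cv2.imread("leaf.jpg")                         # BGR image of a single leaf (placeholder path)
    pixels = leaf.reshape(-1, 3).astype(np.float32)

    # Cluster pixels by colour so lesion-coloured regions separate from healthy tissue.
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(pixels)
    labels = km.labels_.reshape(leaf.shape[:2])

    # Treat the darkest cluster centre as the candidate "late blight" lesion cluster (an assumption).
    lesion_cluster = int(np.argmin(km.cluster_centers_.sum(axis=1)))
    binary = (labels == lesion_cluster).astype(np.uint8) * 255
    print("lesion pixel fraction:", float((binary > 0).mean()))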

KEYWORDS

Pathology, K-means Clustering Algorithm, Image Segmentation, Binary Image, CCM.


Minimising Delay and Energy in Online Dynamic Fog Systems

Faten Alenizi and Omer Rana, School of Computer Science and Informatics, Cardiff University, Cardiff, UK

ABSTRACT

The increasing use of Internet of Things (IoT) devices generates a greater demand for data transfers and puts increased pressure on networks. Additionally, connectivity to cloud services can be costly and inefficient. Fog computing provides resources in proximity to user devices to overcome these drawbacks. However, optimisation of quality of service (QoS) in IoT applications and the management of fog resources are becoming challenging problems. This paper describes a dynamic online offloading scheme in vehicular traffic applications that require execution of delay-sensitive tasks. This paper proposes a combination of two algorithms: dynamic task scheduling (DTS) and dynamic energy control (DEC) that aim to minimise overall delay, enhance throughput of user tasks and minimise energy consumption at the fog layer while maximising the use of resource-constrained fog nodes. Compared to other schemes, our experimental results show that these algorithms can reduce the delay by up to 80.79% and reduce energy consumption by up to 66.39% in fog nodes. Additionally, this approach enhances task execution throughput by 40.88%.
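
As a simplified illustration only (not the DTS/DEC algorithms), the sketch below shows a delay-aware dispatcher that assigns each task to the fog node with the smallest estimated completion delay and falls back to the cloud when no node can meet the task's deadline; all numbers are invented.

    # Toy fog nodes: current queueing delay and per-task service time, in milliseconds.
    fog_nodes = [{"name": "fog1", "queue_ms": 30, "per_task_ms": 20},
                 {"name": "fog2", "queue_ms": 10, "per_task_ms": 15}]
    CLOUD_DELAY_MS = 120    # assumed round-trip plus service delay when offloading to the cloud

    def dispatch(task_deadline_ms):
        """Send the task to the fog node with the smallest estimated delay, else to the cloud."""
        best = min(fog_nodes, key=lambda n: n["queue_ms"] + n["per_task_ms"])
        est = best["queue_ms"] + best["per_task_ms"]
        if est <= task_deadline_ms:
            best["queue_ms"] += best["per_task_ms"]       # the task joins that node's queue
            return best["name"], est
        return "cloud", CLOUD_DELAY_MS

    for deadline in (60, 60, 60, 50):
        print(dispatch(deadline))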

KEYWORDS

Dynamic Fog Computing, iFogSim, Computational Offloading, Energy Consumption, Delay.