NLPA - Accepted Papers

Welcome to NLPA 2020

International Conference on Natural Language Processing & Applications

(NLPA 2020)

June 13 ~ 14, 2020, Helsinki, Finland

Detection and Characterisation of the Non-periodic Noise Generated by Linear Loads in Power Electric Lines in a PLC – Power Line Communication System

Pablo Emilio Rozo Garcia and Johann Alexander Hernandez, Doctorate in Engineering, Francisco José de Caldas District University

ABSTRACT

In this work, the asynchronous impulsive (non-periodic) noise that occurs in power lines when they are used as a communication channel (PLC) is characterised and modelled. A strategy was implemented to detect the noise under different circumstances in a residential electrical environment. The study was conducted on purely linear loads. A measurement setup was assembled with appropriate measuring devices, and algorithms were designed to process the recorded information. The impulsive noises detected are of the Burst type, because these are the most critical noises that occur in a communication channel over power lines. Statistical methods are used to characterise and model the noise, and the Middleton model is used to determine its presence under certain conditions. The results are satisfactory: the Burst-type non-periodic noise is detected, and the proposed characterisation and modelling reproduce the measured behaviour closely.

KEYWORDS

Powerline, Communication, Noise, Burst, Middleton, Model, Characterisation
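
As a rough illustration of the Middleton model mentioned in the abstract above, the sketch below evaluates the amplitude density of Middleton's Class A impulsive-noise model, which is a Poisson-weighted mixture of zero-mean Gaussians. It assumes the Class A variant; the abstract does not say which Middleton class or which parameter values the authors used, so the function names, the parameters impulsive_index and gamma, and the example values are illustrative assumptions only.

```python
import math


def class_a_components(impulsive_index, gamma, total_power=1.0, terms=40):
    """Return (weight, variance) pairs of the Class A Gaussian mixture.

    impulsive_index: the parameter usually written A (assumed example value).
    gamma: ratio of background Gaussian power to impulsive power (assumed).
    total_power: overall noise variance; the mixture is scaled to match it.
    """
    components = []
    for m in range(terms):
        # Poisson probability that m impulses overlap at a given instant.
        weight = math.exp(-impulsive_index) * impulsive_index**m / math.factorial(m)
        # Variance of the Gaussian component associated with m impulses.
        variance = total_power * (m / impulsive_index + gamma) / (1.0 + gamma)
        components.append((weight, variance))
    return components


def class_a_pdf(x, impulsive_index, gamma, total_power=1.0, terms=40):
    """Probability density of the noise amplitude x under the Class A model."""
    density = 0.0
    for weight, variance in class_a_components(impulsive_index, gamma, total_power, terms):
        density += weight * math.exp(-x * x / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)
    return density


if __name__ == "__main__":
    # Hypothetical parameters for strongly impulsive noise.
    for amplitude in (0.0, 1.0, 3.0):
        print(amplitude, class_a_pdf(amplitude, impulsive_index=0.1, gamma=0.01))
```

A small impulsive_index concentrates most of the probability in the narrow background component while leaving heavy tails from the rare impulse components, which is the behaviour that makes burst noise critical for a PLC channel.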

Enhanced Homograph Attack Detection

Zicong Zhu¹, Tran Phuong Thao¹, Hoang-Quoc Nguyen-Son², Rie Shigetomi Yamaguchi¹, Toshiyuki Nakata¹, ¹Graduate School of Information Science and Technology, The University of Tokyo, Japan and ²KDDI Research Inc.

ABSTRACT

The Internationalized Domain Name (IDN) homograph attack is a web security problem in which attackers deceive users about which websites they are accessing by using visually similar (homograph) domain names. Recently, the growth of IDN homograph attacks has become a severe problem, exposing ordinary users to a significant risk of crimes such as fraud. In this paper, we propose a classification method for IDN homograph detection that makes use of the Structural Similarity Index (SSIM). Compared to the existing approach, the experimental results show that our improved classification method increases the accuracy from 95.07% to 96.18% and decreases the false positive rate from 3.92% to 3.23%. We also conducted an empirical analysis of the IDN homograph data and of the training processes of the SSIM classification approach to discuss why our method is advantageous for homograph detection.

KEYWORDS

Web Security, Internationalized Domain Name, Structural Similarity Index, Homograph Attack, Visual Similarity
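
To make the SSIM measure named in the abstract above more concrete, here is a minimal sketch that computes a single-window (global) SSIM between two small grayscale bitmaps. It is a simplification: standard SSIM averages the same expression over local windows, and the authors' classifier and rendering pipeline are not described in the abstract. The function name global_ssim and the 3x3 glyph bitmaps are hypothetical; the constants 0.01 and 0.03 are the values conventionally used in the SSIM definition.

```python
def global_ssim(image_a, image_b, dynamic_range=255.0):
    """Single-window SSIM of two equally sized grayscale images (lists of rows)."""
    a = [float(p) for row in image_a for p in row]
    b = [float(p) for row in image_b for p in row]
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and have the same size")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    var_a = sum((p - mean_a) ** 2 for p in a) / n
    var_b = sum((q - mean_b) ** 2 for q in b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n
    # Stabilising constants from the usual SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mean_a * mean_b + c1) * (2 * cov + c2)
    denominator = (mean_a**2 + mean_b**2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical 3x3 renderings of two look-alike characters.
    glyph = [[0, 255, 0], [255, 255, 255], [0, 255, 0]]
    lookalike = [[255, 255, 0], [255, 255, 255], [0, 255, 0]]
    print(global_ssim(glyph, glyph))
    print(global_ssim(glyph, lookalike))
```

A detector along these lines would render a candidate domain and a protected domain to images and treat a high similarity score as one feature for the classifier; the decision threshold would have to be learned from data.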

A 3D Simulation for the Feedback Loop Between Orbital Debris and Future Space Activities and Economy

Jack Liu¹, Emmanuel Reyes², Yu Sun³, ¹Portola High School, Irvine, CA, ²University of California, Irvine and ³California State Polytechnic University, Pomona

ABSTRACT

Since the success of SpaceX’s reusable launch system program, there has been a massive resurgence of interest in space: hundreds of companies and startups are racing to develop cheaper ways of venturing into the vacuum of space. As a result, the sustainability of the space environment will be put under great pressure and danger, threatening all other future space activities. In this study, we attempt to quantify the chain effect of various forms of space activity and orbital debris using Unity3D, and then propose a plan to use NASA’s simulation software Orbital Debris Engineering Model (ORDEM) 3.0 and Debris Assessment Software (DAS) 3.0.

KEYWORDS

Orbital Debris, 3D Simulation, Unity3D

Smart Learning Gateways for Omani HEIs: Benefits, Challenges and Solutions

Qasim Alajmic¹, Amer Abuali² and Mohammed A. Al-Sharafi³, ¹College of Arts & Humanities, Department of Education, A’ Sharqiah University (ASU), Oman, ²Taibah University, IS Department, KSA and ³Faculty of Computer Systems & Software Engineering, Universiti Malaysia Pahang, Malaysia

ABSTRACT

Globally, higher education has been transformed by the growth of information and communication technology. This advancement has led to the creation of conceptual frameworks for designing smart learning environments across the world. As a result, a great deal of today's teaching relies heavily on information and technology resources, and most higher education institutions have started to digitize their course curricula. Smart learning has become a global trend over the past few years, but it has not yet been discussed thoroughly in the Omani context. The purpose of this study is to examine the challenges and benefits of smart learning and to offer solutions from which higher education in Oman would benefit. Finally, this paper gives recommendations on how Omani HEIs can minimize and overcome the challenges that higher education students in Oman face in moving to a smart learning environment.

Determination of the Optimal Surface Area of Printed Boards

David Aleksanyan¹, Levon Stepanyan² and David Husikyan³, ¹Department of Communication Systems, Marshal Armenak Khamperyants Military Aviation Institute, Yerevan, Armenia, ²Department of Industrial Engineering and Systems Management, American University of Armenia, Yerevan, Armenia and ³Department of Communication Systems, National Polytechnic University of Armenia, Yerevan, Armenia

ABSTRACT

A mathematical model is used to determine the surface area of a printed board from its characteristic parameters. The functional dependency of the surface area of the printed board on the number of Integral Schemes (IS), Rent’s coefficient, the average number of outputs of the IS, and the minimal width of the metallic wirings is obtained.

KEYWORDS

Integral Schemes, Conduction Layers, Printed Board Surface Area, Rent’s Coefficient.

Machine Learning Approach Towards Road Accident Analysis in India

Shruti Singhal, Bhavini Priyamvada and Dr Rachna Jain, Computer Science Department, Bharati Vidyapeeth’s College of Engineering, Delhi, India

ABSTRACT

This paper aims to study, compare and analyse the performance of six major machine-learning techniques to better understand the occurrence of traffic accidents. The methods considered are Decision Trees, Support Vector Machines, Naïve Bayes, Random Forest, K-Nearest Neighbour, and Logistic Regression. To achieve the most realistic accident reduction under budgetary constraints, the study must be based on objective and scientific surveys that help to detect and prevent accidents and to understand their causes and the severity of injuries.

KEYWORDS

Machine learning, Random Forests Classification, Support Vector Machines, Decision Trees Classification

REBD: A Conceptual Framework for Big Data Requirements Engineering

SandhyaRani Kourla, Eesha Putti and Mina Maleki, Department of Electrical and Computer Engineering and Computer Science, University of Detroit Mercy, Detroit, MI, 48221, USA

ABSTRACT

Requirements engineering (RE), as a part of the project development life cycle, has increasingly been recognized as the key to ensuring on-time, on-budget, and goal-based delivery of software projects; compromising this vital phase leads to project failure. RE of big data projects is even more crucial because of the main characteristics of big data, including high volume, velocity, and variety. As the traditional RE methods and tools are user-centric rather than data-centric, employing these methodologies is insufficient to fulfill the RE processes for big data projects. Because of the importance of RE and the limitations of traditional RE methodologies in the context of big data software projects, this paper proposes a big data requirements engineering framework named REBD. This conceptual framework describes a systematic plan for carrying out big data projects, from requirements engineering through development, to support successful execution and increased productivity of big data projects.

KEYWORDS

Big data, requirements engineering, requirements elicitation, knowledge discovery.

GSM Based Automatic Energy Meter Reading and Billing System

Mohammad Golam Mortuza, Hassan Jaki, Md. Humayun Kabir, Zia Uddin Ahmed, Rabiul Hasan Tarek, Md. Razu Ahmed and Mohammed Jashim Uddin, Dept. of Electronic and Telecommunication Engineering, International Islamic University Chittagong, Chattogram, Bangladesh

ABSTRACT

Life without electricity is hard to imagine, since electricity has become an integral part of human life. In developing countries, people use postpaid electricity for their own purposes. However, they do not know how much electricity they have consumed or what it has cost until they receive the bill at the end of the month. With prepaid meters, too, people have to go to the meter to see consumption details. In this research, a system based on GSM technology has been designed to solve this problem. The prepaid meter has to be recharged before clients can use electricity. The system alerts the client to any kind of emergency. Moreover, when the client is away from the house, he can easily switch off the electricity supply by sending an SMS. This project will benefit both society and the country because it helps to reduce the wastage of electricity and makes it possible to check electricity consumption and bills remotely.

KEYWORDS

GSM, Automatic Meter Reading, Electricity Meter

Parking Assistance Display with Estimated Parking Space Using Stereo Vision

Chi-Cheng Cheng, Chi-Cheng Lee, and Jyun-Han Huang, Department of Mechanical and Electro-Mechanical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan, R.O.C.

ABSTRACT

Inexperienced drivers often struggle to complete parking tasks with the limited spatial information provided by side and rear-view mirrors. The major obstacle is that they cannot easily estimate the position of the parking space relative to their vehicle. Therefore, this paper aims to develop a parking assistance display system that continuously provides drivers with a top view of both the vehicle and the parking space. The system uses two wide-angle cameras mounted at the rear of the vehicle. To search efficiently for the two farther corners of the parking space, the FAST corner detection technique is employed. The three-dimensional spatial coordinates of those corners can then be determined by the stereo vision framework. As a result, the position of the parking space relative to the vehicle can be estimated. To verify the effectiveness of the proposed parking assistance display, parking experiments with a golf cart were conducted. Experimental results demonstrate that the parking tasks can be successfully accomplished with the help of the presented assistance display.

KEYWORDS

Parking Assistance Display, Stereo Vision, Corner Detection, Parking Space Estimation.
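
The step in the abstract above where corner coordinates are recovered by stereo vision can be sketched with the textbook triangulation formula for a rectified camera pair, Z = f * B / d. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes an ideal, undistorted and rectified pinhole pair (real wide-angle cameras would first need calibration and distortion correction), and the function name and all numeric values are hypothetical.

```python
def triangulate_corner(u_left, u_right, v, focal_px, baseline_m, cx, cy):
    """Return (x, y, z) in metres, in the left-camera frame, for one matched corner.

    u_left, u_right: horizontal pixel coordinates of the corner in each image.
    v: vertical pixel coordinate (identical in both images after rectification).
    focal_px: focal length in pixels; baseline_m: camera separation in metres.
    cx, cy: principal point in pixels.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity   # depth along the optical axis
    x = (u_left - cx) * z / focal_px        # lateral offset
    y = (v - cy) * z / focal_px             # vertical offset
    return x, y, z


if __name__ == "__main__":
    # Hypothetical pixel coordinates of the two farther parking-space corners.
    corners = [(420.0, 380.0, 240.0), (250.0, 210.0, 240.0)]
    for u_l, u_r, v in corners:
        print(triangulate_corner(u_l, u_r, v, focal_px=800.0, baseline_m=0.2, cx=320.0, cy=240.0))
```

With both corners expressed in the camera frame, the parking space can be drawn relative to the vehicle in a top view by projecting the (x, z) coordinates onto the ground plane.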

Low-cost automatic fish measuring estimation

Vicent Sanz Marco¹, David G. Valcarce², Marta F. Riesco², Vanesa Robles³, Olga Rubiera Rodriguez⁴ and Morito Matsuoka¹, ¹Cybermedia Centre, Osaka University, Osaka, Japan, ²Spanish Institute of Oceanography, Santander, Spain, ³Department of Molecular Biology, Universidad de Leon, Leon, Spain and ⁴Lancaster University, Lancaster, United Kingdom

ABSTRACT

For optimal fish raising in captivity, biomass calculation is usually an essential factor in estimating the ideal amount of food required. This process usually involves human-animal interaction; however, handling fish can affect their growth or even cause their death. In particular, some fish species, such as Senegalese sole, are easily stressed when they are handled outside their environment. Advances in image recognition systems have opened up a new range of possibilities for avoiding any kind of human-animal interaction. With a lowest estimation of 0.8 centimetres and around 95% detection accuracy, our novel prototype can provide a highly accurate fish measurement estimate based on an image, which can be supplied by any kind of device, such as a mobile phone.

KEYWORDS

Image Recognition, Neural Networks, Image Reconstruction, Biology, Applications

Melanoma detection in histopathological images using deep learning

Salah Alheejawi, Richard Berendt, Naresh Jha, and Mrinal Mandal, University of Alberta, Edmonton, Alberta, Canada

ABSTRACT

Histopathological images are widely used to diagnose diseases such as skin cancer. As digital histopathological images are typically very large, on the order of several billion pixels, automated identification of abnormal cell nuclei would be very helpful for doctors in performing quick diagnoses. In this paper, we propose a technique, using deep learning algorithms, to segment the cell nuclei in Hematoxylin and Eosin (H & E) stained images and detect the abnormal melanocytes in the histopathological images. The cell segmentation is done using a novel Convolutional Neural Network (CNN) architecture. The segmented cells are then classified into abnormal and normal nuclei using a Support Vector Machine classifier. Experimental results show that the CNN can segment the nuclei with more than 90% accuracy. The proposed technique has a low computational complexity.

KEYWORDS

Histopathological image analysis, Nuclei segmentation, Melanoma Detection, Deep learning.

Time Series Analysis for An Experimental data of Lithium-ion Battery

Liming Xie, North Dakota State University, Fargo, ND 58108, USA

ABSTRACT

Experimental data from lithium-ion batteries has its own particular characteristics. This paper analyzes and forecasts such data using the autoregressive integrated moving average (ARIMA) model and spectral analysis, which give effective and statistically meaningful results. The method includes identification of the data, estimation and diagnostic checking, and forecasting of future values, following Box and Jenkins. The analysis shows that the time series models relate the present value of a series to past values and past prediction errors. After transforming the data with different functions, the autocorrelations improve significantly. Forecasts of the future values fluctuate significantly, increasing or decreasing within specific ranges. In the spectral analysis, the parameters of the model were determined by analysing the experimental data to look for periodicities or cyclical patterns and to check for the existence of white noise in the data. Bartlett’s Kolmogorov-Smirnov statistic suggests white noise in the data. The spectral analysis for the series reveals non11-second cycle of activity for dynamic stress test current, but strong 45-second that highlights the position of the main peak in the spectral density; strong 21-second and 45-second for the urban dynamometer driving schedule current and voltage, respectively; but no significance for dynamic stress test current.

KEYWORDS

Dynamic stress test (DST), Urban dynamometer driving schedule (UDDS), Lithium-ion batteries (LIBs), Autoregressive integrated moving average (ARIMA), Autocorrelation function (ACF), Partial autocorrelation function (PACF).

Grant-free Safe-SCMA Based on Detection of Unknown Abnormal Codebooks

Hanyuan Huang, Tao Li and Hui Zhao, College of Cyber Security, Sichuan University, Chengdu, China

ABSTRACT

Non-orthogonal multiple access (NOMA) can support massive access in 5G. Sparse code multiple access (SCMA) is a typical NOMA technology. The basic principle of SCMA is that multi-user bit data are mapped directly into multi-dimensional complex sequences through the codebook. Grant-free SCMA allows users to select a codebook from the codebook pool and send data instantly, reducing the overhead and delay of the granting process. When the receiver and the sender use the same codebook information, the data can be transmitted correctly. However, current SCMA research does not consider the problem of asymmetric codebook information between sender and receiver caused by intrusion into the codebook pool. In this paper, abnormal codebook detection is proposed for grant-free SCMA. Because most intrusions are unknown, detection is realised by comparing test codebooks with normal states. Rather than focusing on the whole set of normal codebooks, the detection process is generated by extracting characteristics of normal codebooks. The objects tested can include, but are not limited to, codebook structure, constellations, relationships between constellations, and overall features of the codebook pool. Detection is executed step by step until an error state is discovered or all steps are completed, which avoids the high computational complexity of comparing directly with all normal codebooks. In addition, detected abnormal codebooks are saved and evolved to act as detectors. Future detection of unknown abnormal codebooks is performed by matching against the detectors evolved from known abnormal codebooks.

KEYWORDS

Grant-free SCMA, SCMA codebooks, abnormal codebooks detection, codebook features extraction, abnormal codebook evolution

Trusted Computing in Data Science: Viable Countermeasure in Risk Management Plan

Uchechukwu EMEJEAMARA, IEEE Computer Society, Connecticut Section, USA, Udochukwu NWODUH, Department of Computer Science, Federal Polytechnic Nekede, Nigeria and Andrew MADU, Department of Computer Science, Federal Polytechnic Nekede, Nigeria

ABSTRACT

The need for secure data systems has promoted the constant reinforcement of security systems in an attempt to prevent and mitigate risks associated with information security. In the information age, it is evident that companies cannot ignore the impact of data, specifically big data, on decision-making processes. Data promotes not only the proactive capacity to prevent unwarranted situations while exploiting opportunities but also the ability to keep pace with market competition. However, since overreliance on data exposes the company, trusted computing components are necessary to guarantee that data acquired, stored, and processed remains secure from internal and external malice. Numerous measures can be adopted to counter the risks associated with data exploitation and exposure due to data science practices. Nonetheless, trusted computing is a reasonable starting point for protecting provenance systems and big data systems through the establishment of a ‘chain of trust’ among the various computing components and platforms.

KEYWORDS

Trusted Computing, Security, Data, Data Science, Provenance, Big Data, Trusted Platform Module, Platform Computation Register

FCNNMD: A Novel Fusion Method Based on Convolutional Neural Network for Malware Detection

Jing Zhang¹ and Yu Wen², ¹Institute of Information Engineering, CAS & School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China and ²Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China

ABSTRACT

Malicious software is rampant and does great harm. The present mainstream malware detection technology has many disadvantages, such as high labour cost, large system overhead, and inability to detect new malware. We propose a novel fusion method based on convolutional neural networks for malware detection (FCNNMD). To address the sample imbalance problem faced by convolutional neural network malware detection methods, non-malicious samples are added, for example by generating them with a generative adversarial network, until they match the number of malicious samples. To address the low accuracy and the high false positive and false negative rates of single-model detection, a malware detection model is constructed by means of model fusion. The model combines four classical convolutional neural network structures. Experiments show that this method can effectively improve the accuracy and robustness of the model. Our method does not need to actually run the software and has high detection efficiency.

KEYWORDS

Malware Detection, Grayscale Image, Convolutional Neural Networks, Model integration.

The Parallel HTM Spatial Pooler with Actor Model

Damir Dobric¹, Andreas Pech², Bogdan Ghita¹ and Thomas Wennekers¹, ¹University of Plymouth, Faculty of Sciences and Engineering and ²Frankfurt University of Applied Sciences, Dept. Of Computer Science and Engineering

ABSTRACT

The Hierarchical Temporal Memory Cortical Learning Algorithm (HTM CLA) is an algorithm inspired by the biological functioning of the neocortex, which combines spatial pattern recognition and temporal sequence learning. It organizes neurons in layered column-units, built from many neurons, which are connected in more complex structures called regions (areas). Such hierarchically organized structures can also be connected in networks, which provide further cognitive capabilities such as invariant representation. The complex topology and the high number of neurons in this structure require far more compute power than a single machine with multicore processors and a GPU can provide. This paper aims to improve the HTM CLA by enabling it to scale horizontally across multiple nodes in a highly distributed system by using the Actor Programming Model. The proposed concept also makes use of existing cloud and serverless technology, and it enables easy setup and operation of the cortical algorithm in a distributed environment. The proposed model is based on a mathematical theory and computation model that targets massive concurrency. Using this model leads to a different way of reasoning about concurrent execution and should enable flexible distribution of cortical computation logic across multiple physical nodes. This work is the first on a parallel HTM Spatial Pooler running on multiple nodes with the named computational model. With the increasing popularity of cloud computing and serverless architecture, this work is a first step towards proposing interconnected independent HTM CLA units in an elastic cognitive network, and it can provide an alternative to deep neural networks, with theoretically unlimited scale in a distributed cloud environment. This paper specifically targets the redesign of a single Spatial Pooler unit.

KEYWORDS

Hierarchical Temporal Memory, Cortical Learning Algorithm, HTM CLA, Actor Programming Model, AI, Parallel, Spatial Pooler

An Investigation to Choose the Proper Therapy Technique in the Management of Autism Spectrum Disorder

Ilker Ozsahin¹,², Mubarak Taiwo Mustapha¹,², Safa Ameen Albarwary¹,² and Dilber Uzun Ozsahin¹,², ¹Department of Biomedical Engineering, Faculty of Engineering, Near East University, Nicosia, Mersin-10 TRNC, 99138 Turkey and ²DESAM Institute, Near East University, Nicosia, Mersin-10 TRNC, 99138 Turkey

ABSTRACT

Autism spectrum disorder (ASD) is a group of neurodevelopmental conditions in which individuals face challenges with social engagement and age-appropriate play and fail to develop peer relationships appropriate to their developmental level. This study aims to evaluate, compare and rank the therapy techniques used in the management of ASD using the fuzzy preference ranking organization method for enrichment evaluation (PROMETHEE), a multi-criteria decision-making approach. Fuzzy PROMETHEE performs a pair-wise comparison of alternatives using a preference function and weights. These parameters were prioritized based on their importance for the survivability of the patient. The techniques we selected are as follows: applied behavioral analysis (ABA), cognitive behavioral therapy (CBT), speech therapy, and pharmacological therapies such as risperidone and aripiprazole. The results indicate that CBT is the most preferred technique, followed by ABA, aripiprazole, speech therapy, and risperidone. Nonetheless, new criteria and parameters could be considered, and weights could be assigned based on the interests of the decision maker. We showed the applicability of the proposed technique in informing decision makers in choosing the right therapy technique for the management of ASD.

KEYWORDS

Autism Spectrum Disorders, Therapy, Decision-Making, Fuzzy, PROMETHEE
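
The pair-wise comparison with a preference function and weights described in the abstract above can be illustrated with a crisp PROMETHEE II sketch. This is a simplification and not the authors' procedure: it omits the fuzzy step, uses the simple "usual" preference function (any strict improvement counts as full preference), and ranks generic placeholder options with made-up scores rather than the therapies or criteria from the study. The function name promethee_net_flows is hypothetical.

```python
def promethee_net_flows(scores, weights, maximize):
    """Return the PROMETHEE II net outranking flow of each alternative.

    scores: dict mapping alternative name -> list of criterion values.
    weights: importance weight of each criterion.
    maximize: per criterion, True if larger is better, False if smaller is better.
    """
    names = list(scores)
    if len(names) < 2:
        raise ValueError("at least two alternatives are required")
    total_weight = sum(weights)

    def preference(a, b):
        # Weighted share of criteria on which a is strictly better than b.
        preferred = 0.0
        for j, weight in enumerate(weights):
            difference = scores[a][j] - scores[b][j]
            if not maximize[j]:
                difference = -difference
            if difference > 0:
                preferred += weight
        return preferred / total_weight

    flows = {}
    for a in names:
        others = [b for b in names if b != a]
        positive = sum(preference(a, b) for b in others) / len(others)
        negative = sum(preference(b, a) for b in others) / len(others)
        flows[a] = positive - negative
    return flows


if __name__ == "__main__":
    # Placeholder alternatives and scores, purely for illustration.
    example = {"option_a": [3, 1], "option_b": [2, 2], "option_c": [1, 3]}
    flows = promethee_net_flows(example, weights=[2, 1], maximize=[True, True])
    for name in sorted(flows, key=flows.get, reverse=True):
        print(name, round(flows[name], 3))
```

Alternatives are ranked by descending net flow, and the net flows always sum to zero, which is a quick sanity check on any implementation.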

Automation of Purchase Order in Microsoft Dynamics 365 by Deploying Selenium

Vijay Biju and Shahid Ali, Department of Information Technology, AGI Institute, Auckland, New Zealand

ABSTRACT

Regression testing is very important for dynamic verification. It involves running a suite of test cases periodically, and after major changes in the design or its environment, to check that no new bugs have been introduced. A survey of 115 software professionals provides evidence of the benefits of automated testing: it saves time and cost, because test scripts can be re-run again and again much more quickly than manual tests; it gives more confidence in the quality of the product; it increases the ability to meet schedules; and it significantly reduces the effort required from testers. In addition, an automated regression suite can exercise the whole software every day without requiring much manual effort. Bug identification is also easier after incorrect changes have been made. Genius is under continuous development and requires repeated testing to check whether new feature implementations have affected the existing functionality. Moreover, Erudite faces difficulty in validating Genius installations at client sites, since this requires testers to be available to check the critical functionality of the software manually. Erudite wants to create an automated regression suite for Genius that can be executed at the client site to check the functionality of the software. This suite will also help the testing team to validate whether new features added to the existing software affect the existing system. Visual Studio, Selenium Webdriver, Visual SVN and Trello are the tools used to create the automated regression suite. This research provides guidelines for future researchers on how to create an automated regression suite for any web application using open source tools.

KEYWORDS

Automation testing, Regression testing, Visual Studio, C#, Selenium Webdriver, Agile-Scrum

Comparative Study of BERT Models for Open QA using CORD-19

Nikita Agarwal, Bhabesh Chanduka and Parul Gupta, College of Information and Computer Sciences, University of Massachusetts Amherst, USA

ABSTRACT

With the increasing number of people suffering from the coronavirus disease, it is essential that medical personnel have quick access to important information in order to understand the coronavirus and discover a cure faster. This work addresses the task of extracting relevant documents from the recent CORD-19 dataset. We frame the information extraction task as an open QA problem and compare the results of three models based on variants of BERT: BERT-Base and the domain-specific language models SciBERT and BioBERT. We report that SciBERT performs better for abstract-based extraction and BioBERT performs better for document- and paragraph-based extraction (hybrid model). The hybrid model extracts the most relevant paragraphs from the top k articles returned by the document-based model.

KEYWORDS

Information Retrieval, Natural Language Processing, CORD-19, BERT

Systematic Attack Surface Reduction for Deployed Sentiment Analysis Models

Josh Kalin¹, Gerry Dozier¹, and David Noever², ¹Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA and ²PeopleTec, Inc, Huntsville, AL, USA

ABSTRACT

This work proposes a structured approach to baselining a model, identifying attack vectors, and securing machine learning models after deployment.

KEYWORDS

Machine Learning, Sentiment Analysis, Adversarial Attacks, Substitution Attacks

Critical Discourse Analysis of Huawei on Twitter

Zheng qiqi

ABSTRACT

Compared with traditional media, online social media seem to provide more opportunities for people to voice their ideas. Twitter brings users from all over the world onto one platform, so that users from different places can exchange their views over the Internet. These functions are decentralized in nature, which is expected to change the original power structure in international communication. Huawei is an outstanding representative of Chinese companies whose texts to some extent illustrate the overseas public’s evaluation of China's image. On that basis, this study adopts Fairclough’s theory of Critical Discourse Analysis and analyzes Huawei discourse on Twitter. In this way, the paper investigates the production, distribution and consumption of Huawei discourse on Twitter. It also discusses the construction and dissolution of the power structure behind social media in the new media era.

KEYWORDS

Twitter, Huawei, CDA theory, Meng Wanzhou, Ren Zhengfei

Contact Us

nlpa@csit2020.org