2nd International Conference on Data Mining and NLP (DNLP 2021)

August 21 ~ 22, 2021, Chennai, India

Accepted Papers


Identifying Ransomware Actors in the Bitcoin Network


Siddhartha Dalal, Zihe Wang and Siddhanth Shabrawal, Columbia University, New York, USA

ABSTRACT

Due to the pseudo-anonymity of the Bitcoin network, users can hide behind their bitcoin addresses that can be generated in unlimited quantity, on the fly, without any formal links between them. Thus, it is being used for payment transfer by the actors involved in ransomware and other illegal activities. The other activity we consider is related to gambling since gambling is often used for transferring illegal funds. The question addressed here is that given temporally limited graphs of Bitcoin transactions, to what extent can one identify common patterns associated with these fraudulent activities and apply them to find other ransomware actors. The problem is rather complex, given that thousands of addresses can belong to the same actor without any obvious links between them and any common pattern of behavior. The main contribution of this paper is to introduce and apply new algorithms for local clustering and supervised graph machine learning for identifying malicious actors. We show that very local subgraphs of the known such actors are sufficient to differentiate between ransomware, random and gambling actors with 85% prediction accuracy on the test data set.

KEYWORDS

Ransomware Actors Identification, Graph Machine Learning, Local Clustering, Bitcoin Network.


Decentralized E-mail System using Blockchain, Smart Contracts and Interplanetary File System


Ankita Vaid, Urvee Ausekar and Abhijit A.M, Computer Engineering and IT, College of Engineering Pune, India

ABSTRACT

This paper describes an e-mailing system built using Blockchain and IPFS that provides complete privacy, end-to-end encryption, decentralization, non-repudiation, and solutions to problems like spam, spoofing and phishing. This is a first-of-its-kind solution that provides all the aforesaid features.

KEYWORDS

Blockchain, Email, Privacy, Security, Interplanetary File System.


Digital Land Registry System using Blockchain


Ekta Kithani1, Jaya Tanwani1, Bhavesh Mangnani1, Nikita Achhra1, Prof. Richa Sharma2, Prof. Rohini Temkar2, 1Student of Fourth Year Computer Engineering, Vivekanand Education Society’s Institute of Technology Chembur Mumbai 400074, 2Assistant Professor, Department of Computer Engineering, Vivekanand Education Society’s Institute of Technology Chembur Mumbai 400074

ABSTRACT

In today’s world, the security of data plays a very important role, and many industrial sectors are trying to secure their data from hackers. Blockchain is an advanced technology through which peers can digitally transfer currency, financial documents, land properties, etc. It is an open-source public network where no central authority is needed: a peer-to-peer network where all transactions, value transfers, and data shared through a single node are verified by all other connected nodes in the network. The traditional land registration process is slow and laborious, involves many intermediaries, and carries a high risk of fraudulent and fake land transfers. Blockchain is well suited to the land transfer process. This paper proposes a solution for securely transferring land ownership using blockchain technology, in which buyers and sellers make a land ownership deal on the Ethereum network without involving any intermediaries.

KEYWORDS

Blockchain, Ethereum, Smart contracts, Cryptography, Land Registry, IPFS, RSA, SHA256.


Researching Blockchain Technology and its Usefulness in Higher Education


Shankar Subramanian Iyer1, Arumugam Seetharaman2 and Bhanu Ranjan3, 1Researcher, S.P. Jain School of Global Management, Dubai, 2Dean Academic Affairs at S P Jain School of Global Management, Singapore, 3Associate Professor, S.P. Jain School of Global Management- Singapore

ABSTRACT

The current paper focuses on the potential of using Blockchain Technology (BCT) in the Higher Education Domain and explores its usefulness in solving Higher Education issues. This research discusses Blockchain features, challenges and benefits in education, followed by a review of some current Blockchain Higher Education applications. This paper reviews the Blockchain Technology (BCT) and its implementation in Higher Education. This research used a quantitative methodology and a stratified clustered simple random sampling approach. Data were gathered through an online survey instrument, and the partial least squares structural equation modelling (PLS-SEM) technique was applied to 383 responses. Blockchain technology has unique features and benefits that can address Education system requirements, and the issues affecting its successful implementation are discussed. An effort was made to gather enough consensus to build future implementation. The integrated model of Blockchain features, matched to the needs of the Education System by agreement of the experts (discussions) and by a survey involving students, teachers, educationists, Blockchain experts, and professionals, is tested and validated by SEM using PLS.

KEYWORDS

Blockchain Technology (BCT), Higher Education Implementation, Higher Education Domain, Higher Education Management, Higher Education Technology, Structural Equation Model.


IoT based Advanced Attendance Management System


Anoj J1, Dr. R Sridharan2 and Dr.V.Karthikeyan3, 1Department of Mechanical Engineering, NIT, Calicut, Kerala, India, 2Department of Mechanical Engineering, NIT, Calicut, Kerala, India, 3Department of Electrical and Electronics Engineering, NIT, Calicut, Kerala, India

ABSTRACT

The project's purpose is to develop a biometrics and facial detection-based attendance register for educational institutions, such as colleges and schools. Hardware and software engineering concepts are merged in this project to produce a consumer product that can replace the current way of attendance recording. This project also makes use of the Internet of Things (IoT) for data transport, storage, and presentation. To construct the system, face detection and fingerprint detection were integrated into a gateway with a Wi-Fi module that is connected to a cloud server. The output can be received using a mobile application that is available at the end of each semester.

KEYWORDS

Raspberry Pi (RPi), Face detection, Biometric, Internet-of-Things (IoT), MT CNN.


Proof of Authentication Based Blockchain Architecture to Meet Challenges in Access Control and Secure Management of Electronic Health Records in IoT based Healthcare Systems


Maria Arif, Megha Kuliha and Sunita Varma, Department of Information Technology, S.G.S.I.T.S.,Indore, M.P., India

ABSTRACT

The secure, immutable and transparent features of blockchain have led researchers to find ways to harness its potential in sectors other than financial services. Blockchain is gaining popularity as a tool that could help solve some of the healthcare industry's age-old problems that have resulted in delayed treatments, inaccessible health records in emergencies, wasteful spending and higher costs for health care providers, insurers and patients. Applying blockchain in healthcare brings a new challenge of integrating blockchain with Internet of Things (IoT) networks, as sensor-based medical and wearable devices are now used to gather information about the health of a patient and provide it to medical applications using wireless networking. This paper proposes an architecture that would provide a decentralized, secure, immutable, transparent, scalable and traceable system for management and access control of electronic health records (EHRs) through the use of consortium blockchain, smart contracts, proof-of-authentication (PoAh) consensus protocol and decentralized cloud.

KEYWORDS

Blockchain, Proof of Authentication, Smart Contracts, Internet of Things, Healthcare.


Music Signal Analysis: Regression Analysis


V. N. Aditya Datta Chivukula and Sri Keshava Reddy Adupala, Department of Computer science and Engineering, International Institute of Information Technology, Bhubaneswar, Odisha, India

ABSTRACT

Machine learning techniques have become a vital part of ongoing research in technical areas. In recent times the world has witnessed many beautiful applications of machine learning in a practical sense which amaze us in every aspect. This paper asks whether we should always rely on deep learning techniques, or whether simple statistical machine learning algorithms can outperform simple deep learning algorithms when the application is understood and the data is processed so that the performance of the algorithm increases by a notable amount. The paper emphasizes the importance of data pre-processing over that of the selection of the algorithm. It discusses functions involving trigonometric, logarithmic, and exponential terms and also talks about functions that are purely trigonometric. Finally, we discuss regression analysis on music signals.

KEYWORDS

Machine Learning, Regression, Music Signal Analysis.


A step counting optimization strategy based on deep reinforcement learning


Zhoubao Sun1* and Mengchen Sun2, 1Jiangsu Key Laboratory of Public Project Audit, Nanjing Audit University, Nanjing, 211815, China, 2College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, 211006, China

ABSTRACT

With the popularity of Internet of Things technology and intelligent devices, the application prospects of accurate step counting have gained more and more attention. To solve the problems that existing algorithms use a threshold to filter noise and that their parameters cannot be updated in time, this paper introduces an intelligent optimization strategy based on deep reinforcement learning. In this paper, the problem is transformed into a serialization decision optimization. This paper integrates noise recognition and user feedback to update parameters. The processing is end-to-end, which alleviates the inaccuracy of step counting in the follow-up step counting module caused by the inaccuracy of noise filtering in two-stage processing, and allows the model parameters to be continuously updated. Finally, the experimental results show that the proposed model achieves superior performance to existing approaches.

KEYWORDS

Optimization Strategy, Step Counting, Deep Reinforcement Learning.


Summarization of Commercial Contracts


Keshav Balachandar1, AnamSaatvik Reddy1, A. Shahina1, Nayeemulla Khan2, 1Department of Information Technology, SSN College of Engineering, Chennai, India, 2School of Computer Science and Engineering, VIT University, Chennai, India

ABSTRACT

In this paper, we propose a novel system for providing summaries for commercial contracts such as Non-Disclosure Agreements (NDAs), employment agreements, etc. to enable those reviewing the contract to spend less time on such reviews and improve understanding as well. Since it is observed that a majority of such commercial documents are paragraphed and contain headings/topics followed by their respective content along with their context, we extract those topics and summarize them as per the user’s need. In this paper, we propose that summarizing such paragraphs/topics as per requirements is a more viable approach than summarizing the whole document. We use extractive summarization approaches for this task and compare their performance with human-written summaries. We conclude that the results of extractive techniques are satisfactory and could be improved with a large corpus of data and supervised abstractive summarization methods.

KEYWORDS

Text summarization, automatic summarization, commercial contracts.


Detection of Oil Tank From High Resolution Remote Sensing Images Using Morphological and Statistical Tools


D. Chaudhuri, DRDO Integration Centre Panagarh Base, Muraripur, Burdwan – 731219, W.B., INDIA

ABSTRACT

Oil tank is an important target and automatic detection of the target is an open research issue in satellite based high resolution imagery. This could be used for disaster screening, oil outflow, etc. A new methodology has been proposed for consistent and precise automatic oil tank detection from such panchromatic images. The proposed methodology uses both spatial and spectral properties domain knowledge regarding the character of targets in the sight. Multiple steps are required for detection of the target in the methodology – 1) enhancement technique using directional morphology, 2) multi-seed based clustering procedure using internal gray variance (IGV), 3) binarization and thinning operations, 4) circular shape detection by Hough transform, 5) MST based special relational grouping operation and 6) supervised minimum distance classifier for oil tank detection. IKONOS and Quickbird satellite images are used for testing the proposed algorithm. The outcomes show that the projected methodology in this paper is both precise and competent.
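
The final step listed in the abstract above, a supervised minimum distance classifier, can be illustrated with a minimal sketch. The two features (a circularity score and a scaled mean gray level), the class names, and the training values are hypothetical and are not taken from the paper; the sketch only shows the general technique of assigning a candidate region to the class with the nearest mean feature vector.

```python
import math

def class_means(samples):
    """Average the feature vectors of each labelled class."""
    means = {}
    for label, vectors in samples.items():
        dim = len(vectors[0])
        means[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return means

def classify(features, means):
    """Return the label whose class mean is nearest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(features, means[label]))

# Hypothetical training features: (circularity, mean gray level scaled to 0..1)
training = {
    "oil_tank": [[0.95, 0.80], [0.90, 0.70]],
    "other": [[0.40, 0.30], [0.50, 0.20]],
}
means = class_means(training)
label = classify([0.88, 0.75], means)
print(label)
```

In this toy setting a candidate with high circularity falls closest to the hypothetical oil tank mean; a real system would derive its features from the earlier enhancement, clustering and Hough transform stages.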

KEYWORDS

Recognition, remote sensing, resolution, enhancement, supervised procedure, clustering, minimal spanning tree (MST).


Classification of Mamographic Images by Openvino: A Proposal of use to Enhance more Efficiency in Cancer Diagnosis


Horacio Emidio de Lucca Junior1, Arnaldo Rodrigues Santos Jr2, 1Centro Educacional da Fundação Salvador Arena, CEFSA, Brazil, 2Centro de Ciências Naturais e Humanas (CCNH), Universidade Federal do ABC, São Bernardo do Campo, SP, Brazil

ABSTRACT

Diseases that are characterized by the disordered growth of cells that, in many cases, have the property of invading tissues and organs are commonly called cancer. Such cells divide quickly and the invasion can be very aggressive and uncontrolled, resulting in the formation of malignant tumors. These tumors may migrate to other parts of the body, an event known as metastasis. How aggressive a neoplasm is and the speed with which its cells multiply are what differentiate malignant neoplasms from benign ones. Benign tumors grow more slowly and cause less damage to neighboring tissues. Mammography is a radiological exam that generates high-quality images of the breast in gray scale. When mammography is performed, it must be done on both breasts, as there is usually a high degree of symmetry between them. This comparison helps to determine whether there are any anomalies between them. Mammographic images from libraries of the European mini-MIAS database and the American digital database DDSM were used in this research for digital improvement and characteristic analysis using some well-known computer programs. These include artificial intelligence to contribute to the proper interpretation of data and the collection of results, assisting health professionals in a more accurate diagnosis of the pathology. The main objective of this work is to analyze mammography images of breast nodules and to propose a method of classification by shape and texture using computer programs that can maximize the accuracy of the diagnosis regarding whether or not a tumor is malignant. It is a tool that can be useful as a contribution to the interpretation of results for mastologists who identify such nodules through the analyzed radiological images.

KEYWORDS

Diagnostic imaging, Image processing, Computer-Aided Detection, Computer-Aided Diagnosis.


Classification of Cardiac Arrhythmia Using Machine Learning


Abhinandan Baral, Akash.S, NishanthReddy.B and Dr.Damodar Panigrahy, Department of ECE, SRMIST, Chennai, India

ABSTRACT

Early diagnosis of diseases in the medical sector is now possible due to rapid advancements in technology. One example is the detection of cardiac arrhythmia at an early stage. Analyzing the data set manually generally gives rise to inaccuracy due to the miscalculation of beats. Proper differentiation between normal and abnormal classes is not guaranteed, interpretation is complicated, and more time is consumed for larger data sets. These constraints find an effective remedy in machine learning techniques, which offer greater potential and accuracy in detecting the possibility of severe cardiac arrhythmia. The data can be analyzed and the disease detected earlier through machine learning so that an early diagnosis can be given.


A Video Multi-Parameter Quality Assessment Model based on 3D Convolutional Neural Network on the Cloud


Xue Li1, Jiali Qiu2, 1Xidian University, Xi’an, China, 2Xi’an Jiaotong University, Xi’an, China

ABSTRACT

With the rapid development of big data and artificial intelligence technology, users prefer uploading more and more local files to the cloud server to reduce the pressure on local storage. However, when users upload more and more duplicate files, especially images and videos, this not only wastes network bandwidth but also makes server management much more inconvenient. To solve these problems, we design a video multi-parameter quality assessment model based on a 3D convolutional neural network for the video deduplication system. We use a method similar to the analytic hierarchy process to comprehensively evaluate the impact of packet loss rate, codec, frame rate, bit rate, and resolution on video quality, build a two-stream 3D convolutional neural network from the spatial flow and timing flow to capture the details of video distortion, and set the coding layer to remove redundant distortion information. Finally, the LIVE and CSIQ data sets are used for experimental verification, and we compare the performance of the proposed scheme with the V-BLIINDS scheme and VIDEO scheme under different packet loss rates. We also use part of the data set to simulate the interaction process between the client and the server, then test the time cost of the scheme. On the whole, the scheme proposed in this paper has a high quality assessment efficiency.

KEYWORDS

Video quality assessment, 3D CNN, Packet loss rate, SRCC, PLCC.


A Comprehensive Study on Phishing Attacks and Different Available Detection Approaches


Janhavi V, Neha R, Anagha B C, Megha Ambi and Bharatesh B S, CSE, VVCE, Mysuru

ABSTRACT

This paper mainly concentrates on fraudulent phishing attacks along with their detection techniques. As technological advancements happen at a large scale, the world is becoming small. Due to these technological advancements the whole world is moving towards a digital society. With these changes comes a plethora of cybercrimes such as Phishing, Cyberextortion, Ransomware attacks, Cryptojacking, Cyberespionage, and many more. Phishing being one of the most fraudulent of these attacks, extreme care and awareness are needed to safeguard our data.

KEYWORDS

Phishing Attacks, Fraudulent Method, Data Mining, Visual Similarity, Heuristics.


Lyrics to Music Generator: Statistical Approach


V.N Aditya Datta Chivukula1, Abhiram Reddy Cholleti2 and Rakesh Chandra Balabantaray2, 1Department of Computer Science and Engineering, 2Department of Electronics and Telecom Engineering International Institute Of Information Technology, Bhubaneswar, Odisha, India

ABSTRACT

Natural Language Processing is in growing demand with recent developments. This Generator model is one such example of a music generation system conditioned on lyrics. The model proposed has been tested on songs having lyrics written only in English, but the idea can be generalized to various languages. This paper’s objective is mainly to explain how one can create a music generator using statistical machine learning methods and how effectively the outputs, which are music signals, can be formulated, as they run to millions of samples over a short time frame. The parameters mentioned in the paper only serve an explanatory purpose. This paper discusses an effective statistical formulation of the output, thereby reducing the vast number of output parameters to be estimated, and how to reconstruct the audio signals from predicted parameters by using a ‘phase-shift algorithm’, which is also discussed in the paper in detail.

KEYWORDS

Audio Signal Processing, Statistical Machine Learning.


Leveraging of Weighted Ensemble Technique for Identifying Medical Concepts from Clinical Texts at Word and Phrase Level


Dr.Dipankar Das, Krishna Sharma, Department of Computer Science & Engg., Jadavpur University, India

ABSTRACT

Concept identification from medical texts becomes important due to digitization. However, it is not always feasible to identify all such medical concepts manually. Thus, in the present attempt, we have applied five machine learning classifiers (Support Vector Machine, K-Nearest Neighbors, Logistic Regression, Random Forest and Naïve Bayes) and one deep learning classifier (Long Short Term Memory) to identify medical concepts by training a total of 27.383K sentences. In addition, we have also developed a rule based phrase identification module to help the existing classifiers for identifying multiword medical concepts. We have employed word2vec technique for feature extraction and PCA and TSNE for conducting ablation study over various features to select important ones. Finally, we have adopted two different ensemble approaches, stacking and weighted sum to improve the performance of the individual classifier and significant improvements were observed with respect to each of the classifiers. It has been observed that phrase identification module plays an important role when dealing with individual classifier in identifying higher order n-gram medical concepts. Finally, the ensemble approach enhances the results over SVM that was showing initial improvement even after the application of phrase based module.
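
The weighted-sum ensemble mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the general idea only: the model names, class probabilities and weights are hypothetical, and the paper's actual weighting scheme is not described in the abstract.

```python
def weighted_sum_ensemble(prob_by_model, weights):
    """Combine per-model class probabilities with a normalised weighted sum."""
    total = sum(weights[name] for name in prob_by_model)
    combined = {}
    for name, probs in prob_by_model.items():
        for label, p in probs.items():
            combined[label] = combined.get(label, 0.0) + weights[name] * p / total
    return combined

def predict(prob_by_model, weights):
    """Return the label with the highest combined score."""
    combined = weighted_sum_ensemble(prob_by_model, weights)
    return max(combined, key=combined.get)

# Hypothetical per-classifier probabilities for one token
probs = {
    "svm": {"concept": 0.60, "other": 0.40},
    "lstm": {"concept": 0.20, "other": 0.80},
    "naive_bayes": {"concept": 0.55, "other": 0.45},
}
# Hypothetical weights, e.g. proportional to validation performance
weights = {"svm": 0.2, "lstm": 0.5, "naive_bayes": 0.3}

combined = weighted_sum_ensemble(probs, weights)
prediction = predict(probs, weights)
print(combined, prediction)
```

With these made-up numbers the heavily weighted classifier outvotes the other two, which is the behaviour that distinguishes a weighted sum from a simple majority vote.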

KEYWORDS

Medical Concepts, Phrase Identification, Ensemble.


Social Media Effects on the Market: Reddit Data Analysis on Stocks


Juan Andrés Talamás, School of Engineering and Science, Tecnológico de Monterrey, Monterrey, Mexico

ABSTRACT

As social media continues to become a more essential part of life in general, including hobbies and general interests, it is beginning to have effects that were previously not considered possible. While the social and sales effects of online communities are well known, these are usually focused on individual choice (i.e. what to buy, where to go eating, etc.). Actually affecting the value of a company was something that almost no one expected. The recent events regarding Gamestop, Dogecoin, Blackberry, and some others prove otherwise and show that social media has reached a point where these communities can consciously and purposely increase or decrease value. This analysis will attempt to explain the overall influence of social media over perceived value (stock price), and attempt to find a model that could lead to a better understanding of digital communities and their effects on the stock market.

KEYWORDS

Social media, Reddit, stocks, Data science.


Semantic Malayalam Dialogue System for Covid-19 Question Answering


Liji S K and Muhamed Ilyas P, Department of Computer Science, Sullamussalam Science College, Kerala, India

ABSTRACT

Covid-19, a global pandemic, has affected millions of people physically and mentally. The dynamic and rapidly evolving situation with COVID-19 has made it more difficult to communicate accurate and authoritative information about the disease. To resolve this issue, we propose a semantic Malayalam Dialogue System for COVID-19 related Question Answering. This is a user-friendly knowledge system that automatically delivers relevant answers to COVID-19 related queries in the Malayalam language. NLP techniques are used for document processing, and word embedding methods (CBOW and Skip Gram) and neural network models are used for semantic modelling. Finally, a cosine similarity measure is used to map and retrieve the answers for the users' queries. The experiment is conducted with a Malayalam dataset that we created ourselves. Here we compare the performance of two Word2Vec algorithms, CBOW and Skip Gram. We identified that Skip Gram performs more efficiently with our data set. We also noticed that CBOW is faster than the Skip Gram model.
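
The retrieval step described in the abstract above, mapping a query to an answer by cosine similarity, can be sketched as follows. The three-dimensional vectors and answer texts are hypothetical placeholders; in the described system the vectors would come from Word2Vec embeddings, which this sketch does not compute.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def best_answer(query_vec, answers):
    """Return the text of the stored answer most similar to the query vector."""
    return max(answers, key=lambda item: cosine_similarity(query_vec, item[1]))[0]

# Hypothetical (answer text, embedding) pairs
answers = [
    ("answer about symptoms", [0.9, 0.1, 0.0]),
    ("answer about vaccination", [0.1, 0.8, 0.3]),
]
query = [0.2, 0.7, 0.2]  # hypothetical embedding of a user query
result = best_answer(query, answers)
print(result)
```

Because cosine similarity depends only on direction, queries and answers of different lengths can still be compared once they are reduced to fixed-size vectors.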

KEYWORDS

Question Answering, Dialogue System, Word Embedding, Word2Vec, Continuous Bag Of Word, Skip Gram, Cosine Similarity.


Preparing Annotated Data on Covid - 19 by Employing Naïve Bayes


Dr. Dipankar Das and Vidit Sarkar, Department of Computer Science & Engg., Jadavpur University, India

ABSTRACT

The ongoing pandemic has opened a Pandora’s box of hidden problems which society has been hiding for years. But the positive side of the present scenario is the opening up of opportunities to solve these problems on the global stage. One such area, flooded with all kinds of emotions and reactions from people all over the world, is Twitter, a micro-blogging platform. Coronavirus-related hashtags have been trending for many days, unlike any other event in the past. Our experiment mainly deals with the collection, tagging and classification of these tweets based on the different keywords that they may belong to, using the Naive Bayes algorithm at the core.

KEYWORDS

Covid-19, Naïve Bayes, Clustering.


Effective Combination of Bert and Cross-Sentence Contexts in Aspect Extraction


Anh Khoi Le and Truong Son Nguyen, Ho Chi Minh University of Science, National University, Ho Chi Minh City, Viet Nam

ABSTRACT

The Aspect Extraction (AE) field investigates the collection of words that are sentiment aspects in sentences and documents. Despite the pandemic, the number of products purchased online is still growing, which means that the number of product reviews and comments is also increasing rapidly, so the task is becoming increasingly crucial. Extracting aspects from text is a difficult task that requires algorithms capable of deeply capturing the semantics of the text. In this work, we combine the models of two research groups: the first uses the BERT algorithm with multiple concatenated layers, and the second uses strategies to enrich the dataset by itself in the training or testing phase. The source code is available on github.com; researchers can run it through scripts and modify it for further research. https://github.com/leanhkhoi/AE_BERT_CROSS_SENTENCES

KEYWORDS

Sequence Labeling, Aspect Extraction, BERT, Cross-sentence.


Fine-Tuned Predictive Models for Forecasting Severity Level of Covid-19 Patient using Epidemiological Data


Shweta Tikhe and Dipti P. Rana, Department of Computer Engineering, SVNIT, Surat, Gujarat, India

ABSTRACT

Healthcare Data Analytics is an analysis technique that uses current and historical data related to the health domain to improve outreach, predict trends, and manage health-related matters. Severity level prediction will help battle the COVID-19 pandemic by providing early predictions for treatment. Supervised machine learning models were developed for predicting the severity level of COVID-19 positive patients using Random Forest, Decision Tree, AdaBoost, K-Nearest Neighbours, and Naïve Bayes with the epidemiological dataset of COVID-19. Various sampling techniques and feature selection techniques such as feature score, feature importance, and correlation matrix are used to minimize the execution time and to improve and fine-tune the model. The performance evaluation of the machine learning models showed that the Random Forest classifier has the best results, followed by the Decision Tree classifier, with the SMOTE over-sampling technique. The Random Forest model has the highest accuracy, 98.32%, and a precision of 97.52%.
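
The SMOTE over-sampling technique named in the abstract above creates synthetic minority-class samples by interpolating between a sample and one of its nearest minority neighbours. The following is a simplified, SMOTE-style sketch of that interpolation step only, using made-up two-dimensional points; it is not the authors' pipeline and omits the classifiers entirely.

```python
import math
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of the base point within the minority class
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Hypothetical minority-class feature vectors
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_like(minority, n_new=3)
print(new_points)
```

Each synthetic point lies on the line segment between two real minority samples, so the minority class is densified without simply duplicating records.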

KEYWORDS

Data Mining, Data analysis, COVID-19, Feature Engineering, Machine Learning.


Dictionary based Gender Identification and Gender based Sentiment Analysis with Polarized Word2Vec


Navodita Saini and Dipti P. Rana, Department of Computer Engineering, SVNIT, Surat, Gujarat, India

ABSTRACT

Each gender has its own characteristic behaviour, which can be reflected in every field of social media. Twitter is used to discuss the issues caused by the COVID-19 disease. As Twitter does not disclose the gender of the user, in this study we discuss the available approaches for gender identification and then propose a dictionary approach for gender identification. When working with unlabeled data, one usually has to use a dictionary approach for sentiment analysis. In this study, however, we propose a cluster-based approach. We compared it with the existing approaches and found that our proposed approach gives higher accuracy in sentiment analysis for unlabeled data. This study analyses ten kinds of emotions of males and females, from which we can observe how they reacted in this pandemic.

KEYWORDS

Sentiment Analysis, COVID-19, Dictionary, Word2Vec, Polarity.


A Self-Aggregated Hierarchical Topic Model for Short Texts


Yue Niu, Hongjie Zhang and Jing Li, University of Science and Technology of China, Hefei, Anhui, China

ABSTRACT

With the growth of the internet, short texts such as tweets from Twitter, news titles from RSS, or comments from Amazon have become very prevalent. Many tasks need to retrieve information hidden in the content of short texts, so ontology learning methods have been proposed for retrieving structured information. A topic hierarchy is a typical ontology that consists of concepts and taxonomy relations between concepts. Current hierarchical topic models are not specially designed for short texts. These methods use word co-occurrence to construct concepts and the different co-occurrence of general or special words to construct taxonomy relations. But in short texts, word co-occurrence is sparse and the occurrence of general or special words is irregular. To overcome this problem and provide an interpretable result, we designed a hierarchical topic model which aggregates short texts into long documents and constructs topics and relations. Because long documents add additional word co-occurrence, our model can avoid the sparsity of word co-occurrence. In experiments, we measured the quality of concepts by the topic coherence metric on four real-world short-text corpora. The results showed that our topic hierarchy is more interpretable than those of other methods.

KEYWORDS

Hierarchical Topic Model, Texts Analysis, Short Texts, Data Mining.


Selection of the Most Suitable Agile Approach for A Given Project And Organization


Isha Sharma, Manchikanti Venkata Satya Divya Teja, and Mrs Deepti Mehrotra, Department of Computer Science & Engineering, Amity University Uttar Pradesh Noida, India

ABSTRACT

Indian software organizations have long been early adopters of process improvement as a means of demonstrating organizational capabilities to their global client base. As a result, the development approaches in these organizations are often heavily plan-based, generating structures and processes that are appropriate to those approaches. The main objective of our paper is to help in finding the most suitable Agile methodology for a given project and organization. A systematic literature review was done to analyse the origin of data. It was found that Agile methodologies also have some limitations.

KEYWORDS

Agile Methodology, Scrum, Kanban, Lean Software Development, Dynamic System Development Method, Crystal Clear Methodology, Extreme Programming.


Artificial intelligence approached dynamically detecting security threats and updating a signature-based intrusion detection system’s database in NGN


Gunay Abdiyeva-Aliyeva1, Mehran Hematyar2, 1Institute of Control Systems of Azerbaijan National Academy of Sciences, AZ1141, B.Vahabzadeh str.,9, Baku, Azerbaijan, 2Azerbaijan Technical University, AZ 1073 H.Cavid av., 25, Baku, Azerbaijan

ABSTRACT

Cyber-attacks threatening network and information security have increased, especially during the current rapid IT revolution. Therefore, a monitoring and protection system should be used to secure computer networks. The intrusion detection system (IDS) is one of the most important security systems on the market. An IDS is a system that can be used to monitor network traffic and display alerts for illegal activities or illegal access to the network. IDS is divided into three main types: signature-based IDS, anomaly-based IDS and a mixture of both. Automatically updating the attack list to overcome new attack types is one of the main challenges of signature-based IDS. Most IDSs rely on network administrators or on websites that publish newly detected attack signatures to update their databases manually or remotely. This article proposes a new AI model that uses a filter engine functioning as a second IDS engine to update the attack list automatically using AI. The results show that using the proposed model can improve the overall accuracy of IDS. The proposed model uses an IP-Factor (IPF) and Non-IP-Factor (NIPF) blacklist that can automatically detect threats and update the IDS database with new attack features without manual intervention, as well as define new attack features based on similarity.

KEYWORDS

Intrusion Detection System, Signature-based, Anomaly-based, Traffic, AI-based IDS, Artificial Intelligence.


Data Fusion using Kalman Filter and LVQ


Mrs. Shobha, Dr. Nalini N, Dept. of Computer Science and Engineering, Nitte Meenakshi Institute of Technology, Yelahanka, Bangalore, Karnataka, India

ABSTRACT

Data fusion is the process of combining different information sources to produce more consistent, accurate, and useful data than that provided by any individual data source. Data fusion processes are commonly classified as low, middle, or high level, depending on the processing stage at which fusion takes place. In this paper, Kalman filter data fusion is implemented with robust soft LVQ, generalized LVQ, generalized matrix LVQ, and generalized relevance LVQ, and the accuracy of each combination is examined.

KEYWORDS

Accuracy, Data fusion, Efficiency, Kalman filter, LVQ, Sensor.


The Adoption of the Internet of Things for SMART Agriculture in Zimbabwe


Tsitsi Zengeya1, Dr. Paul Sambo2 and Dr. Nyasha Mabika3, 1Great Zimbabwe University, Department of Mathematics and Computer Science, 2Great Zimbabwe University, Department of Mathematics and Computer Science, 3Great Zimbabwe University, Department of Livestock, Wildlife and Fisheries

ABSTRACT

Zimbabwe has faced severe droughts, resulting in low agricultural outputs. This has threatened food and nutrition security in sections of the community, especially in areas with low annual rainfall. There is a growing need to maximize water usage and to monitor the environment, nutrients, and temperatures through the adoption of smart agriculture. This research explored the use of the Internet of Things (IoT) for smart agriculture in Zimbabwe to improve food production. A mixed methodology was used to gather data through interviews with 50 purposively sampled A2 farmers in the five agricultural regions of Zimbabwe, supported by the use of the Internet. The findings reveal that some farmers in Zimbabwe have adopted IoT, others are yet to adopt such technology, and some are not aware of it. IoT's benefits to Zimbabwean farmers are immense in that it improves food security, water preservation, and farm management. However, for most farmers to benefit from IoT, more awareness campaigns should be carried out, and mobile and fixed Internet connectivity should be improved in some areas.

KEYWORDS

Internet of Things, Adoption, Smart Agriculture, Activity Theory, Covid-19.


In Silico Screening and Molecular Docking Studies of Potential Inhibitors of PP1γ2


Ritika Saxena1,a, Nabeel Ahmad1,b and Sanjay Mishra2, 1School of Biotechnology, IFTM University Moradabad Uttar Pradesh India, 2Former Professor, School of Biotechnology, IFTM University Moradabad Uttar Pradesh India

ABSTRACT

Testis-specific PP1γ2 is indispensable in the final stages of spermatogenesis. The activity of PP1γ2 phosphatase has been shown to be inversely correlated with motility: low activity in vigorously motile caudal spermatozoa and high activity in immotile caput spermatozoa. In a recent study, PP1γ2 was subjected to molecular docking by screening a ligand library of known natural phytochemical compounds, which in turn may inhibit the catalytic activity of PP1γ2 and ultimately result in high sperm motility. In our study, three compounds with better binding energies were identified as potential candidates. These phytochemicals were α-terpineol, coumarin, and 2-phenylpropan-2-ol. Identified using these modern techniques, these molecules could be used to enhance sperm motility.

KEYWORDS

Spermatogenesis, PP1γ2, Molecular Docking, Phytochemical Compounds, Sperm Motility.


Evaluation of Knowledge and Attitude Towards Adoption of Computer and Informatics Technology among Registered and Student Nurses in Tamilnadu


Mrs. Kavitha Karthik, Health Informatics Supervisor, G.Kuppuswamy Naidu Memorial Hospital, Coimbatore, India

ABSTRACT

A quantitative descriptive survey was designed to assess nurses' knowledge of and attitude towards the adoption of computer and informatics technology. This study was conducted among 235 nurses selected by convenience sampling from both public and private teaching hospitals in Coimbatore, India. The data was collected using the standardized PATCH (Pretest for Attitudes Towards Computers in Health Care) Scale. In general, nurses working in private hospitals had better basic computer knowledge than nurses in public teaching hospitals, and all nurses had a favorable attitude towards the adoption of informatics technology in the health care delivery system. If computer and informatics technology is to become an effective and beneficial part of the Indian health care system, it is necessary to help Indian nurses improve their computer and informatics competencies.

KEYWORDS

Knowledge and Attitude, Computer and Informatics Technology, PATCH Scale, Registered and Student Nurses.


Skills Mapping and Career Development Analysis using Artificial Intelligence


Yew Kee Wong, School of Information Engineering, HuangHuai University, Henan, China

ABSTRACT

Artificial intelligence has been an eye-popping term that is impacting every industry in the world. With the rise of such advanced technology, there will always be a question regarding its impact on our social life, environment, and economy, and thus on all efforts exerted towards continuous development. By definition, the welfare of human beings is the core of continuous development. Continuous development is useful only when ordinary people's lives are improved, whether in health, education, employment, environment, equality, or justice. Securing decent jobs is a key enabler for promoting the components of continuous development: economic growth, social welfare, and environmental sustainability. Human resources are a precious resource for nations. High unemployment and underemployment rates, especially among youth, are a great threat to the continuous economic development of many countries and are influenced by investment in education and quality of living.

KEYWORDS

Artificial Intelligence, Conceptual Blueprint, Continuous Development, Human Resources, Learning and Employability Blueprint.


Comparative Analysis of Supervised Learning for Network Intrusion Detection


Sunil Kumar Rajwar, Assistant Professor, University Department of Computer Applications, Vinoba Bhave University, Jharkhand, India, Dr. I Mukherjee, Assistant Professor, Department of Computer Science & Engineering, Birla Institute of Technology, Mesra, Ranchi, Jharkhand, India, Dr. Pankaj Kumar Manjhi, Assistant Professor, University Department of Mathematics, Vinoba Bhave University, Jharkhand, India

ABSTRACT

In this paper, we define outlier detection and its application areas. The most important field of outlier detection is network anomaly detection, which can be achieved by a network intrusion detection system (NIDS). Several NIDSs have been developed and practically implemented. We present a comparative analysis of different machine learning methods for network anomaly detection. The standard KDD99 dataset is used worldwide for practical IDS evaluation. We used the 10% KDD99 dataset with WEKA software to analyze different machine learning algorithms. According to our observations, K-star, Random Forest, BayesNet, Logistic, IBk, and Decision Table perform better than the other algorithms.

KEYWORDS

N-IDS, Machine learning, KDD, Decision tree.