My research aims to bring the benefits of data science and machine learning to ordinary people. To this end, my work spans natural language processing (NLP), recommender systems, classification techniques, big data, machine learning, and security. Although these topics may seem disjoint, they are connected by a common goal: making the benefits of computer science available to ordinary people. All of my work involves design, experimentation, quantitative and qualitative analysis, modeling, and simulation to answer the question of how the field can help people.
In particular, my previous work in these domains has appeared in 60 research publications, including reputable journals such as IEEE Transactions on Dependable and Secure Computing, IEEE Systems Journal, IEEE Transactions on Services Computing, IEEE Transactions on Cloud Computing, and Future Generation Computer Systems. Details of my publications are in my C.V. In the remainder of this statement, I describe some of my recently published and ongoing research.
Previous Works and Research Interests:
1. Network traffic analysis
My recent work analyzes network traffic to identify the videos being watched by users connected to the network. The results show that an ISP can identify the titles of the videos being watched even if the traffic is encrypted by VPN and HTTPS [KhB22] [AfB22].
2. Social media analysis
A major portion of my research is on the analysis of data collected from social media, such as Twitter. In [KhA21], we proposed a convolutional neural network-based framework called “HateClassify” for labelling social media content as hate speech, offensive, or non-offensive. In [KhA18], we proposed a methodology to separate spammers and unsolicited bloggers from genuine experts on Twitter. The approach employs a modified Hyperlink-Induced Topic Search (HITS) algorithm and considers domain-specific keywords in the tweets, together with several tweet characteristics, to identify the unsolicited bloggers.
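As a rough illustration of the underlying idea, the following is a minimal sketch of the standard, unmodified HITS iteration, not the modified variant from [KhA18]. The function name `hits`, the edge-list input, and the interpretation of an edge as one account pointing to another are assumptions made for this example only.

```python
from math import sqrt


def hits(edges, iterations=50):
    """Standard HITS on a directed graph given as (source, target) pairs.

    Hypothetical helper for illustration; it does not include the
    keyword and tweet-characteristic modifications described in [KhA18].
    """
    nodes = sorted({node for edge in edges for node in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority score: sum of hub scores of the nodes pointing to it.
        new_auth = {n: 0.0 for n in nodes}
        for source, target in edges:
            new_auth[target] += hub[source]
        norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}

        # Hub score: sum of authority scores of the nodes it points to.
        new_hub = {n: 0.0 for n in nodes}
        for source, target in edges:
            new_hub[source] += auth[target]
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}

    return hub, auth


if __name__ == "__main__":
    # Toy graph: "a" and "b" both point to "c"; "a" also points to "d".
    hub_scores, auth_scores = hits([("a", "c"), ("b", "c"), ("a", "d")])
    print("top authority:", max(auth_scores, key=auth_scores.get))
    print("top hub:", max(hub_scores, key=hub_scores.get))
```

In this toy graph the node with the most incoming links receives the highest authority score, which is the property that an expert-ranking scheme of this kind builds on.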
3. IoT and sensor data analysis
Several of my works involve the analysis of data from IoT devices. In [KhA18b], we proposed a methodology that uses the energy expenditure of human activities and reduces the dimensions of the feature space to differentiate among human activities. In [KhA18c], we used a convolutional neural network to identify human activities. I also co-edited the book “Big Data-Enabled Internet of Things” with Dr. Samee U. Khan and Dr. Albert Y. Zomaya. The book covers the challenges and opportunities in the field of big data-enabled IoT [KhK19].
4. Recommender Systems
Recommender systems have remained one of my most engaging research topics. In [KhK17], published in IEEE Transactions on Services Computing, we proposed a scalable emergency evacuation service, termed MacroServ, that recommends to evacuees the most preferred routes towards safe locations during a disaster. In [KhK14], we proposed a venue recommendation system based on ant colony optimization.
5. Machine learning and optimization
The success of machine learning has always inspired me, and the field is among my favorite research interests. In [KhJ21], we proposed an optimization technique named adaptive diff-batch, or Adadb, that addresses gradient overshooting in Adam and slow convergence in diffGrad, and combines both methods with an adaptive batch size to further increase the convergence rate. Similarly, in [KhK20], we proposed an adaptive batch size methodology to overcome the slow convergence of the diffGrad optimizer. Moreover, many of my research works fall under applied AI and machine learning, including work on malware classification, cloud computing, and security.
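For context, the sketch below shows a diffGrad-style update on a one-dimensional problem, assuming the commonly stated form of the rule: an Adam step scaled by a friction coefficient computed from the change between consecutive gradients. It is background only and does not implement Adadb or the adaptive batch size of [KhJ21] and [KhK20]; the function name `diffgrad_minimize` and all default hyperparameters are assumptions for this example.

```python
import math


def diffgrad_minimize(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999,
                      eps=1e-8, steps=500):
    """Minimize a scalar function with a diffGrad-style update.

    Hypothetical helper for illustration: Adam moment estimates, with the
    step scaled by a sigmoid of the absolute gradient change.
    """
    x, m, v, prev_grad = x0, 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = grad_fn(x)

        # Adam-style first and second moment estimates with bias correction.
        m = beta1 * m + (1.0 - beta1) * grad
        v = beta2 * v + (1.0 - beta2) * grad * grad
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)

        # Friction coefficient in [0.5, 1): close to 1 when the gradient
        # changes quickly, close to 0.5 when it barely changes.
        friction = 1.0 / (1.0 + math.exp(-abs(prev_grad - grad)))

        x -= lr * friction * m_hat / (math.sqrt(v_hat) + eps)
        prev_grad = grad
    return x


if __name__ == "__main__":
    # Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3).
    print(diffgrad_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0))
```

Because the friction coefficient approaches 0.5 as consecutive gradients become similar, the step shrinks near flat regions, which is the behavior that the adaptive batch size in the works above is meant to complement.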
Ongoing and Future Works:
Concept drift Problem
Concept drift is one of the major problems in the practical deployment of machine learning algorithms. As real-world data diverges from the data a model was trained on, what the model has learned becomes outdated and its accuracy decreases over time.
Out-of-distribution Problem
Machine learning models are highly biased toward the known classes and predict the wrong class with high confidence for unseen and unknown data. This is known as the out-of-distribution (OOD) problem in the machine learning community. I am also working on developing methods to handle this problem.
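The overconfidence described above can be reproduced with a very small example. The sketch below assumes a hypothetical two-class linear classifier with hand-picked weights (`WEIGHTS`) and a softmax output; it is a toy illustration of the problem, not one of the methods under development.

```python
import math

# Hypothetical weights of a linear classifier for two known classes.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x):
    """Return (predicted class index, softmax confidence) for input x."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in WEIGHTS]
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs[label]


if __name__ == "__main__":
    # An input resembling the data the classifier was built for.
    print(predict([2.0, 0.0]))
    # An input far from anything seen in training: the classifier still
    # commits to a known class, with confidence very close to 1.
    print(predict([50.0, 10.0]))
```

The second input lies far outside the range of the first, yet the softmax confidence rises rather than falls, which is why confidence alone is a poor indicator of whether an input belongs to a known class.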
Explainable Machine learning in Health and Bioinformatics data
Currently I am working on Explainable Machine Learning techniques applied to health and bioinformatics data. The focus is on developing models that not only achieve high predictive accuracy but also provide interpretable insights into complex biological and clinical datasets. By integrating explainability, I aim to enhance trust and usability of machine learning solutions for healthcare professionals, enabling better diagnosis, personalized treatment, and data-driven decision-making.
References:
[KhB22] M. U. S. Khan, S. M. A. H. Bukhari, T. Maqsood, M. A.B. Fayyaz, D. Dancey, and R. Nawaz, "SCNN-Attack: A Side-Channel Attack to Identify YouTube Videos in a VPN and Non-VPN Network Traffic," Electronics, vol. 11, pp. 350, 2022
[AfB22] W. Afandi, S. M. A. H. Bukhari, M. U. S. Khan, T. Maqsood, S. U. Khan, "Fingerprinting Technique for YouTube Videos Identification in Network Traffic," IEEE Access, vol. 10, pp. 76731-76741, 2022
[KhA21] M. U. S. Khan, A. Abbas, A. Rehman, and R. Nawaz, "HateClassify: A Service Framework for Hate Speech Identification on Social Media," IEEE Internet Computing, vol. 25, no. 1, pp. 40-49, Jan.-Feb. 2021
[KhA18] M.U.S. Khan, M. Ali, A. Abbas, S. U. Khan, and A. Y. Zomaya, “Segregating spammers and unsolicited bloggers from genuine experts on twitter,” IEEE Transactions on Dependable and Secure Computing, vol. 15, no. 4, pp. 551–560, 2018
[KhA18b] M.U.S. Khan, A. Abbas, M. Ali, M. Jawad, S. U. Khan, K. Li, and A. Y. Zomaya, “On the correlation of sensor location and human activity recognition in body area networks (bans),” IEEE Systems Journal, vol. 12, no. 1, pp. 82–91, 2018
[KhA18c] M. U. S. Khan, A. Abbas, M. Jawad, M. Ali, and S. U. Khan, "Convolutional Neural Networks as Means to Identify Apposite Sensor Combination for Human Activity Recognition," in 3rd IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), Washington D.C., USA, September 2018
[KhK19] M. U. S. Khan, S. U. Khan, and A. Y. Zomaya, Big Data-Enabled Internet of Things, IET Press, London, UK, 2019, XIII, 488 p., ISBN 978–1–78561–636–5.
[KhK17] M. U. S. Khan, O. Khalid, Y. Huang, F. Zhang, R. Ranjan, S. U. Khan, J. Cao, K. Li, B. Veeravalli, and A. Zomaya, “MacroServ: A Route Recommendation Service for Large-Scale Evacuations,” IEEE Transactions on Services Computing, vol. 10, no. 4, pp. 589-602, July-Aug. 2017
[KhK14] O. Khalid, M.U.S. Khan, S. U. Khan, and A. Y. Zomaya, “OmniSuggest: A Ubiquitous Cloud based Context Aware Recommendation System for Mobile Social Networks,” IEEE Transactions on Services Computing, vol. 7, no. 3, pp. 401-414, 2014
[KhJ21] M. U. S. Khan, M. Jawad, and S. U. Khan, "Adadb: Adaptive Diff-batch Optimization Technique for Gradient Descent," IEEE Access, vol. 9, pp. 99581-99588, 2021
[KhK20] W. Khan, S. Ali, M. U. S. Khan, M. Jawad, M. Ali and R. Nawaz, "AdaDiffGrad: An Adaptive Batch Size Implementation Technique for DiffGrad Optimization Method," 2020 14th International Conference on Innovations in Information Technology (IIT), Al Ain, pp. 209-214, 2020