Research

Interests:

AI for Better Medical Diagnostics and Treatment Planning
Biomedical Data Science
Knowledge Discovery in Radiology Data
Information Extraction from Unstructured Text
Natural Language Processing (NLP)
Social Media and Scientific Literature Mining

Topics:

Automatic Tumor Contouring:

Head and neck cancer (HNC) is the most common cancer in India, with the majority of cases arising in the oropharynx. More than 500,000 new cases of HNC are identified worldwide each year, making it the fifth most frequent disease in the world. Radiation therapy is frequently recommended as the primary treatment for HNC. Modern radiotherapy for HNC requires precise delineation of both tumors and normal structures on CT scans. This segmentation step is crucial for radiotherapy planning, but it is time-consuming and challenging. Algorithms are commercially available for auto-contouring normal organs at risk such as the spinal cord, bones and lungs, but not for tumors; automated tumor delineation remains elusive because tumors vary in size, location and shape. Moreover, manual delineation demands extensive training and time, which affects treatment efficiency and can introduce inter-observer variability in dose prescription. Against this background, we plan to develop a deep learning-based image segmentation framework that identifies tumor contours precisely and accurately in HNC from CT scan images.
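To make the notion of contour agreement concrete, the sketch below computes the Dice coefficient between two binary masks, for example a manual contour and an automatic one. The Dice coefficient is an assumption here: it is a commonly used overlap measure for segmentation, not a metric named in the plan above, and the two small masks are invented for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two equally sized binary masks (lists of rows of 0/1)."""
    intersection = 0
    pred_total = 0
    truth_total = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += 1 if (p and t) else 0
            pred_total += 1 if p else 0
            truth_total += 1 if t else 0
    if pred_total + truth_total == 0:
        # Both masks are empty: treated as perfect agreement in this sketch.
        return 1.0
    return 2.0 * intersection / (pred_total + truth_total)


# Hypothetical 3x4 masks: 1 marks a tumor pixel, 0 marks background.
manual = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
auto = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

score = dice_coefficient(auto, manual)
print(score)
```

A value of 1 means the two contours cover exactly the same pixels and 0 means they do not overlap at all. The same function could also compare two manual contours to quantify inter-observer variability.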

Biomedical Literature Mining:

Online biomedical literature is complex, with domain-specific terminology and language structures. The free text, metadata and other content extracted from published biomedical articles are very useful for deriving knowledge with NLP and machine learning techniques, e.g., identifying biomarkers and their influence on certain diseases from the relevant literature. Extracting semantic information from this text requires domain-specific knowledge graphs. We therefore need a large amount of data from freely available resources such as PubMed, which we will process with transformer-based models to build the knowledge graphs. Furthermore, the aim is to develop a generic NLP framework for biomedical literature mining that takes minimal input from the end user, e.g., domain-specific keywords to identify relevant articles, which are then processed further using the knowledge graph to identify significant information.
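The keyword-driven first step described above can be illustrated with a minimal sketch. It assumes a simple keyword-count score over abstracts, which is only a stand-in for the planned framework; the function names, article ids and abstracts are hypothetical, and only single-word keywords are handled.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_articles(articles, keywords):
    """Return (score, article_id) pairs for articles matching any keyword, best first.

    The score is the total number of keyword occurrences in the text.
    """
    wanted = {k.lower() for k in keywords}
    ranked = []
    for article_id, text in articles.items():
        counts = Counter(tokenize(text))
        score = sum(counts[k] for k in wanted)
        if score > 0:
            ranked.append((score, article_id))
    # Sort by descending score, then by id for a reproducible order.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return ranked


# Hypothetical abstracts, invented for illustration only.
articles = {
    "a1": "Serum biomarker levels predict tumor response; the biomarker is stable.",
    "a2": "A survey of tourism demand forecasting.",
    "a3": "Tumor segmentation on CT scans.",
}

ranked = rank_articles(articles, ["biomarker", "tumor"])
print(ranked)
```

In a fuller pipeline the shortlisted articles would then be passed to the knowledge-graph stage; this sketch only shows how little the end user has to supply.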

Knowledge Discovery in Social Media for Better Care:

Data available from social media, e.g., Twitter and Facebook, make it possible to obtain information about the demographics, languages, locations and social interactions of users. Knowledge discovery in social media is an emerging research interest in public health, as it presents new opportunities in epidemiological surveillance and monitoring. NLP can gather and find meaning in geo-tagged social media data to improve decision making on various public health issues, e.g., early prediction of community health hazards or of signs of mental illness. This area has not been explored much, yet it has the potential to provide meaningful information to domain experts. Hence I want to focus on this area, which requires a high volume of data. Besides crawling data from freely available resources such as Reddit, we need to buy some data from Facebook and Twitter to access spatial information as well. The aim is to develop an NLP framework that identifies significant information in these texts, with annotations regarding possible outcomes, e.g., adverse effects of drugs or diseases, panic, unmet health-care needs, poor awareness, etc.

Clinical Text Mining:

Clinical texts, also called clinical notes, e.g., discharge summaries, prescriptions and radiology reports, contain the family history, lifestyle, diagnoses, medications, treatment plans and various other medical information of patients. Clinical notes are increasingly being used all over the world, but they remain a vast, underused resource for biomedical research. NLP has many applications for knowledge discovery in clinical notes that can improve the quality of treatment plans and biomedical research. It can investigate associations between drugs and possible adverse events and correlations between diseases (comorbidities), and it can be used for early prediction of signs of different diseases, e.g., cancer, from the clinical notes. Furthermore, NLP can identify the discriminative image features hidden in radiology reports and so support better diagnostic conclusions. In the next few years, I want to focus mostly on the radiology domain, and the plans are as follows:

Collect and process radiology images and reports from different sources, primarily government hospitals such as AIIMS in India.
De-identify protected health information such as the patient’s name, age and gender to make the reports available for research.
Develop novel NLP frameworks for early prediction of signs of cancers and other diseases. Combine image features extracted from radiology images such as CT scans with text features extracted from the reports for effective information extraction from these resources.
Evaluate the performance of these frameworks with the help of experts using the past records as ground truths.
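
The de-identification step in the plan above can be sketched with a few pattern rules. This is a minimal illustration, assuming reports carry labelled header fields such as "Patient Name:", "Age:" and "Sex:"; the field labels, placeholders and sample report are hypothetical, and names mentioned in free text would need additional methods.

```python
import re

# Hypothetical field labels; real reports vary and need site-specific rules.
PHI_PATTERNS = [
    (re.compile(r"(?im)^([ \t]*(?:patient[ \t]+)?name[ \t]*:[ \t]*).*$"), r"\1[NAME]"),
    (re.compile(r"(?im)^([ \t]*age[ \t]*:[ \t]*).*$"), r"\1[AGE]"),
    (re.compile(r"(?im)^([ \t]*(?:gender|sex)[ \t]*:[ \t]*).*$"), r"\1[GENDER]"),
]


def deidentify(report):
    """Replace the values of labelled name, age and gender fields with placeholders."""
    for pattern, replacement in PHI_PATTERNS:
        report = pattern.sub(replacement, report)
    return report


# Invented sample report, for illustration only.
report = (
    "Patient Name: Jane Doe\n"
    "Age: 54\n"
    "Sex: F\n"
    "Findings: no focal lesion seen."
)

clean = deidentify(report)
print(clean)
```

Only the field values are replaced, so the clinical findings stay intact for downstream NLP.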

Publications:

Journal:

Archana Yadav, Biswajit Patra and Tanmay Basu, Modeling International Tourist Arrivals: An NLP Perspective. Operations Research Forum, 5(87), Springer Nature, 2024.
Arnab Roy and Tanmay Basu, Postimpact Similarity: A Similarity Measure for Effective Grouping of Unlabelled Text Using Spectral Clustering. Knowledge and Information Systems, vol. 64, pp. 723-742, Springer, 2022.
Tanmay Basu, Simon Goldsworthy and Georgios V. Gkoutos. A Sentence Classification Framework to Identify Geometric Errors in Radiation Therapy from Relevant Literature. Information, MDPI, vol. 12(4), 139, 2021. DOI: 10.3390/info12040139.
Avideep Mukherjee and Tanmay Basu. A Medoid Based Weighting Scheme for Nearest Neighbors Decision Rule Towards Effective Text Categorization. Springer Nature Applied Sciences, vol. 2, 1009, 2020. DOI: 10.1007/s42452-020-2738-8.
Swarup Chattopadhyay, Tanmay Basu, Asit K. Das, Kuntal Ghosh and C. A. Murthy. Towards Effective Discovery of Natural Communities in Complex Networks and Implications in E-commerce. Electronic Commerce Research, Springer, 2020. DOI: 10.1007/s10660-019-09395-y.
Swarup Chattopadhyay, Tanmay Basu, Asit K. Das, Kuntal Ghosh and C. A. Murthy. A Similarity Based Generalized Modularity Measure Towards Effective Community Discovery in Complex Networks. Physica A: Statistical Mechanics and its Applications, 2019. DOI: 10.1016/j.physa.2019.121338.
Tim Guetterman, Tammy Chang, Melissa DeJonckheere, Tanmay Basu, Elizabeth Scrubs and VG Vinod Vydiswaran. Augmenting Qualitative Text Analysis with Natural Language Processing: Methodological Study. Journal of Medical Internet Research, vol. 20(6):e231, 2018.
Tanmay Basu and C. A. Murthy. A Supervised Feature Selection Technique for Effective Text Categorization, International Journal of Machine Learning and Cybernetics, Springer, vol. 7(5), pp. 877-892, 2016.
Tanmay Basu and C. A. Murthy. A Similarity Assessment Technique for Effective Grouping of Documents, Information Sciences, Elsevier, vol. 311, pp. 149-162, 2015.
Tanmay Basu and C. A. Murthy. A Similarity based Supervised Decision Rule for Qualitative Improvement of Text Categorization, Fundamenta Informaticae, IOS Press, vol. 141(4), pp. 275-295, 2015.
Tanmay Basu and C. A. Murthy. Towards Enriching the Quality of k-Nearest Neighbor Decision Rule for Document Classification, International Journal of Machine Learning and Cybernetics, Springer, vol. 5(6), pp. 897-905, 2014.
Tanmay Basu and C. A. Murthy. CUES: A New Approach for Document Clustering, Journal of Pattern Recognition Research, vol. 8(1), pp. 66-84, 2013.

Conference:

Sumit Kumar and Tanmay Basu. AdaBioBERT: Adaptive Token Sequence Learning for Biomedical Named Entity Recognition. Accepted for Publication in BioNLP Workshop, ACL Conference, Vienna, Austria, 2025.
Shraddha Agarwal, Vinod Kumar Kurmi, Abhirup Banerjee and Tanmay Basu. TCPNet: A Novel Tumor Contour Prediction Network using MRIs. Published in Proceedings of IEEE International Conference on Healthcare Informatics, pp. 183-189, Orlando, USA, 2024.
Anuradha Mahato, Prateek Sarangi, Vinod Kumar Kurmi, Abhirup Banerjee, Abhishek Goyal and Tanmay Basu. Uncertainty Quantification in Deep Learning Framework for Mallampati Classification. Published in Proceedings of Data privacy and Data Analysis in Healthcare Systems Workshop at IEEE International Conference on Healthcare Informatics, pp. 622-627, Orlando, USA, 2024.
Arkapal Panda, Tanmay Basu, Vaibhav Kumar. An Ensemble Learning Framework For Visibility Prediction In Indo-Gangetic Region, published in Proceedings of ICLR 2023 Tiny Papers, Kigali, Rwanda.
Sruthi S, Tanmay Basu. Identification of the Relevance of Comments in Codes Using Bag of Words and Transformer Based Models, published in Proceedings of FIRE 2022 Working Notes, pp. 60-65, Kolkata, India.
Harshvardhan Srivastava, Lijin N. S., Sruthi S and Tanmay Basu. NLP-IISERB@eRisk2022: Exploring the Potential of Bag of Words, Document Embeddings and Transformer Based Framework for Early Prediction of Eating Disorder, Depression and Pathological Gambling Over Social Media, published in Proceedings of CLEF 2022 Working Notes, pp. 987-994, Bologna, Italy.
Sourav Saha, Dwaipayan Roy, B Yuvaraj Goud, Chethan S Reddy and Tanmay Basu. NLP-IISERB@Simpletext2022: To Explore the Performance of BM25 and Transformer Based Frameworks for Automatic Simplification of Scientific Texts, published in Proceedings of CLEF 2022 Working Notes, pp. 2852-2857, Bologna, Italy.
Tanmay Basu. IISERB@LT-EDI-ACL2022: A Bag of Words and Document Embeddings Based Framework to Identify Severity of Depression Over Social Media, published in Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, pp. 234-238, ACL 2022.
Tanmay Basu and Georgios V. Gkoutos. Exploring the Performance of Baseline Text Mining Frameworks for Early Prediction of Self Harm Over Social Media, published in Proceedings of CLEF 2021 Working Notes, pp. 928-937, Romania, 2021.
Shubhaditya Goswami, Sukanya Pal, Simon Goldsworthy and Tanmay Basu. An Effective Machine Learning Framework for Data Elements Extraction from the Literature of Anxiety Outcome Measures to Build Systematic Review, published in Proceedings of Business Information Systems, pp. 247-258, Sevilla, Spain, 2019.
Tim Guetterman, Melissa DeJonckheere, Tammy Chang, Tanmay Basu, Elizabeth Scrubs and VG Vinod Vydiswaran. Integrating Natural Language Processing with Qualitative Text Analysis in Mixed Methods Studies with a Large Qualitative Strand. In proceedings of third Mixed Methods International Research Association (MMIRA) International Conference, 2018.
Sayanta Paul, Sree Kalyani Jandhyala and Tanmay Basu. Early Detection of Signs of Anorexia and Depression Over Social Media using Effective Machine Learning Frameworks, published in Proceedings of CLEF 2018 Working Notes, Avignon, France.
Avideep Mukherjee and Tanmay Basu. An Effective Nearest Neighbor Classification Technique Using Medoid Based Weighting Scheme, published in Proceedings of the Fourteenth International Conference on Data Science, pp. 231-234, Las Vegas, USA, 2018.
Arnab Roy and Tanmay Basu. Effective Grouping of Unlabeled Texts using A New Similarity Measure for Spectral Clustering, published in Proceedings of the Fourteenth International Conference on Data Science, pp. 181-184, Las Vegas, USA, 2018.
Ritam Majumder and Tanmay Basu. Towards Developing Effective Machine Learning Frameworks to Identify Toxic Conversations Over Social Media, published in Proceedings of the Fourteenth International Conference on Data Science, pp. 239-240, Las Vegas, USA, 2018.
Anurag Banerjee and Tanmay Basu. Yet Another Weighting Scheme for Collaborative Filtering Towards Effective Movie Recommendation, published in Proceedings of the Fourteenth International Conference on Data Science, pp. 237-238, Las Vegas, USA, 2018.
Tanmay Basu and C. A. Murthy. Effective Text Classification by a Supervised Feature Selection Approach, in Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 918-925, Belgium, 2012.
Tanmay Basu, C. A. Murthy and H. Chakraborty, A Tweak on K Nearest Neighbour Decision Rule, in Proceedings of the International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV), pp. 929-935, USA, 2012.
Tanmay Basu and C. A. Murthy. A Feature Selection Method for Improved Document Classification, in Proceedings of the International Conference on Advanced Data Mining and Applications (ADMA), LNCS vol. 7713, pp. 296-305, China, 2012. DOI: 10.1007/978-3-642-35527-1_25
Tanmay Basu and C. A. Murthy. Semantic Relation between Words with the Web as Information Source, in Proceedings of the International Conference on Pattern Recognition and Machine Intelligence (PReMI), LNCS vol. 5909, pp. 267-272, India, 2009. DOI: 10.1007/978-3-642-11164-8_43
