Adhya, S., & Sanyal, D. K. (2025). S2WTM: Spherical Sliced-Wasserstein Autoencoder for Topic Modeling. In 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025). (Core A*).
Lahiri, A., Hou, Y., & Sanyal, D. K. (2025). TaxoAlign: Scholarly Taxonomy Generation Using Language Models. In EMNLP 2025. (Core A*) arXiv preprint arXiv:2510.17263.
De, S., Sanyal, D. K., & Mukherjee, I. (2025). Fine-tuned encoder models with data augmentation beat ChatGPT in agricultural named entity recognition and relation extraction. Expert Systems with Applications, 277, 127126.
Rehman, T., Sanyal, D. K., & Chattopadhyay, S. (2025). How green are neural language models? Analyzing energy consumption in text summarization fine-tuning. In 3rd International Conference on Power Engineering and Intelligent Systems (PEIS 2025). Springer. (🏆 Best Paper Award)
Rehman, T., Sanyal, D. K., & Chattopadhyay, S. (2025). Can pre-trained language models generate titles for research papers? In Proceedings of the 26th International Conference on Asian Digital Libraries (ICADL) (pp. 154-170). Springer, Singapore. (🏆 Best Student Paper Runner-Up Award)
Adhya, S., Lahiri, A., Sanyal, D. K., & Das, P. P. (2024). Evaluating Negative Sampling Approaches for Neural Topic Models. IEEE Transactions on Artificial Intelligence, 5(11), 5630-5642.
Lahiri, A., Sarkar, P., Sen, M., Sanyal, D. K., & Mukherjee, I. (2024). Few-TK: A Dataset for Few-shot Scientific Typed Keyphrase Recognition. In Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) Findings. (Core A)
Adhya, S., & Sanyal, D. K. (2024). GINopic: Topic Modeling with Graph Isomorphism Network. In Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). (Core A)
Chakraborty, P., Dutta, S., Sanyal, D. K., Majumdar, S., & Das, P. P. (2023). Bringing Order to Chaos: Conceptualizing a Personal Research Knowledge Graph for Scientists. IEEE Data Engineering Bulletin, 47(4), 43-56.
Lahiri, A., Sanyal, D. K., & Mukherjee, I. (2023). A Keyphrase-Centric Search Engine for Scientific Papers. In Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation (DC Track).
Chakraborty, P., & Sanyal, D. K. (2023). A Personal Knowledge Graph for Researchers. In Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation (DC Track).
Rehman, T., Chattopadhyay, S., & Sanyal, D. K. (2023). Abstractive Summarization of Scientific Documents: Models and Evaluation Techniques. In Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation (DC Track).
Tokala, Y. S. S. S., Aluru, S. S., Vallabhajosyula, A., Sanyal, D. K., & Das, P. P. (2023). Label informed hierarchical transformers for sequential sentence classification in scientific abstracts. Expert Systems, 40(6), e13238. (IF: 3.3)
Rehman, T., Sanyal, D. K., & Chattopadhyay, S. (2023). Research Highlight Generation with ELMo Contextual Embeddings. Scalable Computing: Practice and Experience, 24(2), 181-190.
Chakraborty, P., & Sanyal, D. K. (2023). A comprehensive survey of personal knowledge graphs. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, e1513. (IF: 7.8)
Rehman, T., Sanyal, D. K., Chattopadhyay, S., Bhowmick, P. K., & Das, P. P. (2023). Generation of Highlights From Research Papers Using Pointer-Generator Networks and SciBERT Embeddings. IEEE Access, 11, 91358-91374. (IF: 3.6)
Kumar, D., Bhowmick, P. K., Dey, S., & Sanyal, D. K. (2023). On the banks of Shodhganga: analysis of the academic genealogy graph of an Indian ETD repository. Scientometrics, 1-36. (IF: 3.9)
Lahiri, A., Sanyal, D. K., & Mukherjee, I. (2023). CitePrompt: Using Prompts to Identify Citation Intent in Scientific Papers. In Proceedings of the 2023 ACM/IEEE Joint Conference on Digital Libraries (JCDL). ACM.
Adhya, S., Lahiri, A., & Sanyal, D. K. (2023). Do Neural Topic Models Really Need Dropout? Analysis of the Effect of Dropout in Topic Modeling. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL). (Core A)
Adhya, S., & Sanyal, D. K. (2023). Improving Neural Topic Models with Wasserstein Knowledge Distillation. In European Conference on Information Retrieval (ECIR). Cham: Springer Nature Switzerland. (Core A)
De, S., Sanyal, D. K., & Mukherjee, I. (2023). AgriNER: An NER Dataset of Agricultural Entities for the Semantic Web. In European Semantic Web Conference (ESWC). Cham: Springer Nature Switzerland.
Adhya, S., Lahiri, A., Sanyal, D. K., & Das, P. P. (2022). Improving Contextualized Topic Models with Negative Sampling. In Proceedings of the 19th International Conference on Natural Language Processing (ICON).
Chakraborty, P., Dutta, S., & Sanyal, D. K. (2022). Personal research knowledge graphs. In Companion Proceedings of The Web Conference (WWW). ACM.
Bhowmick, P. K., Das, P. P., Chakrabarti, P. P., & Sanyal, D. K. (2022). National Digital Library of India: democratizing education in India. Communications of the ACM, 65(11), 58-61. (IF: 22.7)
Rehman, T., Sanyal, D. K., Majumder, P., & Chattopadhyay, S. (2022). Named Entity Recognition Based Automatic Generation of Research Highlights. In Proceedings of the Third Workshop on Scholarly Document Processing, co-located with COLING 2022.
Adhya, S., & Sanyal, D. K. (2022). What Does the Indian Parliament Discuss? An Exploratory Analysis of the Question Hour in the Lok Sabha. In Proceedings of the LREC 2022 workshop on Natural Language Processing for Political Sciences.
Rehman, T., Das, S., Sanyal, D. K., & Chattopadhyay, S. (2022). An Analysis of Abstractive Text Summarization Using Pre-trained Models. In Proceedings of International Conference on Computational Intelligence, Data Science and Cloud Computing (IEM-ICDC). Springer Nature Singapore.
Sanyal, D. K., Bhowmick, P. K., & Das, P. P. (2021). A review of author name disambiguation techniques for the PubMed bibliographic database. Journal of Information Science, 42(2), 227–254, SAGE. (IF: 2.4)
Santosh, T. Y. S. S., Varimalla, N. R., Vallabhajosyula, A., Sanyal, D. K., & Das, P. P. (2021). HiCoVA: Hierarchical conditional variational autoencoder for keyphrase generation. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management (CIKM). ACM. (Core A)
Banerjee, S., Sanyal, D. K., Chattopadhyay, S., Bhowmick, P. K., & Das, P. P. (2021). Automatic recognition of learning resource category in a digital library. In Proceedings of the 2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL). ACM.
Rehman, T., Sanyal, D. K., Chattopadhyay, S., Bhowmick, P. K., & Das, P. P. (2021). Automatic generation of research highlights from scientific abstracts. In Proceedings of the 2nd Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents, co-located with JCDL 2021 (Vol. 3004). CEUR-WS.
Santosh, T. Y. S. S., Chakraborty, P., Dutta, S., Sanyal, D. K., & Das, P. P. (2021). Joint entity and relation extraction from scientific documents: role of linguistic information and entity types. In Proceedings of the 2nd Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents, co-located with JCDL 2021 (Vol. 3004). CEUR-WS.
Santosh, T. Y. S. S., Sanyal, D. K., Bhowmick, P. K., & Das, P. P. (2021). Gazetteer-guided keyphrase generation from research papers. In Proceedings of the 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), LNCS (Vol. 12714). Springer, Cham.
Santosh, T. Y. S. S., Sanyal, D. K., Bhowmick, P. K., & Das, P. P. (2020). SaSAKE: syntax and semantics aware keyphrase extraction from research papers. In Proceedings of the 28th International Conference on Computational Linguistics (COLING).
Banerjee, S., Sanyal, D. K., Chattopadhyay, S., Bhowmick, P. K., & Das, P. P. (2020). Segmenting scientific abstracts into discourse categories: a deep learning-based approach for sparse labeled data. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL).
Halder, K., Chattopadhyay, A., Sanyal, D. K., Bhowmick, P. K., & Das, P. P. (2020). Analysis of the academic genealogy of education. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL).
Jhawar, K., Sanyal, D. K., Chattopadhyay, S., Bhowmick, P. K., & Das, P. P. (2020). Author name disambiguation in PubMed using ensemble-based classification algorithms. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (JCDL).
Santosh, T. Y. S. S., Sanyal, D. K., Bhowmick, P. K., & Das, P. P. (2020). DAKE: document-level attention for keyphrase extraction. In Proceedings of the 42nd European Conference on Information Retrieval (ECIR), LNCS (Vol. 12036). Springer, Cham. (Core A)
Sanyal, D. K., Bhowmick, P. K., Das, P. P., Chattopadhyay, S., & Santosh, T. Y. S. S. (2019). Enhancing access to scholarly publications with surrogate resources. Scientometrics, 121(2), 1129–1164. Springer. (IF: 3.9)
Santosh, T. Y. S. S., Sanyal, D. K., & Das, P. P. (2019). Person name segmentation with deep neural networks. In Proceedings of the 7th International Conference on Mining Intelligence and Knowledge Exploration (MIKE), LNCS (Vol. 11987). Springer, Cham.
Sanyal, D. K., Banerjee, S., Agarwal, G., Chattopadhyay, S., Bhowmick, P. K., & Das, P. P. (2019). Illumine: a tool to augment the National Digital Library of India with full texts of research papers. In Proceedings of the 16th IEEE India Council International Conference (INDICON).
Acharya, S., Sanyal, D. K., Mazumdar, J., & Das, P. P. (2019). Archiving endangered Mundā languages in a digital library. In Proceedings of the 6th International Conference on Digital Landscape (ICDL). (🏆 Best Paper Award)
Akhtar, S. S., Sanyal, D. K., Chattopadhyay, S., Bhowmick, P. K., & Das, P. P. (2018). A metadata extractor for books in a digital library. In Proceedings of the 20th International Conference on Asian Digital Libraries (ICADL), LNCS (Vol. 11279). Springer.
Santosh, T. Y. S. S., Sanyal, D. K., Bhowmick, P. K., & Das, P. P. (2018). Surrogator: a tool to enrich a digital library with open access surrogate resources. In Proceedings of the 18th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL).
Sanyal, D. K., Chattopadhyay, S., & Chatterjee, R. (2018). Figure retrieval from biomedical literature: an overview of techniques, tools, and challenges. In N. Dey, S. Borra, A. S. Ashour, & F. Shi (Eds.), Machine Learning in Bio-Signal Analysis and Diagnostic Imaging. Academic Press / Elsevier.
Pandey, N., Sanyal, D. K., Hudait, A., & Sen, A. (2017). Automated classification of software issue reports using machine learning techniques: an empirical study. Innovations in Systems and Software Engineering, Springer, 13(4), 279–297. (IF: 1.2)
Pandey, N., Hudait, A., Sanyal, D. K., & Sen, A. (2016). Automated classification of issue reports from a software issue tracker. In Proceedings of the 4th International Conference on Advanced Computing, Networking, and Informatics (ICACNI) (Vol. 1), AISC (Vol. 518). Springer.