My academic journey began in mathematics and turned toward Mathematical Biology during my master's degree, a shift that has defined my research trajectory. After clearing the CSIR-NET exam and securing the CSIR-JRF fellowship from the Government of India, I embarked on a PhD in Mathematical Biology. My research interests are Mathematical Modelling, Population Biology, Statistical Stability Analysis, Food Chain Modelling, Mathematical Epidemiology, and Bayesian Statistics. A brief statement of my research work follows.
Ph.D. Thesis-related research:
My doctoral research is in Population Biology, with a primary focus on developing novel methodologies to detect parameter variation in real data sets through continuous growth models of a single population. The objective is to critically investigate existing growth models, considering both continuous and stochastic variation in parameters. The thesis, titled "Mathematical Analysis of Biological Growth Models with Continuously and Stochastically Varying Parameters with Applications to Real Data," aims to enhance analytical techniques for growth studies across diverse domains.
Research problem 1:
Growth curve models serve as the mathematical framework for qualitative studies of growth in many areas of applied science, and because of their extensive use, several distinct models have been developed over a long period of time (Bhowmick et al., 2014). Such models have many practical applications. Most research on population models has treated the parameters as fixed but unknown quantities, estimated by the non-linear least squares method, which also provides confidence intervals. However, due to natural randomness, the parameters may vary over time (Banks, 1994). The problem is that when an experimenter has observed data over a time period, it is uncertain whether a parameter is fixed or changing with time. Even if biological theory suggests that a particular model parameter changes over time, estimating that parameter empirically can be difficult. In our first paper, we address these issues by proposing a new methodology to detect parameter variation from real data using the interval-specific rate parameter (ISRP) proposed by Bhowmick et al. (2014).
We first showed that one model can be obtained from another by choosing a suitable continuous transformation of the parameters. This idea builds an interconnection between existing models in the literature. To build this interconnection, we chose four key models: logistic, theta-logistic, exponential, and confined exponential. Then, for a given set of training data points, we select the optimal model among these four by non-linear least squares fitting. Next, we plot the ISRP profiles of the parameters of the optimal model; each profile indicates whether the corresponding parameter varies. If variation is present, the ISRP profile, read together with the interconnecting flowcharts, indicates how the parameter varies with time. This enables the experimenter to extrapolate the inference to more complex models and significantly reduces the effort involved in model fitting exercises. The proposed idea has been verified using simulated data and real data sets from three different domains: marketing (LCD-TV sales data from Trappey and Wu, 2008), biology (cattle growth data from Kenward, 1987, and the number of horses and mules on US farms, 1865-1960, from Banks, 1994), and epidemiology (COVID-19 data from Germany). We believe that this work will be helpful for practitioners in the field of growth studies. (Published in Chaos, Solitons and Fractals.)
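As a minimal sketch of how an interval-wise rate profile can flag parameter variation, the Python snippet below computes a two-point rate estimate on each interval for the logistic model, assuming the carrying capacity K is known. The closed-form expression follows from the logistic solution and is an illustrative stand-in; it is not necessarily the exact ISRP expression of Bhowmick et al. (2014), and the function names and data are hypothetical.

```python
import math

def logistic(t, r, K, x0):
    """Closed-form logistic curve with constant rate r."""
    return K / (1.0 + (K / x0 - 1.0) * math.exp(-r * t))

def interval_rates_logistic(times, sizes, K):
    """Two-point rate estimate on each interval [t_i, t_{i+1}].

    For the logistic model, ln(X / (K - X)) is linear in t with slope r,
    so the slope between consecutive points estimates r on that interval.
    Assumes 0 < X < K for every observation.
    """
    rates = []
    for i in range(len(times) - 1):
        logit_now = math.log(sizes[i] / (K - sizes[i]))
        logit_next = math.log(sizes[i + 1] / (K - sizes[i + 1]))
        rates.append((logit_next - logit_now) / (times[i + 1] - times[i]))
    return rates

# Hypothetical noise-free data generated with a constant rate r = 0.5.
times = [float(t) for t in range(11)]
sizes = [logistic(t, r=0.5, K=100.0, x0=5.0) for t in times]
profile = interval_rates_logistic(times, sizes, K=100.0)
print([round(r, 3) for r in profile])
```

A flat profile is consistent with a constant rate, whereas a systematic trend across intervals would suggest that the rate varies over time.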
Research problem 2:
In our first paper, we developed a methodology to detect parameter variation in real data sets using the ISRP distribution of parameters. Interval-specific estimates are therefore an integral part of our methodology, and Bhowmick et al. (2014) provide a conceptual overview and derivation of the ISRP. However, for highly non-linear models and non-monotonic data, the ISRP of the parameters cannot be derived using the method of Bhowmick et al. (2014), which leaves our methodology vulnerable. Hence, in our second paper, we propose a novel methodology for estimating the ISRP based on a localized maximum likelihood estimator (localized MLE) to overcome these issues. For theoretical validation, we check the distribution under the null hypothesis, taking the Von Bertalanffy model as the test bed, and draw power curves as a cross-check. We then verify the methodology with real data (cattle growth data from Kenward, 1987). Next, we compare the two methodologies to determine which is more appropriate for selecting the best-fitting model for real data sets. In this comparison, we examine stability, efficiency, and parameter sensitivity and find that our method performs better than the existing one. Our methodology also saves time and effort, since the ISRP no longer needs to be derived analytically, and it is applicable to complex models and non-monotonic data sets for which the existing methodology cannot derive the ISRP. (Published in Mathematics and Computers in Simulation.)
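The localized likelihood idea can be sketched in Python as follows, under strong simplifying assumptions that are made here for illustration and are not taken from the paper: an exponential growth model, independent Gaussian errors on the log scale, and a fixed sliding window. Under those assumptions, the maximum likelihood estimate of the rate within a window coincides with the least-squares slope of log size against time. The window length and function names are hypothetical.

```python
import math

def local_mle_rate(times, sizes, start, width):
    """MLE of r in log X(t) = a + r*t + Gaussian noise, using only the
    observations in the window starting at index `start`."""
    ts = times[start:start + width]
    ys = [math.log(x) for x in sizes[start:start + width]]
    t_bar = sum(ts) / len(ts)
    y_bar = sum(ys) / len(ys)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    sxx = sum((t - t_bar) ** 2 for t in ts)
    return sxy / sxx

def rate_profile(times, sizes, width=4):
    """Slide the window one observation at a time and collect local estimates."""
    n_windows = len(times) - width + 1
    return [local_mle_rate(times, sizes, s, width) for s in range(n_windows)]

# Hypothetical data: the rate switches from 0.3 to 0.1 at t = 10.
times = [float(t) for t in range(21)]
sizes = [math.exp(0.3 * t) if t <= 10 else math.exp(3.0 + 0.1 * (t - 10))
         for t in times]
profile = rate_profile(times, sizes, width=4)
print([round(r, 3) for r in profile])
```

No analytical derivation of the interval estimate is needed in this approach. For a model such as Von Bertalanffy, the closed-form slope would be replaced by numerical maximization of the local likelihood.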
Research problem 3:
In the previous two research problems, we focused mainly on detecting time-dependent parameter variation from real data sets. However, density-dependent parameter variation is also reported in the literature, and detecting it is needed for a better understanding of growth phenomena. Therefore, in our next problem, we develop a methodology to detect density-dependent parameter variation from real data sets using our computation-based ISRP. We again use the localized MLE method to estimate the ISRP and plot its distribution against population size to detect density-dependent variation. The method has been validated through simulation studies and applied to two data sets from two different domains. (Under review: Ecological Modelling.)
Research problem 4:
Our previous three research problems focused on establishing methods to detect continuous parameter variation in real data sets. Such studies can also be carried out for models in which the parameter changes randomly over time (stochastic variation). As a first step, we gathered the existing research on stochastic population modeling. In the literature on stochastic growth models for single species, a few key growth equations dominate, such as logistic, Gompertz, exponential, Richards, Bertalanffy-Richards, and theta-logistic. Among these, the logistic growth model with stochastic treatment has attracted researchers' attention in many different disciplines, so our review concentrates on the use of stochastic logistic models in population biology. Our survey reports a bifurcation of studies on logistic growth equations into harvesting and non-harvesting equations. It also identifies the importance of data-driven research in stochastic growth equations and of selecting appropriate models using multi-model inferential techniques. Based on this survey, we have identified five key research problems that may require special attention:
Statistical detection of random variation from real data.
Need for more focus on comparative model assessment: many stochastic logistic models are already available in the literature from which the best model can be chosen, so it is critical to test the applicability of these models for analyzing real data sets on population growth.
Appropriate use of statistical methods for small sample sizes.
Effect of different correlation structures.
Focus on the applications of Multiphasic models in growth studies.
(Published in Ecological Modelling.)
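For concreteness, one common form of stochastic logistic model adds multiplicative environmental noise to the logistic equation, dX = rX(1 - X/K) dt + sigma X dW. The Python sketch below simulates it with the Euler-Maruyama scheme. The specific noise form, parameter values, and function name are illustrative assumptions and do not represent any particular model from the review.

```python
import math
import random

def simulate_stochastic_logistic(r, K, sigma, x0, dt, n_steps, seed=None):
    """Euler-Maruyama path of dX = r*X*(1 - X/K) dt + sigma*X dW."""
    rng = random.Random(seed)
    path = [x0]
    x = x0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        x = x + r * x * (1.0 - x / K) * dt + sigma * x * dw
        x = max(x, 0.0)  # keep the discretized path non-negative
        path.append(x)
    return path

# With sigma = 0 the scheme reduces to the deterministic Euler method.
deterministic = simulate_stochastic_logistic(0.5, 100.0, 0.0, 5.0, 0.01, 4000)
noisy = simulate_stochastic_logistic(0.5, 100.0, 0.1, 5.0, 0.01, 4000, seed=1)
print(round(deterministic[-1], 3), round(noisy[-1], 3))
```

Repeated runs with different seeds give an ensemble of trajectories on which candidate stochastic models or small-sample statistical methods could be compared.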
Collaboration research:
In addition to my thesis work on population and computational biology, I also work on food chain dynamics and statistical stability analysis.
Research problem 1:
In this research work, we focus on food chain dynamics using intra-guild predation (IGP) modeling. For this, we choose the Chitala-Mugil-Shrimp fish dynamics; these species are widely distributed in African and Asian countries and have been classified as endangered (EN) by the Conservation Assessment and Management Plan. In this paper, we explore the causes of the decline of Notopterus chitala in its natural habitat. Our investigation is based on fish data collected from the Bhagirathi River at Diamond Harbour, Malancha, and Raidighi, West Bengal, India. Based on the literature, we consider two variants of IGP models with Chitala as the top predator, Mugil as the intermediate predator, and shrimp as the basal prey. We calibrate these models within a Bayesian modeling framework to estimate the posterior distributions of the parameters, and we use the reversible-jump Markov chain Monte Carlo method to obtain posterior model probabilities and select the most suitable model. The selected model allows us to investigate the cause of the decline in the Chitala population, and it indicates that the primary reason for its lack of availability is the high extinction risk of the Mugil population. Sensitivity analysis confirms that the biomass conversion rate from Mugil to Chitala is the most significant parameter. We believe that this study may be useful for developing management strategies for Chitala conservation. (Published in Environmental and Ecological Statistics.)
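In a reversible-jump sampler, the posterior probability of each candidate model is typically estimated as the fraction of post-burn-in iterations that the chain spends in that model. The short Python sketch below shows only this final bookkeeping step. The indicator chain is made up for illustration and is not output from our analysis, and the function name is hypothetical.

```python
from collections import Counter

def posterior_model_probabilities(model_chain, burn_in=0):
    """Estimate P(model | data) by visit frequency after burn-in."""
    kept = model_chain[burn_in:]
    counts = Counter(kept)
    return {model: n / len(kept) for model, n in counts.items()}

# Hypothetical chain of model indicators (1 and 2 label the two IGP variants).
chain = [1, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2]
probs = posterior_model_probabilities(chain, burn_in=2)
print(probs)
```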
Research problem 2:
The next research problem belongs to population biology, where we focus on analyzing the statistical stability of conditional moments. Parametric growth models are essential in population biology for explaining growth patterns, and historical data points are an essential tool for predicting future population growth. In studies that predict the future from historical data, the stability of the equilibrium distribution at large time points has received considerable attention from researchers. Our work analyzes the stability of moments of different orders of the relative changes in population size, using the logistic model as a test bed for assessing the stability of population sizes. We also analyze stability for two-dimensional models, for which we choose predator-prey dynamics with Holling type I and type II functional responses, where the prey follows a logistic growth profile.
To determine whether population sizes are stable, we examine the behavior of the moments of population size over time using a stochastic logistic growth model. We take two different relative growth rate (RGR) estimates to investigate the convergence of the moments. The simulation study indicates that the two estimators behave almost identically: their expected values are zero, and the variance profile stabilizes around zero after some time points. We also define a conditional statistic based on the first RGR estimator and verify its stability by simulation. Hence, the conditional moments of the logistic model for these two estimators are convergent and stable. We then investigate the stability of interacting populations for the first RGR estimator, with the simulation study conducted by drawing data from a multivariate normal setup. For the predator-prey model with Holling type I, both the conditional mean and the conditional variance cluster around zero, and the skewness and kurtosis profiles further support stability. For the predator-prey model with Holling type II, however, the conditional mean does not cluster around zero because of the limit cycle stability of the model. Therefore, to assess the stability of this model, the experimenter has to look at the conditional variance profile rather than the conditional mean profile: the conditional mean does not cluster around a fixed point, whereas the conditional variance clusters closely around zero. (Ready for Communication.)
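The moment profiles described above can be sketched as follows. The snippet assumes the log-ratio form of the RGR, (1/dt) ln(X(t+dt)/X(t)), as one plausible estimator, and builds hypothetical replicates by perturbing a deterministic logistic curve with small multiplicative log-normal noise. Neither choice is claimed to match the estimators or the simulation design used in our study.

```python
import math
import random
import statistics

def rgr_log_ratio(series, dt=1.0):
    """Log-ratio RGR estimate on each interval of one trajectory."""
    return [math.log(series[i + 1] / series[i]) / dt
            for i in range(len(series) - 1)]

def moment_profiles(replicates, dt=1.0):
    """Mean and variance of the RGR across replicates at each interval."""
    rgr = [rgr_log_ratio(rep, dt) for rep in replicates]
    by_interval = list(zip(*rgr))
    means = [statistics.fmean(vals) for vals in by_interval]
    variances = [statistics.pvariance(vals) for vals in by_interval]
    return means, variances

def logistic(t, r=0.5, K=100.0, x0=5.0):
    """Deterministic logistic curve used as the noise-free baseline."""
    return K / (1.0 + (K / x0 - 1.0) * math.exp(-r * t))

rng = random.Random(0)
times = range(31)
replicates = [
    [logistic(t) * math.exp(rng.gauss(0.0, 0.01)) for t in times]
    for _ in range(200)
]
means, variances = moment_profiles(replicates)
print(round(means[0], 3), round(means[-1], 3), round(variances[-1], 5))
```

In this toy setup, the mean RGR starts well above zero and approaches zero as the trajectories level off near the carrying capacity.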
Research problem 3:
Points of inflection in population dynamics indicate critical transitions in growth processes, marking shifts from acceleration to deceleration. Detecting these points is crucial for identifying changes in growth velocity. However, estimating the point of inflection for real datasets is challenging due to model uncertainty and the nonlinear dependence on parameters.
In our study, we take the logistic model with environmental and demographic stochasticity separately to assess their effects on point-of-inflection estimates. Through simulations, we analyze the bias and variance of these estimates using a likelihood-based method and Taylor's approximation for standard errors. Our findings highlight how different levels of stochasticity impact estimation accuracy and emphasize the need for robust statistical methods in stochastic modeling frameworks. (Communicated: Chaos, Solitons and Fractals.)
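For the deterministic logistic curve X(t) = K / (1 + (K/X0 - 1) exp(-rt)), the point of inflection occurs at X = K/2, that is, at t* = ln(K/X0 - 1) / r. The Python sketch below computes t* and a first-order Taylor (delta-method) standard error from a parameter covariance matrix. The parameter values and covariance entries are hypothetical, and the snippet illustrates the general technique rather than the exact procedure of our study.

```python
import math

def inflection_time(r, K, x0):
    """Time at which the logistic curve reaches K/2 (requires 0 < x0 < K/2)."""
    return math.log(K / x0 - 1.0) / r

def inflection_time_se(r, K, x0, cov):
    """Delta-method standard error of t*; cov is the 3x3 covariance of (r, K, x0)."""
    t_star = inflection_time(r, K, x0)
    grad = [
        -t_star / r,                # d t*/d r
        1.0 / (r * (K - x0)),       # d t*/d K
        -K / (r * x0 * (K - x0)),   # d t*/d x0
    ]
    var = sum(grad[i] * cov[i][j] * grad[j]
              for i in range(3) for j in range(3))
    return math.sqrt(var)

# Hypothetical estimates and covariance matrix (uncertainty in r only).
r_hat, K_hat, x0_hat = 0.5, 100.0, 10.0
cov = [[0.0004, 0.0, 0.0],
       [0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]
t_star = inflection_time(r_hat, K_hat, x0_hat)
se = inflection_time_se(r_hat, K_hat, x0_hat, cov)
print(round(t_star, 4), round(se, 4))
```

Because t* depends non-linearly on the parameters, the first-order approximation can be poor when parameter uncertainty is large, which is one reason simulation-based checks of bias and variance are useful.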
Research problem 4:
Having proposed methodologies to detect parameter variation, with applications to real data sets from different domains, in our two previous papers, we now focus on applying them to a much larger data set (COVID-19 data from 25 countries) to provide a synthesis of our findings. Using these methodologies, we try to determine how early predictions about the COVID-19 pandemic can be made by looking at single-population dynamics. To this end, we take COVID-19 data from different countries and divide each country's data into training and test sets in three different ratios (1:1, 7:3, and 9:1). First, using existing methodologies, we find the best-fitting model from the training data, use it to construct a prediction interval for the future, and check that interval against the test data. Then we repeat the same steps with our proposed methodologies: we find the best-fitting model, construct a prediction interval, and test its accuracy. Finally, we compare the proposed and existing methods to determine how effective our methods are at making early predictions about pandemics from a single population's dynamics alone. (Ongoing.)
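The evaluation protocol can be sketched in Python as below: a chronological train/test split at the three ratios, followed by the fraction of held-out observations that fall inside a prediction interval. The helper names and the toy numbers are hypothetical and only illustrate the bookkeeping, not our fitted models or results.

```python
def chronological_split(series, train_parts, test_parts):
    """Split a time-ordered series so the earliest observations form the training set."""
    cut = len(series) * train_parts // (train_parts + test_parts)
    return series[:cut], series[cut:]

def interval_coverage(observed, lower, upper):
    """Fraction of observed values lying within [lower, upper] pointwise."""
    hits = sum(lo <= y <= hi for y, lo, hi in zip(observed, lower, upper))
    return hits / len(observed)

# Toy time-ordered series of 20 observations.
series = list(range(1, 21))
splits = {ratio: chronological_split(series, *ratio)
          for ratio in [(1, 1), (7, 3), (9, 1)]}
for ratio, (train, test) in splits.items():
    print(ratio, len(train), len(test))

# Toy prediction band for the two held-out points of the 9:1 split.
_, test = splits[(9, 1)]
coverage = interval_coverage(test, lower=[18, 21], upper=[20, 22])
print(coverage)
```

Splitting chronologically, rather than at random, keeps the test set strictly in the future of the training set, which matches the goal of early prediction.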
References:
Bhowmick AR, Chattopadhyay G, Bhattacharya S. Simultaneous identification of growth law and estimation of its rate parameter for biological growth data: a new approach. J Biol Phys 2014; 40(1): 71–95.
Banks RB. Growth and diffusion phenomena. Springer; 1994.
Kenward MG. A method for comparing profiles of repeated measurements. J R Stat Soc Ser C (Appl Stat) 1987; 36(3): 296–308.
Trappey CV, Wu H-Y. An evaluation of the time-varying extended logistic, simple logistic, and Gompertz models for forecasting short product lifecycles. Adv Eng Inf 2008; 22(4): 421–30.
Kostov G, Popova S, Gochev V, Koprinkova-Hristova P, Angelov M, Georgieva A. Modeling of batch alcohol fermentation with free and immobilized yeasts Saccharomyces cerevisiae 46 EVD. Biotechnology & Biotechnological Equipment 2012; 26(3): 3021–30.