Abstract:
Ancestral inference for branching processes in random environments involves determining the ancestor distribution parameters using the population sizes of descendant generations.
In this project, we introduce a new methodology for ancestral inference based on the generalized method of moments. We demonstrate that the estimator's behavior is critically influenced by the coefficient of variation of the environment sequence. Furthermore, although the evolution of the process depends heavily on the offspring means of the various generations, we show that the ancestor and offspring estimators of the mean, under appropriate centering and scaling, decouple: their joint limiting distribution is that of independent Gaussian random variables when the ratio of the number of generations to the logarithm of the number of replicates converges to zero.
Additionally, we provide estimators for the limiting variance and illustrate our findings through numerical experiments, Polymerase Chain Reaction (PCR) experimental data, and COVID-19 data.
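To make the setting concrete, the following is a minimal simulation sketch, not the generalized method of moments estimator developed in the project. It assumes Poisson offspring, a uniform i.i.d. environment for the offspring means, and a shifted Poisson ancestor distribution; the function names `simulate_bpre` and `moment_estimates` are hypothetical. It uses only the last two observed generations and plain moment matching, relying on the identity E[Z_n] = (ancestor mean) x (mean offspring mean)^n.

```python
import numpy as np

def simulate_bpre(n_reps, n_gens, ancestor_mean, env_low, env_high, rng):
    """Simulate independent replicates of a branching process in an
    i.i.d. random environment (all distributional choices are assumptions).

    Returns an integer array of shape (n_reps, n_gens + 1) whose column g
    holds the population size of generation g in each replicate.
    """
    sizes = np.empty((n_reps, n_gens + 1), dtype=np.int64)
    # Ancestors: 1 + Poisson(ancestor_mean - 1), so every replicate starts alive.
    sizes[:, 0] = 1 + rng.poisson(ancestor_mean - 1.0, size=n_reps)
    for g in range(n_gens):
        # Each replicate draws its own offspring mean for this generation.
        env_means = rng.uniform(env_low, env_high, size=n_reps)
        # A sum of Z i.i.d. Poisson(m) offspring counts is Poisson(Z * m).
        sizes[:, g + 1] = rng.poisson(sizes[:, g] * env_means)
    return sizes

def moment_estimates(sizes):
    """Plain moment estimates from the last two generations only.

    Returns (offspring_mean_hat, ancestor_mean_hat).
    """
    z_prev = sizes[:, -2]
    z_last = sizes[:, -1]
    n_prev = sizes.shape[1] - 2  # generation index of z_prev
    # Ratio estimator of the mean offspring mean.
    offspring_mean_hat = z_last.sum() / z_prev.sum()
    # Back out the ancestor mean from E[Z_n] = m_A * m^n.
    ancestor_mean_hat = z_prev.mean() / offspring_mean_hat ** n_prev
    return offspring_mean_hat, ancestor_mean_hat

rng = np.random.default_rng(0)
sizes = simulate_bpre(n_reps=20000, n_gens=5, ancestor_mean=5.0,
                      env_low=1.4, env_high=1.6, rng=rng)
m_hat, a_hat = moment_estimates(sizes)
print(f"offspring mean estimate: {m_hat:.3f}, ancestor mean estimate: {a_hat:.3f}")
```

Because the ancestor estimate divides by the n-th power of the offspring estimate, small errors in the latter are amplified as the number of generations grows. Widening the interval `[env_low, env_high]` increases the coefficient of variation of the environment and makes this effect more visible.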
Abstract:
Big data and streaming data arise in a variety of applications in business and industry, and data compression is typically used in these settings to enhance privacy and reduce operational costs. It is common to use sketching and random projections to reduce the dimension of the data, yielding compressed data. These data, however, contain anomalies such as heterogeneity, outliers, and round-off errors, which are hard to detect because of volume and processing challenges.
In this project, we describe a new robust and efficient minimum divergence estimator (MDE) to analyze the compressed data in a high-dimensional regression model. Specifically, we evaluate the prediction efficiency and residual efficiency of the MDE relative to least-squares estimators derived from uncompressed data. Using large sample theory and numerical experiments, we also demonstrate that routine use of the proposed robust methods is feasible in these contexts.
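As an illustration of the compression step described above, the sketch below applies a Gaussian random projection to the rows of a regression data set and fits ordinary least squares to both the full and the compressed data. It does not implement the minimum divergence estimator; the names `gaussian_sketch` and `least_squares`, the Gaussian sketching matrix, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def gaussian_sketch(X, y, sketch_rows, rng):
    """Compress (X, y) from n rows to sketch_rows rows.

    The sketching matrix S has i.i.d. N(0, 1/sketch_rows) entries, so that
    E[S.T @ S] is the identity and squared norms are preserved on average.
    """
    n = X.shape[0]
    S = rng.normal(0.0, 1.0 / np.sqrt(sketch_rows), size=(sketch_rows, n))
    return S @ X, S @ y

def least_squares(X, y):
    """Ordinary least-squares coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)
n, p, r = 2000, 5, 400
beta = np.arange(1.0, p + 1.0)             # true coefficients (simulated)
X = rng.normal(size=(n, p))
y = X @ beta + 0.1 * rng.normal(size=n)

X_s, y_s = gaussian_sketch(X, y, r, rng)   # compressed data: r rows instead of n
beta_full = least_squares(X, y)
beta_sketch = least_squares(X_s, y_s)
print("max |full - sketched| coefficient gap:",
      np.max(np.abs(beta_full - beta_sketch)))
```

The sketched fit only sees `r` mixed rows, so individual outlying observations are no longer identifiable in the compressed data, which is one reason robust estimators are of interest in this setting.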
The Local Correlation Curve is a statistical tool used to measure the relationship between two variables while accounting for local variations. Unlike traditional correlation measures (e.g., Pearson or Spearman correlation), which provide a single global value, the local correlation curve captures how correlation changes across different segments of the data.
In this project, we applied nonparametric estimation techniques to model local correlation curves, capturing evolving dependencies and improving the interpretability of complex associations. We also developed a machine learning algorithm for local dependency estimation in partial linear regression models, validating its robustness against model misspecification and enhancing predictive reliability.
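The idea of a correlation that varies with the covariate can be sketched as follows. This example assumes the correlation-curve form commonly attributed to Bjerve and Doksum, rho(x) = s * b(x) / sqrt(s^2 * b(x)^2 + v(x)), where s is the standard deviation of X, b(x) is the local regression slope, and v(x) is the local residual variance. The local linear fit with a Gaussian kernel, the bandwidth, and the function name `local_correlation` are illustrative choices, not the estimator developed in the project.

```python
import numpy as np

def local_correlation(x, y, x0, bandwidth):
    """Estimate a local correlation at x0 from a kernel-weighted linear fit."""
    # Gaussian kernel weights centred at x0.
    w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
    # Weighted least squares for y ~ a + b * (x - x0).
    design = np.column_stack([np.ones_like(x), x - x0])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    slope = coef[1]
    # Kernel-weighted residual variance around the local line.
    resid = y - design @ coef
    local_var = np.sum(w * resid ** 2) / np.sum(w)
    sd_x = np.std(x)
    return sd_x * slope / np.sqrt(sd_x ** 2 * slope ** 2 + local_var)

rng = np.random.default_rng(2)
x = rng.uniform(-2.0, 2.0, size=20000)
y = x ** 2 + 0.5 * rng.normal(size=x.size)   # dependence changes sign at 0

print("global Pearson correlation:", np.corrcoef(x, y)[0, 1])
for x0 in (-1.0, 0.0, 1.0):
    print(f"local correlation at x0 = {x0:+.1f}:",
          local_correlation(x, y, x0, bandwidth=0.3))
```

In this simulated example the global Pearson correlation is close to zero even though the variables are strongly related, while the local values are negative to the left of zero and positive to the right.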