2022 EIMS Conference on

Applied Statistics


Ewha Institute of Mathematical Sciences (EIMS)

Ewha Womans University, S. Korea, Oct 29 2022

Abstracts

Evaluation of the Difference between Two Spatiotemporal Random Fields

Abstract: Comparing the spatial characteristics of spatiotemporal random fields is a common need in many fields of study. In climatology in particular, researchers want to understand the difference between synthetic climate simulation models and climate field reconstructions (CFRs), which are estimates of the past climate constructed from proxy data. However, the comparison can be challenging because the data are high-dimensional and dependent. We develop a new multiple testing approach that detects local differences in the spatial characteristics of two spatiotemporal random fields by taking spatial information into account. Our method adopts a two-component mixture model for location-wise p-values and then derives a new false discovery rate (FDR) control procedure, called the mirror procedure, to determine the optimal rejection region. This procedure is robust to model misspecification and allows for weak dependency among hypotheses. To incorporate spatial heterogeneity, we model the mixture probability and study the benefit, if any, of allowing the alternative distribution to be spatially varying. An EM algorithm is developed to estimate the mixture model and implement the FDR procedure. We study the FDR control and the power of our new approach both theoretically and numerically, and apply the approach to compare the mean and teleconnection patterns of two synthetic climate fields.
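As a rough illustration of the two-component mixture idea, the sketch below fits a generic two-group model to p-values by EM and then rejects by thresholding the estimated local FDR. It is not the mirror procedure and it ignores spatial information. The Uniform(0,1) null, the Beta(a, 1) alternative, the averaging rejection rule, and all function names are assumptions chosen only to keep the example small.

```python
import math
import random


def simulate_pvalues(n, pi0, a, seed=0):
    """Draw p-values from pi0 * Uniform(0,1) + (1 - pi0) * Beta(a, 1)."""
    rng = random.Random(seed)
    pvals, is_null = [], []
    for _ in range(n):
        null = rng.random() < pi0
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
        # If U is uniform, U**(1/a) has CDF p**a, i.e. Beta(a, 1).
        pvals.append(u if null else u ** (1.0 / a))
        is_null.append(null)
    return pvals, is_null


def em_two_group(pvals, n_iter=200):
    """EM for the assumed mixture; returns (pi0, a, local FDR per p-value)."""
    pi0, a = 0.5, 0.5  # arbitrary starting values
    logs = [math.log(p) for p in pvals]
    for _ in range(n_iter):
        # E-step: posterior probability that each hypothesis is non-null.
        w = []
        for lp in logs:
            f1 = a * math.exp((a - 1.0) * lp)  # Beta(a, 1) density
            w.append((1.0 - pi0) * f1 / (pi0 + (1.0 - pi0) * f1))
        # M-step: closed-form updates for the null proportion and shape.
        sw = sum(w)
        pi0 = 1.0 - sw / len(w)
        a = -sw / sum(wi * lp for wi, lp in zip(w, logs))
    lfdr = [
        pi0 / (pi0 + (1.0 - pi0) * a * math.exp((a - 1.0) * lp))
        for lp in logs
    ]
    return pi0, a, lfdr


def reject_by_lfdr(lfdr, alpha=0.1):
    """Reject the largest set whose average local FDR is at most alpha."""
    order = sorted(range(len(lfdr)), key=lambda i: lfdr[i])
    running, k = 0.0, 0
    for j, i in enumerate(order, start=1):
        running += lfdr[i]
        if running / j <= alpha:
            k = j
    return set(order[:k])


if __name__ == "__main__":
    pvals, is_null = simulate_pvalues(10000, pi0=0.8, a=0.2, seed=1)
    pi0_hat, a_hat, lfdr = em_two_group(pvals)
    rejected = reject_by_lfdr(lfdr, alpha=0.1)
    false_rej = sum(1 for i in rejected if is_null[i])
    print(pi0_hat, a_hat, len(rejected), false_rej / max(len(rejected), 1))
```

A spatially varying version would replace the single `pi0` with a location-specific probability, for example a logistic function of spatial covariates; that extension is not sketched here.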


Deep Learning Approach for Shoeprint Matching

Abstract: In forensic science, the evaluation of shoeprints can be challenging because outsole patterns degrade and because it is difficult to discriminate between different wear patterns on the same outsole design. In this research, we construct a deep learning matching algorithm using a Siamese network and apply transfer learning with VGG and ResNet to avoid end-to-end training and to obtain better source classification. We then extract similarity features from the deep learning layers and compute a score that quantifies how similar two images are. In the data analysis, we construct more realistic mock crime scene pairs whose sources may be difficult to determine. The first set consists of pairs in which the shoeprints are degraded with dust and blood. The second set consists of close non-matches, in which the shoeprints share the same class characteristics, meaning the same outsole design but different wear. We find that the proposed deep learning algorithm performs better than or comparably to existing methods in classifying the sources of shoeprints.


Fitting an Accelerated Failure Time Model with Time-dependent Covariates via Nonparametric Gaussian Scale Mixtures

An accelerated failure time (AFT) model is a popular regression model in survival analysis. It relates the logarithm of the failure time to a set of covariates through a linear predictor plus a random error. The model can be either parametric or semiparametric, depending on how fully the error distribution is specified. The covariates are usually assumed to be fixed, that is, ‘time-independent’. In many biomedical studies, however, ‘time-dependent’ covariates are frequently observed. In this work, we consider a semiparametric time-dependent AFT model. We model the distribution of the baseline failure time as an infinite scale mixture of Gaussian densities. This model is therefore highly flexible compared to one that assumes a one-component parametric density. We consider maximum likelihood estimation and propose an algorithm based on the constrained Newton method for estimating the model parameters and the mixing distribution. The finite-sample properties of the proposed methods are assessed via simulation studies, and the methods are illustrated with a real data set.

Keywords: time-dependent covariates, nonparametric Gaussian scale mixture, constrained Newton method, survival analysis
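For orientation, the model described above can be written out as follows. This is a sketch in assumed notation, not the authors' exact specification: the abstract does not say whether the Gaussian scale mixture is placed on the baseline failure time or on its logarithm, and the time-dependent form shown is one common formulation from the survival analysis literature.

```latex
% Time-independent AFT model: linear predictor plus additive error on the log scale
\log T_i = x_i^{\top}\beta + \varepsilon_i

% Gaussian scale mixture for the error density (assumed here to apply to
% \varepsilon = \log T_0); G is an unspecified mixing distribution on \sigma > 0
% and \phi is the standard normal density
f(\varepsilon) = \int_0^{\infty} \frac{1}{\sigma}\,
    \phi\!\left(\frac{\varepsilon}{\sigma}\right) dG(\sigma)

% One common time-dependent extension (an assumption): the baseline failure
% time T_{0i} is the observed time rescaled by the covariate history x_i(s)
T_{0i} = \int_0^{T_i} \exp\{-x_i(s)^{\top}\beta\}\,ds
```

When the covariates are constant over time, the last expression reduces to T_{0i} = T_i exp(-x_i'β), which recovers the first equation with ε_i = log T_{0i}.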


A Tree-based Scan Statistic for Detecting Signals of Drug-Drug Interactions in Spontaneous Reporting Databases

Clinical trials generally focus on the safety and efficacy of a single drug rather than on the effects of drug-drug interactions (DDIs). However, concomitant use of multiple drugs can increase the risk of adverse events (AEs) due to DDIs. AEs caused by DDIs have been estimated to account for around 30% of unexpected AEs. Therefore, detecting signals of AEs caused by DDIs is as important as detecting signals of single drug-induced AEs in post-market drug safety surveillance. Several statistical methodologies for signal detection of DDIs have been proposed, such as the Ω shrinkage measure (Norén et al., 2008), the chi-square statistic for screening AEs caused by DDIs (Gosho et al., 2017), the combination risk ratio (Susuta and Takahashi, 2014), and the concomitant signal score (Noguchi et al., 2020). However, these methods were developed without considering the hierarchical structure of AE codes, such as the World Health Organization’s Adverse Reaction Terminology. Also, most of the proposed methods do not account for the potential reporting bias of spontaneous reporting systems, such as under-reporting and relative over-reporting for specific drugs or AEs. In this study, we propose a signal detection method for DDIs based on the tree-based scan statistic, which simultaneously searches a large number of nodes in a database for a node with relatively high risk. Under several assumptions, our proposed method can rule out the problems caused by potential reporting bias. We conducted simulation studies to compare the performance of the proposed method with existing methods under various settings. We also performed a real data analysis using the database of the Korea Adverse Event Reporting System.

Keywords: Adverse drug reaction, Drug safety surveillance, Hierarchical structure, Reporting bias
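To make the tree-based scan concrete, here is a minimal sketch of the standard conditional Poisson version of the tree-based scan statistic: counts are aggregated up the hierarchy, a log-likelihood ratio is computed for every node, and the node with the largest value is flagged. This is not the DDI-specific method proposed in the abstract. The toy tree, the counts, and the function names are hypothetical, the expected counts are assumed to sum to the observed total, and the Monte Carlo step normally used to obtain a p-value is omitted.

```python
import math
from collections import defaultdict


def poisson_llr(c, n, total):
    """Conditional Poisson log-likelihood ratio for one node.

    c: observed count in the node, n: expected count under the null,
    total: total observed count over all leaves.
    Only excess risk (c > n) contributes; otherwise the score is 0.
    """
    if n <= 0 or c <= n:
        return 0.0
    llr = c * math.log(c / n)
    if total > c:
        llr += (total - c) * math.log((total - c) / (total - n))
    return llr


def tree_scan(parent, observed, expected):
    """Score every node of a hierarchy and return the highest-scoring one.

    parent: dict mapping each node to its parent (the root maps to None).
    observed, expected: dicts mapping each leaf to its count.
    """
    obs = defaultdict(float)
    exp = defaultdict(float)
    for leaf in observed:
        node = leaf
        while node is not None:  # push leaf counts up to every ancestor
            obs[node] += observed[leaf]
            exp[node] += expected[leaf]
            node = parent[node]
    total = sum(observed.values())
    scores = {v: poisson_llr(obs[v], exp[v], total) for v in obs}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical two-level AE hierarchy with four leaf terms.
    parent = {"root": None, "A": "root", "B": "root",
              "a1": "A", "a2": "A", "b1": "B", "b2": "B"}
    observed = {"a1": 10, "a2": 2, "b1": 3, "b2": 5}
    expected = {"a1": 5.0, "a2": 5.0, "b1": 5.0, "b2": 5.0}
    best, scores = tree_scan(parent, observed, expected)
    print(best, scores[best])
```

In practice the maximum is compared with its distribution under the null, typically by redistributing the total count over the leaves many times and recomputing the maximum each time.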


Functional Cox Regression Models

In environmental health research, it is of interest to understand the effect of the neighborhood environment on health. Researchers have shown a protective association between green space around the residential address and depression outcomes. In assessing exposure to green space, distance buffers are often used to quantify the exposure. However, buffer distances differ across studies and are primarily determined by researchers a priori, and it remains uncertain how to identify an appropriate buffer distance for exposure assessment. To address this geographic uncertainty problem in exposure assessment, we present a domain selection algorithm based on the penalized functional linear Cox regression model. The theoretical properties of our proposed method are studied, and simulation studies are conducted. The method is illustrated in a study of the effect of green space exposure on depression and/or antidepressant use in the Nurses’ Health Study.
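One way to read the domain selection problem is through the usual functional linear Cox model, sketched below in assumed notation. Here X_i(r) stands for subject i's green space exposure as a function of buffer distance r, R is the largest distance considered, and δ is the unknown distance beyond which exposure has no effect. The abstract does not give the authors' exact formulation or penalty.

```latex
% Functional linear Cox model (standard form; notation assumed)
h(t \mid X_i) = h_0(t)\,
    \exp\!\left\{ \int_0^{R} X_i(r)\,\beta(r)\,dr \right\}

% Domain selection: the coefficient function vanishes beyond an unknown distance \delta
\beta(r) = 0 \quad \text{for } r > \delta, \qquad 0 < \delta \le R
```

Under this reading, estimating δ amounts to choosing the buffer distance from the data rather than fixing it in advance.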


Construction of New General Classes of Bivariate Distributions Based on Stochastic Orders

In this study, based on the concepts of stochastic orders, we propose new general classes of bivariate distributions. The failure rate order, usual stochastic order and likelihood ratio order are applied to construct the classes. The joint distributions in each class are derived. It will be seen that the obtained formulas for the joint distributions are very simple and easy to apply. Then, the relationships between the classes are discussed and characterized. We illustrate the practical usefulness of the proposed classes by showing that a number of new families of bivariate distributions can be generated from the classes. Furthermore, to illustrate practical relevance, we apply several developed models to analyze a real bivariate failure time data set.
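For reference, the three stochastic orders named above have the following textbook definitions for absolutely continuous random variables X and Y, with survival functions \bar F_X, \bar F_Y and densities f_X, f_Y. These are the standard definitions only; the way the classes of bivariate distributions are built from them is not shown in the abstract.

```latex
% Usual stochastic order
X \le_{\mathrm{st}} Y \iff \bar F_X(t) \le \bar F_Y(t) \ \text{for all } t

% Failure (hazard) rate order
X \le_{\mathrm{hr}} Y \iff \bar F_Y(t)/\bar F_X(t) \ \text{is nondecreasing in } t

% Likelihood ratio order
X \le_{\mathrm{lr}} Y \iff f_Y(t)/f_X(t) \ \text{is nondecreasing in } t

% The orders are nested
X \le_{\mathrm{lr}} Y \implies X \le_{\mathrm{hr}} Y \implies X \le_{\mathrm{st}} Y
```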