PASTA Lab. (Probability and Statistical Analysis Lab.)

Research Area

Data mining in Health Care

Data mining has been used intensively and extensively by many organizations. In healthcare, data mining is becoming increasingly popular. Data mining applications can greatly benefit all parties involved in the healthcare industry. For example, data mining can help healthcare insurers detect fraud and abuse, healthcare organizations make customer relationship management decisions, physicians identify effective treatments and best practices, and patients receive better and more affordable healthcare services. The huge amounts of data generated by healthcare transactions are too complex and voluminous to be processed and analyzed by traditional methods. Data mining provides the methodology and technology to transform these mounds of data into useful information for decision making.

Probabilistic Graphical Model

A probabilistic graphical model (PGM) is a probabilistic model for which a graph denotes the conditional dependence structure between random variables. It is commonly used in probability theory, statistics and machine learning. In terms of its applications, they have rapidly been expanding regardless of areas such as biomedical statistics, face/voice recognition, robot engineering, finance, marketing, manufacturing industry and so on.

Among a variety of PGMs, we mainly focus on Bayesian network models which are explicit and intuitive representations of dependencies as well as allow efficient algorithms for probabilistic inference with efficient factorization.

Chemometrics

Chemometrics is the science of extracting information from chemical data. It is a highly interfacial discipline, using methods frequently employed in core data-analytic disciplines such as multivariate statistics, applied mathematics, and computer science, in order to address problems in chemistry, biochemistry, medicine, biology and chemical engineering.

It is applied to solve both descriptive and predictive problems in experimental life sciences, especially in chemistry. In descriptive applications, properties of chemical systems are modeled with the intent of learning the underlying relationships and structure of the system (i.e., model understanding and identification). In predictive applications, properties of chemical systems are modeled with the intent of predicting new properties or behavior of interest. In both cases, the datasets can be small but are often very large and highly complex, involving hundreds to thousands of variables, and hundreds to thousands of cases or observations.

Financial Time Series

Financial time series analysis is concerned with the theory and practice of asset valuation over time. It is a highly empirical discipline, but like other scientific fields theory forms the foundation for making inference. There is, however, a key feature that distinguishes financial time series analysis from other time series analysis. Both financial theory and its empirical time series contain an element of uncertainty. For example, there are various definitions of asset volatility, and for a stock return series, the volatility is not directly observable. As a result of the added uncertainty, statistical theory and methods play an important role in financial time series analysis. (Tsay, 2010)

Recent financial data are analyzed by using econometrics, time series, and data mining techniques.

Our main focus are:

- Bayesian time series

- Volatility modeling

- Factor models

- Computational finance

Page updated

Google Sites

Report abuse