Saturday abstracts

Large Data and Finance II

Saturday May 4

Coffee/pastry 8:30-9:00

Session 9

9:00-9:35

Spectral Volume Models: Universal High-Frequency Periodicities in Intraday Trading Activities

Speaker: Ruixun Zhang (Peking University)

We develop spectral volume models to systematically estimate, explain, and exploit the high-frequency periodicity in intraday trading activities using Fourier analysis. The framework consistently recovers periodicities at specific frequencies in three steps, despite their low signal-to-noise ratios. This reveals important and universal high-frequency periodicities in the United States (US) and Chinese stock markets, and the dominant frequencies explain a significant fraction of the total variance of intraday volumes. We provide evidence that this phenomenon likely reflects the behaviors of trading algorithms with repeated and regular trading instructions. Finally, we demonstrate that uncovering such high-frequency periodicities improves intraday volume predictions and generates excess returns. Long-short portfolios constructed based on the strength of periodicity yield monthly alphas of up to 0.9% in the US and 5% in China.
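As a rough illustration of how Fourier analysis can surface a weak periodicity in volume data, the sketch below computes a plain periodogram on a synthetic one-second volume series and reports the dominant period. It is not the three-step procedure described in the abstract; the function name `dominant_period`, the 10-second cycle, and the noise level are all assumptions made for the example.

```python
import numpy as np

def dominant_period(volume, dt_seconds):
    """Return the period (in seconds) with the largest periodogram power."""
    x = np.asarray(volume, dtype=float)
    x = x - x.mean()                       # remove the zero-frequency component
    power = np.abs(np.fft.rfft(x)) ** 2    # periodogram ordinates
    freqs = np.fft.rfftfreq(x.size, d=dt_seconds)
    k = 1 + np.argmax(power[1:])           # skip the zero frequency
    return 1.0 / freqs[k]

# Synthetic data (hypothetical): a 6.5-hour session sampled every second,
# with a 10-second cycle whose amplitude is well below the noise level.
rng = np.random.default_rng(0)
n = 23400
t = np.arange(n)
volume = 100 + 5 * np.cos(2 * np.pi * t / 10) + rng.normal(0, 20, n)

period = dominant_period(volume, dt_seconds=1.0)
print(f"dominant period: {period:.2f} seconds")
```

Even though the cycle is small relative to the noise at each second, its energy concentrates in a single frequency bin while the noise spreads across all bins, which is why the peak is still recoverable.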

9:35-10:10

Spoofing and Manipulating Order Books with Learning Algorithms

Speaker: Álvaro Cartea (University of Oxford)

We propose a dynamic model of the limit order book to derive conditions to test if a trading algorithm will learn to manipulate the order book. Our results show that as a market maker becomes more tolerant to bearing inventory risk, the learning algorithm will find optimal strategies that manipulate the book more frequently. Manipulation occurs to induce mean reversion in inventory to an optimal level and to execute round-trip trades with limit orders with a higher probability than was otherwise likely to occur; spoofing is a special case when the market maker prefers that manipulative limit orders are not filled. The conditions are tested with order book data from Nasdaq, and we show that market conditions are conducive for an algorithm to learn to manipulate the order book. Finally, when two market makers use learning algorithms to trade, their algorithms can learn to coordinate their manipulation.

Break 10:10 - 10:30

Session 10

10:30-11:05

Coordinated Testing for Identification Failure and Correct Model Specification

Speaker: Eric Renault (University of Warwick)

In the context of GMM or Minimum Distance (MD) inference, we propose inference procedures about structural parameters of interest that are robust to the lack of identification of other structural parameters. On the one hand, we develop an overidentification test that is robust to weak identification. On the other hand, we develop a test of the null of weak identification that is robust to misspecification and powerful through a conditional approach. Importantly, we emphasize that these two testing goals require some coordination to ensure there is enough power to test for weak identification, which is critical for valid inference. More broadly, our paper develops tools to address the large data identification issue pointed out by Stock and Wright (2000) in the context of the Consumption-based Capital Asset Pricing Model (CCAPM): “approximately a century of monthly data is needed before conventional normal asymptotics provides a good approximation to the finite sample distributions of these estimators”. We illustrate the performance of our different testing strategies through several applications, including the New Keynesian Phillips curve (in a GMM framework) and asset pricing models with stochastic volatility and leverage effect (in an MD framework).

11:05-11:40

On the Theory of Autoencoders

Speaker: Dacheng Xiu (University of Chicago, Booth)

Autoencoders are essential techniques in unsupervised machine learning, predominantly used for dimension reduction, feature extraction, and signal denoising. This study provides non-asymptotic guarantees for deep autoencoders within a nonlinear factor model framework. We demonstrate that deep autoencoders can effectively retrieve common components from model inputs. The associated error includes one component similar to that observed in a linear factor model, and another component that aligns with the optimal nonparametric regression rate, as if the factors were directly observed. Furthermore, we show that the extracted factors converge to the true latent factors, albeit with a functional transformation.

11:40-12:15

STEEL: Singularity-aware Reinforcement Learning

Speaker: Xiaohong Chen (Yale University)

Lunch 12:15 - 1:45

Session 11

1:45-2:20

Simulation of diffusion bridges and estimation for SDEs with random effects

Speaker: Michael Sørensen (University of Copenhagen)

The complexity of likelihood inference for diffusions based on discrete time samples often necessitates the use of computational techniques such as Markov chain Monte Carlo. We consider methods based on sampling of diffusion bridges that are consistent with the observed data, typically bridges between the observed values of the diffusion. A diffusion bridge from a to b on [0,T] is a solution X to a stochastic differential equation started at a and conditioned to have the terminal value X_T = b. A method for simulating diffusion bridges for ergodic diffusions is presented. A main advantage is that computing time is linear in T. Approximate bridges are obtained by simulating (e.g. by the Euler scheme) two unconditional diffusions, one started at a and the other started at b. The first is spliced to the time-reversal of the second the first time they meet. If they do not meet, a new pair is simulated. For real diffusions the two diffusions can be independent. In higher dimensions coupling methods are needed to ensure that the two diffusions have a positive probability of meeting, and the stochastic differential equation governing the second diffusion must be that of the time reversal of the stationary version of the diffusion. The distribution of the approximate bridge is derived and used to construct a Metropolis-Hastings algorithm where the proposals are approximate bridges and the target distribution is that of an exact bridge from a to b. It is briefly discussed how the approach can be combined with methods of exact simulation of diffusions to obtain a bridge simulation method that is both without discretization error and with computing time linear in T. The usefulness of the approach is demonstrated by applications to estimation for stochastic differential equations with random effects (panel data) and diffusions with jumps.
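The splicing construction described in the abstract can be sketched for a one-dimensional example. The code below assumes an Ornstein-Uhlenbeck process dX = -theta X dt + sigma dW (a choice made here for illustration, not taken from the abstract), simulates two independent Euler paths, reverses the second in time, and joins the two at the first grid point where they cross. The Metropolis-Hastings correction mentioned in the abstract is not included, and all function names and parameter values are hypothetical.

```python
import numpy as np

def euler_ou(x0, theta, sigma, T, n, rng):
    """Euler scheme for dX = -theta * X dt + sigma dW with n steps on [0, T]."""
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        x[i + 1] = x[i] - theta * x[i] * dt + sigma * np.sqrt(dt) * rng.normal()
    return x

def approximate_bridge(a, b, theta, sigma, T, n, rng, max_tries=1000):
    """Approximate bridge from a to b on [0, T] by splicing two free paths."""
    for _ in range(max_tries):
        forward = euler_ou(a, theta, sigma, T, n, rng)          # starts at a
        backward = euler_ou(b, theta, sigma, T, n, rng)[::-1]   # ends at b
        diff = forward - backward
        # First grid index where the sign of the gap flips (or the gap is zero).
        crossed = np.nonzero(diff * diff[0] <= 0)[0]
        if crossed.size > 0:
            k = crossed[0]
            return np.concatenate([forward[:k], backward[k:]])
        # The paths did not meet: simulate a new pair.
    raise RuntimeError("the two paths never met")

rng = np.random.default_rng(1)
path = approximate_bridge(a=0.0, b=1.0, theta=1.0, sigma=1.0, T=5.0, n=500, rng=rng)
print(path[0], path[-1], path.size)
```

Each attempt costs two Euler simulations of length n, so the work per attempt grows linearly with the number of time steps.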

2:20-2:55

Testing many moments by combining many norms

Speaker: Anders Kock (University of Oxford)

Break 2:55-3:15

Session 12

3:15-3:50

Non-reversal Normal Markets Meet Nonexcessive Regulations: Quantile-based Risk Portfolios Outperform Naive Strategy

Speaker: Zhengjun Zhang (University of the Chinese Academy of Sciences)

Overreacting to either profit or loss can lead to ill-functioning financial markets and even financial crises. We found that quantile-based risk portfolios outperform naive strategies when a market is in a period of non-reversal with nonexcessive risk regulations. This significant finding has been missed in the literature. Evaluating the P&L balance capabilities of return-risk models, we further found the performances of VaR, CVaR, and MMVaR to be inadequate, specious, and outstanding, respectively, in terms of 1) better risk-return trade-off and portfolio selection under diversified markets and risk levels ranging from regulators' excessive rules to investors' comfort beliefs (nonexcessive regulations), which result in inverted U-shaped Sharpe ratios, and 2) balanced equity investability. Surprisingly, successful empirical studies have rarely found that CVaR outperforms the risk measures VaR and variance, though CVaR possesses more probabilistic properties, which may be unnecessary and potentially misspecified. CVaR underperformed other primary risk measures and could damage the market due to its restricted and unbalanced equity investability. This evidence and other results discussed in our work confirm that the mean-CVaR model is inferior and that the new mean-MMVaR model can become the natural benchmark risk-return trade-off model to replace existing financial investment and portfolio optimization models. They further strongly support that MMVaR can replace VaR in the minimum capital requirements for market risk advocated by the Basel Committee on Banking Supervision.

3:50-4:25

Copula estimation for nonsynchronous financial data

Speaker: Rituparna Sen (University of California, Santa Barbara / Indian Statistical Institute, Bangalore)

Copulas are a powerful tool for modelling multivariate data. We propose modelling the intraday returns of multiple financial assets through copulas. The difficulty arises from the asynchronous nature of intraday financial data. We propose a consistent estimator of the correlation coefficient in the case of elliptical copulas and show that the plug-in copula estimator is uniformly convergent. For non-elliptical copulas, we capture the dependence through Kendall’s tau (leveraging the relation between the copula parameter and Kendall’s tau). We demonstrate underestimation of the copula parameter and propose an alternative method to obtain an improved estimator. In simulations, the proposed estimator reduces the bias significantly for a general class of copulas. We apply the proposed methods to real data on several stock prices.
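The relation between Kendall's tau and the copula parameter has simple closed forms for some standard families. The sketch below uses the textbook inversions for the Gaussian, Clayton, and Gumbel copulas on synchronous data only; it does not address the nonsynchronous setting or the bias correction that the abstract is about, and the function names are hypothetical.

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall's tau (tau-a), computed over all pairs; assumes no ties."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    n = x.size
    # Each unordered pair is counted twice in the sum and in n * (n - 1).
    return (dx * dy).sum() / (n * (n - 1))

def gaussian_rho_from_tau(tau):
    """Correlation of a Gaussian (elliptical) copula: rho = sin(pi * tau / 2)."""
    return np.sin(np.pi * tau / 2)

def clayton_theta_from_tau(tau):
    """Clayton copula parameter: theta = 2 * tau / (1 - tau), for 0 < tau < 1."""
    return 2 * tau / (1 - tau)

def gumbel_theta_from_tau(tau):
    """Gumbel copula parameter: theta = 1 / (1 - tau), for 0 <= tau < 1."""
    return 1 / (1 - tau)

tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])  # 5 concordant, 1 discordant pair
print(tau, gaussian_rho_from_tau(tau), clayton_theta_from_tau(tau))
```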

Day 3 concludes 4:25pm
