Research

The important thing is not to stop questioning.

Research Interests

My research focuses on developing interpretable deep learning algorithms to enhance business decision-making, particularly in finance and information systems (IS) contexts. I analyze unstructured data (including network structures, financial text, and multi-modal inputs) through domain-adapted methods such as representation learning and LLM alignment. My work explores network representation learning for financial analytics and business-oriented AI alignment.

Working Papers

"Evaluating and Aligning LLMs with Financial Analysts: from Reports to Market Impact" (Bingze Xu and Kunpeng Zhang)

In Preparation for Submission (Job Market Paper)

Abstract: Financial analyst reports are among the most influential sources for investment decision-making, offering critical insights into firm performance and earnings forecasts. Despite their importance to capital markets, these reports are costly to produce and often accessible only to paying clients. Large language models (LLMs) provide a promising alternative for generating analyst-like reports from quarterly earnings calls (ECs). However, existing LLMs often fail to capture domain-specific features that are essential in financial contexts, and systematic evaluations of their performance in this setting remain scarce. This paper presents the first large-scale, systematic evaluation of LLM-generated analyst reports and introduces a comprehensive framework that integrates standard evaluation metrics with finance-specific dimensions such as sentiment and language style. Using a rich real-world dataset of analyst reports, we benchmark outputs from leading open- and closed-source LLMs against analyst-written reports. Our analysis shows that LLMs fall short in capturing nuanced linguistic and financial cues critical for accurate market interpretation. To address these shortcomings, we propose a multi-objective alignment framework that fine-tunes LLMs using domain-specific preference signals. This approach substantially reduces the performance gap across key evaluation dimensions. In contrast, generic alignment methods such as Direct Preference Optimization (DPO), when applied without explicit domain objectives, fail to bridge these gaps, especially in financial sentiment. Finally, we assess the downstream value of LLM-generated reports by evaluating their ability to explain and predict market reactions. Reports produced with our multi-objective alignment approach achieve predictive performance closest to that of human analysts. Together, these results highlight the importance of domain-aware evaluation and targeted alignment strategies for the responsible deployment of LLMs in high-stakes financial applications and beyond.

"Peer Firm Identification via Wisdom of Crowds: A Network Representation Learning Approach" (Bingze Xu, Kunpeng Zhang, Mandy Dang, and Gavin Zhang)

Under Second-Round Revision at MIS Quarterly

Abstract: Peer firm identification traditionally relies on proximity measures among firms, often based on static industry classification systems like GICS or text similarity from supply-side data like product descriptions in 10-K filings. Although recent studies have introduced dynamic methods to identify peers for a focal firm, they often suffer from issues related to data coverage, flexibility, and suboptimal performance. To address these limitations, we construct a firm-firm network based on co-mentions in posts from a large-scale social media dataset. Unlike methods that use network linkages or typical structural measures (e.g., centralities), we employ a network representation learning technique that captures latent structures among firms, facilitating global structure learning beyond first-degree connections by projecting the network into a lower-dimensional space. Peer firms are identified via cosine similarity between learned embeddings of firms. To enhance the performance of peer firm identification, we develop an ensemble method that combines multiple sources of firm similarities. The results from extensive evaluations demonstrate that our method outperforms existing methods in explaining cross-sectional variations in stock returns, financial ratios, and valuation multiples for base firms. Additionally, it offers several other desirable properties for peer firm identification. In summary, this study makes three key contributions: (1) leveraging the wisdom of crowds from social media to identify peer firms, enabling highly fine-grained analysis, (2) introducing a network representation learning technique and developing an ensemble method to combine multiple data sources for enhancing peer firm identification, and (3) extensively validating the proposed approach against a set of well-established benchmarking methods and showcasing its superiority through additional practical case studies.
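As a rough illustration of the cosine-similarity step mentioned in the abstract, the sketch below ranks candidate peers for a focal firm from a dictionary of embeddings. It is not the paper's implementation: the function name `top_k_peers`, the tickers, and the three-dimensional vectors are all hypothetical, and the network representation learning and ensemble steps are not shown.

```python
import numpy as np

def top_k_peers(embeddings, focal, k=3):
    """Rank other firms by cosine similarity to the focal firm's embedding.

    `embeddings` maps a firm identifier to a 1-D vector. The vectors are
    assumed to come from some already-trained embedding model.
    """
    names = [name for name in embeddings if name != focal]
    mat = np.stack([embeddings[name] for name in names]).astype(float)
    query = np.asarray(embeddings[focal], dtype=float)
    # Cosine similarity: dot product divided by the product of the norms.
    sims = mat @ query / (np.linalg.norm(mat, axis=1) * np.linalg.norm(query))
    order = np.argsort(-sims)[:k]  # indices of the k most similar firms
    return [(names[i], float(sims[i])) for i in order]

# Made-up toy embeddings, for illustration only.
toy = {
    "AAA": np.array([1.0, 0.0, 0.0]),
    "BBB": np.array([0.9, 0.1, 0.0]),
    "CCC": np.array([0.0, 1.0, 0.0]),
    "DDD": np.array([-1.0, 0.0, 0.0]),
}
peers = top_k_peers(toy, "AAA", k=2)
```

In this toy setup, the firm whose vector points in nearly the same direction as the focal firm ranks first, regardless of vector length, which is the property that makes cosine similarity a natural choice for comparing embeddings.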

"Credit Rating Prediction Beyond Supply Chains: A Multiplex Network Representation Learning Approach" (Bingze Xu, Kunpeng Zhang, Balaji Padmanabhan, and Yi Yang)

In Preparation for Submission

Abstract: Assessing a firm’s credit rating is a significant and complex problem with wide implications for investors, markets, and supply-chain partners of a firm. Typical approaches in this context rely on financial, non-financial, and macroeconomic data as determinants of firm credit ratings. However, these data are primarily tabular and fail to capture potentially relevant relational dynamics that exist in this ecosystem (firm-firm networks, analyst/board member relationships, etc.). This paper presents a novel heterogeneous graph attention network (HGAT) with a multiplex corporate graph that integrates relational information from a firm’s supply chain, director, and analyst networks, and then ensembles the learned embeddings with financial ratios for final credit rating predictions. The attention mechanism facilitates interpretability and allows us to identify how different types of entities influence the final credit prediction of focal firms. Methodologically, we demonstrate that incorporating high-order topology structures at the motif level yields better results with explainable motif-based graphs. These contributions add to the growing literature in the information systems and finance areas on fintech innovations in markets, and have the potential to create significant efficiencies given the widespread use of firm credit ratings in financial markets.

"MAPO: Multi-Agent-based Prompt Optimization for Text-to-Image Alignment" (Mingwei Sun*, Bingze Xu*, and Kunpeng Zhang)

In Preparation for Submission

Abstract: Text-to-image (T2I) diffusion models have demonstrated impressive capabilities in generating visually appealing, high-quality images. However, their performance depends heavily on the quality of input prompts, making alignment between generated images and user intent a challenge. Existing approaches typically address this misalignment by leveraging LLMs for automatic prompt refinement or by fine-tuning diffusion models through RLHF. Although effective, these methods often require large-scale human-annotated training data or involve substantial computational costs. To overcome these limitations, we propose MAPO, a novel Multi-Agent Prompt Optimization framework designed to convert raw user prompts into optimized prompts tailored specifically for diffusion models, thus improving image synthesis quality aligned with user intentions. MAPO consists of four specialized agents (a refining agent, a T2I agent, an assessment agent, and a text-image alignment agent) that collaboratively and iteratively refine prompts, evaluate outputs, and optimize image generation quality. Empirical evaluations on three benchmark datasets show that MAPO significantly improves image aesthetics, text-image alignment, and human preference scores, outperforming several state-of-the-art training-based and training-free alignment benchmarks (an average improvement of 2.86%) in prompt-batch learning. In addition, MAPO achieves these advancements with minimal training data and limited computational resources, highlighting both its efficiency and effectiveness.

Research-in-Progress

"How Do Different Explainable Frameworks Affect News Bias Identification?" (Bingze Xu*, Wei Feng*, and Kunpeng Zhang)

Abstract: While NLP techniques have advanced the automatic detection of political stance bias, little is known about how to visualize and communicate these biases effectively. Drawing on cognitive psychology, this study investigates how local and counterfactual explanations influence users’ cognitive and metacognitive processes when evaluating biased news headlines. We fine-tune a bias detection model and conduct experiments to examine how different explanation designs based on SHapley Additive exPlanations (SHAP) affect user perception, reasoning, and decision quality. We hypothesize that interpretability designs can trigger different cognitive and metacognitive processes, such as confirmation bias and cognitive dissonance, which in turn shape user judgments. The research contributes to understanding how cognitive theories can enhance the design of AI systems, offering theoretical support for developing effective news bias mitigation tools. It also provides practical insights for media platforms and policymakers seeking to enhance transparency, reduce cognitive burden, and promote bias awareness on online platforms.
