November 15, 2024
8:30 AM EST to 12:30 PM EST
NYU Brooklyn Campus
Workshop Description
Tagline: Unleashing the power of multimodal financial foundation models. Promoting reproducibility, transparency, and usability of GenAI in finance.
Large Language Models (LLMs) have demonstrated remarkable proficiency in understanding and generating human-like text. Popular financial LLMs such as BloombergGPT and FinGPT have shown their potential in financial services. Multimodal Financial Foundation Models (MFFMs) can digest interleaved multimodal financial data, including fundamental data, market data, analytics, and alternative data. Such data is dynamic, spans both structured and unstructured forms, and arrives in varying formats, including, but not limited to, charts, graphs, Web APIs, Excel spreadsheets, SEC filings, XBRL data, and SQL data [1]. MFFMs enable a deeper understanding of the complexities underlying financial tasks and data, streamlining the operation of financial services.
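To make the notion of interleaved multimodal financial data concrete, the sketch below shows one possible way to normalize heterogeneous inputs (a market-data row, a filing excerpt, a chart image path) into a single time-ordered record stream that a model could consume. This is a minimal sketch for illustration only; the FinRecord schema, field names, and sample values are assumptions, not an API of any released MFFM.

# Minimal, illustrative sketch: normalizing heterogeneous financial inputs into
# one interleaved record stream. The FinRecord schema and field names below are
# assumptions for illustration, not part of any released MFFM.
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class FinRecord:
    source: str           # e.g., "market_data", "sec_filing", "chart"
    modality: str         # "tabular", "text", or "image"
    timestamp: str        # ISO-8601 string, so lexicographic sort is chronological
    payload: Any          # raw content: rows, text, or an image path
    metadata: Dict[str, str]


def interleave(records: List[FinRecord]) -> List[FinRecord]:
    """Order records from all sources by timestamp so a model sees one stream."""
    return sorted(records, key=lambda r: r.timestamp)


if __name__ == "__main__":
    stream = interleave([
        FinRecord("market_data", "tabular", "2024-11-14T16:00:00",
                  {"ticker": "AAPL", "close": 191.2}, {"vendor": "example"}),
        FinRecord("sec_filing", "text", "2024-11-14T09:00:00",
                  "Item 1A. Risk Factors ...", {"form": "10-K"}),
        FinRecord("chart", "image", "2024-11-14T12:00:00",
                  "charts/aapl_intraday.png", {"interval": "1m"}),
    ])
    for r in stream:
        print(r.timestamp, r.source, r.modality)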
Several challenges lie on the path to widespread adoption of MFFMs, including mounting concerns about reproducibility, transparency, ethics, and appropriate usage. Many existing LLMs function as black boxes, making it difficult to understand how they operate and to ensure fairness and impartiality. Two major challenges for LLMs are "model cannibalism" and "openwashing." Many models are trained and released without transparency in mind, e.g., Claude 3.5 Sonnet. Moreover, many supposedly "novel" models may exploit labels produced by existing LLMs (e.g., GPT-4o) and perform supervised learning on them; we refer to this problem as "model cannibalism." As a result, the MFFMs that currently power the finance industry are opaque in their decision-making. They give rise to various challenges, including, but not limited to, inadequate transparency about the training data, insufficient mitigation of models’ inherent biases, safety and security issues, and malicious modifications to model weights.
Recently, there have been many instances of another difficulty, "openwashing," in which released LLMs are marketed as "open" even though they lack an OSI-approved license such as the Apache License 2.0 or the MIT License. Openwashing inhibits the free use, modification, and distribution of these LLMs. We, a research alliance among Columbia University, Oxford University, and the Linux Foundation (branches including FinOS, PyTorch, LF AI & Data, and the Generative AI Commons), proposed the Model Openness Framework [2], which defines "true" model openness. MFFMs that comply with this framework lay the groundwork for reproducibility and for the adoption of robust and reliable models across the finance industry. Our framework offers guidance to researchers and developers seeking to enhance model transparency and reproducibility while allowing permissive usage. For financial enterprises and the finance industry, it provides clear guidelines for new models to become commercially usable without restrictions.
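As a rough illustration of how guidance like the Model Openness Framework [2] can be applied in practice, the sketch below encodes a simplified openness checklist for a model release and reports which components are missing. The component names and the pass/fail rule are simplified assumptions for illustration; consult the framework itself for the authoritative component list and openness classes.

# Simplified, illustrative openness checklist inspired by the Model Openness
# Framework [2]. Component names and the scoring rule are assumptions for
# illustration; the framework defines the authoritative criteria.
from typing import Dict, List

# Hypothetical release: component -> whether it is released under an open license.
EXAMPLE_RELEASE: Dict[str, bool] = {
    "model_weights": True,
    "training_code": True,
    "training_data": False,
    "evaluation_code": True,
    "model_card": True,
    "research_paper": False,
}


def missing_components(release: Dict[str, bool]) -> List[str]:
    """Return the checklist items that are not openly released."""
    return [name for name, is_open in release.items() if not is_open]


if __name__ == "__main__":
    gaps = missing_components(EXAMPLE_RELEASE)
    if gaps:
        print("Not fully open; missing:", ", ".join(gaps))
    else:
        print("All checklist components are openly released.")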
[1] Revolutionizing Finance with LLMs: An Overview of Applications and Insights. Huaqin Zhao et al., 2024.
[2] The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency and Usability in AI. Matt White, Ibrahim Haddad, Xiao-Yang Liu, et al., 2024.
Contact
Please email iwmffm.icaif.2024@gmail.com if you have any questions.