AI Models Only 79% Accurate When Asked About SEC Filings

Startup Patronus AI has uncovered limitations in artificial intelligence (AI) models when analyzing Securities and Exchange Commission (SEC) filings.

The findings shed light on the challenges AI models face and underscore the need for improvement before they can meet the demands of regulated industries, particularly finance, CNBC reported Tuesday (Dec. 19).

The research focused on large language models (LLMs) commonly used to analyze SEC filings, according to the report. Even the best-performing model configuration tested achieved only a 79% accuracy rate when answering questions with the entire filing provided alongside the question.

One of the major issues the researchers identified was the models’ tendency to refuse to answer questions or to hallucinate information not actually present in the SEC filings, the report said. This lack of accuracy and reliability is a significant concern in regulated industries, where precision is crucial.

The finance industry values the ability to quickly extract important data and analyze financial narratives, per the report. If AI models could accurately summarize SEC filings or promptly answer questions about their content, that capability would give users an edge in the competitive financial sector.

However, the entry of AI models into the industry has not been without challenges, according to the report.

One of the main challenges highlighted by Patronus AI is the nondeterministic nature of LLMs, the report said. These models do not consistently produce the same output for the same input, making rigorous testing essential to ensure accurate and reliable results.
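To make that point concrete, here is a minimal Python sketch (a toy illustration, not code from the study) of why sampled generation is nondeterministic: the model draws the next token from a probability distribution, so identical prompts can yield different completions across runs. The token probabilities below are invented for illustration.

```python
import random

# Hypothetical next-token distribution for a prompt like "Net revenue was..."
# (made-up numbers for illustration only; not from any real model or filing).
NEXT_TOKEN_PROBS = {
    "$1.2 billion": 0.45,
    "$1.3 billion": 0.35,
    "up 8% year over year": 0.20,
}

def sample_completion(probs: dict[str, float]) -> str:
    """Sample one continuation, the way an LLM decodes with temperature > 0."""
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

# The same input can produce different outputs on repeated calls,
# which is why one-off spot checks are a poor measure of accuracy.
for run in range(5):
    print(f"run {run + 1}: Net revenue was {sample_completion(NEXT_TOKEN_PROBS)}")
```

Production systems can reduce this variance with deterministic (temperature-zero) decoding, but repeated, automated evaluation is still the standard way to quantify how often a model's answers are actually correct.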

Patronus AI, founded by Anand Kannappan and Rebecca Qian, aims to address this challenge by automating LLM testing with software, per the report.
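Patronus AI's product is proprietary, but as a rough, hypothetical sketch of what automating this kind of testing involves, the snippet below runs a question-and-answer suite against a placeholder model function several times and reports an accuracy rate in the spirit of the 79% figure cited above. The `ask_model` function, the sample question, and the string-matching rule are illustrative assumptions, not Patronus AI's actual methodology.

```python
from dataclasses import dataclass

@dataclass
class FilingQuestion:
    question: str   # question posed alongside the full SEC filing text
    expected: str   # reference answer taken from the filing itself

# Placeholder for a real LLM call (e.g., an API request that includes the
# full filing in the prompt). Hypothetical; swap in an actual client.
def ask_model(filing_text: str, question: str) -> str:
    return "..."

def run_suite(filing_text: str, suite: list[FilingQuestion], trials: int = 3) -> float:
    """Ask each question several times and return the overall accuracy rate."""
    correct = total = 0
    for item in suite:
        for _ in range(trials):  # repeat because outputs can vary run to run
            answer = ask_model(filing_text, item.question)
            correct += int(item.expected.lower() in answer.lower())
            total += 1
    return correct / total if total else 0.0

# Example usage with toy data (not from any real filing):
suite = [FilingQuestion("What was total revenue for fiscal 2022?", "$81.5 billion")]
print(f"accuracy: {run_suite('<full 10-K text here>', suite):.0%}")
```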

Despite the challenges and limitations identified in the study, the co-founders of Patronus AI remain optimistic about the potential of LLMs to assist professionals in the finance industry, according to the report.

They believe that with continued improvement, these models can provide valuable support to analysts and investors. For now, however, human involvement is necessary to ensure accuracy and reliability.

PYMNTS Intelligence has found that financial chatbots are evolving into highly capable problem-solvers. In the future of consumer banking, digital assistants will not only listen but also understand and anticipate consumers’ needs, according to “AI and Banking’s New Dawn: From Conversations to Conversions,” a PYMNTS and Galileo collaboration.