Why Your Market Data Doesn't Match the SEC
If you've ever tried to tie a number from a popular stock screener back to a company's official 10-K, you've likely failed. Here is why data vendors deliberately restructure financial statements, and how that can break your models.
The myth of the universal template
When you look up a stock on a major platform like Seeking Alpha, Yahoo Finance, or even institutional terminals, you aren't seeing the company's financial statements. You are seeing a “standardized” interpretation of those statements.
While companies follow GAAP (Generally Accepted Accounting Principles), they have significant leeway in how they label, categorize, and group their line items. A software company might break out “Cloud Subscription Revenue,” while a retailer might just report “Net Sales.”
To make their databases queryable, and to let you compare the gross margins of 5,000 different companies instantly, data vendors like S&P Capital IQ, FactSet, and Bloomberg each force every company's unique financials into their own universal template. This process is called standardization.
Standardized vs. As-Reported Data
Specialized data quality firms like Calcbench explicitly warn analysts about the difference between these two types of data. If you don't know which one you are looking at, you are flying blind.
Standardized Data
- → The Goal: Comparability across thousands of peers.
- → The Method: The vendor maps unique company XBRL tags into rigid, predefined buckets (e.g., subscription, product, and service revenue all become simply "Revenue").
- → The Risk: It reclassifies expenses, strips out granular segment data, and often buries critical footnotes. It is a "black box."
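To make the mapping step concrete, here is a minimal Python sketch of how a standardizer might collapse company-specific labels into one bucket. The label names, the `STANDARD_MAP` table, and the figures are all hypothetical and invented for illustration; this is not any vendor's actual schema.

```python
from collections import defaultdict

# Hypothetical mapping from company-specific labels to a vendor's buckets.
STANDARD_MAP = {
    "CloudSubscriptionRevenue": "Revenue",
    "ProfessionalServicesRevenue": "Revenue",
    "NetSales": "Revenue",
}

def standardize(as_reported):
    """Collapse as-reported line items into standardized buckets."""
    buckets = defaultdict(float)
    for label, value in as_reported.items():
        # Unmapped labels fall into a catch-all bucket and lose their identity.
        buckets[STANDARD_MAP.get(label, "Other")] += value
    return dict(buckets)

# Made-up figures in USD millions.
software_co = {
    "CloudSubscriptionRevenue": 800.0,
    "ProfessionalServicesRevenue": 200.0,
}
standardized = standardize(software_co)
print(standardized)  # {'Revenue': 1000.0}
```

The total survives, but the split between subscription and services revenue is gone, which is the loss of detail described above.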
As-Reported Data
- → The Goal: Fidelity to the filing and full context for a single firm.
- → The Method: Raw data pulled exactly as it appears in the 10-K or 10-Q filing, preserving management's original labels and structure.
- → The Benefit: You see exactly where cash is flowing, including novel business models and specific one-time charges.
Where the black box breaks down
Major vendors standardize data to improve comparability. But when a vendor reclassifies a line item, it reshapes what the company actually reported to fit its own database schema.
Consider Stock-Based Compensation (SBC). Many modern tech companies report SBC as a distinct, massive line item in their cash flow statements, and explain in the footnotes exactly how it is distributed across R&D, Sales, and G&A. A standardized database often drops that allocation and leaves SBC buried inside generic operating expenses. Suddenly, you can't tell how much of a company's “R&D” spend is cash paid for engineering work and how much is equity granted to employees.
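As a rough illustration of what the footnote allocation lets you do, the sketch below backs SBC out of each expense line. All figures are made up (USD millions), and the dictionary layout is an assumption for the example, not a real filing format.

```python
# Hypothetical as-reported operating expenses, USD millions.
operating_expenses = {"R&D": 500.0, "Sales": 300.0, "G&A": 100.0}

# Hypothetical SBC allocation by function, as a footnote might disclose it.
sbc_allocation = {"R&D": 200.0, "Sales": 60.0, "G&A": 40.0}

def expenses_excluding_sbc(expenses, sbc):
    """Subtract the SBC allocated to each line; lines without SBC are unchanged."""
    return {line: amount - sbc.get(line, 0.0) for line, amount in expenses.items()}

ex_sbc = expenses_excluding_sbc(operating_expenses, sbc_allocation)
sbc_share_of_rd = sbc_allocation["R&D"] / operating_expenses["R&D"]

print(ex_sbc["R&D"])             # 300.0
print(f"{sbc_share_of_rd:.0%}")  # 40%
```

Without the allocation table, only the 500.0 total is visible, and the 40% share paid in equity cannot be recovered.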
Or consider non-recurring items. If a company takes a massive legal settlement, a standardized data feed might fold that expense into “Other Operating Expenses,” where it can no longer be told apart from routine costs. The screener looks clean, but the actual cash liability is obscured.
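A small worked example, using invented figures in USD millions, shows why the separate label matters. When the settlement is reported on its own line you can compute the margin both with and without it; once it is merged into another line, only the first number is recoverable.

```python
# Hypothetical as-reported income statement lines, USD millions.
revenue = 1000.0
recurring_operating_expenses = 700.0
legal_settlement = 150.0  # one-time charge, reported on its own line

def operating_margin(revenue, *expenses):
    """Operating income divided by revenue."""
    return (revenue - sum(expenses)) / revenue

# Margin as reported, including the one-time settlement.
margin_reported = operating_margin(revenue, recurring_operating_expenses, legal_settlement)

# Margin excluding the settlement, only possible if the line is still separate.
margin_ex_settlement = operating_margin(revenue, recurring_operating_expenses)

print(f"{margin_reported:.1%}")       # 15.0%
print(f"{margin_ex_settlement:.1%}")  # 30.0%
```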
Worse, mapping errors are common. In 2024, data quality analysts found frequent XBRL tagging errors where companies used negative signs for deferred tax liabilities, causing automated standardizers to miscalculate balance sheet totals across entire sectors.
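Errors of this kind can often be caught with simple sanity checks. The sketch below is one possible approach, not a description of any vendor's pipeline: it flags balances that are normally non-negative and tests the accounting identity (assets = liabilities + equity). The labels, figures, and tolerance are hypothetical.

```python
# Hypothetical labels for items normally reported as non-negative balances.
EXPECT_NON_NEGATIVE = ("DeferredTaxLiabilities", "TotalLiabilities", "TotalAssets")

def find_issues(facts, tolerance=0.5):
    """Return a list of human-readable warnings for one balance sheet."""
    issues = []
    for label in EXPECT_NON_NEGATIVE:
        if facts.get(label, 0.0) < 0:
            issues.append(f"{label} is negative: possible sign error")
    # Accounting identity: assets = liabilities + equity.
    gap = facts["TotalAssets"] - (facts["TotalLiabilities"] + facts["TotalEquity"])
    if abs(gap) > tolerance:
        issues.append(f"balance sheet does not balance (gap {gap:+.1f})")
    return issues

# Made-up figures in USD millions.
clean = {
    "DeferredTaxLiabilities": 50.0,
    "TotalLiabilities": 600.0,
    "TotalEquity": 400.0,
    "TotalAssets": 1000.0,
}
# Same company, but the deferred tax liability was tagged as -50 and a
# standardizer recomputed total liabilities from the components.
flipped = dict(clean, DeferredTaxLiabilities=-50.0, TotalLiabilities=500.0)

print(find_issues(clean))  # []
for warning in find_issues(flipped):
    print(warning)
```

In the flipped case both checks fire: the liability is negative, and the recomputed totals no longer balance.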
Stop trusting black-box market data
If you are running a quick quantitative screen on 500 stocks, standardized data is a necessary evil. But if you are building a DCF model, valuing a specific business, or putting real capital on the line, you cannot rely on a vendor's interpretation of the numbers.
At DeepFundamental, we refuse to force companies into a universal template. We parse the SEC filings directly, preserving the as-reported structure. Every line item, every segment, and every footnote is extracted exactly as management filed it with the SEC.
Because when it comes to fundamental analysis, you shouldn't have to guess what the numbers actually mean.
Analyze the real filings
Skip the standardized screeners. See the exact, as-reported financial statements extracted straight from EDGAR.