The problem of missing financial data is widespread yet often overlooked. A recent research paper by Bryzgalova et al. (2022) provides an interesting insight into the structure of missing financial data. First, examining a dataset of the 45 most popular characteristics in asset pricing, the authors found that missing data is frequent for almost every characteristic and affects all kinds of firms: small, large, young, mature, profitable, or in financial distress. Requiring multiple characteristics to be observed simultaneously makes the problem even worse. Moreover, the data is not missing at random; missing values cluster both cross-sectionally and over time. This may lead to selection bias and invalidates the most common ad-hoc approaches, such as imputing the cross-sectional median. Last but not least, returns depend on whether a firm has missing fundamentals: stocks with a missing characteristic value have lower returns than their counterparts for which the same variable is observed.
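To make the selection-bias point concrete, here is a small simulation sketch. It is not taken from the paper: the data are synthetic, and the missingness rule (firms with low values are more likely to be unobserved) is purely an assumption chosen for illustration. It shows how filling gaps with the median of observed values can be systematically off when data is not missing at random.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical cross-section: one characteristic for 10,000 firms in one month.
n_firms = 10_000
true_values = rng.normal(loc=0.0, scale=1.0, size=n_firms)

# Assumed missingness rule (illustrative only): the lower the true value,
# the higher the probability that it is unobserved.
prob_missing = 1.0 / (1.0 + np.exp(2.0 * true_values))
is_missing = rng.random(n_firms) < prob_missing

# Ad-hoc approach: fill every gap with the median of the observed values.
median_fill = np.median(true_values[~is_missing])

# Compare the fill value with the average of the values that were hidden.
mean_of_missing = true_values[is_missing].mean()
bias = median_fill - mean_of_missing

print(f"share missing:               {is_missing.mean():.2f}")
print(f"median of observed values:   {median_fill:.2f}")
print(f"mean of truly missing values: {mean_of_missing:.2f}")
print(f"bias of the median fill:     {bias:.2f}")
```

Under this assumed rule, the observed sample over-represents firms with high values, so the median fill overstates the hidden values. With missingness that is independent of the values, the gap would shrink toward zero.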
In light of these findings, the authors propose a novel imputation method that models characteristics in a three-dimensional space (time, individual stocks, and type of characteristic). The main idea is to estimate a low-dimensional cross-sectional factor model by Principal Component Analysis (PCA) for each month. They then combine this cross-sectional (XS) information with time-series (TS) information in the characteristics to predict missing values, creating two baseline models: the backward-cross-sectional model (B-XS), which uses only past observed data, and the backward-forward-cross-sectional model (BF-XS), which combines past and future information. According to the authors, the approach is simple, easy to use, and significantly outperforms existing alternatives.
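The cross-sectional step described above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the function name `impute_cross_section`, the parameters `n_factors` and `ridge`, and the synthetic data are all hypothetical, and the sketch assumes the characteristics have already been centered within the month (for example by a rank transformation). It covers only the cross-sectional (XS) part for a single month and leaves out the time-series information that the B-XS and BF-XS models add.

```python
import numpy as np

def impute_cross_section(chars, n_factors=2, ridge=0.01):
    """Fill NaNs in an (n_stocks, n_chars) array for one month using a
    low-dimensional factor model estimated by PCA on observed entries."""
    chars = np.asarray(chars, dtype=float)
    observed = ~np.isnan(chars)
    zero_filled = np.where(observed, chars, 0.0)

    # Covariance of characteristics, using only stocks where both
    # characteristics in a pair are observed (assumes centered inputs).
    counts = observed.T.astype(float) @ observed.astype(float)
    cov = (zero_filled.T @ zero_filled) / np.maximum(counts, 1.0)

    # Loadings are the eigenvectors of the largest eigenvalues;
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(cov)
    loadings = eigvecs[:, ::-1][:, :n_factors]  # shape (n_chars, n_factors)

    imputed = chars.copy()
    for i in range(chars.shape[0]):
        obs = observed[i]
        if obs.all() or not obs.any():
            continue  # nothing to fill, or no information (row stays NaN)
        lam = loadings[obs]
        # Ridge regression of the stock's observed characteristics on loadings.
        gram = lam.T @ lam + ridge * np.eye(n_factors)
        factors = np.linalg.solve(gram, lam.T @ chars[i, obs])
        imputed[i, ~obs] = loadings[~obs] @ factors
    return imputed

# Synthetic example: a noisy rank-2 panel with 20% of entries hidden at random.
rng = np.random.default_rng(1)
n_stocks, n_chars, k = 500, 10, 2
true = rng.normal(size=(n_stocks, k)) @ rng.normal(size=(k, n_chars))
true += 0.1 * rng.normal(size=true.shape)
mask = rng.random(true.shape) < 0.2
panel = np.where(mask, np.nan, true)

filled = impute_cross_section(panel, n_factors=2)
rmse_model = np.sqrt(np.nanmean((filled[mask] - true[mask]) ** 2))
rmse_zero = np.sqrt(np.mean(true[mask] ** 2))  # filling with the centered mean
print(f"RMSE factor model: {rmse_model:.3f}, RMSE mean fill: {rmse_zero:.3f}")
```

The small ridge penalty keeps the per-stock regression stable when a stock has very few observed characteristics. A fuller version in the spirit of B-XS would additionally use each stock's own past values of the characteristic as predictors, and BF-XS would add future values as well.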
Authors: Svetlana Bryzgalova, Sven Lerner, Martin Lettau and Markus Pelger
Title: Missing Financial Data
Missing data is a prevalent, yet often ignored, feature of company fundamentals. In this paper, we document the structure of missing financial data and show how to systematically deal with it. In a comprehensive empirical study we establish four key stylized facts. First, the issue of missing financial data is profound: it affects over 70% of firms that represent about half of the total market cap. Second, the problem becomes particularly severe when requiring multiple characteristics to be present. Third, firm fundamentals are not missing-at-random, invalidating traditional ad-hoc approaches to data imputation and sample selection. Fourth, stock returns themselves depend on missingness. We propose a novel imputation method to obtain a fully observed panel of firm fundamentals. It exploits both time-series and cross-sectional dependency of firm characteristics to impute their missing values, while allowing for general systematic patterns of missing data. Our approach provides a substantial improvement over the standard leading empirical procedures such as using cross-sectional averages or past observations. Our results have crucial implications for many areas of asset pricing.
Originally posted on Quantpedia Blog.