Machine Learning: The Recovery of Missing Firm Characteristics

Posted February 11, 2022 at 11:00 am
Heiner Beckmeyer
Alpha Architect

The post “Machine Learning: The Recovery of Missing Firm Characteristics” first appeared on Alpha Architect Blog.


Recovering Missing Firm Characteristics with Attention-Based Machine Learning

  • Heiner Beckmeyer and Timo Wiedemann, University of Muenster (Germany)
  • A version of this paper can be found here

What are the research questions?

Firm characteristics are often missing, which forces both researchers and practitioners to come up with workarounds. Previous approaches either dropped observations with missing entries or simply imputed the cross-sectional mean of a given characteristic. As both procedures come with serious drawbacks (see below), there is a need for more advanced methods. The authors set up an attention-based machine learning model, motivated by recent advances in natural language processing, to answer the following questions:

  1. How do firm characteristics relate to the cross-section of other – observed – characteristics and their historical evolution?
  2. How well does the proposed machine learning approach fare against competing approaches?
  3. How important is it to explicitly model nonlinear and interaction effects? How important is it to incorporate the temporal dynamics of the characteristics?
  4. On which information does the model rely when uncovering the latent structure governing firm characteristics?
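
The two common workarounds mentioned above, dropping incomplete observations and imputing the cross-sectional mean, can be sketched on a toy panel. This is only an illustration: the firms, the values, and the column names (`book_to_market`, `momentum`) are hypothetical and do not come from the paper.

```python
import numpy as np
import pandas as pd

# Hypothetical toy panel: one month, five firms, two characteristics.
panel = pd.DataFrame(
    {
        "month": ["2020-01"] * 5,
        "firm": ["A", "B", "C", "D", "E"],
        "book_to_market": [0.5, np.nan, 1.5, 1.0, np.nan],
        "momentum": [0.10, 0.20, np.nan, -0.10, 0.00],
    }
)
chars = ["book_to_market", "momentum"]

# Workaround 1: drop every firm-month with at least one missing entry.
dropped = panel.dropna(subset=chars)

# Workaround 2: fill each gap with that month's cross-sectional mean.
month_means = panel.groupby("month")[chars].transform("mean")
mean_imputed = panel.copy()
mean_imputed[chars] = panel[chars].fillna(month_means)

print(f"firms before: {len(panel)}, after dropping: {len(dropped)}")
print(mean_imputed)
```

In this toy example, dropping keeps only two of the five firms, while mean imputation assigns every firm with a gap the same value, regardless of what its other characteristics look like.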

What are the Academic Insights?

The authors show that:

  1. The proposed model is highly accurate in extracting the latent structure underlying the evolution of observable firm characteristics, and it outperforms competing methods by a wide margin. When the model is used to reconstruct available firm characteristics in a controlled environment, the expected error is around 4 percentiles from the true value, which makes it more than twice as accurate as the next-best method.
  2. Incorporating information about the temporal evolution of the characteristics is essential to the model’s ability to reconstruct them. While some characteristics exhibit a high degree of autocorrelation, others depend predominantly on cross-characteristic information, so the model needs both types of information. In a simulation study, the authors show that the model is flexible enough to simultaneously uncover a wide range of processes governing the evolution of characteristics.
  3. Sanity checks on the distribution of the reconstructed (i.e., previously missing) characteristics attest to the model’s internal validity, with results well in line with expectations. Information is more often missing for smaller firms and for firms that would be considered of low quality.
  4. Revisiting the literature on risk factors in financial research shows that many risk premia are likely much smaller than previously thought. Adding to the recent debate on replicability in financial research, the authors nevertheless find that most risk premia remain significant. The completed dataset poses an additional out-of-sample hurdle for existing and new risk premia to pass.
  5. Recovered percentiles of firm characteristics have been made publicly available for future research here.
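
The idea of reconstructing available characteristics in a controlled environment, and measuring the error in percentiles, can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' model: the simulated panel, the 20% masking rate, and the two simple baselines (cross-sectional median and last observation carried forward) are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical simulated panel: 200 firms, 24 months, one persistent characteristic.
n_firms, n_months = 200, 24
level = rng.normal(size=n_firms)                     # firm-specific level
noise = 0.3 * rng.normal(size=(n_months, n_firms))   # monthly noise
raw = pd.DataFrame(level + noise)                    # rows: months, columns: firms

# Cross-sectional percentile ranks in (0, 100] for each month.
pct = raw.rank(axis=1, pct=True) * 100

# Hide 20% of the known entries (never in the first month, so a lag exists).
mask = rng.random(pct.shape) < 0.2
mask[0, :] = False
observed = pct.mask(mask)

# Baseline 1 (cross-sectional only): fill with the month's median percentile.
cs_fill = observed.apply(lambda row: row.fillna(row.median()), axis=1)

# Baseline 2 (temporal only): carry forward the firm's last observed percentile.
ts_fill = observed.ffill()

def mae_on_masked(filled):
    """Mean absolute error in percentiles, computed on the hidden entries only."""
    err = (filled - pct).abs().to_numpy()
    return err[mask].mean()

mae_cs = mae_on_masked(cs_fill)
mae_ts = mae_on_masked(ts_fill)
print(f"cross-sectional median: {mae_cs:.1f} percentiles")
print(f"last observation carried forward: {mae_ts:.1f} percentiles")
```

Because the simulated characteristic is highly persistent by construction, the temporal baseline does better here. For a characteristic driven mainly by other characteristics, the ranking could reverse, which is the motivation for combining both types of information.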

Disclosure: Alpha Architect

The views and opinions expressed herein are those of the author and do not necessarily reflect the views of Alpha Architect, its affiliates or its employees. Our full disclosures are available here. Definitions of common statistics used in our analysis are available here (towards the bottom).

This site provides NO information on our value ETFs or our momentum ETFs. Please refer to this site.

Disclosure: Interactive Brokers

Information posted on IBKR Campus that is provided by third-parties does NOT constitute a recommendation that you should contract for the services of that third party. Third-party participants who contribute to IBKR Campus are independent of Interactive Brokers and Interactive Brokers does not make any representations or warranties concerning the services offered, their past or future performance, or the accuracy of the information provided by the third party. Past performance is no guarantee of future results.

This material is from Alpha Architect and is being posted with its permission. The views expressed in this material are solely those of the author and/or Alpha Architect and Interactive Brokers is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to buy or sell any security. It should not be construed as research or investment advice or a recommendation to buy, sell or hold any security or commodity. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.
