A statistician’s foray into value investing

Benjamin Graham and Warren Buffett have imparted a treasure trove of investing knowledge to the public. Reading Graham’s Intelligent Investor and Buffett’s three decades of Berkshire Hathaway shareholder letters is a rare treat. The reader participates in the thought process of two of the most astute investment minds of the 20th century and discovers the temperament required for long-term investment success. As such, my comments below should be read as those of someone thankful for the insights, and having benefited immensely from them, but still willing to question the statements and methods.

Holding a central place in the teachings of Benjamin Graham, the father of value investing, are the concepts of intrinsic value and margin of safety. Simply stated, intrinsic value is how much a given company, in its entirety, is worth as a private concern. This can be quite different from its market capitalization, which is the price Mr. Market offers for the company on any given day. Margin of safety, as coined by Graham, describes the extra cushion of value between the price paid for a company and the company’s intrinsic value.

Prima facie, a politician’s campaign speech promising to end world hunger is equally noble, yet it is dismissed as a mere platitude, prompting the question: “Great, who wouldn’t want to do that? How do you plan on accomplishing this?” But somehow, under the aura of the investment greats, the concepts of intrinsic value and margin of safety avoid such scrutiny. Yet wouldn’t we all want to know the value of what we’re buying and how much of a good or bad deal we’re receiving? After all, who wouldn’t want to buy 50 cent dollars? Or alternatively, who would really want to pay a dollar fifty for a dollar?

This leads us to what I believe to be the main deficiency of value investing as espoused by Graham and his acolytes: it lacks a testable hypothesis. The lack of a testable or falsifiable hypothesis is what separates astrology from astronomy and creationism from evolution, and it is best captured by a lengthy excerpt from Karl Popper, the well-known Austrian-born philosopher of science, in Conjectures and Refutations: The Growth of Scientific Knowledge.

“When should a theory be ranked as scientific?” or “Is there a criterion for the scientific character or status of a theory?”

The problem which troubled me at the time was neither, “When is a theory true?” nor “When is a theory acceptable?” My problem was different. I wished to distinguish between science and pseudo-science; knowing very well that science often errs, and that pseudo-science may happen to stumble on the truth.

I knew, of course, the most widely accepted answer to my problem: that science is distinguished from pseudoscience—or from “metaphysics”—by its empirical method, which is essentially inductive, proceeding from observation or experiment. But this did not satisfy me. On the contrary, I often formulated my problem as one of distinguishing between a genuinely empirical method and a non-empirical or even pseudo-empirical method — that is to say, a method which, although it appeals to observation and experiment, nevertheless does not come up to scientific standards. The latter method may be exemplified by astrology, with its stupendous mass of empirical evidence based on observation — on horoscopes and on biographies.

These considerations led me in the winter of 1919-20 to conclusions which I may now reformulate as follows.

1.  It is easy to obtain confirmations, or verifications, for nearly every theory — if we look for confirmations.
2.  Confirmations should count only if they are the result of risky predictions; that is to say, if, unenlightened by the theory in question, we should have expected an event which was incompatible with the theory — an event which would have refuted the theory.
3.  Every “good” scientific theory is a prohibition: it forbids certain things to happen. The more a theory forbids, the better it is.
4.  A theory which is not refutable by any conceivable event is non-scientific. Irrefutability is not a virtue of a theory (as people often think) but a vice.
5.  Every genuine test of a theory is an attempt to falsify it, or to refute it. Testability is falsifiability; but there are degrees of testability: some theories are more testable, more exposed to refutation, than others; they take, as it were, greater risks.
6.  Confirming evidence should not count except when it is the result of a genuine test of the theory; and this means that it can be presented as a serious but unsuccessful attempt to falsify the theory. (I now speak in such cases of “corroborating evidence.”)
7.  Some genuinely testable theories, when found to be false, are still upheld by their admirers — for example by introducing ad hoc some auxiliary assumption, or by reinterpreting the theory ad hoc in such a way that it escapes refutation. Such a procedure is always possible, but it rescues the theory from refutation only at the price of destroying, or at least lowering, its scientific status. (I later described such a rescuing operation as a “conventionalist twist” or a “conventionalist stratagem.”)

One can sum up all this by saying that the criterion of the scientific status of a theory is its falsifiability, or refutability, or testability.

To be sure, value investing, even loosely defined as it is, is a vastly superior method to rank speculation, and Graham et al. are to be thanked for their contributions. Yet in order to believe that value investing is superior to other investment methods, one must heed the wisdom of Karl Popper and posit a testable hypothesis: a hypothesis that goes beyond the ill-defined mandate of buying 50 cent dollars, beyond the nebulous definition of intrinsic value, and into the domain of a concrete, empirical and repeatable method for calculating intrinsic value and margin of safety. After all, isn’t the expression “intrinsic value” devoid of any true meaning if we’re each allowed to have our own personal definition of the term?

In pursuit of this noble goal we are immediately beset by another obstacle. A testable hypothesis implies the use of cold, mechanical and quantitative methods to discern intrinsic value. But all too often, quantitative methods applied in the financial arena lead to sophistry disguised as numerical precision. This false sense of precision did not escape either Graham or Buffett, but it does escape many investors.

“In forty-four years of Wall Street experience and study I have never seen dependable calculations made about common-stock values, or related investment policies that went beyond simple arithmetic or the most elementary algebra. Whenever calculus is brought in, or higher algebra, you could take it as a warning signal that the operator was trying to substitute theory for experience, and usually also to give to speculation the deceptive guise of investment.”
– Benjamin Graham

Perhaps we should review the academic and investment communities’ experience in applying sophisticated quantitative models. With any luck, our review will aid in designing and testing a procedure for quantitatively determining intrinsic value, as well as addressing the issue of precision, or lack thereof.

Currently, there are two main quantitative approaches within the academic community to determine optimal portfolio allocations: (1) the mean-variance (MV) approach, as espoused by Nobel laureate Harry Markowitz; and (2) the arbitrage pricing theory (APT) approach, as developed by Stephen Ross. Unfortunately, embedded within each paradigm are the potential seeds of the false and misleading precision that so irked Graham.

For the purely quantitative portfolio manager, Markowitz’s mean-variance framework, in theory, allows one to determine the optimal portfolio weights without any consideration for company specific information. In many ways, it is the antithesis of value investing. The portfolio manager merely focuses on the sequence of returns provided by Mr. Market and invests accordingly. This framework requires knowledge of both the mean and covariance of asset returns, which in reality are unknown and have to be estimated from the observed data. Nevertheless, standard industry practice is to ignore the estimation error and simply treat the sample estimates as the true parameters, plugging them in to get optimal portfolio weights.
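To make the plug-in practice concrete, here is a minimal sketch in Python of how a purely quantitative manager might turn a matrix of historical returns into “optimal” weights by treating the sample mean and sample covariance as if they were the true parameters. The function name, the risk-aversion value and the simulated data are illustrative assumptions, not a prescription.

```python
import numpy as np

def plugin_mv_weights(returns, risk_aversion=5.0):
    """Plug-in mean-variance weights: sample estimates treated as the truth.

    returns: T x N array of historical asset returns (rows are periods).
    The unconstrained optimum is (1 / risk_aversion) * inverse(Sigma) * mu.
    """
    mu = returns.mean(axis=0)               # sample mean vector
    sigma = np.cov(returns, rowvar=False)   # sample covariance matrix
    raw = np.linalg.solve(sigma, mu) / risk_aversion
    return raw / raw.sum()                  # rescale to a fully invested portfolio

# toy usage on simulated data (numbers are illustrative only)
rng = np.random.default_rng(0)
history = rng.normal(0.005, 0.04, size=(120, 10))  # 120 months, 10 assets
print(plugin_mv_weights(history).round(3))
```

Nothing in this calculation acknowledges that the mean and covariance are themselves noisy estimates, which is precisely the point Graham would press.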

In this case, Graham’s concerns about false precision are justified by the naivety with which the point estimates of the mean and covariance are used, without any consideration for the variability of those estimates. Luckily, this exasperation can be alleviated by an appropriate use of statistics.

If one evaluates the repercussions of such estimation error on the investment decision at hand, an amazing empirical fact emerges. We discover that the precision we initially sought has in fact disappeared. We know with statistical certainty that our desired result is unknowable. So in the hunt for quantitative precision, we are led to acknowledge our very own lack of precision. In fact, there is strong empirical and theoretical evidence to suggest that an equal-weight portfolio will, ex-ante, outperform any so-called Markowitz mean-variance optimal portfolio with high probability.
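This claim can be illustrated, at least in a stylized setting, with a short Monte Carlo sketch. The experiment below assumes a set of “true” return parameters (chosen for illustration, not calibrated to any market), repeatedly hands the plug-in optimizer a finite sample drawn from them, and compares the ex-ante Sharpe ratio of the resulting weights against the humble 1/N portfolio.

```python
import numpy as np

rng = np.random.default_rng(42)
N, T = 20, 60                                   # 20 assets, 5 years of monthly data
true_mu = rng.normal(0.008, 0.002, N)           # assumed "true" expected returns
true_sigma = np.full((N, N), 0.0008) + np.eye(N) * 0.0017  # common + idiosyncratic risk

def ex_ante_sharpe(w, mu, sigma):
    return (w @ mu) / np.sqrt(w @ sigma @ w)

equal_weight = np.ones(N) / N                   # the 1/N portfolio
losses = []
for _ in range(2000):
    sample = rng.multivariate_normal(true_mu, true_sigma, size=T)
    mu_hat = sample.mean(axis=0)
    sigma_hat = np.cov(sample, rowvar=False)
    w = np.linalg.solve(sigma_hat, mu_hat)      # plug-in "optimal" direction
    losses.append(ex_ante_sharpe(w, true_mu, true_sigma)
                  < ex_ante_sharpe(equal_weight, true_mu, true_sigma))

print(f"1/N beats the plug-in optimizer in {np.mean(losses):.0%} of trials")
```

With a short history and a realistic number of assets, the estimation error in the sample moments routinely swamps whatever optimality the formula promised.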

The second approach, the Arbitrage Pricing Theory (APT) model, decomposes a stock’s return into factors common to all assets and factors specific to a given asset. Macroeconomic factors like the inflation rate, unemployment rate and interest rates are examples of factors common to all assets, while attributes such as size, dividend yield, price momentum, book value, free cash flow, and return on equity are examples of firm-specific factors.

Once such a factor model is posited, the principled practitioner would perform a multiple regression to determine the betas for each factor and each stock. With these betas in hand, a cross-sectional regression would determine which factors or exposures were being rewarded and which were not. Those stocks being highly rewarded by such priced risk factors would be purchased, and those not rewarded, avoided.
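As a concrete illustration of this two-pass procedure, here is a minimal Python sketch using simulated data; the function names, factor counts and return magnitudes are assumptions made for the example.

```python
import numpy as np

def estimate_betas(stock_returns, factor_returns):
    """First pass: time-series regression of each stock on the common factors.
    stock_returns: T x N, factor_returns: T x K. Returns an N x K beta matrix."""
    X = np.column_stack([np.ones(len(factor_returns)), factor_returns])
    coefs, *_ = np.linalg.lstsq(X, stock_returns, rcond=None)
    return coefs[1:].T                              # drop the intercepts

def estimate_premia(stock_returns, betas):
    """Second pass: cross-sectional regression of average returns on the betas.
    The fitted coefficients are the estimated rewards per unit of exposure."""
    avg_ret = stock_returns.mean(axis=0)
    X = np.column_stack([np.ones(len(betas)), betas])
    coefs, *_ = np.linalg.lstsq(X, avg_ret, rcond=None)
    return coefs[1:]

# toy usage: 3 common factors, 50 stocks, 120 months of simulated returns
rng = np.random.default_rng(1)
factors = rng.normal(0, 0.02, size=(120, 3))
true_betas = rng.normal(1, 0.5, size=(50, 3))
returns = factors @ true_betas.T + rng.normal(0, 0.03, size=(120, 50))

betas = estimate_betas(returns, factors)
premia = estimate_premia(returns, betas)
scores = betas @ premia                             # reward implied by priced exposures
print(np.argsort(scores)[::-1][:10])                # ten highest-ranked stocks
```

Buying the highly ranked names and avoiding the rest is then a mechanical exercise; whether the chosen factors deserve to be in the model at all is where the trouble starts.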

The APT model shares a kindred spirit with value investing. It provides an investor with the relevant factors and a cold, mechanical and quantitative method with which to order or rank various investment opportunities, but it does not determine an intrinsic value per se.

Unfortunately, once again, Graham’s and Buffett’s apprehension of sophisticated quantitative techniques is well founded. Many practitioners blindly pursue a kitchen-sink approach, chucking all sorts of factors into an APT model. This makes a mockery of statistics as well as a mockery of common sense. For the statistically minded, data mining, variable selection and multicollinearity are obvious problems that will render the regression’s factor loadings quite meaningless, both statistically and economically. Yet a well-specified, parsimonious model with just a few factors could lead to results that are both economically meaningful and statistically significant.

But this brings us back to where we started, at an uncomfortable impasse. Clearly we need a testable theory on which to base our investments, yet a carefully conducted and expansive quantitative approach will leave us with an answer best described by a shrug of the shoulders and an “Ughh, I don’t know.” Alternatively, a few well-chosen, a priori factors will lead to a meaningful result. But this just means that if we start with qualitatively sensible factors like momentum, size, price to book, return on invested capital, free cash flow yield, and earnings yield, then we can quantitatively arrive at a justified and defensible conclusion.

As such, value investing’s lack of a testable hypothesis does not relegate it to the wrath of financial astrology, because of its a priori foundation on sound business practices and sensible qualitative factors. Some may find this impasse quite troubling, but in the end, the most financially sound course of action is to be intellectually honest about what we know, what we don’t know, and what is unknowable.

Equity factor model – the poor man’s version.

My encounter with value investing began somewhat backwards during graduate school. I enrolled in an Empirical Finance course to satiate my curiosity about financial markets and break the monotony of doing math every day from dawn to dusk.

The course was intended for first-year Finance PhD students and covered time-series and cross-sectional properties of asset returns, event studies, and empirical tests of asset pricing models. But for someone like me, who had spent the previous ten years trading foreign exchange and commodity markets, the most interesting part of the course was when we explored the interplay between asset pricing theories, statistical assumptions and relevant econometric techniques in the context of classic empirical papers.

Quite quickly, I realized that for every theory posited, an anomaly would be discovered that highlighted the shortcomings of the various academic models. Given that it would be nearly heretical for an aspiring academic to say the markets were inefficient, the academic community would quickly go on either to explain why such an anomaly really wasn’t an anomaly or to develop a better model of asset returns.

But for me, a practitioner who believed that there were opportunities to make reasonable returns from investing in the financial markets, it was music to my ears. The most useful research was always meticulous and grounded in a company’s fundamentals. Things like book value, cash flow, accruals, etc. Collectively, the body of research provided a useful guide as to what may or may not work in the real world and best of all, it was free, rigorous and testable.

Upon graduation, I found myself with a bit of time on my hands before I launched a commodity fund, so I decided to read all of Warren Buffett’s shareholder letters and The Intelligent Investor by Benjamin Graham.

With this, my two worlds collided. I immediately saw the value of blending a quantitative and qualitative approach to investing. And since that time, it’s been a hobby of mine to continually read the academic research looking for ways to build simple, parsimonious and practical quantitative models to find value stocks.

The factors that I’ve landed upon to build an implementable factor model are as follows:

  1. Accruals
  2. Beta
  3. EPS Estimate revisions
  4. EV / EBITDA
  5. EV / MCAP
  6. Financial leverage
  7. Goodwill/Intangible to Equity
  8. Gross margins
  9. Gross profit to total assets
  10. Growth rate of shares
  11. Growth rate of total assets
  12. Inside ownership
  13. Pretax earning yield
  14. Price to free cash flow
  15. Return on assets
  16. Scaled net operating assets
  17. Standard deviation of returns
  18. 5 year returns (mean reversion)
  19. 6 month return (momentum)

For some factors, the higher the better, and for other factors, the lower the better. For each factor, there is academic research showing its efficacy in investing, e.g. Sloan (1996) for accruals, Jegadeesh and Titman (1993) for momentum, De Bondt and Thaler (1989) for mean reversion, etc. Over the years, I have always filtered the research, since most professors publish in pursuit of tenure and not in pursuit of implementing profitable trading strategies. A good compendium of many known anomalies is Jacobs, Heiko, 2015, “What Explains the Dynamics of 100 Anomalies?”, Journal of Banking and Finance, 57, 65-85.
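Because the factors point in different directions, the first mechanical step is to put them on a common footing. Below is a minimal Python sketch of one way to do that: percentile-rank each factor within the universe and flip the rank where a lower raw value is preferable. The factor names, the sign convention and the pandas layout are illustrative assumptions, not a full specification.

```python
import pandas as pd

# Illustrative sign convention: +1 means higher raw values are better, -1 means lower.
FACTOR_SIGN = {
    "accruals": -1,                 # lower accruals preferred (Sloan 1996)
    "six_month_return": +1,         # momentum
    "ev_to_ebitda": -1,             # cheaper is better
    "gross_profit_to_assets": +1,
}

def factor_ranks(raw: pd.DataFrame) -> pd.DataFrame:
    """Turn raw factor values (rows = stocks, columns = factors) into 0-1
    percentile ranks, flipped so that 1.0 always means 'most attractive'."""
    ranks = raw.rank(pct=True)          # percentile rank within the universe
    for name, sign in FACTOR_SIGN.items():
        if sign < 0 and name in ranks.columns:
            ranks[name] = 1.0 - ranks[name]
    return ranks
```

Ranking within the universe also tames extreme values, which sidesteps part of the outlier problem discussed next.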

But an important question is how to combine all of these factors into a composite score and appropriately deal with outliers. There are countless statistical ways to do this, and a number of theories abound. I’ve thought about this long and hard and explored many of the sophisticated approaches. In the end, I think this is where the qualitative judgement of a practitioner comes in. How much should an overall score depend on the balance sheet? The cash flow statement? The stock price returns? These are questions best answered by the investor who will actually deploy capital and will not be second-guessing himself during a market pullback.

As for me, I chose to do this in a few ways (a rough sketch of two of these follows the list):

  1. Just pick the weightings for each factor. All equal weight? Some more important, etc.?
  2. Create a risk-reward composite score. Average factor score to average factor variability ratio?
  3. Median factor score?
  4. Trimmed mean value of factors?
  5. Perform a regression of past returns versus past factor scores to arrive at regression coefficients for the factor scores. Once you have the regression coefficients you plug in the new factor scores to give you the composite score to rank the investment universe on.
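Continuing the hypothetical pandas setup from the ranking snippet above, here is a rough sketch of methods 1 and 5: hand-picked (or equal) weights over the aligned factor ranks, and weights taken from a regression of past returns on past factor scores. Column names and data shapes are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def composite_score(ranks: pd.DataFrame, weights=None) -> pd.Series:
    """Method 1: hand-picked (or equal) weights over the aligned factor ranks."""
    w = pd.Series(weights) if weights else pd.Series(1.0, index=ranks.columns)
    w = w / w.sum()
    return ranks.mul(w, axis=1).sum(axis=1)          # one composite score per stock

def regression_weights(past_ranks: pd.DataFrame, past_returns: pd.Series) -> pd.Series:
    """Method 5: regress past returns on past factor ranks; the fitted
    coefficients become the weights applied to the newest factor ranks."""
    X = np.column_stack([np.ones(len(past_ranks)), past_ranks.values])
    coefs, *_ = np.linalg.lstsq(X, past_returns.values, rcond=None)
    return pd.Series(coefs[1:], index=past_ranks.columns)
```

Ranking the universe then just means sorting the composite score in descending order and combing through the top names by hand, which is where the qualitative judgement comes back in.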

Once the Q4 2015 financial data is available, I’ll run my model and post a list of the top companies using one of the composite scoring methods above. I normally comb through the top 200 names to see which ones warrant an investment.

Interestingly enough, what started as a curiosity about value investing over a decade ago soon became an investing hobby. What started as an investing hobby led to a fundamental shift in how I evaluate businesses, whether for investment or direct management during my time at a major bank. Warren Buffett’s quote “I am a better investor because I am a businessman, and a better businessman because I am an investor” surely resonates with me.