Factors are Not Commodities

The narrative of Smart Beta products is that factors are becoming an investment commodity. Factors are not commodities, but unique expressions of investment themes. One Value strategy can be very different from another, and can lead to very different results. There are many places that factor portfolios can differ. The difficulty for asset allocators is in identifying how factor strategies differ from one another, when they all purport to use the same themes: Value, Momentum and Quality.

Over the last couple of years, several multi-factor funds that combine Value, Momentum and Quality have launched. As these products compete to gather assets, price competition has broken out among rivals. In December, BlackRock cut fees on its smart beta ETFs in competition with Goldman Sachs, which has staked out a cost leadership position in the space. Michael Porter, the expert in competitive strategy, wrote in 1980 that there are three generic strategies a business can use to build a competitive advantage: cost leadership, differentiation or focus. Cost leadership can be an effective strategy, but the key to any price war is that the products need to be near-perfect substitutes for one another, as commodities are. This paper focuses on how quantitative asset managers can differ widely in factor definitions, in how they combine factors into themes, and in portfolio construction techniques, leading to a wide range of investment experiences in multi-factor investment products.

Factor Definitions

Value investing through ratios seems to be very straightforward. Price/Earnings ratios are quoted widely as a common metric to gauge the relative cheapness of one stock to another. “Information Content of Equity Analyst Reports” by Asquith, Mikhail and Au found that 99% of equity analyst reports use earnings multiples in analyzing a company. The P/E ratio is used widely because it is straightforward and makes intuitive sense: as an equity owner you are entitled to the residual earnings of the company after expenses, interest and taxes. A ratio of price to earnings tells you how much you’re paying for every dollar of earnings.

Getting a P/E ratio is as simple an exercise as opening up a web browser and typing in a search. But if you’ve ever compared P/E ratios from multiple sources, you know you can get very different numbers for the same company. Take Allergan (NYSE: AGN) as an example. As of January 12th, 2017, Yahoo! Finance had AGN with a P/E of 6.06. But Google Finance had 15.84. If you have access to a Bloomberg terminal, Bloomberg had it at a P/E of 734. Factset had no P/E ratio at all. You can feel like you’re stuck in Segal’s Law, as quoted by Arthur Bloch: “a man with a watch knows what time it is. A man with two watches is never sure.”

These discrepancies happen because there are a lot of different ways to put together a P/E ratio. One could divide the price of the stock by Earnings per Share. If so, should you use basic or diluted EPS? There’s a difference if you switch to the total Market Cap of the company divided by LTM Net Income, as shares can change over a given quarter. But the reason for Allergan’s different ratios is that some financial information providers use bottom-line earnings while others take Income before Extraordinaries and Discontinued Operations. On August 2nd, Teva (NYSE: TEVA) acquired Allergan’s generics business “Actavis Generics” for $33.4 billion in cash and 100 million shares of Teva, generating $16bn in earnings from Discontinued Operations. After unwinding this, the company actually lost $1.7bn in the third quarter. Hence no P/E ratio. Depending on whether an adjustment is made for this, Allergan will appear either as a top percentile cheap stock on Earnings Yield (the inverse of the P/E ratio) or in the 94th percentile.

Accounting adjustments for Extraordinaries and Discontinued Operations aren’t the only items affecting earnings ratios. When considering earnings, you want to measure the available economic surplus that flows to the holder of the common equity. If preferred stock exists, its claims supersede those of common shareholders. Fannie Mae (OTC: FNMA) is a great example of how preferred dividends can absorb earnings from common shareholders. During the 2008 crisis, Fannie Mae issued a senior tranche of preferred stock that is owned by the U.S. Treasury, which receives a $9.7bn dividend out of the $9.8bn in earnings the company generates. There is also a junior preferred tranche held by investors like Pershing Square and the Fairholme funds, which is currently not receiving dividends; those holders are mounting legal challenges to receive some portion of the earnings. This leaves Common shareholders behind a long line of investors with a prioritized claim on earnings. But some methodologies deduct preferred dividends from earnings while others do not, creating the difference between a P/E of 2.3 (an Earnings Yield of 43%) and a P/E of 185 (an Earnings Yield of 0.5%).
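To make the effect of these adjustments concrete, here is a minimal sketch of an Earnings Yield calculation with and without them. The function name, its parameters and the company figures are all hypothetical, chosen only for illustration; they are not the actual Allergan or Fannie Mae numbers, and a production definition would need many more accounting rules.

```python
def earnings_yield(net_income, market_cap, discontinued=0.0,
                   extraordinary=0.0, preferred_dividends=0.0):
    """Earnings available to common shareholders, divided by market cap.

    Strips out discontinued operations and extraordinary items, then
    deducts preferred dividends, which rank ahead of common equity.
    """
    adjusted = net_income - discontinued - extraordinary - preferred_dividends
    return adjusted / market_cap


# Hypothetical company, all figures in $bn (illustrative, not real data).
net_income = 10.0
market_cap = 25.0
preferred = 9.5

unadjusted = earnings_yield(net_income, market_cap)
adjusted = earnings_yield(net_income, market_cap, preferred_dividends=preferred)

print(f"Unadjusted: yield {unadjusted:.1%}, P/E {1 / unadjusted:.1f}")
print(f"Adjusted:   yield {adjusted:.1%}, P/E {1 / adjusted:.1f}")
```

The same stock moves from looking very cheap to looking expensive purely because of which line of the income statement is called "earnings".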

These comments are not about the cheapness of Allergan or Fannie Mae, but rather about the importance of your definition of “earnings” and the adjustments you apply. If these considerations sound like fundamental investing, it’s because they are. Fundamental analysts consider these adjustments in the analysis of a company. Factor investors work through the same accounting issues as fundamental investors, with the additional burden of adjusting systematically, so that one metric accounts for the accounting differences across thousands of companies. Investing results can be very different based on these adjustments. In the U.S. Large Stocks Universe, there is a +38bps improvement in the best decile of Earnings Yield if you adjust for Discontinued Items, Extraordinaries and Preferred Dividends. To set some scale, in the eVestment Universe the difference between a median and a top quartile manager is just +60bps a year.


Compustat Large Stocks Universe, 1963-2016

Adjustments to Value signals are not limited to Price-to-Earnings. Book Value can be adjusted for the accounting of Goodwill and Intangibles. Dividend Yield can be calculated using the dividends paid over the trailing 12-months, or annualizing the most recent payment. In 2004, Microsoft paid out $32 billion of its $50 billion in cash in a one-time $3 per share dividend when the stock was trading at around $29. Should you include that dividend in calculating yield, knowing that future investors won’t receive similar dividend streams?
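As a rough sketch of how far apart those two dividend yield definitions can land, the snippet below uses the $3 special dividend and the roughly $29 share price mentioned above. The regular quarterly dividend of $0.08 is an assumption added purely for illustration, and the function names are hypothetical.

```python
def trailing_yield(dividends_paid_12m, price):
    """Sum of all dividends paid over the trailing 12 months, over price."""
    return sum(dividends_paid_12m) / price


def annualized_recent_yield(latest_regular_dividend, payments_per_year, price):
    """Most recent regular payment, annualized, over price."""
    return latest_regular_dividend * payments_per_year / price


price = 29.0              # approximate share price from the text
special_dividend = 3.0    # one-time dividend from the text
regular_quarterly = 0.08  # ASSUMPTION: illustrative regular dividend

ttm = trailing_yield([regular_quarterly] * 4 + [special_dividend], price)
forward = annualized_recent_yield(regular_quarterly, 4, price)

print(f"Trailing 12-month yield incl. special: {ttm:.2%}")
print(f"Annualized most recent payment:        {forward:.2%}")
```

Under the trailing definition the stock screens as a very high yielder; under the annualized definition it does not, even though nothing about the company has changed.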

Differences in signal construction are not limited to Value factors. Momentum investors know that there are actually three phenomena observed in past price movement: short-term reversals in the first month, medium-term momentum over the next 12 months and long-term reversals over a 3 to 5-year period. Get two momentum investors into a room, and they will disagree over whether to delay the momentum signal by one month to avoid reversals, the so-called 12-month minus 1-month signal. Quality investors argue over the use of non-current balance sheet items, or over the loss of effectiveness in investing on changes in analyst estimates. Volatility can be measured using raw volatility, beta, or idiosyncratic vol, to name just a few methods.
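The momentum disagreement is easy to illustrate. The sketch below computes a plain 12-month return and a 12-month minus 1-month return from the same month-end prices. The function names and the price series are hypothetical, and real implementations differ on details such as total return versus price return.

```python
def total_return(prices, start, end):
    """Simple return between two positions in a month-end price list."""
    return prices[end] / prices[start] - 1.0


def momentum_12m(prices):
    """Return over the full trailing 12 months (needs 13 month-end prices)."""
    return total_return(prices, -13, -1)


def momentum_12m_minus_1m(prices):
    """Return from 12 months ago to 1 month ago, skipping the latest month."""
    return total_return(prices, -13, -2)


# Hypothetical stock: a steady climb followed by a sharp drop last month.
prices = [100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 124]

print(f"12-month momentum:        {momentum_12m(prices):.1%}")
print(f"12-minus-1 momentum:      {momentum_12m_minus_1m(prices):.1%}")
```

Two managers who both say they buy "momentum" would rank this stock very differently depending on which version they use.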

Factors are constructed as unique expressions of an investment idea and are not the same for everyone. Small differences can have a large impact on which stocks get into the portfolios. These effects become more significant when using an optimizer, which can amplify errors in the inputs, or when concentrating portfolios, which puts more weight on individual names. This is far from simply grabbing a P/E ratio from a Bloomberg data feed. There is skill in constructing factors.

Alpha Signals

Quantitative managers tend to combine individual factors together into themes like Value, Momentum and Quality. But there are several ways that managers can combine factors into models for stock selection. And models can get very complicated. In the process of manager selection, allocators have the difficult task of gauging the effectiveness of these models. The common mistake is assuming complexity equals effectiveness.

To demonstrate how complexity can degrade performance, we can take five factors in the Large Stocks space and aggregate them into a Value theme: Sales/Price, Earnings/Price, EBITDA/EV, Free Cash Flow/Enterprise Value and Shareholder Yield (a combination of dividend and buyback yield).

The most straightforward is an equally-weighted model: give every factor the same weight. This combination of the five factors generates an annual excess return of 4.06% in the top decile. An ordinary linear regression increases the weighting of Free Cash Flow to Enterprise Value and lowers the weighting on Earnings/Price, because it was less effective over that time frame. This increases the apparent effectiveness by +15bps annualized, not a lot, but remember this is Large Cap where edge is harder to generate. Other linear regressions, like ridge or lasso, might be used for parameter shrinkage or variable selection and try to enhance these results.
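A minimal sketch of the equally-weighted approach: rank each stock on each factor, then average the ranks. The helper names and the four-stock data set are hypothetical, ties are not handled, and passing a `weights` dictionary is one way a regression-style weighting could be plugged in; none of this is presented as the exact model behind the figures above.

```python
def percentile_ranks(values):
    """Map raw factor values to ranks in (0, 1]; a higher value ranks higher."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, i in enumerate(order, start=1):
        ranks[i] = position / len(values)
    return ranks


def composite_score(factor_table, weights=None):
    """Weighted average of per-factor ranks; equal weights by default."""
    names = list(factor_table)
    if weights is None:
        weights = {name: 1.0 / len(names) for name in names}
    ranked = {name: percentile_ranks(factor_table[name]) for name in names}
    n_stocks = len(factor_table[names[0]])
    return [sum(weights[name] * ranked[name][s] for name in names)
            for s in range(n_stocks)]


# Hypothetical values for four stocks (higher = cheaper on that factor).
factors = {
    "sales_to_price":    [0.90, 0.30, 0.50, 0.10],
    "earnings_to_price": [0.10, 0.06, 0.04, 0.02],
    "ebitda_to_ev":      [0.15, 0.08, 0.11, 0.05],
    "fcf_to_ev":         [0.09, 0.07, 0.03, 0.01],
    "shareholder_yield": [0.06, 0.02, 0.04, 0.00],
}

scores = composite_score(factors)
print([round(s, 2) for s in scores])
```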

Moving up the complexity scale, non-linear or machine learning models like Neural Networks, Support Vector Machines or Decision Trees can be used to build the investment signal. There has been a lot of news around Big Data and the increased usage of machine learning algorithms to help predict outcomes. For this example, we’ve built an approach using a Support Vector Regression, a common non-linear machine-learning technique. At first look, the Support Vector Regression looks very effective, increasing the outperformance of selecting stocks on Value to 4.55%, almost a half of a percent annualized return over the equally weighted model.


Compustat Large Stocks Universe, 1963-2016

The appeal of the machine-learning approach is strong. Intuitively, the complex process should do better than the simple, and the first pass results look promising. But the excess returns do not hold up on examination.  This apparent edge is from overfitting a model. Quantitative managers might have different ways of constructing factors, but we are all working with data that does not change as we research ideas: quarterly financial and pricing data back to 1963. As we build models, we can torture that data to create the illusion of increased effectiveness. The linear regression and support vector machines are creating weightings out of the same data used to generate the results, which will always look better.

The statistical method to help guard against overfitting is bootstrapping. The process creates in-sample and out-of-sample tests by taking random subsamples of the dates, as well as subsets of the companies included in the analysis. Regression weightings are generated on an in-sample dataset and tested on an out-of-sample dataset. The process is repeated a hundred times to see how well the weighting process holds up.
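The mechanics can be sketched in a few lines. This is a simplified stand-in, not the study described here: it resamples dates only (not companies), it uses synthetic random factor returns in which no factor has a true advantage, and the "fitted" model is a toy that puts all weight on whichever factor looked best in-sample rather than a regression or SVR. All names are hypothetical.

```python
import random
from statistics import mean


def split_dates(n_dates, rng, in_sample_frac=0.5):
    """Randomly assign date indices to in-sample and out-of-sample sets."""
    idx = list(range(n_dates))
    rng.shuffle(idx)
    cut = int(n_dates * in_sample_frac)
    return idx[:cut], idx[cut:]


def fit_weights(returns, dates):
    """Toy fitted model: all weight on the best in-sample factor."""
    n_factors = len(returns[0])
    means = [mean(returns[t][f] for t in dates) for f in range(n_factors)]
    best = means.index(max(means))
    return [1.0 if f == best else 0.0 for f in range(n_factors)]


def avg_return(returns, dates, weights):
    """Average weighted factor return over the given dates."""
    return mean(sum(w * r for w, r in zip(weights, returns[t])) for t in dates)


def resample_test(returns, n_trials=100, seed=0):
    """Repeat the random split, fitting in-sample and scoring both samples."""
    rng = random.Random(seed)
    n_factors = len(returns[0])
    equal = [1.0 / n_factors] * n_factors
    rows = []
    for _ in range(n_trials):
        ins, oos = split_dates(len(returns), rng)
        fitted = fit_weights(returns, ins)
        rows.append({
            "equal_in": avg_return(returns, ins, equal),
            "fitted_in": avg_return(returns, ins, fitted),
            "equal_out": avg_return(returns, oos, equal),
            "fitted_out": avg_return(returns, oos, fitted),
        })
    return rows


# Synthetic monthly excess returns (%) for five factors with identical
# true means, so any in-sample "edge" from fitting is noise by construction.
data_rng = random.Random(42)
synthetic = [[data_rng.gauss(0.3, 2.0) for _ in range(5)] for _ in range(240)]

results = resample_test(synthetic)
summary = {key: mean(row[key] for row in results) for key in results[0]}
for key, value in summary.items():
    print(f"{key:>10}: {value:.3f}")
```

By construction the fitted model can never look worse than equal weighting in-sample, which is exactly why in-sample results say so little; the out-of-sample columns are the ones that matter.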

In the bootstrapped results, you can see how the unfitted equally weighted model maintains its effectiveness at about the same level. The in-sample data looks just like the first analysis: the linear regression does slightly better and the SVR does significantly better. When applying the highly-fitted Support Vector Regression to the out-of-sample data, the effectiveness inverts. Performance degrades at a statistically significant level once you implement on investments that weren’t part of your training data.


Compustat Large Stocks Universe, 1963-2016

This doesn’t mean that all weighted or machine learning models are broken, rather that complex model construction comes with the risk of overfitting to the data and can dilute the edge of factors. Overfitting is not intentional, but a by-product of having dedicated research resources that are constantly looking for ways to improve upon their process. When evaluating the factor landscape, understand the model used to construct the seemingly similar themes of Value, Momentum or Quality. Complexity in itself is not an edge for performance, and makes the process less transparent to investors creating a “black box” from the density of mathematics. Simple models are more intuitive and likely to hold up in the true out-of-sample dataset, the future.

Multifactor Signals

Multifactor ETFs have a lot of moving parts: the definition of factors, the construction process of building investment themes, as well as the portfolio construction techniques. Market-capitalization ETFs are very straightforward in comparison. Different products use broad, similar universes and weight on a single factor. And market capitalization has one of the most common definitions used for investing: shares outstanding multiplied by the price per share. The result is that different products by different managers have extremely similar results, and these products can be substitutes for one another.

The following two tables show the 2016 returns for three of the most popular market cap ETFs: the SPDR® S&P 500 ETF (SPY), the iShares Russell 1000 ETF (IWB) and the Vanguard S&P 500 ETF (VOO). These are widely held, and as of December 30th, 2016 together have almost $300 billion in assets. For 2016, the returns of these three ETFs are within 17bps of each other. When looking at the annualized daily tracking error for the year, we can see that they track one another very closely. Looking at these returns, it makes sense that the key selection criteria between the funds would be based on the lowest fee.

For comparison, we can examine four multifactor ETFs that were launched in 2015: the iShares Edge MSCI Multifactor USA ETF (LRGF), the SPDR® MSCI USA StrategicFactors™ ETF (QUS), the Goldman Sachs ActiveBeta U.S. Large Cap Equity ETF (GSLC) and the JP Morgan Diversified Return U.S. Equity ETF (JPUS). Each fund uses a broad large cap universe, and then selects or weights stocks based on a combination of Factor themes: Value, Momentum and Quality metrics. At first glance, it looks like these should be very similar to one another.

Each fund is based on an index with a publicly stated construction methodology. When digging through the construction methodologies, you start seeing that different factors are used in building these themes. The only common Value metric used across all four is Book-to-Price. Two funds do use Sales to Price, but otherwise each fund is using one or two metrics that none of its competitors use. QUS does not include momentum, but the other three funds use different expressions of momentum, with two conditioning on volatility. The most common Quality metric is Return on Equity, used in three funds, followed by Debt-to-Equity, used in two. Even though most of these funds use the equally-weighted approach in building their investment themes of Value, Momentum and Quality, the different inputs mean the stock selection will be very different.

These different rankings are then used for stock selection and weighting in different portfolio construction techniques. When comparing holdings as of December 30th, 2016, the number of securities held ranges anywhere from 139 to 614 stocks per fund. Maximum weights range from 3.3% to 0.6%, with the top 25 securities accounting for anywhere from 43% to 14% of total assets. The funds each use different techniques and risk models with unique constraints to shape weightings, leading to widely different portfolios. Looking at these four funds, as well as the SPY S&P 500 fund, they can have higher active share with each other than they do with the overall market.
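Active share between two portfolios is commonly computed as half the sum of absolute weight differences across all holdings. A minimal sketch, using made-up tickers and weights rather than the actual fund holdings:

```python
def active_share(weights_a, weights_b):
    """Half the sum of absolute weight differences across all holdings.

    0.0 means identical portfolios; 1.0 means no overlap at all.
    """
    tickers = set(weights_a) | set(weights_b)
    return 0.5 * sum(abs(weights_a.get(t, 0.0) - weights_b.get(t, 0.0))
                     for t in tickers)


# Hypothetical three-stock portfolios (weights sum to 1.0).
fund_a = {"AAA": 0.5, "BBB": 0.3, "CCC": 0.2}
fund_b = {"AAA": 0.2, "CCC": 0.3, "DDD": 0.5}

print(f"Active share between the two funds: {active_share(fund_a, fund_b):.0%}")
```

Running the same calculation for every pair of funds, and for each fund against the market, gives the kind of comparison described above.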

These differences in signal, construction and holdings lead to very different investment results. When comparing the results for 2016, the best fund had a return of 12.96% while the worst returned 8.73%, a return gap of 423bps for the year. Also, when looking at the daily tracking error between the products, they generate a wider difference in returns with each other than they do with the market.

Keep in perspective that this is a single year. Low performance in 2016 is not an indictment of GSLC; it’s most likely that GSLC was caught in the underperformance of volatility given that it focuses on low volatility names in both its Volatility and Momentum ActiveBeta® Indexes. To confirm that, you would want to run the holdings through a factor attribution framework.

The central point is that even though these four funds look very similar, they generate very different results. Factor products that generate several hundred basis points of difference in a single year are not a commoditized product, and should not be chosen because of a few basis points in fees. Cost leadership is the key feature for generic market-capitalization weighted schemes, but product differentiation and focus, weighed in the context of fees, should be the reasons for choosing multifactor products.

Summary

There is significant edge in how factor signals are constructed. The difficulty is creating transparency around this edge for investors. Complexity in stock selection and construction methodology decreases transparency, almost as much as active quantitative managers do when they create a “black box” around their stock ranking methodologies. This leaves investors at a disadvantage in trying to differentiate between quantitative products. This inability to differentiate is why price wars are starting between products that have strong differences in features and results.

Investors need education on this differentiation so they’re not selecting only on the lowest fees. Large institutional and investment consultant manager selection groups will face the difficulty of adding top-tier quantitative investment staff to help with this differentiation. Smaller groups and individual investors will have to advance their own understanding of how quantitative products are constructed. For the entire range of factor investors, it will help to build trusted partnerships with quantitative asset managers willing to share insights and help them understand the factor investing landscape.

 

Thanks to Patrick O’Shaughnessy and Ehren Stanhope for feedback and edits

The Risk of Low Volatility Strategies

Most factor-based ETF strategies, otherwise known as Smart Beta, are built on a single concept like value or momentum. Over the last two years, the largest flows have been to ETFs investing in low volatility stocks. The most popular is the iShares Edge MSCI Min Vol USA ETF (USMV), which as of September 30th had grown to $14.4bn USD, more than doubling over the last 12 months.

With product proliferation, there are now a number of low volatility ETFs, each with a different portfolio construction methodology as well as its own view on the best way to select stocks with low volatility. The most straightforward method is to look at the raw volatility of the trailing returns, either on a daily or monthly basis, for anywhere from three months to one year. Some other strategies use the Beta of the stock: the covariance of the stock with market returns, scaled by the variance of the market. In 2014, Frazzini and Pedersen published a “Betting Against Beta” (BaB) strategy that goes long low-beta and short high-beta stocks, leveraging and deleveraging each side to a beta of 1. Another option is tracking error, the volatility of the stock’s excess returns versus the market. Ang, Hodrick, Xing and Zhang’s 2006 paper “The Cross-Section of Volatility and Expected Returns” introduced the idea of idiosyncratic volatility: the residual volatility of the stock after a regression on the Fama-French factors. A last way to measure volatility is through implied volatility, which uses options pricing to derive the expected future volatility.
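Three of these measures can be sketched directly from return series. The snippet below assumes monthly returns and annualizes by the square root of 12; the function names and the return data are hypothetical. Idiosyncratic and implied volatility are left out because they need factor returns and options data respectively.

```python
from math import sqrt
from statistics import mean, stdev, variance


def covariance(xs, ys):
    """Sample covariance of two equal-length return series."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def raw_volatility(returns, periods_per_year=12):
    """Annualized standard deviation of returns."""
    return stdev(returns) * sqrt(periods_per_year)


def beta(stock, market):
    """Covariance with the market divided by the market's variance."""
    return covariance(stock, market) / variance(market)


def tracking_error(stock, market, periods_per_year=12):
    """Annualized volatility of the stock's excess return over the market."""
    excess = [s - m for s, m in zip(stock, market)]
    return raw_volatility(excess, periods_per_year)


# Hypothetical monthly returns.
market = [0.02, -0.01, 0.03, 0.01, -0.02, 0.04]
defensive = [0.5 * r + 0.002 for r in market]  # moves half as much

print(f"Raw vol:        {raw_volatility(defensive):.1%}")
print(f"Beta:           {beta(defensive, market):.2f}")
print(f"Tracking error: {tracking_error(defensive, market):.1%}")
```

Note that the defensive stock has lower raw volatility than the market but a non-zero tracking error, which is the tension the rest of this piece explores.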

We could spend a lot of time arguing the merits of each metric, but in practice the results from investing in volatility factors look very similar. They exhibit the same return profile: a portfolio of stocks with high volatility in the past gives high volatility in the future, along with strong underperformance. Stocks with low volatility continue to have low volatility. Coupled with modest outperformance, their Sharpe Ratios look strong. For comparison, the following tables show the excess return, the excess volatility, and the Sharpe Ratios of portfolios based on Value and Volatility characteristics within the Large Stocks universe, stocks with a market cap greater than average. Stocks are selected monthly with a holding period of one year as part of the decile portfolio. While Value has the stronger overall return, the Volatility portfolios are less volatile, giving similar Sharpe Ratios between Value and Volatility. Volatility factors also correlate very highly with one another, much more highly than Value characteristics do. This would indicate that Value factors capture different expressions of valuation, whereas all the volatility metrics seem to be based on the same market phenomenon.

Table 1: Total Return, Volatility and Sharpe Ratios of Value and Volatility Metrics, Compustat Large Stocks Universe 1969-2015. Implied Volatility from OptionsMetrics database, 1996-2015.

Table 2: Correlation of Decile Spreads of Value and Volatility Metrics, Compustat Large Stocks Universe 1969-2015. Implied Volatility from OptionsMetrics database, 1996-2015.

A recent McKinsey report cites research that high-net-worth investors’ top priorities are protecting principal, hedging against downside risks, minimizing volatility and generating income. After the market crash of 2008-2009, it’s easy to see how advisors and plan sponsors could be drawn to “Defensive Equity” or “Low Risk” strategies as ways to protect against future drawdowns. From the point of view of an advisor, low volatility ETFs cover three of these, offering downside protection with equity-like returns.

The risk of low volatility strategies lies in how they are used within the total allocation of a portfolio. In the asset management industry, the value chain for clients is to hire advisors to establish an asset allocation for them. Once an allocation is decided upon, a manager selection process determines the best people to manage assets within those groups. Passive investing has supplanted active mandates within these allocation buckets. For example, if a consultant believes active managers have no edge in large cap stocks, they can just buy the Vanguard S&P 500 ETF for five basis points. Small cap stocks are less efficient and active managers have edge there, so they use individual mandates in that space.

It is unclear how Smart Beta strategies fit into this.  If it is viewed as a separate asset class, it is invested in based on the total expected return, volatility and diversification it adds to the total portfolio.  If it is viewed as part of an equity allocation, it is judged on the excess return versus a passive benchmark, scaled by the excess volatility.  In the case of a passive benchmark and an active manager, these roles are clear.  Smart Beta makes this more confusing.  Is it a passive allocation to an asset class, or is it a cheap source of alpha?

Volatility factors might deliver solid risk-adjusted returns for an allocator, but they are lacking in the realm of active management.  Low volatility has had modestly higher performance with a lower raw volatility, but it also came with higher excess volatility.  Using the same basic portfolios formed on the deciles of each factor in Large Stocks as above, the tracking error of volatility factors shows higher excess volatility than Value factors.  The tracking error of the top decile of raw volatility is 9.7% versus the equally weighted universe, versus tracking errors in the 6.5%-7.8% range for value factors.  With lower excess returns, Volatility factors have an information ratio about half to one quarter of that of Value and Yield.

Table 3: Tracking Error and Information Ratios of Value and Volatility Metrics, Compustat Large Stocks Universe 1969-2015. Implied Volatility from OptionsMetrics database, 1996-2015.

The higher tracking error can be managed down in portfolio construction: equally weighted versus market-cap weighted, sector agnostic versus sector relative. But this still leaves a large amount of excess volatility in the portfolio. The MSCI Min Volatility USA Index, which the iShares Edge ETF is based on, is a good example. The index construction methodology includes several risk factor and sector constraints, but it still leaves the MSCI Min Volatility USA Index with a tracking error of 5.73% to the broader MSCI USA benchmark since its inception in 1988. Looking at the history of this index through the lens of active management, this gives it an Information Ratio of only 0.25.
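The Information Ratio is the annualized excess return divided by the tracking error. As a quick worked check, the two figures quoted above can be used to back out the excess return they imply; the result is approximate because both inputs are rounded, and the variable names are only illustrative.

```python
def information_ratio(annual_excess_return, tracking_error):
    """Annualized excess return per unit of tracking error."""
    return annual_excess_return / tracking_error


tracking_error_pct = 5.73  # from the text
reported_ir = 0.25         # from the text

# Rearranging IR = excess / TE gives the implied annual excess return.
implied_excess_pct = reported_ir * tracking_error_pct

print(f"Implied annualized excess return: about {implied_excess_pct:.1f}%")
```

Put differently, an investor judging the index as an active strategy is accepting nearly six points of tracking error for an excess return in the low single digits.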

Table 4: Summary Statistics of MSCI Min Vol USA Index versus MSCI USA Index, Jul-1989 to Sep-2016, Source: Bloomberg

Tracking Error and Information Ratios seem a bit clinical compared to the real-time experience the investor has. A better way to show the effect is the rolling 1-year excess return of the strategy versus the broader market benchmark. The tracking error leads to multiple periods of strong relative underperformance. In over 20% of the rolling one-year observations, the MSCI Min Vol USA Index trails the MSCI USA Index by more than 5%. There are several periods of more than 10% underperformance, and one that reaches 15%.
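A rolling excess return series of this kind can be built from monthly returns as follows. This is a sketch with hypothetical function names and a deliberately artificial return series, not the MSCI index data.

```python
def compound(returns):
    """Compound a list of periodic returns into one cumulative return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def rolling_excess(strategy, benchmark, window=12):
    """Strategy minus benchmark compounded return over each rolling window."""
    out = []
    for end in range(window, len(strategy) + 1):
        s = compound(strategy[end - window:end])
        b = compound(benchmark[end - window:end])
        out.append(s - b)
    return out


def share_below(values, threshold):
    """Fraction of observations strictly below the threshold."""
    return sum(v < threshold for v in values) / len(values)


# Artificial monthly returns: the strategy lags for a year, then leads.
benchmark = [0.01] * 24
strategy = [0.00] * 12 + [0.02] * 12

rolling = rolling_excess(strategy, benchmark)
print(f"Windows: {len(rolling)}")
print(f"Share trailing by more than 5%: {share_below(rolling, -0.05):.0%}")
```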

Chart 1: Rolling 12-Month Excess Return of MSCI Min Vol USA Index over MSCI USA Index, Jul-1989 to Sep-2016, Source: Bloomberg

The framing of how the investment is perceived matters to the advisor and to the discipline they will have in maintaining an investment in it.  Allocations tend to be more strategic, and not subject to the relative performance of one asset class versus another.  Bonds are supposed to behave differently than equities, which is why you own both.  Investments within the allocation tend to be questioned on a more regular basis.

It is impossible to determine how every person is using low volatility ETFs, but asset flows should give some insight. If the flows were not reacting to near-term relative performance, that would suggest the ETFs are being used as a strategic allocation. But the flows to the Volatility ETFs appear to be based on a recent spike in relative performance, and specifically on near-term performance. The orange line shows the trailing 12-month performance of the USMV ETF relative to the MSCI USA Index. Low volatility stocks have been outperforming the average stock since the beginning of 2015, with peak outperformance coming around the second quarter of 2016. This coincided with very strong flows into the product: by the end of the second quarter, almost $8 billion had been invested over the trailing 12 months, alongside a trailing 12-month outperformance of USMV over the MSCI USA benchmark of +13%.

Chart 2: Rolling 12-Month Excess Return of USMV versus MSCI USA Index with Rolling 12-Month Net Flows to USMV ETF, Source: Bloomberg

To his credit, Andrew Ang, who heads the Factor-Investing group at BlackRock, which runs USMV, is trying to educate investors about how best to use low volatility investing. He wrote in a September 2016 Forbes article: “Investors’ aim with low-volatility strategies shouldn’t be to outperform the market, rather to reduce risk and to measure that performance over a full market cycle.” But it seems to fall on deaf ears, as assets chase one-year relative returns.

As investors flock to the low volatility ETF based on near-term outperformance, they also sell after near-term underperformance.  These types of fund flow reactions to recent performance will only increase the difference in the time-weighted and money-weighted returns of the fund.  As of September 30th, the USMV had returned 14.39% since inception in October 2011, only +22 basis points over the MSCI USA Index which only returned 14.17%.  But because of the flows chasing performance, the average money-weighted return of USMV is only 11.89% over that time frame, creating a gap for investors of 250bps because of return-chasing.  This gap looks like it’s only going to increase.  In the third quarter of 2016, MSCI USA was up +4.06% while USMV was down -1.16%, a gap of -5.22%.  In reaction, USMV saw redemptions of -$877 million in the month of October.
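The gap between the two return measures comes purely from the timing of flows. The sketch below computes a time-weighted return (which ignores flows) and a money-weighted return (an internal rate of return, solved here by bisection) for a toy two-period example. The fund returns and cash flows are made up, the function names are hypothetical, and the bisection assumes contributions come first and a single ending value comes last.

```python
def time_weighted_return(period_returns):
    """Geometric average return per period, independent of cash flows."""
    growth = 1.0
    for r in period_returns:
        growth *= 1.0 + r
    return growth ** (1.0 / len(period_returns)) - 1.0


def net_present_value(rate, cash_flows):
    """NPV of cash flows spaced one period apart, starting at time 0."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def money_weighted_return(cash_flows, lo=-0.99, hi=10.0, tol=1e-10):
    """IRR by bisection; assumes NPV(lo) > 0 > NPV(hi)."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if net_present_value(mid, cash_flows) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2.0


# Toy fund: +20% in period 1, then -10% in period 2.
fund_returns = [0.20, -0.10]

# Investor puts in 100 at the start, then chases performance with
# another 1,000 after the strong first period.
ending_value = (100 * 1.20 + 1000) * 0.90
flows = [-100.0, -1000.0, ending_value]  # contributions are negative

twr = time_weighted_return(fund_returns)
mwr = money_weighted_return(flows)
print(f"Time-weighted return per period:  {twr:.2%}")
print(f"Money-weighted return per period: {mwr:.2%}")
```

Because most of the money arrived just before the weak period, the investor's money-weighted return is well below what the fund itself reports.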

Table 5: Annualized Return of USMV versus MSCI USA Index, Time-Weighted and Money-Weighted, Source: Bloomberg

Some of this gap is going to be created by individual investors, operating without an advisor or a structured asset allocation plan.  Some of this will also be factor timers, who are shifting allocations based on when they believe a factor will generate outperformance.  But some of the investors in low volatility ETFs are advisors trying to figure out how to use them in a long-term structured allocation.  In my opinion, a long investment in low volatility portfolios can have a place in a portfolio if you’re thinking about it as an allocation.  If you’re going to judge an investment in low volatility as a cheap active equity investment, there are better factors such as value and momentum that offer the opportunity for greater excess return given the active risk taken.

As a last thought, the shift towards factor-based “Smart Beta” ETFs makes me believe more than ever that advisors need to learn all they can about how factor investing works. Manager selection involves a long process of getting to know the people and their style of investment. Investing in an ETF seems easier, but it also comes with reduced switching costs. And while index construction methodologies are published and presented as transparent, there could be less understanding of the process given the lack of interaction with the investment manager and the complexity of some of the strategies. Lower switching costs and lower understanding of factor investing lead to less investment discipline and a continued degradation of investor returns on a money-weighted basis.

-Thanks to Ehren Stanhope and Patrick O’Shaughnessy for the feedback.  Appreciate the help guys.