July 07, 2020

On the Rise of Alt-Data

By Grant Wilson, Alex Etra, Web Begole and Jens Nordvig

The alternative data (“Alt-Data”) revolution was well underway before the COVID-19 shock. However, the pandemic has dramatically accelerated the trend. We think financial market participants face a choice: embrace the revolution, or wither.

We made this point in our inaugural letter back in early 2016 (see link), and we are making it again today, but with even more conviction.

The principal role of market participants is to make investment decisions. If those decisions are being made with an inferior information set, an asymmetry will develop over time, and ultimately degrade investment performance and the management of risk.

There will be exceptions to the rule, but not many. Even long-term asset allocation frameworks were tested through the COVID shock, in terms of maximum drawdown metrics. The same can be said for risk parity, where measures of equity beta fell sharply through March, and are yet to recover substantially.

Alt-Data proved itself to be vastly superior both on the way down and on the way back up. During the most acute phase of market volatility, official economic data were practically redundant as they lagged by 1-2 months. These data did not matter in March or April, as investors focused on global economic shutdowns, border closings, and rising COVID-19 cases and fatalities. Today, these official data continue to have limited value. For example, they still lag the rolling re-closings under way across some US states. Alt-Data is not perfect, but it is an invaluable component of the analytical toolbox.

An alternative way of framing this is in terms of the classic distinction between risk and uncertainty. Those who embraced Alt-Data early on during COVID-19 were able to manage through the extreme volatility of March in the context of quantifiable risk. The moves were extreme, but Alt-Data, both in terms of COVID-19 and measures of social mobility, helped to fill the void and empower decision making. Those without such resources were facing Knightian uncertainty.

To be sure, there are limitations and caveats to keep in mind as the Alt-Data revolution proceeds, including around datasets which are ‘crowded’. Further, the adoption of Alt-Data will be slower in more institutionalized settings, such as central banks. In any case, Alt-Data is a rising force, and COVID-19 has only accelerated its ascent.

Brief Recap:

As a macro advisory and data analytics firm, Exante Data is always on the lookout for new data sources that further our core mission of providing institutional investors with tools and insights to facilitate alpha generation and risk management. In past years, we have focused aggressively on capital flow data, which we believe is under-researched, and we have developed a comprehensive offering (with thousands of relevant time series) under our Global Flow Analytics platform.

In 2020, we identified COVID-19 as a key risk factor for global financial markets in the week of Jan 20, and commenced our signature ‘coronavirus daily’ on Jan 27, almost six months ago. For comparison, here is the WHO COVID-19 timeline.

Much of this early work was aimed at filling the gap we saw between early stage epidemiological modeling, and the pass-through in terms of financial markets, economics and policy reaction.

The gap was vast early on, in part due to the rally back in financial markets through February as China brought the outbreak under control.

In early February, we also started to track China’s return to work via an open source dataset published by Baidu. Specifically, the chart below was one key metric we identified, and it showed mobility in major Chinese cities starting to bottom 20-30 days after the lockdown efforts were put into place.

Since then, we have expanded our Alt-Data coverage aggressively, with a particular emphasis on measures of mobility, socialization and return to work. Here is an example with a global perspective:

We have automated much of the data collation both in respect of COVID-19 and social mobility, and in some cases we have democratized these data via Twitter (using the @ExanteData handle).

Our most timely insights continue to be published in our Daily update, augmented by client calls, webinars, and media where appropriate.

Exante’s client base has grown and broadened significantly through this period. The demand we saw for real-time data on COVID-19 specifically peaked in March, whereas the demand for broader Alt-Data continues apace.

We have seen this across our entire global client base, including hedge funds, real money, corporates, private banks, family offices, sovereign wealth funds and central banks.

State of Play

In terms of where we stand today, it is helpful to think of three different use cases for Alt-Data:

(1) Epidemiological
First, it is clear that Alt-Data has played a crucial role in assessing the efficacy of the public health interventions that have been implemented in response to COVID-19. While the first order challenge from a data perspective has been the collation of cases, tracing, testing and hospitalization, Alt-Data has provided key input in terms of quantifying the extent and timing of the non-pharmaceutical interventions that have occurred. This is an ongoing process, as additional data has aided the parameterization of the models that are informing public policy, including as countries emerge from lockdown.

At Exante we saw this first hand, as we contributed our Alt-Data set on China’s return to work to the Imperial College London (ICL) team in early March. The resulting publication can be found here. ICL published comparable reports in May for Italy and the UK, based on Google’s and Facebook/O2’s Alt-Data respectively, that can be found here and here.

It has not all been smooth sailing. There has been widespread criticism of the role that epidemiological modeling played in informing policy, particularly in the US and the UK. There are many strong substantive criticisms that have been made. A good example is from the International Institute of Forecasters, here.

More dispiriting have been the increasingly ideological and even ad hominem attacks that have been leveled. The authors of these would do well to keep in mind the axiom that “all models are wrong, but some are useful”. Had the core insights of the early nCoV models been implemented early, the world would be in better shape right now.

Of course, we have been busy too, whilst conscious of staying in our lane. A good example is our regression study of whether (and when) social distancing matters, available here. We also closely examined whether the Black Lives Matter protests would lead to case re-acceleration (unlikely), available here. We have also found, perhaps surprisingly to some, that Alt-Data on restaurant usage is among the most helpful in terms of tracking socialization, as illustrated in a univariate sense below:

There are many other examples.

Our aim throughout has been to provide robust and timely insights. This is particularly important in the context of the recent re-acceleration of cases in the US, a key near-term focus for clients.

(2) Financial Markets

For financial markets, the use case for Alt-Data, which was already strong, has become even more compelling through COVID-19.

By way of example:

  • Our real-time tracking of China’s return to work, which we pursued down to the intra-city and inter-city levels, was instrumental in late Q1/early Q2 in terms of understanding demand dynamics as COVID-19 went global. Without Alt-Data it would have been much more difficult to assess the status of China’s supply chains, and the demand for key industrial and agricultural commodities.
  • More generally, Alt-Data has provided timely insights in terms of tracking human mobility, including granularity with respect to usage of public and private transport, offices, restaurants, freight, cargo and other modes of private demand. These data are clearly superior to traditional data, in terms of timeliness. Moreover, there is a sampling effect to consider, as official datasets are based on partials. The pandemic itself amplified this aspect as it impeded traditional modes of data collection. We have also found, unsurprisingly, that Alt-Data leads PMIs, retail sales and other traditional data series (here, here, here). We have also highlighted the prospect of a so-called ‘Carmageddon’, here.
  • Beyond this, there are myriad trading views that have been expressed based explicitly or implicitly on real-time data. These range from quantitative expressions, where mobility has led equity returns, to the qualitative extreme, where the day trading phenomenon in the US has been disproportionately focused on airlines, cruises and rental cars.
    Throughout this period we have also seen the street tool up in terms of Alt-Data. This happened with a lag, as the first order response in February was to look through COVID-19. The overwhelming consensus at the time was for a one-month V-shaped recovery, even more accelerated than the analog from SARS.
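The lead-lag claim in the list above (Alt-Data leading PMIs, retail sales and other traditional series) can be illustrated with a simple cross-correlation at different lags. What follows is a minimal sketch on synthetic data, not the methodology behind the studies linked above: the series `mobility` and `official`, the built-in three-period lag and the noise level are all assumptions made purely for illustration.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def lead_lag_correlations(leader, follower, max_lag):
    """Correlate leader[t] with follower[t + k] for k = 0..max_lag.

    Assumes both series have the same length and frequency.
    """
    n = len(leader)
    return {k: pearson(leader[: n - k], follower[k:]) for k in range(max_lag + 1)}


# Synthetic data only: both series are invented for illustration.
random.seed(0)
N_PERIODS, TRUE_LAG = 200, 3
mobility = [random.gauss(0, 1) for _ in range(N_PERIODS)]
# The hypothetical "official" series echoes mobility TRUE_LAG periods later, plus noise.
official = [0.0] * TRUE_LAG + [
    mobility[t - TRUE_LAG] + 0.3 * random.gauss(0, 1)
    for t in range(TRUE_LAG, N_PERIODS)
]

corrs = lead_lag_correlations(mobility, official, max_lag=10)
best_lag = max(corrs, key=corrs.get)  # lag at which the correlation peaks
print(best_lag, round(corrs[best_lag], 2))
```

On this synthetic example the correlation should peak at the lag that was built in. With real series the peak is rarely this clean, and trends, seasonality and revisions would need to be handled before reading anything into the result.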

The chart below shows that close to the bottom of global activity, the countries which handled the COVID shock the best (Korea, Taiwan, etc.), as quantified by alternative data at the time, were also among those with the most resilient equity markets (although we admit that it is hard to control for sector effects in these charts).

The sell-side also increasingly integrated Alt-Data into its publications through Q2 (with varying degrees of cogency), and it is clear they are on board for the journey ahead. This has important self-reinforcing implications for Alt-Data itself.

(3) Public Policy and Planning

Where we have less conviction is in terms of the uptake of Alt-Data from a public policy and planning perspective.

The most basic issue is that central banks and fiscal authorities remain constrained in terms of formal mandates, along with institutional inertia. It is not a lack of willingness to engage, nor a lack of talent. This can be seen here, here, here, here and here.

Uptake requires not just technical skills in data science and programming or advanced degrees in STEM. It requires an interdisciplinary approach to gather, analyze and nimbly bring data to bear on the most timely and relevant problems and questions facing economic policy makers and investors. And it requires buy-in at the institutional level.

For central banks, we are seeing interest, and some monetary policy decisions have referenced Alt-Data in passing. But institutional change takes time.

The disjuncture that results is between decision makers in the private sector, who increasingly rely upon Alt-Data, and those in the public sector, who are awaiting traditional statistics that conform to expressions of mandate.

This has a long way to run.

Limitations and Caveats

There are many limitations and caveats for the Alt-Data revolution:

  • The most important is that Alt-Data, in many cases, remains privately held. In the important case of Baidu, ‘public permissions’ were cancelled in early May (link). Apple, Google and Facebook have also reserved the right to withdraw datasets from the public domain. This backdrop makes it more difficult to build sustainable scaffolding. There are, however, many sources available, and we doubt the cat can be put back in the bag, especially with COVID-19 still having a long way to run.
  • Second, there are broader limitations regarding privacy, and data sovereignty as well. The privacy aspect varies significantly by jurisdiction, with Europe’s GDPR and China’s surveillance economy representing the opposing extremes. For those in the middle, it remains to be seen how tolerant citizens will be in terms of having private data collated and aggregated, even if as meta-data, beyond the duration of the pandemic. The general experience has been that big tech is able to push well into the traditional private domain of individuals, and that individuals are either unaware, or willing to trade away privacy for convenience.
  • Third, in our tracking of Alt-Data we have noted many inconsistencies and deviations from best practice. A good example is China’s Ministry of Transport, whose data initially suggested a very strong resurgence in private car usage, only for the time series to break in late April (see below).

  • Fourth, while timely, big data is clearly preferable to lagged, small data, there is still the challenge of extracting the signal from the noise. Traditional statistical problems, such as over-fitting, do not go away, nor are they resolved by the naïve deployment of AI technologies.
  • Finally, given the reluctance of central banks to formally engage with Alt-Data, traditional data releases of course remain relevant. The May NFP release is a good example of this, where both Alt-Data and traditional partials were not much help. The rub was that the formal data release did not matter much, beyond the day in question, despite the unprecedented surprise versus forecast. Further, since quant funds are increasingly able to predict key data releases, we are also observing that the price action increasingly happens before a release, rather than after!
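The over-fitting caveat in the list above can be made concrete with a small experiment. The sketch below is illustrative only and uses purely synthetic noise. It screens many candidate "signals" against a short history, picks the one with the best in-sample fit, and then checks it on data it has not seen. The sample sizes, the number of candidates and every variable name are assumptions chosen for illustration.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


random.seed(1)
N_IN, N_OUT, N_CANDIDATES = 30, 500, 200

# The target and every candidate are independent noise: there is no true signal.
target = [random.gauss(0, 1) for _ in range(N_IN + N_OUT)]
candidates = [
    [random.gauss(0, 1) for _ in range(N_IN + N_OUT)] for _ in range(N_CANDIDATES)
]


def in_sample_corr(series):
    """Absolute correlation with the target over the short in-sample window."""
    return abs(pearson(series[:N_IN], target[:N_IN]))


# Pick the candidate that looks best on the short history...
best = max(candidates, key=in_sample_corr)
in_corr = in_sample_corr(best)
# ...and evaluate the same candidate on data it was not selected on.
out_corr = abs(pearson(best[N_IN:], target[N_IN:]))
print(round(in_corr, 2), round(out_corr, 2))
```

Because the winner is selected from many candidates on a short window, its in-sample correlation looks meaningful even though nothing here is related to anything else, and the relationship should largely vanish out of sample. The same trap applies when screening large numbers of Alt-Data series against a handful of data points.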

In Conclusion

It is still early days in the Alt-Data revolution. We think the secular trend is not merely intact, but accelerating.

The early adopters were quant funds. But quantamental funds have caught up, and even the more purely discretionary funds are paying close attention too; they have to.

We are keen to hear about your own experiences, and are looking forward to continuing the conversation.