How Machine Learning Can Help Bond Investors

19 June 2019
When a group of graduate students asked Warren Buffett about the best way to prepare for an investing career, he held up a stack of Securities and Exchange Commission filings. “Read 500 pages like this every day,” he told them. “That’s how knowledge works. It builds up like compound interest.”

It’s a well-worn story, and it gets at a fundamental truth about investing: annual Form 10-K and quarterly Form 10-Q filings are important sources of information about the financial state of a company—its revenues, its risks and its management’s outlook for the future.

For most investors, though, there aren’t enough hours in the day to comb through the onslaught of data coming at them today. An investment firm could easily employ 30 full-time research analysts just to keep up with the torrent of 10-Ks (8,000 per year), 10-Qs (32,000 per year) and earnings call transcripts (20,000 per year)—and even the best of them will occasionally overlook actionable investment insights.

Artificial intelligence (AI) and machine learning (ML) are changing this, giving our industry a new way to tackle investing problems and increase return potential. For instance, a computer equipped with natural language processing (NLP), a type of AI that helps computers understand and interpret human language, can scour hundreds of thousands of corporate filings in seconds and instantly tease out patterns from a swirl of words and numbers, making it easier for investors to predict returns.

Cutting Through the Complexity

Bond managers have been a bit slower than equity investors to incorporate AI into their research process. This is largely because of the sheer amount of data to digitize. There’s just one stock ticker for Apple, but many corresponding bonds, with varying maturities, coupons, covenants and risks.

Our fixed-income team has tried to cut through this complexity in two ways. First, we digitized and centralized bond liquidity data and our own credit analysts’ views, making them instantly available to everyone on our fixed-income team. Then, we made the data available to an AI-driven virtual assistant that can use it to generate lists of potential investment opportunities within seconds.

We’re not alone. Machine learning, NLP and predictive analytics can make huge, complex data sets more meaningful, and bond managers are increasingly seeing the value of incorporating these technologies into their research and investment process.

Corporate filings are a great example. What companies say and do—and what it says about the health of their businesses—tells us a lot about overall credit quality. This makes these signals highly relevant when forecasting corporate bond returns. In time, we expect more and more fixed-income managers will go the extra mile and apply these techniques to credit analysis.

Corporate-Speak: Reading Between the Lines

Why is NLP effective? Simply because companies are run by people, and people are creatures of habit. Most companies use the previous quarter’s or year’s filing as a template and update it to produce the most recent one. Because so much of the text carries over unchanged, the passages that do change are more likely to be meaningful. A computer can quickly crawl through these lengthy documents, with their densely packed sections of text, tables and charts, and search for key signals that may escape the human eye.

With NLP, a computer can assess the tone, or sentiment, of a 10-K filing. Is management using more “negative” words—“losses,” “impairment,” “adverse”—in the financials section of its 10-K? This may mean the company is facing challenges to its business. Word choice and tone can even offer insights about a company’s culture or the leadership style of its executives.
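As a rough illustration of the word-counting idea, the sketch below measures what share of a passage's words appear in a "negative" list. The three-word lexicon is taken from the examples above and is only a stand-in; the function names are hypothetical, and a production dictionary would be far larger and more carefully built.

```python
import re

# Hypothetical mini-lexicon: only the three example words from the text.
NEGATIVE_WORDS = {"losses", "impairment", "adverse"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def negative_share(text):
    """Return the fraction of word tokens that appear in NEGATIVE_WORDS."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in NEGATIVE_WORDS)
    return hits / len(tokens)


# Made-up sentence, not taken from any real filing.
sample = "Adverse market conditions led to losses and an impairment charge."
print(negative_share(sample))  # 3 negative words out of 10 tokens
```

Comparing this share between one filing and the next is one simple way to turn tone into a number that can be tracked over time.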

Complexity of language matters, too. Are certain parts of the management discussion section getting longer and more complex? Has the ratio of numbers to words increased? Perhaps management is trying to divert investor attention from underlying problems. And what about the overall risk outlook? Though management may not say it in so many words, substantial changes here may indicate that the company anticipates that its overall risk exposure will increase.
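The number-to-word ratio mentioned above can be approximated in a few lines. This is a minimal sketch that assumes a simple definition (count of numeric tokens divided by count of word tokens); the regular expressions and function name are illustrative choices, not a standard.

```python
import re

# A number is a digit run that may contain thousands separators and a decimal part.
NUMBER_PATTERN = re.compile(r"\d[\d,]*(?:\.\d+)?")
# A word is a run of letters, optionally joined by apostrophes or hyphens.
WORD_PATTERN = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


def number_to_word_ratio(text):
    """Return (count of numeric tokens) / (count of word tokens)."""
    numbers = NUMBER_PATTERN.findall(text)
    words = WORD_PATTERN.findall(text)
    if not words:
        return 0.0
    return len(numbers) / len(words)


# Made-up sentences standing in for the same passage in two consecutive filings.
prior = "Revenue grew 5% on strong demand."
current = "Revenue of 1,250.5 fell 3% versus 1,300.0 in 2018."

change = number_to_word_ratio(current) - number_to_word_ratio(prior)
print(change)  # positive: the newer passage leans more heavily on numbers
```

Note that a token such as "10-K" would count as one number and one word under these patterns, so real filings would need more careful tokenization.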

Sometimes, the signals can be subtle. A credit analyst may not notice a 10% increase in “negative” sentiment in a company’s 10-K, particularly when covering 20 or 30 companies in a given sector. But a computer will, and it can synthesize these and other data into a net sentiment signal and back-test it against years of SEC filings to determine its predictive power.

Investment managers can also use NLP-powered learning to extract thematic risk signals across sectors and industries. If “trade” or “tariffs” start showing up as a risk in the filings of companies from a wide cross-section of industries—manufacturing, agriculture, technology, banking—it’s probably significant and may be predictive. Again, a computer that can crawl through thousands of filings in seconds can register the change more readily than a group of analysts, each devoted to his or her own sector.
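One simple way to picture a thematic signal is to measure how broadly a term appears across sectors from one year to the next. The sketch below assumes each filing is reduced to a (sector, risk text) pair; the sample sentences are invented and the function name is hypothetical.

```python
import re


def theme_breadth(filings, terms):
    """Return the fraction of sectors with at least one filing mentioning any term.

    filings: list of (sector, risk_text) pairs.
    terms: whole words to look for, matched case-insensitively.
    """
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    all_sectors = {sector for sector, _ in filings}
    if not all_sectors:
        return 0.0
    hit_sectors = {sector for sector, text in filings if pattern.search(text)}
    return len(hit_sectors) / len(all_sectors)


# Invented risk-factor snippets, one per sector and year.
last_year = [
    ("manufacturing", "Risks include raw material costs."),
    ("agriculture", "Weather may reduce crop yields."),
    ("technology", "We rely on registered trademarks."),
    ("banking", "Interest rate changes affect margins."),
]
this_year = [
    ("manufacturing", "New tariffs may raise input costs."),
    ("agriculture", "Trade disputes could cut exports."),
    ("technology", "Tariffs on components may hurt margins."),
    ("banking", "Interest rate changes affect margins."),
]

terms = ["trade", "tariffs"]
print(theme_breadth(last_year, terms), theme_breadth(this_year, terms))
```

Matching on whole words matters here: without the word boundaries, "trademarks" would be counted as a mention of "trade."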

Human + Machine

Human language, of course, is complex by nature, and words can have multiple meanings. The term “buyback” would score as a positive for equity-oriented analysts but a negative for fixed-income teams. Using a domain-specific dictionary is essential, and it’s a big reason why we’re creating a credit-specific one for our fixed-income analysts.
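The "buyback" example can be made concrete with two small dictionaries that score the same sentence differently. This is only a sketch: the lexicons, their +1/-1 scores and the function name are hypothetical placeholders, not an actual credit dictionary.

```python
import re

# Hypothetical term scores: +1 is favorable, -1 is unfavorable for that audience.
EQUITY_LEXICON = {"buyback": 1, "impairment": -1, "losses": -1}
CREDIT_LEXICON = {"buyback": -1, "impairment": -1, "losses": -1}


def net_sentiment(text, lexicon):
    """Sum the lexicon scores of every word token; unknown words score zero."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


# Invented sentence, not taken from any real filing.
sentence = "The board approved a new buyback funded with debt."

print(net_sentiment(sentence, EQUITY_LEXICON))  # 1: read as shareholder-friendly
print(net_sentiment(sentence, CREDIT_LEXICON))  # -1: read as a risk to bondholders
```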

The lesson here? Even in our current age of AI and big data, asset managers can’t get by without flesh-and-blood analysts. While computers excel at crunching mountains of data at breakneck speed, it takes humans to do the abstract thinking and high-level strategy planning. Quantitative and fundamental research are complementary, and combining the two is the best way to improve the accuracy of our return forecasts. But to succeed, quantitative and fundamental analysts must be able to talk to each other in a language both can understand. That means that anything a quantitative model suggests should make intuitive sense.

For example, it’s not hard to accept that a sequential change in corporate sentiment appears to be a predictor of relative returns. But an NLP-driven strategy that suggests that frequent use of the word “vice” in a 10-K filing correlates with poor performance should set off alarm bells and prompt a closer look at the underlying data. In a corporate filing, it’s highly likely that the word refers to “vice president,” not drugs, gambling or prostitution.
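A quick check of the kind a skeptical analyst might run: count how often "vice" appears at all, then count how often it appears outside a job title. The pattern below is an illustrative assumption about which title words to exclude, not an exhaustive list.

```python
import re

# Every whole-word occurrence of "vice".
ANY_VICE = re.compile(r"\bvice\b", re.IGNORECASE)
# "vice" NOT followed by a title word ("vice president", "vice-chairman", ...).
NON_TITLE_VICE = re.compile(
    r"\bvice\b(?![\s-]+(?:president|chair|chairman)\b)",
    re.IGNORECASE,
)


def vice_counts(text):
    """Return (all occurrences of 'vice', occurrences outside job titles)."""
    return len(ANY_VICE.findall(text)), len(NON_TITLE_VICE.findall(text))


# Invented signature-block style sentence.
filing_text = (
    "Jane Doe, Executive Vice President, and John Roe, "
    "Vice-President of Finance, signed the filing."
)
print(vice_counts(filing_text))  # (2, 0): every hit is part of a title
```

If nearly all hits disappear once titles are excluded, the apparent correlation says more about signature blocks than about the business.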

Think of it this way: when it comes to data-driven quantitative strategies, there’s always a bit of art mixed in with the science. Quantitative tools like machine learning can provide important insights, but it takes fundamental judgment to inform and qualify those insights.

More Information Is Better

The goal for asset managers should be to extract information from as many sources as possible and synthesize the information more quickly than their competitors. When it comes to 10-Ks, quantitative analysts might start by evaluating the predictive power of the various NLP sub-signals, such as changes in net sentiment or changes in the number-to-word ratio, using historical data, and then ranking them accordingly.
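One common way to gauge a sub-signal's predictive power is a rank correlation between the signal and the returns that followed. The sketch below uses that approach as an assumption (the article does not say which measure is used), and all signal values and returns are invented.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward. Ties are not averaged in this sketch."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def rank_correlation(signal, returns):
    """Spearman-style correlation: Pearson correlation of the two rank series."""
    return pearson(ranks(signal), ranks(returns))


# Invented data for five issuers: sub-signal values and subsequent returns.
forward_returns = [0.04, -0.02, 0.01, 0.03, -0.01]
sub_signals = {
    "net_sentiment_change": [0.5, -0.4, 0.1, 0.3, -0.2],
    "number_to_word_change": [-0.1, 0.3, 0.2, 0.0, 0.1],
    "filing_length_change": [0.1, 0.2, -0.3, 0.0, 0.4],
}

scores = {
    name: rank_correlation(values, forward_returns)
    for name, values in sub_signals.items()
}
# Strongest relationship first, regardless of sign.
ranked = sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
for name, score in ranked:
    print(name, round(score, 2))
```

Five data points prove nothing, of course; a real back-test would span many issuers and many years of filings.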

The next step? Combine the sub-signals with the most predictive power into one overarching 10-K signal, taking into account the specific target investment universe, such as high-yield bonds, investment-grade credit or the S&P 500. This is because some of the sub-signals—net sentiment, for instance—may be more predictive in one market but less so in another.
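A hedged sketch of the combination step: standardize each sub-signal across issuers, then blend them with weights that differ by investment universe. The weights in UNIVERSE_WEIGHTS are placeholders chosen for illustration, not estimates, and the issuer data is invented.

```python
from statistics import mean, pstdev

# Hypothetical weights per universe; signs and sizes are illustrative only.
UNIVERSE_WEIGHTS = {
    "high_yield": {"net_sentiment_change": 0.7, "number_to_word_change": -0.3},
    "investment_grade": {"net_sentiment_change": 0.4, "number_to_word_change": -0.6},
}


def zscores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(value - mu) / sigma for value in values]


def composite_signal(sub_signals, weights):
    """Return one weighted score per issuer from standardized sub-signals."""
    standardized = {name: zscores(values) for name, values in sub_signals.items()}
    issuer_count = len(next(iter(sub_signals.values())))
    return [
        sum(weights.get(name, 0.0) * standardized[name][i] for name in sub_signals)
        for i in range(issuer_count)
    ]


# Invented sub-signal values for three issuers: A, B and C.
sub_signals = {
    "net_sentiment_change": [0.2, 0.0, -0.2],
    "number_to_word_change": [0.1, 0.3, 0.2],
}

hy_scores = composite_signal(sub_signals, UNIVERSE_WEIGHTS["high_yield"])
ig_scores = composite_signal(sub_signals, UNIVERSE_WEIGHTS["investment_grade"])
print([round(score, 2) for score in hy_scores])
print([round(score, 2) for score in ig_scores])
```

With these placeholder weights, issuer B scores above issuer C in the high-yield blend but below it in the investment-grade blend, which is the sense in which the same sub-signals can rank bonds differently depending on the target universe.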

Ultimately, fundamental analysts and portfolio managers can examine the signal in concert with other predictive signals and apply them all to portfolio construction. When the process works, it should give managers a way to differentiate among a cross-section of corporate bonds, letting them increase exposure to those that score well and avoid those that score poorly.

That’s important, because bond managers who generate consistent outperformance over the long run generally don’t do it by timing the market or managing duration. They do it by maximizing exposure to factors with a track record of producing alpha and avoiding exposure to those that don’t.

For asset managers, the objective hasn’t changed: we’re still trying to get actionable insights that lead to alpha-generating opportunities. What has changed in the era of machine learning and big data is how we can go about achieving that objective.

The views expressed herein do not constitute research, investment advice or trade recommendations and do not necessarily represent the views of all AB portfolio-management teams.

