1 Introduction
In recent years there has been an explosion of interest in natural language processing (NLP)
within finance and macroeconomics. The use of text data to forecast and assist in model
estimation is becoming increasingly commonplace. Still, there are many open questions
around the use of NLP in empirical work. For example, which of the numerous available
methods work best overall, and which work best in specific contexts? Are off-the-shelf tools appropriate,
or are there greater returns to specializing models to the data at hand? How useful is
text for forecasting real output indicators, such as manufacturing output? What explains
the predictions made by complicated NLP models? This paper addresses these questions,
using a novel dataset and a variety of NLP methods ranging from traditional dictionaries to
fine-tuned transformer neural networks.
Our primary data source is the monthly survey microdata underlying the Institute for
Supply Management’s (ISM) Manufacturing Report on Business. The survey is taken by
purchasing managers at a representative sample of U.S. manufacturing firms. Part of the
survey consists of categorical-response questions about aspects of their current operations,
including production, inventories, backlogs, employment, and new orders. The answers to
these questions are of the form “worse/the same/better than last month”, and are aggregated
into the widely reported ISM diffusion indexes. But the survey also includes free-response
text boxes, where purchasing managers can provide further comments either in general or
about specific aspects of their businesses; these comments are a novel source of signal about
the economy and our focus in this paper.1
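
As a rough illustration of how categorical responses of this kind can be aggregated, the sketch below computes a diffusion index under the conventional definition: the percentage of respondents answering "better" plus half the percentage answering "the same". The text above does not spell out the formula, so this definition, the function name diffusion_index, and the sample responses are assumptions made for illustration only; any seasonal adjustment or weighting applied to the published indexes is omitted.

```python
from collections import Counter


def diffusion_index(responses):
    """Percent 'better' plus half of percent 'the same' (assumed convention)."""
    total = len(responses)
    if total == 0:
        raise ValueError("at least one response is required")
    counts = Counter(responses)
    share_better = counts["better"] / total
    share_same = counts["the same"] / total
    return 100.0 * (share_better + 0.5 * share_same)


# Hypothetical responses to a single question in a single month.
sample = ["better", "better", "the same", "worse"]
index = diffusion_index(sample)
```

Under this convention, a reading of 50 corresponds to equal shares of "better" and "worse" answers, so values above 50 indicate that more respondents report improvement than deterioration.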
Our first step is to convert the text into an economically important and interpretable quantitative
measure. We focus on sentiment, given that waves of optimism and pessimism have his-
torically been linked to business cycle fluctuations (Keynes, 1937). We begin by evaluating
various NLP methods in terms of their ability to correctly classify the sentiment expressed
in individual comments. Our context is fairly specific: the data are comments from manufacturing-sector
purchasing managers opining about the business outlook for their firms, without much
discussion of financial conditions. While there are numerous sentiment classification mod-
els available, many were developed with other data in mind, such as social media posts
(Nielsen, 2011). Even within economics and finance, most work has focused on finance-
1 While ISM collects these responses through the survey, this text is confidential and not incorporated into the publicized indexes. A sample of responses is published in the monthly ISM Report on Business (see https://www.ismworld.org/supply-management-news-and-reports/reports/ism-report-on-business/).