Ken Judd of Stanford University’s Hoover Institution recently entertained the internet community with a public rant about the hostile treatment he received from ivory-tower peers at the University of Chicago’s "Journal of Political Economy" (JPE), one of the so-called top-five journals that can make or break careers in economics. The rather disturbing details aside, one particular aspect of the exchange strikes the observer as really, really odd: the whole argument was essentially an argument about nothing.
This nothing that first ignited and then fanned the flames is the utility function that beats like a heart in Ken Judd’s model. Perhaps surprisingly for the uninitiated, this utility function is a completely fictional construct that commands as much empirical support as God’s breath or the "aether" that once served as the catch-all of Greek mythology and medieval alchemy. Yet both parties fought fiercely about its properties.
The nature of the disagreement, therefore, is also extremely superficial: had either of the two parties bothered to offer even the faintest empirical fact in support of their respective parameterisation, the discussion would have been over in a matter of seconds. Alas, they did not, and this, too, is really, really odd.
Coincidentally, the 2022 meeting of the Allied Social Science Associations (ASSA) took place less than two weeks before Judd went public. At this meeting, several papers using the same methodology as Judd’s were presented in many different sessions. This methodology is the so-called dynamic stochastic general equilibrium (DSGE, for short) model.
One common feature of all the variants of this model[1] is the deployment of some version of said "utility function", which is meant to describe how the "representative agent" decides and acts. Depending on the properties of this function, the chosen parameters, and the further model assumptions, policy actions and shocks as well as the status quo can be expressed as numerical representations of macroeconomic variables such as inflation or GDP.
The possibility of linking the model to numerical representations of macroeconomic variables makes the proponents of the method label their approach "empirical macroeconomics". However, as I was able to establish by means of a survey among ASSA presenters, there is nothing empirical about it, at least not when it comes to the very heart of the method.
It is worth noting that the utility function determines how all elements of the model dynamically interact, because the "representative agent" is assumed to process all available (model) information with the objective of maximising its utility over an extended – usually infinite – horizon. This processing also entails the consideration of the evolution of the economic variables as well as the agent’s own responses to this evolution. That is why the utility function can be understood as the heart of a DSGE model. Yet this heart is completely imaginary.
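For readers unfamiliar with the setup, the optimisation problem at the core of a typical DSGE model looks roughly as follows (a stylised sketch with a CRRA functional form; the symbols are illustrative and not taken from Judd’s paper or any particular model):

```latex
\max_{\{c_t\}} \; \mathbb{E}_0 \sum_{t=0}^{\infty} \beta^t \, u(c_t),
\qquad u(c_t) = \frac{c_t^{1-\gamma}}{1-\gamma},
```

subject to the model’s budget and market-clearing constraints, where \beta is the discount factor and \gamma the curvature parameter governing risk aversion. Neither the functional form of u nor the values of \beta and \gamma are observed directly; they are posited.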
To further appreciate the significance of the matter, a comparison to other imaginary elements in science may seem appropriate. For example, the Higgs boson as well as linguists’ "laryngeals" are also purely imaginary, yet in stark contrast to DSGE’s utility functions they are well defined, and nobody assumes that their properties vary from one application to the next.
Holding the imaginary item’s properties constant is, of course, a prerequisite for studying the impact of a changing environment such as a surprise policy action or the way economic variables generally interact. By analogy, it was only the constancy of the laryngeals’ operation on phonemes that eventually earned them general acceptance (over the course of a century, nota bene), and the invisible Higgs boson was verified by testing predictions of its impact on visible matter. Had researchers not kept the properties of these items constant, consistent inference would not have been possible and all results would ultimately have been arbitrary.
Unfortunately, the latter is the fate of utility functions. One and the same researcher usually entertains different utility functions with differing parameters or functional forms or any combination thereof. It is, therefore, impossible to learn whether the data, the other model assumptions, or the particular utility function informs the results. This "flexibility" with respect to the definition of the utility function is the equivalent of operating with different definitions of the laryngeals in order to fit one’s preferred story about the evolution of phonemes.
Even worse, while it is impossible to conduct an empirical study on laryngeals, it would very well be possible to investigate the utility functions of "representative agents". Although such an agent, too, is imaginary, it would be perfectly possible to directly study human behaviour in order to infer utility functions. Equipped with empirical support for one’s utility function, a researcher could handily argue that his or her model commands more authority and hence that its results deserve more attention than those of competing models. In short, nasty arguments such as the one reported by Judd would not emerge in the first place.
A well-founded and credible utility function would hence be an enormous advantage for economic research. It would offer the opportunity not only to rigorously test "economic laws" by comparing the fit of competing model designs but also to study the impact of policy changes under the presumption that agents’ preferences remain constant. At the same time, models using inappropriate utility functions could easily be discriminated against and excluded from further policy analysis.
In fact, Judd’s initial rebuke from the JPE can at least partly be read as an act of discrimination against his utility function:
"However, the JPE editor and referees said that there was no economic content to the paper because we reported the results for a variety of Epstein-Zin utility function parameters, and got different results for different parameter choices."
It remains, of course, the JPE’s secret why "a variety of Epstein-Zin utility function parameters" should be less worthy of publication than any other, given that neither the JPE editor nor the referees seem to bother with empirical, first-hand evidence supporting the choice of their own favourite utility functions.
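For context, Epstein-Zin preferences are commonly written as a recursion of roughly the following form (notation and normalisation vary across papers; this is one standard parameterisation, not necessarily the one used in Judd’s paper):

```latex
V_t = \Big[ (1-\beta)\, c_t^{1-\rho}
      + \beta \big( \mathbb{E}_t \big[ V_{t+1}^{1-\gamma} \big] \big)^{\frac{1-\rho}{1-\gamma}} \Big]^{\frac{1}{1-\rho}}
```

Here \rho governs the willingness to shift consumption across time, \gamma the aversion to risk, and \beta the discount factor. The dispute Judd reports is, in the end, a dispute about which numerical values of these parameters are "economically meaningful" – a question that, absent empirical evidence, has no objective answer.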
Unfortunately, this carelessness apparently is a much wider problem among DSGE modellers. It is even fair to say that DSGE papers routinely fail to offer empirical backing for the underlying utility functions and for the behaviour of the models’ "representative agents" at large. Given the fundamental advantages of empirical evidence, one might wonder why exactly researchers pass up this opportunity.
Enter the ASSA 2022 survey.
In order to shed some light on actual research practice regarding the empirical backing of utility functions, I conducted a survey among active researchers as follows. First, the ASSA 2022 conference program was scanned for keywords such as "DSGE" and "New Keynesian", the latter being the label of a hugely popular DSGE derivative.
Next, a total of six sessions featuring presentations of DSGE models were selected for attendance across the three days of the conference. During the sessions, the presenters were addressed via the online Q&A dialogue forms. The question asked was: "I wonder if you have any direct, micro-level evidence that actual agents behave like you describe the agents’ behaviour in your model?"
Finally, the respective answers were noted. Where no answer was given during the session, I sent follow-up emails to the presenters to obtain, clarify or confirm the respective answer.
The survey respondents hailed from the University of Chicago, Princeton, Stanford, UCLA, the University of Minnesota, and the ECB. Two of them are senior faculty with international standing that reaches far beyond their field of specialisation, and the others are household names in DSGE modelling. In this sense, the survey was not representative but certainly "authoritative".
All but Monika Piazzesi eventually answered the above question.
To cut a long story short, all answers were a unanimous "No".
That means that none of the researchers had any empirical evidence for the foundation on which they build their analysis. That said, the respondents also offered reasons as to why they use their particular utility function or why empirical evidence is not required in the first place. These reasons are truly revealing (category numbers in parentheses added):
"I think that micro-foundations of bond demand is useful and help policy making. That's why in our model there is an explicit micro-foundation (and not simply assumed as bond in the utility function). Of course, one has to be mindful to capture the right micro-foundation and whether one captures all relevant effects (when one squeezes data through someone's model)." (1)
"I'll let you answer that question yourself as you seem interested in epistemological questions. I think economics has learned a lot from theory that predates data […]." (1)
"That's a question with lots of facets. Not sure which answer might be most compelling. Let me try this one. The moon does not think at all about theories of gravity, yet obeys its laws. Remarkable!" (2)
"So in [...] we use a standard ‘money in the utility’ specification that gives us in an (admittedly) simple way a liquidity premium for cash, that we then extend to [...]. There is a JME paper by Feenstra that shows that this specification is equivalent to several cash-in-advance and liquidity constraints." (3)
"In my model, I have agents of household, firms and financial intermediary. Their preference specification are standard in the literature. The preferences for household and financial intermediary are adopted from Ottonello and Winberry (2020)." (3)
Overall, there seem to be three principal arguments against going empirical in empirical macroeconomics:
- theory is (self-evidently) sufficient,
- representative agents’ behaviour follows (inferable) natural laws,
- the utility function is very similar to the utility functions other researchers use.
Of these three, the second clearly stands out in that it is worthy of some consideration, while argument categories one and three are ultimately circular and hence simply unscientific.
Before turning to (2), one might note that the researcher who put forward this reason had no fewer than four papers at the conference, with four different specifications of individual preferences. By analogy, that would amount to four different laws of gravity for one and the same moon. Ridiculous as it sounds (and is), this likening of agents to dead matter also holds the key to understanding what is going wrong in "empirical macroeconomics" and why.
The second category suggests that representative agents follow physical laws, which is truly remarkable on several counts. For one, it represents no less than a sharp turn away from the widely accepted economic principle of "methodological individualism".[2] Instead of treating the human decision maker (the representative agent) as a self-conscious individual with free will and agency, it studies automated cause-effect relations.
This dehumanisation thus also runs counter to the principles of modernism and enlightenment, as it portrays humans as puppets on strings at the mercy of some higher power. Two hundred and fifty years ago this higher power used to be divine; in "modern" empirical macroeconomics, divinity is replaced by secular sets of equations.
When digging a little deeper, the dead-matter analogy also shows that the "empirical macro" literature pays only lip service to the concept of "microfoundation" of macroeconomics.
"Microfoundation" of macroeconomics is nominally working like this: given the utility functions and according parameters of the economic agent the agent’s optimal decisions in response to shocks and policy measures can be inferred based on knowledge or hypothesis about the economic environment. The ice on the cake is dynamic optimisation that also accounts for the effects of the agent’s own decisions. These dynamically optimal responses finally generate numbers that can be labeled as income or inflation, for example.
According to the survey, however, no empirical evidence about the utility function and its parameters exists. The point of departure, the parameterised utility function, therefore rests on purely theoretical, or rather, imaginary grounds. The question thus arises of how to assess the quality of the overall inference when the quality of the basis deliberately remains unknown. The "answer" to this question is called "calibration" and works by choosing the free parameters of the economic part of the model such that empirical moments of observable time series like inflation and income match the moments of the numbers generated by the dynamically optimal responses of the model.
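A minimal, purely illustrative sketch of this moment-matching logic might look like the following (a toy "model", made-up target moments, and no relation to any actual DSGE code):

```python
# Toy illustration of calibration by moment matching: choose the free
# parameters of a made-up model so that the mean and standard deviation of
# its simulated output match two hypothetical "empirical" moments.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
shocks = rng.standard_normal(10_000)            # one fixed draw of model shocks

def simulated_moments(params):
    """Simulate the toy model and return the (mean, std) of its output."""
    beta, sigma = params                        # stand-ins for "deep" parameters
    growth = (1.0 - beta) * 0.05 + sigma * shocks
    return np.array([growth.mean(), growth.std()])

empirical_moments = np.array([0.02, 0.03])      # hypothetical data moments

def loss(params):
    """Squared distance between simulated and empirical moments."""
    return np.sum((simulated_moments(params) - empirical_moments) ** 2)

result = minimize(loss, x0=[0.9, 0.1], method="Nelder-Mead")
print("calibrated parameters:", result.x)
```

The point of the sketch is not the toy model but the logic: the loss function only ever sees the joint product of the assumed functional form and the free parameters, so a good fit cannot, by construction, tell the two apart.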
The interesting bit of this approach is that, although calibration nominally serves as a means of determining the free model parameters with the aim of closely matching model output and observations, it simultaneously confronts the choice of the underlying utility function with the actual data, because the match between empirical and modelled moments depends as much on the choice of the free parameters as on the utility function posited beforehand.
The Judd example handily illustrates that DSGE researchers are well aware of this issue and therefore take great pains to justify their choice of the utility function, in an effort to make the model results look dependent only on the design of the economic model environment – usually the main object of investigation – and independent of the choice of the utility function.
This effort is akin to assuming a given law of gravity and studying, say, the effect of the moon colliding with Elon Musk’s rocket or its drag on the earth’s oceans, depending on the objective of the investigation. Keeping the law of gravity fixed permits the comparison of models that aim at explaining the tides or advising on measures to limit potential damage to space kit from collisions. The icing on the cake is to use empirics to pin down the law of gravity prior to modelling the tides.
In "empirical" macroeconomics, however, the analogue to the law of gravity is far from being fixed. More precisely, it is "fixed" in each paper on its own yet never across papers nor authors. "Empirical macroeconomists" would thus be able to "explain" the tides with many different models where each of them would be fitted with a customised law of gravity. Moreover, no calibration could ever inform which of the different models was the better description of reality and, therefore, no rational man would accept advice based on any of them.
Unfortunately, this arbitrariness in economic modelling is exactly what we observe in the literature. Instead of rigorously selecting the best foundation, that is, the most suitable utility function based on empirical micro evidence, the choice of utility functions ultimately is a matter of top-down optimisation through the back door: good calibration results "justify" not only the economic part of the model but, implicitly, the utility function as well.
This justification, ultimately, is the core message of the moon – law of gravity – analogy, because only if agents follow objective laws, without agency and without a will of their own, can these laws be inferred from the agents’ behaviour, quite as the eternal law of gravitation can be inferred from the movements of celestial objects or from objects dropped from the tower of Pisa, for that matter.
In other words, the approach that claims to rest on "microfoundation" for macroeconomic analysis does, in fact, pursue "macrofoundation". The difference is simply the detour of letting macro data "justify" micro assumptions about the utility function.
Interestingly, attendees of macroeconomic seminars can very often testify to this reverse-engineering of macro models, when researchers swap tips on how best to amend a utility function in order to obtain the desired macroeconomic results. One such example is also hidden in the survey responses, where reference is made to an "extension" of a utility function.
The macro foundations of macroeconomic analysis cannot, however, fix the issue of arbitrariness. Remember that the calibration exercise provides the only information about the external validity of the whole model. Statements about the functioning of the actual economy and about the actual effects of policy measures therefore decisively hinge on the results of the calibration. However, these results depend on the presumed features of the economy AND, simultaneously, on the appropriateness of the utility function, and the two contributions cannot be told apart.
Therefore, no matter how perfect the calibrations are, they cannot lend support to any statement about economic relations or about the impact of policy changes based on the respective DSGE model. The only way to make such statements more credible would be to rigorously justify the underlying utility functions independently of the remaining model and with information sourced from outside the model.
This apparent lack of credibility and rigour leads us back to "methodological individualism", the approach that, as we have already seen, cannot be squared with DSGE modelling. One might wonder whether DSGE modellers can simply choose to ignore it.
At first glance, DSGE models seem to imply total ignorance, because representative agents (or representative groups of agents with limited heterogeneity) featuring objective utility functions populate the literature. At second glance, however, it becomes obvious that "methodological individualism" prevails and even dominates. To understand this dominance one only has to note, once again, that the representative agent has fixed properties only within any given model, or paper. Across papers and across time, researchers appeal to a great many varieties of agents’ properties, rendering the notion of (heterogeneous) representative agent(s) with stable preferences absurd.
This variety thus not only causes the arbitrariness described above but also reflects the fact that humans are indeed individuals who cannot be objectified. To put it differently, the underlying fabric of the economy permeates economics even against the more or less explicit will of the researchers concerned.
It is obvious that this fabric poses a considerable challenge for economists, and "empirical macroeconomics" in its current state certainly does not live up to this challenge. However, understanding the main issues can also serve as a guide for future economic research.
First, the consequences of (genuine) individualism for the traditional DSGE model class need to be understood. Tentatively, letting the degree of heterogeneity within heterogeneous-agent models approach infinity should provide a rough idea of where macroeconomics is headed.
Second, accepting subjective judgments in all economic matters will give rise to fundamental unpredictability of key economic variables, which will in turn shed new light on concepts such as rational choice.
Third, instead of musing about agents’ decision making in a purely theoretical fashion, macroeconomists should partake in the "empirical revolution" in economics and include empirical findings about actual decision making in their models.
If they do so, public rants such as Judd’s will certainly continue to entertain the internet community, yet with way less oddity involved.
[1] For anyone interested in the DSGE literature, this hagiographic account is a good source.
[2] See Weber, Max, 1922. Economy and Society, Guenther Roth and Claus Wittich (eds.), Berkeley: University of California Press, 1968.