Vector autoregression

A commenter asks,

I’m curious what your opinion on Christopher Sims’s econometric work is now.

Sims is another macro-econometrician who was awarded a Nobel Prize for work that I think is of no use.

The problem in macro is causal density–there is a high ratio of plausible causal mechanisms to data. If you have dozens of causal variables and only a relative handful of data points, what do you do?

The conventional approach was for the investigator to impose many constraints on the regression model. This is mathematically equivalent to adding rows of “data” that do not come from the real world but instead are constructed by the investigator to conform exactly to the investigator’s pet theories. The net result is that you learn what the investigator wanted the data to look like. But other investigators can–and do–produce very different empirical narratives for the same real-world observations.
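The equivalence between constraints and constructed rows can be checked numerically. The sketch below is an illustration under assumptions, not any particular investigator's procedure: the data are random, the two "beliefs" in `R` and `r` are invented for the example, and the constraint is imposed through a large penalty `weight`, so it holds only approximately (an exact constraint is the limit as the weight grows).

```python
import numpy as np

def constrained_by_penalty(X, y, R, r, weight):
    """Minimize ||y - X b||^2 + weight * ||R b - r||^2 via the normal equations."""
    A = X.T @ X + weight * (R.T @ R)
    c = X.T @ y + weight * (R.T @ r)
    return np.linalg.solve(A, c)

def constrained_by_fake_rows(X, y, R, r, weight):
    """Same problem, solved as ordinary least squares on the real rows
    plus constructed rows that encode the investigator's beliefs."""
    s = np.sqrt(weight)
    X_aug = np.vstack([X, s * R])        # real observations, then invented ones
    y_aug = np.concatenate([y, s * r])
    b, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return b

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 5))             # 12 real observations, 5 regressors
y = rng.normal(size=12)

# Hypothetical beliefs: coefficient 0 equals coefficient 1, and coefficient 4 is zero.
R = np.array([[1.0, -1.0, 0.0, 0.0, 0.0],
              [0.0,  0.0, 0.0, 0.0, 1.0]])
r = np.array([0.0, 0.0])

b_penalty = constrained_by_penalty(X, y, R, r, weight=1e6)
b_rows = constrained_by_fake_rows(X, y, R, r, weight=1e6)

print(np.allclose(b_penalty, b_rows, atol=1e-6))  # the two routes agree
print(b_rows)                                     # b[0] is close to b[1], b[4] is close to 0
```

With 12 real rows and a heavy weight on 2 constructed rows, the constructed rows dominate wherever they speak, which is the sense in which the estimate reports what the investigator assumed.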

Sims’ approach was for the investigator to narrow down the number of causal variables, so that the computer can produce a model without the investigator doctoring the data. But that is not a solution to the causal density problem. If there are many important causal variables in the real world, then in a non-experimental setting, restricting yourself to looking at a few variables at a time is pointless.
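To make the trade-off concrete, here is a minimal sketch of an unrestricted vector autoregression fitted by least squares. It is an illustration, not a reproduction of Sims' work: the function names `fit_var` and `var_param_count`, the simulated two-variable system, and the figures of 30 variables and 4 lags are all assumptions chosen for the example.

```python
import numpy as np

def var_param_count(k, lags):
    """Coefficients in an unrestricted VAR: each of k equations has
    an intercept plus k * lags slope coefficients."""
    return k * (1 + k * lags)

def fit_var(data, lags=1):
    """Least-squares fit of y_t = c + A_1 y_{t-1} + ... + A_p y_{t-p} + e_t.
    data has shape (T, k). Returns (intercept of shape (k,), coefs of shape (lags, k, k))."""
    T, k = data.shape
    # Regressor matrix: a column of ones, then the lagged values, lag 1 first.
    Z = np.hstack([np.ones((T - lags, 1))] +
                  [data[lags - j:T - j] for j in range(1, lags + 1)])
    Y = data[lags:]
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)   # shape (1 + k*lags, k)
    intercept = B[0]
    coefs = B[1:].T.reshape(k, lags, k).transpose(1, 0, 2)
    return intercept, coefs

# Simulate a two-variable VAR(1) with a known coefficient matrix (hypothetical values).
rng = np.random.default_rng(1)
A_true = np.array([[0.5, 0.1],
                   [0.0, 0.3]])
T = 2000
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = A_true @ y[t - 1] + rng.normal(size=2)

intercept_hat, coefs_hat = fit_var(y, lags=1)
print(coefs_hat[0])                    # estimate of A_true

print(var_param_count(k=2, lags=1))    # 6
print(var_param_count(k=30, lags=4))   # 3630
```

The coefficient count grows with the square of the number of variables. Postwar quarterly series supply only a few hundred observations, so an unrestricted model with dozens of variables cannot be estimated, which is why the variable list has to be cut down.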

17 thoughts on “Vector autoregression”

  1. Around the world, data analysis is plagued with bad behavior. You hit the nail on the head, and it seems that too many have a lot of mathematical tools that they want to use, but lack the discipline to remember to try to model the actual real world.

    Easy to criticize, of course, except that few problems in the real world are tractable when we try to model as many variables as possible with data that provides limited information. Economists find themselves forcing tractability in order to get something done, but all too often sacrificing the usefulness of the effort.

  2. The problem in macro is causal density–there is a high ratio of plausible causal mechanisms to data. If you have dozens of causal variables and only a relative handful of data points, what do you do?

    What do you do? I’d say you use inexpensive techniques to approximate solutions to complex problems. Engineers learn various “Numerical Methods” to do exactly that. Vector Autoregression sounds like an equivalent technique; one tool in a toolbox. The only problem I see with such techniques is treating them as deterministic rather than what they are, inaccurate approximations that are sometimes useful.

    I’m unfamiliar with economic modelling in general so maybe I’m unaware of the degree of misuse but dismissing techniques based on their approximate nature smells of scientism. It’s like dismissing a cost-benefit analysis that uses a Monte Carlo Simulation because it’s just “rolling the dice”.

    • That’s a lot harder to do in social science than in engineering. In engineering the direction of cause and effect is usually robustly understood; not so much in social science. For example, in engineering we know that the tensile strength of steel may cause, but is not caused by, the collapse of a bridge. The budget of a school district may cause educational success (more money sometimes leads to better results), or it may be the result of educational success (successful parents are willing to spend more), or it may be more or less irrelevant to educational success (oddly, this is one of the more reasonable interpretations of the actual data). Add dozens of other variables of equally vague relationships to the outcome, and it’s fairly easy to build a model that says whatever you want it to say.

      Also, you can say that austenitic steels have a different range of properties than martensitic steels without getting cancelled on social media.

      • But it wasn’t easier in engineering. Let’s not confuse the end-game that appears deterministic with the long process that was originally non-linear. The compressive/tensile strength of steel failed spectacularly in ships and bridges in cold temperature before the brittleness of steel (temperature dependent) was understood. Aluminum was used for motorcycle frames due to its superior strength-to-weight properties but failed repeatedly due to fatigue. Each failure brings with it careful analysis which hopefully results in a new set of guidelines/best-practices.

        The key is that failure is treated as an important feedback mechanism that helps us slowly converge on good, though likely imperfect, solutions. This is engineering rigor. Both scientists and social scientists can learn from it.

        • A key difference here is that, in your engineering examples, once it occurs to someone to ask the question, the hypothesis can be tested experimentally, e.g. you can freeze metal and test its brittleness. But in the social sciences, a great many important questions and hypotheses have been generated, but can’t really be appropriately tested (or at least not in a way that yields reliably reproducible results). Being constrained to natural experiments may not be so bad for micro phenomena, because you can replicate your experiment on many populations and the range of possible confounders is smaller, but the more ‘macro’ it gets, the worse both problems get. It’d be the same in the sciences too: if an ecologist wanted to come up with a model for the global hyena population it’d be much more difficult than merely modeling the population dynamics for a pack of hyenas.

          • …once it occurs to someone to ask the question, the hypothesis can be tested experimentally

            What you had was something like the hull of a steel ship splitting in two in rough seas. Even if you asked the right question about frigid temperatures, the experiments are multi-variate, multi-disciplinary, and complex. Brittleness varies based on the steel manufacturing process and it is independent of strength. Once the problem was understood, steel manufacturers got better at making less brittle steel just like they got better at the other properties of steel while engineers integrated brittleness equations into their design processes.

            My point is not that every social science problem is like the steel brittleness problem, my point is that complex non-linear problems are solvable.

          • @RAD: There’s a whole branch of mathematics called “chaos theory” that addresses the question of when complex nonlinear problems are tractable and when they aren’t. If you have a serious interest, there are many sources to choose from but Wikipedia has a good start. Even if the systems are tractable, you often need either a well-validated theory or a great deal of reliable data to start with, and these are often lacking in the social sciences.

          • Jay, I’m aware of chaos theory; the concept is quite popular in science fiction. Although I have no doubt that some systems are sensitive to initial conditions, I can’t think of any real-world practical problems that are chaotic; maybe some aspects of fluid dynamics (including weather systems) but I’m skeptical that even the namesake “butterfly effect” is chaotic.

            Regardless, I don’t think it applies to the type of problems we are talking about but I appreciate your concern for my continuing education.

          • RAD, lots of real-world social phenomena are clearly chaotic, in the sense that individual-level actors have society-wide effects. The assassinations of JFK and MLK were done by individuals; the War on Terror and WWI were started in reaction to small group actions (9/11 and the assassination of Archduke Franz Ferdinand, respectively). A few hundred extra Germans in Moscow in 1941 would have had immense historical effects. On an individual level, uncontrollable environmental effects are known to shape us about as much as genes and much more than controllable environmental effects.

          • Also, RAD, if you’re trying to understand the world, science fiction is almost* the worst place to start. That’s science as people wish it was, not as it is. For example, real physics says that faster-than-light travel is impossible, full stop.

            *Religion and what bookstores call “metaphysics” are even worse, and that’s saying something.

          • Jay, you suggested that I read up on “chaos theory” to learn more about the tractability of non-linear systems. Individuals-can-have-large-impacts is not chaos theory as I understand it but it does reflect the mainstream non-technical interpretation of “the butterfly effect”.

            My comment about chaos theory and science fiction referred to the fact that the concept, at least the name, has seeped into popular culture. Like many people, I’ve seen Jurassic Park but speaking metaphorically about chaos theory, as Jeff Goldblum’s character does in the movie, is not the same as demonstrating that certain systems are highly sensitive to initial conditions. I find it somewhat ironic that you assume that I was promoting science fiction when I was trying to categorize the usefulness of chaos theory in this discussion; none beyond superficial popular culture references.

            Let me reiterate part of my initial comment …dismissing techniques [like vector autoregression] based on their approximate nature smells of scientism. Attempting to educate me about chaos theory and “understanding the world” seems unrelated to this core point.

          • RAD, the whole point of chaos theory is that there are systems where approximate solutions simply don’t work. Chaotic systems are systems where the exact behavior can be predicted from the exact initial conditions, but approximate behavior cannot be predicted from approximate knowledge of the initial conditions.

          • Put simply, to predict the approximate paths of next month’s hurricanes*, I would need a model that included every single butterfly in China and all other sources of comparable air displacement on the globe.

            *I live in Florida, where this information would be of great interest.

  3. The set of variables that constrain 85% of a bounded economy should be a small set. The problem is that the bounds are the FX market, and that boundary shifts considerably for a reserve currency like the dollar. An adaptive vector autoregressor on Russia, an energy-concentrated economy with closed currency boundaries, would be useful to hedge betters.

    • I am curious as to why you think “The set of variables that constrain 85% of a bounded economy should be a small set.” It isn’t obvious to me why it should be small as opposed to somewhere in the thousands.

      The reason I say that is that there are plausibly, say, 100 variables that have moderately small impacts and also interact with each other in unknown and unpredictable ways. That creates both very weak signals and a huge number of potential variables introduced by interaction effects. Even lowering the number down to 10 key variables leaves this problem open.

  4. Hayek mentioned that problem in his Nobel speech. Often the data we need isn’t available. Then using proxies is like the drunk looking under the street lamp for his keys. People think the math is important but I’ve never seen it change a PhD’s mind.
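The claim debated in the thread under comment 2, that exact initial conditions determine exact behavior while approximate initial conditions do not determine approximate behavior, can be illustrated with a textbook chaotic system. The sketch below uses the logistic map with parameter 4; the starting value 0.2 and the perturbation of 1e-10 are arbitrary choices for the example, and nothing here shows whether any economic or social system actually behaves this way.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two starting points that agree to ten decimal places.
a = logistic_orbit(0.2, 100)
b = logistic_orbit(0.2 + 1e-10, 100)
gaps = [abs(p - q) for p, q in zip(a, b)]

print(gaps[5])     # still tiny after a few steps
print(max(gaps))   # eventually as large as the variable's whole range allows
```

Each trajectory is fully determined by its starting value, yet a measurement error in the tenth decimal place is enough to make long-range forecasts useless, which is the distinction the commenters are arguing over.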
