Information theory short course by Jman Shade

Information theory 101 (information equilibrium edition)

With this blog ostensibly dedicated to the purpose of using information theory to understand economics, it seems only natural to have a short introduction to information theory itself. Or at least as much as is necessary to understand this blog (which isn't much). Nearly everything you'd need is contained in Claude Shannon's 1948 paper: A mathematical theory of communication [pdf]

In that paper Shannon defined what "communication" was, mathematically speaking. It comes down to reproducing a string of symbols (a message) selected from a distribution of messages at one point (the transmitter, Tx) at another point (the receiver, Rx). Connecting them is what is called a "channel". The famous picture Shannon drew in that paper is here:

A communication channel

Information theory is sometimes made to seem to spring fully formed from Shannon's head like Athena, but it has some precursors in Hartley and Nyquist (both even worked at Bell Labs, like Shannon), and Hartley's definition of information (which coincides with Shannon's when all symbols are considered equally probable) is actually the one we resort to most of the time on this blog.

One improvement Shannon made was to come up with a definition of information that could handle symbols with different probabilities. In their book, Shannon and Weaver are careful to note that we are only talking about the symbols when we talk about information, not their meaning. For example I could successfully transmit the series of symbols making up the word BILLION but the meaning could be different to a British English (10⁹ or 10¹²) and an American English (10⁹) speaker. It would be different to a French speaker (10¹²). The information would also be slightly different since letter frequencies (i.e. their probabilities of occurring in a message) differ slightly among the languages/dialects.

Shannon came up with the definition of information by looking at its properties:

● Something that always happens carries no information (a light that is always on isn't communicating anything -- it has to have at least a tiny chance of going out)
● The information received from transmitting two independent symbols is the sum of the information from each symbol
● There is no such thing as negative information

You can see these are intimately connected to probability. Our information function I(p) -- with p a probability -- therefore has to have the mathematical properties

● I(p = 1) = 0
● I(p₁ p₂) = I(p₁) + I(p₂)
● I(p) ≥ 0

The second one follows from the probability of two independent events being the product of the two probabilities. It's also the one that dictates that I(p) must be related to the logarithm. Since all probabilities have to obey 1 ≥ p ≥ 0, we have

I(p) = log(1/p)

This is the information entropy of an instance of a random variable with probability p. The Shannon (information) entropy of a random event is the expected value of its information entropy

H(p) = E[I(p)] = Σₓ pₓ I(pₓ) = - Σₓ pₓ log(pₓ)

where the sum is taken over all the states pₓ (where Σₓ pₓ = 1). Also note that p log(p) is taken to be 0 for p = 0 (its limiting value as p goes to 0). There's a bit of abuse of notation in writing H(p). More accurately you could write this in terms of a random variable X with probability function P(X):

H(X) = E[I(X)] = E[- log(P(X))]

This form makes it clearer that X is just a dummy variable. The information entropy is actually a property of the distribution P the symbols are drawn from:

H(•) = E[I(•)] = E[- log(P(•))]

In economics, this becomes the critical point; we say that the information entropy of the distribution P₁ of demand (d) is equal to the information entropy of the distribution P₂ of supply (s):

E[I(d)] = E[I(s)]
E[- log(P₁(d))] = E[- log(P₂(s))]
E[- log(P₁(•))] = E[- log(P₂(•))]

and call it information equilibrium (for a single transaction here). The market can be seen as a system for equalizing the distributions of supply and demand (so that everywhere there is some demand, there is some supply ... at least in an ideal market).

Also in economics (at least on this blog), we frequently take P to be a uniform distribution (over x = 1..σ symbols) so that:

E[I(p)] = - Σₓ pₓ log(pₓ) = - Σₓ (1/σ) log(1/σ) = - (σ/σ) log(1/σ) = log σ

The information in n such events (a string of n symbols from an alphabet of size σ with uniformly distributed symbols) is just

n E[I(p)] = n log σ

Or another way using random variable form for multiple transactions with uniform distributions:

E[- log(P₁(•)P₁(•)P₁(•)P₁(•) ... )] = E[- log(P₂(•)P₂(•)P₂(•)P₂(•) ...)]
n₁ E[- log(P₁(•))] = n₂ E[- log(P₂(•))]
n₁ E[- log(1/σ₁)] = n₂ E[- log(1/σ₂)]
n₁ log(σ₁) = n₂ log(σ₂)

Taking n₁, n₂ >> 1 while defining n₁ ≡ D/dD (in another abuse of notation where dD is an infinitesimal unit of demand) and n₂ ≡ S/dS, we can write

(D/dD) log(σ₁) = (S/dS) log(σ₂)

or

dD/dS = k D/S

where k ≡ log(σ₁)/log(σ₂) is the information transfer index. That's the information equilibrium condition.

...

PS In another abuse of notation, on this blog I frequently write:

I(D) = I(S)

where I should more technically write (in the notation above)

E[n₁ I(P₁(d))] = E[n₂ I(P₂(s))]

where d and s are random variables with distributions P₁ and P₂. Also note that these E's aren't economists' E operators, but rather ordinary expected values.

Info EQ 101

Also on my flight yesterday, I started writing up what I would say in a chalkboard lecture (or brown bag seminar) about information equilibrium. Update: see the addendum for a bit more on some issues glossed over in the first segment relating to comments from Ken Duda below.

...

At its heart, information equilibrium is about matching up probability distributions so that the probability distribution of demand matches up with the probability distribution of supply. More accurately, we'd say the information revealed by samples from one distribution is equal to the information revealed by samples from another. Let's say we have n_d demand widgets on one board and n_s supply widgets on another. The probability of a widget appearing on a given square of a board with σ squares is 1/σ, so the information in revealing a widget on a square is - log 1/σ = log σ. The information in n of those widgets is n log σ.

Let's say the information in the two boards is equal so that n_d log σ_d = n_s log σ_s. Take the number of demand widgets to be large so that a single widget is an infinitesimal dD; in that case we can write n_d = D/dD and n_s = S/dS.
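As a quick numerical check of the widget counting, here is a minimal sketch in Python. The function names `shannon_entropy` and `uniform_information` and the example numbers are illustrative choices, not anything from the original lecture; logs are taken base 2 so the results are in bits.

```python
import math

def shannon_entropy(probs, base=2.0):
    """Shannon entropy H = -sum p log p, using the convention 0 log 0 = 0."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p, base) for p in probs if p > 0)

def uniform_information(n, sigma, base=2.0):
    """Information in n symbols drawn uniformly from sigma states: n log sigma."""
    return n * math.log(sigma, base)

sigma = 8                                  # illustrative number of board squares
uniform = [1.0 / sigma] * sigma
h_uniform = shannon_entropy(uniform)       # uniform case reduces to log2(sigma)

skewed = [0.5, 0.25, 0.125, 0.125]         # illustrative non-uniform distribution
h_skewed = shannon_entropy(skewed)         # less than log2(4) = 2 bits

total = uniform_information(10, sigma)     # information in 10 widgets
```

With a uniform distribution the entropy per widget is just log σ (Hartley's definition), which is why the argument only needs to count widgets and board positions.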

...

Let's substitute these new infinitesimal relationships and rearrange to form a differential equation. Let's call the ratio of the information due to the number of board positions, log σ_d / log σ_s, the information transfer index k.

We say the derivative defines an abstract price P.

Note the key properties of this equation: it's a marginal relationship and it satisfies homogeneity of degree zero.

We'll call this an information equilibrium relationship and use the notation P : D ⇄ S.
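Written out, the steps described above look like the following. This is a sketch in the notation of this section (the scale factor a in the homogeneity check is an illustrative symbol), consistent with the condition dD/dS = k D/S derived earlier:

```latex
\begin{align}
n_d \log \sigma_d &= n_s \log \sigma_s \\
\frac{D}{dD} \log \sigma_d &= \frac{S}{dS} \log \sigma_s \\
P \equiv \frac{dD}{dS} &= k \, \frac{D}{S},
  \qquad k \equiv \frac{\log \sigma_d}{\log \sigma_s}
\end{align}

% Homogeneity of degree zero: rescale D -> a D and S -> a S
\begin{align}
\frac{d(aD)}{d(aS)} = \frac{dD}{dS},
  \qquad
k \, \frac{aD}{aS} = k \, \frac{D}{S}
\end{align}
```

Both sides are unchanged by the rescaling, which is the sense in which the equation is homogeneous of degree zero.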

...

Note that the distributions on our boards don't exactly have to match up. But you don't sell a widget if there's no demand and you don't sell as many widgets as you can (with no wasted widgets) unless you match the supply distribution with the demand distribution.

We can call the demand distribution the source distribution (or information source) and the supply distribution the destination distribution. The supply distribution functions as an approximation to the Platonic source distribution.

You could measure the information loss using the Kullback-Leibler divergence. However, information loss has a consequence for our differential equation.

...

Since the information in the source is the best a destination (receiver) can receive, the information in the demand distribution is in general greater than the information in the supply distribution (or more technically it takes extra bits to decode a D signal using S than it does using D). When these differ, we call this non-ideal information transfer.
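To make the "extra bits" statement concrete, here is a minimal sketch of the Kullback-Leibler divergence as the gap between a cross entropy and an entropy. The two three-state distributions are made-up illustrations (not data), and the function names are hypothetical; the supply distribution is assumed to be nonzero wherever demand is nonzero.

```python
import math

def entropy_bits(p):
    """Bits per symbol needed to encode draws from p with a code built for p."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy_bits(p, q):
    """Bits per symbol needed to encode draws from p with a code built for q.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence_bits(p, q):
    """Extra bits per symbol: cross entropy minus entropy."""
    return cross_entropy_bits(p, q) - entropy_bits(p)

demand = [0.5, 0.25, 0.25]   # hypothetical source (demand) distribution
supply = [0.25, 0.25, 0.5]   # hypothetical destination (supply) distribution

extra_bits = kl_divergence_bits(demand, supply)   # positive: distributions differ
no_loss = kl_divergence_bits(demand, demand)      # zero: distributions match
```

The divergence is zero only when the two distributions match, which corresponds to the ideal information transfer case.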

Non-ideal information transfer changes our differential equation into a differential inequality.

Which means (via Gronwall's inequality) that our solutions to the differential equation are just bounds on the non-ideal case.
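A sketch of how the inequality and the bound could be written, assuming D, S, dD, dS and k are all positive and using S_ref as a hypothetical reference value of supply (the direction of the inequality follows from the demand information being at least as large as the supply information):

```latex
\begin{align}
I(D) \geq I(S)
  \;&\Rightarrow\; \frac{D}{dD} \log \sigma_d \geq \frac{S}{dS} \log \sigma_s \\
  \;&\Rightarrow\; P \equiv \frac{dD}{dS} \leq k \, \frac{D}{S} \\
  \;&\Rightarrow\; D(S) \leq D(S_{ref}) \left( \frac{S}{S_{ref}} \right)^{k}
  \quad \text{for } S \geq S_{ref}
\end{align}
```

The last step is Gronwall's inequality in its differential form: if u'(t) ≤ β(t) u(t), then u(t) ≤ u(a) exp(∫ β) taken from a to t, here with u = D, t = S and β = k/S.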

What are those solutions?

...

The first solution is where you take both supply and demand to vary together. This corresponds to general equilibrium. The solution is just a power law.

If we say our supply variable is an exponential function of time with supply growth rate σ, then demand and price are also exponentials with growth rates δ ~ k σ and π ~ (k - 1) σ, respectively.
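Here's a small numerical sketch of the general equilibrium solution. It integrates dD/dS = k D/S with a midpoint rule, compares the result with the power law D = D_ref (S/S_ref)^k, and then checks the growth rates for exponentially growing supply. The parameter values (k = 1.5, a 3% supply growth rate, and so on) and the function names are arbitrary illustrations.

```python
import math

def integrate_info_eq(k, d_ref, s_ref, s_end, steps=10_000):
    """Integrate dD/dS = k D/S from s_ref to s_end with midpoint (RK2) steps."""
    h = (s_end - s_ref) / steps
    d, s = d_ref, s_ref
    for _ in range(steps):
        slope_start = k * d / s
        d_mid = d + 0.5 * h * slope_start
        slope_mid = k * d_mid / (s + 0.5 * h)
        d += h * slope_mid
        s += h
    return d

def power_law(k, d_ref, s_ref, s):
    """Closed-form general equilibrium solution D = d_ref (s / s_ref)^k."""
    return d_ref * (s / s_ref) ** k

def growth_rate(f, t0=0.0, t1=1.0):
    """Average log growth rate of f between t0 and t1."""
    return (math.log(f(t1)) - math.log(f(t0))) / (t1 - t0)

k, d_ref, s_ref = 1.5, 2.0, 1.0            # illustrative values
numeric = integrate_info_eq(k, d_ref, s_ref, 3.0)
exact = power_law(k, d_ref, s_ref, 3.0)

sigma_rate = 0.03                          # illustrative supply growth rate
supply = lambda t: s_ref * math.exp(sigma_rate * t)
demand = lambda t: power_law(k, d_ref, s_ref, supply(t))
price = lambda t: k * demand(t) / supply(t)   # P = dD/dS = k D/S

delta_rate = growth_rate(demand)           # should be k * sigma
pi_rate = growth_rate(price)               # should be (k - 1) * sigma
```

The price here is computed directly from the definition P = k D/S, so the (k - 1) σ growth rate is a consequence of the power law rather than an extra assumption.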

...

There are two other solutions we can get out of this equation. If we take supply to adjust more quickly than demand when it deviates from some initial value D0 (representing partial equilibrium in economics -- in thermodynamics, we'd say we're in contact with a "supply bath"), then we get a different exponential solution. The same goes for a demand bath and supply adjusting slowly.

Use ΔS and ΔD for S - S0 and D - D0, respectively.
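A sketch of the two partial equilibrium solutions, obtained by holding one variable fixed on the right-hand side of dD/dS = k D/S and integrating from the initial point (S0, D0). Taking both reference values equal to the initial values is a simplifying assumption made here for illustration:

```latex
% Demand held at D_0 on the right-hand side: a demand curve
\begin{align}
\frac{dD}{dS} = k \, \frac{D_0}{S}
  \;&\Rightarrow\;
  \Delta D = k D_0 \log \frac{S}{S_0} \\
P = k \, \frac{D_0}{S}
  &= \frac{k D_0}{S_0} \exp \left( - \frac{\Delta D}{k D_0} \right)
\end{align}

% Supply held at S_0 on the right-hand side: a supply curve
\begin{align}
\frac{dD}{dS} = k \, \frac{D}{S_0}
  \;&\Rightarrow\;
  \Delta S = \frac{S_0}{k} \log \frac{D}{D_0} \\
P = k \, \frac{D}{S_0}
  &= \frac{k D_0}{S_0} \exp \left( + \frac{k \, \Delta S}{S_0} \right)
\end{align}
```

Both curves pass through the price k D0/S0 at ΔD = 0 and ΔS = 0, with the first sloping down and the second sloping up.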

...

If we relate our partial equilibrium solutions to the definition of price we come up with relationships that give us supply and demand curves.

These should be interpreted in terms of price changes (they are shifts along the supply and demand curves). If price goes down, demand goes up. If price goes up, supply goes up.

Shifts of the curves involve changing the values of D0 and S0.

...

From this we recover the basic Marshallian supply and demand diagram with information transfer index k relating to the price elasticities.

Our general solution also appears on this graph, but for that one ΔS = 0 and ΔD = 0 since we're in general, not partial equilibrium. We'll relate this to the long run aggregate supply curve in a minute.

...

Note that if we have non-ideal information transfer, these solutions all become bounds on the market price, so the price can appear anywhere in this orange triangle.

If we take information equilibrium to hold approximately, we could get a price path (green) that has a bound (black). Normal growth here is punctuated by both a bout of more serious non-ideal information transfer (a recession?) and then a fast (and brief) change in supply or demand (a big discovery of oil or a carbon tax, respectively).

...

Since we really haven't specified anything about the widgets, we could easily take these to be aggregate demand widgets and aggregate supply widgets and P to be the price level.

We have the same solutions to the info eq diff eq again, with the supply curve representing the short run aggregate supply (SRAS) curve and the general equilibrium solution representing the long run aggregate supply (LRAS) curve.

...

What if we have a more realistic system where aggregate demand is in information equilibrium with money and money is in info eq with aggregate supply?

Using the chain rule, we can show that a new information equilibrium relationship holds between AD and money (the AS relationship drops out in equilibrium), with a new information transfer index.

And we have the same general eq solution to the information equilibrium condition where AD grows with the money supply and so does the price level.

A generic "quantity theory of money"

...

Let's say the money supply grows exponentially (as we did earlier) at a rate μ, inflation (price level growth) is π and nominal (AD) growth is ν.

Then π ~ (k - 1) μ and ν ~ k μ

Note that if k = 2, inflation equals the money supply growth rate.
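A tiny sketch of these two growth-rate relations, treating the "~" as an equality for illustration. The function name `quantity_theory_rates` and the 5% money growth rate are hypothetical choices.

```python
def quantity_theory_rates(k, mu):
    """Return (inflation, nominal growth) for index k and money growth rate mu."""
    inflation = (k - 1.0) * mu
    nominal_growth = k * mu
    return inflation, nominal_growth

mu = 0.05                                        # illustrative money growth rate
pi_k2, nu_k2 = quantity_theory_rates(2.0, mu)    # inflation equals mu when k = 2
pi_k1, nu_k1 = quantity_theory_rates(1.0, mu)    # no inflation when k = 1
```

For k = 2 the inflation rate comes out equal to the money growth rate, matching the note above.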

What else?

...

Let's say nominal growth is ν = ρ + π, where ρ is real growth and look at the ratio ν/π and write it in terms of the information transfer index and the growth rate of the money supply (which drops out).
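Using π ~ (k - 1) μ and ν ~ k μ from above (with "~" treated as equality for the sketch), the ratio works out to:

```latex
\begin{align}
\frac{\nu}{\pi} = \frac{\rho + \pi}{\pi}
  = \frac{k \mu}{(k - 1) \mu}
  = \frac{k}{k - 1}
\qquad \Rightarrow \qquad
\frac{\rho}{\pi} = \frac{1}{k - 1}
\end{align}
```

The money growth rate μ cancels, so the ratio depends only on the information transfer index.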

If k is very large, then ν ≈ π, which implies that real growth ρ is small compared to inflation. That means a large information transfer index is a high inflation limit.

Conversely, if the information transfer index is about 1, then the price level is roughly constant (the time dependence drops out to leading order). A low inflation limit.

List of standard economics derived from information equilibrium

Here is a (possibly incomplete, and hopefully growing) list of standard economic results I have derived from information equilibrium. It will serve as a reference post. This does not mean these results are "correct", only that they exist given certain assumptions. For example, the quantity theory of money is only really approximately true if inflation is "high". Another way to say this is that information equilibrium includes these results of standard economics and could reduce to them in certain limits (like how quantum mechanics reduces to Newtonian physics for large objects).

In a sense, this is supposed to serve as an acknowledgement (or evidence) that information equilibrium has a connection to mainstream economics ... and that it's not completely crackpottery.

Supply and demand

http://informationtransfereconomics.blogspot.com/2013/04/supply-and-demand-from-information.html

Price elasticities

http://informationtransfereconomics.blogspot.com/2013/04/the-previous-post-with-more-words-and.html

Comparative advantage

http://informationtransfereconomics.blogspot.com/2016/04/comparative-advantage-from-maximum.html

AD-AS model

http://informationtransfereconomics.blogspot.com/2015/04/what-does-ad-as-model-mean.html

IS-LM

http://informationtransfereconomics.blogspot.com/2013/08/deriving-is-lm-model-from-information.html
http://informationtransfereconomics.blogspot.com/2014/03/the-islm-model-again.html
http://informationtransfereconomics.blogspot.com/2016/02/the-is-lm-model-as-effective-theory-at.html

Quantity theory of money

http://informationtransfereconomics.blogspot.com/2013/07/recovering-quantity-theory-from.html
http://informationtransfereconomics.blogspot.com/2015/05/money-defined-as-information-mediation.html

Cobb-Douglas functions

http://informationtransfereconomics.blogspot.com/2014/05/more-on-cobb-douglas-functions-and.html

Solow growth model

http://informationtransfereconomics.blogspot.com/2014/12/the-information-transfer-solow-growth.html
http://informationtransfereconomics.blogspot.com/2015/05/the-rest-of-solow-model.html
http://informationtransfereconomics.blogspot.com/2015/05/dynamics-of-savings-rate-and-solow-is-lm.html

The Kaldor facts as information equilibrium relationships

http://informationtransfereconomics.blogspot.com/2016/09/the-kaldor-facts.html

Gravity models

http://informationtransfereconomics.blogspot.com/2015/09/information-equilibrium-and-gravity.html

Utility maximization

http://informationtransfereconomics.blogspot.com/2015/03/utility-in-information-equilibrium-model.html

Asset pricing equation

http://informationtransfereconomics.blogspot.com/2015/05/the-basic-asset-pricing-equation-as.html

Euler equation

http://informationtransfereconomics.blogspot.com/2015/06/the-euler-equation-as-maximum-entropy.html

DSGE models

http://informationtransfereconomics.blogspot.com/2016/08/dsge-part-1.html
http://informationtransfereconomics.blogspot.com/2016/08/dsge-part-2.html
http://informationtransfereconomics.blogspot.com/2016/08/dsge-part-3-stochastic-interlude.html
http://informationtransfereconomics.blogspot.com/2016/08/dsge-part-4.html
http://informationtransfereconomics.blogspot.com/2016/08/dsge-part-5-summary.html

MINIMAC

http://informationtransfereconomics.blogspot.com/2015/06/minimac-as-information-equilibrium-model.html

Mundell-Fleming as Metzler diagram

http://informationtransfereconomics.blogspot.com/2016/06/metzler-diagrams-from-information.html

Diamond-Dybvig

http://informationtransfereconomics.blogspot.com/2015/04/diamond-dybvig-as-maximum-entropy-model.html

"Econ 101" effects of price ceilings or floors

http://informationtransfereconomics.blogspot.com/2016/05/what-happens-when-you-push-onprice.html

Cagan model

http://informationtransfereconomics.blogspot.com/2016/08/the-economy-at-end-of-universe-part-ii.html

Lucas Islands model

http://informationtransfereconomics.blogspot.com/2015/04/towards-information-equilibrium-take-on.html

Supply and demand from information transfer

At this point we will take our information transfer process and apply it to the economic problem of supply and demand. In that case, we will identify the information process source as the demand $Q^d$, the information transfer process destination as the supply $Q^s$, and the process signal detector as the price $p$. The price detector relates the demand signal $\delta Q^d$ emitted from the demand $Q^d$ to a supply signal $\delta Q^s$ that is detected at the supply $Q^s$ and delivers a price $P$.

We translate Condition 1 in [1] for the applicability of our information theoretical description into the language of supply and demand:

Condition 1: The considered economic process can be sufficiently described by only two independent process variables (supply and demand: $Q^d, Q^s$) and is able to transfer information.

We are now going to look for functions $\langle Q^s \rangle = F(Q^d)$ or $\langle Q^d \rangle = F(Q^s)$ where the angle brackets denote an expected value. But first we assume ideal information transfer $I_{Q^s} = I_{Q^d}$ such that:

$$P = \frac{1}{\kappa} \frac{Q^d}{Q^s} \tag{4}$$

$$\frac{dQ^d}{dQ^s} = \frac{1}{\kappa} \frac{Q^d}{Q^s} \tag{5}$$

Note that Eq. (4) represents movement of the supply and demand curves where $Q^d$ is a "floating" information source (in the language of Ref [1]), as opposed to movement along the supply and demand curves where $Q^d = Q^d_0$ is a "constant information source".

If we do take $Q^d = Q^d_0$ to be a constant information source and integrate the differential equation Eq. (5)

$$\frac{\kappa}{Q^d_0} \int_{Q^d_{ref}}^{Q^d} d(Q^d)' = \int_{Q^s_{ref}}^{\langle Q^s \rangle} \frac{1}{Q^s} \, d(Q^s) \tag{6}$$

We find

$$\Delta Q^d = Q^d - Q^d_{ref} = \frac{Q^d_0}{\kappa} \log \left( \frac{\langle Q^s \rangle}{Q^s_{ref}} \right) \tag{7}$$

Equation (7) represents movement along the demand curve, and the equilibrium price $P$ moves according to Eq. (4) based on the expected value of the supply and our constant demand source:

$$P = \frac{1}{\kappa} \frac{Q^d_0}{\langle Q^s \rangle} \tag{8a}$$

$$\Delta Q^d = \frac{Q^d_0}{\kappa} \log \left( \frac{\langle Q^s \rangle}{Q^s_{ref}} \right) \tag{8b}$$

Equations (8a,b) define a demand curve. A family of demand curves can be generated by taking different values for $Q^d_0$ assuming a constant information transfer index $\kappa$.

Analogously, we can define a supply curve by using a constant information destination $Q^s_0$ and follow the above procedure to find:

$$P = \frac{1}{\kappa} \frac{\langle Q^d \rangle}{Q^s_0} \tag{9a}$$

$$\Delta Q^s = \kappa Q^s_0 \log \left( \frac{\langle Q^d \rangle}{Q^d_{ref}} \right) \tag{9b}$$

So that equations (9a,b) define a supply curve. Again, a family of supply curves can be generated by taking different values for $Q^s_0$.

Note that equations (8) and (9) linearize (Taylor series around $Q^x = Q^x_{ref}$)

$$Q^d = Q^d_{ref} + \frac{Q^d_0}{\kappa} - Q^s_{ref} P$$

$$Q^s = Q^s_{ref} - \kappa Q^s_0 + \frac{(Q^s_0)^2 \kappa^2}{Q^d_{ref}} P$$

plus terms of order $(Q^x)^2$ such that

$$Q^d = \alpha - \beta P$$

$$Q^s = \gamma + \delta P$$

where $\alpha = Q^d_{ref} + Q^d_0/\kappa$, $\beta = Q^s_{ref}$, $\gamma = Q^s_{ref} - \kappa Q^s_0$ and $\delta = \kappa^2 (Q^s_0)^2 / Q^d_{ref}$. This recovers a simple linear model of supply and demand (where you can add a time dependence to the price e.g. $\frac{dP}{dt} \propto Q^s - Q^d$).

We can explicitly show the supply and demand curves using equations (8a,b) and (9a,b) and plotting price $P$ vs change in quantity $\Delta Q^x = \Delta Q^s$ or $\Delta Q^d$. Here we take $\kappa = 1$ and $Q^x_{ref} = 1$ and show a few curves of $Q^x_0 = 1 \pm 0.1$. For example, for $x = s$ and +0.1, we are shifting the supply curve to the right. In the figure we show a shift in the supply curve (red) to the right and to the left (top two graphs) and a shift in the demand curve (blue) to the right and to the left (bottom two graphs). The new equilibrium price is the intersection of the new colored (supply or demand) curve and the unchanged (demand or supply, respectively) curve.
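Since the figure is easiest to understand by recomputing it, here is a minimal sketch that evaluates the curves from equations (8a,b) and (9a,b) and locates their intersection. Eliminating the expected values gives price as a function of the change in quantity; the function names, the shared ΔQ axis, and the bisection bracket are assumptions made for this illustration.

```python
import math

def demand_price(delta_q, q0_d=1.0, q_ref_s=1.0, kappa=1.0):
    """Price along a demand curve: Eqs. (8a,b) with <Q^s> eliminated."""
    return q0_d / (kappa * q_ref_s) * math.exp(-kappa * delta_q / q0_d)

def supply_price(delta_q, q0_s=1.0, q_ref_d=1.0, kappa=1.0):
    """Price along a supply curve: Eqs. (9a,b) with <Q^d> eliminated."""
    return q_ref_d / (kappa * q0_s) * math.exp(delta_q / (kappa * q0_s))

def equilibrium(q0_d=1.0, q0_s=1.0, lo=-1.0, hi=1.0, tol=1e-12):
    """Bisect for the change in quantity at which the two curves cross."""
    def excess(dq):
        return demand_price(dq, q0_d=q0_d) - supply_price(dq, q0_s=q0_s)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:   # demand price still above supply price
            lo = mid
        else:
            hi = mid
    dq = 0.5 * (lo + hi)
    return dq, demand_price(dq, q0_d=q0_d)

base_dq, base_price = equilibrium()                       # kappa = 1, Q_ref = 1, Q_0 = 1
supply_dq, supply_shift_price = equilibrium(q0_s=1.1)     # supply curve shifted right
demand_dq, demand_shift_price = equilibrium(q0_d=1.1)     # demand curve shifted right
```

Shifting the supply curve to the right lowers the intersection price and raises the quantity, while shifting the demand curve to the right raises both, which is the usual Marshallian result.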

References

[1] P. Fielitz, G. Borchardt, Information transfer model of natural processes: from the ideal gas law to the distance dependent redshift, http://arxiv.org/abs/0905.0610v2
[2] http://en.wikipedia.org/wiki/Gronwall's_inequality
[3] http://en.wikipedia.org/wiki/Noisy_channel_coding_theorem#Mathematical_statement
[4] http://en.wikipedia.org/wiki/Entropic_force
[5] http://en.wikipedia.org/wiki/Sticky_(economics)

The previous post with more words and fewer equations

The idea behind the information transfer model is that what is called "demand" in economics is essentially a source of information that is being transmitted to the "supply", a receiver, and the thing measuring the information transfer is what we call the "price".

Choosing constant information sources (i.e. keeping demand constant) or constant information destinations (i.e. a fixed quantity supplied) allows you to trace out supply and demand curves and the movement of those curves (allowing the information source and destination to vary) recovers the Marshall model. We can see diminishing marginal utility in the downward sloping demand curves; this comes from the definition of the "detector" measuring the price having supply in the denominator which follows from the identification of the demand as the information source.

Note that since Fielitz and Borchardt originally described physical processes with this information transfer model, we can make an analogy between economics and thermodynamics, specifically ideal gases:

● Price is analogous to the pressure of an ideal gas
● Demand is analogous to the work done by an ideal gas (and is related to temperature and energy content)
● Supply is analogous to the volume of an ideal gas

The equation relating the price, supply and demand is analogous to the ideal gas law.

One point to make here is that we haven't made any description of how the demand behaves over time, just how it behaves under small perturbations from "equilibrium" -- by which we mean a constant price defined by the intersection of a given supply and demand curve (with ideal information transfer). Humans will decide they don't want e.g. desktop PCs anymore (because of tablets or laptops or whatever reason) and the demand will drop. The economy (and population) grow.

Economists frequently use supply and demand diagrams to describe models or specific shocks and this information transfer framework recovers that logic. In the future, I would like to see where else we can take the model. In the next post, I will show how you could get (downward) sticky prices from this model by looking at non-ideal information transfer.

PS If we use the linearized version of the supply and demand relationship near the equilibrium price, we can find the (short run) price elasticities from

$$Q^d = Q^d_{ref} + \frac{Q^d_0}{\kappa} - Q^s_{ref} P$$

$$Q^s = Q^s_{ref} - \kappa Q^s_0 + \frac{(Q^s_0)^2 \kappa^2}{Q^d_{ref}} P$$

Such that

$$e^d = \frac{dQ^d/Q^d}{dP/P} = \frac{\kappa Q^d - Q^d_0 - \kappa Q^d_{ref}}{\kappa Q^d}$$

Expanding around $\Delta Q^d = Q^d - Q^d_{ref}$

$$e^d \simeq -\frac{Q^d_0}{\kappa Q^d_{ref}} + O(\Delta Q^d)$$

And analogously

$$e^s \simeq \frac{\kappa Q^s_0}{Q^s_{ref}} + O(\Delta Q^s)$$

From which we can measure $\kappa$.
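As a sanity check on these elasticities, here is a minimal sketch that inverts the curves from equations (8a,b) and (9a,b) to get quantity as a function of price and then estimates the elasticity by a central difference at the reference point. All parameter values and function names are hypothetical illustrations.

```python
import math

def demand_quantity(price, q0_d, q_ref_d, q_ref_s, kappa):
    """Q^d from Eqs. (8a,b): <Q^s> = Q^d_0 / (kappa P) substituted into (8b)."""
    return q_ref_d + (q0_d / kappa) * math.log(q0_d / (kappa * q_ref_s * price))

def supply_quantity(price, q0_s, q_ref_d, q_ref_s, kappa):
    """Q^s from Eqs. (9a,b): <Q^d> = kappa Q^s_0 P substituted into (9b)."""
    return q_ref_s + kappa * q0_s * math.log(kappa * q0_s * price / q_ref_d)

def elasticity(quantity_fn, price, rel_step=1e-6):
    """Central-difference estimate of (dQ/Q)/(dP/P) at the given price."""
    up = quantity_fn(price * (1 + rel_step))
    down = quantity_fn(price * (1 - rel_step))
    return (up - down) / (2 * rel_step * quantity_fn(price))

# Hypothetical parameter values chosen only for illustration
kappa, q0_d, q0_s, q_ref_d, q_ref_s = 0.8, 1.2, 0.9, 1.0, 1.0

# Prices at which each curve passes through its reference quantity (Delta Q = 0)
p_demand_ref = q0_d / (kappa * q_ref_s)
p_supply_ref = q_ref_d / (kappa * q0_s)

e_d = elasticity(lambda p: demand_quantity(p, q0_d, q_ref_d, q_ref_s, kappa), p_demand_ref)
e_s = elasticity(lambda p: supply_quantity(p, q0_s, q_ref_d, q_ref_s, kappa), p_supply_ref)

# Recover kappa from the measured demand elasticity
kappa_from_e_d = -q0_d / (e_d * q_ref_d)
```

The numerical elasticities agree with the leading-order expressions above at the reference point, and inverting the demand elasticity gives back the κ that was put in.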

(Note, I said fewer equations.)

Comparative advantage from maximum entropy

Paul Krugman has a post that is mostly about what using basic economic arguments to advocate policy positions glosses over, but in it he mentions comparative advantage:

Comparative advantage says that countries are made richer by international trade, even if one trading partner is more productive than the other across the board, and the less productive country can only export thanks to low wages. Paul Samuelson once declared this the prime example of an economic insight that is true without being obvious – and to this day you get furious attempts to refute the concept. So comparative advantage has, for generations, been considered one of the crown jewels of economic analysis.

Crown jewel, huh?

Sounds like an excellent example to show the power of a maximum entropy argument. Let's start with the example at Wikipedia (following Ricardo) where England and Portugal both produce wine and cloth, and Portugal has an absolute advantage in producing both goods. Here's the table from Wikipedia:

Per the article, we'll take Portugal to have 170 labor units available (defining the state space or opportunity set) and England to have 220. The maximum entropy solution for the two nations separately are indicated with the green (Portugal) and blue (England) points:

I indicated the additional consumption possibilities for Portugal because of trade in the green shaded area. As you can see the production opportunity sets are bounded by the green and blue lines, but the consumption opportunity set is bounded by the black dashed line. This means the maximum entropy (i.e. expected) consumption for both nations is the black point (actually two black points on top of each other -- England gets a bit more wine and Portugal gets a bit more cloth). Already, you can see this is a bit further out, as more wine and cloth in total is consumed.

Now we ask: what are the points in the production possibility set that are consistent with this consumption level? They are given here:

In a sense, the consumption possibilities inside the black dashed line make the production possibility points shown above more likely when considered as a joint probability. That is to say, the most likely position for England's production is the first blue point assuming it can only consume what it produces. This changes if we ask what the most likely position of England's production is given it can consume a fraction of what both countries produce.

As you can see Portugal produces more wine and England produces more cloth -- the goods each nation has a comparative advantage (but not absolute advantage) to produce. When trade is allowed, the production of each nation moves to a new information equilibrium:

...

Update 25 April 2016

In checking the results for material in the book (I'm at 10,000 words, 1/3 of the way to my 30,000 word goal), I realized that something I did to speed up the calculation actually made the comparative advantage result less interesting by eliminating a bunch of valid states where England specializes in wine and Portugal specializes in cloth.

It turns out that trade is still likely, even in the case where one nation is more productive across the board, but it is only marginally more likely that one nation will specialize in the good for which they have a comparative advantage.

In the England-Portugal example, it is only slightly more likely that England would specialize in cloth. There is a significant probability that England could specialize in wine (at least if the numbers were given as they are above).

This is interesting because it produces an evolutionary mechanism for trade. A slight advantage turns into a greater and greater advantage over time. This fits more into the idea of agglomeration (increasing returns to scale).

Additionally it explains why e.g. bacteria would trade metabolites without resorting to rational agents.

Here are the opportunity sets consistent with the maximum entropy consumption solution:

And here is the combined diagram:

Update + 1 hour

Also, this is more consistent with the data. See this paper [pdf]. Here's a quote:

While the slope coefficient falls short of its theoretical value (one), it remains positive and statistically significant.

What does the AD-AS model mean?

Cameron Murray has a new post up about logical fallacies and contradictions involved in setting up the AD-AS model in macroeconomics. I personally like the AD-AS model as a toy model of how an economy works. It misses out on a bunch of details, but I think it forms a fine basis from which to depart with more detailed models.

The focus of Murray's post is the fallacy of composition, which I've seen used as a rhetorical device in many instances (the sum of government spending effects on local spending doesn't mean there is an aggregate effect, or prudent increased saving by individuals isn't prudent for the overall economic situation in the paradox of thrift). As a physicist, I've always thought of it as a strange rhetorical device. In physics we have large numbers of examples where the fallacy applies, but it is never used. I think the reason it is never used is that in general there is a specific effect at work and we'd refer to that effect instead of the "fallacy of composition" -- quark confinement, entropic forces, emergent dimensions in string theory, pretty much all of materials science.

I think the fallacy of composition would better be called the warning of composition -- an idea that warns you of:

● Effects that might go away at the macro scale (an example is the SMD theorem, and on this blog most of the details of how economic agents operate)
● Effects that might not exist at the micro scale, but do at the macro scale ("entropic forces", emergent properties, and on this blog nominal rigidity)

The warning of composition can help prevent you from making unwarranted jumps in logic. But sometimes those jumps are warranted (or you have explicit machinery for adding the effects together).

So let's look at the AD-AS model in the information equilibrium framework. It essentially lives entirely on the macro scale, so there isn't any fallacy of composition. We instead have failures of information equilibrium, exceptions and other micro effects.

The basic set up starts with the main information equilibrium condition for the information in the aggregate demand in information equilibrium with aggregate supply I(AD) = I(AS), so that we have the differential equation:

$$P = \frac{dAD}{dAS} = k \, \frac{AD}{AS}$$

Note that this is just the minimal (interesting) differential equation consistent with long run neutrality (homogeneity of degree zero in supply and demand functions). There are three solutions we look at here:

1. Both AD and AS vary (general equilibrium; for example in a growth model)
2. AD is held constant (partial equilibrium; the system is in contact with a "demand bath" -- a demand curve)
3. AS is held constant (partial equilibrium; the system is in contact with a "supply bath" -- a supply curve)

The solution to the first case is given by (you can ignore the angle brackets -- they represent expectation values in an ensemble of agents [1])

$$\langle AD \rangle = a_0 \left( \frac{\langle AS \rangle}{s_0} \right)^{k}$$

$$P = \frac{a_0}{k} \left( \frac{\langle AS \rangle}{s_0} \right)^{k-1}$$

The other two solutions are families of supply and demand curves (where we define $\Delta AS = AS - s_0$ and $\Delta AD = AD - a_0$) parameterized by $\beta$ and $\alpha$ respectively so that

$$P = \frac{a_0}{k \beta} \exp \left( + \frac{\Delta AS}{\beta k} \right)$$

$$P = \frac{\alpha}{k s_0} \exp \left( - k \frac{\Delta AD}{\alpha} \right)$$

Shifts in the supply and demand curves are shifts in $\beta$ and $\alpha$. Here are the resulting supply and demand diagrams (there are more details about this derivation here; the expectation values $\langle AD \rangle$ and $\langle AS \rangle$ parameterize the location along the supply and demand curves, respectively, and drop out in the price functions above):

So now let's try and address some of Cameron Murray's problems with AD-AS ...

interpreted as showing Real GDP related to the rate of inflation in a period (not the level, but the rate of change of price levels relative to a base year). This overcomes the fallacy of composition because it compares the price level with last year’s price level, but contradicts earlier discussions about anticipated vs unanticipated inflation. If inflation is anticipated at any reasonable level then there should be no economic effect, and hence no relationship (no curve, or a vertical line at best).

A really good way to clearly see some of the properties of the information equilibrium version of the AD-AS model is to take $k = 1$. This is just a parameter, so we can set it to whatever we'd like and what is left should still express the fundamental properties of the model (as opposed to parameter dependent ones). In that case we have (I'll drop the angle brackets for now [1]):

(1a) $AD = a_0 \frac{AS}{s_0}$

(1b) $P = a_0$

Note that $P$ is a constant. The AD-AS model has nothing to do with trend inflation or the absolute price level. The absolute (level) version of the model is trivial [2]. If $k > 1$ then all that adds is a trend growth and trend inflation.

All of the real action is in the partial equilibrium (supply and demand curves), which do not materially change when $k = 1$:

(2a) $P = \frac{a_0}{\beta} \exp\left( +\frac{\Delta AS}{\beta} \right)$

(2b) $P = \frac{\alpha}{s_0} \exp\left( -\frac{\Delta AD}{\alpha} \right)$
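To make the $k = 1$ curves concrete, here is a minimal Python sketch that just evaluates equations (2a) and (2b). The function names and the parameter values (everything set to 1, with one shifted $\alpha$) are illustrative assumptions, not fitted numbers.

```python
import math

def supply_curve_price(delta_as, a0=1.0, beta=1.0):
    """Eq. (2a) with k = 1: P = (a0 / beta) * exp(+dAS / beta)."""
    return (a0 / beta) * math.exp(delta_as / beta)

def demand_curve_price(delta_ad, s0=1.0, alpha=1.0):
    """Eq. (2b) with k = 1: P = (alpha / s0) * exp(-dAD / alpha)."""
    return (alpha / s0) * math.exp(-delta_ad / alpha)

# Moving along each curve (hypothetical displacements from the reference point).
displacements = [-0.5, 0.0, 0.5]
supply_prices = [supply_curve_price(d) for d in displacements]
demand_prices = [demand_curve_price(d) for d in displacements]

# A shift of the demand curve is a change in alpha (assumed value 1.2 here).
shifted_demand_price = demand_curve_price(0.0, alpha=1.2)
```

Moving along the supply curve raises the price and moving along the demand curve lowers it, while changing $\alpha$ or $\beta$ moves the whole curve.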

This prepares us for Murray's point 2:

"As income rises consumption will rise, and as income falls, consumption will fall (p269) ... In aggregate the total consumption (or demand) in the economy is equal to the total incomes in the economy by definition, since someone's consumption is somebody else's income. This sentence, allowing from proper aggregation that avoids the fallacy of composition, merely says that aggregate incomes and consumption just rise and fall together since they are equal at any point in time by definition."

There is a difference between a rise in a price in general equilibrium (Eqs. (1a, b)) and a rise in the price in partial equilibrium (Eqs. (2a, b)). The former case (which is the one referred to in the quote from p269) is essentially a change in $a_0$ in equation (1b), or trend inflation if $k > 1$.

Murray's subsequent comments involve combining the AD-AS model with some model of interest rates, which is beyond the scope of this one post (here's an example in the IS-LM model). He does at the top of his post say:

"[No economics professors] could explain what the concept of the price level in the aggregate even is, nor what mechanism was meant to be at play in generating the relationship between price level and output."

Within the information equilibrium version of the AD-AS model we have an answer. But it completely ignores the "warning" of composition. The AD-AS model at its heart is a simple aggregation of millions of supply and demand diagrams for each product in a large economy. The price level in aggregate is simply the sum of all prices and the mechanism for the relationship between $P$, $AD$ and $AS$ is just supply and demand operating on the grand scale. If there is more demand for aggregate output than there is aggregate output that exists, prices in general will rise. If there is a fall in demand, prices in general will fall.

It's a simple model! The mechanism is the same as supply and demand ... now there may be some issues with the reasoning behind that ... [3]

Update 4/6/2015

Cameron Murray was right that the AD-AS model presented above doesn't directly address the fallacy of composition; I commented back on his blog and I've attached the comment below for reference here ...

I guess I wrote up the AD-AS model post a bit too quickly and really only addressed the fallacy of composition with a single sentence: "[the model] essentially lives entirely on the macro scale, so there isn't any fallacy of composition." That is to say the model makes no specific assumptions about the details of household or firm behavior. You could interpret that in a couple of ways, though:

1. Micro behavior is approximately random
2. Micro behavior is so complex it appears random
3. Any detailed micro behavior is subject to the fallacy of composition -- after being filtered through the aggregation process, all that's left looks random

'Random' here is being used in the thermodynamics sense -- I'm not saying households are irrational, increasing or decreasing their holdings of liquid assets at random month to month. Each atom in a gas obeys very strict physical laws like momentum and energy conservation. Likewise each economic agent could have some kind of deterministic micro behavior. Just like we don't know the history of collisions for an atom in a gas, we don't know the history of financial transactions of one household. Any given household will be increasing or decreasing its holdings of liquid assets at any given time because of that history. Most of the time these are in "detailed balance": households increasing equals households decreasing.

If e.g. inflation falls, there could be a tendency on average for those holdings to increase (based on millions of household decisions consistent with millions of financial histories) due to a small imbalance, aggregated up to the macro level. However, any individual household wouldn't point to the falling inflation. And it's not even true that inflation is influencing household behavior -- falling inflation is simply the most likely macro state where liquid asset holdings are increasing (in the AD-AS model). Since there are millions of households, the law of large numbers kicks in.

In that way, the AD-AS model I present at the link is a macro-only model. Higher inflation doesn't cause individual households to increase their demand for iPads. The higher inflation macro state is consistent (in the AD-AS model) with a macro state in which more iPads are being consumed.

Update II 4/6/2015

I should also add that the general equilibrium solution -- #1 in the list at the top and equations (1a,b) -- essentially defines the "long run aggregate supply" (LRAS) curve while equation (2a) defines the "short run aggregate supply" (SRAS) curve, so the full diagram is:

Footnotes

[1] The angle brackets are a bit like how in thermodynamics one takes volume to be $V = \langle V \rangle$. Volume doesn't exist for an individual atom, but only makes sense as a property of an ensemble of atoms. The same goes for demand in the information equilibrium model -- it is not a property of an individual agent, but an ensemble of agents.

[2] If we say AS is made of components, like capital and labor, then it gets a bit more interesting and you arrive at the Cobb-Douglas production function in the Solow growth model.

[3] Supply and demand seem to be entropic forces (and supply and demand diagrams are entropic force diagrams) that may defy a microscopic description. For example, here's one way to get something that looks like utility but really is just random behavior.

What does the AD-AS model mean?

I think the fallacy of composition would better be called the warning of composition -- an idea that warns you of: ● ●

The warning of composition can help prevent you from making unwarranted jumps in logic. But sometimes those jumps are warranted (or you have explicit machinery for adding the effects together).



Deriving the IS-LM model from information theory

I would like to use this derivation to illustrate a point: the information transfer framework is more general than the specific application to a quantity theory of money that has made up the bulk of the blog posts over the past month or so. The framework allows you to build supply-and-demand based models in a rigorous way. I will use it here to build the IS-LM model.

The IS-LM model attempts to explain the macroeconomy as the interaction between two markets: the Investment-Savings (goods) market and the Liquidity-Money Supply (money) market. The former effectively models the demand for goods with the interest rate functioning as the price (with what I can only guess is "aggregate supply" acting as the supply). The latter effectively models the demand for money with the interest rate functioning as the price (with the money supply acting as the supply). In the most basic version of the model, there is no real distinction made between the nominal and real interest rate.

Economists might find my "acting as the supply" language funny. I am only using it because in the information transfer framework, we have to know where the information source is transferring information: "the supply" is the destination. In our case, we are looking at two markets with a single constant information source (the aggregate demand) transferring information to the money supply (in the LM market) and the aggregate supply (in the IS market) via the interest rate (a single information transfer detector). The equation that governs this process is given by Equations (8a,b) in this post:

(8a) $P = \frac{1}{\kappa} \frac{Q^d_0}{\langle Q^s \rangle}$

(8b) $\Delta Q^d = \frac{Q^d_0}{\kappa} \log\left( \frac{\langle Q^s \rangle}{Q^s_{ref}} \right)$

However, each market employs these equations differently. The IS market is a fairly straightforward application. The price $P$ is replaced with the interest rate $r$, and the constant information source $Q^d_0$ becomes the equilibrium aggregate demand/output $Y_0$ (although we will also take it to be $Y_0 \to Y_0 + \Delta G$ in order to show the effects of a boost in government spending, which shifts the IS curve outward). The expected aggregate supply, put in the place of $\langle Q^s \rangle$, is the variable used to trace out the IS curve. It can be eliminated to give a relationship between the interest rate and the change in $Y$ ($\Delta Y$ put in the place of $\Delta Q^d$). Thus we obtain

$$\log r = \log \frac{Y_0}{\kappa_{IS} IS_{ref}} - \kappa_{IS} \frac{\Delta Y}{Y_0}$$

The LM market employs an equilibrium condition in addition to Equations (8a,b), setting $\Delta Q^s = \Delta Q^d$ via the money supply $\Delta Q^s = \Delta M$ (this selects a point on the money demand curve). The constant information source $Q^d_0$ is still the equilibrium aggregate demand/output $Y_0$, but in the LM market we look at the curve traced out by the equilibrium point for shifts of the money demand curve (changing the "constant" information source, $Y_0 \to Y_0 + \Delta Y$). These two pieces of information allow us to write down the LM market equation:

$$\log r = \log \frac{Y_0 + \Delta Y}{\kappa_{LM} LM_{ref}} - \kappa_{LM} \frac{\Delta M}{Y_0 + \Delta Y}$$

Plotting both of these equations we obtain the IS-LM diagram which behaves as it should for monetary and fiscal expansion:

In both cases, $\kappa_{xx}$ and $XX_{ref}$ are constants that can be used to fit the model to data (I basically set them all to 1 because all I want to show here is behavior). The interest rate and output are in arbitrary units (effectively set by the constants).
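As a rough check of that behavior, the two curves can be evaluated numerically. This is only a sketch under the same "set all constants to 1" choice; the function names, the bisection bracket and the shock sizes (0.2) are assumptions made for illustration.

```python
import math

def is_log_rate(d_y, y0=1.0, d_g=0.0, kappa_is=1.0, is_ref=1.0):
    """IS curve: log r = log(Y0 / (kappa_IS IS_ref)) - kappa_IS dY / Y0, with Y0 -> Y0 + dG."""
    source = y0 + d_g
    return math.log(source / (kappa_is * is_ref)) - kappa_is * d_y / source

def lm_log_rate(d_y, d_m=0.0, y0=1.0, kappa_lm=1.0, lm_ref=1.0):
    """LM curve: log r = log((Y0 + dY) / (kappa_LM LM_ref)) - kappa_LM dM / (Y0 + dY)."""
    source = y0 + d_y
    return math.log(source / (kappa_lm * lm_ref)) - kappa_lm * d_m / source

def is_lm_equilibrium(d_g=0.0, d_m=0.0, lo=-0.5, hi=2.0):
    """Find the dY where the curves cross by bisection (IS falls, LM rises in dY)."""
    def gap(d_y):
        return is_log_rate(d_y, d_g=d_g) - lm_log_rate(d_y, d_m=d_m)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    d_y = 0.5 * (lo + hi)
    return d_y, math.exp(is_log_rate(d_y, d_g=d_g))

baseline = is_lm_equilibrium()          # (dY, r) with no shocks
fiscal = is_lm_equilibrium(d_g=0.2)     # hypothetical fiscal expansion
monetary = is_lm_equilibrium(d_m=0.2)   # hypothetical monetary expansion
```

With these assumed values, fiscal expansion raises both output and the interest rate, while monetary expansion raises output and lowers the interest rate.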

As an aside, there is an interesting effect in the model. It basically breaks down if $r = 0$ (in the thermodynamic analogy, it is like trying to describe a zero pressure system -- it doesn't have any particles in it). As it approaches zero, the LM curve (and the IS curve) flatten out, producing the liquidity trap effect in the IS-LM model as popularized by Paul Krugman. Here is the graph for a close approach to zero:

This is not to say the zero lower bound problem is "correct" any more than the IS-LM model is "correct". The results here only say that the IS-LM model is a perfectly acceptable model in the information transfer framework, which serves more to validate the framework (since IS-LM is an accepted part of economic theory ... economists may disagree whether it describes economic reality, but they agree that it e.g. belongs in economic textbooks).

What use is couching the IS-LM model in information theory? In my personal opinion, this is far more rigorous than how the model appears in economics. It is also possible information theory could help give a new source of intuition. To that end, let me describe the IS-LM model in the language of information theory:

Aggregate demand acts as a constant information source sending a signal detected by the interest rate to both the aggregate supply and the money supply. Changes in aggregate demand are registered as changes in the information source in the LM market, but are registered in the response of the aggregate supply in the IS market [1]. Aggregate supply shifts to bring equilibrium to the IS market (the supply reads the information change), but M is set by the central bank and so does not automatically adjust. This creates a disequilibrium situation in which $I_{AD} = I_{AS}$ but $I_{AD} \neq I_M$; in order to restore equilibrium, either AD must return to its previous level or M must adjust (adjusting the interest rate) [2]. This defines what a recession is in the IS-LM model: a failure of the central bank to receive information (in information theory, we must have $I_M \leq I_{AD}$, i.e. the central bank cannot receive more information than is being transferred). A shift in output (e.g. by increasing government spending) is registered as a change in the information source in both the IS market and LM market so we can maintain $I_{AD} = I_{AS}$ and $I_{AD} = I_M$ by letting the interest rate adjust to the new equilibrium (e.g. crowding out).

[1] This difference is due to a modeling choice in order to represent empirically observed behavior.

[2] In a more complicated model there may be other possibilities.

Recovering the quantity theory from information transfer

In this post, I see where the information transfer model of the quantity theory of money (ITM + QTM) reduces to the traditional QTM. I will take $P = P_0 e^{it}$, $Q^d = Q^d_{ref} e^{rt}$, and $Q^s = Q^s_{ref} e^{r_0 t}$, where $i$ is the growth rate of the price level (inflation), $r$ is the NGDP growth rate and $r_0$ is the growth rate of the monetary base. Using the equations here, we obtain

$$i t + \log P_0 = \log \frac{1}{\kappa} + \left( \frac{1}{\kappa} - 1 \right) \left( r_0 t + \log Q^s_{ref} \right) + \log \frac{Q^d_{ref}}{Q^s_{ref}}$$

with

$$\kappa = \frac{r_0 t + \log Q^s_{ref}}{r t + \log Q^d_{ref}}$$

Taking the limit as $t \to \infty$ (the long run) one can show that the leading terms are

$$i t \sim \left( \frac{1}{\kappa} - 1 \right) r_0 t = (r - r_0) t$$

with

$$\kappa \sim \frac{r_0}{r}$$

For $\kappa = 1/2$, the rate of the price level increase is $i \approx r_0$, which is the main result of the traditional quantity theory of money. I would like to point out here that the quantity theory of money in the information transfer framework does not require $\kappa = 1/2$, hence the ITM + QTM can describe the observed deviation for low inflation economies.

More interestingly, if we look at the ratio of NGDP growth rate to the growth rate of the price level $r/i$, with $r = i + R$ where $R$ is the real growth rate, for large $i$ we find that

$$\frac{\frac{1}{\kappa} r_0}{\left( \frac{1}{\kappa} - 1 \right) r_0} = \frac{i + R}{i} \sim \frac{i}{i} \sim 1$$

So

$$\frac{\frac{1}{\kappa}}{\frac{1}{\kappa} - 1} \sim 1$$

such that $\kappa = 1/2$ for economies with high growth in the price level -- and since $\kappa = 1/2$, we find that $i \approx r_0$ and recover the traditional quantity theory of money.

The ITM + QTM approach has an additional result that says that for countries well-approximated by the traditional quantity theory of money, $r \approx 2 r_0$, or $r/r_0 \approx 2$, which is roughly true for the US since the depression (data are from FRED, I used GNP for years before the 1950s, and by $\Delta y$ I mean the continuously compounded annual rate of change):

The gray line is $r/r_0 = 2$, the average of $r/r_0 = 1.87$ is shown as the red line and the dashed red line is the post-1980 average $r/r_0 = 1.31$. This lower value explains the deviation from the diagonal line (the traditional QTM) in the graphs in this post. It is not that countries with lower inflation rates don't obey the QTM. Countries with lower inflation rates apparently tend to have monetary policies that are too tight for a traditional QTM, but they can be described by an ITM + QTM theory with larger $\kappa$. From the ratios above, the average is $\kappa = 0.53$ over the entire period shown, but the post-1980 average has been $\kappa = 0.76$. Why this happens is not determined by the model, which only says that if $r/r_0 \approx 2$, then the traditional QTM is a good theory and that for high inflation rates, $r/r_0 \to 2$ so the traditional QTM works for countries with high growth in the price level. (But the traditional QTM does not necessarily follow for countries with high monetary base growth rates! Large $i \implies$ QTM, but large $r_0$ requires other knowledge about the economy like $r$ in order to see if $i$ is large and the QTM applies, or $\kappa \sim r_0/r \approx 1/2$ and the QTM applies.)
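The arithmetic behind these numbers is short enough to sketch in Python. The helper names, the growth rates (6% and 3%) and the unit reference values are assumptions chosen for illustration; only the ratios 1.87 and 1.31 come from the text.

```python
def kappa_at(t, r, r0, log_qs_ref=1.0, log_qd_ref=1.0):
    """kappa = (r0 t + log Qs_ref) / (r t + log Qd_ref); the reference logs are placeholders."""
    return (r0 * t + log_qs_ref) / (r * t + log_qd_ref)

def long_run_inflation(r, r0):
    """Leading-order result i ~ (1/kappa - 1) r0 with kappa ~ r0 / r, i.e. i = r - r0."""
    kappa = r0 / r
    return (1.0 / kappa - 1.0) * r0

def kappa_from_ratio(r_over_r0):
    """Long-run kappa ~ r0 / r, the inverse of the plotted ratio r / r0."""
    return 1.0 / r_over_r0

# Hypothetical economy with NGDP growth twice the monetary base growth.
late_time_kappa = kappa_at(t=1e6, r=0.06, r0=0.03)
inflation = long_run_inflation(r=0.06, r0=0.03)

# Ratios quoted above for the US.
kappa_full_period = kappa_from_ratio(1.87)
kappa_post_1980 = kappa_from_ratio(1.31)
```

For $r = 2 r_0$ the late-time $\kappa$ tends to 1/2 and the inflation rate equals $r_0$, and inverting the two quoted ratios gives the $\kappa$ values 0.53 and 0.76 to two decimal places.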

Note also that even the (shown) seasonally adjusted ratio is highly volatile -- likely illustrating one of the reasons the Fed doesn't target e.g. M2. It is also ironic that the point when Milton Friedman's ideas supposedly were at their peak was the point when the US started deviating more from the traditional quantity theory of money.

Money defined as information mediation

In the previous post I wrote that "macroeconomics does not currently know what money is or does". Instead of simply tearing down, I'd like to be constructive. In the information equilibrium framework money has a rather simple and mathematically beautiful explanation.

Let's start with an AD/AS model with aggregate demand (N) in information equilibrium with aggregate supply (S) with the price level (P) as the detector. We write this model in information transfer notation as:

$$P : N \to S$$

and the information equilibrium condition

(1) $P \equiv \frac{dN}{dS} = k \frac{N}{S}$

In general we can make this transformation using a new variable M (i.e. money):

(2) $P = \frac{dN}{dM} \frac{dM}{dS} = k \frac{N}{M} \frac{M}{S}$

If we take N to be in information equilibrium with M, which is in information equilibrium with S, i.e.

$$P : N \to M \to S$$

then we can use the information equilibrium condition

$$\frac{dM}{dS} = k_s \frac{M}{S}$$

to show that equation (2) can be re-written

$$P = \frac{dN}{dM} \frac{dM}{dS} = k \frac{N}{M} \frac{M}{S}$$

$$P = \frac{dN}{dM} \left( k_s \frac{M}{S} \right) = k \frac{N}{M} \frac{M}{S}$$

$$P = \frac{dN}{dM} = \frac{k}{k_s} \frac{N}{M}$$

(3) $P = \frac{dN}{dM} = k_n \frac{N}{M}$

where we've defined $k_n \equiv k / k_s$. The solution to differential equation (3) defines a quantity theory of money where the price level goes as

$$\log P \sim (k_n - 1) \log M$$

In words, we've introduced a new widget that mediates the information transfer from aggregate demand to aggregate supply, which allows us to re-write the entire theory in terms of aggregate demand and the quantity of that widget (i.e. money).
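For completeness, here is a sketch of the integration step behind that scaling. The reference values $N_{ref}$ and $M_{ref}$ are integration constants introduced here for illustration; they are not fixed by anything above.

```latex
\begin{align}
\frac{dN}{dM} &= k_n \frac{N}{M}
  \quad\Rightarrow\quad
  \int_{N_{ref}}^{N} \frac{dN'}{N'} = k_n \int_{M_{ref}}^{M} \frac{dM'}{M'} \\
\log \frac{N}{N_{ref}} &= k_n \log \frac{M}{M_{ref}}
  \quad\Rightarrow\quad
  N = N_{ref} \left( \frac{M}{M_{ref}} \right)^{k_n} \\
P = k_n \frac{N}{M} &= k_n \frac{N_{ref}}{M_{ref}} \left( \frac{M}{M_{ref}} \right)^{k_n - 1}
  \quad\Rightarrow\quad
  \log P = (k_n - 1) \log M + \text{const}
\end{align}
```

So $k_n = 1$ gives a price level that does not respond to $M$, while $k_n = 2$ gives a price level proportional to $M$.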

One interesting side note -- if we consider non-ideal information transfer we have to combine the equations:

(4a) $\frac{dM}{dS} \leq k_s \frac{M}{S}$

(4b) $P = \frac{dN}{dM} \frac{dM}{dS} \leq k \frac{N}{M} \frac{M}{S}$

By equation (4a) the derivative $dM/dS$ is less than $k_s M/S$, so replacing the derivative in (4b) substitutes a quantity that is greater than the derivative. Therefore, we don't have the bound and in general we have to allow

$$\frac{dN}{dM} \nleq k_n \frac{N}{M}$$

That is to say, out of information equilibrium, the price level can be above or below the equilibrium value given by the quantity theory of money (as is observed) -- with disequilibrium on the supply side being key to above equilibrium inflation. Supply shocks (supply out of equilibrium with money) tend to lead to inflation (e.g. the oil shocks of the 1970s) while demand shocks (demand out of equilibrium with money) tend to lead to disinflation.

Update 5/13/2015:

A couple more observations:

● It is interesting that the information transfer index $k_n$, which becomes $1/\kappa$ in the price level model I use, is a composite of the indices $k/k_s$. The index $k$, representing a 'conversion factor' from aggregate supply to aggregate demand, and the index $k_s$, representing the conversion factor from aggregate supply to money, should in a sense be of the same order -- meaning the index $k_n$ should be approximately ~ 1. Which is what we observe ($k_n$ seems to range from about 0.9 to 2.5, or $\kappa$ from 0.4 to 1.1).

● A liquidity trap economy has $k_s \approx k$, i.e. $\kappa \approx 1$, while a quantity theory economy has $k > k_s$, i.e. $\kappa < 1$.

More on Cobb-Douglas functions and information transfer

Here's a bit more on Cobb-Douglas (CD) functions (previous posts here [1] and here [2]). In Ref [1], I was looking at the Solow growth model which uses a Cobb-Douglas functional form (at least asymptotically):

(1) $Y = c K^{\alpha} L^{\beta}$

In Ref [2], I was looking at matching theory which frequently uses a CD ansatz. In my original derivation of a quasi-CD form in the information transfer model in [2], I came up with a function that made a prediction about the exponents in CD forms, i.e.

(2) $Y \sim K^{\kappa - 1} L^{1 - 1/\kappa}$

The question is: is this true? If I look at fits to CD production functions or growth models, do they obey Eq (2)? So I went out and tried to do some literature searches and plot the exponents $\alpha$ and $\beta$ (WLOG, I took the larger exponent to be $\alpha$). There were several papers on the original Cobb-Douglas work from the 1920s (large blue point at $(0.75, 0.25)$ in the graph below for US capital/labor output model), along with several papers on production functions in various industries in different countries. However, the resulting points are by no means exhaustive. In the graph below, $\alpha$ and $\beta$ can fall anywhere in the plot; some models assume constant "returns to scale" such that $\alpha + \beta = 1$, which means the results must fall along the gray line. The information transfer model result (2) above implies that the points must fall along the red line and the red dot at $(0.62, 0.38)$ is the only solution consistent with constant returns to scale. Here is the graph (on the left):

This doesn't look very good. The graph on the right looks a little better, but simply represents plotting $\alpha + \beta$ vs $\alpha$. The most charitable interpretation would be that the information transfer model is predictive of the returns to scale (increasing, constant, diminishing).

However! During the course of reading a bunch of papers, I came across the "derivation" of the Cobb-Douglas form (it originally comes from the mathematician Cobb who sort of guessed the form based on Euler's theorem, see e.g. here). The derivation posits the basic information transfer model equation I derived in two ways (here and here) as its starting point**. So let's proceed starting from the basic differential equations (we're assuming two markets $p_1 : Y \to K$ and $p_2 : Y \to L$):

(3a) $\frac{\partial Y}{\partial K} = \frac{1}{\kappa_1} \frac{Y}{K}$

(3b) $\frac{\partial Y}{\partial L} = \frac{1}{\kappa_2} \frac{Y}{L}$

The economics rationale for equations (3a,b) is that the left hand sides are the marginal productivity of capital/labor, which are assumed to be proportional to the right hand sides -- the productivity per unit capital/labor. In the information transfer model, the relationship follows from a model of aggregate demand sending information to aggregate supply (capital and labor) where the information transfer is "ideal", i.e. no information loss. The solutions are:

$$Y(K, L) \sim f(L) K^{1/\kappa_1}$$

$$Y(K, L) \sim g(K) L^{1/\kappa_2}$$

and therefore we have

(4) $Y(K, L) \sim K^{1/\kappa_1} L^{1/\kappa_2}$

Equation (4) is the generic Cobb-Douglas form in equation (1). In this case, unlike equation (2), the exponents are free to take on any value. Equation (2) makes more sense when describing matching theory (i.e. a single market $p : V \to U$), while equation (4) makes more sense when describing multiple interacting markets (or production factors).
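A quick numerical sanity check that the Cobb-Douglas form (4) really does satisfy both differential equations (3a,b) can be sketched with central finite differences. The index values ($\kappa_1 = 2$, $\kappa_2 = 4$) and the evaluation point are arbitrary assumptions.

```python
KAPPA_1 = 2.0  # hypothetical information transfer index for p1 : Y -> K
KAPPA_2 = 4.0  # hypothetical information transfer index for p2 : Y -> L

def output(capital, labor):
    """Eq. (4): Y = K^(1/kappa_1) * L^(1/kappa_2), with the overall constant set to 1."""
    return capital ** (1.0 / KAPPA_1) * labor ** (1.0 / KAPPA_2)

def central_difference(func, x, step=1e-6):
    """Numerical derivative of a one-argument function."""
    return (func(x + step) - func(x - step)) / (2.0 * step)

K, L = 3.0, 5.0
Y = output(K, L)

# Left hand sides of (3a) and (3b): marginal productivities.
dY_dK = central_difference(lambda k: output(k, L), K)
dY_dL = central_difference(lambda l: output(K, l), L)

# Right hand sides of (3a) and (3b): scaled average productivities.
rhs_K = Y / (KAPPA_1 * K)
rhs_L = Y / (KAPPA_2 * L)
```

Both pairs agree to within the finite-difference error, for any choice of the two indices.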

** I'd like to point out that assuming a well-known partial differential equation is not very different from assuming its well-known solution -- something I refer to as "ad hoc in the worst way" in an earlier post.

The information transfer Solow growth model is remarkably accurate

I wanted to jump in (uninvited) on this conversation (e.g. here, here) about total factor productivity (TFP, aka phlogiston) using some previous work I've done with the information transfer model and the Solow growth model.

In using the Solow growth model, economists assume the Cobb-Douglas form

$$Y = A K^{\alpha} L^{\beta}$$

as well as assume $\alpha + \beta = 1$ to enforce constant returns to scale and then assign the remaining variation to $A$, the Solow residual aka TFP.

The conversation linked at the top of this post is then about why TFP seems to have slowed down (e.g. the Great Stagnation or whatever your model is). This is all a bit funny to me because it is effectively asking why the fudge factor is going away (if TFP is constant then $A$ is just a constant in the formula above).

Well, my intended contribution was to say "Hey, what if $\alpha$ and $\beta$ are changing?". In the information transfer model, the Solow growth model follows from a little bit of algebra and gives us

$$Y = A K^{1/\kappa_1} L^{1/\kappa_2}$$

where $\kappa_i$ are the information transfer indices in the markets $p_1 : Y \to K$ and $p_2 : Y \to L$.

The first step in looking at changing $\kappa$ is to look at constant $\kappa$ (and constant $A$). That threw me off my original intent because, well ... because the information transfer Solow growth model with constant TFP and constant $\kappa_i$ is a perfect fit:

It's so good, I had to make sure I wasn't implicitly using GDP data to fit GDP data. Even the derivative (NGDP growth) is basically a perfect model (for economics):

Of course, in the information transfer model $\kappa_1$ and $\kappa_2$ have no a priori relationship to each other and in fact we have

$$\frac{1}{\kappa_1} + \frac{1}{\kappa_2} = 1.25$$

or individually $\kappa_1 = 1.18$ and $\kappa_2 = 2.50$. So there aren't "constant returns to scale".
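To see what the sum of exponents means in practice, here is a small Python sketch using the fitted indices quoted above. The constant $A = 1$ and the input levels are placeholder assumptions; only $\kappa_1$ and $\kappa_2$ come from the fit.

```python
KAPPA_1 = 1.18  # capital index from the fit quoted above
KAPPA_2 = 2.50  # labor index from the fit quoted above

def solow_output(capital, labor, tfp=1.0):
    """Y = A K^(1/kappa_1) L^(1/kappa_2) with a constant (assumed) A."""
    return tfp * capital ** (1.0 / KAPPA_1) * labor ** (1.0 / KAPPA_2)

# Sum of the exponents: the degree of homogeneity of the production function.
scale_exponent = 1.0 / KAPPA_1 + 1.0 / KAPPA_2

# Doubling both inputs multiplies output by 2 ** scale_exponent.
base = solow_output(10.0, 20.0)      # hypothetical input levels
doubled = solow_output(20.0, 40.0)
output_ratio = doubled / base
```

Because the exponent sum rounds to 1.25 rather than 1, doubling both inputs more than doubles output in this sketch, i.e. increasing returns to scale.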

In the information transfer model, this is not a big worry. The two numbers represent the relative information entropy in the widgets represented by each input (dollars of capital and number of employees, respectively) relative to the widgets represented in the output (dollars of NGDP) -- why should those things add up to one in any combination? That is to say the values of $\kappa$ above simply say there are fewer indistinguishable types of jobs than there are indistinguishable types of capital investments so adding a dollar of capital adds a lot more entropy (unknown ways in which it could be allocated) than adding an employee. A dollar of capital is interchangeable for a lot of different things (computers, airplanes, paper) whereas a teacher or an engineer tend to go into teaching and engineering slots**. Adding the former adds more entropy, and entropy means economic growth.

PS Added 12/5/2014 12pm PST: The results above use nominal capital and nominal GDP rather than the usual real capital and real output (RGDP). The results with 'real' (not nominal) values don't work as well. I am becoming increasingly convinced that "real" quantities may not be very meaningful.

** This is highly speculative, but it lends itself to a strange interpretation of salaries. Economists may make more money than sociologists because they are interchangeable among a larger class of jobs; CEOs may make the largest amount of money because they are interchangeable among e.g. every level of management and probably most of the entry level positions. A less specific job description (higher information entropy in filling that job) corresponds with a bigger contribution to NGDP and hence a higher salary.

The rest of the Solow model

Here, I mostly referred to the Cobb-Douglas production function piece, not the piece of the Solow model responsible for creating the equilibrium level of capital. That part is relatively straight-forward. Here we go ...

Let's assume two additional information equilibrium relationships with capital $K$ being the information source and investment $I$ and depreciation $D$ (include population growth in here if you'd like) being information destinations. In the notation I've been using: $K \to I$ and $K \to D$.

This immediately leads to the solutions of the differential equations:

$$\frac{K}{K_0} = \left(\frac{D}{D_0}\right)^{\delta}$$

$$\frac{K}{K_0} = \left(\frac{I}{I_0}\right)^{\sigma}$$

Therefore we have (the first relationship coming from the Cobb-Douglas production function)

$$Y \sim K^{\alpha}, \quad I \sim K^{1/\sigma} \quad \text{and} \quad D \sim K^{1/\delta}$$

If $\sigma = 1/\alpha$ and $\delta = 1$ we recover the original Solow model, but in general $\sigma > \delta$ allows there to be an equilibrium. Here is a generic plot:

Assuming the relationships $K \to I$ and $K \to D$ hold simultaneously gives us the equilibrium value of $K = K^*$:

$$K^* = K_0 \exp\left(\frac{\sigma \delta \log (I_0/D_0)}{\sigma - \delta}\right)$$
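As a quick numerical sanity check of the equilibrium formula, here is a minimal Python sketch. The parameter values ($K_0$, $I_0$, $D_0$, $\sigma$, $\delta$) and the function names are illustrative assumptions, not fitted values from any data; the only thing being checked is that investment and depreciation coincide at $K^*$.

```python
import math

def solow_equilibrium(K0, I0, D0, sigma, delta):
    """Equilibrium capital K* where I(K*) = D(K*), from the formula in the text."""
    return K0 * math.exp(sigma * delta * math.log(I0 / D0) / (sigma - delta))

def investment(K, K0, I0, sigma):
    """I ~ K^(1/sigma), normalized so that I(K0) = I0."""
    return I0 * (K / K0) ** (1.0 / sigma)

def depreciation(K, K0, D0, delta):
    """D ~ K^(1/delta), normalized so that D(K0) = D0."""
    return D0 * (K / K0) ** (1.0 / delta)

# Hypothetical parameters chosen only for illustration (sigma > delta).
K0, I0, D0 = 1.0, 2.0, 0.5
sigma, delta = 3.0, 1.0

K_star = solow_equilibrium(K0, I0, D0, sigma, delta)
print(K_star, investment(K_star, K0, I0, sigma), depreciation(K_star, K0, D0, delta))
```

With $\sigma > \delta$ the two curves cross exactly once for $K > 0$, which is the equilibrium the formula picks out.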

As a side note, I left the small $K$ region off on purpose. The information equilibrium model is not valid for small values of $K$ (or any variable). That allows one to choose parameters for investment and depreciation that could be e.g. greater than output for small $K$ -- a nonsense result in the Solow model, but just an invalid region of the model in the information equilibrium framework.

An interesting add-on is that $Y$ and $I$ have a supply and demand relationship in partial equilibrium with capital being demand and investment being supply (since $Y \to K$, by transitivity they are in information equilibrium). With the savings rate as the price in the market $Y \to I = Y \to K \to I$, we should be able to work out how it changes depending on shocks to demand. There should be a direct connection to the IS-LM model as well.

Dynamics of the savings rate and Solow + IS-LM

Hello! I'm back from a short vacation and slowly getting to the comments.

As I mentioned here, there might be a bit more to the information equilibrium picture of the Solow model than just the basic mechanics -- in particular I pointed out we might be able to figure out some dynamics of the savings rate relative to demand shocks.

In the previous post, we built the model:

$$Y \to K \to I$$

where $Y$ is output, $K$ is capital and $I$ is investment. Since information equilibrium (IE) is an equivalence relation, we have the model:

$$p : Y \to I$$

with abstract price $p$ which was described here (except using the symbol $N$ instead of $Y$) in the context of the IS-LM model. Let's write down the differential equation resulting from that IE model:

$$(1)\quad p = \frac{dY}{dI} = \frac{1}{\eta}\frac{Y}{I}$$

There are a few things we can glean from this ...

I. General equilibrium

We can solve equation (1) under general equilibrium giving us $Y \sim I^{1/\eta}$. Empirically, we have $\eta \simeq 1$:

Combining that with the results from the Solow model, we have

$$Y \sim K^{\alpha}, \quad K \sim I^{\sigma} \quad \text{and} \quad Y \sim I$$

which tells us that $\alpha \simeq 1/\sigma$ -- one of the conditions that gave us the original Solow model.

II. Partial equilibrium

Since $Y \to I$ we have a supply and demand relationship between output and investment in partial equilibrium. We can use equation (1) and $\eta = 1$ to write

$$I = (1/p)\,Y \equiv sY$$

where we have defined the saving rate as $s \equiv 1/(p\eta)$ to be (the inverse of) the abstract price $p$ in the investment market. The supply and demand diagram (including an aggregate demand shock) looks like this:

A shock to aggregate demand would be associated with a fall in the abstract price and thus a rise in the savings rate. There is some evidence of this in the empirical data:

Overall, you don't always have pure supply or demand shocks, so there might be some deviations from a pure demand shock view. In particular, a "supply shock" (investment shock) should lead to a fall in the savings rate.

III. Interest rates

If we update the model here (i.e. the IS-LM model mentioned above) to include the more recent interest rate ($r$) model written in terms of investment and the money supply/base money:

$$(r \to p_m) : I \to M$$

where $p_m$ is the abstract price of money (which is in IE with the interest rate), we have a pretty complete model of economic growth that combines the Solow model with the IS-LM model. The interest rate joins the already empirically accurate production function:

Since I inevitably get questions about causality, it is important to note that these are all IE relationships; therefore all relationships are effectively causal in either direction. However it is also important to note that the direct impact of $M$ on $Y$ is neglected in the above treatment (including the interest rates) -- and the direct impact changes depending on the information transfer index in the price level model.

Summary

A full summary of the Solow + IS-LM model in terms of IE relationships is:

$$Y \to K \to I, \quad K \to D$$

$$Y \to L$$

$$1/s : Y \to I$$

$$(r \to p_m) : I \to M$$

The Kaldor facts

Dietrich Vollrath has a great pair of posts [1], [2] on the "balanced growth path" (BGP) framework of growth economics. Here's Vollrath:

"BGP is really just a name for a set of conditions related to several major pieces of economic data:

1. The growth rate of output per worker is constant over time
2. The rate of return on capital is constant over time
3. The share of output paid to capital is constant over time

These three conditions are part of the “Kaldor Facts” established by Nicholas Kaldor in 1957."

How would you capture these "stylized facts" as information equilibrium relationships? Fairly simply as the Solow model (relating real output Y, capital K, labor L, depreciation D, and investment I) alongside an additional information equilibrium (IE) relationship between real output Y and population P. In the Solow model, constant returns to scale are assumed so that the IT indices for the IE relationships between output and capital and output and labor are α and 1-α, respectively. The relationship between Y and P captures the first stylized fact, and the Solow model captures the second and third ones. This is shown in diagram form in the following graphic:

The Kaldor facts as information equilibrium (IE) relationships. Also shown are more empirically accurate IE relationships (the "nominal" Solow model).

As Vollrath mentions in [1], there are some questions as to whether the stylized facts represent the data. While the first works reasonably well, the second and third are less successful. Vollrath claims you probably wouldn't reject any of the hypotheses. However, I think the worse failure of the second and third facts can be directly related to the finding that the IE version of the Solow model (the "nominal" Solow model, which tells us to use nominal quantities like nominal output and doesn't have constant returns to scale) works really well empirically (shown in the diagram above for the US, UK, and Mexico). In fact, this is the basis of a really good empirical model of output, labor, capital, and inflation (so both real and nominal output) I called "the quantity theory of labor and capital".

The main point, however, is that we can capture the original Kaldor facts as a concise set of information equilibrium relationships.

Information equilibrium and the gravity model of trade

Paul Krugman mentions gravity models today on his blog which gives me an opportunity to apply the information transfer model to a new economic model.

The model is essentially a Cobb-Douglas function (see e.g. here) produced from the information equilibrium relationships $T \to N_1$ and $T \to N_2$ where $T$ is the volume of trade and $N_i$ is the aggregate demand (NGDP) of country $i$ so that:

$$T = \alpha \left(\frac{N_1}{N_1^{(0)}}\right)^{k_1} \left(\frac{N_2}{N_2^{(0)}}\right)^{k_2}$$

This piece is essentially given by Krugman's very nice overview of the argument which is really an information equilibrium argument:

"Here’s my take: Think about two cities with the same per capita GDP — we can relax that assumption in a minute. They will trade if residents of city A find things being sold by residents of city B that they want, and vice versa. So what’s the probability that an A resident will find a B resident with something he or she wants? Applying what one of my old teachers used to call the principle of insignificant reason, a good first guess would be that this probability is proportional to the number of potential sellers — B’s population. And how many such desirous buyers will there be? Again applying insignificant reason, a good guess is that it’s proportional to the number of potential buyers — A’s population. So other things equal we would expect exports from B to A to be proportional to the product of their populations. What if GDP per capita isn’t the same? You can think of this as increasing the “effective” population, both in terms of producers and in terms of consumers. So the attraction is now the product of the GDPs."

So what about distance? Krugman has some issues with it:

"And there’s also a puzzle about both the effect of distance and the effect of borders, both of which seem larger than concrete costs can explain."

I have some more fundamental issues with distance. The Earth realizes a specific set of nation-states that have a fairly highly spatially correlated wealth distribution (e.g. rich Europe and poor Africa). Distance is therefore going to be correlated with partner NGDP for most countries. But if trade is a significant contributor to NGDP this spatial correlation will be an effect of trade, making it difficult to tease out the relationship -- the exact samples you need (countries with the same GDP at all different distances) are the ones that don't appear in the sample.

However, if distance between countries 1 and 2 is in information equilibrium with NGDP, then we can write down the model $D_{12} \to N_1$ so that

$$D_{12} \sim N_1^{k_3}$$

And thus re-write the equation at the top of this post

$$T \sim \left(\frac{N_1}{N_1^{(0)}}\right)^{k_1 + \gamma k_3} \left(\frac{N_1}{N_1^{(0)}}\right)^{-\gamma k_3} \left(\frac{N_2}{N_2^{(0)}}\right)^{k_2}$$

$$T = \alpha' \left(\frac{N_1}{N_1^{(0)}}\right)^{k'} \left(\frac{N_2}{N_2^{(0)}}\right)^{k_2} \Big/ D_{12}^{\gamma}$$

So from an econometric view, you can probably get a form that looks like the gravity model equation. However, I am not entirely sure we can unambiguously associate the effect with distance per se.

PS The figure is a realized universe in a causal dynamical triangulation model of quantum gravity.

Utility in an information equilibrium model

I've said many times on this blog that the concept of utility is unnecessary (except to solve an information problem); in this post I'll allow utility, but show that even then utility maximization is unnecessary to select an economic equilibrium. Essentially, I will derive the utility approach to economics using the information transfer model.

Let's define utility $U(x_1, x_2, ...)$ to be the information source in the markets

$$MU_{x_i} : U \to x_i$$

for $i = 1 ... D$ where $MU_{x_i}$ is the marginal utility (a detector) for the good $x_i$ (information destination). We can immediately write down the main information transfer model equation:

$$MU_{x_i} = \frac{\partial U}{\partial x_i} = k_i \frac{U}{x_i}$$

Solving the differential equations, our utility function $U(x_1, x_2, ...)$ is

$$U(x_1, x_2, ...) = a \prod_i \left(\frac{x_i}{c_i}\right)^{k_i}$$

which is a Cobb-Douglas utility function ($a$ and the $c_i$ are constant). In the traditional economic approach, the next step is to look at the level curves of this utility function alongside the budget constraint (in a two-good market i.e. two dimensions):

$$(1)\quad M \geq \sum_i p_i x_i = p_1 x_1 + p_2 x_2$$

The level curve that is tangent to the budget constraint gives us the maximized utility. We show this in the next graph (level curve is the dashed gray curve and the maximized utility point is gray as well):

The entropy maximizing solution is given as the blue dot at the centroid of the blue triangle with the solid utility level curve passing through it. If we take every point that satisfies the budget constraint (1) as equally likely (the blue shaded region), the expected equilibrium value is given by the centroid. Now this centroid moves in effectively the same way as the utility maximum, just at a lower utility value:

The red line shows that at a higher price of $x_1$ with the same budget constraint, less of $x_1$ can be afforded. Both the utility maxima (gray dots) and the entropy maxima (blue and red dots) move inward towards smaller $x_1$ and lower utility, tracing out a demand curve. For two goods, there is only one big difference in the utility maximizing and entropy maximizing models of supply and demand: the entropy maximizing equilibrium doesn't fully saturate the budget constraint.
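To make the comparison concrete, here is a small Python sketch for the two-good case. It uses two standard results as assumptions: the Cobb-Douglas utility maximum on the budget line sits at $x_i = \frac{k_i}{k_1 + k_2}\frac{M}{p_i}$, and the centroid of the budget triangle with vertices $(0,0)$, $(M/p_1, 0)$, $(0, M/p_2)$ is the average of those vertices. The budget, prices and exponents below are made up for illustration.

```python
def utility_max(M, p, k):
    """Cobb-Douglas utility maximum on the budget line: x_i = (k_i / sum k) * M / p_i."""
    total = sum(k)
    return [ki / total * M / pi for ki, pi in zip(k, p)]

def entropy_max_2d(M, p):
    """Centroid of the triangle (0,0), (M/p1, 0), (0, M/p2): the average of the vertices."""
    return [M / (3.0 * pi) for pi in p]

M = 12.0          # hypothetical budget
k = [1.0, 1.0]    # hypothetical symmetric Cobb-Douglas exponents

# Raise the price of good 1 from 1 to 2, holding the budget and p2 fixed.
before = {"utility": utility_max(M, [1.0, 2.0], k), "entropy": entropy_max_2d(M, [1.0, 2.0])}
after = {"utility": utility_max(M, [2.0, 2.0], k), "entropy": entropy_max_2d(M, [2.0, 2.0])}

print(before)
print(after)
```

Both points scale as $1/p_1$ in the $x_1$ direction, so doubling the price halves the quantity of $x_1$ in either picture; the centroid simply sits inside the budget line rather than on it.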

One possibility is to make the artificial restriction that all income $M$ must be spent. The centroid of the boundary of the blue triangle would essentially be the middle point of the budget constraint line. However, we don't actually have to do this.

If we randomly sample the triangle bounded by the budget constraint, we end up with something like this in two dimensions:

The sum $p_1 x_1 + p_2 x_2$ is somewhat uniformly distributed between 0 and the budget constraint as shown by the blue line in the next graph:

However, if we increase the number of dimensions (the number of goods in the economy), most of the points are near the budget constraint (the red and green lines). This is a general property of higher dimensional shapes: nearly all of the points are in a thin shell near the surface. That means for a large number of dimensions, the expected equilibrium should be near the budget constraint even without making the artificial restriction above:
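The thin-shell property can be checked with a short simulation. The sketch below works in rescaled variables $y_i = p_i x_i / M$, so the allowed region is $y_i \geq 0$, $\sum_i y_i \leq 1$, and it samples that region uniformly by normalizing $d + 1$ independent exponential draws and dropping one coordinate (a standard construction, used here as an assumption about how to sample; it is not necessarily how the graphs in this post were made). The sample size and seed are arbitrary choices.

```python
import random

def spent_fraction(d, rng):
    """Draw one point uniformly from {y_i >= 0, sum(y_i) <= 1} in d dimensions
    and return sum(y_i), i.e. the fraction of the budget that is spent."""
    draws = [rng.expovariate(1.0) for _ in range(d + 1)]
    total = sum(draws)
    # The first d normalized draws are the point; the last one is the unspent slack.
    return sum(draws[:d]) / total

def mean_spent_fraction(d, n_samples=5000, seed=0):
    """Monte Carlo estimate of the average fraction of the budget spent."""
    rng = random.Random(seed)
    return sum(spent_fraction(d, rng) for _ in range(n_samples)) / n_samples

results = {d: mean_spent_fraction(d) for d in (2, 10, 100)}
for d, frac in results.items():
    print(d, round(frac, 3), "vs d/(d+1) =", round(d / (d + 1), 3))
```

For a uniform distribution over this region the mean spent fraction is $d/(d+1)$, so it should approach 1 as the number of goods grows.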

Now this point only maximizes utility if every good is in a sense the same -- the budget constraint (1) and the utility function $U$ are symmetric under interchange of $x_i \leftrightarrow x_j$. Typically, the entropy maximizing point is near the utility maximizing point if neither the utility function nor the budget constraint is wildly asymmetric ($k_i \gg k_j$ or $p_i \gg p_j$).

We can't directly observe utility, though. Therefore it is always possible to perform a preference-preserving transformation $f(x_i, \alpha_i) : x_i \to \exp(\alpha_i \log x_i)$ that makes the utility level surface tangent at the maximum entropy point near the budget constraint (or makes the utility function symmetric). [Correction, per LAL in comments below.]

Aside: This framework would allow us to experimentally determine whether utility maximization or entropy maximization represented a better model of microeconomics. Using multiple goods, aggregate the different allocations chosen by individual subjects (every student in a classroom is given $n$ tokens to allocate among $m$ different goods like candy bars or bacon).

Second aside: The typical economics approach appears to be setting $MU_{x_i} = \beta p_i$; we could accommodate that with an additional information equilibrium condition $MU_{x_i} \sim p_i$ which would allow $\log MU \simeq k \log p$. The typical approach assumes $k = 1$.

The basic asset pricing equation as a maximum entropy condition

Commenter LAL has brought up the basic asset pricing equation a couple of times, and so I had a go at looking at it as a maximum entropy/information equilibrium model. Turns out it works out. In Cochrane's book (updated with link) the equation appears as:

$$(0)\quad p_t = E\left[\beta \frac{u'(c_{t+1})}{u'(c_t)} x_{t+1}\right]$$

Where $p_t$ is the price at time $t$, $c_t$ is consumption at time $t$, $u$ is a utility function, and $\beta$ is a future discount factor. Now $x_t$ is also the price at time $t$ (although it's called the payoff) and of course there is the funny business of the $E$ that essentially says all the terms at a time $t+1$ exist only in the minds of humans (and turns an $x$ into a $p$). Rational expectations is the assumption that the $E$ is largely meaningless on average (i.e. approximately equal to the identity function).

As a physicist, I'm not particularly squeamish about the future appearing in an equation (or time dropping out of the model altogether), so I will rewrite equation (0) as:

$$(1)\quad p_i = \beta \frac{u'(c_j)}{u'(c_i)} p_j$$

It turns out much of the machinery is the same as the Diamond-Dybvig model, so I'll just adapt the beginning of that post for this one.

The asset pricing equation is originally a model of consumption in two time periods, but we will take that to be a large number of time periods (for reasons that will be clear later). Time $t$ will be between 0 and 1.

Let's define a utility function $U(c_1, c_2, ...)$ to be the information source in the markets

$$MU_{c_i} : U \to c_i$$

for $i = 1 ... n$ where $MU_{c_i}$ is the marginal utility (a detector) for the consumption $c_i$ in the $i^{th}$ period (information destination). We can immediately write down the main information transfer model equation:

$$MU_{c_i} = \frac{\partial U}{\partial c_i} = \alpha_i \frac{U}{c_i}$$

Solving the differential equations, our utility function $U(c_1, c_2, ...)$ is

$$U(c_1, c_2, ...) = a \prod_i \left(\frac{c_i}{C_i}\right)^{\alpha_i}$$

where the $C_i$ and $a$ are constants. The basic timeline we will consider is here:

Period $i$ is some "early" time period near $t = 0$ with consumption $c_i$ while period $j$ is some "late" time period near $t = 1$ with consumption $c_j$. We'll only be making changes in these two time periods. The "relevant" (i.e. changing) piece of the utility function is (taking a logarithm):

$$(2)\quad \log U \sim ... + \alpha_i \log c_i + ... + \alpha_j \log c_j + ... + \log U_0$$

where all the various $C_i$'s, $\alpha_i$'s and $a$ ended up in $\log U_0$.

Now the derivation of the asset pricing equation sets up a utility maximization problem where normal consumption in period $i$ (called $e_i$) is reduced to purchase $\xi$ of some asset at price $p_i$, and added back to consumption in period $j$ at some new expected price $p_j$. So we have:

$$(3a)\quad c_i = e_i - p_i \xi$$

$$(3b)\quad c_j = e_j + p_j \xi$$

Normally, you'd plug these into the utility equation (2), and maximize (i.e. take a derivative with respect to $\xi$ and set equal to zero). The picture appears in this diagram (utility level curves are in gray):

The change in the amount $\xi$ of the asset held represents wiggling around the point $(e_i, e_j)$ along a line with slope defined by the relative size of the prices $p_i$ and $p_j$ to reach the point labeled with an 'x': the utility maximum constrained to the light blue line.

Instead of doing that, we will use entropy maximization to find the 'equilibrium'. In that case, we can actually be more general, allowing for the case that e.g. you don't (in period $j$) sell all of the asset you acquired in period $i$ -- i.e. any combination below the blue line is allowed. However, if there are a large number of time periods (a high dimensional consumption space), the most probable values of consumption are still near the blue line (more on that here, here). Yes, that was a bit of a detour to get back to the same place, but I think it is important to emphasize the generality here.

If the states along the blue line are all equally probable (maximum entropy assumption), then the average state will appear at the midpoint of the blue line. I won't bore you with the algebra, but that gives us the maximum entropy equilibrium:

$$\xi = \frac{e_i p_j - e_j p_i}{2 p_i p_j}$$

If we assume we have an "optimal portfolio", i.e. we are already holding as much of the asset as we'd like, we can take $\xi = 0$, which tells us $e_k = c_k$ via the equations (3) above, and we obtain the condition:

$$(4)\quad p_i = \frac{c_i}{c_j} p_j$$
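The skipped algebra can be spot-checked numerically. The Python sketch below takes hypothetical endowments and prices (illustrative numbers only), computes the maximum entropy $\xi$, and compares the resulting consumption point against the midpoint of the segment traced out by equations (3a) and (3b) between $c_i = 0$ and $c_j = 0$.

```python
def max_entropy_xi(e_i, e_j, p_i, p_j):
    """Asset holding at the midpoint of the budget line, per the formula in the text."""
    return (e_i * p_j - e_j * p_i) / (2.0 * p_i * p_j)

def consumption(e_i, e_j, p_i, p_j, xi):
    """Equations (3a) and (3b): consumption in the early and late periods."""
    return e_i - p_i * xi, e_j + p_j * xi

# Hypothetical endowments and prices, for illustration only.
e_i, e_j, p_i, p_j = 10.0, 4.0, 1.0, 2.0

xi = max_entropy_xi(e_i, e_j, p_i, p_j)
c_i, c_j = consumption(e_i, e_j, p_i, p_j, xi)

# Endpoints of the line: all consumption late (c_i = 0) or all early (c_j = 0).
end_late = consumption(e_i, e_j, p_i, p_j, e_i / p_i)
end_early = consumption(e_i, e_j, p_i, p_j, -e_j / p_j)
midpoint = ((end_late[0] + end_early[0]) / 2.0, (end_late[1] + end_early[1]) / 2.0)

print(xi, (c_i, c_j), midpoint, c_i / c_j, p_i / p_j)
```

At the midpoint the consumption ratio $c_i/c_j$ equals the price ratio $p_i/p_j$, which is the content of condition (4).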

Not quite equation (1), yet. However, note that

$$\frac{1}{U}\frac{\partial U}{\partial c_i} = \frac{\partial \log U}{\partial c_i} = \frac{\alpha_i}{c_i}$$

So we can re-write (4) as (note that the $j$, i.e. the future, and $i$, i.e. the present, flip from numerator and denominator):

$$(5)\quad p_i = \frac{\alpha_i}{\alpha_j} \frac{\partial U/\partial c_j}{\partial U/\partial c_i} p_j$$

Which is formally similar to equation (1) if we identify $\beta \equiv \alpha_i/\alpha_j$. You can stick the $E$ and brackets around it if you'd like.

I thought this was pretty cool.

Now just because you can use the information equilibrium model and some maximum entropy arguments to arrive at equation (5) doesn't mean equation (1) is a correct model of asset prices -- much like how you can build the IS-LM model and the quantity theory of money in the information equilibrium framework, this is just another model with an information equilibrium description. Actually equation (4) is more fundamental in the information equilibrium view and basically says that the condition you'd meet for the optimal portfolio is simply that the ratio of the current to expected future consumption is equal to the ratio of the current to the expected price of that asset. Essentially if you think the price of some asset is going to go up 10%, you will adjust your portfolio so your expected future consumption goes up by 10%.

The Euler equation as a maximum entropy condition

In the discussion of the RCK model on these two posts I realized the Euler equation could be written as a maximum entropy condition. It's actually a fairly trivial application of the entropy maximizing version of the asset pricing equation:

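$$p_i = \frac{\alpha_i}{\alpha_j} \frac{\partial U/\partial c_j}{\partial U/\partial c_i} p_j$$

To get to the typical macroeconomic Euler equation, define $\alpha_i/\alpha_j \equiv \beta$ and re-arrange:

$$\frac{\partial U}{\partial c_i} = \beta \frac{p_j}{p_i} \frac{\partial U}{\partial c_j}$$

The price at time $t_j$ divided by the price at time $t_i$ is just (one plus) the interest rate $R$ (for the time $t_j - t_i$), so:

$$\frac{\partial U}{\partial c_i} = \beta (1 + R) \frac{\partial U}{\partial c_j}$$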

And we're done.

The intuition behind the traditional economic Euler equation is (borrowed from these lecture notes [pdf]):

"The Euler equation essentially says that [an agent] must be indifferent between consuming one more unit today on the one hand and saving that unit and consuming in the future on the other [if utility is maximized]."

The intuition for the maximum entropy version is different. It does involve the assumption of a large number of consumption periods (otherwise the intertemporal budget constraint wouldn't be saturated), but that isn't terribly important. The entropy maximum is actually given by (Eq. 4 at the link, re-arranged and using $p_j/p_i = 1 + R$):

$$c_j = c_i (1 + R)$$

The form of the utility function $U$ allows us to transform it into the equation above, but this is the more fundamental version from the information equilibrium standpoint. This equation says that since you could be anywhere along the blue line between $c_j$ maximized and $c_i$ maximized on this graph:

the typical location for an economic agent is in the middle of that blue line [1]. Agents themselves might not be indifferent to their location on the blue line (or even the interior of the triangle), but a maximum entropy ensemble of agents is. Another way to put it is that the maximum entropy ensemble doesn't break the underlying symmetry of the system -- the interest rate does. If the interest rate was zero, all consumption periods would be the same and consumption would be equal. A finite interest rate transforms both the coordinate system and the location of the maximum entropy point. You'd imagine deforming the n-dimensional simplex so that each axis was scaled by $(1 + r)$ where $r$ is the interest rate between $t_i$ and $t_{i+1}$.

DSGE, part 1

Olivier Blanchard [pdf] both criticized and defended DSGE, which prompted several takes from Simon Wren-Lewis, Brad DeLong, Paul Krugman, Noah Smith (Twitter conversation with DeLong), and others. Since the resulting commentary was largely negative (and I have been negative about DSGE before: here, here), let me [continue to] be contrarian and defend DSGE in the only way I know how: by converting it into an information equilibrium model.

I put the IE model in a DSGE form before, but my goal here is the converse: to reproduce a mainstream DSGE macro model in the language of information equilibrium -- or at least start the project. First let me collect several pieces I've already assembled:

● Solow model/Cobb-Douglas production functions (the original RBC model is the genesis of the DSGE approach and it starts from a Solow model)
● Utility maximization (in IE we have entropy maximization which coincides with utility maximization for certain parameter choices)
● Euler equation
● Log-linearization (this shows the general form of log-linear model equations that can be obtained with an information equilibrium model)

One thing to note is that information transfer economics allows for the information equilibrium relationships above to fail (in a specific way -- generically as prices that fall below "equilibrium" prices).

My contribution for today is to set up a Taylor rule as an information equilibrium condition. This is actually somewhat trivial and involves two information equilibrium relationships

$$R \rightleftarrows \Pi$$

$$R \rightleftarrows Y$$

where $R \equiv e^{r} \approx 1 + r$ is the short term nominal interest rate, $\Pi \equiv e^{\pi} \approx 1 + \pi$ is the inflation rate, and $Y$ is real output (you could also specify consumption $C$). The general solution to the differential equations these information equilibrium relationships set up gives us:

$$\frac{R}{R_{ref}} = \left(\frac{\Pi}{\Pi_{ref}}\right)^{\lambda_\Pi} \left(\frac{Y}{Y_{ref}}\right)^{\lambda_Y}$$

where $R_{ref}$, $\Pi_{ref}$, $Y_{ref}$, $\lambda_\Pi$, and $\lambda_Y$ are parameters. The parameters $\lambda_\Pi$ and $\lambda_Y$ are the information transfer indices for the information equilibrium relationships above. Taking the log of this equation, we obtain

$$\log R = \lambda_\Pi \log \Pi + \lambda_Y \log Y + c$$

$$r = \lambda_\Pi \pi + \lambda_Y y + c$$

Interestingly, non-ideal information transfer means that the observed interest rate will generally fall below the "ideal" interest rate ($r_{obs} \leq r$).

The Taylor rule equation isn't the best information equilibrium model of interest rates, however (yes, I understand it is supposed to guide policy, but in DSGE models it tends to represent the central bank "reaction function" -- i.e. how the central bank will set rates given current conditions). The model here and here is a much better model on average. If we take the information equilibrium relationships

$$r \rightleftarrows p$$

$$p : N \rightleftarrows MB$$

$$P : N \rightleftarrows M0$$

where $r$ is the short term interest rate, $P$ is the price level, $p$ is the "price of money", $N$ is nominal output ($= PY$), $MB$ is the monetary base, and $M0$ is the monetary base minus reserves, then for short times (constant information transfer index $k$) we can apply some algebra to obtain

$$\log r = c_1 \log Y + \frac{c_1 (k - 2)}{k - 1} \log P + c_1 \log \alpha + c_0$$

where $\alpha = M0/MB$. The equation for the long term interest rate can be obtained by taking $\alpha = 1$. If we take $Y \equiv e^{\tilde{y}}$ and $P \equiv e^{\tilde{\pi}}$ (I use tildes to distinguish from the formula above), then we can write this as:

$$\log r = c_1 \tilde{y} + \frac{c_1 (k - 2)}{k - 1} \tilde{\pi} + c_1 \log \alpha + c_0$$

There are some key differences between this formula and the more traditional Taylor rule. However, if interest rates stay near some value $r_0$ (and $\alpha$ is approximately constant), then we can say (subsuming some extra constants into the new value $c_0'$)

$$\log r_0 + \frac{\delta r}{r_0} \approx c_1 \tilde{y} + \frac{c_1 (k - 2)}{k - 1} \tilde{\pi} + c_1 \log \alpha + c_0$$

$$\delta r = c_1 r_0 \tilde{y} + \frac{c_1 r_0 (k - 2)}{k - 1} \tilde{\pi} + c_0'$$

We can make the identifications:

$$\lambda_Y \approx c_1 r_0$$

$$\lambda_\Pi \approx \frac{c_1 r_0 (k - 2)}{k - 1}$$

$$c \approx c_0'$$

For $k \gg 1$, which is the quantity theory limit of the information equilibrium model, we have $\lambda_Y \approx \lambda_\Pi$, which was true in Taylor's original 1993 version. Additionally, given $c_1 \simeq 4$, we could take $r_0 \simeq 10\%$ (which is where the short interest rate was in the late 80s and early 90s). [This gives both $\lambda_x \sim 0.5$, which is where Taylor originally set the parameters.]

Essentially, deviations from some interest rate $r_0$ have the same form as the Taylor rule above.

Another way to put this is that in the eyes of an information equilibrium model, the Taylor rule is incorrectly shown as a formula for the value of the nominal interest rate; it should represent a deviation from some "typical" value $r_0$. This typical rate is subsumed into the values of the coefficients.
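For a rough feel of the numbers in the identifications above, here is a tiny Python sketch. It plugs in $c_1 \simeq 4$ and $r_0 \simeq 10\%$ from the text; the values of $k$ are hypothetical and only meant to show the approach to the $k \gg 1$ limit.

```python
def taylor_coefficients(c1, r0, k):
    """Taylor-rule-like coefficients from the identifications in the text:
    lambda_Y = c1 * r0 and lambda_Pi = c1 * r0 * (k - 2) / (k - 1)."""
    lam_y = c1 * r0
    lam_pi = c1 * r0 * (k - 2.0) / (k - 1.0)
    return lam_y, lam_pi

c1, r0 = 4.0, 0.10  # values quoted in the text

# Hypothetical information transfer indices, increasing towards the quantity theory limit.
table = {k: taylor_coefficients(c1, r0, k) for k in (3.0, 10.0, 100.0)}
for k, (lam_y, lam_pi) in table.items():
    print(k, round(lam_y, 3), round(lam_pi, 3))
```

The output coefficient does not depend on $k$, while the inflation coefficient rises towards it as $k$ grows, so the two only coincide in the large $k$ limit.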

...

Update 16 August 2016

I do want to point out two things:

1. Although the form of an existing DSGE model could potentially be obtained with a series of information equilibrium relationships, the interpretation of the equations is different and non-ideal information transfer means the IE versions of the DSGE models will occasionally fail in a way described by non-ideal information transfer. A good example is the Taylor rule above: it represents an equilibrium, not necessarily something achieved by the actions of a central bank.
2. It is likely (probably definitely true) that not all DSGE models can be obtained from information equilibrium relationships. There is an overlap in the Venn diagram of DSGE models and IE models, but neither is a subset of the other.

...

Update 17 August 2016

One thing to note is that I am leaving off the "stochastic" bit and the $E$ operators for now. The interpretation of the information equilibrium version of the DSGE model will show how these pieces arise. They are deeply linked. In the second installment, I begin the discussion of the $E$ operators. [Now added in part 3.]

DSGE, part 2

I am continuing to build a standard DSGE model (specifically, the simple three equation New Keynesian DSGE model) using information equilibrium (and maximum entropy). In part 1, I summarized the references and built a "Taylor rule". In this installment, I will use the Euler equation to derive the "IS curve". I'll assume rational expectations for simplicity at first (one can drop the $E$'s), but will add some discussion at the end.

Let's start with the information equilibrium relationship between (real) output and (real) consumption $Y \rightleftarrows C$. This tells us that

$$Y \sim C^{1/\sigma}$$

or in log-linear form $y = \frac{1}{\sigma} c$. I took the information transfer index to be $1/\sigma$ so that we end up with something that might be recognizable by economists. Now let's import the maximum entropy condition relating two periods of consumption at time $t$ and $t+1$ from this post:

$$C_{t+1} = C_t (1 + r_t)$$

or in log-linear form $c_{t+1} = c_t + r_t$. Substituting output $y$, defining the real interest rate in terms of the nominal interest rate $i$ and expected inflation $r_t \equiv i_t - \pi_{t+1}$, and rearranging we obtain:

$$y_t = -\frac{1}{\sigma}(i_t - \pi_{t+1}) + y_{t+1}$$

And there we have the NK IS curve. We can add in the expectation operators if you'd like:

$$y_t = -\frac{1}{\sigma}(i_t - E_t \pi_{t+1}) + E_t y_{t+1}$$
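The substitution step can be verified with a few lines of Python. This sketch drops the expectation operators (the rational expectations shortcut mentioned above) and uses made-up log-linear values for $\sigma$, $i_t$, $\pi_{t+1}$ and $y_{t+1}$; it only checks that the IS curve is consistent with $c = \sigma y$ and $c_{t+1} = c_t + r_t$.

```python
def is_curve_output(sigma, i_t, pi_next, y_next):
    """NK IS curve without expectation operators: y_t = -(i_t - pi_{t+1}) / sigma + y_{t+1}."""
    return y_next - (i_t - pi_next) / sigma

# Hypothetical log-linear values, for illustration only.
sigma, i_t, pi_next, y_next = 2.0, 0.05, 0.02, 0.10

y_t = is_curve_output(sigma, i_t, pi_next, y_next)

# Recover consumption from y = c / sigma and check the maximum entropy condition.
c_t, c_next = sigma * y_t, sigma * y_next
r_t = i_t - pi_next

print(y_t, c_next - c_t, r_t)
```

The consumption change between the two periods reproduces the real interest rate, which is the log-linear maximum entropy condition the derivation started from.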

And this is where the information equilibrium version of the IS curve has a different interpretation. The information equilibrium model can be viewed as a transfer of information from the future to the present. We can interpret the "expected" value as the ideal information transfer value, and deviations from that as non-ideal information transfer. The value added by this interpretation is that instead of rational expectations where the deviation from the expected value has some zero-mean distribution, we generally have e.g. prices that will be bounded from above by the ideal information equilibrium solution. Here's an example using interest rates:

We could think of the $E$ operators as a warning: this variable may come in below expectations due to coordinations (financial panic, recession). Therefore, we should think of the information equilibrium NK DSGE model as a bound on a dynamic system, not necessarily the real result. With this in mind, it is no wonder DSGE models would work well for the great moderation but fail during a massive coordination event.

Posted by Jason Smith at 12:30 PM

DSGE, part 3 (stochastic interlude)

So far, I've left out the stochastic terms (the S in DSGE) in this series (part 1, part 2). I'd like to show how it would appear in the information equilibrium models. Let's start with the NK IS curve:

$$y_t = -\frac{1}{\sigma}(i_t - E_I \pi_{t+1}) + E_I y_{t+1}$$

Per part 2, we should interpret the expectations operators (the $E$'s) instead as "information equilibrium" values (we can use the same letter $E$ to which I affixed a subscript $I$ above). Actually, we should interpret every variable in the equation as the information equilibrium values:

$$E_I y_t = -\frac{1}{\sigma}(E_I i_t - E_I \pi_{t+1}) + E_I y_{t+1}$$

If we want to use observed values at the present time index $t$, we need to account for a deviation $n$ due to non-ideal information transfer

$$x_t = E_I x_t - n_t$$

This deviation can look very much like a traditional stochastic process:

Substituting and collecting (as $\nu$) these stochastic terms introduced by removing the $E_I$ operators at the current time index $t$ (but not for future ones since we don't know what the values are), we obtain the traditional DSGE form of the NK IS curve:

$$y_t = -\frac{1}{\sigma}(i_t - E_I \pi_{t+1}) + E_I y_{t+1} + \nu_t$$

The stochastic piece is the deviation from ideal information transfer and maximum entropy.

DSGE, part 4

In the fourth installment, I am going to build one version of the final piece of the New Keynesian DSGE model in terms of information equilibrium: the NK Phillips curve. In the first three installments I built (1) a Taylor rule, (2) the NK IS curve, and (3) related expected values and information equilibrium values to the stochastic piece of DSGE models. I'm not 100% happy with the result -- the stochastic piece has a deterministic component -- but then the NK DSGE model isn't very empirically accurate.

Let's start with the information equilibrium relationship between nominal output and the price level

$$\Pi \rightleftarrows N$$

so that we can say (with information transfer index $\alpha$, and using the definition of the information equilibrium expectation operators from here)

$$E_I \pi_{t+1} - E_I \pi_t = \alpha \left(E_I n_{t+1} - E_I n_t\right)$$

Using the following substitutions (defining the information equilibrium value in terms of an observed value and a stochastic component, defining the output gap $x$, and defining real output)

$$
\begin{aligned}
E_I a_t &\equiv a_t - \nu^a_t \\
x_t &\equiv E_I y_t - y_t \\
n_t &\equiv y_t + \pi_t
\end{aligned}
$$

and a little bit of algebra, we find

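$$
\begin{aligned}
\pi_t &= E_I \pi_{t+1} + \frac{\alpha}{1-\alpha} x_t + \mu_t \\
\mu_t &\equiv \nu^\pi_t - \frac{\alpha}{1-\alpha} \nu^y_t - \frac{\alpha}{1-\alpha}\left(E_I y_{t+1} - E_I y_t\right)
\end{aligned}
$$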

The first equation is essentially the NK Phillips curve; the second is the "stochastic" piece. One difference from the standard result is that there is no discount factor applied to future information equilibrium inflation (the first term of the first equation). A second difference is that the stochastic piece actually contains information equilibrium real growth (the last term). In a sense, it is a biased random walk towards reducing the output gap.

Anyway, this is just one way to construct a NK Phillips curve. I'm not 100% satisfied with this derivation because of those two differences; maybe a better one will come along in a later update.

DSGE, part 5 (summary)

I've just finished deriving a version of the three-equation New Keynesian DSGE model from a series of information equilibrium relationships and a maximum entropy condition. We have

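$$
\begin{aligned}
\Pi &\rightleftarrows N && \text{with IT index } \alpha \\
X &\rightleftarrows C && \text{with IT index } 1/\sigma \\
R &\rightleftarrows \Pi_{t+1} && \text{with IT index } \lambda_\pi \\
R &\rightleftarrows X_{t+1} && \text{with IT index } \lambda_x
\end{aligned}
$$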

along with a maximum entropy condition on the intertemporal consumption $\{C_t\}$ subject to a budget constraint:

$$C_{t+1} = R_t C_t$$

We can represent these graphically

These stand for information equilibrium relationships between the price level $\Pi$ and nominal output $N$, real output gap $X$ and consumption $C$, nominal interest rate $R$ and the price level, and the nominal interest rate and the output gap $X$. These yield

$$
\begin{aligned}
r_t &= \lambda_\pi E_I \pi_{t+1} + \lambda_x E_I x_{t+1} + c \\
x_t &= -\frac{1}{\sigma}\left(r_t - E_I \pi_{t+1}\right) + E_I x_{t+1} + \nu_t \\
\pi_t &= E_I \pi_{t+1} + \frac{\alpha}{1-\alpha} x_t + \mu_t
\end{aligned}
$$

with information equilibrium rational (i.e. model-consistent) expectations $E_I$ and "stochastic innovation" terms $\nu$ and $\mu$ (the latter has a bias towards closing the output gap -- i.e. the IE version has a different distribution for its random variables). With the exception of a lack of a coefficient for the first term on the RHS of the last equation, this is essentially the three equation New Keynesian DSGE model: Taylor rule, IS curve, and Phillips curve (respectively).

One thing I'd like to emphasize is that although this model exists as a set of information equilibrium relationships, they are not the best set of relationships. For example, the typical model I use here (here are some others) that relates some of the same variables is

$$
\begin{aligned}
\Pi : N &\rightleftarrows M0 && \text{with IT index } k \\
r_M &\rightleftarrows p_M && \text{with IT index } c_1 \\
p_M : N &\rightleftarrows M && \text{with IT index } c_2 \\
\Pi : N &\rightleftarrows L && \text{with IT index } c_3
\end{aligned}
$$

where M0 is the monetary base without reserves and $M =$ M0 or MB (the monetary base with reserves) and $r_{M0}$ is the long term interest rate (e.g. 10-year treasuries) and $r_{MB}$ is the short term interest rate (e.g. 3-month treasuries). Additionally, the stochastic innovation term in the first relationship is directly related to changes in the employment level $L$. In part 1 of this series, I related this model to the Taylor rule; the last IE relationship is effectively Okun's law (in terms of hours worked here or added with capital to the Solow model here -- making this model a kind of weird hybrid of a RBC model deriving from Solow and a monetary/quantity theory of money model).

Here is the completed series for reference:

DSGE, part 1 [Taylor rule]
DSGE, part 2 [IS curve]
DSGE, part 3 (stochastic interlude) [relates $E_I$ and stochastic terms]
DSGE, part 4 [Phillips curve]
DSGE, part 5 [the current post]

MINIMAC as an information equilibrium model

Nick Rowe has a post up where he blegs the impossible ... A 3D Edgeworth box does not exist (it is at a minimum 6D as I mention in a comment at the post) [1]. However, Nick does cite MINIMAC, a minimal macro model described by Paul Krugman here. It gives us a fun new example to apply the information equilibrium framework!

We'll start with our typical utility framework (see e.g. here or here) with

$$U \rightarrow C_i$$

$$U \rightarrow M_i$$

Where there are a large number of periods $i = 1 ... n$. Our utility function is:

$$U \sim \prod_i^n C_i^{s} \prod_j^n M_j^{\sigma}$$

I'm going to build in a connection to the information transfer model right off the bat by adding (assuming constant information transfer index for simplicity):

$$P : N \rightarrow M$$

so that:

$$P \sim M^{k-1}$$

and more importantly for us:

$$\left(\frac{M}{P}\right)^{1-s} \sim M^{(2-k)(1-s)}$$

Which means that our utility function matches the MINIMAC utility function up to a logarithm (our $U$ would be $\log U$ using Krugman's $U$) if $\sigma = (2-k)(1-s)$:

$$U \sim \prod_i^n C_i^{s} \prod_j^n \left(\frac{M_j}{P_j}\right)^{1-s}$$

The general budget constraint is given by:

$$L = \sum_i C_i + \sum_j M_j$$

In the MINIMAC model, we're only concerned with two periods (call them $i$ and $j$). Essentially in period $i$, $C_{k \leq i} = 0$ and $M_{k \leq i} = M$ and in period $j$, $C_{k \geq j} = C$ and $M_{k \geq j} = M'$ to make the connection with Krugman's notation. We'll use the maximum entropy assumption with a large number of time periods so that the most likely point is near the budget constraint (first shown here):

My terrible handwriting and the maximum entropy solution for a large number of periods (high dimensional volume). The orange dots represent the density of states for a high dimensional system. Connection with Krugman's notation also shown.

There are some interesting observations. If $k = 2$, which is $\kappa = 1/2$, then we have the quantity theory of money, but $\sigma = 0$, so utility only depends on consumption. Also if we take $M_i = M_j$ (constant money supply), we should randomly observe cases of unemployment where $L' < L$ and consumption is below the maximum entropy level near the budget constraint:

Occasionally, you observe a point (red) that moves away from the budget constraint resulting in unemployment.

In fact, we should typically observe $L' < L$ since the maximum entropy point is near, but not exactly at the budget constraint. Voilà! The natural rate of unemployment is essentially dependent on the dimensionality of the consumption periods. With an infinite number, you'd observe no unemployment. For two time periods, you'd observe ~ 50% unemployment (the red dot in the image above would appear near the center of the triangle most of the time). In our world with some large, but not infinite, number of periods we have a distribution that peaks around a natural rate around ~ 5%:

The natural rate is given by the dimensionality of the temporal structure of the model. In some large, but finite, number of time periods you have an unemployment rate near e.g. 5%.

Footnotes:

[1] You can easily fit a pair of xy axes together where x1 = x0 - x2 and y1 = y0 - y2 (flip both axes) but you can't do it for three sets of xyz axes since x1 = x0 - x2 - x3 (i.e. it depends on two axes). As Nick mentions in reply to my comment, you can do it for 2 agents and 3 goods. And he's right that the math works out fine -- it's basically a three-good Arrow-Debreu general equilibrium model. For my next trick, I think I will build Nick's model.

Metzler diagrams from information equilibrium

Paul Krugman has a post today where he organizes some DSGE model results in a simplified Mundell-Fleming model represented as a Metzler diagram. Let me show you how this can be represented as an information equilibrium (IE) model.

We have interest rates $r_1, r_2$ in two countries coupled through an exchange rate $e$. Define the interest rate $r_i$ to be in information equilibrium with the price of money $M_i$ in the respective country (with money demand $D_i$) -- this sets up four IE relationships:

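$$
\begin{aligned}
r_1 &\rightleftarrows p_1 \\
p_1 : D_1 &\rightleftarrows M_1 \\
r_2 &\rightleftarrows p_2 \\
p_2 : D_2 &\rightleftarrows M_2
\end{aligned}
$$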

This leads to the formulas (see the paper)

$$(1) \quad r_i = \left(k_i \frac{D_i}{M_i}\right)^{c_i}$$

Additionally, exchange rates are basically given as a ratio of the price of money in one country to another:

$$e \equiv \frac{p_1}{p_2} = \alpha \frac{M_1^{k_1 - 1}}{M_2^{k_2 - 1}}$$

And now we can plot the formula (1) versus $M_1^{k_1 - 1}$ (blue) and $M_2^{1 - k_2}$ (yellow) at constant $D_i$ (partial equilibrium: assuming demand changes slowly compared to monetary policy changes). This gives us the Metzler diagram from Krugman's post and everything that goes along with it:

Also, for $k \approx 1$ (liquidity trap conditions), these curves flatten out:
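A minimal numerical sketch of these two formulas may help. The parameter values are hypothetical and chosen only for illustration; the point is that at constant demand the interest rate falls as money rises, while for $k_1 = 1$ the variable $M_1^{k_1 - 1}$ (and hence the exchange rate) stops responding to $M_1$.

```python
def interest_rate(k, demand, money, c):
    """Formula (1): r_i = (k_i * D_i / M_i) ** c_i."""
    return (k * demand / money) ** c

def exchange_rate(alpha, m1, k1, m2, k2):
    """e = alpha * M_1**(k_1 - 1) / M_2**(k_2 - 1)."""
    return alpha * m1 ** (k1 - 1.0) / m2 ** (k2 - 1.0)

# Hypothetical parameter values (assumptions, not taken from the post).
D, c, alpha = 100.0, 1.0, 1.0

# Partial equilibrium: demand held fixed while the money supply doubles.
r_low_money = interest_rate(1.5, D, 10.0, c)
r_high_money = interest_rate(1.5, D, 20.0, c)

# Exchange rate response to a doubling of M_1, away from and at k_1 = 1.
e_normal = [exchange_rate(alpha, m1, 1.5, 10.0, 1.5) for m1 in (10.0, 20.0)]
e_trap = [exchange_rate(alpha, m1, 1.0, 10.0, 1.5) for m1 in (10.0, 20.0)]
```

With $k_1 = 1.5$ the exchange rate moves when $M_1$ doubles; with $k_1 = 1$ the two entries of `e_trap` are identical.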

Diamond-Dybvig as a maximum entropy model

I'm pretty sure this is not the standard way to present Diamond-Dybvig (which seems more commonly to be presented as a game theory problem). However, this presentation will allow me to leverage some of the machinery of this post on utility and information equilibrium. I'm also hoping I haven't completely misunderstood the model.

Diamond-Dybvig is originally a model of consumption in 3 time periods, but we will take that to be a large number of time periods (for reasons that will be clear later). Time $t$ will be between 0 and 1.

Let's define a utility function $U(c_1, c_2, ...)$ to be the information source in the markets

$$MU_{c_i} : U \rightarrow c_i$$

for $i = 1 ... n$ where $MU_{c_i}$ is the marginal utility (a detector) for the consumption $c_i$ in the $i$th period (information destination). We can immediately write down the main information transfer model equation:

$$MU_{c_i} = \frac{\partial U}{\partial c_i} = k_i \frac{U}{c_i}$$

Solving the differential equations, our utility function $U(c_1, c_2, ...)$ is

$$U(c_1, c_2, ...) = a \prod_i \left(\frac{c_i}{C_i}\right)^{k_i}$$

Where the $C_i$ and $a$ are constants. The basic timeline we will consider is here:

Periods $i$ and $k$ are some "early" time periods near $t = 0$ with consumption $c_i$ and $c_k$ while period $j$ is a "late" time period near $t = 1$ with consumption $c_j$. We introduce a "budget constraint" that basically says if you take your money out of a bank early, you don't get any interest. This is roughly the same as in the normal model except now period 1 is the early period $i$ and period 2 is the late period $j$. We define $t$ to be $t_j - t_i$ with $t_j \approx 1$ so the bank's budget constraint is

$$(1) \quad t \, c_i + (1 - t) \frac{c_j}{1 + r} = 1$$

The total available state space is therefore an $n$-dimensional polytope with vertices along axes $c_1$, $c_2$, ... $c_n$. For example, in three dimensions (periods) we have something that looks like this:

Visualizing this in higher dimensions is harder. Each point inside this region is taken to be equally likely (equipartition or maximum information entropy). Since we are looking at a higher dimensional space, we can take advantage of the fact that nearly all of the points are near the surface ... here, for example is the probability density of the location of the points in a 50-dimensional polytope (where 1 indicates saturation of the budget constraint):
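The concentration near the surface can be checked with a short Monte Carlo sketch. It assumes, for simplicity, a unit budget with equal weights on every period (so the region is $\sum_i c_i \leq 1$, $c_i \geq 0$) rather than the weighted constraint of Eq. (1); the function names are hypothetical. Under that assumption the fraction of the budget used has mean $n/(n+1)$.

```python
import random

def sample_budget_fraction(n_periods, rng):
    """Draw one point uniformly from {c_i >= 0, sum(c_i) <= 1} and return sum(c_i).

    Normalizing n+1 exponential draws gives a uniform point on the n-simplex;
    dropping the last coordinate leaves a uniform point in the solid region.
    """
    draws = [rng.expovariate(1.0) for _ in range(n_periods + 1)]
    return sum(draws[:n_periods]) / sum(draws)

def mean_budget_fraction(n_periods, n_samples=5000, seed=0):
    """Average fraction of the budget used over many equally likely states."""
    rng = random.Random(seed)
    total = sum(sample_budget_fraction(n_periods, rng) for _ in range(n_samples))
    return total / n_samples

low_dim = mean_budget_fraction(2)    # close to 2/3
high_dim = mean_budget_fraction(50)  # close to 50/51
```

As the number of periods grows, the typical state sits closer and closer to saturation of the budget constraint.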

Therefore the most likely point will be just inside the center of that surface (e.g. the center of the triangle in the 3D model above). If we just look at our two important dimensions -- an early and late period -- we have the following picture:

The green line is Eq. (1) the bank's budget constraint (all green shaded points are equally likely, and the intercepts are given by the constraint equation above) and the blue dashed line is the maximum density of states just inside the surface defined by the budget constraint. The blue 45 degree line is the case where consumption is perfectly smoothed over every period -- which is assumed to be the desired social optimum [0]. The most likely state with equal consumption in every period is given by E in the diagram.

The "no bank" solution is labeled NB where consumption in the early period is $c_i \approx 1$. The maximum entropy solution where all consumption smoothing (and even consumption "roughening") states are possible because of the existence of banks is labeled B.

The utility level curves are derived from the Cobb-Douglas utility function at the top of this post. You can see that in this case we have B at higher utility than E or NB and that having banks allows us to reach closer to E than NB.

If people move their consumption forward in time (looking at time $t_k < t_i$), you can get a bank run as the solution utility (red, below) passes beyond the utility curve that goes through the NB solution. Here are the two cases where there isn't a run (labeled NR) and there is a run (labeled R):

Of course, the utility curves are unnecessary for the information equilibrium/maximum entropy model and we can get essentially the same results without referencing them [1], except that in the maximum entropy case we can only say a run happens when R reaches $c_i \approx 1$ (the condition dividing the two solutions becomes the consumption in the early period is equal to the consumption in the case of no banks, rather than the utility of the consumption in the first period is equal to the utility of the consumption in the case of no banks).

I got into looking at Diamond Dybvig earlier today because of this post by Frances Coppola, who wanted to add in a bunch of dynamics of money and lending with a central bank. The thing is that the maximum entropy approach is agnostic about how consumption is mediated or the source of the interest rate. So it is actually a pretty general mechanism that should be valid across a wide array of models. In fact, we see here that the Diamond Dybvig mechanism derives mostly from the idea of the bank budget constraint (see footnote [1], too), so in any model where banks have a budget constraint of the form Eq. (1) above, you can achieve bank runs. Therefore deposit insurance generally works by alleviating the budget constraint. No amount of bells and whistles can help you understand this basic message better.

It would be easy to add this model of the interest rate so that we take (allowing the possibility of non-ideal information transfer)

$$r \leq \left(\frac{1}{k_p} \frac{NGDP}{MB}\right)^{1/k_r}$$

This would be equality in the ideal information transfer (information equilibrium) case. Adding in the price level model, we'd have two regimes: high and low inflation. In the high inflation scenario, monetary expansion raises interest rates (and contraction lowers them); in the low inflation scenario, monetary expansion lowers interest rates (and contraction raises them). See e.g. here. I'll try to work through the consequences of that in a later post ... it mostly moves the bank budget constraint Eq. (1).

Footnotes:

[0] Why? I'm not sure. It makes more sense to me that people would want to spend more when they take in more ... I guess it is just one of those times in economics where this applies: ¯\_(ツ)_/¯

[1] In that case the diagrams are much less cluttered and look like this:

What happens when you push on a price?

Ed. note: This post has been sitting around because I never found a satisfying answer. However, this post from John Handley inspired a comment that led to a more scientific take on it. A lot of economics deals with situations where some entity impacts a market price: taxes, subsidies, or interest rates in general with a central bank. With the information equilibrium picture of economics, it's easy to say what happens when you change demand or supply ... the price is a detector of information flow.

For my thought experiments, I always like to think of an ideal gas with pressure $p$, energy $E$ and volume $V$ (analogous to price $p$, demand $D$ and supply $S$, respectively):

$$p = k \frac{E}{V}$$

How do I increase the pressure of the system? Well, I can reduce $V$ or increase $E$ (raise temperature or add more gas). One thing is for certain: grabbing the pressure needle and turning it will not raise the pressure of the gas! (This is like Nick Rowe's thought experiment of grabbing the speedometer needle).

... at least under the conditions where the detector represents an ideal probe (the probe has minimal impact on the system ... like that pressure gauge or speedometer needle). But our probe is the market itself -- it is maximally connected to the system. Therefore when you push on a price (through a regulation, tax, minimum wage, or quota system), it does impact supply and/or demand. The and/or is critical because these impacts are observed to be empirically different.

Since we don't know, we have to plead ignorance. Therefore price dynamics (for a short time and near equilibrium with $D \approx D_{eq}$ and $S \approx S_{eq}$) should follow:

$$
\begin{aligned}
\frac{dp}{dt} = {} & a_0 + a_1 t + o(t^2) \\
& + d_{10}(D - D_{eq}) + o(D^2) + s_{10}(S - S_{eq}) + o(S^2) \\
& + d_{11}\frac{d}{dt}(D - D_{eq}) + o(D^2) + s_{11}\frac{d}{dt}(S - S_{eq}) + o(S^2) \\
& + c_{20}(D - D_{eq})(S - S_{eq}) + o(D^2 S^2)
\end{aligned}
$$

This gives us an excellent way to organize a lot of effects. The leading constant coefficient would be where un-modeled macroeconomic inflation would go (it is a kind of mean field approximation). Entering into $a_0$ and $a_1$ would be non-ideal information transfer -- movements in the prices that have nothing to do with changes in supply and demand. Interestingly, these first terms also contain expectations.

The next terms do not make the assumption that $D_{eq} = S_{eq}$ or that they even adjust at the same rate. This covers the possibilities that demand could perpetually outstrip supply (leading to market-specific inflation -- housing comes to mind), and that demand adjusts to price changes faster than supply does (or vice versa). For example, demand for gasoline is fairly constant for small shifts in price, so price changes reflect changes in supply ($d_{10} \approx 0$). If you think pushing on a price moves you to a different equilibrium, then you might take $X_{eq} = X_{eq}(t)$, but we'll assume $dX_{eq}/dt = 0$ for now.

Basically, your theory of economics determines the particular form of the expansion. The "Walrasian" assumption (per John Handley's post) is that $D = S$ always. Adding rational expectations of (constant) inflation leaves you with the model:

$$\frac{dp}{dt} = a_0$$

Assuming information equilibrium yields a non-trivial restriction on the form of the expansion (see e.g. here for what happens when you add time to the information equilibrium condition). We obtain (taking $X - X_{eq} \equiv \Delta X$):

$$\frac{dp}{dt} = \frac{k}{S_{eq}}\frac{dD}{dt} - \frac{k \Delta S}{S_{eq}^2}\frac{dD}{dt} - \frac{k \Delta D}{S_{eq}^2}\frac{dS}{dt} + \cdots$$

We find that almost all of the terms in the expansion above have zero coefficients. The leading term would be $d_{11} = k/S_{eq}$. The next terms would be the $c_{21}$ terms -- second order cross terms with one time derivative. Including only the lowest order terms and adding back in the possibility of non-ideal information transfer, we have

$$\frac{dp}{dt} = a_0 + a_1 t + \frac{k}{S_{eq}}\frac{dD}{dt}$$

All small price changes are due to (temporal) changes in demand or non-ideal information transfer! Integrating (dropping the higher order time term):

$$p(t) - p(t_0) = a_0 (t - t_0) + \frac{k}{S_{eq}}\left(D(t) - D(t_0)\right)$$

This means when you push on a price, at least to leading order, you impact demand (or cause non-ideal information transfer). It also has the opposite sign you might expect. An increase in price would increase demand! Note that this assumes general equilibrium (where demand and supply both adjust quickly to changes). But in general equilibrium, increasing demand means increasing supply as well, so we can understand the result that way. It could also be the case that nominal demand ($D$) goes up while real demand ($D/p$) goes down depending on the value of the coefficients.

If we assume demand adjusts slowly ($dD/dt \approx 0$), then we get the "Econ 101" result (returning to information equilibrium) where an increase in price reduces demand, assuming supply is increasing (e.g. economic growth):

$$\frac{dp}{dt} = -\frac{k \Delta D}{S_{eq}^2}\frac{dS}{dt}$$

For information equilibrium to reproduce the Econ 101 result that a tax increase reduces demand, you have to assume 1) information transfer is ideal, 2) demand changes slowly, and 3) economic growth ... or instead of 1-3, just assume non-ideal information transfer. Therefore the simplest explanations of the standard Econ 101 impacts of pushing on a price would actually be a decline in real demand or breaking information equilibrium.

This is not to say these assumptions aren't valid -- they could well be. It's just that there are a lot of assumptions at work whenever anyone tells you what the effects of changing a price are.

The Economy at the End of the Universe, part II

Nick Rowe asked in a comment on this post if I could look at the second model (a Cagan monetary demand model) he describes in his post and that took me on an adventure that might be enlightening. I also thought about calling this post Life, the Universe and Economics (or something like that) as the sequel to the previous post.

Anyway, a Cagan model has $M/P$ (real money supply) as a negative function of expected inflation, so using rational expectations (model-consistent expectations) and taking $\log M \equiv m$ and $\log P \equiv p$ we have

$$\log M/P = m - p = -\tau \pi^E = -\tau \frac{d}{dt} \log P = -\tau \dot{p}$$

The general (forward-looking) solution to this equation is an integral of the money supply (or monetary base or what have you)

$$p(t) = \frac{1}{\tau} \int_t^\infty dt' \, e^{\frac{t - t'}{\tau}} m(t')$$

As an aside, this is an example of using a Laplace transform to solve a differential equation. We can see that $\tau$ is essentially a time horizon. It weights the future (expected) money supply by a decreasing exponential factor (similar to a discount factor).

But that infinity is also a time horizon -- it tells us where to stop considering the future at all. Note that times $T \gg \tau > t$ don't contribute much to the integral. If we look at the limiting behavior of this integral (at times $t$ in the distant future after any impact of a shock to monetary policy such that $m(t) = \mu t$ -- constant money growth) we have:

$$P(t) = \exp p(t) = \exp\left(\mu \left[(t + \tau) - e^{(t - T)/\tau}(T + \tau)\right]\right)$$

This has two different limits depending on whether you take $\tau$ to infinity first (it's 0) or $T$ to infinity first (it's infinite). Again, we have the same issue as we had with Nick Rowe's reductio ad absurdum and the conclusion we should draw is that only the limits $T \gg \tau$ and $\tau \gg T$ could make sense. There's either rapid discounting or slow discounting (those limits just mentioned, respectively) at the horizon. And in those two cases we have either $P \sim \exp \mu t$ (steady growth in the price level) or $P \sim 1$ (i.e. constant, but with a growing money supply).
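A quick numerical check of the two limits is sketched below. It assumes $m(t') = \mu t'$ and cuts the forward-looking integral off at a horizon $T$; integrating by parts then gives $p(t) = \mu\left[(t + \tau) - e^{(t - T)/\tau}(T + \tau)\right]$. The function names and parameter values are hypothetical.

```python
import math

def log_price_closed(t, tau, T, mu):
    """Closed form of (1/tau) * integral_t^T exp((t - s)/tau) * mu * s ds."""
    return mu * ((t + tau) - math.exp((t - T) / tau) * (T + tau))

def log_price_numeric(t, tau, T, mu, steps=20000):
    """Same integral evaluated with the trapezoid rule."""
    h = (T - t) / steps
    total = 0.0
    for i in range(steps + 1):
        s = t + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp((t - s) / tau) * mu * s
    return total * h / tau

mu = 0.03
# Rapid discounting (T >> tau): p approaches mu * (t + tau), so P ~ exp(mu t).
p_rapid = log_price_closed(1.0, 1.0, 200.0, mu)
# Slow discounting (tau >> T): p approaches 0, so P ~ 1.
p_slow = log_price_closed(1.0, 1.0e6, 10.0, mu)
```

The trapezoid evaluation agrees with the closed form at intermediate values of $\tau$ and $T$, which is a useful sanity check on the integration by parts.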

If we take the first limit we have something that looks basically like the previous post (except the steady state has constant growth, which we could have chosen for the previous model but didn't for simplicity):

In any case, we have the same problem that as the time horizon $t_0$ at which $m(t_0)$ returns to "normal" moves out, the result differs by a larger and larger amount from $P \sim \exp \mu t$. Here's the graph of inflation ($\dot{p}(t)$):

Note that the Cagan model has different results [pdf] for adaptive expectations. Depending on parameters, inflation is either basically monetary (it depends on $m(t)$, Nick Rowe's concrete steppes) or basically expectations (i.e. is independent of $m(t)$). The former limit is either rapid discounting, rapid adaptation, or both. The latter is either slow discounting, slow adaptation, or both (but results in exploding inflation).

...

Addendum

It is interesting to look at this model as an information transfer model $X : M \rightleftarrows P$ where "expectations" $X$ are the detector of information equilibrium between the money supply and the price level

$$k \frac{M}{P} = \frac{dM}{dP} \equiv X = e^{-\tau \pi^E}$$

Or in the form above, using $x = \log X$

$$m - p + \log k = x$$

In general equilibrium we have

$$X \sim P^{k-1}$$

so if we use rational expectations, we must equate the price $X$ with the value of $X$ above such that

$$X \sim e^{(k-1) \log P} \sim e^{-\tau \pi^E} \equiv e^{-\tau \frac{d}{dt} \log P}$$

and we discover the operator formula

$$-\tau \frac{d}{dt} = k - 1$$

when acting on $p = \log P$. That means

$$P \sim \exp\left(\exp\left(\frac{1-k}{\tau} t\right)\right)$$

That's not a typo -- it's a double exponential. In order to have $P$ not explode (given rational expectations), we need $k = 1$, which implies that the price level is constant. This is exactly the same finding as a more traditional approach to the Cagan model [pdf]. The resolution here depends on what model you trust more. If you trust rational expectations, then you question the equilibrium. If you trust information equilibrium, you question rational expectations.

...

PS Gosh, I had to think about this for nearly two weeks and over three flights. I may still have gotten it wrong. Weird and subtle things can happen when you have to deal with infinity.

PPS You can see the limits pretty directly just looking at the equation

$$m - p = -\tau \dot{p}$$

If $\tau \gg T$, then you have $-\tau \dot{p} \approx 0$, and thus $P \sim 1$. If $T \gg \tau$, then you have $m - p \approx 0$ and so $P \sim M(t) \sim \exp \mu t$ in my example.

Towards an information equilibrium take on the Lucas Islands model

Commenter LAL asked if I could take a look at the Lucas Islands model in the information equilibrium framework. This post is basically just a set-up of that model and some initial observations. We'll start with a series of markets:

$$p_i : \left(\sum_{j \neq i} n_j\right) \rightarrow s_i$$

$$p_i : n_i \rightarrow m$$

These two markets set up the model where the price $p_i$ depends both on supply ($s_i$) and demand for the individual goods (first market) as well as the growth of the aggregate money supply (i.e. trend inflation, second market). If there are a large number of markets, we can make an immediate simplification where

$$\sum_{j \neq i} n_j \approx \sum_j n_j \equiv N$$

The second market is a simple quantity theory of money and we can immediately solve the resulting differential equation (much like is done in the development of the partition function approach):

$$n_i = c_i \left(\frac{m}{m_0}\right)^{k_i}$$

$$p_i = c_i k_i \left(\frac{m}{m_0}\right)^{k_i - 1}$$
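The power-law solution can be checked by integrating the information equilibrium condition $dn_i/dm = k_i n_i / m$ numerically. The sketch below uses a hand-written classical Runge-Kutta step; the function name and the parameter values are hypothetical.

```python
def solve_quantity_market(k, c, m0, m1, steps=1000):
    """Integrate dn/dm = k * n / m from m0 to m1 with RK4, starting at n(m0) = c."""
    def slope(m, n):
        return k * n / m

    h = (m1 - m0) / steps
    m, n = m0, c
    for _ in range(steps):
        s1 = slope(m, n)
        s2 = slope(m + h / 2, n + h * s1 / 2)
        s3 = slope(m + h / 2, n + h * s2 / 2)
        s4 = slope(m + h, n + h * s3)
        n += h * (s1 + 2 * s2 + 2 * s3 + s4) / 6
        m += h
    return n

# Hypothetical values: k_i = 1.5, c_i = 2, money supply tripling from m0 = 1.
n_numeric = solve_quantity_market(1.5, 2.0, 1.0, 3.0)
n_power_law = 2.0 * (3.0 / 1.0) ** 1.5
```

The two values agree to high precision, which is just the statement that $n_i = c_i (m/m_0)^{k_i}$ solves the differential equation with $n_i(m_0) = c_i$.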

We can define real output $n_i \equiv P y_i$ with the average price level $P$ over the $I$ islands

$$P = \frac{1}{I} \sum_i p_i = \frac{1}{I} \sum_i c_i k_i \left(\frac{m}{m_0}\right)^{k_i - 1}$$

so that:

$$N = \sum_i n_i = \sum_i P y_i = P \sum_i y_i \equiv PY$$

Equilibrium

From the first market, and using the simplification from a large number of markets, we can say that:

$$p_i = \frac{dN}{ds_i} = a_i \frac{N}{s_i}$$

Now, let's make a simplifying assumption that all the $I$ islands are identical so that $a_i = a_0$, $k_i = k_0$ and $c_i = c_0$. This leaves us with the equations (in equilibrium):

$$p_i = a_0 \frac{N}{s_i}$$

$$P = \frac{1}{I} \sum_i p_i = c_0 k_0 \left(\frac{m}{m_0}\right)^{k_0 - 1}$$

$$N = \sum_i n_i = I c_0 \left(\frac{m}{m_0}\right)^{k_0}$$

so that

$$Y = N/P = \frac{I}{k_0} \frac{m}{m_0}$$

and $y_i = m/(k_0 m_0)$. In equilibrium we also have:

$$c_0 k_0 \left(\frac{m}{m_0}\right)^{k_0 - 1} = a_0 \frac{N}{s_i}$$

$$s_i c_0 k_0 \left(\frac{m}{m_0}\right)^{k_0 - 1} = I a_0 c_0 \left(\frac{m}{m_0}\right)^{k_0}$$

$$s_i = a_0 \frac{I}{k_0} \frac{m}{m_0} = a_0 Y$$

and each $p_i = P$.
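The identical-islands equilibrium is easy to verify numerically. The sketch below plugs hypothetical parameter values into the equations above and confirms that each island price equals the average price level; the function name and the numbers are assumptions for illustration.

```python
def identical_islands_equilibrium(a0, k0, c0, n_islands, m, m0):
    """Evaluate the equilibrium quantities for identical islands."""
    ratio = m / m0
    N = n_islands * c0 * ratio ** k0    # aggregate nominal output
    P = c0 * k0 * ratio ** (k0 - 1.0)   # average price level
    Y = N / P                           # aggregate real output
    s_i = a0 * Y                        # supply on each island
    p_i = a0 * N / s_i                  # island price from the first market
    return {"N": N, "P": P, "Y": Y, "s_i": s_i, "p_i": p_i}

# Hypothetical parameter values.
eq = identical_islands_equilibrium(a0=0.8, k0=1.4, c0=2.0, n_islands=50, m=3.0, m0=1.0)
expected_Y = 50 / 1.4 * 3.0 / 1.0
```

Changing `m` moves $N$ and $P$ but leaves `p_i` equal to `P`, which is the maximum entropy state where all prices are equal.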

Disequilibrium (shocks)

The Lucas Islands model adds in two kinds of shocks (or fluctuations): monetary policy $\sigma_m$ and idiosyncratic market shocks $\sigma_i$. We can add these to the model's general differential equations from the two markets at the top of this post:

$$p_i = \frac{\partial n_i}{\partial s_i} = a_i \frac{n_i}{s_i} + \sigma_i$$

$$p_i = \frac{\partial n_i}{\partial m} = k_i \frac{n_i}{m} + \sigma_m$$

In this model, agents won't be able to tell the difference between $\sigma_m$ and $\sigma_i$ (the signal extraction problem). If we use the identical market simplification and solve the differential equations, we obtain (assuming I've done my math right):

$$n_i = n_0 \left(\frac{s_i}{s_{0,i}}\right)^{a_0 + \frac{(s_i - s_{0,i}) \sigma_i}{\log s_i/s_{0,i}}} \left(\frac{m}{m_0}\right)^{k_0 + \frac{(m - m_0) \sigma_m}{\log m/m_0}}$$

Which we could re-write in terms of re-defined shocks $\Sigma_m$ and $\Sigma_i$

$$n_i = n_0 \left(\frac{s_i}{s_{0,i}}\right)^{a_0 + \Sigma_i} \left(\frac{m}{m_0}\right)^{k_0 + \Sigma_m}$$

or in a log-linear form:

$$\tilde{n}_i = \log n_0 + (a_0 + \Sigma_i)(\tilde{s}_i - \tilde{s}_{0,i}) + (k_0 + \Sigma_m)(\tilde{m} - \tilde{m}_0)$$

If we use the second differential equation (log-linearized) we have:

$$\tilde{p}_i = \tilde{n}_i - \tilde{m} + \log k_0$$

$$\tilde{p}_i = \log n_0 + (a_0 + \Sigma_i)(\tilde{s}_i - \tilde{s}_{0,i}) + (k_0 + \Sigma_m)(\tilde{m} - \tilde{m}_0) - \tilde{m} + \log k_0$$

$$\tilde{p}_i = \tilde{p}_{\Sigma = 0} + \Sigma_i (\tilde{s}_i - \tilde{s}_{0,i}) + \Sigma_m (\tilde{m} - \tilde{m}_0)$$

The Lucas model stipulates that each island will change production based on the price signal they see (and compare it to their inflation expectations based on monetary policy). This creates two ways that equilibrium can be restored (for example the maximum entropy/equilibrium state where all prices are equal) given a rise in price:

1. Production (rise in supply)
2. Fall in demand

The agents use option 2 when the price rise is in line with their expectations based on monetary shocks and option 1 when it is not. Using the gas in a box visualization of the forces from e.g. this post, we would see an imbalance of the number of particles between two boxes being rectified by both particles moving from one box to another (red, fall in demand) and the addition of new particles (blue, production):

There is no a priori reason for one adjustment over the other, so both adjustments should happen if they are in the model. We'd need to define a "chemical potential" for the supply units which would make it a bigger or smaller component of the adjustment depending on the magnitude of the price difference from the equilibrium price (both boxes have equal numbers of particles).

Monetary shifts increase (or decrease) the numbers of points across all the boxes (the islands) and if the system was otherwise in equilibrium, there would be no production needed to offset cases of disequilibrium -- money is neutral in that case.

That is all for now. I will continue this approach in a future post.

Models and frameworks

Given that I recently put forward the idea that inflation and growth are all about labor force growth, I thought I'd clarify some things. Some of you might have asked yourself (or did ask me in comments) about how this "new model" [1] relates to the "old model" [2] that's all about money. I know I did.

The key thing to understand is that the information transfer framework (described in more detail in my arXiv paper) is just that: a framework. It isn't a model itself, just a tool to build models. Those models don't have to be consistent with each other. So there really is no "new model" or "old model", just different models that may be different approximations (or one or both might become empirically invalid as more data comes in).

And as a tool, it's basically an information-theoretic realization of Marshallian supply and demand diagrams. What you do is posit an information equilibrium relationship between A and B, which I write A ⇄ B, or an information equilibrium relationship with an abstract price p = dA/dB, which I write p : A ⇄ B, and here's what's included (act now!) ...

● A general equilibrium relationship between A and B (with price p) where A and B vary together (that always applies). Generally, more A or more B leads to more B or more A, respectively.

● A partial equilibrium supply and demand relationship between A and B (with price p) with B being supply and A being demand -- it applies when either A or B is considered to move slowly with respect to the other (it's an approximation to the former where A or B is held constant).

● The possibility of "market failure" where we have non-ideal information transfer that I write A → B (all of the information from A doesn't make it to B). This leads to a non-ideal price p* < p as well as a non-ideal supply B* < B.

● A maximum entropy principle that describes what (information) equilibrium between A and B actually means, including a causality that can go in both directions along with potentially emergent entropic forces that have no formulation in terms of agents.

So in the information transfer framework there are information equilibrium relationships A ⇄ B and more general information transfer relationships A → B. I tend to refer to these individual relationships as "markets". Given these basic "units of model", you can construct all kinds of relationships. Traditionally crossing-diagrams are easiest. Things like the AD-AS model or the IS-LM model can be concisely written as the market

P : AD ⇄ AS

where AD is aggregate demand and AS is aggregate supply, or the markets

(r ⇄ p) : I ⇄ M

PY ⇄ I

PY ⇄ AS

for the IS-LM model where PY is nominal output (i.e. P × Y = NGDP, I also tend to write it N on this blog and in the paper), I is investment, M is the "money supply", p is the "price of money" and r is the interest rate.

Another aspect of the model is that information equilibrium is an equivalence relation, so that AD ⇄ M and M ⇄ AS implies AD ⇄ AS (this makes an interesting definition of money). This means that if you find a relationship (as I did in [2])

CPI : NGDP ⇄ CLF

there could be some other factor(s) X (, Y, Z, ...) such that

NGDP ⇄ X ⇄ Y ⇄ Z ⇄ CLF

Relationships like this can be inferred from an observed price CPI* that doesn't always satisfy CPI* < CPI, but can instead be above or below the ideal price CPI (CPI* < CPI or CPI* > CPI). That possibility follows from being careful about the direction of information flow and the intermediate abstract prices p₁ and p₂ in the markets

p₁ : NGDP ⇄ X

p₂ : X ⇄ CLF

These would probably find their best analogy in "supply shocks" (price spikes due to non-ideal information transfer) as opposed to "demand shocks" (price falls due to non-ideal information transfer). Note that in the model CPI : NGDP ⇄ CLF with intermediate X, CPI = p₁ × p₂ because CPI = dNGDP/dCLF = (dNGDP/dX) (dX/dCLF) via the chain rule.
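The chain-rule statement can be checked numerically. The sketch below assumes, purely for illustration, power-law relationships NGDP = X^a and X = CLF^b (the exponents, the function names and the evaluation point are hypothetical, not taken from any fit) and compares the product of the two intermediate prices with the overall abstract price.

```python
# Minimal sketch: intermediate prices multiply to the overall price (chain rule).
# The power-law forms and the exponents a, b are illustrative assumptions.

def x_of_clf(clf, b=1.5):
    """Hypothetical intermediate quantity X as a function of CLF."""
    return clf ** b

def ngdp_of_x(x, a=2.0):
    """Hypothetical NGDP as a function of the intermediate quantity X."""
    return x ** a

def derivative(f, at, h=1e-6):
    """Central finite-difference estimate of df/dx at the point `at`."""
    return (f(at + h) - f(at - h)) / (2 * h)

clf = 3.0
x = x_of_clf(clf)

p1 = derivative(ngdp_of_x, x)                             # p1 = dNGDP/dX
p2 = derivative(x_of_clf, clf)                            # p2 = dX/dCLF
cpi = derivative(lambda c: ngdp_of_x(x_of_clf(c)), clf)   # dNGDP/dCLF

print(p1 * p2, cpi)  # the two numbers should agree up to finite-difference error
```

Any smooth pair of functions would do here; power laws are used only because they are the kind of solution information equilibrium relationships tend to produce.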

In the end, however, the only way to distinguish among different information equilibrium models (or information transfer models) is empirically. This framework works much like how quantum field theory works as a framework (as a physicist, I like to have a framework ... anything else is just philosophy). You observe something in an experiment and want to describe it. One group of researchers models it as a real scalar field and writes down a Lagrangian

ℒ = ϕ (∂² – m) ϕ

Another group models it as a spin-1/2 field

ℒ = ψ (i ∂ – m) ψ

(ok, that one's missing a slash and a bar). Both "theories" are perfectly acceptable ex ante, but ex post one or both may be incompatible with empirical data.

Actually one of the goals of this blog (and the information transfer model) was to introduce exactly this kind of model rejection to economics: I was inspired to do this because of Noah Smith's recent post on why macroeconomics doesn't seem to work very well. Put simply: there is limited empirical information to choose between alternatives. My plan is to produce an economic framework that captures at least a rich subset of the phenomena in a sufficiently rigorous way that it could be used to eliminate alternatives. I've come up with several different information equilibrium relationships -- or models built from collections of relationships (see below) -- and I am testing their abilities with forecasts. Some might fail. Some have failed already. For example, the IS-LM model does not work if inflation is high (but represents an implicit assumption that inflation is low, so it is best to think of it as an approximation in the case of low inflation). A few of Scott Sumner's versions of his "market monetarist" model can be written as information equilibrium relationships (see below) ... and they mostly fail.

In a sense, I wanted to try to get away from the typical econoblogosphere (and sometimes even academic economics) BS where someone says "X depends on Y" and someone else (such as myself) would say "that doesn't work empirically in magnitude and/or direction over time" and that someone would come back with other factors A, B and C that are involved at different times. I wanted a world where someone asks: is X ⇄ Y? And then looks at the data and says yes or no. DSGE almost passes this test -- these models are at least specific enough to compare to data. However they don't ever seem to look at the data and say no ... it's always "add a financial sector" or "add behavioral economics". There isn't enough data to support that kind of elaboration.

A good example is the quantity theory of money. It says PY = MV. Now this was great in a world where people thought V was constant (i.e. the old Cambridge k). But that turns out not to be the case and now V could depend on E[PY] or E[P] or E[M] or something else. What are these specific expectation models? Is E[P] = TIPS? Or is V ≡ PY/M now a definition? And what is M? M2? MB?

Essentially various versions of the quantity theory of money have been falsified empirically (or are at best a loose approximation when inflation is high) ... but the theory keeps trucking along because it doesn't exist in a framework where either its scope or validity can be challenged.

It's probably a naive hope, but it's the kind of naive hope that distinguishes "science" from "mathematical philosophy".

...

Addendum: information equilibrium models

Note that just because these models can be formulated does not mean they are correct.

I. The "quantity theory of labor" [1]

P : PY ⇄ CLF

See this post for this one.

II. "The" IT model [2]

P : PY ⇄ M0

(r¹⁰ʸ ⇄ pᴹ⁰) : PY ⇄ M0

(r³ᵐ ⇄ pᴹᴮ) : PY ⇄ MB

P : PY ⇄ L

where the r's represent the long and short term interest rates (10 year and 3 month, respectively), M0 is base minus reserves, MB is the monetary base (including reserves) and L is the labor supply (the last relationship is essentially Okun's law). I usually measure the price level P with core PCE, but empirically it is hard to tell the difference between core PCE and core CPI (or the deflator). This model also allows the information transfer index in the first market to slowly vary. This represents a kind of analytic continuation from a "quantity theory of money" to an "IS-LM model with liquidity trap".

It also has a "DSGE form" that connects labor shocks to nominal output shocks, shows more clearly the limits where monetary expansion is expansionary vs not (liquidity trap) as well as interest rate dynamics in those two limits (separating out the income/inflation effect and the liquidity effect).

Both this model and the next one are in my paper.

III. Solow model (plus IS-LM)

PY ⇄ L

PY ⇄ K ⇄ I

K ⇄ D

1/s : PY ⇄ I

(r³ᵐ ⇄ p) : I ⇄ MB

where the last market is the IS-LM piece, K is capital and D is depreciation. This is a bit different from the traditional Solow model in that it is written in terms of nominal quantities. This may sound problematic, but it throws out total factor productivity as unnecessary and is remarkably empirically accurate in describing output as well as the short term interest rate.

IV. Scott Sumner's various models (1), (2) and (3)

1) u : NGDP ⇄ W/H

... this is just empirically wrong over more than a few years. H is total hours worked and W is total nominal wages.

2) (W/H)/(PY/L) ⇄ u

... but H/L ⇄ u has almost no content (higher unemployment means fewer worked hours per person) and the relationship c : PY ⇄ W has a constant abstract price meaning PY = c W with c constant. The model reduces to (1/c) (H/L) ⇄ u or just the content-less H/L ⇄ u.

The correct version of both of these is P : PY ⇄ L or P : PY ⇄ H, which are just Okun's law (above).

3) (1/P) : M/P ⇄ M

This may look a bit weird, but it could potentially work if Sumner didn't insist on an information transfer index k = 1 (if k is not 1, that opens the door to a liquidity trap, however). As it is, it predicts that the price level is constant in general equilibrium and unexpected growth shocks are deflationary in the short run.

The economic state space: a mini-seminar

I thought I'd curate a mini-seminar of information equilibrium blog posts chosen from over the past two years that draw on a particular visualization of the economic state space:

The picture above represents a histogram of different markets', individuals', households', firms', goods, or generic agents' "growth" states. For example, if firm X grows in revenue at 10% per year, then firm X gets a box at the 10% mark. But those boxes could well be wage growth or productivity growth for example.

The concept of "information equilibrium" is based on the idea that the distribution is stable, but e.g. the box for firm X moves around in it. Additionally non-ideal information transfer (information loss) results in the over-representation of the lower growth (or negative growth) states relative to e.g. a Gaussian distribution. Economic forces that maintain the distribution would be entropic forces like diffusion -- they'd have a macroeconomy-level description but no agent-level description.
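A toy simulation can illustrate the "stable distribution, moving boxes" idea. The sketch below assumes a Gaussian growth-state distribution purely for convenience (the text notes that the real distribution over-represents low growth states), and the number of firms, the mean and the spread are hypothetical values.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

N_FIRMS = 10_000
MEAN_GROWTH, SPREAD = 0.03, 0.05   # hypothetical values, not from the posts

def draw_growth_states(n):
    """Draw one growth state per firm from the same fixed distribution."""
    return [random.gauss(MEAN_GROWTH, SPREAD) for _ in range(n)]

year_1 = draw_growth_states(N_FIRMS)
year_2 = draw_growth_states(N_FIRMS)   # every firm lands in a new box

# Macro level: the distribution (summarized by its mean and spread) barely moves.
macro_shift = abs(statistics.mean(year_2) - statistics.mean(year_1))
spread_shift = abs(statistics.stdev(year_2) - statistics.stdev(year_1))

# Micro level: the typical firm moves a long way inside the distribution.
micro_shift = statistics.mean(abs(b - a) for a, b in zip(year_1, year_2))

print(macro_shift, spread_shift, micro_shift)
```

The individual boxes shift by an amount comparable to the width of the distribution while the aggregate summary statistics change very little, which is the sense in which the distribution is stable.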

If you need background on the basics of information equilibrium, here are some slides.

Here are nine posts that expand on this idea of the economic state space (plus one link to a picture of a version using CPI components):

1. "A statistical equilibrium approach to the distribution of profit rates"

This post shows some empirical evidence that these stable distributions (which may actually be stable distributions in the technical sense of the term) may exist for profit rates.

2. Micro stickiness versus macro stickiness

The movement of particular goods or labor growth states inside the distribution while the distribution itself stays relatively unchanged allows for nominal rigidity for the aggregate economy, but without requiring individual wages/prices to be "sticky". This potentially resolves a conundrum because individual prices aren't empirically observed to be sticky, but aggregate stickiness is needed e.g. to make monetary policy have an effect on the economy.

3. Balanced growth, maximum entropy, and partition functions

We capture the concepts of the previous two posts in a single framework by defining a partition function (based on a maximum entropy distribution with constrained macroeconomic observables such as the existence of "economic growth"). This is used in the next post (#4).

4. An ensemble of labor markets

In this post we build a very simple model of an economy that exhibits a productivity slowdown simply because it is more likely to organize a large economy out of a distribution with many low productivity states than a few high productivity states.

5. Does saving make sense?

The picture at the top of this page gives a possible interpretation of national income accounting identities like Y = C + S + T + NX as more than just definitions.

6. Internal devaluation and the fluctuation theorem

The over-representation of the low/negative growth states has many implications for economics. In particular, although the information transfer approach is built on a generalized thermodynamics, the "econo-dynamics" theory is very different from thermodynamics. The over-representation of low/negative growth states means "econo-dynamics" does not have a second law that entropy always increases. Thermodynamics has an exception for very small decreases in entropy (the fluctuation theorem), but econo-dynamics may have even larger deviations that are at the heart of recessions.

7. Coordination costs money, causes recessions

The large deviations from the second law of thermodynamics may have their source in large scale coordination -- periods when agents behave in the same way such as a panic in a financial crisis or having the same reaction to a piece of economic news. Recessions could be viewed as a spontaneous fall in entropy. That is something that does not occur in physics or in many other maximum entropy approaches, meaning that economics is a very different field of study.

8. Price growth (i.e. inflation) state distribution (NEW!)

An empirical look at a stable "statistical equilibrium" of price changes (price change state space) based on the MIT Billion Prices Project. The results support the economic state space picture.

9. Stocks and k-states (NEW!)

A theoretical and empirical look at the application of "statistical equilibrium" of k-states (in economic state space) to stock markets. The economic state space picture allows us to understand the Fama-French three-factor investing model.

Apples, bananas and the information transfer model of supply and demand

I was asked by Nick Rowe to explain where the information comes in: "If I trade an apple for a banana, there is supply and demand, but where does information come into it?" My initial reply on the post was less clear than I'd like, so I'm going to try and do better here. First we'll start with a single good (apples).

Let's say there are $d$ potential apple buyers, labelled $1, 2, \ldots, d$. Selling one apple to #42 uncovers $\log_2 d$ bits of information (the number of bits required to describe the ID number). Selling a second apple to #42 uncovers another $\log_2 d$ bits of information for a total of $\log_2 d + \log_2 d = 2 \log_2 d$ bits of information. Selling a third apple to #1005 uncovers another $\log_2 d$ bits of information, and so on, until we've sold $n_d$ apples and uncovered $n_d \log_2 d$ bits.

This uncovered information is transferred from the buyers (the demand) who ostensibly know how much they'd like to buy (at a given price) to the sellers (the supply) who only have some vague idea of the size of the apple market after they start to sell some apples (or do a little market research).

The optimal way to register this information would be to keep a log of each apple and each buyer's ID number. In that case, the information captured by the supply would be

(1) $I_d = n_d \log_2 d$

However, the suppliers don't actually know the size of their market $d$ so they actually capture

(2) $I_s = n_s \log_2 s \leq n_d \log_2 d$

bits of information where $s$ could be e.g. the sellers' estimate of $d$ [1]. We have $I_s \leq I_d$. In an ideal world, this would be equality. If only there were something that existed to help gauge the size of the market for a product allowing a seller to capture all the information ... (hint: it's called money).
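Equations (1) and (2) amount to simple bit counting, which the sketch below spells out. The market size, the sellers' estimate and the number of apples sold are hypothetical numbers chosen so the logarithms come out as whole numbers, and the function names are mine.

```python
import math

def demand_information(n_d, d):
    """Equation (1): bits uncovered by n_d sales to a market of d potential buyers."""
    return n_d * math.log2(d)

def supply_information(n_s, s):
    """Equation (2): bits actually registered using the sellers' estimate s."""
    return n_s * math.log2(s)

d = 1024   # hypothetical number of potential buyers (10 bits per ID number)
s = 256    # hypothetical seller estimate of d, here too small (8 bits)
n = 3      # apples sold so far, taken equal on both sides for simplicity

I_d = demand_information(n, d)   # 3 * 10 = 30 bits
I_s = supply_information(n, s)   # 3 * 8  = 24 bits
print(I_d, I_s, I_s <= I_d)
```

With an underestimate of the market size the sellers register fewer bits than the buyers reveal, which is the inequality $I_s \leq I_d$.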

Which reminds me, we haven't actually discussed what the buyers are buying these apples with yet. We'll start with Nick's suggestion of using a banana to buy an apple. In that case the information the sellers collect (by acquiring a banana from the apple buyer) is

$n_b \log_2 b = n_s \log_2 s$

where $b$ is the number of potential banana buyers. We can use this to determine the "exchange rate" (the price of a banana in terms of apples). If we take the smallest unit of bananas to be $dB$ so that $B/dB = n_b$, then

$B \log_2 b = S \frac{dB}{dS} \log_2 s$

where $n_s = S/dS$ is the number of apples supplied to the apple buyers. If we assumed the market for bananas is about the same size as the market for apples, we could say that $\log_2 b \sim \log_2 s \sim \log_2 d$, so that:

$B \sim S \frac{dB}{dS} \leq n_d$

where $dB/dS$ (where we let $dS$ and $dB$ become infinitesimal) is the "exchange rate" (price) of apples to bananas. What if the apple buyer gave the apple seller something else instead of bananas? Well, starting with equation (2) you'd have

$n_s \log_2 s \leq n_d \log_2 d$

$\frac{S}{dS} \log_2 s \leq \frac{D}{dD} \log_2 d$

$\frac{dD}{dS} \leq \frac{\log_2 d}{\log_2 s} \frac{D}{S}$

We'll call the left hand side the price $P$ (like the exchange rate above) and define $\log_2 s / \log_2 d \equiv \kappa$ (the information transfer index). Leaving us with:

(3) $P = \frac{dD}{dS} \leq \frac{1}{\kappa} \frac{D}{S}$

where we can call $S$ the supply and $D$ the demand.

If we take the thing exchanged to be dollars, then we could potentially take

$\log_2 s \to \log_2 m$

where $m$ is the size of the money supply (monetary base). This would allow you to get a really good estimate of the potential market size $d$ by looking at the price (or at least changes in the price) while knowing the size of the money supply.

In the information transfer model, I have typically assumed equality in equation (3) and taken $\kappa$ to be an unknown constant I fit to the data. One notable exception is the money market where I took $d = NGDP$ and $s = MB$ and used it to describe the price level. In many of the posts on this blog, I've used the shorthand notation $P : D \to S$ for a model that transfers information from the demand $D$ to the supply $S$ with price $P$.

PS Some notes:

1. Equation (1) assumes transactions are maximally uninformative, or equivalently, all microstates with $n_d$ apples sold are equally likely. One way of thinking about that is that it makes the fewest assumptions about potential microfoundations.

2. Equation (3) is the simplest possible relationship between supply and demand that maintains homogeneity of degree zero (related to the long run neutrality of money).

3. Per Nick Rowe's original post, if the network of exchanges collapsed, we'd see a fall in the total amount of information being exchanged so that if we looked at the market $P : AD \to AS$ for aggregate supply and aggregate demand, we'd see a fall in $AD$ and/or $P$.

Update: Forgot to add a very important note ...

4. Equation (3) above can be solved to recover supply and demand curves and Marshall's diagrams.
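As a rough illustration of note 4, the sketch below takes equality in equation (3) and makes the partial-equilibrium assumption that one side moves slowly enough to be held fixed. Holding demand at $D_0$ gives $P = D_0 / (\kappa S)$, which falls as supply rises; holding supply at $S_0$ gives $P = D / (\kappa S_0)$, which rises with demand. Together these are the two crossing curves of a Marshallian diagram. The value of $\kappa$, the reference quantities and the function names are all hypothetical.

```python
KAPPA = 0.6   # hypothetical information transfer index

def price_with_demand_fixed(supply, demand_0=100.0, kappa=KAPPA):
    """Equation (3) with equality and D held at demand_0: P = D0 / (kappa * S)."""
    return demand_0 / (kappa * supply)

def price_with_supply_fixed(demand, supply_0=50.0, kappa=KAPPA):
    """Equation (3) with equality and S held at supply_0: P = D / (kappa * S0)."""
    return demand / (kappa * supply_0)

supplies = [25.0, 50.0, 100.0]
demands = [50.0, 100.0, 200.0]

falling = [price_with_demand_fixed(s) for s in supplies]   # price falls as S grows
rising = [price_with_supply_fixed(d) for d in demands]     # price rises as D grows

for s, p in zip(supplies, falling):
    print(f"S = {s:6.1f}  P = {p:.3f}")
for d, p in zip(demands, rising):
    print(f"D = {d:6.1f}  P = {p:.3f}")
```

The two curves cross at the reference point (S = 50, D = 100), where both expressions give the same price.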

[1] Footnote added in update 3/14/2014. The a priori estimate of the size of $d$ would actually have to come from somewhere else. Being strict about it, all the seller would know is that $s$ is at most the size of the total number of apples he or she has sold and at least the number of different people sold to. In this way, $s \to d$ eventually, but only if the market for apples was dominated by a monopoly. Competitive sellers really can't get at the information about the size of $d$ without some sort of tool to measure the size of the market -- that's what money does. It allows a seller to gauge the size of the market in order to calibrate the information they receive (they don't know if $I_s =$ 10 bits or 100 bits) so they can use that information to supply the market. If the price suddenly goes up, the amount of information being received suddenly increases ($I_s$ goes from say 10 bits per apple to 20 bits per apple), telling the supplier that demand ($d$) has increased (or supply from all the suppliers has fallen).

How money transfers information

In this post I showed how the information flows in a simple market for apples, but here I'm going to show what I hinted at in the earlier post: money is a tool to transfer information from the demand to the supply.

Let's start with a simple system of $d$ buyers (green) and two suppliers of gray bundles of something. Each sale adds $+\log_2 d$ bits of information [1] about the size of the market $d$ and the distribution of demand to a supplier. Or at least it would if the supplier had some knowledge of the size of $d$ in the first place. If there was only one supplier, that supplier could use the total amount of goods sold $s$ as an estimate since the total amount of information would have to be less than or equal to the information coming from the buyers (the source), i.e. (dropping the base of the $\log$'s):

(1) $I_s = n_s \log s \leq n_d \log d = I_d$

You can't get more information from a message than the message contains! However, there is more than one supplier, so each supplier only sees a fraction of the total supply and a fraction of the total demand. If this were all there is to it, then while each transaction could transfer $\log d$ bits of information, the supplier would have no idea. Enter money; now with each sale a supplier acquires a few inherently worthless tokens:

How does this help? Well, because these tokens are available to everyone (either set up by some government or based on custom), the supplier has some idea of how many are out there (let's call it $m$):

Now each sale is accompanied by $n_m$ tokens of money, so that each token transfers $+\log m$ bits of information from the demand to the supply. This monetary system could potentially work well enough so that we can say the information captured by the supply is equal to the demand, thus equation (1) becomes:

(2) $I_s = n_s \log s = n_m \log m = n_d \log d = I_d$

We call this ideal information transfer when we use the equal sign. If we take $n_s = S/dS$ where $dS$ is the smallest/infinitesimal unit [2] of supply and likewise $n_d = D/dD$ for demand and assume $D, S \gg 1$ (a very large market), we can write:

$\frac{S}{dS} \log s = \frac{D}{dD} \log d$

$\frac{dD}{dS} = \frac{\log d}{\log s} \frac{D}{S}$

(3) $\frac{dD}{dS} = \frac{1}{\kappa} \frac{D}{S}$

where we've defined the information transfer index $\kappa \equiv \log s / \log d$. The left hand side of equation (3) can be identified with the price $p$ because it is proportional to $dM/dS$, or the rate of change of money for the smallest increase in quantity supplied.

But wait, there's more! See, money can be exchanged for all kinds of goods and services:

This means that the total demand of all goods and services ($AD$, or aggregate demand) is related to the total amount of money ($M$) so that, again assuming ideal information transfer:

(4) $P = \frac{dAD}{dM} = \frac{1}{\kappa} \frac{AD}{M}$

where $P$ is an overall measure of the price of all goods and services ($AD$); it's called the price level. The rate of change of the price level over time is inflation. Now for the totally awesome part: we can solve equation (4) for aggregate demand in terms of the money supply. It's a differential equation that can be solved by integration. We re-arrange equation (4) like this:

(5) $\frac{dAD}{AD} = \frac{1}{\kappa} \frac{dM}{M}$

and integrate

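$\int_{AD_0}^{AD} \frac{dAD'}{AD'} = \frac{1}{\kappa} \int_{M_0}^{M} \frac{dM'}{M'}$

$\log \frac{AD}{AD_0} = \frac{1}{\kappa} \log \frac{M}{M_0}$

$\frac{AD}{AD_0} = \left( \frac{M}{M_0} \right)^{1/\kappa}$

Using equation (4) again, we have

(6) $P = \frac{1}{\kappa} \frac{AD_0}{M_0} \left( \frac{M}{M_0} \right)^{1/\kappa - 1}$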
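The integration can be sanity-checked numerically. The sketch below Euler-integrates equation (5) for a constant $\kappa$ and compares the result with the closed form, then evaluates equation (6). The reference values $AD_0 = M_0 = 1$, the step count and the function names are assumptions made only for this illustration.

```python
KAPPA = 0.5            # with kappa = 1/2 the price level should scale like M
AD_0, M_0 = 1.0, 1.0   # hypothetical reference values

def integrate_ad(m_final, steps=100_000):
    """Euler-integrate equation (5), dAD/dM = (1/kappa) * AD / M, from M_0 to m_final."""
    ad, m = AD_0, M_0
    dm = (m_final - M_0) / steps
    for _ in range(steps):
        ad += (ad / (KAPPA * m)) * dm
        m += dm
    return ad

def ad_closed_form(m):
    """Integrated result: AD = AD_0 * (M / M_0) ** (1 / kappa)."""
    return AD_0 * (m / M_0) ** (1 / KAPPA)

def price_level(m):
    """Equation (6): P = (1/kappa) * (AD_0 / M_0) * (M / M_0) ** (1/kappa - 1)."""
    return (1 / KAPPA) * (AD_0 / M_0) * (m / M_0) ** (1 / KAPPA - 1)

m = 2.0
numeric, exact = integrate_ad(m), ad_closed_form(m)
price_ratio = price_level(2 * m) / price_level(m)

print(numeric, exact)   # the numerical and closed-form values should agree closely
print(price_ratio)      # doubling M doubles P when kappa = 1/2
```

Changing `KAPPA` to 1 makes the exponent in equation (6) vanish, so the price level no longer responds to the money supply.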

If $\kappa = 1/2$, then $P \sim M$, and the price level rises with the money supply, i.e. the quantity theory of money. Awesome, huh? Except the quantity theory doesn't really work that well ($\kappa \sim 0.6$ works better for the US and other countries, except Japan where $\kappa \sim 1$ is a better model). But we left out a big piece: aggregate demand (e.g. NGDP) is measured in the same units as the money supply. And to top it off the money supply is adjusted by the central bank based on economic conditions! This is the picture of the macroeconomy:

What does it mean? It means that $\kappa = \log m / \log d$, or the amount of information transferred from the demand to the supply relative to the amount of information transferred by money, is changing! If we assume this happens somewhat slowly [3], we can transform equation (6) into:

(7) $P = \alpha \frac{1}{\kappa(MB, NGDP)} \left( \frac{MB}{MB_0} \right)^{1/\kappa(MB, NGDP) - 1}$

$\kappa(MB, NGDP) = \frac{\log MB/c_0}{\log NGDP/c_0}$

where we've replaced $M$ with the monetary base, $AD$ with NGDP, grouped a bunch of constants as $\alpha$, and introduced a new constant $c_0$ since the units of money are arbitrary. We can fit this model to the price level (it works best, see the lower right graph, when the monetary base is actually just the currency component of the monetary base):

Pretty cool, huh? Using this model (the information transfer model or ITM), we seem to get less inflation from a given increase in the money supply as the money supply and the economy get bigger. In fact, it can even go the other way -- in Japan an increase in the money supply (blue) decreases the price level (brown):

The rest of this blog is devoted to exploring this model and the concept of information transfer in economics, even commenting on current events based on these ideas. Have a look around!
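For readers who want to play with equation (7), here is a minimal sketch of the varying information transfer index. It is not a fit to any data: the values of $\alpha$, $c_0$, $MB_0$ and NGDP are hypothetical, NGDP is held fixed for simplicity, and the function names are mine. It only shows how the exponent $1/\kappa - 1$ shrinks as the monetary base grows relative to NGDP.

```python
import math

def kappa(mb, ngdp, c0=1.0):
    """Information transfer index: log(MB / c0) / log(NGDP / c0)."""
    return math.log(mb / c0) / math.log(ngdp / c0)

def price_level(mb, ngdp, mb_0=1.0e2, alpha=1.0, c0=1.0):
    """Equation (7): P = alpha * (1/kappa) * (MB / MB_0) ** (1/kappa - 1)."""
    k = kappa(mb, ngdp, c0)
    return alpha * (1.0 / k) * (mb / mb_0) ** (1.0 / k - 1.0)

def money_exponent(mb, ngdp, c0=1.0):
    """Exponent 1/kappa - 1 that controls how strongly P responds to MB."""
    return 1.0 / kappa(mb, ngdp, c0) - 1.0

ngdp = 1.0e4   # hypothetical nominal output, in units of c0
for mb in (1.0e2, 1.0e3, 5.0e3):
    print(f"MB = {mb:8.0f}  kappa = {kappa(mb, ngdp):.3f}  "
          f"exponent = {money_exponent(mb, ngdp):.3f}  P = {price_level(mb, ngdp):.3f}")
```

As $\kappa$ approaches 1 the exponent approaches zero, which is the regime where additional money has little effect on the price level.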

Footnotes

[1] You can think of it as a sale discovering the ID number of a buyer. If the ID number is a member of the set $\{1, 2, 3, 4, \ldots, d\}$, then you need $\log_2 d$ bits to represent the ID number. Thus, a sale transfers $\log_2 d$ bits from the buyer (demand) to the supplier (supply). I will drop the subscript 2 for the binary log in the rest of the post.

[2] We are looking at infinitesimal units to see how supply and demand change with respect to each other. For those unfamiliar with this concept, it forms the basis of calculus. And as a physicist, I am given to frequent abuses of notation.

[3] Technically, integrating equation (5) as we did assumes $\kappa$ is independent of $AD$ and $M$. However, if we assume $\kappa$ doesn't change rapidly with $M$ or $NGDP$, then the integration can proceed as shown, but is only an approximation.