Monday, June 20, 2011

The two-envelope problem

Suppose you go to a conference where the speaker invites you up on stage.  He offers you two envelopes and tells you the ratio between the sums in them: "the ratio between their values is r," where r > 1.  He gives you one envelope.  Then he says "want to switch?"  You might switch or not, judging what you think the speaker's motivation is, and what you think the speaker thinks of your motivation (reasoning about what he's reasoning is indeed something that a Bayesian actor should resolve, e.g., the Princess Bride poisoned-cup reasoning).  After you've satisfied (or confused) yourself about which envelope might be better, do you still want to keep switching forever?

The paradoxical expected value:

An argument from expected value might indicate that you should switch: if your envelope is worth x coins, you stand to gain x * (r-1) or lose x * (1-(1/r)).  For example, if your envelope is worth 10 coins and the ratio between the envelopes is 2:1, then the other envelope is worth 20 coins or 5 coins.  By switching, you would gain 10 or lose 5.  If the odds of gaining and losing are just about equal (and, by symmetry, the odds that you have the less-valuable envelope might well be 1:1), then you benefit from switching.
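
The arithmetic in that argument can be sketched in a few lines of Python.  The function name naive_switch_ev and the p_gain parameter are made up for this illustration, and the 1:1 odds are exactly the assumption the argument leans on.

```python
def naive_switch_ev(x, r, p_gain=0.5):
    """Gain, loss, and naive expected value of switching away from x coins.

    p_gain is the assumed chance that the other envelope is the bigger one.
    """
    gain = x * (r - 1)        # the other envelope holds r * x
    loss = x * (1 - 1 / r)    # the other envelope holds x / r
    ev = p_gain * gain - (1 - p_gain) * loss
    return gain, loss, ev


# The example from the text: 10 coins in hand, ratio 2:1.
gain, loss, ev = naive_switch_ev(10, 2)
print(f"gain {gain}, lose {loss}, naive expected value {ev}")
```

With 10 coins and r = 2 this gives a gain of 10, a loss of 5, and a naive expected value of 2.5 coins per switch, which is positive for every x and r > 1 as long as the odds really are 1:1.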

But symmetry implies indifference.

In the two-envelope problem, this indifferent strategy seems optimal: switch if you feel like it (e.g., if it's nice to chat with the speaker) and switch back again if you want (e.g., if you like to hear the audience laugh) and stop switching when it ceases to amuse you (the audience gets restless).  The paradox is that you might believe both that switching has no value (since the first envelope came to you randomly, and it could just as easily have been the other one) and that it has value (by symmetry, the odds of increasing or decreasing your wealth would seem to be 1:1, and since the reward for increasing is greater than the cost of decreasing, the expected value is positive).

Prejudice helps; opening the envelope helps:

You can estimate the average unopened envelope's value from your prejudices about generosity, games, professors, conferences and money.  Look at the speaker's shoes...  Check that it really is currency, and not a check, and figure out how much currency fits in an envelope, not a suitcase.  Think about whether it would have unduly inconvenienced the speaker to find deflated currency from a country which recently saw its currency lose value.  Does the speaker jealously watch the envelopes?  Does the speaker's briefcase have a handcuff on it, like a diamond-trafficker's?  Now just guess.  I guess ten coin because at this moment I guess the speaker would want to offer the minimal value which doesn't appear cheap.  Now open the envelope.  If I open it and see a bill worth 100 coin, I have to increase my estimate of the value of the other envelope up from 10 coin, but not up to 100 coin or more.  My new estimate is some average of the old estimate and the observed value -- 100 coin.  In any case, the new expected value is between the old expected value and the observed value, so to switch away from a surprisingly good envelope has negative expected value (-EV).  On the other hand, if I open it and see a bill worth 5 coin, I expect the other envelope to contain something between 5 and 20 coin, so the same argument says I should switch from a surprisingly bad envelope.

Prejudice may be enough:

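If gaining and losing were equally likely for every x, the expected value of switching from an envelope containing x to the other envelope would be: (1/2) * (x * (r-1) - x * (1-(1/r))).  The overall expected value of switching is the integral of the expected value at x, weighted by the probability of holding x, over all values of x.  We might hope that this sums to zero, even for unusual distributions of x, which can only happen if the odds are not really 1:1 at every x.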
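
Here is a small sketch of that sum for one particular prejudice.  Everything specific in it is an assumption made up for the example, not part of the problem: the smaller amount is taken to be uniform over a short list, you are handed either envelope with probability 1/2, and the name switching_gains is hypothetical.  Exact fractions are used so the zero comes out as a true zero.

```python
from collections import defaultdict
from fractions import Fraction


def switching_gains(small_amounts, r):
    """Expected gain from switching, for each observed x and averaged over x.

    Assumption (illustrative only): the smaller amount is uniform over
    small_amounts, and either envelope of the pair is handed over with
    probability 1/2.
    """
    p = Fraction(1, 2 * len(small_amounts))  # probability of each (pair, envelope) case
    mass = defaultdict(Fraction)             # P(you hold x)
    gain = defaultdict(Fraction)             # sum of P(case) * gain over cases where you hold x
    for s in small_amounts:
        lo, hi = Fraction(s), Fraction(s) * r
        mass[lo] += p
        gain[lo] += p * (hi - lo)            # holding the smaller envelope
        mass[hi] += p
        gain[hi] += p * (lo - hi)            # holding the larger envelope
    gain_by_x = {x: gain[x] / mass[x] for x in sorted(mass)}
    overall = sum(gain.values())
    return gain_by_x, overall


gain_by_x, overall = switching_gains([5, 10, 20], r=2)
for x, g in gain_by_x.items():
    print(f"holding {x}: expected gain from switching = {g}")
print("averaged over all x:", overall)
```

Under these assumptions, switching looks good when you hold 5, 10, or 20 and very bad when you hold 40, and the weighted sum over all x is exactly zero.  The 1:1 odds hold in the middle of the range but fail at its ends.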

Two-valued coins:

Suppose we live in a country which denominates its coins as 5, 10, or 50.  The coins have one number written on the back and one on the front.  The coins are always printed 5-10 or 10-50 -- i.e., with 5 on one side and 10 on the other, or with 50 on one side and 10 on the other.  Suppose that the coins grow on trees, inside of flat nuts.  The shell (or husk or shuck) of the nut obscures the values written on the coins.  The value of a random coin depends first on which type of coin it is, and second, on which face is showing.  This country has a tradition of gathering nuts around old trees and laying them in long rows in wasteland.  The coins are not buried.  Cultivators simply lay them in rows, where they bloom into lines of trees.  Each cultivator usually owns a row, and they compare one row to another to see which strategies generate the most value.  The value of the coin, and then of the tree, is measured by the number of fruits produced.  A coin laid with the value x facing down will produce x fruits.  Agriculturists have not been able to find out how to generate better-quality nuts.  The two types of nut are generated with equal probability, and each is equally likely to land with either face up.  Farmers enjoy eating the fruit of these trees, so any method to routinely increase the value of the nuts, the trees, or the rows of trees would add to their happiness.

Some children can sense the value of the nut through the husk (or shell) with uncanny accuracy.  These children are profitably employed -- they go down a row of coins which have not yet germinated, and flip them so as to leave the better value facing down.  For some reason, these children never pick up the coin, examine both sides, and leave the coin in its preferred orientation -- they only look at the upwards-facing value, and for this reason three things can happen:

They see 5, and they know it is a 5-10 coin in its preferred orientation.
They see 50, and they know it is a 10-50 coin that should be flipped.
They see 10, and they don't know whether it's a 5-10 coin or a 10-50 coin.

A winning strategy:

The professional switchers have settled on this strategy: when seeing 5 or 10, ignore it.  When seeing 50, flip it.  Proof: If a 5 is showing, then the value 10 is in the dirt, and this is the best-possible orientation for that coin.  If the visible value is 50, then the value 10 is in the dirt.  Flipping the coin creates 40 fruits worth of value.  When we see the value 10, the number facing down into the dirt is equally likely to be a 50 or a 5, and the 40-fruit benefit of leaving the 50 facing down outweighs the 5-fruit cost of leaving a 5 facing down.  So it is best to leave the 10s alone.  The professional switcher simply looks for surprisingly good value going to waste and corrects only that sort of error.  A similar strategy applies if the accuracy of the nut-reading is imperfect, for any information is better than none.
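
The proof can be checked by brute force, since there are only four equally likely ways for a nut to lie.  This is a minimal sketch; the names STATES and expected_fruits and the strategy labels are invented for the illustration.

```python
from fractions import Fraction

# The four equally likely (face up, face down) states of a nut:
# two coin types (5-10 and 10-50), each lying either way up.
STATES = [(5, 10), (10, 5), (10, 50), (50, 10)]


def expected_fruits(flip_when_showing):
    """Average fruits per nut if we flip exactly the nuts showing these values."""
    total = 0
    for up, down in STATES:
        # Flipping puts the visible face into the dirt; otherwise the hidden face stays there.
        planted = up if up in flip_when_showing else down
        total += planted
    return Fraction(total, len(STATES))


strategies = {
    "flip nothing": set(),
    "flip only 50s": {50},
    "flip 10s and 50s": {10, 50},
    "flip everything": {5, 10, 50},
}
for name, rule in strategies.items():
    print(f"{name}: {float(expected_fruits(rule))} fruits per nut")
```

Flipping only the 50s averages 28.75 fruits per nut, against 18.75 for flipping nothing and 20 for flipping the 10s as well.  Flipping everything comes out at 18.75 again, the same as flipping nothing.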

Before opening the envelope:

We have a row of nuts laid out, and we can't read the values at all since we weren't raised as nut-reading savants.  Would we do better to flip all of our nuts?  We notice that the children flip 1 nut for every 3 they leave in place, so perhaps we should flip none of our nuts.  In any case, if we simply toss the nuts down in a random way, it seems clear that we can't expect to improve on the productivity of the row of trees by flipping every nut.

It seems to me that this is the paradox: With information, you can improve the expected value of the line of trees.  Without information, you can't.  Indeed, the expected value of flipping a nut is:

    the value of flipping a 5, times the proportion of 5's,
+    the value of flipping a 10, times the proportion of 10's,
+    the value of flipping a top-value V, times the proportion of top-values.

If we write A=5, T=10, and V=50 for the three values and suppose that coins A-T and T-V are equally likely, then the proportions of 5's, 10's, and 50's are 25% - 50% - 25%, and that calculation yields:

    -(T-A) times 25%
+    (-(V-T)/2 + (T-A)/2) times 50%
+    (V-T) times 25%

The first term is the cost of flipping the A=5's into the earth; we plant A=5 instead of T=10 and lose T-A fruits.  This happens to 1/4 of the coins in the tree-line, which happen to be showing an A=5.  The second term is the cost of flipping the T's.  We plant T, and so we are sure to gain T fruits, but lose V=50 or A=5 with equal probability.  The third term is the benefit from flipping the V's.  The children are always glad to see a top-value coin and flip it into the ground.  The benefit of doing this is V-T fruits.  About 25% of the nuts in the tree-line are showing the top value V=50 before being flipped.

Now, when we add those terms together, -5/4 - 40/4 + 5/4 + 40/4 = 0, and you can see from the variables that the cancellation is algebraic: the costs and benefits of flipping 10s balance the costs of flipping 5s into the dirt and the benefits of flipping 50s into the dirt. 
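
For anyone who wants to spot-check the cancellation with other coin values, here is a minimal sketch of the three terms.  The function name flip_everything_gain is invented for the example, and it keeps the same assumptions as above: coins A-T and T-V equally likely, either face up.

```python
from fractions import Fraction


def flip_everything_gain(A, T, V):
    """The three weighted terms of flipping every nut, and their sum."""
    quarter, half = Fraction(1, 4), Fraction(1, 2)
    flip_low = -(T - A)                          # a showing A goes down in place of T
    flip_mid = -half * (V - T) + half * (T - A)  # a showing T replaces V or A, equally likely
    flip_top = V - T                             # a showing V goes down in place of T
    terms = (quarter * flip_low, half * flip_mid, quarter * flip_top)
    return terms, sum(terms)


terms, total = flip_everything_gain(5, 10, 50)
print("terms:", ", ".join(str(t) for t in terms))
print("sum:", total)

# The sum does not depend on the particular values chosen.
for A, T, V in [(1, 3, 7), (2, 20, 21), (5, 10, 1000)]:
    assert flip_everything_gain(A, T, V)[1] == 0
```

With A=5, T=10, V=50 the three terms are -5/4, -35/4, and 10, which sum to zero.  The loop at the end is only a spot check on a few other triples, not a proof, but the algebra above shows why it passes for any A, T, and V.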
