Picoeconomics in a Nutshell

 

Picoeconomics (micro-microeconomics) is the study of how mental processes based on conflicting motives interact strategically.  Begun as an inquiry into the instability of choice, it has become a bottom-up model of many familiar experiences, including pain, appetites and emotions, strength of will, freedom of will, compulsiveness [*Dangers 5-15??], procrastination, reward by imagination, addiction to non-substance-based activities [*OxfordMacGuffinAlt5,11— update?? 8-19] and the high valuation of distant goals [*].  Research has concentrated on the emergence of willpower from the strategic interaction of successive motivational states within the person, since all of the phenomena just listed can be tied into that.  Methods have included (...)

A simple experimental finding, hyperbolic delay discounting, has suggested that the conventional framework of motivation is misaligned: that decisions do not stay fixed in the absence of new information, that pain is built of the same stuff as pleasure, that no learned processes, including conditioned processes, are independent of reward, and most importantly that there need be no inborn faculties of prioritization, will, identity, self-judgment, or other higher evaluative process.  Instead a person can be well described as a population of learned processes that compete or combine in strict accordance with how well they maximize prospective reward.

Such marketplace models have been proposed since Bentham’s “two sovereign masters, pain and pleasure,” but heretofore their agents have simply estimated prospects from the outside world and stuck to their conclusions in the absence of new information.  Opponents have complained that these agents are mere throughputs, totally manipulated by the reward contingencies they face. [?*Russell, 1978]  The recognition of hyperbolic discounting means that an efficient internal marketplace has to deal with its own shifting preferences as a function of the proximity of its prospective rewards, and hence estimate how its own future and current choices will affect outcomes. 

With hyperbolic discounting the person is no longer a throughput but rather a literally self-conscious agent, who makes decisions by finding equilibria in her shifting self-predictions.  She shares some of her properties with social groups such as legislatures and financial markets, analogies that have always been thought of as more poetic than descriptive.  The implications of hyperbolic discounting fit a wide range of human experience, and suggest corrections to some basic tenets of psychological theory.

Development of picoeconomics

[*selection from Eyes Open EconMod]

The recognition of conflicting motives that endure over time, rather than simply competing and producing a winner, dates back at least to Plato and the Judeo-Christian Bible.  (Passions endure against our reason, or the Devil lures us from the good.)  Freud hypothesized an ego that negotiates with both impulses (id) and rigid controls (superego); but quantitative modeling, based on how prospective rewards are discounted as a function of delay, began only in the 1950s. 

Economist Robert Strotz pointed out that only one shape of discount curve describes consistent relative valuations of alternative rewards at different delays [?*].  He said people might have any number of other shapes; and if they did, they should expect their preferences to change over time (“dynamic inconsistency”), and should either do something to lock in their current choice or accept the prospect of not getting it.  Thus the self might be divided, not into separate faculties, but simply over time.
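
A short worked comparison may make the consistency point concrete.  It is a sketch under assumptions that are not in the text: the one consistent shape is taken to be the exponential curve, the contrasting shape is taken to be a hyperbola of the form A/(1 + kD), and all symbols are illustrative.

```latex
% Assumed notation (not from the source): amounts A_S < A_L, delay d to the
% smaller-sooner (SS) reward, extra delay g > 0 to the larger-later (LL)
% reward, and discount parameters r, k > 0.
\begin{align*}
\text{Exponential: } \frac{V_{LL}}{V_{SS}}
  &= \frac{A_L e^{-r(d+g)}}{A_S e^{-rd}}
   = \frac{A_L}{A_S}\, e^{-rg}
   && \text{(independent of } d\text{)} \\
\text{Hyperbolic: } \frac{V_{LL}}{V_{SS}}
  &= \frac{A_L/(1+k(d+g))}{A_S/(1+kd)}
   = \frac{A_L}{A_S}\cdot\frac{1+kd}{1+k(d+g)}
   && \text{(rises toward } A_L/A_S \text{ as } d \text{ grows)}
\end{align*}
```

Under the exponential form the ratio never changes as the rewards draw nearer, so a preference once formed stays put.  Under the hyperbolic form the ratio depends on d, so it can fall below 1 at short delays and exceed 1 at long ones, which is the change of preference over time that Strotz described.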

Independently, behavioral psychologist Richard Herrnstein discovered what he called the matching law: subjects who can spend time with either of two simultaneous streams of reward allocate their time in proportion to the rate, amount, and immediacy of rewards in the streams. [?*]
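
The proportional allocation can be illustrated with a small sketch.  It is not taken from the source: it assumes that rate, amount, and immediacy combine multiplicatively, that immediacy is the reciprocal of delay, and the function name and dictionary keys are hypothetical.

```python
def matching_allocation(streams):
    """Share of time per stream under an assumed multiplicative matching rule.

    Each stream is a dict with hypothetical keys:
      rate   - rewards per unit time
      amount - size of each reward
      delay  - delay to each reward (immediacy is taken as 1 / delay)
    """
    weights = [s["rate"] * s["amount"] / s["delay"] for s in streams]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Two streams that differ only in reward rate (illustrative numbers).
    streams = [
        {"rate": 2.0, "amount": 1.0, "delay": 1.0},
        {"rate": 1.0, "amount": 1.0, "delay": 1.0},
    ]
    print(matching_allocation(streams))
```

In this sketch a stream that pays off twice as often, with everything else equal, receives twice the share of time.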

George Ainslie, a psychiatrist looking for an experimental model of impulsiveness, pointed out that with single rewards the matching law describes hyperbolic discount curves. [*1975]  That is, if a subject faces two discrete alternative rewards, a smaller, sooner (SS) one and a larger, later (LL) one, she should show temporary preference for the SS reward for a period just before it is due, an example of Strotz’ dynamic inconsistency.  This phenomenon was first demonstrated in pigeons, some of which also learned a response that committed them to the LL alternative when the SS reward came close. [*1974]

Hyperbolic delay discounting and the phenomena it predicts form the basis of picoeconomics.

 

Why picoeconomics? 

A well-confirmed observation—that there is a basic tendency to discount future reward in a hyperbolic curve {see Hyperbolic delay discounting}—suggests answers to several questions that have been persistently puzzling, or, in some cases, answered prematurely:

Picoeconomics (micro-microeconomics) proposes a model that is coherent and parsimonious, but only partially tested.

[Root]

Hyperbolic Discounting

People seek reward.  Reward following a mental process is what selects that process to be repeated.  We evolved to seek reward, not adaptiveness or realism directly. [BBS McKay-- Non-instrumental: “It is certainly true”]

Hyperbolic delay discounting.  The selecting power of a reward* is inversely proportional to its expected delay.  A graph with delay on the x axis and selecting power on the y axis forms a hyperbola.  [EconMod: “In a seemingly unrelated”]
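
The shape of such a curve can be sketched numerically.  This is a minimal illustration, not taken from the source: it assumes the commonly used form V = A / (1 + kD), where the amount A, the delay D, and the steepness constant k are illustrative names, and the 1 in the denominator is an added assumption that keeps the value finite at zero delay.  An exponential curve is included only for contrast.

```python
import math


def hyperbolic_value(amount, delay, k=1.0):
    """Discounted value under an assumed hyperbolic form: A / (1 + k*D)."""
    return amount / (1.0 + k * delay)


def exponential_value(amount, delay, r=0.5):
    """Conventional exponential form, A * exp(-r*D), shown for contrast."""
    return amount * math.exp(-r * delay)


if __name__ == "__main__":
    # Illustrative amount and delays; units are arbitrary.
    for delay in (0, 1, 2, 5, 10, 20):
        h = hyperbolic_value(10, delay)
        e = exponential_value(10, delay)
        print(f"delay={delay:>2}  hyperbolic={h:6.3f}  exponential={e:6.3f}")
```

With these assumed parameters the hyperbolic curve drops faster than the exponential one at short delays but keeps more of its value at long delays.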

Reward is a unitary phenomenon.  Although different parts of the brain seem to specialize in evaluating rewards at different ranges of delay, the evaluations are consistent among parts, and consistent with hyperbolic discount curves.  [HypVCondRoss:  “valuation clearly takes… await better data”]

Discounting research. 

The internal marketplace of reward.  Alternative mental processes (some leading to behaviors) compete on the basis of expected, discounted* reward.  [OxfordMacGuffin: “The neural mechanics”]

Interests.  The mental processes that are learned because they get a particular reward constitute its interest—just like an economic interest in an external marketplace.  Only interests that are dominant at different times stay distinct from each other.  [Breakdown: “This lability of preference]

Impulsiveness.  We tend to prefer smaller, sooner (SS) rewards to larger, later (LL) rewards temporarily, when the SS rewards are imminent.  Long term interests* can’t eliminate a short term interest if its SS rewards are sometimes dominant.    [
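
The temporary preference for an SS reward can be sketched as follows.  The example is illustrative only: it assumes a hyperbolic form A / (1 + kD) for discounted value, and the amounts, the gap between the two rewards, and the constant k are made-up numbers rather than data.

```python
def hyperbolic_value(amount, delay, k=1.0):
    """Discounted value under an assumed hyperbolic form: A / (1 + k*D)."""
    return amount / (1.0 + k * delay)


def preferred(delay_to_ss, ss_amount=5.0, ll_amount=10.0, gap=3.0, k=1.0):
    """Return "SS" or "LL", whichever has the higher discounted value now."""
    ss = hyperbolic_value(ss_amount, delay_to_ss, k)
    ll = hyperbolic_value(ll_amount, delay_to_ss + gap, k)
    return "SS" if ss > ll else "LL"


def crossover_delay(ss_amount=5.0, ll_amount=10.0, gap=3.0, k=1.0):
    """Delay to the SS reward at which the two curves cross.

    Solves ss/(1 + k*d) = ll/(1 + k*(d + gap)) for d, assuming
    ll_amount > ss_amount.  A non-positive result means the SS
    reward never dominates.
    """
    return ss_amount * gap / (ll_amount - ss_amount) - 1.0 / k


if __name__ == "__main__":
    print("far in advance:", preferred(10.0))
    print("just before SS:", preferred(0.5))
    print("curves cross at delay:", crossover_delay())
```

From a distance the LL reward is valued more; once the delay to the SS reward falls below the crossover, the SS reward temporarily dominates even though nothing about the rewards themselves has changed.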

[Branch from Impulsiveness]

Impulse Control 

Short summary on recursive self-prediction and the concept of will: [Hansson: Recursive self-prediction provides…]

Unpredictability of future choices.  Our basic tendency to impulsiveness means we can’t be sure what we will choose in the future.  [

Commitment.  A long term interest can forestall future impulses by arranging for physical or social forces, by restricting attention, or by arousing a contrary emotion.  [Pico Commitment Passage]  Without commitment in advance, control of a current impulse depends on willpower*, hypothesized to come from intertemporal bargaining*.  [

Intertemporal bargaining.  If you realize that your current choice is a test case for how you can expect yourself to make similar choices in the future, this perception sets up a variant of the repeated prisoner’s dilemma among the successive times you make this choice.  Your relationship to yourself in the future is one of limited warfare, leading to recursive self-prediction, [ ] which is the basis of both willpower* and freedom of will,*  [Selectionist model: Will as intertemporal bargaining] and, conversely, sudden surges of temptation {Sudden appetites are positive…}
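
One way to see why treating the current choice as a test case matters is to sum the discounted values of a whole series of similar choices.  This is a sketch under assumptions that are not in the text: a hyperbolic form A / (1 + kD), simple additive summation, and a chooser who expects every future choice in the series to go the same way as the present one.  It does not model the bargaining itself, and all numbers are illustrative.

```python
def hyperbolic_value(amount, delay, k=1.0):
    """Discounted value under an assumed hyperbolic form: A / (1 + k*D)."""
    return amount / (1.0 + k * delay)


def bundle_value(amount, first_delay, n_choices, spacing, k=1.0):
    """Summed present value of the same reward recurring n_choices times."""
    return sum(
        hyperbolic_value(amount, first_delay + i * spacing, k)
        for i in range(n_choices)
    )


def choose(n_choices, delay_to_ss=0.5, ss=5.0, ll=10.0, gap=3.0, spacing=5.0):
    """Pick SS or LL for a whole series, judged from the present moment."""
    ss_total = bundle_value(ss, delay_to_ss, n_choices, spacing)
    ll_total = bundle_value(ll, delay_to_ss + gap, n_choices, spacing)
    return "SS" if ss_total > ll_total else "LL"


if __name__ == "__main__":
    print("choice seen in isolation:", choose(1))
    print("choice seen as a test case for 10:", choose(10))
```

With these numbers the imminent SS reward wins when the choice stands alone, but the LL series wins when the present choice is taken to predict the next ten, because the later pairs in the series are all far enough away to favor LL.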

Willpower.  The many properties of what is called willpower are modeled well by intertemporal bargains, many of them tacit, among successive choice-makers, but modeled poorly by alternative theories such as the diversion of attention, a muscle-like organ, or one’s insight about broader patterns of choice.  [Eyes open: IV. Evidence about the mechanism]

The force of symbols.  The “symbolic value” of a choice can be understood as the way it defines the category of choices for which it is to be a test case in intertemporal bargaining.  Suddenly perceiving your current choice to be a symbol or example of a larger category can lead to radical changes of preference.   [Reply to Miller: “We do not share… not a mere metaphor”]

Bright lines.  Intertemporal contracts are self-enforcing, but are apt to be unstable unless they draw a line between cooperation and defection that stands out from other possible lines.  Such bright lines, like the one between any smoking and none, deter you from proposing new terms for a contract when temptation is high.  [ ]

Willpower may not be deliberate.  You inevitably bargain with yourself whenever you notice that some aspects of your current choice situation will occur again.  The process won’t usually take the form of an actual resolution, and is apt to seem nonsensical if you think you are a consistent choice-maker and know your own mind.  [PennResponsibility: “it might seem incredible… 133-142)”]

Willpower is new in evolution.  Nonhumans can imagine no more than a few hours of future, and must rely on instincts to motivate long term projects such as hoarding, migrating, defending territory, etc.  People whose foresight makes them realize they are endangered by impulses have had to learn commitment as best they can, the most effective means being intertemporal bargaining.  [Hansson: “Hyperbolic discounting raises the obvious… Sugden, 2001)”]

Willpower has serious side effects.  The incentives created by seeing your current choices as test cases make it harder to live in the here-and-now.  These incentives may also lead to areas like phobias where your will gives up, blind spots where you avoid noticing lapses (repression and denial), and a giving up of richer options that don’t obviously fit in with a rule (compulsiveness*).  [Dangers of willpower: 2. Willpower is an awkward expedient]


Selfhood

[Branch from bright lines]

Freedom of will.  Recursive self-prediction* makes even your imminent choices unpredictable from a knowledge of your prior incentives, even by you.  This unpredictability, and the genuine participation in your choice that recursive self-prediction represents, fulfill the usual philosophical requirements for having free will.  [Free will as recursive: The experience of free will: unpredictability and initiative]

Moral responsibility.  A deterministic chain of prior causes is often said to excuse you from blame, but self-blame from breaking an intertemporal bargain* comes directly from your reduced expectations of long term reward.  Perhaps social blame is vicarious self-blame—“if I were in her place I would blame myself”—and thus also compatible with strict determinism.  [Free will as recursive: The experience of free will: responsibility …”marketplace model of decision-making.”]  

Self as population.  As interests* learn to get their rewards, they come to include intertemporal bargaining* and commitment* processes that forestall competing interests that will be dominant at different times. [Breakdown: 3.1] The interaction of these processes builds ego functions—a self—from the bottom up, without any overarching government.  [HypvCondMaddenSend: Top Down and Bottom Up Theories] 

Ego functions grow as foresight increases.  Processes that learn to “get the jump on” other processes can steer the train of choice toward or away from them, but can remain in control only through the rewarding power of the chosen processes.  [Motivation/momentary: Higher mental processes… “an incentive for such a process;” add figures with lily pads, explain  A better model would use plants that grow away from their energy source on the surface of a pond.  The stems of lotuses anchor them and give them a competitive advantage over plants with no depth.]

Self as virtual.  We think of our inner selves (and our gods) as body-like or at least organ-like, but instead we consist of the relationship of motivational forces.  It is true that the brain generates these forces, but the foci of these forces—both the choosing self and the chosen objects—are built from our expectations via our histories.  [No link as yet]

 


Involuntary Behaviors

[Branch from Impulsiveness]


Conditioning does not explain involuntary behaviors.  We have a wide range of involuntary behaviors.  They are not just transferred reflexes, but shaped by reward.  {Breakdown for BBS: “there are long-lasting preferences… pushed, presumably by conditioning.”}

Supposedly conditioned responses are based on reward.  Emotions and other involuntary mental processes are conventionally supposed to be forced on us by conditioning, that is, by the pairing of new stimuli with innate releasing stimuli; but this is a crude analogy with laboratory conditioning.  [PicoConditioningPassage].  Instead there is evidence that emotion is a reward-seeking mental process.  [Uncertainty: Emotion as reward-dependent behavior]

Brief recurring impulses are itches.  Processes such as itches, tics, obsessive repetitions, and consumption of a substance near its satiation point produce a combination of SS reward and longer-term nonreward, making them “wanted but not liked” in Berridge’s terminology.  Their motivational value can be plotted as cyclical temporary preferences.  [Uncertainty: Negative emotions without conditioning]

Even pains have to lure us. Experiences such as physical pain, panic, and horrible memories are neither liked nor wanted, but still require our participation.  Since people can be taught to withhold the emotional (“protopathic”) component of pain or the panic response to phobic objects, these processes that are supposed to be entirely negative clearly gain entry by giving us urges, rather than by some automatic trigger.  [

Pains may consist of very rapid recurring impulses.  The observation that pains attract attention but not motor behavior suggests a cycle of brief reward alternating with longer nonreward—the same cycle as an addiction or an itch but much more rapid.  Other models are possible, but would still have to account for the phenomenon of an aversive component that can be avoided by resisting an attractive component.  [Precis: 4.1 The problem of pain… “positive their valence will be.”]

Involuntary behaviors are still governed by reward.  All options that involve obedience to urges are within the internal marketplace.  For some behaviors the urge is noticeable only when absent or opposed, as in breathing; for some it only partially affects a hardwired process, as when vegetative processes are altered by biofeedback or hatha yoga. [Hansson: “the ability of negative incentives… affected by recursive self-prediction.”]

The need for appetite gives long term interests an advantage.  Where activities need an aroused appetite to be (fully) rewarding, the impulse for them can be avoided with less motivation by avoiding the appetite {cf. Commitment}.  Complete avoidance of an appetite may result in a loss of taste for it.  [Pico: Tastes and appetites passage]

Sudden appetites are positive feedback phenomena.  When a reminder leads to a sudden appetite without predicting that its object is more available, the appetite is not literally conditioned to the reminder.  Rather the reminder is an occasion* for the (goal-directed*) appetite to test the person’s resolve, leading to increasing expression to the extent that success seems possible.  [HypVCond: Does classical conditioning…;Recursive of CRs; Recursive of motivated]

Unwelcome behaviors differ according to how long they’re preferred.  Behaviors that are unwelcome in the longest view range from the very brief (attention to pains*, which are unwanted but participated in) through itches* (wanted but not liked) and addictions* (liked but not approved of) to a stable narrowing of character (compulsiveness*—approved of  but not wished for).  {OsloDischotomy: IV. Dichotomies can be replaced…}

Motivation must be momentary.  Reward evolved as a selection mechanism for choices, and thus operates entirely within the current moment.  To be of current value, both future and past rewards must be represented as a present experience, with the expectation of the future rewards constructed of the memory of past rewards.  [BBSSuddendorf OR Motivation momentary]

The future must pass through the narrow neck of the present.  Even the most important future plans must withstand the competition of current urges.  However, to the extent that future goals depend on imagination, they may gain reward value from the properties of this imagination, even when their discounted value would be vanishingly small.  [DUP: Motivation/momentary: 2. Utility is more than info…”reported by Mazur (1986, v.s.).” RE-WRITE]

The brain can’t store reward.  Activities that cease to be surprising cease to be rewarding.  [Pico Surprise Passage]



Most Reward Is Self-Reward

[Branch from Self as virtual]

Imagination is the greatest source of reward.  The most conspicuous mental rewards are emotions.  [Uncertainty: “The conventional theory… Stearns, 1986)”]

Vicarious trial and error (VTE).  The internal marketplace works by trying out options in imagination.  Most human reward comes from mental (endogenously rewarded) scenarios.*  [MacGuffin: Among imaginative… “worth spending time in?”]

The demon at the calliope.  As long as we have drive for a hunger or emotion, satisfaction will be available on demand.  [Hatfield: 5. A motivational model of emotions].  Endogenous* reward is that which does not strictly depend on events outside of the mind.  [MacGuffin:

The paradox of wealth.  The expectation of satisfactions over the long term is what is meant by wealth.  But this expectation tends to spoil these satisfactions by reducing appetite for them. [MacGuffin: Reward by an abundant good…]

The paradox of self-reward.  Mental processes such as emotions that both reward and are rewarded might seem to be paradoxical, a circular and potentially explosive phenomenon. However, demanding emotion at will reduces the drive prematurely, so the most direct routes to it lose out to routes that depend on occasions.*  [Uncertainty: 4.2 Positive emotions without releasing stimuli].

We learn to use adequately rare occasions as cues for emotion.  Since endogenous reward is limited by appetite, it becomes attached to external events in proportion to their rarity (as well as their appropriateness to the appetite and their surprisingness*).  In one manner of speaking we could be said to be using such singular* events as occasions* for self-reward, but mostly the trying out of events as occasions occurs without deliberate intent.  [MacGuffin: Virtual reward comes to be governed by occasions OR ].

Intrusive emotions are controlled by recursive self-prediction.  The old Darwin-James-Lange theory is partially correct, in that our perception of whether emotions such as fear and nausea are increasing or decreasing is positively fed back to our expression of those emotions.  Behavior therapies to master them entail inducing them under conditions where other incentives will be enough to compete with this feedback.  [Munich: Recursive self-prediction in will…”against temptation, too.”]

The construction of beliefs is governed by reward.  Belief is ultimately our discernment of constraints on choice, which we experience as facts.  Since only beliefs connected with reward survive in the market for our attention, our criteria for factuality will be either instrumental predictiveness or hedonic effectiveness, that is, singularity that occasions reward.  [Hansson: Beliefs may arise through recursive… “like a fact of the external world.”]

Belief is a recursive process that constrains endogenous reward.  [Hansson: “Personal rules supply another…  zones on a continuum.”]

[Branch from Self as virtual]


Impulse control is only half of rationality.  A society on the edge of starvation needs to make sure that people control impulses-- avoid dangers and save enough seed corn for next year’s crops.  A rich society has at least as much incentive to refresh people’s appetites, a need that often opposes the need for impulse control but cannot be weighed directly against it.   [MacGuffin:  “Historically, the dysphoria” + Reward by an abundant good] 


Welfare Picoeconomics

[Branch from Reward is a unitary phenomenon]


Welfare planners should go by the properties of emotional reward.  Welfare economics now takes account of more than economic prosperity, but still deals with a grab-bag of factors that have simply been observed to correlate with self-reported happiness.  Since most human reward is mental—emotion is the most identifiable—the study of this process should let welfare economics become more predictive.  [Welfare Picoeconomics—in development.]

Happiness has a ceiling, but not a floor.  Planners conceive various aggregations of reward as happiness, and are dismayed that self-reported happiness does not increase as society becomes more successful in getting the rewards it has sought.  But reward evolved as a homeostatic mechanism to maintain reproductive fitness, like those that maintain oxygen levels and temperature, and would thus be expected to vary below normality but not above it.  [Uncertainty: “Research has confirmed… drink, oxygen, and temperature.”]

Reward is sovereign.  Although the reward process itself evolved because selecting individual behaviors is more adaptive than selecting whole organisms, within an organism reward is the ultimate determinant of choice, regardless of adaptiveness.  Just as individuals can be seen as vehicles for transmitting their genes, behaviors can be seen as mechanisms for obtaining reward. [Narrow neck: 2. Reward is more than just utility…”consumption of an external commodity.”]{People seek reward}

Reward is the basis of both positive and negative experiences.   {Even pains have to lure us}

Monetary wealth or poverty is subject to adaptation.  Welfare economics focuses on access to goods, but reward and aversion come from changes of level of this access, and do not attach to any level per se.  Imagine a society in which people were paid only in heroin… {Paradox of wealth} [Welfare picoeconomics: The fable of heroin—in development]

Durable goods require refreshment of appetite.  In the absence of impulses we make our own reward, but this requires sources of adequately rare and surprising occasions.*

Vicarious reward is the richest source of occasions.

Durable bads are those that do not habituate.

 
Intelligence is a double-edged sword.  Humans’ larger cortices don’t only predict external rewards better, they generate internal reward by imagination.  Imagination adaptively helps planning, but maladaptively dissociates reward from its evolved purposes.  [BBS McKay]

Economic goods often get their value from being occasions for endogenous reward.

Imagination loses power through habituation. The extent to which imagination can reward in its own right depends on its tendency to habituate (continuum from fantasy-prone personalities to stimulus-seeking personalities).  Habituation is driven by the immediate reward differential for anticipation.