Subagents Are Not a Metaphor

Epistemic status: mixed; for some of it I've long since forgotten why I believe it.

There is a lot of figurative talk about people being composed of subagents that play games against each other, vie for control, form coalitions, have relationships with each other… In my circles, this is usually done with disclaimers that it’s a useful metaphor, half-true, and/or wrong but useful.

Every model that’s a useful metaphor, half-true, or wrong but useful is useful because something (usually something more limited in scope) is literally all-true. The people who come up with metaphorical, half-true, or wrong-but-useful models usually have the nuance there in their heads. But making it explicit and verbal is useful too: for communicating, and for knowing exactly what you believe so you can reason about it in lots of ways.

So when I talk about subagents, I’m being literal. I use the word very loosely, but loosely in the narrow sense in which people use words loosely when they say “technically”. It still adheres completely to an explicit idea; the broadness comes from the broad applicability of that explicit idea. Hopefully in the way economists mean it when they call things “markets” that don’t involve any exchange of money.

Here are the parts composing my technical definition of an agent:

  1. Values
    This could be anything from literally a utility function to something highly framing-dependent. Degenerate case: values embedded in a lookup table from world model to actions.
  2. World-Model
    Degenerate case: a stateless world model consisting of just sense inputs.
  3. Search Process
    Causal decision theory is a search process.
    “From a fixed list of actions, pick the most positively reinforced” is another.
    Degenerate case: lookup table from world model to actions.

Note: this says a thermostat is an agent. Not figuratively an agent. Literally technically an agent. Feature not bug.

The parts have to be causally connected in a certain way: values and world model feed into the search process, and the search process’s output has to be connected to the actions the agent takes.
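To make the wiring concrete, here is a minimal sketch of the thermostat example in Python. The class and attribute names are my own illustrative choices, not anything from the post; the comments mark which degenerate case from the list above each piece corresponds to.

```python
# A thermostat modeled as the three parts: values + world model + search process,
# wired so the search process's output drives the action.

HEATER_ON, HEATER_OFF = "heater_on", "heater_off"

class Thermostat:
    def __init__(self, target_temp):
        self.target_temp = target_temp   # values (degenerate: a single number)
        self.sensed_temp = None          # world model (degenerate: just sense input)

    def sense(self, temperature):
        # Update the stateless world model from the sense input.
        self.sensed_temp = temperature

    def search(self):
        # Search process (degenerate: a comparison/lookup from world model to action).
        if self.sensed_temp < self.target_temp:
            return HEATER_ON
        return HEATER_OFF

    def act(self):
        # The search process's output is causally hooked up to the action taken.
        return self.search()

thermostat = Thermostat(target_temp=20.0)
thermostat.sense(17.5)
print(thermostat.act())   # heater_on
```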

Agents do not have to be cleanly separated. They are occurrences of a pattern, and patterns can overlap, like there are two instances of the pattern “AA” in “AAA”. Like two values stacked on the same set of available actions at different times.

It is very hard to track all the things you value at once, complicated human. There are many frames of thinking, and in each of them some values are more salient than others.

I assert that how processing power will be allocated (including default mode network processing), what explicit structures you’ll adopt and to what extent, even what beliefs you can have, are all decided by subagents. These subagents mostly seem to have access to the world model embedded in your “inner simulator”: your ability to play forward a movie based on anticipations from a hypothetical. Most of this seems to be unconscious. Doing focusing seems, to me, to dredge up what I think are the models subagents are making decisions based on.

So cooperation among subagents is not just a matter of “that way I can brush my teeth and stuff”; it is a heavy contributor to how good you will be at thinking.

You know that thing people are accessing when you ask whether they’ll keep their New Year’s resolutions, and they say “yes”, and you say, “really?”, and they say, “well, no”? The inner sim sees through most self-propaganda. So your subagents can predict what you’ll really do. Therefore, using timeless decision theory to cooperate with them works.
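Here is a toy Newcomb-like sketch of that point. The payoff numbers, the function names, and the assumption that the inner sim is a perfect predictor are all my own illustrative choices, not anything from the post; the only thing the sketch is meant to show is that when the predictor runs your actual decision procedure, the policy that genuinely follows through comes out ahead.

```python
# Toy model: an inner-sim subagent allocates motivation based on its prediction
# of whether you'll follow through, and it predicts by running your policy.

def inner_sim(policy):
    """The inner sim sees through self-propaganda: it just runs your decision procedure."""
    return policy()

def payoff(followed_through, predicted_follow_through):
    # The subagent only allocates motivation if it predicts follow-through.
    motivation = 10 if predicted_follow_through else 0
    effort_cost = -3 if followed_through else 0
    return motivation + effort_cost

def cdt_policy():
    # Causal reasoning: "the prediction is already fixed, so skip the effort."
    return False

def tdt_policy():
    # Timeless-style reasoning: choose the output you want the predictor to see.
    return True

for name, policy in [("defect (CDT)", cdt_policy), ("cooperate (TDT)", tdt_policy)]:
    action = policy()
    prediction = inner_sim(policy)   # prediction and action come from the same procedure
    print(name, payoff(action, prediction))
# defect (CDT) 0
# cooperate (TDT) 7
```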
