{"id":27,"date":"2017-01-09T08:00:53","date_gmt":"2017-01-09T08:00:53","guid":{"rendered":"http:\/\/sinceriously.fyi\/?p=27"},"modified":"2017-01-10T08:00:01","modified_gmt":"2017-01-10T08:00:01","slug":"narrative-breadcrumbs-vs-grizzly-bear","status":"publish","type":"post","link":"https:\/\/sinceriously.fyi\/narrative-breadcrumbs-vs-grizzly-bear\/","title":{"rendered":"Narrative Breadcrumbs vs Grizzly Bear"},"content":{"rendered":"

In my experience, to self-modify successfully, it is very very useful to have something like trustworthy sincere intent to optimize for your own values, whatever they are.

If that sounds like it’s the whole problem, don’t worry. I’m gonna try to show you how to build it in pieces. Starting with a limited form, which is something like decision theory or consequentialist integrity. I’m going to describe it with a focus on actually making it part of your algorithm, not just understanding it.

First, I’ll lay groundwork for the special case of fusion required, in the form of how not to do it and how to tell when you’ve done it. Okay, here we go.

Imagine you were being charged by an enraged grizzly bear and you had nowhere to hide or run, and you had a gun. What would you do? Hold that thought.

I once talked to someone convinced that one major party presidential candidate was much more likely to start a nuclear war than the other, and that this was the dominant consideration in voting. Riffing off a headline I’d read without clicking through and hadn’t confirmed, I posed a hypothetical.

What if the better candidate knew you’d cast the deciding vote, and believed that the best way to ensure you voted for them was to help the riskier candidate win the primary in the other major party, since you’d never vote for the riskier candidate? What if they’d made this determination after hiring the best people they could to spy on and study you? What if their help caused the riskier candidate to win the primary?

Suppose: