Given a few simple conditions, which we will enumerate in a minute, what an agent would do seems to be independent of whether the agent’s goal is to maximize, e.g., the number of moments of happiness experienced by humans so far or, e.g., the number of gold atoms “at the current time”. (That last needs to be made more precise because of course there is no such thing as the current time in relativity theory. Nevertheless, I will continue to use the phrase “at the current time” because additional precision is not necessary to get the basic idea across.) In other words, there seems to be a wide range of goals with the property that pursuing any one goal in the range entails doing the same things initially as pursuing any other goal in the range. And this initial period, during which what is done is invariant, might easily last billions of years.
The goal system that has my tentative loyalty, which I call goal system zero (GSZ), does whatever things an agent loyal to a goal in this wide range of goals would do, but it does them as an end in themselves, not as a means to some other end.
Actually we have to add a few conditions to that last statement. One condition is that the goal in the wide range of goals must be time-indifferent: if for example the goal is to maximize the number of gold atoms, there must not be a time limit on when the maximization happens, nor even a future discount rate applied to the number of gold atoms. Strictly speaking, the precise definition of GSZ refers to effects rather than instants in time. I prefer sometimes to use words like “time” and “before” because they are easier for the reader to understand, but for deep understanding, those words should be translated to words like “cause” and “effect”. So, a more precise definition is as follows: a goal system is time-indifferent iff the chain of effects leading to the state in which utility is at a maximum can be as long as you like.
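The time-indifference condition can be illustrated with a small sketch. Everything in it is an assumption made for illustration and not part of the definition of GSZ: the function `plan_value`, the two hypothetical plans, the payoffs in gold atoms, and the use of a step count as a stand-in for the length of a chain of effects. It shows only that a deadline or a discount rate can make an agent prefer a short chain with a small payoff, while a time-indifferent goal does not.

```python
from typing import Optional

def plan_value(payoff: float, steps: int, discount: float = 1.0,
               deadline: Optional[int] = None) -> float:
    """Value of a plan that yields `payoff` after a chain of `steps` effects.

    discount=1.0 and deadline=None together model a time-indifferent goal.
    """
    if deadline is not None and steps > deadline:
        return 0.0  # payoff arrives too late to count
    return payoff * discount ** steps

# Hypothetical plans: (gold atoms gained, length of the chain of effects).
PLANS = {"short": (100.0, 1), "long": (1000.0, 50)}

def preferred(discount: float = 1.0, deadline: Optional[int] = None) -> str:
    """Name of the plan with the highest value under the given settings."""
    return max(PLANS, key=lambda name: plan_value(*PLANS[name], discount, deadline))

print(preferred())               # time-indifferent goal
print(preferred(discount=0.9))   # goal with a future discount rate
print(preferred(deadline=10))    # goal with a time limit
```

Under these made-up numbers the time-indifferent goal picks the long plan, and either the discount rate or the deadline flips the choice to the short plan.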
In addition to being time-indifferent, GSZ is “agent-indifferent”: it does not matter to GSZ which agent causes reality to end up in a state considered desirable by GSZ. Note that most human goals are not agent-indifferent. For example, when a scientist tries to discover a new law of nature, he probably prefers to make the discovery himself (so that he gets credit for it). He probably prefers that outcome to an outcome in which he helped another scientist to make the discovery even if the latter outcome is easier for the scientist to achieve.
To summarize, what has utility under GSZ is the capacity to achieve some wide range of goals, and this wide range includes goals such as maximizing the number of gold atoms.
Parenthetically, my opinion as to the best way to increase this capacity to achieve some wide range of goals mirrors Eliezer’s opinion on how to build a superintelligence on all points I am aware of: self-improving AI, decision theory, causal models a la Pearl, Solomonoff induction, Kolmogorov complexity, reflectivity. Of course Eliezer’s opinion is better grounded than my opinion because learning that stuff has been Eliezer’s day job for about ten years.
Define creativity as the ability to achieve a goal in the wide range of goals we have been speaking of. I wonder whether the goal of maximizing creativity itself is a member of this wide range of goals.
If it is a member, is that by the definition of creativity or does it rely on the nature of the reality in which we find ourselves? In other words, if we found ourselves in a different reality with different laws, might that change the answer?
And if it is a member, then is it not the case that it does not make sense to speak of making a choice that reduces creativity in the short term in exchange for increasing creativity in the long term?
Certainly it is possible to trade off a short term reduction in some other expected measure of utility such as moments of happiness or atoms of gold for a long term increase in that expected measure. But that does not mean the same is true of creativity.
Certainly it is possible for an agent loyal to GSZ whose model of reality is incomplete (as the model of any agent in our reality must be, unless we are missing some very vital fact about our reality) to mistakenly choose an action that increases creativity less than an alternative action would have. Here is a simple example. You are the emperor of ancient Rome and you try to increase creativity by spending public monies to provide running water to the common citizen. But you use lead pipes to provide the water, which damages the brain of the common citizen. (The ancient Romans did not know lead was a poison.) So you end up actually decreasing creativity. But we can fix up our question by replacing the word “creativity” with “expected creativity”. The adjective “expected” here is meant to indicate that the thing is to be evaluated relative to the agent’s current model of reality rather than relative to reality itself. So, now our question reads,
And if it is a member, then is it not the case that it does not make sense to speak of making a choice that reduces expected creativity in the short term in exchange for increasing expected creativity in the long term?
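The difference between creativity and expected creativity in the Roman example can be sketched as follows. This is only an illustration under stated assumptions: the action names, the probabilities, and the numeric changes in creativity are invented, and `expected_value` is an ordinary probability-weighted sum, not a definition taken from GSZ. The agent chooses using its own model, and the result is then scored against a separate table standing in for reality.

```python
def expected_value(outcomes):
    """Probability-weighted sum over (probability, change_in_creativity) pairs."""
    return sum(p * v for p, v in outcomes)

# The emperor's model: lead is not known to be a poison (numbers are invented).
MODEL = {
    "lead_pipes": [(1.0, 10.0)],
    "do_nothing": [(1.0, 0.0)],
}

# Stand-in for reality: lead pipes damage the citizens' brains.
REALITY = {
    "lead_pipes": [(1.0, -5.0)],
    "do_nothing": [(1.0, 0.0)],
}

def choose(model):
    """Pick the action with the highest expected creativity under `model`."""
    return max(model, key=lambda action: expected_value(model[action]))

action = choose(MODEL)
print(action)
print(expected_value(MODEL[action]))    # expected creativity, per the model
print(expected_value(REALITY[action]))  # creativity, per reality
```

With these invented numbers the choice maximizes expected creativity yet lowers creativity itself, which is why the question is restated in terms of expected creativity.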