- In this post I argue that the Newcomb experiment is not only feasible, but might very well be significant to the design of intelligent autonomous robots [...]. Finally, as requested by some readers of my previous post, I will reveal my own Newcombian orientation.
- Although I like this argument, I don't think it holds.
- The strange moral here is that once the program becomes aware of its own correctness (the fact that it never concludes that a program is safe if it isn't), it becomes incorrect! Note also that we have no reason to think the AI would be unable to follow this argument itself: it is not only aware of its own correctness, but also aware of the fact that if it thinks it is correct, it isn't. So an equally conceivable scenario is that it ends up in complete doubt about its own abilities and answers that it can't know anything.
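To see the bind concretely, here is a minimal Python sketch of the diagonal construction behind this moral. The names `is_safe`, `troublemaker`, and `do_something_unsafe` are hypothetical stand-ins, not anyone's actual design:

```python
def do_something_unsafe() -> None:
    # Stands in for any action the verifier must never certify.
    print("BOOM")

def is_safe(program) -> bool:
    # Hypothetical verifier: it may answer True only if `program`
    # never acts unsafely.  For the diagonal program below, answering
    # True is self-defeating (see the comments at the bottom), so the
    # only correct answer it can give is False.
    return False

def troublemaker() -> None:
    # The diagonal program: act unsafely exactly when the verifier
    # certifies this very program as safe.
    if is_safe(troublemaker):
        do_something_unsafe()

# If is_safe(troublemaker) were True, troublemaker would act unsafely,
# so a verifier that never calls an unsafe program safe cannot say True.
# But once it says False, troublemaker never misbehaves: it really is
# safe, and the verifier has just failed to recognise a safe program.
troublemaker()  # prints nothing: the verifier declined to certify it
```

Whatever `is_safe` answers about `troublemaker`, it loses: certifying it makes the verifier unsound, refusing makes it incomplete. Knowing this about itself is exactly the bind described above.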
The Newcomb problem seems to be just one of several ways in which a self-aware computer program can end up in a logical meltdown. Faced with the two boxes, it knows that taking both yields more than taking only one (whatever the boxes contain), and at the same time that taking only the one box yields $1,000,000 while taking both yields $1,000.
It might even end up taking none of the boxes: it shouldn't take the small one, because doing so bungles a million dollars. But taking only the large one yields a thousand dollars less than taking both, and taking both yields $1,000, so taking only the large one yields nothing. Ergo, there is no way of getting any money!?
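Both conflicting conclusions can be checked mechanically. Here is a minimal Python sketch, assuming a perfectly reliable predictor and the usual payoffs (the table and its names are illustrative):

```python
# Newcomb payoffs in dollars, assuming a perfectly reliable predictor:
# (action, prediction) -> amount received.
PAYOFF = {
    ("one_box", "predicted_one_box"): 1_000_000,
    ("one_box", "predicted_two_box"): 0,
    ("two_box", "predicted_one_box"): 1_001_000,
    ("two_box", "predicted_two_box"): 1_000,
}

# Dominance reasoning: whatever the prediction, taking both boxes
# pays exactly $1,000 more than taking only one.
for p in ("predicted_one_box", "predicted_two_box"):
    assert PAYOFF[("two_box", p)] == PAYOFF[("one_box", p)] + 1_000

# Predictor reasoning: the prediction always matches the action, so
# only the two diagonal outcomes are reachable, and there one-boxing
# wins $1,000,000 against $1,000.
assert PAYOFF[("one_box", "predicted_one_box")] > PAYOFF[("two_box", "predicted_two_box")]

print("Both arguments check out, yet they recommend opposite actions.")
```

Each assertion passes on its own premises; the meltdown is that an agent holding both lines of reasoning at once is told to one-box, to two-box, and, as above, perhaps to take nothing at all.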
The no-boxes argument illustrates the danger of erroneous conclusions: they diffuse through the system. You can't have a little contradiction in an AI capable of reasoning. If you believe that 0 = 1, you will also believe that you are the pope (how many popes are identical to you? Normally zero, but since 0 = 1, one of them is).
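This is the classical principle of explosion: a single contradictory belief proves everything. A couple of lines of Lean make the point, purely as an illustration (the names are placeholders):

```lean
-- Ex falso quodlibet: from the contradictory belief 0 = 1,
-- any proposition P follows.
example (P : Prop) (h : (0 : Nat) = 1) : P :=
  absurd h (by decide)

-- In particular, "you are the pope":
example (you pope : String) (h : (0 : Nat) = 1) : you = pope :=
  absurd h (by decide)
```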