Let's say an expert says X, because Y. Then Y turns out to be false. How much should we update our beliefs about X?
Obviously the answer has to be some. There is a conservation of expected evidence argument here: if the reasoning being true counts in a prediction's favor, then the reasoning being false has to count against it. It's useful information. But maybe we shouldn't update a huge amount. The expert could be correct about the prediction, be making it for ten reasons, and simply be wrong about which is the most persuasive line of reasoning. This happens a lot, and we should expect that in communication between experts and the public, and even between experts, whatever wisdom they have is embedded in the prediction itself.
That being said, if your understanding of the reasoning is so bad you'd use a line of logic that turns out to be false, it's possible you believe these statements for the wrong reasons. And that means we should, at the margin, ignore the prediction: it carries less information, because it's closer to noise, or perhaps even the product of bias or of malicious outside actors trying to confuse the expert.
At the end of the day, if someone predicts X, because Y, and we learn Y is false in short order, we should be extremely skeptical of their predictions about X.
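The reasoning above can be made concrete with a toy Bayesian model. This is only a sketch under assumptions that are not in the argument itself: the expert is either "informed" (their assertions track the truth) or "noise" (they assert X at coin-flip odds), and learning that Y is false bears on X only through what it says about which kind of expert we are dealing with. The function name and every number below are hypothetical, chosen for illustration.

```python
def posterior_x(prior_x, prior_informed, hit_rate,
                p_y_false_informed=0.1, p_y_false_noise=0.5, y_false=None):
    """P(X | expert asserted X [, and Y turned out false/true]).

    All parameters are illustrative assumptions:
      prior_x            -- our prior that X is true
      prior_informed     -- prior that the expert's claims track the truth
      hit_rate           -- P(informed expert asserts X | X is true)
      p_y_false_informed -- P(stated reason Y is false | informed expert)
      p_y_false_noise    -- P(stated reason Y is false | noise expert)
      y_false            -- None (Y unchecked), True (Y false), False (Y true)
    """
    total = 0.0
    x_true_mass = 0.0
    # Enumerate the four joint hypotheses: (X true/false) x (informed/noise).
    for x in (True, False):
        for informed in (True, False):
            p = prior_x if x else 1.0 - prior_x
            p *= prior_informed if informed else 1.0 - prior_informed
            # Likelihood of the expert asserting X.
            if informed:
                p *= hit_rate if x else 1.0 - hit_rate
            else:
                p *= 0.5  # a noise expert asserts X regardless of the truth
            # Likelihood of what we learned about the stated reason Y.
            if y_false is not None:
                p_yf = p_y_false_informed if informed else p_y_false_noise
                p *= p_yf if y_false else 1.0 - p_yf
            total += p
            if x:
                x_true_mass += p
    return x_true_mass / total


before = posterior_x(0.5, 0.8, 0.9)                      # expert said X, Y unchecked
after_false = posterior_x(0.5, 0.8, 0.9, y_false=True)   # Y turned out false
after_true = posterior_x(0.5, 0.8, 0.9, y_false=False)   # Y turned out true

print(f"{before:.3f} {after_false:.3f} {after_true:.3f}")  # 0.820 0.678 0.851
```

With these made-up numbers, a false Y pulls our belief in X down from 0.82 to about 0.68: a real update, but the prediction still carries information relative to the 0.5 prior. The size of the drop depends heavily on how much more likely a noise expert is to give a false reason than an informed one, which is exactly the judgment call the paragraphs above are arguing over.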
With all that in mind: in June of 2015, Robert Miles makes the argument that we ought to get good at making sure computers do what we want as they get more capable. In September of 2015, Sean Holden responds that the default outcome is that even extremely complex systems will do what we want. He says we shouldn't be concerned, because the game of Go hasn't even been meaningfully challenged, so we aren't going to be making computer systems that can engage in more generic problem solving any time soon, and even Moore's law won't help, so even decades of advancement would be a drop in the bucket. (Robert Miles then clarifies that he thinks the safety engineering is worth doing even if it'll take a long time to be useful.)
One month after Holden's statements are published, AlphaGo beats the European Go Champion 5-0, and 5 months after that no human can meaningfully challenge state of the art Go programs. A year and a half after that, the system is generalized to other games and doesn't even borrow a single wisp of human wisdom about them. That left a two-year horizon in which thousands of years of human competition and wisdom had any use at all, even in the abstract, and the modern version of the system beats all previous versions without even knowing the rules of the game. Sean Holden said no one was even scratching the surface of playing games where the player has to figure out the rules. Two years later the best systems throw that information in the garbage can, because they're so sophisticated they don't need to be told the rules. He says, what about poker? But that's been solved too. Literally every single statement he made about AI capabilities has been demonstrated to be wrong pretty quickly after he said it.
It seems fair to say: Sean Holden is not a superforecaster.
But if that means we must update on his original conclusion (that computers will somehow suddenly start doing precisely what we want when we make them answer even more open-ended questions), by how much should we update?
Probably a lot. I wasn't able to find contact information for him, but I think he'd say he's changed his mind. The idea that we'd do better given even less time to build safety features or general control mechanisms seems odd. And his initial claim should have seemed unlikely to begin with. Our priors should say that, since software getting more complex has so far introduced a great deal of strange and hard-to-track-down bugs, it will keep doing so in the future. And even current machine learning techniques are already so hard for humans to understand that it would probably be better to start over with a different paradigm. And it turns out high quality answers to open-ended questions are routinely extremely sensitive to even minor errors or meta-errors in our ability to describe what we want.
I think we should take this engineering task seriously. We've already got machine learning systems influential enough that we ought to care what they do, and I don't see that going away as the state of the art gets more general and more powerful, do you?
We should be taking this deadly seriously. People are impressed by this technology, but given the predictions experts made before it existed, we ought to be terrified. And by terrified I mean deeply engaging in the technical work, of course. Anything else would be a failure to update correctly on the evidence those expert predictions provided.