2020-02-29

I recently had the opportunity to discuss a few complexity-theoretic
questions with my son, who currently studies computer science. And I
realized that - despite the still open `P ≟ NP` problem - many
interesting and non-trivial equality results have been established
since 1989, the last time I took a complexity theory class at
university.

While updating my world view on which computational problems are
how complex, I learned that there is now a lot of evidence that an
*Arthur* with a "classical computer that has a good random number
generator" is able to verify proof certificates handed to him by
an arbitrarily powerful but potentially dishonest *Merlin*. The
proof certificates may be maliciously wrong - in the case of a
black-hat hacker - or accidentally incomplete - in the case of an
imperfectly trained self-driving car (see also "'Inadequate Safety
Culture' Contributed to Uber Automated Test Vehicle Crash" and
"Fußgänger auf Fahrbahn nicht vorgesehen" ["pedestrians on the
roadway not anticipated"]). Arthur would still be able to tell the
correct proof from the imperfect (or wrong) "proof" with
arbitrarily high confidence.

Despite the still-outstanding proof of `P ≠ NP`, it is intuitively
clear that finding a proof of - say - Fermat's last theorem, which
took 356 years, is much more difficult than verifying it
(verification and completion of Andrew Wiles' proof took another
18 months). It still surprised me, however, that it is now
*proven* that an Arthur with "normal" probabilistic-polynomial
capabilities can verify proofs of problems from complexity classes
that are several orders of magnitude larger than the class `BPP`,
which is the class of problems Arthur can solve himself.
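This asymmetry between computing and verifying has a classical, down-to-earth cousin: Freivalds' algorithm lets a randomized verifier check a claimed matrix product in O(n²) time per round instead of recomputing it in O(n³). The following sketch is my own illustration, not part of the results discussed here:

```python
import random

def freivalds_verify(A, B, C, rounds=20):
    """Probabilistically check the claim A @ B == C.

    Each round costs O(n^2) (three matrix-vector products), far
    cheaper than recomputing A @ B in O(n^3).
    """
    n = len(A)
    for _ in range(rounds):
        # Random 0/1 challenge vector.
        r = [random.randint(0, 1) for _ in range(n)]
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return False  # C is certainly wrong
    return True  # C is wrong with probability at most 2**-rounds
```

A correct product always passes; a product that is wrong in even one entry is caught in each round with probability at least 1/2, so after k rounds the residual error probability is at most 2⁻ᵏ.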

The first time I experienced the power of intelligent interrogation was in bank supervision at BaFin. Despite the on-site examination teams having orders of magnitude fewer resources than the bank, and despite often learning the nitty-gritty details of specific risk exposures only during the examination from the bank itself (because many market-microstructure details are not publicly known), we often managed to identify strengths and weaknesses in the bank's risk quantification processes that surprised the bank's senior management.

Similarly, when I was responsible for my team's pricing of
reinsurance contracts at MunichRe, I needed to improve my skills
of spotting potential weaknesses in the "proof certificates"
created by my team for the pricing of a reinsurance contract. And
since 2015, *model validation* has been the main business
activity of infinada Consulting GmbH,
requiring a similar "Arthurian" skill set.

It has often been conjectured that future systems employing
"machine learning" might be so complex that humans will be unable
to test or validate their reliability. I am now more convinced
than ever that *Arthur* - a human with a "normal"
(non-quantum) computer with a true source of randomness - can
validate or falsify statements about very, very complex systems -
including today's most complex machines like self-driving cars -
as long as they are specifically designed to be
*not Turing-complete*.

The following table contains an overview of complexity classes and their relations. The next section provides a walk-through.

## Some Complexity Classes

deterministic Turing Machines (TM) | | non-deterministic TM | | probabilistic TM | | interactive proof systems | | quantum computers
---|---|---|---|---|---|---|---|---
`RE` (recursively enumerable) | =_{2020} | `MIP*` | =_{2012} | `QMIP` | | | |
↑ | | ↑ | | ↑ | | | |
`R` (decidable) | | | | | | | |
↑ | | | | | | | |
`PR` (primitive recursive) | | | | | | | |
↑ | | | | | | | |
`ELEMENTARY` | | | | | | | |
↑ | | | | | | | |
`NEXP` | =_{1991} | `MIP` | | | | | |
↗ | | ↑ | | ↑ | | | |
`EXPTIME` | | | | | | | |
↑ | | | | | | | |
`PSPACE` | =_{1970} | `NPSPACE` | =_{1990} | `IP` | =_{2009} | `QIP` | |
↑ | | ↑ | | ↑ | | ↑ |
`NP` | → | `MA` | → | `AM` | → | `QAM` | |
↗ | | ↑ | | ↗ | | ↑ | |
`P` (polynomial time) | → | `BPP` | → | `BQP` | | | |

## Notable Theorems and Conjectures

All practical experience of the last 50 years points
to `NP ≠ P`. Personally, I would not completely rule out
that `NP ≠ P` could be added to the
Zermelo-Fraenkel axiom system without contradiction. If I
interpret Scott Aaronson correctly, however, he not only believes
that `NP ≠ P` is very reasonable to assume,
but moreover, that it
can be proven within the axiom system ZFC - the proof just
hasn't been found yet. For the full story, see his
112-page paper, or you can also read
the short story, told in terms of frogs.

`NPI` (= "NP-intermediate") are those problems in NP
that are neither in P nor in NPC (= NP-complete). Ladner showed
in 1975 that if `NP ≠ P`, then `NPI` is not empty. Problems
suspected to be in `NPI` are integer factorization and the
graph isomorphism problem.

Insights into "probabilistically checkable proofs"
(`PCP`) are the key to understanding the power of
the `AM` protocol specifically and interactive proof
systems in general.

Let `PCP[r(n), q(n)]` denote the class of problems
that have proofs checkable by a BPP-capable Arthur who uses
O(r(n)) bits of randomness and queries O(q(n)) bits of the
proof (provided by Merlin).

By definition, `NP = PCP[0, poly(n)]`. But it was
shown in 1998 by Arora, Lund, Motwani, Sudan, and Szegedy that
`NP = PCP[log(n), 1]`.

In other words,

> For every NP problem, Arthur needs only O(log(n)) bits of randomness and can restrict his inspection of the proof certificate to only a finite number of bits to convince himself that the proof is correct with arbitrarily high probability.
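The actual PCP encoding is deep; the toy spot-checker below (my own simplification, assuming a 3-SAT instance with clauses given as tuples of signed variable indices) only conveys the flavor of "few random queries per check". Unlike a real PCP, plain clause sampling only detects certificates that violate many clauses; the PCP theorem re-encodes the proof so that *any* error spreads out and is detectable with O(1) queries.

```python
import random

def spot_check_sat(clauses, assignment, samples=50):
    """Toy spot-check of a claimed satisfying assignment.

    clauses:    list of tuples of signed ints; literal v > 0 means
                variable v, v < 0 means the negation of variable -v.
    assignment: dict mapping variable index -> bool.

    Each sample reads only the few assignment bits that one random
    clause mentions, instead of inspecting the whole certificate.
    """
    for _ in range(samples):
        clause = random.choice(clauses)
        satisfied = any(
            assignment[lit] if lit > 0 else not assignment[-lit]
            for lit in clause
        )
        if not satisfied:
            return False  # found a violated clause
    return True
```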

One of the most remarkable results is Shamir's 1990/1992 theorem
`IP = PSPACE`.

This appears to be the most important equality result in the
whole hierarchy and it is the reason *why validation of complex
systems is feasible*.
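Shamir's theorem concerns all of PSPACE, but the flavor of an interactive proof is easiest to see in the classic protocol for graph non-isomorphism (Goldreich, Micali, Wigderson). The sketch below is my own toy implementation with a brute-force Merlin; it is not meant as anything more than an illustration of the Arthur/Merlin dynamic:

```python
import random
from itertools import permutations

def permute(edges, perm):
    """Relabel the vertices of an undirected edge set according to perm."""
    return frozenset(frozenset((perm[u], perm[v])) for u, v in edges)

def merlin_answer(g0, g1, n, challenge):
    """All-powerful (here: exponential-time) Merlin: decide which
    graph the challenge is a relabelled copy of, by brute force."""
    if any(permute(g0, p) == challenge for p in permutations(range(n))):
        return 0
    return 1

def arthur_accepts(g0, g1, n, rounds=20):
    """Interactive proof that g0 and g1 are NOT isomorphic.

    Arthur secretly picks one of the graphs, scrambles its vertex
    labels, and asks Merlin which graph he picked. If the graphs
    really are non-isomorphic, Merlin can always answer correctly;
    if they are isomorphic, he can only guess and is caught with
    probability 1/2 per round.
    """
    for _ in range(rounds):
        bit = random.randint(0, 1)
        perm = random.sample(range(n), n)
        challenge = permute((g0, g1)[bit], perm)
        if merlin_answer(g0, g1, n, challenge) != bit:
            return False  # Merlin caught guessing
    return True
```

Note the roles: Arthur's only powers are a fair coin and polynomial time, while Merlin's power is unbounded, yet Arthur ends up statistically certain about a statement he could not decide himself.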

## Consequences for the Validation of Statements About Complex Systems

One consequence of Adi Shamir's `IP = PSPACE` theorem is:

> If a benign alien race, or a don't-be-evil entity on earth with vast computing power, finds a strategy for White to win the game of Chess, then it could also prove to a mortal human equipped with a "normal personal computer", via a short communication and to arbitrary statistical certainty, that it indeed can play the game perfectly and will always win as White from the starting position.

Another consequence is that the validation of very large, machine-learned systems is possible, as long as they are designed such that their inputs, internal states and outputs are suitably restricted.

If, however, the system at hand, like Pokémon Yellow or the Border Gateway Protocol, is accidentally Turing-complete, then even simple questions about the system may be undecidable.