G12FCO -- Formal Computation

Lecture 10 -- The classes P and NP

The class P of decision problems

We've seen that an algorithm can be either tractable or intractable, depending on whether or not its worst-case behaviour depends polynomially on the size of input. This idea can be extended to the underlying problem. A problem is tractable if it has a polynomial solution in its worst case. Note that having a polynomial algorithm demonstrates that the problem is tractable, but not having one may simply mean that we haven't yet found it. Occasionally, problems that were thought for many years to be intractable have been made tractable by the discovery of a better algorithm. The class P of decision problems comprises those problems that are tractable in this sense.

Here we are talking about polynomial time. There is a wider class of problems, called PSPACE, solvable with a polynomial amount of memory. All problems in P are also in PSPACE (for how could a program access more memory than it has time for?); but there are plenty of programs that run for very long times but need little memory, so that the underlying problem is in PSPACE but is thought unlikely to be in P. For example, many generalised games [chess, draughts, go etc. played on arbitrarily large boards] are [sort-of] in PSPACE but are believed not to be in P -- though this has not yet been proved, and even if it were, it would not follow that the game as played from the (generalised) standard starting position is intractable. The `sort-of' in the previous sentence is because it depends how you generalise the rules. In chess, for example, there is a rule that says that either player can claim a draw if so-many moves have been played without a pawn move or a capture; this rule means that, for normal chess, no game needs to go on for more than around 6000 moves, and it is this that puts a cap on the amount of storage needed. If the generalised version is played with a similar rule, then the storage needed is polynomially bounded; but if the length of game can grow exponentially with board size, then so could the amount of storage needed.

There are lots of demonstrably tractable problems, even though the absolutely best algorithm is often still unknown; for example, zillions of matrix problems -- matrix multiplication, matrix inversion, finding determinants, etc.

There are also lots of problems that are demonstrably intractable. Sadly, these are all either very hard to describe briefly or relate to advanced theory of functions. But, as we hinted last time, there is a much more interesting set of problems [including Travelling Salesman], which may be tractable, but for which no polynomial-time algorithm is known. Lots of these problems are related in a very interesting way: they belong to a class called NP, which we now introduce.

Oracles

Many apparently hard problems become easy if you have a lucky guess.

Example: Given a number n, is it composite? Note that the size of input for this is the number of digits, proportional to log n. So this problem is tractable if we can find an algorithm whose time is, in the worst-case, a polynomial in log n; no such algorithm is known. [It is very easy to find an algorithm for this whose worst-case is polynomial in n; just try all smaller numbers as divisors.] However, if n is, in fact, composite, then we could imagine someone whispering in our ear `Try 12345678901!', and we try dividing n by that number and, lo and behold, the remainder turns out to be zero, and we thereby know that n is composite.

The formalisation of the `lucky guess' is the oracle.

Historically, the most famous oracle was that at Delphi. It was consulted by kings and peasants anxious to know their futures, which we may take it would involve an intractable computation. The advice helped them to plan their campaigns. However, all too often, the advice turned out to be understandable only with hindsight, which comes too late. Computing oracles have the same problem of credibility.

So, we are invited to contemplate a computer which has an attached oracle. The oracle can give us whatever advice it likes, and it is up to us to decide how to use its advice. Is such a computer more powerful than a `real' computer? Since the oracle doesn't necessarily give us the same advice every time we ask the same question, we have a non-deterministic computer, which will sometimes be `lucky' [when the oracle is particularly helpful]. Do lucky computers make intractable problems tractable?

Non-deterministic computations

There are many possible models of how to measure the capabilities of a non-deterministic computer. Here is the `standard' one:

Start with the problem `instance' in the input `file'.
Hand over control to the oracle. This may not interfere with the instance, but is free otherwise to write whatever information it likes, based on whatever computation it sees fit, into an `advice' file. The time taken to do this is ignored.
The oracle then hands control back to an ordinary computer. This may use the advice in whatever way it likes; it may ignore it, or use it as a source of `random' numbers or as an initial guess, for example. Time starts now.
For a decision problem, a correct non-deterministic computation must terminate having printed Yes [no matter what the oracle has said] if `yes' is the correct answer, and must not terminate having printed Yes if `no' is the correct answer.
It is not required to terminate at all in the latter case, which is perhaps a little surprising. This doesn't really matter, as you can always combine a `fast' pseudo-algorithm that needs to be lucky to get the right answer with a `slow' real algorithm that always works; it just makes some of the arguments easier, as we don't have to worry about the `no' instances.
The time taken by a particular [pseudo-]algorithm on a given `yes' instance of the problem is defined to be the minimum time taken over all possible oracles -- that is, the time taken by the luckiest possible computer. Note the asymmetry -- we are not concerned with the time taken to solve `no' instances.

The class NP of decision problems

A decision problem is said to be in NP if there exists a pseudo-algorithm for this problem whose non-deterministic time taken for `yes' instances of the problem is bounded by a polynomial function of the length of the input. In other words, NP is the class of problems that become tractable if the luckiest possible computer is presented with `yes' instances. It is irrelevant what happens for the `no' instances, as long as the computer doesn't actually get them wrong, and it is irrelevant what happens if the computer isn't lucky.

This may seem like an absurdly useless definition, but bear with it. There is a complementary class, co-NP, of problems for which computers need to be lucky with the `no' instances. We've seen that `Is a given number prime' -- the complement of the `composite' problem discussed earlier -- is in co-NP [for a `no' instance can be quickly established by a lucky guess at a factor]; this problem happens also to be in NP, but showing that depends on some reasonably deep number theory. It might seem more sensible to look at problems that are in both NP and co-NP, but these turn out to be generally less interesting than those just in NP [as far as we know].

It is obvious that any problem in P is also in NP [for the algorithm that establishes membership of P works just as well in NP].
Some problems, such as testing whether a number is composite, are clearly in NP, and could plausibly be in P -- no-one would be very surprised if a new improved algorithm obtained factorisations much faster than current algorithms.
Some problems, such as testing whether a network has a Hamiltonian circuit, are clearly in NP, but it would frankly be rather a surprise if anyone found polynomial algorithms for them. But ...
... no problem in NP has ever been proved not to be in P.
The problem of whether or not P and NP are the same class of problems is probably the outstanding unsolved problem in mathematics today. Many economically-important problems are in NP; if they are in fact tractable without, as well as with, a lucky guess, then much larger instances of them can be tackled. If on the other hand, these classes differ, and there are problems in NP which are shown to be intractable, that too is important news, as we can stop looking for certain types of algorithm for them, and concentrate on making the best of what we have.

Transformations

The theory of complexity would be much harder were it not for transformations. It is often possible to use an algorithm for one problem as part of the solution for another. This can enable us to put bounds on complexity. For example:

cos x = sin (pi/2 - x), so if we have a good way of calculating sines, we have a good way to calculate cosines, and vice versa. Calculating cosines cannot be much harder than calculating sines.
Finding the median element of an array of items can be found by sorting the items and then looking at the middle element. This is not the efficient, or recommended, way, but it establishes a bound on the complexity of the median problem.
The Hamiltonian Circuit decision problem, HC, asks whether there is a route round a given network of roads and villages that visits every village exactly once before returning to the starting point. The Travelling Salesperson decision problem, TS, asks whether, given a network of roads (of given lengths) and villages and a distance n, there is a way of visiting every village at least once while travelling no further than n. If we know how to solve TS, then we can solve a particular instance of HC by assigning to each road the particular length 1, to n the number of villages, and handing over this network and value of n to the TS algorithm. There may be easier ways to solve HC, but this construction shows at least that HC is not significantly harder than TS. In particular, it is already clear that HC and TS both belong to NP; this construction shows that if TS turns out to be in P, then so is HC, or equivalently if HC turns out to be proven intractable, then so must be TS.

top of page

G12FCO home page. border=0

Previous lecture. border=0

Next lecture. border=0

Problem Class 3 .