G12FCO -- Formal Computation

Lecture 11 -- Polynomial Transformations and Completeness


Polynomial transformations

We skated rather quickly over some very thin ice when we said that HC cannot be significantly harder than TS. To do the transformation, we had to take an instance of HC and use it to write out an instance of TS. We didn't show that doing the writing out was tractable -- though in fact this is fairly easy to see. Nor did we point out that the instance of TS is much bigger than the instance of HC [as HC simply needs to know whether or not two particular `villages' are connected, while TS needs to know how far apart they are].

Luckily, although these two caveats may mean that an instance of HC of a given input length takes spectacularly longer than an instance of TS of the same length [for its solution depends on the solution of a much longer instance of TS], this does not affect membership of P or NP. This is because any polynomial function of a polynomial is itself a polynomial; so, as long as the time taken to do the transformation is polynomial in N, the size of the input, so must be the size of the output, which acts as the input to the transformed problem, and so must be the time taken by a polynomial-time algorithm for the transformed problem.
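The closure property appealed to above can be made concrete with a small numeric sketch [our own illustration, not part of the lecture]. Suppose, purely for the sake of example, the transformation runs in time p(N) = 3N&#178; + 1 [so its output has size at most p(N)], and the algorithm for the transformed problem takes time q(M) = 2M&#179; on input of size M. The end-to-end cost is then bounded by p(N) + q(p(N)), and composing the two polynomials gives another polynomial:

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def compose(q, p):
    """Coefficients of q(p(x)), by Horner's rule: ((q_k*p + q_{k-1})*p + ...) + q_0."""
    coeffs = list(reversed(q))          # highest degree first
    result = [coeffs[0]]
    for c in coeffs[1:]:
        result = poly_mul(result, p)
        result[0] += c
    return result

def poly_eval(cs, x):
    return sum(c * x**i for i, c in enumerate(cs))

p = [1, 0, 3]       # p(N) = 3N^2 + 1 : hypothetical transformation time
q = [0, 0, 0, 2]    # q(M) = 2M^3     : hypothetical algorithm time

qp = compose(q, p)
degree = len(qp) - 1
print(degree)               # 6 -- q(p(N)) has degree 2*3: still a polynomial
print(poly_eval(qp, 2))     # 4394 = q(p(2)) = 2 * 13**3
```

The degrees simply multiply [2 &#215; 3 = 6 here], which is why a polynomial-time transformation followed by a polynomial-time algorithm is still polynomial-time overall.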
A number of fallacious `proofs' that P = NP have depended on violating this. Typically, they use a horrendously complicated transformation to turn a problem such as HC into an instance of a much simpler problem -- the solution of simultaneous linear equations is a favourite -- and `hence HC is in P', forgetting that the transformation was intractable. Beware!
Given two problems A and B, if there is a polynomial-time transformation that converts an arbitrary instance of A into an instance of B which has the same yes/no outcome, we write A transf B. [Several notations for this are used in the literature, but most of them don't travel well onto Web pages!] In some sense, the existence of this transformation shows that B is at least as hard as A, though as argued above this is not to be taken literally as meaning that it necessarily takes longer for a given size of input.
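The HC-to-TS transformation mentioned earlier is a typical example of A transf B. The sketch below is our own illustration [it assumes TS is phrased as a decision problem, `is there a tour of total length at most B?']: connected villages get distance 1, unconnected ones distance 2, and the bound is n, so the graph has a Hamiltonian circuit exactly when the TS instance answers yes. Only the transformation is polynomial -- the TS solver here is deliberately brute-force.

```python
from itertools import permutations

def hc_to_ts(adj):
    """Polynomial-time transformation: an HC instance (0/1 adjacency matrix)
    becomes a TS instance (distance matrix plus bound).  A Hamiltonian
    circuit exists iff some tour has total length at most n."""
    n = len(adj)
    dist = [[1 if adj[i][j] else 2 for j in range(n)] for i in range(n)]
    return dist, n

def ts_decision(dist, bound):
    """Brute-force TS decision: is there a tour of length <= bound?
    Exponential, of course -- the point is only that the *transformation*
    above is cheap, not that TS itself is."""
    n = len(dist)
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        if sum(dist[tour[i]][tour[i + 1]] for i in range(n)) <= bound:
            return True
    return False

# A 4-cycle 0-1-2-3-0 has a Hamiltonian circuit...
square = [[0,1,0,1], [1,0,1,0], [0,1,0,1], [1,0,1,0]]
print(ts_decision(*hc_to_ts(square)))   # True

# ...but a 'star' (every edge through vertex 0) has none.
star = [[0,1,1,1], [1,0,0,0], [1,0,0,0], [1,0,0,0]]
print(ts_decision(*hc_to_ts(star)))     # False
```

Both instances have the same yes/no outcome before and after the transformation, which is exactly what A transf B requires.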

This relationship is reflexive [A transf A] and transitive [A transf B and B transf C implies that A transf C], but not symmetric [A transf B does not imply that B transf A]. So, it defines an ordering on problems [strictly, a pre-ordering rather than a partial ordering, since two distinct problems can each transform to the other, in which case they are equally hard], by which some problems are harder than others. Are there any hardest problems?

Completeness

A decision problem C is said to be NP-hard if D transf C for every decision problem D in NP [and similarly for other classes of problems]. [We haven't shown that any such problem exists, or even could exist.] C would be, in the above sense, a generalisation of all the problems in NP, and hence at least as hard to solve, in the somewhat technical sense above.

If the problem C is itself also in NP, then it is said to be NP-complete [and again similarly for other classes of problems]. An NP-complete problem [if any such animal exists] is a hardest-possible problem in NP.

At first sight, this is an extraordinary notion. What could be meant by a generalisation of all the possible problems in NP? There is too much variety -- networking problems, timetabling problems, packing problems, combinatorial mathematics, arithmetic, matrices, geometry, you name it, every branch of mathematics has its NP problems.

But there is a common thread, and we can at least easily conceive of an NP-hard problem. Every problem in NP has a pseudo-algorithm, presumably intended to run on a computer, and thus expressible as a computer program in some language. We'll formalise some of this [much] later on, but informally we can ask the decision problem COMP: `If we run this program with that data on such-and-such a computer, will the output be yes?'. Every problem in NP then generates an instance of COMP. Since for a given problem D the program and computer can be fixed, and the data only has to be copied, the transformation from D to COMP can certainly be achieved in polynomial time, and so D transf COMP. Sadly, COMP isn't in NP [how hard it is, we'll see in due course]. So, we have an NP-hard problem, but not an NP-complete problem. For that, we need to look at:

Satisfiability

Suppose a, b, c, ... are logical [true/false] variables, and F is a formula involving them. Then we can ask whether there is any way of assigning values to a, b, c, ... which will make F have the value true. For example,
F = ((a and not b) or (b and c)) and not c [... true if a is true, b is false and c is false], or
F = (a or b) and (a or not b) and not a [... which is always false].
The `satisfiability' decision problem, SAT, is this problem for a slightly restricted form of F. Specifically, F must have the form A and B and C and ..., where each of A, B, ... contains only ors and nots of the individual variables. Any F can be put into this form, though not necessarily in polynomial time.
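Since SAT asks only a yes/no question over finitely many variables, it can always be settled by trying every assignment -- at exponential cost, which is precisely why membership of NP rather than P is the interesting question. As a sketch [our own, not from the lecture], here are the two example formulas above checked by brute force:

```python
from itertools import product

def satisfiable(formula, nvars):
    """Brute-force satisfiability check: does any assignment of True/False
    to the variables make the formula true?  Exponential in nvars."""
    return any(formula(*bits) for bits in product([False, True], repeat=nvars))

# The two formulas from the text.  The second is already in the
# restricted and-of-clauses form that SAT demands.
f1 = lambda a, b, c: ((a and not b) or (b and c)) and not c
f2 = lambda a, b: (a or b) and (a or not b) and not a

print(satisfiable(f1, 3))   # True  (a = True, b = False, c = False)
print(satisfiable(f2, 2))   # False (no assignment works)
```

Note that a satisfying assignment, once guessed, can be checked in polynomial time by evaluating the formula -- which is the pseudo-algorithm showing that SAT is in NP.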

Why is SAT important? Because of:

Theorem [Stephen Cook, 1971]: SAT is NP-complete.
The proof of Cook's theorem is decidedly messy -- not difficult, just messy -- and it will only be sketched here. For full details, consult, for example, the book by Garey and Johnson mentioned in the introduction.

Sketch of proof of Cook's theorem

Remark

That was the messiest part of the whole module. We're now through the worst, and can get back to more specific problems. The essence of Cook's theorem, and a theme that will return, is the ability to emulate a computer, which has in principle a nice simple logical structure, in other problems; which in turn gives us the power to emulate any computable problem indirectly by emulating a computer running a program for it; which in turn gives us an indirect way of relating two apparently quite different problems. The idea is much more important than the detail.

We could have used a time-limited version of COMP instead of SAT in Cook's theorem. But what we next want to do is to use the NP-completeness of SAT to discover lots more NP-complete problems, and SAT, with its nice simple logical structure, is much more suitable for this purpose than COMP; so we would simply have deferred the mess, as we would next have had to transform COMP to SAT.

Although Cook's theorem shows how to transform any problem in NP to SAT, many specific problems can be transformed much more succinctly. We'll see an example later.



Copyright © Dr A. N. Walker, 1996, 2016.