We've looked quite a lot at how long it takes for an assortment of algorithms to run. But in doing so we've picked up a somewhat false picture: in truth, almost nothing is known beyond some trivial results and a tiny handful of deeper ones (such as the O(N log N) lower bound for comparison-based sorting).
For example, if we want to add two N×N matrices, then the work needed is clearly O(N^2): we can't know what the answer is without looking at all N^2 elements of each matrix, and by adding them up more or less as you look, you get the answer with no wasted effort.

But if you want to multiply the matrices, what then? The obvious way, working straight from the definition, involves adding up the results of N multiplications for each of the N^2 elements of the answer, so the work done is O(N^3). But that is much more than the O(N^2) needed to look at all the elements, so each element is being used repeatedly. Is there scope for improvement? Yes, certainly. Strassen showed in 1969 that by quartering each matrix [into four matrices, each N/2 square], multiplying those, and combining the results, you can get away with 7 matrix multiplications where you would expect 8; applying this idea recursively leads to an O(N^lg 7) method. Since lg 7 ~ 2.81 < 3, this is an improvement! Slightly better exponents are now known, but only by tiny amounts, at huge expense in complication, and the traditional technique is certainly still the best known for ordinary matrices with N < 1000000. This leaves a huge gap between practice and theory.
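Strassen's quartering idea can be sketched concretely. The following is a minimal illustration, not production code: it assumes the matrices are square with a power-of-two size, uses plain Python lists, and all the helper names are my own.

```python
def add(A, B):
    # Elementwise sum of two equal-sized matrices.
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    # Elementwise difference of two equal-sized matrices.
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    # Quarter each matrix into four h-by-h blocks.
    A11 = [row[:h] for row in A[:h]]; A12 = [row[h:] for row in A[:h]]
    A21 = [row[:h] for row in A[h:]]; A22 = [row[h:] for row in A[h:]]
    B11 = [row[:h] for row in B[:h]]; B12 = [row[h:] for row in B[:h]]
    B21 = [row[:h] for row in B[h:]]; B22 = [row[h:] for row in B[h:]]
    # Seven recursive block products where the obvious method needs eight.
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    # Combine the products into the four quarters of the answer.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)
    # Reassemble the quarters.
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom
```

In practice one would stop the recursion well above 1×1 blocks and finish with the ordinary method, since — as noted above — the traditional technique wins for matrices of any realistic size.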
However, the gap between N^2 and N^3 pales into insignificance by comparison with that between N^2 and N! or 2^N. Yet that is the sort of gap we face with many of the backtracking algorithms we looked at in the last lecture. No-one knows whether we absolutely need to backtrack in, for example, the Hamiltonian circuit problem. Perhaps there is a clever way of deciding where to go next that will enable us to determine quickly what route to follow? To see how important this is, let us look at some numbers.
| function \ N | 10 | 100 | 1000 | 1000000 |
|---|---|---|---|---|
| log N | 3us | 7us | 10us | 20us |
| N | 10us | 0.1ms | 1ms | 1s |
| N log N | 30us | 0.7ms | 10ms | 20s |
| N^2 | 0.1ms | 10ms | 1s | 2w |
| N^3 | 1ms | 1s | 20m | 30000y |
| N^5 | 0.1s | 3h | 30y | 3e16y |
| 2^N | 1ms | 3e16y | 3e288y | large |
| N! | 3s | 3e142y | large | large |
| N^N | 3h | 3e186y | large | large |
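The entries can be checked from a single assumption, which appears to be the table's implicit machine speed (my reading, not anything stated here): one step takes one microsecond. A quick sketch:

```python
STEP = 1e-6    # seconds per step: the assumed machine speed
YEAR = 3.15e7  # seconds per year, roughly

def years(steps):
    # Convert a step count into years at one microsecond per step.
    return steps * STEP / YEAR

# N^3 at N = 1000000: about 3e4 years, matching the table's 30000y.
print(years(1e6 ** 3))
# 2^N at N = 100: about 4e16 years; the table's 3e16y is the same order.
print(years(2.0 ** 100))
```

Every polynomial entry in the table can be reproduced the same way; the rounding is coarse, but at these magnitudes a factor of two or three changes nothing.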
Note that even times as 'small' as 3e16 years are more than a million times the estimated age of the Universe. Even these times are not entirely hopeless, however. Perhaps the problem is not really N^5 but a tenth or a hundredth of that. Perhaps, in a few years, computers will be a thousand times faster. Perhaps we can, for a project of sufficient importance to mankind, devote a million, or a thousand million, computers to the task. Then the time comes down to, at best, 300 years. Some cathedrals have taken that long to build! In this sort of context, the table's figure of 30000 years is almost a challenge. Scrounge together all the computers within this University, save a factor of ten or a hundred somewhere, e.g. by waiting for faster computers, and it almost becomes an undergraduate project. Running these algorithms is simply a matter of devoting enough resources to them; if your project is important enough, it can be done.
By contrast, the longer times given, like 3e142 years, are completely beyond any imaginable resources. Numbers like 142 are so homely that non-mathematicians are apt to underestimate how big they are as exponents. But you could give every man, woman and child on earth a million computers, each a million times faster than a PC, throw in another factor of a million for luck, and you're still left with 5e114 years. Give every atom in the visible universe this sort of computing power, and you still can't finish the computation in any reasonable time. If, perchance, we have a problem which cannot be solved in fewer than 100! steps, then, in plain language, we have to say that its solution cannot feasibly be found. Yet that is, roughly, the situation we find ourselves in with several of our backtracking problems. The annoying feature is that we haven't managed to prove that so many steps are needed; for all we know, some clever idea might yet move these problems up into the polynomial lines.
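The arithmetic of the previous paragraph is worth spelling out, using the text's own round figures (six thousand million people is the population assumed here):

```python
total = 3e142                     # years for the 3e142-year entry
# people x a million computers each x a million times faster x a million for luck
speedup = 6e9 * 1e6 * 1e6 * 1e6
print(total / speedup)            # 5e114 years, as claimed
```

Dividing by twenty-seven more orders of magnitude barely dents an exponent of 142; that is the whole point.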
At the risk of flogging a dead horse, let us also look at a companion table, which shows, for these same functions, how big an N we can handle in a given time.
| function \ time | 1 second | 1 hour | 1 week | 1 year | 1000y | 1000000y |
|---|---|---|---|---|---|---|
| log N | large | large | large | large | large | large |
| N | 1e6 | 3e9 | 5e11 | 3e13 | 3e16 | 3e19 |
| N log N | 87847 | 2e8 | 2e10 | 1e12 | 1e15 | 8e17 |
| N^2 | 1000 | 60000 | 7e5 | 5e6 | 2e8 | 5e9 |
| N^3 | 100 | 1532 | 8000 | 31000 | 3e5 | 3e6 |
| N^5 | 16 | 81 | 218 | 500 | 2000 | 8000 |
| 2^N | 20 | 32 | 40 | 45 | 55 | 65 |
| N! | 9 | 13 | 15 | 16 | 18 | 21 |
| N^N | 7 | 9 | 11 | 12 | 14 | 16 |
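This table simply inverts the earlier question: given a time budget, how large an N fits? Under the same assumed speed of one microsecond per step, a doubling-then-bisection search (the helper is my own, not anything from the text) reproduces the entries:

```python
import math

def largest_n(f, budget_seconds, step=1e-6):
    # Largest N such that f(N) steps fit in the budget, at `step` seconds each.
    limit = budget_seconds / step      # steps we can afford
    n = 1
    while f(2 * n) <= limit:           # double until we overshoot
        n *= 2
    lo, hi = n, 2 * n                  # the answer lies in [lo, hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if f(mid) <= limit:
            lo = mid
        else:
            hi = mid
    return lo

print(largest_n(lambda n: n ** 2, 1.0))             # 1000, as in the table
print(largest_n(lambda n: 2.0 ** n, 1.0))           # 19; the table rounds to 20
print(largest_n(lambda n: math.factorial(n), 1.0))  # 9, as in the table
```

Exact values hardly matter: for the bottom three lines, multiplying the budget by a million only nudges N up by a handful, which is the point of the paragraph that follows.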
The contrast between the polynomials and the last few lines is again stark. If N is more than, say, 100, then running even an N^5 algorithm is straightforward, whereas even a 2^N algorithm is quite out of the question. Bearing in mind, again, that many computers, faster computers and tweaks to the programming can bring a million-year project within reach, we see that values of N as large as several thousand may be within range even for quite high-degree polynomials; but nothing can rescue those last three lines.
Putting two algorithms in the same class when their running times fall on the same line of these tables gives a pukka mathematical equivalence relation -- it is reflexive [an algorithm is in the same class as itself], symmetric [if A and B are on the same line, so are B and A], and transitive [if A and B are on the same line, and so are B and C, then so also are A and C] -- so it does divide all algorithms into equivalence classes. But we are skating over very thin ice, some of which needs a little more exploration.
We are here begging lots of questions: