G12FCO -- Formal Computation

Lecture 17 -- Powers and limits of Turing machines


Discussion

The `bracket matching' TM shows that TMs can do some things that FSMs can't. Since a TM can be built with any FSM as its `computer', there is no abstract computation that an FSM can do that a TM can't. So TMs are intrinsically more powerful, as a class, than FSMs.
By `abstract computation', I mean one that does not need interaction with the environment -- temperature sensors, mouse clicks, etc.
Clearly, as a real computer is an FSM, a TM is capable, in principle, of doing anything that a real computer can -- and more-or-less as efficiently if we interpret each square of the tape as a floppy disc. On the other hand, it is clearly quite straightforward to write a computer program which emulates the steps taken by a TM, if necessary using a sequence of floppies as auxiliary storage to hold the tape. [The computer will have to be big enough to hold and process the state information in the TM, so we may have to design the computer after seeing the TM and learning how many states it has. But see below.] We deduce that TMs and real computers are equally powerful -- TMs are adequate models of what we plausibly mean by computers, and the computations that they can carry out are exactly the computations that we expect computers to be able to perform.
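Here, for instance, is a minimal sketch of such an emulator [in Python; the encoding of the machine as a table, and all the names, are mine, purely for illustration -- a real emulator would also need the auxiliary storage mentioned above]:

    # A minimal sketch of a TM emulator: the machine is a table mapping
    # (state, symbol) to (new state, symbol to write, head move).
    def run(delta, tape, state="start", blank="0", max_steps=100_000):
        tape = dict(enumerate(tape))          # sparse tape: square -> symbol
        pos = 0
        for _ in range(max_steps):
            if state == "halt":
                return tape                   # finished: return the tape
            symbol = tape.get(pos, blank)     # blank squares read as 0
            state, tape[pos], move = delta[(state, symbol)]
            pos += {"L": -1, "R": +1, "N": 0}[move]
        raise RuntimeError("gave up -- perhaps it never halts?")

    # Example: scan right over 1s and halt at the first blank.
    delta = {("start", "1"): ("start", "1", "R"),
             ("start", "0"): ("halt",  "0", "N")}
    print(run(delta, "1110"))

The point to notice is that the loop is short and fixed; all the variety lives in the table, which is just data.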

On the other hand, we have already seen that FSMs, although they capture some of the essence of computation, are less powerful. What about the other models and abstractions that we mentioned? How do their powers compare with those of FSMs and TMs and real computers? Well, many of them, including push-down automata, are of an intermediate capability -- `better' than FSMs, `worse' than TMs. Some of them, including Petri nets, are `differently' powerful from FSMs -- they can do some things that FSMs can't, but can't do some things that FSMs can. Surprisingly many, including some forms of function theory, Life and re-write systems, are equi-powerful with TMs, meaning that whatever the one can do, so can the other.

This is usually easy to see one way around -- if you can write a computer program to emulate Life, for example, this shows that anything computable by Life is also computable by computer! The other way is usually harder, and is usually achieved by showing how to use Life [or whatever] to emulate a particular `universal' Turing machine.
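For instance, the `easy' direction for Life might look like this [standard rules; representing the board as a set of live cells is just one convenient choice of mine]:

    from collections import Counter

    # One generation of Life: a cell is live next time if it has exactly
    # three live neighbours, or two and is already live.
    def life_step(live):
        counts = Counter((x + dx, y + dy)
                         for (x, y) in live
                         for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                         if (dx, dy) != (0, 0))
        return {cell for cell, n in counts.items()
                if n == 3 or (n == 2 and cell in live)}

    glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
    for _ in range(4):                 # four generations move a glider
        glider = life_step(glider)     # one square diagonally
    print(sorted(glider))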
However, ...

Church's thesis

..., no-one has yet found a plausible model of computation which is more powerful than the TM. In other words, in order to solve problems which a TM can't, it seems that you have to invoke supernatural powers -- some deus ex machina that either plucks the answer out of thin air, or else does infinitely many steps in a finite time, or is able to search unboundedly many items in a single step, or something similar. This proposition was first formalised by Alonzo Church in relation to formal logic, and refined by Turing in relation to computing, so it is known as `Church's thesis' or `Turing's thesis' or `the Church-Turing thesis' depending on taste and on minor differences in wording. Whichever, here is the proposition defended by Turing:
Any process which could naturally be called an effective procedure can be realized by a Turing machine.
Note that this is a belief rather than a theorem -- at least until we have a formal definition of `naturally' -- but it has stood up pretty well over the years. There are philosophical debates about whether quantum mechanics and/or the human soul allow the real world or real people to do things that are beyond the powers of Turing machines; but at least we can say that no-one has yet proposed, even in principle, a real computer or model thereof that goes beyond the TM.

The universal Turing machine

The theory of intractability was much advanced by the discovery of a universal NP decision problem, that, in some sense, encompassed all other decision problems in NP. Likewise, the theory of computability was much advanced by the discovery of a universal Turing machine -- a Turing machine that, in some sense, encompasses all other Turing machines. As we'll see, this is, with hindsight, not such a startling idea today as it was in the 1930s. But the effect is that we don't have to ask about what an arbitrary Turing machine can do -- either to establish that some computations are impossible or to show that some model is Turing-equivalent -- but only about what one particular TM can do.

The basic idea is very simple. We construct a TM which can emulate an arbitrary TM. In the light of the discussion in the last lecture, we can assume that the emulated TM uses just one singly-infinite tape, that it uses a binary alphabet on that tape, and that it never moves the tape more than one square at a time. The UTM will then have a slightly larger alphabet [to allow for markers, like the X in the bracket-matching TM], and its tape will have three main regions, here shown marked off by Xs:

X ... TM description ... X current state X ... emulated tape ...
The left region is a description of the emulated TM -- in essence, similar to the matrices displayed in the bracket-matching example. Then comes a space where the current state of the emulated TM is saved. Finally, the right region is a copy of the emulated tape, except that the emulated head position is marked. Note that the current state may [well] be too big for the FSM of the UTM to retain, so it must be written out to tape; likewise, the TM description could be arbitrarily big, and it too must therefore be stashed away on the tape.

The emulation cycle is now quite straightforward. Starting from a known state and input, the read head searches the TM description for a match. This is a process quite similar to the bracket-matching problem; you mark off matched symbols until either you reach the end of the state and have a complete match, or else you find a mismatch, in which case you restore the marked-off symbols and try again with another state in the description. For each symbol, you have to move the head between the current state and the description and back again to pick up and compare the next symbol, so this is not exactly a quick and easy process! Having found the current state, you can now look up what the next state, output symbol and direction are, and copy, bit by bit, the next state to overwrite the current state. Then, knowing what to write and how to move the emulated head, you move up to the marker in the middle of the emulated tape, make the appropriate local adjustment, pick up the next emulated input symbol, and go back left to re-start for the next cycle. [This can all be done much more efficiently if the UTM has three tapes, one for each region; then most of the moves can be avoided. But we're here asking about possibility, not efficiency.]
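If it helps, here is the cycle as a program, with the three regions held as three separate structures, as in the three-tape variant just mentioned [the encoding of the description as a list of quintuples is mine]:

    # One cycle of the emulation.  The description region is a flat list
    # of quintuples (state, symbol, next state, symbol out, move), which
    # we search linearly for a match, just as the UTM does on its tape.
    def utm_cycle(description, state, tape, head, blank="0"):
        symbol = tape.get(head, blank)       # read at the marked square
        for (q, s, q2, s_out, move) in description:
            if (q, s) == (state, symbol):    # complete match found
                tape[head] = s_out           # adjust the emulated tape
                head += {"L": -1, "R": +1}[move]
                return q2, head              # q2 overwrites current state
        raise KeyError("no match -- ill-formed description")

    description = [("start", "1", "start", "1", "R"),
                   ("start", "0", "halt",  "0", "R")]
    state, tape, head = "start", {0: "1", 1: "1"}, 0
    while state != "halt":
        state, head = utm_cycle(description, state, tape, head)

On the real single-tape UTM, of course, each of these operations becomes a laborious sequence of head movements and markings.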

In the 1930s, this demonstration must have seemed like magic. Today, we're used to computer simulations! No-one would find it surprising that we could write a [fixed] computer program that could simulate the behaviour of a TM given as input the matrices that describe it and the initial tape. But actually, the UTM described above is even closer to a modern computer than that.

Firstly, any general-purpose computer is in some sense acting as a UTM. If you write a QBasic [or C or whatever] program, you don't expect the computer to be a QBasic machine [though there have been special-purpose language machines, built to run programs in a particular language directly in hardware]; you expect a compiler or interpreter to convert that program into machine code [a `.exe' file on a PC] that the computer understands and that emulates the running of your program.

Secondly, and even more fundamentally, most modern computers are microcoded. That means that the `real' underlying computer is running a special `microprogram' which enables its processor chip to pick up `instructions' from the main store and compute local functions of them; in particular, those local functions may be such as to make the instructions look like those of an Intel 80386 processor, or those of an M68000 processor, or of any other known processor. By switching the microprogram, the computer becomes a 386 PC, or a Mac, or a Unix workstation, or any other known computer. Effectively, most modern PCs are UTMs emulating the behaviour of a PC. This is how you can run the same binary on a whole range of computers, how you can upgrade your PC without changing any files, and how a modern PC can run Mac programs [and vice versa].
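The flavour of this is easy to capture in a few lines [the two-instruction `machine' here is invented purely for illustration]: one fixed fetch-decode-execute loop, plus a swappable table -- the `microprogram' -- saying what each instruction means:

    # A caricature of microcoding: the loop below never changes; swapping
    # the `microcode' table makes it interpret a different instruction set.
    def execute(program, microcode):
        pc, acc = 0, 0                     # program counter, accumulator
        while pc < len(program):
            op, arg = program[pc]          # fetch an instruction
            pc, acc = microcode[op](pc, acc, arg)   # decode and execute
        return acc

    microcode = {"LOAD": lambda pc, acc, arg: (pc + 1, arg),
                 "ADD":  lambda pc, acc, arg: (pc + 1, acc + arg)}
    print(execute([("LOAD", 2), ("ADD", 3)], microcode))   # prints 5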

You might expect, even from the above description, that a UTM would be quite big. Actually, they can be made surprisingly small. A complete state diagram can fit comfortably onto a single page -- the front cover of Minsky op. cit., repeated inside on page 142, shows a transition diagram for a UTM with 23 states which follows almost exactly the description above. Even this is unnecessarily large -- there is a rather silly pseudo-academic pseudo-competition for the smallest possible UTM, but the smallest ones have to use rather obscure encodings, so are not easy to understand.

Limitations of TMs

We've seen lots of things that TMs can do; what is it that they cannot do? Just as bracket-matching exposes a fundamental limitation of FSMs, so the halting problem exposes a fundamental limitation of TMs. Another similarity: no-one really wants to match off mere pairs of parentheses, so there is an unfair tendency to mock that problem as just an academic toy of no real importance; just as no-one really wants to solve the halting problem for its own sake, so it too tends to be dismissed. The importance is that the halting problem is a paradigm for a whole series of similar problems, and very often the simplest way to show that your problem is unsolvable is to show that a solution of it could be transformed into a solution of the HP.

What is the halting problem? Recall that the FSM of a TM is supposed to have an initial state and a halt state which it enters to signify the end of its calculation. The halting problem is to look at a description of a TM and its initial tape and determine whether or not that TM, started on that initial tape, eventually enters the halt state.

The halting problem is unsolvable!

Suppose, to the contrary, that there is a TM A which can solve the halting problem. Specifically, we can suppose that A is to be started off with an initial tape which looks rather like that described above for the UTM, and it is to compute away -- emulating or analysing, doing whatever it likes -- and eventually halt having written either HALTS or DOESN'T HALT on its tape. That is, A is given for its initial tape the description of a machine M and an initial tape T for M, and determines whether or not M halts on initial tape T.

Now, if the machine A exists, then so does the following machine B. B is to solve the halting problem not for a general tape T but for the specific instance where T is the description of M. That is, it solves the problem of whether or not M halts when given its own description as initial tape. [Please don't -- yet! -- ask why, or whether this is a sensible question.] We can derive B easily from A simply by adding some new states which copy the description of M, which is on its initial tape, to the place where A expects the initial tape of M to be, and then handing over to A.

Now, if the machine B exists, then so does the following machine C. C is derived from B by replacing the states where B writes HALTS on the tape by some simple loop. Now, B, given the description of M, determines whether or not M halts given its own description. So, what does C do given its own description? If C halts, then B would write HALTS, so C instead enters the loop and doesn't halt. If C loops, then B would write DOESN'T HALT and halt, and therefore so must C, since we haven't changed those bits of B.

In summary: if A exists, so does B, and so does C. But if C exists, then when applied to its own description it enters a loop if it halts and halts if it enters a loop. This is a contradiction; so C cannot exist; so B cannot exist; so A cannot exist, QED.
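It may help to re-tell the proof as [hypothetical!] Python -- the body of `halts' is necessarily a placeholder, since the proof shows that no such function can exist:

    # Machine A: the supposed solver of the halting problem.
    def halts(machine, tape):
        raise NotImplementedError    # no possible body can go here!

    # Machine B: does `machine' halt given its own description?
    def B(machine):
        return halts(machine, machine)

    # Machine C: like B, but loop instead of reporting HALTS.
    def C(machine):
        if B(machine):
            while True:              # the `simple loop' replacing HALTS
                pass
        # otherwise halt, just as B would

    # Now ask what C(C) does: if it halts, then B(C) is true, so C(C)
    # loops; if it loops, then B(C) is false, so C(C) halts.  Contradiction.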

If you understand this proof at first glance, you are too clever to be doing this module, and should probably have been giving it. If you think you understand this proof at first glance, you probably don't. If you think you will never understand this proof, please just run through it again; enlightenment will eventually dawn. We explore some consequences in the next lecture.



G12FCO home page. Previous lecture. Next lecture. Problem Class 5.
Copyright © Dr A. N. Walker, 1996, 2016.