By `abstract computation', I mean one that does not need interaction with the environment -- temperature sensors, mouse clicks, etc. Clearly, as a real computer is an FSM, a TM is capable, in principle, of doing anything that a real computer can -- and more-or-less as efficiently, if we interpret each square of the tape as a floppy disc. On the other hand, it is quite straightforward to write a computer program which emulates the steps taken by a TM, if necessary using a sequence of floppies as auxiliary storage to hold the tape. [The computer will have to be big enough to hold and process the state information in the TM, so we may have to design the computer after seeing the TM and learning how many states it has. But see below.] We deduce that TMs and real computers are equally powerful -- TMs are adequate models of what we plausibly mean by computers, and the computations that they can carry out are exactly the computations that we expect computers to be able to perform.
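To illustrate the second direction, here is a minimal sketch [in Python, purely for illustration -- the function and encoding are my own, not a standard library] of a fixed program that emulates the steps taken by an arbitrary TM given as data. A dictionary stands in for the unbounded tape [or the sequence of floppies].

```python
# A minimal TM emulator: a short, fixed program that emulates any TM
# given its transition table. States and symbols are strings; the tape is
# a dict from position to symbol, playing the role of the unbounded tape.

def run_tm(delta, start, halt, tape_input, blank="_", max_steps=10_000):
    tape = dict(enumerate(tape_input))
    state, head = start, 0
    for _ in range(max_steps):
        if state == halt:
            return "".join(tape[i] for i in sorted(tape)).strip(blank)
        symbol = tape.get(head, blank)
        state, write, move = delta[(state, symbol)]   # one TM step
        tape[head] = write
        head += {"L": -1, "R": 1}[move]
    raise RuntimeError("step limit reached")

# Example machine: flip every bit, halting at the first blank square.
delta = {
    ("s", "0"): ("s", "1", "R"),
    ("s", "1"): ("s", "0", "R"),
    ("s", "_"): ("h", "_", "R"),
}
print(run_tm(delta, "s", "h", "1011"))  # -> 0100
```

Note that the program itself is fixed; only the table `delta` changes from machine to machine, which is exactly the point made above about designing the computer after seeing the TM.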
On the other hand, we have already seen that FSMs, although they capture some of the essence of computation, are less powerful. What about the other models and abstractions that we mentioned? How do their powers compare with those of FSMs and TMs and real computers? Well, many of them, including push-down automata, are of an intermediate capability -- `better' than FSMs, `worse' than TMs. Some of them, including Petri nets, are `differently' powerful: they can do some things that FSMs can't, but can't do some things that FSMs can. Surprisingly many, including some forms of function theory, Life and re-write systems, are equi-powerful with TMs, meaning that anything one can do, the other can do too.
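The intermediate power of push-down automata is easy to see in the classic example of balanced brackets -- a language a PDA can recognise but no FSM can, since an FSM cannot count unbounded nesting depth. A sketch [the stack here plays exactly the role of the PDA's extra memory]:

```python
# Balanced-bracket checking: recognisable by a push-down automaton but not
# by any finite-state machine, because nesting depth is unbounded.

def balanced(s):
    stack = []
    for ch in s:
        if ch == "(":
            stack.append(ch)        # push on an opening bracket
        elif ch == ")":
            if not stack:           # a closer with nothing to match
                return False
            stack.pop()             # pop the matching opener
    return not stack                # accept only if nothing is left open

print(balanced("(()())"))  # True
print(balanced("(()"))     # False
```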
This is usually easy to see one way around -- if you can write a computer program to emulate Life, for example, this shows that anything computable by Life is also computable by computer! The other way is usually harder, and is usually achieved by showing how to use Life [or whatever] to emulate a particular `universal' Turing machine. However, we can make a much stronger claim:

	Any process which could naturally be called an effective procedure can be realized by a Turing machine.

Note that this is a belief rather than a theorem -- at least until we have a formal definition of `naturally' -- but it has stood up pretty well over the years. There are philosophical debates about whether quantum mechanics and/or the human soul allow the real world or real people to do things that are beyond the powers of Turing machines; but at least we can say that no-one has yet proposed, even in principle, a real computer or model thereof that goes beyond the TM.
The basic idea is very simple. We construct a TM which can emulate an arbitrary TM. In the light of the discussion in the last lecture, we can assume that the emulated TM uses just one singly-infinite tape, that it uses a binary alphabet on that tape, and that it never moves the tape more than one square at a time. The UTM will then have a slightly larger alphabet [to allow for markers, like the X in the bracket-matching TM], and its tape will have three main regions, here shown marked off by Xs:
X | ... TM description ... | X | current state | X | ... emulated tape ... |
The emulation cycle is now quite straightforward. Starting from a known state and input, the read head searches the TM description for a match. This is a process quite similar to the bracket-matching problem; you mark off matched symbols until either you reach the end of the state and have a complete match or else a mismatch is found, in which case you restore the marked off symbols and try again with another state in the description. For each symbol, you have to move the head between the current state and the description and back again to pick up and compare the next symbol, so this is not exactly a quick and easy process! Having found the current state, you can now look up what the next state, output symbol and direction are, and copy, bit by bit, the next state to overwrite the current state. Then, knowing what to write and how to move the emulated head, you move up to the marker in the middle of the emulated tape, make the appropriate local adjustment, pick up the next emulated input symbol, and go back left to re-start for the next cycle. [This can all be done much more efficiently if the UTM has three tapes, one for each region, then most of the moves can be avoided; but we're here asking about possibility, not efficiency.]
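The heart of the cycle -- searching the description for a quintuple matching the current state and scanned symbol -- can be sketched as follows. [This is only an illustration: I hold the three regions as separate Python strings, as in the three-tape version, and invent a quintuple encoding `"q,s,q',s',D"`; the real UTM does the same search with markers and head movements on one tape, and handles leftward moves, which this sketch does not.]

```python
# One emulation cycle of a UTM, sketched with the description, current
# state and emulated tape held separately (as in the three-tape version).
# Quintuples are encoded "q,s,q',s',D" and separated by semicolons.

def utm_cycle(description, state, tape, head):
    symbol = tape[head] if head < len(tape) else "0"   # blank is 0
    for quint in description.split(";"):               # search for a match
        q, s, q2, s2, move = quint.split(",")
        if q == state and s == symbol:
            tape = tape[:head] + s2 + tape[head + 1:]  # write the symbol
            head += 1 if move == "R" else -1           # move emulated head
            return q2, tape, head                      # copy in new state
    return "halt", tape, head    # no matching quintuple: emulated TM halts

def utm(description, state, tape, head=0):
    while state != "halt":
        state, tape, head = utm_cycle(description, state, tape, head)
    return tape

# A machine that overwrites 1s with 0s, moving right, halting at the
# first 0 (for which the description has no quintuple):
print(utm("s,1,s,0,R", "s", "1101"))  # -> 0001
```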
In the 1930s, this demonstration must have seemed like magic. Today, we're used to computer simulations! No-one would find it surprising that we could write a [fixed] computer program that could simulate the behaviour of a TM given as input the matrices that describe it and the initial tape. But actually, the UTM described above is even closer to a modern computer than that.
Firstly, any general-purpose computer is in some sense acting as a UTM. If you write a QBasic [or C or whatever] program, you don't expect the computer to be a QBasic machine [though there have been special-purpose language computers, constructed directly to run in hardware a particular instance of a program in that language]; you expect a compiler or interpreter to convert that program into machine code [a `.exe' file on a PC] that the computer understands and that emulates the running of your program.
Secondly, and even more fundamentally, most modern computers are microcoded. That means that the `real' underlying computer is running a special `microprogram' which enables its processor chip to pick up `instructions' from the main store and compute local functions of them; in particular, those local functions may be such as to make the instructions look like those of an Intel 80386 processor, or like those of an M68000 processor, or any other known processor. By switching the microprogram, the computer becomes a 386 PC, or a Mac, or a Unix workstation, or any other known computer. Effectively, most modern PCs are UTMs emulating the behaviour of a PC. This is how you can run the same binary on a whole range of computers, how you can upgrade your PC without changing any files, and how a modern PC can run Mac programs [and vice versa].
You might expect, even from the above description, a UTM to be quite big. Actually they can be made surprisingly small. A complete state diagram can fit comfortably onto a single page -- the front cover of Minsky op. cit., repeated inside on page 142, shows a transition diagram for a UTM with 23 states which follows almost exactly the description above. Even this is unnecessarily large -- there is a rather silly pseudo-academic pseudo-competition for the smallest possible UTM, but the smallest ones have to use rather obscure encodings so are not easy to understand.
What is the halting problem? Recall that the FSM of a TM is supposed to have an initial state and a halt state which it enters to signify the end of its calculation. The halting problem is to look at a description of a TM and its initial tape and determine whether or not that TM, started on that initial tape, eventually enters the halt state.
Suppose, then, that there is some machine A which solves the halting problem, writing HALTS or DOESN'T HALT on its tape. That is, A is given for its initial tape the description of a machine M and an initial tape T for M, and determines whether or not M halts on initial tape T.
Now, if the machine A exists, then so does the following machine B. B is to solve the halting problem not for a general tape T but for the specific instance where T is the description of M. That is, it solves the problem of whether or not M halts when given its own description as initial tape. [Please don't -- yet! -- ask why, or whether this is a sensible question.] We can derive B easily from A simply by adding some new states which copy the description of M, which is on its initial tape, to the place where A expects the initial tape of M to be, and then handing over to A.
Now, if the machine B exists, then so does the following machine C. C is derived from B by replacing the states where B writes HALTS on the tape by some simple loop.

Now, B, given the description of M, determines whether or not M halts given its own description. So, what does C do given its own description? If it halts, then B would write HALTS, and therefore C enters this loop and doesn't halt. If it loops, then B would write DOESN'T HALT and halt, and therefore so must C, since we haven't changed those bits of B.
In summary: if A exists, so does B, and so does C. But if C exists, then when applied to its own description it enters a loop if it halts and halts if it enters a loop. This is a contradiction; so C cannot exist; so B cannot exist; so A cannot exist, QED.
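The same argument can be rendered as a [necessarily hypothetical!] Python sketch -- hypothetical because machine A, the function `halts` below, is precisely what cannot exist:

```python
# The A/B/C argument as code. If a total function halts(program, tape)
# existed (machine A), we could build B and C as below; asking what C does
# on its own source text then yields the contradiction.

def halts(program, tape):           # machine A -- assumed, cannot exist
    raise NotImplementedError

def b(program):                     # machine B: copy the description first
    return halts(program, program)  # does `program` halt on its own text?

def c(program):                     # machine C: invert B's verdict
    if b(program):                  # B would say HALTS...
        while True:                 # ...so C loops forever
            pass
    return                          # B said DOESN'T HALT, so C halts

# Now ask: does c halt when given its own source text?
# If halts says True, c loops -- so it doesn't halt.
# If halts says False, c returns -- so it does halt.
# Either way A answers wrongly, so A cannot exist.
```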
If you think you understand this proof at first glance, you probably don't. If you think you will never understand this proof, please just run through it again; enlightenment will eventually dawn. We explore some consequences in the next lecture.