博文

图灵文章《论可计算数及其在判定问题上的应用》的第9章译文

已有 1303 次阅读 2023-6-4 05:59 |个人分类:图灵论著专研与精译工作群|系统分类:科研笔记

9. 可计算数的范围

还没有人试图证明“可计算”的数包括所有被自然当做可以计算的数。所有可以给出的论证本质上都局限于直觉，而且由于这个原因，在数学上相当不令人满意。真正的问题是，“什么是在计算一个数时可能的过程？ ”

我将使用的论证有三种。

1. 直接求助于直觉。

2. 证明两个定义等价（如果新的定义具有更大的直觉吸引力）。

3. 举出大量可计算数的例子。

一旦承认可计算的数都是“可计算的”，其他几个相同性质的命题也就随之而来。特别是，如果有一个判定希尔伯特函数演算的公式是否可证的通用过程，那么这个判定就可以由机器来完成。

I. [类型 (a)] 这个论证只是对§1的观点的详细阐述

计算通常是通过在纸上书写某些符号来完成的。我们可以假设这张纸像孩子的算术书一样被分成一个个方格。在初级算术中，有时会使用纸的二维特性，但这种做法总是可以避免的，我想大家会同意，纸的二维特性不是计算的必要条件。那么，我假设计算是在一维纸上进行的，即在被分成方格的纸带上。我还假设，可以打印的符号数目是有限的。如果我们允许数目是无限的，那么就会出现差异程度任意小的符号，限制符号数目并不会有严重的影响，因为总是可以使用符号序列来代替单个符号。因此，如17或9999999999999999这样的阿拉伯数字将被认为是单个符号。同样，在任何欧洲语言中，单词都被视为单一符号（然而，中文倾向于拥有可枚举的无限符号）。从我们的角度来看，单一符号和复合符号之间的区别在于，复合符号如果过于冗长，就不能识别出来。这是与经验相符的。我们不能一眼看出9999999999999和99999999999999是否相同。

计算者在任何时候的行为都是由他所观察到的符号和他在那一刻的“心智状态”决定的。我们可以假设，计算者在某一时刻能够观察到的符号或方格的数量有一个上限B。如果他想观察更多，他必须使用连续的观察。我们还将假设，需要考虑的心智状态的数目是有限的。这样做的理由与限制符号数目的理由是一样的。如果我们允许有无限多的心智状态，其中一些就会因“无限地接近”而造成混淆。同样，这个限制并不严重影响计算，因为使用更复杂的心智状态可以通过在纸带上写更多的符号来避免。

让我们想象一下，计算机进行的操作被分解成“简单操作”，这些操作是如此简单，以至于不容易想象它们被进一步分解。每个这样的操作都是由计算者及其纸带组成的物理系统的变化构成。如果我们知道纸带上的符号序列，这些都是由计算者（通过特定次序）和计算者的心智状态观察到的，我们就知道系统的状态。我们可以假设，在一个简单的操作中，被改变的符号不超过一个。任何其他变化都可以被设定为这种简单的变化，其中符号会被改变的那些方格的情况与被观察到的方格相同。因此，我们可以在不失一般性地假设，符号被改变的方格总是“被观察的”方格。

除了这些符号的变化，简单操作必须包括观察到的方格分布的变化。新的观察到的方格必须能被计算者立即识别。我认为合理的假设是，它们只能是与之前观察到的最接近的方格的距离不超过某个特定值。我们假设每个新的被观察方格与前一个立即被观察方格的距离不超过L个方格。

关于“即时识别”，我们可以认为还有其他类型的方格是即时可识别的。特别是，由特殊符号标记的方格可以被视为立即可识别的。现在，如果这些方格只由单一符号标记，那么它们的数量就会很有限，我们不应该将这些标记的方格与观察到的方格相连接而破坏我们的理论。另一方面，如果它们是由一连串的符号标记的，我们就不能把识别过程看作是一个简单的过程，这是一个应该强调的基本出发点。在大多数数学论文中，方程和定理都是有编号的。通常情况下，这些编号不会超过（比如）1000。因此，我们有可能通过编号一眼就认出一个定理。但如果论文很长，我们可能会看到第157767733443477号定理；然后，在论文的更远处，我们可能会发现“……因此（应用第157767733443477号定理）我们有……”。为了确定相关定理，我们应该逐个比较这两个数，可能要用铅笔在数字上打勾，以确保它们不会被计算两次。如果尽管如此，人们仍然认为还有其他“立即识别”的方格，只要这些方格可以通过我的机器所能完成的某些过程找到，就不会破坏我的论点。这一观点将在下文第III部分中得到发展。

因此，简单操作必须包括：

(a) 改变一个被观察方格的符号；

(b) 将一个被观察方格移动到与前一个被观察方格距离L以内的位置上；

可能其中一些变化必然涉及到心智状态的改变。因此，最一般的单一操作必须被认为是以下的一种：

A. 符号的可能变化（a）以及心智的可能变化一起。

B. 观察到的方块的可能变化（b）以及心智的可能变化。

正如第250页所建议的那样，实际执行的操作是由计算者的心智状态和观察到的符号决定的。特别是，它们决定了操作进行后计算者的心智状态。

我们现在可以建造一台机器来做这个计算机的工作。计算机的每一种心智状态都对应着机器的一个m-格局。机器扫描B个方格，对应于计算机观察到的B个方格。在任何行动中，机器都可以改变一个被扫描的方格上的符号，也可以将任何一个被扫描的方格改为与其他被扫描的方格之一相距不超过L个方格的另一个方格。所做的动作和后续的格局是由扫描的符号和m-格局决定的。刚刚描述的机器与第2节中定义的计算机并无本质区别，与这种类型的任何机器相对应，可以构造一个计算机来计算相同的序列，也就是说，由计算机计算的序列。

II. [类型(b)]

如果希尔伯特的函数演算的符号被修改为系统性的，并且只涉及有限数量的符号，那么就有可能构造一个自动机器K，用来寻找演算的所有可证明的公式。

令α是一个序列，我们用Gα(x)表示“α的第x个数字是1”的命题，因此-Gα(x)表示“α的第x个数字是0”。进一步假设我们可以找到一组定义序列α的属性，并且可以用Gα(x)和命题函数N(x)、F(x,y)来表达这些性质，其中，N(x)表示“x是一个非负整数”，F(x,y)表示 y = x +1。用合取符号把这些公式连接起来，我们可以得到一个定义α的公式，将其命名为U。U的项中必须包含皮亚诺公理的必要条件，也就是：

(Ǝu) N (u)& (x)(N(x)→(Ǝy)F(x,y)) & ( F(x,y)→N(y) ),

我们将把它缩写为P。

当我们说“U定义α”时，意思是，-U不是一个可证明的公式，同时，对于每个n，以下公式（An）或（Bn）之一是可证明的：

U & F(n) →Gα(u(n)),(An)

U & F(n) → ( –Gα(u(n))),(Bn)

其中F(n)]代表F(u, u') & F(u', u'') & ... F(u(n-1), u(n)) 。

我说α是一个可计算的序列：稍微修改K，便得到一个可以计算α的机器Kα。

我们把α的运动分成几个部分。第n部分专门用来寻找α的第n个数字。在第(n-1)-部分完成后，在所有的符号后面印上一个双冒号::，接下来的工作完全在这个双冒号右边的方格上完成。第一步是写字母A，后面是公式（An），然后是B，后面是（Bn）。然后机器Kα开始做K的工作，但只要找到一个可证明的公式，这个公式就会与（An）和（Bn）比较。如果是与(An)相同的公式，则打印出数字 “1”，第n节就完成了。如果是（Bn），则打印“0”，该部分结束。如果它和An、Bn都不相同，那么K的工作就从它停止的地方继续进行，迟早会达到公式(An)或(Bn)中的一个；这源于我们对α和U的假设，以及K的已知性质。因此，第n个部分最终会结束。Kα是非循环（circle-free）的，α是可计算的。

还可以证明，以这种方式通过使用公理来定义的数α包括所有可计算的数，这可以通过用函数演算来描述计算机器来完成。

必须记住，我们给“U定义α”这个短语附上了相当特殊的含义。可计算数并不包括所有(一般意义上)的可定义数。让δ是一个序列，其第n个数字是1或0，因为n是或不是满足的。第8节的定理的一个直接结果是，δ是不可计算的。（就我们目前所知）任何指定数量的δ数都有可能被计算出来，但不是通过一个通用过程。当足够多的δ数字被计算出来时，为了获得更多的数字，就需要一种基本的新方法。

III. 这可以被看作是对I的修改，也可以看作是对II的推论

我们假设，就像在I中一样，计算是在纸带上进行的；但我们通过考虑一个更物理的和明确的对应物来避免引入“心智状态”。计算机总是有可能在某个时刻暂停，转向其他的操作，然后过一段时间后还可以再返回到之前停止的点继续运行。如果机器这样做，它必须留下记录（以某种标准形式书写），解释如何继续工作。这个记录是“心智状态”的对应物。我们将假设计算机的工作方式很随意，它每次都不超过一个步骤。指示的记录必须使它能够执行一个步骤并写下一个记录。因此，计算在任何阶段的进展状态，完全由指令的注释和磁带上的符号决定。也就是说，系统的状态可以由一个单一的表达式（符号序列）来描述，包括磁带上的符号和Δ（我们假设它不会出现在其他地方），然后是指令的注释。这个表达式可以被称为“状态公式”。我们知道，任何给定阶段的状态公式是由最后一步之前的状态公式决定的，我们假设这两个公式的关系可以用函项演算来表达。换句话说，我们假设有一个公理U，它以任何阶段的状态公式与进行中的阶段的状态公式的关系来表达格局计算机行为的规则。如果是这样，返回索引，我们可以构造一台机器来写下连续的状态公式，从而计算出所需的数。

9. The extent of the computable numbers.

No attempt has yet been made to show that the “computable” numbers include all numbers which would naturally be regarded as computable. All arguments which can be given are bound to be, fundamentally, appeals to intuition, and for this reason rather unsatisfactory mathematically. The real question at issue is “What are the possible processes which can be carried out in computing a number?”

The arguments which I shall use are of three kinds.

1. A direct appeal to intuition.

2. A proof of the equivalence of two definitions (in case the new definition has a greater intuitive appeal).

3. Giving examples of large classes of numbers which are computable.

Once it is granted that computable numbers are all “computable” several other propositions of the same character follow. In particular, it follows that, if there is a general process for determining whether a formula of the Hilbert function calculus is provable, then the determination can be carried out by a machine.

I. [Type (a)]. This argument is only an elaboration of the ideas of §1.

Computing is normally done by writing certain symbols on paper. We may suppose this paper is divided into squares like a child's arithmetic book. In elementary arithmetic the two-dimensional character of the paper is sometimes used. But such a use is always avoidable, and I think that it will be agreed that the two-dimensional character of paper is no essential of computation. I assume then that the computation is carried out on one-dimensional paper, i.e. on a tape divided into squares. I shall also suppose that the number of symbols which may be printed is finite. If we were to allow an infinity of symbols, then there would be symbols differing to an arbitrarily small extent.[6] The effect of this restriction of the number of symbols is not very serious. It is always possible to use sequences of symbols in the place of single symbols. Thus an Arabic numeral such as {250} 17 or 999999999999999 is normally treated as a single symbol. Similarly in any European language words are treated as single symbols (Chinese, however, attempts to have an enumerable infinity of symbols). The differences from our point of view between the single and compound symbols is that the compound symbols, if they are too lengthy, cannot be observed at one glance. This is in accordance with experience. We cannot tell at a glance whether 9999999999999999 and 999999999999999 are the same.

The behaviour of the computer at any moment is determined by the symbols which he is observing. and his “state of mind” at that moment. We may suppose that there is a bound B to the number of symbols or squares which the computer can observe at one moment. If he wishes to observe more, he must use successive observations. We will also suppose that the number of states of mind which need be taken into account is finite. The reasons for this are of the same character as those which restrict the number of symbols. If we admitted an infinity of states of mind, some of them will be “arbitrarily close” and will be confused. Again, the restriction is not one which seriously affects computation, since the use of more complicated states of mind can be avoided by writing more symbols on the tape.

Let us imagine the operations performed by the computer to be split up into “simple operations” which are so elementary that it is not easy to imagine them further divided. Every such operation consists of some change of the physical system consisting of the computer and his tape. We know the state of the system if we know the sequence of symbols on the tape, which of these are observed by the computer (possibly with a special order), and the state of mind of the computer. We may suppose that in a simple operation not more than one symbol is altered. Any other changes can be set up into simple changes of this kind. The situation in regard to the squares whose symbols may be altered in this way is the same as in regard to the observed squares. We may, therefore, without loss of generality, assume that the squares whose symbols are changed are always “observed” squares.

Besides these changes of symbols, the simple operations must include changes of distribution of observed squares. The new observed squares must be immediately recognisable by the computer. I think it is reasonable to suppose that they can only be squares whose distance from the closest of the immediately previously observed squares does not exceed a certain fixed amount. Let us say that each of the new observed squares is within L squares of an immediately previously observed square.

In connection with “immediate recognisability”, it may be thought that there are other kinds of square which are immediately recognisable. In particular, squares marked by special symbols might be taken as immediately recognisable. Now if these squares are marked only by single symbols there can be only a finite number of them, and we should not upset our theory by adjoining these marked squares to the observed squares. If, on the other hand, they are marked by a sequence of symbols, we cannot regard the process of recognition as a simple process. This is a fundamental point and should be illustrated. In most mathematical papers the equations and theorems are numbered. Normally the numbers do not go beyond (say) 1000. It is, therefore, possible to recognise a theorem at a glance by its number. But if the paper was very long, we might reach Theorem 157767733443477; then, farther on in the paper, we might find “... hence (applying Theorem 157767733443477) we have...”. In order to make sure which was the relevant theorem we should have to compare the two numbers figure by figure, possibly ticking the figures off in pencil to make sure of their not being counted twice. If in spite of this it is still thought that there are other “immediately recognisable” squares, it does not upset my contention so long as these squares can be found by some process of which my type of machine is capable. This idea is developed in III below.

The simple operations must therefore include:

(a) Changes of the symbol on one of the observed squares.

(b) Changes of one of the squares observed to another square within L squares of one of the previously observed squares.

It may be that some of these changes necessarily involve a change of state of mind. The most general single operation must therefore be taken to be one of the following:

A possible change (a) of symbol together with a possible change of state of mind.

B. A possible change (b) of observed squares, together with a possible change of state of mind.

The operation actually performed is determined, as has been suggested on p.250, by the state of mind of the computer and the observed symbols. In particular, they determine the state of mind of the computer after the operation is carried out.

We may now construct a machine to do the work of this computer. To each state of mind of the computer corresponds an “m-configuration” of the machine. The machine scans B squares corresponding to the B squares observed by the computer. In any move the machine can change a symbol on a scanned square or can change anyone of the scanned squares to another square distant not more than L squares from one of the other scanned {252} squares. The move which is done, and the succeeding configuration, are determined by the scanned symbol and the m-configuration. The machines just described do not differ very essentially from computing machines as defined in §2, and corresponding to any machine of this type a computing machine can be constructed to compute the same sequence, that is to say the sequence computed by the computer.

II. [Type (b)].

If the notation of the Hilbert functional calculus[7] is modified so as to be systematic, and so as to involve only a finite number of symbols, it becomes possible to construct an automatic[8] machine K which will find all the provable formulae of the calculus.[9]

Now let α be a sequence, and let us denote by Gα(x) the proposition “The x-th figure of α is 1”, so that [10] – Gα(x) means “The x-th figure of α is 0”. Suppose further that we can find a set of properties which define the sequence α and which can be expressed in terms of Gα(x) and of the propositional functions N(x) meaning “x is a non-negative integer” and F(x,y) meaning “y = x + 1”. When we join all these formulae together conjunctively we shall have a formula, U say, which defines α. The terms of U must include the necessary parts of the Peano axioms, viz.,

(Ǝu) N (u)& (x)(N(x)→(Ǝy)F(x,y)) & ( F(x,y)→N(y) ),

When we say “U defines α”, we mean that –U is not a provable formula, and also that, for each n, one of the following formulae (An) or (Bn) is provable.

U & F(n) →Gα(u(n)),(An)[11]

U & F(n) → ( –Gα(u(n))),(Bn)

where F(n)] stands for F(u, u') & F(u', u'') & … F(u(n-1), u(n)).

I say that α is then a computable sequence: a machine Kα to computeα can be obtained by a fairly simple modification of K.

We divide the motion of α into sections. The n-th section is devoted to finding the n-th figure of α. After the (n – 1)-th section is finished a double colon : : is printed after all the symbols, and the succeeding work is done wholly on the squares to the right of this double colon. The first step is to write the letter “A” followed by the formula (An) and then “B” followed by (Bn). The machine Kα then starts to do the work of K, but whenever a provable formula is found, this formula is compared with (An) and with (Bn). If it is the same formula as (An), then the figure “1” is printed, and the n-th section is finished. If it is (Bn), then “0” is printed and the section is finished. If it is different from both, then the work of K is continued from the point at which it had been abandoned. Sooner or later one of the formulae (An) or (Bn) is reached; this follows from our hypotheses about α and U, and the known nature of K. Hence the n-th section will eventually be finished; Kα is circle-free; α is computable.

It can also be shown that the numbers α definable in this way by the use of axioms include all the computable numbers. This is done by describing computing machines in terms of the function calculus.

It must be remembered that we have attached rather a special meaning to the phrase “U defines α ”. The computable numbers do not include all (in the ordinary sense) definable numbers. Let δ be a sequence whose n-th figure is 1 or 0 according as n is or is not satisfactory. It is an immediate consequence of the theorem of §8 that δ is not computable. It is (so far as we know at present) possible that any assigned number of figures of δ can be calculated, but not by a uniform process. When sufficiently many figures of δ have been calculated, an essentially new method is necessary in order to obtain more figures.

III. This may be regarded as a modification of I or as a corollary of II.

We suppose, as in I, that the computation is carried out on a tape; but we avoid introducing the “state of mind” by considering a more physical and definite counterpart of it. It is always possible for the computer to break off from his work, to go away and forget all about it, and later to come back and go on with it. If he does this he must leave a note of instructions (written in some standard form) explaining how the work is to be continued. This note is the counterpart of the “state of mind”. We will suppose that the computer works by such a desultory manner that he never does more than one step at a sitting. The note of instructions must enable him to carry out one step and write the next note. Thus the state of progress of the computation at any stage is completely determined by the note of {254} instructions and the symbols on the tape. That is, the state of the system may be described by a single expression (sequence of symbols), consisting of the symbols on the tape followed by Δ (which we suppose not to appear elsewhere) and then by the note of instructions. This expression may be called the “state formula”. We know that the state formula at any given stage is determined by the state formula before the last step was made, and we assume that the relation of these two formulae is expressible in the functional calculus. In other words we assume that there is an axiom U which expresses the rules governing the behaviour of the computer, in terms of the relation of the state formula at any stage to the state formula at the proceeding stage. If this is so, return to the indexwe can construct a machine to write down the successive state formulae, and hence to compute the required number.

转载本文请联系原作者获取授权，同时请注明本文来自柳渝科学网博客。
链接地址：https://m.sciencenet.cn/blog-2322490-1390454.html

上一篇：康托尔论对角线法的论文 - “关于所有实数代数集合的一个属性”
下一篇：伦敦西区观看音乐剧《悲惨世界》

不确定性的困惑与NP理论分享 http://blog.sciencenet.cn/u/liuyu2205 平常心是道

博文

图灵文章《论可计算数及其在判定问题上的应用》的第9章译文

当前推荐数：2 推荐人：郑永军 杨正瓴

该博文允许注册用户评论请点击登录评论 (0 个评论)

柳渝

全部作者的精选博文

全部作者的其他最新博文

全部精选博文导读

不确定性的困惑与NP理论分享 http://blog.sciencenet.cn/u/liuyu2205 平常心是道

博文

图灵文章《论可计算数及其在判定问题上的应用》的第9章译文

当前推荐数：2 推荐人： 郑永军 杨正瓴

该博文允许注册用户评论 请点击登录 评论 (0 个评论)

柳渝

全部作者的精选博文

全部作者的其他最新博文

全部精选博文导读

当前推荐数：2 推荐人：郑永军杨正瓴

该博文允许注册用户评论请点击登录评论 (0 个评论)