Defn: A language is said to be recursively enumerable (RE) if there exists a TM that accepts it. (A TM accepts a string if, when given that string as input, the TM gets to a final halting state. On strings that it does not accept, it either halts in a non-final state or never halts.)
Defn: A language is said to be recursive if there exists a TM that accepts it and that halts on every input string.
One useful way to look at it is that, for a language to be recursive, there must exist a membership algorithm, while to be RE requires only (the weaker condition) that there is a membership semi-algorithm (or membership procedure) (meaning one that is not guaranteed to halt when the answer to the question "Is w in L?" is NO).
Question:
What does the definition of RE have to do with "enumeration"?
Answer: A definition of RE equivalent to the one given above
is that a language is RE iff there exists a
(necessarily never-ending, in the case of an infinite language)
procedure that enumerates (i.e., prints) its members.
To qualify as an enumeration procedure for a language L, the procedure
must have the property that, for every w ∈ L, w will be printed
within a finite amount of time (i.e., within a finite number of steps
of the computation). This does not preclude the possibility that
some elements of L will be printed multiple times.
do for each string w ∈ Σ* | if M accepts w then | | print w | fi od |
Given a TM/algorithm M that accepts L (but perhaps does not halt in some cases when presented with a non-member of L), how can we use it to devise a TM/algorithm that enumerates the members of L? To the right is a naive approach, which does not work.
Why doesn't it work? Because when the procedure "asks" M whether it accepts w, M might never respond with an answer. But as long as the procedure might wait for an answer from M, it can never be sure that a YES answer will not come at some point in the future, so it cannot simply abort M and assume that "M hasn't yet given an answer" implies that the correct answer is NO.
A correct enumeration procedure is as follows:
k := 0 do forever | k := k+1 | do for i in [1..k] | | w := wi // the i-th string in some ordering of Σ* | | if M accepts w in k or fewer steps then | | | print w | | fi | od od |
For every value of k, this enumeration procedure will print every string that is accepted by M in k or fewer steps, and thus every string in L will be printed over and over again. This is fine, as what is required is only that every string accepted by M is eventually printed at least once, which it will be, given enough execution time.
Notice that this procedure assumes that there is an algorithmic way of producing all the strings w1, w2, ..., over any alphabet Σ But there is. (See below.)
For a recursive language, devising an enumeration procedure is simpler, due to the fact that we can use a membership algorithm (as opposed to a semi-algorithm):
i := 0 do forever | i := i+1 | w := wi; // the i-th string in some ordering of Σ* | if M accepts w then | | print w | fi od |
Do there exist languages that are not RE?? YES!
In order to show this, it's useful to introduce the notion of a countable set.
Definition: A set is countable if it is either finite or
can be placed into one-to-one correspondence with the positive integers
(or natural numbers, if you prefer).
The term countably infinite is sometimes used to refer to
infinite sets that are countable.
Technically, we say that set S is countably infinite if there exists a
bijection f : ℤ+ → S.
Note: Actually, it suffices for f to be a surjection
(also called an "onto" function), meaning that for every element
z of S, there is at least one k such that f(k) = z.
If f : ℤ+ → S is a surjection, then
one can "derive" from it a bijection
f': ℤ+ → S.
Yet another way to show that S is countable is to describe
an injective function g : S → ℤ+.
End of note.
An indirect way of demonstrating that a set is countable is to describe a systematic/algorithmic way of enumerating its members (possibly with repeats) so that, for every member z, z appears in the enumeration after only finitely many "iterations". Implicitly, such an algorithm defines a (surjective) function f as follows: f(k) = k-th element in the enumeration.
An enumeration scheme that does not work is 0, 1, 2, 3, ..., -1, -2, ... because no negative number will ever appear. That is, one cannot "enumerate" all the nonnegative integers followed by the negative integers, because infinitely many nonnegative integers would have to appear before the first negative integer.
+---+ 3 | 10| +---+---+ 2 | 6 | 9 | +---+---+---+ 1 | 3 | 5 | 8 | +---+---+---+---+ 0 | 1 | 2 | 4 | 7 | +---+---+---+---+ 0 1 2 3 |
k := 0 do forever | k := k+1 | x := 0k // the "smallest" string of length k | do while x ≠ 1k | | print x | | x := x+1 // interpreting x as a binary numeral | od | print x // i.e., 1k od |
Recall that, for a set S, its powerset, denoted ℘(S) (or 2S), is the set containing all of the subsets of S. For example, if S = {a,b,c}, then ℘(S) = { ∅, {a}, {b}, {c}, {a,b}, {a,c}, {b,c}, {a,b,c}}.
Theorem 11.1: If S is an infinite countable set, then its powerset ℘(S) is not countable.
Proof: (Cantor's Diagonalization) Assume that S = {x0, x1, x2, ... } and suppose that we could form a one-to-one correspondence between ℘(S) and ℕ. That would mean that the members of ℘(S) could be listed, exhaustively, as S0, S1, S2, etc., etc.
Take S' to be the set { xi : xi ∉ Si}. Clearly, S' ⊆ S and thus &by our assumption that all subsets of S appear in the enumeration S0, S1, ...— there must exist k such that Sk = S'. Now consider whether or not xk ∈ Sk. We have
xk ∈ Sk = < assumption Sk = S' > xk ∈ S' = < by definition of S' > xk ∉ Sk |
What we have just proved is that, under the assumption that ℘(S) is countable, xk ∈ Sk is equal to its own negation! Hence, that assumption must be false. ■
Using this result, we can show that RE languages exist, simply because there are more languages than there are TM's that can accept them.
Theorem 11.2: For any nonempty alphabet Σ, there exist languages over Σ that are not RE.
Proof: The languages over Σ correspond to the subsets of Σ*, but Theorem 11.1 assures us that there are an uncountable number of such languages. Meanwhile, the set of all TM's is countable, because every TM can be encoded by a string in {0,1}*, and that set is countable. ■
The above is a nonconstructive proof in that it fails to identify any particular language as being non-RE. The next theorem describes such a language.
Theorem 11.3: There is an RE language (over the alphabet {a}) whose complement is not RE.
Proof: (diagonalization) Let Mi be the i-th TM in some algorithmic enumeration of Turing Machines having input alphabet {a}. Define L = { ai : ai ∈ L(Mi)}
First we argue that L is RE. This is so because we can devise a procedure that, upon reading ai, constructs (an encoding of) Mi and then simulates the computation of Mi on ai. Our procedure will accept ai iff Mi does.
Now to argue that the complement of L is not RE. The complement of L is Lc = { ai : ai ∉ L(Mi)}.
Assume, contrary to what is to be proved, that Lc is RE. Then, by definition of RE, there exists k such that L(Mk) = Lc. What happens when Mk is given ak as input? There are two possibilities, each of which leads to a contradiction:
Or, to state the argument in another way, consider the expression ak ∈ L(Mk). We have
ak ∈ L(Mk) = < assumption L(Mk) = Lc > ak ∈ Lc = < by definition of Lc > ak ∉ L(Mk) |
Our assumption that Lc is RE allowed us to prove that ak ∈ L(Mk) is equal to its own negation. Hence, that assumption must be false. ■
Theorem 11.2 demonstrates that non-RE languages exist, and Theorem 11.3 gave us a specific example. We have yet to show that there is any difference between the class of RE languages and the class of recursive languages. Theorem 11.4, in conjunction with Theorem 11.3, tells us that, indeed, the recursive languages form a proper subclass of the RE languages.
Theorem 11.4 A language L is recursive iff both it and its complement Lc are RE.
Proof: The implication from left-to-right is easy: By the definitions of recursive and RE, every recursive language is RE. It remains to show only that the complement Lc of L is also RE. But it is obvious that the recursive languages are closed under complement. (A decision algorithm for L can be transformed into a decision algorithm for Lc simply by negating the "return value".) Thus, Lc is recursive and hence RE.
To show implication from right-to-left is more difficult. Suppose that both L and its complement Lc are RE. We must show that there is a membership algorithm for L. Let M1 and M2 be TM's that accept L and Lc, respectively. (Such TM's must exist if L and Lc are RE.) Given a string w, to decide whether it is a member of L, give w to both M1 and M2 and execute them, in parallel. Because w is member of one of L or Lc, eventually one of the TM's will accept it. If w is accepted by M1, answer YES. If w is accepted by M2, answer NO. ■
Theorem 11.5: There exist RE languages that are not recursive; hence, the recursive languages form a proper subset of the RE languages.
Proof: Follows immediately from Theorem 11.4, which identified an RE language whose complement is not RE, and Theorem 11.5, which tells us that any such language is not recursive.
Let us consider the so-called Halting Problem: Given a (string encoding a) TM (or a computer program in some language) M and a input w, will M halt when given w as input?
Does a decision algorithm solving this problem exist? The naive non-solution is to simulate M on input w and see what happens. If the computation halts, we answer YES. The problem with this is that, if M fails to halt, our decision algorithm will never provide a verdict. Now, perhaps our algorithm is "smart enough" to detect when M's computation repeats the same configuration for a second time, in which case we "know" that it is in an infinite loop and an answer of NO can be given. But not every infinite loop results in the same configuration being repeated, and so there will be cases in which our decision algorithm never gives an answer, which means that it is not really an algorithm.
The discussion above points out the difficulty of devising a solution to the Halting Problem, but it does not prove that no solution exists. What follows is such a proof.
Imagine the following Java program, where we assume that the halts() method is a solution to the Halting Problem (for Java programs).
public class Halting1 { /* Prints YES if the first command-line argument is a valid Java ** program that halts when applied to the input provided via the ** second command-line argument. Otherwise, prints NO. */ public static void main(String[] args) { String javaProgram = args[0]; String inputString = args[1]; if (halts(javaProgram, inputString)) { System.out.println("YES"); } else { System.out.println("NO"); } } /* Returns true if the first argument, M, is a valid Java program ** that halts when given the second argument, w, as input. ** Otherwise returns false. */ public static boolean halts(String M, String w) { ... } } |
Now suppose we modify the main() method of the program to get this one (which goes into an infinite loop when the call to halts() yields true, rather than printing "YES"):
public class Halting2 { /* Goes into an infinite loop if the first command-line argument ** is a valid Java program that halts when fed as input the second ** command-line argument. Otherwise, prints NO. */ public static void main(String[] args) { String javaProgram = args[0]; String inputString = args[1]; if (halts(javaProgram, inputString)) { while (true) { // infinite loop! } } else { System.out.println("NO"); } } /* Returns true if the first argument, M, is a valid Java program ** that halts when given the second argument, w, as input. ** Otherwise returns false. */ public static boolean halts(String M, String w) { ... } } |
We can characterize the behavior of Halting2 as follows: Given as inputs a Java program M and a string w, Halting2 halts iff M, when applied to input w, fails to halt.
Let's modify the program again, so that the second argument passed to halts() is the same as the first:
public class Halting3 { /* Goes into an infinite loop if the first command-line argument ** is a valid Java program that halts when given its own source code ** as input. Otherwise, prints NO. */ public static void main(String[] args) { String javaProgram = args[0]; if (halts(javaProgram, javaProgram) { while (true) { // infinite loop } } else { System.out.println("NO"); } } /* Returns true if the first argument, M, is a valid Java program ** that halts when given the second argument, w, as input. ** Otherwise returns false. */ public static boolean halts(String M, String w) { ... } } |
We can characterize the behavior of Halting3 as follows: Given as input a Java program M, Halting3 halts iff M, when given its own source code as input, fails to halt.
Let us define a Java program to be self-convergent if, when given its own source code as input, halts. Meanwhile, we define a self-divergent program to be one that is not self-convergent. Then a more concise characterization of Halting3's behavior is this:
Given as input a Java program M, Halting3 halts iff M is self-divergent.
Now we can show a contradiction, which shows that our assumption (which is that the method halts() exists) is false:
Halting3 is self-divergent = < Halting3's behavior is that it halts iff it is given as input a self-divergent program > Halting3 halts when given Halting3 as input = < definition of self-convergent > Halting3 is self-convergent = < definition of self-divergent > Halting3 is not self-divergent |
An alternative (and less efficient) argument demonstrating the contradiction is this:
What would happen if we provided the source code of Halting3 to itself as
input? There are two possibilities, each of which leads to a contradiction:
These two cases exhaust all possibilities, and in both we arrived at a contradiction. It follows that the original assumption, that the halts() method exists, is false. |
Having proved that the Halting Problem is undecidable, we can make use of it to prove the undecidability of other decision problems via a technique called problem reduction. We say that problem A reduces to problem B if an algorithmic solution to A can be devised, assuming the existence of an algorithmic solution to B. (For example, the solution to A could include one or more calls to a method that solves B.)
Suppose that A reduces to B. Then this implication holds:
If B is decidable, then A is decidable, too.
But every implication is equivalent to its contrapositive, which in this case is
If A is undecidable, then B is undecidable, too.
What this means is that, to prove that problem B is undecidable, it suffices to demonstrate, for some undecidable problem A, that A reduces to B.
Example 1: As a first example, here we prove that the Prints Glorp problem (with respect to Java programs) is undecidable by demonstrating a reduction to it from the Halting Problem:
Assume the existence of the following Java method (which is a solution to the "Prints Glorp" problem):
/* Given strings M and w, reports whether or not M is a valid Java ** program that, if given w as input (via command line argument), ** prints the string "Glorp". */ public static boolean printsGlorp(String M, String w) { ... } |
Now we devise a Java method that solves the Halting Problem (leaving some details unspecified) by making use of the printsGlorp() method:
/* Given strings M and w, reports whether or not M is a valid Java ** program that halts (including due to an exception being thrown) ** if given w as input (via command line argument). */ public static boolean halts(String M, String w) { //Step 1: Modify M by removing all "print" statements //Step 2: Further modify M by inserting an exception handler // in the main() method that catches every (otherwise // non-caught) exception and handles it by printing // "Glorp". //Step 3: Further modify M by inserting a print statement // that prints "Glorp" at any point where M would // have halted (e.g., just before the end of the // main() method. // Let M' be the string resulting from the modifications to M // described above. The relevant fact relating the behaviors // of programs M and M' is that the answer to the question // "Does M halt if fed w as input?" is the same as the answer // to the question, "Does M' print 'Glorp' if fed w as input?" // The answer to the latter question (and hence to the former) // is the response to the call to printsGlorp() below. return printsGlorp(M',w); } |
Example 2: Here it is proved that the "Enters state k" problem for Turing machines is undecidable. It is, at its essence, the same as the problem addressed above, but is included here in case the reader would find the proof easier to grasp.
Note: For the purpose of explicitly distinguishing between a Turing machine and a string that encodes it, here we will use the notation TM(M) to refer to the Turing machine encoded by the character string M. End of note.
The problem is this: Given a string M that encodes Turing machine TM(M), a string w (over the input alphabet of TM(M)), and a number k, determine whether or not TM(M), if fed input w, will ever enter state k. (Here we are assuming, without loss of generality, that TM(M)'s states are identified by integers.)
Assume the existence of a Java method that solves the problem, this being its specification:
/* Returns true iff TM(M) is a Turing machine that, fed the ** string w as input, will at some point enter state #k. */ public static boolean entersStateK(String M, String w, int k) { ... } |
With the hypothetical method entersStateK() at our disposal, we can devise a Java method (leaving some details unspecified) that solves the Turing machine halting problem:
/* Returns true iff T(M) is a Turing machine that, ** fed the string w as input, will eventually halt. */ public static boolean haltsTM(String M, String w) { //Step 1: Assign to int variable k a number that does not identify // any state in the Turing machine TM(M). //Step 2: Modify string M to obtain the string M' such that TM(M') is the // Turing machine just like TM(M), with these exceptions: For every // state-symbol pair (j,c) such that δM(j,c) is // undefined, make δM'(j,c) = (k,c,R). (The choices // of c and R on the right-hand side are arbitrary.) The relevant // fact relating the behaviors of Turing Machines TM(M) and TM(M') is // that, given the same inputs, their computations would be identical // until such time as TM(M) reaches a halting configuration (meaning // one in which it was in some state j with the read/write head scanning // some symbol c for which δM(j,c) is undefined). Instead // of halting in that situation, TM(M') will enter state k! The point // is that the answer to the question, "Does TM(M) halt?" is the same as // the answer to the question "Does TM(M') ever enter state k?". That // question is answered by calling the hypothetical entersStateK() method: return entersStateK(M',w,k); } |
What has been demonstrated is that, if there is an algorithmic solution to the "Enters state k" problem, then there is an algorithmic solution to the Halting Problem. That is, the Halting Problem has been reduced to the "Enters State k" problem. It follows that the "Enters state k" problem is undecidable.
Example 3: Here it is proved that the Halts on Empty Input problem (for Java programs) is undecidable by showing a reduction to that problem from the Halting Problem (for Java programs). Note: The Halts on Empty Input problem is seemingly a simpler problem than the Halting Problem because, in effect, the former is a special case of the latter in which the input string is fixed (to be the empty string) rather than being a second parameter that could be any string.
Assume the existence of the following Java method (which is a solution to the "Halts on Empty Input for Java Programs" problem):
/* Given string M, reports whether or not M is a valid Java program that ** halts if given the empty string as input (via command line argument). */ public static boolean haltsOnEmptyInputJava(String M) { ... } |
Now we devise a Java method that solves the Halting Problem (leaving some details unspecified) by making use of the haltsOnEmptyInputJava() method:
/* Given strings M and w, reports whether or not M is a valid Java ** program that halts (including due to an exception being thrown) ** if given w as input (via command line argument). */ public static boolean haltsJava(String M, String w) { // Modify M by inserting, as the first statement in the // main() method: // args[0] = "...."; // (where "...." is the String literal corresponding to the value // of argument w). Let M' be the resulting string. The relevant // fact relating the behaviors of programs M and M' is that the // answer to the question "Does M halt if fed w as input?" is the // same as the answer to the question, "Does M' halt if fed the empty // string as input?" The answer to the latter question (and hence to // the former) is the response to the call to haltsOnEmptyInputJava() below. return haltsOnEmptyInputJava(M'); } |
Example 4: Here it is proved that the Halts on Empty Input problem (for Turing machines) is undecidable by showing a reduction to that problem from the Halting Problem (for Turing machines). At its essence, this problem is the same as that of Example 3.
Assume the existence of the following Java method (which is a solution to the "Halts on Empty Input" problem):
/* Given string M (an encoding of a Turing machine), reports whether or ** not TM(M) halts if given the empty string as input. */ public static boolean haltsOnEmptyInputTM(String M) { ... } |
Now we devise a Java method that solves the Halting Problem (leaving some details unspecified) by making use of the haltsOnEmptyInputTM() method:
/* Given strings M and w, reports whether or not TM(M) is ** a Turing machine that halts if given w as input. */ public static boolean haltsTM(String M, String w) { // Modify M to obtain M', an encoding of a Turing machine // that does this: // (1) Erases any string initially on the tape (by replacing // every non-blank by a blank) // (2) Writes the string w onto its tape // (3) Places the read/write head at the left end of w // (4) Enters the initial state of TM(M) // // The relevant fact relating the behaviors of TM(M) and TM(M') is // that the answer to the question "Does TM(M) halt if fed w as // input?" is the same as the answer to the question, "Does TM(M') // halt if fed the empty string (or any other string, for that matter) // as input?" The answer to the latter question (and hence to the // former) is the response to the call to haltsOnEmpty() below. return haltsOnEmptyInputTM(M'); } |