Rice’s theorem

An interactive tutorial

Turing famously showed that computers can’t decide whether your code halts. But in 1951, Henry Rice proved a much more devastating result: “computers can’t decide anything interesting about your code’s input-output!” In this chapter, you’ll learn exactly what he meant, and how he proved it, by building on Turing’s famous theorem.

Let’s start with a very practical question: can you replace a function with a regular expression? Imagine, one day at work, you come across this function in the codebase:

function isNumeric(str: string) {
  if (str === "") return false;
  for (const c of str) {
    if (!"0123456789".includes(c))
      return false;
  }
  return true;
}

Hmm, you think ... could calls to this function be replaced by a regular expression test, like /a+/.test(str)?

Yeah, we could replace it with something like:

Actually, I think it could. We could replace it with something like this:

function isNumericRegex(s: string) {
  return /^[0-9]+$/.test(s);
}

But wouldn’t it be cool if your IDE gave you a refactoring hint? Imagine: “function isNumeric can be replaced by a regular expression.” The IDE is smart enough to give us other hints, like “unused variable”. Would this regular expression test be much harder?

Warm-up: `canBeRegex`

If your IDE could make this decision, it would have an internal function like this:

function canBeRegex(funcStr: string) {
  /* Complex logic here */
}

canBeRegex takes the source code for a function, analyzes it, and returns true or false, telling us whether the function could be replaced with a regular expression. For example, what should the following call return?

canBeRegex(`
  function isLen3(s: string) {
    return s.length === 3;
  }
`)

Yes, it could be replaced by the regex ^...$, for example.

Actually, I think it should return true. It’s considering this function:

function isLen3(s: string) {
  return s.length === 3;
}

This function could be replaced by the regex ^...$, or by ^.{3}$, either of which matches strings that have exactly three characters.

Let’s try another function, bad_isLen3:

function bad_isLen3(str: string) {
  for (let i = 0; i <= 99; i++);
  return str.length === 3;
}

What should canBeRegex return when given the source code of bad_isLen3?

Yeah: bad_isLen3 does some useless upfront work, but otherwise has the same input-output behavior as isLen3.

Well ... yes, bad_isLen3 does some useless upfront work that a regex probably wouldn’t do. But we don’t want our IDE to fuss over the implementation details or efficiency; it should just tell us whether there’s a regex that has the same input-output as the function.

But now what about awful_isLen3:

function awful_isLen3(str: string) {
  for (let i = 0; i <= i; i++); // hmm
  return str.length === 3;
}

Actually, this function is not replaceable by a regex. That for loop never halts, because i <= i is always true! This means that awful_isLen3 never halts, and its input-output is actually equivalent to the loop function:

Right, that for loop never halts! So awful_isLen3 never halts, and its input-output is actually equivalent to the loop function:

function loop(s) { while(true); }

This is clearly not replaceable by a regex, so canBeRegex should return false on it. Perhaps you see the problem: doesn’t canBeRegex need to decide whether that first line halts?

That old halting problem again

In Chapter 1, we discovered the halting problem. Here’s a quick reminder:

function halts(code: string) { ... }

The halts function is given some JavaScript code in string form, and it must tell us whether it halts. For example, what should this return?

halts(`
  let x = 1;
  while (x > 0) x = x / 2;
`)

No, halts should always halt, even if the function that it’s given does not halt!

Alright, there are two reasonable answers here! If you think of x as a real or rational number, x would never reach zero, so halts should return false. But in JavaScript floating-point numbers, it eventually gets to 5e-324 / 2, which evaluates to zero, so halts should return true!

The point is: deciding whether code halts is freaking hard. And in Chapter 1, we learned Turing’s proof that halts is actually impossible to implement: any implementation you write will have a bug!

So, isn’t canBeRegex impossible too? Put on your logician’s hat: what would we need to do to prove that canBeRegex is impossible to write?

No, but this is a common mistake! If you implement canBeRegex using halts, you’re just showing that your implementation can’t exist. Perhaps there’s a cleverer way to implement canBeRegex that doesn’t rely on using halts!

However, if you can implement halts using canBeRegex, you’ve got a strong argument that canBeRegex can’t exist. Let’s see how.

Right! If you implement halts using canBeRegex, you’ve got a proof by contradiction that canBeRegex can’t exist.

So, here’s an idea for implementing halts using canBeRegex:

function halts(code: string) {
  return canBeRegex(`
    function isLen3(s: string) {
      ${code}
      return s.length === 3;
    }
  `);
}

Take your time and read it closely! We take our function isLen3, which we know can be replaced with a regex. Then we inject the code at the top of that function. Finally, we ask whether the function can still be replaced with a regex after this change.

Does this really solve the halting problem? Let’s check. First, assume that code does halt. Now read the above halts(code): what will it return?

Now assume that code does not halt. Then what will the above halts(code) return?

So this does solve the halting problem. But we know that’s impossible! We have a contradiction, so our initial assumption must be false. That is, canBeRegex cannot exist! This trick is the core idea of Rice’s theorem.

Code isolation

Now for a short interlude. Sharp-eyed readers might wonder whether the injected ${code} can “escape its sandbox”. Here’s a tricky example:

halts(`throw new Error("escape");`)

The code throw new Error("escape") would halt with an error. But what would halts return, if we use the above implementation?

I think it would return false, because the following function cannot be replaced by a regex:

function (str: string) {
  throw new Error("escape");
  return str.length === 3;
}

The bug is that raw string injection like ${code} does not isolate the code. So, when we write ${code}, think of this as a shorthand for running the code in an isolated sandbox, like this one.

`canHalt`: maybe an easier question?

So, we showed that we can’t implement canBeRegex, because a regular expression always halts, and we can’t tell whether a function halts.

But perhaps you’re thinking: “A regular expression always halts. So implementing canBeRegex is actually even harder than solving the halting problem. This is not so surprising.”

Well, let’s consider a new property, canHalt. Given a function’s source code, it tells us whether there is at least one argument that causes the function to halt. For example, what should this return?

canHalt(`function (n: number) {
  if (isPrime(n) && isEven(n)) return;
  while (true);
}`)

Right, the input 2 would cause this function to halt, because $2$ is a prime number and an even number.

I apologize! This was a bit tricksy! There’s one input that would cause this function to halt: the input 2, because $2$ is both a prime number and an even number!

Now, this canHalt property does not imply the function always halts. So does Rice’s argument still work to show that canHalt cannot be written? Let’s try running it through the proof anyway! Assume we have canHalt, and then write:

function halts(code: string) {
  const t = `function ret42() {
    ${code}
    return 42;
  }`;
  return canHalt(t);
}

Check this implementation of halts! If code halts, it should return true, and if code does not halt, it should return false. So, does it correctly solve the halting problem?

I think it does solve the halting problem, and here’s why.

First, assume code halts. Then we’re effectively asking whether this function can halt:

function ret42() {
  return 42;
}

It certainly can halt (it always does), so halts would return true here.

Then assume code does not halt. Then it transforms ret42 into the loop function, which cannot halt, so halts would return false here.

Either way, halts returns the correct answer, so it does solve the halting problem.

So, similarly, canHalt is also impossible to implement! Here’s how we were able to prove it. We have two functions, loop and ret42:

// does NOT satisfy canHalt
function loop (x) { while (true); }

// DOES satisfy canHalt
function ret42 (x) { return 42; }

By injecting the code at the top of the ret42 function, it either

• transforms it into the loop function (if code does not halt), or

• leaves it unmodified (if code does halt).

We then use canHalt to distinguish whether the injected code transformed the ret42 function into loop, or left it alone. And this tells us whether code halts.

Final boss: `numHaltsIsEven`

Let’s try an even trickier function property: numHaltsIsEven. This considers how many distinct values you can pass to the function that would cause it to halt. It then returns true if that number is even.

As before, let’s write a function that will satisfy the property:

function haltIf4Or8(n: number) {
  if (n === 4 || n === 8) return;
  while (true) { }
}

How many distinct values of n will cause haltIf4Or8 to halt?

There are exactly two values, 4 and 8, that cause this function to halt. And two is an even number, so this function should satisfy numHaltsIsEven.

Now let’s use our standard trick, and try to solve the halting problem by injecting the code at the start of haltIf4Or8:

function halts(code: string) {
  const t = `function haltIf4Or8(n) {
    ${code}
    if (n === 4 || n === 8) return;
    while (true) { }
  }`;
  return numHaltsIsEven(t);
}

Use the same “proof-by-cases” technique, first assuming code halts, then assuming it doesn’t. Does the above solve the halting problem?

Actually, it doesn’t! Let’s check both cases. If code halts, then halts returns true, which is all good and correct.

But if code does not halt, what happens? Then the function analyzed by numHaltsIsEven is equivalent to the loop function. So, does the loop function have an even number of inputs that cause it to halt?

Yes: it has zero inputs that cause it to halt, and zero is an even number! So in this case, halts also returns true, which is incorrect!

Indeed! This version of halts always returns true! The cause is that the loop function has zero distinct inputs that cause it to halt, and zero is an even number.

But this is fixable. We just need to construct a function that does not satisfy numHaltsIsEven. Which of these will do the job?

function haltIf3Or7(n: number) {
  if (n === 3 || n === 7) return;
  while (true) { }
}

function haltIf3(n: number) {
  if (n === 3) return;
  while (true) { }
}

No, haltIf3Or7 also has an even number of inputs that cause it to halt. So it also satisfies numHaltsIsEven. But we’re looking for a function that does not satisfy numHaltsIsEven.

The function haltIf3 has one input that causes it to halt, namely the input 3. One is not considered even, so this function does not satisfy numHaltsIsEven.

Right! haltIf3 has one input that causes it to halt, namely the input 3. One is not considered even, so this function does not satisfy numHaltsIsEven.

Second time lucky: let’s try again to implement halts, by injecting the code into haltIf3, then inverting the final result with !:

function halts(code: string) {
  const t = `function haltIf3(n) {
    ${code}
    if (n === 3) return;
    while (true) { }
  }`;
  return !numHaltsIsEven(t);
}

If you check it, you should find that this correctly solves the halting problem — thus proving that numHaltsIsEven cannot exist!

Now for an any property $P$

Finally, you’re prepared to see Rice’s trick in its full generality. We’re given any function property $P$ , like “can be a regex” or “number of halting inputs is even”. We want to show that a function isP cannot be written.

Rice’s trick

Assume, by contradiction, that isP exists.

We will find some function t, and then inject code at the top of it. If the loop function does not satisfy $P$ , we set t to a function that does satisfy $P$ ; otherwise, we set t to a function that does not satisfy $P$ .

We then test whether some arbitrary code halts, by asking isP: “would injecting code at the top of t convert it into the loop function?” But this answers the halting problem, so isP cannot exist!

Well ... almost. Henry Rice says there are two properties of $P$ that make this trick work: first, $P$ must be a semantic property, and second, it must be a non-trivial property. Let’s see what they mean.

Too many variables

Perhaps you’ve seen IDE warnings like: “Function f has more than 20 variables. Consider splitting into smaller functions.” But that sounds like a property of functions! Didn’t we just see those are impossible to decide?

Can we really write a function hasMoreThan20Vars that takes a function string and tells us whether it has more than 20 variables?

Sure we can! It reads through the source code, and counts every const x, var y, et cetera. No problem here! So, what’s different about this property? To see, let’s assume we have hasMoreThan20Vars, and try out Rice’s trick:

function halts(code: string) {
  const t = `function lotsOfVars(n) {
    ${code}
    const a,b,c,d,e,f,g,h,i,j,k,l,m,n,
      o,p,q,r,s,t,u,v,w,x,y,z;
    return;
  }`;
  return hasMoreThan20Vars(t);
}

Use the usual proof-by-cases: does this solve the halting problem?

The function lotsOfVars starts with 26 variables, so it satisfies hasMoreThan20Vars. Injecting some code at the top doesn’t make any difference to that. It doesn’t tell us anything about whether code halts.

When Mr. Rice says $P$ must be semantic, he means it must be solely concerned with the input and output of the function when called. That is: if two functions have the same input-output, then $P$ cannot be true of one, and false of the other.

Let’s see a couple of example properties. Consider a property returnsInput, which tells us whether a function returns its own argument. For examples

// satisfies `returnsInput`
function id(x) { return x; }

// does not satisfy `returnsInput`
function increment(x) { return x+1; }

Is the property returnsInput semantic?

I think it’s semantic. The property can be decided by only looking at the input-output table, e.g. 1 -> 1, "foo" -> "foo", etc.

(Bonus exercise: consider the property returnsOwnSourceCode, which tells us whether a function will return its own source code. Is returnsOwnSourceCode a semantic property? Will Rice’s trick work on it? Answers on a postcard!)

Solving the halting problem is trivial?!

The final property we’ll consider today is called solvesHaltingProblem. Given a function’s source code, it tells us whether that function solves the halting problem — that is, whether it’s a correct implementation of halts.

As always, let’s try to run Rice’s proof. First: does the loop function satisfy solvesHaltingProblem?

It does not! The loop function never terminates, let alone with a correct answer.

So, next, we must set t to a function that does satisfy solvesHaltingProblem. Can you find one?

Really?! If you can find such a t, please email it to me, because you’ve solved the halting problem!

The problem is, there is no such function — Turing taught us this. And so we can’t proceed with Rice’s proof.

If there are no functions that satisfy the property $P$ , then we can’t use isP to decide anything, because it always returns false. Similarly, if every function satisfies $P$ , then we can’t use isP to decide anything, because it always returns true.

This is what Rice means by triviality. The property $P$ is trivial if all functions satisfy it, or if none do. To show that isP cannot be written, Rice’s proof requires that $P$ is non-trivial: that it is satisfied by some, but not all, functions.

Conclusion

And this concludes the proof! Quoting Wikipedia: “Rice’s theorem states that all non-trivial semantic properties of programs are undecidable.” By now, you should understand what that theorem means, how to prove it, and why those “non-trivial semantic” qualifications are in there.

In the next chapter, we’ll be exploring Kurt Gödel’s second incompleteness theorem, which shows that proof systems can’t prove their own consistency! Stay tuned!

This chapter is free this week — to read the rest of Busy Beavers and support me writing it, please buy the course! Or if you want to know when the next chapter comes out, sign up for updates here. You might also enjoy our course Everyday Data Science, which uses similar interactive storytelling. Or if you’re feeling inspired, you can use TigYog to write your own course just like these ones!

Next in Busy Beavers!:

Coming soon

8. Just how busy can a busy beaver be?

Let’s play a game! We’ll write a JavaScript program of less than 10 characters that prints as much as possible before halting. Whoever’s program prints more is the busiest beaver, and they win the game.

What’s the busiest program? Just how busy is it? How can we find it? In this chapter, we’ll have a crack at these questions, but fair warning: we might not get many answers.

1. Uncomputable functions

2. Interpreters and universality

3. The halting problem

Or, Turing’s trick to troll any termination test

4. Proofs about programs

5. Gödel’s first incompleteness theorem

6. Godel’s second incompleteness theorem

7. Rice’s theorem

8. Just how busy can a busy beaver be?

9. Quines and fixed points

Rice’s theorem

An interactive tutorial

Warm-up: `canBeRegex`

That old halting problem again

Code isolation

`canHalt`: maybe an easier question?

Final boss: `numHaltsIsEven`

Now for an any property $P$

Too many variables

Solving the halting problem is trivial?!

Conclusion

Next in Busy Beavers!:

8. Just how busy can a busy beaver be?