Category: Computer Science

Myhill-Nerode Theorem in Practice

When it comes to practical side of computer science, we often work with regular and context-free languages. Regular languages are most common for expressing syntax through the widely used regular expressions. Somewhat stronger context-free grammars dominate in the field of compilers and various language processors.

As we know from the Chomsky hierarchy of formal languages, the class of regular languages is a subset (or a sort-of a special case) of context-free language class. This means that every regular language can be described not only by finite automatons, regular expressions or regular grammars but also by context-free grammars, pushdown automatons and also (for the hard-core folks) a turing machine.

In theory this is fine, but it’s not very good for practice. I mean, if you were developing some app that requires syntax checking, would you rather develop some sort of lineary bounded automaton or use a library for regular expressions?

Sometimes we want to prove, that we don’t need to do magic to check the correct syntax of an email address. There are some ways of doing so. For example pumping theorem, which is fairly straight forward. But pumping lemma can be only used to prove, that a language is not regular. The lemma doesn’t provide sufficient condition for a language to be regular. That’s where the folks Myhill and Nerode come in!

The Myhill-Nerode theorem on the contrary provides necessary and sufficient condition for the language to be regular! Yay! But as you can imagine it’s not that simple :-P.

Formal definition

Definition 1. Let Σ be an alphabet and ~ a equivalence relation on Σ*. Then relation ~ is right invariant if for all u, v, w ∈ Σ*:

  • u ~ v <=> uw ~ vw

Definition 2. Let L be a language (not neccessarily regular) over Σ. We define relation ~L called prefix equivalence on Σ* as follows:

  • u ~L v <=> ∀w ∈ Σ*: uw ∈ L <=> vw ∈ L

Theorem 1. (Myhill-Nerode) Let L be a language over Σ. Then these three statements are equivalent:

  1. L is accepted by some deterministic finite automaton.
  2. L is the union of some of the equivalence classes of a right invariant equivalence relation of finite index.
  3. Relation ~L is of finite index.

Informal description

The heart of this theorem is the fact that finite automaton has only a finite number of states. With this, any language that can be described by a finite automaton must consist of only finite number of string patterns. This is expressed by the right invariant relation with finite index.

The prefix equivalence is a form of the right invariant equivalence, but stronger. It’s tied to the respective language and says that if some strings are ~L-equivalent they both are included or excluded from the language after we extend them. The Myhill-Nerode theorem says, that a regular language always has a finite number of equivalence classes, i.e. there is only a finite number of word patterns that can be repeated through the string.

Example proof

Now a little example of how to show, that a language is not regular by using this theorem. Let’s take for instance the most classic context-free language of all times L = { anbn | n >= 0 }.

We’ll be interested in the third part of Myhill-Nerode theorem which states, that relation ~L is of finite index. We need to find a way of showing that it’s not and therefore the language is not regular (since Myhill-Nerode theorem is an equivalence).

In this case, the most elegant way is proof by contradiction. We will show, that L = { anbn | n >= 0 } is not reglar.

Proof 1. Suppose that L is a regular language. Let i, j \in N be two natural numbers so i \neq j. Then consider words u, v, w \in \Sigma^* and a sequence of words over \Sigma^* a1, a2, …, ai.

Now let’s assign some values to strings u, v and w:

u = a^i, v=a^j, c=b^i.

According to the definition of prefix equivalence (definition 2),

a^ib^i \in L \Leftrightarrow a^jb^i \in L

we can see, that none of the words ai from the sequence are ~L-equivalent. The sequence is infinite, so the index of the relation will be infinite. But that’s a contradiction with the Myhill-Nerode theorem. Therefore, the language is not regular.


Theorem Proving in Mathematics

The very foundation of mathematics as we know it! Or, well, that evil thing math teachers had invented so they could torture the students. And we all know that very well … Anyway, I’m one of the tortured this fall, so let’s talk the math!

Mathematical proving is a way of convincingly demonstrating that some mathematical statement is necessarily true[1]. There are several different types and ways of proofs and proving things, but one thing is common among all of them — a proof can only be obtained from unquestionable truths (axioms) by some sort of deductive reasoning. Experience, intuition or belief are not valid arguments when it comes to mathematical proving.

When you show that your statement works for every possible element from the universe, your statement or proposition becomes a theorem. Now, let’s have a look at some of the techniques that are frequently used to prove stuff.

Direct Proof

Direct proof in mathematics is a way of proving that some statement is true by showing that the statement can be deduced from generally anything that is known to be true (which are some basic truths — axioms and all previously proven theorems). It’s most common for proving conditional statements in a form of A → B.

Formally speaking, when constructing a direct proof, were looking for a sequence of statements A1, A2, …, An so (A1 → A2) ∧ … ∧ (An-1 → An) ∧ T, where T = A → B is the statement we’re proving.

It sounds weird, but it’s really simple. Take for instance solving an equation. Almost any 5th grader can do that, right? If you think about that, the sequence of steps you go through in order to get to the value of x is a sort of a direct proof. You have some precondition (the equation) and some expected result (e.g. x = 5). Also there are some basic axioms that your math teacher told you (like subtracting 5 of both sides of the equation). While solving the equation, you try to come up with the right sequence of steps that will get you from x+1-2x = -4 to x=5.

The solution looks like this:

x+1-2x = -4 \Rightarrow \\ x-2x=-5 \Rightarrow \\ -x = -5 \Rightarrow \\ \bf x = 5

We have proven that with the given precondition, x = 5. Formally speaking, x+1-2x = -4 \Rightarrow x = 5.

Proof by Induction

More precisely proof by mathematical induction. This proof is often used when we need to show, that something is true for an infinitely large universum (natural numbers for instance). It’s tied to the universal quantification in predicate logic.

The proof consists of two steps. At first, base case is proved for some element from the universe, then an induction rule is used to prove all other cases.

  1. The basis (base case): showing that the statement holds when n is equal to the lowest value that n is given in the question. Usually, n = 0 or n = 1.
  2. The inductive step: showing that if the statement holds for some n, then the statement also holds when n + 1 is substituted for n.

The principle can be illustrated on the “domino effect” on a row of falling dominoes.

  1. Base case — The first domino will fall.
  2. Induction rule — Whenever a domino falls, its next neighbor will also fall.
Domino effect

Figure 1: The domino effect

Proof by Contradiction

Proof by contradiction is rather common as well. In this proof, it is shown that if some statement were so, a logical contradiction occurs, hence the statement must be not so. It starts by adding a statement to the preconditions, that we expect to be false. Then we try to show why the precondition is so, but in the process we find some contradiction.

Textbook case of the proof by contradiction is proving that a language is not regular by pumping theorem. We say, that a language is regular then pumping lemma states, that if a language is regular, some conditions and restrictions must be met.

Here’s an example proof from wikipedia[1]:

Suppose that \sqrt{2} were a rational number, so by definition \sqrt{2} = \frac{a}{b} where a and b are non-zero integers with no common factor. Thus, b\sqrt{2} = a. Squaring both sides yields 2b2 = a2. Since 2 divides the left hand side, 2 must also divide the right hand side (as they are equal and both integers). So a2 is even, which implies that a must also be even. So we can write a = 2c, where c is also an integer. Substitution into the original equation yields 2b2 = (2c)2 = 4c2. Dividing both sides by 2 yields b2 = 2c2. But then, by the same argument as before, 2 divides b2, so b must be even. However, if a and b are both even, they share a factor, namely 2. This contradicts our assumption, so we are forced to conclude that \sqrt{2} is an irrational number.

At first we make an assumption which is followed by sequence of steps, that should be valid, if the former statement is so until we find a contradiction, therefore the opposite is true.


Proofs in mathematics can be a little … eh, well, fricking intimidating at first. They’re strictly formal, sometimes very hard to understand and it might seem almost impossible to come up with one, when you need to. Proving theorems requires a lot of knowledge and experience. Proofs in math might not be for anybody, but it’s good to know they’re there :-).


Language in Computer Science

The last post was about strings, so now something about languages. From the theoretical point of view a language is a set of words or more precisely sentences (strings). Possibly and usually an infinite set. Only sentences that are present in the set are part of the language, nothing else. Neat feature of languages is, that they interconnect two separate formal theories/models into one thing. Let me explain.

Language is a set. As in mathematical set. So anything that applies to sets applies to languages as well (like union, intersection, complement, Cartesian product). Members of any language are only strings. And there are some operations defined over strings (that is concatenation, reversal, sub-strings etc.). We can take advantage of this fact and abstract those, so they work with language as well. There is generally a lot you can do with a formal language either from the set or string point of view.

Definition 1. A formal language L over an alphabet Σ is a subset of Σ*. Formally, L ⊆ Σ*. The magical Σ* denotes a set of all existing words and sentences over Σ. Sometimes it’s called an alphabet iteration.

Like I said earlier, all operations defined for sets work with languages as well. For example language union L1 ∪ L2 = {s; s ∈ L1 ∨ s ∈ L2} etc. More interesting are the string operations that can be abstracted so they work on languages. There is language concatenation:

Definition 2. Let L1 be a language over Σ1 and L2 be a language over Σ2. Concatenation of these two L = L1⋅L2 = {xy; x ∈ L1, y ∈ L2} is a language over alphabet Σ = Σ1 ∪ Σ2.

Language concatenation as well as the string equivalent is not commutative operation. There is a couple of special cases:

  • L ⋅ {ε} = L
  • L ⋅ ∅ = ∅

The first, {ε} is a language with a single word — an empty string, in the second case we’re talking about an empty set. Concatenation can be looked at a sort of multiplication between the two languages (remember Cartesian product?). As a result of this thought, we define language power.

Definition 3. Let L be a language over Σ, then Ln denotes n-power of language and

  • L0= {ε}
  • Ln+1=Ln⋅L

With language power defined, we can advance to the last operation called language iteration. It’s basically an abstraction of language power.

Definition 4. L+ denotes positive iteration of language L defined accordingly

  • L+ = L ∪ L2∪ … ∪ L

Definition 5. L* denotes iteration of language L defined accordingly

  • L* = {ε} ∪ L ∪ L2∪ … ∪ L


In this post were introduced some of the most basic and most common operations that are defined on languages. The most important thing to remember is, that language is a set of sentences (or words) and the operations that we define originate from these two areas.

We have some set operations like union, intersection, complement as for generic sets. And we have abstractions for string operations like concatenation (notice, that it’s somewhat similar to cartesian product), language power and iteration.

Anyway, these couple of operations is nowhere near to the complete list of all existing functions for languages! After all, if you dive into some algebraic structures a little you can easily define your own :-).

Strings in Computer Science

You have probably stumbled upon a string data type in some programming language. In C it’s <string.h>, in C++ std::string, Python has them even PHP! They’re useful, pretty straight-forward. And the most beautiful thing is, that there is a theoretical foundation for them! These operations e.g. reversation, concatenation, making sub-string and of course the string itself! Let’s have a look.

Definition 1. String is a finite ordered sequence of characters. Let Σ be an alphabet, then s ∈ Σ* is a string over Σ.

The mysterious Σ* is an alphabet iteration. That is a set of all existing strings over Σ. Which makes it a infinite set, unless the alphabet is empty. You see, all the strings fall within this definition. s = “abcc” might be a string over alphabet Σ = {a, b, c}. One, sort of special case exists — an empty string, that is usually denoted by ε and its length is 0. Now, let’s proceed to some string operations. One of the most important is concatenation.

Definition 2. Let w and x be strings over Σ. Then string y = wx is a concatenation of these two. It’s also a string over Σ. You see, that concatenation is not commutative. You cannot switch the strings and expect to get the same result. Length of the new string |y| = |w| + |x| i.e. is equal to a sum of lengths of the concatenated strings.

Concatenation is very important string operation. It allows you to connect multiple strings together. When you have two or multiple strings connected, for instance w,x,y,z ∈ Σ* , z = wxy, you can say that w is a prefix of z, y is a suffix of z and x is a substring of z. A proper prefix of a string is not equal to the string itself, i.e. xy != ε and equivalently y is a proper suffix if wx != ε.

Definition 3. Let x = a1a2…an be a string over Σ. Reversal of the string, y = reverse(x) = an…a2a1.

One rather trivial definition in the end, reverse string is the same sequence of characters backwards. The thing is, why can string reversal be defined this rather intuitive way? It’s because we can be absolutely sure, that each string has a finite number of symbols, i.e. we know the n.


It’s important to point out, that we’re talking about theoretical strings. You can apply this theory to string literals in programming (like I did) and it will work, but the strings in theory are far more abstract than that. It actually depends on what is your alphabet. Let’s say, that a every symbol in Σ represent a single processor instruction. If you create a string over that Σ, you’ll get a sequence of instructions. You can do it multiple times and get programs x, y ∈ Σ*. Now, you can use concatenation, right? Then y = xz, where y is also a program in which x and z are subroutines!


Strings are useful, when you need to model any sequential behavior. Computers are sequential machines (well, not that much lately, but in principle) so there is plenty to cover with these couple of definitions!


  • Češka M., Vojnar T., Smrčka A., Teoretická informatika, Studijní opora.

Basic Computer Science

There is a couple of very basic definitions and axioms in computer science. I consider them to be very important, because everything that comes later is based on them. And if you don’t fully comprehend the basic stuff, it will be very hard to understand anything further. That’s why I decided to write a whole post on these trivial definitions.


Yeah, I’m not kidding. There’s a formal definition for alphabet. And what’s worse: it actually makes sense. I just wonder what would a kinder-gardener say on this :-D. Anyway, here it goes:

Definition 1.1. An alphabet is a finite non-empty set of symbols.

Alphabets are usually denoted by Greek big sigma Σ. A set Σ can be referred to as alphabet when it’s not empty, but also not infinitely big.

  • Σ = {a, b, c} is an alphabet
  • Σ = {0, 1} is an alphabet
  • Σ = {} is NOT an alphabet
  • Σ = all integers; is NOT an alphabet

Content of an alphabet are symbols. A symbol is some indivisible or atomic unit that can have some meaning (not necessarily). For example, if Σ = {righteous, dude} were alphabet, righteous and dude would be considered to be atomic indivisible elements (even though they contain multiple characters). Some folks also choose to omit the finite word from the definition. But that’s for some very advanced stuff and I’m guessing that finite alphabets will be more than sufficient for us.


Another fundamental thing is a string. String or a word is a sequence of some symbols (the order is important). Strings usually contain symbols from only one alphabet.  In that case we say string x over \Sigma is a finite sequence of symbols from \Sigma. The formal definition is this:

Definition 1.2. Let \Sigma be an alphabet.

  1. empty string \epsilon is string over \Sigma
  2. if x is a string over \Sigma and a \in \Sigma is a symbol from \Sigma, then xa is a string over \Sigma
  3. y is string over \Sigma only when we can derive it by applying rules 1 and 2.

There are some more basic definitions like string length, concatenation, reversation that are useful and important, but they might be a little intimidating at first, so let’s skip ahead to languages.


What is a formal language? We need to define \Sigma^* first. \Sigma^* is a set of all existing strings over alphabet \Sigma. For example, let \Sigma = \{0, 1\}, then \Sigma^* = \{\epsilon\} \cup \{ all positive binary numbers \}.

Then a formal language L over alphabet \Sigma is a subset of \Sigma^*. Formally said, L \subseteq \Sigma^*. Basically, language is a set of strings. In context of programming languages, a language would be a set of all possible programs in that programming language. You see, that in most practical cases the set will be infinite (as it is in the programming language case). So, it’s not very practical to describe languages by enumerating all possible sentences of the language. One way of describing a language is with things called formal grammars. But we’ll get on to that later.


Introduction to Computer Science

In the upcoming semester I’ll be taking class called Theoretical Computer Science. Which is said to be the hardest thing you can attend here at BUT. Only half the people pass the bar every year. It’s brutal. And since I’d really like to be a part of the lucky half, I thought I could dig into the theory a little earlier and see how bad it is.

So Computer Science, right? What the hell is all that about? Theoretical computer science or theoretical information technology (as referred by some people) is a formal foundation for the things we like to call computers. What computers do? They essentially compute stuff. Current computer hardware is built and programmed to work with numbers. Who cares about a bunch of numbers? But the numbers represent some sort of information. In broader terms, computers work with information. A computer takes some information and transforms it into another information. What Wikipedia says:

Computer science deals with the theoretical foundations of information, computation, and with practical techniques for their implementation and application. Computer scientists invent algorithmic processes that create, describe, and transform information and formulate suitable abstractions to model complex systems.

Right, so we have information. How do people work with these anyway? People share their thoughts though languages. We talk and sometimes listen, we also read and sometimes write. Other people like to draw, play charades or sing. In all of these cases the information is encoded into some sort of language that others can understand. Sure, we’re people, it comes naturally to us, but what about the machines? A coffeemaker will most certainly not learn to talk to you or even understand your needs. Machines are dumb. That’s where we (the nerdy guys from engineering department) come in with formal languages — a substantial part of computer science. By formalizing common languages we make the machines understand our instructions.

Ever heard of a programming language? Programming language is basically a sequence of instructions we use when we tell the computers what to do. It’s a language that we use when we talk to the computers. It sounds a little weird, but that’s it. Computer science sets up some ground rules so the computers can algorithmically analyze the instructions and process them. It explores possibilities of computers — what can be processed by a computer? Is there anything that computers can’t solve, why? Bunch of interesting stuff.

Interesting, but sometimes very hard to understand. I wanted to put some basics here too, but I don’t want to scare you off too early. I got stuck for a while with the very basics at first (I figured it out eventually). The math will come in a stand-alone post shortly after. Stay tuned ;-).