A major task in editing texts is to find the right position. Let’s see how this can be automated efficiently. As the decision whether a word occurs in the text depends on the whole text, we cannot expect to be any more efficient than linear in the length of the text. So, let’s stay at least with this. Also, as the presence of a word cannot be established by looking at a single letter, at least a finite amount of information needs to be carried over while traversing the text.
The concept of linearly traversing a text while keeping a finite amount of information is known as finite automata. More precisely, a deterministic finite automaton is given by the following data.
The transition function delta is lifted from letters to a function delta^* on words by iterating over the letters of the word. A word w is accepted by the automaton, if delta^*(q_0)(w) in F.
Since the power set of a finite set is still a finite set, we can allow non-determinism in the following sense.
The function delta is again lifted from letters to words by iteration over the letters of the word; this times by means of relational composition.
Non-deterministic automata can be transformed into deterministic ones, by taking 2^Q as the new state set, keeping track of the set of states the non-deterministic automaton can possibly be in.
Exercise. Work out the details.
Note. Power sets can easily be kept track of by bit vectors.
The languages definable by non-deterministic finite automata contains the single-word languages and is closed under the following operations.
Additionally, the of the combined automaton is the sum of the sizes of the component automata plus O(1).
Exercise. Proof this.