Define an operation third on strings and language as third(a1a2a3a4a5a6 ...) = a3a6 ... with the appropriate extension of this definition to languages. Prove the closure of the family of regular languages under this operation. - RenukaBhovi/Formal-Languages-and-Automata-Theory GitHub Wiki
For any regular language L, we can construct a DFA D that generates all strings in it. Given this DFA D, if we can construct another DFA D0 which generates strings (a3a6...) for any string (a1a2a3a4a5a6) in L, then regular language is closed under third operation.
Since strings in third(L) are concatenation of symbols which can only appear at the (3∗i)th position in strings from L, given D = (Q,Σ,δ,q0,F) for the original language, we construct D0 in the following way: Let D = (Q,Σ,δ,q0,F) be the original DFA. Construct the following new DFA D0 = (Q0,Σ,δ0,q0,F0), where Q0, F0 and δ0 are defined as follows. Add the initial state q0 of D0 to Q0 Pick an unlabeled state qi in Q0, perform the following operations: For every 3 transitions in δ such that:
- δ(qi,ax) = qj
- δ(qj,ay) = qk
- δ(qk,az) = ql
we add ql to Q0 if ql ∈/ Q0, and also add δ0(qi,az) = ql to δ0 if δ0 doesn’t contain such a transition yet. For the state ql we have found, check if it is a final state in D or if there exists a transition δ(ql,a) = qf, δ(qf,b) = qg such that qf or qg is a final state. If so, add ql to F0. Above, notice that qi, qj, qk, ql don’t have to be different states, and ax, ay, az don’t have to be different symbols. If there is a transition like δ(qi,a) = qi, then the 3 transitions we are looking for could be i. δ(qi,a) = qi ii. δ(qi,a) = qi iii. δ(qi,a) = qi Repeat this until no new transition could be found for qi. Then label qi. (d) Repeat (b) and (c) until all states in Q0 are labeled. Now we prove that D0 generates third(s) for any string s accepted by D, and nothing else. Part 1: given a string s in the original language, D0 generates third(s).
Proof. We prove this by induction.
Base case: the (3 ∗ 1)th symbol (third symbol) in s can be read as the first symbol by D0.
Assume s = a1a2a3..., there must be transition rules in δ that start from q0, read a1, a2, a3 and reach some state; otherwise s won’t be accepted. Assume these transition rules are as: δ(q0,a1) = qi, δ(qi,a2) = qj, δ(qj,a3) = qk. In the algorithm described above, q0 is the first symbol we put into Q0 (step (a)), and we will find these transition rules in step (b) and add δ0(q0,a3) = qk to δ0. So we can always read a3 at the initial state of D0. Inductive step: assuming (3 ∗ i)th symbol aw has been successfully read as the ith by D0, we prove that (3 ∗ (i + 1))th symbol az can be read as the (i + 1)th symbol by D0.
Since (3∗i)th symbol can be read by D0, we must have added some transition δ0(qi,aw) = qj to δ0 and state qj to Q0 (unlabeled).
Again since s = ...awaxayaz... can be accepted by D, there must be transitions like δ(qj,ax) = qk, δ(qk,ay) = qm, δ(qm,az) = qn.
When we process qj as an unlabeled state in step (b), we will add δ0(qj,az) = qn to δ0. Therefore az can be read as the (i+1)th symbol in D0 after we have read the ith symbol and reached qj. Another thing to clarify is that by following the transitions of D0, we won’t miss any symbol with ”third” position (we won’t reach final state and stop reading input until we have found all such symbols). The only reason we make a state q final state in D0 is when it is a final state in D and we have δ0(p,a) = q, or when we have δ0(p,a) = q and can reach a final state in D from q within 2 transitions.
Part 2: given a string s accepted by D0, we can always find a string s0 which can be accepted by D and third(s0) = s.
Proof. The idea is the same as part 1, except now we prove in the other direction. For any symbol a in s read by D0, we argue that there must exist 3 transition rules in D so that 2 other symbols are read before a is read.