Last month we asked you to explain the behavior of this program:
#include <stdio.h>
class C {
void (C::*s)();
void a(){x(a,"%c",b,"",c,""); }
void b(){x(a," %c",b,"",c,""); }
void c(){x(a,"\n%c",d,"",c,""); }
void d(){x(a,"\n %c",d,"",c,""); }
void x(void(C::*a)(),char*as,void(C::*b)(),
char*bs,void(C::*c)(),char*cs){
switch(int l=getchar()) {
case -1: s=0; return;
case ' ': case '\t': printf(bs,l); s=b; break;
case '\n': printf(cs,l); s= c; return;
default: printf(as,l); s= a; break;
}
}
public:
C():s(a){while(s)(this->*s)();}
};
int main(int,char*[]) {
C a;
return 0;
}
This program collapses each group of blanks and tabs into a single space. It also collapses groups of blank lines (empty lines and lines containing only white-space characters): a run of new lines, together with any white space between them, produces a single line break in the output. It does this by implementing a simple four-state finite state machine (FSM).
We'll start our explanation by rewriting this program to make the names a little more helpful:
#include <stdio.h>

class C {
    // The current state, held as a pointer to one of the member functions below.
    void (C::*currentState)();

    void startState(){
        changeState(&C::startState,"%c",&C::sawSpace,"",&C::sawNewline,"");
    }
    void sawSpace(){
        changeState(&C::startState," %c",&C::sawSpace,"",&C::sawNewline,"");
    }
    void sawNewline(){
        changeState(&C::startState,"\n%c",&C::sawNewlineSpace,"",
                    &C::sawNewline,"");
    }
    void sawNewlineSpace(){
        changeState(&C::startState,"\n %c",&C::sawNewlineSpace,"",
                    &C::sawNewline,"");
    }

    // Reads one character, prints it with the control string for its class,
    // and selects the next state.
    void changeState(void(C::*nextStateIfNormal)(),
                     const char*normalFmt,
                     void(C::*nextStateIfSpace)(), const char*spaceFmt,
                     void(C::*nextStateIfNewline)(), const char*newlineFmt){
        switch(int charRead=getchar()) {
        case EOF:                       // end of input: stop the machine
            currentState=0;
            return;
        case ' ': case '\t':
            printf(spaceFmt,charRead);
            currentState = nextStateIfSpace;
            break;
        case '\n':
            printf(newlineFmt,charRead);
            currentState = nextStateIfNewline;
            return;
        default:
            printf(normalFmt,charRead);
            currentState = nextStateIfNormal;
            break;
        }
    }
public:
    // Runs the FSM until changeState clears currentState at end of input.
    C():currentState(&C::startState){
        while(currentState)
            (this->*currentState)();
    }
};

int main(int,char*[]) {
    C a;
    return 0;
}
Each of the four states is represented by a member function:
startState |
Start state, or last character was not a white space or new line |
sawSpace |
Last character seen was a space or tab |
sawNewline |
Last character seen was a new line |
sawNewlineSpace |
Last characters seen were a new line followed by white space |
The data member C::currentState contains the current state of the FSM, expressed as a pointer to member, which will refer to one of the four member functions above.
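To make the pointer-to-member mechanics concrete, here is a minimal sketch of the same pattern in a hypothetical two-state class. Blinker and all of its members are invented for illustration and are not part of the program above. The sketch spells each pointer to member as &Blinker::name, the form standard C++ requires, and calls through it with the same (this->*p)() syntax:

```cpp
#include <string>

// Hypothetical two-state machine; every name here is illustrative only.
class Blinker {
    // Pointer to the member function that represents the current state.
    void (Blinker::*current)();
    std::string log;

    // Each state records its name and selects the next state.
    void on()  { log += "on ";  current = &Blinker::off; }
    void off() { log += "off "; current = nullptr; }  // a null pointer ends the run

public:
    Blinker() : current(&Blinker::on) {}

    // Driver loop: keep calling whichever state the pointer refers to.
    std::string run() {
        while (current)
            (this->*current)();
        return log;
    }
};
```

Calling Blinker().run() visits on, then off, and stops once the pointer becomes null, in the same way that setting currentState to 0 stops the loop in C::C().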
The constructor C::C() runs the FSM by repeatedly calling the member function corresponding to the current state. That function calls the changeState routine, passing in six arguments:
- The next state if a normal character is read, and a printf control string to be used in that case;
- The next state if a white-space character is read, and a corresponding printf control string;
- The next state if a new-line character is read, and a corresponding printf control string.
The changeState function reads the next character, prints it using the specified printf control string, and sets currentState to the specified next state (or to 0 on end-of-file).
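The three pairs of next state and control string that each state passes to changeState can also be read as one row of a transition table. The sketch below is an assumed, table-driven rendering of the same machine that works on a std::string instead of standard input. State, Input, Transition, table, and run are hypothetical names; the table entries are copied from the four changeState calls in the listing:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical names; the rows mirror the four changeState calls.
enum State { Start, SawSpace, SawNewline, SawNewlineSpace, StateCount };
enum Input { Normal, Blank, Newline, InputCount };

struct Transition {
    State next;
    const char* fmt;  // printf control string; "" prints nothing
};

const Transition table[StateCount][InputCount] = {
    //                     Normal              Blank                  Newline
    /* Start           */ {{Start, "%c"},     {SawSpace, ""},        {SawNewline, ""}},
    /* SawSpace        */ {{Start, " %c"},    {SawSpace, ""},        {SawNewline, ""}},
    /* SawNewline      */ {{Start, "\n%c"},   {SawNewlineSpace, ""}, {SawNewline, ""}},
    /* SawNewlineSpace */ {{Start, "\n %c"},  {SawNewlineSpace, ""}, {SawNewline, ""}},
};

// Feeds every character of `in` through the table and collects the output.
std::string run(const std::string& in) {
    std::string out;
    State state = Start;
    for (char ch : in) {
        Input kind = (ch == ' ' || ch == '\t') ? Blank
                   : (ch == '\n')              ? Newline
                                               : Normal;
        const Transition& t = table[state][kind];
        char buf[8];  // the longest control string yields three characters
        int n = std::snprintf(buf, sizeof buf, t.fmt, ch);
        if (n > 0)
            out.append(buf, static_cast<std::size_t>(n));
        state = t.next;
    }
    return out;  // anything still pending at end of input is dropped
}
```

Under these assumptions, run("a  \t b") returns "a b" and run("a\n\n \n  b") returns "a\n b". A blank or new line is only remembered when it is read; it is emitted later as a prefix of the next ordinary character.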
Note the use of zero-length printf control strings to suppress the printing of characters such as extra white space. Another obfuscation is the alternating use of return and break in changeState; because there is no statement after the switch, these statements are equivalent. I also realize I should have used the isspace macro to check for white space in changeState, but I was worried that might make the code easier to understand.
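For comparison, here is a rough sketch of how the character test might look with isspace. Kind and classify are hypothetical names, not part of the program. Two details are worth noting. The new-line test has to come first, because isspace is also true for '\n'. And isspace accepts more characters than the explicit ' ' and '\t' test does (carriage return, form feed, and vertical tab in the default "C" locale), so this is not an exact drop-in replacement:

```cpp
#include <cctype>

// Hypothetical helper: the three character classes the FSM distinguishes.
enum class Kind { Normal, Blank, Newline };

Kind classify(char ch) {
    // Check for a new line first; isspace('\n') would otherwise claim it.
    if (ch == '\n')
        return Kind::Newline;
    // The cast avoids undefined behavior for negative char values.
    if (std::isspace(static_cast<unsigned char>(ch)))
        return Kind::Blank;
    return Kind::Normal;
}
```

End-of-file still has to be handled before calling classify, since the value returned by getchar() at end of input is not a character.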