
Lab 3 - Comment Remover

Mark Dolan
CIS 2107
Assignment 3
Comment Remover


One of the important jobs of the preprocessor is to remove comments from a program before it is compiled. In this assignment, you'll implement this part of the preprocessor in a program that you'll call rmcmt (remove comments).

Examining Preprocessed Code

Recall that you can instruct gcc to run only the preprocessor on a file by using the -E switch. For each of the following, what would the preprocessed output be? Verify your answers by writing some test code and running it through gcc.

1. some/* crazy */stuff
Output: some stuff
(the preprocessor replaces each comment with a single space, so the two tokens do not merge)

2. some/* crazy */ stuff
(note the difference in spacing between this one and the previous one)
Output: some stuff

3. some/*crazy /*crazy*/*/stuff
Output: some */stuff

4. "some /* crazy */ stuff "
Output: "some /* crazy */ stuff "

5. some/* "crazy" */ stuff
Output: some stuff

6. some /* crazy stuff
Output: rmcmt.c:13:6: error: unterminated comment
        some


2 DFA (30 points)
Draw a finite state machine that models how you are going to parse the code. Recall the one that we sketched on the board in class for the word count program, which resulted in code something like:

/*
 * file WordCount.c
 * counts words, lines, and characters
 */

#include <stdio.h>
#include <ctype.h>

#define IN 1 /* inside a word */
#define OUT 0 /* outside a word */


int main(void) {
    int c, nl, nw, nc, state;

    state = OUT;
    nl = nw = nc = 0;

    while ((c = getchar()) != EOF) {
        nc++;
        if (c == '\n')
            nl++;
        if (isspace(c))
            state = OUT;
        else if (state == OUT) {
            state = IN;
            nw++;
        }
    }
    printf("lines=%d, words=%d, characters=%d\n", nl, nw, nc);
    return 0;
}


Do the same thing for the comment remover program to help you organize your thoughts.
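
One way to turn such a diagram into code is a transition function that maps the current state and the next input character to a new state. The sketch below is only an illustration under stated assumptions: the state names (CODE, SLASH, COMMENT, STAR, LITERAL, ESCAPE) and the helpers next_state and run_dfa are hypothetical, not part of the assignment, and the code tracks state only without producing any output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state names for a comment-remover DFA. */
enum state {
    CODE,     /* ordinary code */
    SLASH,    /* saw '/', which may open a comment */
    COMMENT,  /* inside a comment */
    STAR,     /* saw '*' inside a comment, which may close it */
    LITERAL,  /* inside a string or character literal */
    ESCAPE    /* saw a backslash inside a literal */
};

/*
 * Returns the state that follows s on input character c.
 * *quote remembers which quote character opened the current literal,
 * so that '"' is not closed by an apostrophe and vice versa.
 */
enum state next_state(enum state s, int c, int *quote)
{
    switch (s) {
    case CODE:
        if (c == '/') return SLASH;
        if (c == '"' || c == '\'') { *quote = c; return LITERAL; }
        return CODE;
    case SLASH:
        if (c == '*') return COMMENT;
        if (c == '/') return SLASH;   /* the second '/' may itself open a comment */
        if (c == '"' || c == '\'') { *quote = c; return LITERAL; }
        return CODE;
    case COMMENT:
        return c == '*' ? STAR : COMMENT;
    case STAR:
        if (c == '/') return CODE;
        if (c == '*') return STAR;    /* a run of '*' can still be closed by '/' */
        return COMMENT;
    case LITERAL:
        if (c == '\\') return ESCAPE;
        if (c == *quote) return CODE;
        return LITERAL;
    case ESCAPE:
        return LITERAL;               /* the escaped character never closes the literal */
    }
    return s;
}

/* Runs the DFA over a whole string and returns the state it ends in. */
enum state run_dfa(const char *text)
{
    enum state s = CODE;
    int quote = 0;

    assert(text != NULL);
    for (; *text != '\0'; text++)
        s = next_state(s, (unsigned char)*text, &quote);
    return s;
}
```

Ending in COMMENT or STAR means a comment was left open. Ending in LITERAL or ESCAPE means a quote was never closed.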


Implementation
Write code to remove /* */-style comments from C code (it does not have to handle C99's //-style comments). Your program should take its input from stdin and send its output to stdout, as the word count program did. Also, as with WordCount, it is suggested that you simply use getchar() and putchar() for input and output, and shell redirection (the < and > operators) to run your program on files instead of the keyboard and screen.

Your program should produce uncommented code just as gcc did in your answers to Part 1.  Pay particular attention to the spacing rules, the rules for comment characters inside and outside string literals, and what happens when comments are nested.  When code contains an unterminated comment, print an error message and the line number where the comment began, just as gcc does.
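
The line-number requirement can be met by recording the current line each time a comment opens. The sketch below shows one possible approach, not a required design: the function name unterminated_comment_line and its state names are hypothetical, and it works on an in-memory string rather than stdin so that it is easy to check.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 0 when every comment in src is closed, otherwise the 1-based
 * line number on which the unterminated comment began.
 * Comment markers inside string or character literals are ignored.
 */
int unterminated_comment_line(const char *src)
{
    enum { CODE, SLASH, COMMENT, STAR, LITERAL, ESCAPE } s = CODE;
    int line = 1;    /* current line, counted from 1 as gcc does */
    int start = 0;   /* line where the most recent comment opened */
    int quote = 0;   /* quote character that opened the current literal */

    assert(src != NULL);
    for (; *src != '\0'; src++) {
        int c = (unsigned char)*src;

        switch (s) {
        case CODE:
            if (c == '/') s = SLASH;
            else if (c == '"' || c == '\'') { quote = c; s = LITERAL; }
            break;
        case SLASH:
            if (c == '*') { s = COMMENT; start = line; }
            else if (c == '"' || c == '\'') { quote = c; s = LITERAL; }
            else if (c != '/') s = CODE;
            break;
        case COMMENT:
            if (c == '*') s = STAR;
            break;
        case STAR:
            if (c == '/') s = CODE;
            else if (c != '*') s = COMMENT;
            break;
        case LITERAL:
            if (c == '\\') s = ESCAPE;
            else if (c == quote) s = CODE;
            break;
        case ESCAPE:
            s = LITERAL;
            break;
        }
        if (c == '\n')
            line++;
    }
    return (s == COMMENT || s == STAR) ? start : 0;
}
```

Because comments do not nest, the first */ closes the comment no matter how many /* markers came before it, so only one start line needs to be remembered.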

Testing
Write some short sample C code to test whether your comment remover works properly. This code should also be useful for answering the questions in Part 1.
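
A sample input might look like the sketch below: each function places a comment in a tricky position and returns a value that must be the same before and after the comments are removed. The function names are hypothetical and chosen only for illustration. A nested case like question 3 is left out on purpose, since text such as a stray */ does not compile.

```c
#include <assert.h>
#include <string.h>

/* Question 4: comment markers inside a string literal must survive. */
const char *literal_case(void)
{
    return "some /* crazy */ stuff ";
}

/* Questions 1 and 2: comments sitting directly between tokens. */
int adjacent_case(void)
{
    int/* crazy */some = 6;   /* breaks if the comment is deleted without leaving a space */
    return some/* crazy */- 2;
}

/* A character literal holding a double quote must not open a string. */
int quote_char_case(void)
{
    return '"' /* " */;
}

/* An escaped quote must not close the string early. */
size_t escaped_case(void)
{
    return strlen("a\"/*b*/");
}

/* Self-check: call this from main before and after running the comment remover. */
void check_sample(void)
{
    assert(strcmp(literal_case(), "some /* crazy */ stuff ") == 0);
    assert(adjacent_case() == 4);
    assert(quote_char_case() == '"');
    assert(escaped_case() == 7);
}
```

If the decommented file still compiles and check_sample() still passes, the remover preserved the literals and kept the tokens separate.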



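/* rmcmt.c Mark Dolan, CIS 2107, Comment Remover 091614 */
#include <stdio.h>

#define OUT 0        /* ordinary code */
#define IN 1         /* inside a comment */
#define MAYBE_IN 2   /* saw '/', which may open a comment */
#define MAYBE_OUT 3  /* saw '*' inside a comment, which may close it */
#define IN_QUOTE 4   /* inside a string or character literal */
#define ESCAPE 5     /* saw a backslash inside a literal */

int main(void) {
    int state = OUT;
    int cur;
    int quote = 0;      /* quote character that opened the current literal */
    int lines = 1;      /* current line number, counted from 1 as gcc does */
    int lineError = 0;  /* line where the open comment or literal began */

    while ((cur = getchar()) != EOF) {
        if (state == OUT && cur == '/') {
            state = MAYBE_IN;             /* hold the '/' until the next character */
        } else if (state == OUT && (cur == '"' || cur == '\'')) {
            putchar(cur);
            quote = cur;
            state = IN_QUOTE;
            lineError = lines;
        } else if (state == OUT) {
            putchar(cur);
        } else if (state == MAYBE_IN && cur == '*') {
            state = IN;
            lineError = lines;
        } else if (state == MAYBE_IN && cur == '/') {
            putchar('/');                 /* emit the held '/', keep holding this one */
        } else if (state == MAYBE_IN && (cur == '"' || cur == '\'')) {
            putchar('/');
            putchar(cur);
            quote = cur;
            state = IN_QUOTE;
            lineError = lines;
        } else if (state == MAYBE_IN) {
            putchar('/');
            putchar(cur);
            state = OUT;
        } else if (state == IN && cur == '*') {
            state = MAYBE_OUT;
        } else if (state == IN) {
            ;                             /* discard comment text */
        } else if (state == MAYBE_OUT && cur == '/') {
            putchar(' ');                 /* each comment becomes one space */
            state = OUT;
        } else if (state == MAYBE_OUT && cur != '*') {
            state = IN;                   /* a run of '*' stays in MAYBE_OUT */
        } else if (state == IN_QUOTE && cur == '\\') {
            putchar(cur);
            state = ESCAPE;
        } else if (state == IN_QUOTE && cur == quote) {
            putchar(cur);                 /* only the matching quote closes the literal */
            state = OUT;
        } else if (state == IN_QUOTE) {
            putchar(cur);
        } else if (state == ESCAPE) {
            putchar(cur);                 /* an escaped character never closes the literal */
            state = IN_QUOTE;
        }
        if (cur == '\n') {
            lines++;
        }
    }

    if (state == MAYBE_IN) {
        putchar('/');                     /* input ended right after a lone '/' */
    }
    /* Diagnostics go to stderr so they never end up in the decommented output. */
    if (state == IN || state == MAYBE_OUT) {
        fprintf(stderr, "error: unterminated comment beginning on line %d\n", lineError);
        return 1;
    } else if (state == IN_QUOTE || state == ESCAPE) {
        fprintf(stderr, "error: missing terminating %c character on line %d\n", quote, lineError);
        return 1;
    }
    return 0;
}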


Mark Dolan CIS 2107 Lab 3 State Machine