Lab 7
Lab assignment: Performing a dictionary lookup
For this week's lab you'll be getting some hands-on experience with C strings (char * and char[]), C library functions for operating on C strings, and file input, specifically reading lines from a file into an array of C strings. These are all skills that will prove useful, if not necessary, for the project. Note that you may use the C++ form of file/user input, though you should know both types.
You are given the file silly.txt (a silly listing of words, much smaller than the dictionary), which you should put in the same folder as the program you are trying to run. Read the file into an array of char[]'s, that is, a char[][]. Each element of the array should be a char[], an array of characters containing one line from the file.
Once you've generated this array, loop while reading user input. Each user input should be a single string. Given this string, you should check whether this string was in the file. Print "Valid word" if it was in the file, "Invalid word" if it was not.
For extra credit, and to help you get one step closer to your project, instead of checking whether the word they entered was in the file, check if it is the prefix of any words in the file. If it is not, print "Invalid prefix", otherwise print all the words that start with the prefix. For example, if they type in the word swab, you should print swab and swabbed, as both are words in the dictionary which begin with swab.
The execution of the finished program should look like what is shown below. Output the name of the file you're reading from (silly.txt) if it successfully opened, otherwise print could not open file. (User input is shown in bold, though in your program it will not be bold.)
For one point:
File silly.txt is open and ready for checking.
Word 0. A
Word 1000. arousal
Word 2000. bombardment
Word 3000. chest
Word 4000. cool
Word 5000. detox
Word 6000. elfin
Word 7000. field-hockey
Word 8000. gibberish
Word 9000. high-jump
Word 10000. informally
Word 11000. lecherous
Word 12000. metaphysical
Word 13000. notion
Word 14000. peerless
Word 15000. principally
Word 16000. relegate
Word 17000. screen
Word 18000. soaring
Word 19000. sunken
Word 20000. transcend
Word 21000. vigor
For two points:
File silly.txt is open and ready for checking.
Which word would you like to look up (exit to exit)? abate
abate is a valid word
Which word would you like to look up (exit to exit)? Boromir
Boromir is not a valid word
And for extra credit:
File silly.txt is open and ready for checking.
Which word would you like to look up (exit to exit)? blot
The following words begin with blot
blot
blotch
blotchy
blotter
Which word would you like to look up (exit to exit)? belei
No words in the dictionary begin with belei
You may use any of the C libraries for file input and C string operations, though we will not be covering all of them. Wherever possible, you should use the built-in and library C functions instead of looping through the characters yourself. If you wish to use C++ operations for reading from a file and for input and output (the << and >> operators), that is fine, but be sure you're using char[]'s and not C++ strings.
Stage 1 (1 point):
Read in the words from the file one line at a time, and store them into an array of character arrays (char[][]). No string will be longer than 80 characters, and there are 21857 strings in the library. As a demonstration that you've successfully read in the dictionary, print out every 1000th word in the character array. As a starting point you may use the sample file called lab7.cpp provided at the bottom of this page.
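One way Stage 1 could be approached is sketched below. This is a minimal sketch and not the provided lab7.cpp: the names load_words, print_every, MAX_WORDS and WORD_CAP are made up for illustration, and the row width of 82 assumes 80 characters plus a newline and the terminating '\0'.

```c
#include <stdio.h>
#include <string.h>

#define MAX_WORDS 21857   /* number of strings stated in the assignment */
#define WORD_CAP  82      /* assumption: 80 characters + '\n' + '\0' */

/* Reads up to max_words lines from path into words. Returns the number
   of lines stored, or -1 if the file could not be opened. */
int load_words(const char *path, char words[][WORD_CAP], int max_words)
{
    FILE *in = fopen(path, "r");
    if (in == NULL)
        return -1;

    int count = 0;
    while (count < max_words && fgets(words[count], WORD_CAP, in) != NULL) {
        /* fgets keeps the line ending; cut the string at the first '\r' or '\n'. */
        words[count][strcspn(words[count], "\r\n")] = '\0';
        count++;
    }
    fclose(in);
    return count;
}

/* Prints every step-th word in the "Word N. text" form of the sample output. */
void print_every(char words[][WORD_CAP], int count, int step)
{
    for (int i = 0; i < count; i += step)
        printf("Word %d. %s\n", i, words[i]);
}
```

A main function might declare the array as static char words[MAX_WORDS][WORD_CAP]; (static because roughly 1.8 MB may not fit on the stack), call load_words("silly.txt", words, MAX_WORDS), print the "could not open file" message when the result is -1, and otherwise call print_every(words, count, 1000).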
Stage 2 (1 point):
After commenting out the print statements from the previous step, create a loop that asks for input and prints whether or not each word appears in the dictionary. On entry of the word exit, the program should exit the loop and terminate. The input should be recorded in a char[] of length 10. With scanf you pass the char[] variable itself, without the ampersand we used for int and char variables, because the array name already refers to the address of its first character. The input prompt should appear as it did in the example, as should the output statements. You should not reopen the file each time; instead, check for the existence of the string in the array. You can use strcmp or strncmp from the string.h library to check for equality. Note that == will likely not do what you are hoping for, since on char[]'s it compares addresses rather than contents.
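The Stage 2 loop might look roughly like the sketch below. The names find_word, lookup_loop, WORD_CAP and INPUT_CAP are hypothetical, and the dictionary is assumed to already be loaded into a char[][] with rows WORD_CAP characters wide. The %9s width in scanf is an addition not required by the assignment; it keeps a long entry from overflowing the 10-character buffer.

```c
#include <stdio.h>
#include <string.h>

#define WORD_CAP  82   /* assumed row width of the dictionary array */
#define INPUT_CAP 10   /* input buffer length given in Stage 2 */

/* Returns the index of target in words, or -1 if it is absent. */
int find_word(char words[][WORD_CAP], int count, const char *target)
{
    for (int i = 0; i < count; i++) {
        if (strcmp(words[i], target) == 0)   /* 0 means the strings are equal */
            return i;
    }
    return -1;
}

/* Prompts until the user types "exit" or the input ends. */
void lookup_loop(char words[][WORD_CAP], int count)
{
    char input[INPUT_CAP];

    for (;;) {
        printf("Which word would you like to look up (exit to exit)? ");
        /* No ampersand: the array name is already an address.
           %9s stores at most 9 characters plus the '\0'. */
        if (scanf("%9s", input) != 1 || strcmp(input, "exit") == 0)
            break;
        if (find_word(words, count, input) >= 0)
            printf("%s is a valid word\n", input);
        else
            printf("%s is not a valid word\n", input);
    }
}
```

One consequence of the width limit is that an entry longer than 9 characters is split across successive reads, so a word such as field-hockey would not match with a 10-character buffer.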
Stage 3 (Extra Credit) (1 point):
Look for all strings in the dictionary that begin with the prefix input by the user. A character-by-character comparison would be fine here, though it might be best to get used to the strstr function (strstr returns a pointer to the first occurrence of the prefix, so a word begins with the prefix only when that pointer is the start of the word). You should also consider taking some shortcuts to improve efficiency: since the array is already sorted, elements of the array with the same prefix will be adjacent. If you're really feeling ambitious, look at binary search for a logarithmic rather than linear time solution.
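The prefix search could be sketched as follows. This version uses strncmp instead of strstr for the prefix test, and it assumes the array is sorted in strcmp (byte value) order; if silly.txt is sorted some other way, such as ignoring case, the binary search would need a matching comparison. The names starts_with, lower_bound, print_matches and WORD_CAP are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define WORD_CAP 82   /* assumed row width of the dictionary array */

/* Nonzero when word begins with prefix. */
int starts_with(const char *word, const char *prefix)
{
    return strncmp(word, prefix, strlen(prefix)) == 0;
}

/* Binary search: index of the first word that is not less than key
   in strcmp order, or count if every word is less than key. */
int lower_bound(char words[][WORD_CAP], int count, const char *key)
{
    int lo = 0, hi = count;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (strcmp(words[mid], key) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Prints every word beginning with prefix; returns how many matched. */
int print_matches(char words[][WORD_CAP], int count, const char *prefix)
{
    int first = lower_bound(words, count, prefix);
    if (first == count || !starts_with(words[first], prefix)) {
        printf("No words in the dictionary begin with %s\n", prefix);
        return 0;
    }

    printf("The following words begin with %s\n", prefix);
    int i = first;
    /* Matches are adjacent in a sorted array, so stop at the first miss. */
    while (i < count && starts_with(words[i], prefix))
        printf("%s\n", words[i++]);
    return i - first;
}
```

The binary search finds the first candidate in a logarithmic number of comparisons, and the loop then visits only the matching words. A plain linear scan with starts_with over the whole array gives the same output and is enough for the extra credit.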
Notes:
Keep in mind that this is a team effort so you should agree with your partner on what you are going to do before you start typing. The partner who is typing is the "driver" and the partner watching is the "navigator." Be sure to switch roles every 10 to 15 minutes, to foster a deep understanding of the code for both partners. The navigator should be watching for syntax errors and verifying the correctness of the code you're writing.
It will speed things up for you if you keep a window open for editing and have a separate window open for compiling and running your program. Remember that windows are resizeable!
Submission:
1. You should work with a partner for this (and all the remaining) lab(s). Only one of you needs to submit the program to Blackboard, though you should be certain that both of your names are present in a comment at the top of the .c or .cpp source file.
2. You should turn in to Blackboard by the END OF THE LAB (8:50 for the 8-9 lab session, 9:50 for the 9-10 lab session). I know it's tempting to keep working on it, but other classes come in, and it's not fair to the students who are limited to that particular time span if you go over. Which isn't to say that you can't work on it later, to check your solution against the one I post for your own understanding. But what you submit for a grade should be before the next hour begins.
3. If you wish, you may submit your lab by 11 am on Thursday for a 1 point penalty. If you can't finish up the second point by the end of lab, you can still earn the score by completing all three steps and submitting your code by the day after.