Prog 4: Guess Wordle Word

Previously we worked out the best first and second Wordle guesses. Now we flip it around: we come up with the secret word (or choose one at random) and our program does the guessing to find it. This time your code will implement ideas, not specifications. You choose how you want to do it, and the level of performance your program provides.

Here is an example of what the program looks like when it is run:

Your entire program score comes from how many guesses your program takes to find the secret word, as shown below. All three of your program runs must satisfy the Number of Guesses description to earn the points.

To earn any points at all, feedback on letter guesses (correct / incorrect positions) must be accurate, correctly indicating which guess letters match in the correct position (shown as upper-case) and which guess letters are in the secret word but in some other position (indicated by an asterisk under the letter). No points are awarded unless your program gives this letter-matching feedback correctly, along with multiple numbered moves continuing until the secret word is found.
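One way to produce this feedback is sketched below in Python. The language, the function name `letter_feedback`, and the two-string return format are assumptions made for illustration; they are not taken from the starter code. The sketch uses two passes so that each letter of the secret word is matched at most once, which is what keeps a repeated guess letter from receiving an extra asterisk.

```python
def letter_feedback(guess: str, secret: str) -> tuple[str, str]:
    """Return (letters_line, marks_line) for one guess.

    letters_line shows exact matches in upper case; marks_line has '*'
    under letters that occur in the secret at some other position.
    Both words are assumed to have the same length.
    """
    guess, secret = guess.lower(), secret.lower()
    letters = list(guess)
    marks = [" "] * len(guess)
    # Secret letters not yet matched; an entry becomes None once used.
    remaining = list(secret)

    # Pass 1: exact matches take priority and use up the secret letter.
    for i, ch in enumerate(guess):
        if ch == secret[i]:
            letters[i] = ch.upper()
            remaining[i] = None

    # Pass 2: wrong-position matches, each using up one unused secret letter.
    for i, ch in enumerate(guess):
        if letters[i].isupper():
            continue
        if ch in remaining:
            marks[i] = "*"
            remaining[remaining.index(ch)] = None

    return "".join(letters), "".join(marks)


if __name__ == "__main__":
    letters_line, marks_line = letter_feedback("soare", "yexed")
    print(letters_line)
    print(marks_line)
```

With a hypothetical secret of "rates", the guess "rarer" comes back as "RArEr" with no asterisks, because the only 'r' in the secret is already used by the exact match in the first position.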

Prog 4 Scoring

For example, below are the results of one program that would earn 70 points, because letter-matching feedback is correct and all three program runs find the secret word in <= 100 guesses. Your program output should match the format of this example, but the selected secret word and number of guesses do not need to match this output! Make sure your program works properly with less common words (because we will select them!), such as yarns, wants, tills, buddy, enfix, yexed (past participle of "yex", which is from Middle English and means "to hiccup"). (10/21 note: In the second example below the feedback on guess 2 ("RArEr") was corrected to not have the asterisk under the second lower-case 'r' in the middle of the word.)

The above sample output does the following after each guess:

  1. Eliminates potential words that do not include exact match letters

  2. Eliminates words that don't have the letters we know are in the word, but are in a different position.

What it doesn't do is eliminate words that have a letter in a position where that letter didn't match. For instance, if a guess starting with 's' does not get an exact match on the 's', then all words starting with 's' should be eliminated from consideration as secret word candidates. If we do implement exclusion letters like this, the above long sequence of guesses to find yexed can become much shorter:

Again, remember that your program is not supposed to match the output and sequence shown in these results. The scoring formula and way you write your program will change your results. This is expected for this program!

Hard-coding the first guess to be the best starting word, as discovered by the previous program, can also change the length of the guess sequence, as shown below:

Using soare as the first word, as shown above, is not necessarily the best choice in this context. It was the best starting word when matching against the original Wordle set of ~2,000 answer words, but that is not the same as being the best starting word when trying to find some secret word from the set of all ~12,000 words, including all the obscure words that are in there. That is why we are giving full credit on this assignment for finding any word within 15 guesses rather than the original Wordle 6 guesses.

Requirements

  1. Work must be submitted in Replit (not in Gradescope) in the Prog 4: Guess Wordle Word project, where there is starter code for you (also shown below). You must select the Submit button in Replit when you are done!

  2. No late submissions will be accepted, because the deadline is after the end of week 13.

  3. All points come from program execution. No style points will be given. TAs will run each of your programs to determine points.

  4. Instructional staff will only give assistance in person and through Piazza on code that is reasonably documented. The standard for "reasonable" documentation can be seen in the posted solutions to previous programs, where variable names are meaningful, each function has a description, parameters have descriptions, and each section of multiple lines of code has a description.

  5. Program must be your own individual work, with no partners. According to the syllabus, when we were originally going to have 5-6 projects, both projects 3 and 4 were going to be group projects. Now that we only have 4 projects, we are switching program 4 to once again be individual work only.

Recommendations

  1. Write your design of each new section of code as comments first, then implement the code one section at a time. This will make your code understandable, will help you debug your code, and will make your program eligible for help from the instructional staff.

  2. Play Wordle, if you haven't already, so you understand what is involved and can gain insight that you can then implement as code. Want to play more than the once-per-day limit? Try the Wordle time machine.

  3. To minimize the number of guesses, reflect on how you think when playing Wordle. What are your strategies? Think about how you might implement them to improve your program's performance. Below are some possible strategies, though you may think of others, or may want to implement only some of them. Whatever strategies you choose, you should compute a score for every possible word, and then for your next guess select the word with the highest score.

      1. Prefer guesses with more common letters. To do this you would have to either do some research to find the relative commonality of different letters, or compute it yourself, for instance based on all the letters in the words file. Scoring each word would be based on how common each of its letters is.

      2. Prefer guesses that have letters in meaningful places. For instance, is 's' more common as a first letter? Last letter? Second letter? To use this strategy you would need to count the number of each letter in each of the 5 letter positions, and use those counts to score each word.

      3. Hard-code the first couple of guesses to be words that you think are effective, such as words you found as the result of the previous program.

      4. As play progresses, eliminate words that don't match information you have found so far. For instance you could eliminate words from consideration that:

          1. Do not have the exact match letters (shown in upper case) in the positions where they have been found so far. For instance if a previous guess reveals that the secret word has an 'a' in the first position, then you could eliminate all words from consideration that do not have an 'a' in the first position.

          2. Are lacking the letters that you have found so far that you know are in the word, even if you don't know which position they belong in. For instance if a previous guess reveals that the secret word has an 'm' in it, then you can eliminate from consideration all words that don't have an 'm'.

          3. Have a non-exact matching letter in a position you know it can't be. For instance if a previous guess reveals that the word has an 'r' somewhere, but you know it is not in the last position because your guess with an 'r' in the last position gave an inexact match, then you can eliminate all words that have an 'r' in the last position.
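The three elimination rules above, together with a position-based score as in strategy 2, could be sketched in Python roughly as follows. The language, the function names (`is_consistent`, `position_counts`, `best_guess`), and the feedback format (a letters line with exact matches in upper case, plus a marks line with '*' under wrong-position letters) are assumptions for illustration only, not part of the starter code. Rule 3 is applied here to every non-exact position rather than only to starred ones, since a letter that failed to match exactly at a position cannot be in that position either way.

```python
from collections import Counter


def is_consistent(word: str, guess: str, letters: str, marks: str) -> bool:
    """Return True if word survives the three elimination rules for one guess."""
    for i, ch in enumerate(guess):
        if letters[i].isupper():
            # Rule 1: exact-match letters must appear in the same position.
            if word[i] != ch:
                return False
        else:
            # Rule 3: a letter that was not an exact match cannot be here.
            if word[i] == ch:
                return False
            # Rule 2: a starred letter must appear somewhere in the word.
            if marks[i] == "*" and ch not in word:
                return False
    return True


def position_counts(words: list[str]) -> list[Counter]:
    """Count how often each letter occurs in each of the 5 positions."""
    counts = [Counter() for _ in range(5)]
    for word in words:
        for i, ch in enumerate(word):
            counts[i][ch] += 1
    return counts


def best_guess(candidates: list[str]) -> str:
    """Pick the candidate whose letters are most common in their positions."""
    counts = position_counts(candidates)
    return max(
        candidates,
        key=lambda word: sum(counts[i][ch] for i, ch in enumerate(word)),
    )


if __name__ == "__main__":
    # A tiny stand-in word list; the real program would read the words file.
    words = ["soare", "yexed", "hexed", "eerie", "rates"]
    # Feedback for the guess "soare" when the secret is "yexed".
    guess, letters, marks = "soare", "soare", "    *"
    words = [w for w in words if is_consistent(w, guess, letters, marks)]
    print(words)
    print(best_guess(words))
```

After each guess the program would filter the remaining candidates with `is_consistent` and then call `best_guess` on what is left. Scoring on the remaining candidates rather than on the full word list is a design choice, not a requirement; either could be tried.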

  4. Added 11/7/22: The letters matching algorithm should be something like this:

Starter Code (available in Replit)