Friday, January 19, 2024

Disturbances in the Force Ahead for Wordle

Comment on Wordle No. 943

Some of you may recall that I wrote software that can play Wordle, the online game owned by and housed at The New York Times. At the outset such software must know what words are accepted by Wordle as guesses and what words might be used as the solution in a game. Thus, the software must employ two word lists.

There is one game every day. The first game in history (before acquisition by The New York Times) took place online on June 19, 2021. Yesterday (January 18, 2024) was the day of game no. 943. As that game is over, I may reveal its solution, which was "stole". My favorite first guess is the word "roate". After making that guess yesterday, I saw this:

ROATE (R gray, O yellow, A gray, T yellow, E green)

It's relatively rare for me to determine three of the five letters with my first guess. My software, using my list of words that might reasonably appear as solutions, reported that the possible solutions on that list were "stoke", "stole", "stone", "stove", and "those". At first glance it appears that I have a 20% chance of guessing correctly on my second guess by picking one of those 5 words at random. But I can dig deeper. Taking each of those words in turn as a guess, I can hypothesize that each of the 5 words is the solution and count the words that would still be in play once the guess is scored. That tells me, for each candidate guess, how many words it tends to leave in play, and I select for my next guess a word for which that number is minimum. With this example -- and using my word lists -- each of the four words beginning with 's' is a better second guess than the word "those". The reason is that "those" as a guess is scored identically against all four 's' words, leaving all four in play, whereas any 's' word as a guess leaves at most three.
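The "deeper dig" can be sketched in Python along the following lines. This is only an illustration under stated assumptions, not the software described above: the names feedback, still_in_play, and average_remaining are made up for the sketch, the scoring follows the usual Wordle rules for repeated letters, and averaging the counts over the hypothesized solutions is one assumed way of combining them (a total or a worst case would also be plausible).

```python
from collections import Counter

def feedback(guess, solution):
    """Score a guess: 'g' = green, 'y' = yellow, '-' = gray."""
    pattern = ["-"] * len(guess)
    unmatched = Counter()  # solution letters not matched in place
    for i, (g, s) in enumerate(zip(guess, solution)):
        if g == s:
            pattern[i] = "g"
        else:
            unmatched[s] += 1
    # A non-green letter is yellow only while unmatched copies remain.
    for i, g in enumerate(guess):
        if pattern[i] == "-" and unmatched[g] > 0:
            pattern[i] = "y"
            unmatched[g] -= 1
    return "".join(pattern)

def still_in_play(guess, solution, candidates):
    """Candidates consistent with the score of guess against solution."""
    pattern = feedback(guess, solution)
    return [w for w in candidates if feedback(guess, w) == pattern]

def average_remaining(guess, candidates):
    """Average count of words in play, over each hypothesized solution.

    Assumption: the count includes the solution itself, so a correct
    guess contributes 1.
    """
    total = sum(len(still_in_play(guess, s, candidates)) for s in candidates)
    return total / len(candidates)

# The five words reported after the first guess "roate".
candidates = ["stoke", "stole", "stone", "stove", "those"]
scores = {g: average_remaining(g, candidates) for g in candidates}
best = sorted(g for g in scores if scores[g] == min(scores.values()))

print(scores)  # the four 's' words score 2.2, "those" scores 3.4
print(best)    # ['stoke', 'stole', 'stone', 'stove']
```

Under these assumptions the four 's' words tie, which agrees with the observation that each of them is a better second guess than "those".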

There are two further wrinkles, one helpful and one unhelpful. The unhelpful wrinkle is that the solution may not be on the list of words that I imagine as possible solutions. For that list I had a good start because the early version of Wordle had semi-public lists of allowed guesses and possible solutions. As time went by, the list of allowed guesses was significantly expanded but remained semi-public. The list of possible solutions, however, dropped out of semi-public sight, and, more to the point, there have been four solutions over time -- in games 646 (guano), 659 (snafu), 720 (balsa), and 730 (kazoo) -- that were not on the early list of possible solutions.

The helpful wrinkle is that up through today -- game 944 -- no word has been used as a solution more than once. Quite obviously, unless Wordle is discontinued before, say, its 10th anniversary in 2031, sole-use solutions cannot continue forever.

Why do I call that a helpful wrinkle? I expect that the Wordle editor at The New York Times is aware of the history of sole use and is aware that at some point sole use must be discontinued. In the meantime, though, a player can gain an advantage by assuming that sole use is still in effect. With game 943 the assumption of sole use narrows the list of 5 words in play to 2, "stoke" and "stole", which means that a random second guess among those has a 50% chance of being correct.
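The sole-use assumption amounts to a simple filter on the words in play. The sketch below is illustrative only. The set previously_used is inferred from the statement above that the 5 words narrow to "stoke" and "stole"; it was not checked against an archive of past solutions, and the variable names are hypothetical.

```python
# Words in play after the first guess "roate" in game 943.
candidates = ["stoke", "stole", "stone", "stove", "those"]

# Assumption: inferred from the text, not from a solution archive.
# These are the three words that sole use rules out.
previously_used = {"stone", "stove", "those"}

# Under sole use, a word that was already a solution is dropped.
remaining = [w for w in candidates if w not in previously_used]

# Chance that a random pick from the remaining words is correct.
chance = 1 / len(remaining)

print(remaining)  # ['stoke', 'stole']
print(chance)     # 0.5
```

Without the filter the same calculation gives 1/5, the 20% figure mentioned earlier.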

One more example. Suppose the first guess had been "least". Then I would have seen this:

LEAST (L yellow, E yellow, A gray, S yellow, T yellow)

That is a remarkably lucky guess yielding 4 letters. The words in play -- using my list of possible solutions -- are "steel", "stole", and "style". The "deeper dig" procedure flags "stole" and "style" as optimal second guesses, but assuming sole use, there is only one optimal second guess, which is "stole".
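One way to see why "steel" falls short here is to group the three words by the score each guess would receive against them. This is a hedged sketch rather than the software described above: the helper names feedback and partition are invented for illustration, and the scoring assumes the usual Wordle treatment of repeated letters.

```python
from collections import Counter, defaultdict

def feedback(guess, solution):
    """Score a guess: 'g' = green, 'y' = yellow, '-' = gray."""
    pattern = ["-"] * len(guess)
    unmatched = Counter()  # solution letters not matched in place
    for i, (g, s) in enumerate(zip(guess, solution)):
        if g == s:
            pattern[i] = "g"
        else:
            unmatched[s] += 1
    for i, g in enumerate(guess):
        if pattern[i] == "-" and unmatched[g] > 0:
            pattern[i] = "y"
            unmatched[g] -= 1
    return "".join(pattern)

def partition(guess, candidates):
    """Group candidates by the score the guess would get against each."""
    buckets = defaultdict(list)
    for word in candidates:
        buckets[feedback(guess, word)].append(word)
    return dict(buckets)

# The three words in play after the first guess "least".
candidates = ["steel", "stole", "style"]

# Size of the largest group each guess could leave in play.
worst_case = {
    g: max(len(group) for group in partition(g, candidates).values())
    for g in candidates
}

print(partition("steel", candidates))
# {'ggggg': ['steel'], 'ggy-y': ['stole', 'style']}
print(worst_case)
# {'steel': 2, 'stole': 1, 'style': 1}
```

A guess of "steel" cannot tell "stole" from "style", while either of those two as a guess separates all three words, which matches the result of the "deeper dig" procedure.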