3.5 SCRABBLE AND MAVEN

Scrabble is a board game where players try to form words from tiles they have pulled from a bag. If you are not familiar with the game you should consult [A] and [B] (or at least take a look at Figure 3.5.4 at the end of the section). The major differences between this game and the ones we have covered so far (checkers and chess) are:

  1. You do not know what tiles your opponent has or the tiles you will draw from the bag after you play, so it is a game with incomplete information.
  2. The machine needs a dictionary of words (humans are supposed to remember them) so that there is a need for a knowledge database.

People started writing computer programs that played Scrabble since the 1970's but the programs did not fare well against human players. A noteworthy development was the Scrabble program by Andrew Appel and Guy Jacobson (then at Carnegie Mellon University) written in 1983 and described in their 1988 paper [E]. One of their major contributions was to store the word dictionary in a structure that both saved storage space and facilitated play.

The structure is called a Directed Acyclic Word Graph (DAGW for short) and it had been first proposed around 1980 as a way of saving both computer storage space and search time. A dictionary that would require 780,000 bytes to be stored in word form, it takes only 175,000 bytes in DAWG form. The next few illustrations explain the concept of DAWG and show why it is helpful for Scrabble. Figure 3.5.1 shows part of such a graph. It covers some of the words starting with F, in particular FAR, FIR, FOR, FARM, FIRM, FORM, etc. The filled circles denote a possible end of word.

(a) (b) (c) (d)
Figure 3.5.1: (a) Example of part of a DAWG; (b) The path corresponding to the word FARMS is marked in red; (c) The path corresponding to the word FIRM is marked in green; (d) The path corresponding to the word FOR is marked in blue.

Figure 3.5.2 shows a part of another DAWG; a bit more complex than that of Figure 3.5.1. It covers the words CAR, CAT, DO, DOG, EAR (and their plurals), DONE, EAT, and EATS.

Figure 3.5.2: A DAWG that includes the words DOGS (in red) and CATS (in blue). Adapted from [E].

To better see the importance of DAWGs for Scrabble we consider depictions of partially collapsed parts. The left diagram of Figure 3.5.3 shows a collapsed version of the DAWG of Figure 3.5.2. Let us look at the right diagram of the same figure. We see that two words INFIRMITY and INFINITY are represented in such a way as to emphasize their common prefix and suffix. This is ideally suited to Scrabble because words with common parts appear close together.

Figure 3.5.3: Partially collapsed parts of DAWGs

Today the best computer Scrabble game is arguably the Maven written by Brian Sheppard [C]. It has been beating all human opponents since the mid-1990's.

Writing a computer program that plays this game requires several other task besides creating an easily searchable dictionary such as strategies for evaluating plays. For example, should one play 5 tiles rather than 4, even if the resulting score is lower? This might be the right move if one wishes to draw more tiles in the hope of obtaining a particular letter. For example if you have the Q in your rack but you are missing a U. Such strategies were "hand crafted" and the best way to describe the overall approach is to quote from Sheppard's paper [C]. ENDGAME?

MAVEN is a good example of the "fundamental engineering" approach to constructing intelligent behavior. Design, testing, and quality control contributed more to the quality of MAVEN than any grant concepts of artificial intelligence.

A computer can beat a human player in Scrabble not because the machine is smarter but because it can look through a dictionary much faster than a human. Suppose you want to find all possible anagrams for the seven tiles in your rack. You have 7 choices for the first letter, 6 for the second, 5 for the third, etc so the total number of anagrams is 7X6X5...X2X1 or 5040. If you spent a second thinking about an anagram you will need over an hour. But a modern personal computer can look at least at 10,000 anagrams a second so it can do the search (and check which ones match the dictionary entries) in less than a second. This is particularly useful for playing all your seven tiles in one turn so that you can earn a 50 point bonus.

You can download Maven from the site listed below [D]. In spite of its name as "free-game-downloads" you have to pay for some of the games, but the charge was under $10 when I downloaded my copy. You can adjust the level at which Maven plays so that you will have some chances to win. The program implements the level by discarding moves from amongst those generated, in effect handicapping itself. At the lowest level it considers only 8% of the moves. Figure 3.5.2 shows the board at the end of the game (one of the few I won against Maven even though it was playing at the lowest level).

Figure 3.5.4: The board at the end of a game of Scrabble against Maven. Note that program uses a dictionary with a very large vocabulary, hence the unusual words such as MISATONE (to atone wrongly), FARADISM (the use of faradic current for therapeutic purposes, and RATOONER (a plant that sprouts from a root planted the previous year).

Click on the image to see it at a higher resolution.

ZZ

Notes

[A] Official Scrabble site http://www.scrabble.com/

[B] Rules of the game http://www.hasbro.com/scrabble/en_US/rulesSetup.cfm

[C] Brian Sheppard, "World-championship-caliber Scrabble", Artificial Intelligence, vol. 134 (Jan. 2000), pp. 241–275

[D] http://free-game-downloads.mosw.com/abandonware/pc/strategy_games/games_m_n/maven.html

[E] A. A. Appel and G. J. Jacobson "The World's Fastest Scrabble Program", CACM, vol. 31 (May 1988), PP. 572-578, 580.

Back to Contents --- Previous Section