22C:21 Project 1 Solution

Here is the solution to Project 1.

When BuildDict was run on the 6 text files provided it created a list of 22444 words. I used a hash table of size 10501. This is a prime number that is roughly half the number of words we intend to store in the table. The performance of hashing with chaining is determined by the lengths of the linked lists that are formed due to collisions. Here is a table that shows the distribution of the lengths of the lists. All lists have length 10 or less and more than 90% of the lists jave length 4 or less. These statistics indicate that our hashing scheme is performing reasonably well.

	Max length is: 10
	Length  Number of lists
	0        1285
	1        2627
	2        2801
	3        1991
	4        1105
	5        439
	6        170
	7        61
	8        19
	9        2
	10       1
Just for comparison, I ran BuildDict again with a hash table of size 5003 (also a prime). Here are the statistics for this run. Comparing the two tables shows what you would expect: as the size of the hash table becomes smaller, the lengths of the lists become longer.
	Max length is: 13
	Length  Number of words
	0        53
	1        270
	2        548
	3        812
	4        1017
	5        842
	6        648
	7        362
	8        258
	9        105
	10       53
	11       22
	12       8
	13       5

Here are two four runs of the SpellCheck program. The file check1.txt is the file that was spell checked, the corrected output file is check1.txt.out, and interaction1 is how I interacted with the program (specifying what to replace with what, etc.). The names for the rest of the runs are similar.