Week 4: Lab Document, 9/20


Let G be a graph with vertex set V and edge set E. For any vertex v in G, the clustering coefficient of v, denoted C(v), is the ratio of the number of edges between pairs of neighbors of v to the total number of pairs of neighbors of v. For example, say vertex v has 10 neighbors. Then v has (10 choose 2) = 45 pairs of neighbors and suppose that of these pairs of neighbors 15 pairs are connected by edges. Then the clustering coefficient of v is 1/3. If a vertex has one or no neighbors, let us define its clustering coefficient to be 1. The clustering coefficient of G, denoted C(G), is the average of the clustering coefficients of all the vertices in G. The value of C(v), for any v, is between 0 and 1 and therefore the value of C(G) is also between 0 and 1.

Problem:

  1. Add the following two methods to the myGraph class to respectively compute the clustering coefficient of a given vertex and the clustering coefficient of the graph.
    	public double clusteringCoefficient(String v)
    	public double clusteringCoefficient()
    
  2. Test your implementation of the clusteringCoefficient methods on the Ladders graph, described below. Later we will use the Ladders graph to play the Ladders game. In this two-player game, one player chooses a starting word and an ending word and the other player constructs a "ladder" between the two words. A ladder is a sequence of words that starts at the starting word, ends at the ending word, and each word in the sequence (except the first) is obtained from the previous word by changing a letter in a single position. For example, suppose the starting word is flour and the ending word is bread, then a ladder between these two words is: flour, floor, flood, blood, brood, broad, bread. Here is the link that will take you to a file called words.dat that contains 5757 five letter English words. This word list comes from Stanford Graphbase, a collection of interesting data sets and graph algorithms put together by Donald E. Knuth. The original file can be found here. This word list is the database that your program will use to construct the Ladders graph. The Ladders graph should contain a vertex for every word in the file and an edge between every pair of vertices that differ in exactly one position. To construct the graph, you should call the function addVertex on every word in the file and then the function addEdge for every edge you want to add. Finally, after constructing the Ladders graph, compute its clustering coefficient.