CS241 -- Lecture Notes: Graphs

Daisy Tang



This lecture covers Chapter 14 of our textbook.



Introduction to Graphs
Graphs are among the most general data structures, and they are also among the most commonly used.

Graph definitions:

  • A non-linear data structure consisting of nodes and links between nodes.

Undirected graph definition:

  • An undirected graph is a set of nodes and a set of links between the nodes.
  • Each node is called a vertex, each link is called an edge, and each edge connects two vertices.
  • The order of the two connected vertices is unimportant.
  • An undirected graph is a finite set of vertices together with a finite set of edges. Both sets may be empty, in which case the graph is called the empty graph.

In a drawing of a graph, the placement of the vertices and edges is unimportant.

A Motivating Example: Coin Game (Undirected State Graphs)

Rules:

  1. You may flip the middle coin whenever you want to.
  2. You may flip one of the end coins only if the other two coins are the same as each other.

Graphs in Problem Solving:

  • Often a problem can be represented as a graph, and the solution to the problem is obtained by solving a problem on the corresponding graph.

To solve the above problem, we are going to build an undirected state graph. See Figure 14.1 on page 703 of our textbook. Once we know the undirected state graph, the game becomes a problem of finding a path from one vertex to another, where the path is allowed only to follow edges.
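
The state graph for the three-coin game can also be built by a short program. The following is a minimal sketch, not the textbook's code. It assumes each state is encoded as an integer from 0 to 7 (bit 2 = left coin, bit 1 = middle coin, bit 0 = right coin, with 1 meaning heads); the class name CoinGame and this encoding are illustrative choices.

```java
// Hypothetical helper: builds the undirected state graph of the three-coin game.
// Assumed encoding: a state is an int 0..7; bit 2 = left coin, bit 1 = middle
// coin, bit 0 = right coin; a 1 bit means heads, a 0 bit means tails.
class CoinGame {
    static final int STATES = 8;

    static boolean[][] buildStateGraph() {
        boolean[][] adjacent = new boolean[STATES][STATES];
        for (int s = 0; s < STATES; s++) {
            int left = (s >> 2) & 1;
            int middle = (s >> 1) & 1;
            int right = s & 1;

            // Rule 1: the middle coin may always be flipped.
            adjacent[s][s ^ 0b010] = true;

            // Rule 2: an end coin may be flipped only if the other two coins match.
            if (middle == right) {
                adjacent[s][s ^ 0b100] = true; // flip the left coin
            }
            if (left == middle) {
                adjacent[s][s ^ 0b001] = true; // flip the right coin
            }
        }
        return adjacent;
    }

    public static void main(String[] args) {
        boolean[][] adjacent = buildStateGraph();
        // Print each state followed by the states reachable in one move.
        for (int s = 0; s < STATES; s++) {
            StringBuilder line = new StringBuilder(s + " ->");
            for (int t = 0; t < STATES; t++) {
                if (adjacent[s][t]) {
                    line.append(' ').append(t);
                }
            }
            System.out.println(line);
        }
    }
}
```

Every move in this game can be undone by repeating it, so the resulting matrix is symmetric. That is why an undirected graph is enough here.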

 

Directed Graphs
Definition:
  • A directed graph is a finite set of vertices together with a finite set of edges. Both sets may be empty, in which case the graph is called the empty graph.
  • Each edge is associated with two vertices, called its source and target vertices.
  • We say that the edge connects its source to its target.
  • The order of the two connected vertices is important.

One application of directed graphs is a state graph for a game where reversing a move is sometimes forbidden, such as tic-tac-toe.

More Graph Terminology:

  • Loop: an edge that connects a vertex to itself.
  • Path: a sequence of vertices, p_0, p_1, ..., p_m, such that each adjacent pair of vertices p_i and p_(i+1) is connected by an edge.
  • Cycle: a path that starts and ends at the same vertex, with no other repeated vertices or edges. A cycle in a directed graph is called a directed cycle.
  • Multiple edges: in principle, a graph can have two or more edges connecting the same two vertices in the same direction.
  • Simple graphs: the graphs that have no loops and no multiple edges. In fact, many applications require only simple directed graphs or even simple undirected graphs.
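
The definitions of loop and path translate directly into small checks. The sketch below is only an illustration: it assumes a directed graph stored as a square boolean array in which adjacent[i][j] is true when there is an edge from vertex i to vertex j. The class and method names are made up for this example. A boolean array cannot record multiple edges, so that case is not covered.

```java
// Hypothetical helpers illustrating the terminology above.
// Assumption: adjacent[i][j] is true when there is an edge from i to j.
class GraphTerms {
    // A loop is an edge that connects a vertex to itself.
    static boolean hasLoop(boolean[][] adjacent) {
        for (int v = 0; v < adjacent.length; v++) {
            if (adjacent[v][v]) {
                return true;
            }
        }
        return false;
    }

    // A sequence p[0], p[1], ..., p[m] is a path if every consecutive pair
    // is connected by an edge. A sequence of zero or one vertices is
    // treated as a (trivial) path here.
    static boolean isPath(boolean[][] adjacent, int[] p) {
        for (int i = 0; i + 1 < p.length; i++) {
            if (!adjacent[p[i]][p[i + 1]]) {
                return false;
            }
        }
        return true;
    }
}
```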

Airline Routing Example: Crocodile Airlines Routes

Questions:

  1. How many vertices and edges does the graph have? How many loops?
  2. Is it a simple graph? Why or why not?
  3. What is the shortest path from "Black Stump" to "Melbourne"? -- The shortest path problem.

Exercises:

Suppose we have 4 coins in the coin game.

Three rules:

  1. Either of the end coins may be flipped whenever you want to.
  2. A middle coin may be flipped from head to tail only if the coin to its immediate right is already heads.
  3. A middle coin may be flipped from tail to head only if the coin to its immediate left is already tails.

Your mission: Draw the directed state graph for this game and determine whether it is possible to go from the start to the goal. Why does the graph need to be directed?

 

Graph Implementations
Different kinds of graphs require different kinds of implementations, but the fundamental concepts of all graph implementations are similar. We'll look at several representations for one particular kind of graph: directed graphs in which loops are allowed.

Representing Graphs with an Adjacency Matrix

Definition:

  • An adjacency matrix is a square grid of true/false values that represent the edges of a graph.
  • If the graph contains n vertices, then the grid contains n rows and n columns.
  • For two vertex numbers i and j, the component at row i and column j is true if there is an edge from vertex i to vertex j; otherwise, the component is false.

We can use a two-dimensional array to store an adjacency matrix:

boolean[][] adjacent = new boolean[4][4];

Once the adjacency matrix has been set, an application can examine locations of the matrix to determine which edges are present and which are missing.
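
As a small illustration of setting and examining the matrix, here is a sketch for a 4-vertex graph. The particular edges and the names AdjacencyMatrixDemo, buildExample, and countEdgesFrom are invented for this example.

```java
class AdjacencyMatrixDemo {
    // Builds a 4-vertex example; the particular edges are made up for illustration.
    static boolean[][] buildExample() {
        boolean[][] adjacent = new boolean[4][4]; // every entry starts out false
        adjacent[0][1] = true; // edge from vertex 0 to vertex 1
        adjacent[0][2] = true; // edge from vertex 0 to vertex 2
        adjacent[1][3] = true; // edge from vertex 1 to vertex 3
        adjacent[3][3] = true; // loop at vertex 3
        return adjacent;
    }

    // Counts the edges whose source is vertex i by scanning row i.
    static int countEdgesFrom(boolean[][] adjacent, int i) {
        int count = 0;
        for (int j = 0; j < adjacent[i].length; j++) {
            if (adjacent[i][j]) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        boolean[][] adjacent = buildExample();
        System.out.println(adjacent[0][1]);              // prints true
        System.out.println(adjacent[1][0]);              // prints false
        System.out.println(countEdgesFrom(adjacent, 0)); // prints 2
    }
}
```

Because the graph is directed, adjacent[0][1] and adjacent[1][0] are independent entries.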

Representing Graphs with Edge Lists

Definition:

  • A directed graph with n vertices can be represented by n different linked lists.
  • List number i provides the connections for vertex i.
  • For each entry j in list number i, there is an edge from i to j.

Loops and multiple edges could be allowed.
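
One possible way to code the edge list representation is sketched below. It uses java.util.LinkedList in place of a hand-written linked list, and the class name EdgeListGraph and its methods are assumptions made for this example rather than the textbook's interface.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical edge-list graph: list number i holds the targets of the
// edges that leave vertex i. Loops and multiple edges are allowed.
class EdgeListGraph {
    private final List<LinkedList<Integer>> lists = new ArrayList<>();

    EdgeListGraph(int n) {
        for (int i = 0; i < n; i++) {
            lists.add(new LinkedList<>());
        }
    }

    // Appends target to list number source; duplicates are kept.
    void addEdge(int source, int target) {
        lists.get(source).add(target);
    }

    // Removes one occurrence of target from list number source, if present.
    // Integer.valueOf selects remove(Object) rather than remove(int index).
    void removeEdge(int source, int target) {
        lists.get(source).remove(Integer.valueOf(target));
    }

    // Scans list number source, so this can take time proportional to its length.
    boolean isEdge(int source, int target) {
        return lists.get(source).contains(target);
    }

    // Returns the list of targets for the edges leaving source.
    List<Integer> neighbors(int source) {
        return lists.get(source);
    }
}
```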

Representing Graphs with Edge Sets

To represent a graph with n vertices, we can declare an array of n sets of integers. For example:

IntSet[] connections = new IntSet[10]; // 10 vertices

A set such as connections[i] contains the vertex numbers of all the vertices to which vertex i is connected.
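
IntSet is not defined in these notes, so the sketch below substitutes java.util.HashSet<Integer> as a stand-in. That substitution, and the class name EdgeSetGraph, are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical edge-set graph: connections.get(i) holds the vertex numbers
// of all the vertices to which vertex i is connected.
// Assumption: HashSet<Integer> stands in for the IntSet class.
class EdgeSetGraph {
    private final List<Set<Integer>> connections = new ArrayList<>();

    EdgeSetGraph(int n) {
        for (int i = 0; i < n; i++) {
            connections.add(new HashSet<>());
        }
    }

    // Adding an edge that is already present has no effect,
    // so this representation cannot store multiple edges.
    void addEdge(int source, int target) {
        connections.get(source).add(target);
    }

    void removeEdge(int source, int target) {
        connections.get(source).remove(Integer.valueOf(target));
    }

    boolean isEdge(int source, int target) {
        return connections.get(source).contains(target);
    }

    // Number of distinct targets reachable by one edge from source.
    int countEdgesFrom(int source) {
        return connections.get(source).size();
    }
}
```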

Which Representation is Best?

If the space is available, then an adjacency matrix is easier to implement and is generally easier to use than edge lists or edge sets.

There are also other considerations:

  1. Adding or removing edges
  2. Checking whether a particular edge is present
  3. Iterating a loop that executes one time for each edge with a particular source vertex

Our observations:

  • Both (1) and (2) require only a small constant amount of time with the adjacency matrices.
  • Both (1) and (2) require O(n) operations with the edge list representation in the worst case (where n is the number of vertices).
  • With edge sets, both (1) and (2) might require O(n) operations -- which could be cut to O(log n) by using a fast set representation. Which one?
  • For (3), edge lists and edge sets take O(e) time, where e is the number of edges that have vertex i as their source. This is more efficient than the adjacency matrix, which takes O(n) time.

In general, your choice of representations should be based on your expectations as to which operations are most frequent. One last consideration is the average number of edges originating at a vertex. If each vertex has only a few edges (sparse graph), then an adjacency matrix is mostly wasted space filled with the value false.

 

Graph Implementation (Directed Graphs with no Multiple Edges)
public class Graph<E> {
    private boolean[][] edges; // edges[i][j] is true if there is an edge from i to j
    private E[] labels; // labels[i] contains the label for vertex i

    // Constructor: initialize a Graph with n vertices, no edges, and null labels
    public Graph(int n) {
        edges = new boolean[n][n]; // All values initially false
        labels = (E[]) new Object[n]; // All values initially null
    }

    // Add an edge
    public void addEdge(int source, int target) {
        edges[source][target] = true;
    }

    // Accessor method to get the label of a vertex of this Graph
    public E getLabel(int vertex) {
        return labels[vertex];
    }

    // Test whether an edge exists
    public boolean isEdge(int source, int target) {
        return edges[source][target];
    }

    // Obtain a list of neighbors of a specified vertex of this Graph
    public int[] neighbors(int vertex) {
        int i;
        int count = 0;
        int[] answer;

        for (i = 0; i < labels.length; i++) {
            if (edges[vertex][i])
                count++;
        }
        answer = new int[count];
        count = 0;
        for (i = 0; i < labels.length; i++) {
            if (edges[vertex][i])
                answer[count++] = i;
        }
        return answer;
    }

    // Remove an edge
    public void removeEdge(int source, int target) {
        edges[source][target] = false;
    }

    // Change the label of a vertex of this Graph
    public void setLabel(int vertex, E newLabel) {
        labels[vertex] = newLabel;
    }

    // Accessor method to determine the number of vertices in this Graph
    public int size() {
        return labels.length;
    }
}

Exercises:

  1. Write a new Graph method that interchanges two specified vertices. For example, after interchanging vertices i and j, the original neighbors of vertex i will now be neighbors of vertex j and vice versa.

Graph Traversals
There are two common ways of traversing a graph:
  • Breadth-first search uses a queue to keep track of vertices that still need to be visited.
  • Depth-first search uses a stack (it can also be implemented recursively).

The same underlying purpose:

  • To start at one vertex of a graph (the "start" vertex) and process the information contained at that vertex.
  • Move along an edge to process a neighbor.
  • When the traversal finishes, all of the vertices that can be reached from the start vertex have been processed.

Note: a traversal processes only those vertices that can be reached from the start vertex.

Note: the algorithm must not enter a repetitive cycle. To prevent this, the algorithm needs to mark each vertex as it is processed.

Depth-First Search (DFS)

  • Expand the deepest unexpanded (unmarked) node.
  • Implementation: stack or recursion.
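
A recursive depth-first traversal might look like the sketch below. It is one possible version, not the textbook's: it assumes the graph is given as a boolean adjacency matrix, tries neighbors in increasing vertex number, and records the visiting order in a list. The names DepthFirst, traverse, and visit are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical recursive DFS over a boolean adjacency matrix.
class DepthFirst {
    // Returns the vertices reachable from start, in the order they are processed.
    static List<Integer> traverse(boolean[][] adjacent, int start) {
        boolean[] marked = new boolean[adjacent.length]; // all false initially
        List<Integer> order = new ArrayList<>();
        visit(adjacent, start, marked, order);
        return order;
    }

    private static void visit(boolean[][] adjacent, int v,
                              boolean[] marked, List<Integer> order) {
        marked[v] = true; // mark first so a cycle cannot bring us back here
        order.add(v);     // "process" the vertex
        for (int next = 0; next < adjacent.length; next++) {
            if (adjacent[v][next] && !marked[next]) {
                visit(adjacent, next, marked, order); // go deeper before trying siblings
            }
        }
    }
}
```

The marked array is what keeps the traversal from entering a repetitive cycle, and the recursion's call stack plays the role of the explicit stack.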

Breadth-First Search (BFS)

  • Expand the shallowest unexpanded (unmarked) node.
  • Implementation: queue.
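
A queue-based breadth-first traversal can be sketched as follows. As with any example here, the details are assumptions: the graph is a boolean adjacency matrix, neighbors are tried in increasing vertex number, java.util.ArrayDeque serves as the queue, and the names BreadthFirst and traverse are invented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical BFS over a boolean adjacency matrix.
class BreadthFirst {
    // Returns the vertices reachable from start, in the order they are processed.
    static List<Integer> traverse(boolean[][] adjacent, int start) {
        boolean[] marked = new boolean[adjacent.length]; // all false initially
        List<Integer> order = new ArrayList<>();
        Queue<Integer> queue = new ArrayDeque<>();

        marked[start] = true;
        queue.add(start);
        while (!queue.isEmpty()) {
            int v = queue.remove(); // the shallowest vertex still waiting
            order.add(v);           // "process" the vertex
            for (int next = 0; next < adjacent.length; next++) {
                if (adjacent[v][next] && !marked[next]) {
                    marked[next] = true; // mark when enqueued so it is queued only once
                    queue.add(next);
                }
            }
        }
        return order;
    }
}
```

Marking a vertex at the moment it is added to the queue, rather than when it is removed, ensures that no vertex is queued twice.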

Exercises:

  1. Do a DFS and BFS of the Australia graph starting at Sydney. List the order in which the cities are visited.
  2. What kind of search occurs if you replace BFS's queue with a stack?
  3. Implement DFS using recursive and non-recursive methods, and BFS with a non-recursive method.

DFS can be used for traversing or searching a tree. One starts at the root (selecting some node as the root in the graph case) and explores as far as possible along each branch before backtracking. Formally, DFS progresses by expanding the first child node of the search tree that appears and thus going deeper and deeper until a goal node is found, or until it hits a node that has no children. Then the search backtracks, returning to the most recent node it hasn't finished exploring.

Similarly, BFS can also be used for traversing or searching a tree. The BFS begins at a root node and searches all the children nodes. Then for each of these children nodes in turn, it inspects their children which were not visited, and so on. The process will continue until a goal node is found or it hits nodes with no children.

 

Learning Objectives
After studying this chapter, you should be able to:
  • follow and explain graph-based algorithms using the usual CS terminology.
  • design and implement classes for graphs.
  • list the order in which nodes are visited for BFS and DFS and implement these algorithms.

Last updated: Oct. 2012