Internal representation

Choosing a representation for the data is a matter of trade-offs.

To choose a good representation for the data, we always need to know:

What does the data look like — how big is the data and how large is each part relative to the others;
What operations do we need to perform — how often is each operation performed.

With respect to the graph size, for a graph with n vertices and m edges, we have dense graphs, where m = Θ(n^2), sparse graphs, where m = O(n), and some intermediate graphs.

In dense graphs, the degrees of most vertices are of the order of Θ(n). In sparse graphs, the degrees of most vertices are small (O(1)), but we can sometimes have a few vertices of very high degree. For instance, the graph corresponding to a road network is a sparse graph. If we represent intersections by vertices, each vertex (intersection) usually has 3 or 4 neighbours (out of the hundreds of millions of intersections in the world).

Typically, we have the following operations to be performed:

Given vertices x and y, test if (x,y) is an edge
Given a vertex x, parse (that is, iterate over) the set Nout(x) of outbound neighbours of x
Given a vertex x, parse the set Nin(x) of inbound neighbours of x
Parse the set of vertices of the graph

Adjacency matrix

We have an n×n matrix with 0-1 or true-false values, defined as: a(x,y) = 1 if there is an edge from x to y, and 0 otherwise.

Memory: Θ(n^2)

Test edge: O(1)

Parse neighbours: Θ(n)
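As an illustration of the definition and costs above, here is a minimal sketch in Python. It assumes the vertices are numbered 0 to n-1; the class name AdjacencyMatrixGraph and its method names are hypothetical, chosen only for this example.

```python
class AdjacencyMatrixGraph:
    """Directed graph on vertices 0..n-1, stored as an n x n boolean matrix."""

    def __init__(self, n):
        self.n = n
        # a[x][y] is True exactly when there is an edge from x to y.
        # The matrix always holds n * n entries: Theta(n^2) memory.
        self.a = [[False] * n for _ in range(n)]

    def add_edge(self, x, y):
        self.a[x][y] = True

    def is_edge(self, x, y):
        # O(1): a single matrix lookup.
        return self.a[x][y]

    def out_neighbours(self, x):
        # Theta(n): scans the whole row x, whatever the degree of x is.
        return [y for y in range(self.n) if self.a[x][y]]

    def in_neighbours(self, y):
        # Theta(n): scans the whole column y.
        return [x for x in range(self.n) if self.a[x][y]]

    def vertices(self):
        return range(self.n)


g = AdjacencyMatrixGraph(4)
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(3, 0)
print(g.is_edge(0, 1))      # True
print(g.is_edge(1, 0))      # False
print(g.out_neighbours(0))  # [1, 2]
print(g.in_neighbours(0))   # [3]
```

Note that out_neighbours visits all n entries of a row even when the vertex has only a handful of neighbours, which is why this representation suits dense graphs better than sparse ones.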

Summary:

The adjacency matrix is good for dense graphs, but bad for sparse graphs. Imagine a graph with 10^8 vertices and 4×10^8 edges: its adjacency matrix would occupy 10^16 bits (1.25×10^15 bytes, on the order of 1000 TB).
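The arithmetic behind this estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes one bit per matrix entry and takes 1 TB to mean 10^12 bytes; the function name matrix_bits is hypothetical.

```python
def matrix_bits(n):
    """Bits needed for an n x n adjacency matrix at one bit per entry."""
    return n * n


n = 10**8       # vertices
m = 4 * 10**8   # edges

bits = matrix_bits(n)
terabytes = bits / 8 / 10**12  # 8 bits per byte, 10^12 bytes per TB

print(bits == 10**16)  # True
print(terabytes)       # 1250.0
print(bits // m)       # 25000000 (matrix bits spent per edge actually present)
```

Almost all of those bits are zeros: the matrix spends 25 million bits for every edge that actually exists.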

List of edges