Given a connected weighted graph G, it is often desired to create a spanning tree T for G such that the sum of the weights of the tree edges in T is as small as possible. Such a tree is called as minimum spanning tree and represents the most inexpensive way of connecting all the nodes in G. There are a number of techniques for creating a minimum spanning tree for a weighted graph. The first of these, Prim’s algorithm, discovered independently by Prim and Dijkstra, is very much like Dijkstra’s algorithm for finding shortest paths. An arbitrary node is chosen initially as the tree root (note that in an undirected graph and its spanning tree, any node can be considered the tree root and the nodes adjacent to it as its sons). The nodes of the graph are then appended to the tree one at a time until all nodes of the graph are included.

The node of the graph added to the tree at each point is that node adjacent to a node of the tree by an arc of minimum weight. The arc of minimum weight becomes a tree arc connecting the new node to the tree. When all the nodes of the graph have been added to the tree, a minimum spanning tree has been constructed for the graph.

To see that this technique creates a minimum spanning tree, consider a minimum spanning tree T for the graph and consider the partial tree PT built by Prim’s algorithm.

Suppose that (a, b) is the minimum-cost arc from nodes in PT to nodes not in PT, and suppose that (a, b) is not in T. Then, since there is a path between any two graph nodes in a spanning tree, there must be an alternate path between a and b in T that does not include arc (a, b). This alternate path P must include an arc (x, y) from a node in PT to a node outside of PT. Let us assume that P contains sub-paths between a and x and between y and b.

Now consider what would happen if we replaced arc (x, y) in T with (a, b) to create NT. We claim that NT is also a spanning tree. To prove this, we need to show two things: that any two nodes of the graph are connected in NT and that NT does not contain a cycle, that is, that there is only one path between any two nodes in NT.

Since T is a spanning tree, any two nodes, m and n, are connected in T. If the path between them in T does not contain (x, y), the same path connects them in NT. If the path between them in T does contain (x, y), consider the path in NT formed by the sub-path in T from m to x, the sub-path in P (which is in T) from x to a, the arc (a, b), the sub-path in P from b to y, and the sub-path in T from y to n. This is a path from m to n in NT. Thus any two nodes of the graph are connected in NT.

To show that NT does not contain a cycle, suppose that it did. If the cycle does not contain (a, b) then the same cycle would exist in T. But that is impossible, since T is a spanning tree. Thus the cycle must contain (a, b). Now consider the same cycle with arc (a, b) replaced with the sub-path of P between a and x, the arc (x, y), and the sub-path in P between y and b. The resulting path must also be a cycle and is a path entirely in T. But, again, T cannot contain a cycle. Therefore NT also does not contain a cycle.

NT has thus been shown to be a spanning tree. But NT must have lower cost than T, since (a, b) was chosen to have lower cost than (x, y). Thus it is not a minimum spanning tree unless it includes the lowest weight arc from PT to nodes outside PT. Therefore any arc added by Prim’s algorithm must be part of a minimum spanning tree.

The crux of the algorithm is a method for efficient determination of the “closest” node to a partial spanning tree. Initially, when the partial tree consists of a single root node, the distance of any other node n from the tree, distance[n], is equal to weight(root ,n). When a new node, current, is added to the tree, distance[n] is modified to the minimum of distance[n] and weight(current, n). The node added to the tree at each point is the node whose distance is lowest. For nodes m in te tree, distance[m] is set to infinity, so that a node outside the tree is chosen as closest. An additional array closest[n] points to the node in the tree such that distance[n] = weight(closest[n], n); that is, the node in the tree closest to n. If two nodes, x and y, are not adjacent, weight(x, y) is also infinity.

C++ Implementation: Program for creating minimum spanning tree using Prim’s algorithm.