I walk the (train) line – part quatre – Tell me why
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
(TL;DR: Dijkstra’s algorithm is correct. Mathematical proof inside <3<3<3)
LaTeX and MathJax warning for those viewing my feed: please view directly on website!
This is part four of our epic. Find the other parts here:
Last time, we learnt about Dijkstra’s algorithm and implemented it in R. We found the shortest path from Newtown Station to Small Bar.
But this is not enough for me! I’m obsessed with learning the details of how things work. Merely implementing Dijkstra’s algorithm would leave me feeling empty.
Let’s prove that Dijkstra’s algorithm works!
Proving the “correctness” of Dijkstra’s algorithm
I’ll be analysing the proof found on page 597 of Cormen et al, Introduction to Algorithms, Second Edition. Excellent book – please get it for yourself!
What are we proving?
To quote directly from page 597 of the book:
Dijkstra’s algorithm, run on a weighted, directed graph with non- negative weight function and source , terminates with for all vertices
What in the above quote do we already understand from our knowledge gained from the earlier posts?
- We know what weighted, directed graphs are from my previous posts.
- We also know that a graph is made up of a set of vertices and a set of edges .
- We know what a source node is.
Let’s define the stuff that we have not yet encountered:
- We don’t know what a weight function is. It’s simply some mathematical function that takes in the set of edges and maps them to real numbers (the weights). More formally, . In our case, we want non-negative weights only.
- is the distance that Dijkstra’s algorithm has calculated from our source node to some vertex in our set of vertices .
- is the actual shortest path distance between our source node and some vertex in our graph.
Now let’s translate the initial quote!
As we mentioned last time, Dijkstra solves the single source shortest path problem. Let’s assume that we know the true shortest distances from our source node to all other nodes in our graph (including the source node itself). Then, to prove that Dijkstra’s algorithm is “correct”, we need to show that when the algorithm terminates, the distances found by Dijkstra from our source node to all other nodes in our graph (including the source node itself) are equal to the “true” shortest distances.
Good!
How do we prove this? We focus on the ‘while’ loop!
To prove the correctness of Dijkstra, the authors focus on the big while loop in Dijkstra. Here are some key lines from our implementation in GitHub from last time:
while (nodes_left_to_process(vec_of_nodes)) { ... vec_of_nodes <- remove_min_dist_node(vec_of_nodes, min_dist_node_id) }
In the book, the authors explicitly maintain a set of nodes that
have been processed by Dijkstra. The while
loop terminates when this
set (that is, we have processed all nodes in our graph). In
our implementation, we instead maintained a set of nodes that haven’t
been processed yet (vec_of_nodes
). Our loop continues until this set
is empty (i.e. when we have processed all nodes in our graph). I will
state everything the authors’ write in terms of our implementation.
If you’re following along with the book, keep this in mind!
The authors use what is called a loop invariant to prove Dijkstra’s correctness. These proofs are broken down into three phases. From page 18 of the book, those three phases are these:
Initialisation: It is true prior to the first iteration of the loop.
Maintenance: If it is true before an iteration of the loop, it remains true before the next iteration.
Termination: When the loop terminates, the invariant gives us a useful property that helps show that the algorithm is correct.
The initialisation phase is like the base case in an induction proof. The maintenance phase is like the inductive step. The termination phase is something different. Unlike in an induction proof, we don’t continue indefinitely. Our algorithm terminates!
The authors use this specific loop invariant:
At the start of each iteration of the while loop … for each vertex … (which we have finished processing.)
Time to ATAAAAACK!
Structure of the proof
Let’s tackle each part!
Initialisation
Initially, we have not processed any vertices. So our invariant is trivially true! We are done with this step.
Maintenance
This is where the bulk of the proof is.
Let’s show that when we are done processing our vertex , . We don’t go about this directly. We proceed to show this indirectly via a proof by contradiction. We assume something is true. We start to reason and arrive at some contradiction. This contradiction means that our original assumption cannot be true. Therefore, the opposite must be true!
If this is the first time you’ve seen a proof by contradiction, this might be very confusing. If so, see this nice article.
Let’s do this!
Let’s start with an assumption
This is the assumption that we will make which will later give rise to a contradiction:
There is some vertex that we have just finished processing which is the first vertex for which .
If we cast our minds back to last time, Dijkstra begins by first processing our source node, . We set the distance from our source node to our source node to zero in this little function:
initialise_distances_vector <- function(distances_vector, source_node_id) { distances_vector[[source_node_id]] <- 0 return(distances_vector) }
Therefore, (the distance found by Dijkstra from our source node to our source node) is equal to (the true distance from our source node to our source node). They are the same node, so the distance is clearly zero!
So we have our first observation - this mysterious vertex cannot be our source node. That is, .
Next, we reason that there must be some path from our source node
to our node . Remember the LARGE_NUMBER
constant we defined in
our implementation of Dijkstra?
LARGE_NUMBER <- 999999
We set all distances in our vector of distances to this LARGE_NUMBER
in this function:
create_distances_vec <- function(osm_graph, source_node_id) { num_nodes <- vcount(osm_graph) distances_vec <- rep(LARGE_NUMBER, num_nodes) ... }
In the book, the authors initialise all distances by setting them to
. They are writing a proof in the abstract world of maths. Our
job is to implement the algorithm in an R program. Our choice to use a
LARGE_NUMBER
constant of 999,999 metres for our little map of a part
of Sydney has the same practical effect. However, as we are
mathematically proving the properties of our algorithm, let’s talk in
terms of .
The authors reason that there must be some path from to , because if there was no path at all, . That is, the distance found by Dijkstra would remain as the value we initialised it with, and therefore, our algorithm would have found the “correct” distance from to (which happens to be because there is no path in our graph from to ). If our algorithm found the correct distance between and , then this would have violated our initial assumption that .
We are safe to proceed! We have our second observation - there is some path between and . And because there is “some path”, there must exist a shortest path!
To explain the next part, let’s refer to this image which can be found on p597 of the book:
This image shows us some path , which connects our source node to our vertex . The dark blob named is the set of nodes that have already been processed.
We have a vertex which is the first vertex along the path that has not been processed yet (notice how it is outside of the dark blob ). is the predecessor on this path (i.e. the vertex directly before ), and has already been processed by our algorithm.
The authors reason that because is the first vertex for which our algorithm fails to find the actual shortest path, when was processed, Dijkstra must have found the correct shortest path. That is, . When we processed , all edges out of were relaxed. These are the relevant lines in our implementation:
for (i in seq_along(adjacent_to_min_dist_node)) { ... if (current_distance_to + distance_to_adj_node < curr_known_dist_to_adj_node) { vec_of_distances[[adjacent_node_id]] <- current_distance_to + distance_to_adj_node vec_of_predecessors[[adjacent_node_id]] = min_dist_node_id } ... }
This means that the algorithm found the correct shortest path distance to ! That is, we have .
Let’s contradict our assumption!
We started with an initial assumption that our algorithm found the wrong shortest path distance for our vertex . That is, . We assumed that was the first vertex where this had happened. We will now contradict our assumption and show that in fact !
The vertex occurs prior to on some shortest path. Therefore, the shortest path distance between our source and must be less than or equal to our distance from our source node to . Why less than or equal to? Because and may be the same node. So . This means that:
The last bit is relevant because Dijkstra relaxes edge
weights downward from our initial LARGE_NUMBER
value and continues to
decrease the edge weights until their reach their lower bounds. If
is the lower bound, then the smallest edge weight
that the algorithm can find is .
The authors then continue - both and were unprocessed when our algorithm chose to process in its while loop. So that means, and that:
What??? Did you see that???
CONTRADICTION!!!
This means that when our algorithm is done processing , we have in fact found the actual shortest path between our source and and that this shortest distance is maintained through to when the algorithm terminates.
Hooray! Our maintenance step is complete.
Termination
This step is simple! Once our algorithm terminates, we have finished processing all vertices. That is, our vector of nodes left to process is empty and our loop stops:
while (nodes_left_to_process(vec_of_nodes)) { ... }
This termination phase along with the maintenance phase above show that Dijkstra’s algorithm finds the shortest path distance for all vertices (where is the set of vertices in our graph).
Next time
Mathematics can be difficult…but I see so much beauty in it! The way the steps in mathematical proofs flow is breathtaking.
Next time, I will finally write about how I came up with different routes to walk to work! It involves a little bit of geometry 😉
Justin
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.