Shuo Wang
2 min readNov 20, 2019

Special Cases and Generality

One of the things I have always been fascinated by computer science is that there are many simple algorithms that you can come up with on the fly, and it turns out it’s complexity is optimal, but you feel something is amiss. Then later on you encounter the same problem and realize that the simple solution you came up with was actually a special case of a much more general approach, and it can be disguised so well that you don’t even notice the fact that it is.

The Disjoint Set Union problem is one such case.

First, a brief explanation of the problem:

Suppose I have N nodes:

1, 2, 3, …, N

and they are all connected, no direction.

1-2-3-4-…-N

(not necessarily in order, there is no order here, 1-3-2…)

Naturally there are N-1 edges.

If I add one more edge: 1-3,

it’ll be redundant, so the question is how to detect the redundant edge.

in this case, it’s acceptable to say the redundant edge is 1–3, or 1–2, or 2–3. It doesn’t matter which one you mark as redundant, you just have to mark one of them.

Simple Solution

This is the simple solution I came up with:

First, loop through all of the edges, put the two nodes in each edge into a set.

Second, loop though each edge again, remove the nodes that are in each edge, when you enter an edge where both nodes are missing, then you have encountered the redundant edge.

In terms of time complexity, it is O(N), since you are looping through N edges twice.

In terms of space, it is also O(N), since you only ever store N nodes.

General Solution

So for this problem, the general solution is to employ the Disjoint Set data structure with union find algorithm, and the tutorial for that can be found on leetcode.

https://leetcode.com/articles/redundant-connection/

As you can see, it’s quite different from the simple solution I came up with. But on second though, it’s really the same solution.

If you start out by assuming there is only one tree, which is reasonable to assume in this particular problem, then when I loop through each edge and remove nodes, I am essentially adding those edges and nodes to the set, and what’s remaining in the set in the algorithm are the nodes not added to the tree.

Eventually when we reach the duplicate edge we are checking whether that edge is already in the disjoint set tree by checking if both nodes are missing, this corresponds to the match call in the disjoin set general solution.

Conclusion

Specific cases can sometimes be very well hidden and not obvious how to generalize them, this is true to many real world problems, and it feels so good when you figure it out!

Shuo Wang
Shuo Wang

Written by Shuo Wang

Interesting pieces on various topics in finance and technology.

No responses yet