Tree problems are so popular recently that we’ve seen so many candidates have been asked about it by companies like Google, Facebook, Microsoft and so on.

On second thought, this makes a lot of sense. Tree is one of the most useful and fundamental data structures in real products. For instance, tree structure is widely used in machine learning like decision trees. What’s more, tree related interview questions can cover a lot of topics like iteration and recursion.

This week, I’m going to talk about LCA (Lowest Common Ancestor) problem with some extensions. Again, the answer itself is not important and what this post focuses on is how to come up with the right idea and how to analyze the problem from scratch. I’ll cover topics including recursion, big-O analysis, iteration and so on.

## Question

*Given a binary tree and two nodes, how to find the common ancestor of the two nodes?*

In the above example, for nodes 5 and 4, the lowest common ancestor is 1. And for nodes 6 and 2, the ancestor is 0. The question has been asked by both Google and Microsoft recently.

## Analysis

Think about the problem by yourself before reading the rest of the post.

Here we go. IMHO, tree problems are relatively easier compared to other data structures in coding interviews. The reason is that there are clear patterns to solve tree problems and they are related to the basic concepts. Let me explain this in detail.

If you have read our previous post – Deepest Node In a Tree, you should know that generally there are two basic concepts about tree problems:

- Traverse. You should be very familiar with traversing a tree. Things like in-order traversal, BFS should come to your mind immediately.
- Recursion. Since it’s very easy to divide a tree problem to subproblems (left and right sub-trees), recursion is one of the most common techniques.

Let’s see how these two concepts can be used to solve LCA.

## Traverse

It should be already clear to you if you know traversal can be used. Let’s say for nodes A and B, it should be very easy to get the path from the root to the nodes. I’ll skip the discussion about how to get the path as it should be easy for you.

Once you have two paths – root to A and root to B, you can just iterate over the two paths simultaneously and the last common node is the lowest common ancestor. Some people find it hard to think about getting the path, that is because they are not familiar with traversal. Once tree traversal has become your basic tools, everything comes naturally.

What is the complexity of the algorithm? To get two paths, we need to traverse the tree twice. Finding the common node also requires one iteration. So the time complexity is O(N). Space complexity is also O(N) as we need extra space to store the paths.

## Recursion

The disadvantage of the previous solution is that it needs to iterate 3 times with extra space. Let’s see if we can make it better.

To use recursion, you need to figure out two things: 1. What is the end point? 2. How to combine sub-problem solutions to solve the bigger problem?

First, we can get LCA from both left tree and right tree (if exists). If the either of A or B is the root node, then the root is the LCA and we just return the root, which is the end point of the recursion. As we keep divide the tree into sub-trees, eventually, we’ll hit either A and B.

To combine sub-problem solutions, if LCA(left tree) returns a node, we know that both A and B locate in left tree and the returned node is the final result. If both LCA(left) and LCA(right) return non-empty nodes, it means A and B are in left and right tree respectively. In this case, the root node is the lowest common node.

You might be confused why it’s possible for both left and right return non-empty nodes. This is because we assume that A and B must exist in the tree so that in any of the subproblem whose root is A or B, we just return the root.

Since we traverse all nodes at most once without external space, both time and space complexity is O(N).

## Follow-up

*What if each node has a parent pointer that points to its parent?*

In fact, this is an even simpler problem. With the parent pointer, you can get the path from A/B to the root. Then we can easily get the LCA as the first solution.

However, the time complexity here is O(h) where h is the height of the tree. This is because to get the path to the root, you don’t need to traverse the whole tree as before. Similarly, the space complexity is also O(h).

## Takeaways

I believe that with more posts about tree interview questions, you are more familiar with this data structure.

- Basic concepts like traversal. As you can see, once you are familiar with these basic techniques, it’s quite easy to come up with the right idea. If you find yourself confused about particular concepts, do go back and review the textbook.
- Recursion. We’ve been using recursion to solve tree problems for so many times.

The post is written by
**
Gainlo** - a platform that allows you to have mock interviews with employees from Google, Amazon etc..

I'd like to learn more

You can get the solution in O (h) when working from the root. Use a modified version of the “find” method to find both nodes at once. Assuming that both nodes are in the same tree, the rout will always be an ancestor.

From the root node, check to see if either node is a reference to the root node. Return it if it is, otherwise recurse to see if they are both on the same child path as each other. If they are, keep recursing. If not, then we’ve found the LCA.

My solution operates in O (h) time without any space overhead.

The time complexity of the above solution is O(n) and not O(h), as the method does a simple tree traversal in bottom up fashion

Yes, Rick Mac Gillis probably thought that it is a binary search tree, in which case lookup in O(log n) = O(h) would be possible in case the tree is balanced.

Hi,

I think there is a linear solution: http://ideone.com/Y1tgvJ

In the analysis of “Traverse”, it says ” To get two paths, we need to traverse the tree twice. Finding the common node also requires one iteration. So the time complexity is O(N). Space complexity is also O(N) as we need extra space to store the paths.”

Once traverse is enough to find the two paths, no need to traverse the tree twice. The time complexity is still O(n) though.

The space complexity to store the patches should be the height of the tree, O(h)

Another O(h) solution using recursion.

Start from root.

if(root is null) return not found.

If value at root is between the min and max, return value at root.

else if value at root is > min and max both, move root to left

else if value at root is < min and max both , move root to right

Time complexity = O(h) where h is the height of the ancestor node.

Sample code below. Please review for any error.

int tree::findlca(node *root, int min, int max){

if(root == NULL)

return INT_MIN; //basically not found

if(min data && root->data data;

else if(min data && max data)

return findlca(root->left, min, max);

else

return findlca(root->right, min, max);

}

Never mind, my solution is for BST instead of BT.