People always complaint about they can’t come up with the right solution.
However good approach doesn’t come from nowhere. If you analyze the problem in the right direction, those smart answers should come to your mind naturally.
We’ve been seeing to many candidates struggling to come up with a solution in Gainlo interviews. That’s why in this post, I want to talk about something interesting and different.
We’ve been elaborating a lot of practical strategies for both preparation and interview in previous posts, now I’d like to use an example interview question and explain how a candidate should solve coding problems step by step instead of telling you the final answer directly.
The cool thing about it is that you can deeply understand how an interview question is solved from the beginning to the end and what to think of when getting stuck. In other words, I want you to be able to use techniques in this post to solve other questions as well.
There are a set of dictionary words and a set of license plate numbers. Write a code/algorithm to find the shortest dictionary word which contains all the characters in the license plate. Ex: RC101 is the license plate number. The shortest word that can be found in the dictionary is CAR.
This is a popular question that has been asked by many companies and I’d like to use this question as an example to guide you how to solve it in a real interview.
Most interview questions are described shortly and there can be some restrictions that are unclear. So the first thing you should do is to clarify anything unclear, which is also what many interviewers evaluate.
Look at the example question, there are many things you should clarify in the beginning:
- How is the dictionary stored? Suppose the interviewer said you can store in any way you like.
- What about duplicates letters in the plate? Ignore them.
- Should the letters order be preserved? No.
- Is there enough memory to store the whole dictionary? Yes.
- Can I assume that the plate contains only digit and letters? Yes.
- Are letters in dictionary and plate all capital? Yes.
You may not be able to come up with all these questions, but the first three are expected to be clarified in the beginning. You should also realize that digits in the license can be ignored totally.
#2 Simplest solution
It’s good if you can come up with the optimal solution after 10min, but not everyone can do it and in fact, only a few people can do it.
A better and safer approach is always talking about whatever solution you have in the beginning, and you can analyze the efficiency to tell the interviewer that you know it’s not optimal and you will optimize it next.
This strategy is very important because it tells the interviewer that first you can solve this problem quickly and correctly. Second, you know that your approach is naive and you are working on optimizing it. Lastly, even if you failed to come up with a better approach, the interviewer can say that this candidate, at least, can solve the problem fast.
So back to the example question, brute force is usually something you can come up with first. It should be very easy for you to realize that you can iterate over each word in the dictionary and check if it contains all the letters in the license.
If there are n words in the dictionary, then the space complexity is O(n) and time complexity is O(n) for each license (you should be able to come up with this in seconds).
In most cases brute force won’t be the best solution in an interview as it’s too simple. So now you should try to optimize the solution.
Normally you can make your algorithm faster (time complexity), consumes less memory (space complexity) or both, and that’s why it’s very important to have the big-O analysis of your existing solution, which gives you some ideas about how to improve.
Apparently you can hardly reduce space complexity as you always need to store the dictionary unless you put it in disk, which will be extremely slow to lookup. So our focus is to reduce time complexity, especially the lookup time.
The reason the previous solution is slow is that we are iterating over the dictionary one by one. To optimize this, we can sort the dictionary by word length, then we can iterate from the shortest word and stop whenever there’s a match.
Although the time complexity is still O(n) if you count the worst scenario and ignore the preprocess time, the algorithm is obviously faster in average. This is also a common pattern that by doing preprocessing, you can reduce the lookup time.
Another direction we can think of is the classic tradeoff between time and memory.
Even if you’ve no idea about the solution, you should think about how to consume more memory in order to reduce lookup time. More specifically, we need to figure out how to store the dictionary or which data structure to use to make the lookup faster.
At this point, I hope something like hash, tree has came into your mind. I suggest you read the analysis above one more time as it tells you exactly how to analyze from the high level and explains why someone can come up with the correct direction.
So let’s see if hash can make a difference. Since we are looking up certain word that contains all the given letters, it’s quite natural to make a HashTable keyed by letter and its value can be a list of words containing that letter.
Then the solution is clearer now. For each letter in the given license, we can get a list of words containing that letter. And for all these lists, we just need to find the shortest common word among them. If at this point you haven’t thought of the classic problem finding intersections between two arrays, you need to practice with more coding questions.
To sum up, this solution is preprocessing the dictionary into a HashTable keyed by letter and valued by sorted words containing that letter.
As mentioned above that you should think about how to store the dictionary to faster lookup. Another approach is to store the dictionary in a prefix tree. I’ll leave this solution as a practice for you. If you couldn’t figure it out, please drop me an email.
#4 Extreme cases
Sometimes when you change restrictions to an extreme degree, you may come up with other solutions. This doesn’t always work, but it’s worth to try.
The most common approach is assuming we have infinite memory to store whatever we want. For this question, we can use any of the above approaches to get shortest words for ALL the license plates and store the results in a huge hash table. Then the lookup time is reduced to O(1)!
It’s important to analyze here that we assume license plate doesn’t contain too many characters (maybe at most 7) and we have enough memory to store all the permutations and the corresponding words.
The question is really just an example and the most important takeaway from this post is the general process that you can use to analyze and solve a coding question. Solutions don’t come from nowhere, when you choose a direction to approach, it should be quite natural to think about the relevant techniques you have. Just like when you decide to spend more memory to reduce lookup time, hash/trees should come to your mind immediately.
Also, you’ll notice that I’d like to analyze the efficiency (both time and space) immediately after a solution without even being asked to, which gives me clear idea about what can be further optimized and how much potential it has.
Did you identify any other patterns?