Software Engineer Interview - Chapter 4.2: Whiteboard programming interview

What is a whiteboard interview?

A whiteboard interview is one where you are given a technical question and are asked to write code in a programming language of your choice. In the process, you interact with the interviewer, who will try to gauge your analytical ability in combining various data structures and algorithms to solve the problem. It's called a whiteboard interview because, traditionally, the candidate was expected to write the code on a whiteboard in the interview room. With the rise of virtual interviews, this has moved to some kind of webpage where both the candidate and the interviewer can work on the same document, such as a HackerRank test or a shared Google document.

What to expect at a whiteboard interview?

The interview begins with the interviewer giving you a problem statement for which you need to find an answer and code a solution. You are expected to talk through your thought process aloud: clarify with the interviewer what information you identified as given in the problem statement and what it asks for, and, based on that, which data structure(s) and algorithm(s) you plan to use and how you plan to use them. If anything in the problem statement rules out other data structures or algorithms, discuss that aloud as well. If you are having a hard time figuring out the solution, the interviewer will ask probing questions to understand your thought process and will guide you toward an insight that can be used to solve the problem. On the other hand, if you reach an optimal solution on your first attempt, the interviewer may still ask probing questions to verify that you solved the question through skill and not sheer luck. Also note that the problem statement is deliberately worded to not be very detailed. You are expected to work with the interviewer to clarify the inputs coming in, the expected output, the edge cases that can occur and how to handle them. The goal is to get a glimpse of how you interact with other developers to clarify what is expected of the code before you start working on it.

For writing the code, the candidate is allowed to use any language of their choice. Note that the code you write during the interview won't actually be executed. Nonetheless, it is expected that, minor compile errors aside, the code looks the way code in that language is expected to look and, more importantly, that it achieves the goal of the problem statement. The reason for having you write real code and not just pseudo-code is to verify your claim of having working knowledge of a programming language: if you claim to know a language and have worked in it for some time, then you should know enough of its syntax to write basic code. To verify that the code achieves the desired goal, you should manually trace the flow of execution for some test input and check that the output matches the expectation.

Lastly, realize that a whiteboard interview may contain multiple questions. An interviewer may start by asking you to solve a simple problem and, once it is solved, change the premise of the question to increase the difficulty. Maybe the question is modified to accept a more general form of input, or to apply different processing to the input, or both. There are a few goals here: (1) Starting with a simple question helps the candidate settle into the interview. Starting directly with a complex question might increase the candidate's interview anxiety and make them perform below what they are capable of, which can cause a company to miss out on a good candidate. (2) The simple question sets up the context of the problem, which makes it easier to then introduce a more complex one. (3) Evaluating a candidate on a more complex problem helps gauge their technical level, i.e., whether the candidate's skills are at a beginner or senior level, and whether their understanding of data structures and algorithms is basic or advanced. This helps the human resources team identify a suitable position and a competitive compensation level, so that the company neither loses the candidate by offering too little nor hires a candidate whose skills are low relative to the compensation offered.

Preparations / Things to consider

This section covers some of the preparations / things-to-consider for the whiteboard interview.

On-site vs virtual interview

The general layout and expectations for a whiteboard interview remain the same whether it is conducted on-site or virtually. However, there can be some differences between the two settings. In a whiteboard interview, while explaining your thought process to the interviewer, or while demonstrating the behavior of your code for some input, you may want to draw a diagram. For example, you may want to draw a binary tree or an example of a graph. This is easy on a physical whiteboard. In a remote interview, however, drawing diagrams with a mouse is much harder than drawing on a whiteboard. Additionally, if the interviewer doesn't provide a test platform where images can be easily drawn (like a shared Google document), then image-based communication gets even harder. To mitigate this issue, when participating in a virtual whiteboard interview, I would suggest keeping a notepad and a pen handy. You can quickly draw the image in your notebook and show it to the interviewer over the camera. If you are using a detachable camera, remove it from the laptop and hold it so that it shows the notebook page. You can now draw images and explain your thought process as you draw. It may be a good idea to do a practice session with a friend before the actual interview, where you detach the camera, point it at your notebook and draw and talk at the same time. Note that some interviewers may not allow this because it prevents them from collecting the evidence they need to add to your feedback. Meaning, anything you write outside the test platform provided by the interviewer cannot be submitted to the hiring committee that makes the decision.
If your interviewer disallows drawing in your notebook for this reason, a simple solution is to raise your notebook page to camera level so that the interviewer can take a screenshot of their screen, which will show the image you drew. However, if the interviewer is adamant and does not allow you to use your notebook at all, then just ask them for alternative suggestions on how you can communicate your ideas.

Another difference between in-person and virtual whiteboard interviews is the availability and use of "space" to write code. A physical whiteboard limits the amount of space available. Whether you want to develop your thought process by writing down notes, write code, or check your code against a given input, everything must happen within the fixed space of the whiteboard. This leads to situations where you erase something and later realize it was useful, or start writing small, or generally become conservative about using whiteboard space. A digital document in a remote interview has no such restriction on space. However, in that case it can become hard to keep track of items in the document and to scroll between its different portions. In both cases, I would suggest using some "space management" strategy to keep this from becoming a problem. For example, in an in-person interview, you may want to divide the whiteboard into two portions, one holding only code and the other holding images and discussion. If you then need to add discussion, notes or images in the portion reserved for code, either redraw the boundary between the two portions or write only temporary text that will be erased immediately. The same strategy can be used for virtual interviews. You may still run into space management issues, but at least not until much later in the interview. Until then, your workspace (i.e., the whiteboard or the shared document) will not look cluttered, and this will let you participate in the interview with more clarity and focus.

Data structure and algorithm

Probably the most important part of preparing for the whiteboard interview is understanding the most commonly used data structures and algorithms. A few reasons why this is important: (1) Even for a beginner software engineer position, you'll be competing against college-level graduate students who are skilled in data structures and algorithms. So, you lose your competitive edge as a viable candidate if you are not familiar with these concepts. (2) If you don't know the basic data structures, it will be almost impossible for you to make meaningful contributions to the codebase. It will also be very taxing for the team to help you understand the existing codebase. (3) If you know that the interview will have a whiteboard section and you still did not work on understanding data structures and algorithms, it can convey that you don't really care for the position. Hence, it is strongly suggested that you study the different data structures and algorithms. For each data structure, you should know its common operations and the space and time complexity of each operation, and you should be able to write code for the data structure and its operations. For each algorithm, you should know its space and time complexity and be able to code it.

Array
List: Stated very briefly, a list data structure is used when a collection of data is being modeled and the collection is arranged in some sequential manner. If the sequential order is based on the values of the elements, then the list is sorted. A list can be an ArrayList or a LinkedList. An ArrayList is a data structure with list-like behavior whose underlying implementation uses an array. A LinkedList is a data structure with list-like behavior in which each entry contains a reference to the next element. An ArrayList is preferred when the code retrieves elements at arbitrary positions much more often than it inserts new elements. A LinkedList is preferred when the code inserts elements at the beginning or end of the list much more often than it retrieves elements. Somewhat of a mix between these two concepts is the unrolled linked list.
Set: Stated very briefly, a set data structure is used when a collection of data is being modeled and each entry in the collection is unique. Most commonly, the elements of a set are not required to be in any order. A HashSet is probably the most common implementation of a set. It uses an array of hash buckets, along with a list to store elements that fall in the same bucket, to provide insert and lookup that are effectively O(1). However, in doing so, the order of elements in the set is not guaranteed to match the order in which they were inserted. A LinkedHashSet is an enhancement of HashSet that keeps elements in the order in which they were inserted. A TreeSet is a different type of enhancement where elements are kept in a given sort order as they are inserted. This is preferable if the elements of the set are read in an ordered manner more frequently than new elements are added to it.
Map: Consider a set implementation in which each set entry has another value associated with it. This data structure is a map. Thus, a map can be considered a collection of (key, value) pairs such that all keys of the map are unique. Like a set, a map can be implemented as a HashMap, LinkedHashMap or TreeMap.
Stack, Queue and PriorityQueue (or Heap): A Stack is a specialization of a list where new elements are added to the top of the existing stack and, when retrieving an element, the last added element is returned. Hence, it behaves as a "Last In, First Out" (LIFO) structure. A Queue is a different specialization where new elements are added at the end of the queue and, when retrieving an element, the first element is returned. Hence, it behaves as a "First In, First Out" (FIFO) structure. A peculiar thing about stacks and queues is that it is possible to implement a stack using queues, and it is also possible to implement a queue using stacks. A candidate must understand how this works. A PriorityQueue is a special type of queue where a new element doesn't always get placed at the end of the queue. It can be put anywhere in the queue depending on a "priority" associated with the element. Another name commonly used for the same data structure is a Heap; I've personally found it easier to understand the real-world use of a priority queue than of a heap, and so I prefer to call it a priority queue. It's a good idea to understand the algorithm behind adding and removing elements from the priority queue / heap.
Advanced data structures: Tree, Graph and Disjoint-set are examples of advanced data structures. These structures are not used as commonly as the ones above. However, there are various operations / algorithms associated with them, and it is a good idea to review those.
Realize that a question may also need more than one data structure to store related data. For example, let's say you have entries about people with their full name, nationality and date of birth. Maybe one operation requires you to find all people of "US" nationality, whatever their birth month, and another operation requires you to find all people with birthdays in January, whatever their nationality. If this data were stored in a database, you'd make an index on two columns of the same table. For in-memory data, however, you can create a map with birth month as the key and, as the value, another map of nationality to full names. Additionally, you can have a different map where nationality is the key and the list of associated birth months is the value. So, two different map data structures get used to model the same data in memory.
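The remark above that a queue can be implemented using stacks can be sketched as follows. This is only one possible approach, written in Python for brevity. The class name QueueFromStacks and its method names are made up for this illustration, and plain Python lists stand in for the stacks.

```python
class QueueFromStacks:
    """FIFO queue built from two LIFO stacks (plain Python lists)."""

    def __init__(self):
        self._incoming = []  # every newly added element is pushed here
        self._outgoing = []  # elements in reversed order, oldest on top

    def enqueue(self, item):
        self._incoming.append(item)

    def dequeue(self):
        if not self._outgoing:
            # Refill only when empty. Popping everything off one stack and
            # pushing it onto the other reverses the order, so the oldest
            # element ends up on top of the outgoing stack.
            while self._incoming:
                self._outgoing.append(self._incoming.pop())
        if not self._outgoing:
            raise IndexError("dequeue from an empty queue")
        return self._outgoing.pop()

    def __len__(self):
        return len(self._incoming) + len(self._outgoing)
```

Each element is moved from one stack to the other at most once, so a dequeue is O(1) on average even though a single call can occasionally take O(N).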
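A minimal sketch of the two-map idea described above, in Python. Everything here is an assumption made for illustration: the function names, the (full_name, nationality, birth_month) tuple layout, the use of an integer for the month, and the use of a set instead of a list for the birth months so that duplicates are not stored.

```python
from collections import defaultdict


def build_indexes(people):
    """people: iterable of (full_name, nationality, birth_month) tuples."""
    # Map 1: birth month -> nationality -> list of full names
    names_by_month = defaultdict(lambda: defaultdict(list))
    # Map 2: nationality -> set of birth months seen for that nationality
    months_by_nationality = defaultdict(set)
    for full_name, nationality, birth_month in people:
        names_by_month[birth_month][nationality].append(full_name)
        months_by_nationality[nationality].add(birth_month)
    return names_by_month, months_by_nationality


def names_born_in(month, names_by_month):
    """Everyone born in `month`, grouped by nationality."""
    return {nat: list(names)
            for nat, names in names_by_month.get(month, {}).items()}


def names_with_nationality(nationality, names_by_month, months_by_nationality):
    """Everyone of `nationality`, grouped by birth month."""
    return {month: list(names_by_month[month][nationality])
            for month in sorted(months_by_nationality.get(nationality, ()))}
```

The second map is what keeps the nationality lookup from scanning every month: it tells the code exactly which entries of the first map to visit.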

Algorithms and space/time complexity for common operations associated with the data structures mentioned above.
Sorting: I strongly suggest knowing the algorithm for quick-sort, merge-sort and bucket-sort. Also, realize that if the goal is to just get top-N elements, then a heap data structure can be used to do so efficiently, rather than completely sorting an array / list and then picking top-N elements.
Hashing or bucketing: This forms the basis for making a hash-set. While it is not necessary to know a good hashing function, it is a good idea to understand the various scenarios where hashing can be used. Associated with the idea of hashing is the occurrence of hash collisions, which cause multiple elements to be put in a single bucket. If bucketing is done using some criterion other than hashing, then a Bloom filter can be used to check membership in a bucket: it can say with certainty that an element is absent, and with a high level of confidence that an element is present.
Traversal of trees, cyclic graphs and acyclic graphs using depth-first or breadth-first traversal. Realize how the breadth-first algorithm uses a queue data structure and the depth-first algorithm uses a stack data structure.
Understand what a "reverse index" (also called an inverted index) is, how it is used in the context of text processing, and how to identify whether the given problem could be solved in a more performant way by building one.
A really, really important skill to develop is identifying whether an algorithm is performing some "repeated" action. The best examples of a "repeat" action are the insertion sort and bubble sort algorithms, where it causes a complexity of O(N*N). By removing the repeat action, the quick-sort (on average) and merge-sort algorithms achieve sorting with a complexity of O(N*log(N)). The Rabin-Karp algorithm for string search introduces the concept of a rolling hash to cut down repeated actions and improve the complexity of string search. Sometimes "repeat" actions can be a bit tricky to find because they are hidden inside some other method. For example, let's say that your solution depends on splitting a string, and the same split is done in every iteration of a loop even though it could have been done once outside of it. Here, if you mistakenly consider the split an O(1) operation just because it is done by a library method and not by a method you wrote, then you will be unable to identify the pocket of poor performance in your code. Identifying the occurrence of "repeat" actions is a really important skill! Let's say that during the interview you are given a problem and you write code using a non-optimal algorithm. At this point, the interviewer will try to guide you toward the optimal algorithm. Even if you are unable to find the optimal algorithm but do identify the repeat actions that are occurring, the interviewer will count it as a positive attribute for you. Also, identifying repeat actions is the first step towards understanding how the algorithm can be made optimal.
Understand and practice examples of Dynamic Programming. In trying to understand dynamic programming, I would suggest only using the references which identify that a dynamic programming problem must have (1) Overlapping subproblems and (2) Optimal substructure. Unfortunately, many references equate dynamic programming with recursion, which I feel is not a correct approach to the concept. Realize that it may be possible that the given problem statement for the whiteboard interview can be broken down into smaller sub-sections, and dynamic programming can be used to solve one such subsection.
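As one small illustration of the two dynamic programming properties named above, here is a sketch for a classic practice problem: finding the fewest coins that add up to a target amount. The problem choice, the function name min_coins and its parameters are assumptions made for this example, and the code assumes that every coin value is a positive integer.

```python
def min_coins(coin_values, target):
    """Fewest coins (each value reusable) that sum to target, or None.

    Assumes every entry of coin_values is a positive integer.
    """
    if target < 0:
        raise ValueError("target must be non-negative")
    unreachable = float("inf")
    # fewest[amount] holds the best answer for the smaller subproblem
    # "make `amount`". Each entry is computed once and then reused by
    # larger amounts (overlapping subproblems).
    fewest = [0] + [unreachable] * target
    for amount in range(1, target + 1):
        for coin in coin_values:
            # Optimal substructure: the best way to make `amount` that
            # ends with `coin` is the best way to make `amount - coin`,
            # plus that one coin.
            if coin <= amount and fewest[amount - coin] + 1 < fewest[amount]:
                fewest[amount] = fewest[amount - coin] + 1
    return None if fewest[target] == unreachable else fewest[target]
```

Note that there is no recursion here at all. The table is filled bottom-up in a plain loop, which is one way to see that dynamic programming and recursion are separate ideas.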

Clarify input, output and processing requirements

As mentioned earlier, the problem statement is purposefully worded to not be very detailed. The interviewer's goal is to verify whether the candidate works with them to clarify the problem requirements before starting to solve the problem, and how well they are able to do so. Note that the interviewer may ask a counter-question on why or how your clarifying question is useful in solving the problem. It is acceptable to answer that you are pre-emptively trying to understand the inputs, the output and the processing requirements for the code, and that you can revisit the question later if it seems necessary.

Does the input contain a number? If yes: Is the number an integer or floating point? Is the number only positive, or can it be both positive and negative? Can it be zero? Is the number represented by a primitive data type that cannot be null, or is it represented by an Object that can be null?
Does the input contain a string? If yes: Can the string be null? Can the string be empty?
Does the input contain a list or an array? If yes: Can the list be null? Can the list be empty? Can a single entry in the list be null valued? Are all values in the list unique? Are the list elements ordered in any manner - If yes, does the order need to remain the same during the processing? Is the list sorted - If yes, then what is the sort order?

Does the output need to be a particular data type? For example, should it be a number, string, object, list, boolean, etc.
Does the output need to be formatted in a particular manner? For example, if the output is a list, then should it be sorted?

How should a null value in the input be handled? Work with the interviewer to understand what a null value could mean from the perspective of someone using the code you are about to write, and whether it makes more sense to raise an exception or to fail silently. In most cases, but not always, raising an exception is a good way to proceed.
Does the code assume something about the nature of the data it is working on, like the input being non-null or the input list being non-empty? What happens if the premise is broken? Should an error be raised?
Something to discuss after you've written the solution: Is the solution thread-safe? Does it need to be thread-safe? Are there any processing chunks that you think can be written so that they execute in parallel? Are you required to write code in a manner that enables parallel execution?
Are you allowed to modify the input in any manner during one / multiple processing steps? It is an accepted practice and generally a good idea to not modify the input data.
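To make a few of these decisions concrete, here is a small sketch for a hypothetical question: "return the second-largest distinct value in a list of numbers". The function name and the decision to raise a ValueError in every invalid case are assumptions for this example. In a real interview, each of them is something to agree on with the interviewer first.

```python
def second_largest(numbers):
    """Return the second-largest distinct value in a list of numbers."""
    # Agreed with the interviewer: a missing list is a caller error.
    if numbers is None:
        raise ValueError("numbers must not be None")
    # Agreed with the interviewer: null entries are not allowed either.
    if any(value is None for value in numbers):
        raise ValueError("numbers must not contain None entries")
    # Build a new collection so that the caller's list is never modified.
    distinct = set(numbers)
    # The code assumes at least two distinct values; fail loudly if not.
    if len(distinct) < 2:
        raise ValueError("need at least two distinct values")
    distinct.remove(max(distinct))
    return max(distinct)
```

For example, second_largest([3, 1, 3, 2]) returns 2 and leaves the list it was given unchanged.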

Pseudo-code, then code

In a whiteboard interview, it is expected that the candidate will write code that looks idiomatic for the language. However, this does not mean that you cannot start with pseudo-code. If writing the entire code seems burdensome, the suggestion is to start by writing pseudo-code that highlights the steps you plan to take. The advantage is that it can help you get clarity on the workflow the code should achieve. Once the plan is clear, you can start turning sections of the pseudo-code into real code. Even then, you're free to replace a piece of pseudo-code with a method call and define the method separately. Meaning, when the interviewer asks you to write code, there is no expectation for the code to be monolithic. You can define and use helper methods as you see fit.
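As an illustration of this workflow, consider a hypothetical question: "find the most frequent word in a piece of text". The sketch below keeps the pseudo-code as comments and then fills in each step, moving one step into a helper. The function names and the decision to return None for empty text are assumptions for this example.

```python
from collections import Counter

# Pseudo-code written first:
#   1. split the text into lowercase words
#   2. count how often each word occurs
#   3. return the word with the highest count


def split_into_words(text):
    """Step 1: lowercase the text and split it on whitespace."""
    return text.lower().split()


def most_frequent_word(text):
    """Return the most frequent word in text, or None if there are no words."""
    words = split_into_words(text)            # step 1, delegated to a helper
    if not words:
        return None
    occurrences = Counter(words)              # step 2
    return occurrences.most_common(1)[0][0]   # step 3
```

Starting from the three comment lines makes it easy to talk through the plan with the interviewer before committing to any syntax.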

Testing the code

After you've written the code, you should manually test it and verify that it meets the expectations of the problem statement. Your goal is not only to demonstrate to the interviewer that your code is correct, but also to show that you are an expert at testing your code through various test cases. When testing your code, I would suggest the following: (1) Keep count of how many tests your code has gone through. For example, at the top of the code, add a comment where you keep count of the tests. This is suggested because the interviewer may overlook or forget how many tests your code has gone through, and you want to remind them that you did a good job testing it. (2) Start by testing your code against edge cases, then easy inputs, and lastly complex inputs. However, don't test the same case over again. For example, let's say you are asked to write a method that sorts a list of strings. The first test case can be that if null is provided, an error is raised. The second test case can be that if an empty list is provided, an empty list is returned. The third test case can be that if a list with a single string is provided, a list with the same string is returned. Note how the first 3 tests are very simple. They haven't yet probed the bulk of the method, but you have already tested your code against 3 cases! I would NOT suggest making the fourth test case another list with a single string that differs from the one in the third test. That test is similar to the third and doesn't provide any new insight. Note that if the requirement were that the method should sort the provided strings and also lowercase their content, then some good tests would be that if the method is given a list with one string in upper, lower or mixed case, the output still comes out in lowercase.
For our example, a good fourth test is that if the method is given a list with 3 strings, it works as expected. This should be sufficient to verify that the bulk of the code you wrote is correct. As a reminder, following the first suggestion, add a comment near your code that it was tested using 4 tests.
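The four tests from this example can be sketched as follows. In the interview itself the code is not executed, so you would trace each case by hand. The function names sort_strings and run_checks, the sample strings, and the choice of ValueError for a null input are assumptions made for this illustration.

```python
# Tests traced so far: 4


def sort_strings(strings):
    """Return a new, alphabetically sorted list; reject a None input."""
    if strings is None:
        raise ValueError("strings must not be None")
    return sorted(strings)  # sorted() copies, so the caller's list is unchanged


def run_checks():
    """Run the four checks from the text and return how many passed."""
    tests_passed = 0

    # Test 1 (edge case): None raises an error.
    try:
        sort_strings(None)
    except ValueError:
        tests_passed += 1

    # Test 2 (edge case): an empty list gives back an empty list.
    if sort_strings([]) == []:
        tests_passed += 1

    # Test 3 (easy input): a single string comes back unchanged.
    if sort_strings(["pear"]) == ["pear"]:
        tests_passed += 1

    # Test 4 (bulk of the logic): three strings come back in sorted order.
    if sort_strings(["pear", "apple", "fig"]) == ["apple", "fig", "pear"]:
        tests_passed += 1

    return tests_passed
```

The comment on the first line plays the role of the running test count suggested in point (1).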

Copy on question upgrade

The whiteboard question will normally start with a simple premise. Once that is solved, the interviewer will upgrade the question by generalizing certain input conditions, making the question harder. There are a couple of suggestions at this point: (1) When the question is upgraded, don't delete the answer you wrote for the previous question. The goal is to have all your efforts in the interview be clearly visible to the interviewer and to anyone else on the interview panel who looks at the solution later. In a virtual interview where a shared document is used, this means that you'd want to make a copy of your code and modify only the copy, not the original. (2) The primary focus should be on solving the new question. However, if you get time, try to present a solution, or at least a method signature, that makes the new solution "act" like the previous answer when the generalizations are removed. For example, let's say the first question is to sort a list of elements, and you wrote a quick sort with a complexity of O(N*log(N)). The upgraded question reveals that the sorting was only done to get the top 5 elements of the list. In this case, it's best to use a heap, which brings the complexity down to O(N*log(5)) ~ O(N). However, if this algorithm is given the length of the list instead of 5, it sorts the entire list via heap-sort, achieving the goal of the first question. Thus, the upgraded solution is able to handle both the first and the second question. Doing so conveys to the interviewer that you also consider code reuse while developing code, which is a great skill! That being said, realize that it is not always possible to write a second solution that also covers the requirements of the previous question. Or maybe there isn't sufficient time left to do so.
In this case, at least mention to the interviewer how you think the two questions are related and that you would therefore try to write a solution for the second question that also covers the requirements of the first. Identify the places in the code where you would make changes to achieve this goal. This way, you can initiate a discussion around code reuse even if you don't have time to write the code!
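The heap-based answer from the example above could look roughly like this in Python. The function name top_n and the choice to return the result largest-first are assumptions made for this sketch. It uses the standard heapq module, which provides a min-heap on top of a plain list.

```python
import heapq


def top_n(items, n):
    """Return the n largest items, largest first, using a heap of size n."""
    if n <= 0:
        return []
    heap = []  # min-heap: heap[0] is the smallest of the current top n
    for item in items:
        if len(heap) < n:
            heapq.heappush(heap, item)      # O(log n)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)   # drop the smallest, add item
    # Popping repeatedly yields ascending order (the second half of a
    # heap sort); reverse it to get the largest element first.
    ascending = [heapq.heappop(heap) for _ in range(len(heap))]
    return ascending[::-1]
```

Calling top_n(values, 5) answers the upgraded question in O(N*log(5)), while top_n(values, len(values)) returns the whole list in sorted (descending) order, which covers the first question with the same code.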

In an on-site, in-person interview, it may be hard or almost impossible to keep the answer you wrote for the previous question while you answer the upgraded question. This is due to the limited space on a physical whiteboard. In this case, when a question is upgraded, I would suggest the following: (1) Ask your interviewer if they want to take a photo of the answer you wrote for the first question. (2) Explain that, since the second question is a generalization of the first, in the spirit of code reusability the answer to the second question should ideally behave like the answer to the first question whenever it is given an input similar to that of the first question. Ask your interviewer if it's ok for you to also take a photo of the first answer before you modify it, so that it can help you with the discussion later.

Use descriptive variable name

During the interview, it might be tempting to use non-descriptive variable names because, in a time-bound interview, why spend time writing out variable names that will not be of any benefit? Here's the thing though: writing descriptive variable names, except for loop indexes, can be really beneficial. (1) It helps you remember why you made the variable. Since you are talking and coding at the same time, a descriptive variable name helps ensure that you are not storing unexpected values in the variable and that its intent is not misinterpreted. (2) Since a descriptive variable name communicates the intent of a variable, it is easier for the interviewer to understand the code. This enables the interviewer to be more engaged with you and your code. (3) It demonstrates to the interviewer that, when writing code, you also prioritize readability, which is a positive attribute.

Asking for hints

As mentioned multiple times above, a whiteboard interview is not just about checking whether a candidate already understands data structures and algorithms. An equally big piece is evaluating how the candidate works with the team and how they analyze new information. Also evaluated is whether the candidate recognizes when they cannot proceed on their own and asks for help or information! People work in teams so that everyone can learn from the expertise of others. Asking for help is not a bad signal in any way, quite simply because no one person can know everything! What makes a whiteboard interview go badly is if you don't ask for help at all when you are clearly stuck, or if you just ask for the answer without being able to identify what is keeping you from reaching the target, or if you are given a piece of information but don't process it at all and just remain stuck. The number of hints given, and how close the hints are to the actual solution, only determine the level of the position you are hired for, not whether you'll be extended an offer. Even so, it is NOT the case that every hint given lowers the position or salary you'll be offered. Bottom line: If you are stuck in your interview, tell your interviewer what problem you're facing and work with them to make progress on the solution. Don't just sit there without telling anyone that you are stuck!

Things to observe during the interview

Everything up to now has been about how you should be ready for a whiteboard interview. However, an interview is a two-way communication! This means you can, and should, also observe the interviewer to get a glimpse of the workplace culture. There are common points that can be observed in all interviews, and they are discussed in a later chapter. Particularly for the whiteboard interview, you should observe whether the interviewer engaged with you when you got stuck and weren't making progress. It might be that you didn't say anything about it, but after a certain, not very large, amount of time the interviewer should be reaching out to you nonetheless. If that's not how the interview went, it may mean that the interviewer was allowed to conduct interviews before being properly calibrated to do so. Whatever the reason, this is not a good sign: it suggests that the company doesn't prioritize clear communication, or doesn't value interviews or new hires as much, maybe because it expects a high churn of software developers. Another thing to observe is whether you were initially given an easier question that transitioned into a harder one, or whether you were given just 1 question to solve during the entire interview. With just one question asked, I feel that it may be harder to identify the proper skill level of a candidate. This raises the chance that you will not be offered a compensation level and job position matching your skill level.