Monday, March 23, 2015

The First Phone Screen -- Alibaba (II)

Hi. Everyone. Welcome back. In this post I'll cut right to business. You know, the resume evaluation had been over. The day after that, the real first phone screen came. It was March 18, Wednesday, a sunny day. At 9:50, I went downstairs to the second floor, found a quiet area with few people around in the lobby, took my cellphone out of my pocket and kept staring at the screen. Time passed. 10:00. 10:02. 10:04. At 10:05, the phone rang.

"Hello. Is this Boyu? This is the one from Alibaba who made the appointment for the phone screen with you yesterday."
"Yes. Here is Boyu."
"Okay. Are you free? Shall we begin?"
"Sure."

The phone screen started like this. The interviewer asked me nine questions in total. The entire process lasted for almost 40 mins. In the end, I hung up the phone and found my cheek sweaty. I think that's probably because I kept my ear pressed against the phone so as to get every single word coming from the other end.

1. Please introduce yourself first.

2. The first technical question is coming. There are millions or even billions items online in Taobao Mall. For each item, it has a number indicating the views from customers. Now we need an algorithm to return the top 1000 items which have the most views.

The question sounds quite simple and common but I couldn't come up with a nice solution. An intuitive answer is to sort. However, it is not workable since the data are massive. The time complexity cannot meet the requirements. Another solution hit me then. Maybe I can select a small part of data with random sampling and sort the part of data to get an inexact result. I spoke out of my answer to the interviewer but he said he wanted the exact result. What should I do? How to get the top 1000 exactly and quickly? I asked myself dozens of times but still got no clues. I struggled to think of some other alternatives and I felt there was no one talking on the phone for nearly 2 or 3 mins. Suddenly, I was struck with another answer. I didn't have time to think about whether it was fairly good or not. As long as it could help me talk something out, that's good enough. That's all I thought of at that time. I said to the interviewer that I could count the number of digits for the views first. The bigger the views, the more the number of digits. Then only the views with most digits would be counted in the array to sort. To tell you the truth, I didn't know if it was an proper answer. As I said, it could have me talk and that's enough. The interviewer said, "Okay. Could you find some other solutions?" I ran out of words and I had to tell the interviewer that's all I got.

After I got back to my desk, I searched the question on the internet. I found this turned out to be a class of problems under massive data, called Top K problem. There are so many solutions that you can find there. I just extracted some keywords. It usually uses the heap or the red-black tree to solve this sort of question. I know heaps and I thought of this kind of data structure during the interview. But I didn't know how to use the heap to speed up this process and didn't mention a bit about it. For the red-black tree, I know nothing at all.

Here I learn a lesson. You must try to talk something out during the interview, even though you have not come up with a solution yet. Don't have the communication quiet. You can speak out your process of thought. You can mention at least the data structure you are going to use. Anything is helpful. If you don't talk, the interviewer knows nothing about you. If you talk out some pieces of the process of thought of yours, the interviewer knows what your brain is going on and he can lead you to the final step by step. If I had told the interviewer heaps might be useful to solve this question, he might have pointed something out to help me out or at least he would have known that I knew this kind of data structure.

3. He asked me what I learn about machine learning and data mining, the same question as the resume evaluation yesterday. As you see, how essential these two branches are for an algorithms engineer. I told him that I knew a little about them, which are the subjects I have been learning. He then asked me whether I knew some algorithms about classification and clustering. I know a little about the classifiers of Bayesian, Logistic and SVM. I was asked to tell the idea of SVM next. Here I should thank a professor called zzx, who was the teacher in charge of the class "Machine Learning" during the first semester for graduate. I took the class and learnt some basic concepts about classification and clustering through this. They came in handy at this time. Further, the interviewer wanted me to say the formula to be optimized in the problem of linear binary-class SVM. I didn't remember the details about the formula but I knew the optimal result could be solved by a method called "Lagrange Multiplier Method". Then he let me talk about the Logistic. I know the logistic function can map the data into an interval between 0 and 1. Nothing more. After that, he also let me talk something about clustering. Since I only know K-means, I talked the basic idea behind this algorithm to him. He then asked me if I knew hierarchical clustering. To be frank, I know nothing. :(

4. It turned to data mining in the following. "Have you heard of associate rule in data mining?" I don't know the name of this concept at all. I tried to understand this concept according to the literal sense. "NO." I got completely denied. Okay. Next question.

5. Do you know something about multi-thread safe? There are some stuffs that you cannot use in multi-thread. Could you point some out?
In the beginning, when I heard the word of multi-thread, I got excited and thought he would ask the same question as the resume evaluation did. But my heart sank when the phrase of "multi-thread safe" came out. I've never heard of anything about multi-thread safe though I have been using JAVA to start multi-thread programs for almost one year.

6. I didn't know if it was because I got no answer for the questions for a long time. The man asked me a very easy question here, which was the only one that I got the right answer during this interview. "There is a binary tree. Each node stores one letter. How to output all the paths from the root node to the leaf nodes?"
I'm not going to discuss this. There are various ways to get the result, recursive or non-recursive. I believe you know at least one of them. Otherwise, go and find any kind of Data Structure textbook to look it up.

7. The last technical question was a classic one. The moment he said there was a string of letters "ABCDEFG", I noticed it was the problem of Longest Common Subsequence (LCS). I also realized that I forgot how to deal with it with the optimal method. I even forgot the optimal solution was DP (Dynamic Programming). What a shame! I was taught this both in the undergraduate and the post-graduate. I really ought to learn algorithms from the bottom up.

8. What's do you think your biggest advantage and disadvantage?
It's the behavioral question that every interviewee should make preparation for. Remember to write something down about these sort of questions. "During the projects you participated in before, what you enjoyed most or least, what you considered most challenging, what you learnt, what the hardest bug was, etc." They are as important as the technical questions.

9. "Do you have any questions to ask me?" This was the last question.

You've guessed it. Yes. I got rejected in the evening on the same day. I had been prepared for that. Although I screwed up it, I learnt that there was a long way to go in my career seeking. It's okay to fail the first time, not to mention I haven't started to prepare yet. All I can say is that look forward and keep going. GO GO Fighting!

2 comments:

  1. 我占到沙发了!还没开始想这些“形式化问答”,项目经历几乎都是自己一个人完成的怎么办,没接过大的合作项目,都不知道该怎么准备。

    ReplyDelete
  2. 我居然读完了- -

    ReplyDelete