On Reading CS Papers – Thoughts & Reflections

Be forewarned:

  • This is not an advice post. There are tons of people out there who desperately want to give people advice on reading papers. Read theirs, please.
  • This post is a running reflection on the topic “how to read a CS paper,” drawn from my personal practice. I will note my academic status before each point so that I can later see how my view on the matter has changed over time.

2018

The first year of my CS master’s program. I have just gotten started on CS research.

  • It’s OK to not like a paper

In my first semester, I mostly read papers on Human Computation and Crowdsourcing. Very occasionally, I read papers on NLP. Some of the NLP papers were extra readings from Greg’s course. Some were related to my final project for Greg’s course, which deals with both code and language. I didn’t really like or want to read papers back then. In NLP class, I preferred to read textbooks (Jurafsky’s one) and tutorial posts that I could find online. One roadblock to reading papers was that there was a certain background knowledge gap I needed to fill, and I simply didn’t know how to read a paper. So, for Greg’s NLP course, I only read some papers related to my final project. This paper is the base paper for my final project. I got this paper from professors in linguistics and software engineering, and they wanted me to try out the same idea but with a neural network model instead. I read this paper several times, and the more I read, the more I wanted to throw up. I just think this paper hides many critical implementation details, and the score of 95% is just too high for me to believe. The authors open-sourced their code, but it has some nasty Maven dependencies and won’t compile in my environment. Their evaluation metric is non-standard in NLP, and many “junk words” wrap around their results. Of course, the result of my experiment was quite negative. I often think it is just a waste of life to spend your precious time on a paper you dislike. Here, I’m talking more about the writing style and the reproducibility of a paper’s results. I would probably also count shying away because of a background gap as a legitimate reason not to like a paper.

  • Try to get most of the paper and go from there

I got this message from Matt’s Crowdsourcing class. In the class, I read a very math-heavy paper, which applies a combination of PGM and variational inference to the credibility of fake news. Back then, I was worried about how I should approach a paper like this one, where I badly lacked the background and the mathematical formulas looked daunting. I posted my doubts on Canvas, and Matt responded in class with this message. I think the message really gave me some courage to continue reading papers.

  • It’s OK to skip (most) parts of a paper. Remember: a paper is not a textbook!

This semester I’m taking a distributed systems class. To be honest, distributed systems papers can be extremely boring if they come from industry. Even worse, systems papers can be quite long: usually around 15 pages, double column. If I read every word from beginning to end, I’ll be super tired, and that goal is not feasible for a four-papers-per-week class. So, I have to skip. Some papers are useful for just one or two paragraphs. Some papers are useful for just one figure. As long as your expectation of a paper gets met, you can stop wherever you want.

  • Multiple views of reading a paper

I didn’t get this point until very recently. I did quite terribly on the first midterm of my distributed systems class. The exam was about how to design a system to meet a certain requirement. In the first half of the course, I focused on the knowledge presented by each paper, but that didn’t work out well. Only then did I realize that I need to read those systems papers from a system design point of view: what problems they need to solve, what challenges they face, and how they solve those challenges. Of course, those papers are also valuable from a knowledge perspective: how consistent hashing works, for example. But, depending on my goal in reading a paper, I can prioritize different angles. If I need to implement the system described in the paper, I probably need to switch to a different reading style.

  • Get every bit of detail out of a paper if you need to

It’s time again for the final course projects. Again, I need to generate some ideas and find some baseline papers. In this case, the “skip parts” and “get most of the paper and move on” strategies probably won’t work well. All in all, I need to understand the paper, and that understanding relies on its details. In this case, I need to sit through the whole journey and remove any blockers I encounter.


Job hunting lessons learned

This post contains a collection of lessons I learned during job hunting. I’m still looking for an internship and a job. That’s good, because it means this post will be updated frequently, at least for the foreseeable future.

  1. Always attend career fairs. At UT, if you are a CS student in good standing, you can get an invitation to an event called FOCS Career Night. There will be a lot of recruiters. But be careful: most of the “recruiters” are actually engineers or UT students (that’s right, some companies have their Campus Ambassadors attend the event as if they were recruiters). There is a huge difference between recruiters and engineers: recruiters make the call on who gets the interview, not engineers! I made the mistake of attending only the FOCS Career Night and skipping the Career Fairs. In fact, recruiters actually come to the Career Fairs, and some of them do on-campus interview signups on the spot. An on-campus interview is much better than an OA (online assessment). Even if you got an invitation to schedule an interview after FOCS Career Night, you still want to talk to the company at the Career Fairs, because recruiters can barely check their email while traveling and interview slots are always first come, first served. So, always make sure to come to the Career Fairs and schedule an interview immediately, instead of replying to the invitation email, waiting for the response, and then getting one that says the interview slots are all filled. This happened to me with Indeed.
  2. Always do the OA immediately. When you receive an OA, the company will usually tell you that you can finish the test within a certain number of days. However, things can change rather quickly. Even if they give you a buffer, like finishing the test within 4 days, ignore the message and do the OA immediately. Slots can fill rather quickly, and some companies have an unspoken rule that even when they say 4 days, they really mean immediately. This happened to me with Dropbox.
  3. Always finish the OA within min(60 minutes, restricted time). Some companies allow you to finish the OA within days. In other words, even after you start the test, you have a couple of days to finish it. Ignore this, please! Even though the test window lasts for days, finish it as quickly as you can. The finishing time is a strong indicator of your coding ability. This happened to me with Twitter.
  4. Always follow up with the recruiter. Sometimes there is a system error: they send a rejection email to the wrong person. Make sure you confirm this with the recruiter, and finish your OA no matter what happens. Even if you got rejected, the OA is still an invaluable practice opportunity. This happened to me with Dropbox.
  5. Always make sure you apply in the University Recruiting section. Companies make specific web pages for fresh and recent graduates. Make sure you submit your resume there. If you submit your resume to the wrong place, you may end up in a pool filled with professionals with 5+ years of experience. That always leads to either not hearing back at all or an immediate rejection letter. This happened to me with Dropbox.
  6. Use LinkedIn and be aggressive. I’m a shy person, but job hunting, as the name suggests, is hunting. You have to be aggressive. Connect with as many people as you can, whether through Career Fairs, social events, or LinkedIn InMail. Be polite and be bold. Ask them for the opportunity. One special note is that you may want to “harass” recruiters and senior developers on LinkedIn. Their words have much more power, and you may get an interview very quickly. If they are rude when you ask for the favor politely, you already know that this company is definitely not one you want to work with. This happened to me with Teradata (BTW, they are on the polite side).
  7. Prepare for the technical interview questions:
    1. The interviewer may make slight modifications to the questions even if they are from leetcode. For example, instead of asking what exactly the shortest path is, as in the original leetcode question, the interviewer may ask how many steps the shortest path takes. The difference is that the former expects a list of coordinates (i.e., steps) and the latter expects simply a number. This happened to me with Pocket Gem. The takeaway is that when you solve leetcode questions, think about what the possible variations might be. However, doing this for every problem may seem infeasible. You don’t have to, unless you don’t have anything else to do. The next point helps address this concern.
    2. Browse forums for recent interview questions from the company you are about to interview with. This helps address the previous concern. If you see that the company asks certain leetcode questions, you may want to look at those questions and think about the possible variations. Also, a company usually has a pool of questions, so if you get some prior exposure from a forum, you may already be well prepared. This point also helps if you are very short on preparation time. In that case, you just prepare the questions from the forum and you’re good to go. Sometimes this works much better than a long-term preparation strategy, with which you may feel over-prepared and feel that a good chunk of time was wasted on leetcode when you could have simply prepared the questions from the forum.
    3. Get practice on leetcode. People usually emphasize the importance of practicing on leetcode. That’s true. However, this depends on when you are about to apply for the position. When recruiting new grads, some companies prefer to start early (e.g., in Fall) and others don’t (e.g., January through March). People always think they should start as early as possible to get a spot in the limited headcount. That’s true, but this strategy usually comes with a risk: you’ll see new interview questions that nobody else has seen before. Each year, companies may update their pool of questions. If you can solve leetcode problems as easily as “1+1” and have solid preparation in system design, then starting ASAP is the best strategy. However, if you are in an OKish position in algorithm and design preparation, then you may want to delay applying by one or two weeks. The beauty of the delay comes directly from the previous point: you may get exposure to the pool of questions before the actual interview. How long to delay depends on the case. Some companies (e.g., Dropbox, Pocket Gem) are quite active and send you the OA almost immediately after your application, so you may want to do the delaying on your side. However, other companies have a long process before setting up any interviews, so you may want to apply ASAP and let the internal processing take its time.
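
To make the variation in point 7.1 concrete, here is a minimal sketch of the two versions of a shortest-path question on a grid. The grid encoding (0 for an open cell, 1 for a wall), the 4-directional movement, and the function names `shortest_path` and `shortest_path_length` are my own assumptions for illustration, not the actual interview question. The point is only that one breadth-first search can answer both versions: keep parent pointers and you can return the list of coordinates, and the step count falls out of that list.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Return the shortest path as a list of (row, col) cells, or None.

    Assumptions (hypothetical, for illustration): cells equal to 0 are
    open, cells equal to 1 are walls, moves are 4-directional, and
    `start` is an open cell.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent pointers back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            in_bounds = 0 <= nr < rows and 0 <= nc < cols
            if in_bounds and grid[nr][nc] == 0 and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None


def shortest_path_length(grid, start, goal):
    """The 'how many steps' variation: a number instead of a list."""
    path = shortest_path(grid, start, goal)
    return -1 if path is None else len(path) - 1
```

When practicing, writing the "list of coordinates" version first and deriving the "number of steps" version from it is one cheap way to cover both variations of the same problem.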

Towards the end of the semester

Busy with the final projects. The takeaway from this semester is to never pick two 395T courses at the same time. Sorry.

— Update: 01/03/18 —

Last semester ended up amazingly well. Every time I read this post, I picture Prof. Dana Ballard’s pull-up gesture in my mind and how he compares the difficulty of coursework to a workout in a gym: you always want to lift a heavier weight to gain muscle. For some unknown reason, his voice and pull-up gesture always amuse me.

Leaving IBM

To be honest, this is probably the most difficult post I have ever written. This is mainly because there is a ton of stuff I want to say, but I’m unsure whether I should make it public or keep it to myself. Another factor that makes this post hard to write is how long I have been drafting it. I have been drafting this post since April 2016, right after I decided to start the whole quit-IBM-and-get-a-PhD project. I used to use this post as a log to record things and feelings whenever something happened around me at IBM. Frankly, when I look back at the stuff I recorded (mostly rants), a lot of it still holds, but the anger has passed with time. So, that year-long drafting makes me hesitate even more, because the mood in which that stuff was written is gone. However, two years is a significant amount of time, quitting IBM can be called “the end of an era,” and I should give closure to my happy-and-bitter experience with IBM anyway. So, here it goes.

 

Thank you, IBM!

I’m really thankful for the opportunity to work with IBM. This experience really made me grow both technically and mentally. Technically, I had the opportunity to get hands-on experience with DB2 development. DB2 as a database engine is extremely complex. It has over 10 million lines of code, and it is way beyond the scope of any school project. Working on a project like this is quite challenging because there is no way you can get a clear understanding of every part of it. I still remember that when I attended the new hire education on DB2, one guy said: “I have been working on the DB2 optimizer for over 10 years but I cannot claim with certainty that I know every bit of the component I own.” This fact really shocked me, and based on my experience so far, his claim still holds, but with one subtle assumption, which I’ll talk about later. Lots of tools are developed internally, and reading through both the code and the tool chains is a great fortune for any self-motivated developer. I picked up lots of skills along the way: C, C++, Makefile, Emacs, Perl, Shell, AIX, and many more. I really appreciate this opportunity, and I feel my knowledge of databases and operating systems has grown a lot since my graduation from college.

Mentally, there are also lots of gains. Being a fresh grad is not easy. Lots of people get burned out because they are just like people who are learning to swim and get thrown into the water: either swim or drown. I’m lucky that my first job was with IBM because the atmosphere is just so relaxed: people expect you to learn on your own, but they are also friendly enough (the majority of them) to give you a hand when you need help. I still remember that my first ticket with a customer was a severity-one issue, which requires you to update the customer on your progress daily. There was a lot of pressure on me because I really had no clue about the product at the very beginning. I’m thankful to those who helped me at that time and in many difficult moments afterwards. That made me realize how important it is to be nice and stay engaged with the people around you, because no matter how good you are with the technology and the product, there is always stuff you don’t know. Staying engaged with the people around you may help you get through difficult moments like this by giving you a thread that you can at least start pulling. In addition, participating in the Toastmasters club really improved my communication and leadership skills and, more importantly, I made tons of friends inside the club. Without working at IBM, I probably wouldn’t even know that the Toastmasters club existed. If you happen to follow my posts, you’ll see that a lot went on around me while I worked at IBM. Every experience you go through offers you a great opportunity to learn and improve yourself. Some people may look at them as setbacks, but I look at them as opportunities.

toastmasters1

(The picture on the left shows all the comments people gave me on my speeches, and the one on the right shows the awards I earned in the club over these two years.)

With the help of all those experiences, I have developed good habits of writing blog posts (both technical and non-technical), reading books, and working out six days per week. None of those things would be possible if I worked at a place where overtime was common. I’m very thankful to IBM for this, because staying healthy both physically and mentally is super critical for one’s career. Even though those things don’t come directly from IBM, IBM does provide the environment that nurtures them.

 

IBM has its own problems. They are centered around people. There are many words I want to say, but I think I’ll keep them to myself. Instead, I want to make my point with a picture:

ibm_survey

I don’t know why IBM’s term “resource action” for firing employees and the sentence “IBM recognize that our employee are our most valuable resources.” bother me so much. I probably just hate the word “resource” as a way to directly describe people, and how much this word gets spammed around IBM. I know everyone working for a big corporation is just a cog in a machine. However, what I feel, based on lots of things that happened around me, is that IBM, with its attitude represented by its first-line managers (because those are the people I commonly work with), makes this fact very explicit. It hurts, to be honest. No matter how hard you work and no matter how many prizes you have earned for yourself and your first-line manager, you are nothing more than a cog in a machine, not worth a high price to keep around, because there are many cogs behind you that are ready to replace you. They are much cheaper, much younger, and can more or less work like you, because your duty in the machine is so precisely specified that it doesn’t really depend on how much experience you have under your belt. To me, that’s devastating.

This leads to the problem that talented people are reluctant to stay with the company. My mentor and the people who are so good with DB2 have bid farewell to the team. That’s really sad to me because they are the true assets of the company and the product. The consequence is that crucial knowledge leaves with people. Some quirks in the product are known only to certain people, and once they leave the company, the knowledge is gone with them. That makes mastering the product even harder. That’s the subtle assumption the person at the new hire education was making, and that’s also part of the problem of working with legacy code. The whole legacy code issue is worth another post, but one thing I now strongly believe is that any technical problem has its root cause in company culture and management style. As for me, I’m not a guru now, and I cannot see a way to become a guru in my current position, which scares me the most.

That’s it for this section and I’ll leave the rest to my journal.

“Research” Interest

This Friday, I met with my future roommate in Beijing. During lunch, we had a conversation about each other’s research interests. My roommate, like me, is also a CS graduate student at Austin. However, unlike me, he has a clear vision of what direction he is going to pursue in graduate school. He just finished his undergraduate degree in the Automation department at Tsinghua University. The Automation department, as he explained, is similar to a mixture of mechanical engineering and electrical engineering. He has been interested in mathematics since high school and, naturally, he wants to work on machine learning theory in graduate school, with an emphasis on computer vision (CV).

Now it comes to my turn. That’s a hard question I have been thinking about for a while. I don’t have a clear vision of what I’m going to pursue next. I think maybe I’m too greedy and want to keep everything. However, I also realize that I may not be as greedy as I initially thought. I know I don’t want to work on computer architecture, computation theory, algorithms, compilers, or networks. Now, my options really come down to choosing among operating systems, databases, and machine learning. For machine learning, I even know I probably won’t choose computer vision eventually (I still want to try a course, though), and I lean more towards natural language processing (NLP). However, picking one of those areas is just too hard for me now, even after I did some analysis in my last post trying to talk myself into picking machine learning only. There is always a question running in my head: why do I have to pick one? Sometimes I just envy a person like my future roommate, who doesn’t have this torture in his mind (maybe he does? I don’t know).

This feeling, to be honest, isn’t new to me. When I was an undergraduate facing the pressure of getting a job, my naive approach was to lock myself in my room and keep thinking about which profession might suit me best. After two years of working, I have grown up enough to know that this way of making choices is stupid, and I have also grown up enough to know that “giving up is an art to practice”. Why am I in such a rush to pick the direction I want to pursue before I have even taken any graduate course? Why can’t I sit down and try out several courses first? Because I want to get a PhD at a good school so badly. Let’s face the fact that people get smarter and smarter with each generation. Here “smarter and smarter” doesn’t necessarily mean that people won’t repeat the mistakes that happened before. It means that people have a better capability to improve themselves. Machine learning was not hot in 2014, from my experience in college. Back then, Leetcode only had around 100 problems. I had no particular emotional attachment to the machine learning material when I was taking the AI class. Maybe because Wisconsin has a tradition in the systems area? I don’t know. However, in 2017, everyone, even my mother, who is a retired accountant, can say a few words about AI and machine learning. Isn’t that crazy?

On my homepage, I wrote the following words:

I like to spend time on both system and machine learning: system programming is deeply rooted in my heart that cannot easily get rid of; machine learning is like the magic trick that the audience always want to know how it works. I come back to the academia in the hope of finding the spark between these two fascinating fields.

Trust me, I really mean it. Maybe because I graduated from Wisconsin, I naturally have a passion for system-level programming, whether it comes from operating systems or databases. Professor Remzi’s systems class is just a blast for anyone who wants to know what’s really going on under the software application layer. Professor Naughton’s db course is full of insights that I could keep referring to even after I began to work on a DBMS in the real world. Wisconsin is just too good in the systems field, and this is something that I can hardly say no to, even though I have worked so hard to lie to myself, saying that “system is not worth your time”. What about machine learning? To be honest, the great AI dream may never be accomplished. The undergraduate AI course surveys almost every corner of AI development, but only machine learning has become the hottest nowadays. Almost every AI-related development nowadays (e.g., NLP, Robotics, CV) relies on machine learning techniques. Why am I attracted to machine learning? Because it’s so cool. I’m like a kid who is eager to know what is going on behind the magic trick. Machine learning is a technique for solving un-programmable tasks. We cannot come up with a procedure to teach a machine to read text, identify objects in images, and so on. We can solve these tasks only because of the advancement of machine learning. Isn’t this great? Why both? I think machine learning and systems are becoming more and more inseparable. Without good knowledge of systems, one can hardly build a good machine learning system. Implementing batch gradient descent using map-reduce is a good example of this.
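
To show what I mean by the last example, here is a minimal single-process sketch of batch gradient descent written in a map-reduce style, using Python’s built-in `map` and `functools.reduce` in place of a real cluster framework. The model (one-feature linear regression with squared error), the shard layout, the learning rate, and all function names are assumptions I made up for illustration. The idea is that each shard computes a partial gradient sum independently (the map step), the partial sums are added together (the reduce step), and only then is the model updated once per pass over the full batch.

```python
from functools import reduce


def shard_gradient(shard, w, b):
    """Map step: gradient sums of 0.5 * (w*x + b - y)^2 over one shard."""
    gw = gb = 0.0
    for x, y in shard:
        err = (w * x + b) - y
        gw += err * x
        gb += err
    return gw, gb, len(shard)


def add_partials(p, q):
    """Reduce step: combine two partial results element-wise."""
    return p[0] + q[0], p[1] + q[1], p[2] + q[2]


def batch_gradient_descent(shards, lr=0.05, steps=2000):
    """Fit y = w*x + b; `shards` is a list of lists of (x, y) pairs."""
    w = b = 0.0
    for _ in range(steps):
        # In a real deployment each shard would live on a different worker.
        partials = map(lambda s: shard_gradient(s, w, b), shards)
        gw, gb, n = reduce(add_partials, partials)
        # One update per full pass: this is what makes it "batch".
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


# Hypothetical data generated from y = 2x + 1, split across three shards.
shards = [[(0, 1), (1, 3)], [(2, 5), (3, 7)], [(4, 9), (5, 11)]]
w, b = batch_gradient_descent(shards)
```

The systems knowledge shows up in what this sketch leaves out: how the shards are placed, how the current `w` and `b` are broadcast to the workers on every step, and what happens when a worker is slow or fails.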

I just realized that I haven’t answered the question about why I’m rushing to make a decision. In order to get into a good graduate school to pursue a PhD, you need to demonstrate that you can do research. This is done by publishing papers. Most undergraduates nowadays have papers under their belt. That’s huge pressure for me. A master’s program only lasts two years. I cannot afford the time to look around. I need to get started with research immediately in order to be in good standing when I apply for a PhD in 2018.

So, as you can tell, I have a problem. And, as a future researcher, I need to solve it. Here is what I’m planning to do:

  • Take courses in machine learning in the first semester and begin to work on a research project as soon as I can. I’ll give an NLP problem a chance.
  • Meanwhile, sit in on the OS class and begin to read papers produced by the Berkeley Database group. People there seem to have an interest in the intersection between machine learning and systems. This paper looks like a promising one.
  • Talk to more people in the area and seek advice from others.
  • Start reading “How to Stop Worrying and Start Living”

Will this solve the problem eventually? I don’t know. Only time can tell.

What are some useful, but little-known, features of the tools used in professional mathematics?

What's new

A few days ago, I was talking with Ed Dunne, who is currently the Executive Editor of Mathematical Reviews (and in particular with its online incarnation at MathSciNet).  At the time, I was mentioning how laborious it was for me to create a BibTeX file for dozens of references by using MathSciNet to locate each reference separately, and to export each one to BibTeX format.  He then informed me that underneath to every MathSciNet reference there was a little link to add the reference to a Clipboard, and then one could export the entire Clipboard at once to whatever format one wished.  In retrospect, this was a functionality of the site that had always been visible, but I had never bothered to explore it, and now I can populate a BibTeX file much more quickly.

This made me realise that perhaps there are many other useful features of…


Some thoughts on learning

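Just now I opened a random recommended video on the YouTube home page, titled How to Learn Faster with the Feynman Technique (Example Included). I rarely click on titles with such a strongly utilitarian flavor, let alone a methodology video like this one. It’s not that I dislike this kind of video; I just feel that if you watch a lot of them without putting them into practice right away, they have no effect and only waste a lot of time. The video itself is short, about 5 minutes. After watching it, not only did I feel that the method is worth a try, but more importantly it reminded me of some things I had read in the past and some thoughts of my own. So I’m quickly writing them down here.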

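The method itself is not complicated. For the specifics, you can read the author’s text version on this web page. I’ll just roughly note the key points: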

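The core of the method is explaining the concept. A typical scenario is going to office hours to ask a professor a question: after you have explained the problem, where you are confused, and your attempted solution, it is quite likely that before the professor says a word you say, “Oh, I get it. Thank you, professor!” and then turn around and run out of the office. This scenario shows two key points: 1. explaining the concept really is very important; 2. we may have turned the problem over in our minds countless times, yet only when we explain it to another person do we go “Oh!!!!”. The first point is not hard to understand. The second, however, is extremely hard to act on: where can we find someone willing to listen to us explain what we have learned every day? The video above proposes a solution, with the following steps: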

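  1. Take a sheet of paper and write the concept you want to explain at the very top.
  2. Pretend a third party is present and explain the concept to them in your own words. The author stresses a few points here:
    1. Explain in the plainest, most easily understood language.
    2. Don’t limit yourself to the definition. Challenge yourself to use examples, diagrams, and so on to make sure you can actually put the concept to use.
    3. Imagine the third party as a small child. The benefit is that a child will keep asking “Why?”, which pushes us toward a one-hundred-percent understanding of the concept’s details.
  3. After the first two steps, go back and review where your explanation was fuzzy, uncertain, or blank. Those are the places where our understanding of the concept is weak. What we need to do is pull out the source material and clear up those fuzzy points. This step is very similar to correcting your wrong answers after an exam.
  4. This step is really an extension of step 3: look for the places where the earlier explanation used complicated language or piled up a lot of technical terms, and do our best to rewrite them in more concise sentences.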

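The method in the video actually shows up in many of the things I have read. For example, in the section of The Pragmatic Programmer on how to debug, the authors introduce a technique called “Rubber Ducking”: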

A very simple but particularly useful technique for finding the cause of a problem is simply to explain it to someone else. The other person should look over your shoulder at the screen, and nod his or her head constantly (like a rubber duck bobbing up and down in a bathtub). They do not need to say a word; the simple act of explaining, step by step, what the code is supposed to do often causes the problem to leap off the screen and announce itself.

It sounds simple, but in explaining the problem to another person you must explicitly state things that you may take for granted when going through the code yourself. By having to verbalize some of these assumptions, you may suddenly gain new insight into the problem.

The name “Rubber ducking” comes from one of the authors seeing, early in his career, a very strong developer who often carried around “a small yellow rubber duck, which he’d place on his terminal while coding”.

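In the preface of The Lady Tasting Tea, the author also mentions that he received a lot of help from his wife while writing the book. Because his wife is not a statistician, she would not have been able to follow it at all if he had written in very technical language. This forced the author to adopt plainer, more accessible language to explain the philosophy behind statistics and the development of its ideas.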

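This also reminds me that one of my main motivations when I first started writing a tech blog was to explain what I had learned in the form of articles, and to try to eliminate cases of pretending to understand what I don’t. Steps 3 and 4 of the video’s method are what I need to pay more attention to when writing tech blog posts in the future: don’t simply summarize and repeat what I learned; take care to retell it in more concise language. An aside: I have found that embedding knowledge points in an article is a very good way to preserve them. I feel it keeps the knowledge from becoming too fragmented and provides enough context to tie otherwise isolated points together effectively. Perhaps this is one form of “connect the dots”.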

Finally, a quote to close out this post:

“The first principle is that you must not fool yourself — and you are the easiest person to fool.” — Richard Feynman