- This is not an advice post. There are tons of people out there who desperately want to give people advice on reading papers. Read theirs, please.
- This post is a continuous reflection on the topic “how to read a CS paper” from my personal practice. I will list out my academic status before each point so that it may be interesting to myself on how my view on the matter has changed as time goes forward.
The first year of my CS master program. Just get started on CS research.
- It’s OK to not like a paper
In my first semester, I majorly read papers on Human Computation and Crowdsourcing. Very occasionally, I read papers on NLP. Some papers on NLP are from extra readings in Greg’s course. Some are related to Greg’s final project, which deals with both code and language. I don’t really like and want to read papers back then. In NLP class, I prefer to read textbooks (Jufrasky’s one) and tutorial posts that I can find online. One roadblock for me to read papers is that there is certain background knowledge gap I need to fill and I just simply don’t know how to read a paper. So, for Greg’s NLP course, I only read some papers related to my final project. This paper is the base paper for my final project. I got this paper from professors in linguistics and software engineering and they want me to try out the same idea but using neural network model instead. I read this paper several times and the more I read, the more I want to throw up. I just think this paper hides many critical implementation details and the score 95% is just too high for me to believe. The authors open source their code but their code has some nasty maven dependencies, which won’t compile under my environment. Their evaluation metric is non-standard in NLP and many “junk words” wrap around their results. Of course, the result of my experiment is quite negative. I often think it is just a waste of life to spend your precious time on some paper you dislike. Here, I’m more of talking about paper writing style and the reproducibility of papers’ results. I probably want to count shunning from some background gap as a legitime reason not like a paper.
- Try to get most of the paper and go from there
I got this message from Matt’s Crowdsourcing class. In the class, I have read a very mathematical heavy paper, which invokes some combinations of PGM and variational inference on the credibility of fake news. I’m worried back then about how should I approach a paper like this one, which I’m extremely lack of background and mathematics formula looks daunting. I pose my doubts on Canvas and Matt responds in class and gives the message. I think the message really gives me some courage on continuing read papers.
- It’s OK to skip (most) parts of a paper. Remember: paper is not a textbook!
This semester I’m taking a distributed system class. To be honest, distributed system paper can be extremely boring if they are from industry. Even worse, system paper can be quite long: usually around 15 pages, double column. So, if I read every word from beginning to end, I’ll be super tired and the goal is not feasible for a four-paper-per-week class. So, I have to skip. Some papers are quite useful maybe just for one or two paragraphs. Some papers are useful maybe just because of one figure. As long as your expectation about a paper gets met, you can stop wherever you want.
- Multiple views of reading a paper
I didn’t get the point until very recently. I did quite terrible on the first midterm of my distributed system class. The exam is about how to design a system to meet a certain requirement. In the first half of the course, I focus on the knowledge part presented by the paper but that doesn’t work out well. Until then, I realize that I need to read those systems paper from a system design point of view: what problems they need to solve, what challenges they have, how they solve the challenges. OF course, those papers are valuable from knowledge perspective: how consistent hashing works, for example. But, depends on the goal of reading paper, I can prioritize different angles of reading a paper. If I need to implement the system mentioned in the paper, I probably need to switch to a different paper reading style.
- Get every bit of details of paper if you need to
It’s time again for the final course projects. Again, I need to generate some ideas and find some baseline papers. In this case, “skip parts” and “get most out of the paper and move on” strategy probably won’t work well. All in all, I need to understand the paper and those are rely on the details from the paper. In this case, I need to sit through the whole journey and remove any blockers that I may encounter.