On Reading CS Papers – Thoughts & Reflections

Be forewarned:

  • This is not an advice post. There are tons of people out there who desperately want to give people advice on reading papers. Read theirs, please.
  • This post is an ongoing reflection on “how to read a CS paper,” drawn from my own practice. I will note my academic status before each group of points, so that I can later see how my view on the matter has changed over time.

2018

The first year of my CS master’s program. I am just getting started on CS research.

  • It’s OK to not like a paper

In my first semester, I mainly read papers on Human Computation and Crowdsourcing. Very occasionally, I read papers on NLP. Some of the NLP papers were extra readings from Greg’s course; others were related to my final project for that course, which dealt with both code and language. I didn’t really like or want to read papers back then. In the NLP class, I preferred to read textbooks (Jurafsky’s) and tutorial posts that I could find online. One roadblock was that I had a background knowledge gap to fill and I simply didn’t know how to read a paper. So, for Greg’s NLP course, I only read some papers related to my final project. One of them was the base paper for the project. I got it from professors in linguistics and software engineering, who wanted me to try the same idea with a neural network model instead. I read this paper several times, and the more I read, the more I wanted to throw up. I felt that the paper hid many critical implementation details, and the 95% score was just too high for me to believe. The authors open-sourced their code, but it had some nasty Maven dependencies and wouldn’t compile in my environment. Their evaluation metric was non-standard in NLP, and many “junk words” were wrapped around their results. Of course, the result of my experiment was quite negative. I often think it is just a waste of life to spend your precious time on a paper you dislike. Here, I’m mostly talking about writing style and the reproducibility of a paper’s results. I probably also want to count shying away from a background gap as a legitimate reason not to like a paper.

  • Try to get most of the paper and go from there

I got this message from Matt’s Crowdsourcing class. In the class, I read a very math-heavy paper, which applied some combination of PGM and variational inference to the credibility of fake news. Back then I was worried about how to approach a paper like this one, for which I severely lacked the background and whose mathematical formulas looked daunting. I posted my doubts on Canvas, and Matt responded in class with this message. I think it really gave me some courage to continue reading papers.

  • It’s OK to skip (most) parts of a paper. Remember: a paper is not a textbook!

This semester I’m taking a distributed systems class. To be honest, distributed systems papers can be extremely boring if they are from industry. Even worse, systems papers can be quite long: usually around 15 pages, double column. So, if I read every word from beginning to end, I’ll be super tired, and that goal is not feasible for a four-papers-per-week class. So, I have to skip. Some papers are useful for just one or two paragraphs. Some are useful because of just one figure. As long as your expectation of a paper is met, you can stop wherever you want.

  • Multiple views of reading a paper

I didn’t get this point until very recently. I did quite terribly on the first midterm of my distributed systems class. The exam was about how to design a system to meet a certain requirement. In the first half of the course, I focused on the knowledge presented in the papers, but that didn’t work out well. Only then did I realize that I need to read those systems papers from a system design point of view: what problems they need to solve, what challenges they face, and how they solve those challenges. Of course, those papers are also valuable from a knowledge perspective: how consistent hashing works, for example. But, depending on my goal in reading a paper, I can prioritize different angles. If I need to implement the system described in the paper, I probably need to switch to a different reading style.

  • Get every bit of detail from a paper if you need to

It’s time again for final course projects. Again, I need to generate some ideas and find some baseline papers. In this case, the “skip parts” and “get most out of the paper and move on” strategies probably won’t work well. All in all, I need to understand the paper, and that relies on the details in the paper. So I need to sit through the whole journey and remove any blockers I encounter.

Freedom of speech

Piazza is an online forum tool that is heavily used in academia. It helps students ask questions and get feedback from both peers and instructors. Its goal is similar to Slack’s in the sense that both try to cut down on duplicate emails sent by several people for the same or similar requests. It is a good tool, but every tool that comes with power has its own consequences.

Instructors can configure the following option when they set up the forum for a course.

Screen Shot 2018-02-09 at 11.58.12 PM

Basically, this option determines whether, when you make a post, you can choose to be “Anonymous” to both your peers and instructors or to your peers only (instructors can still see who made the post). The following picture shows what this option looks like from a student’s perspective:

Screen Shot 2018-02-09 at 11.58.33 PM

The intention behind this option, I guess, is that some students may feel embarrassed to ask questions. They might think their questions are dumb and will make them look bad in front of peers or instructors. I think this option is meant to encourage students to ask questions bravely.

However, this option may get abused. From my observation, Piazza is used as a way for instructors to show off their teaching quality. This is important for Assistant Professors because teaching still means something (if teaching quality didn’t matter, why would institutions ask for a teaching statement in the first place?). In addition, teaching quality is in some sense an important indicator for students evaluating you as a person. This matters for professors who are looking for graduate students, because research publications are only part of the story, and how a professor interacts with students may be a crucial indicator of how good that professor is as a human being (evaluations may be a better indicator, but they are confidential). Thus, if a prospective student looks at the Piazza forum of a course taught by a professor they are interested in and sees a lot of complaints, the student may have second thoughts about working with that professor on research (maybe the professor is a very bad person even though they do good research).

Thus, instructors have a strong motivation to censor posts on Piazza. This scares students because they don’t have a secure way to provide feedback to the instructor. Let’s assume that the majority of students have a good heart: they won’t say bad things about an instructor who really cares about students. Then the moment something slightly negative appears on Piazza may be a very important signal to the instructor that something is wrong with his teaching. However, due to the strong motivation to show off their teaching quality through Piazza, instructors may start to censor speech on Piazza by turning the option off.

I didn’t realize this last semester. Last semester, the instructor of one course set this option off, and I thought maybe he wanted to know which students were shy about asking questions so he could give them some individual attention. This semester, however, the instructor of one of my courses initially turned the option on so that everyone could truly ask questions as “Anonymous”. Then, one day, someone made the post below and the option was turned off. Now, no students dare to make even slightly negative posts.

Screen Shot 2018-02-10 at 12.28.57 AM

 

I fully understand the conflict of interest between students and instructors over the use of Piazza: students may think Piazza is a secure way to provide anonymous feedback, while instructors may think negative posts on the forum make them look bad. I still think there should be a better way to address this conflict and protect both students and instructors, especially with the technology we have nowadays. But (unintentional) censorship is not something we want to cultivate, especially in academia. By the way, for this course, I still think the instructor is good, but the material is quite challenging without a solid theoretical foundation laid down beforehand. He went through the material again after this post, but too bad the truly “Anonymous” option is gone.