End Semester Recap

I just finished all my exams and papers today. It has been a long day (I woke up at 6) and I feel exhausted. However, I want to do a quick recap of this semester before my judgment is affected by my final grades.


CS 380D Distributed System

My first exam was a disaster. The exam was all about system design plus an understanding of Raft. I was not used to system design in general. All I did was memorize every detail of specific system implementations, which usually doesn’t matter from a design perspective. Vijay emphasized this point a lot, but I didn’t get it until the second half of the course. The course is good, and I have two big takeaways:

  • I can comfortably read distributed systems papers. I cannot claim I can read every type of systems paper, but for distributed systems papers I have started to gain momentum and to know where to focus while reading. It took a lot of struggle to get to this point, but I’m happy overall after reading more than 30 papers.
  • I got intrigued by distributed systems and storage systems. In the past, I struggled to find my research interests. But, thanks to this course, I have become more intrigued by the combination of distributed systems and storage. Right now, I like storage more. I read tons of LSM-based storage papers to find a topic for my final course project. I really enjoyed reading LevelDB’s and PebblesDB’s code and enhancing them in some way. That makes me want to know more about SSDs and HDDs.

CS 388 Natural Language Processing

I traded the algorithms class for this course. I have mixed feelings right now. On one hand, unlike the NLP course I took the previous semester, which looks at NLP from a models perspective (HMM, CRF, different networks), this semester’s course takes a more traditional linguistics plus machine learning perspective. I really like this part. Overall, I strongly believe linguistic domain knowledge should play the key role in NLP study, not the various deep learning mania. In the first two homework assignments, we looked at language models and LSTMs based on the intuition that prediction can run in two directions. I really like Mooney’s view that you should always think about the intuition for whether a model can work instead of mindlessly applying models. As with last semester’s NLP class, my interest in the class declined as the semester progressed, partly because the material was no longer relevant to the homework and exams. That is my bad.

The final project is on VQA, and it was mostly done by my partner. I only gathered the literature, surveyed the field, and did some proofreading. I’m OK with that, as I wanted more time to work on my systems project and my partner wanted to work alone on the modeling. This leads to my lesson learned from the class:

  • Graduate school is about research, not class. Pick the easiest courses and buy yourself time to work on the research problem that attracts you.

Looking back now, I wish I had taken the algorithms class instead. My thought on NLP is that I want to start from the dumbass baseline and learn the history of the field. If you think about NLP, the most basic technique is just regular expression pattern matching. How we go from there to more complex statistical models is the part I most want to learn.

LIN380M Semantics I

The course is taught by Hans Kamp, who I believe invented Discourse Representation Theory (DRT). Really nice man. I learned predicate logic, typed lambda calculus, Montague grammar, and DRT. It is a very good course on the logic-based approach to deriving the semantic meaning of a sentence. However, I do feel people in this field put a lot of effort into handling rule-based exceptions, like how to handle type clashes in Montague grammar. When I turned in the final exam, Hans was reading a research paper. He is still doing research, and that inspires me a lot.

Other Lesson Learned

  • “Don’t be afraid to fail, be afraid not to try”. I learned a lot from my final systems project partner. Reading complex code can be daunting, but we can always start playing around even when we cannot fully understand the code. There is a big psychological barrier to overcome. My partner always starts by reading and then writing. When a bug appears, he is happy because the bug is an indicator of progress, which eventually leads to working code.
  • Work independently. When I get stuck for a while, I always want to seek help instead of counting on myself to solve the problem. It seems that I can never trust my own ability to solve the problem. By observing how my project partner solves problems, I learned a lot. Start trying and always look for the root cause of the problem; the situation changes as soon as you start trying.
  • Some tips about system paper writing:
    • Use hatches on bar graphs. People may print out your paper in black and white. Hatches on the bars help them distinguish which bar is your system and which is the baseline.
    • Add more description below each figure and table. I used to think that there should be only one line of description for each picture. But, as another project partner of mine pointed out, people need instructions when reading graphs. People love pictures, and they hate going to the paragraphs to search for the instructions needed to understand a graph. Thus, put the instructions directly below the picture. Great insight!
  • I really want to know how to measure a system accurately. From my systems project, I realized that measuring system performance is really hard. Numbers fluctuate wildly and you have no clue why, because there are so many layers of abstraction and so many factors in the experiment environment that can affect the measurement. I really want to learn more about this area during my own study and summer internship.
  • System improvement without provable theoretical guarantees is very unlikely to succeed. Overhead, or the constant factor hidden in the big-O model, usually dominates the improvement you think you can get. For example, there is overhead in spawning threads. We need to compare how much we gain by having multiple threads run subtasks in parallel versus having one single thread do the whole thing. PebblesDB’s paper on guards and its improvement to compaction ultimately proves that we really need to think more before getting our hands dirty. Reading the paper, I get the feeling that the authors knew the system would work before implementing it, because they can clearly show that their design works before writing a single line of code. I need to develop more of this sense and take more theory classes.
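
A minimal sketch of the two points above (noisy measurements and thread overhead), in Python. The `work` function, the chunk sizes, and the repeat counts are hypothetical choices for illustration; none of them come from the course project or from PebblesDB. The idea is simply to repeat each measurement, report a median and a spread instead of a single number, and compare a single-threaded run against a thread pool on the same small task.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def work(chunk):
    """A deliberately small subtask: sum of squares over one chunk."""
    return sum(x * x for x in chunk)


def run_single(chunks):
    """One thread does the whole thing."""
    return sum(work(c) for c in chunks)


def run_threaded(chunks, workers=4):
    """Subtasks are handed to a pool; creating and scheduling
    the threads is the overhead being compared."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(work, chunks))


def measure(fn, *args, repeats=15, warmup=3):
    """Return (median, stdev) of wall-clock seconds over repeated runs."""
    for _ in range(warmup):  # discard the first few runs (cold caches, lazy setup)
        fn(*args)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples), statistics.stdev(samples)


# Hypothetical workload: 8 chunks of 1000 integers each.
chunks = [range(i * 1000, (i + 1) * 1000) for i in range(8)]
single = measure(run_single, chunks)
threaded = measure(run_threaded, chunks)
print(f"single:   median={single[0]:.6f}s stdev={single[1]:.6f}s")
print(f"threaded: median={threaded[0]:.6f}s stdev={threaded[1]:.6f}s")
```

On a standard CPython build the global interpreter lock keeps CPU-bound threads from running in parallel, so this sketch mostly exposes the overhead side of the trade-off. The exact numbers will differ from machine to machine and from run to run, which is the measurement problem described above.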


Ok. Time to pack and catch the flight.


On Reading CS Papers – Thoughts & Reflections

Be forewarned:

  • This is not an advice post. There are tons of people out there who desperately want to give people advice on reading papers. Read theirs, please.
  • This post is a continuous reflection on the topic “how to read a CS paper” from my personal practice. I will list my academic status before each point so that I can later see how my view on the matter has changed over time.


The first year of my CS master’s program. Just getting started on CS research.

  • It’s OK to not like a paper

In my first semester, I mostly read papers on human computation and crowdsourcing. Very occasionally, I read papers on NLP. Some of the NLP papers were extra readings from Greg’s course. Some were related to the final project for Greg’s course, which dealt with both code and language. I didn’t really like reading papers back then, and I didn’t want to. In the NLP class, I preferred to read textbooks (Jurafsky’s) and tutorial posts that I could find online. One roadblock was that I had background knowledge gaps to fill and I simply didn’t know how to read a paper. So, for Greg’s NLP course, I only read some papers related to my final project. This paper is the base paper for my final project. I got it from professors in linguistics and software engineering, who wanted me to try the same idea with a neural network model instead. I read this paper several times, and the more I read it, the more I wanted to throw up. I think the paper hides many critical implementation details, and the 95% score is just too high for me to believe. The authors open-sourced their code, but it has some nasty Maven dependencies and wouldn’t compile in my environment. Their evaluation metric is non-standard in NLP, and many “junk words” are wrapped around their results. Of course, the result of my experiment was quite negative. I often think it is just a waste of life to spend your precious time on a paper you dislike. Here, I’m mostly talking about a paper’s writing style and the reproducibility of its results. I probably also want to count shying away from a background gap as a legitimate reason not to like a paper.

  • Try to get the most out of the paper and go from there

I got this message from Matt’s crowdsourcing class. In the class, I read a very math-heavy paper that applies a combination of PGMs and variational inference to the credibility of fake news. I was worried back then about how to approach a paper like this one, for which I lacked a lot of background and whose formulas looked daunting. I posted my doubts on Canvas, and Matt responded in class with this message. I think the message really gave me some courage to continue reading papers.

  • It’s OK to skip (most) parts of a paper. Remember: a paper is not a textbook!

This semester I’m taking a distributed systems class. To be honest, distributed systems papers can be extremely boring if they are from industry. Even worse, systems papers can be quite long: usually around 15 pages, double column. If I read every word from beginning to end, I’ll be super tired, and that is not feasible for a four-paper-per-week class. So, I have to skip. Some papers are useful for just one or two paragraphs. Some are useful because of a single figure. As long as your expectation for a paper is met, you can stop wherever you want.

  • Multiple views of reading a paper

I didn’t get this point until very recently. I did quite terribly on the first midterm of my distributed systems class. The exam was about how to design a system to meet a certain requirement. In the first half of the course, I focused on the knowledge presented in each paper, but that didn’t work out well. Only then did I realize that I need to read those systems papers from a system design point of view: what problems they need to solve, what challenges they face, and how they solve those challenges. Of course, those papers are also valuable from a knowledge perspective: how consistent hashing works, for example. But depending on the goal, I can prioritize different angles when reading a paper. If I need to implement the system described in the paper, I probably need to switch to a different reading style.

  • Get every bit of detail from a paper if you need to

It’s time again for final course projects. Again, I need to generate some ideas and find some baseline papers. In this case, the “skip parts” and “get the most out of the paper and move on” strategies probably won’t work well. All in all, I need to understand the paper, and that relies on its details. In this case, I need to sit through the whole journey and remove any blockers that I encounter.

Job hunting lessons learned

This post contains a collection of lessons I learned during job hunting. I’m still looking for an internship and a job. That’s good, because it means this post will be updated frequently for the foreseeable future.

  1. Always attend career fairs. At UT, if you are a CS student in good standing, you can get an invitation to an event called FOCS Career Night. There will be a lot of recruiters. But be careful: most of the “recruiters” are actually engineers or UT students (that’s right, some companies have their Campus Ambassadors attend the event as if they were recruiters). There is a huge difference between recruiters and engineers: recruiters make the call on who gets an interview, not engineers! I made the mistake of attending only the FOCS Career Night and skipping the Career Fairs. In fact, the recruiters come to the Career Fairs, and some of them do on-campus interview signups on the spot. An on-campus interview is much better than an OA (online assessment). Even if you get an invitation to schedule an interview after FOCS Career Night, you still want to talk to the company at the Career Fairs, because recruiters can barely check their email when they are traveling and interview slots are always first come, first served. So, always make sure to come to the Career Fairs and schedule an interview immediately, instead of replying to the invitation email, waiting for a response, and then getting one that says the interview slots are all filled. This happened to me with Indeed.
  2. Always do the OA immediately. When you receive an OA, the company will usually tell you that you can finish the test within a certain number of days. However, things can change rather quickly. Even if they give you a buffer, like finishing the test within 4 days, ignore the message and do the OA immediately. Slots can fill quickly, and some companies have an unwritten rule: even if they say 4 days, they really mean immediately. This happened to me with Dropbox.
  3. Always finish the OA within min(60 minutes, restricted time). Some companies allow you to finish the OA over several days. In other words, even after you start the test, you have a couple of days to finish it. Ignore this, please! Even though the test window lasts for days, finish it as quickly as you can. The finishing time is a strong indicator of your coding ability. This happened to me with Twitter.
  4. Always follow up with the recruiter. Sometimes there is a system error and they send a rejection email to the wrong person. Make sure you confirm this with the recruiter, and finish your OA no matter what happens. Even if you got rejected, the OA is still an invaluable practice opportunity. This happened to me with Dropbox.
  5. Always make sure you apply in the University Recruiting section. Companies make specific web pages for new and recent graduates. Make sure you submit your resume there. If you submit it to the wrong place, you may end up in a pool filled with professionals with 5+ years of experience. That always leads to either hearing nothing back or an immediate rejection letter. This happened to me with Dropbox.
  6. Use LinkedIn and be aggressive. I’m a shy person, but job hunting, as the name suggests, is a hunt. You have to be aggressive. Connect with as many people as you can, whether through Career Fairs, social events, or LinkedIn InMail. Be polite and be bold. Ask them for the opportunity. One special note is that you may want to “harass” recruiters and senior developers on LinkedIn. Their words have much more power, and you may get an interview very quickly. If they are rude when you ask for a favor politely, you already know that this company is definitely not one you want to work for. This happened to me with Teradata (BTW, they were on the polite side).
  7. Prepare for the technical interview questions:
    1. The interviewer may slightly modify the questions even if they are from LeetCode. For example, instead of asking what exactly the shortest path is, as in the original LeetCode question, the interviewer may ask how many steps the shortest path takes. The difference is that the former expects a list of coordinates (i.e., the steps) and the latter expects simply a number. This happened to me with Pocket Gem. The takeaway is that when you solve LeetCode questions, you should think about possible variations. It may seem infeasible to do this for every problem, and you don’t have to unless you have nothing else to do. The next point helps to address this concern.
    2. Browse forums for recent interview questions from the company you are about to interview with. This helps to address the previous concern. If you see that the company asks certain LeetCode questions, look at those questions and think about the possible variations. Also, companies usually have a pool of questions, so with some prior exposure from a forum you may already be well prepared. This point also helps if you are very short on preparation time. In that case, you just prepare the questions from the forum and you’re good to go. Sometimes this works much better than a long-term preparation strategy, with which you may feel over-prepared and feel that a good chunk of time was wasted on LeetCode when you could simply have prepared the questions from the forum.
    3. Practice on LeetCode. People usually emphasize the importance of practicing on LeetCode. That’s true. However, how you use it depends on when you are about to apply for the position. When recruiting new grads, some companies prefer to start early (e.g., in the fall) and others don’t (e.g., from January to March). People always think they should apply as early as possible to get a spot in the limited headcount. That’s true, but this strategy comes with a risk: you’ll see new interview questions that no one else has seen before. Each year, companies may update their pool of questions. If you can solve LeetCode problems as easily as “1+1” and have solid preparation in system design, then applying ASAP is the best strategy. However, if you are in an OK-ish position in algorithm and design preparation, then you may want to delay applying by one or two weeks. The beauty of the delay comes directly from the previous point: you may get exposure to the pool of questions before the actual interview. How long to delay depends on the case. Some companies (e.g., Dropbox, Pocket Gem) are quite active and send you the OA almost immediately after your application, so you may want to delay on your side. However, some companies have a long process before setting up any interviews; for those, you may want to apply ASAP and let the internal processing take its time.
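
To illustrate the “think about variations” advice in point 7.1, here is a small sketch in Python of the shortest-path example: one function returns the list of coordinates, and the variation returns only the number of steps. The grid encoding (0 for open, 1 for wall), the function names, and the sample grid are assumptions made up for this sketch; this is not the actual interview question.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """BFS on a grid of 0 (open) / 1 (wall).

    Returns the shortest path as a list of (row, col) cells,
    or None if the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


def shortest_steps(grid, start, goal):
    """The variation: only the number of steps, not the path itself."""
    path = shortest_path(grid, start, goal)
    return None if path is None else len(path) - 1


# Hypothetical sample grid: the wall in the middle row forces a detour.
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = shortest_path(grid, (0, 0), (2, 0))
steps = shortest_steps(grid, (0, 0), (2, 0))
print(path)
print(steps)
```

Once the path version is solved, the step-count version is one line on top of it, so practicing the harder form of a question usually covers the easier variation as well.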

Towards the end of the semester

Busy with the final projects. The takeaway from this semester: never pick two 395T courses at the same time. Sorry.

— Update: 01/03/18 —

Last semester ended up amazingly well. Every time I read this post, I picture Prof. Dana Ballard’s pull-up gesture in my mind and how he compared the difficulty of coursework to a workout in the gym: you always want to lift a heavier weight to gain muscle. For some unknown reason, his voice and pull-up gesture always amuse me.

Leaving IBM

To be honest, this is probably the most difficult post I have ever written. This is mainly because there is a ton of stuff I want to say, but I’m unsure whether I should make it public or keep it to myself. Another factor that makes this post hard to write is how long I have been drafting it. I have been drafting this post since April 2016, right after I decided to start the whole quit-IBM-and-get-a-PhD project. I used to use this post as a log to record things and feelings whenever something happened around me at IBM. Frankly, when I look back at what I recorded (mostly rants), a lot of it still holds, but the anger has passed with time. So that year-long drafting makes me hesitate even more, because the mood in which those things were written is gone. However, two years is a significant amount of time, quitting IBM can be called “the end of an era”, and I should give closure to my happy-and-bitter experience with IBM anyway. So, here it goes.


Thank you, IBM!

I’m really thankful for the opportunity to work at IBM. This experience made me grow both technically and mentally. Technically, I had the opportunity to get hands-on experience with DB2 development. DB2 as a database engine is extremely complex. It has over 10 million lines of code, far beyond the scope of any school project. Working on a project like that is quite challenging because there is no way to get a clear understanding of every part of it. I still remember that when I attended the new hire education on DB2, one guy said: “I have been working on the DB2 optimizer for over 10 years but I cannot claim with certainty that I know every bit of the component I own.” This shocked me, and based on my experience so far, his claim still holds, but with one subtle assumption, which I’ll talk about later. Lots of tools are developed internally, and reading through both the code and the tool chains is a great fortune for any self-motivated developer. I picked up lots of skills along the way: C, C++, Makefile, Emacs, Perl, Shell, AIX, and many more. I really appreciate this opportunity, and I feel my knowledge of databases and operating systems has grown a lot since my graduation from college.

Mentally, there were also lots of gains. Being a fresh grad is not easy. Lots of people get burned out because they are like people learning to swim who are thrown into the water: either swim or drown. I’m lucky that my first job was with IBM because the atmosphere is so relaxed: people expect you to learn on your own, but they are also friendly enough (the majority of them) to give you a hand when you need help. I still remember that my first customer ticket was a severity one issue, which requires daily updates on your progress with the problem. There was a lot of pressure on me because I really had no clue about the product at the very beginning. I’m thankful to those who helped me at that time and in many difficult moments afterwards. That made me realize how important it is to be nice and stay engaged with the people around you, because no matter how good you are with the technology and the product, there is always stuff you don’t know. Staying engaged with the people around you may help you get through difficult moments like this by giving you a thread that you can at least start to pull. In addition, participating in the Toastmasters club really improved my communication and leadership skills, and more importantly, I made tons of friends in the club. Without working at IBM, I probably wouldn’t even know that Toastmasters exists. If you happen to follow my posts, you’ll see that a lot was going on around me while I worked at IBM. Every experience you go through offers you a great opportunity to learn and improve yourself. Some people may look at them as setbacks, but I look at them as opportunities.


(The picture on the left shows all the comments people gave me on my speeches; the picture on the right shows the awards I earned in the club over these two years.)

With the help of all those experiences, I have developed good habits: writing blog posts (both technical and non-technical), reading books, and working out six days per week. None of this would be possible if I worked at a place where overtime is common. I’m very thankful to IBM for this, because staying healthy both physically and mentally is critical for one’s career. Even though these things don’t come directly from IBM, IBM does provide the environment that nurtures them.


IBM has its own problem. The problem is centered around people. There are many things I want to say, but I think I’ll keep them to myself; instead, I want to make my point with a picture:


I don’t know why IBM’s term “resource action” for firing employees and the sentence “IBM recognize that our employee are our most valuable resources.” bother me so much. I probably just hate the word “resource” as a way to describe people, and how much this word gets thrown around IBM. I know everyone working for a big corporation is just a cog in a machine. However, what I feel, based on lots of things that happened around me, is that IBM, through the attitudes of its first-line managers (because those are the people I commonly work with), makes this fact very explicit. It hurts, to be honest. No matter how hard you work and no matter how many prizes you have earned for yourself and your first-line manager, you are nothing more than a cog in a machine, one not worth a high price to keep around because there are many cogs behind you ready to replace you. They are much cheaper, much younger, and can more or less do what you do, because your duty in the machine is so precisely specified that it doesn’t really depend on how much experience you have under your belt. To me, that’s devastating.

This leads to the problem that talented people are reluctant to stay with the company. My mentor and the people who are so good with DB2 have bid farewell to the team. That’s really sad to me because they are the true assets of the company and the product. The consequence is that crucial knowledge leaves with people. Some quirks in the product are known only to certain people, and once they leave the company, that knowledge is gone with them. That makes mastering the product even harder. That’s the subtle assumption behind what that person said during the new hire education, and it’s also part of the problem of working with legacy code. The whole legacy code issue is worth another post, but one thing I now strongly believe is that any technical problem has its root cause in company culture and management style. As for me, I’m not a guru now, and I cannot see a way to become a guru in my current position, which scares me the most.

That’s it for this section and I’ll leave the rest to my journal.

“Research” Interest

This Friday, I met my future roommate in Beijing. During lunch, we had a conversation about each other’s research interests. My roommate, like me, is a CS graduate student at Austin. However, unlike me, he has a clear vision of what direction he is going to pursue in graduate school. He just finished his undergraduate degree in the Automation department at Tsinghua University. The Automation department, as he explained, is like a mixture of mechanical engineering and electrical engineering. He has been interested in mathematics since high school, and naturally, he wants to work on machine learning theory in graduate school with an emphasis on computer vision (CV).

Now comes my turn. That’s a hard question I have been thinking about for a while. I don’t have a clear vision of what I’m going to pursue next. I think maybe I’m too greedy and want to keep everything. However, I also realize that I may not be as greedy as I initially thought. I know I don’t want to work on computer architecture, computation theory, algorithms, compilers, or networks. So my options really come down to choosing among operating systems, databases, and machine learning. Within machine learning, I even know I probably won’t choose computer vision in the end (I still want to try a course, though), and I lean more towards natural language processing (NLP). However, picking one of those areas is just too hard for me now, even after I did some analysis in my last post trying to talk myself into picking machine learning only. There is always a question running in my head: why do I have to pick one? Sometimes I just envy people like my future roommate who don’t have this torture in their minds (maybe he does? I don’t know).

This feeling, to be honest, isn’t new to me. When I was an undergraduate facing the pressure of getting a job, my naive approach was to lock myself in my room and keep thinking about what profession might suit me best. After two years of working, I have grown up enough to know that this way of making choices is stupid, and I have also grown up enough to know that “giving up is an art”. Why am I in such a rush to pick the direction I want to pursue before I have even taken any graduate courses? Why can’t I sit down and try out several courses first? Because I want so badly to get a PhD at a good school. Let’s face the fact that people get smarter and smarter with each generation. Here “smarter and smarter” doesn’t necessarily mean that people won’t repeat the mistakes that happened before. It means that people will have a better capability to improve themselves. Machine learning was not hot in 2014, from my experience in college. Back then, LeetCode only had around 100 problems. I had no particular emotional attachment to the machine learning material when I was taking the AI class. Maybe because Wisconsin has a tradition in the systems area? I don’t know. However, in 2017, everyone, even my mother, who is a retired accountant, can say a few words about AI and machine learning. Isn’t that crazy?

On my homepage,  I write the following words:

I like to spend time on both system and machine learning: system programming is deeply rooted in my heart that cannot easily get rid of; machine learning is like the magic trick that the audience always want to know how it works. I come back to the academia in the hope of finding the spark between these two fascinating fields.

Trust me, I really mean it. Maybe because I graduated from Wisconsin, I have a natural passion for system-level programming, whether it comes from operating systems or databases. Professor Remzi’s systems class is a blast for anyone who wants to know what’s really going on under the software application layer. Professor Naughton’s DB course is full of insights that I keep referring to even now that I work on a DBMS in the real world. Wisconsin is just too good in the systems field, and this is something I can hardly say no to, even though I have worked so hard to lie to my own face that “system is not worth your time”. What about machine learning? To be honest, the great AI dream may never be accomplished. The undergraduate AI course surveys almost every corner of AI development, but only machine learning has become so hot nowadays. Almost every AI-related field nowadays (e.g., NLP, robotics, CV) relies on machine learning techniques. Why am I attracted to machine learning? Because it’s so cool. I’m like a kid who is eager to know what is going on behind a magic trick. Machine learning is a technique for solving un-programmable tasks. We cannot come up with a procedure to teach a machine to read text, identify objects in images, and so on. We can solve these tasks only because of the advancement of machine learning. Isn’t this great? Why both? I think machine learning and systems are becoming more and more inseparable. Without good knowledge of systems, one can hardly build a good machine learning system. Implementing batch gradient descent using map-reduce is a good example of this.
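
As a rough illustration of the last example, here is a minimal single-machine sketch in Python of batch gradient descent written in map-reduce style. Everything in it is an assumption made for illustration: the model is a one-variable linear regression, the “cluster” is just a list of in-memory shards, and Python’s built-in `map` and `functools.reduce` stand in for a real map-reduce framework. The map step computes a partial gradient on each shard, the reduce step adds the partial results together, and the driver applies one update per pass over the data.

```python
from functools import reduce


def partial_gradient(shard, w, b):
    """Map step: gradient sums of the squared error over one shard of (x, y) pairs."""
    gw = sum((w * x + b - y) * x for x, y in shard)
    gb = sum((w * x + b - y) for x, y in shard)
    return gw, gb, len(shard)


def add(left, right):
    """Reduce step: element-wise sum of two partial results."""
    return tuple(p + q for p, q in zip(left, right))


def batch_gradient_descent(shards, lr=0.05, steps=2000):
    """Driver: one map-reduce round per gradient step."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        gw, gb, n = reduce(add, map(lambda s: partial_gradient(s, w, b), shards))
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


# Hypothetical data generated from y = 2x + 1, split into three shards.
data = [(float(x), 2.0 * x + 1.0) for x in range(6)]
shards = [data[0:2], data[2:4], data[4:6]]
w, b = batch_gradient_descent(shards)
print(f"w={w:.4f} b={b:.4f}")
```

The systems part is that the gradient is a sum over examples, so each shard can be processed independently and only three numbers per shard need to travel to the reducer.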

I just realized that I haven’t answered the question about rushing the decision. To get into a good graduate school for a PhD, you need to demonstrate that you can do research. This is done by publishing papers. Most undergraduates nowadays have papers under their belt. That’s huge pressure for me. A master’s program is only two years. I cannot afford the time to look around. I need to get started on research immediately in order to be in good standing when I apply for PhD programs in 2018.

So, as you can tell, I have a problem. And as a future researcher, I need to solve it. Here is what I’m planning to do:

  • Take courses in machine learning in the first semester and begin to work on a research project as soon as I can. I’ll give NLP problems a chance.
  • Meanwhile, sit in on the OS class and begin to read papers produced by the Berkeley Database group. People there seem to be interested in the intersection of machine learning and systems. This paper looks like a promising one.
  • Talk to more people in the area and seek some advice from others.
  • Start reading “How to Stop Worrying and Start Living”.

Will this solve the problem eventually? I don’t know. Only time can tell.

What are some useful, but little-known, features of the tools used in professional mathematics?

What's new

A few days ago, I was talking with Ed Dunne, who is currently the Executive Editor of Mathematical Reviews (and in particular with its online incarnation at MathSciNet). At the time, I was mentioning how laborious it was for me to create a BibTeX file for dozens of references by using MathSciNet to locate each reference separately, and to export each one to BibTeX format. He then informed me that underneath every MathSciNet reference there was a little link to add the reference to a Clipboard, and then one could export the entire Clipboard at once to whatever format one wished. In retrospect, this was a functionality of the site that had always been visible, but I had never bothered to explore it, and now I can populate a BibTeX file much more quickly.

This made me realise that perhaps there are many other useful features of…
