Notes on My First Car Accident and Repair

Accident recap

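Around 6:10 pm on February 11, at the highway exit near my home, a black car suddenly stopped. Once I realized what was happening I hit the brakes, but because of the extreme weather in Austin that weekend the road was icy; my car skidded, could not stop, and hit the black car’s rear bumper. I then shifted into P and brought the car to a stop. A white car behind me, probably for a similar reason, then hit me. Because my car had stopped at an angle, the white car hit my driver-side door directly and slid off along my front bumper. Fortunately, nobody was injured.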

Dealing with the insurance company

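My insurance is with State Farm. The process went roughly like this. I first opened a claim on my own policy, and then the insurer started deciding who was liable for the accident. Later I wondered: since the white car behind hit me, if I open a claim on my own policy, will the liability decision end up putting me fully at fault? So I called the State Farm collision team (the phone number is visible once the claim is opened) and learned the following. Since this was my first accident, the call also gave me a rough understanding of how an accident gets handled.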

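  • My case is one claim with two liabilities: one liability decides my hitting the first (black) car, the other decides the white car hitting me. The State Farm phone representative told me he would also open a claim on the white car’s policy (the internal term seems to be “dual claim”). Because the white car is also insured by State Farm, the internal process should be fairly easy. I think that if the other driver had been with a different insurer, I should have opened a claim with that insurer first. Does opening a claim on my own policy give the impression that I am the at-fault party?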
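  • The insurer determines liability. For the part where I was hit, the deductible will be returned to me. For the part where I hit the black car, the deductible applies, and the insurer only pays the costs beyond the deductible.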
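  • There are two ways to get the car repaired. One is to use a body shop that partners with the insurer: the shop sends an estimate to the insurer, and the insurer pays the repair cost directly. If you want original equipment manufacturer (OEM) parts here, you pay the price difference yourself. The other is the “pocket estimate”: you photograph the car and find a collision center (body shop) on your own, and the insurer sends you money according to the estimate. For costs beyond that estimate: if it is damage that was not visible on the surface but was found during the repair, the insurer has an appraiser negotiate a new estimate with the body shop, and the amount above the original estimate is called the supplement estimate. Anything else beyond that, such as extra labor, the insurer refuses to pay and you cover the difference yourself. Whichever way you go, you can ask State Farm to tow the car to the shop you choose.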

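That is State Farm’s process. It does not seem to be universal. I asked a friend who had been in an accident before: he asked the dealer directly for its designated body shop, then contacted his insurer, and that was it. He is with Progressive. What I did was look up the certified collision centers on the manufacturer’s website and call them one by one to ask whether they could work with State Farm.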

Event log

On the evening of the 11th I filed a claim with my insurer.

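On the morning of the 12th, I learned by phone that the collision center near my dealer can work with State Farm. Close to noon a State Farm claim associate called to get the accident details, asked me for my chosen repair shop, and arranged a tow. I decided to go with the pocket estimate and have the dealership’s collision center do the repair. In the afternoon State Farm called again to explain the liability decision.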

On the 13th I uploaded photos through pocket estimate, i.e. I submitted the damage report.

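Around 2 pm on the 16th the towing company called to say they would come tow the car that day. But it was snowing and I was worried about an accident, and my pocket estimate was not ready yet; ideally I would go to the collision center with the pocket estimate in hand. So I rescheduled to the following Monday.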

On the afternoon of the 17th the pocket estimate report came out.

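On the 19th I got an email saying State Farm was going to send the check directly to me. But I had earlier told the insurer to pay the repair shop directly. Earlier in the day I called the insurer again and had them stop payment. The process now is that after the shop finishes the repair, I call the insurer again and have them issue payment directly to the shop. In other words, I have to coordinate the timing. Also, because the white car behind hit me, the deductible I pay will be refunded by that car’s policy. The insurer also confirmed my tow appointment for next Monday: on the 16th I thought I had rescheduled, but internally it showed as cancelled.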

On the afternoon of the 22nd the towing company towed the car away.

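On the morning of the 23rd the repair shop sent a form confirming they had received the car; once I filled it in, they could start the repair.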

On the 24th the repair shop sent its repair quote to State Farm (i.e. the supplement). State Farm’s estimate was around $3,400; the shop’s quote was over $8,000.

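On March 1 State Farm updated its estimate to over $7,000, but that still left a gap of about $1,000 from the shop’s quote. I don’t know what happens next. I’ll wait and see what the shop says.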

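On March 2 the shop adjusted its quote to be close to the insurer’s estimate, but I need to pay an extra $225 OEM parts difference. If I agree, they will go ahead and order the parts. I looked online for how to negotiate, and it felt like there was not much to do here: the shop will always find reasons to justify the amounts on the bill. So I agreed to pay.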

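On the evening of March 2 I got a text saying State Farm had paid the repair shop a bit over $3,000. I found this strange, because on the 19th I had explicitly told State Farm not to pay me, and that I would notify them to pay the shop once the repair was done. So I called them on the 3rd. I learned that the money for the original estimate had already been sent to me (indeed it had), and that the supplement portion was paid directly to the shop. However, the sum of what they paid the shop and what they paid me did not match the estimate the shop had shown me. So I wrote directly to the shop to confirm the final amount I needed to pay. It turned out to be the insurance check, the deductible, and the $225 part difference.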
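The reconciliation above can be sketched as a few lines of arithmetic. This is only an illustration: the function names are made up, and every dollar amount is a hypothetical placeholder (the post gives approximate figures and never states the deductible).

```python
# Sketch of the payment reconciliation. All amounts are hypothetical
# placeholders, not the real bill.

def due_at_pickup(insurer_check_to_owner: float,
                  deductible: float,
                  oem_part_difference: float = 0.0) -> float:
    """Amount the owner pays the shop: insurance check + deductible + OEM difference."""
    return insurer_check_to_owner + deductible + oem_part_difference


def shop_total(insurer_paid_to_shop: float, owner_pays: float) -> float:
    """Everything the shop collects: the insurer's direct payment plus the owner's payment."""
    return insurer_paid_to_shop + owner_pays


if __name__ == "__main__":
    check_to_owner = 3400.0      # hypothetical: original-estimate money sent to the owner
    supplement_to_shop = 3000.0  # hypothetical: supplement paid directly to the shop
    deductible = 500.0           # hypothetical: the deductible is never stated
    owner_pays = due_at_pickup(check_to_owner, deductible, oem_part_difference=225.0)
    print(owner_pays)                                  # 4125.0
    print(shop_total(supplement_to_shop, owner_pays))  # 7125.0
```

If the shop's final invoice differs from the second number, the gap is what to ask the shop or the insurer about.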

On March 8 the shop called to say the parts had arrived and the repair would be done this week.

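On March 12 the shop called to say the car was fixed and ready for pickup. They also told me the $225 part difference had been waived. Now I only need to pay the money the insurer sent me plus the deductible.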

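On March 15 I went to the shop to pick up the car. Since the insurer had sent the money to my account, I planned to write the shop a personal check. At the shop I was told they only accept a personal check when the car is first towed in; at pickup they only take a cashier’s check or a debit card charge. But the amount exceeded my debit card’s daily limit, so on the spot I called the bank at the number on the back of the card to have them approve the transaction. After some back and forth it finally went through, and I got the car.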

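Also on March 15 I called the insurance company in the evening to ask about the deductible refund. Basically, this is called internal recovery. They said the money would arrive in my account within 3-5 business days.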

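On the evening of March 15, 20-odd minutes after the previous call, the same State Farm agent called me back. She said that when she reviewed my claim with her supervisor, it turned out my deductible could not be reimbursed because my claim includes liability on my part. If my car had only been hit by the car behind, my deductible would be reimbursed. She suggested I call the claim team again tomorrow, explain the situation, and ask to be transferred to the property complex team to resolve it. I checked my earlier notes: the term for my situation is dual claim. Remembering this will help tomorrow’s conversation. I searched online and, via this link, found the following passage: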

In most cases, you do not have to pay your deductible if another insured driver hits you. The other driver’s liability insurance should pay for your repairs. If you have collision coverage, you can choose to go through your insurance to repair your car, but you still won’t have to pay the deductible. Your insurance company will seek full reimbursement from the at-fault driver’s insurer.

If another person files a claim against you, your liability coverage will cover the costs of repairs. You will not pay a deductible to cover damages to the other party. But you will have to pay a deductible to get your own car fixed when you are at-fault. You can also expect to pay all or part of your deductible in situations where fault is shared between you and the other driver. You may be on the hook for any damage you cause that exceeds your policy limits, too.

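The problem now is that I paid the deductible to repair my own car, but the damage came from two different impacts: my hitting the car in front and my being hit by the car behind. In other words, at least more than half of the deductible I paid should be reimbursed, because a large part of the damage to my car was caused by the car behind hitting me. There is indeed the sentence “You can also expect to pay all or part of your deductible in situations where fault is shared between you and the other driver.”, but most of the damage to my car was caused by the car behind. That is, the “In most cases, you do not have to pay your deductible if another insured driver hits you.” case carries more weight. I plan to argue with the insurer on this basis tomorrow.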

On March 16 I called State Farm and tried several different claim team agents. Their answers:

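  • agent1: the whole dual claim has to go through the subrogation team’s process, which takes about 2 weeks; they will contact me and refund the deductible.
  • agent2: told me the whole subrogation process started on February 11, the day of the incident. Usually it has to wait until the body shop finishes the repair and everything owed is paid; then the subrogation team bills the at-fault party’s insurance policy. The first money recovered is used to reimburse my deductible. I was also told that next time I call I only need to ask: “Has this claim been sent to subrogation yet?”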

Based on these answers, I plan to wait two weeks and then follow up. Usually the whole process takes at most 3 months.

On March 19 the insurer emailed me that my deductible had been refunded.

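On March 22 the refund arrived at my bank. With that, the whole thing came to an end. In the end the repair cost me nothing, although my premium went up by 4 dollars.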

Research log 3

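Today’s entry is a rant about my advisor. Two months into the semester, my biggest result so far is showing that the idea my advisor was proudest of doesn’t really work. I say “doesn’t really work” because his concrete algorithm implementation turned out not to be adaptable in any simple way to the problem I’m studying now. The idea behind the algorithm may still be salvageable, though, so it feels too early to pronounce a death sentence. The focus of the work has now shifted from developing a new algorithm to learning the background knowledge of the problem domain. Earlier, the plan was to give the new algorithm certain properties so that it satisfies some bounds. But all of those properties were hearsay from my advisor. Now that I have no leads on developing the algorithm directly, I might as well spend the time figuring out what those properties are and what exactly the bounds are. So the goal for this semester has been revised to understanding the two relevant key papers. I have to say, what I’m doing is getting more and more theoretical; implementation is not that hard by comparison.

Back to the rant. At Tuesday’s meeting he spent almost an hour expansively telling me the ideas in his head about the next steps for developing the algorithm. But what’s in his head is all his own definitions, and the words he uses as terminology normally mean something else. For example, his so-called “plus” is actually multiplication in ordinary mathematics. This didn’t stop me from understanding what he meant, but most of my energy went into decoding his language; as for which problem in my current algorithm he was trying to solve, I had no idea. I had actually already written up the current version of my algorithm and given it to him to read. I guess he was too absorbed in working on his house and didn’t read it at all. So not only is my vocabulary completely incompatible with his, he probably only has a rough impression of what state my algorithm is in. He realized this himself during the meeting, so he said he would read what I wrote and meet again on Thursday to discuss. And then nothing happened, because it is now Thursday evening. This really annoys me. It’s like I’ve already written the report at work, it isn’t even long, and if my manager just read it we could have a good discussion in the meeting. In other words, I’ve done everything short of spoon-feeding him what I wrote.

I talked this over with my girlfriend. Our conclusion is that academia is basically like this. It is not like a job, where you have a manager and peers, and if something can’t move forward they will help you push others. In academia your advisor is basically Jeff Bezos: if Jeff promises something and doesn’t do it, there isn’t much you can say. So my approach now is: if he wants to meet and talk, let him talk; listen to what’s useful and let the rest go. I won’t chase him for meetings either; if he wants one, fine, and if not, fine. I’ll learn the background knowledge first and see what comes after. This reminds me of a principle in The 7 Habits of Highly Effective People: someone like my advisor is an external factor you can’t control, so rather than putting expectations and energy on him, put them where you have control, such as understanding the key papers.

Writing this, I thought of one more thing to try: as at Amazon, spend the first 20 minutes of the meeting giving him time to read, watch him read, and discuss only after he’s done. I’ll try this next week. Honestly, I don’t think an advisor like this should be treated as an entirely uncontrollable factor. I need to find ways to shrink the uncontrollable part and turn “completely uncontrollable” into “partly controllable”. I won’t present to him in meetings any more; he can read, so let him read and then we discuss. Some of his habits have to change, whether he likes it or not. I think trying to build a reasonably professional working environment is really necessary.

Having written all this, I feel better and can dive back into my papers. Oh, one more thing that bugs me: before I came, morale was high, and I thought he was going to do something big to give his career a perfect ending. Instead, since the semester started he has mainly been busy with his house. On top of that he teaches two courses, and as for the talk of writing a funding proposal, I now treat that as hot air. Again: put the energy into what I can control, and let the rest be what it is. It’s a pity, because the research is really interesting. If I do make it through the PhD, I’ll go find a company to do research at.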

SIGMOD 2020 conference experience

This is my brain dump of the SIGMOD 2020 conference experience.

First of all, I really like the virtual conference format. I have been to a conference once in the past. One big lesson from that experience is that there is no way to attend all sessions, simply because it’s not physically possible: sessions run concurrently, and it’s cumbersome to navigate the venue and get around the crowd to reach the session you want in time. With Zoom, however, the magic happens. I can open all the sessions I’m interested in and mute each one through Zoom’s audio settings. If I find a topic I want to hear more about, I can instantly switch to that Zoom window, turn the audio back on, and listen to the talk. The experience feels like watching a streaming marathon on Twitch or watching The International from your laptop. Also, virtual means no visa headache 🙂 Another good thing about having sessions in Zoom is that I can easily ask questions, either by chiming in directly (thanks to Boon Thau Loo for promoting me to the panel to ask a question live) or by typing. Asking questions in person can be a challenge, but being able to type questions online creates a relaxed environment for me to interact with speakers.

Another big advantage of a virtual conference is the cost. Because of COVID, this year’s SIGMOD is free. All I need to do is sign up, and then I can get into the internal system to attend any session I’m interested in. My perception of academic conferences is that everyone gathers at some fancy resort, enjoys the social interaction for a week, and then flies back home. I can imagine how costly this is given the flights, hotel rooms, and registration fee. A zero-cost conference means outreach: it reaches a wide audience. I would really love the organizers to publish some stats on how many people actually attended virtually. I’ll be hugely surprised if we don’t see a big jump in that number. In addition, the free admission feels like a welfare benefit for me: I don’t have to pay a few hundred dollars to get myself motivated for the PhD journey ahead of me. I can see an accessible conference like this year’s SIGMOD being a huge morale booster for someone who is struggling in their PhD, and motivating people to do good work.

Another big win for me is the recordings: each presenter records their talk beforehand and the session chair simply plays the videos one by one. In some sessions I did run into technical issues, like a video with no audio. But the issue was fixed within 5 minutes, and the small distraction didn’t affect the session experience at all. Pre-recording means high talk quality. Speakers can give their best performance. I think many of the speakers probably recorded their talks several times and picked the take that best delivers the idea of their work. Another great part is that the presenter is standing by to take questions and can reply in the Zoom chat, and even in the Slack channel several hours after the talk. This is very nice because the discussion can continue asynchronously, and both questions and answers are written down for later digestion. Speaking of Slack, the PC chairs of SIGMOD and PODS were quite busy: you could see them across almost all Slack channels, constantly saving questions and comments from Zoom to Slack to spark more discussion, and sharing Zoom links and session context. I think they deserve some kudos.

Slack screenshot: a PC chair organizing and pasting content from Zoom into Slack.

Being virtual means a lot of written communication, whether through Zoom chat or through Slack. This is a huge benefit. I assume important information easily gets lost in offline conversation. For example, it’s really too early for me to get up at 7am on Friday to attend the New Researchers Symposium with an 8-hour workday ahead of me. However, thanks to the virtual format, lots of discussion happened both during the live Zoom session and, more importantly, on Slack. People used Slack to ask questions, and some panelists were nice enough to write their answers in the Slack threads as well. This is good for me because during the lunch break I can scan through the Slack channel and pick up information from the earlier discussion. I figure this would not be possible if the conference were offline.

I’m not sure how the conference was run in the past, but I think this year’s organizers put huge effort into an all-in-one page with the Zoom links, Slack links, and schedule, so that I can easily find the information I want.

A peek at the all-in-one schedule web page. Very useful.

As a first-time academic conference attendee, matching actual people to the names on papers is a huge win for me. In some ways it feels like a handshake event for Japanese idols. Indeed, those “Japanese idols” are in fact quite approachable. I really enjoyed Anastasia Ailamaki’s smiling face on camera and her persistent typing to answer questions in both the Zoom chat and the Slack channel. The sessions I really liked from this perspective are the SIGMOD Plenary Panel: “The Next 5 Years: What Opportunities Should the Database Community Seize to Maximize its Impact?”, Mohan’s Retirement Party, and the Industry Panel: Startups Founded by Database Researchers. So many names that I had seen both on papers and in internal codebases. It’s very cool to see some roasting and teasing online.

Social-wise, I really like the Zoomside Chat series, for example “Zoomside Chat with Jian Pei”. It feels like a coffee break and the topics are very relaxed. This is the place where some “ungraceful” questions get asked. “Zoomside Chat with Tamer Özsu” was also fun. I really wish more time were allocated to this type of social event.

Research-wise, taking a look at the list of accepted papers beforehand is really helpful. On my laptop, I keep a list of papers in Notes and write down my thoughts and comments for each paper on the list. Given my personal interests, the Wednesday and Thursday sessions interested me the most. Luckily, my targeted papers were spread quite evenly over the two days. My biggest regret is not reading through the papers I’m interested in beforehand. As a result, I got lost in sessions that seemed tangential to my research direction. This is worth improving next time to make the most of the conference. Having said that, I still learned about some useful benchmarks that I can run for my research. Also, sitting through the talks (even when lost) helped me refine my paper list for future reference. Another thing I noticed is that workshops are much nicer for learning. Research sessions usually give each speaker only 10 minutes, so the speaker has to move very fast and can cover only part of the material in the paper. A workshop speaker, however, has 25 minutes (at least at the aiDM workshop), and the pace is much slower than in the normal research sessions. Lastly, attending sessions is a great way to discover knowledge gaps: even if they don’t relate to my research direction, it is still fun to learn for pleasure.

One downside of this year’s virtual conference is Gather. I don’t know how useful it is for others, but it is not very useful for a working professional like me. First of all, my company “bans” the website: I can get into the room, but it takes forever to load the venue floor plan and see other people. If I really want to use Gather, I have to disconnect from the company VPN. I would like to walk around the venue while waiting for a build, but a VPN-unfriendly Gather doesn’t help here. Another disadvantage of virtual is that I don’t have to participate “fully”: I can run errands, check work emails, or fix some bugs for work. I don’t have to give my full energy to the conference. I guess this is really my bad.

Overall, I’m very grateful for this virtual experience at SIGMOD this year. The overall experience was excellent. I’m hoping they will do something similar next time; maybe partially virtual? However, I will surely miss the chance to see people and attend sessions in person. That motivates me the most to do good work, because I want to attend next time (maybe as a presenter).

UPDATE (06/19/20):

Received an email from the organizers:

The last workshop has finished, and SIGMOD/PODS 2020 is now history. We suspect it will be a landmark in most of your minds, separating SIGMOD/PODS Conferences into those pre-2020 and those post-2020. Even before all the adjustments brought on by the COVID-19 crisis, we planned to stream more of the sessions. Our registration of ~3000 shows that there is high demand for online access to the conference. If our community is serious about fostering diversity and inclusion, then remote participation should become a permanent option.

Looking forward to remote participation in the future.

Appendix

This section collects some useful comments I gathered from Slack. It is for my future reference; might be useful to you as well.

From the SIGMOD Plenary Panel: “The Next 5 Years: What Opportunities Should the Database Community Seize to Maximize its Impact?”, on whether researchers should be “customer obsessed” and solve real problems:

Joe Hellerstein  11:19 AM
I’m going to take a somewhat different tack than @AnHai Doan.

I would never discourage work that is detached from current industrial use; I think it’s not constructive to suggest that you need customers to start down a line of thinking. Sounds like a broadside against pure curiosity-driven research, and I LOVE the idea of pure curiosity-driven research. In fact, for really promising young thinkers, this seems like THE BEST reason to go into research rather than industry or startups. The best reward is the joy of the idea.

But I think I share some of Anhai’s concern about improving the odds of impact and depth in our community.

What I tend to find often leads to less-than-inspiring work is variant n+1 on a hot topic for large n. What Stonebraker calls “polishing a round ball”. The narrow gauge for creativity in a busy area makes it really hard to find either inspiring insights or significant impact on practice; but at the same time the threshold for publication is often low because social factors in reviewing favor hot topics (competition, familiarity, and yes — commercial relevance of the topic, which can lead to boring research too!) That’s something we can try to address constructively.

Now I am guilty of going deep and narrow sometimes myself, e.g. in distributed transactions and consistency in the last many years. But it’s been rewarding and fun, and I like to think we had a new lens on things that let us hit paydirt a few times. Certainly outside my group, work like Natacha Crooks’ beautiful paper on client-centric isolation in PODC 17 demonstrates there is still room for major breakthroughs there. So some topics do merit depth and continued chipping away for gold.

Bottom line, my primary advice to folks is to do research that inspires you.

Joe Hellerstein  11:32 AM

To align with @AnHai Doan a bit more, if you are searching for relevance, you don’t need to have a friend who is an executive at a corporation. Find 30-40 professionals on LinkedIn who might use software like you’re considering, and interview them to find out how they spend their time. Don’t ask them “do you think my idea is cool” (because they’ll almost always say yes to be nice). Ask them what they do all day, what bugs them. I learned this from Jeff Heer and Sean Kandel, who did this prior to our Wrangler research, that eventually led to Trifacta. It’s a very repeatable model that simply requires different legwork than we usually do in our community. http://vis.stanford.edu/papers/enterprise-analysis-interviews

AnHai Doan  12:51 PM

As for where to find the customers, I really like your suggestion. Another thing I may add is that one can go talk to the domain scientists in the SAME university. Many of them now have tons of data and are struggling to process them. These domain scientists are often sitting just ten minutes from one’s office, and they are dying for any help. Talk with them to really understand the kind of data problems they have. Often at the start those are very mundane basic problems, such as querying a big amount of data. But if one can help them solve those basic problems, then often many more interesting problems come up.

AnHai Doan  12:56

Yet another thing to do is go talk to companies in the SAME TOWN. They often are downed in data too and would love to get some help. One can very quickly get to know the kinds of problems they have. This has worked at least for me. My group started out working on data problems with several domain science groups at UW-Madison. We developed solutions that were used by them and in turn they gave us feedback. Then we took those solutions to local companies (insurance, health, heating/cooling, three companies), and they helped us improve the solution. Then we got funding to do a startup. This is perhaps also a possible roadmap.

From the New Researchers Symposium, on a question asking how to fix “the despicable but common toxicity of the database community in the tone of their reviews and often during in-person questions/discussions?” and how to write a good review:

Joe Hellerstein  7 hours ago
I sense there are stories here, and I’m sorry to hear this. In my experience, the face-to-face interactions in our community have gotten more professional over the years. I’m very sorry if the questioner has witnessed bad public behavior.

But in my experience the reviews have actually gotten worse in recent years. I believe we need a process change. The root of the problem is that reviewers are generally not held accountable for what they write.

One thing I learned in my startup is that team members who voice concerns are not very valuable — startups are risky by nature, and concern-mongering just contributes to negativity. However, team members who voice concerns and propose solutions are gold. We should ask the same of reviewers.

Anecdote: Doug Terry often signs his reviews. I went through an exercise last CIDR where I decided to try that, and I found instantly that I became much more helpful and constructive in my reviewing, even for papers that I did not recommend for acceptance. It didn’t feel OK just to criticize; I felt more responsible to suggest and encourage changes.

I’ve heard arguments why this isn’t a reasonable global solution—e.g. junior researchers could face retribution for signing negative reviews. But we might consider other mechanisms for ensuring that the reviewers are (a) held to account and (b) required to be constructive.

My First Car-Buying Experience in the US

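Working in Austin without a car simply doesn’t work. So, having no choice, I bought a 2020 Corolla. Below are my car-buying experience and some takeaways. I hope they help the next time I buy a car.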

Overview

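My purchase had two phases: looking and buying. In the looking phase I test-drove Nissan’s Altima, Versa hatchback, and Sentra SV; looked at Honda’s Civic (a side note: I went to First Texas Honda and they told me no test drive without insurance, so I pretty much stopped considering Honda); and test-drove Toyota’s Camry and Corolla. From an earlier rental I had a good impression of the Chevrolet Malibu: it drives solidly and doesn’t feel floaty above 70 mph, but it accelerates slowly, which is especially noticeable when getting onto the highway from a ramp (you have to floor it). On top of that, the Chevrolet dealership is fairly far from where I live, so I dropped the idea. In the end I narrowed it down to Nissan and Toyota. On the test drive the Sentra’s interior was bare-bones and the steering was heavy, so that was out. Then, considering resale value and brand reliability, I settled on Toyota. And with a budget under 20K, I ended up choosing the Corolla, one of the so-called “four legendary cars” of North America.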

For the buying phase, my main reference was

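It mostly comes down to asking for quotes by email and then using one dealer’s email to compete against another dealer. What I did was go to each dealer’s website, open the web chat, and ask them for the OTD (out-the-door) price of a 2020 Corolla. My main reference was Chapter VI of the Reddit link above. During the email haggling, some dealers ask for the other dealer’s email as proof that a dealer really offered me that price. Some dealers nitpick and insist on seeing a purchasing order issued by a sales consultant rather than a sales manager. I think these are all tactics; my bottom line was not to show up in person and enter the buying stage until I saw an OTD I was happy with. Still, I got played by Charles Maund Toyota. What we discussed was 19K OTD, payable in cash, but once I got there they told me it had to be financed. In the end I chose Round Rock Toyota, mainly because the people there were straightforward: we agreed on the OTD, and the next day they issued a signed purchasing order.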

Lessons learned

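  • A standard Toyota dealer tactic is that after your test drive they immediately start having you enter your personal information and rush you to buy. But note that the privacy agreement covering your personal information is only signed after you buy the car. In other words, if you hand over your personal information but don’t buy, the dealer may well turn around and sell it. I had this worry during this purchase. So after the test drive, be sure to tell the dealer firmly that you are only here to test-drive, not to buy, and then leave.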
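  • For the quote stage, be sure to register a separate email address and negotiate under your English name, to avoid exposing your personal information as much as possible. I used one mailbox per model and negotiated under different names.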
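  • I did not use the graduate rebate when I bought the car. My feeling is that dealers won’t necessarily pass rebates on, because they take some amount off the MSRP and then falsely claim that it already includes the rebate. After I bought the car I asked them whether they could give me a graduate rebate on top. Although my being a recent graduate never came up during the negotiation, they still said it had already been counted, and asked me for my diploma, wanting to pocket that rebate themselves. I brushed them off by saying my diploma had not arrived in the mail yet.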
  • Don’t rush at delivery; think through the questions you want to ask. I left without getting the maintenance questions cleared up.
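  • You need auto insurance when you buy the car. I went online, got quotes from the major insurance companies for the same fixed plan, and did an apples-to-apples comparison. There were some pleasant surprises. Do the homework in advance; at the dealership they will give you contacts for buying insurance, and you can just decline.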

These are just my brief notes and surely have gaps; I’ll add more next time.

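08/04/19 Update: I financed part of the purchase with Toyota. One reason I did not pick Charles Maund Toyota is that they required the entire OTD to be financed, though they said I could pay it all off right after financing. At the time I had doubts about paying off immediately: I didn’t know whether there was some closing cost, i.e. an extra cost if the loan is paid off in full, principal and interest, before it has been held for a certain number of days. I bought the car on June 20-something and paid off the loan on July 2, and it turned out there was none of the extra cost I had worried about. That is, as soon as Toyota Financial has created the account, you can pay off the whole loan immediately at no extra cost. Note, however, that creating the account takes a few business days (about 5), and to pay off online you need the account number to create an online account on Toyota’s website. The account number is usually printed on the monthly bill, but waiting for the bill to arrive takes several more days. So to get the account number quickly, call Toyota directly and give them the VIN, and they will tell you the account number over the phone.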

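Also, I went to get my license plates on 08/03/19. I got a call that Tuesday saying the plates had arrived; I went straight to the dealership, and they handed me the plates and installed them for free.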

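08/22/19 Update: The car’s title has arrived, and with that the car purchase is wrapped up. Because Texas participates in Electronic Lien & Title (ELT), Toyota removed itself as lienholder electronically. So after receiving the title I don’t need to go to the DMV to change the owner.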

My critical mistakes in Academia and reflections

This post is a summary of and reflection on the critical mistakes I have made throughout my post-secondary academic career. It is a gift for my child (if there is one), and it might be helpful for others.

Diverse interests without focus

I finished my undergraduate degree with three majors: economics, computer science, and mathematics. People are often wowed by this. However, the more I focus on one field, the more I feel that three majors are so diversified that they have no focus. Even though I have a major in computer science, I didn’t take courses in computer architecture, operating systems, networks, or compilers, which are essential for a computer science major. I have had to catch up on the missing material the hard way: by reading the classics. This takes a long time, and I’m still working through them. If my end goal was to become a computer scientist, there was certainly no need to obtain a major in economics, and I should have focused more on the branches of mathematics related to topology, combinatorics, and logic. Taking unnecessary extra courses may not be the only waste. Lacking background incurs extra cost during PhD and job applications. Even though I managed to secure a position in industry, I still need a lot of work to catch up.

Not knowing the end goal

Even though I have foreshadowed this point in the previous section, I want to emphasize how important it is to know the end goal. Ideally, people should discover their interests in high school. However, starting the exploration in college is not too late. But the exploration should end after freshman year so that there is enough time to specialize and concentrate on something. Knowing the end goal cannot happen immediately, but it should at least happen by junior year. During college, I hopped around three majors with no common theme at all. What do I want to do in the future? I avoided answering this question by taking the majors that seemed to offer the greatest flexibility. However, the cost of doing so is a lack of depth. In addition, I didn’t know what I wanted from a computer science career: research or software engineering? That led to one huge mistake, detailed in the “Failure in seizing ‘the’ opportunity” section. Knowing the end goal is very, very important, and the book “The 7 Habits of Highly Effective People” should help.

Not engaging in research with a long-term vision early

People often emphasize how important it is to get involved with research in college, mainly because research is a critical component of higher education. I certainly did, but my mistake is that I got involved in research in an ad-hoc way: I did research in math, in psychology, and in statistics without a common theme connecting them. In math, I did research in probability; in psychology, in early childhood education; in statistics, in fMRI. Those research experiences were helpful only in the sense that they helped me discover what I don’t like. I always admire people who can discover their interests early: there are lots of options, so how can one settle on one without trying the others first? That’s my unresolved question. Technically speaking, I don’t think this section should count as a mistake, but it certainly caused lots of detours in my short-lived academic career.

Failure in seizing “the” opportunity

I started to compose this post on a spring break trip in Alaska. I ran into a group of people from my undergraduate institution, the University of Wisconsin-Madison, and had a brief chat with them. One question I asked one of them, who happened to be a CS major, was: has Wisconsin started to set a bar for people who want to declare a CS major? “No! Everyone can do it! That’s the amazing part of Wisconsin: the university gives everyone opportunities to try!” she answered. “I know a friend who transferred to Wisconsin from University of Washington to study CS because he cannot study CS at UW. Students in UW can study CS only when they are admitted to CS directly from high school.” Her reply didn’t surprise me: that’s the same impression I have of Wisconsin. However, her answer stirred a huge pain in my heart. I suddenly had the guts to admit a huge mistake I made during my first year of study at UT-Austin.

I was unsure what to do with the summer: go to a research lab to prepare my PhD application, or find an internship in industry. As you can see, this is the mistake of not knowing my end goal: I wasn’t sure whether I wanted a career in research or in software engineering. I contacted one of my former professors at Wisconsin, and he was kind enough to offer me a position in his lab over the summer. He is a famous researcher and people are dying to work with him. But, as you have probably guessed, I blew the chance and took a software engineering internship over the summer. Of course, the professor was unhappy, but he was kind enough not to say so explicitly. The following fall, I applied for PhD programs and asked him for a letter. Not surprisingly, I got rejected by all the programs, including the one at the professor’s school. After learning the admission results, I kept lying to myself about all the drawbacks of attending a PhD program, and I constantly debated in my heart whether I had made a good decision about the summer. After talking with the girl from my school during the trip, I suddenly realized how upset I am and how I have kept avoiding the fact that I made a huge mistake and blew “the” opportunity. I couldn’t help imagining that if everything had worked out over the summer, I might already have been admitted to his lab and had the privilege of studying for a PhD. Of course, in real life, there is no “if”. Failing to seize “the” opportunity can be seen as a pivot point in my life. A person’s life might be settled by a few pivotal decisions. I think I made a mistake in one of them.

The only takeaway: Never ever give up your interest

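I wrote the following to my parents (originally in Chinese):

If I have a child, I will teach them not to give up on a dream lightly because of money or circumstances, because giving up on a dream feels truly awful. Even if you end up with no money, at least you know you worked for your dream, and that sense of peace cannot be bought with money.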

Basically, it says that nothing is worth more than one’s dream. After getting rejected by all PhD programs, I know that taking an industry internship over the summer signified giving up my interest in becoming a researcher for money. I wasn’t upset at the very beginning, but the more I think about it, the more I think I should stick with my interests no matter how poor or how old I am. Now I’m in a situation where I have to hang onto something that is not my interest: money, in this case. I’m not sure whether I will eventually find a way back to my dream, but I know it’s going to be a long and hard road.

2017 End-year Recap

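It’s less than 2 hours until I have to get up and go to the airport. I’ve been tossing and turning, so I got up and started writing the second-to-last blog post of the year. If I can finish that book on the plane, there will still be a book review.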

First, let me post the 2016 review. The format has to stay consistent, after all.

Looking back at 2017

Let me start with a progress report, going over the goals from the 2016 outlook:

  • At least 100 blog posts!

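By a rough count, I have written 62 blog posts so far in 2017, of which 44 technical ones were posted on my personal homepage. I didn’t hit the target, but I’m fairly happy with the number. Early in the year I basically realized that 100 posts a year is unrealistic, and with the qualifier “quality” in front it becomes even less possible. (“Quality” here just means by my own standards.) Before September I had plenty of time for technical blogging thanks to my work situation, so I could read properly and then write. But by September the posts were starting to look like filler, so I decisively stopped updating the technical blog. I hope to make up a few posts during the winter break back in China. On the WordPress blog I kept up at least one post a month this year, and overall quality was acceptable; only November was filler, for which this half-baked writer offers a self-criticism. The root cause of the filler is lack of time. Since I started my master’s, the course load has left me no time to let things settle. Every day I sit with my mouth open while the professors stuff all kinds of new things in. I now feel a bit of indigestion and hope to let it settle over the winter break.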

  • Get body fat below 15% and weight down to 70 kg

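Looking at this one brings tears to my eyes. Back in China I had it under control to the point that victory was in sight, but since coming abroad it has been a disaster. My best record was 72.6 kg at 12% body fat. Let me confess the sins of extravagance of these few months abroad. First, not getting a scale was the biggest problem. Sure enough, without the direct stimulus of numbers it is hard to evaluate the result of each workout. Second is eating. At the beginning I still held back a bit, but later I completely let go. From late October an important guest joined my household for meals, and after that it really became eat whatever we want. A meal not only had 2 or 3 vegetable dishes; most of the time the meat included both white and red meat. When I steam rice, 3 handfuls is about right for me, but since the guest is so distinguished I scoop 4 or 5. Rice is a wonderful thing: a big bag is just 19.99 on Amazon, and it is very filling. Every time the two of us sit in our chairs watching each other pat our bellies, a feeling mixing happiness and security arises. My cooking has now built up a bit of a reputation; at least in front of that guest who shall not be named, my food counts as plentiful and “restaurant level with a bit more salt”. Now my roommate, that guest, and I have agreed to lose weight together, and my roommate even got a scale for it. I hope it works out.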

  • Read at the pace of this person

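Another failure; I fell completely short. If I split the year at August 5, the day I went abroad, each half had its own problems. Before going abroad I read too meticulously: with algorithm books I did every exercise I came across, which was really too fine-grained. I did read some literature, but not often enough. After going abroad my reading efficiency improved markedly, mainly thanks to reading selectively and skipping around. Many thanks here to Prof. Dana Ballard, who taught Machine Learning, and to the instructors of my other courses: self-study became my main way of learning. The frantic project schedule forced this perfectionist to evolve into a “good enough” pragmatist. I go straight to the most relevant chapters of a book and fill in all the background later; if something is unclear but doesn’t block reading, I mark it and set it aside for later. Having realized that a book can be read more than once, I am much less greedy on the first pass and don’t insist on understanding every point. Yes, the book in my mind as I write this is PRML. But a book can’t count as read until most of its chapters are finished, so after August the main problem was lack of time. Given that I hope to do a PhD over the next few years, the situation probably won’t improve much.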

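  • Write a book review for every book I read!

I did well on this one. I truly finished only a few books, all while I was still working, so every book I finished got a book review.

  • Get admitted by some school!

This wish came true. Thank the Lord. I came to UT-Austin!

Measured against the 2016 outlook, I truly accomplished only the last two of the five goals, a 40% completion rate, which is mediocre. Looking at 2017 as a whole, though, I am fairly satisfied. I adapted from working professional to student, and although the first semester of graduate school was tough, I am very glad I made it through. I hope to keep it up in the new year.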

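2018 Outlook

  • Take root downward, bear fruit upward

This is actually the theme my church will fellowship on in 2018. For me it means hoping to know God better, draw closer to God, and rely on God. The spiritual elders in the church say that the man is the head. In my first semester in 2017 I mostly attended fellowship and Sunday service; I never once went to a prayer meeting, which I am ashamed of. The important guest has more than ten years of experience in this respect; surpassing her will not be easy, but I still have to try. Concretely, in 2018 I hope to become strong inside. Sometimes I deeply admire this guest. She always seems strong inside, especially here in the United States and when traveling. I should learn from her. I think the key to this lies with the Lord. Perhaps the Lord let me meet this guest precisely to remove the weakness in my heart? I very much believe so.

  • Find an internship or a summer research position

This is a perennial question. My research direction was a topic that ran through all of 2017. Frankly, after the first semester I still have not found my true research direction. NLP has become the top candidate within AI, but I would like to explore System more before making a final decision. At least that is what I think now; until my course schedule is finalized, nothing is certain. Once the direction is settled, most of what I would do for summer research is settled too, and what remains is finding an advisor. An internship is the other option, mainly to give me more motivation to grind through practice problems. A semester of “political study” has also built up plenty of motivation.

  • Have a school to attend

At the end of 2018 I will be applying to schools yet again. I hope God keeps watching over me this time too.

All of this needs the Lord’s keeping!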

In relationships: a first taste

It’s October 30th today. I have only one more day left to compose a post for October. Blogging can be very hard during the semester because there are endless tasks you need to finish on time with certain expected results. Even though I have given up watching videos, playing video games, and (almost) writing technical posts this semester, I still want to write something here to keep the streak going: I have written at least one post per month for the past two years. So, here it is.

Many things happened in October and, surprisingly, they are all about relationships: I was baptized and became a Christian, which marks a new relationship with God; and I started seeing a woman, which is a relationship in the ordinary sense. One thing I was always curious about before I was in these relationships is how life changes once you are in one. Most of my knowledge on this matter came from the media and the people I observed. About a relationship with God, I knew barely anything. I had not actively thought about it since I graduated from college, and I would not even have considered becoming a Christian before coming to Austin. A relationship with a woman, on the other hand, is something I have thought about quite actively, especially when I was a high school student. I always wanted to know what being with someone feels like. However, quite surprisingly, if you ask me now how life changed after being with God and being with a woman, I would say: the former changed it significantly, but the latter has not changed it much.

Being with God

Being with God is a huge decision for me. I went to a church back in Madison for two years, but I could barely feel anything inside. I treated going to church on Sunday morning as a way to sing some songs and take a break from studying. However, after I arrived in Austin, and thanks to some incidents, the picture of God became clear to me. I started to feel that the life journey I have been through was perfectly designed for me. Attending Madison as an undergraduate made me mentally resilient to setbacks, and going back to China for work made me grow up into an adult and start to learn all the soft skills I had previously ignored: communication, love, and family. All those things prepared me to head back to the States and pursue further study. In addition, I always knew that I had sin, but I did not know how to get rid of it and start a new life. Even worse, I was constantly seduced by Satan into doing things that hurt my friends and my family. I knew I was wrong, but the pleasure that came from the wrongdoing was just too much, and it gave me the impulse to do it again the next time. Thankfully, I had the chance to know God, and I found my way out of the vicious cycle. After becoming a Christian, I am learning to see things from God’s view and trying to pass love on to others. I am learning to forgive in conflicts and to do things in honor of God. Thanks be to God, he prepared a woman for me.

Being with a woman

Surprisingly, being in a relationship doesn’t change my life that much. I simply have one more person to care about, and I need to set aside time for that person. This is no different from spending time with my parents before. She is a Christian as well, and we adhere to the same core values. All the remaining differences seem trivial to reconcile. That said, we have been dating for only about a month and are still in the calibration period: we are getting to know each other and being careful about the relationship traps that people usually fall into. With the help of God, I think I’ll be fine.

Leaving IBM

To be honest, this is probably the most difficult post I have ever written. This is mainly because there is a ton of stuff I want to say, but I’m unsure whether I should make it public or keep it to myself. Another factor that makes this post hard to write is how long I have spent drafting it. I have been drafting this post since April 2016, right after I decided to start the whole quit-IBM-and-get-a-PhD project. I used this post as a log to record things and feelings whenever something happened around me at IBM. Frankly, when I look back at what I recorded (mostly rants), a lot of it still holds, but the anger has passed with time. So the year-long drafting makes me hesitate even more, because the mood in which those things were written is gone. However, two years is a significant amount of time, quitting IBM can be called “the end of an era”, and I should give closure to my happy-and-bitter experience with IBM anyway. So, here it goes.

 

Thank you, IBM!

I’m really thankful for the opportunity to work at IBM. This experience made me grow both technically and mentally. Technically, I got hands-on experience with DB2 development. DB2 as a database engine is extremely complex. It has over 10 million lines of code, and it is way beyond the scope of any school project. Working on a project like this is quite challenging because there is no way you can get a clear understanding of every part of it. I still remember that when I attended the new-hire education on DB2, one guy said: “I have been working on the DB2 optimizer for over 10 years but I cannot claim with certainty that I know every bit of the component I own.” This really shocked me, and based on my experience so far, his claim still holds but with one subtle assumption, which I’ll talk about later. Lots of tools are developed internally, and reading through both the code and the tool chains is a great fortune for any self-motivated developer. I picked up a lot of skills along the way: C, C++, Makefile, Emacs, Perl, Shell, AIX, and many more. I really appreciate this opportunity, and I feel my knowledge of databases and operating systems has grown a lot since I graduated from college.

Mentally, there were also lots of gains. Being a fresh grad is not easy. Lots of people burn out because they are like people who are learning to swim and get thrown into the water: either swim or drown. I’m lucky that my first job was with IBM because the atmosphere is so relaxed: people expect you to learn on your own, but they (the majority of them) are also friendly enough to give you a hand when you need help. I still remember that my first customer ticket was a severity-one issue, which requires a daily update on your progress. There was a lot of pressure on me because I really had no clue about the product at the very beginning. I’m thankful to those who helped me at that time and in many difficult moments afterwards. That made me realize how important it is to be nice and stay engaged with the people around you, because no matter how good you are with the technology and the product, there is always stuff you don’t know. Staying engaged with the people around you may help you get through difficult moments like this by giving you at least a thread you can start pulling. In addition, participating in the Toastmasters club really improved my communication and leadership skills and, more importantly, I made tons of friends in the club. Without working at IBM, I probably wouldn’t even know that Toastmasters exists. If you happen to follow my posts, you’ll see that a lot went on around me while I worked at IBM. Every experience you go through offers you a great opportunity to learn and improve yourself. Some people may look at them as setbacks, but I look at them as opportunities.

toastmasters1

(The picture on the left shows all the comments people gave me about my speeches; the one on the right shows the awards I earned in the club over these two years.)

With the help of all those experiences, I have developed the good habits of writing blog posts (both technical and non-technical), reading books, and working out six days a week. None of this would be possible if I worked at a place where overtime is common. I’m very thankful to IBM for this, because staying healthy both physically and mentally is critical for one’s career. Even though those things don’t come directly from IBM, IBM does provide the environment that nurtures them.

 

IBM has its own problems, and they center on people. There are many things I want to say, but I think I’ll keep them to myself. Instead, I want to make my point with a picture:

ibm_survey

I don’t know why IBM’s term “resource action” for firing employees and the sentence “IBM recognize that our employee are our most valuable resources.” bother me so much. I probably just hate the word “resource” as a way to directly describe people, and how much this word gets thrown around at IBM. I know everyone working for a big corporation is just like a cog in a machine. However, what I feel, based on lots of things that happened around me, is that IBM, with its attitude represented by its first-line managers (because those are the people I commonly work with), makes this fact very explicit. It hurts, to be honest. No matter how hard you work and no matter how many prizes you have earned for yourself and your first-line manager, you are nothing more than a cog in a machine, one that is not worth a high price to keep around because there are many cogs behind you ready to replace you. They are much cheaper, much younger, and can more or less do your work, because your duty in the machine is so precisely specified that it doesn’t really depend on how much experience you have under your belt. To me, that’s devastating.

This leads to the problem that talented people are reluctant to stay with the company. My mentor and the people who are so good with DB2 have bid farewell to the team. That’s really sad to me because they are the true assets of the company and the product. The consequence is that crucial knowledge leaves with people. Some quirks in the product are known only to certain people, and once they leave the company, the knowledge is gone with them. That makes mastering the product even harder. That’s the subtle assumption behind what the person said during the new-hire education, and it’s also part of the problem of working with legacy code. The whole legacy code issue is worth another post, but one thing I now strongly believe is that any technical problem has its root cause in company culture and management style. As for me, I’m not a guru now, and I cannot see a way to become one in my current position, which scares me the most.

That’s it for this section and I’ll leave the rest to my journal.

Takeaway from DTCC 2017

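Because a colleague was away on a business trip, I had the good fortune to attend the 8th Database Technology Conference China (DTCC 2017), held at the Beijing International Convention Center. It was my first industry conference and I was quite excited. I took away a lot from it and want to record it in this post. I originally meant to write in English, since for computer science English is my “native language”, but because the talks were mainly in Chinese, I wrote this post in Chinese after all.

Conference goals

Although the opportunity came suddenly, I still set some goals to make the most of it (what follows is the English first draft of this post; I was too lazy to translate it back into Chinese, so please bear with it):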

Get some sense from the peers

Focusing on your own product is important. However, it’s even more important to see how your peers are doing. I’m not an architect yet, but I feel it’s helpful to begin thinking like one: to see what problems your peers are facing and how they try to solve them. In addition, by knowing how things are going with your peers, you get a measure of yourself: is the work you are doing on the same level as theirs? Are you in good shape for the job market? What skill gaps do you need to fill?

Deepen the understanding of the field

Even after almost two years working in the database field, I still think of myself as a newbie. This is mainly because a database is arguably the most complex software that people can make, and there are tons of things I don’t know. So I want to see, at a high level, what the trends in the field are and what lessons people draw from their day-to-day engineering practice. I think this may help me catch up with the masters.

AI or System?

As I disclosed in my last post, I have decided to head back to school and get a master’s degree. To be honest, my ultimate goal is to earn a PhD in Computer Science, and I’m actively preparing for it. The most important question is which field I want to study. I have two options, and I have some interest in both: AI and System. Why these two and not others is worth a whole new post, and I don’t want to discuss it here. So my task for now is to gather as much information as possible about these two fields and see which one looks more attractive to me. This event is extremely helpful because it has talks on System as well as on AI.

Day 1

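Day 1 was split into two halves. The morning had the opening and four talks. The afternoon had five parallel tracks, each with six talks on one theme, so I could not attend every talk. My strategy for the first day was breadth: I went to System talks and to AI-related talks. Below are my takeaways and comments on each session I attended:

Interpreting the conference theme (曹鹏, Vice President, JD Finance)

The theme of this conference was “Data-driven, value discovery”. This talk unpacked the theme from JD Finance’s own perspective. I remember two points:

  1. Finance has been hit by machine learning, and more and more FinTech companies have appeared in recent years. Judging from this talk, the main use of machine learning at such companies is more precise targeting and analysis of customer segments. The talk did not touch on its role in quantitative trading strategies. I have been following machine learning applications in finance lately, but I did not find the answer I was looking for here: in my view, precise customer targeting is a generic machine learning application and not something unique to finance.
  2. A data company looks like a good startup idea to me. The talk stressed how important data is to JD Finance: they want both breadth and depth of data. One key issue is that data is highly time-sensitive and shifts between hot and cold: a customer’s purchase history from a year ago offers little guidance today. JD Finance therefore collects a large amount of data every day (~6 TB) to keep its analysis accurate. Even so, the speaker said they feel the data is still far from enough. This would explain why IBM recently acquired The Weather Company and the medical imaging company Merge Healthcare: it simply wanted those companies’ data. It made me wonder whether being a data vendor would be a good startup idea.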

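Overview of database development (吴承杨, Oracle)

This talk had few highlights overall, but there was still some important information:

  1. Even after years of calls to “de-IOE” (move away from IBM, Oracle, and EMC), Oracle’s market share is still as high as 56%.
  2. The future of databases is the cloud: the speaker used a case to explain the importance of hybrid cloud. Enterprises now face the problem of how to connect data in the public cloud with data on on-premises servers effectively, and how to turn public cloud into private cloud. The whole talk felt more like an Oracle solutions briefing with very little technical content, but it did point out where databases are heading: to the cloud.
  3. The speaker had good stage presence and was a good presenter.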

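The next stop for data technology: data applications (王桐, Yonghong Tech (永洪科技))

This talk suggested that Yonghong Tech’s core business and technical strength may not be that deep. My impression was that Yonghong builds applications on top of databases and not database systems themselves. From the talk I learned that Yonghong has integrated traditional databases and big data systems into one platform and developed services on it for different industries. This feels very similar to IBM’s own Bluemix, minus the Watson family. In my view, integrating software systems is much easier than building the systems themselves. The talk focused on the various data application services Yonghong offers. I looked it up: the company is a startup founded in 2012, and I think getting this far has not been easy.

The talk did have highlights. One was a chart of people and positions: the pending (dependency) relationships between workflows are drawn as a network graph in which each node is a position. With this view we can see clearly which position is most critical and how that person’s absence or skill level would affect the whole business. The other highlight was a resource allocation chart showing metrics such as meeting room usage and utilization. Anyone who has worked at IBM will know this well: countless meeting rooms are booked but never fully used. I imagine this kind of resource view would help a lot in places like ours where meeting rooms are scarce.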
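One way to read “most critical” from such a graph is to count how many positions end up waiting, directly or indirectly, on each node. The sketch below assumes that reading; the positions, the edges, and the function name `blocked_if_absent` are made up for illustration and do not come from the talk.

```python
from collections import defaultdict, deque

# Hypothetical data: (upstream, downstream) means the downstream position
# has to wait for the upstream position before it can proceed.
edges = [
    ("sales", "legal"),
    ("legal", "finance"),
    ("legal", "delivery"),
    ("finance", "delivery"),
    ("hr", "finance"),
]


def blocked_if_absent(edges):
    """For each position, count the positions that transitively wait on it."""
    downstream = defaultdict(set)
    nodes = set()
    for up, down in edges:
        downstream[up].add(down)
        nodes.update((up, down))

    counts = {}
    for start in nodes:
        # Breadth-first search over everything reachable from `start`.
        seen = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in downstream[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        seen.discard(start)
        counts[start] = len(seen)
    return counts


counts = blocked_if_absent(edges)
most_critical = max(counts, key=counts.get)
print(most_critical, counts[most_critical])
```

A real tool would probably weight edges by how long each wait lasts; a plain count is only the simplest proxy.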

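How Dameng is breaking into core business systems: the product journey of a domestic database (韩朱忠, Dameng (达梦数据))

I think this was probably the most inspiring talk of the day. It told the story of how a small domestic vendor fought against foreign databases, won market share bit by bit, and grew into what it is today. One example was about improving the SQL optimization capability of their database, which is written in C. They once ran into a single SQL statement 3.9K lines long; pasted into a Word document it would fill more than 350 pages. It contained 17 inner joins, 557 subqueries, 831 OR filters, more than 1,000 selected columns, and 2,731 CASE WHEN expressions. Through continuous optimization they brought its running time down from several hundred minutes to under 1 second. Another story was about how hard it is for a domestic database to survive. Large enterprises and core industries such as banking and telecom all use foreign databases, and domestic products cannot get in at all; they can only compete in the small and medium enterprise market. Yet through persistent effort this vendor finally won the State Grid contract, as well as deals with Tibet (西藏) and China Eastern Airlines. To me this is a remarkable achievement. It made me reflect on IBM. I do not think our DB2 could handle a SQL statement this complex without targeted optimization. It also made me feel that either we are winning customers on our reputation and past accumulation, or our DB2 presales colleagues try super hard when doing POCs. I clearly felt the gap in effort between us and these domestic databases. Perhaps one day our positions will be swapped? I believe that is something IBM’s senior management does not want to see. We really do need to work harder.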

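Applying and extending the SSD IO Determination feature in database workload optimization (阳学仕, 宝存科技)

This talk started from storage and discussed how to use software to emulate hardware in order to speed up reads and writes. In other words, the question it left me with is how a database can use IO determination to improve read and write speed. As far as I understand it, IO determination means letting the applications that share a drive coexist more peacefully, and giving each application the resources it needs by raising application priority, setting upper and lower limits on IO resources, and optimizing the ordering of reads and writes over time. The talk also touched on how SSDs are keeping up with network development: with better hardware, the gap between a local write and a remote write over the network is now only a dozen or so microseconds. To me these topics belong to the OS domain. The idea of hardware boosting the DB felt fresh to me.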

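Thoughts on a future-oriented database architecture (张瑞, Alibaba)

This talk mainly introduced the architecture of AliSQL at Alibaba and reflections on database architecture in light of Alibaba’s business characteristics. Two points I want to mention:

  1. Chinese vendors and IBM treat databases in fundamentally different ways. Vendors such as Alibaba, Tencent, and Baidu develop and modify their own databases starting from their own business pain points, so their modifications and improvements are highly targeted, and their architectures may not be very general. IBM, in contrast, sells the database as a product, so its design aims to be comprehensive and to meet the needs of as many kinds of users as possible. As a result, in some scenarios IBM’s DB2 cannot be as strong as AliSQL, OceanDB, or TDB. So at very large companies, database work may ultimately head toward “custom-made”.
  2. Machine learning and systems are becoming more tightly integrated. The speaker said they want to move from automated operations to intelligent operations in the future: SQL would no longer be inspected manually by DBAs but optimized through some form of ML. The Alibaba folks have not worked out the details yet, but they believe this is the future direction.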

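Afternoon summary

In the afternoon I attended “Baidu’s NewSQL database system”, “Tencent MySQL kernel optimization explained”, “Big data applications at Didi”, and “Applying natural language technology in the Wenzhi trend analysis product”. My biggest takeaway from Baidu is that distributed transactional databases are very hot right now, and if you study them thoroughly there is no problem in China you cannot get through. Another takeaway is not to worship Google’s systems too much. I did not follow all the details, but from the speaker’s words I sensed the attitude that it does not matter whether a cat is black or white as long as it catches mice. Sometimes you should not be too academic. And even when two systems share exactly the same ideas, different implementations can lead to huge performance differences.

The Tencent talk was very technical, and with a speaker from an engineering background the whole session was a grind. My impression is that kernel optimization is a deep pit that demands very solid DB knowledge. For the last two sessions I chose machine learning related talks, and I have to say they fell short of my hopes. Didi presented scenarios where they apply mathematical models. I got the feeling the speaker had not been at Didi long and did not really explain the models. The application scenarios, however, showed me that economists have a place here too: for example, how to use peak-hour surge pricing to balance supply and demand between drivers and riders, and how to charge for the losses that behaviors such as cancelled orders cause the platform. Perhaps because public resentment runs high, the whole Didi talk felt like a press conference. The final talk on natural language technology was very boring. The speaker came from product management and mainly described how Tencent applies NLP to news. It was very general and did not mention any technical difficulties in NLP. Very disappointing.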

Day 2

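Overall I found Day 2 less exciting than Day 1. I think the main reason is that the share of content about direction and industry strategy went down while the share of specific technical content went up. From two days of observation, I have to say that the MySQL and Oracle families still dominate the database field in China, mainly because of the booming internet industry. Here are my observations and reflections from the day:

  • Informix is now tightly tied to the Internet of Things (IoT)

At IBM my neighbors are the Informix Technical Support team. Their lead once gave a talk on Informix applications in IoT. To me this looks like finding a new growth point for Informix, a former giant, so that it can be reborn. Today’s talk titled “SinoDB: a database platform for the era of the Internet of Everything” confirmed this. SinoDB can be understood as a fork of Informix, because the company licensed the Informix source code from IBM. I have to say IBM became the target of complaints here: this company, founded by veteran Informix employees, feels that IBM has not treated its stepchild Informix well, and that it is time to take their “child” back and let it thrive. It made me wonder what IBM acquired Informix for in the first place. I asked the colleague attending with me, and whether the Informix code has been organically merged with DB2’s is still unknown. It also helped me understand why so many MySQL forks appeared after Oracle acquired MySQL: it is not their own child, after all.

  • The many facets of a problem and the importance of domain knowledge

In the afternoon I stayed in the machine learning track. The talk I found most interesting was Lianjia’s “Applying machine learning to home valuation”. You can guess most of the content from the title. One important message was that machine learning is not centered on algorithms; it is built on data that has been processed with the support of domain knowledge. Lianjia’s problem is that their data is on the order of hundreds of thousands of records, far less than the hundreds of millions available in some image or text processing tasks. In addition, their data mixes categorical and continuous variables, the continuous variables differ by orders of magnitude, and dirty data is unavoidable. All of this largely dictates feature engineering based on domain knowledge and a choice of algorithm suited to the data. Looking back, it is not hard to see why, from my undergraduate statistics courses to Prof. Andrew Ng’s ML course that I am watching now, the first step with any data is plotting: it lets you combine your domain knowledge with a look at the data’s characteristics before preprocessing. One more remark: in my view, from yesterday’s Didi talk to today’s Lianjia talk, the problems they deal with are essentially economics problems. Unlike econometrics, the machine learning approach is more brute-force: analyzing data is just analyzing data, without first classifying the problem as an economic one and then following the textbook routine of building a model and validating it with data. I do not want to say, and am not qualified to say, which approach is better. My point is that the same problem looks completely different when tackled from different angles. Perhaps looking at one problem from different positions can strike brighter sparks?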
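To make the “mixed categorical and continuous variables with different magnitudes” point concrete, here is a minimal preprocessing sketch using only the Python standard library. The listings, field names, and helper functions are hypothetical and chosen for illustration; nothing here reflects Lianjia’s actual pipeline.

```python
from statistics import mean, pstdev

# Hypothetical listings: one categorical field and two continuous fields
# whose raw magnitudes differ (tens of square meters vs. hundreds of meters).
listings = [
    {"district": "Haidian", "area_m2": 60.0, "subway_distance_m": 300.0},
    {"district": "Chaoyang", "area_m2": 90.0, "subway_distance_m": 1500.0},
    {"district": "Haidian", "area_m2": 120.0, "subway_distance_m": 800.0},
]


def one_hot(values):
    """Turn categories into 0/1 indicator columns with a stable column order."""
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, rows


def standardize(values):
    """Rescale to zero mean and unit variance so magnitudes become comparable."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


categories, district_cols = one_hot([r["district"] for r in listings])
area = standardize([r["area_m2"] for r in listings])
distance = standardize([r["subway_distance_m"] for r in listings])

# One feature row per listing: indicator columns followed by scaled numbers.
features = [d + [a, s] for d, a, s in zip(district_cols, area, distance)]
for row in features:
    print(row)
```

Which columns to build in the first place (for example distance to the subway) is exactly where domain knowledge enters; the scaling step only keeps one large-valued column from dominating the rest.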

Day 3

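The last day was all tracks. After two days I had a rough sense of the System and ML directions, so on Day 3 I focused on other areas such as blockchain. The talk I thought was best was “Commercial applications combining blockchain and big data technology”. Blockchain is an emerging technology, and because the ledger itself is public, you can think of it as a huge database that supports only insert and select. Mining the data in this database and the optimizations that can be made for it have become the focus of the blockchain community. According to the talk, the ledger is already as large as 3,400G. I also learned that distributed ledgers have very broad applications. For example, the Red Cross could use blockchain to make all donation records fully transparent and public. As an aside, every project today needs different kinds of talent, and both System and AI people have room to show what they can do.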
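The “database that supports only insert and select” picture can be sketched as a toy append-only ledger in which every entry stores the hash of the previous one. This is an illustration under simplifying assumptions: the class `Ledger` and its methods are hypothetical names, and real blockchains add consensus, signatures, and much more.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(prev_hash, record):
    """Hash an entry together with the hash of the entry before it."""
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class Ledger:
    """Toy append-only store: entries can be inserted and selected only."""

    def __init__(self):
        self._blocks = []

    def insert(self, record):
        prev_hash = self._blocks[-1]["hash"] if self._blocks else GENESIS
        block = {"prev": prev_hash, "record": record,
                 "hash": _digest(prev_hash, record)}
        self._blocks.append(block)
        return block["hash"]

    def select(self, predicate):
        return [b["record"] for b in self._blocks if predicate(b["record"])]

    def verify(self):
        """Recompute the chain; any edit to an earlier entry breaks it."""
        prev_hash = GENESIS
        for b in self._blocks:
            if b["prev"] != prev_hash or b["hash"] != _digest(prev_hash, b["record"]):
                return False
            prev_hash = b["hash"]
        return True


# Usage in the spirit of the donation example; donors and amounts are made up.
ledger = Ledger()
ledger.insert({"donor": "A", "amount": 100})
ledger.insert({"donor": "B", "amount": 250})
large = ledger.select(lambda r: r["amount"] >= 200)
print(large, ledger.verify())
```

Because each hash depends on the one before it, quietly changing an old donation record makes `verify()` fail, which is the transparency property the donation example relies on.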

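Summary

Attending a conference really is a pleasant experience. A technical lightweight like me can learn where each field is heading and find a direction to work toward and a future position. The colleague who came with me said the conference strengthened their conviction in the direction they had originally chosen. If you have rich engineering experience, you can also draw on your peers’ lessons at a conference like this and learn from their strengths to offset your own weaknesses. The plentiful networking opportunities are another part of the value.

The moment I walked out of the conference, the sky looked so blue.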

Decision

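I suddenly realized it is already the last day of April, and I have not written a single post on WordPress this month. So I am rushing out April’s first post.

The big event in April was that admission results came out one after another. Yes, I am preparing to go to the United States again to study computer science. This has been in the works for a long time: counting from when I took the GRE last May and June, then exam preparation, applications, and finally results, it has been almost a year. This post is not meant to answer why I want to study abroad again or what happened during the application process; I plan to answer those in my August post. Today I just want to talk about choosing among admission offers.

Choosing a graduate offer is somewhat different from choosing an undergraduate one. I still vaguely remember choosing between Wisconsin and Illinois for undergrad. Back then I mainly looked at overall rankings, supplemented by program rankings. In 2010 Wisconsin still ranked 35th in US News. I also saw that Wisconsin ranked around the top 10 in North America in both natural sciences and social sciences. Finally, Wisconsin’s tuition was considerably lower than Illinois’s. So I chose Wisconsin. Looking back, I do not regret it at all, because it forged my perseverance. The academic pressure and the long, sunless winters are a serious challenge for anyone, mentally and physically. Later I read somewhere, I forget where, a saying called “badger holds”: people from Wisconsin can hold the room wherever they go. I think there is some truth to it.

Graduate school selection calls for a different strategy. Generally, program ranking outweighs overall ranking; in other words, at the graduate level the ranking of your field matters more than the overall ranking. In computer science, however, there are the big four: MIT, Stanford, UC-Berkeley, and CMU. These four schools basically deliver on both overall and program rankings, so if one of them admits you, you can basically accept without thinking. I say “basically” because there are exceptions, which I will get to later. Now for my reflections on choosing a school.

First, the results. The programs that admitted me this time are: CMU-SV MS-SE, Brown MS-CS, NYU MS-CS, GaTech MS-CS, UT-Austin MS-CS, UCSD MS-CS, CornellTech MMeng-CS, and Columbia MS-CS. In the end I chose UT-Austin MS-CS.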

Know your goals:  employment or research?

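This is about the purpose of a master’s degree: do you want to find a job right after graduation, or prepare for a future PhD? This fundamentally determines the direction of your choice; very few programs serve both. Take finding a job first. If that is your goal, the school’s placement statistics and location should weigh relatively heavily in the decision, while research strength, curriculum, and advisors are not the top priority. By this standard, CMU-SV MS-SE, Brown MS-CS, GaTech MS-CS, UCSD MS-CS, Columbia MS-CS, NYU MS-CS, and CornellTech MMeng-CS are all very good choices. For the research path, what we look at is the professors and their research areas. I am not saying you cannot have both; my point is that every program has a different emphasis.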

Big Department or Small Department?

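Department size is another important factor. Brown was the first to give me an offer and tempted me for a long time. The reason is that in a small, focused department like Brown’s, each professor can give each student the most attention, and the department feels like one big family. In other words, resources per person are higher than in big departments. But nothing is perfect. The downside of a small department is that course offerings are less rich and research is concentrated in certain areas: not every area will have a professor. A big department is basically the negation of a small one. You may face fiercer competition; resources are limited, and everyone has to try their best to fight for them. Nobody will babysit you, so students are expected to be more independent. But isn’t that what the real world is like?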

Do a campus visit if you can!

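Program ranking, department size, and your interests are only part of life. The most important thing is to choose a place where you feel comfortable; after all, you will study and live there for the next one to two years. This is where a campus visit becomes very important: go and feel in person whether the place really suits you. Going somewhere that does not suit you is very painful. I had been to NYC before and know how hard studying in NYC is, so I know that, if possible, a college town suits me better. My experience at Wisconsin also taught me that weather is a factor not to be ignored. None of this is something rankings and other numbers can tell you. One more thing: anyone who has studied abroad knows how important libraries are. Wisconsin does extremely well here. Why? Because the university has 43 libraries, large and small. When I was a student my favorites were the Memorial, Law, Music, Astronomy, and Econ libraries. Every department’s building has a library, and if I wanted I could pick a library to match my mood. GaTech, by contrast, has only one library on the whole campus, which is very sad for someone who loves spending time in libraries. All these factors greatly affect how happy you will be over two years.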

Carefully research the program!

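Researching a program in great detail at application time is very hard. Why? Your time is limited, and if you dig that deep (which professor does not prepare for class, which professor’s course is hard, and so on) and are not admitted in the end, the effort is wasted. So many people decide whether to apply based on a few metrics and a rough program description. Once you have the admission results, however, you cannot be so casual. When making my decision, I was finally choosing between GaTech and UT-Austin. I went through every professor at each school whose work touched my interests, and looked through both schools’ courses and each course’s homepage. I found that at GaTech, because of the online master’s program, a considerable share of courses are taught by video with the instructor handling Q&A, for example Machine Learning for Trading. Anyone who has taken a MOOC knows that online courses are not as difficult as in-person ones because the audience is different. So if you want to build research ability and practice reading papers, a place like GaTech may not be the best fit.

I thought about writing this post for a long time, because before April 15 I was truly torn. Fortunately, I compared each program carefully, found the nuances between them, and made some very hard trade-offs. Before writing I felt I would have endless things to say, but once I actually put pen to paper, I found that the grounds for choosing a school are fewer than I imagined. In one sentence: knowing what you want is the most important basis for the whole process. Then again, what does that not apply to?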