19 June 2006

Having an Impact

Having just completed my thesis, I've been thinking a lot about what directions I should pursue post-graduation. While I'll probably stick with some of the things I've been working on in the past, this is also an opportunity to diversify. This raises the question: what are promising areas to work on? One important qualification (though by no means the only one) is impact: I want to work on things that are important, either in the sense that they become tools used by others for a diverse set of problems, or in the sense that they change the way people think about or look at a particular area.

Rather than applying my own model of what work has been important, one can look back at what other people think have been the most influential papers in NLP, as well as at what papers have won best paper awards recently (with the caveat that the latter is probably less indicative). The story seems to be that the most impactful papers, at least as measured by these criteria, are either MT papers, parsing papers, or machine learning papers. This makes some historical sense: MT is sort of the quintessential NLP application, while parsing is essentially the language-for-language's-sake task.

This analysis, unfortunately, puts me in a sticky situation. I've already been doing machine learning stuff, which I hope will turn out to be useful for other people. I'd ideally like to get back into really languagey stuff in the future, and that leaves parsing and MT. I have little interest in working on parsing (nothing against parsing: it's just not my thing, and there are lots of people who can probably do it better than me), so now I'm stuck with MT. MT is, without a doubt, an interesting topic. But the field is so crowded right now that it seems difficult to find a sufficiently large and interesting niche to carve out. I personally think out-of-English translation is promising, especially combined with morphological research (pretty much required when translating into a highly inflected language), but there are a large number of barriers to entry to this problem: evaluation, my American-esque inability to speak any foreign language reliably enough to do error analysis (except Japanese, but there's so little data there), and so on.

But regardless of my personal interests and limitations, the fact that the field is so dominated by two problems seems unfortunate. There were 44 bins that ACL allowed you to select for your topic this year, and while many of them largely overlap, many do not. Why, for instance, does research in discourse, inference, phonology, question answering, semantics, and word sense disambiguation carry comparatively little weight in the field as a whole? Why are there six sessions on MT (plus one multilingual) and six on parsing (plus two on grammars), but only two on semantics and discourse and one on language modeling at ACL this year? And beyond that, there seem to be lots of interesting problems that no one is working on (or at least that no one publishes at ACL). I do not doubt that MT and parsing are interesting problems, but there is a severe skew here that may not be healthy for the community.

11 comments:

Anonymous said...

By parsing, do you mean parsing a sentence?
If so, then doesn't machine translation require parsing?

I am just a high schooler right now, so I apologize if the question is a bit stupid.

hal said...

Yes, parsing = parsing a sentence into syntactic units. Although there is a current trend toward using syntactic information for translation, the current best systems (according to some metrics) do not use syntax at all: they essentially segment a foreign sentence into phrases, translate the phrases independently by looking up memorized candidate translations, then reorder them to try to produce fluent output. I'm planning on putting together a "getting started in" for MT soon, but the short answer is: no, it doesn't require parsing.
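(To make that phrase-based pipeline a little more concrete, here is a minimal toy sketch in Python. The tiny phrase table, the greedy segmentation, and the lack of any real reordering model are all invented for illustration; actual systems are far more sophisticated.)

# Toy sketch of the phrase-based pipeline described above: segment the
# source sentence into known phrases, translate each phrase from a
# memorized phrase table, then (in real systems) reorder. The phrase
# table and the example sentence are made up purely for illustration.

PHRASE_TABLE = {
    ("la", "casa"): ("the", "house"),
    ("verde",): ("green",),
    ("es",): ("is",),
    ("grande",): ("big",),
}

def segment(words):
    """Greedily chop the sentence into the longest phrases in the table."""
    phrases, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):
            if tuple(words[i:j]) in PHRASE_TABLE:
                phrases.append(tuple(words[i:j]))
                i = j
                break
        else:
            phrases.append(tuple(words[i:i + 1]))  # unknown word: pass through
            i += 1
    return phrases

def translate(sentence):
    phrases = segment(sentence.split())
    # Translate each phrase independently from the memorized table.
    translated = [PHRASE_TABLE.get(p, p) for p in phrases]
    # Reordering is skipped here (identity order); real systems search over
    # permutations scored by a language model to get fluent output.
    return " ".join(word for phrase in translated for word in phrase)

print(translate("la casa verde es grande"))
# -> "the house green is big"  (disfluent: this is exactly why reordering matters)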

Anonymous said...

Mind.Forth, the quintessential NLP AI that I have been working on for a decade (and had a breakthrough on a few weeks ago), has elements of both parsing and machine translation. Alas, however, I do not know how to direct anybody toward a career in open-source artificial intelligence. I can only ask that NLP-ers download Mind.Forth, run it in Tutorial mode to watch the artificial mind think -- and show it to as many interested persons as possible. Then maybe jobs in NLP AI will begin to emerge. Best of luck!

John_Cass said...

Do you mean research or working for an organization?

Kevin Duh said...

I feel you! The question of doing impactful research is so important! I've been stuck thinking about my thesis topic for a while now. I want to work on something that has practical impact, either on the research community (as a well-cited paper) or on industry/society (as a good application). However, how does one do this? It seems that one can either enter a crowded field (which guarantees a large audience) and carve out a niche, or invent a new area that has strong future impact.

As I see it, the crowded fields now are MT and parsing. To gain some recognition here, one really needs to invent some new model/technique that outperforms all the other systems (e.g. David Chiang's Hiero system), or find a less-studied area such as morphology in MT or adaptation of parsers.

Regarding new fields, I personally think things that deal with the web, such as sentiment analysis of reviews, social networks, blogs, machine reading, etc., may be future killer applications because they have direct social ramifications. I also think that productivity tools, such as email analysis, may have a huge market as future office workers become buried in information. Another area that does not directly relate to NLP but has potential is multimedia communication: there is an increasingly large repository of video and audio content on the web (e.g., youtube.com) which requires easy browsing and searching. NLP and summarization of these data may open up some opportunities.

Having said all that, I still pretty much want to work on machine learning. Personally, I'd like to develop a research direction that allows me to straddle both machine learning and NLP, applying advanced machine learning methods to NLP and letting NLP inspire new machine learning problems. This leads me to the following question: are there types of problems in NLP that haven't been investigated in the machine learning community?

After I'd been stuck on figuring out a thesis topic for a while, my advisor suggested that I just start working on *something*. Anything is fine. I guess the act of working on some project may inspire me to think of better ideas.

Anonymous said...

The take-away message from the most "impactful" papers list is that we forget most of what happened more than 10 years ago.

What about long-term impact? My vote's for Shannon's seminal paper on information theory -- he introduced character n-grams and the noisy channel model. In the late 1940s. Or Viterbi's first-best HMM decoder?

As for the field as a whole, I would claim that there haven't been *any* papers written and published in CL/ACL that'll stand the test of time.

The ACL has been around since the 1960s. I remember Ron Kaplan and Bonnie Webber complaining in 1987 that all of the significant work done in the 1970s (Lunar, parallel parsing, etc.) was either forgotten or being reinvented.

If someone had asked me the question in 1987, I'd be listing the LFG papers of Kaplan, unification grammars of Shieber, some cool work on coordination by Steedman, nifty abductive reasoning by Hobbs, etc. etc.

My guess is that Hal's current list will look just as quaint in 20 years. Especially since almost none of it is really that NLP-specific, and the stuff that is NLP-specific (Collins's parser, IBM's translation models, Identifinder/Nymble) has been superseded by "better" models.

hal said...

John -- I mean research.

Kevin -- I agree new fields are fun, and probably the direction I will go. But there are two caveats. (1) It can sometimes be very difficult to break ground, publishing-wise. People seem somewhat psychologically more comfortable with old ground. Moreover, and more importantly, if you start a new field, you have to solve a lot of problems, the biggest of which (from a publications perspective) is evaluation. In my experience (and the experience of many people I've talked to), this is the easiest way to have a paper killed: someone can always find something wrong with your evaluation method. (2) I don't know how much one should worry about this, but for a lot of reasonably commercially viable web-based things, I have some concern that a larger organization (e.g., Google, Yahoo, MSN) would essentially scoop whatever new cool thing I try (not intentionally -- good ideas just seem to pop up in different places all the time) and "win" just on the basis of having more data. While this hasn't happened to me, it's a bit scary and, I'd imagine, somewhat frustrating.

Bob -- very insightful :). I'd imagine that of the things on the list, the only one that has any hope of mattering is the IBM paper, if for no other reason than that it really brought home the idea of alignments to NLP (of course they existed in speech for quite some time before that)... or, at least, not having been around in 1993, my impression is that this paper did that. But I think your point is well taken. There is some sense in which some of the old stuff might be coming back (the PARC people have been publishing on LFG for a few years now, some of the syntactic MT work at ISI is starting to have more and more of a unification perspective, though they don't call it that, etc.). I even know of some people who (I believe unsuccessfully, unfortunately) tried to reinvent Hobbs' abductive reasoning using modern statistical techniques (note that abduction is really nothing more than Bayesian inference; the "bang" that Hobbs gets is just the explaining-away phenomenon).
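(As a quick illustration of what I mean by explaining away, here is a tiny sketch; the two causes, the priors, and the noisy-OR likelihood are all numbers I made up for the example.)

# Tiny explaining-away example: two independent causes of one observation.
# All probabilities below are invented purely for illustration.
from itertools import product

P_RAIN, P_SPRINKLER = 0.1, 0.1  # priors on the two causes

def p_wet(rain, sprinkler):
    """Noisy-OR likelihood of the grass being wet given the causes."""
    return 1.0 - (1.0 - 0.9 * rain) * (1.0 - 0.8 * sprinkler)

def posterior(query, evidence):
    """P(query=1 | evidence, wet=1) by brute-force enumeration of the joint."""
    num = den = 0.0
    for rain, sprinkler in product([0, 1], repeat=2):
        world = {"rain": rain, "sprinkler": sprinkler}
        p = (P_RAIN if rain else 1 - P_RAIN) * \
            (P_SPRINKLER if sprinkler else 1 - P_SPRINKLER) * \
            p_wet(rain, sprinkler)
        if all(world[k] == v for k, v in evidence.items()):
            den += p
            if world[query]:
                num += p
    return num / den

# Seeing wet grass raises belief in rain (abducing an explanation)...
print(posterior("rain", {}))                # P(rain | wet): well above the 0.1 prior
# ...but additionally learning the sprinkler was on "explains away" the rain.
print(posterior("rain", {"sprinkler": 1}))  # P(rain | wet, sprinkler): much lower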

So what can we learn from this? That no one will remember what we do? That we should stick to whatever's the hot topic of the day? It seems there must be a positive in here, but I'm searching for it...

John_Cass said...

I think there's some irony in a data-mining expert wondering which topic to research next. :-)

Though maybe, rather than trying to find the topic everyone is discussing, you really want to find the area no one is discussing.

hal said...

My sense is that techniques vary more over time than problems. CRFs, max margin stuff, etc. are very popular now, though the problem foci (MT, parsing) haven't changed in 50 years (if anything, my sense is that things were actually more broad 15-25 years ago than they are now). Of course, this is outside my first-hand knowledge, so I may be wrong :).

hal said...

Bob, I had another thought on the "things past two decades are forgotten" point. I don't think this is necessarily true. We just don't remember them as papers. Especially for the younguns among us: I never read the Sparck-Jones tf-idf paper, or the Viterbi paper, or the unification paper, but I learned about these things in classes, and clearly something like tf-idf has had more of an impact than, say, parsing with maxent models (no offense to Adwait). But I think that, at least for current students, we think of these things as "known" rather than as "research." (I had to specifically seek out the Brown 93 paper to read a few years back, and that took some doing.) This of course renders "influential papers" lists rather bogus, but I don't think it means we don't know or care about the older stuff.
