15 July 2006

Where are Interesting Learning Problems in NLP?

I just spent a few days visiting Yee Whye at NUS (photos to prove it!). YW and I talked about many things, but one that stood out as a ripper was attempting to answer the question: as a machine learning person (i.e., YW), what problems are there in NLP that are likely to be amenable to interesting machine learning techniques (e.g., Bayesian methods, non-parametric methods, etc.)? This is a question we tried to answer at the workshop last year, but I don't think we reached a conclusion.

At first, we talked about some potential areas, mostly focusing on problems for which one really needs to perform some sort of inference at a global scale, rather than just locally.  I think that this answer is something of a pearler, but not the one I want to dwell on.

Another potential partial answer arose, which I think bears consideration: it will not be on any problem that is popular right now.  Why?  Well, what we as NLPers usually do these days is use simple (often heuristic) techniques to solve problems.  And we're doing a sick job at it, for the well-studied tasks (translation, factoid QA, ad hoc search, parsing, tagging, etc.).  The hunch is that one of the reasons such problems are so popular these days is that such techniques work so bloody well.  Given this, you'd have to be a flamin' galah to try to apply something really fancy to solve one of these tasks.

This answer isn't incompatible with the original answer (globalness).  After all, most current techniques only use local information.  There is a push toward "joint inference" problems and reducing our use of pipelining, but this tends to be at a fairly weak level.
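To make the local-versus-global point a bit more concrete, here is a toy sketch (nothing here comes from a real system; the sentence, tag set and scores are all invented): a pipeline commits to the locally best tag sequence and only then consults the parser, while joint decoding scores the tagger and the parser together and can keep an analysis the pipeline throws away.

```python
# Toy pipeline vs. joint decoding sketch -- all words, tags and scores are invented.
import math
from itertools import product

words = ["saw", "her", "duck"]
tags = ["N", "V"]

# Local (per-word) tagger scores, made up for illustration.
TAG_SCORE = {("saw", "N"): 0.1, ("saw", "V"): 0.9,
             ("her", "N"): 0.6, ("her", "V"): 0.4,
             ("duck", "N"): 0.7, ("duck", "V"): 0.3}

def parse_score(tag_seq):
    # A stand-in "parser" that prefers analyses ending in a verb
    # (e.g., "duck" as the verb of an embedded clause).
    return 0.9 if tag_seq[-1] == "V" else 0.2

def seq_score(tag_seq):
    # Joint score: tagger scores and parser preference combined.
    return math.prod(TAG_SCORE[w, t] for w, t in zip(words, tag_seq)) * parse_score(tag_seq)

# Pipeline: commit to the locally best tag for each word, then parse whatever comes out.
pipeline_tags = tuple(max(tags, key=lambda t: TAG_SCORE[w, t]) for w in words)

# Joint: search over all tag sequences, scoring tagger and parser together.
joint_tags = max(product(tags, repeat=len(words)), key=seq_score)

print("pipeline:", pipeline_tags, seq_score(pipeline_tags))  # ('V', 'N', 'N'), lower joint score
print("joint:   ", joint_tags, seq_score(joint_tags))        # ('V', 'N', 'V'), higher joint score
```

Of course real joint inference doesn't enumerate every structure like this; the point is only that the pipeline discards the tagging the parser would have preferred.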

This is not to say that Bayesian techniques (or fancy machine learning more generally) are not applicable to problems with only local information, but there seems to be little need to move toward integrating large amounts of global uncertainty.  Of course, you may disagree, and if you do, no wuckers.

p.s., I'm in Australia for ACL, so I'm trying to practice my Aussie English.

1 comment:

Kevin Duh said...

I can't imagine you with an Australian accent, although it sure seems you've got the lexicon down quite well. :)

Regarding interesting machine learning problems in NLP: I agree that global/joint learning is one aspect. In general, I think the neat thing about NLP is that the inputs are words and sentences, and the outputs are sentences and trees. This makes it different from the simple binary classification problem commonly explored in machine learning (so I'm talking about structured outputs here).

I think there may be an additional interesting area, but it's something I haven't seen much of: figuring out the feature space of NLP tasks. Basically, we often define some sort of vectorial representation of words/sentences based on some linguistically motivated features, and then run some learning algorithm on the resulting Euclidean space. It may be that this representation isn't so good, since words are discrete and don't really lie in a Euclidean space. So I'm thinking of smarter feature selection/induction algorithms, something along the lines of Jun Suzuki's "Convolution Kernels with Feature Selection" paper.
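To make that concrete, here's a toy sketch of the kind of representation I mean (the features, feature names and sentence are all invented, just to show discrete words being flattened into a Euclidean feature space):

```python
# Toy feature-extraction sketch -- the features and sentence are invented for illustration.

def word_features(word, position, sentence):
    """Map a word in context to a sparse dict of (mostly indicator) features."""
    prev = sentence[position - 1].lower() if position > 0 else "<s>"
    return {
        f"word={word.lower()}": 1.0,
        f"suffix3={word[-3:].lower()}": 1.0,
        f"prev={prev}": 1.0,
        "is_capitalized": 1.0 if word[0].isupper() else 0.0,
    }

def to_vector(feats, feature_index):
    """Flatten the sparse feature dict into a dense Euclidean vector."""
    vec = [0.0] * len(feature_index)
    for name, value in feats.items():
        vec[feature_index[name]] = value
    return vec

sentence = ["They", "saw", "her", "duck"]
all_feats = [word_features(w, i, sentence) for i, w in enumerate(sentence)]
feature_index = {name: j for j, name in
                 enumerate(sorted({n for f in all_feats for n in f}))}
vectors = [to_vector(f, feature_index) for f in all_feats]

for word, vec in zip(sentence, vectors):
    print(word, vec)
```

Whether that kind of one-hot-ish geometry is a sensible space for words is exactly the question; smarter feature induction would replace the hand-written features above with learned ones.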

I think NLP is full of interesting problems for MLers. The catch is for MLers to use new algorithms to beat the state of the art, which is often not so easy!