Statistical analysis of Linguistic data
This is a special topics course on linguistic data. More and more data
these days have linguistic content, so this class will investigate what
it takes to drop such linguistic data into a statistical model.
The main books will be:
Homework:
I've written up homework 1 (.pdf, .Rnw, and .html for a quick
view). Please start playing around with it and let me know of any
bugs you discover. The further homeworks are a work in progress.
Schedule:
- CCA
- slides for today's lecture
- paper with Sham
- CCA goes back to the 1930s, so there should be plenty of
web material to look over. I won't put it up. But if you find
something nice, email it to me and I'll post it.
- Other disambiguation solutions
- pdf
- Read the rest of chapter 20
- Nov 9th: Look at distributional vectors of words
- Nov 23rd: Proof of martingale for mFDR
- Nov 30th: NO CLASS! GO SEE DAN ROTH!!
- Dec 2nd: Question answering
- Dec 7th: Double lecture
- First lecture: Translation (chapter 25)
- Second lecture: Extensions to streaming feature selection (3-4:30)
- JASA paper: VIF (Dongyu Lin)
- JMLR paper: auction
- Dec 9th: in-class presentations
- Dec 10th: out-of-class presentations (maybe?)
dean.foster@gmail.com
Last modified: Tue Dec 7 14:21:42 EST 2010