The main readings will be:

- Speech and Language Processing by Jurafsky and Martin. (Main text.)
- Foundations of Statistical Natural Language Processing by Manning and Schütze. (Suggested if you have a weaker statistics background.)
- Language Log (In particular, posts by Mark Liberman often have a lot of statistical content.)

- Homework 1 is due Sept 24th.

- Sept 5: Introduction (.pdf)

- Sept 10: Regular expressions (.pdf)
- Sept 12: N-grams (.pdf)
- N-grams (Chapter 4 of JM)

- Sept 24: Homework 1 is due

- Backoff (.pdf) and information theory.
- Part-of-speech tagging
- Chapter 5 of JM
- Stanford tagger
- Penn Treebank

- HMM (.pdf) (chapter 6)
- HMM (Chapter 6 of JM)
- Kalman filters
- Estimation via CCA (see Sham's paper)

- HMM (part 2) (.pdf)

- Speech: encoding (.pdf) (chapters 7 and 8)
- Limits on D to A: Shannon limit
- Limits on A to D: Nyquist rate
- IPA (fancy alphabet: read along as various people speak words.)

- Speech: decoding (.pdf) (chapters 9 and 10)

- First-order predicate calculus
- Read chapters 17 and 18

- Word sense disambiguation
- Read chapters 19 and 20

- CCA
- slides for today's lecture
- paper with Sham
- CCA goes back to the 1930s, so there should be plenty of web material to look over. I won't put it up. But if you find something nice, email it to me and I'll post it.

- Other disambiguation solutions
- Read the rest of chapter 20

- Nov 1st, 2012: First Language Log blog post due (for those doing that for their final project).
- XXX: Look at distributional vectors of words

- XXX: NER
- XXX: mFDR
- lecture notes in .pdf (as given at JSM in 2008)
- mFDR martingale paper: Martingale
- We started this research studying bankruptcy.
- A good introduction to the ideas of risk ratios (Dongyu Lin)
- The original risk inflation paper (with Edward George)

- XXX: Proof of martingale for mFDR

- XXX: Question answering

- XXX: Double lecture
- First lecture: Translation (chapter 25)
- Second lecture: Extensions to streaming feature selection (3-4:30)
- JASA paper: VIF (Dongyu Lin)
- JMLR paper: auction

- XXX: In-class presentations

