Statistics 471/701 home page
- Class: MW3-4:30, F92
- TA: Joshua Magarick
- email: email@example.com
- Last year's TA, Sathya, kept a web site with several useful
files on it: code
- Office hours: Tuesday at 11 am (Huntsman 472). Or email me for an appointment.
- Course work:
- submit all exercises to firstname.lastname@example.org
- Exercises (these introduce you to R. They are not that
important; they are more for your benefit.)
- Homeworks (or more accurately, cases)
- Final project
- Note: You are the best students on campus. I have very high
expectations for what you will learn this semester. On student evaluations, this class is often listed as taking the most work of
any class taken at Wharton.
- We will be using R (free) as our statistics
package. I'll be using R in class. The book is on R. Statistics
revolves around R.
- Two useful books on R:
- Introductory Statistics with R by Peter Dalgaard, 2nd edition, ISBN 978-0-387-79053-4, Springer 2008 (paperback).
- Linear Models with R by Julian J. Faraway, ISBN 1-58488-425-8, Chapman & Hall/CRC Press 2005 (hardback).
- But the web and Joshua are your best resources!
- Jan 11: Fitting functions with Taylor
- Jan 14 @ 12-1:30 (F45): Introduction to R (by Sathyanarayan
Anand).
- bring your laptops!
- After seeing his introduction, you should be able to do the
first practice R set. If you have
questions about doing this, send us an email and we can add more
information to the file.
- Feb 14: HW 2 due at 2:45, so
you have time to get to class on time. :-)
- Feb 14: Interlude: Darwin
- Feb 16: A Ponzironi:
- Everything in one paper on testing alpha.
- Other information:
- See my research page for some
blog discussions of these ideas
- Ponzi 1 class notes (general idea)
- Ponzi 2 class notes (rule of 3)
- CAPM revisited: notes 2 (connecting back to CAPM)
- Dice discussion and a dice paper.
- Long run growth rate: notes 1
- Feb 21: Introduction to linguistics
- Mitch Marcus and Noam Chomsky
- History of bigger data
- Problem statement: Who wrote the Federalist Papers?
- Zipf distribution (see NSF
for some curious modern ideas on Zipf.)
- Feb 23: Singularity discussion and Federalist Papers
- Mar 14: PCAs
- Class notes (pdf)
- This class will set up the concepts; the next class will show
how to apply them to linguistics, with some R code.
- Find someone to work with on your project.
- Mar 28: proposal due
- Apr 4: lit review due
- Apr 12: data analysis due
- end of semester: presentations
- Mar 21: CCA and language
- Mar 23: Predicting the next word
- Mar 24 (4-5pm): Using R in linguistics (for homework 4)
- Room 350 JMHH
- If you are comfortable with the homework, you don't need to attend.
- It will "truly" start at 4:30, but if you have class at 4:30,
come at 4:00.
- The R code we will be
- Sathya's code to help you
with homework 4.
- Apr 18: Loss functions and calibration
- Apr 20: In class presentations
- Apr 25: In class presentations
- Apr 27 (Reading days), 3-6, F92: Optional presentation date for those who
want the extra two days. If interested, let me know. I might even
allow PowerPoint on this date.
Future Data sets
Future practice exercises
I'll keep a page about the current R
practice you should be doing.
- fire up R (first week's practice)
- make doglegs (second week's practice)
- residuals (3rd week's practice)
- heteroskedasticity (3rd week's practice)
- homework one help file.
- homework two help file.
- federalist help file
Ask me about the LaTeX and R connection (Sweave)
- example source file
- What it looks like
- Other references I've found useful:
Data sets used in 2009
Other Data sets of interest and some that will be used later
Last modified: Wed Jan 11 12:35:56 EST 2012