Statistics (STAT)

STAT 101 Introductory Business Statistics

Data summaries and descriptive statistics; introduction to a statistical computer package; probability: distributions, expectation, variance, covariance, portfolios, central limit theorem; statistical inference for univariate data; statistical inference for bivariate data: inference for intrinsically linear simple regression models. This course has a business focus but is also appropriate for students in the college.

One-term course offered either term

Prerequisite: MATH 104 or MATH 110 or equivalent; successful completion of STAT 101 is prerequisite to STAT 102

Activity: Lecture

1 Course Unit

STAT 102 Introductory Business Statistics

Continuation of STAT 101. A thorough treatment of multiple regression, model selection, analysis of variance, linear logistic regression; introduction to time series. Business applications.

One-term course offered either term

Prerequisite: STAT 101

Activity: Lecture

1 Course Unit

STAT 111 Introductory Statistics

Introduction to concepts in probability. Basic statistical inference procedures of estimation, confidence intervals and hypothesis testing directed towards applications in science and medicine. The use of the JMP statistical package.

One-term course offered either term

Prerequisites: High school algebra.

Activity: Recitation

1 Course Unit

STAT 112 Introductory Statistics

Further development of the material in STAT 111, in particular the analysis of variance, multiple regression, non-parametric procedures and the analysis of categorical data. Data analysis via statistical packages.

One-term course offered either term

Prerequisite: STAT 111

Activity: Lecture

1 Course Unit

STAT 399 Independent Study

One-term course offered either term

Prerequisites: Written permission of instructor and the department course coordinator.

Activity: Independent Study

1 Course Unit

STAT 405 Statistical Computing with R

The goal of this course is to introduce students to the R programming language and related ecosystem. This course will provide a skill set that is in demand in both the research and business environments. In addition, R is a platform that is used and required in other advanced classes taught at Wharton, so this class will prepare students for these higher-level classes and electives.

Taught by: Stine, Waterman, Zhang

One-term course offered either term

Prerequisite: STAT 102 or STAT 112 or STAT 430

Activity: Lecture

0.5 Course Units

STAT 422 Predictive Analytics for Business

This course follows from the introductory regression classes, STAT 102, STAT 112, and STAT 431 for undergraduates and STAT 613 for MBAs. It extends the ideas from regression modeling, focusing on the core business task of predictive analytics as applied to realistic business-related data sets. In particular, it introduces automated model selection tools, such as stepwise regression, and various current model selection criteria such as AIC and BIC. It delves into classification methodologies such as logistic regression. It also introduces classification and regression trees (CART) and the popular predictive methodology known as the random forest.

One-term course offered either term

Prerequisite: STAT 102 or STAT 112 or STAT 431

Activity: Lecture

0.5 Course Units

STAT 424 Text Analytics

This course introduces methods for the analysis of unstructured data, focusing on statistical models for text. Techniques include those for sentiment analysis, topic models, and predictive analytics. Course includes topics from natural language processing (NLP), such as identifying parts of speech, parsing sentences (e.g., subject and predicate), and named entity recognition (people and places). Unsupervised techniques suited to feature creation provide variables suited to traditional statistical models (regression) and more recent approaches (regression trees). Examples that span the course illustrate the success of text analytics. Hierarchical generating models often associated with nonparametric Bayesian analysis supply theoretical foundations.

Taught by: Stine

One-term course offered either term

Prerequisites: Students should be familiar with regression models at the level of STAT 102 and the R statistics language at the level of STAT 405. Familiarity with the RStudio development environment is presumed, as well as with common R packages such as stringr, dplyr and ggplot. Those with more knowledge of statistics, such as from STAT 422, or computing skills will benefit. The predominant software used in the course is R, with bits of JMP when helpful for interactive illustration. Familiarity with basic probability models is helpful but not presumed.

Activity: Lecture

0.5 Course Units

STAT 430 Probability

Discrete and continuous sample spaces and probability; random variables, distributions, independence; expectation and generating functions; Markov chains and recurrence theory.

One-term course offered either term

Prerequisite: MATH 114 or MATH 115 or equivalent

Activity: Lecture

1 Course Unit

STAT 431 Statistical Inference

Graphical displays; one- and two-sample confidence intervals; one- and two-sample hypothesis tests; one- and two-way ANOVA; simple and multiple linear least-squares regression; nonlinear regression; variable selection; logistic regression; categorical data analysis; goodness-of-fit tests. A methodology course. This course does not have business applications but has significant overlap with STAT 101 and 102.

One-term course offered either term

Prerequisite: STAT 430

Activity: Lecture

1 Course Unit

STAT 432 Mathematical Statistics

An introduction to the mathematical theory of statistics. Estimation, with a focus on properties of sufficient statistics and maximum likelihood estimators. Hypothesis testing, with a focus on likelihood ratio tests and the consequent development of "t" tests and hypothesis tests in regression and ANOVA. Nonparametric procedures.

Course usually offered in spring term

Prerequisite: STAT 430 or 510 or equivalent

Activity: Lecture

1 Course Unit

STAT 433 Stochastic Processes

An introduction to Stochastic Processes. The primary focus is on Markov Chains, Martingales and Gaussian Processes. We will discuss many interesting applications from physics to economics. Topics may include: simulations of path functions, game theory and linear programming, stochastic optimization, Brownian Motion and Black-Scholes.

One-term course offered either term

Prerequisites: STAT 430, or permission of instructor

Activity: Lecture

1 Course Unit

STAT 435 Forecasting Methods for Management

This course provides an introduction to the wide range of techniques available for statistical forecasting. Qualitative techniques, smoothing and decomposition of time series, regression, adaptive methods, autoregressive-moving average modeling, and ARCH and GARCH formulations will be surveyed. The emphasis will be on applications, rather than technical foundations and derivations. The techniques will be studied critically, with examination of their usefulness and limitations.

Taught by: Shaman

One-term course offered either term

Prerequisite: STAT 102 or 112 or 431

Activity: Lecture

1 Course Unit

STAT 436 Introduction to Large-Scale Data Science

The course will focus on computational approaches to large-scale data analysis. The lectures will introduce the relevant concepts, and students will be asked to work on projects, implementing the methods and experimenting with large-scale datasets. The course will cover various techniques for updating models in an online fashion, as well as subsampling and dimensionality-reduction techniques. The students will experiment with neural network architectures and learn to build predictive models for modern machine learning tasks.

One-term course offered either term

Prerequisites: Linear Algebra and basic R programming

Activity: Lecture

1 Course Unit

STAT 451 Fundamentals of Actuarial Science I

This course is the usual entry point in the actuarial science program. It is required for students who plan to concentrate or minor in actuarial science. It can also be taken by others interested in the mathematics of personal finance and the use of mortality tables. For future actuaries, it provides the necessary knowledge of compound interest and its applications, and basic life contingencies definition to be used throughout their studies. Non-actuaries will be introduced to practical applications of finance mathematics, such as loan amortization and bond pricing, and premium calculation of typical life insurance contracts. Main topics include annuities, loans and bonds; basic principles of life contingencies and determination of annuity and insurance benefits and premiums.

Taught by: Lemaire

Course usually offered in fall term

Prerequisites: MATH 104, STAT 430. STAT 430 can be taken concurrently with BEPP 451 or STAT 451

Activity: Lecture

1 Course Unit

STAT 452 Fundamentals of Actuarial Science II

This specialized course is usually only taken by Wharton students who plan to concentrate in actuarial science and Penn students who plan to minor in actuarial mathematics. It provides a comprehensive analysis of advanced life contingencies problems such as reserving, multiple life functions, and multiple decrement theory, with application to the valuation of pension plans.

Taught by: Lemaire

Course usually offered in spring term

Prerequisite: BEPP 451 or STAT 451

Activity: Lecture

1 Course Unit

STAT 453 Actuarial Statistics

This course covers models for insurers' losses and applications of Markov chains. Poisson processes, including extensions such as non-homogeneous, compound, and mixed Poisson processes, are studied in detail. The compound model is then used to establish the distribution of losses. An extensive section on Markov chains provides the theory to forecast future states of the process, as well as numerous applications of Markov chains to insurance, finance, and genetics. The course is abundantly illustrated by examples from the insurance and finance literature. While most of the students taking the course are future actuaries, other students interested in applications of statistics may discover in class many fascinating applications of stochastic processes and Markov chains.

Taught by: Lemaire

Course usually offered in fall term

Prerequisite: STAT 430

Activity: Lecture

1 Course Unit

STAT 454 Applied Statistical Methods for Actuaries

One half of the course is devoted to the study of time series, including ARIMA modeling and forecasting. The other half studies modifications in random variables due to deductibles, co-payments, policy limits, and elements of simulation. This course is a possible entry point into the actuarial science program. The Society of Actuaries has approved STAT 854 for VEE credit on the topic of time series.

Taught by: Lemaire

Course usually offered in spring term

Prerequisites: STAT 430, STAT 431

Activity: Lecture

1 Course Unit

STAT 470 Data Analytics and Statistical Computing

This course will introduce a high-level programming language, called R, that is widely used for statistical data analysis. Using R, we will study and practice the following methodologies: data cleaning, feature extraction; web scraping, text analysis; data visualization; fitting statistical models; simulation of probability distributions and statistical models; statistical inference methods that use simulations (bootstrap, permutation tests).

Taught by: Buja

One-term course offered either term

Prerequisites: STAT 101 and 102 or STAT 111 and 112 or STAT 431 or ECON 103 and ECON 104

Activity: Lecture

1 Course Unit

STAT 471 Modern Data Mining

Modern Data Mining: Statistics or Data Science has been evolving rapidly to keep up with the modern world. While classical multiple regression and logistic regression techniques continue to be the major tools, we go beyond them to include methods built on top of linear models, such as LASSO and Ridge regression. Contemporary methods such as KNN (K nearest neighbor), Random Forest, Support Vector Machines, Principal Component Analysis (PCA), the bootstrap and others are also covered. Text mining, especially through PCA, is another topic of the course. While learning all the techniques, we keep in mind that our goal is to tackle real problems. Not only do we go through a large collection of interesting, challenging real-life data sets, but we also learn how to use the free, powerful software "R" in connection with each of the methods presented in the class.

Taught by: Zhao

One-term course offered either term

Prerequisite: STAT 102 or 112 or 431

Activity: Lecture

1 Course Unit

STAT 474 Modern Regression for the Social, Behavioral and Biological Sciences

Function estimation and data exploration using extensions of regression analysis: smoothers, semiparametric and nonparametric regression, and supervised machine learning. Conceptual foundations are addressed as well as hands-on use for data analysis.

Taught by: Berk

Course usually offered in spring term

Prerequisite: STAT 102 or 112 or equivalent

Activity: Lecture

1 Course Unit

STAT 475 Sample Survey Design

This course will cover the design and analysis of sample surveys. Topics include simple sampling, stratified sampling, cluster sampling, graphics, regression analysis using complex surveys and methods for handling nonresponse bias.

Course not offered every year

Prerequisite: STAT 102 or 112 or 431

Activity: Lecture

1 Course Unit

STAT 476 Applied Probability Models in Marketing

This course will expose students to the theoretical and empirical "building blocks" that will allow them to construct, estimate, and interpret powerful models of customer behavior. Over the years, researchers and practitioners have used these models for a wide variety of applications, such as new product sales forecasting, analyses of media usage, and targeted marketing programs. Other disciplines have seen equally broad utilization of these techniques. The course will be entirely lecture-based with a strong emphasis on real-time problem solving. Most sessions will feature sophisticated numerical investigations using Microsoft Excel. Much of the material is highly technical.

Taught by: Fader

Course usually offered in spring term

Prerequisites: A high comfort level with basic integral calculus and recent exposure to a formal course in probability and statistics such as STAT 430 is strongly recommended.

Activity: Lecture

1 Course Unit

STAT 480 Advanced Statistical Computing

This course will build on the fundamental concepts introduced in the prerequisite courses to allow students to acquire knowledge and programming skills in large-scale data analysis, data visualization, and stochastic simulation.

Taught by: Buja

Course usually offered in spring term

Prerequisites: STAT 470 or STAT 405 or equivalent background acquired through a combination of online courses that teach the R language and practical experience.

Activity: Lecture

1 Course Unit

STAT 500 Applied Regression and Analysis of Variance

An applied graduate-level course in multiple regression and analysis of variance for students who have completed an undergraduate course in basic statistical methods. Emphasis is on practical methods of data analysis and their interpretation. Covers model building, general linear hypothesis, residual analysis, leverage and influence, one-way ANOVA, two-way ANOVA, factorial ANOVA. Primarily for doctoral students in the managerial, behavioral, social and health sciences.

Taught by: Rosenbaum

Course usually offered in fall term

Prerequisite: STAT 102 or 112 or equivalent

Activity: Lecture

1 Course Unit

STAT 501 Introduction to Nonparametric Methods and Log-linear Models

An applied graduate level course for students who have completed an undergraduate course in basic statistical methods. Covers two unrelated topics: loglinear and logit models for discrete data and nonparametric methods for nonnormal data. Emphasis is on practical methods of data analysis and their interpretation. Primarily for doctoral students in the managerial, behavioral, social and health sciences. May be taken before STAT 500 with permission of instructor.

Taught by: Rosenbaum

Course usually offered in spring term

Prerequisite: STAT 102 or 112 or equivalent

Activity: Lecture

1 Course Unit

STAT 503 Data Analytics and Statistical Computing

This course will introduce a high-level programming language, called R, that is widely used for statistical data analysis. Using R, we will study and practice the following methodologies: data cleaning, feature extraction; web scraping, text analysis; data visualization; fitting statistical models; simulation of probability distributions and statistical models; statistical inference methods that use simulations (bootstrap, permutation tests).

Taught by: Buja

One-term course offered either term

Prerequisites: Two courses at the statistics 400 or 500 level.

Activity: Lecture

1 Course Unit

STAT 510 Probability

Elements of matrix algebra. Discrete and continuous random variables and their distributions. Moments and moment generating functions. Joint distributions. Functions and transformations of random variables. Law of large numbers and the central limit theorem. Point estimation: sufficiency, maximum likelihood, minimum variance. Confidence intervals.

One-term course offered either term

Prerequisite: A one-year course in calculus

Activity: Lecture

1 Course Unit

STAT 511 Statistical Inference

Graphical displays; one- and two-sample confidence intervals; one- and two-sample hypothesis tests; one- and two-way ANOVA; simple and multiple linear least-squares regression; nonlinear regression; variable selection; logistic regression; categorical data analysis; goodness-of-fit tests. A methodology course.

One-term course offered either term

Prerequisite: STAT 510 or equivalent

Activity: Lecture

1 Course Unit

STAT 512 Mathematical Statistics

An introduction to the mathematical theory of statistics. Estimation, with a focus on properties of sufficient statistics and maximum likelihood estimators. Hypothesis testing, with a focus on likelihood ratio tests and the consequent development of "t" tests and hypothesis tests in regression and ANOVA. Nonparametric procedures.

Course usually offered in spring term

Prerequisite: STAT 430 or 510 or equivalent

Activity: Lecture

1 Course Unit

STAT 515 Advanced Statistical Inference I

STAT 515 is aimed at first-year Ph.D. students and builds a good foundation in statistical inference from the first principles of probability.

Taught by: Low

Course usually offered in fall term

Prerequisites: STAT 430 and STAT 431 and MATH 114 and MATH 240 or equivalent

Activity: Lecture

1 Course Unit

STAT 516 Advanced Statistical Inference II

STAT 516 is a natural continuation of STAT 515, and the main focus is on asymptotic evaluations and regression models. Time permitting, it also discusses some basic nonparametric statistical methods.

Taught by: Ma

Course usually offered in spring term

Prerequisite: STAT 515

Activity: Lecture

1 Course Unit

STAT 520 Applied Econometrics I

This is a course in econometrics for graduate students. The goal is to prepare students for empirical research by studying econometric methodology and its theoretical foundations. Students taking the course should be familiar with elementary statistical methodology and basic linear algebra, and should have some programming experience. Topics include conditional expectation and linear projection, asymptotic statistical theory, ordinary least squares estimation, the bootstrap and jackknife, instrumental variables and two-stage least squares, specification tests, systems of equations, generalized least squares, and an introduction to the use of linear panel data models.

Taught by: Shaman

Course usually offered in fall term

Prerequisites: MATH 114 and MATH 312 or equivalents, and an undergraduate introduction to probability and statistics

Activity: Lecture

1 Course Unit

STAT 521 Applied Econometrics II

Topics include system estimation with instrumental variables, fixed effects and random effects estimation, M-estimation, nonlinear regression, quantile regression, maximum likelihood estimation, generalized method of moments estimation, minimum distance estimation, and binary and multinomial response models. Both theory and applications will be stressed.

Taught by: Shaman

Course usually offered in spring term

Prerequisites: STAT 520. This is a continuation of STAT 520

Activity: Lecture

1 Course Unit

STAT 533 Stochastic Processes

An introduction to Stochastic Processes. The primary focus is on Markov Chains, Martingales and Gaussian Processes. We will discuss many interesting applications from physics to economics. Topics may include: simulations of path functions, game theory and linear programming, stochastic optimization, Brownian Motion and Black-Scholes.

One-term course offered either term

Prerequisite: STAT 510 or equivalent

Activity: Lecture

1 Course Unit

STAT 542 Bayesian Methods and Computation

Sophisticated tools for probability modeling and data analysis from the Bayesian perspective. Hierarchical models, mixture models and Monte Carlo simulation techniques.

Taught by: Jensen

Course usually offered in spring term

Prerequisite: STAT 430 or 510 or equivalent or permission of instructor

Activity: Lecture

1 Course Unit

STAT 571 Modern Data Mining

Modern Data Mining: Statistics or Data Science has been evolving rapidly to keep up with the modern world. While classical multiple regression and logistic regression techniques continue to be the major tools, we go beyond them to include methods built on top of linear models, such as LASSO and Ridge regression. Contemporary methods such as KNN (K nearest neighbor), Random Forest, Support Vector Machines, Principal Component Analysis (PCA), the bootstrap and others are also covered. Text mining, especially through PCA, is another topic of the course. While learning all the techniques, we keep in mind that our goal is to tackle real problems. Not only do we go through a large collection of interesting, challenging real-life data sets, but we also learn how to use the free, powerful software "R" in connection with each of the methods presented in the class.

Taught by: Zhao

One-term course offered either term

Prerequisite: Two courses at the statistics 400 or 500 level or permission from instructor

Activity: Lecture

1 Course Unit

STAT 580 Advanced Statistical Computing

This course will build on the fundamental concepts introduced in the prerequisite courses to allow students to acquire knowledge and programming skills in large-scale data analysis, data visualization, and stochastic simulation.

Taught by: Buja

Course usually offered in spring term

Prerequisites: STAT 503 or equivalent background acquired through a combination of online courses that teach the R language and practical experience.

Activity: Lecture

1 Course Unit

STAT 613 Regression Analysis for Business

This course provides the fundamental methods of statistical analysis, the art and science of extracting information from data. The course will begin with a focus on the basic elements of exploratory data analysis, probability theory and statistical inference. With this as a foundation, it will proceed to explore the use of the key statistical methodology known as regression analysis for solving business problems, such as the prediction of future sales and the response of the market to price changes. The use of regression diagnostics and various graphical displays supplements the basic numerical summaries and provides insight into the validity of the models. Specific important topics covered include least squares estimation, residuals and outliers, tests and confidence intervals, correlation and autocorrelation, collinearity, and randomization. The presentation relies upon computer software for most of the needed calculations, and the resulting style focuses on construction of models, interpretation of results, and critical evaluation of assumptions.

Course usually offered in fall term

Prerequisites: The basic mathematical skills covered in STAT 611, Mathematics for Business Analysis

Activity: Lecture

1 Course Unit

Notes: Lecture and discussion, assigned exercises, data analysis project, quizzes and a final exam.

STAT 621 Accelerated Regression Analysis for Business

STAT 621 is intended for students with recent, practical knowledge of the use of regression analysis in the context of business applications. This course covers the material of STAT 613, but omits the foundations to focus on regression modeling. The course reviews statistical hypothesis testing and confidence intervals for the sake of standardizing terminology and introducing software, and then moves into regression modeling. The pace presumes recent exposure to both the theory and practice of regression and will not be accommodating to students who have not seen or used these methods previously. The interpretation of regression models within the context of applications will be stressed, presuming knowledge of the underlying assumptions and derivations. The scope of regression modeling that is covered includes multiple regression analysis with categorical effects, regression diagnostic procedures, interactions, and time series structure. The presentation of the course relies on computer software that will be introduced in the initial lectures.

Taught by: George

Course usually offered in fall term

Prerequisites: Recent exposure to the theory and practice of regression modeling.

Activity: Lecture

0.5 Course Units

Notes: Lecture and discussion, assigned exercises, data analysis, quizzes, and a final exam.

STAT 701 Modern Data Mining

Modern Data Mining: Statistics or Data Science has been evolving rapidly to keep up with the modern world. While classical multiple regression and logistic regression techniques continue to be the major tools, we go beyond them to include methods built on top of linear models, such as LASSO and Ridge regression. Contemporary methods such as KNN (K nearest neighbor), Random Forest, Support Vector Machines, Principal Component Analysis (PCA), the bootstrap and others are also covered. Text mining, especially through PCA, is another topic of the course. While learning all the techniques, we keep in mind that our goal is to tackle real problems. Not only do we go through a large collection of interesting, challenging real-life data sets, but we also learn how to use the free, powerful software "R" in connection with each of the methods presented in the class.

Taught by: Zhao

One-term course offered either term

Prerequisite: STAT 613 or equivalent

Activity: Lecture

1 Course Unit

STAT 705 Statistical Computing with R

The goal of this course is to introduce students to the R programming language and related ecosystem. This course will provide a skill set that is in demand in both the research and business environments. In addition, R is a platform that is used and required in other advanced classes taught at Wharton, so this class will prepare students for these higher-level classes and electives.

Taught by: Stine, Waterman, Zhang

One-term course offered either term

Prerequisites: STAT 613 or STAT 621 or waiving the Statistics Core completely.

Activity: Lecture

0.5 Course Units

STAT 711 Forecasting Methods for Management

This course provides an introduction to the wide range of techniques available for statistical forecasting. Qualitative techniques, smoothing and decomposition of time series, regression, adaptive methods, autoregressive-moving average modeling, and ARCH and GARCH formulations will be surveyed. The emphasis will be on applications, rather than technical foundations and derivations. The techniques will be studied critically, with examination of their usefulness and limitations.

Taught by: Shaman

One-term course offered either term

Prerequisite: STAT 613 or equivalent

Activity: Lecture

1 Course Unit

STAT 722 Predictive Analytics for Business (formerly STAT 622)

This course follows from the introductory regression classes, STAT 102, STAT 112, and STAT 431 for undergraduates and STAT 613 for MBAs. It extends the ideas from regression modeling, focusing on the core business task of predictive analytics as applied to realistic business-related data sets. In particular, it introduces automated model selection tools, such as stepwise regression, and various current model selection criteria such as AIC and BIC. It delves into classification methodologies such as logistic regression. It also introduces classification and regression trees (CART) and the popular predictive methodology known as the random forest.

One-term course offered either term

Prerequisite: STAT 613 or STAT 621 or having waived the statistics core completely

Activity: Lecture

0.5 Course Units

STAT 724 Text Analytics

This course introduces methods for the analysis of unstructured data, focusing on statistical models for text. Techniques include those for sentiment analysis, topic models, and predictive analytics. Course includes topics from natural language processing (NLP), such as identifying parts of speech, parsing sentences (e.g., subject and predicate), and named entity recognition (people and places). Unsupervised techniques suited to feature creation provide variables suited to traditional statistical models (regression) and more recent approaches (regression trees). Examples that span the course illustrate the success of text analytics. Hierarchical generating models often associated with nonparametric Bayesian analysis supply theoretical foundations.

Taught by: Stine

One-term course offered either term

Prerequisites: Students should be familiar with regression models at the level of STAT 613 and the R statistics language at the level of STAT 705. Familiarity with the RStudio development environment is presumed, as well as with common R packages such as stringr, dplyr and ggplot. Those with more knowledge of statistics, such as from STAT 722, or computing skills will benefit. The predominant software used in the course is R, with bits of JMP when helpful for interactive illustration. Familiarity with basic probability models is helpful but not presumed.

Activity: Lecture

0.5 Course Units

STAT 770 Data Analytics and Statistical Computing

This course will introduce a high-level programming language, called R, that is widely used for statistical data analysis. Using R, we will study and practice the following methodologies: data cleaning, feature extraction; web scraping, text analysis; data visualization; fitting statistical models; simulation of probability distributions and statistical models; statistical inference methods that use simulations (bootstrap, permutation tests).

Taught by: Buja

One-term course offered either term

Prerequisites: STAT 613 or STAT 621 or waiving the Statistics Core completely.

Activity: Lecture

1 Course Unit

STAT 776 Applied Probability Models in Marketing

This course will expose students to the theoretical and empirical "building blocks" that will allow them to develop and implement powerful models of customer behavior. Over the years, researchers and practitioners have used these methods for a wide variety of applications, such as new product sales forecasting, analyses of media usage, customer valuation, and targeted marketing programs. These same techniques are also very useful for other types of business (and non-business) problems. The course will be entirely lecture-based with a strong emphasis on real-time problem solving. Most sessions will feature sophisticated numerical investigations using Microsoft Excel. Much of the material is highly technical.

Taught by: Fader

Course usually offered in spring term

Prerequisites: Students must have a high comfort level with basic integral calculus, and recent exposure to a formal course in probability and statistics is strongly recommended.

Activity: Lecture

1 Course Unit

Notes: Format: Lecture, real-time problem solving

STAT 780 Advanced Statistical Computing

This course will build on the fundamental concepts introduced in the prerequisite courses to allow students to acquire knowledge and programming skills in large-scale data analysis, data visualization, and stochastic simulation.

Taught by: Buja

Course usually offered in spring term

Prerequisites: STAT 770 or STAT 705 or equivalent background acquired through a combination of online courses that teach the R language and practical experience.

Activity: Lecture

1 Course Unit

STAT 851 Fundamentals of Actuarial Science I

This course is the usual entry point into the actuarial science program. It is required for students who plan to concentrate or minor in actuarial science. It can also be taken by others interested in the mathematics of personal finance and the use of mortality tables. For future actuaries, it provides the necessary knowledge of compound interest and its applications, and the basic life contingencies definitions to be used throughout their studies. Non-actuaries will be introduced to practical applications of finance mathematics, such as loan amortization and bond pricing, and premium calculation of typical life insurance contracts. Main topics include annuities, loans and bonds; basic principles of life contingencies and determination of annuity and insurance benefits and premiums.

Taught by: Lemaire

Course usually offered in fall term

Prerequisite: One semester of calculus

Activity: Lecture

1 Course Unit

STAT 852 Fundamentals of Actuarial Science II

This specialized course is usually taken only by Wharton students who plan to concentrate in actuarial science and Penn students who plan to minor in actuarial mathematics. It provides a comprehensive analysis of advanced life contingencies problems such as reserving, multiple life functions, and multiple decrement theory, with application to the valuation of pension plans.

Taught by: Lemaire

Course usually offered in spring term

Prerequisite: STAT 851 or BEPP 851

Activity: Lecture

1 Course Unit

STAT 853 Actuarial Statistics

This course covers models for insurers' losses and applications of Markov chains. Poisson processes, including extensions such as non-homogeneous, compound, and mixed Poisson processes, are studied in detail. The compound model is then used to establish the distribution of losses. An extensive section on Markov chains provides the theory to forecast future states of the process, as well as numerous applications of Markov chains to insurance, finance, and genetics. The course is abundantly illustrated by examples from the insurance and finance literature. While most of the students taking the course are future actuaries, other students interested in applications of statistics may discover in class many fascinating applications of stochastic processes and Markov chains.

Taught by: Lemaire

Course usually offered in fall term

Prerequisite: Two semesters of Statistics

Activity: Lecture

1 Course Unit

STAT 854 Applied Statistical Methods for Actuaries

One half of the course is devoted to the study of time series, including ARIMA modeling and forecasting. The other half studies modifications in random variables due to deductibles, co-payments, policy limits, and elements of simulation. This course is a possible entry point into the actuarial science program. The Society of Actuaries has approved STAT 854 for VEE credit on the topic of time series.

Taught by: Lemaire

Course usually offered in spring term

Prerequisite: One semester of probability

Activity: Lecture

1 Course Unit

STAT 899 Independent Study

One-term course offered either term

Prerequisites: Written permission of instructor, the department MBA advisor and course coordinator.

Activity: Independent Study

1 Course Unit

STAT 910 Forecasting and Time Series Analysis

Fourier analysis of data, stationary time series, properties of autoregressive moving average models and estimation of their parameters, spectral analysis, forecasting. Discussion of applications to problems in economics, engineering, physical science, and life science.

Taught by: Stine

Course offered spring; odd-numbered years

Prerequisite: STAT 520 or 961 or equivalent

Activity: Lecture

1 Course Unit

STAT 915 Nonparametric Inference

Statistical inference when the functional form of the distribution is not specified. Nonparametric function estimation, density estimation, survival analysis, contingency tables, association, and efficiency.

Course not offered every year

Prerequisite: STAT 520 or equivalent

Activity: Lecture

1 Course Unit

STAT 920 Sample Survey Methods

This course will cover the design and analysis of sample surveys. Topics include simple random sampling, stratified sampling, cluster sampling, graphics, regression analysis using complex surveys and methods for handling nonresponse bias.

Taught by: Small

Course not offered every year

Prerequisites: STAT 520, 961 or 970 or permission of instructor

Activity: Lecture

1 Course Unit

STAT 921 Observational Studies

This course will cover statistical methods for the design and analysis of observational studies. Topics will include the potential outcomes framework for causal inference; randomized experiments; matching and propensity score methods for controlling confounding in observational studies; tests of hidden bias; sensitivity analysis; and instrumental variables.

Taught by: Small

Course usually offered in fall term

Prerequisites: STAT 520, 961 or 970 or permission of instructor

Activity: Lecture

1 Course Unit

STAT 925 Multivariate Analysis: Theory

This is a course that prepares PhD students in statistics for research in multivariate statistics and high dimensional statistical inference. Topics from classical multivariate statistics include the multivariate normal distribution and the Wishart distribution; estimation and hypothesis testing of mean vectors and covariance matrices; principal component analysis, canonical correlation analysis and discriminant analysis; etc. Topics from modern multivariate statistics include the Marcenko-Pastur law, the Tracy-Widom law, nonparametric estimation and hypothesis testing of high-dimensional covariance matrices, high-dimensional principal component analysis, etc.

Taught by: Ma

Course not offered every year

Prerequisites: STAT 930, 970 and 972 or permission of instructor

Activity: Lecture

1 Course Unit

STAT 926 Multivariate Analysis: Methodology

This is a course that prepares PhD students in statistics for research in multivariate statistics and data visualization. The emphasis will be on a deep conceptual understanding of multivariate methods to the point where students will propose variations and extensions to existing methods or whole new approaches to problems previously solved by classical methods. Topics include: principal component analysis, canonical correlation analysis, generalized canonical analysis; nonlinear extensions of multivariate methods based on optimal transformations of quantitative variables and optimal scaling of categorical variables; shrinkage- and sparsity-based extensions to classical methods; clustering methods of the k-means and hierarchical varieties; multidimensional scaling, graph drawing, and manifold estimation.

Taught by: Buja

Course not offered every year

Prerequisite: STAT 961 or permission of instructor

Activity: Lecture

1 Course Unit

STAT 927 Bayesian Statistical Theory and Methods

This graduate course will cover the modeling and computation required to perform advanced data analysis from the Bayesian perspective. We will cover fundamental topics in Bayesian probability modeling and implementation, including recent advances in both optimization and simulation-based estimation strategies. Key topics covered in the course include hierarchical and mixture models, Markov Chain Monte Carlo, hidden Markov and dynamic linear models, tree models, Gaussian processes and nonparametric Bayesian strategies.

Taught by: Jensen

Course not offered every year

Prerequisite: STAT 430 or STAT 510

Activity: Lecture

1 Course Unit

STAT 928 Statistical Learning Theory

Statistical learning theory studies the statistical aspects of machine learning and automated reasoning, through the use of (sampled) data. In particular, the focus is on characterizing the generalization ability of learning algorithms in terms of how well they perform on "new" data when trained on some given data set. The focus of the course is on: providing the fundamental tools used in this analysis; understanding the performance of widely used learning algorithms; understanding the "art" of designing good algorithms, both in terms of statistical and computational properties. Potential topics include: empirical process theory; online learning; stochastic optimization; margin based algorithms; feature selection; concentration of measure.

Course usually offered in spring term

Prerequisites: Probability and linear algebra.

Activity: Lecture

1 Course Unit

STAT 930 Probability

Measure theory and foundations of Probability theory. Zero-one Laws. Probability inequalities. Weak and strong laws of large numbers. Central limit theorems and the use of characteristic functions. Rates of convergence. Introduction to Martingales and random walk.

Taught by: Pemantle

Course usually offered in fall term

Prerequisite: STAT 430 or 510 or equivalent

Activity: Lecture

1 Course Unit

STAT 931 Stochastic Processes

Markov chains, Markov processes, and their limit theory. Renewal theory. Martingales and optimal stopping. Stable laws and processes with independent increments. Brownian motion and the theory of weak convergence. Point processes.

Taught by: Pemantle

Course usually offered in spring term

Prerequisite: STAT 930

Activity: Lecture

1 Course Unit

STAT 955 Stochastic Calculus and Financial Applications

Selected topics in the theory of probability and stochastic processes.

Course usually offered in fall term

Prerequisite: STAT 930 or equivalent

Activity: Lecture

1 Course Unit

STAT 957 Seminar in Data Analysis

Survey of methods for the analysis of large unstructured data sets: detection of outliers, Winsorizing, graphical techniques, robust estimators, multivariate problems.

Course not offered every year

Prerequisites: STAT 961, 971, 972, 925, or equivalents; permission of instructor

Activity: Seminar

1 Course Unit

STAT 961 Statistical Methodology

This is a course that prepares first-year PhD students in statistics for a research career. This is not an applied statistics course. Topics covered include: linear models and their high-dimensional geometry, statistical inference illustrated with linear models, diagnostics for linear models, bootstrap and permutation inference, principal component analysis, smoothing and cross-validation.

Taught by: Buja

Course usually offered in fall term

Prerequisites: STAT 431 or 520 or equivalent; a solid course in linear algebra and a programming language

Activity: Lecture

1 Course Unit

STAT 962 Advanced Methods for Applied Statistics

This course is designed for Ph.D. students in statistics and will cover various advanced methods and models that are useful in applied statistics. Topics for the course will include missing data, measurement error, nonlinear and generalized linear regression models, survival analysis, experimental design, longitudinal studies, building R packages and reproducible research.

Taught by: Small

Course usually offered in spring term

Prerequisite: STAT 961

Activity: Lecture

1 Course Unit

STAT 970 Mathematical Statistics

Decision theory and statistical optimality criteria, sufficiency, point estimation and hypothesis testing methods and theory.

Taught by: Small

Course usually offered in fall term

Prerequisites: STAT 431 or 520 or equivalent; comfort with mathematical proofs (e.g., MATH 360)

Activity: Lecture

1 Course Unit

STAT 971 Introduction to Linear Statistical Models

Theory of the Gaussian Linear Model, with applications to illustrate and complement the theory. Distribution theory of standard tests and estimates in multiple regression and ANOVA models. Model selection and its consequences. Random effects, Bayes, empirical Bayes and minimax estimation for such models. Generalized (Log-linear) models for specific non-Gaussian settings.

Taught by: Ma

Course usually offered in spring term

Prerequisite: STAT 970

Activity: Lecture

1 Course Unit

STAT 972 Advanced Topics in Mathematical Statistics

A continuation of STAT 970.

Taught by: Cai

One-term course offered either term

Prerequisites: STAT 970 and 971

Activity: Lecture

1 Course Unit

STAT 974 Modern Regression for the Social, Behavioral and Biological Sciences

Function estimation and data exploration using extensions of regression analysis: smoothers, semiparametric and nonparametric regression, and supervised machine learning. Conceptual foundations are addressed as well as hands-on use for data analysis.

Taught by: Berk

Course usually offered in spring term

Prerequisites: Two statistics courses at the graduate school level including a solid foundation in the generalized linear model.

Activity: Lecture

1 Course Unit

STAT 991 Seminar in Advanced Application of Statistics

This seminar will be taken by doctoral candidates after the completion of most of their coursework. Topics vary from year to year and are chosen from advanced probability, statistical inference, robust methods, and decision theory with principal emphasis on applications.

One-term course offered either term

Activity: Seminar

1 Course Unit

STAT 995 Dissertation

One-term course offered either term

Activity: Dissertation

1 Course Unit

STAT 999 Independent Study

One-term course offered either term

Prerequisites: Written permission of instructor and the department course coordinator.

Activity: Independent Study

1 Course Unit