Statistics

PhD

In New Haven (USA)

Price on request

Description

  • Type

    PhD

  • Location

    New Haven (USA)

Professors Donald Andrews (Economics), Andrew Barron, Joseph Chang, Katarzyna Chawarska (Child Study Center), Xiaohong Chen (Economics), Nicholas Christakis (Sociology), Ronald Coifman (Mathematics), James Duncan (Radiology & Biomedical Imaging), John Emerson (Adjunct), Debra Fischer (Astronomy), Alan Gerber (Political Science), Mark Gerstein (Molecular Biophysics & Biochemistry), John Hartigan (Emeritus), Theodore Holford (Public Health/Biostatistics), Edward Kaplan (School of Management/Operations Research), Harlan Krumholz (Internal Medicine), John Lafferty, Peter Phillips (Economics), David Pollard, Daniel Spielman, Hemant Tagare (Radiology & Biomedical Engineering), Van Vu (Mathematics), Heping Zhang (Public Health/Biostatistics), Hongyu Zhao (Public Health/Biostatistics), Harrison Zhou, Steven Zucker (Computer Science)

Facilities

  • Location

    New Haven (USA), 06520

  • Start date

    On request

About this course

Fields of study include the main areas of statistical theory (with emphasis on foundations, Bayes theory, decision theory, nonparametric statistics), probability theory (stochastic processes, asymptotics, weak convergence), information theory, bioinformatics and genetics, classification, data mining and machine learning, neural nets, network science, optimization, statistical computing, and graphical models and methods.

GRE scores for the General Test are required. A GRE Subject Test in the area closest to the undergraduate major is recommended for the Ph.D. program and encouraged for the M.A. program. All applicants should have a strong mathematical background, including advanced calculus, linear algebra, elementary probability theory, and at least one course providing an introduction to mathematical statistics. An undergraduate major may be in statistics, mathematics, computer science, or in a subject in which significant statistical problems may arise. For those whose native language is not English,...

Subjects

  • Probability
  • GCSE Mathematics
  • Computational
  • Programming
  • Confidence Training
  • Medical training
  • Medical
  • Public
  • Algebra
  • Genetics
  • Economics
  • Mathematics
  • Biology
  • Statistics
  • Algorithms
  • Data analysis
  • Testing
  • Computing
  • Credit
  • Public Health

Course programme

Courses

S&DS 500b, Introductory Statistics
William Brinda

An introduction to statistical reasoning. Topics include numerical and graphical summaries of data, data acquisition and experimental design, probability, hypothesis testing, confidence intervals, correlation and regression. Application of statistical concepts to data; analysis of real-world problems.
MWF 10:30am-11:20am

S&DS 501a, Introduction to Statistics: Life Sciences
Walter Jetz and Jonathan Reuning-Scherer

Statistical and probabilistic analysis of biological problems, presented with a unified foundation in basic statistical theory. Problems are drawn from genetics, ecology, epidemiology, and bioinformatics.
TTh 1pm-2:15pm

S&DS 502a, Introduction to Statistics: Political Science
Jonathan Reuning-Scherer

Statistical analysis of politics, elections, and political psychology. Problems presented with reference to a wide array of examples: public opinion, campaign finance, racially motivated crime, and public policy. Note: S&DS 501–506 offer a basic introduction to statistics, including numerical and graphical summaries of data, probability, hypothesis testing, confidence intervals, and regression. Each course focuses on applications to a particular field of study and is taught jointly by two instructors, one specializing in statistics and the other in the relevant area of application. The first seven weeks are attended by all students in S&DS 501–506 together as general concepts and methods of statistics are developed. The course separates for the last six and a half weeks, which develop the concepts with examples and applications. Computers are used for data analysis. These courses are alternatives; they do not form a sequence, and only one may be taken for credit.
TTh 1pm-2:15pm

S&DS 503a, Introduction to Statistics: Social Sciences
Jonathan Reuning-Scherer

Descriptive and inferential statistics applied to analysis of data from the social sciences. Introduction of concepts and skills for understanding and conducting quantitative research. Note: S&DS 501–506 offer a basic introduction to statistics, including numerical and graphical summaries of data, probability, hypothesis testing, confidence intervals, and regression. Each course focuses on applications to a particular field of study and is taught jointly by two instructors, one specializing in statistics and the other in the relevant area of application. The first seven weeks are attended by all students in S&DS 501–506 together as general concepts and methods of statistics are developed. The course separates for the last six and a half weeks, which develop the concepts with examples and applications. Computers are used for data analysis. These courses are alternatives; they do not form a sequence, and only one may be taken for credit.
TTh 1pm-2:15pm

S&DS 505a, Introduction to Statistics: Medicine
Russell Barbour and Jonathan Reuning-Scherer

Statistical methods relied upon in medicine and medical research. Practice in reading medical literature competently and critically, as well as practical experience performing statistical analysis of medical data. Note: S&DS 501–506 offer a basic introduction to statistics, including numerical and graphical summaries of data, probability, hypothesis testing, confidence intervals, and regression. Each course focuses on applications to a particular field of study and is taught jointly by two instructors, one specializing in statistics and the other in the relevant area of application. The first seven weeks are attended by all students in S&DS 501–506 together as general concepts and methods of statistics are developed. The course separates for the last six and a half weeks, which develop the concepts with examples and applications. Computers are used for data analysis. These courses are alternatives; they do not form a sequence, and only one may be taken for credit.
TTh 1pm-2:15pm

S&DS 506a, Introduction to Statistics: Data Analysis
William Brinda and Jonathan Reuning-Scherer

An introduction to probability and statistics with emphasis on data analysis. Note: S&DS 501–506 offer a basic introduction to statistics, including numerical and graphical summaries of data, probability, hypothesis testing, confidence intervals, and regression. Each course focuses on applications to a particular field of study and is taught jointly by two instructors, one specializing in statistics and the other in the relevant area of application. The first seven weeks are attended by all students in S&DS 501–506 together as general concepts and methods of statistics are developed. The course separates for the last six and a half weeks, which develop the concepts with examples and applications. Computers are used for data analysis. These courses are alternatives; they do not form a sequence, and only one may be taken for credit.
TTh 1pm-2:15pm

S&DS 520b, Intensive Introductory Statistics
Xiaofei Wang

An introduction to statistical reasoning designed for students with particular interest in data science and computing. Using the R language, topics include exploratory data analysis, probability, hypothesis testing, confidence intervals, regression, statistical modeling, and simulation. Computing is taught and used extensively throughout the course. Application of statistical concepts to the analysis of real-world data science problems.
TTh 9am-10:15am

S&DS 523b, YData: An Introduction to Data Science
Jessica Cisewski and John Lafferty

Computational, programming, and statistical skills are no longer optional in our increasingly data-driven world; they are essential for opening doors to manifold research and career opportunities. This course aims to dramatically enhance students’ knowledge and capabilities in fundamental ideas and skills in data science, especially computational and programming skills and inferential thinking. It emphasizes the development of these skills while providing opportunities for hands-on experience and practice. The course is designed to be accessible to students with little or no background in computing, programming, or statistics, but also engaging for more technically oriented students through extensive use of examples and hands-on data analysis. Python 3 is the computing language used. Enrollment is limited.
MWF 10:30am-11:20am

S&DS 530a or b, Data Exploration and Analysis
Staff

Survey of statistical methods: plots, transformations, regression, analysis of variance, clustering, principal components, contingency tables, and time series analysis. The R computing language and Web data sources are used.
HTBA

S&DS 538a, Probability and Statistics
Joseph Chang

Fundamental principles and techniques of probabilistic thinking, statistical modeling, and data analysis. Essentials of probability: conditional probability, random variables, distributions, law of large numbers, central limit theorem, Markov chains. Statistical inference with emphasis on the Bayesian approach: parameter estimation, likelihood, prior and posterior distributions, Bayesian inference using Markov chain Monte Carlo. Introduction to regression and linear models. Computers are used throughout for calculations, simulations, and analysis of data. Prerequisite: differential calculus of several variables; some acquaintance with matrix algebra and computing is assumed.
TTh 1pm-2:15pm

S&DS 541a, Probability Theory
Yihong Wu

A first course in probability theory: probability spaces, random variables, expectations and probabilities, conditional probability, independence, some discrete and continuous distributions, central limit theorem, Markov chains, probabilistic modeling. Prerequisite: calculus of functions of several variables.
MW 9am-10:15am

S&DS 542b, Theory of Statistics
Andrew Barron

Principles of statistical analysis: maximum likelihood, sampling distributions, estimation, confidence intervals, tests of significance, regression, analysis of variance, and the method of least squares. Prerequisite: S&DS 541.
MWF 9:25am-10:15am

S&DS 551b, Stochastic Processes
Yihong Wu and Sahand Negahban

Introduction to the study of random processes, including Markov chains, Markov random fields, martingales, random walks, Brownian motion, and diffusions. Techniques in probability such as coupling and large deviations. Applications chosen from image reconstruction, Bayesian statistics, finance, probabilistic analysis of algorithms, genetics, and evolution.
MW 1pm-2:15pm

S&DS 563b, Multivariate Statistical Methods for the Social Sciences
Jonathan Reuning-Scherer

An introduction to the analysis of multivariate data. Topics include principal components analysis, factor analysis, cluster analysis (hierarchical clustering, k-means), discriminant analysis, multidimensional scaling, and structural equations modeling. Emphasis on practical application of multivariate techniques to a variety of examples in the social sciences. Students complete extensive computer work using either SAS or SPSS. Prerequisites: knowledge of basic inferential procedures, experience with linear models (regression and ANOVA). Experience with some statistical package and/or familiarity with matrix notation is helpful but not required.
TTh 1pm-2:20pm

S&DS 565a or b, Applied Data Mining and Machine Learning
Staff

Techniques for data mining and machine learning are covered from both a statistical and a computational perspective, including support vector machines, bagging, boosting, neural networks, and other nonlinear and nonparametric regression methods. The course gives the basic ideas and intuition behind these methods, a more formal understanding of how and why they work, and opportunities to experiment with machine-learning algorithms and apply them to data. Prerequisite: after or concurrent with S&DS 542.
HTBA

S&DS 570b / ASTR 545b, YData: ExoStatistics: Exploring Extrasolar Planets with Data Science
Jessica Cisewski

Extrasolar planets, or exoplanets, are planets orbiting stars outside our solar system. The past decade has led to a proliferation of exoplanet discoveries using various detection methods. Through the lens of data science, we investigate exoplanet datasets to learn how to find exoplanets, examine the population properties of observed exoplanets, estimate probabilities of another Earth-like exoplanet in our universe, and probe other questions about exoplanets. This course provides an introduction to exoplanet astronomy, an introduction to data science tools necessary for studying exoplanets, and opportunities to practice the data science skills presented in S&DS 523. This course can be taken concurrently with, or after successful completion of, S&DS 523.  ½ Course cr
T 3:30pm-5:20pm

S&DS 571b, YData: Text Data Science: An Introduction
John Lafferty

Written language is the primary means by which humans document their observations of the world, including scientific discoveries, interpretations of history and art, health diagnoses, analyses of political events and economic trends, social interactions, and many others. Increasingly, this rapidly growing transcript is readily available in electronic form and is being used in commercial applications and to advance scientific knowledge. This course is an introduction to computational and inferential methods that use text. The focus is on simple but often powerful text-processing techniques that do not require linguistic analyses, to gain familiarity with working with text data. Sources used in the seminar include political speeches, Twitter feeds, scientific journals, online FAQ and discussion boards, Wikipedia, news articles, and consumer product reviews. Methodologies include scraping, wrangling, hashing, sorting, regressing, embedding, and probabilistic modeling. The course is based on the Python programming language within a cloud computing platform and is paced to be accessible to students who have previously taken or are currently enrolled in S&DS 523. Prerequisite: S&DS 523; may be taken concurrently.  ½ Course cr
Th 9:25am-11:15am

S&DS 572b / PLSC 524b, YData: Data Science for Political Campaigns
Joshua Kalla

Political campaigns have become increasingly data driven. Data science is used to inform where campaigns compete, which messages they use, how they deliver them, and among which voters. In this course, we explore how data science is being used to design winning campaigns. Students gain an understanding of what data is available to campaigns, how campaigns use this data to identify supporters, and the use of experiments in campaigns. The course provides students with an introduction to political campaigns, an introduction to data science tools necessary for studying politics, and opportunities to practice the data science skills presented in S&DS 523. Can be taken concurrently with, or after successful completion of, S&DS 523.  ½ Course cr
T 9:25am-11:15am

S&DS 600b, Advanced Probability
Sekhar Tatikonda

Measure theoretic probability, conditioning, laws of large numbers, convergence in distribution, characteristic functions, central limit theorems, martingales. Some knowledge of real analysis is assumed.
TTh 2:30pm-3:45pm

S&DS 610a, Statistical Inference
Zhou Fan

A systematic development of the mathematical theory of statistical inference covering methods of estimation, hypothesis testing, and confidence intervals. An introduction to statistical decision theory. Knowledge of probability theory at the level of S&DS 541 is assumed.
TTh 11:35am-12:50pm

S&DS 612a, Linear Models
William Brinda

The geometry of least squares; distribution theory for normal errors; regression, analysis of variance, and designed experiments; numerical algorithms (with particular reference to the R statistical language); alternatives to least squares. Prerequisites: linear algebra and some acquaintance with statistics.
MW 11:35am-12:50pm

S&DS 615b, Introduction to Random Matrix Theory and Applications
Zhou Fan

A graduate-level introduction to random matrix theory. Wigner matrices, sample covariance matrices, spiked models. Applications to statistical principal component analysis, random graphs and networks, and landscape analysis of nonconvex statistical optimization problems. Methods applicable to non-invariant models that commonly arise in statistical applications: moment method, resolvents and Stieltjes transforms, free probability, concentration of measure, Lindeberg exchange. Prerequisite: real analysis and measure-theoretic probability.
W 2:30pm-5pm

S&DS 625a, Statistical Case Studies
Xiaofei Wang

Statistical analysis of a variety of statistical problems using real data. Emphasis on methods of choosing data, acquiring data, assessing data quality, and the issues posed by extremely large data sets. Extensive computations using R.
MW 1pm-2:15pm

S&DS 626a or b, Practical Work
Staff

Individual one-term projects, with students working on studies outside the department, under the guidance of a statistician.
HTBA

S&DS 627a and S&DS 628b, Statistical Consulting
Derek Feng

Statistical consulting and collaborative research projects often require statisticians to explore new topics outside their area of expertise. This course exposes students to real problems, requiring them to draw on their expertise in probability, statistics, and data analysis. Students complete the course with individual projects supervised jointly by faculty outside the department and by one of the instructors. Students enroll for both terms (S&DS 627 and 628) and receive one credit at the end of the year.  ½ Course cr per term
F 2:30pm-4:20pm

S&DS 630a, Optimization Techniques
Sekhar Tatikonda

Fundamental theory and algorithms of optimization, emphasizing convex optimization. The geometry of convex sets, basic convex analysis, the principle of optimality, duality. Numerical algorithms: steepest descent, Newton’s method, interior point methods, dynamic programming, unimodal search. Applications from engineering and the sciences.
TTh 1pm-2:15pm

S&DS 645b / CB&B 645b, Statistical Methods in Computational Biology
Hongyu Zhao

Introduction to problems, algorithms, and data analysis approaches in computational biology and bioinformatics. We discuss statistical issues arising in analyzing population genetics data, gene expression microarray data, next-generation sequencing data, microbiome data, and network data. Statistical methods include maximum likelihood, EM, Bayesian inference, Markov chain Monte Carlo, and methods of classification and clustering; models include hidden Markov models, Bayesian networks, and graphical models. Prerequisite: S&DS 538, S&DS 542, or S&DS 661. Prior knowledge of biology is not required, but some interest in the subject and a willingness to carry out calculations using R are assumed.
Th 1pm-2:50pm

S&DS 661b, Data Analysis
William Brinda

Through analysis of data sets using the R statistical computing language, a selection of statistical topics is studied: linear and nonlinear models, maximum likelihood, resampling methods, curve estimation, model selection, classification, and clustering. Prerequisite: after or concurrent with S&DS 542.
MW 2:30pm-3:45pm

S&DS 663a, Computational Mathematics for Data Science
Roy Lederman

The course explores the mechanics of the interface between mathematics, computation, and statistics in data analysis. We discuss topics in numerical computation, complexity, programming, and prototyping. Assignments include theory, programming, data analysis, individual work, collaborative work, and making mistakes. Prerequisites: linear algebra and some experience with programming (any language).
MW 9am-10:15am
