Description

Type

Master

Location

Maynard (USA)

Start date

Different dates available

6.345 introduces students to the rapidly developing field of automatic speech recognition. Its content is divided into three parts. Part I deals with background material in the acoustic theory of speech production, acoustic-phonetics, and signal representation. Part II describes algorithmic aspects of speech recognition systems including pattern classification, search algorithms, stochastic modelling, and language modelling techniques. Part III compares and contrasts the various approaches to speech recognition, and describes advanced techniques used for acoustic-phonetic modelling, robust speech recognition, speaker adaptation, processing paralinguistic information, speech understanding, and multimodal processing.

Facilities

Maynard (USA)

02139

Start date

Different dates available

Enrolment now open

Subjects

Production
Systems
Materials
Phonetics
Algorithms

Course programme

Lectures: 2 sessions / week, 1.5 hours / session

There will be two 90-minute lectures per week. To help cover a large quantity of material, copies of the lecture viewgraphs will be handed out. There will be no final exam for the course. Instead, there will be two in-class quizzes, each counting approximately 15% towards the final grade.

There will be weekly assignments consisting of both problems and mandatory laboratory work, so that students gain hands-on experience with the material covered. Linux workstations will be made available for laboratory work, and time on these machines can be reserved through a sign-up mechanism on the 6.345 website. Assignments must be turned in by the due date. Solutions will be provided along with the graded assignments. Each of the nine assignments will count approximately 5% towards the final grade.

During the last quarter of the course, assignments will end, and students will work on a term project that will count approximately 25% towards the final grade. Projects will be chosen in consultation with staff members, and typically involve creating and evaluating a speech recognizer along a dimension of interest to the student. Toolkits of key recognizer components will be provided, so only minimal programming skills are necessary.
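As a rough check on the grading scheme described above, the stated weights can be tallied to confirm that they add up to the full grade. The following is only a minimal sketch: the function names are hypothetical, and the way individual scores are averaged is an assumption, since the course description gives approximate weights only.

```python
# Grading weights as stated in the course description (all approximate).
QUIZ_COUNT, QUIZ_WEIGHT = 2, 15              # two in-class quizzes, ~15% each
ASSIGNMENT_COUNT, ASSIGNMENT_WEIGHT = 9, 5   # nine assignments, ~5% each
PROJECT_WEIGHT = 25                          # term project, ~25%


def component_totals():
    """Return the percentage of the final grade carried by each component."""
    return {
        "quizzes": QUIZ_COUNT * QUIZ_WEIGHT,
        "assignments": ASSIGNMENT_COUNT * ASSIGNMENT_WEIGHT,
        "project": PROJECT_WEIGHT,
    }


def final_grade(quiz_scores, assignment_scores, project_score):
    """Hypothetical helper: combine scores (each 0-100) using the weights above.

    Assumption: every quiz and every assignment contributes its own fixed
    weight; the course description does not spell out the exact formula.
    """
    if len(quiz_scores) != QUIZ_COUNT or len(assignment_scores) != ASSIGNMENT_COUNT:
        raise ValueError("expected 2 quiz scores and 9 assignment scores")
    weighted = (
        sum(quiz_scores) * QUIZ_WEIGHT
        + sum(assignment_scores) * ASSIGNMENT_WEIGHT
        + project_score * PROJECT_WEIGHT
    )
    return weighted / 100


if __name__ == "__main__":
    totals = component_totals()
    # Quizzes 30 + assignments 45 + project 25 = 100
    print(totals, sum(totals.values()))
```

Under these assumptions, the assignments carry the largest share of the grade (about 45%), followed by the quizzes (about 30%) and the term project (about 25%).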

A detailed outline of the class lectures and assignments is also available.

Lecturer: Jim Glass

Huang, Acero, and Hon. Spoken Language Processing. Upper Saddle River, NJ: Prentice-Hall, 2001. ISBN: 0130226165.

Jelinek. Statistical Methods for Speech Recognition. Cambridge, MA: MIT Press, 1998. ISBN: 0262100665.

Rabiner and Juang. Fundamentals of Speech Recognition. Upper Saddle River, NJ: Prentice-Hall, 1993. ISBN: 0130151572.

Duda, Hart, and Stork. Pattern Classification. New York, NY: Wiley & Sons, 2000. ISBN: 0471056693.

Stevens. Acoustic Phonetics. Cambridge, MA: MIT Press, 1998. ISBN: 0262692503.
