MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
Professor Strang describes the four topics of the course: Linear Algebra, Deep Learning, Optimization, and Statistics. He provides examples of how Linear algebra concepts are key for understanding and creating machine learning algorithms, especially as applied to deep learning and neural networks. The course parallels the content of his book Linear Algebra and Learning from Data published in 2019.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang, Sarah Hansen
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63uMA4q8GaU6Eg5nzeOc8tx
In this video, Professor Gilbert Strang shares how he teaches his new course on matrix methods using a project-based approach.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
In this first lecture, Professor Strang introduces the linear algebra principles critical for understanding the content of the course. In particular, matrix-vector multiplication Ax and the column space of a matrix and the rank.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
Lecture 1: The Column Space of A Contains All Vectors Ax
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
Multiplying and factoring matrices are the topics of this lecture. Professor Strang reviews multiplying columns by rows: AB = sum of rank one matrices. He also introduces the five most important factorizations.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
This lecture focuses on orthogonal matrices and subspaces. Professor Strang reviews the four fundamental subspaces: column space C(A), row space C (A^T), nullspace N(A), left nullspace N (A^T).
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
Professor Strang begins this lecture talking about eigenvectors and eigenvalues and why they are useful. Then he moves to a discussion of symmetric matrices, in particular, positive definite matrices.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
In this lecture, Professor Strang continues reviewing key matrices, such as positive definite and semidefinite matrices. This lecture concludes his review of the highlights of linear algebra.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
Singular Value Decomposition (SVD) is the primary topic of this lecture. Professor Strang explains and illustrates how the SVD separates a matrix into rank one pieces, and that those pieces come in order of importance.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
In this lecture, Professor Strang reviews Principal Component Analysis (PCA), which is a major tool in understanding a matrix of data. In particular, he focuses on the Eckart-Young low rank approximation theorem.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
A norm is a way to measure the size of a vector, a matrix, a tensor, or a function. Professor Strang reviews a variety of norms that are important to understand including S-norms, the nuclear norm, and the Frobenius norm.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
In this lecture, Professor Strang details the four ways to solve least-squares problems. Solving least-squares problems comes in to play in the many applications that rely on data fitting.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
The subject of this lecture is the matrix equation Ax=b. Solving for x presents a number of challenges that must be addressed when doing computations with large matrices.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
In this lecture, Professor Strang revisits the ways to solve least squares problems. In particular, he focuses on the Gram-Schmidt process that finds orthogonal vectors.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
Numerical linear algebra is the subject of this lecture and, in particular, how to compute eigenvalues and singular values. This includes discussion of the Hessenberg matrix, a square matrix that is almost (except for one extra diagonal) triangular.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
This lecture focuses on randomized linear algebra, specifically on randomized matrix multiplication. This process is useful when working with very large matrices. Professor Strang introduces and describes the basic steps of randomized computations.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
In this lecture, Professor Strang introduces the concept of low rank matrices. He demonstrates how using the Sherman-Morrison-Woodbury formula is useful to efficiently compute how small changes in a matrix affect its inverse.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
This lecture is about changes in eigenvalues and changes in singular values. When matrices move, their inverses, their eigenvalues, and their singular values change. Professor Strang explores the resulting formulas.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
In this lecture, Professor Strang reviews how to find the derivatives of inverse and singular values. Later in the lecture, he discusses LASSO optimization, the nuclear norm, matrix completion, and compressed sensing.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Alex Townsend
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
Professor Alex Townsend gives this guest lecture answering the question ‘Why are there so many low rank matrices that appear in computational math?’ Working effectively with low rank matrices is critical in image compression applications.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
In this lecture, Professor Strang reviews counting the free parameters in a variety of key matrices. He then moves on to finding saddle points from constraints and Lagrange multipliers.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
Professor Strang continues his discussion of saddle points, which are critical for deep learning applications. Later in the lecture, he reviews the Maxmin Principle, a decision rule used in probability and statistics to optimize outcomes.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
This lecture continues the focus on probability, which is critical for working with large sets of data. Topics include sample mean, expected mean, sample variance, covariance matrices, Chebyshev’s inequality, and Markov’s inequality.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
In this lecture, Professor Strang discusses optimization, the fundamental algorithm that goes into deep learning. Later in the lecture he reviews the structure of convolutional neural networks (CNN) used in analyzing visual imagery.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
Gradient descent is the most common optimization algorithm in deep learning and machine learning. It only takes into account the first derivative when performing updates on parameters – the stepwise process that moves downhill to reach a local minimum.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
In this lecture, Professor Strang explains both momentum-based gradient descent and Nesterov’s accelerated gradient descent.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
This lecture focuses on several topics that are specific parts of optimization. These include linear programming (LP), the max-flow min-cut theorem, two-person zero-sum games, and duality.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Suvrit Sra
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
Professor Suvrit Sra gives this guest lecture on stochastic gradient descent (SGD), which randomly selects a minibatch of data at each step. The SGD is still the primary method for training large-scale machine learning systems.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
This lecture is about the central structure of deep neural networks, which are a major force in machine learning. The aim is to find the function that’s constructed to learn the training data and then apply it to the test data.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
In this lecture, Professor Strang presents Professor Sra’s theorem which proves the convergence of stochastic gradient descent (SGD). He then reviews backpropagation, a method to compute derivatives quickly, using the chain rule.
Note: Videos of Lectures 28 and 29 are not available because those were in-class lab sessions that were not recorded.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
Professor Strang starts this lecture asking the question ‘Which matrices can be completed to have a rank of 1?’ He then provides several examples. In the second part, he introduces convolution and cyclic convolution.
Note: Videos of Lectures 28 and 29 are not available because those were in-class lab sessions that were not recorded.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
This lecture continues with constant-diagonal circulant matrices. Each lower diagonal continues on an upper diagonal to produce n equal entries. The eigenvectors are always the columns of the Fourier matrix and computing is fast.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
Professor Strang begins the lecture talking about ImageNet, a large visual database used in visual object recognition software research. ImageNet is an example of a convolutional neural network (CNN). The rest of the lecture focuses on convolution.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
This lecture focuses on the construction of the learning function F, which is optimized by stochastic gradient descent and applied to the training data to minimize the loss. Professor Strang also begins his review of distance matrices.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
This lecture continues the review of distance matrices. Professor Strang then introduces the Procrustes problem, which looks for the orthogonal matrix that swings one set of vectors as nearly as possible onto a second set.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
The topic of this lecture is clustering for graphs, meaning finding sets of ‘related’ vertices in graphs. The challenge is finding good algorithms to optimize cluster quality. Professor Strang reviews some possibilities.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Alan Edelman, Gilbert Strang
View the complete course: https://ocw.mit.edu/18-065S18
YouTube Playlist: https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
Professor Alan Edelman gives this guest lecture on the Julia Language, which was designed for high-performance computing. He provides an overview of how Julia can be used in machine learning and deep learning applications.
License: Creative Commons BY-NC-SA
More information at https://ocw.mit.edu/terms
More courses at https://ocw.mit.edu
Lecture 6: Vision 2 Instructor: John Gabrieli View the complete course: http://ocw.mit.edu/9-00SCS11 License: Creative Commons BY-NC-SA
More information at http://ocw.mit.edu/terms
More courses at http://ocw.mit.edu