# Matrix multiplication as composition | Chapter 4, Essence of linear algebra

https://www.youtube.com/watch?v=XkY2DOUCWMU
Translation: zh-CN

[00:11] Hey everyone.
  大家好。

[00:13] Where we last left off, I showed what linear transformations look like and how to represent them using matrices.
  在我们上次结束的地方，我展示了线性变换的样子以及如何使用矩阵表示它们。

[00:18] This is worth a quick recap because it's just really important.
  这值得快速回顾一下，因为它确实非常重要。

[00:19] But of course, if this feels like more than just a recap, go back and watch the full video.
  但当然，如果这感觉不仅仅是回顾，请回去观看完整视频。

[00:25] Technically speaking, linear transformations are functions with vectors as inputs and vectors as outputs.
  严格来说，线性变换是输入为向量、输出为向量的函数。

[00:30] But I showed last time how we can think about them visually as smooshing around space in such a way that grid lines stay parallel and evenly spaced and so that the origin remains fixed.
  但我上次展示了我们如何从视觉上将它们视为在空间中进行挤压，使得网格线保持平行且间距均匀，并且原点保持固定。

[00:41] The key takeaway was that a linear transformation is completely determined by where it takes the basis vectors of the space, which for two dimensions means I hat and J hat.
  关键要点是，线性变换完全由它将空间的基向量映射到何处决定，对于二维来说，这意味着 i 帽和 j 帽。

[00:52] This is because any other vector can be described as a linear combination of those basis vectors.
  这是因为任何其他向量都可以被描述为这些基向量的线性组合。

[00:56] A vector with coordinates (x, y) is x times I hat plus y times J hat.
  坐标为 (x, y) 的向量是 x 乘以 i 帽加上 y 乘以 j 帽。

[01:03] After going through the transformation, this property that grid lines remain parallel and evenly spaced has a wonderful consequence.
  经过变换后，网格线保持平行且间距相等的这个性质会产生一个奇妙的后果。

[01:11] The place where your vector lands will be X times the transformed version of I hat + Y times the transformed version of J hat.
  你的向量着陆点将是 I hat 变换版本的 X 倍加上 J hat 变换版本的 Y 倍。

[01:18] This means if you keep a record of the coordinates where I hat lands and the coordinates where J hat lands, you can compute that a vector which starts at XY must land on X times the new coordinates of I hat + Y times the new coordinates of J hat.
  这意味着如果你记录下 I hat 着陆点的坐标和 J hat 着陆点的坐标，你就可以计算出从 XY 开始的向量必须着陆在 I hat 新坐标的 X 倍加上 J hat 新坐标的 Y 倍上。

[01:33] The convention is to record the coordinates of where I hat and J hat land as the columns of a matrix, and to define this sum of those columns, scaled by X and Y, to be matrix-vector multiplication.
  惯例是将 I hat 和 J hat 着陆点的坐标记录为矩阵的列，并将这些列分别按 X 和 Y 缩放后的和定义为矩阵向量乘法。

[01:46] In this way, a matrix represents a specific linear transformation and multiplying a matrix by a vector is what it means computationally to apply that transformation to that vector.
  这样，矩阵就代表了一个特定的线性变换，而矩阵乘以向量在计算上就是将该变换应用于该向量的含义。
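
As a minimal sketch of the recap above, the snippet below stores a 2x2 matrix as its two columns (where I hat lands, then where J hat lands) and applies it to a vector by scaling and adding those columns. The helper name `mat_vec` and the sample landing spots are made up for illustration; they are not taken from the video.

```python
def mat_vec(columns, v):
    """Apply a 2x2 matrix, stored as (first column, second column), to v."""
    (a, c), (b, d) = columns  # I hat lands on (a, c), J hat lands on (b, d)
    x, y = v
    # x times the first column plus y times the second column
    return (x * a + y * b, x * c + y * d)

# Illustrative values only: I hat lands on (1, -2), J hat lands on (3, 0).
example = ((1, -2), (3, 0))
print(mat_vec(example, (-1, 2)))  # (5, 2)
```

Here (-1, 2) lands on -1 times (1, -2) plus 2 times (3, 0), which is (5, 2).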

[01:58] All right, recap over. On to the new stuff.
  好了，回顾完毕。开始新的内容。

[02:01] Oftentimes, you find yourself wanting to describe the effects of applying one transformation and then another.
  很多时候，你会发现自己想要描述先应用一个变换、再应用另一个变换的效果。

[02:07] For example, maybe you want to describe what happens when you first rotate the plane 90° counterclockwise, then apply a shear.
  例如，你可能想描述当你首先逆时针旋转平面90°，然后应用一个剪切时会发生什么。

[02:15] The overall effect here, from start to finish, is another linear transformation distinct from the rotation and the shear.
  这里的整体效果，从开始到结束，是另一个与旋转和剪切不同的线性变换。

[02:22] This new linear transformation is commonly called the composition of the two separate transformations we applied.
  这个新的线性变换通常被称为我们应用的两个独立变换的复合。

[02:28] And like any linear transformation, it can be described with a matrix all of its own by following I hat and J hat.
  并且像任何线性变换一样，它可以通过跟踪 I hat 和 J hat 的去向，用它自己的矩阵来描述。

[02:36] In this example, the ultimate landing spot for I hat after both transformations is (1, 1).
  在这个例子中，I hat 在两次变换后的最终落点是 (1, 1)。

[02:42] So let's make that the first column of a matrix.
  所以我们把它作为矩阵的第一列。

[02:44] Likewise, J hat ultimately ends up at the location (-1, 0).
  同样，J hat 最终落在 (-1, 0) 的位置。

[02:50] So we make that the second column of the matrix.
  所以我们把它作为矩阵的第二列。

[02:52] This new matrix captures the overall effect of applying a rotation then a shear, but as one single action rather than two successive ones.
  这个新矩阵捕捉了应用旋转然后剪切的整体效果，但将其视为一个单一动作，而不是两个连续动作。
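
The bookkeeping just described can be sketched in a few lines: follow each basis vector through the rotation, then through the shear, and record the two landing spots as columns. The shear is assumed to be the one that fixes I hat and sends J hat to (1, 1), which matches the landing spots quoted above. The helper `mat_vec` is a hypothetical name, not something from the video.

```python
def mat_vec(columns, v):
    """Apply a 2x2 matrix, stored as (first column, second column), to v."""
    (a, c), (b, d) = columns
    x, y = v
    return (x * a + y * b, x * c + y * d)

rotation = ((0, 1), (-1, 0))  # 90 degrees counterclockwise
shear = ((1, 0), (1, 1))      # assumed: fixes I hat, sends J hat to (1, 1)

# Follow each basis vector through the rotation first, then the shear.
i_final = mat_vec(shear, mat_vec(rotation, (1, 0)))
j_final = mat_vec(shear, mat_vec(rotation, (0, 1)))

composition = (i_final, j_final)  # landing spots become the columns
print(composition)  # ((1, 1), (-1, 0))
```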

[03:03] Here's one way to think about that new matrix.
  这里有一种理解这个新矩阵的方式。

[03:06] If you were to take some vector and pump it through the rotation then the shear, the long way to compute where it ends up is to first multiply it on the left by the rotation matrix.
  如果你取某个向量，让它先经过旋转再经过剪切，计算它最终位置的笨办法是先用旋转矩阵左乘它。

[03:15] Then take whatever you get and multiply that on the left by the shear matrix.
  然后取你得到的结果，再用剪切矩阵左乘它。

[03:20] This is, numerically speaking, what it means to apply a rotation then a shear to a given vector.
  从数值上看，这就是对给定向量应用旋转再剪切的意义。

[03:26] But whatever you get should be the same as just multiplying that same vector by the new composition matrix we just found, no matter what vector you chose, since this new matrix is supposed to capture the same overall effect as the rotation-then-shear action.
  但无论你得到什么，都应该与直接用我们刚刚找到的这个新的复合矩阵乘以同一个向量的结果相同，无论你选择哪个向量，因为这个新矩阵应该捕捉与先旋转再剪切相同的整体效果。
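
A quick numerical spot check of that claim, under the same assumptions as before (a 90° counterclockwise rotation, and a shear that sends J hat to (1, 1)): the long way and the single composition matrix should agree on every vector tried. The test vectors are arbitrary, and `mat_vec` is an illustrative helper.

```python
def mat_vec(columns, v):
    """Apply a 2x2 matrix, stored as (first column, second column), to v."""
    (a, c), (b, d) = columns
    x, y = v
    return (x * a + y * b, x * c + y * d)

rotation = ((0, 1), (-1, 0))     # 90 degrees counterclockwise
shear = ((1, 0), (1, 1))         # assumed shear: J hat goes to (1, 1)
composition = ((1, 1), (-1, 0))  # columns found by following I hat and J hat

for v in [(1, 0), (0, 1), (3, 2), (-4, 5)]:
    long_way = mat_vec(shear, mat_vec(rotation, v))  # rotate, then shear
    short_way = mat_vec(composition, v)              # one single action
    assert long_way == short_way
    print(v, "->", short_way)
```

Agreement on a handful of vectors is only a sanity check. The reason it holds for every vector is the linearity argument from the recap.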

[03:42] Based on how things are written down here, I think it's reasonable to call this new matrix the product of the original two matrices.
  根据这里写的东西，我认为将这个新矩阵称为原始两个矩阵的乘积是合理的。

[03:47] Don't you?
  你不觉得吗？

[03:50] We can think about how to compute that product more generally in just a moment, but it's way too easy to get lost in the forest of numbers.
  我们稍后可以考虑如何更普遍地计算该乘积，但很容易迷失在数字的森林中。

[03:55] Always remember that multiplying two matrices like this has the geometric meaning of applying one transformation then another.
  永远记住，像这样相乘两个矩阵具有应用一个变换再应用另一个变换的几何意义。

[04:06] One thing that's kind of weird here is that this has us reading from right to left.
  这里有一个奇怪的地方是，我们是从右向左阅读的。

[04:09] You first apply the transformation represented by the matrix on the right, then you apply the transformation represented by the matrix on the left.
  你先应用右边矩阵表示的变换，然后应用左边矩阵表示的变换。

[04:17] This stems from function notation, since we write functions on the left of variables, so every time you compose two functions, you always have to read it right to left.
  这源于函数表示法，因为我们将函数写在变量的左边，所以每次组合两个函数时，你总是必须从右向左阅读。

[04:24] Good news for the Hebrew readers, bad news for the rest of us.
  对于希伯来语读者来说是好消息，对我们其他人来说是坏消息。
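
Written out for the rotation-then-shear example, the right-to-left reading looks like this. The entries of the shear matrix are an assumption consistent with the landing spots given earlier, and the names f and g in the second line are generic function names used only to show the parallel with function notation.

```latex
\underbrace{\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}}_{\text{shear (applied second)}}
\underbrace{\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}}_{\text{rotation (applied first)}}
=
\begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix}
\qquad\text{just as}\qquad
(f \circ g)(x) = f\bigl(g(x)\bigr) \text{ applies } g \text{ first.}
```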

[04:29] Let's look at another example.
  让我们看另一个例子。

[04:31] Take the matrix with columns (1, 1) and (-2, 0), whose transformation looks like this.
  取第一列为 (1, 1)、第二列为 (-2, 0) 的矩阵，其变换如下。

[04:37] And let's call it M1.
  我们称之为M1。

[04:40] Next, take the matrix with columns (0, 1) and (2, 0), whose transformation looks like this.
  接下来，取第一列为 (0, 1)、第二列为 (2, 0) 的矩阵，其变换如下。

[04:47] And let's call that guy M2.
  我们称之为M2。

[04:49] The total effect of applying M1 then M2 gives us a new transformation.
  应用M1再应用M2的总效果给了我们一个新的变换。

[04:52] So let's find its matrix.
  所以我们来找出它的矩阵。

[04:54] But this time, let's see if we can do it without watching the animations and instead just using the numerical entries in each matrix.
  但这次，让我们看看是否可以在不看动画的情况下完成，而是仅使用每个矩阵中的数值条目。

[05:04] First, we need to figure out where I hat goes.
  首先，我们需要弄清楚 I hat 去了哪里。

[05:07] After applying M1, the new coordinates of I hat, by definition, are given by that first column of M1, namely (1, 1).
  应用M1后，I帽的新坐标，根据定义，由M1的第一列给出，即 (1, 1)。

[05:16] To see what happens after applying M2, multiply the matrix for M2 by that vector (1, 1).
  要查看应用M2后会发生什么，请将M2的矩阵乘以向量 (1, 1)。

[05:25] Working it out the way that I described last video, you'll get the vector (2, 1).
  按照我上个视频描述的方式计算，你会得到向量 (2, 1)。

[05:30] This will be the first column of the composition matrix.
  这将是组合矩阵的第一列。

[05:34] Likewise, to follow J hat, the second column of M1 tells us that it first lands on (-2, 0).
  同样，要跟踪J帽，M1的第二列告诉我们它首先落在 (-2, 0) 上。

[05:42] Then, when we apply M2 to that vector, you can work out the matrix-vector product to get (0, -2), which becomes the second column of our composition matrix.
  然后，当我们将 M2 应用于该向量时，你可以计算出矩阵向量乘积得到 (0, -2)，这成为我们组合矩阵的第二列。
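
The column-by-column computation for M1 then M2 can be sketched as follows, using the entries given above. The helper names `mat_vec` and `mat_mul` are illustrative. Note that the matrix applied second goes on the left.

```python
def mat_vec(columns, v):
    """Apply a 2x2 matrix, stored as (first column, second column), to v."""
    (a, c), (b, d) = columns
    x, y = v
    return (x * a + y * b, x * c + y * d)

def mat_mul(left, right):
    """Columns of the product: apply `left` to each column of `right`."""
    return tuple(mat_vec(left, column) for column in right)

M1 = ((1, 1), (-2, 0))  # columns (1, 1) and (-2, 0), applied first
M2 = ((0, 1), (2, 0))   # columns (0, 1) and (2, 0), applied second

print(mat_mul(M2, M1))  # ((2, 1), (0, -2))
```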

[05:56] Let me talk through that same process again, but this time I'll show variable entries in each matrix just to show that the same line of reasoning works for any matrices.
  让我再讲解一遍相同的过程，但这次我将在每个矩阵中显示变量项，只是为了表明同样的推理过程适用于任何矩阵。

[06:04] This is more symbol-heavy and will require some more room, but it should be pretty satisfying for anyone who has previously been taught matrix multiplication the more rote way.
  这会用到更多符号，也需要更多的空间，但对于任何以前以更死记硬背的方式学过矩阵乘法的人来说，这应该相当令人满意。

[06:14] To follow where I hat goes, start by looking at the first column of the matrix on the right, since this is where I hat initially lands.
  要跟踪 I hat 的去向，请先看右边矩阵的第一列，因为 I hat 最初会落在这里。

[06:21] Multiplying that column by the matrix on the left is how you can tell where the intermediate version of I hat ends up after applying the second transformation.
  用左边的矩阵乘以该列，就可以知道 I hat 的中间版本在应用第二次变换后会落在哪里。

[06:31] So the first column of the composition matrix will always equal the left matrix times the first column of the right matrix.
  所以组合矩阵的第一列将始终等于左矩阵乘以右矩阵的第一列。

[06:42] Likewise, J hat will always initially land on the second column of the right matrix.
  同样，J hat 最初将始终落在右矩阵的第二列上。

[06:48] So multiplying the left matrix by this second column will give its final location and hence that's the second column of the composition matrix.
  所以将左矩阵乘以这一第二列将给出其最终位置，因此这就是组合矩阵的第二列。
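
In symbols, with entries a, b, c, d for the left matrix and e, f, g, h for the right matrix (the letters are assumed labels, since the transcript does not name them), the two column computations read:

```latex
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\begin{bmatrix} e \\ g \end{bmatrix}
= e \begin{bmatrix} a \\ c \end{bmatrix} + g \begin{bmatrix} b \\ d \end{bmatrix}
= \begin{bmatrix} ae + bg \\ ce + dg \end{bmatrix},
\qquad
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\begin{bmatrix} f \\ h \end{bmatrix}
= f \begin{bmatrix} a \\ c \end{bmatrix} + h \begin{bmatrix} b \\ d \end{bmatrix}
= \begin{bmatrix} af + bh \\ cf + dh \end{bmatrix}

\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\begin{bmatrix} e & f \\ g & h \end{bmatrix}
=
\begin{bmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{bmatrix}
```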

[07:00] Notice, there are a lot of symbols here, and it's common to be taught this formula as something to memorize, along with a certain algorithmic process to kind of help remember it.
  请注意，这里有很多符号，通常人们会把这个公式当作需要记忆的内容来教授，并附带某种算法过程来帮助记住它。

[07:09] But I really do think that before memorizing that process, you should get in the habit of thinking about what matrix multiplication really represents: applying one transformation after another.
  但我确实认为，在记忆这个过程之前，你应该养成习惯，思考矩阵乘法真正代表什么：一个变换接一个变换地应用。

[07:19] Trust me, this will give you a much better conceptual framework that makes the properties of matrix multiplication much easier to understand.
  相信我，这将为你提供一个更好的概念框架，使矩阵乘法的性质更容易理解。

[07:27] For example, here's a question.
  例如，这里有一个问题。

[07:29] Does it matter what order we put the two matrices in when we multiply them?
  当我们相乘两个矩阵时，它们的顺序是否重要？

[07:33] Well, let's think through a simple example like the one from earlier.
  嗯，让我们来思考一个简单的例子，就像之前那个一样。

[07:37] Take a shear which fixes I hat and smooshes J hat over to the right and a 90° rotation.
  取一个剪切变换，它固定了 I hat 并将 J hat 向右推，还有一个 90° 的旋转。

[07:43] If you first do the shear then rotate, we can see that I hat ends up at (0, 1) and J hat ends up at (-1, 1).
  如果你先进行剪切再旋转，我们可以看到 I hat 最终位于 (0, 1)，J hat 最终位于 (-1, 1)。

[07:49] Both are generally pointing close together.
  两者大致指向相近的方向。

[07:53] If you first rotate then do the shear, I hat ends up over at (1, 1) and J hat is off in a different direction at (-1, 0), and they're pointing, you know, farther apart.
  如果你先旋转再进行剪切，I hat 最终位于 (1, 1)，而 J hat 则偏向另一个方向，位于 (-1, 0)，它们指向的方向，你知道，相距更远。

[08:06] The overall effect here is clearly different, so evidently order totally does matter.
  这里的总体效果显然不同，所以显然顺序确实很重要。

[08:12] Notice, by thinking in terms of transformations, that's the kind of thing that you can do in your head by visualizing.
  注意，通过从变换的角度思考，这是你可以在脑海中通过可视化完成的那种事情。

[08:17] No matrix multiplication necessary.
  无需进行矩阵乘法。
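
For anyone who does want to confirm the picture numerically, here is a small sketch comparing the two orders. It assumes the shear sends J hat to (1, 1) and the rotation is 90° counterclockwise, matching the landing spots quoted above. The helper names are illustrative.

```python
def mat_vec(columns, v):
    """Apply a 2x2 matrix, stored as (first column, second column), to v."""
    (a, c), (b, d) = columns
    x, y = v
    return (x * a + y * b, x * c + y * d)

def mat_mul(left, right):
    """Columns of the product: apply `left` to each column of `right`."""
    return tuple(mat_vec(left, column) for column in right)

rotation = ((0, 1), (-1, 0))  # 90 degrees counterclockwise
shear = ((1, 0), (1, 1))      # assumed: fixes I hat, sends J hat to (1, 1)

shear_then_rotate = mat_mul(rotation, shear)  # shear is applied first
rotate_then_shear = mat_mul(shear, rotation)  # rotation is applied first

print(shear_then_rotate)  # ((0, 1), (-1, 1))
print(rotate_then_shear)  # ((1, 1), (-1, 0))
print(shear_then_rotate == rotate_then_shear)  # False
```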

[08:21] I remember when I first took linear algebra, there was this one homework problem that asked us to prove that matrix multiplication is associative.
  我记得我第一次学线性代数时，有一道家庭作业题要求我们证明矩阵乘法是满足结合律的。

[08:29] This means that if you have three matrices A, B, and C and you multiply them all together, it shouldn't matter if you first compute A * B then multiply the result by C, or if you first multiply B * C then multiply that result by A on the left.
  这意味着如果你有三个矩阵 A、B 和 C，并将它们全部相乘，那么先计算 A * B 然后将结果乘以 C，或者先计算 B * C 然后将结果从左边乘以 A，这应该没有区别。

[08:45] In other words, it doesn't matter where you put the parentheses.
  换句话说，在哪里放置括号并不重要。

[08:48] Now, if you try to work through this numerically like I did back then, it's horrible, just horrible, and unenlightening for that matter.
  现在，如果你像我当时那样尝试用数值方法来推导，那会很可怕，简直太可怕了，而且毫无启发性。

[08:54] But when you think about matrix multiplication as applying one transformation after another, this property is just trivial.
  但当你将矩阵乘法视为一个接一个地应用变换时，这个性质就变得显而易见了。

[09:03] Can you see why?
  你能明白为什么吗？

[09:04] What it's saying is that if you first apply C then B then A, it's the same as applying C then B then A.
  它所说的是，如果你先应用 C，然后是 B，然后是 A，这与先应用 C，然后是 B，然后是 A 是一样的。

[09:13] I mean, there's nothing to prove.
  我的意思是，没有什么需要证明的。

[09:14] You're just applying the same three things one after the other, all in the same order.
  你只是把这三件事按相同的顺序，一个接一个地应用。

[09:19] This might feel like cheating, but it's not.
  这可能感觉像作弊，但事实并非如此。

[09:21] This is an honest-to-goodness proof that matrix multiplication is associative.
  这是一个货真价实的证明，证明矩阵乘法是满足结合律的。

[09:25] And even better than that, it's a good explanation for why that property should be true.
  比这更好的是，它很好地解释了为什么这个性质应该是成立的。
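
The argument above is the actual proof. The snippet below is only a numerical spot check for one choice of three matrices, showing that both ways of placing the parentheses give the same product. The matrices A, B, and C are arbitrary examples chosen for illustration, and the helper names are hypothetical.

```python
def mat_vec(columns, v):
    """Apply a 2x2 matrix, stored as (first column, second column), to v."""
    (a, c), (b, d) = columns
    x, y = v
    return (x * a + y * b, x * c + y * d)

def mat_mul(left, right):
    """Columns of the product: apply `left` to each column of `right`."""
    return tuple(mat_vec(left, column) for column in right)

# Arbitrary example matrices, each stored as (first column, second column).
A = ((0, 1), (-1, 0))
B = ((1, 0), (1, 1))
C = ((1, 1), (-2, 0))

ab_then_c = mat_mul(mat_mul(A, B), C)  # (AB)C
a_then_bc = mat_mul(A, mat_mul(B, C))  # A(BC)

# Either way, a vector passes through C, then B, then A.
assert ab_then_c == a_then_bc
print(ab_then_c)
```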

[09:31] I really do encourage you to play around more with this idea, imagining two different transformations, thinking about what happens when you apply one after the other, and then working out the matrix product numerically.
  我真的鼓励你多玩玩这个想法，想象两个不同的变换，思考当你一个接一个地应用它们时会发生什么，然后用数值计算出矩阵乘积。

[09:40] Trust me, this is the kind of playtime that really makes the idea sink in.
  相信我，这种玩耍的方式能真正让你理解这个想法。

[09:47] In the next video, I'll start talking about extending these ideas beyond just two dimensions.
  在下一个视频中，我将开始讨论将这些想法扩展到二维以上。

[09:50] See you then.
  到时候见。
