In the previous parts of this series [1], [2], and [3], we have observed:
- interpretation of multiplication of a matrix by a vector,
- the physical meaning of matrix-matrix multiplication,
- the behavior of several special-type matrices, and
- visualization of matrix transpose.
In this story, I want to share my perspective on what lies beneath matrix inversion, why different formulas related to inversion are the way they actually are, and finally, why calculating the inverse can be done much more easily for matrices of several special types.
Here are the definitions that I use throughout the stories of this series:
- Matrices are denoted with uppercase (like ‘A‘, ‘B‘), while vectors and scalars are denoted with lowercase (like ‘x‘, ‘y‘ or ‘m‘, ‘n‘).
- |x| – is the length of vector ‘x‘,
- Aᵀ – is the transpose of matrix ‘A‘,
- B⁻¹ – is the inverse of matrix ‘B‘.
Definition of the inverse matrix
From part 1 of this series – “matrix-vector multiplication” [1], we remember that a certain matrix “A“, when multiplied by a vector ‘x‘ as “y = Ax“, can be treated as a transformation of the input vector ‘x‘ into the output vector ‘y‘. If so, then the inverse matrix A⁻¹ should do the reverse transformation – it should transform vector ‘y‘ back to ‘x‘:
\begin{equation*}
x = A^{-1}y
\end{equation*}
Substituting “y = Ax” into this equation gives us:
\begin{equation*}
x = A^{-1}y = A^{-1}(Ax) = (A^{-1}A)x
\end{equation*}
which means that the product of the original matrix and its inverse – A⁻¹A – should be a matrix which does no transformation to any input vector ‘x‘. In other words:
\begin{equation*}
(A^{-1}A) = E
\end{equation*}
where “E” is the identity matrix.
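As a quick numerical sketch of this definition (the matrix here is an arbitrary invertible example, not one from the text):

```python
import numpy as np

# An arbitrary invertible 2x2 matrix, chosen only for illustration
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
A_inv = np.linalg.inv(A)

# Transform x into y, then undo the transformation with the inverse
x = np.array([1.0, -2.0])
y = A @ x
x_restored = A_inv @ y

print(np.allclose(x_restored, x))         # True
print(np.allclose(A_inv @ A, np.eye(2)))  # True: A^{-1}A = E
```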

The first question that can arise here is: is it always possible to reverse the influence of a certain matrix “A“? The answer is – it is possible only if no two different input vectors x₁ and x₂ are transformed through “A” into the same output vector ‘y‘. In other words, the inverse matrix A⁻¹ exists only if for any output vector ‘y‘ there exists exactly one input vector ‘x‘ which is transformed through “A” into it:
\begin{equation*}
y = Ax
\end{equation*}


In this series, I don’t want to dive too deep into the formal part of definitions and proofs. Instead, I want to observe several cases where it is actually possible to invert the given matrix “A“, and we will see how the inverse matrix A⁻¹ is calculated for each of those cases.
Inverting chains of matrices
An important formula related to matrix inverse is:
\begin{equation*}
(AB)^{-1} = B^{-1}A^{-1}
\end{equation*}
which states that the inverse of a product of matrices is equal to the product of the inverse matrices, but in the reverse order. Let’s understand why the order of the matrices is reversed.
What is the physical meaning of the inverse (AB)⁻¹? It should be a matrix that turns back the influence of the matrix (AB). So if:
\begin{equation*}
y = (AB)x,
\end{equation*}
then, we should have:
\begin{equation*}
x = (AB)^{-1}y.
\end{equation*}
Now, the transformation “y = (AB)x” goes in two steps: first, we compute:
\begin{equation*}
Bx = t,
\end{equation*}
which gives an intermediate vector ‘t‘, and then that ‘t‘ is multiplied by “A“:
\begin{equation*}
y = At = A(Bx).
\end{equation*}

So the matrix “A” influenced the vector after it was already influenced by “B“. In this case, to turn back such a sequential influence, we should first turn back the influence of “A“, by multiplying ‘y‘ by A⁻¹, which will give us:
\begin{equation*}
A^{-1}y = A^{-1}(ABx) = (A^{-1}A)Bx = EBx = Bx = t,
\end{equation*}
… the intermediate vector ‘t‘, produced a bit above.

Note, the vector ‘t’ participates here twice.
Then, after getting back the intermediate vector ‘t‘, to restore ‘x‘ we should also reverse the influence of matrix “B“. That is done by multiplying ‘t‘ by B⁻¹:
\begin{equation*}
B^{-1}t = B^{-1}(Bx) = (B^{-1}B)x = Ex = x,
\end{equation*}
or writing it all in an expanded way:
\begin{equation*}
x = B^{-1}(A^{-1}A)Bx = (B^{-1}A^{-1})(AB)x,
\end{equation*}
which explicitly shows that to turn back the influence of the matrix (AB) we should use (B⁻¹A⁻¹).

Note, both vectors ‘x’ and ‘t’ participate here twice.
This is why in the inverse of a product of matrices, their order is reversed:
\begin{equation*}
(AB)^{-1} = B^{-1}A^{-1}
\end{equation*}
The same principle is applied when we have more matrices in a chain, like:
\begin{equation*}
(ABC)^{-1} = C^{-1}B^{-1}A^{-1}
\end{equation*}
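This reversal is easy to verify numerically. A small sketch (the random 4×4 matrices here are my own stand-ins, almost surely invertible):

```python
import numpy as np

rng = np.random.default_rng(42)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

left = np.linalg.inv(A @ B)
right = np.linalg.inv(B) @ np.linalg.inv(A)   # reversed order
wrong = np.linalg.inv(A) @ np.linalg.inv(B)   # same order: this is (BA)^{-1}, not (AB)^{-1}

print(np.allclose(left, right))  # True
print(np.allclose(left, wrong))  # False for generic A and B
```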
Inversion of several special matrices
Now, with this perception of what lies beneath matrix inversion, let’s see how matrices of several special types are inverted.
Inverse of cyclic-shift matrix
A cyclic-shift matrix is a matrix “V“ which, when multiplied by an input vector ‘x‘, produces an output vector “y = Vx“ where all values of ‘x‘ are cyclically shifted by some ‘k‘ positions. To achieve that, the cyclic-shift matrix “V” has two lines of 1s running parallel to its main diagonal, while all its other cells are 0s.
\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5
\end{pmatrix}
= y = Vx =
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0
\end{bmatrix}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
=
\begin{pmatrix}
x_3 \\ x_4 \\ x_5 \\ x_1 \\ x_2
\end{pmatrix}
\end{equation*}

Now, how should we undo the transformation of the cyclic-shift matrix “V“? Obviously, we should apply another cyclic-shift matrix V⁻¹, which cyclically shifts all the values of ‘y‘ downwards by ‘k‘ positions (remember, “V” was shifting all the values of ‘x‘ upwards).
\begin{equation*}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
= x = V^{-1}Vx =
\begin{bmatrix}
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0
\end{bmatrix}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
= V^{-1}y
\end{equation*}

This is why the inverse of a cyclic-shift matrix is another cyclic-shift matrix:
\begin{equation*}
V_1^{-1} = V_2
\end{equation*}
More than that, we can note that the X-diagram of V⁻¹ is actually the horizontal flip of the X-diagram of “V“. And from the previous part of this series – “transpose of a matrix” [3], we remember that the horizontal flip of an X-diagram corresponds to the transpose of that matrix. This is why the inverse of a cyclic-shift matrix is equal to its transpose:
\begin{equation*}
V^{-1} = V^T
\end{equation*}
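A short sketch of this fact, building the 5×5 shift-by-2 matrix from the example above with NumPy:

```python
import numpy as np

# 5x5 cyclic-shift matrix that shifts a vector's values upwards by k = 2
k = 2
V = np.roll(np.eye(5), k, axis=1)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = V @ x
print(y)  # [3. 4. 5. 1. 2.]

# The transpose shifts the values back down, so it is the inverse
print(np.allclose(np.linalg.inv(V), V.T))  # True
print(np.allclose(V.T @ y, x))             # True
```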
Inverse of an exchange matrix
An exchange matrix, often denoted by “J“, is a matrix which, when multiplied by an input vector ‘x‘, produces an output vector ‘y‘ containing all the values of ‘x‘ but in reverse order. To achieve that, “J” has 1s on its anti-diagonal, while all its other cells are 0s.
\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5
\end{pmatrix}
= y = Jx =
\begin{bmatrix}
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0
\end{bmatrix}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
=
\begin{pmatrix}
x_5 \\ x_4 \\ x_3 \\ x_2 \\ x_1
\end{pmatrix}
\end{equation*}

Obviously, to undo this type of transformation, we should apply one more exchange matrix.
\begin{equation*}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
= x = J^{-1}Jx =
\begin{bmatrix}
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0
\end{bmatrix}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
= J^{-1}y
\end{equation*}

This is why the inverse of an exchange matrix is the exchange matrix itself:
\begin{equation*}
J^{-1} = J
\end{equation*}
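In code, the exchange matrix can be built by flipping the identity matrix left to right; a minimal check:

```python
import numpy as np

# 5x5 exchange matrix: ones on the anti-diagonal
J = np.fliplr(np.eye(5))

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = J @ x
print(y)  # [5. 4. 3. 2. 1.]

# Reversing twice restores the original order: J is its own inverse
print(np.allclose(J @ J, np.eye(5)))  # True
```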
Inverse of a permutation matrix
A permutation matrix is a matrix “P” which, when multiplied by an input vector ‘x‘, rearranges its values into a different order. To achieve that, an n×n permutation matrix “P” has ‘n‘ 1s, arranged in such a way that no two of them appear in the same row or the same column. All other cells of “P” are 0s.
\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5
\end{pmatrix}
= y = Px =
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 0
\end{bmatrix}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
=
\begin{pmatrix}
x_3 \\ x_1 \\ x_4 \\ x_5 \\ x_2
\end{pmatrix}
\end{equation*}

Now, what type of matrix should the inverse of a permutation matrix be? In other words, how do we undo the transformation of a permutation matrix “P“? Obviously, we need to do another rearrangement, which acts in the reverse order. So, for example, if the input value x₃ was moved by “P” to the output value y₁, then the inverse permutation matrix P⁻¹ should move the input value y₁ back to the output value x₃. This means that when drawing the X-diagrams of the permutation matrices P⁻¹ and “P“, one will be the reflection of the other.

Similarly to the case of an exchange matrix, in the case of a permutation matrix we can visually note that the X-diagrams of “P” and P⁻¹ differ only by a horizontal flip. That is why the inverse of any permutation matrix “P” is equal to its transpose:
\begin{equation*}
P^{-1} = P^T
\end{equation*}
\begin{equation*}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
= x = P^{-1}Px =
\begin{bmatrix}
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 0
\end{bmatrix}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
= P^{-1}y
\end{equation*}
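A quick numerical check of this property, using the permutation matrix from the example above:

```python
import numpy as np

# The permutation matrix from the example: y = (x3, x1, x4, x5, x2)
P = np.array([[0, 0, 1, 0, 0],
              [1, 0, 0, 0, 0],
              [0, 0, 0, 1, 0],
              [0, 0, 0, 0, 1],
              [0, 1, 0, 0, 0]], dtype=float)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = P @ x
print(y)  # [3. 1. 4. 5. 2.]

# The transpose moves every value back to its original position
print(np.allclose(P.T @ y, x))             # True
print(np.allclose(np.linalg.inv(P), P.T))  # True
```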
Inverse of a rotation matrix
A rotation matrix on the 2D plane is a matrix “R“ which, when multiplied by a vector x = (x₁, x₂), rotates the point (x₁, x₂) counter-clockwise by a certain angle θ around the origin. Its formula is:
\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2
\end{pmatrix}
= y = Rx =
\begin{bmatrix}
\cos(\theta) & -\sin(\theta) \\
\sin(\theta) & \phantom{+}\cos(\theta)
\end{bmatrix}
\begin{pmatrix}
x_1 \\ x_2
\end{pmatrix}
\end{equation*}

Now, what should the inverse of a rotation matrix be? How do we undo the rotation produced by a matrix “R“? Obviously, that should be another rotation matrix, this time with the angle -θ (or 360°-θ):
\begin{equation*}
R^{-1} =
\begin{bmatrix}
\cos(-\theta) & -\sin(-\theta) \\
\sin(-\theta) & \phantom{+}\cos(-\theta)
\end{bmatrix}
=
\begin{bmatrix}
\phantom{+}\cos(\theta) & \sin(\theta) \\
-\sin(\theta) & \cos(\theta)
\end{bmatrix}
= R^T
\end{equation*}
This is why the inverse of a rotation matrix is another rotation matrix. We also see that the inverse R⁻¹ is equal to the transpose of the original matrix “R“.
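This identity is easy to check numerically; a sketch with an arbitrary angle of 60°:

```python
import numpy as np

def rotation(theta):
    """2D counter-clockwise rotation matrix by angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

R = rotation(np.pi / 3)  # rotate by 60 degrees

# Rotating back by -theta is the same as multiplying by R^T
print(np.allclose(np.linalg.inv(R), rotation(-np.pi / 3)))  # True
print(np.allclose(np.linalg.inv(R), R.T))                   # True
```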
Inverse of a triangular matrix
An upper-triangular matrix is a square matrix that has zeros below its diagonal. Because of that, in its X-diagram, there are no arrows directed downwards:

The horizontal arrows correspond to cells of the diagonal, while the arrows that are directed upwards correspond to the cells above the diagonal.
A lower-triangular matrix is defined similarly – it has zeros above its main diagonal. In this article we will concentrate only on upper-triangular matrices, as for lower-triangular ones inversion is performed in an analogous way.
For simplicity, let’s at first address inverting a 2×2-sized upper-triangular matrix ‘A‘.

Once ‘A‘ is multiplied by an input vector ‘x‘, the result vector “y = Ax” has the following form:
\begin{equation*}
y =
\begin{pmatrix}
y_1 \\ y_2
\end{pmatrix}
=
\begin{bmatrix}
a_{1,1} & a_{1,2} \\
0 & a_{2,2}
\end{bmatrix}
\begin{pmatrix}
x_1 \\ x_2
\end{pmatrix}
=
\begin{pmatrix}
a_{1,1}x_1 + a_{1,2}x_2 \\
a_{2,2}x_2
\end{pmatrix}
\end{equation*}
Now, when calculating the inverse matrix A⁻¹, we want it to act in the reverse order:

How should we restore (x₁, x₂) from (y₁, y₂)? The first and simplest step is to restore x₂, using only y₂, because y₂ was originally affected only by x₂. We don’t need the value of y₁ for that:

Next, how should we restore x₁? This time, we can’t use only y₁, because the value “y₁ = a₁,₁x₁ + a₁,₂x₂” is a mixture of x₁ and x₂. But we can restore x₁ by using both y₁ and y₂ properly. Here, y₂ helps to filter out the influence of x₂, so that the pure value of x₁ can be restored:

We see now that the inverse A⁻¹ of the upper-triangular matrix “A” is also an upper-triangular matrix.
What about triangular matrices of larger sizes? Let’s now take a 3×3-sized matrix and find its inverse analytically.

Values of the output vector ‘y‘ are obtained now from ‘x‘ in the following way:
\begin{equation*}
y =
\begin{pmatrix}
y_1 \\ y_2 \\ y_3
\end{pmatrix}
= Ax =
\begin{bmatrix}
a_{1,1} & a_{1,2} & a_{1,3} \\
0 & a_{2,2} & a_{2,3} \\
0 & 0 & a_{3,3}
\end{bmatrix}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3
\end{pmatrix}
=
\begin{pmatrix}
a_{1,1}x_1 + a_{1,2}x_2 + a_{1,3}x_3 \\
a_{2,2}x_2 + a_{2,3}x_3 \\
a_{3,3}x_3
\end{pmatrix}
\end{equation*}
As we are interested in building the inverse matrix A⁻¹, our target is to find (x₁, x₂, x₃), having the values of (y₁, y₂, y₃):
\begin{equation*}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3
\end{pmatrix}
= A^{-1}y =
\begin{bmatrix}
\text{?} & \text{?} & \text{?} \\
\text{?} & \text{?} & \text{?} \\
\text{?} & \text{?} & \text{?}
\end{bmatrix}
\begin{pmatrix}
y_1 \\ y_2 \\ y_3
\end{pmatrix}
\end{equation*}
In other words, we must solve the system of linear equations mentioned above.
Doing that will first restore the value of x₃ as:
\begin{equation*}
y_3 = a_{3,3}x_3, \hspace{1cm} x_3 = \frac{1}{a_{3,3}} y_3
\end{equation*}
which will clarify the cells of the last row of A⁻¹:
\begin{equation*}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3
\end{pmatrix}
= A^{-1}y =
\begin{bmatrix}
\text{?} & \text{?} & \text{?} \\
\text{?} & \text{?} & \text{?} \\
0 & 0 & \frac{1}{a_{3,3}}
\end{bmatrix}
\begin{pmatrix}
y_1 \\ y_2 \\ y_3
\end{pmatrix}
\end{equation*}
Having x₃ figured out, we can bring all its occurrences to the left side of the system:
\begin{equation*}
\begin{pmatrix}
y_1 - a_{1,3}x_3 \\
y_2 - a_{2,3}x_3 \\
y_3 - a_{3,3}x_3
\end{pmatrix}
=
\begin{pmatrix}
a_{1,1}x_1 + a_{1,2}x_2 \\
a_{2,2}x_2 \\
0
\end{pmatrix}
\end{equation*}
which will allow us to calculate x₂ as:
\begin{equation*}
y_2 - a_{2,3}x_3 = a_{2,2}x_2, \hspace{1cm}
x_2 = \frac{y_2 - a_{2,3}x_3}{a_{2,2}} = \frac{y_2 - (a_{2,3}/a_{3,3})y_3}{a_{2,2}}
\end{equation*}
This already clarifies the cells of the second row of A⁻¹:
\begin{equation*}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3
\end{pmatrix}
= A^{-1}y =
\begin{bmatrix}
\text{?} & \text{?} & \text{?} \\[0.2cm]
0 & \frac{1}{a_{2,2}} & -\frac{a_{2,3}}{a_{2,2}a_{3,3}} \\[0.2cm]
0 & 0 & \frac{1}{a_{3,3}}
\end{bmatrix}
\begin{pmatrix}
y_1 \\ y_2 \\ y_3
\end{pmatrix}
\end{equation*}
Finally, having the values of x₃ and x₂ figured out, we can do the same trick, now moving x₂ to the left side of the system:
\begin{equation*}
\begin{pmatrix}
y_1 - a_{1,3}x_3 - a_{1,2}x_2 \\
y_2 - a_{2,3}x_3 - a_{2,2}x_2 \\
y_3 - a_{3,3}x_3
\end{pmatrix}
=
\begin{pmatrix}
a_{1,1}x_1 \\
0 \\
0
\end{pmatrix}
\end{equation*}
from which x₁ will be derived as:
\begin{equation*}
\begin{aligned}
& y_1 - a_{1,3}x_3 - a_{1,2}x_2 = a_{1,1}x_1, \\
& x_1
= \frac{y_1 - a_{1,3}x_3 - a_{1,2}x_2}{a_{1,1}}
= \frac{y_1 - (a_{1,3}/a_{3,3})y_3 - a_{1,2}\frac{y_2 - (a_{2,3}/a_{3,3})y_3}{a_{2,2}}}{a_{1,1}}
\end{aligned}
\end{equation*}
so the first row of matrix A⁻¹ will also be clarified:
\begin{equation*}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3
\end{pmatrix}
= A^{-1}y =
\begin{bmatrix}
\frac{1}{a_{1,1}} & -\frac{a_{1,2}}{a_{1,1}a_{2,2}} & \frac{a_{1,2}a_{2,3} - a_{1,3}a_{2,2}}{a_{1,1}a_{2,2}a_{3,3}} \\[0.2cm]
0 & \frac{1}{a_{2,2}} & -\frac{a_{2,3}}{a_{2,2}a_{3,3}} \\[0.2cm]
0 & 0 & \frac{1}{a_{3,3}}
\end{bmatrix}
\begin{pmatrix}
y_1 \\ y_2 \\ y_3
\end{pmatrix}
\end{equation*}
After deriving A⁻¹ analytically, we can see that it is also an upper-triangular matrix.
Paying attention to the sequence of actions that we used here to calculate A⁻¹, we can now say for sure that the inverse of any upper-triangular matrix ‘A‘ is also an upper-triangular matrix:

An analogous judgment will show that the inverse of a lower-triangular matrix is another lower-triangular matrix.
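The back-substitution steps described above can be sketched in code. This is a minimal illustration, not an optimized routine: each column of the inverse is found by solving Ax = eⱼ from the bottom row upwards, and the test matrix is an arbitrary example of mine:

```python
import numpy as np

def invert_upper_triangular(A):
    """Invert an upper-triangular matrix by back substitution:
    column j of the inverse is the solution x of A x = e_j."""
    n = A.shape[0]
    inv = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = 1.0
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):  # bottom row first, as in the derivation
            x[i] = (e[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
        inv[:, j] = x
    return inv

A = np.array([[2.0, 1.0, 3.0],
              [0.0, 4.0, 5.0],
              [0.0, 0.0, 6.0]])
A_inv = invert_upper_triangular(A)

print(np.allclose(A_inv @ A, np.eye(3)))     # True
print(np.allclose(np.tril(A_inv, -1), 0.0))  # True: the inverse stays upper-triangular
```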
A numerical example of inverting a chain of matrices
Let’s have another look at why the order of the matrices is reversed during the inversion of a chain. Recalling the formula:
\begin{equation*}
(AB)^{-1} = B^{-1}A^{-1}
\end{equation*}
This time, for both ‘A‘ and ‘B‘ we will take matrices of certain special types. The first matrix, “A = V“, will be a cyclic-shift matrix:

Let’s recall here that to restore the input vector ‘x‘, the inverse V⁻¹ should do the opposite – cyclically shift the values of the argument vector ‘y‘ downwards:

The second matrix “B=S” will be a diagonal matrix with different values on its main diagonal:

The inverse S⁻¹ of such a scale matrix, in order to restore the original vector ‘x‘, must halve only the first two values of its argument vector ‘y‘:

Now, what kind of behavior will the product matrix “VS” have? When calculating “y = VSx“, it will double only the first two values of the input vector ‘x‘, and then cyclically shift the entire result upwards.

We know already that once the output vector “y = VSx” is calculated, to reverse the influence of the product matrix “VS” and to restore the input vector ‘x‘, we should do:
\begin{equation*}
x = (VS)^{-1}y = S^{-1}V^{-1}y
\end{equation*}
In other words, the order of matrices ‘V‘ and ‘S‘ should be reversed during inversion:

And what will happen if we try to invert the effect of “VS” in an improper way, without reversing the order of the matrices, assuming that V⁻¹S⁻¹ is what should be used?

We see that the original vector (x₁, x₂, x₃, x₄) from the right side is not restored on the left side now. Instead, we have the vector (2x₁, x₂, 0.5x₃, x₄) there. One reason for this is that the value x₃ should not be halved on its path, but it actually does get halved: at the moment when the matrix S⁻¹ is applied, x₃ appears at the second position from the top, so it is halved. The same applies to the path of the value x₁. All this results in an altered vector on the left side.
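This mix-up can be reproduced numerically. A sketch assuming a 4×4 shift-by-one matrix V and a scale matrix S that doubles the first two values (my own small stand-ins for the illustrated example):

```python
import numpy as np

V = np.roll(np.eye(4), 1, axis=1)   # cyclic shift upwards by one position
S = np.diag([2.0, 2.0, 1.0, 1.0])   # doubles the first two values

x = np.array([1.0, 2.0, 3.0, 4.0])
y = V @ S @ x

ok  = np.linalg.inv(S) @ np.linalg.inv(V) @ y  # proper order:   S^{-1}V^{-1}
bad = np.linalg.inv(V) @ np.linalg.inv(S) @ y  # improper order: V^{-1}S^{-1}

print(ok)   # [1. 2. 3. 4.]   -> x is restored
print(bad)  # [2. 2. 1.5 4.]  -> (2*x1, x2, 0.5*x3, x4), an altered vector
```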
Conclusion
In this story, we have looked at the matrix inversion operation A⁻¹ as something that undoes the transformation of the given matrix “A“. We have observed why inverting a chain of matrices like (ABC)⁻¹ actually reverses the order of multiplication, resulting in C⁻¹B⁻¹A⁻¹. Also, we got a visual perspective on why inverting several special types of matrices results in another matrix of the same type.
Thanks for reading!
This is probably the last part of my “Understanding Matrices” series. I hope you enjoyed reading all 4 parts! If that is the case, feel free to follow me on LinkedIn, as hopefully other articles will be coming soon, and I’ll post updates there!
My gratitude to:
– Asya Papyan, for precise design of all the used illustrations ( behance.net/asyapapyan ).
– Roza Galstyan, for careful review of the draft, and useful suggestions ( linkedin.com/in/roza-galstyan-a54a8b352/ ).
If you enjoyed reading this story, feel free to connect with me on LinkedIn ( linkedin.com/in/tigran-hayrapetyan-cs/ ).
All used images, unless otherwise noted, are designed by request of the author.
References:
[1] – Understanding matrices | Part 1: Matrix-Vector Multiplication
[2] – Understanding matrices | Part 2: Matrix-Matrix Multiplication