Matrices as Transformations
Matrices are best understood as transformations that map vectors from one space to another. This concept is foundational in linear algebra, where matrices represent linear functions between vector spaces [1:2]. For instance, a matrix can scale, rotate, or shear vectors, which can be visualized geometrically [5:1]. Understanding matrices as transformations helps in grasping their application across various dimensions and scenarios.
Matrix Multiplication
Matrix multiplication is often seen as complex due to its unique rules, but it is defined to preserve the composition of linear transformations. When multiplying two matrices, the process involves "wiring" outputs of one matrix to inputs of another, ensuring that the resulting matrix represents the combined effect of both transformations [2:3], [4:2]. This approach ensures that for any matrices A, B, and vector v, the equation (AB)v = A(Bv) holds true, maintaining the integrity of linear mappings [3:5].
Applications of Matrices
Matrices are used extensively in various fields such as computer science, physics, and engineering. They are fundamental in solving systems of linear equations, which is essential in modeling real-world phenomena [3:1]. In robotics, for example, matrices are used to simulate movements and rotations of robot arms, allowing for precise control and simulation of mechanical components [4:5].
Learning Resources
For those looking to deepen their understanding of matrices, there are several resources available. The YouTube channel 3Blue1Brown offers an excellent series on linear algebra that provides visual and intuitive explanations of matrix operations [2:8], [3:4]. Additionally, engaging with online forums and discussions can provide insights into practical applications and conceptual understandings that may not be covered in traditional coursework [4:3].
Conceptualizing Matrices
To truly grasp matrices, it's important to move beyond rote memorization of formulas and seek to understand the underlying concepts. Viewing matrices as instructions or recipes for transformations can aid in developing an intuitive understanding of their operations and applications [5:1]. Engaging with problems and examples that require applying these concepts can further solidify one's comprehension of matrix operations.
At first, I looked at matrices as nice ways to organize numbers. Then I learned they transform vectors in space, and I thought of them as functions of a sort. Instead of f(x) being something, I had a matrix A transforming vectors into another set of vectors.
So I thought of them geometrically in a way for a couple weeks. A 1x1 matrix in 1D, a 2x2 in 2D, a 3x3 in 3D, and the rank also told me what dimension it was.
But then I saw matrices more than 3x3, and that idea and thought kind of fell apart.
Now I don't know how to think of matrices. I can do the problems we do in class fine, I mean, I see what our textbook is asking us to do, I follow their rules, and I get things "right" but I don't want to get things right - I want to understand what's happening.
Edit: for context, we learned row echelon form, Cramer's rule, inverses, the basics of adding/subtracting/multiplying, and this week we did spans and vector subspaces. I think we will learn eigenvalues and such very soon, or next?
I'll make a reply chain to myself at varying levels of complexity
matrices are functions which turn vectors into other vectors. these functions have some special properties
matrices correspond precisely to the functions which take vectors of length n and map them to vectors of length m in a linear manner. that is if v and w are vectors and c is a scalar (number), then for a matrix M, we have that
M(v+w) = M(v) + M(w)
and
M(cv) = c•M(v)
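A quick numerical check of these two rules (NumPy, with made-up values purely for illustration):

```python
import numpy as np

M = np.array([[2., 1.],
              [0., 3.]])   # any matrix works here
v = np.array([1., -2.])
w = np.array([4., 5.])
c = 7.0

print(np.allclose(M @ (v + w), M @ v + M @ w))   # True: M(v+w) = M(v) + M(w)
print(np.allclose(M @ (c * v), c * (M @ v)))     # True: M(cv) = c*M(v)
```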
Suppose you have two finite-dimensional vector spaces V and W over a field F. One can have a structure-preserving function between these vector spaces. It turns out that these functions are precisely the linear functions. That is, T:V->W is linear precisely when T(cv+w)=cT(v)+T(w) for vectors v,w in V and c in the base field.
Matrices then correspond precisely to these linear maps, as we have an isomorphism of V with F^n, where n is the dimension, and an isomorphism between W and F^m, where m is the dimension. This is done by picking a basis for both vector spaces. The transformation T:V->W can then be described as a matrix from F^n -> F^m by finding what T does to a basis of V and recording the result in W. After writing this in terms of what happens to the basis in F^n and F^m, we obtain the corresponding matrix representing this linear map.
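As a concrete sketch of "record what T does to a basis" (the map T below is made up purely for illustration): apply T to each standard basis vector and use the results as the columns of the matrix.

```python
import numpy as np

def T(v):
    # an arbitrary linear map R^3 -> R^2, chosen only as an example
    x, y, z = v
    return np.array([x + 2*y, 3*y - z])

basis = np.eye(3)                             # rows are e_1, e_2, e_3
A = np.column_stack([T(e) for e in basis])    # column i is T(e_i)

v = np.array([1.0, 4.0, -2.0])
print(A @ v, T(v))                            # the same vector both ways
```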
For a visual interpretation of the action that matrices have on vectors, Grant Sanderson (3b1b) has an excellent playlist: https://youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab
As representations of linear transformations with respect to some basis
This is the right way to think about matrices (most of the time). Among other things, it lets you derive the multiplication rule quite easily (multiplication of A and B corresponds to the composition of the respective linear functions on the underlying vector spaces).
The best way to view matrices is as transformations. The "training wheels" are geometric transformations:
https://www.youtube.com/watch?v=Bx4cFrjiQDc&list=PLKXdxQAT3tCtmnqaejCMsI-NnB7lGEj5u&index=21
But they apply any time you go from one thing to another thing:
https://www.youtube.com/watch?v=PVT2j4PID6w&list=PLKXdxQAT3tCtmnqaejCMsI-NnB7lGEj5u&index=94
https://www.youtube.com/watch?v=6dbE8usmgJY&list=PLKXdxQAT3tCtmnqaejCMsI-NnB7lGEj5u&index=95
I just don't really get them, how exactly do they work? I think I understand what they're used for (mainly programming, which is gonna suck as a CS major if I don't get this topic nailed down lol), but a lot about it feels overcomplicated. Or, moreso just being done w/o any explanation as for why. Like, why exactly would the dimensions of matrix A, when multiplied by B, equal m x p? What happens to the inner dimensions, n? And I also don't get how multiplication works for matrices. Why would you multiply the 1st row by the 1st column, then the 1st row by the 2nd, etc... rather than taking every individual element of Matrix A and multiplying by every element of Matrix B? I'd understand if it was simplicity's sake, but even testing out the right way of doing it and the way I was thinking of, I get 2 vastly different arrays.
Sorry if I sound really stupid with these questions lmao, this is just a topic I couldn't really wrap my head around even after looking at online resources. I'd really appreciate any help I could get :D
Edit: wow, ty everyone for taking time out of your day to help! I didn’t expect such traction so quick lol 😅 I can’t reply to everyone, but I did write down notes from all of your replies + saved your recommendations to my bookmarks Cheers :)
The way I look at it is: matrices are things that act on vectors. For example, one matrix might scale up all vectors by a factor of 2, while another might rotate vectors by a certain angle around a certain axis.
You've probably heard how n x 1 matrices are sometimes called "column vectors." If A is some random matrix and e_1 is the column vector (1,0,0...,0) (which looks more like a row vector, but that's only because I can't draw a column vector easily on Reddit, so please pretend it's a column), then what is the product Ae_1? If you look closely at the definition of matrix multiplication, you will discover that Ae_1 is simply the first column of A. And if e_2 is the column vector (0,1,0,...,0), then Ae_2 is the second column of A, and so on. So the matrix A can be interpreted as a list of column vectors, specifically the column vectors that A maps e_1, e_2, ... e_n to. Since any vector can be written as a linear combination of e_1, e_2, ...e_n, knowing where A sends those vectors tells you where A sends any vector. Explicitly, if v=(v_1,v_2,... v_n), then Av=v_1Ae_1+v_2Ae_2+...+v_nAe_n. So matrices are just compact representations of linear functions that take in a vector and output another vector. The height of a column is the length of the output vector, and the number of columns is the length of the input vector. Thus, a (b x a) matrix represents a function that takes as input a vector of length a and outputs a vector of length b.
Now, what happens if we have two matrices A and B and a vector v, and we want to figure out what happens when we start with v, apply A, and then apply B. Well, we can first read off the first column of A, a.k.a. Ae_1, and then compute what B does to that vector. That should be the first column of BA, since BA should be the matrix that describes our process of applying A and then applying B. (In other words, we want (BA)v=B(Av), making multiplication associative). Work it out carefully and you will see that the matrix multiplication you've been taught is exactly the one that does this job. In particular, the height of A and the width of B have to be the same, because the output of the first function we're applying has to be a vector of the same length as the input of the second function.
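A small NumPy sanity check of those two claims, using random matrices (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.integers(-3, 4, size=(3, 3))
B = rng.integers(-3, 4, size=(3, 3))
e1 = np.array([1, 0, 0])
v = rng.integers(-3, 4, size=3)

print(np.array_equal(A @ e1, A[:, 0]))              # Ae_1 is the first column of A
print(np.array_equal((B @ A)[:, 0], B @ A[:, 0]))   # first column of BA is B(Ae_1)
print(np.array_equal((B @ A) @ v, B @ (A @ v)))     # (BA)v = B(Av)
```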
This is all very abstract stuff, and I certainly haven't told you why we should care about matrices and vectors in the first place. So let me leave you with a concrete example. You probably know that 2-dimensional vectors of real numbers, e.g. (1,2.5), can be drawn as arrows in the 2D plane starting at the origin. One thing we can do with the 2D plane is rotate it around the origin counterclockwise by an angle t. From your knowledge of the unit circle, you may know that, if we start at the point (1,0) on the positive x-axis and rotate by t, we get to the point (cos(t), sin(t)). And, with a little more trig identity work and the knowledge that (0,1) is what you get by rotating (1,0) by pi/2, you can show that rotating by t sends (0,1) to (-sin(t),cos(t)). So, the matrix that represents "rotate counterclockwise by t radians" is just R_t:=[[cos(t),-sin(t)],[sin(t),cos(t)]].
Now, what happens if I rotate counterclockwise by t radians, and then rotate counterclockwise by another s radians? Well, I've rotated by (s+t) radians in all, right? Based on this, we should expect that R_sR_t=R_(s+t). And by using matrix multiplication and angle addition trig identities, one can check that this is the case.
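For instance, a quick numerical check of R_sR_t = R_(s+t) with arbitrary angles:

```python
import numpy as np

def R(t):
    # rotation of the plane by t radians (counterclockwise for t > 0)
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

s, t = 0.4, 1.1
print(np.allclose(R(s) @ R(t), R(s + t)))   # True
```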
woah yeah that’s a lot 😅 But I don’t mean it in a bad way, thank you, this gave some nice information abt concepts that my teacher certainly didn’t explain to us lol. Some parts do feel like they’re stuff I haven’t run into yet bc it’s far more advanced than my class, like the section at the end about rotating a set of coordinates by t. But this helped conceptualize rlly well what matrices can be defined as, which also wasn’t rlly explained so well in our class. As well as what the height of a column and # of columns represent. Again, ty, this was rlly informative and I made sure to write everything down :D hope you’re doing good man
each column corresponds to an input, each row to an output. So the product of two matrices requires "wiring" the outputs of one to the inputs of another, which is why you multiply rows and columns like that. This also explains the dimension thing, if one matrix has n inputs and m outputs, and another has m inputs and k outputs, the m outputs can be "wired" to the m inputs so now you have a matrix with n inputs and k outputs.
the "multiply each combination of components" thing you thought of is actually a thing called the tensor product, which is a generalization of objects like matrices and vectors to more complex types of function.
note: you can also say rows are inputs and columns are outputs, depends on if the smaller objects you're working with are column or row vectors.
okay this makes it easier to digest :D every row/output wires to every input, so m outputs would be wired to each n input
and then when you multiply it w/ another matrix, with k (or p) outputs and m inputs, then m would just be wired to the m outputs of Matrix A and cancel out. Leaving N inputs and K outputs for the final dimensions
This helped a lot bc I couldn't quite understand why it was like this, which was messing me up a lot. Thx so much man!
of course!
matrices are defined as linear maps from vectors to vectors, meaning that for vectors V and W, a matrix M, and a scalar k:
MV is a vector
M(V+W) = MV + MW
M(kV) = k(MV)
You can think of the first column of the matrix as the vector it outputs when the input is the column vector [1,0,...], the second being the output for [0,1,0,...], etc. Because of the above rules, knowing where each unit vector goes allows you to find the output for every vector.
This is probably not the right answer for your particular needs, but from the perspective of pure mathematics, I think the way in which matrices are introduced is often really unmotivated, especially in linear algebra courses designed mainly for students from other disciplines (also it is not really true that their main applications are in programming - matrices, and more generally linear transformations, which they represent, are ubiquitous across all of mathematics and any discipline using it).
The reason that matrix multiplication is the way it is is so that multiplication of matrices corresponds to composition of linear transformations (under a choice of bases). If you know what a linear transformation is, and what a basis is, then you can sit down and write out what sort of operation you need in terms of vector components with respect to given bases to completely describe the algebra of linear transformations, and you will more or less automatically rederive the rule for matrix multiplication. This perspective is laid out especially nicely in Sheldon Axler's Linear Algebra Done Right.
hey I appreciate any feedback I can get lol yeah it’s rather odd how matrices were introduced to us for this class. We didn’t really touch up on what they’re used for exactly/their relationship to vectors and the x/y-axis, only the basic rules of addition, subtraction, scalar multiplication, and then, the one I was most stuck on, matrix multiplication. It confused me a lot bc I didn’t even know what I was using matrices for, only that I was meant to know the formulas that came with them.
Being able to actually understand why they’re used and where they’re used rlly helped me in trying to understand them. While I was mostly thinking abt how they’re used in programming/CS at first, finding their uses in datasets made me better get why they’re helpful. Because what I didn’t get was why matrices were created for a limited set of coordinates, like let’s say x, y, and z. But when there’s many inputs and outputs of data that need to be transformed, putting it all into an organized array and going from there makes it much easier to calculate
overall, thx a lot for taking time out of your day to help :) while some of this isn’t yet on my level, I made sure to put this all down, esp for when I need the more in-depth parts during a later point in my education. Ty for the recommendation, I’ll keep it in mind for when I begin taking Linear Algebra 🙏 have a nice day
I recommend going to YouTube, looking up the channel 3blue1brown, and finding in his playlists tab “The Essence of Linear Algebra”. I feel like I learned more about matrices in that video series than I learned in a university-level course.
ty for the recommendation, definitely liking what I'm seeing from the first video. Really neat and concise w/ explaining how diff definitions of vectors relate to each other and how it's used
I'll watch the rest over the course of this week, it's getting late for me right now lol. Again, thx a lot for helping out, I rlly appreciate your generosity. Have a nice night :)🙏
A lot of great explanations, but I want to give some more insight into use cases.
Linear algebra is used to work with 3 (or more) dimensional space. We have found a lot of ways to model things this way, for example each column could be a datapoint and suddenly you're halfway to machine learning. In calculus of multiple variables linear algebra is used as the ground you stand on, putting variables into matrices. This leads into a multitude of applications in physics (like all of physics). Linear algebra also pops up in proofs in unrelated areas every now and again such as graph theory or combinatorics.
A lot of our computers are optimized for fast work on matrices, such as GPUs; it has pretty much become our standard way of dealing with math on large amounts of data, which is something some programmers will need to do. So if you are interested in simulations, computer graphics, machine learning, high performance computing, etc., then linear algebra is the most important math course you'll take.
Hello! (1st year uni student here) Matrices: So I know the fundamental principles of matrices, the rules, the properties, all that, but I only know them in a kind of blind memorization way, I don’t really get the deeper meaning behind them. What I’d like is to actually understand their purpose and how they’re used, not just how to apply formulas. And second, I want to understand the matrix product itself, I know how to do it, but I don’t get why it’s defined in this PARTICULAR way. Why do we multiply matrices like that instead of some other rule?
The short answer is that matrices represent linear maps, and products of matrices represent composition of linear maps. To understand this, a good place to start is this post on MSE.
Matrices are representations of linear transformations between (possibly different) vector spaces.
You can compose linear transformations (if the vector spaces match). Matrix multiplication is defined in such a way that
T_A ∘ T_B = T_(AB)
where T_A is the linear transformation represented by the matrix A.
Have you encountered the youtube channel 3blue1brown? They have a series on linear algebra that might help give you a deeper understanding
okay thank you
Matrix multiplication is done that way so that for every matrix A, B and vector v with appropriate dimensions, (AB)v=A(Bv), which you really want to have since matrices represent linear functions.
The video probably does a better job of it, but in case you aren't a fan of video learning like me, here's a basic concept.
Linear algebra is mostly concerned with linear equations. They take the form Ax+By+...+Dz = K where my upper case letters are coefficients and the lowercase letters are variables. These are linear because every term has exactly one variable to the power of 1 (otherwise this is a non-linear equation).
The original invention of matrices was to solve systems of these equations, that is cases where you have two or more equations in two or more variables. You will likely see this early on in a Linear Algebra class. The language of Linear Algebra allows us to represent the solution to this system as a vector X, the coefficients as a matrix A, and the constants as a vector Y. We can then ask "if we transform X by multiplying it by A, what value of X gives us Y". In essence, the coefficients put into the matrix A represent a way of transforming one vector (x, y, ..., z) into another (K1, K2, ..., Kn). Matrix multiplication is essentially how that process happens.
In this sense, we can talk about a matrix as a linear map, a way of mapping one vector to another vector.
If it helps, think of a linear function (y=mx+b) as mapping the scalar value x to the scalar value y. The linear map (y=Ax) maps the vector (or matrix) x to another vector or matrix y. Matrix multiplication is designed so that this works.
Matrices can also represent other things and be used for other purposes, but this is the essence of linear algebra. Pretty much everything you learn in Linear Algebra 1 supports these concepts and figuring out ways to quickly solve linear equations using matrices.
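For example, the tiny system 2x + y = 5, x - 3y = -1 becomes AX = Y; solving it is then one call to a linear-algebra routine (values made up for illustration):

```python
import numpy as np

A = np.array([[2.,  1.],    # coefficient matrix of the two equations
              [1., -3.]])
Y = np.array([5., -1.])     # the constants on the right-hand side

X = np.linalg.solve(A, Y)   # the vector X that A maps onto Y
print(X)                    # [2. 1.]
print(np.allclose(A @ X, Y))
```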
Hi, I just learnt matrices in class 10, and my teacher and textbook skip over the actual use of matrices, and why the addition, subtraction and especially multiplication are done in this way. I just know the computation of matrices and all that's fine, but I wanna know how it actually works and why multiplication of matrices is so weird
Matrix multiplication is weird because we want it to be compatible with function composition! Multiplying two matrices AB will give you a new matrix where (AB)v first applies B and then applies A.
Matrices encode what are known as linear transformations. A linear transformation T is a map with the properties that T(v+w) = T(v) + T(w) and T(cv) = cT(v) for vectors v, w and scalars c.
These properties (very intentionally) align with the properties of matrix-vector multiplication. This means that you can take any matrix A and create an associated linear transformation T(v) = Av. Similarly, given any^(1) linear transformation, you can find a unique^(2) matrix associated to it.
This now leads to our motivation for what we want matrix multiplication to do. Given two functions f(x) and g(x), we are able to compute their composition g∘f where (g∘f)(x) is defined to be g(f(x)). This comes up very often, so it would be nice if we could essentially precompute g∘f rather than having to apply both functions. This is exactly what matrix-matrix multiplication does. If you have two linear transformations with compatible dimensions T(v) = Av and S(v) = Bv, their composition (T∘S)(v) is equal to the function (AB)v. I believe the main way of showing this is to use (or prove) that A(Bv) = (AB)v. But as it turns out, this weird formula for matrix multiplication is exactly the one that gives us this compatibility between function composition and multiplication.
[1] Matrices can be found for vector spaces with finite dimension.
[2] These matrices are only unique up to your choice of basis for the vector space.
It's a good thing that you ask these questions. Look into Linear Algebra.
Matrix * vector multiplication is defined the way that it is in order for matrices to "naturally" be "linear transformations"
The motivation for the definition of matrix * matrix multiplication follows pretty quickly from matrix * vector multiplication. Matrix * matrix multiplication represents composing linear transformations.
This probably doesn't make sense, and you will need to look basic linear algebra things up, but this is the most succinct way to explain it
Use the index notation. We write:
A_{r,s}
for the matrix element at the rth row and the sth column. So, starting from the upper left corner, you move down by r steps and move to the right by s steps.
A vector of dimension d, is a d by 1 matrix. In index notation we write V_k for the kth element of a vector.
If the vector W results from applying a matrix A to a vector V, then in index notation, you have:
W_r = Sum over all s of A_{r,s} V_s
Matrix product: If matrix C is the matrix product of matrices A and B, so C = A B, then in index notation, we have:
C_{r,s} = Sum over all k of A_{r,k}B_{k,s}
And you then easily see that if we multiply matrices A1, A2, A3, ..., An, the product C in index notation will be given by:
C_{r,s} = Sum over all k_1, k_2, ..., k_{n-1} of A1_{r,k_1} A2_{k_1,k_2} A3_{k_2,k_3} ... An_{k_{n-1},s}
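Written out in code, that index-notation rule for C = AB is just three nested loops (a direct, unoptimized sketch):

```python
import numpy as np

def matmul_index(A, B):
    # C_{r,s} = sum over all k of A_{r,k} B_{k,s}
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "inner dimensions must match"
    C = np.zeros((n, p))
    for r in range(n):
        for s in range(p):
            for k in range(m):
                C[r, s] += A[r, k] * B[k, s]
    return C

A = np.random.rand(3, 4)
B = np.random.rand(4, 2)
print(np.allclose(matmul_index(A, B), A @ B))   # True
```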
Let me give an example from my (more or less) everyday work. Say you want to simulate a robot arm. It has different parts that can move and rotate. Luckily for us rotations and translation can be represented by matrices. Now I want to know if the piece that is at the end of the robot arm hits a wall (which would be really bad). So I need to know where all the points describing the thing at the end of arm end up. Now I could calculate A(B(C(Dv))) (capital letters for the transformation matrices and v a representation of a point of my object), but since the object has a lot of points this is a bit tedious and I don't really care about how the object came to the position, so it would be nice if I could just have one combined transformation. Luckily this is easy the combined transformation is the matrix product M=ABCD, I only have to calculate this once and now can just calculate Mv instead. To have this property is how the multiplication is done.
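A toy 2D version of that idea (the joint matrices below are made up; real robot code would use 3D transforms), just to show the "combine once, then reuse" pattern:

```python
import numpy as np

def rot(t):         # 2D rotation about the origin, as a homogeneous 3x3 matrix
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])

def trans(dx, dy):  # 2D translation, as a homogeneous 3x3 matrix
    return np.array([[1., 0., dx], [0., 1., dy], [0., 0., 1.]])

A, B, C, D = trans(1, 0), rot(0.3), trans(0, 2), rot(-0.5)  # made-up joint transforms
M = A @ B @ C @ D                                  # combine once...

points = np.random.rand(3, 1000)                   # many points on the end piece
points[2] = 1.0                                    # homogeneous coordinate
moved = M @ points                                 # ...then reuse for every point
print(np.allclose(moved, A @ (B @ (C @ (D @ points)))))   # True, by associativity
```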
Matrices are used to describe linear transformations. I'll give a geometric description, but for it to stick, you'll need to draw along. (This is one of those things that would be much easier if we were in front of a nice blackboard.)
To explain what a linear transformation is: imagine the xy-plane, with a grid of 1x1 squares drawn on it, parallel to the axes. We can draw a little arrow A from (0,0) to (1,0), and represent this arrow by the column vector [1 0]^T. (The T means "transpose", which just means to turn the row vector I wrote into a column vector. I did it this way because you can't write column vectors easily on Reddit.) We can draw another little arrow B from (0,0) to (0,1), and represent this arrow by the column vector [0 1]^T. Geometrically, a linear transformation means we transform (e.g. stretch, shrink, rotate, reflect, etc.) the plane in a way that keeps (0,0) the same, sends the little arrows A and B to two other little arrows, and then transforms the rest of the plane based on that, so that the grid squares we started with become a new grid of identical parallelograms with sides that all have the same shape as the arrows we sent A and B to. It's like we're turning the usual grid of parallelograms (squares are parallelograms too) to a new grid of parallelograms, and leveraging the fact that just knowing where A and B go tells us how to change all the other lines in the grid as well.
For example: we could rotate the plane by 90 degrees clockwise around (0,0). That would send A to [0 -1]^T and B to [1 0]^T, and all the squares would stay squares.
Or for another example, we could keep the x-axis the same, but tilt and stretch the y-axis to turn the squares into little diamonds, by sending A to [1 0]^T (keeping it the same) and sending B to [1 1]^T.
We can represent each transformation as a 2x2 matrix M, where the first column is whatever the arrow A gets sent to, and the second column is whatever the arrow B gets sent to. In other words, a matrix is nothing more than a list of what happens to the vectors [1 0]^T and [0 1]^T. (For a bigger matrix like a 3x3 matrix, it would be the vectors [1 0 0]^T, [0 1 0]^T, [0 0 1]^T, etc.) Then, matrix multiplication is defined so that, for example, M[1 0]^T is "whatever the arrow A gets sent to".
What happens when we want to do two transformations in a row? If we first do the tilt-and-stretch I described, and then do the rotation, where does A end up, and where does B end up? Is it the same as if we do them in the other order? You can work this out by drawing. What you will see is that, if R is the matrix for our rotation and S is the matrix for our tilt-and-stretch, then the product RS is the matrix for "first tilt-and-stretch, then rotate." The geometric definition I gave of linear transformation above is a bit sketchy. But if you look up an actual definition, and work very carefully, you can verify that, if we view matrices as lists of where each of our vectors [1 0 0...]^T, [0 1 0...]^T, etc. goes, the seemingly weird matrix multiplication is the one correct way to keep track of what happens when we do multiple linear transformations after one another.
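If you want to check the drawing with numbers, here are the two matrices from the examples above and the two possible orders of composition (note they differ):

```python
import numpy as np

R = np.array([[ 0., 1.],   # 90-degree clockwise rotation: columns are where A and B go
              [-1., 0.]])
S = np.array([[ 1., 1.],   # tilt-and-stretch: A stays at [1 0]^T, B goes to [1 1]^T
              [ 0., 1.]])

print(R @ S)   # "first tilt-and-stretch, then rotate"
print(S @ R)   # the other order gives a different matrix
```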
As for why it's useful, there are two main things to keep in mind. One is that there are a lot of things in life which behave like linear transformation, even if they're not transforming a literal space. Search the internet and you will be surprised how many things are linear. The other is that linear transformations are relatively easy to compute and linear equations are relatively easy to solve, so that turning more complicated situations into linear ones (even if the linear thing you create is only an approximation of the original situation) is a practical problem-solving strategy. For example, this is at the heart of a lot of applications of calculus (various forms of derivative are essentially telling you what the best linear approximation is), as well as a cornerstone of many things called AI, including LLMs like ChatGPT.
Let’s say you’re a wizard who can bend and twist space. Matrix composition is how you combine two spells (transformations) into one mega-spell. Here’s the intuitive breakdown:
Think of a matrix as a recipe for moving or stretching space. For example:
Every matrix answers one question: Where do the basic arrows (i-hat and j-hat) land after the spell?
If you cast two spells in a row, the result is a composition (like stacking filters on a photo).
Order matters: Casting “shear” then “rotate” feels different than “rotate” then “shear”!
Example:
To compute the composition BA (do A first, then B):
Imagine you’re teaching a robot to recognize cats in photos. The robot’s brain (a neural network) works like a factory assembly line with multiple stations (layers). At each station, two things happen:
When you stack layers, you’re composing these matrix transformations:
I’m sharing beginner-friendly math for ML on LinkedIn, so if you’re interested, here’s the full breakdown: LinkedIn
W Keep it up fam
Is my writing easy to understand?
Yeah ChatGPT makes it very easy to understand!
Yes
Trying to 🥰
You know, we could really use some originality instead of all this AI generated nonsense you've been posting. Maybe you do have a clear understanding of linear algebra, but your posts certainly aren't doing a very good job proving that - it doesn't take a lot of skill or knowledge to copy and paste from ChatGPT.
I would suggest focusing on developing your own unique writing style to express your own unique ideas, rather than having AI ghostwrite for you.
I rewrite the whole thing using AI after writing it myself, but thanks, I will not from now on. And thanks for letting me know 😊 I am taking this very positively
Topic: vectors and matrices.
We use the NumPy Python library for these.
I got introduced to the concept of vectors and matrices. Vectors are like lists and are divided into two categories, i.e. row vectors and column vectors. A row vector is like a series of numbers: it has one row but can have "n" number of columns. A column vector, on the other hand, can have "n" number of rows, but each row has only one column. We can refer to a row vector as (1, n) and a column vector as (n, 1).
When we combine both categories of vectors we get a matrix, which is like a list of lists: it can contain "m" number of rows and "n" number of columns. We can therefore refer to a matrix as (m x n).
Then I learnt something called the "Transpose".
Transpose means converting rows into columns and columns into rows. It is denoted by the letter "T" and it is one of the most important concepts for Machine Learning.
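Here is roughly what that looks like in NumPy (shapes chosen arbitrarily for illustration):

```python
import numpy as np

row = np.array([[1, 2, 3]])        # row vector, shape (1, 3)
col = np.array([[4], [5], [6]])    # column vector, shape (3, 1)
M = np.array([[1, 2, 3],
              [4, 5, 6]])          # matrix, shape (2, 3), i.e. (m x n)

print(row.shape, col.shape, M.shape)
print(M.T)            # transpose: rows become columns, shape (3, 2)
print(row.T.shape)    # transposing a row vector gives a column vector, (3, 1)
```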
We can perform arithmetic operations on these matrices, for example addition, subtraction, multiplication, etc. I have however not gone deep into it today as my focus was more on understanding the basics of vectors and matrices. However, I have plans to explore more about matrices because I think it is one of the most fundamental and important topics with respect to AI/ML.
A lot of people have also recommended some really great resources, which I explored as well. Suggestions and recommendations from you amazing people always help me learn better.
Also, here are my own handwritten notes, and I am again sorry for my handwriting. 😅
As a master's student in computer science, matrices are everything. Keep it up, buddy.
Yes, I am looking forward to learning matrices in as much detail as I can so that I can get clarity of concepts.
Thank you very much for your support.
Nice nice 🔥🔥 matrices are one of the most used topics in AI/ML
Yes Indeed I believe Linear algebra is the very foundation of machine learning.
Thank you for the appreciation btw.
Indeed, Linear Algebra is the foundation for AI and ML.
Bro, from where are you learning these? Please tell me too, cuz I am also doing this but can't find a good resource
I recommend 3Blue1Brown's "Essence of Linear Algebra" series. It dives into matrices and all that, probably to an extent farther than ML needs, but the clarity is nice. Also I recommend building projects by hand on pen and paper. Understanding your designs from a pure concept level helps you know what programming techniques to use for optimizing.
I want to learn all maths which is required for ML, like statistics, probability, linear algebra and calculus. I am good in maths and I love this subject, but struggling to find good resources. Thanks Bro 🥹
There are already people who have suggested you some quality resources, and 3Blue1Brown is really good to me; however, I also watch videos from another channel named Digital Daru (he also has a playlist dedicated to mathematics for AI/ML).
I wish you all the best for your learning journey.
Learning concepts like this is good, but don’t forget to do exercises. Lots of them! It’s one thing to understand what’s being explained, and it’s another to learn to think with it
Yes, I will also do exercises to practice applying these concepts which I have learnt (maybe school books can help me here).
Thank you for this amazing suggestion.
Where are you learning from?
Perhaps my title is not well-posed, but this is a learning sub so I'll ask anyways.
Teaching myself linear algebra again (I have a grad degree in engineering, but never felt like I got a good math foundation). One thing I've read is that we can see an m x n matrix A as a linear map function f: K^n -> K^m. But doesn't this imply that the arguments and outputs are both vectors?
If so, is it the case that the majority of (applications of) linear algebra revolve around treating matrices as linear transformations of vectors (as opposed to other matrices?)
Any n,m matrix yields linear maps R^(m,k) -> R^(n,k) via left multiplication. The k = 1 vector case is just one particular instance
Note also that if you have many vectors and you're interested in Av_1, Av_2 etc. you are also interested in the matrix product AV with V is the matrix with columns v_1, v_2 etc. (and vice versa)
> One thing I've read is that we can see an m x n matrix A as a linear map function f: K^n -> K^m. But doesn't this imply that the arguments and outputs are both vectors?
Yes. And matrix multiplication is defined to be consistent with function composition.
but let's say we were multiplying A by an n x m matrix B. B is not an element of K^n (is it?) and the result is not an element of K^m (is it?)
Perhaps my issue is actually more about the definition of K^n...
B is not an element of K^n, correct.
A can be seen as a linear function f.
B can be seen as a linear function g.
Then A*B can be seen as f∘g, f(g(x)) = A*B*x
Matrix multiplication has the same effect on every column of the right-side matrix and it distributes over matrix addition, so you can represent matrix multiplication as a bunch of matrix/vector multiplications just fine. For instance, for any 3x3 matrix M, multiplying an arbitrary matrix:
M · [a b c; e f g; h i j] = M · ([a 0 0; e 0 0; h 0 0] + [0 b 0; 0 f 0; 0 i 0] + [0 0 c; 0 0 g; 0 0 j])
= M · [a 0 0; e 0 0; h 0 0] + M · [0 b 0; 0 f 0; 0 i 0] + M · [0 0 c; 0 0 g; 0 0 j]
(rows of each matrix are separated by semicolons)
And the result is what you would expect with the zeroes remaining zeroes, but the columns transforming like standard vectors.
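A short NumPy check of the column-by-column view (random matrices, for illustration):

```python
import numpy as np

M = np.random.rand(3, 3)
B = np.random.rand(3, 4)

# each column of MB is just M applied to the corresponding column of B
col_by_col = np.column_stack([M @ B[:, j] for j in range(B.shape[1])])
print(np.allclose(M @ B, col_by_col))   # True
```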
This is exactly the insight I was looking for with this question. Thank you!!
As for your question about matrix multiplication in applications, I'm most familiar with 3D graphics, where you often have objects that are "attached" to each other, forming a hierarchy. The position and orientation of each object can be described by a matrix, but in order to get the "attachment" behavior, the matrix of an object is relative to its parent.
So if object A is 1 unit to the right of the origin, and object B is attached 2 units to the right of object A, then object B ends up 3 units to the right of the origin.
This hierarchy forms a tree, and in order to calculate the final positions of all the objects, you climb down each branch of the tree multiplying by each matrix as you go, so when you get to an object you've effectively accumulated all the transformations performed on its parent objects.
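A toy version of that tree walk in 2D homogeneous coordinates, using the 1-unit / 2-unit example above (a sketch, not real engine code):

```python
import numpy as np

def trans(dx, dy):   # translation as a homogeneous 3x3 matrix
    return np.array([[1., 0., dx], [0., 1., dy], [0., 0., 1.]])

A_local = trans(1, 0)            # object A is 1 unit right of the origin
B_local = trans(2, 0)            # object B is 2 units right of its parent A

B_world = A_local @ B_local      # accumulate parent * child while walking the tree
origin = np.array([0., 0., 1.])
print(B_world @ origin)          # [3. 0. 1.]  -> B ends up 3 units right of the origin
```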
A lot of applied linear algebra has the goal of multiplying a matrix by a vector, but it might be more efficient to get there by multiplying a matrix by a matrix instead.
E.g., you want to calculate A(Bv). It might be more convenient to calculate (A x B)v if you need to perform the calculation many times for the same A and B and different v's.
That's an interesting point. So the goal is usually to transform a vector, but implementations may be more efficient by combining two maps together first.
Adding onto this answer: there are all sorts of other things you might need to do in the middle.
For example, the nth Fibonacci number F(n) (starting from F(1) = 1, F(2) = 1) may be calculated as the top entry of A^(n-1)v, where A is
(1 1)
(1 0)
and v is
(1)
(0)
(Try calculating F(1), F(2), F(3), F(4) using this method and you'll see why.)
In this problem, the "expensive" part is different: multiplying a matrix by a matrix is fine, and multiplying a matrix by a vector is fine, but working out A^(n-1) (for large n) is not fine. In this case, the trick is to diagonalise A: write A = P^(-1)DP, where D is a diagonal matrix. Then A^(n-1)v = (P^(-1)DP)^(n-1)v = (P^(-1)D^(n-1)P)v. The benefit here is that D^(n-1) is very easy to work out.
In other problems, calculating M^(-1) might be very difficult if M is very large, or whatever.
The point is, in each case, to reduce a "hard" problem to an "easy" (or "easier") problem. But how you actually do that in practice isn't always obvious.
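A quick NumPy version of the Fibonacci trick above, plus the diagonalisation shortcut in floating point (NumPy's eig returns A = P D P^(-1), the reverse of the convention written above):

```python
import numpy as np

A = np.array([[1, 1],
              [1, 0]])
v = np.array([1, 0])                 # v = (1, 0)^T

def fib(n):
    # F(n) is the top entry of A^(n-1) v
    return (np.linalg.matrix_power(A, n - 1) @ v)[0]

print([fib(n) for n in range(1, 11)])          # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

# Diagonalisation: only the diagonal matrix D gets raised to a power, which is cheap.
w, P = np.linalg.eig(A.astype(float))
n = 10
print((P @ np.diag(w ** (n - 1)) @ np.linalg.inv(P) @ v)[0])   # ~ 55.0, i.e. F(10)
```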
Matrix-matrix is a lot more expensive than matrix-vector, so it's important to only do that when you need the composition many times
I guess that's one way to explain linear algebra in a nutshell. That being said, the bulk of the work in linear algebra (as far as matrices and vectors are concerned) is in analyzing the mapping, i.e., the matrix. For example:
This is the response I got:
Imagine a "Sand Transfer" System
Let's say you have a system of bins and piles of sand.
Visualizing (Av) ⋅ w:
Visualizing v ⋅ (transpose(A) w):
Why the Results are the Same (The Key Insight):
The total "value" of the sand in the system doesn't depend on how you calculate it.
<A v, w> = (A v)^T w = v^T A^T w = v^T (A^T w) = <v, A^T w>
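That chain of equalities is easy to spot-check numerically with random data (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
v = rng.standard_normal(4)
w = rng.standard_normal(3)

# <Av, w> and <v, A^T w> are the same number
print(np.dot(A @ v, w), np.dot(v, A.T @ w))
print(np.isclose(np.dot(A @ v, w), np.dot(v, A.T @ w)))   # True
```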
Nice. What was your prompt?
First I just casually asked it to give me visual intuition for the matrix transpose. As these models are trained on internet data and other stuff, as expected it gave me an explanation in terms of dual vectors and level sets, which is the explanation you'd find about matrix transpose intuition on the internet. I set the temperature to 1.7 and it just naturally started giving me analogies; this is one of them.
Damn pretty interesting, I might have my hands on that
Example of a Toeplitz matrix identified when visualizing a radiation resistance matrix during my thesis work. One interesting property of the Toeplitz matrix is that every unique value can be found in a single row. This discovery greatly sped up our data crunching process! I think the pattern looks pretty cool. Used original data I collected using an SLDV device (source) and image created using Matlab (tool).
Why are there block sub-matrices?
Good question. If I understand your question, you're wondering why the pattern seems to have 5x5 blocks making up the pattern as a whole. This example was something I created to see if there was anything we could learn for making our computations more efficient, a symmetry we could exploit. So I created a simple case of a square flat plate divided into a 5X5 grid, so 25 elements total. That's why you see the smaller blocks in the pattern. In our actual physical experimentation we were using grids of around 2500X2500, or 6.25million elements. You can still see the smaller block patterns in those but this simple case was a lot easier to start with.
Can't find the comment on here now... but someone asked if N was the number of elements required to achieve the frequency resolution required. Or something like that... And yes, that is the case. As frequency increases, waves get shorter and the modes get smaller. The general rule of thumb is to make the elements small enough to fit 6 in the wave of the highest frequency desired.
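For anyone curious why the Toeplitz structure helps, here is a tiny made-up example (not the thesis data): a symmetric Toeplitz matrix is fully determined by its first row, so only n values need to be computed instead of n^2.

```python
import numpy as np

n = 5
first_row = np.random.rand(n)   # stand-in values; entry (i, j) depends only on |i - j|
T = np.array([[first_row[abs(i - j)] for j in range(n)] for i in range(n)])

# every value appearing anywhere in T already appears in row 0
print(np.array_equal(np.unique(T), np.unique(first_row)))   # True
```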
(source) Data I collected during my master's thesis work in Mechanical Engineering, data collected using an SLDV (scanning laser Doppler vibrometer) device.
(tool) Data compiled, transformed, and visualized using Matlab.
For completeness' sake, can you also explain what the M value is?
The M value is the resulting value for each element in the Radiation Resistance Matrix (R Matrix), at a specific frequency. Here is the equation for the R Matrix:
What do the axes correspond to? In other words, if they had labels what would they say?
N, for both. N as in the number of the radiating element. Essentially the method of measurement applied uses a laser to scan discrete locations on the surface of a noise radiating surface. The surface is divided into small pieces (elements) and the velocity of each small piece is measured. We then compute the radiation resistance matrix (the effect of each element on each element, 1 on 1, 1 on 2, 1 on 3, … 1 on N, 2 on 1, etc.). The resulting matrix is then plotted on a color map to produce the above images.
Doesn’t it match high order vibration modes of the surface?
Now this is some proper content for this sub!
Very didactic article. I enjoyed reading it.
It would be interesting to measure performance with other open source linear algebra libraries, for example:
Yeah, while it's a nice exercise to get some understanding of how linear algebra is implemented on a computer, he should have put a gigantic disclaimer at the top stating: "Do NOT use this code for production - use a specialized library that has been hand optimized through decades of hard work by experts"
:-D I thought that would be clear. Just like “Don’t implement your own encryption”
I always appreciate discussions about numerical methods with c++. Thanks!
> Real-life implementations of matrix multiplication are using Strassen’s algorithm
No they don’t. Strassen is only of theoretical interest. This is a silly statement.
If you want to do something performant with matrix-matrix multiply, implement the recursive algorithm, which is an example of a cache-oblivious algorithm. You need gsl::span for that. I had a good time coding it.
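For the curious, here is what the recursive scheme looks like as a Python/NumPy sketch (split the largest dimension in half until blocks are small, then multiply directly); this only shows the idea, not a replacement for a tuned BLAS:

```python
import numpy as np

def matmul_recursive(A, B, cutoff=64):
    # cache-oblivious style: always halve the largest of the three dimensions
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    if max(n, k, m) <= cutoff:
        return A @ B                                   # small base case
    if n >= k and n >= m:                              # split the rows of A
        h = n // 2
        return np.vstack([matmul_recursive(A[:h], B, cutoff),
                          matmul_recursive(A[h:], B, cutoff)])
    if m >= k:                                         # split the columns of B
        h = m // 2
        return np.hstack([matmul_recursive(A, B[:, :h], cutoff),
                          matmul_recursive(A, B[:, h:], cutoff)])
    h = k // 2                                         # split the shared dimension
    return (matmul_recursive(A[:, :h], B[:h], cutoff) +
            matmul_recursive(A[:, h:], B[h:], cutoff))

A = np.random.rand(200, 150)
B = np.random.rand(150, 180)
print(np.allclose(matmul_recursive(A, B), A @ B))   # True
```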
Introduction to Matrix Operations
Key Considerations for Matrix Operations
Matrix Basics:
Types of Matrices:
Basic Operations:
Special Properties:
Applications:
Recommendation: To get started, practice with small matrices to understand addition, subtraction, and multiplication. Use online resources or textbooks that provide exercises and examples to reinforce your understanding. Familiarity with these operations is foundational for more advanced topics in linear algebra and applied mathematics.
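A minimal starting point for that practice, with two small made-up matrices:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

print(A + B)   # element-wise addition
print(A - B)   # element-wise subtraction
print(A @ B)   # matrix multiplication (rows of A against columns of B)
print(A * B)   # careful: * is element-wise, NOT matrix multiplication
```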