In today’s blog post, I am going to look a little more closely at the concept of tensors, which is of paramount importance in many areas of modern physics. Tensors find application in areas as diverse as General Relativity, continuum mechanics and quantum field theory. The aim of this article is to convey an overview of the general concept, and it is specifically geared towards lay people: clarity and understanding are given preference over mathematical rigor, and I will largely avoid textbook-style discussions of how to manipulate such objects; a separate article on tensor calculus will follow in the near future.
Before we can discuss tensors, we need to briefly touch on another fundamental object, called a 1-form. To put things as simply as possible, a 1-form is a function which maps a vector into a real number; i.e. it takes a vector as input, and produces a real number as output. To understand the meaning of this operation, we can represent vectors and 1-forms as geometric objects, like so :
Vectors, as you may remember from your school days, can be geometrically represented in a specific coordinate system as arrows, i.e. as objects with a magnitude and direction. 1-forms on the other hand are represented by families of surfaces, which may or may not be flat and regular, and the separation between which can vary from point to point. In the above graphic, we have three basis vectors in the x, y and t directions, and a corresponding set of surfaces for each, which are pierced by the arrows. The arrows are vectors, the sets of surfaces are 1-forms. So when we say that a 1-form is a function that takes a vector as input and produces a real number as output, what is that resulting “real number” ? That number is – geometrically speaking – quite simply the number of surfaces pierced by a given vector. For example, in the above graphic, each of the three vectors pierces exactly four surfaces, so each of the 1-forms pictured above takes as input one of the vectors, and produces the number 4 as a result. And this, in a nutshell, is the relationship between vectors and 1-forms – arrows piercing surfaces, functions acting on vectors.
In terms of mathematical notation, a 1-form is an object written with precisely one index, just like a vector. To agree on a convention so that we do not get confused later, vectors are henceforth denoted with an upper index, and can be represented by a 1-column matrix, such as :
which is something you are probably already familiar with. Note that, while vectors and 1-forms can be defined in any number of dimensions, I will restrict my attention here specifically to our 4-dimensional spacetime. 1-forms, on the other hand, are denoted by lower indices, and are represented by 1-row matrices :
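As a sketch of what these two representations look like (the symbols v and u are illustrative choices, not notation fixed by the text), a vector can be written as a column of four components and a 1-form as a row of four components:

```latex
v^{\alpha} =
\begin{pmatrix} v^{0} \\ v^{1} \\ v^{2} \\ v^{3} \end{pmatrix},
\qquad
u_{\alpha} =
\begin{pmatrix} u_{0} & u_{1} & u_{2} & u_{3} \end{pmatrix}
```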
Note that in both cases, the Greek index runs from 0…3, and not from 1…4; this is an important convention, which is standard in relativistic physics, where the index 0 labels the time component. Now, in order to denote the action of a 1-form u on a vector v, i.e. the mapping of an arrow into the number of 1-form surfaces it pierces, we introduce the following notation :
This introduces yet another convention, called the Einstein summation convention – if the same index appears once as an upper and once as a lower index within the same term, we understand this to imply that we are to perform a summation over all possible values of the index, as in the expression above, even if the sum symbol is not explicitly written out. This is a standard convention used in most physics textbooks, so it is very important to be aware of this.
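To make the idea of "a function acting on a vector" concrete, here is a minimal sketch in plain Python. It treats a 1-form as a function built from its four lower-index components, and writes the Einstein summation out as an explicit sum. The helper name `one_form` and all the numbers are assumptions chosen purely for illustration.

```python
def one_form(components):
    """Return a function that maps a vector to a real number.

    `components` are the lower-index components u_0..u_3 in some basis.
    """
    def act(vector):
        if len(vector) != len(components):
            raise ValueError("vector and 1-form must have the same dimension")
        # Einstein summation written out: u_alpha v^alpha, alpha = 0..3
        return sum(u_a * v_a for u_a, v_a in zip(components, vector))
    return act


# Illustrative components only (not taken from the article's graphic).
u = one_form([1.0, 0.0, 2.0, 0.0])
v = [3.0, 5.0, 0.5, 7.0]
w = [1.0, 1.0, 1.0, 1.0]

result = u(v)  # 1*3 + 0*5 + 2*0.5 + 0*7

# Linearity: acting on 2v + w gives 2*u(v) + u(w).
combined = u([2 * a + b for a, b in zip(v, w)])
```

In the geometric picture, `result` plays the role of the number of surfaces of u that the arrow v pierces.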
Of course, a lot more can be said on the subject of vectors and 1-forms, but for now I just want you to be aware of the existence of these objects, and their general relationship; this article is of course not a substitute for a good textbook on differential geometry.
So what does any of this have to do with tensors ? As basic motivation, consider the following very simple problem: we are given a free electric charge in an otherwise completely empty region of space, and we are tasked with writing down the physical laws of electromagnetism for this simple system. How do we approach this ? We could try and make our lives easy, by assuming we are at rest with respect to the charge; in that case we will describe only an electrostatic field, but no magnetic fields, and our laws will have a very simple form. But then, what about someone who is moving with respect to the charge ( or vice versa ) ? Clearly, our simple laws don’t apply to him, because in addition to an electric field he will also see a magnetic field. No problem, so we write a different set of equations which captures this. But now, someone comes along and says “wait a minute, the charge is spherical, I want to write this using spherical coordinates !”. And so he does, and obtains yet another set of equations to describe the same physical system. And so on.
You are probably starting to see the problem – if done in the naive way, the mathematical form our laws of physics take on will depend on the observer who formulates them. In more technical terms, the mathematical form of the laws of physics will depend on the system of coordinates we use to describe it – the physics are always the same, but the maths are not.
Would it not be advantageous to formulate physical laws in such a way that they are not dependent on which observer formulates them, i.e. that are the same for all observers regardless of where/when they are and how they move ? That is to say, can we develop a mathematical formalism that does not explicitly depend on which system of coordinates we choose to describe it ?
The answer is of course yes. In order to do so, we need to stop trying to specify the components of vectors and 1-forms ( since the mathematical expressions of these components depend on the choice of coordinate system ), and instead ask ourselves how physical quantities are related to one another, and how these relationships change between observers and events.
Let’s stay with the example of electromagnetism. Clearly, the components of the E and B fields are dependent on the observer, so trying to specify them directly will always result in expressions that are coordinate-dependent. A better approach would be this : why don’t we define a function at each point in space-time that takes as input the 4-velocity of our observer at that point ( whom we imagine to carry an electric charge, since an uncharged observer feels no electromagnetic force ), and produces as output the 4-force he feels as a result ? Something like this :
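In symbols, such a relation might be sketched as follows. The notation is an assumption in the style of Misner/Thorne/Wheeler, not something fixed by the text: q is the charge, p the 4-momentum, τ the proper time, u the 4-velocity, and the 4-force is the rate of change of p with respect to τ.

```latex
\frac{d\mathbf{p}}{d\tau} = q\,\mathbf{F}(\mathbf{u})
```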
This expression does not make any reference to the E and B field components, but instead describes the relationship between two vectors – the velocity vector of the observer, and the resulting force vector which he is subjected to due to the presence of the electromagnetic field. The above expression has the same form regardless of which observer writes it down, and which system of coordinates he chooses to do so, because it specifies only the relationship between these two vectors – which never changes.
So what is this mysterious function ? Basically, what we have done here is attach a little “machine” to each point in space and time, which takes as input a vector ( the velocity vector of the observer ), crunches it up and processes it, and produces as output another vector ( the 4-force acting on our observer as a result of the electromagnetic field ); and it does so in a way that is completely independent from any specific choice of coordinate system, since only the relationship between the two vectors is referenced, but not their component expressions. And that is exactly the ( laymen’s ) definition of a tensor :
A tensor is a function defined at each event in space-time, which takes as input a certain number of vectors and 1-forms, and produces as output a real number, or another tensor. In other words, a tensor is a multilinear map ( i.e. a map that is linear in each of its inputs separately ) which maps vectors and 1-forms into real numbers, or into other vectors, 1-forms and tensors.
We can write the above expression with indices :
In this example, the tensor F ( called the Faraday tensor ) takes as input a 4-velocity vector, and produces as output a 4-force vector, at each point in space-time. It maps a vector into another vector, and it does so independently from the coordinate basis which is used to write the components of the vectors. This coordinate independence can be formally proven, but I will omit the proof here, as the purpose of this article is merely to explain the meaning of the concept of tensors. For the formal technical details, I refer the student to one of the many textbooks on the subject matter.
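The "machine" picture can be tried out numerically. The sketch below works in one particular coordinate system and computes f^α = q F^α_β u^β with the sum over β written out. Several things here are assumptions rather than part of the article: units with c = 1, the sign convention for the mixed components (the one I believe Misner/Thorne/Wheeler use), and the function names `faraday_mixed`, `four_velocity` and `four_force`.

```python
import math


def faraday_mixed(E, B):
    """Components F^alpha_beta built from E and B (assumed convention, c = 1)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, Ex, Ey, Ez],
        [Ex, 0.0, Bz, -By],
        [Ey, -Bz, 0.0, Bx],
        [Ez, By, -Bx, 0.0],
    ]


def four_velocity(v):
    """4-velocity u^alpha = gamma * (1, vx, vy, vz) for a 3-velocity with |v| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    return [gamma] + [gamma * c for c in v]


def four_force(q, F, u):
    """f^alpha = q * F^alpha_beta u^beta, with the sum over beta written out."""
    return [q * sum(F[a][b] * u[b] for b in range(4)) for a in range(4)]


# Charge at rest in a pure electric field along y: the spatial force is q * E.
F_electric = faraday_mixed(E=(0.0, 2.0, 0.0), B=(0.0, 0.0, 0.0))
force_at_rest = four_force(1.0, F_electric, four_velocity((0.0, 0.0, 0.0)))

# Charge moving along x through a pure magnetic field along z:
# the spatial force is q * gamma * (v x B), which points along -y here.
F_magnetic = faraday_mixed(E=(0.0, 0.0, 0.0), B=(0.0, 0.0, 1.0))
force_moving = four_force(1.0, F_magnetic, four_velocity((0.6, 0.0, 0.0)))
```

The same function `four_force` handles both cases; only the input vector and the field components change.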
The above Faraday tensor has two indices, hence we say it is a rank-2 tensor. A tensor with three indices would be of rank-3, and so on. In general terms, a tensor written with n upper and m lower indices is said to be of rank-(n+m), and it takes n 1-forms and m vectors, and maps those into a real number. If only some of these inputs are supplied, what remains is a tensor of lower rank; this is how F, given only a vector, returns a vector. A rank-1 tensor is either a vector ( one upper index ) or a 1-form ( one lower index ), and a rank-0 tensor is a scalar. Note that the rank of a tensor has nothing to do with the dimension of the space it is defined in: the dimensionality is reflected in the values that each of the indices can take on, not in the number of indices.
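The difference between rank and dimension can be illustrated by simply listing index combinations. In this small sketch (the helper name `component_labels` is made up for the purpose), the rank sets how many indices each entry has, while the dimension sets how many values each index can take.

```python
from itertools import product


def component_labels(rank, dimension):
    """All index combinations of a tensor with `rank` indices in `dimension` dimensions."""
    return list(product(range(dimension), repeat=rank))


# Same rank, different dimension: every label still has 2 indices,
# only the range of values each index can take changes.
rank2_in_3d = component_labels(rank=2, dimension=3)  # 3**2 components
rank2_in_4d = component_labels(rank=2, dimension=4)  # 4**2 components

# Same dimension, different rank.
rank0_in_4d = component_labels(rank=0, dimension=4)  # a scalar: one component
rank3_in_4d = component_labels(rank=3, dimension=4)  # 4**3 components
```

In general the number of components is the dimension raised to the power of the rank.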
In the above example, the tensor F has one upper and one lower index; there are ways to convert an upper into a lower index and vice versa, but I will not go into this here, nor will I explain how to manipulate tensors and their indices ( I will present these things in detail in a future post ). For now, suffice to say that tensors can be explicitly written in terms of their components, as matrices. For example, the Faraday tensor ( with both lower indices ) can be represented as the matrix
which provides the connection to the more traditional E and B fields of electromagnetism. Just as is the case with vectors and 1-forms, the mathematical form of each component will depend on the system of coordinates you choose, but the relationships between the components of the tensor do not. It is precisely these relationships between components which make a tensor a tensor; hence, all rank-2 tensors can be represented as matrices, but not all matrices are automatically tensors ! Mathematically, the distinction is in how these objects behave under coordinate transformations.
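A small numerical experiment illustrates what "behaving properly under coordinate transformations" means. The sketch below builds the lower-index Faraday components for a pure electric field, re-expresses them for an observer moving along x, and compares a coordinate-independent combination before and after. The assumptions are mine: units with c = 1, metric signature (-,+,+,+), the sign convention for the components (the one I believe Misner/Thorne/Wheeler use), and all function names.

```python
import math

# Minkowski metric, assumed signature (-,+,+,+); its inverse has the same entries.
ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def faraday_lower(E, B):
    """Components F_{ab} built from E and B (assumed sign convention)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, Bz, -By],
        [Ey, -Bz, 0.0, Bx],
        [Ez, By, -Bx, 0.0],
    ]


def boost_x(beta):
    """M[a][m] = d(old x^a) / d(new x^m) for an observer moving with speed beta along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, gamma * beta, 0.0, 0.0],
        [gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_lower(F, M):
    """F'_{mn} = M^a_m M^b_n F_{ab}: the rule for an object with two lower indices."""
    return [
        [sum(M[a][m] * M[b][n] * F[a][b] for a in range(4) for b in range(4))
         for n in range(4)]
        for m in range(4)
    ]


def invariant(F):
    """The scalar F_{ab} F^{ab}, with both indices raised using ETA."""
    return sum(
        F[a][b] * ETA[a][c] * ETA[b][d] * F[c][d]
        for a in range(4) for b in range(4) for c in range(4) for d in range(4)
    )


F_rest = faraday_lower(E=(0.0, 1.0, 0.0), B=(0.0, 0.0, 0.0))
F_moving = transform_lower(F_rest, boost_x(0.6))

E_y_moving = -F_moving[0][2]  # E_y is read off the 0-2 slot
B_z_moving = F_moving[1][2]   # B_z is read off the 1-2 slot
```

The individual components change (the moving observer finds a stronger E_y and a non-zero B_z, as in the story of the moving observer earlier), while `invariant` returns the same number in both frames. An arbitrary 4x4 array of numbers that is not transformed with this rule would not have that property.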
Here is another way to look at tensors, which some of you might find useful. If you think of the familiar concept of vectors, what we have is a set of numbers, one corresponding to each coordinate axis. In other words, a 4-vector ( which is a rank-1 tensor, i.e. a tensor with one index ) assigns a number to each coordinate axis in a chosen coordinate system. A rank-2 tensor is an object that assigns a number to each pair of coordinate axes, i.e. to each coordinate plane. A rank-3 tensor assigns a number to each triplet of coordinate axes, i.e. to each coordinate cube, and so on. For example, for a rank-2 tensor in three dimensions, we have a number for each coordinate pair, like so :
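Writing the coordinate axes as x, y and z, and using T as a placeholder name for the tensor (both are assumptions made for illustration), the nine numbers can be arranged as:

```latex
T_{ij} =
\begin{pmatrix}
T_{xx} & T_{xy} & T_{xz} \\
T_{yx} & T_{yy} & T_{yz} \\
T_{zx} & T_{zy} & T_{zz}
\end{pmatrix}
```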
Personally I prefer not to look at tensors in this way, but some people find it easier, which is why I am mentioning it here.
It is tempting to try and find an easy geometric visualisation for what tensors “are”, in the same manner as vectors can be visualised as arrows, or 1-forms can be visualised as families of surfaces. There are indeed a number of such visualisation schemes; however, I would strongly recommend resisting this temptation, as such schemes tend to be very complicated, and seldom provide any additional insight. In my opinion it is best to regard tensors simply as described above: little “machines” attached to each point in space-time, which take vectors and 1-forms as input, and produce real numbers ( or other tensors ) as a result. Furthermore, they do so without making reference to any specific set of coordinates, so physical laws written with tensors will have the same form for all observers, regardless of where/when they are, or how they move. That is what makes them so useful.
There is one specific tensor that is of great importance in physics, and which I feel needs to be explicitly mentioned here : the metric tensor. This tensor is a “machine” which is once again defined at each point in space-time; what it does is take two vectors as its input, and produces their scalar product as a result :
Or in component form :
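In assumed notation, with g for the metric tensor and u, v for the two input vectors (the symbols are illustrative), the abstract statement and its component form can be sketched as:

```latex
\begin{aligned}
\mathbf{g}(\mathbf{u}, \mathbf{v}) &= \mathbf{u} \cdot \mathbf{v} \\
                                   &= g_{\alpha\beta}\, u^{\alpha} v^{\beta}
\end{aligned}
```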
Since the scalar product is used to define measurements of lengths, angles, areas, volumes and so on, this tensor is of particular significance for the theory of general relativity, where it appears as a basic quantity. In general terms, the metric tensor enables us to define measurements on a manifold.
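As a minimal sketch of the metric "machine" at work, the code below uses the simplest possible case: flat space-time, with an assumed signature (-,+,+,+) and units where c = 1. The name `scalar_product` and the sample vectors are chosen for illustration only; in a curved space-time the entries of the metric would vary from point to point.

```python
import math

# Assumption: flat (Minkowski) space-time, signature (-,+,+,+), c = 1.
MINKOWSKI = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def scalar_product(g, u, v):
    """g(u, v) = g_{ab} u^a v^b, summed over both indices."""
    return sum(g[a][b] * u[a] * v[b] for a in range(4) for b in range(4))


# A purely spatial displacement: the metric returns the ordinary squared length.
d = [0.0, 3.0, 4.0, 0.0]
length = math.sqrt(scalar_product(MINKOWSKI, d, d))

# A 4-velocity for speed 0.6 along x: its scalar product with itself is -1
# in this signature, whatever the speed.
gamma = 1.0 / math.sqrt(1.0 - 0.6 ** 2)
u = [gamma, gamma * 0.6, 0.0, 0.0]
norm_u = scalar_product(MINKOWSKI, u, u)
```

The first case recovers the familiar Pythagorean length, and the second shows a measurement that only makes sense once time is included in the metric.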
So there you have it: a basic introduction to the concept of tensors. There is nothing mysterious about these quantities; they are in essence just a generalisation of the familiar notion of functions. In a future post I will explain in more detail what one can do with tensors, how they can be manipulated, and how they are used to formulate laws of physics ( we have seen one example already ).
For further reading, I cannot recommend the introductory chapters of Misner/Thorne/Wheeler “Gravitation” highly enough – a very thorough walk-through, and suitable even for interested amateurs.
Follow-on article : Quick and Dirty Tensor Calculus