
Quick And Dirty Tensor Calculus

Prerequisites : Tensors For Laypeople

In today’s article we will take a further look at the subject of tensors. If you have not already done so, please do read my “Tensors For Laypeople” post first, so that you have a general idea of what the concept of tensors is about. I will presume here that you are familiar with elementary calculus, both single and multivariate – if that is not the case, then I would urge you to study this subject first, as without calculus you are not ready to learn about tensors and the theory of relativity. This article will be somewhat more technical, and hence more “boring” for some of you, but I would ask you to please bear with me, as this is important groundwork which we need to cover before we can delve deeper into Special Relativity, and eventually onwards to General Relativity. As always, this is not a substitute for a detailed study of relevant textbooks – this article is not intended to make you proficient in tensor calculus; I am merely attempting to give an overview of important concepts. It is up to you to take that overview and use it as a guideline for further, more detailed study. You will not learn tensor calculus just by reading this; think of this article more as a quick reference.

Recall from my previous article what the concept of tensors is all about :

A tensor is a “machine” attached to each point in space-time, which takes as input a certain number of vectors and 1-forms, processes those, and produces an output. If the number and type of inputs matches the index structure of the tensor, the output is a real number, otherwise it is a new vector, 1-form, or tensor. In other words, tensors map an input into an output – they are (multi-)linear maps, and can also be thought of as a type of function.

The defining characteristic of tensors is that they do not depend on the system of coordinates which we choose – if we change our coordinate basis, the tensor will remain the same. When we say “the same” that does not mean that the components of the tensor will remain unchanged – that is not the case, as each individual component will indeed be a different expression after a change in coordinates. However, the overall tensor remains the same, in the sense that the relationships between components do not change; a change in coordinates means that we are looking at the same object ( the tensor ) from a different perspective. It’s like a donut at the edge of a table – look at it sideways, and all you see is a flat piece of dough; look at it top-down, and you see a torus with a hole in the middle. In both cases, you are looking at the same donut, just from different angles. Tensors are the same – change your coordinate basis, and you are looking at the same object from a different point of view; the component expressions look different, but their relationships within the tensor haven’t changed.

In physical terms, a change in coordinate basis means we change the observer that looks at an experimental setup. The invariance of tensors when changing observers means that, if we formulate laws of physics by using tensors, then these laws will have the same form for all observers. In particular, this means our laws will have the same form regardless of how you move, where and when you are, and possibly even regardless of whether there are sources of gravity about. This is of course very useful, since such a formalism captures the essence of the physics itself, rather than being about the observer. It is a very “pure” formalism, in that sense.

Now let us begin to look at the details. In terms of mathematical notation, a tensor is an object that carries along a certain number of indices, which can be either upper or lower indices :

\begin{equation*} \displaystyle{S{^{\alpha }}_{\beta \gamma }} \end{equation*}

Think of tensor indices as “slots”, being places where you can insert the input for our machine – upper slots take 1-forms, lower slots take vectors. In the above example, our tensor S would take a 1-form and two vectors ( three inputs in total ), in order to produce a real number. If you decide to leave one or more of the slots empty, then the result will not be a real number, but another tensor which contains exactly your “leftover” empty slots – for example, I could only fill the two lower slots above with vectors, and would be left with a new tensor that has exactly one upper index.

Remember that the total number of slots ( indices ) a tensor has is called its rank. A rank-0 tensor ( no indices at all ) is a scalar, and a rank-1 tensor is nothing other than a vector. Do note that we are talking about true scalars as well as 4-vectors here, i.e. quantities which are invariant under coordinate transformations in space-time. Not every vector is a rank-1 tensor, and not every real number is necessarily a scalar, i.e. a rank-0 tensor; it all depends on how they behave under coordinate transformations ! For example, ordinary 3-vectors such as the electric field E in classical physics are not tensors, because they are not invariant under changes in coordinate basis. So care must be taken when trying to figure out what is a tensor and what is not. Tensors of higher rank ( more than one index ) have no easy geometric visualization, but they can be represented by arrays of numbers of appropriate rank and dimensionality. For example, in four dimensions, a rank-2 tensor can be represented by a 4×4 matrix; a rank-3 tensor would be a 4×4×4 array, and so on, you get the idea. However, the same rule holds true here – while all tensors can be written as such arrays, not all arrays are automatically tensors; it depends on how they behave under changes in coordinates.
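
To make this concrete, here is a minimal sketch in Python ( using numpy, with purely hypothetical placeholder components ) of how tensor components of various ranks can be stored as arrays – always remembering that an array of components is only a tensor if it transforms correctly under changes of coordinates :

import numpy as np

# Hypothetical components, purely for illustration: in one chosen basis,
# the components of space-time objects can be stored as arrays.
scalar = np.float64(3.14)      # rank 0 : a single number
vector = np.zeros(4)           # rank 1 : 4 components
rank2  = np.zeros((4, 4))      # rank 2 : a 4x4 matrix of components
rank3  = np.zeros((4, 4, 4))   # rank 3 : a 4x4x4 array of components

# The number of array axes corresponds to the number of "slots", i.e. the rank:
print(rank3.ndim)   # -> 3

# Caveat: these arrays only hold components in one particular basis. Whether
# the object they describe is a *tensor* depends on how those components
# transform under a change of coordinates.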

So what does it mean to “fill a slot”, i.e. how do we pass an input to a tensor in order to produce an output ? The simple answer is – by multiplying the tensor with the 1-forms and vectors, like so :

\begin{equation*} \displaystyle{S\left ( \omega ,u,v \right )=S{^{\alpha }}_{\beta \gamma }\omega _\alpha u^\beta v^\gamma } \end{equation*}

The left hand side looks just like a function – the tensor S takes three arguments as input, which corresponds to our “slot” analogy; this way of writing things is called abstract ( index-free ) notation. The problem with this type of notation is that we cannot immediately see which of the slots are supposed to be 1-forms, and which ones are supposed to be vectors. The right hand side translates this into a different notation, called index notation – here we skip the brackets, and instead write upper and lower indices to denote places where 1-forms and vectors need to go. This notation is more explicit, in the sense that it shows exactly what we need to input into our tensor to obtain a certain result – index notation is therefore more common in physics, whereas the more abstract index-free notation is generally preferred by pure mathematicians.

Tensor Inputs. We let a tensor act ( “filling the slots” ) by multiplying the tensor with the inputs it needs to take in order to produce an output. This way, every component of our vectors and 1-forms gets multiplied by every relevant component of the tensor, and then summed up to obtain an overall result.

If you look at the above expression, you will notice that the position of the indices is reversed between what appears on the tensor, and what appears in the argument; the tensor carries an upper index for 1-forms ( which are then written with a lower index in the argument ), and a lower index for vectors ( which are then written with an upper index in the argument ). In other words, like indices form pairs – an upper index goes together with a like lower index, and vice versa. The appearance of index pairs, one upper and one lower, means we need to perform a summation :

(1)   \begin{equation*} \displaystyle{S^\alpha \omega _\alpha=\sum_{\alpha =0...3}S^\alpha \omega _\alpha =S^0\omega _0+S^1\omega _1+S^2\omega _2+S^3\omega _3 } \end{equation*}

The result is of course a real number, as it needs to be if we “fill” all slots. If there is more than one slot being filled, you perform more than one summation as well :

(2)   \begin{equation*} \displaystyle{S\left ( \omega ,u,v \right )=S{^{\alpha }}_{\beta \gamma }\omega _\alpha u^\beta v^\gamma =\sum_{\alpha =0...3}\sum_{\beta =0...3}\sum_{\gamma =0...3}S{^{\alpha }}_{\beta \gamma }\omega _\alpha u^\beta v^\gamma} \end{equation*}

Since these sums are linear, it does not really matter in which order you perform the summations, but in order not to get confused, it is usually best to do the summation “inside out”, i.e. start with the innermost sum, and work your way outwards.

Einstein Summation Convention. If pairs of like upper and lower indices appear within the same term, we need to sum over all possible values of these indices. The summation operator is not explicitly written.

I am not explicitly writing out the above sums, because, as you can already see, if there is more than one index involved, such summations become tedious and time consuming very quickly. This is also one of the reasons why tensors are such powerful devices – they allow us to write laws of physics in a very compact form which “hides” the nitty-gritty, and often ugly, computational details. After all, who wants to write down pages and pages of sums ?
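
If you want to see the summation convention in action, here is a small Python sketch ( numpy, with randomly chosen placeholder components ) which evaluates the triple sum of equation (2) once by explicit loops, and once in a single line :

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical component values, just to make the sums concrete:
S     = rng.normal(size=(4, 4, 4))   # components S^alpha_{beta gamma}
omega = rng.normal(size=4)           # 1-form components omega_alpha
u     = rng.normal(size=4)           # vector components u^beta
v     = rng.normal(size=4)           # vector components v^gamma

# Einstein summation written out as explicit loops, as in equation (2):
result = 0.0
for a in range(4):
    for b in range(4):
        for c in range(4):
            result += S[a, b, c] * omega[a] * u[b] * v[c]

# The same triple sum in one line -- repeated index labels are summed over:
result_einsum = np.einsum('abc,a,b,c->', S, omega, u, v)

print(np.isclose(result, result_einsum))   # True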

Before we go further, I need to warn you about a few things, which also allows me to introduce a couple of other concepts. The index structure of a tensor is not arbitrary – that means you cannot, in general, rearrange indices and expect everything to still work out for you. Firstly, the sequence of indices is not arbitrary for a general tensor :

(3)   \begin{equation*} \displaystyle{S_{\mu \nu }\neq S_{\nu \mu }} \end{equation*}

Tensors which are invariant under the action of swapping a pair of indices are called symmetric in that index pair. Tensors for which you are free to choose the sequence of all indices are called fully symmetric. We will find later that many of the important tensors in physics are rank-2 tensors, and that they are symmetric – just bear in mind that you cannot automatically assume that this symmetry is the case, unless this information is given to you, or you have proven this mathematically for a given tensor.

Apart from the sequence of indices, their position ( i.e. upper versus lower index ) is also not arbitrary, so you cannot just put indices up or down as you please :

(4)   \begin{equation*} \displaystyle{S^{\mu \nu }\neq S_{\mu \nu }\neq S{^{\mu }}_{\nu }\neq S{_{\mu }}^{\nu }} \end{equation*}

Again, there are special cases where a specific tensor’s contravariant components are numerically equal to its covariant ones, but in general this is not the case, so be careful.

Tensor Symmetries. A general tensor is not necessarily invariant under changes of its index structure; this means that in the general case, you cannot change the order or position of indices freely. However, tensors may possess certain symmetries which allow you to change order or position of indices, but this cannot be automatically assumed.
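
As a quick numerical illustration ( Python/numpy, with hypothetical random components ) – a generic rank-2 tensor is not equal to its own transpose, so symmetry really is an extra property that has to be established, not assumed :

import numpy as np

rng = np.random.default_rng(1)
S = rng.normal(size=(4, 4))   # hypothetical rank-2 components in some basis

# Swapping the two indices corresponds to transposing the component array.
# For generic components, S_{mu nu} != S_{nu mu}:
print(np.allclose(S, S.T))    # False -- symmetry must be proven, not assumed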

We will now introduce a very special tensor, which plays a crucial role in both maths and physics – it is called the metric tensor. This tensor allows us to define the concept of an inner product, which, in the case of real physical space-times, is also known as the scalar product. The metric tensor ( which is a rank-2 tensor ) is a machine defined at each point in space-time, which takes as input two vectors at that point, and produces as output their scalar product :

(5)   \begin{equation*} \displaystyle{\left \langle \vec{u},\vec{v} \right \rangle=\vec{u}\cdot \vec{v}=\mathbf{g}\left ( \vec{u},\vec{v} \right )=g_{\mu \nu }u^\mu v^\nu } \end{equation*}

The scalar product of vectors is important, because it allows us to define the angle between those vectors, as well as the area of the region spanned by them ( refer to textbooks on differential geometry for details ). If we insert the same vector into both slots of the tensor, the result is the squared length of that vector itself :

(6)   \begin{equation*} \displaystyle{\left | \vec{u} \right |^2=\mathbf{g}\left ( \vec{u},\vec{u} \right )=g_{\mu \nu }u^\mu u^\nu } \end{equation*}

Metric Tensor. The metric tensor is a rank-2 tensor which allows us to define measurements on a manifold. It takes two vectors as input, and produces their scalar product; if we pass the same vector into both slots, it produces that vector’s squared length.
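
Here is a minimal numerical sketch of the metric as a “machine” ( Python/numpy, with hypothetical vector components ); I am using the diagonal metric of flat space-time, which will be introduced properly further below :

import numpy as np

# The flat space-time metric, used here purely as a concrete example
# (it is introduced properly further below in equation (10)):
g = np.diag([-1.0, 1.0, 1.0, 1.0])

u = np.array([1.0, 0.5, 0.0, 0.0])   # hypothetical vector components
v = np.array([2.0, 0.0, 1.0, 0.0])

dot   = np.einsum('mn,m,n->', g, u, v)   # g_{mu nu} u^mu v^nu, equation (5)
norm2 = np.einsum('mn,m,n->', g, u, u)   # squared length,      equation (6)

print(dot, norm2)   # -2.0 -0.75

Note how the squared “length” can come out negative here – that is a peculiarity of space-time geometry, which we will have much more to say about in future articles.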

Formula (6) is a very powerful result, because it has far-reaching consequences. Suppose we have two arbitrary points on a (flat) manifold – we can connect these points by drawing a separation vector that points from one point to the other, and hence compute their distance as the length of that vector, as per formula (6). To make this concept as general as possible, suppose we now choose two points which are infinitesimally close together; their separation, written as the squared length of the vector connecting the two points, is then

(7)   \begin{equation*} \displaystyle{ds^2=g_{\mu \nu }dx^\mu dx^\nu } \end{equation*}

The symbol ds denotes the ( infinitesimal ) length of the separation vector, whereas x^{\mu} denotes our choice of coordinate system; that could be Cartesian coordinates

(8)   \begin{equation*} \displaystyle{\begin{pmatrix} x^0 &x^1  &x^2  &x^3 \end{pmatrix}=\begin{pmatrix} ct &x  &y  &z \end{pmatrix} } \end{equation*}

or any other coordinate system. The expression (7) is called the line element, and its meaning is just what it says on the tin – it is an infinitesimally small element of any curve, surface, volume etc etc in space-time. This allows us to measure the length/area/volume etc of whatever object we wish to examine, by integrating the line element over the relevant domain. For example, if we have a line segment C in space-time ( e.g. the world line of a particle between two events ), we can calculate the total length of that segment by evaluating the integral

(9)   \begin{equation*} \displaystyle{L=\int_{C}ds=\int_{C}\sqrt{g_{\mu \nu}dx^{\mu}dx^{\nu}}} \end{equation*}

which is just a standard line integral ( refer to any textbook on vector calculus or differential geometry if you are unsure of how to compute such integrals ).

Line Element. The line element – which is written in terms of the metric tensor – allows us to define infinitesimally small segments of curves/areas/volumes etc etc in space-time. We find the total length/area/volume etc etc of larger regions by integrating the line element. It is the line element which determines the geometry of measurements performed on a manifold.

So how do we know the components of the metric tensor, in order to perform the above calculations ? In the simplest case, the space-time we are working in is completely flat – such a space-time is the domain of Special Relativity, and it is called Minkowski space-time. The metric for Minkowski space-time is

(10)   \begin{equation*} \displaystyle{g_{\mu\nu}\equiv \eta _{\mu\nu} \equiv \begin{pmatrix} -1 &0 &0 &0 \\ 0&1 &0 &0 \\ 0&0 &1 &0 \\ 0&0 &0 &1 \end{pmatrix}} \end{equation*}

which gives us a line element of

(11)   \begin{equation*} \displaystyle{ds^2=-d(ct)^2+dx^2+dy^2+dz^2} \end{equation*}

in Cartesian coordinates. This is also called the Minkowski metric, and is specifically denoted with the symbol \eta_{\mu \nu}. The Minkowski metric tensor is symmetric, and its fully contravariant components are numerically equal to its fully covariant ones :

(12)   \begin{equation*} \displaystyle{\eta_{\mu\nu}=\eta_{\nu \mu}=\eta^{\mu \nu}=\eta^{\nu \mu}} \end{equation*}

Do be careful with mixed index positions, though – raising just one index of a metric always produces the Kronecker delta, \eta{^{\mu}}_{\nu}=\delta{^{\mu}}_{\nu}, i.e. the components of the identity matrix. Note also that the numerical equality of upper and lower components holds only for the Minkowski metric tensor in Cartesian coordinates, not in general for all metrics !

Minkowski Space-time. The flat space-time of Special Relativity is described by the Minkowski metric tensor (10). This tensor is symmetric in its indices, and raising or lowering both of its indices together leaves its components unchanged.
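
To see the line element and equation (9) at work, here is a small numerical sketch ( Python/numpy, units with c = 1, and a hypothetical world line of uniform motion at speed 0.6 ) which approximates the integral by summing over small steps :

import numpy as np

# Approximate the length of a world line, equation (9), by summing
# sqrt(|g_{mu nu} dx^mu dx^nu|) over many small coordinate steps.
# Assumptions: Minkowski metric, units with c = 1.
g = np.diag([-1.0, 1.0, 1.0, 1.0])

lam = np.linspace(0.0, 1.0, 10_001)               # curve parameter
# Hypothetical world line (t, x, y, z): uniform motion at speed 0.6 in x:
x = np.stack([lam, 0.6 * lam, 0.0 * lam, 0.0 * lam], axis=1)

dx  = np.diff(x, axis=0)                          # coordinate steps dx^mu
ds2 = np.einsum('mn,im,in->i', g, dx, dx)         # g_{mu nu} dx^mu dx^nu per step
L   = np.sum(np.sqrt(np.abs(ds2)))                # total length (proper time here)

print(L)   # ~0.8 = sqrt(1 - 0.6**2): the familiar time-dilation factor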

When, in the near future, we talk about General Relativity, we will discover that the metric tensor can have forms other than (10) – it can have components which are not constant, and components may appear outside the diagonal of the matrix. However, in the absence of gravity, the metric will be either (10), or something related to it by a mere change of coordinates ( i.e. the same flat geometry, just written in a different coordinate system ).

So, the physical meaning of a metric is that it allows us to define measurements. However, it is also a powerful mathematical tool to manipulate tensors, as we shall now discover. As it turns out ( I will skip the formal proof, which you can find in any textbook on differential geometry ), having a metric defined on a manifold allows us to turn vectors into 1-forms, and vice versa; a metric defines a duality between vectors and 1-forms. By extension, with the aid of the metric, we can manipulate specific slots of a given tensor so that it takes a vector instead of a 1-form, and vice versa; in other words, we can raise and lower indices on a given tensor with our metric. This is the first example of what is called index gymnastics, i.e. the art of manipulating tensor indices.

Lowering an index turns a vector into a 1-form :

\begin{equation*} \displaystyle{u_\alpha =g_{\alpha \beta }u^\beta } \end{equation*}

We start off with a vector ( u^{\beta} ), then multiply with the metric tensor in such a way that we eliminate one of the two indices by summing over it; the result is a 1-form with one left-over index, being u_{\alpha}. The vector u^{\beta} and the 1-form u_{\alpha} are dual to each other, and related via the metric.

Raising an index turns a 1-form into a vector :

\begin{equation*} \displaystyle{u^\alpha =g^{\alpha \beta }u_\beta } \end{equation*}

The principle is the same – you eliminate one index by summing over it, to be left with one index in the opposite position; note that raising an index uses the inverse metric g^{\alpha \beta }, whose components form the matrix inverse of g_{\alpha \beta }. Again, the metric defines the duality. One can perform these operations to raise and lower any index on any tensor; for example :

(13)   \begin{equation*} \displaystyle{S{^{\alpha }}_{\beta \gamma }=g_{\beta \mu }S{^{\alpha \mu }}_{\gamma }} \end{equation*}

Raising and Lowering. Tensor indices can be raised and lowered using the metric; this corresponds to transforming a vector into a 1-form, and vice versa. This operation does not change the rank of the tensor.
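
Numerically, lowering an index is just a matrix-vector multiplication with the metric components, and raising uses the inverse metric. A minimal sketch ( Python/numpy, Minkowski metric, hypothetical vector components ) :

import numpy as np

g     = np.diag([-1.0, 1.0, 1.0, 1.0])   # metric g_{mu nu} (Minkowski)
g_inv = np.linalg.inv(g)                  # inverse metric g^{mu nu}

u_up = np.array([1.0, 2.0, 0.0, 0.0])    # hypothetical vector components u^beta

u_down = np.einsum('ab,b->a', g, u_up)        # lower: u_alpha = g_{alpha beta} u^beta
u_back = np.einsum('ab,b->a', g_inv, u_down)  # raise again with g^{alpha beta}

print(u_down)                      # [-1.  2.  0.  0.] -- the time component flips sign
print(np.allclose(u_back, u_up))   # True: raising undoes lowering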

The next operation in index gymnastics we are now looking at is tensor contraction. This quite simply means that we set one or more pairs of upper and lower indices equal, and sum over them, like so :

(14)   \begin{equation*} \displaystyle{R_{\alpha \beta }=R{^{\mu }}_{\alpha \mu \beta }} \end{equation*}

Observe how the \mu index appears both up and down, so we sum over it, which eliminates it; the result is a new tensor with its rank lowered by two. Of particular interest are complete contractions, meaning we contract all indices to obtain a scalar :

(15)   \begin{equation*} \displaystyle{K=R^{\alpha \beta \gamma \delta }R_{\alpha \beta \gamma \delta }} \end{equation*}

The results of such complete contractions are scalar invariants of tensors, which have physical significance, as we will be learning in a future article.

Contraction. One can contract a tensor by setting upper and lower indices equal, and summing over them. This produces a new tensor of lower rank.
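
In terms of components, a contraction is nothing more than a summed trace over one index pair. A small sketch ( Python/numpy, random placeholder components; note that in a real calculation the paired indices must be one upper and one lower ) :

import numpy as np

rng = np.random.default_rng(2)
R = rng.normal(size=(4, 4, 4, 4))   # hypothetical components R^mu_{alpha nu beta}

# Contraction as in equation (14): set the 1st and 3rd index equal and sum.
# The repeated einsum label 'm' does exactly that:
R_contracted = np.einsum('mamb->ab', R)

print(R.ndim, '->', R_contracted.ndim)   # 4 -> 2 : rank lowered by two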

We have mentioned previously that tensors are “machines” defined at each point in space-time; if that is the case, it is reasonable to ask how tensors change from point to point. In other words, we are looking to generalize the concept of gradient – which we are familiar with from elementary vector calculus – to tensors. The tensor ( field ) gradient is defined as

(16)   \begin{equation*} \displaystyle{\bigtriangledown S\left ( \vec{u},\vec{v},\vec{w},\vec{\xi } \right )=\partial _{\vec{\xi }}\left ( S_{\alpha \beta \gamma }u^\alpha v^\beta w^\gamma  \right )=\left ( \frac{\partial S_{\alpha \beta \gamma }}{\partial x^\delta }\xi ^\delta  \right )u^\alpha v^\beta w^\gamma =S_{\alpha \beta \gamma |\delta }u^\alpha v^\beta w^\gamma \xi ^\delta } \end{equation*}

This looks awfully complicated, but all there is to this operation is that we add another slot onto our tensor, which takes as input the vector along which we wish to calculate the gradient. The result of this is a new tensor, with rank increased by one. Note the notation |\delta – the single vertical bar denotes the ordinary partial derivative ( contracting that slot with a specific vector \vec{\xi } then gives the directional derivative along it ). Caution : as it stands, the expression (16) is valid only in flat space-time !

Gradient. The gradient increases a tensor’s rank by one, and signifies how a tensor changes from point to point on a manifold.
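
As a purely numerical illustration ( Python/numpy, flat space-time and Cartesian coordinates assumed, with a hypothetical vector field sampled on a two-dimensional grid ), taking partial derivatives adds exactly one index :

import numpy as np

# Sample a hypothetical vector field u^alpha on a (t, x) grid and take
# partial derivatives; this adds one index, as in equation (16).
N = 50
t, x = np.meshgrid(np.linspace(0, 1, N), np.linspace(0, 1, N), indexing='ij')

u = np.zeros((N, N, 4))   # u^alpha at every grid point (y, z directions suppressed)
u[..., 0] = t**2          # hypothetical field components
u[..., 1] = t * x

h = 1.0 / (N - 1)         # grid spacing
# du holds d(u^alpha)/d(x^delta): one extra index delta, here running
# over the two grid directions (t, x) only.
du = np.stack(np.gradient(u, h, axis=(0, 1)), axis=-1)

print(u.shape, '->', du.shape)   # (50, 50, 4) -> (50, 50, 4, 2): rank +1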

The next concept we need to look at is that of divergence. In physical terms, the divergence measures whether there are any sources or sinks of the quantity in question within a small region; in other words, it measures if a quantity can be physically created or destroyed. For example, the field equation of Newtonian gravity is

(17)   \begin{equation*} \displaystyle{div \mathbf{g}=-4\pi G\rho } \end{equation*}

The physical meaning of this can be stated like so : the source of the gravitational field g is mass density ( times a constant ). That is just precisely the meaning of the divergence operator – it denotes sources or sinks of some quantity. In tensor calculus, the divergence is defined as

(18)   \begin{equation*} \displaystyle{\bigtriangledown \cdot S=S{^{\alpha }}_{\beta \gamma |\alpha }} \end{equation*}

This just means you first calculate the gradient, then you contract the gradient’s slot with one of the original slots of the tensor. The result is a tensor of the same rank as the original one. This operation is very useful when we want to describe conservation laws in physics; as a case in point, let me state ( without proof or derivation for now ) that there is a tensor called the stress-energy-momentum tensor, denoted T^{\mu \nu}. It measures the fluxes of energy and momentum in space-time, and as we all know from our high school physics days, energy and momentum cannot be created or destroyed, they can only change form. Using tensor calculus, we can write this conservation law as

(19)   \begin{equation*} \displaystyle{T{^{\mu \nu }}_{|\nu }=0} \end{equation*}

The divergence of the energy-momentum tensor is zero, meaning there are no sources or sinks of energy. It cannot be created or destroyed. Neat, isn’t it ?

Divergence. The divergence measures sources or sinks of a quantity in space-time; applied to tensors, it signifies how the quantity described by the tensor is created or destroyed.
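
Continuing the grid sketch from above ( Python/numpy, flat space-time, with a hypothetical and deliberately source-free vector field ), for the simplest case of a vector field the recipe of (18) reads u^{\alpha }{}_{|\alpha } – take the gradient, then contract the derivative slot with the vector index :

import numpy as np

N = 50
h = 1.0 / (N - 1)
t, x = np.meshgrid(np.linspace(0, 1, N), np.linspace(0, 1, N), indexing='ij')

u = np.zeros((N, N, 2))   # only the (t, x) components u^0, u^1, for brevity
u[..., 0] = x             # hypothetical field: u^0 depends only on x,
u[..., 1] = np.sin(t)     # u^1 depends only on t -> divergence vanishes

du_dt, du_dx = np.gradient(u, h, axis=(0, 1))
divergence = du_dt[..., 0] + du_dx[..., 1]    # u^alpha_{|alpha}

print(np.allclose(divergence, 0.0, atol=1e-12))   # True: no sources or sinks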

Next, I am going to introduce another important object, which you will often encounter in physics and maths, the Levi-Civita symbol \epsilon_{\alpha \beta \gamma \delta} . This is a completely antisymmetric, rank-4 tensor-like quantity, which finds application mainly as a mathematical tool for all manner of nifty little tricks when doing index gymnastics. Note that I say tensor-like quantity, not tensor – this object is technically not a tensor, but something called a tensor density, which is a slightly different class of objects that we will not discuss in this article. The number of indices on this object is equal to the dimensionality of the space it is defined in, meaning it has four indices in space-time, and all of its components are either equal to +1, -1, or 0. It is actually kind of tricky to define the components of this symbol in a simple way, which is why it is usually done according to how its indices behave :

  • \epsilon_{\alpha \beta \gamma \delta} changes sign when any two indices are interchanged
  • \epsilon_{\alpha \beta \gamma \delta}=0, unless all indices are different
  • \epsilon_{\alpha \beta \gamma \delta}=+1 for even permutations of (0,1,2,3)
  • \epsilon_{\alpha \beta \gamma \delta}=-1 for odd permutations of (0,1,2,3)
  • \epsilon_{\alpha \beta \gamma \delta}=-\epsilon^{\alpha \beta \gamma \delta}

This uniquely determines all components of this object, though figuring out the value of a specific, given component can sometimes be a bit of a pain. The Levi-Civita symbol has no direct physical interpretation, it is best understood as a mathematical device; in particular, it allows the definition of a new operator that can act on tensors, the Hodge dual \star, also called the star operator. This operator maps a tensor into a new tensor of different rank, such that the combined ranks of the original tensor and the new tensor must equal the dimension of the space we are working in ( i.e. it must equal four in space-time ). It also changes lower indices into upper ones, and vice versa :

  • The Hodge dual of a vector is a rank-3 tensor : \star S_{\alpha \beta \gamma}=S^{\mu}\epsilon_{\mu \alpha \beta \gamma}
  • The Hodge dual of a rank-2 tensor is again a rank-2 tensor : \star S_{\alpha \beta }=\frac{1}{2}S^{\mu \nu }\epsilon _{\mu \nu \alpha \beta }
  • The Hodge dual of a rank-3 tensor is a vector : \star S_\alpha =\frac{1}{3!}S^{\lambda \mu \nu }\epsilon _{\lambda \mu \nu \alpha }

If the tensor in question is completely antisymmetric, then S and \star S contain precisely the same information.
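
Since the component values of the Levi-Civita symbol follow entirely from the permutation rules above, they are easy to generate by machine. Here is a sketch ( Python/numpy, flat-space index bookkeeping only, with hypothetical components for the dual example ) which builds \epsilon_{\alpha \beta \gamma \delta} and then forms the Hodge dual of an antisymmetric rank-2 tensor, as per the second bullet point above :

import numpy as np
from itertools import permutations

# Build the Levi-Civita symbol in four dimensions:
# +1 / -1 for even / odd permutations of (0,1,2,3), zero otherwise.
eps = np.zeros((4, 4, 4, 4))
for perm in permutations(range(4)):
    # count inversions to determine the parity of the permutation
    inversions = sum(p > q for i, p in enumerate(perm) for q in perm[i + 1:])
    eps[perm] = (-1) ** inversions

print(eps[0, 1, 2, 3], eps[1, 0, 2, 3], eps[0, 0, 2, 3])   # 1.0 -1.0 0.0

# Hodge dual of an antisymmetric rank-2 tensor, as an index-bookkeeping
# illustration with hypothetical components:
rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4))
S_up = A - A.T                                    # antisymmetric S^{mu nu}
S_dual = 0.5 * np.einsum('mn,mnab->ab', S_up, eps)

print(np.allclose(S_dual, -S_dual.T))             # True: the dual is antisymmetric too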

Before we wrap things up, let me briefly mention that it is possible to symmetrize a tensor that was not originally symmetric. This is done by averaging over all possible permutations of its indices. For rank-2 tensors, this is done like so :

(20)   \begin{equation*} \displaystyle{V_{(\mu \nu)}=\frac{1}{2}\left ( V_{\mu \nu}+V_{\nu \mu} \right )} \end{equation*}

wherein the () around the indices denotes symmetrization. One can also define anti-symmetrization as follows :

(21)   \begin{equation*} \displaystyle{V_{[\mu \nu]}=\frac{1}{2}\left ( V_{\mu \nu}-V_{\nu \mu} \right )} \end{equation*}

wherein [] denotes anti-symmetrization. For rank-2 tensors ( and only for rank-2 tensors ), any arbitrary tensor can always be split into a symmetric and an anti-symmetric part :

(22)   \begin{equation*} \displaystyle{V_{\mu \nu}=V_{(\mu \nu)}+V_{[\mu \nu]}} \end{equation*}
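
A quick numerical check of the decomposition (22) ( Python/numpy, random placeholder components ) :

import numpy as np

rng = np.random.default_rng(4)
V = rng.normal(size=(4, 4))     # hypothetical rank-2 components

V_sym  = 0.5 * (V + V.T)        # V_(mu nu), equation (20)
V_anti = 0.5 * (V - V.T)        # V_[mu nu], equation (21)

print(np.allclose(V, V_sym + V_anti))   # True: equation (22)
print(np.allclose(V_sym, V_sym.T), np.allclose(V_anti, -V_anti.T))   # True True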

I just thought I would mention this notation, as you might occasionally come across it.


So this is our quick and dirty foray into tensor calculus. Once again, it must be pointed out that this article is not a substitute for learning tensor calculus the proper way, i.e. from an established textbook. I will use this blog entry for future reference, and may occasionally expand it a little, since there are other operations on tensors which I have not yet discussed. But for now, my head is fried from typing all of these formulas, and yours is probably too after reading them. While you might not get all of the mathematical details, I do hope that you are getting the general ideas behind the concepts, so that we can put them to good use in future articles.


Recommended further reading : Introduction to Tensor Calculus
