Prerequisites : Multivariate Calculus
Recommended : Quick And Dirty Tensor Calculus
Before we progress onwards to General Relativity, it will be helpful to introduce you to a couple of concepts from differential geometry – this is a discipline of mathematics which studies manifolds and their geometric properties, using the tools of calculus. Differential geometry is the natural “language” in which the entire theory of relativity is formulated, so knowledge of some fundamental concepts will be helpful as we go on.
This is not a maths text – the purpose of this article is only to present the concepts in question so far as they pertain to their application in physics, not to motivate, prove, or teach them in their entirety. That is left to one of the many mathematical textbooks on the subject. I assume you are familiar with elementary ( multivariate ) calculus – if that is not the case, you are advised to go and study calculus first, before attempting to learn about General Relativity.
Everything we will be talking about here has to do with manifolds – this is a mathematical concept which refers to a space that looks like the good, old, familiar Euclidean space in the immediate ( =infinitesimal ) neighbourhood of each point. A manifold is hence a collection of points; examples are a sphere, a flat plane, a ball, a torus, and so on. Space-time as we are using it in physics is also modeled as a manifold. We will restrict our attention here to manifolds which are everywhere smooth and continuous ( i.e. no edges, boundaries, discontinuities etc etc ), and hence everywhere differentiable. This just means we are able to consistently define the operation of differentiation at each and every point of our manifold.
Differentiable Manifolds. A manifold is a topological space that resembles Euclidean space in the immediate neighbourhood of each point. A manifold is differentiable if it is everywhere smooth and continuous, so that one can define a differentiation operator in a consistent manner.
Before we go on, I need to explicitly stress the importance of the above definition, because it will play a crucial role in physics – a manifold is a space that looks Euclidean in the neighbourhood of each point. What does this mean ? Imagine you are approaching planet Earth in a rocket ship – when you are far away, the planet unsurprisingly looks like a globe. As you approach, the globe becomes bigger and bigger in your field of vision, and the “curving” of it against the backdrop of space seems to become less. As you enter the atmosphere and descend further, the curvature of the horizon becomes less still, and as you land, leave your rocket, and look around you at the horizon, the curvature has become invisible – within the small vicinity of your rocket ship, and even as far as the horizon you can see, the Earth now looks perfectly flat  :
This is a general principle – every manifold will look locally flat and Euclidean, so long as you “zoom in” far enough to a small region, even if globally it has a different geometry. This is also true for space-time : if you look at a very small region of space for a very short period of time, then this local region will seem flat ( strictly speaking, it looks like the flat Minkowski space-time of Special Relativity ), even if the global geometry is not.
Local vs Global. While every small enough local region of a manifold resembles Euclidean space, the reverse is not true : the fact that we detect a Euclidean geometry in a very small – and therefore local – patch does not allow us to draw any conclusions as to the global ( =large-scale ) geometry of the manifold.
Let’s turn our attention now towards how to actually describe the geometry of a manifold in mathematical terms. It all starts with the simple notion of differentiation. If you think back to how you were introduced to differentiation back in your high school / college days, you will remember that taking the derivative of a function at a point meant putting a tangent onto the function’s plot at that point – the numerical value of the derivative was then just the slope of the tangent. Without giving the ( quite formal ) proof, we are elevating this to a general principle :
Tangents. Taking the derivative at a point on a manifold is equivalent to putting a tangent vector onto the manifold at that point. Derivatives are conceptually equivalent to infinitesimally short tangent vectors.
In anything more than one dimension, a tangent vector has not just magnitude, but also direction; one can therefore take the derivative in a given direction, or one can define the set of all possible derivatives at a point, i.e. all derivatives in all directions. This set is called the tangent space at a point p on a manifold M, and is usually denoted $T_p M$.
Tangent Spaces. The tangent space at a given point on a manifold is the set of all possible tangent vectors there.
For a flat plane, the tangent space is the same at every point, and is just the flat plane itself – no matter where and in what direction you differentiate, the tangent vector will always lie in the original plane itself, so the manifold and its tangent space coincide. However, for anything other than a flat plane, this is not generally true – the tangent space at a given point will not coincide with the manifold. For example, in the case of a sphere, the tangent space at a given point looks like this  :
So it is a plane which touches the sphere at exactly one point, and the plane is spanned by the set of all possible tangent vectors to the sphere at that point  :
If our manifold itself is a flat plane, then everything is easy and perfectly consistent – no matter where we take the derivative, the operation is always well defined, always works the exact same way, and always ends up within the same tangent space, at each and every point. You can translate the origin of your coordinate system to anywhere within your flat plane, and the derivative taken at a given point will always remain the same. This is just the standard Euclidean geometry that is tacitly assumed when you learn calculus at school; you do not have to worry about where you take the derivative, since the operation is the same everywhere. However, this is true only on flat manifolds; even in the very simple case of a sphere there is already a problem. Observe  :
You get different tangent spaces at different points on the sphere – that means that if you take derivatives at different points on our manifold, the resulting tangent vectors are not the same  :
Everything changes if you go from one point to another point on such manifolds – the tangent p changes into p’, which don’t coincide; the same is true for the normal vector k, the angles between the vectors, as well as the area of the parallelogram spanned by them. This is an awkward problem, because it means that taking the derivative is now no longer defined in a consistent manner – the result of the operation depends on where on the manifold we perform it, even if it is performed on the same object ( such as a function for example ). This is an entirely unsatisfactory situation.
Inconsistency of Ordinary Derivative. On manifolds which are not flat, the ordinary derivative is not consistently defined. Taking the same derivative at different points of the manifold will yield different results; tangent spaces at those points do not coincide.
Equally problematic is a particular consequence of this failure of the derivative operator to be consistent, which is best seen when we attempt to transport a tangent vector along a closed curve on our manifold. Let’s say we start at some point, pick a tangent vector at that point, and then transport that vector along some closed curve that starts and ends at our original point; let’s also say we attempt to remain consistent by transporting our vector in such a way that it remains always parallel. This is easier seen than explained  :
So we start with some tangent vector, transport it along segments (1), (2) and (3) such that the vectors remain parallel to one another – and we end up with a vector that points in a different direction than the one we started off with, even though it is transported along a closed curve back to the same point ! To get a better intuition for this, I recommend that you play around with this Wolfram Demonstrations project a bit; you will soon get a feel for how tangent vectors change as you transport them around a manifold.
Parallel Transport. Tangent vectors can be transported on a manifold in such a way that they remain parallel from each point to the infinitesimally neighbouring one. This is called parallel transport. If this operation is performed on a manifold that is not flat, the original vector and the transported vector will in general not coincide if brought back to the same point along a closed trajectory.
I think you have an intuitive – if not yet mathematical – understanding of the issue at hand. We now need to find some way to fix this, meaning we need to find some way to make the notion of “derivative” consistent, in the sense that we would like to obtain the same result wherever we perform differentiation on our manifold. More geometrically speaking, we would like to obtain the same tangent vector, regardless of where on our manifold we perform a particular, given differentiation operation. If we shift around the tangent plane on the surface of a sphere ( as shown in the graphic below ), the result is a rotation of the basis vectors of that tangent plane; what we would like to do is find some way to connect tangent spaces at different points in such a way that these rotations are fully accounted for. The mathematical object that does this is unsurprisingly called a connection – it connects tangent spaces at different points  :
Connections. A connection is a mathematical object which connects tangent spaces at different points on a manifold, in such a way that all changes to the basis vectors of the tangent spaces are fully accounted for.
Notationally, a connection is characterised by a matrix of coordinate-dependent expressions, which account for the changes a coordinate basis undergoes if you shift its origin from one point to another along some direction; it is denoted by the Greek capital letter Γ ( Gamma ).
This article is specifically intended to provide the reader with mathematical tools to be used in the context of the Theory of Relativity. I will mention here that in general, there are infinitely many ways to connect tangent spaces on a manifold; however, in the theory of relativity, a very specific connection object is used, the so-called Levi-Civita connection. The distinguishing characteristics of this particular connection are that it describes the connection between tangent spaces in terms of curvature ( to be introduced shortly ) only, with no torsion, and that it preserves the metric. Don’t worry too much about what exactly this means – I refer you to any textbook on differential geometry for details. For now, the important fact is that the mathematical form of the connection depends only on the metric and its derivatives; i.e. the aforementioned matrix of expressions which characterises the connection ( the connection coefficients ) can be computed directly from the metric via the following formula :
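Written out in index notation, and assuming the usual symbols ( $g_{ij}$ for the metric, $g^{il}$ for its inverse, summation over repeated indices ), the standard expression for the Levi-Civita connection coefficients is :

```latex
\Gamma^{i}{}_{jk} = \frac{1}{2}\, g^{il} \left( \frac{\partial g_{lj}}{\partial x^{k}} + \frac{\partial g_{lk}}{\partial x^{j}} - \frac{\partial g_{jk}}{\partial x^{l}} \right)
```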
These connection coefficients are also called Christoffel symbols. These symbols are symmetric in the lower two indices, but they are not tensors, even if they look like it.
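To see how connection coefficients follow from a metric, here is a minimal Python sketch that evaluates the standard Levi-Civita formula numerically for the unit 2-sphere with coordinates ( θ, φ ). The function names, the restriction to diagonal metrics, and the finite-difference step size are my own assumptions for illustration; none of them come from a library :

```python
import math

def sphere_metric(x):
    """Diagonal entries (g_theta_theta, g_phi_phi) of the unit 2-sphere."""
    theta, _phi = x
    return [1.0, math.sin(theta) ** 2]

def christoffel(metric, x, h=1e-5):
    """Return G with G[i][j][k] = Gamma^i_{jk} at the point x.

    Assumption: the metric is diagonal, so metric(x) returns only the
    diagonal entries g_ii and the inverse metric is simply 1 / g_ii.
    """
    n = len(x)
    g = metric(x)
    dg = []  # dg[l][a] = d(g_aa)/dx^l, by central differences
    for l in range(n):
        xp, xm = list(x), list(x)
        xp[l] += h
        xm[l] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([(gp[a] - gm[a]) / (2 * h) for a in range(n)])

    def d(l, a, b):
        # partial_l g_{ab}; off-diagonal entries vanish by assumption
        return dg[l][a] if a == b else 0.0

    # Gamma^i_{jk} = (1 / (2 g_ii)) (d_j g_ik + d_k g_ij - d_i g_jk)
    return [[[(d(j, i, k) + d(k, i, j) - d(i, j, k)) / (2 * g[i])
              for k in range(n)] for j in range(n)] for i in range(n)]

theta = 1.0
G = christoffel(sphere_metric, [theta, 0.3])
print("Gamma^theta_phi_phi :", G[0][1][1], "vs", -math.sin(theta) * math.cos(theta))
print("Gamma^phi_theta_phi :", G[1][0][1], "vs", math.cos(theta) / math.sin(theta))
```

For the sphere, the only non-vanishing symbols should be Γ^θ_φφ = −sin θ cos θ and Γ^φ_θφ = Γ^φ_φθ = cot θ; the sketch prints the numerical values next to these closed forms, and the symmetry in the lower two indices shows up as G[1][0][1] being equal to G[1][1][0].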
Recall that the ordinary ( directional ) derivative of a vector ( call it A ) with respect to the k-th coordinate is written as
$$A^{i}{}_{|k} = \frac{\partial A^{i}}{\partial x^{k}}$$
with a single bar “|”. The ordinary derivative suffers from the issues we have discussed earlier; using the connection, we therefore define a new operation, called the covariant derivative, as follows ( for a vector and a rank-2 tensor ) :
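In the widely used nabla notation, the standard definitions read as below. The symbol $\nabla_k$, the vector name $A^i$, and the choice of a contravariant rank-2 tensor $T^{ij}$ are assumptions made for this sketch; other texts write the same operation with a semicolon or a double bar :

```latex
\nabla_{k} A^{i} = \frac{\partial A^{i}}{\partial x^{k}} + \Gamma^{i}{}_{jk}\, A^{j}
\qquad\qquad
\nabla_{k} T^{ij} = \frac{\partial T^{ij}}{\partial x^{k}} + \Gamma^{i}{}_{mk}\, T^{mj} + \Gamma^{j}{}_{mk}\, T^{im}
```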
This is just the ordinary directional derivative along $x^k$, plus a sum of some extra terms; it is precisely those extra terms – which arise from the connection – which account for the changes brought about by going from one point to another on a manifold that isn’t flat. Taking the covariant derivative raises the rank of our vector by one, turning it into a rank-2 tensor; while the Christoffel symbols are not themselves tensors, the full covariant derivative is a tensor, and hence is valid regardless of the choice of coordinate basis.
Let us now write things a bit differently, in terms of coordinate differentials; if we parallel-transport a vector by an infinitesimal distance dx, it will change by the amount
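With the sign convention in which the covariant derivative is the ordinary derivative plus the connection terms, and with $A^i$ as an assumed name for the transported vector, this change is :

```latex
\delta A^{i} = -\,\Gamma^{i}{}_{jk}\, A^{j}\, dx^{k}
```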
This means that the Christoffel symbols denote the change in a vector as it is parallel transported by an infinitesimal distance. On the other hand, the differential of the vector field itself behaves as follows under parallel transport :
Covariant Derivative. The covariant derivative takes the parallel-transported ordinary derivative, and subtracts from it whatever has changed due to the geometry of the manifold, to arrive at a single, consistent concept of differentiation, which is valid everywhere on the manifold. Unlike the Christoffel symbols, the covariant derivative is a tensor.
To visualise this, consider  :
As you can see, despite the changes in the coordinate basis when going from one point to another on our manifold, the covariant derivative of the original vector remains the same. It does so because the extra terms introduced by the connection compensate for any changes that occur from point to point on the manifold. If the manifold happens to be completely flat everywhere, one can choose Cartesian coordinates in which the Christoffel symbols all vanish identically, leaving just the ordinary derivative, as expected. But, on a manifold that isn’t flat, the covariant derivative will behave differently from the ordinary version. Thus, if you are working on such a manifold, there are a few things you must be careful about :
- All ordinary derivatives have to be replaced by covariant derivatives
- Covariant derivatives do not, in general, commute, so the order of terms is important
- Be careful with your index gymnastics – it is very easy to confuse and mix up indices !
Let us consider a special case of parallel transport : suppose we have an airplane pilot flying from Los Angeles to London. He will start off from LA in a north-easterly direction; as the flight progresses, his compass will gradually swing around, and indicate less and less northerly movement, the further east he gets. Eventually he will fly into London on a south-easterly course. Interestingly though, an accelerometer carried along on the flight will have recorded no deviations from zero whatsoever – so far as the accelerometer is concerned, the flight has proceeded along a straight path :
We know intuitively why this is so – if we draw the flight path of the plane onto a desktop globe using a pen, the pen will always proceed “straight ahead”, and never deviate in any direction; and yet, the result is not a straight line, but a circle segment. That is because the Earth is a globe, and not a flat plane. Mathematically speaking, going “straight” in this case means that we are always proceeding in the same direction as a very specific tangent vector to the Earth’s surface, the tangent taken in the direction of the flight path. To put it differently, as we are proceeding along our flight path, we are parallel-transporting our own tangent vector. Remembering that the surface of the Earth is taken as a 2-dimensional manifold, if we are parallel-transporting our very own tangent vector, we are not experiencing any deviation whatsoever from being on a “straight line”, the directional derivative of which is zero. But, because we are on a manifold that isn’t flat, we need to use our newly discovered covariant derivative, and we write ( for the vector that is tangent to both the curve and the Earth ) :
This is called the geodesic equation, and its solutions are all those curves on a manifold which parallel-transport their own tangent vectors along themselves. Such curves are called geodesics, and in real-world terms they are extremal curves between given points, i.e. they are either the shortest or longest connection ( depending on the geometry of the manifold ) between points.
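In symbols, writing $u$ for the tangent vector and $\lambda$ for an affine parameter along the curve ( both names are my assumption ), the condition that the tangent vector is parallel-transported along itself reads, first in abstract form and then in coordinates :

```latex
\nabla_{u} u = 0
\qquad \Longleftrightarrow \qquad
\frac{d^{2} x^{i}}{d\lambda^{2}} + \Gamma^{i}{}_{jk}\, \frac{dx^{j}}{d\lambda}\, \frac{dx^{k}}{d\lambda} = 0
```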
Geodesics. Geodesics are curves that parallel-transport their own tangent vectors along themselves. They are either the shortest or the longest connection between points on a manifold, and they represent the “straightest” possible route one can take.
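The flight from Los Angeles to London can be checked with a few lines of Python. The sketch below treats the Earth as a perfect sphere and uses approximate city coordinates, both of which are my assumptions; the bearing formula is the standard great-circle one, and the function names are purely illustrative :

```python
import math

def initial_bearing(lat1, lon1, lat2, lon2):
    """Compass bearing (degrees clockwise from north) at point 1 of the
    great circle that leads to point 2. All inputs are in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(p2)
    x = (math.cos(p1) * math.sin(p2)
         - math.sin(p1) * math.cos(p2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0

def final_bearing(lat1, lon1, lat2, lon2):
    """Bearing on arrival at point 2: the reverse of the bearing that
    leads from point 2 back to point 1."""
    return (initial_bearing(lat2, lon2, lat1, lon1) + 180.0) % 360.0

# Approximate coordinates (latitude, longitude), assumed for illustration
LA = (34.05, -118.24)
LONDON = (51.51, -0.13)

start = initial_bearing(*LA, *LONDON)
end = final_bearing(*LA, *LONDON)
print(f"bearing when leaving LA      : {start:.1f} degrees")
print(f"bearing when reaching London : {end:.1f} degrees")
```

The departure bearing should come out between 0° and 90° ( north-easterly ), and the arrival bearing between 90° and 180° ( south-easterly ), even though the pilot never turns : the compass heading changes only because the geodesic cuts the meridians at changing angles.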
So what exactly are these “changes” that are introduced by the geometry of manifolds which aren’t flat ? What does flatness even mean ? Is a cylinder flat ? Is a torus flat ? A sphere ? How do we mathematically describe flatness, or deviations from it, in a general way ?
To answer this question, take a look again at the case of a vector parallel-transported along some closed curve on a manifold; the total change in the vector when doing so is given by equation (5) above. Clearly, if the manifold was flat, we would not expect there to be any change at all – the resulting parallel-transported vector should coincide with the original vector we started off with. Therefore, we can mathematically describe flatness by the absence of any changes during parallel-transport of a vector along some closed curve C :
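In symbols, and assuming that a vector $A^i$ ( name assumed ) changes by $\delta A^i = -\Gamma^i{}_{jk} A^j dx^k$ under an infinitesimal parallel transport, the flatness condition can be sketched as the vanishing of the accumulated change around the loop :

```latex
\oint_{C} \Gamma^{i}{}_{jk}\, A^{j}\, dx^{k} = 0
```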
Flatness. A manifold is flat, if a vector that is parallel-transported along a closed curve anywhere on the manifold does not undergo any change, regardless of the specifics of the curve ( so long as it is closed ), and regardless of where one performs this experiment. Relation (9) then holds, which is equivalent to saying that the ordinary derivative applies everywhere on our manifold.
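This experiment is easy to try numerically. The sketch below parallel-transports a vector once around a circle of constant latitude on the unit sphere, by integrating the transport equation dV^i/dφ = −Γ^i_φk V^k with a hand-written Runge-Kutta step. The function name, the step count, and the choice of curve are my own assumptions; the Christoffel symbols used are the standard ones for the sphere :

```python
import math

def transport_around_latitude(theta0, v_theta, v_phi, steps=2000):
    """Parallel-transport (V^theta, V^phi) once around the closed curve
    theta = theta0, phi = 0 .. 2*pi on the unit sphere.

    theta0 is the colatitude (angle measured from the north pole)."""
    gamma_t_pp = -math.sin(theta0) * math.cos(theta0)  # Gamma^theta_{phi phi}
    gamma_p_tp = math.cos(theta0) / math.sin(theta0)   # Gamma^phi_{theta phi}

    def rhs(v):
        vt, vp = v
        # dV^theta/dphi = -Gamma^theta_{phi phi} V^phi
        # dV^phi/dphi   = -Gamma^phi_{theta phi} V^theta
        return (-gamma_t_pp * vp, -gamma_p_tp * vt)

    h = 2.0 * math.pi / steps
    v = (v_theta, v_phi)
    for _ in range(steps):  # classical 4th-order Runge-Kutta
        k1 = rhs(v)
        k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
        k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
        k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return v

# Start with a unit vector pointing south, at 60 degrees from the pole
print(transport_around_latitude(math.pi / 3, 1.0, 0.0))
```

Relative to the local basis, the transported vector rotates by an angle of 2π cos θ₀ over one full loop. At θ₀ = 60° that is half a turn, so the vector should come back pointing in the opposite direction, which is about as far from “no change” as it gets. On the equator ( θ₀ = 90° ), which is itself a geodesic, the same function returns the vector unchanged.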
But what about the case when this closed-loop integral does not vanish ?
In this case, a vector parallel-transported around a closed loop will not coincide with the original vector; the change between original and parallel-transported vectors is
Line integrals – though very intuitive in terms of meaning – are unwieldy to work with, so we now seek a differential formulation by making the loop C arbitrarily small; expression (11) then signifies the difference between vectors – which is again a vector – at the same point. Because the difference of vectors already involves three indices, and we need to be left with another free index ( a vector ), the evaluation of (11) must yield an expression involving a rank-4 tensor :
The rank-4 object in this expression, conventionally written $R^i{}_{jkl}$, is called the Riemann curvature tensor. It describes how a vector changes when you parallel-transport it around a closed infinitesimal curve that encloses an infinitesimal surface element. To make things explicitly clear, remember that tensors are little machines which take an input and produce an output. The Riemann tensor is of rank-4, but to understand what it does, we fill only three of its four slots, which leaves us with a vector as a result. To understand the meaning of that vector, consider the following  :
If, on a manifold that is not flat, we transport a vector along two different paths vw and wv, the result will in general not be the same. We therefore pass three arguments to the Riemann tensor ( the path, characterised by two vectors, as well as the vector we wish to transport on that path ), and obtain as a result the difference vector due to the inherent curvature of the manifold. The Riemann tensor therefore measures the degree by which covariant derivatives do not commute, and we can write the above symbolically as
The index i refers to the transported vector, v and w refer to the two different paths it can take, and m labels the components of the resulting difference vector. So, we pass three vectors as input, and obtain a new vector as output.
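One standard way of writing this, with $A$ as an assumed name for the transported vector and with the index placement of the common Misner/Thorne/Wheeler sign convention ( also an assumption on my part ), is :

```latex
\left( \nabla_{v} \nabla_{w} - \nabla_{w} \nabla_{v} \right) A^{m} = R^{m}{}_{ivw}\, A^{i}
```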
Riemann Curvature Tensor. This tensor quantifies the failure of the covariant derivative to commute; this is a measure of the intrinsic curvature of a manifold. The Riemann tensor completely specifies all aspects of the local geometry on a manifold.
So, the Riemann tensor gives the difference between a vector and the result of transporting it along a small, closed curve. In 4-dimensional space-time, each of its four indices can take four values; this gives a total of 4x4x4x4 = 256 components for the Riemann tensor. In reality though, it turns out that this tensor has a number of important symmetries ( more on that in a future article ), which reduces the number of independent components – only 20 out of the 256 components are functionally independent in four dimensions. The components themselves are completely determined by the connection and its derivatives :
Curvature. The curvature of a manifold is completely characterised by the Riemann tensor, and arises from the connection. It is one of the principal invariants of the connection. If the Riemann tensor vanishes, the manifold is flat.
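In a coordinate basis, and in the sign convention used by Misner/Thorne/Wheeler ( other books differ by an overall sign or by the ordering of the lower indices, so treat the index placement as an assumption ), the expression is :

```latex
R^{i}{}_{jkl} = \frac{\partial \Gamma^{i}{}_{jl}}{\partial x^{k}} - \frac{\partial \Gamma^{i}{}_{jk}}{\partial x^{l}} + \Gamma^{i}{}_{mk}\, \Gamma^{m}{}_{jl} - \Gamma^{i}{}_{ml}\, \Gamma^{m}{}_{jk}
```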
Consider carefully what we have done here – we have described the geometry of the manifold ( i.e. its curvature ) completely intrinsically, without making reference to any embedding into higher dimensional spaces, or notions of extrinsic curvature. The geometry is determined completely by what happens during parallel transport of vectors on the manifold. We now have the tools to re-write condition (9) for flatness of a manifold in a different way :
If the Riemann tensor vanishes everywhere, the manifold must be flat.
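The question of whether a cylinder or a plane in polar coordinates is flat can be put to a numerical test. The following Python sketch computes the Riemann tensor from the connection by finite differences, using the standard formula in a coordinate basis. The function names, the restriction to diagonal metrics, and the step sizes are my own assumptions :

```python
import math

def christoffel(metric, x, h=1e-5):
    """G[i][j][k] = Gamma^i_{jk} at x, for a diagonal metric (assumption):
    metric(x) returns the list of diagonal entries g_ii."""
    n = len(x)
    g = metric(x)
    dg = []  # dg[l][a] = d(g_aa)/dx^l
    for l in range(n):
        xp, xm = list(x), list(x)
        xp[l] += h
        xm[l] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([(gp[a] - gm[a]) / (2 * h) for a in range(n)])

    def d(l, a, b):
        return dg[l][a] if a == b else 0.0

    return [[[(d(j, i, k) + d(k, i, j) - d(i, j, k)) / (2 * g[i])
              for k in range(n)] for j in range(n)] for i in range(n)]

def riemann(metric, x, h=1e-3):
    """R[i][j][k][l] = R^i_{jkl}
       = d_k G^i_{jl} - d_l G^i_{jk} + G^i_{mk} G^m_{jl} - G^i_{ml} G^m_{jk}"""
    n = len(x)
    G = christoffel(metric, x)
    dG = []  # dG[k][i][j][l] = d(Gamma^i_{jl})/dx^k
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        Gp, Gm = christoffel(metric, xp), christoffel(metric, xm)
        dG.append([[[(Gp[i][j][l] - Gm[i][j][l]) / (2 * h)
                     for l in range(n)] for j in range(n)] for i in range(n)])
    return [[[[dG[k][i][j][l] - dG[l][i][j][k]
               + sum(G[i][m][k] * G[m][j][l] - G[i][m][l] * G[m][j][k]
                     for m in range(n))
               for l in range(n)] for k in range(n)]
             for j in range(n)] for i in range(n)]

def largest_component(R):
    """Largest absolute value among all components of a rank-4 array."""
    return max(abs(c) for a in R for b in a for row in b for c in row)

def sphere(x):       # unit sphere, coordinates (theta, phi)
    return [1.0, math.sin(x[0]) ** 2]

def polar_plane(x):  # flat plane, polar coordinates (r, phi)
    return [1.0, x[0] ** 2]

def cylinder(x):     # unit cylinder, coordinates (z, phi)
    return [1.0, 1.0]

print("sphere      :", largest_component(riemann(sphere, [1.0, 0.3])))
print("polar plane :", largest_component(riemann(polar_plane, [2.0, 0.3])))
print("cylinder    :", largest_component(riemann(cylinder, [0.5, 0.3])))
```

The plane in polar coordinates is the instructive case : its Christoffel symbols do not vanish, yet every component of the Riemann tensor comes out as zero up to numerical error, so it is flat. The same holds for the cylinder, which is intrinsically flat even though it looks curved from the outside. For the sphere, the component R^θ_φθφ should equal sin²θ.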
Let us take a look at another interpretation of the Riemann tensor, which is perhaps less general, but more useful to us as we move on to General Relativity in the next blog post. Consider two arbitrary geodesics on an arbitrary 4-dimensional manifold; let’s pick one of them to be our reference ( fiducial ) geodesic, and draw a separation vector from this reference geodesic to the other geodesic, at any point. We then also pick a direction in time which corresponds to the future  :
In this scenario, the Riemann curvature tensor takes as input the tangent vector to the fiducial geodesic twice ( slots 1 and 3 ), and the separation vector between geodesics ( slot 2 ), and produces as output the relative acceleration of the geodesics, i.e. the rate at which the rate of change of their separation itself changes. This is again a vector, because it has both magnitude and direction :
Geodesic Deviation. The Riemann curvature tensor encapsulates information about the relative acceleration between neighbouring geodesics over time.
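This is the equation of geodesic deviation. In the Misner/Thorne/Wheeler convention, with $u$ for the tangent vector of the fiducial geodesic, $\xi$ for the separation vector, and $\tau$ for the parameter along the geodesic ( all three names are assumptions for this sketch ), it reads :

```latex
\frac{D^{2} \xi^{i}}{d\tau^{2}} = -\,R^{i}{}_{jkl}\, u^{j}\, \xi^{k}\, u^{l}
```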
In two dimensions, and on the surface of a sphere, this could look like this :
The Riemann curvature tensor encodes all information as to what happens to any arbitrary geometric object as you shift it around your manifold; this includes changes in shape, volume, shears, twists etc etc. If one is interested only in certain aspects of curvature, it is possible to form other tensors from the Riemann tensor; the simplest of those is the Ricci tensor, which is formed by contracting the upper index with the lower middle index; this is nothing other than the trace of the Riemann tensor :
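With the Riemann tensor written as $R^i{}_{jkl}$ ( index names assumed ), contracting the upper index with the middle one of the three lower indices gives :

```latex
R_{jl} = R^{k}{}_{jkl}
```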
As you might perhaps remember from your linear algebra lessons, the trace of a linear map signifies at what rate the volume of a test body spanned by basis vectors begins to change; and that is exactly the meaning of the Ricci tensor  :
If you have a small test object and a geodesic, the Ricci tensor will take as input the tangent vector to the geodesic twice, and will as output produce the rate at which the volume of the test body begins to change, after it was initially at rest.
Ricci Tensor. The Ricci tensor is a rank-2 symmetric tensor, and it signifies the rate of change of the volume of a test body that moves from rest along a geodesic on a manifold. It is the trace of the full Riemann tensor.
Caution : The Ricci tensor only considers the volume of a test body, but not its shape. If the Ricci tensor vanishes, that means only that the volume is preserved, but you can still have distortions in shape. Hence, a vanishing Ricci tensor does not imply a flat manifold !
Note for the experts : I am fully aware that the above geometric interpretation of the Ricci tensor is valid only for cases where vorticity, shear, and expansion can be neglected ( Raychaudhuri equation ). Since this article is aimed at laypeople, I have chosen not to elaborate further on this, as it would lead too far away from the topic of this post.
Before I wrap things up, I want to introduce you to two more aspects / measures of curvature. The first one is the Ricci curvature scalar; to understand its meaning, consider the surface of a sphere. On a flat manifold, its surface area is given by the usual elementary formulas you are all familiar with from your high school days. However, if you project the same sphere onto a curved manifold, its surface area will no longer be the same, because the curvature of the manifold will distort it. It may for example end up looking like this  :
We can introduce a scalar quantity that measures how, in a given number of dimensions, the surface area of an infinitesimally small sphere differs on a curved manifold from the same sphere in flat space. This scalar quantity is defined as :
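One common way of writing this definition compares the surface area $A_{\text{curved}}$ of a small sphere on the manifold with the area $A_{\text{flat}}$ of a sphere of the same radius in flat space. The symbol names are my assumption, and the numerical factor is the one that reproduces the usual Ricci scalar :

```latex
R = \lim_{r \to 0} \frac{6D}{r^{2}} \left[ 1 - \frac{A_{\text{curved}}(r)}{A_{\text{flat}}(r)} \right]
```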
with r being the radius from the centre point, i.e. the distance of all points on our sphere from the centre. D is the number of dimensions we operate in. The Ricci scalar is the trace of the Ricci tensor, so it can also be computed directly from the metric via the formula
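Taking the trace of a rank-2 tensor with two lower indices means contracting it with the inverse metric $g^{ij}$ ( symbol assumed ), so the formula is :

```latex
R = g^{ij}\, R_{ij}
```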
Ricci Scalar. This curvature scalar is a measure of how the area of an infinitesimal surface differs on a curved manifold as compared to the same surface in flat space.
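A quick numerical check of this picture : on the unit 2-sphere, a “sphere” of geodesic radius r around a point is just a circle of circumference 2π sin r, while in the flat plane it would be 2πr. The sketch below feeds this into the area-ratio estimate ( 6D / r² ) · ( 1 − A_curved / A_flat ), which is the standard small-radius expansion of the area ratio; the function name and the chosen radii are my assumptions :

```python
import math

def ricci_scalar_estimate(r, D=2):
    """Estimate the Ricci scalar of the unit 2-sphere from the 'area'
    (here: circumference) of a small geodesic circle of radius r."""
    curved = 2.0 * math.pi * math.sin(r)  # circumference on the unit sphere
    flat = 2.0 * math.pi * r              # circumference in the flat plane
    return 6.0 * D / r ** 2 * (1.0 - curved / flat)

for r in (0.5, 0.1, 0.01):
    print(f"r = {r:<5} ->  R is approximately {ricci_scalar_estimate(r):.6f}")
```

As r shrinks, the estimate approaches 2, which is the Ricci scalar of a unit sphere. The circle on the sphere is always a little shorter than its flat counterpart, and the size of that deficit is exactly what the curvature scalar measures.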
Last but not least, the Ricci scalar allows us to define one final quantity, which will be very important when we talk about General Relativity. This quantity is called the Einstein tensor, and, like the energy-momentum tensor, it is a symmetric rank-2 tensor that is automatically conserved. The Einstein tensor is a linear “machine” that has two slots – if you define a purely time-like unit (!) vector such as for example
and insert it into both slots of the Einstein tensor, you get as a result the scalar curvature of the remaining spatial dimensions ( up to a constant ) :
where again, D is the number of dimensions on our manifold. In the real, physical world, the geometric meaning of the Einstein tensor is therefore this :
Einstein Tensor. Once a time direction has been chosen in the form of a unit time-like vector ( to be inserted into both slots ), the Einstein tensor will return the scalar curvature in the corresponding spatial dimensions.
Very few textbooks ever explain the geometric meaning of this tensor, so the above might prove very useful, and will enable you to get a better understanding of General Relativity. The components of the Einstein tensor can be computed from the Ricci tensor and the Ricci scalar, or directly from the metric :
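In terms of the Ricci tensor $R_{ij}$, the Ricci scalar $R$ and the metric $g_{ij}$ ( index names assumed ), the standard definition is :

```latex
G_{ij} = R_{ij} - \frac{1}{2}\, g_{ij}\, R
```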
The Einstein tensor is an automatically conserved quantity, meaning that its divergence vanishes identically :
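Written with the covariant derivative in nabla notation ( my choice of notation ), this statement, which follows from the contracted Bianchi identity, is :

```latex
\nabla_{i}\, G^{ij} = 0
```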
And with this I am going to conclude our little foray into differential geometry. This is not a complete presentation of all necessary concepts by any stretch of the imagination, but I hope that it gives at least a rough idea what the geometry of manifolds is about. The main points again :
- Every small enough local region can be approximated as Euclidean
- If a manifold is not flat, the ordinary derivative is not consistently defined
- An object called a connection allows us to connect tangent spaces at different points
- A connection allows us to define a covariant derivative, which compensates for the effects of curvature, and provides a consistent definition for a differentiation operation
- Curvature is quantified by the Riemann curvature tensor, which completely fixes all aspects of local geometry
- A curve which parallel-transports its own tangent vector is called a geodesic; it is either the shortest or the longest connection between two points on a manifold
And with this we finally have the tools together to tackle General Relativity. Stay tuned 🙂
 Misner/Thorne/Wheeler, “Gravitation“, page 31, Fig 1.11
 Misner/Thorne/Wheeler, “Gravitation“, page 31, Fig 1.10