A Mathematical Intro to Special Relativity

Albert Einstein was a smart guy.


I mean, he was so good at coming up with ideas that he won the Nobel Prize for his theoretical explanation of the photoelectric effect, not for any of his perhaps more famous work on special and general relativity. He even came up with a formula more famous than the Pythagorean theorem: E = mc^2!

Not bad for one guy.

In this post, we want to talk about one of Einstein’s first famous discoveries: special relativity.[1]

Leading up to this, it was noticed that light was weird. Unlike most waves, it didn’t seem to propagate through anything. Sound waves propagate through air, sea waves through water, but light seems to be doing its own thing, without any sort of background.

Special relativity is Einstein’s attempt to explain this weirdness. And, to do this, he used only two assumptions.

The first was the principle of relativity. This says that you can’t tell how fast you’re traveling in absolute terms, but only how fast you are traveling relative to something else. For example, in a plane, are you traveling fast? Or sitting still? It depends on whether you’re comparing your position to a location on Earth’s surface…


…or to the position of the screaming child sitting next to you.


But neither choice is right, just different.

The second assumption was the universality of the speed of light. This says that no matter who is looking, or how, photons always travel at the same speed.[2]

This assumption is a bit weird. The Michelson-Morley experiment suggests it is true, though it’s not clear Einstein was directly influenced by that result.

The reasoning behind the experiment is that, if light propagated through something (like sound in air), it should go faster or slower depending on which direction that something was moving. Sound, for instance, travels more slowly into the wind than with the wind. No such difference showed up for light.

What does the assumption of the universality of the speed of light mean?

Normally, if you’re in a car going 20 km/hr (relative to me, standing by the side of the road), and throw an orange at me at 30 km/hr (relative to yourself), the orange will be heading towards me at 50 km/hr.[3]


But with light, it’s different.

For that same car, if you turn on your headlights, you could measure the light leaving the car at, well, the speed of light, which is absurdly close to 3\times 10^8 m/s. But if I were to measure the same light, I would still only see it traveling at 3\times 10^8 m/s, despite the fact that your car is going 20 km/hr.
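To put numbers on the orange and the headlights, here is a minimal sketch. The everyday rule is plain addition. The second function uses the standard relativistic velocity-addition rule, which this post does not derive, so treat it as borrowed from a textbook. The function names are made up for illustration.

```python
C = 299_792_458.0        # speed of light in m/s
KMH = 1000.0 / 3600.0    # one km/hr expressed in m/s


def galilean_sum(u, v):
    """Everyday rule: speeds simply add."""
    return u + v


def relativistic_sum(u, v, c=C):
    """Standard relativistic velocity-addition rule (assumed, not derived here)."""
    return (u + v) / (1.0 + u * v / c**2)


car = 20 * KMH      # the car, relative to me
orange = 30 * KMH   # the orange, relative to the car

print(galilean_sum(car, orange) / KMH)       # the orange: about 50 km/hr
print(relativistic_sum(car, orange) / KMH)   # practically the same answer
print(relativistic_sum(car, C) / C)          # the headlight beam, as a ratio to c
```

At everyday speeds the two rules agree to far more digits than anyone could measure, which is why nobody noticed anything odd about oranges. For the headlight beam, the relativistic rule returns the speed of light again.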


However, despite only assuming these two things, special relativity is… pretty weird.

To wrap our heads around all this, it will be incredibly useful to draw some “spacetime diagrams.” We start with a set of axes.


Notice that the two axes are labeled x and t, i.e., space and time. We could draw these with more space dimensions (after all, we live in three space dimensions), but everything is clearer if we just draw one space dimension, x.

A good way to think about these diagrams is that you are sitting at x=0 not moving. You’re there at x=0 at the beginning, t=0, and you’re there later, at say t=100. Everything else is happening around you.

What does it look like if your friend walks by you?

Let’s say he crosses you right at t=0. Then, before that, he’s to the left of you. After t=0, he’s to the right of you. So, his path in spacetime looks like this:


At each time, he’s at a particular space location.

What happens if a beam of light shoots past you?

Unsurprisingly, it looks very similar. The only difference is the speed. The speed of light is about 3\times 10^8 m/s, which is so much faster than most speeds you are familiar with that, on the same diagram as before, the path of the light would look almost horizontal, since its location is changing so rapidly.


In order to make these diagrams readable, we need to do something about the scale…

The standard thing to do is to measure all speeds as a ratio of the speed of light. So, the speed of light is “1,” while the speed of your friend walking might be 0.00000001.[4]

What’s nice about these units is that light will always follow a line at 45 degrees. This is because each time t increases by 1, x increases by 1 as well. So, now our diagram with your friend and the ray of light looks like this:
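As a quick illustration of these units, the sketch below converts ordinary speeds into fractions of the speed of light. The walking speed of 1.4 m/s is an assumption picked for the example, not a number from this post.

```python
C = 299_792_458.0  # speed of light in m/s


def as_fraction_of_c(speed_m_per_s):
    """Express a speed as a unitless ratio to the speed of light."""
    return speed_m_per_s / C


walking = as_fraction_of_c(1.4)  # assumed walking speed of 1.4 m/s
light = as_fraction_of_c(C)

print(walking)  # a few billionths
print(light)    # exactly 1 by construction
```

In these units the slope of a path on the diagram is easy to read off: light covers one unit of x per unit of t, while a walker barely leaves the t-axis.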


So far, not so weird.

Let’s ask an innocent question.


Ahem, sorry about that.

Anyway, the question is this: What does it mean for two events to be simultaneous?

An event happens at a particular time and place, so, when we say an event, what we mean is a point (x,t).


How can we, sitting at x=0, figure out when an event occurred?

One way would be to use a laser. To figure out when an event occurs, we can send out a beam of light. The light could then bounce off of a mirror that just happens to be sitting at the event, and then travel back to us. On a diagram, it would look something like this:


Since light takes the same amount of time to travel each way, we can tell exactly when the event occurs. The t coordinate is exactly halfway between when we sent the light and when we received it back.
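The halfway rule can be written down in a couple of lines. This is only a sketch: it assumes you sit at x=0, uses units where the speed of light is 1, and the function name is invented for the example.

```python
def radar_coordinates(t_sent, t_received):
    """Locate an event from the send and return times of a light pulse.

    Assumes the observer sits at x = 0 and that the speed of light is 1.
    Returns (time of the event, distance to the event).
    """
    t_event = (t_sent + t_received) / 2.0    # halfway between sending and receiving
    distance = (t_received - t_sent) / 2.0   # light spent half the round trip each way
    return t_event, distance


print(radar_coordinates(-3.0, 3.0))  # an event at t = 0, three units away
print(radar_coordinates(2.0, 8.0))   # an event at t = 5, three units away
```

Note that the pulse only tells you how far away the event is. Which side it is on depends on which way you pointed the laser.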

We call two different events simultaneous if, calculating this way, we find they both happened at the same time. For instance, it’s not too hard to see that the events along the x-axis (i.e., events with t=0) are all simultaneous.


This way of measuring simultaneity is certainly cumbersome, but it’s hard to argue this isn’t precise. I have to admit, though, that this seems like a lot of work for something that feels obvious. But there’s one important fact that changes everything.

The universality of the speed of light.

Let’s consider your friend traveling past you. Now, walking isn’t very fast, so let’s have him on a rocket ship traveling at half the speed of light. His path will look something like this:


Which events does your friend measure as simultaneous with when he passes you?

Your friend has to measure simultaneity in the same way; he shoots out a laser beam, which is reflected. Halfway between when your friend sends out the beam and when he receives it is the time the event occurred.

The trick is that his light also travels at 45 degrees.


In your friend’s view, the moment when he passed you and the event where his light was reflected are simultaneous!

Using the same idea, it’s not hard to figure out all the events that your friend thinks are simultaneous with when he passed you. They fall on a straight line through the origin that is tilted relative to your x-axis.
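Here is a small sketch of that construction in your coordinates, with the speed of light set to 1. The setup is an assumption for the example: your friend moves along x = v*t, sends a pulse to the right when your clock reads -a, and gets it back when your clock reads +a. Because his path is straight and passes through the origin, those two moments are also equally far before and after the passing on his own clock.

```python
def reflection_event(v, a):
    """Find where the friend's light pulse was reflected, in your (t, x) coordinates.

    The friend moves along x = v * t (with |v| < 1 and c = 1). He emits a
    right-moving pulse at your time -a and receives it back at your time +a.
    """
    t_emit, x_emit = -a, -v * a
    t_back, x_back = a, v * a
    # Outgoing ray (45 degrees, rightward):  x - t = x_emit - t_emit
    # Returning ray (45 degrees, leftward):  x + t = x_back + t_back
    p = x_emit - t_emit
    q = x_back + t_back
    t = (q - p) / 2.0
    x = (q + p) / 2.0
    return t, x


v = 0.5  # the rocket from the text: half the speed of light
for a in (1.0, 2.0, 3.0):
    t, x = reflection_event(v, a)
    print(t, x, t - v * x)  # the last number comes out as 0 every time
```

Every reflection event satisfies t = v*x, so for the rocket at half the speed of light, the friend's events simultaneous with the passing lie on the line t = x/2 rather than on your line t = 0.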


That’s weird.

Let’s be clear about this. The fact that you and your friend disagree on what events happened at the same time is not due to a lack of cleverness in how you measured things. The complicated way we defined “simultaneous” was exactly so that you can’t argue that. It is an inescapable consequence of the universality of the speed of light that simultaneity cannot also be universal.


Let’s ask a related question: how can we measure the time between events?

Of course, since we can’t agree on simultaneity, it isn’t surprising that measuring time between events is also dependent on who’s doing the measuring.

So, what can we agree on?

Again, we all agree that light travels at 45 degrees. In terms of coordinates, that means that the change in x value and the change in t value have to be the same for any beam of light. If we denote “change” by \Delta, we could write this as (\Delta x)^2 - (\Delta t)^2 = 0.


Now, this is a bit unmotivated,[5] but let’s define the interval (\Delta s)^2 between any two events to be (\Delta s)^2 = (\Delta x)^2 - (\Delta t)^2.[6] It is one of the fundamental results in special relativity that everyone can agree on this quantity! (But let’s not prove it here…)
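A numerical spot check is no substitute for a proof, but it can make the claim feel less magical. The sketch below assumes the standard Lorentz boost for converting your coordinates into those of someone moving at speed v (with the speed of light set to 1). That formula is not derived in this post, and the function names are invented for the example.

```python
import math


def interval(dt, dx):
    """The interval (Delta s)^2 = (Delta x)^2 - (Delta t)^2, with c = 1."""
    return dx**2 - dt**2


def boost(t, x, v):
    """Standard Lorentz boost to an observer moving at speed v (assumed, not derived here)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)


# Separation between two events as you measure it, then as a moving observer does.
dt, dx = 4.0, 1.0
for v in (0.1, 0.5, 0.9):
    dt_moving, dx_moving = boost(dt, dx, v)
    print(v, interval(dt, dx), interval(dt_moving, dx_moving))
```

The individual time and space separations change from observer to observer, but the two interval columns agree up to floating-point rounding.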

What does this quantity, the interval, represent?

If we measure the interval between you at time 0 and you at time 10, the x coordinate didn’t change, and so (\Delta s)^2 = -10^2 = -100. This is a bit awkward, but if we take the negative of this, then take the square root, we get back 10.

The interval, or rather \sqrt{-(\Delta s)^2}, represents the amount of time you experience going between those two events. This quantity is called proper time. More generally, \sqrt{-(\Delta s)^2} represents how much time someone experiences going between two events, as long as they are going on a straight path (i.e., at a constant speed).

Let’s think about your friend, now. If he goes along his path, when you measure his time as t=10 seconds, his x coordinate is x = 5 (measured in light-seconds, since he is moving at half the speed of light). Using the formula for proper time, we can see that he thinks only \sqrt{10^2-5^2}\approx 8.66 seconds have passed![7]
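The same arithmetic as a short sketch, in units where the speed of light is 1. The function name and the error check are additions for the example.

```python
import math


def proper_time(dt, dx):
    """Time experienced on a straight path between two events (c = 1)."""
    s_squared = dx**2 - dt**2  # the interval
    if s_squared > 0:
        # Covering dx in time dt would require going faster than light.
        raise ValueError("no slower-than-light path connects these events")
    return math.sqrt(-s_squared)


print(proper_time(10.0, 0.0))  # you, sitting still for 10 seconds
print(proper_time(10.0, 5.0))  # your friend at half the speed of light: about 8.66
```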


This time dilation is an important and very observable consequence of special relativity, and it has been confirmed experimentally to high precision. In fact, I’m going to spend a lot of the next post just on that subject!

But to finish: the interval was useful for measuring the time experienced on any constant speed path. But what about nonconstant speed (i.e., curved) paths?


This is where we finally get back to metrics, as we talked about in the last post. For special relativity, instead of using the Pythagorean theorem for measuring vectors, we use the interval. So, in special relativity, the metric is ds^2 = dx^2 - dt^2, where dx means the rate of change of your path in the x direction, i.e., the derivative of x.[8]

If we want to figure out how much time passes for someone following the path, we use the tangent vector to calculate \sqrt{-ds^2}, similarly to how we measured the length of the tangent vector in the last post. This length is the equivalent of speed for a normal metric. However, the interpretation of this quantity is how fast time is passing for the traveler. The speed of time, if you will!

To find how much time the traveler experiences, as before, we integrate this speed of time over the entire path.
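As a rough numerical sketch of that integral, the code below chops a path into many short, nearly straight pieces and adds up \sqrt{dt^2 - dx^2} for each piece, in units where the speed of light is 1. The function name, the step count, and the sample curved path are all assumptions made up for the example.

```python
import math


def elapsed_proper_time(x_of_t, t_start, t_end, steps=10_000):
    """Approximate the time a traveler experiences along the path x = x_of_t(t).

    Sums sqrt(dt^2 - dx^2) over many short pieces, with c = 1. The path must
    stay slower than light everywhere, or the square root is undefined.
    """
    dt = (t_end - t_start) / steps
    total = 0.0
    for i in range(steps):
        t0 = t_start + i * dt
        dx = x_of_t(t0 + dt) - x_of_t(t0)
        total += math.sqrt(dt * dt - dx * dx)
    return total


def straight(t):
    """Constant speed: half the speed of light."""
    return 0.5 * t


def curved(t):
    """Hypothetical trip that leaves x = 0 and comes back at t = 10 (top speed about 0.63)."""
    return 1.0 - math.cos(2.0 * math.pi * t / 10.0)


print(elapsed_proper_time(straight, 0.0, 10.0))  # matches the straight-path formula, about 8.66
print(elapsed_proper_time(curved, 0.0, 10.0))    # less than the 10 seconds you measure
```

On a straight path this reproduces the proper-time formula, which is a useful sanity check before trusting it on a curved one.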

We’ll stop here for today. Next time, we’ll talk a lot more about time dilation, and, in the process, talk about the infamous twin paradox!

If you enjoyed this post, please share it with your friends. And don’t forget to subscribe for more math awesomeness!

  1. Why do I say one of his first, instead of his first? Because, in one year, he published four groundbreaking papers. Because, you know, Einstein.
  2. In a vacuum.
  3. Yes, I’m an American. Yes, I refuse to use Imperial units in public.
  4. These speeds are measured as ratios, so they are unitless. Often, this is explained by saying that we choose to measure time in meters. This is the opposite of measuring distances in terms of time, which you do when you use lightyears. So, 1 meter of time is the time it takes for light to travel 1 meter. Thus, the speed of light is 1 meter per meter, i.e., 1 without units. This convention is part of what are called geometrized units, which also set the gravitational constant to 1.
  5. So, I’m going to do something, and it’s not immediately obvious why I’m doing this thing. But, just trust me, it’ll magically work out. (This is what “unmotivated” means in this context.) 
  6. If we had three space dimensions, this would be (\Delta s)^2 = (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 - (\Delta t)^2 instead. 
  7. This is similar to the Pythagorean theorem, but, because of the minus sign on the (\Delta t)^2 term, the hypotenuse is always shorter than the vertical height. 
  8. Technically, this metric and those of the previous post are different. Normal metrics, like those in the last post, always give positive ds^2, and are called Riemannian metrics. This new metric gives negative ds^2 for some vectors, and we call it a pseudo-Riemannian metric. 

5 thoughts on “A Mathematical Intro to Special Relativity”

  1. Why this definition of simultaneity? Why not, for example, say that an event emits light and you consider events to be simultaneous if their light reaches you simultaneously? That seems more like what a normal observer would consider to be simultaneous.


    1. That doesn’t quite work, since light has finite speed. Look at the t=0 line for you (i.e., the x-axis) and note that the events on it wouldn’t be simultaneous by your definition. There are other ways to define simultaneity, but this is one way that works.


  2. Totally displaying my ignorance, sleep deprivation, or how long it’s been since I took physics, but I get lost right around the point with the rocket and the 90 degree triangle of light with the green dot of the event, and the friend’s line of simultaneity. I only understand that an event will not necessarily be simultaneous as viewed by 2 people because I saw a little clip about it in high school. =( Maybe if there were more pictures, or a little video here? Not that I want to kill off the in-house artist…


    1. Unfortunately, videos probably WOULD kill the Missus…

      The idea is that your friend sent the light out, say, 7 seconds before he passed by you. Then the light bounced off of SOMETHING, and got back to him 7 seconds after he passed by you. That means that the event where the light bounced off the mirror (which happened at a time and a place) is simultaneous with his passing you, in your friend’s point of view. But that’s not what YOU think is simultaneous with that.

      Hope that helps…

