
EINSTEIN'S DERIVATION
OF THE TRANSFORMATION EQUATIONS
IN SPECIAL RELATIVITY


Anyone who reads Einstein's derivation of the transformation equations of Special Relativity in his 1905 paper may find it somewhat cryptic and unclear, especially on a first acquaintance, since he does not include all the intermediate mathematical steps in the argument. No doubt he could have supposed that his peers of the time would easily have been able to fill in the missing steps for themselves.

Now, however, Special Relativity has a much wider audience, some of whom are subjecting it to doubts and controversy, or even, perhaps, saying that Einstein fudged some steps in the argument.

Since the cryptic nature of the presentation makes it difficult for someone reading it for the first time, it seemed to me that it might be useful to have Einstein's 1905 argument with relevant illustrations added and with the missing mathematical steps filled in. This is what I have attempted to do in what follows. It deals, of course, only with the 'kinematical part' of Einstein's paper.

The derivation is grounded in the two postulates of Special Relativity. The first is that all inertial reference frames provide equally valid viewpoints from which to describe events in general, in the sense that no inertial reference frame can be given a special status in preference to any other. The second is that the velocity of light is the same if measured within any inertial frame, independently of the reference frame in which the light was emitted. The validity of the transformation equations that result from the derivation depends on the correctness of these postulates, which are not proven by argument, but depend entirely on accepted experimental verification for their credibility.


The 1905 paper deals with the case of two inertial reference frames in constant relative motion. We rigidly connect ourselves to one of the frames so that it is not moving relative to us, and so that we can thus call it the 'stationary' frame. The other frame then becomes the 'moving' frame. We could connect ourselves to either frame, or neither, in which case both frames would be moving relative to us. It is simpler, however, to arrange for one of the frames to be stationary relative to ourselves.

Einstein starts off with a context that is illustrated in the diagrams below. The event whose coordinates are to be considered is event B, and this is described in the moving frame by coordinates of the type $\xi, \eta, \zeta, \tau$, and in the stationary frame by coordinates of the type $x, y, z, t$. Einstein sets out to derive a relationship between these two sets of coordinates, by which one set can be obtained from the other. This is expressed in a general way as one set being a function of the other, or forming a 'basis' for the other. Thus, we will be dealing with the general form $\xi(x,y,z,t)$, $\eta(x,y,z,t)$, $\zeta(x,y,z,t)$, $\tau(x,y,z,t)$. Einstein begins with the derivation of $\tau(x,y,z,t)$.

There are, however, two versions of this general transformation: a transformation of a first kind, in which $x, y, z, t$ are coordinate values referred to the origin of the stationary frame, as in Figure 2, and a transformation of a second kind, in which they are referred to the origin of the moving frame, but remain stationary frame measurements, as in Figure 3. In both kinds of transformations the values of $\tau(x,y,z,t)$ are the same and the values of $t$ are also the same. Only values of the $x$ coordinate are different. The desired, final equations that are to be derived are the transformation equations of the first kind. Einstein uses both kinds of transformations, in respect of $\tau(x,y,z,t)$, for which they are equivalent, but does not overtly discuss the difference between them.

Einstein refers to the velocity of light being $c-v$ or $c+v$, rather than $c$, in the moving frame, from the perspective of the stationary observer. The existence of two distances, $\xi$ and $x$, and two times, $\tau$ and $t$, provides for four possible values that can be assigned as a 'velocity' of light: $x/t$, $\xi/\tau$, $x/\tau$, and $\xi/t$. The principle of the constancy of the velocity of light refers only to this velocity being measured by the distance and the clock being in the same frame, and not by distance in one frame and the clock in the other. Thus, only $x/t$ and $\xi/\tau$ must have the value $c$. When referring to a velocity of, for example, $c-v$, Einstein is using the form $\xi/t$, using distance in the moving frame and a clock in the stationary frame. The value obtained is not actually $\xi/t$ but, rather, $X/t$, where $X$ is the measured, foreshortened value of $\xi$ that can be obtained by the stationary observer.

A form such as $X/t$, where $X$ is a distance within the moving frame, and moving with it, represents the stationary observer's attempt to see how the light, emitted in the moving frame, is travelling within the moving frame, rather than within his own stationary frame. This is illustrated in figure 3. That is, he tries to eliminate the relative motion from his viewpoint in order to see what the moving observer sees, but necessarily uses stationary frame calculations. This causes him to interpret light as travelling within the moving frame at values of velocity other than $c$, i.e. $c-v$ or $c+v$. The postulates of relativity, however, declare that the moving observer does not see the light moving in accordance with these values, and the coordinate values of events therefore have to be transformed to allow the stationary observer to understand the values the moving observer would obtain. This is achieved by the transformation equations of the second kind. Einstein begins by constructing these equations and then uses them to obtain the desired transformation equations of the first kind. As indicated previously, the viewpoint on the basis of which the transformation equations of the second kind are obtained involves a relationship of figures 1 and 3, rather than figures 1 and 2.

In figure 1, the moving observer's own frame is, for him, the same as a stationary frame, within which the rod is also stationary. The light therefore takes equal times to go in both directions, so that the time at event B must be half way between the times at events A and C on a moving frame clock, which can be expressed as $\tau - \tau_0 = \tau_1 - \tau$, or $\tau_0 + \tau_1 = 2\tau$, or $\tfrac{1}{2}(\tau_0 + \tau_1) = \tau$. The corresponding times for the stationary observer, as seen in Figure 3, are $t_0$, $t_0 + X/(c-v)$, and $t_0 + X/(c-v) + X/(c+v)$, where the capital $X$ refers to the measured value of the length of the moving rod as would be obtained by the stationary observer. It can be seen immediately that, for this original stationary observer, who sees the light trajectory according to figure 2, in his own frame, the time at event B is not half way between the times at A and C on his stationary clock. This makes it clear that the times on stationary and moving clocks cannot correspond.
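The asymmetry in the stationary observer's times can be checked with a few numbers. The following is only a minimal sketch: the units (c = 1), the speed v = 0.6 and the rod length X = 1 are illustrative assumptions, and the function name `stationary_times` is hypothetical, not something taken from Einstein's paper.

```python
# Times of events A, B and C on the stationary clock (the figure 3 viewpoint).
# Assumed, illustrative values: units with c = 1, v = 0.6, X = 1.

def stationary_times(c, v, X, t0=0.0):
    """Return (t_A, t_B, t_C) as judged by the stationary observer."""
    t_a = t0
    t_b = t_a + X / (c - v)   # outward leg: light closes on the rod end at c - v
    t_c = t_b + X / (c + v)   # return leg: light and rod origin approach at c + v
    return t_a, t_b, t_c

c, v, X = 1.0, 0.6, 1.0
t_a, t_b, t_c = stationary_times(c, v, X)
midpoint = 0.5 * (t_a + t_c)

print("t_B                     =", t_b)
print("midpoint of t_A and t_C =", midpoint)
print("t_B - midpoint          =", t_b - midpoint)
print("v X / (c^2 - v^2)       =", v * X / (c**2 - v**2))
```

The last two printed values agree, which suggests that the stationary clock places event B later than the midpoint by vX/(c^2 - v^2), whereas the moving clock places it exactly at the midpoint.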

For the expression $\tfrac{1}{2}(\tau_0 + \tau_1) = \tau$, as a function of the stationary coordinate values in figure 3, we have:

$$\tfrac{1}{2}\left\{\tau_0(0,0,0,t_0) + \tau_1\!\left(0,0,0,\,t_0 + \frac{X}{c-v} + \frac{X}{c+v}\right)\right\} = \tau\!\left(X,0,0,\,t_0 + \frac{X}{c-v}\right)$$

If $X$ is sufficiently small, this has the differential form

$$\tfrac{1}{2}\{\tau_0(0,0,0,t_0) + \tau_1(0,0,0,t_0 + dt_1)\} = \tau(dx,0,0,t_0 + dt),$$

or,

$$\tfrac{1}{2}\{\tau_0(0,0,0,t_0) + (\tau_0 + d\tau_1)(0,0,0,t_0 + dt_1)\} = (\tau_0 + d\tau)(dx,0,0,t_0 + dt),$$

with $dt_1 = X/(c-v) + X/(c+v)$ and $dt = X/(c-v)$,
writing $dx$ in the form $X$, to correspond with the illustrations.

Einstein proceeds to carry out the derivation using differential equations, rather than macroscopic equations, and uses a principle of linearity to convert to macroscopic equations, which I shall do at the end of the derivation. Thus, using the well known partial differential expression for the transformation of the total differential, we can say, in general:

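$$d\tau = \frac{\partial\tau}{\partial x}\,dx + \frac{\partial\tau}{\partial y}\,dy + \frac{\partial\tau}{\partial z}\,dz + \frac{\partial\tau}{\partial t}\,dt$$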

and this can be applied separately to each term in the above differential equation, so as to get a differential transformation for each value of $d\tau$. The values with subscripts correspond to those shown in the illustrations above. All differential values are expressly indicated, so the results can be simply written down immediately.

For the first term, at the start, at event A, in figures 1 and 3, there are as yet no $d\tau$ or $dt$, or other differential values, so it can be ignored. In the second term, we are back at the start spatially, at event C, where $dx = 0$, and only the times $\tau$ and $t$ are different, so, applying the above expression for the total differential, we have

$$d\tau_1 = \frac{\partial\tau}{\partial t}\,dt_1, \quad \text{with the other terms zero.}$$

At event C, we have $dt_1 = X/(c-v) + X/(c+v)$, and thus

$$d\tau_1 = \frac{\partial\tau}{\partial t}\left(\frac{X}{c-v} + \frac{X}{c+v}\right)$$

For the right hand side of the equation, at event B, we have values for $d\tau$ and both $dx$ and $dt$, and so, here, $d\tau$ will be

$$d\tau = \frac{\partial\tau}{\partial x}\,dx + \frac{\partial\tau}{\partial t}\,dt, \quad \text{with the other terms zero.}$$

Here, $dt = X/(c-v)$, so we have:

$$d\tau = \frac{\partial\tau}{\partial x}\,dx + \frac{\partial\tau}{\partial t}\,\frac{X}{c-v}$$

It is worth examining more closely the nature of the term $(\partial\tau/\partial x)\,dx$, and asking what $\partial\tau/\partial x$ means. Since it is a partial derivative, it is a value taken when $\tau$ varies only in the $x$ direction, and not any other, also excluding the time direction. The variation of $\tau$ along the $x$ direction, without varying the time, can therefore refer only to the rate of change of the nonsimultaneity of readings on an array of clocks along the $x$ direction, which is to say, in general, along the axis in the direction of the relative motion.

Since the differential equation has been written in the form

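$$\tfrac{1}{2}\bigl(\tau_0 + (\tau_0 + d\tau_1)\bigr) = \tau_0 + d\tau$$

we can eliminate $\tau_0$, and get the differential equation

$$\tfrac{1}{2}\,d\tau_1 = d\tau$$

and, substituting values obtained above for $d\tau_1$ and $d\tau$, we have

$$\tfrac{1}{2}\,\frac{\partial\tau}{\partial t}\left(\frac{X}{c-v} + \frac{X}{c+v}\right) = \frac{\partial\tau}{\partial x}\,dx + \frac{\partial\tau}{\partial t}\,\frac{X}{c-v}$$

$$\frac{\partial\tau}{\partial t}\,\frac{cX}{c^2-v^2} = \frac{\partial\tau}{\partial x}\,dx + \frac{\partial\tau}{\partial t}\,\frac{X}{c-v} = \frac{\partial\tau}{\partial x}\,dx + \frac{\partial\tau}{\partial t}\,\frac{X(c+v)}{c^2-v^2}$$

From which

$$\frac{\partial\tau}{\partial x}\,dx = \frac{\partial\tau}{\partial t}\left(\frac{-vX}{c^2-v^2}\right)$$

Putting this into the equation for $d\tau$, which refers to the time existing at event B, we have

$$d\tau = \frac{\partial\tau}{\partial t}\left(\frac{-vX}{c^2-v^2}\right) + \frac{\partial\tau}{\partial t}\,\frac{X}{c-v} = \frac{\partial\tau}{\partial t}\left(\frac{-vX}{c^2-v^2}\right) + \frac{\partial\tau}{\partial t}\,\frac{X(c+v)}{c^2-v^2}$$

$$d\tau = \frac{\partial\tau}{\partial t}\,\frac{cX}{c^2-v^2}$$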
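The algebra of this longitudinal case can be checked with exact rational arithmetic. This is a sketch under stated assumptions: the values c = 1, v = 3/5 and X = 1 are illustrative only, the function name `longitudinal_coefficients` is hypothetical, and every quantity is expressed as the coefficient that multiplies the partial derivative of tau with respect to t.

```python
from fractions import Fraction

def longitudinal_coefficients(c, v, X):
    """Coefficients of d(tau)/dt in the longitudinal case.

    Returns (half_dt1, dtau_dx_term, dtau), each understood as a multiple
    of the partial derivative of tau with respect to t.
    """
    dt1 = X / (c - v) + X / (c + v)   # stationary time elapsed at event C
    dt = X / (c - v)                  # stationary time elapsed at event B
    half_dt1 = dt1 / 2                # left side: (1/2) d(tau_1)
    dtau_dx_term = half_dt1 - dt      # (d tau/d x) dx, from (1/2) d(tau_1) = d(tau)
    dtau = dtau_dx_term + dt          # total d(tau) at event B
    return half_dt1, dtau_dx_term, dtau

# Assumed illustrative values, kept as exact fractions to avoid rounding.
c, v, X = Fraction(1), Fraction(3, 5), Fraction(1)
half_dt1, dtau_dx_term, dtau = longitudinal_coefficients(c, v, X)

assert half_dt1 == c * X / (c**2 - v**2)        # c X / (c^2 - v^2)
assert dtau_dx_term == -v * X / (c**2 - v**2)   # -v X / (c^2 - v^2)
assert dtau == c * X / (c**2 - v**2)            # c X / (c^2 - v^2)
print(half_dt1, dtau_dx_term, dtau)
```

Exact fractions are used here so that the comparisons are true equalities rather than approximate floating point matches.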

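Since the moving observer measures the light as travelling at $c$, so that $d\xi = c\,d\tau$, we can now say, putting $\beta = 1/(1-v^2/c^2)^{1/2}$,

$$c\,d\tau = d\xi = \frac{\partial\tau}{\partial t}\,\frac{c^2}{c^2-v^2}\,X = \frac{\partial\tau}{\partial t}\,\beta^2 X$$

and

$$d\tau = \frac{\partial\tau}{\partial t}\,\frac{c}{c^2-v^2}\,X = \frac{\partial\tau}{\partial t}\,\frac{\beta^2 X}{c}$$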

which are the first two transformation equations of the second kind.

Einstein now considers the case where the light ray moves along the $y$ axis, transverse to the direction of motion, instead of along the $x$ axis, so we must have another transformation equation of the second kind. The diagram in Figure 4, on the left, illustrates the case where the light ray moves transversely to the direction of motion of the rod, and the shaded panel indicates the parameters used to set up the equation below. The shaded panel represents the stationary observer's attempt to interpret what the moving observer sees, using stationary frame measurements, and is a transverse case version of what is illustrated in figure 3. It does not represent the moving observer's own observations, which would be identical to figure 1, since there is no distinction between the results of transverse observations and observations in the direction of motion within the moving frame. This diagram gives the following equations, which I put immediately in the form of differentials, by assuming $y$ is very small:

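$$\tfrac{1}{2}\left\{\tau_0(0,0,0,t_0) + \tau_1\!\left(0,0,0,\,t_0 + \frac{2\,dy}{(c^2-v^2)^{1/2}}\right)\right\} = \tau\!\left(0,dy,0,\,t_0 + \frac{dy}{(c^2-v^2)^{1/2}}\right)$$

or,

$$\tfrac{1}{2}\left\{\tau_0(0,0,0,t_0) + (\tau_0 + d\tau_1)\!\left(0,0,0,\,t_0 + \frac{2\,dy}{(c^2-v^2)^{1/2}}\right)\right\} = (\tau_0 + d\tau)\!\left(0,dy,0,\,t_0 + \frac{dy}{(c^2-v^2)^{1/2}}\right)$$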

All the subscripts have the same relationship to one another as before, but now refer to a corresponding arrangement of events in the transverse case. Applying the transformation of the total differential, as previously, to $d\tau_1$ and $d\tau$, regarded as total differentials, we get the equations

$$d\tau_1 = \frac{\partial\tau}{\partial t}\,\frac{2\,dy}{(c^2-v^2)^{1/2}}$$

$$d\tau = \frac{\partial\tau}{\partial y}\,dy + \frac{\partial\tau}{\partial t}\,\frac{dy}{(c^2-v^2)^{1/2}}$$

and with, as before

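$$\tfrac{1}{2}\,d\tau_1 = d\tau$$

we get the equation

$$\frac{\partial\tau}{\partial t}\,\frac{dy}{(c^2-v^2)^{1/2}} = \frac{\partial\tau}{\partial y}\,dy + \frac{\partial\tau}{\partial t}\,\frac{dy}{(c^2-v^2)^{1/2}}$$

so we must have $\partial\tau/\partial y = 0$ and, by a similar argument, $\partial\tau/\partial z = 0$ (which shows that there is no non-simultaneity effect within an array of moving clocks in a plane at right angles to the direction of motion)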

thus

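$$d\tau = \frac{\partial\tau}{\partial t}\,dt = \frac{\partial\tau}{\partial t}\,\frac{dy}{(c^2-v^2)^{1/2}}$$

or

$$c\,d\tau = d\eta = \frac{\partial\tau}{\partial t}\,\frac{dy}{(1-v^2/c^2)^{1/2}}$$

and, by a similar argument for the $z$ axis

$$c\,d\tau = d\zeta = \frac{\partial\tau}{\partial t}\,\frac{dz}{(1-v^2/c^2)^{1/2}}$$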
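The transverse travel time used above can be illustrated numerically. This is a rough sketch, assuming units with c = 1 and illustrative values v = 0.6 and dy = 1; the function name `transverse_leg_time` is hypothetical.

```python
import math

def transverse_leg_time(c, v, dy):
    """Stationary-frame time for light to cross a transverse distance dy
    on a rod that moves at speed v along the x axis."""
    # To stay on the rod the ray needs an x-velocity of v, so with total
    # speed c in the stationary frame its y-velocity is sqrt(c^2 - v^2).
    return dy / math.sqrt(c**2 - v**2)

c, v, dy = 1.0, 0.6, 1.0           # assumed illustrative values
dt = transverse_leg_time(c, v, dy)  # time at event B (outward leg)
dt1 = 2.0 * dt                      # time at event C (out and back)

# Coefficient left over for (d tau/d y) dy after using (1/2) d(tau_1) = d(tau):
leftover = 0.5 * dt1 - dt
print("leftover coefficient for d(tau)/dy:", leftover)

# c * dt equals dy / sqrt(1 - v^2/c^2), the factor appearing in c d(tau) = d(eta).
print(c * dt, dy / math.sqrt(1.0 - (v / c) ** 2))
```

Because the outward and return legs take equal stationary times in the transverse case, nothing is left over for the y-derivative term, in contrast to the longitudinal case.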

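Einstein has $(\partial\tau/\partial t)/(1-v^2/c^2)^{1/2}$ as an unspecified function $\varphi(v)$, so that, with $\beta = 1/(1-v^2/c^2)^{1/2}$, the transformation equations of the second kind are

$$d\xi = \varphi(v)\,\beta X$$

$$d\tau = \varphi(v)\,\beta X/c$$

$$d\eta = \varphi(v)\,dy$$

$$d\zeta = \varphi(v)\,dz$$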

Einstein now goes on to obtain the value of $\varphi(v)$. First he uses a double application of the transformation equations so that, for example, we transform $dy \Rightarrow d\eta \Rightarrow dy$, and similarly with the other coordinates, and shows that we must have $\varphi(v)\varphi(-v) = 1$, since transforming forward and then back again must give us back the original values. Einstein uses the inverse of the transformation equations to get back the original values and, since it is not self-evident how these are to be set up, I have provided notes on these on an Inverse Transform page, so as not to interfere with the argument on the current page.

He then argues that we have $d\eta/\varphi(v) = dy$, and neither $d\eta$ nor $dy$ changes with a change in the direction of motion along the $x$ coordinate, so we must have $d\eta/\varphi(-v) = dy$ if the direction of the velocity is reversed. Since, however, all inertial frames are equivalent, the value of $\varphi(-v)$ obtained here must be symmetrically the same as that obtained already, by viewing the velocity as $-v$ from the perspective of the other frame. Thus we must have $d\eta/\varphi(v) = d\eta/\varphi(-v)$, so that $\varphi(v) = \varphi(-v)$ and, combining this with $\varphi(v)\varphi(-v) = 1$, $\varphi(v) = \varphi(-v) = 1$.

Since we have, above

$$\frac{\partial\tau/\partial t}{(1-v^2/c^2)^{1/2}} = \varphi(v) = 1$$

Therefore

$$\frac{\partial\tau}{\partial t} = (1-v^2/c^2)^{1/2} = \frac{1}{\beta}$$

This shows that $\partial\tau/\partial t$ depends only on velocity, $v$, so that the transformation equations must be linear at constant velocity, allowing the differential transformation equations to be written as macroscopic equations.

To get the desired, macroscopic transformation equations of the first kind, we must use coordinate values in figure 2 and, for simplicity, set $\tau_0 = t_0 = 0$. We may note that event B, which is at coordinate value $X$ in figure 3, is at some coordinate value $x$ in figure 2 and, as the length of the moving rod is $X = x - v\,dt$, or $X = x - vt$ in the macroscopic version of the equation, as indicated in figure 2, we can get equations in terms of $x$ by substituting for $X$. In the first transformation equation we can use the form $(x - vt)$. In the second equation, since $x = ct$ and also $vt = vx/c$, we have $X/c = t - vx/c^2$. Thus, if we put $\varphi(v) = 1$, the macroscopic transformation equations of the first kind, with $\beta = 1/(1-v^2/c^2)^{1/2}$, are:

$$\xi = \beta\,(x - vt)$$

$$\tau = \beta\,(t - vx/c^2)$$

$$\eta = y$$

$$\zeta = z$$
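The final equations can be exercised numerically as a consistency check. The following is only a sketch: the function names `beta`, `transform` and `tau_second_kind`, the units (c = 1) and the sample values are all illustrative assumptions. It checks three things suggested by the derivation: a light ray x = ct transforms to a light ray in the moving frame, transforming with v and then with -v returns the original values, and, in the second-kind variables (X = x - vt, t), the time derivative of tau is the reciprocal of the factor 1/sqrt(1 - v^2/c^2) while the X-derivative is -v/(c^2 - v^2) times that.

```python
import math

def beta(v, c=1.0):
    """The factor 1 / sqrt(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def transform(x, y, z, t, v, c=1.0):
    """First-kind transformation: (x, y, z, t) -> (xi, eta, zeta, tau)."""
    b = beta(v, c)
    return b * (x - v * t), y, z, b * (t - v * x / c**2)

def tau_second_kind(X, t, v, c=1.0):
    """tau written in terms of the second-kind variables X = x - v t and t."""
    return transform(X + v * t, 0.0, 0.0, t, v, c)[3]

c, v = 1.0, 0.6   # assumed illustrative values

# 1. A light ray x = c t should satisfy xi = c tau in the moving frame.
t = 2.0
xi, eta, zeta, tau = transform(c * t, 0.0, 0.0, t, v, c)
print("xi =", xi, " c*tau =", c * tau)

# 2. Transforming forward with v and back with -v should return the event.
event = (1.5, 0.2, -0.3, 2.0)
back = transform(*transform(*event, v, c), -v, c)
print("round trip:", back)

# 3. Partial derivatives of tau in the second-kind variables, by finite
#    differences (tau is linear, so these are exact up to rounding).
h = 1e-3
dtau_dt = (tau_second_kind(0.0, h, v, c) - tau_second_kind(0.0, 0.0, v, c)) / h
dtau_dX = (tau_second_kind(h, 0.0, v, c) - tau_second_kind(0.0, 0.0, v, c)) / h
print("d tau/d t =", dtau_dt, " 1/beta =", 1.0 / beta(v, c))
print("d tau/d X =", dtau_dX, " expected =", -v / (c**2 - v**2) * dtau_dt)
```

The third check ties the final result back to the earlier differential relation between the x-derivative and the t-derivative of tau.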


Einstein's 1905 Paper Online


While the above page contains nothing regarding Special Relativity and the transformation equations that is unorthodox,
the following links connect to pages that do contain an unorthodox interpretation of the transformation equations.
This is because I believe that, although the Minkowski equation is mathematically valid, it can be proven that
the Minkowski metric cannot represent a really existing spacetime.

A Lightlike Interpretation of Special Relativity

The Twins Paradox


© Alen, March 2007. All rights reserved.
alen1@westserv.net.au


Material on this page may be reproduced
for personal use only.