The idea we want to develop is that you can identify where
something is by finding a property of it that is minimized
there. We want to identify which path, among all the imaginable
paths, is the actual path that starts and ends at particular
events, and to figure out what, analogous to the length of a
rubber band, should be constructed so that it will be minimized
always and only at each of the possible actual paths. What we
have so far to describe the system in question is an equation of
motion of the form $F = ma$.
We notice first a thing about minima, and how things find them. Suppose we are in a hilly country, and we are looking for water. We never find it on the sides of hills, and the only places it stays are in lakes that are located in the bottoms of valleys. So we have an example of the importance of a minimum, because where a lake forms is determined by where the height of the ground is a minimum. We notice that if there is any way to go lower than some point, water will not stay there, but will flow out (as in rivers and streams), which also tells us how water finds such minima. It simply flows from wherever it is in whatever direction is downhill, until it hits a bottom.
But this leads us to notice a thing about the bottoms of valleys. Water flows across the ground because there is a direction that takes it downhill. At the bottom of a valley all of the directions are uphill, so there is no way to flow. The special case of interest to us occurs when the valley is a smooth valley. By smooth, we mean that there are no sharp creases in the ground (supporting picture: Ireland vs. Switzerland), which means that all of the properties of the surface change only gently as we go from one place to another. In such a case, one of the things that changes smoothly is the slope of the ground. But we see what happens near the bottom of the valley. As we walk down, the slope in our direction of travel is downward, and after we pass through the bottom it is upward. If we use numbers to measure the severity and direction of the slope, we use negative numbers on the near side to tell us that we are going down, and positive numbers on the far side to tell us that we are going up. And the only way to get smoothly between them is to pass through a zero slope (no slope, or flat slope) at the bottom of the valley. Therefore we discover an old but great theorem of mathematics:
Theorem: The slope at the bottom of a smooth valley is zero.
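The picture of water finding a valley bottom can be tried out numerically. The following is only a minimal sketch: the `height` function, the starting point, and the step size are all made-up assumptions chosen for illustration, and the "water" is a single point that repeatedly takes a small step in the downhill direction.

```python
def height(x, y):
    # A smooth valley (hypothetical example); its lowest point is at (1, -2).
    return (x - 1.0) ** 2 + 2.0 * (y + 2.0) ** 2


def slope(x, y, h=1e-6):
    # Estimate the two slope components (east, north) from small steps
    # of size h in each of the two independent directions.
    east = (height(x + h, y) - height(x - h, y)) / (2 * h)
    north = (height(x, y + h) - height(x, y - h)) / (2 * h)
    return east, north


def flow_downhill(x, y, step=0.1, n_steps=500):
    # Move a little way against the slope, over and over, as water would.
    for _ in range(n_steps):
        s_east, s_north = slope(x, y)
        x, y = x - step * s_east, y - step * s_north
    return x, y


# Start on the side of the hill (assumed starting point) and let it flow.
x_end, y_end = flow_downhill(4.0, 3.0)
slope_end = slope(x_end, y_end)
print("resting place:", x_end, y_end)
print("slope there:  ", slope_end)
```

Under these assumptions the point comes to rest at the bottom of the valley, and both slope components measured there are zero to within rounding, which is the content of the theorem.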
Now we apply this to the problem of selecting paths. First, we have to assume that this set of all imaginable paths is in some sense understandable like a map, in that you can always get from one place on it to another by little steps. Remarkably, nature seems to permit us a labeling by which it is, though this has been nontrivial to discover. (The deformations by ``little bumps'', in the previous discussion of infinities, are essentially the small steps that take us from one path to another that is ``nearby''.)
Secondly, we have to assume that it is even possible to define a useful set of smooth numbers on such a map, just as it is possible to describe the surface of a hilly country by the height of the hills at each location. In other words, in addition to supposing that minimization can be used at all, we would like to suppose that the right thing to minimize would look like the smooth surface of a hilly country on a map of all imaginable paths, if we could draw such a thing. This too turns out to be possible (the math, that is; usually not the drawing).
We now ask what a slope would look like in such a territory. Well, recall that the slope in a regular hilly country tells us how much we change in height as we go little steps in each of the possible independent directions. But there are only two independent ways you can walk on the surface of the earth, so such slopes can be described with only two numbers, for example the change from going north and the change from going east.
But recall that there were infinitely many independent ``directions'' or dimensions, along which you could change from one imagined path to another, corresponding to all the places you could put little bumps on the path to deform it. Therefore, the slope of any surface we could define on this map of all imaginable paths has to have infinitely many components, one telling the slope for a bump at each point along the path. Thus, if we say that some surface in the space of all imaginable paths takes a minimum (has a valley) at the actual path, we are saying that there is some slope, some infinite collection of numbers, one for each point along the path, all of which are zero.
Suddenly we notice that what we have described looks remarkably like
what we have in $F = ma$. Certainly we change nothing in the content of
$F = ma$ if we write it instead as $F - ma = 0$. But this is not just one
statement. $F - ma = 0$ is a statement that is true at each moment
along the actual path. In other words, as the object passes through
each point of its motion, the definitions of $F$, $m$, and $a$ tell us
how to construct some number, $F - ma$, that describes the circumstances
of the object while it is at that place, and the characteristics of
its motion as it is passing through that place. Saying that each of
the numbers so generated, at each of the points of the path, is zero,
is exactly the same form of statement as saying that the
infinite-dimensional slope of something in the space of all imaginable
paths is zero at the actual possible paths.
Thus, if we can make it work, we have found our connection.
When we have a statement $F - ma = 0$, we wish to regard the set of
numbers given by $F - ma$ at each point of any path as the set of
numbers defining a slope of something at that path. We give
this ``something'' the name Action, and now we recognize
that, whenever $F - ma$ is the slope of the action in the space of
all imaginable paths, the statement that the action has a valley
(a minimum) at the actual path is the same as the statement that
along the actual path $F - ma = 0$. Just as, if we know the slope of
a hilly country everywhere, we can reconstruct the heights of all
the hills by taking little steps and ``patching together the
slopes'' as we go, in the same way, if we know $F - ma$ for each
imaginable path, we can reconstruct the value of the action at
each path by ``piecing together the slopes'' smoothly from path
to path. We will return to the question of whether this always
works in a moment.
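The claim that $F - ma$ acts as a slope, with one component per point of the path, can be checked on a path chopped into finitely many points. The sketch below rests on several assumptions that are not in the text: motion along a line, a spring-like force `force(x) = -K*x` with its potential, and the standard textbook form of the action (kinetic minus potential, summed over small time steps). The names `action`, `slope_by_bumps`, `f_minus_ma`, and `actual_path` are made up for illustration.

```python
import math

M = 1.0    # mass (assumed value)
K = 4.0    # strength of the assumed spring-like force F(x) = -K x
DT = 0.01  # time between neighboring points on the path


def force(x):
    return -K * x


def potential(x):
    # The "hill" whose downhill slope is the force above.
    return 0.5 * K * x * x


def action(path):
    # Assumed textbook form: sum of (kinetic - potential) * DT along the path.
    total = 0.0
    for i in range(len(path) - 1):
        v = (path[i + 1] - path[i]) / DT
        total += (0.5 * M * v * v - potential(path[i])) * DT
    return total


def slope_by_bumps(path, eps=1e-6):
    # One slope component per interior point: put a little bump there
    # (end points stay fixed) and see how much the action changes.
    slopes = []
    for j in range(1, len(path) - 1):
        up, down = list(path), list(path)
        up[j] += eps
        down[j] -= eps
        slopes.append((action(up) - action(down)) / (2 * eps))
    return slopes


def f_minus_ma(path):
    # The number F - ma built at each interior point, times DT.
    out = []
    for j in range(1, len(path) - 1):
        a = (path[j + 1] - 2 * path[j] + path[j - 1]) / DT ** 2
        out.append((force(path[j]) - M * a) * DT)
    return out


def actual_path(x0, x1, n_points):
    # Step forward so that F - ma = 0 holds at every interior point.
    path = [x0, x1]
    while len(path) < n_points:
        path.append(2 * path[-1] - path[-2] + DT ** 2 * force(path[-1]) / M)
    return path


# An arbitrary imagined path (made up): the slope is not zero along it,
# but it agrees, point by point, with F - ma.
imagined = [math.sin(3 * i * DT) + 0.2 * (i * DT) ** 2 for i in range(51)]
mismatch = max(abs(s - f) for s, f in
               zip(slope_by_bumps(imagined), f_minus_ma(imagined)))

# An actual path: every component of the slope vanishes.
largest_slope_on_actual = max(abs(s) for s in
                              slope_by_bumps(actual_path(1.0, 1.0, 51)))
print(mismatch, largest_slope_on_actual)
```

In this sketch the two ways of computing the slope agree to within rounding on the imagined path, and on the actual path every component is zero to the same accuracy. It checks only that the slope vanishes there, not whether the flat spot is a valley rather than some other kind of flat spot.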
What we have just described is in fact the connection between Newton's laws and the principle of least action. And, it turns out that there is no complete system that we understand for which we have not been able to construct such a description, or more accurately, the predecessor description in the quantum theory from which it follows. But we certainly have not shown all of that here. Really proving some of the assumptions, like those required to make sense of the set of all imaginable paths and the smoothness of the action there, is far more than we can do here, and really far more than there is a reason to do. The important thing from this discussion is to see the relation of its parts:
To see what could go wrong, we look at M.C. Escher's impossible
``waterfall''. (It is no accident that Escher drew such useful
pictures. He was a graphic artist rather than a trained mathematician,
but he had a great interest in how one could envision the problems that
mathematics makes definite in equations.) Suppose we were given an
equation $F = ma$, and told that the $F$ was the one responsible for the
motion of this water. Such a
force pushes around and around in a circle, in one direction only.
From this we might try to build an action, by assigning a number to
some starting path, and then using $F - ma$ as a slope in the space of
paths, to assign actions to all the other paths, each reached by some
sequence of small changes from our starting choice.
Perhaps the simplest place to start would be a path that just sits
still at an arbitrary point on the waterfall. Such a path has no
velocity, and only one position, so its action can depend only on
that. That would certainly seem to be an easy starting point in
assigning actions. Further, it doesn't matter what number we assign
to this action, since as noted the importance of the action is how it
relates different paths. Our simplest goal might be to assign actions
to all the paths that ``just sit still'' at different places around
the waterfall. Each such path is easy to obtain from the one next to
it, by just a small deformation to the side. Moreover, since each of
them has no velocity, and hence no acceleration, there is no $ma$ term
at any of them, so no contribution to the slope from that. The problem
is the form of the force. If we deform our paths by going around the
waterfall either ``with the flow'' or ``against the flow'', the $F$
part of the slope
gives us a constant contribution for every change. Thus, the action,
from one path to the next, always increases in the same direction
(either positively or negatively). However, after such a continuous
growth of action, we can arrive back at our starting path,
claiming that it requires a different action from the one we started
with. Worse yet, whether that action is larger or smaller than the
starting value depends on which way we went around the fall. Thus our
hoped-for prescription for assigning actions cannot even be sensibly
defined.
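The bookkeeping in this argument can be sketched in a few lines. Everything specific here is an assumption made for illustration: the ``sitting still'' paths are placed around a circle of radius `R`, the push has constant strength `F0` and always points counterclockwise along the circle, and each path is taken to sit for a duration `T`, so that shifting it sideways by a small step changes the action by `T` times the force along that step.

```python
import math

R = 1.0   # radius of the loop of sitting-still positions (assumed)
F0 = 1.0  # constant strength of the circulating push (assumed)
T = 1.0   # how long each sitting-still path lasts (assumed)


def circulating_force(theta):
    # Always tangent to the circle, pointing counterclockwise ("with the flow").
    return (-F0 * math.sin(theta), F0 * math.cos(theta))


def action_change_around_loop(direction, n_steps=1000):
    # direction = +1 goes with the flow, -1 against it.
    # Add up the action changes for small sideways steps, once around.
    total = 0.0
    dtheta = direction * 2 * math.pi / n_steps
    for i in range(n_steps):
        theta, theta_next = i * dtheta, (i + 1) * dtheta
        dx = R * (math.cos(theta_next) - math.cos(theta))
        dy = R * (math.sin(theta_next) - math.sin(theta))
        fx, fy = circulating_force(0.5 * (theta + theta_next))
        total += T * (fx * dx + fy * dy)
    return total


with_flow = action_change_around_loop(+1)
against_flow = action_change_around_loop(-1)
print(with_flow, against_flow)
```

Under these assumptions one full trip with the flow returns to the starting path with the action changed by about `2 * math.pi * R * F0 * T`, and a trip against the flow changes it by the opposite amount, so no single number can be assigned to the starting path.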
The reason we wind up in this trouble is that Escher's waterfall
represents flow that can't occur down any actual hill. If we
had used, instead, the $F$ corresponding to flow down any real hill,
then no matter how we went around on the hill, we would always have
had to go up exactly as much slope as we went down. Therefore, no
matter what sequence we chose in assigning actions to paths, if we
looked at a sequence that took us from our starting path back to
itself, we would always have arrived back at the same number for the
action. That is easy enough to see for the ``just sitting still''
paths.
In fact, this is the simplest expression of a more general condition
that is required of an $F = ma$, if it is to be usable to build an
action. The particular example we just considered is the contribution
that arises from the sitting-still paths alone, from just the $F$ part
of the equations of motion. It requires that, at the least, the force
must result from a slope that could be achieved from some real hill. The height of such a hill is known as a potential
function,
because it has the potential to induce motion, such as flow in water.
The size of the force that induces the flow is proportional to the
steepness of the sides of the hill. That is why, adding up the forces
around any loop that comes back to its starting place, as we did in
the last example, is guaranteed to give a consistent answer in the
case of a real hill. It just amounts to walking around in a circle
and counting up the changes in height, which of course always brings
you back to the same height, as well as the same place, as the one
where you started.
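The same adding-up can be sketched for a force that does come from a real hill. The hill `hill_height` and the closed loop below are made-up assumptions; the force is taken to be the downhill slope of that hill, and the forces are added up along small steps around the loop, in either direction.

```python
import math


def hill_height(x, y):
    # A smooth hilly surface (hypothetical example).
    return math.sin(x) + 0.5 * y * y + x * y


def hill_force(x, y):
    # The force points downhill: minus the slope of the height,
    # written out by hand for the surface above.
    return (-(math.cos(x) + y), -(y + x))


def loop_point(t):
    # An arbitrary closed loop (assumed): an off-center ellipse.
    return (2.0 * math.cos(t) + 0.5, math.sin(t) - 0.3)


def force_added_up_around_loop(direction, n_steps=2000):
    # direction = +1 or -1 chooses which way we walk around the loop.
    total = 0.0
    dt = direction * 2 * math.pi / n_steps
    for i in range(n_steps):
        x0, y0 = loop_point(i * dt)
        x1, y1 = loop_point((i + 1) * dt)
        fx, fy = hill_force(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += fx * (x1 - x0) + fy * (y1 - y0)
    return total


one_way = force_added_up_around_loop(+1)
other_way = force_added_up_around_loop(-1)
print(one_way, other_way)
```

For this assumed hill the total comes out as zero, up to the small error of taking finite steps, whichever way the loop is walked. That is the consistency a force pushing around in a circle in one direction only cannot provide.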
The reason such potential functions are interesting is that they will arise again, in the context of energy in the next section. When potential functions come from real hills, that ensures that there is no force capable of transporting an object endlessly around and around, which would be a form of getting ``something for nothing''.
More general restrictions of the same kind are needed, to apply to all
possible things $F$ could do, if we are to ensure consistency with
all possible paths. For now, this is sufficient to give an example of
the nature of these consistency constraints, and to introduce
properties of potential functions which will be explored in later
sections and in the problems. It happens that just the form of $ma$
guarantees that it will never give a problem for the part of the slope
contributed by that term alone. For more general terms, though, that
could have appeared if $F$ depended on things like velocity, or instead of
$ma$, there is no such guarantee, within the framework
intrinsic to equations of motion themselves.
Clearly, then, the important question at this stage is which kinds of
$F$ appear in nature. Certainly, if there are real cases in which
$F$ is like the Escher picture, then the Principle of Least Action
must be of limited usefulness, and must not be truly fundamental, but
simply an alternative to $F = ma$ that overlaps with it in some
instances.
We simply state for the record that so far, every time we have
found an $F$ for which there has not been a corresponding action,
this has been because we have left out part of the important
description of the system. So far there has always been a way
to go back and include something that was there in the real
world, but which we thought was not important or could be left
out of our description of $F$, in such a way that a principle of
least action can be made which works.
Having gone this far, we are in a position to fruitfully return to the question that motivated all of this search, which is how one usefully identifies the constants of the motion. Answering this question will lead us to some of the most important fundamental physical concepts in use today.