Hamiltonian Mechanics


3 Hamiltonian mechanics

In this chapter we start our investigation of the hamiltonian formalism. We have already seen glimpses of it in some of the exercises and examples of the previous chapter, but we only scratched the surface. We will see how this seemingly harmless change of perspective reveals much more about the geometric structure underlying classical dynamics.

In the lagrangian formulation for a system with \(n\) degrees of freedom and lagrangian function \(L=L(q, \dot q)\), we have seen that the equations of motion are encoded by the Euler-Lagrange equations, which are \(n\) second order differential equations in \(q\), requiring \(2n\) initial conditions, say \(q(0)\) and \(\dot q(0)\).

In the previous chapters, we saw that the \(n\) generalized momenta

\begin{equation} p_i = \pdv {L}{\dot q^i}, \quad i=1,\ldots ,n, \end{equation}

play a relevant role in the first integrals at the core of Noether’s theorem, and in particular in the conservation of energy. If we then look at the Euler-Lagrange equations and revisit them in terms of the momenta, they acquire the nice form

\begin{equation} \dot p_i = \pdv {L}{q^i}, \quad i=1,\ldots ,n. \end{equation}

This seems to bring a nice symmetry between the coordinates and the momenta, and to allow us to reduce the problem to a first order system of equations.

Our current quest is then to find a function of \(q\) and \(p\) (and not of \(\dot q\)) which contains the same information as the lagrangian \(L\), allowing us to determine the unique evolution of \(q\) and \(p\), but putting the two variables on a more equal footing.

Theorem 1.6 and Theorem 1.7 provided us with some useful hints. Before entering into the gist of the matter, let’s reason in two dimensions and mimic our derivation of Theorem 1.6.

3.1 The Legendre Transform in the euclidean plane

In this section we’ll be a bit sloppy; please bear with me. Let’s start by considering the total derivative of an arbitrary function \(f(x,y)\),

\begin{equation} \dd f = \frac {\partial f}{\partial x} \dd x + \frac {\partial f}{\partial y} \dd y, \end{equation}

and defining a new function of three variables \(x,y,u\):

\begin{equation} g(x,y,u) = ux - f(x,y). \end{equation}

The total derivative of \(g\) is then given by

\begin{equation} \label {eq:hamdgex} \dd g = \dd (ux) - \dd f = u\dd x + x \dd u - \frac {\partial f}{\partial x} \dd x - \frac {\partial f}{\partial y} \dd y. \end{equation}

Even though \(u\) is a free variable, let’s assume for the sake of the argument that we choose it to be a specific function of \(x\) and \(y\), say

\begin{equation} u(x,y) = \frac {\partial f}{\partial x}. \end{equation}

Then (3.5) simplifies into

\begin{equation} \dd g = x \dd u - \frac {\partial f}{\partial y} \dd y, \end{equation}

i.e., \(g=g(y,u)\) could be considered a function of just \(y\) and \(u\). Solving \(u = u(x,y)\) for \(x\) to obtain \(x = x(y,u)\), we could write down \(g\) explicitly as

\begin{equation} g(y,u) = u\, x(y,u) - f(x(y, u), y). \end{equation}

We have just described an operation which takes a function \(f(x,y)\) to a different function \(g(y,u)\) where \(u = \frac {\partial f}{\partial x}\) without losing any information: we can recover \(f(x,y)\) from \(g(y,u)\) by observing that

\begin{equation} \frac {\partial g}{\partial u} = x \quad \mbox {and}\quad \frac {\partial g}{\partial y} = - \frac {\partial f}{\partial y}, \end{equation}

which ensures that we can invert the transformation to get \(f(x,y) = \frac {\partial g}{\partial u}u - g\).

We have just described the Legendre transform. If you look carefully you may recognize it in Theorem 1.7.
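As a quick sanity check of the recipe above, here is a small worked case. The function \(f(x,y) = \frac 12 y x^2\) with \(y>0\) is a hypothetical choice made only for illustration; it does not come from the previous chapters.

\begin{align*} u & = \frac {\partial f}{\partial x} = yx \quad \Rightarrow \quad x(y,u) = \frac uy, \\ g(y,u) & = u\, \frac uy - \frac 12 y \frac {u^2}{y^2} = \frac {u^2}{2y}, \\ \frac {\partial g}{\partial u} & = \frac uy = x, \qquad \frac {\partial g}{\partial y} = -\frac {u^2}{2y^2} = -\frac {x^2}{2} = -\frac {\partial f}{\partial y}, \\ \frac {\partial g}{\partial u}u - g & = \frac {u^2}{y} - \frac {u^2}{2y} = \frac {u^2}{2y} = \frac 12 y x^2 = f(x,y). \end{align*}

The last line is the inverse transformation: starting from \(g\) alone we land back on \(f\).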

The geometrical meaning of the Legendre transform is best captured with a picture\(^1\), although I don’t believe this will justify it more or less than the description before. For a fixed \(y\), draw the graph of \(f(x) = f(x,y)\) and the line \(ux\). For each slope \(u\), the value of \(g(u) = g(y,u) = \max _x (ux - f(x))\) measures the maximal vertical distance between the line \(ux\) and the curve \(f(x)\): finding the maximum of such distance means computing

\begin{equation} \frac {\partial }{\partial x}(ux - f(x)) = 0 \quad \rightarrow \quad u = \frac {\partial f}{\partial x}. \end{equation}

Note that in the computation above we are assuming that a maximum exists: this hints at the fact that the Legendre transformation, as described so far, can only be applied to convex functions.

One advantage of this geometric approach is that the procedure is meaningful also for non-convex functions, even if this comes at the cost of losing involutivity: the transform is its own inverse only on the space of convex (or concave) functions. You can read more about this in convex optimization books\(^2\), where it is often called convex conjugation or Legendre-Fenchel transform.
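The maximization above is easy to try numerically. The following is a minimal sketch, not a robust implementation: it replaces the maximum over \(x\) by a maximum over a finite grid, and the grids, the helper name `legendre_fenchel` and the two test functions are all assumptions made here for illustration.

```python
def legendre_fenchel(f, xs):
    """Return g with g(u) = max over the sample points xs of u*x - f(x)."""
    def g(u):
        return max(u * x - f(x) for x in xs)
    return g

# Sample points on [-5, 5] with step 0.01 (hypothetical choice).
xs = [i / 100 for i in range(-500, 501)]
# Slopes on [-3, 3] with step 0.1, used for the second transform.
us = [i / 10 for i in range(-30, 31)]

def f(x):
    return 0.5 * x * x            # convex

def w(x):
    return (x * x - 1.0) ** 2     # double well, not convex

g = legendre_fenchel(f, xs)       # approximates u**2 / 2
f_back = legendre_fenchel(g, us)  # transform of the transform of f
w_back = legendre_fenchel(legendre_fenchel(w, xs), us)

print(g(1.5), f_back(1.0))        # close to 1.125 and 0.5
print(w(0.0), w_back(0.0))        # 1.0 versus roughly 0.0
```

For the convex \(f\) the double transform gives back \(f\), while for the double well it returns the convex envelope, which vanishes at \(x=0\) even though \(w(0)=1\): this is the loss of involutivity mentioned above.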

In the following sections we will make this discussion rigorous and present it in greater generality.

1 Metaphorically, but I should probably add a sketch here.

2 Or read this rather detailed blog post or these notes.

3.2 Intermezzo: cotangent bundle and differential forms

Again we will only briefly present the basic concepts; you can refer to [J.M13] or [Ser20, Chapters 5-7] for further details.

Given an \(n\)-dimensional vector space \(V\) spanned by a basis \(\{e_1, \ldots , e_n\}\), we can define its dual space as the space \(V^* = \{f : V \to \mathbb {R} \; \mbox {linear}\}\) of linear functions from \(V\) to \(\mathbb {R}\). As a basis for the dual space we can define the dual basis \(\{e^1, \ldots , e^n\}\) as the linear maps such that \(e^i(e_j) = \delta ^i_j\). Since any linear function is determined by its action on the basis elements, we can determine any \(\sigma \in V^*\) by computing its coefficients \(\sigma _i = \sigma (e_i)\). Then \(\sigma = \sigma _i e^i\).

We have seen that tangent vectors at a point \(x\) on a smooth manifold \(P\) are elements of an \(n\)-dimensional vector space, the tangent space \(T_x P\), and in coordinates they take the form \(v^i \frac {\partial }{\partial x^i}\). We can then use the theory of the previous paragraph to define the dual space to \(T_x P\), the cotangent space \(T^*_x P\). We call the elements of the cotangent space at \(x\) covectors or one-forms at \(x\).

It is convenient to denote the dual basis to \(\{\frac {\partial }{\partial x^i} \mid 1\leq i \leq n\}\) by

\begin{equation} \{ \dd x^i \mid 1 \leq i \leq n \}, \quad \dd x^i\left (\frac {\partial }{\partial x^j}\right ) = \delta _j^i. \end{equation}

Even though we settled on the pushforward notation \(\phi _*\) for differentials, we have already seen the \(\dd \) appearing as alternative notation in the definition of the differential of a function. How is it related to one-forms? Observe that for \(f : P \to \mathbb {R}\), we have

\begin{align} \dd f_x : & T_x P \to T_{f(x)} \mathbb {R} \simeq \mathbb {R} \\ & v_x = \dot x^i \frac {\partial }{\partial x^i} \mapsto \dot x^i \frac {\partial f}{\partial x^i}. \end{align} Since derivations are linear maps, identifying the one-dimensional vector space \(T_{f(x)}\mathbb {R} \simeq \mathbb {R}\) with the reals (it is really just about sending the local basis of the tangent space to \(1\) as done in the equation just above), we can view \(\dd f_x\) as a linear map from \(T_x P \to \mathbb {R}\), that is as a differential one-form at \(x\). What are its coefficients? Let’s apply the recipe described at the beginning of the section and compute

\begin{equation} (\dd f_x)_i = \dd f_x\left (\frac {\partial }{\partial x^i}\right ) = \frac {\partial f(x)}{\partial x^i}. \end{equation}

That is, \(\dd f_x = \frac {\partial f(x)}{\partial x^i} \dd x^i\). As you can see, this is directly generalizing the same concepts from multivariable analysis.

Analogously to tangent bundles, we can glue cotangent spaces and define the cotangent bundle as \(T^*P := \cup _{x\in P} \{x\}\times T^*_x P\). Also the cotangent bundle is a \(2n\)-dimensional smooth manifold. The concept of differential forms extends to the cotangent bundle in a similar way as vector fields generalized the notion of tangent vectors to the level of the tangent bundle: a differential one-form on \(P\) is a map \(\sigma : x\in P \mapsto \sigma (x) \in T^*_x P\).

We are almost there: we need just a few more concepts. First of all we will need to talk about differential two-forms. These are a particular family of antisymmetric bilinear maps \(TP \times TP \to \mathbb {R}\), acting on pairs of tangent vectors at the same point. Given two one-forms \(\sigma \) and \(\nu \) we can define a two-form using an operation called wedge product: \(\sigma \wedge \nu \) is the antisymmetric bilinear map such that for all \(X,Y \in T_xP\),

\begin{equation} \sigma \wedge \nu (X,Y) := \sigma (X)\nu (Y) - \sigma (Y)\nu (X). \end{equation}

Any two-form can be written in terms of the basis \(\{\dd x^i\wedge \dd x^j\}\). Due to antisymmetry \(\dd x^i \wedge \dd x^j = - \dd x^j \wedge \dd x^i\), so one can simplify formulas by summing over ordered indices \(i<j\) and considering only half of the terms. Also due to antisymmetry, \(\dd x^i \wedge \dd x^i = 0\). As a bilinear map, any two-form \(\sigma \) can be associated with an antisymmetric matrix \(S\) whose coefficients are computed by evaluating \(\sigma \) on basis elements: \(S_{ij} = \sigma \left (\frac {\partial }{\partial x^i}, \frac {\partial }{\partial x^j}\right )\).
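At a fixed point, the wedge product and its matrix can be sketched in a few lines. This is only an illustration under the assumption that covectors and vectors are stored as plain lists of coefficients; the names `wedge` and `evaluate` are hypothetical helpers introduced here.

```python
def wedge(sigma, nu):
    """Matrix S_ij = sigma_i nu_j - sigma_j nu_i of the two-form sigma ^ nu."""
    n = len(sigma)
    return [[sigma[i] * nu[j] - sigma[j] * nu[i] for j in range(n)]
            for i in range(n)]

def evaluate(S, X, Y):
    """Evaluate the two-form with matrix S on the pair of vectors (X, Y)."""
    n = len(X)
    return sum(S[i][j] * X[i] * Y[j] for i in range(n) for j in range(n))

# Coefficients of dx^1 and dx^2 in three dimensions.
dx1 = [1, 0, 0]
dx2 = [0, 1, 0]

S = wedge(dx1, dx2)
X = [1, 2, 3]
Y = [4, 5, 6]

# sigma(X) nu(Y) - sigma(Y) nu(X) = 1*5 - 4*2
print(evaluate(S, X, Y))
# Swapping the arguments flips the sign.
print(evaluate(S, Y, X))
```

The matrix `S` is antisymmetric by construction, which is why evaluating on \((X,X)\) always gives zero.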

The contraction (or interior product) \(\iota _X\) of a one–form \(\theta = \theta _l(x)\dd x^l\) with a vector field \(X = X^l(x)\frac {\partial }{\partial x^l}\) is the natural pairing between the tangent and cotangent space:

\begin{equation} \iota _X \theta : P \to \mathbb {R}, \quad \iota _X \theta := \theta (X) = \theta _l(x) X^l(x). \end{equation}

Similarly, given a two-form \(\omega = \omega _{ij}(x) \dd x^i\wedge \dd x^j\), we can extend the contraction as the operation \(\iota _X\) that reduces it to a one–form by applying \(X\) as the first argument of \(\omega \), that is,

\begin{equation} \label {eq:contraction} \iota _X \omega := \omega (X, \cdot ) := \omega _{ij}(x)(X^i(x) \dd x^j - X^j(x) \dd x^i). \end{equation}

We will see in later chapters how this simplifies our lives.

The last thing to know is that we can construct two-forms out of one-forms using the so-called exterior product (the wedge product above) and exterior derivative. For what concerns us, we will only need to know that \(\dd (\sigma _i \dd x^i) = \frac {\partial \sigma _i}{\partial x^j} \dd x^j \wedge \dd x^i\) and that \(\dd (\dd f) = 0\) for any \(f\).
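Both rules can be checked by hand on \(\mathbb {R}^2\) with coordinates \((x,y)\). The one-form \(\sigma = x\,\dd y - y\,\dd x\) and the function \(f = xy\) below are hypothetical choices, made only to illustrate the formulas:

\begin{align*} \dd \sigma & = \frac {\partial x}{\partial x}\, \dd x \wedge \dd y - \frac {\partial y}{\partial y}\, \dd y \wedge \dd x = \dd x \wedge \dd y + \dd x \wedge \dd y = 2\, \dd x \wedge \dd y, \\ \dd f & = y\, \dd x + x\, \dd y, \\ \dd (\dd f) & = \dd y \wedge \dd x + \dd x \wedge \dd y = 0. \end{align*}

In the second computation the two terms cancel precisely because of the antisymmetry of the wedge product.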

We will also need to generalize the pull-back operation. Up until now we only encountered \(\Phi ^* : \mathcal {C}^\infty (T^*M) \to \mathcal {C}^\infty (T^*M)\), where \(\Phi \) is a diffeomorphism \(\Phi : T^*M \to T^*M\), defined in terms of right-composition \(\Phi ^* f = f \circ \Phi \). Note that if \(x = \Phi (y)\) then the pullback maps \(f(x)\) into \(\Phi ^* f (y) = f \circ \Phi (y)\), so you can effectively use the pullback to change coordinates. If \(\theta \) is a differential form on a manifold \(P\), then the diffeomorphism \(\Phi : P \to P\) induces a mapping on the space of differential forms on \(P\) defined \(\forall x\in P, X_j\in T_xP\) by

\begin{equation} (\Phi ^{*}\theta )_{x}(X_{1},\ldots ,X_{k})=\theta _{\Phi (x)}(\Phi _*(x)X_{1},\ldots ,\Phi _*(x)X_{k}). \end{equation}

We call \(\Phi ^*\theta \) the pullback of \(\theta \) by \(\Phi \). We will only be interested in \(k=1,2\).

The pullback of differential forms satisfies two useful identities that will be handy later on.

  • 1. it is compatible with the exterior product, i.e., if \(\theta _1\) and \(\theta _2\) are differential forms on \(P\), then \(\Phi ^{*}(\theta _1 \wedge \theta _2) = \Phi ^{*}\theta _1 \wedge \Phi ^{*}\theta _2\);

  • 2. it is compatible with the exterior derivative, i.e., if \(\theta \) is a differential form on \(P\), then \(\Phi ^{*}(\dd \theta ) = \dd (\Phi ^{*}\theta )\).

Finally, an important operation in differential geometry is the Lie derivative. The concept is quite general, but we will only need it in the context of differential forms. Given a vector field \(X\) on a manifold \(P\) and the one–parameter group of diffeomorphisms \(\Phi _t\) that it generates, we define the Lie derivative of a differential form \(\theta \) with respect to \(X\) as

\begin{equation} \label {eq:lie-derivative} \mathcal {L}_X \theta = \dv {t}\Phi _t^*\theta \Big |_{t=0}. \end{equation}

The Lie derivative evaluates the change of a tensor field (including scalar functions, vector fields and differential forms) along the flow defined by another vector field.

The interior product is extremely useful when you want to compute something, as it relates the exterior derivative and Lie derivative of differential forms by the Cartan (magic) formula:

\begin{equation} \label {eq:cartan} \mathcal {L}_X \Theta = \dd (\iota _X \Theta ) + \iota _X (\dd {\Theta }). \end{equation}
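To see the formula at work on a concrete case (an illustration chosen here, not one taken from the references), take \(P = \mathbb {R}^2\) with coordinates \((x,y)\), the vector field \(X = \frac {\partial }{\partial x}\) and the one-form \(\theta = x\,\dd y\). The flow of \(X\) is \(\Phi _t(x,y) = (x+t, y)\), so \(\Phi _t^*\theta = (x+t)\,\dd y\) and the definition of the Lie derivative gives \(\mathcal {L}_X \theta = \dd y\). On the other hand,

\begin{align*} \iota _X \theta & = \theta (X) = 0, & \dd (\iota _X \theta ) & = 0, \\ \dd \theta & = \dd x \wedge \dd y, & \iota _X (\dd \theta ) & = \dd x(X)\, \dd y - \dd y(X)\, \dd x = \dd y, \end{align*}

so the right-hand side of Cartan’s formula is also \(\dd y\), in agreement with the direct computation.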

With all this new equipment at hand, we are ready to delve into the hamiltonian side of mechanics.

3.3 The Legendre Transform

Let \(M\) be a smooth manifold. The evolution of the system with lagrangian \(L=L(q,\dot q)\) is given by the Euler-Lagrange equations: a system of second order equations on the tangent bundle \(TM\). Such dynamics can be reinterpreted in a particularly symmetric way as a dynamical system on the cotangent bundle \(T^*M\). Poisson and Hamilton have formalized the transformation that we described in Section 3.1 and that realizes such symmetry.

We call Legendre transformation the map from the tangent bundle \(TM\) to the cotangent bundle \(T^*M\) defined in local coordinates \((q^1, \ldots , q^n)\in M\) by

\begin{equation} \label {eq:legrendreTrafo} \mathcal {L}: (q,\dot q) \mapsto (q,p), \end{equation}

where \(p = (p_1, \ldots , p_n) \in T^*_q M\) is the conjugate momentum to \(q\) defined via \(p_i := \frac {\partial L(q,\dot q)}{\partial \dot q^i}\), \(i=1,\ldots ,n\).

It follows from Exercise 1.7 that the definition above does not depend on the choice of local coordinates.

If it is not yet clear to you how the momenta are associated to the cotangent bundle, I promise you will not have to wait long. But if you are really impatient you could already have a look at Remark 3.4.

  • Theorem 3.1. The Legendre transform of a non–degenerate lagrangian is a local diffeomorphism.

  • Proof. It is enough to show that the Jacobian determinant of \(\mathcal {L}\) is non zero: the claim then follows from the inverse function theorem. From the definition, see (3.21), we have the block matrix

    \begin{equation} D\mathcal {L} = \begin{pmatrix} \frac {\partial q^i}{\partial q^j} & \frac {\partial q^i}{\partial \dot q^j} \\ \frac {\partial p_i}{\partial q^j} & \frac {\partial p_i}{\partial \dot q^j} \end {pmatrix} = \begin{pmatrix} \delta ^i_j & 0 \\ \frac {\partial ^2 L}{\partial \dot q^i \partial q^j} & \frac {\partial ^2 L}{\partial \dot q^i \partial \dot q^j} \end {pmatrix}. \end{equation}

    Since this matrix is block lower triangular with an identity block in the upper left corner, we get

    \begin{equation} \det D\mathcal {L} = \det \left (\frac {\partial ^2 L}{\partial \dot q^i \partial \dot q^j}\right ) \neq 0. \end{equation}

It follows that for a non–degenerate lagrangian we can locally invert the equations

\begin{equation} p_i = \frac {\partial L(q,\dot q)}{\partial \dot q^i}, \; i=1,\ldots ,n, \end{equation}

and write

\begin{equation} \dot q^i = \dot q^i(q,p). \end{equation}

In this way, given a non–degenerate lagrangian, every function on \(TM\) locally defines a function on \(T^* M\) by substituting \(\dot q = \dot q(q,p)\).

We call hamiltonian of a mechanical system with lagrangian \(L\), the Legendre transform \(H(q,p)\) of its energy \(E(q,\dot q)\) from (1.84). In local coordinates the hamiltonian of the system is given by

\begin{equation} \label {eq:defH} H(q,p) := p_i\dot q^i - L(q, \dot q), \quad \dot q^i = \dot q^i(q,p). \end{equation}
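For one degree of freedom the definition can be turned into a small numerical experiment. The sketch below makes several assumptions that are not part of the text: the momentum is approximated by a central finite difference, the inversion \(\dot q = \dot q(q,p)\) is done by bisection on a fixed velocity window assuming that \(p\) is increasing in \(\dot q\), and the helper name `hamiltonian_from_lagrangian` is hypothetical.

```python
def hamiltonian_from_lagrangian(L, vmin=-100.0, vmax=100.0, h=1e-4, tol=1e-10):
    """Return H(q, p) = p*v - L(q, v), with v solving p = dL/dv.

    Assumes one degree of freedom and dL/dv increasing in v on [vmin, vmax].
    """
    def momentum(q, v):
        # Central finite difference approximating dL/dv.
        return (L(q, v + h) - L(q, v - h)) / (2.0 * h)

    def H(q, p):
        lo, hi = vmin, vmax
        # Bisection: shrink [lo, hi] around the velocity with momentum p.
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if momentum(q, mid) < p:
                lo = mid
            else:
                hi = mid
        v = 0.5 * (lo + hi)
        return p * v - L(q, v)

    return H

# Harmonic oscillator with hypothetical parameters m = 2, k = 3.
m, k = 2.0, 3.0

def L_osc(q, v):
    return 0.5 * m * v * v - 0.5 * k * q * q

H_osc = hamiltonian_from_lagrangian(L_osc)

# Compare with the closed form p**2 / (2 m) + k q**2 / 2 at (q, p) = (1, 4).
print(H_osc(1.0, 4.0), 4.0 ** 2 / (2.0 * m) + 0.5 * k * 1.0 ** 2)
```

The bisection step is where non–degeneracy enters: without a monotone relation between \(\dot q\) and \(p\) there would be no unique velocity to plug into \(p\dot q - L\).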

  • Remark 3.1. This is what is meant when you read that the hamiltonian is the lift of the total energy \(E(q, \dot q)\) of the system to a function \(H(q,p)\) on the cotangent bundle \(T^*M\).

  • Remark 3.2. This is where the literature can disagree. Some textbooks define \(H\) to be the Legendre transform of \(L\) instead. In this way the definition can be rewritten\(^3\) as

    \(f\) is the Legendre transform of \(g\) if \(f' = (g')^{-1}, \)

    which nicely emphasizes involutivity and duality between the two functions.

3 This is very nicely discussed in this blog post.

  • Remark 3.3. Perhaps not surprisingly, the Legendre transform also comes with some interesting connections to control theory and differential geometry. Here I will focus on the latter. For a more detailed discussion about the relations to control theory, you can refer to [GF00, Appendix II]. We will not go into the details here and instead only point out the bare bones of the connection. Optimization problems can often be reduced to finding a critical point of some functional, like the action, subject to some extra external constraints, called controls. In such a context, it is natural to rely on the theory of Lagrange multipliers that we saw in Chapter 2.5 on D’Alembert’s principle. If one considers the critical points of the action \(S\) for a lagrangian \(L(x,v)\) with the constraint \(v=\dot x\), what one finds is that the momenta are the Lagrange multipliers of this problem.

    The connection with differential geometry goes through the concept of fiber derivative. Once you see that term appearing, you can think of it as a Legendre transform in disguise. We will just briefly discuss it in this remark but you can look at [MR99, Chapter 7.2] or [AMR88, Chapter 3.5] for all the details and a more general definition. Let \(L:TM \to \mathbb {R}\) be a non-degenerate lagrangian. The fiber derivative is a map

    \begin{align} \mathbb {F}L : TM \to T^*M, \quad v \mapsto \mathbb {F}L(v), \end{align} such that for all \(v,w \in T_q M\)

    \begin{align} \mathbb {F}L(v) w := \dv {s} L(q, v + s w) \big |_{s=0}. \end{align} That is, \(\mathbb {F}L(v) w\) is the derivative of \(L\) at \(v\) in the direction \(w\) along the fiber \(T_qM\). The last observation is crucial: the map is fiber-preserving in the sense that it maps the fiber \(T_qM\) to the fiber \(T^*_qM\). If we denote local coordinates on \(M\) by \(q=(q^1, \ldots , q^n)\), vectors on \(T_qM\) take the form \(\dot q = \dot q^i \frac {\partial }{\partial q^i}\big |_q \simeq (\dot q^1, \ldots , \dot q^n)\) and the fiber derivative can be written as

    \begin{equation} \mathbb {F}L(q^1, \ldots , q^n, \dot q^1, \ldots ,\dot q^n) = \left (q^1, \ldots , q^n, \frac {\partial L}{\partial \dot q^1}, \ldots , \frac {\partial L}{\partial \dot q^n}\right ). \end{equation}

    That is, the coefficients of the covector \(p = \mathbb {F}L(\dot q) \simeq p_i \dd q^i\) are given by

    \begin{equation} p_i = \pdv {L}{\dot q^i}. \end{equation}

    For non-degenerate lagrangians the fiber derivative is a smooth morphism of fiber bundles.

    The energy function associated to \(L\) is then defined as \(E(v) := \mathbb {F}L(v) v - L(v)\). When this is applied to physical systems, you can use the expressions derived above to check that \(\mathbb {F}L\) is the Legendre transform and \(E\) is the corresponding hamiltonian.

Leaving all these remarks aside, the crucial point of this whole construction lies in the new shape taken by the Euler-Lagrange equations once we revisit them on the cotangent bundle.

  • Theorem 3.2 (Hamilton’s equations). Euler-Lagrange equations for a lagrangian \(L(q,\dot q)\) are equivalent to Hamilton’s equations

    \begin{equation} \label {eq:Hamsys} \left \lbrace \begin{aligned} \dot q^i & = \frac {\partial H}{\partial p_i} \\ \dot p_i & = -\frac {\partial H}{\partial q^i} \end {aligned} \right . \qquad i=1,\ldots ,n, \end{equation}

    for the corresponding hamiltonian \(H(q,p)\).

  • Proof. The differential of the hamiltonian,

    \begin{equation} \label {eq:dHlc} \dd H(q,p) = \frac {\partial H}{\partial q^i}\dd q^i + \frac {\partial H}{\partial p_i}\dd p_i, \end{equation}

    can be explicitly computed using its definition:

    \begin{align} \dd H(q,p) & = \dd {}\left ( p_i\dot {q}^i(q,p) - L(q, \dot {q}(q,p)) \right ) \\ & = \dot {q}^i \dd {p_i} + p_i \dd {\dot {q}^i}(q,p) - \pdv {L}{q^i} \dd {q^i} - \frac {\partial L}{\partial \dot {q}^i} \dd {\dot {q}^i} \\ & = \dot {q}^i \dd {p_i} - \dot {p}_i \dd {q^i}, \label {eq:dHlc2} \end{align} where we used the definition of the momenta and the Euler-Lagrange equations, that is,

    \begin{equation} p_i = \frac {\partial L}{\partial \dot {q}^i} \qquad \mbox {and}\qquad \dot p_i = \dv {t}\frac {\partial L}{\partial \dot {q}^i} = \pdv {L}{q^i}. \end{equation}

    The theorem follows by comparing coefficients in (3.32) and (3.35).

We call hamiltonian system a system of differential equations of the form (3.31) and, as one can expect, we say that \(H\) is its corresponding hamiltonian function. Equations of the form (3.31) are usually called canonical equations.
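Since canonical equations are a first order system, any standard ODE scheme can follow their flow. Here is a minimal sketch for one degree of freedom using the classical fourth order Runge-Kutta method; the scheme, the step size and the function name `hamiltonian_flow` are choices made here for illustration and not something prescribed by the theory.

```python
import math

def hamiltonian_flow(dH_dq, dH_dp, q, p, dt, steps):
    """Integrate q' = dH/dp, p' = -dH/dq with classical Runge-Kutta (RK4)."""
    def rhs(q, p):
        return dH_dp(q, p), -dH_dq(q, p)

    for _ in range(steps):
        k1q, k1p = rhs(q, p)
        k2q, k2p = rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
        k3q, k3p = rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
        k4q, k4p = rhs(q + dt * k3q, p + dt * k3p)
        q += dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        p += dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q, p

# Harmonic oscillator H = p**2 / 2 + q**2 / 2, whose orbits have period 2 pi.
def H(q, p):
    return 0.5 * p * p + 0.5 * q * q

steps = 1000
q1, p1 = hamiltonian_flow(lambda q, p: q, lambda q, p: p,
                          1.0, 0.0, 2.0 * math.pi / steps, steps)

# After one period the orbit should be back near (1, 0) with the same energy.
print(q1, p1, H(q1, p1) - H(1.0, 0.0))
```

Note that a generic scheme like this one only conserves the energy approximately, so the last printed number is small but not exactly zero.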

  • Example 3.1 (A particle in a potential). Let’s start with a simple prototypical example: a particle in \(\mathbb {R}^3\) moving under the influence of a potential \(U(\vb *{x})\). We are, by now, very accustomed to the corresponding natural lagrangian

    \begin{equation} L(\vb *{x},\dot {\vb *{x}}) = \frac {m \dot {\vb *{x}}^2}{2} - U(\vb *{x}). \end{equation}

    In this case

    \begin{equation} \vb *{p} = \frac {\partial L}{\partial \dot {\vb *{x}}} = m \dot {\vb *{x}} \end{equation}

    coincides with what we usually call (kinetic) momentum in physics.

    The hamiltonian is then given by

    \begin{equation} H(\vb *{p},\vb *{x}) = \langle \vb *{p}, \dot {\vb *{x}}\rangle - L = \frac {\vb *{p}^2}{2m} + U(\vb *{x}), \end{equation}

    which should remind you of Theorem 1.6. Hamilton’s equations are then simply

    \begin{equation} \left \lbrace \begin{aligned} \dot {\vb *{x}} & = \frac {\partial H}{\partial \vb *{p}} = \frac 1m \vb *{p} \\ \dot {\vb *{p}} & = - \frac {\partial H}{\partial \vb *{x}} = -\frac {\partial U}{\partial \vb *{x}} \end {aligned} \right ., \end{equation}

    which should also be familiar: the first is just the definition of momentum while the second is Newton’s second law for the system.

  • Example 3.2 (The hamiltonian for a natural lagrangian on a riemannian manifold). Let \(L\) be a free lagrangian

    \begin{equation} L(q, \dot q) = \frac 12 g_{ij}(q) \dot q^i \dot q^j \end{equation}

    on the tangent bundle \(TM\) of a riemannian manifold \((M,g)\). Recall that

    \begin{equation} \dd s^2 = g_{ij}(q)\dd q^i \dd q^j. \end{equation}

    Denote by \((g^{ij}(q))\) the inverse of the metric matrix \((g_{ij}(q))\), i.e., the matrix such that \(g^{ik}(q)g_{kj}(q) = \delta ^i_j\) for all \(i,j=1,\ldots ,n\).

    The Legendre transform of \(L\) is, then, the hamiltonian

    \begin{equation} H = \frac 12 g^{ij}(q) p_i p_j,\quad \mbox {with}\quad p_i := g_{ij}(q)\dot q^j, \; i = 1,\ldots , n, \end{equation}

    while the inverse transform gives \(L\) in terms of

    \begin{equation} \dot q^i = g^{ij}(q) p_j, \quad i = 1,\ldots , n. \end{equation}

    As we saw when we defined the Legendre transform and in Exercise 1.7, the behavior of the metric tensor under coordinate transformations implies that \(H\) is invariant under changes of coordinates. Moreover, we saw in Exercise 1.8 that geodesics on \(M\) are the solutions of the Euler-Lagrange equations for the free lagrangian \(L\). But then, Theorem 3.2 implies that geodesics should appear also from the flow of Hamilton’s equations

    \begin{equation} \label {eq:geoflowham} \left \lbrace \begin{aligned} \dot q^i & = \frac {\partial H}{\partial p_i} = g^{ij}(q) p_j \\ \dot p_i & = -\frac {\partial H}{\partial q^i} = -\frac 12 \frac {\partial g^{jk}}{\partial q^i} p_j p_k \end {aligned} \right . \qquad i=1,\ldots ,n. \end{equation}

    Indeed, we can describe the geodesics in terms of the flow determined by (3.45), also called the cogeodesic flow: the geodesics are the projections of the integral curves of the (co)geodesic flow onto the manifold \(M\).

    • Exercise 3.1. Show that a simple substitution allows one to recover the Euler-Lagrange equations of Exercise 1.8, which give the geodesic flow on the tangent bundle \(TM\).

    As we know that the hamiltonian is the total energy of the system and that the total energy is conserved, this immediately implies that \(g_{ij}(q) \dot q^i \dot q^j = \langle \dot q, \dot q\rangle _g = \mathrm {const}\), which was one of our previous exercises.

  • Remark 3.4. The previous example provides a good opportunity to discuss the nature of the momentum from yet another perspective. Let \((M, g)\) be a riemannian manifold. The riemannian metric, or riemannian metric tensor, is a bilinear form \(\langle \cdot , \cdot \rangle _g\) on the tangent spaces of \(M\). In local coordinates \(q\in M\), we can associate to the metric a matrix \(g_{ij}(q)\). With an abuse of notation, we say that \(g_{ij}(q)\) is a \((0,2)\)-tensor.

    Every \((0,2)\)-tensor \(g_{ij}(q)\) canonically defines a map4

    \begin{equation} \hat g: TM \to T^*M, \qquad T_q M \ni v \mapsto \hat g_q(v) := p := \langle v, \cdot \rangle _g \in T^*_q M. \end{equation}

    Such tensor is called non–degenerate if the map \(T_qM \to T_q^* M\) is an isomorphism for all \(q\in M\). Such isomorphism allows one to identify tangent and cotangent spaces and, in particular, to define a dual bilinear form \(\langle \cdot , \cdot \rangle _g^*\) on \(T^*M\) using the original form on \(TM\):

    \begin{equation} \langle w_1, w_2\rangle _g^* = \langle \hat g_q^{-1} w_1, \hat g_q^{-1} w_2\rangle _g, \quad w_1,w_2 \in T^*_q M. \end{equation}

    The matrix of this dual form coincides with the inverse of the metric matrix:

    \begin{equation} \langle \dd q^i, \dd q^j\rangle _g^* = g^{ij}(q). \end{equation}

    Which, in more algebraic terms and with the same abuse of notation, means that the inverse matrix \((g^{ij}(q))\) is a \((2,0)\)-tensor.

    We can revisit Example 3.2 in light of this remark. What we see is that a mechanical lagrangian \(L(q,\dot q) = \frac 12 \langle \dot q, \dot q\rangle _g - U(q)\) on some state space \(TM\) is mapped via the Legendre transform into the hamiltonian \(H(q, p) = \frac 12 \langle p, p \rangle _g^* + U(q)\) on \(T^*M\), emphasizing the dual nature of the momentum with respect to the velocity without the need to look at the fiber derivative construction.

4 The isomorphisms between the tangent and cotangent spaces defined here are commonly known as musical isomorphisms: the flat and sharp maps respectively correspond to the mappings \(\flat :TM\to T^*M\) and \(\sharp :T^*M\to TM\) such that \(v^\flat (w) = \langle v,w\rangle _g\) and \(\langle p^\sharp , v\rangle _g = p(v)\). See also [J.M13, Chapter 11] or [Ser20, Example 6.1.9].

  • Example 3.3. We saw in Example 3.1 that the hamiltonian of a free particle of mass \(m\) is, in cartesian coordinates,

    \begin{equation} H = \frac {1}{2m}(p_x^2 + p_y^2 + p_z^2). \end{equation}

    The preliminary computations that we did in Section 1.4 and Example 3.2 immediately tell us how to compute the corresponding hamiltonian in cylindrical and spherical coordinates. Namely, we can invert the metric matrix to get

    \begin{align} \mbox {cylindrical coordinates } (r,\phi ,z) \quad & \rightarrow \quad H = \frac 1{2m} \left (p_r^2 + \frac {p_\phi ^2}{r^2} + p_z^2\right ), \\ \mbox {spherical coordinates } (r,\phi ,\theta ) \quad & \rightarrow \quad H = \frac 1{2m} \left (p_r^2 + \frac {p_\phi ^2}{r^2\sin ^2\theta } + \frac {p_\theta ^2}{r^2} \right ). \end{align}

    More generally, a natural lagrangian

    \begin{equation} L(q,\dot q) = \frac 12 g_{ij}(q) \dot q^i \dot q^j - U(q) \end{equation}

    on the tangent bundle of a riemannian manifold \((M,g)\) is transformed into a hamiltonian of the form

    \begin{equation} H(q,p) = \frac 12 g^{ij}(q) p_i p_j + U(q). \end{equation}

    Of course, one can also obtain the formulas above directly using Example 3.2 and Exercise 1.7.
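    For instance, the cylindrical formula can be double checked directly from the definition of the momenta. This is just a sketch of the computation, assuming the usual kinetic lagrangian of a free particle of mass \(m\) in cylindrical coordinates:

    \begin{align*} L & = \frac m2 \left (\dot r^2 + r^2 \dot \phi ^2 + \dot z^2\right ), \\ p_r & = \pdv {L}{\dot r} = m \dot r, \qquad p_\phi = \pdv {L}{\dot \phi } = m r^2 \dot \phi , \qquad p_z = \pdv {L}{\dot z} = m \dot z, \\ H & = p_r \dot r + p_\phi \dot \phi + p_z \dot z - L = \frac 1{2m} \left (p_r^2 + \frac {p_\phi ^2}{r^2} + p_z^2\right ). \end{align*}

    Here the inverse metric is simply \(\diag (1, r^{-2}, 1)\) up to the factor \(1/m\), in agreement with the general formula above.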

  • Exercise 3.2 (Inverse Legendre transform). Show that the hamiltonian \(H\) defined by (3.26) satisfies the non–degeneracy condition

    \begin{equation} \det \left (\frac {\partial ^2 H}{\partial p_i\partial p_j}\right ) \neq 0. \end{equation}

    Moreover, show that the inverse Legendre transform can be written in local coordinates as

    \begin{equation} \label {eq:inverseLegendre} L(q, \dot q) = p_i(q, \dot q) \dot q^i - H(q, p(q,\dot q)),\qquad \dot q^i = \frac {\partial H (q,p)}{\partial p_i}, \; i=1,\ldots ,n. \end{equation}

    Here the latter equation is used to obtain \(p_i(q, \dot q)\).

  • Exercise 3.3. Show that the Legendre transform of a non–degenerate time-dependent lagrangian \(L(q,\dot q, t)\) is well defined, the corresponding hamiltonian is

    \begin{equation} H(q,p,t) := p_i \dot q^i - L(q, \dot q, t), \end{equation}

    and Hamilton’s equations have the same form plus an additional equation

    \begin{equation} \frac {\partial H}{\partial t} = - \frac {\partial L}{\partial t}. \end{equation}

    Note that, since \(\dv {t} H = \frac {\partial H}{\partial t}\) along the solutions of Hamilton’s equations, this additional equation implies that the total energy is, in general, not conserved for lagrangians or hamiltonians that depend explicitly on time.

  • Example 3.4 (A particle in an electromagnetic field). We saw the Lagrangian for charged particles in an electromagnetic field in Example 1.7 and in Exercise 1.2. In the simpler case of a single particle, the lagrangian takes the form

    \begin{equation} L(\vb *{x},\dot {\vb *{x}}) = \frac {m \dot {\vb *{x}}^2}{2} - U(\vb *{x}) + \frac ec \langle \vb * A(\vb *{x}), \dot {\vb *{x}}\rangle . \end{equation}

    The conjugate momentum to the position is then given by

    \begin{equation} \label {eq:magMom} \vb *{p} = \frac {\partial L}{\partial \dot {\vb *{x}}} = m \dot {\vb *{x}} + \frac {e}{c}\vb * A(\vb *{x}) \end{equation}

    which now differs from what we usually call momentum by the additional term containing the magnetic vector potential \(\vb * A\).

    Inverting (3.59), we get

    \begin{equation} \dot {\vb *{x}} = \frac 1m \left ( \vb *{p} - \frac {e}{c}\vb * A(\vb *{x}) \right ), \end{equation}

    so the hamiltonian is

    \begin{align} H(\vb *{x}, \vb *{p}) & = \langle \vb *{p}, \dot {\vb *{x}}\rangle - L \\ & = \frac 1m\ \left \langle \vb *{p}, \vb *{p} - \frac {e}{c}\vb * A(\vb *{x})\right \rangle - \frac 1{2m}\left \|\vb *{p} - \frac {e}{c}\vb * A(\vb *{x})\right \|^2 \\ & \quad + U(\vb *{x}) - \frac e{mc} \left \langle \vb * A(\vb *{x}), \vb *{p} - \frac {e}{c}\vb * A(\vb *{x})\right \rangle \\ & = \frac 1{2m}\left \|\vb *{p} - \frac {e}{c}\vb * A(\vb *{x})\right \|^2 + U(\vb *{x}). \end{align} The corresponding Hamilton’s equations read

    \begin{equation} \left \lbrace \begin{aligned} \dot {\vb *{x}}^i & = \frac {\partial H}{\partial \vb *{p}_i} = \frac 1m \left (\vb *{p}_i - \frac {e}{c}(\vb * A(\vb *{x}))^i\right ) \\ \dot {\vb *{p}}_i & = -\frac {\partial H}{\partial \vb *{x}^i} = -\frac {\partial U(\vb *{x})}{\partial \vb *{x}^i} + \frac {e}{mc} \left (\vb *{p}_j - \frac {e}{c}(\vb * A(\vb *{x}))^j\right )\frac {\partial (\vb * A(\vb *{x}))^j}{\partial \vb *{x}^i} \end {aligned} \right . \end{equation}

    where \(i=1,2,3\).

    With some additional work, one can show that these are equivalent to Newton’s second law with the additional Lorentz force as in Exercise 1.2.
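The additional work can be sketched as follows. This is only an outline, under the assumptions that \(A^j\) denotes the components of \(\vb * A\), that \(\vb * B = \nabla \times \vb * A\), and that the euclidean metric is used to identify upper and lower indices. Differentiating the first of Hamilton’s equations with respect to time and substituting the second, with \(\vb *{p}_j - \frac ec A^j = m \dot {\vb *{x}}^j\), gives

\begin{align} m \ddot {\vb *{x}}^i & = \dot {\vb *{p}}_i - \frac ec \frac {\partial A^i}{\partial \vb *{x}^j}\dot {\vb *{x}}^j \\ & = -\frac {\partial U}{\partial \vb *{x}^i} + \frac ec \dot {\vb *{x}}^j\left (\frac {\partial A^j}{\partial \vb *{x}^i} - \frac {\partial A^i}{\partial \vb *{x}^j}\right ) \\ & = -\frac {\partial U}{\partial \vb *{x}^i} + \frac ec \left (\dot {\vb *{x}} \times \vb * B\right )^i, \end{align}

where the last step uses the vector identity \(\left (\vb * v\times (\nabla \times \vb * A)\right )^i = v^j\left (\partial _i A^j - \partial _j A^i\right )\).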

  • Example 3.5 (A particle in a constant magnetic field). An important special case of Example 3.4 is the case of a uniform magnetic field pointing in the \(z\)-direction: \(\vb * B = (0,0,B)\). One convenient choice of vector potential for \(\vb * B\) is

    \begin{equation} \vb * A (x,y,z) = (-B y, 0, 0). \end{equation}

    Consider a particle moving in the \((x, y)\)-plane. Then, \(p_z = 0\) and the Hamiltonian for the system is

    \begin{equation} H = \frac 1{2m}\left (p_x + \frac ec By\right )^2 + \frac 1{2m}p_y^2. \end{equation}

    Hamilton’s equations are four first order differential equations:

    \begin{equation} \label {eq:magHeq} \left \lbrace \begin{aligned} \dot p_x & = 0 \\ \dot x & = \frac 1m\left (p_x + \frac ec By\right ) \\ \dot p_y & = -\frac {eB}{cm} \left (p_x + \frac ec By\right ) \\ \dot y & = \frac {p_y} m \\ \end {aligned} \right . \end{equation}

    With some algebraic manipulations it is possible to show that there are the following additional conserved quantities (try to show it as an exercise):

    \begin{equation} p_y + \frac ec Bx = \alpha = \mathrm {const} \quad \mbox {and}\quad p_x = m\dot x - \frac ec By = \beta = \mathrm {const}. \end{equation}

    With these at hand, Hamilton’s equations become very easy to solve: we have

    \begin{equation} x(t) = \frac {c \alpha }{e B} + R \sin (\omega t + \phi ) \quad \mbox {and}\quad y(t) = -\frac {c \beta }{e B} + R \cos (\omega t + \phi ), \end{equation}

    where \(\alpha , \beta , R, \phi \) are integration constants and

    \begin{equation} \omega = \frac {e B}{m c}. \end{equation}

    We see that the particle moves in circles (gyration) in the \((x,y)\)-plane with frequency \(\omega \), better known as the Larmor or cyclotron frequency.

    Finally, observe from the first equation in (3.68) that \(x\) being a cyclic coordinate is reflected also in the hamiltonian formalism as the conservation of momentum.
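The closed-form solution can be compared with a direct numerical integration of Hamilton’s equations (3.68). The sketch below is illustrative only: the function names are hypothetical, the classical Runge-Kutta scheme is a choice made here and not taken from the text, and the default units \(m=e=c=B=1\) are an assumption.

```python
import math


def hamilton_rhs(state, m, e, c, B):
    """Right-hand side of Hamilton's equations for state = (x, y, p_x, p_y)."""
    x, y, px, py = state
    vx = (px + e * B * y / c) / m
    return (vx, py / m, 0.0, -(e * B / c) * vx)


def rk4_step(state, dt, m, e, c, B):
    """One classical fourth-order Runge-Kutta step."""
    def shift(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = hamilton_rhs(state, m, e, c, B)
    k2 = hamilton_rhs(shift(k1, dt / 2), m, e, c, B)
    k3 = hamilton_rhs(shift(k2, dt / 2), m, e, c, B)
    k4 = hamilton_rhs(shift(k3, dt), m, e, c, B)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * g + d)
        for s, a, b, g, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, dt, steps, m=1.0, e=1.0, c=1.0, B=1.0):
    """Advance the state by `steps` Runge-Kutta steps of size dt."""
    for _ in range(steps):
        state = rk4_step(state, dt, m, e, c, B)
    return state


def cyclotron_period(m=1.0, e=1.0, c=1.0, B=1.0):
    """Period 2*pi/omega with omega = e B / (m c)."""
    return 2 * math.pi * m * c / (e * B)
```

Starting from \((x,y,p_x,p_y) = (0,0,1,0)\) in these units, one expects \(\alpha = 0\), \(\beta = 1\), and a circle of radius \(1\) centered at \((0,-1)\), so the state should return to its initial value after one period.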

  • Example 3.6 (The relativistic hamiltonian). If \(c>0\) denotes the speed of light, the hamiltonian \(H:\mathbb {R}^3\times \mathbb {R}^3\to \mathbb {R}\) of a free relativistic particle of mass \(m>0\) is

    \begin{equation} H(q,p) = c \sqrt {p^2 + m^2 c^2}. \end{equation}

    For \(|p| \ll mc\) we can expand it in a Taylor series as

    \begin{equation} H(q,p) = mc^2 + \frac {p^2}{2m} + O(p^4). \end{equation}

    If we interpret \(H\) as the total energy, we can recognize a famous formula in the first term (also called rest energy), while the second term is the non-relativistic kinetic energy.

    In this case, independently of the chosen momentum, the velocity

    \begin{equation} \dot q = \frac {\partial H}{\partial p} = c \frac {p}{\sqrt {p^2 + m^2 c^2}} \end{equation}

    is always strictly smaller, in absolute value, than the speed of light \(c\). The corresponding Lagrangian \(L\),

    \begin{equation} L(q,\dot q) = (\dot q, p(\dot q)) - H(q, p(\dot q)) = -mc^2 \sqrt {1-\frac {\dot q^2}{c^2}} \end{equation}

    is not defined on the whole \(\mathbb {R}^3\times \mathbb {R}^3\)!
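These three claims (the low-momentum expansion, the bound on the velocity and the form of the lagrangian) are easy to test numerically. The following sketch works with the magnitudes of \(p\) and \(\dot q\) only; the function names are hypothetical and the units used in the usage line are an assumption.

```python
import math


def relativistic_hamiltonian(p, m, c):
    """H = c sqrt(p^2 + m^2 c^2), with p the magnitude of the momentum."""
    return c * math.sqrt(p * p + m * m * c * c)


def velocity(p, m, c):
    """dq/dt = dH/dp = c p / sqrt(p^2 + m^2 c^2)."""
    return c * p / math.sqrt(p * p + m * m * c * c)


def lagrangian(v, m, c):
    """L = -m c^2 sqrt(1 - v^2/c^2), only defined for |v| <= c."""
    return -m * c * c * math.sqrt(1.0 - v * v / (c * c))


if __name__ == "__main__":
    p, m, c = 2.0, 1.0, 1.0
    v = velocity(p, m, c)
    # Both numbers should agree: p*v - H is the Legendre transform of H.
    print(p * v - relativistic_hamiltonian(p, m, c), lagrangian(v, m, c))
```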

3.4 Poisson brackets and first integrals of motion

One of the many advantages of the hamiltonian approach comes from an elegant algebraic description. This will play a central role in the study of conserved quantities, and provides a remarkable duality with the mathematical description of quantum mechanics.

From now on, we will denote by \(q = (q^1, \ldots , q^n)\) the local coordinates on \(M\), and

\begin{equation} (q^1, \ldots , q^n, p_1, \ldots , p_n)\in T^*M \end{equation}

the coordinates induced by \(q\) on \(T^*M\). We will, thus, write functions on the cotangent bundle as \(f=f(q,p)\).

Let \(f=f(q,p)\) and \(g=g(q,p)\) be two functions on the phase space \(T^*M\). Then the Poisson bracket is defined to be

\begin{equation} \label {def:Poisson} \big \{f,g\big \} = \frac {\partial f}{\partial q^i}\frac {\partial g}{\partial p_i} - \frac {\partial f}{\partial p_i}\frac {\partial g}{\partial q^i}, \end{equation}

where as usual we are summing repeated indices.

  • Exercise 3.4. Show that the definition of Poisson brackets on \(T^*M\) does not depend on the choice of the local coordinates on \(M\).

Since (3.77) is a kind of weird definition, let’s look at some of its properties.

  • Theorem 3.3. The Poisson bracket defines the structure of an infinite-dimensional Lie algebra\(^5\) on the space \(\mathcal {C}^\infty (T^*M)\) of smooth functions on the phase space.

    More explicitly, the operation \(\mathcal {C}^\infty (T^*M)\times \mathcal {C}^\infty (T^*M) \to \mathcal {C}^\infty (T^*M)\) defined by \((f,g) \mapsto \big \{f,g\big \}\) satisfies the following properties for any \(f,g,h \in \mathcal {C}^\infty (T^*M)\):

    • 1. it is antisymmetric, \(\big \{g,f\big \} = - \big \{f,g\big \}\);

    • 2. it is bilinear, \(\big \{\alpha f + \beta g, h\big \} = \alpha \big \{f,h\big \} + \beta \big \{g,h\big \}\) and \(\big \{f, \alpha g + \beta h\big \} = \alpha \big \{f,g\big \} + \beta \big \{f,h\big \}\) for any \(\alpha , \beta \in \mathbb {R}\);

    • 3. it satisfies the Jacobi identity:

      \begin{equation} \label {eq:JacobiId} \big \{\big \{f,g\big \},h\big \} + \big \{\big \{h,f\big \},g\big \} + \big \{\big \{g,h\big \},f\big \} = 0. \end{equation}

    Finally, the Poisson bracket satisfies the Leibniz rule with respect to function products:

    \begin{equation} \label {eq:LeibnitzId} \big \{fg, h\big \} = g\big \{f, h\big \} + f \big \{g, h\big \}, \qquad f,g,h \in \mathcal {C}^\infty (T^*M). \end{equation}

5 The corresponding Lie group is the group of symplectomorphisms, a concept that will be introduced later.

The only property requiring a proof is the Jacobi identity. Here I quote [Ton05]:

To prove this you need a large piece of paper and a hot cup of coffee. Expand out all 24 terms and watch them cancel one by one.

What Theorem 3.3 tells us is that the Poisson bracket defines the same algebraic structure as the commutators of matrices and of differential operators. There is a deep relationship between what we are seeing here and Heisenberg’s and Schrödinger’s pictures of quantum mechanics. This relationship is emphasized even more if we compute the Poisson bracket of the coordinate functions:

\begin{equation} \label {eq:coordcommutators} \big \{q^i,p_j\big \} = \delta ^i_j, \qquad \big \{q^i,q^j\big \} = \big \{p_i,p_j\big \} = 0, \qquad i,j = 1,\ldots ,n. \end{equation}
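The definition (3.77) and the coordinate relations above can be explored numerically. The sketch below treats the case \(n=1\) only and approximates the partial derivatives by central finite differences, which is an implementation choice made here; the function name `poisson_bracket` and the step size are hypothetical.

```python
def poisson_bracket(f, g, q, p, h=1e-4):
    """Approximate {f, g}(q, p) for n = 1 using central finite differences."""
    def d_dq(F):
        return (F(q + h, p) - F(q - h, p)) / (2 * h)

    def d_dp(F):
        return (F(q, p + h) - F(q, p - h)) / (2 * h)

    return d_dq(f) * d_dp(g) - d_dp(f) * d_dq(g)


if __name__ == "__main__":
    coordinate_q = lambda q, p: q
    coordinate_p = lambda q, p: p
    # Should be close to 1, the value of {q, p}.
    print(poisson_bracket(coordinate_q, coordinate_p, 0.3, -1.2))
```

For polynomials of degree at most two the central differences are exact up to rounding, so such functions make convenient test cases.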

One crucial property of the Poisson bracket is enunciated by the following theorem.

  • Theorem 3.4. Let \(H=H(q,p)\) be the hamiltonian of a hamiltonian system (3.31) and \(F=F(q,p)\in \mathcal {C}^\infty (T^*M)\). Then, along the flow \((q(t), p(t))\) of \(H\), we have

    \begin{equation} \dv {t} F = \frac {\partial F}{\partial q^i} \dot q^i + \frac {\partial F}{\partial p_i} \dot p_i = \big \{F,H\big \}. \end{equation}

We say that two functions \(H\) and \(F\) on \(T^* M\) commute, or are in involution, if

\begin{equation} \big \{H,F\big \} = 0. \end{equation}

Then Theorem 3.4 implies the following.

  • Corollary 3.5. Given a hamiltonian system (3.31) with hamiltonian \(H\) and a smooth function \(F\) on its phase space, \(F\) is a first integral if and only if \(H\) and \(F\) are in involution.

This corollary implies some further interesting corollaries.

  • Corollary 3.6. The first integrals of a hamiltonian system form a subalgebra of the Lie algebra \(\mathcal {C}^\infty (T^*M)\). Otherwise said, given two first integrals \(F\) and \(G\) of a hamiltonian system, then also the function \(\big \{F,G\big \}\) is a first integral of \(H\).

  • Proof. By the previous corollary \(\big \{F,H\big \} = \big \{G,H\big \} = 0\).

    The Jacobi identity then implies that \(\big \{\big \{F,G\big \},H\big \} = 0\), i.e., \(\big \{F,G\big \}\) is a first integral of \(H\).

  • Corollary 3.7. The hamiltonian function \(H\) is a first integral of the hamiltonian system (3.31).

This is the hamiltonian version of the conservation of energy, see Theorem 1.7. In fact, we can reformulate this corollary as follows.

  • Corollary 3.8. The integral curves \(p = p(t)\) and \(q=q(t)\) of the hamiltonian system (3.31) belong to a level set of the hamiltonian function, i.e.

    \begin{equation} H(q(t), p(t)) = \mathrm {const}. \end{equation}
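As an illustration of Corollary 3.5, consider the following sketch. It assumes a planar particle of mass \(m\) in a central potential \(U(r)\), \(r = \sqrt {x^2+y^2}\), a setting chosen here only as an example, with hamiltonian \(H = \frac 1{2m}(p_x^2 + p_y^2) + U(r)\) and angular momentum \(F = x p_y - y p_x\). Using (3.77),

\begin{align} \big \{F, H\big \} & = \frac {\partial F}{\partial x}\frac {\partial H}{\partial p_x} + \frac {\partial F}{\partial y}\frac {\partial H}{\partial p_y} - \frac {\partial F}{\partial p_x}\frac {\partial H}{\partial x} - \frac {\partial F}{\partial p_y}\frac {\partial H}{\partial y} \\ & = \frac {p_y p_x}{m} - \frac {p_x p_y}{m} + y\, U'(r)\frac xr - x\, U'(r) \frac yr = 0, \end{align}

so the angular momentum is a first integral, whatever the form of \(U\).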

3.4.1 The symplectic matrix

Denote the \(2n\) coordinates on \(T^*M\) as

\begin{equation} x=(x^1, \ldots , x^{2n}) := (q^1, \ldots , q^n, p_1, \ldots , p_n). \end{equation}

With this notation, the hamiltonian system (3.31) takes the form

\begin{equation} \dot x^k = \big \{x^k, H\big \}, \qquad k=1,\ldots ,2n. \end{equation}

We can write the Poisson bracket of the coordinate functions (3.80) as

\begin{equation} \big \{x^k, x^l\big \} = J^{kl}, \qquad k,l = 1,\ldots ,2n, \end{equation}

where \(J^{kl}\) are the matrix elements of the following antisymmetric matrix, called the symplectic matrix,

\begin{equation} \label {eq:symmat} J = \left (J^{kl}\right )_{1\leq k,l\leq 2n} = \begin{pmatrix}0 & \Id _n \\ -\Id _n & 0\end {pmatrix}. \end{equation}

Then formula (3.77) for the Poisson bracket of two arbitrary functions takes the following form

\begin{equation} \label {eq:coordPB} \big \{f,g\big \} = J^{kl} \frac {\partial f}{\partial x^k}\frac {\partial g}{\partial x^l}, \end{equation}

and the hamiltonian system (3.31) can be rewritten as

\begin{equation} \dot x^k = J^{kl} \frac {\partial H}{\partial x^l}, \qquad k=1,\ldots ,2n, \end{equation}

or, more compactly,

\begin{equation} \label {eq:hamsysJ} \dot x = J \frac {\partial H}{\partial x}. \end{equation}
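The compact form \(\dot x = J\, \partial H/\partial x\) translates almost literally into code. The sketch below is a minimal illustration with hypothetical helper names; it approximates the gradient of \(H\) by central finite differences, which is a choice made here and not part of the text.

```python
def symplectic_matrix(n):
    """The 2n x 2n matrix J = [[0, Id], [-Id, 0]] as a list of rows."""
    size = 2 * n
    J = [[0.0] * size for _ in range(size)]
    for i in range(n):
        J[i][n + i] = 1.0
        J[n + i][i] = -1.0
    return J


def gradient(H, x, h=1e-6):
    """Central finite-difference approximation of dH/dx at the point x."""
    grad = []
    for k in range(len(x)):
        forward, backward = list(x), list(x)
        forward[k] += h
        backward[k] -= h
        grad.append((H(forward) - H(backward)) / (2 * h))
    return grad


def hamiltonian_vector_field(H, x):
    """Right-hand side J dH/dx of the hamiltonian system at x = (q, p)."""
    n = len(x) // 2
    J = symplectic_matrix(n)
    g = gradient(H, x)
    return [sum(J[k][l] * g[l] for l in range(2 * n)) for k in range(2 * n)]
```

For the isotropic oscillator \(H = \frac 12 \sum _k (x^k)^2\) the result at \(x=(q,p)\) should be \((p,-q)\), as expected from Hamilton’s equations.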

3.4.2 A brief detour on time-dependent hamiltonians

TODO: Imprecise and partially incorrect, rewrite

Let \(L=L(q,\dot q,t): TM\times \mathbb {R}\to \mathbb {R}\), \(q=(q^1,\ldots ,q^n)\in M\), be the time-dependent lagrangian of some mechanical system as in Exercise 3.3.

Also in this case we can arrive at canonical equations of the form (3.31); however, these will be on the extended space \(T^*M\times \mathbb {R}^2\). The local coordinates \((q^1,\ldots ,q^n, p_1, \ldots ,p_n, q^{n+1}, p_{n+1})\) on \(T^*M\times \mathbb {R}^2\) are such that \(q^i, p_i\) remain the same for \(i\leq n\),

\begin{equation} q^{n+1} = t, \quad \mbox {and}\quad p_{n+1} = -E, \end{equation}

where \(E\) is a new independent variable, and the new hamiltonian is

\begin{align} \hat H & = H(q^1, \ldots , q^n, p_1, \ldots , p_n, t) + p_{n+1} \\ & = H(q^1, \ldots , q^n, p_1, \ldots , p_n, t) - E. \end{align} Here, \(H\) is given by the usual formula

\begin{equation} H = \sum _{i=1}^n p_i \dot q^i - L(q, \dot q, t), \quad \dot q^i = \dot q^i(q,p,t). \end{equation}

The corresponding hamiltonian system is

\begin{align} & \dot q^i = \frac {\partial \hat H}{\partial p_i}, \quad \dot p_i = -\frac {\partial \hat H}{\partial q^i}, \quad i=1,\ldots ,n \\ & \dot t = \frac {\partial \hat H}{\partial p_{n+1}} = 1, \quad \dot p_{n+1} = -{\dot E} = -\frac {\partial \hat H}{\partial t}. \end{align} The first \(2n\) equations above are equivalent to the Euler-Lagrange equations for \(L\), the equation for \(q^{n+1} = t\) tells us that \(t\) is the same as in the original lagrangian, and the final equation is simply the variation of the energy

\begin{equation} \dot E = \frac {\partial H}{\partial t}. \end{equation}

If we restrict ourselves to the zero levelset \(\hat H = H - E = 0\), then the hamiltonian system is equivalent to the Euler-Lagrange equations for \(L\) and the equation

\begin{equation} E = \sum _{i=1}^n p_i \dot q^i - L(q, \dot q, t), \end{equation}

which completes the recovery of the original dynamics.

The duality between time and energy and the reason they enter in this picture with a negative sign becomes clearer once one looks at variational principles from the hamiltonian point of view, coming up in the next section.

3.5 Variational principles of hamiltonian mechanics

Given a hamiltonian system with hamiltonian \(H\), can we describe its solutions \(q=q(t)\), \(p=p(t)\) in terms of a variational principle, as we do in lagrangian mechanics?

A natural way to approach this question is to go back to Hamilton’s variational principle, see Section 1.2, and use (3.55) to invert the Legendre transformation. In fact, this is not in vain: one ends up showing the following theorem.

  • Theorem 3.9. The solutions of a hamiltonian system (3.31) with hamiltonian \(H=H(q,p)\) are the critical points of the functional

    \begin{equation} \label {eq:variationalHamilton} S = \int _{t_1}^{t_2} \left (p_i\dot q^i - H(q,p)\right ) \dd t, \quad q(t_1) = q_1, \quad q(t_2) = q_2. \end{equation}

Note that, even though \(p\) and \(q\) play a rather symmetric role, we only fix the values of \(q\) at the endpoints \(t=t_1\) and \(t=t_2\), while \(p\) can be arbitrary.

  • Proof. Computing the first variation \(\delta S\) of the action, i.e., the part of \(S[q+\delta q, p + \delta p] - S[q,p]\) which is linear in \(\delta q\) and \(\delta p\), one finds

    \begin{align} \delta S & = \int _{t_1}^{t_2} \left ( \dot q^i \delta p_i + p_i \delta \dot q^i - \frac {\partial H}{\partial q^i}\delta q^i - \frac {\partial H}{\partial p_i}\delta p_i \right ) \dd t \\ & = p_i \delta q^i\Big |_{t_1}^{t_2} + \int _{t_1}^{t_2} \left ( \dot q^i \delta p_i - \dot p_i \delta q^i - \frac {\partial H}{\partial q^i}\delta q^i - \frac {\partial H}{\partial p_i}\delta p_i \right ) \dd t \\ & = \int _{t_1}^{t_2} \left [ \left (\dot q^i - \frac {\partial H}{\partial p_i} \right ) \delta p_i - \left (\dot p_i + \frac {\partial H}{\partial q^i}\right ) \delta q^i \right ] \dd t. \end{align} In the second step we integrated by parts, while for the last step we used the fact that \(\delta q^i\) is zero at the endpoints. The statement now follows from the arbitrariness of \(\delta q\) and \(\delta p\).

Note that often the action (3.99) above is written as

\begin{equation} S = \int _{t_1}^{t_2} (p_i \dd q^i - H \dd t), \end{equation}

where it is implied that on the curves \(q=q(t)\), \(p=p(t)\) one makes the substitution \(p_i \dd q^i = p_i(t) \dot q^i (t) \dd t\).

Interestingly, this is not the only way to formulate a hamiltonian variational approach. An alternative approach, the so-called Maupertuis variational principle, considers the curves that satisfy conservation of energy, resulting in the following theorem.

  • Theorem 3.10. Given a function \(H=H(q,p)\) and a real number \(E\), then the (isoenergetic) trajectories of the hamiltonian system with hamiltonian \(H\) are the critical points of the truncated action

    \begin{equation} \label {eq:variationalMaupertuis} S_0 = \int _{q_1}^{q_2} p_i \dd q^i \end{equation}

    in the class of curves \(q=q(t)\), \(p=p(t)\) belonging to the levelset

    \begin{equation} H(q(t),p(t)) = E \end{equation}

    and connecting the initial point \(q=q_1\) with the final point \(q=q_2\).

  • Remark 3.5. Maupertuis principle is a variational principle in configuration space: it determines only the trajectories of the motion, not the dynamics on the trajectories.

    In fact, we cannot even fix the time \(t=t_2\) of arrival at the endpoint \(q=q_2\): it will be implicitly determined by the solution of the variational problem.

  • Proof. Since this is a variational problem with constraints, it is natural to apply the method of Lagrange multipliers shown in Section 2.5 and proceed as in the previous proof. We end up with the following equation

    \begin{equation} \delta \int \left [p_i \dot q^i - \lambda (t)\left (H(q,p) - E\right )\right ]\dd t = 0, \end{equation}

    which, again mimicking the previous proof, reduces to the equations

    \begin{equation} \dot q^i = \lambda (t) \frac {\partial H}{\partial p_i}, \quad \dot p_i = -\lambda (t) \frac {\partial H}{\partial q^i}, \quad H(q,p) = E. \end{equation}

    The time reparametrization \(\dd s = \lambda (t)\dd t\) reduces the system to the canonical equations, proving the theorem.

An important application of the Maupertuis principle is the determination of the projections \(q=q(t)\) on the configuration space \(M\) of the trajectories in the phase space \(T^*M\). To this end, one restricts the functional to the subset of curves that satisfy the equations

\begin{equation} \dot q^i = \frac {\partial H(q,p)}{\partial p_i}, \quad i=1,\ldots ,n. \end{equation}

With the right choice of time parametrization, this ends up being a constraint which is compatible with the variational problem. Solving the equations in the form \(p_i = p_i(q,\dot q)\) and using them in the energy constraint \(H(q, p(q,\dot q))=E\), we can find \(\dd t\) in terms of the coordinates \(q\) and their differentials \(\dd q\). Once we replace the values in (3.104), we are left with the variational problem

\begin{equation} \delta \int p_i\left (q, \frac {\partial q}{\partial t}\right ) \frac {\partial q^i}{\partial t} \dd t = 0, \end{equation}

which characterizes the trajectories in configuration space.

Let’s try to clarify this with some important examples.

  • Example 3.7 (Geodesics and the free motion on riemannian manifolds). Consider a riemannian manifold \((M, g)\) and recall that the metric can be written as

    \begin{equation} \dd s^2 = g_{ij}(q)\dd q^i \dd q^j. \end{equation}

    Consider the free motion of a point particle of mass \(m\). As seen in Example 3.2, its hamiltonian is given by

    \begin{equation} H = \frac 1{2m} g^{ij}(q) p_i p_j. \end{equation}

    The first group of equations of the hamiltonian system (3.31) gives the usual relation between momentum and velocity, which we can invert to get

    \begin{equation} p_i = m g_{ij}(q)\dot q^j. \end{equation}

    Replacing them in the energy conservation law we get

    \begin{equation} \frac m2 g_{ij}(q)\frac {\dd q^i \dd q^j}{\dd t^2} = E, \end{equation}

    which gives

    \begin{equation} p_i\dot q^i \dd t= 2E \dd t \quad \mbox {and}\quad \dd t = \sqrt {\frac {m}{2E}}\sqrt {g_{ij}(q)\dd q^i \dd q^j} = \sqrt {\frac {m}{2E}} \dd s. \end{equation}

    Gathering all together in (3.104) we obtain the variational problem

    \begin{equation} \delta S_0 = 0, \quad S_0 = \sqrt {2mE}\int _{q_1}^{q_2} \dd s. \end{equation}

    Observing that the length functional \(S = \int \dd s\) is invariant with respect to reparametrization, we have proved the following theorem.

    • Theorem 3.11. Given a riemannian manifold \((M,g)\), the trajectories of a free point particle on the manifold are the geodesics of the metric.

    Among the curves connecting two points of the manifold, geodesics are the (piecewise) smooth curves that minimize the distance, in the sense that they are critical points of an action for a non–degenerate hamiltonian (or, equivalently, lagrangian).

    See also [BKF95] for a more involved application of this.

  • Example 3.8 (Jacobi metric). Let’s consider a mechanical system \((M,L)\), where \(L\) is a natural lagrangian for a particle of mass \(m\),

    \begin{equation} L = \frac m2 g_{ij}(q) \dot q^i \dot q^j - U(q), \end{equation}

    on the tangent bundle of a riemannian manifold \((M,g)\), again like the ones seen in Example 3.1. As we saw, once we fix the energy level \(E\), the motion of the particle is constrained to remain in the part of \(M\) such that

    \begin{equation} \label {eq:JacobiDomain} U(q) < E. \end{equation}

    • Theorem 3.12. The trajectories of the mechanical system \((M,L)\) in the domain (3.117) are the geodesics of the metric

      \begin{equation} \label {eq:JacobiMetric} \dd s^2_E = 2m (E-U(q)) g_{ij}(q)\dd q^i \dd q^j. \end{equation}

    • Proof. As in the previous example, a direct computation shows that

      \begin{equation} p_i \dot q^i \dd t = 2 (E-U(q)) \dd t, \quad \dd t = \sqrt {\frac {m}{2}} \frac {\dd s}{\sqrt {E-U(q)}}. \end{equation}

      Then, for the action, one gets

      \begin{equation} S_0 = \int \sqrt {2m(E-U(q))}\dd s = \int \dd s_E, \end{equation}

      which coincides with the length of the curve in the metric (3.118).

    The metric (3.118) is called the Jacobi metric. The dynamics on the geodesics of the Jacobi metric is determined by the quadrature

    \begin{equation} t - t_0 = \frac 12 \int \frac {\dd s_E}{E-U(q)}. \end{equation}

    The Jacobi metric is often useful in practice, allowing one to use the whole machinery around geodesic motions in riemannian manifolds to study properties of the hamiltonian dynamics. For an application to find periodic trajectories of the double pendulum, developed in all the details, you can refer to [Kna18, Example 8.32].
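The quadrature for the time along a Jacobi geodesic can be tested in the simplest possible setting. The sketch below assumes a one-dimensional configuration space with \(g = 1\), so that \(\dd s_E = \sqrt {2m(E-U(q))}\,\dd q\), and uses a midpoint rule; both the function name and the choice of quadrature rule are made here for illustration only.

```python
import math


def time_along_geodesic(q_start, q_end, energy, potential, mass, steps=2000):
    """Midpoint-rule approximation of t - t0 = (1/2) * integral of ds_E / (E - U).

    One-dimensional case with g = 1; requires U(q) < E on the whole interval.
    """
    dq = (q_end - q_start) / steps
    total = 0.0
    for k in range(steps):
        q = q_start + (k + 0.5) * dq
        kinetic = energy - potential(q)
        ds_E = math.sqrt(2.0 * mass * kinetic) * dq
        total += 0.5 * ds_E / kinetic
    return total


if __name__ == "__main__":
    m, k, E = 2.0, 3.0, 1.5
    harmonic = lambda q: 0.5 * k * q * q
    print(time_along_geodesic(0.0, 0.5, E, harmonic, m))
```

For the harmonic potential \(U = \frac 12 k q^2\) the exact travel time from \(q=0\) to \(q=a\) is \(\arcsin (a/A)/\omega \), with \(A = \sqrt {2E/k}\) and \(\omega = \sqrt {k/m}\), which gives a value to compare against.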

  • Example 3.9 (Fermat principle and geometric optics). Propagation of light in an isotropic medium is described by the hamiltonian

    \begin{equation} H(\vb *{x}, \vb *{p}) = c(\vb *{x}) \|\vb *{p}\|, \end{equation}

    where \(c(\vb *{x}) > 0\) gives the speed of light at the point \(\vb *{x}\). Using the procedure described above, one arrives at

    \begin{equation} \dot {\vb *{x}} = c(\vb *{x})\frac {\vb *{p}}{\|\vb *{p}\|}, \quad \|\vb *{p}\| = \frac {E}{c(\vb *{x})}, \end{equation}

    which imply

    \begin{equation} \langle \vb *{p}, \dd {\vb *{x}}\rangle = \|\vb *{p}\|\|\dd {\vb *{x}}\| = \frac {E}{c(\vb *{x})}\|\dd {\vb *{x}}\|. \end{equation}

    For the action (3.104) we end up with

    \begin{equation} S_0 = E \int _{\vb *{x}_1}^{\vb *{x}_2} \frac {\|\dd {\vb *{x}}\|}{c(\vb *{x})}, \end{equation}

    which is \(E\) multiplied by the time of propagation of light between the points \(\vb *{x}_1\) and \(\vb *{x}_2\).

    In summary, we have shown the following.

    • Theorem 3.13. Light in an isotropic medium propagates in a way that minimizes the travel time. The trajectories followed by the light are the geodesics of the metric

      \begin{equation} \dd s^2_{\mathrm {Fermat}} = \frac {\dd s^2}{c^2(\vb *{x})} \end{equation}

      where

      \begin{equation} \dd s^2 = \dd {\vb *{x}}^2 = \dd x^2 + \dd y^2 + \dd z^2. \end{equation}

    You can see how this is used to study mirages in [Kna18, Example 8.34]. For a nice account of geometric optics refer to [Kna18, Chapter 8.7], an interesting historical take is presented in [Bro16].
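A classical consequence of the Fermat principle is Snell’s law of refraction. The sketch below is only an illustration under assumptions that go beyond the text: the medium is taken to be piecewise constant, with speed \(c_1\) for \(y>0\) and \(c_2\) for \(y<0\) (a limiting case of a smooth \(c(\vb *{x})\)), rays are broken lines crossing the interface at \((x,0)\), and the function names are hypothetical.

```python
import math


def travel_time(x, source, target, c1, c2):
    """Travel time along source -> (x, 0) -> target; speed c1 above, c2 below y = 0."""
    (x1, y1), (x2, y2) = source, target
    return math.hypot(x - x1, y1) / c1 + math.hypot(x2 - x, y2) / c2


def fastest_crossing(source, target, c1, c2, iterations=200):
    """Ternary search for the crossing point minimizing the convex travel time."""
    lo, hi = sorted((source[0], target[0]))
    for _ in range(iterations):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if travel_time(a, source, target, c1, c2) < travel_time(b, source, target, c1, c2):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


def sines_of_incidence(x, source, target):
    """Sines of the angles between the two segments and the normal to the interface."""
    (x1, y1), (x2, y2) = source, target
    return (x - x1) / math.hypot(x - x1, y1), (x2 - x) / math.hypot(x2 - x, y2)
```

At the minimizing crossing point the two sines should satisfy \(\sin \theta _1 / c_1 = \sin \theta _2 / c_2\), up to the accuracy of the search.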

3.6 Canonical transformations

Let \(M\) be a smooth manifold. In Section 3.4 we saw that, thanks to the Poisson bracket, we can endow \(\mathcal {C}^\infty (T^*M)\) with a Lie algebra structure, see Theorem 3.3.

We call canonical transformation of \(T^*M\) a diffeomorphism \(\Phi : T^*M\to T^*M\) such that the function

\begin{equation} \Phi ^*: \mathcal {C}^\infty (T^*M) \to \mathcal {C}^\infty (T^*M),\quad \Phi ^* f(x) := f(\Phi (x)), \end{equation}

induced by \(\Phi \) on \(\mathcal {C}^\infty (T^*M)\), is an automorphism of the Lie algebra, i.e.

\begin{equation} \label {eq:automorphism} \big \{\Phi ^* f, \Phi ^* g\big \} = \Phi ^*\big \{f,g\big \}, \qquad \forall f,g\in \mathcal {C}^\infty (T^*M). \end{equation}

In the coordinates \(x=(x^1, \ldots , x^{2n})\) on \(T^*M\) defined in Section 3.4.1, \(\Phi \) is described by \(2n\) functions of \(2n\) variables,

\begin{equation} x = (x^1,\ldots ,x^{2n}) \mapsto \Phi (x) = \left (\Phi ^1(x), \ldots , \Phi ^{2n}(x)\right ), \end{equation}

and (3.129) takes the form (exercise)

\begin{equation} \label {eq:cantrafocoord} \frac {\partial \Phi ^k(x)}{\partial x^i} J^{ij} \frac {\partial \Phi ^l(x)}{\partial x^j} = J^{kl},\quad k,l=1,\ldots ,2n. \end{equation}

  • Example 3.10. The transformation \(\Phi : (q,p) \mapsto (p, -q)\) is a canonical transformation. To show it, it is enough to verify how the Poisson bracket acts on the coordinate functions. Let, as usual,

    \begin{equation} x=(x^1, \ldots ,x^{2n})=(q^1,\ldots ,q^n,p_1,\ldots ,p_n) \end{equation}

    and let \(Q,P\) denote the new coordinates; then \(Q^i := \Phi (x)^i = p_i\), \(P_i := \Phi (x)^{n+i} = -q^i\), \(i=1,\ldots , n\). We can immediately compute

    \begin{align} & \big \{Q^i, Q^j\big \} = \big \{p_i, p_j\big \} = 0 \\ & \big \{P_i, P_j\big \} = \big \{-q^i, -q^j\big \} = \big \{q^i, q^j\big \} = 0 \\ & \big \{Q^i, P_j\big \} = \big \{p_i, -q^j\big \} = \big \{q^j, p_i\big \} = \delta ^j_i \end{align} for any \(i,j=1,\ldots ,n\). If \(\widetilde x = (\widetilde x^1, \ldots , \widetilde x^{2n})= (Q^1, \ldots , Q^n, P_1,\ldots ,P_n)\), the above can be rewritten as \(\Phi ^*\big \{x^i, x^j\big \} = \Phi ^* J^{ij} = J^{ij} = \big \{\widetilde x^i, \widetilde x^j\big \} = \big \{\Phi ^* x^i, \Phi ^* x^j\big \}\).

  • Example 3.11. The cotangent lift \(\Phi \) of a diffeomorphism \(\phi \) defined on the base manifold \(M\), as defined below, is a canonical transformation.

    \begin{align} & \phi :M\to M, \quad \phi (q) = \left (\phi ^1(q), \ldots , \phi ^n(q)\right ) \\ & \Phi :T^*M\to T^*M, \quad \Phi (q,p) = (Q, P), \end{align} where \((Q,P)\) are given by

    \begin{align} & Q = \phi (q), \\ & \phi ^* P = p, \qquad (\phi ^* P)_i = \frac {\partial \phi ^j(q)}{\partial q^i} P_j. \end{align} Here we are slightly abusing the notation: it is more conventional to write \(p = (\dd \phi _{q})^* P\). For a more detailed account, you can have a look at [MR99, Chapter 6.3 and in particular formula (6.3.4)] or [Ser20, Proposition 6.2.8].

  • Exercise 3.5. Consider the linear transformations on \(\mathbb {R}^{2n}\),

    \begin{equation} x \mapsto A x, \end{equation}

    where \(A\) is a \(2n\times 2n\) matrix. Show that the transformation is canonical if and only if \(A\) satisfies the identity

    \begin{equation} A J A^T = J, \end{equation}

    where \(J\) is the symplectic matrix.
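The identity \(A J A^T = J\) is easy to test numerically for a given matrix, which can help in building intuition before attempting the proof. The sketch below uses plain lists of rows and hypothetical helper names; the tolerance is an arbitrary choice.

```python
def symplectic_matrix(n):
    """The 2n x 2n matrix J = [[0, Id], [-Id, 0]] as a list of rows."""
    size = 2 * n
    J = [[0.0] * size for _ in range(size)]
    for i in range(n):
        J[i][n + i] = 1.0
        J[n + i][i] = -1.0
    return J


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def is_canonical(A, tol=1e-12):
    """Check the identity A J A^T = J entrywise, up to the tolerance tol."""
    n = len(A) // 2
    J = symplectic_matrix(n)
    A_transposed = [list(col) for col in zip(*A)]
    product = matmul(matmul(A, J), A_transposed)
    return all(
        abs(product[k][l] - J[k][l]) <= tol
        for k in range(2 * n)
        for l in range(2 * n)
    )
```

For example, the matrix of the map \((q,p)\mapsto (p,-q)\) of Example 3.10 is \(J\) itself and passes the check, while the uniform dilation \(x\mapsto 2x\) does not.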

The following theorem should clarify why canonical transformations are important for us.

  • Theorem 3.14. Let \(\Phi : T^*M \to T^* M\) be a canonical transformation. Then, every solution \(x = x(t)\) of the hamiltonian system with hamiltonian \(H\) is mapped by \(\Phi \) into a solution \(\Phi (x(t))\) of the hamiltonian system with hamiltonian \((\Phi ^{-1})^* H\).

  • Proof. Let \(\widetilde x(t) := \Phi (x(t))\). Theorem 3.4 implies that for any smooth function \(f\), it holds

    \begin{equation} \dv {t} f(x) = \big \{f(x), H(x)\big \}\Big |_{x=x(t)}. \end{equation}

    Applied to the function \(\Phi ^* f\), for which \(\Phi ^* f(x(t)) = f(\Phi (x(t))) = f(\widetilde x(t))\), this reduces to

    \begin{equation} \dv {t} f(\widetilde x(t)) = \big \{f(\Phi (x)), H(x)\big \}. \end{equation}

    By using \(x = \Phi ^{-1}(\Phi (x))\) and the definition of canonical transformations we can manipulate the right hand side as follows

    \begin{align} \big \{f(\Phi (x)), H(x)\big \} & = \big \{f(\Phi (x)), H(\Phi ^{-1}(\Phi (x)))\big \} \\ & = \big \{\Phi ^* f, \Phi ^* (\Phi ^{-1})^* H\big \} \\ & = \Phi ^* \big \{f, (\Phi ^{-1})^* H\big \} = \big \{f, (\Phi ^{-1})^* H\big \}\Big |_{x=\widetilde x(t)}. \end{align} Collecting both sides, one gets

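    \begin{equation} \dv {t} f(\widetilde x(t)) = \big \{f, (\Phi ^{-1})^* H\big \}\Big |_{x=\widetilde x(t)} \end{equation}

    for every smooth function \(f\), which means that \(\widetilde x(t)\) is a solution of the hamiltonian system with hamiltonian \((\Phi ^{-1})^* H\).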

3.6.1 Hamiltonian vector fields

To investigate in more detail the connections between canonical transformations and hamiltonian systems, it is convenient to introduce the infinitesimal version of (3.129).

In the rest of this section we will show that hamiltonian flows can be associated to hamiltonian vector fields which define one–parameter groups of canonical transformations. These will allow us to define hamiltonian symmetries and state the hamiltonian version of Noether’s theorem.

A vector field \(X\) on the manifold \(T^* M\) is called an infinitesimal symmetry of the Poisson bracket if for any two functions \(f,g \in \mathcal {C}^\infty (T^*M)\) it holds

\begin{equation} \label {eq:infsymm} X\big \{f,g\big \} = \big \{X f, g\big \} + \big \{f, X g\big \}. \end{equation}

  • Exercise 3.6. Let \(\Phi _t:T^*M \to T^*M\) denote a one–parameter group of canonical transformations, i.e., \(\big \{\Phi _t^* f, \Phi _t^* g\big \} = \Phi _t^*\big \{f,g\big \}\) for all \(t\in \mathbb {R}\). Show that the vector field \(X\) which generates the group,

    \begin{equation} X(x) := \dv {t}\Phi _t(x)\Big |_{t=0}, \end{equation}

    is an infinitesimal symmetry.

    Moreover, this relation goes both ways: show that the one–parameter group \(\Phi _t:T^*M \to T^*M\) associated to an arbitrary infinitesimal symmetry \(X\) acts as a canonical transformation.

Given a function \(H\) on \(T^* M\), the Leibniz rule for the Poisson bracket (3.79) tells us that the mapping

\begin{equation} \label {eq:Hambrace} f \mapsto \big \{f,H\big \}, \quad f\in \mathcal {C}^\infty (T^*M), \end{equation}

is a first-order differential operator. Using the identification between derivations, i.e., first-order differential operators, and vector fields we can associate to the operator (3.150) a vector field \(X_H\) such that

\begin{equation} X_H f = \big \{f, H\big \}. \end{equation}

The vector field \(X_H\) is called the hamiltonian vector field generated by the hamiltonian \(H\). In local coordinates \(q,p\), the hamiltonian vector field \(X_H\) has the form

\begin{equation} X_H = \frac {\partial H}{\partial p_i} \frac {\partial }{\partial q^i} - \frac {\partial H}{\partial q^i}\frac {\partial }{\partial p_i}. \end{equation}

In the coordinates \(x = (x^1, \ldots , x^{2n})\) on \(T^*M\), the field has the compact representation

\begin{equation} X_H = J^{ij}\frac {\partial H}{\partial x^j}\frac {\partial }{\partial x^i}. \end{equation}
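For instance, as a minimal illustration with a hamiltonian chosen here only as an example, the one-dimensional harmonic oscillator \(H = \frac {p^2}{2m} + \frac {k q^2}{2}\), with \(k>0\) an assumed spring constant, has

\begin{equation} X_H = \frac pm \frac {\partial }{\partial q} - k q \frac {\partial }{\partial p}, \end{equation}

whose integral curves are the ellipses \(H = \mathrm {const}\) in the \((q,p)\)-plane, in agreement with Corollary 3.8.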

We have constructed a mapping of the space of smooth functions on \(T^*M\) into the space\(^6\) \(\Gamma (TT^* M)\) of smooth vector fields on \(T^*M\)

\begin{equation} \label {eq:antimapping} \mathcal {C}^\infty (T^*M) \to \Gamma (TT^* M), \quad H \mapsto X_H. \end{equation}

The Lie bracket of vector fields on a smooth manifold \(N\) [J.M13],

\begin{equation} [X,Y] f := X(Y f) - Y(X f), \qquad f\in \mathcal {C}^\infty (N), \quad X,Y\in \Gamma (TN), \end{equation}

defines a Lie algebra structure on the space \(\Gamma (TN)\) of vector fields on \(N\). In our case, \(N = T^* M\), and it is natural to ask if and how this bracket is related to the Poisson bracket. The Jacobi identity comes to the rescue, leading to the following beautiful relation.

6 Don’t let the \(T\)s scare you: here \(TT^*M\) simply means the tangent bundle of the manifold \(T^*M\).

  • Theorem 3.15. The mapping (3.154) is an anti-homomorphism of Lie algebras, i.e., for any two functions \(F\) and \(G\) on \(T^*M\),

    \begin{equation} [X_{F}, X_{G}] = - X_{\big \{F, G\big \}}. \end{equation}

  • Proof. By the definitions of the Lie bracket and of hamiltonian vector fields, we have

    \begin{align} [X_{F}, X_{G}] f & = X_{F}(X_{G} f) - X_{G}(X_{F} f) \\ & = \big \{\big \{f, G\big \}, F\big \} - \big \{\big \{f, F\big \}, G\big \}. \end{align} The Jacobi identity (3.78) then gives

    \begin{align} \big \{\big \{f, G\big \}, F\big \} - \big \{\big \{f, F\big \}, G\big \} & = - \big \{f,\big \{F,G\big \}\big \} \\ & = - X_{\big \{F, G\big \}} f. \end{align}
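
For a concrete check of the sign, take \(n=1\) and the functions \(F = \frac 12 q^2\) and \(G = \frac 12 p^2\) (an illustrative choice, not taken from the text). The coordinate formula for hamiltonian vector fields gives \(X_F = -q\,\partial _p\), \(X_G = p\,\partial _q\) and \(\big \{F,G\big \} = qp\), so that

\begin{align} [X_F, X_G] & = X_F(p)\,\frac {\partial }{\partial q} - X_G(-q)\,\frac {\partial }{\partial p} = -q\,\frac {\partial }{\partial q} + p\,\frac {\partial }{\partial p}, \\ X_{\big \{F,G\big \}} & = \frac {\partial (qp)}{\partial p}\frac {\partial }{\partial q} - \frac {\partial (qp)}{\partial q}\frac {\partial }{\partial p} = q\,\frac {\partial }{\partial q} - p\,\frac {\partial }{\partial p}, \end{align}

and indeed \([X_F,X_G] = -X_{\{F,G\}}\).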

A crucial property of hamiltonian vector fields is that they are infinitesimal symmetries of the Poisson bracket.

  • Theorem 3.16. The hamiltonian vector field \(X_H\) associated to a smooth function \(H\) on \(T^*M\) is an infinitesimal symmetry of the Poisson bracket. Vice versa, for every infinitesimal symmetry \(X\) there exists, locally, a function \(H\) on \(T^*M\) such that \(X = X_H\).

  • Proof. The first part of the theorem follows from the Jacobi identity:

    \begin{align} X_H \big \{f,g\big \} & = \big \{\big \{f,g\big \}, H\big \} \\ & = \big \{\big \{f,H\big \},g\big \} + \big \{f,\big \{g,H\big \}\big \} \\ & = \big \{X_Hf,g\big \} + \big \{f,X_Hg\big \}, \end{align} which coincides with (3.148).

    For the second part of the theorem we need to do more work. Given two indices \(1\leq i,j \leq 2n\), take the coordinate functions \(f(x) = x^i\) and \(g(x) = x^j\) and use (3.88) to rewrite (3.148):

    \begin{align} 0 & = \big \{X f,g\big \} + \big \{f, Xg\big \} - X \big \{f,g\big \} \\ & = \big \{X^i(x),x^j\big \} + \big \{x^i,X^j(x)\big \} \\ & = J^{kj} \frac {\partial X^i}{\partial x^k} + J^{ik} \frac {\partial X^j}{\partial x^k}. \end{align} Here the term \(X\big \{x^i,x^j\big \}\) drops out because \(\big \{x^i,x^j\big \} = J^{ij}\) is constant. This has to hold for every choice of the indices, leading to the system of equations

    \begin{equation} \label {eq:systemvfH} J^{kj} \frac {\partial X^i}{\partial x^k} + J^{ik} \frac {\partial X^j}{\partial x^k} = 0, \quad i,j = 1,\ldots ,2n. \end{equation}

    Define

    \begin{equation} \omega _i := J_{ij} X^j, \quad \mbox {where}\quad (J_{ij}) := (J^{ij})^{-1}. \end{equation}

    Multiplying (3.167) by \(J_{\alpha i} J_{\beta j}\), and summing over the repeated indices \(i,j\), we end up with

    \begin{equation} \frac {\partial \omega _\alpha }{\partial x^\beta } - \frac {\partial \omega _\beta }{\partial x^\alpha } = 0, \quad \alpha ,\beta = 1, \ldots , 2n, \end{equation}

    which is equivalent to saying that the one–form

    \begin{equation} \omega = \omega _i \dd x^i \end{equation}

    is closed, \(\dd \omega = 0\).

    The Poincaré lemma [J.M13] then implies that, locally, there exists a function \(H\) such that

    \begin{equation} \omega = \dd H, \quad \mbox {i.e.}\quad \omega _i = \frac {\partial H}{\partial x^i}, \quad i=1,\ldots ,2n. \end{equation}

    Solving the equations \(J_{ij} X^j = \frac {\partial H}{\partial x^i}\) with respect to \(X\), we get

    \begin{equation} \label {eq:hamsysJ-1} X^j = J^{ji} \frac {\partial H}{\partial x^i}, \end{equation}

    which, recall equation (3.90), is exactly the hamiltonian vector field \(X_H\).
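
To see the construction at work, take \(n=1\), \(x=(q,p)\), and assume the convention \(J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\), so that \((J_{ij}) = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\). For the vector field \(X = q\,\partial _q - p\,\partial _p\), an illustrative choice, we get

\begin{align} \omega _1 & = J_{12}X^2 = p, \qquad \omega _2 = J_{21}X^1 = q, \\ \omega & = p\,\dd q + q\,\dd p = \dd (qp), \end{align}

so \(H = qp\) and, indeed, \(X_H = \frac {\partial H}{\partial p}\partial _q - \frac {\partial H}{\partial q}\partial _p = q\,\partial _q - p\,\partial _p = X\). By contrast, for \(X = q\,\partial _q\) one finds \(\omega = q\,\dd p\), which is not closed, so this field is not an infinitesimal symmetry.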

Given a one–parameter group of canonical transformations \(\Phi _t:T^*M\to T^*M\), we call hamiltonian generator of the group the hamiltonian \(H\) on \(T^*M\) such that

\begin{equation} \dv {t}\Phi _t(x)\Big |_{t=0} = X_H(x). \end{equation}

Even though we have only proved the local existence of the hamiltonian generator, there are many examples of globally defined generators, as was the case for the generators of symmetries in the Lagrangian setting.

  • Exercise 3.7. Let \(\Phi _t\) be the one–parameter group of canonical transformations generated by the quadratic hamiltonian

    \begin{equation} H = \frac 12 \sum _{i=1}^n\left (p_i^2 + (q^i)^2\right ). \end{equation}

    Show that the transformation \(\Phi _{t=\frac \pi 2}\) acts as follows

    \begin{equation} \Phi _{t=\frac \pi 2} : (q,p) \mapsto (p, -q) \end{equation}

  • Exercise 3.8. Let \(\phi _t:M\to M\) denote a one–parameter group of diffeomorphisms of the base \(M\) of the cotangent bundle \(T^*M\). Consider, as in Example 3.11, its cotangent lift

    \begin{equation} \Phi _t:T^*M\to T^*M, \quad \Phi _t(q,p) = \left (\phi _t(q), \left (\phi _t^*\right )^{-1}p\right ). \end{equation}

    Show that the hamiltonian generator of the canonical transformations \(\Phi _t\) depends linearly on the coordinates \(p_1, \ldots , p_n\) and is given by the following formula

    \begin{align} & H(q,p) = \langle p, X(q) \rangle = p_i X^i(q), \label {eq:linhamp} \\ & X(q) = (X^1(q), \ldots , X^n(q)), \quad X(q) = \dv {t} \phi _t(q)\Big |_{t=0}. \end{align}

3.6.2 The hamiltonian Noether theorem

With all these results at hand we are finally able to give a meaningful definition of hamiltonian symmetries. Note that, until now, in this chapter we have been discussing symmetries associated to the Poisson structure on the space \(\mathcal {C}^\infty (T^*M)\) of smooth real-valued functions on the cotangent bundle. We have not yet considered the symmetries of a specific hamiltonian function. To recover Noether’s theorem, we need to introduce precisely such a concept.

Fix a hamiltonian system on \(T^*M\) with hamiltonian \(H\). A diffeomorphism \(\Phi :T^*M\to T^* M\) is a symmetry of the hamiltonian system if \(\Phi \) is a canonical transformation and7 \(\Phi ^* H = H\).

7 This should be compared with a symmetry of a lagrangian \(L:TM \to \mathbb {R}\), i.e., a diffeomorphism \(\phi : M \to M\) such that \(\phi ^* L = L\). Note that, even though the formulas look very similar, the symmetry of the hamiltonian is defined on the whole phase space and is not induced by a symmetry of the base as in the Lagrangian case. In the hamiltonian formalism, position and momentum are really on an equal footing.

Theorem 3.14 then implies that symmetries map solutions of the hamiltonian system to other solutions.

We can finally state the hamiltonian counterpart of Noether theorem.

  • Theorem 3.17 (Hamiltonian Noether theorem). Let \(H\) be a hamiltonian on \(T^*M\) and let \(F\) be a first integral of the hamiltonian system. Then \(F\) generates a one–parameter group of symmetries of the hamiltonian system with hamiltonian \(H\).

    Vice versa, given a one–parameter group \(\Phi _t:T^*M\to T^* M\) of symmetries of the hamiltonian system, locally there exists a first integral \(F\) of \(H\) which generates the group.

  • Proof. The first part of the proof follows from Theorem 3.16 and the following corollary of Theorem 3.15.

    • Corollary 3.18 (of Theorem 3.15). Let \(H, F \in \mathcal {C}^\infty (T^*M)\) be two hamiltonian functions in involution, i.e., \(\{H,F\}=0\). Then, their flows, respectively denoted here by \(\Phi _t^H\) and \(\Phi _s^F\), commute:

      \begin{equation} \Phi _t^H \circ \Phi _s^F = \Phi _s^F \circ \Phi _t^H. \end{equation}

    For the second part of the theorem, it is enough to observe that the hamiltonian generator of the group constructed in the second part of the proof of Theorem 3.16 commutes with the hamiltonian, i.e., it is a first integral.

  • Example 3.12. Consider \((q,p) = (x,y,z,p_x,p_y,p_z)\in T^* \mathbb {R}^3\) and the infinitesimal generator \(M_z = x p_y - y p_x\) (which you may recognize as the \(z\)-component of the angular momentum).

    Parametrize the transformations with \(\theta \), as the parameter will turn out to correspond to an angle. Then the corresponding evolution in \(\theta \) is

    \begin{equation} \frac {\dd q}{\dd \theta } = \big \{q, M_z\big \} = (-y, x, 0),\qquad \frac {\dd p}{\dd \theta } = \big \{p, M_z\big \} = (-p_y, p_x, 0), \end{equation}

    which is readily solved by

    \begin{equation} q(\theta )= R q(0), \quad p(\theta ) = R p(0), \quad R := \begin{pmatrix} \cos \theta & -\sin \theta & 0 \\ \sin \theta & \cos \theta & 0 \\ 0 & 0 & 1 \end {pmatrix}. \end{equation}

    We see that the one–parameter group of transformations \(\Phi _\theta (q,p) = (Rq, Rp)\) generated by \(M_z\) is the group of counterclockwise rotations by the angle \(\theta \) about the \(z\) axis.

    In an analogous way, one can see that \(M_x = yp_z-zp_y\) and \(M_y = zp_x - xp_z\) generate rotations about the \(x\) and \(y\) axes, respectively. We can use the Poisson bracket to show that any hamiltonian of the form

    \begin{equation} H(q,p) = \frac 1{2m}\left (p_x^2 + p_y^2 + p_z^2\right ) + V\left (\sqrt {x^2 + y^2 + z^2}\right ) \end{equation}

    is in involution with the three components of the angular momentum. Indeed,

    \begin{equation} \big \{H,M_x\big \} = \big \{H,M_y\big \} = \big \{H,M_z\big \} = 0, \end{equation}

    which at the same time is a way to represent the conservation of angular momentum

    \begin{equation} \dot M_x = \dot M_y = \dot M_z = 0. \end{equation}
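
    As a sketch of the computation for \(M_z\), write \(r = \sqrt {x^2+y^2+z^2}\) and assume that \(V\) is differentiable and \(r\neq 0\). Then

    \begin{align} \big \{H, M_z\big \} & = \frac {\partial H}{\partial q^i}\frac {\partial M_z}{\partial p_i} - \frac {\partial H}{\partial p_i}\frac {\partial M_z}{\partial q^i} \\ & = \frac {V'(r)}{r}\big ( x\,(-y) + y\, x\big ) - \frac 1m\big (p_x\, p_y + p_y\,(-p_x)\big ) = 0. \end{align}

    The brackets with \(M_x\) and \(M_y\) vanish by the same cancellation.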

    Note that, even though the three angular momentum components are all in involution with the same hamiltonian, they do not commute with each other:

    \begin{equation} \label {eq:commM} \big \{M_x, M_y\big \} = M_z, \quad \big \{M_y, M_z\big \} = M_x, \quad \big \{M_z, M_x\big \} = M_y. \end{equation}

    In fact, the Poisson bracket relations above are the commutation relations of a basis of the Lie algebra of \(SO(3)\). Once we define action–angle coordinates, we will see that this non-commutativity implies that we cannot employ all three angular momenta simultaneously as configuration–space coordinates.

    In quantum mechanics one can observe something similar: the three angular momentum components are linear self-adjoint operators generating a unitary representation of the group of rotations in Hilbert space. In that setting, however, their non-commutation means that they cannot be simultaneously measured.
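
    For completeness, here is a sketch of the first bracket relation among the angular momentum components, using the coordinate expression \(\big \{f,g\big \} = \frac {\partial f}{\partial q^i}\frac {\partial g}{\partial p_i} - \frac {\partial f}{\partial p_i}\frac {\partial g}{\partial q^i}\), which is the sign convention compatible with \(X_H f = \big \{f,H\big \}\):

    \begin{align} \big \{M_x, M_y\big \} & = \big \{y p_z - z p_y,\; z p_x - x p_z\big \} \\ & = \frac {\partial M_x}{\partial z}\frac {\partial M_y}{\partial p_z} - \frac {\partial M_x}{\partial p_z}\frac {\partial M_y}{\partial z} \\ & = (-p_y)(-x) - y\, p_x = x p_y - y p_x = M_z. \end{align}

    Only the pair \(z, p_z\) contributes, because \(M_x\) does not depend on \(x, p_x\) and \(M_y\) does not depend on \(y, p_y\). The other two relations follow by cyclic permutation of \((x,y,z)\).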

  • Exercise 3.9. Let \(X(q), Y(q)\) denote two vector fields on \(M\) and define two hamiltonians linear in the momenta as in (3.177):

    \begin{equation} H_X = \langle p, X(q)\rangle ,\quad H_Y = \langle p, Y(q) \rangle . \end{equation}

    Show that their Poisson bracket is also linear in the momenta and is given by

    \begin{equation} \big \{H_X, H_Y\big \} = - H_{[X,Y]}. \end{equation}

    where \([X,Y]\) is the Lie bracket of the vector fields.

  • Exercise 3.10. Given a mechanical system which is invariant with respect to space translations, show that the components of the total momentum

    \begin{equation} \vb * P = (P_x, P_y, P_z) = \sum _i \vb *{p}_i \end{equation}

    are mutually commuting:

    \begin{equation} \big \{P_x, P_y\big \} =\big \{P_y, P_z\big \} =\big \{P_z, P_x\big \} =0. \end{equation}

3.7 The symplectic structure on the cotangent bundle

There is a beautiful geometric description that underlies everything that we have seen so far. It not only allows a coordinate-free description of Hamiltonian mechanics, but also provides the framework to develop many of our future discussions on integrability and perturbation theory. For a deeper geometric investigation than what we can discuss in this course, you can refer to [Sil08].

In fact, we have already encountered the symplectic form in disguise twice. Recall equations (3.90) and (3.172). Especially in the latter case, we were dealing with a mapping that relates

\begin{equation} \dd H = \left (\frac {\partial H}{\partial q^i}, \frac {\partial H}{\partial p_i}\right ) \leftrightarrow X_H = \left (\frac {\partial H}{\partial p_i}, - \frac {\partial H}{\partial q^i}\right ), \end{equation}

that is, a mapping that associates a vector field \(X_H\) to a differential \(\dd H\) via the matrix \(J\).

Since \(J\) is antisymmetric, antisymmetric bilinear maps are described by two-forms, and contraction with a vector field turns a two-form into a one-form, it is natural to ask whether we can use these concepts to connect the dots. We look for a two-form, say \(\alpha \), on the cotangent bundle \(T^*M\) such that \(\iota _{X_H} \alpha \) can be related to \(\dd H\). Let’s compute it!

Writing \(\alpha = \alpha _{ij}\, \dd q^i\wedge \dd q^j + \alpha ^{ij}\, \dd p_i\wedge \dd p_j + \alpha ^i_j\, \dd p_i\wedge \dd q^j\) in terms of the basis elements \(\dd q^i\wedge \dd q^j\), \(\dd p_i\wedge \dd p_j\), \(\dd p_i\wedge \dd q^j\), we have

\begin{align} \iota _{X_H} \alpha &= \alpha _{ij} (\dd q^i(X_H) \dd q^j - \dd q^j(X_H) \dd q^i) \\ &\quad + \alpha ^{ij} (\dd p_i(X_H) \dd p_j - \dd p_j(X_H) \dd p_i) \\ &\quad + \alpha ^i_j( \dd p_i(X_H) \dd q^j - \dd q^j(X_H) \dd p_i) \\ &= \alpha _{ij}\left (\frac {\partial H}{\partial p_i} \dd q^j - \frac {\partial H}{\partial p_j}\dd q^i\right ) \\ &\quad + \alpha ^{ij}\left (-\frac {\partial H}{\partial q^i} \dd p_j + \frac {\partial H}{\partial q^j} \dd p_i\right ) \\ &\quad + \alpha ^i_j\left (-\frac {\partial H}{\partial q^i} \dd q^j - \frac {\partial H}{\partial p_j} \dd p_i\right ), \end{align} where the first two sums run over \(i<j\) and the last one over all \(i,j\). This should be equal to

\begin{equation} \dd H = \frac {\partial H}{\partial q^i} \dd q^i + \frac {\partial H}{\partial p_i} \dd p_i. \end{equation}

That is, \(\alpha ^{ij}=\alpha _{ij} = 0\) and \(\alpha ^i_j = -\delta ^i_j\). Rewriting everything explicitly, we have computed

\begin{equation} \iota _{X_H}(\dd p_i\wedge \dd q^i) = - \dd H. \end{equation}

As it turns out, \(\dd p_i \wedge \dd q^i\) is a symplectic form, and the equation above is an alternative definition for the hamiltonian vector field (see Exercise 3.12).

Without further ado, let’s dig into it. Instead of working on the cotangent bundle \(T^*M\) over a smooth manifold \(M\), we will consider at first a generic smooth manifold \(P\).

Let \(\omega \) be a two–form on \(P\), that is, for each \(x\in P\), the map \(\omega _x: T_x P \times T_x P \to \mathbb {R}\) is a skew-symmetric bilinear map on the tangent space to \(P\) at \(x\). As we mentioned, in terms of coordinates and linear algebra, bilinear means that it corresponds to some \(\dim P\times \dim P\) matrix \(\Omega (x) = \left (\omega _{ij}(x)\right )_{1\leq i,j\leq \dim P}\) and skew-symmetric means that the matrix \(\Omega (x)\) is antisymmetric. In other words, \(\omega \), seen as a matrix, acts on vectors \(X,Y\) as \(X^T \Omega Y\).

We say that a two–form \(\omega \) on \(P\) defines a symplectic structure on \(P\) if it is closed8 and non–degenerate9. We call symplectic manifold the pair \((P, \omega )\) of a smooth manifold \(P\) with a symplectic structure \(\omega \) on \(P\).

8 I.e., \(\dd \omega = 0\).

9 I.e., for every \(x\in P\), if \(\omega _x(X,Y) = 0\) for all \(Y\in T_xP\), then \(X = 0\). Note that this is stronger than requiring \(\omega _x \neq 0\) for all \(x \in P\).

  • Remark 3.6. The fact that \(\omega \) is non-degenerate can be translated, in coordinates, into

    \begin{equation} \det (\Omega (x)) \neq 0 \quad \forall x\in P. \end{equation}

  • Remark 3.7. Note that for an \(n\times n\) antisymmetric matrix \(A = - A^T\),

    \begin{equation} \det A = \det (-A^T) = (-1)^n \det A^T = (-1)^n \det A, \end{equation}

    so \(\det A = 0\) whenever \(n\) is odd. Therefore, non–degeneracy implies that \(\dim P\) must be even.
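
    For instance, for \(n = 3\) a direct expansion along the first row (a quick sanity check with generic entries \(a,b,c\)) gives

    \begin{equation} \det \begin{pmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{pmatrix} = -a\,(bc) + b\,(ac) = 0, \end{equation}

    whereas for \(n=2\) the determinant of \(\begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix}\) is \(a^2\), which is nonzero as soon as \(a\neq 0\).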

If we go back to our example of an \(n\)-dimensional manifold \(M\), we can construct a symplectic structure on its cotangent bundle \(T^*M = P\) using the coordinates \(q,p\) associated to the local coordinates \(q^1, \ldots , q^n\) on \(M\). Then, the symplectic form is defined by the following formula

\begin{equation} \label {eq:canonical2form} \omega = \dd p_i \wedge \dd q^i, \end{equation}

where as usual we are summing the repeated indices.

  • Lemma 3.19. The matrix of the two–form \(\omega \) defined by (3.201) in the coordinates \(q,p\) is given by

    \begin{equation} \label {eq:symmatpqcoords} \Omega (x) = \Omega = \begin{pmatrix} 0 & -\Id _n \\ \Id _n & 0 \end {pmatrix}. \end{equation}

  • Proof. Denote the basis vectors of \(T_x P\) with

    \begin{equation} \left ( \frac {\partial }{\partial q^1}, \ldots , \frac {\partial }{\partial q^n}, \frac {\partial }{\partial p_1}, \ldots , \frac {\partial }{\partial p_n} \right ). \end{equation}

    The expression (3.202) follows from the action of the differentials on the basis vectors,

    \begin{equation} \dd q^i\left (\frac {\partial }{\partial q^j}\right ) = \dd p_i\left (\frac {\partial }{\partial p_j}\right ) =\delta ^i_j, \qquad \dd q^i\left (\frac {\partial }{\partial p_j}\right ) = \dd p_i\left (\frac {\partial }{\partial q^j}\right ) = 0, \end{equation}

    together with the definition of the action of a wedge product of two 1-forms \(\alpha , \beta \) on two tangent vectors \(X, Y\)

    \begin{equation} \alpha \wedge \beta (X, Y) = \det \begin{pmatrix} \alpha (X) & \alpha (Y) \\ \beta (X) & \beta (Y) \end {pmatrix}. \end{equation}
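
For example, writing out one block explicitly, for the pair of basis vectors \(\partial /\partial q^j\), \(\partial /\partial p_k\) the two formulas above give

\begin{align} \omega \left (\frac {\partial }{\partial q^j}, \frac {\partial }{\partial p_k}\right ) & = \dd p_i\left (\frac {\partial }{\partial q^j}\right )\dd q^i\left (\frac {\partial }{\partial p_k}\right ) - \dd p_i\left (\frac {\partial }{\partial p_k}\right )\dd q^i\left (\frac {\partial }{\partial q^j}\right ) \\ & = 0 - \delta _i^k\,\delta ^i_j = -\delta ^k_j, \end{align}

which is the upper-right block \(-\Id _n\) of \(\Omega \). The lower-left block follows by antisymmetry, and the diagonal blocks vanish because \(\dd p_i\) annihilates every \(\partial /\partial q^j\) and \(\dd q^i\) annihilates every \(\partial /\partial p_j\).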

Note that \(\Omega \) is the inverse to the matrix \(J = (J^{ij})\) introduced in (3.87).

  • Theorem 3.20. The two–form \(\omega \) defined by (3.201) does not depend on the choice of local coordinates \(q^1, \ldots , q^n\) on the base \(M\) of the cotangent bundle \(T^* M\). This two–form is closed and non–degenerate, and therefore defines a symplectic structure on the manifold \(T^*M\).

The proof is not hard, but it is more convenient if we first prove the following lemma.

  • Lemma 3.21. Let \(q^1, \ldots , q^n\) be a choice of local coordinates on a smooth manifold \(M\). Then the tautological one–form

    \begin{equation} \eta = p_i \dd q^i \end{equation}

    is a one–form on \(T^* M\) which does not depend on the choice of local coordinates. Moreover, the differential of such one–form coincides with the two–form (3.201):

    \begin{equation} \dd \eta = \omega . \end{equation}

  • Proof. The formula \(\dd \eta = \omega \) is immediate. It only remains to show the invariance of \(\eta \) with respect to coordinate transformations on \(M\):

    \begin{equation} q^i \mapsto \widetilde q^i = \widetilde q^i(q), \qquad p_i \mapsto \widetilde p_i = \frac {\partial q^k}{\partial \widetilde q^i} p_k. \end{equation}

    Using the definition, we have

    \begin{equation} \widetilde {p}_i \dd {\widetilde {q}^i} = \frac {\partial q^k}{\partial \widetilde {q}^i} p_k \frac {\partial \widetilde {q}^i}{\partial q^l} \dd {q^l} = \delta ^k_l p_k \dd {q^l} = p_k \dd {q^k}. \end{equation}

  • Proof of Theorem 3.20. Any differential form with constant coefficients is closed. The matrix (3.202) is non–degenerate. The invariance of \(\omega \) follows immediately from the invariance of \(\eta \).
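
As a concrete check of the invariance of \(\eta \), take \(M = \mathbb {R}^2\setminus \{0\}\) with cartesian coordinates \((x,y)\) and polar coordinates \((r,\varphi )\), \(x = r\cos \varphi \), \(y = r\sin \varphi \) (an illustrative choice of charts). The transformation rule for the momenta gives \(p_r = p_x\cos \varphi + p_y\sin \varphi \) and \(p_\varphi = -p_x r\sin \varphi + p_y r\cos \varphi \), so that

\begin{align} p_r\,\dd r + p_\varphi \,\dd \varphi & = p_x\left (\cos \varphi \,\dd r - r\sin \varphi \,\dd \varphi \right ) + p_y\left (\sin \varphi \,\dd r + r\cos \varphi \,\dd \varphi \right ) \\ & = p_x\,\dd x + p_y\,\dd y. \end{align}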

The tautological one–form, also known as the canonical one–form, the Liouville one–form or the Poincaré one–form, provides a natural way to endow \(T^*M\) with a symplectic structure.

  • Remark 3.8 (A coordinate-free explanation of the tautological one–form). The name tautological comes from the physical intuition that velocity and momenta are necessarily proportional to one another, but the connection between the two concepts runs deep, as previous remarks have shown. The tautological one–form enters this picture as a foundational concept.

    If you open the relevant Wikipedia page at the time of writing these lecture notes, you will read some variation of the following text.

    The tautological one-form is a device that converts velocities into momenta. That is, it assigns a numerical value to the momentum \(p\) for each velocity \(\dot {q}\). Moreover, it does so in such a way that they point “in the same direction”, and it does so linearly, so that the magnitudes grow in proportion. Finally, the choice of the assignment is unique: each momentum vector corresponds to only one velocity vector, by definition. In this respect, the tautological one–form can be thought of as a device to convert from Lagrangian mechanics to Hamiltonian mechanics.

    In what follows, we will try to understand what the aforementioned quote really means.

    Let \(P = T^* M\) be the cotangent bundle over the smooth manifold \(M\). The simplest map from the cotangent bundle to the base manifold is the canonical bundle projection

    \begin{equation} \pi :P\to M,\quad \pi (q,p) = q. \end{equation}

    Its differential defines a map

    \begin{equation} \pi _* : TP \to TM \end{equation}

    between tangent bundles. We have seen this multiple times, so nothing new for now.

    Let \(x\) be a point on \(P\). Since \(P = T^*M\) is the cotangent bundle over \(M\), we can think of points \(x = (q,p)\) as linear maps

    \begin{equation} x : T_q M \to \mathbb {R} \end{equation}

    on the tangent space of \(M\) at \(q := \pi (x)\). That is, \(x\) is an element of the fiber \(T^*_qM\) over \(q\).

    Since \(\pi _*(x): T_x P \to T_q M\), we can compose it with \(x\) to get a linear map on \(T_xP\). Indeed,

    \begin{equation} \eta (x) = x \circ \pi _*(x), \end{equation}

    defines a linear map

    \begin{equation} \eta (x):T_xP\to \mathbb {R}. \end{equation}

    Recalling that \(T_x^*P = \big \{\phi : T_x P \to \mathbb {R} \mid \phi \mbox { linear}\big \}\), this means that \(\eta (x)\in T^*_xP\) for every \(x\in P\), i.e.,

    \begin{equation} \eta : P \to T^* P, \quad x \mapsto \eta (x) \in T^*_x P. \end{equation}

    The map \(\eta \) we just defined is the tautological one–form.
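
    In coordinates this recovers the formula of Lemma 3.21. As a short sketch, write a tangent vector \(v\in T_xP\) as \(v = a^i\frac {\partial }{\partial q^i} + b_i\frac {\partial }{\partial p_i}\), where the components \(a^i, b_i\) are notation introduced only for this check. Since \(\pi (q,p) = q\),

    \begin{align} \pi _*(x)\, v & = a^i\frac {\partial }{\partial q^i}, \\ \eta (x)(v) & = x\big (\pi _*(x)\,v\big ) = p_i\, a^i = \left (p_i\,\dd q^i\right )(v), \end{align}

    so \(\eta = p_i\,\dd q^i\), with no dependence on the components \(b_i\).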

With these new concepts at hand, we have defined two natural geometric structures on \(T^* M\): the Poisson bracket and the symplectic structure. In coordinates, these two structures arise from two antisymmetric tensors: the \((2,0)\)-tensor \(J^{ij}\) from the Poisson bracket (a bilinear form on the cotangent space) and the \((0,2)\)-tensor \(\omega _{ij}\) of the symplectic form (a bilinear form on the tangent space). You may have observed by now that these two tensors are closely related, as their associated matrices are inverse to each other:

\begin{equation} \label {eq:omegaJinverse} \omega _{il}J^{lj} = \delta _i^j, \quad i,j = 1,\ldots , 2n. \end{equation}

In canonical coordinates \(q,p\), these matrices take respectively the constant form of (3.87) and (3.202).

  • Exercise 3.11. Show that \(\big \{f,g\big \} = -\mathcal {L}_{X_f} g = \mathcal {L}_{X_g} f\), where \(\mathcal {L}\) is the Lie derivative defined in (3.19). Remember that the interior product of a vector field with a function is zero by definition.

  • Exercise 3.12. Let \(X_H\) be a hamiltonian vector field on \(T^*M\). Show that the following holds

    \begin{equation} \label {eq:symXHdef} \iota _{X_H}\omega = -\dd H. \end{equation}

    You can do it in one line with Cartan’s magic formula (3.20).

    In symplectic geometry courses, equation (3.217) is usually given as the definition of a hamiltonian vector field. Indeed, show that if \(\iota _{X}\omega = -\dd H\), then \(X=X_H\) and

    \begin{equation} \{f,g\} = X_g f = \omega (X_g, X_f). \end{equation}

In the rest of this section we will construct a correspondence between the notions that we introduced in the previous chapters in terms of the Poisson bracket and the new language of symplectic manifolds. As we will see, this will allow us to discover a range of new interesting properties.

3.7.1 Symplectomorphisms and generating functions

We start by looking back at the canonical transformations.

Let \(\Phi : P \to P\) be a diffeomorphism on a symplectic manifold \((P,\omega )\). We say that \(\Phi \) is a symplectic transformation, or symplectomorphism, if it leaves the symplectic form \(\omega \) invariant:

\begin{equation} \label {eq:symplectomorphism} \Phi ^* \omega = \omega . \end{equation}

  • Theorem 3.22. Let \(\Phi : T^*M \to T^*M\) be a diffeomorphism. Then \(\Phi \) is a canonical transformation if and only if it is a symplectomorphism.

  • Proof. With our usual notation, let

    \begin{equation} y^k = \Phi ^k(x),\quad k=1,\ldots ,2n. \end{equation}

    Then the definition of canonical transformation in the local coordinates \(x\), see (3.131), becomes

    \begin{equation} J^{ij}\frac {\partial y^k}{\partial x^i}\frac {\partial y^l}{\partial x^j} = J^{kl}, \quad k,l = 1,\ldots ,2n. \end{equation}

    Inverting the Jacobian in the formula above and using (3.216), we get

    \begin{equation} \frac {\partial x^i}{\partial y^k}\frac {\partial x^j}{\partial y^l} \omega _{ij}= \omega _{kl}, \quad k,l = 1,\ldots ,2n. \end{equation}

    Consider now the inverse transformation

    \begin{equation} x^k = \left (\Phi ^{-1}\right )^k(y),\quad k=1,\ldots ,2n. \end{equation}

    By the equations above, we have

    \begin{align} \omega & = \sum _{k<l} \omega _{kl}\dd x^k \wedge \dd x^l = \frac 12 \omega _{kl} \dd x^k \wedge \dd x^l \\ & = \frac 12 \omega _{kl} \frac {\partial x^k}{\partial y^i}\frac {\partial x^l}{\partial y^j} \dd y^i \wedge \dd y^j \\ & = \frac 12 \omega _{ij} \dd y^i \wedge \dd y^j \end{align} In other words,

    \begin{equation} \left (\Phi ^{-1}\right )^* \omega = \omega , \end{equation}

    and by applying \(\Phi ^*\) we immediately get the invariance of \(\omega \) with respect to \(\Phi \). Repeating the computations from bottom to top, one proves the second part of the theorem.
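
As a minimal illustration with \(n=1\), consider the linear map \(\Phi (q,p) = (p,-q)\), chosen here only as an example. It is a symplectomorphism, since

\begin{equation} \Phi ^*\omega = \dd (p\circ \Phi )\wedge \dd (q\circ \Phi ) = \dd (-q)\wedge \dd p = \dd p\wedge \dd q = \omega , \end{equation}

and it is canonical, since \(\big \{q\circ \Phi , p\circ \Phi \big \} = \big \{p,-q\big \} = \big \{q,p\big \} = 1\). The scaling \((q,p)\mapsto (2q,p)\), on the other hand, pulls \(\omega \) back to \(2\,\omega \) and sends \(\big \{q,p\big \} = 1\) to \(\big \{2q,p\big \} = 2\), so it fails both conditions at once.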

If you have looked closely, you may have noticed that the proof above avoids any use of the cotangent bundle structure. It is then worth asking ourselves what happens if we consider the tautological one–form \(\eta \) instead of \(\omega \).

  • Theorem 3.23. Let \(\Phi : T^*M \to T^*M\) be a symplectomorphism. Then locally there exists a function \(S:T^*M\to \mathbb {R}\) such that

    \begin{equation} \label {eq:Sfungen} \Phi ^*\eta - \eta = \dd S. \end{equation}

  • Proof. The definition of symplectomorphism, the identity \(\dd \eta = \omega \) and the commutativity of the pullback and the exterior differential imply

    \begin{equation} 0 = \Phi ^*\omega - \omega = \Phi ^*(\dd \eta ) - \dd \eta = \dd (\Phi ^*\eta - \eta ), \end{equation}

    which means that \(\Phi ^*\eta - \eta \) is a closed one–form. The Poincaré lemma then implies the local existence of the function \(S\).

The function \(S\) appearing in (3.228) is called the generating function of the symplectic transformation. The name stems from the fact that we can use \(S\) to construct canonical transformations and to compute the corresponding transformed positions and momenta. Put slightly differently, equation (3.228) tells us that a canonical transformation can be encoded in a generating function. We leave this as a vague remark for the time being, but we will come back to it and study its consequences in more detail once we start discussing integrable systems. For now, let’s keep drawing parallels between the Poisson bracket and the symplectic formalism.
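
For instance, for the map \(\Phi (q,p) = (p,-q)\) on \(T^*\mathbb {R}\), an illustrative example which satisfies \(\Phi ^*\omega = \dd (-q)\wedge \dd p = \omega \), one finds

\begin{equation} \Phi ^*\eta - \eta = (p\circ \Phi )\,\dd (q\circ \Phi ) - p\,\dd q = -q\,\dd p - p\,\dd q = \dd (-qp), \end{equation}

so that \(S = -qp\), up to an additive constant, is a generating function for this transformation.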

We have shown in Theorem 3.16 that a hamiltonian vector field is an infinitesimal symmetry of the Poisson bracket, which in particular means that its flow defines a one–parameter group of canonical transformations. The following result is thus an immediate consequence of Theorem 3.22.

  • Theorem 3.24. Let \(H\) be a function on the manifold \(T^*M\). Then the hamiltonian flow \(\Phi _t\) generated by \(H\) is a one–parameter group of symplectomorphisms of \((T^*M, \omega )\).

We can now see the implications of the converse part of Theorem 3.16.

Let \(X\) be a vector field on a symplectic manifold \((P,\omega )\). We say that \(X\) is an infinitesimal symplectomorphism if the Lie derivative of the symplectic form \(\omega \) with respect to \(X\) is zero:

\begin{equation} \label {eq:infsymplcf} \mathcal {L}_X \omega = 0. \end{equation}

The reason for the name may become clearer if we observe that (3.230) means that, if \(\Phi _t\) is the flow of \(X\),

\begin{equation} \lim _{t\to 0} \frac {\Phi _t^*\omega - \omega }{t} = 0. \end{equation}

That is, \(\Phi _t^*\omega = \omega \) along the flow of \(X\).

Checking that hamiltonian vector fields satisfy this property is a matter of applying Cartan’s magic formula once:

\begin{equation} \mathcal {L}_{X_H} \omega = \dd (\iota _{X_H} \omega ) + \iota _{X_H}(\dd \omega ) = \dd (-\dd H) = 0. \end{equation}

Here we used the fact that \(\omega \) is closed, together with Exercise 3.12.

With this new baggage at hand, we can compare again Theorem 3.16 and Theorem 3.22 to extend the previous result with the following theorem.

  • Theorem 3.25. Let \(X\) be an infinitesimal symplectomorphism on the symplectic manifold \((P,\omega )\). Then locally there exists a function \(H\) such that \(X = X_H\).

  • Example 3.13 (Time dependent hamiltonians). It is possible to use the symplectic formulation to describe hamiltonian systems with an explicit time dependence. To this end, one needs to consider the extended phase space \(T^*M\times \mathbb {R}^2\), as we have already seen in Section 3.4.2.

    In the extended coordinates \((q^1, \ldots , q^n, p_1,\ldots ,p_n,t,E)\), the tautological one–form is given by

    \begin{equation} \eta = p_i\dd q^i -E\dd t, \end{equation}

    which leads to the symplectic form

    \begin{equation} \omega = \dd \eta = \dd p_i \wedge \dd q^i -\dd E \wedge \dd t. \end{equation}

3.7.2 Darboux Theorem

There seems to be more to symplectic manifolds than just cotangent bundles. However, just as manifolds locally look like euclidean spaces, the following theorem tells us that symplectic manifolds locally look like cotangent bundles.

Here we will present a very hands-on proof, along the lines of [J.M13, Problem 22-19]. For a more elegant approach, due to Moser and Weinstein, refer to [J.M13, Chapter 22] or [Kna18, Chapter 10.3].

  • Theorem 3.26 (Darboux’ Theorem). Let \((P, \omega )\) be a symplectic manifold and \(\dim P = 2n\). Then, there exist local coordinates

    \begin{equation} (q^1, \ldots , q^n, p_1, \ldots , p_n) \end{equation}

    around any point in \(P\) such that \(\omega = \dd p_i \wedge \dd q^i\).
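
As a simple illustration in dimension two, consider the punctured plane with polar coordinates \((r,\varphi )\) and the area form \(\omega = r\,\dd r\wedge \dd \varphi \), an example chosen here for concreteness. Setting

\begin{equation} p_1 = \frac {r^2}{2}, \qquad q^1 = \varphi , \end{equation}

we get \(\dd p_1\wedge \dd q^1 = r\,\dd r\wedge \dd \varphi = \omega \), so \((q^1,p_1)\) are local Darboux coordinates away from the origin.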

  • Proof. The proof is a sort of Gram-Schmidt process. Let \(x\) be a local system of coordinates in a neighborhood \(U_{x_0}\) of a point \(x_0\in P\) and let \(\omega = \frac 12 \omega _{ij}(x)\dd x^i\wedge \dd x^j\). We would like to find a change of variables that transforms \(\omega \) into \(\dd p_i \wedge \dd q^i\).

    Let \(p_1=p_1(x)\) be a smooth function of \(x\) in a neighborhood of \(x_0\) such that \(p_1(x_0) = 0\) and \(\dd p_1(x_0) \neq 0\). Define the vector field \(X_{p_1}\) by \(\iota _{X_{p_1}}\omega = -\dd p_1\); by the non–degeneracy of \(\omega \), it is nonzero near \(x_0\). Then, by the rectification theorem, around \(x_0\) we can find a system of coordinates \((q^1, y^0, y^1, \ldots , y^{2n-2})\) such that

    \begin{equation} X_{p_1} = \frac {\partial }{\partial q^1}, \end{equation}

    i.e., \(X_{p_1}\) generates translations in the coordinate \(q^1\). Since \(X_{p_1} p_1 = \big \{p_1,p_1\big \} = 0\), the function \(p_1\) cannot depend on \(q^1\) in these new local coordinates, thus we can instead choose the local system of coordinates \((q^1, p_1, y^1, \ldots , y^{2n-2})\). Furthermore, \(X_{p_1} q^1 = 1\) implies that

    \begin{equation} \big \{q^1, p_1\big \} = 1. \end{equation}

    If \(n=1\) we are done, and the coordinates are simply \((q^1, p_1)\).

    If \(n>1\) we proceed by induction. Assume that Darboux’ theorem holds in dimension \(2n-2\) and consider the set \(S\) obtained as the intersection of the two hypersurfaces

    \begin{equation} p_1(x) = 0 \quad \mbox {and}\quad q^1(x) = 0. \end{equation}

    As \(\omega (X_{p_1}, X_{q^1}) = \big \{q^1, p_1\big \} = 1\), the two hamiltonian vector fields \(X_{p_1}\) and \(X_{q^1}\) must be linearly independent in a neighborhood of \(x_0\). It follows that \(S\) is a manifold of dimension \(2n-2\) in that neighborhood.

    • Lemma 3.27. The symplectic structure \(\omega \) in \(\mathbb {R}^{2n}\) induces a symplectic structure in a neighborhood \(V_{x_0}\subset S\).

    • Proof. We need to show that \(\omega |_S\) is non–degenerate.

      For \(x\in V_{x_0}\), denote by \(\Phi _\tau ^\xi \) the flow generated by the vector \(\xi \in T_x S\). Since \(p_1\) and \(q^1\) are constant on \(S\), we have that for \(x\in S\)

      \begin{equation} \xi (p_1) = \frac {\dd }{\dd \tau } p_1(\Phi _\tau ^\xi x)\Big |_{\tau =0} = 0, \quad \xi (q^1) = \frac {\dd }{\dd \tau } q^1(\Phi _\tau ^\xi x)\Big |_{\tau =0} = 0. \end{equation}

      Therefore, for all \(\xi \in T_x S\),

      \begin{equation} \omega (\xi , X_{p_1}) = \dd p_1(\xi ) = 0,\quad \omega (\xi , X_{q^1}) = \dd q^1(\xi ) = 0. \end{equation}

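      Therefore, \(T_x S\) is contained in the symplectic complement of the plane spanned by \(X_{p_1}\) and \(X_{q^1}\) and, since both spaces have dimension \(2n-2\), it coincides with it. The proof now follows from the non–degeneracy of \(\omega \) on \(\mathbb {R}^{2n}\): if some nonzero \(\xi \in T_x S\) satisfied \(\omega (\xi , \eta ) = 0\) for all \(\eta \in T_x S\), then \(\xi \) would also be \(\omega \)-orthogonal to \(X_{p_1}\) and \(X_{q^1}\). As \(T_x S\) together with these two vectors spans \(\mathbb {R}^{2n}\), this would contradict the non–degeneracy of \(\omega \).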

    By the induction hypothesis, in a neighborhood \(V_{x_0}\subset S\) of the manifold \((S, \omega |_S)\) there are symplectic coordinates \((q^2, \ldots , q^n, p_2, \ldots , p_n)\). We want to extend \(q^2, \ldots , q^n\) and \(p_2, \ldots , p_n\) to a neighborhood \(U_{x_0}\subset \mathbb {R}^{2n}\).

    It follows from the linear independence of \(X_{p_1}\) and \(X_{q^1}\) and their commutativity \([X_{p_1}, X_{q^1}] = -X_{\big \{q^1,p_1\big \}} = 0\) that, if \(s\) and \(t\) are small enough, any point \(z\in U_{x_0}\setminus V_{x_0}\) can be written uniquely as

    \begin{equation} z = \Phi _t^{p_1}\Phi _s^{q^1} x, \quad \mbox {for some }x\in V_{x_0}. \end{equation}

    We define \(q^i(z) := q^i(x)\) and \(p_i(z) := p_i(x)\), \(i=2,\ldots ,n\). The \(2n\) coordinates \((q^1, \ldots , q^n, p_1, \ldots , p_n)\) define a system of local coordinates in the neighborhood of \(x_0\) in \(\mathbb {R}^{2n}\). It remains to show that these coordinates are symplectic.

    First of all, observe that, by construction, the coordinates \(q^2, \ldots , q^n\), \(p_2, \ldots , p_n\) are invariant with respect to \(\Phi _t^{p_1}\) and \(\Phi _s^{q^1}\) for small \(t\) and \(s\), and therefore

    \begin{equation} \big \{q^1, q^i\big \} = \big \{q^1, p_i\big \} = \big \{p_1, q^i\big \} = \big \{p_1, p_i\big \} = 0, \end{equation}

    where \(i = 2,\ldots ,n\). It follows that \(X_{q^i}\) and \(X_{p_i}\) are tangent to the manifold \(S'\) defined, for some small values \(c_1\) and \(c_2\), by \((q^1(z), p_1(z)) = (c_1, c_2)\). As such, they are hamiltonian vector fields on \(S'\) with hamiltonian functions \(q^i, p_i\), \(i=2,\ldots ,n\).

    Furthermore, since \(\Phi _t^{p_1}\) and \(\Phi _s^{q^1}\) are hamiltonian flows, they preserve the symplectic structure \(\omega \) and therefore

    \begin{equation} \omega (X_{q^i}, X_{p_j}) = \omega (X_{q^i|_S}, X_{p_j|_S}), \quad q^i = q^i(z), \quad p_j = p_j(z), \end{equation}

    where \(i,j=2,\ldots ,n\). By the induction hypothesis the coordinates on \(S\) are symplectic and, therefore, they are symplectic on the whole neighborhood \(U_{x_0}\) just constructed. This means that

    \begin{equation} \big \{q^i, p_j\big \} = \delta _j^i, \quad \big \{q^i, q^j\big \} = \big \{p_i, p_j\big \} = 0, \quad i,j=2,\ldots ,n, \end{equation}

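    and thus, together with \(\big \{q^1, p_1\big \} = 1\) and the vanishing of the brackets of \(q^1, p_1\) with the remaining coordinates, on \(U_{x_0}\subset \mathbb {R}^{2n}\) the symplectic form reads \(\omega =\dd p_i \wedge \dd q^i\), proving Darboux’ Theorem.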

3.7.3 Liouville theorem

The symplectic formalism allows us to prove, almost immediately, a crucial property of hamiltonian systems: the conservation of phase space volume along the flow. This is called Liouville theorem, and it is the focus of this section. To start with, let’s prove the following lemma.

  • Lemma 3.28. On a manifold \(M\) of dimension \(n\), let \(\omega \) be the symplectic form (3.201) on \(T^* M\). Then the exterior product of \(n\) copies of \(\omega \) is proportional to the volume form:

    \begin{equation} \Omega := \frac 1{n!} \omega \wedge \cdots \wedge \omega = \dd p_1 \wedge \dd q^1 \wedge \cdots \wedge \dd p_n \wedge \dd q^n. \end{equation}

  • Proof. First of all observe that for two fixed indices \(i,j\)

    \begin{align} (\dd p_i \wedge \dd q^i)\wedge (\dd p_j \wedge \dd q^j) & = (\dd p_j \wedge \dd q^j)\wedge (\dd p_i \wedge \dd q^i) \\ & = (1-\delta _i^j) \dd p_i \wedge \dd q^i \wedge \dd p_j \wedge \dd q^j, \end{align}

    i.e., only the terms with different indices survive. Therefore, in the exterior product of \(n\) copies of the two–form \(\omega = \dd p_1 \wedge \dd q^1 + \cdots + \dd p_n \wedge \dd q^n\), there are \(n!\) surviving terms, one for each ordering of the \(n\) distinct indices, all equal to \(\dd p_1 \wedge \dd q^1 \wedge \cdots \wedge \dd p_n \wedge \dd q^n\).
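
As a quick sanity check of this counting argument, consider the case \(n=2\), where \(\omega = \dd p_1 \wedge \dd q^1 + \dd p_2 \wedge \dd q^2\). The squares of the two summands vanish, while the two mixed terms coincide because two–forms commute with each other, so that

\[ \omega \wedge \omega = 2\, \dd p_1 \wedge \dd q^1 \wedge \dd p_2 \wedge \dd q^2, \qquad \frac 1{2!}\, \omega \wedge \omega = \dd p_1 \wedge \dd q^1 \wedge \dd p_2 \wedge \dd q^2, \]

in agreement with the \(2! = 2\) surviving terms counted in the proof.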

This lemma immediately implies the invariance of phase space volume with respect to canonical transformations (or symplectomorphisms). For \(D\subset T^*M\), we define the volume of \(D\) as the integral

\begin{equation} \label {eq:symplvolume} \Vol (D) :=\int _D \Omega , \quad \Omega = \frac 1{n!} \omega ^n = \dd p_1 \wedge \dd q^1 \wedge \cdots \wedge \dd p_n \wedge \dd q^n, \end{equation}

provided that this integral exists. We call measurable the subsets \(D\) for which \(\Vol (D)\) is well defined.

  • Exercise 3.13. Show that the definition of volume \(\Vol (D)\) presented above does not depend on the choice of local coordinates \(q^1, \ldots , q^n\) on \(M\).

  • Theorem 3.29. Let \(\Phi :T^*M \to T^*M\) be a canonical transformation and \(D\subset T^*M\) measurable, then

    \begin{equation} \Vol (\Phi (D)) = \Vol (D). \end{equation}

  • Proof. The proof follows from the definition (3.248) using the formula for the change of variables in a multiple integral

    \begin{equation} \int _{\Phi (D)}\Omega = \int _D \Phi ^*\Omega , \end{equation}

    and applying the definition of symplectomorphism \(\Phi ^* \omega = \omega \) to \(\Phi ^* \Omega = \frac 1{n!}\Phi ^*\omega \wedge \cdots \wedge \Phi ^*\omega \), which gives \(\Phi ^*\Omega = \Omega \).
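
For a linear map \(z\mapsto Az\) on \(\mathbb {R}^{2n}\) the statement can be checked by hand: the condition \(\Phi ^*\omega = \omega \) becomes \(A^T J A = J\), and volumes are rescaled by \(|\det A|\). The following is a minimal sketch for \(n=2\) in plain Python. The coordinate ordering \((q^1, q^2, p_1, p_2)\), the matrices `K`, `D`, `B` and all helper names are illustrative assumptions, not notation from these notes.

```python
# Minimal sketch in plain Python (no external libraries).
# Coordinates on R^4 are ordered as z = (q1, q2, p1, p2); all names are illustrative.

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def det(X):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(X) == 1:
        return X[0][0]
    return sum((-1) ** j * X[0][j] * det([row[:j] + row[j + 1:] for row in X[1:]])
               for j in range(len(X)))

# Matrix of the standard symplectic form (the overall sign does not affect the test).
J = [[0, 0, 1, 0],
     [0, 0, 0, 1],
     [-1, 0, 0, 0],
     [0, -1, 0, 0]]

def is_canonical(A):
    """A linear map is canonical iff A^T J A = J (exact for integer entries)."""
    return matmul(matmul(transpose(A), J), A) == J

# Shear p -> p + S q with S = [[2, 1], [1, 3]] symmetric.
K = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [2, 1, 1, 0],
     [1, 3, 0, 1]]
# Shear q -> q + T p with T = [[1, -1], [-1, 4]] symmetric.
D = [[1, 0, 1, -1],
     [0, 1, -1, 4],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]
A = matmul(K, D)   # composition of two canonical shears

# Counterexample: stretching q1 alone is not canonical and doubles volumes.
B = [[2, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]

print(is_canonical(A), det(A))   # True 1
print(is_canonical(B), det(B))   # False 2
```

Integer entries are used so that the comparison with `J` is exact. For a general real matrix one would compare up to a small tolerance instead.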

The renowned Liouville theorem is a special case of the theorem we just proved. For an alternative treatment of Liouville theorem, please refer to [Arn89, Chapter 3.16].

  • Theorem 3.30 (Liouville theorem). Let \(H\) be a hamiltonian function on \(T^*M\). Then, the hamiltonian flow \(\Phi _t\) generated by \(H\) preserves the volume, that is, for every measurable subset \(D\subset T^*M\) and any \(t\) we have

    \begin{equation} \Vol (\Phi _t(D)) = \Vol (D). \end{equation}
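
The theorem can be illustrated numerically for one degree of freedom, where the volume is just the area in the \((q,p)\) plane. The sketch below is a hedged illustration, not part of the original notes: it assumes the pendulum hamiltonian \(H = p^2/2 - \cos q\) and replaces the exact flow with the Störmer-Verlet (leapfrog) scheme. Each leapfrog step is a composition of shears, hence itself a canonical transformation, so the area of the transported region should be preserved up to the error of approximating its boundary by a polygon. All names and parameter values are assumptions chosen for the example.

```python
import math

def leapfrog_step(q, p, h):
    """One Stormer-Verlet step for H = p^2/2 - cos(q): q' = p, p' = -sin(q)."""
    p -= 0.5 * h * math.sin(q)   # half kick (shear in p)
    q += h * p                   # drift (shear in q)
    p -= 0.5 * h * math.sin(q)   # half kick (shear in p)
    return q, p

def flow(q, p, t, steps=500):
    """Approximate the time-t flow by `steps` leapfrog steps."""
    h = t / steps
    for _ in range(steps):
        q, p = leapfrog_step(q, p, h)
    return q, p

def polygon_area(points):
    """Area enclosed by a closed polygon (shoelace formula)."""
    total = 0.0
    for (q0, p0), (q1, p1) in zip(points, points[1:] + points[:1]):
        total += q0 * p1 - q1 * p0
    return abs(total) / 2.0

# Boundary of the disc D of radius 0.5 centred at (q, p) = (1, 0).
N = 2000
boundary = [(1.0 + 0.5 * math.cos(2 * math.pi * k / N),
             0.5 * math.sin(2 * math.pi * k / N)) for k in range(N)]

area_before = polygon_area(boundary)
area_after = polygon_area([flow(q, p, 5.0) for q, p in boundary])
relative_change = abs(area_after - area_before) / area_before

print(area_before, area_after, relative_change)
```

The disc is visibly sheared by the flow, since orbits of different amplitude have different periods, yet the two printed areas agree up to the small polygonal discretisation error.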

More general invariants can be found by considering other powers \(\omega ^k\) of the symplectic two–form, for \(k\leq n\), and their integrals over closed submanifolds or cycles. These are called integral invariants of Poincaré-Cartan. We will not treat them further in these notes, but we refer interested readers to [Arn89, Chapter 9.44].

A further interesting point of view on the meaning and consequences of Liouville theorem can be found in [Ton05, Chapters 4.2.1-4.2.2].

An important consequence of Theorem 3.29 is that we can apply a large corpus of methods from ergodic theory and, more generally, measure preserving dynamical systems to hamiltonian systems.

These methods apply to systems with finite measure. This is, however, not an uncommon property of hamiltonian systems: for a mechanical system with energy \(E = T + U\), due to energy conservation and the positivity of \(T\), the accessible phase space is contained in the region where \(U(q) \leq E\). If, for example, \(U\) is bounded from below and grows without bound as \(|q|\to \infty \), then this is a region of phase space with bounded volume. You can read more about this in [Arn89, Chapter 3.16.C-D].
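
As a simple illustration, not taken from the references above, consider the one-dimensional harmonic oscillator with \(H = \frac {p^2}{2m} + \frac {k q^2}{2}\), where the mass \(m>0\) and the stiffness \(k>0\) are symbols introduced only for this example. The region \(D_E = \{H \leq E\}\) is bounded by an ellipse with semi-axes \(\sqrt {2mE}\) in \(p\) and \(\sqrt {2E/k}\) in \(q\), hence

\[ \Vol (D_E) = \pi \sqrt {2mE}\, \sqrt {\frac {2E}{k}} = 2\pi E \sqrt {\frac {m}{k}}, \]

which is finite for every energy \(E\) and equals \(E\) times the period \(2\pi \sqrt {m/k}\) of the oscillations.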