Hamiltonian Mechanics


4 Integrable systems

In this chapter we will further develop the theory to investigate a very important family of hamiltonian systems, the so-called integrable systems. In some sense, we will generalize to higher dimensions the integration by quadrature that we saw in the first chapter of these notes. Integrable systems will be central to our study of perturbation theory and play a big role in the study of hamiltonian systems, symplectic geometry and symplectic topology, not to mention their generalizations...

4.1 Lagrangian submanifolds

Let \(M\) be a smooth manifold of dimension \(n\). We call lagrangian any submanifold \(\Lambda \subset T^*M\) such that \(\dim \Lambda = n\) and the symplectic two–form \(\omega \) vanishes on the corresponding tangent subbundle, i.e., \(\omega (X, Y) = 0\) for all \(X,Y \in \Gamma (T\Lambda )\).

As we will see by the end of this section, the lagrangian submanifolds defined by first integrals in involution have a crucial property: if a trajectory of the hamiltonian system intersects one of them, then it is fully contained in it. In other words, the hamiltonian motion is constrained to happen in such lagrangian submanifolds. This generalizes our observation that the energy levels of one dimensional conservative systems fully contain (and separate) the solutions of the corresponding equations of motion.

In general, any submanifold \(\Lambda \subset T^*M\) of dimension \(n\) can be characterized, at least locally, by a family of \(n\) independent equations on \(T^*M\):

\begin{equation} \label {eq:lagsubconstrf} f_i(q, p) = 0, \quad i=1,\ldots ,n, \end{equation}

such that the \(2n\times n\) matrix \(\left (\frac {\partial f_i}{\partial q^j}, \frac {\partial f_i}{\partial p_k}\right )\) has full rank.

  • Theorem 4.1. The submanifold \(\Lambda \) defined by (4.1) is lagrangian if and only if the functions \(f_i\) are in involution on \(\Lambda \), i.e.

    \begin{equation} \label {eq:charlagsm1} \big \{f_i, f_j\big \}\big |_\Lambda = 0, \quad i,j = 1,\ldots ,n. \end{equation}

  • Proof. Let \(X_{f_i}\) be the vector fields generated by the \(f_i\), i.e., \(\omega (X_{f_i}, \cdot )=-\dd f_i\). The derivative of \(f_i\) in the direction of \(X_{f_j}\) is given by

    \begin{equation} X_{f_j}f_i = \big \{f_i,f_j\big \}. \end{equation}

    Assume that \(\Lambda \) is a lagrangian submanifold. Then, every \(Y\in \Gamma (T\Lambda )\) is characterized by the equation \(\dd f_i(Y) = 0\), i.e.

    \begin{equation} \label {eq:charvtls} \omega (X_{f_i}, Y) = 0, \quad i=1,\ldots ,n. \end{equation}

    The set

    \begin{equation} T\Lambda ^\perp := \big \{Z\in T(T^*M) \mid \omega (Y,Z) = 0, \; \forall Y\in \Gamma (T\Lambda )\big \} \end{equation}

    is called the symplectic complement of \(T\Lambda \). The dimension of \(T\Lambda ^\perp \) is complementary to the one of \(T\Lambda \), and thus it is also \(n\). Since, by hypothesis, \(\omega \) vanishes on \(\Lambda \), it turns out that \(T\Lambda \subseteq T\Lambda ^\perp \). Being subspaces of equal dimension, they must coincide: any vector symplectically orthogonal to \(T\Lambda \) belongs to \(T\Lambda \). Therefore, by (4.4), the vectors \(X_{f_i}\) are tangent to \(\Lambda \) and, using again the fact that \(\Lambda \) is lagrangian, we have

    \begin{equation} 0 = \omega (X_{f_i}, X_{f_j}) = \big \{f_i, f_j\big \}\big |_\Lambda . \end{equation}

    This proves the first implication.

    If we assume (4.2) to hold true, then on \(\Lambda \) we have \(X_{f_j}f_i = 0\), that is, the fields \(X_{f_j}\) are tangent to the surface. Furthermore, they are linearly independent, as the \(\dd f_j\) are linearly independent by hypothesis. This means that the vectors \(X_{f_j}\) are a basis of \(T\Lambda \). Since \(\omega (X_{f_i}, X_{f_j}) = \big \{f_i,f_j\big \}\), (4.2) implies also that \(\omega (X_{f_i}, X_{f_j}) = 0\) for all \(i,j=1,\ldots ,n\). That is, \(\omega \) vanishes on \(T\Lambda \).

  • Example 4.1. The simplest examples of lagrangian submanifolds are

    \begin{align} & \Lambda _{q_0} := \big \{(q,p)\in T^*M \mid q=q_0\big \} \\ & \Lambda _{p_0} := \big \{(q,p)\in T^*M \mid p=p_0\big \}. \end{align} Observe that the canonical transformation \((q,p) \mapsto (p,-q)\) exchanges one with the other.

Any lagrangian submanifold can be written, locally, in a parametric fashion using the equations

\begin{equation} q^i = q^i(u^1, \ldots , u^n), \quad p_i = p_i(u^1, \ldots , u^n), \quad i=1,\ldots ,n \end{equation}

with the requirement that the \(n\times 2n\) matrix \(\left (\frac {\partial q^i}{\partial u^k}, \frac {\partial p_i}{\partial u^k}\right )_{i,k=1,\ldots ,n}\) is of full rank.

  • Theorem 4.2. The submanifold \(\Lambda \) is lagrangian if and only if the following equations are satisfied:

    \begin{equation} \label {eq:charbrals} \left (\frac {\partial p_i}{\partial u^k}\frac {\partial q^i}{\partial u^l} - \frac {\partial p_i}{\partial u^l}\frac {\partial q^i}{\partial u^k} \right ) = 0, \quad k,l=1,\ldots ,n \end{equation}

  • Proof. In canonical coordinates, the symplectic two–form takes the form \(\omega = \dd p_i \wedge \dd q^i\). If we substitute the parametric form of the coordinates, we obtain

    \begin{equation} \omega |_{T\Lambda } = \frac {\partial p_i}{\partial u^k}\dd u^k \wedge \frac {\partial q^i}{\partial u^l}\dd u^l. \end{equation}

    Then (4.10) is obtained by setting \(\omega |_{T\Lambda } = 0\) and recalling that \(\wedge \) is antisymmetric.

Finally, in analogy with the one dimensional examples, let’s consider a special class of lagrangian submanifolds which are parametrically described by

\begin{equation} \label {eq:lagsbmPofQ} p_i = W_i(q), \quad i=1,\ldots ,n. \end{equation}

  • Theorem 4.3. Equations (4.12) define a lagrangian submanifold if and only if the following one–form on \(M\) is closed:

    \begin{equation} W = W_i \dd q^i. \end{equation}

  • Proof. Setting \(u^k = q^k\) in Theorem 4.2, we get the equations

    \begin{equation} \frac {\partial p_k}{\partial q^l} - \frac {\partial p_l}{\partial q^k} = 0, \end{equation}

    that is,

    \begin{equation} \frac {\partial W_k}{\partial q^l} - \frac {\partial W_l}{\partial q^k} = 0 \end{equation}

    which, in turn, is equivalent to \(\dd W = 0\).

Since \(W\) is a closed form, it is locally exact and therefore there exists a function \(S: U \subset M \to \mathbb {R}\) such that \(\dd S = W\) or, in other words, such that

\begin{equation} p_i = \frac {\partial S}{\partial q^i}. \end{equation}

Such a function \(S\) is called the generating function of the lagrangian submanifold.
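
The closedness criterion of Theorem 4.3 can be checked numerically. The following is a minimal sketch in Python (standard library only); the function \(S(q^1,q^2) = (q^1)^2 q^2 + \sin (q^2)\) and the non-closed form used for comparison are illustrative assumptions, not taken from these notes, and the derivatives are approximated by central finite differences.

```python
import math

def grad_S(q1, q2):
    # W_i = dS/dq^i for the hypothetical choice S(q1, q2) = q1^2 * q2 + sin(q2).
    return 2.0 * q1 * q2, q1 ** 2 + math.cos(q2)

def rotation(q1, q2):
    # W = -q2 dq1 + q1 dq2 is not closed, so the graph p = W(q) is not lagrangian.
    return -q2, q1

def closedness_defect(field, q1, q2, h=1e-5):
    # dW_1/dq^2 - dW_2/dq^1 by central differences; zero iff dW = 0 at (q1, q2).
    dW1_dq2 = (field(q1, q2 + h)[0] - field(q1, q2 - h)[0]) / (2 * h)
    dW2_dq1 = (field(q1 + h, q2)[1] - field(q1 - h, q2)[1]) / (2 * h)
    return dW1_dq2 - dW2_dq1

print(closedness_defect(grad_S, 0.3, -0.7))    # close to 0: lagrangian graph
print(closedness_defect(rotation, 0.3, -0.7))  # close to -2: not lagrangian
```

The first field is a gradient, so its graph is a lagrangian submanifold with generating function \(S\); the second one fails the symmetry condition of Theorem 4.3.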

  • Remark 4.1. For a hamiltonian system with hamiltonian \(H\), we can compute the generating function of a lagrangian submanifold from \(n\) first integrals \(f_1, \ldots , f_n\) in involution such that

    \begin{equation} \label {eq:condlsgf} \det \left (\frac {\partial f_i}{\partial p_k}\right ) \neq 0. \end{equation}

    For any fixed \(n\)-tuple \(a = (a_1, \ldots , a_n) \in \mathbb {R}^n\) and \(f=(f_1,\ldots ,f_n)\), let

    \begin{equation} \Lambda _a = f^{-1}(a) := \{(q,p)\in T^*M \;\mid \; f(q,p) = a\} \end{equation}

    be the lagrangian submanifold described by the equations

    \begin{equation} f_k(q,p) = a_k, \quad k=1,\ldots ,n. \end{equation}

    Thanks to (4.17), we can apply the implicit function theorem and invert the equations \(f_k(q,p) = a_k\) with respect to \(p_i\). In this way, we obtain the functions

    \begin{equation} p_i = W_i(q,a). \end{equation}

    Since the equation above describes by construction the lagrangian submanifold \(\Lambda _a\), Theorem 4.3 implies that the one–form \(W_i\dd q^i\) is closed. Therefore, locally there is a function \(S(q, a)\) such that \(p_i = \partial _{q^i} S\).

    Furthermore, since \(\big \{f_i, H\big \} = 0\), the hamiltonian vector field satisfies the additional condition \(\omega (X_{f_i}, X_H) = 0\), which means that \(X_H\in T\Lambda _a\). Said in other words, the trajectories of the hamiltonian flow which intersect \(\Lambda _a\) are completely contained in \(\Lambda _a\).
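
As an illustration of the remark, here is a worked example under the assumption \(n=1\), with the harmonic oscillator chosen only for concreteness: take \(H(q,p) = \frac 12 (p^2 + q^2)\) and \(f_1 = H\). For \(a>0\), condition (4.17) reads \(\partial f_1/\partial p = p \neq 0\), so on the branch \(p>0\) of \(\Lambda _a\) we can solve for \(p\):

\begin{equation*} p = W(q,a) = \sqrt {2a - q^2}, \qquad |q| < \sqrt {2a}. \end{equation*}

A generating function is then obtained by quadrature,

\begin{equation*} S(q,a) = \int _0^q \sqrt {2a - x^2}\, \dd x = \frac {q}{2}\sqrt {2a - q^2} + a \arcsin \left (\frac {q}{\sqrt {2a}}\right ), \end{equation*}

and one checks directly that \(\partial S/\partial q = \sqrt {2a - q^2} = p\).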

4.2 Canonical transformations revisited

Our aim is the characterization of canonical transformations. We will start by recalling a few of our previous results.

Let \(M\), as usual, be a smooth manifold of dimension \(n\). If \(T^*M\) is endowed with a Poisson bracket structure, we saw that a diffeomorphism \(\Phi :T^*M\to T^*M\) is a canonical transformation if its pullback \(\Phi ^*:\mathcal {C}^\infty (T^*M)\to \mathcal {C}^\infty (T^*M)\) is an automorphism of the Lie algebra, i.e., \(\Phi ^*\big \{f,g\big \} = \big \{\Phi ^* f, \Phi ^* g\big \}\), see (3.129).

Moreover, we know from Theorem 3.22 that any canonical transformation \(\Phi :T^*M\to T^*M\) of a symplectic manifold \((T^*M,\omega )\) is a symplectomorphism, i.e., \(\Phi ^* \omega = \omega \).

We have also seen that a hamiltonian \(H(q,p)\) with canonical coordinates \((q,p)\) on \(T^*M\) gives rise to the hamiltonian system

\begin{equation} \dot q^i = \frac {\partial H(q,p)}{\partial p_i},\quad \dot p_i = -\frac {\partial H(q,p)}{\partial q^i},\quad i=1,\ldots ,n. \end{equation}

The action of the canonical transformation \(\Phi \) such that \((q^i, p_i) \mapsto (Q^i, P_i)\) defines a new hamiltonian \(K(Q,P) = H(q(Q,P), p(Q,P))\) keeping the symplectic form invariant, i.e.

\begin{equation} \dd p_i \wedge \dd q^i = \omega = \Phi ^* \omega = \dd P_i \wedge \dd Q^i. \end{equation}

In particular, this means that the corresponding equations of motion must keep the same structure

\begin{equation} \dot Q^i = \frac {\partial K(Q,P)}{\partial P_i},\quad \dot P_i = -\frac {\partial K(Q,P)}{\partial Q^i},\quad i=1,\ldots ,n. \end{equation}

In general, a change of coordinates \(x\mapsto X=(\Phi ^1(x), \ldots , \Phi ^{2n}(x))\) requires fixing \(2n\) functions. As we already saw, when \(\Phi \) is canonical it is uniquely determined by a single function, called the generating function of the canonical transformation.

The starting point for the construction of the generating function is Theorem 3.23, which confirms the existence, locally, of a function \(S=S(q,p)\) such that

\begin{equation} \label {eq:coordgenfun} p_i \dd q^i = P_i \dd Q^i + \dd S. \end{equation}

  • Remark 4.2. If \(q^i = Q^i\), \(i=1,\ldots ,n\), then \(S\) depends only on \(q\), i.e., \(S=S(q)\). In fact, in this case, (4.24) reads

    \begin{align} p_i \dd q^i - P_i \frac {\partial Q^i}{\partial q^j}\dd q^j - P_i \frac {\partial Q^i}{\partial p_j}\dd p_j & = (p_i -P_i)\dd q^i \\ & = \frac {\partial S}{\partial q^i}\dd q^i + \frac {\partial S}{\partial p_i}\dd p_i, \end{align} which implies \(\frac {\partial S}{\partial p_i} = 0\) and, thus, \(p_i = P_i + \frac {\partial S(q)}{\partial q^i}\).

4.2.1 The generating function \(S_1\)

Assume, now, that the \(2n\) functions \((q^i, Q^i)\) are independent, i.e., they define a system of coordinates on the manifold \(T^*M\). Recall that this is possible if the Jacobian determinant of the change of variable is non–zero, i.e.

\begin{equation} \det \left (\frac {\partial (q,Q)}{\partial (q,p)}\right ) = \det \left (\frac {\partial Q}{\partial p} \right )\neq 0. \end{equation}

In this case, the function \(S\), locally, can be expressed in terms of the coordinates \(q\) and \(Q\), i.e.

\begin{equation} S_1(q, Q) = S(q, p(q,Q)), \end{equation}

and (4.24) is equivalent to the equations

\begin{equation} \label {eq:coordgenfun1} p_i = \frac {\partial S_1}{\partial q^i}, \quad P_i = -\frac {\partial S_1}{\partial Q^i}, \quad i = 1,\ldots ,n. \end{equation}

It is important to emphasize that the function \(S_1\) is not defined on the phase space \(T^*M\) but, instead, in a domain of the direct product of two euclidean \(n\)-dimensional coordinate spaces \(\mathbb {R}^n_q\times \mathbb {R}^n_Q\).

  • Theorem 4.4. Let \(S_1(q,Q)\) be a function defined in a neighborhood of the point \((q_0, Q_0)\) of two euclidean coordinate spaces. If

    \begin{equation} \label {eq:coordgenfun1-ndeg} \det \left ( \frac {\partial ^2 S_1(q,Q)}{\partial q \partial Q} \right )\neq 0 \end{equation}

    and (4.29) holds, then \(S_1\) is the generating function of a canonical transformation.

  • Proof. Consider the first set of equations in (4.29) as equations for the unknowns \(Q\):

    \begin{equation} p_i = \frac {\partial S_1(q,Q)}{\partial q^i}, \quad i = 1,\ldots , n. \end{equation}

    Thanks to the non–degeneracy implied by (4.30), we can apply the implicit function theorem and resolve the equation for \(Q = Q(q,p)\) in a neighborhood of a point \((q_0, p_0) := \left (q_0, \frac {\partial S_1}{\partial q}\Big |_{(q,Q) =(q_0, Q_0)}\right )\). Changing variables in the equations for \(P\) in (4.29), we get

    \begin{equation} P_i(q,p) = -\frac {\partial S_1}{\partial Q^i}\Big |_{Q=Q(q,p)},\quad i=1,\ldots ,n. \end{equation}

    The transformation \((q,p) \mapsto (Q(q,p), P(q,p))\) is then canonical because by construction

    \begin{align} p_i \dd q^i - P_i \dd Q^i & = \frac {\partial S_1(q,Q)}{\partial q^i}\dd q^i + \frac {\partial S_1(q,Q)}{\partial Q^i}\dd Q^i \\ & = \dd S_1(q,Q). \end{align}
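
As a minimal illustration of Theorem 4.4 (a worked example added as a sketch, with the generating function chosen for simplicity), consider

\begin{equation*} S_1(q,Q) = \sum _{i=1}^n q^i Q^i, \qquad \det \left ( \frac {\partial ^2 S_1}{\partial q \partial Q} \right ) = \det (\delta _{ij}) = 1 \neq 0. \end{equation*}

Equations (4.29) then give \(p_i = Q^i\) and \(P_i = -q^i\), that is, the transformation \((q,p)\mapsto (Q,P) = (p,-q)\) that exchanges coordinates and momenta up to a sign. Indeed, in this case

\begin{equation*} p_i \dd q^i - P_i \dd Q^i = Q^i \dd q^i + q^i \dd Q^i = \dd \Big ( \sum _{i=1}^n q^i Q^i \Big ). \end{equation*}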

We call free a canonical transformation \(\Phi :T^*M\to T^* M\), \(\Phi (q,p) = (Q,P)\), for which we can choose \(q\) and \(Q\) as independent coordinates. In this case, the function \(S\) expressed in the coordinates \((q,Q)\) is called the generating function \(S_1\).

  • Remark 4.3. Not all canonical transformations are free: the identity transformation is not free, as \(q=Q\) and thus they are not independent.

  • Example 4.2. Let \(\omega > 0\) be some fixed constant. Consider the transformation \(\Phi \) given by

    \begin{equation} (q,p) \mapsto (Q,P) = \left ( \sqrt {\frac {2p}{\omega }} \sin (q), \sqrt {2p\omega }\cos (q) \right ). \end{equation}

    In this case, the conservation of the symplectic form can be checked immediately:

    \begin{align} \dd P\wedge \dd Q & = \left (\sqrt {\frac {\omega }{2 p}}\cos (q)\dd p - \sqrt {2 p\omega } \sin (q)\dd q\right ) \\ & \quad \wedge \left (\frac {1}{\sqrt {2\omega p}}\sin (q)\dd p + \sqrt {\frac {2 p}{\omega }} \cos (q)\dd q\right ) \\ & = \dd p \wedge \dd q. \end{align}

    For \(S_1(q,Q) = -\frac {\omega }{2}Q^2\cot (q)\), one gets

    \begin{equation} \frac {\partial S_1}{\partial q} = \frac {\omega Q^2}{2\sin ^2(q)}, \quad \frac {\partial S_1}{\partial Q} = - Q\omega \cot (q). \end{equation}

    That is, \(S_1\) generates the canonical transformation \(\Phi \) since

    \begin{equation} \frac {P}{Q} = \omega \cot (q), \quad \frac {Q^2}{\sin ^2(q)} = \frac {2p}{\omega }, \end{equation}

    which implies

    \begin{equation} P = -\frac {\partial S_1}{\partial Q}, \quad p = \frac {\partial S_1}{\partial q}. \end{equation}

    The hamiltonian \(H(Q,P) = \frac 12 (P^2 + \omega ^2 Q^2)\) is transformed into

    \begin{equation} K(q,p) = \omega p, \end{equation}

    and the linear equations of motion of \(H\) give rise to the equations of motion for \(K\):

    \begin{equation} \dot q = \omega , \quad \dot p = 0. \end{equation}
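
The claims of the example can be cross-checked numerically. The sketch below uses only the Python standard library; the value of \(\omega \), the sample point and the finite-difference step are arbitrary assumptions. It verifies that the Jacobian determinant of \((q,p)\mapsto (Q,P)\) equals \(1\) (which, for \(n=1\), is equivalent to \(\dd P\wedge \dd Q = \dd p\wedge \dd q\)), that \(S_1\) satisfies \(p = \partial S_1/\partial q\) and \(P = -\partial S_1/\partial Q\), and that \(H(Q,P) = \omega p\).

```python
import math

OMEGA = 1.7  # arbitrary positive value for the constant omega of the example

def transform(q, p):
    """Map (q, p) to (Q, P) as in the example; requires p > 0."""
    Q = math.sqrt(2 * p / OMEGA) * math.sin(q)
    P = math.sqrt(2 * p * OMEGA) * math.cos(q)
    return Q, P

def S1(q, Q):
    """Generating function S_1(q, Q) = -(omega / 2) Q^2 cot(q)."""
    return -0.5 * OMEGA * Q ** 2 / math.tan(q)

def jacobian_det(q, p, h=1e-6):
    """det d(Q, P)/d(q, p) by central differences."""
    Qa, Pa = transform(q + h, p)
    Qb, Pb = transform(q - h, p)
    Qc, Pc = transform(q, p + h)
    Qd, Pd = transform(q, p - h)
    dQ_dq, dP_dq = (Qa - Qb) / (2 * h), (Pa - Pb) / (2 * h)
    dQ_dp, dP_dp = (Qc - Qd) / (2 * h), (Pc - Pd) / (2 * h)
    return dQ_dq * dP_dp - dQ_dp * dP_dq

def generating_residuals(q, p, h=1e-6):
    """Return (dS1/dq - p, -dS1/dQ - P, H(Q, P) - omega * p); all should vanish."""
    Q, P = transform(q, p)
    dS_dq = (S1(q + h, Q) - S1(q - h, Q)) / (2 * h)
    dS_dQ = (S1(q, Q + h) - S1(q, Q - h)) / (2 * h)
    H = 0.5 * (P ** 2 + OMEGA ** 2 * Q ** 2)
    return dS_dq - p, -dS_dQ - P, H - OMEGA * p

print(jacobian_det(0.8, 1.3))          # close to 1
print(generating_residuals(0.8, 1.3))  # three numbers close to 0
```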

4.2.2 The generating functions \(S_2\) and \(S_3\)

We already know a trick to move from a generating function to a different one: the Legendre transform. For example, we could choose \((q,P)\) as coordinates and, if they are independent,

\begin{equation} \det \left ( \frac {\partial (q,P)}{\partial (q,p)}\right ) = \det \left ( \frac {\partial P}{\partial p} \right )\neq 0, \end{equation}

by adding and subtracting \(Q^i\dd P_i\) and doing simple algebraic manipulations we end up with

\begin{equation} p_i \dd q^i + Q^i \dd P_i = \dd (P_iQ^i + S). \end{equation}

The quantity \(P_iQ^i + S\), expressed as a function of \((q,P)\), is the new generating function:

\begin{equation} S_2(q,P) = P_i Q^i(q,P) + S(q, p(q,P)). \end{equation}

The coordinates \(p\) and \(Q\) are expressed through \(S_2\) by the relations

\begin{equation} \label {eq:coordgenfun2} p_i = \frac {\partial S_2(q,P)}{\partial q^i},\quad Q^i = \frac {\partial S_2(q,P)}{\partial P_i},\quad i=1,\ldots ,n. \end{equation}

  • Example 4.3. Let \(Q=Q(q)\) be a change of coordinates on the base \(M\) of the cotangent bundle \(T^*M\). This induces a transformation of the conjugate momenta \(p_i\mapsto P_i\) given by \(P_i = p_l \frac {\partial q^l}{\partial Q^i}\), and we already saw that \((q,p)\mapsto (Q,P)\) defines a canonical transformation. The generating function of such a transformation is given by

    \begin{equation} S_2(q,P) = \sum _{i=1}^n P_i Q^i(q). \end{equation}
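
For instance (a standard illustration, added here as a sketch), let \(q=(x,y)\) be cartesian coordinates on the half plane \(x>0\) and \(Q=(r,\theta )\) the corresponding polar coordinates, with \(r = \sqrt {x^2+y^2}\) and \(\theta = \arctan (y/x)\). Then

\begin{equation*} S_2(q,P) = P_r \sqrt {x^2+y^2} + P_\theta \arctan \left (\frac {y}{x}\right ), \end{equation*}

and the relations (4.47) give

\begin{equation*} p_x = \frac {\partial S_2}{\partial x} = P_r \frac {x}{r} - P_\theta \frac {y}{r^2}, \qquad p_y = \frac {\partial S_2}{\partial y} = P_r \frac {y}{r} + P_\theta \frac {x}{r^2}. \end{equation*}

Solving for the new momenta one finds

\begin{equation*} P_r = \frac {x p_x + y p_y}{r}, \qquad P_\theta = x p_y - y p_x, \end{equation*}

that is, the radial momentum and the angular momentum.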

In general, once the coordinates \(q\) are fixed, we can choose one of \(2^n\) collections

\begin{equation} \label {eq:PQcollection} P_{\vb *{i}} = (P_{i_1}, \ldots , P_{i_k}), \quad Q^{\vb *{j}} = (Q^{j_1}, \ldots , Q^{j_{n-k}}), \end{equation}

where \((i_1, \ldots , i_k)\) and \((j_1, \ldots , j_{n-k})\) form a partition of \((1,\ldots ,n)\) into two disjoint sets. These can be used to define a third common family of generating functions.

  • Theorem 4.5. Let \(\Phi :T^*M \to T^* M\) be the canonical transformation defined by the functions \(P(q,p)\) and \(Q(q,p)\). Then, in a neighborhood of a point \((q_0, p_0)\) it is possible to choose one of the \(2^n\) collections of functions \((q, Q^{\vb * j}, P_{\vb * i})\) from (4.49) as independent coordinates, i.e., such that

    \begin{equation} \label {eq:choiceofcoordsS3} \det \left (\frac {\partial (q, Q^{\vb * j}, P_{\vb * i})}{\partial (q,p)}\right ) \neq 0. \end{equation}

    The canonical transformation can then be recovered from

    \begin{equation} \dd S_3(q, Q^{\vb * j}, P_{\vb * i}) = \sum _{l=1}^k Q^{i_l} \dd P_{i_l} -\sum _{h=1}^{n-k} P_{j_h} \dd Q^{j_h} + \sum _{i=1}^n p_i \dd q^i \end{equation}

    via the relations

    \begin{equation} \label {eq:coordgenfun3} p_i = \frac {\partial S_3}{\partial q^i}, \quad Q^{i_l} = \frac {\partial S_3}{\partial P_{i_l}}, \quad P_{j_h} = -\frac {\partial S_3}{\partial Q^{j_h}}, \end{equation}

    where the indices run over the expected ranges \(i=1,\ldots ,n\), \(l=1,\ldots ,k\) and \(h = 1,\ldots ,n-k\).

    Conversely, if \(S_3(q, Q^{\vb * j}, P_{\vb * i})\) is a function such that

    \begin{equation} \det \left (\frac {\partial ^2 S_3}{\partial q \partial \widetilde p}\right ) \neq 0, \quad \widetilde p = (Q^{\vb * j}, P_{\vb * i}), \end{equation}

    then (4.52) define a canonical transformation around a point \((q_0, p_0)\).

  • Proof. The proof is practically identical to the one of Theorem 4.4, provided that we show the existence of a choice of coordinates satisfying (4.50). This is however always possible since \(x=(q,p) \mapsto \Phi (x) = (Q,P)\) and its inverse are non–degenerate and therefore the corresponding \(2n\times n\) matrix

    \begin{equation} \left (\frac {\partial Q}{\partial p_k}, \frac {\partial P}{\partial p_k}\right ) \end{equation}

    has maximal rank.

  • Remark 4.4. Sometimes, in the literature you can find a different classification in the following terms:

    • 1. \(S_1(q,Q) = S(q,p(q,Q))\), which depends only on the old and new generalized coordinates;

    • 2. \(S_2(q,P) = Q^i(q,P) P_i + S(q,p(q,P))\), which depends only on the old generalized coordinates and the new generalized momenta;

    • 3. \(S_3(p,Q) = - q^i(p,Q) p_i + S(q(p,Q),p)\), which depends only on the old generalized momenta and the new generalized coordinates;

    • 4. \(S_4(p,P) = - q^i(p,P) p_i + Q^i(p,P) P_i + S(q(p,P),p)\), which depends only on the old and new generalized momenta.

    Of course, as we also have shown, mixtures of these four types can exist.

4.2.3 Infinitesimal canonical transformations

There is one more important family of transformations, that extends the \(S_2\) mentioned above and will allow us to discuss infinitesimal canonical transformations.

Suppose that we have a family of transformations depending on a parameter \(\epsilon \) and reducing to the identity for \(\epsilon = 0\), i.e., for \(\epsilon \in (-\epsilon _0, \epsilon _0)\),

\begin{equation} \label {eq:infcantraf} P_i = p_i + \epsilon h_i(q,p;\epsilon ) \quad \mbox {and}\quad Q^i = q^i + \epsilon g^i(q,p; \epsilon ), \end{equation}

where \(h_i\) and \(g^i\) are smooth functions on some open set \(U\subset T^*M\) and \(\epsilon _0 \ll 1\) is small enough so that the transformation is non–degenerate.

  • Theorem 4.6. Every canonical transformation which is close to the identity admits a generating function of the form

    \begin{equation} \label {eq:infS2} S_2(q, P; \epsilon ) = q^i P_i + \epsilon \psi (q,P,\epsilon ), \end{equation}

    where \(\psi \) is a smooth function on the open set \(U\times (-\epsilon _0,\epsilon _0)\).

The proof follows along the lines of Theorem 4.4 and will be omitted.

The change of coordinates (4.55) is an infinitesimal canonical transformation if

\begin{equation} \label {eq:definfcantraf} \big \{Q^i, P_j\big \} = \delta _j^i + O(\epsilon ^2), \quad \big \{Q^i, Q^j\big \} = \big \{P_i, P_j\big \} = O(\epsilon ^2). \end{equation}

  • Theorem 4.7. The change of coordinates (4.55) is an infinitesimal canonical transformation if and only if there exists a function \(K=K(q,p)\) such that

    \begin{equation} h_i(q,p,0) = -\frac {\partial K(q,p)}{\partial q^i}, \quad g^i(q,p,0) = \frac {\partial K(q,p)}{\partial p_i}, \quad i=1,\ldots ,n. \end{equation}

    In this case, we say that \(K\) is the hamiltonian associated to the infinitesimal canonical transformation. Moreover, \(K(q,p) = \psi (q,p,0)\) where \(\psi (q,p,\epsilon )\) is the function appearing in (4.56).

  • Proof. It is enough to use (4.55) in the Poisson bracket (4.57). Comparing the terms of order \(\epsilon \) one gets the following conditions:

    \begin{align} & \frac {\partial h_i(q,p,0)}{\partial p_j} + \frac {\partial g^j(q,p,0)}{\partial q^i} = 0, \\ & \frac {\partial h_i(q,p,0)}{\partial q^j} - \frac {\partial h_j(q,p,0)}{\partial q^i} = 0, \\ & \frac {\partial g^j(q,p,0)}{\partial p_i} - \frac {\partial g^i(q,p,0)}{\partial p_j} = 0. \end{align} These equations, which should remind you of certain Poisson bracket terms, are exactly satisfied if in the open subset \(U\) there is a function \(K(q,p)\) such that

    \begin{equation} g^i(q,p,0) = \frac {\partial K(q,p)}{\partial p_i} \quad \mbox {and}\quad h_i(q,p,0) = -\frac {\partial K(q,p)}{\partial q^i}, \end{equation}

    where \(i=1,\ldots ,n\). Using (4.47) and (4.56), we observe that

    \begin{equation} Q^i = q^i + \epsilon \frac {\partial \psi (q,p,0)}{\partial p_i} + O(\epsilon ^2),\quad P_i = p_i - \epsilon \frac {\partial \psi (q,p,0)}{\partial q^i} + O(\epsilon ^2), \end{equation}

    and therefore \(K(q,p) = \psi (q,p,0)\).

From the expressions above, we can conclude that an infinitesimal transformation can be written as

\begin{equation} \label {eq:forminfct} Q^i = q^i + \epsilon \big \{q^i, K\big \}, \quad P_i = p_i + \epsilon \big \{p_i, K\big \}. \end{equation}
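
The order of the error in this formula can be observed numerically. The following sketch (Python standard library only) assumes the illustrative hamiltonian \(K(q,p) = \frac 12 p^2 - \cos (q)\) and the convention \(\big \{q,p\big \} = 1\); both are assumptions made for the example. It measures how far \(\big \{Q,P\big \}\) is from \(1\) and checks that the defect scales like \(\epsilon ^2\).

```python
import math

def dK_dq(q, p):
    # Partial derivative in q of the hypothetical K(q, p) = p^2 / 2 - cos(q).
    return math.sin(q)

def dK_dp(q, p):
    # Partial derivative in p of the same K.
    return p

def near_identity(q, p, eps):
    # Q = q + eps * dK/dp, P = p - eps * dK/dq.
    return q + eps * dK_dp(q, p), p - eps * dK_dq(q, p)

def bracket_defect(q, p, eps, h=1e-5):
    # {Q, P} - 1, with the partial derivatives taken by central differences.
    Qa, Pa = near_identity(q + h, p, eps)
    Qb, Pb = near_identity(q - h, p, eps)
    Qc, Pc = near_identity(q, p + h, eps)
    Qd, Pd = near_identity(q, p - h, eps)
    dQ_dq, dP_dq = (Qa - Qb) / (2 * h), (Pa - Pb) / (2 * h)
    dQ_dp, dP_dp = (Qc - Qd) / (2 * h), (Pc - Pd) / (2 * h)
    return dQ_dq * dP_dp - dQ_dp * dP_dq - 1.0

coarse = bracket_defect(0.4, 0.9, 0.10)
fine = bracket_defect(0.4, 0.9, 0.05)
print(coarse, fine, coarse / fine)  # the ratio is close to 4, i.e. the defect is O(eps^2)
```

For this particular \(K\) the defect can also be computed by hand: \(\big \{Q,P\big \} - 1 = \epsilon ^2\cos (q)\), so the transformation is canonical only up to second order in \(\epsilon \).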

Up until now, we’ve been thinking of canonical transformations in a passive sense, with \((Q,P)\) labelling the same point in phase space as \((q, p)\), but in different coordinates. A one–parameter family \(\Phi _\alpha \) of canonical transformations, however, could be interpreted in another, more dynamic, way: nothing prevents a transformation from taking us from one point \((q,p)=(Q(0), P(0))\) in phase space to a different point

\begin{equation} (Q(\alpha ),P(\alpha )) = (Q(q,p,\alpha ), P(q,p,\alpha )) \end{equation}

in the same phase space. In this active interpretation, as we vary the parameter \(\alpha \) we follow curves in phase space... we might as well relabel \(\alpha \) into \(t\) and see what happens. This is, in some sense, what we are going to do in the next section.

4.2.4 Time-dependent hamiltonian systems

To discuss canonical transformations for time-dependent hamiltonian systems, we consider again the extended phase space \(T^*M\times \mathbb {R}^2\) with the coordinates \((q^1,\ldots ,q^n,p_1,\ldots ,p_n, q^{n+1}=t, p_{n+1}=E)\) introduced in Section 3.4.2 and symplectic form \(\widetilde \omega = \dd p_i\wedge \dd q^i - \dd E\wedge \dd t\) discussed in Example 3.13.

In terms of the tautological one–form, we have

\begin{equation} \widetilde {\omega } = \dd {\widetilde {\eta }}, \quad \widetilde {\eta } = p_i \dd {q^i} - E \dd {t}. \end{equation}

A transformation \(\widetilde \Phi : T^*M\times \mathbb {R}^2 \to T^*M\times \mathbb {R}^2\) that associates \((q,p,t,E)\) to \((Q,P,T,\widetilde E)\) is canonical if it preserves the symplectic form, that is, if \({\widetilde \Phi }^* \widetilde \omega = \widetilde \omega \). Exactly as in the time-independent case, one can show that there exists a function \(S(q,p,t,E)\) such that

\begin{equation} \label {eq:timedepgen} p_i \dd q^i -E \dd t - P_i \dd Q^i + \widetilde E \dd T = \dd S(q,p,t,E). \end{equation}

  • Theorem 4.8. Let \((q,p,t,E) \to (Q,P,T,\widetilde E)\) be a canonical transformation in \(T^*M\times \mathbb {R}^2\). If

    \begin{equation} Q^i = Q^i(q,p,t), \quad P_i = P_i(q,p,t), \quad i=1,\ldots ,n, \end{equation}

    and either \(T\) is constant or \(\dd T = \dd t\), then \(\frac {\partial S}{\partial E} = 0\).

  • Proof. It is enough to expand the differential in \(Q\) in (4.67) to observe that

    \begin{align} & p_i \dd q^i -E \dd t - P_i \left ( \frac {\partial Q^i}{\partial q^j}\dd q^j +\frac {\partial Q^i}{\partial p_j}\dd p_j +\frac {\partial Q^i}{\partial t}\dd t \right ) + \widetilde E \dd T \\ & = \frac {\partial S}{\partial q^i}\dd q^i +\frac {\partial S}{\partial p_i}\dd p_i +\frac {\partial S}{\partial t}\dd t +\frac {\partial S}{\partial E}\dd E. \end{align} Since, by hypothesis, no term in \(\dd E\) appears on the left-hand side, comparing the coefficients of \(\dd E\) proves the claim.

If we choose as independent variables \((q,Q,t)\), we get

\begin{equation} p_i \dd q^i - E \dd t -P_i \dd Q^i + \widetilde E\dd T = \dd S(q,Q,t), \end{equation}

which, using the fact that \(\dd T = \dd t\), is equivalent to

\begin{equation} \label {eq:motionpPEtE} p_i = \frac {\partial S}{\partial q^i}, \quad P_i = - \frac {\partial S}{\partial Q^i}, \quad \widetilde E - E = \frac {\partial S}{\partial t}. \end{equation}

The last equation is usually written as a time reparametrization of hamiltonians:

\begin{equation} \label {eq:time-reparamqQt} K(Q, P(q,Q,t), T(t)) = H(q,p(q,Q,t),t) + \frac {\partial S(q,Q,t)}{\partial t}. \end{equation}

In the same way, if we choose as independent variables \((q,P,t)\), we get

\begin{align} & p_i = \frac {\partial S}{\partial q^i}, \quad Q^i = \frac {\partial S}{\partial P_i}, \\ & K(Q(q,P,t), P, T(t)) = H(q, p(q,P,t),t) + \frac {\partial S(q,P,t)}{\partial t}.\label {eq:time-reparamqPt} \end{align}

  • Example 4.4. Let \(H(q,p) = \frac {p^2}{2} + \alpha q\). Consider a time-dependent canonical transformation \(\Phi _t\) generated by \(H\): let \(q(0) = Q\) and \(p(0) = P\) and

    \begin{equation} \label {eq:unifaccm} p(t) = P - \alpha t, \quad q(t) = Q + Pt - \frac {1}{2}\alpha t^2. \end{equation}

    How are these equations related to each other? Do they ring a bell?

    We can compute the generating function for \(\Phi _t\) with a trick. Writing the change of coordinates in the form

    \begin{equation} p = \frac {q-Q}{t}-\frac 12\alpha t, \quad P = \frac {q-Q}{t} + \frac 12\alpha t, \end{equation}

    we see that \(S(q,Q,t)\) has to satisfy

    \begin{align} & p = \frac {\partial S(q,Q,t)}{\partial q} = \frac {q-Q}{t} - \frac 12\alpha t, \\ & P = - \frac {\partial S(q,Q,t)}{\partial Q} = \frac {q-Q}{t} + \frac 12\alpha t. \end{align} These can be integrated to find

    \begin{equation} S(q,Q,t) = \frac {(q-Q)^2}{2t} - \frac 12 \alpha (q+Q) t + f(t), \end{equation}

    where \(f\) is arbitrary and depends only on time.

    The new hamiltonian \(K(q,Q,t)\) is then computed as follows

    \begin{align} K(q,Q,t) & = K(Q, P(q,Q,t), T(t)) \\ & = H(q, p(q,Q,t)) + \frac {\partial S (q,Q,t)}{\partial t} \\ & = \frac {(q-Q)^2}{2t^2} + \frac 18 \alpha ^2 t^2 - \frac 12 \alpha (q-Q) + \alpha q \\ & \quad - \frac {(q-Q)^2}{2t^2} - \frac 12 \alpha (q+Q) + f'(t) \\ & = \frac {1}{8}\alpha ^2t^2 + f'(t). \end{align} Since \(f\) does not affect Hamilton’s equations for the new hamiltonian \(K\), we can use it to our advantage: choosing \(f'(t) = - \frac {1}{8}\alpha ^2t^2\) the hamiltonian vanishes identically, \(K\equiv 0\).

    As a side remark, observe that the same result can be obtained also by directly checking that \(\dot P = \dot Q = 0\).
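
The computation in the example can be double-checked numerically. The sketch below (Python standard library only; the value of \(\alpha \), the sample points and the finite-difference step are arbitrary assumptions) uses the choice \(f(t) = -\frac {1}{24}\alpha ^2 t^3\), so that \(f'(t) = -\frac 18 \alpha ^2 t^2\), and verifies along the flow that \(p = \partial S/\partial q\), \(P = -\partial S/\partial Q\) and \(H + \partial S/\partial t = 0\).

```python
ALPHA = 0.9  # arbitrary value for the constant alpha of the example

def S(q, Q, t):
    # Generating function of the example with f(t) = -alpha^2 t^3 / 24.
    return (q - Q) ** 2 / (2 * t) - 0.5 * ALPHA * (q + Q) * t - ALPHA ** 2 * t ** 3 / 24

def H(q, p):
    # Hamiltonian of the example.
    return 0.5 * p ** 2 + ALPHA * q

def flow(Q, P, t):
    # Solution (q(t), p(t)) with initial data q(0) = Q, p(0) = P.
    return Q + P * t - 0.5 * ALPHA * t ** 2, P - ALPHA * t

def residuals(Q, P, t, h=1e-6):
    # (dS/dq - p, -dS/dQ - P, H + dS/dt) by central differences; all should vanish.
    q, p = flow(Q, P, t)
    dS_dq = (S(q + h, Q, t) - S(q - h, Q, t)) / (2 * h)
    dS_dQ = (S(q, Q + h, t) - S(q, Q - h, t)) / (2 * h)
    dS_dt = (S(q, Q, t + h) - S(q, Q, t - h)) / (2 * h)
    return dS_dq - p, -dS_dQ - P, H(q, p) + dS_dt

print(residuals(0.3, -0.5, 2.0))  # three numbers close to 0
```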

Equations (4.76) represent the motion in phase space of a point particle in uniformly accelerated linear motion, which is, in fact, the solution of the hamiltonian system of the example. In other words, the one–parameter group of canonical transformations in the previous example is the hamiltonian flow itself!

The example above is a particular instance of the following more general fact.

  • Theorem 4.9. The hamiltonian flow is a canonical transformation.

  • Proof. Let \(H = H(q,p,t)\) be a hamiltonian function and \(\phi (t) = (q(t), p(t))\) its associated hamiltonian flow. As in the previous example, let \((Q, P) = \phi (0) = (q(0), p(0))\) be the variables at \(t=0\) and \((q, p) = (q(Q,P,t), p(Q,P,t))\) the ones at time \(t\). Define the function

    \begin{equation} S(Q,P,t) = \int _0^t\left ( p_j(Q,P,s)\frac {\partial q^j(Q,P,s)}{\partial s} - H(q(Q,P,s), p(Q,P,s), s) \right )\dd s. \end{equation}

    We can directly verify that such \(S\) is the generating function of the canonical transformation. Indeed, we have

    \begin{align} & \frac {\partial S}{\partial Q^i} = p_j(Q,P,t) \frac {\partial q^j}{\partial Q^i} - P_i, \\ & \frac {\partial S}{\partial P_i} = p_j(Q,P,t) \frac {\partial q^j}{\partial P_i}, \\ & \frac {\partial S}{\partial t} = p_j(Q,P,t) \frac {\partial q^j}{\partial t} - H(q(Q,P,t), p(Q,P,t), t), \end{align} which imply

    \begin{align} \dd S(Q, P, t) & = p_i(Q, P, t) \dd q^i (Q, P, t) - P_i \dd Q^i \\ & \quad - H(q(Q,P,t), p(Q,P,t), t) \dd t, \end{align} or, more compactly,

    \begin{equation} \dd S = p_i \dd q^i - P_i \dd Q^i - H \dd t. \end{equation}

    Therefore, the reparametrized hamiltonian is \(K\equiv 0\).

  • Remark 4.5. Choosing \((q,Q,t)\) as independent variables in the proof above one gets

    \begin{equation} H(q, p(q,Q,t), t) = - \frac {\partial S(q,Q,t)}{\partial t}. \end{equation}
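
    As a consistency check (a sketch based on Example 4.4, with the choice \(f'(t) = -\frac 18\alpha ^2 t^2\) made there), one can verify directly that \(H + \partial S/\partial t = 0\) in the variables \((q,Q,t)\):

    \begin{align*} H(q, p(q,Q,t)) & = \frac {(q-Q)^2}{2t^2} + \frac 12\alpha (q+Q) + \frac 18 \alpha ^2 t^2, \\ \frac {\partial S(q,Q,t)}{\partial t} & = -\frac {(q-Q)^2}{2t^2} - \frac 12\alpha (q+Q) - \frac 18 \alpha ^2 t^2. \end{align*}

    The two expressions are opposite to each other, in agreement with \(K\equiv 0\).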

4.3 Hamilton-Jacobi equations

In the previous chapter, we saw that the problem of integrating the equations of motion of a hamiltonian system with hamiltonian \(H(q,p,t)\) can be reduced to finding a one–parameter group of canonical transformations \(\Phi _t\) from the variables \((q,p)\) into the variables \((Q(q,p,t), P(q,p,t), t)\) such that the new hamiltonian \(K(Q,P,t) \equiv 0\).

Indeed, the equations for \(Q\) and \(P\) are immediately integrated as

\begin{equation} Q^i = a^i, \quad P_i = b_i, \quad i = 1,\ldots ,n, \end{equation}

where \((a,b)\in \mathbb {R}^{n}\times \mathbb {R}^n\) are arbitrary constants related to the initial conditions \((q(0), p(0))\). Then, if \(S = S(q,P,t)\) denotes the generating function of \(\Phi _t\), the condition \(P_i=b_i\) and (4.73) imply that \(S\) satisfies the Hamilton-Jacobi equation:

\begin{equation} \label {eq:Hamilton-Jacobi} H\left (q, \frac {\partial S(q,b,t)}{\partial q}, t\right ) + \frac {\partial S(q,b,t)}{\partial t} = 0. \end{equation}

Once we know \(S\), equation (4.75) implies

\begin{equation} a^i = \frac {\partial S(q,b,t)}{\partial b_i}, \quad i =1,\ldots ,n, \end{equation}

which, for non–degenerate \(S\), can be inverted to find \(q^i = q^i(a,b,t)\). The conjugate momenta are then immediately obtained by

\begin{equation} p_i = \frac {\partial S(q,b,t)}{\partial q^i}, \quad i =1,\ldots ,n, \end{equation}

which confirms that integrating the equations of motion can be reduced to determining \(S(q,b,t)\).
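
To see the procedure at work, here is a minimal worked sketch for a free particle in one dimension; the hamiltonian \(H = \frac {p^2}{2m}\) is an illustrative choice that is not part of the discussion above. The function

\begin{equation*} S(q,b,t) = b q - \frac {b^2}{2m} t \end{equation*}

solves the Hamilton-Jacobi equation, since \(\frac {1}{2m}\left (\frac {\partial S}{\partial q}\right )^2 + \frac {\partial S}{\partial t} = \frac {b^2}{2m} - \frac {b^2}{2m} = 0\). Differentiating with respect to \(b\) and with respect to \(q\) then gives

\begin{equation*} a = \frac {\partial S}{\partial b} = q - \frac {b}{m} t \quad \Longrightarrow \quad q(t) = a + \frac {b}{m} t, \qquad p = \frac {\partial S}{\partial q} = b, \end{equation*}

that is, uniform motion with initial position \(a\) and constant momentum \(b\). Here \(\frac {\partial ^2 S}{\partial q \partial b} = 1\), so the inversion is always possible.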

If the hamiltonian \(H\) does not depend on time, we have \(H=E=\mathrm {const}\) and, therefore, we can write \(S\) in the simpler form

\begin{equation} S(q,b,t) = S_0(q,b) - E t, \end{equation}

which gives the Hamilton-Jacobi equation

\begin{equation} \label {eq:HJnotime} H\left (q, \frac {\partial S_0(q,b)}{\partial q}\right ) = E. \end{equation}

The equations of motion can then be integrated as we did before.

We can give an interpretation of (4.99) in the following way. Assume that the coordinate transformation generated by \(S_0(q,b)\) maps \(H(q,p)\) to the new hamiltonian \(K(Q,P)=P_n\). In this case, the energy \(E\) would simply be the constant \(b_n\) and the integration of the equations of motion in the \((Q,P)\) coordinates would be

\begin{align} P_i(t) & = P_i(0) = b_i, \quad i=1,\ldots ,n, \\ Q^i(t) & = Q^i(0) = a^i, \quad i=1,\ldots ,n-1, \\ Q^n(t) & = t- t_0. \end{align} Then, we obtain the variables \((q,p)\) from \(S_0(q,b)\) using the relation

\begin{align} a^i & = \frac {\partial S_0(q,b)}{\partial b_i}, \quad i=1,\ldots ,n-1, \\ t-t_0 & = \frac {\partial S_0(q,b)}{\partial b_n}. \end{align} Under suitable assumptions on \(S_0\), these can be solved to get \(q^i = q^i(a,b,t)\). With \(q\) out of the way, we can then get the conjugate momenta from \(p_i = \frac {\partial S_0(q,b)}{\partial q^i}\).
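
As a worked sketch of this recipe, consider again the hamiltonian \(H(q,p) = \frac {p^2}{2} + \alpha q\) of Example 4.4, with \(n=1\), \(b_1 = E\) and \(\alpha \neq 0\) (the restriction to the branch \(p\geq 0\) is an assumption made here to fix the sign of the square root). The time-independent Hamilton-Jacobi equation \(\frac 12 \left (\frac {\partial S_0}{\partial q}\right )^2 + \alpha q = E\) is solved by

\begin{equation*} S_0(q,E) = -\frac {1}{3\alpha }\big (2(E-\alpha q)\big )^{3/2}, \qquad \frac {\partial S_0}{\partial q} = \sqrt {2(E-\alpha q)}, \end{equation*}

and the relation \(t - t_0 = \frac {\partial S_0}{\partial E}\) reads

\begin{equation*} t - t_0 = -\frac 1\alpha \sqrt {2(E-\alpha q)} \quad \Longrightarrow \quad q(t) = \frac {E}{\alpha } - \frac 12 \alpha (t-t_0)^2, \quad p(t) = \frac {\partial S_0}{\partial q} = -\alpha (t-t_0). \end{equation*}

This is the uniformly accelerated motion of Example 4.4 with \(P = \alpha t_0\) and \(Q = \frac {E}{\alpha } - \frac 12 \alpha t_0^2\).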

The Jacobi method generalizes the approach above by assuming that the coordinate transformation generated by \(S_0(q,b)\) maps \(H(q,p)\) to \(K = K(P)\), i.e., the coordinates \(Q\) are cyclic. The integration of the motion in the coordinates \((Q,P)\), then, becomes

\begin{equation} \label {eq:JacobiM} P_i = b_i, \quad Q^i = k^i t + a^i, \quad k^i = \dot Q^i = \frac {\partial K(P)}{\partial P_i}, \quad i = 1,\ldots ,n. \end{equation}

In this case, the energy is \(E = K(P) = K(b)\), and the integrated equations of motion for \(H\) have the form

\begin{align} q^i & = q^i(k^1 t + a^1, \ldots , k^n t + a^n, b_1, \ldots , b_n), \quad i=1,\ldots ,n, \\ p_i & = p_i(k^1 t + a^1, \ldots , k^n t + a^n, b_1, \ldots , b_n), \quad i=1,\ldots ,n. \end{align}

We say that the Hamilton-Jacobi equation (4.99) admits a complete integral if the function \(S\) which solves it depends on \(n\) constants.

The process described above, once developed in all its details, leads to the following theorem.

  • Theorem 4.10. Let \(H(q,p,t)\) be a hamiltonian function and \(S(q,b,t)\) be a complete integral of the Hamilton-Jacobi equation (4.95) depending on \(n\) constants \(b=(b_1,\ldots ,b_n)\). If \(S\) satisfies the non–degeneracy condition

    \begin{equation} \det \left (\frac {\partial ^2 S}{\partial q^i \partial b_j}\right ) \neq 0, \end{equation}

    then the determination of integral curves of Hamilton’s equations for \(H(q,p,t)\) only involves operations of inversion and substitution.

Despite the fact that the integration of partial differential equations is usually more difficult than solving ordinary equations, the Hamilton-Jacobi theory proved to be a powerful tool in the study of problems of optics, mechanics, geometry and optimal control.

Furthermore, for a hamiltonian system with \(n\) degrees of freedom, we can combine Theorem 4.3 and Remark 4.1 to show the equivalence between the existence of \(n\) first integrals in involution and the existence of a complete integral of the Hamilton-Jacobi equation. This result is also known as the Liouville theorem.

  • Theorem 4.11 (Liouville theorem). Let \(H(q,p)\) denote the hamiltonian of a hamiltonian system with \(n\) degrees of freedom. Assume that there exist \(n\) first integrals \(f_i\) of \(H\) which are in involution, i.e.

    \begin{equation} \big \{ f_i, f_j\big \} = 0, \quad i,j = 1,\ldots , n, \end{equation}

    such that the levelsets

    \begin{equation} f_i(q,p) = a_i, \quad i=1,\ldots ,n, \end{equation}

    define a lagrangian submanifold \(\Lambda _a = f^{-1}(a)\).

    If the following non–degeneracy condition is satisfied,

    \begin{equation} \det \left (\frac {\partial f_i}{\partial p_k}\right )\neq 0, \end{equation}

    then there exists, locally, a function \(S=S(q,a)\) which is also a complete integral of the Hamilton-Jacobi equation.

4.3.1 Separable hamiltonians

In many relevant cases, it is possible to find a complete integral of the Hamilton-Jacobi equation with the so-called separation of variables.

A coordinate, which we will denote \(q^1\), is called separable if there exists a function \(W_1(q^1, b)\) such that

\begin{equation} S(q,b) = S_0(q^2, \ldots , q^n, b) + W_1(q^1, b), \end{equation}

and the dependence of the hamiltonian \(H\) on \(q^1\) and \(W_1\) is of the form

\begin{equation} H\left (q,\frac {\partial S}{\partial q}\right ) = \widetilde H\left (q^2, \ldots , q^n , \frac {\partial S_0}{\partial q}, \psi _1\left (q^1, \frac {\partial W_1(q^1, b)}{\partial q^1}\right ) \right ), \end{equation}

for some function \(\psi _1\left (q^1, \frac {\partial W_1(q^1, b)}{\partial q^1}\right )\) that does not contain other coordinates.

  • Example 4.5. Cyclic coordinates are an immediate example of separable coordinates. Assume for example that \(q^1\) is cyclic, then the conjugate momentum \(p_1 = b_1\) is constant. The generating function has the form

    \begin{equation} S(q,b) = S_0(q^2, \ldots , q^n, b) + q^1 b_1, \end{equation}

    and the hamiltonian has the form

    \begin{equation} H(q^2, \ldots , q^n, \frac {\partial S_0}{\partial q^2}, \ldots , \frac {\partial S_0}{\partial q^n}, b_1). \end{equation}

A hamiltonian system is called separable if all its \(n\) coordinates are separable.

In this case, the function \(S\) becomes \(S(q,b) = \sum _{i=1}^n W_i(q^i, b)\) and the hamiltonian takes the form

\begin{equation} H\left (q,\frac {\partial S}{\partial q}\right ) = \widetilde H\left ( \psi _1\left (q^1, \frac {\partial W_1(q^1, b)}{\partial q^1}\right ), \ldots , \psi _n\left (q^n, \frac {\partial W_n(q^n, b)}{\partial q^n}\right ) \right ). \end{equation}

  • Example 4.6. A simple example of separable system is given by a sum of hamiltonians depending each on just one pair of conjugate coordinates:

    \begin{equation} H(q,p) = \sum _{i=1}^n h_i(q^i, p_i). \end{equation}
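
    In this case the separation can be carried out explicitly. The following is a sketch in which the names \(W_i\) and \(b_i\) follow the notation used above for separable systems, and each \(W_i\) is assumed to depend on the single constant \(b_i\). With the ansatz \(S_0(q,b) = \sum _{i=1}^n W_i(q^i, b_i)\), the time-independent Hamilton-Jacobi equation becomes \(\sum _{i=1}^n h_i\left (q^i, \frac {\partial W_i}{\partial q^i}\right ) = E\). Since each term of the sum depends on a different coordinate, each term must be constant:

    \begin{equation*} h_i\left (q^i, \frac {\partial W_i(q^i,b_i)}{\partial q^i}\right ) = b_i, \quad i=1,\ldots ,n, \qquad E = \sum _{i=1}^n b_i. \end{equation*}

    Each \(W_i\) is then obtained by solving an ordinary differential equation in the single variable \(q^i\).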

  • Example 4.7 (Planar harmonic oscillator). Let’s reconsider an old friend of ours, the planar harmonic oscillator defined by a point particle of mass \(m\) attracted by an elastic force centered at \(0\in \mathbb {R}^2\) with stiffness \(k\).

    The hamiltonian in cartesian coordinates \(x=(x_1, x_2)\) has the form

    \begin{equation} H(x,p) = \frac {1}{2m}(p_{x_1}^2 + p_{x_2}^2) + k(x_1^2 + x_2^2). \end{equation}

    Choosing the generating function

    \begin{equation} S(x_1, x_2, b, E, t) = W_1(x_1, b, E) + W_2(x_2, b, E) - E t, \end{equation}

    we obtain the Hamilton-Jacobi equation

    \begin{equation} \left (\frac {\partial W_1}{\partial x_1}\right )^2 + \left (\frac {\partial W_2}{\partial x_2}\right )^2 + 2m k (x_1^2 + x_2^2) = 2m E, \end{equation}

    which can be separated into the two equations

    \begin{equation} \left (\frac {\partial W_i}{\partial x_i}\right )^2 + 2m k x_i^2 = 2m k b_i, \quad i=1,2, \end{equation}

    such that \(K(b_1,b_2) := k(b_1 + b_2) = E\).

    The rest now follows from the Jacobi method. Since we are in the case described by (4.105), we have

    \begin{equation} \frac {\partial W_i}{\partial b_i} = \frac {\partial K}{\partial b_i}t + c_i, \quad i=1,2, \end{equation}

    where \(c_i\) are integration constants. Computing the left-hand side explicitly, we get

    \begin{equation} \frac {\partial W_i}{\partial b_i} = \sqrt {\frac {mk}{2}}\arcsin \left (\frac {x_i}{\sqrt {b_i}}\right ) = kt + c_i, \quad i=1,2. \end{equation}

    Some algebra, some care in checking domains of definition, and a rescaling of the constants \(c_i\) lead to

    \begin{equation} x_i = \sqrt {b_i} \sin \left (\sqrt {\frac {2k}{m}} t + c_i\right ), \quad i=1,2. \end{equation}

    The complete integral is thus given by the generating function

    \begin{equation} S(x_1,x_2,b_1,b_2,t) = W_1(x_1, b_1) + W_2(x_2,b_2) - K(b_1, b_2) t, \end{equation}

    which depends on time, the two (generalized) coordinates \((x_1, x_2)\) and the two integration constants \((b_1, b_2)\). Note that you may as well stick with \(S\) in terms of \((E, b)\), as we will do in the next example: what you end up choosing is just a matter of convenience.
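
    As a quick sanity check (a sketch that uses only the formulas of this example, with the shorthand \(\omega := \sqrt {2k/m}\) introduced here for convenience), the momenta along the solution are \(p_{x_i} = m\dot x_i = m\omega \sqrt {b_i}\cos (\omega t + c_i)\), and therefore

    \begin{align*} H & = \sum _{i=1}^2 \left ( \frac {m\omega ^2}{2} b_i \cos ^2(\omega t + c_i) + k b_i \sin ^2(\omega t + c_i) \right ) \\ & = k (b_1 + b_2) = K(b_1, b_2), \end{align*}

    where we used \(\frac {m\omega ^2}{2} = k\). The energy is thus constant along the motion and equals \(E\), as expected.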

  • Example 4.8. Let’s consider a more involved example of a point of mass \(m\) moving in \(\mathbb {R}^3\) under the influence of an arbitrary potential \(U\).

    In spherical coordinates, the hamiltonian has the form

    \begin{equation} H(r,\phi ,\theta , p) = \frac {1}{2m}\left (p_r^2 + \frac {p_\theta ^2}{r^2} + \frac {p_\phi ^2}{r^2\sin ^2\theta }\right ) + U(r,\phi ,\theta ). \end{equation}

    Separation of variables is possible if

    \begin{equation} U(r,\phi ,\theta ) = \alpha (r) + \frac {\gamma (\phi )}{r^2 \sin ^2(\theta )} + \frac {\beta (\theta )}{r^2}, \end{equation}

    for some functions \(\alpha (r)\), \(\beta (\theta )\), \(\gamma (\phi )\).

    For simplicity, let’s assume that \(\gamma (\phi ) = 0\), then \(\phi \) becomes a cyclic coordinate and \(p_\phi = \mathrm {const} =: b_\phi \). The corresponding Hamilton-Jacobi equation for \(S\) becomes

    \begin{equation} \left (\frac {\partial S}{\partial r}\right )^2 + 2m\alpha (r) + \frac {1}{r^2}\left ( \left (\frac {\partial S}{\partial \theta }\right )^2 + 2m\beta (\theta ) \right ) + \frac {1}{r^2\sin ^2(\theta )}\left (\frac {\partial S}{\partial \phi }\right )^2 = 2mE, \end{equation}

    where we are looking for a solution of the form

    \begin{equation} S_0(r,\phi ,\theta ) = \phi b_\phi + W_1(r) + W_2(\theta ). \end{equation}

    Replacing the ansatz for \(S\) in the equation above, we get

    \begin{align} & \left (\frac {\partial W_2}{\partial \theta }\right )^2 + 2m \beta (\theta ) + \frac {b_\phi ^2}{\sin ^2(\theta )} = b_\theta , \\ & \left (\frac {\partial W_1}{\partial r}\right )^2 + 2m \alpha (r) + \frac {b_\theta }{r^2} = 2mE. \end{align}

    Integrating, we end up with

    \begin{align} S_0 & = \phi b_\phi \\ & + \int \sqrt {2m(E-\alpha (r))-{b_\theta }/{r^2}}\dd r \\ & + \int \sqrt {b_\theta -2m\beta (\theta )-{b_\phi ^2}/{\sin ^2(\theta )}}\dd \theta . \end{align}

    The complete integral is then the generating function

    \begin{equation} S(r,\phi ,\theta ,E,b_\phi ,b_\theta ,t) = \phi b_\phi + W_1(r, E, b_\theta ) + W_2(\theta , b_\phi , b_\theta ) - E t, \end{equation}

    depending on time, the three (generalized) coordinates \((r,\phi ,\theta )\) and the three integration constants \((E,b_\phi ,b_\theta )\).

    To obtain \(r(t)\) and \(\theta (t)\) we then need to invert the integrals

    \begin{equation} \frac {\partial S}{\partial E} = Q_E = t-t_0, \quad \frac {\partial S}{\partial b_\theta } = Q_\theta = c_1, \quad \frac {\partial S}{\partial b_\phi } = Q_\phi = c_2, \end{equation}

    where \(c_1\) and \(c_2\) are arbitrary constants. The momenta \(p_\theta \) and \(p_r\) are then given by

    \begin{equation} p_r = \frac {\partial S}{\partial r}, \quad p_\theta = \frac {\partial S}{\partial \theta }. \end{equation}

    Doing these last computations analytically is generally hard or impossible because of the complicated integrals involved, and one usually resorts to numerical methods. On the other hand, by setting the constants to \(0\), these relations often allow one to compute explicit equations for the trajectories – without time dependence – and to obtain useful asymptotic approximations.
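
    For concreteness, the integrals to be inverted can be written out by differentiating \(S_0\) under the integral sign. The following is a sketch in which \(R(r) := 2m(E-\alpha (r)) - b_\theta /r^2\) and \(\Theta (\theta ) := b_\theta - 2m\beta (\theta ) - b_\phi ^2/\sin ^2(\theta )\) are shorthands introduced here:

    \begin{align*} t - t_0 & = \frac {\partial S}{\partial E} = \int \frac {m \dd r}{\sqrt {R(r)}}, \\ c_1 & = \frac {\partial S}{\partial b_\theta } = -\int \frac {\dd r}{2 r^2 \sqrt {R(r)}} + \int \frac {\dd \theta }{2\sqrt {\Theta (\theta )}}, \\ c_2 & = \frac {\partial S}{\partial b_\phi } = \phi - \int \frac {b_\phi \dd \theta }{\sin ^2(\theta ) \sqrt {\Theta (\theta )}}. \end{align*}

    The first relation determines \(r(t)\), the second relates \(\theta \) to \(r\), and the third relates \(\phi \) to \(\theta \).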

4.3.2 A few exercises
  • Exercise 4.1. Show that the infinitesimal transformation

    \begin{equation} q = Q + \epsilon Q^2 \sin (P), \quad p = P + 2\epsilon Q(1+\cos (P)), \end{equation}

    is a canonical transformation and determine its domain. Compute the hamiltonian \(K(Q,P)\) associated to the transformation.

    Consider then the transformation

    \begin{align} q & = Q + \epsilon Q^2 \sin (P) + \sum _{j=2}^\infty \frac {(2j-1)!!}{(j+1)!}2^j \epsilon ^j \sin ^j(P) Q^{j+1}, \\ p & = P + 2\epsilon Q^2\left (1+\cos (P)\right )\left ( 1 + \epsilon Q \sin (P) + \sum _{j=2}^\infty \frac {(2j-1)!!}{(j+1)!}2^j \epsilon ^j \sin ^j(P) Q^{j} \right ). \end{align} Show that it is a canonical transformation and compute its generating function.
    Hint: use the following series expansion \(\sqrt {1-x} = 1 - \frac {x}{2} - \sum _{j=2}^\infty \frac {(2j-3)!!}{j! 2^j} x^j\).

  • Exercise 4.2. Let \(Q = q^2 + \frac 12 \cos (q)\). Find \(P(q,p)\) such that the corresponding coordinate transformation is canonical. Compute the corresponding generating function.

  • Exercise 4.3 (Parabolic coordinates). Let \((r,\phi ,z)\) denote the cylindrical coordinates in \(\mathbb {R}^3\). Find the complete integral of the Hamilton-Jacobi equations for a point particle of mass \(m\) under the influence of the potential

    \begin{equation} U(r,\phi ,z) = \frac {\alpha }{\sqrt {r^2+z^2}} - F z. \end{equation}

    To this end, introduce the parabolic coordinates \((\xi , \eta , \phi )\) obtained from the equations

    \begin{equation} z = \frac 12(\xi - \eta ), \quad r = \sqrt {\xi \eta }. \end{equation}

  • Exercise 4.4 (Prolate spheroidal coordinates). Consider a point particle of mass \(m\) under the gravitational attraction of two bodies which are fixed at the points \((d,0,0)\) and \((-d,0,0)\), \(d>0\). Show that the Hamilton-Jacobi equation is separable introducing the prolate spheroidal coordinates obtained by

    \begin{equation} x = d \cosh \xi \cos \eta , \quad y = d \sinh \xi \sin \eta \cos \phi , \quad z = d \sinh \xi \sin \eta \sin \phi , \end{equation}

    where \(\xi \in \mathbb {R}_+\), \(0\leq \eta \leq \pi \) and \(0\leq \phi \leq 2\pi \).

  • Exercise 4.5. Consider the motion of a free particle of mass \(m\) on a surface \(S\subset \mathbb {R}^3\) parametrized by

    \begin{equation} x = u \cos v, \quad y = u \sin v, \quad z = \psi (u), \end{equation}

    with \(u\in \mathbb {R}\) and \(0\leq v\leq 2\pi \). Determine the conjugate momenta associated to the variables \(v\) and \(u\) and show that the corresponding hamiltonian system is separable.

4.4 The Liouville-Arnold theorem

We have seen that the equations of motion of a lagrangian with one degree of freedom are integrable by quadratures. In fact, throughout the previous chapters we have seen more.

The trajectories of the motion are the level curves \(\lambda _E := H^{-1}(E) = \{H(q,p) = E\}\) of the hamiltonian \(H(q,p)\), whose dynamics is obtained by integrating

\begin{equation} \dd t = \frac {\dd q}{\partial _p H(q,p)} = - \frac {\dd p}{\partial _q H(q,p)} \end{equation}

over \(\lambda _E\). If \(\lambda _E\) is compact, then the motion is finite and periodic with period

\begin{equation} T = \oint _{\lambda _E}\dd t. \end{equation}

If \(\lambda _E\) is smooth, then the gradient \((\partial _q H, \partial _p H)\neq 0\) over \(\lambda _E\), therefore we can reparametrize the motion as \((q,p) = (q(\phi , E), p(\phi , E))\) where

\begin{equation} \label {eq:phirep1d} \phi = \frac {2\pi }{T}\int _{\lambda _E} \frac {\dd q}{\partial _p H(q,p)} = -\frac {2\pi }{T}\int _{\lambda _E} \frac {\dd p}{\partial _q H(q,p)} \end{equation}

and the functions \(q(\phi , E)\) and \(p(\phi , E)\) are \(2\pi \)-periodic in \(\phi \). In fact, in such case, we also saw that

\begin{equation} T = \oint _{\lambda _E}\dd t = \frac {\dd }{\dd E} \oint _{\lambda _E} p\dd q. \end{equation}

  • Remark 4.6. In the integral (4.148) you can choose the initial point arbitrarily as long as it varies smoothly with respect to \(E\).

With such reparametrization we end up with a periodic motion of the form

\begin{equation} \left \lbrace \begin{aligned} & q = q\left (\omega t + \phi ^0, E\right ) \\ & p = p\left (\omega t + \phi ^0, E\right ) \end {aligned} \right ., \quad \omega = \frac {2\pi }{T}, \quad \phi ^0\mbox { arbitrary}. \end{equation}
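
As an illustration (a sketch for the one–dimensional harmonic oscillator \(H(q,p) = \frac {p^2}{2m} + \frac 12 m\Omega ^2 q^2\), where the parameter \(\Omega >0\) is introduced here only for this example), the level curve \(\lambda _E\), \(E>0\), is an ellipse with semi-axes \(\sqrt {2E/(m\Omega ^2)}\) and \(\sqrt {2mE}\), so that

\begin{equation*} \oint _{\lambda _E} p \dd q = \pi \sqrt {\frac {2E}{m\Omega ^2}}\sqrt {2mE} = \frac {2\pi E}{\Omega }, \qquad T = \frac {\dd }{\dd E}\, \frac {2\pi E}{\Omega } = \frac {2\pi }{\Omega }. \end{equation*}

Hence \(\omega = \Omega \), and one possible parametrization of the motion is

\begin{equation*} q(\phi , E) = \sqrt {\frac {2E}{m\Omega ^2}} \sin \phi , \quad p(\phi , E) = \sqrt {2mE}\cos \phi , \quad \phi = \Omega t + \phi ^0. \end{equation*}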

Consider now a hamiltonian system \(H(q,p)\) with \(n\) degrees of freedom. If the system admits \(n\) independent first integrals \(H_1(q,p), H_2(q,p), \ldots , H_n(q,p)\) in involution, i.e.

\begin{equation} \big \{H_i, H_j\big \} = 0,\quad i,j=1,\ldots ,n, \end{equation}

we say that it is a completely integrable hamiltonian system.

Without loss of generality, let’s assume \(H_1 = H\). We will show that a completely integrable system shares the property of one–dimensional conservative hamiltonian systems: i.e., it is integrable by quadratures. Moreover, if for \(E = (E_1, \ldots , E_n)\in \mathbb {R}^n\) the levelset

\begin{equation} \label {eq:levelsetE} M_E := H^{-1}(E) = \big \{(q,p)\in \mathbb {R}^{2n} \;\mid \; H_i(q,p)=E_i,\;i=1,\ldots ,n \big \} \end{equation}

is compact, then the motion is equivalent to the motion of a free flow on the torus. In fact, the integrability allows us to introduce some local canonical coordinates

\begin{equation} \label {eq:actionalngle} (\phi _1, \ldots , \phi _n, I_1, \ldots , I_n) \in \mathbb {T}^n\times \mathbb {R}^n \end{equation}

such that the hamiltonian system takes the form

\begin{equation} \dot I_k = 0, \quad \dot \phi _k = \omega _k(I), \quad k=1,\ldots ,n, \end{equation}

whose corresponding solution is simply

\begin{equation} I_k(t)= I_k(0), \quad \phi _k(t) = \phi _k(0) + \omega _k(I)t, \quad k = 1,\ldots , n. \end{equation}

  • Theorem 4.12 (Liouville-Arnold). Consider a completely integrable hamiltonian system on the phase space \(\mathbb {R}^{2n}\). Assume that there exists \(E^0 = (E^0_1, \ldots , E^0_n)\) such that the functions

    \begin{equation} H_1(q,p), \ldots , H_n(q,p) \end{equation}

    are independent at every point of the level surface \(M_{E^0}\) defined by (4.152).

    Then, for values of \(|E-E^0|\) small enough, the following holds

    • the level surfaces \(M_E\) are smooth \(n\)-dimensional lagrangian submanifolds in \(\mathbb {R}^{2n}\) which are invariant under the hamiltonian flow;

    • the hamiltonian flow on \(M_E\) is integrable by quadratures.

    Furthermore, if \(M_{E^0}\) is compact and connected, then, for values of \(|E-E^0|\) small enough, the level surfaces \(M_E\) are diffeomorphic to an \(n\)-dimensional torus,

    \begin{equation} \label {eq:idtorusstd} M_E \simeq \mathbb {T}^n = \big \{ (\phi _1, \ldots , \phi _n)\in \mathbb {R}^n \;\mid \; \phi _i \sim \phi _i+2\pi , \; i=1,\ldots ,n \big \}, \end{equation}

    and the motion on the tori \(M_E\) is conditionally periodic. That is, in the coordinates \((\phi _1, \ldots , \phi _n)\) the motion takes the form

    \begin{equation} \label {eq:condperflow} \phi _1(t) = \omega _1(E) t + \phi _1^0, \quad \ldots \quad \phi _n(t) = \omega _n(E) t + \phi _n^0, \end{equation}

    where \(\omega _1(E),\ldots ,\omega _n(E)\) are constants which only depend on \(E\) and the phases \(\phi _1^0,\ldots ,\phi _n^0\) are arbitrary.

  • Remark 4.7. A separable hamiltonian system is completely integrable. Indeed, for a separable system there exists a canonical transformation \((q,p) \mapsto (Q,P)\) such that the hamiltonian \(H\) in the new coordinates does not depend on \(Q\), \(H=H(P)\). The functions \(P_1, \ldots , P_n\), which are in involution by construction, commute with \(H\).
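
    For instance (a sketch based on the planar harmonic oscillator of Example 4.7, with \(m, k > 0\)), one can take

    \begin{equation*} H_1 = H = h_1 + h_2, \quad H_2 = h_2, \qquad h_i := \frac {p_{x_i}^2}{2m} + k x_i^2, \quad i=1,2. \end{equation*}

    Then \(\big \{H_1, H_2\big \} = \big \{h_1, h_2\big \} = 0\), since \(h_1\) and \(h_2\) depend on different pairs of conjugate variables. The differentials \(\dd H_1\), \(\dd H_2\) are linearly independent wherever \(h_1 > 0\) and \(h_2 > 0\); for \(E_1 > E_2 > 0\) the level set \(M_E = \{h_1 = E_1 - E_2\} \cap \{h_2 = E_2\}\) is a product of two ellipses, hence diffeomorphic to \(\mathbb {T}^2\), in agreement with the theorem.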

  • Proof. The independence of the functions \(H_1(q,p), \ldots , H_n(q,p)\) at every point in \(M_{E^0}\) implies their independence on nearby energy levels. The implicit function theorem then yields the smoothness of the \(n\)-dimensional submanifolds \(M_E\) for \(|E-E^0|\) small enough.

    Recall that the hamiltonian vector field \(X_H\) is tangent to the levelset \(F(x)=C\) of any first integral \(F\), i.e., of any function \(F\) such that \(\big \{H,F\big \} = 0\). Therefore, the hamiltonian vector fields \(X_{H_1}, \ldots , X_{H_n}\) are tangent to \(M_E\) for all values of \(E=(E_1,\ldots , E_n)\). They are also linearly independent, since the differentials \(\dd H_i\) are linearly independent at every point on \(M_E\). In summary, we have shown that for \(|E-E^0|\) small enough

    \begin{equation} \mathrm {span}(X_{H_1}, \ldots , X_{H_n}) \big |_{x\in M_E} = T_x M_E. \end{equation}

    Furthermore,

    \begin{equation} \omega (X_{H_i}, X_{H_j}) = \big \{H_i, H_j\big \} =0, \quad i,j=1,\ldots ,n, \end{equation}

    which implies that \(M_E\) is a lagrangian submanifold. This proves the first part of the first statement.

    To show integrability by quadratures, we need to find a good family of local coordinates. To this end we will rely on our newly acquired knowledge of canonical transformations.

    The independence of the functions implies that the \(n\times 2n\)-matrix \(\left ( \frac {\partial H_i}{\partial q^j}, \frac {\partial H_i}{\partial p_j} \right )_{M_{E^0}}\) has rank \(n\). Without loss of generality, we assume that for a point \(x_0 \in M_{E^0}\), \(x_0 =(q_0,p_0)\), the Jacobian with respect to \(p\) is non–degenerate, i.e.

    \begin{equation} \label {eq:nondegHip} \det \left ( \frac {\partial H_i}{\partial p_j} \right )_{x_0} \neq 0. \end{equation}

    The equations

    \begin{align} & H_i(q_0, p(E)) = E_i,\quad i = 1,\ldots ,n, \\ & p(E^0) = p_0, \end{align} determine, for all \(E\) close enough to \(E^0\), a point \(x_0(E) = (q_0, p(E))\in M_E\) on all the surfaces \(M_E\) which depends smoothly on \(E\). Thanks to (4.161) and the implicit function theorem, the lagrangian submanifold \(M_E\) can be represented in a neighborhood of \(x_0(E)\) in the form

    \begin{equation} p = p(q,E) = \frac {\partial S(q,E)}{\partial q}. \end{equation}

    Moreover, the variables \(q=(q^1, \ldots , q^n)\), \(E= (E_1,\ldots ,E_n)\) can be used as local coordinates in a neighborhood of \(x_0(E^0)\). A generating function \(S=S(q,E)\) could be, for example, the quadrature

    \begin{equation} S(q,E) = \int _{x_0(E)}^{(q,p(q,E))} p \dd q \end{equation}

    along a curve on the lagrangian submanifold \(M_E\) (one can show that such integral does not depend on the choice of the integration path).

    Using the generating function \(S(q,E)\) one obtains a canonical transformation

    \begin{align} & (q,p)\mapsto (\psi , E), \quad \psi _i = \frac {\partial S(q,E)}{\partial E_i}, \quad i=1,\ldots ,n, \\ & \dd p \wedge \dd q = \dd E \wedge \dd \psi . \end{align} In the canonical coordinates \((\psi , E)\), we have that \(\{\psi _i, E_j\} = \delta _{ij}\) and the hamiltonian \(H = H_1(q,p)\) transforms to \(H(\psi , E) = E_1\). Then the hamiltonian flow \(\dot x = \{x, E_1\}\), i.e.,

    \begin{align} & \dot \psi _1 = 1, \quad \dot \psi _i = 0, \quad i=2,\ldots ,n, \\ & \dot E_j = 0, \quad j = 1,\ldots , n, \end{align} is immediately integrable, which concludes the first part of the proof.

    To show the second part of the theorem we need an intermediate geometric lemma.

    • Lemma 4.13. Let \(M\) be an \(n\)-dimensional compact connected manifold. Assume that there exist \(n\) vector fields \(X_1, \ldots , X_n\) on \(M\) which are linearly independent at all points \(x\in M\) and pairwise commuting, i.e.,

      \begin{equation} [X_i, X_j] = 0, \quad i,j = 1,\ldots ,n. \end{equation}

      Then, \(M \simeq \mathbb {T}^n\) is diffeomorphic to a torus.

    • Proof. Since \(M\) is compact, the one–parameter group of diffeomorphisms

      \begin{equation} \Phi _t^i: M \to M, \quad \dv {t} \Phi _t^i(x)\Big |_{t=0} = X_i(x) \end{equation}

      generated by the vector fields is well defined for all \(t\in \mathbb {R}\). The commutativity of the vector fields implies the commutativity of the groups, that is,

      \begin{equation} \Phi ^i_s \circ \Phi ^j_t = \Phi ^j_t \circ \Phi ^i_s, \quad \forall s,t\in \mathbb {R}. \end{equation}

      Such flows allow us to define an action of the abelian group \(\mathbb {R}^n\) on the manifold \(M\) by

      \begin{align} & \Phi _{\vb *{t}} := \Phi _{t_1}^1 \circ \cdots \circ \Phi _{t_n}^n, \quad \vb *{t} = (t_1, \ldots , t_n)\in \mathbb {R}^n, \\ & \Phi _{\vb * t + \vb * s} = \Phi _{\vb * t} \circ \Phi _{\vb * s}, \quad \forall \vb * t, \vb * s \in \mathbb {R}^n. \end{align}

      • Exercise 4.6. Show that the action \(\Phi _{\vb * t}\) is transitive, that is, for every two points \(x, y\in M\), there exists \(\vb * t\in \mathbb {R}^n\) such that \(y = \Phi _{\vb * t} (x)\).

      For any \(x_0 \in M\), the map

      \begin{equation} \label {eq:diffeoRnM} \mathbb {R}^n \to M, \quad \vb * t \mapsto \Phi _{\vb * t}(x_0) \end{equation}

      is smooth, and the set

      \begin{equation} \Gamma = \left \{ \vb * t \in \mathbb {R}^n \;\mid \; \Phi _{\vb * t}(x_0) = x_0\right \} \end{equation}

      is a subgroup of \(\mathbb {R}^n\), called the isotropy subgroup.

      • Exercise 4.7. Show that the subgroup \(\Gamma \) does not depend on \(x_0\). Hint: the action \(\Phi _{\vb * t}\) is transitive. Furthermore, \(\Gamma \) is a discrete subgroup of \(\mathbb {R}^n\), that is, there is an open set \(U \ni \vb * 0\) such that \(U\cap \Gamma = \{\vb * 0\}\).

      Discrete subgroups of \(\mathbb {R}^n\) admit a very nice description, which will be handy also later (for the proof see [Kna18, Lemma 13.4]).

      • Lemma 4.14. Let \(\Gamma \) be a discrete subgroup of \(\mathbb {R}^n\). Then, there exist \(m\) linearly independent vectors \(e_1, \ldots , e_m \in \mathbb {R}^n\), \(0\leq m \leq n\), such that

        \begin{align} \Gamma & = \mathrm {span}_{\mathbb {Z}}(e_1, \ldots , e_m) \\ & = \left \{ k_1 e_1 + \cdots + k_m e_m \;\mid \; (k_1, \ldots , k_m)\in \mathbb {Z}^m \right \}. \end{align}
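
      To illustrate the statement, here is a sketch in \(\mathbb {R}^2\), with the standard basis vectors taken as an illustrative choice of generators. The trivial subgroup \(\{\vb * 0\}\) corresponds to \(m=0\), the subgroup generated by \((1,0)\) to \(m=1\), and the integer lattice \(\mathbb {Z}^2\) to \(m=2\). The corresponding quotients are a plane, a cylinder and a torus:

      \begin{equation*} \mathbb {R}^2/\{\vb * 0\} = \mathbb {R}^2, \quad \mathbb {R}^2/\mathrm {span}_{\mathbb {Z}}\big ((1,0)\big ) \simeq \mathbb {T}^1\times \mathbb {R}, \quad \mathbb {R}^2/\mathbb {Z}^2 \simeq \mathbb {T}^2. \end{equation*}

      By contrast, a subgroup such as \(\mathbb {Q}^2\subset \mathbb {R}^2\) is not discrete, and the lemma does not apply to it.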

      To conclude the proof of the lemma we just need to observe that \(M\) is diffeomorphic to the quotient

      \begin{equation} M \simeq \mathbb {R}^n / \Gamma \end{equation}

      via the diffeomorphism (4.175). Furthermore, if we extend the basis \(e_1, \ldots , e_m\) to a basis \(e_1, \ldots , e_m, e_{m+1}, \ldots ,e_n\) of \(\mathbb {R}^n\), Lemma 4.14 implies that

      \begin{equation} M \simeq \mathbb {R}^n / \Gamma \simeq \mathbb {T}^m \times \mathbb {R}^{n-m}. \end{equation}

      Since \(M\) is compact by hypothesis, \(n=m\).

      With \(M\) being diffeomorphic to a torus, its points can be parametrized by \(n\) angles \(\phi _1, \ldots , \phi _n\) in such a way that every \(\vb * t \in \mathbb {R}^n/\Gamma \) can be written in the form

      \begin{equation} \vb * t = \frac 1{2\pi }(\phi _1 e_1 + \cdots + \phi _n e_n), \end{equation}

      which gives the identification with the standard torus (4.157).

    The fact that \(M_E\simeq \mathbb {T}^n\) now follows applying Lemma 4.13 with the vector fields \(X_{H_1}, \ldots , X_{H_n}\).

    To conclude the proof, we only need to show that the hamiltonian dynamics of the angular coordinates \((\phi _1, \ldots , \phi _n)\) is linear. For each \(x_0\in M_E\), the mapping (4.175), \([0, 2\pi )^n \to M_E\),

    \begin{equation} (\phi _1, \ldots , \phi _n)\mapsto \Phi \left (\sum _{i=1}^n \frac {\phi _i e_i}{2\pi }, x_0\right ) := \Phi _{\vb * t}(x_0)\Big |_{\vb * t = \sum _{i=1}^n \frac {\phi _i e_i}{2\pi }} \end{equation}

    is a bijection. Assume that the vector \((1, 0, \ldots , 0)\in \mathbb {R}^n\) is represented as \(\sum _{i=1}^n \frac {\omega _i}{2\pi } e_i\), then the flow generated by \(H\) is of the form

    \begin{equation} \Phi ^1_{t_1}(y) = \Phi ((t_1, 0, \ldots , 0), y) = \Phi \left (t_1 \sum _{i=1}^n \frac {\omega _i}{2\pi }e_i, y\right ), \end{equation}

    which means that the coordinates \(\phi _1(t), \ldots , \phi _n(t)\) of \(\Phi _t^1(y)\) satisfy (4.158).

  • Exercise 4.8. Extend the proof to completely integrable systems on an arbitrary symplectic manifold. (Solution: find the differences between our proof and [Kna18, Theorem 13.3]).

4.4.1 Action-angle variables

Given a surface \(M_{E^0}\) on which the first integrals \(H_1, \ldots , H_n\) are independent everywhere (i.e., the differentials \(\dd H_1, \ldots , \dd H_n\) are linearly independent at all points), we would like to construct globally the canonical coordinates \((\phi _1, \ldots , \phi _n, I_1, \ldots , I_n)\) from (4.153) on some neighborhood \(U(M_{E^0})\subset \mathbb {R}^{2n}\) of \(M_{E^0}\) such that the \(\phi \) are the angular coordinates on the invariant tori \(M_E\) for \(E\) sufficiently close to \(E^0\).

Consider the symplectic manifold \(\mathbb {T}^n \times \mathbb {R}^n\) with canonical coordinates \((\phi , I)\) and the symplectic form

\begin{equation} \dd I \wedge \dd \phi = \sum _{k=1}^n \dd I_k \wedge \dd \phi _k. \end{equation}

Any ball \(B\subset \mathbb {R}^n\) centered at a point \(I^0 = (I^0_1, \ldots , I^0_n)\) gives a submanifold \(\mathbb {T}^n\times B \subset \mathbb {T}^n \times \mathbb {R}^n\).

  • Theorem 4.15. Under the hypotheses of the Liouville-Arnold theorem (Theorem 4.12), for every compact connected component \(M_{E^0}\) there are a neighborhood \(M_{E^0}\subseteq U = U(M_{E^0}) \subseteq \mathbb {R}^{2n}\) and a symplectomorphism

    \begin{equation} \mathbb {T}^n\times B \to U(M_{E^0}) \end{equation}

    such that the hamiltonians become exclusively functions of the action coordinates \(I = (I_1, \ldots , I_n)\),

    \begin{align} & H_1 = H_1(I), \ldots , H_n = H_n(I), \quad I\in B, \\ & H_k(I^0) = E^0_k, \quad k=1,\ldots ,n, \end{align} and for every \(I\in B\) the angle coordinates \(\phi =(\phi _1, \ldots , \phi _n)\) are the angular coordinates on the invariant torus \(M_E\), \(E = (H_1(I), \ldots , H_n(I))\).

  • Proof. We start by mimicking the first part of the proof of the Liouville-Arnold theorem.

    Choosing a point \(x_0 = (q_0, p_0) \in M_{E^0}\) such that the \(p\)-Jacobian is non–degenerate, we obtain a generating function \(S=S(q,E)\), with \(p = \frac {\partial S(q,E)}{\partial q}\), for the family of lagrangian submanifolds \(M_E\). Locally, around \(x_0\), we can define \(\psi = \frac {\partial S(q,E)}{\partial E}\) to obtain a system of canonical coordinates \((\psi , E)\) with \(\dd p\wedge \dd q = \dd E\wedge \dd \psi \).

    In the \(\psi \) coordinates, the hamiltonian flows \(\Phi _t^1, \ldots , \Phi _t^n\) generated by the hamiltonian vector fields \(X_{H_1}, \ldots , X_{H_n}\) are translations along the respective coordinate axes \(\psi _1, \ldots , \psi _n\). Thus, the local coordinates \(\psi _1, \ldots , \psi _n\) are related to the global angular coordinates \(\phi _1, \ldots , \phi _n\) on the same torus \(M_E\) via a non–degenerate linear transformation:

    \begin{equation} \psi _i = \sum _{j=1}^n \rho _{ij}(E)\phi _j, \quad i=1,\ldots ,n, \qquad \det (\rho _{ij})\neq 0. \end{equation}

    Since for every \(k\) the shift \(\phi _k \mapsto \phi _k + 2\pi \) is the identity on every \(M_E\) (in a neighborhood of \(M_{E^0}\)), the translation

    \begin{equation} \big (\psi _1, \ldots , \psi _n, E\big ) \mapsto \big (\psi _1 + 2\pi \rho _{1k}(E), \ldots , \psi _n + 2\pi \rho _{nk}(E), E\big ) \end{equation}

    is a canonical transformation. Then, in particular, we have

    \begin{equation} \big \{ \psi _i + 2\pi \rho _{ik}(E), \psi _j + 2\pi \rho _{jk}(E) \big \} = \big \{\psi _i, \psi _j\big \} = 0, \quad i,j = 1, \ldots , n, \end{equation}

    and, for all \(i,j,k\),

    \begin{equation} \frac {\partial \rho _{ik}(E)}{\partial E_j} = \frac {\partial \rho _{jk}(E)}{\partial E_i}. \end{equation}

    Therefore, locally around \(E^0\) there are functions \(I_1 = I_1(E), \ldots , I_n = I_n(E)\) such that

    \begin{equation} \rho _{ik}(E) = \frac {\partial I_k(E)}{\partial E_i}, \quad i,k=1,\ldots ,n, \end{equation}

    and, thanks to the non–degeneracy of the matrix \(\rho _{ij}(E)\), the change of coordinates \(E \mapsto I(E)\) is a local diffeomorphism.

    We need to show that the functions \((\phi _1, \ldots , \phi _n, I_1, \ldots , I_n)\) that we just defined are canonical coordinates. Using the fact that \((\psi , E)\) are canonical, we have

    \begin{equation} \big \{\psi _i, I_k\big \} = \frac {\partial I_k}{\partial E_i} = \rho _{ik}(E). \end{equation}

    On the other hand, using the substitution \(\psi _i = \sum _{j=1}^n \rho _{ij}(E)\phi _j\), one gets

    \begin{equation} \big \{\psi _i, I_k\big \} = \sum _{j=1}^n \rho _{ij}(E) \big \{\phi _j, I_k\big \}. \end{equation}

    A direct comparison of the last two equations yields the desired result, i.e.

    \begin{equation} \big \{\phi _j, I_k\big \} = \delta _{jk}, \quad j,k=1,\ldots ,n. \end{equation}

For an alternative proof, you can refer to [Kna18, Chapter 13.3].

As anticipated in the previous section, a completely integrable hamiltonian system with Hamiltonian \(H = H(q,p)\) in action–angle coordinates takes an extremely simple form:

\begin{equation} \dot I_k = 0, \quad \dot \phi _k = \omega _k(I), \quad \omega _k(I) = \frac {\partial H(I)}{\partial I_k}, \quad k=1,\ldots ,n, \end{equation}

with \(H(I)\) being only a function of the canonical actions. Such equations are immediately integrated, yielding

\begin{equation} I_k(t)= I_k(0), \quad \phi _k(t) = \phi _k(0) + \omega _k(I)t, \quad k = 1,\ldots , n. \end{equation}

The canonical actions \(I_1, \ldots , I_n\) play a crucial role here. Had we kept the first integrals \(H_1, \ldots , H_n\) as coordinates, we would have faced two difficulties: first, the flow generated by each \(H_k\) affects all the angle variables and not just \(\phi _k\); second, the coordinates \((\phi , E)\) would not, in general, be canonical.

Furthermore, due to their simplicity, the action–angle coordinates are the ideal starting point for the development of perturbation theory.

  • Remark 4.8. Consider the (inverse) canonical transformation

    \begin{equation} q_k = q_k(\phi , I), \quad p_k = p_k(\phi , I), \quad k=1,\ldots ,n. \end{equation}

    The smooth functions \(q\) and \(p\) are \(2\pi \)-periodic in each of the angular variables \(\phi _1, \ldots , \phi _n\), and therefore they can be expanded in Fourier series:

    \begin{align} & x_j = \sum _{\vb * k = (k_1, \ldots , k_n)\in \mathbb {Z}^n} A^j_{\vb * k}(I) e^{i(k_1 \phi _1 + \cdots + k_n \phi _n)} = \sum _{\vb * k\in \mathbb {Z}^n} A^j_{\vb * k}(I) e^{i\langle \vb * k, \phi \rangle } \\ & x = (q,p), \quad j=1,\ldots ,2n. \end{align} Then, the dynamics of the hamiltonian system in the original coordinates is of the form

    \begin{align} & x_j(t) = \sum _{\vb * k\in \mathbb {Z}^n} A^j_{\vb * k}(I) e^{i t\langle \vb * k, \omega (I)\rangle + i\langle \vb * k,\phi ^0\rangle } \\ & x = (q,p), \quad j=1,\ldots ,2n, \end{align} where \(\omega (I) = (\omega _1(I), \ldots , \omega _n(I))\) and the parameters \(I = (I_1, \ldots , I_n)\) and \(\phi ^0 = (\phi _1^0, \ldots , \phi _n^0)\) can be considered as constants of integration.

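    The solutions \(x(t) = (q(t), p(t))\) are periodic in \(t\) (and the period does not depend on the parameter \(\phi ^0\)) if the frequencies are commensurable, i.e., if \(\omega (I) = \omega _0\, (m_1, \ldots , m_n)\) for some \(\omega _0\in \mathbb {R}\) and \((m_1, \ldots , m_n)\in \mathbb {Z}^n\). For \(n=2\) this is the same as rational dependence, i.e., the existence of a non–zero \((k_1, \ldots , k_n)\in \mathbb {Z}^n\) such that \(k_1\omega _1(I) + \ldots + k_n \omega _n(I) = 0\); for \(n>2\) a single such relation only confines the trajectory to a lower–dimensional subtorus. If, instead, the frequencies are rationally independent, or non–resonant, then the trajectories are dense on the invariant torus \(M_{E(I)}\).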
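
The periodicity criterion can be made concrete with a minimal Python sketch. It assumes that the frequencies are given as exact positive rational multiples of a base frequency `omega0`; the function names `common_period`, `angles_at` and `torus_distance`, as well as the numerical values, are hypothetical and only serve the illustration.

```python
from fractions import Fraction
from functools import reduce
from math import gcd, pi


def common_period(ratios, omega0=1.0):
    """Smallest T > 0 such that omega_k * T is a multiple of 2*pi for every k.

    `ratios` holds the exact positive rationals omega_k / omega0.
    """
    fracs = [Fraction(r) for r in ratios]
    if any(f <= 0 for f in fracs):
        raise ValueError("this sketch assumes positive frequencies")
    # For fractions in lowest terms, T = (2*pi/omega0) * lcm(denominators) / gcd(numerators).
    lcm_den = reduce(lambda a, b: a * b // gcd(a, b), (f.denominator for f in fracs))
    gcd_num = reduce(gcd, (f.numerator for f in fracs))
    return 2 * pi / omega0 * lcm_den / gcd_num


def angles_at(t, phi0, omegas):
    """Linear flow on the torus: phi_k(t) = phi_k(0) + omega_k * t (mod 2*pi)."""
    return [(p + w * t) % (2 * pi) for p, w in zip(phi0, omegas)]


def torus_distance(a, b):
    """Largest angular distance between the components of two points on the torus."""
    return max(min(abs(x - y), 2 * pi - abs(x - y)) for x, y in zip(a, b))


ratios = [Fraction(1, 2), Fraction(3, 4), Fraction(5)]  # omega = omega0 * (1/2, 3/4, 5)
T = common_period(ratios)
phi0 = [0.3, 1.1, 2.5]                                  # arbitrary initial angles
omegas = [float(r) for r in ratios]
print(T / (2 * pi), torus_distance(angles_at(T, phi0, omegas), phi0))
```

The second printed number is the distance between the initial angles and the angles at time \(T\): it vanishes up to rounding whatever the choice of `phi0`, in agreement with the fact that the period does not depend on \(\phi ^0\).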

What we have seen so far may seem very abstract. However, there is a practical and direct way to compute the action–angle variables, which also helps clarify the first half of their name.

Since \(\dd {p} \wedge \dd {q} = \dd {I} \wedge \dd {\phi }\), the difference \(p\dd q - I\dd \phi \) is a closed one–form on the neighborhood \(U(M_{E^0})\). Thus, locally,

\begin{equation} p\dd {q} - I\dd {\phi } = \dd {S} \end{equation}

for some generating function \(S = S(q,\phi )\). Consider the fundamental cycles \(\gamma _1, \ldots , \gamma _n\) on \(\mathbb {T}^n\),

\begin{equation} \gamma _k = \big \{ (\phi _1, \ldots , \phi _n) \in \mathbb {T}^n \;\mid \; \phi _j = \mathrm {const}, \; j\neq k,\; \phi _k \in [0,2\pi ] \big \}, \end{equation}

where \(k=1,\ldots ,n\). Integrating this identity along the cycle \(\gamma _k\), on which \(I\) is constant and only \(\phi _k\) varies, one obtains

\begin{equation} I_k = \frac 1{2\pi } \oint _{\gamma _k} p \dd {q}, \quad k=1,\ldots ,n. \end{equation}

This formula, which resembles Maupertuis' action, can be used to compute the canonical actions. To compute the conjugate angles, we use the generating function \(\widetilde S = \widetilde S(q,I)\) of the canonical transformation \((q,p) \mapsto (\phi , I)\), that is, the function such that

\begin{equation} \dd {\widetilde {S}} = p\dd {q} + \phi \dd {I}. \end{equation}

Since \(I = I(E)\) is constant on the torus \(M_E\), the restriction of the closed one–form \(\dd {\widetilde S}\) to \(M_E\) can be written as \(\dd {\widetilde S}\big |_{M_E} = p\dd {q}\) and, therefore,

\begin{equation} \widetilde S(q,I) = \int _{x_0(q,E)}^{(q, p(q,E))} p \dd {q}, \quad E = E(I), \end{equation}

where the integral along a path on \(M_E\) locally does not depend on the choice of the path itself (globally this is generally false; a clarification would require a discussion of holonomy). Thus, the canonical angles can be determined as

\begin{equation} \phi _k = \frac {\partial }{\partial I_k} \int _{x_0(q,E)}^{(q, p(q,E))} p \dd q, \quad k=1,\ldots ,n. \end{equation}
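
For one degree of freedom, the two formulas above can be evaluated numerically. The following Python sketch is only an illustration under stated assumptions: the potential is even and increasing for \(q>0\), the base point of the generating function is \(q=0\) on the upper branch \(p>0\), and the derivative with respect to \(I\) is computed as \((\partial \widetilde S/\partial E)/(\dd I/\dd E)\) by central differences. The helper names and the parameter values are hypothetical.

```python
import math


def turning_point(V, E, guess=1.0):
    """Positive root of V(q) = E by bisection (V even, increasing for q > 0)."""
    lo, hi = 0.0, guess
    while V(hi) < E:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if V(mid) < E:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def int_p_dq(V, m, E, a, b, n=2000):
    """Midpoint rule for the integral of p(q, E) dq over [a, b] on the branch p > 0.

    The substitution q = c - h*cos(s) smooths the square-root behaviour at turning points.
    """
    c, h = 0.5 * (a + b), 0.5 * (b - a)
    total = 0.0
    for i in range(n):
        s = math.pi * (i + 0.5) / n
        q = c - h * math.cos(s)
        p = math.sqrt(max(0.0, 2.0 * m * (E - V(q))))
        total += p * h * math.sin(s)
    return total * math.pi / n


def action(V, m, E):
    """I(E) = (1/2pi) * closed integral of p dq = (1/pi) * integral between turning points."""
    q_t = turning_point(V, E)
    return int_p_dq(V, m, E, -q_t, q_t) / math.pi


def angle(V, m, E, q, dE=1e-4):
    """phi = dS/dI at fixed q, with S(q, E) the integral of p dq from 0 to q."""
    dS_dE = (int_p_dq(V, m, E + dE, 0.0, q) - int_p_dq(V, m, E - dE, 0.0, q)) / (2 * dE)
    dI_dE = (action(V, m, E + dE) - action(V, m, E - dE)) / (2 * dE)
    return dS_dE / dI_dE


m, k = 1.0, 4.0                      # hypothetical values, omega = sqrt(k/m) = 2
V = lambda q: 0.5 * k * q * q        # harmonic potential used as a test case
E = 1.0
print(action(V, m, E))               # compare with E / omega
print(angle(V, m, E, 0.5))           # compare with asin(0.5 / sqrt(2 * E / k))
```

With the base point at \(q=0\), the angle measures the phase elapsed since the passage through the origin; a different base point shifts \(\phi \) by a constant on each torus.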

  • Example 4.9 (Harmonic oscillator). Consider, once again, the hamiltonian of a harmonic oscillator with mass \(m\) and stiffness \(k>0\):

    \begin{equation} H(q,p) = \frac {p^2}{2m} + \frac {k q^2}{2}. \end{equation}

    We saw earlier that for \(E>0\) the curve \(H(q,p) = E\) is an ellipse and the motion on the ellipse is periodic with period

    \begin{equation} T = \frac {2\pi }{\omega } = 2\pi \sqrt {\frac {m}{k}}. \end{equation}

    The canonical action variable

    \begin{equation} I = \frac 1{2\pi }\oint _{H(q,p) = E} p \dd q \end{equation}

    is proportional to the area of the ellipse:

    \begin{equation} I = \frac 1{2\pi } \mathrm {area}\left (\frac {p^2}{2m} + \frac {k q^2}{2} = E\right ) = E \sqrt {\frac {m}{k}} = \frac E \omega . \end{equation}

    The variable conjugate to \(I\) is generally non–trivial to obtain. In this case, it is obtained from the canonical transformation

    \begin{equation} q(I, \phi ) = \rho ^{-1} \sqrt {2I} \sin \phi , \quad p(I, \phi ) = \rho \sqrt {2I} \cos \phi , \quad \rho = \sqrt {m \omega }. \end{equation}

    The change of coordinates \((q,p) \mapsto (I, \phi )\) maps the phase space flow onto a cylinder parametrized by \((I, \phi )\) so that the flows are now all straight lines in the new coordinates.

    If you recall Example 4.2 you can now see what we were doing. The construction in Example 4.7 is also not dissimilar: if you observe that \(\dot W_i = \frac {\partial W_i}{\partial q^i} \dot q^i = p_i \dot q^i\), then you get \(W_i = \oint p_i\dd q^i\) which hints at the role of the \(W_i\) of the separated generating function in the context of the action–angle variables.
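
A transformation of this type can be checked numerically. The Python sketch below assumes hypothetical values \(m = 1\), \(\omega = 2\) and the variant with \(q \propto \sin \phi \) and \(p \propto \cos \phi \); it evaluates the Poisson bracket \(\{q, p\}\) with respect to \((\phi , I)\) by central differences and compares \(H(q,p)\) with \(\omega I\). Exchanging sine and cosine reverses the sign of the bracket, so the check is sensitive to that choice.

```python
import math

MASS, OMEGA = 1.0, 2.0                    # hypothetical mass and frequency
RHO = math.sqrt(MASS * OMEGA)


def to_qp(phi, I):
    """(phi, I) -> (q, p), variant with q ~ sin(phi) and p ~ cos(phi)."""
    return math.sqrt(2 * I) * math.sin(phi) / RHO, RHO * math.sqrt(2 * I) * math.cos(phi)


def hamiltonian(q, p):
    """Harmonic oscillator with stiffness k = m * omega^2."""
    return p * p / (2 * MASS) + 0.5 * MASS * OMEGA**2 * q * q


def bracket_q_p(phi, I, h=1e-5):
    """{q, p} = dq/dphi * dp/dI - dq/dI * dp/dphi, by central differences."""
    q_pf, p_pf = to_qp(phi + h, I)
    q_pb, p_pb = to_qp(phi - h, I)
    q_if, p_if = to_qp(phi, I + h)
    q_ib, p_ib = to_qp(phi, I - h)
    dq_dphi, dp_dphi = (q_pf - q_pb) / (2 * h), (p_pf - p_pb) / (2 * h)
    dq_dI, dp_dI = (q_if - q_ib) / (2 * h), (p_if - p_ib) / (2 * h)
    return dq_dphi * dp_dI - dq_dI * dp_dphi


phi, I = 0.7, 1.3                         # arbitrary test point
q, p = to_qp(phi, I)
print(bracket_q_p(phi, I))                # close to 1 for a canonical map
print(hamiltonian(q, p), OMEGA * I)       # the two values agree up to rounding
```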

  • Example 4.10 (Kepler problem – elliptic motion). Let’s consider once again the Kepler problem for negative energies and, therefore, elliptic motions. For conciseness, we assume that the system is already reduced to the planar system. In this case, the invariant tori are two–dimensional:

    \begin{align} & p_\phi = m r^2 \dot \phi = M \\ & \frac {p_r^2}{2 m} + \frac {p_\phi ^2}{2 m r^2} - \frac {k}{r} = E <0, \end{align} where \((r,\phi )\) are polar coordinates on the orbit plane and the constants \((M,E)\), the angular momentum and the energy, parametrize the torus. The canonical actions, then, are

    \begin{align} I_\phi & = \frac {1}{2\pi } \int _0^{2\pi } p_\phi \dd \phi = M \\ I_r & = \frac {2}{2\pi } \int _{r_{\mathrm {min}}}^{r_{\mathrm {max}}} p_r \dd r \\ & = \frac 1\pi \int _{r_{\mathrm {min}}}^{r_{\mathrm {max}}} \sqrt {2m(E + k/r) - M^2/r^2}\dd r \\ & = - M + k \sqrt {\frac {m}{2|E|}}, \end{align} where the latter follows from the identity

    \begin{equation} \int _{r_{m}}^{r_{M}} \sqrt {\left (1-\frac {r_m}r\right )\left (\frac {r_M}r-1\right )}\dd r = \frac {\pi }2(r_m + r_M) - \pi \sqrt {r_m r_M}. \end{equation}

    Then, the hamiltonian in the action angle variables takes the form

    \begin{equation} H = - \frac {mk^2}{2(I_r + I_\phi )^2}. \end{equation}

    Observe that \(H\) depends on the actions only through their sum \(I_r + I_\phi \). This means that the associated frequencies,

    \begin{equation} \omega _\phi = \frac {\partial H}{\partial I_\phi }, \quad \omega _r = \frac {\partial H}{\partial I_r}, \end{equation}

    coincide for all values of \((M,E)\): they are degenerate. In particular, the trajectories are always periodic.

    This has major consequences in the quantum mechanics of the hydrogen atom. In the old quantum theory, the actions are quantized as integer multiples of the reduced Planck constant \(\hbar \): from the formula above, the energy depends only on the sum of the quantum numbers. Thus, above the ground state, energy levels are degenerate, which is why the energy spectrum of the hydrogen atom has the simple form so well explained by the Bohr model.

    Recall the parametrization of the ellipse from (2.129), where the eccentricity, the focal parameter and the semiaxes of the ellipse were given respectively by

    \begin{equation} e = \sqrt {1+ \frac {2EM^2}{mk^2}}, \quad p = \frac {M^2}{mk}, \quad a = \frac {p}{1-e^2}, \quad b = \frac {p}{\sqrt {1-e^2}}. \end{equation}

    We can derive the following identities,

    \begin{equation} |E| = \frac {k}{2a}, \quad e^2 = 1 - \left (\frac {I_\phi }{ I_\phi + I_r}\right )^2, \end{equation}

    which imply

    \begin{equation} \frac {b}{a} = \sqrt {1-e^2} = \frac {I_\phi }{I_\phi + I_r} = \frac {|m|}{n} \end{equation}

    where the last equality is just a rewriting using the notation for the hydrogen quantum numbers: \(|m| \leq n\) is the orbital quantum number, \(n\) is the principal quantum number and the energy is \(E_n \sim n^{-2}\).

    Compare this example with Example 4.8. It is instructive to derive the action–angle coordinates for the general system in \(\mathbb {R}^3\) using this procedure or the separation of variables. The energy will have a similar form with the denominator including \(I_\theta \) in the sum and thus three degenerate frequencies. Reasoning on the frequencies and the actions, you can identify 5 constants of motion, which can be associated to the angular momentum vector, the energy and the Laplace-Runge-Lenz vector. With some algebra, these constants can be associated to the five parameters of the planetary orbit: inclination angle, longitude of the ascending node, semi–major axis, eccentricity and the angle of orientation in the orbital plane. You can read much more about this and how it is useful in perturbation theory, like the spin–orbit problem or the various restricted three-body problems, in [AKN06] or [Arn10].
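
The closed form of the radial action and the resulting expression for the energy can be cross-checked numerically. The Python sketch below is a minimal illustration with hypothetical parameter values; it evaluates the radial integral with a midpoint rule after the substitution \(r = c - h\cos s\), which removes the square-root behaviour at the turning points.

```python
import math


def radial_action(m, k, M, E, n=2000):
    """I_r = (1/pi) * int_{r_min}^{r_max} sqrt(2m(E + k/r) - M^2/r^2) dr, for E < 0."""
    a = k / (2.0 * abs(E))                                # semi-major axis
    e = math.sqrt(1.0 + 2.0 * E * M * M / (m * k * k))    # eccentricity
    r_min, r_max = a * (1.0 - e), a * (1.0 + e)           # turning points
    c, h = 0.5 * (r_min + r_max), 0.5 * (r_max - r_min)
    total = 0.0
    for i in range(n):
        s = math.pi * (i + 0.5) / n                       # midpoint rule in s
        r = c - h * math.cos(s)
        p_r_sq = 2.0 * m * (E + k / r) - M * M / (r * r)
        total += math.sqrt(max(0.0, p_r_sq)) * h * math.sin(s)
    return total / n                                      # (1/pi) * (pi/n) * sum


def energy_from_actions(m, k, I_r, I_phi):
    """H = -m k^2 / (2 (I_r + I_phi)^2)."""
    return -m * k**2 / (2.0 * (I_r + I_phi) ** 2)


m, k, M, E = 1.0, 1.0, 0.8, -0.3                          # hypothetical values, 0 < e < 1
I_r = radial_action(m, k, M, E)
print(I_r, -M + k * math.sqrt(m / (2.0 * abs(E))))        # numerical vs closed form
print(energy_from_actions(m, k, I_r, M), E)               # recovers the energy
print(M / (M + I_r), math.sqrt(-2.0 * E * M * M / (m * k * k)))  # b/a in two ways
```

The last line compares \(I_\phi /(I_\phi + I_r)\) with \(\sqrt {1-e^2}\) computed directly from \((M,E)\).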

  • Exercise 4.9 (Anharmonic oscillator). Consider the following hamiltonian system with one degree of freedom for a point particle of mass \(m\):

    \begin{equation} H(q,p) = \frac {p^2}{2m} + V_0 \tan ^2\left (\frac {q}{\alpha }\right ), \end{equation}

    where \(V_0>0\) and \(\alpha >0\) are given constants.

    Compute the action variable \(I\) and show that the energy \(E = E(I)\) is given by

    \begin{equation} E(I) = \frac {I (I + 2\alpha \sqrt {2m V_0})}{2m\alpha ^2}. \end{equation}

    Verify that the period of the motion is

    \begin{equation} T(I) = \frac {2 \pi m \alpha ^2}{I + \alpha \sqrt {2mV_0}} \end{equation}

    and compute the angular variable.
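
Before attempting the analytic computation, it may help to have a numerical reference. The Python sketch below uses hypothetical parameter values; it computes the action by quadrature, evaluates the stated closed form \(E(I)\) on the result, and estimates the period as \(T = 2\pi \, \dd I/\dd E\) with a central difference. It is a sanity check, not a solution of the exercise.

```python
import math


def action_tan2(m, V0, alpha, E, n=2000):
    """I(E) = (1/pi) * int_{-q_t}^{q_t} sqrt(2m(E - V0 tan^2(q/alpha))) dq."""
    q_t = alpha * math.atan(math.sqrt(E / V0))     # turning point
    total = 0.0
    for i in range(n):
        s = math.pi * (i + 0.5) / n                # midpoint rule, q = -q_t cos(s)
        q = -q_t * math.cos(s)
        p_sq = 2.0 * m * (E - V0 * math.tan(q / alpha) ** 2)
        total += math.sqrt(max(0.0, p_sq)) * q_t * math.sin(s)
    return total / n                               # (1/pi) * (pi/n) * sum


def energy_of_action(m, V0, alpha, I):
    """Closed form E(I) stated in the exercise."""
    return I * (I + 2.0 * alpha * math.sqrt(2.0 * m * V0)) / (2.0 * m * alpha**2)


m, V0, alpha, E = 1.0, 1.0, 1.0, 3.0               # hypothetical values
I = action_tan2(m, V0, alpha, E)
dE = 1e-4
dI_dE = (action_tan2(m, V0, alpha, E + dE) - action_tan2(m, V0, alpha, E - dE)) / (2.0 * dE)
period = 2.0 * math.pi * dI_dE                     # T = 2 pi / omega, omega = dE/dI
print(I, energy_of_action(m, V0, alpha, I), period)
```

The second printed number should reproduce the chosen energy, and the third can be compared with the expression for \(T(I)\).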