Hamiltonian Mechanics

\(\newcommand{\footnotename}{footnote}\) \(\def \LWRfootnote {1}\) \(\newcommand {\footnote }[2][\LWRfootnote ]{{}^{\mathrm {#1}}}\) \(\newcommand {\footnotemark }[1][\LWRfootnote ]{{}^{\mathrm {#1}}}\) \(\let \LWRorighspace \hspace \) \(\renewcommand {\hspace }{\ifstar \LWRorighspace \LWRorighspace }\) \(\newcommand {\mathnormal }[1]{{#1}}\) \(\newcommand \ensuremath [1]{#1}\) \(\newcommand {\LWRframebox }[2][]{\fbox {#2}} \newcommand {\framebox }[1][]{\LWRframebox } \) \(\newcommand {\setlength }[2]{}\) \(\newcommand {\addtolength }[2]{}\) \(\newcommand {\setcounter }[2]{}\) \(\newcommand {\addtocounter }[2]{}\) \(\newcommand {\arabic }[1]{}\) \(\newcommand {\number }[1]{}\) \(\newcommand {\noalign }[1]{\text {#1}\notag \\}\) \(\newcommand {\cline }[1]{}\) \(\newcommand {\directlua }[1]{\text {(directlua)}}\) \(\newcommand {\luatexdirectlua }[1]{\text {(directlua)}}\) \(\newcommand {\protect }{}\) \(\def \LWRabsorbnumber #1 {}\) \(\def \LWRabsorbquotenumber "#1 {}\) \(\newcommand {\LWRabsorboption }[1][]{}\) \(\newcommand {\LWRabsorbtwooptions }[1][]{\LWRabsorboption }\) \(\def \mathchar {\ifnextchar "\LWRabsorbquotenumber \LWRabsorbnumber }\) \(\def \mathcode #1={\mathchar }\) \(\let \delcode \mathcode \) \(\let \delimiter \mathchar \) \(\def \oe {\unicode {x0153}}\) \(\def \OE {\unicode {x0152}}\) \(\def \ae {\unicode {x00E6}}\) \(\def \AE {\unicode {x00C6}}\) \(\def \aa {\unicode {x00E5}}\) \(\def \AA {\unicode {x00C5}}\) \(\def \o {\unicode {x00F8}}\) \(\def \O {\unicode {x00D8}}\) \(\def \l {\unicode {x0142}}\) \(\def \L {\unicode {x0141}}\) \(\def \ss {\unicode {x00DF}}\) \(\def \SS {\unicode {x1E9E}}\) \(\def \dag {\unicode {x2020}}\) \(\def \ddag {\unicode {x2021}}\) \(\def \P {\unicode {x00B6}}\) \(\def \copyright {\unicode {x00A9}}\) \(\def \pounds {\unicode {x00A3}}\) \(\let \LWRref \ref \) \(\renewcommand {\ref }{\ifstar \LWRref \LWRref }\) \( \newcommand {\multicolumn }[3]{#3}\) \(\require {textcomp}\) \(\newcommand {\toprule }[1][]{\hline }\) \(\let 
\midrule \toprule \) \(\let \bottomrule \toprule \) \(\def \LWRbooktabscmidruleparen (#1)#2{}\) \(\newcommand {\LWRbooktabscmidrulenoparen }[1]{}\) \(\newcommand {\cmidrule }[1][]{\ifnextchar (\LWRbooktabscmidruleparen \LWRbooktabscmidrulenoparen }\) \(\newcommand {\morecmidrules }{}\) \(\newcommand {\specialrule }[3]{\hline }\) \(\newcommand {\addlinespace }[1][]{}\) \(\newcommand {\intertext }[1]{\text {#1}\notag \\}\) \(\let \Hat \hat \) \(\let \Check \check \) \(\let \Tilde \tilde \) \(\let \Acute \acute \) \(\let \Grave \grave \) \(\let \Dot \dot \) \(\let \Ddot \ddot \) \(\let \Breve \breve \) \(\let \Bar \bar \) \(\let \Vec \vec \) \(\require {physics}\) \(\newcommand {\nicefrac }[3][]{\mathinner {{}^{#2}\!/\!_{#3}}}\) \(\newcommand {\unit }[2][]{#1 \mathinner {#2}}\) \(\newcommand {\unitfrac }[3][]{#1 \mathinner {{}^{#2}\!/\!_{#3}}}\) \(\newcommand {\tcbset }[1]{}\) \(\newcommand {\tcbsetforeverylayer }[1]{}\) \(\newcommand {\tcbox }[2][]{\boxed {\text {#2}}}\) \(\newcommand {\tcboxfit }[2][]{\boxed {#2}}\) \(\newcommand {\tcblower }{}\) \(\newcommand {\tcbline }{}\) \(\newcommand {\tcbtitle }{}\) \(\newcommand {\tcbsubtitle [2][]{\mathrm {#2}}}\) \(\newcommand {\tcboxmath }[2][]{\boxed {#2}}\) \(\newcommand {\tcbhighmath }[2][]{\boxed {#2}}\) \( \DeclareMathOperator {\D }{D} \DeclareMathOperator {\Id }{Id} \DeclareMathOperator {\diag }{diag} \DeclareMathOperator {\mod }{mod} \DeclareMathOperator {\Vol }{Vol} \)

5 Hamiltonian perturbation theory

In this chapter we will briefly review perturbation theory for hamiltonian systems. We will start by revisiting the small oscillations approach, and then give a brief glimpse of the powerful tools available when perturbing integrable systems. For a nice and compact overview of perturbation theory you can refer to [Cel09], also available from arXiv.

For brevity, I decided to omit a discussion of parametric resonances and adiabatic invariants. For the former, refer to [Arn89, Chapter 25] and [Kna18, Chapter 5.4]; for the latter, to [Kna18, Chapter 15.1].

5.1 Small oscillations revisited

For convenience, let’s assume \(M=\mathbb {R}^n\). Given a hamiltonian \(H=H(q,p)\), \((q,p)\in \mathbb {R}^{2n}\), we say that \((q_0, p_0)\) is an equilibrium point if

\begin{equation} \frac {\partial H}{\partial q}(q_0, p_0) = \frac {\partial H}{\partial p}(q_0, p_0) = 0. \end{equation}

In this case, \((q(t), p(t)) = (q_0, p_0)\) is a solution of the equations of motion of the hamiltonian \(H\).

Without loss of generality, let’s assume \(H\) has an equilibrium at the origin \((q_0, p_0) = (0,0)\). As in the Lagrangian case, to study the stability of the system we linearize it, i.e., we represent the hamiltonian in the form

\begin{equation} \label {eq:hamidevl} H(q,p) = H_0(q_0,p_0) + H_2(q,p) + H_3(q,p) + \cdots \end{equation}

where each \(H_k(q,p)\) is a polynomial in \((q,p)\) homogeneous of degree \(k\). There is no linear term \(H_1\) because the first derivatives of \(H\) vanish at the equilibrium. Here, for convenience, we are implicitly assuming \(H\) analytic at \((q_0,p_0)\). Furthermore, we will set the constant term \(H_0(q_0,p_0) = 0\): it does not contribute to the dynamics in any case.

The hamiltonian system (3.31) with hamiltonian \(H_2\),

\begin{equation} \label {eq:lienarizedH} \left \lbrace \begin{aligned} \dot q_i &= \frac {\partial H_2}{\partial p_i} \\ \dot p_i &= -\frac {\partial H_2}{\partial q_i} \end {aligned} \right .,\quad i=1,\ldots ,n, \end{equation}

is the linearization of the hamiltonian system with hamiltonian \(H\) at the equilibrium \((q_0,p_0) = (0,0)\). Since it is linear in both \(q\) and \(p\), we can represent it as

\begin{equation} \label {eq:hamsoivp} \dot x = B x,\quad x =(q,p), \end{equation}

where \(B\) is a constant \(2n\times 2n\) matrix.

The solution of the initial value problem (5.4) with initial condition \(x(0) = x_0\) is then given by

\begin{equation} x(t) = e^{Bt} x_0, \qquad e^{Bt} = \sum _{k=0}^\infty \frac {B^k t^k}{k!}, \end{equation}

and its dynamic behavior is fully characterized by the eigenvalues of the matrix \(B\).
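The series for \(e^{Bt}\) can be summed numerically. The following Python sketch is only an illustration under stated assumptions: the helper names `mat_mul` and `mat_exp`, the truncation at 40 terms, and the sample matrix \(B\) (the one degree of freedom system \(\dot q = \omega p\), \(\dot p = -\omega q\)) are choices made here and do not come from the notes.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(b, t, terms=40):
    """Truncated series sum_{k < terms} B^k t^k / k! (adequate for moderate |B t|)."""
    n = len(b)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term B^k t^k / k!
    for k in range(1, terms):
        term = [[x * t / k for x in row] for row in mat_mul(term, b)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# Illustrative choice: B for dq/dt = omega * p, dp/dt = -omega * q.
omega, t = 2.0, 0.7
flow = mat_exp([[0.0, omega], [-omega, 0.0]], t)
# For this B the flow is a rotation by the angle omega * t in the (q, p) plane.
```

For this particular \(B\) the eigenvalues are \(\pm i\omega\), and the computed matrix agrees with the rotation \(\begin{pmatrix} \cos \omega t & \sin \omega t \\ -\sin \omega t & \cos \omega t \end{pmatrix}\) up to rounding.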

We can always find a symmetric matrix \(A\) to rewrite the quadratic polynomial \(H_2 = H_2(x)\) in the form

\begin{equation} H_2(x) = \frac 12 \langle Ax, x\rangle . \end{equation}

Then, it follows from (3.90) that

\begin{equation} B = J A, \quad J = \begin{pmatrix} 0 & \Id _n \\-\Id _n&0 \end {pmatrix}. \end{equation}

Lemma 5.1. Let \(A\) be a symmetric matrix and \(J\) be the symplectic matrix. The characteristic polynomial of the matrix \(B=JA\) has the form
\(\seteqnumber{0}{5.}{7}\)
\begin{equation} \det (B-\lambda I) = \det (A-\lambda J) = P_n(\lambda ^2) \end{equation}

where \(P_n\) is a polynomial of degree \(n\).

Proof. We use \(J^2 = -\Id \) and \(\det J= 1\) to write
\(\seteqnumber{0}{5.}{8}\)
\begin{equation} \det (B-\lambda I) = \det (JA + \lambda J^2) = \det (J) \det (A+\lambda J) = \det (A+\lambda J). \end{equation}

Using the invariance of the determinant under transposition, together with \(A^T = A\) and \(J^T = -J\), we have
\(\seteqnumber{0}{5.}{9}\)
\begin{equation} \det (A-\lambda J) = \det (A^T -\lambda J^T) = \det (A + \lambda J), \end{equation}

which implies that the characteristic polynomial of the matrix \(B\) has to be an even function, i.e., a polynomial in \(\lambda ^2\). As its degree must be \(2n\), we can always rewrite it as a polynomial of degree \(n\) in \(\lambda ^2\). □
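As a quick sanity check of the lemma, here is the case \(n=1\) worked out by hand; the entries \(a,b,c\) of \(A\) are illustrative symbols introduced only for this check. With
\[ A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}, \qquad B = JA = \begin{pmatrix} b & c \\ -a & -b \end{pmatrix}, \]
one finds
\[ \det (B-\lambda I) = (b-\lambda )(-b-\lambda ) + ac = \lambda ^2 + (ac - b^2), \]
so that \(P_1(z) = z + \det A\). The eigenvalues \(\lambda = \pm \sqrt {-\det A}\) are purely imaginary exactly when \(\det A > 0\).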

In the generic case of \(n\) distinct roots of the polynomial \(P_n\), we have practically proved the following theorem.

Theorem 5.2. Let \(H_2(x) = \frac 12\langle Ax,x\rangle \) be a generic quadratic hamiltonian. Then the solution \((q(t), p(t)) = (0,0)\) of the linear system (5.3) is stable if and only if all the roots of the polynomial \(P_n(z)\) from Lemma 5.1 are real and negative.

More precisely, one can prove the following.

Theorem 5.3. Let \(H(x) = \frac 12\langle Ax,x \rangle \) be a quadratic hamiltonian in \(\mathbb {R}^{2n}\) such that the characteristic polynomial has the form
\(\seteqnumber{0}{5.}{10}\)
\begin{equation} \det (A-\lambda J) = \prod _{i=1}^n(\lambda ^2 + \omega _i^2), \quad \omega _1, \ldots , \omega _n \in \mathbb {R}_+, \quad \omega _i\neq \omega _j \text { for } i\neq j. \end{equation}

Then, there exists a linear canonical transformation
\(\seteqnumber{0}{5.}{11}\)
\begin{equation} \begin{pmatrix} q \\ p \end {pmatrix} \mapsto \begin{pmatrix} \widetilde q \\ \widetilde p \end {pmatrix} = Q \begin{pmatrix} q \\ p \end {pmatrix}, \quad QJQ^T = J, \end{equation}

which reduces the hamiltonian to the following normal form:
\(\seteqnumber{0}{5.}{12}\)
\begin{equation} \label {eq:HamNF} H = \frac 12 \sum _{i=1}^n \omega _i(\widetilde p_i^2 + \widetilde q_i^2). \end{equation}

Example 5.1. If we consider the lagrangian
\(\seteqnumber{0}{5.}{13}\)
\begin{equation} L(x,\dot x) = \frac 12 \sum _{i,j=1}^n g_{ij}\dot x_i \dot x_j - U(x) \end{equation}

as in Section 2.3, then we have already seen that we can reduce it to
\(\seteqnumber{0}{5.}{14}\)
\begin{equation} L_0(y,\dot y) = \frac 12 \sum _{i=1}^n(\dot y_i^2 - \lambda _i y_i^2). \end{equation}

If all the eigenvalues are positive, \(\lambda _i = \omega _i^2\), \(\omega _i > 0\), then the hamiltonian associated to the lagrangian is
\(\seteqnumber{0}{5.}{15}\)
\begin{equation} H = \frac 12 \sum _{i=1}^n(p_i^2 + \omega _i^2 q_i^2), \qquad (q_i, p_i) = (y_i, \dot y_i). \end{equation}

Then, the canonical transformation
\(\seteqnumber{0}{5.}{16}\)
\begin{equation} p_i \mapsto \sqrt {\omega _i}p_i, \quad q_i \mapsto \frac {1}{\sqrt {\omega _i}}q_i, \quad i=1,\ldots ,n \end{equation}

reduces \(H\) to the normal form (5.13).

Solutions of the hamiltonian system with hamiltonian (5.13) can be explicitly computed as

\begin{equation} \label {eq:easycomplint} \left \lbrace \begin{aligned} q_i = \rho _i \sin (\omega _it +\phi _i^0) \\ p_i = \rho _i \cos (\omega _it +\phi _i^0) \end {aligned} \right .,\quad i=1,\ldots ,n \end{equation}

where \(\rho _i\geq 0\) and \(\phi _i^0\) are constants of integration.
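The explicit solution (5.18) can be checked numerically. The following Python sketch treats one degree of freedom; the function names and the finite difference step are illustrative assumptions. It compares a centered finite difference of the solution with the right-hand side of Hamilton's equations for the normal form, \(\dot q = \omega p\), \(\dot p = -\omega q\), and evaluates the quantity \((p^2+q^2)/2\), which stays equal to \(\rho ^2/2\) along the motion.

```python
import math

def normal_mode(rho, omega, phi0, t):
    """Solution (q, p) of H = omega/2 (p^2 + q^2) with amplitude rho and phase phi0."""
    angle = omega * t + phi0
    return rho * math.sin(angle), rho * math.cos(angle)

def hamilton_residual(rho, omega, phi0, t, dt=1e-6):
    """Largest deviation from dq/dt = omega*p, dp/dt = -omega*q (centered differences)."""
    q, p = normal_mode(rho, omega, phi0, t)
    q_plus, p_plus = normal_mode(rho, omega, phi0, t + dt)
    q_minus, p_minus = normal_mode(rho, omega, phi0, t - dt)
    dq = (q_plus - q_minus) / (2 * dt)
    dp = (p_plus - p_minus) / (2 * dt)
    return max(abs(dq - omega * p), abs(dp + omega * q))

def half_radius_squared(q, p):
    """The conserved quantity (p^2 + q^2) / 2."""
    return 0.5 * (p * p + q * q)

# Sample values, chosen arbitrarily for the check.
residual = hamilton_residual(rho=1.3, omega=2.0, phi0=0.4, t=0.7)
q, p = normal_mode(1.3, 2.0, 0.4, 5.0)
conserved = half_radius_squared(q, p)   # equals rho**2 / 2 up to rounding
```

The residual is limited only by the finite difference error, so it is small but not exactly zero.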

Remark 5.1. The hamiltonian system with hamiltonian (5.13) is completely integrable. With the notation of (5.18), the invariant tori \(\mathbb {T}^n\) are defined by the equations
\(\seteqnumber{0}{5.}{18}\)
\begin{equation} p_k^2 + q_k^2 = \rho _k^2, \quad k=1,\ldots ,n, \end{equation}

the canonical actions then are
\(\seteqnumber{0}{5.}{19}\)
\begin{equation} I_k = \frac {\rho _k^2}2, \quad k=1,\ldots ,n, \end{equation}

and the angular coordinates are introduced as
\(\seteqnumber{0}{5.}{20}\)
\begin{equation} q_k = \sqrt {2I_k} \sin \phi _k, \quad p_k = \sqrt {2I_k} \cos \phi _k, \quad k=1,\ldots ,n, \end{equation}

in the same way as the usual harmonic oscillator.

Then, the hamiltonian \(H_2\) is a linear function of the actions:
\(\seteqnumber{0}{5.}{21}\)
\begin{equation} H_2 = \sum _{k=1}^n \omega _k I_k. \end{equation}

Exercise 5.1 (The linear triatomic molecule). Use the hamiltonian approach to small oscillations to compute the normal modes of the linear triatomic molecule from Section 2.3.3.

There is a beautiful duality between normal modes and creation and annihilation operators in quantum mechanics. For further information, refer to [Low12, Chapter 2.10].

In the first edition of these notes, I was going to conclude this section showing the stability of the Lagrange points for the restricted three body problem. This has been removed due to time constraints, but you can see it developed, with a similar approach, for the points \(L_4\) and \(L_5\) in [Low12, Chapter 2.10] and slightly differently but without sparing details in [AKN06; Arn10].

5.2 Birkhoff normal forms

We will consider now the effect on the solutions of the contribution from higher order terms in (5.2), under the assumptions that the quadratic hamiltonian is generic and stable and still in the case of small oscillations near an equilibrium. For convenience we can drop the constant term and write

\begin{equation} H = H_2 + H_3 + \cdots . \end{equation}

In this case, it is natural to consider \(H_2\) as the principal term, and treat the successive terms as small perturbations.

The rescaling

\begin{equation} (q, p)\mapsto (\epsilon q, \epsilon p), \quad H \mapsto \epsilon ^{-2} H, \end{equation}

where \(\epsilon > 0\) is a small parameter, transforms the hamiltonian into

\begin{equation} H = H_2 + \epsilon H_3 + \epsilon ^2 H_4 + \cdots . \end{equation}
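The powers of \(\epsilon \) come directly from homogeneity. As a short check, writing \(x=(q,p)\),
\[ H_k(\epsilon x) = \epsilon ^k H_k(x) \quad \Longrightarrow \quad \epsilon ^{-2} H(\epsilon x) = \sum _{k\geq 2} \epsilon ^{k-2} H_k(x) = H_2(x) + \epsilon H_3(x) + \epsilon ^2 H_4(x) + \cdots . \]
The rescaling also preserves the form of Hamilton's equations. If \(q = \epsilon Q\) and \(p = \epsilon P\) (the capital letters are introduced only for this check), then
\[ \dot Q = \frac {1}{\epsilon }\frac {\partial H}{\partial p} = \frac {1}{\epsilon ^2}\frac {\partial H}{\partial P} = \frac {\partial (\epsilon ^{-2}H)}{\partial P}, \qquad \dot P = -\frac {\partial (\epsilon ^{-2}H)}{\partial Q}. \]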

The behavior of the solutions of the perturbed hamiltonian system can then be studied using a family of canonical transformations suggested by Birkhoff. For a more detailed account, please refer to [Arn10, Chapter 6.5], [AKN06, Chapter 8.3] and [Kna18, Chapters 15.2 and 15.3].

We look for a canonical transformation

\begin{equation} x = (q,p) \mapsto \widetilde x = (\widetilde q, \widetilde p) = x + \sum _{i>0} \epsilon ^i \Delta _i x, \end{equation}

expressed as a formal series in \(\epsilon \), which cancels the cubic term \(H_3\), the first term of the perturbation. Since infinitesimal canonical transformations have the form

\begin{equation} x \mapsto x + \epsilon \big \{x, F\big \} + O(\epsilon ^2) \end{equation}

for some hamiltonian \(F\), we can compute the shift due to \(F\) in the transformed hamiltonian explicitly by expanding in Taylor series around \(\epsilon =0\):

\begin{equation} H \mapsto H - \epsilon \big \{H,F\big \} + O(\epsilon ^2) = H_2 + \epsilon \left (H_3 - \big \{H_2,F\big \}\right ) + O(\epsilon ^2). \end{equation}

We are left to find a cubic hamiltonian \(F\) that solves

\begin{equation} \label {eq:linearmapH_2} \big \{H_2,F\big \} = H_3. \end{equation}

The hamiltonian \(F\) must be cubic because \(H_2\) is quadratic and \(H_3\) is cubic by hypothesis: the Poisson bracket with a quadratic polynomial preserves the degree of a homogeneous polynomial. To solve the equation, we need to invert the map \(F \mapsto \big \{H_2,F\big \}\) induced by \(H_2\) on polynomials of degree \(3\). To understand how reasonable this task is, let’s first consider an artificial example.

The idea sketched in this section has been thoroughly developed in multiple flavours in the last century. Its consequences percolated in perturbative analysis in semiclassical theory and more generally in spectral theory, where, once coupled with pseudodifferential calculus, it provides an incredibly powerful tool.

Example 5.2. Consider the following hamiltonian for a system with one degree of freedom
\(\seteqnumber{0}{5.}{29}\)
\begin{equation} H = \frac \omega 2 (p^2 + q^2) + \epsilon (C_1 p^3 + C_2 p^2 q + C_3 p q^2 + C_4 q^3) + O(\epsilon ^2), \end{equation}

where \(\omega >0\). Choosing
\(\seteqnumber{0}{5.}{30}\)
\begin{equation} F = \frac {1}{\omega }\left ( \frac {C_2 + 2 C_4}{3} p^3 - C_1 p^2 q + C_4 p q^2 -\frac {C_3 + 2 C_1}{3} q^3 \right ), \end{equation}

and applying the infinitesimal canonical transformation discussed above, our hamiltonian becomes
\(\seteqnumber{0}{5.}{31}\)
\begin{equation} H = \frac {\omega }{2}(p^2 + q^2) + \epsilon ^2 H_4 + O(\epsilon ^3). \end{equation}

We can now look for a fourth order polynomial \(G\) such that the canonical transformation
\(\seteqnumber{0}{5.}{32}\)
\begin{equation} x \mapsto x + \epsilon ^2 \big \{x, G\big \} + O(\epsilon ^3) \end{equation}

cancels the fourth order terms in the new hamiltonian. You can try it out for yourselves, but you will rapidly stumble upon an insurmountable obstacle: the linear map (5.29) on the space of fourth order polynomials has a non–trivial kernel! This can be verified immediately by observing that the polynomial \((p^2+q^2)^2\) commutes with \(H_2=\frac \omega 2(p^2+q^2)\) and, therefore, belongs to the kernel.
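The difference between the cubic and the quartic case can be checked by exact linear algebra. The Python sketch below is an illustration under stated assumptions: it uses the sign convention \(\{f,g\} = \partial _q f\, \partial _p g - \partial _p f\, \partial _q g\) (the opposite convention only flips the sign of the map and does not change its rank), and the representation of polynomials as dictionaries as well as all function names are choices made here.

```python
from fractions import Fraction

def bracket_with_h2(poly, omega=Fraction(1)):
    """Return {H2, F} for H2 = omega/2 (p^2 + q^2).

    Polynomials are dicts mapping (i, j) to the coefficient of q^i p^j.
    Assumed convention: {f, g} = f_q g_p - f_p g_q, so {H2, F} = omega (q F_p - p F_q).
    """
    out = {}
    for (i, j), coeff in poly.items():
        if j > 0:  # omega * q * d/dp applied to q^i p^j
            key = (i + 1, j - 1)
            out[key] = out.get(key, 0) + omega * j * coeff
        if i > 0:  # -omega * p * d/dq applied to q^i p^j
            key = (i - 1, j + 1)
            out[key] = out.get(key, 0) - omega * i * coeff
    return {key: value for key, value in out.items() if value != 0}

def matrix_on_degree(k, omega=Fraction(1)):
    """Matrix of F -> {H2, F} on homogeneous polynomials of degree k."""
    basis = [(i, k - i) for i in range(k + 1)]
    columns = [bracket_with_h2({m: Fraction(1)}, omega) for m in basis]
    return [[col.get(row, Fraction(0)) for col in columns] for row in basis]

def rank(matrix):
    """Rank by Gauss-Jordan elimination in exact rational arithmetic."""
    rows = [row[:] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

cubic_rank = rank(matrix_on_degree(3))    # 4, the dimension of the space of cubics
quartic_rank = rank(matrix_on_degree(4))  # 4, one less than the 5 quartic monomials
```

On cubics the map has full rank, so it can be inverted. On quartics one dimension is lost, and applying `bracket_with_h2` to the coefficients of \((p^2+q^2)^2\) returns the zero polynomial, in agreement with the observation above.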

For systems with one degree of freedom, we can summarize the situation as follows.

Exercise 5.2. Let \(H\) be a hamiltonian of the form
\(\seteqnumber{0}{5.}{33}\)
\begin{equation} H(p,q; \epsilon ) = \frac {\omega }{2}(p^2 + q^2) + \sum _{k\geq 3} \epsilon ^{k-2} H_k(q,p), \end{equation}

where \(H_k(q,p)\) is a polynomial in \((q,p)\) homogeneous of degree \(k\), \(k\geq 3\) and \(\omega \neq 0\). Prove the existence of a formal series
\(\seteqnumber{0}{5.}{34}\)
\begin{equation} F(q,p; \epsilon ) = \sum _{k\geq 3} \epsilon ^{k-3} F_k(q,p), \end{equation}

where \(F_k(q,p)\) is a polynomial in \((q,p)\) homogeneous of degree \(k\), \(k\geq 3\), such that the canonical transformation
\(\seteqnumber{0}{5.}{35}\)
\begin{equation} \label {eq:ictbnf} x \mapsto \widetilde x = x + \epsilon \big \{x, F\big \} + \frac {\epsilon ^2}{2!} \big \{\big \{x, F\big \}, F\big \} + \frac {\epsilon ^3}{3!} \big \{\big \{\big \{x, F\big \}, F\big \}, F\big \} + \cdots \end{equation}

transforms the hamiltonian \(H\) into
\(\seteqnumber{0}{5.}{36}\)
\begin{equation} \label {eq:bnormalform} H = h(\widetilde p^2 + \widetilde q^2; \epsilon ),\quad h(z;\epsilon ) = \frac \omega 2 z + \sum _{k\geq 2} \epsilon ^{2k-2}C_k(\epsilon ) z^k, \end{equation}

where \((\widetilde q, \widetilde p) = \widetilde x\) and the coefficients \(C_k(\epsilon )\) are defined by formal series in \(\epsilon \).

Formula (5.37) is called the normal form of the perturbed hamiltonian with one degree of freedom.

The trajectories of the perturbed system are then the deformed circles

\begin{equation} h(\widetilde p^2 + \widetilde q^2; \epsilon ) = E, \end{equation}

over which the particle moves uniformly as

\begin{equation} \phi (t) = \widetilde \omega (E;\epsilon ) t + \phi _0, \quad \widetilde \omega (E; \epsilon ) = 2\frac {\partial }{\partial z}h(z;\epsilon ), \quad h(z;\epsilon ) = E. \end{equation}
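To make the dependence of the frequency on the energy explicit, one can expand the formula above to lowest order; this is only a sketch, obtained by differentiating the series for \(h\) term by term. Since the action is \(I = z/2\), the frequency is \(\partial h/\partial I = 2\,\partial h/\partial z\), and
\[ \widetilde \omega = \omega + 2\sum _{k\geq 2} k\,\epsilon ^{2k-2} C_k(\epsilon ) z^{k-1} = \omega + 4\epsilon ^2 C_2(\epsilon )\, z + O(\epsilon ^4). \]
At lowest order \(z = 2E/\omega + O(\epsilon ^2)\), so the perturbation shifts the frequency by an amount proportional to \(\epsilon ^2\) times the energy, unlike the unperturbed oscillator, whose frequency is the same on every orbit.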

The generalization to \(n>1\) degrees of freedom is less straightforward. We will mention here only the case of non–resonant frequencies \(\omega _1,\ldots ,\omega _n\), which for \(n=1\) coincides with \(\omega \neq 0\). See Remark 4.8.

Theorem 5.4 (Birkhoff). Let \(H\) be a hamiltonian of the form
\(\seteqnumber{0}{5.}{39}\)
\begin{equation} H(p,q; \epsilon ) = \frac 12 \sum _{i=1}^n \omega _i(p_i^2 + q_i^2) + \sum _{k\geq 3} \epsilon ^{k-2} H_k(q,p), \end{equation}

where \(H_k(q,p)\) is a polynomial in \((q,p)\) homogeneous of degree \(k\) and the frequencies \(\omega _1,\ldots ,\omega _n\) are non–resonant. Then, there exists a canonical transformation of the form (5.36) which transforms the hamiltonian to the following normal form
\(\seteqnumber{0}{5.}{40}\)
\begin{align} H & = h(z_1, \ldots , z_n; \epsilon ), \\ h(z_1,\ldots ,z_n;\epsilon ) & = \frac 12 \sum _{i=1}^n \omega _i z_i + \sum _{k\geq 2} \epsilon ^{2k-2} h_k(z_1, \ldots , z_n; \epsilon ), \\ z_i & = \widetilde p_i^2 + \widetilde q_i^2,\quad i=1,\ldots ,n, \end{align} where \(h_k(z_1, \ldots , z_n; \epsilon )\) is a polynomial in \(z_1, \ldots , z_n\) homogeneous of degree \(k\).

The non–resonance condition is essential for the existence of the canonical transformation even as a formal series. In the resonant case, the map \(F \mapsto \big \{H_2, F\big \}\) always has a non–trivial kernel on polynomial spaces and cannot be inverted. There is a full classification of normal forms for resonant systems due to Moser, but we don’t have the time to enter into details.

If you formally write down the equations of motion, you can convince yourself that if all the series were to be convergent, the hamiltonian system could be easily integrated. In fact, the question of convergence of (5.36) for \(n>1\) is a rather delicate problem, related to the existence of first integrals with certain analyticity properties.

For further details, please refer to [Bro09], [Arn10, Chapter 6.5], [AKN06, Chapter 8.3] or [Kna18, Chapters 15.2 and 15.3], as suggested also at the beginning of this section.

5.3 A brief look at KAM theory

We saw multiple times by now that every hamiltonian system with one degree of freedom is integrable by quadrature. Moreover, if for some \(E\) the curve \(H^{-1}(E) = \{(p,q)\mid H(p,q) = E\}\) is compact, then the solution \((q(t),p(t))\) of the equations of motion with initial data on \(H^{-1}(E)\), i.e., \((q(0), p(0)) = (q_0, p_0)\) with \(H(q_0,p_0)=E\), is periodic. Small enough changes of the initial condition or of the energy do not change the periodic nature of the motion.

For hamiltonian systems with \(n>1\) degrees of freedom, the behavior of the solutions under small perturbations can be much more complicated. The simplest case is that of completely integrable systems, such as the ones investigated in Sections 4.3.1 and 4.4. For completely integrable systems, the Liouville-Arnold theorem tells us that the motion on compact connected level surfaces is equivalent to a conditionally periodic motion on a Liouville \(n\)-torus. Even better, we constructed the action–angle coordinates \(I_1, \ldots , I_n, \phi _1, \ldots , \phi _n\) on a small neighborhood of a Liouville torus in such a way that the equations

\begin{equation} I_1 = I_1^0, \ldots , I_n = I_n^0 \end{equation}

identify a Liouville torus with its standard angular coordinates \(\phi _1, \ldots , \phi _n\):

\begin{equation} \mathbb {T}^n = \left \{(\phi _1, \ldots , \phi _n)\in \mathbb {R}^n \;\mid \; \phi _k\sim \phi _k + 2\pi \right \}. \end{equation}

In the \((I,\phi )\) coordinates, the hamiltonian is independent of the angular variables,

\begin{equation} H = H(I_1, \ldots , I_n), \end{equation}

and the corresponding equations of motion

\begin{equation} \dot \phi _j = \frac {\partial H(I)}{\partial I_j}, \quad \dot I_j = 0, \quad j=1,\ldots ,n, \end{equation}

are immediately solved as linear motions on the torus:

\begin{equation} \phi _j(t) = \omega _j(I) t + \phi _j^0, \quad \omega _j(I) = \frac {\partial H(I)}{\partial I_j}, \end{equation}

where the frequencies \(\omega _1(I), \ldots , \omega _n(I)\) only depend on the Liouville torus.

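The motion is periodic with period \(T\) if all the frequencies are integer multiples of \(2\pi /T\) (in particular, they are pairwise commensurable). If instead the frequencies are non–resonant, the trajectory is dense on the torus; in the intermediate cases it is dense on a torus of lower dimension.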

Sadly, a typical hamiltonian system with \(n>1\) degrees of freedom is not integrable. In non–integrable systems, the motion within the \((2n-1)\)-dimensional energy shell is not confined to \(n\)-dimensional tori for a positive Liouville measure of initial conditions, and it can look quite complex. Entirely new notions had to be developed to describe the long term behavior of the orbits in this case.

We now focus on systems that are almost integrable, in the sense that their hamiltonian has the form

\begin{equation} H_\epsilon = H_0(I) + \epsilon H_{\mathrm {pert}}(I,\phi ), \end{equation}

where \(H_0\) is the hamiltonian of a completely integrable system in action–angle coordinates, \(\epsilon \) is the small parameter of the perturbation and the perturbation itself is given by the hamiltonian \(H_{\mathrm {pert}}(I,\phi )\) defined on

\begin{equation} (I,\phi ) \in \mathcal {I} \times \mathbb {T}^n \subseteq \mathbb {R}^n\times \mathbb {T}^n. \end{equation}

For \(\epsilon =0\) the motion is a conditionally periodic motion on a Liouville torus; the question now is to describe the nature of the motion for small values of \(\epsilon \).

If we fix one torus

\begin{equation} I_1 = I_1^0, \ldots , I_n = I_n^0, \end{equation}

for the unperturbed system, we can try to apply Birkhoff’s transformation \((\phi , I)\mapsto (\widetilde \phi ,\widetilde I)\) from the previous section to reduce the hamiltonian \(H_\epsilon \) to a normal form \(H_\epsilon = H_\epsilon (\widetilde I; \epsilon )\). As we saw, the necessary canonical transformation does not exist even just as a formal series when the frequencies \(\omega _1^0, \ldots , \omega _n^0\) are resonant. Far from being just a problem of the method, it can be shown that resonant tori are destroyed by arbitrarily small values of \(\epsilon \). Furthermore, the survival of the invariant tori under small perturbations in the non–resonant case has been unclear for a very long time. Indeed, a careful development of the normal form approach around non–resonant frequencies presents the appearance of denominators of the form \((\vb * k, \omega )\) with large \(\vb * k \in \mathbb {Z}^n\setminus \{0\}\): it is often possible to find \(\vb * k\) that makes such denominators arbitrarily small. As if that were not enough, frequencies which are unaffected by the small denominators problem are always arbitrarily close to “bad” frequencies.

This problem was so bad that Poincaré called it the “fundamental problem” of classical mechanics, and it was so pervasive that in the middle of the last century people formulated an “ergodic hypothesis” according to which the trajectories of the motion behave chaotically on generically perturbed energy surfaces.

The turning point in this story was made by Kolmogorov in 1954. According to him, most of the invariant tori survive any sufficiently small perturbation of the hamiltonian and the surviving tori can be characterized rather explicitly in terms of a diophantine condition on the frequencies. Shortly afterwards, Arnold and Moser proved this intuition in the case of analytic and then sufficiently smooth hamiltonians, leading to the modern name of KAM theorem (from Kolmogorov, Arnold and Moser).

To begin, let’s describe the diophantine conditions for the non–resonant frequencies \(\omega _1, \ldots , \omega _n\).

Let \(\alpha >0\), \(\tau >0\) be two fixed positive numbers. We say that the frequencies satisfy the \((\alpha ,\tau )\) diophantine condition if

\begin{equation} \label {eq:diophat} \left |\sum _{i=1}^n k_i \omega _i \right | \geq \frac {\alpha }{|\vb * k|^\tau } \qquad \forall \vb * k=(k_1, \ldots , k_n)\in \mathbb {Z}^n\setminus \{0\}, \end{equation}

where \(|\vb * k| = \sum _{i=1}^n |k_i|\). We denote by \(\Delta _\alpha ^\tau \subset \mathbb {R}^n\) the set of vectors \(\omega \) satisfying (5.52).
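The diophantine condition can be probed numerically on a finite range of integer vectors. The Python sketch below is only an illustration: the function name `diophantine_constant`, the cutoff `kmax` and the sample frequency vectors are assumptions made here. It returns the smallest value of \(|\langle \vb * k, \omega \rangle |\,|\vb * k|^\tau \) over the nonzero \(\vb * k\) with \(|\vb * k|\leq \) `kmax`, which is the largest \(\alpha \) compatible with (5.52) on that finite range. A finite search can disprove the condition but of course cannot prove it.

```python
from itertools import product

def diophantine_constant(omega, tau, kmax):
    """Smallest |<k, omega>| * |k|^tau over nonzero integer k with |k| <= kmax.

    Here |k| is the sum of the absolute values of the entries of k.
    """
    n = len(omega)
    best = float("inf")
    for k in product(range(-kmax, kmax + 1), repeat=n):
        norm1 = sum(abs(ki) for ki in k)
        if norm1 == 0 or norm1 > kmax:
            continue
        small_divisor = abs(sum(ki * wi for ki, wi in zip(k, omega)))
        best = min(best, small_divisor * norm1 ** tau)
    return best

golden = (1 + 5 ** 0.5) / 2
# Frequencies (1, golden mean): the constant stays away from zero on this range.
good = diophantine_constant((1.0, golden), tau=1.5, kmax=60)
# Resonant frequencies (1, 2): k = (2, -1) gives <k, omega> = 0.
bad = diophantine_constant((1.0, 2.0), tau=1.5, kmax=10)
```

The cost of the search grows like `kmax` to the power \(n\), so this brute-force approach is meant for small examples only.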

First of all, observe that for any fixed \(\alpha >0\) and \(\tau > n-1\), the set \(\Delta _\alpha ^\tau \) is not empty. Indeed, its complement in \(\mathbb {R}^n\) is

\begin{equation} R_{\alpha }^\tau := \cup _{\vb * k \in \mathbb {Z}^n\setminus \{0\}} R_{\alpha }^\tau (\vb * k),\quad R_{\alpha }^\tau (\vb * k) := \left \{ \omega \in \mathbb {R}^n \;\mid \; |\langle \vb * k, \omega \rangle | < \frac {\alpha }{|\vb * k|^\tau } \right \}. \end{equation}

For any bounded \(\Omega \subset \mathbb {R}^n\), the Lebesgue measure \(\mu \) of \(\Omega \cap R_{\alpha }^\tau (\vb * k)\) can be estimated explicitly as

\begin{equation} \mu (\Omega \cap R_{\alpha }^\tau (\vb * k)) = O\left (\frac {\alpha }{|\vb * k|^{\tau +1}}\right ). \end{equation}

If \(\tau > n-1\),

\begin{equation} \mu \left ((\mathbb {R}^n\setminus \Delta _\alpha ^\tau )\cap \Omega \right )\leq \sum _{\vb * k\in \mathbb {Z}^n\setminus \{0\}} \mu \left (R_{\alpha }^\tau (\vb * k)\cap \Omega \right ) = O(\alpha ), \end{equation}

and since

\begin{equation} \mu \left (\cap _{\alpha >0}R_{\alpha }^\tau \right )=0, \end{equation}

the set \(\Delta ^\tau = \cup _{\alpha >0}\Delta _\alpha ^\tau \) is of full measure in \(\mathbb {R}^n\) for any \(\tau > n-1\), i.e., for almost every \(\omega \in \mathbb {R}^n\) there is \(\alpha >0\) such that \(\omega \in \Delta ^\tau _\alpha \). For the curious reader, the set of missing frequencies can be explicitly characterized in terms of diophantine approximation.

From now on, we assume that a number \(\tau > n-1\) is fixed and we drop it from the notation: \(\Delta _\alpha = \Delta _\alpha ^\tau \).

Sadly, the parameter \(\alpha \) in the non–resonance condition will limit the size of the perturbation parameter \(\epsilon \), so before stating the theorem we need some more hypotheses.

Let’s go back to hamiltonian systems. In what follows, we assume that the unperturbed hamiltonian \(H_0(I)\) is non–degenerate, that is, the matrix of second derivatives is not singular:

\begin{equation} \label {eq:Hnondeg} \det \left (\frac {\partial ^2 H_0(I)}{\partial I_j \partial I_k}\right ) \neq 0, \quad \forall I = (I_1, \ldots , I_n)\in \mathcal {I}\subset \mathbb {R}^n. \end{equation}

Similarly to what we have seen for Legendre transformations, non–degenerate hamiltonians induce a local diffeomorphism

\begin{equation} \label {eq:H0omegalocaldiffeo} I \mapsto \omega (I) = \frac {\partial H_0(I)}{\partial I}. \end{equation}

By possibly restricting the domain of definition of the actions \(\mathcal {I}\), we can assume that (5.58) is a diffeomorphism of \(\mathcal {I}\) onto some domain \(\Omega \subset \mathbb {R}^n\) which we assume to have piecewise smooth boundary \(\partial \Omega \). From a different perspective, we are saying that, given a non–degenerate hamiltonian, we can use the frequencies \(\omega \in \Omega \) to parametrize the unperturbed invariant tori.
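Two simple examples may help to fix ideas; both hamiltonians are chosen here only for illustration. For
\[ H_0(I) = \frac 12 \sum _{k=1}^n I_k^2 \]
the matrix of second derivatives is the identity, the frequency map is \(\omega (I) = I\), and each torus is labelled by its own frequency vector. On the other hand, the harmonic oscillator hamiltonian \(H_0(I) = \sum _{k=1}^n \omega _k I_k\) of Section 5.1 has vanishing second derivatives: it is degenerate, the frequency map is constant, and the frequencies cannot be used to parametrize the tori.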

Fix \(\alpha >0\) and denote by \(\Omega _\alpha \subset \Omega \) the subset of frequencies \(\omega \) in \(\Delta _\alpha \) whose distance from the boundary \(\partial \Omega \) is at least \(\alpha \).

The set \(\Omega _\alpha \), like the set \(\Delta _\alpha \), is a Cantor set: closed, perfect and nowhere dense. However, the Lebesgue measure of \(\Omega _\alpha \) is rather large since the measure of a boundary layer of size \(\alpha \) is \(O(\alpha )\) and thus

\begin{equation} \mu (\Omega \setminus \Omega _\alpha ) = O(\alpha ). \end{equation}

We are finally ready to state the KAM theorem.

Theorem 5.5 (KAM theorem). Let \(H_0(I)\), \(I\in \mathcal {I}\subset \mathbb {R}^n\), be a non–degenerate hamiltonian such that
\(\seteqnumber{0}{5.}{59}\)
\begin{equation} \mathcal {I} \to \Omega \subset \mathbb {R}^n, \quad I\mapsto \omega (I) = \frac {\partial H_0(I)}{\partial I} \end{equation}

is a diffeomorphism. Assume that the perturbed hamiltonian
\(\seteqnumber{0}{5.}{60}\)
\begin{equation} H_{\epsilon } = H_0(I) + \epsilon H_{\mathrm {pert}}(I,\phi ), \end{equation}

is real analytic on \(\overline {\mathcal {I}}\times \mathbb {T}^n\). Then, there exists a constant \(\delta >0\) such that for
\(\seteqnumber{0}{5.}{61}\)
\begin{equation} |\epsilon | < \delta \alpha ^2 \end{equation}

all the invariant tori of the unperturbed system with \(\omega \in \Omega _\alpha \) persist as Lagrangian tori, being only slightly deformed. Moreover, they depend in a Lipschitz continuous way on \(\omega \) and fill the phase space \(\mathcal {I}\times \mathbb {T}^n\) up to a set of measure \(O(\alpha )\).

Remark 5.2. Here, by being real analytic on \(\overline {\mathcal {I}}\times \mathbb {T}^n\) we mean that the analyticity extends to a uniform neighborhood of \(\mathcal {I}\).

An immediate and important consequence of the KAM theorem is that the ergodic hypothesis is wrong: the KAM tori form an invariant subset, yes, but it is neither of full nor of zero measure.

Another tricky point of the KAM theorem comes from the fact that the invariant set \(\Omega _\alpha \) is a Cantor set and, as such, has no interior points. This means that, given an initial condition in phase space, we cannot tell whether the trajectory for a given initial position falls onto an invariant torus or wanders into a gap between such tori. From the physical point of view, the KAM theorem makes a probabilistic statement: a randomly chosen orbit lies on an invariant torus with probability \(1-O(\alpha )\).

Remark 5.3. The hamiltonian nature of the equations is almost indispensable. Analogous results are true for reversible systems, but in all cases the system has to be conservative: any kind of dissipation immediately destroys the Cantor family of tori, although isolated ones may persist as attractors.

Every proof of the KAM theorem is complicated. After all, KAM theory is “one of the big mathematical achievements” of the 20th century, as commented by Winfried Scharlau in 1992 on the occasion of the award of the Cantor medal to Jürgen Moser. We will omit the proof here, and refer the interested reader to [Kna18, Chapter 15] and [Pö01]. For a nice and readable account of KAM theory and its developments, you can refer to [Bro04].

5.4 Nekhoroshev theorem

In this section we will briefly discuss another approach to the perturbation theory of integrable systems developed by Nekhoroshev at the end of the 1970s. This is a lesser known but extremely useful result for perturbation theory which deals with long time stability of the perturbed orbits. We will state a simpler version of the original theorem, as presented in [BGG85]. An accessible proof of a slightly more general statement can be found in [Pö93]. For a more detailed and general account, including applications to celestial mechanics, please refer to [Arn10, Chapter 8] and [AKN06, Chapter 6].

The KAM theorem guarantees the persistence of certain Liouville tori: the trajectories starting on them remain on the invariant tori for all times. The result is, however, not very explicit about which tori persist and for which sizes of the perturbation; these issues are still active areas of research. Nekhoroshev took a slightly different point of view. Setting aside the persistence of the tori, can we estimate how long the trajectories of a perturbed integrable system remain close to the original invariant tori?

It turns out that the answer is yes, and the result is stronger than one may expect at first: all trajectories remain close to the unperturbed invariant tori for exponentially long times

\begin{equation} |t| < T(\epsilon ) \sim e^{1/\epsilon }. \end{equation}

The starting point is an analytic hamiltonian of the usual form

\begin{equation} \label {eq:hamneko} H_\epsilon = H_0(I) + \epsilon H_{\mathrm {pert}}(I, \phi ), \quad (I,\phi )\in \mathcal {I}\times \mathbb {T}^n, \end{equation}

where \((I,\phi )\) are the action–angle coordinates for \(H_0\), \(n>1\) and \(\mathcal {I}\subset \mathbb {R}^n\) is open and bounded.

Before stating the theorem, we need some more definitions.

We say that the unperturbed hamiltonian \(H_0\) is a uniformly convex function if there exists \(\lambda >0\) such that for all \(I\in \mathcal {I}\) and for all \((X_1, \ldots , X_n)\in \mathbb {R}^n\)

\begin{equation} \left | \sum _{i,j=1}^n \frac {\partial ^2 H_0(I)}{\partial I_i \partial I_j} X_i X_j\right | \geq \lambda \sum _{i=1}^n X_i^2. \end{equation}

Note that the assumption above implies that \(H_0\) is non-degenerate, in the sense that its Hessian is invertible at every point of \(\mathcal {I}\), but the converse implication is false.
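As a quick check of the definition, consider two quadratic hamiltonians with \(n=2\), chosen here purely for illustration. For \(H_0 = \frac 12 (I_1^2 + I_2^2)\) the Hessian is the identity matrix, so

\[ \left | \sum _{i,j=1}^2 \frac {\partial ^2 H_0(I)}{\partial I_i \partial I_j} X_i X_j\right | = X_1^2 + X_2^2, \]

and the inequality holds with \(\lambda = 1\). For \(H_0 = \frac 12 (I_1^2 - I_2^2)\), instead, the Hessian is \(\mathrm {diag}(1,-1)\), which is invertible, but the quadratic form

\[ \sum _{i,j=1}^2 \frac {\partial ^2 H_0(I)}{\partial I_i \partial I_j} X_i X_j = X_1^2 - X_2^2 \]

vanishes at \(X = (1,1)\), so no \(\lambda > 0\) can satisfy the inequality: this hamiltonian has an invertible Hessian but is not uniformly convex.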

Finally, given a positive number \(\delta >0\), we denote by \(\mathcal {I}_\delta \subset \mathcal {I}\) the set of points of \(\mathcal {I}\) that are contained in \(\mathcal {I}\) together with the ball of radius \(\delta \) centered at them.

Theorem 5.6 (Nekhoroshev). Let \(H_\epsilon \) be a hamiltonian of the form (5.64) with \(H_0\) uniformly convex. Then there exist positive numbers \(a,b,\alpha ,\beta ,\epsilon _0\in \mathbb {R}_+\) such that for any \(|\epsilon |< \epsilon _0\) and for any solution \((\phi (t), I(t))\) of Hamilton’s equations
\(\seteqnumber{0}{5.}{65}\)
\begin{equation} \dot \phi _k = \frac {\partial H_\epsilon }{\partial I_k}, \quad \dot I_k = - \frac {\partial H_\epsilon }{\partial \phi _k}, \quad k=1,\ldots ,n, \end{equation}

it holds that
\(\seteqnumber{0}{5.}{66}\)
\begin{equation} |I(t)-I(0)| < \alpha |\epsilon |^a \end{equation}

for all initial conditions \((\phi (0), I(0))\) satisfying
\(\seteqnumber{0}{5.}{67}\)
\begin{equation} I(0) \in \mathcal {I}_\delta \quad \mbox {for}\quad \delta = \alpha |\epsilon |^a \end{equation}

and for all \(t\) such that
\(\seteqnumber{0}{5.}{68}\)
\begin{equation} |t| \leq \beta \sqrt {\frac {\epsilon _0}{|\epsilon |}} \exp \left (\frac {\epsilon ^b_0}{|\epsilon |^b}\right ). \end{equation}
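To get a feeling for the two scales appearing in the theorem, the following minimal Python sketch evaluates the bound on the variation of the actions and the bound on \(|t|\). The function names and the numerical values of \(a, b, \alpha , \beta , \epsilon _0\) are hypothetical and chosen only for illustration: the theorem guarantees that such constants exist, but does not provide these values.

```python
import math


def action_bound(eps, a, alpha=1.0):
    """Bound on |I(t) - I(0)|: alpha * |eps|**a (constants are illustrative)."""
    return alpha * abs(eps) ** a


def stability_time(eps, eps0, b, beta=1.0):
    """Bound on |t|: beta * sqrt(eps0/|eps|) * exp((eps0/|eps|)**b).

    The absolute value of eps is used so that the expression is defined
    for both signs of the perturbation parameter.
    """
    ratio = eps0 / abs(eps)
    return beta * math.sqrt(ratio) * math.exp(ratio ** b)


if __name__ == "__main__":
    # Hypothetical constants, NOT taken from the theorem or its proof.
    a = b = 0.5
    eps0 = 1.0
    for eps in (1e-1, 1e-2, 1e-3, 1e-4):
        print(f"eps={eps:.0e}  action bound={action_bound(eps, a):.3e}  "
              f"time bound={stability_time(eps, eps0, b):.3e}")
```

With any fixed choice of the constants, decreasing \(|\epsilon |\) shrinks the bound on the actions only polynomially, while the time bound grows faster than any power of \(1/|\epsilon |\).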

This theorem does not exclude chaotic motion of the variables \(I\), but it guarantees that their variation remains small for extremely long times, which may for example exceed the lifetime of the physical system itself. In fact, one of the most important outcomes of Nekhoroshev’s theorem is the long-term stability of various celestial systems. To cite a few examples, it helped to understand motions observed in the asteroid belt and to show the stability of the motion in some restricted planetary problems (for example, the Sun-Jupiter-Saturn system or the planar Sun-Jupiter-Saturn-Uranus system), for which it implies that trajectories remain close to an invariant torus of the system for times comparable to the lifetime of the Solar System [Guz15]. Indeed, one of the motivations behind Nekhoroshev’s idea was the analysis of the stability of the Trojan asteroids and of the \(L_4, L_5\) Lagrange equilibrium points.
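The confinement of the actions can also be observed numerically. The sketch below is only an illustration, not part of the theorem or of its proof. It assumes the toy hamiltonian \(H_\epsilon = \frac 12 (I_1^2 + I_2^2) + \epsilon \left (\cos \phi _1 + \cos (\phi _1 - \phi _2)\right )\), whose unperturbed part is uniformly convex with \(\lambda = 1\). It integrates Hamilton’s equations with a leapfrog (Störmer–Verlet) scheme and records the largest value of \(|I(t)-I(0)|\). The function name, the perturbation, the initial condition, the step size and the integration time are all hypothetical choices.

```python
import math


def max_action_drift(eps, I0=(1.0, 0.618), phi0=(0.0, 0.0), dt=0.02, t_max=200.0):
    """Return max |I(t) - I(0)| along a numerical orbit of the toy hamiltonian
    H = (I1^2 + I2^2)/2 + eps * (cos(phi1) + cos(phi1 - phi2)).

    The model and all default parameters are illustrative assumptions.
    """
    I1, I2 = I0
    p1, p2 = phi0

    def action_rates(p1, p2):
        # dI_k/dt = -dH/dphi_k for the perturbation above.
        s = math.sin(p1 - p2)
        return eps * (math.sin(p1) + s), -eps * s

    drift = 0.0
    f1, f2 = action_rates(p1, p2)
    for _ in range(int(round(t_max / dt))):
        # Half step for the actions.
        I1 += 0.5 * dt * f1
        I2 += 0.5 * dt * f2
        # Full step for the angles: dphi_k/dt = dH/dI_k = I_k.
        p1 += dt * I1
        p2 += dt * I2
        # Second half step for the actions, with the updated angles.
        f1, f2 = action_rates(p1, p2)
        I1 += 0.5 * dt * f1
        I2 += 0.5 * dt * f2
        drift = max(drift, math.hypot(I1 - I0[0], I2 - I0[1]))
    return drift


if __name__ == "__main__":
    for eps in (1e-4, 1e-3):
        print(f"eps={eps:.0e}  max |I(t) - I(0)| = {max_action_drift(eps):.3e}")
```

A finite-time simulation of this kind cannot, of course, probe exponentially long times. It only shows that, for the chosen initial condition away from the low-order resonances of the model, the actions stay close to their initial values and the drift decreases with \(\epsilon \).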

The KAM theorem can be considered the turning point in the study of deterministic chaos, a broad and lively topic that has had a huge impact also outside classical mechanics. Many of the ideas in these chapters are at the heart of the perturbation theory of operators and of semi-classical mechanics. For an overview of the classical aspects of deterministic chaos, with an emphasis on the influence of KAM theory, you can refer to the beautiful paper [MR11].