Article

The Optimal Erection of the Inverted Pendulum

by Matteo Massaro 1,*, Stefano Lovato 1 and David J. N. Limebeer 2
1 Department of Industrial Engineering, University of Padova, Via Venezia 1, 35131 Padova, Italy
2 School of Electrical and Information Engineering, University of the Witwatersrand, Johannesburg 2000, South Africa
* Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(16), 8112; https://doi.org/10.3390/app12168112
Submission received: 21 July 2022 / Revised: 5 August 2022 / Accepted: 11 August 2022 / Published: 13 August 2022
(This article belongs to the Collection Analysis, Control and Applications of Multibody Systems)

Abstract

The erection of the inverted pendulum is a classic control problem, which has appeared in several variants. One of the most challenging is the minimum-time erection of a pendulum that is mounted on a moving cart. The aim is to erect the pendulum from the ‘straight-down’ (stable equilibrium) to the ‘straight-up’ (unstable equilibrium) position in minimum time. The swing-up maneuver is usually addressed using a pre-defined control strategy, e.g., energy-based control or selecting the switching times in a bang-bang structure. The aim of this paper is to show that the minimum-time solution may have a singular arc, with the optimal control taking a bang-singular-bang form. The control law on the singular arc is a feedback law, which is derived herein and discussed. A sensitivity analysis of the solution structure is also performed by varying the model parameters. Finally, the time-optimal solution is compared with that obtained using an energy-based control strategy.

1. Introduction

The inverted pendulum is a classical prototype problem that captures the essential features of several practical problems, such as bicycle and motorcycle balancing [1], unicycle riding and balancing [2], skier and skater dynamics [3], human and humanoid balancing [4], and spacecraft and rockets at launch [5]. The related control problems have been tackled using a number of different approaches. In [6], the tilt motion of a two-wheeled inverted pendulum is stabilized using a self-tuning proportional-integral-derivative (PID) controller. Both linear quadratic regulator (LQR) and PID control are used in [7] to control a nonlinear inverted pendulum mounted on a motor-driven cart, with the aim of reaching a given cart position and stabilizing the pendulum in the (unstable) upright position. The two control methods are designed using the linearized model. In [8], partial feedback linearization is employed on a two-wheel cart-based inverted pendulum in order to design two nonlinear controllers that control its position and orientation, while keeping the tilt angle within a given range. Neural network control has also been applied with the aim of balancing the pendulum with no a priori knowledge of the dynamics [9]. In [10], three types of fuzzy control schemes are applied to a two-wheeled inverted pendulum, which is investigated both numerically and experimentally. Sliding mode control is used in [11], and compared with PID control in [12]. The PID controller shows a faster convergence time than a second-order sliding-mode regulator. Finally, ref. [13] shows the effectiveness of active disturbance rejection control in stabilizing an inverted pendulum mounted on an unmanned aerial vehicle in the presence of external disturbances and model mismatch.
The aforementioned research focuses on the stabilization of an inverted pendulum in the upright position. One of the most challenging variants of the inverted pendulum problem is the erection of a pendulum that is mounted on a horizontally movable carriage. In this problem, the pendulum is initially in the ‘straight-down’ (stable) equilibrium. The controlled horizontal motion of the carriage is used to erect the pendulum and place it in the ‘straight-up’ (unstable) equilibrium. In this case, a standard controller is typically designed to ensure stability around the upright position, while a swing-up controller is employed to erect the pendulum from the straight-down position. Energy-based control strategies have been widely employed in swing-up controller design [14,15,16]. The basic idea is to use the energy rate as the controlled variable in order to increase the total pendulum energy. Using this idea, rule-based fuzzy control has also been employed; see, for example, [17,18]. Proportional-velocity controllers have also been used [19,20], with the idea of moving the cart left and right repeatedly with a PID controller until the pendulum reaches the straight-up position. All of this research assumes a pre-defined control strategy for the swing-up controller, without any requirement on the time needed to perform the maneuver.
This work deals with the minimum-time erection of an inverted pendulum within a nonlinear optimal-control framework. An early treatment of this problem appears in Section 9.3.13 of the classic optimal control book [21] and in [22], where the solution is ‘assumed’ to have a bang-bang structure, with a fixed number of switching times determined by optimization. In [23], the time-optimal swing-up is considered from a theoretical perspective using geometric control theory. The controlled variable is the cart acceleration, and it is shown that, for the problem formulated in this way, no singular arcs are possible.
In this paper, a deeper investigation of the time-optimal control problem is carried out, without assuming a pre-defined control structure and with the cart force (rather than its acceleration) as the control. In contrast to prior investigations, it is shown that a singular arc may occur under certain circumstances, in which case the optimal control is bang-singular-bang in form. The control law on the singular sub-arc is derived analytically and compared with the (sub-optimal) bang-bang solution. A sensitivity analysis with respect to the model parameters and the initial conditions is performed to highlight the circumstances under which a singular arc occurs. Finally, the optimal solution is compared with an energy-based control strategy.
A dynamic model of the system is derived in Section 2. The optimal control problem is studied in Section 3. The simulation results are presented in Section 4, with the conclusions given in Section 5.

2. Dynamic Model

The system is shown in Figure 1; it comprises a pendulum of length l and mass m that is mounted on a cart of mass M. The pendulum has a rotational degree of freedom θ (measured with respect to the vertical axis), while the cart has a translational degree of freedom y. The acceleration due to gravity is g, and an external horizontal force F acts on the cart. In sum, the model has two degrees of freedom (θ and y) and one control input F.
The equations of motion can be derived from the kinetic energy T, the potential energy V and the virtual work δW:
T = \tfrac{1}{2} M \dot{y}^2 + \tfrac{1}{2} m \left[ (\dot{y} + l \dot{\theta} \cos\theta)^2 + (l \dot{\theta} \sin\theta)^2 \right], \qquad (1)
V = - m g l \cos\theta, \qquad (2)
\delta W = F \, \delta y. \qquad (3)
Using Hamilton’s principle, one obtains
\begin{bmatrix} M + m & m l \cos\theta \\ m l \cos\theta & m l^2 \end{bmatrix} \begin{bmatrix} \ddot{y} \\ \ddot{\theta} \end{bmatrix} = \begin{bmatrix} m l \dot{\theta}^2 \sin\theta + F \\ - m g l \sin\theta \end{bmatrix}, \qquad (4)
which can be reduced to
\begin{bmatrix} 1 & \epsilon l \cos\theta \\ \cos\theta & l \end{bmatrix} \begin{bmatrix} \ddot{y} \\ \ddot{\theta} \end{bmatrix} = \begin{bmatrix} \epsilon l \dot{\theta}^2 \sin\theta + g u \\ - g \sin\theta \end{bmatrix}, \qquad (5)
where
\epsilon = \frac{m}{M + m} \quad \text{and} \quad u = \frac{F}{g (M + m)}. \qquad (6)
Equations (5) are identical to those reported in [21] when y is in units of l and the accelerations are in units of g; as a consequence, time is in units of \sqrt{l/g}.
Since most solvers cannot deal with second order differential equations, it is convenient to reduce (5) to the first-order state-space form. By introducing state variables v y = y ˙ and v θ = θ ˙ , one obtains the following:
\begin{bmatrix} \dot{y} \\ \dot{\theta} \\ \dot{v}_y \\ \dot{v}_\theta \end{bmatrix} = \begin{bmatrix} v_y \\ v_\theta \\ \dfrac{(v_\theta^2 + \cos\theta)\,\epsilon \sin\theta + u}{1 - \epsilon \cos^2\theta} \\[2ex] -\dfrac{\epsilon v_\theta^2 \sin\theta \cos\theta + \sin\theta + u \cos\theta}{1 - \epsilon \cos^2\theta} \end{bmatrix}, \qquad (7)
which is in standard state-space form. From now on, x = [y, θ, v_y, v_θ]^T represents the vector of scaled state variables.
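For readers who wish to simulate (7) directly, a minimal numerical sketch is given below. This is our own illustration, not part of the original paper; the function name and the parameter values are hypothetical, and the scaled quantities follow (6).

import numpy as np

def pendulum_cart_dynamics(x, u, eps):
    """Scaled cart-pendulum dynamics of (7); x = [y, theta, v_y, v_theta]."""
    y, theta, v_y, v_theta = x
    den = 1.0 - eps * np.cos(theta) ** 2          # positive for 0 <= eps < 1
    dv_y = ((v_theta ** 2 + np.cos(theta)) * eps * np.sin(theta) + u) / den
    dv_theta = -(eps * v_theta ** 2 * np.sin(theta) * np.cos(theta)
                 + np.sin(theta) + u * np.cos(theta)) / den
    return np.array([v_y, v_theta, dv_y, dv_theta])

# Illustrative physical parameters mapped to the scaled model of (6):
M, m, l, g = 1.0, 1.0, 0.5, 9.81      # cart mass, bob mass, length, gravity
eps = m / (M + m)                     # mass ratio (epsilon = 0.5 here)
# A force F maps to the scaled control u = F / (g * (M + m)),
# y is measured in units of l, and time in units of sqrt(l / g).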

3. Optimal Control Problem

The optimal control problem (OCP) is one of finding a control u, with |u| ≤ u_max = 1, which moves the system from the stable equilibrium y = ẏ = θ = θ̇ = 0 (straight-down) to the unstable equilibrium y = ẏ = θ̇ = 0, θ = π (straight-up) in minimum time. Note that both the initial and terminal states are specified.
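For reference, the problem can be stated compactly as follows (a restatement of the description above, with f denoting the right-hand side of (7)):

\begin{aligned}
\min_{u(\cdot),\; t_f} \quad & t_f \\
\text{subject to} \quad & \dot{x} = f(x, u), \qquad |u(t)| \leq u_{\max} = 1, \\
& x(0) = [0,\, 0,\, 0,\, 0]^T, \qquad x(t_f) = [0,\, \pi,\, 0,\, 0]^T .
\end{aligned}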
The optimal control problem will be tackled in the following steps. First, the control Hamiltonian is derived, which is shown to be linear in the control u. Next, the switching function is derived (from the control Hamiltonian). Finally, under the assumption that a singular arc might exist, a feedback control law for the singular sub-arc is derived. This calculation is based on the fact that, along a singular arc, the switching function and its time derivatives must be zero. The control law for the singular arc is expressed both explicitly and as the solution of a Riccati differential equation; these two forms are equivalent.
The OCP cost is simply the elapsed time
J = t_f. \qquad (8)
The control Hamiltonian is given by
H = \lambda_1 v_y + \lambda_2 v_\theta + \lambda_3 \, \frac{(v_\theta^2 + \cos\theta)\,\epsilon \sin\theta + u}{1 - \epsilon \cos^2\theta} - \lambda_4 \, \frac{\epsilon v_\theta^2 \sin\theta \cos\theta + \sin\theta + u \cos\theta}{1 - \epsilon \cos^2\theta}, \qquad (9)
where λ = [ λ 1 , λ 2 , λ 3 , λ 4 ] T are the Lagrange multipliers. Evidently
\frac{\partial H}{\partial u} = \frac{\lambda_3 - \lambda_4 \cos\theta}{1 - \epsilon \cos^2\theta}. \qquad (10)
The co-state equations for the OCP are given by
\dot{\lambda} = -\frac{\partial H}{\partial x}, \qquad (11)
which in expanded form are
\dot{\lambda}_1 = 0, \qquad (12)
\dot{\lambda}_2 = \frac{\epsilon (1 - 2\cos^2\theta)(\lambda_3 - \lambda_4 v_\theta^2) + (\lambda_4 - \lambda_3 \epsilon v_\theta^2)\cos\theta - \lambda_4 u \sin\theta}{1 - \epsilon \cos^2\theta} + \frac{2 \big\{ \big[ (\lambda_3 - \lambda_4 v_\theta^2)\,\epsilon \sin\theta - \lambda_4 u \big] \cos\theta - (\lambda_4 - \lambda_3 \epsilon v_\theta^2)\sin\theta + \lambda_3 u \big\}\, \epsilon \sin\theta \cos\theta}{(1 - \epsilon \cos^2\theta)^2}, \qquad (13)
\dot{\lambda}_3 = -\lambda_1, \qquad (14)
\dot{\lambda}_4 = -\lambda_2 - \frac{2 \epsilon v_\theta \sin\theta \, (\lambda_3 - \lambda_4 \cos\theta)}{1 - \epsilon \cos^2\theta}. \qquad (15)
The optimal control u * is given by
u^* = \arg\min_{u} H(x^*, \lambda^*, u). \qquad (16)
Since H is linear in the control u, the optimal control is given by
u^* = \begin{cases} -1 & \text{if } \Phi > 0 \\ \text{unknown} & \text{if } \Phi = 0 \\ +1 & \text{if } \Phi < 0, \end{cases} \qquad (17)
where the switching function is
\Phi = \lambda_3 - \lambda_4 \cos\theta; \qquad (18)
note that (1 − ϵ cos²θ) > 0. The control is either bang-bang, or possibly bang-singular-bang (if Φ = 0 over a finite time interval).
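As an illustration (our own sketch, not the authors' code), the bang-bang branch of (17) can be evaluated directly from the co-states; the singular case Φ = 0 must be handled separately, as discussed below.

import numpy as np

def bang_bang_control(lam3, lam4, theta, u_max=1.0):
    """Bang-bang branch of (17): u = -u_max if Phi > 0, u = +u_max if Phi < 0."""
    Phi = lam3 - lam4 * np.cos(theta)   # switching function (18)
    if Phi > 0.0:
        return -u_max
    if Phi < 0.0:
        return u_max
    return None  # Phi = 0: the control is not determined by the sign of Phi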
The transversality condition associated with the free final time is
H(t_f) = -1; \qquad (19)
see, for example, Equation (8.73) in [1]. Since H is not explicitly time-dependent, (19) implies that H(t) ≡ -1. Since the state variables are specified at both the initial and final times, there are no pre-specified boundary conditions on the Lagrange multipliers; see Equation (8.72) in [1].

Singular Arc

The optimal control on the singular arc will be computed using the high-order minimum principle [24,25]. In the case of a single control, the generalized necessary condition for optimality is
(-1)^k \, \frac{\partial}{\partial u} \left[ \frac{d^{2k}}{dt^{2k}} \left( \frac{\partial H}{\partial u} \right) \right] \geq 0, \qquad (20)
which will be checked numerically a posteriori. The degree of singularity of the arc is 2 k , which corresponds to the lowest-order time derivative of H u (or the switching function Φ ) in which u appears explicitly. Equation (20) reduces to the classic Legendre–Clebsch condition for k = 0 , and to the Kelley condition if k = 1 (singular arc of degree two), which is the case of interest here.
Under the assumption that a singular arc exists, the switching function given in (18) vanishes. In order to gain access to the control, (10) has to be differentiated with respect to time at least twice; it is well known that the first re-appearance of u in the switching-function differentiation process must occur in an even-order derivative [26]. One can now calculate Φ̇(x, λ) and Φ̈(x, λ, u).
In order to enforce Φ ( x , λ ) = 0 , Φ ˙ ( x , λ ) = 0 and Φ ¨ ( x , λ , u ) = 0 , we exploit the fact that λ 1 is constant (c); see (12), and assemble the following system of equations:
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & -\cos\theta \\ \ast & \ast & \ast & \ast \\ \ast & \ast & \ast & \ast \end{bmatrix} \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \\ \lambda_4 \end{bmatrix} = \begin{bmatrix} c \\ 0 \\ 0 \\ 0 \end{bmatrix}, \qquad (21)
where the second, third and fourth rows in (21) correspond to Φ(x, λ) = 0, Φ̇(x, λ) = 0 and Φ̈(x, λ, u) = 0, respectively; (21) can then be solved for the co-state variables. The remaining matrix entries (marked with asterisks) are too cumbersome to report explicitly; these calculations were carried out using a symbolic mathematics tool.
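The repeated differentiation of the switching function can be reproduced with any symbolic mathematics tool. The following SymPy sketch (our own, with our own symbol names) generates Φ̇ and Φ̈ from (7), (9) and (11):

import sympy as sp

y, theta, v_y, v_theta, u, eps = sp.symbols('y theta v_y v_theta u epsilon')
lam = sp.Matrix(sp.symbols('lambda_1:5'))

den = 1 - eps * sp.cos(theta) ** 2
f = sp.Matrix([v_y, v_theta,
               ((v_theta ** 2 + sp.cos(theta)) * eps * sp.sin(theta) + u) / den,
               -(eps * v_theta ** 2 * sp.sin(theta) * sp.cos(theta)
                 + sp.sin(theta) + u * sp.cos(theta)) / den])     # dynamics (7)
x = sp.Matrix([y, theta, v_y, v_theta])

H = (lam.T * f)[0]                                   # control Hamiltonian (9)
lam_dot = -sp.Matrix([H]).jacobian(x).T              # co-state equations (11)

def ddt(expr):
    """Time derivative of expr(x, lambda) along the state/co-state flow."""
    return (sp.Matrix([expr]).jacobian(x) * f
            + sp.Matrix([expr]).jacobian(lam) * lam_dot)[0]

Phi = lam[2] - lam[3] * sp.cos(theta)                # switching function (18)
Phi_dot = sp.simplify(ddt(Phi))                      # contains no u
Phi_ddot = sp.simplify(ddt(Phi_dot))                 # u appears here for the first time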
The next step is to substitute the solution of (21) into (9) using H = -1. This gives
u^*(x) = \frac{n_0(x) + n_1(x)\, c}{d_0(x) + d_1(x)\, c}, \qquad (22)
where
n_0 = 2\cos^3\theta - (1 + \epsilon \cos^2\theta)\cos\theta + \left[ 2(1 - 2\epsilon \cos^2\theta) - (1 - 3\epsilon \cos^2\theta)\cos^2\theta \right] v_\theta^2,
n_1 = \left[ 2(1 - 2\epsilon \cos^2\theta) - (1 - 3\epsilon \cos^2\theta)\cos^2\theta \right] v_\theta^2 v_y - \left[ 1 - (2 - \epsilon)\cos^2\theta \right] v_y \cos\theta + \left[ 1 - \epsilon(2 - \cos^2\theta) \right] v_\theta^2 \cos\theta + \left[ 2(2 - \epsilon \cos^2\theta)\cos^2\theta - (3 - \epsilon \cos^2\theta) \right] v_\theta,
d_0 = 2\sin\theta \cos^2\theta,
d_1 = 2(v_\theta + v_y \cos\theta)\sin\theta \cos\theta.
Since the denominator of (22) is given by
d_0(x) + d_1(x)\, c = \sin(2\theta)\left[ \cos\theta + c\,(v_\theta + v_y \cos\theta) \right],
the solution of (22) is singular when θ = kπ/2 (k ∈ ℤ). The constant c in (22) is a free parameter to be optimized. For this problem, the condition (20) along the singular arc (i.e., when enforcing Φ = 0) is given by
\frac{\lambda_4 \sin(2\theta)}{(1 - \epsilon \cos^2\theta)^2} \geq 0, \qquad (27)
which can be checked numerically a posteriori.
For numerical reasons, it may be prudent to describe the optimal control in differential form. To do this, we replace the first row of (21) with the condition $\dddot{\Phi}(x, \lambda, u, \dot{u}) = 0$ (the next time derivative of the switching function, which is the first to involve $\dot{u}$) to obtain
\begin{bmatrix} \ast & \ast & \ast & \ast \\ 0 & 0 & 1 & -\cos\theta \\ \ast & \ast & \ast & \ast \\ \ast & \ast & \ast & \ast \end{bmatrix} \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \\ \lambda_4 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}. \qquad (28)
If a non-trivial solution exists, the matrix on the left-hand side of (28) is singular. Setting the determinant of this matrix to zero gives
\dot{u} = a_0(x) + a_1(x)\, u + a_2(x)\, u^2, \qquad (29)
which is a Riccati equation—the state-dependent coefficients are
a_0 = \frac{1}{2\sin(2\theta)} \left\{ \frac{v_\theta (3\cos\theta - 2 + 5\epsilon) + v_\theta^3 \left[ 3(4 - \epsilon \cos^2\theta) - \epsilon \cos^4\theta - (1 + 14\epsilon)\cos^2\theta + 8\epsilon^2 \right]}{2(1 - \epsilon \cos^2\theta)\cos\theta} + \frac{v_\theta \left[ (2 + 3\epsilon \cos^2\theta)\,\epsilon \cos^2\theta - 5 \right]}{1 - \epsilon \cos^2\theta} - \frac{(4 - \epsilon)(2 - \epsilon)\cos^4\theta - 2(5 - 2\epsilon)\cos^2\theta + 3}{2 v_\theta (1 - \epsilon \cos^2\theta)\cos\theta} \right\}, \qquad (30)
a_1 = \frac{1}{2\sin\theta} \, \frac{v_\theta (4 - 3\cos^2\theta)\cos\theta + (8 - 3\epsilon)\cos^2\theta - 5}{v_\theta (1 - \epsilon \cos^2\theta)}, \qquad (31)
a_2 = \frac{2\cos\theta}{v_\theta (1 - \epsilon \cos^2\theta)}. \qquad (32)
It is worth noting that u * in (22) is a solution of (29), and thus one can use either (22) or (29). The initial condition u 0 on (29) is given by
u_0 = \left. \frac{n_0(x) + n_1(x)\, c}{d_0(x) + d_1(x)\, c} \right|_{x = x(t_i)}, \qquad (33)
where t i is the start time of the singular arc.
In sum, the optimal control along the singular arc is given by either (22) or (29) with its initial condition (33); in either case, c is a parameter to be optimized. The form of the a_0 and a_1 coefficients suggests that (29) will have an escape time when θ = kπ/2 (k ∈ ℤ).
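As a numerical illustration (our own sketch, not the authors' implementation), the singular phase can be propagated by augmenting the state with the control, in the spirit of the multi-phase solution discussed in Section 4. The coefficient functions a0, a1 and a2 below are placeholders for (30)-(32), and pendulum_cart_dynamics is the sketch of (7) given at the end of Section 2.

import numpy as np
from scipy.integrate import solve_ivp

def propagate_singular_phase(x0, u0, t_span, eps, a0, a1, a2):
    """Integrate the augmented state [y, theta, v_y, v_theta, u] through the
    singular phase: the state follows (7) and the control follows the Riccati
    equation (29). a0, a1, a2 are callables implementing (30)-(32)."""
    def rhs(t, z):
        x, u = z[:4], z[4]
        dx = pendulum_cart_dynamics(x, u, eps)       # dynamics (7)
        du = a0(x) + a1(x) * u + a2(x) * u ** 2      # Riccati equation (29)
        return np.append(dx, du)
    return solve_ivp(rhs, t_span, np.append(x0, u0), max_step=1e-3)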

4. Numerical Solution

Four solution protocols are considered with ϵ = 0.5 , with the results compared against those found in [21]. The time-optimal solution is also compared with an energy-based control strategy.

4.1. Baseline

The first solution to the OCP is found numerically using a direct collocation method. The transcription of the OCP into a nonlinear program (NLP) is achieved using GPOPS-II [27], which employs a Legendre–Gauss–Radau (LGR) discretization with mesh refinement. The derivatives are computed through the automatic-differentiation (AD) tool ADIGATOR [28]. The NLP solver is IPOPT [29]. The setup includes a mesh tolerance for GPOPS-II of 10^{-6}, with an error tolerance for the IPOPT solver of 10^{-7}. All simulations are started with the default mesh of GPOPS-II, which consists of 10 mesh fractions and 4 collocation points per fraction. The mesh refinement method employed is the hp-PattersonRao [27].
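The GPOPS-II and ADIGATOR toolchain runs in MATLAB and is not reproduced here. Purely to illustrate the idea of transcribing the OCP into an NLP, a much cruder sketch (our own) is given below; it uses simple trapezoidal collocation and SciPy's SLSQP solver rather than LGR collocation with mesh refinement, and in practice it needs a good initial guess and a more robust NLP solver such as IPOPT to converge reliably.

import numpy as np
from scipy.optimize import minimize

N, eps = 80, 0.5                                  # grid points, mass ratio

def dyn(X, U):
    """Dynamics (7) evaluated column-wise on the grid (X is 4 x N, U has length N)."""
    y, th, vy, vth = X
    den = 1.0 - eps * np.cos(th) ** 2
    return np.array([vy, vth,
                     ((vth ** 2 + np.cos(th)) * eps * np.sin(th) + U) / den,
                     -(eps * vth ** 2 * np.sin(th) * np.cos(th)
                       + np.sin(th) + U * np.cos(th)) / den])

def unpack(z):
    return z[0], z[1:1 + 4 * N].reshape(4, N), z[1 + 4 * N:]

def defects(z):
    """Trapezoidal collocation defects plus the boundary conditions."""
    tf, X, U = unpack(z)
    h = tf / (N - 1)
    F = dyn(X, U)
    d = X[:, 1:] - X[:, :-1] - 0.5 * h * (F[:, 1:] + F[:, :-1])
    bc = np.concatenate([X[:, 0], X[:, -1] - np.array([0.0, np.pi, 0.0, 0.0])])
    return np.concatenate([d.ravel(), bc])

X0 = np.zeros((4, N))
X0[1] = np.linspace(0.0, np.pi, N)                # crude initial guess for theta
z0 = np.concatenate([[5.0], X0.ravel(), np.zeros(N)])
bounds = [(0.1, 10.0)] + [(None, None)] * (4 * N) + [(-1.0, 1.0)] * N
res = minimize(lambda z: z[0], z0, method='SLSQP', bounds=bounds,
               constraints={'type': 'eq', 'fun': defects},
               options={'maxiter': 500})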
The solution is shown in Figure 2a: the optimal control is bang (u = +1 for t < 0.69), bang (u = -1 for 0.69 < t < 0.70), singular (0.70 < t < 1.17), bang (u = -1 for 1.17 < t < 2.72), bang (u = +1 for 2.72 < t < 4.32) and bang (u = -1 for 4.32 < t < 4.915). The maneuver time is t_f^{(i)} = 4.9151 in this case. Not surprisingly, numerical issues are clearly visible along the singular arc. As a side note, the computed solution was used to verify that H ≡ -1 for the entire simulation, and that H_u = 0 between 0.70 and 1.17, i.e., along the singular arc. This is possible because the co-state of the OCP, which is used to evaluate H, can be estimated from the co-state of the associated NLP [30,31].
In the second solution, the OCP is divided into three phases: before, during and after the singular arc (as suggested by the solution in Figure 2a). In the second phase, the optimal control is computed using (29), while the first and third bang phases are computed as before. The switching times between the phases are subject to optimization. In order to compute the singular phase, the state vector was augmented to include the control, i.e., x^{(ii)} = [y, θ, v_y, v_θ, u]^T. To this end, the dynamic system (7) was augmented with (29); the related initial condition was left free to be optimized (u_0 ≈ -0.96 was obtained in this case). The solution is shown in Figure 3a, while the (b) part of the figure shows a side view of the pendulum and cart during the optimal erection; the singular-arc phase is shown in gray. The switching times obtained by the optimization process are 0.6880 s, 0.7079 s, 1.1737 s, 2.7218 s, and 4.3217 s, with the singular arc occurring between 0.7079 s and 1.1737 s. The maneuver time is again t_f^{(ii)} = 4.9151, which is identical to t_f^{(i)} (indeed, the singular control is essentially a low-pass filtered version of the oscillating control shown in the first simulation). The optimal control over the singular arc satisfies (27).
In the third solution, a pure bang-bang solution is pre-imposed, as shown in [21]. In other words, the control is only allowed to be u = ± 1 . Again, the GPOPS-II–ADIGATOR–IPOPT combination is employed, with the switching times as optimization parameters. In this case, Figure 2b shows the solution, which is identical to the one reported in [21], with t f ( i i i ) = 4.9161 , which is greater than t f ( i ) and is thus sub-optimal.
The fourth solution protocol employs a regularized control that was designed to avoid the singular arc. This is a well-known approach which is employed practically in a number of applications including the minimum-time maneuvering of road vehicles [32]. In this case, a Lagrange term is added to the performance index so that
J = t_f + \int_0^{t_f} w_u u^2 \, dt, \qquad (34)
where w u is a control weighting that must be chosen ‘small enough’. As a consequence, the control Hamiltonian becomes
H^{(2)} = H + w_u u^2, \qquad (35)
which is no longer linear in u. It is now possible to solve for the control explicitly using H u = 0 to obtain
u^* = -\frac{1}{2 w_u} \, \frac{\lambda_3 - \lambda_4 \cos\theta}{1 - \epsilon \cos^2\theta}; \qquad (36)
there is no singularity. As a general rule, w_u is chosen so as not to affect the solution significantly, while avoiding the numerical issues associated with the singular arc. The new problem can be solved as in the first scenario, i.e., without dividing the problem into multiple phases. The solution is shown in Figure 4a for w_u = 10^{-3} (which gives t_f^{(iv)} ≈ t_f^{(i)}), while in (b) the solutions with w_u up to 0.1 are shown (t_f is 4.9156 and 4.9347 for w_u = 10^{-2} and w_u = 10^{-1}, respectively).
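In co-state terms, the regularized control (36) amounts to the following evaluation (a sketch with our own function and variable names; the clipping enforces the bound |u| ≤ u_max of the original problem):

import numpy as np

def regularized_control(lam3, lam4, theta, eps, w_u=1e-3, u_max=1.0):
    """Explicit control of (36), saturated at the control bound."""
    u = -(lam3 - lam4 * np.cos(theta)) / (2.0 * w_u * (1.0 - eps * np.cos(theta) ** 2))
    return np.clip(u, -u_max, u_max)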

4.2. Sensitivity to ϵ

The aim of this section is to examine the control structure for different values of the mass ratio in the range 0 ≤ ϵ < 1.
The ϵ = 0 case is of little physical relevance, because it corresponds to either a massless pendulum bob or a cart of infinite mass. The problem can nevertheless be solved in this case, with the result being little more than the limiting case for small ϵ. The ϵ = 1 case corresponds to a massless cart or a bob of infinite mass. In this case, numerical issues arise from singularities in (7) when θ = 0 or θ = π. Neither singularity is avoidable, because they occur, respectively, at the beginning and the end of the pendulum-erection maneuver. For these reasons, these extreme cases are not considered here.
Figure 5 shows the structure of the optimal solution as a function of the model parameter ϵ. A hundred simulations were performed with ϵ ranging from 0 to 0.98 in steps of 0.01, with the corresponding switching times retained. The solution has a pure bang-bang structure with four bangs in the range 0 < ϵ < 0.23; no singular arc occurs when most of the system mass is concentrated in the cart. For ϵ > 0.23, a singular arc arises at t ≈ 1.076 s, splitting the second bang into a singular-bang sub-arc. As ϵ is increased, the solution structure becomes bang-singular-bang, with a lengthening singular sub-arc and with the initial bang-bang sub-arc decreasing in duration. When ϵ > 0.56, the second bang sub-arc vanishes, and the structure remains bang-singular-bang up to ϵ ≈ 0.98, with an ever-increasing singular-arc duration. When ϵ > 0.98, numerical issues relating to the denominator in (7) arise, which terminate the investigation. The results suggest that the greater the mass of the pendulum, the longer the duration of the singular arc.

4.3. Sensitivity to Initial Conditions

Here we investigate briefly the influence of initial conditions on the optimal control solution structure. Since the cart position is a cyclic coordinate [33], the optimal solution is invariant under changes in the initial cart position. As can be seen in Figure 6, the optimal control structure does react to changes in the initial velocity condition. As the cart’s initial velocity increases, the duration of the singular arc decreases—when y ˙ ( 0 ) = 0.2 there is no singular arc, and the solution has a pure bang-bang structure. Changes in the pendulum’s initial angular velocity have a similar effect. When θ ˙ ( 0 ) = 0.1 , the solution contains a singular arc, which then vanishes for θ ˙ ( 0 ) = 0.2 , and a pure bang-bang structure results.

4.4. Comparison to Energy-Based Control Strategy

The aim of this section is to compare the time-optimal solution obtained above with that found using an energy-based controller (see, for example, [16]), which drives the mechanical energy of the pendulum from E_i = -ϵ at t = 0 to E_f = +ϵ at t = t_f. The employed swing-up controller is
u = k_1 (E_f - E)\, \mathrm{sign}(\dot{y}) + k_2\, \mathrm{sign}(y) \log\!\left( 1 - \frac{|y|}{y_{\max}} \right), \qquad (37)
while an LQR controller ensures stability around the straight-up position and is enabled when θ ≥ 0.9π, i.e., in the linear region of the pendulum. In (37), E is the mechanical energy (kinetic plus gravitational) of the pendulum, k_1, k_2 are the controller parameters, and y_max is the bound on the cart position. Here, k_1 = 1 and k_2 = 1.5 are selected by trial and error to ensure a quick swing-up, while y_max = 1. The first term in (37) injects energy into the system (i.e., d(T+V)/dt = u ẏ > 0), while the second term bounds the cart position within [-y_max, +y_max]. The computed control u is saturated to ±u_max.
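A sketch of the swing-up law (37) with the gains quoted above is given below. This is our own code; the current pendulum energy E is passed in rather than computed, since the paper does not report its expression explicitly, and the LQR used near the upright position is not reproduced.

import numpy as np

def swing_up_control(E, y, y_dot, eps, k1=1.0, k2=1.5, y_max=1.0, u_max=1.0):
    """Energy-based swing-up law (37); valid while |y| < y_max.
    E is the current (scaled) pendulum energy and E_f = +eps is the target."""
    E_f = eps
    u = k1 * (E_f - E) * np.sign(y_dot) + k2 * np.sign(y) * np.log(1.0 - abs(y) / y_max)
    return np.clip(u, -u_max, u_max)   # the computed control is saturated to +/- u_max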
Figure 7 shows the numerical solution with ϵ = 0.5 compared with the corresponding time-optimal solution. At the beginning, the pendulum trajectory is close to the optimal one. However, as time increases, the solution becomes sub-optimal. At θ = 0.9π (i.e., t ≈ 4.684 s, see the vertical solid lines), the swing-up maneuver ends and the LQR takes control of the system. At this time, the system is still moving with a non-zero velocity and the cart position is y ≠ 0. Therefore, the LQR controller takes some time to drive the system state to the desired final values (i.e., θ = π, y = ẏ = θ̇ = 0). In contrast, the time-optimal control does not need a stabilizing controller at the end of the maneuver, because the control is able to reach the target final state.

5. Conclusions

We have shown that the solution structure of the minimum-time inverted-pendulum erection problem is more complex than was classically believed. In particular, we have shown that the solution may contain singular sub-arcs. The duration of these sub-arcs varies with both ϵ and the initial velocity conditions. A pre-imposed bang-bang structure is shown to be sub-optimal. It is demonstrated that the singular control is a state-feedback law that depends on the first Lagrange multiplier of the problem, which is a constant to be optimized. The traditional, pragmatic and, as some think, ‘dirty’ use of a regularization Lagrange term in the cost function is shown to be effective for this problem, while also reducing considerably the complexity of the solution process. Finally, the time-optimal solution has been compared against that obtained by using energy-based control for the swing-up maneuver, showing that the latter is sub-optimal.

Author Contributions

Conceptualization, M.M. and D.J.N.L.; methodology, M.M., S.L. and D.J.N.L.; software, M.M. and S.L.; validation, M.M., S.L. and D.J.N.L.; investigation, S.L.; writing—original draft preparation, M.M. and S.L.; writing—review and editing, M.M., S.L. and D.J.N.L.; visualization, S.L.; supervision, M.M. and D.J.N.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Limebeer, D.J.N.; Massaro, M. Dynamics and Optimal Control of Road Vehicles; Oxford University Press: Oxford, UK, 2018. [Google Scholar]
  2. Sharp, R.S. On the stability and control of unicycles. Proc. R. Soc. A 2010, 466, 1849–1869. [Google Scholar] [CrossRef]
  3. Braghin, F.; Cheli, F.; Maldifassi, S.; Melzi, S.; Sabbioni, E. The Engineering Approach to Winter Sports; Springer: Berlin/Heidelberg, Germany, 2016. [Google Scholar]
  4. Hirai, K.; Hirose, M.; Haikawa, Y.; Takenaka, T. The development of Honda humanoid robot. In Proceedings of the 1998 IEEE International Conference on Robotics and Automation (Cat. No.98CH36146), Leuven, Belgium, 20 May 1998; Volume 2, pp. 1321–1326. [Google Scholar] [CrossRef]
  5. Hughes, P.C. Spacecraft Attitude Dynamics; Courier Corporation: Dover, UK, 2012. [Google Scholar]
  6. Ren, T.J.; Chen, T.C.; Chen, C.J. Motion control for a two-wheeled vehicle using a self-tuning PID controller. Control. Eng. Pract. 2008, 16, 365–375. [Google Scholar] [CrossRef]
  7. Prasad, L.; Tyagi, B.; Gupta, H. Optimal control of nonlinear inverted pendulum system using PID controller and LQR: Performance analysis without and with disturbance input. Int. J. Autom. Comput. 2014, 11, 661–670. [Google Scholar] [CrossRef]
  8. Pathak, K.; Franch, J.; Agrawal, S. Velocity and position control of a wheeled inverted pendulum by partial feedback linearization. IEEE Trans. Robot. 2005, 21, 505–513. [Google Scholar] [CrossRef]
  9. Anderson, C. Learning to Control an Inverted Pendulum Using Neural Networks. IEEE Control Syst. Mag. 1989, 9, 31–37. [Google Scholar] [CrossRef]
  10. Huang, C.H.; Wang, W.J.; Chiu, C.H. Design and Implementation of Fuzzy Control on a Two-Wheel Inverted Pendulum. IEEE Trans. Ind. Electron. 2011, 58, 2988–3001. [Google Scholar] [CrossRef]
  11. Huang, J.; Guan, Z.H.; Matsuno, T.; Fukuda, T.; Sekiyama, K. Sliding-mode velocity control of mobile-wheeled inverted-pendulum systems. IEEE Trans. Robot. 2010, 26, 750–758. [Google Scholar] [CrossRef]
  12. Balcazar, R.; Rubio, J.d.J.; Orozco, E.; Andres Cordova, D.; Ochoa, G.; Garcia, E.; Pacheco, J.; Gutierrez, G.J.; Mujica-Vargas, D.; Aguilar-Ibañez, C. The Regulation of an Electric Oven and an Inverted Pendulum. Symmetry 2022, 14, 759. [Google Scholar] [CrossRef]
  13. Villaseñor Rios, C.A.; Luviano-Juárez, A.; Lozada-Castillo, N.B.; Carvajal-Gámez, B.E.; Mújica-Vargas, D.; Gutiérrez-Frías, O. Flatness-Based Active Disturbance Rejection Control for a PVTOL Aircraft System with an Inverted Pendular Load. Machines 2022, 10, 595. [Google Scholar] [CrossRef]
  14. Yoshida, K. Swing-up control of an inverted pendulum by energy-based methods. In Proceedings of the 1999 American Control Conference (Cat. No. 99CH36251), San Diego, CA, USA, 2–4 June 1999; Volume 6, pp. 4045–4047. [Google Scholar] [CrossRef]
  15. Åström, K.; Furuta, K. Swinging up a pendulum by energy control. Automatica 2000, 36, 287–295. [Google Scholar] [CrossRef]
  16. Chatterjee, D.; Patra, A.; Joglekar, H.K. Swing-up and stabilization of a cart–pendulum system under restricted cart track length. Syst. Control. Lett. 2002, 47, 355–364. [Google Scholar] [CrossRef]
  17. Muskinja, N.; Tovornik, B. Swinging up and stabilization of a real inverted pendulum. IEEE Trans. Ind. Electron. 2006, 53, 631–639. [Google Scholar] [CrossRef]
  18. Susanto, E.; Surya Wibowo, A.; Ghiffary Rachman, E. Fuzzy Swing Up Control and Optimal State Feedback Stabilization for Self-Erecting Inverted Pendulum. IEEE Access 2020, 8, 6496–6504. [Google Scholar] [CrossRef]
  19. Solihin, M.I.; Wahyudi; Akmeliawati, R. Self-erecting inverted pendulum employing PSO for stabilizing and tracking controller. In Proceedings of the 2009 5th International Colloquium on Signal Processing & Its Applications, Kuala Lumpur, Malaysia, 6–8 March 2009; pp. 63–68. [Google Scholar] [CrossRef]
  20. Vinodh Kumar, E.; Jerome, J. Robust LQR Controller Design for Stabilizing and Trajectory Tracking of Inverted Pendulum. Procedia Eng. 2013, 64, 169–178. [Google Scholar] [CrossRef]
  21. Bryson, A.E. Dynamic Optimization; Addison-Wesley: Menlo Park, CA, USA, 1999. [Google Scholar]
  22. Mori, S.; Nishihara, H.; Furuta, K. Control of unstable mechanical system Control of pendulum†. Int. J. Control 1976, 23, 673–692. [Google Scholar] [CrossRef]
  23. Mason, P.; Broucke, M.E.; Piccoli, B. Time optimal swing-up of the planar pendulum. In Proceedings of the 2007 46th IEEE Conference on Decision and Control, New Orleans, LA, USA, 12–14 December 2007; pp. 5389–5394. [Google Scholar]
  24. Kelley, H.J.; Kopp, R.E.; Moyer, H.G. Singular extremals. In Topics in Optimization; Academic Press: Cambridge, MA, USA, 1967; Chapter 3; pp. 63–101. [Google Scholar]
  25. Krenner, A.J. The High Order Maximal Principle and Its Application to Singular Extremals. SIAM J. Control Optim. 1977, 15, 256–293. [Google Scholar] [CrossRef]
  26. Robbins, H.M. A Generalized Legendre-Clebsch Condition for the Singular Cases of Optimal Control. IBM J. 1967, 11, 361–372. [Google Scholar] [CrossRef]
  27. Patterson, M.A.; Rao, A.V. GPOPS-II: A Matlab Software for Solving Multiple-Phase Optimal Control Problems Using hp–Adaptive Gaussian Quadrature Collocation Methods and Sparse Nonlinear Programming. ACM Trans. Math. Softw. 2014, 41, 1–37. [Google Scholar] [CrossRef]
  28. Weinstein, M.J.; Rao, A.V. Algorithm 984: ADiGator, a toolbox for the algorithmic differentiation of mathematical functions in MATLAB using source transformation via operator overloading. ACM Trans. Math. Softw. 2017, 44, 1–25. [Google Scholar] [CrossRef]
  29. Wächter, A.; Biegler, L.T. On the Implementation of an Interior-Point Filter Line-Search Algorithm for Large-Scale Nonlinear Programming. Math. Program. 2006, 106, 25–57. [Google Scholar] [CrossRef]
  30. Garg, D.; Patterson, M.A.; Darby, C.L.; Francolin, C.; Huntington, G.T.; Hager, W.W.; Rao, A.V. Direct Trajectory Optimization and Costate Estimation of Finite-Horizon and Infinite-Horizon Optimal Control Problems via a Radau Pseudospectral Method. Comput. Optim. Appl. 2011, 49, 335–358. [Google Scholar] [CrossRef]
  31. Francolin, C.C.; Hager, W.W.; Rao, A.V. Costate Approximation in Optimal Control Using Integral Gaussian Quadrature Collocation Methods. Optim. Control Appl. Methods 2014, 36, 381–397. [Google Scholar] [CrossRef]
  32. Massaro, M.; Limebeer, D.J.N. Minimum-lap time simulation and optimization. Veh. Syst. Dyn. 2021, 59, 1069–1113. [Google Scholar] [CrossRef]
  33. Goldstein, H.; Poole, C.; Safko, J. Classical Mechanics, 3rd ed.; Pearson Education International: London, UK, 2002. [Google Scholar]
Figure 1. Model of a pendulum on a sliding cart.
Figure 2. (a) Single-stage numerical solution, and (b) bang-bang solution [21].
Figure 3. (a) Solution with three phases, and (b) stroboscopic movie with 50 equally spaced time steps; the gray lines denote the singular arc.
Figure 4. (a) Single-stage solution with w_u = 10^{-3}. (b) Optimal controls with weights w_u = 10^{-3} (solid), w_u = 10^{-2} (dashed), and w_u = 10^{-1} (dotted).
Figure 5. Structure of the optimal solution as a function of the model parameter ϵ: (1) bang-bang, (2) bang-bang-singular-bang, and (3) bang-singular-bang. The solid lines represent the switching times between the bangs, while the vertical dash-dot lines denote ϵ = 0.23, ϵ = 0.56, and ϵ = 0.98.
Figure 6. Optimal control structure with non-zero initial velocities: (a) ẏ(0) = 0.1 (top) and ẏ(0) = 0.2 (bottom), and (b) θ̇(0) = 0.1 (top) and θ̇(0) = 0.2 (bottom).
Figure 7. Comparison of the time-optimal solution (solid line) with that found using the energy-based control strategy (dashed line). The vertical solid lines denote the end of the swing-up phase for the energy-based control.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
