Automatic Control: Analysis of the ADRC IWP Python Project

《SWING UP AND BALANCING OF A REACTION WHEEL INVERTED PENDULUM》

https://github.com/B-Paweekorn/Reaction-wheel-inverted-pendulum

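Paper

Once the coordinates are defined, the next step is to express the kinetic energy and the total potential energy in terms of the generalized coordinates. Because energy is a scalar function, the Lagrangian $L$ of the system is obtained simply by subtracting the potential energy from the kinetic energy: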

L = T - V

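where $T$ is the kinetic energy and $V$ is the potential energy.

Because the Lagrangian is a function of the generalized coordinates and their time derivatives, the equations of motion are: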

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_k}\right) - \frac{\partial L}{\partial q_k} = \tau_k

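where $\tau_k$ is the generalized force or torque acting along $q_k$.

2.2 State-Space Modeling

A state-space model is a mathematical representation of a physical system in terms of its inputs, outputs, and states. Its general form is: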

\begin{aligned} & \dot{X} = A X + B U \\ & Y = C X + D U \end{aligned}

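where:
• $X$ is the state vector,
• $Y$ is the output vector,
• $U$ is the input vector,
• $A$ is the state matrix,
• $B$ is the input matrix,
• $C$ is the output matrix,
• $D$ is the feedthrough matrix,
• $\dot{X}$ is the first time derivative of the state vector.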
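The general form above can be stepped forward in time numerically. The following is a minimal sketch using forward-Euler integration and plain Python lists; the double-integrator matrices and the step size are hypothetical values chosen only for illustration, not taken from the project.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def vec_add(a, b):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]


def state_space_step(A, B, C, D, x, u, dt):
    """One forward-Euler step of X' = A X + B U; returns (next X, current Y)."""
    x_dot = vec_add(mat_vec(A, x), mat_vec(B, u))
    y = vec_add(mat_vec(C, x), mat_vec(D, u))
    x_next = [xi + dt * dxi for xi, dxi in zip(x, x_dot)]
    return x_next, y


# Hypothetical example: a double integrator (position, velocity) with unit input.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
C = [[1.0, 0.0]]   # the output is the position
D = [[0.0]]

x = [0.0, 0.0]
for _ in range(1000):
    x, y = state_space_step(A, B, C, D, x, [1.0], dt=0.001)
print(x, y)
```

Forward Euler is the simplest choice; a higher-order integrator would reduce the discretization error for the same step size.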

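2.3 Motor Model

The mechanical model of a direct-current (DC) motor is usually described in terms of the motor torque. In this experiment a DC motor drives the reaction wheel. The torque of a DC motor is proportional to its current. Because the motor is driven by a PWM (pulse-width modulation) voltage driver rather than a current driver, a dynamic model of the motor is needed to estimate and control the torque.

The torque $\tau$ produced by a DC motor is proportional to the current in the rotor winding, as expressed by the electromechanical equation:

\tau = k_t \cdot i

where $k_t$ is the torque constant of the motor.

The back electromotive force (back EMF) $e$ is proportional to the angular velocity of the rotor, as expressed by the electrical equation:

e = k_e \cdot \dot{\theta}

where $k_e$ is the back-EMF constant and $\dot{\theta}$ is the angular velocity of the motor shaft.

Assuming the motor is $100\%$ efficient, the torque constant $k_t$ and the back-EMF constant $k_e$ are equal when both are expressed in SI units. Applying Kirchhoff's voltage law to the motor circuit gives: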

v - R i - L \frac{d i}{d t} - k_e \dot{\theta} = 0
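As an illustration of the voltage equation, the sketch below integrates the current with forward Euler at a fixed shaft speed and compares the result with the value obtained by setting $di/dt = 0$. The function name and all parameter values are hypothetical, chosen only to make the example run.

```python
def motor_current_step(i, v, omega, R, L, k_e, dt):
    """Forward-Euler step of L di/dt = v - R i - k_e * omega."""
    di_dt = (v - R * i - k_e * omega) / L
    return i + dt * di_dt


# Hypothetical parameter values, for illustration only.
R, L, k_e = 2.0, 1e-3, 0.05
v, omega = 12.0, 100.0

i = 0.0
for _ in range(5000):   # 5 ms of simulated time, about 10 electrical time constants L/R
    i = motor_current_step(i, v, omega, R, L, k_e, dt=1e-6)

i_steady = (v - k_e * omega) / R   # current when di/dt = 0
print(i, i_steady)
```

With these values the current settles within a few milliseconds, which is why a mechanical model can often treat the electrical dynamics as instantaneous.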

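2.4 Swing-Up Controller (Bang-Bang Controller)

Swing-up is handled by a bang-bang controller, whose output switches between the upper and lower limits of the input signal. A bang-bang controller can drive the pendulum's motion, but it cannot precisely regulate the pendulum's speed near the upright position, which is critical for stabilization: if the pendulum arrives at the upright position with too high an angular velocity, the reaction wheel may not have enough torque to stop it. Therefore, as the pendulum approaches the upright position, control must be handed over to another controller (such as an LQR controller) for stabilization.

Activation of the bang-bang controller is governed by an energy-based strategy. While the bang-bang controller pumps energy into the pendulum, an energy controller monitors the pendulum's mechanical energy in real time and compares it with the target energy. If the current energy is below the target, the bang-bang controller stays on; otherwise it is switched off and the motor stops.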
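The energy test described above can be sketched as follows. This is not the project's implementation: the function names, the convention that the angle is measured from the upright position, the choice of the upright rest energy as the target, and the sign rule for the torque are all assumptions made for illustration.

```python
import math


def pendulum_energy(theta, theta_dot, inertia, mgl):
    """Mechanical energy with theta measured from upright (assumed convention)."""
    return 0.5 * inertia * theta_dot ** 2 + mgl * math.cos(theta)


def swing_up_torque(theta, theta_dot, inertia, mgl, torque_max):
    """Bang-bang command: full torque while below the target energy, else zero."""
    energy_target = mgl   # assumed target: energy at rest in the upright position
    if pendulum_energy(theta, theta_dot, inertia, mgl) >= energy_target:
        return 0.0        # enough energy: bang-bang controller switched off
    # Assumed sign rule: the pendulum feels the reaction torque -T_m, so the
    # motor torque opposes theta_dot in order to add energy to the pendulum.
    direction = -1.0 if theta_dot >= 0.0 else 1.0
    return direction * torque_max


# Hypothetical values: hanging down and moving slowly, then near upright and fast.
low_energy_cmd = swing_up_torque(math.pi, 0.5, inertia=0.01, mgl=0.5, torque_max=0.2)
high_energy_cmd = swing_up_torque(0.1, 3.0, inertia=0.01, mgl=0.5, torque_max=0.2)
print(low_energy_cmd, high_energy_cmd)
```

The first call returns a saturated command because the energy is below the target; the second returns zero because the target has already been exceeded.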

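2.5 Linear Quadratic Regulator (LQR Controller)

LQR is a mathematical framework for the optimal control of linear time-invariant systems. It achieves optimal control by minimizing a cost function, typically a quadratic function of the state and the control input integrated over time. The controller computes a feedback control law from the current system state so as to keep the system stable and meet the control objective.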
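To make the idea concrete, here is a minimal sketch of LQR for a scalar plant $\dot{x} = a x + b u$ with cost $\int (q x^2 + r u^2)\,dt$, where the algebraic Riccati equation has a closed-form solution. The plant and the weights are hypothetical; the four-state pendulum model would need a matrix Riccati solver instead.

```python
import math


def scalar_lqr_gain(a, b, q, r):
    """Optimal gain k (u = -k x) for x' = a x + b u, cost integral of q x^2 + r u^2.

    Uses the positive root p of the scalar algebraic Riccati equation
    2 a p - (b^2 / r) p^2 + q = 0, then k = b p / r. Requires q > 0, r > 0, b != 0.
    """
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r


a, b = 2.0, 1.0   # hypothetical unstable plant: x' = 2 x + u
k = scalar_lqr_gain(a, b, q=5.0, r=1.0)
closed_loop_pole = a - b * k
print(k, closed_loop_pole)
```

Raising `q` relative to `r` penalizes state error more heavily and pushes the closed-loop pole further into the left half-plane, at the cost of larger control effort.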

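2.6 PID Controller

PID stands for "proportional-integral-derivative" controller, the most widely used controller in dynamic control systems. Its control output is the sum of a proportional term, an integral term, and a derivative term:

u(t) = K_p e(t) + K_i \int_0^t e(\tau) d\tau + K_d \frac{d}{dt} e(t)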

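where:
• $u(t)$ is the control input signal,
• $K_p$ is the proportional gain,
• $K_i$ is the integral gain,
• $K_d$ is the derivative gain,
• $e(t)$ is the error signal,
• $t$ is the current time,
• $\tau$ is the variable of integration.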
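A discrete-time version of the PID law can be sketched as below. The class name, the rectangular integration, and the backward-difference derivative are illustrative choices, not the project's implementation, and the gains are hypothetical.

```python
class PID:
    """Discrete-time PID using rectangular integration and a backward difference."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        """Return the control output for the current error sample."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0   # no derivative estimate on the first sample
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


pid = PID(kp=2.0, ki=1.0, kd=0.5)
u1 = pid.update(error=1.0, dt=0.1)
u2 = pid.update(error=0.5, dt=0.1)
print(u1, u2)
```

A practical controller would usually add output saturation and integrator anti-windup, which are omitted here for brevity.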

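This experiment applies both an LQR controller and a PID controller to regulate the pendulum's angular velocity, in order to compare the output performance and stability of the two controllers.

Methodology

In this project, the system block diagram has two main parts: the controller part (containing the swing-up controller and the balancing controller) and the simulation part, which serves as the measurement tool. The simulation part is further divided into two modules: the reaction wheel inverted pendulum dynamics model and the reaction wheel inverted pendulum simulation model. Together these components form a complete framework for analyzing and validating control strategies.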

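Mathematical Modeling

1. Parameter Definitions

• L1: distance from the pivot to the pendulum's center of mass.
• L2: distance from the pivot to the flywheel mounting point.
• m1, m2: masses of the pendulum and the flywheel.
• θp, θr: angular displacements of the pendulum and the flywheel (the flywheel angle is measured relative to the pendulum).
• I1, I2: moments of inertia of the pendulum and of the flywheel (including the motor).
• Tm: torque applied by the motor (driving the flywheel).
• Td: external disturbance torque.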

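2. Kinetic Energy (K)

The kinetic energy of the system has two parts.

Translational and rotational kinetic energy of the pendulum, including the flywheel mass carried at distance $L_2$: $K_1 = \frac{1}{2}(m_1 L_1^2 + m_2 L_2^2 + I_1)\dot{\theta}_p^2$
Rotational kinetic energy of the flywheel: the flywheel's total angular velocity is $\dot{\theta}_p + \dot{\theta}_r$, so $K_2 = \frac{1}{2}I_2(\dot{\theta}_p + \dot{\theta}_r)^2$

Expanding, the total kinetic energy is: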

K = \frac{1}{2}(m_1 L_1^2 + m_2 L_2^2 + I_1 + I_2)\dot{\theta}_p^2 + I_2 \dot{\theta}_p \dot{\theta}_r + \frac{1}{2}I_2 \dot{\theta}_r^2

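3. Potential Energy (V)

The potential energy of the pendulum and the flywheel is determined by the heights of their centers of mass, with $\theta_p$ measured from the upright position: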

V = (m_1 L_1 + m_2 L_2)g \cos\theta_p

4. Lagrange Equations

With the Lagrangian $L = K - V$, apply the Lagrange equation to each of the generalized coordinates $\theta_r$ and $\theta_p$:

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}_r}\right) - \frac{\partial L}{\partial \theta_r} = Q_r, \quad \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}_p}\right) - \frac{\partial L}{\partial \theta_p} = Q_p

where the generalized forces are:
• $Q_r = T_m$ (the motor torque acts on the flywheel).
• $Q_p = T_d$ (the external disturbance). Because $\theta_r$ is measured relative to the pendulum, the motor torque and its reaction on the pendulum form an internal pair whose virtual work is $T_m \delta\theta_r$, so $T_m$ does not appear in $Q_p$.

L = K - V = \left[ \frac{1}{2}(m_1 L_1^2 + m_2 L_2^2 + I_1 + I_2)\dot{\theta}_p^2 + I_2 \dot{\theta}_p \dot{\theta}_r + \frac{1}{2}I_2 \dot{\theta}_r^2 \right] - (m_1 L_1 + m_2 L_2)g \cos\theta_p

5. Derivation of the Equations

Equation for $\theta_p$

Derivative terms: $\frac{\partial L}{\partial \dot{\theta}_p} = (m_1 L_1^2 + m_2 L_2^2 + I_1 + I_2)\dot{\theta}_p + I_2 \dot{\theta}_r$
$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}_p}\right) = (m_1 L_1^2 + m_2 L_2^2 + I_1 + I_2)\ddot{\theta}_p + I_2 \ddot{\theta}_r$
Potential-energy term, using $(\cos x)' = -\sin x$: $\frac{\partial L}{\partial \theta_p} = (m_1 L_1 + m_2 L_2)g \sin\theta_p$
Substituting into the Lagrange equation: $(m_1 L_1^2 + m_2 L_2^2 + I_1 + I_2)\ddot{\theta}_p + I_2 \ddot{\theta}_r - (m_1 L_1 + m_2 L_2)g \sin\theta_p = T_d \tag{2}$
Combined with equation (1), derived below for $\theta_r$, this gives $(m_1 L_1^2 + m_2 L_2^2 + I_1)\ddot{\theta}_p = (m_1 L_1 + m_2 L_2)g \sin\theta_p - T_m + T_d$, which is where the reaction torque $-T_m$ on the pendulum appears.

Equation for $\theta_r$

Derivative terms:

\frac{\partial L}{\partial \dot{\theta}_r} = \frac{\partial K}{\partial \dot{\theta}_r} = I_2 (\dot{\theta}_p + \dot{\theta}_r)

\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{\theta}_r} \right) = I_2 (\ddot{\theta}_p + \ddot{\theta}_r)

\frac{\partial L}{\partial \theta_r} = 0

Substituting into the Lagrange equation: $I_2 (\ddot{\theta}_p + \ddot{\theta}_r) = T_m \tag{1}$
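The two Lagrange equations are linear in the accelerations, so they can be solved directly. The sketch below takes the generalized forces $Q_p$ and $Q_r$ as inputs and returns $\ddot{\theta}_p$ and $\ddot{\theta}_r$; the function name and the parameter values are hypothetical.

```python
import math


def rwip_accelerations(theta_p, q_p, q_r, m1, m2, l1, l2, i1, i2, g=9.81):
    """Solve the theta_p and theta_r Lagrange equations for the accelerations.

    q_p and q_r are the generalized forces on theta_p and theta_r.
    """
    inertia_p = m1 * l1 ** 2 + m2 * l2 ** 2 + i1
    gravity = (m1 * l1 + m2 * l2) * g
    # Subtracting the theta_r equation from the theta_p equation
    # eliminates theta_r_ddot and leaves a single equation for theta_p_ddot.
    theta_p_ddot = (gravity * math.sin(theta_p) + q_p - q_r) / inertia_p
    theta_r_ddot = q_r / i2 - theta_p_ddot
    return theta_p_ddot, theta_r_ddot


# Hypothetical parameter values, for illustration only.
params = dict(m1=0.1, m2=0.2, l1=0.1, l2=0.2, i1=1e-3, i2=5e-4)
at_upright = rwip_accelerations(0.0, q_p=0.0, q_r=0.0, **params)
tilted = rwip_accelerations(0.1, q_p=0.0, q_r=0.0, **params)
print(at_upright, tilted)
```

With no applied forces the upright position is an equilibrium, while a small tilt produces an acceleration away from upright, which is the instability the balancing controller has to counter.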

DC Motor Dynamics

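Brushed DC Motor Parameters

• Vin - input voltage
• R - motor resistance
• L - motor inductance
• i - motor current
• B - motor damping coefficient
• J - moment of inertia of the motor rotor
• ke - back-EMF constant
• kt - torque constant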

Electrical Dynamics

Assuming the inductance effect is negligible ($L \ll R$), the voltage balance equation simplifies to:

V_{\text{in}} = R \, i + k_e \dot{\theta}_r

Solving for the armature current $i$:

i = \frac{V_{\text{in}} - k_e \dot{\theta}_r}{R}

where:
• $\dot{\theta}_r$ is the angular velocity of the motor rotor
• $k_e \dot{\theta}_r$ is the back-EMF voltage

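Mechanical Dynamics

The motor torque $T_m$ is related to the armature current $i$ through the torque constant $k_t$:

T_m = k_t \, i = k_t \frac{V_{\text{in}} - k_e \dot{\theta}_r}{R}

The torque balance equation is:

T_m = J \ddot{\theta}_r + B \dot{\theta}_r

where:
• $\ddot{\theta}_r$ is the angular acceleration of the motor rotor
• $B \dot{\theta}_r$ is the damping torque

This derivation shows that the motor torque $T_m$ depends on both the electrical input and the mechanical motion.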
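The torque expression can be evaluated directly. The sketch below computes $T_m$ at standstill and at the speed where the back EMF equals the supply voltage; the function name and parameter values are hypothetical.

```python
def motor_torque(v_in, omega_r, R, k_t, k_e):
    """Torque from T_m = k_t * (V_in - k_e * omega_r) / R (inductance neglected)."""
    return k_t * (v_in - k_e * omega_r) / R


# Hypothetical parameter values, for illustration only.
R, k_t, k_e = 2.0, 0.05, 0.05
stall_torque = motor_torque(12.0, 0.0, R, k_t, k_e)
# Speed at which the back EMF cancels the supply voltage, so no torque remains.
zero_torque = motor_torque(12.0, 12.0 / k_e, R, k_t, k_e)
print(stall_torque, zero_torque)
```

The available torque falls linearly with wheel speed, so a reaction wheel that is already spinning fast has less torque left for balancing.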

Controllers

LQR Controller: A linear quadratic regulator designed to stabilize the pendulum in the upright position.

Linearized dynamics model
- Part RWIP

  For small $\theta_p$, $\sin\theta_p \approx \theta_p$:

$\ddot{\theta}_p = \frac{m_{1}gL_{1}\theta_{p} + m_{2}gL_{2}\theta_{p} - T_{m} + T_{d}}{m_{1}L_{1}^{2}+m_{2}L_{2}^{2}+I_{1}}$

$\ddot{\theta}_r = \frac{T_{m}}{I_{2}} - \frac{m_{1}gL_{1}\theta_{p} + m_{2}gL_{2}\theta_{p} - T_{m} + T_{d}}{m_{1}L_{1}^{2}+m_{2}L_{2}^{2}+I_{1}}$
- Part Motor

  Assuming the inductance is negligible (L << R):

$V_{in} = R i + k_e \dot{\theta}_r$

$T_{m} = k_t i$

State space

The following equations are valid around the operating point θp = 0.

\left[\begin{array}{c} \dot{\theta}_p \\ \ddot{\theta}_p \\ \dot{\theta}_r \\ \ddot{\theta}_r \end{array}\right]=\left[\begin{array}{cccc} 0 & 1 & 0 & 0 \\ \frac{\left(m_1 L_1+m_2 L_2\right) g}{m_1 L_1^2+m_2 L_2^2+J} & 0 & 0 & \frac{k_t k_e}{\left(m_1 L_1^2+m_2 L_2^2+J\right) R} \\ 0 & 0 & 0 & 1 \\ -\frac{\left(m_1 L_1+m_2 L_2\right) g}{m_1 L_1^2+m_2 L_2^2+J} & 0 & 0 & -\left(\frac{m_1 L_1^2+m_2 L_2^2+2 J}{\left(m_1 L_1^2+m_2 L_2^2+J\right) J}\right)\left(\frac{k_t k_e}{R}\right) \end{array}\right]\left[\begin{array}{c} \theta_p \\ \dot{\theta}_p \\ \theta_r \\ \dot{\theta}_r \end{array}\right]+\left[\begin{array}{c} 0 \\ -\frac{k_t}{\left(m_1 L_1^2+m_2 L_2^2+J\right) R} \\ 0 \\ \left(\frac{m_1 L_1^2+m_2 L_2^2+2 J}{\left(m_1 L_1^2+m_2 L_2^2+J\right) J}\right)\left(\frac{k_t}{R}\right) \end{array}\right] V_{in}
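The state and input matrices can be assembled numerically from the physical parameters. The sketch below assumes the state ordering $[\theta_p, \dot{\theta}_p, \theta_r, \dot{\theta}_r]$, a single inertia $J$ as in the matrix entries, and a negative input coefficient in the $\ddot{\theta}_p$ row, which follows from substituting $T_m = k_t (V_{in} - k_e \dot{\theta}_r)/R$ into the linearized $\ddot{\theta}_p$ equation. The function name and parameter values are hypothetical.

```python
def build_state_space(m1, m2, l1, l2, j, k_t, k_e, r, g=9.81):
    """Return (A, B) for the state [theta_p, theta_p_dot, theta_r, theta_r_dot]."""
    d = m1 * l1 ** 2 + m2 * l2 ** 2 + j   # shared denominator
    grav = (m1 * l1 + m2 * l2) * g / d
    wheel = (d + j) / (d * j)             # (m1 L1^2 + m2 L2^2 + 2J) / (d J)
    A = [
        [0.0, 1.0, 0.0, 0.0],
        [grav, 0.0, 0.0, k_t * k_e / (d * r)],
        [0.0, 0.0, 0.0, 1.0],
        [-grav, 0.0, 0.0, -wheel * k_t * k_e / r],
    ]
    B = [0.0, -k_t / (d * r), 0.0, wheel * k_t / r]
    return A, B


# Hypothetical parameter values, for illustration only.
A, B = build_state_space(m1=0.1, m2=0.2, l1=0.1, l2=0.2, j=5e-4,
                         k_t=0.05, k_e=0.05, r=2.0)
for row in A:
    print(row)
print(B)
```

The positive gravity entry in the second row is what makes the open-loop system unstable around the upright position.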

PID Controller: The stabilizing controller used for comparison with LQR.

Transfer function

$\Large\frac{\theta_{p}(s)}{\tau_{m}(s)}=\frac{\frac{s}{-J-m_{2}L_{2}^{2}}}{s^{3}+\left(\frac{B}{I_{1}}+\frac{B+d_{p}}{J+m_{2}L_{2}^{2}}\right)s^{2}-\left(\frac{\left(m_{1}L_{1}+m_{2}L_{2}\right)g}{\left(J+m_{2}L_{2}^{2}\right)I_{1}}-\frac{B\cdot d_{p}}{\left(J+m_{2}L_{2}^{2}\right)I_{1}}\right)s-\frac{\left(m_{1}L_{1}+m_{2}L_{2}\right)Bg}{\left(J+m_{2}L_{2}^{2}\right)I_{1}}}$

Root Locus Design

The root locus is used to predict how the closed-loop poles, and therefore the system's characteristics, move as the gain (Kp) is adjusted.

Root locus of the system with the default parameter set: it has one zero (s = 0) and three poles (s = -74.93; s = -3.88e-3; s = 73.53).

Closed Loop Root Locus

Here G(s) is the gain Kp, which can be adjusted to place the closed-loop poles in a stable location. To plot the resulting root locus, set Stabilize_Controller to "PID" and plot_rootlocus to "True" in param.py. In the plot, you can click a pole position to read the gain Kp that places the pole there and adjust the system characteristics accordingly.

Bang-bang Controller: The swing-up routine hands over to the stabilizing routine when the pendulum angle is between -25 and 25 degrees.

Brake Controller: Reduces the energy of the RWIP when it has too much energy to be stabilized.
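The hand-over between these routines can be sketched as a small supervisor. This is an illustrative assumption rather than the project's logic: the function name, the energy threshold that triggers braking, the priority given to braking, and the convention that the angle is zero at upright are all hypothetical. Only the 25-degree window comes from the description above.

```python
import math


def select_controller(theta_p, energy, energy_limit, window_deg=25.0):
    """Pick a control routine from the pendulum angle (rad, 0 = upright) and energy."""
    if energy > energy_limit:   # assumed trigger and priority for the brake controller
        return "brake"
    # Wrap the angle into [-pi, pi) so the window test still works after full turns.
    wrapped = (theta_p + math.pi) % (2.0 * math.pi) - math.pi
    if abs(math.degrees(wrapped)) <= window_deg:
        return "stabilize"
    return "swing_up"


print(select_controller(math.radians(10.0), energy=0.1, energy_limit=1.0))
print(select_controller(math.pi, energy=0.1, energy_limit=1.0))
print(select_controller(math.pi, energy=2.0, energy_limit=1.0))
```

Keeping the mode decision in one function makes the switching thresholds easy to inspect and tune separately from the controllers themselves.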

Sound Generation

The simulation incorporates sound generation related to the speed of the reaction wheel. This feature adds an auditory element to the simulation, enhancing the user experience.

Compare PID controller and LQR controller

In this project, we explore and compare the performance of two different control strategies: PID (Proportional-Integral-Derivative) controller and LQR (Linear Quadratic Regulator) controller.

Linear quadratic regulator


Error (deg)	Settling time (s)	Power (W)
5	0.66	0.6
6	0.73	1.01
7	0.85	1.75
8	1.07	3.5
9	can't stabilize	can't stabilize

Max Disturbance : 9.32 Nm

PID : Kp = 500 (chosen from the root locus)

Error (deg)	Settling time (s)	Power (W)
5	20.43	10.06
6	21.02	12.17
7	21.37	13.95
8	21.59	15.96
9	can't stabilize	can't stabilize

Max Disturbance : 8.05 Nm
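To put the two tables side by side, the short sketch below computes the PID-to-LQR ratios of settling time and power for each error angle where both controllers stabilize. The numbers are copied from the tables above; the variable names are illustrative.

```python
# Rows copied from the two tables: error (deg) -> (settling time in s, power in W).
lqr = {5: (0.66, 0.6), 6: (0.73, 1.01), 7: (0.85, 1.75), 8: (1.07, 3.5)}
pid = {5: (20.43, 10.06), 6: (21.02, 12.17), 7: (21.37, 13.95), 8: (21.59, 15.96)}

ratios = {}
for error_deg in sorted(lqr):
    time_ratio = pid[error_deg][0] / lqr[error_deg][0]
    power_ratio = pid[error_deg][1] / lqr[error_deg][1]
    ratios[error_deg] = (time_ratio, power_ratio)
    print(f"{error_deg} deg: PID/LQR settling time = {time_ratio:.1f}x, "
          f"power = {power_ratio:.1f}x")
```

For these rows the PID controller with Kp = 500 settles roughly 20 to 31 times more slowly than LQR and draws several times more power.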

PID : Kp = 215800 (chosen from the root locus). Note that even when an unstable pole is chosen, the system remains stable, because the hardware limit is active and the controller then behaves much like a fuzzy logic controller. To observe the instability, disable the hardware limit by setting MotorLimit to False in param.py.

Conclusion

	Stabilization boundary
LQR	Can stabilize in every position
PID	Can stabilize only within a small boundary

PID Controller

The PID controller is a widely used feedback controller built from three components: proportional, integral, and derivative. Here is a brief overview of each component:

Proportional (P): Reacts to the current error.
Integral (I): Reacts to the accumulation of past errors.
Derivative (D): Predicts future errors based on the rate of change.

Advantages of PID:

Simplicity and ease of implementation.
Effectiveness in a wide range of systems.
Intuitive tuning parameters for performance optimization.

Considerations:

Tuning may be required for optimal performance in different systems.
Limited capability to handle complex or nonlinear systems.

LQR Controller

The LQR controller is designed based on the principles of optimal control theory. It minimizes a cost function that combines both the state and the control input, making it suitable for linear, time-invariant systems.

Advantages of LQR:

Optimal control solution for linear systems.
Ability to handle systems with multiple inputs and outputs.
Incorporates a mathematical model for optimal performance.

Considerations:

Requires a good understanding of the system dynamics for effective modeling.
Applicability is limited to linear (or linearized) systems.

Acknowledgments

This project is part of the coursework for FRA333 Robot Kinematics at the Institute of Field Robotics, King Mongkut’s University of Technology Thonburi. Special thanks to the course instructors for their guidance and support.

Feel free to explore, modify, and extend this project for educational and research purposes.
