Whenever you measure something, there can be multiple sources of error: the measuring device, the thing being measured, and the person or agent making the observation. If you want to be precise with your estimates, it's important to understand where the error is coming from. In this post I show a gauge study I did on 2 sets of scales for measuring the weights of bags of M&Ms.
You can find the data I collected here:
gauge_study_kitchen_scales.csv
gauge_study_lab_scales.csv
The advertised weight for this small bag of M&Ms is 36 grams. However when I weighed it on my kitchen scales, I got about 37 grams.
This raises several questions:

- Is the bag heavier than advertised?
- Are my kitchen scales poor?
- Did I place it on the scales incorrectly?
Whenever something is measured, or “observed”, there can be natural variation in:
The individual observation of a part can be more explicitly written as the sum of these factors:
\[y_{ijk} = \bar{y}_{\cdot\cdot\cdot} + o_i + p_j + (o\times p)_{ij} + e_{ijk} \tag{1}\]where:
Importantly, we assume that all the deviations follow zero-mean Gaussian distributions:
\[\begin{align} o&\sim\mathcal{N}(0,\sigma_o^2) \tag{2a} \\ p&\sim\mathcal{N}(0,\sigma_p^2) \tag{2b} \\ (o\times p)&\sim\mathcal{N}(0,\sigma_{op}^2) \tag{2c} \\ e&\sim\mathcal{N}(0,\sigma_e^2). \tag{2d} \end{align}\]The diagram below shows the composition of these error components.
(Random) measurement error can be decomposed into repeatability and reproducibility.
We can use a gauge study to separate out these effects and answer some important questions about:
In a crossed gauge study we have:
By having multiple observers measure the same parts, we can filter out the bias that any individual may have, and account for the interaction of operator and part.
In a crossed gauge study, every observer measures every object or part.
Every observer measures every part multiple times. It is also important to randomise the order in which observations are taken to reduce any bias or systematic effects.
The table below shows how a data collection form for a gauge study should look. Programs like Minitab can generate these automatically for you, though it is simple enough to create one in a spreadsheet.
| Standard Order | Randomised Order | Part Number | Observer | Replicate | Measurement |
|---|---|---|---|---|---|
| 3 | 1 | 3 | A | 1 | 38 |
| 21 | 2 | 1 | B | 1 | 37 |
| 14 | 3 | 4 | A | 2 | 38 |
| 13 | 4 | 3 | A | 2 | 37 |
| 53 | 5 | 3 | C | 2 | 37 |
We begin by calculating the mean of all measurements:
\[\bar{y}_{\cdot\cdot\cdot} = \frac{1}{n_o n_p n_r}\sum_{i=1}^{n_o} \sum_{j=1}^{n_p}\sum_{k=1}^{n_r} y_{ijk} \tag{3}\]This becomes our reference or anchor point for separating out the repeatability and reproducibility.
We then compute operator means:
\[\bar{y}_{i\cdot\cdot} = \frac{1}{n_p n_r}\sum_{j=1}^{n_p}\sum_{k=1}^{n_r} y_{ijk}. \tag{4}\]Then we compute the (sample) variance of the operator means:
\[s_o^2 = \frac{1}{n_o - 1}\sum_{i=1}^{n_o}\left(\bar{y}_{i \cdot \cdot } - \bar{y}_{\cdot \cdot \cdot}\right)^2. \tag{5}\]Next we compute the mean for each part:
\[\bar{y}_{\cdot j\cdot} = \frac{1}{n_o n_r}\sum_{i=1}^{n_o} \sum_{k=1}^{n_r} y_{ijk} \tag{6}\]and its accompanying variance:
\[s_p^2 = \frac{1}{n_p - 1} \sum_{j=1}^{n_p} \left(\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}\right)^2. \tag{7}\]Next we compute the mean effect between operators and parts:
\[\bar{y}_{ij\cdot} = \frac{1}{n_r} \sum_{k=1}^{n_r} y_{ijk} \tag{8}\]Now, computing the variance for the operator-by-part interaction is complicated. But consider that, when we take the average, we eliminate measurement error $e_{ijk}$ since its mean is zero:
\[\bar{y}_{ij\cdot} = \bar{y}_{\cdot\cdot\cdot} + o_i + p_j + (o\times p)_{ij}. \tag{9}\]We may then rearrange Eqn. (9) to obtain: \(\begin{align} (o\times p)_{ij} &\approx \bar{y}_{ij\cdot} - \overbrace{(\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot \cdot \cdot})}^{o_i} - \overbrace{(\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot \cdot \cdot})}^{p_j} - \bar{y}_{\cdot \cdot \cdot} \tag{10a}\\ &= \bar{y}_{ij\cdot} - \bar{y}_{i\cdot \cdot} - \bar{y}_{\cdot j \cdot} + \bar{y}_{\cdot \cdot \cdot} \tag{10b} \end{align}\)
Then the variance is computed as:
\[s_{op}^2 = \frac{1}{(n_o-1)(n_p -1)} \sum_{i=1}^{n_o}\sum_{j=1}^{n_p} \underbrace{(\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j \cdot} + \bar{y}_{\cdot \cdot \cdot})}_{(o\times p)_{ij}} {}^2 . \tag{11}\]Finally we can compute the instrument errors as the difference of any individual observation from the operator-by-part mean: \(e_{ijk} = y_{ijk} - \bar{y}_{ij\cdot}. \tag{12}\)
The interpretation here is that $\bar{y}_{ij\cdot}$ averages out any errors from individual operators or parts.
We compute the (sample) variance of the instrument error as:
\[s_e^2 = \frac{1}{n_o n_p (n_r - 1)} \sum_{i=1}^{n_o} \sum_{j=1}^{n_p} \sum_{k=1}^{n_r} \left(y_{ijk} - \bar{y}_{ij\cdot}\right)^2. \tag{13}\]Now, having computed the variances for each source of error, we may determine the gauge repeatability and reproducibility.
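The whole derivation in Eqns (3) to (13) can be programmed in a few lines. Below is a minimal sketch using NumPy; it assumes the measurements have already been reshaped into a 3D array of shape (operators, parts, replicates), which is not how the raw CSVs above are laid out, so some reshaping would be needed first.

```python
import numpy as np

def gauge_variances(y):
    """Crossed gauge study variance components, Eqns (3)-(13).
    y has shape (n_o, n_p, n_r): operators x parts x replicates."""
    n_o, n_p, n_r = y.shape
    y_bar = y.mean()                # grand mean, Eqn (3)
    y_op = y.mean(axis=(1, 2))      # operator means, Eqn (4)
    y_part = y.mean(axis=(0, 2))    # part means, Eqn (6)
    y_cell = y.mean(axis=2)         # operator-by-part means, Eqn (8)

    s_o2 = ((y_op - y_bar) ** 2).sum() / (n_o - 1)        # Eqn (5)
    s_p2 = ((y_part - y_bar) ** 2).sum() / (n_p - 1)      # Eqn (7)
    # interaction deviations, Eqn (10b)
    interaction = y_cell - y_op[:, None] - y_part[None, :] + y_bar
    s_op2 = (interaction ** 2).sum() / ((n_o - 1) * (n_p - 1))          # Eqn (11)
    s_e2 = ((y - y_cell[:, :, None]) ** 2).sum() / (n_o * n_p * (n_r - 1))  # Eqn (13)
    return s_o2, s_p2, s_op2, s_e2
```
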
The total variance of our gauge study $s^2$ is the sum of all the individual variances in which:
The sum of the repeatability and reproducibility is, as you would have guessed, the Gauge R&R.
📝 NOTE: Programs like Minitab can perform this analysis for you. But, being a masochist, I learnt and programmed the math myself.
Normally when you perform a gauge study with a program like Minitab it prints out statistical information, including something called “number of distinct categories”.
According to ChatGPT, and a few sources I read on the internet, this is computed as:
\[NDC = \frac{\sqrt{2} s_p}{s_{grr}}. \tag{15}\]A value of $NDC \ge 5$ is considered quite good.
Try as I might, I could not reverse engineer this equation, nor make any sense of the $\sqrt{2}$ on the numerator. From what I could determine, this is simply a heuristic.
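Still, the heuristic is trivial to compute. A sketch is below; the truncation-to-integer convention follows what statistical packages typically report, though conventions vary between sources.

```python
import math

def ndc(s_p, s_grr):
    """Number of distinct categories, Eqn (15).
    s_p: part standard deviation; s_grr: Gauge R&R standard deviation.
    The result is commonly truncated to an integer."""
    return int(math.sqrt(2) * s_p / s_grr)
```
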
In the following section I will propose a more mathematically sound metric.
For a process that produces goods or services, the capability is defined as the upper specification limit (USL) minus the lower specification limit (LSL), divided by 6 standard deviations of the process variation:
\[C_p = \frac{USL - LSL}{6\sigma} \tag{16}\]A good process has $C_p \approx 1$. As shown in the diagram below, the specification limits are thus $\pm 3\sigma$ either side of the mean, or 99.7% of the process variation is within the specification limits.
A $6\sigma$ process has upper and lower specification limits 3 standard deviations either side of the mean $\mu$.
We can extend this idea to our gauge. We can take the ratio of the process standard deviation over 6 times the gauge R&R:
\[C_{gauge} = \frac{s_p}{6\cdot s_{grr}} \tag{17}\]That is, when $C_{gauge} = 1$, the process variation is 6 times larger than the Gauge R&R. A good gauge will have $C_{gauge} \ge 1$.
A good gauge will have at most 1/6 of the process variation.
Given the gauge variance, i.e. the uncertainty in our measurement, what is the minimum value $\Delta p$ that it can detect? If we assume a Normal distribution, then we can use a z-score to determine different confidence levels.
For example, a z-score of 1.96 equates to 95% probability, or confidence. We can solve the following equation to find the minimum detectable value with 95% confidence:
\[\begin{align} z = \frac{\Delta p}{s_{grr}} &= 1.96 \tag{18a}\\ \Delta p &= 1.96 \cdot s_{grr} \tag{18b} \end{align}\]We can do this for other values as well.
| Z-Score | Confidence Level |
|---|---|
| 1.645 | 90% |
| 1.96 | 95% |
| 2.576 | 99% |
Any difference smaller than this resolution cannot be confidently distinguished from measurement error.
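Putting Eqn (18) and the table together, a small helper (the names are my own) computes the minimum detectable difference at each confidence level:

```python
# Two-sided z-scores for common confidence levels
Z_SCORES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}

def min_detectable(s_grr, confidence=0.95):
    """Minimum detectable difference, Eqn (18), assuming
    normally distributed gauge error."""
    return Z_SCORES[confidence] * s_grr
```
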
Below are gauge studies conducted on 2 different sets of scales:
The gauge study consisted of measuring:
for a total of $10 \times 3 \times 2 = 60$ measurements. Measurement order was randomised for each study to reduce bias.
If you want to perform your own data analysis, you can find all the raw data in CSV format for:
I bought my kitchen scales from Kasanova when I was living in Italy. The resolution on the digital display is a minimum of 1g.
Bilancia elettronica Bambù da cucina, portata 5 kg
Below is a stem and leaf plot of the measurements taken during the gauge study:
| Stem | Leaves |
|---|---|
| 3 | 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9 |
| 4 | 0 0 0 |
The median value is 37g; this is 1g higher than advertised. There is also an outlier of 40g.
The table below shows the results of the gauge study:
The estimated gauge capability $C_{gauge} \approx 0.2 < 1$ which is not good. Likewise, the number of distinct categories was estimated to be $2 < 5$ which again, is not good.
The estimated resolution is about 1.4 grams (99% confidence), so probably closer to 2g if we were to be conservative and round up.
The Pareto chart below shows the cumulative contribution of each source of measurement variance. There is significant variance coming from the scales themselves; the repeatability.
When we plot individual observations by operator we can see that they line up with the 1g resolution of the scales themselves. Not much insight here.
Plotting the observations by part reveals another story. Despite the low resolution of the scales, the part measurements are quite inconsistent across operators. Look at bag no. 7, for example: there was a 2g discrepancy between Operator A and Operator C. Operators weren't even able to measure the same part consistently.
Conclusion: These scales are not good if we want to control the weight of M&M bags in a production line!
The lab scales are from Kern & Sohn, and are about 20 times more expensive (thankfully I didn’t have to pay for them!).
Kern PCB Economy Precision Balance
The digital display has a resolution of 0.001g (remember this for later).
The stem and leaf plot below shows the distribution of measurements obtained during the gauge study. Immediately we can see, due to the higher resolution, a wider distribution of values. But interestingly, the median still lies in the 37g range:
| Stem | Leaves |
|---|---|
| 37 | .136 .138 .138 .139 .140 .142 .143 .146 .146 .146 .146 .148 .507 .511 .516 .521 .523 .523 .532 .533 .533 .537 .542 .572 .842 .849 .850 .853 .854 .855 .885 .885 .894 .894 .894 .898 .945 .956 .962 .962 .964 .974 |
| 38 | .117 .117 .118 .119 .121 .122 |
| 39 | .429 .429 .433 .436 .442 .444 .475 .476 .476 .485 .486 .489 |
The table below shows the results of the gauge study:
The estimated gauge capability is $C_{gauge} \approx 20 \gg 1$, so this is a very good gauge. The number of distinct categories is also 170 $\gg$ 5, which supports this.
My estimated resolution of the gauge is 0.02g (99% confidence), so the 3rd decimal place on the display exceeds the scales' effective resolution.
The Pareto chart below shows that, within the study, all the variance in the measurements was due to differences in the bags themselves. This is in stark contrast to the kitchen scales.
When we look at observations by operator, we get consistent distributions:
And when we examine operator-by-part, we can see that all operators are measuring each part consistently. So, unlike the kitchen scales, this measuring device does not appear to be susceptible to operator idiosyncrasies.
Conclusions: These scales are extremely precise. Probably too precise for weighing bags of M&Ms. It means we could use a potentially cheaper set of scales for controlling production.
A measurement or observation of a quantity can have multiple sources of variance:
By performing a systematic gauge study we can isolate these sources of error.
We have also seen that the interaction between operator and the measuring device can have an effect on measurement variance, as evinced by the study of my kitchen scales.
Moreover, it is important to select a gauge that is appropriate to the measuring task. My kitchen scales are sufficient for cooking at home, but probably not for a production line. Conversely, the lab scales are probably too precise for such a task.
It seems that Mars Inc. (who produce M&Ms) are over-filling their bags. Value for money!
I also now have a lot of chocolate to eat.
Statistics can be fun, and delicious.
I made the graphs for this post using my newly released tufteplotlib package for Python.
Many real autonomous systems are nonlinear, so we need more sophisticated nonlinear control methods to regulate them. In this article I start by showing how energy gives a much easier, and intuitive approach for reasoning about the stability of dynamic systems. Using this as a framework, I then introduce Lyapunov stability as a method for nonlinear analysis. Finally, I apply it to quaternions for orientation feedback control.
The swinging pendulum on a grandfather clock moves back and forth in perpetuity (well, almost). Normally in control theory, when we talk about stability, we mean states where the system is not moving, or, in the case of trajectory tracking, where the tracking error converges to zero. But a swinging pendulum isn't exactly unstable. Its rhythmic motion is deliberately controlled at a rate of 1Hz. How can we reason about this kind of stability?
The pendulum of a grandfather clock swings back-and-forth, consistently, at 1Hz
If we took a Newtonian approach to the pendulum we would write the equations of motion as:
\[\begin{align} \overbrace{ml^2}^{I}\ddot{q} &= -mgl\cdot\sin(q) \tag{1a} \\ \ddot{q} &= -\tfrac{g}{l}\cdot\sin(q) \tag{1b} \end{align}\]where:
Physical modeling of a swinging pendulum.
Now, at this point, depending on how much of a masochist you are, you can solve this nonlinear differential equation as:
\[q(t) = 2\cdot\arcsin\left(k\cdot\operatorname{sn}(\omega t, k)\right) \tag{2}\]where:
Or we can do what most lazy academics do and abstract away all utility by assuming $\sin(q) \approx q$ when $q \approx 0$ such that:
\[\begin{align} \ddot{q} &\approx -\tfrac{g}{l}q \tag{3a}\\ \Longrightarrow q(t) &= A\cdot\cos(\omega t+\phi) \tag{3b} \end{align}\]where $A$ and $\phi$ are obtained from the initial conditions $q(0),~\dot{q}(0)$.
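As a sanity check, the small-angle solution in Eqn (3b) can be coded directly, with $A$ and $\phi$ recovered from the initial conditions (a sketch; the function name is my own):

```python
import math

def small_angle_q(t, q0, qd0, g=9.81, l=1.0):
    """Small-angle pendulum solution q(t) = A*cos(w*t + phi), Eqn (3b),
    with A and phi determined from q(0) = q0 and qdot(0) = qd0."""
    w = math.sqrt(g / l)
    A = math.hypot(q0, qd0 / w)       # amplitude from initial conditions
    phi = math.atan2(-qd0 / w, q0)    # phase from initial conditions
    return A * math.cos(w * t + phi)
```
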
Either way, we end up with some complicated, trigonometric functions that:
Clearly, brute forcing mathematics won’t get us anywhere, which signifies we should change strategy. There is another physics paradigm we can appeal to instead: energy. The total energy in the system is:
\[E = \tfrac{1}{2}ml^2\dot{q}^2 + mgl\left(1 - \cos(q)\right) \tag{4}\]such that its time derivative is:
\[\begin{align} \dot{E} &= ml^2\dot{q}\ddot{q} + mgl\dot{q}\cdot \sin(q) \tag{5a} \\ &= -mgl\dot{q}\cdot\sin(q) + mgl\dot{q}\cdot\sin(q) \tag{5b}\\ &= 0. \tag{5c} \end{align}\]From the conservation of energy, its time derivative is zero. And, if we were to add a tiny bit of damping $b$:
\[ml^2\ddot{q} = -mgl\cdot\sin(q) - b\cdot\dot{q} \tag{6}\]we would arrive at:
\[\dot{E} = -b\cdot\dot{q}^2 \le 0 ~\forall\dot{q}. \tag{7}\]which is non-increasing. We can conclude that a system is stable if:
The combination of $q$ and $\dot{q}$ is known as state space, and is very common in system dynamics. But Sir William Rowan Hamilton took an alternative approach using $q$ (the configuration), and momentum $p$. Combined, these form phase space. The momentum for the pendulum is:
\[p = ml^2\dot{q}. \tag{8}\]Furthermore, the Hamiltonian (i.e. the total energy) of the system is:
\[\mathcal{H}(p,q) = \frac{1}{2}\frac{p^2}{ml^2} + mgl(1-\cos(q)). \tag{9}\]Since the Hamiltonian takes two inputs and maps them to a single (positive) output $\mathcal{H}:\mathbb{R}\times\mathbb{R}\to\mathbb{R}^+$, we can visualise this as a 3D surface. Moreover, the time derivatives of the phase space coordinates are just partial derivatives of the Hamiltonian:
\[\dot{q} = \frac{\partial \mathcal{H}}{\partial p}~,~ \dot{p} = -\frac{\partial\mathcal{H}}{\partial q} \tag{10}\]These give a gradient vector which points in the direction that the system is changing, which we can plot on the energy surface. In the conservative case, we would see that a point on the phase space follows the same contour line (level set) along the surface, which corresponds to constant energy. A damped system will always move down from its current energy level.
Phase portrait of a swinging pendulum. A conservative system remains on the same level set (contour line). A dissipative system always moves below its current level set.
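We can verify this behaviour numerically. The sketch below (with illustrative parameters and a simple semi-implicit Euler integrator of my own choosing) propagates the pendulum through phase space via Hamilton's equations, with an optional damping torque, and records the energy at each step:

```python
import math

M, L, G = 1.0, 1.0, 9.81   # illustrative mass, length, gravity

def energy(q, p):
    """The Hamiltonian, Eqn (9)."""
    return 0.5 * p**2 / (M * L**2) + M * G * L * (1 - math.cos(q))

def simulate(b=0.0, q0=1.0, p0=0.0, dt=1e-3, steps=10_000):
    """Semi-implicit Euler integration of Hamilton's equations (Eqn 10),
    with an added damping torque -b*qdot when b > 0."""
    q, p = q0, p0
    energies = [energy(q, p)]
    for _ in range(steps):
        qdot = p / (M * L**2)                              # dH/dp
        p += (-M * G * L * math.sin(q) - b * qdot) * dt    # -dH/dq - b*qdot
        q += (p / (M * L**2)) * dt
        energies.append(energy(q, p))
    return energies
```

Without damping, the recorded energies stay (approximately) on one level set; with damping, they decay.
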
Aleksandr Mikhailovich Lyapunov published a thesis on A General Problem of the Stability of Motion1. As you might have guessed, there are some mathematical definitions named after him that classify the different degrees of stability. But, from my experience, asking a mathematician about concepts in control theory is the very definition of masochism.
masochism: (noun)
Asking a mathematician about control theory.
I’ll circumvent the torture by giving a straightforward explanation. First, suppose we have a configuration $\mathbf{x}\in\mathbb{R}^n$, and some positive, scalar function that is zero for $\mathbf{x} = \mathbf{0}$:
\[V(\mathbf{x}) \ge 0 ~\forall\mathbf{x}\ne\mathbf{0} \quad,\quad V(\mathbf{0}) = 0. \tag{11}\]From this we may denote 3 levels of stability.
Stable in the Sense of Lyapunov:
If the Lyapunov function remains bounded within a finite region $\epsilon$, then the system is said to be stable:
For brevity, it is often referred to as Lyapunov stable. We can see from the pendulum example that the energy $E(q,\dot{q})$ is a natural choice for a Lyapunov function. In the undamped case, $\epsilon$ is its initial energy which remains constant (bounded) for all time.
Asymptotically Stable:
A system is asymptotically stable if we can show the time derivative of the Lyapunov function is negative whenever the state is away from equilibrium:
That is, the function keeps decreasing until the state reaches the equilibrium. For the damped pendulum, the energy decreases whenever it is moving, so it is asymptotically stable.
Exponentially Stable:
This is a more advanced case of the previous. If we can show that the time derivative is proportional to its current value, then it must be exponentially decreasing:
It is strictly decreasing, hence will converge to zero faster than the asymptotic case.
We can see that these 3 definitions form a nested hierarchy (see figure below):
Every exponentially stable system is asymptotically stable, and every asymptotically stable system is Lyapunov stable.
In a previous article, I showed a 3 step process for solving feedback control of linear systems. The exact same process applies here. If we have a position or configuration vector $\mathbf{x}\in\mathbb{R}^n$ then:
For the non-linear case, we need only make a slight modification:
There are no rules or guidelines for choosing the Lyapunov candidate function, other than that it is positive. But as we saw from the pendulum example, an energy-like quantity is an excellent choice. It allows us to appeal to physics principles which gives intuitive results.
Another thing to consider is that energy is quadratic with respect to velocity, and in some cases, with respect to configuration as well. This is really nice because quadratic functions have a single, global minimum, and we can visualise the energy of 1D systems easily (see figure below). For example, a mass-spring-damper system has the energy:
\[E(x,\dot{x}) = \frac{1}{2}m\dot{x}^2 + \frac{1}{2}k(x-x_0)^2 \tag{15}\]where $x_0$ is the resting position. It is quadratic in both position (configuration), and velocity. In a more abstract sense, it contains the sum-of-squares $x^2,~\dot{x}^2$. So, the sum-of-squared errors is often the best choice for a Lyapunov candidate function.
The energy in a mass-spring-damper system is quadratic with respect to both position, and velocity.
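As a tiny illustration (with made-up values for $m$, $k$, and $x_0$), the energy in Eqn (15) is zero only at the rest state and positive everywhere else, which is exactly the condition required of a Lyapunov candidate in Eqn (11):

```python
def msd_energy(x, xd, m=2.0, k=5.0, x0=1.0):
    """Mass-spring-damper energy, Eqn (15): quadratic in both the
    position error (x - x0) and the velocity xd."""
    return 0.5 * m * xd**2 + 0.5 * k * (x - x0)**2
```
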
Quaternions are sophisticated mathematical objects that are used to represent orientation in 3D space. They are used in animation, videogames, aerospace, and robotics. In the latter two fields, orientation control is particularly important. Quaternions belong to the algebra $\mathbb{H}$, and the unit quaternions that represent orientation form a Lie group $\mathbb{S}^3\subset\mathbb{H}$. Lie groups have specific rules for combining objects, which can make them highly nonlinear. In specific cases, Lyapunov stability is the most straightforward method for stability proofs.
A quaternion contains four elements, often represented as a scalar part and vector part:
\[\boldsymbol{v} = \begin{bmatrix} \eta \\ \boldsymbol{\varepsilon} \end{bmatrix} \in\mathbb{S}^3 \subset \mathbb{H} \tag{16}\]which, in the case of orientation, we have:
The elements, known as the Euler-Rodrigues parameters, satisfy the unit-norm condition:
\[\eta^2 + \boldsymbol{\varepsilon}^T\boldsymbol{\varepsilon} = 1. \tag{17}\]Before we can formulate the feedback control problem, we will need several important properties to exploit.
Closure:
This is a fundamental property of Lie groups. When we combine 2 elements in a Lie group, we get a 3rd element that is also in the group. For quaternions, we follow a unique arithmetic for combining rotations together:
Identity:
This is the element of a group that results in no change. By multiplying a quaternion with the identity, we end up with the original quaternion. The identity of a quaternion contains zero in the vector component:
We can reverse engineer this to see that it equates to zero rotation: $\alpha = 2\cdot\arccos(1) =0$.
Inverse:
Applying the closure property to the inverse element of a Lie group leads to the identity. For quaternions, we negate the vector component, otherwise known as its conjugate:
Time Derivative:
Evaluating the time derivative for a quaternion involves appealing to L’Hopital’s rule, and the closure property. It’s a little complex, so the proof is outside the scope of this article. Simply stated, the time derivative is:
where $\boldsymbol{\omega}\in\mathbb{R}^3$ is the angular velocity vector (rad/s), and:
\[S(\boldsymbol{\omega}) = \begin{bmatrix} \phantom{-}0 & -\omega_z & \phantom{-}\omega_y \\ \phantom{-}\omega_z & \phantom{-}0 & -\omega_x \\ -\omega_y & \phantom{-}\omega_x & 0 \end{bmatrix} \tag{22}\]is a skew-symmetric matrix.
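The kinematics are straightforward to implement. Below is a sketch in NumPy; note that quaternion sign conventions vary between texts, so the form here ($\dot{\eta} = -\tfrac{1}{2}\boldsymbol{\varepsilon}^T\boldsymbol{\omega}$, $\dot{\boldsymbol{\varepsilon}} = \tfrac{1}{2}(\eta\boldsymbol{\omega} - S(\boldsymbol{\omega})\boldsymbol{\varepsilon})$) is one common choice, and the function names are my own:

```python
import numpy as np

def skew(w):
    """The skew-symmetric matrix S(w) of Eqn (22)."""
    wx, wy, wz = w
    return np.array([[0.0, -wz,  wy],
                     [ wz, 0.0, -wx],
                     [-wy,  wx, 0.0]])

def quat_derivative(eta, eps, omega):
    """Unit-quaternion kinematics (one common sign convention)."""
    eta_dot = -0.5 * eps @ omega
    eps_dot = 0.5 * (eta * omega - skew(omega) @ eps)
    return eta_dot, eps_dot
```

A useful property to check is that this derivative preserves the unit-norm condition of Eqn (17): the time derivative of $\eta^2 + \boldsymbol{\varepsilon}^T\boldsymbol{\varepsilon}$ is zero.
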
Quaternion Error:
Now we can give a proper definition to the quaternion error. We apply the closure and inverse between the desired and actual:
We can see that if $\boldsymbol{v} = \boldsymbol{v}_d$ then this leads to the identity.
An analogy is addition over real vectors $\mathbf{x}\in\mathbb{R}^n$ which also form a Lie group. To define error, we would use addition (closure) with subtraction (inverse):
\[\boldsymbol{\epsilon} = \mathbf{x}_d + (-\mathbf{x}). \tag{24}\]When $\mathbf{x} = \mathbf{x}_d$, we get the identity element (zero):
\[\mathbf{x} = \mathbf{x}_d~\longrightarrow~\boldsymbol{\epsilon} = \mathbf{0}. \tag{25}\]This proof for quaternion feedback control is from a research paper you can find online2. To replicate it, we will follow my 3 step process. First, we define a Lyapunov candidate function as the sum-of-square errors:
\[\begin{align} V(\boldsymbol{e}) &= \left(\eta_d - \eta\right)^2 + \left(\boldsymbol{\varepsilon}_d - \boldsymbol{\varepsilon}\right)^T\left(\boldsymbol{\varepsilon}_d - \boldsymbol{\varepsilon}\right) \tag{26a} \\ &= 2 - 2\left(\eta_d\eta +\boldsymbol{\varepsilon}_d^T\boldsymbol{\varepsilon}\right) \ge 0 \tag{26b} \end{align}\]Notice that with sufficient algebraic manipulation this reduces to the scalar component of the quaternion error.
Second, we take the time derivative, substituting in the quaternion velocity equation to obtain:
\[\dot{V}(\boldsymbol{e},\dot{\boldsymbol{e}}) = -2\dot{\eta}_e = \left(\boldsymbol{\omega}_d - \boldsymbol{\omega}\right)^T\boldsymbol{\varepsilon}_e. \tag{27}\]Third, we choose our control input $\boldsymbol{\omega}$ so that this will asymptotically decay:
\[\boldsymbol{\omega} \triangleq \boldsymbol{\omega}_d + \mathbf{K}\boldsymbol{\varepsilon}_e \tag{28}\]where $\mathbf{K}\in\mathbb{R}^{3\times 3}$ is a gain matrix. If we substitute this in, then:
\[\dot{V}(\boldsymbol{e},\dot{\boldsymbol{e}}) = -\boldsymbol{\varepsilon}_e^T\mathbf{K}\boldsymbol{\varepsilon}_e < 0 ~\forall \boldsymbol{\varepsilon}_e\ne\mathbf{0}. \tag{29}\]If we design $\mathbf{K}$ so that it is positive definite (symmetric, with positive, real eigenvalues), then this is guaranteed to be negative for any nonzero error, so $V$ decreases monotonically. Thus the feedback control law is asymptotically stable. An easy choice for $\mathbf{K}$ is to make it a diagonal matrix with positive entries.
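The control law of Eqn (28), and the sign of $\dot{V}$ in Eqn (29), can be sketched in a few lines (the gains here are illustrative):

```python
import numpy as np

K = np.diag([2.0, 2.0, 2.0])   # diagonal => symmetric positive definite

def control(omega_d, eps_e):
    """Quaternion feedback law, Eqn (28)."""
    return omega_d + K @ eps_e

def v_dot(eps_e):
    """Lyapunov derivative under the control law, Eqn (29)."""
    return -eps_e @ K @ eps_e
```
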
The animation below shows a robot using quaternion feedback control. It is a standard part of RobotLibrary.
Quaternion feedback used to control the orientation.
A very important note here: both $\boldsymbol{v}$ and $-\boldsymbol{v}$ represent the same orientation with quaternions! This can cause your robot to take the long way around (up to 360$^\circ$) toward the desired orientation. Since quaternions can be represented as 4D vectors, we can check whether they point in the same direction using the dot product. If $\boldsymbol{v}_d \cdot \boldsymbol{v} < 0$, then simply use $-\boldsymbol{\varepsilon}_e$ in the feedback control law to spin the opposite direction.
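A minimal sketch of this check (hypothetical names, quaternions as 4D NumPy vectors):

```python
import numpy as np

def shortest_error(v_d, v, eps_e):
    """Flip the error sign if v_d and v lie in opposite hemispheres
    of S^3, so the controller rotates the short way around."""
    return -eps_e if np.dot(v_d, v) < 0.0 else eps_e
```
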
Lyapunov, A. M. (1892). The General Problem of the Stability of Motion. Kharkov Mathematical Society. Originally in Russian. English translation by A.T. Fuller, London: Taylor & Francis, 1992. ↩
Yuan, J. (1988). Closed-loop manipulator control using quaternion feedback. IEEE Journal on Robotics and Automation, 4(4):434–440. ↩
What does quality mean for your customer? A core tenet of a Lean Six Sigma project is empiricism, so it is necessary to articulate, and quantify, exactly what quality means to the customer. The Critical to Quality (CTQ) tree is a useful tool and thinking exercise for developing quantifiable measures of quality. These become the basis for data collection in the Measure phase, but can also be useful as engineering specifications, and key performance indicators.
Lean Six Sigma (LSS) is a project management methodology used to improve the performance and efficiency of business and engineering processes. It combines the heuristics of the Toyota Production System with statistical process control from Motorola. An LSS project is divided into 5 stages:
In the Define phase it is also necessary to articulate who the customer of a process is. In a previous post, I showed how to use the SIPOC tool to identify the customer(s). Once identified, it is then necessary to speak with the customer in order to quantify what quality means to them.
The problem is, a customer might not always be clear or precise in what they mean. But, since LSS is based on empiricism, it is necessary to translate vague customer requirements into quantifiable metrics. That way we can collect and scrutinise the data in the Measure and Analyse phases.
The Critical-to-Quality (CTQ) tree is a standard LSS tool used in the Define phase. It helps refine vague, or subjective, notions about quality into objective measurements.1
A CTQ tree contains 3 (or 4) components:
The structure of a CTQ tree.
The first step is to provide a concise description of what the customer requires. It should be 1 sentence long, and ideally use adjectives to give a notion of quality factors. It should describe what quality is, but not how it is defined. Any lay person should have a basic understanding of what is being asked for.
Here are some bad examples I found on the internet, and what I think is a better definition:
| Original | My Definition |
|---|---|
| I need my paycheck. | A paycheck delivered regularly, on time, in the correct amount. |
| Ease of operation and maintenance. | Consistent operation with minimal defects & breakdowns. |
| | Effective maintenance with minimal downtime. |
| Monthly project report. | A timely monthly report with sufficient information on progress. |
Notice that none of the original examples gives adequate descriptors from which to build. The “Need” should be a counterfactual to an existing problem (hence why an LSS project is being undertaken). For example, “A monthly project report” could simply be an A4 page with a single sentence, “All good.”, delivered any time within a 4-week period. But if the problem is that project reports are constantly late and lacking detail, then the need should describe the converse: timely, and with sufficient information on progress.
The next step is to list what quality entails. These should be adjectives. It can be difficult for people to jump straight to numerical quantities, so using descriptive words helps build momentum. A useful question to ask is: what does good look like? Conversely, if you’re having trouble coming up with ideas, a better question to ask is: what does bad look like?
In a previous post, I mentioned a poor experience I had at a hotel. Below are some examples of turning a bad experience into a performance requirement:
The next step is to turn these descriptions into quantifiable metrics. This should give a precise number with some kind of constraint, preferably with a mathematical qualifier $=, >, <$. Continuing my example from above, we could define:
It’s often good to define how the measurement should be taken. This way we can:
In the Define phase, this operational definition only needs to be high-level. If a more detailed procedure is necessary, it can be elaborated on in the Measure phase as part of the data collection plan.
In a previous post I used an example of the time I made lamingtons for friends and colleagues whilst I was living in Italy. These are an Australian delicacy, consisting of a sponge cake, dipped in chocolate sauce, and rolled in coconut shavings. I developed a SIPOC, which is a tool used to identify who receives the output of a process, i.e. the customer. Then based on customer feedback, we can proceed with developing the CTQ tree.
Delicious lamingtons, made by me!
One of the problems I had was that my first batch was perfect, and all subsequent batches were too dense and flat. Below is how a CTQ tree might be developed for making good lamingtons:
A CTQ tree for properly made lamingtons.
On my first batch (where the sponge cake quality was perfect), the chocolate layer was quite thick. I thought it was too much, but my friend said it was perfect. She didn’t like that commercially made lamingtons have thin chocolate layers in order to save money. This highlights the necessity of getting direct customer feedback when creating the CTQs (thanks Trina!).
Whilst working on the Terabotics project with the University of Leeds, other postdocs and I won a competition for our mini research proposal. The idea was to make artificial limbs with the same optical and mechanical properties as a human (a phantom limb / organ). It could be used for testing and experiments using THz sensing in skin contact measurements.
In late 2024, we met at the University of Warwick for a workshop to develop a plan for how we were going to make these things. It had never been done before (phantom organs exist, but not phantom limbs for this specific technology), so the CTQ tree was the perfect tool to take a vague definition and refine it into quantitative engineering specifications.
Below is the initial draft we developed, followed by my refined version.
A CTQ tree to define engineering specifications of a phantom forearm for skin contact sensing.
One of the interesting outcomes from applying this tool was that, during discussions, we realized the compression of the skin decays asymptotically over time. This can be quantified using a mathematical property called a time constant. The process of developing the tool itself helped us articulate and define this important metric.
Another thing to note is that the first draft we developed (pictured above) had a few question marks, uncertainties, and deficiencies. It can be difficult to produce a correct CTQ tree on the first go. After some thought and reflection, I developed a better one. It should be standard practice to take some extra time to ensure the metrics are correctly defined, as this will have a big impact on what data is collected during the Measure phase. Conversely, it’s also OK to get things wrong, and go back and change it as necessary.
From about 2006 to early 2007, I worked as a barista. I became quite skilled at making coffee, and I had a very strict process. The proof of my efforts was the reputation I developed for making excellent quality coffee.
A cappuccino I made working as a barista in 2007.
Making good coffee is surprisingly technical. For example:
A CTQ tree is developed based on customer requirements, but how we achieve those requirements internally might be different. In this case, it is necessary to use tools like quality function deployment (QFD), which maps the relationship between what the customer wants (i.e. the CTQs) and how to achieve it internally within the process.
For example, here is a customer-centric CTQ of what makes a good cappuccino:
A CTQ tree for a cappuccino.
But based on my expert knowledge above, here is the QFD that relates the quality factors for a good coffee to the technical aspects involved in making it. Notice that there are multiple factors that affect the flavour, and, from a production side, many of them are interrelated (as seen in the “roof” part of the diagram).
A matrix diagram, or quality function deployment (QFD), relating customer requirements to technical specifications for coffee production.
The customer doesn’t care about technical details like extraction time, the precise temperature measurements, etc. They only care that the coffee tastes good, is well made, and is sufficiently hot. But in order to meet these customer expectations, the actual coffee-making process must be very tightly controlled.
The Critical to Quality Tree is a useful tool and structured thinking exercise for translating vague, subjective customer needs into quantifiable performance metrics. These become the basis for developing a data collection plan in the Measure phase of the project. A well thought-out CTQ tree can provide useful KPIs for a business or product.
Some of the important things I’ve learned over the years are to:
The Tree Diagram is one of the 7 Management & Planning Tools. ↩
In classical mechanics, the Lagrangian is defined as the difference between kinetic and potential energy. We use this to solve for the equations of motion for systems of rigid bodies. But what is its relationship to the conservation of energy, which states the sum of kinetic and potential is constant? In this article I show how to derive the Hamiltonian from the Lagrangian, i.e. the sum of kinetic and potential for rigid body systems. I then show how momentum is used in lieu of velocity to define phase space, and touch on its implications with respect to the Hamiltonian.
One of the most important principles in physics is the conservation of energy. When an apple falls from a tree, it loses potential energy but gains kinetic energy. The total energy remains constant for all time (until it hits someone on the head).
A falling apple loses potential energy and gains kinetic energy as it falls.
We can use this principle to solve for the state of an object at any given time. If $x\in\mathbb{R}$ is its position, and $\dot{x}\in\mathbb{R}$ is its velocity, then at any 2 given points in time it must hold that:
\[\tfrac{1}{2}m\dot{x}_1^2 + mgx_1 = \tfrac{1}{2}m\dot{x}_2^2 + mgx_2. \tag{1}\]in which $m\in\mathbb{R}^+$ is the mass (kg), and $g\in\mathbb{R}$ is gravitational acceleration. So, for example, given $x_1,~\dot{x}_1$ and $x_2$ we could determine the speed just before it hits the ground $\dot{x}_2$.
We can use conservation of energy to solve for state variables at different points in time.
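As a quick numerical sketch of this (the mass, heights, and speeds below are made-up values, not measurements), we can solve Eqn. (1) for $\dot{x}_2$:

```python
import math

m, g = 0.1, 9.81            # mass (kg) and gravitational acceleration; illustrative values
x1, v1 = 10.0, 0.0          # initial height (m) and speed (m/s)
x2 = 0.0                    # final height: ground level

# Total energy is constant for all time, Eqn. (1)
E = 0.5*m*v1**2 + m*g*x1

# Solve for the speed just before impact
v2 = math.sqrt(2*(E/m - g*x2))
print(v2)                   # ≈ 14.0 m/s for these numbers
```

Notice that the mass cancels out of the final expression, so the impact speed is independent of $m$.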
In fact, we can use the conservation of energy to derive Newton’s second law: \(\begin{align} \frac{d}{dt}\left(\tfrac{1}{2}m\dot{x}^2 + mgx\right) &= 0 \tag{2a} \\ m\dot{x}\ddot{x} + mg\dot{x} &= 0 \tag{2b} \\ m\ddot{x} &= -mg. \tag{2c} \end{align}\)
Lagrangian mechanics extends Newton’s second law and enables us to solve the dynamic equations of motion for systems of rigid bodies. Suppose $\mathbf{q}\in\mathbb{R}^n$ is the configuration vector, and $\dot{\mathbf{q}}\in\mathbb{R}^n$ is the velocity vector. Hamilton noted that we could first define a function as the difference between kinetic energy (which is now configuration dependent) and potential energy.1
\[\mathcal{L}(\mathbf{q},\dot{\mathbf{q}}) = \mathcal{K}(\mathbf{q},\dot{\mathbf{q}}) - \mathcal{P}(\mathbf{q}) : \mathbb{R}^{n}\times\mathbb{R}^n\mapsto\mathbb{R}. \tag{3}\]This is known as the Lagrangian. Then, from the calculus of variations, the time integral of the Lagrangian (the action) is an extremum (maximum or minimum) when its variation is zero, $\delta\int\mathcal{L}\,dt = 0$. This leads to the Euler-Lagrange equation:
\[\frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{\mathbf{q}}}\right) - \frac{\partial\mathcal{L}}{\partial\mathbf{q}} = \mathbf{0}. \tag{4}\]Using Hamilton’s definition Eqn. (3) gives Lagrange’s equations of motion.2
This is strange, though, right? From the conservation of energy we would expect the sum of kinetic and potential, yet the Lagrangian requires the difference between the two. What is the relationship between them?
If we were to take the time derivative of Eqn. (3) then we would obtain:
\[\begin{align} \dot{\mathcal{L}} &= \dot{\mathbf{q}}^T\frac{\partial\mathcal{L}}{\partial\mathbf{q}} + \ddot{\mathbf{q}}^T\frac{\partial\mathcal{L}}{\partial\dot{\mathbf{q}}} \tag{5a} \\ &= \underbrace{\dot{\mathbf{q}}^T\frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{\mathbf{q}}}\right) + \ddot{\mathbf{q}}^T\frac{\partial\mathcal{L}}{\partial\dot{\mathbf{q}}}}_{\frac{d}{dt}\left(\dot{\mathbf{q}}^T\frac{\partial\mathcal{L}}{\partial\dot{\mathbf{q}}}\right)} \tag{5b} \end{align}\]where Eqn. (4) was used to substitute for $\frac{\partial\mathcal{L}}{\partial\mathbf{q}}$ in the first term. But now we can integrate with respect to time to get back an expression containing the original Lagrangian:
\[\mathcal{L}(\mathbf{q},\dot{\mathbf{q}}) = \dot{\mathbf{q}}^T\frac{\partial\mathcal{L}}{\partial\dot{\mathbf{q}}} + \text{const.} \tag{6}\]Notice how the constant appears due to the rules of integral calculus. Now we can simply re-arrange and define the Hamiltonian:
\[\mathcal{H}(\mathbf{q},\dot{\mathbf{q}}) = \dot{\mathbf{q}}^T\frac{\partial\mathcal{L}}{\partial\dot{\mathbf{q}}} - \mathcal{L}(\mathbf{q},\dot{\mathbf{q}}) \tag{7}\]which is constant in a conservative system.
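As a numerical sanity check of Eqn. (7), here is a small hand-rolled example with a kinetic energy quadratic in velocity (anticipating the rigid-body form used below); the inertia matrix, velocity, and potential energy values are all invented for illustration:

```python
# 2-DOF toy example with hand-rolled linear algebra; all numbers are invented
M = [[2.0, 0.3],
     [0.3, 1.0]]                        # symmetric positive-definite inertia matrix
qd = [0.7, -1.2]                        # velocity vector (arbitrary)
Pot = 3.7                               # potential energy at this configuration

p = [M[0][0]*qd[0] + M[0][1]*qd[1],     # p = M q̇ = ∂L/∂q̇
     M[1][0]*qd[0] + M[1][1]*qd[1]]
K = 0.5*(qd[0]*p[0] + qd[1]*p[1])       # K = ½ q̇ᵀ M q̇
L = K - Pot                             # the Lagrangian, Eqn. (3)
H = (qd[0]*p[0] + qd[1]*p[1]) - L       # H = q̇ᵀp - L, Eqn. (7)
# H works out to exactly K + Pot
```

The transform $\dot{\mathbf{q}}^T\mathbf{p} - \mathcal{L}$ flips the sign on the potential term, turning the difference $\mathcal{K}-\mathcal{P}$ into the sum $\mathcal{K}+\mathcal{P}$.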
In a rigid body system the kinetic energy is:
\[\mathcal{K}=\frac{1}{2}\dot{\mathbf{q}}^T\mathbf{M}(\mathbf{q})\dot{\mathbf{q}} \tag{8}\]where $\mathbf{M}(\mathbf{q})=\mathbf{M}(\mathbf{q})^T\in\mathbb{R}^{n\times n}$ is the generalised inertia matrix. By putting this back into Eqn. (7) we now obtain a much more familiar form:
\[\mathcal{H}(\mathbf{q},\dot{\mathbf{q}}) = \mathcal{K}(\mathbf{q},\dot{\mathbf{q}}) + \mathcal{P}(\mathbf{q}). \tag{9}\]Newton didn’t actually express his second law as “force = mass x acceleration”, as people often recite. What he said was:
“Lex II: Mutationem motus proportionalem esse vi motrici impressae, et fieri secundum lineam rectam qua vis illa imprimitur.” 3
or translated to English:
“Law II: The change of motion is proportional to the motive force impressed; and is made in the direction of the straight line in which that force is impressed.”
Here, “motion” is better conceptualised as “momentum”, i.e. the product of mass and velocity. The time derivative of momentum is equal to the forces applied. In a 1D system we would write:
\[p = m\dot{x}~\Longrightarrow~ f = \frac{dp}{dt} = m\ddot{x}. \tag{10}\]For a system of rigid bodies, we denote the generalised momentum as:
\[\mathbf{p} \triangleq \mathbf{M}(\mathbf{q})\dot{\mathbf{q}} = \frac{\partial\mathcal{L}}{\partial\dot{\mathbf{q}}} \tag{11}\]which, from Eqn. (3) and Eqn. (8), is the partial derivative of the Lagrangian with respect to velocity.
If we re-arrange Eqn. (4) and substitute in Eqn. (11) we obtain:
\[\begin{align} \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{\mathbf{q}}}\right) &= \frac{\partial\mathcal{L}}{\partial\mathbf{q}} \tag{12a}\\ \dot{\mathbf{p}} &= \frac{\partial\mathcal{K}}{\partial\mathbf{q}} - \frac{\partial\mathcal{P}}{\partial\mathbf{q}} \tag{12b} \\ \mathbf{M}\ddot{\mathbf{q}} + \dot{\mathbf{M}}\dot{\mathbf{q}} &= \frac{1}{2}\dot{\mathbf{q}}^T\frac{\partial\mathbf{M}}{\partial\mathbf{q}}\dot{\mathbf{q}} - \mathbf{g}. \tag{12c} \end{align}\]where $\mathbf{g} = \frac{\partial \mathcal{P}}{\partial\mathbf{q}}$ is the generalised gravitational force vector.
On the left-hand side of Eqn. (12c) we have the familiar mass times acceleration $\mathbf{M}\ddot{\mathbf{q}}$. But a new term appears, $\dot{\mathbf{M}}\dot{\mathbf{q}}$. This is because, in a system of rigid bodies, the distribution of mass can also change over time. Then, on the right-hand side, we see the effect of gravity $\mathbf{g}$, but also the forces due to a configuration change.
So Lagrange’s equations of motion are just a generalisation of Newton’s second law. The time derivative of momentum is equal to the force applied. But it accounts for the change in configuration, and the subsequent change in the distribution of mass.
If we consider the case where $\dot{\mathbf{q}} = \mathbf{0}$, then Eqn. (12c) reduces to a much more familiar form:
\[\mathbf{M}\ddot{\mathbf{q}} = -\mathbf{g}. \tag{13}\]If we have the system configuration $\mathbf{q}$ and its time derivative $\dot{\mathbf{q}}$, then we have all the information we need to reconstruct its equations of motion under its own impetus.4 The concatenation of the two gives the state space vector:
\[\mathbf{x} = \begin{bmatrix} \mathbf{q} \\ \dot{\mathbf{q}} \end{bmatrix} ~\longrightarrow ~ \dot{\mathbf{x}} = \begin{bmatrix} \dot{\mathbf{q}} \\ \ddot{\mathbf{q}} \end{bmatrix}. \tag{14}\]We could instead consider phase space as configuration and momentum:
\[\mathbf{y} = \begin{bmatrix} \mathbf{q} \\ \mathbf{p}\end{bmatrix}. \tag{15}\]Now using Eqn. (7) & (11) we can express the Hamiltonian as a function of momentum:
\[\mathcal{H}(\mathbf{p},\mathbf{q}) = \dot{\mathbf{q}}^T\mathbf{p} - \mathcal{L}(\mathbf{q},\dot{\mathbf{q}}). \tag{16}\]We can actually use this to generate the dynamic equations of motion. The trick is to treat $\mathbf{p}$ and $\mathbf{q}$ as independent variables. First, we can easily recover the velocity from Eqn. (16) by taking the partial derivative with respect to momentum:
\[\dot{\mathbf{q}} = \frac{\partial\mathcal{H}}{\partial\mathbf{p}}. \tag{17}\]Then from Eqn. (12) & (16) we can get the time derivative of momentum:
\[\begin{align} \dot{\mathbf{p}} = \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot{\mathbf{q}}}\right) = \frac{\partial\mathcal{L}}{\partial\mathbf{q}} = -\frac{\partial\mathcal{H}}{\partial\mathbf{q}}. \tag{18} \end{align}\]This is interesting because the Hamiltonian is scalar-valued, so we can conceptualise it as an energy surface over phase space. This also means the time derivative of the phase space vector defines a flow along this surface:
\[\dot{\mathbf{y}} = \begin{bmatrix} \dot{\mathbf{q}} \\ \dot{\mathbf{p}} \end{bmatrix} = \begin{bmatrix} \phantom{-}\partial\mathcal{H}/\partial\mathbf{p} \\ -\partial\mathcal{H}/\partial\mathbf{q} \end{bmatrix}. \tag{19}\]A conservative system ($\dot{\mathcal{H}} = 0$) will follow a single contour line along this surface, i.e. a fixed level set.
A conservative system will remain on a single contour line in the phase portrait.
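To sketch this idea numerically, here is Eqn. (19) integrated for a 1-DOF harmonic oscillator, $\mathcal{H} = p^2/2m + \frac{1}{2}kq^2$ (a toy system of my own choosing, not one from the text), using a symplectic Euler step so the trajectory hugs a single level set:

```python
m, k = 1.0, 4.0             # mass and spring stiffness (assumed values)
q, p = 1.0, 0.0             # initial configuration and momentum
dt, steps = 1e-3, 10_000

def H(q, p):
    # Hamiltonian: kinetic + potential energy
    return p*p/(2.0*m) + 0.5*k*q*q

H0 = H(q, p)
for _ in range(steps):
    # Hamilton's equations, Eqn. (19), stepped with symplectic Euler
    p -= k*q*dt             # ṗ = -∂H/∂q = -kq
    q += (p/m)*dt           # q̇ =  ∂H/∂p = p/m

drift = abs(H(q, p) - H0)
print(drift)                # stays small: the trajectory remains near one contour of H
```

A naive (explicit Euler) integrator would spiral off the level set; the symplectic update keeps the numerical trajectory on a nearby contour, which is why it is popular for long-horizon simulation of conservative systems.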
[Hamilton, 1835] Hamilton, W. R. (1835). Second essay on a general method in dynamics. Philosophical Transactions of the Royal Society of London, 125:95–144. ↩
[Lagrange, 1788] Lagrange, J.-L. (1788). Mécanique analytique. Imprimerie de la République, Paris. Available online at various archives. ↩
[Newton, 1687] Newton, I. (1687). Philosophiæ Naturalis Principia Mathematica. Royal Society, London. First edition. ↩
We can simply re-arrange Eqn. (12c) to solve for $\ddot{\mathbf{q}}$. ↩
Not all features of a product or service are of equal value. The Kano model is a concept for categorising and prioritising them. Using it, we can distinguish between what we need to even do business, what will attract customers, and what makes a market leader. I also give examples of how I’ve applied it to some of my engineering projects. It’s a good tool for task prioritisation.
The Kano model is a tool often used in the Lean Six Sigma (LSS) project management methodology to enumerate and categorise customer requirements. LSS projects are divided into 5 phases:
In the Define phase, it is necessary to articulate the quality metrics of a product or service with respect to the customer. Customers often have many requirements, needs, and wants, and these can be subjective and conflicting. Noriaki Kano developed a conceptual model that helps categorise and prioritise quality features.
I travel a lot for work, so I’ve spent a decent amount of time in a variety of hotel rooms. I stayed in a cheap hotel recently (for a leisurely weekend away), and there were a few things that made it a dissatisfying experience:
What bemused me was how nonchalant the owner was about me having to open up the cistern to manually fiddle with it and flush the toilet every time.
Conversely, when I went to Japan in 2022 for a conference, I stayed at an incredible hotel in Kyoto.
A photo I took of the Prince Kyoto Takaragaike from the conference center.
Some of the things that stood out were:
The view from my hotel room at the Prince Kyoto Takaragaike.
Clearly, there are minimum expectations we have about a decent hotel room, like functional plumbing. And there are things we would expect to get better the more we pay for it, like breakfast options, and room sizes. But there are also things that amaze us; koi ponds, tea houses, etc.
The Kano model categorises features of a product or service into 3 categories:
We can plot this on a Cartesian graph with 2 axes:
The Kano model conceptualises customer satisfaction versus level of implementation.
According to Kano’s conceptual model, minimum requirements must be implemented. But, no matter how much effort you put into them, your customer will not be impressed. If they’re absent, however, or done poorly, your customer will be very unhappy. A hot shower is a hot shower, but a tepid shower on a cold, rainy day in England is awful!
Conversely, customer satisfaction increases proportionally to the level of implementation of the performance requirements. A bigger hotel room, and more breakfast options available? Yes please! And if they can be done for the same price, or cheaper, you will easily put your rivals out of business.
Innovative features (sometimes called delighters, or wow factors) are unexpected, but amaze the customer. A koi pond, and traditional Japanese tea house? Wow! These are features that can transform an industry. Free Wi-Fi at a hotel used to be an exciting feature, and distinguished a quality hotel from its rivals. Now, however, it’s become a minimum expectation. Innovative features often become minimum standards over time, especially if they can be done economically.
To help with determining the different features of a Kano model, I developed my own sorting algorithm.
My sorting algorithm for Kano model categories.
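My actual sorting algorithm is shown in the figure above. As a rough sketch of the decision logic (my reconstruction from the three category definitions, not a transcription of the figure), it might look like:

```python
def kano_category(dissatisfied_if_absent: bool,
                  satisfaction_scales_with_effort: bool) -> str:
    """Hypothetical sketch of a Kano sorting decision using two yes/no questions.

    This is a reconstruction from the category definitions, not the exact
    algorithm pictured in the figure.
    """
    if dissatisfied_if_absent:
        return "Minimum Requirement"       # must be done; extra effort is wasted
    if satisfaction_scales_with_effort:
        return "Performance Requirement"   # satisfaction grows with implementation
    return "Innovative Feature"            # unexpected delighter / wow factor
```

For example, `kano_category(True, False)` returns `"Minimum Requirement"`, while a feature nobody misses but everyone loves, `kano_category(False, False)`, lands in `"Innovative Feature"`.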
From about 2015 to 2018, I worked on the submerged pile inspection robot (SPIR) as part of my PhD. This was an underwater robot designed to clean marine growth off underwater bridge columns. In May 2017 I hosted a design workshop with the team to review problems with the previous 2 prototypes, and what we needed to do for the 3rd prototype.
The third prototype of the submerged pile inspection robot (SPIR).
We began by brainstorming all the kinds of problems we had when working with the previous prototypes, and what we needed to improve. This included use-case scenarios like:
We then used Affinity Diagrams to group ideas together based on common themes. This made the vast number of ideas easier to manage.
Brainstorming and affinity diagrams for the SPIR prototype development.
We sorted all these ideas using the algorithm above. We also added a few ideas of what would be really cool to implement (if we had the time).
The Kano model developed for the SPIR.
We each received $n = \frac{15}{3} = 5$ (number of ideas divided by 3) votes to place on what we thought was most important to work on. Notice that we did not vote on the basic features / minimum requirements. These must be done.
We can put these votes into a Pareto chart to see what the team thought was most important.
A Pareto chart of votes on the most important features to implement.
Note: This Pareto chart doesn’t follow the 80/20 rule very well, which implies that the features haven’t been adequately categorised or articulated.
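The tallying behind a Pareto chart is simple to compute: sort features by vote count and accumulate percentages. Here is a sketch with hypothetical vote counts (invented numbers, not the actual SPIR votes):

```python
# Hypothetical vote counts per feature (invented, not the actual SPIR data)
votes = {"waterproofing": 9, "thrust control": 7, "vision system": 5,
         "deployment time": 4, "cable management": 3, "user interface": 2}

total = sum(votes.values())
ranked = sorted(votes.items(), key=lambda kv: kv[1], reverse=True)

cumulative = 0
for feature, n in ranked:
    cumulative += n
    print(f"{feature:16s} {n:2d} votes  {100*cumulative/total:5.1f}% cumulative")
```

Under the 80/20 rule you would expect roughly the top fifth of features to capture about 80% of the votes; a flat cumulative curve like the one we saw suggests the features need to be re-categorised or better articulated.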
At the end, we had a list of engineering specifications, and performance requirements. The basic / minimum requirements become a design checklist of things to be achieved. The weighted performance requirements then became a way to manage time, resources, and priorities.
To recap, the procedure we followed was:
In 2023 I was working for the Italian Institute of Technology (IIT) on the ergoCub robot as part of the Humanoid Sensing & Perception (HSP) team. We had to showcase our human-robot interaction module at the International Conference on Robotics and Automation (ICRA) in London. We had a very tight deadline, and a lot of different features to get up and running (decision trees, control algorithms, object recognition, software interfaces, etc.).
The ergoCub robot can recognise a human waving, and respond.
I started by hosting a workshop with my team to review the performance of our previous demo at the start of the year. We brainstormed a bunch of ideas around 3 questions:
You can see that I didn’t frame this explicitly as a Kano model, but there was almost a direct mapping. Afterward, we used Affinity Diagrams to group all these ideas into common categories. This enabled us to assign responsibility based on subject matter expertise.
A brainstorming session for the ergoCub human-robot interaction demo.
Afterwards, we explicitly categorised each of the tasks based on the Kano model. Like before, all the minimum requirements were things that we had to do. For the performance requirements, we used a Prioritisation Matrix1 to compare them all. This revealed where we should invest most of our efforts with limited time. We used this later as part of our project planning & monitoring.
| Group | Item | Category |
|---|---|---|
| Action Recognition | Incorrect action recognition when holding object (box, phone) | Minimum Requirement |
| Additional reactions beyond wave and handshake | Extra | |
| Ability to change actions mid-task | Extra | |
| Idle actions when nothing is happening | Extra | |
| Administration | Book trip to London | Minimum Requirement |
| Plan with AMI to work on the robot | Minimum Requirement | |
| Behaviour Tree | Behaviour tree not responsive (clarification?) | Performance Requirement |
| Code | Code works cross-platform (iCub2, ergoCub, Gazebo simulation) | Minimum Requirement |
| Successful communication over network | Minimum Requirement | |
| Modules from different packages integrate successfully | Minimum Requirement | |
| Code takes a long time to compile (separate .cpp from .h) | Performance Requirement | |
| Load parameters from a configuration file | Performance Requirement | |
| Use thrift communication instead of strings over yarp port | Performance Requirement | |
| Control | Robot can execute joint control | Minimum Requirement |
| Robot can avoid singularities | Minimum Requirement | |
| Robot moves quickly | Performance Requirement | |
| Robot moves smoothly, naturally | Performance Requirement | |
| Robot can jump over obstacles | Extra | |
| Grasping | Robot grasps a box without it slipping | Minimum Requirement |
| Robot can grasp an object from given hand transforms | Performance Requirement | |
| Robot can grasp box from different poses | Performance Requirement | |
| The robot can grasp different objects | Performance Requirement | |
| Robot can more accurately shake hands | Performance Requirement | |
| Force control on the box | Performance Requirement | |
| Hands can follow box as it moves | Extra | |
| Collaboratively grasp and lift a box in a short time | Extra | |
| Hardware | Demonstration runs on ergoCub | Minimum Requirement |
| Ambient lighting affecting vision | Minimum Requirement | |
| Hardware fails so we can’t use the robot | Minimum Requirement | |
| Head / Gaze Control | Reliable human focus detection | Minimum Requirement |
| We can send commands to move the head | Minimum Requirement | |
| Robot looks at & follows object | Performance Requirement, Extra? | |
| Robot follows human gaze | Extra | |
| Robot changes focus to different things in environment | Extra | |
| Neck moves to stabilize head while walking | Unnecessary | |
| Marketing | Live demonstration executes as planned | Minimum Requirement |
| Presentation summarizing research development | Minimum Requirement | |
| Navigation | Robot can navigate successfully in simulation | Minimum Requirement |
| Navigation works on real ergoCub | Extra | |
| Robot localizes without artificial landmarks | Extra | |
| Robot Communication | Robot responds to voice commands | Extra |
| Robot follows human on command | Extra | |
| Robot talks to people | Extra | |
| Robot changes facial expressions based on actions | Extra | |
| Robot can learn on the fly (clarification for this one?) | Extra |
To recap, the overall procedure here was:
The Kano model is a conceptual method of categorising features of a product or service. It helps reveal the priorities you need to have a successful business, and can also be a great way to prioritise tasks in a project.
Some key lessons I’ve learned over the years are:
This is one of the Seven Management & Planning Tools. ↩
Who is the customer of your business? Who receives the output of your work? Surprisingly, it’s not always who you think, and in this article I’d like to demonstrate why. The SIPOC is a fundamental tool in the Lean Six Sigma project management method. When done correctly, it can reveal important insights into a business process. It’s an important step before establishing quality metrics and key performance indicators of your work.
Lean Six Sigma (LSS) is a project management methodology used to optimise the performance of business and engineering systems. It combines the heuristics for process optimisation from Toyota’s Lean production with statistical process control from Motorola’s Six Sigma.
An LSS project is divided into 5 phases:
This is abbreviated as DMAIC.
A core principle of LSS is to define the quality of a product or service with respect to the customer, not what the business itself believes. As such, one of the first steps in the Define phase of a project is to:
There is a canonical tool that we use to help articulate this. But before introducing it, I want to take you through a thinking exercise. Hopefully it will show the value in applying this tool correctly, but also the utility of using project management tools as structured thinking.
Who is the customer for a Bachelor’s degree program at a university? The student? The student pays for the tuition fees, therefore the student is the customer, right? This is the wrong way to think about it, and I will demonstrate why.
First let’s think of the Bachelor’s degree program as a process. The basic steps are:
Next, who enrols in the university system?
And what comes out?
Now we have a clearly defined Input-Process-Output.
The next important step is to identify where all these inputs come from:
These are the suppliers. It is important to connect them directly to inputs so we can trace problems back to the origin.
Finally, where do all these graduates go?
These are the customers of a graduate degree program. In light of this, students are the product, not the customer. This means we should frame the key performance indicators of a Bachelor’s degree program with respect to the customer’s requirements.
If we were to begin by naively asking “What makes a good University?” at the beginning of the project, then we might answer with things like:
But this will lead us to the wrong conclusions. It tells us nothing about the quality of the students coming out of the program.
Instead, by asking the customer, we might get responses like:
Of course, the KPIs will be specific to the field of study. I’m an engineer, so I would frame them in terms of mathematical ability, programming skills, and the ability to use software, whereas a degree like history might emphasise knowledge, writing, and research synthesis.
In light of this, we might measure the quality of graduates through things like:
Admittedly, there is a danger in treating scholasticism as business. A University degree may devolve into merely producing technical competencies, rather than the development of the intellect. The former should be the purview of technical colleges, in my opinion, but I digress.
The Supplier-Input-Process-Output (SIPOC) tool is a staple of the Define phase in a Six Sigma project. Its purpose is to:
Firstly, knowing who is supplying the inputs to a process can be an important first step in resolving quality issues in a product or process. In LSS there is the adage “rubbish in = rubbish out”. If we are receiving poor quality materials & products from our suppliers, this can cause issues within our processes.
Second, having a high-level process map can help with early identification of potential problem areas in the system.
Third, identifying the customer is integral to the success of the project. The next step in the Define phase is usually to develop the Voice of the Customer (VoC). This often involves interviews and focus groups to obtain primary evidence about what the customer actually wants, versus opinions of what we think they want.
It is also important for establishing the product or service specifications (critical to quality factors). By knowing who the customer is, we define quality with respect to their needs. These metrics are what we use in the later phases of the project:
When I was working for Sydney Trains, circa 2013, I did a sabbatical over the summer as a train technician. One thing we had to do was replace faulty membrane dryers from the trains. These were devices that removed moisture from the air before it entered all the pneumatic systems. They were failing quite frequently, and were being replaced often.
When I went back to corporate in the Autumn, I was sitting in on the Six Sigma Green Belt training course. Since I was already employed in the Six Sigma group, I ended up helping other students with their projects. One of the engineering managers was investigating why membrane dryers were failing. We developed the SIPOC, and I, having worked on the trains myself, added some subject matter expertise.
I told him the output is the defective membrane dryer, and we should define who receives it (a customer). It turns out they get put in a box in the storeroom. The supplier was never informed of the problem.
Just from working on the SIPOC, we immediately identified a crucial break point in the overall system. How can our suppliers fix the problem if they never receive a defective unit to inspect? Simply informing them might have resolved the issue straight away.
To me this highlighted 2 important things:
Lamingtons are an Australian delicacy. They are a sponge cake, coated in chocolate sauce, and dipped in coconut shavings. They are best enjoyed with tea or coffee.
When I was living in Italy, I baked lamingtons for my friends & colleagues. The very first batch I ever made turned out perfectly. All the batches after were poor quality: too firm, too dry. I made this example as part of the Six Sigma online course that I’m a guest lecturer for.
Some key insights from this example are:
I never actually figured out what was wrong, but I suspect the self-raising flour had lost its potency. This is a good lesson: check the quality of the inputs (ChatGPT suggested testing the flour’s reaction to warm water).
I asked ChatGPT to generate a SIPOC for the Bachelor’s degree program scenario above. Since the AI learns from examples on the internet, I think it’s amalgamated many poor habits when developing a SIPOC.
Here is what I think it did wrong, or poorly:
What I think it did well was:
To summarise, a diligent application of the SIPOC is crucial to correctly identifying the customer of a business process. This will frame how quality and key performance indicators are developed. It can also provide early insight into potential problem areas for further investigation (or identify them immediately!).
Here are some tips for making a good SIPOC:
Lagrangian mechanics is a sophisticated method for deriving the equations of motion for a dynamic system. The key principle is that it minimises the difference between kinetic and potential energy, integrated across time. But why? In this article, I trace the derivation from Newton’s second law, to Lagrange’s formulation, to Hamilton’s principle of least action. I show that Lagrangian mechanics is just a generalisation of Newton’s law, extended to multi-body systems.
Between 1589 and 1592, Galileo Galilei supposedly dropped two objects of different masses from the Leaning Tower of Pisa to show that acceleration is independent of mass.
Galileo demonstrated that acceleration is independent of mass by dropping two different objects from the Tower of Pisa.
About 100 years later, in 1687, Sir Isaac Newton published his laws of motion in the Principia Mathematica1. His second law of motion codified what Galileo had observed: that the acceleration due to gravity $\frac{d^2 x}{dt^2} = \ddot{x}$ is independent of mass. In light of Galileo’s experiment we would write the equation of motion for a falling object as:
\[m\ddot{x} = -mg ~\Longrightarrow~ \ddot{x} = -g \tag{1}\]where:
Assuming the object starts with zero velocity, we can compute its speed when it impacts the ground using integration:
\[\dot{x}_f = \int_{t_0}^{t_f} \ddot{x}~dt. \tag{2}\]But there’s another way we could solve this problem. The potential energy of an object at any given height is:
\[\mathcal{P}(x) = mgx. \tag{3}\]
The gravitational potential energy in an object is a function of its height. This is converted to kinetic energy as it falls.
And when it hits the ground all of this potential energy is converted to kinetic energy:
\[\mathcal{K}(\dot{x}) = \frac{1}{2}m\dot{x}^2. \tag{4}\]By equating the two we can solve:
\[\begin{align} \frac{1}{2}m\dot{x}_f^2 &= mgx_0 \tag{5a} \\ \dot{x}_f &= \sqrt{2 g x_0}. \tag{5b} \end{align}\]So there are 2 ways to frame this problem that result in the same solution: force, or energy.
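A quick numerical check that the two framings agree (the drop height is an arbitrary value of mine): integrate $\ddot{x} = -g$ directly, and compare against $\sqrt{2gx_0}$ from the energy balance:

```python
import math

g, x0 = 9.81, 20.0          # gravity and an arbitrary drop height (m)

# Energy framing: ½mv² = mgx₀  ⇒  v = √(2gx₀), Eqn. (5b)
v_energy = math.sqrt(2*g*x0)

# Force framing: integrate ẍ = -g until the object reaches the ground, Eqn. (2)
dt = 1e-5
x, v = x0, 0.0
while x > 0.0:
    v -= g*dt               # update velocity from acceleration
    x += v*dt               # update position from velocity
v_force = abs(v)

print(v_energy, v_force)    # the two framings agree to within the step size
```

The energy route gives the answer in one line; the force route requires stepping through time but also yields the full trajectory along the way.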
It is well established that forces in a potential field are the negative of the gradient:
\[\mathcal{P}(x) = mgx ~\Longrightarrow~ f_g = -\frac{d\mathcal{P}}{dx} = -mg. \tag{6}\]But the dynamic forces $m\ddot{x}$ may also be expressed in terms of derivatives of kinetic energy. Specifically, we could re-write Newton’s second law as:
\[\underbrace{\frac{d}{dt}\left(\frac{d\mathcal{K}}{d\dot{x}}\right)}_{m\ddot{x}} = \underbrace{-\frac{d\mathcal{P}}{dx}\vphantom{\begin{bmatrix} a\\ b\end{bmatrix}}}_{-mg}. \tag{7}\]Newton’s laws concern particles; individual, rigid bodies. But Lagrange’s genius was to generalise these principles to systems of rigid bodies2. Now we consider the configuration for a rigid body system denoted by $\mathbf{q}\in\mathbb{R}^n$ (e.g., a vector of joint angles for a robot), and the associated velocities $\dot{\mathbf{q}}\in\mathbb{R}^n$.
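The 1D identity in Eqn. (7) can be checked numerically with finite differences along a free-fall trajectory (the mass, trajectory, and evaluation time below are arbitrary choices of mine):

```python
m, g = 2.0, 9.81
h = 1e-5                                  # finite-difference step

x = lambda t: -0.5*g*t*t                  # free-fall trajectory with x(0)=0, ẋ(0)=0
v = lambda t: -g*t                        # its velocity
K = lambda vel: 0.5*m*vel*vel             # kinetic energy
P = lambda pos: m*g*pos                   # potential energy

# ∂K/∂ẋ evaluated along the trajectory, by central difference
dK_dv = lambda t: (K(v(t)+h) - K(v(t)-h)) / (2*h)

lhs = (dK_dv(1.0+h) - dK_dv(1.0-h)) / (2*h)     # d/dt(∂K/∂ẋ) = mẍ
rhs = -(P(x(1.0)+h) - P(x(1.0)-h)) / (2*h)      # -dP/dx = -mg
# both sides equal -mg
```

Since $\mathcal{K}$ is quadratic in velocity and $\mathcal{P}$ is linear in position, the central differences here are essentially exact, and both sides come out to $-mg$.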
If the energy in a closed system is conserved, then it follows that an infinitesimal change in the kinetic energy must equal an infinitesimal change in potential energy:
\[\delta\mathcal{K} = \delta\mathcal{P}. \tag{8}\]Three things to keep in mind here:
Taking the variation, we consider the effect of infinitesimal changes in configuration $\delta\mathbf{q}$ and velocity $\delta\dot{\mathbf{q}}$ on energy balance:
\[\delta\mathbf{q}^T\frac{\partial\mathcal{K}}{\partial\mathbf{q}} + \delta\dot{\mathbf{q}}^T\frac{\partial\mathcal{K}}{\partial\dot{\mathbf{q}}} = \delta\mathbf{q}^T\frac{\partial\mathcal{P}}{\partial\mathbf{q}}. \tag{9}\]Then we can use integration by parts to eliminate $\delta\dot{\mathbf{q}}$:
\[\delta\dot{\mathbf{q}}^T\frac{\partial\mathcal{K}}{\partial\dot{\mathbf{q}}} = -\delta\mathbf{q}^T\frac{d}{dt}\left(\frac{\partial \mathcal{K}}{\partial\dot{\mathbf{q}}}\right). \tag{10}\]Now putting Eqn. (10) back in to Eqn (9) we obtain:
\[\begin{align} \delta\mathbf{q}^T\left(\frac{\partial\mathcal{K}}{\partial\mathbf{q}} -\frac{d}{dt}\left(\frac{\partial \mathcal{K}}{\partial\dot{\mathbf{q}}}\right)\right) &= \delta\mathbf{q}^T\frac{\partial\mathcal{P}}{\partial\mathbf{q}} \tag{11a} \\ \frac{d}{dt}\left(\frac{\partial \mathcal{K}}{\partial\dot{\mathbf{q}}}\right) - \frac{\partial\mathcal{K}}{\partial\mathbf{q}} &= -\frac{\partial \mathcal{P}}{\partial\mathbf{q}}. \tag{11b} \end{align}\]Equation (11a) is d’Alembert’s principle. It is the projection of a virtual displacement $\delta\mathbf{q}$ on to the forces acting on the system, which should sum to zero.3 More importantly, Eqn. (11b) gives Lagrange’s equations for the dynamics of a rigid body system. Note its structural similarity to (a generalisation of) Eqn. (7).
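As a concrete (hypothetical) example of Eqn. (11b) in action, take a simple pendulum with $\mathcal{K} = \frac{1}{2}ml^2\dot{\theta}^2$ and $\mathcal{P} = -mgl\cos(\theta)$; Lagrange's equation reduces to $ml^2\ddot{\theta} = -mgl\sin(\theta)$. A minimal sketch, integrating this and checking that $\mathcal{K} + \mathcal{P}$ stays (nearly) constant, as it must for a conservative system:

```python
import math

m, l, g = 1.0, 1.0, 9.81  # assumed unit mass and length

def energy(th, thd):
    K = 0.5 * m * l**2 * thd**2    # kinetic energy
    P = -m * g * l * math.cos(th)  # potential energy (pivot at origin)
    return K + P

# Eqn (11b) with the K, P above reduces to: m*l^2*thdd = -m*g*l*sin(th)
th, thd = 1.0, 0.0
E0 = energy(th, thd)
dt = 1e-5
for _ in range(100000):                  # 1 second of motion
    thd += -(g / l) * math.sin(th) * dt  # semi-implicit (symplectic) Euler
    th += thd * dt

drift = abs(energy(th, thd) - E0)
print(drift)  # small: the conservative dynamics preserve K + P
```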
What is Eqn. (11b) telling us?
Firstly, Newton didn’t state his second law as “force equals mass times acceleration”, as often recited. What he wrote was:
“Lex II: Mutationem motus proportionalem esse vi motrici impressae, et fieri secundum lineam rectam qua vis illa imprimitur.” 1
or in English:
“Law II: The change of motion is proportional to the motive force impressed; and is made in the direction of the straight line in which that force is impressed.”
In modern parlance we would say that force is equal to the time derivative of momentum. For a system of rigid bodies, we would denote its generalised inertia matrix as $\mathbf{M}(\mathbf{q}) = \mathbf{M}(\mathbf{q})^T\in\mathbb{R}^{n\times n}$. Then its kinetic energy is:
\[\mathcal{K}(\mathbf{q},\dot{\mathbf{q}}) = \frac{1}{2}\dot{\mathbf{q}}^T\mathbf{M}(\mathbf{q})\dot{\mathbf{q}} \tag{12}\]and the momentum:
\[\mathbf{p} = \mathbf{M}(\mathbf{q})\dot{\mathbf{q}} = \frac{\partial\mathcal{K}}{\partial\dot{\mathbf{q}}}. \tag{13}\]So the first term of Eqn. (11b), $\frac{d}{dt}\left(\frac{\partial\mathcal{K}}{\partial\dot{\mathbf{q}}}\right)$, is precisely the time derivative of momentum: Lagrange’s equations restate Newton’s second law in generalised coordinates.
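Eqn. (13) is easy to verify numerically: the gradient of the kinetic energy with respect to velocity should equal $\mathbf{M}(\mathbf{q})\dot{\mathbf{q}}$. A small sketch with an assumed example inertia matrix, comparing finite differences of $\mathcal{K}$ against the analytic momentum:

```python
# M is an assumed example inertia matrix (symmetric positive definite)
M = [[2.0, 0.3],
     [0.3, 1.5]]
qd = [0.7, -0.4]

def K(v):
    """Kinetic energy, Eqn (12): (1/2) v^T M v."""
    return 0.5 * sum(v[i] * M[i][j] * v[j] for i in range(2) for j in range(2))

# Momentum via Eqn (13): p = M qd
p_analytic = [sum(M[i][j] * qd[j] for j in range(2)) for i in range(2)]

# Momentum via the gradient dK/dqd (central finite differences)
h = 1e-6
p_numeric = []
for i in range(2):
    up = qd[:]; up[i] += h
    dn = qd[:]; dn[i] -= h
    p_numeric.append((K(up) - K(dn)) / (2 * h))

print(p_analytic, p_numeric)  # the two agree
```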
There was a different thread running through history at the same time. In 1744 Pierre-Louis Moreau de Maupertuis philosophised that:
“…in all changes that happen in nature, the amount of action is as small as possible.” 4
He would later denote this as the integral of momentum over distance, which is twice the kinetic energy across time:
\[A = \int m\dot{x} ~dx = \int m\dot{x}^2~dt. \tag{14}\]This definition didn’t pan out, as evinced by history. Sir William Rowan Hamilton would propose the canonical form still used in classical mechanics today5. He observed that Lagrange’s equations of motion, Eqn. (11b), may first be derived from the function:
\[\mathcal{L}(\mathbf{q},\dot{\mathbf{q}}) = \mathcal{K}(\mathbf{q},\dot{\mathbf{q}}) - \mathcal{P}(\mathbf{q}) ~:~ \mathbb{R}^{n}\times\mathbb{R}^n\to\mathbb{R} \tag{15}\]This is, unsurprisingly, referred to as the Lagrangian. Then, via the calculus of variations we obtain the (surprise!) Euler-Lagrange equation:
\[\frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\mathbf{\dot{q}}}\right) - \frac{\partial\mathcal{L}}{\partial\mathbf{q}} = \mathbf{0}. \tag{16}\]This is equivalent to (11b). Reverse-engineering this, the action is defined as:
\[A = \int \underbrace{\mathcal{K}(\mathbf{q},\dot{\mathbf{q}}) - \mathcal{P}(\mathbf{q})\vphantom{\begin{matrix} a \\ b \end{matrix}}}_{\mathcal{L}(\mathbf{q},\dot{\mathbf{q}})}~dt \tag{17}\]which has the SI units of joule-seconds. It follows that, for a conservative system, the equations of motion are an extremum of the action:
\[\delta A = \int \delta\mathcal{L}~dt = 0 ~\Longrightarrow~\delta \mathcal{L} = 0 \tag{18}\]whose solution is (16). The second variation, with respect to $\dot{\mathbf{q}}$, is:
\[\frac{\partial^2\mathcal{L}}{\partial\dot{\mathbf{q}}^2} = \mathbf{M}(\mathbf{q}) \succ 0. \tag{19}\]The inertia matrix is positive definite, such that kinetic energy is always positive.6 Hence Eqn. (16), equivalently (11b), is a minimum of the action.
Newton’s law is about the instantaneous balance of forces. Equation (17) is a metric across time. That is, the trajectory that a system of rigid bodies follows through a potential field minimises the time-integrated difference between its kinetic and potential energy.
Newton, I. (1687). Philosophiæ Naturalis Principia Mathematica. Royal Society, London. First edition. ↩ ↩2
Lagrange, J.-L. (1788). Mécanique analytique. Imprimerie de la République, Paris. Available online at various archives. ↩
Virtual displacements do no net work, since they’re not real. Obviously. ↩
Maupertuis, P. L. M. (1744). Accord de différentes loix de la nature qui avoient jusqu’ici paru incompatibles. Mémoires de l’Académie Royale des Sciences de Paris, pages 417–426. ↩
Hamilton, W. R. (1835). Second essay on a general method in dynamics. Philosophical Transactions of the Royal Society of London, 125:95–144. ↩
A matrix $\mathbf{A} = \mathbf{A}^T \in\mathbb{R}^{n\times n}$ is positive definite if $\mathbf{x}^T\mathbf{A}\mathbf{x} > 0$ for all $\mathbf{x}\in\mathbb{R}^n$, $\mathbf{x}\ne\mathbf{0}$. ↩
Quaternions are sophisticated mathematical objects that are used to represent orientation in 3D for robotics, animation, and aerospace. In this article I trace a logical sequence from using complex numbers as rotations toward the derivation of the quaternion itself. I then derive the Lie group properties for combining and inverting quaternions. Lastly, I show how they can be used to rotate vectors, and some of their advantages over rotation matrices.
Euler’s formula states that:
\[e^{i\psi} = \cos(\psi) + i\cdot\sin(\psi) \in\mathbb{C} ~,~ i = \sqrt{-1}. \tag{1}\]We can think of this as a rotation in the complex plane (Fig. 1). When we multiply exponentials together, we add their exponents. This equates to adding rotations together (Fig. 1):
\[e^{i\psi}\cdot e^{i\phi} = e^{i(\psi + \phi)} = \cos(\psi + \phi) + i\cdot\sin(\psi + \phi). \tag{2}\]
Figure 1: A complex number represents a rotation in the complex plane. Multiplying complex numbers is equivalent to adding rotations.
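Both claims are easy to check numerically with Python's built-in complex type; the angles below are arbitrary example values:

```python
import cmath

psi, phi = 0.6, 0.9  # arbitrary example angles

# Eqn (2): multiplying unit complex numbers adds their rotation angles
lhs = cmath.exp(1j * psi) * cmath.exp(1j * phi)
rhs = cmath.exp(1j * (psi + phi))
print(abs(lhs - rhs))  # ~0

# A rotation preserves length: multiplying z by e^{i*psi} leaves |z| unchanged
z = complex(1.0, 2.0)
rotated = cmath.exp(1j * psi) * z
print(abs(rotated), abs(z))  # equal
```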
If we took a complex number:
\[\mathrm{z} = \mathrm{x} + i\cdot\mathrm{y}\in\mathbb{C} \tag{3}\]and multiplied it by Eqn. (1) then we would get:
\[\begin{align} e^{i\psi}\cdot \mathrm{z} &= \left(\cos(\psi) + i\cdot\sin(\psi)\right)\left(\mathrm{x} +i\cdot \mathrm{y}\right) \tag{4a} \\ &= \mathrm{x}\cdot\cos(\psi) - \mathrm{y}\cdot\sin(\psi) + i\left(\mathrm{x}\cdot\sin(\psi) + \mathrm{y}\cdot\cos(\psi)\right) \tag{4b} \end{align}\]But we could also represent Eqn. (3) as a vector:
\[\mathbf{v} = \begin{bmatrix} \mathrm{x} \\ \mathrm{y} \end{bmatrix} \begin{matrix} \leftarrow \text{Real part}\phantom{abcd} \\ \leftarrow \text{Complex part} \tag{5} \end{matrix}\]In the same manner, we could write Eqn. (4) as:
\[\begin{bmatrix} \mathrm{x}\cdot\cos(\psi) - \mathrm{y}\cdot\sin(\psi) \\ \mathrm{x}\cdot\sin(\psi) + \mathrm{y}\cdot\cos(\psi) \end{bmatrix} = \underbrace{ \begin{bmatrix} \cos(\psi) & -\sin(\psi) \\ \sin(\psi) & \phantom{-}\cos(\psi) \end{bmatrix} }_{\mathbf{R}} \underbrace{ \begin{bmatrix} \mathrm{x} \\ \mathrm{y} \end{bmatrix} }_{\mathbf{v}}. \tag{6}\]This matrix $\mathbf{R}$ is in fact a 2D rotation matrix. It belongs to the Special Orthogonal group:
\[\mathbb{SO}(n) = \left\{ \mathbf{R}\in\mathbb{R}^{n\times n} ~\big|~ \mathbf{RR}^T = \mathbf{I}~,~ det(\mathbf{R}) = 1 \right\}. \tag{7}\]Multiplying a complex number by Euler’s formula is equivalent to rotating a 2D vector with a 2D rotation matrix. But this isn’t the only connection between complex numbers and 2D rotations. An eigenvector $\mathbf{v}$ of $\mathbf{R}\in\mathbb{SO}(2)$ satisfies the identity:
\[\mathbf{Rv} = \lambda\mathbf{v} \tag{8}\]where $\lambda$ is the corresponding eigenvalue. We can find the eigenvalue(s) of a 2D matrix using the shortcut:
\[\begin{align} \lambda^2 - trace(\mathbf{R})\cdot \lambda + det(\mathbf{R}) &= 0 \tag{9a} \\ \lambda^2 -2\cos(\psi)\cdot\lambda + 1&= 0 \tag{9b} \end{align}\]where $trace(\mathbf{R}) = 2\cos(\psi)$ and $det(\mathbf{R}) = 1$.
We can then solve Eqn. (9) with the quadratic formula and some trigonometric identities:
\[\begin{align} \lambda &= \cos(\psi) \pm \sqrt{\cos^2(\psi)-1 } \tag{10a}\\ &= \cos(\psi) \pm \sqrt{-\sin^2(\psi)} \tag{10b} \\ &= \cos(\psi) \pm i\cdot\sin(\psi) \in\mathbb{C}. \tag{10c} \end{align}\]The eigenvalues of $\mathbb{SO}(2)$ are complex numbers. Is this surprising? Take a look at Eqn. (4), (6) and (8) again:
\[e^{i\psi}\cdot \mathrm{z} = \lambda\mathbf{v} = \mathbf{R}\mathbf{v}. \tag{11}\]Now you may be thinking: if 1 complex element gives rotation in 2D, then 2 complex elements are needed for rotation in 3D. Let’s declare an “extended” complex number where $j = \sqrt{-1}$:
\[\mathrm{x} + i\cdot \mathrm{y} + j \cdot \mathrm{z} \in\mathbb{C}^2. \tag{12}\]What happens when we multiply two of them together?
\[\begin{align} \left(\mathrm{x} + i\cdot \mathrm{y} + j\cdot \mathrm{z}\right)\left(\mathrm{x} + i\cdot \mathrm{y} + j\cdot \mathrm{z}\right) &= \underbrace{\mathrm{x}^2 - \mathrm{y}^2 - \mathrm{z}^2}_{\text{Real}} + \underbrace{i\cdot 2\mathrm{xy} + j\cdot 2\mathrm{xz}}_{\text{Complex}} + \underbrace{\left(ij + ji\right)\cdot \mathrm{yz} }_{\text{???}} \notin\mathbb{C}^2 \tag{13} \end{align}\]What is $ij$ and $ji$? The mathematical object on the right is different from the object on the left. The problem is that Eqn. (12) is not a Lie group.
Lie groups are mathematical objects that satisfy 4 properties:
Complex numbers form a Lie group. This is why we could rotate another complex number using Eqn. (4). We multiply 2 complex numbers, and get a 3rd. Equation (13) violates the closure property. To represent rotations in 3D, we need a Lie group so that we can use the closure property to combine them.
| Group Properties of $\mathbb{C}$ (Over Multiplication) | |
|---|---|
| Closure: | $\mathrm{z}_1,\mathrm{z}_2\in\mathbb{C}~:~ \mathrm{z}_1 \mathrm{z}_2 \in\mathbb{C}$ |
| Associativity: | $\left(\mathrm{z}_1 \mathrm{z}_2\right) \mathrm{z}_3 = \mathrm{z}_1 \left(\mathrm{z}_2 \mathrm{z}_3\right)$ |
| Identity: | $1 \equiv 1+i\cdot0\in\mathbb{C}: 1\mathrm{z} = \mathrm{z}$ |
| Inverse: | $\mathrm{z}^{-1} = \frac{\bar{\mathrm{z}}}{\mathrm{z}\bar{\mathrm{z}}} ~:~ \mathrm{z}^{-1}\mathrm{z} = 1 + i\cdot 0$ |
Sir William Rowan Hamilton proposed the now famous quaternion:
\[\boldsymbol{q} = \mathrm{w} + i\cdot \mathrm{x} + j\cdot \mathrm{y} + k\cdot \mathrm{z} \in\mathbb{H} \tag{14}\]where $i^2 = j^2 = k^2 = -1$. By multiplying 2 of them together with standard rules for arithmetic we obtain:
\[\begin{align} \boldsymbol{q}_1\cdot\boldsymbol{q}_2 &= (\mathrm{w}_1\mathrm{w}_2 - \mathrm{x}_1\mathrm{x}_2 - \mathrm{y}_1\mathrm{y}_2 -\mathrm{z}_1\mathrm{z}_2) \nonumber \\ &+ i\cdot(\mathrm{w}_1 \mathrm{x}_2 + \mathrm{x}_1 \mathrm{w}_2) + j\cdot(\mathrm{w}_1 \mathrm{y}_2 + \mathrm{y}_1 \mathrm{w}_2) + k\cdot(\mathrm{w}_1 \mathrm{z}_2 + \mathrm{z}_1 \mathrm{w}_2) \nonumber \\ &+ ij\cdot\mathrm{x}_1\mathrm{y}_2 + ji\cdot\mathrm{y}_1\mathrm{x}_2 + jk\cdot\mathrm{y}_1\mathrm{z}_2 + kj\cdot\mathrm{z}_1\mathrm{y}_2 + ki\cdot\mathrm{z}_1\mathrm{x}_2 + ik\cdot\mathrm{x}_1\mathrm{z}_2. \tag{15} \end{align}\]On October 16th, 1843, he had an epiphany about how to resolve the closure property. His insight was to say that $ijk = -1$. He inscribed this now famous identity on to the Brougham Bridge in Dublin (Fig. 2).
Figure 2: A plaque on Brougham (Broom) Bridge commemorating Hamilton's invention.
(JP, William Rowan Hamilton Plaque, CC BY-SA 2.0)
| Quaternion Multiplication Rules | | | |
|---|---|---|---|
| $\times$ | $\phantom{-}i$ | $\phantom{-}j$ | $\phantom{-}k$ |
| $i$ | $-1$ | $\phantom{-}k$ | $-j$ |
| $j$ | $-k$ | $-1$ | $\phantom{-}i$ |
| $k$ | $\phantom{-}j$ | $-i$ | $-1$ |
The key is that quaternions obey their own rules for multiplication. Specifically, we resolve $ij = k$, $ji = -k$, etc. That way $ijk = k^2 = -1$. We may now complete Eqn. (15):
\[\begin{align} \boldsymbol{q}_1\cdot\boldsymbol{q}_2 &= \phantom{h\cdot}(\mathrm{w}_1\mathrm{w}_2 - \mathrm{x}_1\mathrm{x}_2 - \mathrm{y}_1\mathrm{y}_2 -\mathrm{z}_1\mathrm{z}_2) \nonumber \\ &+ i\cdot(\mathrm{w}_1 \mathrm{x}_2 + \mathrm{x}_1 \mathrm{w}_2 + \mathrm{y}_1\mathrm{z}_2 - \mathrm{z}_1\mathrm{y}_2) \nonumber\\ &+ j\cdot(\mathrm{w}_1 \mathrm{y}_2 + \mathrm{y}_1 \mathrm{w}_2 + \mathrm{z}_1\mathrm{x}_2 - \mathrm{x}_1\mathrm{z}_2 ) \nonumber \\ &+ k\cdot(\mathrm{w}_1 \mathrm{z}_2 + \mathrm{z}_1 \mathrm{w}_2 + \mathrm{x}_1\mathrm{y}_2 - \mathrm{y}_1\mathrm{x}_2) \in\mathbb{H} \tag{16} \end{align}\]which satisfies the closure property for a Lie group.
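A minimal sketch of the Hamilton product, transcribing Eqn. (16) directly with quaternions as plain (w, x, y, z) tuples; the multiplication table and Hamilton's identity $ijk = -1$ fall out:

```python
def qmul(a, b):
    """Hamilton product, Eqn (16); quaternions as (w, x, y, z) tuples."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 + y1*w2 + z1*x2 - x1*z2,
            w1*z2 + z1*w2 + x1*y2 - y1*x2)

i = (0.0, 1.0, 0.0, 0.0)
j = (0.0, 0.0, 1.0, 0.0)
k = (0.0, 0.0, 0.0, 1.0)

print(qmul(i, j))           # (0, 0, 0, 1) = k
print(qmul(j, i))           # (0, 0, 0, -1) = -k
print(qmul(qmul(i, j), k))  # (-1, 0, 0, 0): Hamilton's identity ijk = -1
```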
If the exponential of a purely imaginary complex number represents a rotation, Eqn. (1), what about a purely complex quaternion?
\[\boldsymbol{p} = i\cdot\mathrm{x} + j\cdot\mathrm{y} + k\cdot\mathrm{z} \in\mathbb{H}. \tag{17}\]When exponentiating Eqn. (17) we obtain:
\[e^{\boldsymbol{p}} = \sum_{n=0}^{\infty} \frac{(\|\boldsymbol{p}\|\cdot\hat{\boldsymbol{p}})^n} {n!} \tag{18}\]where $\hat{\boldsymbol{p}} = \frac{\boldsymbol{p}}{\|\boldsymbol{p}\|}$ such that $\hat{\boldsymbol{p}}^2 = -1$. We can split this into even and odd terms and simplify them a little:
\[\begin{align} (\|\boldsymbol{p}\|\cdot\hat{\boldsymbol{p}})^{2n\phantom{+1}} &= (-1)^n \cdot\|\boldsymbol{p}\|^{2n} \tag{19a} \\ (\|\boldsymbol{p}\|\cdot\hat{\boldsymbol{p}})^{2n+1} &= (-1)^n \cdot\|\boldsymbol{p}\|^{2n+1} \cdot \hat{\boldsymbol{p}}. \tag{19b} \end{align}\]By substituting Eqn. (19a) & (19b) in to (18) we arrive at:
\[e^{\boldsymbol{p}} = \underbrace{\sum_{n=0}^{\infty} \frac{(-1)^n \cdot\|\boldsymbol{p}\|^{2n}} {(2n)!}}_{\cos(\|\boldsymbol{p}\|)} + \underbrace{\sum_{n=0}^{\infty} \frac{(-1)^n \cdot\|\boldsymbol{p}\|^{2n+1}} {(2n+1)!}}_{_{\sin(\|\boldsymbol{p}\|)}}\cdot\hat{\boldsymbol{p}} \in\mathbb{H} \tag{20}\]which is itself a quaternion. In this context, $\|\boldsymbol{p}\|$ acts as the angle of rotation, and $\hat{\boldsymbol{p}}$ as the axis.
This is exactly what Euler’s rotation theorem states: any 3D rotation may be parameterised by an angle of rotation about a fixed axis. Thus, we can use quaternions to represent rotation. But not just any quaternion; it must be the exponential of a purely imaginary quaternion.
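The exponential in Eqn. (20) is straightforward to implement from its closed form. A sketch, with pure quaternions as (x, y, z) tuples and an arbitrary example input:

```python
import math

def qexp_pure(p):
    """Exponential of a pure quaternion p = (x, y, z), Eqn (20)."""
    norm = math.sqrt(sum(c * c for c in p))
    if norm < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)  # e^0 is the identity quaternion
    axis = [c / norm for c in p]     # the unit vector p-hat
    s = math.sin(norm)
    return (math.cos(norm), s * axis[0], s * axis[1], s * axis[2])

q = qexp_pure((0.3, -0.5, 0.8))
print(sum(c * c for c in q))  # 1.0: the exponential of a pure quaternion has unit norm
```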
I am now going to switch notation, and from (14) I am going to define:
\[\eta = \mathrm{w} ~,~ \boldsymbol{\varepsilon} = \begin{bmatrix} \mathrm{x} \\ \mathrm{y} \\ \mathrm{z} \end{bmatrix} ~\longrightarrow~ \boldsymbol{q} = \begin{bmatrix} \eta \\ \boldsymbol{\varepsilon} \end{bmatrix}. \tag{21}\]From careful inspection of Eqn. (16) we can now re-write the product of 2 quaternions using 2 familiar vector operations, the dot product1 and the cross product:
\[\boldsymbol{q}_1 \cdot \boldsymbol{q}_2 = \begin{bmatrix} \eta_1\eta_2 - \boldsymbol{\varepsilon}_1^T\boldsymbol{\varepsilon}_2 \\ \eta_1 \boldsymbol{\varepsilon}_2 + \eta_2\boldsymbol{\varepsilon}_1 + \boldsymbol{\varepsilon}_1\times\boldsymbol{\varepsilon}_2 \end{bmatrix} \in\mathbb{H}. \tag{22}\]To re-iterate, this is the closure property of $\mathbb{H}$. In fact, if the product of any 2 quaternions is another quaternion, then the associativity property follows:
\[\left(\boldsymbol{q}_1 \cdot \boldsymbol{q}_2\right) \cdot \boldsymbol{q}_3 = \boldsymbol{q}_1 \cdot \left(\boldsymbol{q}_2 \cdot \boldsymbol{q}_3\right). \tag{23}\]Be careful though; since $\boldsymbol{\varepsilon}_1\times\boldsymbol{\varepsilon}_2 \ne \boldsymbol{\varepsilon}_2\times\boldsymbol{\varepsilon}_1$ it is also the case that $\boldsymbol{q}_1\cdot\boldsymbol{q}_2 \ne \boldsymbol{q}_2\cdot\boldsymbol{q}_1$.
The identity element of a quaternion is the same as $\mathbb{C}$, a unit real part and zero complex part:
\[\boldsymbol{\iota} = \begin{bmatrix} 1 \\ \mathbf{0} \end{bmatrix} \in\mathbb{H}~\Longrightarrow ~ \boldsymbol{q}\cdot\boldsymbol{\iota} = \boldsymbol{q}. \tag{24}\]Now, for a complex number we obtain the conjugate by negating the complex component. The product of a complex number and its conjugate gives a purely real number:
\[\mathrm{z} = \mathrm{x} + i\cdot\mathrm{y}~,~\bar{\mathrm{z}} = \mathrm{x} - i\cdot\mathrm{y} \in\mathbb{C} ~\Longrightarrow~ \mathrm{z\bar{z}} = \mathrm{x}^2 + \mathrm{y}^2 \in\mathbb{R}. \tag{25}\]The same is true of quaternions. We form the conjugate by negating the complex component. And when we multiply a quaternion with its conjugate we end up with a purely real number:
\[\bar{\boldsymbol{q}} = \begin{bmatrix} \phantom{-}\eta \\ -\boldsymbol{\varepsilon} \end{bmatrix} ~\Longrightarrow~ \boldsymbol{q}\cdot\bar{\boldsymbol{q}} = \begin{bmatrix} \eta^2 + \boldsymbol{\varepsilon}^T\boldsymbol{\varepsilon} \\ \mathbf{0} \end{bmatrix}. \tag{26}\]Can you see it? Eqn. (26) leads to the identity Eqn. (24) if, and only if:
\[\underbrace{\eta^2 + \boldsymbol{\varepsilon}^T\boldsymbol{\varepsilon}}_{\mathrm{w^2 + x^2 + y^2 + z^2}} = 1. \tag{27}\]Parameters $(\eta,\boldsymbol{\varepsilon})$ satisfying this unit-norm condition are known as the Euler-Rodrigues parameters. We already have a solution using the exponential quaternion Eqn. (20):
\[\boldsymbol{v} =e^{\tfrac{1}{2}\mathbf{a}} = \underbrace{\cos\left(\tfrac{1}{2}\alpha\right)}_{\eta} + \underbrace{\sin\left(\tfrac{1}{2}\alpha\right)\hat{\mathbf{a}}}_{\boldsymbol{\varepsilon}} \in \mathbb{S}^3\subset \mathbb{H} \tag{28}\]where $\mathbf{a} = \alpha\cdot\hat{\mathbf{a}}$ (the angle-axis parameterisation). The reason for the half angle will be apparent later. A quaternion of unit norm is called a versor. Equation (27) implies that the versor is a point on the surface of a 4D sphere, hence $\mathbb{S}^3$ (4D volume, 3D surface).
So for a versor, the conjugate is the inverse element since:
\[\boldsymbol{v}\cdot\bar{\boldsymbol{v}} = \boldsymbol{\iota}. \tag{29}\]We have now completed the Lie group properties; not for quaternions $\mathbb{H}$ per se, but for versors $\mathbb{S}^3\subset\mathbb{H}$.
| Group Properties for $\mathbb{S}^3\subset\mathbb{H}$ | |
|---|---|
| Closure: | $\boldsymbol{v}_1,\boldsymbol{v}_2\in\mathbb{S}^3~:~\boldsymbol{v}_1\cdot\boldsymbol{v}_2\in\mathbb{S}^3$ |
| Associativity: | $\left(\boldsymbol{v}_1\cdot\boldsymbol{v}_2\right)\cdot\boldsymbol{v}_3 = \boldsymbol{v}_1\cdot\left(\boldsymbol{v}_2\cdot\boldsymbol{v}_3\right)$ |
| Identity: | $\boldsymbol{\iota} = \begin{bmatrix} 1 & \mathbf{0} \end{bmatrix}^T\in\mathbb{S}^3 ~:~ \boldsymbol{v}\cdot\boldsymbol{\iota} = \boldsymbol{v}$ |
| Inverse: | $\bar{\boldsymbol{v}} = \begin{bmatrix} \eta & -\boldsymbol{\varepsilon}^T\end{bmatrix}^T~:~ \boldsymbol{v}\cdot\bar{\boldsymbol{v}} = \boldsymbol{\iota}$ |
To rotate a vector $\mathbf{v}\in\mathbb{R}^3$ we embed it as a pure quaternion $\begin{bmatrix} 0 & \mathbf{v}^T \end{bmatrix}^T$, pre-multiply by the versor $\boldsymbol{v}$, and post-multiply by the conjugate $\bar{\boldsymbol{v}}$.
The result is:
\[\begin{bmatrix} 0 \\ \mathbf{u} \end{bmatrix} = \overbrace{ \begin{bmatrix} \eta \\ \boldsymbol{\varepsilon} \end{bmatrix} }^{\boldsymbol{v}} \cdot \begin{bmatrix} 0 \\ \mathbf{v} \end{bmatrix} \cdot \overbrace{ \begin{bmatrix} \phantom{-}\eta \\ -\boldsymbol{\varepsilon} \end{bmatrix} }^{\bar{\boldsymbol{v}}} = \begin{bmatrix} 0 \\ \mathbf{R}(\eta,\boldsymbol{\varepsilon})\mathbf{v} \end{bmatrix}. \tag{30}\]First, we need the half-angle in Eqn. (28) so that, when we apply this left-side and right-side product, we end up with zero in the real part of the result. Without it, we wouldn’t have a pure quaternion (try it!).
Second, any rotation of a vector $\mathbf{v}\in\mathbb{R}^n \to \mathbf{u}\in\mathbb{R}^n$ that preserves its length is equivalent to applying a rotation matrix $\mathbf{R}\in\mathbb{SO}(n)$. If we were to expand Eqn. (30) we would find:
\[\mathbf{R}(\eta,\boldsymbol{\varepsilon}) = \begin{bmatrix} 1 - 2(\varepsilon_2^2 + \varepsilon_3^2) & 2(\varepsilon_1 \varepsilon_2 - \eta \varepsilon_3) & 2(\varepsilon_1 \varepsilon_3 + \eta \varepsilon_2) \\ 2(\varepsilon_1 \varepsilon_2 + \eta \varepsilon_3) & 1 - 2(\varepsilon_1^2 + \varepsilon_3^2) & 2(\varepsilon_2 \varepsilon_3 - \eta \varepsilon_1) \\ 2(\varepsilon_1 \varepsilon_3 - \eta \varepsilon_2) & 2(\varepsilon_2 \varepsilon_3 + \eta \varepsilon_1) & 1 - 2(\varepsilon_1^2 + \varepsilon_2^2) \end{bmatrix}\in\mathbb{SO}(3) \tag{31}\]Now we have a short-hand for constructing a rotation matrix from a versor. This is more efficient because we can skip all the calculations that cancel to zero.
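A sketch of Eqns. (28) and (31) together, using an assumed test rotation of 90° about the z-axis:

```python
import math

def versor(angle, axis):
    """Eqn (28): versor from angle-axis (axis assumed unit length)."""
    h = 0.5 * angle
    s = math.sin(h)
    return (math.cos(h), s * axis[0], s * axis[1], s * axis[2])

def rotation_matrix(q):
    """Eqn (31): 3x3 rotation matrix from a versor q = (eta, e1, e2, e3)."""
    n, e1, e2, e3 = q
    return [[1 - 2*(e2*e2 + e3*e3), 2*(e1*e2 - n*e3),      2*(e1*e3 + n*e2)],
            [2*(e1*e2 + n*e3),      1 - 2*(e1*e1 + e3*e3), 2*(e2*e3 - n*e1)],
            [2*(e1*e3 - n*e2),      2*(e2*e3 + n*e1),      1 - 2*(e1*e1 + e2*e2)]]

q = versor(math.pi / 2, (0.0, 0.0, 1.0))  # 90 degrees about z
R = rotation_matrix(q)
v = (1.0, 0.0, 0.0)
u = tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))
print(u)  # ≈ (0, 1, 0): the x-axis rotates on to the y-axis
```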
NOTE: $\boldsymbol{v}$ and $-\boldsymbol{v}$ represent the same orientation. This is because $\mathbf{u} = (-\boldsymbol{v})\cdot\mathbf{v}\cdot(-\bar{\boldsymbol{v}}) = \boldsymbol{v}\cdot\mathbf{v}\cdot\bar{\boldsymbol{v}}$. You can think of it like this: facing South and walking backwards is equivalent to facing North and walking forwards.
Quaternions are used in animation, robotics, and aerospace. They require fewer floating point operations (FLOPs) when propagating rotations than rotation matrices do. However, they are more costly when rotating vectors. This cost can be reduced from 56 FLOPs to 39 FLOPs by first forming a rotation matrix, Eqn. (31), and then performing the rotation.
Quaternions are also much more efficient for storing and transmitting data. They only require 4 parameters, versus 9 for rotation matrices. This is important when we have limited bandwidth, and limited storage space.
They are also numerically stable. Successive rotations will lead to an accumulation of floating point error. We can easily re-normalise a versor to preserve Eqn. (27).
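A small sketch of this renormalisation, using an assumed 1 mrad incremental rotation applied many times:

```python
import math

def qmul(a, b):
    """Hamilton product of quaternions as (w, x, y, z) tuples."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 + y1*w2 + z1*x2 - x1*z2,
            w1*z2 + z1*w2 + x1*y2 - y1*x2)

# A small rotation about x, applied many times: floating point error accumulates
step = (math.cos(1e-3), math.sin(1e-3), 0.0, 0.0)
q = (1.0, 0.0, 0.0, 0.0)
for _ in range(100000):
    q = qmul(q, step)

norm = math.sqrt(sum(c * c for c in q))
print(abs(norm - 1.0))        # tiny drift away from unit norm

# Re-normalising restores the unit-norm condition, Eqn. (27)
q = tuple(c / norm for c in q)
print(sum(c * c for c in q))  # back to 1 (to machine precision)
```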
| | | $\mathbb{SO}(3)$ | $\mathbb{S}^3\subset\mathbb{H}$ |
|---|---|---|---|
| Parameters | | 9 | 4 |
| Closure | Multiplications | 27 | 16 |
| | Additions | 18 | 12 |
| | Total FLOPs | 45 | 28 |
| Vector Rotation | Multiplications | 9 | 32 (23) |
| | Additions | 6 | 24 (16) |
| | Total FLOPs | 15 | 56 (39) |
For two vectors $\mathbf{a},\mathbf{b}\in\mathbb{R}^n$ the dot product $\mathbf{a}\bullet\mathbf{b} = \mathbf{a}^T\mathbf{b}$. ↩
In this article I provide some basic definitions and proofs of identities for rotation matrices $\mathbf{R}\in\mathbb{SO}(3)$. I show that a rotation matrix can be represented as a matrix exponential. From this, Rodrigues’ formula follows which expresses the matrix in terms of the angle and axis of rotation. I then show how to reverse this formula to obtain the angle and axis from an arbitrary rotation matrix. Then using the exponential form, and the angle-axis, I derive a control law for the angular velocity to perform feedback control on orientation error.
Euler’s rotation theorem states that any change in orientation of a rigid body can be described by a single rotation through some angle about a fixed axis.
The 3 combined rotations in the illustration below can be reduced to a single rotation about a single axis:
Any number of combined rotations can be expressed as a single rotation about a single axis.
Any transformation of a vector $\mathbf{v}\in\mathbb{R}^n\to\mathbf{u}\in\mathbb{R}^n$ that preserves its length can be expressed with a product involving a rotation matrix:
\[\mathbf{u} = \mathbf{Rv}. \tag{1}\]This matrix belongs to the Special Orthogonal group:
\[\mathbb{SO}(n) = \Big\{\mathbf{R}\in\mathbb{R}^{n\times n} ~\Big|~ \mathbf{RR}^T = \mathbf{I}~,~ det(\mathbf{R}) = 1\Big\} \tag{2}\]Given an arbitrary rotation matrix $\mathbf{R}\in\mathbb{SO}(3)$ we may be interested in finding the angle and axis of rotation. To do this, we need to define some other properties of $\mathbb{SO}(3)$ that we can exploit.
If we take the time derivative of Eqn. (1), and assuming $\dot{\mathbf{v}} = \mathbf{0}$, then we arrive at:
\[\dot{\mathbf{u}} = \dot{\mathbf{R}}\mathbf{v}. \tag{3}\]But in 3D, the time derivative of a vector is given by the cross product with the instantaneous angular velocity $\boldsymbol{\omega}\in\mathbb{R}^3$ (rad/s):
\[\mathbf{\dot{u}} = \boldsymbol{\omega}\times\mathbf{u} = S(\boldsymbol{\omega})\mathbf{u} \tag{4}\]where $S(\cdot)$ is the skew-symmetric matrix operator:
\[S(\boldsymbol{\omega}) = \begin{bmatrix} \phantom{-}0 & -\omega_z & \phantom{-}\omega_y \\ \phantom{-}\omega_z & \phantom{-}0 & -\omega_x \\ -\omega_y & \phantom{-}\omega_x & \phantom{-}0 \end{bmatrix} \in\mathfrak{so}(3). \tag{5}\]This is also the Lie algebra of $\mathbb{SO}(3)$. By equating Eqn. (3) with Eqn. (4), and substituting in Eqn. (1) we can see that the time derivative of the rotation matrix is “proportional” to itself:
\[\mathbf{\dot{R}} = S(\boldsymbol{\omega})\mathbf{R} ~\Longrightarrow \mathbf{R}(\mathrm{t}) = e^{S(\boldsymbol{\omega})\mathrm{t}}\mathbf{R}(0) \tag{6}\]This is a first-order differential equation whose solution is a (matrix) exponential. But the integral of the angular velocity is simply the angle-axis vector at any given point in time:
\[\int_0^t \boldsymbol{\omega}~dt = \boldsymbol{\omega}t + const. = \alpha\cdot\hat{\mathbf{a}} = \mathbf{a} \tag{7}\](where $const. = 0$).
Assuming we start from zero rotation $(\mathbf{R}(0) = \mathbf{I})$, then the rotation matrix is equivalent to a matrix exponential containing the angle-axis:
\[\mathbf{R} = e^{S(\mathbf{a})}\in\mathbb{SO}(3). \tag{8}\]From the definition of the exponential:
\[e^{S(\mathbf{a})} = \sum_{k=0}^\infty \frac{\alpha^{k}}{\mathrm{k!}}S(\hat{\mathbf{a}})^{k} \tag{9}\]we can reduce Eqn. (8) to Rodrigues’ formula which features the angle and axis as separate parameters:
\[\mathbf{R}(\alpha,\hat{\mathbf{a}}) = \mathbf{I} + \sin(\alpha)S(\hat{\mathbf{a}}) + (1-\cos(\alpha))S(\hat{\mathbf{a}})^2. \tag{10}\]Rodrigues’ formula, Eqn. (10), contains 3 matrices with a particular structure to their respective diagonal elements. If we take the trace (sum of diagonal elements) of each, we can see that $trace(\mathbf{I}) = 3$, $trace(S(\hat{\mathbf{a}})) = 0$, and $trace(S(\hat{\mathbf{a}})^2) = -2$ (for a unit axis $\hat{\mathbf{a}}$).
Hence the trace of a rotation matrix must be:
\[\begin{align} trace(\mathbf{R}) &= 3 - 2\cdot(1 - \cos(\alpha)) \tag{11a} \\ &= 1 + 2\cdot\cos(\alpha). \tag{11b} \end{align}\]We can re-arrange this to solve for the angle of rotation:
\[\alpha = \cos^{-1}\left(\frac{trace(\mathbf{R}) - 1}{2}\right). \tag{12}\]If the angle of rotation is zero $\alpha = 0$, then the axis of rotation is arbitrary since $0\cdot\hat{\mathbf{a}} = \mathbf{0}$.
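Rodrigues' formula and the trace identity can be round-tripped numerically. A sketch with an assumed angle and unit axis: build $\mathbf{R}$ via Eqn. (10), then recover $\alpha$ via Eqn. (12):

```python
import math

def S(v):
    """Skew-symmetric operator, Eqn (5)."""
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def rodrigues(alpha, axis):
    """Eqn (10): R = I + sin(a) S + (1 - cos(a)) S^2 (axis assumed unit length)."""
    Sa = S(axis)
    Sa2 = matmul(Sa, Sa)
    s, c = math.sin(alpha), 1.0 - math.cos(alpha)
    return [[(1.0 if i == j else 0.0) + s * Sa[i][j] + c * Sa2[i][j]
             for j in range(3)] for i in range(3)]

alpha = 1.1
R = rodrigues(alpha, (0.0, 0.6, 0.8))  # assumed unit axis

# Eqn (12): recover the angle from the trace
trace = R[0][0] + R[1][1] + R[2][2]
alpha_recovered = math.acos((trace - 1.0) / 2.0)
print(alpha_recovered)  # ≈ 1.1
```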
The axis for a rotation matrix does not change $\mathbf{R}\hat{\mathbf{a}} = \hat{\mathbf{a}}$. This implies that it is an eigenvector whose corresponding eigenvalue $\lambda = 1$.1 For any arbitrary eigenvector of $\mathbf{R}$ it must hold that:
\[\mathbf{Rv = v}. \tag{13}\]Multiplying this by the transpose of the rotation yields:
\[\begin{align} \overbrace{\mathbf{R}^T\mathbf{R}}^{\mathbf{I}}\mathbf{v} &= \mathbf{R}^T\mathbf{v} \tag{14a}\\ \mathbf{v} &= \mathbf{R}^T\mathbf{v}. \tag{14b} \end{align}\]Equating Eqn. (13) and Eqn. (14b) we obtain:
\[\begin{align} \mathbf{Rv} &= \mathbf{R}^T\mathbf{v} \tag{15a} \\ \underbrace{\left(\mathbf{R} - \mathbf{R}^T\right)}_{S(\mathbf{v})}\mathbf{v} &= \mathbf{0}. \tag{15b} \end{align}\]The matrix $\mathbf{R} - \mathbf{R}^T$ is skew-symmetric by construction. And since $S(\mathbf{v})\mathbf{v} = \mathbf{v}\times\mathbf{v} = \mathbf{0}$, the eigenvector $\mathbf{v}$ must be the vector that generates it. Expanding this we have:
\[\mathbf{R} - \mathbf{R}^T = \begin{bmatrix} 0 & r_{12} - r_{21} & r_{13} - r_{31} \\ r_{21} - r_{12} & 0 &r_{23} - r_{32} \\ r_{31} - r_{13} & r_{32} - r_{23} & 0 \end{bmatrix}. \tag{16}\]Using what we know about the structure of skew-symmetric matrices, Eqn. (5), we can deduce that the eigenvector is:
\[\mathbf{v} = \begin{bmatrix} r_{32} - r_{23} \\ r_{13} - r_{31} \\ r_{21} - r_{12} \end{bmatrix}. \tag{17}\]We can then normalise this vector to obtain the axis of rotation $\hat{\mathbf{a}}$:
\[\hat{\mathbf{a}} = \begin{cases} \frac{\mathbf{v}}{\|\mathbf{v}\|} & \text{if } \alpha \ne 0 \\ \text{trivial} & \text{otherwise.} \end{cases} \tag{18}\]Note that if $\mathbf{R} = \mathbf{I}$ (i.e. no rotation), then $\mathbf{v} = \mathbf{0}$ and hence $\|\mathbf{v}\|^{-1}$ does not exist. In this case, we can assign any arbitrary value to the axis of rotation.
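Similarly, the axis can be round-tripped. A sketch that builds $\mathbf{R}$ from an assumed angle and unit axis via Eqn. (10), then recovers $\hat{\mathbf{a}}$ via Eqns. (17) and (18):

```python
import math

def S(v):
    """Skew-symmetric operator, Eqn (5)."""
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def rodrigues(alpha, axis):
    """Eqn (10), with S^2 computed inline (axis assumed unit length)."""
    Sa = S(axis)
    Sa2 = [[sum(Sa[i][k] * Sa[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    s, c = math.sin(alpha), 1.0 - math.cos(alpha)
    return [[(1.0 if i == j else 0.0) + s * Sa[i][j] + c * Sa2[i][j]
             for j in range(3)] for i in range(3)]

axis = (0.48, 0.60, 0.64)  # unit axis: 0.48^2 + 0.6^2 + 0.64^2 = 1
R = rodrigues(0.8, axis)

# Eqn (17): the (unnormalised) axis from the skew-symmetric part of R
v = (R[2][1] - R[1][2], R[0][2] - R[2][0], R[1][0] - R[0][1])
n = math.sqrt(sum(c * c for c in v))
a_hat = tuple(c / n for c in v)  # Eqn (18); alpha != 0 here so n > 0
print(a_hat)  # ≈ (0.48, 0.60, 0.64)
```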
We can use the angle-axis vector to perform feedback on the orientation of an automated system. Suppose $\mathbf{R}_d\in\mathbb{SO}(3)$ is the desired orientation, and $\mathbf{R}\in\mathbb{SO}(3)$ is our actual orientation. We can define our orientation error as:
\[\mathbf{E} \triangleq \mathbf{R}_d\mathbf{R}^T = e^{S(\boldsymbol{\epsilon})}. \tag{19}\]If $\mathbf{R} = \mathbf{R}_d$ then $\mathbf{E} = \mathbf{I}$, implying no difference between orientations. From Eqn. (6) the time derivative of our rotation error is:
\[\dot{\mathbf{E}} = S(\dot{\boldsymbol{\epsilon}})\mathbf{E}~,~\dot{\boldsymbol{\epsilon}} = \boldsymbol{\omega}_d -\boldsymbol{\omega}. \tag{20}\]where:
Assuming $\boldsymbol{\omega}$ is our control input, we can define the control law:
\[\boldsymbol{\omega} \triangleq \boldsymbol{\omega}_d + \mathbf{K}\boldsymbol{\epsilon} \tag{21}\]where $\mathbf{K}\in\mathbb{R}^{3\times 3}$ is a positive-definite gain matrix (an easy choice here is a diagonal matrix with positive values). The desired angular velocity $\boldsymbol{\omega}_d$ becomes a feed-forward term, whereas $\mathbf{K}\boldsymbol{\epsilon}$ is a proportional feedback on the orientation error. In such cases where $\boldsymbol{\omega}_d$ is unavailable, then $\boldsymbol{\omega} = \mathbf{K}\boldsymbol{\epsilon}$ is sufficient.
If we substitute Eqn. (21) in to Eqn. (20) we obtain:
\[\dot{\boldsymbol{\epsilon}} = -\mathbf{K}\boldsymbol{\epsilon} ~\Longrightarrow \boldsymbol{\epsilon}(t) = e^{-\mathbf{K}t}\boldsymbol{\epsilon}(0). \tag{22}\]This form implies exponential decay. As the error angle approaches zero $\boldsymbol{\epsilon}\to \mathbf{0}$ then the orientation error will approach the identity $\mathbf{E}\to\mathbf{I}$ such that $\mathbf{R}\to\mathbf{R}_d$. This follows from the fact that $e^0 = 1$.
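The whole control loop can be sketched in a few lines. This is a minimal illustration, not the bimanual manipulation library itself: an assumed target orientation, proportional feedback via Eqn. (21) with $\boldsymbol{\omega}_d = \mathbf{0}$, and a simple discrete integration of Eqn. (6):

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rodrigues(alpha, axis):
    """Rotation matrix from angle-axis via Rodrigues' formula (axis unit length)."""
    x, y, z = axis
    Sa = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]
    Sa2 = matmul(Sa, Sa)
    s, c = math.sin(alpha), 1.0 - math.cos(alpha)
    return [[(1.0 if i == j else 0.0) + s * Sa[i][j] + c * Sa2[i][j]
             for j in range(3)] for i in range(3)]

def error_vector(Rd, R):
    """Angle-axis vector of E = Rd R^T, via the trace and skew-symmetric part."""
    E = matmul(Rd, transpose(R))
    tr = max(-1.0, min(3.0, E[0][0] + E[1][1] + E[2][2]))
    alpha = math.acos((tr - 1.0) / 2.0)
    v = (E[2][1] - E[1][2], E[0][2] - E[2][0], E[1][0] - E[0][1])
    n = math.sqrt(sum(c * c for c in v))
    if n < 1e-12:
        return (0.0, 0.0, 0.0)  # no error (ignoring the 180 degree edge case)
    return tuple(alpha * c / n for c in v)

Rd = rodrigues(1.2, (0.0, 0.0, 1.0))  # assumed target orientation
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
K, dt = 5.0, 0.01
for _ in range(200):
    eps = error_vector(Rd, R)
    w = tuple(K * e for e in eps)     # control law, Eqn (21), with omega_d = 0
    wn = math.sqrt(sum(c * c for c in w))
    if wn > 1e-12:                    # integrate R_dot = S(w) R over one step
        R = matmul(rodrigues(wn * dt, tuple(c / wn for c in w)), R)

final_error = math.sqrt(sum(e * e for e in error_vector(Rd, R)))
print(final_error)  # ≈ 0: exponential convergence to the desired orientation
```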
Below is a video of the ergoCub robot rotating an object using the bimanual manipulation library that I wrote whilst working as a Postdoc at the Italian Institute of Technology. It uses this exact method for orientation feedback control.
The ergoCub is able to rotate an object with 2 hands using the angle-axis representation for orientation control.
For any arbitrary matrix $\mathbf{A}\in\mathbb{R}^{m\times m}$ the eigenvector $\mathbf{v}\in\mathbb{C}^m$ and eigenvalue $\lambda\in\mathbb{C}$ obey the identity $\mathbf{Av} = \lambda\mathbf{v}$. ↩
In this post I extend the concept of linear feedback control for scalars and vectors in to the realm of Lie groups. Lie groups are mathematical objects with generalised properties for combining, inverting, and computing “differences”. They are used to represent orientation in 3D space in robotics and animation. By understanding their properties we can apply the same logic as linear systems and solve more sophisticated, nonlinear control problems.
In a previous post I discussed the problem of solving feedback control for a linear system using a 3-step process. Given the current position $\mathbf{x}\in\mathbb{R}^m$ and the desired position $\mathbf{x}_d\in\mathbb{R}^{m}$, we:
1. Denote the error from the desired position:
\[\boldsymbol{\epsilon} = \mathbf{x}_d - \mathbf{x}. \tag{1}\]2. Evaluate the time derivative:
\[\dot{\boldsymbol{\epsilon}} = \dot{\mathbf{x}}_d - \dot{\mathbf{x}} \tag{2}\]3. Solve the input to force an exponential decay for the error:
\[\dot{\mathbf{x}} = \dot{\mathbf{x}}_d + \mathbf{K}\boldsymbol{\epsilon} ~\Longrightarrow~ \dot{\boldsymbol{\epsilon}} = -\mathbf{K}\boldsymbol{\epsilon} ~\Longrightarrow~ \boldsymbol{\epsilon}(t) = e^{-\mathbf{K}t}\boldsymbol{\epsilon}_0. \tag{3}\]where $\mathbf{K}\in\mathbb{R}^{m\times m}$ is a positive definite matrix, such that $-\mathbf{K}$ has negative eigenvalues.
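The steps above can be sketched numerically; a scalar example with assumed values, comparing the simulated error against the closed-form decay of Eqn. (3):

```python
import math

# Scalar case for clarity: fixed target xd, gain K > 0; assumed example values
xd, x, K, dt = 2.0, 0.0, 3.0, 1e-3

eps0 = xd - x
t, steps = 0.0, 1000
for _ in range(steps):          # xdot = K * eps, Eqn (3) with xd_dot = 0
    x += K * (xd - x) * dt
    t += dt

eps_sim = xd - x
eps_closed = math.exp(-K * t) * eps0  # Eqn (3): eps(t) = e^{-Kt} eps(0)
print(eps_sim, eps_closed)            # both ≈ 0.1: exponential decay
```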
How do we perform feedback control for other types of mathematical structures?
For example, it is common to represent the orientation of a rigid body using a rotation matrix:
\[\mathbf{R} = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix} \in\mathbb{R}^{3\times 3}. \tag{4}\]Each of the columns is unit norm:
\[r_{1i}^2 + r_{2i}^2 + r_{3i}^2 = 1 \quad \text{for } i \in\{1,2,3\} \tag{5}\]and are orthogonal:
\[r_{1i}r_{1j} + r_{2i}r_{2j} + r_{3i}r_{3j} = 0 \quad \quad \text{for } i, j \in\{1,2,3\} \text{ and } i \ne j \tag{6}\]So we cannot add or subtract these matrices $\mathbf{R}_d - \mathbf{R}$ without violating these properties.
Lie groups are mathematical structures that satisfy 4 properties:
Vectors (over addition) form a Lie group.
| Vectors (Over Addition) | |
|---|---|
| Closure: | $\mathbf{x}_1,\mathbf{x}_2\in\mathbb{R}^n$ : $\mathbf{x}_1 + \mathbf{x}_2 \in\mathbb{R}^n$ |
| Associativity: | $\mathbf{x}_1 + \left(\mathbf{x}_2 + \mathbf{x}_3 \right) = \left(\mathbf{x}_1 + \mathbf{x}_2 \right) + \mathbf{x}_3$ |
| Identity: | $\mathbf{0} \in\mathbb{R}^n : \mathbf{x} + \mathbf{0} = \mathbf{x}$ |
| Inverse: | $-\mathbf{x} : \mathbf{x} + (-\mathbf{x}) = \mathbf{0}$ |
The closure and inverse properties were applied to define the position error, Eq. (1). In fact, we can see that when the desired position is equal to the actual position, the error reduces to the identity element:
\[\mathbf{x}_d = \mathbf{x} ~\Longrightarrow~ \mathbf{x}_d - \mathbf{x} = \mathbf{0}. \tag{7}\]The rotation matrix, Eq. (4), actually belongs to the Special Orthogonal $\mathbb{SO}$ group:
\[\mathbf{R}\in\mathbb{SO}(n) \triangleq \big\{\mathbf{R}\in\mathbb{R}^{n\times n} : \mathbf{RR}^T = \mathbf{I},~\det(\mathbf{R}) = 1 \big\}. \tag{8}\]Importantly, the closure property is defined by matrix multiplication, and the inverse by the matrix transpose.
| Special Orthogonal Group | |
|---|---|
| Closure: | $\mathbf{R}_1,\mathbf{R}_2\in\mathbb{SO}(n)$ : $\mathbf{R}_1\mathbf{R}_2\in\mathbb{SO}(n)$ |
| Associativity: | $\mathbf{R}_1 \left(\mathbf{R}_2 \mathbf{R}_3 \right) = \left(\mathbf{R}_1 \mathbf{R}_2 \right) \mathbf{R}_3$ |
| Identity: | $\mathbf{I} \in\mathbb{SO}(n)\subset\mathbb{R}^{n\times n} : \mathbf{R}\mathbf{I} = \mathbf{R}$ |
| Inverse: | $\mathbf{R}^T : \mathbf{RR}^T = \mathbf{I}$ |
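These group properties can be checked numerically. Here is a minimal sketch using SciPy's `Rotation` class to generate example rotation matrices (the random seeds are arbitrary):

```python
import numpy as np
from scipy.spatial.transform import Rotation

R1 = Rotation.random(random_state=0).as_matrix()
R2 = Rotation.random(random_state=1).as_matrix()

# Closure: the product of two rotations is again a rotation
P = R1 @ R2
print(np.allclose(P @ P.T, np.eye(3)))    # True: orthonormal
print(np.isclose(np.linalg.det(P), 1.0))  # True: determinant is 1

# Inverse: the transpose undoes the rotation
print(np.allclose(R1 @ R1.T, np.eye(3)))  # True
```

Unlike subtraction, multiplication and transposition keep us inside $\mathbb{SO}(3)$.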
As with Eq. (1), we first apply the closure and inverse properties of $\mathbb{SO}(n)$ to define the rotation error as:
\[\mathbf{E} = \mathbf{R}_d\mathbf{R}^T. \tag{9}\]Elements of the $\mathbb{SO}$ group can be written as matrix exponentials, so we can instead write Eq. (9) as:
\[\mathbf{E} = e^{S(\boldsymbol{\epsilon})} \tag{10}\]where $\boldsymbol{\epsilon}\in\mathbb{R}^3$ is the orientation error expressed as an axis-angle vector, and
\[S(\boldsymbol{\epsilon}) = \begin{bmatrix} \phantom{-}0 & -\epsilon_z & \phantom{-}\epsilon_y \\ \phantom{-}\epsilon_z & \phantom{-}0 & -\epsilon_x \\ -\epsilon_y & \phantom{-}\epsilon_x & \phantom{-}0 \end{bmatrix} \in\mathfrak{so}(3) \tag{11}\]is an element of $\mathfrak{so}(3)$, the Lie algebra of $\mathbb{SO}(3)$ (the set of $3\times 3$ skew-symmetric matrices).
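To make Eqs. (10) and (11) concrete, here is a small NumPy/SciPy sketch of the exponential map and its inverse (the error vector is an arbitrary example value):

```python
import numpy as np
from scipy.linalg import expm
from scipy.spatial.transform import Rotation

def S(eps):
    """Skew-symmetric matrix in so(3), Eq. (11)."""
    x, y, z = eps
    return np.array([[0.0,  -z,   y],
                     [z,   0.0,  -x],
                     [-y,   x,  0.0]])

eps = np.array([0.1, -0.2, 0.3])   # example error vector
E = expm(S(eps))                   # Eq. (10): exponential map so(3) -> SO(3)

print(np.allclose(E @ E.T, np.eye(3)))       # True: E is a valid rotation
eps_back = Rotation.from_matrix(E).as_rotvec()
print(np.allclose(eps_back, eps))            # True: the log map recovers eps
```

The rotation-vector (axis-angle) representation returned by `as_rotvec` is exactly the $\boldsymbol{\epsilon}$ that parameterises the Lie algebra element.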
Second, we evaluate the time derivative which, from Eq. (10), becomes:
\[\dot{\mathbf{E}} = S(\dot{\boldsymbol{\epsilon}})\mathbf{E} ~,~\dot{\boldsymbol{\epsilon}} = \boldsymbol{\omega}_d - \boldsymbol{\omega}. \tag{12}\]The time derivative of the error vector is the difference between the desired angular velocity $\boldsymbol{\omega}_d\in\mathbb{R}^3$ (rad/s) and the actual angular velocity $\boldsymbol{\omega}\in\mathbb{R}^3$ (rad/s).
Now instead of operating over $\mathbb{SO}(3)$ or $\mathfrak{so}(3)$, we can apply what we already know about $\mathbb{R}^n$. If we define the input angular velocity as:
\[\boldsymbol{\omega} \triangleq \boldsymbol{\omega}_d + \mathbf{K}\boldsymbol{\epsilon} \tag{13}\]for a matrix $\mathbf{K}\in\mathbb{R}^{3\times 3}$, then the error derivative becomes:
\[\dot{\boldsymbol{\epsilon}} = -\mathbf{K}\boldsymbol{\epsilon} ~\Longrightarrow~ \boldsymbol{\epsilon} = e^{-\mathbf{K}t}\boldsymbol{\epsilon}_0. \tag{14}\]Likewise, the rotation error will decay to the identity:
\[\lim_{t\to\infty} \mathbf{E} = \lim_{t\to\infty} e^{S\left(e^{-\mathbf{K}t}\boldsymbol{\epsilon}_0\right)} = \mathbf{I}. \tag{15}\]Below is a simulation of the ergoCub where I used this principle to enable it to rotate an object when grasping with 2 hands.
We can use the underlying Lie algebra of the rotation matrix to control the orientation of a robot's hands.
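Putting the pieces together, here is a minimal simulation sketch of the orientation feedback law. It uses a static desired orientation, an arbitrary gain and time step, and a random initial orientation; it illustrates the idea, not the actual ergoCub controller:

```python
import numpy as np
from scipy.linalg import expm
from scipy.spatial.transform import Rotation

def S(eps):
    """Skew-symmetric matrix in so(3), Eq. (11)."""
    x, y, z = eps
    return np.array([[0.0,  -z,   y],
                     [z,   0.0,  -x],
                     [-y,   x,  0.0]])

dt      = 0.01                                          # time step (s)
K       = 5.0 * np.eye(3)                               # gain matrix (example)
R       = Rotation.random(random_state=42).as_matrix()  # actual orientation
R_d     = np.eye(3)                                     # desired orientation (static)
omega_d = np.zeros(3)                                   # desired angular velocity

for _ in range(1000):
    E     = R_d @ R.T                             # rotation error, Eq. (9)
    eps   = Rotation.from_matrix(E).as_rotvec()   # log map: SO(3) -> R^3
    omega = omega_d + K @ eps                     # control law, Eq. (13)
    R     = expm(S(omega) * dt) @ R               # integrate Rdot = S(omega) R

print(np.allclose(R, R_d, atol=1e-6))             # True: E has decayed to identity
```

The loop maps the rotation error into $\mathbb{R}^3$, applies the linear control law there, then maps the resulting angular velocity back through the exponential to update the orientation, mirroring Eqs. (9), (13), and (15).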