optimize imuMahonyAHRSupdate() hot path (5 micro-opts) #11358

Open
sensei-hacker wants to merge 1 commit into iNavFlight:maintenance-9.x from sensei-hacker:perf/imu-mahony-hot-path-opts

sensei-hacker commented Feb 22, 2026

Summary

Five micro-optimizations to the 1 kHz Mahony AHRS update hot path:

  1. Direct rMat[2][0..2] reads instead of quaternionRotateVector() — saves ~32 multiplies + 24 adds per PID cycle
  2. Squared comparison thetaMagnitudeSq² < 24e-6 instead of thetaMagnitudeSq < sqrt(24e-6) — eliminates a sqrt() call
  3. First-order Newton quaternion renormalization instead of quaternionNormalize() — eliminates sqrt() + 4 divides per cycle
  4. Precompute dcm_i_limit in imuConfigure() (called on settings save) instead of recalculating every PID cycle
  5. prevOrientation snapshot every 100 cycles instead of every cycle (only used by the fault-recovery path)
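
Item 1 rests on an identity that can be checked in isolation. The following is a minimal sketch, not the code from imu.c: it assumes rMat is the body-to-earth matrix built from the standard unit-quaternion formula, and `quat_t`, `rotateEarthToBody()`, `thirdRowOfRotationMatrix()` and `gravityRowMismatch()` are hypothetical names introduced only for illustration.

```c
#include <math.h>

/* Hypothetical quaternion type for this sketch; not the firmware's own. */
typedef struct { float w, x, y, z; } quat_t;

static void cross3(const float a[3], const float b[3], float out[3])
{
    out[0] = a[1] * b[2] - a[2] * b[1];
    out[1] = a[2] * b[0] - a[0] * b[2];
    out[2] = a[0] * b[1] - a[1] * b[0];
}

/* Rotate v from earth frame to body frame with a unit quaternion:
 * v' = conj(q) * v * q = v - 2w (u x v) + 2 u x (u x v), with u = (x, y, z). */
static void rotateEarthToBody(const quat_t *q, const float v[3], float out[3])
{
    const float u[3] = { q->x, q->y, q->z };
    float uxv[3], uxuxv[3];

    cross3(u, v, uxv);
    cross3(u, uxv, uxuxv);
    for (int i = 0; i < 3; i++) {
        out[i] = v[i] - 2.0f * q->w * uxv[i] + 2.0f * uxuxv[i];
    }
}

/* Third row of the standard body-to-earth rotation matrix of a unit quaternion
 * (assumed here to correspond to rMat[2][0..2]). */
static void thirdRowOfRotationMatrix(const quat_t *q, float row[3])
{
    row[0] = 2.0f * (q->x * q->z - q->w * q->y);
    row[1] = 2.0f * (q->y * q->z + q->w * q->x);
    row[2] = 1.0f - 2.0f * (q->x * q->x + q->y * q->y);
}

/* Largest absolute difference between the rotated gravity vector
 * and the third matrix row; expected to be at rounding-error level. */
static float gravityRowMismatch(const quat_t *q)
{
    const float gravityEF[3] = { 0.0f, 0.0f, 1.0f };
    float rotated[3], row[3], worst = 0.0f;

    rotateEarthToBody(q, gravityEF, rotated);
    thirdRowOfRotationMatrix(q, row);
    for (int i = 0; i < 3; i++) {
        const float d = fabsf(rotated[i] - row[i]);
        if (d > worst) {
            worst = d;
        }
    }
    return worst;
}
```

For a unit quaternion such as (0.8, 0.2, -0.4, 0.4), `gravityRowMismatch()` should return a value at float rounding level, which is why three loads can stand in for the full rotation as long as the matrix is kept in sync with the quaternion.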

Files Changed

  • src/main/flight/imu.c
  • src/main/flight/imu.h

Testing

  • Compiles cleanly for SITL ✅
  • Attitude estimation remains accurate in flight
  • Fault-recovery path (prevOrientation) still functional

None of these is expected to make a large difference in performance on its own, but they cost nothing:
the result should be essentially the same, in fewer CPU cycles.

Five cycle-saving changes to imuMahonyAHRSupdate(), which accounts for
~205 µs (30%) of the PID loop on RP2350.  Each change is safe on all
targets; the gains are proportionally smaller on F7/H7 where the function
already lives in ITCM RAM.

1. Replace quaternionRotateVector({0,0,1}) with rMat[2][*] reads.
   imuComputeRotationMatrix() is called at the end of every invocation and
   keeps rMat in sync with orientation.  Rotating the constant gravity
   vector EF→BF yields exactly the third row of rMat, so three float loads
   replace ~56 floating-point multiply/add operations.

2. Eliminate sqrt() from the Taylor-series threshold.
   Original: thetaMagnitudeSq < sqrt(24e-6).
   Squaring both sides (both non-negative) gives the equivalent condition
   thetaMagnitudeSq² < 24e-6 with no sqrt call.

3. First-order Newton quaternion renormalization replaces quaternionNormalize()
   (sqrt + 4 divides, ~35 cycles) with scale = (3 - normSq) * 0.5 (~14 cycles).
   At 1 kHz the per-step norm drift is |ε| < 1e-6, so the O(ε²) error is below 1e-12.
   imuCheckAndResetOrientationQuaternion() remains as the catastrophic-failure
   safety net.

4. Precompute the anti-windup i_limit in imuConfigure() (called only on
   settings save).  The hot path now reads a single float instead of
   performing an add, a multiply, and a divide every PID cycle.
   Adds dcm_i_limit to imuRuntimeConfig_t and imuConfigure().

5. Reduce prevOrientation snapshot frequency from every PID cycle to
   every 100 cycles (~100 ms at 1 kHz).  The snapshot is only used by
   the fault-recovery path, which should never fire in normal flight;
   writing 16 bytes to SRAM every millisecond for that purpose is not justified.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

sensei-hacker changed the title from "imu: optimize imuMahonyAHRSupdate() hot path (5 micro-opts)" to "optimize imuMahonyAHRSupdate() hot path (5 micro-opts)" on Feb 22, 2026

sensei-hacker commented Feb 22, 2026

P.S. Note that thetaMagnitudeSq is mathematically guaranteed to be non-negative, because it is the sum of three squares. Squaring both sides of the comparison therefore preserves the inequality, so the new condition is mathematically equivalent, just with fewer CPU cycles.
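
A minimal sketch of the two forms of the threshold test, for comparison. The constant 24e-6 is the one quoted above; `TAYLOR_THRESHOLD_SQ`, `smallAngleWithSqrt()` and `smallAngleSquared()` are hypothetical names used only here, not identifiers from imu.c.

```c
#include <math.h>
#include <stdbool.h>

/* Threshold value quoted in this PR; the macro name is hypothetical. */
#define TAYLOR_THRESHOLD_SQ 24e-6f

/* Original form: compare against sqrt(24e-6). */
static bool smallAngleWithSqrt(float thetaMagnitudeSq)
{
    return thetaMagnitudeSq < sqrtf(TAYLOR_THRESHOLD_SQ);
}

/* Rewritten form: square the non-negative left-hand side instead. */
static bool smallAngleSquared(float thetaMagnitudeSq)
{
    return thetaMagnitudeSq * thetaMagnitudeSq < TAYLOR_THRESHOLD_SQ;
}
```

Both functions agree for non-negative inputs away from the boundary (about 0.0049). In floating point the two may disagree for inputs within rounding error of the boundary itself, which should be harmless for a threshold that only selects between a Taylor approximation and the full computation.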
