What's really strange about all this is that it still sometimes worked before. So was the code only partially broken? Or is variable ordering dynamic even after compile time? Puzzling...
Btw, the Teensy 4 uses a similar chip, no? It uses USB 2 for MIDI communication, so maybe that would be more reliable with -O3.
No, there's no dynamic ordering at run time on the Cortex-M7.
I assume it's a race condition between the main USB interrupts and the other threads. So it sometimes worked, depending on when the first USB interrupts fired during the init process.
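For illustration, here is a minimal sketch of this kind of init-time race and one common way to guard against it. It is not the pfm3 driver code: `usb_irq_handler`, `usb_init`, `rx_buffer` and `usb_ready` are hypothetical names, and the real cause has not been confirmed. The idea is that the interrupt handler must not touch driver state until init has fully published it, and that the "ready" flag is written with release ordering so the compiler cannot move the setup stores after it at higher optimization levels.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RX_SIZE 64u

/* Hypothetical driver state, for illustration only. */
static unsigned char rx_storage[RX_SIZE];
static unsigned char *rx_buffer;          /* NULL until usb_init() has run */
static size_t rx_len;
static atomic_bool usb_ready = false;     /* set last, once state is valid */

/* Stand-in for the USB interrupt handler.
 * Returns true if the byte was stored, false if init was not finished. */
bool usb_irq_handler(unsigned char byte)
{
    /* Acquire load: if we see "ready", we also see the initialized state. */
    if (!atomic_load_explicit(&usb_ready, memory_order_acquire)) {
        return false;                     /* too early, ignore the event */
    }
    assert(rx_buffer != NULL);
    rx_buffer[rx_len % RX_SIZE] = byte;
    rx_len++;
    return true;
}

/* Stand-in for the init code running in the main thread. */
void usb_init(void)
{
    rx_buffer = rx_storage;
    rx_len = 0;
    /* Release store: the two writes above cannot be reordered after this. */
    atomic_store_explicit(&usb_ready, true, memory_order_release);
}
```

Without the guard, an interrupt arriving before `usb_init()` completes would write through a null `rx_buffer`, and whether that happens depends purely on timing, which would match the "sometimes it worked" behaviour.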
Just had a look: the Teensy 4 uses an NXP iMXRT1062 (the pfm3 uses an STM32H743).
The CPU core is also an ARM Cortex-M7, but everything else (hardware implementation, registers, interrupts, etc.) is different.
MIDI is SUPER SLOW (31.25 kbit/s). The slowest USB speed is 1.5 Mbit/s, so we're good (about 50 times faster).
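As a quick check of that ratio, here is a small sketch using the raw signalling rates only (31250 bit/s for classic serial MIDI, 1.5 Mbit/s for USB low speed). The helper `speed_ratio` is a made-up name, and real USB throughput is lower than the raw rate because of protocol overhead, so treat the result as an order of magnitude.

```c
#include <assert.h>

/* Classic serial MIDI rate in bits per second. */
#define MIDI_BAUD_BPS      31250u
/* USB low-speed raw signalling rate, the slowest USB mode. */
#define USB_LOW_SPEED_BPS  1500000u

/* Integer ratio between two raw bit rates (hypothetical helper). */
static unsigned speed_ratio(unsigned fast_bps, unsigned slow_bps)
{
    assert(slow_bps > 0);
    return fast_bps / slow_bps;
}
```

`speed_ratio(USB_LOW_SPEED_BPS, MIDI_BAUD_BPS)` gives 48, so roughly 50 times faster even in the slowest USB mode.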
And I'm confident that using -O2 on this low-level driver won't have any perf impact.
But it could be interesting, some day in the far future, to understand why USB access from the computer is so slow.