STM32 Clock Security System (CSS):
Detecting and Recovering from HSE Failure
Nothing is worse than a field return where the MCU is alive but the firmware is silently running on the wrong clock source. The external crystal, the heart of your system timing, can fail for many reasons: mechanical shock, ESD, a cold solder joint on the PCB, or simply an aged resonator that stops oscillating. Without a detection mechanism, the system continues executing at the wrong speed or stops entirely, often corrupting communication links or violating real-time deadlines.
The STM32 Clock Security System (CSS) is a hardware safety feature that monitors the HSE (external high-speed oscillator) and automatically switches to the internal HSI RC oscillator when a failure is detected. In this article, we will cover how CSS works at the register level, how to configure it correctly, how to implement a safe recovery strategy in the NMI handler, and what pitfalls to avoid in production firmware.
How CSS Works
CSS is available on most STM32 families, including STM32F0/F1/F3/F4/F7/G0/G4/L0/L4/L5/U5/H7 and newer. It is a hardware counter that counts HSE clock cycles against a reference (typically the HSI oscillator). If HSE stops toggling for a predefined number of cycles, the hardware:
- Disables the HSE oscillator (and the PLL, if HSE was feeding it) and stops the clock detector
- Automatically switches the system clock source to HSI
- Sets the CSSF flag in RCC->CIR
- Triggers a hard-coded NMI (Non-Maskable Interrupt)
The entire transition takes approximately 1–2 HSI clock cycles. The system does not reset; it simply continues execution on HSI, at a known-safe frequency (typically 8 MHz or 16 MHz, depending on the family). This gives the firmware an opportunity to take corrective action.
Enabling CSS
CSS is enabled by setting the CSSON bit in the RCC clock control register. The procedure is simple, but the order matters:
/* 1. Configure and enable HSE as usual */
RCC->CR |= RCC_CR_HSEON;
while (!(RCC->CR & RCC_CR_HSERDY)) { /* wait */ }
/* 2. Configure PLL and system clock from HSE */
/* ... PLL setup, flash wait states, etc. ... */
/* 3. Enable CSS after the system clock is stable on HSE */
RCC->CR |= RCC_CR_CSSON;
CSS must be enabled after the system clock has stabilised on HSE. Enabling it during clock switching can produce false triggers. Once enabled, CSS runs continuously in hardware, so no polling is required.
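One gap worth noting: CSS only watches an HSE that has already started, so the unbounded wait in step 1 spins forever if the crystal is dead at power-up. Below is a minimal sketch of a bounded wait. The register read is passed in as a function pointer so the logic can be exercised on a host; hse_wait_ready, read_cr_fn, the fake_* helpers and the poll budget are illustrative names of mine, not part of any ST library. The HSERDY position (bit 17) is the RCC_CR layout on F1/F4; check it for your family.

```c
#include <stdbool.h>
#include <stdint.h>

#define HSERDY_BIT (UINT32_C(1) << 17)   /* RCC_CR.HSERDY on STM32F1/F4 */

/* Hypothetical accessor type: returns the current RCC->CR value.
 * On target the implementation would simply be: return RCC->CR; */
typedef uint32_t (*read_cr_fn)(void);

/* Poll HSERDY at most max_polls times.
 * Returns true if HSE became ready, false if the budget ran out.
 * On false the caller should stay on HSI and must not set CSSON. */
static bool hse_wait_ready(read_cr_fn read_cr, uint32_t max_polls)
{
    while (max_polls-- > 0U) {
        if ((read_cr() & HSERDY_BIT) != 0U) {
            return true;
        }
    }
    return false;
}

/* Host-side stand-ins for the register, for testing only. */
static uint32_t fake_polls_until_ready;

static uint32_t fake_cr_eventually_ready(void)
{
    if (fake_polls_until_ready > 0U) {
        fake_polls_until_ready--;
        return 0U;              /* HSERDY still clear */
    }
    return HSERDY_BIT;          /* oscillator reports ready */
}

static uint32_t fake_cr_dead_crystal(void)
{
    return 0U;                  /* HSERDY never sets */
}
```

On target, a failed wait is a good place to set the same fault flag the NMI handler would set, so the application sees one consistent "running on HSI" state.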
CSS NMI Handler: The Recovery Code
When CSS fires, the processor enters the NMI handler. This is not the HardFault handler: NMI is a separate exception with priority higher than any configurable interrupt. The NMI vector must be defined in your startup file.
The typical recovery sequence inside the NMI handler:
void NMI_Handler(void)
{
    /* 1. Read and clear the CSS flag.
     *    The NMI stays pending until CSSF is cleared. */
    if (RCC->CIR & RCC_CIR_CSSF) {
        RCC->CIR |= RCC_CIR_CSSC; /* write 1 to clear */
    }

    /* 2. The hardware already switched to HSI.
     *    Verify the system clock source. */
    uint32_t sw = RCC->CFGR & RCC_CFGR_SWS;
    if (sw != RCC_CFGR_SWS_HSI) {
        /* Force switch to HSI if hardware didn't */
        RCC->CFGR = (RCC->CFGR & ~RCC_CFGR_SW) | RCC_CFGR_SW_HSI;
        while ((RCC->CFGR & RCC_CFGR_SWS) != RCC_CFGR_SWS_HSI) { }
    }

    /* 3. Disable PLL (it was driven by HSE) */
    RCC->CR &= ~RCC_CR_PLLON;
    while (RCC->CR & RCC_CR_PLLRDY) { /* wait for PLL to stop */ }

    /* 4. Disable CSS (one-shot; re-enable later if needed).
     *    Check the reference manual for your family: on some parts
     *    CSSON cannot be cleared by software. */
    RCC->CR &= ~RCC_CR_CSSON;

    /* 5. Reduce flash wait states for the HSI frequency.
     *    Safe here because the core is already running on HSI. */
    FLASH->ACR = (FLASH->ACR & ~FLASH_ACR_LATENCY) | FLASH_ACR_LATENCY_0WS;

    /* 6. Reconfigure peripheral clocks, baud rates, timers
     *    that depend on system clock frequency.
     *    This function is application-specific. */
    reconfigure_peripherals_for_hsi();

    /* 7. Signal failure to the application layer
     *    (system_fault_flags is an application-defined struct). */
    system_fault_flags.css_hse_failure = 1;

    /* 8. Optionally: enter safe mode or try to re-enable HSE */
    /* The NMI handler returns to the interrupted code. */
}
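Step 6 is where most of the real work hides. As a rough sketch of what a helper like reconfigure_peripherals_for_hsi() might compute, the two functions below derive a USART baud-rate divider and a timer prescaler from the new peripheral clock. They are pure arithmetic, so they can be unit-tested off-target. Both function names are hypothetical, and the BRR formula assumes 16x oversampling (OVER8 = 0) on an F4-style USART, where BRR is simply the peripheral clock divided by the baud rate.

```c
#include <stdint.h>

/* USART BRR value for 16x oversampling, rounded to the nearest integer.
 * pclk_hz is the clock of the APB bus the USART sits on. */
static uint32_t usart_brr_over16(uint32_t pclk_hz, uint32_t baud)
{
    return (pclk_hz + baud / 2U) / baud;
}

/* Timer PSC value so that the counter ticks at tick_hz.
 * The hardware divides by (PSC + 1), hence the "- 1".
 * Returns 0 (no division) if the timer clock is slower than tick_hz. */
static uint32_t timer_psc_for_tick(uint32_t tim_clk_hz, uint32_t tick_hz)
{
    uint32_t div = tim_clk_hz / tick_hz;
    return (div > 0U) ? (div - 1U) : 0U;
}
```

On target this could be used as, for example, USART2->BRR = usart_brr_over16(16000000U, 115200U); after the fallback to a 16 MHz HSI. Keeping the arithmetic separate from the register writes also keeps the NMI handler short.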
Practical Example: HSE Failure Simulation on STM32F4
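Let us implement a complete CSS recovery example on an STM32F401 Nucleo board. The example assumes an 8 MHz crystal on HSE; on a stock Nucleo-64 the 8 MHz HSE clock is supplied by the ST-LINK MCO output instead, in which case HSEBYP must also be set. We configure the system to run at 84 MHz from HSE + PLL, enable CSS, and then simulate a failure by grounding the HSE input pin (or by a software-triggered test).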
#include "stm32f4xx.h"

volatile uint32_t css_triggered = 0;

void NMI_Handler(void)
{
    if (RCC->CIR & RCC_CIR_CSSF) {
        RCC->CIR |= RCC_CIR_CSSC;   /* write 1 to clear the CSS flag */
        css_triggered = 1;          /* notify the main loop */
    }

    /* HSE has failed: hardware already switched to HSI (16 MHz) */
    RCC->CR &= ~(RCC_CR_PLLON | RCC_CR_CSSON);

    /* Bus prescalers back to /1 so APB1 also runs at 16 MHz */
    RCC->CFGR &= ~(RCC_CFGR_HPRE | RCC_CFGR_PPRE1 | RCC_CFGR_PPRE2);

    /* Flash wait states: 0 WS for 16 MHz HSI */
    FLASH->ACR = (FLASH->ACR & ~FLASH_ACR_LATENCY)
               | FLASH_ACR_LATENCY_0WS;

    /* Reconfigure USART2 baud for 16 MHz
     * (USART2 is on APB1; 16x oversampling; rounded to nearest) */
    USART2->BRR = (16000000U + 115200U / 2U) / 115200U;

    __DSB();   /* let the register writes complete before returning */
}

int main(void)
{
    /* Enable HSE */
    RCC->CR |= RCC_CR_HSEON;
    while (!(RCC->CR & RCC_CR_HSERDY)) { }

    /* Configure main PLL: 8 MHz HSE -> 84 MHz SYSCLK
     *   PLLM = 8    (bits 5:0)    8 MHz / 8   = 1 MHz VCO input
     *   PLLN = 336  (bits 14:6)   1 MHz * 336 = 336 MHz VCO output
     *   PLLP = /4   (bits 17:16; 00 = /2, 01 = /4, 10 = /6, 11 = /8)
     *                             336 MHz / 4 = 84 MHz (F401 maximum)
     *   PLLSRC = HSE (bit 22)
     *   PLLQ = 7    (bits 27:24)  336 MHz / 7 = 48 MHz */
    RCC->PLLCFGR = (RCC->PLLCFGR & ~(RCC_PLLCFGR_PLLM | RCC_PLLCFGR_PLLN |
                                     RCC_PLLCFGR_PLLP | RCC_PLLCFGR_PLLSRC |
                                     RCC_PLLCFGR_PLLQ))
                 | (8U << 0) | (336U << 6) | (1U << 16)
                 | RCC_PLLCFGR_PLLSRC_HSE | (7U << 24);
    RCC->CR |= RCC_CR_PLLON;
    while (!(RCC->CR & RCC_CR_PLLRDY)) { }

    /* Configure flash before raising the clock:
     * 2 WS for 84 MHz on F401 at 2.7-3.6 V */
    FLASH->ACR = FLASH_ACR_PRFTEN | FLASH_ACR_ICEN | FLASH_ACR_DCEN
               | FLASH_ACR_LATENCY_2WS;

    /* APB1 is limited to 42 MHz on F401: HCLK / 2 */
    RCC->CFGR |= RCC_CFGR_PPRE1_DIV2;

    /* Switch system clock to PLL */
    RCC->CFGR |= RCC_CFGR_SW_PLL;
    while ((RCC->CFGR & RCC_CFGR_SWS) != RCC_CFGR_SWS_PLL) { }

    /* Enable CSS after system clock is stable on PLL from HSE */
    RCC->CR |= RCC_CR_CSSON;

    /* Main application code */
    while (1) {
        if (css_triggered) {
            /* In production: save fault log, switch to safe mode,
             * or attempt HSE restart (with debounce) */
            __WFI(); /* safe idle */
        }
    }
}
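PLL field positions are easy to get wrong, so it can help to check the arithmetic on a host before touching the register. The sketch below computes the resulting frequencies and packs the fields using the STM32F4 PLLCFGR layout as I understand it: PLLM in bits 5:0, PLLN in bits 14:6, PLLP in bits 17:16 encoded as P/2 - 1, PLLSRC in bit 22, PLLQ in bits 27:24. The type and function names are my own, and the layout should be verified against the reference manual for your exact part.

```c
#include <stdint.h>

/* Divider and multiplier values as written in the datasheet,
 * not their register encodings. */
typedef struct {
    uint32_t m;   /* input divider */
    uint32_t n;   /* VCO multiplier */
    uint32_t p;   /* SYSCLK divider: 2, 4, 6 or 8 */
    uint32_t q;   /* 48 MHz domain divider */
} pll_cfg_t;

static uint32_t pll_vco_in_hz(uint32_t src_hz, const pll_cfg_t *c)
{
    return src_hz / c->m;
}

static uint32_t pll_vco_out_hz(uint32_t src_hz, const pll_cfg_t *c)
{
    return pll_vco_in_hz(src_hz, c) * c->n;
}

static uint32_t pll_sysclk_hz(uint32_t src_hz, const pll_cfg_t *c)
{
    return pll_vco_out_hz(src_hz, c) / c->p;
}

static uint32_t pll_q_out_hz(uint32_t src_hz, const pll_cfg_t *c)
{
    return pll_vco_out_hz(src_hz, c) / c->q;
}

/* Pack the fields into a PLLCFGR value with HSE selected as source.
 * Reserved bits are left at zero here; on target, merge this with the
 * current register contents instead of overwriting them. */
static uint32_t pll_encode_pllcfgr_hse(const pll_cfg_t *c)
{
    return (c->m << 0)
         | (c->n << 6)
         | (((c->p / 2U) - 1U) << 16)   /* 2 -> 00, 4 -> 01, 6 -> 10, 8 -> 11 */
         | (UINT32_C(1) << 22)          /* PLLSRC = HSE */
         | (c->q << 24);
}
```

For the 8 MHz crystal and 84 MHz target used in this example, M = 8, N = 336, P = 4, Q = 7 gives a 1 MHz VCO input, a 336 MHz VCO output, 84 MHz SYSCLK and 48 MHz on the Q output.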
Practical Checklist for CSS in Production
- Verify NMI vector exists: The startup file must have a non-default NMI_Handler. Many projects leave NMI hooked to the default infinite loop, which means CSS fires and the system hangs silently.
- Do not re-enable CSS from NMI: CSS is one-shot. Enabling CSSON inside the NMI handler while the fault condition persists causes a cascade of NMIs. Only re-enable CSS after HSE has been confirmed stable again.
- Check the system clock speed: After CSS recovery, all timing-dependent peripherals (USART baud rate, SPI SCK, I2C timing, timer frequencies, ADC sampling time) need reconfiguration for the HSI frequency. Your NMI handler must account for this.
- Test CSS in production firmware: Do not assume CSS works because you set the bit. Test by shorting the HSE crystal pins (with a resistor!) in a controlled environment and verifying the NMI triggers and recovery completes.
- Document your recovery behaviour: What does the product do after an HSE failure? Log an error? Blink an LED? Enter a safe state? This is a system-level decision, not just a firmware one.
- Beware of STOP and STANDBY modes: The HSE is switched off in these modes, so CSS is not monitoring anything. After wakeup from STOP the system runs on an internal oscillator, and CSS protects you again only once the firmware has restarted HSE, switched the system clock back to it, and made sure CSSON is set.
- On dual-core and M33 parts: CSS can be routed to the security monitor on STM32H5/U5. Check the TrustZone partitioning, because CSS may be a secure-only interrupt.
How I Would Approach This on a Client Project
On a production medical or industrial project, I treat CSS as a mandatory safety feature, not an optional diagnostic. The NMI handler is written before any application code, and the recovery strategy is documented in the system safety analysis.
My standard approach:
- Define a system_fault_flags struct with individual bit flags for CSS, PLL lock loss, LSE failure, brown-out, and watchdog resets. This struct is placed in a dedicated SRAM section that survives software resets (bypassing startup initialisation).
- The NMI handler sets the CSS flag, disables PLL, switches to HSI, updates all peripheral clock dividers, and returns. The application polls the fault flags in its main loop and decides whether to continue degraded operation or request a safe shutdown.
- A periodic task attempts to restart HSE in the background (with a 10-second cooldown) and, if successful, re-enables CSS and switches back to PLL. This allows automatic recovery from transient HSE glitches without a full reset.
- I add a system health log entry with a timestamp, so field returns can be correlated with production data.
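To make the list above more concrete, here is a minimal sketch of the fault-flag struct and the cooldown logic for the periodic HSE restart task. It is host-testable and deliberately leaves out the hardware side (restarting HSE, re-locking the PLL, re-enabling CSS) as well as the linker-specific placement in a no-init SRAM section. The type names, hse_retry_note_failure and hse_retry_due are hypothetical; only the flag list and the 10-second cooldown come from the approach described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* One bit per fault source. */
typedef struct {
    uint32_t css_hse_failure : 1;
    uint32_t pll_lock_loss   : 1;
    uint32_t lse_failure     : 1;
    uint32_t brown_out       : 1;
    uint32_t watchdog_reset  : 1;
} system_fault_flags_t;

#define HSE_RETRY_COOLDOWN_MS 10000U

/* Bookkeeping for the background restart task. */
typedef struct {
    uint32_t last_event_ms;   /* time of the failure or of the last attempt */
} hse_retry_t;

/* Call from the main loop when it first sees css_hse_failure set. */
static void hse_retry_note_failure(hse_retry_t *r, uint32_t now_ms)
{
    r->last_event_ms = now_ms;
}

/* Returns true when a new HSE restart attempt is due.
 * Unsigned subtraction keeps the comparison correct across a
 * wrap-around of the millisecond tick counter. */
static bool hse_retry_due(hse_retry_t *r,
                          const system_fault_flags_t *flags,
                          uint32_t now_ms)
{
    if (!flags->css_hse_failure) {
        return false;                       /* nothing to recover from */
    }
    if ((uint32_t)(now_ms - r->last_event_ms) < HSE_RETRY_COOLDOWN_MS) {
        return false;                       /* still cooling down */
    }
    r->last_event_ms = now_ms;              /* start the next cooldown */
    return true;
}
```

When hse_retry_due() returns true, the periodic task would attempt the restart with a bounded wait on HSERDY and clear css_hse_failure only after the system clock is back on HSE and CSS has been re-enabled.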
Sources and References
- ST Application Note AN4907: Clock Security System for STM32 MCUs
- ST RM0090: Reference Manual STM32F405/415, STM32F407/417, STM32F427/437 and STM32F429/439 (RCC chapter, CSS section)
- ST RM0433: Reference Manual STM32H742, STM32H743/753 and STM32H750
- STM32CubeF4 Example: RCC/RCC_ClockSecuritySystem
- Cortex-M3/M4/M7 Generic User Guide: NMI exception handling
Comment by email
If you have questions, corrections, or want to share your experience with CSS on STM32, drop me a line at blog-comments@carrese.eu. Include the article slug in the subject line.