2026-06-12 · Davide Carrese

STM32 RCC Clock Configuration at Register Level:
HSI, HSE, PLL, and System Clock Switching on STM32F4

STM32 · RCC · Clock · PLL · Register-Level · STM32F4

Spent the morning debugging why your UART baud rate is exactly half of what you configured? Nine times out of ten, the clock tree is the culprit. The STM32F4 has four oscillators (HSI, HSE, LSI, LSE), up to three PLLs depending on the part, and a cascade of prescalers. Get any ratio wrong and every peripheral that depends on the bus clock breaks silently. Let's walk through the clock tree register by register.

The RCC (Reset and Clock Control) peripheral is the single most important block on any STM32 microcontroller. It generates the system clock (SYSCLK) that drives the CPU, the AHB and APB bus clocks, and all peripheral clocks derived from them. Get the clock configuration right and everything else follows; get it wrong and your timers run at half speed, your USART produces garbage, and your ADC samples at the wrong rate.

This article covers the register-level clock configuration on the STM32F401 (ARM Cortex-M4), but the same RCC structure applies across the STM32F4 family. The core registers and field positions used here are the same; the maximum frequencies, flash wait states, and available PLL outputs differ between parts, so check the reference manual for your device.

Clock Tree Overview

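The STM32F4 clock tree has three configurable clock sources for SYSCLK:

  HSI = 16 MHz internal RC oscillator (the default after reset)
  HSE = external crystal or external clock input (8 MHz in this article)
  PLL = main PLL output (PLLP), fed by either HSI or HSE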

Additionally, two low-speed oscillators exist for the RTC and independent watchdog: LSI (32 kHz RC) and LSE (32.768 kHz crystal).

The PLL itself is split into three output branches — PLLP (main system clock), PLLQ (48 MHz for USB OTG/SDIO), and PLLR (present on some F4 devices for I2S) — but on the F401, only PLLP and PLLQ are available.

⚠️ HSI vs HSE trade-off

If you only need moderate CPU performance (≤ 16 MHz), HSI is fine. For high PLL output frequencies, or whenever a peripheral needs an accurate clock (USB, high-baud UART), use HSE as the PLL source. The HSI is an RC oscillator whose accuracy degrades with temperature, and the PLL passes that relative error on to every clock derived from it.

Register-Level Walkthrough: 84 MHz from an 8 MHz HSE

Let's configure the STM32F401 to run at its maximum SYSCLK of 84 MHz (F405/F407 parts reach 168 MHz with the same PLL structure). We'll use an 8 MHz external crystal as the HSE source, multiply it through the PLL, and route the output through the appropriate prescalers.

The PLL output formula is:

PLL_output = (HSE / M) × N / P

where:
  M = PLL input divider (2..63)
  N = PLL multiplier  (192..432 on F401)
  P = PLL system clock divider (2, 4, 6, 8)

For 168 MHz from 8 MHz, a common configuration is M = 8, N = 336, P = 2:

(8 MHz / 8) × 336 / 2 = 1 MHz × 336 / 2 = 168 MHz

That setting is valid on F405/F407 parts. On the STM32F401, the maximum SYSCLK is 84 MHz, so we configure PLLP = 4 instead:

(8 MHz / 8) × 336 / 4 = 1 MHz × 336 / 4 = 84 MHz
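
To make the arithmetic above checkable, here is a minimal host-side sketch that evaluates the formula and rejects out-of-range dividers. The names pll_freqs_t and pll_compute are hypothetical, and the Q range (2..15) and the 192 to 432 MHz VCO window are assumptions taken from the F401 documentation that you should confirm for your part.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: the PLL-derived frequencies in Hz. */
typedef struct {
    uint32_t vco_hz;     /* (src / M) * N */
    uint32_t sysclk_hz;  /* VCO / P */
    uint32_t usb_hz;     /* VCO / Q (integer division, may truncate) */
} pll_freqs_t;

/* Returns false if any divider is outside its documented range or the
 * VCO frequency leaves the assumed 192..432 MHz window. */
static bool pll_compute(uint32_t src_hz, uint32_t m, uint32_t n,
                        uint32_t p, uint32_t q, pll_freqs_t *out)
{
    if (m < 2u || m > 63u) return false;
    if (n < 192u || n > 432u) return false;
    if (p != 2u && p != 4u && p != 6u && p != 8u) return false;
    if (q < 2u || q > 15u) return false;

    /* Multiply before dividing, in 64 bits, to avoid truncation and overflow. */
    uint64_t vco = (uint64_t)src_hz * n / m;
    if (vco < 192000000u || vco > 432000000u) return false;

    out->vco_hz = (uint32_t)vco;
    out->sysclk_hz = (uint32_t)(vco / p);
    out->usb_hz = (uint32_t)(vco / q);
    return true;
}
```

Calling pll_compute(8000000, 8, 336, 4, 7, &f) fills f with a 336 MHz VCO, 84 MHz SYSCLK, and 48 MHz USB clock; an invalid P such as 3 is rejected.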

Step 1: Enable HSE and Wait for Ready

Before touching the PLL, we must enable and stabilize the HSE oscillator. Writing to HSEON in RCC->CR starts the oscillator; polling HSERDY confirms it's stable:

RCC->CR |= RCC_CR_HSEON;                    /* start HSE oscillator */
while (!(RCC->CR & RCC_CR_HSERDY))           /* wait for ready */
    ;                                         /* typically ~1 ms */

Always verify the ready flag. I've seen boards where a damaged crystal or incorrect load capacitors caused HSERDY to never set — the code silently hangs here, which is infinitely better than a peripheral that runs at a wrong clock rate.
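
If a silent hang is not acceptable in your product, one option is a bounded poll that lets the caller fall back to HSI. The sketch below is an assumption-laden illustration: wait_for_flag is a hypothetical helper, and the poll budget is a plain loop count rather than a calibrated time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Polls *reg until any bit in mask is set, or until max_polls reads
 * have been made. Returns true if the flag was seen. */
static bool wait_for_flag(volatile const uint32_t *reg, uint32_t mask,
                          uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if ((*reg & mask) != 0u) {
            return true;
        }
    }
    return false;   /* timed out: the caller decides how to recover */
}
```

On target this would be called as wait_for_flag(&RCC->CR, RCC_CR_HSERDY, budget), where budget is a value you choose from the crystal's worst-case startup time; on a false return the firmware can stay on HSI and report the fault.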

Step 2: Configure Flash Wait States

This is the step most clock tutorials skip. The STM32F4 flash cannot keep up with the CPU at full speed, so reads need wait states, and the required number depends on both HCLK and supply voltage. On the F401 at 2.7–3.6 V, 84 MHz needs 2 wait states; the lowest supply range needs up to 5 (check the wait-state table in the reference manual for your voltage). Running with too few wait states causes flash read failures, typically manifesting as hard faults or random instruction corruption. Set this before switching the system clock:

/* FLASH_ACR: 2 wait states for 84 MHz on F401 at 2.7–3.6 V */
FLASH->ACR = FLASH_ACR_LATENCY_2WS | FLASH_ACR_PRFTEN | FLASH_ACR_ICEN | FLASH_ACR_DCEN;

The prefetch buffer (PRFTEN), instruction cache (ICEN), and data cache (DCEN) should all be enabled for maximum execution performance from flash.
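
The wait-state count depends on supply voltage as well as HCLK. As a rough sketch for the 2.7 to 3.6 V range only, assuming the F401 reference manual's thresholds of one extra wait state per 30 MHz, a lookup could look like this (flash_wait_states_f401_3v3 is a hypothetical name; lower supply ranges need more wait states and are not covered):

```c
#include <stdint.h>

/* Assumed thresholds for VDD = 2.7..3.6 V on the F401:
 *   0 < HCLK <= 30 MHz -> 0 WS
 *  30 < HCLK <= 60 MHz -> 1 WS
 *  60 < HCLK <= 84 MHz -> 2 WS
 * Verify against the reference manual for your part and voltage. */
static uint32_t flash_wait_states_f401_3v3(uint32_t hclk_hz)
{
    if (hclk_hz == 0u) {
        return 0u;
    }
    return (hclk_hz - 1u) / 30000000u;
}
```

The function does not clamp at 84 MHz, so the caller is expected to reject frequencies the part cannot run at.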

Step 3: Configure AHB and APB Prescalers

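These prescalers divide SYSCLK down to the peripheral bus domains. On STM32F4:

  HPRE  = AHB prescaler: SYSCLK → HCLK (CPU, memory, DMA)
  PPRE1 = APB1 prescaler: HCLK → PCLK1 (low-speed bus, max 42 MHz on F401)
  PPRE2 = APB2 prescaler: HCLK → PCLK2 (high-speed bus, max 84 MHz on F401)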

We want HCLK = 84 MHz (÷1), PCLK1 = 42 MHz (÷2), PCLK2 = 84 MHz (÷1):

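/* Clear the fields first: the DIV1 constants are 0, so |= alone cannot select them */
uint32_t cfgr = RCC->CFGR & ~(RCC_CFGR_HPRE | RCC_CFGR_PPRE1 | RCC_CFGR_PPRE2);
cfgr |= RCC_CFGR_HPRE_DIV1;                 /* AHB prescaler = 1 */
cfgr |= RCC_CFGR_PPRE1_DIV2;                /* APB1 prescaler = 2: 42 MHz */
cfgr |= RCC_CFGR_PPRE2_DIV1;                /* APB2 prescaler = 1: 84 MHz */
RCC->CFGR = cfgr;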

Critical note for timer users: APB1 timers clock at 2 × PCLK1 when the prescaler is > 1. So with PPRE1 = 2, timer clocks on APB1 run at 84 MHz, not 42 MHz. This is a common source of confusion when configuring timer periods.
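
The timer clock rule is easy to encode once and reuse when computing timer periods. This is a minimal sketch; apb_timer_clock_hz is a hypothetical helper, and apb_div is the divider value itself (1, 2, 4, 8, or 16), not the register encoding.

```c
#include <stdint.h>

/* Timer kernel clock on an APB bus: equal to PCLK when the APB
 * prescaler is 1, otherwise twice PCLK. */
static uint32_t apb_timer_clock_hz(uint32_t hclk_hz, uint32_t apb_div)
{
    uint32_t pclk_hz = hclk_hz / apb_div;
    return (apb_div == 1u) ? pclk_hz : 2u * pclk_hz;
}
```

With HCLK = 84 MHz, both apb_div = 1 and apb_div = 2 give an 84 MHz timer clock, which is why the APB1 timers in this configuration tick at the same rate as the APB2 ones.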

Step 4: Configure the PLL

Set the PLL source to HSE, configure the M, N, P, and Q dividers in a single write, then enable the PLL. Note that PLLP is a 2-bit encoded field (0b00 = ÷2, 0b01 = ÷4, 0b10 = ÷6, 0b11 = ÷8), so ÷4 is written as 1, not 4:

/* PLLCFGR may only be written while the PLL is off (it is off after reset) */
RCC->PLLCFGR = (8 << RCC_PLLCFGR_PLLM_Pos)   /* M = 8  : 8 MHz / 8 = 1 MHz VCO input */
              | (336 << RCC_PLLCFGR_PLLN_Pos) /* N = 336: 1 MHz × 336 = 336 MHz VCO */
              | (1 << RCC_PLLCFGR_PLLP_Pos)   /* P = 4  : encoded as 0b01, 336 / 4 = 84 MHz SYSCLK */
              | (7 << RCC_PLLCFGR_PLLQ_Pos)   /* Q = 7  : 336 / 7 = 48 MHz for USB */
              | RCC_PLLCFGR_PLLSRC_HSE;        /* source = HSE */

RCC->CR |= RCC_CR_PLLON;                    /* enable PLL */
while (!(RCC->CR & RCC_CR_PLLRDY))          /* wait for PLL lock */
    ;                                         /* up to a few hundred µs */

The VCO frequency (HSE/M × N) must stay within 192–432 MHz on the F401. Our VCO is 336 MHz, well within range. The PLLQ output targets 48 MHz for USB OTG — the Q divider must be chosen so that 336 MHz / Q = 48 MHz exactly. Q = 7 gives 48 MHz. For SDIO, the same 48 MHz clock is used.
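
One detail worth isolating is that the P divider is stored encoded: the 2-bit PLLP field holds (P / 2) - 1, not P itself. The sketch below packs a PLLCFGR value from human-readable dividers. pllcfgr_pack is a hypothetical helper, and the bit positions are hard-coded from the STM32F4 register description so the example compiles on a host; with the CMSIS headers you would use the RCC_PLLCFGR_*_Pos macros instead.

```c
#include <stdint.h>

/* RCC_PLLCFGR field positions (assumed from the STM32F4 register map). */
#define PLLM_POS   0u    /* bits 5:0   */
#define PLLN_POS   6u    /* bits 14:6  */
#define PLLP_POS   16u   /* bits 17:16 */
#define PLLSRC_POS 22u   /* bit 22: 0 = HSI, 1 = HSE */
#define PLLQ_POS   24u   /* bits 27:24 */

/* p is the real divider (2, 4, 6, or 8); it is encoded before packing. */
static uint32_t pllcfgr_pack(uint32_t m, uint32_t n, uint32_t p,
                             uint32_t q, uint32_t src_is_hse)
{
    uint32_t p_field = (p / 2u) - 1u;   /* 2,4,6,8 -> 0,1,2,3 */

    return (m << PLLM_POS)
         | (n << PLLN_POS)
         | (p_field << PLLP_POS)
         | ((src_is_hse ? 1u : 0u) << PLLSRC_POS)
         | (q << PLLQ_POS);
}
```

For M = 8, N = 336, P = 4, Q = 7 with HSE as the source, this yields 0x07415408. Comparing a value like that against a debugger read of RCC->PLLCFGR is a quick way to confirm the dividers landed where you intended.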

Step 5: Switch System Clock to PLL

Now we switch SYSCLK from HSI (the default after reset) to PLL. Use the SW bits in RCC->CFGR and poll SWS to confirm the switch completed:

RCC->CFGR = (RCC->CFGR & ~RCC_CFGR_SW) | RCC_CFGR_SW_PLL;  /* SYSCLK = PLL output */
while ((RCC->CFGR & RCC_CFGR_SWS_Msk) != RCC_CFGR_SWS_PLL)
    ;                                         /* wait for switch complete */

The switch is glitch-free: the clock mux waits for a safe clock edge from the new source before switching. There's no risk of a mid-instruction clock glitch during the transition.

Step 6: Verify the Result

You can read back the current system clock status from SWS to confirm the switch took effect. To verify the actual frequency, toggle a GPIO pin in the main loop and measure it with an oscilloscope, or configure a timer output to generate a known frequency and check it against a reference:

/* Read current SYSCLK status */
uint32_t sws = (RCC->CFGR & RCC_CFGR_SWS_Msk) >> RCC_CFGR_SWS_Pos;
/* 0b00 = HSI, 0b01 = HSE, 0b10 = PLL, 0b11 = not used */
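
SWS only tells you which source is selected, not what frequency it produces. A software cross-check is to recompute SYSCLK from the raw register values, similar in spirit to what CMSIS's SystemCoreClockUpdate() does. The sketch below is host-runnable and takes the register contents as plain integers; sysclk_from_regs is a hypothetical name and the bit positions are assumptions from the STM32F4 register map.

```c
#include <stdint.h>

/* Recomputes SYSCLK in Hz from raw RCC_CFGR and RCC_PLLCFGR values.
 * Returns 0 for a reserved SWS code or an invalid M divider. */
static uint32_t sysclk_from_regs(uint32_t cfgr, uint32_t pllcfgr,
                                 uint32_t hse_hz)
{
    const uint32_t hsi_hz = 16000000u;

    switch ((cfgr >> 2) & 0x3u) {                 /* SWS, bits 3:2 */
    case 0u:
        return hsi_hz;
    case 1u:
        return hse_hz;
    case 2u: {
        uint32_t m = pllcfgr & 0x3Fu;                         /* bits 5:0   */
        uint32_t n = (pllcfgr >> 6) & 0x1FFu;                 /* bits 14:6  */
        uint32_t p = (((pllcfgr >> 16) & 0x3u) + 1u) * 2u;    /* decode P   */
        uint32_t src = ((pllcfgr >> 22) & 0x1u) ? hse_hz : hsi_hz;
        if (m == 0u) {
            return 0u;
        }
        return (uint32_t)((uint64_t)src * n / m / p);
    }
    default:
        return 0u;
    }
}
```

Feeding it CFGR = 0x8 (SWS = PLL) and PLLCFGR = 0x07415408 with an 8 MHz HSE gives 84 MHz. If the divider value 4 were written straight into the PLLP position (0x07445408), the 2-bit field would read 0b00 and the same function would report 168 MHz, which is exactly the kind of mistake this check catches.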

PLLQ and USB Clock Constraints

The USB OTG FS peripheral requires a 48 MHz clock. On the STM32F401, this comes from the main PLL's Q output (some larger F4 parts can also source it from a secondary PLL). If you change the PLL configuration for power optimization or a different SYSCLK, you must recalculate Q so that:

PLLQ = (HSE / M) × N / Q = 48 MHz (exact)

Not all combinations produce an exact 48 MHz. Here are working combinations for common crystals:

Crystal   M    N     P   Q   SYSCLK    USB
8 MHz     8    336   4   7   84 MHz    48.0 MHz ✓
8 MHz     8    336   2   7   168 MHz   48.0 MHz ✓ (F405/F407 only)
12 MHz    12   336   4   7   84 MHz    48.0 MHz ✓
25 MHz    25   336   4   7   84 MHz    48.0 MHz ✓
8 MHz     8    300   4   8   75 MHz    37.5 MHz ✗

Always confirm that Q produces an exact 48 MHz division. USB full speed tolerates only about ±0.25% clock error, so an off-frequency clock such as the 37.5 MHz in the last row causes enumeration failures or intermittent disconnects.
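
Checking for an exact division is mechanical, so it can be automated. A minimal sketch, assuming the Q field accepts 2..15 (pll_find_q_for_usb is a hypothetical helper):

```c
#include <stdint.h>

/* Returns the Q divider (2..15) for which vco_hz / Q is exactly 48 MHz,
 * or 0 if no integer divider in range works. */
static uint32_t pll_find_q_for_usb(uint32_t vco_hz)
{
    for (uint32_t q = 2u; q <= 15u; ++q) {
        if (vco_hz == q * 48000000u) {
            return q;
        }
    }
    return 0u;
}
```

A 336 MHz VCO returns 7, while a 300 MHz VCO returns 0 because 300 / 48 is not an integer; in that case N has to change before USB can work.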

Clock Security System (CSS)

The STM32F4 includes a Clock Security System that automatically switches to HSI if the HSE oscillator fails. Enable it after HSE stabilizes:

RCC->CR |= RCC_CR_CSSON;                    /* enable clock security */

When CSS detects an HSE failure, it disables HSE (and the PLL, if HSE was feeding it), switches SYSCLK to HSI, and raises the clock security interrupt, which is routed to the NMI. On production hardware, I always enable CSS: a crystal going bad in the field is rare but catastrophic if it causes a silent hang. With CSS enabled, the system degrades gracefully to HSI (16 MHz) instead of locking up.

⚠️ CSS NMI handler required

If you enable CSS, you MUST provide an NMI_Handler that checks the CSS flag (RCC->CIR & RCC_CIR_CSSF), clears it by writing RCC_CIR_CSSC, and takes appropriate action, typically re-initializing the clock tree from HSI and alerting the application layer. If the flag is never cleared, the NMI re-enters immediately and forever, and the default NMI handler in most startup files is just an infinite loop, so there is no recovery path.
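
The core of such a handler is small: test the flag, acknowledge it, record the event. The sketch below takes the register address as a parameter so it can be exercised on a host; css_service and g_hse_failed are hypothetical names, and the bit positions (CSSF = bit 7, CSSC = bit 23 of RCC_CIR) are assumptions to confirm against your reference manual.

```c
#include <stdbool.h>
#include <stdint.h>

#define CIR_CSSF (1u << 7)    /* clock security interrupt flag */
#define CIR_CSSC (1u << 23)   /* write 1 to clear CSSF */

/* Hypothetical flag that the main loop polls to log the failure. */
static volatile bool g_hse_failed;

/* Returns true if a CSS event was pending and has been acknowledged. */
static bool css_service(volatile uint32_t *rcc_cir)
{
    if ((*rcc_cir & CIR_CSSF) == 0u) {
        return false;             /* NMI came from somewhere else */
    }
    *rcc_cir |= CIR_CSSC;         /* acknowledge, or the NMI re-enters */
    g_hse_failed = true;          /* heavy recovery happens outside the ISR */
    return true;
}
```

On target, NMI_Handler would simply call css_service(&RCC->CIR). Keeping the handler this short and deferring clock re-initialization to the main loop is a design choice, not a requirement.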

Practical example: Minimal clock init for STM32F401 from HSE

Here is a complete, self-contained clock initialization function that configures 84 MHz from an 8 MHz HSE crystal:

#include "stm32f4xx.h"   /* CMSIS device header: RCC, FLASH, bit definitions */

void clock_init_84mhz(void)
{
    /* 1. Enable HSE */
    RCC->CR |= RCC_CR_HSEON;
    while (!(RCC->CR & RCC_CR_HSERDY));

    /* 2. Flash wait states for 84 MHz (2 WS at 2.7–3.6 V) */
    FLASH->ACR = FLASH_ACR_LATENCY_2WS
               | FLASH_ACR_PRFTEN
               | FLASH_ACR_ICEN
               | FLASH_ACR_DCEN;

    /* 3. AHB/APB prescalers: clear the fields, then set them */
    RCC->CFGR = (RCC->CFGR & ~(RCC_CFGR_HPRE | RCC_CFGR_PPRE1 | RCC_CFGR_PPRE2))
              | RCC_CFGR_HPRE_DIV1           /* AHB = SYSCLK / 1 */
              | RCC_CFGR_PPRE1_DIV2          /* APB1 = HCLK / 2 */
              | RCC_CFGR_PPRE2_DIV1;         /* APB2 = HCLK / 1 */

    /* 4. PLL: 8 MHz / 8 * 336 / 4 = 84 MHz */
    RCC->PLLCFGR = (8  << RCC_PLLCFGR_PLLM_Pos)
                 | (336 << RCC_PLLCFGR_PLLN_Pos)
                 | (1  << RCC_PLLCFGR_PLLP_Pos)  /* encoded: 0b01 = /4 */
                 | (7  << RCC_PLLCFGR_PLLQ_Pos)
                 | RCC_PLLCFGR_PLLSRC_HSE;

    RCC->CR |= RCC_CR_PLLON;
    while (!(RCC->CR & RCC_CR_PLLRDY));

    /* 5. Switch SYSCLK to PLL */
    RCC->CFGR = (RCC->CFGR & ~RCC_CFGR_SW) | RCC_CFGR_SW_PLL;
    while ((RCC->CFGR & RCC_CFGR_SWS_Msk) != RCC_CFGR_SWS_PLL);

    /* 6. Optional: enable Clock Security System */
    RCC->CR |= RCC_CR_CSSON;                 /* CSS: needs an NMI handler */
}

Practical checklist

Step            Register              What to verify
HSE startup     RCC->CR (HSERDY)      Bit set after HSEON; if stuck, check crystal/load caps
Flash latency   FLASH->ACR            Latency set BEFORE switching clock
AHB prescaler   RCC->CFGR (HPRE)      HCLK ≤ max frequency for the part
APB1 prescaler  RCC->CFGR (PPRE1)     PCLK1 ≤ 42 MHz (F401); timer clock = 2× if > 1
PLL VCO range   RCC->PLLCFGR          192 ≤ VCO ≤ 432 MHz (F401)
PLL lock        RCC->CR (PLLRDY)      Bit set after PLLON
Clock switch    RCC->CFGR (SWS)       Must read 0b10 (PLL) after switching
USB clock       RCC->PLLCFGR (PLLQ)   Must be exactly 48 MHz for USB
CSS             RCC->CR (CSSON)       NMI handler present

How I would approach this on a client project

On a production codebase, I use a structured clock configuration table rather than scattering the PLL magic numbers across multiple source files. A single clock_cfg.h header defines the target frequencies and computes the register values at compile time when possible:

/* clock_cfg.h — single source of truth */
#define HSE_FREQ_HZ    8000000UL
#define SYSCLK_FREQ_HZ 84000000UL

/* Computed PLL parameters — verify with the datasheet */
#define PLL_M          8
#define PLL_N          336
#define PLL_P          4   /* 2, 4, 6, or 8 */
#define PLL_Q          7   /* VCO / PLL_Q must be exactly 48 MHz */

I also validate the PLL parameters at build time with static assertions (_Static_assert) checking that the VCO is within range and that USB gets exactly 48 MHz. This catches configuration drift when someone changes the crystal frequency and forgets to update the PLL dividers.
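
As an illustration of those build-time checks, here is a minimal C11 sketch. It repeats the defines so it compiles on its own; PLL_VCO_HZ is a hypothetical helper macro, and the 192 to 432 MHz window is the F401 limit assumed throughout this article.

```c
/* Repeated from clock_cfg.h so this fragment is self-contained. */
#define HSE_FREQ_HZ    8000000UL
#define SYSCLK_FREQ_HZ 84000000UL
#define PLL_M          8
#define PLL_N          336
#define PLL_P          4
#define PLL_Q          7

/* 1ULL forces 64-bit arithmetic so large crystals cannot overflow. */
#define PLL_VCO_HZ (HSE_FREQ_HZ * 1ULL * PLL_N / PLL_M)

_Static_assert((HSE_FREQ_HZ * 1ULL * PLL_N) % PLL_M == 0,
               "VCO frequency is not an integer number of Hz");
_Static_assert(PLL_VCO_HZ >= 192000000ULL && PLL_VCO_HZ <= 432000000ULL,
               "PLL VCO outside the 192..432 MHz window");
_Static_assert(PLL_P == 2 || PLL_P == 4 || PLL_P == 6 || PLL_P == 8,
               "PLL_P must be 2, 4, 6, or 8");
_Static_assert(PLL_VCO_HZ == SYSCLK_FREQ_HZ * 1ULL * PLL_P,
               "PLL dividers do not produce SYSCLK_FREQ_HZ");
_Static_assert(PLL_VCO_HZ == 48000000ULL * PLL_Q,
               "USB clock is not exactly 48 MHz");
```

Changing HSE_FREQ_HZ to 12000000UL without touching PLL_M makes the build fail on the VCO range check, which is the drift this guards against.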

For the NMI handler, I allocate a small recovery function in the startup file's NMI vector that reads the CSS flag, re-initializes the clock to HSI, and sets a global error flag that the fault logger picks up on the next main-loop iteration. This gives the firmware a fighting chance to log the failure and enter a safe state instead of bricking silently.
