STM32 Flash Memory: Writing, Erasing, and Managing Persistent Data at Runtime

2026-06-02 · Davide Carrese
STM32 · Flash · Firmware · Bootloader

Every production embedded device eventually needs to persist data — calibration coefficients, serial numbers, network credentials, or application logs. On an STM32 without external EEPROM, the on-chip flash is the only non-volatile storage available. Writing to flash at runtime is completely different from writing to RAM: you must manage unlock sequences, erase-before-write constraints, alignment rules, and timing. This article covers the register-level flash programming model across STM32F4, G0, and U5 series, with practical patterns for EEPROM emulation and bootloader integration.

Flash architecture: sectors vs pages

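The first thing to understand is that STM32 flash is not organised uniformly. The erase granularity, the smallest region you can erase in one operation, varies by family: F4 uses sectors of unequal size (16 KB, 64 KB, and 128 KB), G0 uses uniform 2 KB pages, and U5 uses uniform 8 KB pages.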

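The erase time also differs: going by the datasheet figures, erasing a 16 KB sector on F4 typically takes a few hundred milliseconds (and a 128 KB sector about a second), while erasing a 2 KB page on G0 takes roughly 20–40 ms. Your firmware must tolerate these latencies: on single-bank parts the CPU stalls on any fetch from flash while the erase runs, so never erase from an interrupt context unless you can accept the stall.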
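
To make the non-uniform F4 layout concrete, here is a minimal sketch that maps an address to its sector number. It assumes the single-bank 1 MB layout (four 16 KB sectors, one 64 KB sector, then 128 KB sectors); the name f4_sector_from_addr is hypothetical, and the sector table in your reference manual remains the authority.

```c
#include <stdint.h>

#define F4_FLASH_BASE 0x08000000u
#define F4_FLASH_SIZE 0x00100000u  /* assumption: 1 MB single-bank part */

/* Returns the sector number (0..11) containing addr, or -1 if out of range. */
int f4_sector_from_addr(uint32_t addr)
{
    if (addr < F4_FLASH_BASE || addr >= F4_FLASH_BASE + F4_FLASH_SIZE) {
        return -1;
    }
    uint32_t off = addr - F4_FLASH_BASE;
    if (off < 0x10000u) {                 /* sectors 0-3: 16 KB each */
        return (int)(off / 0x4000u);
    }
    if (off < 0x20000u) {                 /* sector 4: 64 KB */
        return 4;
    }
    return (int)(5u + (off - 0x20000u) / 0x20000u);  /* sectors 5-11: 128 KB */
}
```

With this layout, 0x08020000 is the start of sector 5 and 0x08040000 is the start of sector 6.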

Flash controller registers (STM32F4 as reference)

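The flash interface is controlled through the FLASH->CR (control), FLASH->SR (status), and FLASH->ACR (access control) registers. The key bits used in this article are PG (programming), SER and SNB (sector erase and sector number), STRT (start erase), and LOCK in FLASH->CR, plus BSY in FLASH->SR.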

The unlock sequence is a write of 0x45670123 then 0xCDEF89AB to FLASH->KEYR. A wrong key sequence raises a bus error and leaves FLASH->CR locked until the next reset.

Register-level flash write

Writing a 32-bit word to flash on STM32F4 follows this sequence:

// 1. Unlock the flash controller
FLASH->KEYR = 0x45670123;
FLASH->KEYR = 0xCDEF89AB;

// 2. Wait for any previous operation to complete
while (FLASH->SR & FLASH_SR_BSY);

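// 3. Select 32-bit parallelism (requires VDD of 2.7–3.6 V), then enable programming
FLASH->CR = (FLASH->CR & ~FLASH_CR_PSIZE) | FLASH_CR_PSIZE_1;
FLASH->CR |= FLASH_CR_PG;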

// 4. Write the word (32-bit aligned address!)
*(volatile uint32_t *)0x08040000 = 0xAABBCCDD;

// 5. Wait for completion (typically 10–50 µs per word)
while (FLASH->SR & FLASH_SR_BSY);

// 6. Verify: read back and compare
if (*(volatile uint32_t *)0x08040000 != 0xAABBCCDD) {
    // Handle write failure
}

// 7. Leave programming mode and lock the controller
FLASH->CR &= ~FLASH_CR_PG;
FLASH->CR |= FLASH_CR_LOCK;

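Key constraints: the target word must be erased (0xFFFFFFFF) before you program it, because programming can only change bits from 1 to 0; the address must be aligned to the write width; and on F4 the write width must match the PSIZE setting, which in turn depends on the supply voltage.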
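
The erase-before-write rule follows from how the cells behave: programming can only clear bits, and only an erase sets them back to 1. A small host-side model (not target code) makes this visible. The sim_flash_* names are hypothetical, and the AND behaviour shown is the F4-style one; families with ECC, such as G0, reject a second write to an already programmed double word instead.

```c
#include <stdint.h>

/* Host-side model of a single flash word. */
static uint32_t sim_word = 0xFFFFFFFFu;   /* erased state: all bits 1 */

/* Erase restores every bit to 1. */
void sim_flash_erase(void) { sim_word = 0xFFFFFFFFu; }

/* Programming can only turn 1 bits into 0 bits, hence the AND. */
void sim_flash_program(uint32_t value) { sim_word &= value; }

uint32_t sim_flash_read(void) { return sim_word; }
```

Programming 0x12345678 over an existing 0xAABBCCDD without erasing first leaves the bitwise AND of the two values, which matches neither.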

Writing multiple words

For efficiency, you can keep FLASH_CR_PG set and write consecutive 32-bit words. The hardware handles the internal programming sequence for each word. Just poll BSY between writes:

FLASH->CR |= FLASH_CR_PG;
for (size_t i = 0; i < 64; i++) {      // 64 words = 256 bytes
    ((volatile uint32_t *)dest)[i] = data[i];
    while (FLASH->SR & FLASH_SR_BSY);
}
FLASH->CR &= ~FLASH_CR_PG;

Some newer families offer faster bulk modes that program a whole row or burst with fewer CPU wait cycles: fast programming (FLASH_CR_FSTPG) on G0/G4, and burst write (the BWR bit in FLASH_NSCR) on U5. Check the reference manual for your target series.
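
Real payloads are rarely a multiple of the write width. One way to handle the tail, sketched here under the assumption that padding with the erased value 0xFF is acceptable for your data format, is to pack the bytes into a word buffer first. The helper name pack_words is hypothetical.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Packs len bytes into 32-bit words, padding the tail with 0xFF (the erased
 * value) so the padding leaves those flash bits untouched.
 * Returns the number of words to program, or 0 if out is too small. */
size_t pack_words(const uint8_t *src, size_t len, uint32_t *out, size_t out_words)
{
    size_t words = (len + 3u) / 4u;
    if (words > out_words) {
        return 0;
    }
    memset(out, 0xFF, words * sizeof(uint32_t));
    memcpy(out, src, len);
    return words;
}
```

The returned word count can then drive the write loop in place of the fixed 64.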

Register-level flash erase

Erasing a sector on STM32F4:

// 1. Unlock
FLASH->KEYR = 0x45670123;
FLASH->KEYR = 0xCDEF89AB;
while (FLASH->SR & FLASH_SR_BSY);

// 2. Select sector erase, pick sector number
FLASH->CR &= ~FLASH_CR_SNB;                              // Clear SNB bits
FLASH->CR |= FLASH_CR_SER | (5UL << FLASH_CR_SNB_Pos);   // Sector erase mode, SNB = 5 (128 KB on F401)

// 3. Start erase
FLASH->CR |= FLASH_CR_STRT;

// 4. Wait for completion (hundreds of ms for 16 KB, about 1–2 s for 128 KB; see the datasheet)
while (FLASH->SR & FLASH_SR_BSY);

// 5. Clear sector erase mode, then lock
FLASH->CR &= ~FLASH_CR_SER;
FLASH->CR |= FLASH_CR_LOCK;

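On G0/G4 with page erase, the sequence is similar but uses FLASH_CR_PER (page erase) and FLASH_CR_PNB (page number) instead of SER/SNB. The main-flash unlock keys are the same (0x45670123 / 0xCDEF89AB), and option bytes use a separate key pair. Note that G0/G4 program flash in 64-bit double words rather than 32-bit words, and that the G0 CMSIS headers name the busy flag FLASH_SR_BSY1.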
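
Because pages are uniform, the value for PNB is simple arithmetic. A minimal sketch, assuming 2 KB pages and main flash starting at 0x08000000 (the helper names are hypothetical):

```c
#include <stdint.h>

#define G0_FLASH_BASE 0x08000000u
#define G0_PAGE_SIZE  2048u

/* Page index for an address in main flash (caller checks the upper bound). */
static inline uint32_t g0_page_from_addr(uint32_t addr)
{
    return (addr - G0_FLASH_BASE) / G0_PAGE_SIZE;
}

/* Start address of a given page. */
static inline uint32_t g0_page_addr(uint32_t page)
{
    return G0_FLASH_BASE + page * G0_PAGE_SIZE;
}
```

On a 128 KB part this gives 64 pages, with page 63 starting at 0x0801F800.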

Option bytes

Option bytes control hardware configuration — read-out protection (RDP), hardware watchdog, boot mode, and flash write-protection sectors. They are stored in a dedicated flash region and must be programmed through a separate unlock sequence:

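// Unlock option bytes
FLASH->OPTKEYR = 0x08192A3B;
FLASH->OPTKEYR = 0x4C5D6E7F;
while (FLASH->SR & FLASH_SR_BSY);

// On F4 the option bytes are written through FLASH->OPTCR, not by a
// memory-mapped store. Example: set RDP to level 1.
// RDP byte: 0xAA = level 0, 0xCC = level 2 (irreversible), anything else = level 1
FLASH->OPTCR = (FLASH->OPTCR & ~FLASH_OPTCR_RDP) | (0x55UL << FLASH_OPTCR_RDP_Pos);

// Start the option byte write and wait for it to finish
FLASH->OPTCR |= FLASH_OPTCR_OPTSTRT;
while (FLASH->SR & FLASH_SR_BSY);

// Relock option bytes (G0/G4 instead set OBL_LAUNCH to reload them, which resets the MCU)
FLASH->OPTCR |= FLASH_OPTCR_OPTLOCK;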

Warning: setting RDP level 2 permanently disables debugging — irreversible. On client projects, I always set RDP level 1 (debug still possible but flash reads from external debugger are blocked) only after the production programming fixture is validated. Never prototype with RDP level 2.
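
Since a single byte value separates a reversible setting from a permanent one, it can be worth guarding it in code. The sketch below assumes the F4/G0 encoding of the RDP byte (0xAA for level 0, 0xCC for level 2, any other value for level 1); the function names are hypothetical, and other families may define extra levels.

```c
#include <stdint.h>

/* Decode the RDP option byte into a protection level (0, 1 or 2). */
int rdp_level_from_byte(uint8_t rdp)
{
    if (rdp == 0xAAu) {
        return 0;   /* level 0: no protection */
    }
    if (rdp == 0xCCu) {
        return 2;   /* level 2: permanent, debug disabled */
    }
    return 1;       /* any other value: level 1 */
}

/* Guard for production tooling: refuse values that would select level 2. */
int rdp_byte_is_reversible(uint8_t rdp)
{
    return rdp_level_from_byte(rdp) != 2;
}
```

A programming fixture could call rdp_byte_is_reversible() before touching the option bytes and abort if it returns 0.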

EEPROM emulation: the two-page swap pattern

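Since flash cannot be written in place (you must erase the whole sector/page first), you cannot use it like EEPROM. The standard workaround is a two-page swap algorithm: new records are appended to the active page, and when it fills up, the latest valid records are copied to the other page, the full page is erased, and the two pages swap roles.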

Here is a minimal implementation sketch:

#define PAGE_A_ADDR 0x08040000
#define PAGE_B_ADDR 0x08040800  // 2 KB pages on G0/G4
#define PAGE_SIZE   2048

typedef struct {
    uint16_t id;         // Variable identifier
    uint16_t len;        // Data length (bytes)
    uint8_t  data[252];  // Payload
} eeprom_record_t;

static uint32_t current_page = PAGE_A_ADDR;

void eeprom_write(uint16_t id, const uint8_t *data, uint16_t len) {
    // Find next free slot in current page (scan for 0xFFFF header)
    // If no space left, do the swap: copy valid records to the other page
    // Erase the full page, then switch active page
    // Write the new record at the next free offset
}

void eeprom_init(void) {
    // Scan Page A and Page B to determine which is active
    // The active page contains more recent data (valid records + FFFF tail)
}

On STM32F4 with 16 KB sectors, this pattern wastes more space but still works for calibration storage. I typically reserve a sector near the end of flash (sector 11 on 1 MB parts) for data, leaving the rest for code; bear in mind that the 16 KB sectors sit at the start of flash and the last sectors are 128 KB each. On G0 with 2 KB pages, the overhead is minimal, which suits 100–200 parameters.
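
To show the swap logic end to end, here is a host-side simulation rather than target code. It simplifies the record format to one fixed 8-byte slot per write (a 16-bit id and a 32-bit value) and uses RAM arrays in place of flash pages; every name in it is hypothetical, and a real init would scan the pages instead of erasing them.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 2048u
#define SLOT_SIZE 8u                      /* one 64-bit double word per record */
#define SLOTS     (PAGE_SIZE / SLOT_SIZE)
#define ID_ERASED 0xFFFFu

typedef struct {
    uint16_t id;      /* 0xFFFF means the slot is still erased */
    uint16_t pad;     /* unused, left at 0xFFFF */
    uint32_t value;
} ee_slot_t;

static ee_slot_t pages[2][SLOTS];         /* RAM stand-in for two flash pages */
static unsigned  active;                  /* index of the active page */
static unsigned  next_free;               /* first erased slot in that page */

static void page_erase(unsigned p) { memset(pages[p], 0xFF, sizeof pages[p]); }

/* Simulation only: a real init scans both pages instead of erasing them. */
void ee_init(void)
{
    page_erase(0);
    page_erase(1);
    active = 0;
    next_free = 0;
}

/* Latest record wins, so scan backwards from the newest slot. */
int ee_read(uint16_t id, uint32_t *out)
{
    for (unsigned i = next_free; i-- > 0; ) {
        if (pages[active][i].id == id) {
            *out = pages[active][i].value;
            return 0;
        }
    }
    return -1;
}

static void append(unsigned p, unsigned slot, uint16_t id, uint32_t value)
{
    pages[p][slot].id = id;
    pages[p][slot].value = value;
}

/* Copy the newest value of each id to the other page, then erase the old one. */
static void swap_pages(void)
{
    unsigned other = active ^ 1u, n = 0;
    for (unsigned i = next_free; i-- > 0; ) {
        uint16_t id = pages[active][i].id;
        int seen = 0;
        for (unsigned j = 0; j < n; j++) {
            if (pages[other][j].id == id) { seen = 1; break; }
        }
        if (!seen) {
            append(other, n++, id, pages[active][i].value);
        }
    }
    page_erase(active);
    active = other;
    next_free = n;
}

int ee_write(uint16_t id, uint32_t value)
{
    if (id == ID_ERASED) {
        return -1;                        /* reserved as the erased marker */
    }
    if (next_free == SLOTS) {
        swap_pages();
    }
    if (next_free == SLOTS) {
        return -1;                        /* page is full of distinct ids */
    }
    append(active, next_free++, id, value);
    return 0;
}
```

On hardware, the copy should complete before the old page is erased, so that a power cut during the swap always leaves one page with valid data.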

Practical example: writing calibration data on STM32G070

The STM32G070RB has 128 KB flash with 2 KB pages. I reserve page 63 (the last page, 0x0801F800) for calibration storage:

#include <string.h>                  // memcpy, memset, memcmp

#define CAL_PAGE_ADDR  0x0801F800u   // Page 63: 0x08000000 + 63 * 2 KB
#define CAL_PAGE_NUM   63u
#define CAL_MAGIC      0xCA11u

typedef struct __attribute__((packed)) {
    uint32_t magic;
    float    gain_x;
    float    offset_x;
    float    gain_y;
    float    offset_y;
    uint16_t checksum;
} cal_data_t;                        // 22 bytes

int cal_save(const cal_data_t *cfg) {
    // G0 programs 64-bit double words: pad the 22-byte record to 24 bytes
    uint32_t buf[6];
    memset(buf, 0xFF, sizeof buf);
    memcpy(buf, cfg, sizeof *cfg);

    // Erase page first
    FLASH->KEYR = 0x45670123;
    FLASH->KEYR = 0xCDEF89AB;
    while (FLASH->SR & FLASH_SR_BSY1);

    FLASH->CR |= FLASH_CR_PER;
    FLASH->CR &= ~FLASH_CR_PNB_Msk;
    FLASH->CR |= (CAL_PAGE_NUM << FLASH_CR_PNB_Pos);  // Page 63
    FLASH->CR |= FLASH_CR_STRT;
    while (FLASH->SR & FLASH_SR_BSY1);
    FLASH->CR &= ~FLASH_CR_PER;

    // Write data: two consecutive 32-bit stores form one double word
    volatile uint32_t *dst = (volatile uint32_t *)CAL_PAGE_ADDR;
    FLASH->CR |= FLASH_CR_PG;
    for (size_t i = 0; i < 6; i += 2) {
        dst[i]     = buf[i];
        dst[i + 1] = buf[i + 1];
        while (FLASH->SR & FLASH_SR_BSY1);
    }
    FLASH->CR &= ~FLASH_CR_PG;
    FLASH->CR |= FLASH_CR_LOCK;

    // Verify
    return memcmp((const void *)CAL_PAGE_ADDR, cfg, sizeof(cal_data_t)) == 0 ? 0 : -1;
}

This approach is simple and deterministic, and it fails safe on unexpected power loss: either the write completed fully (magic and checksum both valid) or it did not (the page is erased or only partly written, so the magic or checksum check fails), in which case cal_load() returns the factory defaults. On a client project, I add a second backup page so that if the write is interrupted, the previous valid calibration is still available.
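
The article does not show cal_load(), so the following is only a sketch of what the validation could look like. The additive 16-bit checksum, the default values, the CAL_MAGIC value, and the extra src parameter (which lets the function run against a RAM buffer on a host) are all assumptions made for illustration.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define CAL_MAGIC 0xCA11u   /* assumption: placeholder magic value */

typedef struct __attribute__((packed)) {
    uint32_t magic;
    float    gain_x;
    float    offset_x;
    float    gain_y;
    float    offset_y;
    uint16_t checksum;
} cal_data_t;

/* Hypothetical checksum: 16-bit sum of every byte before the checksum field. */
uint16_t cal_checksum(const cal_data_t *c)
{
    const uint8_t *p = (const uint8_t *)c;
    uint16_t sum = 0;
    for (size_t i = 0; i < offsetof(cal_data_t, checksum); i++) {
        sum = (uint16_t)(sum + p[i]);
    }
    return sum;
}

/* Copies a valid record from src into out, otherwise writes factory defaults.
 * Returns 0 if the stored record was used, -1 if defaults were applied. */
int cal_load(const void *src, cal_data_t *out)
{
    static const cal_data_t defaults = { CAL_MAGIC, 1.0f, 0.0f, 1.0f, 0.0f, 0 };

    memcpy(out, src, sizeof *out);
    if (out->magic == CAL_MAGIC && out->checksum == cal_checksum(out)) {
        return 0;
    }
    *out = defaults;
    out->checksum = cal_checksum(out);
    return -1;
}
```

On target, src would be the calibration page address cast to a pointer. A CRC would catch more corruption patterns than a plain sum.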

Bootloader considerations

If your firmware has a bootloader, flash write/erase typically runs from the bootloader context, not the application. The application receives a firmware image over UART/CAN/SPI, stores it temporarily in RAM or a scratch area, then jumps to the bootloader, which programs the main flash.
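
Before the bootloader erases anything, it is usually worth checking that the staged image is intact. The sketch below is one possible shape for that check, not a description of any particular bootloader: the header layout, the IMG_MAGIC value, and the function names are hypothetical. The CRC is the common reflected CRC-32 (polynomial 0xEDB88320), computed bit by bit to stay table-free.

```c
#include <stdint.h>
#include <stddef.h>

#define IMG_MAGIC 0x464D5748u   /* arbitrary marker chosen for this sketch */

/* Hypothetical header stored in front of a staged image. */
typedef struct {
    uint32_t magic;    /* must equal IMG_MAGIC */
    uint32_t length;   /* payload size in bytes */
    uint32_t crc32;    /* CRC-32 of the payload */
} img_header_t;

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320): slow but small. */
uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int b = 0; b < 8; b++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Returns 1 if the image fits within max_len and its CRC matches, else 0. */
int img_is_valid(const img_header_t *h, const uint8_t *payload, uint32_t max_len)
{
    if (h->magic != IMG_MAGIC) {
        return 0;
    }
    if (h->length == 0 || h->length > max_len) {
        return 0;
    }
    return crc32_calc(payload, h->length) == h->crc32;
}
```

Only when img_is_valid() returns 1 would the bootloader go on to erase and program the application area.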

Practical checklist

How I would approach this on a client project

On a production firmware project, I never let application code touch flash registers directly. Flash programming is a critical operation with real consequences if misconfigured — one wrong bit in CR and you can lock yourself out of debugging or corrupt the application vector table.

I write a dedicated flash_driver.c module that exposes only:

int  flash_init(void);
int  flash_erase(uint32_t page_addr);
int  flash_write(uint32_t addr, const uint8_t *data, uint32_t len);
int  flash_read(uint32_t addr, uint8_t *out, uint32_t len);
void flash_lock(void);
void flash_unlock(void);

This module goes through code review with a checklist (the one above). Unit tests verify the API calls the correct register sequences by inspecting a mock FLASH peripheral struct. Integration tests run on actual hardware with a dedicated test firmware that exercises every sector — this catches brownout-sensitive boards early in production.
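
As an illustration of the mock-based unit test idea, here is a minimal host-side sketch. The mock_flash_t layout and keyr_write() are hypothetical stand-ins for the real register block; in the actual driver the store would go to FLASH->KEYR through the CMSIS struct, with the mock swapped in at build time.

```c
#include <stdint.h>

#define MOCK_CR_LOCK (1u << 31)   /* LOCK is bit 31 of FLASH_CR */

/* Host-side stand-in for the FLASH register block (minimal, hypothetical). */
typedef struct {
    uint32_t keyr_log[4];   /* values written to KEYR, in order */
    unsigned keyr_count;    /* number of KEYR writes recorded */
    uint32_t cr;
} mock_flash_t;

static mock_flash_t mock_flash = { {0}, 0, MOCK_CR_LOCK };

/* Records the write and models the hardware: the correct pair clears LOCK. */
static void keyr_write(uint32_t v)
{
    if (mock_flash.keyr_count < 4) {
        mock_flash.keyr_log[mock_flash.keyr_count++] = v;
    }
    if (mock_flash.keyr_count == 2 &&
        mock_flash.keyr_log[0] == 0x45670123u &&
        mock_flash.keyr_log[1] == 0xCDEF89ABu) {
        mock_flash.cr &= ~MOCK_CR_LOCK;
    }
}

/* Driver function under test: writes the keys only while LOCK is set. */
void flash_unlock(void)
{
    if (mock_flash.cr & MOCK_CR_LOCK) {
        keyr_write(0x45670123u);
        keyr_write(0xCDEF89ABu);
    }
}
```

A test then asserts that exactly two KEYR writes happened in the right order, and that a second flash_unlock() call on an already unlocked controller writes nothing.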

The EEPROM emulation layer sits above flash_driver.c and adds the swap/page management. It is board-agnostic: the same .c file compiles on F4, G0, and U5 just by swapping the flash driver underneath. I keep the emulation layer simple — two pages, a 16-bit CRC, and no wear-leveling beyond the swap. For most calibration use cases (fewer than 100 000 writes over the product lifetime), this is more than sufficient.

If the client insists on byte-addressable EEPROM from day one, I evaluate the cost of an external I²C EEPROM (like a 24LC512) versus the two-page flash emulation. For volumes above 10K units, the BOM savings of on-chip flash emulation usually win, and the firmware effort is a one-time cost.
