STM32 Flash Memory: Writing, Erasing, and Managing Persistent Data at Runtime
Every production embedded device eventually needs to persist data — calibration coefficients, serial numbers, network credentials, or application logs. On an STM32 without external EEPROM, the on-chip flash is the only non-volatile storage available. Writing to flash at runtime is completely different from writing to RAM: you must manage unlock sequences, erase-before-write constraints, alignment rules, and timing. This article covers the register-level flash programming model across STM32F4, G0, and U5 series, with practical patterns for EEPROM emulation and bootloader integration.
Flash architecture: sectors vs pages
The first thing to understand is that STM32 flash is not organised uniformly. The erase granularity — the smallest region you can erase in one operation — varies by family:
- STM32F4 (F401/F411/F412): flash is divided into sectors. On a 512 KB device, sectors are 16 KB (4 sectors), 64 KB (1 sector), and 128 KB (3 sectors). You cannot erase less than a full sector.
- STM32G0/G4: flash is divided into pages, typically 2 KB each. Much finer erase granularity, which makes EEPROM emulation simpler.
- STM32U5: uses a dual-bank architecture with pages of 4 KB or 8 KB, depending on the bank configuration. Supports read-while-write (RWW) when banks are independent.
- STM32L0/L4: pages of 128 bytes (L0) and 2 KB (L4). The small page size on L0 suits fine-grained storage, and L0 parts also include a dedicated data EEPROM, so flash emulation is often unnecessary there.
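The erase time also differs, and by more than the size ratio suggests. The F4 datasheets list a typical 16 KB sector erase at roughly 250 to 500 ms and a 128 KB sector erase at 1 to 2 s, depending on the programming parallelism (PSIZE), while a 2 KB page erase on G0 is typically around 22 ms (40 ms maximum). Check the datasheet for your exact part. Your firmware must tolerate these latencies: never erase from an interrupt context unless you can accept the stall.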
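Because F4 sectors are not uniform, firmware usually needs a small lookup to find which sector an address falls in before erasing. The following is a minimal sketch for the 512 KB layout described above (4 × 16 KB, 1 × 64 KB, 3 × 128 KB). The names `FLASH_BASE_ADDR`, `f4_sector_kb`, and `f4_sector_of` are hypothetical and not part of any ST header.

```c
#include <assert.h>
#include <stdint.h>

#define FLASH_BASE_ADDR 0x08000000u   /* start of main flash (hypothetical name) */

/* Sector sizes in KB for a 512 KB STM32F4 part, in address order. */
static const uint16_t f4_sector_kb[] = {16, 16, 16, 16, 64, 128, 128, 128};
#define F4_SECTOR_COUNT (sizeof f4_sector_kb / sizeof f4_sector_kb[0])

/* Returns the index of the sector containing addr, or -1 if addr lies
 * outside the 512 KB main flash. start and size are optional outputs
 * (pass 0 to skip them). */
static int f4_sector_of(uint32_t addr, uint32_t *start, uint32_t *size)
{
    uint32_t base = FLASH_BASE_ADDR;
    for (unsigned i = 0; i < F4_SECTOR_COUNT; i++) {
        uint32_t len = (uint32_t)f4_sector_kb[i] * 1024u;
        if (addr >= base && addr - base < len) {
            if (start) *start = base;
            if (size)  *size = len;
            return (int)i;
        }
        base += len;
    }
    return -1;
}
```

The returned index is the value that goes into the SNB field for a sector erase, and the reported size tells you how much data the erase will destroy.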
Flash controller registers (STM32F4 as reference)
The flash interface is controlled through the FLASH->CR (control), FLASH->SR (status), and FLASH->ACR (access control) registers. The key bits are:
- FLASH_CR_PG (bit 0): Programming enable. Must be set before every write sequence.
- FLASH_CR_SER (bit 1): Sector erase enable.
- FLASH_CR_SNB (bits 3–6): Sector number selection for erase.
- FLASH_CR_STRT (bit 16): Start the erase operation.
- FLASH_CR_LOCK (bit 31): Lock bit. Set after reset; must be unlocked before any write/erase.
- FLASH_SR_BSY (bit 16): Busy flag. Poll this between operations.
- FLASH_SR_PGSERR / FLASH_SR_PGPERR: Programming sequence and parallelism errors.
The unlock sequence is a write of 0x45670123 then 0xCDEF89AB to FLASH->KEYR. Writing any other value, or the keys in the wrong order, leaves FLASH->CR locked until the next reset.
Register-level flash write
Writing a 32-bit word to flash on STM32F4 follows this sequence:
// 1. Unlock the flash controller
FLASH->KEYR = 0x45670123;
FLASH->KEYR = 0xCDEF89AB;
// 2. Wait for any previous operation to complete
while (FLASH->SR & FLASH_SR_BSY);
// 3. Select 32-bit parallelism (PSIZE = 0b10, needs VDD >= 2.7 V) and enable programming
FLASH->CR = (FLASH->CR & ~FLASH_CR_PSIZE) | FLASH_CR_PSIZE_1;
FLASH->CR |= FLASH_CR_PG;
// 4. Write the word (32-bit aligned address!)
*(volatile uint32_t *)0x08040000 = 0xAABBCCDD;
// 5. Wait for completion (typically 10–50 µs per word), then leave programming mode
while (FLASH->SR & FLASH_SR_BSY);
FLASH->CR &= ~FLASH_CR_PG;
// 6. Verify: read back and compare
if (*(volatile uint32_t *)0x08040000 != 0xAABBCCDD) {
    // Handle write failure
}
// 7. Lock the controller
FLASH->CR |= FLASH_CR_LOCK;
Key constraints:
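- The access width must match the configured parallelism. On F4, PSIZE selects byte (x8), half-word (x16), word (x32), or double-word (x64, which requires an external VPP supply) programming, and the address must be aligned to that width; a mismatch sets PGPERR. G0, G4, and L4 always program a 64-bit double word, and U5 programs a 128-bit quad-word.
- Programming can only clear bits (1 to 0), so the target must be erased (all 0xFF) first. On F4, writing over a non-erased word raises no error and leaves the bitwise AND of the old and new values; on G0/G4/L4, the controller rejects the write and sets PROGERR unless the new value is all zeros. Always erase first.
- The supply voltage must be in the specified range. On F4, x32 parallelism requires VDD between 2.7 V and 3.6 V; at lower voltages (down to 1.8 V) you must drop to x16 or x8. A brownout during programming corrupts the flash.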
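The constraints above can be checked in software before touching the controller. This is a minimal sketch, assuming the general NOR-flash rule that programming only moves bits from 1 to 0; whether the hardware flags a violation differs by family. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the word still holds the erased pattern. */
static bool flash_word_erased(uint32_t current)
{
    return current == 0xFFFFFFFFu;
}

/* Programming cannot turn a 0 bit back into 1, so a new value is
 * reachable without an erase only if every bit it needs set is still set. */
static bool flash_word_programmable(uint32_t current, uint32_t wanted)
{
    return (current & wanted) == wanted;
}

/* True if addr is aligned to the programming width (a power of two, in bytes). */
static bool flash_addr_aligned(uint32_t addr, uint32_t width_bytes)
{
    assert(width_bytes != 0u && (width_bytes & (width_bytes - 1u)) == 0u);
    return (addr & (width_bytes - 1u)) == 0u;
}
```

A driver can call these before every write and return an error code instead of relying on status flags that vary between families.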
Writing multiple words
For efficiency, you can keep FLASH_CR_PG set and write consecutive 32-bit words. The hardware handles the internal programming sequence for each word. Just poll BSY between writes:
FLASH->CR |= FLASH_CR_PG;
for (size_t i = 0; i < 64; i++) { // 64 words = 256 bytes
((volatile uint32_t *)dest)[i] = data[i];
while (FLASH->SR & FLASH_SR_BSY);
}
FLASH->CR &= ~FLASH_CR_PG;
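On G0, G4, and L4, fast programming (FLASH_CR_FSTPG) writes a whole row of 32 double words (256 bytes) in one operation; on STM32U5, burst mode (the BWR bit in FLASH_NSCR) programs eight quad-words (128 bytes) at a time. Both have extra constraints, such as tight timing between consecutive writes, so check the reference manual for your target series.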
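The 32-bit loop above is specific to F4. On families with a fixed 64-bit programming unit (G0, G4, L4), an arbitrary byte buffer first has to be packed into double words, with the tail padded using the erased value. Here is a minimal sketch of that packing step; `dwords_needed` and `pack_dwords` are hypothetical helper names, and the actual register writes are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of 64-bit double words needed to hold len bytes. */
static size_t dwords_needed(size_t len)
{
    return (len + 7u) / 8u;
}

/* Packs len bytes from src into out[] as little-endian 64-bit units.
 * The unused tail of the last unit is filled with 0xFF so it stays
 * in the erased state. Returns the number of units produced. */
static size_t pack_dwords(const uint8_t *src, size_t len,
                          uint64_t *out, size_t out_cap)
{
    size_t n = dwords_needed(len);
    assert(n <= out_cap);
    for (size_t i = 0; i < n; i++) {
        uint8_t unit[8];
        size_t off = i * 8u;
        size_t chunk = (len - off < 8u) ? (len - off) : 8u;
        memset(unit, 0xFF, sizeof unit);      /* pad with the erased value */
        memcpy(unit, src + off, chunk);
        uint64_t v = 0;
        for (int b = 7; b >= 0; b--)          /* byte 0 ends up least significant */
            v = (v << 8) | unit[b];
        out[i] = v;
    }
    return n;
}
```

Each packed unit is then written as two back-to-back 32-bit stores (low word first) before polling the busy flag.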
Register-level flash erase
Erasing a sector on STM32F4:
// 1. Unlock
FLASH->KEYR = 0x45670123;
FLASH->KEYR = 0xCDEF89AB;
while (FLASH->SR & FLASH_SR_BSY);
// 2. Select sector erase, pick sector number
FLASH->CR |= FLASH_CR_SER; // Sector erase mode
FLASH->CR &= ~(0x78); // Clear SNB bits
FLASH->CR |= (5 << 3); // SNB = 5 (sector 5, 128 KB on F401)
// 3. Start erase
FLASH->CR |= FLASH_CR_STRT;
// 4. Wait for completion (hundreds of ms for a 16 KB sector, 1 s or more for 128 KB)
while (FLASH->SR & FLASH_SR_BSY);
// 5. Lock
FLASH->CR &= ~FLASH_CR_SER;
FLASH->CR |= FLASH_CR_LOCK;
On G0/G4 with page erase, the sequence is similar but uses FLASH_CR_PER (page erase) and FLASH_CR_PNB (page number) instead of SER/SNB. The main-flash unlock keys are the same (0x45670123 / 0xCDEF89AB written to KEYR); option bytes use a separate key pair written to OPTKEYR. Note that the G0 headers name the busy flag FLASH_SR_BSY1 rather than FLASH_SR_BSY.
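On page-based parts, the value written to PNB is just the page index, which follows from the address by simple arithmetic. A minimal sketch, assuming uniform 2 KB pages starting at 0x08000000; the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FLASH_START 0x08000000u   /* start of main flash (hypothetical name) */
#define PAGE_BYTES  2048u         /* 2 KB pages, as on G0 */

/* Index of the page that contains addr. */
static uint32_t page_of(uint32_t addr)
{
    assert(addr >= FLASH_START);
    return (addr - FLASH_START) / PAGE_BYTES;
}

/* First address of the given page. */
static uint32_t page_addr(uint32_t page)
{
    return FLASH_START + page * PAGE_BYTES;
}
```

On a 128 KB device there are 64 pages, so the last one is page 63 and starts at 0x0801F800.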
Option bytes
Option bytes control hardware configuration: read-out protection (RDP), hardware watchdog, boot mode, and flash write-protection sectors. They are stored in a dedicated flash region and have their own unlock sequence. On F4 you do not write that region directly; you stage the new values in FLASH->OPTCR and set OPTSTRT:
// Unlock option bytes
FLASH->OPTKEYR = 0x08192A3B;
FLASH->OPTKEYR = 0x4C5D6E7F;
while (FLASH->SR & FLASH_SR_BSY);
// Stage RDP level 1: any value except 0xAA (level 0) and 0xCC (level 2, irreversible!)
FLASH->OPTCR = (FLASH->OPTCR & ~FLASH_OPTCR_RDP) | (0x55u << FLASH_OPTCR_RDP_Pos);
// Start option byte programming
FLASH->OPTCR |= FLASH_OPTCR_OPTSTRT;
while (FLASH->SR & FLASH_SR_BSY);
// Relock the option bytes
FLASH->OPTCR |= FLASH_OPTCR_OPTLOCK;
On G0 and L4 the values are staged in FLASH->OPTR instead, and reloading them with OBL_LAUNCH triggers a system reset.
Warning: setting RDP level 2 permanently disables debugging — irreversible. On client projects, I always set RDP level 1 (debug still possible but flash reads from external debugger are blocked) only after the production programming fixture is validated. Never prototype with RDP level 2.
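Since a single byte decides between a recoverable and an unrecoverable protection level, it is worth guarding option-byte writes in code. This sketch assumes the RDP encoding used on F4 and most other families (0xAA is level 0, 0xCC is level 2, anything else is level 1); the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { RDP_LEVEL_0 = 0, RDP_LEVEL_1 = 1, RDP_LEVEL_2 = 2 } rdp_level_t;

/* Decodes an RDP option byte into its protection level. */
static rdp_level_t rdp_level_from_byte(uint8_t rdp)
{
    if (rdp == 0xAAu) return RDP_LEVEL_0;   /* no protection */
    if (rdp == 0xCCu) return RDP_LEVEL_2;   /* permanent, debug disabled */
    return RDP_LEVEL_1;                     /* every other value */
}

/* Refuses any value that would select the irreversible level 2. */
static bool rdp_write_is_reversible(uint8_t rdp)
{
    return rdp_level_from_byte(rdp) != RDP_LEVEL_2;
}
```

A flash driver can call `rdp_write_is_reversible()` before staging an RDP value and simply refuse 0xCC unless a dedicated production build flag is set.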
EEPROM emulation: the two-page swap pattern
Since flash cannot be written in place (you must erase the whole sector/page first), you cannot use it like EEPROM. The standard workaround is a two-page swap algorithm:
- Allocate two identical flash pages (Page A, Page B).
- Start with Page A active. All writes go to the next free slot in Page A.
- When Page A fills up, write the latest copy of every variable to Page B and erase Page A.
- Page B becomes active; the process repeats in reverse.
Here is a minimal implementation sketch:
#define PAGE_A_ADDR 0x08040000
#define PAGE_B_ADDR 0x08040800 // 2 KB pages on G0/G4
#define PAGE_SIZE 2048
typedef struct {
uint16_t id; // Variable identifier
uint16_t len; // Data length (bytes)
uint8_t data[252]; // Payload
} eeprom_record_t;
static uint32_t current_page = PAGE_A_ADDR;
void eeprom_write(uint16_t id, const uint8_t *data, uint16_t len) {
// Find next free slot in current page (scan for 0xFFFF header)
// If no space left, do the swap: copy valid records to the other page
// Erase the full page, then switch active page
// Write the new record at the next free offset
}
void eeprom_init(void) {
// Scan Page A and Page B to determine which is active
// The active page contains more recent data (valid records + FFFF tail)
}
On STM32F4 the sectors are far larger (16 KB at best, 128 KB at the end of flash), so this pattern wastes more space but still works for calibration storage. I typically reserve the last sector or two for data (sector 11 on a 1 MB part, sector 7 on a 512 KB F401), leaving the rest for code. On G0 with 2 KB pages, the overhead is minimal, which is ideal for 100–200 parameters.
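To make the swap concrete, here is a host-side sketch of the two-page pattern that uses RAM arrays in place of flash, so it can be run and tested on a PC. It is not the implementation sketched above: it assumes a tiny 64-byte page, fixed 32-bit records (16-bit id plus 16-bit value), and a single "active" header word, and every name in it is hypothetical. A production version needs extra page states to survive a power loss in the middle of a swap.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIM_SLOTS   16u            /* 64-byte toy page: slot 0 = header, 15 records */
#define SIM_ERASED  0xFFFFFFFFu
#define SIM_ACTIVE  0xA5A5A5A5u    /* hypothetical "page is active" header value */

static uint32_t sim_page[2][SIM_SLOTS];   /* RAM stand-in for flash pages A and B */
static unsigned sim_active;               /* index of the active page */

static void sim_erase(unsigned p) { memset(sim_page[p], 0xFF, sizeof sim_page[p]); }

/* Newest record for id in page p: scan from the end so later writes win. */
static int sim_find(unsigned p, uint16_t id, uint16_t *value)
{
    for (unsigned s = SIM_SLOTS - 1u; s >= 1u; s--) {
        uint32_t rec = sim_page[p][s];
        if (rec != SIM_ERASED && (uint16_t)(rec >> 16) == id) {
            *value = (uint16_t)rec;
            return 0;
        }
    }
    return -1;
}

/* Append to the first erased slot; -1 means the page is full. */
static int sim_append(unsigned p, uint16_t id, uint16_t value)
{
    for (unsigned s = 1u; s < SIM_SLOTS; s++) {
        if (sim_page[p][s] == SIM_ERASED) {
            sim_page[p][s] = ((uint32_t)id << 16) | value;
            return 0;
        }
    }
    return -1;
}

/* Copy the newest value of every id to the other page, then switch. */
static void ee_swap(void)
{
    unsigned other = sim_active ^ 1u;
    sim_erase(other);
    for (unsigned s = SIM_SLOTS - 1u; s >= 1u; s--) {
        uint32_t rec = sim_page[sim_active][s];
        uint16_t seen;
        if (rec != SIM_ERASED && sim_find(other, (uint16_t)(rec >> 16), &seen) != 0)
            (void)sim_append(other, (uint16_t)(rec >> 16), (uint16_t)rec);
    }
    sim_page[other][0] = SIM_ACTIVE;   /* mark valid only after the copy is complete */
    sim_erase(sim_active);
    sim_active = other;
}

static void ee_format(void)
{
    sim_erase(0);
    sim_erase(1);
    sim_page[0][0] = SIM_ACTIVE;
    sim_active = 0;
}

/* Finds the active page by its header; -1 if neither page is formatted. */
static int ee_init(void)
{
    for (unsigned p = 0; p < 2u; p++)
        if (sim_page[p][0] == SIM_ACTIVE) { sim_active = p; return 0; }
    return -1;
}

static int ee_read(uint16_t id, uint16_t *value)
{
    return sim_find(sim_active, id, value);
}

static int ee_write(uint16_t id, uint16_t value)
{
    assert(id != 0xFFFFu);   /* id 0xFFFF with value 0xFFFF would look erased */
    if (sim_append(sim_active, id, value) == 0) return 0;
    ee_swap();
    return sim_append(sim_active, id, value);
}
```

Marking the new page active only after the copy finishes is the key design choice: if power fails earlier, the old page is still intact and still carries the active header.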
Practical example: writing calibration data on STM32G070
The STM32G070RB has 128 KB flash with 2 KB pages. I reserve page 63 (the last page, 0x08000000 + 63 × 2048 = 0x0801F800) for calibration storage:
#define CAL_PAGE_ADDR 0x0801F800u
#define CAL_MAGIC 0xCA11DA7Au // Arbitrary fixed 32-bit marker
typedef struct {
    uint32_t magic;
    float gain_x;
    float offset_x;
    float gain_y;
    float offset_y;
    uint16_t checksum;
    uint16_t reserved; // Pads the struct to 24 bytes = 3 double words
} cal_data_t;
_Static_assert(sizeof(cal_data_t) % 8 == 0, "G0 programs 64-bit double words");
int cal_save(const cal_data_t *cfg) {
    // Copy into an aligned word buffer (needs <string.h>)
    uint32_t words[sizeof(cal_data_t) / 4];
    memcpy(words, cfg, sizeof(words));
    // Unlock, then erase page 63
    FLASH->KEYR = 0x45670123;
    FLASH->KEYR = 0xCDEF89AB;
    while (FLASH->SR & FLASH_SR_BSY1); // G0 headers name the busy flag BSY1
    FLASH->CR &= ~FLASH_CR_PNB_Msk;
    FLASH->CR |= FLASH_CR_PER | (63u << FLASH_CR_PNB_Pos); // Page 63
    FLASH->CR |= FLASH_CR_STRT;
    while (FLASH->SR & FLASH_SR_BSY1);
    FLASH->CR &= ~FLASH_CR_PER;
    // Program one double word (two back-to-back 32-bit writes) at a time
    volatile uint32_t *dst = (volatile uint32_t *)CAL_PAGE_ADDR;
    FLASH->CR |= FLASH_CR_PG;
    for (size_t i = 0; i < sizeof(words) / 4; i += 2) {
        dst[i] = words[i];
        dst[i + 1] = words[i + 1];
        while (FLASH->SR & FLASH_SR_BSY1);
    }
    FLASH->CR &= ~FLASH_CR_PG;
    FLASH->CR |= FLASH_CR_LOCK;
    // Verify
    return memcmp((const void *)CAL_PAGE_ADDR, cfg, sizeof(cal_data_t)) == 0 ? 0 : -1;
}
This approach is simple and deterministic, and it detects an interrupted write: either the write completed fully (magic and checksum both valid) or it did not (magic missing or checksum mismatch), in which case cal_load() returns the factory defaults. Because the page is erased before the new data is written, a power loss in that window loses the previous calibration. On a client project, I add a second backup page so that if the write is interrupted, the previous valid calibration is still available.
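The load side is where the magic and checksum pay off. Below is a minimal host-testable sketch of such a validation step. The article does not show cal_load(), so the signature, the `cal_record_t` layout, the marker value, the default coefficients, and the choice of CRC-16/CCITT (polynomial 0x1021, initial value 0xFFFF) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CAL_MAGIC_EXAMPLE 0xCA11DA7Au   /* hypothetical marker value */

typedef struct {
    uint32_t magic;
    float gain_x, offset_x, gain_y, offset_y;
    uint16_t checksum;    /* CRC over every byte before this field */
    uint16_t reserved;    /* explicit padding to a multiple of 8 bytes */
} cal_record_t;
_Static_assert(sizeof(cal_record_t) == 24, "record must be 3 double words");

/* Hypothetical factory defaults: unity gain, zero offset. */
static const cal_record_t cal_defaults = {
    .magic = CAL_MAGIC_EXAMPLE,
    .gain_x = 1.0f, .offset_x = 0.0f, .gain_y = 1.0f, .offset_y = 0.0f,
    .checksum = 0, .reserved = 0xFFFFu
};

/* Bitwise CRC-16/CCITT: polynomial 0x1021, initial value 0xFFFF. */
static uint16_t crc16_ccitt(const uint8_t *p, size_t n)
{
    uint16_t crc = 0xFFFFu;
    while (n--) {
        crc ^= (uint16_t)(*p++ << 8);
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Fills in magic and checksum before the record is written to flash. */
static void cal_seal(cal_record_t *rec)
{
    rec->magic = CAL_MAGIC_EXAMPLE;
    rec->reserved = 0xFFFFu;
    rec->checksum = crc16_ccitt((const uint8_t *)rec,
                                offsetof(cal_record_t, checksum));
}

/* Copies the stored image into *out if it is valid and returns 0;
 * otherwise writes the defaults into *out and returns -1. */
static int cal_load(const void *stored, cal_record_t *out)
{
    cal_record_t tmp;
    memcpy(&tmp, stored, sizeof tmp);
    if (tmp.magic == CAL_MAGIC_EXAMPLE &&
        tmp.checksum == crc16_ccitt((const uint8_t *)&tmp,
                                    offsetof(cal_record_t, checksum))) {
        *out = tmp;
        return 0;
    }
    *out = cal_defaults;
    return -1;
}
```

On the target, `stored` would be the address of the calibration page; an erased page reads as all 0xFF, fails the magic check, and falls through to the defaults.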
Bootloader considerations
If your firmware has a bootloader, flash write/erase typically runs from the bootloader context, not the application. The application receives a firmware image over UART/CAN/SPI, stores it temporarily in RAM or a scratch area, then jumps to the bootloader, which programs the main flash:
- Vector table relocation: the bootloader lives at the start of flash (sector 0, 0x08000000) with its own vector table. The application starts at the next free sector boundary, e.g. 0x08008000 (sector 2) on an F401 with a 32 KB bootloader occupying sectors 0 and 1, and must point SCB->VTOR at its own vector table.
- Interrupt-safe window: do not program flash while interrupts may fire. Disable global interrupts during the write/erase loop, or ensure no ISR accesses flash or uses a vector from the region being erased.
- Watchdog management: a sector erase can take hundreds of milliseconds, and more than a second for a 128 KB sector on F4. Refresh the IWDG before starting, or switch to a longer timeout period. I have debugged countless "bricked-on-update" devices where the IWDG reset the MCU mid-erase, leaving a partial image.
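Choosing a watchdog timeout that covers the longest erase is a small calculation. The IWDG timeout is prescaler × (reload + 1) / f_LSI, with prescalers of 4 to 256 and a 12-bit reload value. The sketch below picks the smallest prescaler that reaches a requested timeout; the function names are hypothetical, and the 32 kHz LSI frequency used in the example is nominal only, since the real oscillator varies considerably between parts and over temperature.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t  pr;   /* IWDG_PR value: 0..6 selects divider 4..256 */
    uint16_t rl;   /* IWDG_RLR value: 0..4095 */
} iwdg_cfg_t;

/* Timeout in ms produced by a given prescaler code and reload value. */
static uint32_t iwdg_timeout_ms(uint8_t pr, uint16_t rl, uint32_t lsi_hz)
{
    uint64_t div = 4ull << pr;
    return (uint32_t)(div * (rl + 1ull) * 1000ull / lsi_hz);
}

/* Finds the smallest prescaler whose range covers timeout_ms.
 * Returns 0 on success, -1 if the timeout is out of reach. */
static int iwdg_config_for(uint32_t timeout_ms, uint32_t lsi_hz, iwdg_cfg_t *cfg)
{
    assert(lsi_hz != 0u);
    for (uint8_t pr = 0; pr <= 6u; pr++) {
        uint64_t den = (4ull << pr) * 1000ull;
        uint64_t ticks = ((uint64_t)timeout_ms * lsi_hz + den - 1u) / den; /* round up */
        if (ticks == 0u) ticks = 1u;
        if (ticks <= 4096u) {
            cfg->pr = pr;
            cfg->rl = (uint16_t)(ticks - 1u);
            return 0;
        }
    }
    return -1;
}
```

Because the LSI is imprecise, it is safer to request a timeout with generous margin over the worst-case erase time from the datasheet.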
Practical checklist
- ☐ Flash architecture identified: sector-based (F4) or page-based (G0/G4/U5/L4). Erase granularity confirmed in the reference manual.
- ☐ Unlock sequence written correctly: KEYR not OPTKEYR for main flash; two specific keys in order.
- ☐ Destination address is aligned to the programming width (32-bit with PSIZE x32 on F4, 64-bit on G0/G4/L4) and falls within a writable flash region (not protected by WRP option bytes).
- ☐ BSY flag polled before and after each write/erase operation.
- ☐ Flash locked after write/erase to prevent spurious writes from runaway code.
- ☐ VDD monitored during flash programming: a brownout in the middle of a write or erase can corrupt the flash.
- ☐ EEPROM emulation design handles power-loss: magic number + checksum to detect valid records.
- ☐ Bootloader code runs from RAM or from a region not being erased (if self-programming).
- ☐ IWDG refreshed or prescaled before long erase operations (hundreds of milliseconds, or seconds for large F4 sectors).
- ☐ Option bytes: RDP level verified — never level 2 during development.
- ☐ Write-protected sectors (WRP) checked: trying to write or erase a protected sector sets WRPERR in FLASH->SR and the operation is ignored, so check the status flags after every operation.
How I would approach this on a client project
On a production firmware project, I never let application code touch flash registers directly. Flash programming is a critical operation with real consequences if misconfigured — one wrong bit in CR and you can lock yourself out of debugging or corrupt the application vector table.
I write a dedicated flash_driver.c module that exposes only:
int flash_init(void);
int flash_erase(uint32_t page_addr);
int flash_write(uint32_t addr, const uint8_t *data, uint32_t len);
int flash_read(uint32_t addr, uint8_t *out, uint32_t len);
void flash_lock(void);
void flash_unlock(void);
This module goes through code review with a checklist (the one above). Unit tests verify the API calls the correct register sequences by inspecting a mock FLASH peripheral struct. Integration tests run on actual hardware with a dedicated test firmware that exercises every sector — this catches brownout-sensitive boards early in production.
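As an illustration of the mock-based unit test idea, here is a minimal sketch. A plain struct cannot capture two successive writes to the same key register, so this version routes key writes through a small hook that logs them and models the unlock state machine. The mock layout, hook, and macro names are hypothetical; only the key values and the LOCK bit position come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_CR_LOCK (1u << 31)
#define FLASH_KEY1   0x45670123u
#define FLASH_KEY2   0xCDEF89ABu

/* Hypothetical mock: only the registers this example touches. */
typedef struct { uint32_t keyr; uint32_t cr; } mock_flash_t;

static mock_flash_t mock_flash;
static uint32_t key_log[4];   /* first few values written to KEYR */
static size_t   key_count;    /* total number of KEYR writes */

static void mock_reset(void)
{
    mock_flash.keyr = 0;
    mock_flash.cr = MOCK_CR_LOCK;   /* controller is locked after reset */
    key_count = 0;
}

/* Write hook for KEYR: logs the value and clears LOCK only when the
 * first two writes are KEY1 then KEY2. */
static void mock_write_keyr(uint32_t v)
{
    if (key_count < 4u) key_log[key_count] = v;
    key_count++;
    mock_flash.keyr = v;
    if (key_count == 2u && key_log[0] == FLASH_KEY1 && key_log[1] == FLASH_KEY2)
        mock_flash.cr &= ~MOCK_CR_LOCK;
}

/* Driver functions under test, written against the mock. */
static void flash_unlock(void)
{
    if (mock_flash.cr & MOCK_CR_LOCK) {   /* skip the keys if already unlocked */
        mock_write_keyr(FLASH_KEY1);
        mock_write_keyr(FLASH_KEY2);
    }
}

static void flash_lock(void)
{
    mock_flash.cr |= MOCK_CR_LOCK;
}
```

A test then calls `mock_reset()`, runs the driver function, and inspects `key_log`, `key_count`, and `mock_flash.cr` to confirm the exact sequence of writes.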
The EEPROM emulation layer sits above flash_driver.c and adds the swap/page management. It is board-agnostic: the same .c file compiles on F4, G0, and U5 just by swapping the flash driver underneath. I keep the emulation layer simple — two pages, a 16-bit CRC, and no wear-leveling beyond the swap. For most calibration use cases (fewer than 100 000 writes over the product lifetime), this is more than sufficient.
If the client insists on byte-addressable EEPROM from day one, I evaluate the cost of an external I²C EEPROM (like a 24LC512) versus the two-page flash emulation. For volumes above 10K units, the BOM savings of on-chip flash emulation usually win, and the firmware effort is a one-time cost.
Sources and further reading
- STM32F401 Reference Manual (RM0368) — Chapter 3: Flash memory interface. Register map, lock/unlock sequence, sector layout, and programming/erase timing.
- STM32G0x0 Reference Manual (RM0454) — Embedded flash memory chapter. Page erase, write protection, and boot configuration.
- STM32U5 Reference Manual (RM0456) — Chapter 4: Flash. Dual-bank architecture, RWW, and secure flash programming.
- ST Application Note AN3969 — EEPROM emulation in STM32F40x/STM32F41x microcontrollers. Complete reference for two-page swap with wear-leveling.
- ST Application Note AN4657 — STM32 Flash programming. Covers write/erase sequences across multiple STM32 families.
- ST Application Note AN4776 — General-purpose timer cookbook (contains flash write timing references for time-triggered programming).
- STM32CubeF4 firmware package — FLASH_WriteRead example in Projects/STM32F401RE-Nucleo/Examples/FLASH/.
