
Debugging with SWD, GDB, and Fault Handlers


When embedded firmware misbehaves, adding print statements and re-running is often not enough. The bug might be a stack overflow that silently corrupts memory, a HardFault triggered by a null pointer dereference, or a race condition between an interrupt and the main loop. The most reliable way to find these bugs is a proper debugger. In this lesson you will set up GDB with OpenOCD over SWD, learn the core debugging techniques the Cortex-M3 supports (breakpoints, watchpoints, fault decoding, and ITM trace), and then hunt down five real bugs planted in a firmware project. #STM32 #Debugging #GDB

What We Are Building

Bug Hunt Challenge: Find and Fix Five Bugs

The project is pre-written firmware with five intentionally planted bugs, each causing a different failure mode: a HardFault from a null pointer dereference, a stack overflow from unbounded recursion, a peripheral misconfiguration that silently drops data, a race condition between DMA and main loop code, and a watchdog timeout from a blocking loop. You will use GDB, SWD breakpoints, watchpoints, fault register decoding, and ITM trace to find and fix each one.

Project specifications:

| Parameter | Value |
| --- | --- |
| Board | Blue Pill (STM32F103C8T6) |
| Debugger | ST-Link V2 clone via SWD |
| Debug software | GDB (gdb-multiarch or arm-none-eabi-gdb) + OpenOCD |
| Bugs to find | 5 (described below, solutions at the end) |
| New parts needed | None (reuse existing hardware) |
| Skills practiced | Breakpoints, watchpoints, HardFault decode, ITM, register inspection |

The Five Bugs

| Bug # | Symptom | Category |
| --- | --- | --- |
| 1 | Firmware crashes immediately after enabling a peripheral | HardFault (null pointer) |
| 2 | Serial output works for 10 seconds then stops | Stack overflow |
| 3 | ADC readings are always zero | Peripheral misconfiguration |
| 4 | OLED display shows corrupted data occasionally | DMA race condition |
| 5 | System resets every ~4 seconds | Watchdog timeout |

Setting Up GDB with OpenOCD



Starting OpenOCD

OpenOCD acts as a bridge between GDB and the ST-Link hardware. It speaks USB to the ST-Link and GDB Remote Serial Protocol to GDB.

Debug toolchain:

+-------+  GDB RSP   +---------+  USB   +---------+
|  GDB  |<---------->| OpenOCD |<------>| ST-Link |
| (PC)  | port 3333  |  (PC)   |        |   V2    |
+-------+            +---------+        +----+----+
                                             |
                                        SWD (2 wires)
                                             |
                                        +----+-----+
                                        |  STM32   |
                                        | Cortex-M3|
                                        | (target) |
                                        +----------+

# Terminal 1: Start OpenOCD
openocd -f interface/stlink.cfg -f target/stm32f1x.cfg

OpenOCD listens on port 3333 for GDB connections and port 4444 for a telnet command interface.

Connecting GDB

# Terminal 2: Start GDB
arm-none-eabi-gdb firmware.elf
# Or if using gdb-multiarch (common on Ubuntu)
gdb-multiarch firmware.elf

Inside GDB:

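# Connect to OpenOCD
target remote localhost:3333
# Reset and halt the MCU before programming
monitor reset halt
# Load firmware into flash
load
# Reset again so the core starts from the freshly loaded vector table
monitor reset halt
# Set a breakpoint at main
break main
# Start execution
continue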

Essential GDB Commands

| Command | Short | Description |
| --- | --- | --- |
| break main | b main | Set breakpoint at function |
| break *0x08001234 | b *0x08001234 | Breakpoint at address |
| continue | c | Resume execution |
| step | s | Step one source line (into functions) |
| next | n | Step one source line (over functions) |
| stepi | si | Step one instruction |
| print var | p var | Print variable value |
| print/x var | p/x var | Print in hexadecimal |
| info registers | i r | Show all CPU registers |
| info breakpoints | i b | List breakpoints |
| backtrace | bt | Show call stack |
| watch var | | Break when variable changes |
| x/16xw 0x20000000 | | Examine 16 words at address |
| monitor reset halt | | Reset and halt the MCU |

Breakpoints and Watchpoints



The Cortex-M3 has 6 hardware breakpoints and 4 hardware watchpoints built into the silicon. Hardware breakpoints work on flash memory (where your code lives) without modifying the code. Watchpoints trigger when a memory address is read or written, which is invaluable for finding out who is corrupting a variable. GDB can also set software breakpoints by inserting a special instruction (BKPT), but these only work in RAM.

Using Breakpoints

# Break when entering uart_send_string
break uart_send_string
# Conditional breakpoint: only when adc_values[0] > 3000
break main.c:85 if adc_values[0] > 3000
# Temporary breakpoint (deleted after first hit)
tbreak process_command
# Delete all breakpoints
delete

Using Watchpoints

# Break when 'counter' variable is written
watch counter
# Break when 'counter' is read
rwatch counter
# Break when memory at specific address is written
watch *(uint32_t*)0x20000100
# Useful for finding who corrupts a specific variable
watch stack_canary

Examining Memory

# Examine the stack (16 32-bit words at current SP)
x/16xw $sp
# Examine the vector table
x/16xw 0x08000000
# Examine a peripheral register (e.g., RCC->CR)
x/xw 0x40021000
# Examine a struct
print *((GPIO_TypeDef*)0x40010800)
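
The first two words printed by x/16xw 0x08000000 are the initial stack pointer and the reset handler address, so they make a quick sanity check that the image was flashed correctly. The sketch below shows one way to judge those two values. The helper name, the macro names, and the 64 KB flash / 20 KB SRAM sizes (the nominal figures for the STM32F103C8T6) are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed memory map for the STM32F103C8T6: 64 KB flash, 20 KB SRAM. */
#define FLASH_BASE_ADDR 0x08000000u
#define FLASH_SIZE      (64u * 1024u)
#define SRAM_BASE_ADDR  0x20000000u
#define SRAM_SIZE       (20u * 1024u)

/* Hypothetical helper: checks the first two words of the vector table. */
bool vector_table_looks_sane(uint32_t initial_sp, uint32_t reset_vector)
{
    /* The initial SP must point into SRAM (the very top is allowed,
     * because the stack grows downward) and be word aligned. */
    bool sp_ok = initial_sp > SRAM_BASE_ADDR &&
                 initial_sp <= SRAM_BASE_ADDR + SRAM_SIZE &&
                 (initial_sp & 0x3u) == 0;

    /* The reset vector must have bit 0 set (Thumb state) and, with that
     * bit cleared, must point into flash. */
    uint32_t target = reset_vector & ~1u;
    bool reset_ok = (reset_vector & 1u) != 0 &&
                    target >= FLASH_BASE_ADDR &&
                    target < FLASH_BASE_ADDR + FLASH_SIZE;

    return sp_ok && reset_ok;
}
```

An erased chip reads back 0xFFFFFFFF for both words, which this check rejects.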

HardFault Decoding



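When the Cortex-M3 encounters an error it cannot handle (invalid memory access, undefined instruction, divide by zero with the trap enabled), it raises a fault. The MemManage, BusFault, and UsageFault handlers are disabled out of reset, so unless the firmware enables them, every such fault escalates to a HardFault exception. The CPU saves its state on the stack so a fault handler can inspect what went wrong.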

HardFault stack frame (pushed by hardware):

Higher address (older data)
+----------------+
|      xPSR      |  SP + 28
+----------------+
|  PC (return)   |  SP + 24  <-- faulting instruction
+----------------+
|       LR       |  SP + 20
+----------------+
|      R12       |  SP + 16
+----------------+
|       R3       |  SP + 12
+----------------+
|       R2       |  SP + 8
+----------------+
|       R1       |  SP + 4
+----------------+
|       R0       |  SP + 0
+----------------+
Lower address (SP after fault)

In GDB, print/x stack_frame[6] gives the PC of the faulting instruction.

The CPU pushes eight registers onto the stack (R0-R3, R12, LR, PC, xPSR) and jumps to the HardFault handler. By examining these saved registers, you can determine exactly which instruction caused the fault and what the processor state was at that moment.

HardFault Handler

/* C-level handler; declared first so the assembly stub can branch to it */
void hard_fault_handler_c(uint32_t *stack_frame);

/* naked: no compiler prologue, so SP and LR are untouched when the asm runs */
__attribute__((naked)) void HardFault_Handler(void) {
    __asm volatile (
        "TST LR, #4 \n" /* Check which stack was in use */
        "ITE EQ \n"
        "MRSEQ R0, MSP \n" /* Main stack pointer */
        "MRSNE R0, PSP \n" /* Process stack pointer */
        "B hard_fault_handler_c \n"
    );
}

void hard_fault_handler_c(uint32_t *stack_frame) {
    volatile uint32_t r0 = stack_frame[0];
    volatile uint32_t r1 = stack_frame[1];
    volatile uint32_t r2 = stack_frame[2];
    volatile uint32_t r3 = stack_frame[3];
    volatile uint32_t r12 = stack_frame[4];
    volatile uint32_t lr = stack_frame[5];
    volatile uint32_t pc = stack_frame[6]; /* Faulting instruction */
    volatile uint32_t psr = stack_frame[7];
    /* Configurable Fault Status Register */
    volatile uint32_t cfsr = SCB->CFSR;
    volatile uint32_t hfsr = SCB->HFSR;
    volatile uint32_t mmfar = SCB->MMFAR; /* Memory fault address */
    volatile uint32_t bfar = SCB->BFAR; /* Bus fault address */
    /* Suppress unused variable warnings */
    (void)r0; (void)r1; (void)r2; (void)r3;
    (void)r12; (void)lr; (void)pc; (void)psr;
    (void)cfsr; (void)hfsr; (void)mmfar; (void)bfar;
    /*
     * In GDB, set a breakpoint here and examine:
     * print/x pc -> address of faulting instruction
     * print/x lr -> return address (caller)
     * print/x cfsr -> fault type bits
     * info line *pc -> source file and line number
     */
    while (1); /* Halt here for debugger */
}
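
The TST LR, #4 test in the assembly stub reads bit 2 of the EXC_RETURN value that the core places in LR on exception entry. The same decision is written out in C below as a rough illustration; the enum and function names are made up for this sketch.

```c
#include <stdint.h>

/* Hypothetical names, used only for this illustration. */
typedef enum { STACK_MSP, STACK_PSP } fault_stack_t;

/* Bit 2 of EXC_RETURN selects the stack that holds the saved frame:
 * 0 means the main stack (MSP), 1 means the process stack (PSP). */
fault_stack_t stack_used_at_fault(uint32_t exc_return)
{
    return (exc_return & (1u << 2)) ? STACK_PSP : STACK_MSP;
}
```

Bare-metal firmware that never switches to the process stack will normally see 0xFFFFFFF9 (thread mode, MSP) or 0xFFFFFFF1 (fault inside another handler). An RTOS task typically yields 0xFFFFFFFD (thread mode, PSP).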

CFSR Bit Decoding

| Bit Range | Register | Key Bits |
| --- | --- | --- |
| [7:0] | MMFSR (MemManage) | MMARVALID, DACCVIOL, IACCVIOL |
| [15:8] | BFSR (BusFault) | BFARVALID, PRECISERR, IMPRECISERR |
| [31:16] | UFSR (UsageFault) | UNDEFINSTR, INVSTATE, INVPC, UNALIGNED |
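
Reading a raw CFSR value by eye is error-prone, so a small decoder helps. The sketch below maps the Cortex-M3 CFSR bits to their names and can be compiled on the PC to decode a value copied out of GDB. The function name, struct name, and output format are assumptions for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint32_t    mask;
    const char *name;
} cfsr_flag_t;

/* CFSR bit assignments for the Cortex-M3, lowest bit first. */
static const cfsr_flag_t cfsr_flags[] = {
    { 1u << 0,  "IACCVIOL"    }, { 1u << 1,  "DACCVIOL"   },
    { 1u << 3,  "MUNSTKERR"   }, { 1u << 4,  "MSTKERR"    },
    { 1u << 7,  "MMARVALID"   },
    { 1u << 8,  "IBUSERR"     }, { 1u << 9,  "PRECISERR"  },
    { 1u << 10, "IMPRECISERR" }, { 1u << 11, "UNSTKERR"   },
    { 1u << 12, "STKERR"      }, { 1u << 15, "BFARVALID"  },
    { 1u << 16, "UNDEFINSTR"  }, { 1u << 17, "INVSTATE"   },
    { 1u << 18, "INVPC"       }, { 1u << 19, "NOCP"       },
    { 1u << 24, "UNALIGNED"   }, { 1u << 25, "DIVBYZERO"  },
};

/* Hypothetical helper: writes the names of all set flags into out,
 * separated by spaces, and returns how many flags were written. */
int cfsr_decode(uint32_t cfsr, char *out, size_t out_size)
{
    int count = 0;
    size_t used = 0;

    if (out_size > 0) {
        out[0] = '\0';
    }
    for (size_t i = 0; i < sizeof cfsr_flags / sizeof cfsr_flags[0]; i++) {
        if (!(cfsr & cfsr_flags[i].mask)) {
            continue;
        }
        int n = snprintf(out + used, out_size - used, "%s%s",
                         count ? " " : "", cfsr_flags[i].name);
        if (n < 0 || (size_t)n >= out_size - used) {
            if (used < out_size) {
                out[used] = '\0'; /* Drop the partially written name */
            }
            break; /* Output buffer is full */
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

For example, a CFSR of 0x00000082 decodes to DACCVIOL and MMARVALID: a data access violation, with the offending address available in MMFAR.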

Debugging a HardFault in GDB

# Stop when the firmware reaches the C-level fault handler
break hard_fault_handler_c
# After it breaks, examine the saved PC (faulting instruction)
print/x stack_frame[6]
# Look up which source line that address corresponds to
# (substitute the PC value printed above; 0x08001A3C is only an example)
info line *0x08001A3C
# Examine the fault status registers: CFSR first, then HFSR
print/x *(uint32_t*)0xE000ED28
print/x *(uint32_t*)0xE000ED2C
# Examine the faulting address (if MemManage or BusFault): MMFAR, then BFAR
print/x *(uint32_t*)0xE000ED34
print/x *(uint32_t*)0xE000ED38

ITM Trace Output



ITM (Instrumentation Trace Macrocell) is a hardware debug feature that sends data from the MCU to the debugger through the SWO (Serial Wire Output) pin. It provides printf-style output without consuming a UART.

ITM trace output path:

 Firmware            Cortex-M3 debug HW
+---------+         +------------------+
| ITM     |  write  |  ITM stimulus    |
| PORT[0] |-------->|  port 0          |
| .u8 = c |         +--------+---------+
+---------+                  |
                             v
                      +------+------+
                      |    TPIU     |
                      | (Trace Port)|
                      +------+------+
                             |
                       SWO pin (PB3)
                             |
                      +------+------+
                      | ST-Link V2  |
                      +------+------+
                             |
                      +------+------+
                      |   OpenOCD   |
                      |/tmp/itm.fifo|
                      +-------------+

Unlike UART printf debugging, ITM does not consume a UART peripheral and has minimal impact on timing. You can send trace messages, variable values, and timestamps through stimulus ports. The Blue Pill’s SWO pin is PB3 (shared with JTDO).

ITM Printf

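/* Send a character via ITM stimulus port 0 */
void itm_putc(char c) {
    /* Skip if ITM or port 0 is disabled (e.g. no debugger attached),
       otherwise the wait below may never end */
    if (!(ITM->TCR & ITM_TCR_ITMENA_Msk) || !(ITM->TER & 1UL)) {
        return;
    }
    while (ITM->PORT[0].u32 == 0); /* Wait until port is ready */
    ITM->PORT[0].u8 = (uint8_t)c;
}
void itm_puts(const char *str) {
    while (*str) {
        itm_putc(*str++);
    }
}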
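
To print variable values rather than fixed strings, a formatted wrapper can sit on top of itm_puts. The sketch below is one possible shape, not part of the lesson firmware. trace_printf, trace_puts_fn, and the 96-byte buffer are assumptions, and trace_capture_puts is a host-side stand-in for itm_puts so the code can be tried on a PC.

```c
#include <stdarg.h>
#include <stdio.h>

/* Any function with the same shape as itm_puts can act as the sink. */
typedef void (*trace_puts_fn)(const char *str);

/* Hypothetical helper: formats into a small stack buffer, then passes the
 * result to the sink. Returns what vsnprintf returns, i.e. the length the
 * full message needs; longer messages are truncated to fit the buffer. */
int trace_printf(trace_puts_fn sink, const char *fmt, ...)
{
    char buf[96];
    va_list args;

    va_start(args, fmt);
    int n = vsnprintf(buf, sizeof buf, fmt, args);
    va_end(args);

    if (n > 0) {
        sink(buf);
    }
    return n;
}

/* Host-side stand-in for itm_puts: keeps the last message in memory. */
static char trace_capture[128];

void trace_capture_puts(const char *str)
{
    snprintf(trace_capture, sizeof trace_capture, "%s", str);
}
```

On the target the call would look like trace_printf(itm_puts, "adc=%u\n", value). Formatted printing costs flash and stack space, so keep the buffer small on a 20 KB part.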

OpenOCD ITM Configuration

# In the OpenOCD telnet console (port 4444):
# Enable SWO trace at 72 MHz core clock, 2 MHz SWO baud
tpiu config internal /tmp/itm.fifo uart off 72000000 2000000
# Enable stimulus port 0
itm port 0 on
# In a separate terminal, read the ITM output:
cat /tmp/itm.fifo
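
The two numbers passed to tpiu config (72 MHz trace clock, 2 MHz SWO baud) imply an integer clock divider inside the TPIU. OpenOCD works this out itself; the sketch below only shows the arithmetic so you can check that a chosen baud rate divides the core clock evenly. The function name is made up, and the width of the hardware prescaler field is not checked here.

```c
#include <stdint.h>

/* Hypothetical helper: returns the TPIU prescaler value (divider minus
 * one) for an asynchronous SWO output, or -1 if the baud rate is zero,
 * faster than the trace clock, or not an exact divisor of it. */
int32_t swo_prescaler(uint32_t trace_clk_hz, uint32_t swo_baud_hz)
{
    if (swo_baud_hz == 0 || swo_baud_hz > trace_clk_hz) {
        return -1;
    }
    if (trace_clk_hz % swo_baud_hz != 0) {
        return -1; /* Baud rate would be off; pick another value */
    }
    return (int32_t)(trace_clk_hz / swo_baud_hz) - 1;
}
```

For the values used above, 72000000 / 2000000 = 36, so the prescaler is 35. If the firmware runs the core at a different clock, the first number given to OpenOCD has to change to match, or the trace output will be garbage.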

Bug Hunt: The Five Bugs



Now apply everything above to find and fix five bugs in the challenge firmware. Here are hints for each bug, followed by solutions.

Bug 1: HardFault on Peripheral Enable

Symptom: The firmware crashes immediately after calling sensor_init().

Debugging approach:

  1. Set a breakpoint at sensor_init() and step through line by line.
  2. When the HardFault triggers, examine the saved PC register in the fault handler.
  3. Look up the source line with info line *<PC>.
  4. Check: is the code dereferencing a pointer? Is that pointer valid?

Bug 2: Serial Output Stops After 10 Seconds

Symptom: The serial terminal shows data for about 10 seconds, then output freezes. The MCU does not reset, it just hangs.

Debugging approach:

  1. Let the firmware run until output stops.
  2. Press Ctrl+C in GDB to halt the CPU.
  3. Run backtrace to see where the CPU is stuck.
  4. If the backtrace shows deep recursion or the SP is near the bottom of RAM, you have a stack overflow.
  5. Check for recursive functions or large local arrays.
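
A complementary way to confirm a stack overflow is stack painting: fill the stack region with a known pattern at startup, then count how much of the pattern survives. The sketch below shows the idea on a plain array. The pattern value and function names are assumptions, and on real hardware the painting must only cover memory below the current SP (typically it is done in the startup code).

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_PATTERN 0xDEADBEEFu /* Arbitrary, easy to spot in x/16xw */

/* Fill the stack region with the pattern. stack_bottom is the lowest
 * address of the region. */
void stack_paint(uint32_t *stack_bottom, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        stack_bottom[i] = STACK_FILL_PATTERN;
    }
}

/* Count untouched words starting from the lowest address. The stack grows
 * downward, so this is the headroom that has never been used. */
size_t stack_unused_words(const uint32_t *stack_bottom, size_t words)
{
    size_t unused = 0;

    while (unused < words && stack_bottom[unused] == STACK_FILL_PATTERN) {
        unused++;
    }
    return unused;
}
```

A result of zero means the stack has reached (and probably passed) the bottom of its region. The same check works from GDB without any helper code: examine the lowest stack addresses with x/16xw and see whether the pattern is still there.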

Bug 3: ADC Readings Always Zero

Symptom: The ADC values displayed on serial are always 0 for all channels.

Debugging approach:

  1. Check if the ADC clock is enabled: x/xw 0x40021018 (RCC->APB2ENR).
  2. Check if the ADC is powered on: x/xw 0x40012408 (ADC1->CR2).
  3. Check the DMA buffer: x/3hx &adc_values.
  4. Compare the ADC configuration registers against the working code from Lesson 6.
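
The value returned by x/xw 0x40021018 is a bit mask, so the question "is the ADC clocked?" comes down to testing one bit. The sketch below shows that test for the two bits relevant here, using the APB2ENR bit positions of the STM32F103 (ADC1EN is bit 9, SPI1EN is bit 12). The macro and function names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in RCC->APB2ENR on the STM32F103. */
#define APB2ENR_ADC1EN_BIT  9u
#define APB2ENR_SPI1EN_BIT 12u

/* Hypothetical helper: true if the given clock-enable bit is set in a
 * value read from RCC->APB2ENR. */
bool apb2_clock_enabled(uint32_t apb2enr, unsigned bit)
{
    return ((apb2enr >> bit) & 1u) != 0;
}
```

A readback of 0x00001004, for instance, has SPI1 and GPIOA enabled but ADC1 disabled. A peripheral without a clock ignores register writes and reads back zeros, which would explain both a dead ADC and an empty CR2.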

Bug 4: OLED Shows Corrupted Data

Symptom: The OLED display usually looks correct but occasionally shows garbled pixels or shifted text.

Debugging approach:

  1. Set a watchpoint on the framebuffer: watch framebuffer[0].
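  2. Check whether the main loop writes to the framebuffer while DMA is still reading it out to the display. The watchpoint only reports CPU accesses, so inspect the DMA channel state each time it triggers.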
  3. Look for missing DMA completion checks before modifying the buffer.

Bug 5: System Resets Every ~4 Seconds

Symptom: The firmware runs for about 4 seconds, then resets. The reset cause register (RCC->CSR) shows a watchdog reset.

Debugging approach:

  1. Search the code for IWDG (Independent Watchdog) initialization.
  2. Check if the watchdog is being fed (refreshed) regularly.
  3. Look for blocking loops or long delays that prevent the watchdog refresh.
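
The reset flags live in the top bits of RCC->CSR (address 0x40021024 on the STM32F103), so confirming a watchdog reset means decoding that word. The sketch below is a minimal decoder. The macro names, the function name, and the returned strings are assumptions for illustration.

```c
#include <stdint.h>

/* Reset flag bits in RCC->CSR on the STM32F103. */
#define CSR_PINRSTF  (1u << 26) /* NRST pin */
#define CSR_PORRSTF  (1u << 27) /* Power-on / power-down */
#define CSR_SFTRSTF  (1u << 28) /* Software reset */
#define CSR_IWDGRSTF (1u << 29) /* Independent watchdog */
#define CSR_WWDGRSTF (1u << 30) /* Window watchdog */
#define CSR_LPWRRSTF (1u << 31) /* Low-power management */

/* Hypothetical helper: names the most specific reset cause. The pin flag
 * is checked last because it is often set alongside the other flags. */
const char *reset_cause(uint32_t csr)
{
    if (csr & CSR_LPWRRSTF) return "low-power";
    if (csr & CSR_WWDGRSTF) return "window watchdog";
    if (csr & CSR_IWDGRSTF) return "independent watchdog";
    if (csr & CSR_SFTRSTF)  return "software";
    if (csr & CSR_PORRSTF)  return "power-on";
    if (csr & CSR_PINRSTF)  return "NRST pin";
    return "unknown";
}
```

The flags accumulate across resets until firmware clears them by setting RMVF (bit 24), so read the register early in main() and clear it afterwards.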

Solutions



Bug 1 Solution: Null Pointer Dereference

The sensor_init() function receives a pointer to a configuration struct, but the caller passes NULL:

/* Bug: passing NULL */
sensor_config_t *config = NULL;
sensor_init(config); /* Dereferences config->i2c_addr, causing HardFault */
/* Fix: initialize the struct */
sensor_config_t config = { .i2c_addr = 0x76, .oversample = 1 };
sensor_init(&config);
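
A defensive check inside sensor_init() turns this kind of crash into an error the caller can report. The sketch below is one way to do it, not the lesson's actual driver. The field types, the status enum, and the active_config copy are assumptions; only the struct field names come from the snippet above.

```c
#include <stddef.h>
#include <stdint.h>

/* Field names follow the snippet above; the types are assumed. */
typedef struct {
    uint8_t i2c_addr;
    uint8_t oversample;
} sensor_config_t;

/* Hypothetical status codes for this sketch. */
typedef enum {
    SENSOR_OK = 0,
    SENSOR_ERR_NULL_CONFIG
} sensor_status_t;

/* Driver-side copy, so the caller's struct may go out of scope. */
static sensor_config_t active_config;

sensor_status_t sensor_init(const sensor_config_t *config)
{
    if (config == NULL) {
        return SENSOR_ERR_NULL_CONFIG; /* Report instead of faulting */
    }
    active_config = *config;
    /* Real firmware would program the sensor over I2C here. */
    return SENSOR_OK;
}
```

The caller still has to check the return value; an ignored error code is only marginally better than a crash.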

Bug 2 Solution: Unbounded Recursion

A logging function calls itself recursively when the log buffer is full, instead of dropping the message:

/* Bug: recursive call without base case */
void log_message(const char *msg) {
    if (log_buf_full()) {
        log_flush();
        log_message(msg); /* Stack grows each time if flush fails */
    }
    /* ... */
}
/* Fix: flush once, then drop the message instead of recursing */
void log_message(const char *msg) {
    if (log_buf_full()) {
        log_flush();
        if (log_buf_full()) return; /* Drop message if still full */
    }
    /* ... */
}

Bug 3 Solution: Missing ADC Clock Enable

The clock-enable line sets the wrong bit in RCC->APB2ENR, so ADC1 never receives a clock:

/* Bug: enabling SPI1 clock instead of ADC1 */
RCC->APB2ENR |= RCC_APB2ENR_SPI1EN;
/* Fix: enable ADC1 clock */
RCC->APB2ENR |= RCC_APB2ENR_ADC1EN;

Bug 4 Solution: DMA Race Condition

The main loop modifies the framebuffer while SPI DMA is still transmitting the previous frame:

/* Bug: no synchronization */
oled_clear(); /* Writes to framebuffer */
draw_dashboard(); /* Writes to framebuffer */
oled_update_dma(); /* Starts DMA transfer */
/* Main loop immediately starts modifying framebuffer again */
/* Fix: wait for DMA transfer to complete before modifying buffer */
while (spi_dma_busy); /* Wait for previous transfer; the flag must be volatile and cleared in the DMA-complete ISR */
oled_clear();
draw_dashboard();
oled_update_dma();
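
Busy-waiting works, but the same flag can also guard the buffer without blocking. The sketch below models that handshake on the host. spi_dma_busy and oled_update_dma come from the snippet above, while the ISR name, framebuffer_can_draw, and the boolean return value are assumptions made for this example.

```c
#include <stdbool.h>

/* Set when a transfer starts, cleared by the DMA-complete interrupt.
 * volatile because it is shared between the ISR and the main loop. */
static volatile bool spi_dma_busy = false;

/* Stand-in for the DMA transfer-complete interrupt handler. */
void dma_transfer_complete_isr(void)
{
    spi_dma_busy = false;
}

/* Starts a frame transfer unless one is already in flight. */
bool oled_update_dma(void)
{
    if (spi_dma_busy) {
        return false; /* Previous frame is still going out */
    }
    spi_dma_busy = true;
    /* Real firmware would configure and enable the DMA channel here. */
    return true;
}

/* The main loop may touch the framebuffer only when this returns true. */
bool framebuffer_can_draw(void)
{
    return !spi_dma_busy;
}
```

With this shape the main loop can skip a frame and do other work when the display is busy, rather than stalling until the transfer ends.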

Bug 5 Solution: Blocking Loop Starves Watchdog

A sensor read function has a blocking wait for a “data ready” flag that never gets set because the sensor was not properly configured:

/* Bug: infinite wait if sensor not configured */
while (!(i2c_read_reg(BME280_ADDR, 0xF3) & 0x08)); /* Wait for data ready */
/* Fix: add timeout */
uint32_t start = systick_ms;
while (!(i2c_read_reg(BME280_ADDR, 0xF3) & 0x08)) {
    if ((systick_ms - start) > 100) {
        return SENSOR_TIMEOUT; /* Bail out after 100 ms */
    }
    iwdg_feed(); /* Keep watchdog alive while waiting */
}
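
The timeout test relies on unsigned subtraction, which stays correct even when the millisecond counter wraps around from 0xFFFFFFFF to 0. The sketch below isolates that comparison so the behaviour can be checked on a PC. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true once more than timeout_ms have passed since
 * start_ms. Unsigned arithmetic is modulo 2^32, so the difference is
 * still the elapsed time after the counter has wrapped. */
bool timeout_elapsed(uint32_t now_ms, uint32_t start_ms, uint32_t timeout_ms)
{
    return (uint32_t)(now_ms - start_ms) > timeout_ms;
}
```

Comparing absolute values instead (now_ms > start_ms + 100) breaks near the wrap point, which for a 1 ms tick occurs after roughly 49.7 days of uptime.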

What You Have Learned



Lesson 7 Complete

GDB skills:

  • Connecting GDB to OpenOCD for live debugging over SWD
  • Setting breakpoints (hardware and conditional) and watchpoints
  • Examining CPU registers, memory, and peripheral registers
  • Stepping through code at source and instruction level
  • Backtrace analysis for finding stuck or recursive code

Fault handling:

  • HardFault handler that captures faulting PC and fault status registers
  • CFSR decoding to identify fault type (MemManage, BusFault, UsageFault)
  • Recovering the faulting instruction address and source line

ITM trace:

  • Configuring SWO output through OpenOCD
  • Sending trace messages from firmware without using a UART peripheral

Bug hunting patterns:

  • Null pointer dereferences causing HardFault
  • Stack overflow from unbounded recursion
  • Missing or wrong peripheral clock enable (a very common STM32 mistake)
  • DMA race conditions with shared buffers
  • Watchdog starvation from blocking waits without timeouts



© 2021-2026 SiliconWit®. All rights reserved.