Description
Sending any (user) packet over the IEEE 802.15.4 submac with the lwIP stack deadlocks. I traced the issue down to a race condition between the main thread, which requests the BH before setting the FSM state to PREPARE, and the lwip_netdev_mux thread, which will happily try to handle the BH while the FSM state is still RX.
This does not happen with GNRC as all submac interaction happens on a separate thread there.
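The interleaving described above can be sketched as a small single-threaded model. This is only an illustration under simplifying assumptions: the types, the `model_send()` and `netdev_thread_run()` helpers, and the "BH in PREPARE completes the transmission" shortcut are all hypothetical and not RIOT's actual submac code. The preemption by `lwip_netdev_mux` is modelled as a direct function call right after the BH request.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the submac FSM (not RIOT's types). */
typedef enum { ST_RX, ST_PREPARE } fsm_state_t;

typedef struct {
    fsm_state_t state;  /* current FSM state */
    bool bh_pending;    /* a BH request has been posted */
    bool tx_done;       /* the sender has been notified of TX_DONE */
} model_t;

/* Models the netdev thread handling a pending BH request. */
static void netdev_thread_run(model_t *m)
{
    if (!m->bh_pending) {
        return;
    }
    m->bh_pending = false;          /* the request is consumed either way */
    if (m->state == ST_PREPARE) {
        /* Assumption: BH in PREPARE runs the transmission to completion. */
        m->state = ST_RX;
        m->tx_done = true;
    }
    /* Otherwise this is RX + BH, the invalid transition seen in the log,
     * and the request is lost. */
}

/* Models one send from the main thread. Returns true if TX_DONE arrives. */
static bool model_send(bool set_state_first)
{
    model_t m = { ST_RX, false, false };

    if (set_state_first) {
        m.state = ST_PREPARE;       /* state is updated before the request */
        m.bh_pending = true;
        netdev_thread_run(&m);      /* netdev thread preempts here */
    }
    else {
        m.bh_pending = true;        /* order as reported in this issue */
        netdev_thread_run(&m);      /* netdev thread preempts, state is RX */
        m.state = ST_PREPARE;       /* too late, the BH is already gone */
    }

    netdev_thread_run(&m);          /* later runs find nothing pending */
    assert(!m.bh_pending);          /* the request was consumed in both cases */
    return m.tx_done;
}
```

In this model `model_send(false)` returns `false` (the sender would wait for TX_DONE forever), while `model_send(true)` returns `true`. The second ordering is shown only to make the dependency on ordering visible; it is not meant as the proposed fix.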
Steps to reproduce the issue
Print the thread name with every debug print by applying this patch:
diff --git a/core/lib/include/debug.h b/core/lib/include/debug.h
index 620de78267..1d4bdef600 100644
--- a/core/lib/include/debug.h
+++ b/core/lib/include/debug.h
@@ -121,7 +121,7 @@ extern "C" {
* @details If a variable is only accessed by `DEBUG()`, the compiler will
* warn about unused variables when `ENABLE_DEBUG` is set to `0`.
*/
-#define DEBUG(...) do { if (ENABLE_DEBUG) { DEBUG_PRINT(__VA_ARGS__); } } while (0)
+#define DEBUG(...) do { if (ENABLE_DEBUG) { puts(thread_get_name(thread_get_active())); DEBUG_PRINT(__VA_ARGS__); } } while (0)
/**
* @def DEBUG_PUTS
@@ -129,7 +129,7 @@ extern "C" {
* @brief Print debug information to stdout using puts(), so no stack size
* restrictions do apply.
*/
-#define DEBUG_PUTS(str) do { if (ENABLE_DEBUG) { puts(str); } } while (0)
+#define DEBUG_PUTS(str) do { if (ENABLE_DEBUG) { puts(thread_get_name(thread_get_active())); puts(str); } } while (0)
/** @} */
/**
Enable debug prints for cpu/nrf52/radio/nrf802154/nrf802154_radio.c, drivers/netdev_ieee802154_submac/netdev_ieee802154_submac.c and pkg/lwip/contrib/netdev/lwip_netdev.c.
Run LWIP_IPV6=1 make -C examples/networking/coap/gcoap_dtls BOARD=nrf52840dk flash term -j
Expected results
No race condition. I guess all submac handling should happen on a single thread?
Actual results
coap get coap://[fe80::1]/st
2025-11-04 14:23:33,375 # coap get coap://[fe80::1]/
2025-11-04 14:23:33,379 # gcoap_cli: sending msg ID 64789, 6 bytes
2025-11-04 14:23:33,380 # main
2025-11-04 14:23:33,387 # IEEE802154 submac: ieee802154_submac_process_ev(): IEEE802154_FSM_STATE_RX + REQUEST_TX
2025-11-04 14:23:33,388 # main
2025-11-04 14:23:33,391 # [nrf802154] Device state: DISABLED
2025-11-04 14:23:33,391 # main
2025-11-04 14:23:33,394 # [nrf802154] Send a packet
2025-11-04 14:23:33,394 # main
2025-11-04 14:23:33,399 # [nrf802154] send: putting 64 bytes into the frame buffer
2025-11-04 14:23:33,399 # main
2025-11-04 14:23:33,406 # IEEE802154 submac: ieee802154_submac_bh_request(): post NETDEV_EVENT_ISR
2025-11-04 14:23:33,406 # main
2025-11-04 14:23:33,409 # [lwip_netdev] NETDEV_EVENT_ISR
2025-11-04 14:23:33,410 # lwip_netdev_mux
2025-11-04 14:23:33,413 # [lwip_netdev] handle netdev isr
2025-11-04 14:23:33,414 # lwip_netdev_mux
2025-11-04 14:23:33,419 # IEEE802154 submac: _isr(): NETDEV_SUBMAC_FLAGS_BH_REQUEST
2025-11-04 14:23:33,421 # lwip_netdev_mux
2025-11-04 14:23:33,428 # IEEE802154 submac: ieee802154_submac_process_ev(): IEEE802154_FSM_STATE_RX + BH
2025-11-04 14:23:33,429 # lwip_netdev_mux
2025-11-04 14:23:33,431 # RX--(BH)->INVALID
2025-11-04 14:23:38,382 # gcoap: timeout for msg ID 64789
This results in a deadlock because the main thread keeps waiting for TX_DONE.
Versions
Current master.