Fix gateway/group startup and teardown race conditions #690

Open
dmulcahey wants to merge 2 commits into dev from pr/fix-gateway-group-lifecycle-races

dmulcahey commented Feb 27, 2026

This pull request introduces several improvements to error handling, task management, and group operations in the Zigbee gateway and group management modules. The changes help prevent potential issues like race conditions, redundant task creation, and errors from missing entities or groups. The most important changes are grouped below by theme.

Task Management & Error Handling

  • Improved device initialization task removal logic in device_initialized to ensure tasks are only removed if they match the expected task, preventing race conditions.
  • Added cancellation of initialization tasks when a device is removed in device_removed, ensuring cleanup of background tasks.
  • Enhanced startup polling logic in async_initialize_devices_and_entities with a try/finally to always allow polling after attempting device polling, improving robustness.
  • Wrapped shutdown logic in shutdown with try/finally to guarantee that the shutdown state is reset even if errors occur.
  • Prevented redundant creation of global updater and device availability checker tasks in helpers.py by checking if tasks are already running before starting new ones.
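
The identity check and cancel-on-removal behavior described in this list might look roughly like the following. This is a minimal sketch, not the code from this PR: `InitTaskTracker`, `start`, `_on_done`, and the `tasks` dict are hypothetical names chosen for illustration, and only `device_removed` echoes a name from the description.

```python
import asyncio
from functools import partial


class InitTaskTracker:
    """Hypothetical stand-in for per-device init-task bookkeeping."""

    def __init__(self) -> None:
        # Maps a device identifier to its most recent initialization task.
        self.tasks: dict[str, asyncio.Task] = {}

    def start(self, ieee: str, coro) -> asyncio.Task:
        # A newer task replaces the older entry for the same device.
        task = asyncio.create_task(coro)
        self.tasks[ieee] = task
        task.add_done_callback(partial(self._on_done, ieee))
        return task

    def _on_done(self, ieee: str, task: asyncio.Task) -> None:
        # Only drop the entry if it still points at *this* task; a late
        # callback from an older task must not evict a newer one.
        if self.tasks.get(ieee) is task:
            del self.tasks[ieee]

    def device_removed(self, ieee: str) -> None:
        # Cancel any pending initialization so it does not outlive the device.
        task = self.tasks.pop(ieee, None)
        if task is not None:
            task.cancel()
```

Without the `is task` comparison, the done callback of a superseded task would delete the entry that belongs to its replacement, which is the race the first bullet describes.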

Group Operations

  • Added handling for unknown groups in group_removed to avoid errors when removing groups that don't exist, with debug logging for visibility.
  • Updated unregister_group_entity to refresh entity subscriptions after removing a group entity, ensuring correct subscription state.
  • Added early returns in async_add_members and async_remove_members if the member list is empty, preventing unnecessary processing.
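
As a rough illustration of the no-op guards in this list, the sketch below shows an unknown-group removal that only logs, and member operations that return early on an empty list. `GroupSketch`, `GatewaySketch`, the `groups` dict, and the `requests_sent` counter are assumptions made for this example; the real classes talk to the Zigbee stack and are not reproduced here.

```python
import logging

_LOGGER = logging.getLogger(__name__)


class GroupSketch:
    """Hypothetical group; requests_sent counts how often real work ran."""

    def __init__(self, group_id: int) -> None:
        self.group_id = group_id
        self.members: set[str] = set()
        self.requests_sent = 0

    async def async_add_members(self, members: list[str]) -> None:
        if not members:
            return  # nothing to add, so skip all further processing
        self.requests_sent += 1
        self.members.update(members)

    async def async_remove_members(self, members: list[str]) -> None:
        if not members:
            return  # nothing to remove, so skip all further processing
        self.requests_sent += 1
        self.members.difference_update(members)


class GatewaySketch:
    """Hypothetical gateway holding groups keyed by group id."""

    def __init__(self) -> None:
        self.groups: dict[int, GroupSketch] = {}

    def group_removed(self, group_id: int):
        group = self.groups.pop(group_id, None)
        if group is None:
            # Unknown group: log for visibility and treat removal as a no-op.
            _LOGGER.debug("Ignoring removal of unknown group 0x%04x", group_id)
            return None
        return group
```

Using `dict.pop` with a default keeps the lookup and removal in one step, so a second removal of the same group takes the same quiet path as a removal of a group that never existed.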

Configuration Handling

  • Ensured deep copying of network configuration in get_application_controller_data to avoid unintended mutation of config objects.

Summary

Fixes gateway/group lifecycle race conditions around startup polling, shutdown behavior, init task races, and group/member lifecycle bookkeeping.
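
The startup-polling and shutdown fixes both rely on try/finally to keep state flags consistent when an awaited call raises. A minimal sketch of that pattern follows, assuming a hypothetical `GatewaySketch` with `shutting_down` and `allow_polling` attributes and a `controller` object exposing an async `shutdown()`; these names are illustrative and not taken from the diff.

```python
class GatewaySketch:
    """Hypothetical gateway showing flag handling with try/finally."""

    def __init__(self, controller) -> None:
        self.controller = controller
        self.shutting_down = False
        self.allow_polling = False

    async def shutdown(self) -> None:
        if self.shutting_down:
            return  # a shutdown is already in progress
        self.shutting_down = True
        try:
            await self.controller.shutdown()
        finally:
            # Reset the flag even if the controller raised, so a later
            # shutdown attempt is not silently skipped.
            self.shutting_down = False

    async def initialize(self, poll_devices) -> None:
        try:
            await poll_devices()
        finally:
            # Polling is allowed after the attempt, whether or not it failed.
            self.allow_polling = True
```

In both methods the exception still propagates to the caller; the `finally` clause only guarantees that the flag ends up in a usable state.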

Failing tests addressed

  • tests/test_gateway.py::test_group_removed_unknown_group_is_noop
  • tests/test_gateway.py::test_country_code_passthrough_updates_when_changed
  • tests/test_gateway.py::test_group_async_add_members_empty_list_noop
  • tests/test_gateway.py::test_group_async_remove_members_empty_list_noop
  • tests/test_gateway.py::test_group_unregister_entity_clears_member_subscriptions
  • tests/test_gateway.py::test_shutdown_controller_exception_resets_shutting_down_flag
  • tests/test_gateway.py::test_startup_polling_failure_still_enables_allow_polling
  • tests/test_gateway.py::test_device_removed_cancels_pending_device_init_task
  • tests/test_gateway.py::test_global_updater_second_start_stop_cancels_first_task
  • tests/test_gateway.py::test_gateway_device_initialized_done_callback_race_keeps_newer_task
  • tests/test_gateway.py::test_device_availability_checker_start_twice_stop_once_cancels_all_tasks

Verification

  • pytest (branch-local targeted run): all listed tests pass.

@dmulcahey dmulcahey force-pushed the pr/fix-gateway-group-lifecycle-races branch from 6e8fb32 to 24d076d Compare February 27, 2026 16:04
@dmulcahey dmulcahey changed the base branch from dm/codex-issue-exploration to dev February 27, 2026 16:04
@dmulcahey dmulcahey closed this Feb 27, 2026
@dmulcahey dmulcahey reopened this Feb 27, 2026
codecov bot commented Feb 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.60%. Comparing base (9d03d63) to head (3646c90).

Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #690      +/-   ##
==========================================
+ Coverage   97.51%   97.60%   +0.08%     
==========================================
  Files          62       62              
  Lines       10949    10971      +22     
==========================================
+ Hits        10677    10708      +31     
+ Misses        272      263       -9     

