1
0
mirror of https://github.com/home-assistant/supervisor.git synced 2026-02-15 07:27:13 +00:00
Commit Graph

44 Commits

Author SHA1 Message Date
Stefan Agner
77f3da7014 Disable Home Assistant watchdog during system shutdown (#6512)
During system shutdown (reboot/poweroff), the watchdog was incorrectly
detecting the Home Assistant Core container as failed and attempting to
restart it. This occurred because Docker was stopping all containers in
parallel with Supervisor's own shutdown sequence, causing the watchdog
to trigger while add-ons were still being stopped.

This led to an abrupt termination of Core before it could cleanly shut
down its SQLite database, resulting in a warning on the next startup:
"The system could not validate that the sqlite3 database was shutdown
cleanly".

The fix registers a supervisor state change listener that unregisters
the watchdog when entering any shutdown state (SHUTDOWN, STOPPING, or
CLOSE). This prevents restart attempts during both user-initiated
reboots (via API) and external shutdown signals (Docker SIGTERM,
console reboot commands).

Since SHUTDOWN, STOPPING, and CLOSE are terminal states with no reverse
transition back to RUNNING, no re-registration logic is needed.

Fixes #6511

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-31 17:01:05 +01:00
dependabot[bot]
2a4890e2b0 Bump aiodocker from 0.24.0 to 0.25.0 (#6448)
* Bump aiodocker from 0.24.0 to 0.25.0

Bumps [aiodocker](https://github.com/aio-libs/aiodocker) from 0.24.0 to 0.25.0.
- [Release notes](https://github.com/aio-libs/aiodocker/releases)
- [Changelog](https://github.com/aio-libs/aiodocker/blob/main/CHANGES.rst)
- [Commits](https://github.com/aio-libs/aiodocker/compare/v0.24.0...v0.25.0)

---
updated-dependencies:
- dependency-name: aiodocker
  dependency-version: 0.25.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update to new timeout configuration

* Fix pytest failure

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Mike Degatano <michael.degatano@gmail.com>
Co-authored-by: Stefan Agner <stefan@agner.ch>
2026-01-30 09:39:06 +01:00
Stefan Agner
a2db716a5f Check frontend availability after Home Assistant Core updates (#6311)
* Check frontend availability after Home Assistant Core updates

Add verification that the frontend is actually accessible at "/" after core
updates to ensure the web interface is serving properly, not just that the
API endpoints respond.

Previously, the update verification only checked API endpoints and whether
the frontend component was loaded. This could miss cases where the API is
responsive but the frontend fails to serve the UI.

Changes:
- Add check_frontend_available() method to HomeAssistantAPI that fetches
  the root path and verifies it returns HTML content
- Integrate frontend check into core update verification flow after
  confirming the frontend component is loaded
- Trigger automatic rollback if frontend is inaccessible after update
- Fix blocking I/O calls in rollback log file handling to use async
  executor

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Avoid checking frontend if config data is None

* Improve pytest tests

* Make sure Core returns a valid config

* Remove Core version check in frontend availability test

The call site already makes sure that an actual Home Assistant Core
instance is running before calling the frontend availability test.
So this is rather redundant. Simplify the code by removing the version
check and update tests accordingly.

* Add test coverage for get_config

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-01-29 09:06:45 +01:00
Mike Degatano
909a2dda2f Migrate (almost) all docker container interactions to aiodocker (#6489)
* Migrate all docker container interactions to aiodocker

* Remove containers_legacy since its no longer used

* Add back remove color logic

* Revert accidental invert of conditional in setup_network

* Fix typos found by copilot

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Revert "Apply suggestions from code review"

This reverts commit 0a475433ea.

---------

Co-authored-by: Stefan Agner <stefan@agner.ch>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-01-27 12:42:17 +01:00
Mike Degatano
d23bc291d5 Migrate create container to aiodocker (#6415)
* Migrate create container to aiodocker

* Fix extra hosts transformation

* Env not Environment

* Fix tests

* Fixes from feedback

---------

Co-authored-by: Jan Čermák <sairon@users.noreply.github.com>
2025-12-15 09:57:30 +01:00
Hendrik Bergunde
a2d301ed27 Increase timeout waiting for Core API to work around 2025.12.x issues (#6404)
* Fix too short timeouts for Synology NAS 

With Home Assistant Core 2025.12.x updates available the STARTUP_API_RESPONSE_TIMEOUT that HA supervisor is willing to wait (before assuming a startup failure and rolling back the entire core update) seems to be too low on not-so-beefy hosts. The problem has been seen on Synology NAS machines running Home Assistant on the side (like in my case). I have doubled the timeout from 3 to 6 minutes and the upgrade to Core 2025.12.1 works on my Synology DS723+. My update took 4min 56s -- hence the timeout increase was proven necessary.

* Fix tests for increased API Timeout

* Increase the timeout to 10 minutes

* Increase the timeout in tests

---------

Co-authored-by: Jan Čermák <sairon@users.noreply.github.com>
2025-12-08 11:05:57 -05:00
Stefan Agner
ae7700f52c Fix private registry authentication for aiodocker image pulls (#6355)
* Fix private registry authentication for aiodocker image pulls

After PR #6252 migrated image pulling from dockerpy to aiodocker,
private registry authentication stopped working. The old _docker_login()
method stored credentials in ~/.docker/config.json via dockerpy, but
aiodocker doesn't read that file - it requires credentials passed
explicitly via the auth parameter.

Changes:
- Remove unused _docker_login() method (dockerpy login was ineffective)
- Pass credentials directly to pull_image() via new auth parameter
- Add auth parameter to DockerAPI.pull_image() method
- Add unit tests for Docker Hub and custom registry authentication

Fixes #6345

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Ignore protected access in test

* Fix plug-in pull test

* Fix HA core tests

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-26 17:37:24 +01:00
Mike Degatano
30cc172199 Migrate images from dockerpy to aiodocker (#6252)
* Migrate images from dockerpy to aiodocker

* Add missing coverage and fix bug in repair

* Bind libraries to different files and refactor images.pull

* Use the same socket again

Try using the same socket again.

* Fix pytest

---------

Co-authored-by: Stefan Agner <stefan@agner.ch>
2025-11-12 20:54:06 +01:00
Mike Degatano
207b665e1d Send progress updates during image pull for install/update (#6102)
* Send progress updates during image pull for install/update

* Add extra to tests about job APIs

* Sent out of date progress to sentry and combine done event

* Pulling container image layer
2025-08-22 10:41:10 +02:00
Mike Degatano
8a82b98e5b Improved error handling for docker image pulls (#6095)
* Improved error handling for docker image pulls

* Fix mocking in tests due to api use change
2025-08-13 18:05:27 +02:00
Felipe Santos
b8852872fe Remove anonymous volumes when removing containers (#5977)
* Remove anonymous volumes when removing containers

* Add tests for docker.run_command()
2025-06-30 13:31:41 +02:00
Stefan Agner
122b73202b Unify Supervisor event message functions (#5831)
* Unify Supervisor event message functions

Unify functions which send WebSocket messages of type
"supervisor/event". This deduplicates code and hopefully avoids further
diversication in the future.

While at it, remove unused HomeAssistantWSNotSupported exception. It
seems the only place this exception is used got removed in #3317.

* Test message delivery during shutdown states
2025-04-23 10:40:25 +02:00
Mike Degatano
4a00caa2e8 Fix mypy issues in docker, hardware and homeassistant modules (#5805)
* Fix mypy issues in docker and hardware modules

* Fix mypy issues in homeassistant module

* Fix async_send_command typing

* Fixes from feedback
2025-04-08 12:52:58 -04:00
Mike Degatano
8b3bf547d7 Skip corrupt registry files in backups (#5789) 2025-03-27 10:32:28 +01:00
Mike Degatano
324b059970 Move write of core state to executor (#5720) 2025-03-04 17:49:53 +01:00
Mike Degatano
582b128ad9 Finish migrating read_text to executor (#5698)
* Move read_text to executor

* switch to async_capture_exception

* Finish moving read_text to executor

* Cover read_bytes and some write_text calls as well

* Fix await issues

* Fix format_message
2025-03-04 11:45:44 +01:00
Mike Degatano
86133f8ecd Move read_text to executor (#5688)
* Move read_text to executor

* Fix issues found by coderabbit

* formated to formatted

* switch to async_capture_exception

* Find and replace got one too many

* Update patch mock to async_capture_exception

* Drop Sentry capture from format_message

The error handling got introduced in #2052, however, #2100 essentially
makes sure there will never be a byte object passed to this function.
And even if, the Sentry aiohttp plug-in will properly catch such an
exception.

---------

Co-authored-by: Stefan Agner <stefan@agner.ch>
2025-03-01 16:02:43 +01:00
Stefan Agner
696dcf6149 Initialize Supervisor Core state in constructor (#5686)
* Initialize Supervisor Core state in constructor

Make sure the Supervisor Core state is set to a value early on. This
makes sure that the state is always of type CoreState, and makes sure
that any use of the state can rely on it being an actual value from the
CoreState enum.

This fixes Sentry filter during early startup, where the state
previously was None. Because of that, the Sentry filter tried to
collect more Context, which lead to an exception and not reporting
errors.

* Fix pytest

It seems that with initializing the state early, the pytest actually
runs a system evaluation with:
Starting system evaluation with state initialize

Before it did that with:
Starting system evaluation with state None

It detects that the container runs as privileged, and declares the
system as unhealthy.

It is unclear to me why coresys.core.healthy was checked in this
context, it doesn't seem useful. Just remove the check, and validate
the state through the getter instead.

* Update supervisor/core.py

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Make sure Supervisor container is privileged in pytest

With the Supervisor Core state being valid now, some evaluations
now actually run when loading the resolution center. This leads to
Supervisor getting declared unhealthy due to not running in a privileged
container under pytest.

Fake the host container to be privileged to make evaluations not
causing the system to be declared unhealthy under pytest.

* Avoid writing actual Supervisor run state file

With the Supervisor Core state being valid from the very start, we end
up writing a state everytime.

Instead of actually writing a state file, simply validate the the
necessary calls are being made. This is more conform to typical unit
tests and avoids writing a file for every test.

* Extend WebSocket client fixture and use it consistently

Extend the ha_ws_client WebSocket client fixture to set Supervisor Core
into run state and clear all pending messages.

Currently only some tests use the ha_ws_client WebSocket client fixture.
Use it consistently for all tests.

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-02-28 18:01:55 +01:00
Stefan Agner
7348745049 Print the exact reason if the WebSocket event to Core fails (#5609)
* Print the exact reason if the WebSocket event to Core fails

* Improve error at backup end too, fix tests

* Fix text

* Address ruff check issue
2025-02-06 18:17:46 +01:00
Mike Degatano
4c04f364a3 Use full match in homeassistant backup excludes (#5597) 2025-02-03 13:47:12 +01:00
Stefan Agner
f6faa18409 Bump pre-commit ruff to 0.5.7 and reformat (#5242)
It seems that the codebase is not formatted with the latest ruff
version. This PR reformats the codebase with ruff 0.5.7.
2024-08-13 20:53:56 +02:00
Mike Degatano
17ee234be4 Fix resp may be undefined end_backup issue (#5224) 2024-08-05 17:02:11 +02:00
Erik Montnemery
4ab4350c58 Add support for offline DB migration (#5202)
* Add support for offline DB migration

* Format code
2024-07-23 15:27:16 -04:00
Mike Degatano
50a2e8fde3 Allow adoption of existing data disk (#4991)
* Allow adoption of existing data disk

* Fix existing tests

* Add test cases and fix image issues

* Fix addon build test

* Run checks during setup not startup

* Addon load mimics plugin and HA load for docker part

* Default image accessible in except
2024-04-10 10:25:22 +02:00
Mike Degatano
9ca927dbe7 Watchdog does not start core before supervisor (#4955) 2024-03-13 09:08:27 +01:00
Mike Degatano
7fd6dce55f Migrate to Ruff for lint and format (#4852)
* Migrate to Ruff for lint and format

* Fix pylint issues

* DBus property sets into normal awaitable methods

* Fix tests relying on separate tasks in connect

* Fixes from feedback
2024-02-05 11:37:39 -05:00
Mike Degatano
3cc6bd19ad Mark system as unhealthy on OSError Bad message errors (#4750)
* Bad message error marks system as unhealthy

* Finish adding test cases for changes

* Rename test file for uniqueness

* bad_message to oserror_bad_message

* Omit some checks and check for network mounts
2023-12-21 18:05:29 +01:00
Jan Čermák
ed7edd9fe0 Adjust "retry in ..." log messages to avoid confusion (#4783)
As shown in home-assistant/operating-system#3007, error messages printed
to logs when container installation fails can cause some confusion,
because they are sometimes printed to the log on the landing page.
Adjust all wordings of "retry in" to "retrying in" to make it obvious
this happens automatically.
2023-12-20 18:34:42 +01:00
Erik Montnemery
95ac53d780 Bump core shutdown timeout for new pre-stopping core state (#4736)
* Bump core shutdown timeout

* Clarify comment

* Update tests
2023-11-28 15:03:25 -05:00
Stefan Agner
1e49129197 Use longer timeouts for API checks before trigger a rollback (#4658)
* Don't check if Core is running to trigger rollback

Currently we check for Core API access and that the state is running. If
this is not fulfilled within 5 minutes, we rollback to the previous
version.

It can take quite a while until Home Assistant Core is in state running.
In fact, after going through bootstrap, it can theoretically take
indefinitely (as in there is no timeout from Core side).

So to trigger rollback, rather than check the state to be running, just
check if the API is accessible in this case. This prevents spurious
rollbacks.

* Check Core status with and timeout after a longer time

Instead of checking the Core API just for response, do check the
state. Use a timeout which is long enough to cover all stages and
other timeouts during Core startup.

* Introduce get_api_state and better status messages

* Update supervisor/homeassistant/api.py

Co-authored-by: J. Nick Koston <nick@koston.org>

* Add successful start test

---------

Co-authored-by: J. Nick Koston <nick@koston.org>
2023-11-01 16:01:38 -04:00
Mike Degatano
a24657e565 Handle get users API returning None (#4628)
* Handle get users API returning None

* Skip throttle during test
2023-10-16 21:54:50 +02:00
Mike Degatano
682b8e0535 Core API check during startup can timeout (#4595)
* Core API check during startup can timeout

* Use a more specific exception so caller can differentiate
2023-10-04 18:54:42 +02:00
Mike Degatano
1611beccd1 Add job group execution limit option (#4457)
* Add job group execution limit option

* Fix pylint issues

* Assign variable before usage

* Cleanup jobs when done

* Remove isinstance check for performance

* Explicitly raise from None

* Add some more documentation info
2023-08-08 16:49:17 -04:00
Mike Degatano
1f92ab42ca Reduce executor code for docker (#4438)
* Reduce executor code for docker

* Fix pylint errors and move import/export image

* Fix test and a couple other risky executor calls

* Fix dataclass and return

* Fix test case and add one for corrupt docker

* Add some coverage

* Undo changes to docker manager startup
2023-07-18 11:39:39 -04:00
Mike Degatano
96d5fc244e Separate startup event from update check event (#4425)
* Separate startup event from update check event

* Add a queue for messages sent during startup
2023-07-06 12:45:37 -04:00
Mike Degatano
c896b60410 Fix asyncio.wait in supervisor.reload (#4333)
* Fix asyncio.wait in supervisor.reload

* Unwrap to prevent throttling across tests
2023-06-01 18:38:42 -04:00
Mike Degatano
5ced4e2f3b Update to python 3.11 (#4296) 2023-05-22 19:12:34 +02:00
Mike Degatano
14fcda5d78 Sentry only loaded when diagnostics on (#3993)
* Sentry only loaded when diagnostics on

* Logging when sentry is closed
2022-11-13 21:23:52 +01:00
Mike Degatano
2cd7f9d1b0 Attempt plugin update before failing job condition (#3796) 2022-08-17 07:36:05 +02:00
Mike Degatano
c8f184f24c Add auto update option (#3769)
* Add update freeze option

* Freeze to auto update and plugin condition

* Add tests

* Add supervisor_version evaluation

* OS updates require supervisor up to date

* Run version check during startup
2022-08-15 12:13:22 -04:00
Mike Degatano
d19166bb86 Docker events based watchdog and docker healthchecks (#3725)
* Docker events based watchdog

* Separate monitor from DockerAPI since it needs coresys

* Move monitor into dockerAPI

* Fix properties on coresys

* Add watchdog tests

* Added tests

* pylint issue

* Current state failures test

* Thread-safe event processing

* Use labels property
2022-07-15 09:21:59 +02:00
Casper
82060dd242 Fix typos (#2704) 2021-03-09 13:37:10 +01:00
Joakim Sørensen
90d8832cd2 Send event when add-on changes state (#2608)
* Send event when add-on changes state

* fix test
2021-02-23 15:12:30 +01:00
Joakim Sørensen
b31ecfefcd Initial WS support (#2439)
* Initial WS support

* test

* Update frontend to fc7c4af2

* Fix issue with closing states

* log error

* make data optional

* limit stopping states

* Move wrappers to HomeAssistantWebSocket

* use info

* Use call_soon

* Use lookuptable for WS commands

* Fix tests
2021-02-19 11:57:31 +01:00