mirror of https://github.com/home-assistant/supervisor.git synced 2026-05-19 14:18:53 +01:00
39 Commits

Author SHA1 Message Date
Stefan Agner c772a9bbb0 Replace fixed-duration sleeps after bus events with gather (#6803)
* Replace fixed-duration sleeps after bus events with gather

Several tests use ``await asyncio.sleep(...)`` to "wait for the
listener to run" after firing a bus event. The fixed duration is
real wall-clock time and the wait is nondeterministic: if the
handler chain happens to need slightly more time on a busy CI
runner, the assertion races the handler.

``Bus.fire_event`` returns the listener tasks since #6252; capture
and ``await asyncio.gather(*tasks)`` instead of sleeping. Touches
test_bus.py (the bus tests were poking scheduling instead of
verifying their assertions), test_home_assistant_watchdog.py,
test_plugin_base.py, addons/test_manager.py, docker/test_addon.py,
and test_store_execute_reload.py.
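
A minimal sketch of the sleep-to-gather change described above. ``ToyBus`` is a hypothetical stand-in, not the Supervisor ``Bus`` class; the only property it borrows from the description is that ``fire_event`` returns the listener tasks it spawned.

```python
import asyncio


class ToyBus:
    """Hypothetical stand-in for the Supervisor bus, for illustration only."""

    def __init__(self) -> None:
        self._listeners = []

    def register(self, listener) -> None:
        self._listeners.append(listener)

    def fire_event(self, data: str) -> list:
        # Spawn one task per listener and hand the tasks back to the caller.
        return [asyncio.create_task(listener(data)) for listener in self._listeners]


async def demo() -> list:
    seen = []

    async def listener(data: str) -> None:
        await asyncio.sleep(0)  # yield once, as a real handler chain might
        seen.append(data)

    bus = ToyBus()
    bus.register(listener)
    # Wait for exactly the spawned tasks instead of sleeping a fixed duration.
    await asyncio.gather(*bus.fire_event("started"))
    return seen


result = asyncio.run(demo())
```

Because the test awaits the tasks themselves, it finishes as soon as the listeners do, and an exception raised inside a listener surfaces in the test instead of being lost in a background task.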

Other cleanups in the same spirit:

- ``_fire_test_event`` in addons/test_addon.py becomes ``async def``
  and gathers the listener tasks itself, so its 17 call sites
  collapse to a single ``await _fire_test_event(...)``.
- The two test_store_execute_reload.py sites that used the private
  ``_update_connectivity()`` helper are reworked to set the cached
  connectivity flag directly and fire the event themselves so they
  can gather the listener tasks the same way.
- The two ``sleep(1)`` post-pull drains in docker/test_interface.py
  collapse to ``sleep(0)`` (handler tasks are already gathered
  inside pull_image), saving ~2s.
- The ``sleep(0.01)`` waits inside ``container_events()`` task
  bodies (api/test_addons.py, api/test_store.py,
  backups/test_manager.py) are just one-yield-to-the-parent and
  become ``sleep(0)``.

Switching to ``gather`` exposes a few latent test mocks that were
silently swallowing TypeErrors as background-task failures before:

- ``CGroup.add_devices_allowed`` is ``async def`` but was patched
  as a plain MagicMock in docker/test_addon.py — now patched via
  ``new_callable=AsyncMock``.
- The watchdog does ``await (await self.start())`` /
  ``await (await self.restart())`` because ``App.start`` /
  ``App.restart`` return ``asyncio.Task``. The mocks in
  addons/test_addon.py (test_app_watchdog, test_watchdog_on_stop,
  test_watchdog_during_attach) needed
  ``AsyncMock(return_value=<settled future>)`` to mirror that
  shape rather than a plain MagicMock.
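
The mock shape for the double await can be sketched with the standard library alone. The name ``start`` here is a placeholder mock, not the real ``App.start``; the assumption is only that the real method is a coroutine returning an awaitable task.

```python
import asyncio
from unittest.mock import AsyncMock


async def demo() -> str:
    # An already-resolved future stands in for the asyncio.Task that the
    # real start() is described as returning.
    settled = asyncio.get_running_loop().create_future()
    settled.set_result("started")

    start = AsyncMock(return_value=settled)

    # The inner await runs the mocked coroutine and yields the future;
    # the outer await waits for that future's result.
    return await (await start())


outcome = asyncio.run(demo())
```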

* Factor bus.fire_event + gather pattern into a helper

Per review feedback, the ``await asyncio.gather(*coresys.bus.fire_event(...))``
incantation was scattered across many call sites. Add
``tests.common.fire_bus_event`` that takes the coresys, event and data,
fires the event and awaits the spawned listener tasks. Convert all
matching sites to use it, including the ``_fire_test_event`` wrapper
in addons/test_addon.py which now just builds the
``DockerContainerStateEvent`` and delegates.
2026-05-06 12:02:28 +02:00
Stefan Agner 0ac8b42062 Rework Supervisor connectivity check with coalescing and force flag (#6765)
* Rework Supervisor connectivity check with coalescing and force flag

Previously, a failed connectivity probe could strand Supervisor in a
"no connectivity" state indefinitely. After an Ethernet reconnect, a
probe kicked by NetworkManager's connectivity transition could race
with CoreDNS being restarted (due to DNS locals changing), time out on
DNS, and leave supervisor.connectivity = False. The retry that
_on_dns_container_running was meant to fire landed inside the 5 s
JobThrottle window from the just-failed probe and was silently dropped,
since JobThrottle.THROTTLE drops rather than waits.

The rework replaces the @Job(throttle=THROTTLE) decorator and the
public connectivity setter with a single authoritative state-updating
method:

- check_and_update_connectivity(force=False) is the only path that
  runs the HTTP probe and updates the cached state. Concurrent callers
  coalesce onto a single in-flight probe. A min-interval throttle
  lives inside the method and reuses the cached result within the window
  instead of dropping calls.
- request_connectivity_check(force=False) is a fire-and-forget wrapper
  for signal handlers (D-Bus, plugin callbacks) that must return
  quickly without blocking signal dispatch on the HTTP round-trip.
- force=True bypasses the min-interval and, when a probe is in flight,
  sets a trailing-rerun flag so the owning task runs one more probe
  after the current one completes. Used for signals that carry fresh
  state-change information (NM connectivity transition to FULL, DNS
  container RUNNING, startup, post-NTP sync).
- _update_connectivity is the sole writer of the cached flag and
  emits SUPERVISOR_CONNECTIVITY_CHANGE only on actual transitions.
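
The coalescing, min-interval and trailing-rerun behaviour listed above can be modelled roughly as follows. This is a toy sketch under stated assumptions: ``ConnectivityChecker``, ``check`` and the attribute names are hypothetical, the probe is a plain coroutine rather than an HTTP request, and event emission and cancellation handling are left out.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class ConnectivityChecker:
    """Toy model of coalescing plus force flag; all names are hypothetical."""

    def __init__(self, probe: Callable[[], Awaitable[bool]], min_interval: float = 5.0) -> None:
        self._probe = probe
        self._min_interval = min_interval
        self._inflight: Optional[asyncio.Task] = None
        self._rerun = False
        self._last_check: Optional[float] = None
        self.connectivity: Optional[bool] = None

    async def check(self, force: bool = False) -> Optional[bool]:
        if self._inflight is not None:
            # Coalesce onto the probe that is already running.
            if force:
                self._rerun = True  # ask the owner for one trailing probe
            await self._inflight
            return self.connectivity

        now = asyncio.get_running_loop().time()
        fresh = self._last_check is not None and now - self._last_check < self._min_interval
        if fresh and not force:
            # Reuse the cached result instead of dropping the call.
            return self.connectivity

        self._rerun = False
        task = self._inflight = asyncio.create_task(self._run())
        await task
        return self.connectivity

    async def _run(self) -> None:
        try:
            while True:
                self.connectivity = await self._probe()
                self._last_check = asyncio.get_running_loop().time()
                if not self._rerun:
                    break
                self._rerun = False  # run exactly one trailing probe per request batch
        finally:
            self._inflight = None


async def demo() -> tuple:
    calls = 0

    async def probe() -> bool:
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)
        return True

    checker = ConnectivityChecker(probe)
    # Three concurrent callers, one forced: one probe plus one trailing rerun.
    await asyncio.gather(checker.check(), checker.check(), checker.check(force=True))
    after_concurrent = calls
    await checker.check()  # inside the min-interval: served from cache
    after_cached = calls
    await checker.check(force=True)  # force bypasses the min-interval
    return after_concurrent, after_cached, calls


counts = asyncio.run(demo())
```

The loop clock (``loop.time()``) is used for the interval so that wall-clock corrections do not affect the throttle.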

Call sites migrate accordingly. The opportunistic
supervisor.connectivity = False writes in update_apparmor,
updater.fetch_data, os.manager, and addon_pwned error paths are
replaced with request_connectivity_check() calls so the probe remains
authoritative - an endpoint-specific failure no longer lies about the
overall connectivity state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Propagate connectivity-probe cancellation and skip last-check on cancel

Awaiting an asyncio.Task does not propagate cancellation INTO the task,
so the previous owner-doesn't-shield comment was misleading: a cancelled
owner left the spawned probe running orphaned, and the next caller could
start a second probe alongside it. The owner now explicitly cancels and
awaits the probe on CancelledError before re-raising.

The last-check timestamp is also moved out of the finally block so a
cancelled probe does not leave a "fresh result just ran" cache behind
that would short-circuit the next non-forced caller.

A regression test exercises both: that owner cancellation clears the
in-flight reference and leaves the timestamp untouched, and that a
subsequent non-forced check therefore still actually probes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Clarify why post-NTP-sync forces a connectivity probe

The previous comment claimed the last-check timestamp may be unreliable
after a time jump, but _connectivity_last_check uses loop.time() which
is monotonic and unaffected by wall-clock corrections. The real reason
to force a fresh probe is TLS validation: certificates that appeared
expired or not-yet-valid before the system clock was corrected may now
verify, so a probe that just failed with an SSL error can succeed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add debug logs to Supervisor connectivity probe paths

The original stuck-offline bug was hard to spot in logs because the
silent throttle-drop and the cached state had no audit trail. With
debug-level logging at each decision point, a future investigation can
reconstruct from a single log file:

- who requested a check (force flag distinguishes signal-driven probes
  from precondition / opportunistic-error-path requests)
- why a probe did not actually run (in-flight coalesce, cached within
  min-interval, owner cancellation)
- when a forced rerun was queued and when it ran (the precise failure
  mode that stranded the supervisor in the original incident)
- when the cached state actually flipped (with the previous value in
  the message so transitions are visible)

All new lines are debug-level. The existing _do_connectivity_check
"failed" / "succeeded" lines are kept unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Skip system-checks fan-out in test_events_on_issue_changes

The test asserts that apply_suggestion fires an ISSUE_REMOVED event.
ISSUE_REMOVED is fired by dismiss_issue inside FixupBase.__call__, before
apply_suggestion calls healthcheck. The healthcheck call afterwards is
incidental to this test's intent, but it fans out into check_system()
which runs CheckDNSServer (A and AAAA) - real aiodns query_dns() probes
against the NetworkManager mock's stub nameserver 192.168.30.1 that each
hit the default ~10 s aiodns timeout. The file took ~21 s to run.

The slowness has been latent since #3818 (Aug 2022), which added the
apply_suggestion step at the end of test_events_on_issue_changes two
days after the DNS check landed in its current form (#3811). The default
24 h JobThrottle on CheckDNSServer.run_check tends to mask the cost in
full-suite runs once any earlier test has tripped the throttle, which is
likely why this slipped through.

Mock coresys.resolution.healthcheck for just this one apply_suggestion
call rather than introducing a file-wide DNS mock. The patch is local to
the slow call site and the test's assertion is unaffected. The file
drops from ~21 s to ~2.5 s.
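
Scoping a mock to a single call, as described above, might look like this. ``Resolution`` is a hypothetical synchronous stand-in; the real objects and their async signatures are not reproduced here.

```python
from unittest.mock import patch


class Resolution:
    """Hypothetical stand-in; the real healthcheck fans out into slow checks."""

    def healthcheck(self) -> str:
        return "slow real checks ran"

    def apply_suggestion(self) -> str:
        # The behaviour under test would happen here, before the healthcheck.
        return self.healthcheck()


resolution = Resolution()

# Patch only around the one call whose follow-up healthcheck is incidental.
with patch.object(resolution, "healthcheck", return_value="skipped") as mocked:
    patched_result = resolution.apply_suggestion()

# Outside the context manager the original method is back in place.
unpatched_result = resolution.apply_suggestion()
```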

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 10:14:13 +02:00
Stefan Agner 4938fb215d Improve Docker port-in-use detection and handling (#6766)
Triaging SUPERVISOR-1JWK turned up a missed port conflict:
RE_PORT_CONFLICT_ERROR only matched one of the Docker daemon's
port-in-use message shapes. The two variants produced by current moby
— "Bind for <ip>:<port> failed: port is already allocated" from
portallocator and "failed to bind host port <ip>:<port>/<proto>:
address already in use" from osallocator — fell through to
DockerAPIError, got re-raised as AppUnknownError, and the watchdog
shipped them to Sentry as unknown errors.

Widen the regex to match all known shapes (including the older form
embedding the container endpoint, still observed from older daemons
and wrappers), anchored on the "failed to set up container networking"
prefix and one of the "address already in use" or "port is already
allocated" suffixes. Log the raw Docker message at debug level before
converting, so curious users can still see the exact upstream text
(host IP, container endpoint, protocol) when investigating which
process is holding the port.
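
A rough illustration of a pattern anchored on the prefix and suffixes named above. ``PORT_CONFLICT`` is not the actual ``RE_PORT_CONFLICT_ERROR``, and the sample messages are assembled from the fragments quoted in this message rather than captured daemon output.

```python
import re

# Hypothetical pattern: prefix, anything in between, then one of two suffixes.
PORT_CONFLICT = re.compile(
    r"failed to set up container networking.*"
    r"(?:address already in use|port is already allocated)\s*$",
    re.DOTALL,
)

# Illustrative messages only; real daemon text may differ in the middle part.
samples = {
    "portallocator": (
        "failed to set up container networking: "
        "Bind for 0.0.0.0:8080 failed: port is already allocated"
    ),
    "osallocator": (
        "failed to set up container networking: "
        "failed to bind host port 0.0.0.0:8080/tcp: address already in use"
    ),
    "unrelated": "No such image: example/app:latest",
}

matches = {name: bool(PORT_CONFLICT.search(text)) for name, text in samples.items()}
```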

The watchdog's _restart_after_problem now catches AppPortConflict
explicitly ahead of the generic AppsError handler: log a warning,
break the retry loop, do not call async_capture_exception. A port
conflict is an environment condition — another process grabbed the
port while the add-on was down — so retrying cannot make it succeed
and reporting to Sentry is noise.
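
The handler ordering can be sketched as below. The exception classes are redefined locally as placeholders, the assumption that ``AppPortConflict`` derives from ``AppsError`` is mine, and the function is synchronous and returns its log lines only to keep the example self-contained.

```python
class AppsError(Exception):
    """Placeholder for the generic error type."""


class AppPortConflict(AppsError):
    """Placeholder; assumed here to be a subclass of AppsError."""


def restart_after_problem(start, max_attempts: int = 3) -> list:
    log = []
    for attempt in range(1, max_attempts + 1):
        try:
            start()
        except AppPortConflict as err:
            # Must come first: the generic handler below would also match.
            # Retrying cannot help, so warn, skip reporting, and stop.
            log.append(f"warning, giving up: {err}")
            break
        except AppsError as err:
            log.append(f"attempt {attempt} failed and was reported: {err}")
        else:
            log.append("started")
            break
    return log


def always_port_conflict() -> None:
    raise AppPortConflict("port 8080 is already in use")


def always_generic() -> None:
    raise AppsError("container exited")


conflict_log = restart_after_problem(always_port_conflict)
generic_log = restart_after_problem(always_generic)
```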

With port conflicts now raised as typed APIError subclasses at the
detection site, the DockerAPIError → format_message() rewrite fallback
in api_return_error has no work left. Drop the fallback and delete
supervisor/utils/log_format.py along with its tests; the module only
ever handled port-conflict prose.

Fixes SUPERVISOR-1JWK

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 21:55:18 +02:00
Mike Degatano ba8c49935b Refactor internal addon references to app/apps (#6717)
* Rename addon→app in docstrings and comments

Updates all docstrings and inline comments across supervisor/ and
tests/ to use the new app/apps terminology. No runtime behaviour
is changed by this commit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Rename addon→app in code (variables, args, class names, functions)

Renames all internal Python identifiers from addon/addons to app/apps:
- Variable and argument names
- Function and method names
- Class names (Addon→App, AddonManager→AppManager, DockerAddon→DockerApp,
  all exception, check, and fixup classes, etc.)
- String literals used as Python identifiers (pytest fixtures,
  parametrize param names, patch.object attribute strings,
  URL route match_info keys)

External API contracts are preserved: JSON keys, error codes,
discovery protocol fields, TypedDict/attr.s field names.
Import module paths (supervisor/addons/) are also unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix partial backup/restore API to remap addons key to apps

The external API accepts `addons` as the request body key (since
ATTR_APPS = "addons"), but do_backup_partial and do_restore_partial
now take an `apps` parameter after the rename. The **body expansion
in both endpoints would pass `addons=...` causing a TypeError.

Remap the key before expansion in both backup_partial and
restore_partial:

    if ATTR_APPS in body:
        body["apps"] = body.pop(ATTR_APPS)

Also adds test_restore_partial_with_addons_key to verify the restore
path correctly receives apps= when addons is passed in the request
body. This path had no existing test coverage.
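
A self-contained illustration of why the remap is needed. ``do_backup_partial`` below is a stub with an assumed signature, and ``remap`` is a hypothetical helper wrapping the two lines shown above.

```python
from typing import Optional

ATTR_APPS = "addons"  # the external API key stays "addons"


def do_backup_partial(name: str, apps: Optional[list] = None) -> dict:
    """Stub with an assumed signature: accepts apps=, not addons=."""
    return {"name": name, "apps": apps}


def remap(body: dict) -> dict:
    body = dict(body)  # copy so the caller's request body is not mutated
    if ATTR_APPS in body:
        body["apps"] = body.pop(ATTR_APPS)
    return body


request_body = {"name": "nightly", "addons": ["core_ssh"]}

try:
    do_backup_partial(**request_body)
    unmapped_failed = False
except TypeError:
    # Without the remap, addons= is an unexpected keyword argument.
    unmapped_failed = True

remapped = do_backup_partial(**remap(request_body))
```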

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix merge error

* Adjust AppLoggerAdapter to use app_name

Co-authored-by: Stefan Agner <stefan@agner.ch>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Stefan Agner <stefan@agner.ch>
2026-04-14 16:47:20 +02:00
Stefan Agner 5e1eaa9dfe Respect auto-update setting for plug-in auto-updates (#6606)
* Respect auto-update setting for plug-in auto-updates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Also skip auto-updating plug-ins in decorator

* Raise if auto-update flag is not set and plug-in is not up to date

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 09:04:33 +01:00
Mike Degatano a122b5f1e9 Migrate info, events and container logs to aiodocker (#6514)
* Migrate info and events to aiodocker

* Migrate container logs to aiodocker

* Fix dns plugin loop test

* Fix mocking for docker info

* Fixes from feedback

* Harden monitor error handling

* Deleted failing tests because they were not useful
2026-02-03 18:36:41 +01:00
Stefan Agner 79f9afb4c2 Fix port conflict tests for aiodocker 0.25.0 compatibility (#6519)
The aiodocker 0.25.0 upgrade (PR #6448) changed how DockerError handles
the message parameter. The library now extracts the message string from
Docker API JSON responses before passing it to DockerError, rather than
passing the entire dict.

The port conflict detection tests were written before this change and
incorrectly passed dicts to DockerError. This caused TypeErrors when
the port conflict detection code tried to match err.message with a
regex, expecting a string but receiving a dict.

Update both test_addon_start_port_conflict_error and
test_observer_start_port_conflict to pass message strings directly,
matching the real aiodocker 0.25.0 behavior.
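
The failure mode is easy to reproduce with the ``re`` module alone: a compiled pattern rejects a dict outright. The pattern and helper name below are placeholders, not the Supervisor detection code.

```python
import re

PATTERN = re.compile(r"port is already allocated")  # placeholder pattern


def is_port_conflict(message) -> bool:
    return PATTERN.search(message) is not None


string_result = is_port_conflict("Bind for 0.0.0.0:8080 failed: port is already allocated")

try:
    is_port_conflict({"message": "port is already allocated"})
    dict_raised = False
except TypeError:
    # re expects a string or bytes-like object, so a dict message fails here.
    dict_raised = True
```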

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-03 10:34:47 +01:00
Mike Degatano 11b754102c Map port conflict on start error into a known error (#6445)
* Map port conflict on start error into a known error

* Apply suggestions from code review

* Run ruff format

---------

Co-authored-by: Stefan Agner <stefan@agner.ch>
2026-02-02 17:16:31 +01:00
Mike Degatano 909a2dda2f Migrate (almost) all docker container interactions to aiodocker (#6489)
* Migrate all docker container interactions to aiodocker

* Remove containers_legacy since it's no longer used

* Add back remove color logic

* Revert accidental invert of conditional in setup_network

* Fix typos found by copilot

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Revert "Apply suggestions from code review"

This reverts commit 0a475433ea.

---------

Co-authored-by: Stefan Agner <stefan@agner.ch>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-01-27 12:42:17 +01:00
Mike Degatano d23bc291d5 Migrate create container to aiodocker (#6415)
* Migrate create container to aiodocker

* Fix extra hosts transformation

* Env not Environment

* Fix tests

* Fixes from feedback

---------

Co-authored-by: Jan Čermák <sairon@users.noreply.github.com>
2025-12-15 09:57:30 +01:00
Stefan Agner ae7700f52c Fix private registry authentication for aiodocker image pulls (#6355)
* Fix private registry authentication for aiodocker image pulls

After PR #6252 migrated image pulling from dockerpy to aiodocker,
private registry authentication stopped working. The old _docker_login()
method stored credentials in ~/.docker/config.json via dockerpy, but
aiodocker doesn't read that file - it requires credentials passed
explicitly via the auth parameter.

Changes:
- Remove unused _docker_login() method (dockerpy login was ineffective)
- Pass credentials directly to pull_image() via new auth parameter
- Add auth parameter to DockerAPI.pull_image() method
- Add unit tests for Docker Hub and custom registry authentication

Fixes #6345

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Ignore protected access in test

* Fix plug-in pull test

* Fix HA core tests

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-26 17:37:24 +01:00
Mike Degatano 30cc172199 Migrate images from dockerpy to aiodocker (#6252)
* Migrate images from dockerpy to aiodocker

* Add missing coverage and fix bug in repair

* Bind libraries to different files and refactor images.pull

* Use the same socket again

Try using the same socket again.

* Fix pytest

---------

Co-authored-by: Stefan Agner <stefan@agner.ch>
2025-11-12 20:54:06 +01:00
Stefan Agner 1448a33dbf Remove Codenotary integrity check (#6236)
* Formally deprecate CodeNotary build config

* Remove CodeNotary specific integrity checking

The current code is specific to how CodeNotary was doing integrity
checking. A future integrity checking mechanism likely will work
differently (e.g. through EROFS based containers). Remove the current
code to make way for a future implementation.

* Drop CodeNotary integrity fixups

* Drop unused tests

* Fix pytest

* Fix pytest

* Remove CodeNotary related exceptions and handling

Remove CodeNotary related exceptions and handling from the Docker
interface.

* Drop unnecessary comment

* Remove Codenotary specific IssueType/SuggestionType

* Drop Codenotary specific environment and secret reference

* Remove unused constants

* Introduce APIGone exception for removed APIs

Introduce a new exception class APIGone to indicate that certain API
features have been removed and are no longer available. Update the
security integrity check endpoint to raise this new exception instead
of a generic APIError, providing clearer communication to clients that
the feature has been intentionally removed.

* Drop content trust

A cosign based signature verification will likely be named differently
to avoid confusion with existing implementations. For now, remove the
content trust option entirely.

* Drop code sign test

* Remove source_mods/content_trust evaluations

* Remove content_trust reference in bootstrap.py

* Fix security tests

* Drop unused tests

* Drop codenotary from schema

Since we have "remove extra" in voluptuous, we can remove the
codenotary field from the addon schema.

* Remove content_trust from tests

* Remove content_trust unsupported reason

* Remove unnecessary comment

* Remove unrelated pytest

* Remove unrelated fixtures
2025-11-03 20:13:15 +01:00
Jan Čermák bbb9469c1c Write cidfiles of Docker containers and mount them individually to /run/cid (#6154)
* Write cidfiles of Docker containers and mount them individually to /run/cid

There is no standard way to get the container ID in the container
itself, which can be needed for instance for #6006. The usual pattern is
to use the --cidfile argument of Docker CLI and mount the generated file
to the container. However, this is a feature of the Docker CLI and we can't
use it when creating the containers via API. To get the container ID to
implement native logging in e.g. Core as well, we need the help of the
Supervisor.

This change implements a similar feature fully in Supervisor's DockerAPI
class that orchestrates the lifetime of all containers managed by
Supervisor. The files are created in the SUPERVISOR_DATA directory, as
they need to be persisted between reboots, just as the instances of
Docker containers are.
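
As a rough sketch of the idea, writing a cidfile amounts to persisting the container ID under a data directory so it can later be mounted into the container. The ``cid_files`` subdirectory, the ``.cid`` suffix and the helper name are assumptions for illustration, not the layout used by the change.

```python
import tempfile
from pathlib import Path


def write_cidfile(data_dir: Path, name: str, container_id: str) -> Path:
    """Persist a container ID to <data_dir>/cid_files/<name>.cid (assumed layout)."""
    cid_dir = data_dir / "cid_files"
    cid_dir.mkdir(parents=True, exist_ok=True)
    cidfile = cid_dir / f"{name}.cid"
    cidfile.write_text(container_id, encoding="utf-8")
    return cidfile


# A temporary directory stands in for the persistent data directory.
with tempfile.TemporaryDirectory() as tmp:
    path = write_cidfile(Path(tmp), "homeassistant", "0123abcd")
    file_name = path.name
    content = path.read_text(encoding="utf-8")
```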

Supervisor's cidfile must be created when starting the Supervisor
itself, for that see home-assistant/operating-system#4276.

* Address review comments, fix mounting of the cidfile
2025-09-09 13:38:31 +02:00
Stefan Agner 2d12920b35 Stop refreshing the update information on outdated OS versions (#6098)
* Stop refreshing the update information on outdated OS versions

Add `JobCondition.OS_SUPPORTED` to the updater job to avoid
refreshing update information when the OS version is unsupported.

This effectively freezes installations on unsupported OS versions
and blocks Supervisor updates. Once deployed, this ensures that any
Supervisor will always run on at least the minimum supported OS
version.

This requires moving the OS version check before Supervisor updater
initialization to allow the `JobCondition.OS_SUPPORTED` to work
correctly.

* Run only OS version check in setup loads

Instead of running a full system evaluation, only run the OS version
check right after the OS manager is loaded. This allows the
updater job condition to work correctly without running the full
system evaluation, which is not needed at this point.

* Prevent Core and Add-on updates on unsupported OS versions

Also prevent Home Assistant Core and Add-on updates on unsupported OS
versions. We could imply `JobCondition.SUPERVISOR_UPDATED` whenever
OS is outdated, but this would also prevent the OS update itself. So
we need this separate condition everywhere where
`JobCondition.SUPERVISOR_UPDATED` is used except for OS updates.

It should also be safe to let the add-on store update, we simply
don't allow the add-on to be installed or updated if the OS is
outdated.

* Remove unnecessary Host info update

It seems that the CPE information is already loaded in the HostInfo
object. Remove the unnecessary update call.

* Fix pytest

* Delay refreshing of update data

Delay refreshing of update data until after setup phase. This allows
JobCondition.OS_SUPPORTED to be used safely. We still have to fetch the
updater data in case OS information is outdated. This typically happens
on device wipe.

Note also that plug-ins will automatically refresh updater data in case
it is missing the latest version information.

This will reverse the order of updates when new plug-in and
Supervisor update information is available (e.g. on first startup):
Previously the updater data got refreshed before the plug-ins started,
which caused them to update first. Then the Supervisor got updated in
startup phase. Now the updater data gets refreshed in startup phase,
which then causes the Supervisor to update first before the plug-ins
get updated after Supervisor restart.

* Fix pytest

* Fix updater tests

* Add new tests to verify that updater reload is skipped

* Fix pylint

* Apply suggestions from code review

Co-authored-by: Mike Degatano <michael.degatano@gmail.com>

* Add debug message when we delay version fetch

---------

Co-authored-by: Mike Degatano <michael.degatano@gmail.com>
2025-08-22 11:09:56 +02:00
Mike Degatano 207b665e1d Send progress updates during image pull for install/update (#6102)
* Send progress updates during image pull for install/update

* Add extra to tests about job APIs

* Send out-of-date progress to Sentry and combine done event

* Pulling container image layer
2025-08-22 10:41:10 +02:00
Mike Degatano 8a82b98e5b Improved error handling for docker image pulls (#6095)
* Improved error handling for docker image pulls

* Fix mocking in tests due to api use change
2025-08-13 18:05:27 +02:00
Stefan Agner 9a0f530a2f Add Supervisor connectivity check after DNS restart (#6005)
* Add Supervisor connectivity check after DNS restart

When the DNS plug-in got restarted, check Supervisor connectivity
in case the DNS plug-in configuration change influenced Supervisor
connectivity. This is helpful when a DHCP server gets started after
Home Assistant is up. In that case the network provided DNS server
(local DNS server) becomes available after the DNS plug-in restart.

Without this change, the Supervisor connectivity will remain false
until a Job triggers a connectivity check, for example the
periodic update check (which causes an updater and store reload) by
Core.

* Fix pytest and add coverage for new functionality
2025-07-10 11:08:10 +02:00
Stefan Agner 953f7d01d7 Improve DNS plug-in restart (#5999)
* Improve DNS plug-in restart

Instead of simply going by the PrimaryConnection change, use the DnsManager
Configuration property. This property is ultimately used to write the
DNS plug-in configuration, so it is really the relevant information
we pass on to the plug-in.

* Check for changes and restart DNS plugin

* Check for changes in plug-in DNS

Cache last local (NetworkManager) provided DNS servers. Check against
this DNS server list when deciding when to restart the DNS plug-in.
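
The cache-and-compare step can be sketched in a few lines. ``DnsRestartTracker`` and ``locals_changed`` are hypothetical names, and treating the very first report as a change is an assumption of this sketch.

```python
class DnsRestartTracker:
    """Hypothetical helper: remembers the last DNS server list it was given."""

    def __init__(self) -> None:
        self._cached_locals = None

    def locals_changed(self, servers: list) -> bool:
        """Return True (restart needed) only when the server list differs."""
        if servers == self._cached_locals:
            return False
        self._cached_locals = list(servers)  # copy to avoid aliasing the caller's list
        return True


tracker = DnsRestartTracker()
decisions = [
    tracker.locals_changed(["dns://192.168.1.1"]),  # first report
    tracker.locals_changed(["dns://192.168.1.1"]),  # unchanged
    tracker.locals_changed(["dns://192.168.1.1", "dns://192.168.1.2"]),  # server added
]
```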

* Check connectivity unthrottled in certain situations

* Fix pytest

* Fix pytest

* Improve test coverage for DNS plugins restart functionality

* Apply suggestions from code review

Co-authored-by: Mike Degatano <michael.degatano@gmail.com>

* Debounce local DNS changes and event based connectivity checks

* Remove connection check logic

* Remove unthrottled connectivity check

* Fix delayed call

* Store restart task and cancel in case a restart is running

* Improve DNS configuration change tests

* Remove stale code

* Improve DNS plug-in tests, less mocking

* Cover multiple private functions at once

Improve tests around notify_locals_changed() to cover multiple
functions at once.

---------

Co-authored-by: Mike Degatano <michael.degatano@gmail.com>
2025-07-09 11:35:03 +02:00
Felipe Santos b8852872fe Remove anonymous volumes when removing containers (#5977)
* Remove anonymous volumes when removing containers

* Add tests for docker.run_command()
2025-06-30 13:31:41 +02:00
Mike Degatano 0e8ace949a Fix mypy issues in plugins and resolution (#5946)
* Fix mypy issues in plugins

* Fix mypy issues in resolution module

* fix misses in resolution check

* Fix signatures on evaluate methods

* nitpick fix suggestions
2025-06-16 14:12:47 -04:00
Stefan Agner 85f8107b60 Recreate aiohttp ClientSession after DNS plug-in load (#5862)
* Recreate aiohttp ClientSession after DNS plug-in load

Create a temporary ClientSession early in case we need to load version
information from the internet. This doesn't use the final DNS setup
and hence might fail to load in certain situations since we don't have
the fallback mechanisms in place yet. But if the DNS container image
is present, we'll continue the setup and load the DNS plug-in. We then
can recreate the ClientSession such that it uses the DNS plug-in.

This works around an issue with aiodns, which today doesn't reload
`resolv.conf` automatically when it changes. This led to Supervisor
using the initial `resolv.conf` as created by Docker. It meant that
we did not use the DNS plug-in (and its fallback capabilities) in
Supervisor. Also it meant that changes to the DNS setup at runtime
did not propagate to the aiohttp ClientSession (as observed in #5332).

* Mock aiohttp.ClientSession for all tests

Currently in several places pytest actually uses the aiohttp
ClientSession and reaches out to the internet. This is not ideal
for unit tests and should be avoided.

This creates several new fixtures to aid this effort: The `websession`
fixture simply returns a mocked aiohttp.ClientSession, which can be
used whenever a function is tested which needs the global websession.

A separate new fixture to mock the connectivity check named
`supervisor_internet` since this is often used through the Job
decorator which requires INTERNET_SYSTEM.

And the `mock_update_data` uses the already existing update json
test data from the fixture directory instead of loading the data
from the internet.

* Log ClientSession nameserver information

When recreating the aiohttp ClientSession, log exactly which
nameservers are going to be used.

* Refuse ClientSession initialization when API is available

Previous attempts to reinitialize the ClientSession have shown
use of the ClientSession after it was closed due to API requests
being handled in parallel to the reinitialization (see #5851).
Make sure this is not possible by refusing to reinitialize the
ClientSession when the API is available.

* Fix pytests

Also make sure we don't create aiohttp ClientSession objects unnecessarily.

* Apply suggestions from code review

Co-authored-by: Jan Čermák <sairon@users.noreply.github.com>

---------

Co-authored-by: Jan Čermák <sairon@users.noreply.github.com>
2025-05-06 16:23:40 +02:00
Mike Degatano 52cc17fa3f Delay initial version fetch until there is connectivity (#5603)
* Delay initial version fetch until there is connectivity

* Add test

* Only mock get not whole websession object

* drive delayed fetch off of supervisor connectivity not host

* Fix test to not rely on sleep guessing to track tasks

* Use fixture to remove job throttle temporarily
2025-02-11 13:22:33 +01:00
Stefan Agner f6faa18409 Bump pre-commit ruff to 0.5.7 and reformat (#5242)
It seems that the codebase is not formatted with the latest ruff
version. This PR reformats the codebase with ruff 0.5.7.
2024-08-13 20:53:56 +02:00
Mike Degatano 5ee7d16687 Add hard-coded image fallback for plugins for offline start (#5204) 2024-07-25 13:45:38 +02:00
Mike Degatano 50a2e8fde3 Allow adoption of existing data disk (#4991)
* Allow adoption of existing data disk

* Fix existing tests

* Add test cases and fix image issues

* Fix addon build test

* Run checks during setup not startup

* Addon load mimics plugin and HA load for docker part

* Default image accessible in except
2024-04-10 10:25:22 +02:00
Mike Degatano 7fd6dce55f Migrate to Ruff for lint and format (#4852)
* Migrate to Ruff for lint and format

* Fix pylint issues

* DBus property sets into normal awaitable methods

* Fix tests relying on separate tasks in connect

* Fixes from feedback
2024-02-05 11:37:39 -05:00
Mike Degatano 3cc6bd19ad Mark system as unhealthy on OSError Bad message errors (#4750)
* Bad message error marks system as unhealthy

* Finish adding test cases for changes

* Rename test file for uniqueness

* bad_message to oserror_bad_message

* Omit some checks and check for network mounts
2023-12-21 18:05:29 +01:00
Mike Degatano 37c1c89d44 Remove race with watchdog during backup, restore and update (#4635)
* Remove race with watchdog during backup, restore and update

* Fix pylint issues and test

* Stop after image pull during update

* Add test for max failed attempts for plugin watchdog
2023-10-19 22:01:56 -04:00
Mike Degatano f4b43739da Skip plugin update on startup if supervisor out of date (#4515) 2023-09-01 11:18:22 +02:00
Mike Degatano 1611beccd1 Add job group execution limit option (#4457)
* Add job group execution limit option

* Fix pylint issues

* Assign variable before usage

* Cleanup jobs when done

* Remove isinstance check for performance

* Explicitly raise from None

* Add some more documentation info
2023-08-08 16:49:17 -04:00
Mike Degatano 1f92ab42ca Reduce executor code for docker (#4438)
* Reduce executor code for docker

* Fix pylint errors and move import/export image

* Fix test and a couple other risky executor calls

* Fix dataclass and return

* Fix test case and add one for corrupt docker

* Add some coverage

* Undo changes to docker manager startup
2023-07-18 11:39:39 -04:00
Mike Degatano 5ced4e2f3b Update to python 3.11 (#4296) 2023-05-22 19:12:34 +02:00
Mike Degatano 14fcda5d78 Sentry only loaded when diagnostics on (#3993)
* Sentry only loaded when diagnostics on

* Logging when sentry is closed
2022-11-13 21:23:52 +01:00
Mike Degatano c8f184f24c Add auto update option (#3769)
* Add update freeze option

* Freeze to auto update and plugin condition

* Add tests

* Add supervisor_version evaluation

* OS updates require supervisor up to date

* Run version check during startup
2022-08-15 12:13:22 -04:00
Mike Degatano ebeff31bf6 Pass supervisor debug value to audio (#3752) 2022-07-27 17:15:54 +02:00
Mike Degatano d19166bb86 Docker events based watchdog and docker healthchecks (#3725)
* Docker events based watchdog

* Separate monitor from DockerAPI since it needs coresys

* Move monitor into dockerAPI

* Fix properties on coresys

* Add watchdog tests

* Added tests

* pylint issue

* Current state failures test

* Thread-safe event processing

* Use labels property
2022-07-15 09:21:59 +02:00
Mike Degatano 8bb4596d04 Add API option to disable fallback DNS (#3586)
* Add API option to disable fallback DNS

* DNS unsupported evaluation and fallback in sentry
2022-04-25 18:15:40 +02:00
Mike Degatano f3e2ccce43 Create issue for detected DNS server problem (#3578)
* Create issue for detected DNS server problem

* Validate behavior on restart as well

* tls:// not supported, remove check

* Move DNS server checks into resolution checks

* Revert all changes to plugins.dns

* Run DNS server checks if affected

* Mock aiodns query during all checks tests
2022-04-21 10:55:49 +02:00