Add check_oserror() method to ResolutionManager with an extensible
errno-to-unhealthy-reason mapping. Replace ~20 inline
`if err.errno == errno.EBADMSG` checks across 14 files with a single
call to self.sys_resolution.check_oserror(err). This makes it easy to
add handling for additional filesystem errors (e.g. EIO, ENOSPC) in
one place.
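
A minimal sketch of the idea, not the actual implementation: the enum
member, the mapping name, and the reduced ResolutionManager below are
assumptions made for illustration. Only check_oserror() and the EBADMSG
check come from the description above.

```python
import errno
from enum import Enum


class UnhealthyReason(str, Enum):
    """Hypothetical subset of unhealthy reasons, for illustration only."""

    OSERROR_BAD_MESSAGE = "oserror_bad_message"


# Assumed mapping name; further errnos (e.g. EIO, ENOSPC) would be added here.
ERRNO_TO_REASON = {
    errno.EBADMSG: UnhealthyReason.OSERROR_BAD_MESSAGE,
}


class ResolutionManager:
    """Heavily reduced stand-in that only tracks unhealthy reasons."""

    def __init__(self) -> None:
        self.unhealthy = []

    def check_oserror(self, err: OSError) -> None:
        """Mark the system unhealthy if the errno has a mapped reason."""
        reason = ERRNO_TO_REASON.get(err.errno)
        if reason is not None and reason not in self.unhealthy:
            self.unhealthy.append(reason)
```

Call sites then shrink to a single `check_oserror(err)` call inside their
`except OSError as err:` block.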
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Migrate builder workflow to new builder actions
Migrate Supervisor image build to new builder actions. The resulting images
should be identical to those built by the builder.
Refs #6646 - does not implement multi-arch manifest publishing (will be done in
a follow-up)
* Update devcontainer version to 3
systemd only emits bus signals (including PropertiesChanged) when at
least one client has called Subscribe() on the Manager interface. On
regular HAOS systems, systemd-logind calls Subscribe(), which enables
signals for all bus clients. However, in environments without
systemd-logind (such as the Supervisor devcontainer with systemd), no
signals are emitted, causing the firewall unit wait to time out.
Explicitly calling Subscribe() has no downsides and makes it clear
that the Supervisor relies on these signals. There is no need to call
Unsubscribe() as systemd automatically tracks clients and stops
emitting signals when all subscribers have disconnected from the bus.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add iptables rules via a systemd transient unit to drop traffic
addressed to the bridge gateway IP from non-bridge interfaces.
The firewall manager waits for the transient unit to complete and
verifies success via D-Bus property change signals. On failure, the
system is marked unhealthy and host-network add-ons are prevented
from booting.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Further slow down automatic update rollout to reduce pressure on container
registry infrastructure (GHCR rate limiting). Plugins are staggered 2 minutes
apart starting at 12h, Supervisor moves from 12h to 24h.
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Slow down the automatic Supervisor update rollout to reduce pressure
on the container registry infrastructure (GHCR rate limiting).
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* Wait for addon startup task before unload to prevent data access race
Replace the cancel-based approach in unload() with an await of the outer
_wait_for_startup_task. The container removal and state change resolve the
startup event naturally, so we just need to ensure the task completes
before addon data is removed. This prevents a KeyError on self.name access
when _wait_for_startup times out after data has been removed.
Also simplify _wait_for_startup by removing the unnecessary inner task
wrapper, since asyncio.wait_for can await the event directly.
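
As a rough illustration of awaiting an event without an inner task wrapper,
the sketch below passes event.wait() straight to asyncio.wait_for. The
function name, return value, and demo() helper are assumptions and do not
mirror the real _wait_for_startup.

```python
import asyncio


async def wait_for_startup(event: asyncio.Event, timeout: float) -> bool:
    """Return True if the event was set before the timeout, else False."""
    try:
        # event.wait() is a coroutine, so wait_for can await it directly.
        await asyncio.wait_for(event.wait(), timeout=timeout)
    except asyncio.TimeoutError:
        return False
    return True


async def demo():
    """Run one wait that times out and one that completes."""
    event = asyncio.Event()
    timed_out = await wait_for_startup(event, timeout=0.01)
    event.set()
    completed = await wait_for_startup(event, timeout=0.01)
    return timed_out, completed
```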
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Drop asyncio.sleep() in test_manager.py
* Only clear startup task reference if still the current task
Prevent a race where an older _wait_for_startup task's finally block
could wipe the reference to a newer task, causing unload() to skip
the await.
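
The guard can be sketched as follows. StartupTracker and demo() are
hypothetical names, and the class is a reduced stand-in rather than the
add-on code itself; the point is only the identity check in the finally
block.

```python
import asyncio
from typing import Optional


class StartupTracker:
    """Hypothetical holder for the most recent startup wait task."""

    def __init__(self) -> None:
        self.task: Optional[asyncio.Task] = None

    async def _wait_for_startup(self, event: asyncio.Event) -> None:
        try:
            await event.wait()
        finally:
            # Only clear the reference if it still points at this task;
            # a newer task may have replaced it in the meantime.
            if self.task is asyncio.current_task():
                self.task = None

    def start(self, event: asyncio.Event) -> asyncio.Task:
        self.task = asyncio.create_task(self._wait_for_startup(event))
        return self.task


async def demo() -> bool:
    tracker = StartupTracker()
    old_event, new_event = asyncio.Event(), asyncio.Event()
    old = tracker.start(old_event)
    new = tracker.start(new_event)  # replaces the reference to `old`
    old_event.set()
    await old  # the old task finishes and runs its finally block
    still_tracked = tracker.task is new
    new_event.set()
    await new  # the current task clears its own reference
    return still_tracked and tracker.task is None
```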
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Reuse existing pending startup wait task when addon is already running
If start() is called while the addon is already running and a startup
wait task is still pending, return the existing task instead of creating
a new one.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
aiohttp's default max_msg_size of 4MB causes the ingress WebSocket proxy
to silently drop connections when an add-on sends messages larger than
that limit (e.g. Zigbee2MQTT's bridge/devices payload with many devices).
Setting max_msg_size=0 removes the limit on both the server-side
WebSocketResponse and the upstream ws_connect, fixing dropped connections
for add-ons that produce large WebSocket messages.
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: J. Nick Koston <nick@koston.org>
* Fix fallback time sync, create repair issue if time is out of sync
The "poor man's NTP" using the whois service didn't work because it attempted
to sync the time when the NTP service was enabled, which is rejected by the
timedated service. To fix this, Supervisor now first disables the
systemd-timesyncd service and creates a repair issue before adjusting the time.
The timesyncd service stays disabled until the fixup is submitted. Theoretically,
if the time moves backwards from an invalid time in the future,
systemd-timesyncd could otherwise restore the wrong time from a timestamp if we
did that after the time was set.
Also, the sync is now performed if the time is more than 1 hour off and in both
directions (previously it only intervened if it was more than 3 days in the
past).
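
A minimal sketch of the new threshold check, assuming a hypothetical helper
name and timezone-aware datetimes; how the reference time is obtained is out
of scope here.

```python
from datetime import datetime, timedelta, timezone

MAX_DRIFT = timedelta(hours=1)  # threshold taken from the description above


def needs_time_adjustment(system_time: datetime, reference_time: datetime) -> bool:
    """Return True if the clocks differ by more than MAX_DRIFT either way."""
    return abs(system_time - reference_time) > MAX_DRIFT


reference = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
ahead = needs_time_adjustment(reference + timedelta(hours=2), reference)
behind = needs_time_adjustment(reference - timedelta(days=4), reference)
close = needs_time_adjustment(reference + timedelta(minutes=30), reference)
```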
Fixes #6015, refs #6549
* Update test_adjust_system_datetime_if_time_behind
The core_security check (HA < 2021.1.5 with custom components) and the
ResolutionNotify class that created persistent notifications for it are
no longer needed. The minimum supported HA version is well past 2021.1.5,
so this check can never trigger. The notify module was the only consumer
of persistent notifications and had no other users.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* Treat empty string password as None in backup restore
Work around a securetar 2026.2.0 bug where an empty string password
sets encrypted=True but fails to derive a key, leading to an
AttributeError on restore. This also restores consistency with backup
creation which uses a truthiness check to skip encryption for empty
passwords.
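
The normalization amounts to a truthiness check. The helper name below is
hypothetical and only illustrates the rule.

```python
from typing import Optional


def normalize_password(password: Optional[str]) -> Optional[str]:
    """Treat an empty string the same as no password at all."""
    return password or None
```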
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Explicitly mention that "" is treated as no password
* Add tests for empty string password handling in backups
Verify that empty string password is treated as no password on both
backup creation (not marked as protected) and restore (normalized to
None in set_password).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Improve comment
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* Use Python 3.14(.3) in CI and base image
Update base image to the latest tag using Python 3.14.3 and update Python
version in CI workflows to 3.14.
With Python 3.14, backports.zstd is no longer necessary as it's now available
in the standard library.
* Update wheels ABI in the wheels builder to cp314
* Use explicit Python fix version in GH actions
Explicitly specify Python 3.14.3, as the setup-python action otherwise defaults
to 3.14.2, leading to different versions in CI and in production.
* Update Python version references in pyproject.toml
* Fix all ruff quoted-annotation (UP037) errors
* Revert unquoting of DBus types in tests and ignore UP037 where needed
* Respect auto-update setting for plug-in auto-updates
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Also skip auto-updating plug-ins in decorator
* Raise if auto-update flag is not set and plug-in is not up to date
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
aiohttp's BasicAuth.decode() raises ValueError for any non-Basic auth
method (e.g. Bearer tokens). This propagated as an unhandled exception,
causing a 500 response instead of the expected 401 Unauthorized.
Catch the ValueError in _process_basic() and raise HTTPUnauthorized with
the WWW-Authenticate realm header so clients get a proper 401 response.
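
The pattern can be sketched without aiohttp. decode_basic() below is a
stand-in that, like the decoder described above, raises ValueError for
non-Basic methods; process_basic(), its return shape, and the realm value
are assumptions for illustration.

```python
import base64
from http import HTTPStatus


def decode_basic(header: str):
    """Stand-in decoder: raises ValueError for any non-Basic method."""
    method, _, payload = header.partition(" ")
    if method.lower() != "basic":
        raise ValueError(f"Unknown authorization method {method}")
    # Invalid base64 or non-UTF-8 payloads also raise ValueError subclasses.
    user, _, password = base64.b64decode(payload).decode().partition(":")
    return user, password


def process_basic(header: str):
    """Return (status, headers); a bad auth header yields 401, not 500."""
    try:
        decode_basic(header)
    except ValueError:
        # Hypothetical realm value.
        return HTTPStatus.UNAUTHORIZED, {"WWW-Authenticate": 'Basic realm="Example"'}
    return HTTPStatus.OK, {}
```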
Fixes SUPERVISOR-BFG
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The _migrate function in addons/validate.py is the first validator in the
SCHEMA_ADDON_CONFIG All() chain and was called directly with raw config data.
If a malformed add-on config file contained a non-dict value (e.g. a string),
config.get() would raise an AttributeError instead of a proper voluptuous
Invalid error, causing an unhandled exception.
Add an isinstance check at the top of _migrate to raise vol.Invalid for
non-dict inputs, letting validation fail gracefully.
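
A dependency-free sketch of the guard: the Invalid class stands in for
voluptuous' Invalid, and migrate() is a reduced, hypothetical version of
_migrate that omits the actual migration steps.

```python
class Invalid(Exception):
    """Stand-in for voluptuous.Invalid so the sketch has no dependencies."""


def migrate(config):
    """First validator in the chain: reject non-dict input up front."""
    if not isinstance(config, dict):
        raise Invalid(
            f"Add-on config must be a dictionary, got {type(config).__name__}"
        )
    # The actual migration steps would follow; config.get() is safe here.
    return config
```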
Fixes SUPERVISOR-HMP
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* Drop unsupported architectures and machines from Supervisor
Since #5620 Supervisor no longer updates the version information on
unsupported architectures and machines. This means users can no longer
update to a newer version of Supervisor since that PR got released.
Furthermore since #6347 we also no longer build for these
architectures. With this, any code related to these architectures
becomes dead code and should be removed.
This commit removes all references to the deprecated architectures and
machines from Supervisor.
This affects the following architectures:
- armhf
- armv7
- i386
And the following machines:
- odroid-xu
- qemuarm
- qemux86
- raspberrypi
- raspberrypi2
- raspberrypi3
- raspberrypi4
- tinker
* Create issue if an app using a deprecated architecture is installed
This adds a check to the resolution system to detect if an app is
installed that uses a deprecated architecture. If so, it will show a
warning to the user and recommend that they uninstall the app.
* Formally deprecate machine add-on configs as well
Not only deprecate add-on configs for unsupported architectures, but
also for unsupported machines.
* For installed add-ons architecture must always exist
Fail hard in case of missing architecture, as this is a required field
for installed add-ons. This will prevent the Supervisor from running
with an unsupported configuration and causing further issues down the
line.
* Fix add-on build using wrong architecture for non-native arch add-ons
When building a locally-built add-on (no image tag), the architecture
was always set to sys_arch.default (e.g. amd64 on x86_64) instead of
matching against the add-on's declared architectures. This caused an
i386-only add-on to incorrectly build as amd64.
Use sys_arch.match() against the add-on's declared arch list in all
code paths: the arch property, image name generation, BUILD_ARCH build
arg, and default base image selection.
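
The matching idea can be sketched as picking the first supported
architecture that the add-on declares, instead of always using the default.
match_arch() is a hypothetical free function, not the real sys_arch.match()
signature.

```python
def match_arch(supported, declared):
    """Return the first supported architecture that the add-on declares."""
    for arch in supported:
        if arch in declared:
            return arch
    raise ValueError(f"No supported architecture in {declared}")


# The host prefers amd64 but the add-on only declares i386.
chosen = match_arch(["amd64", "i386"], ["i386"])
```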
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Use CpuArch enums to fix tests
* Explicitly set _supported_arch as new list to fix tests
* Fix pytests
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* Use verbose log output for plug-ins
All three plug-ins which support logging (dns, multicast and audio)
should use the verbose log format by default to make sure the log lines
are annotated with a timestamp. Introduce a new flag default_verbose for
advanced logs.
* Use default_verbose for host logs as well
Use the new default_verbose flag for advanced logs, to make it more
explicit that we want timestamps for host logs as well.
The /os/info API endpoint has been using the D-Bus property TimeUSec, which
was cached between requests, so the returned time did not always match the
current time on the host system at the time of the request. Since there is no
reason to use the D-Bus API for the time (Supervisor runs on the same machine
and time is global), simply format the current datetime object in Python and
return it in the response.
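
A minimal sketch of computing the value per request. The helper name, the
use of UTC, and the ISO 8601 output are assumptions, as the exact response
format is not stated here.

```python
from datetime import datetime, timezone


def current_host_time() -> str:
    """Format the current time on each call instead of reading a cached value."""
    return datetime.now(timezone.utc).isoformat()
```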
Fixes #6581