Layers that already exist locally should not count towards download
progress since there's nothing to download for them. Only layers that
need pulling are included in the progress calculation.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Refactor Docker image pull progress to use a simpler count-based approach
where each layer contributes equally (100% / total_layers) regardless of
size. This replaces the previous size-weighted calculation that was
susceptible to progress regression.
The core issue was that Docker rate-limits concurrent downloads (~3 at a
time) and reports layer sizes only when downloading starts. With size-
weighted progress, large layers appearing late would cause progress to
drop dramatically (e.g., 59% -> 29%) as the total size increased.
The new approach:
- Each layer contributes equally to overall progress
- Per-layer progress: 70% download weight, 30% extraction weight
- Progress only starts after first "Downloading" event (when layer
count is known)
- Always caps at 99% - job completion handles final 100%
This simplifies the code by moving progress tracking to a dedicated
module (pull_progress.py) and removing complex size-based scaling logic
that tried to account for unknown layer sizes.
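For illustration, a minimal Python sketch of the count-based calculation described above (function and field names are assumptions; the actual logic lives in pull_progress.py):

```python
def overall_progress(layers: dict[str, dict[str, float]]) -> float:
    """Count-based pull progress: every layer weighs 100% / total_layers."""
    if not layers:  # progress only starts once the layer count is known
        return 0.0
    per_layer = 100.0 / len(layers)
    total = 0.0
    for state in layers.values():
        # 70% of a layer's share for downloading, 30% for extraction
        download = min(state.get("download", 0.0), 1.0) * 0.7
        extract = min(state.get("extract", 0.0), 1.0) * 0.3
        total += per_layer * (download + extract)
    return min(total, 99.0)  # job completion reports the final 100%
```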
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix progress when using containerd snapshotter
* Add test for tiny image download under containerd-snapshotter
* Fix API tests after progress allocation change
* Fix test for auth changes
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Stefan Agner <stefan@agner.ch>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Implement Supervisor API for home-assistant/os-agent#238, adding the possibility
to schedule a migration either to the containerd overlayfs driver or to the
overlay2 graph driver the next time the device is rebooted. While it's
technically part of the D-Bus OS interface, in Supervisor's abstraction it makes
more sense to put it under the `/docker` endpoints.
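A purely hypothetical client-side call for illustration (the exact endpoint path and payload shape are assumptions, not the actual Supervisor API):

```python
import aiohttp

async def schedule_storage_migration(driver: str) -> None:
    """Ask the Supervisor to migrate the Docker storage driver on next reboot."""
    async with aiohttp.ClientSession() as session:
        # driver would be e.g. "overlayfs" (containerd) or "overlay2" (graph)
        await session.post(
            "http://supervisor/docker/migrate",
            json={"storage_driver": driver},
        )
```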
* Pass registry credentials to add-on build for private base images
When building add-ons that use a base image from a private registry,
the build would fail because credentials configured via the Supervisor
API were not passed to the Docker-in-Docker build container.
This fix (sketched below):
- Adds get_docker_config_json() to generate a Docker config.json with
registry credentials for the base image
- Creates a temporary config file and mounts it into the build container
at /root/.docker/config.json so BuildKit can authenticate when pulling
the base image
- Cleans up the temporary file after build completes
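A hedged sketch of what such a helper could look like (the name get_docker_config_json comes from the text above; the signature is an assumption):

```python
import base64
import json

def get_docker_config_json(registry: str, username: str, password: str) -> str:
    """Render a Docker config.json with credentials for one registry.

    The result is written to a temporary file and bind-mounted into the
    build container at /root/.docker/config.json for BuildKit.
    """
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return json.dumps({"auths": {registry: {"auth": token}}})
```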
Fixes #6354
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix pylint errors
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Refactor registry credential extraction into shared helper
Extract duplicate logic for determining which registry matches an image
into a shared `get_registry_for_image()` method in `DockerConfig`. This
method is now used by both `DockerInterface._get_credentials()` and
`AddonBuild.get_docker_config_json()`.
Move `DOCKER_HUB` and `IMAGE_WITH_HOST` constants to `docker/const.py`
to avoid circular imports between manager.py and interface.py.
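A rough sketch of the shared helper (the regex and constant values shown here are assumptions based on the description above):

```python
import re

DOCKER_HUB = "hub.docker.com"
IMAGE_WITH_HOST = re.compile(r"^((?:[a-z0-9]+(?:[-._][a-z0-9]+)*\.)+[a-z]{2,})/")

def get_registry_for_image(registries: dict[str, dict], image: str) -> str | None:
    """Return the configured registry matching an image reference, if any."""
    if match := IMAGE_WITH_HOST.match(image):
        # image name carries an explicit registry host, e.g. ghcr.io/...
        host = match.group(1)
        return host if host in registries else None
    # bare image names resolve against Docker Hub
    return DOCKER_HUB if DOCKER_HUB in registries else None
```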
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* ruff format
* Document raises
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Mike Degatano <michael.degatano@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Fix private registry authentication for aiodocker image pulls
After PR #6252 migrated image pulling from dockerpy to aiodocker,
private registry authentication stopped working. The old _docker_login()
method stored credentials in ~/.docker/config.json via dockerpy, but
aiodocker doesn't read that file - it requires credentials passed
explicitly via the auth parameter.
Changes (sketched below):
- Remove unused _docker_login() method (dockerpy login was ineffective)
- Pass credentials directly to pull_image() via new auth parameter
- Add auth parameter to DockerAPI.pull_image() method
- Add unit tests for Docker Hub and custom registry authentication
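A minimal sketch of the new call path, assuming aiodocker's documented auth parameter (surrounding names are illustrative):

```python
import aiodocker

async def pull_image(image: str, tag: str, username: str, password: str):
    """Pull an image with credentials passed explicitly to aiodocker."""
    docker = aiodocker.Docker()
    try:
        # aiodocker does not read ~/.docker/config.json; credentials
        # must be handed over via the auth parameter on every pull
        auth = {"username": username, "password": password}
        return await docker.images.pull(image, tag=tag, auth=auth)
    finally:
        await docker.close()
```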
Fixes #6345
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Ignore protected access in test
* Fix plug-in pull test
* Fix HA core tests
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Drop Debian 12 from supported OS list
With the deprecation of the Home Assistant Supervised installation method,
Debian 12 is no longer supported. This change removes Debian 12
from the list of supported operating systems in the evaluation logic.
* Improve tests
* Deprecate i386, armhf and armv7 Supervisor architectures
* Exclude Core from architecture deprecation checks
This still allows downloading the latest available Core version, even
on deprecated systems.
* Fix pytest
Add support for `no_colors` query parameter on all advanced logs API endpoints,
allowing users to optionally strip ANSI color sequences from log output. This
complements the existing color stripping on /latest endpoints added in #6319.
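Usage from a client might look like this (the endpoint path is illustrative; only the `no_colors` parameter comes from this change):

```python
import aiohttp

async def fetch_plain_logs() -> str:
    async with aiohttp.ClientSession() as session:
        async with session.get(
            "http://supervisor/host/logs",
            params={"no_colors": "true"},  # strip ANSI color sequences
        ) as resp:
            return await resp.text()
```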
* Handle pull events with complete progress details only
Under certain circumstances, Docker seems to send pull events with
incomplete progress details (i.e., missing 'current' or 'total' fields).
In practise, we've observed an empty dictionary for progress details
as well as missing 'total' field (while 'current' was present).
All events were using Docker 28.3.3 using the old, default Docker graph
backend.
* Fix docstring/comment
* Fix call_at to use event loop time base instead of Unix timestamp
The CoreSys.call_at method was incorrectly passing Unix timestamps
directly to asyncio.loop.call_at(), which expects times in the event
loop's monotonic time base. This caused scheduled jobs to be scheduled
approximately 55 years in the future (the difference between Unix epoch
time and monotonic time since boot).
The bug was masked by time-machine 2.19.0, which patched time.monotonic()
and caused loop.time() to return Unix timestamps. Time-machine 3.0.0
removed this patching (as it caused event loop freezes), exposing the bug.
Fix by converting the datetime to event loop time base:
- Calculate delay from current Unix time to scheduled Unix time
- Add delay to current event loop time to get scheduled loop time
Also simplify test_job_scheduled_at to avoid time-machine's async
context managers, following the pattern of test_job_scheduled_delay.
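A sketch of the conversion, assuming an aware datetime (the helper name is illustrative):

```python
import asyncio
from datetime import datetime, timezone

def call_at(loop: asyncio.AbstractEventLoop, when: datetime, callback, *args):
    """Schedule a callback at a wall-clock time on an asyncio loop.

    loop.call_at() expects the loop's monotonic time base, so the Unix
    timestamp is first converted into a relative delay.
    """
    delay = when.timestamp() - datetime.now(timezone.utc).timestamp()
    # a datetime in the past yields a negative delay; fire immediately
    return loop.call_at(loop.time() + max(delay, 0.0), callback, *args)
```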
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add comment about datetime in the past
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Strip ANSI escape color sequences from /latest log responses
Strip ANSI sequences of CSI commands [1] used for log coloring from
/latest log endpoints. These endpoints were primarily designed for log
downloads, where colors are mostly unwanted. Add an optional argument
for stripping the colors from the logs and enable it for the /latest
endpoints.
[1] https://en.wikipedia.org/wiki/ANSI_escape_code#CSIsection
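For reference, the widely used CSI pattern looks roughly like this in Python (a sketch, not necessarily the exact regex used by the Supervisor):

```python
import re

# ESC '[' + parameter bytes (0x30-0x3F) + intermediate bytes (0x20-0x2F)
# + one final byte (0x40-0x7E), per the CSI grammar referenced above
ANSI_CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")

def strip_ansi(text: str) -> str:
    return ANSI_CSI_RE.sub("", text)
```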
* Refactor advanced logs' tests to use fixture factory
Introduce `advanced_logs_tester` fixture to simplify testing of advanced logs
in the API tests, declaring all the needed fixtures in a single place.
* Migrate images from dockerpy to aiodocker
* Add missing coverage and fix bug in repair
* Bind libraries to different files and refactor images.pull
* Use the same socket again
Try using the same socket again.
* Fix pytest
---------
Co-authored-by: Stefan Agner <stefan@agner.ch>
* Add context to Sentry events during setup phase
Since not all properties are safe to access, the current code avoids
adding any context during the initialization and setup phase. However,
quite a few reports happen during the setup phase. This change adds some
context to events during the setup phase as well, to make debugging easier.
* Drop default arch (not available during setup)
* Avoid adding Content-Type to non-body responses
The current code sets the content-type header for all responses
to the result's content_type property if upstream does not set a
content_type. The default value for content_type is
"application/octet-stream".
For responses that do not have a body (like 204 No Content or
304 Not Modified), setting a content-type header is unnecessary and
potentially misleading. Follow HTTP standards by only adding the
content-type header to responses that actually contain a body.
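A sketch of the guard in aiohttp terms (names are illustrative):

```python
from aiohttp import web

BODYLESS_STATUSES = {204, 304}

def set_default_content_type(response: web.Response, content_type: str) -> None:
    """Only add a Content-Type to responses that actually carry a body."""
    if response.status in BODYLESS_STATUSES:
        return  # 204 / 304 have no body; a Content-Type would be misleading
    if response.content_type == "application/octet-stream":
        # aiohttp's fallback default; replace it with the upstream value
        response.content_type = content_type
```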
* Add pytest for ingress proxy
* Preserve Content-Type header for HEAD requests in ingress API
* Fix docker image pull progress blocked by small layers
Small Docker layers (typically <100 bytes) can skip the downloading phase
entirely, going directly from "Pulling fs layer" to "Download complete"
without emitting any progress events with byte counts. This caused the
aggregate progress calculation to block indefinitely, as it required all
layer jobs to have their `extra` field populated with byte counts before
proceeding.
The issue manifested as parent job progress jumping from 0% to 97.9% after
long delays, as seen when a 96-byte layer held up progress reporting for
~50 seconds until it finally reached the "Extracting" phase.
Set a minimal `extra` field (current=1, total=1) when layers reach
"Download complete" without having gone through the downloading phase.
This allows the aggregate progress calculation to proceed immediately
while still correctly representing the layer as 100% downloaded.
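Sketched as event handling (layer_job and its extra field are stand-ins for the Supervisor's job objects):

```python
def handle_layer_event(event: dict, layer_job) -> None:
    """Track one layer's pull progress from Docker status events."""
    status = event.get("status")
    detail = event.get("progressDetail") or {}
    if status == "Downloading" and "current" in detail and "total" in detail:
        layer_job.extra = {"current": detail["current"], "total": detail["total"]}
    elif status == "Download complete" and layer_job.extra is None:
        # tiny layers skip the downloading phase entirely; mark them as
        # fully downloaded so aggregate progress is never blocked on them
        layer_job.extra = {"current": 1, "total": 1}
```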
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update test to capture issue correctly
* Improve pytest
* Fix pytest comment
* Fix pylint warning
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Formally deprecate CodeNotary build config
* Remove CodeNotary specific integrity checking
The current code is specific to how CodeNotary was doing integrity
checking. A future integrity checking mechanism will likely work
differently (e.g. through EROFS based containers). Remove the current
code to make way for a future implementation.
* Drop CodeNotary integrity fixups
* Drop unused tests
* Fix pytest
* Fix pytest
* Remove CodeNotary related exceptions and handling
Remove CodeNotary related exceptions and handling from the Docker
interface.
* Drop unnecessary comment
* Remove Codenotary specific IssueType/SuggestionType
* Drop Codenotary specific environment and secret reference
* Remove unused constants
* Introduce APIGone exception for removed APIs
Introduce a new exception class APIGone to indicate that certain API
features have been removed and are no longer available. Update the
security integrity check endpoint to raise this new exception instead
of a generic APIError, providing clearer communication to clients that
the feature has been intentionally removed.
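Sketched in aiohttp terms (the mapping layer is an assumption; only the APIGone name comes from this change):

```python
from aiohttp import web

class APIGone(Exception):
    """Raised when a permanently removed API feature is requested."""

def error_to_response(err: Exception) -> web.Response:
    """Map API exceptions to HTTP responses; APIGone becomes 410 Gone."""
    status = 410 if isinstance(err, APIGone) else 400
    return web.json_response(
        {"result": "error", "message": str(err)}, status=status
    )
```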
* Drop content trust
A cosign based signature verification will likely be named differently
to avoid confusion with existing implementations. For now, remove the
content trust option entirely.
* Drop code sign test
* Remove source_mods/content_trust evaluations
* Remove content_trust reference in bootstrap.py
* Fix security tests
* Drop unused tests
* Drop codenotary from schema
Since we have "remove extra" in voluptuous, we can remove the
codenotary field from the addon schema.
* Remove content_trust from tests
* Remove content_trust unsupported reason
* Remove unnecessary comment
* Remove unrelated pytest
* Remove unrelated fixtures
When fetching the host info without UDisks2 connected on D-Bus, it would raise an exception and disable managing add-ons via Home Assistant (and other GUI functions that need the host info status).
* Treat containerd snapshotter/overlayfs driver as supported
With home-assistant/operating-system#4252 the storage driver would
change to "overlayfs". We don't want the system to be marked as
unsupported. It should be safe to treat it as supported even now, so add
it to the list of allowed values.
* Flip the logic
(note for self: don't forget to check for unstaged changes before push)
* Set valid storage for invalid logging test case
* Add support for ulimit in addon config
Similar to docker-compose, this adds support for setting ulimits
for addons via the addon config. This is useful e.g. for InfluxDB,
which on its own does not support setting higher open file descriptor
limits but recommends increasing them on the host.
* Make soft and hard limit mandatory if ulimit is a dict
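For illustration, the two accepted shapes could look like this in a parsed add-on config (hypothetical values):

```python
# shorthand: a single integer sets soft == hard
# dict form: both soft and hard are mandatory
ulimits = {
    "nofile": 65535,
    "nproc": {"soft": 2048, "hard": 4096},
}
```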
* Add progress reporting to addon, HA and Supervisor updates
* Fix assert in test
* Add progress to addon, core, supervisor updates/installs
* Fix double install bug in addons install
* Remove initial_install and re-arrange order of load
* Fix CID file handling to prevent directory creation
It seems that under certain conditions Docker creates a directory
instead of a file for the CID file. This change ensures that
the CID file is always created as a file, and any existing directory
is removed before creating the file.
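A minimal sketch of the guard (path handling is illustrative):

```python
import shutil
from pathlib import Path

def write_cidfile(path: Path, container_id: str) -> None:
    """Ensure the CID file is a regular file before writing it."""
    if path.is_dir():
        shutil.rmtree(path)  # Docker sometimes leaves a directory behind
    path.write_text(container_id)
```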
* Fix tests
* Fix pytest
* Fix range header to correctly fetch latest logs
Add a colon before line numbers to indicate that no cursor is used.
This makes the range header work when fetching latest logs from
systemd-journal-gatewayd.
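For illustration, fetching the last 100 lines might use a header like the following (values are illustrative; the format is systemd-journal-gatewayd's entries=cursor[[:num_skip]:num_entries]):

```python
# empty cursor before the first colon means "no cursor"; skip back 99
# entries from the end of the journal, then return 100 entries
headers = {"Range": "entries=:-99:100", "Accept": "text/plain"}
```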
* Fix pytest
* Check Core version and raise unsupported if older than 2 years
Check the currently installed Core version relative to the current
date, and if it's older than 2 years, mark the system unsupported.
Also add a Job condition to prevent automatic refreshing of the update
information in this case.
* Handle landing page correctly
* Handle non-parseable versions gracefully
Also align handling between OS and Core version evaluations.
* Extend and fix test coverage
* Improve Job condition error
* Fix pytest
* Block execution of fetch_data and store reload jobs
Block execution of fetch_data and store reload jobs if the core version
is unsupported. This essentially freezes the installation until the
user takes action and updates the Core version to a supported one.
* Use latest known Core version as reference
Instead of using current date to determine if Core version is more than
2 years old, use the latest known Core version as reference point and
check if current version is more than 24 releases behind.
This is crucial because when the update information refresh is disabled due
to an unsupported Core version, using the date would create a permanent unsupported
state. Even if users update to the last known version in 4+ years, the
system would remain unsupported. By using latest known version as reference,
updating Core to the last known version makes the system supported again,
allowing update information refresh to resume.
This ensures users can always escape the unsupported state by updating
to the last known Core version, maintaining the update refresh cycle.
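A back-of-the-envelope sketch of the release-count comparison, assuming Home Assistant Core's YYYY.M calver with monthly releases (an illustration, not the Supervisor's actual implementation):

```python
MAX_RELEASES_BEHIND = 24  # roughly two years of monthly Core releases

def releases_behind(current: str, latest: str) -> int:
    """Approximate how many monthly releases separate two Core versions."""
    cur_year, cur_month = (int(p) for p in current.split(".")[:2])
    lat_year, lat_month = (int(p) for p in latest.split(".")[:2])
    return (lat_year - cur_year) * 12 + (lat_month - cur_month)

# e.g. releases_behind("2023.1", "2025.1") == 24 -> right at the limit
```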
* Improve version comparison logic
* Use Home Assistant Core instead of just Core
Avoid any ambiguity in what is exactly outdated/unsupported by using
Home Assistant Core instead of just Core.
* Sort const alphabetically
* Update tests/resolution/evaluation/test_evaluate_home_assistant_core_version.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Allow arbitrarily nested addon config schemas
* Disallow lists directly nested in another list in addon schema
* Handle arbitrarily nested addon schemas in UiOptions class
* Handle arbitrarily nested addon schemas in AddonOptions class
* Add tests for addon config schemas
* Add tests for addon option validation
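An example of a now-valid nested options schema, written as the parsed Python structure (hypothetical add-on options; note there is no list directly nested in another list):

```python
schema = {
    "server": {
        "listeners": [
            {
                "host": "str",
                "port": "port",
                "tls": {"cert": "str?", "key": "str?"},  # "?" marks optional
            }
        ]
    }
}
```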
* Add endpoint for complete logs of the latest container startup
Add an endpoint that returns the complete logs of the latest startup of a
container, which can be used for downloading Core logs in the frontend.
The realtime filtering header is used for the Journal API, and the StartedAt
parameter from the Docker API is used as the reference point. This means
that any other Range header is ignored for this endpoint, yet the
"lines" query argument can still be used to limit the number of lines. By
default, an "infinite" number of lines is returned.
Closes #6147
* Implement fallback for latest logs for OS older than 16.0
Implement fallback which uses the internal CONTAINER_LOG_EPOCH metadata
added to logs created by the Docker logger. Still prefer the time-based
method, as it has lower overhead and uses public APIs.
* Address review comments
* Only use CONTAINER_LOG_EPOCH for latest logs
As pointed out in the review comments, we might not be able to get the
StartedAt for add-ons that are not running. Thus we need to use the only
reliable mechanism available now, which is the container log epoch.
* Remove dead code for 'Range: realtime' header handling
* Write cidfiles of Docker containers and mount them individually to /run/cid
There is no standard way to get the container ID in the container
itself, which can be needed for instance for #6006. The usual pattern is
to use the --cidfile argument of the Docker CLI and mount the generated file
into the container. However, this is a feature of the Docker CLI and we can't
use it when creating the containers via the API. To get the container ID to
implement native logging in e.g. Core as well, we need the help of the
Supervisor.
This change implements a similar feature fully in Supervisor's DockerAPI
class, which orchestrates the lifetime of all containers managed by the
Supervisor. The files are created in the SUPERVISOR_DATA directory, as
they need to be persisted between reboots, just as the instances of the
Docker containers are.
Supervisor's cidfile must be created when starting the Supervisor
itself, for that see home-assistant/operating-system#4276.
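A condensed sketch of the mechanism (paths and names are illustrative):

```python
from pathlib import Path

def create_cidfile(supervisor_data: Path, name: str, container_id: str) -> Path:
    """Persist a container's ID so it can be bind-mounted at /run/cid."""
    cidfile = supervisor_data / "cids" / name
    cidfile.parent.mkdir(parents=True, exist_ok=True)
    cidfile.write_text(container_id)
    # bind mount spec (host -> container), e.g.:
    #   {"Type": "bind", "Source": str(cidfile), "Target": "/run/cid"}
    return cidfile
```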
* Address review comments, fix mounting of the cidfile
* Store and persist OS upgrade map to fix update path evaluation
The existing logic calculated OS upgrade paths inline during fetch_data,
which will not get re-evaluated when the current OS is unsupported
(JobCondition.OS_SUPPORTED). E.g. after updating from 11.4 to 11.5, the
system wouldn't offer the next available update (15.2) because the
upgrade path calculation relied on fresh data from the blocked fetch
operation.
Changes:
- Add ATTR_HASSOS_UPGRADE constant and schema validation
- Store hassos-upgrade map from version JSON in updater data
- Refactor version_hassos property to use stored upgrade map instead of
inline calculation during fetch_data
- Maintain upgrade path logic: upgrade within major version first, then
jump to next major version when at the latest in current major
- Add type safety checks for version.major access
This ensures upgrade paths work correctly even when update data refresh
is blocked due to unsupported OS versions, fixing the scenario where
HAOS 11.5 wouldn't show 15.2 as the next available update.
* Update supervisor/updater.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Address mypy issue
* Fix pytest
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Add availability API for addons
* Add cast back and test for latest version of installed addon
* Make error responses more translation/client library friendly
* Add test cases for install/update APIs
Running tests in a UTC+2 timezone makes some of the tests fail because the
mocked time in the future is actually in the past, as UTC is used as the
new reference point. Adjust the tests to also mock the time when the
first execution of the function happens.
Instances where the second execution happened "immediately" were mocked
to happen 1ms later. The 1ms delta also needs to be added when
mocking time 1h in the future, otherwise it will be throttled too.
* Add background option to update/install APIs
* Refactor to use common background_task utility in backups too
* Use a validation_complete event rather than looking for bus events
* Handle missing type attribute in add-on map config
Handle missing type attribute in the add-on `map` configuration key.
* Make sure wrong volumes are cleared in any case
Also add a warning when string mapping is rejected.
* Add unit tests
* Improve test coverage