mirror of https://github.com/home-assistant/supervisor.git synced 2026-04-18 07:35:22 +01:00
Commit Graph

57 Commits

Author SHA1 Message Date
Stefan Agner
667bd62742 Remove CLI command hint from unknown error messages (#6684)
* Remove CLI command hint from unknown error messages

Since #6303 introduced specific error messages for many cases,
the generic "check with 'ha supervisor logs'" hint in unknown
error messages is no longer as useful. Remove the CLI command
part while keeping the "Check supervisor logs for details" rider.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Use consistently "Supervisor logs" with capitalization

Co-authored-by: Jan Čermák <sairon@users.noreply.github.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Jan Čermák <sairon@users.noreply.github.com>
2026-03-31 18:09:14 +02:00
Stefan Agner
5e1eaa9dfe Respect auto-update setting for plug-in auto-updates (#6606)
* Respect auto-update setting for plug-in auto-updates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Also skip auto-updating plug-ins in decorator

* Raise if auto-update flag is not set and plug-in is not up to date

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 09:04:33 +01:00
Stefan Agner
3db60170aa Fix flaky test_group_throttle_rate_limit race condition (#6504)
The test was failing intermittently in CI because concurrent async
operations in asyncio.gather() were getting slightly different
timestamps (microseconds apart) despite being inside a time_machine
context.

When test2.execute() calls were timestamped at start+2ms due to async
scheduling delays, they weren't cleaned up in the final test block
(cutoff = start+1ms), causing a false rate limit error.

Fix by using tick=False to completely freeze time during the gather,
ensuring all 4 calls get the exact same timestamp.

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-30 17:17:50 +01:00
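Note on #6504: the race described above can be reproduced without any test framework. The sketch below is not the Supervisor test. `WindowLimiter` and `run` are hypothetical names, and the window and limit are scaled for readability. It only shows why calls stamped a few milliseconds late survive the cleanup cutoff, while calls made under a fully frozen clock do not.

```python
from collections.abc import Callable


class WindowLimiter:
    """Allow at most `limit` calls per `window` seconds (hypothetical helper)."""

    def __init__(self, limit: int, window: float, clock: Callable[[], float]) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self.calls: list[float] = []

    def execute(self) -> bool:
        now = self.clock()
        # Drop calls at or before the cutoff; anything stamped later survives.
        cutoff = now - self.window
        self.calls = [stamp for stamp in self.calls if stamp > cutoff]
        if len(self.calls) >= self.limit:
            return False  # rate limit hit
        self.calls.append(now)
        return True


def run(timestamps: list[float]) -> list[bool]:
    """Feed one timestamp per call, simulating what the clock returned."""
    stamps = iter(timestamps)
    limiter = WindowLimiter(limit=2, window=1.0, clock=lambda: next(stamps))
    return [limiter.execute() for _ in timestamps]


# Frozen clock: both early calls share one timestamp and are cleaned up together.
frozen = run([0.0, 0.0, 1.0, 1.0])
# Ticking clock: the second call is stamped 2 ms late and survives the cutoff.
ticking = run([0.0, 0.002, 1.001, 1.001])
```

With the frozen timestamps every call is accepted. With the ticking timestamps the last call is rejected, which corresponds to the false rate limit error seen in CI.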
Mike Degatano
81b7e54b18 Remove unknown errors from addons and auth (#6303)
* Remove unknown errors from addons

* Remove customized unknown error types

* Fix docker ratelimit exception and tests

* Fix stats test and add more for known errors

* Add defined error for when build fails

* Fixes from feedback

* Fix mypy issues

* Fix test failure due to rename

* Change auth reset error message
2025-12-03 18:11:51 +01:00
Stefan Agner
72bbc50c83 Fix call_at to use event loop time base instead of Unix timestamp (#6324)
* Fix call_at to use event loop time base instead of Unix timestamp

The CoreSys.call_at method was incorrectly passing Unix timestamps
directly to asyncio.loop.call_at(), which expects times in the event
loop's monotonic time base. As a result, jobs were scheduled
approximately 55 years in the future (the difference between Unix epoch
time and monotonic time since boot).

The bug was masked by time-machine 2.19.0, which patched time.monotonic()
and caused loop.time() to return Unix timestamps. Time-machine 3.0.0
removed this patching (as it caused event loop freezes), exposing the bug.

Fix by converting the datetime to event loop time base:
- Calculate delay from current Unix time to scheduled Unix time
- Add delay to current event loop time to get scheduled loop time

Also simplify test_job_scheduled_at to avoid time-machine's async
context managers, following the pattern of test_job_scheduled_delay.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Add comment about datetime in the past

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-19 11:49:05 +01:00
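Note on #6324: the two conversion steps listed in the message above can be sketched with the standard library alone. This is a minimal illustration, not the Supervisor code. `to_loop_time` and `main` are hypothetical names, while `loop.time()` and `loop.call_at()` are the standard asyncio calls.

```python
import asyncio
import time
from datetime import datetime, timedelta, timezone


def to_loop_time(loop: asyncio.AbstractEventLoop, when: datetime) -> float:
    """Translate a wall-clock datetime into the loop's monotonic time base."""
    # Step 1: delay from the current Unix time to the scheduled Unix time.
    delay = when.timestamp() - time.time()
    # Step 2: apply that delay to the loop clock. A datetime in the past
    # gives a negative delay, so the callback runs as soon as possible.
    return loop.time() + delay


async def main() -> float:
    loop = asyncio.get_running_loop()
    fired = loop.create_future()
    when = datetime.now(timezone.utc) + timedelta(milliseconds=50)
    start = loop.time()
    loop.call_at(to_loop_time(loop, when), fired.set_result, None)
    # Passing when.timestamp() directly would time out here instead.
    await asyncio.wait_for(fired, timeout=2)
    return loop.time() - start


elapsed = asyncio.run(main())
```

The callback fires after roughly 50 ms. Handing the raw Unix timestamp to `call_at()` would place it far in the future relative to the loop clock, which is the behaviour the commit fixes.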
Ashton
d33305379f Improve error message clarity by specifying to check supervisor logs (#6250)
* Improve error message clarity by specifying to check supervisor logs with 'ha supervisor logs'

* Fix ruff, supervisor -> Supervisor

---------

Co-authored-by: Jan Čermák <sairon@sairon.cz>
2025-11-04 17:12:15 -05:00
Mike Degatano
64f94a159c Add progress syncing from child jobs (#6207)
* Add progress syncing from child jobs

* Fix pylint issue

* Set initial progress from parent and end at 100
2025-09-30 14:52:16 -04:00
Stefan Agner
c712d3cc53 Check Core version and raise unsupported if older than 2 years (#6148)
* Check Core version and raise unsupported if older than 2 years

Check the currently installed Core version relative to the current
date, and if it's older than 2 years, mark the system unsupported.
Also add a Job condition to prevent automatic refreshing of the update
information in this case.

* Handle landing page correctly

* Handle non-parseable versions gracefully

Also align handling between OS and Core version evaluations.

* Extend and fix test coverage

* Improve Job condition error

* Fix pytest

* Block execution of fetch_data and store reload jobs

Block execution of fetch_data and store reload jobs if the core version
is unsupported. This essentially freezes the installation until the
user takes action and updates the Core version to a supported one.

* Use latest known Core version as reference

Instead of using current date to determine if Core version is more than
2 years old, use the latest known Core version as reference point and
check if current version is more than 24 releases behind.

This is crucial because when update information refresh is disabled due to
unsupported Core version, using date would create a permanent unsupported
state. Even if users update to the last known version in 4+ years, the
system would remain unsupported. By using latest known version as reference,
updating Core to the last known version makes the system supported again,
allowing update information refresh to resume.

This ensures users can always escape the unsupported state by updating
to the last known Core version, maintaining the update refresh cycle.

* Improve version comparison logic

* Use Home Assistant Core instead of just Core

Avoid any ambiguity in what is exactly outdated/unsupported by using
Home Assistant Core instead of just Core.

* Sort const alphabetically

* Update tests/resolution/evaluation/test_evaluate_home_assistant_core_version.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-09-19 17:58:37 +02:00
Mike Degatano
01911a44cd Persistent notifications to repairs and fix free_space check (#6179)
* Persistent notifications to repairs and fix free_space check

* Fix tests mocking too little free space
2025-09-16 11:22:59 -04:00
Jan Čermák
3d62c9afb1 Make test_job_decorator tests timezone agnostic (#6153)
Running tests in UTC+2 timezone makes some of the tests fail because the
mocked time in the future is actually in the past, as UTC is used as the
new reference point. Adjust the tests to mock also the time when the
first execution of function happens.

Instances where the second execution happened "immediately" were mocked
to happen 1ms later. The 1ms delta is also needed to be added when
mocking time 1h in the future, otherwise it will be throttled too.
2025-09-03 17:55:28 +02:00
Mike Degatano
207b665e1d Send progress updates during image pull for install/update (#6102)
* Send progress updates during image pull for install/update

* Add extra to tests about job APIs

* Send out of date progress to sentry and combine done event

* Pulling container image layer
2025-08-22 10:41:10 +02:00
dependabot[bot]
07d8fd006a Bump time-machine from 2.17.0 to 2.18.0 (#6113)
* Bump time-machine from 2.17.0 to 2.18.0

Bumps [time-machine](https://github.com/adamchainz/time-machine) from 2.17.0 to 2.18.0.
- [Changelog](https://github.com/adamchainz/time-machine/blob/main/docs/changelog.rst)
- [Commits](https://github.com/adamchainz/time-machine/compare/2.17.0...2.18.0)

---
updated-dependencies:
- dependency-name: time-machine
  dependency-version: 2.18.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Fix time_machine usage

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Stefan Agner <stefan@agner.ch>
2025-08-20 10:37:29 +02:00
Stefan Agner
3b093200e3 Improve JobGroup locking with external ownership tracking (#6074)
* Use context manager for Job concurrency control

* Allow to release lock outside of Job running context

* Improve JobGroup locking with external ownership tracking

Track lock ownership by job UUID instead of execution context. This
allows external lock release via job parameter.

* Fix acquire lock in nested Jobs

* Simplify nested lock tracking

* Simplify Job group lock acquisition logic

* Simplify by using helper methods

* Allow throttling with group concurrency

* Use Lock instead of Semaphore for job concurrency control

Use the same synchronization primitive (Lock) for job concurrency
control as used in job groups.

* Go back to lock ownership tracking with references

* Drop unused property `active_job_id`

* Drop unused property `can_acquire`

* Replace assert with cast
2025-08-07 00:14:58 +02:00
Stefan Agner
9bee58a8b1 Migrate to JobConcurrency and JobThrottle parameters (#6065) 2025-08-05 13:24:44 +02:00
Stefan Agner
6871ea4b81 Split execution limit in concurrency and throttle parameters (#6013)
* Split execution limit in concurrency and throttle parameters

Currently the execution limit combines two orthogonal features: Limit
concurrency and throttle execution. This change separates the two
features, allowing for more flexible configuration of job execution.

Ultimately I want to get rid of the old limit parameter. But for ease
of review and migration, I'd like to do this in two steps: First
introduce the new parameters, and map the old limit parameters to the
new parameters. Then, in a second step, remove the old limit parameter
and migrate all users to the new concurrency and throttle parameters
as needed.

* Introduce common lock release method

* Fix THROTTLE_WAIT behavior

The concurrency QUEUE does not really QUEUE throttle limits.

* Add documentation for new concurrency/throttle Job options

* Handle group options for concurrency and throttle separately

* Fix GROUP_THROTTLE_WAIT concurrency setting

We need to use the QUEUE concurrency setting instead of GROUP_QUEUE
for the GROUP_THROTTLE_WAIT execution limit. Otherwise the
test_jobs_decorator.py::test_execution_limit_group_throttle_wait
test deadlocks.

This deadlocks because GROUP_QUEUE concurrency doesn't really work:
we can only release a group lock if the job is actually running.

Or put differently, throttling isn't supported with GROUP_*
concurrency options.

* Prevent using any throttling with group concurrency

The group concurrency modes (reject and queue) are not compatible with
any throttling, since we currently can't unlock the group lock when
a job doesn't get started (which is the case when throttling is
applied).

* Fix commit in group rate limit

* Explain the deadlock issue with group locks in code

* Handle locking correctly on throttle limit exceptions

* Introduce pytest for new job decorator combinations
2025-07-30 22:12:14 +02:00
Mike Degatano
9222a3c9c0 Report stage with error in jobs (#5784)
* Report stage with error in jobs

* Copy doesn't lose track of the successful copies

* Add stage to errors in api output test

* revert unnecessary change to import

* Add tests for a bit more coverage of copy_additional_locations
2025-03-27 10:07:06 -04:00
Mike Degatano
0636e49fe2 Enable mypy part 1 (addons and api) (#5759)
* Fix mypy issues in addons

* Fix mypy issues in api

* fix docstring

* Brackets instead of get with default
2025-03-25 15:06:35 -04:00
Mike Degatano
324b059970 Move write of core state to executor (#5720) 2025-03-04 17:49:53 +01:00
Stefan Agner
696dcf6149 Initialize Supervisor Core state in constructor (#5686)
* Initialize Supervisor Core state in constructor

Make sure the Supervisor Core state is set to a value early on. This
makes sure that the state is always of type CoreState, and makes sure
that any use of the state can rely on it being an actual value from the
CoreState enum.

This fixes Sentry filter during early startup, where the state
previously was None. Because of that, the Sentry filter tried to
collect more Context, which led to an exception and not reporting
errors.

* Fix pytest

It seems that with initializing the state early, the pytest actually
runs a system evaluation with:
Starting system evaluation with state initialize

Before it did that with:
Starting system evaluation with state None

It detects that the container runs as privileged, and declares the
system as unhealthy.

It is unclear to me why coresys.core.healthy was checked in this
context, it doesn't seem useful. Just remove the check, and validate
the state through the getter instead.

* Update supervisor/core.py

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Make sure Supervisor container is privileged in pytest

With the Supervisor Core state being valid now, some evaluations
now actually run when loading the resolution center. This leads to
Supervisor getting declared unhealthy due to not running in a privileged
container under pytest.

Fake the host container to be privileged so that evaluations do not
cause the system to be declared unhealthy under pytest.

* Avoid writing actual Supervisor run state file

With the Supervisor Core state being valid from the very start, we end
up writing a state every time.

Instead of actually writing a state file, simply validate that the
necessary calls are being made. This conforms better to typical unit
tests and avoids writing a file for every test.

* Extend WebSocket client fixture and use it consistently

Extend the ha_ws_client WebSocket client fixture to set Supervisor Core
into run state and clear all pending messages.

Currently only some tests use the ha_ws_client WebSocket client fixture.
Use it consistently for all tests.

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-02-28 18:01:55 +01:00
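Note on #5686: a small sketch of why setting the state in the constructor helps. `CoreState` is named in the message and the `initialize` value appears in the quoted log line. The `RUNNING` member, the `Core` class shape, and `describe_state` are hypothetical stand-ins and not the actual Supervisor code.

```python
from enum import Enum


class CoreState(str, Enum):
    INITIALIZE = "initialize"
    RUNNING = "running"  # assumed member, for illustration only


class Core:
    def __init__(self) -> None:
        # Set a real enum value up front so the state is never None.
        self._state: CoreState = CoreState.INITIALIZE

    @property
    def state(self) -> CoreState:
        return self._state

    def set_state(self, new_state: CoreState) -> None:
        self._state = new_state


def describe_state(core: Core) -> str:
    """Stand-in for a consumer (such as an error filter) that reads the state."""
    # Safe at any point during startup: .value exists on every CoreState.
    return f"Starting system evaluation with state {core.state.value}"


early_message = describe_state(Core())
```

Any consumer that runs during early startup can rely on `core.state` being a `CoreState` member, which is the guarantee the commit describes.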
Mike Degatano
52cc17fa3f Delay initial version fetch until there is connectivity (#5603)
* Delay initial version fetch until there is connectivity

* Add test

* Only mock get not whole websession object

* drive delayed fetch off of supervisor connectivity not host

* Fix test to not rely on sleep guessing to track tasks

* Use fixture to remove job throttle temporarily
2025-02-11 13:22:33 +01:00
Mike Degatano
600bf91c4f Sort jobs by creation in API (#5545)
* Sort jobs by creation in API

* Fix tests missing new field

* Fix sorting logic around child jobs
2025-01-16 09:51:44 +01:00
Stefan Agner
180a7c3990 Throttle connectivity check on connectivity issue (#5342)
* Throttle connectivity check on connectivity issue

If Supervisor detects a connectivity issue, currently every function
which requires internet gets delayed by 10s due to the connectivity
check. This especially slows down initial startup when there are
connectivity issues. It is unlikely to resolve immediately, so throttle
the connectivity check to check every 30s.

* Fix pytest

* Reset throttle in test and refactor helper

* CodeRabbit suggestion

---------

Co-authored-by: Mike Degatano <michael.degatano@gmail.com>
2024-10-10 22:57:16 +02:00
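Note on #5342: the 30s throttle described above can be sketched as a cached check with an injectable clock. `ConnectivityChecker`, its attributes, and the rule of throttling only while offline are assumptions for illustration. The commit message states only that the check is throttled to every 30s once a connectivity issue is detected.

```python
from collections.abc import Callable


class ConnectivityChecker:
    """Re-probe connectivity at most every 30s while offline (hypothetical)."""

    def __init__(self, probe: Callable[[], bool], clock: Callable[[], float]) -> None:
        self.probe = probe
        self.clock = clock
        self.connected = True
        self.last_check: float | None = None
        self.probe_count = 0

    def throttle_period(self) -> float:
        # Assumption: throttle only while a connectivity issue is known.
        return 0.0 if self.connected else 30.0

    def check(self) -> bool:
        now = self.clock()
        if self.last_check is not None and now - self.last_check < self.throttle_period():
            # Inside the throttle period: reuse the cached result, no delay.
            return self.connected
        self.last_check = now
        self.probe_count += 1
        self.connected = self.probe()
        return self.connected


# Simulated clock so the example runs instantly.
fake_time = {"now": 0.0}
checker = ConnectivityChecker(probe=lambda: False, clock=lambda: fake_time["now"])

results = []
for moment in (0.0, 10.0, 29.0, 31.0):
    fake_time["now"] = moment
    results.append(checker.check())
```

Only the calls at 0s and 31s reach the slow probe. The calls at 10s and 29s return the cached result, so functions that require internet are not delayed on every invocation.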
Stefan Agner
f6faa18409 Bump pre-commit ruff to 0.5.7 and reformat (#5242)
It seems that the codebase is not formatted with the latest ruff
version. This PR reformats the codebase with ruff 0.5.7.
2024-08-13 20:53:56 +02:00
Mike Degatano
7fd6dce55f Migrate to Ruff for lint and format (#4852)
* Migrate to Ruff for lint and format

* Fix pylint issues

* DBus property sets into normal awaitable methods

* Fix tests relying on separate tasks in connect

* Fixes from feedback
2024-02-05 11:37:39 -05:00
Mike Degatano
88d718271d Fix serialization issue adding error to job (#4853) 2024-01-30 12:21:02 +01:00
Mike Degatano
480b383782 Add background option to backup APIs (#4802)
* Add background option to backup APIs

* Fix decorator tests

* Working error handling, initial test cases

* Change to schedule_job and always return job id

* Add tests

* Reorder call at/later args

* Validation errors return immediately in background

* None is invalid option for background

* Must pop the background option from body
2024-01-22 12:09:15 -05:00
Mike Degatano
2da27937a5 Update python to 3.12 (#4815)
* Update python to 3.12

* Fix tests and deprecations

* Fix other references to 3.11

* build.json doesn't exist
2024-01-13 16:35:07 +01:00
Stefan Agner
928aff342f Address pytest warnings (#4695) 2023-11-15 10:45:36 +01:00
Mike Degatano
010043f116 Don't warn for removing unstarted jobs (#4632) 2023-10-19 17:35:16 +02:00
Mike Degatano
ace58ba735 Unstarted jobs should always be cleaned up (#4604) 2023-10-09 11:57:04 +02:00
Mike Degatano
44daffc65b Add freeze/thaw apis for external snapshots (#4538)
* Add freeze/thaw apis for external backups

* Error when thaw called before freeze

* Timeout must be > 0
2023-09-09 10:54:19 +02:00
Mike Degatano
f93b753c03 Backup and restore track progress in job (#4503)
* Backup and restore track progress in job

* Change to stage only updates and fix tests

* Leave HA alone if it wasn't restored

* skip check HA stage message when we don't check

* Change to helper to get current job

* Fix tests

* Mark jobs as internal to skip notifying HA
2023-08-30 16:01:03 -04:00
Mike Degatano
93ba8a3574 Add job names and references everywhere (#4495)
* Add job names and references everywhere

* Remove group names check and switch to const

* Ensure unique job names in decorator tests
2023-08-21 09:15:37 +02:00
Mike Degatano
1611beccd1 Add job group execution limit option (#4457)
* Add job group execution limit option

* Fix pylint issues

* Assign variable before usage

* Cleanup jobs when done

* Remove isinstance check for performance

* Explicitly raise from None

* Add some more documentation info
2023-08-08 16:49:17 -04:00
Mike Degatano
72d81e43dd Allow all job conditions to be ignored (#4107)
* Allow all job conditions to be ignored

* Clear features cache in test

* patch out OS Agent supported feature
2023-01-18 12:14:12 +01:00
Mike Degatano
14fcda5d78 Sentry only loaded when diagnostics on (#3993)
* Sentry only loaded when diagnostics on

* Logging when sentry is closed
2022-11-13 21:23:52 +01:00
Joakim Sørensen
1f7c067c90 Job conditions take a list (#3949) 2022-10-13 09:59:06 -04:00
Joakim Sørensen
9da4ea20a9 Add unhealthy reasons to block message (#3948) 2022-10-13 09:28:30 -04:00
Mike Degatano
d195f19fa8 Refactor to dbus-next proxy interfaces (#3862)
* Refactor to dbus-next proxy interfaces

* Fix tests mocking dbus methods

* Fix call dbus
2022-09-13 13:45:28 -04:00
Mike Degatano
fc646db95f Reduce connectivity checks (#3836)
* Reduce connectivity checks

* Fix/remove connectivity tests

* Remove throttle from prior connectivity tests

* Use dbus_property wrapper

* Allow variable throttle period with lambda

* Add evaluation for connectivity check disabled
2022-09-03 09:48:30 +02:00
Mike Degatano
2cd7f9d1b0 Attempt plugin update before failing job condition (#3796) 2022-08-17 07:36:05 +02:00
Mike Degatano
96065ed704 Bump to python 3.10 and alpine 3.16 (#3791)
* Bump to python 3.10

* 3.10 is not a number

* Musllinux wheels link

* Revert attrs 22.1.0 -> 21.2.0 for wheel

* Revert cryptography for wheel & pylint fix

* Precommit and devcontainer to 3.10

* pyupgrade rewriting things

* revert

* Update builder.yml

* fix rust

* Update builder.yml

Co-authored-by: Pascal Vizeli <pvizeli@syshack.ch>
2022-08-16 14:33:23 +02:00
Mike Degatano
c8f184f24c Add auto update option (#3769)
* Add update freeze option

* Freeze to auto update and plugin condition

* Add tests

* Add supervisor_version evaluation

* OS updates require supervisor up to date

* Run version check during startup
2022-08-15 12:13:22 -04:00
Mike Degatano
e62324e43f Set limits on watchdog retries (#3779)
* Set limits on watchdog retries

* Use relative import
2022-08-09 11:44:35 -04:00
Mike Degatano
d097044fa8 Update supervisor before auto-updating others (#3756) 2022-07-28 12:29:05 -04:00
dependabot[bot]
d4fd8f3f0d Bump black from 21.12b0 to 22.1.0 (#3425)
* Bump black from 21.12b0 to 22.1.0

Bumps [black](https://github.com/psf/black) from 21.12b0 to 22.1.0.
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](https://github.com/psf/black/commits/22.1.0)

---
updated-dependencies:
- dependency-name: black
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update black

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Pascal Vizeli <pvizeli@syshack.ch>
2022-02-10 14:13:40 +01:00
Pascal Vizeli
271e4f0cc4 Support OS-Agent Data disk (#3120)
* Support OS-Agent Data disk

* fix lint

* add tests

* Fix empty path

* revert change

* Using as_posix()

* clean not needed cast

* rename

* Rename files
2021-09-17 15:01:07 +02:00
Pascal Vizeli
1ef46424ea Fix tests (#2883) 2021-05-14 08:36:49 +02:00
Pascal Vizeli
9194088947 Fix HAOS sync output (#2755)
* Fix HAOS sync output

* revert api change

* As usual

* Simplify code

* Adjust error handling
2021-03-26 14:33:14 +01:00
Pascal Vizeli
31f5033dca Add throttle to job execution (#2631)
* Add throttle to job execution

* fix unittests

* Add tests

* address comments

* add comment

* better on __init__

* New text

* Simplify logic
2021-02-25 23:29:03 +01:00