* Block OS updates when the system is unhealthy
In #6024 we mark a system as unhealthy when multiple OS installations
were found. The idea was to block OS updates in this case. However, it
turns out that the OS update job was not checking the system health
and thus allowed updates even when the system was marked as unhealthy.
This commit adds the `JobCondition.HEALTHY` condition to the OS update
job, ensuring that OS updates are only performed when the system is
healthy.
Users can still force an OS update by using
`ha jobs options --ignore-conditions healthy`.
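The gating described above can be sketched roughly as follows. This is a minimal stand-alone illustration, not Supervisor's actual job decorator: `SystemState`, `job`, `JobConditionError` and `update_os` are hypothetical names, and `JobCondition` here is a local stand-in enum that only mirrors the name used in the commit.

```python
from dataclasses import dataclass, field
from enum import Enum
from functools import wraps


class JobCondition(str, Enum):
    """Local stand-in for the condition named in the commit message."""

    HEALTHY = "healthy"


class JobConditionError(Exception):
    """Hypothetical error raised when a job condition is not met."""


@dataclass
class SystemState:
    """Hypothetical container for health state and ignored conditions."""

    healthy: bool = True
    ignore_conditions: set = field(default_factory=set)


def job(conditions):
    """Hypothetical decorator: refuse to run a job if a condition fails."""

    def decorator(func):
        @wraps(func)
        def wrapper(state, *args, **kwargs):
            for condition in conditions:
                # Conditions the user chose to ignore are skipped entirely.
                if condition in state.ignore_conditions:
                    continue
                if condition is JobCondition.HEALTHY and not state.healthy:
                    raise JobConditionError(
                        f"{func.__name__} blocked: system is not healthy"
                    )
            return func(state, *args, **kwargs)

        return wrapper

    return decorator


@job(conditions=[JobCondition.HEALTHY])
def update_os(state, version):
    # Placeholder for the real update; only the gating matters here.
    return f"updating OS to {version}"
```

With `SystemState(healthy=False)` the call raises `JobConditionError`; adding `JobCondition.HEALTHY` to `ignore_conditions` lets it through, which is the effect the `--ignore-conditions healthy` option is described as having.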
* Add test for update of unhealthy system
---------
Co-authored-by: Jan Čermák <sairon@sairon.cz>
* Fix mypy errors in misc and mounts
* Fix mypy issues in os module
* Fix typing of capture_exception
* avoid unnecessary property call
* Fixes from feedback
* Move read_text to executor
* Fix issues found by coderabbit
* formated to formatted
* switch to async_capture_exception
* Find and replace got one too many
* Update patch mock to async_capture_exception
* Drop Sentry capture from format_message
The error handling was introduced in #2052; however, #2100 essentially
ensures that a bytes object is never passed to this function. Even if
one were, the Sentry aiohttp plug-in would properly catch such an
exception.
---------
Co-authored-by: Stefan Agner <stefan@agner.ch>
When a boot slot is selected manually in GRUB, the system boots into
this slot and marks it as good. However, the boot order is not changed,
so on the next boot (after an explicit or unexpected reboot) HAOS
returns to the version in the other slot. This might be confusing: if
the system has been running for some time, the user may have forgotten
that they changed the boot slot to fix an issue they had.
This gets more confusing if the "other" boot slot is selected manually
three times in a row. Let's say we have ORDER="A B". This means that
every time GRUB starts, it wants to boot slot A. If slot B is selected
instead, only A_TRY is incremented; the system boots into slot B and
marks slot B as good (B_OK=1, B_TRY=0). On the next boot this repeats,
and A_TRY is incremented again. Until A_TRY reaches 3, slot A is always
chosen automatically; only after that does GRUB boot slot B by default,
presuming slot A is dead. The ORDER variable still remains unchanged,
though.
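The sequence above can be reproduced with a small simulation. This is a simplified model of the behaviour as described in this message, not the actual grub.cfg or RAUC logic: `GrubEnv`, `grub_pick_default`, `mark_good` and `boot` are hypothetical names, and the slot-selection rule (first slot in ORDER with OK set and TRY below 3) is an assumption inferred from the text.

```python
from dataclasses import dataclass, field

# Per the description: a slot is skipped once its TRY counter reaches 3.
MAX_TRIES = 3


@dataclass
class GrubEnv:
    """Hypothetical model of the GRUB environment variables."""

    order: list = field(default_factory=lambda: ["A", "B"])
    ok: dict = field(default_factory=lambda: {"A": 1, "B": 1})
    tries: dict = field(default_factory=lambda: {"A": 0, "B": 0})


def grub_pick_default(env):
    """Pick the first usable slot in ORDER and increment its TRY counter."""
    for slot in env.order:
        if env.ok[slot] and env.tries[slot] < MAX_TRIES:
            env.tries[slot] += 1
            return slot
    return None


def mark_good(env, slot):
    """Behaviour before this commit: mark good only, ORDER is untouched."""
    env.ok[slot] = 1
    env.tries[slot] = 0


def boot(env, manual=None):
    """Simulate one boot; `manual` overrides GRUB's default choice."""
    default = grub_pick_default(env)
    booted = manual or default
    if booted is None:
        raise RuntimeError("no bootable slot left")
    mark_good(env, booted)
    return default, booted


env = GrubEnv()
for _ in range(3):
    # GRUB defaults to A each time, but the user selects B manually.
    print(boot(env, manual="B"), env.tries, env.order)
# Fourth boot without manual selection: A_TRY is 3, so B is the default.
print(boot(env), env.tries, env.order)
```

In this model the first three boots report A as the default while B is actually booted, the fourth boot defaults to B, and `order` stays `["A", "B"]` throughout.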
This commit only makes sure that when the system is marked as healthy,
the slot is marked as both good AND active, which updates the ORDER
variable as well. Because the X_TRY counter is incremented by GRUB, if
we want the other slot not to be marked as bad, we need to adjust the
logic in the OS's grub.cfg as well, because Supervisor can't know
whether it's appropriate to change the other slot's state or not.
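As a rough illustration of the intended effect, the sketch below marks a slot as good and moves it to the front of the boot order while leaving the other slot's counters alone. It is a hypothetical model only: `mark_good_and_active` is not a real RAUC or Supervisor call, and the plain dict/list representation of the GRUB variables is an assumption.

```python
def mark_good_and_active(order, ok, tries, slot):
    """Hypothetical model: mark `slot` good and make it first in ORDER.

    The other slot's OK/TRY values are deliberately left untouched, since
    the message notes that Supervisor cannot know whether changing them
    is appropriate.
    """
    ok[slot] = 1
    tries[slot] = 0
    # Active slot goes first; remaining slots keep their relative order.
    return [slot] + [s for s in order if s != slot]


order = ["A", "B"]
ok = {"A": 1, "B": 1}
tries = {"A": 3, "B": 0}  # state after selecting B manually three times

order = mark_good_and_active(order, ok, tries, "B")
print(order, ok, tries)
```

Here B becomes the first entry in the order, while the A_TRY value of 3 survives, which is why the message says grub.cfg needs adjusting as well.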
I also took the liberty of adjusting the logging a bit, so that the
stack trace is included in the error log if marking the slot fails.
* Allow client to change boot slot via API
* Wrap call to rauc in job that checks for OS
* Reboot after changing the active boot slot
* Add test cases and clean up
* BootName to BootSlot
* Fix test
* Rename boot_name to boot_slot
* Fix tests after field change
* Bad message error marks system as unhealthy
* Finish adding test cases for changes
* Rename test file for uniqueness
* bad_message to oserror_bad_message
* Omit some checks and check for network mounts
* Backup and restore track progress in job
* Change to stage only updates and fix tests
* Leave HA alone if it wasn't restored
* skip check HA stage message when we don't check
* Change to helper to get current job
* Fix tests
* Mark jobs as internal to skip notifying HA
* Add update freeze option
* Freeze to auto update and plugin condition
* Add tests
* Add supervisor_version evaluation
* OS updates require supervisor up to date
* Run version check during startup