* Avoid aiodns resolver memory leak
In certain cases, the aiodns resolver can leak memory. This also
leads to `Fatal Python error… ffi.from_handle()`. This addresses
the issue by ensuring that the resolver is properly closed
when it is no longer needed.
* Address coderabbitai feedback
* Fix pytest
* Fix pytest
* Recreate aiohttp ClientSession after DNS plug-in load
Create a temporary ClientSession early in case we need to load version
information from the internet. This doesn't use the final DNS setup
and hence might fail to load in certain situations since we don't have
the fallback mechanisms in place yet. But if the DNS container image
is present, we'll continue the setup and load the DNS plug-in. We then
can recreate the ClientSession such that it uses the DNS plug-in.
This works around an issue with aiodns, which today doesn't reload
`resolv.conf` automatically when it changes. This led to Supervisor
using the initial `resolv.conf` as created by Docker. It meant that
we did not use the DNS plug-in (and its fallback capabilities) in
Supervisor. Also it meant that changes to the DNS setup at runtime
did not propagate to the aiohttp ClientSession (as observed in #5332).
* Mock aiohttp.ClientSession for all tests
Currently, pytest uses the real aiohttp ClientSession in several
places and reaches out to the internet. This is not ideal for unit
tests and should be avoided.
This creates several new fixtures to aid this effort. The `websession`
fixture simply returns a mocked aiohttp.ClientSession, which can be
used whenever a tested function needs the global websession.
A separate new fixture named `supervisor_internet` mocks the
connectivity check, since this is often used through the Job
decorator, which requires INTERNET_SYSTEM.
And the `mock_update_data` fixture uses the already existing update
JSON test data from the fixture directory instead of loading the data
from the internet.
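A minimal sketch of the mocking idea described above, using only `unittest.mock`. The helper `make_mock_websession`, the function `fetch_version`, the URL and the `"supervisor"` key are hypothetical names for illustration; in a pytest suite the helper would typically be wrapped in a fixture such as `websession`.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


def make_mock_websession(payload: dict) -> MagicMock:
    """Build a stand-in for a client session that never touches the network."""
    response = MagicMock()
    response.status = 200
    response.json = AsyncMock(return_value=payload)

    # `async with session.get(url) as resp:` needs an async context manager.
    request_ctx = MagicMock()
    request_ctx.__aenter__ = AsyncMock(return_value=response)
    request_ctx.__aexit__ = AsyncMock(return_value=False)

    session = MagicMock()
    session.get = MagicMock(return_value=request_ctx)
    return session


async def fetch_version(session, url: str) -> str:
    """Hypothetical code under test: read one field from a JSON document."""
    async with session.get(url) as resp:
        data = await resp.json()
    return data["supervisor"]


if __name__ == "__main__":
    mocked = make_mock_websession({"supervisor": "1.2.3"})
    print(asyncio.run(fetch_version(mocked, "https://example.invalid/stable.json")))
```

Because the session is a plain mock, a test can also assert on which URL was requested without any network access.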
* Log ClientSession nameserver information
When recreating the aiohttp ClientSession, log exactly which
nameservers are going to be used.
* Refuse ClientSession initialization when API is available
Previous attempts to reinitialize the ClientSession have shown
use of the ClientSession after it was closed, because API requests
were being handled in parallel to the reinitialization (see #5851).
Make sure this is not possible by refusing to reinitialize the
ClientSession when the API is available.
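The guard described above could look roughly like the following sketch. `SessionHolder`, `FakeSession` and the `api_available` flag are hypothetical stand-ins and not the actual Supervisor classes. The point is only that reinitialization is rejected once requests may be in flight.

```python
import asyncio


class FakeSession:
    """Stand-in for an HTTP client session; only tracks whether it was closed."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class SessionHolder:
    """Hypothetical owner of the shared session."""

    def __init__(self) -> None:
        self.api_available = False
        self.websession = None

    async def init_websession(self) -> None:
        # Once the API serves requests, handlers may be using the session
        # concurrently, so closing and replacing it is no longer safe.
        if self.api_available:
            raise RuntimeError("Refusing to reinitialize session: API is available")
        if self.websession is not None:
            await self.websession.close()
        self.websession = FakeSession()


if __name__ == "__main__":
    holder = SessionHolder()
    asyncio.run(holder.init_websession())  # temporary session during early setup
    asyncio.run(holder.init_websession())  # recreated before the API starts
    holder.api_available = True            # any further attempt now raises
```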
* Fix pytests
Also make sure we don't create aiohttp ClientSession objects unnecessarily.
* Apply suggestions from code review
Co-authored-by: Jan Čermák <sairon@users.noreply.github.com>
---------
Co-authored-by: Jan Čermák <sairon@users.noreply.github.com>
Similar to #5825, make sure we mock the systemd journal gateway socket
for tests. This makes the test work on systems which have
systemd-journal-gatewayd installed.
* Finish out effort of adding and enabling blockbuster
* Skip getting addon file size until securetar fixed
* Fix test for devcontainer and blocking I/O
* Fix docker fixture and load_config to post_init
* Add blockbuster library and find I/O from unit tests
* Fix lint and test issue
* Fixes from feedback
* Avoid modifying webapp object in executor
* Split su options validation and only validate timezone on change
* Load resolution evaluation, check and fixups early
Before #5652, these modules were loaded in the constructor, hence early
in `initialize_coresys()`. Moving them late actually exposed an issue
where NetworkManager connectivity setter couldn't get the
`connectivity_check` evaluation, leading to an exception early in
bootstrap.
Technically, it might be safe to load the resolution modules only in
`Core.connect()`, however then we'd have to load them separately for
pytest. Let's go conservative and load them in the same place where they
got loaded before #5652.
* Load resolution modules in a single executor call
* Fix pytest
Make sure that add-on store resets do not delete the root folder. This
is important so that successive reset attempts do not fail (the
directory passed to `remove_folder` must exist, otherwise find fails
with a non-zero exit code).
While at it, handle find errors properly and report errors as critical.
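As an illustration of the behavior described above, the sketch below empties a folder with `find` while keeping the folder itself, and reports a non-zero exit code as critical. The function name `clear_folder_content` and the exact `find` arguments are assumptions for this example; the actual `remove_folder` implementation may differ.

```python
import logging
import subprocess
from pathlib import Path

_LOGGER = logging.getLogger(__name__)


def clear_folder_content(folder: Path) -> bool:
    """Delete everything inside `folder` but keep `folder` itself.

    Returns False when find exits non-zero, e.g. because `folder` is missing.
    """
    proc = subprocess.run(
        # -mindepth 1 skips the starting directory, so the root survives.
        ["find", str(folder), "-mindepth", "1", "-delete"],
        capture_output=True,
        text=True,
        check=False,
    )
    if proc.returncode != 0:
        _LOGGER.critical("Can't remove content of %s: %s", folder, proc.stderr.strip())
        return False
    return True
```

Since the root directory is left in place, calling the function a second time on the same path still succeeds.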
* Fix and extend cloud backup support
* Clean up task for cloud backup and remove by location
* Args to kwargs on backup methods
* Fix backup remove error test and typing clean up
* Max retries for auto applying addon image fixup
* Update supervisor/resolution/fixups/addon_execute_repair.py
Co-authored-by: Stefan Agner <stefan@agner.ch>
---------
Co-authored-by: Stefan Agner <stefan@agner.ch>
* Allow adoption of existing data disk
* Fix existing tests
* Add test cases and fix image issues
* Fix addon build test
* Run checks during setup not startup
* Addon load mimics plugin and HA load for docker part
* Default image accessible in except
* Unsupported if wrong image used on virtualization
* Add generic-aarch64 as supported image
* Add virtualization field to API
* Change startup to setup in check
* Migrate to Ruff for lint and format
* Fix pylint issues
* DBus property sets into normal awaitable methods
* Fix tests relying on separate tasks in connect
* Fixes from feedback
* Bad message error marks system as unhealthy
* Finish adding test cases for changes
* Rename test file for uniqueness
* bad_message to oserror_bad_message
* Omit some checks and check for network mounts
* Wait until mount unit is deactivated on unmount
The current code does not wait until the (bind) mount unit has been
actually deactivated (state "inactive"). This is especially problematic
when restoring a backup, where we deactivate all bind mounts before
restoring the target folder. Before the tarball is actually restored,
we delete all contents of the target folder. This led to the situation
where the "rm -rf" command got executed before the bind mount actually
got unmounted.
The new code polls the state using an exponentially increasing
delay. Wait up to 30s for the bind mount to actually deactivate.
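The polling approach can be sketched as follows. This is a minimal illustration, not the actual implementation: `wait_for_inactive`, the `get_state` callback and the initial delay of 0.1s are assumptions, while the 30s limit and the "inactive" state come from the description above.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_for_inactive(
    get_state: Callable[[], Awaitable[str]],
    timeout: float = 30.0,
    initial_delay: float = 0.1,
) -> bool:
    """Poll `get_state` until it reports "inactive" or `timeout` seconds elapse."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    delay = initial_delay
    while True:
        if await get_state() == "inactive":
            return True
        remaining = deadline - loop.time()
        if remaining <= 0:
            return False
        # Never sleep past the deadline; double the delay after each poll.
        await asyncio.sleep(min(delay, remaining))
        delay *= 2
```

A caller would pass a coroutine function that reads the unit's state, and treat a `False` result as a failed unmount.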
* Fix function name
* Fix missing await
* Address pytest errors
Change state of systemd unit according to use cases. Note that this
is currently rather fragile, and ideally we should have a smarter
mock service instead.
* Fix pylint
* Fix remaining
* Check transition to failed as well
* Used alternative mocking mechanism
* Remove state lists in test_manager
---------
Co-authored-by: Mike Degatano <michael.degatano@gmail.com>
* Backup and restore track progress in job
* Change to stage only updates and fix tests
* Leave HA alone if it wasn't restored
* skip check HA stage message when we don't check
* Change to helper to get current job
* Fix tests
* Mark jobs as internal to skip notifying HA
* Reduce executor code for docker
* Fix pylint errors and move import/export image
* Fix test and a couple other risky executor calls
* Fix dataclass and return
* Fix test case and add one for corrupt docker
* Add some coverage
* Undo changes to docker manager startup
* Addon startup waits for healthy
* fix import for pylint
* wait_for to 5 in tests
* Adjust tests to simplify async tasks
* Remove wait_boot time from addons.boot tests
* Eliminate async task race conditions in tests
* Add mount to supported features
* Typo in enable
* Fix places mocking os available without version
* Increase resilience of problematic repeat task test