The deprecated-key-path option is no longer handled, but it doesn't cause
problems because the key is explicitly ignored. It was completely removed in
Docker 19.03.0 [1].
As such, the option and the pre-start script to fix the corrupted key.json can
be removed now, as they have no effect and only print a confusing message when
the Docker service fails to start.
[1] 98fc09128b
Prefer the containerd snapshotter by using it by default for new installs and
when no Docker data is present (e.g. after datadisk wipe). The snapshotter is
enabled by a dockerd flag which is set when a flag file is present in the data
partition. This flag file can also be used to opt in to this snapshotter on
legacy installs (a high-level API through OS Agent and Supervisor is TBD): to
migrate to the containerd snapshotter, the file can simply be created manually.
Testing has shown no major problems when migrating. The old overlay2 folder can
be (and should be, to avoid situations where the data disk might run out of
space) deleted in the docker-prepare script before docker.service is started.
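The decision described above can be sketched as follows. This is only an illustration in Python, not the actual docker-prepare script: the flag file name, the `docker` directory name, and the function name are hypothetical, and the location of overlay2 inside that directory is an assumption.

```python
import shutil
from pathlib import Path

# Hypothetical names, chosen for illustration only.
FLAG_NAME = "containerd-snapshotter.flag"
DOCKER_DIR = "docker"


def prepare_snapshotter(data_dir: Path) -> bool:
    """Return True if dockerd should be started with the containerd snapshotter."""
    flag = data_dir / FLAG_NAME
    docker_dir = data_dir / DOCKER_DIR

    # New install or wiped data disk: no Docker data, so opt in by default.
    has_docker_data = docker_dir.is_dir() and any(docker_dir.iterdir())
    if not has_docker_data:
        flag.touch()

    if not flag.exists():
        return False  # legacy install that has not opted in

    # Migrating: drop the old overlay2 store so it does not eat disk space.
    shutil.rmtree(docker_dir / "overlay2", ignore_errors=True)
    return True
```

On a legacy install the function returns False until the flag file is created manually; the next run then removes overlay2 and returns True.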
Note that there's no offline migration path; the OS needs to be connected to the
internet to re-download the images when migrating. An offline migration could
theoretically be done through the docker image save/load functions, but guarding
against insufficient space and other edge cases would probably be too complex to
justify it.
Refs #4252
Refs #4253 - easier opt-in method is still needed
Closes #4254 - migration is handled seamlessly by Docker
Use the version used in the docker-engine package to ensure it stays in sync.
Although we haven't seen any issues related to the fact it was sometimes
mismatching, reduce the burden of needing it to be synced manually.
This might be required for some modern Intel processors (Meteor Lake and newer)
which fail to boot the Linux kernel without the x2APIC controller when some
features (e.g. VT-d or x2APIC itself) are enabled in the BIOS.
Enable it also for OVA, as it can be emulated in virtual machines, even when
the host CPU does not support it.
Fixes #4337, fixes #4144, fixes #4345
The CPUfreq governor "powersave" sets the CPU statically to the lowest
frequency within the borders of scaling_min_freq and scaling_max_freq.
This can be useful if a particular power budget should not ever be
crossed. Can be set using `cpufreq.default_governor=powersave`. Note
that this obviously affects performance.
* Improve UX of HA CLI wrapper and emergency console
For many users, the emergency console gives the feeling that the system is
completely broken. However, there are various cases where the system just takes
a bit longer to start up and the emergency message is shown, while it
finishes a proper startup shortly after. This change tries to improve the UX in
several ways:
* The limit before a forced emergency console startup is changed to 3 minutes
* Waiting can be interrupted with Ctrl+C (reset counter is cleared then)
* Some hints on what to check have been added before starting the shell
* Also, if the HA CLI failed 5 times in a row in quick succession, the CLI
startup was not retried anymore and the user may have been left with a black
screen, so the restart limits and timeouts have been adjusted to only back off
and never mark the unit as failed
Closes #4273
* Use /bin/sh and printf to silence linter errors
RaspberryMatic was renamed to OpenCCU in
https://github.com/OpenCCU/OpenCCU/pull/3162. This changed the name of the
directory in the source tarball, causing a build failure when the archive
wasn't cached.
Use the --cidfile Docker CLI argument when starting the container and
bind-mount the generated file containing the full ID of the container to the
container itself.
Using --mount instead of --volume is needed, as --volume is racy and creates
an empty directory at the destination path instead.
This is a prerequisite for home-assistant/supervisor#6006 but can come in handy
for other cases too.
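A minimal sketch of how such a command line could be assembled, written in Python for illustration. `--cidfile` and `--mount type=bind,...` are real Docker CLI options; the helper name, the file paths, the in-container target path, and the image name are hypothetical and not taken from the actual startup script.

```python
from pathlib import Path


def build_run_args(image: str, cidfile: Path, target: str = "/run/cid") -> list[str]:
    """Build a `docker run` argv that exposes the container's own ID inside it."""
    return [
        "docker", "run", "--detach",
        # The Docker CLI writes the full container ID to this file.
        "--cidfile", str(cidfile),
        # Bind-mount the same file read-only into the container.
        "--mount", f"type=bind,source={cidfile},target={target},readonly",
        image,
    ]


args = build_run_args("example/image:latest", Path("/run/example.cid"))
print(" ".join(args))
```

Note that `docker run` refuses to start if the cidfile already exists, so a stale file from a previous run has to be removed first.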
Upstream commit [1] caused a regression in IPv4 routing which can cause some
routes to become broadcast even though they should be routed as unicast, e.g.:
# ip route get 1.1.1.1
broadcast 1.1.1.1 via 192.168.122.1 dev enp0s3 src 192.168.122.204 uid 0
cache <local,brd>
It's not entirely clear yet why it happens, but this behavior seems to be
triggered, for instance, when the SSDP integration sends a broadcast packet on
HA startup. While this behavior is not described in the regression report [1],
the commit cherry-picked from Linux master fixes the problem for us as well.
The patches were moved to a version-specific folder, as this one shouldn't be
applied on Raspberry Pi targets.
[1] https://lore.kernel.org/all/20250710142714.12986-1-oscmaes92@gmail.com/
[2] https://lore.kernel.org/stable/20250822165231.4353-4-bacs@librecast.net/
Fixes #4265
Revert the patch added in 6.12.43 which breaks the build of the CAN_TI_HECC
module present in the Tinker config. It's quite unlikely anyone is using it, so
we could simply disable the module, but reverting seems to be a better solution.
This reverts commit 22fe9b19ee.
There are major issues when the OS has no internet connectivity: in that case
the script doesn't follow the expected happy path after the rework and
eventually removes the Docker image, essentially bricking offline installations.
Since there is no immediate benefit for HAOS and the change turns out to be
high risk considering the planned release, leave it to be implemented later.
This knob controls whether Linux throws away its congestion
window (cwnd) after a connection has been idle for at least one
retransmission timeout (RTO). With a value of 0, Linux keeps the
cwnd it had before the idle period and can send that amount
immediately when the application resumes writing (still bounded
by the receiver's advertised window and by pacing).
With slow start after idle enabled (the default), Linux allows
only about 10 MSS (~14 KiB) in the first burst after idle. Even
when a connection stays open to web clients, a short idle forces
multiple round trips to ramp back up.
On Wi-Fi, local connections often have very low RTTs, which drives
the RTO down. Between page navigations the connection is considered
idle by Linux. If the next request happens during a transient
latency spike on the Wi-Fi link, the sender starts with a tiny
cwnd and must grow it over many RTTs, so the spike causes outsized
and visible loading delays.
For devices behind typical Internet uplinks, the higher RTT makes
the initial ramp-up feel even slower until the window regains size.
However, because the RTO is longer here, it takes longer before Linux
considers the connection idle, so it is less likely to be treated as
idle between navigations.
This change does not affect flows with very small receive windows
(e.g. many microcontrollers), which are limited by the peer's
advertised window rather than the sender's cwnd.
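To make the cost of the restart concrete, here is a rough back-of-the-envelope model in Python. It assumes idealized slow start (cwnd doubles every round trip, no loss, no receiver or pacing limit) and an MSS of 1460 bytes; the 500 KiB response size and the retained cwnd of 400 segments are made-up illustration values, not measurements.

```python
import math


def round_trips_to_send(payload_bytes: int, cwnd_segments: int, mss: int = 1460) -> int:
    """Round trips needed to deliver a payload when cwnd doubles each RTT."""
    segments_left = math.ceil(payload_bytes / mss)
    cwnd = cwnd_segments
    rtts = 0
    while segments_left > 0:
        segments_left -= cwnd  # one window's worth is delivered per RTT
        cwnd *= 2              # idealized slow start growth
        rtts += 1
    return rtts


payload = 500 * 1024  # hypothetical 500 KiB response

# cwnd collapsed to 10 segments after idle (the default behavior)
print(round_trips_to_send(payload, 10))   # 6
# cwnd of 400 segments kept across the idle period (value 0)
print(round_trips_to_send(payload, 400))  # 1
```

In this model the collapsed window costs five extra round trips, each of which is stretched by any latency spike on the link.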
Example RTOs on low jitter, low loss connections:
Defaults:
TCP_RTO_MIN = 200 ms
TCP_RTO_MAX = 120 s
low-jitter path so rttvar_us = 200 ms
HZ = 1000 or 250 or 100 (depending on the kernel settings)
*31 ms average RTT*
- SRTT ≈ 31 ms; RTTVAR ≈ 200 ms → Sum = 231 ms
- 'usecs_to_jiffies(231000)' = 231 jiffies (HZ 1000) -> RTO ≈ 231 ms
- If 'HZ = 250' (4 ms tick), ceil(231/4)=58 jiffies -> 232 ms RTO
- If 'HZ = 100' (10 ms tick), ceil(231/10)=24 jiffies -> 240 ms RTO
*178 ms average RTT*
- HZ=1000 (1 ms tick): 378 ms RTO
- HZ=250 (4 ms tick): ceil(378/4)=95 -> 380 ms RTO
- HZ=100 (10 ms tick): ceil(378/10)=38 -> 380 ms RTO
*292 ms average RTT*
- HZ=1000 (1 ms tick): 492 ms RTO
- HZ=250 (4 ms tick): ceil(492/4)=123 -> 492 ms RTO
- HZ=100 (10 ms tick): ceil(492/10)=50 -> 500 ms RTO
Any loss or jitter will increase those RTO values.
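The arithmetic above can be reproduced with a few lines of Python. This is a simplified model, not kernel code: it assumes RTTVAR sits at the 200 ms floor as stated in the defaults, and that the sum is rounded up to whole jiffies; the function name is made up for this sketch.

```python
RTO_MIN_MS = 200  # TCP_RTO_MIN, used here as the RTTVAR floor


def rto_ms(srtt_ms: int, hz: int, rttvar_ms: int = RTO_MIN_MS) -> int:
    """RTO in milliseconds after rounding SRTT + RTTVAR up to whole jiffies."""
    total_ms = srtt_ms + rttvar_ms
    jiffies = (total_ms * hz + 999) // 1000  # integer ceil(total_ms * hz / 1000)
    return jiffies * 1000 // hz


for srtt in (31, 178, 292):
    print(srtt, [rto_ms(srtt, hz) for hz in (1000, 250, 100)])
# 31 [231, 232, 240]
# 178 [378, 380, 380]
# 292 [492, 492, 500]
```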
Set net.ipv4.tcp_thin_linear_timeouts=1 to switch retransmission
timeout (RTO) backoff from exponential to linear for 'thin' TCP flows.
This reduces tail latency for API-style connections that typically have
very few packets in flight, improving recovery from sporadic loss without
changing anything for larger TCP transfers.
Kernel definition: A flow is considered thin when 'tp->packets_out < 4'
and while not in the initial slow start.
See tcp_stream_is_thin(tp) in include/net/tcp.h.
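The effect on recovery time can be illustrated with a small Python model. It is a simplification under stated assumptions: the base RTO is treated as constant, and linear mode is assumed to apply only to the first 6 retransmissions (my recollection of TCP_THIN_LINEAR_RETRIES, to be checked against the kernel source) before falling back to doubling. The 232 ms base RTO is just an example value.

```python
def retransmit_delays(rto_ms: int, retries: int, linear: bool,
                      linear_retries: int = 6, rto_max_ms: int = 120_000) -> list[int]:
    """Wait before each retransmission under exponential or thin-linear backoff."""
    delays = []
    current = rto_ms
    for attempt in range(1, retries + 1):
        delays.append(current)
        if linear and attempt <= linear_retries:
            current = rto_ms  # thin flow: re-arm with the base RTO
        else:
            current = min(current * 2, rto_max_ms)  # classic doubling
    return delays


exp = retransmit_delays(232, 5, linear=False)
lin = retransmit_delays(232, 5, linear=True)
print(exp, sum(exp))  # [232, 464, 928, 1856, 3712] 7192
print(lin, sum(lin))  # [232, 232, 232, 232, 232] 1160
```

With five consecutive losses, the thin flow in this model gets its fifth retransmission out after about 1.2 s instead of about 7.2 s.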
Increase the BlueZ temporary device timeout from the default 30s to 195s.
This prevents devices from being removed from D-Bus during connection
retries, especially when multiple connection attempts are queued.
The 195s timeout aligns with Home Assistant's Bluetooth stack behavior
for ESPHome proxies and prevents the 'device removal spiral' that occurs
when devices time out during sequential connection attempts.
Before the update to Buildroot 2025.02, the overlays directory on Yellow was
created by rpi-firmware under a condition added, confusingly, in a firmware
bump [1]. However, this got lost during the Buildroot update, and since Yellow
doesn't copy overlays from the rpi-firmware repo, the directory was never
created and rpi-rf-mod.dtbo couldn't be copied there in the pre-image build hook.
To make things more robust, create the overlays directory for rpi targets
conditionally in the hook instead of relying on rpi-firmware to create it.
[1] f1af1a0bf7
Fixes #4233
The Raspberry Pi Linux update to 6.12.34 broke some USB devices, mostly
USB-Serial converters connected to Yellow, but there are reports of some other
peripherals connected to RPi boards too.
This is a known RPi upstream issue [1] fixed by a PR [2] that has not been
merged into the RPi stable kernel yet. Applying the patches from this PR fixes
the issues.
Fixes #4228, refs #4229
[1] https://github.com/raspberrypi/linux/issues/6941
[2] https://github.com/raspberrypi/linux/issues/6936
To make system timezone configurable, we need to have /etc/localtime
writable, and it must be possible to atomically create a symlink from
this place, which means the whole parent folder must be writable. We
don't have /etc writable and can't use the usual bind mount for this.
The latest Systemd v258 has a patch that allows setting an environment
variable that sets where the localtime should be written. This can be
persisted in the overlay partition, with a symlink from /etc/localtime
leading there, finally pointing to the actual zoneinfo file. If the
symlink doesn't exist, it is created by the hassos-overlay script (it's
not really needed as UTC is the default, but Systemd does the same if
you change from a non-UTC timezone back to UTC).
Also disable BR2_TARGET_LOCALTIME, so /etc/localtime and /etc/timezone
(the latter is only informative and non-standard) are not written by the
tzdata package build.
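The requirement that the whole parent folder be writable follows from how a symlink is usually replaced atomically: a temporary link is created next to the final name and then renamed over it. The Python sketch below shows that common pattern only; it is not Systemd's implementation, and the function name and the `.tmp` suffix are made up for illustration.

```python
import os
from pathlib import Path


def atomic_symlink(target: str, link_path: Path) -> None:
    """Point link_path at target without a window where the link is missing."""
    tmp = link_path.with_name(link_path.name + ".tmp")
    try:
        os.unlink(tmp)  # clear a leftover from an interrupted run
    except FileNotFoundError:
        pass
    os.symlink(target, tmp)     # needs write access to the parent directory
    os.replace(tmp, link_path)  # rename(2) swaps the link in a single step
```

Because both the temporary link and the rename happen in the directory that holds the final link, a single writable file (for example, a bind-mounted /etc/localtime) is not enough.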
* Fix rpi-eeprom-update when device boots from NVMe
The boot partition detection doesn't work correctly if the device boots from
NVMe. Also the mounting step is unnecessary in HAOS as we can assume the boot
partition to be always mounted.
Fix the issues by modifying the bootfs detection logic to always use /mnt/boot.
However, the update still fails when flashrom can't be used (usually on CM4). On
CM5, or on Pi 5 booted from NVMe, the update process works without further
changes because the firmware can be flashed directly from the running system
using flashrom.
Fixes #4157
* Fix typo in patch commit message
When a follow request for logs is issued that points to or beyond the end of
the logs, a busy loop in systemd-journal-gatewayd can be triggered, which
manifests as systemd-journal-gatewayd consuming 100% CPU. Since a thread is
used for each request, the logs may still work but the CPU will be hogged until
systemd-journal-gatewayd, Supervisor, or the whole system is restarted.
Backport the patch submitted upstream that addresses this issue.
Fixes #4190