This reverts commit 22fe9b19ee.
There are major issues when OS has no internet connectivity - in such case the
script doesn't go the expected happy path after the rework and eventually
removes the Docker image, essentially bricking offline installations.
Since there is no immediate benefit for HAOS and such change turns out to be
high risk considering the planned release, leave it to be implemented later.
This knob controls whether Linux throws away its congestion
window (cwnd) after a connection has been idle for at least one
retransmission timeout (RTO). With a value of 0, Linux keeps the
cwnd it had before the idle period and can send that amount
immediately when the application resumes writing (still bounded
by the receiver's advertised window and by pacing).
With slow start after idle enabled (the default), Linux allows
only about 10 MSS (~14 KiB) in the first burst after idle. Even
when a connection stays open to web clients, a short idle forces
multiple round trips to ramp back up.
On Wi-Fi, local connections often have very low RTTs, which drives
the RTO down. Between page navigations the connection is considered
idle by Linux. If the next request happens during a transient
latency spike on the Wi-Fi link, the sender starts with a tiny
cwnd and must grow it over many RTTs, so the spike causes outsized
and visible loading delays.
For devices behind typical Internet uplinks, the higher RTT makes
the initial ramp-up feel even slower until the window regains size.
However, here the connection does take longer to drop to idle, for
Linux standards. So the connection is less likely to be considered
idle between navigations.
This change does not affect flows with very small receive windows
(e.g. many microcontrollers), which are limited by the peer's
advertised window rather than the sender's cwnd.
Example RTOs on low jitter, low loss connections:
Defaults:
TCP_RTO_MIN = 200 ms
TCP_RTO_MAX = 120 s
low-jitter path so rttvar_us = 200 ms
HZ = 1000 or 250 or 100 (depending on the kernel settings)
*31 ms average RTT*
- SRTT ≈ 31 ms; RTTVAR ≈ 200 ms → Sum = 231 ms
- 'usecs_to_jiffies(231000)' = 231 jiffies (HZ 1000) -> RTO ≈ 231 ms
- If 'HZ = 250' (4 ms tick), ceil(231/4)=58 jiffies -> 232 ms RTO
- If 'HZ = 100' (10 ms tick), ceil(231/10)=23 jiffies -> 240 ms RTO
*178 ms average RTT*
- HZ=1000 (1 ms tick): 378 ms RTO
- HZ=250 (4 ms tick): ceil(378/4)=95 -> 380 ms RTO
- HZ=100 (10 ms tick): ceil(378/10)=38 -> 380 ms RTO
*292 ms average RTT*
- HZ=1000 (1 ms tick): 492 ms RTO
- HZ=250 (4 ms tick): ceil(492/4)=123 -> 492 ms RTO
- HZ=100 (10 ms tick): ceil(492/10)=50 -> 500 ms RTO
Any loss or jitter will increase those RTO values.
Set net.ipv4.tcp_thin_linear_timeouts=1 to switch retransmission
timeout (RTO) backoff from exponential to linear for 'thin' TCP flows.
This reduces tail latency for API-style connections that typically have
very few packets in flight, improving recovery from sporadic loss without
changing anything for larger TCP transfers.
Kernel definition: A flow is considered thin when 'tp->packets_out < 4'
and while not in the initial slow start.
See tcp_stream_is_thin(tp) in include/net/tcp.h.
Increase the BlueZ temporary device timeout from the default 30s to 195s.
This prevents devices from being removed from D-Bus during connection
retries, especially when multiple connection attempts are queued.
The 195s timeout aligns with Home Assistant's Bluetooth stack behavior
for ESPHome proxies and prevents the 'device removal spiral' that occurs
when devices timeout during sequential connection attempts.
Before update to Buildroot 2025.02, the overlays directory on Yellow was
created by rpi-firmware in a condition added confusingly in firmware bump [1].
However, this got lost during Buildroot update, and since Yellow doesn't copy
overlays from the rpi-firmware repo, the directory was never created and the
rpi-rf-mod.dtbo couldn't be copied there in pre-image build hook.
To make things more robust, create the overlays directory for rpi targets
conditionally in the hook instead of relying on rpi-firmware to create it.
[1] f1af1a0bf7Fixes#4233
Raspberry Pi Linux update to 6.12.34 broken some USB devices, mostly USB-Serial
converters connected to Yellow, but there are reports of some other peripherals
connected to RPi boards too.
This is a known RPi upstream issue [1] fixed by a PR [2] that's not been merged
to RPi stable kernel yet. Applying patches from this PR fixes the issues.
Fixes#4228, refs #4229
[1] https://github.com/raspberrypi/linux/issues/6941
[2] https://github.com/raspberrypi/linux/issues/6936
To make system timezone configurable, we need to have /etc/localtime
writable, and it must be possible to atomically create a symlink from
this place, which means the whole parent folder must be writable. We
don't have /etc writable and can't use the usual bind mount for this.
Latest Systemd v258 has patch that allows setting an environment
variable that sets where the localtime should be written. This can be
persisted in the overlay partition, with a symlink from /etc/localtime
leading there, finally pointing to the actual zoneinfo file. If the
symlink doesn't exist, create it by hassos-overlay script (it's not
really needed as UTC is the default, but Systemd does the same if you
change from non-UTC timezone back to UTC).
Also disable BR2_TARGET_LOCALTIME, so /etc/localtime and /etc/timezone
(the latter is only informative and non-standard) are not written by the
tzdata package build.
* Fix rpi-eeprom-update when device boots from NVMe
The boot partition detection doesn't work correctly if the device boots from
NVMe. Also the mounting step is unnecessary in HAOS as we can assume the boot
partition to be always mounted.
Fix the issues by modifying the bootfs detection logic to always use /mnt/boot.
However, still fail in case when flashrom can't be used (usually on CM4). On
CM5, or on Pi 5 booted from NVMe, update process works without further changes
because the firmware can be flashed directly from the running system using
flashrom.
Fixes#4157
* Fix typo in patch commit message
When follow request for logs is issued that points to/beyond the end of logs, a
busy loop in systemd-journal-gatewayd can be triggered which manifests as
systemd-journal-gatewayd consuming 100% CPU. Since threads are used for each
request, the logs may still work but the CPU will be hogged until the restart
of systemd-journal-gatewayd, Supervisor, or the whole system.
Backport the patch submitted upstream that addresses this issue.
Fixes#4190
As there is VirtualBox available for aarch64 on Apple Macs, provide OS images
also in the native VirtualBox format, which also grants the ability to resize
existing disk images, unlike VMDK.
Fixes#4171 & fixes#4172
Enable option for the netfilter NETMAP target, as it can be useful for some
users. Until now it's been enabled only for some targets as an option coming
from upstream defconfigs; make sure it's available for all targets.
Fixes#4183
The ip6tables configuration is now enabled by default since Docker 27
(see https://github.com/moby/moby/pull/47747). The experimental config
got introduced with the ip6tables flag in #2051. There is no other
experimental feature used from what I am aware of, so lets remove the
experimental flag as well.
Unbind the Bluetooth driver for Broadcom HCI module before the bluetooth
service starts if running on board without WiFi module. This is a replacement
for #2948 but using a more targeted approach for removing the particular driver
and better detection of no-WiFi (thus no-Bluetooth) models.
This still means the driver will be probed and couple of lines printed when it
fails to set baudrate and reset the module, yet this should be benign, at least
the all-zero MAC device no longers appears in Bluetooth stack.
Backport patch for traces appearing since v4.21.0 bump, introduced in #4095.
This change is not available in any newer tagged release of the driver and the
commit message upstream is messed up, hence the reworded patch.
Make sure that all LAN drivers used on Raspberry Pi boards are built-in.
Although they are defined as such in the base defconfig, we change them to
modules in device support includes. For simplicity and keeping kernel config
close to the RPi OS config, change them all to built-in in the main RPi include
for all RPi targets.
This is not only a formal change - at least one regression is known if the PHY
driver on RPi 5 is not built-in and MAC driver is - in that case the PHY hooked
up to the RP1 isn't initialized properly, and it is reported as "Generic PHY"
instead, e.g. breaking the control of LEDs through dtparams. Relevant dmesg log
before the change:
macb 1f00100000.ethernet end0: PHY [1f00100000.ethernet-ffffffff:01] driver [Generic PHY] (irq=POLL)
And after the change:
macb 1f00100000.ethernet eth0: PHY [1f00100000.ethernet-ffffffff:01] driver [Broadcom BCM54213PE] (irq=POLL)
Fixes#3333
Bind-mount Systemd Journal socket to the Supervisor container. This way
Supervisor can use the socket directly for writing log entries using the
Systemd native Journal protocol [1] instead of logging to stderr of the
container.
[1] https://systemd.io/JOURNAL_NATIVE_PROTOCOL/
This reverts commit eab18076ad.
This change was added in #2948 as a workaround for all-zero adapter appearing
in the HA frontend (#2944). With changes implemented in [1], this is no longer
needed, the only minor issue is that the ghost adapter still appears in
hciconfig (and other utilities') output as reported in [2]. However, this
should be less problematic than the Bluetooth being unavailable if WiFi is
disabled through disable-wifi DT overlay, so let's start with removing the
workaround.
Fixes#2975
[1] https://github.com/Bluetooth-Devices/bluetooth-adapters/pull/105
[2] https://github.com/raspberrypi/linux/issues/5756
When following logs in Home Assitant frontend, the last line may be duplicated
over time when no new lines are added. This is because systemd-journal-gatewayd
incorrectly processed the num_skip part of the Range header, always returning
the last entry even when it should have been skipped.
Backport the patch for Systemd that processes the header correctly.
Fixes#4101
* Bump buildroot to update package/pigz
* Enable parallel gzip for faster Docker pulls
Docker checks if unpigz is available, and if so uses it to unpack
container layers with multiple CPU cores. This should make Docker pulls
faster, especially on lower end hardware.
Since update to Systemd v256.x the Range header requires the num_entries part
and fails if it's not provided, which we worked around by [1]. With this patch
that was already accepted upstream, the workaround shouldn't be necessary
anymore.
[1] https://github.com/home-assistant/supervisor/pull/5827
Add driver for Marvell PHYs, such as 88E1543(4L) on an ASRock C3758D4I-4L
board. Adding it to x86 config only, as it seems it's not widely used anywhere
else.
Fixes#4025
Bump Hailo stuff to the latest version. While this is a breaking change for
add-ons depending on the driver, the most commonly used one (i.e. Frigate)
didn't bump to v4.20.1 on their stable channel either, so it shouldn't have
significant impact. We agreed with @blakeblackshear that once HAOS bumps the
Hailo driver in HAOS 16, Frigate will follow.
Backport /boots endpoint for Systemd so we can use it in Supervisor to get the
actual list of boots. Should be available upstream since Systemd v258, for v256
minor tweaks were needed.
When creating OVA image, the CPU is slacking at the end of the build because it
is creating three ZIP archives, each one on a single CPU only. As we're
creating only single-entry archives, we can use pigz to use all cores.
The actual speedup on my machine (16C/32T) reflects the number of cores - it
takes around 2 seconds instead of 1 minute.