For users having non-English, and especially non-qwerty layouts, using the host
shell can be very awkward. There was no option to change the keymaps as they
haven't been installed in the OS, and the persistence couldn't have been
achieved because of read-only /etc.
With upstream patch merged in #4224, we have an option to put
/etc/vconsole.conf to a writable location and use the same approach as in the
timezone PR. This is needed because even if we only bind-mounted the file from
the overlay directory, the Systemd services which start early will still refer
to the inode on the read-only FS. Also, gzip is required as current version of
kbd in Buildroot (v2.6.4) always compresses the keymaps using gzip. We can get
rid of this after we bump to kbd v2.9.0 [1] or newer. The overall bloat in
local build of the OS is slightly over 1 MiB, so it is acceptable.
With these changes, the `localectl set-keymap` command can be used to use any
available keymap from the installed `kbd` package (refer to `localectl
list-keymaps` for complete lists) and persist it between reboots.
[1] https://github.com/legionus/kbd/releases/tag/v2.9.0Fixes#1775
The deprecated-key-path option is no longer handled, but it doesn't cause
problems because the key is explicitly ignored. It was completely removed in
Docker 19.03.0 [1].
As such, the option and the pre-start script to fix the corrupted key.json can
be removed now, as it has no effect, only printing confusing message when
Docker service fails to start.
[1] 98fc09128b
Prefer the containerd snapshotter by using it by default for new installs and
when no Docker data is present (e.g. after datadisk wipe). The snapshotter is
enabled by a dockerd flag which is set when a flag file is present in the data
partition. This flag file can be used also to opt-in for this snapshotter on
legacy installs (high level API through OS Agent and Supervisor TBD), to
migrate to the containerd snapshotter this file can be simply created manually.
Testing shown no major problems when migrating, the old overlay2 folder can be
(and should be - to avoid situations where the data disk might run out of
space) deleted before the docker.service is started in the docker-prepare
script.
Note that there's no offline migration path, OS needs to be connected to the
internet to re-download the images when migrating. This could be theoretically
possible through docker image save/load functions but guarding for enough of
space and other edge cases would be probably too complex to justify it.
Refs #4252
Refs #4253 - easier opt-in method is still needed
Closes#4254 - migration is handled seamlessly by Docker
To make system timezone configurable, we need to have /etc/localtime
writable, and it must be possible to atomically create a symlink from
this place, which means the whole parent folder must be writable. We
don't have /etc writable and can't use the usual bind mount for this.
Latest Systemd v258 has patch that allows setting an environment
variable that sets where the localtime should be written. This can be
persisted in the overlay partition, with a symlink from /etc/localtime
leading there, finally pointing to the actual zoneinfo file. If the
symlink doesn't exist, create it by hassos-overlay script (it's not
really needed as UTC is the default, but Systemd does the same if you
change from non-UTC timezone back to UTC).
Also disable BR2_TARGET_LOCALTIME, so /etc/localtime and /etc/timezone
(the latter is only informative and non-standard) are not written by the
tzdata package build.
As the reported MemTotal can fluctuate a bit on some systems, e.g. because the
reserved memory changes between kernel version or other factors affect it like
VRAM, the swap file can be recreated unnecessarily between boots. Allow for
some fluctuation (up to +-32MB) before the swapfile is recreated.
This was a problem already before the recent haos-swapfile changes, however,
before it checked if the existing swapfile isn't smaller than the desired
value. If the MemTotal fluctuated there, the swapfile size eventually settled
on the highest value seen and it wasn't recreated anymore. With this change,
things should be stable even more.
Use simple shell script to perform device wipe instead of calling OS Agent to
do that through the UDisks2 API. While it might have been a good idea to use
high level interface for that back then, it turns out it causes more issues
than the benefits it could bring.
Main problem currently is that the OS Agent needs to read sysctl variables, but
those are only set after mounting the overlay partition. But at the same time,
the overlay partition can't be mounted if we want to wipe it - this creates a
dependency cycle through the haos-agent.service.
To get rid of the cycle and simplify things, use a shell script doing basically
the same what the OS Agent does. Since the wipe functionality only makes sense
to be implemented on HAOS targets (not on Supervised), there's little point of
having it in higher layer of abstraction that OS Agent provides.
It should be also checked if changes from #1291 are needed anymore, as the
driving factor for those have been probably the wipe feature in OS Agent too,
but at this point they seem to be harmless.
Allow configuration of the swap size via /etc/default/haos-swapfile file. By
setting the SWAPSIZE variable in this file, swapfile get recreated on the next
reboot to the defined size. Size can be either in bytes or with optional units
(B/K/M/G, accepting some variations but always interpreted as power of 10). The
size is then rounded to 4k block size. If no override is defined or the value
can't be parsed, it falls back to previously used 33% of system RAM.
Fixes#968
* Move rauc.db to boot partition
The RAUC metadata file contains information that is tightly related to the
system and kernel partitions. With the possibility to migrate data disk, the
rauc.db can contain bogus information when moved to a different system. Removal
of the file on "device wipe" is also not desirable, because the information
about slot status is lost.
Relocate the rauc.db to the boot partition after a system upgrade (as this
can't be handled by RAUC hooks, because it needs to be executed after all slots
and metadata is written) and adjust the script for recreating it. The downside
is that its content in /mnt/data would be recreated if the boot slot is changed
or system downgraded but this should be handled quite gracefully.
Also remove the raucdb-first-boot service which is no longer necessary
with the file not present in the data partition.
* Fix shellcheck and mount path
If HAOS on Yellow is booted for the first time with NVMe data disk present, it
should be preferred over the empty eMMC data partition. This will ease
reinstall of the system and migration from CM4 to CM5. All other data disks
(e.g. if a USB drive is used for them) are still treated as before, requiring
manual adoption using the Supervisor repair.
RAUC currently doesn't know the version of the booted slot when booted for the
first time or after wiping the data partition. As a result `ha os info` is
missing this information too.
As there's no built-in mechanism for generating these data by RAUC itself, add
a oneshot service that checks if the boot slot information is contained in the
rauc.db and if not, then generate it.
RAUC seems to cope quite well even with bogus data contained in rauc.db but in
any case, a test has been added to check that everything works as expected.
* Add initial Raspberry Pi 5 buildroot config
* Add machine-id support via cmdline.txt
* Add new entry if entry is missing
* Don't overwrite cmdline.txt when adding machine-id
Use sed to append the new cmdline parameter to the first line.
* Skeleton script for RAUC custom bootloader interface
* Deploy kernel/device-tree into a RAUC slot specific directory
This allows us to use the os_prefix feature to switch between slot A and
B. Compared to the boot_partition option, this option allows to use a
shared config.txt and cmdline.txt, which makes it more like how HAOS
currently works on other Raspberry Pis.
* Deploy new kernel/device-tree to correct slot on installation
* Increase boot size to 128MB
This makes sure we can store up to three kernels (slot A, B and an
temporary one while installing the OTA update).
* Initial tryboot implementation using os_prefix
* Make sure to delete the old slot completely
* Add Busybox xargs for tryboot bootloader script
* Compare tryboot bootloader file silently
* Revert "Increase boot size to 128MB"
This reverts commit 7f2c69b58f02f500d6aeee4f0a419046899b5e38.
* Use compressed kernel
* Address shellcheck
* Address shellcheck issue in rauc-hook
* Fix shellcheck for rpi-tryboot.sh
* Do not follow source - it gets checked separately
* Correctly set the slot to boot
* Apply suggestions from code review
Co-authored-by: Jan Čermák <sairon@users.noreply.github.com>
* Drop serial console from default cmdline.txt
* Resync rpi5_64_defconfig with rpi4_64_defconfig
* Improve machine-id match
Only match actual hexadecimal characters.
* Deploy firmware overlays to OS prefix directory
* Add Raspberry Pi 5 to documentation
* Bump buildroot
* buildroot fd1dc86f40...f13ad03408 (1):
> linux: add in-tree device tree overlay support
* Install device tree overlays from Kernel sources
* Drop RPi RF modules for now
No Raspberry Pi 5 specific device tree overlays are available, drop RPi
RF mod for now.
* Use Raspberry 5 specific identifiers for Supervisor/OS Agent
* Bump buildroot
* buildroot f13ad03408...07e08e01b2 (1):
> linux: fix add in-tree device tree overlay support
* Revert "Drop RPi RF modules for now"
This reverts commit 46fc1701e4.
---------
Co-authored-by: Jan Čermák <sairon@users.noreply.github.com>
* Add fsfreeze support for QEMU/KVM/Proxmox installations
Add fsfreeze scripts which calls the new Supervisor API to freeze Home
Assistant Core and add-ons which support the backup freeze scripts
(`backup_pre` and `backup_post`).
This allows to create safe snapshots with databases running.
* Fix lint issues
* Fix swapfile creation for all memory sizes
In certain situation awk prints the swapfile size in scientific
notation. The script can't deal with that, in which case swap file
creation fails.
Use int to convert the number to an integer.
Since pages are 4k, also make sure swapsize is aligned to 4k blocks.
* Add info message
* Use zswap instead of swap in zram
This requires a swap file which will get generated automatically on
startup.
* Fix file size and free disk space comparison
* Set zswap factor to 33%
* Set vm.swappiness to 1
Decrease swapping to a minimum. This is also recommended for database
work loads by the MariaDB documentation. In practice it causes the least
amount of writes to disk when under memory pressure, while still making
swap available when needed.
* Avoid moving data to same device
When a data disk move is triggered when the data disk is already in use
the script currently renames that only data disk, rendering the system
unusable.
Don't continue if source and destination happens to be the same device.
* On failure rename to hassos-data-fail
The label hassos-data-failed is too long.
* Deactivate any external data disk device on first boot (#2390)
* Use lsblk to determine the underlying device file
Comparing major number is not reliable, e.g. virtio disks have the same
major number despite being different devices. Use lsblk to find the
underlying device, and compare the device name instead.
* Fix Docker key.json corruption check
Since /etc/docker does not get bind mounted anymore (see #2116),
key.json from the overlay partition is used directly.
* Use -e flag for jq to get useful exit code
It seems that Docker can fail to start if there is no space left on the
device. Try to free up some space in that case by asking journald to
limit its size to 256MiB.
This should work for any storage larger than ~2.5GiB (as the journals
maximum size is 10% of the disk size). It still should leave enough
logs to diagnose problems if necessary.
Note: We could also limit the size of the journal in first place, but
that isn't sustainable: Once that space is used up, we run into the
same problem again.
By only asking journalctl to free up if necessary, we kinda (miss)use
the journal as way to "reserve" some space which we can free up at boot
if necessary.
* Drop default NetworkManager configuration
NetworkManager will automatically connect using the global defaults.
Also Supervisor today will create a profiles once the user configures
the network explicitly.
* Create system-connection directory
* Add AArch64/ARM64 EFI boot support (for QEMU and some boards)
* Allow GRUB to load cmdline.txt-like
* Enable qcow2/vmdk disk images
Co-authored-by: Stefan Agner <stefan@agner.ch>
It seems that Busybox shell (ash) cannot calculate the disk size
properly probably due to integer overflow. Use jq to calculate the last
usable LBA which seems to be able to handle large integers.
Partition handling for disks with 4k sectors broke partition resizing
when using MBR disk label. It seems that sfdisk doesn't calculate the
last LBA for diks with MBR label. Calculate the last usable LBA ourselfs
in the MBR case.
The calculation whether to resize the partition only works with disks
with 512 byte sector size. Use values provided by sfdisk exclusively to
make sure comparing the same sector size.
Furthermore, it seems that sgdisk does not like sfdisk's backup GPT
placement:
$ sgdisk -e /dev/zram1
Warning! Secondary partition table overlaps the last partition by 250 blocks!
Today it seems sfdisk can handle GPT quite well. Use sfdisk for all
operations in hassos-expand.
* Use systemd-growfs instead of resize2fs (#1106)
Since systemd 236 systemd has a built-in file system growing mechanism.
The mechanism relies on the kernels online file system resize
capabilities instead of the external resize2fs utility. Online resizing
is supposedly much faster since the kernel takes care of things.
This also makes sure that external file systems get resized which
previously have not been taken care of.
* Drop HA OS specific file system resizing
Since we have systemd-growfs in place now we can drop our file system
resizing code.
* Make sure /dev/disk/by-label/hassos-data is present after resizing
Note: systemd will retry mnt-data.mount later, so at least in theory
this shouldn't really matter. However, the journal has a lot of churn
due to that reordering.
* Avoid waiting for external drive unnecessarily
Even though the condition to start hassos-data.service is not met (the
file /mnt/overlay/data-move is not there by default), it seems that
systemd waits for the dependencies for hassos-data.service. Don't
Require or Wants any dependencies which might not be present by
default.
* Use systemd to wait for partition using partlabel device
* Use sfdisk which allows to wipe filesystem signatures
Even though we zap the partition table using sgdisk, the file system
superblock (which contains the file system label) does survive. This
can cause problems when trying to reuse a disk previously already
labeled using hassos-data: It might take precendence on next boot
over the existing data partition on the eMMC.
Make sure to clean all file system signatures using sfdisk.
* Make the datactl command more robust
Validate target disk (partition) size to avoid a copy attempt which will
fail. If e2image operation fails, make sure the leftover copy is not
regonized as data partition.
* Fix hassos-data service device unit dependencies
* Rewrite datactl command
Prepare the target partition as part of the datactl command. Rely on
partlabel for the target disk since we are always using GPT on the
target disk. Use systemd and partlabel mechanism to wait and find
the target data disk. Keep using the file system label to identify
the source disk.
Also use e2image instead of raw dd to move data. This should
speed up the processes significantly.
* Fix corner case when reusing same disk again
* Use /run as default location for lock files for U-Boot tools
While there is a command line parameter to set the lock file explicitly,
there are other tools invoking fw_setenv (in particular rauc) which do
not set the lock file. Using /run by default makes fw_setenv use the
correct lock file in all situations.
* Don't explicitly set lock file location
Since we patch U-Boot tools to use /run by default setting it explicitly
is unnecessary.
* Update buildroot-patches for 2020.11-rc1 buildroot
* Update buildroot to 2020.11-rc1
Signed-off-by: Stefan Agner <stefan@agner.ch>
* Don't rely on sfdisk --list-free output
The --list-free (-F) argument does not allow machine readable mode. And
it seems that the output format changes over time (different spacing,
using size postfixes instead of raw blocks).
Use sfdisk json output and calculate free partition space ourselfs. This
works for 2.35 and 2.36 and is more robust since we rely on output which
is meant for scripts to parse.
* Migrate defconfigs for Buildroot 2020.11-rc1
In particular, rename BR2_TARGET_UBOOT_BOOT_SCRIPT(_SOURCE) to
BR2_PACKAGE_HOST_UBOOT_TOOLS_BOOT_SCRIPT(_SOURCE).
* Rebase/remove systemd patches for systemd 246
* Drop apparmor/libapparmor from buildroot-external
* hassos-persists: use /run as directory for lockfiles
The U-Boot tools use /var/lock by default which is not created any more
by systemd by default (it is under tmpfiles legacy.conf, which we no
longer install).
* Disable systemd-update-done.service
The service is not suited for pure read-only systems. In particular the
service needs to be able to write a file in /etc and /var. Remove the
service. Note: This is a static service and cannot be removed using
systemd-preset.
* Disable apparmor.service for now
The service loads all default profiles. Some might actually cause
problems. E.g. the profile for ping seems not to match our setup for
/etc/resolv.conf:
[85503.634653] audit: type=1400 audit(1605286002.684:236): apparmor="DENIED" operation="open" profile="ping" name="/run/resolv.conf" pid=27585 comm="ping" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
The hassos-expand script calls sfdisk to find free disk space. It seems
that today it considers the space before the first partition as free:
$ sudo sfdisk -Fq /dev/sdi
Start End Sectors Size
2048 16383 14336 7M
This causes the script to always resize. It seems not to cause harm to
the partition table (it does not resize really). However, the call to
partx seems to confuse systemd and kill the mnt-data.mount process
(presumably because udev causes remove/add events for the by-label
device units).
Consider everything below 8MiB to not be worthy of a size change. This
avoids missdetection and resize attempts where there is no need.