This week I was troubleshooting a PXElinux boot loop on a customer's IBM hardware.
The system had been added as a bare-metal host in Foreman and configured to run CentOS 7. Everything needed to provision the hardware was in place:
- (U)EFI configuration (boot order: 1. PXE, 2. local RAID)
- RAID configuration
- DHCP configuration
- Server-side PXElinux configuration
- Software Repo access
After a reboot, the system starts the network-based OS installation and then reboots once more to boot into the new operating system. Normally the boot process looks like this:
1. Power on
2. POST / UEFI magic / hardware tests
3. UEFI loads hardware drivers (NIC, RAID controller, etc.)
4. Enterprise lifecycle tool collects hardware information (optional)
5. Network DHCP discover, offer, request, ack
6. Download boot managers and other files from the DHCP/TFTP server via TFTP and load them
7. If PXE booting or obtaining a DHCP lease fails, boot from local disk(s)/RAID
In my particular case, step 6 failed after the network boot manager/menu had been downloaded. Once the OS installation has finished, the installer reports back to Foreman with an HTTP GET (wget http://foreman/unattended/built). When that request succeeds, Foreman switches the PXElinux configuration for this specific host (identified by its MAC address) to the default PXElinux provisioning template. In essence, that template tells the system to boot from local media (LOCALBOOT 0).
The system is unable to execute this command and falls back to PXE booting again. The files are downloaded via TFTP once more, and the local boot fails again. This repeats until a certain number of retries has been reached. The system then stops doing anything. It won't try to boot from the local disk(s)/RAID; I think this is because the firmware considers the PXE boot successful.
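When debugging a loop like this, it helps to look at the per-host file on the TFTP server to see what the host is actually being told to do. PXELINUX looks under pxelinux.cfg/ for a file named after the ARP hardware type (01 for Ethernet) followed by the MAC address in lowercase hex, separated by dashes. The following is only a minimal Python sketch of that naming rule; the function name and the example MAC address are made up for illustration.

```python
import re


def pxelinux_config_name(mac: str) -> str:
    """Return the per-host PXELINUX config path for an Ethernet MAC address.

    Hypothetical helper for illustration; not part of Foreman or syslinux.
    """
    # Accept colon-, dash- or dot-separated input, e.g. AA:BB:CC:DD:EE:FF
    digits = re.sub(r"[:.\-]", "", mac)
    if not re.fullmatch(r"[0-9A-Fa-f]{12}", digits):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    digits = digits.lower()
    pairs = [digits[i:i + 2] for i in range(0, 12, 2)]
    # "01" is the ARP hardware type for Ethernet
    return "pxelinux.cfg/01-" + "-".join(pairs)


if __name__ == "__main__":
    # Example MAC address, chosen arbitrarily
    print(pxelinux_config_name("52:54:00:AB:CD:EF"))
```

Looking at that file on the TFTP server before and after the installation finishes shows whether Foreman has already swapped in the local-boot template.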
The solution was very simple. A post by Pascal Legrand on the syslinux mailing list pointed me in the right direction.
I changed the Foreman provision template to the following (notice the last three lines):
    DEFAULT menu
    PROMPT 0
    MENU TITLE PXE Menu
    TIMEOUT 200
    TOTALTIMEOUT 6000
    ONTIMEOUT local

    LABEL local
    MENU LABEL (local)
    MENU DEFAULT
    #LOCALBOOT 0
    COM32 chain.c32
    APPEND hd0
Now chainloading GRUB and booting from the local RAID works! Using chain.c32 also works for chainloading Windows boot loaders and in a VMware vSphere environment, so I've set it as the global default.
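Since the template is now used for every host, a quick sanity check of the rendered configuration can be useful. The sketch below is a small, hypothetical Python helper (not part of Foreman or syslinux) that reads a PXElinux config and reports whether a given label boots via LOCALBOOT or by chainloading with chain.c32. It only understands the handful of directives used in the template above and assumes the label is called "local".

```python
TEMPLATE = """\
DEFAULT menu
PROMPT 0
MENU TITLE PXE Menu
TIMEOUT 200
TOTALTIMEOUT 6000
ONTIMEOUT local

LABEL local
MENU LABEL (local)
MENU DEFAULT
#LOCALBOOT 0
COM32 chain.c32
APPEND hd0
"""


def local_boot_method(config: str, label: str = "local") -> str:
    """Report how the given LABEL block boots: 'localboot', 'chainload' or 'unknown'."""
    in_label = False
    for raw in config.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments such as "#LOCALBOOT 0"
        parts = line.split(None, 1)
        keyword = parts[0].upper()
        argument = parts[1] if len(parts) > 1 else ""
        if keyword == "LABEL":
            # A new LABEL block starts; track whether it is the one we want
            in_label = (argument == label)
        elif in_label and keyword == "LOCALBOOT":
            return "localboot"
        elif in_label and keyword in ("COM32", "KERNEL") and "chain.c32" in argument:
            return "chainload"
    return "unknown"


if __name__ == "__main__":
    print(local_boot_method(TEMPLATE))
```

Run against the per-host files on the TFTP server, a check like this makes it easy to spot hosts that still carry the old LOCALBOOT variant.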