What can be achieved in two days? Quite a lot…

I am on the way back from London after spending some time with lead iPXE developer Michael Brown in Cambridge and then meeting up with people at the 2Pint London office. In Cambridge we did some great work troubleshooting issues that we have been dealing with. As is most often the case, it came down to hardware drivers not being up to the job of handling the amount of data that iPXE forces them to handle. We also discovered a heap memory exhaustion under HTTPS and peerdist (BranchCache), which I had seen in the lab before (which is why HTTPS is not in 2.5). Great progress though, and some of these changes will be incorporated ASAP into iPXE Anywhere 2.6, which is our next release.

There are also other changes coming to our iPXE solution that will not be covered in this post. One is worth mentioning, though: we are moving to iPXE.EFI instead of SNPOnly.efi, which will boost transfer speeds to close to 1GB/s and also remove a lot of the issues we see with SNP based drivers. If iPXE.EFI cannot find an appropriate driver, it will then revert to SNP. It will be awesome.

Outstanding actions to be fixed (Coming Soon)

These items are already solved technically, but we still need to implement them in iPXE in a clean way.

Out of memory while running HTTPS and Peerdist (BranchCache)

Heap exhaustion occurs when running under HTTPS with multiple connections alive. This is typically only the case when running under HTTPS and making multiple connections to peers. As iPXE by default runs 32 active connections, we basically run out of memory when using peerdist under HTTPS. A fix for this is under way: in the short term, by simply increasing heap space. Long term, this is a situation that can theoretically happen on other systems, so a rewrite of the code to allow caching data outside the heap, the way that peerdist does its transfers, might be a possible approach.

Windows Authentication

Today iPXE supports a range of authentication providers, but does not support Windows Authentication NTLM v2. Adding NTLM v2 support gives iPXE the ability to download data from IIS servers that require authentication. In the iPXE Anywhere world it means that we can download from DPs that are not protected with HTTPS without the need to enable the “Allow anonymous access” option on the DP. This will greatly help when installing the 2PXE server in ConfigMgr mode, as no changes to the Distribution Point are required.

Already Fixed in iPXE – now waiting to be implemented

If you are interested, here are some excerpts copied from the latest commits to the main branch of iPXE, along with some coming changes that have already been fixed:

Gather and report peer statistics during download

iPXE can now record and report the number of peers (calculated as the maximum number of peers discovered for a block’s segment at the time that the block download is complete), and the percentage of blocks retrieved from peers rather than from the origin server. This allows us to show the actual percentage of content coming from peers. As we also send this information using the syslog feature, it can be displayed in real time in StifleR.
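As a rough illustration of how those two figures can be derived, here is a minimal Python sketch. It is a model of the calculation described above, not iPXE's actual C code; the `PeerStats` class and all of its field and method names are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class PeerStats:
    """Running download statistics (hypothetical structure, not iPXE's own)."""

    total_blocks: int = 0  # blocks completed so far
    peer_blocks: int = 0   # blocks that were supplied by a peer
    max_peers: int = 0     # highest peer count seen for any block's segment

    def block_done(self, segment_peers: int, from_peer: bool) -> None:
        """Record one completed block.

        segment_peers: peers discovered for the block's segment at completion
        from_peer: True if a peer supplied the block, False for the origin server
        """
        self.total_blocks += 1
        if from_peer:
            self.peer_blocks += 1
        self.max_peers = max(self.max_peers, segment_peers)

    def peer_percent(self) -> int:
        """Whole-number percentage of blocks retrieved from peers."""
        if self.total_blocks == 0:
            return 0
        return (100 * self.peer_blocks) // self.total_blocks

    def report(self) -> str:
        """Format the two figures as a single status line."""
        return f"peers {self.max_peers}, from peers {self.peer_percent()}%"
```

Calling `block_done()` once per completed block and `report()` at the end gives a line that could be shown on screen or sent over syslog.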

Avoid false positive warnings from valgrind

Calling discard_cache() is likely to result in a call to free_memblock(), which will call valgrind_make_blocks_noaccess() before returning. This causes valgrind to report an invalid read on the next iteration through the loop in alloc_memblock().
Fix by explicitly calling valgrind_make_blocks_defined() after discard_cache() returns. Also call valgrind_make_blocks_noaccess() before calling discard_cache(), to guard against free list corruption while executing cache discarders.

Impose receive quota on tap driver

The tap driver can retrieve a potentially unlimited number of packets in a single poll. This can lead to heap exhaustion under heavy load. Fix by imposing an artificial receive quota (as already used in other drivers without natural receive limits).
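The idea of a receive quota can be sketched in a few lines of Python. This is only a model of the technique, not the tap driver itself: the `poll` function, the `device_queue` stand-in for the kernel's packet queue, and the quota value of 4 are all assumptions chosen for the example.

```python
from collections import deque

RX_QUOTA = 4  # hypothetical cap on packets accepted per poll


def poll(device_queue: deque, rx_quota: int = RX_QUOTA) -> list:
    """Take at most rx_quota packets from the device in a single poll.

    Anything left in device_queue stays there until the next poll, so the
    memory consumed per poll is bounded regardless of the incoming load.
    """
    received = []
    while device_queue and len(received) < rx_quota:
        received.append(device_queue.popleft())
    return received
```

With ten packets waiting, the first poll returns four and leaves six queued, instead of allocating buffers for all ten at once.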

Cancel all pending transmissions on any transmit error

Some external code (such as the UEFI UNDI driver for the Realtek USB NIC on a Microsoft Surface Book or other Surface devices) will block during transmission attempts and can take several seconds to report a transmit error. If there is a large queue of pending transmissions, then the accumulated time from a series of such failures can easily exceed the EFI watchdog timeout, resulting in what appears to be a system lockup followed by a reboot.

Work around this problem by immediately cancelling any pending transmissions as soon as any transmit error occurs.

The only expected transmit error under normal operation is ENOBUFS arising when the hardware transmit queue is full. By definition, this can happen only for drivers that do not utilise deferred transmissions, and so this new behaviour will not affect these drivers.
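The workaround can be modelled with a small Python sketch. It is a toy, not the iPXE network device code: the `TxQueue` class, its `send` callback, and the use of `OSError` to signal a transmit error are all assumptions made for illustration.

```python
from collections import deque


class TxQueue:
    """Toy transmit path with a deferred queue (hypothetical, not iPXE code)."""

    def __init__(self, send):
        self.send = send         # callable that raises OSError on a transmit error
        self.deferred = deque()  # packets waiting to be handed to the driver
        self.cancelled = 0       # packets dropped without a transmit attempt

    def enqueue(self, packet):
        """Queue a packet for later transmission."""
        self.deferred.append(packet)

    def flush(self):
        """Transmit deferred packets; on any error, cancel everything pending.

        Returns True if every packet was sent, False if an error occurred.
        """
        while self.deferred:
            packet = self.deferred.popleft()
            try:
                self.send(packet)
            except OSError:
                # Do not retry packet by packet: each attempt could block for
                # seconds, so drop the whole backlog immediately.
                self.cancelled += len(self.deferred)
                self.deferred.clear()
                return False
        return True
```

The point of the design is that a slow, failing driver is asked to transmit only once per failure, so the total time spent blocked no longer grows with the length of the queue.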

Raise TPL when calling UNDI entry point

The SnpDxe driver raises the task priority level to TPL_CALLBACK when calling the UNDI entry point.  This does not appear to be a documented requirement, but we should probably match the behaviour of SnpDxe to minimise surprises to third party code.

Long term changes planned for iPXE (6 months to a year)

We do have some long-term plans that we have started looking into, but we are not yet at the point where we can write a proper spec and implement them. In particular, the 802.1x support for PXE requires some major changes to how iPXE deals with EAP in order to get Microsoft to sign that binary.

802.1x support for user authentication

This will allow us to communicate with an 802.1x switch directly from iPXE. The long-term goal is also to support machine-based certificates for authenticating with the switch itself. If you are interested in this area, please give us a buzz.

Area under investigation

We are looking to see if we can use a generic USB WiFi driver that will work for a large number of WiFi devices. If that fails, we will find the best-structured and most readily available USB device (eBay, Amazon, long lifespan etc.) and build a driver for that. This will allow for easy WiFi OSD build scenarios, as only a single driver is required when using the USB dongle model.