A Look at a Real World Thread Network

Thread is a fast-growing wireless protocol for low-power IoT networks. It’s one of the transports used by Matter, which can also use BLE and Wi-Fi. Similar to Zigbee, Thread uses 802.15.4 for its PHY and MAC, but an important difference is that it uses 6LoWPAN to carry IPv6 at the network layer, so IPv6 addressing is used for all Thread devices. That’s IPv6-only, but more on that later.

Having recently deployed a home Thread network managed by Apple HomeKit, I wanted to see what I could discover about its operation. Below are some tips and interesting findings. Maybe this isn’t particularly useful information, because Apple’s implementation requires very little from the end-user and couldn’t be simpler to use. It really is as close to “It just works” as I’ve experienced with a wireless network, but if you want to peek behind the protocol operation curtain a bit, you might find this interesting.

The Thread protocol stack. Thread is one of many protocols built on top of the 802.15.4 PHY and MAC.

Network Discovery

An easy way to start discovering a HomeKit Thread network is with the Eve iOS app. The Thread Network option in Settings shows each device in the Thread Network, their role, capabilities, how their traffic is routed, and their 802.15.4 short address. You can get a good idea of the network topology here. Perhaps in the future Eve will build a visualization of it.

mDNS is used for network layer device discovery in a home Thread network, so you don’t need to use a Thread device to find other Thread devices. On a Mac, I use Discovery.app on the same network as the Thread Border Router to inspect the openthread.thread.home.arpa domain.

There is a lot to learn from this! The _hap._udp. service shows all the Thread endpoint devices. The hostname and MAC address of each device are there, along with its IPv6 address(es).
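For the curious, the lookup a browser like Discovery.app performs boils down to a DNS PTR question for the service type. Here is a minimal sketch that only builds that question with the standard library. It does not send anything: whether the query goes out as multicast mDNS or as unicast DNS to the Border Router depends on the domain, and I haven't verified which applies here, so treat the transport as an open assumption.

```python
import struct

PTR = 12  # DNS record type used for service browsing
IN = 1    # Internet class


def encode_name(name: str) -> bytes:
    """Encode a dotted name as length-prefixed DNS labels."""
    out = b""
    for label in name.strip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) < 64:
            raise ValueError(f"bad label: {label!r}")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"  # the root label terminates the name


def build_ptr_query(service: str, domain: str) -> bytes:
    """Build a one-question DNS message asking for PTR records of service.domain."""
    # Header fields: id, flags, qdcount, ancount, nscount, arcount
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    question = encode_name(f"{service}.{domain}") + struct.pack("!2H", PTR, IN)
    return header + question


# Service type and domain are the ones named in the text above.
query = build_ptr_query("_hap._udp", "openthread.thread.home.arpa")
print(query.hex())
```

Swapping in _meshcop._udp as the service gives the equivalent question for Border Routers.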

_meshcop._udp. will show the Border Router(s). MeshCoP is the Thread Mesh Commissioning Protocol, and the information shared here is a little like a Wi-Fi AP sharing network information in beacon and probe response frames, but this is much simpler. Here’s a look at everything advertised by an Apple TV 4K acting as a Thread Border Router.

The Border Router has internal Thread addresses and external LAN addresses. It also shares its extended 802.15.4 address, perhaps for OTA commissioning?

According to the Thread 1.1.1 Specification, the State Bitmap value 1 means:

1: DTLS connection to Border Agent allowed with a user chosen network Commissioner Credential and shared PSKc for the active Thread Network Partition

So somewhere a passphrase and PSK have been generated and delivered to each device, then used to authenticate and encrypt the commissioning session. Apple has handled this all behind the scenes and doesn’t expose any of it to end-users. This bitmap also has many other uses, and I encourage you to download the spec and learn more about what it can signal.
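If you want to pick the bitmap apart yourself, a few shifts and masks are enough. The sketch below assumes the field layout as I understand it: Connection Mode in the three least-significant bits, then Thread Interface Status (2 bits), then Availability (2 bits). That layout is my reading, not something quoted above, so check it against the spec before relying on it. The example value 0x31 is made up for illustration.

```python
from typing import NamedTuple


class StateBitmap(NamedTuple):
    connection_mode: int
    thread_interface_status: int
    availability: int


def decode_state_bitmap(value: int) -> StateBitmap:
    """Split a State Bitmap integer into its low-order fields.

    ASSUMPTION: bit positions (0-2, 3-4, 5-6) are my reading of the
    layout; verify against the Thread specification.
    """
    return StateBitmap(
        connection_mode=value & 0b111,
        thread_interface_status=(value >> 3) & 0b11,
        availability=(value >> 5) & 0b11,
    )


# Only the meaning of Connection Mode 1 is quoted in the text above.
CONNECTION_MODE_NOTES = {
    1: "DTLS connection to Border Agent allowed with Commissioner Credential and PSKc",
}

example = 0x31  # hypothetical value, not taken from the capture
fields = decode_state_bitmap(example)
print(fields)
print(CONNECTION_MODE_NOTES.get(fields.connection_mode, "see the specification"))
```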

Unlike Wi-Fi, there is no need for end-users to be aware of any of this information, which makes it so much more usable for non-IT people. End-users onboard new endpoints through scanning QR codes with an app, then everything else happens behind the scenes. Even the network name is hidden away. And really, why bother them with it? It makes me wonder how much of consumer Wi-Fi network management could be automated, removing the burden from end-users to understand it and make good decisions in using it. I bet if Apple ever makes Wi-Fi APs again, the user experience will be more like Thread than what we have come to expect from consumer Wi-Fi.

IPv6 Addressing

In this network, all of the endpoint devices have a single ULA address, which is not internet-routable. The prefix was automatically generated, I assume by the Border Router? The LAN this Thread network is connected to is dual-stack IPv4/IPv6 with GUA and ULA addressing, so I assume the selection of a unique ULA prefix for the Thread network was by design.
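A quick way to check what kind of address a device has is Python's ipaddress module. This sketch classifies an address as ULA, link-local, or GUA, and also shows roughly how a random ULA /48 can be formed (fd followed by 40 random bits). The random generator is a simplification of what RFC 4193 suggests, and I'm not claiming it is how the Border Router picks its prefix. The sample addresses are invented.

```python
import ipaddress
import secrets

ULA = ipaddress.ip_network("fc00::/7")          # Unique Local Addresses
LINK_LOCAL = ipaddress.ip_network("fe80::/10")
GUA = ipaddress.ip_network("2000::/3")          # global unicast space


def classify(addr: str) -> str:
    """Return a coarse label for an IPv6 address."""
    ip = ipaddress.IPv6Address(addr)
    if ip in ULA:
        return "ULA"
    if ip in LINK_LOCAL:
        return "link-local"
    if ip in GUA:
        return "GUA"
    return "other"


def random_ula_prefix() -> ipaddress.IPv6Network:
    """Build a /48 from the fd00::/8 half of the ULA range plus 40 random bits."""
    value = (0xFD << 120) | (secrets.randbits(40) << 80)
    return ipaddress.IPv6Network((value, 48))


print(classify("fd12:3456:789a:1::1"))  # invented example address
print(random_ula_prefix())
```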

All Thread devices appear to be reachable from outside the Thread network via the Border Router. “Router” is definitely the best name for this device, because it happily forwards packets to and from the Thread network without applying any security policy. Want to send thousands of packets to a low-power, low-capability Sleepy End Device (SED)? Sure, no problem.

Pinging a battery-powered door sensor. There is a lot of latency and packet loss, because this device spends most of its time asleep, but that is the only barrier to sending any traffic I want to it. Good thing it only has a ULA address.

While ULA-only addressing keeps traffic from the Internet away, it does mean that LAN traffic directed at Thread devices is possible. This accessibility is something to consider in future enterprise deployments. Just sending a lot of packets to a tiny battery-powered SED might disable it, either immediately as a small-scale DoS or eventually through excessive battery drain. Any enterprise Border Router will need to have firewall functionality built-in.

So how are IPv6 packets getting into the Thread IPv6 network anyway? The Thread Border Router is sending IPv6 Router Advertisement packets to the outside LAN. Any modern device that receives one will update its IPv6 routing table with the new prefix, no matter how your LAN is configured. Think you don’t have IPv6 on your network? You do now. Thread requires IPv6.

Example IPv6 RA from a Thread Border Router to the external network.
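To see why one RA is enough to make the Thread network reachable, here is a small sketch of the longest-prefix-match lookup a LAN host performs. Before the RA, traffic for the Thread prefix falls through to the default route; once the prefix is learned, the more specific route via the Border Router wins. All prefixes and next-hop names here are invented for illustration.

```python
import ipaddress


def next_hop(routes: dict, dest: str):
    """Longest-prefix match: return the next hop of the most specific matching route."""
    ip = ipaddress.ip_address(dest)
    matches = [
        (net, hop)
        for net, hop in routes.items()
        if net.version == ip.version and ip in net
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical host routing table before any RA from the Border Router.
routes = {ipaddress.ip_network("::/0"): "home-router"}
door_sensor = "fd11:22:33:1::abcd"  # invented Thread device address

print(next_hop(routes, door_sensor))

# The RA teaches the host a route to the Thread prefix via the Border Router.
routes[ipaddress.ip_network("fd11:22:33:1::/64")] = "thread-border-router"
print(next_hop(routes, door_sensor))
```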

The Border Router maintains IPv4 and IPv6 interfaces on the outside LAN. Other Thread devices that need Internet connectivity or have other non-Thread services can do this as well, but all Thread network traffic is IPv6-only.

Without decrypting traffic over-the-air, I can’t tell if the Thread network is using SLAAC, DHCPv6, or both.

Capturing Thread Traffic OTA

Using a Nordic Semiconductor nRF52840 Dongle and their free nRF Sniffer for 802.15.4 software, I was able to capture encrypted 802.15.4 frames from this network. Getting the keys from HomeKit to decrypt this traffic is probably not possible, but there are some good use-cases for OTA packet capturing:

  • Measuring RSS and LQI at a location to determine mesh coverage
  • Determining the network’s operating channel
  • Discovering the chattiest Thread devices on the network, which may be useful when planning network capacity
  • Measuring packet counts, looking at the “busyness” of the network
  • Validating any security policy that is applied to the Thread IPv6 network. You can try sending traffic to a device from outside the Thread network and see if it gets there. The packet rate is so low that it’s easy to identify new flows coming to an end device’s short address in real time as traffic is generated.
  • Diagnosing high retry rates (What is a high retry rate in 802.15.4?)

The 6LoWPAN header which would show IPv6 addressing is encrypted, but the 802.15.4 short address can be used to identify Thread devices over-the-air. In this case, 7002 is the same Eve Door sensor shown above in the Eve app.
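Because the short address sits in the unencrypted MAC header, finding the chattiest devices only takes a little header parsing. The sketch below pulls the source short address out of raw 802.15.4 frames and counts frames per sender. It assumes 2003/2006-style frame headers (sequence number always present, the older PAN ID compression rule), and the sample frames and PAN ID are hand-built for illustration rather than taken from my capture.

```python
from collections import Counter


def source_short_address(frame: bytes):
    """Return the 16-bit source short address of an 802.15.4 frame, or None."""
    if len(frame) < 3:
        return None
    fcf = int.from_bytes(frame[0:2], "little")  # frame control field
    pan_id_compression = (fcf >> 6) & 0x1
    dst_mode = (fcf >> 10) & 0x3
    src_mode = (fcf >> 14) & 0x3
    if src_mode != 2:  # 2 = short address; ACKs carry no addresses at all
        return None
    offset = 3  # frame control (2 bytes) + sequence number (1 byte)
    if dst_mode == 2:
        offset += 2 + 2  # destination PAN ID + short address
    elif dst_mode == 3:
        offset += 2 + 8  # destination PAN ID + extended address
    elif dst_mode != 0:
        return None      # reserved addressing mode
    if not pan_id_compression:
        offset += 2      # source PAN ID is present
    if len(frame) < offset + 2:
        return None
    return int.from_bytes(frame[offset:offset + 2], "little")


def chattiest(frames):
    """Count frames per source short address, busiest first."""
    counts = Counter()
    for frame in frames:
        addr = source_short_address(frame)
        if addr is not None:
            counts[addr] += 1
    return counts.most_common()


# Hand-built sample frames: two from 0x7002, one from 0x7000, one ACK.
frames = [
    bytes.fromhex("69982acefa00700270"),
    bytes.fromhex("69982bcefa02700070"),
    bytes.fromhex("69982ccefa00700270"),
    bytes.fromhex("02002c"),
]
for addr, count in chattiest(frames):
    print(f"{addr:04x}: {count} frames")
```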

A few miscellaneous findings from capturing routine HomeKit Thread traffic:

  • The battery-powered SEDs I observed were chattier than I expected, each sending a Data Request frame to its parent router every 5 seconds.
  • Mesh Link Establishment (MLE) frames make up 20% of the frames, and the 6LoWPAN header of these frames is unencrypted.
  • So much in Thread is hidden behind encryption that you shouldn’t expect to learn as much about the network with OTA capturing as you might be used to with Wi-Fi.

A Review of the GL.iNet Slate AX Travel Router

As a wireless engineer who spends a good amount of time on the road for work and family trips, I’m often unhappy with the quality of the Wi-Fi in hotels, Airbnbs, and shared workspaces I find myself in. In many of those places, deploying my own AP is a good solution when I can do so without disrupting the existing network. I also like plugging in my own AP and knowing all my devices and those of the people traveling with me will have secure connectivity as soon as it’s up and running. And I can plug a Chromecast into a nearby TV and it will just work too.

So for a couple years, I’ve carried a small wireless travel router from TP-Link with me, the TL-WR902AC. While the TP-Link router gets the job done, it doesn’t have the more advanced features I’d prefer.

Enter the GL.iNet Slate AX (GL-AXT1800), which I purchased myself and have been using for the past few weeks (no, this is not a sponsored post). This new travel router is bigger than the TP-Link router, but it has an impressive list of useful features that have made my life on the road easier since I’ve owned it.

Some standout features…

  • 802.11ax 2×2:2 2.4 GHz radio
  • 802.11ax 2×2:2 5 GHz radio
  • Static channel, channel width, and Tx power assignment
  • 5 GHz DFS channels (this is great!)
  • WPA3
  • 802.11k/v/r/w/p support (Do you even know what 802.11p is?)
  • Supported/basic data rate management
  • Enable/disable MU-MIMO
  • Static BSS Color assignment
  • Router/AP/repeater modes
  • IPv6 support including RA and DHCPv6 client/server/prefix delegation
  • AdGuard Home built-in for encrypted and filtered DNS
  • Support for 30+ VPN services
  • USB-C powered
  • USB-A port (perfect for powering a Chromecast)

Some of those more advanced Wi-Fi things, like 802.11k, MU-MIMO, data rate configuration, and BSS Color assignment, can only be managed in the OpenWrt CLI, but it’s a pretty straightforward config if you don’t mind tinkering. You probably don’t if anything on that list looks appealing!

Once I had everything just right, it became fairly plug-n-play when I’m on the road. The options that normally need changing I can manage quickly with the GL.iNet smartphone app. That includes selecting an appropriate channel, channel width, and Tx power, or changing the AP/router/repeater mode if necessary.

The iOS configuration app

If you find yourself interested in a travel router with a lot of wireless nerd knobs, check out what GL.iNet has to offer.

Sorting Out BSS Color, Spatial Reuse, and Dual NAV

This post first appeared on 7signal.com.

We usually only hear about BSS Coloring in the marketing of Wi-Fi 6, but Spatial Reuse and Dual NAV are related important features of 802.11ax. Let’s sort them out, but first some background.

All 802.11 stations (APs and clients) must make sure that the channel they are operating on is free before transmitting. This prevents collisions with other stations operating on the same channel. 802.11 stations accomplish this through two methods: physical carrier sense at layer 1 and virtual carrier sense at layer 2. Physical carrier sense listens for 802.11 preambles that are transmitted at the beginning of every frame. This is the clear channel assessment signal detect (CCA-SD), sometimes called preamble detect. Physical carrier sense also checks for any RF energy on the channel. This is the clear channel assessment energy detect (CCA-ED). Virtual carrier sense operates at layer 2 using a frame’s MAC header Duration/ID field to determine how long an ongoing frame exchange will last. It sets the station’s NAV timer (network allocation vector), which prevents the station from transmitting until it counts down to zero, even if physical carrier sense determines the channel to be idle. Both carrier sense methods must determine that the channel is available before the station can transmit.
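The "both must agree" rule is easy to lose in the acronyms, so here is a toy model of it. It is only a sketch of the decision described above: it ignores interframe spacing and random backoff, and the NAV update rule (keep the longer of the current value and the new duration) is a simplification on my part.

```python
from dataclasses import dataclass


@dataclass
class CarrierSense:
    preamble_detected: bool = False  # CCA-SD: an 802.11 preamble was heard
    energy_detected: bool = False    # CCA-ED: RF energy is on the channel
    nav_us: int = 0                  # virtual carrier sense countdown, in microseconds

    def hear_duration(self, duration_us: int) -> None:
        """Update the NAV from a frame's Duration/ID field (simplified rule)."""
        self.nav_us = max(self.nav_us, duration_us)

    def tick(self, elapsed_us: int) -> None:
        """Count the NAV down toward zero."""
        self.nav_us = max(0, self.nav_us - elapsed_us)

    def may_transmit(self) -> bool:
        physical_idle = not (self.preamble_detected or self.energy_detected)
        virtual_idle = self.nav_us == 0
        return physical_idle and virtual_idle


cs = CarrierSense()
cs.hear_duration(300)     # a frame announced a 300 microsecond exchange
print(cs.may_transmit())  # channel looks quiet, but the NAV is still running
cs.tick(300)
print(cs.may_transmit())
```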

Because modern 802.11 radios are very sensitive, CCA-SD causes a station to defer transmitting even if it detects a very low RSSI signal from a distant BSS operating on the same channel. Co-channel interference (CCI), referred to as overlapping BSS (OBSS) in the standard, is therefore a problem even at very low RSSI: most 802.11 radios will trip their CCA-SD check and defer transmission when they detect a transmission just 4 dB above the noise floor. If such a station instead transmitted despite the low-RSSI OBSS transmission, the receiving station would likely hear it successfully, which would increase overall spectral efficiency and limit the negative effects of CCI.

802.11ax introduces enhancements to both physical and virtual carrier sense to help address this issue. Spatial Reuse works at the physical carrier sense level and enhances CCA-SD, and Dual NAV works at the virtual carrier sense level and enhances the NAV timer. Both features cause stations to act on the BSS color field, although a BSS color is not required in all cases for them to work. When used in combination, these features can increase the spectral efficiency of 802.11ax.

BSS Coloring

BSS Coloring is simply the ability for an AP to advertise a BSS color, which is actually a number, in its beacon and probe response frames, as well as include the same BSS color field in the HE preamble of the 802.11ax frames that it transmits. Clients that support BSS Coloring also add a BSS color field to the HE preamble of the 802.11ax frames that they transmit. The AP and all its clients in the BSS use the same color value. Overlapping BSSs on the same channel use a different color to indicate that their frames are OBSS, and therefore they may be treated differently using one or both of the techniques below. The presence of BSS coloring on its own doesn’t change station behavior; it must be acted on using the following techniques to provide any benefit.

Note that some AP vendors and the Wi-Fi Alliance talk very generally about BSS Coloring and I suspect that they really mean BSS Coloring with Spatial Reuse operation.

Spatial Reuse

Spatial Reuse introduces the concept of an OBSS-PD threshold (overlapping BSS packet detect) to CCA-SD. In the OBSS scenario, each BSS will use a unique BSS color. Spatial Reuse allows the stations in each BSS to use a less sensitive preamble detection threshold for OBSS frames during their normal CCA-SD check. That way, even though there may be an OBSS frame making the channel busy, if it is not very loud and there is still significant SNR, an 802.11ax station that supports Spatial Reuse can transmit anyway. To account for the temporarily lower SNR, it may use a lower, more robust MCS. One limitation of Spatial Reuse is that the OBSS transmitting station can’t make the same adjustment to its MCS because it has no knowledge of the other station’s future intention to transmit. 802.11be may solve this problem with new multi-AP coordination features.
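As a rough sketch of that preamble-detect decision: a frame from another color is compared against the less sensitive OBSS-PD threshold, while everything else uses the normal threshold. The two threshold values below are placeholders I picked for illustration, not numbers from the standard or from any particular AP.

```python
from typing import Optional


def preamble_makes_channel_busy(
    rssi_dbm: float,
    frame_color: Optional[int],
    my_color: int,
    spatial_reuse: bool = True,
    pd_threshold_dbm: float = -82.0,       # placeholder normal threshold
    obss_pd_threshold_dbm: float = -72.0,  # placeholder OBSS-PD threshold
) -> bool:
    """Decide whether a detected preamble should mark the channel busy."""
    # A frame counts as OBSS here only if it carries a color that differs from ours.
    is_obss = frame_color is not None and frame_color != my_color
    if spatial_reuse and is_obss:
        return rssi_dbm >= obss_pd_threshold_dbm
    return rssi_dbm >= pd_threshold_dbm


# A quiet OBSS frame is ignored with Spatial Reuse, but not without it.
print(preamble_makes_channel_busy(-78, frame_color=9, my_color=5))
print(preamble_makes_channel_busy(-78, frame_color=9, my_color=5, spatial_reuse=False))
```

Frames without a color (legacy preambles) fall back to the normal threshold in this sketch.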

Spatial Reuse support is indicated in beacon and probe responses by the Spatial Reuse Parameter Set IE. This is also where the specific thresholds are defined along with which spatial reuse method is to be used. The two methods are OBSS-PD-based operation and parameterized spatial reuse-based operation (PSR), the details of which are beyond the scope of this blog.

Dual NAV

Dual NAV (referred to as “two NAV operation” in the standard) works at layer 2 using the duration field of a frame’s MAC header, and it also takes advantage of the new TXOP field present in the HE preamble. It requires 802.11ax clients to establish two NAV timers, an intra-BSS NAV for all frames within the BSS, and a basic NAV for OBSS frames (often called inter-BSS frames).

The intra-BSS NAV timer is set by frames that match the station’s BSS color or frames with a BSSID field that matches the station’s associated AP. The basic NAV timer is set by OBSS frames with a different BSS color, frames with no BSS color in the case of legacy frames, or frames with a BSSID field that doesn’t match the station’s associated AP.

This helps 802.11ax stations overcome several problems. A legacy station with a single NAV can have its NAV incorrectly shortened by an OBSS frame declaring a shorter duration than its current NAV value. This scenario is particularly troublesome during the long TXOPs an AP holds for OFDMA frame exchanges. Dual NAV prevents OBSS frames from resetting the intra-BSS NAV.

On the other hand, the basic NAV can also protect OBSS frames during OFDMA if an AP has set the carrier sense required field with its preceding trigger frame. If a client in that scenario has a non-zero basic NAV, it will not respond to the trigger frame in order to avoid a collision with the OBSS transmission. Therefore, the state of the CS required field in the trigger frame determines if a client will respect the basic NAV or transmit anyway, but this only applies to OFDMA operation.

In all other scenarios, both NAVs must equal zero in order for an 802.11ax station to transmit.
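Here is a small model of the two-timer bookkeeping described above. It is a sketch under simplifying assumptions: a frame is treated as intra-BSS if either its BSSID or its color matches, each timer simply keeps the longer duration it has seen, and physical carrier sense is left out entirely. The class, field names, and sample values are all mine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    duration_us: int
    bss_color: Optional[int] = None  # None models a legacy frame with no HE preamble
    bssid: Optional[str] = None      # None if the BSSID is not available


class TwoNavStation:
    def __init__(self, my_color: int, my_bssid: str):
        self.my_color = my_color
        self.my_bssid = my_bssid
        self.intra_bss_nav_us = 0
        self.basic_nav_us = 0

    def is_intra_bss(self, frame: Frame) -> bool:
        if frame.bssid is not None and frame.bssid == self.my_bssid:
            return True
        return frame.bss_color is not None and frame.bss_color == self.my_color

    def observe(self, frame: Frame) -> None:
        """Update whichever NAV the frame belongs to (simplified update rule)."""
        if self.is_intra_bss(frame):
            self.intra_bss_nav_us = max(self.intra_bss_nav_us, frame.duration_us)
        else:
            self.basic_nav_us = max(self.basic_nav_us, frame.duration_us)

    def tick(self, elapsed_us: int) -> None:
        self.intra_bss_nav_us = max(0, self.intra_bss_nav_us - elapsed_us)
        self.basic_nav_us = max(0, self.basic_nav_us - elapsed_us)

    def may_transmit(self) -> bool:
        """Outside of trigger-based OFDMA, both NAVs must be zero."""
        return self.intra_bss_nav_us == 0 and self.basic_nav_us == 0

    def may_respond_to_trigger(self, cs_required: bool) -> bool:
        """The CS required field decides whether the basic NAV is respected."""
        return self.basic_nav_us == 0 if cs_required else True


sta = TwoNavStation(my_color=5, my_bssid="aa:aa:aa:aa:aa:01")
sta.observe(Frame(duration_us=200, bss_color=5))  # own BSS
sta.observe(Frame(duration_us=500, bss_color=9))  # overlapping BSS
print(sta.intra_bss_nav_us, sta.basic_nav_us)
print(sta.may_respond_to_trigger(cs_required=True))
```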

A key difference in 802.11ax is that there is a new TXOP field present in the HE preamble which sets the NAV timer. This allows the NAV to be set at the PHY level, removing the need for RTS/CTS TXOP protection when legacy PHYs are not present. It blurs the layer 1/layer 2 distinction between the preamble and NAV. It also allows the NAV to be set at lower SNR and at greater distance than previous generations of 802.11, which only set the NAV via a frame’s duration field or RTS/CTS protection. Dual NAV became necessary to prevent the OBSS NAV reset issue from becoming much worse in 802.11ax with the NAV now set by the robustly modulated HE preamble.

Dual NAV can operate using the BSSID field present in non-HE frames to distinguish OBSS frames, like in a mixed environment with 802.11ac and 802.11n stations present. It also operates when BSS coloring is disabled on an AP.

Putting it All Together

Spatial Reuse can make a station less sensitive to OBSS transmissions and increase the likelihood of successful simultaneous transmissions, increasing the spectral efficiency of 802.11ax. Dual NAV on its own will probably only have a marginal impact on spectral efficiency. Its value lies in ensuring virtual carrier sense is accurate and reducing collisions. However, when these features are used in combination, Spatial Reuse will prevent OBSS frames below the OBSS-PD threshold from setting the basic NAV, increasing spectral efficiency by desensitizing both physical and virtual carrier sense to OBSS frames.

Now it is helpful to understand how these features are implemented. 802.11ax has different requirements for APs and clients as to what mix of them is mandatory.

Station Type    BSS Coloring    Spatial Reuse    Dual NAV
AP              Mandatory       Optional         Optional
Client          Mandatory       Optional         Mandatory

Most 802.11ax APs come with BSS Coloring enabled by default, although the standard allows it to be disabled. Unfortunately, Spatial Reuse is optional for all stations, although Cisco has announced AP support for OBSS-PD-based Spatial Reuse in recent code versions. It seems unlikely that client vendors will implement it if it is not required. Dual NAV is optional for APs and mandatory for clients. The standard doesn’t explain this, but perhaps it is because the AP owns the TXOP during both uplink and downlink OFDMA, so it will not reset its NAV due to OBSS frames during that period. However, it may also be due to the mobile nature of clients, which can be anywhere within an AP’s coverage and are more likely to encounter and create OBSS conditions than their associated AP.

In practice, 802.11ax stations that only support Dual NAV without Spatial Reuse won’t see a significant improvement to spectral efficiency under OBSS conditions, perhaps only benefiting during OFDMA operation. Combining BSS Coloring with Dual NAV and Spatial Reuse is the key to significantly improving spectral efficiency through reducing physical and virtual carrier sense sensitivity to OBSS transmissions.