Encountering a Mysterious Issue with BranchCache
While troubleshooting BranchCache for a
client, we stumbled upon a peculiar problem. Despite being certain that content
was cached, some computers used for PreCaching never shared this content with
their peers. After an extensive support case that escalated to the development
team and involved numerous network traces and database dumps, we finally resolved the issue by adjusting the RepubQuorumSize registry key.
Understanding the 10 Response Limit
The core of the issue is BranchCache's
behavior when handling content requests. If a BranchCache request receives 10
responses (not 10 computers, as explained below), the computer downloading the
content marks it as "not peerable." This means that even if the
computer has the content, it will not share it. This is by design to protect
large networks from being overwhelmed by too many devices attempting to share
content simultaneously.
Why This Matters in Modern Networks
This approach worked well when most computers
were stationary workstations. However, in environments with numerous laptops or
in deployment centers where devices are frequently shipped out, this setup is
suboptimal.
The Default Behaviour: 10 Responses
Typically, a computer has both an IPv4 and an
IPv6 address, meaning each device can respond twice for each request.
Consequently, if five computers on a network have the content, they will
provide 10 responses in total and the sixth computer will mark the content as
"not peerable."
Figure 1: Five computers with the content (green) provide 10 responses; the
sixth computer will download the content from the peers but mark the content as
"not peerable" (orange).
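The arithmetic behind the limit can be sketched in a few lines of Python. This is only an illustration of the counting described above, not BranchCache code: the function name is hypothetical, and it assumes that every peer answers exactly once per IP address it has.

```python
import math


def peers_to_reach_quorum(quorum: int = 10, addresses_per_peer: int = 2) -> int:
    """Smallest number of peers whose combined responses reach the quorum.

    Assumption for illustration: each peer sends one response per address,
    so a dual-stack (IPv4 + IPv6) peer counts as two responses.
    """
    if quorum < 1 or addresses_per_peer < 1:
        raise ValueError("quorum and addresses_per_peer must be at least 1")
    return math.ceil(quorum / addresses_per_peer)


# Default limit of 10 responses with dual-stack peers
print(peers_to_reach_quorum(10, 2))   # 5
# Same limit if every peer only had a single address
print(peers_to_reach_quorum(10, 1))   # 10
# A limit of 100 responses with dual-stack peers
print(peers_to_reach_quorum(100, 2))  # 50
```

Under these assumptions the limit is reached by half as many dual-stack computers as the response count suggests.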
Five Went Away and There was Only One
If the five initial computers are turned off
or moved, the sixth computer, although it has the content, will not share it
due to the "not peerable" status.
Figure 2: The orange computer has the content in its cache but will not peer.
Thus, the blue computer must download everything from the source. Since there were
no responses for the BC request, the content will be marked as available for
peering on the blue computer and it will share it as expected later on.
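The sequence in Figures 1 and 2 can be replayed with a small simulation. It is a toy model under stated assumptions, not how BranchCache is implemented: the `Computer` class, the `download` function and the rule "mark as not peerable when responses reach the quorum" are simplifications of the behaviour described in this article, and each sharing peer is assumed to answer twice (IPv4 and IPv6).

```python
from dataclasses import dataclass

ADDRESSES_PER_PEER = 2  # assumption: one response each for IPv4 and IPv6


@dataclass
class Computer:
    name: str
    online: bool = True
    has_content: bool = False
    peerable: bool = False


def download(client, network, quorum=10):
    """Let `client` fetch the content and return where it came from."""
    # Only online peers that hold the content AND are allowed to share respond.
    sharers = [c for c in network
               if c is not client and c.online and c.has_content and c.peerable]
    responses = len(sharers) * ADDRESSES_PER_PEER
    client.has_content = True
    # Simplified rule from the article: reaching the quorum blocks sharing.
    client.peerable = responses < quorum
    return "peers" if responses > 0 else "source"


def run_scenario(quorum=10):
    greens = [Computer(f"green{i}", has_content=True, peerable=True)
              for i in range(1, 6)]
    orange, blue = Computer("orange"), Computer("blue")
    network = greens + [orange, blue]

    orange_from = download(orange, network, quorum)  # Figure 1
    for pc in greens:                                # the five go away
        pc.online = False
    blue_from = download(blue, network, quorum)      # Figure 2
    return orange_from, orange.peerable, blue_from, blue.peerable


print(run_scenario(10))   # ('peers', False, 'source', True)
print(run_scenario(100))  # ('peers', True, 'peers', True)
```

With a quorum of 10 the model reproduces the situation in the figures: orange holds the content but stays silent, so blue falls back to the source. With a larger quorum in the same model, orange remains peerable and blue is served locally.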
RepubQuorumSize Registry Key and Resetting the Cache
Microsoft does not offer a supported method to
reset the "not peerable" flag in the BranchCache database. The only
Microsoft-supported solution is to flush the BranchCache cache and redownload the content when
fewer than five computers have it. However, Microsoft support revealed an
undocumented registry key, "RepubQuorumSize," which controls the
number of responses required before marking content as "not peerable."
By default, this value is set to 10.
You can also use our 2Pint Software BCMon tool
to reset any content in the database so it will start peering. See the Troubleshooting section below.
Testing and Implementation
We've tested setting the value to 100 without
noticing any network issues. However, it is crucial to test this in your
environment with your network team. Setting the value too high might
trigger network switch storm control to start dropping packets, or worse,
drop ports if that's how the network team has configured it.
Adjusting the RepubQuorumSize Registry Key
To adjust the RepubQuorumSize:
- Path: HKLM\Software\Policies\Microsoft\PeerDist\DiscoveryManager
- Name: RepubQuorumSize (DWORD)
- Value: 10 (Default)
Value refers to the number of responses needed before content will be set to "not peerable" on the computer doing the download.
After setting the key, restart the BranchCache
service (PeerDistSvc) for the changes to take effect. Note that changing the
value will not affect content already marked as "not peerable".
The content must be flushed and redownloaded or reset using the BCMon tool.
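As one way to roll the setting out, the key, name and type listed above can be written into a standard .reg file. The Python sketch below only builds the file text; the helper name `build_reg_file` is hypothetical, and the value of 100 is simply the figure we tested with, not a recommendation for your network.

```python
# Key path and value name as given in this article.
KEY_PATH = r"HKEY_LOCAL_MACHINE\Software\Policies\Microsoft\PeerDist\DiscoveryManager"
VALUE_NAME = "RepubQuorumSize"


def build_reg_file(value: int = 100) -> str:
    """Return .reg file text that sets RepubQuorumSize to `value` (a DWORD)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must be between 0 and 4294967295")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY_PATH}]",
        # .reg files store DWORD values as eight hexadecimal digits
        f'"{VALUE_NAME}"=dword:{value:08x}',
    ]
    return "\n".join(lines) + "\n"


print(build_reg_file(100))
```

The generated text can be saved to a file and imported on a test client, for example with `reg import`, followed by the restart of PeerDistSvc described above. Try it on a few machines together with your network team before any wider rollout.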
By understanding and adjusting the
RepubQuorumSize, you can optimize BranchCache performance in your
network environment. Ensure thorough testing and collaboration with your
network team to avoid potential issues.
Troubleshooting Not Peerable Content
2Pint Software provides a free tool called
BCMon to help troubleshoot BranchCache issues. It can be used on a client
to determine whether content is marked as not peerable. BCMon can be found on our GitHub page:
BranchCache/BCMon at master · 2pintsoftware/BranchCache (github.com)
(Tip: Run BCMon.Net.exe without any arguments to get help, or specify one of the options to get additional help for that option.)
BranchCache itself has two APIs, one local and
one remote. BCMon can query both. If a computer reports that it
has the content when queried through the local API, but not through the remote
API, then you know that the content has been marked as “not peerable”.
If running the tool locally on a machine,
use the argument "-d 127.0.0.1:<port>" to query the BC Remote API. If you
omit the "-d" argument, it will use the local API.
Figure 3: Using the local API we can see that all content is in the cache.
Figure 4: But when adding “-d 127.0.0.1:1337” and querying the remote API it says no
content is found. This tells us the content has been marked as “not peerable”
in the BranchCache database.
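The comparison of the two queries boils down to a small decision table, sketched here in Python. The function is hypothetical and does not call or parse BCMon; the two booleans are whatever you read off the local and the remote query yourself.

```python
def diagnose(local_has_content: bool, remote_has_content: bool) -> str:
    """Interpret the result of a local and a remote query for the same content."""
    if local_has_content and remote_has_content:
        return "cached and peerable"
    if local_has_content:
        # In the cache, but hidden from peers: the case shown in Figures 3 and 4.
        return "cached but marked not peerable"
    if remote_has_content:
        # Not expected from the description above; repeat both queries.
        return "inconsistent result"
    return "not cached"


print(diagnose(local_has_content=True, remote_has_content=False))
```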
Resetting the Database Using BCMon
You can also use the tool to reset any
content that has been marked as “not peerable” so that it will peer again.
BCMon.Net.exe MakeSegmentDiscoverable system
Happy peering!
/Mattias Benninge
Principal Engineer @ 2Pint Software
@matbg X / Twitter