A common question we get when people start out on their P2P journey is a concern about ‘overloading’ the clients that are serving up the content. What if you only have a single client with the content, and another 100 machines all request it at once? BranchCache has its own built-in safety valves – the main one being that it can start sharing content the second the first block of data makes it across the WAN from the DP, so it’s much more dynamic and distributed. BranchCache also won’t serve up content if the host is on battery power (configurable via Policy).
Microsoft ConfigMgr Peer Cache also has some ‘built-in’ thresholds which are designed to avoid system resource overload. Peer Cache source clients (AKA SuperPeers) have a number of hard thresholds which, when breached, will result in the client refusing the connection from the requesting peer. These parameters are as follows:
MaxAvgDiskQueueLength (default 10)
MaxPerCentProcessorTime (default 80)
MaxConnectionCountOnClients (default 0 – which I guess means the OS limit)
MaxConnectionCountOnServers (as above)
RejectWhenBatteryLow (default TRUE)
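Conceptually, the gatekeeping boils down to a handful of comparisons. Here’s a rough Python simulation for illustration – the real checks live inside the ConfigMgr client, and everything here (the function name, the OS-limit fallback value) is invented purely to mirror the parameters above:

```python
# Illustrative simulation of a Peer Cache source's admission checks.
# The real logic lives inside the ConfigMgr client; these names simply
# mirror the documented threshold parameters.

DEFAULTS = {
    "MaxAvgDiskQueueLength": 10,
    "MaxPerCentProcessorTime": 80,
    "MaxConnectionCountOnClients": 0,   # 0 = fall back to the OS limit
    "MaxConnectionCountOnServers": 0,
    "RejectWhenBatteryLow": True,
}

def should_reject(avg_disk_queue, cpu_percent, connections,
                  on_battery_low, thresholds=DEFAULTS, os_conn_limit=20):
    """Return (rejected, reason) for an incoming peer content request."""
    conn_limit = thresholds["MaxConnectionCountOnClients"] or os_conn_limit
    if avg_disk_queue > thresholds["MaxAvgDiskQueueLength"]:
        return True, "disk queue too long"
    if cpu_percent > thresholds["MaxPerCentProcessorTime"]:
        return True, "CPU above threshold"
    if connections >= conn_limit:
        return True, "too many connections"
    if on_battery_low and thresholds["RejectWhenBatteryLow"]:
        return True, "battery low"
    return False, "accepted"
```

So a request arriving while the box is at 95% CPU gets turned away, even if the disk and connection counts are fine.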
So. Fairly self-explanatory, right? If you want to observe/test this behaviour, it’s possible – here’s how.
In the current release, the parameters are hard-coded and not exposed in the UI via the Client Settings interface like the other BranchCache/Peer Cache settings. Perhaps this will change in coming releases. You can change the defaults – but you have to do it programmatically via the ConfigMgr APIs (there’s a C# example in the SDK if you want to fiddle!) – and you should be able to generate enough load on the Peer Cache source to make it barf when it gets a content request. I did compile a wee .exe based on this example, which you can download here, and which you can use to change the default values from those above.
To test this, I used JAM Software’s HeavyLoad tool to overload the CPU on my test client, and that was enough.
You’ll want to enable verbose logging on the Peer Cache source too, as most of the action happens in the CAS.log. This article describes how to do that – just a quick registry hack and a ccmexec restart and off you go. Also grab a copy of WMI Explorer – there’s some stuff in there too.
Once you have some content in the CCMCACHE of your Peer Cache source (and have turned on logging etc. as above), generate a high CPU load so that the machine is really busy – the threshold is 80% (I cheated and set it to 30% for testing using my homegrown .exe).
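If you don’t have HeavyLoad to hand, a few lines of Python make a serviceable stand-in – a quick-and-dirty sketch, nothing official, it just spins every core for a while:

```python
# Quick-and-dirty CPU load generator: spin one busy-loop worker per core
# for a fixed duration. A stand-in for tools like HeavyLoad.
import multiprocessing
import time

def burn(seconds):
    """Busy-loop for `seconds` to keep one core pegged."""
    end = time.monotonic() + seconds
    while time.monotonic() < end:
        pass  # pure spin - maximum CPU, no I/O

def generate_load(seconds=60):
    """Peg every core for `seconds`, then return."""
    workers = [multiprocessing.Process(target=burn, args=(seconds,))
               for _ in range(multiprocessing.cpu_count())]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

Kick off `generate_load(60)`, then fire the content request from the other client while the cores are pegged.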
Then, try to grab some content from another client. Here’s what you will see:
In the CAS.log on the Peer Cache source – you will see the request being blocked, like a 300lb nightclub doorman with a headache.
Notice that there are a couple of attempts – so if the machine is just a ‘bit busy right now’ but calms down below the threshold, you’re back in business.
Over on the requesting client, check the DTS log and you will see a minor bloodbath of red lines as it tries, then fails over to the DP in this case. If you have multiple Peer Cache clients, of course, it will simply fail over to the next (and hopefully less busy) Peer Cache source.
Peer Cache sends an HTTP 429 error – ‘Too Many Requests’ – so that DTS knows it needs to back off and try again. It waits 30 seconds before the final attempt, after which it moves on down the list of content sources.
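That flow is just a retry-with-backoff-then-failover pattern. A small Python sketch of the idea – not actual DTS code; the function names and the `fetch` callback are invented for illustration, with the waits and attempt counts mirroring what the logs show:

```python
# Sketch of the download job's retry/failover pattern when a peer answers
# with HTTP 429 (Too Many Requests). Not actual DTS code - just the idea.
import time

def download(sources, fetch, retry_wait=30, attempts_per_source=2):
    """Try each content source in order; on 429, wait and retry once
    before failing over to the next source (ultimately the DP)."""
    for source in sources:
        for attempt in range(attempts_per_source):
            status, data = fetch(source)
            if status == 200:
                return source, data
            if status == 429 and attempt + 1 < attempts_per_source:
                time.sleep(retry_wait)   # back off before the final attempt
        # source is over its thresholds - move down the content source list
    raise RuntimeError("all content sources exhausted")
```

With a stubbed `fetch` that returns 429 for a busy peer and 200 for the DP, the call retries the peer, gives up, and falls through to the DP.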
Back on the Peer Cache source, there’s one last thing to look at. As soon as a request for content is denied, a new WMI class gets created to record the events. These events are then sent up with the other Peer Cache stats – and there’s a report within ConfigMgr that tells you which machines are rejecting requests and why – so I guess if you really wanted to, you could report on clients that were refusing connections frequently and demote them to non-Peer Cache sources.
The new class is a CTM class called CCM_CTM_PeerSourceServiceRejectionStats, and it gives you the content ID and a count of rejections.
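If you did want to roll your own “demote the flaky peers” report from that data, the aggregation is trivial. A toy Python sketch – the record layout here (Machine/ContentID/RejectionCount field names, the threshold of 50) is illustrative, not the exact WMI schema:

```python
# Toy aggregation of rejection stats, mimicking what a report might do:
# flag peers that reject requests frequently. Field names are illustrative,
# not the exact CCM_CTM_PeerSourceServiceRejectionStats schema.
from collections import defaultdict

def flag_unreliable_peers(records, threshold=50):
    """Sum rejections per machine; return peers worth demoting, sorted."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec["Machine"]] += rec["RejectionCount"]
    return sorted(m for m, n in totals.items() if n >= threshold)
```

Feed it the per-content rejection counts and anything over the threshold comes back as a candidate for demotion.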
That’s it! As you can see, the ConfigMgr product group are putting a lot of effort into making Peer Cache a safe, scalable P2P solution, and these safety nets are a step in the right direction. Let’s see what comes next!