
Monday, June 22, 2020

Workspace ONE UEM Alternatives To AD Group Policies: Get Yourself Free

Modern management promises the ability to administer desktops and laptops without requiring that they be members of a domain or even connected to a corporate network. A common techie response to this proposition is, "what about GPOs?" IT shops have managed desktops with AD-based GPOs for decades now, a process that's been pretty much boiled down to a science.  However, when it comes to switching from AD-based GPOs to modern management, the path forward is far less prescriptive or codified, with a wide range of options and room for creativity.  At the risk of being crude, I'd say there's 50 Ways To Leave Your Lover and over 10 ways to leave your AD GPOs while embracing modern management with Workspace ONE.

There are 5 different Configuration Service Provider (CSP) based options alone, including WS1's native Win10 profiles, AirLift exports and Policy Builder. Then there's WS1 Baselines, which under the hood is an amalgamation of 3 different non-CSP-based strategies. Finally, there's the GPO migration tool, customized scripts and various imaginable DIY permutations.

This post is a primer on transitioning from AD-based GPOs to Workspace ONE's modern management alternatives.  It will review and prioritize various guidance and strategies, with particular focus on the recently released tutorial, Understanding Windows 10 Group Policies: VMware Workspace ONE Operational Tutorial. While providing brief descriptions of the different alternatives, I want to zero in on a few key decision points along the path from traditional AD GPOs to modern management.

Do You Have An AD Legacy To Preserve?

With all the migration options available, overlapping capabilities, caveats and all, sorting out an optimal path forward is tough.  While discussing this challenge, a colleague of mine, Jason Walker, made an excellent point: "Well, if someone's asking, 'which route should I take,' the first question to ask is, 'who are you and what's your normal role/technical background?'"  So are you an MDM guy who's looking for some basic management or are you a grizzled AD administrator who's managed GPOs for decades?  To further refine the question, "does your enterprise have a heavy investment in and reliance on GPOs?  Is there a GPO legacy you absolutely need to port over to modern management?"  Getting this question answered elucidates a path forward.

For the traditional MDM guy or someone who doesn't have the baggage of an extensive AD GPO legacy, start with a careful investigation of WS1's native Win10 capabilities, then move on to WS1 UEM Baselines.  On the other hand, if you have an AD legacy to preserve and extend to modern management, start with AirLift.  The reporting capabilities of AirLift alone are worth the price of admission, providing key information for GPO rationalization.  With this info in hand, scan the built-in Win10 profiles for overlapping functionality, then turn to WS1 Baselines or AirLift's export capabilities to fill in the gaps.  Finally, in both scenarios, there's the option to fall back to various customized SyncML alternatives or scripting strategies.

Whichever path you're on, an investigation of WS1 UEM's built-in profiles is in your future, so I'm going to cover that next.

WS1 Win 10 Management Out Of The Box

Before diving into all the alternatives you should first investigate the built-in native Windows 10 payloads.   These payloads map to specific Configuration Service Providers (CSPs), essentially Microsoft's APIs for Windows 10 modern management.  Out of the box there are 30 payload types that configure hundreds of settings.  When you eyeball these payloads there's significant overlap with traditional AD GPOs.   Examples include payloads for password settings, BitLocker, Defender, Windows updates and Windows Firewall.

These built-in WS1 Win10 payloads aren't a complete substitute for the thousands of GPO settings known to mankind.   However, when you think tactically about what’s really essential for Win10 mobile management, what they do cover is formidable.  Given they are built-in capabilities, both easy to implement and maintain, it makes sense to exhaust them fully before exploring alternatives.

For a short but sweet description of WS1 Windows 10 modern management capabilities, check out this video by Chris Halstead, VMware Workspace ONE: Windows 10 Modern Management - Technical Introduction.  For a very dense and comprehensive overview, check out this video by Pat Linsky: VMware Workspace ONE UEM: Windows 10 Modern Management - Technical Overview.

The delta between WS1's built-in Win10 management capabilities and traditional AD GPO settings reminds me of a scene in Monty Python's Life of Brian.  While the revolutionaries rant about, "what have the Romans ever done for us," they realize, oh yeah, they have done a lot, huh?  Likewise, while WS1 out of the box doesn't have full parity with traditional AD-based GPOs, a lot of relevant GPO functionality has been addressed.  "Alright, but apart from Windows Updates, Anti-Virus, BitLocker, Firewall, certificates and settings associated with the other 23 built-in payloads, what has WS1 UEM ever done to replace traditional AD GPOs?"  Well, there's plenty of alternatives to explore, with Workspace ONE Baselines shining brightest.

Bazooka Baselines - The Duck-Billed Platypus Of WS1 GPO Alternatives

Workspace ONE Baselines, a component of WS1 Advanced edition, is, to say the least, one odd duck!  Essentially, it's an offering of 3 very different methods for pushing out settings to Windows 10 devices: Baseline Templates, Custom Baselines and a catalog of ADMX-backed settings.  Through Baseline Templates you apply hundreds of settings at a time based off the Windows 10 Security Baseline or CIS Benchmarks.  This is useful for situations where you really want to heavily lock down a Windows 10 device according to industry standards and best practices.  It's ideal for a scenario where someone is looking to manage a Windows 10 device more like a purpose-built mobile device.  We're talking about a wad of settings here, 380+ for Windows 10 1809 for example, so there's certainly some commitment involved, with a lot of settings you may have to vet.  At first glance it can seem a bit unwieldy, however you can disable or tweak values for these settings individually as needed.

This solution offers some very compelling compliance reporting.   Each device with a Baseline assigned to it will report back as Compliant, Intermediate Compliance or Non-Compliant. Compliant represents having 100% of the settings actively applied, while Intermediate Compliance represents having 85% to 99% of the settings implemented.  Anything less will report back as Non-Compliant.  By default, compliance status is reported at intervals of every 6 hours, so you get a fairly up-to-date reflection of the actual state of the device.  Even more impressive is the ability to zero in on a specific device and drill down into which particular settings are currently implemented versus settings that are non-compliant.

This granular reporting really sets the solution apart from other modern management alternatives. With other methods you can always test an individual device with tools like Policy Analyzer or RSoP, but with Baselines we're getting that functionality and insight built right into the UEM console.  Further, there's an option to leverage a registry setting that will force the reapplication of Baselines at a defined interval, ensuring your desired settings remain enforced.  This registry setting can get pushed out through a custom settings payload or manual registry edit, as specified in the latest guidance.

WS1 Baselines also offers a Custom Baseline option which involves pushing out an exported GPO rather than an industry standard template.  You can use Group Policy Manager or LGPO.exe to export a GPO from your current environment, then upload a zipped copy of that backup to the console.  Baselines will then go on to import that GPO on a target device using a local instance of LGPO.exe. 

Custom Baselines are a nice little option to have in your back pocket.  If all else fails, do an export of your current AD GPOs and then just blast it out to your target devices.  However, there are two caveats.   One, you're not getting any kind of lifecycle management built into the UEM console.  If you want to make a change to a GPO setting getting pushed out, you have to go back to your AD environment, edit the original GPO, do another export, then do another import to Baselines. Also, you don't get the benefit of compliance reporting like you do with Baseline Templates.  The latest guidance addresses these shortcomings quite succinctly with the statement, "You will find that custom baselines lack the ability for full lifecycle management such as, reporting and making edits directly via the console."

Finally, a third capability of this solution involves a built-in, cloud-based ADMX catalog.  Whether you're going with Baseline Templates or a Custom Baseline, you have the option to add additional settings from this catalog.  Honestly, in my mind, this corollary to Baselines or Custom Baselines should be seen as its own separate solution and represents the closest thing to an "easy button" for configuring individual GPO settings with WS1.  We're talking a very extensive catalog, with some 4300 group policy settings to choose from.

After you make your selection, the tool actually leverages functionality traditionally associated with Dynamic Environment Manager, what used to be called "User Environment Manager."  It's really interesting to see how VMware has harnessed this traditional Horizon tool to round out WS1 Baselines.   Fortunately, this subset of Dynamic Environment Manager's capabilities is built right into the WS1 agent and doesn't require any additional installs.  Another piece of good news is that, like with Baseline Templates, you do get compliance reported back regarding these ADMX-backed settings, so that's pretty awesome too.

To wrap things up, here are the three different mechanisms offered by WS1 Baselines:

Baseline Templates, which apply industry-standard settings based on the Windows 10 Security Baseline or CIS Benchmarks
Custom Baselines, which push out exported GPO backups using LGPO.exe
The cloud-based catalog of ADMX-backed settings, delivered through Dynamic Environment Manager functionality

I told you, this thing is weird, like Weird Al Yankovic weird.  Like a street performer playing Crocodile Rock on 9 separate musical instruments kind of weird.  That said, it sure does cast a wide net.  It's hard to imagine a GPO setting you couldn't theoretically push out with this tool one way or another.  

For a greenfield deployment or situation without a GPO legacy to mind, WS1 Baselines are a dream come true.  For that matter, it's still a very relevant and viable option for those looking to preserve a GPO legacy, with one major caveat.  It doesn't really look backwards at your environment or offer any analysis of what you're currently doing with GPOs.  So for an AD guy looking to preserve a legacy, even if you're determined to use Baselines, you'll still want to run AirLift for its reporting capabilities.   I'm going to detail AirLift's Policies features next.

For additional info on Baselines, check out:

Official documentation: Using Baselines
Fantastic video: VMware Workspace ONE UEM: Baselines - Feature Walk-through
Latest Tech Zone guidance: Modernizing Group Policies Using Workspace ONE Baselines

AirLift - It's Not Just For SCCM Migrations

The release of AirLift 2.x in 2019 introduced assistance with GPO migrations, making it relevant to pretty much all WS1 customers, not just SCCM admins.  Sure, it's primarily focused on migrating folks from SCCM to WS1 UEM.  However, the subset of its functionality that addresses AD group policies, Workspace ONE AirLift Policies, is applicable to any customer who has AD group policies they want to port over to modern management.   Further, it's free, with very minimal requirements, so the modest setup effort is well worth it.  After standing AirLift up and pointing it to your domain controllers, you get an exhaustive report of all the GPOs within your domain, along with their associated settings.  Useful information returned includes which OUs these GPOs are assigned to, so you know their current scope within your environment.  Most notably, you get feedback on why or why not such settings are suitable for modern management, as well as whether AirLift can automate the export of these GPO settings to UEM profiles for you.  This reporting functionality is invaluable for planning and GPO rationalization, something I'll come back to shortly.

Again, AirLift also offers an option to export group policies directly from Active Directory to your WS1 UEM environment.  This process is dependent on whether or not a specific GPO setting has a corresponding Configuration Service Provider (CSP).  CSPs represent Microsoft's efforts to make Windows 10 a modern mobile operating system.  They're essentially APIs for configuring GPO settings on Win 10 through XML, what's called SyncML.  The Configuration Service Provider reference states, "A configuration service provider (CSP) is an interface to read, set, modify, or delete configuration settings on the device. These settings map to registry keys or files." Rumor has it that Microsoft has an army of geeks seeking to create CSPs for all relevant GPOs, but with over 3,000 group policy settings for Windows 10 and 1,800 IE settings, there's plenty of work to be done.

In a nutshell, where it's possible, AirLift generates the custom SyncML required to set the GPO settings through supported CSPs.  This SyncML in turn is configured as the payload in a custom Windows 10 UEM profile that can then be assigned to your target devices through Smart Groups.  If you like, the utility offers the option to combine multiple exportable GPOs as a payload into a single custom profile.  Further, it can also export some 3rd party ADMX policies such as Google or Office.
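
To make this concrete, below is a rough, hand-written sketch of the kind of SyncML that ends up as a Custom Settings payload.  The CSP node shown (System/AllowTelemetry from the Policy CSP) is just an illustrative example, not something taken from an actual AirLift export, and AirLift's real output may be formatted differently:

```xml
<!-- Illustrative sketch only: sets the Policy CSP's System/AllowTelemetry node to 1 (Basic).
     AirLift-generated SyncML for your own GPOs will target different LocURIs. -->
<Replace>
  <CmdID>1</CmdID>
  <Item>
    <Target>
      <LocURI>./Device/Vendor/MSFT/Policy/Config/System/AllowTelemetry</LocURI>
    </Target>
    <Meta>
      <Format xmlns="syncml:metinf">int</Format>
    </Meta>
    <Data>1</Data>
  </Item>
</Replace>
```

Each exported GPO setting ultimately boils down to a command like this against a CSP's LocURI, which is why CSP support is the gating factor for the whole export process.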

Before charging ahead with AirLift's export capabilities, you should first leverage its reporting capabilities for some critical GPO rationalization.  That's a topic I'm going to review next.

For additional info on AirLift, check out:

Official Documentation:  Introduction to VMware Workspace ONE AirLift
Great Video: VMware Workspace ONE AirLift: Windows 10 Migration - Expert Panel
Latest Tech Zone Guidance: Using Workspace ONE AirLift to Analyze Group Policies

GPO Rationalization 

GPO Migration Strategy [Graphic]. (2018). Retrieved June 2020 from Developing A Modern Management Adoption Process

With AirLift reporting in hand, you're well positioned to begin rationalizing your GPOs.  As with moving house, it's best to throw away as much as you can rather than get caught up transporting stuff you really don't need.  Perhaps a GPO setting is inherently dependent on domain membership.  Maybe a GPO setting isn't applicable to the latest version of Windows 10 or remote users, or is just a vestige that's no longer relevant to your enterprise.  Whatever the reason, if it doesn't have a place in the world of mobile Windows 10 management, you need to let it go.  So, rather than trying to port everything over "just in case," it's best to start with some house cleaning.

After zeroing in on the GPOs that matter to you, the next question to ask is, "which of these settings are accommodated by the native capabilities of WS1 UEM?"  This essentially amounts to, "of these settings, which ones are both supported by CSPs AND WS1's built-in Win10 profiles?"  WS1's built-in profiles are reliant on CSPs, so for anything AirLift reports as not being supported by a CSP, you can spare yourself the search.  However, for any settings reported as exportable, you'll want to search for a match in the UEM console or the Windows Desktop Device Management guide.

For functionality not covered by the built-in UEM profiles, there's a judgement call to be made between leveraging WS1 Baselines or AirLift's policy export option.  

Baselines or AirLift Exports? Another Key Decision Point

If a GPO setting isn't supported by a CSP, WS1 Baselines is probably your best bet. However, let's say there's a GPO setting that isn't accommodated by native WS1 UEM functionality, but does have support from CSPs.  Let's take it a step further and say not only is it supported by a CSP, but it's also reported by AirLift as exportable.  Should you proceed with the AirLift export option or should you investigate Baselines?  The latest VMware guidance describes this dilemma as making a choice between "Modernize or Migrate."

A lot of the decision hinges on what kind of administrative and lifecycle capabilities you'll need going forward.  There's also a question of how much time and energy you have in the short term.  If all your settings are supported by the AirLift export option, you're just a few clicks away from a migration.  With Baselines there's probably going to be more up-front work, as you manually map out your GPO settings or vet excess settings associated with the templates, but you get a big fat GUI at your disposal from start to finish.  In the future this GUI provides a straightforward process for pushing out updates or changes to your policy.  Further, you get granular and up-to-date compliance reporting on all these settings, along with the ability to reinforce them.  The case is very different if you choose to leverage AirLift's export functionality.  While AirLift may provide a very fast and automated method for migrating your GPOs, it doesn't offer a mechanism for managing these settings going forward.  Essentially, you get your GPO setting, or settings plural, combined into a big wad of SyncML that in turn is added as a payload to a custom profile.

Now, should you want to tweak the settings pushed out by this profile, you'll have two major options going forward.  One, you can edit the SyncML manually or with the assistance of Policy Builder, both of which require some skill.  It's doable, but not for the faint of heart.  A second option would be to edit the original AD GPO, then attempt a new export with AirLift.  That's a bit unsavory and hardly feels like freedom from AD GPOs.  It feels more like a trial separation at best.  Contrast this with making an update on an existing Baseline, where you make GUI-guided edits on the UEM console directly, push the new settings out, then later receive confirmation the settings have been implemented.

From the user's perspective there's no perceivable difference between a setting pushed out with a CSP versus Baselines.  What's really at stake is administrative overhead and lifecycle capabilities going forward. With the AirLift export option you get something akin to an easy button for migrating your AD GPOs where CSPs support them, but limited manageability moving forward.  With a transition to WS1 Baselines there might be more work up front, but there's simpler on-going manageability and a more promising shot at freedom from on-premises dependencies.  Where possible, I would recommend aiming for Baselines adoption, assuming you have the WS1 Advanced licenses required for it.  However, the ideal path forward for you is going to depend on your circumstances.

For a really interesting analysis on this decision point check out the section titled, Choosing The Correct Policy Delivery Model, in the latest Tech Zone guidance.

Additional CSP Options

If AirLift isn't feasible or its export function needs customization or augmentation, VMware's documentation calls out 3 other CSP-based options.  All involve tweaking SyncML that's delivered through a Custom Settings profile.   One option is to leverage sample SyncML from the VMware Sample Exchange.  Another is to generate it using a Fling called Policy Builder.   Finally, a third option is to leverage the Microsoft CSP Development Suite.   Of these 3 choices, I find Policy Builder the most accessible and promising.

To get access to Policy Builder, navigate to https://vmwarepolicybuilder.com/ and log in with your My VMware account.  After selecting which version of Windows 10 you're focusing on, you'll get presented with relevant CSPs.

There's a very wide range of depth and capabilities amongst these different CSPs.  For example, the Accounts CSP has only two configuration options.  On the other hand, the Policy CSP has an absolutely mind-blowing range of options.  If you browse to Microsoft's Configuration Service Provider reference and select the Policy CSP, you'll see that the settings go on for days. Go ahead. I dare you. See how many scrolls it takes to get through the entire list.  It's a whole lot of configuration options at your fingertips.

For this example, while configuring the Policy CSP within Policy Builder, navigate to Device --> Config --> Start.   Here you'll find a boatload of start menu settings.  Zero in on the options to hide the sign-out and sleep entries on the start menu.  By researching these specific settings within the Configuration Service Provider reference, you'll discover that a value of 1 enables these settings.   After punching in the number 1, SyncML is automatically generated on the right-hand side of the page.  Clicking the ADD button updates the SyncML to what's required for implementing this setting.   The process is the same for the Hide Sleep option.

Copy this generated SyncML into the Custom Settings payload of a new profile.  For the remove settings, simply go back to Policy Builder and click on the Delete button instead of Add.   The SyncML will change on the right accordingly, replacing <Add> with <Delete>.  Copy this SyncML into the Remove Settings section.  
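
For reference, here's roughly what the generated SyncML looks like for one of these settings.  This is a sketch based on the Policy CSP's Start/HideSignOut node as documented in Microsoft's Configuration Service Provider reference; Policy Builder's exact output may differ slightly:

```xml
<!-- Sketch: hides the sign-out option via the Policy CSP (a value of 1 hides it) -->
<Add>
  <CmdID>1</CmdID>
  <Item>
    <Target>
      <LocURI>./Device/Vendor/MSFT/Policy/Config/Start/HideSignOut</LocURI>
    </Target>
    <Meta>
      <Format xmlns="syncml:metinf">int</Format>
    </Meta>
    <Data>1</Data>
  </Item>
</Add>

<!-- Sketch: the corresponding payload for the Remove Settings section -->
<Delete>
  <CmdID>1</CmdID>
  <Item>
    <Target>
      <LocURI>./Device/Vendor/MSFT/Policy/Config/Start/HideSignOut</LocURI>
    </Target>
  </Item>
</Delete>
```

The Add and Delete blocks target the same LocURI, which is what lets UEM cleanly reverse the setting when the profile is unassigned.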

After assigning this profile to an endpoint, you'll see the desired results.  As expected, there's no sleep option on the power button and no sign-out option on the user's start menu.

Again, when you consider the vastness of something like the Policy CSP, this Policy Builder option is definitely worth more than an honorable mention.  Yeah, it's not as nice as the completely supported processes offered by WS1 Baselines or AirLift.  However, if you're someone who likes to tinker, oh boy!  There's a lot to work with here.

For additional info on Policy Builder, check out:

Latest Tech Zone Guidance: Modernize Group Policies Using VMware Policy Builder
Excellent Overview:  Introducing VMware Policy Builder: The Quick and Simple Way To Build Your Windows 10 Custom Settings

But wait, there's more! 

So if for some reason you just have to say no to AirLift, Baselines or CSPs, no worries, there are more options.   One is the GPO Migration Tool, an alternative that's been around for a couple years now.  This tool is similar to the Custom Baseline option in that it leverages GPO backups that are then pushed out to target devices.  While this isn't a fully supported tool, it's certainly an interesting fallback option.  If nothing else, it's a wonderful example of what can be done if someone rolls up their sleeves and decides to get-er-done.

With WS1 provisioning packages in our back pocket, we essentially have an elevated command prompt available on any of our managed Win10 devices.  Well, there's an awful lot that can be done with a command prompt and a script if you put your mind to it.  Further, there's an option to push out registry changes directly from a Custom Settings profile in UEM.   This option gets called out in the appendix of the latest guidance, which references the blog post, "How To Set Registry Values Using The Custom Setting Profile In Workspace ONE UEM."   Given that a large majority of GPO settings essentially map to registry settings, there's a lot of ground you could cover with this option alone.

Additional Resources For Reporting On GPOs

If you can't run AirLift, need to augment AirLift reporting or just want a quick look at a specific policy with minimal overhead, there's Microsoft's own MDM Migration Analysis Tool, MMAT. Unlike AirLift, it won't give you an exhaustive enterprise view of all your GPOs and applicable modern management equivalents.  However, on whatever target system you run it on, it will analyze the GPOs currently assigned to that system and report back on the feasibility of migrating them to modern management.  Also, as previously mentioned, there's Microsoft's Configuration Service Provider reference that details the Windows 10 CSPs developed for modern management.  If there's a specific setting you're interested in crafting customized SyncML for, it makes sense to dive into this reference.

And Get Yourself Free

The subject of WS1's modern management alternatives for GPO settings sits at the intersection of two very different worlds.  On one end of the spectrum you have an MDM admin who, though possibly a brilliant tech, has never touched a production domain controller in their life.  At the other end of the spectrum you've got a grizzled AD veteran who's managed enterprise desktops with GPOs for decades.  They're two very different kinds of people with different expectations and priorities when it comes to AD GPOs.  That their needs for GPO settings overlap may very well be the only thing they have in common.  Accordingly, they're likely to require different paths on the journey to modern management.

For the traditional MDM guy or someone who doesn't have an AD legacy to preserve, I'd say start with a careful investigation of all built-in WS1 capabilities, then move on to fill in any gaps through Baselines.   On the other hand, if you're in an organization with a significant investment in GPOs and need to port that legacy to modern management, the path ideally begins with AirLift. With that reporting you get your arms around what's going on in your environment, size up your challenges and rationalize your GPOs.  Then thoroughly investigate the built-in capabilities of UEM.  You may find a lot of what you need is already built into the tool.  From there, investigate Baselines and then fall back to AirLift's export capabilities.  If there are still challenges, you can turn to the other CSP options, the GPO Migration Tool or various other DIY alternatives imaginable.

Thursday, March 26, 2020

A Primer On NSX Advanced Load Balancer (Avi Vantage) For Horizon And Workspace ONE

NSX Advanced Load Balancer, formerly called Avi Vantage, is a solution VMware secured through the acquisition of Avi Networks.  A fully software-defined load balancing solution/application delivery controller, Avi Vantage adds L4-L7 server load balancing to NSX, rounding out an already impressive SDN solution.  Overall, the Avi Vantage offering is a natural progression for VMware, a continuation of what the company has always been good at: replacing beefy, unwieldy, hardware-bound solutions with agile and efficient virtualization.

While the acquisition has been cause for VMware network geeks to rejoice, it's also a particularly exciting development for VMware's end user computing products, Horizon and Workspace ONE.  Traditionally these solutions have required the use of third party load balancers, which has been fine, though it does introduce a bit of complexity and another vendor to deal with. So to start with, the Avi acquisition offers an opportunity to simplify the VMware EUC stack, along with the promise of a more tightly integrated load balancing solution.  In mid-March, the release of UAG 3.9 added, "Qualified support for the AVI Networks load balancer used in front-ending Unified Access Gateway for Horizon."   Earlier in the year, a Reference Architecture for Horizon leveraging Avi Networks was released.  Further, there's this step-by-step configuration guide, Configure Avi Vantage For VMware Horizon. While these documents are quite exhaustive, I put together this post as a primer on Avi Vantage for Horizon admins.  The idea is to give folks a high-level overview of how Avi Vantage plugs into the Horizon/WS1 stack and why it's relevant.

Why I'm So Giddy About Avi Vantage And Horizon

When it comes to VDI and app publishing, it's essentially a 2 company game: VMware vs Citrix.  The competition and rivalry is intense to say the least.  Large fortunes and entire careers fuel fierce debate, endless FUD, mud slinging, hyper bake-offs and neurotic Excel spreadsheets filled with feature-by-feature comparisons.  Fear and loathing abounds, with otherwise genteel engineers staring out through dead shark eyes, broken half bottles in hand, ready to cut ya!  At times it feels more akin to identity politics, fanatical sports rivalry or a downright Hatfields vs McCoys family feud.   As someone in the middle of this conflict, I've always had to admit that NetScaler sounded like a pretty solid product.  For a while, the worst thing you heard about it was that it's too expensive and offers more functionality than Citrix customers actually need.  However, with its latest vulnerability, NetScaler's stature as unquestionably awesome has come under scrutiny.  Combined with the notoriously bad treatment and support customers receive from Citrix, folks are really starting to wonder if it's worth the trouble to rely on them for this critical functionality.

More notably, both Citrix and VMware customers, being techies, are always looking for more innovative and smarter ways of handling things.   In the field of load balancing there hasn't been a lot of innovation or change, so in that regard Avi Vantage really stands out.  We're not talking about just P2V-ing a load balancer and patting ourselves on the back. With Avi Vantage we're talking about an elastic fabric that allows you to take advantage of the virtualization infrastructure you already have in place, whether it's across multiple data centers or even different cloud vendors.  Accordingly, Avi Vantage is a real shot in the arm for VMware's EUC stack in a couple major ways.   First, by adding load balancing and application delivery controller capabilities to VMware's arsenal, it brings its EUC stack much closer to parity with what NetScaler and Citrix offer. Second, while Avi Vantage might not be at complete parity with NetScaler, it does a lot that NetScaler can't.   In light of the current pandemic and associated challenges, this differentiator has some real teeth.  When firing up a new data center in the middle of a crisis, do you want to wait on the purchase and shipment of new hardware?  Do you want to limp along with a virtual appliance that's a subpar version of the load balancer you normally work with? Or would you prefer walking through a few left clicks and right clicks on your Avi Controller, simply extending a fabric you already have in place?

No doubt, there will be plenty of debate over NetScaler + Citrix vs Avi Vantage + Horizon.  If reason and cooler heads prevail, it won't be a simple debate, but instead a thought-provoking and interesting one.

Avi Vantage Overview

At a high level, Avi Vantage is a software-defined load balancing solution/application delivery controller that functions across an entire enterprise, including separate cloud environments like AWS, Azure or Google Cloud.  Most relevant for typical Horizon shops, it integrates quite impressively with traditional on-premises vSphere environments.   It all begins with a software-based Avi Controller, the brains of the operation where all load balancing policies are defined.   The Controller, or controller cluster, essentially binds to your vSphere environment(s).  In turn, the Avi Controller manages and controls the placement of virtual services across your vSphere infrastructure, hosted on what are referred to as Avi Service Engines.  Based on instructions received from the Avi Controller, the Service Engines "perform load balancing and all client- and server-facing network interactions." They also collect "real-time application telemetry from application traffic flows." The controller can automagically control the setup and distribution of these Service Engines across the ESXi hosts within your vSphere environment, ensuring proper redundancy, capacity and workload distribution.

These different Service Engines laid out across the vSphere infrastructure are what endpoint clients actually connect to and interface with.  They're associated with the VIPs and handle traffic based on the virtual services and pools defined on the Avi Controller.  So essentially, you define the load balancing logic on the controller, then these Service Engines act as minions that execute the logic for incoming client connections.

The end result is an elastic load balancing solution that avoids the efficiency challenges that plagued traditional hardware-based load balancing solutions.  The ability to automatically spin up Service Engines on the fly, scaling out VIPs horizontally as needed, allows for right-sizing.  Service Engines can be spun up or spun down in increments as small as 1 vCPU, 2 GB of RAM and 10 GB of storage.  Contrast this with redundant pairs of active/standby hardware-based appliances and this benefit of Avi Vantage becomes pretty compelling.
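To make the right-sizing math concrete, here's a back-of-envelope sketch. The per-Service-Engine capacity figure is a purely illustrative assumption, not an Avi sizing number; the point is that capacity scales out in small increments rather than in big appliance-sized chunks:

```python
# Rough sketch: estimate how many minimally sized Service Engines (1 vCPU,
# 2 GB RAM each) a given connection load would require. The 5,000
# connections-per-SE figure is an illustrative assumption, not Avi guidance.
import math

def service_engines_needed(concurrent_connections, conns_per_se=5000, redundancy=1):
    """Return the SE count for the load, plus spare SEs for redundancy."""
    base = math.ceil(concurrent_connections / conns_per_se)
    return base + redundancy

# Example: 12,000 concurrent sessions with one spare SE
print(service_engines_needed(12_000))  # 3 base SEs + 1 spare = 4
```

Contrast this incremental growth with a hardware pair, where the only "scaling" move is buying a bigger redundant pair.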

For more info check out this Architectural Overview for Avi Vantage.

Avi Vantage For UAG Appliances

The Reference Architecture for Horizon reviews 3 different methods for load balancing external traffic to UAGs. Factors such as the need for HIPAA compliance or whether you’ll have multiple clients behind a single NAT, at a remote site, determine which method is most appropriate. For this post, I’m going to review the first option, Single VIP with two virtual services.

Regardless of which option you go with, it all begins with a Horizon client communicating with a virtual service supported on Avi Service Engines.  Virtual services are comprised of IP and port combinations defined on the Avi Controllers.  The client traffic is passed by these services to the optimal UAG appliance based on pools that have also been defined on the Avi Controller. Pools determine the ideal server to pass traffic to based on configurations like server lists, health monitoring, load balancing algorithms, etc.
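Conceptually, a pool's server-selection logic boils down to something like the following toy sketch: filter out members the health monitor has flagged, then apply a load balancing algorithm (least-connections here). The real Avi implementation is far richer, and the pool data is invented for illustration:

```python
# Toy sketch of pool logic: drop unhealthy members, then pick the member
# with the fewest active connections (a least-connections algorithm).
def pick_server(pool):
    """pool: list of dicts with 'name', 'healthy' and 'active_conns' keys."""
    healthy = [s for s in pool if s["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy pool members")
    return min(healthy, key=lambda s: s["active_conns"])["name"]

# Hypothetical UAG pool: uag3 has failed its health monitor
uag_pool = [
    {"name": "uag1", "healthy": True,  "active_conns": 41},
    {"name": "uag2", "healthy": True,  "active_conns": 17},
    {"name": "uag3", "healthy": False, "active_conns": 0},
]
print(pick_server(uag_pool))  # uag2
```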

To illustrate, below is a graphic detailing the anatomy of a typical Horizon Blast session through a UAG appliance.  Initially you have the primary Horizon protocol handling authentication through XML structured messages over port 443.   Then you have the secondary Horizon protocol, Blast in this example, operating over 8443.  (For an excellent primer on UAG load balancing and Horizon protocols check out this amazing post by Mark Benson.)

Accordingly, we have two virtual services to configure on Avi Vantage, one for the primary protocol and one for the secondary protocol.  Below is a screenshot from my own lab.  The virtual service Horizon_UAG_L7 is configured to accommodate the primary Horizon protocol operating over TCP 443, while Horizon_UAG_L4 is configured for both the PCoIP and Blast Extreme secondary protocols, which operate over TCP/UDP 4172 and 8443 respectively.

These virtual services in turn are associated with a pool that determines server selection for incoming traffic based on configurations such as load balancing algorithms, health monitoring and persistence profiles.

Finally, below is a screenshot of a custom Health Monitor that's created for Horizon.  The Health Monitor is associated with a pool and helps, "validate whether servers are working correctly and are able to accommodate additional workloads."

One of the key requirements of this entire setup is ensuring that users are routed to the same UAG appliance for both the primary and secondary protocols.  In a nutshell, we have to ensure the same UAG appliance that authenticates a user is used for the display protocol traffic as well.  For a single Horizon connection, you can't have authentication against one UAG appliance then display traffic flow over a separate UAG appliance.
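This affinity is typically enforced with a persistence profile. A source-IP persistence table reduces to something like the sketch below (a toy model with invented names, not Avi's implementation): the first connection from a client pins it to a UAG, and every follow-on connection from that client lands on the same appliance.

```python
# Toy sketch of source-IP persistence: once a client IP is mapped to a UAG
# for its primary (authentication) connection, all follow-on secondary
# protocol traffic from that IP is routed to the same UAG.
persistence_table = {}

def route(client_ip, uags):
    if client_ip not in persistence_table:
        # First contact: pick a UAG (simple round-robin by table size here).
        persistence_table[client_ip] = uags[len(persistence_table) % len(uags)]
    return persistence_table[client_ip]

uags = ["uag1", "uag2"]
primary = route("203.0.113.10", uags)    # authentication over 443
secondary = route("203.0.113.10", uags)  # Blast over 8443
print(primary == secondary)  # True: both legs hit the same appliance
```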

This has been a very basic, high-level overview of what's involved in load balancing UAG appliances through Avi Vantage.  For more details and step-by-step guidance, check out the Reference Architecture For Horizon along with Configure Avi Vantage For VMware Horizon.  Again, three different methods to choose from, based on the specifics of your use case, are detailed in this documentation.

Horizon Connection Server 

Traditionally, load balancers have always been a requirement for Horizon Connection servers, with at least two Connection servers needed to ensure redundancy for a production-caliber deployment.  So for a typical Horizon deployment with UAG appliances, you'll need load balancing in front of both the UAG appliances and the Connection servers.  Below is a helpful image to illustrate:

As you might imagine, accommodating this model is pretty much a slam dunk for Avi Vantage.  Setting up load balancing for the Horizon Connection servers is very similar to that for the UAG appliances.  As with UAG appliances, you'll configure a virtual service (or services), a pool and a health monitor, then you're off to the races.  For detailed step-by-step instructions on configuring Avi Vantage for Horizon Connection servers, check out this section of the Reference Architecture For Horizon.

For those familiar with UAG's built-in load balancing-ish capability referred to as High Availability, note that HA for UAG doesn't include load balancing for the Horizon Connection servers, just rudimentary load balancing for the UAG appliances.  This is a major advantage Avi Vantage offers over HA, though certainly not the only one.

Global Load Balancing For Always On Point Of Care Architecture

Always On Point Of Care is an architecture that's been around for about nine years now.  The basic idea is to provide a fully redundant, bullet-proof Horizon deployment.  Essentially, you stand up two separate Horizon environments that share no interdependencies, so that theoretically you could lose an entire site but still have Horizon services available.  Key to this model is a global load balancing solution that sits in front of the two sites, routing client connections to the separate Horizon environments.  Historically, this functionality has been handled by our load balancing partners.

Nowadays, rather than leaning on a partner, we can leverage Avi Vantage for global load balancing.  The documentation refers to this global load balancing feature as Avi GSLB.  For more details on leveraging Avi GSLB for Horizon, check out GSLB In Avi Vantage For Horizon.  Here's an awesome-looking graphic on this deployment model for APOC that I stole from the Avi Networks website:

App Volumes

Avi Vantage also supports the Always On Point Of Care model by providing load balancing for App Volumes.  Load balancing has always been a requirement for App Volumes redundancy and scaling: you have multiple, essentially stateless App Volumes Managers that share a common database, sitting behind a load balancer.  Load balancing for App Volumes is briefly covered in the Avi Reference Architecture for Horizon.  For reference you can also check out the F5 guide, Load Balancing VMware App Volumes.

Client Connection Breakdown 

Depending on the deployment method you go with, Avi Vantage can offer a nifty little breakdown of session health for individual connections. It can distinguish latency between the remote client and the Avi Service Engine from latency between the Service Engine and the back-end server. It can also account for fast or slow app server response times. This promises to come in handy when trying to get to the bottom of latency encountered by your Blast connections through UAG.

WS1 Use Cases

With official support for Horizon access already in place, it seems like only a matter of time before there's official support for the WS1 UEM services on UAG like Secure Email Gateway (SEG), VMware Tunnel and Content Gateway.  Further, the resources these services provide access to - email, intranet sites, SharePoint, etc. - are the more typical types of servers Avi Vantage has always been able to accommodate.  So just as with the Horizon use case, you'll have front-ending for the UAG appliances along with load balancing for on-premises resources.

vIDM Connector 

While it's kind of a niche scenario, there are situations that require load balancing for the vIDM Connector, such as when it's used for Kerberos authentication.  I'm not aware of any official support, but there's no reason to believe Avi Vantage can't provide load balancing for vIDM Connectors.


This is the most excited I've been about a VMware acquisition since AirWatch.  Along with all the practical capabilities that Avi Vantage brings to the EUC stack in the here and now, there's all the speculation about what it might be built to do in the future.  There are two or three different scenarios that consistently pop up when I speculate with old-timers over what VMware might do with Avi Vantage to further enhance the EUC experience.  I'm not going to go into that here, but I'm confident I'll be writing about such enhancements in the future.

Tuesday, December 24, 2019

Using VMware's Horizon Performance Tracker For Rudimentary Blast Optimization

Recently updated for Horizon 7.10, the VMware Blast Extreme Optimization Guide focuses on, "two key configurable components: the transport protocol and display protocol codec."  To gain real-time insight into the configuration of these components, and Blast performance in general, the Horizon Performance Tracker is a natural fit.  Both free and built into the Horizon agent, it's a very accessible way to get started with rudimentary Blast optimization. This article details the general principles behind Blast optimization and illustrates how Horizon Performance Tracker can assist in the fine-tuning of Blast protocol behavior.  It aims to provide context and guidance for tuning Blast's transport protocol, then moves on to codec and bandwidth considerations.  Along the way it will also review how the Horizon Help Desk Tool, also built into the Horizon solution, can further assist with Blast optimization.

The Basic Anatomy Of A Horizon Blast Session

The blog post, Load Balancing Across VMware Unified Access Gateway Appliances, contains one of my favorite descriptions of Horizon sessions.  Under the section titled, "Horizon Protocols," it details the distinction between a primary and secondary Horizon protocol.  The primary Horizon protocol is all about authenticating against the Horizon environment through XML over 443.  The secondary protocol is the display protocol itself, what translates and transmits pixels from within a virtual desktop OS to the display of an endpoint device.  This is what we're primarily concerned with when optimizing the Blast experience.  If you go with the default port of 8443 for Blast traffic, here's what the traffic flow looks like when remoting into a Horizon environment through a UAG appliance:

Typically, the primary protocol is completely over 443 between the Horizon client and UAG appliance, as well as between the UAG appliance and Horizon Connection server.   For the secondary protocol, Blast Extreme in this example, traffic flows over 8443 between the client and the UAG appliance.  Then, from the UAG appliance to the virtual desktop or RDS host, traffic flows over 22443.  

In the context of optimizing Blast for your environment, one of the first questions to ask about your Blast traffic is whether UDP or TCP is used for the transport protocol.  For most use cases UDP is preferable and is what the Blast protocol first attempts to leverage by default.  Accordingly, confirming that UDP is actually in use in your environment is a first step towards achieving an optimal Blast experience.

Observing The Transport Protocol In Use 

While you can look at Blast logs to determine what transport type is in use, the Horizon Performance Tracker offers a really, really, really easy and convenient way to determine this info. While not installed by default, Horizon Performance Tracker is built right into the Horizon agent and is offered as an optional component during the agent install.  (Here's more official guidance on installing Horizon Performance Tracker.)  Once installed, from within an active session, launch Horizon Performance Tracker from your start menu.  When it's launched you're presented with the, "At a Glance," tab.  While this initial screen is certainly interesting in its own right, things get particularly useful when you click on the icon with the grids in the right corner.  (Underlined in red in the image below.)

In the screenshot below, under the transport section, there's confirmation that UDP is leveraged for the transport protocol in both directions, the default behavior that Blast strives for.

If UDP were being blocked for some reason, you'd see something like this:

Again, for most use cases, UDP is the optimal transport, with the optimization guide stating that, with but two exceptions, "VMware recommends that you use UDP for the best user experience.  And if Blast Extreme encounters problems making its initial connection over UDP, it will automatically switch and use TCP for the session instead."  Accordingly, in most scenarios, if you see TCP in use as a transport protocol, something has gone wrong and tuning Blast involves making adjustments to ensure UDP is leveraged instead.  Your first step is to determine if there are issues with UDP port connectivity for 8443 or 22443 along your Horizon session's network path.  (I've provided guidance on this process in a previous post, Troubleshooting Port Connectivity For Horizon's Unified Access Gateway 3.2 Using Curl And Tcpdump.)  If you find that UDP traffic is getting blocked while traversing a foreign network outside of your control, you can try and stack the deck in your favor by leveraging port 443 for external Blast traffic on your UAG appliance.
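As a quick first pass on that connectivity check, TCP reachability for ports like 8443 or 22443 can be sanity-checked with a short socket probe. Note this only covers the TCP side; confirming UDP reachability needs different tooling, like the curl and tcpdump approach in the linked post. The hostname below is a placeholder:

```python
# Quick TCP reachability probe for a host/port pair. This only tests TCP;
# UDP reachability requires other tooling (tcpdump, iperf, etc.).
import socket

def tcp_port_open(host, port, timeout=3):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. tcp_port_open("uag.example.com", 8443) - hostname is a placeholder
```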

Shifting External Blast Traffic To Port 443 On UAG

Configuring your UAG appliance to leverage 443 for external Blast traffic increases the likelihood that external networks will allow your Blast traffic to pass. 443 TCP access is pretty much a given everywhere, a slam dunk in most use cases.  While 443 UDP connectivity isn't as certain as 443 TCP connectivity, it certainly has better odds than 8443 and is worth a shot.  Further, as an added bonus, making this change will most certainly increase your odds of TCP connectivity and having at least some kind of successful Blast connection.  Here's what the traffic flow will look like:

Shifting Blast traffic to 443 on your UAG appliance is a relatively simple process.  First, navigate to Horizon Edge services on the UAG appliance.   Here's what it looks like when the Blast External URL is configured for port 8443:

To change it to 443, simply append 443 instead of 8443 to the configured URL:
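If you want to double-check what you've configured, remember the Blast External URL is just a URL whose port the client will target; a quick parse confirms which port is set. The hostname below is a placeholder for your externally resolvable name:

```python
# Sanity-check which port a Blast External URL points at. The hostname is a
# placeholder for your externally resolvable UAG or load balancer name.
from urllib.parse import urlparse

def blast_port(external_url):
    parsed = urlparse(external_url)
    return parsed.port or 443  # https with no explicit port implies 443

print(blast_port("https://uag.example.com:8443"))  # 8443
print(blast_port("https://uag.example.com:443"))   # 443
```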

When configuring the Blast External URL, I like to imagine I'm sitting inside a Horizon endpoint client itself, looking for a path to forward Blast traffic to.  Think in terms of what's externally resolvable and accessible from the perspective of the endpoint.  Typically, it ends up being the VIP and associated DNS name on a load balancer.

When To Use TCP For Your Transport Protocol

The optimization guide indicates that UDP is usually the optimal transport to leverage, with two exceptions.  First, you'd want to go with TCP if, "Traffic must pass through a UDP-hostile network service or device such as a TCP-based SSL VPN, which re-packages UDP in TCP packets."  Since the days of PCoIP dominance, TCP-based SSL VPNs have always been a challenge for Horizon.  The encapsulation of UDP traffic into TCP packets by such VPNs is a real downer, nullifying the performance benefits of UDP.  For Blast traffic it's best to stick to TCP when using these types of devices or when there are other network challenges preventing UDP use.

The second reason to go with TCP instead of UDP is when, "WAN circuits are experiencing very high latency (250 milliseconds and greater)."  In regard to this second consideration, Horizon Performance Tracker can again be of assistance.  Round-trip latency is prominently displayed under the network section in real time.

In the above screenshot, with latency at 65 ms, it would seem that all is right with the world in terms of the transport selection of UDP.  However, if we were witnessing latency above 250 ms, something like below, we'd want to consider forcing TCP usage.

With latency above 250 ms and low packet loss, the optimization guide is pretty clear in its guidance to leverage TCP for the transport protocol.  However, if packet loss were also high, the decision wouldn't be as straightforward.  Because Blast's UDP stack handles packet loss better than its TCP stack, you might still want to stick with UDP as a transport protocol in a high-latency situation.  Fortunately, the Horizon Help Desk Tool can provide insight into whether or not there's packet loss, so we can make an informed decision.

Horizon Help Desk Tool 

The Horizon Help Desk Tool offers an even more useful view of network latency for a particular Horizon session.  It provides a breakdown of network latency for a specific session over the span of 15 minutes, giving you a better overall sense of the latency.  Below is a graph cranked out by the tool for a particularly challenged Horizon session that spikes to latencies above 1200 ms, certainly not the most ideal of scenarios.

A further benefit of the tool is its ability to report on packet loss within a session which, as previously mentioned, is relevant in determining the optimal transport protocol. After looking up a user's session, from the details screen expand the user metrics section, and under Blast counters you'll see the packet loss.  For the session above, though there's high latency, there's no indication of packet loss.

With high latency and zero percent packet loss, we have network conditions better accommodated by the TCP transport.  However, had there been high packet loss, we'd have to make a choice between TCP's performance benefits in high-latency environments and the UDP stack's ability to better handle packet loss.  To simulate such a situation in my lab I used a utility called clumsy on my remote endpoint.  After configuring the utility to create significant packet loss, the hit on network performance was clearly reflected in the Horizon Help Desk Tool.

In this situation, where packet loss is high, UDP might be the preferred transport to stick with, despite the high latency. Both the VMware Blast Extreme Optimization Guide and the Blast Extreme Display Protocol In VMware Horizon 7 white paper indicate that UDP is the optimal transport to stick with under high packet loss conditions.  The white paper specifically states that, "UDP is better at handling packet loss than TCP.  UDP can deliver a good user experience in conditions of up to 20 percent packet loss."
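The guidance above can be boiled down to a small decision helper. The 250 ms threshold comes straight from the optimization guide; the 5 percent "low loss" cutoff is my own assumption for illustration, so treat this as a rule of thumb rather than policy:

```python
# Rule-of-thumb transport chooser based on the optimization guide's figures:
# prefer UDP by default; switch to TCP only for very high latency (>= 250 ms)
# when packet loss is low, since Blast's UDP stack tolerates loss better
# (reportedly usable up to ~20% loss).
def suggest_transport(latency_ms, packet_loss_pct):
    if latency_ms >= 250 and packet_loss_pct < 5:  # 5% cutoff is my assumption
        return "TCP"
    return "UDP"

print(suggest_transport(65, 0))     # healthy WAN -> UDP
print(suggest_transport(1200, 0))   # high latency, clean link -> TCP
print(suggest_transport(1200, 15))  # high latency AND lossy -> stay on UDP
```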

Fun Facts About Codecs

The optimization guide states that, "A codec is a computer program that can encode or decode a digital data stream for transmission. The word codec is a blend of the words coder-decoder." As of today, Blast offers a choice between three codecs, H.264, JPG/PNG and H.265, with H.264 being the default.

One of the H.264 codec's claims to fame is its ability to handle rapidly changing content.  Another major claim to fame is the ability to leverage the built-in H.264 chip of endpoint devices for hardware-based decoding, sparing the endpoint's CPU the trouble.  This both improves performance and extends the battery life of these endpoint devices.  When NVIDIA GRID cards are in the mix, things get even more exciting.  The encoding can be offloaded to the NVIDIA GPU, improving performance and sparing the server's CPU.  This offloading in turn improves user density and efficiency on the ESXi hosts.

JPG/PNG, sometimes referred to as the adaptive encoder, is the original codec used by Blast and does software-based encoding and decoding. While H.264 is the default, Blast will fall back to JPG/PNG when H.264 isn't an option, such as when the HTML client is used from a non-Chrome browser. It's also desirable when you have, "Images that require lossless compression," such as quality still images, complex fonts or medical imaging.  However, the optimization guide is pretty clear that it's not so great for rapidly moving content, something the H.264 codec excels at.

H.265, referred to as High Efficiency Video Coding (HEVC), is the bigger, badder successor to H.264.  While it introduces bandwidth improvements, it absolutely requires the use of NVIDIA GRID GPUs on your ESXi hosts.  It also requires clients with H.265 decode support, which is common nowadays but not guaranteed.

Finally, a new feature called Encoder Switcher allows Blast, "to dynamically switch between the JPG/PNG and H.264 codecs, depending on screen content type."

Using Horizon Performance Tracker To Observe Codec Usage

Regardless of which codec is best suited for your use case, Horizon Performance Tracker can provide visibility into which one your session is actually using.  To observe this in action, we can control codec selection using the VMware Blast settings on the Horizon client.  Here's a screenshot of the codec settings from the Horizon client:

If you uncheck the option, "Allow H.264 decoding," you'll fall back to JPG/PNG and Performance Tracker will report, "adaptive", as the encoder.  (Note: The Blast Extreme Display Protocol in VMware Horizon 7 clarifies that, "JPG/PNG is referred to as the adaptive encoder.")

Whereas accepting the default of, "Allow H.264 decoding," under typical conditions, will cause Horizon Performance Tracker to report, "h264 4:2:0," as the encoder. 

Should you select the option, "Allow high color accuracy," and H.264 is successfully implemented, the tool will report back, "h264 4:4:4," as the encoder name. 

Further, if H.264 is enabled and there's an NVIDIA GRID card enabled for your VM, the tool reports back an encoder name of, "NVIDIA NvEnc H264."  Here's an example from a GPU-enabled VM in VMware's TestDrive environment:

Finally, to allow for use of H.265 when connecting to the same virtual desktop as detailed above, on the Horizon client I checked the box for, "Allow High Efficiency Video Decoding (HEVC)." With my client supporting H.265, Horizon Performance Tracker reports back, "NVIDIA NvEnc HEV," as the encoder in use.

Observing Blast's Bandwidth Consumption In Real-Time

Roughly five and a half years ago I had the honor of meeting the great Cale Fogel, Breaker Of Chains, Knower Of Things And Talker Of Straight. During some chit-chat in the hallways of VMworld 2014, he summarized the situation with display protocols quite succinctly: "It's all about how much screen real estate you're dealing with, resolution, number of screens, versus the amount of changes on the screen.  The more changes that occur and the higher the resolution, the more pixels that have to cross the wire and get reordered on the endpoint."  So, if you have a single monitor with low resolution and a completely static screen, you'll have very few pixels to change and the protocol will gobble up very little in terms of compute resources and network bandwidth.  On the other hand, if you have multiple monitors at high resolution displaying a lot of actively changing content, compute consumption will be high and bandwidth usage will be high.

An easy way to see this firsthand in real time is through the Horizon Performance Tracker.  Along with the nifty info we've discussed so far, it details how much bandwidth the display protocol is currently gobbling up.  Under the encoder section, there's a field, "Bandwidth used."  Reduce the screen resolution and do nothing within the VM, and you'll see bandwidth usage plummet.
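To put rough numbers on Cale's "pixels across the wire" framing, here's a back-of-envelope estimate: pixels changing per second, times bits per pixel, divided by an assumed codec compression ratio. The 100:1 ratio is a made-up illustrative constant, not a measured Blast figure; real codec behavior varies wildly with content.

```python
# Back-of-envelope display bandwidth estimate. The 100:1 compression ratio
# is an illustrative stand-in, not a measured Blast number.
def est_bandwidth_mbps(width, height, fps, pct_screen_changing,
                       bits_per_pixel=24, compression_ratio=100):
    raw_bps = width * height * pct_screen_changing * fps * bits_per_pixel
    return raw_bps / compression_ratio / 1_000_000

# Mostly static 1080p screen (1% of pixels changing) vs. full-motion video
print(round(est_bandwidth_mbps(1920, 1080, 30, 0.01), 2))  # ~0.15 Mbps
print(round(est_bandwidth_mbps(1920, 1080, 30, 1.0), 1))   # ~14.9 Mbps
```

Even with invented constants, the two-orders-of-magnitude gap between a static screen and full-motion video matches what Performance Tracker shows in the experiments below.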

Only 10k of traffic generated by the Blast protocol, woo-hoo!  However, don't get too excited, haole.  Within the same session, move the Horizon Performance Tracker utility itself around on the desktop, shaking it hard and violently like a chimpanzee on meth.  Bandwidth will temporarily spike.

Now, for some real fun, fire up YouTube, put on a trailer for Star Wars, increase the YouTube resolution to high definition and then take a look at Performance Tracker.

When it comes to Horizon's display protocols, I like to say the only way through is through.  Lots of changes on the desktop translate to lots of compute and bandwidth usage.  Fundamentally, it's more of a math problem than anything else.  In the optimization guide, this dynamic is well articulated with the statement, "It is extremely important to recognize that optimizing for higher quality nearly always results in more system resources being used, not less. Except under very unique conditions, it is not possible to increase quality while limiting system resources."  It goes on to elaborate on the inverse relationship between quality of experience and optimized resource usage, stating, "Except in unique situations, optimizing quality increases bandwidth utilization, whereas optimizations for WANs require limiting quality to function over poor network conditions."  So, you're going to have to be honest with yourself and pick your poison.

More Advanced Tuning Covered By The Optimization Guide

The optimization guide goes on to cover additional Blast tuning settings such as Max Session Bandwidth, Minimum Session Bandwidth and Frames Per Second.  While Horizon Performance Tracker can assist with the configuration of these more advanced settings, before mucking around with them I'd circle your attention back to the VM, OS and underlying infrastructure.  This isn't to say that advanced Blast tuning methods are a waste of time. It's just that in the absence of other information about your use case, holistically speaking, I'd say you're more likely to have challenges with the user experience due to the VM and underlying infrastructure than due to settings addressed by advanced Blast tuning.  The optimization guide echoes this sentiment, recommending that, "Before tuning Blast Extreme, it is critical to properly size and optimize the virtual desktops, Microsoft RDSH servers, and supporting infrastructure."  Remember, the key processes behind Blast, VMBlastS.exe, VMBlastW.exe and VMBlastP.exe, run WITHIN the OS of your virtual desktops. So if those VMs are under-specced or starved for resources, your Blast processes will be starved and Blast performance is going to suck. Further, if critical apps within your VM are starved for resources, no amount of tuning is going to make up for an app experience that's ruined before anything's even been remotely displayed.  Along those lines, after confirming your VMs are properly specced, optimized and supported by your infrastructure, I'd recommend taking a hard second look at profile configuration, critical apps and the network paths those apps rely on.  Often a poor user experience is the result of a deficiency outside the Horizon stack, with Horizon just being the messenger.  And we all know what folks love to do to messengers.

So, in summary, when it comes to Blast tuning, to begin with I'd confirm you're getting the proper transport and codec selection.  I'd also recommend being honest with yourself about the bandwidth requirements, use case requirements and network limitations.  However, before doing a deep dive into the advanced tuning of Blast, I'd take a very long, hard second look at the rest of your environment.