UAG Troubleshooter

UAG Troubleshooter is a set of services that diagnose common challenges with new deployments of VMware's Unified Access Gateway. Long term there are plans to extend its scope to WS1 use cases, but for this iteration the focus is on Horizon deployments. The overall goal is to validate communication paths between the external world and a UAG appliance, as well as between the UAG appliance and the internal Horizon environment it's paired with. To this end UAG Troubleshooter offers 3 separate services: Horizon Edge Scan, Test Packets, and UAG Local Troubleshooter.

Horizon Edge Scan provides scans of UAG appliances based on their externally resolvable URL. It validates basic requirements like port connectivity, SSL configuration and service availability, with an option for more advanced SSL analysis through Qualys SSL Labs. It's a wonderful demonstration of how much you can deduce about a UAG deployment with nothing more than its external URL. Available from troubleshoot.evengooder.com, it quickly yields a report with 9 to 11 datapoints about your deployment.
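
To give a feel for the simplest of those checks, here's a minimal Python sketch of a TCP port test. It is not Horizon Edge Scan's actual code, which isn't shown in this post. The function name `tcp_port_open` and the hostname in the usage note are my own placeholders.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP handshake to host:port completes within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False
```

Calling `tcp_port_open("uag.example.com", 443)` against your own external URL tells you whether anything is answering on that port, but nothing about what is answering. That's why the SSL and service checks still matter.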

Test Packets, also available at troubleshoot.evengooder.com, offers a way to conclusively validate external port connectivity on a UAG appliance. It marks packets with ASCII payloads including the phrase, "EvenGooder test packet," then fires them off to specified URLs and ports. You can then leverage tcpdump and grep to observe these specific packets crossing the eth0 interface, validating external port connectivity in real time with minimal effort.
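
The idea behind a marked packet can be sketched in a few lines of Python. This is only an illustration under my own assumptions: it uses plain TCP and sends the marker phrase as the entire payload, while the real service's packet format isn't documented here. `MARKER` and `send_test_packet` are hypothetical names.

```python
import socket

# Assumption: the marker phrase is sent as the whole payload.
MARKER = b"EvenGooder test packet"


def send_test_packet(host: str, port: int, timeout: float = 3.0) -> int:
    """Open a TCP connection, send the marker payload, return bytes sent."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(MARKER)
    return len(MARKER)
```

Because the payload is plain ASCII, it shows up verbatim in tcpdump's `-A` output, which is what makes it easy to grep for.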

UAG Local Troubleshooter is a Python-based diagnostics utility delivered and run on UAG appliances as a Docker container. By leveraging the Linux container host capabilities of Photon OS it delivers logic for troubleshooting UAG appliances from the inside out. The utility is accessed through the command line on the UAG console. In a matter of seconds it validates the availability of services on the UAG appliance as well as the critical network path to the Horizon environment.

While each of these utilities is useful in its own right, collectively they provide a compelling holistic perspective, with 28+ datapoints regarding the health of the UAG's services, network connectivity and SSL configuration. Coupled with the methodology below, UAG Troubleshooter is positioned to address a majority of the challenges that beset first-time deployments.

Note: as of today the utility assumes single-NIC Horizon deployments using Blast as the Horizon secondary protocol. It may still have value for deployments falling outside this range, but your mileage will vary. Finally, this solution has been built by Justin Johnson of EvenGooder.com and is in no way officially supported by VMware, Broadcom or any other organization in their right mind.


Overall Strategy For Troubleshooting Horizon Connections Across UAG

Understanding the difference between Horizon's primary and secondary protocols is absolutely key to efficiently troubleshooting Unified Access Gateway deployments for Horizon. Together these protocols get folks connected to their virtual desktops, but each protocol has its own requirements in terms of network and port connectivity. With this in mind, I'd say you should break down most initial UAG deployments into 4 major milestones:
  1. Securely connecting to the Horizon landing page through UAG
  2. Viewing entitlements after successful passthrough authentication
  3. Actually connecting to your virtual desktop or RDS host
  4. Adding stronger forms of authentication 
These milestones represent incremental steps toward a full-blown, successful UAG for Horizon configuration that can easily be scaled and replicated using ini files and PowerShell. Each phase has its own unique requirements, with subsequent phases building upon the previous. Overall, I've found that following this process breaks your deployment down into smaller, more manageable, bite-size pieces, allowing you to narrow your focus and troubleshoot your way to success. UAG Troubleshooter is specifically designed to help you along this path.

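Since each milestone builds on the one before it, the place to focus is always the first milestone that hasn't been met yet. Here's a tiny Python sketch of that idea. The `next_focus` helper is purely illustrative and isn't part of UAG Troubleshooter.

```python
from typing import List, Optional

MILESTONES = [
    "Securely connecting to the Horizon landing page through UAG",
    "Viewing entitlements after successful passthrough authentication",
    "Actually connecting to your virtual desktop or RDS host",
    "Adding stronger forms of authentication",
]


def next_focus(passed: List[bool]) -> Optional[str]:
    """Return the first milestone not yet passed, or None if all are met."""
    for index, name in enumerate(MILESTONES):
        # A milestone with no recorded result counts as not yet passed.
        if index >= len(passed) or not passed[index]:
            return name
    return None
```

For example, `next_focus([True, True, False, False])` points you at the display protocol connection, no matter what state the authentication work is in.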

Milestone 1: Securely Connecting To The Horizon Landing Page Through UAG

Getting the Horizon landing page securely displayed through the UAG appliance is an often underappreciated milestone. It confirms basic network and SSL requirements are getting met, not just between external devices and the UAG appliance, but also between the UAG appliance and the Horizon environment. Accordingly, if the Horizon Edge Scan reports that the landing page is accessible and that your HTTPS request to UAG returns a 200 response, it's a very telling and promising sign.

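If you'd like to reproduce the landing page check by hand, a rough Python equivalent looks like the sketch below. It is my own approximation, not the Edge Scan's code. It follows redirects, verifies the certificate chain and hostname, and returns the final HTTP status. `landing_page_status` is a hypothetical name.

```python
import ssl
import urllib.error
import urllib.request


def landing_page_status(url: str, timeout: float = 5.0) -> int:
    """Fetch url, following redirects, and return the final HTTP status."""
    # The default context verifies both the certificate chain and hostname.
    context = ssl.create_default_context()
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive as exceptions; report their code.
        return err.code
```

A return value of 200 for your UAG's external HTTPS URL lines up with the positive result described above. A certificate or connection failure raises `urllib.error.URLError` instead, which is itself a sign that milestone 1 hasn't been met.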


(Note, it's possible that customers have removed or altered the Horizon landing page, though not very common. To confirm, from your trusted network point your browser directly to the Horizon environment.)

In cases where challenges are detected with Horizon Edge Scan, UAG Local Troubleshooter can provide invaluable information. Along with confirming crucial services are available on the UAG appliance, it provides detailed diagnostics regarding the connection to your internal Horizon environment. By running within the UAG appliance itself, UAG Local Troubleshooter enjoys guaranteed line of sight to UAG's REST APIs and the local network paths it relies on. While the container itself is isolated from the underlying Photon OS, it shares the host's network namespace, giving it the ability to run diagnostics that complete the initial picture provided by the Horizon Edge Scan.



Since Photon OS has built-in support for Docker, this container-based utility is easily downloaded and run on the UAG appliance. Here are the commands:

    systemctl start docker
    systemctl enable docker
    docker run -it --net host doublekjj/uag-troubleshooter

And huzzah! You have up-to-date diagnostics logic for your UAG challenge. Here's an example of the details we get from UAG Local Troubleshooter regarding the connection to a Horizon environment:

Once we get positive results from both the Horizon Edge Scan and UAG Local Troubleshooter, success with the Horizon primary protocol is well within reach, assuming the Horizon environment and its connectivity to local AD are solid.


Milestone 2: Viewing Entitlements After Successful Passthrough Authentication 

A typical Horizon primary protocol connection involves successful authentication through the Horizon client and the presentation of entitlements to the user.



Though most organizations look to leverage stronger forms of authentication for their UAG deployments, to simplify the initial setup folks should stick with passthrough authentication until they've proven out display protocol connectivity. This isn't an absolute requirement, but rather a practical suggestion for those setting up UAG for the first time. Get all your ducks in a row first with a standard UAG deployment, THEN move on to stronger forms of authentication. If you stick with this strategy, positive Horizon Edge Scans and UAG Local Troubleshooter scans all but guarantee a successful Horizon primary protocol connection.


Milestone 3: Actually Connecting To Your Virtual Desktop Or RDS Host

Once you've gotten your Horizon entitlements presented to you, the next step of the journey begins with an actual double click on an entitlement. This leads the Horizon secondary protocol to initiate, whether it be Blast, PCoIP or RDP. Essentially, a second session is initiated from the client: a connection to the virtual desktop itself, through which the display protocol remotely displays pixels from the virtual desktop.



Now, this is by far the most treacherous part of the deployment journey for first timers. If I had a nickel for every time port 4172 or 8443 was blocked, despite everyone saying otherwise, phew. I can't think of any challenge more common than folks getting nothing but a black screen when trying to connect to their desktop because there isn't port connectivity for the display protocol. All three utilities contribute toward addressing this common challenge.

First off, Horizon Edge Scan presumes folks are using the default port for Blast, 8443, with the same URL UAG is initially accessed from. This is not an absolute certainty by any means, but for most POCs and new deployments, it's the most likely configured External Blast URL. Accordingly, the Edge Scan confirms TCP 8443 is open, along with confirming it's yielding the same SSL cert as is leveraged for 443. Here's a successful scan:

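For the curious, the "same cert on 443 and 8443" comparison can be approximated in Python as sketched below. This is my own take, not the Edge Scan's implementation. It assumes the certificate is fetched without validation, since the goal here is only to compare what each port presents, and trust checks belong to a separate step. All three function names are hypothetical.

```python
import hashlib
import socket
import ssl
from typing import Iterable


def cert_fingerprint(der_bytes: bytes) -> str:
    """Return the SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(der_bytes).hexdigest()


def fetch_cert_der(host: str, port: int, timeout: float = 5.0) -> bytes:
    """Return the DER-encoded leaf certificate presented on host:port."""
    context = ssl.create_default_context()
    # Validation is deliberately disabled: we only compare certificates.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)


def same_cert_on_ports(host: str, ports: Iterable[int] = (443, 8443)) -> bool:
    """Return True if every listed port presents the same certificate."""
    prints = {cert_fingerprint(fetch_cert_der(host, p)) for p in ports}
    return len(prints) == 1
```

With a placeholder hostname, `same_cert_on_ports("uag.example.com")` returning False would suggest that something other than the UAG's Blast service, a load balancer for instance, is answering on one of the two ports.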


In addition to these preliminary results from the Horizon Edge Scan, UAG Local Troubleshooter can validate that the Blast service is active and available directly from the UAG appliance itself. This is useful information in cases where firewalls or load balancers are blocking connections. So long as you go with the automatic scan option, providing the utility with admin creds to the web interface, it can read the appliance's current configuration regarding the Blast External URL and the current state of the Blast service.

Finally, Test Packets can be used to test any other ports that might be in use on the UAG appliance. Just enter the hostname and port you'd like to check and Test Packets will send test packets marked with an ASCII payload including the phrase, "EvenGooder test packet." Using a combination of tcpdump and grep you can easily view these packets crossing UAG's eth0 interface by running this command from within the UAG console:

    tcpdump -i eth0 -l -A -n -v | grep "EvenGooder"



Tcpdump has been a popular troubleshooting tool for UAG for over half a decade now and is the topic of one of my most popular posts, Troubleshooting Port Connectivity For Horizon’s Unified Access Gateway 3.2 Using Curl And Tcpdump. Using Test Packets with tcpdump introduces simplicity and specificity for realtime analysis of port connectivity. It allows us to easily narrow down our search to specific test packets, something that's not always easy to do when NAT'd addresses and load balancers are in the mix.


Milestone 4: Adding Stronger Forms Of Authentication

Once the Horizon secondary protocol is working properly it's a good time to work on stronger forms of authentication. Though UAG Troubleshooter helps pretty extensively in getting through the first 3 milestones, it doesn't currently include help for setting up stronger forms of authentication. At least, not yet. However, there's plenty of good information available on the topic, including this VMworld 2020 session, VMware Unified Access Gateway Deployment And Security Best Practices.


Guidance From VMware's TechZone

There are a ton of great resources within VMware's TechZone regarding the setup and troubleshooting of UAG. One of my favorites is Understand And Troubleshoot Horizon Sessions. Below is a graphic from its section, "Troubleshoot Connections." UAG Troubleshooter is focused on validating communication paths 1 and 3 detailed on this drawing.



Other highly relevant articles within TechZone include Load Balancing Unified Access Gateway for Horizon and VMware Blast Extreme Display Protocol In VMware Horizon.


Troubleshooting UAG In A Nutshell - Be Like The Squirrel 

"Take all your problems and rip 'em apart" -Little Acorns, White Stripes 

Unified Access Gateway is incredibly reliable and predictable. Accordingly, resolving its challenges is mostly about slogging through traditional network troubleshooting tactics until eventually coming to clear conclusions. In a world of grey areas, fuzzy lines, gaslighting and all sorts of uncertainty, working with UAG offers refreshing binary clarity. Challenges invariably end with clear, white light, aha moments where things are working again and we know exactly how we messed things up in the first place. In my 22 years of IT I've never seen a solution work as reliably as UAG. My earnest and quite often correct initial assumption when hearing about a UAG challenge is, "okay, what did you mess up or what environmental factor changed," not, "is this another UAG bug?"

Getting to these euphoric aha moments with UAG deployments can be hellish, but man, I'm telling you, they're there.  Fundamentally, you need to break down the problems into smaller pieces and components, then scratch, claw and beat the challenge into submission.  Happy hunting. 
