Containerized Diagnostics For Unified Access Gateway With UAG Local Troubleshooter

As part of the UAG Troubleshooter solution, I've recently developed UAG Local Troubleshooter, a container-based diagnostic utility run from within the UAG console. Since Photon OS has built-in support for Docker, this utility is easily downloaded from Docker Hub to a UAG appliance and executed locally, troubleshooting UAG from the inside out. As a container running in host network mode, it has unobscured network access to the UAG appliance, enabling it to validate the status of the Horizon Edge and Blast services. Further, it validates the communication path between UAG and its paired Horizon environment, checking for challenges with network connectivity and SSL trust. Finally, as an added bonus, I built the solution on the popular netshoot Docker image, described as "a Docker Networking Trouble-shooting Swiss-Army Container." So folks using UAG Local Troubleshooter will have access to netshoot's nifty tools like nmap and netcat.

















If you're curious and your UAG appliance has internet access, you can download and run UAG Local Troubleshooter with just 3 commands. They're executed directly from the UAG console and launch the CLI-based diagnostics.


Using The Utility 

The commands for executing the utility from the UAG console are: 

        systemctl start docker
        systemctl enable docker
        docker run -it --net host doublekjj/uag-troubleshooter

If all goes well, you should see output like this: 



















Upon launch, the utility presents the user with 3 options. The first of these, "Have utility automatically detect current UAG configuration," is the most interesting and exciting for new deployments. The utility prompts for admin credentials to UAG's browser-based admin UI, then uses these credentials to make calls to UAG's REST API. This yields the current status of the Horizon Edge Service, along with critical config info like the configured Blast External URL and the Connection Server URL Thumbprint. It goes on to validate the communication path to the internal Horizon environment. This includes detailed analysis of the SSL connection to the Connection Server URL, a common challenge for first-time deployments. Here's a screenshot from my lab.






















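For anyone who'd like to poke at the admin REST API by hand, here's a minimal sketch of the kind of call option one relies on. It is not the utility's actual code. The helper names uag_admin_url and uag_api_get are made up for illustration, and the /rest/v1/monitor/stats path in the usage comment is an assumption you should confirm against the documentation for your UAG version. Port 9443 is the admin interface port discussed later in this post.

```shell
# Build a URL on UAG's admin interface, which listens on port 9443.
# Usage: uag_admin_url HOST PATH
uag_admin_url() {
    printf 'https://%s:9443%s\n' "$1" "$2"
}

# GET an admin REST API path as the given admin user.
# Usage: uag_api_get HOST PATH [USER]   (USER defaults to admin)
uag_api_get() {
    # Passing -u with only a username makes curl prompt for the password,
    # which keeps it out of your shell history.
    # -k skips certificate validation, on the assumption that the admin UI
    # is presenting a self-signed certificate.
    curl -sk -u "${3:-admin}" "$(uag_admin_url "$1" "$2")"
}

# Hypothetical usage (the path is an assumption, check your UAG docs):
#   uag_api_get localhost /rest/v1/monitor/stats
```

Splitting the URL builder from the curl call is just a convenience so the same helper can be pointed at other admin API paths.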

If you don't want to provide credentials for your admin interface, you can select option two. It's not nearly as exciting, but it still provides very useful information. After prompting for the Connection Server URL, it goes on to validate the path to the Horizon environment. It's also able to validate some service availability using netstat and HTTP requests to local ports. Without admin credentials it can't confirm whether the configured Connection Server URL Thumbprint is accurate, but it does make calls to the Connection Server URL you entered to deduce what the thumbprint should be and displays it in the output. Here's sample output from the same Horizon environment as above:

























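If you'd like to double-check a thumbprint yourself, the same idea can be sketched with openssl. This is an illustration under assumptions, not what the utility runs internally: the function names are hypothetical, it assumes openssl and sed are available wherever you run it, and it assumes you want the result written in the sha1=AA:BB:... style.

```shell
# Normalize openssl's "SHA1 Fingerprint=AA:BB:..." line to "sha1=AA:BB:...".
# Some OpenSSL releases print the algorithm name in lowercase, so match both.
normalize_thumbprint() {
    sed -E 's/^[Ss][Hh][Aa]1 Fingerprint=/sha1=/'
}

# Fetch the certificate a server presents and print its SHA-1 thumbprint.
# Usage: server_thumbprint HOST [PORT]   (PORT defaults to 443)
server_thumbprint() {
    # </dev/null ends the session right after the TLS handshake.
    # -servername sends SNI so the server returns the certificate for HOST.
    openssl s_client -connect "$1:${2:-443}" -servername "$1" </dev/null 2>/dev/null \
        | openssl x509 -noout -fingerprint -sha1 \
        | normalize_thumbprint
}

# Hypothetical usage:
#   server_thumbprint horizon.example.com
```

Compare the value it prints against the Connection Server URL Thumbprint configured on UAG. A mismatch is a likely culprit when the Horizon Edge Service can't talk to the Connection Server.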

Finally, the 3rd option, "Leverage netshoot utilities," is pretty exciting in its own right. It gives you the complete functionality of netshoot, a popular container for network troubleshooting. It includes incredibly relevant tools like nmap, netcat, tcpdump and termshark. Personally, nmap is my first love when it comes to port scanning. Here's an example of running nmap through UAG Local Troubleshooter:



Nice, right? Traditionally, to test port connectivity against external targets from UAG you needed to leverage curl. Curl is certainly still an option, but with nmap we get our hands on a full-featured port scanner. We can do extensive scans against the Horizon Connection Server or even scan ranges of desktops with tremendous ease.
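To give a feel for it, here's a small sketch of a wrapper you could define from the netshoot shell. The scan_ports name is hypothetical, and the hostnames, address range and port list in the usage comment are placeholders for your own environment. The nmap flags themselves (-Pn and -p) are standard.

```shell
# Scan a comma-separated list of TCP ports on one or more targets.
# Usage: scan_ports PORTS TARGET [TARGET...]
scan_ports() {
    ports="$1"
    shift
    # -Pn skips host discovery, which helps when ping is blocked.
    # -p limits the scan to the listed ports.
    nmap -Pn -p "$ports" "$@"
}

# Hypothetical usage (placeholder targets and ports):
#   scan_ports 443 cs.example.com
#   scan_ports 22443,4172,32111 192.168.10.50-60
```

nmap accepts hostnames, single IPs and address ranges as targets, which is what makes scanning a block of desktops a one-liner.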


Docker Considerations And Supportability

Again, UAG Local Troubleshooter is a solution I've personally built, not a product. It's not officially supported by VMware, Broadcom or any organization in its right mind.

Generally speaking, you're not supposed to muck with UAG's underlying Photon OS, and you're certainly not supposed to install any agents. That said, UAG Local Troubleshooter technically isn't an agent; it's a manually executed container that makes no changes to the underlying OS. Further, as outlandish as it might sound to download and run a container on a UAG appliance, that is in fact what Photon OS was designed to do well. So, overall, UAG Local Troubleshooter presents a bit of a dilemma. On one hand, it's an unsupported utility built by some dude at EvenGooder.com. On the other hand, it totally works and it's totally awesome and can totally help with new deployments. A potential compromise would be for folks to use the utility in lab or dev environments to resolve challenges and establish a working configuration, a golden ini if you will, then blow away and recreate the UAG instance. So: get doublekjj/uag-troubleshooter on the appliance, use it to solve your problem, establish a working config, get your working ini file, then blow away the original appliance you ran the utility on and redeploy. It's one approach to consider, though everyone's situation is different. When weighing the security risk against the diagnostic reward of running the utility, here's some additional food for thought.

My account on Docker Hub is doublekjj and the doublekjj/uag-troubleshooter container has been signed accordingly. That doesn't vouch for my competency, but if you're pulling doublekjj/uag-troubleshooter from Docker Hub, yeah, that's me. The container itself doesn't have access to UAG's underlying filesystem when executed with the commands documented above. It runs within the UAG appliance and, because it's in host network mode, shares the appliance's network namespace, but it has no privileged access to the underlying Photon OS. So a good analogy is that it's much like a virtual desktop running on the ESXi hypervisor. Further, once you're done running the container, there's no reason for it to start back up unless you manually start it. Finally, while it's a little freaky to have a utility prompt you for your admin credentials, these are the admin credentials for UAG's web-based admin interface, not root credentials to the appliance itself. Technically speaking, they're useless to anyone who doesn't have connectivity to port 9443 on your UAG appliance. (No one from the outside world should have access to that port if you've set up your firewall rules correctly.)

For the paranoid but still curious, you can run tcpdump on the UAG appliance itself while running the utility to confirm there's no nefarious traffic. If you're curious about the code that drives the utility, you can take a peek at it, risk free, by launching the container without host network mode and running a bash shell to explore it. Here's the command:

        docker run -it doublekjj/uag-troubleshooter bash

You'll get a command prompt inside the container, but the container won't have access to the UAG appliance's network namespace. The bash prompt will land you in the directory where all the code is run from. UAG_Local_Checker.py is the main script to check out. Under normal circumstances it's automatically launched when the container is started, but appending bash to the docker command prevents its execution. For purposes of transparency I'll get the code uploaded to GitHub shortly.
 

Cleaning Up Containers After A Scan - Burn After Using

If you don't feel the need to immediately blow away your UAG appliance but want to clean up the containers and images generated by running doublekjj/uag-troubleshooter, you can leverage some docker commands. For a sledgehammer, delete-everything-and-ask-questions-later approach, you can run this command to delete all containers:

        docker rm $(docker ps -aq) 

Then, you can blow away all your images using the command:

        docker rmi -f $(docker images -q) 

Otherwise, if you want to be a little more surgical and deliberate, you can check Docker's website for more info. (Note: if you're running SEG and Horizon on the UAG appliance you're testing from, first of all, weird; second of all, I'd go with a more deliberate strategy. I think there are mechanisms in place to prevent you from blowing the SEG container away, but I haven't had a chance to test yet.)
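As one take on the more surgical route, here's a minimal sketch that only touches containers created from the troubleshooter image and then removes that image. The function and variable names are hypothetical, and I haven't validated it on an appliance that's also running SEG, so treat it as a starting point rather than a tested procedure.

```shell
# Image whose containers should be cleaned up.
UAG_TS_IMAGE="doublekjj/uag-troubleshooter"

# Remove only the containers created from the image, then the image itself.
cleanup_troubleshooter() {
    # -a includes exited containers, -q prints just the IDs, and the
    # ancestor filter limits the list to containers built from the image.
    ids=$(docker ps -aq --filter "ancestor=$UAG_TS_IMAGE")
    if [ -n "$ids" ]; then
        # $ids is left unquoted on purpose so each ID becomes its own argument.
        docker rm $ids
    fi
    # Without -f, docker refuses to remove an image that a container still uses.
    docker rmi "$UAG_TS_IMAGE"
}

# Hypothetical usage:
#   cleanup_troubleshooter
```

Because it filters on the image name, any other containers on the appliance are left alone.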


Let Me Know What You Think 

I am absolutely, positively, unequivocally, without a doubt, dying to get feedback on the utility from folks who've given it a try. This is its first major iteration and I plan to make it better and better. For that, I need earnest feedback. If anything goes wrong or you have any suggestions for improvement, let me know. And if it helps you solve a problem, definitely let me know.
