Tuesday, August 16, 2016

Cloud Connector Edition (CCE) Deployment - Lessons Learned

Hey everyone! Yes, I know, it has been STUPID long since I wrote a blog post, and my excuses are pathetic. Pathetic as they may be, let me list a couple out to try to redeem myself a bit:
  • I changed jobs. Again! - Yep, my stay at Deloitte ended up being a little shorter than I originally thought it might be, but an opportunity at Integration Partners came my way, and I just couldn't pass it up. While I worked with an amazing team at Deloitte, and was very grateful for that opportunity, this new role has been a pretty awesome ride already, with an awesome team to boot!
  • I've been keeping up with my weekly #Skype4BRecap webcast. - Yes, I know, webcasting alone is not good enough, but with a weekly schedule, it does actually take up quite a bit of time!
  • My family did a cross-country move. - You have to admit, that is a pretty big task, moving an entire household from NC to TX. And all the packing/prepping was being done while keeping the house in "Showing" condition for potential buyers.
  • And the lamest excuse of all... I was a bit burned out on "written" material after Version 2.0 of my Skype for Business Hybrid Handbook. - I know, cry me a river, but hey, Version 2.0 was a pretty big increase in content, with an entirely new chapter devoted to Cloud Connector Edition! Which leads me to today's topic.

One of the more recent projects I have had the pleasure to dive into has centered around a Cloud Connector Edition (CCE) deployment. The situation was that the company was deploying a greenfield Skype for Business Online environment in Office 365, meaning they did not already have Skype for Business (or Lync Server) on-prem, and wanted to bake in PSTN calling capability for their Skype for Business users. This was all fine and great for their U.S.-based users, who could simply use Cloud PBX with PSTN Calling. However, this company also had a small group of users in a South American country, and with no PSTN Calling functionality outside of the U.S. and U.K. (ok, AND technically Puerto Rico), those South American users would not be able to place PSTN calls via Skype for Business.

Enter CCE! The plan was to move all users into Office 365, with all U.S. users using Cloud PBX with PSTN Calling, and all South American users using Cloud PBX with a new on-prem CCE deployment (the CCE would be connecting to a Sonus SBC as the PSTN Gateway, but that doesn't really matter much for this post). So far, all is well! Below is a nifty little network diagram of how CCE was to be deployed (networking info changed to protect the innocent, of course!):


As you can see above, there was only going to be a single PSTN Site created (a single CCE instance); there was no HA to plan for, or other potential complications. A simple deployment was right up my alley, though, as this would be my first production CCE deployment. I was quite excited.

About those "Lessons Learned"?


Alright, I know you are ready for me to quit blabbing and get on with my pointers already, so I won't walk you through this step-by-step - we'll save that for another time! Today is simply about a few lessons that I learned when deploying CCE.

1. Plan your networking ahead of time.


This may seem silly to even call out, as it should be obvious, but I found it really helpful, and almost necessary, to have a Visio or other diagram that gives you a good visual of how all the networking components are going to be laid out, and more specifically, what IPs will be assigned. Unlike a Skype for Business Server 2015 on-prem deployment, where you can deploy certain pieces in phases, coming back for things when you are ready, CCE requires you to modify a single text file (CloudConnector.ini) with ALL the necessary values for building out your ENTIRE VM environment before deploying your build script.

This means that you need to have your SSL certificate prepared, issued, and placed on the server prior to running the script, whereas with an on-prem deployment you could simply execute Step 3 in the Skype for Business Deployment Wizard when you were good and ready. You need to provide the public IP for your Access Edge component, as well as the IPs for each of the 4 VMs, and an additional public-facing (but internal) IP for the Edge server, on a separate network from the 4 IPs assigned to the other VMs. You also need to provide the DNS server IP addresses that your VMs will use for public name resolution (I used Google's public DNS servers at 8.8.8.8 for all public name resolution). As you can see, there are plenty of variables to have completely laid out before pressing that Enter button to execute the PowerShell cmdlet for building the environment.
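
A pre-flight check on the config file can catch gaps before the build starts. Below is a minimal Python sketch (not something that ships with CCE) that flags any key left blank in an INI-style file and confirms that the Edge's external-facing IP sits outside the subnet used by the other VMs. The section and key names in the sample are hypothetical placeholders rather than the real CloudConnector.ini keys, and the /24 prefix length is an assumption you would swap for your own.

```python
import configparser
import ipaddress


def find_blank_values(ini_text):
    """Return (section, key) pairs whose value was left empty."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve key case exactly as written
    parser.read_string(ini_text)
    return [
        (section, key)
        for section in parser.sections()
        for key, value in parser.items(section)
        if not value.strip()
    ]


def edge_external_ip_is_separate(vm_ips, prefix_length, edge_external_ip):
    """Return True if the Edge external IP is outside every VM's subnet."""
    edge = ipaddress.ip_address(edge_external_ip)
    for ip in vm_ips:
        # strict=False lets us build the network from a host address
        subnet = ipaddress.ip_network(f"{ip}/{prefix_length}", strict=False)
        if edge in subnet:
            return False
    return True


# Hypothetical sample: section and key names are placeholders only.
SAMPLE_INI = """
[Network]
MediationServerIP=192.168.10.12
EdgeInternalIP=192.168.10.13
EdgeExternalIP=
"""

if __name__ == "__main__":
    print(find_blank_values(SAMPLE_INI))
    print(edge_external_ip_is_separate(
        ["192.168.10.12", "192.168.10.13"], 24, "10.0.0.5"))
```

Running something like this against your own file before kicking off the build is cheap, and it only tells you that values are present and sensibly separated, not that they are correct.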

2. No Errors on the build script doesn't mean everything deployed as expected.


After getting all the requirements gathered and documented in your config file, running the build smoothly, and seeing that the cmdlet finished without any errors, you may think the execution was flawless. You may be especially tempted to think this when you see those 4 shiny new VMs in the Hyper-V manager, and start accessing them, noticing the presence of all the right software. Sweetness! Or maybe not so much...

Let's say you go to make that call after getting your user configured completely and logged into a client, and bummer of all bummers, the call doesn't go through. First you try an outbound call from the client, and it doesn't even ring; it pretty much just kills the call after a couple seconds. Then you try an inbound call, dialing the assigned LineURI of this new CCE user. Unfortunately, it may start to ring, but never gets through.

In my case, I ended up installing Skype for Business Debugging Tools on the Mediation server VM, and using CLS Logger. With CLS Logger I could not see any attempts at all when placing an outbound call. Looking at the diagram above, we see that the CCE user would first hit Office 365, and then the call would attempt to route through the Edge role and then the Mediation role before moving on to the SBC (my test user was external to the corporate network). Since I saw nothing on the Mediation server via CLS Logger, this meant that the traffic was only getting as far as the Edge role. I then installed Wireshark on the Edge role, and discovered that a Reset was being sent back to Office 365 from the Edge server every time the outbound call was made.

At the same time I noticed that INBOUND calls were getting further, making it through the Sonus SBC and to the Mediation server, but were not getting any further than that, as CLS Logger revealed a 503 error, stating that the Invite failed via the proxy, and that it was unable to establish a connection. With both issues, the Edge server appeared to be the common denominator. This confused me, as I would have figured that any problems with the build would have been reported in the PowerShell window during the build. After all, there wasn't any custom config; all config was done by the script, using the values provided in the config file. Well, I thought I would check out the Edge server for the heck of it.

What do you know, there were several Skype for Business services stopped on the Edge, including the Access Edge service! Trying to start these services failed, and further analysis of the Skype for Business event log showed that the reason for the services not starting was missing certificates. How could this be? The CCE cmdlet succeeded in building the environment, and didn't complain about any certificates... I opened up the Skype for Business Deployment Wizard, went to the certificates section, and sure enough, all of the external certificate fields were blank!

Alright, I don't get how that happened at all, but when I highlighted the External section and clicked "Assign", the external certificate was present as an option. This means that the script did in fact install the certificate on the server, but just didn't assign it during the Skype for Business deployment on the Edge. *SIGH*. I assigned it, restarted services, and BOOM. Traffic started to flow through, and calls started ringing. There were still issues to deal with at the carrier level, but the CCE portion was now fixed.
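
A quick post-build smoke test can catch this kind of silent failure before the first test call. The Python sketch below simply checks whether a TCP port accepts connections; a stopped service shows up as a refused connection, which lines up with the Reset seen in Wireshark. The host name is made up, and while 5061 is the standard SIP-over-TLS port, which ports your CCE roles actually listen on is an assumption to verify against your own deployment.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False


def failed_targets(targets):
    """Given {name: (host, port)}, return the names that did not answer."""
    return [
        name
        for name, (host, port) in targets.items()
        if not port_accepts_connections(host, port)
    ]


if __name__ == "__main__":
    # Hypothetical host name and port choice; adjust for your environment.
    checks = {"Access Edge (SIP/TLS)": ("edge.example.com", 5061)}
    print(failed_targets(checks))
```

An open port only proves that something is listening. It does not prove the right certificate is assigned, so it complements, rather than replaces, a look at the services and the certificate assignments on the Edge itself.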

3. Location, Location, Location (for Office 365 License assignment)


Remember how I said that some of the users in this company were in the US, but others were in South America? Well, when this test account was set up, it was configured like most of the other accounts, leaving the default location as US when assigning licenses. No big deal at first, but remember, I was getting ready to test dialing from this user, making the assumption that the user was located in this South American country. Well, when I tried to dial out internationally, using the expected format for the specific country, the dialing did not work. At all! The call never made it to the Edge server. As a matter of fact, the only way I could dial and make it to the Edge was to dial an E.164 formatted number.

I then used the following cmdlet to view the user's properties:

Get-CsOnlineUser -Identity <user>@<sipdomain>

Looking at the output, I could see that the user's DialPlan was set to "US". Clearly this would not work. So, I went back into the Users section of Office 365, into the Properties for the test user, and went to edit the assigned licenses. When prompted for Location, I changed this to the proper country in South America. After saving the settings, I could now see that the user had the DialPlan that reflected their respective country. Perfect! After waiting about 10 minutes for replication, I signed the user back in, and was able to dial as expected, as if the user was in the South American country.

The thing to remember on this point is that many adopters of CCE are going to be global or international firms that want to make a move into Office 365 for most of their services, but are still not able to move many non-U.S. or non-U.K. users into PSTN Calling. They will be interested in moving as much as they can into Office 365, while leaning on CCE to provide PSTN capabilities to the geographically-dispersed portion of their user base via on-prem infrastructure. With this in mind, specifying the correct location when assigning Office 365 licenses will be very important.
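
To see why the license location matters so much, here is a toy Python model of dial plan normalization. The rules are made up and heavily simplified (real service dial plans carry many more rules), and the "XX" plan with calling code 99 is a hypothetical stand-in for the South American country. It models a number dialed with a 00 international prefix failing under the US plan and normalizing under the local plan, while an E.164 number passes through either way.

```python
import re

# Simplified, made-up normalization rules keyed by dial plan name.
DIAL_PLANS = {
    "US": [
        (r"^011(\d+)$", r"+\1"),     # US international prefix
        (r"^1?(\d{10})$", r"+1\1"),  # 10-digit national number
    ],
    "XX": [  # hypothetical country with a made-up calling code of 99
        (r"^00(\d+)$", r"+\1"),      # 00 international prefix
        (r"^0(\d{8})$", r"+99\1"),   # national number with trunk prefix 0
    ],
}

E164 = re.compile(r"^\+\d{7,15}$")


def normalize(dialed, dial_plan):
    """Return the E.164 form of a dialed string, or None if no rule matches."""
    digits = re.sub(r"[\s\-()]", "", dialed)  # strip spaces and punctuation
    if E164.match(digits):
        return digits  # already E.164, so no rule is needed
    for pattern, replacement in DIAL_PLANS[dial_plan]:
        if re.match(pattern, digits):
            return re.sub(pattern, replacement, digits)
    return None


if __name__ == "__main__":
    for plan in ("US", "XX"):
        print(plan, normalize("00 1 212 555 0100", plan))
        print(plan, normalize("+1 212 555 0100", plan))
```

The point of the model is only that the same dialed digits can be valid under one plan and unroutable under another, which is why the location chosen at license assignment is worth double-checking for every non-U.S. user.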

In Summary


Well that's about all the lessons I have to share for now from my recent adventures in CCE. However, I feel like there will be more of these in my future, with maybe other factors to consider, so I may just update this post as I come across any more interesting things to watch for as you wade into the fairly new waters of CCE. Hope this has been helpful in some shape or form. If you have run into any of your own interesting "Gotchas" in a CCE deployment, feel free to share your experience in the Comments section!

Stay techy, my friends!


5 comments:

  1. Hi Josh - I experienced a similar Edge cert issue on the latest CCE deployment, where the AV service would not start. I assigned the certificate manually and then the service started... strange. :) Regards, Thomas.

  2. Nice post, I also had the same issue on installation with the certificate.

    I detailed some of our issues in the following blog:

    https://redskype.wordpress.com/

    You might come across another one of them, where it will randomly stop trusting your certificate.

  3. This comment has been removed by the author.

  4. Graham, thanks for sharing! Seems to be a pattern emerging on this. And very interesting about the certificate trust issue. I will have to keep my eye open for that as well.

  5. I had a similar problem although in my case the installation clearly did fail (I'm surprised it completed successfully for others if the Edge services didn't start). In my example the external certificate gets imported successfully (step 12) and immediately gets assigned to the AudioVideoAuthentication service. The RTCSRV service then tries to start and fails.

    If I assign the internally generated cert to the AudioVideo Authentication service (set-cscertificate -type audiovideoauthentication -thumbprint .....) the services then start.

    I've run the installation repeatedly and always experience the same problem.

    I've yet to read any blogs whereby people experience a clean error free install that doesn't require manual intervention, anyone had it install 100%?
