The journey to successful SD-WAN deployment

I’ve decided to put this article together as we’re approaching the end of our SD-WAN project. There are still a couple of challenging sites left to deploy (the delay is caused by hardware delivery), but these are not going to have any negative impact on the overall picture. I am pretty sure everyone in our company will agree that this project was a great success from a design and delivery perspective. We had a few challenges, but managed to adjust our approach in a timely manner without compromising agility. I’d like to say we could have done this quicker, but the legal sector (at least our company!) doesn’t appreciate disruptive changes, so we had to strike a balance.

We started our SD-WAN journey more than two years ago. I realized the power of this architecture when I first saw demos of the Talari and Silver-Peak SD-WAN products at London UC Expo back in 2018. Upon my return to the office I told my line manager that we were like dinosaurs and that we could do much better when it comes to network operations. I believe this kicked off a number of consultancy engagements in which we were assisted with our Branch Connectivity vision in line with our Cloud First strategy. Back then I had been working as a Network Architect for slightly more than a year, but my background was operational support and project delivery. I was very well aware of the daily challenges that our network operations team had been facing.

The main things that got my attention were:

  • Agile, Centralized, policy-driven configuration
    • Ease of management
    • Quick and safe Enterprise-wide configuration changes
    • API-based integration with 3rd parties
  • Transport Independence
    • Secure communication over any media
  • Intelligent Link Management
    • Dynamic IP SLA
    • Forward Error Correction, including Packet Duplication (1:1 FEC)
    • Link Bonding

These are the pillars of SD-WAN. Any vendor must adhere to these architecture principles, or else their product cannot be called SD-WAN. At a high level, all SD-WAN vendors offer similar capabilities, so to stand out they add differentiating features, such as:

  • Enhanced Security (Cisco Viptela, Palo Alto CloudGenix, Fortinet)
  • WAN optimization (Aruba Silver-Peak and Riverbed)
  • Global Traffic Steering (VMware VeloCloud)

Get to Know Business Needs

When it comes to choosing an appropriate SD-WAN vendor, make sure you understand the business needs (or high level requirements). There’s a very high risk of delivery failure if there’s no vision of what you’re trying to achieve and why you need SD-WAN (maybe you don’t?). Here are a few examples of what you can think of as business requirements, along with the supporting SD-WAN capabilities:

  • Cost Saving
    • SD-WAN is transport agnostic, so it enables you to be flexible with WAN transports
    • Choose from MPLS and Internet, or dual Internet, or even 4G/5G options
  • Simplified and Agile Operational Support
    • SD-WAN manages devices centrally and does not promote (or even does not support) device-level configurations.
    • Templates enable the NOC to react to changing requirements in an agile way
    • Overlay networking abstracts away underlay transports making WAN failures transparent for the end users.
  • Improved Quality of User Experience
    • SD-WAN dynamically measures performance of underlay transports and reacts to brown-out and black-out conditions automatically as per business intent described through policy configuration.
    • Supports sub-second failovers
  • Improved Security
    • Traffic is always encrypted irrespective of underlay transport (MPLS vs Internet)
    • Encryption keys are managed centrally and dynamically
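To make the brown-out and black-out handling above a bit more concrete, here is a minimal sketch in Python. It is not how any particular vendor implements path selection; the class names, the thresholds and the tie-breaking order (loss, then latency, then jitter) are my own assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LinkStats:
    """Measured state of one underlay transport (hypothetical structure)."""
    name: str
    up: bool
    latency_ms: float
    jitter_ms: float
    loss_pct: float


@dataclass(frozen=True)
class SlaPolicy:
    """Business intent expressed as thresholds (illustrative values only)."""
    max_latency_ms: float
    max_jitter_ms: float
    max_loss_pct: float


def classify(link: LinkStats, policy: SlaPolicy) -> str:
    """Return 'black-out' (link down), 'brown-out' (up but degraded) or 'healthy'."""
    if not link.up:
        return "black-out"
    degraded = (
        link.latency_ms > policy.max_latency_ms
        or link.jitter_ms > policy.max_jitter_ms
        or link.loss_pct > policy.max_loss_pct
    )
    return "brown-out" if degraded else "healthy"


def pick_link(links: List[LinkStats], policy: SlaPolicy) -> Optional[LinkStats]:
    """Prefer healthy links; fall back to brown-out links; never use a dead link."""
    healthy = [l for l in links if classify(l, policy) == "healthy"]
    candidates = healthy or [l for l in links if classify(l, policy) == "brown-out"]
    if not candidates:
        return None
    # Assumed tie-break: lowest loss first, then latency, then jitter.
    return min(candidates, key=lambda l: (l.loss_pct, l.latency_ms, l.jitter_ms))


if __name__ == "__main__":
    voice_policy = SlaPolicy(max_latency_ms=150, max_jitter_ms=30, max_loss_pct=1.0)
    measured = [
        LinkStats("mpls", True, 40, 5, 3.0),      # up, but lossy
        LinkStats("internet", True, 60, 8, 0.1),  # within policy
        LinkStats("lte", False, 0, 0, 0),         # down
    ]
    best = pick_link(measured, voice_policy)
    print(best.name if best else "no usable link")
```

A real SD-WAN fabric makes this decision continuously and per application class; the sketch only shows the idea of policy thresholds driving the choice of transport.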

Don’t hesitate to get some help from your consultancy partners with the definition of high level requirements. You will only benefit from this engagement. I highly recommend ensuring this consultancy is not SD-WAN centric though. Think of your business goals and how you can get there; don’t try to choose an SD-WAN vendor in the early phases. At the end of the day, remember what I’ve said above: SD-WAN offerings are more or less the same, with some unique features. I also would not recommend engaging with consultancy businesses for whom SD-WAN is bread and butter, as they will certainly steer your choice towards their preferred vendor, even though you’re going to be told they have no preference! Such engagements should be left for when you get into the design and delivery phase, and ideally recommendations should come from the chosen vendor.

Once business needs have been identified you can work with the consultancy to shortlist SD-WAN vendors. I hope it’s getting clearer why working with an SD-WAN centric consultancy might not be the best choice: they’ll steer your thinking towards the vendors with whom they have the best business relationship.

Get Low Level Requirements Right

Based on the above, your BA team can then collect a list of more specific low level requirements from key stakeholders, such as:

  • The solution MUST be able to operate over 4G/5G networks without additional hardware
  • The solution SHOULD support WAN optimization, including deduplication and latency optimization
  • The solution COULD support configuration management via RESTful API
  • The solution MUST support API-driven integration with cloud firewall vendors (e.g. Zscaler)
  • The solution MUST support OAM protocols (NetFlow and SNMPv3)
  • The solution MUST support dynamic routing protocols
  • … etc (many others)

Keep in mind that you shouldn’t go lower than this with your requirements. For example, do not lock your choice to proprietary routing protocols, like EIGRP. As a rule of thumb all vendors support BGP and OSPF, but hardly any will support something proprietary, like EIGRP. Even Cisco took a while to introduce EIGRP support into their Viptela code. Don’t try to force SD-WAN vendors to adhere to your existing standards (e.g. QoS) as there will be better ways to achieve the desired outcome (more below).

Also, you should be very careful with non-vendor-agnostic requirements, unless you have very GOOD reasons for them. For example, if you say that the solution must support SGTs, then you’re more or less done with vendor selection: it’s Cisco! This may be a valid requirement if you have an enterprise-wide SDA deployment. If not, then don’t lock yourself to a single vendor by promoting hypothetical requirements.

Engage with Procurement as Soon as Possible

Please make sure your procurement team is part of the process from the very beginning. At the very least you need to understand how much weight their final say carries. Basically, you can spend ages evaluating different vendors and building scoring tables, only for your strongest candidate to be thrown into the bin because of a ‘more attractive’ financial deal. Believe me, this may happen! It is frustrating too. Luckily, our procurement team told us “ignore finance, please find the best product”, and they let us do this. Our 2nd place candidate offered a better deal, but guess what? We’ve gone for Number 1.

Choose the Right SD-WAN Vendor

Now that you have a list of requirements, it’s time to ask your procurement team to engage with the shortlisted vendors and initiate the RFI/RFP process. By the way, don’t go crazy with the vendor shortlist. It must be short. As a rule of thumb, don’t let it grow beyond 5 vendors (fewer is better)! There are tens of different vendors out there, but the strongest are Aruba (Silver-Peak), Cisco (Viptela), Fortinet, Oracle (Talari), Palo Alto (CloudGenix), Riverbed and VMware (VeloCloud). This order is alphabetical and has nothing to do with the vendors’ significance. Each vendor has its strengths and weaknesses. Based on your business requirements you (or your consultancy services) should be able to discard unsuitable candidates. We started with 4 vendors, but then added one more later in the process for contrast.

Our RFI/RFP process was absolutely transparent, and out of five we chose the two strongest vendors based on a points system (MoSCoW weights). We had some specific requirements which made a couple of candidates look stronger than the rest of the market players. One such requirement was support for traditional WAN optimization (specifically packet deduplication), as we had to run the business in some challenging locations where bandwidth is very expensive and latency is very high. Even though we knew that traditional WAN optimization was in decline, we had to support certain centralized business critical applications until they migrate to SaaS. We had zero appetite to refresh and support a WAN optimization technology stack separately. In fact, one of our key business requirements was to reduce network operations cost through simplified support and a minimized infrastructure footprint.
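For readers who have not used a points system with MoSCoW weights before, here is a minimal sketch in Python of how such scoring could work. It is not our actual scoring table: the weights, the 0/1/2 answer scale, the requirement IDs, the vendor names and the rule that a vendor failing any MUST is excluded are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical weights per MoSCoW priority.
WEIGHTS: Dict[str, int] = {"MUST": 5, "SHOULD": 3, "COULD": 1}

# Hypothetical answer scale: 0 = not met, 1 = partially met, 2 = fully met.
Answers = Dict[str, int]


def score_vendor(requirements: Dict[str, str], answers: Answers) -> int:
    """Sum of (priority weight x answer score) over all requirements."""
    return sum(
        WEIGHTS[priority] * answers.get(req_id, 0)
        for req_id, priority in requirements.items()
    )


def meets_all_musts(requirements: Dict[str, str], answers: Answers) -> bool:
    """Assumed rule: every MUST requirement needs a non-zero answer."""
    return all(
        answers.get(req_id, 0) > 0
        for req_id, priority in requirements.items()
        if priority == "MUST"
    )


def rank_vendors(requirements: Dict[str, str], responses: Dict[str, Answers]) -> List[str]:
    """Drop vendors that fail a MUST, then sort the rest by score, best first."""
    eligible = {v: a for v, a in responses.items() if meets_all_musts(requirements, a)}
    return sorted(
        eligible,
        key=lambda v: score_vendor(requirements, eligible[v]),
        reverse=True,
    )


if __name__ == "__main__":
    reqs = {"lte_5g": "MUST", "wan_opt": "SHOULD", "rest_api": "COULD"}
    rfp_responses = {
        "vendor_a": {"lte_5g": 2, "wan_opt": 2, "rest_api": 0},
        "vendor_b": {"lte_5g": 2, "wan_opt": 0, "rest_api": 2},
        "vendor_c": {"lte_5g": 0, "wan_opt": 2, "rest_api": 2},
    }
    for vendor in rank_vendors(reqs, rfp_responses):
        print(vendor, score_vendor(reqs, rfp_responses[vendor]))
```

The point of weighting is that a vendor strong on a SHOULD such as WAN optimization can pull ahead of one that only ticks the COULD boxes, which is roughly what happened in our case.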

Speak to Your Peers!

The real life experience of someone you know is worth more than the word of a potentially interested party. Ask your peers if anyone has deployed SD-WAN using the shortlisted vendors, especially the top two candidates. Someone may have deployed one of them in production, or maybe evaluated a few during PoC / PoV engagements. This may save you time and money.

Is PoC Really Required?

What I am going to say now should be treated as my own opinion; it has nothing to do with industry best practices. If you ended up with ONE strong candidate that beats the rest by a reasonable number of points, and you had a chance to find someone who uses this vendor’s SD-WAN in production and gave you positive, honest feedback, then go for it. SD-WAN is no longer an emerging technology; it’s been around for years. There’s nothing really to PoC as long as you’re not in doubt about your business needs, low level requirements and the vendor’s capabilities.

However, if you’re in doubt, for example because your 2nd place vendor is more attractive financially, then go for a PoC. But don’t go crazy about it. Get a look and feel of what both vendors can do, and spend more time on the features where it is disputed whether they meet the requirements. Also, I recommend asking someone from your NOC to evaluate the UI (or API) to get their feedback, unless it’s going to be a managed service.

We did not run a PoC. I had a chance to discuss the two top vendors with my peers who work in large international enterprises, and that helped us to ensure our choice was absolutely spot on! We’ve not changed our mind since then.

Find the Right Integration Partner

Unless you’ve done a few SD-WAN deployments yourself, do not attempt to deploy in isolation! Do not try to design an SD-WAN network using traditional principles, as you’ll get it wrong. I would highly recommend engaging with one of the vendor’s preferred partners (once you’ve chosen the vendor!) to help you at least with design and delivery. We decided to go for a 1 year managed service post-delivery to ensure our NOC builds up confidence before accepting service ownership. We may extend the managed service at the end of this one year term, or may bring the service in house; that is a decision for the NOC to make.

Be Open Minded

I can pretty much guarantee SD-WAN will disrupt some of your network standards and designs. You will end up adjusting them to the new world. As I previously mentioned, branch simplification was one of our business requirements, including a decreased footprint and simplified operation. This had a big impact on how we decided to treat WAN QoS. Before SD-WAN our QoS design was based on the Cisco Medianet Architecture (aka QoS Design v4.0). It uses a very rich toolset to ensure traffic is appropriately serviced across the private WAN. The downside of this architecture is its complexity.

As Cisco QoS is one of my favourite topics, you can imagine that I was quite reluctant to make any changes in this field. However, after a few technical sessions with our partner I realized that I had to become open to change. As a result we significantly simplified our WAN QoS design at the branch by letting the SD-WAN fabric deal with latency, packet loss and jitter using SD-WAN features rather than traditional QoS tools. This design choice allowed us to completely abstract away from carriers’ QoS models, achieving pure transport independence. This is just one example, but there were many.

Approach that Worked

Once we completed our design and it got stamped by TDA, we decided to go for a Pilot deployment. This is a standard approach and is needed to ensure the design is correct and, if not, that it can be adjusted prior to the main rollout. Our Pilot was limited to 5 locations: one DC, one hub and three branches (including one in a challenging location where we tested the WAN optimization engine).

The Pilot was meant to prove that our design meets the key requirements, some of which are listed below:

  • Connectivity between SD-WAN and non-SD-WAN locations is available
    • Failure of a single MPLS transport in the DC does not impact this connectivity; downtime of up to 3 minutes was deemed acceptable (identical to traditional networking)
  • Connectivity between SD-WAN locations is available
    • Failure of a WAN transport, or a combination of WAN transports, must not have a severe impact on voice or video communications (calls must not be interrupted), i.e. sub-second failover.
  • WAN optimization is applied and has a positive impact on WAN link utilization and user experience
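One way to check failover targets like the ones above is to run continuous probes during a failure test and measure the longest outage. Below is a minimal sketch in Python; it is not the tooling we used. The probe format (timestamp in milliseconds plus a success flag), the function names and the way an outage is measured (from the last good probe to the next good probe) are assumptions for illustration.

```python
from typing import List, Tuple

# One probe result: (timestamp in milliseconds, True if the probe was answered).
Probe = Tuple[int, bool]

# Targets taken from the pilot criteria, expressed in milliseconds.
SDWAN_TO_LEGACY_MAX_MS = 3 * 60 * 1000  # downtime of up to 3 minutes is acceptable
SDWAN_TO_SDWAN_MAX_MS = 1000            # failover must be sub-second


def longest_outage_ms(probes: List[Probe]) -> int:
    """Longest gap between the last good probe before a failure run and the
    first good probe after it. Probes must be sorted by timestamp. Outages
    that start before the first good probe or never recover are ignored."""
    longest = 0
    last_ok = None
    in_outage = False
    for ts, ok in probes:
        if ok:
            if in_outage and last_ok is not None:
                longest = max(longest, ts - last_ok)
            last_ok = ts
            in_outage = False
        else:
            in_outage = True
    return longest


def within_three_minutes(probes: List[Probe]) -> bool:
    """SD-WAN to non-SD-WAN criterion: outage of at most 3 minutes."""
    return longest_outage_ms(probes) <= SDWAN_TO_LEGACY_MAX_MS


def sub_second(probes: List[Probe]) -> bool:
    """SD-WAN to SD-WAN criterion: outage strictly below one second."""
    return longest_outage_ms(probes) < SDWAN_TO_SDWAN_MAX_MS


if __name__ == "__main__":
    # Probes every 100 ms; two consecutive probes are lost during a link failure.
    trace = [(0, True), (100, True), (200, False), (300, False), (400, True), (500, True)]
    print(longest_outage_ms(trace), sub_second(trace))
```

Note that the measurement is only as precise as the probe interval, so proving sub-second failover needs probes sent much more often than once per second.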

Our main goal was to ensure SD-WAN and non-SD-WAN networks can transparently co-exist throughout the deployment. We decided to integrate SD-WAN into our Data Centers before deploying into any (non-pilot) branch. The Data Centers act as interconnection points where SD-WAN and traditional networks co-exist.

Even though our chosen vendor supports integration with underlay networks (MPLS providers) via BGP peering, we decided to use a hub-and-spoke architecture for SD-WAN to non-SD-WAN integration because it is significantly simpler to configure. With this approach we did not have to rely on temporary per-transport QoS configuration, nor did we have to establish BGP peering with private WAN providers in every branch. SD-WAN was integrated into the DC networks in such a way that it provided highly available connectivity with traditional networks, without asymmetry and with appropriate security controls.

Even though this design approach made the DC integrations look complex, it allowed us to maintain a very basic configuration at the branches. Apart from the BGP ASN for LAN peering, IP addressing and some regional differences, all our branches are homogeneous and simple. This had a very positive impact on the main rollout: 95% of branches were absolutely identical (ignoring private WAN transports of course, but this layer was abstracted away).
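The idea of identical branches that differ only in a handful of per-site values can be sketched with a simple template. This is a minimal illustration in Python using the standard library, not our vendor’s template format; the key names, the example values and the output layout are all hypothetical.

```python
from string import Template
from typing import Mapping

# Hypothetical branch template: everything is fixed except the per-site
# variables (LAN BGP ASN, IP addressing, region).
BRANCH_TEMPLATE = Template(
    "site: $site\n"
    "region: $region\n"
    "lan_bgp_asn: $lan_bgp_asn\n"
    "lan_subnet: $lan_subnet\n"
    "overlay_policy: standard-branch\n"
)


def render_branch(site_vars: Mapping[str, object]) -> str:
    """Render one branch config; raises KeyError if a variable is missing,
    so an incomplete site definition fails before anything is deployed."""
    return BRANCH_TEMPLATE.substitute(site_vars)


if __name__ == "__main__":
    # Illustrative site values only.
    sites = [
        {"site": "branch-01", "region": "emea", "lan_bgp_asn": 65010, "lan_subnet": "10.10.1.0/24"},
        {"site": "branch-02", "region": "apac", "lan_bgp_asn": 65020, "lan_subnet": "10.20.1.0/24"},
    ]
    for site in sites:
        print(render_branch(site))
```

With this kind of structure, adding a branch becomes a matter of supplying a few values rather than designing a new configuration.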

As we’ve always treated SD-WAN as an enabler, the following activities are being planned, or taking place as I am writing this article:

  • Removal of technical debt
    • Legacy WAN optimization technology stack (decreases footprint in branches and DCs!)
    • Simplified configuration – we’ve got rid of EIGRP in branches, and there’s no more redistribution needed (in branches)
  • MPLS/VPLS decommissioning, with the aim of ending up with Dual Internet branches
  • Integration with Zscaler Internet Access (SWG) – we’re moving towards SASE and ZTNA

If you have any questions, or would like to discuss our deployment, I am more than happy for a quick chat! I hope this one was useful.
