How does your BGP path impact your digital performance? This is easy to understand:

  • Network performance, digital experience, user performance, business conversion and performance are intertwined. 
  • The network path taken by your users (driven by BGP policies) determines the network performance they get when reaching your digital platform and, as a consequence, their experience and your overall digital business performance (if you have any doubt about this, read this article). 

That network path is driven by:

  • the way you manage your cloud infrastructure,
  • your DNS records,
  • and the BGP policies of network operators on your users’ side to access your own AS (Autonomous System).

If you are not familiar with BGP, I recommend that you take a look at this article. If you wonder how BGP routing works, this one will help as well.

This article covers: 

  • How DNS and BGP drive the network path from your users to your applications, 
  • How it affects reachability of your applications from the internet, 
  • What you can do to optimize it,
  • A presentation of a concrete case study on a real life large scale retail business.

Network performance and the AS path

First of all, your BGP policy drives the route taken from your AS (Autonomous System) to other ASes. Conversely, the route taken by users to reach your digital platform and AS is driven by their operators’ BGP policies.

This pretty much works like a chain: your peers and transit providers announce the IP prefixes of your AS to their own peers and transit providers, which in turn announce your prefixes to another set of peers and transit providers. 

Second, every operator makes routing decisions based on the routes announced for your prefixes. If several AS paths exist, the operator arbitrates between them based on criteria such as its IP transit cost and the length of the path to your AS.

Ultimately the AS path (also referred to as BGP path) will drive network latency and hence have a significant impact on your digital performance.

Multiple AS paths from user to server: here we can see two AS paths from user to server: (1) AS1 > AS2 > AS4 > AS5 > Your AS and (2) AS1 > AS3 > AS5 > Your AS.
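
To make the arbitration between these two paths concrete, here is a minimal Python sketch. It assumes a deliberately simplified decision process: the operator first prefers the route with the highest local preference (a common way to encode cost or commercial policy), then the shortest AS path. Real BGP implementations apply more tie-breakers than this, and the `Route` type and `best_route` function are hypothetical names used for illustration only.

```python
from typing import NamedTuple


class Route(NamedTuple):
    as_path: tuple        # ASes traversed after the deciding AS, origin AS last
    local_pref: int = 100  # operator-assigned preference, higher wins


def best_route(routes):
    """Pick the highest local preference, then the shortest AS path."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


# The two paths from the figure, as seen by the user's operator (AS1).
via_as2 = Route(("AS2", "AS4", "AS5", "YOUR_AS"))
via_as3 = Route(("AS3", "AS5", "YOUR_AS"))

# With equal preference, the shorter path through AS3 is selected.
print(best_route([via_as2, via_as3]).as_path)

# If AS1 prefers AS2 for cost reasons, the longer path wins anyway.
cheaper_via_as2 = Route(via_as2.as_path, local_pref=200)
print(best_route([cheaper_via_as2, via_as3]).as_path)
```

The second call illustrates why the shortest path is not always the one your users get: the operator's own policy comes first.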

Network path and DNS 

Each user relies on DNS to translate your hostname into an IP address. Different DNS resolutions may return different IP addresses, which may belong to different ASes and locations and therefore be reached through different network paths. 

This way, you can use DNS to direct users from different regions to different servers, CDNs, etc. to optimize performance and costs. You can configure this manually or automate it, and either manage it yourself or delegate it to your cloud service providers.

AS path variation with multiple DNS resolutions: here we can see that if the DNS record points to another IP address located outside of your AS, the AS path and the network performance for your users will vary accordingly.
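
As a rough illustration of the first step, the sketch below lists the IP addresses a hostname resolves to from the machine running it. It only uses the standard library's `socket.getaddrinfo`. The `resolve_all` helper and its `resolver` parameter are hypothetical names introduced here, and the result reflects only the resolver of the machine running the code, not what your users see.

```python
import socket


def resolve_all(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses a hostname resolves to.

    `resolver` defaults to socket.getaddrinfo and can be swapped for a stub.
    """
    infos = resolver(hostname, 443, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

Running `resolve_all` against your own hostname from vantage points in different regions or operators shows whether users are being directed to different IP addresses, and hence potentially to different ASes.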

What can you do to optimize the network path to your apps? 

How can you optimize something which is not completely under your control? That’s an interesting question, isn’t it?
Well, to optimize the BGP path taken by your users (and, as a consequence, your digital performance), there are things you can definitely do and others which are off the table. 

Gaining visibility on the network path performance from your main users’ locations is your #1 step! 

You can only improve what you measure (and understand). Your own actions and infrastructure alone do not directly determine the path taken by your users. As a consequence, you have to understand the impact of many third parties on your users’ network path (users, their ISP, etc.). 

Here is a short list of things to know: 

  • How your users resolve your FQDN: which IP addresses the DNS resolutions point your users towards. 
  • How your users’ operators route traffic to your AS: through which AS path (series of operators) your users reach your application, how many routing hops they go through, and what the resulting network latency and packet loss levels are. 
  • Short and long paths: whether there is one path or several, and how they compare. 
  • One-off quality of service events: whether routes are changing or suffering from degradations (packet loss, congestion, etc.).
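
The checklist above can be sketched as a small aggregation over probe results. This is an illustration under assumptions, not a description of any particular monitoring tool: the sample format (`ip`, `as_path`, `hops`, `rtt_ms`, `loss_pct`) and the `summarize_paths` function are hypothetical, and the measurements themselves would come from whatever probing you already run.

```python
from collections import defaultdict
from statistics import median


def summarize_paths(samples):
    """Group probe samples by (resolved IP, AS path) and summarize each group.

    Each sample is a dict with keys: ip, as_path (tuple), hops, rtt_ms, loss_pct.
    """
    groups = defaultdict(list)
    for sample in samples:
        groups[(sample["ip"], sample["as_path"])].append(sample)

    summary = {}
    for key, group in groups.items():
        summary[key] = {
            "samples": len(group),
            "median_rtt_ms": median(s["rtt_ms"] for s in group),
            "median_hops": median(s["hops"] for s in group),
            "mean_loss_pct": sum(s["loss_pct"] for s in group) / len(group),
        }
    return summary
```

More than one key in the result means users reach you over several DNS resolutions or AS paths, and the per-path figures show how the short and long paths compare.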

What actions can you take to improve the network path? 

Your first thought will probably be your BGP policy, but this may not be the right direction: your BGP policy mostly drives how you route traffic from your AS to outside destinations. 

Actually there are two main ways to improve your network path: 

  • The first one is to manage the DNS resolutions to your resources in an efficient way (use different DNS resolutions based on the users’ location / operator). 
  • The second one is to manage carefully who announces your IP prefixes. You can only do this with your direct peers and transit providers. The key here is to continuously measure which of your direct peers and transit providers are part of the AS paths that deliver the best possible network performance.
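
For the second point, one simple way to see which direct peers and transit providers sit on the best-performing paths is to group measured paths by the AS adjacent to yours. The sketch below assumes each measurement is a pair of an AS path (ending with your own AS) and a round-trip time in milliseconds. `rank_upstreams` is a hypothetical helper, and the AS numbers come from the range reserved for documentation.

```python
from collections import defaultdict
from statistics import median


def rank_upstreams(samples, own_as):
    """Rank the ASes adjacent to own_as by median RTT of the paths crossing them.

    `samples` is an iterable of (as_path, rtt_ms) pairs; as_path ends with own_as.
    Returns a list of (median_rtt_ms, neighbor_as), best neighbor first.
    """
    rtts = defaultdict(list)
    for as_path, rtt_ms in samples:
        # Only keep paths that really end in our AS and have a neighbor before it.
        if len(as_path) >= 2 and as_path[-1] == own_as:
            rtts[as_path[-2]].append(rtt_ms)
    return sorted((median(values), neighbor) for neighbor, values in rtts.items())


# Illustrative measurements only.
samples = [
    (("AS64500", "AS64501", "AS64496"), 18.0),
    (("AS64500", "AS64501", "AS64496"), 22.0),
    (("AS64500", "AS64502", "AS64496"), 41.0),
    (("AS64500", "AS64503", "AS64502", "AS64496"), 47.0),
]
print(rank_upstreams(samples, "AS64496"))
# → [(20.0, 'AS64501'), (44.0, 'AS64502')]
```

A ranking like this is only an input to the decision: latency is one criterion among others (packet loss, cost, capacity), and it needs to be recomputed continuously as routes change.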

A concrete use case: a drive retail site

As an example, we will monitor the path from a key national operator to a nationwide drive retail platform. 

Here is a geographical overview of the internet reachability of their platform: as you can see, network performance varies over time and by country.

If you focus on France, which is their main market, you can see a significant change in network response times and packet loss levels at the beginning of the observation period.

If we look at the details of the network path used from the main operators, this confirms the severe change in network latency, packet loss and route length (number of routers on the path). We also see that over that period 3 distinct DNS resolutions were in use, leading to different resources (and IP addresses) which are routed through 3 different autonomous systems (AS): 

If we focus our observation on the initial period, we can easily identify which DNS resolution and AS path were in use then (which will most likely be our preferred path from now on):

Of course we can also check the two other paths that were used after that period:

This is a pretty good start to make informed decisions and quickly improve the network reachability of your applications and platform.
It is important to note that, with Kadiska, it takes less than 1 minute to get this up and running. 

Should you be interested in knowing more about this: