The Ambassador of QoS… (Quan) | Problem Solver Blog

Problem to Solve – My voice, video, and applications are latency sensitive. How can I make sure that my QoS is performing properly so I can assure service reliability?

I know, I know .. Quality of Service “QoS” is not the elusive “Quan” from the classic Jerry Maguire movie. But I could not think of any interesting lead-ins that start with “Q”. But now that we are here, let’s roll with it! If your company is planning to run Voice and Video services on your Data network, you had better consider an “Ambassador of QoS” to ensure that the rollout goes smoothly and efficiently.

When it comes to monitoring Quality of Service (QoS), we get all kinds of requests for various business implementations and challenges. In today’s world, the network has become the Super Highway of Transportation for moving voice over IP (VoIP), Video Conferencing, and of course … critical business application traffic. All of these services need to perform reliably and adequately for your end users ….. or to be very frank … they will whine like little kids.

I say “little kids” with a slight bit of sarcasm of course and want to formally apologize to any children that are reading this riveting blog article. The reality is, when running these types of latency sensitive applications on a shared network wire (segment), there will DEFINITELY be contention for that bandwidth. To address the bandwidth contention issue, most organizations will deploy QoS policies on their network infrastructure. QoS policies ensure that the most critical services get the first priority to access the “Super Highway of Transportation” (e.g., VoIP signaling gets first priority). Other less critical services are given a lower priority for access (e.g., user access to YouTube gets no priority). The methods used to mark traffic for these policies include Differentiated Services Code Point (DSCP), IP Precedence, and Type of Service (TOS).
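For anyone who has not stared at these markings before, here is a minimal Python sketch of how the three relate at the bit level. The one thing it relies on is the standard IP header layout: DSCP occupies the upper six bits of the old TOS byte, and IP Precedence is the upper three. The function names and the small table of code points are mine, picked for illustration.

```python
# A few well-known DSCP code points (decimal values), for illustration only.
DSCP_NAMES = {0: "Default", 24: "CS3", 26: "AF31", 34: "AF41", 46: "EF"}


def dscp_to_tos_byte(dscp: int, ecn: int = 0) -> int:
    """Pack a 6-bit DSCP value and a 2-bit ECN value into the 8-bit TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    if not 0 <= ecn <= 3:
        raise ValueError("ECN must be between 0 and 3")
    return (dscp << 2) | ecn


def tos_byte_to_dscp(tos: int) -> int:
    """Extract the DSCP value (upper six bits) from a TOS byte."""
    return (tos >> 2) & 0x3F


def dscp_to_ip_precedence(dscp: int) -> int:
    """IP Precedence is the upper three bits of the DSCP value."""
    return dscp >> 3


for value, name in sorted(DSCP_NAMES.items()):
    print(name, "DSCP", value,
          "TOS byte", dscp_to_tos_byte(value),
          "IP Precedence", dscp_to_ip_precedence(value))
```

This is handy when one tool reports the raw TOS byte and another reports DSCP: EF shows up as 184 in the first and 46 in the second, and both are the same marking.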

The Challenges

There are various challenges when deploying QoS policies on the network. To name a few:

Assessing and categorizing the current network traffic so that it can be analyzed for priority. This will entail data collection and historical reporting of application traffic.
Proper planning to prioritize services and applications to implement a policy. Video signaling almost always gets a priority queue, while generic user web browsing will get put into the default queue.
Implementing the QoS policy into the infrastructure. Configuring the routers, switches, load balancers, etc. to leverage and enforce the QoS policy.
Assessing and re-assessing the network and application performance “pre and post” QoS deployment. This is the point I will focus on, as it is the most common “pain point” for my customers.

Measuring the QoS Performance

From my perspective, there are really three vantage points that need to be considered to validate the actual performance of the QoS policy. As new applications and services are added to your environment, this type of assessment should really be done on a consistent basis.

1) Application Traffic by QoS (Class or Individual DSCP Code Point) – From a post QoS deployment perspective, I have found it very useful to look at a customer’s application traffic by a class (group of DSCP code points) or by individual DSCP code point. I guide my customers to look at their QoS by individual DSCP code point in the early stages of the deployment, because it will be easier to spot mis-configured traffic. This can be done by various methods, including NetFlow, dedicated instrumentation, traffic capture appliances, or even a generic packet capture and decode on a network segment. Each method has its own pros and cons when it comes to collecting and understanding the data.

Why is this Valuable? The purpose of seeing the application traffic is to validate that the QoS policy is working and configured properly. If you can see all of the HTTP traffic running in the actual configured DSCP code point, then you have validated the configuration. The key thing is to view and validate the priority applications and confirm they are being categorized into their proper queue setting. This will ensure that when network bandwidth contention occurs, the most important services will be available and run optimally.
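To make that validation concrete, here is a rough Python sketch of the check. It assumes you have already exported flow records (from NetFlow or a capture) into simple dictionaries. The record layout, the application names, and the `EXPECTED_DSCP` policy table are all hypothetical, so swap in whatever your own collector and policy actually use.

```python
from collections import defaultdict

# Hypothetical policy: which DSCP code point each application SHOULD carry.
EXPECTED_DSCP = {"voip-rtp": 46, "video-conf": 34, "http": 0}

# Hypothetical flow records, as they might look after export from a collector.
flows = [
    {"app": "voip-rtp", "dscp": 46, "bytes": 120_000},
    {"app": "voip-rtp", "dscp": 0, "bytes": 8_000},    # unmarked voice
    {"app": "http", "dscp": 0, "bytes": 900_000},
    {"app": "http", "dscp": 46, "bytes": 50_000},      # web traffic marked EF
    {"app": "video-conf", "dscp": 34, "bytes": 400_000},
]


def bytes_by_dscp(records):
    """Total bytes per application, grouped by individual DSCP code point."""
    totals = defaultdict(lambda: defaultdict(int))
    for rec in records:
        totals[rec["dscp"]][rec["app"]] += rec["bytes"]
    return {dscp: dict(apps) for dscp, apps in totals.items()}


def find_mismarked(records, expected):
    """Return (app, seen_dscp, expected_dscp, bytes) for flows that break policy."""
    problems = []
    for rec in records:
        want = expected.get(rec["app"])
        if want is not None and rec["dscp"] != want:
            problems.append((rec["app"], rec["dscp"], want, rec["bytes"]))
    return problems


print(bytes_by_dscp(flows))
for app, seen, want, nbytes in find_mismarked(flows, EXPECTED_DSCP):
    print(f"{app}: {nbytes} bytes seen in DSCP {seen}, expected DSCP {want}")
```

Grouping by individual code point first, rather than by class, is what makes the stray web traffic in the EF queue jump out.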

2) QoS Policy MIB on Network Infrastructure – Viewing the QoS policy and performance from the actual infrastructure perspective is the item that I see getting less attention. Some customers have told me that they really only care that the application traffic gets categorized correctly. I like to remind them that this is an important component, but it only tells part of the QoS performance story. The infrastructure data is available via a QoS MIB on the actual device (router or switch) and can be polled via an SNMP poller.

Why is this Valuable? Viewing the application traffic will display the traffic categorization, but it does not show when infrastructure QoS queues get “full”. When infrastructure QoS queues get “full”, packets will get dropped. When packets get dropped, applications will be affected by latency and retransmissions. The per-queue QoS drop counters on a device are an excellent indicator to measure and report, so keep a close watch on them. QoS drops indicate that there may be too much traffic on the segment or, more likely, that a QoS policy needs to be reassessed.
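Drop counters on a device are cumulative, so the useful number is the change between two polls. Here is a minimal Python sketch of that arithmetic. It assumes your SNMP poller has already fetched the counters for one queue. The `QueueSample` fields are hypothetical names of my own, not objects from any particular MIB.

```python
from dataclasses import dataclass


@dataclass
class QueueSample:
    """One poll of a single QoS queue (hypothetical field names)."""
    timestamp: float    # seconds
    offered_pkts: int   # cumulative packets offered to the queue
    dropped_pkts: int   # cumulative packets dropped by the queue


def drop_stats(prev: QueueSample, curr: QueueSample):
    """Drop rate and drop percentage between two polls of the same queue.

    Returns None when the samples are out of order or a counter went
    backwards (for example after a device reload), since the delta
    would be meaningless.
    """
    interval = curr.timestamp - prev.timestamp
    offered = curr.offered_pkts - prev.offered_pkts
    dropped = curr.dropped_pkts - prev.dropped_pkts
    if interval <= 0 or offered < 0 or dropped < 0:
        return None
    drop_pct = 100.0 * dropped / offered if offered else 0.0
    return {"drops_per_sec": dropped / interval, "drop_pct": drop_pct}


# Two polls taken five minutes apart (made-up numbers).
earlier = QueueSample(timestamp=0.0, offered_pkts=1_000_000, dropped_pkts=100)
later = QueueSample(timestamp=300.0, offered_pkts=1_600_000, dropped_pkts=3_100)
stats = drop_stats(earlier, later)
print(stats)  # {'drops_per_sec': 10.0, 'drop_pct': 0.5}
```

Trending these two numbers per queue over time is what tells you whether a queue is undersized or whether the wrong traffic is landing in it.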

3) View of Actual Application Response Times – I guide my customers to evaluate their application response times pre and post QoS deployment. The purpose of this guidance is to see how the applications are performing (from a response time perspective) before and after the QoS policy is enabled.

Why is this Valuable? The whole goal of the QoS deployment is to ensure that the most critical application services are available to your users and running optimally. No one wants to be in the situation of telling their management … “The QoS policy has been configured, deployed, and enabled on the enterprise network. Our Top 5 most critical application services have seen a serious degradation in response time.” Measuring and reporting the application response time (before and after QoS is deployed) will be the proof of the efficiencies gained. It will also be the proof point that shows where a QoS policy reassessment might be necessary.
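One simple way to build that proof is to compare the median and the 95th percentile response time for the same application before and after the policy goes live. The Python sketch below does this on made-up sample data. The sample values and function names are assumptions for illustration, and the percentile uses a basic nearest-rank method rather than anything a specific monitoring product does.

```python
import math
from statistics import median


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def compare_response_times(pre, post):
    """Median and 95th percentile before and after, with the percentage change."""
    report = {}
    for name, metric in (("median", median), ("p95", lambda s: percentile(s, 95))):
        before, after = metric(pre), metric(post)
        report[name] = {
            "before": before,
            "after": after,
            "change_pct": round(100 * (after - before) / before, 1),
        }
    return report


# Made-up response times in milliseconds for one application.
pre_qos_ms = [120, 135, 150, 180, 240, 410, 95, 160, 175, 300]
post_qos_ms = [90, 100, 105, 110, 115, 120, 125, 130, 140, 200]

report = compare_response_times(pre_qos_ms, post_qos_ms)
for name, row in report.items():
    print(name, row)
```

A negative change means response times improved. Looking at the 95th percentile alongside the median matters because contention tends to hurt the slowest transactions first, and an average can hide that.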

Points to Ponder

– How have your enterprise wide QoS deployments gone?

– Any interesting war stories / lessons learned to share with the class?

– What methodologies have you used to prove the effect of the QoS deployment — good or bad?

– Any other considerations that you have looked at prior to or after a QoS deployment?

Until next time …..