Problem to Solve: How can we guarantee the collection of network traffic that feeds our APM/NPM, Security, & Big Data tools inside of our production network?
In my previous video blog, I discussed the use of SPANs with their associated pros and cons (“To SPAN or not To SPAN”). This post will focus on two things. First, who doesn’t remember and love the end of the movie “Shrek”, where “Donkey” gets to sing karaoke-style? Second, I will focus on SPAN’s bigger, older and more complex brother: taps.
Why would you use a Tap?
A tap is a physical device that you place “in-line” between two network devices. Its purpose is to provide eyes and ears on the communications flowing between those nodes. Because a tap is a physical device (which requires an outage window to install), it is much less likely to be disabled on the fly, in stark contrast to a SPAN, which can be removed with a quick configuration change. This is quite beneficial when you plug your taps into an APM or NPM solution that can perform analytics. Should an issue occur within your environment, historical data is essential to determine whether the statistics you’re reviewing reveal an abnormality or a long-term trend.
Continuous vs Ad-Hoc Data Collection
Occasionally I have a discussion with individuals who are on the fence concerning taps and SPANs. It is important to note that this particular scenario has nothing to do with technical considerations or concerns, but with their own personal experience and organizational behavior. If your organization were to have an incident (at this very moment, as you are reading this article), what data would you have at your disposal RIGHT now? Do you need to reach out to the end user to recreate the problem after your instrumentation is deployed? Or is your tool set ready and waiting? However you answered, there is no right or wrong answer. The real answer lies with your use case, as well as your personal and organizational goals as they relate to APM, NPM, security and service triage.
But wait... You ask, “Are you going to spend the time to write a blog entry only to tell me I don’t HAVE to install taps?” That’s right! Let’s face it folks, not every organization or business is prepared or willing to become more proactive as it relates to triage across their platforms. Personally, I work for a vendor, and my personal ethics drive me to pick the best solution for my customer given the problem I am tasked with solving. The goal is not to find a problem for a solution! I recommend a talk with your vendor to better understand their approach and reasoning. Ask “Why can’t I solve my problem with SPANs?” You will not only better understand your solution design, but you will also find out if your vendor has your best interests in mind.
My final point on this matter is rather simple. In nearly every “fire fight situation” I have been a part of, there is always an overarching concern: whether or not a stream of network traffic has been constantly fed into the tool sets. It is on this very point that I start to lean towards taps in discussions with customers. Why? When you have a problem with ‘X’ application and your links are tapped, the probability is on your side that the problem has already been captured by your tool sets. If your links are spanned, then you are at the mercy of the switch’s SPAN configuration as to whether the right data was captured.
What can I Tap?
Taps can easily be broken into two separate categories: optical and electrical/copper. Optical taps are my personal preference, as they are the simplest devices and commonly consist of just a mirror. Short of physically destroying the tap itself, there is limited risk of a failure when it is installed correctly. From the optical perspective, the most common are LR and SR 10 Gig optics, followed by SX/LX. With recent network refreshes, I have seen an explosion in 40 Gig BiDi deployments.
What about the Security of a Tap?
With all the media coverage concerning cyber and identity theft, the security of a tap has become a popular topic of conversation. So for those of you who have IDS/IPS/DLP/APM/NPM/etc. tools (which is likely all of you reading this blog), how would you know when those tools have lost their SPAN connection? Remember, their production network connection is still live. For many, it would be the next time a fire broke out or when you noticed your inbox was no longer flooded with alarms. Do you want to explain to your management that you are unable to triage a performance or security issue? Well, you run that exact risk should you opt for a SPAN and not a tap. You are putting all your faith in the assumption that folks won’t “fat finger” a configuration or “move the data source of your SPAN” for an effort they deem more important. Can taps be modified? Yes, they can if someone has physical access to unplug the tap; however, physical access is directly tied to the access procedures and security strategy of your data center. It is not tied to the logical switch configuration of the SPAN port.
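One way to avoid learning about a dead feed during the next fire is to watch for silence on the monitoring side. The following is only a minimal sketch of that idea, not a feature of any particular product: the `FeedStatus` record, the feed names, and the 60-second threshold are all hypothetical, and it assumes your tool can report the time it last saw a packet on each feed.

```python
from dataclasses import dataclass


@dataclass
class FeedStatus:
    """Hypothetical record: one monitoring feed (SPAN or tap) into a tool."""
    name: str
    last_packet_ts: float  # epoch seconds when the tool last saw a packet


def stale_feeds(feeds, now, max_silence_s=60.0):
    """Return the names of feeds that have been silent longer than max_silence_s."""
    return [f.name for f in feeds if now - f.last_packet_ts > max_silence_s]


# Example: one feed is current, the other went quiet ten minutes ago.
feeds = [
    FeedStatus("core-span-1", last_packet_ts=1000.0),
    FeedStatus("dmz-tap-1", last_packet_ts=1590.0),
]
silent = stale_feeds(feeds, now=1600.0)
print(silent)
```

A check like this only tells you that a feed went quiet. It cannot tell you that a SPAN is still sending traffic but from the wrong VLAN or interface.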
So how about this for a possible scenario? An “insider” disables the SPAN to your security tool set, commits their dirty deed, then re-enables the SPAN once their work is done. Or they swap the VLANs/interfaces being monitored so the proper data is not captured by the tool... all from the comfort of home via wireless and VPN. Taps offer a more secure solution (for ensuring proper data collection to tools), and one that is exceptionally hard for someone to detect. The only method I’ve personally come up with is for someone to take light readings before and after the tap installation.
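To make the light-reading idea concrete, here is a rough sketch of the arithmetic. It assumes a passive optical tap that diverts a fixed fraction of the light to the monitor port; the split ratios and power readings below are illustrative values, not measurements, and real taps add some excess loss on top of the ideal figure.

```python
import math


def ideal_split_loss_db(fraction):
    """Ideal loss in dB on a port that receives `fraction` of the input light."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return -10 * math.log10(fraction)


def measured_insertion_loss_db(before_dbm, after_dbm):
    """Loss in dB from two receive-power readings (in dBm) on the same link."""
    return before_dbm - after_dbm


# Illustrative only: a 50/50 split costs roughly 3 dB on each port.
print(round(ideal_split_loss_db(0.5), 2))

# Illustrative readings: -3.0 dBm before the tap, -6.5 dBm after it.
print(measured_insertion_loss_db(-3.0, -6.5))
```

A drop in received power of a few dB between the two readings is consistent with a splitter having been placed in the path, which is why before-and-after readings are the tell.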
Impact on your Monitoring Solution
So in closing, both SPANs and taps offer a viable solution depending on the problem you are looking to solve. Your vendor should provide guidance for you to address your problems with the appropriate technology and deployment model. Now I hope you are armed with additional knowledge and questions the next time you are faced with this decision.