Applying Advanced NPM (aka NPM+) to Healthcare; a PCPM Odyssey (Part 2)

While traffic utilization and link error rates (NPM) have their place and uses, we must look deeper to truly impact patient care. Service enablers, for example, are services that are often overlooked and neglected, yet they have the most widespread impact on the overall health of an IT organization. Failure to properly maintain this machinery will cause complete chaos within IT.
DNS … A Key Service Enabler
 
I estimate that more than 95% of the applications that caregivers access will completely cease to function should this one service enabler go down. What is it? The Domain Name System, more widely known by its acronym, “DNS”. What is DNS? Well, it’s similar to a phone book for data communication networks. It translates friendly alphanumeric names (www.microsoft.com) into IP addresses (192.168.1.1). You see, when computers/iPads attempt to access an application or dependency, they typically do so by name. This is a general best practice because it is more user-friendly and, most notably, because it makes maintenance and traffic balancing possible: the address behind a name can change without touching the clients.
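To make the phone book analogy concrete, here is a minimal Python sketch of a name-to-address lookup using the operating system's resolver. The function name `resolve` is my own for illustration; it is not part of any NPM+ toolset.

```python
import socket


def resolve(name: str) -> list[str]:
    """Return the sorted IP addresses a name resolves to, or [] if the lookup fails."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        # The resolver could not translate the name (for example, it does not exist).
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("www.microsoft.com")` on a connected machine should return one or more addresses, while a name that does not exist returns an empty list, which is exactly the "no number in the phone book" case.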
 
Now that we know what DNS is, why should we pay so much attention to it? Much like a phone book, if we are unable to translate a name to a number, then we won’t be able to make a connection (PC, phone, Citrix, EHR). Further, if the name we are looking up is incomplete, then we will spend time looking up all the various forms and spellings of that name that we may know about (DNS prefix). It is not surprising, then, that if we misspell the name we are looking for, we will never be able to establish the connection.
 
Performance Impacts of Improperly Implemented DNS
Name failure
 
By far the most common DNS failure you will experience is error code “3”, a name error (commonly called NXDOMAIN). The name that your PC/Mac/phone/tablet is attempting to look up doesn’t exist, so the server returns an error, in this case “3”. Have you ever mistyped “www.google.com”? Or your corporate home page? This results in a DNS error that you aren’t likely to see. These types of problems are unavoidable when end users type names manually. But what if the administrator is having a busy day and quickly provides what is commonly referred to as a ‘short name’? For example, “EHR” instead of “ehr.corporate.na.local”. What’s the impact if the administrator takes this shorthand and pushes it out to every desktop? Let’s review.
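For readers who want to see where that “3” lives, the sketch below pulls the response code out of a raw DNS response. The code sits in the low four bits of the flags word in the 12-byte DNS header (RFC 1035). The function names `rcode_of` and `describe` are hypothetical helpers of mine, and the table lists only the six basic codes.

```python
import struct

# Basic response codes from the core DNS specification (RFC 1035),
# shown with their commonly used abbreviations.
RCODE_NAMES = {
    0: "NOERROR",
    1: "FORMERR",
    2: "SERVFAIL",
    3: "NXDOMAIN",
    4: "NOTIMP",
    5: "REFUSED",
}


def rcode_of(response: bytes) -> int:
    """Extract the 4-bit response code from the header of a raw DNS response."""
    if len(response) < 12:
        raise ValueError("a DNS message header is 12 bytes long")
    # Bytes 2-3 hold the flags word; the response code is its low four bits.
    (flags,) = struct.unpack("!H", response[2:4])
    return flags & 0x000F


def describe(response: bytes) -> str:
    """Render the response code as 'number (NAME)' for reports."""
    code = rcode_of(response)
    return f"{code} ({RCODE_NAMES.get(code, 'other')})"
```

A response whose flags word is 0x8183 decodes to code 3, the name failure discussed above; 0x8182 would decode to code 2, a server failure.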
 
This question isn’t easily answered, given that each corporation is configured differently. We must review the corporation’s DNS prefix search list (more formally, the DNS suffix search list) to understand the impact in each case. DNS prefixes permit your laptop to append various domains to your short name in an attempt to find your intended destination. For the sake of this blog entry, let’s say that as a result of numerous mergers, a company-provided laptop has 5 prefixes preconfigured, as shown below.
 
      • corp.local
      • corp.eu.local
      • mergedcorp.local
      • mergedcorp.from.5years.ago.local
      • corporate.na.local
 
If a caregiver attempts to access the Citrix front end of their EHR located at “ehr.corporate.na.local” via “ehr”, their laptop will perform lookups on the following:
  • “EHR”
  • “EHR.corp.local”
  • “EHR.corp.eu.local”
  • “EHR.mergedcorp.local”
  • “EHR.mergedcorp.from.5years.ago.local”
  • “EHR.corporate.na.local”
Now, these lookups are executed in series, and given default DNS behavior when an entry is not known, the response times could vary greatly. But let’s say each failure takes 120 ms (the average at my home office). The first 5 lookups fail before the sixth succeeds, which results in 600 milliseconds (120 ms multiplied by 5 failed lookups), or just over half a second, of delay. While half a second isn’t a large amount of time, it is significant when you multiply it across all EHR users and add it to any other performance delays the EHR may be having that particular day. Now let’s say your EHR has 3 tiers in its architecture. If each tier is impacted the same way, that translates into about 1.8 seconds (3 × 600 ms) of delay for queries.
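The arithmetic for a single short-name lookup can be sketched in a few lines of Python. This assumes the lookup order listed above (bare name first, then each search-list entry in turn) and a flat 120 ms per failed lookup; real resolvers may order and time these differently. The function names are mine, for illustration only.

```python
def candidate_names(short_name: str, search_list: list[str]) -> list[str]:
    """Names tried in order: the bare name, then the name with each suffix appended."""
    return [short_name] + [f"{short_name}.{suffix}" for suffix in search_list]


def wasted_ms(candidates: list[str], real_name: str, failure_ms: float) -> float:
    """Time spent on failed lookups before the real name is reached."""
    failures = 0
    for name in candidates:
        if name.lower() == real_name.lower():
            return failures * failure_ms
        failures += 1
    # The real name was never tried, so every lookup failed.
    return failures * failure_ms


SEARCH_LIST = [
    "corp.local",
    "corp.eu.local",
    "mergedcorp.local",
    "mergedcorp.from.5years.ago.local",
    "corporate.na.local",
]

candidates = candidate_names("ehr", SEARCH_LIST)
delay = wasted_ms(candidates, "ehr.corporate.na.local", 120)
print(len(candidates))  # 6 lookups in total
print(delay)            # 600 (ms lost to the 5 failures)
```

Moving “corporate.na.local” to the top of the search list, or simply publishing the full name, drops the wasted time in this model to a single failed lookup or to zero.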
 
The impact is much larger if multiple large systems in an organization are using incorrect names. The DNS infrastructure must now have the horsepower to handle all of these requests, which shouldn’t exist in the first place. Capital costs for extra DNS capacity now come on top of the soft costs of users’ lost time. I personally recommend a DNS failure rate of no more than 20%.
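As a rough illustration of how that 20% guideline could be checked, the sketch below computes a failure rate from a list of observed DNS response codes, treating anything other than code 0 as a failure. The input format and function names are assumptions of mine; an NPM+ toolset would report this figure directly.

```python
from collections import Counter


def failure_rate(rcodes: list[int]) -> float:
    """Fraction of DNS responses whose response code is not 0 (no error)."""
    if not rcodes:
        return 0.0
    counts = Counter(rcodes)
    failed = len(rcodes) - counts[0]
    return failed / len(rcodes)


def exceeds_threshold(rcodes: list[int], threshold: float = 0.20) -> bool:
    """True when the failure rate is above the recommended ceiling."""
    return failure_rate(rcodes) > threshold
```

For example, seven successes and three name failures give a 30% failure rate, which is over the line, while eight successes and two failures sit exactly at 20% and pass.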
 
DNS Scenario Review
 
Scenario 1 Expanded: A nurse calls and reports periodic slowness just moments ago in the delivery of her EHR application.
NPM Facts: NPM tools report that your MAN/DWDM/WAN/T1/etc. is at 20% utilization; zero errors; top application is Citrix
NPM+ Facts: NPM+ toolsets are showing DNS is failing 50% of the time for names corresponding to Citrix, with a DNS error code of 2 (server failure)
Question: Can you help solve this problem? Yes. By reviewing the NPM+ data, we can see that DNS is problematic and should be reviewed. The client hasn’t even reached the Citrix platform that offers up the EHR application, so it cannot yet be experiencing performance problems there.
 
DHCP … Another Key Service Enabler
 
The Dynamic Host Configuration Protocol (DHCP) is so often ignored, yet it plays an absolutely pivotal role. DHCP is responsible for assigning IP addresses in a company’s network. Without an address, devices are simply unable to communicate at all. No EHR, no email, no modern patient care, period.
 
The impact of a faulty DHCP installation is easy to imagine. If DHCP is unable to provide addressing, then devices such as patient care terminals are unable to gain access to the EHR. This is different from the slowness scenarios discussed so far, as problems within the DHCP environment typically result in devices not getting connected at all.
 
Scenario 2: A nurse calls and reports she is unable to access her EHR application.
NPM Facts: NPM tools report that the link (MAN/DWDM/WAN/T1/etc.) is running at 20% utilization; zero errors; top application is Citrix
NPM+ Facts: NPM+ toolsets are showing DNS queries of “ehr.corporate.na.local” are 100% successful. DHCP is showing an increased failure rate for the facility the nurse is calling from.
Question: Can you help solve this problem? Yes. By reviewing the NPM+ data, we can see that DHCP is problematic and should be reviewed. The client is unable to even access the network, let alone the EHR. To non-technical staff, it’s a simple statement that the EHR is not accessible. Much like you are unable to diagnose lupus, they are unable to diagnose DHCP problems.
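The per-facility view in Scenario 2 can be sketched as a simple grouping. The input here is a made-up list of (facility, got_lease) pairs standing in for whatever DHCP transaction data your NPM+ toolset exposes; the facility names and the function name are purely illustrative.

```python
from collections import defaultdict


def dhcp_failure_rate_by_facility(attempts):
    """Map each facility to the fraction of its DHCP attempts that got no lease.

    attempts: iterable of (facility, got_lease) pairs.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for facility, got_lease in attempts:
        totals[facility] += 1
        if not got_lease:
            failures[facility] += 1
    return {facility: failures[facility] / totals[facility] for facility in totals}


# Hypothetical sample data: one facility is struggling, the other is healthy.
sample = [
    ("clinic-a", True),
    ("clinic-a", False),
    ("clinic-a", False),
    ("clinic-a", False),
    ("clinic-b", True),
    ("clinic-b", True),
]
rates = dhcp_failure_rate_by_facility(sample)
```

With this sample, “clinic-a” shows a 75% failure rate against 0% for “clinic-b”, which points the investigation at the DHCP service for the nurse’s facility and away from the EHR itself.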
 
Other Examples of Service Enablers
 
DNS and DHCP are by no means the only service enablers seen in a network environment, nor are they the only contributors to the EHR working normally. LDAP and RADIUS play a huge role. If just one of these systems has problems, it will cause daily and possibly constant delays for users, delays that have nothing to do with the “meat and potatoes” of the core EHR application or its mission in life.

Continue on to the next article in the Series

Applying APM to Healthcare; a PCPM Odyssey (Part 3)
