Monday, October 27, 2014

Virtualization Issues to Watch For With Your Call Center Software

The modern call center has benefited greatly from improvements in virtualization technology, and the widespread acceptance of the cloud contact center as a wise choice has accelerated improvements in that field.  Asterisk runs well on both actual and virtual commodity servers and has done for several years now.

As with any technology, however, virtualization is no panacea.  When deploying your call center software on virtual servers, care must be taken, as in all technology decisions.  The flexibility, ease of deployment, and quick ability to alter the server all come at a cost.  Some forget that a virtual server is not actually the full equivalent of a hardware server.

Here are a few things we have found, so you don't have to:

More virtual cores don't give you more actual performance by themselves.  This is not something that would normally be encountered in a cloud environment, but can be caused by somebody who is a little overeager with the virtualization settings.  If, for example, you configure two 4-core virtual servers and have them running simultaneously, you will not get 8 cores' worth of performance if they are running on a 4-core host.  This may be fine if each server is using only a fraction of the allocated resources, or if peak loads on each system do not coincide.  It is not fine if you are actually going to require that sort of performance on each of the servers on a regular basis.
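
The arithmetic behind the two-guest example can be checked with a short calculation.  This is only a sketch: the function name is made up for illustration, and the ratio says how far the host is overcommitted, not how any particular hypervisor will schedule the guests.

```python
def vcpu_overcommit_ratio(physical_cores, vcpus_per_guest):
    """Return allocated virtual cores divided by physical cores on the host.

    Hypothetical helper for illustration only; it is not part of any
    virtualization product's tooling.
    """
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_guest) / physical_cores


# Two 4-core guests running on a 4-core host, as in the example above.
ratio = vcpu_overcommit_ratio(4, [4, 4])
print(f"overcommit ratio: {ratio:.1f}")
```

A ratio above 1 means the guests cannot all run at full load at the same time.  Whether that matters depends on whether their peak loads coincide.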

Virtual servers can starve one another of resources.  Normally a little thought beforehand is all that is necessary to prevent these issues.  A disk-heavy application may not be a good candidate to be paired with another disk-heavy VM.  One example we have seen in the Cloud is network usage.  While moving recordings from a set of Asterisk servers, the overall usage was enough to saturate the network controller on the shared hardware.  This caused significant call quality issues, even though it appeared each server should have enough bandwidth.  Once the issue was spotted it was easy to move the servers to separate hardware.

The virtualization technology itself may have a constraint or bug.  It's occasionally the case that the virtual environment doesn't match up with the way hardware would due to a bug or design constraint.  For instance, one major technology has had an issue with being unable to spread interrupts for the ethernet controller over multiple cores.  In situations where there is a significant amount of network traffic, this can place a high amount of work on a single core, causing performance constraints.  Other virtualization technologies may have design constraints due to issues with presenting hardware features that may not exist on all supported platforms. This forces the software to perform tasks that may have been more optimally performed in the hardware. 

The virtualization technology may not be completely transparent to the system.  This is somewhat related to the last item, but in some cases attributes of the virtualization technology are not completely transparent to the guest OS.  One example we have seen a few times is the compiler not using the correct CPU type, resulting in "Illegal Instruction" errors when running Asterisk.  This can usually be resolved by passing a flag during the configure step.  Here are two we have seen used (separately) successfully:
  • ./configure CC="gcc -O3" CFLAGS=-O3 
  • ./configure CFLAGS=-mtune=native
With a small amount of planning these issues can be easily avoided or overcome, allowing for a successful use of your call center software in the cloud.

Monday, October 20, 2014

Maximizing Call Center Software Performance With Load Balancing

When adding capacity to an Asterisk-based ACD (Automatic Call Distributor) system, the desire is to increase the throughput of the system in a linear fashion.  Choosing call center software that allows the addition of servers to increase capacity is an essential step.  However, one must take certain steps to ensure that individual servers don't become a choke point for performance.  This is where load balancing comes in.

When scaling call center software, there are a few limitations that can stop you in your tracks if you are not watching for them.  Sometimes the effect is confined to a local system.  Sometimes the effect is in the design of the entire installation.

Some examples of local system issues are:

  • Limitations on I/O operations - this can be due to limitations on network traffic, hard disk read/write/seek speeds, or the sheer amount of data that can be moved per second
  • CPU usage - on Asterisk this is usually due to transcoding, mixing audio, etc.  Other times, especially in virtual environments, the CPU may have more work to do due to the lack of hardware support for direct data handling (as there is no hardware).
  • Recordings - here the limiting factor is usually disk speed, although the speed at which Asterisk can mix audio and convert it to another audio format for recording can also be a factor.
  • Virtualization software - the type of virtualization being done can introduce chokepoints for performance.  For instance, Xen (which is used by Amazon) had a history of not allowing network interrupts to be shared across cores of a CPU.  This potentially pushes too much work onto one core, limiting the amount of work that can be done while leaving processing capacity sitting idle.  Other types of virtualization don't take full advantage of hardware capacity.  There are several other ways performance can be limited, but that should probably be its own series of posts.
No matter how well tuned a single server is, it will still have capacity issues.  At some point, one of the above categories is likely to get pushed to the maximum if servers aren't added.

You can have your servers all well-tuned and ready for peak performance, and still not get the full performance of your system.  A common cause of this is a design where a certain server becomes a limiting factor in capacity.  Some examples we've seen of this are:

  • Too many agents assigned to an individual server.  That often results in too high a percentage of calls being routed through that particular server.
  • A high capacity trunk being set to register to a single server.  If all calls must go through a single server, then the capacity of that server is your choke point.
  • Too many users overall.  For web servers, it's sometimes the case that administrative users are pulling recordings and reports from a single server that is also shared by call center agents.  They can choke off system resources quite quickly in some cases if they are downloading multiple recordings at a time.

For these reasons, we encourage the even distribution of workload through load balancing.  For calls, the Q-Suite offers a High Availability SIP proxy that allows the contact center to specify which Asterisk servers calls can be load balanced over.  When calls come in, a round-robin method is used to deliver calls in a balanced manner.  If a particular server is expected to have a higher workload due to other factors, such as being shared among multiple services or being used as a registration server, it can be excluded from load balancing. Q-Suite also offers load balancing for agent web services.
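
To make the round-robin idea concrete, here is a minimal sketch of handing calls to servers in turn while leaving one server out of the rotation.  It is not the Q-Suite proxy's implementation; the function name and the server names are hypothetical and exist only for this example.

```python
from itertools import cycle


def make_round_robin(servers, excluded=()):
    """Return a function that yields the next eligible server on each call.

    `servers` and `excluded` are hypothetical server names; excluded servers
    (for example a registration server) never receive balanced calls.
    """
    skip = set(excluded)
    eligible = [s for s in servers if s not in skip]
    if not eligible:
        raise ValueError("no servers left to balance over")
    rotation = cycle(eligible)
    return lambda: next(rotation)


# Three Asterisk servers, with ast2 kept out of the rotation.
pick_server = make_round_robin(["ast1", "ast2", "ast3"], excluded=["ast2"])
for call_number in range(1, 5):
    print(f"call {call_number} -> {pick_server()}")
```

Plain rotation ignores how busy each server is at the moment.  That keeps the sketch simple, and it is also why a server known to carry extra load is better excluded up front.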

In general, load balancing can prevent issues with system design, as some of those decisions get made as calls come in.  Load balancing can also be used to cover shortcomings in local system capacity until they can be resolved, by spreading the load over multiple similar systems.  In this way, additional servers can be provisioned in a cloud contact center environment to allow production to continue while avoiding certain thresholds on single systems. In either case, it allows you to reach your full potential.

Monday, October 13, 2014

Trust, But Verify: Simple Steps to Finding Your IVR Issues

The call came first thing in the morning.  Agents were logged in, callers could dial in, but callers were not getting connected to agents.  A foundational tenet of skills-based call routing is that calls get routed to agents, so the team leapt into action.  As support techs logged in and began gathering PCAPs (Packet Captures) and poring over logs to discover the cause, we also began test calls into their IVR (Interactive Voice Response).

One diagnostic technique that can occasionally clear up issues with an IVR, queue or agents on the floor is the test call.  I am occasionally surprised when a single test call reveals the smoking gun, and frequently surprised at how many call center managers and supervisors are so resistant to picking up an extension and dialing a number on their own system.

Here are a few cases I've seen that were quickly resolved with a test call:

* A newly launched center called to report that agents were receiving "ghost calls."  That's an annoyingly non-specific description that can cover a number of cases.  In this case, the agent was receiving a call, but could not hear the client.  They would disposition the call and move on, usually receiving a few ghost calls in a row.  With the administrators all conferenced in, a test call was made.  We could hear the agent, but the agent could not hear us.  When somebody was sent to the floor, it turned out that when executing the transfer function, agents would occasionally accidentally turn down the volume of their headsets.  Agents received instruction on this, and the problem did not recur.

* A new IVR was not setting the correct information for agent retrieval when the agent received the call.  Senior users of the system pored over the IVR for far too long trying to discover the error in the dialplan.  When we called into the assigned DID (Direct Inward Dial), we found the IVR they thought they were using was not the actual IVR being hit.  It turned out that a last minute change had occurred, and the DID was now going to a different place.  Once that was discovered, the logic error in the IVR was quickly discovered and rectified.

* A very complex IVR had been copied and modified for a similar purpose.  After a few hours in production, the center noticed they weren't getting any calls into a particular queue.  When we were asked to look into it, a single test call revealed that calls meeting the most common criteria were being directed to another queue entirely.  We discovered this by asking the agent what queue we had been routed to.  The error was then quickly rectified.

* A center complained that they were not getting calls into a particular queue, even though that was one of their busiest.  We called the toll-free number that was assigned, and got another call center.  Their client controlled the DIDs at the telco level, and had rerouted some of them due to issues upstream from our client call center.  Their client hadn't bothered to notify them, but when the telco issues were resolved, calls were directed back at our client.

Other issues I have seen at various places which are easily testable with a call are:

* Poor music on hold quality.  Sometimes it sounds like you're listening to static.
* Toll free or other numbers not coming in on the expected DID.
* Numbers coming in on the expected DID, but prepended with a + or 1.

Quite honestly, there are going to be any number of problems that can be diagnosed with a test call, and it's a mystery to me why more call centers don't do semi-regular call tests.  You can uncover issues and annoyances that may be affecting your SLA (Service Level Agreement) and abandon rates.

Back to the case that I opened with, it was becoming clear that the number we were dialing did go to an IVR with similar periodic messages and music on hold, but we were not seeing our dials in the captured packets or in the logs.  This could have been indicative of a problem with logging itself, but the smoking gun was when Asterisk was restarted on the system and my 15-minute-long call was not disconnected.  Relaying that information to them allowed them to determine an old clone had become active and was somehow sitting on the IP address the inbound trunk was connecting to.  With agents on one system, and calls coming into another that the Q-Suite was unaware of, there was no chance of getting calls in.  The issue was then quickly rectified directly by the client's IT department, and they were back in business before their peak call times.

Monday, October 6, 2014

Localization and Other Benefits of Inbound Call Routing

When handling inbound calls, directing calls to the appropriate destination while balancing resource availability is key to ensuring an optimal experience for the caller.  For many call centers, skills-based routing ACD call center software is all that is needed.  Other centers may have differing demands.  In some cases, calls may be routed in different ways depending on various criteria, such as location, the DID dialed, user input or information pulled from the client record while still in the dialplan.

Any dialplan builder will have options for conditionally branching.  In cases where there are a large number of possible conditions, specifying each potential condition and branch can be tedious and time consuming.  An example from a client was postal codes.  In the IVR, the client was collecting the postal code, but needed to be able to differentiate between areas that were serviced by a local office, and those that would be handled by the center itself.  Q-Suite 5.7 has a feature called the Inbound Call Router that fit the bill.  By uploading the full set of postal codes in question along with the local office number where one existed, the client was able to use a component in the Visual IVR Builder to set the correct values needed for each postal code.  Routing the call to the local office when one existed, or to the corporate call floor in the case one did not exist or the lines were busy, ensured that callers received care from the correct source while still being able to call the number advertised in the large advertising campaign.
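
The lookup described above amounts to a table keyed by postal code with a fallback.  The sketch below shows that logic in isolation.  It is not the Inbound Call Router's code; the phone numbers, the function names and the `office_available` check are all hypothetical stand-ins.

```python
# Hypothetical number for the corporate call floor (illustration only).
CORPORATE_NUMBER = "18005550100"


def normalize_postal_code(raw):
    """Strip whitespace and upper-case what the caller entered."""
    return "".join(raw.split()).upper()


def route_by_postal_code(raw_code, local_offices, office_available=lambda number: True):
    """Return the number to dial for a caller's postal code.

    `local_offices` maps normalized postal codes to local office numbers.
    `office_available` is an assumed hook standing in for a "lines busy" check.
    """
    number = local_offices.get(normalize_postal_code(raw_code))
    if number is not None and office_available(number):
        return number
    # No local office for this code, or its lines are busy.
    return CORPORATE_NUMBER


offices = {"K1A0B1": "16135550123"}  # made-up sample data
print(route_by_postal_code("k1a 0b1", offices))
print(route_by_postal_code("V6B 1A1", offices))
```

Keeping the table as uploaded data rather than as dialplan branches is what saves specifying each condition by hand.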

Using the Q-Suite Inbound Call Router, a call center can not only direct the call appropriately, but can also automatically set values that can be sent at the time the call is presented to the agent via the agent scripts built in the script builder.  This allows details such as localization data to be sent to the agent as well, improving the caller’s overall experience and allowing the call center to ensure the agent handles the call in the correct manner.

Monday, September 29, 2014

Tradeoffs with Answering Machine Detection on Asterisk

Answering Machine Detection (AMD) is something that interests everyone running a predictive dialer or other automatic dialing.  As discussed previously, the changing world of outbound dialing and telephony leaves call center software users looking for ways to wring additional efficiencies from their lead lists and call floors, and AMD appears to be one of those ways.

Put simply, AMD allows the calls that do connect to be screened for the presence of a live person or an answering machine (or IVR in the case of businesses).  Call centers that have a high percentage of answering machine hits often find that these calls take up too much of their agents' time, and would prefer to have those machines handled automatically.  This analysis is sometimes flawed.  What they often fail to appreciate is that the accuracy of Asterisk AMD is not 100%.  It is usually not even 90%, and the highest rates of accuracy are dependent on correct tuning of the parameters Asterisk makes available.  Badly configured settings can result in an accuracy rate of near 0% - it may identify every call as a machine.

Even in the case that AMD is tuned correctly and reaches an accuracy of 85%, this can still cause issues.  You may be missing a significant percentage of the live people the dialer is able to connect you to.  If you are losing 15% of the contacts you actually reach, this may put a serious crimp in your business.  Furthermore, you may have regulatory concerns if you are not handling those calls correctly, and they may count as dropped calls. Of course, if the number of connects is actually mostly answering machines, the cost of handling all those calls may make the tradeoffs well worth it.
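
The tradeoff is easier to judge with rough numbers.  The sketch below assumes, purely for illustration, that the same accuracy applies to live answers and to machines.  Real AMD error rates are usually not symmetric, and the call volume and live fraction used here are made up.

```python
def amd_outcomes(connected_calls, live_fraction, accuracy):
    """Estimate where connected calls end up when AMD screens them.

    Simplifying assumption: `accuracy` applies equally to live people and
    to answering machines.  All inputs are illustrative.
    """
    live = connected_calls * live_fraction
    machines = connected_calls - live
    return {
        "live_reached": live * accuracy,                # live people passed to agents
        "live_lost": live * (1 - accuracy),             # live people misread as machines
        "machines_screened": machines * accuracy,       # machines kept away from agents
        "machines_to_agents": machines * (1 - accuracy),
    }


# 1,000 connects, 40% of them live people, AMD at 85% accuracy.
for outcome, count in amd_outcomes(1000, 0.40, 0.85).items():
    print(f"{outcome}: {count:.0f}")
```

With these made-up inputs, roughly 60 of the 400 live contacts are lost in exchange for keeping roughly 510 machine answers away from agents.  Changing the live fraction shows how quickly that balance shifts.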

Even with these caveats, well-tuned AMD in the correct environment can be a boon to your business.  For this reason, the Q-Suite predictive dialer has included direct access to the AMD settings on a per-campaign basis for years in a convenient location, allowing power users to tune values selectively, while making it easy to use it with the default settings.

Lately, Sangoma's Lyra AMD product has been gaining traction in the Asterisk-based call center software arena.  Sangoma has been a player in the Asterisk market for years with their line of products, and improved AMD is certainly welcome.  With their patent-pending technology and claims of improved accuracy, the cost of the product may be quickly recouped in centers where answering machine handling is a large cost.  Clients have demanded the ability to use this technology with the Q-Suite, and so development work has been done to allow the use of Lyra with the Q-Suite.  Indosoft believes that adding this option with their predictive dialer gives clients access to the additional tools they need to manage the tradeoffs in using Asterisk AMD.

Friday, September 26, 2014

Call Center Software in the Cloud Still Requires Security Considerations

One of the advantages of moving call center software to the Cloud is the lower cost of managing your infrastructure.  Despite this, it is still vitally important that you do manage the infrastructure.  Security holes are always being discovered, and keeping your software up to date and following best practices will help keep your platform from being exploited.

For most of the last decade, Indosoft has preferred a Debian-based server operating system, but has needed to stay with Bash to avoid errors in a third-party configuration tool.  Recent news is therefore of concern, so it is a relief that repair of the Shellshock vulnerability on Debian and Ubuntu systems is so straightforward. 

As long as one is mindful of security updates and best practices, running call center software in the Cloud can offer benefits to the enterprise.  Keeping software updated, setting up firewalls, and closing known vulnerabilities will allow you to focus most of your attention where it should be.

Monday, September 22, 2014

Warm Leads and Outbound Dialing in Today's Environment

Outbound dialing in the call center has undergone a revolutionary change in the past decade.  In October of 2004, the Supreme Court of the United States allowed a ruling from a lower court to stand that enabled the FTC Do Not Call regulations.  The widespread registration of home phones, along with restrictions on dialing cellphones (and their increasing share of the number of phones outstanding), signaled a massive shift in the way outbound contact centers would operate.  Automatic or predictive dialing was not killed off then, but it has been in critical condition since.

These days, outbound dialing campaigns have to be conducted much more conservatively.  Quality leads are harder to get, increasing the expense of obtaining and contacting them.  Many outbound-oriented call centers are increasingly dialing warm, hot, live or immediate leads.  There are several ways of referring to these leads as well as several ways of collecting them.  Such leads may be obtained by referral, by client-generated inquiry on a web site or inbound call center, or from a larger set of leads with an applied business logic.  In any case, what they share is immediacy; their value is high when called immediately, and in many cases drops quickly with time.

Call center software that handles outbound campaigns must provide a way to insert leads into the queue quickly and often on a next-number-dialed basis.  The Q-Suite provides several ways to queue leads immediately, with the new lead either going to the next available agent or being added to the end of the queue so as to be handled in the next little while.  It also makes it possible to script business logic, apply time-of-day rules, or integrate with another CRM that provides leads on an agent-driven or CRM-driven basis.  This flexibility, along with the capability to preview a lead, dial on a one-to-one basis, or even run predictive campaigns, is increasingly important in today’s environment and will be needed even more in the future.
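
The difference between next-number-dialed insertion and adding a lead to the end of the queue can be sketched with a double-ended queue.  This is an illustration only, not how the Q-Suite stores leads; the class and method names are hypothetical.

```python
from collections import deque


class LeadQueue:
    """Toy lead queue supporting immediate and end-of-queue insertion."""

    def __init__(self, leads=()):
        self._leads = deque(leads)

    def add_immediate(self, lead):
        """Put a warm lead at the front so it is the next number dialed."""
        self._leads.appendleft(lead)

    def add_to_end(self, lead):
        """Queue a lead behind everything already waiting."""
        self._leads.append(lead)

    def next_lead(self):
        """Return the next lead to dial, or None when the queue is empty."""
        return self._leads.popleft() if self._leads else None


queue = LeadQueue(["lead-a", "lead-b"])
queue.add_to_end("web-inquiry")     # handled in the next little while
queue.add_immediate("hot-transfer") # handled by the next available agent
print(queue.next_lead())
```

A real dialer would add business logic and time-of-day rules in front of a structure like this.  The sketch shows only the ordering.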