Monday, October 20, 2014

Maximizing Call Center Software Performance With Load Balancing

When adding capacity to an Asterisk-based ACD (Automatic Call Distributor) system, the desire is to increase the throughput of the system in a linear fashion.  Choosing call center software that allows the addition of servers to increase capacity is an essential step.  However, one must take certain steps to ensure that individual servers don't become a choke point for performance.  This is where load balancing comes in.

When scaling call center software, there are a few limitations that can stop you in your tracks if you are not watching for them.  Sometimes the effect is confined to a local system.  Sometimes the effect is in the design of the entire installation.

Some examples of local system issues are:

  • Limitations on I/O operations - this can be due to limitations on network traffic, hard disk read/write/seek speeds, or the sheer amount of data that can be moved per second
  • CPU usage - on Asterisk this is usually due to transcoding, mixing audio, and similar work.  Other times, especially in virtual environments, the CPU may have more work to do because there is no physical hardware to handle data directly.
  • Recordings - here the limiting factor is usually disk speed, although the speed at which Asterisk can mix audio and convert it to another audio format for recording can also be a constraint.
  • Virtualization software - the type of virtualization in use can introduce choke points for performance.  For instance, Xen (which is used by Amazon) had a history of not allowing network interrupts to be shared across the cores of a CPU.  This can push too much work onto one core, limiting the amount of work that can be done while leaving processing capacity sitting idle.  Other types of virtualization don't take full advantage of hardware capacity.  There are several other ways performance can be limited, but that should probably be its own series of posts.

No matter how well tuned a single server is, it will still have capacity issues.  At some point, one of the above categories is likely to get pushed to the maximum if servers aren't added.

You can have your servers all well tuned and ready for peak performance, and still not get the full performance out of your system.  A common cause of this is a design where a certain server becomes a limiting factor in capacity.  Some examples we've seen of this are:

  • Too many agents assigned to an individual server.  That often results in too high a percentage of calls being routed through that particular server.
  • A high capacity trunk being set to register to a single server.  If all calls must go through a single server, then the capacity of that server is your choke point.
  • Too many users overall.  For web servers, it's sometimes the case that administrative users are pulling recordings and reports from a single server that is also shared by call center agents.  They can choke off system resources quite quickly in some cases if they are downloading multiple recordings at a time.

For these reasons, we encourage the even distribution of workload through load balancing.  For calls, the Q-Suite offers a High Availability SIP proxy that allows the contact center to specify which Asterisk servers calls can be load balanced over.  When calls come in, a round-robin method is used to deliver calls in a balanced manner.  If a particular server is expected to have a higher workload due to other factors, such as being shared among multiple services or being used as a registration server, it can be excluded from load balancing. Q-Suite also offers load balancing for agent web services.
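To make the round-robin idea concrete, here is a minimal sketch in Python.  It is not how the Q-Suite SIP proxy is implemented; the function name and server names are hypothetical, and it only illustrates rotating calls over the servers that have not been excluded from load balancing.

```python
from itertools import cycle


def build_rotation(servers, excluded=()):
    """Return an endless round-robin iterator over the eligible servers."""
    skip = set(excluded)
    eligible = [s for s in servers if s not in skip]
    if not eligible:
        raise ValueError("no servers left to balance over")
    return cycle(eligible)


# Hypothetical names; "ast3" doubles as the registration server, so it is excluded.
rotation = build_rotation(["ast1", "ast2", "ast3"], excluded=["ast3"])
first_four = [next(rotation) for _ in range(4)]
print(first_four)  # ['ast1', 'ast2', 'ast1', 'ast2']
```

Each incoming call simply takes the next server in the rotation, so the excluded server never receives balanced traffic.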

In general, load balancing can prevent issues with system design, as some of those decisions get made as calls come in.  Load balancing can also be used to cover shortcomings in local system capacity until they can be resolved, by spreading the load over multiple similar systems.  In this way, additional servers can be provisioned in a cloud contact center environment to allow production to continue while avoiding certain thresholds on single systems. In either case, it allows you to reach your full potential.

Monday, October 13, 2014

Trust, But Verify: Simple Steps to Finding Your IVR Issues

The call came first thing in the morning.  Agents were logged in, callers could dial in, but callers were not getting connected to agents.  A foundational tenet of skills-based call routing is that calls get routed to agents, so the team leapt into action.  As support techs logged in and began gathering PCAPs (Packet Captures) and poring over logs to discover the cause, we also began test calls into their IVR (Interactive Voice Response).

One diagnostic technique that can occasionally clear up issues with an IVR, queue or agents on the floor is the test call.  I am occasionally surprised when a single test call reveals the smoking gun, and frequently surprised at how many call center managers and supervisors are so resistant to picking up an extension and dialing a number on their own system.

Here are a few cases I've seen that were quickly resolved with a test call:

* A newly launched center called to report that agents were receiving "ghost calls."  That's an annoyingly non-specific description that can cover a number of cases.  In this case, the agent was receiving a call, but could not hear the client.  They would disposition the call and move on, usually receiving a few ghost calls in a row.  With the administrators all conferenced in, a test call was made.  We could hear the agent, but the agent could not hear us.  When somebody was sent to the floor, it turned out that agents would occasionally turn down the volume of their headsets by accident while executing the transfer function.  Agents received instruction on this, and the problem did not recur.

* A new IVR was not setting the correct information for agent retrieval when the agent received the call.  Senior users of the system pored over the IVR for far too long trying to find the error in the dialplan.  When we called into the assigned DID (Direct Inward Dial), we found the IVR they thought they were using was not the actual IVR being hit.  It turned out that a last-minute change had occurred, and the DID was now going to a different place.  Once that was known, the logic error in the IVR was quickly found and rectified.

* A very complex IVR had been copied and modified for a similar purpose.  After a few hours in production, the center noticed they weren't getting any calls into a particular queue.  When we were asked to look into it, a single test call revealed that calls meeting the most common criteria were being directed to another queue entirely.  We discovered this by asking the agent what queue we had been routed to.  The error was then quickly rectified.

* A center complained that they were not getting calls into a particular queue, even though that was one of their busiest.  We called the toll-free number that was assigned, and got another call center.  Their client controlled the DIDs at the telco level, and had rerouted some of them due to issues upstream from our client call center.  Their client hadn't bothered to notify them, but when the telco issues were resolved, calls were directed back at our client.

Other issues I have seen at various places which are easily testable with a call are:

* Poor music on hold quality.  Sometimes it sounds like you're listening to static.
* Toll free or other numbers not coming in on the expected DID.
* Numbers coming in on the expected DID, but prepended with a + or 1.
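The last item is easy to picture with a small sketch.  Assuming ten-digit North American numbers (my assumption, along with the function name), something like the following shows why "+18005550100", "18005550100" and "8005550100" can miss a DID match unless they are normalized to one form first:

```python
def normalize_did(raw, country_code="1", national_length=10):
    """Reduce a dialed number to its national digits.

    Assumes a fixed-length national number; other numbering plans
    would need different rules.
    """
    digits = "".join(ch for ch in raw if ch.isdigit())
    has_prefix = (
        len(digits) == national_length + len(country_code)
        and digits.startswith(country_code)
    )
    if has_prefix:
        digits = digits[len(country_code):]
    return digits


for sample in ("+18005550100", "18005550100", "8005550100"):
    print(sample, "->", normalize_did(sample))  # each prints 8005550100
```

A test call tells you which of the three forms your carrier is actually sending.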

Quite honestly, there are going to be any number of problems that can be diagnosed with a test call, and it's a mystery to me why more call centers don't do semi-regular call tests.  You can uncover issues and annoyances that may be affecting your SLA (Service Level Agreement) and abandon rates.

Back to the case that I opened with, it was becoming clear that the number we were dialing did go to an IVR with similar periodic messages and music on hold, but we were not seeing our dials in the captured packets or in the logs.  This could have been indicative of a problem with logging itself, but the smoking gun was when Asterisk was restarted on the system and my 15 minute-long call was not disconnected.  Relating that information to them allowed them to determine an old clone had become active and was somehow sitting on the IP address the inbound trunk was connecting to.  With agents on one system, and calls coming into another that the Q-Suite was unaware of, there was no chance of getting calls in.  The issue was then quickly rectified directly by the client's IT department, and they were back in business before their peak call times.

Monday, October 6, 2014

Localization and Other Benefits of Inbound Call Routing

When handling inbound calls, directing calls to the appropriate destination while balancing resource availability is key to ensuring an optimal experience for the caller.  For many call centers, skills-based routing ACD call center software is all that is needed.  Other centers may have differing demands.  In some cases, calls may be routed in different ways depending on various criteria, such as location, the DID dialed, user input or information pulled from the client record while still in the dialplan.

Any dialplan builder will have options for conditionally branching.  In cases where there are a large number of possible conditions, specifying each potential condition and branch can be tedious and time-consuming.  An example from a client was postal codes.  In the IVR, the client was collecting the postal code, but needed to be able to differentiate between areas that were serviced by a local office, and those that would be handled by the center itself.  Q-Suite 5.7 has a feature called the Inbound Call Router that fit the bill.  By uploading the full set of postal codes in question along with the local office number where one existed, the client was able to use a component in the Visual IVR Builder to set the correct values needed for each postal code.  Routing the call to the local office when one existed, or to the corporate call floor when one did not exist or the lines were busy, ensured that callers received care from the correct source while still being able to call the number advertised in the large advertising campaign.
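The routing decision described above amounts to a table lookup with a fallback.  The sketch below is only an illustration of that logic, not the Inbound Call Router itself; the function names, the "corporate" destination and the sample postal codes and numbers are all made up.

```python
CORPORATE_QUEUE = "corporate"  # hypothetical fallback destination


def load_routes(rows):
    """Build a lookup from (postal_code, office_number) rows.

    Rows with a blank office number are skipped, so those postal
    codes fall through to the corporate call floor.
    """
    return {
        code.replace(" ", "").upper(): number
        for code, number in rows
        if number
    }


def route_call(postal_code, routes, busy_offices=frozenset()):
    """Return the local office number, or the corporate floor as a fallback."""
    office = routes.get(postal_code.replace(" ", "").upper())
    if office is None or office in busy_offices:
        return CORPORATE_QUEUE
    return office


routes = load_routes([("B3H 4R2", "9025550101"), ("K1A 0B1", "")])
print(route_call("b3h4r2", routes))   # 9025550101
print(route_call("K1A 0B1", routes))  # corporate
```

Uploading the table once replaces what would otherwise be one conditional branch per postal code in the dialplan.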

Using the Q-Suite Inbound Call Router, a call center can not only direct the call appropriately, but can also automatically set values that can be sent at the time the call is presented to the agent via the agent scripts built in the script builder.  This allows details such as localization data to be sent to the agent as well, improving the caller’s overall experience and allowing the call center to ensure the agent handles the call in the correct manner.

Monday, September 29, 2014

Tradeoffs with Answering Machine Detection on Asterisk

Answering Machine Detection (AMD) is something that interests everyone running a predictive dialer or other automatic dialing.  As discussed previously, the changing world of outbound dialing and telephony leaves call center software users looking for ways to wring additional efficiencies from their lead lists and call floors, and AMD appears to be one of those ways.

Put simply, AMD allows the calls that do connect to be screened for the presence of a live person or an answering machine (or IVR in the case of businesses).  Call centers that have a high percentage of answering machine hits often find that these calls take up too much of their agents' time, and would prefer to have those machines handled automatically.  This analysis is sometimes flawed.  What they often fail to appreciate is that the accuracy of Asterisk AMD is not 100%, and usually not even 90%.  The highest rates of accuracy depend on correct tuning of the parameters Asterisk makes available.  Badly configured settings can result in an accuracy rate of near 0% - it may identify every call as a machine.

Even in the case that AMD is tuned correctly and reaches an accuracy of 85%, this can still cause issues.  You may be missing a significant percentage of the live people the dialer is able to connect you to.  If you are losing 15% of the contacts you actually reach, this may put a serious crimp in your business.  Furthermore, you may have regulatory concerns if you are not handling those calls correctly, and they may count as dropped calls. Of course, if the number of connects is actually mostly answering machines, the cost of handling all those calls may make the tradeoffs well worth it.
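A rough back-of-the-envelope calculation makes the tradeoff easier to see.  The sketch below assumes, for simplicity, that the accuracy figure applies equally to live answers and machines, which will not be exactly true in practice; the function name and the sample numbers are illustrative only.

```python
def amd_tradeoff(connects, machine_share, accuracy):
    """Estimate what AMD does to a batch of connected calls.

    Simplifying assumption: the same accuracy applies to live
    answers and to answering machines.
    """
    live = connects * (1 - machine_share)
    machines = connects * machine_share
    return {
        "live_lost": live * (1 - accuracy),          # people treated as machines
        "machines_screened": machines * accuracy,    # machines kept from agents
        "calls_to_agents": live * accuracy + machines * (1 - accuracy),
    }


result = amd_tradeoff(connects=1000, machine_share=0.6, accuracy=0.85)
print({name: round(value) for name, value in result.items()})
# {'live_lost': 60, 'machines_screened': 510, 'calls_to_agents': 430}
```

With these made-up numbers, agents handle 430 calls instead of 1000, at the price of about 60 live contacts that never reach them.  Whether that is a good trade depends on what a live contact is worth to you.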

Even with these caveats, well-tuned AMD in the correct environment can be a boon to your business.  For this reason, the Q-Suite predictive dialer has included direct access to the AMD settings on a per-campaign basis for years in a convenient location, allowing power users to tune values selectively, while making it easy to use it with the default settings.

Lately, Sangoma's Lyra AMD product has been gaining traction in the Asterisk-based call center software arena.  Sangoma has been a player in the Asterisk market for years with their line of products, and improved AMD is certainly welcome.  With their patent-pending technology and claims of improved accuracy, the cost of the product may be quickly recouped in centers where answering machine handling is a large cost.  Clients have demanded the ability to use this technology with the Q-Suite, and so development work has been done to allow the use of Lyra with the Q-Suite.  Indosoft believes that adding this option to its predictive dialer gives clients access to the additional tools they need to manage the tradeoffs in using Asterisk AMD.

Friday, September 26, 2014

Call Center Software in the Cloud Still Requires Security Considerations

One of the advantages of moving call center software to the Cloud is the lower cost of managing your infrastructure.  Despite this, it is still vitally important that you do manage the infrastructure.  Security holes are always being discovered, and keeping your software up to date and following best practices will help keep your platform from being exploited.

For most of the last decade, Indosoft has preferred a Debian-based server operating system, but has needed to stay with Bash to avoid errors in a third-party configuration tool.  Recent news is therefore of concern, so it is a relief that repair of the Shellshock vulnerability on Debian and Ubuntu systems is so straightforward. 

As long as one is mindful of security updates and best practices, running call center software in the Cloud can offer benefits to the enterprise.  Keeping software updated, setting up firewalls, and closing known vulnerabilities will allow you to focus most of your attention where it should be.

Monday, September 22, 2014

Warm Leads and Outbound Dialing in Today's Environment

Outbound dialing in the call center has undergone a revolutionary change in the past decade.  In October of 2004, the Supreme Court of the United States allowed a ruling from a lower court to stand that enabled the FTC Do Not Call regulations.  The widespread registration of home phones, along with restrictions on dialing cellphones (and their increasing share of the number of phones outstanding), signaled a massive shift in the way outbound contact centers would operate.  Automatic or predictive dialing was not killed off then, but it has been in critical condition since.

These days, outbound dialing campaigns have to be conducted much more conservatively.  Quality leads are harder to get, increasing the expense of obtaining and contacting them.  Many outbound-oriented call centers are increasingly dialing warm, hot, live or immediate leads.  There are several ways of referring to these leads as well as several ways of collecting them.  Such leads may be obtained by referral, by client-generated inquiry on a web site or inbound call center, or from a larger set of leads with an applied business logic.  In any case, what they share is immediacy; their value is high when called immediately, and in many cases drops quickly with time.

Call center software that handles outbound campaigns must provide a way to insert leads into the queue quickly and often on a next-number-dialed basis.  The Q-Suite provides several ways to queue leads immediately: the new lead can go to the next available agent, or it can be added to the end of the queue so as to be handled in the next little while.  It also makes it possible to script business logic, apply time-of-day rules, or integrate with another CRM that provides leads on an agent-driven or CRM-driven basis.  This flexibility, along with the capability to preview a lead, dial on a one-to-one basis, or even run predictive campaigns, is increasingly important in today’s environment and will be needed even more in the future.
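The difference between "next number dialed" and "end of the queue" can be sketched with a double-ended queue.  This is an illustration only, with a hypothetical class and lead names; it says nothing about how the Q-Suite stores its leads.

```python
from collections import deque


class LeadQueue:
    """Toy lead queue supporting both insertion styles."""

    def __init__(self, leads=()):
        self._leads = deque(leads)

    def add_next(self, lead):
        """Hot lead: the next available agent gets it."""
        self._leads.appendleft(lead)

    def add_last(self, lead):
        """Warm lead: handled once the current queue drains."""
        self._leads.append(lead)

    def next_lead(self):
        """Return the next lead to dial, or None if the queue is empty."""
        return self._leads.popleft() if self._leads else None


queue = LeadQueue(["cold-1", "cold-2"])
queue.add_last("warm-1")
queue.add_next("hot-1")
dial_order = [queue.next_lead() for _ in range(4)]
print(dial_order)  # ['hot-1', 'cold-1', 'cold-2', 'warm-1']
```

Since the value of an immediate lead drops quickly with time, the choice between the two insertion points is a business decision as much as a technical one.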

Monday, September 15, 2014

Using Agent Skill Priority in Skills-Based Routing

Skill priority is a topic that has been covered before in this blog, in the case where skills are being used and we want to ensure that calls are routed evenly.  However, we have found that some are unclear on the benefits of assigning differing levels of skill to agents.

We find there are a few different cases where assigning differing skill levels to agents is useful in skills-based ACD software like the Q-Suite:

  • Some agents have a weaker skill and should only receive calls if other agents stronger in that skill are currently unavailable.  A typical example would be language skills - the agent whose Spanish is weak should only receive calls from the Spanish queue if no other agents who are stronger in Spanish are available.
  • Supervisors or other users who are not normally on the phones can be assigned to a queue that must be answered, receiving calls only if no other agents are available.  This could be a signal that the queue is over capacity, or a mechanism to have more capacity at peak times.
  • In a training situation, you may use skill level to ensure that trainees receive as many calls as possible in order to get up to speed quickly, causing more experienced agents to get bumped if there are two agents waiting for a call.
  • Agents from related queues may be assigned a lower level of skill so that in peak periods there are enough agents to take calls, while under normal circumstances the agents are employed in other queues.

Skill priorities in combination with queue priorities can be a great tool to ensure that your calls are handled in a timely manner by agents competent to handle the call while reserving your best agents for the more challenging calls, or quickly giving less-experienced call center agents the practice they need.  After a little time in production, it can quickly become apparent how a change in skilling or queue priorities can improve efficiency.
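The cases above all come down to one selection rule: among the available agents holding a skill, offer the call to the one with the highest level.  The sketch below assumes that a higher number means a stronger skill and that ties go to the agent who has been idle longest.  Both are assumptions for illustration, as are the names, and this is not the Q-Suite's actual routing code.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    skills: dict = field(default_factory=dict)  # skill name -> level (higher = stronger)
    available: bool = True
    idle_seconds: int = 0


def pick_agent(agents, skill):
    """Pick the available agent with the highest level in the given skill."""
    candidates = [a for a in agents if a.available and skill in a.skills]
    if not candidates:
        return None
    # Highest skill level wins; longest idle time breaks ties.
    return max(candidates, key=lambda a: (a.skills[skill], a.idle_seconds))


floor = [
    Agent("Ana", {"spanish": 5}, available=False),
    Agent("Ben", {"spanish": 2}, idle_seconds=30),
    Agent("Cal", {"spanish": 4}, idle_seconds=10),
]
print(pick_agent(floor, "spanish").name)  # Cal
```

Ben, whose Spanish is weaker, only gets the call once Cal is busy too.  Giving trainees the higher level instead produces the training behaviour described in the list.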