It’s estimated that the number of global companies employing Robotic Process Automation (RPA) solutions will rise from 53% to 72% in the next two years — and within five years, RPA will achieve near-universal adoption.
So, in a market currently saturated with AI hype, it would probably be helpful to understand what RPA is, what it isn’t, and where it fits into the apparently inexorable march towards AI-driven automation.
To illustrate this, we’ll also take an in-depth look at the specifics of a ‘trouble-shooting’ automation workflow, using UiPath.
The scope of Robotic Process Automation
Until recently, process automation was a relatively obscure pursuit, either because it operated at an arcane and expensive level (server-side scripting and brittle, custom-built workflows) or at a semi-amateur one (prosumer software such as AutoHotkey, AppleScript/Automator, QuicKeys, and the infamous Visual Basic macros).
Arguably, the rise of interest in RPA in the last three years has been fueled by business interest in artificial intelligence solutions which are:
- Ready to use now.
- Proven to deliver company performance benefits through process automation.
- Able to leverage emerging machine learning sectors such as Natural Language Processing (NLP) and Optical Character Recognition (OCR).
- Capable of evolving in themselves, as the supporting technologies that they use mature.
- Capable of automating the existing tools that humans are currently using, without requiring the massive and costly paradigm shift that full-fledged ‘AI first’ systems are promising a few years down the line.
Of the small number of major global companies which currently specialize in providing scalable RPA software solutions, we have chosen to work with the UiPath open enterprise platform, which offers powerful GUI automation flows backed up by programmatic and flexible AI components.
In a previous post, we outlined the development of a banking-based client approval workflow in UiPath. This time, we’ll see how UiPath can massively speed up a routine and repetitive administrative chore by directly accessing all the familiar applications that a human user needs for the task — but running through it all at lightning speed in a Virtual Machine.
What happened to headless workflows?
One question that might arise as we work through the steps is: Why are we imitating user clicks and GUI navigation? We’re creating an automated process (running on a Virtual Machine, to boot) that could surely be broken down into JSON-based alerts with CLI-based, programmatic responses, and other ‘headless’ methodologies.
The shortest answer is probably that we want to automate this process today and be done with it. We want an intuitive, visual UI with enough power to observe our actions and repeat them on demand or on a schedule, but also enough flexibility to permit higher-level input where necessary, such as injectable secondary scripting languages and even hard code (C#, VB, etc.) in certain modules of the process.
Additionally, it would likely prove expensive, or even impossible, to directly interact with the Application Programming Interface (API) of every possible program we might want to include in our workflows. That’s assuming that the program even has an API available and that its calls expose the data we need to get at in a timely fashion — or at all.
Furthermore, our RPA solution might be addressing a short-term or minor business need that can’t justify major consulting or outsourcing solutions, or in-house development time; or else be aimed at business data streams that offer no alternative, non-manual method of data access.
During some of the pacier periods of development, hours sometimes get logged by people who are not yet associated with the project that they’re claiming time for. This ‘orphaned’ credit makes the Jira project statistics inaccurate, and can also stop contractors from getting paid.
To prevent this, we have set up Intervals to send an automated alert email to a dedicated Gmail account each time this error comes up.
The email subject line contains the three pieces of information that we need to base our UiPath RPA process on: the person’s name, their email address, and the project’s code number.
We’ll use the name to identify the employee in the time-tracking database; the project code to check if the employee is still not associated with that project (the problem may have been fixed manually by the time the robot addresses it); and, optionally, the email address to inform the employee that the records have been rectified.
Automating record correction with UiPath
1: Activating the process from an automated email alert
Our UiPath process executes at intervals as a cron job on a virtual machine running a desktop installation of Windows.
Though there is a great deal of simulated clicking in a typical UiPath routine, in this case it does at least begin more elegantly, by accessing a dedicated Gmail account over IMAP.
(The passwords to access any accounts used in this RPA sequence are stored and sent encrypted. Neither the program nor any UiPath user with access to it is able to read these passwords once they’re integrated into the workflow)
The Gmail tag that our special alerts are filtered into might later contain other types of incident reports. Therefore, this particular UiPath routine looks for a very specific set of words at the beginning of the subject line.
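For readers who want to see what this polling step amounts to outside UiPath, here is a minimal Python sketch using the standard `imaplib` module. It is not the UiPath activity itself. The key phrase `Unassigned time entry:`, the default mailbox name, and the function names are all assumptions made for illustration, since the article does not spell out the real subject text.

```python
import email
import imaplib
from email.header import decode_header, make_header

# Hypothetical key phrase; the real alert wording is not given in the article.
ALERT_PREFIX = "Unassigned time entry:"


def connect(user, password, host="imap.gmail.com"):
    """Open an authenticated IMAP-over-SSL session (needs network access)."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    return conn


def unread_alert_subjects(conn, label="INBOX", prefix=ALERT_PREFIX):
    """Return (message_number, subject) pairs for unread alerts under `label`."""
    conn.select(f'"{label}"')
    # IMAP's SUBJECT criterion matches substrings anywhere in the subject,
    # so the "starts with" rule is re-checked in Python below.
    _, data = conn.search(None, "UNSEEN", "SUBJECT", f'"{prefix}"')
    results = []
    for num in data[0].split():
        # BODY.PEEK reads the header without marking the message as read.
        _, parts = conn.fetch(num, "(BODY.PEEK[HEADER.FIELDS (SUBJECT)])")
        msg = email.message_from_bytes(parts[0][1])
        subject = str(make_header(decode_header(msg.get("Subject", ""))))
        if subject.startswith(prefix):
            results.append((num, subject))
    return results


def mark_as_read(conn, num):
    """Set the Seen flag so the alert is not processed a second time."""
    conn.store(num, "+FLAGS", "\\Seen")
```

Passing the connection in as a parameter keeps the filtering logic separate from the login step, so it can be exercised against a stand-in object without touching a real mailbox.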
2: Extracting information
If it finds an unread mail with this key phrase in the subject line, UiPath splits the employee’s name, email address, and associated project code out of the subject line and into the three variables noted earlier. These will be used to query and update the Intervals database.
It also marks the email as read, so that it is not automatically processed a second time.
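The extraction step is essentially a pattern match on one line of text. The Python sketch below shows one way it could look, assuming a purely hypothetical subject layout of `Unassigned time entry: Name <email> [project code]`. The article does not give the real layout, so the pattern, the `Alert` type, and the sample values are illustrative only.

```python
import re
from typing import NamedTuple, Optional


class Alert(NamedTuple):
    name: str
    email: str
    project_code: str


# Assumed layout: "Unassigned time entry: Jane Doe <jane@example.com> [PRJ-1042]"
SUBJECT_RE = re.compile(
    r"^Unassigned time entry:\s*"
    r"(?P<name>[^<]+?)\s*"
    r"<(?P<email>[^<>\s]+@[^<>\s]+)>\s*"
    r"\[(?P<code>[^\]]+)\]\s*$"
)


def parse_alert_subject(subject: str) -> Optional[Alert]:
    """Split a subject line into the three variables, or return None."""
    match = SUBJECT_RE.match(subject)
    if match is None:
        # Not one of our alerts (or a changed format): leave it alone.
        return None
    return Alert(match["name"], match["email"], match["code"])
```

Returning `None` for anything that does not match means that other incident reports landing under the same Gmail tag are skipped rather than misread.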
3: Locating the employee in the database
Now it’s time for the RPA process to start imitating human behavior and interacting directly with applications.
UiPath can identify and remember interface elements, such as text fields and buttons, from style tags, their place in a nested document structure, or even using image recognition (in the case of interfaces that render through an opaque graphics layer that can’t be explored directly).
The UiPath routine opens a new browser instance and logs securely into the Intervals database. It inputs the name variable that was extracted from the Gmail alert and executes a search to find the employee in the Intervals database.
It turns out that the employee has an additional middle name, which UiPath will need to know in order to correctly add the person to the project.
4: Assigning the employee to the project
After taking note of the full employee name, UiPath uses the project code (which it earlier extracted from the Gmail subject line) to find the project page in Intervals. It then analyses the column on the right to see if the employee has already been listed among the assigned project staff.
If the employee is not already assigned, UiPath scrolls down the project page and ticks the box which will fix the problem.
UiPath saves the change and then monitors the page for the notification which confirms that the worker is now associated with the project that they are claiming time for.
As a secondary check, UiPath also searches for the employee’s name in the refreshed list of people associated with the project. If either of these two indicators of success is missing, the RPA process automatically notifies a qualified team member to give the problem further attention.
Otherwise, a notification about the successful amendment can be sent to the email address that UiPath extracted from the Gmail subject line at the beginning.
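The decision logic in this last step can be summarized in a few lines of Python. This is only a sketch of the branching described above, not UiPath code. The names `Outcome`, `classify_run`, and `notification_recipient` are hypothetical, and treating an already-assigned employee as a silent no-op is an assumption, since the article does not say what the robot does in that case.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    ALREADY_ASSIGNED = "already assigned, nothing to do"
    FIXED = "assigned and verified"
    NEEDS_ATTENTION = "escalate to a team member"


def classify_run(listed_before: bool,
                 confirmation_shown: bool,
                 listed_after: bool) -> Outcome:
    """Map what the robot observed on the project page to an outcome."""
    if listed_before:
        # The problem was fixed manually before the robot got to it.
        return Outcome.ALREADY_ASSIGNED
    # Both indicators of success must be present after saving.
    if confirmation_shown and listed_after:
        return Outcome.FIXED
    return Outcome.NEEDS_ATTENTION


def notification_recipient(outcome: Outcome,
                           employee_email: str,
                           team_email: str) -> Optional[str]:
    """Pick who should be emailed, if anyone (assumed routing)."""
    if outcome is Outcome.FIXED:
        return employee_email
    if outcome is Outcome.NEEDS_ATTENTION:
        return team_email
    return None
```

Requiring both indicators before reporting success means that a save which shows the confirmation banner but does not update the staff list is still escalated to a person.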
A modular future
RPA processes like this can string together even dozens of applications that have no other universal way to cooperate. They can also breathe new life into legacy in-house software architectures which might otherwise be prohibitively expensive to automate.
We’ve seen here just how modular an RPA process is, breaking down complex problems into manageable stages, where each stage may make use of wildly differing approaches, from structural page and text analysis through to image recognition, or even live-user intervention as necessary (for example, to complete CAPTCHA challenges on a banking app, as we discussed before).
RPA solutions such as UiPath are likewise modular by nature. For example, the OCR features of UiPath can leverage the text recognition engines of Microsoft Office or Google’s Cloud Vision API (either of which might be more suitable for a particular task), with room to add more choices from FOSS or subscription-based AI and Machine Learning sources as they become available in the future.
This ability to evolve and adapt in a granular way, and to incorporate emerging technologies into an existing framework of technologies, takes business software beyond the ‘version’ model that has dominated over the last thirty years. Instead of a commitment to a platform, protocol, vendor, or particular software product, the emerging model, exemplified by RPA, is a commitment to the process.