Michael Garel discusses using machine learning and data analytics to clean data and process work orders faster

Understanding 'data cleaning' in equipment service, and the tools used to do it

June 11, 2019
by John R. Fischer, Staff Reporter
More than 50 percent of a data scientist’s time is spent cleaning data, according to the CrowdFlower 2017 Data Scientist Report.

Michael Garel, director of data strategy for Accruent, gave a presentation at AAMI Exchange in Cleveland this weekend, in which he argued that this estimate is actually "quite low."

“From what I’ve seen, data scientists spend most, if not 80 or 90 percent, of their time cleaning data,” he said. “Everybody thinks this data scientist role is the greatest ever. It’s really processing a lot of data.”

Cleaning data refers to the process of detecting and correcting corrupt or inaccurate records so that analysis of the data yields reliable insights.
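As a rough illustration of what detecting and correcting records can look like in practice, the Python sketch below normalizes device names and date formats and sets aside records that are duplicated or implausible. It is not Accruent's method: the records, field names (`id`, `device`, `opened`, `hours`) and rejection rules are all hypothetical, invented for this example.

```python
from datetime import datetime

# Hypothetical raw work orders; fields and values are invented for illustration.
RAW_ORDERS = [
    {"id": "WO-1", "device": " Infusion  Pump ", "opened": "2019-06-01", "hours": "2.5"},
    {"id": "WO-1", "device": "infusion pump", "opened": "2019-06-01", "hours": "2.5"},
    {"id": "WO-2", "device": "VENTILATOR", "opened": "06/03/2019", "hours": "-4"},
    {"id": "WO-3", "device": "ventilator", "opened": "not recorded", "hours": "1"},
]

# Date layouts this sketch assumes may appear in the source data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(text):
    """Return a date for any known format, or None when the value is unusable."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def parse_hours(text):
    """Return labor hours as a float, or None when missing, non-numeric or negative."""
    try:
        hours = float(text)
    except ValueError:
        return None
    return hours if hours >= 0 else None


def clean(orders):
    """Split records into corrected rows and rejected rows with a reason."""
    cleaned, rejected, seen = [], [], set()
    for row in orders:
        if row["id"] in seen:
            rejected.append((row["id"], "duplicate"))
            continue
        seen.add(row["id"])
        opened = parse_date(row["opened"])
        hours = parse_hours(row["hours"])
        if opened is None:
            rejected.append((row["id"], "bad date"))
        elif hours is None:
            rejected.append((row["id"], "bad hours"))
        else:
            cleaned.append({
                "id": row["id"],
                # Collapse stray whitespace and unify casing.
                "device": " ".join(row["device"].split()).lower(),
                "opened": opened,
                "hours": hours,
            })
    return cleaned, rejected


cleaned, rejected = clean(RAW_ORDERS)
print(cleaned)
print(rejected)
```

Keeping the rejected rows alongside a reason, rather than silently dropping them, is a design choice of this sketch: it lets someone review what was filtered out before trusting the analysis.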

With over 500 million work orders — more than 230 million in healthcare — Accruent utilizes a number of tools in data analytics, machine learning and deep learning to clean data and uncover insights for completing hospital equipment work orders more efficiently. Which tools to use comes down to what information the user is trying to uncover, the type of work order and the variables involved.

In his presentation, entitled "Big Data Insights on Capital Equipment from 500 Million Work Orders," Garel examined specific uses and scenarios that a few of these tools are best suited to address:

Data Analytics
Data science is the ability to comprehend and process data, and to extract value from it, visualize it and communicate it. Data analytics can help with this, depending on the scenario users face.

Machine Learning
Machine learning is data analysis that automates analytical model building. While most software must be told where to look, the aim of this technology is to uncover hidden insights without being explicitly programmed where to search. Instead, it learns from data by identifying patterns and making predictions.
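To make the idea of learning from data concrete, here is a minimal Python sketch of a word-count classifier that sorts work-order descriptions into categories. Nothing in it comes from Garel's presentation: the training descriptions, the two labels and the scoring rule are assumptions chosen for illustration, and a production system would use a far larger data set and a more robust model.

```python
from collections import Counter, defaultdict

# Hypothetical labeled work-order descriptions, invented for illustration.
TRAINING = [
    ("battery will not hold charge", "electrical"),
    ("power cord frayed near plug", "electrical"),
    ("battery alarm on startup", "electrical"),
    ("wheel caster broken on cart", "mechanical"),
    ("door latch broken and loose", "mechanical"),
    ("loose caster on bed frame", "mechanical"),
]


def train(examples):
    """Count how often each word appears under each label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def predict(counts, text):
    """Pick the label whose training words overlap most with the new text."""
    words = text.lower().split()
    scores = {
        label: sum(word_counts[w] for w in words)
        for label, word_counts in counts.items()
    }
    return max(scores, key=scores.get)


model = train(TRAINING)
print(predict(model, "battery dead after charge"))   # electrical
print(predict(model, "caster loose on stretcher"))   # mechanical
```

No keyword rules are written by hand here. The associations between words and categories come entirely from the labeled examples, which is the sense in which the program learns patterns and then makes predictions on descriptions it has not seen.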

Deep Learning
A more accurate and faster form of machine learning, deep learning does not require as much upfront work, as the tools and framework are already built in. It can train networks and adjust variables within them. The downside is that it requires more training data than standard machine learning.

He added that, if used correctly, these and other available tools can help speed up data cleaning and work order completion, improving workflow and supporting quality patient care.

“Our hypothesis is that we can apply data science to clean up enough of this data so we can actually make it useful to generate results. That’s the goal,” he said.