The necessity for clean data – a sample use case

Tyler Robinson - Consultant

With many businesses operating on diminishing margins, making fast and correct decisions from your data should be a fundamental goal for your company right now.

This means finding the signal in the noise and drilling down into actionable information. Quickly.

In the case I am presenting here, we were able to literally strip away the noise from a field that was coming into our data model through a Web API. The “Comment” field that we were extracting is a free-form text box where users on a web portal can enter anything. When we pulled this field into Qlik Sense via the REST API, we noticed that it was full of HTML tags. See below for how the field looked. You may say, “But you can still read it; it just takes a minute to mentally translate it.” Well, if you’re making operational decisions based on this field, you must do the mental gymnastics every time you look at it. This time adds up. And what if you miss something?

Without getting too deep into the technical solution for cleaning this field, it’s worth noting that the script didn’t take long to write and validate, and the time savings were evident and immediate. If nurses and medical techs fill out this field at the end of their shift so that their managers can determine staffing for subsequent shifts, even a few minutes per day per manager adds up. That’s a lot of opportunity cost, and it doesn’t even factor in that reading the comments more accurately may improve decision making. Below is the result after running the script. All of the HTML tags are gone, and we inserted carriage returns for readability.
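For readers who want a feel for the technique, here is a minimal sketch of the same idea in Python. It is not the Qlik Sense script we used. The function name `clean_comment`, the choice of which tags count as line breaks, and the sample comment are all assumptions made up for illustration.

```python
import re
from html import unescape

# Assumption: these tags mark a visual line break in the user-entered HTML.
BREAK_TAGS = re.compile(r"<\s*br\s*/?\s*>|</\s*(?:p|div|li)\s*>", re.IGNORECASE)
# Anything else that looks like a tag is simply dropped.
ANY_TAG = re.compile(r"<[^>]+>")


def clean_comment(raw: str) -> str:
    """Return the comment text with HTML tags removed and line breaks kept."""
    text = BREAK_TAGS.sub("\n", raw)  # turn break-like tags into newlines
    text = ANY_TAG.sub("", text)      # remove every remaining tag
    text = unescape(text)             # decode entities such as &amp;
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)  # skip empty lines


# Hypothetical sample value, not taken from the real data source.
sample = (
    "<p>Two call-outs on nights.</p>"
    "<p>Need <b>one more</b> tech &amp; a float nurse.<br>Thanks</p>"
)
print(clean_comment(sample))
```

A regular expression is enough here because the goal is readable text, not a faithful parse of the markup. For messier input, such as comments containing script blocks or a literal "<" character, a real HTML parser would be the safer choice.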

 

While this simple example was very literal with respect to cleaning fields to improve analysis, you can draw parallels to the data you use in your organization: data you wish were modeled differently, data you wish appeared differently on the front end, or novel data you want to extract, from a Web API or otherwise.

For more info, check out the post I wrote for the Qlik Community: https://community.qlik.com/t5/Qlik-Sense-Documents-Videos/How-to-Efficiently-Clean-HTML-Tags-from-Your-Data-Source/ta-p/1688670

Lastly, don’t hesitate to reach out to our team here at Pomerol Partners with questions or comments. Thanks!