Business analysis for today is a markedly different function from even a few years ago. Cheaply available computing power combined with sophisticated open-source technologies and methods means that organizations that embrace a data-driven culture significantly out-perform their traditional peers. In certain disciplines, (e.g. Web/Marketing Analytics, Risk/Portfolio Management) the use-cases have been clear for some time and AI adoption is pervasive.
Use-cases in Finance and Accounting have traditionally been less clear, but that is rapidly changing. Organizations are seeking to leverage Advanced Analytics (from data preparation through visualization and ultimately machine learning model development) but they often could use assistance in getting started. Clearview’s Active Supplier Management is proving to be a very good analytics use-case as we have been identifying 10-20% savings in vendor spend.
We also realize that the concept of Machine Learning for finance & accounting can be daunting, and its relevance difficult to envision. However, the first stage in Advanced Analytics is obtaining and preparing data – and this is an area that any analyst, regardless of background, can relate to. Furthermore, there is a substantial amount of value to be extracted while creating this data foundation, upon which more sophisticated modeling can be conducted.
To help take Advanced Analytics from an abstract concept to something more relevant, we are introducing a series of Micro-Cases in Analytics. These cases are individual examples of contemporary analytical techniques or capabilities that have been valuable to our clients. We do not intend to follow any order or schedule with our Micro-Cases except to share topics that may be of interest to others.
For our first in the series, we look at the problem of applying credits at scale.
Micro-Cases in Analytics #1: Applying Credits
A data characteristic that is unique to finance and accounting is the concept of a reversing transaction. Reversals in science or marketing analytics are rare, but in the finance world, it is perfectly normal to see 2-5% of entries in a dataset offsetting. Such entries not only add no informational value, they actually distort any analysis involving counts or averages. For example, a hotel stay with 3 nights’ charges plus a negative amount for a single night – meaning four lines for what was a two-night stay.
Historically, there has often been little choice but to manually identify and remove offsetting entries which is a time-consuming (and boring) task. Consequently, this is one of the first things we encountered that was helpful to solve.
In our illustration, we consider a public dataset of Oklahoma State Employee Purchase card transactions. There are 1.07M transactions from 2015 to today, with a total of 35,000 negative transactions (credits) – or 3.25% of all transactions.
To match and remove offsetting entries would be manual, extremely time-consuming and prone to errors. To ease this, we created a function apply_credits to identify offsetting transactions based on cardholder, merchant and amount and eliminate them from the working dataset.
When applied to the 1.07M transactions, our apply_credits took under 2 seconds to:
- Identify 24,000 matches based on our criteria
- Check they sum to zero
- Removed them from our dataset
- Check the residual total was unchanged
- Create a .csv file of all the matched items.
Our apply_credits function is written in a generalizable form so that it can be applied to a different dataset within minutes – which is an incredibly useful micro-capability and is a terrific example of how advanced analytics can help with even foundational tasks.