Using d3 visualization for fraud detection and trending

d3 (Data-Driven Documents) is a great data visualization tool, and I recently used it to track possible fraud and to see how some metrics have behaved over windows ranging from a few hours to a few weeks.  You can filter a group of interest out of, say, a million or more users and then work through it manually in d3 to narrow it down to a more unusual set.

The graph is dynamic (unlike in this blog): you can select a range of lines by dragging the cursor along any of the Y-axes.  It is much more interesting in action than in the snapshot images below.
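The selection behaviour described above comes down to a simple rule: a line stays highlighted only if its value falls inside the selected range on every axis that has a selection.  Here is a minimal sketch of that rule in plain TypeScript, without d3.  The Row and Brushes types, the selectBrushed function, the axis names and the sample values are all made up for illustration; they are not from the actual chart code.

```typescript
// Hypothetical row shape: one user, one numeric value per axis (time window).
type Row = { id: string; values: Record<string, number> };

// Hypothetical selection state: axis name -> [low, high] range picked on that axis.
type Brushes = Record<string, [number, number]>;

// Keep only the rows that fall inside every active selection.
function selectBrushed(rows: Row[], brushes: Brushes): Row[] {
  const activeAxes = Object.keys(brushes);
  return rows.filter((row) =>
    activeAxes.every((axis) => {
      const [lo, hi] = brushes[axis];
      const v = row.values[axis];
      return v !== undefined && v >= lo && v <= hi;
    })
  );
}

// Made-up sample data: three users, two axes.
const sampleRows: Row[] = [
  { id: "u1", values: { "5h": 1850, "12h": 40 } },
  { id: "u2", values: { "5h": 120, "12h": 95 } },
  { id: "u3", values: { "5h": 60, "12h": 10 } },
];

// Select 1,700 to 2,000 on the "5h" axis only.
const brushedIds = selectBrushed(sampleRows, { "5h": [1700, 2000] }).map((r) => r.id);
console.log(brushedIds);
```

With no selection on any axis the function returns every row, which matches what you see before touching the chart.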

Below is users’ “A” metric over the last 5 hours, 5 to 12 hours, 12 hours to 1 day, and so on up to the last 14 to 28 days.  The user group was selected from a few different segments, and each major color corresponds to a segment.  d3 automatically varies the color slightly for each line.  Though it is not very clear in the image below, the main colors were blue, green, orange and purple.

The input to d3 is a simple CSV file, and all the presentation is handled by d3.  That is unlike many packages I had used previously, where I ended up creating an output file in HTML or some XML for Flash.  The big advantage of d3 over these is that you attach an HTML element to each data point in code, and the built-in visualization functions do the rest of the magic.
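To give an idea of the shape of such an input, here is a rough sketch that turns CSV text into one record per user, with one numeric value per time window.  d3 has its own CSV loading, so you would not normally write this yourself; the sketch only shows the data layout.  The column names (user, segment, h5, h12, d1), the UserRow type, the parseMetricCsv function and the sample rows are all assumptions for illustration, and the parser does not handle quoted fields.

```typescript
// Hypothetical record: one user, the segment it came from, one value per window.
type UserRow = { user: string; segment: string; metrics: Record<string, number> };

// Naive CSV parser: first line is the header, the first two columns are
// user and segment, every remaining column is a numeric metric.
function parseMetricCsv(text: string): UserRow[] {
  const lines = text.trim().split(/\r?\n/);
  const header = lines[0].split(",");
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    const metrics: Record<string, number> = {};
    for (let i = 2; i < header.length; i++) {
      metrics[header[i]] = Number(cells[i]);
    }
    return { user: cells[0], segment: cells[1], metrics };
  });
}

// Made-up sample input.
const sampleCsv = [
  "user,segment,h5,h12,d1",
  "u1,blue,1850,40,25",
  "u2,green,120,95,110",
].join("\n");

const parsedRows = parseMetricCsv(sampleCsv);
console.log(parsedRows);
```

Each record then maps to one line in the chart, with the segment deciding its base color.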

In the above scenario, for example, you can move the cursor to the leftmost scale (5 hours) and zoom in on the lines between 1,700 and 2,000.  There is only one user there, well above the rest, who have a metric of 200 or lower.  This user hadn’t done much in the last 4 weeks until the last 5 hours!  Time to look into what this user is doing, and we use other tools for that further analysis.
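The pattern spotted by eye here (quiet for weeks, then a sudden jump in the latest window) can also be written down as a simple check.  This is only a sketch of the idea, not what we actually run.  The WindowedMetric type, the findSuddenSpikes function, the two thresholds and the sample numbers are all invented for illustration.

```typescript
// Hypothetical record: windows[0] is the most recent window (e.g. last 5 hours),
// later entries are progressively older windows.
type WindowedMetric = { user: string; windows: number[] };

// Flag users whose latest window is at or above spikeFloor while every
// older window stayed at or below quietCeiling.
function findSuddenSpikes(
  users: WindowedMetric[],
  spikeFloor: number,
  quietCeiling: number
): string[] {
  return users
    .filter((u) => {
      const [latest, ...older] = u.windows;
      return latest >= spikeFloor && older.every((v) => v <= quietCeiling);
    })
    .map((u) => u.user);
}

// Made-up sample data.
const windowedUsers: WindowedMetric[] = [
  { user: "u1", windows: [1850, 20, 15, 30, 10] },           // quiet, then a jump
  { user: "u2", windows: [150, 180, 160, 170, 190] },        // steady and low
  { user: "u3", windows: [1900, 1750, 1800, 1700, 1850] },   // always high
];

// Illustrative thresholds only.
const flaggedUsers = findSuddenSpikes(windowedUsers, 1700, 200);
console.log(flaggedUsers);
```

A user who is always high (like the third one) is not flagged, because the point is the change in behaviour rather than the absolute value.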

Similar to the above scenario, below is another graph where I am interested in all users whose score was between 600 and 1,400 over the last 2 to 7 days.  There is not much of interest in this graph; I have seen more interesting details at other times.

Happy data visualization!