The client had received their data in a number of formats, including several Excel files, text files, and a MySQL database from the credit card processor. Data definitions were not present. My first step was to consolidate the different data sources, clean the data, and ensure that I had keys to match user accounts to transactions, transactions to merchant categories, and both to geographic locations.

After a data cleaning process, I built a database using Amazon’s Web Services RDS and uploaded the now clean data to it. I then use Tableau Desktop on top of the newly built RDS instance to explore and visualize the data.

Using Tableau I was able to cluster lower-value predicting behaviors and fraudulent-predicting behaviors.