Setup Environment

Set up the environment for data processing and enter the API credentials.

%%capture capt  
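The credential step is not shown in this export. As a minimal sketch, assuming the API uses a client id and secret, the values could be read from environment variables with an interactive prompt as a fallback. The variable names `API_CLIENT_ID` and `API_CLIENT_SECRET` and the helper `load_credentials` are hypothetical, not part of the original notebook.

```python
import os
from getpass import getpass


def load_credentials(env=os.environ):
    """Return (client_id, client_secret), prompting only for missing values."""
    # API_CLIENT_ID / API_CLIENT_SECRET are hypothetical names for illustration.
    client_id = env.get("API_CLIENT_ID") or input("API client id: ")
    # getpass avoids echoing the secret into the notebook output.
    client_secret = env.get("API_CLIENT_SECRET") or getpass("API client secret: ")
    return client_id, client_secret
```

Keeping secrets out of the notebook source means the notebook can be shared without leaking them.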

Pull Data From API for Processing

Pull all transaction data from the API and combine it into one large table. As part of this step, consider optimizing the data structure, for example by storing repetitive text fields as categories.
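The category optimization mentioned above could look roughly like the following pandas sketch. The helper names, the `max_unique_ratio` threshold, and the toy data (including the "outgoing" value) are assumptions for illustration; the original notebook's implementation is not shown.

```python
import pandas as pd


def megabytes(df):
    """Deep memory footprint of a DataFrame in megabytes."""
    return df.memory_usage(deep=True).sum() / 1e6


def categorise_text_columns(df, max_unique_ratio=0.5):
    """Return a copy with low-cardinality text columns stored as categories."""
    out = df.copy()
    for col in out.columns:
        s = out[col]
        # Assumes object columns hold hashable values such as strings.
        is_text = pd.api.types.is_object_dtype(s) or pd.api.types.is_string_dtype(s)
        if not is_text:
            continue
        # Categories only save memory when values repeat often.
        if s.nunique(dropna=False) / max(len(s), 1) <= max_unique_ratio:
            out[col] = s.astype("category")
    return out


# Toy data standing in for the extracted table.
df = pd.DataFrame({
    "report.reporter": ["CBA"] * 1000,
    "transaction.direction": ["incoming", "outgoing"] * 500,
    "transaction.amount": range(1000),
})
slim = categorise_text_columns(df)
print(f"Original Memory Usage : {megabytes(df):.3f} megabytes")
print(f"Final Memory Usage : {megabytes(slim):.3f} megabytes")
```

The ratio check skips near-unique columns such as identifiers, where a category dtype would add overhead rather than save memory.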


Current Extraction Runtime:  14.3 minutes, 100 API calls, 990002 records extracted so far (continuing ....)
Current Extraction Runtime:  28.74 minutes, 200 API calls, 1990013 records extracted so far (continuing ....)
Current Extraction Runtime:  43.37 minutes, 300 API calls, 2990018 records extracted so far (continuing ....)
Current Extraction Runtime:  58.13 minutes, 400 API calls, 3990028 records extracted so far (continuing ....)
Original Memory Usage : 299.255 megabytes
Final Memory Usage : 195.352 megabytes
Extraction Total Runtime is: 63.63 minutes
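A progress log like the one above could come from a loop of the following shape. This is a sketch only: the real API client and its pagination scheme are not shown in this export, so cursor-based paging, `extract_all`, and `fake_fetch` are assumptions for illustration.

```python
import time


def extract_all(fetch_page, log_every=100):
    """Call fetch_page(cursor) until no next cursor is returned; collect records."""
    records, cursor, calls = [], None, 0
    start = time.monotonic()
    while True:
        # fetch_page is assumed to return (list_of_records, next_cursor_or_None).
        page, cursor = fetch_page(cursor)
        records.extend(page)
        calls += 1
        if calls % log_every == 0:
            minutes = (time.monotonic() - start) / 60
            print(f"Current Extraction Runtime: {minutes:.2f} minutes, "
                  f"{calls} API calls, {len(records)} records extracted so far")
        if cursor is None:
            break
    minutes = (time.monotonic() - start) / 60
    print(f"Extraction Total Runtime is: {minutes:.2f} minutes")
    return records


def fake_fetch(cursor):
    """Stand-in for a real API call: serves 25 integers in pages of 10."""
    start = cursor or 0
    page = list(range(start, min(start + 10, 25)))
    next_cursor = start + 10 if start + 10 < 25 else None
    return page, next_cursor


records = extract_all(fake_fetch, log_every=2)
```

Passing the fetch function in as an argument keeps the loop testable without network access.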

Review Some Sample Data

Display some key details for a sample of transaction reports.


report.reportType                      report.reporter  transaction.direction  transaction.transactionDatetime  transaction.amount
internationalFundsTransferInstruction  CBA              incoming               2020-09-22 22:58:25+00:00        $9987.50
internationalFundsTransferInstruction  CBA              incoming               2020-02-27 05:51:06+00:00        $9985.00
internationalFundsTransferInstruction  CBA              incoming               2020-10-22 02:51:24+00:00        $9950.00
internationalFundsTransferInstruction  CBA              incoming               2020-07-20 03:57:49+00:00        $9950.00
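A sample like the one above might be produced along these lines. The helper `sample_key_details` is hypothetical, and the sketch assumes `transaction.amount` is stored as a number and only formatted as currency for display; the demo frame simply reuses the four rows shown above.

```python
import pandas as pd

KEY_COLUMNS = [
    "report.reportType",
    "report.reporter",
    "transaction.direction",
    "transaction.transactionDatetime",
    "transaction.amount",
]


def sample_key_details(df, n=4, seed=0):
    """Return n random rows restricted to the key columns, amounts as $x.xx."""
    sample = df[KEY_COLUMNS].sample(n=min(n, len(df)), random_state=seed).copy()
    # Format for display only; the source frame keeps its numeric amounts.
    sample["transaction.amount"] = sample["transaction.amount"].map("${:.2f}".format)
    return sample


# Demo frame built from the rows shown in the sample output.
demo = pd.DataFrame({
    "report.reportType": ["internationalFundsTransferInstruction"] * 4,
    "report.reporter": ["CBA"] * 4,
    "transaction.direction": ["incoming"] * 4,
    "transaction.transactionDatetime": pd.to_datetime([
        "2020-09-22 22:58:25+00:00",
        "2020-02-27 05:51:06+00:00",
        "2020-10-22 02:51:24+00:00",
        "2020-07-20 03:57:49+00:00",
    ]),
    "transaction.amount": [9987.50, 9985.00, 9950.00, 9950.00],
})
preview = sample_key_details(demo)
print(preview.to_string(index=False))
```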

Explore Data

Use an exploratory data analysis (EDA) tool. The example below uses sweetviz.

pandas-profiling (since renamed ydata-profiling) is another good option.
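The sweetviz cell itself did not survive in this export. A minimal sketch of a typical call, assuming sweetviz is installed and the extracted table is a pandas DataFrame, might look like this; the wrapper `make_eda_report` and the output file name are hypothetical.

```python
def make_eda_report(df, path="eda_report.html", open_browser=False):
    """Write a sweetviz HTML report for df and return the output path."""
    # Imported lazily so this cell can be defined even if sweetviz is missing.
    import sweetviz as sv

    report = sv.analyze(df)
    report.show_html(filepath=path, open_browser=open_browser)
    return path


# Example call, where df is the transactions table built earlier:
# make_eda_report(df)
```

With several million rows the report can take a while to build, so running it on a random sample first may be a reasonable starting point.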