Analysis of IFTI Data - Published Under Freedom of Information (FOI) Request
This is an example of macro Financial Intelligence analysis, based on Australian IFTI data that was published under a Freedom of Information (FOI) request. The source data was originally published here.
If this data is ever removed it will be able to be found at the web archive. The source data includes summary data (total incoming and outgoing amounts broken down by year), as well as detailed data (total incoming and outgoing amounts broken down by year and country). For some reason the data tables were provided in a PDF with a text watermark (hiding some of the figures). Which can be easily worked around however from a tracablility perspective it would be far more preferable if the data was directly accessible (possibly through a public API). For the purposes of this analysis the PDF was processed through Adobe's pdf-to-word tool and the tables extracted into an excel file.
%%capture capt
### Setup Notebook for Analysis
!python3 -mpip install kaleido plotly>=4.5.2
### Load Up Libraries for Analysis
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import requests
from jinja2 import Template
from IPython.display import display, HTML, Image
from collections import Counter
from kaleido.scopes.plotly import PlotlyScope
#@title Input Credentials
#@markdown First things first input your credentials
mapBoxApiKey = "xxxx" #@param {type: "string"}
#@markdown ---
### Setup configuration values
px.set_mapbox_access_token(mapBoxApiKey )
#plt.rcParams['figure.figsize'] = [7, 3]
plt.rc('axes', axisbelow=True)
plt.rcParams['figure.dpi'] = 300
pd.options.display.float_format = '${:,.2f}'.format
template = Template('''
<table class="formattedTable">
<thead>
<tr>
{% for c in columns %}
<th>{{ c }}</th>
{% endfor %}
</tr>
</thead>
<tbody>
{% for row in rows %}
<tr class="{{ loop.cycle('odd', 'even') }}" >
{% for k, v in row.items() %}
{% if k == 'sum_totalNetTransfers' %}
{% if (v|int) < 0 %}
<td style="color:red">{{"${:,.0f}".format(v|int)}}</td>
{% else %}
<td style="color:black">{{"${:,.0f}".format(v|int)}}</td>
{% endif %}
{% elif k.lower() in ['country', 'year', 'region', 'name', 'comment'] %}
<td>{{v}}</td>
{% else %}
<td>{{"${:,.0f}".format(v|int)}}</td>
{% endif %}
{% endfor %}
</tr>
{% endfor %}
</tbody>
</table>
<style>
table.formattedTable {
border: 0.5px solid white;
border-bottom: 1px solid lightgray;
border-spacing: 0px;
}
table.formattedTable th {
border: thin solid white;
background : lightgray;
color: black;
min-width: 100px;
text-align: center;
padding: 10px;
}
table.formattedTable td {
border: thin solid ligthblack;
min-width: 90px;
text-align: center;
padding: 12px;
}
table.formattedTable.odd {
background: white;
}
.even {
background : lightgray;
color: black;
}
</style>''')
Load Up Summary IFTI Amount Data (period 2010-2019)
The first analysis step is to load up the summary IFTI data then display a summary table of the data. You can see we are talking about trillions of dollars transferred per year (i.e. numbers with greater than 12 digits) for IFTI-Es and billions per year for IFTI-DRAs.
### Pull in summary sheet from excel
df = pd.read_excel('https://www.swarmee.net/static/resources/AUSTRAC%20Information%20Report%20IFTI%20overview%202010%20to%202019%20-%20extracted%20data.xlsx',
sheet_name='Total Value of IFTIs',)
### Convert Types to Floats
df = df.astype({'IFTI-E Incoming' : float,
'IFTI-E Outgoing' : float,
'IFTI-E Total Amount' : float,
'IFTI-DRA Incoming' : float,
'IFTI-DRA Outgoing' : float,
'IFTI-DRA Total Amount': float,
'Grand Total Amount' : float
}
)
### Display dataframe
df = df.sort_values(by=['Year'], ascending=True).reset_index()
df.drop('index', inplace=True, axis=1)
df['ALL-IFTI Incoming'] = df['IFTI-E Incoming'] + df['IFTI-DRA Incoming']
df['ALL-IFTI Outgoing'] = df['IFTI-E Outgoing'] + df['IFTI-DRA Outgoing']
df['Year'] = df['Year'].astype(int)
HTML(template.render(
rows=df[['Year',
'IFTI-E Incoming',
'IFTI-DRA Incoming',
'ALL-IFTI Incoming',
'IFTI-E Outgoing',
'IFTI-DRA Outgoing',
'ALL-IFTI Outgoing',
'Grand Total Amount' ]].sort_values(by=['Year'], ascending=True).to_dict(orient='records'),
columns=df[['Year',
'IFTI-E Incoming',
'IFTI-DRA Incoming',
'ALL-IFTI Incoming',
'IFTI-E Outgoing',
'IFTI-DRA Outgoing',
'ALL-IFTI Outgoing',
'Grand Total Amount' ]].sort_values(by=['Year'], ascending=True).columns.to_list()
))
Review Summary IFTI Data as Chart
Understanding the data in a table is hard. Its much easier to display as a chart which is what we do in this step (i.e. same information as above just displayed as a chart). It's easy to see that the total amounts transferred have almost doubled between 2010 and 2019. There are only two years (in the ten years) when the total value of IFTI transfers decreased - that was in 2012 and 2016.
IFTI-DRAs continue to make up a very small percentage of the total amount transfered (you can't really even see them on the chart).
### Create Bar Chart of Total Transfers
df.plot(x='Year',
y=['IFTI-E Incoming',
'IFTI-E Outgoing',
'IFTI-DRA Incoming',
'IFTI-DRA Outgoing'],
figsize=(14,7),
subplots=False,
kind='bar',
stacked=True,
ylabel='Total AUD Amount',
title='Total AUD Amount Over Years'
).yaxis.grid(True, color='black')
df['IFTI-E Incoming % Change'] = df['IFTI-E Incoming'].pct_change().mul(100).round(2) #.map(lambda x: '{0:g}%'.format(x) if x==x else x)
df['IFTI-E Outgoing % Change'] = df['IFTI-E Outgoing'].pct_change().mul(100).round(2) # .map(lambda x: '{0:g}%'.format(x) if x==x else x)
df['IFTI-DRA Incoming % Change'] = df['IFTI-DRA Incoming'].pct_change().mul(100).round(2) #.map(lambda x: '{0:g}%'.format(x) if x==x else x)
df['IFTI-DRA Outgoing % Change'] = df['IFTI-DRA Outgoing'].pct_change().mul(100).round(2) # .map(lambda x: '{0:g}%'.format(x) if x==x else x)
df['Grand Total Amount % Change'] = df['Grand Total Amount'].pct_change().mul(100).round(2) # .map(lambda x: '{0:g}%'.format(x) if x==x else x)
#df['Grand Total Amount % Change'].mean()
### Create Bar Chart of Total Transfers of the 10 years
df.plot(x='Year',y=[ 'IFTI-E Incoming % Change',
'IFTI-E Incoming % Change',
'IFTI-DRA Incoming % Change',
'IFTI-DRA Outgoing % Change'],
figsize=(14,7),
subplots=False,
kind='bar',
stacked=False,
ylabel='Year on Year Percentage Grow/Decline',
title='Percentage Change in Value of ITFIs (2010 to 2019)').yaxis.grid(True, color='black')
### Create Net Transfered Figure Per Year
df['netTransfersMillions'] = (df['IFTI-E Incoming'] + df['IFTI-DRA Incoming'] - df['IFTI-E Outgoing'] - df['IFTI-DRA Outgoing']) / 1000000
### Visualise Net Transfered Per Year
p2 = df.sort_values(by=['Year'], ascending=True).plot(x='Year',y='netTransfersMillions' ,
figsize=(14,7),
subplots=False,
kind='bar',
stacked=True,
ylabel='Total AUD Amount ($millions)',
title='Net Transfers (Incoming less Outgoing)').yaxis.grid(True, color='black')
df.drop('netTransfersMillions', inplace=True, axis=1)
plt.show()
df_incoming = pd.read_excel('https://www.swarmee.net/static/resources/AUSTRAC%20Information%20Report%20IFTI%20overview%202010%20to%202019%20-%20extracted%20data.xlsx',
sheet_name='Incoming Value by Country')
df_outgoing = pd.read_excel('https://www.swarmee.net/static/resources/AUSTRAC%20Information%20Report%20IFTI%20overview%202010%20to%202019%20-%20extracted%20data.xlsx',
sheet_name='Outgoing Value by Country')
dra_incoming = pd.read_excel('https://www.swarmee.net/static/resources/AUSTRAC%20Information%20Report%20IFTI%20overview%202010%20to%202019%20-%20extracted%20data.xlsx',
sheet_name='Incoming Value by Country - DRA')
dra_outgoing = pd.read_excel('https://www.swarmee.net/static/resources/AUSTRAC%20Information%20Report%20IFTI%20overview%202010%20to%202019%20-%20extracted%20data.xlsx',
sheet_name='Outgoing Value by Country - DRA')
all_iftis = pd.merge(df_incoming,
df_outgoing,
on="Country",
how="outer",
suffixes=("_incoming", "_outgoing")
)
all_dra = pd.merge(dra_incoming,
dra_outgoing,
on="Country",
how="outer",
suffixes=("_dra_incoming", "_dra_outgoing")
)
df_detail = pd.merge(all_iftis,
all_dra,
on="Country",
how="outer",
suffixes=("", "_dra")
)
df_detail.fillna(value=0,inplace=True)
def trim_column_name(name:str):
name = name.replace(".0","")
return name
df_detail.columns = [trim_column_name(x) for x in df_detail.columns ]
df_detail['Country'] = pd.Series([country.upper() for country in df_detail['Country'].to_list()])
Comparison of Detail and Total Total Amounts
Now we have the provided detailed and summary data loaded we can compare them. Basically this should identify the total amount of reports where the source/destination country could not be determined.
As we can see the difference (between the summary and detailed data) is very very small. Indicating very aggressive rules to allocate a country to all reports.
#print(df.columns)
## added for re-runability
if 'Detail Total Amount' in df.columns:
df.drop([ 'Detail Total Amount', 'Summary Detail Difference'], axis = 1, inplace=True)
detail_rollup = pd.DataFrame([])
detail_rollup = detail_rollup.append({'Year': 2010, 'Detail Total Amount': df_detail['2010_incoming'].sum() + df_detail['2010_outgoing'].sum() }, ignore_index=True)
detail_rollup = detail_rollup.append({'Year': 2011, 'Detail Total Amount': df_detail['2011_incoming'].sum() + df_detail['2011_outgoing'].sum() }, ignore_index=True)
detail_rollup = detail_rollup.append({'Year': 2012, 'Detail Total Amount': df_detail['2012_incoming'].sum() + df_detail['2012_outgoing'].sum() }, ignore_index=True)
detail_rollup = detail_rollup.append({'Year': 2013, 'Detail Total Amount': df_detail['2013_incoming'].sum() + df_detail['2013_outgoing'].sum() }, ignore_index=True)
detail_rollup = detail_rollup.append({'Year': 2014, 'Detail Total Amount': df_detail['2014_incoming'].sum() + df_detail['2014_outgoing'].sum() }, ignore_index=True)
detail_rollup = detail_rollup.append({'Year': 2015, 'Detail Total Amount': df_detail['2015_incoming'].sum() + df_detail['2015_outgoing'].sum() }, ignore_index=True)
detail_rollup = detail_rollup.append({'Year': 2016, 'Detail Total Amount': df_detail['2016_incoming'].sum() + df_detail['2016_outgoing'].sum() }, ignore_index=True)
detail_rollup = detail_rollup.append({'Year': 2017, 'Detail Total Amount': df_detail['2017_incoming'].sum() + df_detail['2017_outgoing'].sum() }, ignore_index=True)
detail_rollup = detail_rollup.append({'Year': 2018, 'Detail Total Amount': df_detail['2018_incoming'].sum() + df_detail['2018_outgoing'].sum() }, ignore_index=True)
detail_rollup = detail_rollup.append({'Year': 2019, 'Detail Total Amount': df_detail['2019_incoming'].sum() + df_detail['2019_outgoing'].sum() }, ignore_index=True)
df = pd.merge(df,
detail_rollup,
on="Year",
how="inner",
suffixes=("", "")
)
df['Summary Detail Difference'] = df['Grand Total Amount'] - df['Detail Total Amount']
HTML(template.render(
rows=df[['Year', 'Grand Total Amount', 'Detail Total Amount', 'Summary Detail Difference']].sort_values(by=['Year'], ascending=True).to_dict(orient='records'),
columns=df[['Year', 'Grand Total Amount', 'Detail Total Amount', 'Summary Detail Difference']].sort_values(by=['Year'], ascending=True).columns.to_list()
))
Assigning a Consistent and Current Country Code
This step connects to a country api - hosted at Country API to obtain additional information in relation to the transfer country. This API leverages data from the restcounties api with a few additional aliases. The list of non standard non standard aliases is also available in the API. It's highly likely these non standard countries are data submission issues (e.g. USSR, FORMER YUGOSLAVIA, ZAIRE). A few of these data submission issues have been listed below.
## added for re-runability
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
if 'population' in df_detail.columns:
df_detail.drop(['population', 'latlng', 'subregion', 'region', 'name', 'alpha2Code', 'lat', 'lon'], axis = 1, inplace=True)
countryData = []
for index, row in df_detail.iterrows():
url = "https://www.swarmee.net/country/" + row['Country']
r = requests.get(url)
if r.status_code == 200:
r = r.json()
try:
data ={"Country": row['Country'],
"population" : r['population'],
"latlng" : r['latlng'],
"subregion" : r.get('subregion', r['region']) ,
"region" : r['region'],
"name" : r['name']['common'],
"alpha2Code" : r['cca2']
}
countryData.append(data)
except:
print(r)
else:
print(url)
#print('No Details Found For : ' + row['Country'])
countryData = pd.DataFrame(countryData)
#countryData[['lat','lon']] = countryData['Type'].str.split(expand=True,)
countryData.fillna(value=0,inplace=True)
df_detail = pd.merge(df_detail, countryData, on="Country", how="outer", suffixes=("", ""))
df_detail.fillna(value=0,inplace=True)
df_detail['latlng'] = [ [0.0,0.0] if x in [np.NaN, None, [], 0] else x for x in df_detail['latlng'] ]
df_detail[['lat','lon']] = pd.DataFrame(df_detail['latlng'].to_list())
data_issues = df_detail.loc [
(df_detail['Country'].isin(["USSR","FORMER YUGOSLAVIA", "SERBIA MONTENEGRO","NETHERLANDS ANTILLES", "ZAIRE" ])
),
['Country', '2010_outgoing', '2011_outgoing', '2012_outgoing',
'2013_outgoing', '2014_outgoing', '2015_outgoing', '2016_outgoing',
'2017_outgoing', '2018_outgoing', '2019_outgoing', '2010_incoming',
'2011_incoming', '2012_incoming', '2013_incoming', '2014_incoming',
'2015_incoming', '2016_incoming', '2017_incoming', '2018_incoming',
'2019_incoming']
]
data_issues['sum_totalTransfers'] = data_issues[['Country', '2010_outgoing', '2011_outgoing', '2012_outgoing',
'2013_outgoing', '2014_outgoing', '2015_outgoing', '2016_outgoing',
'2017_outgoing', '2018_outgoing', '2019_outgoing', '2010_incoming',
'2011_incoming', '2012_incoming', '2013_incoming', '2014_incoming',
'2015_incoming', '2016_incoming', '2017_incoming', '2018_incoming',
'2019_incoming']].sum( axis=1)
data_issues_comments = pd.DataFrame([
{"Country" : 'USSR', "comment" : 'Dissolved 26 December 1991'},
{"Country" : 'NETHERLANDS ANTILLES', "comment" : 'dissolved October 10, 2010'},
{"Country" : 'SERBIA MONTENEGRO', "comment" : 'Dissolved 2006'},
{"Country" : 'FORMER YUGOSLAVIA', "comment" : 'dissolved 1992'},
{"Country" : 'ZAIRE', "comment" :'Changed name in 1997'}
])
data_issues = pd.merge(data_issues,
data_issues_comments,
on="Country",
how="inner",
suffixes=("_outgoing", "_incoming")
)
HTML(template.render(
rows=data_issues[['Country', 'sum_totalTransfers', 'comment']].sort_values(by=['sum_totalTransfers'], ascending=False).to_dict(orient='records'),
columns=data_issues[['Country', 'sum_totalTransfers', 'comment']].columns.to_list()
))
sumFields = {}
for yearField in [ x for x in df_detail.columns if x.startswith('2')]:
sumFields[yearField] = 'sum'
sumFields
df_detail_summary = df_detail.groupby([df_detail['population'], df_detail['lat'], df_detail['lon'], df_detail['subregion'], df_detail['region'], df_detail['name'], df_detail['alpha2Code']]).agg(sumFields)
df_detail_summary.index = df_detail_summary.index.set_names(['population', 'lat', 'lon', 'subregion', 'region', 'name', 'alpha2code'])
df_detail_summary = df_detail_summary.reset_index()
df_detail_summary['name'] =df_detail_summary['name'].replace('United Kingdom of Great Britain and Northern Ireland', 'United Kingdom')
df_detail_summary['2019_netTransfers'] = df_detail_summary['2019_incoming'] - df_detail_summary['2019_outgoing']
df_detail_summary['2018_netTransfers'] = df_detail_summary['2018_incoming'] - df_detail_summary['2018_outgoing']
df_detail_summary['2017_netTransfers'] = df_detail_summary['2017_incoming'] - df_detail_summary['2017_outgoing']
df_detail_summary['2016_netTransfers'] = df_detail_summary['2016_incoming'] - df_detail_summary['2016_outgoing']
df_detail_summary['2015_netTransfers'] = df_detail_summary['2015_incoming'] - df_detail_summary['2015_outgoing']
df_detail_summary['2014_netTransfers'] = df_detail_summary['2014_incoming'] - df_detail_summary['2014_outgoing']
df_detail_summary['2013_netTransfers'] = df_detail_summary['2013_incoming'] - df_detail_summary['2013_outgoing']
df_detail_summary['2012_netTransfers'] = df_detail_summary['2012_incoming'] - df_detail_summary['2012_outgoing']
df_detail_summary['2011_netTransfers'] = df_detail_summary['2011_incoming'] - df_detail_summary['2011_outgoing']
df_detail_summary['2010_netTransfers'] = df_detail_summary['2010_incoming'] - df_detail_summary['2010_outgoing']
df_detail_summary['2019_totalTransfers'] = df_detail_summary['2019_incoming'] + df_detail_summary['2019_outgoing']
df_detail_summary['2018_totalTransfers'] = df_detail_summary['2018_incoming'] + df_detail_summary['2018_outgoing']
df_detail_summary['2017_totalTransfers'] = df_detail_summary['2017_incoming'] + df_detail_summary['2017_outgoing']
df_detail_summary['2016_totalTransfers'] = df_detail_summary['2016_incoming'] + df_detail_summary['2016_outgoing']
df_detail_summary['2015_totalTransfers'] = df_detail_summary['2015_incoming'] + df_detail_summary['2015_outgoing']
df_detail_summary['2014_totalTransfers'] = df_detail_summary['2014_incoming'] + df_detail_summary['2014_outgoing']
df_detail_summary['2013_totalTransfers'] = df_detail_summary['2013_incoming'] + df_detail_summary['2013_outgoing']
df_detail_summary['2012_totalTransfers'] = df_detail_summary['2012_incoming'] + df_detail_summary['2012_outgoing']
df_detail_summary['2011_totalTransfers'] = df_detail_summary['2011_incoming'] + df_detail_summary['2011_outgoing']
df_detail_summary['2010_totalTransfers'] = df_detail_summary['2010_incoming'] + df_detail_summary['2010_outgoing']
df_detail_summary['sum_totalIncomingTransfers'] = df_detail_summary['2019_incoming'] + df_detail_summary['2018_incoming'] + df_detail_summary['2017_incoming'] + df_detail_summary['2016_incoming'] + df_detail_summary['2015_incoming'] + df_detail_summary['2014_incoming'] + df_detail_summary['2013_incoming'] + df_detail_summary['2012_incoming'] + df_detail_summary['2011_incoming'] + df_detail_summary['2010_incoming']
df_detail_summary['sum_totalOutgoingTransfers'] = df_detail_summary['2019_outgoing'] + df_detail_summary['2018_outgoing'] + df_detail_summary['2017_outgoing'] + df_detail_summary['2016_outgoing'] + df_detail_summary['2015_outgoing'] + df_detail_summary['2014_outgoing'] + df_detail_summary['2013_outgoing'] + df_detail_summary['2012_outgoing'] + df_detail_summary['2011_outgoing'] + df_detail_summary['2010_outgoing']
df_detail_summary['sum_totalNetTransfers'] = df_detail_summary['2019_netTransfers'] + df_detail_summary['2018_netTransfers'] + df_detail_summary['2017_netTransfers'] + df_detail_summary['2016_netTransfers'] + df_detail_summary['2015_netTransfers'] + df_detail_summary['2014_netTransfers'] + df_detail_summary['2013_netTransfers'] + df_detail_summary['2012_netTransfers'] + df_detail_summary['2011_netTransfers'] + df_detail_summary['2010_netTransfers']
df_detail_summary['std_totalNetTransfers'] = df_detail_summary[[
'2019_netTransfers',
'2018_netTransfers',
'2017_netTransfers',
'2016_netTransfers',
'2015_netTransfers',
'2014_netTransfers',
'2013_netTransfers',
'2012_netTransfers',
'2011_netTransfers',
'2010_netTransfers' ]].std( axis=1)
df_detail_summary['std_totalTransfers'] = df_detail_summary[[
'2019_totalTransfers',
'2018_totalTransfers',
'2017_totalTransfers',
'2016_totalTransfers',
'2015_totalTransfers',
'2014_totalTransfers',
'2013_totalTransfers',
'2012_totalTransfers',
'2011_totalTransfers',
'2010_totalTransfers' ]].std( axis=1)
df_detail_summary['sum_totalTransfers'] = df_detail_summary[[
'2019_totalTransfers',
'2018_totalTransfers',
'2017_totalTransfers',
'2016_totalTransfers',
'2015_totalTransfers',
'2014_totalTransfers',
'2013_totalTransfers',
'2012_totalTransfers',
'2011_totalTransfers',
'2010_totalTransfers' ]].sum( axis=1)
PieData = df_detail_summary[ ['region', 'name', 'sum_totalTransfers']].melt(id_vars=['region', 'name'],
var_name="Type",
value_name="Value")
regionPie = PieData.groupby([PieData['region']]).agg({'Value':['sum']})
regionPie.index = regionPie.index.set_names(['region'])
regionPie.columns = ['sum_totalTransfers']
regionPie = regionPie.reset_index()
fig = px.pie(regionPie, values='sum_totalTransfers', names='region', #title='Sum Total Transfers by Region',
color_discrete_sequence=px.colors.sequential.Rainbow)
fig.show()
regionYearData = df_detail_summary[ ['region', '2019_totalTransfers',
'2018_totalTransfers',
'2017_totalTransfers',
'2016_totalTransfers',
'2015_totalTransfers',
'2014_totalTransfers',
'2013_totalTransfers',
'2012_totalTransfers',
'2011_totalTransfers',
'2010_totalTransfers' ]].melt(id_vars=['region'],
var_name="Type",
value_name="Value")
regionYearData[['Year','Type']] = regionYearData['Type'].str.split("_", expand=True,)
regionYearData = regionYearData.groupby([regionYearData['region'], regionYearData['Year']]).agg({'Value':['sum']})
regionYearData.index = regionYearData.index.set_names(['Region', 'Year'])
regionYearData.columns = ['sum_totalTransfers']
regionYearData = regionYearData.reset_index()
regionYearData = regionYearData[(regionYearData['Region'] != "unknown") ]
fig = px.bar(regionYearData,
color='Year',
y='sum_totalTransfers',
x='Region',
barmode='group',
category_orders={"Region": regionYearData.loc [(regionYearData['Year'] == '2019'), ['sum_totalTransfers' , "Region"]].sort_values(by='sum_totalTransfers', ascending=False)['Region'].to_list()}
)
fig.for_each_trace(lambda t: t.update(name=t.name.replace("Year=", "")))
fig.show()
regionYearData = df_detail_summary[ ['region',
'2019_netTransfers',
'2018_netTransfers',
'2017_netTransfers',
'2016_netTransfers',
'2015_netTransfers',
'2014_netTransfers',
'2013_netTransfers',
'2012_netTransfers',
'2011_netTransfers',
'2010_netTransfers' ]].melt(id_vars=['region'],
var_name="Type",
value_name="Value")
regionYearData[['Year','Type']] = regionYearData['Type'].str.split("_", expand=True,)
regionYearData = regionYearData.groupby([regionYearData['region'], regionYearData['Year']]).agg({'Value':['sum']})
regionYearData.index = regionYearData.index.set_names(['Region', 'Year'])
regionYearData.columns = ['sum_totalNetTransfers']
regionYearData = regionYearData.reset_index()
regionYearData = regionYearData[(regionYearData['Region'] != "unknown") ]
fig = px.bar(regionYearData,
color='Year',
y='sum_totalNetTransfers',
x='Region',
barmode='group',
category_orders={"Region": regionYearData.loc [(regionYearData['Year'] == '2019'), ['sum_totalNetTransfers' , "Region"]].sort_values(by='sum_totalNetTransfers', ascending=False)['Region'].to_list()}
)
fig.for_each_trace(lambda t: t.update(name=t.name.replace("Year=", "")))
fig.show()
countryPie = PieData.groupby([PieData['name']]).agg({'Value':['sum']})
countryPie.index = countryPie.index.set_names(['name'])
countryPie.columns = ['sum_totalTransfers']
countryPie = countryPie.sort_values(by = 'sum_totalTransfers', ascending=False).reset_index()
splitDataFrameAfter = 15
topCountries = countryPie[:splitDataFrameAfter]
bottomCountries = countryPie[splitDataFrameAfter:]
totalBottomCountries = bottomCountries['sum_totalTransfers'].sum()
otherCountries = pd.DataFrame([{'name': 'Other', 'sum_totalTransfers' : totalBottomCountries}])
topCountries = pd.concat([otherCountries, topCountries], ignore_index=True)
fig = px.pie(topCountries, values='sum_totalTransfers', names='name', #title='Sum Total Transfers by Region',
color_discrete_sequence=px.colors.sequential.Rainbow)
fig.show()
countryYearData = df_detail_summary[ ['name',
'sum_totalTransfers',
'2019_totalTransfers',
'2018_totalTransfers',
'2017_totalTransfers',
'2016_totalTransfers',
'2015_totalTransfers',
'2014_totalTransfers',
'2013_totalTransfers',
'2012_totalTransfers',
'2011_totalTransfers',
'2010_totalTransfers' ]].sort_values(by='sum_totalTransfers', ascending=False).head(15).melt(id_vars=['name'],
var_name="Type",
value_name="Value")
countryYearData[['Year','Type']] = countryYearData['Type'].str.split("_", expand=True,)
countryYearData = countryYearData[(countryYearData['Year'] != "sum") ]
countryYearData = countryYearData.groupby([countryYearData['name'], countryYearData['Year']]).agg({'Value':['sum']})
countryYearData.index = countryYearData.index.set_names(['name', 'Year'])
countryYearData.columns = ['totalTransfers']
countryYearData = countryYearData.reset_index()
fig = px.bar(countryYearData,
y='totalTransfers',
color='Year',
barmode='group',
x='name',
category_orders={"name": countryYearData.loc [(countryYearData['Year'] == '2018'), ['totalTransfers' , "name"]].sort_values(by='totalTransfers', ascending=False)['name'].to_list()})
fig.for_each_trace(lambda t: t.update(name=t.name.replace("Year=", "")))
fig.show()
countryYearData = df_detail_summary[ ['name',
'sum_totalNetTransfers' ,
'2019_netTransfers',
'2018_netTransfers',
'2017_netTransfers',
'2016_netTransfers',
'2015_netTransfers',
'2014_netTransfers',
'2013_netTransfers',
'2012_netTransfers',
'2011_netTransfers',
'2010_netTransfers' ]].sort_values(by='sum_totalNetTransfers', ascending=False).head(15).melt(id_vars=['name'],
var_name="Type",
value_name="Value")
countryYearData[['Year','Type']] = countryYearData['Type'].str.split("_", expand=True,)
countryYearData = countryYearData[(countryYearData['Year'] != "sum") ]
countryYearData = countryYearData.groupby([countryYearData['name'], countryYearData['Year']]).agg({'Value':['sum']})
countryYearData.index = countryYearData.index.set_names(['name', 'Year'])
countryYearData.columns = ['netTransfers']
countryYearData = countryYearData.reset_index()
fig = px.bar(countryYearData,
y='netTransfers',
color='Year',
barmode='group',
x='name',
category_orders={"name": countryYearData.loc [(countryYearData['Year'] == '2019'), ['netTransfers' , "name"]
].sort_values(by='netTransfers', ascending=False)['name'].to_list()})
fig.for_each_trace(lambda t: t.update(name=t.name.replace("Year=", "")))
fig.show()
countryYearData = df_detail_summary[ ['name',
'sum_totalNetTransfers' ,
'2019_netTransfers',
'2018_netTransfers',
'2017_netTransfers',
'2016_netTransfers',
'2015_netTransfers',
'2014_netTransfers',
'2013_netTransfers',
'2012_netTransfers',
'2011_netTransfers',
'2010_netTransfers' ]].sort_values(by='sum_totalNetTransfers', ascending=True).head(15).melt(id_vars=['name'],
var_name="Type",
value_name="Value")
countryYearData[['Year','Type']] = countryYearData['Type'].str.split("_", expand=True,)
countryYearData = countryYearData[(countryYearData['Year'] != "sum") ]
countryYearData = countryYearData.groupby([countryYearData['name'], countryYearData['Year']]).agg({'Value':['sum']})
countryYearData.index = countryYearData.index.set_names(['name', 'Year'])
countryYearData.columns = ['netTransfers']
countryYearData = countryYearData.reset_index()
fig = px.bar(countryYearData,
y='netTransfers',
color='Year',
barmode='group',
x='name',
category_orders={"name": countryYearData.loc [(countryYearData['Year'] == '2018'), ['netTransfers' , "name"]
].sort_values(by='netTransfers', ascending=True)['name'].to_list()})
fig.for_each_trace(lambda t: t.update(name=t.name.replace("Year=", "")))
fig.show()
## Generate scatter plot
fig = px.scatter_mapbox(df_detail_summary,
mapbox_style='light',
hover_name= 'name',
hover_data = [ 'sum_totalIncomingTransfers' , 'sum_totalOutgoingTransfers', 'sum_totalTransfers','sum_totalNetTransfers', 'std_totalNetTransfers', 'std_totalTransfers'],
lat="lat",
lon="lon",
color="sum_totalNetTransfers",
color_continuous_scale=["red", "orange", "blue", "green"], #px.colors.diverging.Armyrose,
zoom=1,
size=df_detail_summary['sum_totalNetTransfers'].abs(), ### size of the dot is the absolute value of the net transfer
size_max=60,
height=600,
width=1200,
range_color=[-2636969230767,2636969230767]
)
#fig.update_layout(coloraxis=dict(cmin=2600369069230767))
#fig.show()
#df_detail_summary['sum_totalNetTransfers'].min()
scope = PlotlyScope()
img_bytes = scope.transform(fig, format="png", scale=1)
Image(img_bytes)
Comparison of Total Transferred against Country Population.
Based on total transfers between 2010 and 2019. The diagram highlights outliers such as Holy See, Cayman Islands, Bermuda, Jersey, etc.. with relatively low populations however significant total transfers. Similarly North Korea is an outlier wiht a relatively large population and very few transfers.
#Create chart
fig = px.scatter(df_detail_summary[ (df_detail_summary['population'] != 1 ) ],
x="population",
y="sum_totalTransfers",
color = "name",
log_x=True,
log_y=True,
hover_name="name",
hover_data=[ 'sum_totalIncomingTransfers' , 'sum_totalOutgoingTransfers', 'sum_totalTransfers','sum_totalNetTransfers', 'std_totalNetTransfers', 'std_totalTransfers']
)
fig.for_each_annotation(lambda a: a.update(text=a.text.replace("name=", "")))
fig.for_each_trace(lambda t: t.update(name=t.name.replace("name=", "")))
fig.layout.update(showlegend=False)
fig.update_traces(marker=dict(size=16,
line=dict(
width=1,
color='DarkSlateGrey')
),
selector=dict(mode='markers'))
fig.show()