Overview

Dataset statistics

Number of variables: 4
Number of observations: 23995
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 23
Duplicate rows (%): 0.1%
Total size in memory: 2.7 MiB
Average record size in memory: 119.4 B

Variable types

DateTime: 1
Numeric: 2
Categorical: 1

Alerts

Dataset has 23 (0.1%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2023-06-16 00:00:33.250961
Analysis finished: 2023-06-16 00:00:34.524448
Duration: 1.27 seconds
Software version: pandas-profiling v0.0.dev0
Download configuration: config.json

Variables

date
Date

Distinct: 31
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 891.0 KiB
Minimum: 1997-10-01 00:00:00
Maximum: 1997-10-31 00:00:00
Histogram with fixed size bins (bins=31)

account_id
Real number (ℝ)

Distinct: 4362
Distinct (%): 18.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2989.8172
Minimum: 1
Maximum: 11382
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 891.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 242.7
Q1: 1232.5
Median: 2456
Q3: 3689.5
95-th percentile: 9203
Maximum: 11382
Range: 11381
Interquartile range (IQR): 2457

Descriptive statistics

Standard deviation: 2530.1562
Coefficient of variation (CV): 0.84625783
Kurtosis: 1.9266768
Mean: 2989.8172
Median Absolute Deviation (MAD): 1228
Skewness: 1.5096556
Sum: 71740663
Variance: 6401690.4
Monotonicity: Not monotonic
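Several of the account_id figures above are tied together by simple identities, so they can be sanity-checked without the underlying data. The sketch below recomputes the derived values from the reported mean, standard deviation, quartiles, and extremes. The variable names are illustrative, and small differences from the report are expected because the inputs are already rounded.

```python
# Values copied from the account_id statistics reported above.
mean = 2989.8172
std = 2530.1562
q1, q3 = 1232.5, 3689.5
minimum, maximum = 1, 11382
n_rows = 23995

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n_rows            # sum is the mean times the observation count

print(f"CV       = {cv:.8f}")
print(f"Variance = {variance:.1f}")
print(f"IQR      = {iqr}")
print(f"Range    = {value_range}")
print(f"Sum      = {total:.0f}")
```

The same identities apply to any numeric variable in the report, which makes them a quick way to spot a transcription error in a copied statistic.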
Histogram with fixed size bins (bins=50)
Common values
Value                  Count    Frequency (%)
4321                   17       0.1%
7445                   17       0.1%
3733                   15       0.1%
6083                   15       0.1%
3526                   15       0.1%
10478                  14       0.1%
2933                   14       0.1%
3646                   14       0.1%
3862                   14       0.1%
8899                   14       0.1%
Other values (4352)    23846    99.4%
Minimum 10 values
Value    Count    Frequency (%)
1        4        < 0.1%
2        6        < 0.1%
3        3        < 0.1%
4        6        < 0.1%
5        3        < 0.1%
6        4        < 0.1%
7        5        < 0.1%
8        6        < 0.1%
9        6        < 0.1%
10       5        < 0.1%
Maximum 10 values
Value    Count    Frequency (%)
11382    6        < 0.1%
11362    10       < 0.1%
11359    7        < 0.1%
11349    6        < 0.1%
11333    6        < 0.1%
11328    6        < 0.1%
11327    1        < 0.1%
11325    12       0.1%
11320    6        < 0.1%
11317    8        < 0.1%

type
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB
WITHDRAWAL: 14455
CREDIT: 9540

Length

Max length: 10
Median length: 10
Mean length: 8.4096687
Min length: 6

Characters and Unicode

Total characters: 201790
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
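The character-level figures for this variable follow directly from the two category counts. As a minimal sketch, the snippet below rebuilds the per-character totals and looks up each character's Unicode general category with Python's standard `unicodedata` module. Script and block properties are not exposed by that module, so they are left out here; how the report itself derives them is not shown in this document.

```python
import unicodedata
from collections import Counter

# Category counts taken from the "type" variable above.
category_counts = {"WITHDRAWAL": 14455, "CREDIT": 9540}

# Each character of a label occurs once per row carrying that label.
char_counts = Counter()
for label, count in category_counts.items():
    for ch in label:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
n_rows = sum(category_counts.values())
mean_length = total_chars / n_rows

# General category per distinct character, e.g. "Lu" is "Letter, uppercase".
general_categories = {unicodedata.category(ch) for ch in char_counts}

print("Total characters:", total_chars)
print("Distinct characters:", len(char_counts))
print("Mean length:", round(mean_length, 7))
print("General categories:", general_categories)
print("Most frequent:", char_counts.most_common(2))
```

Running it should reproduce the total character count, the number of distinct characters, and the mean length listed for this variable.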

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: WITHDRAWAL
2nd row: CREDIT
3rd row: WITHDRAWAL
4th row: WITHDRAWAL
5th row: WITHDRAWAL

Common Values

Value         Count    Frequency (%)
WITHDRAWAL    14455    60.2%
CREDIT        9540     39.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value         Count    Frequency (%)
withdrawal    14455    60.2%
credit        9540     39.8%

Most occurring characters

Value    Count    Frequency (%)
W        28910    14.3%
A        28910    14.3%
I        23995    11.9%
T        23995    11.9%
D        23995    11.9%
R        23995    11.9%
H        14455    7.2%
L        14455    7.2%
C        9540     4.7%
E        9540     4.7%

Most occurring categories

Value               Count     Frequency (%)
Uppercase Letter    201790    100.0%

Most frequent character per category

Uppercase Letter
Value    Count    Frequency (%)
W        28910    14.3%
A        28910    14.3%
I        23995    11.9%
T        23995    11.9%
D        23995    11.9%
R        23995    11.9%
H        14455    7.2%
L        14455    7.2%
C        9540     4.7%
E        9540     4.7%

Most occurring scripts

Value    Count     Frequency (%)
Latin    201790    100.0%

Most frequent character per script

Latin
Value    Count    Frequency (%)
W        28910    14.3%
A        28910    14.3%
I        23995    11.9%
T        23995    11.9%
D        23995    11.9%
R        23995    11.9%
H        14455    7.2%
L        14455    7.2%
C        9540     4.7%
E        9540     4.7%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    201790    100.0%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
W        28910    14.3%
A        28910    14.3%
I        23995    11.9%
T        23995    11.9%
D        23995    11.9%
R        23995    11.9%
H        14455    7.2%
L        14455    7.2%
C        9540     4.7%
E        9540     4.7%

amount
Real number (ℝ)

Distinct: 10036
Distinct (%): 41.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5998.506
Minimum: 0.1
Maximum: 64900
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 891.0 KiB

Quantile statistics

Minimum: 0.1
5-th percentile: 14.6
Q1: 123.9
Median: 2107
Q3: 7193.5
95-th percentile: 25085.3
Maximum: 64900
Range: 64899.9
Interquartile range (IQR): 7069.6

Descriptive statistics

Standard deviation: 9474.0926
Coefficient of variation (CV): 1.5794087
Kurtosis: 6.7870479
Mean: 5998.506
Median Absolute Deviation (MAD): 2092.4
Skewness: 2.4686457
Sum: 1.4393415 × 10^8
Variance: 89758431
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
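A histogram with fixed-size bins splits the interval from the minimum to the maximum into equal-width bins and counts the values that fall in each. The sketch below shows one common convention in plain Python, using the ten amount values from the first sample rows of this report and 5 bins instead of 50 so the output stays readable. The function name `fixed_width_histogram` is hypothetical, and the report's own plotting code is not shown in this document, so its edge handling may differ.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        # Degenerate case: every value is identical, so use the first bin.
        counts[0] = len(values)
        return [lo] * (bins + 1), counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum belongs to the last bin instead of opening a new one.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


# Amounts from the first ten sample rows of the report.
amounts = [1300.0, 300.0, 14200.0, 12600.0, 18000.0,
           4100.0, 3120.0, 5800.0, 9900.0, 7400.0]

edges, counts = fixed_width_histogram(amounts, bins=5)
for left, right, n in zip(edges, edges[1:], counts):
    print(f"[{left:8.1f}, {right:8.1f}) {n}")
```

With a strongly right-skewed variable such as amount, most of the mass lands in the first few bins, which is why equal-width bins over the full range can hide detail among small values.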
Common values
Value                   Count    Frequency (%)
14.6                    3584     14.9%
30                      206      0.9%
100                     111      0.5%
600                     79       0.3%
1200                    78       0.3%
2400                    77       0.3%
1500                    75       0.3%
1800                    75       0.3%
1000                    73       0.3%
1300                    73       0.3%
Other values (10026)    19564    81.5%
Minimum 10 values
Value    Count    Frequency (%)
0.1      1        < 0.1%
0.2      1        < 0.1%
0.3      1        < 0.1%
0.8      1        < 0.1%
0.9      3        < 0.1%
1        4        < 0.1%
1.7      1        < 0.1%
2        8        < 0.1%
2.3      1        < 0.1%
2.5      1        < 0.1%
Maximum 10 values
Value    Count    Frequency (%)
64900    1        < 0.1%
64700    1        < 0.1%
64600    1        < 0.1%
64000    1        < 0.1%
63400    1        < 0.1%
63000    1        < 0.1%
62800    1        < 0.1%
62700    1        < 0.1%
62400    1        < 0.1%
62200    2        < 0.1%

Interactions

[Four interaction plots; images not recoverable]

Correlations

              account_id    amount    type
account_id    1.000         0.073     0.013
amount        0.073         1.000     0.210
type          0.013         0.210     1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows
          date          account_id    type          amount
657611    1997-10-01    3571          WITHDRAWAL    1300.0
657612    1997-10-01    2370          CREDIT        300.0
657613    1997-10-01    7733          WITHDRAWAL    14200.0
657614    1997-10-01    7753          WITHDRAWAL    12600.0
657615    1997-10-01    7721          WITHDRAWAL    18000.0
657616    1997-10-01    3136          WITHDRAWAL    4100.0
657617    1997-10-01    3278          WITHDRAWAL    3120.0
657618    1997-10-01    3635          WITHDRAWAL    5800.0
657619    1997-10-01    3638          WITHDRAWAL    9900.0
657620    1997-10-01    5891          WITHDRAWAL    7400.0
Last rows
          date          account_id    type          amount
681596    1997-10-31    1769          WITHDRAWAL    14.6
681597    1997-10-31    1765          WITHDRAWAL    14.6
681598    1997-10-31    1763          WITHDRAWAL    14.6
681599    1997-10-31    1773          WITHDRAWAL    14.6
681600    1997-10-31    1771          WITHDRAWAL    14.6
681601    1997-10-31    1775          WITHDRAWAL    14.6
681602    1997-10-31    1767          WITHDRAWAL    14.6
681603    1997-10-31    1772          WITHDRAWAL    30.0
681604    1997-10-31    1768          WITHDRAWAL    14.6
681605    1997-10-31    1777          WITHDRAWAL    14.6

Duplicate rows

Most frequently occurring

     date          account_id    type      amount    # duplicates
0    1997-10-31    103           CREDIT    33.4      2
1    1997-10-31    7944          CREDIT    360.2     2
2    1997-10-31    8316          CREDIT    257.2     2
3    1997-10-31    8320          CREDIT    70.5      2
4    1997-10-31    8327          CREDIT    146.9     2
5    1997-10-31    8330          CREDIT    164.7     2
6    1997-10-31    8489          CREDIT    162.2     2
7    1997-10-31    8519          CREDIT    204.7     2
8    1997-10-31    8784          CREDIT    141.7     2
9    1997-10-31    8982          CREDIT    262.7     2
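A duplicate row here is one whose date, account_id, type, and amount all match another row. As a minimal sketch, the snippet below counts such repeats with `collections.Counter` on a small illustrative subset of rows read from the tables in this report. How the report itself arrives at its total of 23 is not stated, so both a per-group count and a surplus-row count are shown.

```python
from collections import Counter

# Illustrative subset of rows (date, account_id, type, amount) as read from
# the tables in this report: two repeated rows and one unique row.
rows = [
    ("1997-10-31", 103, "CREDIT", 33.4),
    ("1997-10-31", 103, "CREDIT", 33.4),
    ("1997-10-31", 7944, "CREDIT", 360.2),
    ("1997-10-31", 7944, "CREDIT", 360.2),
    ("1997-10-31", 1769, "WITHDRAWAL", 14.6),
]

occurrences = Counter(rows)

# Keep only rows that occur more than once, with their occurrence counts.
duplicates = {row: n for row, n in occurrences.items() if n > 1}

n_groups = len(duplicates)                          # distinct repeated rows
n_surplus = sum(n - 1 for n in duplicates.values()) # copies beyond the first

for row, n in duplicates.items():
    print(row, "occurs", n, "times")
print("Distinct repeated rows:", n_groups)
print("Surplus copies:", n_surplus)
```

Exact matching on a floating-point amount works here because the values are compared as stored; amounts that differ only by rounding would not be treated as duplicates.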