Dataset statistics
| Number of variables | 4 |
|---|---|
| Number of observations | 23995 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 23 |
| Duplicate rows (%) | 0.1% |
| Total size in memory | 2.7 MiB |
| Average record size in memory | 119.4 B |
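The percentage and size figures in this table follow from the raw counts. A minimal sketch of that arithmetic, assuming the total size is roughly the average record size times the number of observations (that relationship is an assumption here, not something the report states):

```python
# Figures copied from the dataset statistics table.
n_rows = 23995
n_duplicates = 23
avg_record_bytes = 119.4

# Share of rows flagged as duplicates, rounded to one decimal as in the report.
duplicate_pct = round(100 * n_duplicates / n_rows, 1)

# Assumption: total memory is about average record size times row count.
total_mib = round(avg_record_bytes * n_rows / 2**20, 1)

print(f"Duplicate rows: {duplicate_pct}%")
print(f"Total size in memory: {total_mib} MiB")
```

Both values agree with the table after rounding, which suggests the reported sizes are internally consistent.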
Variable types
| DateTime | 1 |
|---|---|
| Numeric | 2 |
| Categorical | 1 |
Alerts
Dataset has 23 (0.1%) duplicate rows (Duplicates)
Reproduction
| Analysis started | 2023-06-16 00:00:33.250961 |
|---|---|
| Analysis finished | 2023-06-16 00:00:34.524448 |
| Duration | 1.27 seconds |
| Software version | pandas-profiling v0.0.dev0 |
| Download configuration | config.json |
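A report of this kind is usually produced by passing a DataFrame to pandas-profiling's `ProfileReport`. The following is only a sketch under stated assumptions: the function name `build_report`, the CSV path, the report title, and the output file name are all hypothetical, and the report itself does not say how the data was loaded or which settings were used.

```python
def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a transactions file and write an HTML report.

    Assumptions: csv_path and out_path are hypothetical. The column name
    "date" comes from the report; the original loading step is unknown.
    """
    # Imported lazily so the sketch can be loaded without the
    # third-party packages installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    # Parse the date column so it is profiled as DateTime, not text.
    df = pd.read_csv(csv_path, parse_dates=["date"])

    profile = ProfileReport(df, title="Transactions, October 1997")
    profile.to_file(out_path)
```

Calling `build_report("transactions.csv")` with a suitable file would write `report.html`; timings and memory figures will differ from those shown here.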
date
Date
| Distinct | 31 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 891.0 KiB |
| Minimum | 1997-10-01 00:00:00 |
| Maximum | 1997-10-31 00:00:00 |
Histogram with fixed size bins (bins=31)
account_id
Real number (ℝ)
| Distinct | 4362 |
|---|---|
| Distinct (%) | 18.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2989.8172 |
| Minimum | 1 |
| Maximum | 11382 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 891.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 242.7 |
| Q1 | 1232.5 |
| median | 2456 |
| Q3 | 3689.5 |
| 95-th percentile | 9203 |
| Maximum | 11382 |
| Range | 11381 |
| Interquartile range (IQR) | 2457 |
Descriptive statistics
| Standard deviation | 2530.1562 |
|---|---|
| Coefficient of variation (CV) | 0.84625783 |
| Kurtosis | 1.9266768 |
| Mean | 2989.8172 |
| Median Absolute Deviation (MAD) | 1228 |
| Skewness | 1.5096556 |
| Sum | 71740663 |
| Variance | 6401690.4 |
| Monotonicity | Not monotonic |
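Several of the quantile and descriptive statistics for `account_id` are simple functions of one another. As a quick consistency check, the sketch below recomputes them from the reported figures; small differences are expected because the inputs are already rounded.

```python
# Figures copied from the account_id tables.
n = 23995
mean, std = 2989.8172, 2530.1562
minimum, q1, q3, maximum = 1, 1232.5, 3689.5, 11382

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance is the squared standard deviation
approx_sum = mean * n             # Sum, up to rounding of the mean

print(value_range, iqr)
print(round(cv, 6), round(variance, 1), round(approx_sum))
```

The recomputed range and IQR match the table exactly, and the CV, variance, and sum agree to within the rounding of the mean and standard deviation.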
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 4321 | 17 | 0.1% |
| 7445 | 17 | 0.1% |
| 3733 | 15 | 0.1% |
| 6083 | 15 | 0.1% |
| 3526 | 15 | 0.1% |
| 10478 | 14 | 0.1% |
| 2933 | 14 | 0.1% |
| 3646 | 14 | 0.1% |
| 3862 | 14 | 0.1% |
| 8899 | 14 | 0.1% |
| Other values (4352) | 23846 | 99.4% |
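Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 4 | < 0.1% |
| 2 | 6 | < 0.1% |
| 3 | 3 | < 0.1% |
| 4 | 6 | < 0.1% |
| 5 | 3 | < 0.1% |
| 6 | 4 | < 0.1% |
| 7 | 5 | < 0.1% |
| 8 | 6 | < 0.1% |
| 9 | 6 | < 0.1% |
| 10 | 5 | < 0.1% |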
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 11382 | 6 | < 0.1% |
| 11362 | 10 | < 0.1% |
| 11359 | 7 | < 0.1% |
| 11349 | 6 | < 0.1% |
| 11333 | 6 | < 0.1% |
| 11328 | 6 | < 0.1% |
| 11327 | 1 | < 0.1% |
| 11325 | 12 | 0.1% |
| 11320 | 6 | < 0.1% |
| 11317 | 8 | < 0.1% |
type
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.2 MiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 8.4096687 |
| Min length | 6 |
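Because `type` takes only two values, the length statistics can be re-derived from the category counts alone. A minimal sketch, using the counts reported in the Common Values table:

```python
from statistics import median

# Category counts copied from the Common Values table.
counts = {"WITHDRAWAL": 14455, "CREDIT": 9540}

total_rows = sum(counts.values())
total_chars = sum(len(label) * n for label, n in counts.items())

max_length = max(len(label) for label in counts)
min_length = min(len(label) for label in counts)
mean_length = total_chars / total_rows

# Expanding every row is cheap at this size; a weighted median would avoid it.
lengths = [len(label) for label, n in counts.items() for _ in range(n)]
median_length = median(lengths)

print(max_length, median_length, round(mean_length, 7), min_length)
```

The median is 10 because more than half of the rows are `WITHDRAWAL`.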
Characters and Unicode
| Total characters | 201790 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
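As an illustration of such a character property, Python's standard `unicodedata` module exposes the general category of each code point (`Lu` is "Letter, uppercase"). The sketch below tallies categories for the two `type` labels, weighted by the counts from the Common Values table. It covers categories only, since the standard library has no script or block lookup, and it is not necessarily how the profiling tool computes these figures.

```python
import unicodedata
from collections import Counter

# Category counts copied from the Common Values table.
counts = {"WITHDRAWAL": 14455, "CREDIT": 9540}

# Tally the Unicode general category of every character, weighted by
# how often each label occurs in the column.
category_totals = Counter()
for label, n in counts.items():
    for ch in label:
        category_totals[unicodedata.category(ch)] += n

print(dict(category_totals))  # {'Lu': 201790}
```

A single category covering all 201790 characters is consistent with the "Distinct categories" value of 1.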
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | WITHDRAWAL |
|---|---|
| 2nd row | CREDIT |
| 3rd row | WITHDRAWAL |
| 4th row | WITHDRAWAL |
| 5th row | WITHDRAWAL |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| WITHDRAWAL | 14455 | 60.2% |
| CREDIT | 9540 | 39.8% |
Length
Histogram of lengths of the category
Common Values (Plot)
Words
| Value | Count | Frequency (%) |
|---|---|---|
| withdrawal | 14455 | 60.2% |
| credit | 9540 | 39.8% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| W | 28910 | 14.3% |
| A | 28910 | 14.3% |
| I | 23995 | 11.9% |
| T | 23995 | 11.9% |
| D | 23995 | 11.9% |
| R | 23995 | 11.9% |
| H | 14455 | 7.2% |
| L | 14455 | 7.2% |
| C | 9540 | 4.7% |
| E | 9540 | 4.7% |
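The character counts in this table can be reproduced from the two category counts, since every row contains either `WITHDRAWAL` or `CREDIT`. A minimal sketch of that derivation:

```python
from collections import Counter

# Category counts copied from the Common Values table.
counts = {"WITHDRAWAL": 14455, "CREDIT": 9540}

# Count each character once per label, then scale by the label's row count.
char_totals = Counter()
for label, n in counts.items():
    for ch, k in Counter(label).items():
        char_totals[ch] += k * n

total = sum(char_totals.values())
for ch, k in char_totals.most_common():
    print(f"{ch}  {k:>6}  {100 * k / total:.1f}%")
```

`W` and `A` lead because each appears twice in `WITHDRAWAL`, while `C` and `E` occur only in `CREDIT`.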
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Uppercase Letter | 201790 | 100.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| W | 28910 | 14.3% |
| A | 28910 | 14.3% |
| I | 23995 | 11.9% |
| T | 23995 | 11.9% |
| D | 23995 | 11.9% |
| R | 23995 | 11.9% |
| H | 14455 | 7.2% |
| L | 14455 | 7.2% |
| C | 9540 | 4.7% |
| E | 9540 | 4.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 201790 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
|---|---|---|
| W | 28910 | 14.3% |
| A | 28910 | 14.3% |
| I | 23995 | 11.9% |
| T | 23995 | 11.9% |
| D | 23995 | 11.9% |
| R | 23995 | 11.9% |
| H | 14455 | 7.2% |
| L | 14455 | 7.2% |
| C | 9540 | 4.7% |
| E | 9540 | 4.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 201790 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| W | 28910 | 14.3% |
| A | 28910 | 14.3% |
| I | 23995 | 11.9% |
| T | 23995 | 11.9% |
| D | 23995 | 11.9% |
| R | 23995 | 11.9% |
| H | 14455 | 7.2% |
| L | 14455 | 7.2% |
| C | 9540 | 4.7% |
| E | 9540 | 4.7% |
amount
Real number (ℝ)
| Distinct | 10036 |
|---|---|
| Distinct (%) | 41.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5998.506 |
| Minimum | 0.1 |
| Maximum | 64900 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 891.0 KiB |
Quantile statistics
| Minimum | 0.1 |
|---|---|
| 5-th percentile | 14.6 |
| Q1 | 123.9 |
| median | 2107 |
| Q3 | 7193.5 |
| 95-th percentile | 25085.3 |
| Maximum | 64900 |
| Range | 64899.9 |
| Interquartile range (IQR) | 7069.6 |
Descriptive statistics
| Standard deviation | 9474.0926 |
|---|---|
| Coefficient of variation (CV) | 1.5794087 |
| Kurtosis | 6.7870479 |
| Mean | 5998.506 |
| Median Absolute Deviation (MAD) | 2092.4 |
| Skewness | 2.4686457 |
| Sum | 1.4393415 × 10⁸ |
| Variance | 89758431 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 14.6 | 3584 | 14.9% |
| 30 | 206 | 0.9% |
| 100 | 111 | 0.5% |
| 600 | 79 | 0.3% |
| 1200 | 78 | 0.3% |
| 2400 | 77 | 0.3% |
| 1500 | 75 | 0.3% |
| 1800 | 75 | 0.3% |
| 1000 | 73 | 0.3% |
| 1300 | 73 | 0.3% |
| Other values (10026) | 19564 | 81.5% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.1 | 1 | < 0.1% |
| 0.2 | 1 | < 0.1% |
| 0.3 | 1 | < 0.1% |
| 0.8 | 1 | < 0.1% |
| 0.9 | 3 | < 0.1% |
| 1 | 4 | < 0.1% |
| 1.7 | 1 | < 0.1% |
| 2 | 8 | < 0.1% |
| 2.3 | 1 | < 0.1% |
| 2.5 | 1 | < 0.1% |
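Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 64900 | 1 | < 0.1% |
| 64700 | 1 | < 0.1% |
| 64600 | 1 | < 0.1% |
| 64000 | 1 | < 0.1% |
| 63400 | 1 | < 0.1% |
| 63000 | 1 | < 0.1% |
| 62800 | 1 | < 0.1% |
| 62700 | 1 | < 0.1% |
| 62400 | 1 | < 0.1% |
| 62200 | 2 | < 0.1% |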
Correlations
| | account_id | amount | type |
|---|---|---|---|
| account_id | 1.000 | 0.073 | 0.013 |
| amount | 0.073 | 1.000 | 0.210 |
| type | 0.013 | 0.210 | 1.000 |
A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
First rows
| | date | account_id | type | amount |
|---|---|---|---|---|
| 657611 | 1997-10-01 | 3571 | WITHDRAWAL | 1300.0 |
| 657612 | 1997-10-01 | 2370 | CREDIT | 300.0 |
| 657613 | 1997-10-01 | 7733 | WITHDRAWAL | 14200.0 |
| 657614 | 1997-10-01 | 7753 | WITHDRAWAL | 12600.0 |
| 657615 | 1997-10-01 | 7721 | WITHDRAWAL | 18000.0 |
| 657616 | 1997-10-01 | 3136 | WITHDRAWAL | 4100.0 |
| 657617 | 1997-10-01 | 3278 | WITHDRAWAL | 3120.0 |
| 657618 | 1997-10-01 | 3635 | WITHDRAWAL | 5800.0 |
| 657619 | 1997-10-01 | 3638 | WITHDRAWAL | 9900.0 |
| 657620 | 1997-10-01 | 5891 | WITHDRAWAL | 7400.0 |
Last rows
| | date | account_id | type | amount |
|---|---|---|---|---|
| 681596 | 1997-10-31 | 1769 | WITHDRAWAL | 14.6 |
| 681597 | 1997-10-31 | 1765 | WITHDRAWAL | 14.6 |
| 681598 | 1997-10-31 | 1763 | WITHDRAWAL | 14.6 |
| 681599 | 1997-10-31 | 1773 | WITHDRAWAL | 14.6 |
| 681600 | 1997-10-31 | 1771 | WITHDRAWAL | 14.6 |
| 681601 | 1997-10-31 | 1775 | WITHDRAWAL | 14.6 |
| 681602 | 1997-10-31 | 1767 | WITHDRAWAL | 14.6 |
| 681603 | 1997-10-31 | 1772 | WITHDRAWAL | 30.0 |
| 681604 | 1997-10-31 | 1768 | WITHDRAWAL | 14.6 |
| 681605 | 1997-10-31 | 1777 | WITHDRAWAL | 14.6 |
Most frequently occurring
| | date | account_id | type | amount | # duplicates |
|---|---|---|---|---|---|
| 0 | 1997-10-31 | 103 | CREDIT | 33.4 | 2 |
| 1 | 1997-10-31 | 7944 | CREDIT | 360.2 | 2 |
| 2 | 1997-10-31 | 8316 | CREDIT | 257.2 | 2 |
| 3 | 1997-10-31 | 8320 | CREDIT | 70.5 | 2 |
| 4 | 1997-10-31 | 8327 | CREDIT | 146.9 | 2 |
| 5 | 1997-10-31 | 8330 | CREDIT | 164.7 | 2 |
| 6 | 1997-10-31 | 8489 | CREDIT | 162.2 | 2 |
| 7 | 1997-10-31 | 8519 | CREDIT | 204.7 | 2 |
| 8 | 1997-10-31 | 8784 | CREDIT | 141.7 | 2 |
| 9 | 1997-10-31 | 8982 | CREDIT | 262.7 | 2 |
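One way to surface repeated rows like those listed above is to count identical tuples. The sketch below uses a small hypothetical sample: two rows are copied from this table and entered twice, and one unique row is taken from the first-rows sample. It illustrates the idea only and is not the profiling tool's own duplicate detection.

```python
from collections import Counter

# Hypothetical sample built from rows shown in the report.
rows = [
    ("1997-10-31", 103, "CREDIT", 33.4),
    ("1997-10-31", 103, "CREDIT", 33.4),
    ("1997-10-31", 7944, "CREDIT", 360.2),
    ("1997-10-31", 7944, "CREDIT", 360.2),
    ("1997-10-01", 3571, "WITHDRAWAL", 1300.0),
]

# Count how often each full row occurs and keep those seen more than once.
occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

for row, n in duplicated.items():
    print(row, n)
```

Rows compare as duplicates only when all four fields match, which is why the date, account, type, and amount all appear in the listing.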