Dataset statistics
Number of variables: 14
Number of observations: 1237
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1092
Duplicate rows (%): 88.3%
Total size in memory: 135.4 KiB
Average record size in memory: 112.1 B
Variable types
Categorical: 2
Numeric: 12
Warnings
Dataset has 1092 (88.3%) duplicate rows (Duplicates)
IVS_2010 is highly correlated with IVS_REN_10 (High correlation)
IVS_REN_10 is highly correlated with IVS_2010 (High correlation)
MASC is highly correlated with FEM and 2 other fields (High correlation)
FEM is highly correlated with MASC and 2 other fields (High correlation)
POP is highly correlated with MASC and 2 other fields (High correlation)
DOM_OCU is highly correlated with MASC and 2 other fields (High correlation)
area is highly correlated with perimeter and 1 other field (High correlation)
perimeter is highly correlated with area and 1 other field (High correlation)
URB_RURAL is highly correlated with area and 1 other field (High correlation)
Dens_Dom is highly correlated with Dens_hab (High correlation)
Dens_hab is highly correlated with Dens_Dom (High correlation)
MASC has 40 (3.2%) zeros (Zeros)
FEM has 44 (3.6%) zeros (Zeros)
POP has 40 (3.2%) zeros (Zeros)
DOM_OCU has 40 (3.2%) zeros (Zeros)
Dens_Dom has 440 (35.6%) zeros (Zeros)
Dens_hab has 40 (3.2%) zeros (Zeros)
Reproduction
Analysis started: 2021-01-29 03:41:08.846244
Analysis finished: 2021-01-29 03:41:34.302615
Duration: 25.46 seconds
Software version
Download configuration: config.yaml
Ocup_2010
Categorical
Distinct
2
Distinct (%)
0.2%
Missing
0
Missing (%)
0.0%
Memory size
9.8 KiB
1
808
0
429
Length
Max length
1
Median length
1
Mean length
1
Min length
1
Characters and Unicode
Total characters
1237
Distinct characters
2
Distinct categories
1
Distinct scripts
1
Distinct blocks
1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
Unique
0
Unique (%)
0.0%
Sample
1st row
0
2nd row
0
3rd row
0
4th row
0
5th row
0
Value
Count
Frequency (%)
1
808
65.3%
0
429
34.7%
Histogram of lengths of the category
Value
Count
Frequency (%)
Decimal Number
1237
100.0%
Value
Count
Frequency (%)
Common
1237
100.0%
Value
Count
Frequency (%)
ASCII
1237
100.0%
IVS_2010
Real number (ℝ≥0)
HIGH CORRELATION
Distinct
7
Distinct (%)
0.6%
Missing
0
Missing (%)
0.0%
Infinite
0
Infinite (%)
0.0%
Mean
0.3602247373
Minimum
0.31
Maximum
0.536
Zeros
0
Zeros (%)
0.0%
Memory size
9.8 KiB
Quantile statistics
Minimum
0.31
5-th percentile
0.31
Q1
0.31
median
0.31
Q3
0.435
95-th percentile
0.484
Maximum
0.536
Range
0.226
Interquartile range (IQR)
0.125
Descriptive statistics
Standard deviation
0.0616858069
Coefficient of variation (CV)
0.171242562
Kurtosis
-0.5259216255
Mean
0.3602247373
Median Absolute Deviation (MAD)
0
Skewness
0.8709291587
Sum
445.598
Variance
0.003805138773
Monotonicity
Not monotonic
Histogram with fixed size bins (bins=7)
Value
Count
Frequency (%)
0.31
646
52.2%
0.435
197
15.9%
0.362
182
14.7%
0.484
92
7.4%
0.385
85
6.9%
0.438
23
1.9%
0.536
12
1.0%
Value
Count
Frequency (%)
0.31
646
52.2%
0.362
182
14.7%
0.385
85
6.9%
0.435
197
15.9%
0.438
23
1.9%
Value
Count
Frequency (%)
0.536
12
1.0%
0.484
92
7.4%
0.438
23
1.9%
0.435
197
15.9%
0.385
85
6.9%
IVS_INF_10
Real number (ℝ≥0)
Distinct
7
Distinct (%)
0.6%
Missing
0
Missing (%)
0.0%
Infinite
0
Infinite (%)
0.0%
Mean
0.1465658852
Minimum
0.105
Maximum
0.381
Zeros
0
Zeros (%)
0.0%
Memory size
9.8 KiB
Quantile statistics
Minimum
0.105
5-th percentile
0.105
Q1
0.105
median
0.105
Q3
0.209
95-th percentile
0.255
Maximum
0.381
Range
0.276
Interquartile range (IQR)
0.104
Descriptive statistics
Standard deviation
0.06033772187
Coefficient of variation (CV)
0.4116764401
Kurtosis
0.6008386036
Mean
0.1465658852
Median Absolute Deviation (MAD)
0
Skewness
1.185055336
Sum
181.302
Variance
0.00364064068
Monotonicity
Not monotonic
Histogram with fixed size bins (bins=7)
Value
Count
Frequency (%)
0.105
646
52.2%
0.209
197
15.9%
0.112
182
14.7%
0.235
92
7.4%
0.255
85
6.9%
0.176
23
1.9%
0.381
12
1.0%
Value
Count
Frequency (%)
0.105
646
52.2%
0.112
182
14.7%
0.176
23
1.9%
0.209
197
15.9%
0.235
92
7.4%
Value
Count
Frequency (%)
0.381
12
1.0%
0.255
85
6.9%
0.235
92
7.4%
0.209
197
15.9%
0.176
23
1.9%
IVS_CPH_10
Real number (ℝ≥0)
Distinct
7
Distinct (%)
0.6%
Missing
0
Missing (%)
0.0%
Infinite
0
Infinite (%)
0.0%
Mean
0.5170517381
Minimum
0.467
Maximum
0.603
Zeros
0
Zeros (%)
0.0%
Memory size
9.8 KiB
Quantile statistics
Minimum
0.467
5-th percentile
0.467
Q1
0.467
median
0.467
Q3
0.576
95-th percentile
0.595
Maximum
0.603
Range
0.136
Interquartile range (IQR)
0.109
Descriptive statistics
Standard deviation
0.05698853493
Coefficient of variation (CV)
0.1102182446
Kurtosis
-1.786892214
Mean
0.5170517381
Median Absolute Deviation (MAD)
0
Skewness
0.3636587349
Sum
639.593
Variance
0.003247693114
Monotonicity
Not monotonic
Histogram with fixed size bins (bins=7)
Value
Count
Frequency (%)
0.467
646
52.2%
0.576
197
15.9%
0.595
182
14.7%
0.579
92
7.4%
0.495
85
6.9%
0.59
23
1.9%
0.603
12
1.0%
Value
Count
Frequency (%)
0.467
646
52.2%
0.495
85
6.9%
0.576
197
15.9%
0.579
92
7.4%
0.59
23
1.9%
Value
Count
Frequency (%)
0.603
12
1.0%
0.595
182
14.7%
0.59
23
1.9%
0.579
92
7.4%
0.576
197
15.9%
IVS_REN_10
Real number (ℝ≥0)
HIGH CORRELATION
Distinct
7
Distinct (%)
0.6%
Missing
0
Missing (%)
0.0%
Infinite
0
Infinite (%)
0.0%
Mean
0.4166693614
Minimum
0.357
Maximum
0.637
Zeros
0
Zeros (%)
0.0%
Memory size
9.8 KiB
Quantile statistics
Minimum
0.357
5-th percentile
0.357
Q1
0.357
median
0.357
Q3
0.521
95-th percentile
0.637
Maximum
0.637
Range
0.28
Interquartile range (IQR)
0.164
Descriptive statistics
Standard deviation
0.09032930481
Coefficient of variation (CV)
0.2167889295
Kurtosis
0.3817008174
Mean
0.4166693614
Median Absolute Deviation (MAD)
0
Skewness
1.339804865
Sum
515.42
Variance
0.008159383308
Monotonicity
Not monotonic
Histogram with fixed size bins (bins=7)
Value
Count
Frequency (%)
0.357
646
52.2%
0.521
197
15.9%
0.379
182
14.7%
0.637
92
7.4%
0.406
85
6.9%
0.547
23
1.9%
0.624
12
1.0%
Value
Count
Frequency (%)
0.357
646
52.2%
0.379
182
14.7%
0.406
85
6.9%
0.521
197
15.9%
0.547
23
1.9%
Value
Count
Frequency (%)
0.637
92
7.4%
0.624
12
1.0%
0.547
23
1.9%
0.521
197
15.9%
0.406
85
6.9%
MASC
Real number (ℝ≥0)
HIGH CORRELATION
ZEROS
Distinct
89
Distinct (%)
7.2%
Missing
0
Missing (%)
0.0%
Infinite
0
Infinite (%)
0.0%
Mean
158.3791431
Minimum
0
Maximum
950
Zeros
40
Zeros (%)
3.2%
Memory size
9.8 KiB
Quantile statistics
Minimum
0
5-th percentile
3
Q1
47
median
128
Q3
225
95-th percentile
329
Maximum
950
Range
950
Interquartile range (IQR)
178
Descriptive statistics
Standard deviation
162.7664735
Coefficient of variation (CV)
1.027701441
Kurtosis
11.40354589
Mean
158.3791431
Median Absolute Deviation (MAD)
87
Skewness
2.850508253
Sum
195915
Variance
26492.9249
Monotonicity
Not monotonic
Histogram with fixed size bins (bins=8)
Value
Count
Frequency (%)
258
64
5.2%
138
61
4.9%
114
49
4.0%
0
40
3.2%
170
39
3.2%
207
35
2.8%
147
32
2.6%
128
31
2.5%
950
31
2.5%
283
31
2.5%
Other values (79)
824
66.6%
Value
Count
Frequency (%)
0
40
3.2%
1
4
0.3%
2
17
1.4%
3
21
1.7%
4
16
1.3%
Value
Count
Frequency (%)
950
31
2.5%
358
17
1.4%
338
12
1.0%
329
28
2.3%
328
22
1.8%
FEM
Real number (ℝ≥0)
HIGH CORRELATION
ZEROS
Distinct
91
Distinct (%)
7.4%
Missing
0
Missing (%)
0.0%
Infinite
0
Infinite (%)
0.0%
Mean
178.8229588
Minimum
0
Maximum
985
Zeros
44
Zeros (%)
3.6%
Memory size
9.8 KiB
Quantile statistics
Minimum
0
5-th percentile
2
Q1
52
median
138
Q3
265
95-th percentile
396
Maximum
985
Range
985
Interquartile range (IQR)
213
Descriptive statistics
Standard deviation
175.0384106
Coefficient of variation (CV)
0.9788363404
Kurtosis
8.750252436
Mean
178.8229588
Median Absolute Deviation (MAD)
96
Skewness
2.432589893
Sum
221204
Variance
30638.44517
Monotonicity
Not monotonic
Histogram with fixed size bins (bins=8)
Value
Count
Frequency (%)
296
64
5.2%
138
52
4.2%
0
44
3.6%
178
39
3.2%
182
38
3.1%
228
35
2.8%
170
32
2.6%
308
31
2.5%
140
31
2.5%
985
31
2.5%
Other values (81)
840
67.9%
Value
Count
Frequency (%)
0
44
3.6%
1
13
1.1%
2
6
0.5%
3
18
1.5%
4
14
1.1%
Value
Count
Frequency (%)
985
31
2.5%
400
12
1.0%
397
17
1.4%
396
27
2.2%
378
22
1.8%
POP
Real number (ℝ≥0)
HIGH CORRELATION
ZEROS
Distinct
100
Distinct (%)
8.1%
Missing
0
Missing (%)
0.0%
Infinite
0
Infinite (%)
0.0%
Mean
337.2021019
Minimum
0
Maximum
1935
Zeros
40
Zeros (%)
3.2%
Memory size
9.8 KiB
Quantile statistics
Minimum
0
5-th percentile
5
Q1
99
median
260
Q3
495
95-th percentile
711
Maximum
1935
Range
1935
Interquartile range (IQR)
396
Descriptive statistics
Standard deviation
337.3408572
Coefficient of variation (CV)
1.00041149
Kurtosis
10.03673834
Mean
337.2021019
Median Absolute Deviation (MAD)
181
Skewness
2.636230117
Sum
417119
Variance
113798.8539
Monotonicity
Not monotonic
Histogram with fixed size bins (bins=8)
Value
Count
Frequency (%)
554
64
5.2%
320
56
4.5%
215
49
4.0%
252
49
4.0%
0
40
3.2%
348
39
3.2%
435
35
2.8%
317
32
2.6%
591
31
2.5%
268
31
2.5%
Other values (90)
811
65.6%
Value
Count
Frequency (%)
0
40
3.2%
1
4
0.3%
3
13
1.1%
4
3
0.2%
5
4
0.3%
Value
Count
Frequency (%)
1935
31
2.5%
755
17
1.4%
738
12
1.0%
711
27
2.2%
706
22
1.8%
DOM_OCU
Real number (ℝ≥0)
HIGH CORRELATION
ZEROS
Distinct
78
Distinct (%)
6.3%
Missing
0
Missing (%)
0.0%
Infinite
0
Infinite (%)
0.0%
Mean
94.48827809
Minimum
0
Maximum
552
Zeros
40
Zeros (%)
3.2%
Memory size
9.8 KiB
Quantile statistics
Minimum
0
5-th percentile
1
Q1
26
median
75
Q3
137
95-th percentile
203
Maximum
552
Range
552
Interquartile range (IQR)
111
Descriptive statistics
Standard deviation
96.1753854
Coefficient of variation (CV)
1.017855202
Kurtosis
10.24206169
Mean
94.48827809
Median Absolute Deviation (MAD)
52
Skewness
2.668039073
Sum
116882
Variance
9249.704757
Monotonicity
Not monotonic
Histogram with fixed size bins (bins=8)
Value
Count
Frequency (%)
168
64
5.2%
75
49
4.0%
0
40
3.2%
100
39
3.2%
1
38
3.1%
80
38
3.1%
87
37
3.0%
132
35
2.8%
145
31
2.5%
82
31
2.5%
Other values (68)
835
67.5%
Value
Count
Frequency (%)
0
40
3.2%
1
38
3.1%
2
20
1.6%
3
18
1.5%
4
12
1.0%
Value
Count
Frequency (%)
552
31
2.5%
204
17
1.4%
203
22
1.8%
200
27
2.2%
199
12
1.0%
area
Real number (ℝ≥0)
HIGH CORRELATION
Distinct
144
Distinct (%)
11.6%
Missing
0
Missing (%)
0.0%
Infinite
0
Infinite (%)
0.0%
Mean
270238.3907
Minimum
40071.84875
Maximum
1002134.316
Zeros
0
Zeros (%)
0.0%
Memory size
9.8 KiB
Quantile statistics
Minimum
40071.84875
5-th percentile
40072.15161
Q1
40074.81189
median
40075.04517
Q3
40085.81372
95-th percentile
1002092.473
Maximum
1002134.316
Range
962062.4677
Interquartile range (IQR)
11.00183105
Descriptive statistics
Standard deviation
410543.0688
Coefficient of variation (CV)
1.51918855
Kurtosis
-0.5035733545
Mean
270238.3907
Median Absolute Deviation (MAD)
2.879638672
Skewness
1.223618957
Sum
334284889.3
Variance
1.685456113 × 10^11
Monotonicity
Not monotonic
Histogram with fixed size bins (bins=8)
Value
Count
Frequency (%)
40080.54285
64
5.2%
40072.15161
49
4.0%
40080.68677
39
3.2%
40074.95715
38
3.1%
40075.01538
35
2.8%
40074.94751
32
2.6%
1001847.063
31
2.5%
40077.01318
31
2.5%
1001994.239
31
2.5%
40072.29919
30
2.4%
Other values (134)
857
69.3%
Value
Count
Frequency (%)
40071.84875
1
0.1%
40071.87329
2
0.2%
40072.00989
11
0.9%
40072.02722
1
0.1%
40072.03333
19
1.5%
Value
Count
Frequency (%)
1002134.316
9
0.7%
1002129.26
24
1.9%
1002119.52
2
0.2%
1002112.066
1
0.1%
1002110.929
1
0.1%
perimeter
Real number (ℝ≥0)
HIGH CORRELATION
Distinct
144
Distinct (%)
11.6%
Missing
0
Missing (%)
0.0%
Infinite
0
Infinite (%)
0.0%
Mean
1567.361838
Minimum
800.7884176
Maximum
4004.584675
Zeros
0
Zeros (%)
0.0%
Memory size
9.8 KiB
Quantile statistics
Minimum
800.7884176
5-th percentile
800.7913629
Q1
800.8158141
median
800.8182115
Q3
800.9215971
95-th percentile
4004.504431
Maximum
4004.584675
Range
3203.796258
Interquartile range (IQR)
0.1057829836
Descriptive statistics
Standard deviation
1367.274108
Coefficient of variation (CV)
0.8723410731
Kurtosis
-0.5035735213
Mean
1567.361838
Median Absolute Deviation (MAD)
0.02673125689
Skewness
1.223618914
Sum
1938826.594
Variance
1869438.486
Monotonicity
Not monotonic
Histogram with fixed size bins (bins=8)
Value
Count
Frequency (%)
800.8697137
64
5.2%
800.7913629
49
4.0%
800.8711313
39
3.2%
800.8172217
38
3.1%
800.8179584
35
2.8%
800.8171713
32
2.6%
800.8423494
31
2.5%
4004.026548
31
2.5%
4004.314205
31
2.5%
800.7928178
30
2.4%
Other values (134)
857
69.3%
Value
Count
Frequency (%)
800.7884176
1
0.1%
800.7886419
2
0.2%
800.7899897
11
0.9%
800.7901192
1
0.1%
800.7902002
19
1.5%
Value
Count
Frequency (%)
4004.584675
9
0.7%
4004.576857
24
1.9%
4004.553936
2
0.2%
4004.542472
1
0.1%
4004.54077
1
0.1%
URB_RURAL
Categorical
HIGH CORRELATION
Distinct
2
Distinct (%)
0.2%
Missing
0
Missing (%)
0.0%
Memory size
9.8 KiB
1
941
0
296
Length
Max length
1
Median length
1
Mean length
1
Min length
1
Characters and Unicode
Total characters
1237
Distinct characters
2
Distinct categories
1
Distinct scripts
1
Distinct blocks
1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
Unique
0
Unique (%)
0.0%
Sample
1st row
1
2nd row
1
3rd row
1
4th row
1
5th row
1
Value
Count
Frequency (%)
1
941
76.1%
0
296
23.9%
Histogram of lengths of the category
Value
Count
Frequency (%)
Decimal Number
1237
100.0%
Value
Count
Frequency (%)
Common
1237
100.0%
Value
Count
Frequency (%)
ASCII
1237
100.0%
Dens_Dom
Real number (ℝ≥0)
HIGH CORRELATION
ZEROS
Distinct
6
Distinct (%)
0.5%
Missing
0
Missing (%)
0.0%
Infinite
0
Infinite (%)
0.0%
Mean
0.001728375101
Minimum
0
Maximum
0.005
Zeros
440
Zeros (%)
35.6%
Memory size
9.8 KiB
Quantile statistics
Minimum
0
5-th percentile
0
Q1
0
median
0.002
Q3
0.003
95-th percentile
0.005
Maximum
0.005
Range
0.005
Interquartile range (IQR)
0.003
Descriptive statistics
Standard deviation
0.001662146143
Coefficient of variation (CV)
0.9616813746
Kurtosis
-0.8569973151
Mean
0.001728375101
Median Absolute Deviation (MAD)
0.002
Skewness
0.5684827957
Sum
2.138
Variance
2.762729801 × 10^-6
Monotonicity
Not monotonic
Histogram with fixed size bins (bins=6)
Value
Count
Frequency (%)
0
440
35.6%
0.002
306
24.7%
0.001
146
11.8%
0.004
125
10.1%
0.003
110
8.9%
0.005
110
8.9%
Value
Count
Frequency (%)
0
440
35.6%
0.001
146
11.8%
0.002
306
24.7%
0.003
110
8.9%
0.004
125
10.1%
Value
Count
Frequency (%)
0.005
110
8.9%
0.004
125
10.1%
0.003
110
8.9%
0.002
306
24.7%
0.001
146
11.8%
Dens_hab
Real number (ℝ≥0)
HIGH CORRELATION
ZEROS
Distinct
124
Distinct (%)
10.0%
Missing
0
Missing (%)
0.0%
Infinite
0
Infinite (%)
0.0%
Mean
0.006086562618
Minimum
0
Maximum
0.01883971406
Zeros
40
Zeros (%)
3.2%
Memory size
9.8 KiB
Quantile statistics
Minimum
0
5-th percentile
2.195905554 × 10^-5
Q1
0.0005228866384
median
0.005365302324
Q3
0.009731768111
95-th percentile
0.01761695235
Maximum
0.01883971406
Range
0.01883971406
Interquartile range (IQR)
0.009208881473
Descriptive statistics
Standard deviation
0.005811442849
Coefficient of variation (CV)
0.9547988271
Kurtosis
-0.7230218254
Mean
0.006086562618
Median Absolute Deviation (MAD)
0.004842415686
Skewness
0.6823117744
Sum
7.529077958
Variance
3.377286798 × 10^-5
Monotonicity
Not monotonic
Histogram with fixed size bins (bins=8)
Value
Count
Frequency (%)
0.01382216808
64
5.2%
0.006288656582
49
4.0%
0
40
3.2%
0.008682485957
39
3.2%
0.00798503661
38
3.1%
0.01085464337
35
2.8%
0.007910178795
32
2.6%
0.0005898237508
31
2.5%
0.001931432523
31
2.5%
0.00668712508
31
2.5%
Other values (114)
847
68.5%
Value
Count
Frequency (%)
0
40
3.2%
9.98118972 × 10^-7
2
0.2%
2.994613106 × 10^-6
3
0.2%
7.98480544 × 10^-6
1
0.1%
9.981296241 × 10^-6
4
0.3%
Value
Count
Frequency (%)
0.01883971406
17
1.4%
0.0184155705
12
1.0%
0.0177418051
27
2.2%
0.01761695235
22
1.8%
0.01759199515
28
2.3%
Correlation matrix plots; axis variables: IVS_2010, IVS_INF_10, IVS_CPH_10, IVS_REN_10, MASC, FEM, POP, DOM_OCU, area, perimeter, Dens_Dom, Dens_hab
Pearson's correlation coefficient (r) is a measure of linear
correlation between two variables. Its value lies between -1 and +1, -1
indicating total negative linear correlation, 0 indicating no linear
correlation and 1 indicating total positive linear correlation.
Furthermore, r is invariant under separate changes in location and
scale of the two variables, implying that for a linear function the
angle to the x-axis does not affect r.
To calculate r for two variables X and Y, one divides the
covariance of X and Y by the product of their standard deviations.
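The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs; the function name `pearson_r` is hypothetical, and the MASC and POP values are copied from five distinct sample rows shown in this report.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


# MASC and POP values taken from five distinct sample rows of this report.
masc = [79, 50, 0, 35, 33]
pop = [160, 95, 0, 77, 69]

r = pearson_r(masc, pop)

# Shifting and rescaling one variable (positive factor) leaves r unchanged.
r_shifted = pearson_r([2 * m + 10 for m in masc], pop)
print(round(r, 3), round(r_shifted, 3))
```

On these few rows r comes out close to 1, which is in line with the high-correlation warning reported for MASC and POP.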
Spearman's rank correlation coefficient (ρ) is a measure of
monotonic correlation between two variables, and is therefore better at
catching nonlinear monotonic correlations than Pearson's r. Its value
lies between -1 and +1, -1 indicating total negative monotonic
correlation, 0 indicating no monotonic correlation and 1 indicating
total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the
covariance of the rank variables of X and Y by the product of their
standard deviations.
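A small sketch of this rank-based calculation follows. The helper names (`average_ranks`, `spearman_rho`) are hypothetical, the data is made up for illustration, and giving tied values their average rank is an assumption (it is the usual convention, but the text above does not specify tie handling).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Illustrative data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

Because the relationship is monotonic, ρ reaches its maximum while Pearson's r stays below 1, which is the difference the description points to.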
Similarly to Spearman's rank correlation coefficient, the Kendall rank
correlation coefficient (τ) measures ordinal association between two
variables. Its value lies between -1 and +1, -1 indicating total
negative correlation, 0 indicating no correlation and 1 indicating total
positive correlation.
To calculate τ for two variables X and Y, one determines the
number of concordant and discordant pairs of observations. τ is given
by the number of concordant pairs minus the number of discordant pairs,
divided by the total number of pairs.
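The pair-counting definition above can be written out directly. The sketch below implements exactly that formula (often called tau-a); the function name is hypothetical and the data is illustrative. Library implementations may differ: SciPy's `kendalltau`, for example, defaults to the tie-adjusted tau-b variant, so results on data with many ties need not match this sketch.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, as described in the text."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1  # both variables order the pair the same way
        elif sign < 0:
            discordant += 1  # the variables order the pair in opposite ways
        # sign == 0 means a tie in x or y; it counts toward neither group
    return (concordant - discordant) / len(pairs)


# Illustrative data: one swapped pair out of six possible pairs.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant, so τ = (5 - 1) / 6.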
Phik (φk) is a practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
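As a rough illustration of the bias correction, the sketch below computes Cramér's V from a contingency table using the corrected φ² and corrected table dimensions from Bergsma's proposal. The function name and the two 2×2 tables are hypothetical (the report does not list joint counts for its categorical variables), and no continuity correction is applied; treat it as an assumption-laden sketch rather than the report's own implementation.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrections: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Hypothetical tables: perfect association and exact independence.
perfect = cramers_v_corrected([[50, 0], [0, 50]])
independent = cramers_v_corrected([[25, 25], [25, 25]])
print(perfect, independent)
```

The two hypothetical tables mark the ends of the range described above: the first yields a value at 1 and the second a value of 0.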
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
index  Ocup_2010  IVS_2010  IVS_INF_10  IVS_CPH_10  IVS_REN_10  MASC  FEM   POP   DOM_OCU  area          perimeter    URB_RURAL  Dens_Dom  Dens_hab
0      0          0.385     0.255       0.495       0.406       79    81    160   48       40085.928467  800.922815   1          0.001     0.003991
1      0          0.385     0.255       0.495       0.406       79    81    160   48       40085.928467  800.922815   1          0.001     0.003991
2      0          0.385     0.255       0.495       0.406       79    81    160   48       40085.928467  800.922815   1          0.001     0.003991
3      0          0.385     0.255       0.495       0.406       79    81    160   48       40085.928467  800.922815   1          0.001     0.003991
4      0          0.385     0.255       0.495       0.406       79    81    160   48       40085.928467  800.922815   1          0.001     0.003991
5      0          0.385     0.255       0.495       0.406       79    81    160   48       40085.928467  800.922815   1          0.001     0.003991
6      0          0.385     0.255       0.495       0.406       79    81    160   48       40085.928467  800.922815   1          0.001     0.003991
7      0          0.385     0.255       0.495       0.406       50    45    95    25       40085.940796  800.922917   1          0.001     0.002370
8      0          0.385     0.255       0.495       0.406       50    45    95    25       40085.940796  800.922917   1          0.001     0.002370
9      0          0.385     0.255       0.495       0.406       0     0     0     0        40085.788696  800.921393   1          0.000     0.000000
index  Ocup_2010  IVS_2010  IVS_INF_10  IVS_CPH_10  IVS_REN_10  MASC  FEM   POP   DOM_OCU  area          perimeter    URB_RURAL  Dens_Dom  Dens_hab
1227   1          0.435     0.209       0.576       0.521       35    42    77    18       1.001806e+06  4003.963098  0          0.0       0.000077
1228   1          0.435     0.209       0.576       0.521       35    42    77    18       1.001806e+06  4003.963098  0          0.0       0.000077
1229   1          0.435     0.209       0.576       0.521       35    42    77    18       1.001806e+06  4003.963098  0          0.0       0.000077
1230   1          0.435     0.209       0.576       0.521       35    42    77    18       1.001806e+06  4003.963098  0          0.0       0.000077
1231   1          0.435     0.209       0.576       0.521       35    42    77    18       1.001806e+06  4003.963098  0          0.0       0.000077
1232   1          0.438     0.176       0.590       0.547       0     0     0     0        1.001783e+06  4003.919750  0          0.0       0.000000
1233   1          0.435     0.209       0.576       0.521       33    36    69    17       1.001784e+06  4003.921569  0          0.0       0.000069
1234   1          0.435     0.209       0.576       0.521       33    36    69    17       1.001784e+06  4003.921569  0          0.0       0.000069
1235   1          0.435     0.209       0.576       0.521       33    36    69    17       1.001784e+06  4003.921569  0          0.0       0.000069
1236   1          0.435     0.209       0.576       0.521       33    36    69    17       1.001784e+06  4003.921569  0          0.0       0.000069
index  Ocup_2010  IVS_2010  IVS_INF_10  IVS_CPH_10  IVS_REN_10  MASC  FEM   POP   DOM_OCU  area          perimeter    URB_RURAL  Dens_Dom  Dens_hab  count
28     0          0.362     0.112       0.595       0.379       258   296   554   168      4.008054e+04  800.869714   1          0.004     0.013822  64
95     1          0.435     0.209       0.576       0.521       114   138   252   75       4.007215e+04  800.791363   1          0.002     0.006289  49
27     0          0.362     0.112       0.595       0.379       170   178   348   100      4.008069e+04  800.871131   1          0.002     0.008682  39
62     1          0.310     0.105       0.467       0.357       138   182   320   80       4.007496e+04  800.817222   1          0.002     0.007985  38
66     1          0.310     0.105       0.467       0.357       207   228   435   132      4.007502e+04  800.817958   1          0.003     0.010855  35
63     1          0.310     0.105       0.467       0.357       147   170   317   87       4.007495e+04  800.817171   1          0.002     0.007910  32
17     0          0.310     0.105       0.467       0.357       283   308   591   145      1.001994e+06  4004.314205  0          0.000     0.000590  31
18     0          0.310     0.105       0.467       0.357       950   985   1935  552      1.001847e+06  4004.026548  0          0.001     0.001931  31
107    1          0.484     0.235       0.579       0.637       128   140   268   82       4.007701e+04  800.842349   1          0.002     0.006687  31
91     1          0.435     0.209       0.576       0.521       96    119   215   69       4.007230e+04  800.792818   1          0.002     0.005365  30
Report generated with pandas-profiling.