Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 2064 |
Missing cells | 2034 |
Missing cells (%) | 14.1% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 113.0 KiB |
Average record size in memory | 56.1 B |
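The two derived figures in this table can be rechecked from the raw counts. The sketch below assumes the missing-cell share is taken over variables × observations and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 7
n_observations = 2064
missing_cells = 2034
total_size_kib = 113.0

# Share of missing cells over all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per row, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_pct:.1f}%")        # 14.1%
print(f"{avg_record_bytes:.1f} B")  # 56.1 B
```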
Variable types
Numeric | 1 |
---|---|
Categorical | 6 |
Brand has a high cardinality: 324 distinct values | High cardinality |
Variety has a high cardinality: 1945 distinct values | High cardinality |
Review # is highly correlated with Country and 1 other fields | High correlation |
Country is highly correlated with Review # and 1 other fields | High correlation |
Top Ten is highly correlated with Review # and 1 other fields | High correlation |
Top Ten has 2032 (98.4%) missing values | Missing |
Variety is uniformly distributed | Uniform |
Top Ten is uniformly distributed | Uniform |
Review # has unique values | Unique |
Reproduction
Analysis started | 2021-11-21 22:40:05.312515 |
---|---|
Analysis finished | 2021-11-21 22:40:07.357062 |
Duration | 2.04 seconds |
Software version | pandas-profiling v3.1.0 |
Download configuration | config.json |
Review #
Numeric
Distinct | 2064 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1295.616279 |
Minimum | 1 |
---|---|
Maximum | 2580 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 16.2 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 127.15 |
Q1 | 646.75 |
median | 1298.5 |
Q3 | 1945.25 |
95-th percentile | 2456.85 |
Maximum | 2580 |
Range | 2579 |
Interquartile range (IQR) | 1298.5 |
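The interquartile range is Q3 minus Q1, and the range is maximum minus minimum. A minimal sketch follows, assuming quartiles are obtained by linear interpolation between order statistics (the report does not state its interpolation method); `iqr` is an illustrative helper, not part of the report.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range with linearly interpolated quartiles."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Toy data: the integers 1..10 have Q1 = 3.25 and Q3 = 7.75.
print(iqr(list(range(1, 11))))  # 4.5

# Consistency check against the table above.
table_iqr = 1945.25 - 646.75
table_range = 2580 - 1
print(table_iqr, table_range)  # 1298.5 2579
```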
Descriptive statistics
Standard deviation | 748.8958659 |
---|---|
Coefficient of variation (CV) | 0.57802289 |
Kurtosis | -1.208799672 |
Mean | 1295.616279 |
Median Absolute Deviation (MAD) | 650.5 |
Skewness | -0.005266261832 |
Sum | 2674152 |
Variance | 560845.018 |
Monotonicity | Not monotonic |
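Several of the descriptive statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below only rechecks the reported figures against those identities.

```python
# Figures copied from the tables above.
std = 748.8958659
mean = 1295.616279
n_observations = 2064

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
total = mean * n_observations  # sum of all values

print(f"{cv:.6f}")        # close to the reported 0.57802289
print(f"{variance:.3f}")  # close to the reported 560845.018
print(round(total))       # 2674152
```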
Common Values
Value | Count | Frequency (%) |
284 | 1 | < 0.1% |
1960 | 1 | < 0.1% |
1439 | 1 | < 0.1% |
1502 | 1 | < 0.1% |
155 | 1 | < 0.1% |
2295 | 1 | < 0.1% |
864 | 1 | < 0.1% |
516 | 1 | < 0.1% |
403 | 1 | < 0.1% |
1731 | 1 | < 0.1% |
Other values (2054) | 2054 | 99.5% |
Minimum 10 values
Value | Count | Frequency (%) |
1 | 1 | < 0.1% |
3 | 1 | < 0.1% |
4 | 1 | < 0.1% |
6 | 1 | < 0.1% |
9 | 1 | < 0.1% |
10 | 1 | < 0.1% |
11 | 1 | < 0.1% |
12 | 1 | < 0.1% |
13 | 1 | < 0.1% |
14 | 1 | < 0.1% |
Maximum 10 values
Value | Count | Frequency (%) |
2580 | 1 | < 0.1% |
2579 | 1 | < 0.1% |
2578 | 1 | < 0.1% |
2577 | 1 | < 0.1% |
2576 | 1 | < 0.1% |
2575 | 1 | < 0.1% |
2574 | 1 | < 0.1% |
2573 | 1 | < 0.1% |
2572 | 1 | < 0.1% |
2571 | 1 | < 0.1% |
Brand
Categorical
Distinct | 324 |
---|---|
Distinct (%) | 15.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 16.2 KiB |
Nissin | 314 |
---|---|
Nongshim | 75 |
Maruchan | 60 |
Mama | 57 |
Paldo | 52 |
Other values (319) | 1506 |
Common Values
Value | Count | Frequency (%) |
Nissin | 314 | 15.2% |
Nongshim | 75 | 3.6% |
Maruchan | 60 | 2.9% |
Mama | 57 | 2.8% |
Paldo | 52 | 2.5% |
Indomie | 43 | 2.1% |
Myojo | 42 | 2.0% |
Samyang Foods | 41 | 2.0% |
Ottogi | 36 | 1.7% |
Vina Acecook | 29 | 1.4% |
Other values (314) | 1315 | 63.7% |
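A common-values table of this kind is a frequency count with each count expressed as a share of all rows. A minimal sketch using only the standard library is shown below; `common_values` is an illustrative helper, and the input is the Brand column of the ten rows listed under "First rows".

```python
from collections import Counter

def common_values(values, top=3):
    """Return (value, count, percent) rows for the most frequent values."""
    counts = Counter(values)
    total = len(values)
    return [(v, c, round(100 * c / total, 1)) for v, c in counts.most_common(top)]

# Brand column of the ten rows listed under "First rows".
brands = ["Nissin", "Nissin", "Nissin", "Chikara", "SuperMi",
          "Nongshim", "Sunlee", "Nissin", "Samyang Foods", "Nissin"]
print(common_values(brands, top=1))  # [('Nissin', 5, 50.0)]
```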
Words
Value | Count | Frequency (%) |
nissin | 314 | 11.1% |
mama | 83 | 2.9% |
foods | 79 | 2.8% |
nongshim | 75 | 2.6% |
maruchan | 60 | 2.1% |
noodle | 60 | 2.1% |
samyang | 59 | 2.1% |
paldo | 55 | 1.9% |
wai | 44 | 1.6% |
indomie | 43 | 1.5% |
Other values (391) | 1962 | 69.2% |
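The table above counts individual words rather than whole values, which is why "foods" (79) outnumbers the brand "Samyang Foods" (41). The sketch below assumes tokens are produced by lowercasing and splitting on whitespace; the report's exact tokenisation is not stated, and the sample list is illustrative.

```python
from collections import Counter

def word_counts(values):
    """Count lowercase, whitespace-separated tokens across all values."""
    words = Counter()
    for value in values:
        words.update(value.lower().split())
    return words

# Illustrative input, not the full Brand column.
sample = ["Samyang Foods", "Nissin", "Wai Wai", "Nissin"]
counts = word_counts(sample)
print(counts["nissin"], counts["wai"], counts["foods"])  # 2 2 1
```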
Variety
Categorical
Distinct | 1945 |
---|---|
Distinct (%) | 94.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 16.2 KiB |
Beef | 7 |
---|---|
Yakisoba | 6 |
Vegetable | 6 |
Artificial Chicken | 6 |
Miso Ramen | 5 |
Other values (1940) | 2034 |
Length
Max length | 96 |
---|---|
Median length | 28 |
Mean length | 29.5494186 |
Min length | 3 |
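The length statistics are summaries of the character count of each value. A minimal sketch, applied to the five rows listed under "Sample" rather than to the full column; `length_stats` is an illustrative helper.

```python
from statistics import mean, median

def length_stats(values):
    """Summarise the character length of each string."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }

# The five Variety values listed under "Sample".
sample = [
    "Cup Noodles Seafood",
    "Donbei Tensoba",
    "Demae Ramen Shoyu",
    "Shrimp Udon",
    "Mi Keriting Rasa Ayam Bawang",
]
print(length_stats(sample))
# {'max': 28, 'median': 17, 'mean': 17.8, 'min': 11}
```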
Characters and Unicode
Total characters | 0 |
---|---|
Distinct characters | 0 |
Distinct categories | 0 |
Distinct scripts | 0 |
Distinct blocks | 0 |
Unique
Unique | 1858 |
---|---|
Unique (%) | 90.0% |
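"Distinct" (1945) and "Unique" (1858) measure different things: the gap of 87 is the number of Variety values that occur more than once. The sketch below assumes "unique" means a value that appears exactly once; `distinct_and_unique` is an illustrative helper.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of distinct values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# "Beef" repeats, so it is distinct but not unique.
print(distinct_and_unique(["Beef", "Beef", "Yakisoba", "Shrimp Udon"]))  # (3, 2)
```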
Sample
1st row | Cup Noodles Seafood |
---|---|
2nd row | Donbei Tensoba |
3rd row | Demae Ramen Shoyu |
4th row | Shrimp Udon |
5th row | Mi Keriting Rasa Ayam Bawang |
Common Values
Value | Count | Frequency (%) |
Beef | 7 | 0.3% |
Yakisoba | 6 | 0.3% |
Vegetable | 6 | 0.3% |
Artificial Chicken | 6 | 0.3% |
Miso Ramen | 5 | 0.2% |
Artificial Beef Flavor | 4 | 0.2% |
Artificial Spicy Beef | 4 | 0.2% |
Chicken | 4 | 0.2% |
Imitation Chicken Vegetarian | 3 | 0.1% |
Soy Sauce | 3 | 0.1% |
Other values (1935) | 2016 | 97.7% |
Words
Value | Count | Frequency (%) |
noodles | 550 | 5.7% |
noodle | 411 | 4.3% |
instant | 364 | 3.8% |
flavour | 323 | 3.3% |
ramen | 274 | 2.8% |
chicken | 267 | 2.8% |
flavor | 258 | 2.7% |
spicy | 213 | 2.2% |
beef | 188 | 1.9% |
soup | 157 | 1.6% |
Other values (1278) | 6654 | 68.9% |
Style
Categorical
Distinct | 7 |
---|---|
Distinct (%) | 0.3% |
Missing | 2 |
Missing (%) | 0.1% |
Memory size | 16.2 KiB |
Pack | 1247 |
---|---|
Bowl | 366 |
Cup | 358 |
Tray | 84 |
Box | 5 |
Other values (2) | 2 |
Common Values
Value | Count | Frequency (%) |
Pack | 1247 | 60.4% |
Bowl | 366 | 17.7% |
Cup | 358 | 17.3% |
Tray | 84 | 4.1% |
Box | 5 | 0.2% |
Can | 1 | < 0.1% |
Bar | 1 | < 0.1% |
(Missing) | 2 | 0.1% |
Words
Value | Count | Frequency (%) |
pack | 1247 | 60.5% |
bowl | 366 | 17.7% |
cup | 358 | 17.4% |
tray | 84 | 4.1% |
box | 5 | 0.2% |
can | 1 | < 0.1% |
bar | 1 | < 0.1% |
Country
Categorical
Distinct | 38 |
---|---|
Distinct (%) | 1.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 16.2 KiB |
Japan | 269 |
---|---|
USA | 258 |
South Korea | 243 |
Taiwan | 186 |
Thailand | 162 |
Other values (33) | 946 |
Common Values
Value | Count | Frequency (%) |
Japan | 269 | 13.0% |
USA | 258 | 12.5% |
South Korea | 243 | 11.8% |
Taiwan | 186 | 9.0% |
Thailand | 162 | 7.8% |
China | 130 | 6.3% |
Malaysia | 120 | 5.8% |
Hong Kong | 110 | 5.3% |
Indonesia | 104 | 5.0% |
Singapore | 92 | 4.5% |
Other values (28) | 390 | 18.9% |
Words
Value | Count | Frequency (%) |
japan | 269 | 11.1% |
usa | 258 | 10.7% |
south | 243 | 10.0% |
korea | 243 | 10.0% |
taiwan | 186 | 7.7% |
thailand | 162 | 6.7% |
china | 130 | 5.4% |
malaysia | 120 | 5.0% |
hong | 110 | 4.5% |
kong | 110 | 4.5% |
Other values (31) | 587 | 24.3% |
Stars
Categorical
Distinct | 39 |
---|---|
Distinct (%) | 1.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 16.2 KiB |
4 | 316 |
---|---|
5 | 302 |
3.75 | 275 |
3.5 | 268 |
3.25 | 145 |
Other values (34) | 758 |
Common Values
Value | Count | Frequency (%) |
4 | 316 | 15.3% |
5 | 302 | 14.6% |
3.75 | 275 | 13.3% |
3.5 | 268 | 13.0% |
3.25 | 145 | 7.0% |
3 | 143 | 6.9% |
4.5 | 110 | 5.3% |
4.25 | 105 | 5.1% |
2.75 | 72 | 3.5% |
4.75 | 53 | 2.6% |
Other values (29) | 275 | 13.3% |
Words
Value | Count | Frequency (%) |
4 | 316 | 15.3% |
5 | 302 | 14.6% |
3.75 | 275 | 13.3% |
3.5 | 268 | 13.0% |
3.25 | 145 | 7.0% |
3 | 143 | 6.9% |
4.5 | 110 | 5.3% |
4.25 | 105 | 5.1% |
2.75 | 72 | 3.5% |
4.75 | 53 | 2.6% |
Other values (29) | 275 | 13.3% |
Top Ten
Categorical
Distinct | 31 |
---|---|
Distinct (%) | 96.9% |
Missing | 2032 |
Missing (%) | 98.4% |
Memory size | 16.2 KiB |
(control character) | 2 |
2012 #7 | 1 |
2014 #5 | 1 |
2012 #3 | 1 |
2014 #7 | 1 |
Other values (26) | 26 |
Common Values
Value | Count | Frequency (%) |
(control character) | 2 | 0.1% |
2012 #7 | 1 | < 0.1% |
2014 #5 | 1 | < 0.1% |
2012 #3 | 1 | < 0.1% |
2014 #7 | 1 | < 0.1% |
2012 #4 | 1 | < 0.1% |
2013 #2 | 1 | < 0.1% |
2013 #3 | 1 | < 0.1% |
2015 #8 | 1 | < 0.1% |
2012 #1 | 1 | < 0.1% |
Other values (21) | 21 | 1.0% |
(Missing) | 2032 | 98.4% |
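Most non-missing Top Ten values follow a "YEAR #RANK" pattern such as `2012 #7`. The sketch below splits such a label into two integers, assuming the pattern holds for every ranked entry; `parse_top_ten` is a hypothetical helper, and anything that does not match (missing values, stray whitespace) is returned as `None`.

```python
import re

# Assumed label format, inferred from values such as "2012 #7".
_TOP_TEN = re.compile(r"^(\d{4}) #(\d+)$")

def parse_top_ten(value):
    """Split a 'YEAR #RANK' label into (year, rank); return None otherwise."""
    if not isinstance(value, str):
        return None  # e.g. NaN for missing entries
    match = _TOP_TEN.match(value.strip())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

print(parse_top_ten("2012 #7"))     # (2012, 7)
print(parse_top_ten("\n"))          # None
print(parse_top_ten(float("nan")))  # None
```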
Words
Value | Count | Frequency (%) |
2012 | 9 | 15.0% |
2014 | 7 | 11.7% |
2015 | 6 | 10.0% |
2013 | 5 | 8.3% |
7 | 4 | 6.7% |
4 | 4 | 6.7% |
1 | 4 | 6.7% |
8 | 3 | 5.0% |
6 | 3 | 5.0% |
10 | 3 | 5.0% |
Other values (5) | 12 | 20.0% |
Most occurring characters
Value | Count | Frequency (%) |
(control character) | 2 |
Most occurring categories
Value | Count | Frequency (%) |
Control | 2 |
Most frequent character per category
Control
Value | Count | Frequency (%) |
(control character) | 2 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 2 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
(control character) | 2 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(control character) | 2 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
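The definition above can be sketched directly: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. This is a minimal illustration, not the report's implementation; tied values are given the mean of their positions, which is a common convention assumed here.

```python
from math import sqrt

def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(sxx * syy)

def spearman(x, y):
    """Pearson correlation computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3: monotonic but not linear
print(round(spearman(x, y), 6))  # 1.0
print(pearson(x, y) < 1)         # True
```

The cubic example shows the point made in the text: the relationship is perfectly monotonic, so ρ reaches 1 while Pearson's r stays below it.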
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
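A minimal sketch of the formula above, together with a check of the invariance property: shifting either variable, or rescaling it by a positive factor, leaves r unchanged. The data points are made up for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(sxx * syy)

# Illustrative, roughly linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson(x, y)

# Separate changes of location and (positive) scale for each variable.
r_transformed = pearson([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

print(abs(r - r_transformed) < 1e-9)  # True
```

Rescaling by a negative factor would flip the sign of r but not its magnitude.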
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
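The pair-counting description above corresponds to the variant usually called tau-a. A minimal sketch follows; note that library implementations commonly use tau-b, which adjusts the denominator for ties, so results can differ on tied data.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 is a tie in x or y; tau-a counts it as neither.
    return (concordant - discordant) / len(pairs)

# Six pairs in total: five concordant, one discordant (the swapped 3 and 2).
print(round(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]), 4))  # 0.6667
```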
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
First rows
Index | Review # | Brand | Variety | Style | Country | Stars | Top Ten |
---|---|---|---|---|---|---|---|
0 | 284 | Nissin | Cup Noodles Seafood | Cup | Hong Kong | 4.5 | NaN |
1 | 976 | Nissin | Donbei Tensoba | Bowl | Japan | 4 | NaN |
2 | 74 | Nissin | Demae Ramen Shoyu | Pack | Japan | 4 | NaN |
3 | 425 | Chikara | Shrimp Udon | Pack | USA | 4.5 | NaN |
4 | 870 | SuperMi | Mi Keriting Rasa Ayam Bawang | Pack | Indonesia | 3.75 | NaN |
5 | 1503 | Nongshim | Bowl Noodle Soup Shrimp Habanero Lime Flavor | Bowl | USA | 3.25 | NaN |
6 | 647 | Sunlee | Tom Yum Shrimp Noodle | Bowl | Thailand | 3.5 | NaN |
7 | 2254 | Nissin | Disney Cuties Instant Noodle Seaweed Flavour | Cup | Thailand | 3 | NaN |
8 | 1181 | Samyang Foods | Star Popeye Ramyun Snack | Pack | South Korea | 4 | NaN |
9 | 1211 | Nissin | Demae Iccho Instant Noodle With Soup Base Artificial Chicken Flavour | Bowl | Hong Kong | 3 | NaN |
Last rows
Index | Review # | Brand | Variety | Style | Country | Stars | Top Ten |
---|---|---|---|---|---|---|---|
2054 | 2533 | Nongshim | Shin Ramyun Black | Pack | South Korea | 5 | NaN |
2055 | 419 | Myojo | Ramen Desse Shio | Bowl | Japan | 4.25 | NaN |
2056 | 2484 | Nissin | Demae Ramen Tokyo Soy Sauce | Pack | Germany | 4 | NaN |
2057 | 819 | Maruchan | Tempura Soba | Pack | Japan | 4 | NaN |
2058 | 987 | Trident | Singapore Soft Noodles | Pack | Australia | 2.75 | NaN |
2059 | 1433 | Maggi | 2 Minute Noodles Curry Flavour | Pack | Singapore | 3.75 | NaN |
2060 | 426 | Vifon | Tu quy Chicken | Pack | Vietnam | 3 | NaN |
2061 | 814 | Indomie | Beef | Pack | Indonesia | 3.5 | NaN |
2062 | 1458 | Nissin | Premium Instant Noodles Roasted Beef Flavour | Bowl | Singapore | 3.75 | NaN |
2063 | 1234 | Sainsbury's | Barbecue Beef Flavour Instant Noodles | Pack | UK | 2.75 | NaN |