The food industry is one of the fastest-moving industrial sectors. The glamorous, glittering retail shops and supermarkets are expanding very fast all over the country, and the majority of food is pre-packed and presented to the consumer in a labelled container. The data for this project come from a study that evaluates consumers' attitudes towards food labels and their awareness of the information printed on them. The study was carried out to identify the association of demographic and socio-economic characteristics and health-related factors with (1) consumers' attitudes towards food labels and (2) their awareness of the information printed on food labels.
Variable | Description | Type of variable |
---|---|---|
Gender | Gender | Qualitative |
Age | Age | Quantitative |
Education | Educational level | Qualitative |
Employment | Employment status | Qualitative |
Income | Household Income | Qualitative |
House size | Age groups | Qualitative |
Children | Number of children | Quantitative |
Marital | Marital status | Qualitative |
fshopper | Major food shopper of the household | Qualitative |
mplanner | Major meal planner of the household | Qualitative |
FA | Having food allergies | Qualitative |
Diabetes | Having Diabetes | Qualitative |
Metabolic syndrome | Having Obesity, High blood pressure/Cholesterol, Heart disease | Qualitative |
Other | Having Migraine, Osteoporosis, Other | Qualitative |
Specif | Having a specific diet (pregnancy, breast feeding, training for sports, vegetarian) | Qualitative |
Job1 | Doctors, nurses, health care workers | Qualitative |
Job2 | Legislators related to food items, manufacturers/advertisers related to food items | Qualitative |
Exercise | Frequency of exercise | Qualitative |
Health | Self perception of overall health | Qualitative |
Place | Place where packaged food is bought | Qualitative |
Easy | Easiness of the packaged food | Qualitative |
Familiarity | Familiarity with the product | Qualitative |
Friends | Recommendation by family and friends | Qualitative |
Useful | Usefulness of food label | Qualitative |
Easiness | Ease of understanding the information on food labels | Qualitative |
Sufficient | Sufficiency of information provided in food label | Qualitative |
Truthfulness | Truthfulness of information provided in food label | Qualitative |
Clear | Clarity of information printed in food label | Qualitative |
Attractive pack | Influence of attractive package | Qualitative |
Hc/nufriclaim | Influence of health claims/ Nutrition claims | Qualitative |
Graphical | Influence of graphical and pictorial information | Qualitative |
Free Price | Influence of Free/ Prizes/ Contests | Qualitative |
Net quan | Awareness of net quantity | Qualitative |
Low in fat | Awareness of low in fat | Qualitative |
Low in cho | Awareness of low in cholesterol | Qualitative |
Sodium | Awareness of nutrition claim indicates the lowest amount of sodium | Qualitative |
elabels | Awareness of Ecode labels | Qualitative |
library(tidyverse)
library(janitor)
library(ggplot2)
library(gmodels)
library(GGally)
library(patchwork)
library(MASS)
library(huxtable)
First, we need to look at the types of the collected data.
summary(food_label)
glimpse(food_label)
Here we can see that the categorical variables were read as integers, so we need to convert them to factors. In addition, house size has only 8 categories, yet two observations are recorded as 9 and 10. We therefore replace those values with missing values.
food_label <- food_label %>% mutate(Housesize = replace(Housesize, which(Housesize > 8), NA))
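The conversion to factors mentioned above is not shown. A minimal sketch of how it might look follows, assuming the integer codes map to labels in the order given; the toy data frame `toy` and the coding scheme are assumptions for illustration and are not taken from the study's codebook.

```r
library(dplyr)

# Toy stand-in for food_label; the integer coding is hypothetical.
toy <- tibble(
  Gender    = c(1L, 2L, 1L, 1L),
  marital   = c(1L, 2L, 2L, 1L),
  Housesize = c(3L, 9L, 8L, 10L)
)

toy <- toy %>%
  mutate(
    # Attach labels to the integer codes (label order is an assumption).
    Gender  = factor(Gender,  levels = 1:2, labels = c("female", "male")),
    marital = factor(marital, levels = 1:2, labels = c("single", "married")),
    # Codes above 8 are not valid categories, so they become NA
    # before the column is turned into a factor.
    Housesize = factor(replace(Housesize, Housesize > 8, NA), levels = 1:8)
  )

str(toy)
```

Setting `levels` explicitly keeps categories that happen to be empty in the data, so later frequency tables still list all eight house-size groups.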
table 01:
tabyl(food_label,Gender)
Gender | n | percent |
---|---|---|
female | 377 | 0.643 |
male | 209 | 0.357 |
tabyl(food_label, marital)
marital | n | percent |
---|---|---|
single | 155 | 0.265 |
married | 431 | 0.735 |
table 02:
tabyl(food_label, Education)
Education | n | percent |
---|---|---|
Below O/L | 66 | 0.113 |
Passed GCE O/L | 68 | 0.116 |
Passed GCE A/L | 88 | 0.15 |
Diploma | 145 | 0.247 |
Degree | 219 | 0.374 |
table 03:
tabyl(food_label, Employment)
Employment | n | percent |
---|---|---|
Employed full time | 232 | 0.396 |
Employed part-time | 66 | 0.113 |
Unemployed | 58 | 0.099 |
Student | 83 | 0.142 |
Housewife | 99 | 0.169 |
Retired | 48 | 0.0819 |
According to table 03, approximately 40% of respondents are employed full time. Retired persons make up only 8% of the sample.
table 04:
tabyl(food_label, Income)
Income | n | percent |
---|---|---|
Less than Rs: 20000 | 20 | 0.0341 |
Rs: 20000 - Rs: 34999 | 91 | 0.155 |
Rs: 35000 - Rs: 49999 | 204 | 0.348 |
Rs: 50000 - Rs: 64999 | 197 | 0.336 |
Over Rs: 64499 | 74 | 0.126 |
Table 04 shows that about 35% of respondents have an income of Rs: 35000 - Rs: 49999 and about 34% have an income of Rs: 50000 - Rs: 64999.
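Reading percentages off the raw proportions is error-prone. As a sketch, `janitor` can add a total row and format the `percent` column directly. The data frame `toy` below is rebuilt from the counts printed in table 04 and only stands in for `food_label`.

```r
library(dplyr)
library(janitor)

# Rebuild the Income column from the counts printed in table 04.
income_levels <- c("Less than Rs: 20000", "Rs: 20000 - Rs: 34999",
                   "Rs: 35000 - Rs: 49999", "Rs: 50000 - Rs: 64999",
                   "Over Rs: 64499")
toy <- tibble(
  Income = factor(rep(income_levels, times = c(20, 91, 204, 197, 74)),
                  levels = income_levels)
)

income_tab <- toy %>%
  tabyl(Income) %>%                    # one-way frequency table
  adorn_totals("row") %>%              # append a Total row
  adorn_pct_formatting(digits = 1)     # show proportions as percentages

income_tab
```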
table 05:
tabyl(food_label, Housesize) %>% filter(!is.na(Housesize))
Housesize | n | percent | valid_percent |
---|---|---|---|
0-24 months | 31 | 0.0529 | 0.0531 |
2-5 years | 134 | 0.229 | 0.229 |
6-10 years | 178 | 0.304 | 0.305 |
11-16 years | 116 | 0.198 | 0.199 |
17-18 years | 75 | 0.128 | 0.128 |
18-30 years | 26 | 0.0444 | 0.0445 |
30-55 years | 17 | 0.029 | 0.0291 |
over 55 years | 7 | 0.0119 | 0.012 |
table 06:
tabyl(food_label, fshopper)
fshopper | n | percent |
---|---|---|
no | 171 | 0.292 |
yes | 415 | 0.708 |
tabyl(food_label, mplanner)
mplanner | n | percent |
---|---|---|
no | 150 | 0.256 |
yes | 436 | 0.744 |
Table 06 shows that the majority of respondents (about 71%) are the major food shopper of their household, and that about 74% are the major meal planner.
table 07:
with(food_label, CrossTable(FA, Gender))
Cell Contents
|-------------------------|
| N |
| Chi-square contribution |
| N / Row Total |
| N / Col Total |
| N / Table Total |
|-------------------------|
Total Observations in Table: 586
| Gender
FA | female | male | Row Total |
-------------|-----------|-----------|-----------|
no | 336 | 180 | 516 |
| 0.049 | 0.088 | |
| 0.651 | 0.349 | 0.881 |
| 0.891 | 0.861 | |
| 0.573 | 0.307 | |
-------------|-----------|-----------|-----------|
yes | 41 | 29 | 70 |
| 0.361 | 0.652 | |
| 0.586 | 0.414 | 0.119 |
| 0.109 | 0.139 | |
| 0.070 | 0.049 | |
-------------|-----------|-----------|-----------|
Column Total | 377 | 209 | 586 |
| 0.643 | 0.357 | |
-------------|-----------|-----------|-----------|
Here we can see that only about 12% of the sample suffer from food allergies and most of them are ma
table 08:
with(food_label, CrossTable(Diabetes, Gender))
Cell Contents
|-------------------------|
| N |
| Chi-square contribution |
| N / Row Total |
| N / Col Total |
| N / Table Total |
|-------------------------|
Total Observations in Table: 586
| Gender
Diabetes | female | male | Row Total |
-------------|-----------|-----------|-----------|
no | 102 | 77 | 179 |
| 1.504 | 2.712 | |
| 0.570 | 0.430 | 0.305 |
| 0.271 | 0.368 | |
| 0.174 | 0.131 | |
-------------|-----------|-----------|-----------|
yes | 275 | 132 | 407 |
| 0.661 | 1.193 | |
| 0.676 | 0.324 | 0.695 |
| 0.729 | 0.632 | |
| 0.469 | 0.225 | |
-------------|-----------|-----------|-----------|
Column Total | 377 | 209 | 586 |
| 0.643 | 0.357 | |
-------------|-----------|-----------|-----------|
Table 08 shows that the majority of respondents (about 70%) have diabetes. Among them, about 68% are female.
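Each cell of the `CrossTable()` output includes its chi-square contribution, and the sum of these contributions is the Pearson chi-square statistic. The sketch below re-enters the counts from table 08 and runs the test with base R. Whether the original study used this particular test is not stated here, so treat it as an illustration only.

```r
# Counts copied from table 08 (rows: Diabetes, columns: Gender).
diabetes_by_gender <- matrix(
  c(102,  77,
    275, 132),
  nrow = 2, byrow = TRUE,
  dimnames = list(Diabetes = c("no", "yes"),
                  Gender   = c("female", "male"))
)

# Pearson chi-square test without continuity correction, so the statistic
# equals the sum of the per-cell contributions printed by CrossTable().
chi_res <- chisq.test(diabetes_by_gender, correct = FALSE)

chi_res$statistic   # about 6.07 (1.504 + 2.712 + 0.661 + 1.193)
chi_res$parameter   # degrees of freedom
chi_res$p.value
```

`CrossTable()` also accepts `chisq = TRUE`, which prints the same test below the table.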
table 09:
with(food_label, CrossTable(`Metabolic cyndrents`, Gender))
Cell Contents
|-------------------------|
| N |
| Chi-square contribution |
| N / Row Total |
| N / Col Total |
| N / Table Total |
|-------------------------|
Total Observations in Table: 586
| Gender
Metabolic cyndrents | female | male | Row Total |
--------------------|-----------|-----------|-----------|
no | 157 | 100 | 257 |
| 0.421 | 0.759 | |
| 0.611 | 0.389 | 0.439 |
| 0.416 | 0.478 | |
| 0.268 | 0.171 | |
--------------------|-----------|-----------|-----------|
yes | 220 | 109 | 329 |
| 0.329 | 0.593 | |
| 0.669 | 0.331 | 0.561 |
| 0.584 | 0.522 | |
| 0.375 | 0.186 | |
--------------------|-----------|-----------|-----------|
Column Total | 377 | 209 | 586 |
| 0.643 | 0.357 | |
--------------------|-----------|-----------|-----------|
According to table 09, around 56% of respondents have a metabolic syndrome condition such as obesity, high blood pressure/cholesterol or heart disease.
table 10:
with(food_label, CrossTable(specific, Gender))
Cell Contents
|-------------------------|
| N |
| Chi-square contribution |
| N / Row Total |
| N / Col Total |
| N / Table Total |
|-------------------------|
Total Observations in Table: 586
| Gender
specific | female | male | Row Total |
-------------|-----------|-----------|-----------|
no | 228 | 131 | 359 |
| 0.038 | 0.068 | |
| 0.635 | 0.365 | 0.613 |
| 0.605 | 0.627 | |
| 0.389 | 0.224 | |
-------------|-----------|-----------|-----------|
yes | 149 | 78 | 227 |
| 0.060 | 0.108 | |
| 0.656 | 0.344 | 0.387 |
| 0.395 | 0.373 | |
| 0.254 | 0.133 | |
-------------|-----------|-----------|-----------|
Column Total | 377 | 209 | 586 |
| 0.643 | 0.357 | |
-------------|-----------|-----------|-----------|
Table 10 shows that a minority of respondents (about 39%) follow a specific diet due to pregnancy, breast feeding, training for sports or vegetarianism.
table 11:
tabyl(food_label, job1)
job1 | n | percent |
---|---|---|
no | 255 | 0.435 |
yes | 331 | 0.565 |
tabyl(food_label, job2)
job2 | n | percent |
---|---|---|
no | 315 | 0.538 |
yes | 271 | 0.462 |
table 12:
with(food_label, CrossTable(Exercise, Gender))
Cell Contents
|-------------------------|
| N |
| Chi-square contribution |
| N / Row Total |
| N / Col Total |
| N / Table Total |
|-------------------------|
Total Observations in Table: 586
| Gender
Exercise | female | male | Row Total |
-------------------------|-----------|-----------|-----------|
daily | 15 | 6 | 21 |
| 0.164 | 0.296 | |
| 0.714 | 0.286 | 0.036 |
| 0.040 | 0.029 | |
| 0.026 | 0.010 | |
-------------------------|-----------|-----------|-----------|
at least 2 days per week | 57 | 41 | 98 |
| 0.580 | 1.046 | |
| 0.582 | 0.418 | 0.167 |
| 0.151 | 0.196 | |
| 0.097 | 0.070 | |
-------------------------|-----------|-----------|-----------|
rarely | 165 | 84 | 249 |
| 0.144 | 0.260 | |
| 0.663 | 0.337 | 0.425 |
| 0.438 | 0.402 | |
| 0.282 | 0.143 | |
-------------------------|-----------|-----------|-----------|
never | 140 | 78 | 218 |
| 0.000 | 0.001 | |
| 0.642 | 0.358 | 0.372 |
| 0.371 | 0.373 | |
| 0.239 | 0.133 | |
-------------------------|-----------|-----------|-----------|
Column Total | 377 | 209 | 586 |
| 0.643 | 0.357 | |
-------------------------|-----------|-----------|-----------|
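According to table 12, the largest group of respondents (about 43%) exercise rarely and a further 37% never exercise. The pattern is similar for both genders: a slightly higher share of females exercise daily (4.0% against 2.9% of males), while a higher share of males exercise at least 2 days per week (19.6% against 15.1% of females).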
table 13:
with(food_label, CrossTable(Health, Gender))
Cell Contents
|-------------------------|
| N |
| Chi-square contribution |
| N / Row Total |
| N / Col Total |
| N / Table Total |
|-------------------------|
Total Observations in Table: 586
| Gender
Health | female | male | Row Total |
-------------|-----------|-----------|-----------|
excellent | 5 | 4 | 9 |
| 0.108 | 0.194 | |
| 0.556 | 0.444 | 0.015 |
| 0.013 | 0.019 | |
| 0.009 | 0.007 | |
-------------|-----------|-----------|-----------|
good | 21 | 10 | 31 |
| 0.056 | 0.101 | |
| 0.677 | 0.323 | 0.053 |
| 0.056 | 0.048 | |
| 0.036 | 0.017 | |
-------------|-----------|-----------|-----------|
fair | 73 | 37 | 110 |
| 0.070 | 0.127 | |
| 0.664 | 0.336 | 0.188 |
| 0.194 | 0.177 | |
| 0.125 | 0.063 | |
-------------|-----------|-----------|-----------|
poor | 145 | 80 | 225 |
| 0.000 | 0.001 | |
| 0.644 | 0.356 | 0.384 |
| 0.385 | 0.383 | |
| 0.247 | 0.137 | |
-------------|-----------|-----------|-----------|
can't say | 133 | 78 | 211 |
| 0.056 | 0.100 | |
| 0.630 | 0.370 | 0.360 |
| 0.353 | 0.373 | |
| 0.227 | 0.133 | |
-------------|-----------|-----------|-----------|
Column Total | 377 | 209 | 586 |
| 0.643 | 0.357 | |
-------------|-----------|-----------|-----------|
Table 13 shows that the largest group of respondents (about 38%) rate their health as poor, and approximately 64% of those respondents are female. Only 1.5% report excellent health.
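The third and fourth numbers in each `CrossTable()` cell are the row and column proportions, and they answer different questions. A short sketch with the counts printed in table 13 makes the distinction explicit:

```r
# Counts copied from table 13 (rows: Health, columns: Gender).
health_by_gender <- matrix(
  c(  5,  4,
     21, 10,
     73, 37,
    145, 80,
    133, 78),
  ncol = 2, byrow = TRUE,
  dimnames = list(Health = c("excellent", "good", "fair", "poor", "can't say"),
                  Gender = c("female", "male"))
)

# Row proportions: of the respondents in a health category, what share is female?
row_prop <- prop.table(health_by_gender, margin = 1)
# Column proportions: of all females, what share is in a health category?
col_prop <- prop.table(health_by_gender, margin = 2)

round(row_prop["poor", "female"], 3)  # 0.644
round(col_prop["poor", "female"], 3)  # 0.385
```

So 64% of the respondents with poor health are female, whereas 38.5% of females report poor health.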
table 14:
with(food_label, CrossTable(place, Gender))
Cell Contents
|-------------------------|
| N |
| Chi-square contribution |
| N / Row Total |
| N / Col Total |
| N / Table Total |
|-------------------------|
Total Observations in Table: 586
| Gender
place | female | male | Row Total |
---------------|-----------|-----------|-----------|
retail shops | 71 | 31 | 102 |
| 0.441 | 0.795 | |
| 0.696 | 0.304 | 0.174 |
| 0.188 | 0.148 | |
| 0.121 | 0.053 | |
---------------|-----------|-----------|-----------|
super markets | 174 | 102 | 276 |
| 0.072 | 0.129 | |
| 0.630 | 0.370 | 0.471 |
| 0.462 | 0.488 | |
| 0.297 | 0.174 | |
---------------|-----------|-----------|-----------|
both equally | 132 | 76 | 208 |
| 0.025 | 0.044 | |
| 0.635 | 0.365 | 0.355 |
| 0.350 | 0.364 | |
| 0.225 | 0.130 | |
---------------|-----------|-----------|-----------|
Column Total | 377 | 209 | 586 |
| 0.643 | 0.357 | |
---------------|-----------|-----------|-----------|
It seems that around 47% of people buy packaged foods from the supermarkets.
figure 01:
ggplot(food_label, aes(x = Gender, y = Age, fill = Gender)) +
geom_boxplot(size = .75) + facet_grid(marital~., margins = FALSE) +
theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1))+
ggtitle("Distribution of Age by Gender and Marital status")
According to figure 01, the age distribution of married females is positively skewed, while that of married males is negatively skewed. Single males and females both have roughly symmetric age distributions.
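Skewness judged by eye from a boxplot can be checked numerically. The sketch below defines a moment-based skewness function and applies it per group. The `skewness()` helper and the simulated data frame `toy` are assumptions for illustration; with the real data, `toy` would be replaced by `food_label`.

```r
library(dplyr)

# Moment-based sample skewness: positive values indicate a longer right tail,
# negative values a longer left tail, and 0 a symmetric sample.
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Simulated stand-in for food_label (not the survey data):
# right-skewed ages for females, left-skewed ages for males.
set.seed(1)
toy <- tibble(
  Gender  = rep(c("female", "male"), each = 200),
  marital = rep(c("single", "married"), times = 200),
  Age     = c(20 + rexp(200, rate = 0.1), 70 - rexp(200, rate = 0.1))
)

skew_by_group <- toy %>%
  group_by(Gender, marital) %>%
  summarise(skew = skewness(Age), n = n(), .groups = "drop")

skew_by_group
```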
figure 02:
ggplot(food_label, aes(x = Education, y = Age, fill = Gender)) +
geom_boxplot(size = .75) +
theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1))+
ggtitle("Distribution of Age by Gender and Education level")
Figure 02 shows that the age distribution of females is negatively skewed only in the Passed GCE A/L group.
figure 03:
ggplot(food_label, aes(x = Employment, y = Age, fill = Gender)) +
geom_boxplot(size = .75) +
theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1))+
ggtitle("Distribution of Age by Gender and Employment")
It seems that both male and female students show negatively skewed age distributions. Retired and unemployed males also show negatively skewed distributions, while full-time male employees have a nearly symmetric distribution.
figure 04:
ggplot(food_label, aes(x = Income, y = Age, fill = Gender)) +
geom_boxplot(size = .75) +
theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1))+
ggtitle("Distribution of Age by Gender and Income")
According to figure 04, males have a positively skewed age distribution only in the over Rs: 64499 income group.
figure 05:
ggplot(food_label, aes(x = fshopper, y = Age, fill = Gender)) +
geom_boxplot(size = .75) + facet_grid(mplanner~., margins = FALSE) +
theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1))+
ggtitle("Distribution of Age by Gender, meal planner and food shopper")
By figure 05 we can see that both males and females who are the major food shopper and the major meal planner have positively skewed age distributions. Females who are neither the major food shopper nor the major meal planner have a negatively skewed distribution.
figure 06:
p1 <- ggplot(food_label, aes(x=FA, y=Age, fill = Gender)) +
geom_boxplot(outlier.size = 1, colour="black", width=0.1 ) +
geom_violin(alpha = 0.2, width = 1) +facet_grid(Gender~., margins = FALSE) +
theme(legend.position = "none")+
xlab("Food allergies") +
ylab("Age") +
ggtitle("Distribution of Age by Food allergies")
p2 <- ggplot(food_label, aes(x=Diabetes, y=Age, fill = Gender)) +
geom_boxplot(outlier.size = 1, colour="black", width=0.1 ) +
geom_violin(alpha = 0.2, width = 1) +facet_grid(Gender~., margins = FALSE) +
theme(legend.position = "none")+
xlab("Diabetes") +
ylab("Age") +
ggtitle("Distribution of Age by Diabetes")
p3 <- ggplot(food_label, aes(x=`Metabolic cyndrents`, y=Age, fill = Gender)) +
geom_boxplot(outlier.size = 1, colour="black", width=0.1 ) +
geom_violin(alpha = 0.2, width = 1) +facet_grid(Gender~., margins = FALSE) +
theme(legend.position = "none")+
xlab("Metabolic cyndrents") +
ylab("Age") +
ggtitle("Distribution of Age by \nMetabolic cyndrents")
p4 <- ggplot(food_label, aes(x=specific, y=Age, fill = Gender)) +
geom_boxplot(outlier.size = 1, colour="black", width=0.1 ) +
geom_violin(alpha = 0.2, width = 1) +facet_grid(Gender~., margins = FALSE) +
xlab("specific diets") +
ylab("Age") +
ggtitle("Distribution of Age by \n specific diets")
(p1|p2) / (p3|p4)
Figure 06 shows bimodal age distributions for food allergies, diabetes, metabolic syndrome and specific diets. There may be some external factor that affects food allergies. Males who have food allergies have a positively skewed age distribution.
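The four panels of figure 06 repeat the same plotting code with a different x variable. One way to reduce the repetition is a small helper, sketched below. The function `age_violin()` and the data frame `toy` are hypothetical and not part of the original analysis; the helper assumes the data has `Age` and `Gender` columns.

```r
library(ggplot2)

# Hypothetical helper: boxplot plus violin of Age for one categorical
# variable, faceted by Gender. `var` is the column name as a string.
age_violin <- function(data, var, title, show_legend = FALSE) {
  ggplot(data, aes(x = .data[[var]], y = Age, fill = Gender)) +
    geom_boxplot(outlier.size = 1, colour = "black", width = 0.1) +
    geom_violin(alpha = 0.2, width = 1) +
    facet_grid(Gender ~ ., margins = FALSE) +
    theme(legend.position = if (show_legend) "right" else "none") +
    xlab(var) +
    ylab("Age") +
    ggtitle(title)
}

# Simulated stand-in for food_label (not the survey data).
set.seed(2)
toy <- data.frame(
  Gender   = rep(c("female", "male"), each = 50),
  FA       = sample(c("no", "yes"), 100, replace = TRUE),
  Diabetes = sample(c("no", "yes"), 100, replace = TRUE),
  Age      = round(runif(100, 18, 70))
)

p_fa  <- age_violin(toy, "FA", "Distribution of Age by Food allergies")
p_dia <- age_violin(toy, "Diabetes", "Distribution of Age by Diabetes",
                    show_legend = TRUE)
```

With `patchwork` loaded, the panels can then be combined as before, for example `p_fa | p_dia`.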
figure 07:
p1 <- ggplot(food_label, aes(x=job1, y=Age, fill = Gender)) +
geom_boxplot(outlier.size = 1, colour="black", width=0.1 ) +
geom_violin(alpha = 0.2, width = 1) +facet_grid(Gender~., margins = FALSE) +
theme(legend.position = "none", axis.text.x = element_text(angle = 60, hjust = 1))+
ylab("Age") +
ggtitle("Distribution of Age by \ndoctors, nurses, \nhealth care workers jobs")
p2 <- ggplot(food_label, aes(x=job2, y=Age, fill = Gender)) +
geom_boxplot(outlier.size = 1, colour="black", width=0.1 ) +
geom_violin(alpha = 0.2, width = 1) +facet_grid(Gender~., margins = FALSE) +
theme(axis.text.x = element_text(angle = 60, hjust = 1))+
ylab("Age") +
ggtitle("Distribution of Age by \nLegislators related to food items,\nManufactures/advertisers \nrelated to food items")
p1|p2
According to figure 07, there are bimodal distributions in every category. There might be some external factors that affect those job types. In addition, every group except males in the second panel (both those who do and those who do not work as legislators or manufacturers/advertisers related to food items) has a positively skewed age distribution.
figure 08:
ggplot(food_label, aes(x=Exercise, y=Age, fill = Gender)) +
geom_boxplot(outlier.size = 1, colour="black", width=0.1 ) +
geom_violin(alpha = 0.2, width = 1) +facet_grid(Gender~., margins = FALSE) +
xlab(" Exercise") +
ylab("Age") +
ggtitle("Distribution of Age by exercise")
Here we can see that older men tend to exercise daily, whereas the age distribution of males who do not exercise is positively skewed, that is, concentrated at younger ages. Only a small number of females exercise daily, and their age distribution is positively skewed.
figure 09:
ggplot(food_label, aes(x=place, y=Age, fill = Gender)) +
geom_boxplot(outlier.size = 1, colour="black", width=0.1 ) +
geom_violin(alpha = 0.2, width = 1) +facet_grid(Gender~., margins = FALSE) +
geom_violin(alpha = 0.2, fill = "pink", width = 1) +
xlab(" place") +
ylab("Age") +
ggtitle("Distribution of Age by place")
Here we can see some bimodal distributions. The age distribution of females is positively skewed for both retail shops and supermarkets.
figure 10:
ggpairs(food_label, mapping = aes(color=Gender, alpha =0.2),
columns =c ("Age", "Children"))+
ggtitle("Scatter plot matrix by gender")
Figure 10 shows a weak positive linear relationship between age and number of children overall. For females, the relationship is weak and negative.
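The correlations printed in the upper panel of `ggpairs()` can also be computed directly, which makes it easier to compare groups. A minimal sketch on simulated data follows; the data frame `toy` is an assumption and only mimics the `Gender`, `Age` and `Children` columns of `food_label`.

```r
library(dplyr)

# Simulated stand-in for food_label (not the survey data).
set.seed(42)
toy <- tibble(
  Gender   = rep(c("female", "male"), each = 50),
  Age      = round(runif(100, 18, 70)),
  Children = rpois(100, lambda = 2)
)

# Pearson correlation between Age and Children within each gender.
cor_by_gender <- toy %>%
  group_by(Gender) %>%
  summarise(r = cor(Age, Children, use = "complete.obs"),
            n = n(),
            .groups = "drop")

cor_by_gender
```

Replacing `group_by(Gender)` with `group_by(marital)`, `group_by(Education)` or `group_by(Exercise)` gives the grouped correlations behind the other scatter plot matrices.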
figure 11:
ggpairs(food_label, mapping = aes(color=marital, alpha =0.6),
columns =c ("Age", "Children"))+
ggtitle("Scatter plot matrix by marital status")
Figure 11 shows that there is a weak positive linear relationship between age and children for both single and married people.
figure 12:
ggpairs(food_label, mapping = aes(color=Education, alpha =0.6),
columns =c ("Age", "Children"))+
ggtitle("Scatter plot matrix by education")
Here we can see a moderate positive linear relationship between age and number of children among people who passed GCE A/L.
figure 13:
ggpairs(food_label, mapping = aes(color=Exercise, alpha =0.6),
columns =c ("Age", "Children"))+
ggtitle("Scatter plot matrix by exercise")
Here we can see a moderate positive linear relationship between age and number of children for people who exercise daily.