Whenever we are able to control how the data are collected, we should take advantage of it. Examples include applying sampling techniques in a poll, or splitting participants into groups and treatments in medical statistics.
In this post we give an example of Orthogonal Arrays, which are one member of the family of Experimental Designs.
Scenario
Let’s assume that Joe sometimes suffers from stomach ache during the night. His gastroenterologist suspects that his diet is responsible for this occasional symptom. Let’s also assume that Joe’s diet includes:
- Breakfast: Sandwich or Pancakes or Omelette or Yogurt with Honey and Nuts
- Morning Beverage: Coffee or Orange Juice
- Lunch: Pork or Fish or Chicken or Salad
- Dinner: Pasta or Rice or Milk with Cereals or Pizza
- Dessert: Ice-Cream or Nothing
- Night Drink: Tea or Wine
So the number of possible combinations is \(4\times 2 \times 4 \times 4 \times 2 \times 2=512\). The doctor would like to detect which food(s) may cause Joe this discomfort, and he plans to use an Orthogonal Array to design the diet that Joe should follow, assuming that there are no interactions between the meals.
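The count of 512 can be reproduced in base R. The sketch below is only an illustration: the object names (`diet_options`, `full_factorial`) are ours, and the levels are copied from the menu above.

```r
# Levels of each factor, copied from Joe's menu (object names are illustrative)
diet_options <- list(
  Breakfast = c("Sandwich", "Pancakes", "Omelette", "Yogurt+Honey+Nuts"),
  Beverage  = c("Coffee", "Orange Juice"),
  Lunch     = c("Pork", "Fish", "Chicken", "Salad"),
  Dinner    = c("Pasta", "Rice", "Milk+Cereals", "Pizza"),
  Dessert   = c("Ice-Cream", "Nothing"),
  Drink     = c("Tea", "Wine")
)

# Number of levels per factor and their product
n_levels <- lengths(diet_options)
n_combinations <- prod(n_levels)

# The full factorial design: one row per possible daily menu
full_factorial <- expand.grid(diet_options, stringsAsFactors = TRUE)

n_combinations        # 512
nrow(full_factorial)  # 512
```

Trying every row of `full_factorial` would take Joe 512 days, which is why a much smaller design is attractive.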
Question: Which Orthogonal Arrays are available for this case?
Answer: Notice that we have three factors with 4 levels and three factors with 2 levels. Using the library DoE.base we can list the arrays that can accommodate them.
library(DoE.base)
## the orthogonal arrays with 3 4-level factors and 3 2-level factors
show.oas(factors = list(nlevels = c(4, 2), number = c(3, 3)))
5 resolution IV or more arrays found
name nruns lineage
10 L64.2.8.4.3 64
12 L64.2.6.4.4 64
23 L128.2.20.4.3 128
26 L192.2.36.4.3 192
29 L256.2.52.4.3 256
990 orthogonal arrays found,
the first 10 are listed
name nruns lineage
17 L16.2.6.4.3 16 4~5;:(4~1!2~3;)
18 L16.2.3.4.4 16 4~5;:(4~1!2~3;)
53 L32.2.22.4.3 32 4~8;8~1;:(8~1!2~4;4~1;)(4~1!2~3;)
55 L32.2.19.4.4 32 4~8;8~1;:(8~1!2~4;4~1;)(4~1!2~3;)
57 L32.2.16.4.5 32 4~8;8~1;:(8~1!2~4;4~1;)(4~1!2~3;)
59 L32.2.15.4.3.8.1 32 4~8;8~1;:(4~1!2~3;)
60 L32.2.13.4.6 32 4~8;8~1;:(8~1!2~4;4~1;)(4~1!2~3;)
61 L32.2.12.4.4.8.1 32 4~8;8~1;:(4~1!2~3;)
62 L32.2.10.4.7 32 4~8;8~1;:(8~1!2~4;4~1;)(4~1!2~3;)
63 L32.2.9.4.5.8.1 32 4~8;8~1;:(4~1!2~3;)
From the R output we can see that 16 runs is the minimum number of runs for this experiment. The ID code of the corresponding array is L16.2.6.4.3, which tells you that it has 16 runs and can accommodate up to six 2-level factors and three 4-level factors.
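The minimum of 16 runs also agrees with a quick counting argument. A main-effects model needs one parameter for the intercept plus (levels - 1) per factor, and in a pairwise balanced array the number of runs has to be a multiple of the product of the level counts of every pair of factors. The sketch below works this out in base R. It gives a necessary condition only; that a 16-run array actually exists is what the show.oas output confirms. The helper names (`gcd`, `lcm`) are ours.

```r
# Level counts of the six factors: three 4-level and three 2-level
n_levels <- c(4, 4, 4, 2, 2, 2)

# Parameters of a main-effects model: intercept + (levels - 1) per factor
min_parameters <- 1 + sum(n_levels - 1)

# In a pairwise balanced array every pair of levels appears equally often,
# so the run count must be a multiple of s_i * s_j for every pair of factors
pair_products <- combn(n_levels, 2, FUN = prod)

# Small helpers (illustrative) for the least common multiple
gcd <- function(a, b) if (b == 0) a else gcd(b, a %% b)
lcm <- function(a, b) a / gcd(a, b) * b
run_multiple <- Reduce(lcm, unique(pair_products))

# Smallest multiple of run_multiple that covers all model parameters
min_runs <- run_multiple * ceiling(min_parameters / run_multiple)

min_parameters  # 13
run_multiple    # 16
min_runs        # 16
```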
Question: What is the recommended diet for Joe?
Answer: The doctor could ask Joe to follow the diet below for the next 16 days. Notice that 16 is the minimum number of runs that we found for this experimental design.
OA <- oa.design(nruns = 16,
                factor.names = list(
                  Breakfast = c("Sandwich", "Pancakes", "Omelette", "Yogurt+Honey+Nuts"),
                  Beverage  = c("Coffee", "Orange Juice"),
                  Lunch     = c("Pork", "Fish", "Chicken", "Salad"),
                  Dinner    = c("Pasta", "Rice", "Milk+Cereals", "Pizza"),
                  Dessert   = c("Ice-Cream", "Nothing"),
                  Drink     = c("Tea", "Wine")))
OA
Breakfast Beverage Lunch Dinner Dessert Drink
1 Yogurt+Honey+Nuts Coffee Fish Milk+Cereals Ice-Cream Tea
2 Yogurt+Honey+Nuts Orange Juice Salad Pizza Nothing Tea
3 Omelette Orange Juice Chicken Milk+Cereals Ice-Cream Wine
4 Pancakes Coffee Fish Rice Nothing Wine
5 Omelette Orange Juice Fish Pasta Nothing Tea
6 Pancakes Coffee Chicken Pizza Ice-Cream Tea
7 Yogurt+Honey+Nuts Orange Juice Pork Rice Ice-Cream Wine
8 Pancakes Orange Juice Salad Pasta Ice-Cream Wine
9 Sandwich Coffee Pork Pasta Ice-Cream Tea
10 Sandwich Coffee Salad Milk+Cereals Nothing Wine
11 Yogurt+Honey+Nuts Coffee Chicken Pasta Nothing Wine
12 Sandwich Orange Juice Fish Pizza Ice-Cream Wine
13 Omelette Coffee Salad Rice Ice-Cream Tea
14 Omelette Coffee Pork Pizza Nothing Wine
15 Sandwich Orange Juice Chicken Rice Nothing Tea
16 Pancakes Orange Juice Pork Milk+Cereals Nothing Tea
Every row in the table above represents one day.
A good check is to verify that the factor levels are balanced pairwise, i.e. that every combination of levels of two factors appears the same number of times. Let’s take two factors as an example:
aggregate(Lunch~Breakfast+Dessert, OA, length)
Breakfast Dessert Lunch
1 Sandwich Ice-Cream 2
2 Pancakes Ice-Cream 2
3 Omelette Ice-Cream 2
4 Yogurt+Honey+Nuts Ice-Cream 2
5 Sandwich Nothing 2
6 Pancakes Nothing 2
7 Omelette Nothing 2
8 Yogurt+Honey+Nuts Nothing 2
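The same check can be repeated for every pair of factors at once. The function below is a minimal sketch in base R. `is_pairwise_balanced` is a name we made up, not part of DoE.base, and it assumes the columns are factors so that combinations that never occur are counted as zero. It is demonstrated on a small toy design rather than on Joe's diet.

```r
# TRUE if, for every pair of columns, all level combinations occur equally often.
# Assumes the columns are factors, so missing combinations show up as 0 counts.
is_pairwise_balanced <- function(design) {
  pairs <- combn(names(design), 2, simplify = FALSE)
  all(vapply(pairs, function(p) {
    counts <- table(design[[p[1]]], design[[p[2]]])
    length(unique(as.vector(counts))) == 1
  }, logical(1)))
}

# Toy example: a 2 x 2 x 2 full factorial is trivially balanced ...
toy <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2"), C = c("c1", "c2"))
is_pairwise_balanced(toy)        # TRUE

# ... and dropping a single run breaks the balance
is_pairwise_balanced(toy[-1, ])  # FALSE
```

Once this works on the toy design, the function can be pointed at the 16-day diet to confirm that all 15 pairs of factors are balanced, not only Breakfast and Dessert.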
Question: What are the next steps?
Answer: Every day, Joe should write down how bad his stomach ache was during the night, on a scale from 0 to 10. The doctor will then have the independent variables (the Xs) from the Orthogonal Array and the dependent variable (Y), which is the score reported by Joe. Finally, he will be able to fit a regression or ANOVA model to find out which factors are statistically significant.
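As a rough sketch of this last step, the code below re-types the 16-day diet from the table above and attaches a score to each day. The scores are simulated, not real measurements: we pretend that pizza for dinner raises the score by 5 points and add some noise, purely to show the mechanics. The column name `Score`, the effect size and the noise level are all our assumptions.

```r
# The 16-day diet, transcribed from the oa.design output above
diet <- read.csv(text = "Breakfast,Beverage,Lunch,Dinner,Dessert,Drink
Yogurt+Honey+Nuts,Coffee,Fish,Milk+Cereals,Ice-Cream,Tea
Yogurt+Honey+Nuts,Orange Juice,Salad,Pizza,Nothing,Tea
Omelette,Orange Juice,Chicken,Milk+Cereals,Ice-Cream,Wine
Pancakes,Coffee,Fish,Rice,Nothing,Wine
Omelette,Orange Juice,Fish,Pasta,Nothing,Tea
Pancakes,Coffee,Chicken,Pizza,Ice-Cream,Tea
Yogurt+Honey+Nuts,Orange Juice,Pork,Rice,Ice-Cream,Wine
Pancakes,Orange Juice,Salad,Pasta,Ice-Cream,Wine
Sandwich,Coffee,Pork,Pasta,Ice-Cream,Tea
Sandwich,Coffee,Salad,Milk+Cereals,Nothing,Wine
Yogurt+Honey+Nuts,Coffee,Chicken,Pasta,Nothing,Wine
Sandwich,Orange Juice,Fish,Pizza,Ice-Cream,Wine
Omelette,Coffee,Salad,Rice,Ice-Cream,Tea
Omelette,Coffee,Pork,Pizza,Nothing,Wine
Sandwich,Orange Juice,Chicken,Rice,Nothing,Tea
Pancakes,Orange Juice,Pork,Milk+Cereals,Nothing,Tea", stringsAsFactors = TRUE)

# SIMULATED scores (assumption): baseline 2, plus 5 on pizza days, plus noise
set.seed(1)
diet$Score <- 2 + 5 * (diet$Dinner == "Pizza") + rnorm(nrow(diet), sd = 0.5)

# Main-effects model: one term per factor, no interactions
fit <- lm(Score ~ ., data = diet)
anova(fit)
```

Keep in mind that with 16 days and 13 model parameters only 3 degrees of freedom are left for the error, so with real data the tests have limited power and repeating the 16-day cycle would give more reliable conclusions.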