Project #2:
Part-I: Exploring beyond One-way ANOVA with simulated data.
What if there are huge imbalances among the sample sizes
{n_1, n_2, ..., n_K}? For instance, n_1 = n_2 = ... = n_{K-1} = 100, but
n_K = 10^6.
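One possible way to set up the imbalance question in R is sketched below. The number of groups K = 4, the common mean 0, and the standard deviation 1 are illustrative assumptions, not values given in the assignment; the data are simulated under the null case of equal means.

```r
# Sketch: one-way ANOVA under a huge sample-size imbalance.
# K, the common mean, and the standard deviation are assumed for illustration.
set.seed(1)
K     <- 4
sizes <- c(rep(100, K - 1), 1e6)            # three small samples, one huge sample
group <- factor(rep(seq_len(K), times = sizes))
y     <- rnorm(sum(sizes), mean = 0, sd = 1)  # all group means are equal

fit <- aov(y ~ group)                       # classical one-way ANOVA
print(table(group))                         # confirm the imbalance
print(summary(fit))

# Standard error of each group mean: the huge sample is estimated far more precisely
se <- tapply(y, group, sd) / sqrt(sizes)
print(se)
```

Repeating the simulation with different means or different size patterns shows how much of the F-test's behavior is driven by the single huge sample.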
What if the equal-variance assumption is violated? That is, the
members of {σ_1^2, σ_2^2, ..., σ_K^2} are not all equal.
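A minimal sketch of the unequal-variance case follows. The group count, sample size, and standard deviations are hypothetical choices. Welch's correction (`oneway.test` with `var.equal = FALSE`) and Bartlett's test are shown as one reasonable set of tools, not as the method the assignment prescribes.

```r
# Sketch: equal means, unequal variances (all numbers are assumed for illustration).
set.seed(2)
K      <- 4
n      <- 50
sigmas <- c(1, 1, 1, 4)                      # standard deviations, not all equal
group  <- factor(rep(seq_len(K), each = n))
y      <- rnorm(K * n, mean = 0, sd = rep(sigmas, each = n))

classic <- oneway.test(y ~ group, var.equal = TRUE)   # classical F-test
welch   <- oneway.test(y ~ group, var.equal = FALSE)  # Welch's unequal-variance version
bart    <- bartlett.test(y ~ group)                   # test of equal variances

print(c(classic  = classic$p.value,
        welch    = welch$p.value,
        bartlett = bart$p.value))
```

Wrapping this in a loop over many seeds and recording how often each test rejects gives an empirical view of how the violation affects the type I error rate.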
What if the Normality assumption is violated?
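The Normality question can be explored the same way. The sketch below assumes exponentially distributed data as one example of a skewed, non-Normal distribution. It runs a Shapiro-Wilk test on the ANOVA residuals and a Kruskal-Wallis test as a rank-based alternative; both are swapped-in illustrations rather than required methods.

```r
# Sketch: equal means, skewed (non-Normal) data; the exponential choice is an assumption.
set.seed(3)
K     <- 4
n     <- 50
group <- factor(rep(seq_len(K), each = n))
y     <- rexp(K * n, rate = 1)

fit <- aov(y ~ group)
sw  <- shapiro.test(residuals(fit))          # Normality check on the residuals
kw  <- kruskal.test(y ~ group)               # rank-based alternative to the F-test

anova_p <- summary(fit)[[1]][["Pr(>F)"]][1]  # p-value of the classical F-test
print(c(shapiro = sw$p.value, anova = anova_p, kruskal = kw$p.value))

qqnorm(residuals(fit)); qqline(residuals(fit))  # visual check of the same assumption
```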
What if you want to discover the community structures among the
K samples?
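For the community-structure question, one way to build a hierarchical clustering (HC) tree on K samples is sketched below. Representing each sample by a vector of its quantiles, using Euclidean distance between those vectors, and using average linkage are all assumptions made for illustration; the lecture may define the distance between samples differently. The six simulated samples form two communities by construction.

```r
# Sketch: HC-tree on K samples, each summarized by its quantile profile.
# The quantile-profile distance and average linkage are assumed choices.
set.seed(4)
means   <- c(0, 0, 0, 3, 3, 3)               # two communities by construction
samples <- lapply(means, function(m) rnorm(200, mean = m, sd = 1))
names(samples) <- paste0("S", seq_along(samples))

probs <- seq(0.05, 0.95, by = 0.05)
Q     <- t(sapply(samples, quantile, probs = probs))  # one row per sample

hc          <- hclust(dist(Q), method = "average")
communities <- cutree(hc, k = 2)             # cut the tree into two communities
print(communities)

plot(hc, main = "HC-tree of the K samples", xlab = "", sub = "")
```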
Part-II: Exploring beyond One-way ANOVA with Kaggle data.
BRFSS 2015 dataset on Kaggle https://www.kaggle.com/alexteboul/heart-disease-he…
Divide the entire data set into 32 (= K = 2^5) samples with respect to 5
binary variables of your choice, for instance, HD, Sex, Stroke,
Blood-pressure, Cholesterol. Compare the 32 BMI distributions.
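The split into 32 samples can be sketched as follows. To keep the example self-contained, it uses a simulated stand-in data frame; the column names `HD`, `Sex`, `Stroke`, `HighBP`, `HighChol`, and `BMI` are hypothetical and should be replaced by the actual column names in the downloaded file.

```r
# Sketch: form 2^5 = 32 groups from five binary variables and summarize BMI.
# 'dat' is a simulated stand-in; with the real data, load the Kaggle CSV via read.csv().
set.seed(5)
n        <- 5000
bin_vars <- c("HD", "Sex", "Stroke", "HighBP", "HighChol")  # hypothetical names
dat      <- as.data.frame(matrix(rbinom(n * 5, 1, 0.5), ncol = 5,
                                 dimnames = list(NULL, bin_vars)))
dat$BMI  <- rnorm(n, mean = 28, sd = 6)

# One label per 0/1 combination, e.g. "01001"; drop = TRUE removes empty cells
dat$group <- interaction(dat[bin_vars], sep = "", drop = TRUE)

bmi_summary <- data.frame(
  n    = as.vector(table(dat$group)),
  mean = as.vector(tapply(dat$BMI, dat$group, mean)),
  sd   = as.vector(tapply(dat$BMI, dat$group, sd)),
  row.names = levels(dat$group)
)
print(head(bmi_summary))

boxplot(BMI ~ group, data = dat, las = 2)    # side-by-side view of the distributions
```

In the real data some of the 32 combinations may be rare, so the `n` column is worth inspecting before any test is run.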
Perform the one-way ANOVA and Tukey-Kramer’s simultaneous
pairwise comparisons as if the Normality and equal-variance
assumptions both hold. Check whether the equal-variance assumption is
violated, that is, whether the members of {σ_1^2, σ_2^2, ..., σ_K^2} are not all equal.
Check whether the Normality assumption is violated.
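These steps can be sketched in base R as below, again on simulated stand-in data with hypothetical group labels. `TukeyHSD` applies a sample-size adjustment for unbalanced designs, which corresponds to the Tukey-Kramer procedure. Bartlett's test and the Shapiro-Wilk test are one possible choice of checks, not the only one.

```r
# Sketch: ANOVA, Tukey-Kramer comparisons, and assumption checks for 32 groups.
# The data are a simulated stand-in; group labels g01..g32 are hypothetical.
set.seed(6)
n     <- 3200
group <- factor(sample(sprintf("g%02d", 1:32), n, replace = TRUE))
BMI   <- rnorm(n, mean = 28, sd = 6)

fit <- aov(BMI ~ group)
print(summary(fit))                          # one-way ANOVA table

tk <- TukeyHSD(fit, "group")                 # all pairwise comparisons
print(head(tk$group))                        # columns: diff, lwr, upr, p adj

bart <- bartlett.test(BMI ~ group)           # equal-variance check
print(bart)

# shapiro.test() accepts at most 5000 values, so subsample large residual vectors
res <- residuals(fit)
sw  <- shapiro.test(sample(res, min(length(res), 5000)))
print(sw)
```

With 32 groups there are 496 pairwise comparisons, so it helps to filter the Tukey table by adjusted p-value instead of reading it in full.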
Construct an HC-tree to discover the community structures among the
K samples.
Compare the results from ANOVA and Tukey-Kramer’s comparisons
with the results found in the HC-tree.
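One hedged way to put the two sets of results side by side is to ask, for every pair of samples, whether Tukey-Kramer declares them different and whether the HC-tree places them in different communities. The sketch below uses six simulated samples. The 5% level, the quantile-profile distance, average linkage, and the cut into two communities are all assumptions for illustration.

```r
# Sketch: cross-tabulate Tukey-Kramer decisions against HC-tree communities.
# All settings (K, means, 5% level, distance, linkage, k = 2) are assumed.
set.seed(7)
K     <- 6
n     <- 100
means <- c(0, 0, 0, 2, 2, 2)
group <- factor(rep(paste0("S", 1:K), each = n))
y     <- rnorm(K * n, mean = rep(means, each = n), sd = 1)

# Tukey-Kramer: which pairs differ at the 5% level?
tk         <- TukeyHSD(aov(y ~ group))$group
pairs      <- do.call(rbind, strsplit(rownames(tk), "-", fixed = TRUE))
tukey_diff <- tk[, "p adj"] < 0.05

# HC-tree on quantile profiles: which pairs fall in different communities?
Q       <- t(sapply(split(y, group), quantile, probs = seq(0.05, 0.95, by = 0.05)))
cl      <- cutree(hclust(dist(Q), method = "average"), k = 2)
hc_diff <- cl[pairs[, 1]] != cl[pairs[, 2]]

print(table(tukey = tukey_diff, hc_tree = hc_diff))
```

Pairs on which the two columns disagree are the natural starting point for discussion. Note that splitting the row names on "-" only works when the group labels themselves contain no hyphen.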
Please use RStudio, and refer to the Two-part question and the Lecture.
Requirements: pdf