Two-way Anova - R Companion

文章推薦指數: 80 %
投票人數:10人

Two-way anova example · 1 male ff 1.884 · 2 male ff 2.283 · 3 male fs 2.396 · 4 female ff 2.838 · 5 male fs 2.956 · 6 female ff 4.216 · 7 female ss 3.620 · 8 female ff ... AnRCompanionfortheHandbookofBiological Statistics SalvatoreS.Mangiafico SearchRcompanion.org   Contents   Introduction Purposeofthisbook TheHandbookforBiologicalStatistics Abouttheauthor AboutR ObtainingR AFewNotestoGetStartedwithR AvoidingPitfallsinR HelpwithR RTutorials FormalStatisticsBooks   TestsforNominalVariables ExactTestofGoodness-of-Fit PowerAnalysis Chi-squareTestofGoodness-of-Fit G–testofgoodness-of-fit Chi-squareTestofIndependence G–testofIndependence Fisher’sExactTestofIndependence SmallNumbersinChi-squareandG–tests RepeatedG–testsofGoodness-of-Fit Cochran–Mantel–HaenszelTestforRepeatedTestsofIndependence   DescriptiveStatistics StatisticsofCentralTendency StatisticsofDispersion StandardErroroftheMean ConfidenceLimits   TestsforOneMeasurementVariable Student’st–testforOneSample Student’st–testforTwoSamples Mann–WhitneyandTwo-samplePermutationTest ChaptersNotCoveredinThisBook TypeI,II,andIIISumsofSquares One-wayAnova Kruskal–WallisTest One-wayAnalysiswithPermutationTest NestedAnova Two-wayAnova Two-wayAnovawithRobustEstimation Pairedt–test WilcoxonSigned-rankTest   Regressions CorrelationandLinearRegression SpearmanRankCorrelation CurvilinearRegression AnalysisofCovariance MultipleRegression SimpleLogisticRegression MultipleLogisticRegression   MultipleTests MultipleComparisons   Miscellany ChaptersNotCoveredinThisBook   OtherAnalyses ContrastsinLinearModels Cate–NelsonAnalysis   AdditionalHelpfulTips ReadingSASDatalinesinR   OtherBooks SummaryandAnalysisofExtensionProgramEvaluationinR     Advertisement Two-wayAnova   Advertisement ExamplesinSummaryandAnalysisofExtensionProgramEvaluation SAEEPER:Two-wayANOVA SAEEPER:UsingRandom EffectsinModels SAEEPER:Whatare LeastSquareMeans?   Packagesusedinthischapter Thefollowingcommandswillinstallthesepackagesifthey arenotalreadyinstalled: if(!require(FSA)){install.packages("FSA")} if(!require(ggplot2)){install.packages("ggplot2")} if(!require(car)){install.packages("car")} if(!require(multcompView)){install.packages("multcompView")} if(!require(lsmeans)){install.packages("lsmeans")} if(!require(grid)){install.packages("grid")} if(!require(nlme)){install.packages("nlme")} if(!require(lme4)){install.packages("lme4")} if(!require(lmerTest)){install.packages("lmerTest")} if(!require(rcompanion)){install.packages("rcompanion")}   Whentouseit Nullhypotheses Howthetestworks Assumptions Seethe Handbookforinformationonthesetopics.   Examples Therattlesnakeexampleisshownattheendofthe“Howto dothetest”section.   Howtodothetest Fornotesonlinearmodelsandconductinganova,seethe“How todothetest”sectionintheOne-wayanovachapterofthisbook.  Fortwo-wayanovawithrobustregression,seethechapteronTwo-wayAnovawith RobustEstimation.   Two-wayanovaexample   ###-------------------------------------------------------------- ###Two-wayanova,SASexample,pp.179–180 ###-------------------------------------------------------------- Input=("  idSex    Genotype Activity  1male   ff       1.884  2male   ff       2.283  3male   fs       2.396  4female ff       2.838    5male   fs       2.956  6female ff       4.216  7female ss       3.620  8female ff       2.889    9female fs       3.550  10male   fs       3.105  11female fs       4.556  12female fs       3.087  13male   ff       4.939  14male   ff       3.486  15female ss       3.079  16male   fs       2.649  17female fs       1.943  19female ff       4.198  20female ff       2.473  22female ff       2.033   24female fs       2.200  25female fs       2.157  26male   ss       2.801  28male   ss       3.421  29female ff       1.811  30female fs       4.281  32female fs       4.772  34female ss       3.586  36female ff       3.944  38female ss       2.669  39female ss       3.050  41male   ss       4.275   43female ss       2.963  46female ss       3.236  48female ss       3.673  49male   ss       3.110 ") Data=read.table(textConnection(Input),header=TRUE)     Meansandsummarystatisticsbygroup   library(Rmisc) sum=summarySE(Data,                measurevar="Activity",                groupvars=c("Sex","Genotype")) sum       SexGenotypeNActivity       sd       se       ci 1female      ff8 3.050250.95990320.33937700.8024992 2female      fs8 3.318251.14453880.40465560.9568584 3female      ss8 3.234500.36177540.12790690.3024518 4  male      ff4 3.148001.37451150.68725582.1871546 5  male      fs4 2.776500.31684330.15842160.5041684 6  male      ss4 3.401750.63481090.31740551.0101258     Interactionplotusingsummarystatistics   library(ggplot2) pd=position_dodge(.2) ggplot(sum,aes(x=Genotype,                y=Activity,                color=Sex))+    geom_errorbar(aes(ymin=Activity-se,                      ymax=Activity+se),                   width=.2,size=0.7,position=pd)+    geom_point(shape=15,size=4,position=pd)+    theme_bw()+    theme(          axis.title.y=element_text(vjust=1.8),          axis.title.x=element_text(vjust=-0.5),          axis.title=element_text(face="bold"))+    scale_color_manual(values=c("black","blue")) ###Youmayseeanerror,“ymaxnotdefined” ### Inthiscase,itdoesnotappeartoaffectanything     Interactionplotforatwo-wayanova. Squarepointsrepresent meansforgroups,anderrorbarsindicatestandarderrorsofthemean.     Simpleboxplotofmaineffectandinteraction   boxplot(Activity~Genotype,        data=Data,        xlab="Genotype",        ylab="MPIActivity")       boxplot(Activity~Genotype:Sex,        data=Data,        xlab="GenotypexSex",        ylab="MPIActivity")     FitthelinearmodelandconductANOVA   model=lm(Activity~Sex+Genotype+Sex:Genotype,           data=Data) library(car) Anova(model,type="II")                   # Canusetype="III"  ###Ifyouusetype="III",you needthefollowinglinebeforetheanalysis  ###options(contrasts =c("contr.sum","contr.poly"))                SumSqDfFvaluePr(>F) Sex          0.0681 1 0.08610.7712 Genotype     0.2772 2 0.17540.8400 Sex:Genotype 0.8146 2 0.51530.6025 Residuals   23.713830      anova(model)                              # ProducestypeIsumofsquares               Df SumSqMeanSqFvaluePr(>F) Sex          1 0.06810.06808 0.08610.7712 Genotype     2 0.27720.13862 0.17540.8400 Sex:Genotype 2 0.81460.40732 0.51530.6025 Residuals   3023.71380.79046     summary(model)    #Producesr-square, overallp-value,parameterestimates   MultipleR-squared: 0.04663,AdjustedR-squared: -0.1123 F-statistic:0.2935on5and30DF, p-value:0.9128     Checkingassumptionsofthemodel   hist(residuals(model),     col="darkgray")   Ahistogramofresidualsfromalinearmodel. Thedistributionof theseresidualsshouldbeapproximatelynormal.     plot(fitted(model),     residuals(model))     Aplotofresidualsvs.predictedvalues. Theresidualsshouldbe unbiasedandhomoscedastic. Foranillustrationoftheseproperties,seethis diagrambySteveJostatDePaulUniversity:condor.depaul.edu/sjost/it223/documents/resid-plots.gif.     ###additionalmodelcheckingplots with:plot(model) ###alternative:library(FSA); residPlot(model)     Post-hoccomparisonofleast-squaremeans Fornotesonleast-squaremeans,seethe“Post-hoc comparisonofleast-squaremeans”sectionintheNestedanovachapterin thisbook.   Oneadvantageoftheusingthelsmeanspackagefor post-hoctestsisthatitcanproducecomparisonsforinteractioneffects.   Ingeneral,iftheinteractioneffectissignificant,you willwanttolookatcomparisonsofmeansfortheinteractions. Ifthe interactioneffectisnotsignificantbutamaineffectis,itisappropriate tolookatcomparisonsamongthemeansforthatmaineffect. Inthiscase, becausenoeffectofSex,Genotype,orSex:Genotypewas significant,wewouldnotactuallyperformanymeanseparationtest.   Meanseparationsformainfactorwithlsmeans   library(multcompView) library(lsmeans)lsmeans=lsmeans::lsmeans###Usesthelsmeans function                           ### fromthelsmeanspackage,                           ### notfromthelmerTestpackage leastsquare=lsmeans(model,                    "Genotype",                     adjust="tukey") cld(leastsquare,    alpha=.05,    Letters=letters)   Genotype  lsmean       SEdflower.CLupper.CL.group  fs      3.0473750.2722236302.3590653.735685 a     ff      3.0991250.2722236302.4108153.787435 a     ss      3.3181250.2722236302.6298154.006435 a      ###Meanssharingaletterin .grouparenotsignificantlydifferent     Meanseparationsforinteractioneffectwithlsmeans   library(multcompView) library(lsmeans)lsmeans=lsmeans::lsmeans###Usesthelsmeans function                           ### fromthelsmeanspackage,                           ### notfromthelmerTestpackage leastsquare=lsmeans(model,                    pairwise~Sex:Genotype,                     adjust="tukey")   cld(leastsquare,    alpha=.05,     Letters=letters)      Sex   Genotype lsmean       SEdflower.CLupper.CL.group  male  fs      2.776500.4445393301.5246664.028334 a     femaleff      3.050250.3143368302.1650693.935431 a     male  ff      3.148000.4445393301.8961664.399834 a     femaless      3.234500.3143368302.3493194.119681 a     femalefs      3.318250.3143368302.4330694.203431 a     male  ss      3.401750.4445393302.1499164.653584 a    ###Notethatmeansarelistedfrom lowtohigh, ### notinthesameorderas summarySE     Graphingtheresults   Simplebarplotwithcategoriesandnoerrorbars   ###Re-enterdataasmatrix Input=(" Sex    ff      fs      ss  Female 3.05025 3.31825 3.23450  Male   3.14800 2.77650 3.40175 ") Matriz=as.matrix(read.table(textConnection(Input),                   header=TRUE,                   row.names=1)) Matriz              ff     fs     ss Female3.050253.318253.23450 Male  3.148002.776503.40175     barplot(Matriz,        beside=TRUE,        legend=TRUE,        ylim=c(0,5),        xlab="Genotype",        ylab="MPIActivity")     Barplotwitherrorbarswithggplot2 ThisplotusesthedataframecreatedbysummarySEin Rmisc. Errorbarsindicatestandarderrorofthemeans(sein thedataframe).   library(Rmisc) sum=summarySE(Data,measurevar="Activity",groupvars=c("Sex","Genotype")) sum       SexGenotypeNActivity       sd       se       ci 1female      ff8 3.050250.95990320.33937700.8024992 2female      fs8 3.318251.14453880.40465560.9568584 3female      ss8 3.234500.36177540.12790690.3024518 4  male      ff4 3.148001.37451150.68725582.1871546 5  male      fs4 2.776500.31684330.15842160.5041684 6  male      ss4 3.401750.63481090.31740551.0101258     ###Plotadaptedfrom: ###  shinyapps.stat.ubc.ca/r-graph-catalog/   library(ggplot2) library(grid) ggplot(sum,   aes(x=Genotype,y=Activity,fill=Sex,       ymax=Activity+se,ymin=Activity-se)) +       geom_bar(stat="identity",position="dodge",width =0.7)+       geom_bar(stat="identity",position="dodge",                colour="black",width=0.7,                show_guide=FALSE) +       scale_y_continuous(breaks=seq(0,4,0.5),                limits=c(0,4),                expand=c(0,0)) +       scale_fill_manual(name="Counttype",                 values=c('grey80','grey30'),                 labels=c("Female",                            "Male")) +       geom_errorbar(position=position_dodge(width=0.7),                     width=0.0,size=0.5,color="black") +       labs(x="Genotype",            y="MPIActivity") +       ##ggtitle("Maintitle")+       theme_bw() +       theme(panel.grid.major.x=element_blank(),             panel.grid.major.y=element_line(colour="grey50"),             plot.title=element_text(size=rel(1.5),             face="bold",vjust=1.5),             axis.title=element_text(face="bold"),             legend.position="top",             legend.title=element_blank(),             legend.key.size=unit(0.4,"cm"),             legend.key=element_rect(fill="black"),             axis.title.y=element_text(vjust=1.8),             axis.title.x=element_text(vjust=-0.5)           )   #    #    #   Barplotforatwo-wayanova. Barheightsrepresentmeansfor groups,anderrorbarsindicatestandarderrorsofthemean.     Rattlesnakeexample–two-wayanovawithout replication,repeatedmeasures Thisexamplecouldbeinterpretedastwo-wayanovawithout replicationorasaone-wayrepeatedmeasuresexperiment.Belowitisanalyzed asatwo-wayfixedeffectsmodelusingthelmfunction,andasamixed effectsmodelusingthenlmepackageandlme4packages.   ### -------------------------------------------------------------- ###Two-wayanova,rattlesnakeexample,pp.177–178 ###-------------------------------------------------------------- Input=("  Day Snake Openings  1   D1       85  1   D3      107  1   D5       61  1   D8       22  1   D11      40  1   D12      65  2   D1       58  2   D3       51  2   D5       60  2   D8       41  2   D11      45  2   D12      27  3   D1       15  3   D3       30  3   D5       68  3   D8       63  3   D11      28  3   D12       3  4   D1       57  4   D3       12  4   D5       36  4   D8       21  4   D11      10  4   D12      16 ") Data=read.table(textConnection(Input),header=TRUE)### TreatDayasafactorvariable Data$Day=as.factor(Data$Day)     Usingtwo-wayfixedeffectsmodel   Meansandsummarystatisticsbygroup   library(Rmisc) sum=summarySE(Data,measurevar="Openings",groupvars=c("Day")) sum    DayNOpenings      sd       se      ci 1  1663.3333330.4543412.43293131.95987 2  2647.0000012.21475 4.98664912.81859 3  3634.5000025.9595810.59795627.24291 4  4625.3333318.08498 7.38316418.97903     Simpleboxplots   boxplot(Openings~Day,        data=Data,        xlab="Day",        ylab="Openingsuntiltailstopsrattling")     FitthelinearmodelandconductANOVA   model=lm(Openings~Day+Snake,           data=Data) library(car) Anova(model,type="II")                   # Canusetype="III"            SumSqDfFvalue Pr(>F)  Day      4877.8 3 3.32010.04866* Snake    3042.2 5 1.24240.33818  Residuals7346.015                      anova(model)                              # ProducestypeIsumofsquares            DfSumSqMeanSqFvalue Pr(>F)  Day       34877.81625.93 3.32010.04866* Snake     53042.2 608.44 1.24240.33818  Residuals157346.0 489.73                      summary(model)    #Producesr-square, overallp-value,parameterestimates   MultipleR-squared: 0.5188, AdjustedR-squared: 0.2622 F-statistic:2.022on8and15DF, p-value:0.1142     Checkingassumptionsofthemodel   hist(residuals(model),     col="darkgray")   Ahistogramofresidualsfromalinearmodel. Thedistributionof theseresidualsshouldbeapproximatelynormal.     plot(fitted(model),     residuals(model))     Aplotofresidualsvs.predictedvalues. Theresidualsshouldbe unbiasedandhomoscedastic. Foranillustrationoftheseproperties,seethis diagrambySteveJostatDePaulUniversity:condor.depaul.edu/sjost/it223/documents/resid-plots.gif.   ###additionalmodelcheckingplots with:plot(model) ###alternative:library(FSA); residPlot(model)     Meanseparationsformainfactorwithlsmeans   Fornotesonleast-squaremeans,seethe“Post-hoc comparisonofleast-square”meanssectionintheNestedanovachapterin thisbook. Forothermeanseparationtechniquesforamainfactorinanova, see“TukeyandLeastSignificantDifferencemeanseparationtests(pairwise comparisons)”sectionintheOne-wayanovachapter.   library(multcompView) library(lsmeans)lsmeans=lsmeans::lsmeans###Usesthelsmeans function                           ### fromthelsmeanspackage,                           ### notfromthelmerTestpackage leastsquare=lsmeans(model,                   "Day",                     adjust="tukey") cld(leastsquare,    alpha=.05,    Letters=letters)   Day  lsmean      SEdf  lower.CLupper.CL.group  4  25.333339.03447615-0.208587150.87525 a     3  34.500009.03447615 8.958079660.04192 ab    2  47.000009.0344761521.458079672.54192 ab    1  63.333339.0344761537.791412988.87525  b       ###Meanssharingaletterin .grouparenotsignificantlydifferent      Usingmixedeffectsmodelwithnlme Thisisanabbreviatedexampleusingthelmefunction inthenlmepackage. library(nlme) model=lme(Openings~Day,random=~1|Snake,            data=Data,            method="REML") anova.lme(model,          type="sequential",          adjustSigma=FALSE)            numDFdenDF F-valuep-value (Intercept)    1   1571.38736 <.0001 day library function leastsquare='lsmeans(model,             "Day",              ' adjust="tukey" cld significantlydifferent usingmixedeffectsmodelwithlmer thisisanabbreviatedexampleusingthelmer functioninthelme4package. model="lmer(Openings~Day+(1|Snake)," anova analysisofvariancetableoftypeiii approximationfordegreesoffreedom>F)  Day4877.8 1625.9    3   15 3.32010.04866* rand(model) AnalysisofRandomeffectsTable:      Chi.sqChi.DFp.value Snake0.0915     1    0.8 Leastsquaremeanswiththelsmeanspackage library(multcompView) library(lsmeans) lsmeans=lsmeans::lsmeans###Usesthelsmeans function                           ### fromthelsmeanspackage,                           ### notfromthelmerTestpackage leastsquare=lsmeans(model,                     pairwise~Day,                     alpha=.05,                     adjust="tukey") cld(leastsquare,    alpha=.05,     Letters=letters,    adjust="tukey") Day  lsmean      SE   df  lower.CLupper.CL.group  4  25.333339.30419619.81-0.144177950.81084 a     3  34.500009.30419619.81 9.022488759.97751 ab    2  47.000009.30419619.8121.522488772.47751 ab    1  63.333339.30419619.8137.855822188.81084  b   Degrees-of-freedommethod:satterthwaite Confidencelevelused:0.95 Conf-leveladjustment:sidakmethodfor4estimates Pvalueadjustment:tukeymethodforcomparingafamilyof4estimates significancelevelused:alpha=0.05   ###Meanssharingaletterin.grouparenot significantlydifferent LeastsquaremeansusingthelmerTestpackage lsmeans=lmerTest::lsmeans###Usesthelsmeans function                            ### fromthelmerTest package,                            ### notfromthelsmeans package LT=lsmeans(model,             test.effs="Day") LT LeastSquaresMeanstable:        DayEstimateStandardError  DFt-valueLowerCIUpperCIp-value    Day 1 1.0   63.33          9.3019.8   6.81   43.91    82.8 <2e-16 *** Day 2 2.0   47.00          9.3019.8   5.05   27.58    66.4  1e-04*** Day 3 3.0   34.50          9.3019.8   3.71   15.08    53.9  0.001** Day 4 4.0   25.33          9.3019.8   2.72    5.91    44.8  0.013*  PT=difflsmeans(model,                 test.effs="Day") PT DifferencesofLSMEANS:          EstimateStandardError  DFt-valueLowerCIUpperCIp-value   Day1-2    16.3         12.7815.0   1.28  -10.90    43.6  0.220   Day1-3    28.8         12.7815.0   2.26    1.60    56.1  0.039* Day1-4    38.0         12.7815.0   2.97   10.77    65.2  0.009** Day2-3    12.5         12.7815.0   0.98  -14.73    39.7  0.343   Day2-4    21.7         12.7815.0   1.70   -5.57    48.9  0.111   Day3-4     9.2         12.7815.0   0.72  -18.07    36.4  0.484   ###Extractlsmeanstable Sum=PT$diffs.lsmeans.table ###Extractcomparisonsandp-values Comparison=rownames(Sum) P.value  =Sum$'p-value' ###Adjustp-values P.value.adj=p.adjust(P.value,                       method= "none")   ###Fixnamesofcomparisons Comparison=gsub("-","-Day",Comparison) ###Producecompactletterdisplay library(rcompanion) cldList(comparison=Comparison,        p.value   =P.value.adj,        threshold=0.05)  GroupLetterMonoLetter 1 Day1     a        a 2 Day2    ab        ab 3 Day3     b         b 4 Day4     b         b #    #    #   ©2015bySalvatoreS.Mangiafico.RutgersCooperative Extension,NewBrunswick,NJ.Organizationofstatisticaltestsandselectionofexamplesforthese tests©2014byJohnH.McDonald. Usedwithpermission. Non-commercialreproductionofthiscontent,with attribution,ispermitted. For-profitreproductionwithoutpermissionis prohibited. Ifyouusethecodeorinformationinthissitein apublishedwork,pleaseciteitasasource. Also,ifyouareaninstructorandusethisbookinyourcourse,pleaseletmeknow.  MycontactinformationisontheAbouttheAuthorpage. ThissiteusesadvertisingfromMedia.net.Formore information,visitourprivacypolicypage. Proceedsfromtheseadsgo tosupporteducationandresearchactivities,includingtheimprovement ofthissite. Citation: Mangiafico,S.S.2015.AnRCompanionfortheHandbookofBiological Statistics,version1.3.2. rcompanion.org/rcompanion/.(Pdfversion: rcompanion.org/documents/RCompanionBioStatistics.pdf.)



請為這篇文章評分?