Statistical significance - Wikipedia
In statistical hypothesis testing,[1][2] a result has statistical significance when a result at least as extreme would be very unlikely to occur if the null hypothesis were true.[3] More precisely, a study's defined significance level, denoted by $\alpha$, is the probability of the study rejecting the null hypothesis, given that the null hypothesis is true;[4] and the p-value of a result, $p$, is the probability of obtaining a result at least as extreme, given that the null hypothesis is true.[5] The result is statistically significant, by the standards of the study, when $p \leq \alpha$.[6][7][8][9][10][11][12] The significance level for a study is chosen before data collection, and is typically set to 5%[13] or much lower, depending on the field of study.[14]

In any experiment or observation that involves drawing a sample from a population, there is always the possibility that an observed effect would have occurred due to sampling error alone.[15][16] But if the p-value of an observed effect is less than (or equal to) the significance level, an investigator may conclude that the effect reflects the characteristics of the whole population,[1] thereby rejecting the null hypothesis.[17]

This technique for testing the statistical significance of results was developed in the early 20th century. The term significance does not imply importance here, and the term statistical significance is not the same as research significance, theoretical significance, or practical significance.[1][2][18][19] For example, the term clinical significance refers to the practical importance of a treatment effect.[20]
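As an illustration of these definitions, the following Python sketch computes an exact one-sided p-value for a hypothetical coin-flipping experiment and compares it with a 5% significance level. The scenario (16 heads in 20 flips, with a fair coin as the null hypothesis) and the function name `binomial_p_value` are assumptions chosen for the example and do not come from the cited sources.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= successes) when X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Assumed example data: 16 heads observed in 20 flips.
alpha = 0.05                        # significance level, fixed before looking at the data
p_value = binomial_p_value(16, 20)  # probability of 16 or more heads from a fair coin
significant = p_value <= alpha      # decision rule: significant when p <= alpha

print(f"p = {p_value:.4f}, significant at alpha = {alpha}: {significant}")
```

With these assumed numbers the p-value is 6196/2^20, about 0.0059, so the result would count as statistically significant at $\alpha = 0.05$.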
History

Main article: History of statistics

Statistical significance dates to the 1700s, in the work of John Arbuthnot and Pierre-Simon Laplace, who computed the p-value for the human sex ratio at birth, assuming a null hypothesis of equal probability of male and female births; see p-value § History for details.[21][22][23][24][25][26][27]

In 1925, Ronald Fisher advanced the idea of statistical hypothesis testing, which he called "tests of significance", in his publication Statistical Methods for Research Workers.[28][29][30] Fisher suggested a probability of one in twenty (0.05) as a convenient cutoff level to reject the null hypothesis.[31] In a 1933 paper, Jerzy Neyman and Egon Pearson called this cutoff the significance level, which they named $\alpha$. They recommended that $\alpha$ be set ahead of time, prior to any data collection.[31][32]

Despite his initial suggestion of 0.05 as a significance level, Fisher did not intend this cutoff value to be fixed. In his 1956 publication Statistical Methods and Scientific Inference, he recommended that significance levels be set according to specific circumstances.[31]

Related concepts

The significance level $\alpha$ is the threshold for $p$ below which the null hypothesis is rejected: a result that extreme is taken as evidence that something other than the null hypothesis is going on, even though the p-value was computed on the assumption that the null hypothesis is true. This means that $\alpha$ is also the probability of mistakenly rejecting the null hypothesis, if the null hypothesis is true.[4] Such a mistaken rejection is also called a false positive or a type I error.

Sometimes researchers talk about the confidence level γ = (1 − α) instead. This is the probability of not rejecting the null hypothesis given that it is true.[33][34] Confidence levels and confidence intervals were introduced by Neyman in 1937.[35]

Role in statistical hypothesis testing

Main articles: Statistical hypothesis testing, Null hypothesis, Alternative hypothesis, p-value, and Type I and type II errors

Figure: In a two-tailed test, the rejection region for a significance level of α = 0.05 is partitioned to both ends of the sampling distribution and makes up 5% of the area under the curve (white areas).
Statistical significance plays a pivotal role in statistical hypothesis testing. It is used to determine whether the null hypothesis should be rejected or retained. The null hypothesis is the default assumption that nothing happened or changed.[36] For the null hypothesis to be rejected, an observed result has to be statistically significant, i.e. the observed p-value is less than (or equal to) the pre-specified significance level $\alpha$.

To determine whether a result is statistically significant, a researcher calculates a p-value, which is the probability of observing an effect of the same magnitude or more extreme given that the null hypothesis is true.[5][12] The null hypothesis is rejected if the p-value is less than (or equal to) a predetermined level, $\alpha$. This level is also called the significance level, and is the probability of rejecting the null hypothesis given that it is true (a type I error). It is usually set at or below 5%.

For example, when $\alpha$ is set to 5%, the conditional probability of a type I error, given that the null hypothesis is true, is 5%,[37] and a statistically significant result is one where the observed p-value is less than (or equal to) 5%.[38] When drawing data from a sample, this means that the rejection region comprises 5% of the sampling distribution.[39] This 5% can be allocated to one side of the sampling distribution, as in a one-tailed test, or partitioned to both sides of the distribution, as in a two-tailed test, with each tail (or rejection region) containing 2.5% of the distribution.
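The allocation of the rejection region can be made concrete with a short Python sketch. It assumes, purely for illustration, that the test statistic follows a standard normal distribution under the null hypothesis (a z-test); the function names and the observed value z = 1.8 are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # null distribution assumed standard normal (mean 0, sd 1)


def critical_values(alpha: float) -> tuple[float, float]:
    """Return (one-tailed upper cutoff, two-tailed cutoff) for a given alpha."""
    one_tailed = std_normal.inv_cdf(1 - alpha)      # all of alpha in the upper tail
    two_tailed = std_normal.inv_cdf(1 - alpha / 2)  # alpha / 2 in each tail
    return one_tailed, two_tailed


def p_value_from_z(z: float, two_tailed: bool = True) -> float:
    """p-value of an observed z statistic under the assumed standard normal null."""
    if two_tailed:
        return 2 * (1 - std_normal.cdf(abs(z)))
    return 1 - std_normal.cdf(z)  # upper-tailed alternative


alpha = 0.05
one_cut, two_cut = critical_values(alpha)
z_observed = 1.8  # hypothetical observed statistic
p_one = p_value_from_z(z_observed, two_tailed=False)
p_two = p_value_from_z(z_observed, two_tailed=True)

print(f"cutoffs: one-tailed {one_cut:.3f}, two-tailed {two_cut:.3f}")
print(f"z = {z_observed}: one-tailed p = {p_one:.4f}, two-tailed p = {p_two:.4f}")
```

Under these assumptions the cutoffs are about 1.645 (one-tailed) and 1.960 (two-tailed), so an observed z of 1.8 falls in the rejection region of the one-tailed test but not in that of the two-tailed test.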
The use of a one-tailed test is dependent on whether the research question or alternative hypothesis specifies a direction, such as whether a group of objects is heavier or the performance of students on an assessment is better.[3] A two-tailed test may still be used but it will be less powerful than a one-tailed test, because the rejection region for a one-tailed test is concentrated on one end of the null distribution and is twice the size (5% vs. 2.5%) of each rejection region for a two-tailed test. As a result, the null hypothesis can be rejected with a less extreme result if a one-tailed test was used.[40] The one-tailed test is only more powerful than a two-tailed test if the specified direction of the alternative hypothesis is correct. If it is wrong, however, then the one-tailed test has no power.

Significance thresholds in specific fields

Further information: Standard deviation and Normal distribution

In specific fields such as particle physics and manufacturing, statistical significance is often expressed in multiples of the standard deviation or sigma (σ) of a normal distribution, with significance thresholds set at a much stricter level (e.g. 5σ).[41][42] For instance, the certainty of the Higgs boson particle's existence was based on the 5σ criterion, which corresponds to a p-value of about 1 in 3.5 million.[42][43]

In other fields of scientific research such as genome-wide association studies, significance levels as low as $5 \times 10^{-8}$ are not uncommon,[44][45] as the number of tests performed is extremely large.
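The correspondence between sigma thresholds and p-values can be checked numerically. The sketch below assumes a one-sided tail of a standard normal distribution, which is the convention under which 5σ gives roughly 1 in 3.5 million; the function name `sigma_to_p` is hypothetical. It also shows one common rationale for a genome-wide threshold, a Bonferroni-style division of α = 0.05 by an assumed one million independent tests; that count is an assumption of this example, not a figure from the text above.

```python
from math import erfc, sqrt


def sigma_to_p(n_sigma: float) -> float:
    """One-sided upper-tail probability of a standard normal beyond n_sigma."""
    return 0.5 * erfc(n_sigma / sqrt(2))


p_five_sigma = sigma_to_p(5.0)
one_in = 1 / p_five_sigma  # "1 in N" form of the same probability

# Bonferroni-style per-test threshold; the number of tests is an assumption.
alpha = 0.05
assumed_tests = 1_000_000
per_test_threshold = alpha / assumed_tests

print(f"5 sigma: p = {p_five_sigma:.3e} (about 1 in {one_in:,.0f})")
print(f"per-test threshold for {assumed_tests:,} tests: {per_test_threshold:.1e}")
```

The complementary error function `erfc` is used instead of `1 - cdf` because it keeps precision for probabilities this far into the tail.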
Limitations

Researchers focusing solely on whether their results are statistically significant might report findings that are not substantive[46] and not replicable.[47][48] There is also a difference between statistical significance and practical significance. A study that is found to be statistically significant may not necessarily be practically significant.[49][19]

Effect size

Main article: Effect size

Effect size is a measure of a study's practical significance.[49] A statistically significant result may have a weak effect. To gauge the research significance of their result, researchers are encouraged to always report an effect size along with p-values. An effect size measure quantifies the strength of an effect, such as the distance between two means in units of standard deviation (cf. Cohen's d), the correlation coefficient between two variables or its square, and other measures.[50]

Reproducibility

Main article: Reproducibility

A statistically significant result may not be easy to reproduce.[48] In particular, some statistically significant results will in fact be false positives. Each failed attempt to reproduce a result increases the likelihood that the result was a false positive.[51]

Challenges

See also: Misuse of p-values

Overuse in some journals

Starting in the 2010s, some journals began questioning whether significance testing, and particularly using a threshold of α = 5%, was being relied on too heavily as the primary measure of validity of a hypothesis.[52] Some journals encouraged authors to do more detailed analysis than just a statistical significance test. In social psychology, the journal Basic and Applied Social Psychology banned the use of significance testing altogether from papers it published,[53] requiring authors to use other measures to evaluate hypotheses and impact.[54][55]
Other editors, commenting on this ban, have noted: "Banning the reporting of p-values, as Basic and Applied Social Psychology recently did, is not going to solve the problem because it is merely treating a symptom of the problem. There is nothing wrong with hypothesis testing and p-values per se as long as authors, reviewers, and action editors use them correctly."[56] Some statisticians prefer to use alternative measures of evidence, such as likelihood ratios or Bayes factors.[57] Using Bayesian statistics can avoid confidence levels, but also requires making additional assumptions,[57] and may not necessarily improve practice regarding statistical testing.[58]

The widespread abuse of statistical significance represents an important topic of research in metascience.[59]

Redefining significance

In 2016, the American Statistical Association (ASA) published a statement on p-values, saying that "the widespread use of 'statistical significance' (generally interpreted as 'p ≤ 0.05') as a license for making a claim of a scientific finding (or implied truth) leads to considerable distortion of the scientific process".[57] In 2017, a group of 72 authors proposed to enhance reproducibility by changing the p-value threshold for statistical significance from 0.05 to 0.005.[60] Other researchers responded that imposing a more stringent significance threshold would aggravate problems such as data dredging; alternative propositions are thus to select and justify flexible p-value thresholds before collecting data,[61] or to interpret p-values as continuous indices, thereby discarding thresholds and statistical significance.[62] Additionally, the change to 0.005 would increase the likelihood of false negatives, whereby the effect being studied is real, but the test fails to show it.[63]

In 2019, over 800 statisticians and scientists signed a message calling for the abandonment of the term "statistical significance" in science,[64] and the ASA published a further official statement[65] declaring (page 2):

We conclude, based on our review of the articles in this special issue and the broader literature, that it is time to stop using the term "statistically significant" entirely. Nor should variants such as "significantly different," "$p \leq 0.05$," and "nonsignificant" survive, whether expressed in words, by asterisks in a table, or in some other way.

See also

- A/B testing, ABX test
- Estimation statistics
- Fisher's method for combining independent tests of significance
- Look-elsewhere effect
- Multiple comparisons problem
- Sample size
- Texas sharpshooter fallacy (gives examples of tests where the significance level was set too high)

References

[1] Sirkin, R. Mark (2005). "Two-sample t tests". Statistics for the Social Sciences (3rd ed.). Thousand Oaks, CA: SAGE Publications, Inc. pp. 271–316. ISBN 978-1-412-90546-6.
[2] Borror, Connie M. (2009). "Statistical decision making". The Certified Quality Engineer Handbook (3rd ed.). Milwaukee, WI: ASQ Quality Press. pp. 418–472. ISBN 978-0-873-89745-7.
[3] Myers, Jerome L.; Well, Arnold D.; Lorch, Robert F. Jr. (2010). "Developing fundamentals of hypothesis testing using the binomial distribution". Research design and statistical analysis (3rd ed.). New York, NY: Routledge. pp. 65–90. ISBN 978-0-805-86431-1.
[4] Dalgaard, Peter (2008). "Power and the computation of sample size". Introductory Statistics with R. Statistics and Computing. New York: Springer. pp. 155–56. doi:10.1007/978-0-387-79054-1_9. ISBN 978-0-387-79053-4.
[5] "Statistical Hypothesis Testing". www.dartmouth.edu. Archived from the original on 2020-08-02. Retrieved 2019-11-11.
[6] Johnson, Valen E. (October 9, 2013). "Revised standards for statistical evidence". Proceedings of the National Academy of Sciences. 110 (48): 19313–19317. Bibcode:2013PNAS..11019313J. doi:10.1073/pnas.1313476110. PMC 3845140. PMID 24218581.
[7] Redmond, Carol; Colton, Theodore (2001). "Clinical significance versus statistical significance". Biostatistics in Clinical Trials. Wiley Reference Series in Biostatistics (3rd ed.). West Sussex, United Kingdom: John Wiley & Sons Ltd. pp. 35–36. ISBN 978-0-471-82211-0.
[8] Cumming, Geoff (2012). Understanding The New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis. New York, USA: Routledge. pp. 27–28.
[9] Krzywinski, Martin; Altman, Naomi (30 October 2013). "Points of significance: Significance, P values and t-tests". Nature Methods. 10 (11): 1041–1042. doi:10.1038/nmeth.2698. PMID 24344377.
[10] Sham, Pak C.; Purcell, Shaun M (17 April 2014). "Statistical power and significance testing in large-scale genetic studies". Nature Reviews Genetics. 15 (5): 335–346. doi:10.1038/nrg3706. PMID 24739678. S2CID 10961123.
[11] Altman, Douglas G. (1999). Practical Statistics for Medical Research. New York, USA: Chapman & Hall/CRC. p. 167. ISBN 978-0412276309.
[12] Devore, Jay L. (2011). Probability and Statistics for Engineering and the Sciences (8th ed.). Boston, MA: Cengage Learning. pp. 300–344. ISBN 978-0-538-73352-6.
[13] Craparo, Robert M. (2007). "Significance level". In Salkind, Neil J. (ed.). Encyclopedia of Measurement and Statistics. Vol. 3. Thousand Oaks, CA: SAGE Publications. pp. 889–891. ISBN 978-1-412-91611-0.
[14] Sproull, Natalie L. (2002). "Hypothesis testing". Handbook of Research Methods: A Guide for Practitioners and Students in the Social Science (2nd ed.). Lanham, MD: Scarecrow Press, Inc. pp. 49–64. ISBN 978-0-810-84486-5.
[15] Babbie, Earl R. (2013). "The logic of sampling". The Practice of Social Research (13th ed.). Belmont, CA: Cengage Learning. pp. 185–226. ISBN 978-1-133-04979-1.
[16] Faherty, Vincent (2008). "Probability and statistical significance". Compassionate Statistics: Applied Quantitative Analysis for Social Services (With exercises and instructions in SPSS) (1st ed.). Thousand Oaks, CA: SAGE Publications, Inc. pp. 127–138. ISBN 978-1-412-93982-9.
[17] McKillup, Steve (2006). "Probability helps you make a decision about your results". Statistics Explained: An Introductory Guide for Life Scientists (1st ed.). Cambridge, United Kingdom: Cambridge University Press. pp. 44–56. ISBN 978-0-521-54316-3.
[18] Myers, Jerome L.; Well, Arnold D.; Lorch, Robert F. Jr. (2010). "The t distribution and its applications". Research Design and Statistical Analysis (3rd ed.). New York, NY: Routledge. pp. 124–153. ISBN 978-0-805-86431-1.
[19] Hooper, Peter. "What is P-value?" (PDF). University of Alberta, Department of Mathematical and Statistical Sciences. Retrieved November 10, 2019.
[20] Leung, W.-C. (2001-03-01). "Balancing statistical and clinical significance in evaluating treatment effects". Postgraduate Medical Journal. 77 (905): 201–204. doi:10.1136/pmj.77.905.201. ISSN 0032-5473. PMC 1741942. PMID 11222834.
[21] Brian, Éric; Jaisson, Marie (2007). "Physico-Theology and Mathematics (1710–1794)". The Descent of Human Sex Ratio at Birth. Springer Science & Business Media. pp. 1–25. ISBN 978-1-4020-6036-6.
[22] John Arbuthnot (1710). "An argument for Divine Providence, taken from the constant regularity observed in the births of both sexes" (PDF). Philosophical Transactions of the Royal Society of London. 27 (325–336): 186–190. doi:10.1098/rstl.1710.0011.
[23] Conover, W. J. (1999). "Chapter 3.4: The Sign Test". Practical Nonparametric Statistics (Third ed.). Wiley. pp. 157–176. ISBN 978-0-471-16068-7.
[24] Sprent, P. (1989). Applied Nonparametric Statistical Methods (Second ed.). Chapman & Hall. ISBN 978-0-412-44980-2.
[25] Stigler, Stephen M. (1986). The History of Statistics: The Measurement of Uncertainty Before 1900. Harvard University Press. pp. 225–226. ISBN 978-0-67440341-3.
[26] Bellhouse, P. (2001). "John Arbuthnot". In Statisticians of the Centuries by C. C. Heyde and E. Seneta. Springer. pp. 39–42. ISBN 978-0-387-95329-8.
[27] Hald, Anders (1998). "Chapter 4. Chance or Design: Tests of Significance". A History of Mathematical Statistics from 1750 to 1930. Wiley. p. 65.
[28] Cumming, Geoff (2011). "From null hypothesis significance to testing effect sizes". Understanding The New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis. Multivariate Applications Series. East Sussex, United Kingdom: Routledge. pp. 21–52. ISBN 978-0-415-87968-2.
[29] Fisher, Ronald A. (1925). Statistical Methods for Research Workers. Edinburgh, UK: Oliver and Boyd. p. 43. ISBN 978-0-050-02170-5.
[30] Poletiek, Fenna H. (2001). "Formal theories of testing". Hypothesis-testing Behaviour. Essays in Cognitive Psychology (1st ed.). East Sussex, United Kingdom: Psychology Press. pp. 29–48. ISBN 978-1-841-69159-6.
[31] Quinn, Geoffrey R.; Keough, Michael J. (2002). Experimental Design and Data Analysis for Biologists (1st ed.). Cambridge, UK: Cambridge University Press. pp. 46–69. ISBN 978-0-521-00976-8.
[32] Neyman, J.; Pearson, E. S. (1933). "The testing of statistical hypotheses in relation to probabilities a priori". Mathematical Proceedings of the Cambridge Philosophical Society. 29 (4): 492–510. Bibcode:1933PCPS...29..492N. doi:10.1017/S030500410001152X. S2CID 119855116.
[33] "Conclusions about statistical significance are possible with the help of the confidence interval. If the confidence interval does not include the value of zero effect, it can be assumed that there is a statistically significant result." Prel, Jean-Baptist du; Hommel, Gerhard; Röhrig, Bernd; Blettner, Maria (2009). "Confidence Interval or P-Value?". Deutsches Ärzteblatt Online. 106 (19): 335–9. doi:10.3238/arztebl.2009.0335. PMC 2689604. PMID 19547734.
[34] StatNews #73: Overlapping Confidence Intervals and Statistical Significance.
[35] Neyman, J. (1937). "Outline of a Theory of Statistical Estimation Based on the Classical Theory of Probability". Philosophical Transactions of the Royal Society A. 236 (767): 333–380. Bibcode:1937RSPTA.236..333N. doi:10.1098/rsta.1937.0005. JSTOR 91337.
[36] Meier, Kenneth J.; Brudney, Jeffrey L.; Bohte, John (2011). Applied Statistics for Public and Nonprofit Administration (3rd ed.). Boston, MA: Cengage Learning. pp. 189–209. ISBN 978-1-111-34280-7.
[37] Healy, Joseph F. (2009). The Essentials of Statistics: A Tool for Social Research (2nd ed.). Belmont, CA: Cengage Learning. pp. 177–205. ISBN 978-0-495-60143-2.
[38] McKillup, Steve (2006). Statistics Explained: An Introductory Guide for Life Scientists (1st ed.). Cambridge, UK: Cambridge University Press. pp. 32–38. ISBN 978-0-521-54316-3.
[39] Health, David (1995). An Introduction To Experimental Design And Statistics For Biology (1st ed.). Boston, MA: CRC press. pp. 123–154. ISBN 978-1-857-28132-3.
[40] Hinton, Perry R. (2010). "Significance, error, and power". Statistics explained (3rd ed.). New York, NY: Routledge. pp. 79–90. ISBN 978-1-848-72312-2.
[41] Vaughan, Simon (2013). Scientific Inference: Learning from Data (1st ed.). Cambridge, UK: Cambridge University Press. pp. 146–152. ISBN 978-1-107-02482-3.
[42] Bracken, Michael B. (2013). Risk, Chance, and Causation: Investigating the Origins and Treatment of Disease (1st ed.). New Haven, CT: Yale University Press. pp. 260–276. ISBN 978-0-300-18884-4.
[43] Franklin, Allan (2013). "Prologue: The rise of the sigmas". Shifting Standards: Experiments in Particle Physics in the Twentieth Century (1st ed.). Pittsburgh, PA: University of Pittsburgh Press. pp. Ii–Iii. ISBN 978-0-822-94430-0.
[44] Clarke, GM; Anderson, CA; Pettersson, FH; Cardon, LR; Morris, AP; Zondervan, KT (February 6, 2011). "Basic statistical analysis in genetic case-control studies". Nature Protocols. 6 (2): 121–33. doi:10.1038/nprot.2010.182. PMC 3154648. PMID 21293453.
[45] Barsh, GS; Copenhaver, GP; Gibson, G; Williams, SM (July 5, 2012). "Guidelines for Genome-Wide Association Studies". PLOS Genetics. 8 (7): e1002812. doi:10.1371/journal.pgen.1002812. PMC 3390399. PMID 22792080.
[46] Carver, Ronald P. (1978). "The Case Against Statistical Significance Testing". Harvard Educational Review. 48 (3): 378–399. doi:10.17763/haer.48.3.t490261645281841. S2CID 16355113.
[47] Ioannidis, John P. A. (2005). "Why most published research findings are false". PLOS Medicine. 2 (8): e124. doi:10.1371/journal.pmed.0020124. PMC 1182327. PMID 16060722.
[48] Amrhein, Valentin; Korner-Nievergelt, Fränzi; Roth, Tobias (2017). "The earth is flat (p > 0.05): significance thresholds and the crisis of unreplicable research". PeerJ. 5: e3544. doi:10.7717/peerj.3544. PMC 5502092. PMID 28698825.
[49] Hojat, Mohammadreza; Xu, Gang (2004). "A Visitor's Guide to Effect Sizes". Advances in Health Sciences Education. 9 (3): 241–9. doi:10.1023/B:AHSE.0000038173.00909.f6. PMID 15316274. S2CID 8045624.
[50] Pedhazur, Elazar J.; Schmelkin, Liora P. (1991). Measurement, Design, and Analysis: An Integrated Approach (Student ed.). New York, NY: Psychology Press. pp. 180–210. ISBN 978-0-805-81063-9.
[51] Stahel, Werner (2016). "Statistical Issue in Reproducibility". Reproducibility: Principles, Problems, Practices, and Prospects. pp. 87–114. doi:10.1002/9781118865064.ch5. ISBN 9781118864975.
[52] "CSSME Seminar Series: The argument over p-values and the Null Hypothesis Significance Testing (NHST) paradigm". www.education.leeds.ac.uk. School of Education, University of Leeds. Retrieved 2016-12-01.
[53] Novella, Steven (February 25, 2015). "Psychology Journal Bans Significance Testing". Science-Based Medicine.
[54] Woolston, Chris (2015-03-05). "Psychology journal bans P values". Nature. 519 (7541): 9. Bibcode:2015Natur.519....9W. doi:10.1038/519009f.
[55] Siegfried, Tom (2015-03-17). "P value ban: small step for a journal, giant leap for science". Science News. Retrieved 2016-12-01.
[56] Antonakis, John (February 2017). "On doing better science: From thrill of discovery to policy implications" (PDF). The Leadership Quarterly. 28 (1): 5–21. doi:10.1016/j.leaqua.2017.01.006.
[57] Wasserstein, Ronald L.; Lazar, Nicole A. (2016-04-02). "The ASA's Statement on p-Values: Context, Process, and Purpose". The American Statistician. 70 (2): 129–133. doi:10.1080/00031305.2016.1154108.
[58] García-Pérez, Miguel A. (2016-10-05). "Thou Shalt Not Bear False Witness Against Null Hypothesis Significance Testing". Educational and Psychological Measurement. 77 (4): 631–662. doi:10.1177/0013164416668232. ISSN 0013-1644. PMC 5991793. PMID 30034024.
[59] Ioannidis, John P. A.; Ware, Jennifer J.; Wagenmakers, Eric-Jan; Simonsohn, Uri; Chambers, Christopher D.; Button, Katherine S.; Bishop, Dorothy V. M.; Nosek, Brian A.; Munafò, Marcus R. (January 2017). "A manifesto for reproducible science". Nature Human Behaviour. 1: 0021. doi:10.1038/s41562-016-0021. PMC 7610724. PMID 33954258.
[60] Benjamin, Daniel; et al. (2018). "Redefine statistical significance". Nature Human Behaviour. 1 (1): 6–10. doi:10.1038/s41562-017-0189-z. PMID 30980045.
[61] Chawla, Dalmeet (2017). "'One-size-fits-all' threshold for P values under fire". Nature. doi:10.1038/nature.2017.22625.
[62] Amrhein, Valentin; Greenland, Sander (2017). "Remove, rather than redefine, statistical significance". Nature Human Behaviour. 2 (1): 0224. doi:10.1038/s41562-017-0224-0. PMID 30980046. S2CID 46814177.
[63] Vyse, Stuart (November 2017). "Moving Science's Statistical Goalposts". csicop.org. CSI. Retrieved 10 July 2018.
[64] McShane, Blake; Greenland, Sander; Amrhein, Valentin (March 2019). "Scientists rise up against statistical significance". Nature. 567 (7748): 305–307. Bibcode:2019Natur.567..305A. doi:10.1038/d41586-019-00857-9. PMID 30894741.
[65] Wasserstein, Ronald L.; Schirm, Allen L.; Lazar, Nicole A. (2019-03-20). "Moving to a World Beyond "p < 0.05"". The American Statistician. 73 (sup1): 1–19. doi:10.1080/00031305.2019.1583913.
Further reading

- Lydia Denworth, "A Significant Problem: Standard scientific methods are under fire. Will anything change?", Scientific American, vol. 321, no. 4 (October 2019), pp. 62–67. "The use of p values for nearly a century [since 1925] to determine statistical significance of experimental results has contributed to an illusion of certainty and [to] reproducibility crises in many scientific fields. There is growing determination to reform statistical analysis... Some [researchers] suggest changing statistical methods, whereas others would do away with a threshold for defining "significant" results." (p. 63.)
- Ziliak, Stephen and Deirdre McCloskey (2008), The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives. Ann Arbor, University of Michigan Press, 2009. ISBN 978-0-472-07007-7. Reviews and reception: (compiled by Ziliak)
- Thompson, Bruce (2004). "The "significance" crisis in psychology and education". Journal of Socio-Economics. 33 (5): 607–613. doi:10.1016/j.socec.2004.09.034.
- Chow, Siu L. (1996). Statistical Significance: Rationale, Validity and Utility, Volume 1 of series Introducing Statistical Methods, Sage Publications Ltd, ISBN 978-0-7619-5205-3; argues that statistical significance is useful in certain circumstances.
- Kline, Rex (2004). Beyond Significance Testing: Reforming Data Analysis Methods in Behavioral Research. Washington, DC: American Psychological Association.
- Nuzzo, Regina (2014). Scientific method: Statistical errors. Nature Vol. 506, pp. 150–152 (open access). Highlights common misunderstandings about the p value.
- Cohen, Joseph (1994). The earth is round (p < .05). Archived 2017-07-13 at the Wayback Machine.