A Structural Approach to Selection Bias - Lippincott


Original Article

Hernán, Miguel A.*; Hernández-Díaz, Sonia†; Robins, James M.*

Author Information
From the *Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts; and the †Slone Epidemiology Center, Boston University School of Public Health, Brookline, Massachusetts.
Submitted 21 March 2003; final version accepted 24 May 2004.
Miguel Hernán was supported by NIH grant K08-AI-49392 and James Robins by NIH grant R01-AI-32475.
Correspondence: Miguel Hernán, Department of Epidemiology, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115.

Epidemiology: September 2004 - Volume 15 - Issue 5 - p 615-625
doi: 10.1097/01.ede.0000135174.63482.43

Abstract

The term “selection bias” encompasses various biases in epidemiology. We describe examples of selection bias in case-control studies (eg, inappropriate selection of controls) and cohort studies (eg, informative censoring). We argue that the causal structure underlying the bias in each example is essentially the same: conditioning on a common effect of 2 variables, one of which is either exposure or a cause of exposure and the other is either the outcome or a cause of the outcome. This structure is shared by other biases (eg, adjustment for variables affected by prior exposure). A structural classification of bias distinguishes between biases resulting from conditioning on common effects (“selection bias”) and those resulting from the existence of common causes of exposure and outcome (“confounding”). This classification also leads to a unified approach to adjust for selection bias.

ArticlePlus files for this article: https://links.lww.com/EDE/A107

Epidemiologists apply the term “selection bias” to many biases, including bias resulting from inappropriate selection of controls in case-control studies, bias resulting from differential loss-to-follow up, incidence–prevalence bias, volunteer bias, healthy-worker bias, and nonresponse bias.
As discussed in numerous textbooks [1–5], the common consequence of selection bias is that the association between exposure and outcome among those selected for analysis differs from the association among those eligible. In this article, we consider whether all these seemingly heterogeneous types of selection bias share a common underlying causal structure that justifies classifying them together. We use causal diagrams to propose a common structure and show how this structure leads to a unified statistical approach to adjust for selection bias. We also show that causal diagrams can be used to differentiate selection bias from what epidemiologists generally consider confounding.

CAUSAL DIAGRAMS AND ASSOCIATION

Directed acyclic graphs (DAGs) are useful for depicting causal structure in epidemiologic settings [6–12]. In fact, the structure of bias resulting from selection was first described in the DAG literature by Pearl [13] and by Spirtes et al [14]. A DAG is composed of variables (nodes), both measured and unmeasured, and arrows (directed edges). A causal DAG is one in which 1) the arrows can be interpreted as direct causal effects (as defined in Appendix A.1), and 2) all common causes of any pair of variables are included on the graph. Causal DAGs are acyclic because a variable cannot cause itself, either directly or through other variables. The causal DAG in Figure 1 represents the dichotomous variables L (being a smoker), E (carrying matches in the pocket), and D (diagnosis of lung cancer). The lack of an arrow between E and D indicates that carrying matches does not have a causal effect (causative or preventive) on lung cancer, ie, the risk of D would be the same if one intervened to change the value of E.

FIGURE 1. Common cause L of exposure E and outcome D.

Besides representing causal relations, causal DAGs also encode the causal determinants of statistical associations. In fact, the theory of causal DAGs specifies that an association between an exposure and an outcome can be produced by the following 3 causal structures [13,14]:
1. Cause and effect: If the exposure E causes the outcome D, or vice versa, then they will in general be associated. Figure 2 represents a randomized trial in which E (antiretroviral treatment) prevents D (AIDS) among HIV-infected subjects. The (associational) risk ratio ARR_ED differs from 1.0, and this association is entirely attributable to the causal effect of E on D.

FIGURE 2. Causal effect of exposure E on outcome D.

2. Common causes: If the exposure and the outcome share a common cause, then they will in general be associated even if neither is a cause of the other. In Figure 1, the common cause L (smoking) results in E (carrying matches) and D (lung cancer) being associated, ie, again, ARR_ED ≠ 1.0.

3. Common effects: An exposure E and an outcome D that have a common effect C will be conditionally associated if the association measure is computed within levels of the common effect C, ie, the stratum-specific ARR_ED|C will differ from 1.0, regardless of whether the crude (equivalently, marginal, or unconditional) ARR_ED is 1.0. More generally, a conditional association between E and D will occur within strata of a common effect C of 2 other variables, one of which is either exposure or a cause of exposure and the other is either the outcome or a cause of the outcome. Note that E and D need not be unconditionally associated simply because they have a common effect.

In the Appendix we describe additional, more complex, structural causes of statistical associations.
That causal structures (1) and (2) imply a crude association accords with the intuition of most epidemiologists. We now provide intuition for why structure (3) induces a conditional association. (For a formal justification, see references 13 and 14.) In Figure 3, the genetic haplotype E and smoking D both cause coronary heart disease C. Nonetheless, E and D are marginally unassociated (ARR_ED = 1.0) because neither causes the other and they share no common cause. We now argue heuristically that, in general, they will be conditionally associated within levels of their common effect C.

FIGURE 3. Conditioning on a common effect C of exposure E and outcome D.

Suppose that the investigators, who are interested in estimating the effect of haplotype E on smoking status D, restricted the study population to subjects with heart disease (C = 1). The square around C in Figure 3 indicates that they are conditioning on a particular value of C. Knowing that a subject with heart disease lacks haplotype E provides some information about her smoking status because, in the absence of E, it is more likely that another cause of C such as D is present. That is, among people with heart disease, the proportion of smokers is increased among those without the haplotype E. Therefore, E and D are inversely associated conditionally on C = 1, and the conditional risk ratio ARR_ED|C=1 is less than 1.0. In the extreme, if E and D were the only causes of C, then among people with heart disease, the absence of one of them would perfectly predict the presence of the other.
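The heuristic argument for Figure 3 can be checked with a small exact calculation. The sketch below is illustrative only: the probabilities are hypothetical values chosen for convenience (none come from the article), and the names `P_E`, `P_D`, `P_C`, `joint`, and `risk_ratio` are introduced here. It enumerates the joint distribution implied by the DAG E -> C <- D and compares the marginal risk ratio with the one computed among subjects with C = 1.

```python
from fractions import Fraction

# Hypothetical parameters (not from the article): E and D are independent causes of C.
P_E = Fraction(1, 2)     # P(E = 1), carrying the haplotype
P_D = Fraction(3, 10)    # P(D = 1), smoking
P_C = {                  # P(C = 1 | E, D), heart disease, keyed by (e, d)
    (0, 0): Fraction(1, 20), (0, 1): Fraction(1, 5),
    (1, 0): Fraction(1, 5), (1, 1): Fraction(2, 5),
}

def joint(e, d, c):
    """P(E=e, D=d, C=c) under the DAG E -> C <- D."""
    p_e = P_E if e else 1 - P_E
    p_d = P_D if d else 1 - P_D
    p_c = P_C[(e, d)] if c else 1 - P_C[(e, d)]
    return p_e * p_d * p_c

def risk_ratio(c_values):
    """P(D=1 | E=1) / P(D=1 | E=0) among subjects whose C is in c_values."""
    def risk(e):
        num = sum(joint(e, 1, c) for c in c_values)
        den = sum(joint(e, d, c) for d in (0, 1) for c in c_values)
        return num / den
    return risk(1) / risk(0)

arr_marginal = risk_ratio((0, 1))   # whole population
arr_given_c1 = risk_ratio((1,))     # restricted to heart disease, C = 1
print(arr_marginal, arr_given_c1)   # 1 19/26
```

With these assumed numbers the marginal risk ratio is exactly 1, whereas the risk ratio among subjects with heart disease is 19/26 (about 0.73), an inverse association produced solely by conditioning on the common effect. Exact fractions are used rather than simulation so that no chance association enters the comparison.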
As another example, the DAG in Figure 4 adds to the DAG in Figure 3 a diuretic medication M whose use is a consequence of a diagnosis of heart disease. E and D are also associated within levels of M because M is a common effect of E and D.

FIGURE 4. Conditioning on a common effect M of exposure E and outcome D.

There is another possible source of association between 2 variables that we have not discussed yet. As a result of sampling variability, 2 variables could be associated by chance even in the absence of structures (1), (2), or (3). Chance is not a structural source of association because chance associations become smaller with increased sample size. In contrast, structural associations remain unchanged. To focus our discussion on structural rather than chance associations, we assume we have recorded data in every subject in a very large (perhaps hypothetical) population of interest. We also assume that all variables are perfectly measured.

A CLASSIFICATION OF BIASES ACCORDING TO THEIR STRUCTURE

We will say that bias is present when the association between exposure and outcome is not in its entirety the result of the causal effect of exposure on outcome, or more precisely when the causal risk ratio (CRR_ED), defined in Appendix A.1, differs from the associational risk ratio (ARR_ED). In an ideal randomized trial (ie, no confounding, full adherence to treatment, perfect blinding, no losses to follow up) such as the one represented in Figure 2, there is no bias and the association measure equals the causal effect measure.

Because nonchance associations are generated by structures (1), (2), and (3), it follows that biases could be classified on the basis of these structures:

Cause and effect could create bias as a result of reverse causation. For example, in many case-control studies, the outcome precedes the exposure measurement. Thus, the association of the outcome with measured exposure could in part reflect bias attributable to the outcome's effect on measured exposure [7,8]. Examples of reverse causation bias include not only recall bias in case-control studies, but also more general forms of information bias like, for example, when a blood parameter affected by the presence of cancer is measured after the cancer is present.
Common causes: In general, when the exposure and outcome share a common cause, the association measure differs from the effect measure. Epidemiologists tend to use the term confounding to refer to this bias.

Conditioning on common effects: We propose that this structure is the source of those biases that epidemiologists refer to as selection bias. We argue by way of example.

EXAMPLES OF SELECTION BIAS

Inappropriate Selection of Controls in a Case-Control Study

Figure 5 represents a case-control study of the effect of postmenopausal estrogens (E) on the risk of myocardial infarction (D). The variable C indicates whether a woman in the population cohort is selected for the case-control study (yes = 1, no = 0). The arrow from disease status D to selection C indicates that cases in the cohort are more likely to be selected than noncases, which is the defining feature of a case-control study. In this particular case-control study, investigators selected controls preferentially among women with a hip fracture (F), which is represented by an arrow from F to C. There is an arrow from E to F to represent the protective effect of estrogens on hip fracture. Note that Figure 5 is essentially the same as Figure 3, except that we have now elaborated the causal pathway from E to C.

FIGURE 5. Selection bias in a case-control study. See text for details.

In a case-control study, the associational exposure–disease odds ratio (AOR_ED|C=1) is by definition conditional on having been selected into the study (C = 1). If subjects with hip fracture F are oversampled as controls, then the probability of control selection depends on a consequence F of the exposure (as represented by the path from E to C through F) and “inappropriate control selection” bias will occur (eg, AOR_ED|C=1 will differ from 1.0, even when, as in Figure 5, the exposure has no effect on the disease). This bias arises because we are conditioning on a common effect C of exposure and disease. A heuristic explanation of this bias follows. Among subjects selected for the study, controls are more likely than cases to have had a hip fracture. Therefore, because estrogens lower the incidence of hip fractures, a control is less likely to be on estrogens than a case, and hence AOR_ED|C=1 is greater than 1.0, even though the exposure does not cause the outcome. Identical reasoning would explain that the expected AOR_ED|C=1 would be greater than the causal OR_ED even had the causal OR_ED differed from 1.0.

Berkson's Bias

Berkson [15] pointed out that 2 diseases (E and D) that are unassociated in the population could be associated among hospitalized patients when both diseases affect the probability of hospital admission. By taking C in Figure 3 to be the indicator variable for hospitalization, we recognize that Berkson's bias comes from conditioning on the common effect C of diseases E and D. As a consequence, in a case-control study in which the cases were hospitalized patients with disease D and controls were hospitalized patients with disease E, an exposure R that causes disease E would appear to be a risk factor for disease D (ie, Fig. 3 is modified by adding factor R and an arrow from R to E). That is, AOR_RD|C=1 would differ from 1.0 even if R does not cause D.
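Returning to the hip-fracture control-selection example of Figure 5, the direction of the bias can be verified numerically. The following is a minimal sketch under assumed, purely hypothetical probabilities; the names `P_E1`, `P_F1_GIVEN_E`, `P_C1_GIVEN_DF`, and `selected_exposure_odds` are introduced here for illustration. Estrogens (E) lower fracture risk (F), E has no effect on disease (D), all cases are selected, and controls are oversampled among women with a fracture.

```python
from fractions import Fraction as F

# All probabilities are hypothetical, chosen only for illustration.
P_E1 = F(1, 2)                               # P(E = 1): estrogen use
P_F1_GIVEN_E = {0: F(3, 10), 1: F(1, 10)}    # P(F = 1 | E): estrogens protect against fracture
# P(C = 1 | D, F), keyed by (d, f): every case is selected;
# controls are selected preferentially among women with a hip fracture.
P_C1_GIVEN_DF = {(1, 0): F(1), (1, 1): F(1), (0, 0): F(1, 10), (0, 1): F(1, 2)}

def selected_exposure_odds(d):
    """Odds of E=1 versus E=0 among selected subjects (C=1) with disease status d.

    D is independent of E and F in this model, so P(D=d) cancels and is omitted.
    """
    mass = {}
    for e in (0, 1):
        p_e = P_E1 if e else 1 - P_E1
        mass[e] = sum(
            p_e * (P_F1_GIVEN_E[e] if f else 1 - P_F1_GIVEN_E[e]) * P_C1_GIVEN_DF[(d, f)]
            for f in (0, 1)
        )
    return mass[1] / mass[0]

# Exposure odds ratio among the selected: cases versus controls.
aor_selected = selected_exposure_odds(1) / selected_exposure_odds(0)
print(aor_selected)  # 11/7
```

Under these assumptions the odds ratio among selected subjects is 11/7 (about 1.57), greater than 1.0 even though E has no effect on D in the model, which matches the direction argued in the text.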
Differential Loss to Follow Up in Longitudinal Studies

Figure 6a represents a follow-up study of the effect of antiretroviral therapy (E) on AIDS (D) risk among HIV-infected patients. The greater the true level of immunosuppression (U), the greater the risk of AIDS. U is unmeasured. If a patient drops out from the study, his AIDS status cannot be assessed and we say that he is censored (C = 1). Patients with greater values of U are more likely to be lost to follow up because the severity of their disease prevents them from attending future study visits. The effect of U on censoring is mediated by presence of symptoms (fever, weight loss, diarrhea, and so on), CD4 count, and viral load in plasma, all summarized in the (vector) variable L, which may or may not be measured. The role of L, when measured, in data analysis is discussed in the next section; in this section, we take L to be unmeasured. Patients receiving treatment are at a greater risk of experiencing side effects, which could lead them to drop out, as represented by the arrow from E to C. For simplicity, assume that treatment E does not cause D and so there is no arrow from E to D (CRR_ED = 1.0). The square around C indicates that the analysis is restricted to those patients who did not drop out (C = 0). The associational risk (or rate) ratio ARR_ED|C=0 differs from 1.0. This “differential loss to follow-up” bias is an example of bias resulting from structure (3) because it arises from conditioning on the censoring variable C, which is a common effect of exposure E and a cause U of the outcome.

FIGURE 6. Selection bias in a cohort study. See text for details.

An intuitive explanation of the bias follows. If a treated subject with treatment-induced side effects (and thereby at a greater risk of dropping out) did in fact not drop out (C = 0), then it is generally less likely that a second cause of dropping out (eg, a large value of U) was present. Therefore, an inverse association between E and U would be expected. However, U is positively associated with the outcome D. Therefore, restricting the analysis to subjects who did not drop out of this study induces an inverse association (mediated by U) between exposure and outcome, ie, ARR_ED|C=0 is not equal to 1.0.
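The intuitive explanation for Figure 6a can be made concrete with an exact calculation. This sketch assumes a binary U and hypothetical probabilities that do not come from the article; `P_U1`, `P_C1_GIVEN_EU`, `P_D1_GIVEN_U`, and `risk` are names introduced here. Treatment E and immunosuppression U both raise the probability of censoring C, only U affects the outcome D, and E is independent of U (as if randomized).

```python
from fractions import Fraction as F

# Hypothetical parameters for a structure like Figure 6a: E -> C <- U -> D, no E -> D arrow.
P_U1 = F(1, 2)                                   # P(U = 1), severe immunosuppression
P_C1_GIVEN_EU = {                                # P(C = 1 | E, U), keyed by (e, u)
    (0, 0): F(1, 10), (0, 1): F(2, 5),
    (1, 0): F(3, 10), (1, 1): F(3, 5),
}
P_D1_GIVEN_U = {0: F(1, 10), 1: F(1, 2)}         # P(D = 1 | U); D does not depend on E

def risk(e, restrict_to_uncensored):
    """P(D=1 | E=e), optionally restricted to subjects with C = 0."""
    num = den = F(0)
    for u in (0, 1):
        p_u = P_U1 if u else 1 - P_U1
        keep = 1 - P_C1_GIVEN_EU[(e, u)] if restrict_to_uncensored else F(1)
        num += p_u * keep * P_D1_GIVEN_U[u]
        den += p_u * keep
    return num / den

arr_full = risk(1, False) / risk(0, False)        # nobody censored
arr_uncensored = risk(1, True) / risk(0, True)    # analysis restricted to C = 0
print(arr_full, arr_uncensored)                   # 1 135/143
```

With these assumed values the risk ratio is exactly 1 in the full cohort and 135/143 (about 0.94) among the uncensored, an inverse association of the kind described above. The size and even the direction of the induced association depend on the assumed censoring probabilities, so the numbers should be read as one illustration, not as a general result.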
Figure 6a is a simple transformation of Figure 3 that also represents bias resulting from structure (3): the association between D and C resulting from a direct effect of D on C in Figure 3 is now the result of U, a common cause of D and C. We now present 3 additional structures (Figs. 6b–d), which could lead to selection bias by differential loss to follow up.

Figure 6b is a variation of Figure 6a. If prior treatment has a direct effect on symptoms, then restricting the study to the uncensored individuals again implies conditioning on the common effect C of the exposure and U, thereby introducing a spurious association between treatment and outcome. Figures 6a and 6b could depict either an observational study or an experiment in which treatment E is randomly assigned, because there are no common causes of E and any other variable. Thus, our results demonstrate that randomized trials are not free of selection bias as a result of differential loss to follow up because such selection occurs after the randomization.

Figures 6c and d are variations of Figures 6a and b, respectively, in which there is a common cause U* of E and another measured variable. U* indicates unmeasured lifestyle/personality/educational variables that determine both treatment (through the arrow from U* to E) and either attitudes toward attending study visits (through the arrow from U* to C in Fig. 6c) or threshold for reporting symptoms (through the arrow from U* to L in Fig. 6d). Again, these 2 are examples of bias resulting from structure (3) because the bias arises from conditioning on the common effect C of both a cause U* of E and a cause U of D. This particular bias has been referred to as M bias [12]. The bias caused by differential loss to follow up in Figures 6a–d is also referred to as bias due to informative censoring.

Nonresponse Bias/Missing Data Bias

The variable C in Figures 6a–d can represent missing data on the outcome for any reason, not just as a result of loss to follow up. For example, subjects could have missing data because they are reluctant to provide information or because they miss study visits. Regardless of the reasons why data on D are missing, standard analyses restricted to subjects with complete data (C = 0) will be biased.
Volunteer Bias/Self-selection Bias

Figures 6a–d can also represent a study in which C is agreement to participate (yes = 1, no = 0), E is cigarette smoking, D is coronary heart disease, U is family history of heart disease, and U* is healthy lifestyle. (L is any mediator between U and C such as heart disease awareness.) Under any of these structures, there would be no bias if the study population was a representative (ie, random) sample of the target population. However, bias will be present if the study is restricted to those who volunteered or elected to participate (C = 1). Volunteer bias cannot occur in a randomized study in which subjects are randomized (ie, exposed) only after agreeing to participate, because none of Figures 6a–d can represent such a trial. Figures 6a and b are eliminated because exposure cannot cause C. Figures 6c and d are eliminated because, as a result of the random exposure assignment, there cannot exist a common cause of exposure and any other variable.

Healthy Worker Bias

Figures 6a–d can also describe a bias that could arise when estimating the effect of a chemical E (an occupational exposure) on mortality D in a cohort of factory workers. The underlying unmeasured true health status U is a determinant of both death (D) and of being at work (C). The study is restricted to individuals who are at work (C = 1) at the time of outcome ascertainment. (L could be the result of blood tests and a physical examination.) Being exposed to the chemical is a predictor of being at work in the near future, either directly (eg, exposure can cause disabling asthma), as in Figures 6a and b, or through a common cause U* (eg, certain exposed jobs are eliminated for economic reasons and the workers laid off), as in Figures 6c and d.
This “healthy worker” bias is an example of bias resulting from structure (3) because it arises from conditioning on the censoring variable C, which is a common effect of (a cause of) exposure and (a cause of) the outcome. However, the term “healthy worker” bias is also used to describe the bias that occurs when comparing the risk in a certain group of workers with that in a group of subjects from the general population. This second bias can be depicted by the DAG in Figure 1 in which L represents health status, E represents membership in the group of workers, and D represents the outcome of interest. There are arrows from L to E and D because being healthy affects job type and risk of subsequent outcome, respectively. In this case, the bias is caused by structure (2), a common cause of exposure and outcome, and would therefore generally be considered to be the result of confounding.

These examples lead us to propose that the term selection bias in causal inference settings be used to refer to any bias that arises from conditioning on a common effect as in Figure 3 or its variations (Figs. 4–6).

In addition to the examples given here, DAGs have been used to characterize various other selection biases. For example, Robins [7] explained how certain attempts to eliminate ascertainment bias in studies of estrogens and endometrial cancer could themselves induce bias [16]; Hernán et al [8] discussed incidence–prevalence bias in case-control studies of birth defects; and Cole and Hernán [9] discussed the bias that could be introduced by standard methods to estimate direct effects [17,18]. In Appendix A.2, we provide a final example: the bias that results from the use of the hazard ratio as an effect measure. We deferred this example to the appendix because of its greater technical complexity. (Note that standard DAGs do not represent “effect modification” or “interactions” between variables, but this does not affect their ability to represent the causal structures that produce bias, as more fully explained in Appendix A.3.)

To demonstrate the generality of our approach to selection bias, we now show that a bias that arises in longitudinal studies with time-varying exposures [19] can also be understood as a form of selection bias.
Adjustment for Variables Affected by Previous Exposure (or its causes)

Consider a follow-up study of the effect of antiretroviral therapy (E) on viral load at the end of follow up (D = 1 if detectable, D = 0 otherwise) in HIV-infected subjects. The greater a subject's unmeasured true immunosuppression level (U), the greater her viral load D and the lower the CD4 count L (low = 1, high = 0). Treatment increases CD4 count, and the presence of low CD4 count (a proxy for the true level of immunosuppression) increases the probability of receiving treatment. We assume that, in truth but unknown to the data analyst, treatment has no causal effect on the outcome D. The DAGs in Figures 7a and b represent the first 2 time points of the study. At time 1, treatment E1 is decided after observing the subject's risk factor profile L1. (E0 could be decided after observing L0, but the inclusion of L0 in the DAG would not essentially alter our main point.) Let E be the sum of E0 and E1. The cumulative exposure variable E can therefore take 3 values: 0 (if the subject is not treated at any time), 1 (if the subject is treated at exactly one of the 2 time points), and 2 (if the subject is treated at both times). Suppose the analyst's interest lies in comparing the risk had all subjects been always treated (E = 2) with that had all subjects never been treated (E = 0), and that the causal risk ratio is 1.0 (CRR_ED = 1, when comparing E = 2 vs. E = 0).

FIGURE 7. Adjustment for a variable affected by previous exposure.

To estimate the effect of E without bias, the analyst needs to be able to estimate the effect of each of its components E0 and E1 simultaneously and without bias [17]. As we will see, this is not possible using standard methods, even when data on L1 are available, because lack of adjustment for L1 precludes unbiased estimation of the causal effect of E1 whereas adjustment for L1 by stratification (or, equivalently, by conditioning, matching, or regression adjustment) precludes unbiased estimation of the causal effect of E0.
Unlike previous structures, Figures 7a and 7b contain a common cause of the (component E1 of) exposure E and the outcome D, so one needs to adjust for L1 to eliminate confounding. The standard approach to confounder control is stratification: the associational risk ratio is computed in each level of the variable L1. The square around the node L1 denotes that the associational risk ratios (ARR_ED|L=0 and ARR_ED|L=1) are conditional on L1. Examples of stratification-based methods are a Mantel-Haenszel stratified analysis or regression models (linear, logistic, Poisson, Cox, and so on) that include the covariate L1. (Not including interaction terms between L1 and the exposure in a regression model is equivalent to assuming homogeneity of ARR_ED|L=0 and ARR_ED|L=1.) To calculate ARR_ED|L=l, the data analyst has to select (ie, condition on) the subset of the population with value L1 = l. However, in this example, the process of choosing this subset results in selection on a variable L1 affected by (a component E0 of) exposure E and thus can result in bias as we now describe.

Although stratification is commonly used to adjust for confounding, it can have unintended effects when the association measure is computed within levels of L1 and in addition L1 is caused by or shares causes with a component E0 of E. Among those with low CD4 count (L1 = 1), being on treatment (E0 = 1) makes it more likely that the person is severely immunodepressed; among those with a high level of CD4 (L1 = 0), being off treatment (E0 = 0) makes it more likely that the person is not severely immunodepressed. Thus, the side effect of stratification is to induce an association between prior exposure E0 and U, and therefore between E0 and the outcome D. Stratification eliminates confounding for E1 at the cost of introducing selection bias for E0. The net bias for any particular summary of the time-varying exposure that is used in the analysis (cumulative exposure, average exposure, and so on) depends on the relative magnitude of the confounding that is eliminated and the selection bias that is created. In summary, the associational (conditional) risk ratio ARR_ED|L1 could be different from 1.0 even if the exposure history has no effect on the outcome of any subject.
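The association that stratification on L1 induces between E0 and D can be illustrated with an exact calculation. The sketch below is a simplification under stated assumptions: U is binary, E0 is independent of U, E1 is left out because only the E0–D association within strata of L1 is examined, and every probability is hypothetical. The names `P_U1`, `P_L1_LOW`, `P_D1_GIVEN_U`, and `risk` are introduced here.

```python
from fractions import Fraction as F

# Hypothetical parameters; E0 -> L1 <- U -> D, with no effect of E0 on D.
P_U1 = F(1, 2)                        # P(U = 1), severe immunosuppression
P_L1_LOW = {                          # P(L1 = 1 | E0, U), low CD4, keyed by (e0, u)
    (0, 0): F(2, 5), (0, 1): F(4, 5), # untreated: low CD4 more likely
    (1, 0): F(1, 5), (1, 1): F(3, 5), # treated: treatment raises CD4
}
P_D1_GIVEN_U = {0: F(1, 5), 1: F(3, 5)}   # P(D = 1 | U), detectable viral load

def risk(e0, l1=None):
    """P(D=1 | E0=e0), or P(D=1 | E0=e0, L1=l1) when a stratum l1 is given."""
    num = den = F(0)
    for u in (0, 1):
        p_u = P_U1 if u else 1 - P_U1
        if l1 is None:
            p_l = F(1)
        else:
            p_low = P_L1_LOW[(e0, u)]
            p_l = p_low if l1 else 1 - p_low
        num += p_u * p_l * P_D1_GIVEN_U[u]
        den += p_u * p_l
    return num / den

arr_crude = risk(1) / risk(0)
arr_by_l1 = {l1: risk(1, l1) / risk(0, l1) for l1 in (0, 1)}
print(arr_crude, arr_by_l1[0], arr_by_l1[1])   # 1 10/9 15/14
```

Under these assumptions the crude risk ratio for E0 is exactly 1, but within both strata of L1 it exceeds 1 (10/9 for high CD4, 15/14 for low CD4), because within a stratum treated subjects are more likely to be severely immunodepressed than untreated ones.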
Conditioning on confounders L1 that are affected by previous exposure can create selection bias even if the confounder is not on a causal pathway between exposure and outcome. In fact, no such causal pathway exists in Figures 7a and 7b. On the other hand, in Figure 7c the confounder L1 for subsequent exposure E1 lies on a causal pathway from earlier exposure E0 to an outcome D. Nonetheless, conditioning on L1 still results in selection bias. Were the potential for selection bias not present in Figure 7c (eg, were U not a common cause of L1 and D), the association of cumulative exposure E with the outcome D within strata of L1 could be an unbiased estimate of the direct effect [18] of E not through L1 but still would not be an unbiased estimate of the overall effect of E on D, because the effect of E0 mediated through L1 is not included.

ADJUSTING FOR SELECTION BIAS

Selection bias can sometimes be avoided by an adequate design, such as by sampling controls in a manner that ensures they represent the exposure distribution in the population. Other times, selection bias can be avoided by appropriately adjusting for confounding by using alternatives to stratification-based methods (see below) in the presence of time-dependent confounders affected by previous exposure.

However, appropriate design and confounding adjustment cannot immunize studies against selection bias. For example, loss to follow up, self-selection, and, in general, missing data leading to bias can occur no matter how careful the investigator. In those cases, the selection bias needs to be explicitly corrected in the analysis, when possible.
Selection bias correction, as we briefly describe, could sometimes be accomplished by a generalization of inverse probability weighting [20–23] estimators for longitudinal studies. Consider again Figures 6a–d and assume that L is measured. Inverse probability weighting is based on assigning a weight to each selected subject so that she accounts in the analysis not only for herself, but also for those with similar characteristics (ie, those with the same values of L and E) who were not selected. The weight is the inverse of the probability of her selection. For example, if there are 4 untreated women, age 40–45 years, with CD4 count >500, in our cohort study, and 3 of them are lost to follow up, then these 3 subjects do not contribute to the analysis (ie, they receive a zero weight), whereas the remaining woman receives a weight of 4. In other words, the (estimated) conditional probability of remaining uncensored is 1/4 = 0.25, and therefore the (estimated) weight for the uncensored subject is 1/0.25 = 4. Inverse probability weighting creates a pseudopopulation in which the 4 subjects of the original population are replaced by 4 copies of the uncensored subject.

The effect measure based on the pseudopopulation, in contrast to that based on the original population, is unaffected by selection bias provided that the outcome in the uncensored subjects truly represents the unobserved outcomes of the censored subjects (with the same values of E and L). This provision will be satisfied if the probability of selection (the denominator of the weight) is calculated conditional on E and on all additional factors that independently predict both selection and the outcome. Unfortunately, one can never be sure that these additional factors were identified and recorded in L, and thus the causal interpretation of the resulting adjustment for selection bias depends on this untestable assumption.
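The weight arithmetic in the 4-women example can be written out as a short sketch. The function name `censoring_weights` and the record layout (a stratum label paired with a censoring flag) are assumptions made here for illustration; the probability of remaining uncensored is estimated nonparametrically as the uncensored fraction within each stratum of E and L.

```python
from collections import Counter
from fractions import Fraction

def censoring_weights(records):
    """Return one weight per record: 0 if censored, else 1 / P(uncensored | stratum).

    records: list of (stratum, censored) pairs, where stratum identifies the
    subject's values of E and L and censored is True if lost to follow up.
    """
    total = Counter(stratum for stratum, _ in records)
    uncensored = Counter(stratum for stratum, censored in records if not censored)
    weights = []
    for stratum, censored in records:
        if censored:
            weights.append(Fraction(0))
        else:
            p_uncensored = Fraction(uncensored[stratum], total[stratum])
            weights.append(1 / p_uncensored)
    return weights

# The example from the text: 4 untreated women aged 40-45 with CD4 count > 500,
# 3 of whom are lost to follow up.
stratum = ("untreated", "age 40-45", "CD4 > 500")
records = [(stratum, True), (stratum, True), (stratum, True), (stratum, False)]
weights = censoring_weights(records)
print(weights)       # [Fraction(0, 1), Fraction(0, 1), Fraction(0, 1), Fraction(4, 1)]
print(sum(weights))  # 4
```

The weights sum to 4, the size of the original stratum, which is the sense in which the pseudopopulation replaces the 4 original subjects with 4 copies of the uncensored one. In practice the selection probabilities are usually estimated with a model rather than by stratum counts; that step is outside this sketch.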
One might attempt to remove selection bias by stratification (ie, by estimating the effect measure conditional on the L variables) rather than by weighting. Stratification could yield unbiased conditional effect measures within levels of L under the assumptions that all relevant L variables were measured and that the exposure does not cause or share a common cause with any variable in L. Thus, stratification would work (ie, it would provide an unbiased conditional effect measure) under the causal structures depicted in Figures 6a and c, but not under those in Figures 6b and d. Inverse probability weighting appropriately adjusts for selection bias under all these situations because this approach is not based on estimating effect measures conditional on the covariates L, but rather on estimating unconditional effect measures after reweighting the subjects according to their exposure and their values of L.

Inverse probability weighting can also be used to adjust for the confounding of later exposure E1 by L1, even when exposure E0 either causes L1 or shares a common cause with L1 (Figs. 7a–7c), a situation in which stratification fails. When using inverse probability weighting to adjust for confounding, we model the probability of exposure or treatment given past exposure and past L so that the denominator of a subject's weight is, informally, the subject's conditional probability of receiving her treatment history. We therefore refer to this method as inverse-probability-of-treatment weighting [22].

One limitation of inverse probability weighting is that all conditional probabilities (of receiving certain treatment or censoring history) must be different from zero. This would not be true, for example, in occupational studies in which the probability of being exposed to a chemical is zero for those not working. In these cases, g-estimation [19] rather than inverse probability weighting can often be used to adjust for selection bias and confounding.
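As an illustration of how reweighting removes the bias under a structure like Figure 6a with L measured, the sketch below compares the unweighted and the inverse-probability-weighted risk ratios among the uncensored. It is a hedged example under assumptions of our own: U and L are binary, censoring depends only on E and L, E is independent of U, every probability is hypothetical, and the weights use the true censoring probabilities rather than estimates. The names `bern` and `risk_uncensored` are introduced here.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical parameters for E -> C <- L <- U -> D, with no effect of E on D.
P_U1 = F(1, 2)                                   # P(U = 1)
P_L1_GIVEN_U = {0: F(1, 5), 1: F(4, 5)}          # P(L = 1 | U), symptoms
P_C1_GIVEN_EL = {                                # P(C = 1 | E, L), keyed by (e, l)
    (0, 0): F(1, 10), (0, 1): F(2, 5),
    (1, 0): F(3, 10), (1, 1): F(3, 5),
}
P_D1_GIVEN_U = {0: F(1, 10), 1: F(1, 2)}         # P(D = 1 | U)

def bern(p, x):
    """Probability that a Bernoulli(p) variable equals x."""
    return p if x else 1 - p

def risk_uncensored(e, weighted):
    """Risk of D among uncensored subjects with E=e, optionally weighted by 1/P(C=0 | E, L)."""
    num = den = F(0)
    for u, l in product((0, 1), repeat=2):
        p_uncensored = 1 - P_C1_GIVEN_EL[(e, l)]   # nonzero in every stratum (positivity)
        mass = bern(P_U1, u) * bern(P_L1_GIVEN_U[u], l) * p_uncensored
        w = 1 / p_uncensored if weighted else F(1)
        num += w * mass * P_D1_GIVEN_U[u]
        den += w * mass
    return num / den

arr_unweighted = risk_uncensored(1, False) / risk_uncensored(0, False)
arr_ipw = risk_uncensored(1, True) / risk_uncensored(0, True)
print(arr_unweighted, arr_ipw)   # 245/253 1
```

With these assumed numbers the unweighted risk ratio among the uncensored is 245/253 (about 0.97), while the weighted risk ratio is exactly 1, the value that would have been observed had nobody been censored. If any stratum had a zero probability of remaining uncensored, the weight would be undefined, which is the limitation noted in the text.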
The use of inverse probability weighting can provide unbiased estimates of causal effects even in the presence of selection bias because the method works by creating a pseudopopulation in which censoring (or missing data) has been abolished and in which the effect of the exposure is the same as in the original population. Thus, the pseudopopulation effect measure is equal to the effect measure had nobody been censored. For example, Figure 8 represents the pseudopopulation corresponding to the population of Figure 6a when the weights were estimated conditional on L and E. The censoring node is now lower-case because it does not correspond to a random variable but to a constant (everybody is uncensored in the pseudopopulation). This interpretation is desirable when censoring is the result of loss to follow up or nonresponse, but questionably helpful when censoring is the result of competing risks. For example, in a study aimed at estimating the effect of a certain exposure on the risk of Alzheimer's disease, we might not wish to base our effect estimates on a pseudopopulation in which all other causes of death (cancer, heart disease, stroke, and so on) have been removed, because it is unclear even conceptually what sort of medical intervention would produce such a population. Another more pragmatic reason is that no feasible intervention could possibly remove just one cause of death without affecting the others as well [24].

FIGURE 8. Causal diagram in the pseudopopulation created by inverse-probability weighting.

DISCUSSION

The terms “confounding” and “selection bias” are used in multiple ways. For instance, the same phenomenon is sometimes named “confounding by indication” by epidemiologists and “selection bias” by statisticians/econometricians. Others use the term “selection bias” when “confounders” are unmeasured. Sometimes the distinction between confounding and selection bias is blurred in the term “selection confounding.”
We elected to refer to the presence of common causes as “confounding” and to refer to conditioning on common effects as “selection bias.” This structural definition provides a clearcut classification of confounding and selection bias, even though it might not coincide perfectly with the traditional, often discipline-specific, terminologies. Our goal, however, was not to be normative about terminology, but rather to emphasize that, regardless of the particular terms chosen, there are 2 distinct causal structures that lead to these biases. The magnitude of both biases depends on the strength of the causal arrows involved [12,25]. (When 2 or more common effects have been conditioned on, an even more general formulation of selection bias is useful. For a brief discussion, see Appendix A.4.)

The end result of both structures is the same: noncomparability (also referred to as lack of exchangeability) between the exposed and the unexposed. For example, consider a cohort study restricted to firefighters that aims to estimate the effect of being physically active (E) on the risk of heart disease (D) (as represented in Fig. 9). For simplicity, we have assumed that, although unknown to the data analyst, E does not cause D. Parental socioeconomic status (L) affects the risk of becoming a firefighter (C) and, through childhood diet, of heart disease (D). Attraction toward activities that involve physical activity (an unmeasured variable U) affects the risk of becoming a firefighter and of being physically active (E). U does not affect D, and L does not affect E. According to our terminology, there is no confounding because there are no common causes of E and D. Thus, if our study population had been a random sample of the target population, the crude associational risk ratio ARR_ED would have been equal to the causal risk ratio CRR_ED of 1.0.

FIGURE 9. The firefighters’ study.

However, in a study restricted to firefighters, the crude ARR_ED and CRR_ED would differ because conditioning on a common effect C of causes of exposure and outcome induces selection bias resulting in noncomparability of the exposed and unexposed firefighters. To the study investigators, the distinction between confounding and selection bias is moot because, regardless of nomenclature, they must stratify on L to make the exposed and the unexposed firefighters comparable. This example demonstrates that a structural classification of bias does not always have consequences for either the analysis or interpretation of a study. Indeed, for this reason, many epidemiologists use the term “confounder” for any variable L on which one has to stratify to create comparability, regardless of whether the (crude) noncomparability was the result of conditioning on a common effect or the result of a common cause of exposure and disease.

There are, however, advantages of adopting a structural or causal approach to the classification of biases. First, the structure of the problem frequently guides the choice of analytical methods to reduce or avoid the bias. For example, in longitudinal studies with time-dependent confounding, identifying the structure allows us to detect situations in which stratification-based methods would adjust for confounding at the expense of introducing selection bias. In those cases, inverse probability weighting or g-estimation are better alternatives. Second, even when understanding the structure of bias does not have implications for data analysis (as in the firefighters’ study), it could still help study design. For example, investigators running a study restricted to firefighters should make sure that they collect information on joint risk factors for the outcome and for becoming a firefighter. Third, selection bias resulting from conditioning on preexposure variables (eg, being a firefighter) could explain why certain variables behave as “confounders” in some studies but not others. In our example, parental socioeconomic status would not necessarily need to be adjusted for in studies not restricted to firefighters. Finally, causal diagrams enhance communication among investigators because they can be used to provide a rigorous, formal definition of terms such as “selection bias.”

ACKNOWLEDGMENTS

We thank Stephen Cole and Sander Greenland for their helpful comments.

REFERENCES

1. Rothman KJ, Greenland S. Modern Epidemiology, 2nd ed. Philadelphia: Lippincott-Raven; 1998.
2. Szklo M, Nieto FJ. Epidemiology. Beyond the Basics. Gaithersburg, MD: Aspen; 2000.
3. MacMahon B, Trichopoulos D. Epidemiology. Principles & Methods, 2nd ed. Boston: Little, Brown and Co; 1996.
4. Hennekens CH, Buring JE. Epidemiology in Medicine. Boston: Little, Brown and Co; 1987.
5. Gordis L. Epidemiology. Philadelphia: WB Saunders Co; 1996.
6. Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology. 1999;10:37–48.
7. Robins JM. Data, design, and background knowledge in etiologic inference. Epidemiology. 2001;12:313–320.
8. Hernán MA, Hernández-Díaz S, Werler MM, et al. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol. 2002;155:176–184.
9. Cole SR, Hernán MA. Fallibility in the estimation of direct effects. Int J Epidemiol. 2002;31:163–165.
10. Maclure M, Schneeweiss S. Causation of bias: the episcope. Epidemiology. 2001;12:114–122.
11. Greenland S, Brumback BA. An overview of relations among causal modeling methods. Int J Epidemiol. 2002;31:1030–1037.
12. Greenland S. Quantifying biases in causal models: classical confounding versus collider-stratification bias. Epidemiology. 2003;14:300–306.
13. Pearl J. Causal diagrams for empirical research. Biometrika. 1995;82:669–710.
14. Spirtes P, Glymour C, Scheines R. Causation, Prediction, and Search. Lecture Notes in Statistics 81. New York: Springer-Verlag; 1993.
15. Berkson J. Limitations of the application of fourfold table analysis to hospital data. Biometrics. 1946;2:47–53.
16. Greenland S, Neutra RR. An analysis of detection bias and proposed corrections in the study of estrogens and endometrial cancer. J Chronic Dis. 1981;34:433–438.
17. Robins JM. A new approach to causal inference in mortality studies with a sustained exposure period: application to the healthy worker survivor effect [published errata appear in Mathematical Modelling. 1987;14:917–921]. Mathematical Modelling. 1986;7:1393–1512.
18. Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology. 1992;3:143–155.
19. Robins JM. Causal inference from complex longitudinal data. In: Berkane M, ed. Latent Variable Modeling and Applications to Causality. Lecture Notes in Statistics 120. New York: Springer-Verlag; 1997:69–117.
20. Horvitz DG, Thompson DJ. A generalization of sampling without replacement from a finite universe. J Am Stat Assoc. 1952;47:663–685.
21. Robins JM, Finkelstein DM. Correcting for noncompliance and dependent censoring in an AIDS clinical trial with inverse probability of censoring weighted (IPCW) log-rank tests. Biometrics. 2000;56:779–788.
22. Hernán MA, Brumback B, Robins JM. Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology. 2000;11:561–570.
23. Robins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. 2000;11:550–560.
24. Greenland S. Causality theory for policy uses of epidemiologic measures. In: Murray CJL, Salomon JA, Mathers CD, et al., eds. Summary Measures of Population Health. Cambridge, MA: Harvard University Press/WHO; 2002.
25. Walker AM. Observation and Inference: An Introduction to the Methods of Epidemiology. Newton Lower Falls: Epidemiology Resources Inc; 1991.
26. Greenland S. Absence of confounding does not correspond to collapsibility of the rate ratio or rate difference. Epidemiology. 1996;7:498–501.

APPENDIX

A.1. Causal and Associational Risk Ratio

For a given subject, E has a causal effect on D if the subject's value of D had she been exposed differs from the value of D had she remained unexposed. Formally, letting $D_{i,e=1}$ and $D_{i,e=0}$ be subject i's (counterfactual or potential) outcomes when exposed and unexposed, respectively, we say there is a causal effect for subject i if $D_{i,e=1} \neq D_{i,e=0}$. Only one of the counterfactual outcomes can be observed for each subject (the one corresponding to her observed exposure), ie, $D_{i,e} = D_i$ if $E_i = e$, where $D_i$ and $E_i$ represent subject i's observed outcome and exposure. For a population, we say that there is no average causal effect (preventive or causative) of E on D if the average of D would remain unchanged whether the whole population had been treated or untreated, ie, when $\Pr(D_{e=1}=1) = \Pr(D_{e=0}=1)$ for a dichotomous D. Equivalently, we say that E does not have a causal effect on D if the causal risk ratio is one, ie, $CRR_{ED} = \Pr(D_{e=1}=1)/\Pr(D_{e=0}=1) = 1.0$. For an extension of counterfactual theory and methods to complex longitudinal data, see reference 19.

In a DAG, $CRR_{ED} = 1.0$ is represented by the lack of a directed path of arrows originating from E and ending on D as, for example, in Figure 5. We shall refer to a directed path of arrows as a causal path. On the other hand, in Figure 5, $CRR_{EC} \neq 1.0$ because there is a causal path from E to C through F. The lack of a direct arrow from E to C implies that E does not have a direct effect on C (relative to the other variables on the DAG), ie, the effect is wholly mediated through other variables on the DAG (ie, F).

For a population, we say that there is no association between E and D if the average of D is the same in the subset of the population that was exposed as in the subset that was unexposed, ie, when $\Pr(D=1 \mid E=1) = \Pr(D=1 \mid E=0)$ for a dichotomous D. Equivalently, we say that E and D are unassociated if the associational risk ratio is 1.0, ie, $ARR_{ED} = \Pr(D=1 \mid E=1)/\Pr(D=1 \mid E=0) = 1.0$. The associational risk ratio can always be estimated from observational data. We say that there is bias when the causal risk ratio in the population differs from the associational risk ratio, ie, $CRR_{ED} \neq ARR_{ED}$.
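
The two risk ratios defined in A.1 can be contrasted with a small exact calculation. The sketch below is only an illustration under stated assumptions: it posits a hypothetical binary common cause L of exposure and outcome, gives E no effect on D at all, and uses made-up probabilities. None of the numbers or function names come from the article.

```python
from fractions import Fraction

# Hypothetical parameters, chosen only for illustration (not from the article).
P_L1 = Fraction(1, 2)                                    # Pr(L = 1)
P_E1_GIVEN_L = {0: Fraction(1, 5), 1: Fraction(4, 5)}    # Pr(E = 1 | L = l)
# Pr(D_e = 1 | L = l); identical for e = 0 and e = 1, so E has no causal effect.
P_D1_GIVEN_L = {0: Fraction(1, 10), 1: Fraction(3, 10)}


def prob_l(l):
    """Pr(L = l) for the binary common cause L."""
    return P_L1 if l == 1 else 1 - P_L1


def causal_risk(e):
    """Pr(D_e = 1): risk if the whole population were set to exposure e."""
    # The argument e is unused on purpose: D does not depend on E in this toy model.
    return sum(prob_l(l) * P_D1_GIVEN_L[l] for l in (0, 1))


def associational_risk(e):
    """Pr(D = 1 | E = e): risk among subjects observed with exposure e."""
    joint = Fraction(0)     # accumulates Pr(D = 1, E = e)
    marginal = Fraction(0)  # accumulates Pr(E = e)
    for l in (0, 1):
        p_e = P_E1_GIVEN_L[l] if e == 1 else 1 - P_E1_GIVEN_L[l]
        joint += prob_l(l) * p_e * P_D1_GIVEN_L[l]
        marginal += prob_l(l) * p_e
    return joint / marginal


crr_ed = causal_risk(1) / causal_risk(0)
arr_ed = associational_risk(1) / associational_risk(0)
print("CRR_ED =", crr_ed)
print("ARR_ED =", arr_ed)
```

Under these assumed numbers the causal risk ratio is exactly 1 while the associational risk ratio is not, which is the situation A.1 labels as bias.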

A.2. Hazard Ratios as Effect Measures

The causal DAG in Appendix Figure 1a describes a randomized study of the effect of surgery E on death at times 1 ($D_1$) and 2 ($D_2$). Suppose the effect of exposure on $D_1$ is protective. Then the lack of an arrow from E to $D_2$ indicates that, although the exposure E has a direct protective effect (decreases the risk of death) at time 1, it has no direct effect on death at time 2. That is, the exposure does not influence the survival status at time 2 ($D_2$) of any subject who would survive past time 1 when unexposed (and thus when exposed). Suppose further that U is an unmeasured haplotype that decreases the subject's risk of death at all times. The associational risk ratios $ARR_{ED_1}$ and $ARR_{ED_2}$ are unbiased measures of the effect of E on death at times 1 and 2, respectively. (Because of the absence of confounding, $ARR_{ED_1}$ and $ARR_{ED_2}$ equal the causal risk ratios $CRR_{ED_1}$ and $CRR_{ED_2}$, respectively.) Note that, even though E has no direct effect on $D_2$, $ARR_{ED_2}$ (or, equivalently, $CRR_{ED_2}$) will be less than 1.0 because it is a measure of the effect of E on total mortality through time 2.

Appendix Figure 1. Effect of exposure on survival.

Consider now the time-specific associational hazard (rate) ratio as an effect measure. In discrete time, the hazard of death at time 1 is the probability of dying at time 1, and thus the associational hazard ratio at time 1 is the same as $ARR_{ED_1}$. However, the hazard at time 2 is the probability of dying at time 2 among those who survived past time 1. Thus, the associational hazard ratio at time 2 is $ARR_{ED_2|D_1=0}$. The square around $D_1$ in Appendix Figure 1a indicates this conditioning. Exposed survivors of time 1 are less likely than unexposed survivors of time 1 to have the protective haplotype U (because exposure can explain their survival) and therefore are more likely to die at time 2. That is, conditional on $D_1 = 0$, exposure is associated with a higher mortality at time 2. Thus, the hazard ratio at time 1 is less than 1.0, whereas the hazard ratio at time 2 is greater than 1.0, ie, the hazards have crossed. We conclude that the hazard ratio at time 2 is a biased estimate of the direct effect of exposure on mortality at time 2. The bias is selection bias arising from conditioning on a common effect $D_1$ of exposure and of U, which is itself a cause of $D_2$; this conditioning opens the noncausal (ie, associational) path $E \rightarrow D_1 \leftarrow U \rightarrow D_2$ between E and $D_2$ [13]. In the survival analysis literature, an unmeasured cause of death that is marginally unassociated with exposure, such as U, is often referred to as a frailty.

In contrast, the conditional hazard ratio $ARR_{ED_2|D_1=0,U}$ at time 2 given U is equal to 1.0 within each stratum of U because the path $E \rightarrow D_1 \leftarrow U \rightarrow D_2$ between E and $D_2$ is now blocked by conditioning on the noncollider U. Thus, the conditional hazard ratio correctly indicates the absence of a direct effect of E on $D_2$. The fact that the unconditional hazard ratio $ARR_{ED_2|D_1=0}$ differs from the common stratum-specific hazard ratios of 1.0, even though U is independent of E, shows the noncollapsibility of the hazard ratio [26].

Unfortunately, the unbiased measure $ARR_{ED_2|D_1=0,U}$ of the direct effect of E on $D_2$ cannot be computed because U is unobserved. In the absence of data on U, it is impossible to know whether exposure has a direct effect on $D_2$. That is, the data cannot determine whether the true causal DAG generating the data was that in Appendix Figure 1a or that in Appendix Figure 1b.

A.3. Effect Modification and Common Effects in DAGs

Although an arrow on a causal DAG represents a direct effect, a standard causal DAG does not distinguish a harmful effect from a protective effect. Similarly, a standard DAG does not indicate the presence of effect modification. For example, although Appendix Figure 1a implies that both E and U affect death $D_1$, the DAG does not distinguish among the following 3 qualitatively distinct ways that U could modify the effect of E on $D_1$:

1. The causal effect of exposure E on mortality $D_1$ is in the same direction (ie, harmful or beneficial) in both stratum U = 1 and stratum U = 0.
2. The direction of the causal effect of exposure E on mortality $D_1$ in stratum U = 1 is the opposite of that in stratum U = 0 (ie, there is a qualitative interaction between U and E).
3. Exposure E has a causal effect on $D_1$ in one stratum of U but no causal effect in the other stratum, eg, E only kills subjects with U = 0.

Because standard DAGs do not represent interaction, it is not possible to infer from a DAG the direction of the conditional association between 2 marginally independent causes (E and U) within strata of their common effect $D_1$. For example, suppose that, in the presence of an undiscovered background factor V that is unassociated with E or U, having either E = 1 or U = 1 is sufficient and necessary to cause death (an “or” mechanism), but that neither E nor U causes death in the absence of V. Then among those who died by time 1 ($D_1 = 1$), E and U will be negatively associated: the absence of exposure increases the chance that U was the cause of death, so an unexposed subject (E = 0) is more likely to have had U = 1. (Indeed, the logarithm of the conditional odds ratio $OR_{UE|D_1=1}$ will approach minus infinity as the population prevalence of V approaches 1.0.) Although this “or” mechanism was the only explanation given in the main text for the conditional association of independent causes within strata of a common effect, other possibilities exist. For example, suppose that in the presence of the undiscovered background factor V, having both E = 1 and U = 1 is sufficient and necessary to cause death (an “and” mechanism) and that neither E nor U causes death in the absence of V. Then, among those who die by time 1, those who had been exposed (E = 1) are more likely to have the haplotype (U = 1), ie, E and U are positively correlated. A standard DAG such as that in Appendix Figure 1a fails to distinguish the case of E and U interacting through an “or” mechanism from the case of an “and” mechanism.
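
The opposite signs produced by the “or” and “and” mechanisms can be checked by exact enumeration. The sketch below is a toy model under explicit assumptions: E, U, and V are independent with made-up prevalences, and a small background risk of death from other causes (`P_BACKGROUND`, not part of the article's description) is added so that no cell among the dead is empty and the odds ratio is defined. All names and numbers are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical prevalences; E, U and V are assumed mutually independent.
P_E = Fraction(1, 2)
P_U = Fraction(1, 2)
P_V = Fraction(1, 2)
# Assumption added for illustration: death from unrelated causes.
P_BACKGROUND = Fraction(1, 10)


def bernoulli(p, x):
    """Pr(X = x) for a binary X with Pr(X = 1) = p."""
    return p if x == 1 else 1 - p


def death_prob(e, u, mechanism):
    """Pr(D1 = 1 | E = e, U = u) under the chosen mechanism."""
    triggered = (e or u) if mechanism == "or" else (e and u)
    p_mechanism = P_V if triggered else Fraction(0)  # V must be present
    # Death occurs if either the mechanism or the background cause acts.
    return 1 - (1 - P_BACKGROUND) * (1 - p_mechanism)


def conditional_odds_ratio(mechanism, d1):
    """Odds ratio between E and U within the stratum D1 = d1."""
    cell = {}
    for e, u in product((0, 1), repeat=2):
        p_death = death_prob(e, u, mechanism)
        p_stratum = p_death if d1 == 1 else 1 - p_death
        cell[(e, u)] = bernoulli(P_E, e) * bernoulli(P_U, u) * p_stratum
    return (cell[(1, 1)] * cell[(0, 0)]) / (cell[(1, 0)] * cell[(0, 1)])


or_dead = conditional_odds_ratio("or", d1=1)
and_dead = conditional_odds_ratio("and", d1=1)
print("OR(E,U | D1=1), 'or' mechanism: ", or_dead)
print("OR(E,U | D1=1), 'and' mechanism:", and_dead)
```

With these assumed inputs the odds ratio among the dead falls below 1 for the “or” mechanism and above 1 for the “and” mechanism, even though both mechanisms correspond to the same DAG.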

Although conditioning on common effect $D_1$ always induces a conditional association between independent causes E and U in at least one of the 2 strata of $D_1$ (say, $D_1 = 1$), there is a special situation under which E and U remain conditionally independent within the other stratum (say, $D_1 = 0$). This situation occurs when the data follow a multiplicative survival model, that is, when the probability $\Pr[D_1=0 \mid U=u, E=e]$ of survival (ie, $D_1 = 0$) given E and U is equal to a product $g(u)h(e)$ of functions of u and e. The multiplicative model $\Pr[D_1=0 \mid U=u, E=e] = g(u)h(e)$ is equivalent to the model that assumes the survival ratio $\Pr[D_1=0 \mid U=u, E=e]/\Pr[D_1=0 \mid U=u, E=0]$ does not depend on u and is equal to $h(e)/h(0)$. (Note that if $\Pr[D_1=0 \mid U=u, E=e] = g(u)h(e)$, then $\Pr[D_1=1 \mid U=u, E=e] = 1 - g(u)h(e)$ does not follow a multiplicative mortality model. Hence, when E and U are conditionally independent given $D_1 = 0$, they will be conditionally dependent given $D_1 = 1$.)

Biologically, this multiplicative survival model will hold when E and U affect survival through totally independent mechanisms in such a way that U cannot possibly modify the effect of E on $D_1$, and vice versa. For example, suppose that the surgery E affects survival through the removal of a tumor, whereas the haplotype U affects survival through increasing low-density lipoprotein cholesterol levels, resulting in an increased risk of heart attack (whether or not a tumor is present), and that death by tumor and death by heart attack are independent in the sense that they do not share a common cause. In this scenario, we can consider 2 cause-specific mortality variables: death from tumor $D_{1A}$ and death from heart attack $D_{1B}$. The observed mortality variable $D_1$ is equal to 1 (death) when either $D_{1A}$ or $D_{1B}$ is equal to 1, and $D_1$ is equal to 0 (survival) when both $D_{1A}$ and $D_{1B}$ equal 0. We assume the measured variables are those in Appendix Figure 1a, so data on underlying cause of death are not recorded. Appendix Figure 2 is an expansion of Appendix Figure 1a that represents this scenario (variable $D_2$ is not represented because it is not essential to the current discussion). Because $D_1 = 0$ implies both $D_{1A} = 0$ and $D_{1B} = 0$, conditioning on observed survival ($D_1 = 0$) is equivalent to simultaneously conditioning on $D_{1A} = 0$ and $D_{1B} = 0$ as well. As a consequence, we find by applying d-separation [13] to Appendix Figure 2 that E and U are conditionally independent given $D_1 = 0$, ie, the path between E and U through the conditioned-on collider $D_1$ is blocked by conditioning on the noncolliders $D_{1A}$ and $D_{1B}$ [8]. On the other hand, conditioning on $D_1 = 1$ does not imply conditioning on any specific values of $D_{1A}$ and $D_{1B}$, as the event $D_1 = 1$ is compatible with 3 possible unmeasured events: ($D_{1A} = 1$ and $D_{1B} = 1$), ($D_{1A} = 1$ and $D_{1B} = 0$), and ($D_{1A} = 0$ and $D_{1B} = 1$). Thus, the path between E and U through the conditioned-on collider $D_1$ is not blocked, and E and U are associated given $D_1 = 1$.

Appendix Figure 2. Multiplicative survival model.

What is interesting about Appendix Figure 2 is that by adding the unmeasured variables $D_{1A}$ and $D_{1B}$, which functionally determine the observed variable $D_1$, we have created an annotated DAG that succeeds in representing both the conditional independence between E and U given $D_1 = 0$ and their conditional dependence given $D_1 = 1$. As far as we are aware, this is the first time such a conditional independence structure has been represented on a DAG.

If E and U affect survival through a common mechanism, then there will exist an arrow either from E to $D_{1B}$ or from U to $D_{1A}$, as shown in Appendix Figure 3a. In that case, the multiplicative survival model will not hold, and E and U will be dependent within both strata of $D_1$. Similarly, if the causes $D_{1A}$ and $D_{1B}$ are not independent because of a common cause V, as shown in Appendix Figure 3b, the multiplicative survival model will not hold, and E and U will be dependent within both strata of $D_1$.

Appendix Figure 3. Multiplicative survival model does not hold.

In summary, conditioning on a common effect always induces an association between its causes, but this association could be restricted to certain levels of the common effect.
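
The behavior of the multiplicative survival model can be verified numerically. The following sketch encodes the tumor and heart-attack scenario with hypothetical cause-specific risks: death from tumor depends only on E, death from heart attack depends only on U, and death occurs if either cause acts. An optional `shared_mechanism` flag adds an assumed effect of E on heart-attack death, in the spirit of Appendix Figure 3a. The numbers and names are illustrative assumptions, not values from the article.

```python
from fractions import Fraction
from itertools import product

# Hypothetical cause-specific risks (illustration only).
P_TUMOR_DEATH = {0: Fraction(2, 5), 1: Fraction(1, 5)}    # Pr(D1A = 1 | E = e)
P_HEART_DEATH = {0: Fraction(1, 10), 1: Fraction(3, 10)}  # Pr(D1B = 1 | U = u)
# Assumed extra heart-attack risk caused by E when the mechanisms are shared.
SHARED_EXTRA_RISK = Fraction(1, 10)
P_E = Fraction(1, 2)  # Pr(E = 1), independent of U
P_U = Fraction(1, 2)  # Pr(U = 1)


def survival_prob(e, u, shared_mechanism=False):
    """Pr(D1 = 0 | E = e, U = u): survive both tumor and heart attack."""
    p_heart = P_HEART_DEATH[u]
    if shared_mechanism:
        p_heart += SHARED_EXTRA_RISK * e  # arrow from E to D1B
    return (1 - P_TUMOR_DEATH[e]) * (1 - p_heart)


def conditional_odds_ratio(d1, shared_mechanism=False):
    """Odds ratio between E and U within the stratum D1 = d1."""
    cell = {}
    for e, u in product((0, 1), repeat=2):
        p_survive = survival_prob(e, u, shared_mechanism)
        p_stratum = p_survive if d1 == 0 else 1 - p_survive
        p_e = P_E if e == 1 else 1 - P_E
        p_u = P_U if u == 1 else 1 - P_U
        cell[(e, u)] = p_e * p_u * p_stratum
    return (cell[(1, 1)] * cell[(0, 0)]) / (cell[(1, 0)] * cell[(0, 1)])


or_survivors = conditional_odds_ratio(d1=0)
or_dead = conditional_odds_ratio(d1=1)
or_survivors_shared = conditional_odds_ratio(d1=0, shared_mechanism=True)
print("independent mechanisms, D1=0:", or_survivors)
print("independent mechanisms, D1=1:", or_dead)
print("shared mechanism,       D1=0:", or_survivors_shared)
```

In this toy setup the odds ratio among survivors equals 1 when the mechanisms are independent, differs from 1 among the dead, and also departs from 1 among survivors once E is allowed to affect heart-attack death.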

A.4. Generalizations of Structure (3)

Consider Appendix Figure 4a, representing a study restricted to firefighters (F = 1). E and D are unassociated among firefighters because the path E-F-A-C-D is blocked by C. If we then stratify on the covariate C, as in Appendix Figure 4b, E and D are conditionally associated among firefighters in a given stratum of C; yet C is neither caused by E nor by a cause of E. This example demonstrates that our previous formulation of structure (3) is insufficiently general to cover examples in which we have already conditioned on another variable F before conditioning on C. Note that one could try to argue that our previous formulation works by insisting that the set (F, C) of all variables conditioned on be regarded as a single supervariable and then applying our previous formulation with this supervariable in place of C. This fix-up fails because it would require E and D to be conditionally associated within joint levels of the supervariable (C, F) in Appendix Figure 4c as well, which is not the case.

Appendix Figure 4. Conditioning on 2 variables.

However, a general formulation that works in all settings is the following. A conditional association between E and D will occur within strata of a common effect C of 2 other variables, one of which is either the exposure or statistically associated with the exposure and the other is either the outcome or statistically associated with the outcome.

Clearly, our earlier formulation is implied by the new formulation and, furthermore, the new formulation gives the correct results for both Appendix Figures 4b and 4c. A drawback of this new formulation is that it is not stated purely in terms of causal structures, because it makes reference to (possibly noncausal) statistical associations. It is in fact possible to provide a fully general formulation in terms of causal structures, but it is not simple, so we do not give it here; see references 13 and 14.

Supplemental Digital Content: 00001648-920040900-00004.doc; [Word] (35 KB)
© 2004 Lippincott Williams & Wilkins, Inc.
Source: A Structural Approach to Selection Bias. Epidemiology 15(5):615-625, September 2004.