2008, Vol. 44, No. 2, 305–315
Copyright 2008 by the American Psychological Association
Effects of the Family Environment: Gene–Environment Interaction and
Passive Gene–Environment Correlation
Thomas S. Price and Sara R. Jaffee
University of Pennsylvania
The classical twin study provides a useful resource for testing hypotheses about how the family
environment influences children’s development, including how genes can influence sensitivity to
environmental effects. However, existing statistical models do not account for the possibility that
children can inherit exposure to family environments (i.e., passive gene–environment correlation). The
authors introduce a method to simultaneously estimate the effects of passive gene–environment correlation
and gene–environment interaction and use it to investigate the relationship between chaos in the
home and verbal ability in a large sample of 4-year-old twins.
Keywords: twins, epidemiology, gene–environment interaction, gene–environment correlation
Supplemental materials: http://dx.doi.org/10.1037/0012-1649.44.2.305.supp
Developmental psychologists have long been interested in in-
vestigating whether and how children’s family environments in-
fluence their cognitive and behavioral development. Researchers
have demonstrated that environmental factors measured at the
family level, such as socioeconomic status (SES) and geographical
location, are associated with children’s outcomes independent of
individual-level parent or child factors (e.g., Duncan & Brooks-
Gunn, 1997; Leventhal & Brooks-Gunn, 2000) and are often
mediated by more proximal processes (Collins, Maccoby, Stein-
berg, Hetherington, & Bornstein, 2000).
Although many studies have demonstrated associations between
family environmental factors and children’s adjustment, it is
equally true that the effects of family environment can vary from
child to child, even for children raised in the same family. For
example, extreme privation in childhood can cause profound cog-
nitive deficits, but children raised under such conditions vary
widely in terms of their cognitive and socioemotional functioning
(Rutter, O’Connor, & the English and Romanian Adoptees Study
Team, 2004). One possibility is that the effects of adversity are
conditioned by individual differences in children’s resilience or
adaptivity to environmental risk, which may have a genetic basis
(Rutter, 2003). Gene–environment interactions (G × E) occur
when genetic factors influence sensitivity to environmental effects.
An alternative way of conceptualizing the interaction is to say that
environmental exposure moderates the effect of genetic risk factors.
The existence of such moderating effects would suggest
greater scope for environmental intervention to alter heritable traits
such as cognitive abilities.
However, many challenges remain to identify G × E. It has
been demonstrated in animal experiments by researchers showing
differential effects of environmental conditions in groups of animals
stratified by their genetic background (e.g., Bennett et al.,
2002). For obvious reasons, it has been much harder to demonstrate
G × E in humans. Whereas in animal studies both genotype
and environmental exposure can be manipulated for experimental
purposes, in human studies interactions must generally be sought
between naturally occurring variations in genotype and environment.
Epidemiological approaches to G × E can offer greater
validity than experimental studies, but they do so at the expense of
statistical power and experimental control. One of the drawbacks
of an epidemiological study of G × E is the possibility of correlation
between genotype and environment: the phenomenon of
gene–environment correlation (rGE). Statistical methods for detecting
G × E in human populations need to allow for the possible
lack of independence between genetic and environmental risk
factors (Etheredge, Christensen, Del Junco, Murray, & Mitchell,
2005; Liu, Fallin, & Kao, 2004).
Thomas S. Price, Institute for Translational Medicine and Therapeutics,
University of Pennsylvania; Sara R. Jaffee, Department of Psychology,
University of Pennsylvania.
We have no financial interests or conflicts of interest related to the
material reported in the article. This work was supported by Grant P50
HL81012 from the National Heart, Lung, and Blood Institute to Thomas S.
Price and Grant R01 HD050691 from the National Institute of Child Health
and Human Development to Sara R. Jaffee.
We thank the participants in the Twins Early Development Study
(TEDS), Robert Plomin, and the TEDS research team, especially Andy
McMillan for providing data management support.
Correspondence concerning this article should be addressed to Thomas
S. Price, Institute for Translational Medicine and Therapeutics, University
of Pennsylvania, Room 807 BRB II/III, 421 Curie Boulevard, Philadelphia,
PA 19104. E-mail: tom@spirit.gcrc.upenn.edu
Developmental Psychology
0012-1649/08/$12.00 DOI: 10.1037/0012-1649.44.2.305
Genotype and environment can correlate for various reasons
(Jaffee & Price, 2007), but studies of the effects of family environments
on children’s outcomes are particularly subject to confounding
due to passive rGE (Kendler & Eaves, 1986; Plomin,
1986). Passive rGE occurs when the family environment depends
on heritable parental characteristics, so that parents pass on to their
children an environment that correlates with the parental genotype.
Biological parents also pass on the genotype to their children.
When this genotype also influences children’s behavioral or cognitive
outcomes, the result is a spurious association between environment
and outcome (Plomin, DeFries, & Loehlin, 1977; Scarr
& McCartney, 1983). In this way, an association between a mea-
sure of the family environment and a childhood outcome can be
partially or totally accounted for by the effects of parental geno-
type. In the absence of passive rGE this association is attributed to
the influence of the family environment on the outcome. Failure to
rule out the possibility that passive rGE accounts for some portion
of the association may result in its misattribution to environmental
causes.
When the association between measured environment and out-
come is accounted for by unobserved genetic factors (namely,
parental genotype), then the association is said to be genetically
mediated. When the association is accounted for by unobserved
environmental factors, then it is said to be environmentally mediated.
This terminology is admittedly somewhat confusing. The
terms genetic mediation and environmental mediation as used by
behavioral geneticists do not imply that genes or environments are
intervening variables in the association between the measured
environment and the outcome; clearly, parental genotype is caus-
ally prior to both measured environment and childhood outcome.
Studies of twin children can exploit differences in genetic re-
latedness between monozygotic (MZ) and dizygotic (DZ) pairs to
quantify the degree to which the effects of environmental exposure
at the level of the individual are genetically mediated and environmentally
mediated, even in the presence of G × E (Eaves,
Silberg, & Erkanli, 2003; Rathouz, Van Hulle, Rodgers, & Lahey,
2007). However, current statistical methods for studies of twin
children are uninformative about whether the effects of family-
wide environments are environmentally or genetically mediated
(Turkheimer, D’Onofrio, Maes, & Eaves, 2005). In fact, the meth-
ods that are currently in use implicitly assume the absence of
passive rGE. This problem extends to studies that investigate
possible interactions between genetic influences and the family
environment.
The goal of the current article is to introduce an analytical
method for twin studies that simultaneously estimates G × E and
passive rGE for measures of the family environment. The motiva-
tions for developing this method are twofold. First, we have
identified a problem with the statistical methodology that is cur-
rently used to investigate the moderating effects of the family
environment, namely, the assumption that there is no passive rGE.
Therefore, we wish to develop an alternative method that does not
suffer the consequences of violating this assumption. The second
motivation is the prospect that passive rGE can be estimated using
data from child twins under specific circumstances: namely, when
genetic influences on the phenotype both correlate with and are
moderated by a measure of the family environment.
In this study, we analyze simulated datasets to investigate
whether these motivations are justifiable. First, we analyze the
simulated datasets using the existing method for detecting G × E
to quantify the problems that arise when passive rGE is present.
Second, we reanalyze the simulated data using the new model to
outline the range of circumstances under which the simulated
parameter values are accurately recovered.
We illustrate the model with an application to the trait of
childhood verbal ability. Below, we review twin studies that have
attempted to demonstrate how family-wide environments such as
SES and parental education moderate genetic influences on chil-
dren’s cognitive outcomes. We highlight methodological problems
in these studies that originate in their failure to account for possible
effects of passive rGE and explain how the new statistical model
may overcome these shortcomings. Finally, we apply the method
to a large sample of twins and show for the first time that such data
can be used to distinguish true environmental effects from passive
rGE.
Studies of G × E and Children’s Cognitive Abilities
A series of twin studies has attempted to quantify and test the
moderating effects of family environmental variables (e.g., paren-
tal education, SES) on genetic factors that influence individual
differences in children’s verbal or cognitive abilities (Asbury,
Wachs, & Plomin, 2005; Fischbein, 1980; Guo & Stearns, 2002;
Harden, Turkheimer, & Loehlin, 2007; Kremen et al., 2005; Rowe,
Jacobson, & Van den Oord, 1999; Scarr-Salapatek, 1971; Turkhei-
mer, Haley, Waldron, D’Onofrio, & Gottesman, 2003). The results
have been contradictory. An analysis of data from the National
Longitudinal Study of Adolescent Health concluded that the her-
itability of verbal ability was greater in families with highly
educated parents (Rowe et al., 1999), although a reanalysis as-
cribed the moderating effect to employment status and race (Guo
& Stearns, 2002). Other studies have also found that the genetic
influences on cognitive abilities are stronger in families in which
parents have more education (Kremen et al., 2005) or higher SES
(Harden et al., 2007; Turkheimer et al., 2003). In contrast, a large
study of 4-year-old twins found that heritability estimates did not
vary as a function of SES but were higher in high-risk families
characterized by high levels of chaos and poor parent–child
communication, aspects of the environment that typically correlate
with low SES (Asbury et al., 2005; Evans, 2004). Moreover, these
reports of G × E have not been confirmed in studies of nontwin
families (Nagoshi
& Johnson, 2005; Van den Oord & Rowe, 1997).
The twin studies reviewed above used either a structural equa-
tion modeling framework (e.g., Turkheimer et al., 2003), a mixed
model (Guo & Stearns, 2002), or a DeFries-Fulker regression
model (e.g., Rowe et al., 1999) to test hypotheses about the effects
of the family environment on cognitive outcomes. These methods
estimate the overall association between family environment and
the phenotype but do not distinguish between environmentally
mediated effects and passive rGE (Turkheimer et al., 2005). In
effect, the influences of latent genetic and environmental factors
are estimated from the variation in the phenotype that remains after
estimating a main effect of the measured family environment
(Purcell & Koenen, 2005; Turkheimer et al., 2005). This is equiv-
alent to assuming that the association between the measured family
environment and the child phenotype is mediated entirely through
the shared environment. In the presence of passive rGE this im-
plicit assumption is violated, so that these procedures not only
misspecify the effect of the measured environment but also mis-
specify the effects of the latent shared environmental factor. A
further consequence of the presence of passive rGE is that the
phenotypic variance cannot be resolved into separate genetic and
environmental components in the usual way (Rathouz et al., un-
published manuscript). Crucially, at least two studies have dem-
onstrated that rGEs are likely to account for part of the association
between SES or parental education and offspring cognitive abili-
ties (Neiss, Rowe, & Rodgers, 2002; Tambs, Sundet, Magnus, &
Berg, 1989), suggesting that estimates of G × E as predictors of
cognitive abilities might be biased. The same problem may apply
to twin studies of G × E in other phenotypes: For example, the
relationship between family dysfunction and children’s antisocial
behavior may be genetically rather than environmentally mediated
(Button, Scourfield, Martin, Purcell, & McGuffin, 2005). On the
other hand, studies that have been careful to measure environments
that are not likely to be genetically correlated with the outcome,
such as geographical region, are less vulnerable to this criticism
(e.g., Dick, Rose, Viken, Kaprio, & Koskenvuo, 2001).
SPECIAL SECTION: EFFECTS OF THE FAMILY ENVIRONMENT
Need for New Statistical Methods
In this article we suggest a methodological innovation that
addresses the problem we have identified. We introduce a statistical
model for the classical twin design that estimates both the
environmentally mediated effects of the family environment and
passive rGE in the presence of G × E. The influence of the
measured family environment is modeled as a random effect that
may correlate with genotype rather than a fixed main effect, an
approach that allows both the genetically mediated effects and the
environmentally mediated effects of the measured environment to
be estimated from the data. Similar analytic strategies have been
suggested previously in relation to child-specific environments
(Eaves et al., 2003; Purcell, 2002). Simulation studies have been
performed that quantify the deficiencies of the existing method and
validate the proposed analytical procedure.
We illustrate the new method using data from a large twin study
of early cognitive development. A previous report from this study
found evidence that chaos in the family home and features of
parent–child communication style moderated genetic influences
on verbal ability at age 4 (Asbury et al., 2005). We selected family
chaos for our analysis to facilitate comparison with the existing
literature on the moderating effects of distal family environments
over genetic influences on cognitive development.
Chaos reflects the child’s physical microenvironment, including
the child’s exposure to noise, crowding, and patterns of environmental
traffic (Matheny, Wachs, Ludwig, & Phillips, 1995). Family
chaos may retard cognitive development by causing children to
filter out useful environmental stimuli along with unwanted noise
(Evans, 2006). Parent–child interactions in noisy or crowded
homes are also less conducive to cognitive development because
parents are less responsive to their children (Evans, 2006). Chaos
correlates with SES and may function as a proximal mediator of its
effects (Asbury et al., 2005; Pike, Iervolino, Eley, Price, & Plomin,
2006). It has previously been shown that this measure of the family
environment not only correlates with verbal ability (Petrill, Pike,
Price, & Plomin, 2004; Pike et al., 2006) but also moderates the
effect of genetic influences on verbal ability (Asbury et al., 2005).
Although these findings are premised on the assumption that chaos
has environmentally mediated effects on children’s verbal ability,
an alternative hypothesis is that parents who raise their children in
chaotic home environments also pass along genetic variants that
are associated with poor verbal ability and that, in fact, the association
between chaos and children’s verbal ability is partly genetically
mediated. In this study, we use a new statistical method
to show that a previously reported association between chaos at
home and early verbal ability cannot be explained by passive rGE.
Method
Statistical Model
In this section we first outline a model for the effects of a
measured family environment on twin phenotypes that parameterizes
the effects of passive rGE and explain why it cannot be
successfully estimated. Next, we extend the model to account for
the effects of both G × E and passive rGE and show that the
existence of G × E allows the main effect of the measured family
environment to be distinguished from passive rGE.
Let us say that we are interested in understanding the sources of
variation in a phenotype like children’s verbal ability. In the
standard biometric model for twin data, the phenotype $Y_{ij}$ for twin
$j$ in family $i$ is determined by the population mean $\mu$ and the values
of the random variables $A_{ij}$, $C_i$, and $E_{ij}$ that represent additive
genetics, shared environment, and nonshared environment, respectively.
We assume that the latent variables $A$, $C$, and $E$ are
independently normally distributed and load on the phenotype with
coefficients $a$, $c$, and $e$, respectively. In order for the model to be
identified, it is necessary to provide an arbitrary location and scale for
the latent genetic and environmental variables. In accordance with
convention, we scale these latent variables to zero mean and unit
variance. Because MZ twins have the same genomic DNA, whereas
DZ twins share half their segregating genes, the genetic factors for
twins in the same family, $A_{i1}$ and $A_{i2}$, are correlated with coefficient
1 for MZ twins and with coefficient 0.5 for DZ twins. The nonshared
environments $E_{ij}$ are uncorrelated between members of the same family.
We supplement this model with an additional random variable $X$
representing a measured family environment, that is, a variable
measured at the family level whose value differs between families
but not within families (e.g., family chaos). $X$ is normally distributed
with zero mean and variance $\sigma_X^2$, has a main effect $x$ on the
phenotype, and is correlated with $A$ due to a passive rGE, such that
$\mathrm{Cor}(X_i, A_{i1}) = \mathrm{Cor}(X_i, A_{i2}) = r$. Under this model, the value of the
phenotype $Y_{ij}$ for twin $j$ in family $i$ is given by
$$Y_{ij} = \mu + aA_{ij} + cC_i + eE_{ij} + xX_i. \tag{1}$$
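As a rough illustration of Equation 1, the following Python sketch simulates twin pairs under the model using NumPy. It is not the authors' simulation code: the function name `simulate_twins`, the parameter values, and the choice to draw the two genetic factors and the standardized family environment jointly from a multivariate normal distribution are assumptions made here for illustration.

```python
import numpy as np

def simulate_twins(n, zygosity, mu=0.0, a=0.7, c=0.4, e=0.6,
                   x=0.3, r=0.2, sigma_x=1.0, seed=0):
    """Simulate n twin pairs under Equation 1 (hypothetical parameter values)."""
    rng = np.random.default_rng(seed)
    rho = 1.0 if zygosity == "MZ" else 0.5   # genetic correlation between twins
    # Joint correlation matrix of (A_i1, A_i2, X_i / sigma_x): the twins'
    # genetic factors correlate rho with each other and r with X.
    corr = np.array([[1.0, rho, r],
                     [rho, 1.0, r],
                     [r,   r,   1.0]])
    draws = rng.multivariate_normal(np.zeros(3), corr, size=n)
    A = draws[:, :2]                  # additive genetic factors, one column per twin
    X = sigma_x * draws[:, 2]         # measured family environment, one per family
    C = rng.standard_normal(n)        # shared environment, one value per family
    E = rng.standard_normal((n, 2))   # nonshared environment, one value per twin
    Y = mu + a * A + c * C[:, None] + e * E + x * X[:, None]
    return Y, X, A

Y, X, A = simulate_twins(100_000, "DZ")
print(np.corrcoef(A[:, 0], A[:, 1])[0, 1])  # should be close to 0.5 for DZ pairs
print(np.corrcoef(X, A[:, 0])[0, 1])        # should be close to r
```

Setting `zygosity="MZ"` makes the two genetic columns identical up to numerical precision, which mirrors the assumption that MZ twins share the same genomic DNA.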
Because we are interested in the effects of an environmental factor
that differs between families but not within families, it is illuminating
to rearrange the model for the phenotypic scores into
separate terms for the half sum (the within-family mean, corresponding
to the component of the twins’ phenotypic scores that
differs between families) and the half difference (the component of
the twins’ phenotypic scores that varies within families):
$$\begin{aligned}
\tfrac{1}{2}(Y_{i1} + Y_{i2}) &= \mu + \tfrac{1}{2}a(A_{i1} + A_{i2}) + cC_i + \tfrac{1}{2}e(E_{i1} + E_{i2}) + xX_i, \\
\tfrac{1}{2}(Y_{i1} - Y_{i2}) &= \tfrac{1}{2}a(A_{i1} - A_{i2}) + \tfrac{1}{2}e(E_{i1} - E_{i2}).
\end{aligned} \tag{2}$$
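The rearrangement in Equation 2 can be checked numerically. The Python sketch below uses made-up phenotype scores (illustrative values only, not data from the study) and a hypothetical helper, `half_sum_diff`, to split each pair's scores into a half sum and a half difference.

```python
import numpy as np

def half_sum_diff(y1, y2):
    """Return the half sum and half difference of two twins' scores."""
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    return 0.5 * (y1 + y2), 0.5 * (y1 - y2)

# Made-up scores for three twin pairs (illustrative only).
y1 = np.array([101.0, 94.0, 110.0])
y2 = np.array([97.0, 99.0, 108.0])
half_sum, half_diff = half_sum_diff(y1, y2)

# Adding the two terms recovers Twin 1; subtracting them recovers Twin 2.
assert np.allclose(half_sum + half_diff, y1)
assert np.allclose(half_sum - half_diff, y2)
```

Because the family-level terms cancel in the half difference, any variable that takes the same value for both twins can contribute only to the half sum.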
Note that the sum of these terms is $Y_{i1}$ (i.e., the score for Twin
1 in family $i$) and the difference between them is $Y_{i2}$ (i.e., the score
for Twin 2 in family $i$). The variance in the phenotypic scores can
be partitioned into a component of variance due to differences
between families, $\sigma_b^2$, accounted for by factors that differ between
families such as the main effect of $X$, and a component of variance
due to differences within families, $\sigma_w^2$, which cannot be accounted
for by the main effect of $X$ because $X$ takes the same value for all
children in a family. Let us assume that the variables follow a
multivariate normal distribution and there are no systematic effects
of birth order. The latter is not generally considered a controversial
assumption for behavioral data simply because twins are so close
in age, although there is some evidence that perinatal risk is
elevated in second-born twins (Armson et al., 2006). Under these
assumptions the following equality holds:
$$\mathrm{Var}(Y_{i1}) = \mathrm{Var}(Y_{i2}) = \sigma_b^2 + \sigma_w^2 = \mathrm{Var}\left[\tfrac{1}{2}(Y_{i1} + Y_{i2})\right] + \mathrm{Var}\left[\tfrac{1}{2}(Y_{i1} - Y_{i2})\right]. \tag{3}$$
The variances of the half-sums and half-differences correspond,
respectively, to the between-family variance $\sigma_b^2$ and the within-family
variance $\sigma_w^2$, and are uncorrelated. The values of $\sigma_b^2$ and $\sigma_w^2$
depend on the zygosity of the twin pair, because genetic differences
between twins can contribute to within-pair differences for DZ pairs
but not for MZ pairs. It can be shown that the between- and within-family
phenotypic variances for MZ and DZ pairs are given by:
$$\begin{aligned}
\sigma_{bMZ}^2 &= a^2 + c^2 + \tfrac{1}{2}e^2 + x^2\sigma_X^2 + 2ax\sigma_X r, \\
\sigma_{wMZ}^2 &= \tfrac{1}{2}e^2, \\
\sigma_{bDZ}^2 &= \tfrac{3}{4}a^2 + c^2 + \tfrac{1}{2}e^2 + x^2\sigma_X^2 + 2ax\sigma_X r, \\
\sigma_{wDZ}^2 &= \tfrac{1}{4}a^2 + \tfrac{1}{2}e^2.
\end{aligned} \tag{4}$$
We can see that a positive association between the measured
family environment and the phenotype increases the variance of
the phenotype between families. This association has two components:
an environmental component that is mediated solely by the
latent environmental variable $X$, and a genetic component due to
the passive rGE between $A$ and $X$. The phenotypic variance accounted
for by the environmental component of the association is
simply the square of the main effect of the environmental influence
on the phenotype, $x^2\sigma_X^2$. The phenotypic variance due to passive
rGE equals twice the covariance between $aA$ and $xX$, $2ax\sigma_X r$. Note
that the within-family phenotypic variance does not depend on
either $x$ or $r$. The lack of dependence of within-family variance on
$x$ is not surprising, because we model the effect of the family
environment the same way for both twins in a pair. The reason that
passive rGE makes no contribution to within-family differences
can be understood intuitively as follows. Genetic differences
within DZ twin pairs arise from recombination during meiosis, a
random process that is uncorrelated with the parental genotypes
and hence with the processes that cause passive rGE. As a final
point, it is important to note that the environmentally and genetically
mediated effects of the measured family environment on the
between-family and within-family variances are confounded in
this model. There are four pieces of information, and there are five
parameters (namely $a$, $c$, $e$, $x$ and $r$) to be estimated (the variance
$\sigma_X^2$ can be estimated directly from the data). This means that $x$ and
$r$ are not identified: It is not possible to estimate unique values for
these parameters from the data.
Let us now extend the model with additional terms describing
the moderating effects of the measured family environment. We
can model G × E by allowing a moderating effect of the measured
environment on the genotype so that the coefficient of genetic
influence on the phenotype is given by $a(1 + m_A X)$, where $m_A$ is
a linear moderation term. Nonzero values for this term imply that
the genetic influences on the phenotype vary across levels of the
measured environmental variable. We can also allow linear moderation
of the paths for the shared environment, $c(1 + m_C X)$, and
nonshared environment, $e(1 + m_E X)$. The variances of the phenotypic
sums and differences are now given by:
$$\begin{aligned}
\sigma_{bMZ}^2 &= a^2(1 + m_A X)^2 + c^2(1 + m_C X)^2 + \tfrac{1}{2}e^2(1 + m_E X)^2 + x^2\sigma_X^2 + 2a(1 + m_A X)x\sigma_X r, \\
\sigma_{wMZ}^2 &= \tfrac{1}{2}e^2(1 + m_E X)^2, \\
\sigma_{bDZ}^2 &= \tfrac{3}{4}a^2(1 + m_A X)^2 + c^2(1 + m_C X)^2 + \tfrac{1}{2}e^2(1 + m_E X)^2 + x^2\sigma_X^2 + 2a(1 + m_A X)x\sigma_X r, \\
\sigma_{wDZ}^2 &= \tfrac{1}{4}a^2(1 + m_A X)^2 + \tfrac{1}{2}e^2(1 + m_E X)^2.
\end{aligned} \tag{5}$$
The covariance between the measured environment $X$ and the
phenotype $Y$, $c_{XY}$, comprises terms relating to the environmentally
and genetically mediated effects of the measured environment:
$$c_{XY} = x\sigma_X^2 + a(1 + m_A X)\sigma_X r. \tag{6}$$
The covariance between measured environment and phenotype
due to passive rGE, $a(1 + m_A X)\sigma_X r$, is a linear function of $X$,
whereas the covariance due to the environmentally mediated effect,
$x\sigma_X^2$, is constant with respect to $X$. Consequently, if it is
known (or can be shown in advance) that there is moderation of the
genetic variance such that $m_A$ and $a$ are nonzero, then a model that
estimates the covariance between measured environment and phenotype
as a linear function of $X$ is identified and allows unique
values for $x$ and $r$ to be estimated from the data. In other words, the
existence of G × E allows one to distinguish between the environmentally
and genetically mediated effects of the measured
family environment. For example, if the environmental exposure $X$
is binary with values corresponding to exposure/nonexposure, then
the equations in (5) will provide eight pieces of information (four
variances for each value of $X$), sufficient to estimate the eight
parameters in the model. Naturally, the power to discriminate
between the main effect of the environment and passive rGE will
depend strongly on the values of the genetic path parameter $a$ and
the genetic moderation parameter $m_A$: Power will increase as the
magnitude of these parameters increases.
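The identification argument can be made concrete with a small calculation. The Python sketch below evaluates Equation 6 at two levels of X and then solves for r and x. It assumes, purely for illustration, that a, m_A, and the standard deviation of X are already known, whereas in practice all parameters would be estimated jointly; the function names and numeric values are hypothetical.

```python
def cov_xy(X, a, x, r, m_A, sigma_x):
    """Equation 6: covariance between measured environment and phenotype."""
    return x * sigma_x**2 + a * (1.0 + m_A * X) * sigma_x * r

def recover_x_and_r(c0, c1, a, m_A, sigma_x):
    """Solve Equation 6 for x and r from c_XY evaluated at X = 0 and X = 1."""
    if a == 0 or m_A == 0:
        raise ValueError("x and r are not separately identified when a or m_A is zero")
    # Only the passive-rGE term changes with X, so the difference isolates r.
    r = (c1 - c0) / (a * m_A * sigma_x)
    # What remains of c_XY at X = 0 after removing the rGE term is x * sigma_x^2.
    x = (c0 - a * sigma_x * r) / sigma_x**2
    return x, r

# Hypothetical parameter values chosen only for illustration.
a, m_A, sigma_x = 0.7, 0.25, 1.0
c0 = cov_xy(0.0, a, x=0.3, r=0.2, m_A=m_A, sigma_x=sigma_x)
c1 = cov_xy(1.0, a, x=0.3, r=0.2, m_A=m_A, sigma_x=sigma_x)
x_hat, r_hat = recover_x_and_r(c0, c1, a, m_A, sigma_x)
print(round(x_hat, 6), round(r_hat, 6))
```

When m_A is zero the two covariances coincide, so only the combined quantity in Equation 6 is pinned down and the helper refuses to separate x from r.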
Because the covariance between $X$ and $Y$ is entirely due to the
covariance between $X$ and the half-sums, which also equals $c_{XY}$,
the covariances between the variables in our model can be described
completely by inserting the values from the equations in
(5) and (6) into the following structural equations:
$$\begin{aligned}
\text{MZ}:\ \mathrm{Var}\left[\tfrac{1}{2}(Y_{i1} + Y_{i2}),\ \tfrac{1}{2}(Y_{i1} - Y_{i2}),\ X_i\right] &=
\begin{pmatrix} \sigma_{bMZ}^2 & 0 & c_{XY} \\ 0 & \sigma_{wMZ}^2 & 0 \\ c_{XY} & 0 & \sigma_X^2 \end{pmatrix}, \\
\text{DZ}:\ \mathrm{Var}\left[\tfrac{1}{2}(Y_{i1} + Y_{i2}),\ \tfrac{1}{2}(Y_{i1} - Y_{i2}),\ X_i\right] &=
\begin{pmatrix} \sigma_{bDZ}^2 & 0 & c_{XY} \\ 0 & \sigma_{wDZ}^2 & 0 \\ c_{XY} & 0 & \sigma_X^2 \end{pmatrix}.
\end{aligned} \tag{7}$$
A path diagram corresponding to this model of environmental
mediation and moderation is shown in Figure 1. For simplicity of
presentation, the means model is omitted. The model can either be
implemented within a structural equation modeling paradigm and
estimated by maximum likelihood (an example script for the freely
distributed software package Mx is provided as part of the
supplementary materials online in Supplemental Appendix I;
http://www.vcu.edu/mx/; Neale, Boker, Xie, & Maes, 1999) or
implemented as a Bayesian model and estimated by Markov chain
Monte Carlo methods. Bayesian models allow very flexible
parameterization: A script for the freely distributed program
WinBUGS 1.4.1 (www.mrc-bsu.cam.ac.uk/bugs/; Spiegelhalter,
Thomas, Best, & Lunn, 2003) is provided online in Supplemental
Appendix II.
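To show the structure of Equation 7 in executable form, the sketch below assembles the model-implied covariance matrix of the half sum, half difference, and X for one zygosity group at a given level of X, using the variances in Equation 5 and the covariance in Equation 6. This is a Python illustration only, not the Mx or WinBUGS script from the supplemental materials; the function name `implied_cov`, the parameter dictionary, and all numeric values are hypothetical.

```python
import numpy as np

def implied_cov(zygosity, X, p):
    """Model-implied covariance of (half sum, half difference, X), Equations 5-7."""
    ga = p["a"] * (1.0 + p["m_A"] * X)   # moderated genetic path
    gc = p["c"] * (1.0 + p["m_C"] * X)   # moderated shared-environment path
    ge = p["e"] * (1.0 + p["m_E"] * X)   # moderated nonshared-environment path
    sx, x, r = p["sigma_x"], p["x"], p["r"]
    # MZ pairs carry all genetic variance between families;
    # DZ pairs split it 3/4 between and 1/4 within families.
    between_share, within_share = (1.0, 0.0) if zygosity == "MZ" else (0.75, 0.25)
    var_b = (between_share * ga**2 + gc**2 + 0.5 * ge**2
             + x**2 * sx**2 + 2.0 * ga * x * sx * r)
    var_w = within_share * ga**2 + 0.5 * ge**2
    c_xy = x * sx**2 + ga * sx * r
    return np.array([[var_b, 0.0, c_xy],
                     [0.0, var_w, 0.0],
                     [c_xy, 0.0, sx**2]])

# Hypothetical parameter values, for illustration only.
params = dict(a=0.7, c=0.4, e=0.6, x=0.3, r=0.2,
              m_A=0.25, m_C=0.0, m_E=0.0, sigma_x=1.0)
sigma_mz = implied_cov("MZ", 0.0, params)
sigma_dz = implied_cov("DZ", 0.0, params)
print(sigma_mz)
print(sigma_dz)
```

The between- and within-family variances sum to the same total in both zygosity groups, as Equation 3 requires. A fitting routine would compare matrices like these with the observed covariances at each level of X.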
Data Analysis
The expected vector of means for, respectively, the half-sums,
half-differences, and measured environment is ($\mu$, 0, 0) for both
zygosity groups. However, in practice it may be better to estimate
the means for MZ and DZ groups without imposing any constraints
to capture any mean effects due to zygosity and birth order.
As mentioned previously, the statistical model described by the
equations in (7) can only be estimated when it is already known
that the measured family environment moderates the genetic variance
(i.e., $m_A \neq 0$). For this reason, the full model, containing
both $r$ and $m_A$ parameters, cannot, by itself, provide a test for the
Figure 1. Path diagrams for monozygotic (MZ) and dizygotic (DZ) twin pairs, showing observed variables
(square boxes), latent variables (circles), regression paths (single-headed arrows) and correlations
(double-headed arrows). The means model has been omitted. $X$ = measured family environment; $\sigma_X^2$ = the variance of
the measured family environment; $\tfrac{1}{2}(Y_1 + Y_2)$ = half sum of twin phenotypes; $\tfrac{1}{2}(Y_1 - Y_2)$ = half difference
between twin phenotypes; $A$ = additive genetics; $C$ = shared environment; $E$ = nonshared environment; $b$
suffix = between-family variance; $w$ suffix = within-family variance; $m_C$ = linear moderation of shared
environmental path; $r$ = correlation due to passive rGE; $m_A$ = linear moderation of genetic path; $a$ = additive
genetic path parameters; $c$ = shared environment path parameters; $e$ = nonshared environment path parameters;
$x$ = environmentally mediated effect of the measured environment.