PUBLIC HEALTH MATTERS
At the time this work was completed, Paige Muellerleile was with the Department of Psychology, University of Wisconsin-Marshfield, Marshfield, and Brian Mullen was with the Department of Psychology, Syracuse University, Syracuse, New York.
Correspondence: Requests for reprints should be sent to Paige Muellerleile, PhD, University of Wisconsin-Marshfield, 2000 West Fifth Street, Marshfield, Wisconsin 54449 (e-mail: pmueller{at}uwc.edu).
| ABSTRACT |
|---|
We propose cumulative meta-analysis as the procedure of completing a new meta-analysis at each successive wave in a research database. Two facets of cumulative knowledge are considered: the first, sufficiency, refers to whether the meta-analytic database adequately demonstrates that a public health intervention works. The second, stability, refers to the shifts over time in the accruing evidence about whether a public health intervention works.
We used a hypothetical data set to develop the indicators of sufficiency and stability, and then applied them to existing, published datasets. Our discussion centers on the implications of the use of this procedure in evaluating public health interventions.
| INTRODUCTION |
|---|
Traditional meta-analysis can inform public health interventions and policies, usually to determine whether an intervention has an impact on health practices, and the magnitude of that impact. However, traditional meta-analysis overlooks 2 aspects of public health information. The first is sufficiency. Sufficiency refers to whether the meta-analytic database adequately demonstrates that a public health intervention works. For example, 1 meta-analysis10 synthesizes the relationship between socioeconomic status and self-esteem, integrating the results of 446 hypothesis tests conducted among 312 940 participants. This number of hypothesis tests raises the question of whether there was sufficient justification for using valuable research and participant resources to conduct the 446th hypothesis test to help establish the relationship between socioeconomic status and self-esteem. If there was little value in adding the 446th hypothesis test, was there sufficient value in the 445th test? What about the 200th test?11 For many public health issues, collecting additional evidence for an already-established effect may waste more than research and participant resources: delaying implementation of effective risk-reduction interventions may also waste health care resources, employer costs, and lives.
The second of the aspects overlooked by traditional meta-analysis is stability. Stability refers to the shifts over time in the accruing evidence about whether a public health intervention works. For example, the purported effects of sex education have been controversial. Studies of the efficacy of sex education programs have rendered conflicting estimates of the effects of these programs on adolescent sexual activity: some studies indicate that sex education programs decrease sexual activity.12 Others indicate that sex education programs do not appear to influence rates of sexual activity.13 Still others indicate that sex education programs lead to increased sexual activity.14 As additional studies are added, the estimate of the typical effect of sex education programs on adolescent sexual activity may continue to fluctuate.15 For a number of public health issues, implementing effective interventions is a worthwhile effort. However, implementing ineffective interventions can waste health care resources, employer costs, and lives.
We describe these 2 aspects of cumulative knowledge in the public health context. We discuss previous efforts to interpret cumulative meta-analysis, explain indicators of sufficiency and stability to aid interpretation of cumulative meta-analysis, and consider the use of the indicators of sufficiency and stability in a set of previously published meta-analyses.
| CUMULATIVE META-ANALYSIS |
|---|
To illustrate the examination of evidence for sufficiency and stability in cumulative meta-analysis, we will make use of a hypothetical data set that has previously been used to illustrate other meta-analytic issues.1,2,16,19,20 Table 1 describes the data set, which includes the results of 10 studies of the effects of X on Y. For this example, let X = some public health intervention (e.g., seatbelt laws) and let Y = some public health outcome (e.g., traffic fatalities). For each hypothesis test, Table 1 also presents the corresponding Z for significance and ZFisher for effect size (the Fisher logarithmic transformation of the product moment r for effect size). This use of ZFisher is consistent with the meta-analytic techniques4,21,22 used in this effort. Cumulative meta-analyses using Cohen d or Hedges g, or any other linear metric of effect size, could be conducted similarly.
Z̄Fisher 2 = 0.42. Performing a new meta-analysis for each of the 10 waves in the database results in a mean effect size of Z̄Fisher 10 = 0.50.
Z̄Fisher i are not intended for use as estimators of inferential probabilities. Cumulative meta-analysis necessarily involves multiple tests of the same hypothesis, and using CIs for estimating inferential probabilities therefore increases the likelihood of committing a Type I error. In this context, rather than being indications of the likelihood that the effects are significant, the CIs indicate the range of values that are statistically equivalent to the parameter. In other words, the CIi around the Z̄Fisher i for wave i indicates the range of values indistinguishable from the parameter value. Generally, the CIis become narrower as the number of hypothesis tests, ki, increases, and as the cumulative sample size, ΣNi, increases.19 For a Z̄Fisher i that remains constant, then, additional studies result in narrower CIis around that mean, which decreases the range of values for the effect size that are statistically equivalent to the true effect size.
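The narrowing of the CIi across waves can be illustrated with a minimal sketch. It assumes 1 study per wave and weights each study by n - 3 (the reciprocal of the sampling variance of Fisher's z); the weighting used for the analyses reported here may differ. The function name and the sample sizes are hypothetical; only the 2 effect sizes are taken from the hypothetical data set.

```python
import math


def cumulative_meta_analysis(z_fisher, n):
    """Return (k_i, mean ZFisher, CI low, CI high) for each wave i.

    Assumes one study per wave and weights of n - 3 (the reciprocal of the
    sampling variance of Fisher's z); the article's own weighting may differ.
    """
    waves = []
    for k_i in range(1, len(z_fisher) + 1):
        weights = [n_j - 3 for n_j in n[:k_i]]
        total_weight = sum(weights)
        mean_z = sum(w * z for w, z in zip(weights, z_fisher[:k_i])) / total_weight
        half_width = 1.96 / math.sqrt(total_weight)  # 95% CI half-width
        waves.append((k_i, mean_z, mean_z - half_width, mean_z + half_width))
    return waves


# ZFisher values for studies 1 and 2 come from the hypothetical data set;
# the sample sizes of 50 are hypothetical, chosen only for illustration.
waves = cumulative_meta_analysis([0.49, 0.35], [50, 50])
for k_i, mean_z, low, high in waves:
    print(f"wave {k_i}: mean ZFisher = {mean_z:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

With equal sample sizes, the wave 2 mean is the simple average of 0.49 and 0.35, and the interval at wave 2 is narrower than at wave 1 because the cumulative weight has doubled.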
From the first wave through the end of the database, the evidence for the effect of X on Y appeared to be sufficient: the CIi around the mean effect size did not include the value of zero. Put differently, the range of values for the mean effect size at each wave appeared to be statistically different from a null effect. Therefore, it would be hard to argue for additional research about the effects of X on Y, as it appears that the effect was there from the start. Similarly, from the first wave through the end of the database, the evidence for the effect of X on Y appeared to be stable: there is little change in the value of the mean effect. Therefore, it would be hard to argue for additional research to determine whether the emergent picture of the effect of X upon Y might change. Although the visual information presented in Figure 1a portrays a simple data set for which this interpretation is straightforward, one cannot expect real datasets to be so obliging. In real datasets, it may be very difficult to determine when there was sufficient evidence to determine that X had a particular effect on Y. Moreover, in real datasets, it may be very difficult to determine when the effect became stable; that is, the point at which the value of the effect of X upon Y did not change appreciably from one wave to the next.
| PREVIOUS EFFORTS TO INTERPRET CUMULATIVE META-ANALYSES |
|---|
Pogue and Yusuf25 suggested a different approach for determining when accumulating evidence is statistically significant, which involves the adaptation of classical monitoring boundaries. They propose that the cumulative meta-analyst calculate an "optimum information size," which is the cumulative sample size needed to demonstrate an effect, in light of event rates and the minimum reasonable values of the independent variable that would be considered consequential.
Although their efforts to produce a method for statistical inference within cumulative meta-analysis are commendable, there has been little debate about the efficacy of the proposed monitoring boundaries. We propose the use of more straightforward indicators of sufficiency and stability, even though there may not be accompanying inferential probabilities for them. The first reason for using more straightforward indicators is their simplicity. The second reason is that Pogue and Yusuf25 require a priori specification of the optimum information size. However, a researcher must know what the event rates might be (which requires an understanding of what minimum effects of the independent variable are both consequential and reasonable) before specifying the optimum information size. In other words, the researcher would need extensive knowledge of the observed results of the accumulated research before undertaking a cumulative meta-analysis to understand the observed results of the accumulated research. Finally, the third reason is that Pogue and Yusuf were concerned only with sufficiency: whether additional evidence is needed to establish that X has some effect upon Y. They did not address whether that effect has become stable across waves in a database.25 For these reasons, we propose that cumulative meta-analysts make use of more straightforward indicators of both sufficiency and stability.
| INDICATORS OF SUFFICIENCY AND STABILITY |
|---|
Clearly, the hypothetical database presented in Table 1, used to generate Figure 1, appears to demonstrate sufficient evidence for the effect of X upon Y. It also appears that the effect of X upon Y is stable. However, real research databases are unlikely to be as clear-cut as this one. Therefore, we will outline the procedures for generating indicators of sufficiency and stability using the hypothetical database, and then use the same procedures in real databases.
The Failsafe Ratio
There is a bias in favor of publishing reports of significant results.42–47 The consequence of the bias is the possibility that unpublished or unknown studies with null results may exist in researchers' file drawers.47 To address the file drawer problem, Rosenthal developed a technique for estimating the number of unpublished, unretrieved studies with null results that would have to exist in file drawers to bring the overall combined probability to just significant at the P = 0.05 level. The resulting "failsafe number" (Nfs(P = 0.05)42) is calculated as follows:
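Assuming the standard form of the failsafe number given by Rosenthal,47 the calculation can be written as:

```latex
N_{fs(P = 0.05)} = \left( \frac{\sum_{j=1}^{k} Z_j}{1.645} \right)^{2} - k
```

Here Zj is the standard normal deviate for hypothesis test j, k is the number of hypothesis tests in the database, and 1.645 is the 1-tailed critical value of Z at P = 0.05.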

Rosenthal47 noted that it would be unlikely that there would be 5 times as many unretrieved studies as there were in the meta-analyst's database. He proposed that Nfs(P = 0.05) exceed 5k + 10 (the addition of 10 studies would ensure that for very small meta-analytic databases of 1 or 2 studies, the number of unretrieved studies would be 15 or 20, rather than only 5 or 10). The importance of the failsafe number Nfs(P = 0.05) and Rosenthal's47 5k + 10 standard is illustrated by the studies that use it.48–52 The "failsafe ratio" is an indicator of the relative sizes of the failsafe number and the Rosenthal standard, and is calculated as follows:
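Taking the ratio as the failsafe number divided by the 5k + 10 standard, which is the reading consistent with the worked values for waves 1 and 2, the calculation at wave i can be written as:

```latex
\text{failsafe ratio}_i = \frac{N_{fs(P = 0.05)\,i}}{5k_i + 10}
```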

where ki = the number of studies in the database at wave i. If the failsafe ratio is less than 1.000, then Nfs(P = 0.05)i at wave i has not exceeded the 5ki + 10 standard. Thus, the results at wave i are still vulnerable to future null results. If the failsafe ratio exceeds 1.000, then Nfs(P = 0.05)i at wave i has exceeded the 5ki + 10 standard. Thus, the results at wave i will tolerate future null results.
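The calculation of the failsafe ratio across waves can be sketched as follows. This is an illustration under stated assumptions, not the code used for the analyses: it assumes Rosenthal's standard failsafe formula with a 1-tailed critical Z of 1.645 and 1 hypothesis test per wave, and the function names and Z values are hypothetical.

```python
def failsafe_number(z_values, z_crit=1.645):
    """Rosenthal's failsafe number: (sum of Z / z_crit)^2 - k."""
    k = len(z_values)
    return (sum(z_values) / z_crit) ** 2 - k


def failsafe_ratio(n_fs, k):
    """Failsafe number relative to the 5k + 10 standard."""
    return n_fs / (5 * k + 10)


def cumulative_failsafe_ratios(z_values):
    """Failsafe ratio at each wave, assuming one hypothesis test per wave."""
    ratios = []
    for k_i in range(1, len(z_values) + 1):
        n_fs = failsafe_number(z_values[:k_i])
        ratios.append(failsafe_ratio(n_fs, k_i))
    return ratios


# Hypothetical Z values, one per wave (these are not the values in Table 1).
example_z = [2.0, 2.5, 3.0, 2.2]
for wave, ratio in enumerate(cumulative_failsafe_ratios(example_z), start=1):
    status = "tolerates" if ratio > 1.0 else "is vulnerable to"
    print(f"wave {wave}: failsafe ratio = {ratio:.3f} ({status} future null results)")
```

Keeping the failsafe number and the ratio in separate functions lets an analyst supply a failsafe number computed elsewhere and still apply the 5k + 10 standard.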
Figure 1b displays the cumulative meta-analysis from Figure 1, with the addition of the failsafe ratio that was calculated at each wave of the database. For example, the first wave had 1 study (k1 = 1), and the Nfs(P = 0.05) 1 = 7.5. Therefore, the value of the failsafe ratio would be:
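Substituting the wave 1 values into the ratio of the failsafe number to the 5k + 10 standard gives:

```latex
\frac{N_{fs(P = 0.05)\,1}}{5k_1 + 10} = \frac{7.5}{5(1) + 10} = \frac{7.5}{15} = 0.500
```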

Because the value of the failsafe ratio is less than 1.000, the results at wave 1 are still vulnerable to future null results. The second wave added 1 study (k2 = 2), and the Nfs(P = 0.05) 2 = 20.7. Therefore, the value of the failsafe ratio would be:
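Substituting the wave 2 values into the same ratio gives:

```latex
\frac{N_{fs(P = 0.05)\,2}}{5k_2 + 10} = \frac{20.7}{5(2) + 10} = \frac{20.7}{20} = 1.035
```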

Because the failsafe ratio exceeds 1.000, the results at wave 2 are likely to tolerate future null results. The value of the failsafe ratio continues to increase to a value of 10.483 by the 10th wave of the database.
Inspection of the failsafe ratio displayed in Figure 1b reveals that the number of studies in the database with null results needed to reduce the combined significance to P = 0.05 becomes excessive beyond the second wave in the database, where the failsafe ratio exceeds 1.000. From that point onward in time, there seems to be no need for additional research to establish the effect of X on Y; there is sufficient evidence that the phenomenon exists, and additional research is unlikely to change the weight of that evidence.
Although the failsafe ratio can indicate the sufficiency of a research database, it does not adequately address the stability of the effect size. To the extent that the results of additional studies are of different magnitudes (as long as they are not null effects, on average), there can be fluctuations in the magnitude of the cumulative effect size that will not be captured by examination of the failsafe ratio. It is necessary to consider a more direct indicator of stability.
The Cumulative Slope
One way to determine whether there is a change in a database over time is to plot the data and examine the slope of the plotted points. The combined effect sizes presented as Z̄Fisher i at each wave can mask the change in effect size in successive waves. In Figure 1c, each data point's placement has been preserved across waves, rather than presenting the average effect for each wave. For example, the first study in the database appears at wave 1 (ZFisher = 0.49). That same data point is also displayed at subsequent waves. The second study in the database appears at wave 2 (ZFisher = 0.35). That data point, along with the first, is displayed at all subsequent waves. In the final wave of the database, all 10 data points appear, for a total of 55 data points in the figure.
Additionally, the regression line in Figure 1c is the result of the regression of the Σki = 55 data points across each wave upon ki as a predictor. The purpose of the regression is to estimate the rate of change (slope) across all of the waves of the meta-analytic database. It would be inappropriate to use the slope to derive inferential probabilities, because meta-analytic data violate the assumptions of the general linear model for statistical inference.1,2,4,53,54 However, the least-squares estimates of regression parameters like the slope and the intercept are not biased. In the hypothetical database, the regression equation that results from the 55 pairs of ZFisher and ki data points is ZFisher = 0.46 + 0.004(k). In the cumulative meta-analysis, the slope (0.004) indicates that the best-fitting line levels off as the number of hypothesis tests increases. In other words, the effect becomes stable, not changing dramatically across waves in the database. A comparison of the size of the slope in successive waves in the database provides the cumulative meta-analyst with a means of determining whether a phenomenon has become stable.
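The construction of the stacked data points and the regression of effect size on wave can be sketched as follows. This is a minimal illustration, assuming 1 study per wave and an ordinary least-squares fit; the function names are hypothetical, and only the first 2 effect sizes (0.49 and 0.35) come from the hypothetical data set, so the fitted line will not reproduce the equation reported above.

```python
def stack_waves(z_fisher):
    """Pair each study's ZFisher with its own wave and every later wave."""
    return [
        (wave, z)
        for wave in range(1, len(z_fisher) + 1)
        for z in z_fisher[:wave]
    ]


def fit_line(points):
    """Least-squares intercept and slope; needs at least 2 distinct waves."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# The first 2 values are from the text; the other 8 are hypothetical.
z_fisher = [0.49, 0.35, 0.52, 0.47, 0.55, 0.44, 0.51, 0.48, 0.53, 0.50]
points = stack_waves(z_fisher)
intercept, slope = fit_line(points)
print(f"{len(points)} data points: ZFisher = {intercept:.2f} + {slope:.3f}(k)")
```

Ten waves yield 1 + 2 + ... + 10 = 55 stacked points, which matches the count given in the text.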
Figure 1c does not show how the regression line may have changed across waves, which would indicate the point in the database at which the regression line became stable. In contrast, Figure 1d displays the cumulative meta-analysis from Figure 1a, with the addition of the cumulative slope, which changes as regressions are performed on each of the successive pairs of ZFisher and ki data points at each wave of the database. In Figure 1d, the absolute values of the slopes resulting from regressing effect size on each successive wave i comprise the "cumulative slope." Absolute value is used because of the chance that the first few effect sizes are larger (resulting in a negative slope) or smaller (resulting in a positive slope) than the eventual mean effect size. For example, the first cumulative slope plotted at wave 2, β = 0.070, represents the slope from the 3 pairs of data points at waves 1 and 2. The values of the slopes fluctuate between |0.070| at wave 2, and +0.023 at wave 4. After that point, they level out at around 0.000.
Inspection of the slopes displayed in Figure 1d reveals that the phenomenon becomes stable after the third wave in the database, where the value of the cumulative slope approaches 0.000. Thus, to the extent that the cumulative slope is different from 0.000, the cumulative weight of evidence continues to fluctuate. As the cumulative slope approaches 0.000, the cumulative weight of evidence has become stable. In other words, additional studies are unlikely to change the picture of the phenomenon.
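The cumulative slope at each wave can be sketched as follows, again assuming 1 study per wave and an ordinary least-squares fit. The function names are hypothetical, and the tolerance used to flag stability is an assumption made for illustration only: the text describes the slope as approaching 0.000 without setting a numeric cutoff.

```python
def cumulative_slopes(z_fisher):
    """Map each wave i >= 2 to |slope| of ZFisher regressed on wave number."""
    slopes = {}
    xs, ys = [], []
    for wave in range(1, len(z_fisher) + 1):
        # Wave i contributes every study available so far, each paired with x = i.
        xs.extend([wave] * wave)
        ys.extend(z_fisher[:wave])
        if wave < 2:
            continue  # a slope is undefined with a single wave
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        slopes[wave] = abs(sxy / sxx)
    return slopes


def first_stable_wave(slopes, tolerance=0.01):
    """First wave from which every later |slope| stays below the tolerance."""
    stable = None
    for wave in sorted(slopes):
        if slopes[wave] < tolerance:
            if stable is None:
                stable = wave
        else:
            stable = None  # the slope rose again, so restart the search
    return stable


# The first 2 studies of the hypothetical data set reproduce the wave 2 slope.
print(f"{cumulative_slopes([0.49, 0.35])[2]:.3f}")
```

For the first 2 studies (ZFisher = 0.49 and 0.35), the 3 stacked points give a slope of -0.070, whose absolute value matches the first cumulative slope reported for wave 2.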
Examining sufficiency and stability as complementary aspects of an emerging cumulative meta-analytic database allows the analyst to consider the separate contributions that sufficiency and stability can make toward understanding the phenomenon. In the case of a phenomenon that appears to be strong at the outset, a cumulative slope of 0.000 indicates that additional studies would continue to support the phenomenon's existence (high sufficiency). However, in the case of a phenomenon that appears to be negligible or null at the outset, a cumulative slope of 0.000 suggests that additional studies would not support existence of the phenomenon (low sufficiency). As such, the cumulative slope is a better indicator of the stability of a phenomenon than of sufficient evidence for it.
Summary
Figure 1a displays a hypothetical example of a database that is both sufficient and stable from the outset. The indicators of sufficiency (failsafe ratio) and stability (cumulative slope) permit the cumulative meta-analyst to determine when there was sufficient evidence for the existence of the phenomenon, and when it became stable. Because the failsafe ratio and cumulative slope established sufficiency and stability early in the hypothetical database, these indicators appear to correspond with the conclusion the analyst might have drawn from an examination of Figure 1a. The following section makes use of these indicators in real meta-analyses that are less obvious than the hypothetical example.
| APPLICATIONS TO ACTUAL META-ANALYSES |
|---|
A third picture emerges from examination of Figure 2c. The music therapy interventions57 did not achieve sufficiency until after 7 studies (k4 = 7 hypothesis tests, 6 years before the meta-analysis, and before 67% of the includable hypothesis tests). Further examination of Figure 2c reveals that the interventions achieved stability just after that point (k5 = 11 hypothesis tests, 5 years before the meta-analysis, and before 48% of the includable hypothesis tests). The cumulative meta-analysis for the effectiveness of music therapy for adults with dementia reveals that excessive time and effort were invested in evaluating programs for which sufficiency and stability had been established much earlier. However, unlike the nutrition programs in Figure 2a, the cumulative meta-analysis for music therapy would have indicated that more data needed to accumulate before the sufficiency and stability of the intervention effectiveness could be established.
Finally, the picture that emerges in Figure 2d is similar to that of Figure 2b. The caregiver burden reduction programs58 did not achieve sufficiency at all, even after 24 studies (k13 = 27 hypothesis tests). Stability, however, was established relatively early in the cumulative meta-analytic database (k4 = 7 hypothesis tests, 12 years before the meta-analysis, and before 74% of the includable hypothesis tests). The cumulative meta-analysis for the caregiving burden interventions would have revealed that a good deal of effort and resources had been invested in conducting research on a phenomenon that never achieved sufficiency and yet for which stability might have been established long ago.
| DISCUSSION |
|---|
The complementary aspects of cumulative knowledge, sufficiency and stability, correspond with 2 dimensions of study outcome: significance level and effect size. First, significance level refers to the likelihood of having obtained the observed results, or results more extreme, if in fact the null hypothesis of no difference is true, whereas sufficiency refers to whether the cumulative weight of evidence allows us to accept the existence of the phenomenon. Sufficiency requires a high cumulative probability. Second, effect size refers to the strength of a phenomenon, whereas stability refers to whether the cumulative weight of evidence has leveled off at a steady aggregate picture of the phenomenon. Stability requires a steady cumulative average effect. The cumulative meta-analytic context underscores the role of the size of the database. At the individual study level, significance levels and effect sizes are linked through the size of the sample. That is, a significant effect of P = 0.0499999 might be weak if based on a large sample (n = 1000, ZFisher = 0.052), but strong if based on a small sample (n = 3, ZFisher = 1.830).4 Given the correspondence between significance level/effect size and sufficiency/stability, the size of the database should play a pivotal role in cumulative meta-analysis. Indeed, this appears to be the point of Schmidt's20 admonition: when is it possible to tell when there is sufficient evidence for the existence of a phenomenon?
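The link between significance level, effect size, and sample size in the example above can be checked with a short sketch. It assumes a 1-tailed P, the conversion r = Z/√n, and the Fisher transformation ZFisher = atanh(r); the function name is hypothetical, and the conversion is our reading of the example rather than a procedure spelled out in the text.

```python
import math
from statistics import NormalDist


def z_fisher_at_threshold(n, p=0.05):
    """ZFisher implied by a result that is just significant at 1-tailed P = p.

    Assumes r = Z / sqrt(n); valid only when that r is below 1.
    """
    z = NormalDist().inv_cdf(1 - p)  # about 1.645 for p = 0.05
    r = z / math.sqrt(n)
    return math.atanh(r)


for n in (1000, 3):
    print(f"n = {n}: ZFisher = {z_fisher_at_threshold(n):.3f}")
```

Under these assumptions, the same significance level corresponds to an effect size of roughly 0.05 at n = 1000 and roughly 1.83 at n = 3, in line with the values quoted above.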
| PUBLIC HEALTH IMPLICATIONS OF CUMULATIVE META-ANALYSIS |
|---|
The cumulative meta-analysis generated from the integration of heart-healthy nutrition interventions55 demonstrated that, early on, both sufficiency and stability for an effective program were attained. In contrast, the cumulative meta-analysis generated from the integration of drug abuse prevention programs56 demonstrated that sufficiency was never established, but stability for the essentially null effect was established by the fourth wave in the database. Yet these 2 programs appear to receive differential research support and commitment. For example, the Healthy People 2010 guidelines59 delineate only 1 objective for improving nutrition in school meals, but there are at least 7 objectives for decreasing substance use among schoolchildren. Although drug abuse is a serious public health problem, the Healthy People 2010 objectives appear to be made on the basis of some of the same studies that appeared in White and Pitts' meta-analysis,56 indicating an overemphasis on promoting programs from which schoolchildren derive no benefit. Meanwhile, the objectives underemphasize a program from which schoolchildren derive significant benefits. Despite the emerging cultural alarm over obesity and its associated health problems, efficacious heart-healthy eating programs appear to be overlooked. Indeed, a simple MEDLINE search of the literature on schoolchildren corroborates this suspicion: a search for "heart healthy" and "nutrition" yielded 13 citations; a search for "drug abuse" and "prevention" yielded 651 citations. The rendered wisdom from current research objectives is that there is more promotion of (ineffective) drug abuse prevention programs than (effective) heart-healthy eating programs.
Consider the cumulative meta-analysis generated from the integration58 of interventions to reduce caregiver burden. The cumulative meta-analysis demonstrated that by the seventh wave in the database, stability for the negligible effect was attained, indicating no substantive changes to the accruing evidence that interventions do not reduce caregiver burden. However, 7 years after stability was established, 1 study60 set out recommendations for physicians to identify and intervene with overburdened caregivers. Its recommendations included the same educational, counseling, and respite-care services assessed in the primary-level studies integrated in Acton and Kang's58 meta-analysis.
Moreover, 8 years after stability for the negligible effect was attained, the US Department of Health and Human Services61 issued a preliminary report on governmental commitments to programs for independent living, including caregiver burden reduction programs. The report claims that "a growing body of evidence confirms that the provision of supportive services can diminish caregiver burden, [and] permit caregivers to remain in the workforce. . . ."61 The 2001 appropriations for the National Caregiver Support Program were $125 000 000.61 To date, we have been unable to determine that any appropriations have been dedicated for music therapy programs. The rendered wisdom from current research objectives is that there is more promotion of (ineffective) caregiver burden reduction programs than (effective) music therapy programs.
The examples above make it clear that research in public health can benefit from tools for determining when sufficient evidence has accrued to establish intervention efficacy. There are several valuable applications of this approach. For example, for research questions involving moderators, cumulative meta-analysis can be used to examine sufficiency and stability separately within levels of the moderator: the evidence from studies testing the intervention at 1 level of the moderator may demonstrate sufficiency, whereas studies testing another level of the moderator may not demonstrate sufficiency. Similarly, cumulative meta-analysis can be used to gauge the fit of public policy recommendations: despite the evidence that the effect of caregiver burden reduction levels off at zero, policy recommendations favor more funding. Finally, this approach may provide an empirically based benchmark against which funding proposals can be evaluated by granting agencies: proposals for new studies that use cumulative meta-analysis to document that current evidence for an intervention has not yet achieved stability stand as particularly valuable opportunities to invest time, effort, and resources. The failsafe ratio and cumulative slope can reveal information about an emerging phenomenon to help researchers make the best use of limited resources needed to advance the state of the science and improve public health.
| Footnotes |
|---|
Contributors
Both authors developed the conceptual perspective, analyzed the data, and wrote the article.
Human Participant Protection
No protocol approval was needed for this study.
Accepted for publication January 5, 2005.
| References |
|---|
2. Mullen B. Advanced BASIC Meta-Analysis. 2nd ed. Mahwah, NJ: Lawrence Erlbaum Associates. In press.
3. Mullen B, Rosenthal R. BASIC Meta-Analysis. Hillsdale, NJ: Lawrence Erlbaum Associates; 1985.
4. Rosenthal R. Meta-Analytic Procedures for Social Research. Newbury Park, CA: Sage; 1991.
5. McDonald HP, Garg AX, Haynes RB. Interventions to enhance patient adherence to medication prescriptions: scientific review. JAMA. 2002;288:2868–2879.
6. Peterson AM, Takiya L, Finley R. Meta-analysis of interventions to improve drug adherence in patients with hyperlipidemia. Pharmacotherapy. 2003;23:80–87.
7. Roter DL, Hall JA, Merisca R, Nordstrom B, Cretin D, Svarstad B. Effectiveness of interventions to improve patient compliance: a meta-analysis. Med Care. 1998;36:1138–1161.
8. Davis D, O'Brien MA, Freemantle N, Wolf FM, Mazmanian P, Taylor-Vaisey A. Impact of formal continuing medical education: do conferences, workshops, rounds, and other traditional continuing education activities change physician behavior or health care outcomes? JAMA. 1999;282:867–874.
9. Fichtenberg CM, Glantz SA. Effect of smoke-free workplaces on smoking behaviour: systematic review. BMJ. 2002;325:188–191.
10. Twenge JM, Campbell WK. Self-esteem and socioeconomic status: a meta-analytic review. Pers Soc Psychol Rev. 2002;6:59–71.
11. Schmidt FL. What do data really mean? Research findings, meta-analysis, and cumulative knowledge in psychology. Am Psychol. 1992;47:1173–1181.
12. Ku L, Sonenstein FL, Pleck JH. Factors influencing first intercourse for teenage men. Public Health Rep. 1993;108:680–694.
13. Eisen M, Zellman GL. Changes in incidence of sexual intercourse of unmarried teenagers following a community-based sex education program. J Sex Res. 1987;23:527–533.
14. Marsiglio W, Mott FL. The impact of sex education on sexual activity, contraceptive use and premarital pregnancy among American teenagers. Fam Plann Perspect. 1986;18:151–162.
15. Rosnow RL, Rosenthal R. Statistical procedures and the justification of knowledge in psychological science. Am Psychol. 1989;44:1276–1284.
16. Mullen B, Muellerleile P, Bryant B. Cumulative meta-analysis: a consideration of indicators of sufficiency and stability. Pers Soc Psychol Bull. 2001;27:1450–1462.
17. Cooper HM. The Integrative Research Review: A Social Science Approach. Beverly Hills, CA: Sage; 1984.
18. Light RJ, Pillemer DB. Summing Up: The Science of Reviewing Research. Cambridge, MA: Harvard University Press; 1984.
19. Johnson B, Mullen B, Salas E. A comparison of the three major meta-analytic approaches. J Appl Psychol. 1995;80:94–106.
20. Schmidt FL, Hunter JE. Comparison of three meta-analysis methods revisited: an analysis of Johnson, Mullen, & Salas (1995). J Appl Psychol. 1999;84:144–148.
21. Rosenthal R, Rubin DB. Interpersonal expectancy effects: the first 345 studies. Behav Brain Sci. 1978;3:410–415.
22. Rosenthal R, Rubin DB. Comment: assumptions and procedures in the file drawer problem. Stat Sci. 1988;3:120–125.
23. Antman EM, Lau J, Kupelnick B. A comparison of results of meta-analyses of randomized control trials and recommendations of clinical experts. JAMA. 1992;268:240–248.
24. Lau J, Antman EM, Jimenez-Silva J, Kupelnick B, Mosteller F, Chalmers TC. Cumulative meta-analysis of therapeutic trials for myocardial infarction. N Engl J Med. 1992;327:248–254.
25. Pogue JM, Yusuf S. Cumulating evidence from randomized trials: utilizing sequential monitoring boundaries for cumulative meta-analysis. Control Clin Trials. 1997;18:580–593.
26. Yusuf S, Held P, Furberg C. Update of effects of calcium antagonists in myocardial infarction or angina in light of the second Danish Verapamil Infarction Trial (DAVIT-II) and other recent studies. Am J Cardiol. 1991;67:12951297.[CrossRef][Web of Science][Medline]
27. DeProspero A, Cohen S. Inconsistent visual analysis of intrasubject data. J Appl Behav Anal. 1979;12: 573579.[CrossRef][Web of Science][Medline]
28. Furlong MJ, Wampold BE. Intervention effects and relative variation as dimensions in experts use of visual inference. J Appl Behav Anal. 1982;15:415421.[CrossRef][Web of Science][Medline]
29. Gottman JM, Glass GV. Analysis of interrupted time-series experiments. In: Kratochwill TR, ed. Single Subject Research: Strategies for Evaluating Change. New York, NY: Academic Press; 1978:197235.
30. Jones R, Weinrott M, Vaught R. Effects of serial dependency on the agreement between visual and statistical inference. J Appl Behav Anal. 1978;11: 277283.[CrossRef][Web of Science][Medline]
31. Tryon WW. A simplified times series analysis for evaluating treatment interventions. J Appl Behav Anal. 1982;15:423429.[CrossRef][Web of Science][Medline]
32. Ottenbacher KJ. Interrater agreement of visual analysis in single subject decisions: quantitative review and analysis. Am J Ment Retard. 1993;98: 135142.[Web of Science][Medline]
33. Chambers JM, Cleveland WS, Kleiner B, Tukey PA. Graphic Methods for Data Analysis. Belmont, CA: Wadsworth; 1983.
34. Cleveland WS. Elements of Graphing Data. Summit, NJ: Hobart Press; 1994.
35. Cleveland WS, McGill R. Graphical perception: theory, experimentation, and application to the development of graphical methods. J Am Stat Assoc. 1984;79:531-554.
36. Cleveland WS, McGill R. The many faces of a scatterplot. J Am Stat Assoc. 1984;79:807-822.
37. Mosteller F, Tukey JW. Data analysis, including statistics. In: Lindzey G, Aronson E, eds. The Handbook of Social Psychology. Vol 2. 2nd ed. Reading, MA: Addison-Wesley; 1968.
38. Tufte ER. Envisioning Information. Cheshire, CT: Graphics Press; 1990.
39. Tufte ER. Visual Explanations. Cheshire, CT: Graphics Press; 1997.
40. Tukey JW. Data-based graphics: visual display in the decades to come. Stat Sci. 1990;5:327-339.
41. Wainer H. Visual Revelations: Graphical Tales of Fate and Deception from Napoleon Bonaparte to Ross Perot. New York: Copernicus; 1997.
42. Cooper HM. Statistically combining independent studies: a meta-analysis of sex differences in conformity research. J Pers Soc Psychol. 1979;37:131-135.
43. Greenwald AG. Consequences of prejudice against the null hypothesis. Psychol Bull. 1975;82:1-20.
44. Hedges LV, Vevea JL. Estimating effect size under publication bias: small sample properties and robustness of a random effects selection model. J Educ Behav Stat. 1996;21:299-333.
45. Hojat M, Gonnella JS, Caelleigh AS. Impartial judgment by the "gatekeepers" of science: fallibility and accountability in the peer review process. Adv Health Sci Educ Theory Pract. 2003;8:75-96.
46. Olson CM, Rennie D, Cook D, et al. Publication bias in editorial decision making. JAMA. 2002;287:2825-2828.
47. Rosenthal R. The "file drawer problem" and tolerance for null results. Psychol Bull. 1979;86:638-641.
48. Beck CT. A meta-analysis of the relationship between postpartum depression and infant temperament. Nurs Res. 1996;45:225-230.
49. Herbert TB, Cohen S. Depression and immunity: a meta-analytic review. Psychol Bull. 1993;113:472-486.
50. Ito TA, Miller N, Pollock VE. Alcohol and aggression: a meta-analysis on the moderating effects of inhibitory cues, triggering events, and self-focused attention. Psychol Bull. 1996;120:60-82.
51. Sheeran P, Orbell S. Do intentions predict condom use? Meta-analysis and examination of six moderator variables. Br J Soc Psychol. 1998;37:231-250.
52. Sweeney PD, Anderson K, Bailey S. Attributional styles and depression: a meta-analytic review. J Pers Soc Psychol. 1986;50:974-991.
53. Hedges LV, Olkin I. Statistical Methods for Meta-Analysis. Orlando, FL: Academic Press; 1985.
54. McCain LJ, McCleary R. The statistical analysis of the simple interrupted time-series quasi-experiment. In: Cook TD, Campbell DT, eds. Quasi-Experimentation: Design and Analysis Issues for Field Settings. Chicago, IL: Rand McNally; 1979:233-293.
55. McArthur DB. Heart healthy eating behaviors of children following a school-based intervention: a meta-analysis. Issues Compr Pediatr Nurs. 1998;21:35-48.
56. White D, Pitts M. Educating young people about drugs: a systematic review. Addiction. 1998;93:1475-1487.
57. Koger SM, Chapin K, Brotons M. Is music therapy an effective intervention for dementia? A meta-analytic review of literature. J Music Ther. 1999;36:2-15.
58. Acton GJ, Kang J. Interventions to reduce the burden of caregiving for an adult with dementia: a meta-analysis. Res Nurs Health. 2001;24:349-360.
59. Healthy People 2010: Understanding and Improving Health. 2nd ed. Washington, DC: US Department of Health and Human Services; 2000.
60. Kasuya RT, Polgar-Bailey P, Takeuchi R. Caregiver burden and burnout: a guide for primary care physicians. Postgrad Med. 2000;108:119-123.
61. US Department of Health and Human Services. Delivering on the Promise: Preliminary Report. 2001. Available at: http://www.hhs.gov/newfreedom/prelim/caregive.html. Accessed on November 16, 2003.