Systematic Review and Meta-analysis
A meta-analysis combines the results of a number of different studies into a single, more precise estimate of an effect. Its benefits include reducing the danger of bias, making the decisions behind the final aggregate effect sizes transparent, and giving more precise estimates of the size of any effects uncovered \cite{Lalkhen_2008}. The required steps for the meta-analysis are described below:
1. Using the PICO framework (population, intervention or, in this case, exposure, comparison and outcome), the research question for the meta-analysis will be: "Compared with lower levels of exposure to air pollutants, what is the risk of acute myocardial infarction-related hospitalisation for people who are exposed to higher levels of air pollutants?"
2. A systematic review of the literature will primarily use the Embase and Medline databases to search for relevant studies, applying Boolean logic and fuzzy logic \cite{Tuttle_2009} search principles. The eligibility criteria will focus on studies analyzing the association between exposure to one or more air pollutants and hospital admissions or emergency department visits for acute myocardial infarction. Relevant studies should involve the general population and may be of any study design. Studies must be in the English language and must have been published within the last 10 years. Key words for this search include air pollution, acute myocardial infarction, effect size and relative risk, among others. Particulars of the search strategy are shown in Appendix E; a total of 152 relevant studies were found. This number may be reduced after removing duplicates and other studies that do not fit the elements described in the research question for the meta-analysis.
3. The following exclusion criteria will then be applied when screening the titles and abstracts retrieved in step 2, in order to decide which full texts to retrieve. A study will be excluded if:
a. The article is published in a language other than English and cannot be translated
b. The article is irrelevant to the study question
c. The article does not have the relevant population
d. The article is a duplicate of another study
e. The article does not discuss the outcome that is of interest to this research
f. The article does not have a relevant comparison group
4. The full texts of the studies retained in step 3 will be read thoroughly to decide whether any should be excluded further. Their reference lists will also be used to identify other studies and, in this way, iteratively build a database of studies to be included in the meta-analysis.
5. This step involves abstracting information from the individual studies included in the meta-analysis. This will include the authors, the year of the study, the population studied, the sample size, the study type, the effect size or point estimate, the 95% confidence interval and the p-value.
6. Using the information collected in step 5, a forest plot will be drawn to display graphically the distribution of the effect sizes of the different studies.
7. In this step, the information obtained in step 5 will first be used to test the heterogeneity of the studies. The test uses a variation of the chi-square test based on the pooled estimate, the effect estimate of each individual study, and the number of studies. It is referred to as the Q statistic; its associated p-value will be noted and evaluated at the 0.05 level against the null hypothesis. The null hypothesis will be that "the results of the studies are similar to each other, or that there is no difference between the results of the studies included in the meta-analysis". The alternative hypothesis is that "the results of the studies differ from each other". If the p-value is below 0.05, the null hypothesis is rejected and the studies are treated as heterogeneous; otherwise the null hypothesis is not rejected and the studies are treated as homogeneous. If the studies are homogeneous, their results will be pooled and two types of estimate reported: (1) a fixed effects estimate, based on the assumption that the studies included in the research form an exhaustive set of studies; and (2) a random effects estimate, which assumes that the studies included in the analysis form a random sample of 'all possible studies'.
8. In this step, the pooled estimate described in step 7 will be calculated and the effect estimate reported.
9. In this step, I will test for publication bias. Publication bias arises because smaller studies and those with "equivocal estimates" (that is, estimates that are inconclusive or negative) are less likely to be published, and therefore less likely to be captured in this meta-analysis, than studies that are large and have significant findings. Plotting the variance of each study's estimate (the variance of the effect estimate of a study is a function of its sample size) against the effect estimate itself produces a cloud of points that may define a funnel. This is referred to as the funnel plot. The base of the funnel is formed by studies that are small (hence large variance), whose effect estimates scatter widely around the point estimate; the apex of the funnel is formed by studies that are large (hence low variance), whose estimates cluster around the point estimate obtained in the meta-analysis. If part of the funnel is missing, this suggests publication bias. Other tests, such as Egger's test, can report the extent of publication bias statistically. Egger's test is a formal statistical test for assessing funnel plot asymmetry. It fits a regression line between the precision of the studies (independent variable) and the standardized effect (dependent variable). When there is no publication bias, the regression line crosses the Y-axis at zero. If the intercept is much further away from zero, this is further evidence of publication bias \cite{molina2012}. Little can be done to remedy publication bias other than searching for 'fugitive literature' and contacting research groups and others who may hold small, unpublished studies, or obtaining the raw data from different sources.
10. In this final step, I will carry out meta-regression or subgroup analyses. I will divide the data into subgroups and analyse each separately using a regression model. I will test whether the estimates differ between developing and developed countries, and between different types of source apportionment. Source apportionment refers to the fact that different sources contribute differently to air pollution. For example, do sources such as vehicle exhausts lead to higher admission rates than, say, coal-burning plants? Is the association between PM10 and hospital admissions different for developing countries than for developed countries?
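The calculations in steps 7 to 9 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code for this study. It assumes each study supplies an effect on the log relative risk scale with a standard error. It uses inverse-variance weights and the DerSimonian-Laird estimator for the between-study variance, which are assumed choices that the text above does not specify. The function names `pool_effects` and `egger_regression` and the input values are hypothetical. The p-value for Q would come from a chi-square distribution with k - 1 degrees of freedom (for example `scipy.stats.chi2.sf(q, df)`); it is omitted here to keep the sketch dependency-free.

```python
import math


def pool_effects(effects, std_errors):
    """Inverse-variance pooling of per-study effects (e.g. log relative risks)."""
    weights = [1.0 / se ** 2 for se in std_errors]  # fixed-effect weights
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # DerSimonian-Laird between-study variance (assumed choice of estimator).
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each study's own variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    total_re = sum(re_weights)
    random_eff = sum(w * y for w, y in zip(re_weights, effects)) / total_re

    return {
        "fixed": fixed,
        "fixed_se": math.sqrt(1.0 / total_w),
        "q": q,
        "df": df,
        "tau2": tau2,
        "random": random_eff,
        "random_se": math.sqrt(1.0 / total_re),
    }


def egger_regression(effects, std_errors):
    """Regress standardized effect on precision; return (intercept, slope)."""
    precision = [1.0 / se for se in std_errors]
    standardized = [y / se for y, se in zip(effects, std_errors)]
    n = len(effects)
    mean_x = sum(precision) / n
    mean_z = sum(standardized) / n
    sxx = sum((x - mean_x) ** 2 for x in precision)
    sxz = sum((x - mean_x) * (z - mean_z)
              for x, z in zip(precision, standardized))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    return intercept, slope


if __name__ == "__main__":
    # Hypothetical inputs for illustration only; not data from any study.
    log_rr = [0.05, 0.12, 0.02, 0.20]
    se = [0.02, 0.05, 0.03, 0.10]
    print(pool_effects(log_rr, se))
    print(egger_regression(log_rr, se))
```

An Egger intercept close to zero is consistent with a symmetric funnel. A full Egger's test would also test whether the intercept differs significantly from zero, which this sketch leaves out.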
Linear Regression and Modelling
Linear regression is suitable when time series data suggest a simple linear increasing or decreasing trend on a plot of the data against time \cite{Gilbert_1988}. The overall idea of regression is, first, to examine whether a set of predictor variables (criteria pollutants) predicts an outcome (risk of acute myocardial infarction) well enough and, second, to identify which of the variables are significant predictors of the outcome variable. Regression also quantifies the magnitude of each predictor variable's effect on the outcome variable \cite{statisticssolutions2013}. In addition, I will need to set up a model using this regression method to impute hospital admission rates due to acute myocardial infarction and the pollution data.
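As a rough illustration of the regression idea, the sketch below fits an ordinary least squares line with a single predictor and reports the slope, intercept and R-squared. It is a simplified sketch rather than the model proposed here: the actual analysis would involve several criteria pollutants, and the function name `fit_line` and the input values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R-squared: share of the variation in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical pollutant concentrations and admission rates,
    # for illustration only; not data from this study.
    pm10 = [20.0, 25.0, 30.0, 35.0, 40.0]
    admissions = [4.1, 4.6, 5.2, 5.5, 6.3]
    slope, intercept, r_squared = fit_line(pm10, admissions)
    print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.3f}")
```

The slope gives the estimated change in the outcome per unit change in the predictor, which corresponds to the magnitude of effect described above.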