(Source: https://upload.wikimedia.org/wikipedia/commons/thumb/b/b8/Simple_Confounding_Case.svg/1200px-Simple_Confounding_Case.svg.png)
The above diagram shows that, in the association between X and Y, Z is a confounding variable. As you can see, Z is associated with both X and Y, but Z does not lie on the causal pathway that links X and Y. For example, in the Nordic countries, Eero Pukkala and colleagues studied airline pilots, who were considered to be at high risk of developing skin cancer because they flew over the poles and were exposed to high-energy cosmic rays \cite{Pukkala_2002}. If you were to study this association, that is, flying as an occupation and the risk of death from cancer, you would have to include gender as a confounding variable: men are more likely to fly transcontinental aircraft, and men are also more likely to die from cancer.
You can control for confounding in many different ways. Before beginning a study, you can identify potential confounding variables and then restrict participation so that a potential confounding variable is held constant. In the airline pilot and cancer study, for example, you would consider only men. You could also match on specific variables. In the case of airline pilots and the risk of cancer deaths, for example, you could take the same number of men and women, or the same proportion of men and women, in the comparison arms. You can also control for potential confounding variables during analysis by conducting a stratified analysis or a multivariable analysis. We will examine these approaches in the section on data analysis for epidemiology.
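To make the idea of stratification concrete, here is a minimal sketch in Python. All counts are invented purely for illustration and do not come from the Pukkala study or any other data set; the names `strata`, `risk_ratio`, and `pooled` are hypothetical. The numbers are chosen so that men are both more likely to be pilots and more likely to die, while the occupation itself has no effect within either sex.

```python
# Hypothetical counts, invented for illustration only (not from any study).
# Each entry: stratum -> {exposure group: (deaths, people)}
strata = {
    "men":   {"pilots": (80, 800), "non_pilots": (20, 200)},
    "women": {"pilots": (4, 200),  "non_pilots": (16, 800)},
}


def risk_ratio(exposed, unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    (d1, n1), (d0, n0) = exposed, unexposed
    return (d1 / n1) / (d0 / n0)


def pooled(group):
    """Sum deaths and people for one exposure group across all strata."""
    deaths = sum(s[group][0] for s in strata.values())
    people = sum(s[group][1] for s in strata.values())
    return deaths, people


# Crude analysis: ignore sex and compare all pilots with all non-pilots.
crude_rr = risk_ratio(pooled("pilots"), pooled("non_pilots"))

# Stratified analysis: compare pilots with non-pilots separately within each sex.
stratum_rr = {
    name: risk_ratio(s["pilots"], s["non_pilots"]) for name, s in strata.items()
}

print(f"crude RR: {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"RR among {name}: {rr:.2f}")
```

With these invented numbers the crude risk ratio is about 2.3, while the risk ratio within each sex is 1.0. The apparent association in the crude comparison is produced entirely by the unequal distribution of men and women between the two groups, which is what confounding by gender would look like.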

Hill's Criteria or Considerations

In exploring the relationship between an exposure and a health outcome, if you can rule out the play of chance, eliminate biases before or at the stage of study design, and control for confounding variables either at the stage of study design or during the analysis of data, then you have established that the exposure has a valid association with the outcome. But what will help us decide whether this association is one of cause and effect?
There is no easy answer to this question. One approach is to consider several criteria and use them to judge whether the association might be one of cause and effect. This type of association is important because, if you know that X is a cause of Y, you can state that eliminating, removing, or controlling X will lead to some reduction in Y as well. This is the basis of the precautionary principle used in Environmental and Occupational health. The precautionary principle states that even if we do not know the full extent of an association, if we know that some exposure will lead to an adverse health outcome, then precautions can be taken to remove, restrict, or eliminate that exposure to protect human health.
It would be easy if we could adopt a list of items to check off. Something along these lines was provided by the British occupational health statistician Sir Austin Bradford Hill in 1965, and these are referred to as Hill's Criteria \cite{Hill_2015}. Hill stated a number of "considerations" for assessing whether an association might be one of cause and effect. These include:
  1. Strength of Association. -- The stronger an association is, the more likely it is to be one of cause and effect (we will review this concept in the next section).
  2. Specificity. -- If X is a cause of Y, then there should be a one-to-one association between X and Y. This is not always true.
  3. Consistency. -- If X is a cause of Y, then this association should be observed in different situations, under different conditions, and across different studies.
  4. Temporality. -- If X is a cause of Y, then X must precede Y. This is a reasonable and valid requirement, and indeed it is a necessary condition for X to be a cause of Y.
  5. Biological Gradient. -- In Environmental and Occupational health we refer to this as the dose-response gradient. By this, we mean that if X is a cause of Y, then as the dose or the amount of exposure to X increases, there will be a corresponding increase in the outcome Y. This association is sometimes linear, but it does not have to be. We will review this concept when we study environmental health risk assessment.
So, this summarises the concepts of causation and causal inference. Causality is intuitive, and in the health sciences a health outcome usually has more than one cause. You can think of causes as models in which many causes interact biologically to form what is referred to as a "sufficient cause"; each cause in a sufficient cause is a component cause, and a component cause that has to be present in every sufficient causal model is referred to as a necessary cause. In order for an exposure E to be a cause of a health outcome O, E must first have a statistically valid association with O. This is established by ruling out chance, eliminating biases, and controlling for confounding variables. Finally, whether such an association is one of cause and effect is judged by weighing the considerations outlined by Sir Austin Bradford Hill, referred to as Hill's criteria.
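The relationship between sufficient, component, and necessary causes can be sketched with sets. This is only an illustration: the letters are placeholder component causes, not real exposures, and the variable names are hypothetical.

```python
# Hypothetical sufficient causes; each set lists placeholder component causes.
sufficient_causes = [
    {"A", "B", "C"},
    {"A", "D"},
    {"A", "B", "E"},
]

# A component cause is any factor that appears in at least one sufficient cause.
component_causes = set().union(*sufficient_causes)

# A necessary cause appears in every sufficient cause, i.e. the intersection.
necessary_causes = set.intersection(*sufficient_causes)

print("component causes:", sorted(component_causes))
print("necessary causes:", sorted(necessary_causes))
```

In this sketch, A is the only necessary cause because it is the only component shared by all three sufficient causes. B is a component cause of two of them but is not necessary.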

Measurement in Epidemiology

We will now discuss the different measures we use in Epidemiology. There are measures of disease occurrence and measures of association. We discuss three measures of disease occurrence -- incidence, prevalence, and standardised rates. We will discuss two measures of exposure-outcome association -- relative and absolute rates, and odds ratios. 

Concepts of Incidence and Prevalence

Prevalence is defined as the proportion of a population that has a particular disease or health condition. Prevalence is given by the following formula:
\(\text{Prevalence} = \frac{\text{Number of individuals with disease}}{\text{Total population}}\), multiplied by a base population such as 1000 or 10,000.
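The formula above can be sketched as a small Python function. The function name `prevalence`, its parameters, and the counts used in the example are assumptions made for illustration; they are not taken from any study.

```python
def prevalence(cases: int, population: int, per: int = 1000) -> float:
    """Return prevalence expressed per `per` people in the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 <= cases <= population:
        raise ValueError("cases must lie between 0 and population")
    return cases / population * per


# Hypothetical counts, for illustration only: 40 cases among 5000 people.
example = prevalence(cases=40, population=5000)
print(f"{example:.1f} per 1000")
```

Expressing the result per 1000 (or per 10,000) people puts populations of different sizes on a common scale, so that they can be compared directly.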
Consider two towns, A and B, with populations of 1000 and 2000 respectively. Let's say it is reported that 15 people in town A and 20 people in town B have diarrhoea and E. coli infection from contaminated drinking water. Which town has more disease? If we do not know the total population of each town, it may appear that town B has more disease because more people are affected there, but once we consider the population of each town, we see a different situation: