Introduction

The purpose of this project is to find better ways to predict medium-term climate, meaning 1-5 years into the future. To do so, we will look at climate indexes, which are measurements of the current state of the climate, such as El Nino/La Nina, the North Atlantic Oscillation, and the Pacific Decadal Oscillation. Indexes of this kind, many of them based on ocean surface temperature, have been used by meteorologists for decades. If we can find clusters/states of the climate that correlate better with specific locations, we could create a useful tool for weather prediction, expressed in terms of probability distributions.
It is understood that macro-climatic conditions can affect the microclimate at specific locations. These climate teleconnections are critical for evaluating climate change and its impact. Moreover, they can give insight into the predictability of climate. There are multiple ways of approaching this problem. Our goal is to see how we can cluster the climate in order to improve the correlation with specific locations.
Take El Nino as an example. Multiple indexes (Nino 1, 2, 3, and 4) are defined by the ship tracks that cross that part of the Pacific. The Oceanic Nino Index (ONI), the official one used by NOAA, is based on sea surface temperature in the Nino 3.4 region, which overlaps regions 3 and 4: an event is declared when the three-month running average of the anomaly reaches $\pm 0.5\,^{\circ}$C for 5 consecutive overlapping three-month periods. As we can see, the definition is arbitrary. If you are in Latin America or in India, both places highly affected by El Nino, you could build a set of indexes that is locally relevant (highly correlated with local weather), and even create a different set of indexes for each part of the world. The data already exist as historical daily sea surface temperature records, which we can correlate with precipitation and temperature on land.
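
To make the threshold rule above concrete, here is a minimal sketch in Python of how an ONI-style classification could be computed from a monthly anomaly series. It is an illustration only, not NOAA's operational procedure: the function names `running_mean_3` and `flag_events` are hypothetical, the anomaly values are made up, and the computation of the anomalies themselves (base period, region averaging) is assumed to have been done already.

```python
from statistics import mean


def running_mean_3(anomalies):
    """Centered three-month running mean; the first and last months are dropped."""
    return [mean(anomalies[i - 1:i + 2]) for i in range(1, len(anomalies) - 1)]


def flag_events(smoothed, threshold=0.5, min_run=5):
    """Label each smoothed value as 'nino', 'nina', or 'neutral'.

    A value counts as part of an event only if it belongs to a run of at
    least `min_run` consecutive values at or beyond the threshold.
    """
    labels = ["neutral"] * len(smoothed)
    for name, sign in (("nino", 1), ("nina", -1)):
        start = None
        # A trailing NaN never passes the comparison, so it closes any open run.
        for i, value in enumerate(list(smoothed) + [float("nan")]):
            if sign * value >= threshold:
                if start is None:
                    start = i
            else:
                if start is not None and i - start >= min_run:
                    labels[start:i] = [name] * (i - start)
                start = None
    return labels


# Made-up monthly anomalies in degrees C, for illustration only.
monthly = [0.0, 0.6, 0.8, 0.9, 1.0, 0.9, 0.7, 0.6, 0.0, -0.2]
print(flag_events(running_mean_3(monthly)))
```

Changing `threshold`, `min_run`, or the region over which the anomaly is averaged yields a different index, which is exactly the freedom this project proposes to exploit.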

Goal

Data mine the available global climate data for better indexes that correlate with yearly weather patterns. If we can find states of the macro-climate that correlate with local weather patterns, we will consider the project a success and a potential tool for downscaling general circulation model (GCM) output.
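
As a rough illustration of what "correlate with local weather patterns" could mean in practice, the sketch below computes the Pearson correlation between a candidate index and a local series at several lead times. The helper names `pearson` and `lagged_correlations` and the toy series are assumptions made for this example; a real analysis would also need significance testing and a treatment of autocorrelation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlations(index, local, max_lag):
    """Correlate index[t] with local[t + lag] for lag = 0..max_lag.

    Both series must have the same length and time step.
    """
    result = {}
    for lag in range(max_lag + 1):
        x = index[:len(index) - lag]  # index values that have a partner
        y = local[lag:]               # local values `lag` steps later
        result[lag] = pearson(x, y)
    return result


# Toy data: the local series repeats the index two steps later.
index = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
local = [0.0, 0.0] + index[:-2]
correlations = lagged_correlations(index, local, max_lag=3)
best_lag = max(correlations, key=lambda lag: abs(correlations[lag]))
print(best_lag, correlations[best_lag])
```

A candidate index would be judged "better" for a location if it reaches a higher absolute correlation, at a useful lead time, than the standard indexes do.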

Data

NOAA Merged Land Ocean Global Surface Temperature Analysis dataset: global land and ocean temperature at monthly frequency and 5-degree resolution.

Other global gridded surface temperature datasets are available as alternatives, in particular:

        NCEP Global Ocean Data Assimilation System: monthly data at high resolution ($<1$ degree)

        NCEP/DOE Reanalysis II: daily data from a reanalysis that incorporates observed and simulated data

Literature Review

\cite{Steinhaeuser_2011} This paper focuses on "the extraction of ocean climate indices from observed climatological data. In this case, it is possible to quantify the relative performance of different methods. Specifically, we propose to extract indices with complex networks constructed from climate data, which have been shown to effectively capture the dynamical behavior of the global climate system, and compare their predictive power to candidate indices obtained using other popular clustering methods. Our results demonstrate that network-based clusters are statistically significantly better predictors of land climate than any other clustering method, which could lead to a deeper understanding of climate processes and complement physics-based climate models." We will build in part upon this paper, using GCMs for validation.

\cite{Chikamoto_2017} We spoke with Chikamoto about the memory of the oceans, which offers potential predictability of up to 3 years. This suggests that mining ocean temperature data, in this case correlated with soil moisture, is highly predictive. The problem with precipitation is that it is too noisy, so some smoothing would be necessary.
 
\cite{Steinbach_2003} This paper attempts to find new climate indexes through singular value decomposition (SVD). The approach is very intuitive, and the paper provides clear background and explanation.
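
To show the general idea of extracting candidate indexes with SVD, here is a generic sketch using NumPy. It is not the specific procedure of the paper cited above: the function `leading_modes`, the synthetic field, and the choice to remove only the time mean at each location are assumptions made for illustration. Each right singular vector is a spatial pattern, and the matching scaled left singular vector is a time series that could serve as a candidate index.

```python
import numpy as np


def leading_modes(field, n_modes=2):
    """Return the leading SVD modes of a (n_times, n_locations) array.

    Returns (patterns, time_series, variance_fraction), where patterns has
    shape (n_modes, n_locations) and time_series has shape (n_times, n_modes).
    """
    anomalies = field - field.mean(axis=0)  # remove each location's time mean
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    variance_fraction = s**2 / np.sum(s**2)
    time_series = u[:, :n_modes] * s[:n_modes]  # candidate index per mode
    patterns = vt[:n_modes]                     # spatial loading per mode
    return patterns, time_series, variance_fraction[:n_modes]


# Synthetic example: one oscillating signal spread over five locations plus noise.
rng = np.random.default_rng(0)
months = np.arange(120)
signal = np.sin(2 * np.pi * months / 48)
loading = np.array([1.0, -1.0, 0.5, 0.0, 2.0])
field = np.outer(signal, loading) + 0.01 * rng.standard_normal((120, 5))

patterns, time_series, fraction = leading_modes(field)
print(fraction)
```

On real gridded data, the rows would be months and the columns grid cells, usually after removing the seasonal cycle and weighting by grid cell area, neither of which is done in this sketch.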