\documentclass[11pt,a4paper]{article}
\usepackage{fullpage}
\usepackage{setspace}
\usepackage{parskip}
\usepackage{titlesec}
\usepackage[section]{placeins}
\usepackage{xcolor}
\usepackage{breakcites}
\usepackage{lineno}
\usepackage{hyphenat}
\PassOptionsToPackage{hyphens}{url}
\usepackage[colorlinks = true,
linkcolor = blue,
urlcolor = blue,
citecolor = blue,
anchorcolor = blue]{hyperref}
\usepackage{etoolbox}
\makeatletter
\patchcmd\@combinedblfloats{\box\@outputbox}{\unvbox\@outputbox}{}{%
\errmessage{\noexpand\@combinedblfloats could not be patched}%
}%
\makeatother
\usepackage[round]{natbib}
\let\cite\citep
\renewenvironment{abstract}
{{\bfseries\noindent{\abstractname}\par\nobreak}\footnotesize}
{\bigskip}
\titlespacing{\section}{0pt}{*3}{*1}
\titlespacing{\subsection}{0pt}{*2}{*0.5}
\titlespacing{\subsubsection}{0pt}{*1.5}{0pt}
\usepackage{authblk}
\usepackage{graphicx}
\usepackage[space]{grffile}
\usepackage{latexsym}
\usepackage{textcomp}
\usepackage{longtable}
\usepackage{tabulary}
\usepackage{booktabs,array,multirow}
\usepackage{amsfonts,amsmath,amssymb}
\providecommand\citet{\cite}
\providecommand\citep{\cite}
\providecommand\citealt{\cite}
% You can conditionalize code for latexml or normal latex using this.
\newif\iflatexml\latexmlfalse
\providecommand{\tightlist}{\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}%
\AtBeginDocument{\DeclareGraphicsExtensions{.pdf,.PDF,.eps,.EPS,.png,.PNG,.tif,.TIF,.jpg,.JPG,.jpeg,.JPEG}}
\usepackage[utf8]{inputenc}
\usepackage[english]{babel}
\usepackage{float}
\begin{document}
\title{Social interventions connect to the origin of the population}
\author[1]{Samuel Schiess}%
\author[1]{Joschka Geissler}%
\affil[1]{EPFL}%
\vspace{-1em}
\date{\today}
\begingroup
\let\center\flushleft
\let\endcenter\endflushleft
\maketitle
\endgroup
\sloppy
\section*{Introduction}
{\label{809642}}
The world's human population is steadily increasing. In developed
countries, more than 70\,\% of the population lives in cities or
urban areas~\cite{Bettencourt2007}. As the population grows,
humankind faces new challenges and existing problems become worse.
Besides famine, water scarcity and increased energy demand,
socially problematic situations arise when many people
live in the same place~\cite{Glaeser_1999}. A high population density
also brings psychological problems. Social
pressure rises and competition increases in every domain, because
many people living close to each other specialize in similar disciplines. At
the same time, people need to choose a lifestyle, find their own way of
living and take responsibility for their own well-being. This pressure and
responsibility can end in psychological problems, which can lead
to domestic violence, self-hatred and even suicide attempts. When actions
break the law, the police are obliged to intervene. In our
digitized world, such social interventions are recorded like everything
else.

In addition to population growth and its consequences, the mixing of
cultures and nationalities all over the world is becoming more important. It brings new
possibilities and advantages, but also new problems. It is commonly claimed that
foreign cultures often deal with psychological
illness, health, crime and violence differently than the native culture in, for
example, Western Europe. This may be true in some cases, but a
connection between social problems and the percentage of foreigners in
specific areas has never been clearly established~\cite{Entorf_2000}.

In the case study of the municipality of Vernier, we have access to data
on police interventions for social reasons and on the demographic
properties of the area under study. The relationship between these two variables
is often used as a political argument by right-wing parties; this paper
investigates it more closely. Our study area
is the city of Vernier, with 768 hectares and 35\,300
inhabitants, situated in the Canton of Geneva and thus part of the
metropolitan area of the city of Geneva~\cite{officiel}. The main
focus is the spatial correlation between the
two variables. In addition, other statistical measures are
provided, based on the given data set and area.
\par\null
\section*{Hypothesis}
First, we investigate the existence and the
properties of hotspots in the region of Vernier for two
variables: social interventions by the police and the origin of the
population. Second, we investigate the relation between these two variables
and their spatial correlation. We expect
that both variables are spatially structured
and form hotspots, and that the positions of the hotspots, and thus the
variables, are positively correlated with each other.
\par\null
\section*{Data}
We use population data for the year 2015 from STATPOP, which is surveyed annually
by the Federal Statistical Office of Switzerland for the
households of the country~\cite{2017a}. From these data, we used
the total population and the numbers of
Swiss, non-Swiss and non-European inhabitants. The second
data set is published by the cantonal police department of
Geneva. It contains all police interventions for social
reasons in the area under study from 2014 to 2017~\cite{sitg}.
\section*{Methods}
All spatial and statistical analyses were performed with the
software GeoDa~\cite{projecta}. QGIS (version
2.18)~\cite{project} was used for the visualization.
With the two data sets described above, we performed a first
statistical analysis following the processing chain shown
in Figure~1. The first step was to create a grid over the study area of
Vernier with a cell size of 100\,m $\times$ 100\,m. For each cell, the
interventions were summed up and divided by the total
population of that cell. Additionally, the number
of Swiss, non-Swiss and non-European inhabitants was computed for
each cell. To obtain the percentage of each origin, these numbers
were also divided by the total population of
the cell. With the resulting variables, we performed a first
linear regression (non-Swiss/interventions, non-European/interventions).
In addition to the linear regression, we computed the raw rate:
the number of interventions per inhabitant
was divided by the percentage of non-Swiss and of non-European
inhabitants within each cell.
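
The per-cell quantities described above can be written compactly. The symbols are introduced here for illustration only and are not taken from the software: for cell $i$, $I_i$ is the number of interventions, $P_i$ the total population and $F_i$ the number of inhabitants of the origin group considered (non-Swiss or non-European).
\begin{equation*}
r_i = \frac{I_i}{P_i}, \qquad
f_i = \frac{F_i}{P_i}, \qquad
q_i = \frac{r_i}{f_i} = \frac{I_i}{F_i}
\end{equation*}
Here $r_i$ is the intervention rate, $f_i$ the share of the origin group (multiplied by 100 to obtain a percentage) and $q_i$ the ratio computed in the last step. All three are defined only for cells with $P_i > 0$, and $q_i$ additionally requires $F_i > 0$. Assuming the intervention rate is taken as the dependent variable, the linear regression fits $r_i = a + b\,f_i + \varepsilon_i$ over the populated cells.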
\par\null\selectlanguage{english}
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/Bild1/Bild1}
\caption{{Processing chain for Raw Rate and Linear Regression
{\label{186839}}%
}}
\end{center}
\end{figure}
To find the so-called hotspots of our variables, we
performed a spatial analysis. The workflow for this spatial analysis is
shown in Figure~2.\selectlanguage{english}
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/Bild2/Bild2}
\caption{{Processing chain for Spatial Rate Smoothing
{\label{302019}}%
}}
\end{center}
\end{figure}
First, a second-order queen contiguity weights file (Queen's 2) was
calculated for the grid of Vernier.
Figure~3a illustrates the idea of this weights file. Each cell at a distance
of at most two cells (green) is taken into account for the computation
of the value of the cell of interest (red). The values of the white
cells have no influence on the calculation of the red value. For
instance, to calculate the spatial rate smoothed value for
the red cell, we divide the sum of the first variable (e.g.\
number of interventions) over all green cells by the sum of
the second variable (total population) over all green cells. Several
spatial rate smoothed (SRS) rates were computed: the percentage of
non-Swiss inhabitants per population, the percentage of non-European
inhabitants per population, and the number of interventions per population. All
results were plotted, visually analysed, discussed and compared to
the raw rate, which has not been spatially smoothed.
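
The smoothing step can be sketched as a formula. The notation is assumed for illustration: $(a_i, b_i)$ are the row and column indices of cell $i$ in the grid, $N_i$ is the set of cells that enter the computation for cell $i$ (the green cells in Figure~3a), $I_j$ is the number of interventions and $P_j$ the total population of cell $j$.
\begin{equation*}
N_i = \bigl\{\, j : \max\bigl(|a_i - a_j|,\, |b_i - b_j|\bigr) \le 2 \,\bigr\},
\qquad
\tilde{r}_i = \frac{\sum_{j \in N_i} I_j}{\sum_{j \in N_i} P_j}
\end{equation*}
On a regular grid this window contains at most $5 \times 5 = 25$ cells. Whether the cell of interest itself is counted in $N_i$, as written here, depends on the software settings and is an assumption of this sketch. The same form applies to the origin shares, with the group count in the numerator. Note that $\tilde{r}_i$ is a ratio of sums, not the mean of the per-cell rates $I_j / P_j$, so sparsely populated cells carry little weight.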
\par\null\selectlanguage{english}
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.42\columnwidth]{figures/Bild3/Bild3}
\caption{{Spatial Analysis; a) Concept of the Queen's 2 weight file; b) Concept of
the Local Moran's I
{\label{243702}}%
}}
\end{center}
\end{figure}
To investigate dependencies between the variables, a Local Moran's
I analysis was performed in addition to the raw and the spatially
smoothed rates. The Local Moran's I makes it possible to examine
spatially distributed attributes in their local context. As we are interested in
the connection between the interventions and the origin of the
population, we chose the bivariate method. It compares
the value of one variable in a cell with the values of the other variable
in its neighbors and thus indicates the
correlation of the two variables. When working
with the \emph{local} Moran's I, the value of each cell is related to its
specific location and not to the global context.

Local Indicators of Spatial Association (LISA) measure the association
for each spatial unit and identify the type of spatial correlation.
For instance, the bivariate Local Moran's I indicates the
sign of the linear association (positive or negative) between the
value of one variable at a given location and the averaged value
of another variable at neighboring locations~\cite{mirzaie2013a}.
As Figure~3b shows, cells with relatively high values for both variables
end up in the top-right quadrant ``HH''. Cells with relatively low
values for both variables end up in the ``LL'' quadrant (positive
association). Cells with a low value for one variable and a high value for
the other are located in the ``LH'' or ``HL'' quadrant, respectively (negative
association). If the association for a cell is not statistically significant
at the chosen threshold ($p$-value), the cell is categorized as not
significant.
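
In formula form, a common way to write the bivariate Local Moran's I is the following. This is a sketch in standard textbook notation and makes no claim about the exact scaling used by the software: $z_{x,i}$ and $z_{y,j}$ are the standardized values (mean 0, variance 1) of the first variable in cell $i$ and of the second variable in cell $j$, and $w_{ij}$ are the Queen's 2 weights, assumed here to be row-standardized.
\begin{equation*}
I_i^{xy} = z_{x,i} \sum_{j} w_{ij}\, z_{y,j},
\qquad
\sum_{j} w_{ij} = 1, \quad w_{ii} = 0
\end{equation*}
The sum is the spatial lag of the second variable, i.e.\ its average over the neighbors of cell $i$. A positive $I_i^{xy}$ corresponds to the ``HH'' and ``LL'' quadrants of Figure~3b, where both factors have the same sign, and a negative value to ``LH'' and ``HL''. Significance is usually assessed with a permutation test that reshuffles the values over the cells.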
\par\null
\section*{Results}\label{results}
\par\null
To identify hotspots, the variables were first mapped
over the area. To capture neighborhood effects and
to apply a smoothing filter, the SRS was computed for both variables
using the Queen's 2 weights file. To allow a direct comparison of
raw and smoothed data, the maps are shown side by side.
The cells are grouped into five quantile classes, with the darkest being
the highest and the lightest being the lowest class. Figure~4 shows
the interventions per population, while Figure~5 shows the
percentage of foreigners per cell.
\par\null\par\null\selectlanguage{english}
\begin{figure}[H]
\begin{center}
\includegraphics[width=1.00\columnwidth]{figures/merge-interventions1/merge-interventions1}
\caption{{Interventions divided by the population of each cell in the
municipality of Vernier. The left map shows the
raw data without treatment, while the right map shows
the data smoothed with SRS. For the raw data, the legend gives
the number of interventions per person per cell. The smoothed data
on the right are on a different scale and should be interpreted relative to
their own values rather than compared directly to the raw data.
{\label{755511}}%
}}
\end{center}
\end{figure}\selectlanguage{english}
\begin{figure}[H]
\begin{center}
\includegraphics[width=1.00\columnwidth]{figures/merge-interventions/merge-foreigners}
\caption{{Percentage of foreigners living in each cell of the municipality of
Vernier. The left map shows the raw data, while the
right map shows the smoothed data with the two hotspots in the
northern part. For the raw data, the legend gives the percentage of the total
population. The smoothed data on the right are on a different scale and
should be interpreted separately.
{\label{373262}}%
}}
\end{center}
\end{figure}
\par\null
To investigate the relation between the two variables, we first used
common statistical methods, namely linear regression. Two scatterplots
are presented below.
The first shows the regression between the percentage of foreigners and
the interventions, with its statistics, for the raw data, while the second
shows the interventions against the percentage of Swiss
inhabitants for the data smoothed with SRS and the Queen's 2
weights file. In Figure~6, the ten points with a 100\,\% foreign population are
selected, and the blue line shows the regression without those
points. In the second scatterplot, Figure~7, the cells with
less than 50\,\% Swiss citizens are selected, and the blue line shows the
regression when these cells are ignored.
\par\null\par\null\selectlanguage{english}
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/Scatter-PerSwiss-IntperPop-SRC/Scatter-PerFor-IntperPop-RAW}
\caption{{Regression line between the interventions for social
reasons per population and the percentage of foreigners in the cell for
the raw data. The violet line and points show the regression for the whole
data set, while the blue line shows the same regression when ignoring all
points with 100\,\% foreigners.
{\label{111243}}%
}}
\end{center}
\end{figure}\selectlanguage{english}
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.70\columnwidth]{figures/Scatter-PerFor-IntperPop-RAW/Scatter-PerSwiss-IntperPop-SRC}
\caption{{Regression line between the interventions for social
reasons per population and the percentage of Swiss inhabitants in the
cell for the data smoothed with SRS and the Queen's 2 weights file. The
violet line is the regression for the whole data set and the blue one is the
regression when ignoring all points with a Swiss population of
less than 50\,\%.
{\label{836426}}%
}}
\end{center}
\end{figure}
\par\null
To investigate the spatial correlation between the two variables, we
applied the Local Moran's I, using the same Queen's 2 weights file as
before. The results are plotted on an OpenStreetMap background using
QGIS. 28\,\% of the cells show a significant relation, which supports
our hypothesis. 42\,\% of the cells do not show a significant correlation
between the two variables and are therefore marked in light yellow.\selectlanguage{english}
\begin{figure}[H]
\begin{center}
\includegraphics[width=1.00\columnwidth]{figures/Map-Morans-Cluster/Map2}
\caption{{Cluster map of the Local Moran's I using
the Queen's 2 weights file. A significance level of 0.001 was assumed. Two
hotspots of High-High relation can be seen, and one large area of
Low-Low relation is situated in the south of the municipality.
Background map: OpenStreetMap, as integrated in
QGIS.
{\label{499001}}%
}}
\end{center}
\end{figure}\selectlanguage{english}
\begin{figure}[H]
\begin{center}
\includegraphics[width=1.00\columnwidth]{figures/Map-small/Map}
\caption{{Spatial analysis comparing the percentage of non-Swiss
inhabitants with the number of interventions per cell. Both variables
have been smoothed with Queen's 2 contiguity. Compare this map with the
right map of Figure~5. Background map: OpenStreetMap, as integrated in
QGIS.
{\label{689302}}%
}}
\end{center}
\end{figure}
\par\null
\section*{Discussion}
{\label{625101}}
Both hypotheses are supported by the smoothed data of the SRS
analysis. Nevertheless, it is important to point out that the smoothing
effect of the SRS is strong and a loss of precision cannot
be excluded. The smoothed information therefore deviates from the original
data, and its statistical validity is lower than that of the raw data. The
strong smoothing effect results from the Queen's 2 contiguity. We
chose this weights file so that a clear
contrast between the raw and the smoothed data is visible. In
this way, we were able to show the effect of spatial statistics
compared to conventional statistics. With a queen contiguity of lower
order, a compromise between smoothing and loss of information could be
achieved.
\par\null
For the analysis of the hotspots, the maps of the raw and
the smoothed data already provide a solid basis. The percentages of
foreigners living in the cells are spatially correlated even before
smoothing with SRS. Two hotspots with a share of more than 50\,\%
foreigners per cell can be detected in both the raw and the smoothed
data, as seen in Figure~5. After applying the SRS, the two
hotspots are clearly separated from the other cells. This clustering of
nationalities in urban regions has already been described in
other studies, and it also applies to Swiss
cities~\cite{Adelman2016}.

For the interventions divided by the population, some hotspots
emerge, although not as clearly as for the percentage of foreigners,
as can be seen in Figure~4. However, the raw data do not
show clusters in any quantile class, which is a warning that the smoothed data
should be interpreted critically. The literature also reports a
clustering of social interventions, although the distinction
between common crime and interventions due to social problems is not
clear~\cite{Massey_1991}. To confirm this finding, our study area is
certainly not large enough. Another limitation is the restricted period
covered by the intervention data. A time series
over a longer period could improve the result.
\par\null
The scatterplots in Figures~6 and~7 do not clearly show a correlation between
the variables, neither for the raw nor for the smoothed data. Varying
the selection of cells does not reveal a clear correlation either.
When the data are normalized, the slopes are much smaller than the expected
value of 1. In addition,
$R^{2}$ never exceeds 0.1, so the regression is not
reliable: most of the variance cannot be explained by the regression
line. Thus, a conventional statistical analysis reveals no clear correlation
between interventions and the percentage of foreigners.
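
For reference, the coefficient of determination quoted above is defined as follows. The symbols are introduced here for illustration, assuming the intervention rate is the dependent variable: $y_i$ is the observed intervention rate of cell $i$, $\hat{y}_i$ the value predicted by the regression line and $\bar{y}$ the mean over all $n$ cells.
\begin{equation*}
R^{2} = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
\end{equation*}
An $R^{2}$ below 0.1 therefore means that the regression explains less than 10\,\% of the variance of the intervention rate. For a regression with a single explanatory variable, $R^{2}$ equals the squared Pearson correlation coefficient $r$, so that $|r| < \sqrt{0.1} \approx 0.32$.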
\par\null
For further investigation, a spatial analysis was carried out using the
Local Moran's I. In the cluster map of the Local Moran's I in
Figure~8, clear tendencies towards two hotspots of
High-High correlation and one of Low-Low correlation are visible. Here, too,
the smoothing effect of the SRS method must be taken into
account, as it helps to build clusters from the variables. With only
28\,\% of the area, corresponding to 113 cells, showing a significant relation
between the two variables, the effect of the variables on each other is
rather weak. The High-High clusters lie in the areas
where they were to be expected according to the hotspots of the smoothed
data in Figure~5. According to popular opinion, the spatial distributions of
interventions and foreign population would be expected to be more strongly
correlated~\cite{2017}. Scientifically, however, this
connection has not been clearly proven~\cite{Entorf2000}, and Adelman et al.\
even state the opposite~\cite{Adelman2016}. The reasons for rejecting the
hypothesis in this case may be the same as for the hotspots:
the area is not large enough and the data are not comprehensive enough. In
addition, interventions due to social reasons do not necessarily correlate with
crime in general.
\par\null
The two major residential areas in Vernier, Lignon and Avanchets, do not
stand out in either analysis. One reason could be that
the interventions are divided by the population living in the cell. However,
in the analysis of the percentage of foreigners per cell, these areas
do not show special properties either. The region of
Lignon has become a popular residential area, and its prices therefore do not
particularly attract foreigners or other social classes.
\par\null
Overall, the
correlation between the percentage of foreigners living in an area and
the social interventions by the police is rather weak. Other social
factors, such as education, wealth or stress, presumably have a more
important influence in this region and could be investigated in a
further study with more input data on those topics.
\selectlanguage{english}
\FloatBarrier
\bibliographystyle{plainnat}
\bibliography{bibliography/converted_to_latex.bib%
}
\end{document}