\documentclass{article}
\usepackage{fullpage}
\usepackage{parskip}
\usepackage{titlesec}
\usepackage{xcolor}
\usepackage[colorlinks = true,
linkcolor = blue,
urlcolor = blue,
citecolor = blue,
anchorcolor = blue]{hyperref}
\usepackage[natbibapa]{apacite}
\usepackage{eso-pic}
\AddToShipoutPictureBG{\AtPageLowerLeft{\includegraphics[scale=0.7]{powered-by-Authorea-watermark.png}}}
\renewenvironment{abstract}
{{\bfseries\noindent{\abstractname}\par\nobreak}\footnotesize}
{\bigskip}
\titlespacing{\section}{0pt}{*3}{*1}
\titlespacing{\subsection}{0pt}{*2}{*0.5}
\titlespacing{\subsubsection}{0pt}{*1.5}{0pt}
\usepackage{authblk}
\usepackage{graphicx}
\usepackage[space]{grffile}
\usepackage{latexsym}
\usepackage{textcomp}
\usepackage{longtable}
\usepackage{tabulary}
\usepackage{booktabs,array,multirow}
\usepackage{amsfonts,amsmath,amssymb}
\providecommand\citet{\cite}
\providecommand\citep{\cite}
\providecommand\citealt{\cite}
% You can conditionalize code for latexml or normal latex using this.
\newif\iflatexml\latexmlfalse
\AtBeginDocument{\DeclareGraphicsExtensions{.pdf,.PDF,.eps,.EPS,.png,.PNG,.tif,.TIF,.jpg,.JPG,.jpeg,.JPEG}}
\usepackage[utf8]{inputenc}
\usepackage[english]{babel}
\begin{document}
\title{Astrometric constraints on the nanohertz gravitational-wave background with \textit{Gaia}}
\author[ ]{Stephen Taylor}
\affil[ ]{}
\vspace{-1em}
\date{}
\begingroup
\let\center\flushleft
\let\endcenter\endflushleft
\maketitle
\endgroup
\textit{Gaia} is a $5$-year (potentially extendable to $10$-year) space-borne mission to perform precision astrometry on $\sim 10^9$ stars and $\sim 10^6$ AGN. Each target will have $\sim 100$ positional measurements over the mission lifetime. A gravitational wave (GW) passing between a target star/AGN and the observer will cause angular deflections of the photon path, leading to observed perturbations of the target position. Furthermore, a GW \textit{background} will embed all of the targets in its common influence, leading to perturbations of the angular positions that are correlated across all targets in a known way \cite{Book_2011}.
The analogy between this experimental setup and pulsar-timing searches for GWs is very strong. Pulsar timing looks for correlated deviations in the arrival times of radio pulses across many galactic millisecond pulsars to infer the presence and properties of a nanohertz stochastic GW background. The dominant contribution to the arrival times is from a deterministic timing model, which relies on the pulsar period, spindown, position, and (if the pulsar is in a binary system) binary orbital dynamics. GWs cause a small, but distinctive, perturbation to this deterministic timing model. Similarly, astrometry can look for correlated positional deviations across many target stars and AGN. The dominant contribution to the positional reconstruction will come from a deterministic model, with GWs contributing a perturbation around this solution.
We now write down the likelihood for a single target with astrometric measurements, where the position of the target is described as a sum of deterministic and stochastic components. In the following, all stochastic components are treated as Gaussian random processes.
\begin{equation}
\vec{n} = \vec{n}^\mathrm{det} + \vec{n}^\mathrm{stoch}
\end{equation}
where $\vec{n}$ is the vector of positional measurements for one target over the mission lifetime. We assume that the only deterministic contributions come from the $5$-dimensional positional model itself, consisting of terms that describe the position, parallax, and proper motion. The stochastic contribution can be modeled as follows:
\begin{equation}
\vec{n}^\mathrm{stoch} = \vec{n}^\mathrm{white} + \vec{n}^\mathrm{red}
\end{equation}
where the white stochastic process simply accounts for measurement uncertainties in the positions (but can include other physically-motivated terms), while the red process can include both long-timescale noise sources, and most importantly a GW background signal.
We decompose the red processes onto a Fourier basis of sines and cosines, such that the signal retains units of dimensionless position on the unit sphere:
\begin{equation}
\vec{n}^\mathrm{red} = F\vec{h}
\end{equation}
where $F$ is the $N_\mathrm{obs} \times 2N_\mathrm{freq}$ Fourier design matrix, with a column of sines and a column of cosines for each GW frequency, evaluated at each time-stamp of the target observation. The vector $\vec{h}$ contains the Fourier amplitudes of the GW signal.
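As a concrete illustration, the design matrix can be assembled in a few lines of numpy; the function name, the time sampling, and the choice of frequencies (the lowest harmonics of the inverse mission span) are assumptions for this sketch, not \textit{Gaia} specifics.

```python
import numpy as np

def fourier_design_matrix(t, freqs):
    """Build the N_obs x 2*N_freq Fourier design matrix F: one sine and
    one cosine column per frequency, evaluated at the observation times t."""
    t = np.asarray(t)
    F = np.zeros((len(t), 2 * len(freqs)))
    for j, f in enumerate(freqs):
        F[:, 2 * j] = np.sin(2 * np.pi * f * t)
        F[:, 2 * j + 1] = np.cos(2 * np.pi * f * t)
    return F

# ~100 epochs over a 5-year mission (times in years), and the three
# lowest harmonics of 1/T_span as the GW frequencies
t = np.linspace(0.0, 5.0, 100)
freqs = np.arange(1, 4) / 5.0
F = fourier_design_matrix(t, freqs)
print(F.shape)  # (100, 6)
```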
The likelihood is now:
\begin{equation}
\mathcal{L} = \frac{\exp{(-\frac{1}{2}(\vec{n} - \vec{n}^\mathrm{det}-F\vec{h})^\mathrm{T}N^{-1}(\vec{n} - \vec{n}^\mathrm{det}-F\vec{h}))}}{\sqrt{\mathrm{det}(2\pi N)}}
\end{equation}
where $N$ is a diagonal matrix of positional measurement variances.
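A direct numpy transcription of this likelihood, in log form and exploiting the diagonality of $N$, might look like the following (the function name and toy data are placeholders):

```python
import numpy as np

def log_likelihood(n, n_det, F, h, sigma):
    """Gaussian log-likelihood with diagonal noise covariance
    N = diag(sigma**2); diagonality avoids any dense matrix inverse."""
    r = n - n_det - F @ h                      # residual: n - n_det - F h
    var = sigma ** 2
    chi2 = np.sum(r ** 2 / var)                # r^T N^{-1} r
    logdet = np.sum(np.log(2 * np.pi * var))   # log det(2 pi N)
    return -0.5 * (chi2 + logdet)

# toy example: 10 epochs, white noise only (GW amplitudes set to zero)
rng = np.random.default_rng(0)
sigma = 0.1 * np.ones(10)
n = rng.normal(0.0, 0.1, 10)
ll = log_likelihood(n, np.zeros(10), np.zeros((10, 2)), np.zeros(2), sigma)
```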
We now apply additional prior constraints on the GW signal, given our assumptions about its statistical properties; i.e., we assume it to be zero-mean, Gaussian, and stationary. Hence:
\begin{align}
\begin{aligned}
\langle \vec{h} \rangle &= 0, \\
\langle \vec{h} \vec{h}^\mathrm{T} \rangle &= \phi,
\end{aligned}
\end{align}
such that,
\begin{equation}
p(\vec{h}|\phi) = \frac{\exp{(-\frac{1}{2}\vec{h}^\mathrm{T}\phi^{-1}\vec{h})}}{\sqrt{\mathrm{det}(2\pi \phi)}}
\end{equation}
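For a stochastic background, $\phi$ is typically diagonal in this basis, with entries set by the GW spectrum at each frequency. Below is a sketch assuming a PTA-style power-law parameterization; the normalization convention and the default parameter values are assumptions for illustration, not \textit{Gaia}-specific results.

```python
import numpy as np

def powerlaw_phi(freqs, log10_A=-15.0, gamma=13.0 / 3.0, T=5.0):
    """Diagonal Fourier-amplitude prior covariance phi for an assumed
    power-law background spectrum; each frequency gets one sine and one
    cosine entry with the same variance (PTA-style convention)."""
    A = 10.0 ** log10_A
    fyr = 1.0  # reference frequency of 1/yr, with time measured in years
    psd = A**2 / (12 * np.pi**2) * (freqs / fyr) ** (-gamma) * fyr**(-3) / T
    return np.diag(np.repeat(psd, 2))

freqs = np.arange(1, 4) / 5.0   # lowest harmonics of a 5-year span
phi = powerlaw_phi(freqs)       # lower frequencies carry more power
```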
The full likelihood then becomes:
\begin{equation}
\mathcal{L} = \frac{\exp{(-\frac{1}{2}(\vec{n} - \vec{n}^\mathrm{det}-F\vec{h})^\mathrm{T}N^{-1}(\vec{n} - \vec{n}^\mathrm{det}-F\vec{h}))}}{\sqrt{\mathrm{det}(2\pi N)}} \times \frac{\exp{(-\frac{1}{2}\vec{h}^\mathrm{T}\phi^{-1}\vec{h})}}{\sqrt{\mathrm{det}(2\pi \phi)}}
\end{equation}
We are not really interested in the Fourier amplitudes of the signal themselves, so we would like to marginalize over them. There are two paths to doing this:
\begin{enumerate}
\item We assume that the deterministic positional model is in the linear regime centred around the global best-fit solution, so that $\vec{n}^\mathrm{det}=M\vec{\epsilon}$. We can then group all low-rank processes (the positional model and the red processes) together, and simultaneously marginalize over all amplitudes, such that
\begin{align}
\begin{aligned}
T &= \begin{bmatrix} M & F \end{bmatrix}, \\
b &= \begin{bmatrix} \epsilon \\ h \end{bmatrix}
\end{aligned}
\end{align}
In this case we still want to marginalize over $\epsilon$ with uniform priors, which we can approximate using a Gaussian of infinite variance. This keeps the algebra simple, since everything can still be treated as a Gaussian random process. The resulting ``marginalized'' likelihood is now:
\begin{equation}
\mathcal{L} = \frac{\exp{(-\frac{1}{2}\vec{n}^\mathrm{T}C^{-1}\vec{n})}}{\sqrt{\mathrm{det}(2\pi C)}}
\end{equation}
where
\begin{equation}
C = N + TBT^\mathrm{T}
\end{equation}
and
\begin{equation}
B = \begin{bmatrix} \infty & 0 \\ 0 & \phi \end{bmatrix}
\end{equation}
\item We don't assume that the deterministic positional model is in the linear regime, and so simply marginalize over the GW Fourier amplitudes. The resulting marginalized likelihood is then:
\begin{equation}
\mathcal{L} = \frac{\exp{(-\frac{1}{2}(\vec{n}-\vec{n}^\mathrm{det})^\mathrm{T}C^{-1}(\vec{n}-\vec{n}^\mathrm{det}))}}{\sqrt{\mathrm{det}(2\pi C)}}
\end{equation}
where
\begin{equation}
C = N + F\phi F^\mathrm{T}
\end{equation}
\end{enumerate}
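In practice, the infinite-variance block of $B$ in option 1 above is implemented with a large finite value. A minimal numpy sketch of option 1 under that approximation (the function name, the cutoff value, and the toy inputs are all illustrative):

```python
import numpy as np

def marginalized_loglike(n, M, F, phi, sigma, big=1e10):
    """Marginalized log-likelihood with T = [M, F] and
    B = diag(big * I, phi); `big` stands in for the infinite prior
    variance on the positional-model coefficients epsilon."""
    T = np.hstack([M, F])
    k = M.shape[1]
    B = np.zeros((T.shape[1], T.shape[1]))
    B[:k, :k] = big * np.eye(k)
    B[k:, k:] = phi
    C = np.diag(sigma ** 2) + T @ B @ T.T       # C = N + T B T^T
    sign, logdetC = np.linalg.slogdet(2 * np.pi * C)
    return -0.5 * (n @ np.linalg.solve(C, n) + logdetC)

# toy inputs: 20 epochs, a 5-column positional model, 4 Fourier columns
rng = np.random.default_rng(2)
n = rng.normal(0.0, 0.1, 20)
M = rng.standard_normal((20, 5))
F = rng.standard_normal((20, 4))
val = marginalized_loglike(n, M, F, 0.01 * np.eye(4), 0.1 * np.ones(20))
```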
Given the structure of the data covariance matrices, we can use the Woodbury matrix lemma to reduce the rank of the linear-algebra operations to be performed. This approach has worked very successfully in PTA data analysis.
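The identity is easy to verify numerically; the quick numpy check below (with arbitrary toy matrices) shows that only a small matrix, of dimension equal to the number of basis columns, ever needs a dense solve.

```python
import numpy as np

# Woodbury identity: (N + T B T^T)^{-1}
#   = N^{-1} - N^{-1} T (B^{-1} + T^T N^{-1} T)^{-1} T^T N^{-1}
# Only the small (number-of-columns-of-T) matrix is densely inverted,
# which is cheap when N is diagonal and T has few columns.

rng = np.random.default_rng(0)
n_obs, n_basis = 100, 6
sigma = 0.1 * np.ones(n_obs)
T = rng.standard_normal((n_obs, n_basis))
B = np.diag(rng.uniform(0.5, 2.0, n_basis))

Ninv = np.diag(1.0 / sigma ** 2)
S = np.linalg.inv(B) + T.T @ Ninv @ T           # small n_basis x n_basis
Cinv_woodbury = Ninv - Ninv @ T @ np.linalg.solve(S, T.T @ Ninv)

Cinv_direct = np.linalg.inv(np.diag(sigma ** 2) + T @ B @ T.T)
print(np.allclose(Cinv_woodbury, Cinv_direct))  # True
```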
The above describes the likelihood of a single target, but a search for GWs must correlate all measurements across all targets in a catalog. This is feasible with PTAs, where the number of pulsars is $\mathcal{O}(100)$. However, \textit{Gaia} may have millions, if not billions, of targets to correlate. A temporary solution might be to analyze a subset of the catalog chosen such that the coverage of the correlation pattern is good enough for detection.
But for upper limits on the GW background, the solution is much simpler -- we only need to place constraints on some common red process, and interpret all of that red process as a GW signal. Hence for upper limits we can ignore the spatial correlations, which reduces the catalog likelihood to simply the product over all targets:
\begin{equation}
\mathcal{L} = \prod_{i=1}^{N_\mathrm{targets}}\mathcal{L}_i
\end{equation}
If analyzed all at once, this remains a computationally expensive process. However, further shortcuts can be made. The problem is trivially factorizable: each target can be analyzed for a common red process in isolation, and indeed in parallel on a computer cluster. A kernel density estimate (KDE) of the red-process posterior can then be formed for each target, and the results multiplied together. This product can be resampled (with an appropriate ``upper-limit'' prior that is insensitive to the lower sampling boundary of the GWB amplitude) to compute the upper limit on the GWB amplitude over the entire catalog. This resampling scheme has been employed successfully for PTAs, giving answers consistent with those from the full likelihood.
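The combination step can be sketched in a few lines; the fixed-bandwidth KDE, the toy per-target posteriors, and the quantile choice below are illustrative assumptions (in particular, the ``upper-limit'' prior reweighting is omitted for brevity).

```python
import numpy as np

def gaussian_kde_logpdf(samples, grid, bw):
    """Log of a simple fixed-bandwidth Gaussian KDE evaluated on a grid
    (a small floor avoids log(0) far from the samples)."""
    d = (grid[:, None] - samples[None, :]) / bw
    dens = np.mean(np.exp(-0.5 * d ** 2), axis=1) / (bw * np.sqrt(2 * np.pi))
    return np.log(dens + 1e-300)

def combined_upper_limit(per_target_samples, grid, bw=0.05, q=0.95):
    """Multiply per-target posteriors (sum their log-KDEs on a common
    grid), then read off the q-quantile of the combined distribution."""
    logp = sum(gaussian_kde_logpdf(s, grid, bw) for s in per_target_samples)
    p = np.exp(logp - logp.max())
    cdf = np.cumsum(p)
    cdf /= cdf[-1]
    return grid[np.searchsorted(cdf, q)]

# toy per-target posteriors on log10(A_gwb) for five targets
rng = np.random.default_rng(1)
samples = [rng.normal(-15.0, 0.3, 2000) for _ in range(5)]
grid = np.linspace(-17.0, -13.0, 400)
ul = combined_upper_limit(samples, grid)  # combined 95% upper limit
```

Note that because the per-target analyses are independent, the KDEs can be built in parallel and only the one-dimensional grids need to be combined.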
\selectlanguage{english}
\FloatBarrier
\bibliographystyle{apacite}
\bibliography{bibliography/converted_to_latex.bib%
}
\end{document}