Deep Learning Networks & Gravitational Wave Signal Recognition

He Wang (王赫)  

[hewang@mail.bnu.edu.cn]

Department of Physics, Beijing Normal University

In collaboration with Zhou-Jian Cao

Aug 23rd, 2019

The 23rd KAGRA face-to-face meeting @Toyama

  • Problems
    • Current matched filtering techniques are computationally expensive.
    • Non-Gaussian noise limits the optimality of searches.
    • Un-modelled signals?

A trigger generator \(\rightarrow\) efficiency + completeness + informativeness

Background

  • Solution:
    • Machine learning (deep learning)
    • ...

Introduction

  • Existing CNN-based approaches:
    • Daniel George & E. A. Huerta (2018)
    • Hunter Gabbard et al. (2018)
    • X. Li et al. (2018)
    • Timothy D. Gebhard et al. (2019)

Related works


  • Our main contributions:
    • A novel CNN-based architecture (MF-CNN)
    • An efficient training process (no bandpass filtering or explicit whitening)
    • An effective search methodology (the whole of O1 searched in only 4–5 days)
    • All GW events in O1/O2 fully recognized, with merger times predicted to within 1 s

Motivation

MF-ConvNet Model

Convolutional neural network (ConvNet or CNN)

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012)

Matched-filtering (cross-correlation with the templates) can be regarded as a convolutional layer with a set of predefined kernels.
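As an illustrative sketch (not the paper's implementation): cross-correlating data with a time-reversed template is exactly the operation a 1-D convolutional layer performs, so a bank of templates maps onto a set of fixed convolution kernels. All names and values below are toy assumptions:

```python
import numpy as np

# Toy setup (assumed values): a short sinusoid stands in for a GW template.
rng = np.random.default_rng(0)
fs = 4096                                                 # sample rate (Hz)
template = np.sin(2 * np.pi * 50 * np.arange(0, 0.25, 1 / fs))
data = rng.normal(size=4 * fs)                            # Gaussian noise
data[fs:fs + template.size] += 0.5 * template             # inject signal at t = 1 s

# Cross-correlation with the template equals convolution with the
# time-reversed template -- i.e. one channel of a convolutional layer.
corr = np.correlate(data, template, mode="valid")
same_as_conv = np.convolve(data, template[::-1], mode="valid")
assert np.allclose(corr, same_as_conv)

# The peak of |corr| marks the injection time (t ≈ 1 s here).
peak_time = np.argmax(np.abs(corr)) / fs
```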

Is it matched-filtering?

  • In practice, we use matched filters as an essential component in the first part of the CNN for GW detection.

Architecture

\[
\rho[1,C,N] = \frac{U[1,C,N]}{\sqrt{\sigma[1,C,0]\cdot fs}} \,, \qquad
\rho_m[1,C,1] = \max_N{\rho[1,C,N]}
\]

\[
C_0 = \mathop{\arg\max}_{C}\rho[1,C,N] \,, \qquad
N_0 = \mathop{\arg\max}_{N} U[1,C_0,N]
\]

\(\bar{S}_n(t)\)

Meanwhile, we obtain the optimal time \(N_0\) (relative to the input) of the matching feature response by recording the location of the maximum value corresponding to the optimal template \(C_0\).
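A minimal numpy sketch of this peak-picking step, using the slide's symbols \(U\), \(\sigma\), \(fs\) but with made-up toy shapes and random values:

```python
import numpy as np

# Toy shapes (assumed): C templates, N time samples, batch size 1.
fs = 4096
C, N = 3, 100
rng = np.random.default_rng(1)
U = rng.normal(size=(1, C, N))              # matched-filter outputs U[1,C,N]
sigma = np.abs(rng.normal(size=(1, C, 1)))  # per-template normalization sigma[1,C,0]

rho = U / np.sqrt(sigma * fs)               # rho[1,C,N]
rho_m = rho.max(axis=2, keepdims=True)      # rho_m[1,C,1] = max_N rho[1,C,N]

C0 = int(np.argmax(rho_m))                  # optimal template index
N0 = int(np.argmax(U[0, C0]))               # optimal time relative to the input
```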


Experiments & Results

Dataset & Templates

              template    waveform (train/test)
  Number      35          1610
  Length (s)  1           5
  (equal mass)
  • We use the SEOBNRE model [Cao et al. (2017)] to generate waveforms; we consider only circular, spinless binary black holes.

(In preprint)

  • The background noise for training/testing is sampled from a closed set (33 × 4096 s) of the first observing run (O1), excluding the 4096 s segments that contain the first three GW events.

  • A sample as input: 62.50 M⊙ + 57.50 M⊙ (\(\rho_{amp}=0.5\))

  • Mass distribution of dataset / templates / events

Search methodology

Experiments & Results

(In preprint)

  • Every 5 s segment of data is fed to the MF-CNN with a step size of 1 s.
  • The model scans the whole input segment and outputs a probability score.
  • In the ideal case, a GW signal hidden somewhere in the data should produce 5 adjacent predictions above a given threshold.
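The sliding-window search above can be sketched as follows; `model_predict` is a hypothetical stand-in for the trained MF-CNN's probability score (here replaced by a simple mean-square energy), and all data are toy values:

```python
import numpy as np

# 5 s windows with a 1 s stride, as in the search methodology.
fs = 4096
window, step = 5 * fs, 1 * fs

def model_predict(segment):
    # Placeholder score: mean-square "energy" (the real model is the MF-CNN).
    return float(np.mean(segment ** 2))

def search(strain, threshold):
    """Slide a 5 s window over the strain and record trigger times (s)."""
    triggers = []
    for start in range(0, strain.size - window + 1, step):
        if model_predict(strain[start:start + window]) > threshold:
            triggers.append(start // fs)
    return triggers

# Toy data: Gaussian noise with a loud 1 s burst injected at t = 10 s.
rng = np.random.default_rng(3)
strain = rng.normal(size=20 * fs)
strain[10 * fs:11 * fs] += 5 * np.sin(2 * np.pi * 40 * np.arange(fs) / fs)

# The burst lies fully inside the 5 adjacent windows starting at 6..10 s.
triggers = search(strain, threshold=2.0)
```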


Experiments & Results

(In progress)

Population property on O1

  • Sensitivity estimation
    • Background: using time-shifting on the closed set from real LIGO recordings in O1
    • Injection: random simulated waveforms
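A toy sketch of time-shift background estimation (all data and thresholds here are made up): circularly shifting one detector's data relative to another destroys astrophysical coincidences, so coincident triggers in the shifted data estimate the noise background.

```python
import numpy as np

# Toy two-detector data: pure Gaussian noise (assumed values).
rng = np.random.default_rng(2)
fs = 4096
h1 = rng.normal(size=32 * fs)
l1 = rng.normal(size=32 * fs)

def coincident_count(a, b, shift, thresh=3.5):
    """Count samples loud in both streams after time-shifting stream b."""
    b_shifted = np.roll(b, shift)
    return int(np.sum((np.abs(a) > thresh) & (np.abs(b_shifted) > thresh)))

# Each (unphysical) shift gives an independent background realization.
background = [coincident_count(h1, l1, s * fs) for s in range(1, 6)]
```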

Detection ratio

  • Statistical significance on O1
    • Count a group of adjacent predictions as one "trigger block".
    • For pure background (non-Gaussian noise), the number of trigger blocks should fall off monotonically with block length.
    • A real GW signal, in contrast, should produce a block of 5 adjacent predictions above threshold.
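Grouping per-second predictions into trigger blocks can be sketched with a simple run-length count (the prediction sequence below is made up):

```python
from itertools import groupby

def block_lengths(predictions):
    """Lengths of runs of adjacent above-threshold (1) predictions."""
    return [len(list(g)) for v, g in groupby(predictions) if v == 1]

# Toy per-second predictions: 1 = above threshold, 0 = below.
preds = [0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0]
print(block_lengths(preds))  # [2, 5, 1]
```

The 5-long block is what an ideal signal search would produce; shorter blocks are background-like.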

Number of Adjacent prediction

A bump appears at 5 adjacent predictions.

Experiments & Results

(In preprint)

  • Recovering all GW events in both O1 and O2


Summary

  • Some benefits of the MF-CNN architecture:

    • Simple configuration for GW data generation

    • Almost no data pre-processing

    • Works on a non-stationary background
    • Easy parallel deployment; multiple detectors can benefit greatly from this design

    • More templates / smaller search steps can improve performance further
  • Main understanding of the algorithm:
    • GW templates are used as likely features for matching
    • A generalization of both matched filtering and neural networks
    • Matched filtering can be rewritten as convolutional neural layers

Thank you for your attention!