A GPU implementation of a recently introduced butterfly filter (van Putten, 2016, ApJ, 819, 169) is presented, with preliminary results on LIGO S6 hardware injections. It achieves compute-limited clFFT/AMD performance under OpenCL by means of dedicated pre- and post-callback functions, exploiting the sparsity of high signal-to-noise candidate detections. Butterfly filtering on GPUs thus appears to be a cost-effective approach to deep searches for transient events. This research is supported, in part, by grants from the NRF under 2015R1D1A1A01059793 and 2016R1A5A1013277.