Fast, parallel implementation of particle filtering on the GPU architecture

In this paper, we introduce a modified cellular particle filter (CPF) which we mapped on a graphics processing unit (GPU) architecture. We developed this filter adaptation using a state-of-the-art CPF technique. Mapping this filter realization on a highly parallel architecture entailed a shift i...

Bibliographic Details
Main Authors: Gelencsér-Horváth Anna
Tornai Gábor János
Horváth András
Cserey György Gábor
Format: Article
Published: 2013
Series: EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING 2013 No. 1
Subjects:
mtmt:2415644
Online Access: https://publikacio.ppke.hu/1944

MARC

LEADER 00000nab a2200000 i 4500
001 publ1944
005 20241219105432.0
008 241219s2013 hu o 0|| English d
022 |a 1687-6172 
024 7 |a 2415644  |2 mtmt 
040 |a PPKE Publikáció Repozitórium  |b hun 
041 |a English 
100 2 |a Gelencsér-Horváth Anna 
245 1 0 |a Fast, parallel implementation of particle filtering on the GPU architecture  |h [electronic document] /  |c Gelencsér-Horváth Anna 
260 |c 2013 
490 0 |a EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING  |v 2013 No. 1 
520 3 |a In this paper, we introduce a modified cellular particle filter (CPF) which we mapped on a graphics processing unit (GPU) architecture. We developed this filter adaptation using a state-of-the-art CPF technique. Mapping this filter realization on a highly parallel architecture entailed a shift in the logical representation of the particles. In this process, the original two-dimensional organization is reordered as a one-dimensional ring topology. We carried out proof-of-concept measurements on two models with an NVIDIA Fermi architecture GPU. This design achieved a 411-µs kernel time per state and a 77-ms global running time for all states for 16,384 particles with a neighbourhood size of 256 on a sequence of 24 states for a bearing-only tracking model. For a commonly used benchmark model at the same configuration, we achieved a 266-µs kernel time per state and a 124-ms global running time for all 100 states. Kernel time includes random number generation on the GPU with curand. These results attest to the effective and fast use of the particle filter in high-dimensional, real-time applications. 
650 4 |a Electrical and electronic engineering sciences 
700 0 1 |a Tornai Gábor János  |e aut 
700 0 1 |a Horváth András  |e aut 
700 0 1 |a Cserey György Gábor  |e aut 
856 4 0 |u https://publikacio.ppke.hu/id/eprint/1944/1/eurasip2013.pdf  |z Document access