Accelerating large-scale simulation of seismic wave propagation by multi-GPUs and three-dimensional domain decomposition

Taro Okamoto, Hiroshi Takenaka, Takeshi Nakamura, Takayuki Aoki

Research output: Contribution to journal › Article › peer-review

26 Citations (Scopus)


We adopted the GPU (graphics processing unit) to accelerate large-scale finite-difference simulation of seismic wave propagation. The simulation can benefit from the high memory bandwidth of the GPU because it is a "memory-intensive" problem. In the single-GPU case we achieved a performance of about 56 GFlops, about 45 times faster than that achieved by a single core of the host central processing unit (CPU). We confirmed that optimized use of fast shared memory and registers was essential for performance. In the multi-GPU case with three-dimensional domain decomposition, the non-contiguous memory layout of the ghost zones was found to make data transfer between the GPU and the host node very slow. This problem was solved by using contiguous memory buffers for the ghost zones. We achieved a performance of about 2.2 TFlops by using 120 GPUs and 330 GB of total memory; nearly (or more than) 2200 cores of host CPUs would be required to achieve the same performance. Weak scaling was nearly proportional to the number of GPUs. We therefore conclude that GPU computing is a promising approach for large-scale simulation of seismic wave propagation, because faster simulation is possible with fewer computational resources than with CPUs.
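The ghost-zone issue described in the abstract can be illustrated with a small sketch. The abstract does not give the authors' implementation, so everything below is an assumption made for illustration: a row-major (C-order) array layout, a tiny 4 x 3 x 5 subdomain, and the hypothetical helper names `face_offsets`, `count_contiguous_runs`, `pack_face`, and `unpack_face`. It is plain Python, not GPU code. It only shows why some faces of a three-dimensional subdomain are scattered in memory, and how gathering them into one contiguous buffer lets each face be sent as a single transfer.

```python
"""Sketch: packing a ghost-zone face of a 3-D subdomain into a contiguous buffer.

Assumptions (not from the paper): row-major storage, illustrative helper names.
"""


def flat_index(i, j, k, ny, nz):
    """Offset of element (i, j, k) in a row-major nx*ny*nz array."""
    return (i * ny + j) * nz + k


def face_offsets(shape, axis, index):
    """Memory offsets of the face on which coordinate `axis` equals `index`."""
    nx, ny, nz = shape
    ranges = [range(nx), range(ny), range(nz)]
    ranges[axis] = range(index, index + 1)  # pin one coordinate
    return [flat_index(i, j, k, ny, nz)
            for i in ranges[0] for j in ranges[1] for k in ranges[2]]


def count_contiguous_runs(offsets):
    """Number of separate contiguous memory segments covered by `offsets`."""
    if not offsets:
        return 0
    runs = 1
    for prev, cur in zip(offsets, offsets[1:]):
        if cur != prev + 1:
            runs += 1
    return runs


def pack_face(field, shape, axis, index):
    """Gather a (possibly scattered) face into one contiguous send buffer."""
    return [field[o] for o in face_offsets(shape, axis, index)]


def unpack_face(field, shape, axis, index, buffer):
    """Scatter a received contiguous buffer into a face of `field`."""
    for offset, value in zip(face_offsets(shape, axis, index), buffer):
        field[offset] = value


shape = (4, 3, 5)                      # hypothetical (nx, ny, nz)
nx, ny, nz = shape
field = list(range(nx * ny * nz))      # each value equals its memory offset

# Without packing, each face needs one transfer per contiguous segment.
runs_per_axis = [count_contiguous_runs(face_offsets(shape, axis, 0))
                 for axis in range(3)]
print(runs_per_axis)                   # [1, 4, 12]

# With packing, the scattered z-face becomes a single contiguous buffer.
send_buffer = pack_face(field, shape, 2, nz - 1)
neighbour = [0] * (nx * ny * nz)
unpack_face(neighbour, shape, 2, 0, send_buffer)
```

In this layout the x-face is already one contiguous segment, the y-face splits into `nx` segments, and the z-face splits into `nx * ny` single elements. For realistic grid sizes the last case would mean a very large number of small transfers, which is consistent with the slow transfers the abstract reports and with the contiguous-buffer remedy it describes.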

Original language: English
Pages (from-to): 939-942
Number of pages: 4
Journal: Earth, Planets and Space
Issue number: 12
Publication status: Published - 2010
Externally published: Yes


Keywords

  • Finite-difference method
  • GPU
  • Parallel computing
  • Seismic wave propagation
  • Three-dimensional domain decomposition

ASJC Scopus subject areas

  • Geology
  • Space and Planetary Science

