Semantically congruent audiovisual integration with modal-based attention accelerates auditory short-term memory retrieval

Hongtao Yu, Aijun Wang, Ming Zhang, Jia Jia Yang, Satoshi Takahashi, Yoshimichi Ejima, Jinglong Wu

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)


Evidence has shown that the benefits of multisensory integration for unisensory perception are asymmetric and that auditory perception benefits more, especially when attention is directed toward a task-irrelevant visual stimulus. Whether the benefits of semantically (in)congruent multisensory integration with modal-based attention for subsequent unisensory short-term memory (STM) retrieval are also asymmetric remains unclear. Using a delayed matching-to-sample paradigm, the present study investigated this issue by manipulating the focus of attention during multisensory memory encoding. The results revealed that both visual and auditory STM retrieval reaction times were faster under semantically congruent multisensory conditions than under unisensory memory encoding conditions. We suggest that the formation of a coherent multisensory representation might be optimized by restricted multisensory encoding and can be rapidly triggered by subsequent unisensory memory retrieval demands. Crucially, auditory STM retrieval is exclusively accelerated by semantically congruent multisensory memory encoding, indicating that the less effective sensory modality of memory retrieval relies more on the coherent prior formation of a multisensory representation optimized by modal-based attention.

Original language: English
Journal: Attention, Perception, and Psychophysics
Publication status: Accepted/In press - 2022


Keywords

  • Audiovisual integration
  • Modal-based attention
  • Semantic congruency
  • Short-term memory

ASJC Scopus subject areas

  • Experimental and Cognitive Psychology
  • Language and Linguistics
  • Sensory Systems
  • Linguistics and Language

