Home
Publications
Contact
multimodal language models
Modal Aphasia: Can Unified Multimodal Models Describe Images From Memory?
We introduce modal aphasia, a surprising and systematic dissociation in which unified multimodal models demonstrate strong capabilities for generating visual content while simultaneously failing to access that same knowledge through text queries.
Michael Aerni
,
Joshua Swanson
,
Kristina Nikolić
,
Florian Tramèr
PDF
Cite
Code
Blog post
Twitter
Cite
×