Research at St Andrews

Few-shot linguistic grounding of visual attributes and relations using Gaussian kernels

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Author(s)

Daniel Koudouna, Kasim Terzić

School/Research organisations

Abstract

Understanding complex visual scenes is one of the fundamental problems in computer vision, but learning in this domain is challenging due to the inherent richness of the visual world and the vast number of possible scene configurations. Current state-of-the-art approaches to scene understanding often employ deep networks which require large and densely annotated datasets. This goes against the seemingly intuitive learning abilities of humans and our ability to generalise from few examples to unseen situations. In this paper, we propose a unified framework for learning visual representations of words denoting attributes such as “blue” and relations such as “left of” based on Gaussian models operating in a simple, unified feature space. The strength of our model is that it only requires a small number of weak annotations and is able to generalise easily to unseen situations such as recognising object relations in unusual configurations. We demonstrate the effectiveness of our model on the predicate detection task. Our model is able to outperform the state of the art on this task in both the normal and zero-shot scenarios, while training on a dataset an order of magnitude smaller.
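
The abstract does not spell out the feature space or how the Gaussian models are fitted, so the following is only a minimal sketch of the general idea of grounding a word in a Gaussian fitted to a handful of examples, not the authors' implementation. The 2-D feature (a normalised offset between subject and object centres), the example values, the ridge term, and the names `fit_gaussian`, `log_density`, and `predict` are all assumptions made for illustration.

```python
import numpy as np

def fit_gaussian(samples, ridge=1e-3):
    """Fit a mean and a ridge-regularised covariance to a few feature vectors."""
    x = np.asarray(samples, dtype=float)
    mean = x.mean(axis=0)
    centred = x - mean
    # The ridge term keeps the covariance invertible with very few samples (assumed value).
    cov = centred.T @ centred / len(x) + ridge * np.eye(x.shape[1])
    return mean, cov

def log_density(x, mean, cov):
    """Log of the multivariate normal density at x."""
    d = np.asarray(x, dtype=float) - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)  # squared Mahalanobis distance
    return -0.5 * (maha + logdet + len(d) * np.log(2.0 * np.pi))

# Hypothetical feature: (dx, dy) offset of the subject centre from the object
# centre, normalised by image size. Three made-up examples per word.
examples = {
    "left of": [(-0.30, 0.02), (-0.25, -0.05), (-0.40, 0.03)],
    "above": [(0.01, -0.35), (-0.04, -0.28), (0.03, -0.42)],
}
models = {word: fit_gaussian(feats) for word, feats in examples.items()}

def predict(feature):
    """Return the word whose Gaussian assigns the feature the highest log-density."""
    return max(models, key=lambda word: log_density(feature, *models[word]))
```

In this sketch, each word needs only a mean and a covariance, which is why a few examples per word can be enough to fit it; an attribute such as “blue” could be handled the same way with a colour feature in place of the offset.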
Details

Original language: English
Title of host publication: Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - (Volume 5)
Editors: Giovanni Maria Farinella, Petia Radeva, Jose Braz, Kadi Bouatouch
Publisher: SCITEPRESS - Science and Technology Publications
Pages: 146-156
Volume: 5 VISAPP
ISBN (Print): 9789897584886
DOIs
Publication status: Published - 8 Feb 2021
Event: 16th International Conference on Computer Vision Theory and Applications (VISAPP 2021) - Online
Duration: 8 Feb 2021 - 10 Feb 2021
Conference number: 16
http://www.visapp.visigrapp.org/?y=2021

Conference

Conference: 16th International Conference on Computer Vision Theory and Applications (VISAPP 2021)
Abbreviated title: VISAPP 2021
Period: 8/02/21 - 10/02/21
Internet address

    Research areas

  • Few-shot learning, Learning models, Attribute learning, Relation learning, Scene understanding

Related by author

  1. Visualization as Intermediate Representations (VLAIR) for human activity recognition

    Jiang, A., Nacenta, M., Terzić, K. & Ye, J., 18 May 2020, PervasiveHealth '20: Proceedings of the 14th EAI International Conference on Pervasive Computing Technologies for Healthcare. Munson, S. A. & Schueller, S. M. (eds.). ACM, p. 201-210 10 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  2. Supervisor recommendation tool for Computer Science projects

    Zemaityte, G. & Terzic, K., 9 Jan 2019, Proceedings of the 3rd Conference on Computing Education Practice (CEP '19). New York: ACM, 4 p. 1

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  3. BINK: Biological Binary Keypoint Descriptor

    Saleiro, M., Terzić, K., Rodrigues, J. M. F. & du Buf, J. M. H., Dec 2017, In: BioSystems. 162, p. 147-156

    Research output: Contribution to journal › Article › peer-review

  4. Texture features for object salience

    Terzić, K., Krishna, S. & du Buf, J. M. H., Nov 2017, In: Image and Vision Computing. 67, p. 43-51

    Research output: Contribution to journal › Article › peer-review

  5. Interpretable feature maps for robot attention

    Terzić, K. & du Buf, J. M. H., 2017, Universal Access in Human–Computer Interaction. Design and Development Approaches and Methods: 11th International Conference, UAHCI 2017, Held as Part of HCI International 2017, Vancouver, BC, Canada, July 9–14, 2017, Proceedings, Part I. Antona, M. & Stephanidis, C. (eds.). Cham: Springer, p. 456-467 12 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 10277).

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

ID: 273372330
