Data now comes in many different formats, and multimodal approaches are gaining attention in many natural language processing tasks such as text generation, machine translation and sentiment analysis. Indeed, many research areas are moving from a single modality to full-fledged multimodal research, with resources such as multimodal corpora and multimodal lexicons. For instance, efforts are being made to integrate images, sign languages, sounds, etc. into existing wordnets. As the exchange of information among modalities can be crucial for lexical databases, we want to address this interdisciplinary research area in the first Workshop on “Multimodal wordnets”, co-located with LREC 2020.
The workshop is organized by the Global WordNet Association (GWA). The GWA is a free, public and non-commercial organization that provides a platform for discussing, sharing and connecting wordnets for all languages in the world. In 2019, the GWA created a Working Group dedicated to multimodal wordnets in order to extend the development and use of wordnets to modalities other than text. Some well-known examples of multimodal approaches and wordnets are ImageNet, an image database structured according to the WordNet hierarchy; ASLNet, a wordnet for American Sign Language; and the Suggested Upper Merged Ontology (SUMO).
Topics of Interest
The workshop aims at studying the interaction and cross-fertilization between wordnets and existing multimodal resources. We invite submissions with original contributions addressing, but not limited to, the topics listed below.
- What are the benefits/drawbacks of multimodal wordnets? How can wordnets help in the transmission and characterization of multimedia information?
- To what extent is it possible to create wordnets in other modalities?
- Which new multimodal initiatives and projects are being carried out involving distinct modalities (written, spoken, audio-visual, signs, pictograms, emojis, geographical and spatio-temporal data...) and knowledge representations (wordnets, lexicons, ontologies, terminologies, dictionaries, corpora, wikipedias, distributional representations, cultural artifacts, books...)?
- What are (or could be) the practical applications of multimodal wordnets? How can existing multimodal wordnets, such as Visual Genome, ConceptNet, ImageNet, Imagact, etc., be exploited? Possible applications include sense disambiguation on corpora, images, space role labeling, multimodal knowledge acquisition, commonsense reasoning and inference, distributed concept representation, and the integration of distributional (corpus-based) and knowledge-based embeddings.
- Which approaches are being developed to create these multimodal resources? How can they be best represented?
- How can existing resources be mapped automatically? How can we deal with similarity and relatedness across modalities? How can we deal with specificity? Images, sounds, smells, touch and video are all infinitely specific, but words are not.
- What is the added value of wordnet hierarchies to other modalities? What is the role of multimodal wordnets? What is the expected format of the resources? Which standards should be adopted or developed?
- How can multimodal wordnets feed algorithms, and how can the results feed back into the wordnets?
- Which ethical policies should be followed? (See, for instance, https://www.excavating.ai/)
Identify, Describe and Share your LRs!
Describing your language resources (LRs) in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences). To continue the efforts initiated at LREC 2014 on “Sharing LRs” (data, tools, web-services, etc.), authors will have the possibility, when submitting a paper, to upload LRs to a special LREC repository. This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to the creation of a common repository where everyone can deposit and share data.
Scientific work requires accurate citation of referenced work so that the community can understand the whole context and replicate the experiments conducted by other researchers. LREC 2020 therefore endorses the need to uniquely identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC papers will be offered at submission time.