Recognizing emotions in conversation is a task where existing pre-trained language models encounter several difficulties. They are not adapted to multi-party dialogues and do not encode intra- and inter-speaker dependencies. Moreover, their input length is constrained, so information in distant historical utterances may be lost. A recent study tries to overcome these problems.


Illustration by Lidya Nada on Unsplash, free licence

The proposed model uses a flexible and memory-saving utterance-level recurrence to store the hidden states of historical utterances and reuse them when identifying the emotion of a query utterance. Instead of computing vanilla attention weights between words, it applies dialog-aware self-attention, which differentiates reception fields and party roles through four variants: local self-attention, global self-attention, speaker self-attention, and listener self-attention. Extensive experiments show that the suggested model outperforms all the baselines.
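To make the four attention variants concrete, here is a minimal sketch of how such dialog-aware masks could be built over the utterances a query attends to. The mask semantics (window size, letting the query see itself under listener attention) are illustrative assumptions, not the exact DialogXL formulation:

```python
import numpy as np

def dialog_attention_masks(speakers, query_idx, window=2):
    """Illustrative boolean masks over historical utterances for one query.

    speakers:  list of speaker ids, one per utterance (history + query).
    query_idx: index of the query utterance.
    window:    local reception-field size (hypothetical parameter).
    """
    n = query_idx + 1                      # query attends to itself and its history
    idx = np.arange(n)
    global_mask = np.ones(n, dtype=bool)                # whole history visible
    local_mask = idx >= max(0, query_idx - window)      # only a recent window
    same_speaker = np.array([speakers[i] == speakers[query_idx] for i in idx])
    speaker_mask = same_speaker.copy()                  # intra-speaker dependencies
    listener_mask = ~same_speaker                       # inter-speaker dependencies
    listener_mask[query_idx] = True  # assumption: the query always sees itself
    return {"global": global_mask, "local": local_mask,
            "speaker": speaker_mask, "listener": listener_mask}

# Toy two-party dialog: speakers A and B alternate; utterance 4 is the query.
masks = dialog_attention_masks(["A", "B", "A", "B", "A"], query_idx=4, window=2)
```

Here speaker self-attention keeps only A's own utterances (positions 0, 2, 4), listener self-attention keeps B's turns, and local self-attention restricts the view to the two most recent utterances plus the query.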

This paper presents our pioneering effort for emotion recognition in conversation (ERC) with pre-trained language models. Unlike regular documents, conversational utterances appear alternately from different parties and are usually organized as hierarchical structures in previous work. Such structures are not conducive to the application of pre-trained language models such as XLNet. To address this issue, we propose an all-in-one XLNet model, namely DialogXL, with enhanced memory to store longer historical context and dialog-aware self-attention to deal with the multi-party structures. Specifically, we first modify the recurrence mechanism of XLNet from segment-level to utterance-level in order to better model the conversational data. Second, we introduce dialog-aware self-attention in replacement of the vanilla self-attention in XLNet to capture useful intra- and inter-speaker dependencies. Extensive experiments are conducted on four ERC benchmarks with mainstream models presented for comparison. The experimental results show that the proposed model outperforms the baselines on all the datasets. Several other experiments such as ablation study and error analysis are also conducted and the results confirm the role of the critical modules of DialogXL.
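The switch from segment-level to utterance-level recurrence means hidden states are cached once per finished utterance rather than per fixed-length segment. A minimal sketch of such a memory update, with a simple "keep the most recent states" truncation policy that is an assumption of this illustration (the paper's memory management may differ):

```python
import numpy as np

def update_utterance_memory(memory, utterance_states, max_mem=6):
    """Append a finished utterance's hidden states to the rolling memory.

    memory:           array of shape (m, d), or None before the first utterance.
    utterance_states: array of shape (t, d) for the newly encoded utterance.
    max_mem:          cap on cached state vectors (hypothetical parameter).
    """
    if memory is None:
        new_mem = utterance_states
    else:
        new_mem = np.concatenate([memory, utterance_states], axis=0)
    return new_mem[-max_mem:]  # keep only the most recent max_mem states

# Toy usage: a 3-token utterance followed by a 4-token one, with max_mem=6.
mem = update_utterance_memory(None, np.zeros((3, 4)))
mem = update_utterance_memory(mem, np.ones((4, 4)))
```

After the second update the memory holds six state vectors: the oldest token's state has been dropped, so the two remaining zero rows precede the four rows from the newest utterance.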

Link: https://arxiv.org/abs/2012.08695


