@trajektoriePL @AlexObuchowski @mar_kar_ @rzeczpospolita Well. It turns out that, because of the popularity of LLMs, research in computational linguistics has already landed in the "American scientists have discovered..." category.
A joint study by @poznanAI researchers and Samsung Electronics Polska engineers was presented at @FedCSIS 2024. The paper investigates the impact of augmenting spoken language corpora with domain-specific synthetic samples. https://t.co/NhkDOc7xI2
Today was the last day of workshops at the @aclmeeting conference. Yesterday at @wassa_ws, @PSkorzewski and @piotrjablo represented our department with the paper "POLygraph: Polish Fake News Dataset". Check out the full paper at https://t.co/6oVaclxz9a
The preprint of our paper "Two Approaches to Diachronic Normalization of Polish Texts" accepted to LaTeCH-CLfL 2024 is now available at https://t.co/SHvyyMRTB5 #NLProc #DH
In exactly 20 minutes, @marekkubis, @PSkorzewski, @tzietkiewicz and Marcin Sowański will speak about Back Transcription as a Method for Evaluating Robustness of NLU Models to Speech Recognition Errors. Join us online or in person. We start at 11:00 am CET.
https://t.co/jvRwpvJUuj
We are taking part in the aUPaEU workshop in Turin, Italy, devoted to presenting the concept of the Agora. We are part of a team developing tools for collecting and searching information to support effective cooperation among scientists and HEIs in Europe. @WideningEU @poznanAI
Our work (co-authors: @PSkorzewski, Marcin Sowański, @tzietkiewicz) on using back transcription for evaluating robustness of #NLU models to speech recognition errors was featured on the Samsung Research Blog https://t.co/tB8B5PkvEI #AI #NLProc @poznanAI @samsungresearch
A joint study by @UAM_Poznan researchers and Samsung Electronics Poland engineers on evaluating robustness of #NLU models to speech recognition errors was presented at #EMNLP2023 by @marekkubis and @tzietkiewicz https://t.co/bPTMuUX65L
The robustness criteria that we formulate are then used to construct a model for detecting speech recognition errors that impact the NLU model in the most significant way.
In contrast to conventional adversarial attacks, which aim to find samples that degrade the performance of the model under study, our method also takes into consideration samples that change the NLU outcome in other ways.
The augmented dataset is used to evaluate natural language understanding models and the outcomes of the evaluation serve as a basis for defining the criteria of NLU model robustness.
The method that we propose relies on the use of back transcription, a procedure that combines a text-to-speech model with an automatic speech recognition system to prepare a dataset contaminated with speech recognition errors.
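A rough sketch of the back transcription idea, not the implementation from the paper: each reference text is synthesized to audio, recognized back to text, and the NLU outcome on both versions is compared. All names here (`evaluate_back_transcription`, `toy_tts`, `toy_asr`, `toy_nlu`) are hypothetical stand-ins; a real setup would plug in actual TTS, ASR and NLU models.

```python
from typing import Callable, Dict, List

def evaluate_back_transcription(
    texts: List[str],
    tts: Callable[[str], bytes],   # text -> audio (assumed interface)
    asr: Callable[[bytes], str],   # audio -> text (assumed interface)
    nlu: Callable[[str], str],     # text -> NLU outcome, e.g. an intent label
) -> List[Dict[str, object]]:
    """Run text -> TTS -> ASR and record whether the NLU outcome changes."""
    rows = []
    for reference in texts:
        hypothesis = asr(tts(reference))  # the back-transcribed text
        rows.append({
            "reference": reference,
            "hypothesis": hypothesis,
            # Any change of outcome is recorded, not only a drop in accuracy.
            "changed": nlu(reference) != nlu(hypothesis),
        })
    return rows

# Toy stand-ins so the sketch runs without any speech models.
def toy_tts(text: str) -> bytes:
    return text.encode("utf-8")  # pretend the bytes are audio

def toy_asr(audio: bytes) -> str:
    # Simulate a single recognition error: "lights" is misheard as "likes".
    return audio.decode("utf-8").replace("lights", "likes")

def toy_nlu(text: str) -> str:
    return "lights_on" if "lights" in text else "unknown"

rows = evaluate_back_transcription(
    ["turn on the lights", "play some music"], toy_tts, toy_asr, toy_nlu
)
for row in rows:
    print(row)
```

Rows flagged as `changed` are the samples where a recognition error affected the NLU result.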
The preprint of our paper "Back Transcription as a Method for Evaluating Robustness of Natural Language Understanding Models to Speech Recognition Errors" accepted to #EMNLP2023 is now available at https://t.co/1PoOzgPd3V #NLProc #VoiceAI #AI