And one more! The finalists for Best Academic Research are Vishnu Nair, Ian Gauk, Kathrin Gerling, Anna Chen, and Jesse J Martinez.
This category sits outside the usual voting process, but you can find links to their papers at https://t.co/kheOO6mKbq
#gamedev #accessibility
GlitchBench: Can large multimodal models detect video game glitches?
paper page: https://t.co/jWLZ05qXuh
Large multimodal models (LMMs) have evolved from large language models (LLMs) to integrate multiple input modalities, such as visual inputs. This integration augments the capacity of LLMs for tasks requiring visual comprehension and reasoning. However, the extent and limitations of their enhanced abilities are not fully understood, especially when it comes to real-world tasks. To address this gap, we introduce GlitchBench, a novel benchmark derived from video game quality assurance tasks, to test and evaluate the reasoning capabilities of LMMs. Our benchmark is curated from a variety of unusual and glitched scenarios from video games and aims to challenge both the visual and linguistic reasoning powers of LMMs in detecting and interpreting out-of-the-ordinary events. We evaluate multiple state-of-the-art LMMs, and we show that GlitchBench presents a new challenge for these models.
Excited to share GlitchBench! 🚀 It is a new benchmark designed specifically for large multimodal models. GlitchBench sets a new standard by incorporating tasks from actual game quality assurance scenarios 🎮, bringing real-world challenges into focus.
#AI #MachineLearning #GameDev
ArXiv: https://t.co/GOn2te5eRs
Project Website: https://t.co/HgiSqdqbGH
Hugging Face 🤗 Dataset: https://t.co/s484vTEThi
Leaderboard 🏆: https://t.co/gsUYgZrbd4
How to score > 90% on ImageNet?
Our new study on the spatial biases of ImageNet and relevant ImageNet-scale, OOD benchmarks reveals that all common image classifiers tested can score > 90%, if the model looks at the correct crop, i.e.,
⭐️ Zoom 🔎 is all you need! ⭐️ 1/n
Super congrats to @finlaymacklon, @taesiri, Stefan, and @viggiato, who had their paper “Automatically Detecting Visual Bugs in HTML5 <canvas> Games” accepted at @ASE_conf! Preprint available at https://t.co/I7gxK4aZo2 (collaboration with @ProdigyGame)
Ad for our teaching professor position is now out: https://t.co/XZuMqfq8r2
100%, permanent from start, min. 25% of worktime reserved for research (can be increased with grants).
Hit me up if you want to know more, and RTs appreciated ;)
@viggiato's paper “Identifying Similar Test Cases That Are Specified in Natural Language” was accepted for publication in the Transactions on Software Engineering journal! Preprint available at https://t.co/6DXqkUlfZ7 (with @ProdigyGame)
Tomorrow brings day 2 of @ICPEconf #ICPE2022 and a session I've been looking forward to: The #DataChallenge!
Organized by @corpaul, @swy351, and myself, and using a dataset from @MongoDB, we invited participants to do something cool with the dataset. And they have! /1
Ever wondered how the performance of #Serverless applications changes WITHOUT code changes?
Over 10 months, we observed significant changes on #AWS in our @JSSoftware paper "A case study on the stability of performance tests for serverless applications"
https://t.co/6qvNPcKrTh
@taesiri and Finlay's paper "CLIP meets GamePhysics: Towards bug identification in gameplay videos using zero-shot transfer learning" was accepted at @msrconf! Preprint available at https://t.co/nKDPyWUL4E
CLIP meets GamePhysics: Towards bug identification in gameplay videos using zero-shot transfer learning
abs: https://t.co/uf9PEdI9U4
project page: https://t.co/8ZReEsRrFD
Game developers! We are running an anonymous research survey on the current practices, goals, and needs for quality assurance in #gamedev and would love your input. Over $4,000 in random draw prizes. Please spread the word! https://t.co/cvZfMpSalt
Finally, @HaoLi24342250's paper "An Empirical Study of Yanked Releases in the Rust Package Registry" was accepted in the @computersociety TSE journal! Preprint available @ https://t.co/CvTAyStumz
Also, (Twitter-anonymous) Mikael's paper "Studying the Performance Risks of Upgrading Docker Hub Images: A Case Study of WordPress" was accepted at ICPE 2022! Preprint available @ https://t.co/HUHZ2gzAow
Some great news from the @asgaard_lab! @viggiato's paper “Using Natural Language Processing Techniques to Improve Manual Test Case Descriptions” was accepted in ICSE-SEIP 2022! Preprint available @ https://t.co/FzuruQliHZ
Great to see @gvwilson's summary of our work on bad practices in Java benchmarking! Work done in collaboration with @xLeitix, @corpaul, and Artur Andrzejak. :)
Mikael and Chloe's systematic literature survey on Applications of Generative Adversarial Networks in Anomaly Detection is available now on arXiv: https://t.co/T8mwxPMFtd