🧵 "Show me a dress like this but in blue and shorter." Composed image retrieval is a hard, real-world problem. #NTCIR19 CAMEO benchmarks it using 369,000+ reviews from a real Vietnamese e-commerce platform. (1/4)
The ViEcomRec dataset is built from a real Vietnamese e-commerce platform, with product images, metadata, and 369K+ Vietnamese customer reviews. Retrieved images are pooled TREC-style across submissions and judged by human assessors. (3/4)
🧵 What if you could search your own memories? That's the vision behind #NTCIR19 Lifelog-7. The task features four subtasks, including the new CASTLE collection with audio, video, chat transcripts, and interaction logs supporting three retrieval challenges. (1/5)
The dry run is Feb 20 – May 1. Formal run: Apr 20 – Aug 1. Registration deadline July 10. This is the 7th edition of the NTCIR Lifelog series. (4/5)
🧵 "Table 1 shows our method outperforms all baselines." Does it? #NTCIR19 SciClaimEval is a benchmark for verifying scientific claims against the actual tables and figures in papers. (1/5)
Published baselines: o4-mini achieves 82.3% accuracy but only 68.2% pair accuracy on subtask 1 and 85.2% on subtask 2. Qwen3-VL-30B also tested. Dataset is live on HuggingFace now. (4/5)
https://t.co/yZibKcMaaX
🧵 An AI radiologist writes "pneumonia." But what findings led there? AI usually skips that reasoning. #NTCIR19 HIDDEN-RAD2 challenges systems to make it explicit, and now to catch their own mistakes too. (1/5)
🧵 Advances in AI have led to a proliferation of pre-trained models across many domains. But selecting the right model for a new task remains costly and time-consuming. #NTCIR19 ModelRetrieval frames this challenge as an information retrieval problem. (1/4)
You may (or may not) actually fine-tune models; any approach that predicts rankings without exhaustive experimentation is valid. But your methodology must be documented in your paper. Ground truth: organizer-run fine-tuning results. (3/4)