| Domain | Title | Authors | Summary | Link |
| --- | --- | --- | --- | --- |
| Computer Vision | OpenBias: Open-set Bias Detection in Text-to-Image Generative Models | Moreno D'Incà, Elia Peruzzo, Massimiliano Mancini, Dejia Xu, Vidit Goel, Xingqian Xu, Zhangyang Wang, Humphrey Shi, Nicu Sebe | The paper presents OpenBias, a pipeline for open-set bias detection in text-to-image generative models. It uses a Large Language Model (LLM) to propose biases given a set of captions, generates images using the same captions, and uses a Vision Question Answering model to recognize the biases (see the first sketch below). | Link |
| AI & Marketing | Manipulating Large Language Models to Increase Product Visibility | Aounon Kumar, Himabindu Lakkaraju | The authors investigate whether recommendations from LLMs can be manipulated to enhance a product's visibility. They demonstrate that adding a strategic text sequence to a product's information page can significantly increase its likelihood of being listed as the LLM's top recommendation (see the second sketch below). | Link |
| AI & Data Processing | LLoCO: Learning Long Contexts Offline | Sijun Tan, Xiuyu Li, Shishir Patil, Ziyang Wu, Tianjun Zhang, Kurt Keutzer, Joseph E. Gonzalez, Raluca Ada Popa | The paper presents LLoCO, a technique that combines context compression, retrieval, and parameter-efficient finetuning using LoRA. It extends the effective context window of a 4k-token LLaMA2-7B model to handle up to 128k tokens (see the third sketch below). | Link |
| AI & HCI | OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments | Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu | The authors introduce OSWorld, a scalable, real computer environment for multimodal agents, supporting task setup, execution-based evaluation, and interactive learning across various operating systems. | Link |
| AI & Computer Vision | Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models | Haotian Zhang, Haoxuan You, Philipp Dufter, Bowen Zhang, Chen Chen, Hong-You Chen, Tsu-Jui Fu, William Yang Wang, Shih-Fu Chang, Zhe Gan, Yinfei Yang | The paper presents Ferret-v2, an upgrade to Ferret, with three key designs: any-resolution grounding and referring, multi-granularity visual encoding, and a three-stage training paradigm. | Link |
| AI & Online Safety | Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation | Jinkyung Park, Pamela Wisniewski, Vivek Singh | The authors discuss the potential of LLMs as interactive research tools that let human coders and AI collaborate to annotate online risk data effectively at scale. | Link |
| AI & Language Processing | LaVy: Vietnamese Multimodal Large Language Model | Chi Tran, Huong Le Thanh | The authors introduce LaVy, a state-of-the-art Vietnamese Multimodal Large Language Model, and LaVy-Bench, a benchmark for evaluating MLLMs' understanding of Vietnamese visual language tasks. | Link |
| AI & Security | AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs | Zeyi Liao, Huan Sun | The authors present AmpleGCG, a generative model of adversarial suffixes trained on successful suffixes collected during GCG optimization, which can rapidly produce large numbers of universal, transferable jailbreak suffixes for both open and closed LLMs. | Link |
| AI & Ethics | High-Dimension Human Value Representation in Large Language Models | Samuel Cahyawijaya, Delong Chen, Yejin Bang, Leila Khalatbari, Bryan Wilie, Ziwei Ji, Etsuko Ishii, Pascale Fung | The authors propose UniVaR, a high-dimensional representation of human value distributions in LLMs, to understand the scope and nature of human values injected into these models before their release. | Link |
| AI & Translation | Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations | Dayeon Ki, Marine Carpuat | The authors exploit the complementary strengths of LLMs and supervised MT by guiding LLMs to automatically post-edit MT with external feedback on its quality, derived from Multidimensional Quality Metric (MQM) annotations (see the fourth sketch below). | Link |
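
The OpenBias row describes a three-stage loop: an LLM proposes candidate biases from captions, the text-to-image model under audit renders those captions, and a VQA model answers the bias questions. Below is a minimal Python sketch of that loop; `propose_biases`, `generate_image`, and `vqa_answer` are hypothetical stand-ins for the paper's components, not its actual API.

```python
# Minimal sketch of an OpenBias-style open-set bias detection loop.
# The three helpers are hypothetical stand-ins for the paper's components.

from collections import Counter

def propose_biases(caption: str) -> list[dict]:
    """Hypothetical: ask an LLM to propose candidate biases for a caption.

    Each proposal pairs a question with candidate answers, e.g.
    {"bias": "gender", "question": "What is the person's gender?",
     "answers": ["male", "female", "unknown"]}.
    """
    raise NotImplementedError  # replace with an actual LLM call

def generate_image(caption: str):
    """Hypothetical: render the caption with the text-to-image model under audit."""
    raise NotImplementedError

def vqa_answer(image, question: str, answers: list[str]) -> str:
    """Hypothetical: ask a VQA model the bias question about the image."""
    raise NotImplementedError

def detect_biases(captions: list[str], images_per_caption: int = 4) -> dict:
    """Tally VQA answers per proposed bias; a heavily skewed tally suggests a bias."""
    tallies: dict[str, Counter] = {}
    for caption in captions:
        for proposal in propose_biases(caption):
            counter = tallies.setdefault(proposal["bias"], Counter())
            for _ in range(images_per_caption):
                image = generate_image(caption)
                counter[vqa_answer(image, proposal["question"], proposal["answers"])] += 1
    return tallies
```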
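For the product-visibility paper, the measurable effect is a rank shift once a strategic text sequence is appended to one product's page. The sketch below shows only that before/after comparison, assuming some already-found sequence; the adversarial optimization that produces the sequence is not shown, and `llm_recommend` is a hypothetical callable that returns product names ranked best-first.

```python
# Sketch of measuring how a "strategic text sequence" (STS) changes an LLM
# recommender's ranking. `llm_recommend` is a hypothetical stand-in for a
# call to the recommender LLM; it returns a list of product names, best first.

def rank_of_product(llm_recommend, product_pages: dict[str, str], target: str) -> int:
    """Return the 1-based rank of `target` in the LLM's recommendation list."""
    ranking = llm_recommend(product_pages)
    return ranking.index(target) + 1  # raises ValueError if target is unlisted

def visibility_lift(llm_recommend, pages: dict[str, str], target: str, sts: str) -> tuple[int, int]:
    """Compare the target's rank before and after injecting the sequence."""
    before = rank_of_product(llm_recommend, pages, target)
    edited = dict(pages)
    edited[target] = pages[target] + " " + sts  # append the strategic text sequence
    after = rank_of_product(llm_recommend, edited, target)
    return before, after
```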
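The LLoCO row combines three ingredients; the one that is easy to make concrete is the parameter-efficient finetuning side. The sketch below wires LoRA adapters onto LLaMA-2-7B with the Hugging Face `peft` library. It illustrates only the LoRA step, not the paper's context compression or retrieval stages, and the hyperparameters are illustrative assumptions rather than the paper's settings.

```python
# Sketch of the LoRA side of an LLoCO-style setup: attach low-rank adapters
# to LLaMA-2-7B so it can be finetuned cheaply (e.g., to consume compressed
# context representations). Uses the Hugging Face `peft` library.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

config = LoraConfig(
    r=8,                                  # assumed low-rank update dimension
    lora_alpha=16,                        # assumed scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Keeping the base model frozen and training only the adapters is what makes per-context finetuning affordable enough to do offline for each long document.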
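Finally, the MQM post-editing row reduces to a prompting pattern: hand the LLM the source, the MT output, and the annotated error spans, and ask it to fix only those errors. The sketch below builds such a prompt; the annotation format is a plausible simplification of MQM, and `call_llm` is a hypothetical stand-in for any chat-completion API.

```python
# Sketch of prompting an LLM to post-edit an MT output using MQM-style
# error annotations, in the spirit of Ki & Carpuat.

def build_postedit_prompt(source: str, translation: str, mqm_errors: list[dict]) -> str:
    """Format source, MT output, and MQM annotations into one instruction."""
    error_lines = "\n".join(
        f'- span "{e["span"]}": {e["category"]} ({e["severity"]})'
        for e in mqm_errors
    )
    return (
        "Improve the translation by fixing only the annotated errors.\n"
        f"Source: {source}\n"
        f"Translation: {translation}\n"
        f"MQM error annotations:\n{error_lines}\n"
        "Post-edited translation:"
    )

prompt = build_postedit_prompt(
    source="Le chat dort sur le canapé.",
    translation="The cat sleeps on the coach.",
    mqm_errors=[{"span": "coach", "category": "mistranslation", "severity": "major"}],
)
# call_llm(prompt)  # hypothetical: send to whichever post-editing LLM is used
```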