mtuann/llm-updated-papers


Table of Contents

  1. Large Language Models Papers
  2. Other Topics
  3. Large Language Models Papers with Code

Large Language Models Papers

This GitHub repository contains an updated list of Large Language Models papers as of February 24, 2025.

  • The resources are collected from various sources, including arXiv, NeurIPS, ICML, ICLR, ACL, EMNLP, AAAI, IJCAI, KDD, CVPR, ICCV, ECCV, IEEE, ACM, Springer, ScienceDirect, Wiley, Nature, Science, and other top AI/ML conferences and journals.
  • For a better reading experience, visit the Shinyapps website.

Other Topics

Explore additional research papers on the following topics:


For contributions, inquiries, or suggestions, feel free to reach out via email.


If you find this application helpful and would like to support its development, you can buy me a coffee using one of the following methods:


Large Language Models Papers with Code

Due to GitHub repository limitations, this section includes only those papers that provide accompanying code, sorted by publication date. For access to the full list of papers, please visit the Shinyapps website.
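The selection rule described above (keep only papers that link to code, newest first) can be sketched roughly as follows. This is a minimal illustration, not the script used to build this list: the `Paper` class, its field names, and the `papers_with_code` helper are assumptions, and the third sample record is hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Paper:
    # Hypothetical record layout mirroring the table columns.
    title: str
    published: date
    code_url: Optional[str]  # None when the paper has no accompanying code


def papers_with_code(papers: list[Paper]) -> list[Paper]:
    """Keep only papers that link to code, ordered newest first."""
    with_code = [p for p in papers if p.code_url]
    return sorted(with_code, key=lambda p: p.published, reverse=True)


# The first two records come from the table in this section;
# the third is a made-up entry without code, added to show the filter.
papers = [
    Paper(
        "DataSciBench: An LLM Agent Benchmark for Data Science",
        date(2025, 2, 19),
        "https://github.com/THUDM/DataSciBench",
    ),
    Paper(
        "Dynamic Low-Rank Sparse Adaptation for Large Language Models",
        date(2025, 2, 22),
        "https://github.com/wzhuang-xmu/LoSA",
    ),
    Paper("Hypothetical paper without code", date(2025, 2, 23), None),
]

for paper in papers_with_code(papers):
    print(paper.published.isoformat(), paper.title, paper.code_url)
```

Sorting on a `date` object rather than on the raw string keeps the ordering correct even if the date format in the source data changes.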


No. Title Authors Publish Date Venue Code URL
1 CER: Confidence Enhanced Reasoning in LLMs Ali Razghandi, Seyed Mohammad Hadi Hosseini, Mahdieh Soleymani Baghshah 2025-02-22 arXiv …, 2025 https://github.com/ http://arxiv.org/abs/2502.14634v1
2 Dynamic Low-Rank Sparse Adaptation for Large Language Models Weizhong Huang, Yuxin Zhang, Xiawu Zheng, Yang Liu, Jing Lin, Yiwu Yao, Rongrong Ji 2025-02-22 arXiv …, 2025 https://github.com/wzhuang-xmu/LoSA http://arxiv.org/abs/2502.14816v1
3 A General Pseudonymization Framework for Cloud-Based LLMs: Replacing Privacy Information in Controlled Text Generation Shilong Hou, Ruilin Shang, Zi Long, Xianghua Fu, Yin Chen 2025-02-21 arXiv https://github.com/Mebymeby/Pseudonymization-Framework http://arxiv.org/abs/2502.15233v1
4 On the logical skills of large language models: evaluations using arbitrarily complex first-order logic problems Shokhrukh Ibragimov, Arnulf Jentzen, Benno Kuckuck 2025-02-21 arXiv:2502.14180, 2025 https://github.com/bkuckuck/logical-skills-of-llms http://arxiv.org/abs/2502.14180v1
5 Transfer-Prompting: Enhancing Cross-Task Adaptation in Large Language Models via Dual-Stage Prompts Optimization Yupeng Chang, Yi Chang, Yuan Wu 2025-02-21 arXiv:2502.14211, 2025 https://github.com/llm172/Transfer-Prompting http://arxiv.org/abs/2502.14211v1
6 Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models Ya Wang, Zhijian Zhuo, Yutao Zeng, Xun Zhou, Jian Yang, Xiaoqing Li 2025-02-21 arXiv https://github.com/kaihemo/SDD http://arxiv.org/abs/2502.15499v1
7 STeCa: Step-level Trajectory Calibration for LLM Agent Learning Hanlin Wang, Jian Wang, Chak Tou Leong, Wenjie Li 2025-02-21 arXiv:2502.14276, 2025 https://github.com/WangHanLinHenry/STeCa http://arxiv.org/abs/2502.14276v1
8 Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing Qi Le, Enmao Diao, Ziyan Wang, Xinran Wang, Jie Ding, Li Yang, Ali Anwar 2025-02-21 arXiv https://github.com/Qi-Le1/Probe_Pruning http://arxiv.org/abs/2502.15618v1
9 PredictaBoard: Benchmarking LLM Score Predictability Lorenzo Pacchiardi, Konstantinos Voudouris, Ben Slater, Fernando Martínez-Plumed, José Hernández-Orallo, Lexin Zhou, Wout Schellaert 2025-02-21 arXiv …, 2025 https://github.com/Kinds-of-Intelligence-CFI/PredictaBoard http://arxiv.org/abs/2502.14445v1
10 Plan-over-Graph: Towards Parallelable LLM Agent Schedule Shiqi Zhang, Xinbei Ma, Zouying Cao, Zhuosheng Zhang, Hai Zhao 2025-02-21 arXiv:2502.14563, 2025 https://github.com/zsq259/Plan-over-Graph http://arxiv.org/abs/2502.14563v1
11 Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs Giulio Zizzo, Giandomenico Cornacchia, Kieran Fraser, Muhammad Zaid Hameed, Ambrish Rawat, Beat Buesser, Mark Purcell, Pin-Yu Chen, Prasanna Sattigeri, Kush Varshney 2025-02-21 arXiv https://github.com/IBM/Adversarial-Prompt-Evaluation http://arxiv.org/abs/2502.15427v1
12 Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs Danni Liu, Jan Niehues 2025-02-21 arXiv:2502.14830, 2025 https://github.com/dannigt/mid-align http://arxiv.org/abs/2502.14830v1
13 LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention Shang Yang, Junxian Guo, Haotian Tang, Qinghao Hu, Guangxuan Xiao, Jiaming Tang, Yujun Lin, Zhijian Liu, Yao Lu, Song Han 2025-02-21 arXiv …, 2025 https://github.com/mit-han-lab/omniserve http://arxiv.org/abs/2502.14866v1
14 Investigating the Adaptive Robustness with Knowledge Conflicts in LLM-based Multi-Agent Systems Tianjie Ju, Bowen Wang, Hao Fei, Mong-Li Lee, Wynne Hsu, Yun Li, Qianren Wang, Pengzhou Cheng, Zongru Wu, Zhuosheng Zhang, Gongshen Liu 2025-02-21 arXiv https://github.com/wbw625/MultiAgentRobustness http://arxiv.org/abs/2502.15153v1
15 From RAG to Memory: Non-Parametric Continual Learning for Large Language Models Bernal Jiménez Gutiérrez, Yiheng Shu, Weijian Qi, Sizhe Zhou, Yu Su 2025-02-21 arXiv:2502.14802, 2025 https://github.com/OSU-NLP-Group/HippoRAG http://arxiv.org/abs/2502.14802v1
16 FormalSpecCpp: A Dataset of C++ Formal Specifications created using LLMs Madhurima Chakraborty, Peter Pirkelbauer, Qing Yi 2025-02-21 arXiv https://github.com/MadhuNimmo/FormalSpecCpp http://arxiv.org/abs/2502.15217v1
17 CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models Zhenhong Zhou, Zherui Li, Jie Zhang, Yuanhe Zhang, Kun Wang, Yang Liu, Qing Guo 2025-02-21 arXiv …, 2025 https://github.com/zhrli324/Corba http://arxiv.org/abs/2502.14529v1
18 MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models Shrey Pandit, Jiawei Xu, Junyuan Hong, Zhangyang Wang, Tianlong Chen, Kaidi Xu, Ying Ding 2025-02-21 arXiv …, 2025 https://medhallu.github.io/ http://arxiv.org/abs/2502.14302v1
19 Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models Yeonjun In, Wonjoong Kim, Kanghoon Yoon, Sungchul Kim, Mehrab Tanjim, Kibum Kim, Chanyoung Park 2025-02-20 arXiv https://github.com/yeonjun-in/U-SafeBench http://arxiv.org/abs/2502.15086v1
20 DataSciBench: An LLM Agent Benchmark for Data Science Dan Zhang, Sining Zhoubian, Min Cai, Fengzu Li, Lekang Yang, Wei Wang, Tianjiao Dong, Ziniu Hu, Jie Tang, Yisong Yue 2025-02-19 arXiv https://github.com/THUDM/DataSciBench http://arxiv.org/abs/2502.13897v1
21 SIFT: Grounding LLM Reasoning in Contexts via Stickers Zihao Zeng, Xuyao Huang, Boxiu Li, Zhijie Deng 2025-02-19 arXiv https://github.com/zhijie-group/SIFT http://arxiv.org/abs/2502.14922v1
22 Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning Zenan Li, Zhaoyu Li, Wen Tang, Xian Zhang, Yuan Yao, Xujie Si, Fan Yang, Kaiyu Yang, Xiaoxing Ma 2025-02-19 arXiv https://github.com/Lizn-zn/NeqLIPS/ http://arxiv.org/abs/2502.13834v1
23 PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models Guangwei Li, Yuansen Zhang, Yinggui Wang, Shoumeng Yan, Lei Wang, Tao Wei 2025-02-19 arXiv https://github.com/ligw1998/PRIV-QA http://arxiv.org/abs/2502.13564v1
24 LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization Guanzheng Chen, Xin Li, Michael Qizhe Shieh, Lidong Bing 2025-02-19 arXiv https://github.com/DAMO-NLP-SG/LongPO http://arxiv.org/abs/2502.13922v2
25 Judging the Judges: A Collection of LLM-Generated Relevance Judgements Hossein A. Rahmani, Clemencia Siro, Mohammad Aliannejadi, Nick Craswell, Charles L. A. Clarke, Guglielmo Faggioli, Bhaskar Mitra, Paul Thomas, Emine Yilmaz 2025-02-19 arXiv https://llm4eval.github.io/LLMJudge-benchmark/ http://arxiv.org/abs/2502.13908v1
26 Lost in Sequence: Do Large Language Models Understand Sequential Recommendation? Sein Kim, Hongseok Kang, Kibum Kim, Jiwan Kim, Donghyun Kim, Minchul Yang, Kwangjin Oh, Julian McAuley, Chanyoung Park 2025-02-19 arXiv https://github.com/Sein-Kim/LLM-SRec http://arxiv.org/abs/2502.13909v2
27 Craw4LLM: Efficient Web Crawling for LLM Pretraining Shi Yu, Zhiyuan Liu, Chenyan Xiong 2025-02-19 arXiv https://github.com/cxcscmu/Crawl4LLM http://arxiv.org/abs/2502.13347v1
28 Collaborative Retrieval for Large Language Model-based Conversational Recommender Systems Yaochen Zhu, Chao Wan, Harald Steck, Dawen Liang, Yesu Feng, Nathan Kallus, Jundong Li 2025-02-19 arXiv https://github.com/yaochenzhu/CRAG http://arxiv.org/abs/2502.14137v1
29 $\mathtt{GeLLM^3O}$: Generalizing Large Language Models for Multi-property Molecule Optimization Vishal Dey, Xiao Hu, Xia Ning 2025-02-19 arXiv https://github.com/ninglab/GeLLMO http://arxiv.org/abs/2502.13398v1
30 Benchmarking LLMs for Political Science: A United Nations Perspective Yueqing Liang, Liangwei Yang, Chen Wang, Congying Xia, Rui Meng, Xiongxiao Xu, Haoran Wang, Ali Payani, Kai Shu 2025-02-19 arXiv https://github.com/yueqingliang1/UNBench http://arxiv.org/abs/2502.14122v1
31 ArtMentor: AI-Assisted Evaluation of Artworks to Explore Multimodal Large Language Models Capabilities Chanjin Zheng, Zengyi Yu, Yilin Jiang, Mingzi Zhang, Xunuo Lu, Jing Jin, Liteng Gao 2025-02-19 arXiv https://artmentor.github.io/ http://arxiv.org/abs/2502.13832v1
32 AI-Empowered Catalyst Discovery: A Survey from Classical Machine Learning Approaches to Large Language Models Yuanyuan Xu, Hanchen Wang, Wenjie Zhang, Lexing Xie, Yin Chen, Flora Salim, Ying Zhang, Justin Gooding, Toby Walsh 2025-02-19 arXiv https://github.com/LuckyGirl-XU/Awesome-Artificial-Intelligence-Empowered-Catalyst-Discovery http://arxiv.org/abs/2502.13626v1
33 Trust Me, I'm Wrong: High-Certainty Hallucinations in LLMs Adi Simhi, Itay Itzhak, Fazl Barez, Gabriel Stanovsky, Yonatan Belinkov 2025-02-18 arXiv https://github.com/technion-cs-nlp/Trust_me_Im_wrong http://arxiv.org/abs/2502.12964v1
34 Text2World: Benchmarking Large Language Models for Symbolic World Model Generation Mengkang Hu, Tianxing Chen, Yude Zou, Yuheng Lei, Qiguang Chen, Ming Li, Hongyuan Zhang, Wenqi Shao, Ping Luo 2025-02-18 arXiv https://text-to-world.github.io/ http://arxiv.org/abs/2502.13092v1
35 SparAMX: Accelerating Compressed LLMs Token Generation on AMX-powered CPUs Ahmed F. AbouElhamayed, Jordan Dotzel, Yash Akhauri, Chi-Chih Chang, Sameh Gobriel, J. Pablo Muñoz, Vui Seng Chua, Nilesh Jain, Mohamed S. Abdelfattah 2025-02-18 arXiv https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/SparAMX http://arxiv.org/abs/2502.12444v1
36 Soundwave: Less is More for Speech-Text Alignment in LLMs Yuhao Zhang, Zhiheng Liu, Fan Bu, Ruiyu Zhang, Benyou Wang, Haizhou Li 2025-02-18 arXiv https://github.com/FreedomIntelligence/Soundwave http://arxiv.org/abs/2502.12900v1
37 SEA: Low-Resource Safety Alignment for Multimodal Large Language Models via Synthetic Embeddings Weikai Lu, Hao Peng, Huiping Zhuang, Cen Chen, Ziqian Zeng 2025-02-18 arXiv https://github.com/ZeroNLP/SEA http://arxiv.org/abs/2502.12562v1
38 PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models Jiaqi Zhao, Miao Zhang, Ming Wang, Yuzhang Shang, Kaihao Zhang, Weili Guan, Yaowei Wang, Min Zhang 2025-02-18 arXiv https://github.com/zjq0455/PTQ1.61 http://arxiv.org/abs/2502.13179v1
39 MoBA: Mixture of Block Attention for Long-Context LLMs Enzhe Lu, Zhejun Jiang, Jingyuan Liu, Yulun Du, Tao Jiang, Chao Hong, Shaowei Liu, Weiran He, Enming Yuan, Yuzhi Wang, Zhiqi Huang, Huan Yuan, Suting Xu, Xinran Xu, Guokun Lai, Yanru Chen, Huabin Zheng, Junjie Yan, Jianlin Su, Yuxin Wu, Neo Y. Zhang, Zhilin Yang, Xinyu Zhou, Mingxing Zhang, Jiezhong Qiu 2025-02-18 arXiv https://github.com/MoonshotAI/MoBA http://arxiv.org/abs/2502.13189v1
40 Investigating and Extending Homans' Social Exchange Theory with Large Language Model based Agents Lei Wang, Zheqing Zhang, Xu Chen 2025-02-18 arXiv https://github.com/Paitesanshi/SET http://arxiv.org/abs/2502.12450v1
41 G-Refer: Graph Retrieval-Augmented Large Language Model for Explainable Recommendation Yuhan Li, Xinni Zhang, Linhao Luo, Heng Chang, Yuxiang Ren, Irwin King, Jia Li 2025-02-18 arXiv https://github.com/Yuhan1i/G-Refer http://arxiv.org/abs/2502.12586v1
42 Can LLM Watermarks Robustly Prevent Unauthorized Knowledge Distillation? Leyi Pan, Aiwei Liu, Shiyu Huang, Yijian Lu, Xuming Hu, Lijie Wen, Irwin King, Philip S. Yu 2025-02-17 arXiv https://github.com/THU-BPM/Watermark-Radioactivity-Attack http://arxiv.org/abs/2502.11598v1
43 VRoPE: Rotary Position Embedding for Video Large Language Models Zikang Liu, Longteng Guo, Yepeng Tang, Junxian Cai, Kai Ma, Xi Chen, Jing Liu 2025-02-17 arXiv https://github.com/johncaged/VRoPE http://arxiv.org/abs/2502.11664v1
44 Language Models Can See Better: Visual Contrastive Decoding For LLM Multimodal Reasoning Yuqi Pang, Bowen Yang, Haoqin Tu, Yun Cao, Zeyu Zhang 2025-02-17 arXiv https://github.com/Pbhgit/MVCD http://arxiv.org/abs/2502.11751v1
45 Idiosyncrasies in Large Language Models Mingjie Sun, Yida Yin, Zhiqiu Xu, J. Zico Kolter, Zhuang Liu 2025-02-17 arXiv https://eric-mingjie.github.io/llm-idiosyncrasies/index.html http://arxiv.org/abs/2502.12150v1
46 Code-Vision: Evaluating Multimodal LLMs Logic Understanding and Code Generation Capabilities Hanbin Wang, Xiaoxuan Zhou, Zhipeng Xu, Keyuan Cheng, Yuxin Zuo, Kai Tian, Jingwei Song, Junting Lu, Wenhui Hu, Xueyang Liu 2025-02-17 arXiv https://github.com/wanghanbinpanda/CodeVision http://arxiv.org/abs/2502.11829v1
47 A-MEM: Agentic Memory for LLM Agents Wujiang Xu, Zujie Liang, Kai Mei, Hang Gao, Juntao Tan, Yongfeng Zhang 2025-02-17 arXiv https://github.com/WujiangXu/AgenticMemory http://arxiv.org/abs/2502.12110v1
48 Bitnet.cpp: Efficient Edge Inference for Ternary LLMs Jinheng Wang, Hansong Zhou, Ting Song, Shijie Cao, Yan Xia, Ting Cao, Jianyu Wei, Shuming Ma, Hongyu Wang, Furu Wei 2025-02-17 arXiv https://github.com/microsoft/BitNet/tree/paper http://arxiv.org/abs/2502.11880v1
49 RIDE: Enhancing Large Language Model Alignment through Restyled In-Context Learning Demonstration Exemplars Yuncheng Hua, Lizhen Qu, Zhuang Li, Hao Xue, Flora D. Salim, Gholamreza Haffari 2025-02-17 arXiv https://github.com/AnonymousCode-ComputerScience/RIDE http://arxiv.org/abs/2502.11681v1
50 A Survey of Personalized Large Language Models: Progress and Future Directions Jiahong Liu, Zexuan Qiu, Zhongyang Li, Quanyu Dai, Jieming Zhu, Minda Hu, Menglin Yang, Irwin King 2025-02-17 arXiv https://github.com/JiahongLiu21/Awesome-Personalized-Large-Language-Models http://arxiv.org/abs/2502.11528v1
51 "Nuclear Deployed!": Analyzing Catastrophic Risks in Decision-making of Autonomous LLM Agents Rongwu Xu, Xiaojian Li, Shuo Chen, Wei Xu 2025-02-17 arXiv https://github.com/pillowsofwind/LLM-CBRN-Risks http://arxiv.org/abs/2502.11355v1
52 Atom of Thoughts for Markov LLM Test-Time Scaling Fengwei Teng, Zhaoyang Yu, Quan Shi, Jiayi Zhang, Chenglin Wu, Yuyu Luo 2025-02-17 arXiv https://github.com/qixucen/atom http://arxiv.org/abs/2502.12018v1
53 How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training Yixin Ou, Yunzhi Yao, Ningyu Zhang, Hui Jin, Jiacheng Sun, Shumin Deng, Zhenguo Li, Huajun Chen 2025-02-16 arXiv https://github.com/zjunlp/DynamicKnowledgeCircuits http://arxiv.org/abs/2502.11196v1
54 SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors Bohan Lyu, Siqiao Huang, Zichen Liang 2025-02-16 arXiv https://github.com/Imbernoulli/SURGE http://arxiv.org/abs/2502.11167v1
55 ReLearn: Unlearning via Learning for Large Language Models Haoming Xu, Ningyuan Zhao, Liming Yang, Sendong Zhao, Shumin Deng, Mengru Wang, Bryan Hooi, Nay Oo, Huajun Chen, Ningyu Zhang 2025-02-16 arXiv https://github.com/zjunlp/unlearn http://arxiv.org/abs/2502.11190v1
56 Ramp Up NTT in Record Time using GPU-Accelerated Algorithms and LLM-based Code Generation Yu Cui, Hang Fu, Licheng Wang, Haibin Zhang 2025-02-16 arXiv https://github.com/LMPC-Lab/GenGPUCrypto http://arxiv.org/abs/2502.11110v1
57 MasRouter: Learning to Route LLMs for Multi-Agent Systems Yanwei Yue, Guibin Zhang, Boyang Liu, Guancheng Wan, Kun Wang, Dawei Cheng, Yiyan Qi 2025-02-16 arXiv https://github.com/yanweiyue/masrouter http://arxiv.org/abs/2502.11133v1
58 G-Safeguard: A Topology-Guided Security Lens and Treatment on LLM-based Multi-agent Systems Shilong Wang, Guibin Zhang, Miao Yu, Guancheng Wan, Fanci Meng, Chongye Guo, Kun Wang, Yang Wang 2025-02-16 arXiv https://github.com/wslong20/G-safeguard http://arxiv.org/abs/2502.11127v1
59 Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models Haoyang Li, Xuejia Chen, Zhanchao XU, Darian Li, Nicole Hu, Fei Teng, Yiming Li, Luyu Qiu, Chen Jason Zhang, Qing Li, Lei Chen 2025-02-16 arXiv https://github.com/TreeAI-Lab/NumericBench http://arxiv.org/abs/2502.11075v1
60 CORDIAL: Can Multimodal Large Language Models Effectively Understand Coherence Relationships? Aashish Anantha Ramakrishnan, Aadarsh Anantha Ramakrishnan, Dongwon Lee 2025-02-16 arXiv https://github.com/aashish2000/CORDIAL http://arxiv.org/abs/2502.11300v1
61 Reasoning-Augmented Conversation for Multi-Turn Jailbreak Attacks on Large Language Models Zonghao Ying, Deyue Zhang, Zonglei Jing, Yisong Xiao, Quanchen Zou, Aishan Liu, Siyuan Liang, Xiangzheng Zhang, Xianglong Liu, Dacheng Tao 2025-02-16 arXiv https://github.com/NY1024/RACE http://arxiv.org/abs/2502.11054v1
62 BoT: Breaking Long Thought Processes of o1-like Large Language Models through Backdoor Attack Zihao Zhu, Hongbao Zhang, Mingda Zhang, Ruotong Wang, Guanzong Wu, Ke Xu, Baoyuan Wu 2025-02-16 arXiv https://github.com/zihao-ai/BoT http://arxiv.org/abs/2502.12202v1
63 Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey Zirui Song, Bin Yan, Yuhan Liu, Miao Fang, Mingzhe Li, Rui Yan, Xiuying Chen 2025-02-15 arXiv https://github.com/abilliyb/Knowledge_Injection_Survey_Papers http://arxiv.org/abs/2502.10708v1
64 SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models Daniel Fleischer, Moshe Berchansky, Gad Markovits, Moshe Wasserblat 2025-02-15 arXiv …, 2025 https://github.com/IntelLabs/RAG-FiT/tree/square http://arxiv.org/abs/2502.09390v1
65 LintLLM: An Open-Source Verilog Linting Framework Based on Large Language Models Zhigang Fang, Renzhi Chen, Zhijie Yang, Yang Guo, Huadong Dai, Lei Wang 2025-02-15 arXiv https://github.com/fangzhigang32/Static-Verilog-Analysis http://arxiv.org/abs/2502.10815v1
66 An Empirical Analysis of Uncertainty in Large Language Model Evaluations Qiujie Xie, Qingqiu Li, Zhuohao Yu, Yuejie Zhang, Yue Zhang, Linyi Yang 2025-02-15 arXiv https://github.com/hasakiXie123/LLM-Evaluator-Uncertainty http://arxiv.org/abs/2502.10709v1
67 EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents Rui Yang, Hanyang Chen, Junyu Zhang, Mark Zhao, Cheng Qian, Kangrui Wang, Qineng Wang, Teja Venkat Koripella, Marziyeh Movahedi, Manling Li, Heng Ji, Huan Zhang, Tong Zhang 2025-02-15 arXiv …, 2025 https://embodiedbench.github.io http://arxiv.org/abs/2502.09560v1
68 Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs Siyan Zhao, Mingyi Hong, Yang Liu, Devamanyu Hazarika, Kaixiang Lin 2025-02-15 arXiv …, 2025 https://prefeval.github.io/ http://arxiv.org/abs/2502.09597v1
69 KKA: Improving Vision Anomaly Detection through Anomaly-related Knowledge from Large Language Models Dong Chen, Zhengqing Hu, Peiguang Fan, Yueting Zhuang, Yafei Li, Qidong Liu, Xiaoheng Jiang, Mingliang Xu 2025-02-14 arXiv https://github.com/Anfeather/KKA http://arxiv.org/abs/2502.14880v1
70 Large Language Diffusion Models Shen Nie, Fengqi Zhu, Zebin You, Xiaolu Zhang, Jingyang Ou, Jun Hu, Jun Zhou, Yankai Lin, Ji-Rong Wen, Chongxuan Li 2025-02-14 arXiv https://ml-gsai.github.io/LLaDA-demo/ http://arxiv.org/abs/2502.09992v1
71 V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models Hsu-kuang Chiu, Ryo Hachiuma, Chien-Yi Wang, Stephen F. Smith, Yu-Chiang Frank Wang, Min-Hung Chen 2025-02-14 arXiv https://eddyhkchiu.github.io/v2vllm.github.io/ http://arxiv.org/abs/2502.09980v1
72 LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs - No Silver Bullet for LC or RAG Routing Kuan Li, Liwen Zhang, Yong Jiang, Pengjun Xie, Fei Huang, Shuai Wang, Minhao Cheng 2025-02-14 arXiv https://github.com/likuanppd/LaRA http://arxiv.org/abs/2502.09977v1
73 MM-RLHF: The Next Step Forward in Multimodal LLM Alignment Yi-Fan Zhang, Tao Yu, Haochen Tian, Chaoyou Fu, Peiyan Li, Jianshu Zeng, Wulin Xie, Yang Shi, Huanyu Zhang, Junkang Wu, Xue Wang, Yibo Hu, Bin Wen, Fan Yang, Zhang Zhang, Tingting Gao, Di Zhang, Liang Wang, Rong Jin, Tieniu Tan 2025-02-14 arXiv https://mm-rlhf.github.io/ http://arxiv.org/abs/2502.10391v1
74 Bag of Tricks for Inference-time Computation of LLM Reasoning Fan Liu, Wenshuo Chao, Naiqiang Tan, Hao Liu 2025-02-13 arXiv:2502.07191, 2025 https://github.com/usail-hkust/benchmark_inference_time_computation_LL http://arxiv.org/abs/2502.07191v2
75 FinRL-DeepSeek: LLM-Infused Risk-Sensitive Reinforcement Learning for Trading Agents Mostapha Benhenda 2025-02-13 arXiv:2502.07393, 2025 https://github.com/benstaf/FinRL_DeepSeek http://arxiv.org/abs/2502.07393v1
76 Ask Patients with Patience: Enabling LLMs for Human-Centric Medical Dialogue with Grounded Reasoning Jiayuan Zhu, Junde Wu 2025-02-13 arXiv:2502.07143, 2025 https://github.com/SuperMedIntel/AskPatients http://arxiv.org/abs/2502.07143v1
77 LLM-Generated Microservice Implementations from RESTful API Definitions Saurabh Chauhan, Zeeshan Rasheed, Abdul Malik Sami, Zheying Zhang, Jussi Rasku, Kai-Kristian Kemell, Pekka Abrahamsson 2025-02-13 arXiv https://github.com/sirbh/code-gen http://arxiv.org/abs/2502.09766v1
78 DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization Xuefeng Liu, Songhao Jiang, Siyu Chen, Zhuoran Yang, Yuxin Chen, Ian Foster, Rick Stevens 2025-02-13 arXiv …, 2025 https://github.com/xuefeng-cs/DrugImproverGPT http://arxiv.org/abs/2502.07237v1
79 The Hidden Dimensions of LLM Alignment: A Multi-Dimensional Safety Analysis Wenbo Pan, Zhichao Liu, Qiguang Chen, Xiangyang Zhou, Haining Yu, Xiaohua Jia 2025-02-13 arXiv https://github.com/BMPixel/safety-residual-space http://arxiv.org/abs/2502.09674v1
80 LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation Zican Dong, Junyi Li, Jinhao Jiang, Mingyu Xu, Wayne Xin Zhao, Bingning Wang, Weipeng Chen 2025-02-13 arXiv …, 2025 https://github.com/RUCAIBox/LongReD http://arxiv.org/abs/2502.07365v1
81 DarwinLM: Evolutionary Structured Pruning of Large Language Models Shengkun Tang, Oliver Sieberling, Eldar Kurtic, Zhiqiang Shen, Dan Alistarh 2025-02-13 arXiv …, 2025 https://github.com/IST-DASLab/DarwinLM http://arxiv.org/abs/2502.07780v1
82 LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters! Dacheng Li, Shiyi Cao, Tyler Griggs, Shu Liu, Xiangxi Mo, Eric Tang, Sumanth Hegde, Kourosh Hakhamaneshi, Shishir G. Patil, Matei Zaharia, Joseph E. Gonzalez, Ion Stoica 2025-02-13 arXiv …, 2025 https://github.com/NovaSky-AI/SkyThought http://arxiv.org/abs/2502.07374v2
83 Making Them a Malicious Database: Exploiting Query Code to Jailbreak Aligned Large Language Models Qingsong Zou, Jingyu Xiao, Qing Li, Zhi Yan, Yuhang Wang, Li Xu, Wenxuan Wang, Kuofeng Gao, Ruoyu Li, Yong Jiang 2025-02-13 arXiv https://github.com/horizonsinzqs/QueryAttack http://arxiv.org/abs/2502.09723v1
84 LawGPT: Knowledge-Guided Data Generation and Its Application to Legal LLM Zhi Zhou, Kun-Yang Yu, Shi-Yu Tian, Xiao-Wen Yang, Jiang-Xin Shi, Pengxiao Song, Yi-Xuan Jin, Lan-Zhe Guo, Yu-Feng Li 2025-02-12 arXiv …, 2025 https://github.com/LAMDASZ-ML/Knowledge-Guide-Data-Generation http://arxiv.org/abs/2502.06572v2
85 Systematic Outliers in Large Language Models Yongqi An, Xu Zhao, Tao Yu, Ming Tang, Jinqiao Wang 2025-02-12 arXiv:2502.06415, 2025 https://github.com/an-yongqi/systematic-outliers http://arxiv.org/abs/2502.06415v1
86 RALLRec: Improving Retrieval Augmented Large Language Model Recommendation with Representation Learning Jian Xu, Sichun Luo, Xiangyu Chen, Haoming Huang, Hanxu Hou, Linqi Song 2025-02-12 arXiv …, 2025 https://github.com/JianXu95/RALLRec http://arxiv.org/abs/2502.06101v2
87 Towards Zero-Shot Anomaly Detection and Reasoning with Multimodal Large Language Models Jiacong Xu, Shao-Yuan Lo, Bardia Safaei, Vishal M. Patel, Isht Dwivedi 2025-02-12 arXiv …, 2025 https://xujiacong.github.io/Anomaly-OV/ http://arxiv.org/abs/2502.07601v1
88 Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation Chengwen Qi, Ren Ma, Bowen Li, He Du, Binyuan Hui, Jinwang Wu, Yuanjun Laili, Conghui He 2025-02-12 arXiv …, 2025 https://github.com/opendatalab/ProverGen http://arxiv.org/abs/2502.06563v1
89 Data Augmentation to Improve Large Language Models in Food Hazard and Product Detection Areeg Fahad Rasheed, M. Zarkoosh, Shimam Amer Chasib, Safa F. Abbas 2025-02-12 arXiv https://github.com/AREEG94FAHAD/food-hazard-prdouct-cls http://arxiv.org/abs/2502.08687v1
90 Calibrating LLMs with Information-Theoretic Evidential Deep Learning Yawei Li, David Rügamer, Bernd Bischl, Mina Rezaei 2025-02-12 arXiv:2502.06351, 2025 https://github.com/sandylaker/ib-edl http://arxiv.org/abs/2502.06351v2
91 The Foundational Capabilities of Large Language Models in Predicting Postoperative Risks Using Clinical Notes Charles Alba, Bing Xue, Joanna Abraham, Thomas Kannampallil, Chenyang Lu 2025-02-11 npj Digital Medicine https://github.com/cja5553/LLMs_in_perioperative_care http://arxiv.org/abs/2402.17493v5
92 Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining Daouda Sow, Herbert Woisetschläger, Saikiran Bulusu, Shiqiang Wang, Hans-Arno Jacobsen, Yingbin Liang 2025-02-10 arXiv https://github.com/sowmaster/Sample-Level-Loss-Reweighting-ICLR-2025 http://arxiv.org/abs/2502.06733v1
93 LLMs in Software Security: A Survey of Vulnerability Detection Techniques and Insights Ze Sheng, Zhicheng Chen, Shuning Gu, Heqing Huang, Guofei Gu, Jeff Huang 2025-02-10 arXiv https://github.com/OwenSanzas/LLM-For-Vulnerability-Detection http://arxiv.org/abs/2502.07049v2
94 AutoAgent: A Fully-Automated and Zero-Code Framework for LLM Agents Jiabin Tang, Tianyu Fan, Chao Huang 2025-02-09 arXiv https://github.com/HKUDS/AutoAgent http://arxiv.org/abs/2502.05957v2
95 MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents Jiabin Tang, Tianyu Fan, Chao Huang 2025-02-09 arXiv https://github.com/HKUDS/MetaChain http://arxiv.org/abs/2502.05957v1
96 LLM-Powered Decentralized Generative Agents with Adaptive Hierarchical Knowledge Graph for Cooperative Planning Hanqing Yang, Jingdi Chen, Marie Siew, Tania Lorido-Botran, Carlee Joe-Wong 2025-02-08 arXiv https://happyeureka.github.io/damcs http://arxiv.org/abs/2502.05453v1
97 Learning Conformal Abstention Policies for Adaptive Risk Management in Large Language and Vision-Language Models Sina Tayebati, Divake Kumar, Nastaran Darabi, Dinithi Jayasuriya, Ranganath Krishnan, Amit Ranjan Trivedi 2025-02-08 arXiv https://github.com/sinatayebati/vlm-uncertainty http://arxiv.org/abs/2502.06884v1
98 OntoTune: Ontology-Driven Self-training for Aligning Large Language Models Zhiqiang Liu, Chengtao Gan, Junjie Wang, Yichi Zhang, Zhongpu Bo, Mengshu Sun, Huajun Chen, Wen Zhang 2025-02-08 arXiv https://github.com/zjukg/OntoTune http://arxiv.org/abs/2502.05478v1
99 DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails Yihe Deng, Yu Yang, Junkai Zhang, Wei Wang, Bo Li 2025-02-07 arXiv https://github.com/yihedeng9/DuoGuard http://arxiv.org/abs/2502.05163v1
100 QuEST: Stable Training of LLMs with 1-Bit Weights and Activations Andrei Panferov, Jiale Chen, Soroush Tabesh, Roberto L. Castro, Mahdi Nikdan, Dan Alistarh 2025-02-07 arXiv https://github.com/IST-DASLab/QuEST http://arxiv.org/abs/2502.05003v1
101 LLM-Supported Natural Language to Bash Translation Finnian Westenfelder, Erik Hemberg, Miguel Tulla, Stephen Moskal, Una-May O'Reilly, Silviu Chiricescu 2025-02-07 arXiv https://github.com/westenfelder/NL2SH http://arxiv.org/abs/2502.06858v1
102 Confidence Elicitation: A New Attack Vector for Large Language Models Brian Formento, Chuan Sheng Foo, See-Kiong Ng 2025-02-07 arXiv https://github.com/Aniloid2/Confidence_Elicitation_Attacks http://arxiv.org/abs/2502.04643v2
103 Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research Junde Wu, Jiayuan Zhu, Yuyuan Liu 2025-02-07 arXiv https://github.com/theworldofagents/Agentic-Reasoning http://arxiv.org/abs/2502.04644v1
104 ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning Yuwei Yin, Giuseppe Carenini 2025-02-07 arXiv https://github.com/YuweiYin/ARR http://arxiv.org/abs/2502.04689v2
105 ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization Yinjie Wang, Ling Yang, Guohao Li, Mengdi Wang, Bryon Aragam 2025-02-06 arXiv https://github.com/Gen-Verse/ScoreFlow http://arxiv.org/abs/2502.04306v1
106 "Short-length" Adversarial Training Helps LLMs Defend "Long-length" Jailbreak Attacks: Theoretical and Empirical Evidence Shaopeng Fu, Liang Ding, Di Wang 2025-02-06 arXiv https://github.com/fshp971/adv-icl http://arxiv.org/abs/2502.04204v1
107 Aggregate and conquer: detecting and steering LLM concepts by combining nonlinear predictors over multiple layers Daniel Beaglehole, Adityanarayanan Radhakrishnan, Enric Boix-Adserà, Mikhail Belkin 2025-02-06 arXiv https://github.com/dmbeaglehole/neural_controllers http://arxiv.org/abs/2502.03708v1
108 Beyond Prompt Content: Enhancing LLM Performance via Content-Format Integrated Prompt Optimization Yuanye Liu, Jiahang Xu, Li Lyna Zhang, Qi Chen, Xuan Feng, Yang Chen, Zhongxin Guo, Yuqing Yang, Peng Cheng 2025-02-06 arXiv https://github.com/HenryLau7/CFPO http://arxiv.org/abs/2502.04295v2
109 CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference Zehua Pei, Lancheng Zou, Hui-Ling Zhen, Xianzhi Yu, Wulong Liu, Sinno Jialin Pan, Mingxuan Yuan, Bei Yu 2025-02-06 arXiv https://github.com/JarvisPei/CMoE http://arxiv.org/abs/2502.04416v1
110 EmoBench-M: Benchmarking Emotional Intelligence for Multimodal Large Language Models He Hu, Yucheng Zhou, Lianzhong You, Hongbo Xu, Qianning Wang, Zheng Lian, Fei Richard Yu, Fei Ma, Laizhong Cui 2025-02-06 arXiv https://emo-gml.github.io/ http://arxiv.org/abs/2502.04424v1
111 My LLM might Mimic AAE -- But When Should it? Sandra C. Sandoval, Christabel Acquaye, Kwesi Cobbina, Mohammad Nayeem Teli, Hal Daumé III 2025-02-06 arXiv https://github.com/smelliecat/AAEMime http://arxiv.org/abs/2502.04564v2
112 Predicting Large Language Model Capabilities on Closed-Book QA Tasks Using Only Information Available Prior to Training Changhao Jiang, Ming Zhang, Junjie Ye, Xiaoran Fan, Yifei Cao, Jiajun Sun, Zhiheng Xi, Shihan Dou, Yi Dong, Yujiong Shen, Jingqi Tong, Zhen Wang, Tao Liang, Zhihui Fei, Mingyang Wan, Guojun Ma, Qi Zhang, Tao Gui, Xuanjing Huang 2025-02-06 arXiv https://github.com/yuhui1038/SMI http://arxiv.org/abs/2502.04066v1
113 Robotouille: An Asynchronous Planning Benchmark for LLM Agents Gonzalo Gonzalez-Pumariega, Leong Su Yean, Neha Sunkara, Sanjiban Choudhury 2025-02-06 arXiv https://github.com/portal-cornell/robotouille http://arxiv.org/abs/2502.05227v1
114 Preference Leakage: A Contamination Problem in LLM-as-a-judge Dawei Li, Renliang Sun, Yue Huang, Ming Zhong, Bohan Jiang, Jiawei Han, Xiangliang Zhang, Wei Wang, Huan Liu 2025-02-05 arXiv …, 2025 https://github.com/David-Li0406/Preference-Leakage http://arxiv.org/abs/2502.01534v1
115 PDE-Controller: LLMs for Autoformalization and Reasoning of PDEs Mauricio Soroco, Jialin Song, Mengzhou Xia, Kye Emond, Weiran Sun, Wuyang Chen 2025-02-05 arXiv …, 2025 https://pde-controller.github.io/ http://arxiv.org/abs/2502.00963v1
116 PICBench: Benchmarking LLMs for Photonic Integrated Circuits Design Yuchao Wu, Xiaofei Yu, Hao Chen, Yang Luo, Yeyu Tong, Yuzhe Ma 2025-02-05 arXiv https://github.com/PICDA/PICBench http://arxiv.org/abs/2502.03159v1
117 Picky LLMs and Unreliable RMs: An Empirical Study on Safety Alignment after Instruction Tuning Guanlin Li, Kangjie Chen, Shangwei Guo, Jie Zhang, Han Qiu, Chao Zhang, Guoyin Wang, Tianwei Zhang, Jiwei Li 2025-02-05 arXiv …, 2025 https://github.com/GuanlinLee/llm_instruction_tuning http://arxiv.org/abs/2502.01116v1
118 Knowledge Distillation from Large Language Models for Household Energy Modeling Mohannad Takrouri, Nicolás M. Cuadrado, Martin Takáč 2025-02-05 arXiv https://github.com/Singularity-AI-Lab/LLM-Energy-Knowledge-Distillation http://arxiv.org/abs/2502.03034v1
119 Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models Hashmat Shadab Malik, Fahad Shamshad, Muzammal Naseer, Karthik Nandakumar, Fahad Khan, Salman Khan 2025-02-05 arXiv …, 2025 https://github.com/HashmatShadab/Robust-LLaVA http://arxiv.org/abs/2502.01576v1
120 SPRI: Aligning Large Language Models with Context-Situated Principles Hongli Zhan, Muneeza Azmat, Raya Horesh, Junyi Jessy Li, Mikhail Yurochkin 2025-02-05 arXiv https://github.com/honglizhan/SPRI-public http://arxiv.org/abs/2502.03397v1
121 Tool Unlearning for Tool-Augmented LLMs Jiali Cheng, Hadi Amiri 2025-02-05 arXiv:2502.01083, 2025 https://clu-uml.github.io/MU-Bench-Project-Page/ http://arxiv.org/abs/2502.01083v1
122 LLM-TA: An LLM-Enhanced Thematic Analysis Pipeline for Transcripts from Parents of Children with Congenital Heart Disease Muhammad Zain Raza, Jiawei Xu, Terence Lim, Lily Boddy, Carlos M. Mery, Andrew Well, Ying Ding 2025-02-05 arXiv …, 2025 https://github.com/jiaweixu98/LLM-TA http://arxiv.org/abs/2502.01620v1
123 Overcoming Vision Language Model Challenges in Diagram Understanding: A Proof-of-Concept with XML-Driven Large Language Models Solutions Shue Shiinoki, Ryo Koshihara, Hayato Motegi, Masumi Morishige 2025-02-05 arXiv https://github.com/galirage/spreadsheet-intelligence http://arxiv.org/abs/2502.04389v1
124 Intent Representation Learning with Large Language Model for Recommendation Yu Wang, Lei Sang, Yi Zhang, Yiwen Zhang 2025-02-05 arXiv https://github.com/wangyu0627/IRLLRec http://arxiv.org/abs/2502.03307v1
125 Demystifying Long Chain-of-Thought Reasoning in LLMs Edward Yeo, Yuxuan Tong, Morry Niu, Graham Neubig, Xiang Yue 2025-02-05 arXiv https://github.com/eddycmu/demystify-long-cot http://arxiv.org/abs/2502.03373v1
126 Breaking Focus: Contextual Distraction Curse in Large Language Models Yue Huang, Yanbo Wang, Zixiang Xu, Chujie Gao, Siyuan Wu, Jiayi Ye, Xiuying Chen, Pin-Yu Chen, Xiangliang Zhang 2025-02-05 arXiv …, 2025 https://github.com/wyf23187/LLM_CDV http://arxiv.org/abs/2502.01609v1
127 AtmosSci-Bench: Evaluating the Recent Advance of Large Language Model for Atmospheric Science Chenyue Li, Wen Deng, Mengqian Lu, Binhang Yuan 2025-02-05 arXiv:2502.01159, 2025 https://github.com/Relaxed-System-Lab/AtmosSci-Bench http://arxiv.org/abs/2502.01159v1
128 AdaSVD: Adaptive Singular Value Decomposition for Large Language Models Zhiteng Li, Mingyuan Xia, Jingyuan Zhang, Zheng Hui, Linghe Kong, Yulun Zhang, Xiaokang Yang 2025-02-05 arXiv …, 2025 https://github.com/ZHITENGLI/AdaSVD http://arxiv.org/abs/2502.01403v2
129 A Benchmark for the Detection of Metalinguistic Disagreements between LLMs and Knowledge Graphs Bradley P. Allen, Paul T. Groth 2025-02-05 arXiv https://github.com/bradleypallen/trex-metalinguistic-disagreement http://arxiv.org/abs/2502.02896v1
130 CTR-Driven Advertising Image Generation with Multimodal Large Language Models Xingye Chen, Wei Feng, Zhenbang Du, Weizhen Wang, Yanyin Chen, Haohan Wang, Linkai Liu, Yaoyu Li, Jinyuan Zhao, Yu Li, Zheng Zhang, Jingjing Lv, Junjie Shen, Zhangang Lin, Jingping Shao, Yuanjie Shao, Xinge You, Changxin Gao, Nong Sang 2025-02-05 THE WEB … https://github.com/Chenguoz/CAIG http://arxiv.org/abs/2502.06823v1
131 A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods Isha Puri, Shivchander Sudalairaj, Guangxuan Xu, Kai Xu, Akash Srivastava 2025-02-05 arXiv …, 2025 https://probabilistic-inference-scaling.github.io http://arxiv.org/abs/2502.01618v2
132 Do Large Language Model Benchmarks Test Reliability? Joshua Vendrow, Edward Vendrow, Sara Beery, Aleksander Madry 2025-02-05 arXiv https://github.com/MadryLab/platinum-benchmarks http://arxiv.org/abs/2502.03461v1
133 A Training-Free Length Extrapolation Approach for LLMs: Greedy Attention Logit Interpolation (GALI) Yan Li, Tianyi Zhang, Zechuan Li, Soyeon Caren Han 2025-02-04 arXiv https://github.com/AcademyCityL/GALI http://arxiv.org/abs/2502.02659v1
134 SAISA: Towards Multimodal Large Language Models with Both Training and Inference Efficiency Qianhao Yuan, Yanjiang Liu, Yaojie Lu, Hongyu Lin, Ben He, Xianpei Han, Le Sun 2025-02-04 arXiv https://github.com/icip-cas/SAISA http://arxiv.org/abs/2502.02458v1
135 AutoGUI: Scaling GUI Grounding with Automatic Functionality Annotations from LLMs Hongxin Li, Jingfan Chen, Jingran Su, Yuntao Chen, Qing Li, Zhaoxiang Zhang 2025-02-04 arXiv https://autogui-project.github.io/ http://arxiv.org/abs/2502.01977v1
136 Risk-Aware Driving Scenario Analysis with Large Language Models Yuan Gao, Mattia Piccinini, Johannes Betz 2025-02-04 arXiv https://github.com/yuangao-tum/Riskaware-Scenario-analyse http://arxiv.org/abs/2502.02145v1
137 Multi-Lingual Cyber Threat Detection in Tweets/X Using ML, DL, and LLM: A Comparative Analysis Saydul Akbar Murad, Ashim Dahal, Nick Rahimi 2025-02-04 arXiv https://github.com/Mmurrad/Tweet-Data-Classification http://arxiv.org/abs/2502.04346v1
138 CognArtive: Large Language Models for Automating Art Analysis and Decoding Aesthetic Elements Afshin Khadangi, Amir Sartipi, Igor Tchappi, Gilbert Fridgen 2025-02-04 arXiv https://cognartive.github.io/ http://arxiv.org/abs/2502.04353v1
139 CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing Wenhao Zheng, Yixiao Chen, Weitong Zhang, Souvik Kundu, Yun Li, Zhengzhong Liu, Eric P. Xing, Hongyi Wang, Huaxiu Yao 2025-02-04 arXiv https://github.com/aiming-lab/CITER http://arxiv.org/abs/2502.01976v1
140 Progressive Binarization with Semi-Structured Pruning for LLMs Xianglong Yan, Tianao Zhang, Zhiteng Li, Yulun Zhang 2025-02-03 arXiv https://github.com/XIANGLONGYAN/PBS2P http://arxiv.org/abs/2502.01705v1
141 A Comprehensive Analysis on LLM-based Node Classification Algorithms Xixi Wu, Yifei Shen, Fangzhou Ge, Caihua Shan, Yizhu Jiao, Xiangguo Sun, Hong Cheng 2025-02-03 arXiv …, 2025 https://llmnodebed.github.io/ http://arxiv.org/abs/2502.00829v1
142 MorphBPE: A Morpho-Aware Tokenizer Bridging Linguistic Complexity for Efficient LLM Training Across Morphologies Ehsaneddin Asgari, Yassine El Kheir, Mohammad Ali Sadraei Javaheri 2025-02-03 arXiv:2502.00894, 2025 https://github.com/llm-lab-org/MorphBPE http://arxiv.org/abs/2502.00894v1
143 RTBAgent: A LLM-based Agent System for Real-Time Bidding Leng Cai, Junxuan He, Yikai Li, Junjie Liang, Yuanping Lin, Ziming Quan, Yawen Zeng, Jin Xu 2025-02-03 arXiv …, 2025 https://github.com/CaiLeng/RTBAgent http://arxiv.org/abs/2502.00792v1
144 RankFlow: A Multi-Role Collaborative Reranking Workflow Utilizing Large Language Models Can Jin, Hongwu Peng, Anxiang Zhang, Nuo Chen, Jiahui Zhao, Xi Xie, Kuangzheng Li, Shuya Feng, Kai Zhong, Caiwen Ding, Dimitris N. Metaxas 2025-02-03 arXiv …, 2025 https://github.com/jincan333/RankFlow http://arxiv.org/abs/2502.00709v2
145 MetaOpenFOAM 2.0: Large Language Model Driven Chain of Thought for Automating CFD Simulation and Post-Processing Yuxuan Chen, Xu Zhu, Hua Zhou, Zhuyin Ren 2025-02-02 arXiv:2502.00498, 2025 https://github.com/Terry-cyx/MetaOpenFOAM http://arxiv.org/abs/2502.00498v1
146 UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models Xin Xu, Qiyun Xu, Tong Xiao, Tianhao Chen, Yuchen Yan, Jiaxin Zhang, Shizhe Diao, Can Yang, Yang Wang 2025-02-02 arXiv …, 2025 https://github.com/YangLabHKUST/UGPhysics http://arxiv.org/abs/2502.00334v1
147 UniAttn: Reducing Inference Costs via Softmax Unification for Post-Training LLMs Yizhe Xiong, Wei Huang, Xin Ye, Hui Chen, Zijia Lin, Haoran Lian, Zhenpeng Su, Jungong Han, Guiguang Ding 2025-02-02 arXiv …, 2025 https://github.com/Bostoncake/UniAttn http://arxiv.org/abs/2502.00439v1
148 LIBRA: Measuring Bias of Large Language Model from a Local Context Bo Pang, Tingrui Qiao, Caroline Walker, Chris Cunningham, Yun Sing Koh 2025-02-02 arXiv https://github.com/ipangbo/LIBRA http://arxiv.org/abs/2502.01679v1
149 Speculative Ensemble: Fast Large Language Model Ensemble via Speculation Jiale Fu, Yuchu Jiang, Junkai Chen, Jiaming Fan, Xin Geng, Xu Yang 2025-02-01 arXiv https://github.com/Kamichanw/Speculative-Ensemble/ http://arxiv.org/abs/2502.01662v1
150 Differentially Private Steering for Large Language Model Alignment Anmol Goel, Yaxi Hu, Iryna Gurevych, Amartya Sanyal 2025-02-01 arXiv:2501.18532, 2025 https://github.com/UKPLab/iclr2025-psa http://arxiv.org/abs/2501.18532v1
151 LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models Shenghao Fu, Qize Yang, Qijie Mo, Junkai Yan, Xihan Wei, Jingke Meng, Xiaohua Xie, Wei-Shi Zheng 2025-01-31 arXiv https://github.com/iSEE-Laboratory/LLMDet http://arxiv.org/abs/2501.18954v1
152 Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation Tiansheng Huang, Sihao Hu, Fatih Ilhan, Selim Furkan Tekin, Ling Liu 2025-01-31 arXiv:2501.17433, 2025 https://github.com/git-disl/Virus http://arxiv.org/abs/2501.17433v1
153 2SSP: A Two-Stage Framework for Structured Pruning of LLMs Fabrizio Sandri, Elia Cunegatti, Giovanni Iacca 2025-01-31 arXiv:2501.17771, 2025 https://github.com/FabrizioSandri/2SSP http://arxiv.org/abs/2501.17771v1
154 Reward-Guided Speculative Decoding for Efficient LLM Reasoning Baohao Liao, Yuhui Xu, Hanze Dong, Junnan Li, Christof Monz, Silvio Savarese, Doyen Sahoo, Caiming Xiong 2025-01-31 arXiv https://github.com/BaohaoLiao/RSD http://arxiv.org/abs/2501.19324v1
155 ExeCoder: Empowering Large Language Models with Executability Representation for Code Translation Minghua He, Fangkai Yang, Pu Zhao, Wenjie Yin, Yu Kang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang 2025-01-30 arXiv https://execoder4trans.github.io/ http://arxiv.org/abs/2501.18460v2
156 Uncertainty Quantification and Decomposition for LLM-based Recommendation Wonbin Kweon, Sanghwan Jang, SeongKu Kang, Hwanjo Yu 2025-01-30 arXiv:2501.17630, 2025 https://github.com/WonbinKweon/UNC_LLM_REC_WWW2025 http://arxiv.org/abs/2501.17630v1
157 CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs Jinlan Fu, Shenzhen Huangfu, Hao Fei, Xiaoyu Shen, Bryan Hooi, Xipeng Qiu, See-Kiong Ng 2025-01-28 arXiv https://github.com/LVUGAI/CHiP http://arxiv.org/abs/2501.16629v1
158 SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model Xun Liang, Simin Niu, Zhiyu Li, Sensen Zhang, Hanyu Wang, Feiyu Xiong, Jason Zhaoxin Fan, Bo Tang, Shichao Song, Mengwei Wang, Jiawei Yang 2025-01-28 arXiv https://github.com/IAAR-Shanghai/SafeRAG http://arxiv.org/abs/2501.18636v1
159 xJailbreak: Representation Space Guided Reinforcement Learning for Interpretable LLM Jailbreaking Sunbowen Lee, Shiwen Ni, Chi Wei, Shuaimin Li, Liyang Fan, Ahmadreza Argha, Hamid Alinejad-Rokny, Ruifeng Xu, Yicheng Gong, Min Yang 2025-01-28 arXiv https://github.com/Aegis1863/xJailbreak http://arxiv.org/abs/2501.16727v2
160 Large Language Model Critics for Execution-Free Evaluation of Code Changes Aashish Yadavally, Hoan Nguyen, Laurent Callot, Gauthier Guinet 2025-01-28 arXiv https://github.com/amazon-science/code-agent-eval http://arxiv.org/abs/2501.16655v1
161 Towards Evaluating and Building Versatile Large Language Models for Medicine Chaoyi Wu, Pengcheng Qiu, Jinxin Liu, Hongfei Gu, Na Li, Ya Zhang, Yanfeng Wang, Weidi Xie 2025-01-27 arXiv https://henrychur.github.io/MedS-Bench/ https://doi.org/10.48550/arXiv.2408.12547
162 LCTG Bench: LLM Controlled Text Generation Benchmark Kentaro Kurihara, Masato Mita, Peinan Zhang, Shota Sasaki, Ryosuke Ishigami, Naoaki Okazaki 2025-01-27 arXiv https://github.com/CyberAgentAILab/LCTG-Bench http://arxiv.org/abs/2501.15875v1
163 Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models Hulingxiao He, Geng Li, Zijun Geng, Jinglin Xu, Yuxin Peng 2025-01-25 arXiv https://github.com/PKU-ICST-MIPL/Finedefics_ICLR2025 http://arxiv.org/abs/2501.15140v1
164 A Causality-aware Paradigm for Evaluating Creativity of Multimodal Large Language Models Zhongzhan Huang, Shanshan Zhong, Pan Zhou, Shanghua Gao, Marinka Zitnik, Liang Lin 2025-01-25 arXiv https://lotbench.github.io http://arxiv.org/abs/2501.15147v1
165 MDEval: Evaluating and Enhancing Markdown Awareness in Large Language Models Zhongpu Chen, Yinfeng Liu, Long Shi, Zhi-Jie Wang, Xingyan Chen, Yu Zhao, Fuji Ren 2025-01-25 arXiv https://github.com/SWUFE-DB-Group/MDEval-Benchmark http://arxiv.org/abs/2501.15000v1
166 PIP: Perturbation-based Iterative Pruning for Large Language Models Yi Cao, Wei-Jie Xu, Yucheng Shen, Weijie Shi, Chi-Min Chan, Jiajie Xu 2025-01-25 arXiv https://github.com/caoyiiiiii/PIP http://arxiv.org/abs/2501.15278v1
167 DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing Xinyu Ma, Yifeng Xu, Yang Lin, Tianlong Wang, Xu Chu, Xin Gao, Junfeng Zhao, Yasha Wang 2025-01-24 arXiv https://github.com/ArthurLeoM/DRESS-LLM http://arxiv.org/abs/2501.14371v1
168 Softplus Attention with Re-weighting Boosts Length Extrapolation in Large Language Models Bo Gao, Michael W. Spratling 2025-01-24 arXiv:2501.13428, 2025 https://github.com/iminfine/freeatten http://arxiv.org/abs/2501.13428v2
169 MedAgentBench: Dataset for Benchmarking LLMs as Agents in Medical Applications Yixing Jiang, Kameron C. Black, Gloria Geng, Danny Park, Andrew Y. Ng, Jonathan H. Chen 2025-01-24 arXiv https://github.com/stanfordmlgroup/MedAgentBench http://arxiv.org/abs/2501.14654v1
170 Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation Sadegh Mahdavi, Muchen Li, Kaiwen Liu, Christos Thrampoulidis, Leonid Sigal, Renjie Liao 2025-01-24 arXiv https://github.com/DSL-Lab/aops http://arxiv.org/abs/2501.14275v1
171 JustLogic: A Comprehensive Benchmark for Evaluating Deductive Reasoning in Large Language Models Michael K. Chen, Xikun Zhang, Dacheng Tao 2025-01-24 arXiv https://github.com/michaelchen-lab/JustLogic http://arxiv.org/abs/2501.14851v1
172 FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration Kai-Tuo Xu, Feng-Long Xie, Xu Tang, Yao Hu 2025-01-24 arXiv https://github.com/FireRedTeam/FireRedASR http://arxiv.org/abs/2501.14350v1
173 Can Large Language Models Understand Preferences in Personalized Recommendation? Zhaoxuan Tan, Zinan Zeng, Qingkai Zeng, Zhenyu Wu, Zheyuan Liu, Fengran Mo, Meng Jiang 2025-01-24 arXiv …, 2025 https://github.com/TamSiuhin/PerRecBench http://arxiv.org/abs/2501.13391v1
174 MedAgentBench: A Realistic Virtual EHR Environment to Benchmark Medical LLM Agents Yixing Jiang, Kameron C. Black, Gloria Geng, Danny Park, James Zou, Andrew Y. Ng, Jonathan H. Chen 2025-01-24 arXiv https://github.com/stanfordmlgroup/MedAgentBench http://arxiv.org/abs/2501.14654v2
175 Do as We Do, Not as You Think: the Conformity of Large Language Models Zhiyuan Weng, Guikun Chen, Wenguan Wang 2025-01-24 arXiv:2501.13381, 2025 https://github.com/Zhiyuan-Weng/BenchForm http://arxiv.org/abs/2501.13381v1
176 Evaluating and Improving Graph to Text Generation with Large Language Models Jie He, Yijun Yang, Wanqiu Long, Deyi Xiong, Victor Gutierrez Basulto, Jeff Z. Pan 2025-01-24 arXiv https://github.com/probe2/kg_text http://arxiv.org/abs/2501.14497v1
177 Distillation Quantification for Large Language Models Sunbowen Lee, Junting Zhou, Chang Ao, Kaige Li, Xinrun Du, Sirui He, Jiaheng Liu, Min Yang, Zhoufutu Wen, Shiwen Ni 2025-01-23 arXiv …, 2025 https://github.com/Aegis1863/LLMs-Distillation-Quantification http://arxiv.org/abs/2501.12619v1
178 Low-Rank Adapters Meet Neural Architecture Search for LLM Compression J. Pablo Muñoz, Jinjie Yuan, Nilesh Jain 2025-01-23 arXiv https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning http://arxiv.org/abs/2501.16372v1
179 LLM-guided Instance-level Image Manipulation with Diffusion U-Net Cross-Attention Maps Andrey Palaev, Adil Khan, Syed M. Ahsan Kazmi 2025-01-23 arXiv https://github.com/Palandr123/DiffusionU-NetLLM http://arxiv.org/abs/2501.14046v1
180 OstQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting Xing Hu, Yuan Cheng, Dawei Yang, Zukang Xu, Zhihang Yuan, Jiangyong Yu, Chen Xu, Zhe Jiang, Sifan Zhou 2025-01-23 arXiv https://github.com/BrotherHappy/OSTQuant http://arxiv.org/abs/2501.13987v1
181 Quantification of Large Language Model Distillation Sunbowen Lee, Junting Zhou, Chang Ao, Kaige Li, Xinrun Du, Sirui He, Haihong Wu, Tianci Liu, Jiaheng Liu, Hamid Alinejad-Rokny, Min Yang, Yitao Liang, Zhoufutu Wen, Shiwen Ni 2025-01-22 arXiv https://github.com/Aegis1863/LLMs-Distillation-Quantification http://arxiv.org/abs/2501.12619v3
182 A Survey of Graph Retrieval-Augmented Generation for Customized Large Language Models Qinggang Zhang, Shengyuan Chen, Yuanchen Bei, Zheng Yuan, Huachi Zhou, Zijin Hong, Junnan Dong, Hao Chen, Yi Chang, Xiao Huang 2025-01-21 arXiv https://github.com/DEEP-PolyU/Awesome-GraphRAG http://arxiv.org/abs/2501.13958v1
183 Can open source large language models be used for tumor documentation in Germany? -- An evaluation on urological doctors' notes Stefan Lenz, Arsenij Ustjanzew, Marco Jeray, Torsten Panholzer 2025-01-21 arXiv https://github.com/stefan-m-lenz/UroLlmEval http://arxiv.org/abs/2501.12106v1
184 EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents Zhili Cheng, Yuge Tu, Ran Li, Shiqi Dai, Jinyi Hu, Shengding Hu, Jiahao Li, Yang Shi, Tianyu Yu, Weize Chen, Lei Shi, Maosong Sun 2025-01-21 arXiv https://github.com/thunlp/EmbodiedEval http://arxiv.org/abs/2501.11858v1
185 VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model Xianwei Zhuang, Yuxin Xie, Yufan Deng, Liming Liang, Jinghan Ru, Yuguo Yin, Yuexian Zou 2025-01-21 arXiv https://vargpt-1.github.io/ http://arxiv.org/abs/2501.12327v1
186 Glinthawk: A Two-Tiered Architecture for High-Throughput LLM Inference Pouya Hamadanian, Sadjad Fouladi 2025-01-20 arXiv https://github.com/microsoft/glinthawk http://arxiv.org/abs/2501.11779v1
187 Explain-Query-Test: Self-Evaluating LLMs Via Explanation and Comprehension Discrepancy Saeid Asgari Taghanaki, Joao Monteiro 2025-01-20 arXiv https://github.com/asgsaeid/EQT http://arxiv.org/abs/2501.11721v1
188 Teaching Large Language Models to Regress Accurate Image Quality Scores using Score Distribution Zhiyuan You, Xin Cai, Jinjin Gu, Tianfan Xue, Chao Dong 2025-01-20 arXiv https://depictqa.github.io/deqa-score/ http://arxiv.org/abs/2501.11561v1
189 InsQABench: Benchmarking Chinese Insurance Domain Question Answering with Large Language Models Jing Ding, Kai Feng, Binbin Lin, Jiarui Cai, Qiushi Wang, Yu Xie, Xiaojin Zhang, Zhongyu Wei, Wei Chen 2025-01-19 arXiv https://github.com/HaileyFamo/InsQABench http://arxiv.org/abs/2501.10943v1
190 Control LLM: Controlled Evolution for Intelligence Retention in LLM Haichao Wei, Yunxiang Ren, Zhoutong Fu, Aman Lunia, Yi-Lin Chen, Alice Leung, Ya Xu 2025-01-19 arXiv https://github.com/linkedin/ControlLLM http://arxiv.org/abs/2501.10979v1
191 ChaosEater: Fully Automating Chaos Engineering with Large Language Models Daisuke Kikuta, Hiroki Ikeuchi, Kengo Tajiri, Yuusuke Nakano 2025-01-19 arXiv https://ntt-dkiku.github.io/chaos-eater http://arxiv.org/abs/2501.11107v1
192 LAVCap: LLM-based Audio-Visual Captioning using Optimal Transport Kyeongha Rho, Hyeongkeun Lee, Valentio Iverson, Joon Son Chung 2025-01-18 arXiv:2501.09291, 2025 https://github.com/NAVER-INTEL-Co-Lab/gaudi-lavcap http://arxiv.org/abs/2501.09291v1
193 Monte Carlo Tree Search for Comprehensive Exploration in LLM-Based Automatic Heuristic Design Zhi Zheng, Zhuoliang Xie, Zhenkun Wang, Bryan Hooi 2025-01-17 arXiv:2501.08603, 2025 https://github.com/zz1358m/MCTS-AHD-master http://arxiv.org/abs/2501.08603v2
194 When language and vision meet road safety: leveraging multimodal large language models for video-based traffic accident analysis Ruixuan Zhang, Beichen Wang, Juexiao Zhang, Zilin Bian, Chen Feng, Kaan Ozbay 2025-01-17 arXiv https://github.com/ai4ce/SeeUnsafe http://arxiv.org/abs/2501.10604v1
195 FaceXBench: Evaluating Multimodal LLMs on Face Understanding Kartik Narayan, Vibashan VS, Vishal M. Patel 2025-01-17 arXiv https://kartik-3004.github.io/facexbench/ http://arxiv.org/abs/2501.10360v1
196 PaSa: An LLM Agent for Comprehensive Academic Paper Search Yichen He, Guanhua Huang, Peiyuan Feng, Yuan Lin, Yuchen Zhang, Hang Li, Weinan E 2025-01-17 arXiv https://github.com/bytedance/pasa http://arxiv.org/abs/2501.10120v1
197 Gandalf the Red: Adaptive Security for LLMs Niklas Pfister, Václav Volhejn, Manuel Knott, Santiago Arias, Julia Bazińska, Mykhailo Bichurin, Alan Commike, Janet Darling, Peter Dienes, Matthew Fiedler, David Haber, Matthias Kraft, Marco Lancini, Max Mathys, Damián Pascual-Ortiz, Jakub Podolak, Adrià Romero-López, Kyriacos Shiarlis, Andreas Signer, Zsolt Terek, Athanasios Theocharis, Daniel Timbrell, Samuel Trautwein, Samuel Watts, Natalie Wu, Mateo Rojas-Carulla 2025-01-16 arXiv …, 2025 https://github.com/lakeraai/dsec-gandalf http://arxiv.org/abs/2501.07927v1
198 PokerBench: Training Large Language Models to become Professional Poker Players Richard Zhuang, Akshat Gupta, Richard Yang, Aniket Rahane, Zhengyu Li, Gopala Anumanchipalli 2025-01-16 arXiv …, 2025 https://github.com/pokerllm/pokerbench http://arxiv.org/abs/2501.08328v1
199 CWEval: Outcome-driven Evaluation on Functionality and Security of LLM Code Generation Jinjun Peng, Leyi Cui, Kele Huang, Junfeng Yang, Baishakhi Ray 2025-01-16 arXiv:2501.08200, 2025 https://github.com/Co1lin/CWEval http://arxiv.org/abs/2501.08200v1
200 LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding Hongyu Li, Jinyu Chen, Ziyu Wei, Shaofei Huang, Tianrui Hui, Jialin Gao, Xiaoming Wei, Si Liu 2025-01-16 arXiv …, 2025 https://github.com/appletea233/LLaVA-ST http://arxiv.org/abs/2501.08282v1
201 Multilingual LLMs Struggle to Link Orthography and Semantics in Bilingual Word Processing Eshaan Tanwar, Gayatri Oke, Tanmoy Chakraborty 2025-01-16 arXiv:2501.09127, 2025 https://github.com/EshaanT/Bilingual_processing_LLMs http://arxiv.org/abs/2501.09127v1
202 OpenCSG Chinese Corpus: A Series of High-quality Chinese Datasets for LLM Training Yijiong Yu, Ziyun Dai, Zekun Wang, Wei Wang, Ran Chen, Ji Pei 2025-01-16 arXiv …, 2025 https://github.com/yuyijiong/fineweb-edu-chinese http://arxiv.org/abs/2501.08197v1
203 LAMS: LLM-Driven Automatic Mode Switching for Assistive Teleoperation Yiran Tao, Jehan Yang, Dan Ding, Zackory Erickson 2025-01-15 arXiv https://lams-assistance.github.io/ http://arxiv.org/abs/2501.08558v1
204 The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities Irina Bigoulaeva, Harish Tayyar Madabushi, Iryna Gurevych 2025-01-15 arXiv https://github.com/UKPLab/arxiv2025-inherent-limits-plms http://arxiv.org/abs/2501.08716v1
205 A Roadmap to Guide the Integration of LLMs in Hierarchical Planning Israel Puerta-Merino, Carlos Núñez-Molina, Pablo Mesejo, Juan Fernández-Olivares 2025-01-14 arXiv https://llmforplanning.github.io http://arxiv.org/abs/2501.08068v1
206 Lifelong Learning of Large Language Model based Agents: A Roadmap Junhao Zheng, Chengming Shi, Xidi Cai, Qiuke Li, Duzhen Zhang, Chenxing Li, Dong Yu, Qianli Ma 2025-01-13 arXiv https://github.com/qianlima-lab/awesome-lifelong-llm-agent https://doi.org/10.48550/arXiv.2501.07278
207 SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training Tianjin Huang, Ziquan Zhu, Gaojie Jin, Lu Liu, Zhangyang Wang, Shiwei Liu 2025-01-12 arXiv https://github.com/TianjinYellow/SPAM-Optimizer http://arxiv.org/abs/2501.06842v1
208 ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation Xuanle Zhao, Xianzhen Luo, Qi Shi, Chi Chen, Shuo Wang, Wanxiang Che, Zhiyuan Liu, Maosong Sun 2025-01-11 arXiv https://github.com/thunlp/ChartCoder https://doi.org/10.48550/arXiv.2501.06598
209 ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning Xiangru Tang, Tianyu Hu, Muyang Ye, Yanjun Shao, Xunjian Yin, Siru Ouyang, Wangchunshu Zhou, Pan Lu, Zhuosheng Zhang, Yilun Zhao, Arman Cohan, Mark Gerstein 2025-01-11 arXiv https://github.com/gersteinlab/chemagent https://doi.org/10.48550/arXiv.2501.06590
210 Step-by-Step Mastery: Enhancing Soft Constraint Following Ability of Large Language Models Qingyu Ren, Jie Zeng, Qianyu He, Jiaqing Liang, Yanghua Xiao, Weikang Zhou, Zeye Sun, Fei Yu 2025-01-11 arXiv https://github.com/Rainier-rq/FollowSoftConstraints https://doi.org/10.48550/arXiv.2501.04945
211 Demystifying Domain-adaptive Post-training for Financial LLMs Zixuan Ke, Yifei Ming, Xuan-Phi Nguyen, Caiming Xiong, Shafiq Joty 2025-01-11 arXiv …, 2025 https://github.com/SalesforceAIResearch/FinDap http://arxiv.org/abs/2501.04961v1
212 FairCode: Evaluating Social Bias of LLMs in Code Generation Yongkang Du, Jen-tse Huang, Jieyu Zhao, Lu Lin 2025-01-11 arXiv:2501.05396, 2025 https://github.com/YongkDu/FairCode http://arxiv.org/abs/2501.05396v1
213 HaVen: Hallucination-Mitigated LLM for Verilog Code Generation Aligned with HDL Engineers Yiyao Yang, Fu Teng, Pengju Liu, Mengnan Qi, Chenyang Lv, Ji Li, Xuhong Zhang, Zhezhi He 2025-01-11 arXiv …, 2025 https://github.com/Intelligent-Computing-Research-Group/HaVen http://arxiv.org/abs/2501.04908v1
214 SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution Chengxing Xie, Bowen Li, Chang Gao, He Du, Wai Lam, Difan Zou, Kai Chen 2025-01-11 arXiv …, 2025 https://github.com/InternLM/SWE-Fixer http://arxiv.org/abs/2501.05040v1
215 ChronoSense: Exploring Temporal Understanding in Large Language Models with Time Intervals of Events Duygu Sezen Islakoglu, Jan-Christoph Kalo 2025-01-10 arXiv https://github.com/duyguislakoglu/chronosense https://doi.org/10.48550/arXiv.2501.03040
216 Environmental large language model Evaluation (ELLE) dataset: A Benchmark for Evaluating Generative AI applications in Eco-environment Domain Jing Guo, Nan Li, Ming Xu 2025-01-10 arXiv https://github.com/CEEAI/elle https://doi.org/10.48550/arXiv.2501.06277
217 LLM4SR: A Survey on Large Language Models for Scientific Research Ziming Luo, Zonglin Yang, Zexin Xu, Wei Yang, Xinya Du 2025-01-10 arXiv https://github.com/du-nlp-lab/LLM4SR https://doi.org/10.48550/arXiv.2501.04306
218 Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models You Li, Heyu Huang, Chi Chen, Kaiyu Huang, Chao Huang, Zonghao Guo, Zhiyuan Liu, Jinan Xu, Yuhua Li, Ruixuan Li, Maosong Sun 2025-01-10 arXiv https://migician-vg.github.io/ https://doi.org/10.48550/arXiv.2501.05767
219 MinMo: A Multimodal Large Language Model for Seamless Voice Interaction Qian Chen, Yafeng Chen, Yanni Chen, Mengzhe Chen, Yingda Chen, Chong Deng, Zhihao Du, Ruize Gao, Changfeng Gao, Zhifu Gao, Yabin Li, Xiang Lv, Jiaqing Liu, Haoneng Luo, Bin Ma, Chongjia Ni, Xian Shi, Jialong Tang, Hui Wang, Hao Wang, Wen Wang, Yuxuan Wang, Yunlan Xu, Fan Yu, Zhijie Yan, Yexin Yang, Baosong Yang, Xian Yang, Guanrou Yang, Tianyu Zhao, Qinglin Zhang, Shiliang Zhang, Nan Zhao, Pei Zhang, Chong Zhang, Jinren Zhou 2025-01-10 arXiv https://funaudiollm.github.io/minmo https://doi.org/10.48550/arXiv.2501.06282
220 FlairGPT: Repurposing LLMs for Interior Designs Gabrielle Littlefair, Niladri Shekhar Dutt, Niloy J. Mitra 2025-01-10 arXiv:2501.04648, 2025 https://flairgpt.github.io/ http://arxiv.org/abs/2501.04648v1
221 Activating Associative Disease-Aware Vision Token Memory for LLM-Based X-ray Report Generation Xiao Wang, Fuling Wang, Haowen Wang, Bo Jiang, Chuanfu Li, Yaowei Wang, Yonghong Tian, Jin Tang 2025-01-09 arXiv …, 2025 https://github.com/Event-AHU/Medical_Image_Analysis http://arxiv.org/abs/2501.03458v1
222 Visual Large Language Models for Generalized and Specialized Applications Yifan Li, Zhixin Lai, Wentao Bao, Zhen Tan, Anh Dao, Kewei Sui, Jiayi Shen, Dong Liu, Huan Liu, Yu Kong 2025-01-06 arXiv https://github.com/JackYFL/awesome-VLLMs https://doi.org/10.48550/arXiv.2501.02765
223 CALM: Curiosity-Driven Auditing for Large Language Models Xiang Zheng, Longxiang Wang, Yi Liu, Xingjun Ma, Chao Shen, Cong Wang 2025-01-06 arXiv https://github.com/x-zheng16/CALM https://doi.org/10.48550/arXiv.2501.02997
224 BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning Beichen Zhang, Yuhong Liu, Xiaoyi Dong, Yuhang Zang, Pan Zhang, Haodong Duan, Yuhang Cao, Dahua Lin, Jiaqi Wang 2025-01-06 arXiv https://github.com/beichenzbc/BoostStep https://doi.org/10.48550/arXiv.2501.03226
225 LangFair: A Python Package for Assessing Bias and Fairness in Large Language Model Use Cases Dylan Bouchard, Mohit Singh Chauhan, David Skarbrevik, Viren Bajaj, Zeya Ahmad 2025-01-06 arXiv https://github.com/cvs-health/langfair https://doi.org/10.48550/arXiv.2501.03112
226 Multi-LLM Collaborative Caption Generation in Scientific Documents Jaeyoung Kim, Jongho Lee, Hong-Jun Choi, Ting-Yao Hsu, Chieh-Yang Huang, Sungchul Kim, Ryan Rossi, Tong Yu, Clyde Lee Giles, Ting-Hao 'Kenneth' Huang, Sungchul Choi 2025-01-05 arXiv https://github.com/teamreboott/MLBCAP http://arxiv.org/abs/2501.02552v1
227 HALO: Hadamard-Assisted Lower-Precision Optimization for LLMs Saleh Ashkboos, Mahdi Nikdan, Soroush Tabesh, Roberto L. Castro, Torsten Hoefler, Dan Alistarh 2025-01-05 arXiv https://github.com/IST-DASLab/HALO http://arxiv.org/abs/2501.02625v2
228 REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Jian Hu 2025-01-04 arXiv https://github.com/OpenRLHF/OpenRLHF https://doi.org/10.48550/arXiv.2501.03262
229 Cold-Start Recommendation towards the Era of Large Language Models (LLMs): A Comprehensive Survey and Roadmap Weizhi Zhang, Yuanchen Bei, Liangwei Yang, Henry Peng Zou, Peilin Zhou, Aiwei Liu, Yinghui Li, Hao Chen, Jianling Wang, Yu Wang, Feiran Huang, Sheng Zhou, Jiajun Bu, Allen Lin, James Caverlee, Fakhri Karray, Irwin King, Philip S. Yu 2025-01-04 arXiv https://github.com/YuanchenBei/Awesome-Cold-Start-Recommendation https://doi.org/10.48550/arXiv.2501.01945
230 Aligning Large Language Models for Faithful Integrity Against Opposing Argument Yong Zhao, Yang Deng, See-Kiong Ng, Tat-Seng Chua 2025-01-04 arXiv https://github.com/zhaoy777/AFICE https://doi.org/10.48550/arXiv.2501.01336
231 MIRAGE: Exploring How Large Language Models Perform in Complex Social Interactive Environments Cai Yin, Zhouhong Gu, Du Zhaohan, Ye Zheyu, Cao Shaosheng, Xu Yiqian, Feng Hongwei, Chen Ping 2025-01-04 arXiv https://github.com/lime728/MIRAGE https://doi.org/10.48550/arXiv.2501.01652
232 Text Clustering as Classification with LLMs Chen Huang, Guoxiu He 2025-01-04 Available at SSRN 5081002 https://github.com/ECNU-Text-Computing/Text-Clustering-via-LLM http://arxiv.org/abs/2410.00927v2
233 UAVs Meet LLMs: Overviews and Perspectives Toward Agentic Low-Altitude Mobility Yonglin Tian, Fei Lin, Yiduo Li, Tengchao Zhang, Qiyao Zhang, Xuan Fu, Jun Huang, Xingyuan Dai, Yutong Wang, Chunwei Tian, Bai Li, Yisheng Lv, Levente Kovács, Fei-Yue Wang 2025-01-04 arXiv https://github.com/Hub-Tian/UAVs_Meet_LLMs http://arxiv.org/abs/2501.02341v1
234 FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving Zihao Ye, Lequn Chen, Ruihang Lai, Wuwei Lin, Yineng Zhang, Stephanie Wang, Tianqi Chen, Baris Kasikci, Vinod Grover, Arvind Krishnamurthy, Luis Ceze 2025-01-03 arXiv …, 2025 http://github.com/flashinfer-ai/flashinfer http://arxiv.org/abs/2501.01005v1
235 Instruction-Following Evaluation for Large Language Models Jeffrey Zhou, Tianjian Lu, Swaroop Mishra, Siddhartha Brahma, Sujoy Basu, Yi Luan, Denny Zhou, Le Hou 2025-01-03 arXiv https://github.com/google-research/google-research/tree/master/instruction_following_eval https://doi.org/10.48550/arXiv.2311.07911
236 Labels Generated by Large Language Model Helps Measuring People's Empathy in Vitro Md. Rakibul Hasan, Yue Yao, Md. Zakir Hossain, Aneesh Krishna, Imre Rudas, Shafin Rahman, Tom Gedeon 2025-01-02 arXiv https://github.com/hasan-rakibul/LLMPathy https://doi.org/10.48550/arXiv.2501.00691
237 Aligning LLMs with Domain Invariant Reward Models David Wu, Sanjiban Choudhury 2025-01-02 arXiv:2501.00911, 2025 https://github.com/portal-cornell/dial http://arxiv.org/abs/2501.00911v1
238 Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models Xinyu Zhou, Delong Chen, Samuel Cahyawijaya, Xufeng Duan, Zhenguang G. Cai 2025 arXiv https://github.com/ChenDelong1999/Linguistic-Similarity https://doi.org/10.48550/arXiv.2409.12435
239 Match, Compare, or Select? An Investigation of Large Language Models for Entity Matching Tianshu Wang, Xiaoyang Chen, Hongyu Lin, Xuanang Chen, Xianpei Han, Le Sun, Hao Wang, Zhenyu Zeng 2025 arXiv https://github.com/tshu-w/ComEM https://doi.org/10.48550/arXiv.2405.16884
240 Prompting Large Language Models to Tackle the Full Software Development Lifecycle: A Case Study Bowen Li, Wenhan Wu, Ziwei Tang, Lin Shi, John Yang, Jinyang Li, Shunyu Yao, Chen Qian, Binyuan Hui, Qicheng Zhang, Zhiyin Yu, He Du, Ping Yang, Dahua Lin, Chao Peng, Kai Chen 2025 COLING https://github.com/open-compass/DevEval https://aclanthology.org/2025.coling-main.502/
241 The Dark Side of Function Calling: Pathways to Jailbreaking Large Language Models Zihui Wu, Haichang Gao, Jianping He, Ping Wang 2025 arXiv https://github.com/wooozihui/jailbreakfunction https://doi.org/10.48550/arXiv.2407.17915
242 Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models Taiqiang Wu, Chaofan Tao, Jiahao Wang, Runming Yang, Zhe Zhao, Ngai Wong 2025 COLING https://github.com/wutaiqiang/LLM_KD_AKL https://aclanthology.org/2025.coling-main.383/
243 Retrieval Augmented Instruction Tuning for Open NER with Large Language Models Tingyu Xie, Jian Zhang, Yan Zhang, Yuanyuan Liang, Qi Li, Hongwei Wang 2025 arXiv https://github.com/Emma1066/Retrieval-Augmented-IT-OpenNER https://doi.org/10.48550/arXiv.2406.17305
244 Towards Efficient and Effective Adaptation of Large Language Models for Sequential Recommendation Hangyu Wang, Jianghao Lin, Bo Chen, Yang Yang, Ruiming Tang, Weinan Zhang, Yong Yu 2025 arXiv https://github.com/justarter/E2URec https://doi.org/10.48550/arXiv.2310.01612
245 The Only Way is Ethics: A Guide to Ethical Research with Large Language Models Eddie L. Ungless, Nikolas Vitsakis, Zeerak Talat, James Garforth, Björn Ross, Arno Onken, Atoosa Kasirzadeh, Alexandra Birch 2025 COLING https://github.com/MxEddie/Ethics-Whitepaper https://aclanthology.org/2025.coling-main.603/
246 Towards Data Contamination Detection for Modern Large Language Models: Limitations, Inconsistencies, and Oracle Challenges Vinay Samuel, Yue Zhou, Henry Peng Zou 2025 arXiv https://github.com/vsamuel2003/data-contamination https://doi.org/10.48550/arXiv.2409.09927
247 Unveiling Uncertainty: A Deep Dive into Calibration and Performance of Multimodal Large Language Models Zijun Chen, Wenbo Hu, Guande He, Zhijie Deng, Zheng Zhang, Richang Hong 2025 COLING https://github.com/hfutml/Calibration-MLLM https://aclanthology.org/2025.coling-main.208/
248 Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning Xingchen Zeng, Haichuan Lin, Yilin Ye, Wei Zeng 2025 arXiv https://github.com/zengxingchen/ChartQA-MLLM https://doi.org/10.48550/arXiv.2407.20174
249 Enhancing chest X-ray datasets with privacy-preserving large language models and multi-type annotations: a data-driven approach for improved classification Ricardo Bigolin Lanfredi, Pritam Mukherjee, Ronald M. Summers 2025 arXiv https://github.com/rsummers11/CADLab/tree/master/MAPLEZ_LLM_report_labeler/ https://doi.org/10.48550/arXiv.2403.04024
250 KnowledgePrompts: Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting Thilini Wijesiriwardene, Ruwan Wickramarachchi, Sreeram Reddy Vennam, Vinija Jain, Aman Chadha, Amitava Das, Ponnurangam Kumaraguru, Amit P. Sheth 2025 COLING https://github.com/Thiliniiw/KnowledgePrompts/ https://aclanthology.org/2025.coling-main.268/
251 LLMTreeRec: Unleashing the Power of Large Language Models for Cold-Start Recommendations Wenlin Zhang, Chuhan Wu, Xiangyang Li, Yuhao Wang, Kuicai Dong, Yichao Wang, Xinyi Dai, Xiangyu Zhao, Huifeng Guo, Ruiming Tang 2025 COLING https://github.com/Applied-Machine-Learning-Lab/LLMTreeRec https://aclanthology.org/2025.coling-main.59/
252 QuickLLaMA: Query-aware Inference Acceleration for Large Language Models Jingyao Li, Han Shi, Sitong Wu, Chuanyang Zheng, Zhenguo Li, Xin Jiang, Hong Xu, Jiaya Jia 2025 COLING https://github.com/dvlab-research/Q-LLM https://aclanthology.org/2025.coling-main.34/
253 Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation Ruiyang Ren, Yuhao Wang, Yingqi Qu, Wayne Xin Zhao, Jing Liu, Hao Tian, Hua Wu, Ji-Rong Wen, Haifeng Wang 2025 arXiv https://github.com/RUCAIBox/LLM-Knowledge-Boundary https://doi.org/10.48550/arXiv.2307.11019
254 EarthMarker: A Visual Prompting Multimodal Large Language Model for Remote Sensing Wei Zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Jun Li, Xuerui Mao 2025 IEEE Trans. Geosci. Remote. Sens. https://github.com/wivizhang/EarthMarker https://doi.org/10.1109/TGRS.2024.3523505
255 Surveillance Video-and-Language Understanding: from Small to Large Multimodal Models Tongtong Yuan, Xuange Zhang, Bo Liu, Kun Liu, Jian Jin, Zhenzhen Jiao 2025 IEEE Transactions on Circuits and Systems for Video Technology https://xuange923.github.io/Surveillance-Video-Understanding https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10681489
256 Alternate Preference Optimization for Unlearning Factual Knowledge in Large Language Models Anmol Reddy Mekala, Vineeth Dorna, Shreya Dubey, Abhishek Lalwani, David Koleczek, Mukund Rungta, Sadid A. Hasan, Elita A. Lobo 2025 arXiv https://github.com/molereddy/Alternate-Preference-Optimization https://doi.org/10.48550/arXiv.2409.13474
257 Awakening Augmented Generation: Learning to Awaken Internal Knowledge of Large Language Models for Question Answering Huanxuan Liao, Shizhu He, Yao Xu, Yuanzhe Zhang, Shengping Liu, Kang Liu, Jun Zhao 2025 COLING https://github.com/Xnhyacinth/IAG https://aclanthology.org/2025.coling-main.89/
258 CodeJudge-Eval: Can Large Language Models be Good Judges in Code Understanding? Yuwei Zhao, Ziyang Luo, Yuchen Tian, Hongzhan Lin, Weixiang Yan, Annan Li, Jing Ma 2025 arXiv https://github.com/CodeLLM-Research/CodeJudge-Eval https://doi.org/10.48550/arXiv.2408.10718
259 InternLM-Law: An Open Source Chinese Legal Large Language Model Zhiwei Fei, Songyang Zhang, Xiaoyu Shen, Dawei Zhu, Xiao Wang, Maosong Cao, Fengzhe Zhou, Yining Li, Wenwei Zhang, Dahua Lin, Kai Chen, Jidong Ge 2025 arXiv https://github.com/InternLM/InternLM-Law https://doi.org/10.48550/arXiv.2406.14887
260 Distilling Rule-based Knowledge into Large Language Models Wenkai Yang, Yankai Lin, Jie Zhou, Ji-Rong Wen 2025 COLING https://github.com/RUCBM/rule-distillation https://aclanthology.org/2025.coling-main.61/
261 Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models Jiahui Li, Yongchang Hao, Haoyu Xu, Xing Wang, Yu Hong 2025 COLING https://github.com/jiah-li/magic https://aclanthology.org/2025.coling-main.305/
262 Exploring Concept Depth: How Large Language Models Acquire Knowledge and Concept at Different Layers? Mingyu Jin, Qinkai Yu, Jingyuan Huang, Qingcheng Zeng, Zhenting Wang, Wenyue Hua, Haiyan Zhao, Kai Mei, Yanda Meng, Kaize Ding, Fan Yang, Mengnan Du, Yongfeng Zhang 2025 COLING https://github.com/Luckfort/CD https://aclanthology.org/2025.coling-main.37/
263 Filter-then-Generate: Large Language Models with Structure-Text Adapter for Knowledge Graph Completion Ben Liu, Jihai Zhang, Fangquan Lin, Cheng Yang, Min Peng 2025 COLING https://github.com/LB0828/FtG https://aclanthology.org/2025.coling-main.740/
264 GraCoRe: Benchmarking Graph Comprehension and Complex Reasoning in Large Language Models Zike Yuan, Ming Liu, Hui Wang, Bing Qin 2025 arXiv https://github.com/ZIKEYUAN/GraCoRe https://doi.org/10.48550/arXiv.2407.02936
265 Gracefully Filtering Backdoor Samples for Generative Large Language Models without Retraining Zongru Wu, Pengzhou Cheng, Lingyong Fang, Zhuosheng Zhang, Gongshen Liu 2025 COLING https://github.com/ZrW00/GraceFul https://aclanthology.org/2025.coling-main.220/
266 ICLEval: Evaluating In-Context Learning Ability of Large Language Models Wentong Chen, Yankai Lin, ZhenHao Zhou, HongYun Huang, Yantao Jia, Zhao Cao, Ji-Rong Wen 2025 arXiv https://github.com/yiye3/ICLEval https://doi.org/10.48550/arXiv.2406.14955
267 Distributed Mixture-of-Agents for Edge Inference with Large Language Models Purbesh Mitra, Priyanka Kaswan, Sennur Ulukus 2024-12-30 arXiv https://github.com/purbeshmitra/distributed_moa http://arxiv.org/abs/2412.21200v1
268 Do Current Video LLMs Have Strong OCR Abilities? A Preliminary Study Yulin Fei, Yuhui Gao, Xingyuan Xian, Xiaojin Zhang, Tao Wu, Wei Chen 2024-12-29 arXiv https://github.com/YuHuiGao/FG-Bench http://arxiv.org/abs/2412.20613v1
269 Mind the Data Gap: Bridging LLMs to Enterprise Data Integration Moe Kayali, Fabian Wenz, Nesime Tatbul, Çağatay Demiralp 2024-12-29 arXiv https://goby-benchmark.github.io/ http://arxiv.org/abs/2412.20331v1
270 TokenRing: An Efficient Parallelism Framework for Infinite-Context LLMs via Bidirectional Communication Zongwu Wang, Fangxin Liu, Mingshuai Li, Li Jiang 2024-12-29 arXiv https://github.com/ACA-Lab-SJTU/token-ring http://arxiv.org/abs/2412.20501v1
271 On the Compositional Generalization of Multimodal LLMs for Medical Imaging Zhenyang Cai, Junying Chen, Rongsheng Wang, Weihong Wang, Yonglin Deng, Dingjie Song, Yize Chen, Zixu Zhang, Benyou Wang 2024-12-28 arXiv https://github.com/FreedomIntelligence/Med-MAT http://arxiv.org/abs/2412.20070v1
272 Toward Adaptive Reasoning in Large Language Models with Thought Rollback Sijia Chen, Baochun Li 2024-12-27 ICML https://github.com/iQua/llmpebase/tree/main/examples/ThoughtRollback https://openreview.net/forum?id=aoAPOOtN9E
273 An Engorgio Prompt Makes Large Language Model Babble on Jianshuo Dong, Ziyuan Zhang, Qingjie Zhang, Han Qiu, Tianwei Zhang, Hao Wang, Hewu Li, Qi Li, Chao Zhang, Ke Xu 2024-12-27 arXiv https://github.com/jianshuod/Engorgio-prompt http://arxiv.org/abs/2412.19394v1
274 Gradient Weight-normalized Low-rank Projection for Efficient LLM Training Jia-Hong Huang, Yixian Shen, Hongyi Zhu, Stevan Rudinac, Evangelos Kanoulas 2024-12-27 arXiv https://github.com/Jhhuangkay/Gradient-Weight-normalized-Low-rank-Projection-for-Efficient-LLM-Training http://arxiv.org/abs/2412.19616v1
275 MLLM-SUL: Multimodal Large Language Model for Semantic Scene Understanding and Localization in Traffic Scenarios Jiaqi Fan, Jianhua Wu, Jincheng Gao, Jianhao Yu, Yafei Wang, Hongqing Chu, Bingzhao Gao 2024-12-27 arXiv https://github.com/fjq-tongji/MLLM-SUL http://arxiv.org/abs/2412.19406v1
276 A Survey on Large Language Model Acceleration based on KV Cache Management Haoyang Li, Yiming Li, Anxin Tian, Tianhao Tang, Zhanchao Xu, Xuejia Chen, Nicole Hu, Wei Dong, Qing Li, Lei Chen 2024-12-27 arXiv https://github.com/TreeAI-Lab/Awesome-KV-Cache-Management http://arxiv.org/abs/2412.19442v2
277 Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment Ziang Yan, Zhilin Li, Yinan He, Chenting Wang, Kunchang Li, Xinhao Li, Xiangyu Zeng, Zilei Wang, Yali Wang, Yu Qiao, Limin Wang, Yi Wang 2024-12-26 arXiv https://github.com/OpenGVLab/TPO http://arxiv.org/abs/2412.19326v1
278 CoEvo: Continual Evolution of Symbolic Solutions Using Large Language Models Ping Guo, Qingfu Zhang, Xi Lin 2024-12-25 arXiv https://github.com/pgg3/CoEvo http://arxiv.org/abs/2412.18890v1
279 Large Language Model guided Deep Reinforcement Learning for Decision Making in Autonomous Driving Hao Pang, Zhenpo Wang, Guoqiang Li 2024-12-24 arXiv https://bitmobility.github.io/LGDRL/ http://arxiv.org/abs/2412.18511v1
280 Token-Budget-Aware LLM Reasoning Tingxu Han, Zhenting Wang, Chunrong Fang, Shiyu Zhao, Shiqing Ma, Zhenyu Chen 2024-12-24 arXiv https://github.com/GeniusHTX/TALE http://arxiv.org/abs/2412.18547v3
281 Property Enhanced Instruction Tuning for Multi-task Molecule Generation with Large Language Models Xuan Lin, Long Chen, Yile Wang, Xiangxiang Zeng, Philip S. Yu 2024-12-24 arXiv https://github.com/chenlong164/PEIT http://arxiv.org/abs/2412.18084v1
282 ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation Mengyang Wu, Yuzhi Zhao, Jialun Cao, Mingjie Xu, Zhongming Jiang, Xuehui Wang, Qinbin Li, Guangneng Hu, Shengchao Qin, Chi-Wing Fu 2024-12-24 arXiv https://github.com/zhaoyuzhi/ICM-Assistant http://arxiv.org/abs/2412.18216v1
283 Distilling Fine-grained Sentiment Understanding from Large Language Models Yice Zhang, Guangyu Xie, Hongling Xu, Kaiheng Hou, Jianzhu Bao, Qianlong Wang, Shiwei Chen, Ruifeng Xu 2024-12-24 arXiv https://github.com/HITSZ-HLT/FSA-Distillation http://arxiv.org/abs/2412.18552v2
284 3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding Tatiana Zemskova, Dmitry Yudin 2024-12-24 arXiv https://github.com/CognitiveAISystems/3DGraphLLM http://arxiv.org/abs/2412.18450v2
285 Assessing Human Editing Effort on LLM-Generated Texts via Compression-Based Edit Distance Nicolas Devatine, Louis Abraham 2024-12-23 arXiv https://github.com/NDV-tiime/CompressionDistance http://arxiv.org/abs/2412.17321v1
286 Large Language Model Safety: A Holistic Survey Dan Shi, Tianhao Shen, Yufei Huang, Zhigen Li, Yongqi Leng, Renren Jin, Chuang Liu, Xinwei Wu, Zishan Guo, Linhao Yu, Ling Shi, Bojian Jiang, Deyi Xiong 2024-12-23 arXiv https://github.com/tjunlp-lab/Awesome-LLM-Safety-Papers http://arxiv.org/abs/2412.17686v1
287 CoF: Coarse to Fine-Grained Image Understanding for Multi-modal Large Language Models Yeyuan Wang, Dehong Gao, Bin Li, Rujiao Long, Lei Yi, Xiaoyan Cai, Libin Yang, Jinxia Zhang, Shanqing Yu, Qi Xuan 2024-12-22 arXiv https://github.com/Gavin001201/CoF http://arxiv.org/abs/2412.16869v1
288 MINTQA: A Multi-Hop Question Answering Benchmark for Evaluating LLMs on New and Tail Knowledge Jie He, Nan Hu, Wanqiu Long, Jiaoyan Chen, Jeff Z. Pan 2024-12-22 arXiv https://github.com/probe2/multi-hop/ http://arxiv.org/abs/2412.17032v1
289 Large Language Model Can Be a Foundation for Hidden Rationale-Based Retrieval Luo Ji, Feixiang Guo, Teng Chen, Qingqing Gu, Xiaoyu Wang, Ningyuan Xi, Yihong Wang, Peng Yu, Yue Zhao, Hongyang Lei, Zhonglin Jiang, Yong Chen 2024-12-21 arXiv https://github.com/flyfree5/LaHoRe http://arxiv.org/abs/2412.16615v1
290 Template-Driven LLM-Paraphrased Framework for Tabular Math Word Problem Generation Xiaoqiang Kang, Zimu Wang, Xiaobo Jin, Wei Wang, Kaizhu Huang, Qiufeng Wang 2024-12-20 arXiv https://github.com/Jason8Kang/TELL http://arxiv.org/abs/2412.15594v1
291 WebLLM: A High-Performance In-Browser LLM Inference Engine Charlie F. Ruan, Yucheng Qin, Xun Zhou, Ruihang Lai, Hongyi Jin, Yixin Dong, Bohan Hou, Meng-Shiun Yu, Yiyan Zhai, Sudeep Agarwal, Hangrui Cao, Siyuan Feng, Tianqi Chen 2024-12-20 arXiv https://github.com/mlc-ai/web-llm http://arxiv.org/abs/2412.15803v1
292 TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use Junjie Ye, Yilong Wu, Sixian Li, Yuming Yang, Tao Gui, Qi Zhang, Xuanjing Huang, Peng Wang, Zhongchao Shi, Jianping Fan, Zhengyin Du 2024-12-20 arXiv https://github.com/Junjie-Ye/TL-Training http://arxiv.org/abs/2412.15495v1
293 PruneVid: Visual Token Pruning for Efficient Video Large Language Models Xiaohu Huang, Hao Zhou, Kai Han 2024-12-20 arXiv https://github.com/Visual-AI/PruneVid http://arxiv.org/abs/2412.16117v1
294 Mitigating Social Bias in Large Language Models: A Multi-Objective Approach within a Multi-Agent Framework Zhenjie Xu, Wenqing Chen, Yi Tang, Xuanying Li, Cheng Hu, Zhixuan Chu, Kui Ren, Zibin Zheng, Zhichao Lu 2024-12-20 arXiv https://github.com/Cortantse/MOMA http://arxiv.org/abs/2412.15504v1
295 Beyond Human Data: Aligning Multimodal Large Language Models by Iterative Self-Evolution Wentao Tan, Qiong Cao, Yibing Zhan, Chao Xue, Changxing Ding 2024-12-20 arXiv https://github.com/WentaoTan/SENA http://arxiv.org/abs/2412.15650v1
296 On Verbalized Confidence Scores for LLMs Daniel Yang, Yao-Hung Hubert Tsai, Makoto Yamada 2024-12-19 arXiv https://github.com/danielyxyang/llm-verbalized-uq http://arxiv.org/abs/2412.14737v1
297 Sliding Windows Are Not the End: Exploring Full Ranking with Long-Context Large Language Models Wenhan Liu, Xinyu Ma, Yutao Zhu, Ziliang Zhao, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou 2024-12-19 arXiv https://github.com/8421BCD/fullrank http://arxiv.org/abs/2412.14574v1
298 ORBIT: Cost-Effective Dataset Curation for Large Language Model Domain Adaptation with an Astronomy Case Study Eric Modesitt, Ke Yang, Spencer Hulsey, Chengxiang Zhai, Volodymyr Kindratenko 2024-12-19 arXiv https://github.com/ModeEric/ORBIT-Llama http://arxiv.org/abs/2412.14436v1
299 Agent-SafetyBench: Evaluating the Safety of LLM Agents Zhexin Zhang, Shiyao Cui, Yida Lu, Jingzhuo Zhou, Junxiao Yang, Hongning Wang, Minlie Huang 2024-12-19 arXiv https://github.com/thu-coai/Agent-SafetyBench http://arxiv.org/abs/2412.14470v1
300 InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models Cong Wei, Yujie Zhong, Haoxian Tan, Yingsen Zeng, Yong Liu, Zheng Zhao, Yujiu Yang 2024-12-18 arXiv https://github.com/congvvc/InstructSeg http://arxiv.org/abs/2412.14006v1
301 Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces Jihan Yang, Shusheng Yang, Anjali W. Gupta, Rilyn Han, Li Fei-Fei, Saining Xie 2024-12-18 arXiv https://vision-x-nyu.github.io/thinking-in-space.github.io/ http://arxiv.org/abs/2412.14171v1
302 ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals Utkarsh Saxena, Sayeh Sharify, Kaushik Roy, Xin Wang 2024-12-18 arXiv https://github.com/utkarsh-dmx/project-resq http://arxiv.org/abs/2412.14363v1
303 Beyond Outcomes: Transparent Assessment of LLM Reasoning in Games Wenye Lin, Jonathan Roberts, Yunhan Yang, Samuel Albanie, Zongqing Lu, Kai Han 2024-12-18 arXiv https://visual-ai.github.io/gamebot http://arxiv.org/abs/2412.13602v1
304 Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes Katarzyna Kobalczyk, Claudio Fanconi, Hao Sun, Mihaela van der Schaar 2024-12-18 arXiv https://github.com/kasia-kobalczyk/few-shot-steerable-alignment http://arxiv.org/abs/2412.13998v1
305 Crabs: Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings Yuanhe Zhang, Zhenhong Zhou, Wei Zhang, Xinyue Wang, Xiaojun Jia, Yang Liu, Sen Su 2024-12-18 arXiv https://github.com/shuita2333/AutoDoS http://arxiv.org/abs/2412.13879v1
306 Enhancing Knowledge Distillation for LLMs with Response-Priming Prompting Vijay Goyal, Mustafa Khan, Aprameya Tirupati, Harveer Saini, Michael Lam, Kevin Zhu 2024-12-18 arXiv https://github.com/alonso130r/knowledge-distillation http://arxiv.org/abs/2412.17846v1
307 Are Your LLMs Capable of Stable Reasoning? Junnan Liu, Hongwei Liu, Linchen Xiao, Ziyi Wang, Kuikun Liu, Songyang Gao, Wenwei Zhang, Songyang Zhang, Kai Chen 2024-12-17 arXiv https://github.com/open-compass/GPassK http://arxiv.org/abs/2412.13147v2
308 Benchmarking and Understanding Compositional Relational Reasoning of LLMs Ruikang Ni, Da Xiao, Qingye Meng, Xiangyu Li, Shihui Zheng, Hongliang Liang 2024-12-17 arXiv https://github.com/Caiyun-AI/GAR http://arxiv.org/abs/2412.12841v1
309 Graph Learning in the Era of LLMs: A Survey from the Perspective of Data, Models, and Tasks Xunkai Li, Zhengyu Wu, Jiayi Wu, Hanwen Cui, Jishuo Jia, Rong-Hua Li, Guoren Wang 2024-12-17 arXiv https://github.com/xkLi-Allen/Awesome-GNN-in-LLMs-Papers http://arxiv.org/abs/2412.12456v1
310 NLSR: Neuron-Level Safety Realignment of Large Language Models Against Harmful Fine-Tuning Xin Yi, Shunfan Zheng, Linlin Wang, Gerard de Melo, Xiaoling Wang, Liang He 2024-12-17 arXiv https://github.com/xinykou/NLSR http://arxiv.org/abs/2412.12497v1
311 SafeAgentBench: A Benchmark for Safe Task Planning of Embodied LLM Agents Sheng Yin, Xianghe Pang, Yuanzhuo Ding, Menglan Chen, Yutong Bi, Yichen Xiong, Wenhao Huang, Zhen Xiang, Jing Shao, Siheng Chen 2024-12-17 arXiv https://github.com/shengyin1224/SafeAgentBench http://arxiv.org/abs/2412.13178v2
312 SafeDrive: Knowledge- and Data-Driven Risk-Sensitive Decision-Making for Autonomous Vehicles with Large Language Models Zhiyuan Zhou, Heye Huang, Boqi Li, Shiyue Zhao, Yao Mu, Jianqiang Wang 2024-12-17 arXiv https://mezzi33.github.io/SafeDrive/ http://arxiv.org/abs/2412.13238v2
313 Assessing the Limitations of Large Language Models in Clinical Fact Decomposition Monica Munnangi, Akshay Swaminathan, Jason Alan Fries, Jenelle Jindal, Sanjana Narayanan, Ivan Lopez, Lucia Tu, Philip Chung, Jesutofunmi A. Omiye, Mehr Kashyap, Nigam Shah 2024-12-17 arXiv https://github.com/som-shahlab/factehr http://arxiv.org/abs/2412.12422v1
314 LLMs Can Simulate Standardized Patients via Agent Coevolution Zhuoyun Du, Lujie Zheng, Renjun Hu, Yuyang Xu, Xiawei Li, Ying Sun, Wei Chen, Jian Wu, Haolei Cai, Haohao Ying 2024-12-16 arXiv https://github.com/ZJUMAI/EvoPatient http://arxiv.org/abs/2412.11716v1
315 RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation Xiaoxi Li, Jiajie Jin, Yujia Zhou, Yongkang Wu, Zhonghua Li, Qi Ye, Zhicheng Dou 2024-12-16 arXiv https://github.com/sunnynexus/RetroLLM http://arxiv.org/abs/2412.11919v1
316 RL-LLM-DT: An Automatic Decision Tree Generation Method Based on RL Evaluation and LLM Enhancement Junjie Lin, Jian Zhao, Lin Liu, Yue Deng, Youpeng Zhao, Lanxiao Huang, Xia Lin, Wengang Zhou, Houqiang Li 2024-12-16 arXiv https://github.com/Linjunjie99/RL-LLM-DT http://arxiv.org/abs/2412.11417v2
317 Does VLM Classification Benefit from LLM Description Semantics? Pingchuan Ma, Lennart Rietdorf, Dmytro Kotovenko, Vincent Tao Hu, Björn Ommer 2024-12-16 arXiv https://github.com/CompVis/DisCLIP http://arxiv.org/abs/2412.11917v3
318 BlenderLLM: Training Large Language Models for Computer-Aided Design with Self-improvement Yuhao Du, Shunian Chen, Wenbo Zan, Peizhao Li, Mingxuan Wang, Dingjie Song, Bo Li, Yan Hu, Benyou Wang 2024-12-16 arXiv https://github.com/FreedomIntelligence/BlenderLLM http://arxiv.org/abs/2412.14203v1
319 Analyzing Images of Legal Documents: Toward Multi-Modal LLMs for Access to Justice Hannes Westermann, Jaromir Savelka 2024-12-16 arXiv https://github.com/hwestermann/AI4A2J_analyzing_images_of_legal_documents http://arxiv.org/abs/2412.15260v1
320 NITRO: LLM Inference on Intel Laptop NPUs Anthony Fei, Mohamed S. Abdelfattah 2024-12-15 arXiv https://github.com/abdelfattah-lab/nitro http://arxiv.org/abs/2412.11053v1
321 Empowering LLMs to Understand and Generate Complex Vector Graphics Ximing Xing, Juncheng Hu, Guotao Liang, Jing Zhang, Dong Xu, Qian Yu 2024-12-15 arXiv https://ximinng.github.io/LLM4SVGProject/ http://arxiv.org/abs/2412.11102v1
322 Learning to Verify Summary Facts with Fine-Grained LLM Feedback Jihwan Oh, Jeonghwan Choi, Nicole Hee-Yeon Kim, Taewon Yun, Hwanjun Song 2024-12-14 arXiv https://github.com/DISL-Lab/FineSumFact http://arxiv.org/abs/2412.10689v1
323 B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens Zhuqiang Lu, Zhenfei Yin, Mengwei He, Zhihui Wang, Zicheng Liu, Zhiyong Wang, Kun Hu 2024-12-13 arXiv https://github.com/zhuqiangLu/B-VLLM http://arxiv.org/abs/2412.09919v1
324 Can LLMs Convert Graphs to Text-Attributed Graphs? Zehong Wang, Sidney Liu, Zheyuan Zhang, Tianyi Ma, Chuxu Zhang, Yanfang Ye 2024-12-13 arXiv https://github.com/Zehong-Wang/TANS http://arxiv.org/abs/2412.10136v1
325 ChainStream: An LLM-based Framework for Unified Synthetic Sensing Jiacheng Liu, Yuanchun Li, Liangyan Li, Yi Sun, Hao Wen, Xiangyu Li, Yao Guo, Yunxin Liu 2024-12-13 arXiv https://github.com/MobileLLM/ChainStream http://arxiv.org/abs/2412.15240v1
326 CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models Zhihao Du, Yuxuan Wang, Qian Chen, Xian Shi, Xiang Lv, Tianyu Zhao, Zhifu Gao, Yexin Yang, Changfeng Gao, Hui Wang, Fan Yu, Huadai Liu, Zhengyan Sheng, Yue Gu, Chong Deng, Wen Wang, Shiliang Zhang, Zhijie Yan, Jingren Zhou 2024-12-13 arXiv https://funaudiollm.github.io/cosyvoice2 http://arxiv.org/abs/2412.10117v3
327 Enhancing Multimodal Large Language Models Complex Reason via Similarity Computation Xiaofeng Zhang, Fanshuo Zeng, Yihao Quan, Zheng Hui, Jiawei Yao 2024-12-13 arXiv https://github.com/FanshuoZeng/Simignore http://arxiv.org/abs/2412.09817v1
328 Can Modern LLMs Act as Agent Cores in Radiology Environments? Qiaoyu Zheng, Chaoyi Wu, Pengcheng Qiu, Lisong Dai, Ya Zhang, Yanfeng Wang, Weidi Xie 2024-12-12 arXiv https://github.com/MAGIC-AI4Med/RadABench http://arxiv.org/abs/2412.09529v2
329 RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios Ruiwen Zhou, Wenyue Hua, Liangming Pan, Sitao Cheng, Xiaobao Wu, En Yu, William Yang Wang 2024-12-12 arXiv https://github.com/skyriver-2000/RuleArena http://arxiv.org/abs/2412.08972v1
330 Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine Xiaoshuang Huang, Lingdong Shen, Jia Liu, Fangxin Shang, Hongxiang Li, Haifeng Huang, Yehui Yang 2024-12-12 arXiv https://github.com/ShawnHuang497/MedPLIB http://arxiv.org/abs/2412.09278v1
331 What Makes Cryptic Crosswords Challenging for LLMs? Abdelrahman Sadallah, Daria Kotova, Ekaterina Kochmar 2024-12-12 COLING 2025 https://github.com/bodasadallah/decrypting-crosswords http://arxiv.org/abs/2412.09012v1
332 Multi-GraspLLM: A Multimodal LLM for Multi-Hand Semantic Guided Grasp Generation Haosheng Li, Weixin Mao, Weipeng Deng, Chenyu Meng, Haoqiang Fan, Tiancai Wang, Ping Tan, Hongan Wang, Xiaoming Deng 2024-12-11 arXiv https://multi-graspllm.github.io http://arxiv.org/abs/2412.08468v1
333 Concept Bottleneck Large Language Models Chung-En Sun, Tuomas Oikarinen, Berk Ustun, Tsui-Wei Weng 2024-12-11 arXiv https://github.com/Trustworthy-ML-Lab/CB-LLMs http://arxiv.org/abs/2412.07992v1
334 Autoformalizing and Simulating Game-Theoretic Scenarios using LLM-augmented Agents Agnieszka Mensfelt, Kostas Stathis, Vince Trencsenyi 2024-12-11 arXiv https://github.com/dicelab-rhul/autoformalizing-agents http://arxiv.org/abs/2412.08805v1
335 IntellectSeeker: A Personalized Literature Management System with the Probabilistic Model and Large Language Model Weizhen Bian, Siyan Liu, Yubo Zhou, Dezhi Chen, Yijie Liao, Zhenzhen Fan, Aobo Wang 2024-12-10 KSEM https://github.com/LuckyBian/ISY5001 https://doi.org/10.1007/978-981-97-5489-2_24
336 DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation Jianzong Wu, Chao Tang, Jingbo Wang, Yanhong Zeng, Xiangtai Li, Yunhai Tong 2024-12-10 arXiv https://jianzongwu.github.io/projects/diffsensei/ http://arxiv.org/abs/2412.07589v1
337 Frame Representation Hypothesis: Multi-Token LLM Interpretability and Concept-Guided Text Generation Pedro H. V. Valois, Lincon S. Souza, Erica K. Shimomoto, Kazuhiro Fukui 2024-12-10 arXiv https://github.com/phvv-me/frame-representation-hypothesis http://arxiv.org/abs/2412.07334v2
338 LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation Eunsu Kim, Juyoung Suk, Seungone Kim, Niklas Muennighoff, Dongkwan Kim, Alice Oh 2024-12-10 arXiv https://github.com/interview-eval/ http://arxiv.org/abs/2412.10424v2
339 Methods for Legal Citation Prediction in the Age of LLMs: An Australian Law Case Study Ehsan Shareghi, Jiuzhou Han, Paul Burgess 2024-12-09 arXiv https://auslawbench.github.io http://arxiv.org/abs/2412.06272v1
340 PediaBench: A Comprehensive Chinese Pediatric Dataset for Benchmarking Large Language Models Qian Zhang, Panfeng Chen, Jiali Li, Linkun Feng, Shuyu Liu, Heng Zhao, Mei Chen, Hui Li, Yanhao Wang 2024-12-09 arXiv https://github.com/ACMISLab/PediaBench http://arxiv.org/abs/2412.06287v2
341 Exploring Multi-Grained Concept Annotations for Multimodal Large Language Models Xiao Xu, Tianhao Niu, Yuxi Xie, Libo Qin, Wanxiang Che, Min-Yen Kan 2024-12-08 arXiv https://github.com/LooperXX/MMGiC http://arxiv.org/abs/2412.05939v1
342 KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models Fan Wang, Juyong Jiang, Chansung Park, Sunghun Kim, Jing Tang 2024-12-08 arXiv https://github.com/juyongjiang/KaSA http://arxiv.org/abs/2412.06071v1
343 Training-Free Bayesianization for Low-Rank Adapters of Large Language Models Haizhou Shi, Yibin Wang, Ligong Han, Huan Zhang, Hao Wang 2024-12-07 arXiv https://github.com/Wang-ML-Lab/bayesian-peft http://arxiv.org/abs/2412.05723v1
344 LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods Haitao Li, Qian Dong, Junjie Chen, Huixue Su, Yujia Zhou, Qingyao Ai, Ziyi Ye, Yiqun Liu 2024-12-07 arXiv https://github.com/CSHaitao/Awesome-LLMs-as-Judges http://arxiv.org/abs/2412.05579v2
345 Towards Learning to Reason: Comparing LLMs with Neuro-Symbolic on Arithmetic Relations in Abstract Reasoning Michael Hersche, Giacomo Camposampiero, Roger Wattenhofer, Abu Sebastian, Abbas Rahimi 2024-12-07 arXiv https://github.com/IBM/raven-large-language-models http://arxiv.org/abs/2412.05586v1
346 EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios Lu Qiu, Yuying Ge, Yi Chen, Yixiao Ge, Ying Shan, Xihui Liu 2024-12-05 arXiv https://qiulu66.github.io/egoplanbench2/ http://arxiv.org/abs/2412.04447v1
347 LossAgent: Towards Any Optimization Objectives for Image Processing with LLM Agents Bingchen Li, Xin Li, Yiting Lu, Zhibo Chen 2024-12-05 arXiv https://github.com/lbc12345/LossAgent http://arxiv.org/abs/2412.04090v1
348 Reinforcement Learning Enhanced LLMs: A Survey Shuhe Wang, Shengyu Zhang, Jie Zhang, Runyi Hu, Xiaoya Li, Tianwei Zhang, Jiwei Li, Fei Wu, Guoyin Wang, Eduard Hovy 2024-12-05 arXiv https://github.com/ShuheWang1998/Reinforcement-Learning-Enhanced-LLMs-A-Survey http://arxiv.org/abs/2412.10400v2
349 VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding Chaoyu Li, Eun Woo Im, Pooyan Fazli 2024-12-04 arXiv https://vid-halluc.github.io/ http://arxiv.org/abs/2412.03735v1
350 From Individual to Society: A Survey on Social Simulation Driven by Large Language Model-based Agents Xinyi Mou, Xuanwen Ding, Qi He, Liang Wang, Jingcong Liang, Xinnong Zhang, Libo Sun, Jiayu Lin, Jie Zhou, Xuanjing Huang, Zhongyu Wei 2024-12-04 arXiv https://github.com/FudanDISC/SocialAgent http://arxiv.org/abs/2412.03563v1
351 Improving Linguistic Diversity of Large Language Models with Possibility Exploration Fine-Tuning Long Mai, Julie Carson-Berndsen 2024-12-04 arXiv https://github.com/mailong25/peft_diversity http://arxiv.org/abs/2412.03343v1
352 Alignment at Pre-training! Towards Native Alignment for Arabic LLMs Juhao Liang, Zhenyang Cai, Jianqing Zhu, Huang Huang, Kewei Zong, Bang An, Mosen Alharthi, Juncai He, Lian Zhang, Haizhou Li, Benyou Wang, Jinchao Xu 2024-12-04 arXiv https://github.com/FreedomIntelligence/AceGPT-v2 http://arxiv.org/abs/2412.03253v1
353 Fine-Grained Behavior Simulation with Role-Playing Large Language Model on Social Media Kun Li, Chenwei Dai, Wei Zhou, Songlin Hu 2024-12-04 arXiv https://github.com/linkseed18612254945/FineRob http://arxiv.org/abs/2412.03148v1
354 AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning Yiwu Zhong, Zhuoming Liu, Yin Li, Liwei Wang 2024-12-04 arXiv https://github.com/LaVi-Lab/AIM http://arxiv.org/abs/2412.03248v1
355 AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information? Kaixiong Gong, Kaituo Feng, Bohao Li, Yibing Wang, Mofan Cheng, Shijia Yang, Jiaming Han, Benyou Wang, Yutong Bai, Zhuoran Yang, Xiangyu Yue 2024-12-03 arXiv https://av-odyssey.github.io/ http://arxiv.org/abs/2412.02611v1
356 CNNSum: Exploring Long-Context Summarization with Large Language Models in Chinese Novels Lingxiao Wei, He Yan, Xiangju Lu, Junmin Zhu, Jun Wang, Wei Zhang 2024-12-03 arXiv https://github.com/CxsGhost/CNNSum http://arxiv.org/abs/2412.02819v4
357 Drawing Pandas: A Benchmark for LLMs in Generating Plotting Code Timur Galimzyanov, Sergey Titov, Yaroslav Golubev, Egor Bogomolov 2024-12-03 arXiv https://github.com/JetBrains-Research/PandasPlotBench http://arxiv.org/abs/2412.02764v1
358 Unleashing GHOST: An LLM-Powered Framework for Automated Hardware Trojan Design Md Omar Faruque, Peter Jamieson, Ahmad Patooghy, Abdel-Hameed A. Badawy 2024-12-03 arXiv https://github.com/HSTRG1/GHOSTbenchmarks http://arxiv.org/abs/2412.02816v1
359 DaDu-E: Rethinking the Role of Large Language Model in Robotic Computing Pipeline Wenhao Sun, Sai Hou, Zixuan Wang, Bo Yu, Shaoshan Liu, Xu Yang, Shuai Liang, Yiming Gan, Yinhe Han 2024-12-02 arXiv https://rlc-lab.github.io/dadu-e/ http://arxiv.org/abs/2412.01663v1
360 DFRot: Achieving Outlier-Free and Massive Activation-Free for Rotated LLMs with Refined Rotation Jingyang Xiang, Sai Qian Zhang 2024-12-01 arXiv https://github.com/JingyangXiang/DFRot http://arxiv.org/abs/2412.00648v2
361 Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification Wenxuan Huang, Zijie Zhai, Yunhang Shen, Shaosheng Cao, Fei Zhao, Xiangfeng Xu, Zheyu Ye, Shaohui Lin 2024-12-01 arXiv https://github.com/Osilly/dynamic_llava http://arxiv.org/abs/2412.00876v3
362 GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models Kunsheng Tang, Wenbo Zhou, Jie Zhang, Aishan Liu, Gelei Deng, Shuai Li, Peigui Qi, Weiming Zhang, Tianwei Zhang, Nenghai Yu 2024-12 CCS '24: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security https://github.com/kstanghere/GenderCARE-ccs24 https://dl.acm.org/doi/10.1145/3658644.3670284
363 Mitigating Entity-Level Hallucination in Large Language Models Weihang Su, Yichen Tang, Qingyao Ai, Changyue Wang, Zhijing Wu, Yiqun Liu 2024-12 SIGIR-AP 2024: Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region https://github.com/oneal2000/EntityHallucination https://dl.acm.org/doi/10.1145/3673791.3698403
364 Optimization-based Prompt Injection Attack to LLM-as-a-Judge Jiawen Shi, Zenghui Yuan, Yinuo Liu, Yue Huang, Pan Zhou, Lichao Sun, Neil Zhenqiang Gong 2024-12 CCS '24: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security https://github.com/ShiJiawenwen/JudgeDeceiver https://dl.acm.org/doi/10.1145/3658644.3690291
365 PLeak: Prompt Leaking Attacks against Large Language Model Applications Bo Hui, Haolin Yuan, Neil Zhenqiang Gong, Philippe Burlina, Yinzhi Cao 2024-12 CCS '24: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security https://github.com/BHui97/PLeak https://dl.acm.org/doi/10.1145/3658644.3670370
366 Node Importance Estimation Leveraging LLMs for Semantic Augmentation in Knowledge Graphs Xinyu Lin, Tianyu Zhang, Chengbin Hou, Jinbao Wang, Jianye Xue, Hairong Lv 2024-11-30 arXiv https://github.com/XinyuLin-FZ/LENIE http://arxiv.org/abs/2412.00478v1
367 AgriBench: A Hierarchical Agriculture Benchmark for Multimodal Large Language Models Yutong Zhou, Masahiro Ryo 2024-11-30 arXiv https://github.com/Yutong-Zhou-cv/AgriBench http://arxiv.org/abs/2412.00465v2
368 DroidCall: A Dataset for LLM-powered Android Intent Invocation Weikai Xie, Li Zhang, Shihe Wang, Rongjie Yi, Mengwei Xu 2024-11-30 arXiv https://github.com/UbiquitousLearning/DroidCall http://arxiv.org/abs/2412.00402v1
369 T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs Shukang Yin, Chaoyou Fu, Sirui Zhao, Yunhang Shen, Chunjiang Ge, Yan Yang, Zuwei Long, Yuhan Dai, Tong Xu, Xing Sun, Ran He, Caifeng Shan, Enhong Chen 2024-11-29 arXiv https://github.com/xjtupanda/T2Vid http://arxiv.org/abs/2411.19951v2
370 TQA-Bench: Evaluating LLMs for Multi-Table Question Answering with Scalable Context and Symbolic Extension Zipeng Qiu, You Peng, Guangxin He, Binhang Yuan, Chen Wang 2024-11-29 arXiv https://github.com/Relaxed-System-Lab/TQA-Bench http://arxiv.org/abs/2411.19504v1
371 Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings Qiong Wu, Wenhao Lin, Weihao Ye, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji 2024-11-29 arXiv https://github.com/DoubtedSteam/DyVTE http://arxiv.org/abs/2411.19628v1
372 Ensemble Watermarks for Large Language Models Georg Niess, Roman Kern 2024-11-29 arXiv http://github.com/CommodoreEU/master-generation http://arxiv.org/abs/2411.19563v1
373 Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models Tian Yu, Shaolei Zhang, Yang Feng 2024-11-29 arXiv https://github.com/ictnlp/Auto-RAG http://arxiv.org/abs/2411.19443v1
374 Personalized Federated Fine-Tuning for LLMs via Data-Driven Heterogeneous Model Architectures Yicheng Zhang, Zhen Qin, Zhaomin Wu, Shuiguang Deng 2024-11-28 arXiv https://github.com/zyc140345/FedAMoLE http://arxiv.org/abs/2411.19128v1
375 Enhancing Visual Reasoning with Autonomous Imagination in Multimodal Large Language Models Jingming Liu, Yumeng Li, Boyuan Xiao, Yichang Jian, Ziang Qin, Tianjia Shao, Yao-Xiang Ding, Kun Zhou 2024-11-27 arXiv https://future-item.github.io/autoimagine-site http://arxiv.org/abs/2411.18142v1
376 TimeMarker: A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability Shimin Chen, Xiaohan Lan, Yitian Yuan, Zequn Jie, Lin Ma 2024-11-27 arXiv https://github.com/TimeMarker-LLM/TimeMarker/ http://arxiv.org/abs/2411.18211v1
377 ChatRex: Taming Multimodal LLM for Joint Perception and Understanding Qing Jiang, Gen Luo, Yuqin Yang, Yuda Xiong, Yihao Chen, Zhaoyang Zeng, Tianhe Ren, Lei Zhang 2024-11-27 arXiv https://github.com/IDEA-Research/ChatRex http://arxiv.org/abs/2411.18363v2
378 Can LLMs be Good Graph Judger for Knowledge Graph Construction? Haoyu Huang, Chong Chen, Conghui He, Yang Li, Jiawei Jiang, Wentao Zhang 2024-11-26 arXiv https://github.com/hhy-huang/GraphJudger http://arxiv.org/abs/2411.17388v1
379 Leveraging Large Language Models and Topic Modeling for Toxicity Classification Haniyeh Ehsani Oskouie, Christina Chance, Claire Huang, Margaret Capetz, Elizabeth Eyeson, Majid Sarrafzadeh 2024-11-26 arXiv https://github.com/aheldis/Toxicity-Classification http://arxiv.org/abs/2411.17876v1
380 Star Attention: Efficient LLM Inference over Long Sequences Shantanu Acharya, Fei Jia, Boris Ginsburg 2024-11-26 arXiv https://github.com/NVIDIA/Star-Attention http://arxiv.org/abs/2411.17116v1
381 Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models Ronghuan Wu, Wanchao Su, Jing Liao 2024-11-25 arXiv https://chat2svg.github.io/ http://arxiv.org/abs/2411.16602v1
382 From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge Dawei Li, Bohan Jiang, Liangjie Huang, Alimohammad Beigi, Chengshuai Zhao, Zhen Tan, Amrita Bhattacharjee, Yuxuan Jiang, Canyu Chen, Tianhao Wu, Kai Shu, Lu Cheng, Huan Liu 2024-11-25 arXiv https://github.com/llm-as-a-judge/Awesome-LLM-as-a-judge http://arxiv.org/abs/2411.16594v4
383 Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision Zhiheng Xi, Dingwen Yang, Jixuan Huang, Jiafu Tang, Guanyu Li, Yiwen Ding, Wei He, Boyang Hong, Shihan Do, Wenyu Zhan, Xiao Wang, Rui Zheng, Tao Ji, Xiaowei Shi, Yitao Zhai, Rongxiang Weng, Jingang Wang, Xunliang Cai, Tao Gui, Zuxuan Wu, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Yu-Gang Jiang 2024-11-25 arXiv https://mathcritique.github.io/ http://arxiv.org/abs/2411.16579v1
384 ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration Haozhan Shen, Kangjia Zhao, Tiancheng Zhao, Ruochen Xu, Zilun Zhang, Mingwei Zhu, Jianwei Yin 2024-11-25 arXiv https://github.com/om-ai-lab/ZoomEye http://arxiv.org/abs/2411.16044v1
385 CS-Eval: A Comprehensive Large Language Model Benchmark for CyberSecurity Zhengmin Yu, Jiutian Zeng, Siyi Chen, Wenhan Xu, Dandan Xu, Xiangyu Liu, Zonghao Ying, Nan Wang, Yuan Zhang, Min Yang 2024-11-25 arXiv https://github.com/CS-EVAL/CS-Eval http://arxiv.org/abs/2411.16239v2
386 BayLing 2: A Multilingual Large Language Model with Efficient Language Alignment Shaolei Zhang, Kehao Zhang, Qingkai Fang, Shoutao Guo, Yan Zhou, Xiaodong Liu, Yang Feng 2024-11-25 arXiv https://github.com/ictnlp/BayLing https://doi.org/10.48550/arXiv.2411.16300
387 Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering Federico Cocchi, Nicholas Moratelli, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara 2024-11-25 arXiv https://github.com/aimagelab/ReflectiVA http://arxiv.org/abs/2411.16863v1
388 VidHal: Benchmarking Temporal Hallucinations in Vision LLMs Wey Yeh Choong, Yangyang Guo, Mohan Kankanhalli 2024-11-25 arXiv https://github.com/Lookuz/VidHal http://arxiv.org/abs/2411.16771v1
389 Multi-label Sequential Sentence Classification via Large Language Model Mengfei Lan, Lecheng Zheng, Shufan Ming, Halil Kilicoglu 2024-11-23 EMNLP https://github.com/ScienceNLP-Lab/LLM-SSC https://aclanthology.org/2024.findings-emnlp.944
390 ChemSafetyBench: Benchmarking LLM Safety on Chemistry Domain Haochen Zhao, Xiangru Tang, Ziran Yang, Xiao Han, Xuanzhi Feng, Yueqing Fan, Senhao Cheng, Di Jin, Yilun Zhao, Arman Cohan, Mark Gerstein 2024-11-23 arXiv https://github.com/HaochenZhao/SafeAgent4Chem http://arxiv.org/abs/2411.16736v1
391 Seed-Free Synthetic Data Generation Framework for Instruction-Tuning LLMs: A Case Study in Thai Parinthapat Pengpun, Can Udomcharoenchaikit, Weerayut Buaphet, Peerat Limkonchotiwat 2024-11-23 arXiv https://github.com/parinzee/seed-free-synthetic-instruct http://arxiv.org/abs/2411.15484v1
392 MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs Chaoyou Fu, Yi-Fan Zhang, Shukang Yin, Bo Li, Xinyu Fang, Sirui Zhao, Haodong Duan, Xing Sun, Ziwei Liu, Liang Wang, Caifeng Shan, Ran He 2024-11-22 arXiv https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models/tree/Benchmarks http://arxiv.org/abs/2411.15296v2
393 SemiKong: Curating, Training, and Evaluating A Semiconductor Industry-Specific Large Language Model Christopher Nguyen, William Nguyen, Atsushi Suzuki, Daisuke Oku, Hong An Phan, Sang Dinh, Zooey Nguyen, Anh Ha, Shruti Raghavan, Huy Vo, Thang Nguyen, Lan Nguyen, Yoshikuni Hirayama 2024-11-21 arXiv https://github.com/aitomatic/semikong http://arxiv.org/abs/2411.13802v2
394 UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs on Low-Resource Languages Bethel Melesse Tessema, Akhil Kedia, Tae-Sun Chung 2024-11-21 arXiv https://github.com/bethelmelesse/unifiedcrawl http://arxiv.org/abs/2411.14343v1
395 DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization Hexuan Deng, Wenxiang Jiao, Xuebo Liu, Min Zhang, Zhaopeng Tu 2024-11-21 arXiv https://github.com/hexuandeng/DRPruning http://arxiv.org/abs/2411.14055v1
396 Disentangling Memory and Reasoning Ability in Large Language Models Mingyu Jin, Weidi Luo, Sitao Cheng, Xinyi Wang, Wenyue Hua, Ruixiang Tang, William Yang Wang, Yongfeng Zhang 2024-11-20 arXiv https://github.com/MingyuJ666/Disentangling-Memory-and-Reasoning http://arxiv.org/abs/2411.13504v2
397 DriveMLLM: A Benchmark for Spatial Understanding with Multimodal Large Language Models in Autonomous Driving Xianda Guo, Ruijun Zhang, Yiqun Duan, Yuhang He, Chenming Zhang, Shuai Liu, Long Chen 2024-11-20 arXiv https://github.com/XiandaGuo/Drive-MLLM http://arxiv.org/abs/2411.13112v2
398 On the Consistency of Video Large Language Models in Temporal Comprehension Minjoon Jung, Junbin Xiao, Byoung-Tak Zhang, Angela Yao 2024-11-20 arXiv https://github.com/minjoong507/Consistency-of-Video-LLM http://arxiv.org/abs/2411.12951v1
399 Does Unlearning Truly Unlearn? A Black Box Evaluation of LLM Unlearning Methods Jai Doshi, Asa Cooper Stickland 2024-11-18 arXiv https://github.com/JaiDoshi/Knowledge-Erasure http://arxiv.org/abs/2411.12103v2
400 FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training Anjia Cao, Xing Wei, Zhiheng Ma 2024-11-18 arXiv https://github.com/MIV-XJTU/FLAME http://arxiv.org/abs/2411.11927v2
401 Understanding Multimodal LLMs: the Mechanistic Interpretability of Llava in Visual Question Answering Zeping Yu, Sophia Ananiadou 2024-11-17 arXiv https://github.com/zepingyu0512/llava-mechanism http://arxiv.org/abs/2411.10950v1
402 TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models Tingyu Qu, Mingxiao Li, Tinne Tuytelaars, Marie-Francine Moens 2024-11-17 arXiv https://github.com/tingyu215/TS-LLaVA http://arxiv.org/abs/2411.11066v1
403 BianCang: A Traditional Chinese Medicine Large Language Model Sibo Wei, Xueping Peng, Yi-fei Wang, Jiasheng Si, Weiyu Zhang, Wenpeng Lu, Xiaoming Wu, Yinglong Wang 2024-11-17 arXiv https://github.com/QLU-NLP/BianCang http://arxiv.org/abs/2411.11027v1
404 Multilingual Large Language Models: A Systematic Survey Shaolin Zhu, Supryadi, Shaoyang Xu, Haoran Sun, Leiyu Pan, Menglong Cui, Jiangcun Du, Renren Jin, António Branco, Deyi Xiong 2024-11-17 arXiv https://github.com/tjunlp-lab/Awesome-Multilingual-LLMs-Papers http://arxiv.org/abs/2411.11072v2
405 Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model Ting Liu, Liangtao Shi, Richang Hong, Yue Hu, Quanjun Yin, Linfeng Zhang 2024-11-16 arXiv https://github.com/liuting20/MustDrop http://arxiv.org/abs/2411.10803v1
406 Evaluating Creativity and Deception in Large Language Models: A Simulation Framework for Multi-Agent Balderdash Parsa Hejabi, Elnaz Rahmati, Alireza S. Ziabari, Preni Golazizian, Jesse Thomason, Morteza Dehghani 2024-11-15 arXiv https://github.com/ParsaHejabi/Simulation-Framework-for-Multi-Agent-Balderdash http://arxiv.org/abs/2411.10422v1
407 Instruction-Guided Editing Controls for Images and Multimedia: A Survey in LLM era Thanh Tam Nguyen, Zhao Ren, Trinh Pham, Thanh Trung Huynh, Phi Le Nguyen, Hongzhi Yin, Quoc Viet Hung Nguyen 2024-11-15 arXiv https://github.com/tamlhp/awesome-instruction-editing http://arxiv.org/abs/2411.09955v2
408 Orca: Enhancing Role-Playing Abilities of Large Language Models by Integrating Personality Traits Yuxuan Huang 2024-11-15 arXiv https://github.com/Aipura/Orca http://arxiv.org/abs/2411.10006v1
409 Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination Haojie Zheng, Tianyang Xu, Hanchi Sun, Shu Pu, Ruoxi Chen, Lichao Sun 2024-11-15 arXiv https://github.com/Terry-Xu-666/visual_inference_chain http://arxiv.org/abs/2411.12591v1
410 DROJ: A Prompt-Driven Attack against Large Language Models Leyang Hu, Boran Wang 2024-11-14 arXiv https://github.com/Leon-Leyang/LLM-Safeguard http://arxiv.org/abs/2411.09125v1
411 MM-Eval: A Hierarchical Benchmark for Modern Mongolian Evaluation in LLMs Mengyuan Zhang, Ruihui Wang, Bo Xia, Yuan Sun, Xiaobing Zhao 2024-11-14 arXiv https://github.com/joenahm/MM-Eval http://arxiv.org/abs/2411.09492v1
412 LHRS-Bot-Nova: Improved Multimodal Large Language Model for Remote Sensing Vision-Language Interpretation Zhenshi Li, Dilxat Muhtar, Feng Gu, Xueliang Zhang, Pengfeng Xiao, Guangjun He, Xiaoxiang Zhu 2024-11-14 arXiv https://github.com/NJU-LHRS/LHRS-Bot https://doi.org/10.48550/arXiv.2411.09301
413 CorrectBench: Automatic Testbench Generation with Functional Self-Correction using LLMs for HDL Design Ruidi Qiu, Grace Li Zhang, Rolf Drechsler, Ulf Schlichtmann, Bing Li 2024-11-13 arXiv https://github.com/AutoBench/CorrectBench http://arxiv.org/abs/2411.08510v1
414 DART-LLM: Dependency-Aware Multi-Robot Task Decomposition and Execution using Large Language Models Yongdong Wang, Runze Xiao, Jun Younes Louhi Kasahara, Ryosuke Yajima, Keiji Nagatani, Atsushi Yamashita, Hajime Asama 2024-11-13 arXiv https://wyd0817.github.io/project-dart-llm/ http://arxiv.org/abs/2411.09022v1
415 Large Language Models Can Self-Improve in Long-context Reasoning Siheng Li, Cheng Yang, Zesen Cheng, Lemao Liu, Mo Yu, Yujiu Yang, Wai Lam 2024-11-12 arXiv https://github.com/SihengLi99/SEALONG http://arxiv.org/abs/2411.08147v1
416 Verbosity ≠ Veracity: Demystify Verbosity Compensation Behavior of Large Language Models Yusen Zhang, Sarkar Snigdha Sarathi Das, Rui Zhang 2024-11-12 arXiv https://github.com/psunlpgroup/VerbosityLLM http://arxiv.org/abs/2411.07858v2
417 ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction? Canyu Chen, Jian Yu, Shan Chen, Che Liu, Zhongwei Wan, Danielle Bitterman, Fei Wang, Kai Shu 2024-11-10 arXiv https://clinicalbench.github.io http://arxiv.org/abs/2411.06469v1
418 Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language Models Xiaojun Wu, Junxi Liu, Huanyi Su, Zhouchi Lin, Yiyan Qi, Chengjin Xu, Jiajun Su, Jiajie Zhong, Fuwei Wang, Saizhuo Wang, Fengrui Hua, Jia Li, Jian Guo 2024-11-09 arXiv https://github.com/IDEA-FinAI/Golden-Touchstone http://arxiv.org/abs/2411.06272v1
419 TourSynbio-Search: A Large Language Model Driven Agent Framework for Unified Search Method for Protein Engineering Yungeng Liu, Zan Chen, Yu Guang Wang, Yiqing Shen 2024-11-09 arXiv https://github.com/tsynbio/Toursynbio-Search http://arxiv.org/abs/2411.06024v1
420 WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models Shengda Fan, Xin Cong, Yuepeng Fu, Zhong Zhang, Shuyan Zhang, Yuanwei Liu, Yesai Wu, Yankai Lin, Zhiyuan Liu, Maosong Sun 2024-11-08 arXiv https://github.com/OpenBMB/WorkflowLLM http://arxiv.org/abs/2411.05451v1
421 Game-theoretic LLM: Agent Workflow for Negotiation Games Wenyue Hua, Ollie Liu, Lingyao Li, Alfonso Amayuelas, Julie Chen, Lucas Jiang, Mingyu Jin, Lizhou Fan, Fei Sun, William Wang, Xintong Wang, Yongfeng Zhang 2024-11-08 arXiv https://github.com/Wenyueh/game_theory http://arxiv.org/abs/2411.05990v2
422 Exploring the Alignment Landscape: LLMs and Geometric Deep Models in Protein Representation Dong Shu, Bingbing Duan, Kai Guo, Kaixiong Zhou, Jiliang Tang, Mengnan Du 2024-11-08 arXiv https://github.com/Tizzzzy/LLM-GDM-alignment http://arxiv.org/abs/2411.05316v1
423 AutoProteinEngine: A Large Language Model Driven Agent Framework for Multimodal AutoML in Protein Engineering Yungeng Liu, Zan Chen, Yu Guang Wang, Yiqing Shen 2024-11-07 arXiv https://github.com/tsynbio/AutoPE http://arxiv.org/abs/2411.04440v1
424 FineTuneBench: How well do commercial fine-tuning APIs infuse knowledge into LLMs? Eric Wu, Kevin Wu, James Zou 2024-11-07 arXiv https://github.com/kevinwu23/StanfordFineTuneBench http://arxiv.org/abs/2411.05059v2
425 Robust and Efficient Fine-tuning of LLMs with Bayesian Reparameterization of Low-Rank Adaptation Ayan Sengupta, Vaibhav Seth, Arinjay Pathak, Natraj Raman, Sriram Gopalakrishnan, Tanmoy Chakraborty 2024-11-07 arXiv https://github.com/LCS2-IIITD/MonteCLoRA http://arxiv.org/abs/2411.04358v2
426 Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model Young-Jun Lee, Dokyong Lee, Junyoung Youn, Kyeongjin Oh, Ho-Jin Choi 2024-11-07 arXiv https://github.com/passing2961/Thanos http://arxiv.org/abs/2411.04496v1
427 Abstract2Appendix: Academic Reviews Enhance LLM Long-Context Capabilities Shengzhi Li, Kittipat Kampa, Rongyu Lin, Bohang Li, Shichao Pei 2024-11-07 arXiv https://github.com/findalexli/Abstract2Appendix http://arxiv.org/abs/2411.05232v1
428 Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models Zhijian Zhuo, Ya Wang, Yutao Zeng, Xiaoqing Li, Xun Zhou, Jinwen Ma 2024-11-06 arXiv https://github.com/BryceZhuo/PolyCom http://arxiv.org/abs/2411.03884v1
429 QUILL: Quotation Generation Enhancement of Large Language Models Jin Xiao, Bowei Zhang, Qianyu He, Jiaqing Liang, Feng Wei, Jinglei Chen, Zujie Liang, Deqing Yang, Yanghua Xiao 2024-11-06 arXiv https://github.com/GraceXiaoo/QUILL http://arxiv.org/abs/2411.03675v1
430 SMoA: Improving Multi-agent Large Language Models with Sparse Mixture-of-Agents Dawei Li, Zhen Tan, Peijia Qian, Yifan Li, Kumar Satvik Chaudhary, Lijie Hu, Jiayi Shen 2024-11-05 arXiv https://github.com/David-Li0406/SMoA http://arxiv.org/abs/2411.03284v1
431 Stochastic Monkeys at Play: Random Augmentations Cheaply Break LLM Safety Alignment Jason Vega, Junsheng Huang, Gaokai Zhang, Hangoo Kang, Minjia Zhang, Gagandeep Singh 2024-11-05 arXiv https://github.com/uiuc-focal-lab/stochastic-monkeys/ http://arxiv.org/abs/2411.02785v2
432 Change Is the Only Constant: Dynamic LLM Slicing based on Layer Redundancy Razvan-Gabriel Dumitru, Paul-Ioan Clotan, Vikas Yadav, Darius Peteleaza, Mihai Surdeanu 2024-11-05 arXiv https://github.com/RazvanDu/DynamicSlicing http://arxiv.org/abs/2411.03513v1
433 Leveraging Large Language Models in Code Question Answering: Baselines and Issues Georgy Andryushchenko, Vladimir Ivanov, Vladimir Makharev, Elizaveta Tukhtina, Aidar Valeev 2024-11-05 arXiv https://github.com/IU-AES-AI4Code/CodeQuestionAnswering http://arxiv.org/abs/2411.03012v1
434 FlexCAD: Unified and Versatile Controllable CAD Generation with Fine-tuned Large Language Models Zhanwei Zhang, Shizhao Sun, Wenxiao Wang, Deng Cai, Jiang Bian 2024-11-05 arXiv https://github.com/microsoft/CADGeneration/FlexCAD http://arxiv.org/abs/2411.05823v1
435 Culinary Class Wars: Evaluating LLMs using ASH in Cuisine Transfer Task Hoonick Lee, Mogan Gim, Donghyeon Park, Donghee Choi, Jaewoo Kang 2024-11-04 arXiv http://github.com/dmis-lab/CulinaryASH http://arxiv.org/abs/2411.01996v1
436 Eurekaverse: Environment Curriculum Generation via Large Language Models William Liang, Sam Wang, Hung-Ju Wang, Osbert Bastani, Dinesh Jayaraman, Yecheng Jason Ma 2024-11-04 arXiv https://eureka-research.github.io/eurekaverse http://arxiv.org/abs/2411.01775v1
437 SQL Injection Jailbreak: a structural disaster of large language models Jiawei Zhao, Kejiang Chen, Weiming Zhang, Nenghai Yu 2024-11-03 arXiv https://github.com/weiyezhimeng/SQL-Injection-Jailbreak http://arxiv.org/abs/2411.01565v3
438 TODO: Enhancing LLM Alignment with Ternary Preferences Yuxiang Guo, Lu Yin, Bo Jiang, Jiaqi Zhang 2024-11-02 arXiv https://github.com/XXares/TODO http://arxiv.org/abs/2411.02442v1
439 Fish-Speech: Leveraging Large Language Models for Advanced Multilingual Text-to-Speech Synthesis Shijia Liao, Yuxuan Wang, Tianyu Li, Yifan Cheng, Ruoyi Zhang, Rongzhi Zhou, Yijin Xing 2024-11-02 arXiv https://github.com/fishaudio/fish-speech http://arxiv.org/abs/2411.01156v2
440 Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection Han Yin, Yang Xiao, Jisheng Bai, Rohan Kumar Das 2024-11-02 arXiv https://github.com/apple-yinhan/Noise-robust-SED http://arxiv.org/abs/2411.01174v1
441 Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM Xiong Wang, Yangze Li, Chaoyou Fu, Yunhang Shen, Lei Xie, Ke Li, Xing Sun, Long Ma 2024-11-01 arXiv https://freeze-omni.github.io/ http://arxiv.org/abs/2411.00774v5
442 LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models Nam V. Nguyen, Thong T. Doan, Luong Tran, Van Nguyen, Quang Pham 2024-11-01 arXiv https://fsoft-aic.github.io/fsoft-LibMoE.github.io http://arxiv.org/abs/2411.00918v1
443 Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling Yiwen Ding, Zhiheng Xi, Wei He, Zhuoyuan Li, Yitao Zhai, Xiaowei Shi, Xunliang Cai, Tao Gui, Qi Zhang, Xuanjing Huang 2024-11-01 arXiv https://github.com/Yiwen-Ding/Guided-Self-Improvement http://arxiv.org/abs/2411.00750v1
444 MoD: A Distribution-Based Approach for Merging Large Language Models Quy-Anh Dang, Chris Ngo 2024-11-01 arXiv https://github.com/knovel-eng/mod http://arxiv.org/abs/2411.00406v1
445 SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models Jianyi Zhang, Da-Cheng Juan, Cyrus Rashtchian, Chun-Sung Ferng, Heinrich Jiang, Yiran Chen 2024-11-01 arXiv https://jayzhang42.github.io/sled_page/ http://arxiv.org/abs/2411.02433v2
446 Beyond Utility: Evaluating LLM as Recommender Chumeng Jiang, Jiayin Wang, Weizhi Ma, Charles L. A. Clarke, Shuai Wang, Chuhan Wu, Min Zhang 2024-11-01 arXiv https://github.com/JiangDeccc/EvaLLMasRecommender http://arxiv.org/abs/2411.00331v1
447 EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Unified Compression and Adaptive Layer Voting Zhongzhi Yu, Zheng Wang, Yuhan Li, Haoran You, Ruijie Gao, Xiaoya Zhou, Sreenidhi Reddy Bommu, Yang Katie Zhao, Yingyan Celine Lin 2024-11 DAC '24: Proceedings of the 61st ACM/IEEE Design Automation Conference https://github.com/GATECH-EIC/Edge-LLM https://dl.acm.org/doi/10.1145/3649329.3658473
448 Have You Merged My Model? On The Robustness of Large Language Model IP Protection Methods Against Model Merging Tianshuo Cong, Delong Ran, Zesen Liu, Xinlei He, Jinyuan Liu, Yichen Gong, Qi Li, Anyu Wang, Xiaoyun Wang 2024-11 LAMPS '24: Proceedings of the 1st ACM Workshop on Large AI Systems and Models with Privacy and Safety Analysis https://github.com/ThuCCSLab/MergeGuard https://dl.acm.org/doi/10.1145/3689217.3690614
449 Large Language Models for Anomaly Detection in Computational Workflows: From Supervised Fine-Tuning to In-Context Learning Hongwei Jin, George Papadimitriou, Krishnan Raghavan, Pawel Zuk, Prasanna Balaprakash, Cong Wang, Anirban Mandal, Ewa Deelman 2024-11 SC '24: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis https://github.com/PoSeiDon-Workflows/LLM_AD https://dl.acm.org/doi/10.1109/SC41406.2024.00098
450 LLaMo: Large Language Model-based Molecular Graph Assistant Jinyoung Park, Minseong Bae, Dohwan Ko, Hyunwoo J. Kim 2024-10-31 arXiv https://github.com/mlvlab/LLaMo http://arxiv.org/abs/2411.00871v1
451 What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective Ming Li, Yanhong Li, Tianyi Zhou 2024-10-31 arXiv https://github.com/MingLiiii/Layer_Gradient http://arxiv.org/abs/2410.23743v1
452 Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models Haritz Puerto, Martin Gubri, Sangdoo Yun, Seong Joon Oh 2024-10-31 arXiv https://github.com/parameterlab/mia-scaling http://arxiv.org/abs/2411.00154v1
453 BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments Xinghao Wang, Pengyu Wang, Bo Wang, Dong Zhang, Yunhua Zhou, Xipeng Qiu 2024-10-31 arXiv https://github.com/xinghaow99/BitStack http://arxiv.org/abs/2410.23918v1
454 LLM4Mat-Bench: Benchmarking Large Language Models for Materials Property Prediction Andre Niyongabo Rubungo, Kangming Li, Jason Hattrick-Simpers, Adji Bousso Dieng 2024-10-31 arXiv https://github.com/vertaix/LLM4Mat-Bench http://arxiv.org/abs/2411.00177v3
455 End-to-End Ontology Learning with Large Language Models Andy Lo, Albert Q. Jiang, Wenda Li, Mateja Jamnik 2024-10-31 arXiv https://github.com/andylolu2/ollm http://arxiv.org/abs/2410.23584v1
456 DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios Junchao Wu, Runzhe Zhan, Derek F. Wong, Shu Yang, Xinyi Yang, Yulin Yuan, Lidia S. Chao 2024-10-31 arXiv https://github.com/NLP2CT/DetectRL http://arxiv.org/abs/2410.23746v1
457 SciPIP: An LLM-based Scientific Paper Idea Proposer Wenxiao Wang, Lihui Gu, Liye Zhang, Yunxiang Luo, Yi Dai, Chen Shen, Liang Xie, Binbin Lin, Xiaofei He, Jieping Ye 2024-10-30 arXiv https://github.com/cheerss/SciPIP http://arxiv.org/abs/2410.23166v1
458 Online Intrinsic Rewards for Decision Making Agents from Large Language Model Feedback Qinqing Zheng, Mikael Henaff, Amy Zhang, Aditya Grover, Brandon Amos 2024-10-30 arXiv https://github.com/facebookresearch/oni http://arxiv.org/abs/2410.23022v2
459 ReasoningRec: Bridging Personalized Recommendations and Human-Interpretable Explanations through LLM Reasoning Millennium Bismay, Xiangjue Dong, James Caverlee 2024-10-30 arXiv https://github.com/millenniumbismay/reasoningrec http://arxiv.org/abs/2410.23180v1
460 Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning Keqin Bao, Ming Yan, Yang Zhang, Jizhi Zhang, Wenjie Wang, Fuli Feng, Xiangnan He 2024-10-30 arXiv https://github.com/ym689/rec_icl http://arxiv.org/abs/2410.23136v1
461 Causality-Enhanced Behavior Sequence Modeling in LLMs for Personalized Recommendation Yang Zhang, Juntao You, Yimeng Bai, Jizhi Zhang, Keqin Bao, Wenjie Wang, Tat-Seng Chua 2024-10-30 arXiv https://github.com/itsmeyjt/CFT http://arxiv.org/abs/2410.22809v1
462 On Memorization of Large Language Models in Logical Reasoning Chulin Xie, Yangsibo Huang, Chiyuan Zhang, Da Yu, Xinyun Chen, Bill Yuchen Lin, Bo Li, Badih Ghazi, Ravi Kumar 2024-10-30 arXiv https://memkklogic.github.io http://arxiv.org/abs/2410.23123v1
463 Comparative Analysis of Demonstration Selection Algorithms for LLM In-Context Learning Dong Shu, Mengnan Du 2024-10-30 arXiv https://github.com/Tizzzzy/Demonstration_Selection_Overview http://arxiv.org/abs/2410.23099v1
464 BUZZ: Beehive-structured Sparse KV Cache with Segmented Heavy Hitters for Efficient LLM Inference Junqi Zhao, Zhijin Fang, Shu Li, Shaohui Yang, Shichao He 2024-10-30 arXiv https://github.com/JunqiZhao888/buzz-llm http://arxiv.org/abs/2410.23079v1
465 Distinguishing Ignorance from Error in LLM Hallucinations Adi Simhi, Jonathan Herzig, Idan Szpektor, Yonatan Belinkov 2024-10-29 arXiv https://github.com/technion-cs-nlp/hallucination-mitigation http://arxiv.org/abs/2410.22071v1
466 Leveraging LLMs for Hypothetical Deduction in Logical Inference: A Neuro-Symbolic Approach Qingchuan Li, Jiatong Li, Tongxuan Liu, Yuting Zeng, Mingyue Cheng, Weizhe Huang, Qi Liu 2024-10-29 arXiv https://github.com/wufeiwuwoshihua/nshy http://arxiv.org/abs/2410.21779v1
467 Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance Dongmin Park, Sebin Kim, Taehong Moon, Minkyu Kim, Kangwook Lee, Jaewoong Cho 2024-10-29 arXiv https://github.com/krafton-ai/Rare2Frequent http://arxiv.org/abs/2410.22376v1
468 Scaling LLM Inference with Optimized Sample Compute Allocation Kexun Zhang, Shang Zhou, Danqing Wang, William Yang Wang, Lei Li 2024-10-29 arXiv https://github.com/LeiLiLab/OSCA http://arxiv.org/abs/2410.22480v1
469 Hacking Back the AI-Hacker: Prompt Injection as a Defense Against LLM-driven Cyberattacks Dario Pasquini, Evgenios M. Kornaropoulos, Giuseppe Ateniese 2024-10-28 arXiv https://github.com/pasquini-dario/project_mantis http://arxiv.org/abs/2410.20911v2
470 Instruction-Tuned LLMs Succeed in Document-Level MT Without Fine-Tuning -- But BLEU Turns a Blind Eye Yirong Sun, Dawei Zhu, Yanjun Chen, Erjia Xiao, Xinghao Chen, Xiaoyu Shen 2024-10-28 arXiv https://github.com/EIT-NLP/BLEUless_DocMT http://arxiv.org/abs/2410.20941v2
471 LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment Ge Yang, Changyi He, Jinyang Guo, Jianyu Wu, Yifu Ding, Aishan Liu, Haotong Qin, Pengliang Ji, Xianglong Liu 2024-10-28 arXiv https://github.com/AboveParadise/LLMCBench http://arxiv.org/abs/2410.21352v2
472 NewTerm: Benchmarking Real-Time New Terms for Large Language Models with Annual Updates Hexuan Deng, Wenxiang Jiao, Xuebo Liu, Min Zhang, Zhaopeng Tu 2024-10-28 arXiv https://github.com/hexuandeng/NewTerm http://arxiv.org/abs/2410.20814v1
473 ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference Hanshi Sun, Li-Wen Chang, Wenlei Bao, Size Zheng, Ningxin Zheng, Xin Liu, Harry Dong, Yuejie Chi, Beidi Chen 2024-10-28 arXiv https://github.com/bytedance/ShadowKV http://arxiv.org/abs/2410.21465v1
474 Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models Yilun Jin, Zheng Li, Chenwei Zhang, Tianyu Cao, Yifan Gao, Pratik Jayarao, Mao Li, Xin Liu, Ritesh Sarkhel, Xianfeng Tang, Haodong Wang, Zhengyang Wang, Wenju Xu, Jingfeng Yang, Qingyu Yin, Xian Li, Priyanka Nigam, Yi Xu, Kai Chen, Qiang Yang, Meng Jiang, Bing Yin 2024-10-28 arXiv https://github.com/KL4805/ShoppingMMLU http://arxiv.org/abs/2410.20745v2
475 Simple is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation Mufei Li, Siqi Miao, Pan Li 2024-10-28 arXiv https://github.com/Graph-COM/SubgraphRAG http://arxiv.org/abs/2410.20724v2
476 SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization Wanhua Li, Zibin Meng, Jiawei Zhou, Donglai Wei, Chuang Gan, Hanspeter Pfister 2024-10-28 arXiv https://mengzibin.github.io/SocialGPT.github.io/ http://arxiv.org/abs/2410.21411v1
477 Learning from Response not Preference: A Stackelberg Approach for LLM Detoxification using Non-parallel Data Xinhong Xie, Tao Li, Quanyan Zhu 2024-10-27 arXiv https://github.com/XXXinhong/Detoxification_LLM http://arxiv.org/abs/2410.20298v1
478 Enhancing Inflation Nowcasting with LLM: Sentiment Analysis on News Marc-Antoine Allard, Paul Teiletche, Adam Zinebi 2024-10-26 arXiv https://github.com/paultltc/InflaBERT http://arxiv.org/abs/2410.20198v1
479 LLMs Can Evolve Continually on Modality for X-Modal Reasoning Jiazuo Yu, Haomiao Xiong, Lu Zhang, Haiwen Diao, Yunzhi Zhuge, Lanqing Hong, Dong Wang, Huchuan Lu, You He, Long Chen 2024-10-26 arXiv https://github.com/JiazuoYu/PathWeave http://arxiv.org/abs/2410.20178v2
480 APRICOT: Active Preference Learning and Constraint-Aware Task Planning with LLMs Huaxiaoyue Wang, Nathaniel Chin, Gonzalo Gonzalez-Pumariega, Xiangwan Sun, Neha Sunkara, Maximus Adrian Pace, Jeannette Bohg, Sanjiban Choudhury 2024-10-25 arXiv https://portal-cornell.github.io/apricot/ http://arxiv.org/abs/2410.19656v1
481 Language Agents Meet Causality -- Bridging LLMs and Causal World Models John Gkountouras, Matthias Lindemann, Phillip Lippe, Efstratios Gavves, Ivan Titov 2024-10-25 arXiv https://j0hngou.github.io/LLMCWM/ http://arxiv.org/abs/2410.19923v1
482 Delving into the Reversal Curse: How Far Can Large Language Models Generalize? Zhengkai Lin, Zhihang Fu, Kai Liu, Liang Xie, Binbin Lin, Wenxiao Wang, Deng Cai, Yue Wu, Jieping Ye 2024-10-24 arXiv https://github.com/alibaba/thinking_bias http://arxiv.org/abs/2410.18808v2
483 Distill Visual Chart Reasoning Ability from LLMs to MLLMs Wei He, Zhiheng Xi, Wanxu Zhao, Xiaoran Fan, Yiwen Ding, Zifei Shan, Tao Gui, Qi Zhang, Xuanjing Huang 2024-10-24 arXiv https://github.com/hewei2001/ReachQA http://arxiv.org/abs/2410.18798v1
484 GCoder: Improving Large Language Model for Generalized Graph Problem Solving Qifan Zhang, Xiaobin Hong, Jianheng Tang, Nuo Chen, Yuhan Li, Wenzhong Li, Jing Tang, Jia Li 2024-10-24 arXiv https://github.com/Bklight999/WWW25-GCoder/tree/master http://arxiv.org/abs/2410.19084v1
485 Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design Ruisi Cai, Yeonju Ro, Geon-Woo Kim, Peihao Wang, Babak Ehteshami Bejnordi, Aditya Akella, Zhangyang Wang 2024-10-24 arXiv https://github.com/VITA-Group/READ-ME http://arxiv.org/abs/2410.19123v1
486 AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models Kim Sung-Bin, Oh Hyun-Bin, JungMok Lee, Arda Senocak, Joon Son Chung, Tae-Hyun Oh 2024-10-23 arXiv https://github.com/AVHBench/AVHBench http://arxiv.org/abs/2410.18325v1
487 CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Activation Qinsi Wang, Saeed Vahidian, Hancheng Ye, Jianyang Gu, Jianyi Zhang, Yiran Chen 2024-10-23 arXiv https://wangqinsi1.github.io/coreinfer_page/ http://arxiv.org/abs/2410.18311v1
488 Cross-model Control: Improving Multiple Large Language Models in One-time Training Jiayi Wu, Hao Sun, Hengyi Cai, Lixin Su, Shuaiqiang Wang, Dawei Yin, Xiang Li, Ming Gao 2024-10-23 arXiv https://github.com/wujwyi/CMC http://arxiv.org/abs/2410.17599v1
489 ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage Taewhoo Lee, Chanwoong Yoon, Kyochul Jang, Donghyeon Lee, Minju Song, Hyunjae Kim, Jaewoo Kang 2024-10-22 arXiv https://github.com/dmis-lab/ETHIC http://arxiv.org/abs/2410.16848v1
490 Large Language Models Empowered Personalized Web Agents Hongru Cai, Yongqi Li, Wenjie Wang, Fengbin Zhu, Xiaoyu Shen, Wenjie Li, Tat-Seng Chua 2024-10-22 arXiv https://hongrucai.github.io/PersonalWAB/ http://arxiv.org/abs/2410.17236v1
491 Improving Causal Reasoning in Large Language Models: A Survey Longxuan Yu, Delin Chen, Siheng Xiong, Qingyang Wu, Qingzhen Liu, Dawei Li, Zhikai Chen, Xiaoze Liu, Liangming Pan 2024-10-22 arXiv https://github.com/chendl02/Awesome-LLM-causal-reasoning http://arxiv.org/abs/2410.16676v3
492 VoiceBench: Benchmarking LLM-Based Voice Assistants Yiming Chen, Xianghu Yue, Chen Zhang, Xiaoxue Gao, Robby T. Tan, Haizhou Li 2024-10-22 arXiv https://github.com/MatthewCYM/VoiceBench http://arxiv.org/abs/2410.17196v3
493 DEAN: Deactivating the Coupled Neurons to Mitigate Fairness-Privacy Conflicts in Large Language Models Chen Qian, Dongrui Liu, Jie Zhang, Yong Liu, Jing Shao 2024-10-22 arXiv https://github.com/ChnQ/DEAN http://arxiv.org/abs/2410.16672v1
494 AMUSD: Asynchronous Multi-Device Speculative Decoding for LLM Acceleration Bradley McDanel 2024-10-22 arXiv https://github.com/BradMcDanel/AMUSD/ http://arxiv.org/abs/2410.17375v1
495 Automated Spinal MRI Labelling from Reports Using a Large Language Model Robin Y. Park, Rhydian Windsor, Amir Jamaludin, Andrew Zisserman 2024-10-22 MICCAI https://github.com/robinyjpark/AutoLabelClassifier https://doi.org/10.1007/978-3-031-72086-4_10
496 CoPS: Empowering LLM Agents with Provable Cross-Task Experience Sharing Chen Yang, Chenyang Zhao, Quanquan Gu, Dongruo Zhou 2024-10-22 arXiv https://github.com/uclaml/COPS http://arxiv.org/abs/2410.16670v1
497 LLaVA-KD: A Framework of Distilling Multimodal Large Language Models Yuxuan Cai, Jiangning Zhang, Haoyang He, Xinwei He, Ao Tong, Zhenye Gan, Chengjie Wang, Xiang Bai 2024-10-21 arXiv https://github.com/Fantasyele/LLaVA-KD http://arxiv.org/abs/2410.16236v2
498 Mesa-Extrapolation: A Weave Position Encoding Method for Enhanced Extrapolation in LLMs Xin Ma, Yang Liu, Jingjing Liu, Xiaoxu Ma 2024-10-21 arXiv https://github.com/soacker/Mesa-Extrapolation http://arxiv.org/abs/2410.15859v3
499 MagicPIG: LSH Sampling for Efficient LLM Generation Zhuoming Chen, Ranajoy Sadhukhan, Zihao Ye, Yang Zhou, Jianyu Zhang, Niklas Nolte, Yuandong Tian, Matthijs Douze, Leon Bottou, Zhihao Jia, Beidi Chen 2024-10-21 arXiv https://github.com/Infini-AI-Lab/MagicPIG http://arxiv.org/abs/2410.16179v4
500 RAC: Efficient LLM Factuality Correction with Retrieval Augmentation Changmao Li, Jeffrey Flanigan 2024-10-21 arXiv https://github.com/jlab-nlp/Retrieval-Augmented-Correction http://arxiv.org/abs/2410.15667v1
501 Developing Retrieval Augmented Generation (RAG) based LLM Systems from PDFs: An Experience Report Ayman Asad Khan, Md Toufique Hasan, Kai Kristian Kemell, Jussi Rasku, Pekka Abrahamsson 2024-10-21 arXiv https://github.com/GPT-Laboratory/RAG-LLM-Development-Guidebook-from-PDFs http://arxiv.org/abs/2410.15944v1
502 CausalGraph2LLM: Evaluating LLMs for Causal Queries Ivaxi Sheth, Bahare Fatemi, Mario Fritz 2024-10-21 arXiv https://github.com/ivaxi0s/CausalGraph2LLM http://arxiv.org/abs/2410.15939v1
503 Boosting Jailbreak Transferability for Large Language Models Hanqing Liu, Lifeng Zhou, Huanqian Yan 2024-10-21 arXiv https://github.com/HqingLiu/SI-GCG http://arxiv.org/abs/2410.15645v2
504 A Comprehensive Evaluation of Cognitive Biases in LLMs Simon Malberg, Roman Poletukhin, Carolin M. Schuster, Georg Groh 2024-10-20 arXiv https://github.com/simonmalberg/cognitive-biases-in-llms http://arxiv.org/abs/2410.15413v1
505 Are LLMs Good Zero-Shot Fallacy Classifiers? Fengjun Pan, Xiaobao Wu, Zongrui Li, Anh Tuan Luu 2024-10-19 arXiv https://github.com/panFJCharlotte98/Fallacy_Detection http://arxiv.org/abs/2410.15050v1
506 Evaluating Deep Unlearning in Large Language Models Ruihan Wu, Chhavi Yadav, Russ Salakhutdinov, Kamalika Chaudhuri 2024-10-19 arXiv https://github.com/wrh14/deep_unlearning http://arxiv.org/abs/2410.15153v3
507 Explaining Graph Neural Networks with Large Language Models: A Counterfactual Perspective for Molecular Property Prediction Yinhan He, Zaiyi Zheng, Patrick Soga, Yaozhen Zhu, Yushun Dong, Jundong Li 2024-10-19 EMNLP 2024 (Findings) https://github.com/YinhanHe123/new_LLM4GNNExplanation http://arxiv.org/abs/2410.15165v1
508 GlitchMiner: Mining Glitch Tokens in Large Language Models via Gradient-based Discrete Optimization Zihui Wu, Haichang Gao, Ping Wang, Shudong Zhang, Zhaoxiang Liu, Shiguo Lian 2024-10-19 arXiv https://github.com/wooozihui/GlitchMiner http://arxiv.org/abs/2410.15052v4
509 Imprompter: Tricking LLM Agents into Improper Tool Use Xiaohan Fu, Shuheng Li, Zihan Wang, Yihao Liu, Rajesh K. Gupta, Taylor Berg-Kirkpatrick, Earlence Fernandes 2024-10-19 arXiv https://github.com/Reapor-Yurnero/imprompter http://arxiv.org/abs/2410.14923v2
510 MCCoder: Streamlining Motion Control with LLM-Assisted Code Generation and Rigorous Verification Yin Li, Liangwei Wang, Shiyuan Piao, Boo-Ho Yang, Ziyue Li, Wei Zeng, Fugee Tsung 2024-10-19 arXiv https://github.com/MCCodeAI/MCCoder http://arxiv.org/abs/2410.15154v1
511 REEF: Representation Encoding Fingerprints for Large Language Models Jie Zhang, Dongrui Liu, Chen Qian, Linfeng Zhang, Yong Liu, Yu Qiao, Jing Shao 2024-10-18 arXiv https://github.com/tmylla/REEF http://arxiv.org/abs/2410.14273v1
512 Synthesizing Post-Training Data for LLMs through Multi-Agent Simulation Shuo Tang, Xianghe Pang, Zexi Liu, Bohan Tang, Rui Ye, Xiaowen Dong, Yanfeng Wang, Siheng Chen 2024-10-18 arXiv https://github.com/ShuoTang123/MATRIX-Gen http://arxiv.org/abs/2410.14251v1
513 SRAP-Agent: Simulating and Optimizing Scarce Resource Allocation Policy with LLM-based Agent Jiarui Ji, Yang Li, Hongtao Liu, Zhicheng Du, Zhewei Wei, Weiran Shen, Qi Qi, Yankai Lin 2024-10-18 arXiv https://github.com/jijiarui-cather/SRAPAgent_Framework http://arxiv.org/abs/2410.14152v1
514 Towards Faithful Natural Language Explanations: A Study Using Activation Patching in Large Language Models Wei Jie Yeo, Ranjan Satapathy, Erik Cambria 2024-10-18 arXiv https://github.com/wj210/Causal-Faithfulness https://doi.org/10.48550/arXiv.2410.14155
515 Enabling Scalable Evaluation of Bias Patterns in Medical LLMs Hamed Fayyaz, Raphael Poulain, Rahmatollah Beheshti 2024-10-18 arXiv https://github.com/healthylaife/autofair http://arxiv.org/abs/2410.14763v1
516 CoMAL: Collaborative Multi-Agent Large Language Models for Mixed-Autonomy Traffic Huaiyuan Yao, Longchao Da, Vishnu Nandam, Justin Turnau, Zhiwei Liu, Linsey Pang, Hua Wei 2024-10-18 arXiv https://github.com/Hyan-Yao/CoMAL http://arxiv.org/abs/2410.14368v1
517 Do LLMs Overcome Shortcut Learning? An Evaluation of Shortcut Challenges in Large Language Models Yu Yuan, Lili Zhao, Kai Zhang, Guangting Zheng, Qi Liu 2024-10-17 EMNLP https://github.com/yyhappier/ShortcutSuite https://aclanthology.org/2024.emnlp-main.679
518 Data Defenses Against Large Language Models William Agnew, Harry H. Jiang, Cella Sum, Maarten Sap, Sauvik Das 2024-10-17 arXiv https://github.com/wagnew3/LLMDataDefenses http://arxiv.org/abs/2410.13138v1
519 FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs Forrest Sheng Bao, Miaoran Li, Renyi Qu, Ge Luo, Erana Wan, Yujia Tang, Weisi Fan, Manveer Singh Tamber, Suleman Kazi, Vivek Sourabh, Mike Qi, Ruixuan Tu, Chenyu Xu, Matthew Gonzales, Ofer Mendelevitch, Amin Ahmad 2024-10-17 arXiv https://github.com/vectara/FaithBench http://arxiv.org/abs/2410.13210v1
520 LLM-Rank: A Graph Theoretical Approach to Pruning Large Language Models David Hoffmann, Kailash Budhathoki, Matthaeus Kleindessner 2024-10-17 arXiv https://github.com/amazon-science/llm-rank-pruning http://arxiv.org/abs/2410.13299v2
521 Retrieval-Augmented Personalization for Multimodal Large Language Models Haoran Hao, Jiaming Han, Changsheng Li, Yu-Feng Li, Xiangyu Yue 2024-10-17 arXiv https://github.com/Hoar012/RAP-MLLM http://arxiv.org/abs/2410.13360v2
522 SLM-Mod: Small Language Models Surpass LLMs at Content Moderation Xianyang Zhan, Agam Goyal, Yilun Chen, Eshwar Chandrasekharan, Koustuv Saha 2024-10-17 arXiv https://github.com/AGoyal0512/SLM-Mod http://arxiv.org/abs/2410.13155v1
523 aiXcoder-7B: A Lightweight and Effective Large Language Model for Code Completion Siyuan Jiang, Jia Li, He Zong, Huanyu Liu, Hao Zhu, Shukai Hu, Erlu Li, Jiazheng Ding, Yu Han, Wei Ning, Gen Wang, Yihong Dong, Kechi Zhang, Ge Li 2024-10-17 arXiv https://github.com/aixcoder-plugin/aiXcoder-7B http://arxiv.org/abs/2410.13187v2
524 POROver: Improving Safety and Reducing Overrefusal in Large Language Models with Overgeneration and Preference Optimization Batuhan K. Karaman, Ishmam Zabir, Alon Benhaim, Vishrav Chaudhary, Mert R. Sabuncu, Xia Song 2024-10-16 arXiv https://github.com/batuhankmkaraman/POROver http://arxiv.org/abs/2410.12999v1
525 Semantics-Adaptive Activation Intervention for LLMs via Dynamic Steering Vectors Weixuan Wang, Jingyuan Yang, Wei Peng 2024-10-16 arXiv https://github.com/weixuan-wang123/SADI http://arxiv.org/abs/2410.12299v1
526 Self-Pluralising Culture Alignment for Large Language Models Shaoyang Xu, Yongqi Leng, Linhao Yu, Deyi Xiong 2024-10-16 arXiv https://github.com/shaoyangxu/CultureSPA http://arxiv.org/abs/2410.12971v1
527 Qtok: A Comprehensive Framework for Evaluating Multilingual Tokenizer Quality in Large Language Models Iaroslav Chelombitko, Egor Safronov, Aleksey Komissarov 2024-10-16 arXiv https://github.com/nup-csai/Qtok/ http://arxiv.org/abs/2410.12989v1
528 ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs Jingming Zhuo, Songyang Zhang, Xinyu Fang, Haodong Duan, Dahua Lin, Kai Chen 2024-10-16 arXiv https://github.com/open-compass/ProSA http://arxiv.org/abs/2410.12405v1
529 Hypothesis Testing the Circuit Hypothesis in LLMs Claudia Shi, Nicolas Beltran-Velez, Achille Nazaret, Carolina Zheng, Adrià Garriga-Alonso, Andrew Jesson, Maggie Makar, David M. Blei 2024-10-16 arXiv https://github.com/blei-lab/circuitry http://arxiv.org/abs/2410.13032v1
530 DAQ: Density-Aware Post-Training Weight-Only Quantization For LLMs Yingsong Luo, Ling Chen 2024-10-16 arXiv https://github.com/LuoYingSong/DAQ http://arxiv.org/abs/2410.12187v2
531 Codellm-Devkit: A Framework for Contextualizing Code LLMs with Program Analysis Insights Rahul Krishna, Rangeet Pan, Raju Pavuluri, Srikanth Tamilselvam, Maja Vukovic, Saurabh Sinha 2024-10-16 arXiv https://github.com/IBM/codellm-devkit http://arxiv.org/abs/2410.13007v1
532 Neuron-based Personality Trait Induction in Large Language Models Jia Deng, Tianyi Tang, Yanbin Yin, Wenhao Yang, Wayne Xin Zhao, Ji-Rong Wen 2024-10-16 arXiv https://github.com/RUCAIBox/NPTI https://doi.org/10.48550/arXiv.2410.12327
533 HerO at AVeriTeC: The Herd of Open Large Language Models for Verifying Real-World Claims Yejun Yoon, Jaeyoon Jung, Seunghyun Yoon, Kunwoo Park 2024-10-16 arXiv https://github.com/ssu-humane/HerO https://doi.org/10.48550/arXiv.2410.12377
534 Exploring Model Kinship for Merging Large Language Models Yedi Hu, Yunzhi Yao, Ningyu Zhang, Shumin Deng, Huajun Chen 2024-10-16 arXiv https://github.com/zjunlp/ModelKinship https://doi.org/10.48550/arXiv.2410.12613
535 Bridging the Language Gaps in Large Language Models with Inference-Time Cross-Lingual Intervention Weixuan Wang, Minghao Wu, Barry Haddow, Alexandra Birch 2024-10-16 arXiv https://github.com/weixuan-wang123/INCLINE https://doi.org/10.48550/arXiv.2410.12462
536 Automatically Generating Visual Hallucination Test Cases for Multimodal Large Language Models Zhongye Liu, Hongbin Liu, Yuepeng Hu, Zedian Shao, Neil Zhenqiang Gong 2024-10-15 arXiv https://github.com/lycheeefish/VHExpansion https://doi.org/10.48550/arXiv.2410.11242
537 Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models Kai Yao, Penglei Gao, Lichun Li, Yuan Zhao, Xiaofeng Wang, Wei Wang, Jianke Zhu 2024-10-15 EMNLP https://github.com/Kaiseem/IST https://aclanthology.org/2024.findings-emnlp.109
538 Subspace Optimization for Large Language Models with Convergence Guarantees Yutong He, Pengrui Li, Yipeng Hu, Chuyan Chen, Kun Yuan 2024-10-15 arXiv https://github.com/pkumelon/Golore https://doi.org/10.48550/arXiv.2410.11289
539 Zero-shot Model-based Reinforcement Learning using Large Language Models Abdelhakim Benechehab, Youssef Attia El Hili, Ambroise Odonnat, Oussama Zekri, Albert Thomas, Giuseppe Paolo, Maurizio Filippone, Ievgen Redko, Balázs Kégl 2024-10-15 arXiv https://github.com/abenechehab/dicl https://doi.org/10.48550/arXiv.2410.11711
540 LLM2Swarm: Robot Swarms that Responsively Reason, Plan, and Collaborate through LLMs Volker Strobel, Marco Dorigo, Mario Fritz 2024-10-15 arXiv https://github.com/Pold87/LLM2Swarm/ http://arxiv.org/abs/2410.11387v3
541 SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing Zhiyuan Zhang, DongDong Chen, Jing Liao 2024-10-15 arXiv https://bestzzhang.github.io/SGEdit http://arxiv.org/abs/2410.11815v1
542 Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues Qibing Ren, Hao Li, Dongrui Liu, Zhanxu Xie, Xiaoya Lu, Yu Qiao, Lei Sha, Junchi Yan, Lizhuang Ma, Jing Shao 2024-10-14 arXiv https://github.com/renqibing/ActorAttack http://arxiv.org/abs/2410.10700v1
543 Locking Down the Finetuned LLMs Safety Minjun Zhu, Linyi Yang, Yifan Wei, Ningyu Zhang, Yue Zhang 2024-10-14 arXiv https://github.com/zhu-minjun/SafetyLock http://arxiv.org/abs/2410.10343v1
544 DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads Guangxuan Xiao, Jiaming Tang, Jingwei Zuo, Junxian Guo, Shang Yang, Haotian Tang, Yao Fu, Song Han 2024-10-14 arXiv https://github.com/mit-han-lab/duo-attention http://arxiv.org/abs/2410.10819v1
545 Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free Ziyue Li, Tianyi Zhou 2024-10-14 arXiv https://github.com/tianyi-lab/MoE-Embedding http://arxiv.org/abs/2410.10814v2
546 One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks Fangru Lin, Shaoguang Mao, Emanuele La Malfa, Valentin Hofmann, Adrian de Wynter, Jing Yao, Si-Qing Chen, Michael J. Wooldridge, Furu Wei 2024-10-14 arXiv https://github.com/fangru-lin/redial_dialect_robustness_fairness https://doi.org/10.48550/arXiv.2410.11005
547 Large Language Model Evaluation via Matrix Nuclear-Norm Yahan Li, Tingyu Xia, Yi Chang, Yuan Wu 2024-10-14 arXiv https://github.com/MLGroupJLU/MatrixNuclearNorm https://doi.org/10.48550/arXiv.2410.10672
548 AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models Haiquan Lu, Yefan Zhou, Shiwei Liu, Zhangyang Wang, Michael W. Mahoney, Yaoqing Yang 2024-10-14 arXiv https://github.com/haiquanlu/AlphaPruning https://doi.org/10.48550/arXiv.2410.10912
549 MentalGLM Series: Explainable Large Language Models for Mental Health Analysis on Chinese Social Media Wei Zhai, Nan Bai, Qing Zhao, Jianqiang Li, Fan Wang, Hongzhi Qi, Meng Jiang, Xiaoqin Wang, Bing Xiang Yang, Guanghui Fu 2024-10-14 arXiv https://github.com/zwzzzQAQ/MentalGLM https://doi.org/10.48550/arXiv.2410.10323
550 LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models Han Qiu, Jiaxing Huang, Peng Gao, Qin Qi, Xiaoqin Zhang, Ling Shao, Shijian Lu 2024-10-13 arXiv https://github.com/hanqiu-hq/LongHalQA https://doi.org/10.48550/arXiv.2410.09962
551 RMB: Comprehensively Benchmarking Reward Models in LLM Alignment Enyu Zhou, Guodong Zheng, Binghai Wang, Zhiheng Xi, Shihan Dou, Rong Bao, Wei Shen, Limao Xiong, Jessica Fan, Yurong Mou, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang 2024-10-13 arXiv https://github.com/Zhou-Zoey/RMB-Reward-Model-Benchmark http://arxiv.org/abs/2410.09893v1
552 FB-Bench: A Fine-Grained Multi-Task Benchmark for Evaluating LLMs' Responsiveness to Human Feedback Youquan Li, Miao Zheng, Fan Yang, Guosheng Dong, Bin Cui, Weipeng Chen, Zenan Zhou, Wentao Zhang 2024-10-12 arXiv https://github.com/PKU-Baichuan-MLSystemLab/FB-Bench http://arxiv.org/abs/2410.09412v1
553 Skipping Computations in Multimodal LLMs Mustafa Shukor, Matthieu Cord 2024-10-12 arXiv https://github.com/mshukor/ima-lmms http://arxiv.org/abs/2410.09454v1
554 LLM$\times$MapReduce: Simplified Long-Sequence Processing using Large Language Models Zihan Zhou, Chong Li, Xinyi Chen, Shuo Wang, Yu Chao, Zhili Li, Haoyu Wang, Rongqiao An, Qi Shi, Zhixing Tan, Xu Han, Xiaodong Shi, Zhiyuan Liu, Maosong Sun 2024-10-12 arXiv https://github.com/thunlp/LLMxMapReduce http://arxiv.org/abs/2410.09342v1
555 FlatQuant: Flatness Matters for LLM Quantization Yuxuan Sun, Ruikang Liu, Haoli Bai, Han Bao, Kang Zhao, Yuening Li, Jiaxin Hu, Xianzhi Yu, Lu Hou, Chun Yuan, Xin Jiang, Wulong Liu, Jun Yao 2024-10-12 arXiv https://github.com/ruikangliu/FlatQuant http://arxiv.org/abs/2410.09426v1
556 ELICIT: LLM Augmentation via External In-Context Capability Futing Wang, Jianhao Yan, Yue Zhang, Tao Lin 2024-10-12 arXiv https://github.com/LINs-lab/ELICIT http://arxiv.org/abs/2410.09343v1
557 ReLU's Revival: On the Entropic Overload in Normalization-Free Large Language Models Nandan Kumar Jha, Brandon Reagen 2024-10-12 arXiv https://github.com/Nandan91/relu-revival-normfree https://doi.org/10.48550/arXiv.2410.09637
558 OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models Jun Wang, Meng Fang, Ziyu Wan, Muning Wen, Jiachen Zhu, Anjie Liu, Ziqin Gong, Yan Song, Lei Chen, Lionel M. Ni, Linyi Yang, Ying Wen, Weinan Zhang 2024-10-12 arXiv https://openreasoner.github.io https://doi.org/10.48550/arXiv.2410.09671
559 MMAD: The First-Ever Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection Xi Jiang, Jian Li, Hanqiu Deng, Yong Liu, Bin-Bin Gao, Yifeng Zhou, Jialin Li, Chengjie Wang, Feng Zheng 2024-10-12 arXiv https://github.com/jam-cc/MMAD https://doi.org/10.48550/arXiv.2410.09453
560 Dual-AEB: Synergizing Rule-Based and Multimodal Large Language Models for Effective Emergency Braking Wei Zhang, Pengfei Li, Junli Wang, Bingchuan Sun, Qihao Jin, Guangjun Bao, Shibo Rui, Yang Yu, Wenchao Ding, Peng Li, Yilun Chen 2024-10-11 arXiv https://github.com/ChipsICU/Dual-AEB https://doi.org/10.48550/arXiv.2410.08616
561 AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation Zijun Wang, Haoqin Tu, Jieru Mei, Bingchen Zhao, Yisen Wang, Cihang Xie 2024-10-11 arXiv https://github.com/UCSC-VLAA/AttnGCG-attack http://arxiv.org/abs/2410.09040v1
562 QEFT: Quantization for Efficient Fine-Tuning of LLMs Changhun Lee, Jun-gyu Jin, Younghyun Cho, Eunhyeok Park 2024-10-11 arXiv https://github.com/xvyaward/qeft http://arxiv.org/abs/2410.08661v1
563 Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System Weize Chen, Jiarui Yuan, Chen Qian, Cheng Yang, Zhiyuan Liu, Maosong Sun 2024-10-10 arXiv https://chenweize1998.github.io/optima-project-page http://arxiv.org/abs/2410.08115v1
564 VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models Lisa Dunlap, Krishna Mandal, Trevor Darrell, Jacob Steinhardt, Joseph E Gonzalez 2024-10-10 arXiv https://github.com/lisadunlap/VibeCheck http://arxiv.org/abs/2410.12851v5
565 Towards Next-Generation LLM-based Recommender Systems: A Survey and Beyond Qi Wang, Jindong Li, Shiqi Wang, Qianli Xing, Runliang Niu, He Kong, Rui Li, Guodong Long, Yi Chang, Chengqi Zhang 2024-10-10 arXiv https://github.com/jindongli-Ai/Next-Generation-LLM-based-Recommender-Systems-Survey http://arxiv.org/abs/2410.19744v1
566 StepTool: A Step-grained Reinforcement Learning Framework for Tool Learning in LLMs Yuanqing Yu, Zhefan Wang, Weizhi Ma, Zhicheng Guo, Jingtao Zhan, Shuai Wang, Chuhan Wu, Zhiqiang Guo, Min Zhang 2024-10-10 arXiv https://github.com/yuyq18/StepTool http://arxiv.org/abs/2410.07745v2
567 Reward-Augmented Data Enhances Direct Preference Alignment of LLMs Shenao Zhang, Zhihan Liu, Zhaoran Wang 2024-10-10 arXiv https://github.com/shenao-zhang/reward-augmented-preference http://arxiv.org/abs/2410.08067v2
568 Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models Zhipeng Chen, Liang Song, Kun Zhou, Wayne Xin Zhao, Bingning Wang, Weipeng Chen, Ji-Rong Wen 2024-10-10 arXiv https://github.com/RUCAIBox/MAET https://doi.org/10.48550/arXiv.2410.07825
569 Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models Sitao Cheng, Liangming Pan, Xunjian Yin, Xinyi Wang, William Yang Wang 2024-10-10 arXiv https://github.com/sitaocheng/Knowledge_Interplay https://doi.org/10.48550/arXiv.2410.08414
570 Teaching-Inspired Integrated Prompting Framework: A Novel Approach for Enhancing Reasoning in Large Language Models Wenting Tan, Dongxiao Chen, Jieting Xue, Zihao Wang, Taijie Chen 2024-10-10 arXiv https://github.com/SallyTan13/Teaching-Inspired-Prompting https://doi.org/10.48550/arXiv.2410.08068
571 Privately Learning from Graphs with Applications in Fine-tuning Large Language Models Haoteng Yin, Rongzhe Wei, Eli Chien, Pan Li 2024-10-10 arXiv https://github.com/Graph-COM/PvGaLM https://doi.org/10.48550/arXiv.2410.08299
572 GameTraversalBenchmark: Evaluating Planning Abilities Of Large Language Models Through Traversing 2D Game Maps Muhammad Umair Nasir, Steven James, Julian Togelius 2024-10-10 arXiv https://github.com/umair-nasir14/Game-Traversal-Benchmark https://doi.org/10.48550/arXiv.2410.07765
573 A Closer Look at Machine Unlearning for Large Language Models Xiaojian Yuan, Tianyu Pang, Chao Du, Kejiang Chen, Weiming Zhang, Min Lin 2024-10-10 arXiv https://github.com/sail-sg/closer-look-LLM-unlearning https://doi.org/10.48550/arXiv.2410.08109
574 CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models Zi Gong, Hang Yu, Cong Liao, Bingchang Liu, Chaoyu Chen, Jianguo Li 2024-10-09 EMNLP https://github.com/codefuse-ai/MFTCoder https://aclanthology.org/2024.emnlp-main.459
575 Dissecting Fine-Tuning Unlearning in Large Language Models Yihuai Hong, Yuelin Zou, Lijie Hu, Ziqian Zeng, Di Wang, Haiqin Yang 2024-10-09 EMNLP https://github.com/yihuaihong/Dissecting-FT-Unlearning https://aclanthology.org/2024.emnlp-main.228
576 Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization Changli Tang, Yixuan Li, Yudong Yang, Jimin Zhuang, Guangzhi Sun, Wei Li, Zujun Ma, Chao Zhang 2024-10-09 arXiv https://video-salmonn-2.github.io http://arxiv.org/abs/2410.06682v2
577 IterGen: Iterative Structured LLM Generation Shubham Ugare, Rohan Gumaste, Tarun Suresh, Gagandeep Singh, Sasa Misailovic 2024-10-09 arXiv https://github.com/uiuc-arc/itergen http://arxiv.org/abs/2410.07295v1
578 Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning Chongyu Fan, Jiancheng Liu, Licong Lin, Jinghan Jia, Ruiqi Zhang, Song Mei, Sijia Liu 2024-10-09 arXiv https://github.com/OPTML-Group/Unlearn-Simple http://arxiv.org/abs/2410.07163v2
579 WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents Siyu Zhou, Tianyi Zhou, Yijun Yang, Guodong Long, Deheng Ye, Jing Jiang, Chengqi Zhang 2024-10-09 arXiv https://github.com/elated-sawyer/WALL-E http://arxiv.org/abs/2410.07484v2
580 Weak-eval-Strong: Evaluating and Eliciting Lateral Thinking of LLMs with Situation Puzzles Qi Chen, Bowen Zhang, Gang Wang, Qi Wu 2024-10-09 arXiv https://github.com/chenqi008/LateralThinking http://arxiv.org/abs/2410.06733v1
581 Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing Hao Fei, Shengqiong Wu, Hanwang Zhang, Tat-Seng Chua, Shuicheng Yan 2024-10-08 arXiv https://vitron-llm.github.io/ http://arxiv.org/abs/2412.19806v1
582 Enhancing Temporal Modeling of Video LLMs via Time Gating Zi-Yuan Hu, Yiwu Zhong, Shijia Huang, Michael R. Lyu, Liwei Wang 2024-10-08 arXiv https://github.com/LaVi-Lab/TG-Vid http://arxiv.org/abs/2410.05714v1
583 ToolBridge: An Open-Source Dataset to Equip LLMs with External Tool Capabilities Zhenchao Jin, Mengchen Liu, Dongdong Chen, Lingting Zhu, Yunsheng Li, Lequan Yu 2024-10-08 arXiv https://github.com/CharlesPikachu/ToolBridge http://arxiv.org/abs/2410.10872v1
584 MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment Amir Hossein Kargaran, Ali Modarressi, Nafiseh Nikeghbal, Jana Diesner, François Yvon, Hinrich Schütze 2024-10-08 arXiv https://github.com/cisnlp/Mexa http://arxiv.org/abs/2410.05873v1
585 GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models Muhammad Jehanzeb Mirza, Mengjie Zhao, Zhuoyuan Mao, Sivan Doveh, Wei Lin, Paul Gavrikov, Michael Dorkenwald, Shiqi Yang, Saurav Jha, Hiromi Wakaki, Yuki Mitsufuji, Horst Possegger, Rogério Feris, Leonid Karlinsky, James R. Glass 2024-10-08 arXiv https://github.com/jmiemirza/GLOV https://doi.org/10.48550/arXiv.2410.06154
586 AgentSquare: Automatic LLM Agent Search in Modular Design Space Yu Shang, Yu Li, Keyu Zhao, Likai Ma, Jiahe Liu, Fengli Xu, Yong Li 2024-10-08 arXiv https://github.com/tsinghua-fib-lab/AgentSquare http://arxiv.org/abs/2410.06153v2
587 Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models Fei Wang, Ninareh Mehrabi, Palash Goyal, Rahul Gupta, Kai-Wei Chang, Aram Galstyan 2024-10-07 EMNLP https://feiwang96.github.io/DataAdvisor/ https://aclanthology.org/2024.emnlp-main.461
588 Better than Your Teacher: LLM Agents that learn from Privileged AI Feedback Sanjiban Choudhury, Paloma Sodhi 2024-10-07 arXiv https://leap-llm.github.io http://arxiv.org/abs/2410.05434v1
589 PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs Mengzhao Chen, Yi Liu, Jiahao Wang, Yi Bin, Wenqi Shao, Ping Luo 2024-10-07 arXiv https://github.com/ChenMnZ/PrefixQuant http://arxiv.org/abs/2410.05265v1
590 Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild Xinyu Zhao, Guoheng Sun, Ruisi Cai, Yukun Zhou, Pingzhi Li, Peihao Wang, Bowen Tan, Yexiao He, Li Chen, Yi Liang, Beidi Chen, Binhang Yuan, Hongyi Wang, Ang Li, Zhangyang Wang, Tianlong Chen 2024-10-07 arXiv https://github.com/Model-GLUE/Model-GLUE http://arxiv.org/abs/2410.05357v2
591 Can LLMs Understand Time Series Anomalies? Zihao Zhou, Rose Yu 2024-10-07 arXiv https://github.com/Rose-STL-Lab/AnomLLM/ http://arxiv.org/abs/2410.05440v2
592 Intriguing Properties of Large Language and Vision Models Young-Jun Lee, Byungsoo Ko, Han-Gyu Kim, Yechan Hwang, Ho-Jin Choi 2024-10-07 arXiv https://github.com/passing2961/IP-LLVM https://doi.org/10.48550/arXiv.2410.04751
593 Aligning LLMs to Be Robust Against Prompt Injection Sizhe Chen, Arman Zharmagambetov, Saeed Mahloujifar, Kamalika Chaudhuri, Chuan Guo 2024-10-07 arXiv https://github.com/facebookresearch/SecAlign http://arxiv.org/abs/2410.05451v1
594 Narrative-of-Thought: Improving Temporal Reasoning of Large Language Models via Recounted Narratives Xinliang Frederick Zhang, Nicholas Beauchamp, Lu Wang 2024-10-07 EMNLP https://github.com/launchnlp/NoT https://aclanthology.org/2024.findings-emnlp.963
595 Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality Guanyu Zhou, Yibo Yan, Xin Zou, Kun Wang, Aiwei Liu, Xuming Hu 2024-10-07 arXiv https://github.com/The-Martyr/CausalMM https://doi.org/10.48550/arXiv.2410.04780
596 Synthesizing Interpretable Control Policies through Large Language Model Guided Search Carlo Bosio, Mark W. Mueller 2024-10-07 arXiv https://github.com/muellerlab/synthesizing_interpretable_control_policies https://doi.org/10.48550/arXiv.2410.05406
597 CogDevelop2K: Reversed Cognitive Development in Multimodal Large Language Models Yijiang Li, Qingying Gao, Haoran Sun, Haiyun Lyu, Dezhi Luo, Hokin Deng 2024-10-06 arXiv https://growing-ai-like-a-child.github.io/ https://doi.org/10.48550/arXiv.2410.10855
598 Leveraging Large Language Models for Suicide Detection on Social Media with Limited Labels Vy Nguyen, Chau Pham 2024-10-06 arXiv https://github.com/khanhvynguyen/Suicide_Detection_LLMs https://doi.org/10.48550/arXiv.2410.04501
599 MindScope: Exploring Cognitive Biases in Large Language Models Through Multi-Agent Systems Zhentao Xie, Jiabao Zhao, Yilei Wang, Jinxin Shi, Yanhong Bai, Xingjiao Wu, Liang He 2024-10-06 ECAI https://github.com/2279072142/MindScope https://doi.org/10.3233/FAIA240879
600 CS4: Measuring the Creativity of Large Language Models Automatically by Controlling the Number of Story-Writing Constraints Anirudh Atmakuru, Jatin Nainani, Rohith Siddhartha Reddy Bheemreddy, Anirudh Lakkaraju, Zonghai Yao, Hamed Zamani, Haw-Shiuan Chang 2024-10-05 arXiv https://github.com/anirudhlakkaraju/cs4_benchmark https://doi.org/10.48550/arXiv.2410.04197
601 Neuron-Level Sequential Editing for Large Language Models Houcheng Jiang, Junfeng Fang, Tianyu Zhang, An Zhang, Ruipeng Wang, Tao Liang, Xiang Wang 2024-10-05 arXiv https://github.com/jianghoucheng/NSE https://doi.org/10.48550/arXiv.2410.04045
602 Steering Large Language Models between Code Execution and Textual Reasoning Yongchao Chen, Harsh Jhamtani, Srinagesh Sharma, Chuchu Fan, Chi Wang 2024-10-04 arXiv https://yongchao98.github.io/CodeSteer/ https://doi.org/10.48550/arXiv.2410.03524
603 Self-Powered LLM Modality Expansion for Large Speech-Text Models Tengfei Yu, Xuebo Liu, Zhiyi Hou, Liang Ding, Dacheng Tao, Min Zhang 2024-10-04 arXiv https://github.com/ytf-philp/Self-powered-LSM http://arxiv.org/abs/2410.03798v2
604 Leveraging Social Determinants of Health in Alzheimer's Research Using LLM-Augmented Literature Mining and Knowledge Graphs Tianqi Shang, Shu Yang, Weiqing He, Tianhua Zhai, Dawei Li, Bojian Hou, Tianlong Chen, Jason H. Moore, Marylyn D. Ritchie, Li Shen 2024-10-04 arXiv https://github.com/hwq0726/SDoHenPKG http://arxiv.org/abs/2410.09080v1
605 LLM-TOPLA: Efficient LLM Ensemble by Maximising Diversity Selim Furkan Tekin, Fatih Ilhan, Tiansheng Huang, Sihao Hu, Ling Liu 2024-10-04 arXiv https://github.com/git-disl/llm-topla http://arxiv.org/abs/2410.03953v1
606 GraphRouter: A Graph-based Router for LLM Selections Tao Feng, Yanzhen Shen, Jiaxuan You 2024-10-04 arXiv https://github.com/ulab-uiuc/GraphRouter http://arxiv.org/abs/2410.03834v1
607 Aligning LLMs with Individual Preferences via Interaction Shujin Wu, May Fung, Cheng Qian, Jeonghwan Kim, Dilek Hakkani-Tur, Heng Ji 2024-10-04 arXiv https://github.com/ShujinWu-0814/ALOE http://arxiv.org/abs/2410.03642v2
608 PersonalSum: A User-Subjective Guided Personalized Summarization Dataset for Large Language Models Lemei Zhang, Peng Liu, Marcus Tiedemann Oekland Henriksboe, Even W. Lauvrak, Jon Atle Gulla, Heri Ramampiaro 2024-10-04 arXiv https://github.com/SmartmediaAI/PersonalSum https://doi.org/10.48550/arXiv.2410.03905
609 PersoBench: Benchmarking Personalized Response Generation in Large Language Models Saleh Afzoon, Usman Naseem, Amin Beheshti, Zahra Jamali 2024-10-04 arXiv https://github.com/salehafzoon/PersoBench https://doi.org/10.48550/arXiv.2410.03198
610 Output Scouting: Auditing Large Language Models for Catastrophic Responses Andrew Bell, João Fonseca 2024-10-04 arXiv https://github.com/joaopfonseca/outputscouting https://doi.org/10.48550/arXiv.2410.05305
611 ARB-LLM: Alternating Refined Binarizations for Large Language Models Zhiteng Li, Xianglong Yan, Tianao Zhang, Haotong Qin, Dong Xie, Jiang Tian, Zhongchao Shi, Linghe Kong, Yulun Zhang, Xiaokang Yang 2024-10-04 arXiv https://github.com/ZHITENGLI/ARB-LLM https://doi.org/10.48550/arXiv.2410.03129
612 A Probabilistic Perspective on Unlearning and Alignment for Large Language Models Yan Scholten, Stephan Günnemann, Leo Schwinn 2024-10-04 arXiv https://github.com/yascho/probabilistic-unlearning https://doi.org/10.48550/arXiv.2410.03523
613 CommonIT: Commonality-Aware Instruction Tuning for Large Language Models via Data Partitions Jun Rao, Xuebo Liu, Lian Lian, Shengjun Cheng, Yunjie Liao, Min Zhang 2024-10-04 EMNLP https://github.com/raojay7/CommonIT https://aclanthology.org/2024.emnlp-main.561
614 POSIX: A Prompt Sensitivity Index For Large Language Models Anwoy Chatterjee, H. S. V. N. S. Kowndinya Renduchintala, Sumit Bhatia, Tanmoy Chakraborty 2024-10-03 EMNLP https://github.com/kowndinya-renduchintala/POSIX https://aclanthology.org/2024.findings-emnlp.852
615 Traffic Light or Light Traffic? Investigating Phrasal Semantics in Large Language Models Rui Meng, Ye Liu, Lifu Tu, Daqing He, Yingbo Zhou, Semih Yavuz 2024-10-03 EMNLP https://github.com/memray/llm_phrase_semantics https://aclanthology.org/2024.findings-emnlp.503
616 Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents Hanrong Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, Yongfeng Zhang 2024-10-03 arXiv https://github.com/agiresearch/ASB http://arxiv.org/abs/2410.02644v1
617 Meta-Models: An Architecture for Decoding LLM Behaviors Through Interpreted Embeddings and Natural Language Anthony Costarelli, Mat Allen, Severin Field 2024-10-03 arXiv https://github.com/acostarelli/meta-models-public http://arxiv.org/abs/2410.02472v3
618 StringLLM: Understanding the String Processing Capability of Large Language Models Xilong Wang, Hao Fu, Jindong Wang, Neil Zhenqiang Gong 2024-10-02 arXiv https://github.com/wxl-lxw/StringLLM https://doi.org/10.48550/arXiv.2410.01208
619 Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective Zeyu Gan, Yong Liu 2024-10-02 arXiv https://github.com/ZyGan1999/Towards-a-Theoretical-Understanding-of-Synthetic-Data-in-LLM-Post-Training http://arxiv.org/abs/2410.01720v2
620 Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint? Xi Chen, Kaituo Feng, Changsheng Li, Xunhao Lai, Xiangyu Yue, Ye Yuan, Guoren Wang 2024-10-02 arXiv https://github.com/xichen-fy/Fira http://arxiv.org/abs/2410.01623v2
621 EMMA: Efficient Visual Alignment in Multi-Modal LLMs Sara Ghazanfari, Alexandre Araujo, Prashanth Krishnamurthy, Siddharth Garg, Farshad Khorrami 2024-10-02 arXiv https://github.com/SaraGhazanfari/EMMA http://arxiv.org/abs/2410.02080v1
622 TypedThinker: Typed Thinking Improves Large Language Model Reasoning Danqing Wang, Jianxin Ma, Fei Fang, Lei Li 2024-10-02 arXiv https://github.com/dqwang122/ThinkHub https://doi.org/10.48550/arXiv.2410.01952
623 Open-RAG: Enhanced Retrieval Augmented Reasoning with Open-Source Large Language Models Shayekh Bin Islam, Md Asib Rahman, K. S. M. Tozammel Hossain, Enamul Hoque, Shafiq Joty, Md. Rizwan Parvez 2024-10-02 EMNLP https://openragmoe.github.io/ https://aclanthology.org/2024.findings-emnlp.831
624 DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models Yuxuan Zhang, Ruizhe Li 2024-10-02 arXiv https://github.com/MeCuping/DLP-LoRA https://doi.org/10.48550/arXiv.2410.01497
625 Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression Jingcun Wang, Yu-Guang Chen, Ing-Chao Lin, Bing Li, Grace Li Zhang 2024-10-02 arXiv https://github.com/TUDa-HWAI/Basis_Sharing https://doi.org/10.48550/arXiv.2410.03765
626 FactAlign: Long-form Factuality Alignment of Large Language Models Chao-Wei Huang, Yun-Nung Chen 2024-10-02 EMNLP https://github.com/MiuLab/FactAlign https://aclanthology.org/2024.findings-emnlp.955
627 Dynamic Planning for LLM-based Graphical User Interface Automation Shaoqing Zhang, Zhuosheng Zhang, Kehai Chen, Xinbei Ma, Muyun Yang, Tiejun Zhao, Min Zhang 2024-10-01 OpenReview https://github.com/sqzhang-lazy/D-PoT http://arxiv.org/abs/2410.00467v3
628 Insight: A Multi-Modal Diagnostic Pipeline using LLMs for Ocular Surface Disease Diagnosis Chun-Hsiao Yeh, Jiayun Wang, Andrew D. Graham, Andrea J. Liu, Bo Tan, Yubei Chen, Yi Ma, Meng C. Lin 2024-10-01 arXiv https://danielchyeh.github.io/MDPipe/ http://arxiv.org/abs/2410.00292v1
629 Style-Specific Neurons for Steering LLMs in Text Style Transfer Wen Lai, Viktor Hangya, Alexander Fraser 2024-10-01 arXiv https://github.com/wenlai-lavine/sNeuron-TST http://arxiv.org/abs/2410.00593v1
630 Unleashing the Unseen: Harnessing Benign Datasets for Jailbreaking Large Language Models Wei Zhao, Zhe Li, Yige Li, Jun Sun 2024-10-01 arXiv https://github.com/suffix-maybe-feature/adver-suffix-maybe-features http://arxiv.org/abs/2410.00451v3
631 Multimodal LLM Enhanced Cross-lingual Cross-modal Retrieval Yabing Wang, Le Wang, Qiang Zhou, Zhibin Wang, Hao Li, Gang Hua, Wei Tang 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/LiJiaBei-7/leccr https://dl.acm.org/doi/10.1145/3664647.3680886
632 mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model Anwen Hu, Yaya Shi, Haiyang Xu, Jiabo Ye, Qinghao Ye, Ming Yan, Chenliang Li, Qi Qian, Ji Zhang, Fei Huang 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/X-PLUG/mPLUG-DocOwl/tree/main/PaperOwl https://dl.acm.org/doi/10.1145/3664647.3681294
633 WorldGPT: Empowering LLM as Multimodal World Model Zhiqi Ge, Hongzhe Huang, Mingze Zhou, Juncheng Li, Guoming Wang, Siliang Tang, Yueting Zhuang 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/DCDmllm/WorldGPT https://dl.acm.org/doi/10.1145/3664647.3681488
634 Semantic Alignment for Multimodal Large Language Models Tao Wu, Mengze Li, Jingyuan Chen, Wei Ji, Wang Lin, Jinyang Gao, Kun Kuang, Zhou Zhao, Fei Wu 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://mccartney01.github.io/SAM https://dl.acm.org/doi/10.1145/3664647.3681014
635 Preliminary Study on Incremental Learning for Large Language Model-based Recommender Systems Tianhao Shi, Yang Zhang, Zhijian Xu, Chong Chen, Fuli Feng, Xiangnan He, Qi Tian 2024-10 CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management https://github.com/TianhaoShi2001/LSAT https://dl.acm.org/doi/10.1145/3627673.3679922
636 Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding Minghui Wu, Chenxu Zhao, Anyang Su, Donglin Di, Tianyu Fu, Da An, Min He, Ya Gao, Meng Ma, Kun Yan, Ping Wang 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/mininglamp-MLLM/HMLLM https://dl.acm.org/doi/10.1145/3664647.3680810
637 MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors Yuan Tang, Xu Han, Xianzhi Li, Qiao Yu, Yixue Hao, Long Hu, Min Chen 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/TangYuan96/MiniGPT-3D https://dl.acm.org/doi/10.1145/3664647.3681257
638 MM-Forecast: A Multimodal Approach to Temporal Event Forecasting with Large Language Models Haoxuan Li, Zhengmao Yang, Yunshan Ma, Yi Bin, Yang Yang, Tat-Seng Chua 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/LuminosityX/MM-Forecast https://dl.acm.org/doi/10.1145/3664647.3681593
639 Fairness in Large Language Models in Three Hours Thang Viet Doan, Zichong Wang, Nhat Nguyen Minh Hoang, Wenbin Zhang 2024-10 CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management https://github.com/LavinWong/Fairness-in-Large-Language-Models https://dl.acm.org/doi/10.1145/3627673.3679090
640 Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation Jingjing Xie, Yuxin Zhang, Mingbao Lin, Liujuan Cao, Rongrong Ji 2024-10 MM '24: Proceedings of the 32nd ACM International Conference on Multimedia https://github.com/xjjxmu/QSLAW https://dl.acm.org/doi/10.1145/3664647.3680838
641 LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation Ziyao Zhang, Yanlin Wang, Chong Wang, Jiachi Chen, Zibin Zheng 2024-09-30 arXiv https://github.com/DeepSoftwareAnalytics/LLMCodingHallucination http://arxiv.org/abs/2409.20550v1
642 VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs Ruotong Liao, Max Erler, Huiyu Wang, Guangyao Zhai, Gengyuan Zhang, Yunpu Ma, Volker Tresp 2024-09-30 arXiv https://github.com/mayhugotong/VideoINSTA http://arxiv.org/abs/2409.20365v2
643 LLMEmb: Large Language Model Can Be a Good Embedding Generator for Sequential Recommendation Qidong Liu, Xian Wu, Wanyu Wang, Yejing Wang, Yuanshao Zhu, Xiangyu Zhao, Feng Tian, Yefeng Zheng 2024-09-30 arXiv https://github.com/Applied-Machine-Learning-Lab/LLMEmb http://arxiv.org/abs/2409.19925v2
644 LexEval: A Comprehensive Chinese Legal Benchmark for Evaluating Large Language Models Haitao Li, You Chen, Qingyao Ai, Yueyue Wu, Ruizhe Zhang, Yiqun Liu 2024-09-30 arXiv https://github.com/CSHaitao/LexEval https://doi.org/10.48550/arXiv.2409.20288
645 RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models Shuhao Chen, Weisen Jiang, Baijiong Lin, James T. Kwok, Yu Zhang 2024-09-30 arXiv https://github.com/shuhao02/RouterDC https://doi.org/10.48550/arXiv.2409.19886
646 Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models Luohe Shi, Yao Yao, Zuchao Li, Lefei Zhang, Hai Zhao 2024-09-30 arXiv https://github.com/ShiLuohe/ReferenceTrustableDecoding https://doi.org/10.48550/arXiv.2409.20181
647 BuildingView: Constructing Urban Building Exteriors Databases with Street View Imagery and Multimodal Large Language Mode Zongrong Li, Yunlei Su, Chenyuan Zhu, Wufan Zhao 2024-09-29 arXiv https://github.com/Jasper0122/BuildingView https://doi.org/10.48550/arXiv.2409.19527
648 Can Large Language Models Analyze Graphs like Professionals? A Benchmark, Datasets and Models Xin Li, Weize Chen, Qizhi Chu, Haopeng Li, Zhaojun Sun, Ran Li, Chen Qian, Yiwei Wei, Zhiyuan Liu, Chuan Shi, Maosong Sun, Cheng Yang 2024-09-29 arXiv https://github.com/BUPT-GAMMA/ProGraph https://doi.org/10.48550/arXiv.2409.19667
649 Identifying Knowledge Editing Types in Large Language Models Xiaopeng Li, Shangwen Wang, Shezheng Song, Bin Ji, Huijun Liu, Shasha Li, Jun Ma, Jie Yu 2024-09-29 arXiv https://github.com/xpq-tech/KETI https://doi.org/10.48550/arXiv.2409.19663
650 A multimodal LLM for the non-invasive decoding of spoken text from brain recordings Youssef Hmamouche, Ismail Chihab, Lahoucine Kdouri, Amal El Fallah Seghrouchni 2024-09-29 arXiv https://github.com/Hmamouche/brain_decode http://arxiv.org/abs/2409.19710v1
651 OpenSep: Leveraging Large Language Models with Textual Inversion for Open World Audio Separation Tanvir Mahmud, Diana Marculescu 2024-09-28 EMNLP https://github.com/tanvir-utexas/OpenSep https://aclanthology.org/2024.emnlp-main.735
652 A Survey on the Honesty of Large Language Models Siheng Li, Cheng Yang, Taiqiang Wu, Chufan Shi, Yuji Zhang, Xinyu Zhu, Zesen Cheng, Deng Cai, Mo Yu, Lemao Liu, Jie Zhou, Yujiu Yang, Ngai Wong, Xixin Wu, Wai Lam 2024-09-27 arXiv https://github.com/SihengLi99/LLM-Honesty-Survey https://doi.org/10.48550/arXiv.2409.18786
653 CurricuLLM: Automatic Task Curricula Design for Learning Complex Robot Skills using Large Language Models Kanghyun Ryu, Qiayuan Liao, Zhongyu Li, Koushil Sreenath, Negar Mehr 2024-09-27 arXiv https://github.com/labicon/CurricuLLM https://doi.org/10.48550/arXiv.2409.18382
654 Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models Jiaming Li, Lei Zhang, Yunshui Li, Ziqiang Liu, Yuelin Bai, Run Luo, Longze Chen, Min Yang 2024-09-27 EMNLP https://github.com/Geaming2002/Ruler https://aclanthology.org/2024.findings-emnlp.172
655 Align$^2$LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation Hongzhe Huang, Jiang Liu, Zhewen Yu, Li Cai, Dian Jiao, Wenqiao Zhang, Siliang Tang, Juncheng Li, Hao Jiang, Haoyuan Li, Yueting Zhuang 2024-09-27 arXiv https://github.com/DCDmllm/Align2LLaVA http://arxiv.org/abs/2409.18541v2
656 HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection Xuefeng Du, Chaowei Xiao, Yixuan Li 2024-09-26 arXiv https://github.com/deeplearningwisc/haloscope http://arxiv.org/abs/2409.17504v1
657 From News to Forecast: Integrating Event Analysis in LLM-Based Time Series Forecasting with Reflection Xinlei Wang, Maike Feng, Jing Qiu, Jinjin Gu, Junhua Zhao 2024-09-26 arXiv https://github.com/ameliawong1996/From_News_to_Forecast http://arxiv.org/abs/2409.17515v3
658 Extracting Affect Aggregates from Longitudinal Social Media Data with Temporal Adapters for Large Language Models Georg Ahnert, Max Pellert, David Garcia, Markus Strohmaier 2024-09-26 arXiv https://github.com/dess-mannheim/temporal-adapters http://arxiv.org/abs/2409.17990v1
659 AssistantX: An LLM-Powered Proactive Assistant in Collaborative Human-Populated Environment Nan Sun, Bo Mao, Yongchang Li, Lumeng Ma, Di Guo, Huaping Liu 2024-09-26 arXiv https://assistantx-agent.github.io/AssistantX/ http://arxiv.org/abs/2409.17655v1
660 RED QUEEN: Safeguarding Large Language Models against Concealed Multi-Turn Jailbreaking Yifan Jiang, Kriti Aggarwal, Tanmay Laud, Kashif Munir, Jay Pujara, Subhabrata Mukherjee 2024-09-26 arXiv https://github.com/kriti-hippo/red_queen https://doi.org/10.48550/arXiv.2409.17458
661 MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models Gongfan Fang, Hongxu Yin, Saurav Muralidharan, Greg Heinrich, Jeff Pool, Jan Kautz, Pavlo Molchanov, Xinchao Wang 2024-09-26 arXiv https://github.com/NVlabs/MaskLLM https://doi.org/10.48550/arXiv.2409.17481
662 Search for Efficient Large Language Models Xuan Shen, Pu Zhao, Yifan Gong, Zhenglun Kong, Zheng Zhan, Yushu Wu, Ming Lin, Chao Wu, Xue Lin, Yanzhi Wang 2024-09-25 arXiv https://github.com/shawnricecake/search-llm https://doi.org/10.48550/arXiv.2409.17372
663 AutoLLM-CARD: Towards a Description and Landscape of Large Language Models Shengwei Tian, Lifeng Han, Goran Nenadic 2024-09-25 arXiv https://github.com/shengwei-tian/dependency-parser-visualization http://arxiv.org/abs/2409.17011v3
664 DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling Kyuheon Jung, Yongdeuk Seo, Seongwoo Cho, Jaeyoung Kim, Hyun-seok Min, Sungchul Choi 2024-09-25 arXiv https://github.com/kkyuhun94/dalda http://arxiv.org/abs/2409.16949v1
665 Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction Zhenmei Shi, Yifei Ming, Xuan-Phi Nguyen, Yingyu Liang, Shafiq Joty 2024-09-25 arXiv https://github.com/SalesforceAIResearch/GemFilter http://arxiv.org/abs/2409.17422v1
666 EventHallusion: Diagnosing Event Hallucinations in Video LLMs Jiacheng Zhang, Yang Jiao, Shaoxiang Chen, Jingjing Chen, Yu-Gang Jiang 2024-09-25 arXiv https://github.com/Stevetich/EventHallusion http://arxiv.org/abs/2409.16597v1
667 HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows Wenlin Yao, Haitao Mi, Dong Yu 2024-09-25 arXiv https://github.com/wenlinyao/HDFlow http://arxiv.org/abs/2409.17433v1
668 Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness Shixuan Ma, Quan Wang 2024-09-25 arXiv https://github.com/Shixuan-Ma/TOCSIN http://arxiv.org/abs/2409.16914v1
669 CHBench: A Chinese Dataset for Evaluating Health in Large Language Models Chenlu Guo, Nuo Xu, Yi Chang, Yuan Wu 2024-09-24 arXiv https://github.com/TracyGuo2001/CHBench https://doi.org/10.48550/arXiv.2409.15766
670 HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models Haoran Que, Feiyu Duan, Liqun He, Yutao Mou, Wangchunshu Zhou, Jiaheng Liu, Wenge Rong, Zekun Moore Wang, Jian Yang, Ge Zhang, Junran Peng, Zhaoxiang Zhang, Songyang Zhang, Kai Chen 2024-09-24 arXiv https://github.com/Quehry/HelloBench https://doi.org/10.48550/arXiv.2409.16191
671 XTRUST: On the Multilingual Trustworthiness of Large Language Models Yahan Li, Yi Wang, Yi Chang, Yuan Wu 2024-09-24 arXiv https://github.com/LluckyYH/XTRUST https://doi.org/10.48550/arXiv.2409.15762
672 COHERENT: Collaboration of Heterogeneous Multi-Robot System with Large Language Models Kehui Liu, Zixin Tang, Dong Wang, Zhigang Wang, Bin Zhao, Xuelong Li 2024-09-23 arXiv https://github.com/MrKeee/COHERENT https://doi.org/10.48550/arXiv.2409.15146
673 Phantom of Latent for Large Language and Vision Models Byung-Kwan Lee, Sangyun Chung, Chae Won Kim, Beomchan Park, Yong Man Ro 2024-09-23 arXiv https://github.com/ByungKwanLee/Phantom https://doi.org/10.48550/arXiv.2409.14713
674 Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method Weichao Zhang, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng 2024-09-23 EMNLP https://github.com/zhang-wei-chao/DC-PDD https://aclanthology.org/2024.emnlp-main.300
675 Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses Hung-Ting Su, Ya-Ching Hsu, Xudong Lin, Xiang Qian Shi, Yulei Niu, Han-Yuan Hsu, Hung-yi Lee, Winston H. Hsu 2024-09-22 EMNLP https://github.com/Shelley1214/Trope https://aclanthology.org/2024.findings-emnlp.872
676 PTD-SQL: Partitioning and Targeted Drilling with LLMs in Text-to-SQL Ruilin Luo, Liyuan Wang, Binghuai Lin, Zicheng Lin, Yujiu Yang 2024-09-21 arXiv https://github.com/lrlbbzl/PTD-SQL http://arxiv.org/abs/2409.14082v1
677 StateAct: State Tracking and Reasoning for Acting and Planning with Large Language Models Nikolai Rozanov, Marek Rei 2024-09-21 arXiv https://github.com/ai-nikolai/StateAct https://doi.org/10.48550/arXiv.2410.02810
678 ShizishanGPT: An Agricultural Large Language Model Integrating Tools and Resources Shuting Yang, Zehui Liu, Wolfgang Mayer, Ningpei Ding, Ying Wang, Yu Huang, Pengfei Wu, Wanli Li, Lin Li, Hong-Yu Zhang, Zaiwen Feng 2024-09-20 arXiv https://github.com/Zaiwen/CropGPT https://doi.org/10.48550/arXiv.2409.13537
679 CFSP: An Efficient Structured Pruning Framework for LLMs with Coarse-to-Fine Activation Information Yuxin Wang, Minghua Ma, Zekun Wang, Jingchang Chen, Huiming Fan, Liping Shan, Qing Yang, Dongliang Xu, Ming Liu, Bing Qin 2024-09-20 arXiv https://github.com/wyxscir/CFSP http://arxiv.org/abs/2409.13199v2
680 CLAIR-A: Leveraging Large Language Models to Judge Audio Captions Tsung-Han Wu, Joseph E. Gonzalez, Trevor Darrell, David M. Chan 2024-09-19 arXiv https://github.com/DavidMChan/clair-a https://doi.org/10.48550/arXiv.2409.12962
681 Edu-Values: Towards Evaluating the Chinese Education Values of Large Language Models Peiyi Zhang, Yazhou Zhang, Bo Wang, Lu Rong, Jing Qin 2024-09-19 arXiv https://github.com/zhangpeii/Edu-Values https://doi.org/10.48550/arXiv.2409.12739
682 HLLM: Enhancing Sequential Recommendations via Hierarchical Large Language Models for Item and User Modeling Junyi Chen, Lu Chi, Bingyue Peng, Zehuan Yuan 2024-09-19 arXiv https://github.com/bytedance/HLLM https://doi.org/10.48550/arXiv.2409.12740
683 Development and bilingual evaluation of Japanese medical large language model within reasonably low computational resources Issey Sukeda 2024-09-18 arXiv https://github.com/stardust-coder/japanese-lm-med-harness https://doi.org/10.48550/arXiv.2409.11783
684 Large Language Models Are Strong Audio-Visual Speech Recognition Learners Umberto Cappellazzo, Minsu Kim, Honglie Chen, Pingchuan Ma, Stavros Petridis, Daniele Falavigna, Alessio Brutti, Maja Pantic 2024-09-18 arXiv https://github.com/umbertocappellazzo/AVSR-LLMs https://doi.org/10.48550/arXiv.2409.12319
685 BodyShapeGPT: SMPL Body Shape Manipulation with LLMs Baldomero R. Árbol, Dan Casas 2024-09-18 arXiv https://github.com/baldoarbol/BodyShapeGPT http://arxiv.org/abs/2410.03556v1
686 Improving LLM Reasoning with Multi-Agent Tree-of-Thought Validator Agent Fatemeh Haji, Mazal Bethany, Maryam Tabar, Jason Chiang, Anthony Rios, Peyman Najafirad 2024-09-17 arXiv https://github.com/SecureAIAutonomyLab/MA-ToT http://arxiv.org/abs/2409.11527v2
687 Benchmarking Large Language Model Uncertainty for Prompt Optimization Pei-Fu Guo, Yun-Da Tsai, Shou-De Lin 2024-09-16 arXiv https://github.com/0Frett/PO-Uncertainty-Benchmarking https://doi.org/10.48550/arXiv.2409.10044
688 Do Large Language Models Need a Content Delivery Network? Yihua Cheng, Kuntai Du, Jiayi Yao, Junchen Jiang 2024-09-16 arXiv https://github.com/LMCache/LMCache https://doi.org/10.48550/arXiv.2409.13761
689 Fit and Prune: Fast and Training-free Visual Token Pruning for Multi-modal Large Language Models Weihao Ye, Qiong Wu, Wenhao Lin, Yiyi Zhou 2024-09-16 arXiv https://github.com/ywh187/FitPrune https://doi.org/10.48550/arXiv.2409.10197
690 The Two Word Test: A Semantic Benchmark for Large Language Models Nicholas Riccardi, Xuan Yang, Rutvik H. Desai 2024-09-16 arXiv https://github.com/NickRiccardi/two-word-test https://doi.org/10.48550/arXiv.2306.04610
691 HALO: Hallucination Analysis and Learning Optimization to Empower LLMs with Retrieval-Augmented Context for Guided Clinical Decision Making Sumera Anjum, Hanzhi Zhang, Wenjun Zhou, Eun Jin Paek, Xiaopeng Zhao, Yunhe Feng 2024-09-16 arXiv https://github.com/ResponsibleAILab/HALO http://arxiv.org/abs/2409.10011v2
692 Can Large Language Models Grasp Event Signals? Exploring Pure Zero-Shot Event-based Recognition Zongyou Yu, Qiang Qu, Xiaoming Chen, Chen Wang 2024-09-15 arXiv https://github.com/ChrisYu-Zz/Pure-event-based-recognition-based-LLM https://doi.org/10.48550/arXiv.2409.09628
693 Traffic Scene Generation from Natural Language Description for Autonomous Vehicles with Large Language Model Bo-Kai Ruan, Hao-Tang Tsui, Yung-Hui Li, Hong-Han Shuai 2024-09-15 arXiv https://basiclab.github.io/TTSG https://doi.org/10.48550/arXiv.2409.09575
694 AlpaPICO: Extraction of PICO Frames from Clinical Trial Documents Using LLMs Madhusudan Ghosh, Shrimon Mukherjee, Asmit Ganguly, Partha Basuchowdhuri, Sudip Kumar Naskar, Debasis Ganguly 2024-09-15 arXiv https://github.com/shrimonmuke0202/AlpaPICO http://arxiv.org/abs/2409.09704v1
695 PeriGuru: A Peripheral Robotic Mobile App Operation Assistant based on GUI Image Understanding and Prompting with LLM Kelin Fu, Yang Tian, Kaigui Bian 2024-09-14 arXiv https://github.com/Z2sJ4t/PeriGuru http://arxiv.org/abs/2409.09354v1
696 LLM-Powered Ensemble Learning for Paper Source Tracing: A GPU-Free Approach Kunlong Chen, Junjun Wang, Zhaoqun Chen, Kunjin Chen, Yitian Chen 2024-09-14 arXiv https://github.com/Cklwanfifa/KDDCUP2024-PST http://arxiv.org/abs/2409.09383v2
697 Intelligent LiDAR Navigation: Leveraging External Information and Semantic Maps with LLM as Copilot Fujing Xie, Jiajie Zhang, Sören Schwertfeger 2024-09-13 arXiv https://github.com/xiexiexiaoxiexie/Intelligent-LiDAR-Navigation-LLM-as-Copilot http://arxiv.org/abs/2409.08493v1
698 L3Cube-IndicQuest: A Benchmark Question Answering Dataset for Evaluating Knowledge of LLMs in Indic Context Pritika Rohera, Chaitrali Ginimav, Akanksha Salunke, Gayatri Sawant, Raviraj Joshi 2024-09-13 arXiv https://github.com/l3cube-pune/indic-nlp http://arxiv.org/abs/2409.08706v2
699 ProcessTBench: An LLM Plan Generation Dataset for Process Mining Andrei Cosmin Redis, Mohammadreza Fani Sani, Bahram Zarrin, Andrea Burattin 2024-09-13 arXiv https://github.com/microsoft/ProcessTBench http://arxiv.org/abs/2409.09191v2
700 FP-VEC: Fingerprinting Large Language Models via Efficient Vector Addition Zhenhua Xu, Wenpeng Xing, Zhebo Wang, Chang Hu, Chen Jie, Meng Han 2024-09-13 arXiv https://fingerprintvector.github.io https://doi.org/10.48550/arXiv.2409.08846
701 Fine-tuning Large Language Models for Entity Matching Aaron Steiner, Ralph Peeters, Christian Bizer 2024-09-12 arXiv https://github.com/wbsg-uni-mannheim/TailorMatch https://doi.org/10.48550/arXiv.2409.08185
702 DrLLM: Prompt-Enhanced Distributed Denial-of-Service Resistance Method with Large Language Models Zhenyu Yin, Shang Liu, Guangyuan Xu 2024-09-11 arXiv https://github.com/liuup/DrLLM https://doi.org/10.48550/arXiv.2409.10561
703 AdaPPA: Adaptive Position Pre-Fill Jailbreak Attack Approach Targeting LLMs Lijia Lv, Weigang Zhang, Xuehai Tang, Jie Wen, Feng Liu, Jizhong Han, Songlin Hu 2024-09-11 arXiv https://github.com/Yummy416/AdaPPA http://arxiv.org/abs/2409.07503v1
704 Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation SeongYeub Chu, JongWoo Kim, MunYong Yi 2024-09-11 arXiv https://github.com/BBeeChu/InteractEval http://arxiv.org/abs/2409.07355v1
705 Understanding Knowledge Drift in LLMs through Misinformation Alina Fastowski, Gjergji Kasneci 2024-09-11 arXiv https://github.com/afastowski/knowledge_drift http://arxiv.org/abs/2409.07085v1
706 What is the Role of Small Models in the LLM Era: A Survey Lihu Chen, Gaël Varoquaux 2024-09-10 arXiv https://github.com/tigerchen52/role_of_small_models http://arxiv.org/abs/2409.06857v4
707 Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models Yao Shu, Wenyang Hu, See-Kiong Ng, Bryan Kian Hsiang Low, Fei Richard Yu 2024-09-10 arXiv https://github.com/allen4747/Ferret https://doi.org/10.48550/arXiv.2409.06277
708 LLaMA-Omni: Seamless Speech Interaction with Large Language Models Qingkai Fang, Shoutao Guo, Yan Zhou, Zhengrui Ma, Shaolei Zhang, Yang Feng 2024-09-10 arXiv https://github.com/ictnlp/LLaMA-Omni https://doi.org/10.48550/arXiv.2409.06666
709 Benchmarking Chinese Knowledge Rectification in Large Language Models Tianhe Lu, Jizhan Fang, Yunzhi Yao, Xin Xu, Ningyu Zhang, Huajun Chen 2024-09-09 arXiv https://github.com/zjunlp/EasyEdit https://doi.org/10.48550/arXiv.2409.05806
710 FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations Ziyao Wang, Zheyu Shen, Yexiao He, Guoheng Sun, Hongyi Wang, Lingjuan Lyu, Ang Li 2024-09-09 arXiv https://github.com/ATP-1010/FederatedLLM https://doi.org/10.48550/arXiv.2409.05976
711 Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach Meng Zhou, Surajsinh Parmar, Anubhav Bhatti 2024-09-09 arXiv https://github.com/SpassMed/Med-Llama3 https://doi.org/10.48550/arXiv.2409.05732
712 OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs Jintian Zhang, Cheng Peng, Mengshu Sun, Xiang Chen, Lei Liang, Zhiqiang Zhang, Jun Zhou, Huajun Chen, Ningyu Zhang 2024-09-08 arXiv https://github.com/zjunlp/OneGen http://arxiv.org/abs/2409.05152v2
713 Multi-Programming Language Ensemble for Code Generation in Large Language Model Tengfei Xue, Xuefeng Li, Tahir Azim, Roman Smirnov, Jianhui Yu, Arash Sadrieh, Babak Pahlavan 2024-09-06 arXiv https://github.com/NinjaTech-AI/MPLE https://doi.org/10.48550/arXiv.2409.04114
714 Sirius: Contextual Sparsity with Correction for Efficient LLMs Yang Zhou, Zhuoming Chen, Zhaozhuo Xu, Victoria Lin, Beidi Chen 2024-09-05 arXiv https://github.com/Infini-AI-Lab/Sirius http://arxiv.org/abs/2409.03856v1
715 Sketch: A Toolkit for Streamlining LLM Operations Xin Jiang, Xiang Li, Wenjia Ma, Xuezhi Fang, Yiqun Yao, Naitong Yu, Xuying Meng, Peng Han, Jing Li, Aixin Sun, Yequan Wang 2024-09-05 arXiv https://github.com/cofe-ai/Sketch http://arxiv.org/abs/2409.03346v1
716 Debate on Graph: a Flexible and Reliable Reasoning Framework for Large Language Models Jie Ma, Zhitao Gao, Qi Chai, Wangchun Sun, Pinghui Wang, Hongbin Pei, Jing Tao, Lingyun Song, Jun Liu, Chen Zhang, Lizhen Cui 2024-09-05 arXiv https://github.com/reml-group/DoG https://doi.org/10.48550/arXiv.2409.03155
717 Planning In Natural Language Improves LLM Search For Code Generation Evan Wang, Federico Cassano, Catherine Wu, Yunfeng Bai, Will Song, Vaskar Nath, Ziwen Han, Sean Hendryx, Summer Yue, Hugh Zhang 2024-09-05 arXiv https://github.com/scaleapi/plansearch http://arxiv.org/abs/2409.03733v2
718 LLM Detectors Still Fall Short of Real World: Case of LLM-Generated Short News-Like Posts Henrique Da Silva Gameiro, Andrei Kucharavy, Ljiljana Dolamic 2024-09-05 arXiv https://github.com/Reliable-Information-Lab-HEVS/benchmark_llm_texts_detection http://arxiv.org/abs/2409.03291v2
719 Alignment-Aware Model Extraction Attacks on Large Language Models Zi Liang, Qingqing Ye, Yanyun Wang, Sen Zhang, Yaxin Xiao, Ronghua Li, Jianliang Xu, Haibo Hu 2024-09-04 arXiv https://github.com/liangzid/alignmentExtraction https://doi.org/10.48550/arXiv.2409.02718
720 Large Language Model-Based Agents for Software Engineering: A Survey Junwei Liu, Kaixin Wang, Yixuan Chen, Xin Peng, Zhenpeng Chen, Lingming Zhang, Yiling Lou 2024-09-04 arXiv https://github.com/FudanSELab/Agent4SE-Paper-List https://doi.org/10.48550/arXiv.2409.02977
721 Hypothesizing Missing Causal Variables with LLMs Ivaxi Sheth, Sahar Abdelnabi, Mario Fritz 2024-09-04 arXiv https://github.com/ivaxi0s/hypothesizing-causal-variable-llm http://arxiv.org/abs/2409.02604v1
722 Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models? Yixuan Tang, Yi Yang 2024-09-04 arXiv https://github.com/yixuantt/PoolingAndAttn http://arxiv.org/abs/2409.02727v2
723 Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation Tiansheng Huang, Sihao Hu, Fatih Ilhan, Selim Furkan Tekin, Ling Liu 2024-09-03 arXiv https://github.com/git-disl/Booster https://doi.org/10.48550/arXiv.2409.01586
724 Exploiting the Vulnerability of Large Language Models via Defense-Aware Architectural Backdoor Abdullah Arafat Miah, Yu Bi 2024-09-03 arXiv https://github.com/SiSL-URI/Arch_Backdoor_LLM https://doi.org/10.48550/arXiv.2409.01952
725 MMLU-Pro+: Evaluating Higher-Order Reasoning and Shortcut Learning in LLMs Saeid Asgari Taghanaki, Aliasgahr Khani, Amir Khasahmadi 2024-09-03 arXiv https://github.com/asgsaeid/mmlu-pro-plus http://arxiv.org/abs/2409.02257v3
726 Agentic Society: Merging skeleton from real world and texture from Large Language Model Yuqi Bai, Kun Sun, Huishi Yin 2024-09-02 arXiv https://github.com/baiyuqi/agentic-society https://doi.org/10.48550/arXiv.2409.10550
727 FlashFlex: Accommodating Large Language Model Training over Heterogeneous Environment Ran Yan, Youhe Jiang, Wangcheng Tao, Xiaonan Nie, Bin Cui, Binhang Yuan 2024-09-02 arXiv https://github.com/Relaxed-System-Lab/FlashFlex https://doi.org/10.48550/arXiv.2409.01143
728 Large Language Models versus Classical Machine Learning: Performance in COVID-19 Mortality Prediction Using High-Dimensional Tabular Data Mohammadreza Ghaffarzadeh-Esfahani, Mahdi Ghaffarzadeh-Esfahani, Arian Salahi-Niri, Hossein Toreyhi, Zahra Atf, Amirali Mohsenzadeh-Kermani, Mahshad Sarikhani, Zohreh Tajabadi, Fatemeh Shojaeian, Mohammad Hassan Bagheri, Aydin Feyzi, Mohammadamin Tarighatpayma, Narges Gazmeh, Fateme Heydari, Hossein Afshar, Amirreza Allahgholipour, Farid Alimardani, Ameneh Salehi, Naghmeh Asadimanesh, Mohammad Amin Khalafi, Hadis Shabanipour, Ali Moradi, Sajjad Hossein Zadeh, Omid Yazdani, Romina Esbati, Moozhan Maleki, Danial Samiei Nasr, Amirali Soheili, Hossein Majlesi, Saba Shahsavan, Alireza Soheilipour, Nooshin Goudarzi, Erfan Taherifard, Hamidreza Hatamabadi, Jamil S. Samaan, Thomas Savage, Ankit Sakhuja, Ali Soroush, Girish N. Nadkarni, Ilad Alavi Darazam, Mohamad Amin Pourhoseingholi, Seyed Amir Ahmad Safavi-Naini 2024-09-02 arXiv https://github.com/mohammad-gh009/Large-Language-Models-vs-Classical-Machine-learning https://doi.org/10.48550/arXiv.2409.02136
729 Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference Barys Liskavets, Maxim Ushakov, Shuvendu Roy, Mark Klibanov, Ali Etemad, Shane Luke 2024-09-02 arXiv https://github.com/Workday/cpc http://arxiv.org/abs/2409.01227v3
730 Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models Bang An, Sicheng Zhu, Ruiyi Zhang, Michael-Andrei Panaitescu-Liess, Yuancheng Xu, Furong Huang 2024-09-01 arXiv https://github.com/umd-huang-lab/FalseRefusal https://doi.org/10.48550/arXiv.2409.00598
731 Harnessing the Power of Semi-Structured Knowledge and LLMs with Triplet-Based Prefiltering for Question Answering Derian Boer, Fabian Koch, Stefan Kramer 2024-09-01 arXiv https://github.com/kramerlab/4StepFocus http://arxiv.org/abs/2409.00861v1
732 AskIt: Unified Programming Interface for Programming with Large Language Models Katsumi Okuda, Saman P. Amarasinghe 2024-09 2024 IEEE/ACM International Symposium on Code Generation and Optimization (CGO) https://github.com/katsumiok/ts-askit https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10444830
733 LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models Zhiyuan Hu, Yuliang Liu, Jinman Zhao, Suyuchen Wang, Yan Wang, Wei Shen, Qing Gu, Anh Tuan Luu, See-Kiong Ng, Zhiwei Jiang, Bryan Hooi 2024-08-31 arXiv https://github.com/zhiyuanhubj/LongRecipe https://doi.org/10.48550/arXiv.2409.00509
734 MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models Shuai Peng, Di Fu, Liangcai Gao, Xiuqin Zhong, Hongguang Fu, Zhi Tang 2024-08-30 arXiv https://github.com/pengshuai-rin/MultiMath https://doi.org/10.48550/arXiv.2409.00147
735 A Survey on Evaluation of Large Language Models Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Kaijie Zhu, Hao Chen, Linyi Yang, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, Wei Ye, Yue Zhang, Yi Chang, Philip S. Yu, Qiang Yang, Xing Xie 2024-08-28 ACM Transactions on Intelligent Systems and Technology (TIST), Volume 15, Issue 3 https://llm-eval.github.io/ https://dl.acm.org/doi/10.1145/3641289
736 Atari-GPT: Benchmarking Multimodal Large Language Models as Low-Level Policies in Atari Games Nicholas R. Waytowich, Devin White, MD Sunbeam, Vinicius G. Goecks 2024-08-28 arXiv https://dev1nw.github.io/atari-gpt/ http://arxiv.org/abs/2408.15950v2
737 CBF-LLM: Safe Control for LLM Alignment Yuya Miyaoka, Masaki Inoue 2024-08-28 arXiv https://github.com/Mya-Mya/CBF-LLM http://arxiv.org/abs/2408.15625v2
738 Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders Min Shi, Fuxiao Liu, Shihao Wang, Shijia Liao, Subhashree Radhakrishnan, De-An Huang, Hongxu Yin, Karan Sapra, Yaser Yacoob, Humphrey Shi, Bryan Catanzaro, Andrew Tao, Jan Kautz, Zhiding Yu, Guilin Liu 2024-08-28 arXiv https://github.com/NVlabs/Eagle http://arxiv.org/abs/2408.15998v1
739 Efficient LLM Scheduling by Learning to Rank Yichao Fu, Siqi Zhu, Runlong Su, Aurick Qiao, Ion Stoica, Hao Zhang 2024-08-28 arXiv https://github.com/hao-ai-lab/vllm-ltr http://arxiv.org/abs/2408.15792v1
740 Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models Yuncheng Yang, Yulei Qin, Tong Wu, Zihan Xu, Gang Li, Pengcheng Guo, Hang Shao, Yucheng Shi, Ke Li, Xing Sun, Jie Yang, Yun Gu 2024-08-28 arXiv https://github.com/Yaphabates/Rocket https://doi.org/10.48550/arXiv.2408.15915
741 LyCon: Lyrics Reconstruction from the Bag-of-Words Using Large Language Models Haven Kim, Kahyun Choi 2024-08-27 arXiv https://github.com/havenpersona/lycon https://doi.org/10.48550/arXiv.2408.14750
742 PAT: Pruning-Aware Tuning for Large Language Models Yijiang Liu, Huanrui Yang, Youxin Chen, Rongyu Zhang, Miao Wang, Yuan Du, Li Du 2024-08-27 arXiv https://github.com/kriskrisliu/PAT_Pruning-Aware-Tuning https://doi.org/10.48550/arXiv.2408.14721
743 RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models Junyao Ge, Yang Zheng, Kaitai Guo, Jimin Liang 2024-08-27 arXiv https://github.com/SlytherinGe/RSTeller https://doi.org/10.48550/arXiv.2408.14744
744 CURLoRA: Stable LLM Continual Fine-Tuning and Catastrophic Forgetting Mitigation Muhammad Fawi 2024-08-26 arXiv https://github.com/MNoorFawi/curlora http://arxiv.org/abs/2408.14572v1
745 AgentMove: Predicting Human Mobility Anywhere Using Large Language Model based Agentic Framework Jie Feng, Yuwei Du, Jie Zhao, Yong Li 2024-08-26 arXiv https://github.com/tsinghua-fib-lab/AgentMove https://doi.org/10.48550/arXiv.2408.13986
746 ConVis: Contrastive Decoding with Hallucination Visualization for Mitigating Hallucinations in Multimodal Large Language Models Yeji Park, Deokyeong Lee, Junsuk Choe, Buru Chang 2024-08-25 arXiv https://github.com/yejipark-m/ConVis https://doi.org/10.48550/arXiv.2408.13906
747 Vision-Language and Large Language Model Performance in Gastroenterology: GPT, Claude, Llama, Phi, Mistral, Gemma, and Quantized Models Seyed Amir Ahmad Safavi-Naini, Shuhaib Ali, Omer Shahab, Zahra Shahhoseini, Thomas Savage, Sara Rafiee, Jamil S. Samaan, Reem Al Shabeeb, Farah Ladak, Jamie O. Yang, Juan Echavarria, Sumbal Babar, Aasma Shaukat, Samuel Margolis, Nicholas P. Tatonetti, Girish N. Nadkarni, Bara El Kurdi, Ali Soroush 2024-08-25 arXiv https://github.com/Sdamirsa/LLM-VLM-in-Gastroenterology https://doi.org/10.48550/arXiv.2409.00084
748 vitaLITy 2: Reviewing Academic Literature Using Large Language Models Hongye An, Arpit Narechania, Emily Wall, Kai Xu 2024-08-24 arXiv https://vitality-vis.github.io https://doi.org/10.48550/arXiv.2408.13450
749 HRGraph: Leveraging LLMs for HR Data Knowledge Graphs with Information Propagation-based Job Recommendation Azmine Toushik Wasi 2024-08-24 Proceedings of the 1st Workshop on Knowledge Graphs and Large Language Models (KaLLM 2024), Association for Computational Linguistics 2024 https://github.com/azminewasi/HRGraph http://arxiv.org/abs/2408.13521v1
750 LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs Chansung Park, Juyong Jiang, Fan Wang, Sayak Paul, Jing Tang 2024-08-24 arXiv https://github.com/deep-diver/llamaduo http://arxiv.org/abs/2408.13467v2
751 IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities Bin Wang, Chunyu Xie, Dawei Leng, Yuhui Yin 2024-08-23 arXiv https://github.com/360CVGroup/Inner-Adaptor-Architecture https://doi.org/10.48550/arXiv.2408.12902
752 MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? Yi-Fan Zhang, Huanyu Zhang, Haochen Tian, Chaoyou Fu, Shuangqing Zhang, Junfei Wu, Feng Li, Kun Wang, Qingsong Wen, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan 2024-08-23 arXiv https://mme-realworld.github.io/ http://arxiv.org/abs/2408.13257v2
753 LIMP: Large Language Model Enhanced Intent-aware Mobility Prediction Songwei Li, Jie Feng, Jiawei Chi, Xinyuan Hu, Xiaomeng Zhao, Fengli Xu 2024-08-23 arXiv https://github.com/tsinghua-fib-lab/LIMP https://doi.org/10.48550/arXiv.2408.12832
754 Generating Analytic Specifications for Data Visualization from Natural Language Queries using Large Language Models Subham Sah, Rishab Mitra, Arpit Narechania, Alex Endert, John T. Stasko, Wenwen Dou 2024-08-23 arXiv https://nl4dv.github.io https://doi.org/10.48550/arXiv.2408.13391
755 BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models Yige Li, Hanxun Huang, Yunhan Zhao, Xingjun Ma, Jun Sun 2024-08-23 arXiv https://github.com/bboylyg/BackdoorLLM https://doi.org/10.48550/arXiv.2408.12798
756 LLM-PBE: Assessing Data Privacy in Large Language Models Qinbin Li, Junyuan Hong, Chulin Xie, Jeffrey Tan, Rachel Xin, Junyi Hou, Xavier Yin, Zhun Wang, Dan Hendrycks, Zhangyang Wang, Bo Li, Bingsheng He, Dawn Song 2024-08-23 Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 11 https://llm-pbe.github.io/ https://dl.acm.org/doi/10.14778/3681954.3681994
757 Controllable Text Generation for Large Language Models: A Survey Xun Liang, Hanyu Wang, Yezhaohui Wang, Shichao Song, Jiawei Yang, Simin Niu, Jie Hu, Dan Liu, Shunyu Yao, Feiyu Xiong, Zhiyu Li 2024-08-22 arXiv https://github.com/IAAR-Shanghai/CTGSurvey https://doi.org/10.48550/arXiv.2408.12599
758 Enhanced Fine-Tuning of Lightweight Domain-Specific Q&A Model Based on Large Language Models Shenglin Zhang, Pengtian Zhu, Minghua Ma, Jiagang Wang, Yongqian Sun, Dongwen Li, Jingyu Wang, Qianying Guo, Xiaolei Hua, Lin Zhu, Dan Pei 2024-08-22 ISSRE https://github.com/Zero-Pointer/Self-Evolution https://doi.org/10.1109/ISSREW63542.2024.00048
759 Geolocation Representation from Large Language Models are Generic Enhancers for Spatio-Temporal Learning Junlin He, Tong Nie, Wei Ma 2024-08-22 arXiv https://github.com/Umaruchain/LLMGeovec https://doi.org/10.48550/arXiv.2408.12116
760 Reasoning Factual Knowledge in Structured Data with Large Language Models Sirui Huang, Yanggan Gu, Xuming Hu, Zhonghao Li, Qing Li, Guandong Xu 2024-08-22 arXiv https://github.com/EganGu/StructFact https://doi.org/10.48550/arXiv.2408.12188
761 Aligning (Medical) LLMs for (Counterfactual) Fairness Raphael Poulain, Hamed Fayyaz, Rahmatollah Beheshti 2024-08-22 arXiv https://github.com/healthylaife/FairAlignmentLLM http://arxiv.org/abs/2408.12055v1
762 Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs Ronit Singhal, Pransh Patwa, Parth Patwa, Aman Chadha, Amitava Das 2024-08-22 arXiv https://github.com/ronit-singhal/evidence-backed-fact-checking-using-rag-and-few-shot-in-context-learning-with-llms http://arxiv.org/abs/2408.12060v2
763 MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing Hao Zhou, Zhijun Wang, Shujian Huang, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Weihua Luo, Jiajun Chen 2024-08-21 arXiv https://github.com/zjwang21/MoE-LPR https://doi.org/10.48550/arXiv.2408.11396
764 Personality Alignment of Large Language Models Minjun Zhu, Linyi Yang, Yue Zhang 2024-08-21 arXiv https://github.com/zhu-minjun/PAlign https://doi.org/10.48550/arXiv.2408.11779
765 Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models Yuzhou Huang, Yiran Qin, Shunlin Lu, Xintao Wang, Rui Huang, Ying Shan, Ruimao Zhang 2024-08-21 arXiv https://yuzhou914.github.io/Story3D-Agent/ https://doi.org/10.48550/arXiv.2408.11801
766 SimBench: A Rule-Based Multi-Turn Interaction Benchmark for Evaluating an LLM's Ability to Generate Digital Twins Jingquan Wang, Harry Zhang, Huzaifa Mustafa Unjhawala, Peter Negrut, Shu Wang, Khailanii Slaton, Radu Serban, Jin-Long Wu, Dan Negrut 2024-08-21 arXiv https://github.com/uwsbel/SimBench http://arxiv.org/abs/2408.11987v1
767 SysBench: Can Large Language Models Follow System Messages? Yanzhao Qin, Tao Zhang, Tao Zhang, Yanjun Shen, Wenjing Luo, Haoze Sun, Yan Zhang, Yujing Qiao, Weipeng Chen, Zenan Zhou, Wentao Zhang, Bin Cui 2024-08-20 arXiv https://github.com/PKU-Baichuan-MLSystemLab/SysBench https://doi.org/10.48550/arXiv.2408.10943
768 Putting People in LLMs' Shoes: Generating Better Answers via Question Rewriter Junhao Chen, Bowen Wang, Zhouqiang Jiang, Yuta Nakashima 2024-08-20 arXiv https://github.com/3244we/Question-Rewriter http://arxiv.org/abs/2408.10573v1
769 FLAME: Learning to Navigate with Multimodal LLM in Urban Environments Yunzhe Xu, Yiyuan Pan, Zhe Liu, Hesheng Wang 2024-08-20 arXiv https://flame-sjtu.github.io http://arxiv.org/abs/2408.11051v1
770 Task-level Distributionally Robust Optimization for Large Language Model-based Dense Retrieval Guangyuan Ma, Yongliang Ma, Xing Wu, Zhenpeng Su, Ming Zhou, Songlin Hu 2024-08-20 arXiv https://github.com/tdro-llm/tdro https://doi.org/10.48550/arXiv.2408.10613
771 Large Language Models for Multimodal Deformable Image Registration Mingrui Ma, Weijie Wang, Jie Ning, Jianfeng He, Nicu Sebe, Bruno Lepri 2024-08-20 arXiv https://github.com/ninjannn/LLM-Morph https://doi.org/10.48550/arXiv.2408.10703
772 Predicting Rewards Alongside Tokens: Non-disruptive Parameter Insertion for Efficient Inference Intervention in Large Language Model Chenhan Yuan, Fei Huang, Ru Peng, Keming Lu, Bowen Yu, Chang Zhou, Jingren Zhou 2024-08-20 EMNLP https://github.com/chenhan97/Otter https://aclanthology.org/2024.emnlp-main.316
773 Beyond Labels: Aligning Large Language Models with Human-Like Reasoning Muhammad Rafsan Kabir, Rafeed Mohammad Sultan, Ihsanul Haque Asif, Jawad Ibn Ahad, Fuad Rahman, Mohammad Ruhul Amin, Nabeel Mohammed, Shafin Rahman 2024-08-20 ICPR https://github.com/apurba-nsu-rnd-lab/DFAR https://doi.org/10.1007/978-3-031-78172-8_16
774 LLM-Barber: Block-Aware Rebuilder for Sparsity Mask in One-Shot for Large Language Models Yupeng Su, Ziyi Guan, Xiaoqun Liu, Tianlai Jin, Dongkuan Wu, Graziano Chesi, Ngai Wong, Hao Yu 2024-08-20 arXiv https://github.com/YupengSu/LLM-Barber https://doi.org/10.48550/arXiv.2408.10631
775 CMoralEval: A Moral Evaluation Benchmark for Chinese Large Language Models Linhao Yu, Yongqi Leng, Yufei Huang, Shang Wu, Haixin Liu, Xinmeng Ji, Jiahui Zhao, Jinwang Song, Tingting Cui, Xiaoqing Cheng, Liutao Liutao, Deyi Xiong 2024-08-19 ACL https://github.com/tjunlp-lab/CMoralEval https://doi.org/10.18653/v1/2024.findings-acl.703
776 Pedestrian Attribute Recognition: A New Benchmark Dataset and A Large Language Model Augmented Framework Jiandong Jin, Xiao Wang, Qian Zhu, Haiyang Wang, Chenglong Li 2024-08-19 arXiv https://github.com/Event-AHU/OpenPAR https://doi.org/10.48550/arXiv.2408.09720
777 R2GenCSR: Retrieving Context Samples for Large Language Model based X-ray Medical Report Generation Xiao Wang, Yuehang Li, Fuling Wang, Shiao Wang, Chuanfu Li, Bo Jiang 2024-08-19 arXiv https://github.com/Event-AHU/Medical_Image_Analysis https://doi.org/10.48550/arXiv.2408.09743
778 AutoML-guided Fusion of Entity and LLM-based Representations for Document Classification Boshko Koloski, Senja Pollak, Roberto Navigli, Blaž Škrlj 2024-08-19 arXiv https://github.com/bkolosk1/bablfusion http://arxiv.org/abs/2408.09794v2
779 FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant Zhengchao Huang, Bin Xia, Zicheng Lin, Zhun Mou, Wenming Yang, Jiaya Jia 2024-08-19 arXiv https://ffaa-vl.github.io https://doi.org/10.48550/arXiv.2408.10072
780 Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning Tiansheng Huang, Gautam Bhattacharya, Pratik Joshi, Josh Kimball, Ling Liu 2024-08-18 arXiv https://huangtiansheng.github.io/Antidote_gh_page/ https://doi.org/10.48550/arXiv.2408.09600
781 HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with Large Language Model Mengkang Hu, Tianxing Chen, Qiguang Chen, Yao Mu, Wenqi Shao, Ping Luo 2024-08-18 arXiv https://github.com/HiAgent2024/HiAgent https://doi.org/10.48550/arXiv.2408.09559
782 PA-LLaVA: A Large Language-Vision Assistant for Human Pathology Image Understanding Dawei Dai, Yuanhui Zhang, Long Xu, Qianlan Yang, Xiaojing Shen, Shuyin Xia, Guoyin Wang 2024-08-18 arXiv https://github.com/ddw2AIGROUP2CQUPT/PA-LLaVA https://doi.org/10.48550/arXiv.2408.09530
783 TC-RAG:Turing-Complete RAG's Case study on Medical LLM Systems Xinke Jiang, Yue Fang, Rihong Qiu, Haoyu Zhang, Yongxin Xu, Hao Chen, Wentao Zhang, Ruizhe Zhang, Yuchen Fang, Xu Chu, Junfeng Zhao, Yasha Wang 2024-08-17 arXiv https://github.com/Artessay/SAMA http://arxiv.org/abs/2408.09199v1
784 Can Large Language Models Improve the Adversarial Robustness of Graph Neural Networks? Zhongjian Zhang, Xiao Wang, Huichi Zhou, Yue Yu, Mengmei Zhang, Cheng Yang, Chuan Shi 2024-08-16 arXiv https://github.com/zhongjian-zhang/LLM4RGNN https://doi.org/10.48550/arXiv.2408.08685
785 MIA-Tuner: Adapting Large Language Models as Pre-training Text Detector Wenjie Fu, Huandong Wang, Chen Gao, Guanghua Liu, Yong Li, Tao Jiang 2024-08-16 arXiv https://github.com/wjfu99/MIA-Tuner https://doi.org/10.48550/arXiv.2408.08661
786 Authorship Attribution in the Era of LLMs: Problems, Methodologies, and Challenges Baixiang Huang, Canyu Chen, Kai Shu 2024-08-16 arXiv https://llm-authorship.github.io http://arxiv.org/abs/2408.08946v1
787 Fine-tuning LLMs for Autonomous Spacecraft Control: A Case Study Using Kerbal Space Program Alejandro Carrasco, Victor Rodriguez-Fernandez, Richard Linares 2024-08-16 arXiv https://github.com/ARCLab-MIT/kspdg http://arxiv.org/abs/2408.08676v1
788 Multimodal Causal Reasoning Benchmark: Challenging Vision Large Language Models to Infer Causal Links Between Siamese Images Zhiyuan Li, Heng Wang, Dongnan Liu, Chaoyi Zhang, Ao Ma, Jieting Long, Tom Weidong Cai 2024-08-15 arXiv https://github.com/Zhiyuan-Li-John/MuCR https://doi.org/10.48550/arXiv.2408.08105
789 Prefix Guidance: A Steering Wheel for Large Language Models to Defend Against Jailbreak Attacks Jiawei Zhao, Kejiang Chen, Xiaojian Yuan, Weiming Zhang 2024-08-15 arXiv https://github.com/weiyezhimeng/Prefix-Guidance https://doi.org/10.48550/arXiv.2408.08924
790 Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models Tianyu Wang, Haitao Lin, Junqiu Yu, Yanwei Fu 2024-08-15 IROS https://star-uu-wang.github.io/Polaris/ https://doi.org/10.1109/IROS58592.2024.10801446
791 Can Large Language Models Understand Symbolic Graphics Programs? Zeju Qiu, Weiyang Liu, Haiwen Feng, Zhen Liu, Tim Z. Xiao, Katherine M. Collins, Joshua B. Tenenbaum, Adrian Weller, Michael J. Black, Bernhard Schölkopf 2024-08-15 arXiv https://sgp-bench.github.io/ https://doi.org/10.48550/arXiv.2408.08313
792 FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models Zhongyu Zhao, Menghang Dong, Rongyu Zhang, Wenzhao Zheng, Yunpeng Zhang, Huanrui Yang, Dalong Du, Kurt Keutzer, Shanghang Zhang 2024-08-15 arXiv https://github.com/zhenwuweihe/FactorLLM https://doi.org/10.48550/arXiv.2408.11855
793 ArabLegalEval: A Multitask Benchmark for Assessing Arabic Legal Knowledge in Large Language Models Faris Hijazi, Somayah AlHarbi, Abdulaziz AlHussein, Harethah Abu Shairah, Reem Alzahrani, Hebah AlShamlan, George Turkiyyah, Omar Knio 2024-08-15 ArabicNLP https://github.com/Thiqah/ArabLegalEval https://aclanthology.org/2024.arabicnlp-1.20
794 Evaluating Large Language Model based Personal Information Extraction and Countermeasures Yupei Liu, Yuqi Jia, Jinyuan Jia, Neil Zhenqiang Gong 2024-08-14 arXiv https://github.com/liu00222/LLM-Based-Personal-Profile-Extraction https://doi.org/10.48550/arXiv.2408.07291
795 Knowledge in Superposition: Unveiling the Failures of Lifelong Knowledge Editing for Large Language Models Chenhui Hu, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao 2024-08-14 arXiv https://github.com/ChenhuiHu/knowledge_in_superposition https://doi.org/10.48550/arXiv.2408.07413
796 Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities Enneng Yang, Li Shen, Guibing Guo, Xingwei Wang, Xiaochun Cao, Jie Zhang, Dacheng Tao 2024-08-14 arXiv https://github.com/EnnengYang/Awesome-Model-Merging-Methods-Theories-Applications http://arxiv.org/abs/2408.07666v4
797 LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs Yushi Bai, Jiajie Zhang, Xin Lv, Linzhi Zheng, Siqi Zhu, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li 2024-08-13 arXiv https://github.com/THUDM/LongWriter http://arxiv.org/abs/2408.07055v1
798 Kov: Transferable and Naturalistic Black-Box LLM Attacks using Markov Decision Processes and Tree Search Robert J. Moss 2024-08-11 arXiv https://github.com/sisl/Kov.jl http://arxiv.org/abs/2408.08899v1
799 Revisiting Multi-Modal LLM Evaluation Jian Lu, Shikhar Srivastava, Junyu Chen, Robik Shrestha, Manoj Acharya, Kushal Kafle, Christopher Kanan 2024-08-09 arXiv https://kevinlujian.github.io/MLLM_Evaluations/ http://arxiv.org/abs/2408.05334v1
800 SHIELD: LLM-Driven Schema Induction for Predictive Analytics in EV Battery Supply Chain Disruptions Zhi-Qi Cheng, Yifei Dong, Aike Shi, Wei Liu, Yuzhi Hu, Jason O'Connor, Alexander G. Hauptmann, Kate S. Whitefoot 2024-08-09 arXiv https://fly1113.github.io/MFI/ http://arxiv.org/abs/2408.05357v2
801 Tabular Transfer Learning via Prompting LLMs Jaehyun Nam, Woomin Song, Seong Hyeon Park, Jihoon Tack, Sukmin Yun, Jaehyung Kim, Kyu Hwan Oh, Jinwoo Shin 2024-08-09 arXiv https://github.com/jaehyun513/P2T http://arxiv.org/abs/2408.11063v1
802 VITA: Towards Open-Source Interactive Omni Multimodal LLM Chaoyou Fu, Haojia Lin, Zuwei Long, Yunhang Shen, Meng Zhao, Yifan Zhang, Shaoqi Dong, Xiong Wang, Di Yin, Long Ma, Xiawu Zheng, Ran He, Rongrong Ji, Yunsheng Wu, Caifeng Shan, Xing Sun 2024-08-09 arXiv https://vita-home.github.io http://arxiv.org/abs/2408.05211v2
803 BA-LoRA: Bias-Alleviating Low-Rank Adaptation to Mitigate Catastrophic Inheritance in Large Language Models Yupeng Chang, Yi Chang, Yuan Wu 2024-08-08 arXiv https://github.com/cyp-jlu-ai/BA-LoRA http://arxiv.org/abs/2408.04556v3
804 ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities Jiarui Lu, Thomas Holleis, Yizhe Zhang, Bernhard Aumayer, Feng Nan, Felix Bai, Shuang Ma, Shen Ma, Mengyu Li, Guoli Yin, Zirui Wang, Ruoming Pang 2024-08-08 arXiv https://github.com/apple/ToolSandbox http://arxiv.org/abs/2408.04682v1
805 Open-domain Implicit Format Control for Large Language Model Generation Yiqun Yao, Wenjia Ma, Xuezhi Fang, Xin Jiang, Xiang Li, Xuying Meng, Peng Han, Jing Li, Aixin Sun, Yequan Wang 2024-08-08 arXiv https://github.com/cofe-ai/OIFC https://doi.org/10.48550/arXiv.2408.04392
806 Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation Junde Wu, Jiayuan Zhu, Yunli Qi, Jingkun Chen, Min Xu, Filippo Menolascina, Vicente Grau 2024-08-08 arXiv https://github.com/MedicineToken/Medical-Graph-RAG https://doi.org/10.48550/arXiv.2408.04187
807 CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases Xiangyan Liu, Bo Lan, Zhiyuan Hu, Yang Liu, Zhicheng Zhang, Fei Wang, Michael Shieh, Wenmeng Zhou 2024-08-07 arXiv https://github.com/modelscope/modelscope-agent/tree/master/apps/codexgraph_agent https://doi.org/10.48550/arXiv.2408.03910
808 NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time Yilong Chen, Guoxia Wang, Junyuan Shang, Shiyao Cui, Zhenyu Zhang, Tingwen Liu, Shuohuan Wang, Yu Sun, Dianhai Yu, Hua Wu 2024-08-07 arXiv https://github.com/PaddlePaddle/Research/tree/master/NLP/ACL2024-NACL http://arxiv.org/abs/2408.03675v2
809 WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models Prannaya Gupta, Le Qi Yau, Hao Han Low, I-Shiang Lee, Hugo Maximus Lim, Yu Xin Teoh, Jia Hng Koh, Dar Win Liew, Rishabh Bhardwaj, Rajat Bhardwaj, Soujanya Poria 2024-08-07 arXiv https://github.com/walledai/walledeval https://doi.org/10.48550/arXiv.2408.03837
810 ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning Hieu Man, Nghia Trung Ngo, Franck Dernoncourt, Thien Huu Nguyen 2024-08-06 arXiv https://github.com/nlp-uoregon/ullme https://doi.org/10.48550/arXiv.2408.03402
811 OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs Hasan Iqbal, Yuxia Wang, Minghan Wang, Georgi Georgiev, Jiahui Geng, Iryna Gurevych, Preslav Nakov 2024-08-06 arXiv https://github.com/mbzuai-nlp/openfactcheck http://arxiv.org/abs/2408.11832v2
812 Topic Modeling with Fine-tuning LLMs and Bag of Sentences Johannes Schneider 2024-08-06 arXiv https://github.com/JohnTailor/FT-Topic http://arxiv.org/abs/2408.03099v1
813 StructEval: Deepen and Broaden Large Language Model Assessment via Structured Evaluation Boxi Cao, Mengjie Ren, Hongyu Lin, Xianpei Han, Feng Zhang, Junfeng Zhan, Le Sun 2024-08-06 ACL https://github.com/c-box/StructEval https://doi.org/10.18653/v1/2024.findings-acl.314
814 Citekit: A Modular Toolkit for Large Language Model Citation Generation Jiajun Shen, Tong Zhou, Suifeng Zhao, Yubo Chen, Kang Liu 2024-08-06 arXiv https://github.com/SjJ1017/Citekit https://doi.org/10.48550/arXiv.2408.04662
815 UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model Zhaowei Li, Wei Wang, Yiqing Cai, Qi Xu, Pengyu Wang, Dong Zhang, Hang Song, Botian Jiang, Zhida Huang, Tao Wang 2024-08-05 arXiv https://github.com/lzw-lzw/UnifiedMLLM https://doi.org/10.48550/arXiv.2408.02503
816 Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models Zi Liang, Haibo Hu, Qingqing Ye, Yaxin Xiao, Haoyang Li 2024-08-05 arXiv https://github.com/liangzid/PromptExtractionEval https://doi.org/10.48550/arXiv.2408.02416
817 RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation Daniel Fleischer, Moshe Berchansky, Moshe Wasserblat, Peter Izsak 2024-08-05 arXiv https://github.com/IntelLabs/RAGFoundry http://arxiv.org/abs/2408.02545v1
818 ReDel: A Toolkit for LLM-Powered Recursive Multi-Agent Systems Andrew Zhu, Liam Dugan, Chris Callison-Burch 2024-08-05 arXiv https://github.com/zhudotexe/redel http://arxiv.org/abs/2408.02248v2
819 SEAS: Self-Evolving Adversarial Safety Optimization for Large Language Models Muxi Diao, Rumei Li, Shiyang Liu, Guogang Liao, Jingang Wang, Xunliang Cai, Weiran Xu 2024-08-05 arXiv https://SEAS-LLM.github.io/ https://doi.org/10.48550/arXiv.2408.02632
820 PLUGH: A Benchmark for Spatial Understanding and Reasoning in Large Language Models Alexey Tikhonov 2024-08-03 arXiv https://github.com/altsoph/PLUGH https://doi.org/10.48550/arXiv.2408.04648
821 MALADE: Orchestration of LLM-powered Agents with Retrieval Augmented Generation for Pharmacovigilance Jihye Choi, Nils Palumbo, Prasad Chalasani, Matthew M. Engelhard, Somesh Jha, Anivarya Kumar, David Page 2024-08-03 arXiv https://github.com/jihyechoi77/malade http://arxiv.org/abs/2408.01869v1
822 CFBench: A Comprehensive Constraints-Following Benchmark for LLMs Tao Zhang, Yanjun Shen, Wenjing Luo, Yan Zhang, Hao Liang, Tao Zhang, Fan Yang, Mingan Lin, Yujing Qiao, Weipeng Chen, Bin Cui, Wentao Zhang, Zenan Zhou 2024-08-02 arXiv https://github.com/PKU-Baichuan-MLSystemLab/CFBench http://arxiv.org/abs/2408.01122v1
823 Talk Less, Interact Better: Evaluating In-context Conversational Adaptation in Multimodal LLMs Yilun Hua, Yoav Artzi 2024-08-02 arXiv https://github.com/lil-lab/ICCA http://arxiv.org/abs/2408.01417v1
824 Agentic LLM Workflows for Generating Patient-Friendly Medical Reports Malavikha Sudarshan, Sophie Shih, Estella Yee, Alina Yang, John Zou, Cathy Chen, Quan Zhou, Leon Chen, Chinmay Singhal, George Shih 2024-08-02 arXiv http://github.com/malavikhasudarshan/Multi-Agent-Patient-Letter-Generation http://arxiv.org/abs/2408.01112v2
825 Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs Peng Ding, Jingyu Wu, Jun Kuang, Dan Ma, Xuezhi Cao, Xunliang Cai, Shi Chen, Jiajun Chen, Shujian Huang 2024-08-02 ACM Multimedia https://github.com/NJUNLP/Hallu-PI https://doi.org/10.1145/3664647.3681251
826 Non Verbis, Sed Rebus: Large Language Models Are Weak Solvers of Italian Rebuses Gabriele Sarti, Tommaso Caselli, Malvina Nissim, Arianna Bisazza 2024-08-01 CLiC-it https://github.com/gsarti/verbalized-rebus https://ceur-ws.org/Vol-3878/96_main_long.pdf
827 Large Language Model-driven Meta-structure Discovery in Heterogeneous Information Network Lin Chen, Fengli Xu, Nian Li, Zhenyu Han, Meng Wang, Yong Li, Pan Hui 2024-08 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://github.com/LinChen-65/ReStruct https://dl.acm.org/doi/10.1145/3637528.3671965
828 Neural Retrievers are Biased Towards LLM-Generated Content Sunhao Dai, Yuqi Zhou, Liang Pang, Weihao Liu, Xiaolin Hu, Yong Liu, Xiao Zhang, Gang Wang, Jun Xu 2024-08 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://github.com/KID-22/Source-Bias https://dl.acm.org/doi/10.1145/3637528.3671882
829 Lookahead: An Inference Acceleration Framework for Large Language Model with Lossless Generation Accuracy Yao Zhao, Zhitian Xie, Chen Liang, Chenyi Zhuang, Jinjie Gu 2024-08 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://github.com/alipay/PainlessInferenceAcceleration https://dl.acm.org/doi/10.1145/3637528.3671614
830 RecExplainer: Aligning Large Language Models for Explaining Recommendation Models Yuxuan Lei, Jianxun Lian, Jing Yao, Xu Huang, Defu Lian, Xing Xie 2024-08 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://github.com/microsoft/RecAI https://dl.acm.org/doi/10.1145/3637528.3671802
831 AutoWebGLM: A Large Language Model-based Web Navigating Agent Hanyu Lai, Xiao Liu, Iat Long Iong, Shuntian Yao, Yuxuan Chen, Pengbo Shen, Hao Yu, Hanchen Zhang, Xiaohan Zhang, Yuxiao Dong, Jie Tang 2024-08 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://github.com/THUDM/AutoWebGLM https://dl.acm.org/doi/10.1145/3637528.3671620
832 A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models Wenqi Fan, Yujuan Ding, Liangbo Ning, Shijie Wang, Hengyun Li, Dawei Yin, Tat-Seng Chua, Qing Li 2024-08 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://advanced-recommender-systems.github.io/RAG-Meets-LLMs/ https://dl.acm.org/doi/10.1145/3637528.3671470
833 A Survey of Large Language Models for Graphs Xubin Ren, Jiabin Tang, Dawei Yin, Nitesh V. Chawla, Chao Huang 2024-08 KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining https://github.com/HKUDS/Awesome-LLM4Graph-Papers https://dl.acm.org/doi/10.1145/3637528.3671460
834 ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models Mingrui Wu, Xinyue Cai, Jiayi Ji, Jiale Li, Oucheng Huang, Gen Luo, Hao Fei, Guannan Jiang, Xiaoshuai Sun, Rongrong Ji 2024-07-31 arXiv https://github.com/mrwu-mac/ControlMLLM https://doi.org/10.48550/arXiv.2407.21534
835 Automated Review Generation Method Based on Large Language Models Shican Wu, Xiao Ma, Dehui Luo, Lulu Li, Xiangcheng Shi, Xin Chang, Xiaoyun Lin, Ran Luo, Chunlei Pei, Changyin Du, Zhi-Jian Zhao, Jinlong Gong 2024-07-30 arXiv https://github.com/TJU-ECAT-AI/AutomaticReviewGeneration https://doi.org/10.48550/arXiv.2407.20906
836 CollectiveSFT: Scaling Large Language Models for Chinese Medical Benchmark with Collective Instructions in Healthcare Jingwei Zhu, Minghuan Tan, Min Yang, Ruixue Li, Hamid Alinejad-Rokny 2024-07-29 arXiv https://github.com/CAS-SIAT-XinHai/CollectiveSFT https://doi.org/10.48550/arXiv.2407.19705
837 Can Editing LLMs Inject Harm? Canyu Chen, Baixiang Huang, Zekun Li, Zhaorun Chen, Shiyang Lai, Xiongxiao Xu, Jia-Chen Gu, Jindong Gu, Huaxiu Yao, Chaowei Xiao, Xifeng Yan, William Yang Wang, Philip Torr, Dawn Song, Kai Shu 2024-07-29 arXiv https://llm-editing.github.io http://arxiv.org/abs/2407.20224v3
838 rLLM: Relational Table Learning with LLMs Weichen Li, Xiaotong Huang, Jianwu Zheng, Zheng Wang, Chaokun Wang, Li Pan, Jianhua Li 2024-07-29 arXiv https://github.com/rllm-project/rllm http://arxiv.org/abs/2407.20157v1
839 A Role-specific Guided Large Language Model for Ophthalmic Consultation Based on Stylistic Differentiation Laiyi Fu, Binbin Fan, Hongkai Du, Yanxiang Feng, Chunhua Li, Huping Song 2024-07-26 arXiv https://github.com/sperfu/EyeDoc https://doi.org/10.48550/arXiv.2407.18483
840 Exploring Bengali Religious Dialect Biases in Large Language Models with Evaluation Perspectives Azmine Toushik Wasi, Raima Islam, Mst Rafia Islam, Taki Hasan Rafi, Dong-Kyu Chae 2024-07-25 arXiv https://heal-workshop.github.io/#:~:text=Exploring%20Bengali%20Religious%20Dialect%20Biases%20in%20Large%20Language%20Models%20with%20Evaluation%20Perspectives https://doi.org/10.48550/arXiv.2407.18376
841 Scalify: scale propagation for efficient low-precision LLM training Paul Balança, Sam Hosegood, Carlo Luschi, Andrew Fitzgibbon 2024-07-24 arXiv https://github.com/graphcore-research/jax-scalify http://arxiv.org/abs/2407.17353v1
842 Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching Yuyang Ding, Hanglei Hu, Jie Zhou, Qin Chen, Bo Jiang, Liang He 2024-07-24 CIKM https://github.com/ECNU-ICALK/SocraticMath https://doi.org/10.1145/3627673.3679881
843 Accurate and Efficient Fine-Tuning of Quantized Large Language Models Through Optimal Balance Ao Shen, Qiang Wang, Zhiquan Lai, Xionglve Li, Dong-sheng Li 2024-07-24 arXiv https://github.com/xiaocaigou/qbaraqahira https://doi.org/10.48550/arXiv.2407.17029
844 Figure it Out: Analyzing-based Jailbreak Attack on Large Language Models Shi Lin, Rongchang Li, Xun Wang, Changting Lin, Wenpeng Xing, Meng Han 2024-07-23 arXiv https://github.com/theshi-1128/ABJ-Attack https://doi.org/10.48550/arXiv.2407.16205
845 INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model Yiwei Ma, Zhibin Wang, Xiaoshuai Sun, Weihuang Lin, Qiang Zhou, Jiayi Ji, Rongrong Ji 2024-07-23 arXiv https://github.com/WeihuangLin/INF-LLaVA https://doi.org/10.48550/arXiv.2407.16198
846 UniMEL: A Unified Framework for Multimodal Entity Linking with Large Language Models Qi Liu, Yongyi He, Defu Lian, Zhi Zheng, Tong Xu, Liu Che, Enhong Chen 2024-07-23 arXiv https://github.com/Javkonline/UniMEL https://doi.org/10.48550/arXiv.2407.16160
847 Enhancing LLM's Cognition via Structurization Kai Liu, Zhihang Fu, Chao Chen, Wei Zhang, Rongxin Jiang, Fan Zhou, Yaowu Chen, Yue Wu, Jieping Ye 2024-07-23 arXiv https://github.com/alibaba/struxgpt http://arxiv.org/abs/2407.16434v2
848 Rome was Not Built in a Single Step: Hierarchical Prompting for LLM-based Chip Design Andre Nakkab, Sai Qian Zhang, Ramesh Karri, Siddharth Garg 2024-07-23 arXiv https://github.com/ajn313/ROME-LLM http://arxiv.org/abs/2407.18276v3
849 Structure-aware Domain Knowledge Injection for Large Language Models Kai Liu, Ze Chen, Zhihang Fu, Rongxin Jiang, Fan Zhou, Yaowu Chen, Yue Wu, Jieping Ye 2024-07-23 arXiv https://github.com/alibaba/struxgpt http://arxiv.org/abs/2407.16724v2
850 Counter Turing Test ($CT^2$): Investigating AI-Generated Text Detection for Hindi -- Ranking LLMs based on Hindi AI Detectability Index ($ADI_{hi}$) Ishan Kavathekar, Anku Rani, Ashmit Chamoli, Ponnurangam Kumaraguru, Amit Sheth, Amitava Das 2024-07-22 OpenReview https://github.com/ishank31/Counter_Turing_Test http://arxiv.org/abs/2407.15694v2
851 SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models Mingze Xu, Mingfei Gao, Zhe Gan, Hong-You Chen, Zhengfeng Lai, Haiming Gang, Kai Kang, Afshin Dehghan 2024-07-22 arXiv https://github.com/apple/ml-slowfast-llava https://doi.org/10.48550/arXiv.2407.15841
852 LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models Xi Chen, Songyang Zhang, Qibing Bai, Kai Chen, Satoshi Nakamura 2024-07-22 ACL https://github.com/openaudiolab/LLaST https://doi.org/10.18653/v1/2024.findings-acl.416
853 Knowledge Acquisition Disentanglement for Knowledge-based Visual Question Answering with Large Language Models Wenbin An, Feng Tian, Jiahao Nie, Wenkai Shi, Haonan Lin, Yan Chen, Qianying Wang, Yaqiang Wu, Guang Dai, Ping Chen 2024-07-22 arXiv https://github.com/Lackel/DKA https://doi.org/10.48550/arXiv.2407.15346
854 Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability Zhuoyan Xu, Zhenmei Shi, Yingyu Liang 2024-07-22 arXiv https://github.com/OliverXUZY/LLM_Compose https://doi.org/10.48550/arXiv.2407.15720
855 Large Language Model for Verilog Generation with Golden Code Feedback Ning Wang, Bingkun Yao, Jie Zhou, Xi Wang, Zhe Jiang, Nan Guan 2024-07-21 arXiv https://github.com/CatIIIIIIII/veriseek https://doi.org/10.48550/arXiv.2407.18271
856 Navigation Instruction Generation with BEV Perception and Large Language Models Sheng Fan, Rui Liu, Wenguan Wang, Yi Yang 2024-07-21 ECCV https://github.com/FanScy/BEVInstructor https://doi.org/10.1007/978-3-031-72670-5_21
857 BIGbench: A Unified Benchmark for Social Bias in Text-to-Image Generative Models Based on Multi-modal LLM Hanjun Luo, Haoyu Huang, Ziye Deng, Xuecheng Liu, Ruizhe Chen, Zuozhu Liu 2024-07-21 arXiv https://github.com/BIGbench2024/BIGbench2024/ http://arxiv.org/abs/2407.15240v3
858 No Size Fits All: The Perils and Pitfalls of Leveraging LLMs Vary with Company Size Ashok Urlana, Charaka Vinayak Kumar, Bala Mallikarjunarao Garlapati, Ajeet Kumar Singh, Rahul Mishra 2024-07-21 arXiv https://github.com/vinayakcse/IndustrialLLMsPapers http://arxiv.org/abs/2408.01444v2
859 Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval Yiyang Jiang, Wengyu Zhang, Xulu Zhang, Xiaoyong Wei, Chang Wen Chen, Qing Li 2024-07-21 arXiv https://github.com/fletcherjiang/LLMEPET http://arxiv.org/abs/2407.15051v3
860 SynCPKL: Harnessing LLMs to Generate Synthetic Data for Commonsense Persona Knowledge Linking Kuan-Yen Lin 2024-07-21 arXiv https://github.com/irislin1006/CPKL http://arxiv.org/abs/2407.15281v1
861 On the Design and Analysis of LLM-Based Algorithms Yanxi Chen, Yaliang Li, Bolin Ding, Jingren Zhou 2024-07-20 arXiv https://github.com/modelscope/agentscope/tree/main/examples/paper_llm_based_algorithm http://arxiv.org/abs/2407.14788v2
862 Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models Xuenan Xu, Pingyue Zhang, Ming Yan, Ji Zhang, Mengyue Wu 2024-07-19 arXiv https://www.github.com/wsntxxn/AttrEnhZsAc https://doi.org/10.48550/arXiv.2407.14355
863 Beyond Code Generation: Assessing Code LLM Maturity with Postconditions Fusen He, Juan Zhai, Minxue Pan 2024-07-19 arXiv https://github.com/MatureModel/PostcondGen http://arxiv.org/abs/2407.14118v1
864 Internal Consistency and Self-Feedback in Large Language Models: A Survey Xun Liang, Shichao Song, Zifan Zheng, Hanyu Wang, Qingchen Yu, Xunkai Li, Rong-Hua Li, Yi Wang, Zhonghao Wang, Feiyu Xiong, Zhiyu Li 2024-07-19 arXiv https://github.com/IAAR-Shanghai/ICSFSurvey https://doi.org/10.48550/arXiv.2407.14507
865 SegPoint: Segment Any Point Cloud via Large Language Model Shuting He, Henghui Ding, Xudong Jiang, Bihan Wen 2024-07-18 arXiv https://heshuting555.github.io/SegPoint https://doi.org/10.48550/arXiv.2407.13761
866 ViLLa: Video Reasoning Segmentation with Large Language Model Rongkun Zheng, Lu Qi, Xi Chen, Yi Wang, Kun Wang, Yu Qiao, Hengshuang Zhao 2024-07-18 arXiv https://github.com/rkzheng99/ViLLa https://doi.org/10.48550/arXiv.2407.14500
867 E5-V: Universal Embeddings with Multimodal Large Language Models Ting Jiang, Minghui Song, Zihan Zhang, Haizhen Huang, Weiwei Deng, Feng Sun, Qi Zhang, Deqing Wang, Fuzhen Zhuang 2024-07-17 arXiv https://github.com/kongds/E5-V https://doi.org/10.48550/arXiv.2407.12580
868 Leveraging Environment Interaction for Automated PDDL Generation and Planning with Large Language Models Sadegh Mahdavi, Raquel Aoki, Keyi Tang, Yanshuai Cao 2024-07-17 arXiv https://github.com/BorealisAI/llm-pddl-planning https://doi.org/10.48550/arXiv.2407.12979
869 MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models Leyang Shen, Gongwei Chen, Rui Shao, Weili Guan, Liqiang Nie 2024-07-17 arXiv https://github.com/JiuTian-VL/MoME https://doi.org/10.48550/arXiv.2407.12709
870 Patch-Level Training for Large Language Models Chenze Shao, Fandong Meng, Jie Zhou 2024-07-17 arXiv https://github.com/shaochenze/PatchTrain https://doi.org/10.48550/arXiv.2407.12665
871 VISA: Reasoning Video Object Segmentation via Large Language Models Cilin Yan, Haochen Wang, Shilin Yan, Xiaolong Jiang, Yao Hu, Guoliang Kang, Weidi Xie, Efstratios Gavves 2024-07-16 ECCV https://github.com/cilinyan/VISA https://doi.org/10.1007/978-3-031-72633-0_6
872 NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? Mo Li, Songyang Zhang, Yunxin Liu, Kai Chen 2024-07-16 arXiv https://github.com/open-compass/opencompass http://arxiv.org/abs/2407.11963v1
873 Beyond Correctness: Benchmarking Multi-dimensional Code Generation for Large Language Models Jiasheng Zheng, Boxi Cao, Zhengzhao Ma, Ruotong Pan, Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun 2024-07-16 arXiv https://github.com/jszheng21/RACE https://doi.org/10.48550/arXiv.2407.11470
874 Robust Utility-Preserving Text Anonymization Based on Large Language Models Tianyu Yang, Xiaodan Zhu, Iryna Gurevych 2024-07-16 arXiv https://github.com/UKPLab/arxiv2024-rupta https://doi.org/10.48550/arXiv.2407.11770
875 LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices Jung Hyun Lee, Jeonghoon Kim, June Yong Yang, Se Jung Kwon, Eunho Yang, Kang Min Yoo, Dongsoo Lee 2024-07-16 arXiv https://github.com/onliwad101/FlexRound_LRQ https://doi.org/10.48550/arXiv.2407.11534
876 Think-on-Graph 2.0: Deep and Interpretable Large Language Model Reasoning with Knowledge Graph-guided Retrieval Shengjie Ma, Chengjin Xu, Xuhui Jiang, Muzhi Li, Huaren Qu, Cehao Yang, Jiaxin Mao, Jian Guo 2024-07-15 arXiv https://github.com/IDEA-FinAI/ToG-2 https://doi.org/10.48550/arXiv.2407.10805
877 When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments Chong Zhang, Xinyi Liu, Mingyu Jin, Zhongmou Zhang, Lingyao Li, Zhenting Wang, Wenyue Hua, Dong Shu, Suiyuan Zhu, Xiaobo Jin, Sujian Li, Mengnan Du, Yongfeng Zhang 2024-07-15 arXiv https://github.com/MingyuJ666/Stockagent https://doi.org/10.48550/arXiv.2407.18957
878 VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation Bocheng Zou, Mu Cai, Jianrui Zhang, Yong Jae Lee 2024-07-15 EMNLP https://vgbench.github.io https://aclanthology.org/2024.emnlp-main.213
879 Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models Qingcheng Zeng, Mingyu Jin, Qinkai Yu, Zhenting Wang, Wenyue Hua, Zihao Zhou, Guangyan Sun, Yanda Meng, Shiqing Ma, Qifan Wang, Felix Juefei-Xu, Kaize Ding, Fan Yang, Ruixiang Tang, Yongfeng Zhang 2024-07-15 arXiv https://github.com/qcznlp/uncertainty_attack https://doi.org/10.48550/arXiv.2407.11282
880 By My Eyes: Grounding Multimodal Large Language Models with Sensor Data via Visual Prompting Hyungjun Yoon, Biniyam Aschalew Tolera, Taesik Gong, Kimin Lee, Sung-Ju Lee 2024-07-15 EMNLP https://github.com/diamond264/ByMyEyes https://aclanthology.org/2024.emnlp-main.133
881 Prompt Selection Matters: Enhancing Text Annotations for Social Sciences with Large Language Models Louis Abraham, Charles Arnal, Antoine Marie 2024-07-15 arXiv https://prompt-ultra.github.io/ https://doi.org/10.48550/arXiv.2407.10645
882 Evaluating Large Language Models with fmeval Pola Schwöbel, Luca Franceschi, Muhammad Bilal Zafar, Keerthan Vasist, Aman Malhotra, Tomer Shenhar, Pinal Tailor, Pinar Yilmaz, Michael Diamond, Michele Donini 2024-07-15 arXiv https://github.com/aws/fmeval https://doi.org/10.48550/arXiv.2407.12872
883 MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models Chengguang Gan, Sunbowen Lee, Qingyu Yin, Xinyang He, Hanjun Wei, Yunhao Liang, Younghun Lim, Shijian Wang, Hexiang Huang, Qinghao Zhang, Shiwen Ni, Tatsunori Mori 2024-07-15 arXiv https://ganchengguang.github.io/MRE/ https://doi.org/10.48550/arXiv.2407.10953
884 IDEAL: Leveraging Infinite and Dynamic Characterizations of Large Language Models for Query-focused Summarization Jie Cao, Dian Jiao, Qiang Yan, Wenqiao Zhang, Siliang Tang, Yueting Zhuang 2024-07-15 arXiv https://github.com/DCDmllm/IDEAL_Summary https://doi.org/10.48550/arXiv.2407.10486
885 ChatLogic: Integrating Logic Programming with Large Language Models for Multi-Step Reasoning Zhongsheng Wang, Jiamou Liu, Qiming Bao, Hongfei Rong, Jingfeng Zhang 2024-07-14 IJCNN https://github.com/Strong-AI-Lab/ChatLogic https://doi.org/10.1109/IJCNN60899.2024.10650138
886 Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models Yuchen Yang, Kwonjoon Lee, Behzad Dariush, Yinzhi Cao, Shao-Yuan Lo 2024-07-14 ECCV https://github.com/Yuchen413/AnomalyRuler https://doi.org/10.1007/978-3-031-73004-7_18
887 Refusing Safe Prompts for Multi-modal Large Language Models Zedian Shao, Hongbin Liu, Yuepeng Hu, Neil Zhenqiang Gong 2024-07-12 arXiv https://github.com/Sadcardation/MLLM-Refusal https://doi.org/10.48550/arXiv.2407.09050
888 Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors Nico Daheim, Jakub Macina, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan 2024-07-12 EMNLP https://github.com/eth-lre/verify-then-generate https://aclanthology.org/2024.emnlp-main.478
889 Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection Xingyu Peng, Yan Bai, Chen Gao, Lirong Yang, Fei Xia, Beipeng Mu, Xiaofei Wang, Si Liu 2024-07-12 arXiv https://github.com/GradiusTwinbee/GLIS http://arxiv.org/abs/2407.08931v1
890 Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training Youliang Yuan, Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Jiahao Xu, Tian Liang, Pinjia He, Zhaopeng Tu 2024-07-12 arXiv https://github.com/RobustNLP/DeRTa http://arxiv.org/abs/2407.09121v1
891 Incorporating Large Language Models into Production Systems for Enhanced Task Automation and Flexibility Yuchen Xia, Jize Zhang, Nasser Jazdi, Michael Weyrich 2024-07-11 arXiv https://github.com/YuchenXia/GPT4IndustrialAutomation https://doi.org/10.48550/arXiv.2407.08550
892 SEED-Story: Multimodal Long Story Generation with Large Language Model Shuai Yang, Yuying Ge, Yang Li, Yukang Chen, Yixiao Ge, Ying Shan, Yingcong Chen 2024-07-11 arXiv https://github.com/TencentARC/SEED-Story https://doi.org/10.48550/arXiv.2407.08683
893 The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective Zhen Qin, Daoyuan Chen, Wenhao Zhang, Liuyi Yao, Yilun Huang, Bolin Ding, Yaliang Li, Shuiguang Deng 2024-07-11 arXiv https://github.com/modelscope/data-juicer/blob/main/docs/awesome_llm_data.md https://doi.org/10.48550/arXiv.2407.08583
894 Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing Huanqian Wang, Yang Yue, Rui Lu, Jingxin Shi, Andrew Zhao, Shenzhi Wang, Shiji Song, Gao Huang 2024-07-11 arXiv https://github.com/lucywang720/model-surgery http://arxiv.org/abs/2407.08770v1
895 EfficientQAT: Efficient Quantization-Aware Training for Large Language Models Mengzhao Chen, Wenqi Shao, Peng Xu, Jiahao Wang, Peng Gao, Kaipeng Zhang, Yu Qiao, Ping Luo 2024-07-10 arXiv https://github.com/OpenGVLab/EfficientQAT https://doi.org/10.48550/arXiv.2407.11062
896 GLBench: A Comprehensive Benchmark for Graph with Large Language Models Yuhan Li, Peisong Wang, Xiao Zhu, Aochuan Chen, Haiyun Jiang, Deng Cai, Victor Wai Kin Chan, Jia Li 2024-07-10 arXiv https://github.com/NineAbyss/GLBench https://doi.org/10.48550/arXiv.2407.07457
897 Inference Performance Optimization for Large Language Models on CPUs Pujiang He, Shan Zhou, Wenhuan Huang, Changqing Li, Duyi Wang, Bin Guo, Chen Meng, Sheng Gui, Weifei Yu, Yi Xie 2024-07-10 arXiv https://github.com/intel/xFasterTransformer https://doi.org/10.48550/arXiv.2407.07304
898 RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization Xijie Huang, Zechun Liu, Shih-Yang Liu, Kwang-Ting Cheng 2024-07-10 arXiv https://github.com/HuangOwen/RoLoRA http://arxiv.org/abs/2407.08044v2
899 FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation Liqun Ma, Mingjie Sun, Zhiqiang Shen 2024-07-09 arXiv https://github.com/LiqunMa/FBI-LLM http://arxiv.org/abs/2407.07093v1
900 Etalon: Holistic Performance Evaluation Framework for LLM Inference Systems Amey Agrawal, Anmol Agarwal, Nitin Kedia, Jayashree Mohan, Souvik Kundu, Nipun Kwatra, Ramachandran Ramjee, Alexey Tumanov 2024-07-09 arXiv https://github.com/project-etalon/etalon http://arxiv.org/abs/2407.07000v2
901 Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps Yung-Sung Chuang, Linlu Qiu, Cheng-Yu Hsieh, Ranjay Krishna, Yoon Kim, James R. Glass 2024-07-09 EMNLP https://github.com/voidism/Lookback-Lens https://aclanthology.org/2024.emnlp-main.84
902 DebUnc: Mitigating Hallucinations in Large Language Model Agent Communication with Uncertainty Estimations Luke Yoffe, Alfonso Amayuelas, William Yang Wang 2024-07-08 arXiv https://github.com/lukeyoffe/debunc https://doi.org/10.48550/arXiv.2407.06426
903 LLMBox: A Comprehensive Library for Large Language Models Tianyi Tang, Yiwen Hu, Bingqian Li, Wenyang Luo, Zijing Qin, Haoxiang Sun, Jiapeng Wang, Shiyi Xu, Xiaoxue Cheng, Geyang Guo, Han Peng, Bowen Zheng, Yiru Tang, Yingqian Min, Yushuo Chen, Jie Chen, Yuanqian Zhao, Luran Ding, Yuhao Wang, Zican Dong, Chunxuan Xia, Junyi Li, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen 2024-07-08 arXiv https://github.com/RUCAIBox/LLMBox https://doi.org/10.48550/arXiv.2407.05563
904 iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement Aoyu Pang, Maonan Wang, Man-On Pun, Chung Shue Chen, Xi Xiong 2024-07-08 arXiv https://github.com/Traffic-Alpha/iLLM-TSC https://doi.org/10.48550/arXiv.2407.06025
905 GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing Zhenyu Wang, Aoxue Li, Zhenguo Li, Xihui Liu 2024-07-08 arXiv https://zhenyuw16.github.io/GenArtist_page http://arxiv.org/abs/2407.05600v2
906 KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions Yanxu Zhu, Jinlin Xiao, Yuhang Wang, Jitao Sang 2024-07-08 arXiv https://github.com/yanxuzhu/KG-FPQ http://arxiv.org/abs/2407.05868v2
907 LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages Yinquan Lu, Wenhao Zhu, Lei Li, Yu Qiao, Fei Yuan 2024-07-08 arXiv https://github.com/CONE-MT/LLaMAX/ http://arxiv.org/abs/2407.05975v2
908 PsycoLLM: Enhancing LLM for Psychological Understanding and Evaluation Jinpeng Hu, Tengteng Dong, Luo Gang, Hui Ma, Peng Zou, Xiao Sun, Dan Guo, Xun Yang, Meng Wang 2024-07-08 arXiv https://github.com/MACLAB-HFUT/PsycoLLM http://arxiv.org/abs/2407.05721v3
909 LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts Yijia Xiao, Edward Sun, Tianyu Liu, Wei Wang 2024-07-06 arXiv https://github.com/Yijia-Xiao/LogicVista http://arxiv.org/abs/2407.04973v1
910 Beyond Perplexity: Multi-dimensional Safety Evaluation of LLM Compression Zhichao Xu, Ashim Gupta, Tao Li, Oliver Bentham, Vivek Srikumar 2024-07-06 arXiv https://github.com/zhichaoxu-shufe/Beyond-Perplexity-Compression-Safety-Eval http://arxiv.org/abs/2407.04965v3
911 ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models Yuzhe Gu, Ziwei Ji, Wenwei Zhang, Chengqi Lyu, Dahua Lin, Kai Chen 2024-07-05 arXiv https://github.com/open-compass/ANAH https://doi.org/10.48550/arXiv.2407.04693
912 AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents Petr Anokhin, Nikita Semenov, Artyom Sorokin, Dmitry Evseev, Mikhail Burtsev, Evgeny Burnaev 2024-07-05 arXiv https://github.com/AIRI-Institute/AriGraph http://arxiv.org/abs/2407.04363v2
913 Automating Venture Capital: Founder assessment using LLM-powered segmentation, feature engineering and automated labeling techniques Ekin Ozince, Yiğit Ihlamur 2024-07-05 arXiv https://github.com/velapartners/moneyball-LLM-based-founder-features http://arxiv.org/abs/2407.04885v1
914 BiosERC: Integrating Biography Speakers Supported by LLMs for ERC Tasks Jieying Xue, Minh Phuong Nguyen, Blake Matheny, Le Minh Nguyen 2024-07-05 arXiv https://github.com/yingjie7/BiosERC http://arxiv.org/abs/2407.04279v1
915 Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs Mihir Parmar, Hanieh Deilamsalehy, Franck Dernoncourt, Seunghyun Yoon, Ryan A. Rossi, Trung Bui 2024-07-05 arXiv https://github.com/Mihir3009/Extract-AI http://arxiv.org/abs/2407.04855v1
916 Waterfall: Framework for Robust and Scalable Text Watermarking and Provenance for LLMs Gregory Kang Ruey Lau, Xinyuan Niu, Hieu Dao, Jiangwei Chen, Chuan-Sheng Foo, Bryan Kian Hsiang Low 2024-07-05 arXiv https://github.com/aoi3142/Waterfall http://arxiv.org/abs/2407.04411v2
917 When LLMs Play the Telephone Game: Cumulative Changes and Attractors in Iterated Cultural Transmissions Jérémy Perez, Grgur Kovač, Corentin Léger, Cédric Colas, Gaia Molinaro, Maxime Derex, Pierre-Yves Oudeyer, Clément Moulin-Frier 2024-07-05 arXiv https://github.com/jeremyperez2/TelephoneGameLLM http://arxiv.org/abs/2407.04503v2
918 AutoBench: Automatic Testbench Generation and Evaluation Using LLMs for HDL Design Ruidi Qiu, Grace Li Zhang, Rolf Drechsler, Ulf Schlichtmann, Bing Li 2024-07-04 arXiv https://github.com/AutoBench/AutoBench http://arxiv.org/abs/2407.03891v2
919 Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation Yi-Chen Li, Fuxiang Zhang, Wenjie Qiu, Lei Yuan, Chengxing Jia, Zongzhang Zhang, Yang Yu, Bo An 2024-07-04 arXiv https://github.com/mansicer/Q-Adapter http://arxiv.org/abs/2407.03856v3
920 TongGu: Mastering Classical Chinese Understanding with Knowledge-Grounded Large Language Models Jiahuan Cao, Dezhi Peng, Peirong Zhang, Yongxin Shi, Yang Liu, Kai Ding, Lianwen Jin 2024-07-04 EMNLP https://github.com/SCUT-DLVCLab/TongGu-LLM https://aclanthology.org/2024.findings-emnlp.243
921 The Price of Prompting: Profiling Energy Use in Large Language Models Inference Erik Johannes Husom, Arda Goknil, Lwin Khin Shar, Sagar Sen 2024-07-04 arXiv https://github.com/ejhusom/MELODI https://doi.org/10.48550/arXiv.2407.16893
922 NutriBench: A Dataset for Evaluating Large Language Models in Carbohydrate Estimation from Meal Descriptions Andong Hua, Mehak Preet Dhaliwal, Ryan Burke, Laya Pullela, Yao Qin 2024-07-04 arXiv https://mehak126.github.io/nutribench.html https://doi.org/10.48550/arXiv.2407.12843
923 CFinBench: A Comprehensive Chinese Financial Benchmark for Large Language Models Ying Nie, Binwei Yan, Tianyu Guo, Hao Liu, Haoyu Wang, Wei He, Binfan Zheng, Weihao Wang, Qiang Li, Weijian Sun, Yunhe Wang, Dacheng Tao 2024-07-02 arXiv https://cfinbench.github.io/ https://doi.org/10.48550/arXiv.2407.02301
924 Extracting and Encoding: Leveraging Large Language Models and Medical Knowledge to Enhance Radiological Text Representation Pablo Messina, René Vidal, Denis Parra, Alvaro Soto, Vladimir Araujo 2024-07-02 ACL https://github.com/PabloMessina/CXR-Fact-Encoder https://doi.org/10.18653/v1/2024.findings-acl.236
925 Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models Zihan Wang, Deli Chen, Damai Dai, Runxin Xu, Zhuoshu Li, Y. Wu 2024-07-02 arXiv https://github.com/deepseek-ai/ESFT https://doi.org/10.48550/arXiv.2407.01906
926 To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models Bozhong Tian, Xiaozhuan Liang, Siyuan Cheng, Qingbin Liu, Mengru Wang, Dianbo Sui, Xi Chen, Huajun Chen, Ningyu Zhang 2024-07-02 EMNLP https://github.com/zjunlp/KnowUnDo https://aclanthology.org/2024.findings-emnlp.82
927 Breaking Bias, Building Bridges: Evaluation and Mitigation of Social Biases in LLMs via Contact Hypothesis Chahat Raj, Anjishnu Mukherjee, Aylin Caliskan, Antonios Anastasopoulos, Ziwei Zhu 2024-07-02 arXiv https://github.com/chahatraj/breakingbias http://arxiv.org/abs/2407.02030v1
928 Enabling Discriminative Reasoning in LLMs for Legal Judgment Prediction Chenlong Deng, Kelong Mao, Yuyao Zhang, Zhicheng Dou 2024-07-02 arXiv https://github.com/ChenlongDeng/ADAPT http://arxiv.org/abs/2407.01964v4
929 TokenPacker: Efficient Visual Projector for Multimodal LLM Wentong Li, Yuqian Yuan, Jian Liu, Dongqi Tang, Song Wang, Jie Qin, Jianke Zhu, Lei Zhang 2024-07-02 arXiv https://github.com/CircleRadon/TokenPacker http://arxiv.org/abs/2407.02392v4
930 MalAlgoQA: Pedagogical Evaluation of Counterfactual Reasoning in Large Language Models and Implications for AI in Education Shashank Sonkar, Naiming Liu, Myco Le, Richard G. Baraniuk 2024-07-01 EMNLP https://github.com/luffycodes/MalAlgoQA-Dataset https://aclanthology.org/2024.findings-emnlp.913
931 MIRAI: Evaluating LLM Agents for Event Forecasting Chenchen Ye, Ziniu Hu, Yihe Deng, Zijie Huang, Mingyu Derek Ma, Yanqiao Zhu, Wei Wang 2024-07-01 arXiv https://mirai-llm.github.io/ http://arxiv.org/abs/2407.01231v1
932 FineSurE: Fine-grained Summarization Evaluation using LLMs Hwanjun Song, Hang Su, Igor Shalyminov, Jason Cai, Saab Mansour 2024-07-01 arXiv https://github.com/DISL-Lab/FineSurE-ACL24 http://arxiv.org/abs/2407.00908v3
933 SplitLoRA: A Split Parameter-Efficient Fine-Tuning Framework for Large Language Models Zheng Lin, Xuanjie Hu, Yuxin Zhang, Zhe Chen, Zihan Fang, Xianhao Chen, Ang Li, Praneeth Vepakomma, Yue Gao 2024-07-01 arXiv https://fduinc.github.io/splitlora/ https://doi.org/10.48550/arXiv.2407.00952
934 DiscoveryBench: Towards Data-Driven Discovery with Large Language Models Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal, Bhavana Dalvi Mishra, Abhijeetsingh Meena, Aryan Prakhar, Tirth Vora, Tushar Khot, Ashish Sabharwal, Peter Clark 2024-07-01 arXiv https://github.com/allenai/discoverybench https://doi.org/10.48550/arXiv.2407.01725
935 Enhancing the Capability and Robustness of Large Language Models through Reinforcement Learning-Driven Query Refinement Zisu Huang, Xiaohua Wang, Feiran Zhang, Zhibo Xu, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang 2024-07-01 arXiv https://github.com/Huangzisu/query-refinement https://doi.org/10.48550/arXiv.2407.01461
936 EconNLI: Evaluating Large Language Models on Economics Reasoning Yue Guo, Yi Yang 2024-07-01 ACL https://github.com/Irenehere/EconNLI https://doi.org/10.18653/v1/2024.findings-acl.58
937 AutoFlow: Automated Workflow Generation for Large Language Model Agents Zelong Li, Shuyuan Xu, Kai Mei, Wenyue Hua, Balaji Rama, Om Raheja, Hao Wang, He Zhu, Yongfeng Zhang 2024-07-01 arXiv https://github.com/agiresearch/AutoFlow https://doi.org/10.48550/arXiv.2407.12821
938 LLaRA: Large Language-Recommendation Assistant Jiayi Liao, Sihang Li, Zhengyi Yang, Jiancan Wu, Yancheng Yuan, Xiang Wang, Xiangnan He 2024-07 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/ljy0ustc/LLaRA https://dl.acm.org/doi/10.1145/3626772.3657690
939 USimAgent: Large Language Models for Simulating Search Users Erhan Zhang, Xingzhu Wang, Peiyuan Gong, Yankai Lin, Jiaxin Mao 2024-07 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/Meow-E/USimAgent https://dl.acm.org/doi/10.1145/3626772.3657963
940 IDGenRec: LLM-RecSys Alignment with Textual ID Learning Juntao Tan, Shuyuan Xu, Wenyue Hua, Yingqiang Ge, Zelong Li, Yongfeng Zhang 2024-07 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/agiresearch/IDGenRec https://dl.acm.org/doi/10.1145/3626772.3657821
941 Are Large Language Models Good at Utility Judgments? Hengran Zhang, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng 2024-07 SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval https://github.com/ict-bigdatalab/utility_judgments https://dl.acm.org/doi/10.1145/3626772.3657784
942 LLMatic: Neural Architecture Search Via Large Language Models And Quality Diversity Optimization Muhammad Umair Nasir, Sam Earle, Julian Togelius, Steven James, Christopher W. Cleghorn 2024-07 GECCO '24: Proceedings of the Genetic and Evolutionary Computation Conference https://github.com/umair-nasir14/LLMatic https://dl.acm.org/doi/10.1145/3638529.3654017
943 ChatUniTest: A Framework for LLM-Based Test Generation Yinghao Chen, Zehao Hu, Chen Zhi, Junxiao Han, Shuiguang Deng, Jianwei Yin 2024-07 FSE 2024: Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering https://github.com/ZJU-ACES-ISE/ChatUniTest https://dl.acm.org/doi/10.1145/3663529.3663801
944 GraphArena: Benchmarking Large Language Models on Graph Computational Problems Jianheng Tang, Qifan Zhang, Yuhan Li, Jia Li 2024-06-29 arXiv https://github.com/squareRoot3/GraphArena https://doi.org/10.48550/arXiv.2407.00379
945 LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement Jiahao Ying, Mingbao Lin, Yixin Cao, Wei Tang, Bo Wang, Qianru Sun, Xuanjing Huang, Shuicheng Yan 2024-06-29 arXiv https://yingjiahao14.github.io/LLMs-as-Instructors-pages/ http://arxiv.org/abs/2407.00497v1
946 YuLan: An Open-source Large Language Model Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, Lei Zhang, Junyi Li, Xiaolei Wang, Lei Wang, Beichen Zhang, Zican Dong, Xiaoxue Cheng, Yuhan Chen, Xinyu Tang, Yupeng Hou, Qiangqiang Ren, Xincheng Pang, Shufang Xie, Wayne Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ze-Feng Gao, Yueguo Chen, Weizheng Lu, Ji-Rong Wen 2024-06-28 arXiv https://github.com/RUC-GSAI/YuLan-Chat https://doi.org/10.48550/arXiv.2406.19853
947 Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring Jiazheng Li, Hainiu Xu, Zhaoyue Sun, Yuxiang Zhou, David West, Cesare Aloisi, Yulan He 2024-06-28 arXiv https://github.com/lijiazheng99/thought_tree_assessment http://arxiv.org/abs/2406.19949v2
948 MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics? Jinming Li, Yichen Zhu, Zhiyuan Xu, Jindong Gu, Minjie Zhu, Xin Liu, Ning Liu, Yaxin Peng, Feifei Feng, Jian Tang 2024-06-28 arXiv https://mm-robobench.github.io/ http://arxiv.org/abs/2406.19693v1
949 Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs Sukmin Yun, Haokun Lin, Rusiru Thushara, Mohammad Qazim Bhat, Yongxin Wang, Zutao Jiang, Mingkai Deng, Jinhong Wang, Tianhua Tao, Junbo Li, Haonan Li, Preslav Nakov, Timothy Baldwin, Zhengzhong Liu, Eric P. Xing, Xiaodan Liang, Zhiqiang Shen 2024-06-28 arXiv https://mbzuai-llm.github.io/webpage2code/ http://arxiv.org/abs/2406.20098v2
950 DIM: Dynamic Integration of Multimodal Entity Linking with Large Language Model Shezheng Song, Shasha Li, Jie Yu, Shan Zhao, Xiaopeng Li, Jun Ma, Xiaodong Liu, Zhuo Li, Xiaoguang Mao 2024-06-27 PRCV https://github.com/season1blue/DIM https://doi.org/10.1007/978-981-97-8620-6_13
951 STBench: Assessing the Ability of Large Language Models in Spatio-Temporal Analysis Wenbin Li, Di Yao, Ruibo Zhao, Wenjie Chen, Zijie Xu, Chengxue Luo, Chang Gong, Quanliang Jing, Haining Tan, Jingping Bi 2024-06-27 arXiv https://github.com/LwbXc/STBench https://doi.org/10.48550/arXiv.2406.19065
952 Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Utilization Miyoung Ko, Sue Hyun Park, Joonsuk Park, Minjoon Seo 2024-06-27 arXiv https://github.com/kaistAI/knowledge-reasoning http://arxiv.org/abs/2406.19502v2
953 Selective Prompting Tuning for Personalized Conversations with LLMs Qiushi Huang, Xubo Liu, Tom Ko, Bo Wu, Wenwu Wang, Yu Zhang, Lilian Tang 2024-06-26 OpenReview https://github.com/hqsiswiliam/SPT http://arxiv.org/abs/2406.18187v1
954 IRCAN: Mitigating Knowledge Conflicts in LLM Generation via Identifying and Reweighting Context-Aware Neurons Dan Shi, Renren Jin, Tianhao Shen, Weilong Dong, Xinwei Wu, Deyi Xiong 2024-06-26 arXiv https://github.com/danshi777/IRCAN http://arxiv.org/abs/2406.18406v2
955 CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs Zirui Wang, Mengzhou Xia, Luxi He, Howard Chen, Yitao Liu, Richard Zhu, Kaiqu Liang, Xindi Wu, Haotian Liu, Sadhika Malladi, Alexis Chevalier, Sanjeev Arora, Danqi Chen 2024-06-26 arXiv https://charxiv.github.io/ http://arxiv.org/abs/2406.18521v1
956 Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs Xin Lai, Zhuotao Tian, Yukang Chen, Senqiao Yang, Xiangru Peng, Jiaya Jia 2024-06-26 arXiv https://github.com/dvlab-research/Step-DPO http://arxiv.org/abs/2406.18629v1
957 Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation Guanting Dong, Yutao Zhu, Chenghao Zhang, Zechen Wang, Zhicheng Dou, Ji-Rong Wen 2024-06-26 arXiv https://github.com/dongguanting/DPA-RAG http://arxiv.org/abs/2406.18676v2
958 Hierarchical Context Pruning: Optimizing Real-World Code Completion with Repository-Level Pretrained Code LLMs Lei Zhang, Yunshui Li, Jiaming Li, Xiaobo Xia, Jiaxi Yang, Run Luo, Minzheng Wang, Longze Chen, Junhao Liu, Min Yang 2024-06-26 arXiv https://github.com/Hambaobao/HCP-Coder http://arxiv.org/abs/2406.18294v2
959 The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval Boris Meinardus, Anil Batra, Anna Rohrbach, Marcus Rohrbach 2024-06-26 arXiv https://github.com/sudo-Boris/mr-Blip https://doi.org/10.48550/arXiv.2406.18113
960 BADGE: BADminton report Generation and Evaluation with LLM Shang-Hsuan Chiang, Lin-Wei Chao, Kuang-Da Wang, Chih-Chuan Wang, Wen-Chih Peng 2024-06-26 arXiv https://github.com/AndyChiangSH/BADGE http://arxiv.org/abs/2406.18116v1
961 Visual Reasoning and Multi-Agent Approach in Multimodal Large Language Models (MLLMs): Solving TSP and mTSP Combinatorial Challenges Mohammed Elhenawy, Ahmad Abutahoun, Taqwa I. Alhadidi, Ahmed Jaber, Huthaifa I. Ashqar, Shadi Jaradat, Ahmed Abdelhay, Sebastien Glaser, Andry Rakotonirainy 2024-06-26 Mach. Learn. Knowl. Extr. https://github.com/ahmed-abdulhuy/Solving-TSP-and-mTSP-Combinatorial-Challenges-using-Visual-Reasoning-and-Multi-Agent-Approach-MLLMs- https://doi.org/10.3390/make6030093
962 A Closer Look into Mixture-of-Experts in Large Language Models Ka Man Lo, Zeyu Huang, Zihan Qiu, Zili Wang, Jie Fu 2024-06-26 arXiv https://github.com/kamanphoebe/Look-into-MoEs https://doi.org/10.48550/arXiv.2406.18219
963 A Review of Large Language Models and Autonomous Agents in Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White 2024-06-26 arXiv https://github.com/ur-whitelab/LLMs-in-science https://doi.org/10.48550/arXiv.2407.01603
964 ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs Ahmed Heakl, Youssef Zaghloul, Mennatullah Ali, Rania Hossam, Walid Gomaa 2024-06-26 arXiv http://github.com/ahmedheakl/arazn-llm http://arxiv.org/abs/2406.18120v2
965 Crafting Customisable Characters with LLMs: Introducing SimsChat, a Persona-Driven Role-Playing Agent Framework Bohao Yang, Dong Liu, Chen Tang, Chenghao Xiao, Kun Zhao, Chao Li, Lin Yuan, Guang Yang, Lanxiao Huang, Chenghua Lin 2024-06-25 arXiv https://github.com/Bernard-Yang/SimsChat http://arxiv.org/abs/2406.17962v3
966 TALEC: Teach Your LLM to Evaluate in Specific Domain with In-house Criteria by Criteria Division and Zero-shot Plus Few-shot Kaiqi Zhang, Shuai Yuan, Honghan Zhao 2024-06-25 arXiv https://github.com/zlkqz/auto_eval http://arxiv.org/abs/2407.10999v1
967 T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge Jianyu Wei, Shijie Cao, Ting Cao, Lingxiao Ma, Lei Wang, Yanyong Zhang, Mao Yang 2024-06-25 arXiv https://github.com/microsoft/T-MAC http://arxiv.org/abs/2407.00088v1
968 Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA Minzheng Wang, Longze Chen, Cheng Fu, Shengyi Liao, Xinghua Zhang, Bingli Wu, Haiyang Yu, Nan Xu, Lei Zhang, Run Luo, Yunshui Li, Min Yang, Fei Huang, Yongbin Li 2024-06-25 arXiv https://github.com/MozerWang/Loong http://arxiv.org/abs/2406.17419v2
969 Layer-Wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-Levels Razvan-Gabriel Dumitru, Vikas Yadav, Rishabh Maheshwary, Paul-Ioan Clotan, Sathwik Tejaswi Madhusudhan, Mihai Surdeanu 2024-06-25 arXiv https://github.com/RazvanDu/LayerwiseQuant/ http://arxiv.org/abs/2406.17415v3
970 Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients Aashiq Muhamed, Oscar Li, David Woodruff, Mona Diab, Virginia Smith 2024-06-25 arXiv https://github.com/aashiqmuhamed/GRASS http://arxiv.org/abs/2406.17660v1
971 Predicting the Big Five Personality Traits in Chinese Counselling Dialogues Using Large Language Models Yang Yan, Lizhi Ma, Anqi Li, Jingsong Ma, Zhenzhong Lan 2024-06-25 arXiv https://github.com/kuri-leo/BigFive-LLM-Predictor https://doi.org/10.48550/arXiv.2406.17287
972 Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models Wenhao Shi, Zhiqiang Hu, Yi Bin, Junhua Liu, Yang Yang, See-Kiong Ng, Lidong Bing, Roy Ka-Wei Lee 2024-06-25 EMNLP https://github.com/HZQ950419/Math-LLaVA https://aclanthology.org/2024.findings-emnlp.268
973 Large Language Models are Interpretable Learners Ruochen Wang, Si Si, Felix Yu, Dorothea Wiesmann, Cho-Jui Hsieh, Inderjit S. Dhillon 2024-06-25 arXiv https://github.com/ruocwang/llm-symbolic-program https://doi.org/10.48550/arXiv.2406.17224
974 Improving Arithmetic Reasoning Ability of Large Language Models through Relation Tuples, Verification and Dynamic Feedback Zhongtao Miao, Kaiyan Zhao, Yoshimasa Tsuruoka 2024-06-25 arXiv https://github.com/gpgg/art https://doi.org/10.48550/arXiv.2406.17873
975 From Distributional to Overton Pluralism: Investigating Large Language Model Alignment Thom Lake, Eunsol Choi, Greg Durrett 2024-06-25 arXiv https://github.com/thomlake/investigating-alignment https://doi.org/10.48550/arXiv.2406.17692
976 Dual-Space Knowledge Distillation for Large Language Models Songming Zhang, Xue Zhang, Zengkui Sun, Yufeng Chen, Jinan Xu 2024-06-25 EMNLP https://github.com/songmzhang/DSKD https://aclanthology.org/2024.emnlp-main.1010
977 DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph Zhehao Zhang, Jiaao Chen, Diyi Yang 2024-06-25 arXiv https://github.com/SALT-NLP/DARG https://doi.org/10.48550/arXiv.2406.17271
978 Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers Xiuying Wei, Skander Moalla, Razvan Pascanu, Caglar Gulcehre 2024-06-24 arXiv https://github.com/CLAIRE-Labo/StructuredFFN/tree/main http://arxiv.org/abs/2406.16450v2
979 Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models Yichen Sun, Zhixuan Chu, Zhan Qin, Kui Ren 2024-06-24 arXiv https://github.com/TruthAI-Lab/PCIG http://arxiv.org/abs/2406.16333v1
980 Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs Ashwinee Panda, Berivan Isik, Xiangyu Qi, Sanmi Koyejo, Tsachy Weissman, Prateek Mittal 2024-06-24 arXiv https://github.com/kiddyboots216/lottery-ticket-adaptation http://arxiv.org/abs/2406.16797v2
981 EVALALIGN: Supervised Fine-Tuning Multimodal LLMs with Human-Aligned Data for Evaluating Text-to-Image Models Zhiyu Tan, Xiaomeng Yang, Luozheng Qin, Mengping Yang, Cheng Zhang, Hao Li 2024-06-24 arXiv https://sais-fuxi.github.io/projects/evalalign/ http://arxiv.org/abs/2406.16562v3
982 AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models Jiale Cheng, Yida Lu, Xiaotao Gu, Pei Ke, Xiao Liu, Yuxiao Dong, Hongning Wang, Jie Tang, Minlie Huang 2024-06-24 EMNLP https://github.com/thu-coai/AutoDetect https://aclanthology.org/2024.findings-emnlp.397
983 ShadowLLM: Predictor-based Contextual Sparsity for Large Language Models Yash Akhauri, Ahmed F. AbouElhamayed, Jordan Dotzel, Zhiru Zhang, Alexander M. Rush, Safeen Huda, Mohamed S. Abdelfattah 2024-06-24 EMNLP https://github.com/abdelfattah-lab/shadow_llm/ https://aclanthology.org/2024.emnlp-main.1068
984 Multi-LogiEval: Towards Evaluating Multi-Step Logical Reasoning Ability of Large Language Models Nisarg Patel, Mohith Kulkarni, Mihir Parmar, Aashna Budhiraja, Mutsumi Nakamura, Neeraj Varshney, Chitta Baral 2024-06-24 EMNLP https://github.com/Mihir3009/Multi-LogiEval https://aclanthology.org/2024.emnlp-main.1160
985 M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models Rishabh Maheshwary, Vikas Yadav, Hoang Nguyen, Khyati Mahajan, Sathwik Tejaswi Madhusudhan 2024-06-24 arXiv https://github.com/ServiceNow/M2Lingual https://doi.org/10.48550/arXiv.2406.16783
986 Large Language Models Are Cross-Lingual Knowledge-Free Reasoners Peng Hu, Sizhe Liu, Changjiang Gao, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Shujian Huang 2024-06-24 arXiv https://github.com/NJUNLP/Knowledge-Free-Reasoning https://doi.org/10.48550/arXiv.2406.16655
987 AudioBench: A Universal Benchmark for Audio Large Language Models Bin Wang, Xunlong Zou, Geyu Lin, Shuo Sun, Zhuohan Liu, Wenyu Zhang, Zhengyuan Liu, AiTi Aw, Nancy F. Chen 2024-06-23 arXiv https://github.com/AudioLLMs/AudioBench https://doi.org/10.48550/arXiv.2406.16020
988 Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models Lynn Chua, Badih Ghazi, Yangsibo Huang, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, Chulin Xie, Chiyuan Zhang 2024-06-23 arXiv https://github.com/google-research/crosslingual-knowledge-barriers https://doi.org/10.48550/arXiv.2406.16135
989 Efficient Evolutionary Search Over Chemical Space with Large Language Models Haorui Wang, Marta Skreta, Cher-Tian Ser, Wenhao Gao, Lingkai Kong, Felix Strieth-Kalthoff, Chenru Duan, Yuchen Zhuang, Yue Yu, Yanqiao Zhu, Yuanqi Du, Alán Aspuru-Guzik, Kirill Neklyudov, Chao Zhang 2024-06-23 arXiv http://github.com/zoom-wang112358/MOLLEO https://doi.org/10.48550/arXiv.2406.16976
990 FS-RAG: A Frame Semantics Based Approach for Improved Factual Accuracy in Large Language Models Harish Tayyar Madabushi 2024-06-23 arXiv https://github.com/H-TayyarMadabushi/A-Frame-Semantics-based-approach-for-Improved-Factual-Accuracy-in-Large-Language-Models https://doi.org/10.48550/arXiv.2406.16167
991 FastMem: Fast Memorization of Prompt Improves Context Awareness of Large Language Models Junyi Zhu, Shuochen Liu, Yu Yu, Bo Tang, Yibo Yan, Zhiyu Li, Feiyu Xiong, Tong Xu, Matthew B. Blaschko 2024-06-23 EMNLP https://github.com/IAAR-Shanghai/FastMem https://aclanthology.org/2024.findings-emnlp.687
992 Can LLM Graph Reasoning Generalize beyond Pattern Memorization? Yizhuo Zhang, Heng Wang, Shangbin Feng, Zhaoxuan Tan, Xiaochuang Han, Tianxing He, Yulia Tsvetkov 2024-06-23 arXiv https://github.com/MatthewYZhang/NLGift http://arxiv.org/abs/2406.15992v2
993 SS-GEN: A Social Story Generation Framework with Large Language Models Yi Feng, Mingyang Song, Jiaqi Wang, Zhuang Chen, Guanqun Bi, Minlie Huang, Liping Jing, Jian Yu 2024-06-22 arXiv https://github.com/MIMIFY/SS-GEN http://arxiv.org/abs/2406.15695v2
994 Ladder: A Model-Agnostic Framework Boosting LLM-based Machine Translation to the Next Level Zhaopeng Feng, Ruizhe Chen, Yan Zhang, Zijie Meng, Zuozhu Liu 2024-06-22 arXiv https://github.com/fzp0424/MT-Ladder http://arxiv.org/abs/2406.15741v3
995 RuleR: Improving LLM Controllability by Rule-based Data Recycling Ming Li, Han Chen, Chenguang Wang, Dang Nguyen, Dianqi Li, Tianyi Zhou 2024-06-22 arXiv https://github.com/tianyi-lab/RuleR http://arxiv.org/abs/2406.15938v3
996 Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration Zhongzhi Yu, Zheng Wang, Yonggan Fu, Huihong Shi, Khalid Shaikh, Yingyan Celine Lin 2024-06-22 ICML https://github.com/GATECH-EIC/ACT https://openreview.net/forum?id=DLTjFFiuUJ
997 video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang 2024-06-22 ICML https://github.com/bytedance/SALMONN/ https://openreview.net/forum?id=nYsh5GFIqX
998 The Music Maestro or The Musically Challenged, A Massive Music Evaluation Benchmark for Large Language Models Jiajia Li, Lu Yang, Mingni Tang, Chenchong Chenchong, Zuchao Li, Ping Wang, Hai Zhao 2024-06-22 ACL https://github.com/zcli-charlie/ZIQI-Eval https://doi.org/10.18653/v1/2024.findings-acl.194
999 Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph Roman Vashurin, Ekaterina Fadeeva, Artem Vazhentsev, Lyudmila Rvanova, Akim Tsvigun, Daniil Vasilev, Rui Xing, Abdelrahman Boda Sadallah, Kirill Grishchenkov, Sergey Petrakov, Alexander Panchenko, Timothy Baldwin, Preslav Nakov, Maxim Panov, Artem Shelmanov 2024-06-21 arXiv https://github.com/IINemo/lm-polygraph https://doi.org/10.48550/arXiv.2406.15627
1000 ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models Haiquan Zhao, Lingyu Li, Shisong Chen, Shuqi Kong, Jiaan Wang, Kexin Huang, Tianle Gu, Yixu Wang, Jian Wang, Dandan Liang, Zhixu Li, Yan Teng, Yanghua Xiao, Yingchun Wang 2024-06-21 arXiv https://github.com/AIFlames/Esc-Eval https://doi.org/10.48550/arXiv.2406.14952
1001 GIEBench: Towards Holistic Evaluation of Group Identity-based Empathy for Large Language Models Leyan Wang, Yonggang Jin, Tianhao Shen, Tianyu Zheng, Xinrun Du, Chenchen Zhang, Wenhao Huang, Jiaheng Liu, Shi Wang, Ge Zhang, Liuyu Xiang, Zhaofeng He 2024-06-21 arXiv https://github.com/GIEBench/GIEBench https://doi.org/10.48550/arXiv.2406.14903
1002 Leveraging Passage Embeddings for Efficient Listwise Reranking with Large Language Models Qi Liu, Bo Wang, Nan Wang, Jiaxin Mao 2024-06-21 arXiv https://github.com/liuqi6777/pe_rank https://doi.org/10.48550/arXiv.2406.14848