Advances in Secure and Reliable Code Generation with Large Language Models

Research on large language models (LLMs) for code generation is increasingly focused on security and reliability. Recent work highlights the risks of unintended memorization and malicious disclosure of sensitive information, alongside the challenges posed by API misuse and the execution of untrusted, model-generated code. To mitigate these risks, researchers are evaluating the security and confidentiality properties of the test environments that run untrusted code and developing automatic program repair techniques for API misuse. Interest is also growing in optimizing container rebuild efficiency and in benchmarking how well code LLMs perform Android malware analysis tasks.

Two papers are particularly noteworthy. 'Identifying and Mitigating API Misuse in Large Language Models' proposes a taxonomy of API misuse types together with an LLM-based automatic program repair approach; a hedged sketch of that detect-then-repair workflow appears below. 'SandboxEval: Towards Securing Test Environment for Untrusted Code' introduces a test suite that probes whether a test environment leaks sensitive information to the code it executes; a toy probe in that spirit follows as well. Overall, the field is moving toward code generation systems that are secure, reliable, and efficient, with explicit attention to the risks unique to LLMs.
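The detect-then-repair pipeline described for API misuse can be pictured with a small example. The sketch below is an illustrative assumption, not the paper's method: it flags one well-known misuse of the Python requests library (network calls without a timeout) via AST matching, then assembles a repair prompt that an LLM could answer with a patched snippet. The misuse rule, the SNIPPET, and the prompt format are all hypothetical.

```python
# Hypothetical sketch: rule-based API-misuse detection followed by an
# LLM-oriented repair prompt. Not the approach from "Identifying and
# Mitigating API Misuse in Large Language Models".
import ast
import textwrap

# Assumed example input: a snippet exhibiting a classic requests misuse.
SNIPPET = textwrap.dedent("""
    import requests

    def fetch(url):
        return requests.get(url)  # misuse: no timeout, call can hang forever
""")

def find_requests_without_timeout(source: str) -> list[int]:
    """Return line numbers of requests.get/post calls lacking a timeout kwarg."""
    misuses = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (isinstance(func, ast.Attribute)
                and isinstance(func.value, ast.Name)
                and func.value.id == "requests"
                and func.attr in {"get", "post"}
                and not any(kw.arg == "timeout" for kw in node.keywords)):
            misuses.append(node.lineno)
    return misuses

def build_repair_prompt(source: str, lines: list[int]) -> str:
    """Assemble a repair prompt an LLM could answer with patched code."""
    return (
        "The following Python code misuses the requests API on "
        f"line(s) {lines}: network calls lack a timeout.\n"
        "Rewrite the code so every requests call passes an explicit timeout.\n\n"
        f"{source}"
    )

if __name__ == "__main__":
    hits = find_requests_without_timeout(SNIPPET)
    if hits:
        print(build_repair_prompt(SNIPPET, hits))
```

In the paper's framing, a taxonomy of misuse types would drive many such detection rules, with the LLM closing the loop by producing the repaired code.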
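Likewise, the idea behind SandboxEval can be illustrated with a toy probe that runs inside a candidate test environment and reports which sensitive capabilities untrusted code could still reach. This is a minimal sketch under assumed probes (host file reads, outbound networking, credential-like environment variables), not SandboxEval's actual test suite.

```python
# Toy sandbox probe in the spirit of SandboxEval. The specific probes and
# report format are assumptions, not the paper's test suite.
import os
import socket

def probe(name: str, action) -> None:
    """Attempt a sensitive action and report whether the sandbox blocked it."""
    try:
        action()
        print(f"[LEAK]    {name}: allowed")
    except Exception as exc:
        print(f"[BLOCKED] {name}: {type(exc).__name__}")

def read_host_file():
    # Host account names should be unreadable from inside the sandbox.
    with open("/etc/passwd") as f:
        f.read()

def open_outbound_socket():
    # Untrusted code should not be able to exfiltrate data over the network.
    with socket.create_connection(("example.com", 80), timeout=3):
        pass

def read_secret_env():
    # Credentials inherited from the host should be scrubbed before tests run.
    leaked = [k for k in os.environ
              if "KEY" in k or "TOKEN" in k or "SECRET" in k]
    if not leaked:
        raise LookupError("no credential-like variables visible")
    print(f"          visible: {leaked}")

if __name__ == "__main__":
    probe("read /etc/passwd", read_host_file)
    probe("open outbound TCP connection", open_outbound_socket)
    probe("read credential-like env vars", read_secret_env)
```

A hardened test environment should report every probe as blocked; any [LEAK] line marks a channel through which generated code could read or exfiltrate sensitive information.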

Sources

Malicious and Unintentional Disclosure Risks in Large Language Models for Code Generation

Identifying and Mitigating API Misuse in Large Language Models

SandboxEval: Towards Securing Test Environment for Untrusted Code

On Benchmarking Code LLMs for Android Malware Analysis

Doctor: Optimizing Container Rebuild Efficiency by Instruction Re-Orchestration

Code Red! On the Harmfulness of Applying Off-the-shelf Large Language Models to Programming Tasks

Build Code Needs Maintenance Too: A Study on Refactoring and Technical Debt in Build Systems

Towards Assessing Deep Learning Test Input Generators

RBR4DNN: Requirements-based Testing of Neural Networks
