Advances in Software Security through Large Language Models

Software security research is increasingly turning to Large Language Models (LLMs) for vulnerability detection, code analysis, and testing. Recent work reports that LLMs can identify security risks, detect code smells, and generate highly structured test inputs. Supplying contextual information and drawing on a model's internal states have both been reported to improve performance in vulnerability detection and code analysis. Fine-tuned Small Language Models (SLMs) have also been reported to be accurate and efficient at detecting Common Weakness Enumeration (CWE) entries in Python code. Noteworthy papers include GraphQLer, a context-aware security testing framework for GraphQL APIs, and MoCQ, a holistic neuro-symbolic framework for automated static vulnerability detection. Cottontail, an LLM-driven concolic execution engine, shows promising results in generating highly structured test inputs for parsing programs.

Sources

GraphQLer: Enhancing GraphQL Security with Context-Aware API Testing

Everything You Wanted to Know About LLM-based Vulnerability Detection But Were Afraid to Ask

Simplicity by Obfuscation: Evaluating LLM-Driven Code Transformation with Semantic Elasticity

Risk Assessment Framework for Code LLMs via Leveraging Internal States

Benchmarking LLM for Code Smells Detection: OpenAI GPT-4.0 vs DeepSeek-V3

Automated Static Vulnerability Detection via a Holistic Neuro-symbolic Approach

Harden and Catch for Just-in-Time Assured LLM-Based Software Testing: Open Research Challenges

Case Study: Fine-tuning Small Language Models for Accurate and Private CWE Detection in Python Code

Context-Enhanced Vulnerability Detection Based on Large Language Model

In-Context Learning can distort the relationship between sequence likelihoods and biological fitness

Large Language Model-Driven Concolic Execution for Highly Structured Test Input Generation
