Advancements in Large Language Models for Software Engineering
This week's research highlights significant progress in applying Large Language Models (LLMs) across many facets of software engineering, from code completion and function-calling to security and testing. A common theme across these studies is the push to make LLMs more adaptable, precise, and efficient so they can meet specific enterprise needs and handle real-world scenarios.
Function-Calling and Code Completion
Innovations in function-calling and code completion are particularly noteworthy. Researchers have developed specialized training pipelines for scenario-specific function-calling models, introduced benchmarking frameworks for fine-grained evaluation in real mobile device scenarios, and improved fill-in-the-middle (FIM) code completion through context- and curriculum-based learning. A new benchmark for evaluating autocompletion of interactions with LLM-based chatbots is another notable step toward streamlining user interactions.
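To make the function-calling pattern these papers target concrete, here is a minimal sketch: a tool schema is exposed to the model, the model emits a structured call, and the host program dispatches it to a real function. The `call_model` stub, the JSON call format, and the `get_weather` tool are illustrative assumptions, not the pipelines or benchmarks described in the papers above.

```python
import json

# Hypothetical tool schema in the style commonly used for LLM function calling;
# the exact format varies between models and the cited papers.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(city: str) -> str:
    # Stand-in implementation; a real system would query a weather service.
    return f"Weather for {city}: 22C, clear"


def call_model(prompt: str, tools: list[dict]) -> str:
    # Placeholder for an LLM call; returns a canned JSON tool invocation so
    # the example runs without any model or API access.
    return json.dumps({"tool": "get_weather", "arguments": {"city": "Berlin"}})


def run(user_message: str) -> str:
    # 1. Ask the model, exposing the tool schema.
    raw = call_model(user_message, tools=[WEATHER_TOOL])
    # 2. Parse the structured tool call the model is expected to emit.
    call = json.loads(raw)
    # 3. Dispatch to the matching Python function.
    if call["tool"] == "get_weather":
        return get_weather(**call["arguments"])
    raise ValueError(f"Unknown tool: {call['tool']}")


if __name__ == "__main__":
    print(run("What's the weather in Berlin?"))
```

Scenario-specific training and fine-grained benchmarks largely target the middle step of this loop: getting the model to pick the right tool and emit well-formed arguments for the scenarios that matter to a given deployment.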
Software Security and Testing
In software security and testing, LLMs are being applied to vulnerability detection across multiple programming languages, and hybrid fuzzing techniques integrate LLMs to overcome limitations of symbolic execution. Systematic evaluations of LLMs for unit testing show that they can outperform existing methods, underscoring the potential of both fine-tuning and prompt engineering approaches.
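As a rough illustration of the prompt-engineering side of LLM-based unit testing, the sketch below assembles a prompt from a focal method and its context and hands it to a placeholder model callable. The prompt wording, the `generate_tests` helper, and the example file name are assumptions for illustration, not the setup used in the cited study.

```python
# Illustrative prompt template; the wording is an assumption, not taken from
# the cited empirical study.
PROMPT_TEMPLATE = """You are a unit test generator.

Focal method under test:
{focal_method}

Surrounding context:
{context}

Write pytest test functions covering normal inputs, edge cases, and expected
exceptions. Return only Python code.
"""


def build_unit_test_prompt(focal_method: str, context: str) -> str:
    # Fill the template with the method under test and any helpful context
    # (imports, enclosing class, docstrings).
    return PROMPT_TEMPLATE.format(focal_method=focal_method, context=context)


def generate_tests(focal_method: str, context: str, model) -> str:
    # `model` is a placeholder callable (prompt -> completion) standing in for
    # a prompted or fine-tuned LLM.
    return model(build_unit_test_prompt(focal_method, context))


if __name__ == "__main__":
    # Fake model so the sketch runs without any LLM access.
    fake_model = lambda prompt: "def test_add():\n    assert add(2, 3) == 5\n"
    focal = "def add(a, b):\n    return a + b"
    print(generate_tests(focal, "module-level function in calculator.py", fake_model))
```

Fine-tuning swaps the generic prompt for supervised training on pairs of focal methods and human-written tests; the empirical work listed below compares how far each approach goes.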
Code Review and Generation
The field is also witnessing advancements in code review and generation, with the development of new datasets, benchmarks, and tools aimed at improving the detection of AI-generated code, automating code reviews, and generating functional web UIs from designs. The importance of open science and reproducibility in AI research is increasingly emphasized, underscoring the need for accessible code and well-documented data.
Noteworthy Innovations
- Adaptable and Precise: Enterprise-Scenario LLM Function-Calling Capability Training Pipeline
- HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Device Scenarios
- Improving FIM Code Completions via Context & Curriculum Based Learning
- ADC: Enhancing Function Calling Via Adversarial Datasets and Code Line-Level Feedback
- ChaI-TeA: A Benchmark for Evaluating Autocompletion of Interactions with LLM-based Chatbots
- Vulnerability Detection in Popular Programming Languages with Language Models
- Large Language Model assisted Hybrid Fuzzing
- A Large-scale Empirical Study on Fine-tuning Large Language Models for Unit Testing
- MRWeb: Generating multi-page, resource-aware web UIs from designs
- The Unreasonable Effectiveness of Open Science in AI
These developments underscore the dynamic nature of research in LLMs and their application in software engineering, pointing towards a future where these models play an even more integral role in the development lifecycle.