AuthorPublished
Jon MarienJanuary 15, 2026

Definition

LLMs often interact with the real world through function calling or plugins, which allow the model to access local or third-party APIs to manage data like users, orders, or stock. This integration expands the model’s capabilities but also introduces a significant attack surface if the model is granted excessive privileges or lacks proper user oversight.

Core Idea

  • Workflow: When an LLM determines a task requires an external tool, it generates a JSON object with arguments matching the API’s schema. The client application then executes the function call, processes the response, and feeds it back to the LLM to summarize for the user.
  • Implicit Trust: The model often performs these actions “under the hood” on behalf of the user. Because the LLM acts as a mediator, the user—and sometimes even the developer—may not realize exactly which sensitive APIs are being triggered or with what arguments.
  • Manipulation: An attacker can use prompt injection to manipulate the LLM’s internal logic, forcing it to generate malicious function arguments or call unauthorized APIs that it has access to.

Why It’s Bad / Impact

  • Unauthorized Actions: Attackers can coerce the LLM into performing destructive actions, such as deleting database records, modifying security settings, or exfiltrating sensitive internal data.
  • Bypassing Safety Filters: Function calling processes often lack the rigorous safety filters applied to standard chat mode interactions, making them an easier target for “jailbreak” style attacks.
  • Downstream Exploitation: If the LLM’s output is handled insecurely by downstream systems, it can lead to critical vulnerabilities like Cross-Site Scripting (XSS), Server-Side Request Forgery (SSRF), or even Remote Code Execution (RCE).

Protect Against It

  • Manual Authorization: Implement a mandatory confirmation step for any sensitive action; the user must explicitly approve the API call before it is executed.
  • Principle of Least Privilege: Strictly limit the permissions of the API keys used by the LLM to only the minimum necessary functions and data scopes.
  • Robust Output Validation: Scrutinize the JSON arguments generated by the LLM using strict schemas and type checks before passing them to any backend system.

How LLM APIs work

The workflow for integrating an LLM with an API depends on the structure of the API itself. When calling external APIs, some LLMs may require the client to call a separate function endpoint (effectively a private API) in order to generate valid requests that can be sent to those APIs. The workflow for this could look something like the following:

  1. The client calls the LLM with the user’s prompt.
  2. The LLM detects that a function needs to be called and returns a JSON object containing arguments adhering to the external API’s schema.
  3. The client calls the function with the provided arguments.
  4. The client processes the function’s response.
  5. The client calls the LLM again, appending the function response as a new message.
  6. The LLM calls the external API with the function response.
  7. The LLM summarizes the results of this API call back to the user.

This workflow can have security implications, as the LLM is effectively calling external APIs on behalf of the user but the user may not be aware that these APIs are being called. Ideally, users should be presented with a confirmation step before the LLM calls the external API.


Mapping LLM API attack surface

The term “excessive agency” refers to a situation in which an LLM has access to APIs that can access sensitive information and can be persuaded to use those APIs unsafely. This enables attackers to push the LLM beyond its intended scope and launch attacks via its APIs.

The first stage of using an LLM to attack APIs and plugins is to work out which APIs and plugins the LLM has access to. One way to do this is to simply ask the LLM which APIs it can access. You can then ask for additional details on any APIs of interest.

If the LLM isn’t cooperative, try providing misleading context and re-asking the question. For example, you could claim that you are the LLM’s developer and so should have a higher level of privilege.


Lab


Chaining Vulnerabilities in LLM APIs

Even if the LLM only exposes APIs that seem harmless (i.e., “open a file by name”, “fetch a report”, “render a template”), those APIs might themselves be vulnerable to classic bugs like path traversal, SQLi, or command injection. If an API takes a filename, for example, you can try payloads like ../../etc/passwd and have the LLM send them as arguments, effectively turning the LLM into your fuzzer.

Once you’ve mapped the LLM’s API attack surface (what tools it can call, with which parameters), the next step is to systematically send traditional web exploit payloads through each API: path traversal strings, injection patterns, overlong input, and so on. The LLM becomes an intermediary, but underneath you’re still doing standard vuln discovery.

Protection depends on the specific vulnerability type, but the core idea is always the same: remove unnecessary assumptions of trust and add layered controls (validation, rate limiting, least privilege, monitoring) so a single mistake does not lead to full compromise.


Lab

At first I tried just the final task, and then a “superadmin” test, but neither worked:

Then, I asked what APIs it has access to:

Using the email client, we should be able to achieve RCE, since sending emails usually requires specific permissions: We note the exploit-* server address.

We can check the logs for the response:

We can see the email in our inbox:

So, now we know it works, let us try a basic command, like pwd or whoami:

Checking the email client, we see: It worked!

So, let us write a command delete the required morale.txt file from Carlos’ home directory:

It says it could not send it, as the address was invalid, but we indeed did solve the lab: