Adding Cost Protection to My Bedrock Agent

Protect public API Gateway endpoints from abuse using usage plans and budget alerts. This post covers practical strategies to limit requests and monitor costs, and explores when AWS WAF or authentication might be needed.

Posted Jan 11, 2026 Updated Feb 2, 2026

By Deniz Yilmaz

9 min read

The Problem

I recorded a demo video for my AWS News Search agent and realised my API Gateway endpoint was visible on screen. My first thought was to regenerate the URL so the one in the video no longer worked.

Then I realised it doesn’t matter.

Anyone can open browser dev tools, go to the Network tab, and see the exact endpoint the frontend is calling. Hiding the URL in a video provides no real protection. If someone wants the endpoint, they can get it in seconds.

The actual risk is cost. Every request to the API invokes a Lambda function, which calls a Bedrock agent. Bedrock charges per token. If someone decides to spam the endpoint, my AWS bill goes up.

Protection Options

Rather than trying to hide the endpoint, focus on limiting what can be done with it.

Amazon API Gateway Usage Plans

API Gateway usage plans let you set throttling limits and quotas. You can restrict requests per second and cap the total number of requests per day or month.

For a demo project, a low daily quota works well. Set it to 100-500 requests per day. That’s enough for legitimate use but prevents someone from running thousands of requests overnight.

To create a usage plan:

Open the API Gateway console
Select Usage Plans from the navigation pane
Select Create usage plan

Configure the usage plan details:

Name: aws-news-agent-plan
Throttling: Enabled
- Rate: 1 (requests per second)
- Burst: 5 (allows short spikes)
Quota: Enabled
- Requests: 100
- Period: Per day

Create the usage plan.
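
API Gateway documents its throttling as a token bucket: the rate is how fast tokens refill and the burst is the size of the bucket. The sketch below is a simplified simulation of the Rate: 1 / Burst: 5 settings above, not API Gateway's actual implementation. The `createTokenBucket` helper and its timing model are assumptions made for illustration.

```javascript
// Simplified token bucket: `rate` tokens refill per second, up to `burst` tokens.
// Hypothetical helper for illustration only; it is not API Gateway's own code.
function createTokenBucket(rate, burst) {
    let tokens = burst;   // the bucket starts full
    let lastTime = 0;     // arrival time of the previous request, in seconds

    // Returns true if a request arriving at `now` (seconds) is allowed.
    return function allowRequest(now) {
        const elapsed = now - lastTime;
        tokens = Math.min(burst, tokens + elapsed * rate);
        lastTime = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false; // API Gateway would answer 429 Too Many Requests here
    };
}

// Ten requests arriving at the same instant against Rate: 1, Burst: 5.
const allow = createTokenBucket(1, 5);
const results = Array.from({ length: 10 }, () => allow(0));
const allowedCount = results.filter(Boolean).length;
console.log(allowedCount); // 5

// Two seconds later, two tokens have refilled.
const later = [allow(2), allow(2), allow(2)];
console.log(later); // [ true, true, false ]
```

In this model the burst value absorbs a short spike, after which requests are admitted at roughly the configured rate.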

Associate the API Stage

After creating the plan, associate it with your API:

In the usage plan details, select Add Stage
Select your API (aws-news-agent-api)
Select the stage (prod)
Select Add to usage plan

Create an API Key

An API key is a string that API Gateway generates. Clients include it in requests to identify themselves. The key links requests to your usage plan, which enforces the throttling and quota limits.

In the API Gateway console, select API Keys from the navigation pane
Select Create API key
Enter a name (e.g. aws-news-agent-key)
Select Save

Copy the API key value. You will need it for the frontend.

The API key is not a security mechanism on its own. Anyone with the key can use it. The protection comes from the usage plan limits attached to it.

Attach the Key to the Usage Plan

Open your usage plan (aws-news-agent-plan)
Select the Associated API keys tab
Select Add API key
Select the key you created
Select Add API key

Require the API Key on the Method

In the API Gateway console, select your API
Select the /agent resource and the POST method
Select the Method request tab
Select Edit
Select the API key required checkbox
Select Save
Select Deploy API and deploy to the prod stage

After enabling API key required, requests without a valid key receive a 403 Forbidden response. Test your endpoint to confirm the key is working before updating the frontend.

Test without the API key:

  
curl -X POST https://your-api-id.execute-api.us-east-1.amazonaws.com/prod/agent \
  -H "Content-Type: application/json" \
  -d '{"message": "test"}'

Expected response:

  
{"message":"Forbidden"}

Test with the API key:

  
curl -X POST https://your-api-id.execute-api.us-east-1.amazonaws.com/prod/agent \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key-here" \
  -d '{"message": "What is Lambda?"}'

Expected response: A valid response from the agent.

Update the Frontend

The frontend now needs to include the API key in the x-api-key header:

  
// The key ships to the browser, so it identifies the usage plan; it is not a secret.
const response = await fetch(API_ENDPOINT, {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
        'x-api-key': 'your-api-key-here'
    },
    body: JSON.stringify({ message, sessionId })
});

The key is visible in browser dev tools, but combined with the usage plan quota, the damage from abuse is limited to 100 requests per day.
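
Once the usage plan is enforced, the frontend will occasionally receive non-200 responses: API Gateway returns 403 when the key is missing or invalid, and 429 when the throttle or quota is exceeded. A minimal sketch of handling those statuses follows. The `describeGatewayStatus` helper and the wording of its messages are assumptions, not part of the original frontend.

```javascript
// Hypothetical helper: maps an HTTP status from API Gateway to a message for the UI.
// Returns null when there is no error to show.
function describeGatewayStatus(status) {
    if (status === 429) {
        // Sent when the throttle or the usage plan quota has been exceeded.
        return 'The demo has reached its request limit. Please try again later.';
    }
    if (status === 403) {
        // Sent when the x-api-key header is missing or invalid.
        return 'The request was rejected. Check the API key configuration.';
    }
    if (status >= 200 && status < 300) {
        return null;
    }
    return `Unexpected response from the API (HTTP ${status}).`;
}

console.log(describeGatewayStatus(429));
console.log(describeGatewayStatus(200)); // null
```

With the fetch call above, this could be used as `const error = describeGatewayStatus(response.status)` before reading the response body.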

AWS WAF

AWS WAF can block or rate-limit requests based on IP address. If a single IP sends too many requests in a short period, WAF blocks that IP while allowing others through.

The difference between WAF and usage plans:

Usage plans limit requests per API key. Because the frontend ships a single key, every visitor shares one quota, and one person could consume all of it.
WAF rate-based rules limit requests per IP address. One abusive user gets blocked, but others can still access the API.

WAF matters when you have multiple users and want to protect against one bad actor consuming the entire quota. For a demo project where you’re the main user, usage plans are sufficient.
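
The difference can be illustrated with a small counter model. This is a sketch under simplifying assumptions: it ignores time windows entirely, the per-IP limit of 20 is an arbitrary number, and the `simulate` function and IP addresses are hypothetical. Real WAF rate-based rules evaluate requests over a rolling time window.

```javascript
// Counts how many requests each client gets through under two policies.
// Simplified model with plain counters; names and limits are hypothetical.
function simulate(requests, { sharedQuota, perIpLimit }) {
    let sharedUsed = 0;
    const perIpUsed = new Map();
    const outcome = { sharedQuota: {}, perIpLimit: {} };

    for (const ip of requests) {
        // Policy 1: one quota shared by everyone using the same API key.
        if (sharedUsed < sharedQuota) {
            sharedUsed += 1;
            outcome.sharedQuota[ip] = (outcome.sharedQuota[ip] || 0) + 1;
        }
        // Policy 2: each IP address has its own limit.
        const used = perIpUsed.get(ip) || 0;
        if (used < perIpLimit) {
            perIpUsed.set(ip, used + 1);
            outcome.perIpLimit[ip] = (outcome.perIpLimit[ip] || 0) + 1;
        }
    }
    return outcome;
}

// An abusive client sends 100 requests, then a normal visitor sends 3.
const traffic = [
    ...Array(100).fill('203.0.113.10'),
    ...Array(3).fill('198.51.100.7')
];
const served = simulate(traffic, { sharedQuota: 100, perIpLimit: 20 });
console.log(served.sharedQuota); // { '203.0.113.10': 100 }
console.log(served.perIpLimit);  // { '203.0.113.10': 20, '198.51.100.7': 3 }
```

Under the shared quota the normal visitor is locked out entirely; under the per-IP limit only the abusive client is capped.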

Pricing

WAF pricing in eu-west-2 (London):

Resource	Cost
Web ACL	$5.00/month
Rule	$1.00/month per rule
Requests	$0.60 per 1 million

A basic WAF setup with one rate limit rule costs around $6/month. For a low-traffic demo already protected by usage plans, the benefit doesn’t justify the added cost.
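
The $6 figure comes from adding the fixed charges in the table above to a small per-request charge. A quick sketch of that arithmetic follows; the monthly request counts are assumed traffic figures, and `estimateWafMonthlyCost` is a hypothetical helper.

```javascript
// Prices from the table above (eu-west-2), in USD.
const WEB_ACL_PER_MONTH = 5.0;
const RULE_PER_MONTH = 1.0;
const PER_MILLION_REQUESTS = 0.6;

// Hypothetical helper: fixed monthly charges plus the per-request charge.
function estimateWafMonthlyCost(ruleCount, requestsPerMonth) {
    const requestCost = (requestsPerMonth / 1_000_000) * PER_MILLION_REQUESTS;
    return WEB_ACL_PER_MONTH + ruleCount * RULE_PER_MONTH + requestCost;
}

// One rate limit rule with an assumed 3,100 requests in a month.
const demoCost = estimateWafMonthlyCost(1, 3100);
console.log(demoCost.toFixed(2)); // "6.00"

// The same setup with an assumed 1 million requests in a month.
const busyCost = estimateWafMonthlyCost(1, 1_000_000);
console.log(busyCost.toFixed(2)); // "6.60"
```

At demo traffic levels the request charge is negligible, so the cost is almost entirely the fixed Web ACL and rule fees.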

For production APIs with multiple users or higher traffic, WAF provides an extra layer of protection worth considering.

AWS Budgets

Even with API Gateway usage plans configured, a budget alert provides a safety net for unexpected costs. I already maintain a general budget that tracks overall AWS spend. For this project, I needed a separate budget that tracks only costs generated by the AWS News agent.

Before creating a project-scoped budget, cost allocation tagging must be configured for all resources in the project.

Tag Your Resources

Apply a project tag to each resource associated with the agent:

search-aws-news (Lambda function)
invoke-agent (Lambda function)
aws-news-agent-api (API Gateway)
aws-news-agent (Amazon Bedrock Agent)

For each resource:

Open the resource in the AWS Management Console
Select the Tags tab
Add the following tag:
- Key: Project
- Value: aws-news-agent

Activate Cost Allocation Tags

Cost allocation tags must be activated before they can be used in AWS Budgets.

Open the Billing and Cost Management console
Select Cost allocation tags from the navigation pane
Locate the Project tag under User-defined cost allocation tags
Select the tag and choose Activate

Tags can take up to 24 hours to appear after resources are tagged, and up to another 24 hours to appear in AWS Budgets after activation. This delay occurs because AWS processes billing data in batches rather than in real time. Plan for up to 48 hours before filtering budgets by the tag.

The Problem with Standard Tagging

After tagging resources and activating the tag, I created an AWS Budget filtered by Project = aws-news-agent.

This captured Lambda and API Gateway costs, but it did not capture Amazon Bedrock model invocation charges. Most of the spend was missing from the budget.

Amazon Bedrock requires additional configuration because on-demand foundation model invocation costs are not automatically associated with project-level tags. Tagging the Bedrock Agent resource does not apply tags to model usage. Model invocation costs must be associated through an application inference profile.

I also considered scoping the budget by service, such as Amazon Bedrock and Amazon API Gateway. That approach was not suitable because those services are used by other projects in the same account. A service-based budget would include unrelated costs.

To isolate costs for this agent, tagging is still required. Standard resources are tagged directly. Bedrock model usage requires an application inference profile, which is covered next.

The Solution: Application Inference Profiles

Amazon Bedrock provides Application Inference Profiles to associate cost allocation tags with model invocation charges for on-demand foundation models. An inference profile references a foundation model and supports tagging. When the agent invokes the model through the profile, the tags are included in billing data.

Implementation involves four steps:

Create an inference profile with the Project: aws-news-agent tag using the AWS CLI
Update the agent execution role to allow invocation of the inference profile
Update the agent to use the inference profile ARN instead of the model ID
Update the agent alias to apply the configuration
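
To make the first step more concrete, the sketch below builds the request input for Bedrock's CreateInferenceProfile API, which is the operation behind the CLI command. It only constructs the object and does not call AWS. The profile name and the model ARN are placeholders, and `buildInferenceProfileInput` is a hypothetical helper; check the field names against the current API reference before relying on them.

```javascript
// Hypothetical helper: builds the input for Bedrock's CreateInferenceProfile API.
// Nothing is sent to AWS here; the object would be passed to the SDK or CLI.
function buildInferenceProfileInput(profileName, modelArn, projectTag) {
    return {
        inferenceProfileName: profileName,
        // The foundation model the profile routes requests to.
        modelSource: { copyFrom: modelArn },
        // The cost allocation tag that later appears in billing data.
        tags: [{ key: 'Project', value: projectTag }]
    };
}

// Placeholder values: replace REGION and MODEL_ID with the agent's real model.
const profileInput = buildInferenceProfileInput(
    'aws-news-agent-profile',
    'arn:aws:bedrock:REGION::foundation-model/MODEL_ID',
    'aws-news-agent'
);
console.log(JSON.stringify(profileInput, null, 2));
```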

For detailed step-by-step instructions including CLI commands, IAM policies, and verification steps, see Cost Allocation Tags for Amazon Bedrock Agents.

Creating the Budget

Once costs appear in Cost Explorer under your tag (allow 24-48 hours after invoking the agent), create your budget.

Open the AWS Budgets console
Select Create budget
Select Customize (advanced)
Select Cost budget and select Next
Enter a budget name (aws-news-agent-budget)
For Period, select Monthly
For Budget renewal type, select Recurring budget
For Start month, select the current month
For Budgeting method, keep Fixed
Enter your budgeted amount ($10)
Under Budget scope, select Filter specific AWS cost dimensions
Select Add filter
Select Tag from the Dimension dropdown
Under Tag, select Project
Keep Includes
Select aws-news-agent and apply the filter
Leave Aggregate costs by as Unblended costs
Select Next
Select Add an alert threshold
Set Threshold to 80 (% of budgeted amount)
Set Trigger to Actual
Under Notification preferences, enter your email address
Select Next
Review your configuration and select Create budget

The budget now tracks only costs from resources tagged with Project: aws-news-agent, separate from your other AWS usage.
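
As a sanity check on these numbers, the sketch below works out when the alert fires and how expensive an average request would need to be for a fully used quota to reach it. It assumes the 100 requests per day quota is exhausted every day of a 31-day month and ignores any costs not tied to requests.

```javascript
// Values from the budget configured above.
const budgetUsd = 10;
const thresholdPercent = 80;

// Assumption: the daily quota of 100 requests is fully used for 31 days.
const maxRequestsPerMonth = 100 * 31;

// Spend at which the alert fires.
const alertAtUsd = (budgetUsd * thresholdPercent) / 100;

// Average cost per request at which a fully used quota would reach the alert.
const costPerRequestToAlert = alertAtUsd / maxRequestsPerMonth;

console.log(alertAtUsd); // 8
console.log(costPerRequestToAlert.toFixed(4)); // "0.0026"
```

If the real average cost per request is well below that figure, the quota alone keeps spend under the alert, and the budget mainly guards against misconfiguration.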

[Image: AWS Budget alert for the aws-news-agent project]

SNS alternative
Instead of email, you can select an SNS topic ARN for notifications. This is useful if you want to trigger Lambda functions or integrate with other alerting systems.
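
A minimal sketch of a Lambda function subscribed to that SNS topic is shown below. SNS delivers each notification under `event.Records[].Sns`; what the function does with it (here, logging the subject and returning a count) is an assumption for illustration, and the sample event text is made up.

```javascript
// Pulls the subject and message out of each SNS record in a Lambda event.
function extractNotifications(event) {
    return (event.Records || []).map((record) => ({
        subject: record.Sns.Subject,
        message: record.Sns.Message
    }));
}

// Handler sketch. In a real deployment it would be exported,
// for example with `exports.handler = handler;`.
async function handler(event) {
    const notifications = extractNotifications(event);
    for (const notification of notifications) {
        console.log(`Budget notification: ${notification.subject}`);
    }
    return { received: notifications.length };
}

// Local check with a hand-written event; subject and message are placeholders.
const sampleEvent = {
    Records: [
        { Sns: { Subject: 'Example budget alert', Message: 'Example message body' } }
    ]
};
const parsed = extractNotifications(sampleEvent);
console.log(parsed.length); // 1
```

From here the handler could forward the message to a chat webhook or take a protective action, depending on how aggressive the response to overspend should be.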

Summary

Hiding API endpoints in videos or source code does not provide protection. The endpoint remains visible in browser network requests and can be accessed directly.

For this demo agent, a usage plan with a daily quota of 100 requests was configured, along with a budget alert filtered by the project tag. The API remains public, but request volume is limited and unexpected spending triggers a notification.

AWS WAF was evaluated as an option for rate limiting individual IP addresses. For a low-traffic demo already protected by API Gateway usage plans, the additional cost was not justified.

The controls covered in this post (usage plans, WAF, and budget alerts) limit request volume and monitor spend, but they do not restrict who can access the API. For production APIs that handle sensitive data, additional access controls should be implemented:

Amazon Cognito: Users authenticate and receive a token that API Gateway validates before allowing requests.
IAM authorization: Requests must be signed with AWS credentials. Only explicitly permitted users or roles can invoke the API.
Lambda authorizers: Custom authorization logic validates tokens issued by an external or custom identity provider.
Private APIs: API Gateway can be configured as private and accessed only from within a VPC.

For a demo workload, usage plans and budget alerts provide sufficient protection. For production workloads, authentication is required to control who can invoke the API.

AWS, Generative AI

This post is licensed under CC BY 4.0 by the author.