Cloud Computing

Mastering Prompt Optimization on Amazon Bedrock: Your Guide to Advanced Tools

2026-05-15 11:54:53

Amazon Bedrock now offers a powerful new feature called Advanced Prompt Optimization that helps you refine prompts for any model on the platform. Whether you're migrating to a new model or seeking better performance from your current one, this tool lets you compare original and optimized prompts across up to five models simultaneously. In this Q&A, we dive into the details.

What is Amazon Bedrock Advanced Prompt Optimization?

Amazon Bedrock Advanced Prompt Optimization is a tool that automatically improves your prompt templates to get better responses from language models. You provide a prompt template, example inputs, ground truth answers, and an evaluation metric. The tool then runs a metric-driven feedback loop to refine the prompt. It outputs the optimized prompt along with evaluation scores, cost estimates, and latency data. This helps you migrate to a new model seamlessly or boost performance on your current model without manual trial and error.


How does the optimization process work?

You start by creating a prompt optimization job in the Bedrock console. Upload a JSONL file containing your prompt template, variable inputs, reference answers, and evaluation instructions. The tool supports multimodal inputs like PNG, JPG, and PDF for tasks such as document or image analysis. You can guide optimization using an AWS Lambda function, an LLM-as-a-judge rubric, or a natural language description. The system iteratively tests and tweaks the prompt to maximize your chosen metric, then shows you the original and final templates side by side.
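
To make the feedback loop concrete, here is a toy sketch of metric-driven prompt refinement. This is purely illustrative and not the Bedrock implementation: `call_model`, `rewrite_candidates`, and the overlap-based `score` are all stand-ins for the model invocation, the optimizer's rewrite step, and your chosen evaluation metric.

```python
# Toy sketch of a metric-driven prompt-refinement loop. All function
# names here are illustrative stand-ins, not part of any Bedrock API.

def call_model(prompt: str, variables: dict) -> str:
    # Stand-in for a model invocation; just fills the template.
    return prompt.format(**variables)

def score(response: str, reference: str) -> float:
    # Stand-in metric: word overlap with the reference answer, in [0, 1].
    resp, ref = set(response.lower().split()), set(reference.lower().split())
    return len(resp & ref) / max(len(ref), 1)

def rewrite_candidates(prompt: str) -> list[str]:
    # Stand-in for the optimizer's rewrite step.
    return [prompt, prompt + " Answer concisely.", "Be precise. " + prompt]

def optimize(prompt: str, samples: list[dict], rounds: int = 3) -> str:
    """Iteratively keep whichever candidate scores best on the samples."""
    best, best_score = prompt, -1.0
    for _ in range(rounds):
        for candidate in rewrite_candidates(best):
            avg = sum(
                score(call_model(candidate, s["input"]), s["reference"])
                for s in samples
            ) / len(samples)
            if avg > best_score:
                best, best_score = candidate, avg
    return best
```

The real service replaces each stand-in with an actual model call and your Lambda, rubric, or description-based metric, but the shape of the loop is the same: generate candidates, score them against reference answers, keep the best.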

How many models can I test at once with this tool?

You can select up to five Amazon Bedrock inference models and optimize your prompt for all of them simultaneously. This is especially useful if you're migrating from one model to another: pick your current model as a baseline and add up to four others. The tool will optimize the prompt for each selected model and let you compare performance. If you're not switching models, you can still select just your current model to see a before-and-after optimization comparison.

What file format is required for the prompt templates?

Your prompt templates must be prepared in JSONL format, where each JSON object is on a single line. The format includes required fields like version (fixed to bedrock-2026-05-14), templateId, promptTemplate, and evaluationSamples. Optional fields include steeringCriteria and custom evaluation metrics. Each sample in evaluationSamples must contain input variables and reference responses. You can also optionally specify a custom LLM-as-a-judge prompt or a Lambda ARN for evaluation.
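
The snippet below builds one such record and writes it as a JSONL line. The top-level field names (version, templateId, promptTemplate, evaluationSamples) follow the description above; the inner shape of each sample (the inputVariables and referenceResponse keys) and the {{variable}} placeholder syntax are assumptions for illustration, so check the official schema before relying on them.

```python
import json

# Minimal prompt-template record in the JSONL layout described above.
# The nested sample keys (inputVariables, referenceResponse) and the
# {{...}} placeholder syntax are assumed for illustration.
record = {
    "version": "bedrock-2026-05-14",
    "templateId": "invoice-extraction-v1",
    "promptTemplate": "Extract the total amount from this invoice: {{invoice_text}}",
    "evaluationSamples": [
        {
            "inputVariables": {"invoice_text": "Total due: $142.50"},
            "referenceResponse": "$142.50",
        }
    ],
}

# JSONL means one compact JSON object per line -- no pretty-printing.
with open("prompt_templates.jsonl", "w") as f:
    f.write(json.dumps(record) + "\n")
```

Each additional template becomes another single-line JSON object appended to the same file.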


Can I use multimodal inputs with this optimization tool?

Yes, the prompt optimizer supports multimodal user inputs. You can include PNG, JPG, and PDF files as inputs to your prompt templates. This means you can optimize prompts for tasks like document analysis, image captioning, or extracting information from scanned forms. The tool processes these multimodal inputs alongside text variables, allowing you to fine-tune prompts for vision-language models available on Amazon Bedrock.
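
As a rough illustration, a file input might be embedded in an evaluation sample as base64-encoded bytes. Everything here beyond the general idea is an assumption: the fileInputs field name, the format/data keys, and the base64 encoding are hypothetical, so consult the documentation for the actual multimodal sample schema.

```python
import base64
import json

# Illustrative only: how a PNG input might sit inside an evaluation
# sample. The "fileInputs" layout and base64 encoding are assumptions,
# not a documented Bedrock schema.
png_bytes = b"\x89PNG\r\n\x1a\n"  # placeholder standing in for real image data

sample = {
    "inputVariables": {"question": "What is the total on this receipt?"},
    "fileInputs": [
        {
            "format": "png",
            "data": base64.b64encode(png_bytes).decode("ascii"),
        }
    ],
    "referenceResponse": "$23.10",
}
line = json.dumps(sample)  # one JSONL line, ready to append to the file
```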

What evaluation metrics can I use to guide optimization?

You have several options for evaluation metrics. You can provide a short natural language description of what you consider a good response. For more precise control, use an AWS Lambda function that scores responses programmatically. Alternatively, you can set up an LLM-as-a-judge rubric by defining a custom prompt and specifying a model ID to act as the judge. The tool will use your chosen metric in a feedback loop to drive prompt improvements.
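
For the Lambda option, the handler would receive a model response and return a score. The sketch below assumes the event carries "response" and "reference" fields and that a "score" key is expected back; that contract is a guess for illustration, not the documented evaluation interface.

```python
# Sketch of a Lambda handler scoring a model response against a reference
# answer. The event fields ("response", "reference") and the returned
# "score" key are assumed, not a documented Bedrock contract.

def lambda_handler(event, context):
    response = event.get("response", "")
    reference = event.get("reference", "")
    # Simple illustrative metric: exact match scores 1.0, otherwise
    # token overlap with the reference, in [0, 1].
    if response.strip() == reference.strip():
        score = 1.0
    else:
        resp = set(response.lower().split())
        ref = set(reference.lower().split())
        score = len(resp & ref) / max(len(ref), 1)
    return {"score": score}
```

Any deterministic metric you can compute in code (regex checks, numeric tolerance, JSON-schema validation) fits this pattern, which is what makes the Lambda route more precise than a natural language description.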

How do I get started with Advanced Prompt Optimization?

Navigate to the Advanced Prompt Optimization page in the Amazon Bedrock console and choose Create prompt optimization. Select up to five models for optimization. Prepare your JSONL file with templates, example data, and evaluation criteria. Upload it and let the tool run. After optimization, you'll see the original and optimized prompts, along with performance scores, cost, and latency. This makes it easy to validate improvements before deploying.
