so the question is: how do we ensure the agent consistently behaves as intended?

and for that we need tests

how to test?

write a test case: a prompt plus context as input, and the expected final output to compare against
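
a minimal sketch of what such a test case could look like in Python. the names here (`AgentTestCase`, `run_case`, `fake_agent`) are made up for illustration, and `fake_agent` is only a stand-in for a real agent call. the exact string match is also an assumption; a real agent's output usually needs a looser check.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AgentTestCase:
    """One test case: the input given to the agent and the output we expect."""
    name: str
    prompt: str
    context: str
    expected_output: str


def run_case(agent: Callable[[str, str], str], case: AgentTestCase) -> bool:
    """Run the agent on one case and compare its final output to the expectation.

    Uses an exact match after trimming whitespace (an assumption for this sketch).
    """
    actual = agent(case.prompt, case.context)
    return actual.strip() == case.expected_output.strip()


def fake_agent(prompt: str, context: str) -> str:
    """Hypothetical stand-in for a real agent, so the sketch runs on its own."""
    if "capital of France" in prompt:
        return "Paris"
    return "I don't know"


case = AgentTestCase(
    name="capital_lookup",
    prompt="What is the capital of France?",
    context="",
    expected_output="Paris",
)

passed = run_case(fake_agent, case)
print(f"{case.name}: {'PASS' if passed else 'FAIL'}")
```

keeping the case as plain data means the same case can be rerun against any agent version without changing the test itself.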

how to organize test cases and how to run tests?

