Use tool calling

Define tool schemas and simulate invocation flows within the Playground to validate behavior before production.
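As a rough illustration of what defining a schema and simulating an invocation can look like outside the Playground, the sketch below declares a tool in the JSON Schema style that many tool-calling APIs accept and checks a simulated call against it. The `get_weather` tool, its parameters, and the `check_arguments` helper are hypothetical names chosen for this example, not part of any specific product API.

```python
import json

# Hypothetical tool definition (assumed shape: name, description, JSON Schema parameters).
GET_WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Return the current temperature for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

# Maps the few JSON Schema types this sketch supports to Python types.
_PY_TYPES = {"string": str, "number": (int, float), "integer": int, "boolean": bool}


def check_arguments(tool, arguments):
    """Return a list of problems; an empty list means the arguments fit the schema."""
    params = tool["parameters"]
    problems = []
    for name in params.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        spec = params["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
            continue
        if not isinstance(value, _PY_TYPES[spec["type"]]):
            problems.append(f"{name} should be of type {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name} must be one of {spec['enum']}")
    return problems


# A simulated tool call, shaped like what a model might emit.
simulated_call = json.loads(
    '{"name": "get_weather", "arguments": {"city": "Oslo", "unit": "kelvin"}}'
)
problems = check_arguments(GET_WEATHER_TOOL, simulated_call["arguments"])
print(problems)
```

Running a handful of well-formed and malformed calls through a check like this shows how the schema behaves on bad input before any real handler is wired up.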

Use mocks for deterministic testing, or call live sandbox endpoints to exercise real integrations safely.
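One way to keep the mock and the live path interchangeable is to put both behind the same function signature and pick one with a flag. The sketch below assumes a hypothetical weather tool and a placeholder sandbox URL (`https://sandbox.example.com/weather`); neither comes from a real service, so substitute your own handler and endpoint.

```python
import json
import urllib.parse
import urllib.request

# Placeholder sandbox endpoint (assumption); replace with your own.
SANDBOX_URL = "https://sandbox.example.com/weather"

# Canned responses that make the mock fully deterministic.
CANNED = {"Oslo": {"city": "Oslo", "temperature": 4.0, "unit": "celsius"}}


def mock_get_weather(city):
    """Return the same canned record for the same input, with no network access."""
    return CANNED.get(city, {"city": city, "temperature": 0.0, "unit": "celsius"})


def live_get_weather(city):
    """Call the (hypothetical) sandbox endpoint and parse its JSON response."""
    query = urllib.parse.urlencode({"city": city})
    with urllib.request.urlopen(f"{SANDBOX_URL}?{query}", timeout=10) as response:
        return json.load(response)


def get_handler(use_mock=True):
    """Select the mock or the live implementation behind one signature."""
    return mock_get_weather if use_mock else live_get_weather


handler = get_handler(use_mock=True)
result = handler("Oslo")
print(result)
```

Because both handlers take the same argument and return the same shape, the rest of the flow does not need to know which one is active.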

Validate tool outputs with evaluators to ensure correctness before acting on results.
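An evaluator can be as simple as a function that inspects a tool result and returns a verdict with reasons. The sketch below is a minimal, hypothetical example for a weather tool; the field names and the plausible temperature range are assumptions made for illustration, not rules from any particular evaluator framework.

```python
def evaluate_weather_output(output):
    """Return (ok, reasons) for a weather tool result. Rules here are illustrative."""
    if not isinstance(output, dict):
        return False, ["output is not an object"]
    reasons = []
    for key in ("city", "temperature", "unit"):
        if key not in output:
            reasons.append(f"missing field: {key}")
    temp = output.get("temperature")
    if "temperature" in output and (
        isinstance(temp, bool) or not isinstance(temp, (int, float))
    ):
        reasons.append("temperature is not a number")
    elif isinstance(temp, (int, float)) and not -90 <= temp <= 60:
        # Assumed plausible range in degrees Celsius.
        reasons.append("temperature outside plausible range")
    return not reasons, reasons


ok, reasons = evaluate_weather_output(
    {"city": "Oslo", "temperature": 400, "unit": "celsius"}
)
if not ok:
    # Do not act on the result; surface the reasons instead.
    print("rejected:", reasons)
```

Gating downstream actions on the `ok` flag keeps a malformed or implausible tool result from being acted on silently.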