
Retries, caching, and idempotency

SotsAI endpoints are designed to be pure and side-effect free. Given the same input payload, SotsAI produces the same behavioral reasoning intent.

This makes retries, caching, and idempotency straightforward when applied correctly.


You may retry on:

  • network errors
  • timeouts
  • transient 5xx errors

SotsAI calls are deterministic in intent for a given input, so retries will not create behavioral inconsistencies.

Do not automatically retry on:

  • ORG_QUOTA_EXCEEDED
  • authentication errors
  • invalid payload errors
  • schema validation errors

These require corrective action, not retries.


Recommended retry strategy:

  • exponential backoff
  • limited attempts (2–3)
  • circuit breaking if failures persist

Retries should live in your orchestration layer, not inside prompts or tools. The LLM should never be responsible for retry logic.
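As a rough illustration of the retry rules above, here is a minimal Python sketch of an orchestration-layer wrapper. The exception classes `TransientError` and `PermanentError` and the function `call_with_retries` are hypothetical names invented for this example, not part of any SotsAI client. It assumes you map network errors, timeouts, and transient 5xx responses to `TransientError`, and quota, authentication, payload, and schema errors to `PermanentError`.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical: network error, timeout, or transient 5xx."""


class PermanentError(Exception):
    """Hypothetical: quota, authentication, payload, or schema error."""


def call_with_retries(call, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run `call()` and retry only on TransientError, with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts:
                # Out of attempts: re-raise so the caller (or a circuit
                # breaker around this function) can decide what to do.
                raise
            # Delay doubles on each attempt: base, 2 * base, 4 * base, ...
            delay = base_delay * 2 ** (attempt - 1)
            # Small random jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
        # PermanentError is deliberately not caught: it propagates at once,
        # because it needs corrective action rather than another attempt.
```

The `sleep` parameter is injectable only so the wrapper can be exercised without real waiting. Circuit breaking is left out of the sketch; it would typically wrap this function and stop calling it after repeated failures.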


Caching applies to the behavioral reasoning result, not to user-facing text. Caching is optional, but often useful.

You may cache when:

  • the situation context is stable
  • profiles have not changed
  • the advice will be reused across turns or users

Typical cache key components:

  • normalized context hash
  • user profile version
  • interlocutor profile version
  • situation type (if you classify it upstream)
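One way to combine these components into a single key is sketched below. The function names, the argument names, and the normalization rule (lowercasing and collapsing whitespace) are assumptions made for illustration. Your own normalization should reflect what counts as "the same situation" in your product.

```python
import hashlib
import json


def normalize_context(context: str) -> str:
    """Assumed normalization: lowercase and collapse runs of whitespace."""
    return " ".join(context.lower().split())


def make_cache_key(context, user_profile_version,
                   interlocutor_profile_version, situation_type=None):
    """Build a stable cache key from the components listed above."""
    parts = {
        "context": hashlib.sha256(
            normalize_context(context).encode("utf-8")
        ).hexdigest(),
        "user_profile_version": user_profile_version,
        "interlocutor_profile_version": interlocutor_profile_version,
        "situation_type": situation_type,
    }
    # sort_keys makes the serialization independent of dict ordering.
    payload = json.dumps(parts, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because both profile versions are part of the key, a profile update produces a different key, so stale entries are simply never hit again.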

Invalidate cache when:

  • a profile changes
  • the situation context changes materially
  • your business logic requires fresh advice

Avoid long-lived caches for human interactions that evolve over time.
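A short time-to-live plus explicit invalidation is one simple way to keep cached results from outliving the situation they describe. The sketch below is a hypothetical in-memory cache; the class name `AdviceCache` and the default TTL of 300 seconds are illustrative assumptions, not recommendations from SotsAI.

```python
import time


class AdviceCache:
    """Minimal in-memory cache with a per-entry time-to-live."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._entries = {}   # key -> (expires_at, value)

    def get(self, key):
        """Return the cached value, or None if absent or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller recomputes.
            del self._entries[key]
            return None
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)

    def invalidate(self, key):
        """Explicitly drop an entry, e.g. when business logic wants fresh advice."""
        self._entries.pop(key, None)
```

In a multi-process deployment you would more likely use a shared store with native expiry, but the same three operations apply.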


SotsAI calls are idempotent by design:

  • same input → same behavioral reasoning intent
  • no mutations
  • no hidden state

This means:

  • retries are safe
  • caching is safe
  • replays are safe

You do not need idempotency keys.


Avoid:

  • caching across different users
  • caching when profiles are missing or partial
  • reusing advice across unrelated situations
  • caching calls that should not have been made (e.g. without a user profile)

Behavioral advice is contextual, not universal.
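A small guard in front of the cache can enforce the profile-related rules above. This is a sketch under assumptions: profiles are represented as dictionaries, and "partial" is taken to mean that one of the fields in `required_fields` is missing. Both the representation and the field names are hypothetical.

```python
def is_cacheable(user_profile, interlocutor_profile,
                 required_fields=("id", "version")):
    """Return True only if both profiles are present and complete.

    `required_fields` is an illustrative stand-in for whatever makes a
    profile "complete" in your system.
    """
    for profile in (user_profile, interlocutor_profile):
        if not profile:
            # Missing (None) or empty profile: never cache.
            return False
        if any(profile.get(field) is None for field in required_fields):
            # Partial profile: never cache.
            return False
    return True
```

Calling this check before both the cache read and the cache write keeps results computed from incomplete inputs out of the cache.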


Avoid caching when:

  • conversations are emotionally volatile
  • profiles are frequently updated
  • guidance must adapt in real time

In these cases, recomputation is safer than reuse.


Treat SotsAI as a deterministic reasoning function. It is:

  • not a conversational state machine
  • not a text generator