
Pipeline placement

This page explains where SotsAI should live in an LLM pipeline.


Think of your pipeline as three distinct concerns:

  1. Facts — what is true? (RAG, databases, APIs)
  2. Behavior — how should this be communicated?
  3. Language — how should this be phrased?

Your data layer (RAG, databases, APIs) owns #1.

SotsAI owns #2.

Your LLM owns #3.


A typical production flow looks like this:

1. User submits a request
2. Orchestrator classifies intent (optional but recommended)
3. Orchestrator gathers context (facts, constraints, history)
4. Call SotsAI for behavioral reasoning (when a user psychometric profile is available)
5. LLM generates the final response using SotsAI output
6. Response is returned to the user

SotsAI should be called before final text generation.


Recommended order:

1. RAG retrieves factual context
2. Orchestrator summarizes relevant facts
3. Call SotsAI with:
- situation context
- profiles
4. LLM generates the final response

Why this works:

  • SotsAI reasons on meaning, not raw documents
  • You avoid leaking sensitive documents
  • Behavioral guidance stays focused
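The recommended ordering can be sketched as follows. `retrieve`, `summarize`, and `call_sotsai` are hypothetical placeholders; what matters is that only the distilled situation, never raw documents, reaches the behavioral call.

```python
def retrieve(query: str) -> list[str]:
    # Step 1: RAG returns raw documents (placeholder data).
    return ["doc: Q3 revenue fell 4%", "doc: churn is up in EMEA"]

def summarize(docs: list[str]) -> str:
    # Step 2: the orchestrator distills the facts into a short situation context.
    return "Delivering mixed quarterly results to a stakeholder."

def call_sotsai(situation: str, profiles: list[dict]) -> dict:
    # Step 3: SotsAI reasons on meaning, not source material.
    assert not situation.startswith("doc:"), "raw documents must not be passed"
    return {"tone": "measured", "framing": "lead with context"}  # placeholder

docs = retrieve("quarterly results")
situation = summarize(docs)
guidance = call_sotsai(situation, profiles=[{"id": "user-42"}])
# Step 4: `guidance` is then handed to the LLM for final generation.
```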

Avoid:

  • calling SotsAI before you have the relevant facts (when factual context matters)
  • passing raw documents or embeddings
  • asking SotsAI to interpret source material

SotsAI is not a knowledge engine.


Option 1: expose SotsAI as a tool

In this setup:

  • the LLM decides when to call SotsAI within the set of tools exposed by the orchestrator
  • your backend executes the call
  • the LLM consumes the structured output

This works well when:

  • you already use tools
  • you want conditional activation
  • multiple tools coexist
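A tool definition for this setup might look like the sketch below. The schema shape follows common LLM function-calling conventions; the tool name and parameter fields are assumptions, not an official SotsAI definition.

```python
# Hypothetical tool schema exposed to the LLM by the orchestrator.
sotsai_tool = {
    "name": "behavioral_reasoning",
    "description": (
        "Given a situation context and user profiles, return structured "
        "guidance on how the response should be communicated."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "situation": {"type": "string"},
            "profiles": {"type": "array", "items": {"type": "object"}},
        },
        "required": ["situation", "profiles"],
    },
}

def execute_tool(name: str, arguments: dict) -> dict:
    # Your backend executes the call; the LLM only consumes the structured result.
    if name == "behavioral_reasoning":
        return {"tone": "supportive", "detail": "low"}  # placeholder output
    raise ValueError(f"unknown tool: {name}")
```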

Option 2: call SotsAI from your backend

In this setup:

  • your backend decides when to call SotsAI
  • the LLM never sees the decision logic
  • the LLM only sees the result

This works well when:

  • rules are explicit
  • behavioral reasoning is mandatory
  • you want full determinism

Should you call SotsAI more than once per interaction? Usually no. One SotsAI call per interaction is enough.

Avoid:

  • calling SotsAI per message chunk
  • calling it inside streaming loops
  • chaining multiple behavioral calls

If you need multiple calls, it usually means:

  • profiles are missing
  • intent classification is unclear
  • orchestration logic is too implicit

SotsAI is stateless.

This means:

  • no conversation memory
  • no session tracking
  • no implicit context

You must provide:

  • the situation context
  • relevant profiles
  • any constraints
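Because SotsAI is stateless, every call must carry its full context. A hypothetical request payload might look like this (field names are assumptions, not the real API schema):

```python
# Illustrative payload: everything SotsAI needs travels in each request.
payload = {
    "situation": "Weekly 1:1; delivering critical feedback on a missed deadline.",
    "profiles": [
        {"id": "user-42", "role": "recipient"},
        {"id": "user-7", "role": "sender"},
    ],
    "constraints": {"channel": "chat", "max_directness": "medium"},
}
```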

This makes behavior:

  • predictable
  • auditable
  • cacheable

Do not embed behavioral logic like:

  • “adapt tone to personality”
  • “be careful with sensitive people”
  • “this person doesn’t like feedback”

That logic belongs in data, not prompts.

Your prompt should consume SotsAI output, not recreate it.
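A prompt that consumes SotsAI output injects the structured fields directly, instead of restating behavioral logic in natural language. The guidance field names below are illustrative assumptions:

```python
def build_prompt(question: str, guidance: dict) -> str:
    # Behavioral guidance arrives as data; the prompt only applies it.
    return (
        "Answer the user's question.\n"
        f"Tone: {guidance['tone']}\n"
        f"Pacing: {guidance['pacing']}\n"
        f"Question: {question}"
    )

prompt = build_prompt("How did the launch go?", {"tone": "candid", "pacing": "brief"})
```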


The full flow, end to end:

User input
→ Intent classification
→ Context assembly (facts + situation)
→ SotsAI call (behavioral reasoning)
→ LLM generation (language + tone)
→ Final response