API Reference

stream/generate_sql

Streams the SQL generation process with real-time state updates

The POST /stream/generate_sql endpoint allows you to generate SQL from a natural language question and receive step-by-step streaming updates throughout the generation process.

This is the streaming version of /generate_sql, designed for interactive UIs and debugging tools that want to visualize each phase of SQL generation as it happens.

What It Does

The endpoint returns a Server-Sent Events (SSE) stream that includes:

  1. Processing states (understanding β†’ searching β†’ planning β†’ generating).
  2. Reasoning steps and traceable context.
  3. The final SQL output β€” or an error if generation fails.

Unlike /stream/ask, this endpoint only generates SQL and does not run the SQL or generate summaries.

Basic Usage

{
  "projectId": 1,
  "question": "List the top 5 states with the most customers"
}
// Stream begins
data: { "type": "message_start" }

data: {
  "type": "state",
  "data": { "state": "sql_generation_start" }
}
data: {
  "type": "state",
  "data": { "state": "sql_generation_understanding" }
}
data: {
  "type": "state",
  "data": { "state": "sql_generation_searching" }
}
data: {
  "type": "state",
  "data": { "state": "sql_generation_planning" }
}
data: {
  "type": "state",
  "data": { "state": "sql_generation_generating" }
}
data: {
  "type": "state",
  "data": {
    "state": "sql_generation_success",
    "sql": "SELECT state, COUNT(*) FROM customers GROUP BY state ORDER BY COUNT(*) DESC LIMIT 5"
  }
}

// Stream ends
data: {
  "type": "message_stop",
  "data": { "threadId": "...", "duration": 3456 }
}

This example omits several detailed properties (such as question, rephrasedQuestion, intentReasoning, traceId, and timestamp) for clarity. Refer to the full schema for a complete breakdown of all available fields.

State Lifecycle

graph TD
  message_start["message_start"]
  sql_start["sql_generation_start"]
  sql_understanding["sql_generation_understanding"]
  sql_searching["sql_generation_searching"]
  sql_planning["sql_generation_planning"]
  sql_generating["sql_generation_generating"]
  sql_correcting["sql_generation_correcting"]
  sql_success["sql_generation_success"]
  sql_failed["sql_generation_failed"]
  sql_stopped["sql_generation_stopped"]
  error_event["type: error"]
  message_stop["message_stop"]

  %% Normal flow
  message_start --> sql_start --> sql_understanding --> sql_searching --> sql_planning --> sql_generating
  sql_generating --> sql_correcting
  sql_correcting --> sql_success
  sql_generating --> sql_success
  sql_success --> message_stop

  %% Error flows
  sql_generating -->|"Too many corrections"| sql_failed --> error_event --> message_stop
  sql_generating -->|"User canceled"| sql_stopped --> message_stop
  sql_generating -->|"Internal error"| error_event --> message_stop

Event Types

During a /stream/ask request, the API streams a sequence of events using Server-Sent Events (SSE). Each event provides insight into the system’s current state or output.

message_start

  • Purpose: Indicates the start of a new streaming response.
  • Payload: Contains the timestamp when the process began.
{
  "type": "message_start",
  "timestamp": 1751014954139
}

state

  • Purpose: Describes what stage the system is in during processing.

State Lifecycle Overview

stateDescription
sql_generation_startThe system has begun processing the user’s question.
sql_generation_understandingThe AI is interpreting the question and trying to identify its intent.
sql_generation_searchingThe AI is searching relevant tables and metadata to answer the question.
sql_generation_planningA plan for how to generate SQL is being formed, including table joins or filters.
sql_generation_generatingThe SQL is being generated.
sql_generation_correctingA generated SQL failed to execute and is being corrected. (Auto-retry phase)
sql_generation_successSQL was successfully generated. The sql field will be included.
sql_generation_failedSQL generation failed. Followed by an error event.
sql_generation_stoppedSQL generation was manually canceled or interrupted (e.g. client disconnected).
sql_generation_finishedFinal state for SQL generation (used internally; usually followed by success or failed).
sql_execution_startSQL execution has started.
sql_execution_endSQL execution has completed (successfully or not).

data Field Reference ("type": "state")

πŸ”Ή sql_generation_start

FieldTypeDescription
statestring"sql_generation_start"
questionstringThe original user input.
threadIdstringUnique thread identifier.
languagestringLanguage used for summarization (e.g., "English").

πŸ”Ή SQL Generation In-Progress States
(e.g., sql_generation_understanding, searching, planning, generating, correcting)

FieldTypeDescription
statestringCurrent generation state, such as "sql_generation_searching", "sql_generation_planning", etc.
pollCountnumberNumber of polling attempts so far.
rephrasedQuestionstring | nullReformulated version of the user query.
intentReasoningstring | nullAI’s interpretation of what the user is asking.
sqlGenerationReasoningstring | nullStep-by-step reasoning of SQL generation. Appears during generating.
retrievedTablesstring[] | nullList of tables determined to be relevant.
invalidSqlstring | nullSQL that failed during correction attempts (optional).
traceIdstringFor backend debugging.

πŸ”Ή sql_generation_success

FieldTypeDescription
state"sql_generation_success"Marks the completion of SQL generation.
sqlstringThe generated SQL query.

πŸ”Ή sql_execution_start

FieldTypeDescription
state"sql_execution_start"SQL execution has begun.
sqlstringThe SQL query being executed.

πŸ”Ή sql_execution_end

FieldTypeDescription
state"sql_execution_end"SQL execution completed. No additional fields.

Example

data: {
  "type": "state",
  "data": {
    "state": "sql_generation_start",
    "question": "list 5 customers",
    "threadId": "0625991d-1bba-407d-8ad4-dd0210172484",
    "language": "English"
  },
  "timestamp": 1751014954142
}

data: {
  "type": "state",
  "data": {
    "state": "sql_generation_understanding",
    "pollCount": 1,
    "rephrasedQuestion": null,
    "intentReasoning": null,
    "sqlGenerationReasoning": null,
    "retrievedTables": null,
    "invalidSql": null,
    "traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
  },
  "timestamp": 1751014954165
}

data: {
  "type": "state",
  "data": {
    "state": "sql_generation_searching",
    "pollCount": 4,
    "rephrasedQuestion": "List 5 customers from the olist_customers_dataset table.",
    "intentReasoning": "User wants to retrieve specific customer data, likely using SQL query.",
    "sqlGenerationReasoning": null,
    "retrievedTables": null,
    "invalidSql": null,
    "traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
  },
  "timestamp": 1751014957183
}

data: {
  "type": "state",
  "data": {
    "state": "sql_generation_planning",
    "pollCount": 6,
    "rephrasedQuestion": "List 5 customers from the olist_customers_dataset table.",
    "intentReasoning": "User wants to retrieve specific customer data, likely using SQL query.",
    "sqlGenerationReasoning": null,
    "retrievedTables": ["olist_customers_dataset"],
    "invalidSql": null,
    "traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
  },
  "timestamp": 1751014959232
}

data: {
  "type": "state",
  "data": {
    "state": "sql_generation_generating",
    "pollCount": 9,
    "rephrasedQuestion": "List 5 customers from the olist_customers_dataset table.",
    "intentReasoning": "User wants to retrieve specific customer data, likely using SQL query.",
    "sqlGenerationReasoning": "1. **Identify the table involved**: The question asks for customer data, so the relevant table is `olist_customers_dataset`.\n\n2. **Determine the number of records needed**: The user requests 5 customers, so we need to select 5 entries from the table.",
    "retrievedTables": ["olist_customers_dataset"],
    "invalidSql": null,
    "traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
  },
  "timestamp": 1751014962254
}

data: {
  "type": "state",
  "data": {
    "state": "sql_generation_success",
    "sql": "SELECT \"o\".\"customer_id\", \"o\".\"customer_zip_code_prefix\", \"o\".\"customer_city\", \"o\".\"customer_state\" FROM \"olist_customers_dataset\" AS \"o\" LIMIT 5"
  },
  "timestamp": 1751014963263
}

data: {
  "type": "state",
  "data": {
    "state": "sql_execution_start",
    "sql": "SELECT \"o\".\"customer_id\", \"o\".\"customer_zip_code_prefix\", \"o\".\"customer_city\", \"o\".\"customer_state\" FROM \"olist_customers_dataset\" AS \"o\" LIMIT 5"
  },
  "timestamp": 1751014963263
}

data: {
  "type": "state",
  "data": {
    "state": "sql_execution_end"
  },
  "timestamp": 1751014963339
}

message_stop

  • Purpose: Marks the end of the entire stream. Includes processing duration and threadId.
{
  "type": "message_stop",
  "data": {
    "threadId": "0625991d-1bba-407d-8ad4-dd0210172484",
    "duration": 11096
  }
}

Error Handling

During a streaming session, if an error occurs at any stage of the pipeline, the server will emit a special event of type: "error".

This allows clients to gracefully detect and handle errors (e.g., show error messages in UI, retry logic, cancel streaming).

Error Event Structure

{
  "type": "error",
  "data": {
    "code": "INVALID_SQL",
    "error": "Unrecognized token near FROM",
    "traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
  },
  "timestamp": 1751015028888
}
Language
Credentials
Bearer
JWT
Click Try It! to start a request and see the response here!