stream/generate

The POST /stream/generate_sql endpoint allows you to generate SQL from a natural language question and receive step-by-step streaming updates throughout the generation process.

This is the streaming version of /generate_sql, designed for interactive UIs and debugging tools that want to visualize each phase of SQL generation as it happens.

What It Does

The endpoint returns a Server-Sent Events (SSE) stream that includes:

Processing states (understanding → searching → planning → generating).
Reasoning steps and traceable context.
The final SQL output — or an error if generation fails.

Unlike /stream/ask, this endpoint only generates SQL and does not run the SQL or generate summaries.

Basic Usage

{
  "projectId": 1,
  "question": "List the top 5 states with the most customers"
}

// Stream begins
data: { "type": "message_start" }

data: {
  "type": "state",
  "data": { "state": "sql_generation_start" }
}
data: {
  "type": "state",
  "data": { "state": "sql_generation_understanding" }
}
data: {
  "type": "state",
  "data": { "state": "sql_generation_searching" }
}
data: {
  "type": "state",
  "data": { "state": "sql_generation_planning" }
}
data: {
  "type": "state",
  "data": { "state": "sql_generation_generating" }
}
data: {
  "type": "state",
  "data": {
    "state": "sql_generation_success",
    "sql": "SELECT state, COUNT(*) FROM customers GROUP BY state ORDER BY COUNT(*) DESC LIMIT 5"
  }
}

// Stream ends
data: {
  "type": "message_stop",
  "data": { "threadId": "...", "duration": 3456 }
}

This example omits several detailed properties (such as question, rephrasedQuestion, intentReasoning, traceId, and timestamp) for clarity. Refer to the full schema for a complete breakdown of all available fields.

State Lifecycle

graph TD
  message_start["message_start"]
  sql_start["sql_generation_start"]
  sql_understanding["sql_generation_understanding"]
  sql_searching["sql_generation_searching"]
  sql_planning["sql_generation_planning"]
  sql_generating["sql_generation_generating"]
  sql_correcting["sql_generation_correcting"]
  sql_success["sql_generation_success"]
  sql_failed["sql_generation_failed"]
  sql_stopped["sql_generation_stopped"]
  error_event["type: error"]
  message_stop["message_stop"]

  %% Normal flow
  message_start --> sql_start --> sql_understanding --> sql_searching --> sql_planning --> sql_generating
  sql_generating --> sql_correcting
  sql_correcting --> sql_success
  sql_generating --> sql_success
  sql_success --> message_stop

  %% Error flows
  sql_generating -->|"Too many corrections"| sql_failed --> error_event --> message_stop
  sql_generating -->|"User canceled"| sql_stopped --> message_stop
  sql_generating -->|"Internal error"| error_event --> message_stop

Event Types

During a /stream/ask request, the API streams a sequence of events using Server-Sent Events (SSE). Each event provides insight into the system’s current state or output.

message_start

Purpose: Indicates the start of a new streaming response.
Payload: Contains the timestamp when the process began.

{
  "type": "message_start",
  "timestamp": 1751014954139
}

state

Purpose: Describes what stage the system is in during processing.

State Lifecycle Overview

`state`	Description
`sql_generation_start`	The system has begun processing the user’s question.
`sql_generation_understanding`	The AI is interpreting the question and trying to identify its intent.
`sql_generation_searching`	The AI is searching relevant tables and metadata to answer the question.
`sql_generation_planning`	A plan for how to generate SQL is being formed, including table joins or filters.
`sql_generation_generating`	The SQL is being generated.
`sql_generation_correcting`	A generated SQL failed to execute and is being corrected. (Auto-retry phase)
`sql_generation_success`	SQL was successfully generated. The `sql` field will be included.
`sql_generation_failed`	SQL generation failed. Followed by an error event.
`sql_generation_stopped`	SQL generation was manually canceled or interrupted (e.g. client disconnected).
`sql_generation_finished`	Final state for SQL generation (used internally; usually followed by `success` or `failed`).
`sql_execution_start`	SQL execution has started.
`sql_execution_end`	SQL execution has completed (successfully or not).

`data` Field Reference (`"type": "state"`)

🔹 sql_generation_start

Field	Type	Description
`state`	`string`	`"sql_generation_start"`
`question`	`string`	The original user input.
`threadId`	`string`	Unique thread identifier.
`language`	`string`	Language used for summarization (e.g., `"English"`).

🔹 SQL Generation In-Progress States
(e.g., sql_generation_understanding, searching, planning, generating, correcting)

Field	Type	Description
`state`	`string`	Current generation state, such as `"sql_generation_searching"`, `"sql_generation_planning"`, etc.
`pollCount`	`number`	Number of polling attempts so far.
`rephrasedQuestion`	`string` \| `null`	Reformulated version of the user query.
`intentReasoning`	`string` \| `null`	AI’s interpretation of what the user is asking.
`sqlGenerationReasoning`	`string` \| `null`	Step-by-step reasoning of SQL generation. Appears during `generating`.
`retrievedTables`	`string[]` \| `null`	List of tables determined to be relevant.
`invalidSql`	`string` \| `null`	SQL that failed during correction attempts (optional).
`traceId`	`string`	For backend debugging.

🔹 sql_generation_success

Field	Type	Description
`state`	`"sql_generation_success"`	Marks the completion of SQL generation.
`sql`	`string`	The generated SQL query.
`dialectSql`	`string` \| `undefined`	Database-specific SQL (if requested).

🔹 sql_execution_start

Field	Type	Description
`state`	`"sql_execution_start"`	SQL execution has begun.
`sql`	`string`	The SQL query being executed.

🔹 sql_execution_end

Field	Type	Description
`state`	`"sql_execution_end"`	SQL execution completed. No additional fields.

Example

data: {
  "type": "state",
  "data": {
    "state": "sql_generation_start",
    "question": "list 5 customers",
    "threadId": "0625991d-1bba-407d-8ad4-dd0210172484",
    "language": "English"
  },
  "timestamp": 1751014954142
}

data: {
  "type": "state",
  "data": {
    "state": "sql_generation_understanding",
    "pollCount": 1,
    "rephrasedQuestion": null,
    "intentReasoning": null,
    "sqlGenerationReasoning": null,
    "retrievedTables": null,
    "invalidSql": null,
    "traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
  },
  "timestamp": 1751014954165
}

data: {
  "type": "state",
  "data": {
    "state": "sql_generation_searching",
    "pollCount": 4,
    "rephrasedQuestion": "List 5 customers from the olist_customers_dataset table.",
    "intentReasoning": "User wants to retrieve specific customer data, likely using SQL query.",
    "sqlGenerationReasoning": null,
    "retrievedTables": null,
    "invalidSql": null,
    "traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
  },
  "timestamp": 1751014957183
}

data: {
  "type": "state",
  "data": {
    "state": "sql_generation_planning",
    "pollCount": 6,
    "rephrasedQuestion": "List 5 customers from the olist_customers_dataset table.",
    "intentReasoning": "User wants to retrieve specific customer data, likely using SQL query.",
    "sqlGenerationReasoning": null,
    "retrievedTables": ["olist_customers_dataset"],
    "invalidSql": null,
    "traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
  },
  "timestamp": 1751014959232
}

data: {
  "type": "state",
  "data": {
    "state": "sql_generation_generating",
    "pollCount": 9,
    "rephrasedQuestion": "List 5 customers from the olist_customers_dataset table.",
    "intentReasoning": "User wants to retrieve specific customer data, likely using SQL query.",
    "sqlGenerationReasoning": "1. **Identify the table involved**: The question asks for customer data, so the relevant table is `olist_customers_dataset`.\n\n2. **Determine the number of records needed**: The user requests 5 customers, so we need to select 5 entries from the table.",
    "retrievedTables": ["olist_customers_dataset"],
    "invalidSql": null,
    "traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
  },
  "timestamp": 1751014962254
}

data: {
  "type": "state",
  "data": {
    "state": "sql_generation_success",
    "sql": "SELECT \"o\".\"customer_id\", \"o\".\"customer_zip_code_prefix\", \"o\".\"customer_city\", \"o\".\"customer_state\" FROM \"olist_customers_dataset\" AS \"o\" LIMIT 5"
  },
  "timestamp": 1751014963263
}

data: {
  "type": "state",
  "data": {
    "state": "sql_execution_start",
    "sql": "SELECT \"o\".\"customer_id\", \"o\".\"customer_zip_code_prefix\", \"o\".\"customer_city\", \"o\".\"customer_state\" FROM \"olist_customers_dataset\" AS \"o\" LIMIT 5"
  },
  "timestamp": 1751014963263
}

data: {
  "type": "state",
  "data": {
    "state": "sql_execution_end"
  },
  "timestamp": 1751014963339
}

message_stop

Purpose: Marks the end of the entire stream. Includes processing duration and threadId.

{
  "type": "message_stop",
  "data": {
    "threadId": "0625991d-1bba-407d-8ad4-dd0210172484",
    "duration": 11096
  }
}

Error Handling

During a streaming session, if an error occurs at any stage of the pipeline, the server will emit a special event of type: "error".

This allows clients to gracefully detect and handle errors (e.g., show error messages in UI, retry logic, cancel streaming).

Error Event Structure

{
  "type": "error",
  "data": {
    "code": "INVALID_SQL",
    "error": "Unrecognized token near FROM",
    "traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
  },
  "timestamp": 1751015028888
}