Streams the SQL generation process with real-time state updates
The POST /stream/generate_sql
endpoint allows you to generate SQL from a natural language question and receive step-by-step streaming updates throughout the generation process.
This is the streaming version of /generate_sql
, designed for interactive UIs and debugging tools that want to visualize each phase of SQL generation as it happens.
What It Does
The endpoint returns a Server-Sent Events (SSE) stream that includes:
- Processing states (understanding β searching β planning β generating).
- Reasoning steps and traceable context.
- The final SQL output β or an error if generation fails.
Unlike /stream/ask
, this endpoint only generates SQL and does not run the SQL or generate summaries.
Basic Usage
{
"projectId": 1,
"question": "List the top 5 states with the most customers"
}
// Stream begins
data: { "type": "message_start" }
data: {
"type": "state",
"data": { "state": "sql_generation_start" }
}
data: {
"type": "state",
"data": { "state": "sql_generation_understanding" }
}
data: {
"type": "state",
"data": { "state": "sql_generation_searching" }
}
data: {
"type": "state",
"data": { "state": "sql_generation_planning" }
}
data: {
"type": "state",
"data": { "state": "sql_generation_generating" }
}
data: {
"type": "state",
"data": {
"state": "sql_generation_success",
"sql": "SELECT state, COUNT(*) FROM customers GROUP BY state ORDER BY COUNT(*) DESC LIMIT 5"
}
}
// Stream ends
data: {
"type": "message_stop",
"data": { "threadId": "...", "duration": 3456 }
}
This example omits several detailed properties (such as question, rephrasedQuestion, intentReasoning, traceId, and timestamp) for clarity. Refer to the full schema for a complete breakdown of all available fields.
State Lifecycle
graph TD message_start["message_start"] sql_start["sql_generation_start"] sql_understanding["sql_generation_understanding"] sql_searching["sql_generation_searching"] sql_planning["sql_generation_planning"] sql_generating["sql_generation_generating"] sql_correcting["sql_generation_correcting"] sql_success["sql_generation_success"] sql_failed["sql_generation_failed"] sql_stopped["sql_generation_stopped"] error_event["type: error"] message_stop["message_stop"] %% Normal flow message_start --> sql_start --> sql_understanding --> sql_searching --> sql_planning --> sql_generating sql_generating --> sql_correcting sql_correcting --> sql_success sql_generating --> sql_success sql_success --> message_stop %% Error flows sql_generating -->|"Too many corrections"| sql_failed --> error_event --> message_stop sql_generating -->|"User canceled"| sql_stopped --> message_stop sql_generating -->|"Internal error"| error_event --> message_stop
Event Types
During a /stream/ask
request, the API streams a sequence of events using Server-Sent Events (SSE). Each event provides insight into the systemβs current state or output.
message_start
- Purpose: Indicates the start of a new streaming response.
- Payload: Contains the timestamp when the process began.
{
"type": "message_start",
"timestamp": 1751014954139
}
state
- Purpose: Describes what stage the system is in during processing.
State Lifecycle Overview
state | Description |
---|---|
sql_generation_start | The system has begun processing the userβs question. |
sql_generation_understanding | The AI is interpreting the question and trying to identify its intent. |
sql_generation_searching | The AI is searching relevant tables and metadata to answer the question. |
sql_generation_planning | A plan for how to generate SQL is being formed, including table joins or filters. |
sql_generation_generating | The SQL is being generated. |
sql_generation_correcting | A generated SQL failed to execute and is being corrected. (Auto-retry phase) |
sql_generation_success | SQL was successfully generated. The sql field will be included. |
sql_generation_failed | SQL generation failed. Followed by an error event. |
sql_generation_stopped | SQL generation was manually canceled or interrupted (e.g. client disconnected). |
sql_generation_finished | Final state for SQL generation (used internally; usually followed by success or failed ). |
sql_execution_start | SQL execution has started. |
sql_execution_end | SQL execution has completed (successfully or not). |
data
Field Reference ("type": "state"
)
data
Field Reference ("type": "state"
)πΉ sql_generation_start
Field | Type | Description |
---|---|---|
state | string | "sql_generation_start" |
question | string | The original user input. |
threadId | string | Unique thread identifier. |
language | string | Language used for summarization (e.g., "English" ). |
πΉ SQL Generation In-Progress States
(e.g., sql_generation_understanding, searching, planning, generating, correcting)
Field | Type | Description |
---|---|---|
state | string | Current generation state, such as "sql_generation_searching" , "sql_generation_planning" , etc. |
pollCount | number | Number of polling attempts so far. |
rephrasedQuestion | string | null | Reformulated version of the user query. |
intentReasoning | string | null | AIβs interpretation of what the user is asking. |
sqlGenerationReasoning | string | null | Step-by-step reasoning of SQL generation. Appears during generating . |
retrievedTables | string[] | null | List of tables determined to be relevant. |
invalidSql | string | null | SQL that failed during correction attempts (optional). |
traceId | string | For backend debugging. |
πΉ sql_generation_success
Field | Type | Description |
---|---|---|
state | "sql_generation_success" | Marks the completion of SQL generation. |
sql | string | The generated SQL query. |
πΉ sql_execution_start
Field | Type | Description |
---|---|---|
state | "sql_execution_start" | SQL execution has begun. |
sql | string | The SQL query being executed. |
πΉ sql_execution_end
Field | Type | Description |
---|---|---|
state | "sql_execution_end" | SQL execution completed. No additional fields. |
Example
data: {
"type": "state",
"data": {
"state": "sql_generation_start",
"question": "list 5 customers",
"threadId": "0625991d-1bba-407d-8ad4-dd0210172484",
"language": "English"
},
"timestamp": 1751014954142
}
data: {
"type": "state",
"data": {
"state": "sql_generation_understanding",
"pollCount": 1,
"rephrasedQuestion": null,
"intentReasoning": null,
"sqlGenerationReasoning": null,
"retrievedTables": null,
"invalidSql": null,
"traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
},
"timestamp": 1751014954165
}
data: {
"type": "state",
"data": {
"state": "sql_generation_searching",
"pollCount": 4,
"rephrasedQuestion": "List 5 customers from the olist_customers_dataset table.",
"intentReasoning": "User wants to retrieve specific customer data, likely using SQL query.",
"sqlGenerationReasoning": null,
"retrievedTables": null,
"invalidSql": null,
"traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
},
"timestamp": 1751014957183
}
data: {
"type": "state",
"data": {
"state": "sql_generation_planning",
"pollCount": 6,
"rephrasedQuestion": "List 5 customers from the olist_customers_dataset table.",
"intentReasoning": "User wants to retrieve specific customer data, likely using SQL query.",
"sqlGenerationReasoning": null,
"retrievedTables": ["olist_customers_dataset"],
"invalidSql": null,
"traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
},
"timestamp": 1751014959232
}
data: {
"type": "state",
"data": {
"state": "sql_generation_generating",
"pollCount": 9,
"rephrasedQuestion": "List 5 customers from the olist_customers_dataset table.",
"intentReasoning": "User wants to retrieve specific customer data, likely using SQL query.",
"sqlGenerationReasoning": "1. **Identify the table involved**: The question asks for customer data, so the relevant table is `olist_customers_dataset`.\n\n2. **Determine the number of records needed**: The user requests 5 customers, so we need to select 5 entries from the table.",
"retrievedTables": ["olist_customers_dataset"],
"invalidSql": null,
"traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
},
"timestamp": 1751014962254
}
data: {
"type": "state",
"data": {
"state": "sql_generation_success",
"sql": "SELECT \"o\".\"customer_id\", \"o\".\"customer_zip_code_prefix\", \"o\".\"customer_city\", \"o\".\"customer_state\" FROM \"olist_customers_dataset\" AS \"o\" LIMIT 5"
},
"timestamp": 1751014963263
}
data: {
"type": "state",
"data": {
"state": "sql_execution_start",
"sql": "SELECT \"o\".\"customer_id\", \"o\".\"customer_zip_code_prefix\", \"o\".\"customer_city\", \"o\".\"customer_state\" FROM \"olist_customers_dataset\" AS \"o\" LIMIT 5"
},
"timestamp": 1751014963263
}
data: {
"type": "state",
"data": {
"state": "sql_execution_end"
},
"timestamp": 1751014963339
}
message_stop
Purpose
: Marks the end of the entire stream. Includes processingduration
andthreadId
.
{
"type": "message_stop",
"data": {
"threadId": "0625991d-1bba-407d-8ad4-dd0210172484",
"duration": 11096
}
}
Error Handling
During a streaming session, if an error occurs at any stage of the pipeline, the server will emit a special event of type: "error"
.
This allows clients to gracefully detect and handle errors (e.g., show error messages in UI, retry logic, cancel streaming).
Error Event Structure
{
"type": "error",
"data": {
"code": "INVALID_SQL",
"error": "Unrecognized token near FROM",
"traceId": "f218b1f7-4623-4a56-8b66-18d544797b20"
},
"timestamp": 1751015028888
}