An open-source Java 17 Text2SQL SDK with the core reasoning chain, Spring Boot starter, and runnable examples. It relies on Qdrant (vector search), Elasticsearch (keyword search + few-shot recall), and pluggable LLMs.
- Hybrid search: Qdrant vector search over column embeddings plus Elasticsearch BM25 keyword search, with metadata filtering and reranking.
- NL2SQL pipeline: rewrite → reasoning → SQL generation, with a fallback template generator.
- Multi-provider support: 5 LLM providers and 4+ embedding providers.
- Caching & safety: local TTL cache, SQL injection detection, and automatic LIMIT protection.
- Extensible: modular schema extraction, vectorization, search, and generators for easy replacement.
- Spring ecosystem: starter provides configuration binding, auto-configuration, and health checks.
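
The hybrid search feature merges a vector result list with a keyword result list before reranking. The source does not say which fusion method the SDK uses, so the sketch below uses reciprocal rank fusion (RRF) as a stand-in. The class `HybridRankFusion`, its `fuse` method, and the sample column names are hypothetical and are not part of the SDK.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the SDK's API. */
final class HybridRankFusion {

    /**
     * Reciprocal rank fusion: each id scores the sum of 1 / (k + rank)
     * over every list it appears in, with rank starting at 1.
     */
    static List<String> fuse(List<List<String>> rankedLists, int k) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranked : rankedLists) {
            for (int i = 0; i < ranked.size(); i++) {
                scores.merge(ranked.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        List<String> fused = new ArrayList<>(scores.keySet());
        // Highest fused score first; ties are broken alphabetically so the order is deterministic.
        fused.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return fused;
    }

    public static void main(String[] args) {
        // Assumed sample hits: schema columns returned by each backend, best first.
        List<String> vectorHits = List.of("orders.amount", "orders.created_at", "products.name");
        List<String> keywordHits = List.of("orders.created_at", "users.email");
        System.out.println(fuse(List.of(vectorHits, keywordHits), 60));
    }
}
```

RRF only needs ranks, not raw scores, which makes it convenient when cosine similarities and BM25 scores are on different scales.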

Supported LLM providers:

| Provider | Models | Notes |
|---|---|---|
| OpenAI | gpt-4, gpt-4-turbo, gpt-3.5-turbo | Standard OpenAI API |
| Deepseek | deepseek-chat, deepseek-coder, deepseek-reasoner | OpenAI-compatible |
| Ollama | llama3, qwen2.5, mistral, codellama | Local deployment |
| SiliconFlow | Qwen/Qwen2.5-72B-Instruct, DeepSeek-V3 | 硅基流动, OpenAI-compatible |
| ZhipuAI | glm-4, glm-4-flash, glm-3-turbo | 智谱AI |

Supported embedding providers:

| Provider | Models | Dimension |
|---|---|---|
| OpenAI | text-embedding-3-small, text-embedding-3-large | 1536 / 3072 |
| Ollama | bge-large-zh, nomic-embed-text, mxbai-embed-large | 1024 / 768 / 1024 |
| SiliconFlow | BAAI/bge-large-zh-v1.5, BAAI/bge-m3 | 1024 |
| ZhipuAI | embedding-2, embedding-3 | 1024 / 2048 |
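
The dimension configured for Qdrant has to match the output size of the chosen embedding model, otherwise the vector store will reject the vectors. A minimal sketch of a pre-flight check follows, using a few values from the table above. `EmbeddingDimensionCheck` and its lookup map are hypothetical and are not an SDK registry.

```java
import java.util.Map;

/** Hypothetical pre-flight check; the SDK may validate this differently or not at all. */
final class EmbeddingDimensionCheck {

    // Illustrative subset of the table above (model name -> vector dimension).
    static final Map<String, Integer> KNOWN_DIMENSIONS = Map.of(
        "text-embedding-3-small", 1536,
        "text-embedding-3-large", 3072,
        "bge-large-zh", 1024,
        "nomic-embed-text", 768,
        "BAAI/bge-m3", 1024,
        "embedding-2", 1024);

    /** Throws if the model is known and the configured dimension disagrees with it. */
    static void check(String model, int configuredDimension) {
        Integer expected = KNOWN_DIMENSIONS.get(model);
        if (expected != null && expected != configuredDimension) {
            throw new IllegalArgumentException(
                "Model " + model + " produces " + expected
                    + "-dimensional vectors, but " + configuredDimension + " was configured");
        }
        // Unknown models pass through: this sketch cannot know their size.
    }

    public static void main(String[] args) {
        check("bge-large-zh", 1024); // matches the table, so no exception
        System.out.println("dimension check passed");
    }
}
```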

- `text2sql-core`: core logic and APIs (schema management, search, NL2SQL, cache, security).
- `text2sql-spring-boot-starter`: auto-configuration and health checks.
- `text2sql-examples`: basic usage, Spring Boot integration, and advanced samples.

- JDK 17, Maven 3.9+
- Elasticsearch (default index `text2sql_schemas`)
- Qdrant (default collection `text2sql_schemas`)
- One of the supported LLM providers
- One of the supported Embedding providers

Build the core module:

```bash
mvn -pl text2sql-core -am package -DskipTests
```

Pick an LLM provider:

```java
// The factory calls below are alternatives; keep only one in real code.
// Deepseek
LlmConfig llmConfig = LlmConfig.deepseek("sk-your-api-key");
// OpenAI
LlmConfig llmConfig = LlmConfig.openai("sk-your-api-key", "gpt-4");
// Ollama (local)
LlmConfig llmConfig = LlmConfig.ollama("qwen2.5:14b");
// SiliconFlow
LlmConfig llmConfig = LlmConfig.siliconflow("sf-your-api-key");
// ZhipuAI
LlmConfig llmConfig = LlmConfig.zhipu("your-api-key");
```

Build a client and run a query:

```java
Text2SqlClient client = Text2SqlClient.builder()
    .datasource(DatasourceConfig.builder()
        .name("demo")
        .url("jdbc:mysql://localhost:3306/demo")
        .username("user")
        .password("pass")
        .build())
    .qdrant(QdrantConfig.builder()
        .host("http://localhost")
        .port(6333)
        .embeddingUrl("http://localhost:11434/api/embeddings")
        .embeddingModel("bge-large-zh")
        .dimension(1024)
        .build())
    .elasticsearch(EsConfig.builder().host("http://localhost").port(9200).build())
    .llm(LlmConfig.deepseek(System.getenv("DEEPSEEK_API_KEY")))
    .buildAndInitialize();

// Generate SQL only
String sql = client.query("Total order amount in the last 7 days");
// Generate SQL and execute it against the datasource
QueryResponse result = client.queryAndExecute("Top-selling products", QueryOptions.defaults());
```

Spring Boot configuration:

```yaml
text2sql:
  llm:
    provider: DEEPSEEK # OPENAI, DEEPSEEK, OLLAMA, SILICONFLOW, ZHIPU
    api-key: ${DEEPSEEK_API_KEY}
    model-name: deepseek-chat
  qdrant:
    embeddingUrl: http://localhost:11434/api/embeddings
    embeddingModel: bge-large-zh
    dimension: 1024
```

- Java 17 + Lombok; run `mvn -pl text2sql-core -am package -DskipTests` before submitting.
- Core module keeps pluggable interfaces (cache, LLM, vector/keyword stores) lightweight.

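
To illustrate what a lightweight, pluggable cache interface with a local TTL implementation might look like, here is a minimal sketch. The names `SqlCache` and `TtlSqlCache` and their method signatures are assumptions made for this example; the SDK's actual interfaces may differ.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical cache contract: maps a natural-language question to generated SQL. */
interface SqlCache {
    Optional<String> get(String question);

    void put(String question, String sql);
}

/** In-memory implementation whose entries expire after a fixed time-to-live. */
final class TtlSqlCache implements SqlCache {

    private record Entry(String sql, long expiresAtMillis) {}

    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be tested without sleeping

    TtlSqlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    @Override
    public Optional<String> get(String question) {
        Entry entry = entries.get(question);
        if (entry == null) {
            return Optional.empty();
        }
        if (clock.getAsLong() >= entry.expiresAtMillis()) {
            // Remove only the stale entry we observed, not a fresher one written concurrently.
            entries.remove(question, entry);
            return Optional.empty();
        }
        return Optional.of(entry.sql());
    }

    @Override
    public void put(String question, String sql) {
        entries.put(question, new Entry(sql, clock.getAsLong() + ttlMillis));
    }

    public static void main(String[] args) {
        SqlCache cache = new TtlSqlCache(60_000, System::currentTimeMillis);
        cache.put("Top-selling products", "SELECT 1"); // placeholder SQL for the demo
        System.out.println(cache.get("Top-selling products").isPresent());
    }
}
```

Keeping the contract to two methods means a Redis-backed or no-op implementation can be swapped in without touching callers.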
Issues and PRs are welcome! See CONTRIBUTING.md for guidelines.
Apache License 2.0, see LICENSE.