ChatGPT+Vector database to build a privatized knowledge base The meaning of vector databases has been introduced.
This time, let’s go hands-on and take a look at the schema design and interaction flow first

1. Table Structure Design

1. MySQL Table Design

1. knowledge_base (Knowledge Base Summary Table)

1
2
3
4
5
6
7
8
9
10
11
CREATE TABLE `knowledge_base` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT COMMENT '知识库id',
`create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
`create_by` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT '' COMMENT '创建者',
`update_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间',
`update_by` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT '' COMMENT '更新者',
`name` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL COMMENT '知识库名称',
`description` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '知识库描述',
`vector_collection_name` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '向量数据库的表名',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='知识库总表';

2、knowledge_file(Knowledge base document management)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
CREATE TABLE `knowledge_file` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT COMMENT '文件id',
`create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
`create_by` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT '' COMMENT '创建者',
`update_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间',
`update_by` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT '' COMMENT '更新者',
`knowledge_id` bigint NOT NULL COMMENT '知识库id',
`file_name` varchar(65) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL COMMENT '文件名',
`oss_id` bigint NOT NULL COMMENT 'ossId',
`file_status` int NOT NULL DEFAULT '1' COMMENT '0向量处理中,1未激活,2已完成,3失败',
`fail_reason` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '失败原因',
`slice_type` int DEFAULT NULL COMMENT '切分类型:1分隔符,2字数',
`slice_value` varchar(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '切分规则数据',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='知识库文件管理';

4、knowledge_file_slice_vector(Knowledge base document slicing steering volume data table)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
CREATE TABLE `knowledge_file_slice_vector` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT COMMENT 'id',
`create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
`create_by` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT '' COMMENT '创建者',
`update_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间',
`update_by` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT '' COMMENT '更新者',
`knowledge_id` bigint DEFAULT NULL COMMENT '知识库id',
`knowledge_file_id` bigint DEFAULT NULL COMMENT '知识库文件id',
`slice_text` text CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci COMMENT '切片数据',
`vector_id` bigint DEFAULT NULL COMMENT '向量数据id',
PRIMARY KEY (`id`),
KEY `idx_knowledge` (`knowledge_id`),
KEY `idx_knpwledge_file` (`knowledge_file_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='知识库文件切片转向量数据表';

4、knowledge_usage_config (Knowledge base applications)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
CREATE TABLE `knowledge_usage_config` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT COMMENT 'id',
`create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
`create_by` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT '' COMMENT '创建者',
`update_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '修改时间',
`update_by` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT '' COMMENT '更新者',
`app_name` varchar(30) DEFAULT NULL COMMENT '应用配置名称',
`app_description` varchar(255) DEFAULT NULL COMMENT '应用配置描述',
`app_icon` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT '应用图标',
`prompts_config` text CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci COMMENT 'prompts模板',
`knowledge_id` bigint DEFAULT NULL COMMENT '知识库id',
`top_k` int DEFAULT NULL COMMENT 'topK',
`top_p` double DEFAULT NULL COMMENT 'topP',
`temperature` varchar(5) DEFAULT NULL COMMENT '温度',
`app_code` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT 'appCode',
`app_secret` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL COMMENT 'appSecret',
PRIMARY KEY (`id`),
KEY `idx_app` (`app_code`,`app_secret`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='知识库应用';

2. Vector database table design

**Note: A knowledge base corresponds to a vector data table **

Data Id, data title, data text, data vector eigenvalue

2. Interaction flow

online address:www.processon.com/diagraming/...

3. External calls

Refer to the interface entry parameter:

1
2
3
4
5
{
"textValue": "查询问题",
"appCode": "应用appCode",
"appSecret": "应用appSecret"
}

Refer to the interface out reference:

1
2
3
4
5
6
7
8
9
10
11
12
13
{
"code":200,
"msg":"操作成功",
"data":{
"result":"返回的结果",
"sourceVoList":[
{
"title":"来源标题",
"text":"来源内容"
}
]
}
}