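This document describes all API endpoints of the DataOps platform's data parsing module, covering business card parsing, hotel position management, talent tag management, and knowledge graph queries.
Base URL: http://your-domain/api/data_parse
Function: Parses a business card image and extracts its information without saving anything to the database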
POST /business-card-parse
Parameter | Type | Required | Description |
---|---|---|---|
image | File | Yes | Business card image file (multipart/form-data) |
{
"code": 200,
"success": true,
"message": "名片图片解析成功",
"data": {
"name_zh": "张三",
"name_en": "John Doe",
"title_zh": "总经理",
"title_en": "General Manager",
"mobile": "13800138000",
"phone": "021-12345678",
"email": "john.doe@example.com",
"hotel_zh": "上海希尔顿酒店",
"hotel_en": "Shanghai Hilton Hotel",
"address_zh": "上海市浦东新区...",
"address_en": "Pudong New Area, Shanghai...",
"postal_code_zh": "200000",
"postal_code_en": "200000",
"brand_zh": "希尔顿",
"brand_en": "Hilton",
"affiliation_zh": "希尔顿集团",
"affiliation_en": "Hilton Group",
"birthday": "1990-01-01",
"residence": "上海市浦东新区张江高科技园区",
"brand_group": "希尔顿,万豪",
"career_path": []
}
}
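Field | Type | Description |
---|---|---|
name_zh | String | Name in Chinese |
name_en | String | Name in English |
title_zh | String | Job title in Chinese |
title_en | String | Job title in English |
mobile | String | Mobile phone number |
phone | String | Landline number |
email | String | Email address |
hotel_zh | String | Hotel/company name in Chinese |
hotel_en | String | Hotel/company name in English |
address_zh | String | Address in Chinese |
address_en | String | Address in English |
postal_code_zh | String | Postal code (Chinese) |
postal_code_en | String | Postal code (English) |
brand_zh | String | Brand name in Chinese |
brand_en | String | Brand name in English |
affiliation_zh | String | Affiliation in Chinese |
affiliation_en | String | Affiliation in English |
birthday | String | Birthday, in YYYY-MM-DD format |
residence | String | Residential address |
brand_group | String | Brand portfolio; multiple brands are separated by commas |
career_path | Array | Career path, as a JSON array |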
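The field formats above (a YYYY-MM-DD birthday, a comma-separated brand_group, an array career_path) can be normalized on the client before further use. The following is a minimal sketch, not part of the API: `normalize_card` is a hypothetical helper name, and it assumes brands are separated by ASCII commas as in the sample response.

```python
from datetime import datetime


def normalize_card(data):
    """Return a copy of the parsed card with typed birthday/brand_group/career_path.

    Hypothetical client-side helper; the API itself returns plain strings.
    """
    card = dict(data)

    # birthday is documented as YYYY-MM-DD; missing or empty values become None
    raw_birthday = card.get("birthday")
    card["birthday"] = (
        datetime.strptime(raw_birthday, "%Y-%m-%d").date() if raw_birthday else None
    )

    # brand_group is a single comma-separated string (ASCII comma assumed)
    raw_brands = card.get("brand_group") or ""
    card["brand_group"] = [b.strip() for b in raw_brands.split(",") if b.strip()]

    # career_path is a JSON array; treat a missing value as an empty list
    card["career_path"] = list(card.get("career_path") or [])
    return card
```

For the sample response, `normalize_card(response.json()["data"])` would yield a `datetime.date` for birthday and a two-element list for brand_group.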
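Status code | Description |
---|---|
200 | Parsed successfully |
400 | Invalid request parameters (no file uploaded, wrong file type, etc.) |
500 | Server error or parsing failure |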
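Because the response wraps its result in a `code` / `success` / `message` / `data` envelope, a small unwrapping helper can keep calling code tidy. This is a sketch only: `ApiError` and `unwrap` are hypothetical names, and it assumes error responses use the same envelope with `success` set to false, which this document does not show.

```python
class ApiError(Exception):
    """Raised when the response envelope reports a failure (hypothetical helper)."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def unwrap(payload):
    """Return payload["data"] for a 200 success envelope, otherwise raise ApiError."""
    code = payload.get("code")
    if payload.get("success") and code == 200:
        return payload.get("data")
    # Assumption: failures carry the same code/message keys as successes
    raise ApiError(code, payload.get("message", ""))
```

Typical use would be `card = unwrap(response.json())`, with `ApiError.code` distinguishing 400 from 500.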
curl -X POST \
http://your-domain/api/data_parse/business-card-parse \
-H 'Content-Type: multipart/form-data' \
-F 'image=@business_card.jpg'
import requests
url = "http://your-domain/api/data_parse/business-card-parse"
# Open the image in a context manager so the file handle is always closed
with open('business_card.jpg', 'rb') as image_file:
    response = requests.post(url, files={'image': image_file})
print(response.json())
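Function: Saves business card information to the database, running the full business logic including duplicate checking and MinIO upload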
POST /add-business-card
Option 1: JSON Body
{
"name_zh": "张三",
"name_en": "John Doe",
"title_zh": "总经理",
"title_en": "General Manager",
"mobile": "13800138000",
"phone": "021-12345678",
"email": "john.doe@example.com",
"hotel_zh": "上海希尔顿酒店",
"hotel_en": "Shanghai Hilton Hotel",
"address_zh": "上海市浦东新区...",
"address_en": "Pudong New Area, Shanghai...",
"postal_code_zh": "200000",
"postal_code_en": "200000",
"brand_zh": "希尔顿",
"brand_en": "Hilton",
"affiliation_zh": "希尔顿集团",
"affiliation_en": "Hilton Group",
"birthday": "1990-01-01",
"residence": "上海市浦东新区张江高科技园区",
"brand_group": "希尔顿,万豪",
"career_path": []
}
Option 2: Form-Data
Parameter | Type | Required | Description |
---|---|---|---|
card_data | String | Yes | Business card data as a JSON string |
image | File | No | Business card image file |
New record created successfully:
{
"code": 200,
"success": true,
"message": "名片信息保存成功。未找到同名记录,创建新记录",
"data": {
"id": 123,
"name_zh": "张三",
"name_en": "John Doe",
"birthday": "1990-01-01",
"residence": "上海市浦东新区张江高科技园区",
"created_at": "2024-01-01 12:00:00",
"image_path": "abc123.jpg",
...
}
}
Suspected duplicate records found:
{
"code": 202,
"success": true,
"message": "创建新记录成功,发现疑似重复记录待处理",
"data": {
"main_card": { ... },
"duplicate_record_id": 45,
"suspected_duplicates_count": 2,
"processing_status": "pending"
}
}
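Status code | Description |
---|---|
200 | Record created or updated successfully |
202 | Record created, but suspected duplicates were found |
400 | Invalid request parameters |
500 | Server error |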
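Since a 202 response has a different `data` shape from a 200 response, callers may want to branch on the code before reading fields. The sketch below illustrates one way to do that; `summarize_save_result` is a hypothetical name, and reading `main_card["id"]` assumes the elided `main_card` object carries an `id` like other card objects in this document.

```python
def summarize_save_result(payload):
    """Reduce an add-business-card response to the fields a caller usually needs."""
    code = payload.get("code")
    data = payload.get("data") or {}

    if code == 200:
        # Plain success: data is the saved card itself
        return {"outcome": "saved", "card_id": data.get("id"), "needs_review": False}

    if code == 202:
        # Saved, but suspected duplicates are waiting to be processed
        main_card = data.get("main_card") or {}
        return {
            "outcome": "saved_with_duplicates",
            "card_id": main_card.get("id"),  # assumption: main_card has an id
            "duplicate_record_id": data.get("duplicate_record_id"),
            "suspected_count": data.get("suspected_duplicates_count", 0),
            "needs_review": True,
        }

    raise ValueError(f"save failed with code {code}: {payload.get('message', '')}")
```

A caller could route results with `needs_review` set to a duplicate-handling queue and treat everything else as done.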
# Option 1: plain JSON
import requests
url = "http://your-domain/api/data_parse/add-business-card"
data = {
"name_zh": "张三",
"mobile": "13800138000",
"hotel_zh": "上海希尔顿酒店",
"birthday": "1990-01-01",
"residence": "上海市浦东新区张江高科技园区"
}
response = requests.post(url, json=data)
print(response.json())
# Option 2: with an image file
import json
url = "http://your-domain/api/data_parse/add-business-card"
card_data = {
"name_zh": "张三",
"mobile": "13800138000",
"birthday": "1990-01-01",
"residence": "上海市浦东新区张江高科技园区"
}
# card_data travels as a JSON string in a form field
data = {'card_data': json.dumps(card_data)}
# The context manager closes the image file after the upload
with open('business_card.jpg', 'rb') as image_file:
    response = requests.post(url, files={'image': image_file}, data=data)
print(response.json())
GET /get-business-cards
{
"code": 200,
"success": true,
"message": "获取名片列表成功",
"data": [
{
"id": 1,
"name_zh": "张三",
"name_en": "John Doe",
"mobile": "13800138000",
"birthday": "1990-01-01",
"residence": "上海市浦东新区张江高科技园区",
"created_at": "2024-01-01 12:00:00",
...
}
]
}
GET /get-business-card/{card_id}
Parameter | Type | Description |
---|---|---|
card_id | Integer | Business card record ID |
curl http://your-domain/api/data_parse/get-business-card/123
PUT /business-cards/{card_id}
Parameter | Type | Description |
---|---|---|
card_id | Integer | Business card record ID |
{
"name_zh": "李四",
"title_zh": "副总经理",
"mobile": "13900139000",
"birthday": "1985-06-15",
"residence": "北京市朝阳区建国门外大街"
}
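Function: Permanently deletes a business card record, including the PostgreSQL database record, the image file stored in MinIO, and the related nodes and relationships in the Neo4j graph database
DELETE /delete-business-card/{card_id}
Parameter | Type | Description |
---|---|---|
card_id | Integer | Business card record ID |
PostgreSQL database cleanup:
- The record with the given ID in the business_cards table
- Related records in the duplicate_business_cards table that have this ID as main_card_id
MinIO file storage cleanup:
- The image file stored for the record
Neo4j graph database cleanup:
- talent nodes whose pg_id equals the given ID, along with their relationships
Fully successful deletion: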
{
"code": 200,
"success": true,
"message": "名片记录删除成功",
"data": {
"id": 123,
"name_zh": "张三",
"name_en": "John Doe",
"mobile": "13800138000",
"birthday": "1990-01-01",
"residence": "上海市浦东新区张江高科技园区",
"image_path": "abc123.jpg",
"created_at": "2024-01-01 12:00:00",
"status": "active"
}
}
Partially successful deletion:
{
"code": 206,
"success": true,
"message": "名片记录删除成功,但Neo4j图数据库清理失败: 连接超时",
"data": {
"id": 123,
"name_zh": "张三",
...
}
}
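Status code | Description |
---|---|
200 | All related data deleted successfully |
206 | Partial success (PostgreSQL deletion succeeded, but Neo4j deletion failed) |
400 | Invalid parameter (invalid card_id) |
404 | No business card record found with the given ID |
500 | Deletion failed |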
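The 206 case is easy to overlook because `success` is still true. A caller might map the codes to explicit outcomes, as in this sketch; `DeleteOutcome` and `classify_delete` are hypothetical names, and the sketch assumes the `code` field in the body mirrors the status codes listed above, as it does in the sample responses.

```python
from enum import Enum


class DeleteOutcome(Enum):
    DELETED = "deleted"      # 200: everything was removed
    PARTIAL = "partial"      # 206: PostgreSQL cleaned, Neo4j cleanup failed
    NOT_FOUND = "not_found"  # 404: no card with that ID
    FAILED = "failed"        # 400, 500, or anything unexpected


_CODE_TO_OUTCOME = {
    200: DeleteOutcome.DELETED,
    206: DeleteOutcome.PARTIAL,
    404: DeleteOutcome.NOT_FOUND,
}


def classify_delete(payload):
    """Return (outcome, message) for a delete-business-card response body."""
    outcome = _CODE_TO_OUTCOME.get(payload.get("code"), DeleteOutcome.FAILED)
    return outcome, payload.get("message", "")
```

On `DeleteOutcome.PARTIAL`, the message carries the reason for the Neo4j failure, so it is worth logging rather than discarding.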
curl -X DELETE \
http://your-domain/api/data_parse/delete-business-card/123
import requests
url = "http://your-domain/api/data_parse/delete-business-card/123"
response = requests.delete(url)
print(response.json())
⚠️ Warning: This operation is irreversible and deleted data cannot be recovered. Before deleting, it is recommended to:
PUT /update-business-cards/{card_id}/status
{
"status": "inactive"
}
Status value | Description |
---|---|
active | Active |
inactive | Inactive |
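As only two status values are listed, validating the value on the client avoids a round trip for typos. The following sketch builds the request pieces without sending anything; `build_status_update` is a hypothetical helper and `BASE_URL` reuses the placeholder base URL from this document.

```python
BASE_URL = "http://your-domain/api/data_parse"  # placeholder from this document
ALLOWED_STATUSES = {"active", "inactive"}


def build_status_update(card_id, status):
    """Return (method, url, json_body) for a status update, validating the status."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"status must be one of {sorted(ALLOWED_STATUSES)}, got {status!r}")
    url = f"{BASE_URL}/update-business-cards/{int(card_id)}/status"
    return "PUT", url, {"status": status}
```

The result can be passed to an HTTP client, for example `requests.request(method, url, json=body)`.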
GET /business-cards/image/{image_path}
Parameter | Type | Description |
---|---|---|
image_path | String | Image path in MinIO |
curl http://your-domain/api/data_parse/business-cards/image/abc123.jpg
GET /get-hotel-positions-list
{
"success": true,
"message": "获取酒店职位列表成功",
"data": [
{
"id": 1,
"department_zh": "前厅部",
"department_en": "Front Office",
"position_zh": "前台经理",
"position_en": "Front Office Manager",
"position_abbr": "FOM",
"level_zh": "中层",
"level_en": "Middle Management",
"status": "active"
}
],
"count": 50
}
POST /add-hotel-positions
{
"department_zh": "前厅部",
"department_en": "Front Office",
"position_zh": "前台经理",
"position_en": "Front Office Manager",
"position_abbr": "FOM",
"level_zh": "中层",
"level_en": "Middle Management",
"created_by": "admin",
"status": "active"
}
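Field | Description |
---|---|
department_zh | Department name in Chinese |
department_en | Department name in English |
position_zh | Position name in Chinese |
position_en | Position name in English |
level_zh | Job level name in Chinese |
level_en | Job level name in English |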
Status code | Description |
---|---|
201 | Created successfully |
400 | Invalid parameters |
409 | Record already exists |
500 | Server error |
PUT /update-hotel-positions/{position_id}
GET /query-hotel-positions/{position_id}
DELETE /delete-hotel-positions/{position_id}
GET /get-hotel-group-brands-list
{
"success": true,
"message": "获取酒店集团品牌列表成功",
"data": [
{
"id": 1,
"group_name_en": "Hilton Worldwide",
"group_name_zh": "希尔顿集团",
"brand_name_en": "Hilton Hotels & Resorts",
"brand_name_zh": "希尔顿酒店",
"positioning_level_en": "Luxury",
"positioning_level_zh": "奢华",
"status": "active"
}
],
"count": 25
}
POST /add-hotel-group-brands
{
"group_name_en": "Marriott International",
"group_name_zh": "万豪国际",
"brand_name_en": "The Ritz-Carlton",
"brand_name_zh": "丽思卡尔顿",
"positioning_level_en": "Luxury",
"positioning_level_zh": "奢华"
}
POST /create-talent-tag
{
"name": "酒店管理",
"category": "人才技能",
"description": "具备酒店运营管理经验",
"status": "active"
}
GET /get-talent-tag-list
{
"success": true,
"message": "获取人才标签列表成功",
"data": [
{
"id": 123,
"name": "酒店管理",
"en_name": "Hotel Management",
"category": "人才技能",
"description": "具备酒店运营管理经验",
"status": "active",
"time": "2024-01-01 12:00:00"
}
]
}
PUT /update-talent-tag/{tag_id}
DELETE /delete-talent-tag/{tag_id}
GET /talent-get-tags/{talent_id}
Parameter | Type | Description |
---|---|---|
talent_id | Integer | PostgreSQL ID of the talent node |
{
"success": true,
"message": "获取人才标签成功",
"data": [
{
"talent": 12345,
"tag": "酒店管理"
},
{
"talent": 12345,
"tag": "市场营销"
}
]
}
POST /talent-update-tags
[
{
"talent": 12345,
"tag": "酒店管理"
},
{
"talent": 12345,
"tag": "市场营销"
},
{
"talent": 12345,
"tag": "团队领导"
}
]
{
"code": 200,
"success": true,
"message": "成功创建或更新了 3 个标签关系",
"data": {
"success_count": 3,
"total_count": 3,
"failed_items": []
}
}
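The request body for /talent-update-tags is a flat list with one object per talent-tag pair, so it can be generated from a talent ID and a list of tag names. This sketch uses hypothetical helper names (`build_tag_relations`, `all_tags_applied`); dropping blank and repeated tags is a client-side choice, not documented server behavior.

```python
def build_tag_relations(talent_id, tags):
    """Build the request body: one {"talent", "tag"} object per distinct tag name."""
    seen = set()
    relations = []
    for tag in tags:
        name = tag.strip()
        # Skip blanks and repeats so each relation is sent once
        if name and name not in seen:
            seen.add(name)
            relations.append({"talent": talent_id, "tag": name})
    return relations


def all_tags_applied(payload):
    """True when the response reports every relation as created or updated."""
    data = payload.get("data") or {}
    return (
        data.get("success_count") == data.get("total_count")
        and not data.get("failed_items")
    )
```

For example, `build_tag_relations(12345, ["酒店管理", "市场营销", "团队领导"])` reproduces the three-item request body shown above.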
POST /query-kg
{
"query_requirement": "查找具有五星级酒店和总经理经验的人才"
}
{
"code": 200,
"success": true,
"message": "查询成功执行",
"query": "MATCH (t:talent)-[:BELONGS_TO]->(dl:data_label) WHERE dl.name IN ['五星级酒店', '总经理'] WITH t, COLLECT(DISTINCT dl.name) AS labels WHERE size(labels) = 2 RETURN t.pg_id as pg_id, t.name_zh as name_zh, t.name_en as name_en, t.mobile as mobile, t.email as email, t.updated_at as updated_at",
"matched_labels": ["五星级酒店", "总经理"],
"data": [
{
"pg_id": 123,
"name_zh": "张三",
"name_en": "John Doe",
"mobile": "13800138000",
"email": "john.doe@example.com",
"updated_at": "2024-01-01 12:00:00"
}
]
}
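The /query-kg response returns the generated Cypher in `query`, the labels it matched in `matched_labels`, and the rows in `data`. A client that only needs the PostgreSQL IDs of the matched talents could extract them as sketched below; both function names are hypothetical.

```python
import json


def build_kg_query(requirement):
    """Serialize the natural-language requirement into the /query-kg request body."""
    text = requirement.strip()
    if not text:
        raise ValueError("query_requirement must not be empty")
    # ensure_ascii=False keeps Chinese text readable in logs
    return json.dumps({"query_requirement": text}, ensure_ascii=False)


def matched_talent_ids(payload):
    """Collect pg_id from each result row, skipping rows without one."""
    rows = payload.get("data") or []
    return [row["pg_id"] for row in rows if "pg_id" in row]
```

The returned IDs can then be used with the business card endpoints that take a record ID.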
GET /get-duplicate-records?status=pending
Parameter | Type | Allowed values | Description |
---|---|---|---|
status | String | pending, processed, ignored | Filters records by the given status |
{
"success": true,
"message": "获取重复记录列表成功",
"data": [
{
"id": 1,
"main_card_id": 123,
"suspected_duplicates": [
{
"id": 124,
"name_zh": "张三",
"mobile": "13900139000"
}
],
"duplicate_reason": "姓名相同但手机号码不同",
"processing_status": "pending",
"created_at": "2024-01-01 12:00:00",
"main_card": {
"id": 123,
"name_zh": "张三",
"name_en": "John Doe",
"mobile": "13800138000",
"birthday": "1990-01-01",
"residence": "上海市浦东新区张江高科技园区",
"created_at": "2024-01-01 11:30:00",
...
}
}
],
"count": 5
}
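The `status` filter is passed as a query-string parameter. A minimal sketch of building the URL with the standard library follows; `duplicate_records_url` is a hypothetical helper, `BASE_URL` is the placeholder base URL from this document, and treating the filter as optional is an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "http://your-domain/api/data_parse"  # placeholder from this document
VALID_STATUSES = ("pending", "processed", "ignored")


def duplicate_records_url(status=None):
    """Return the list URL, adding ?status=... when a filter is given."""
    url = f"{BASE_URL}/get-duplicate-records"
    if status is None:
        # Assumption: omitting the filter returns records of every status
        return url
    if status not in VALID_STATUSES:
        raise ValueError(f"status must be one of {VALID_STATUSES}, got {status!r}")
    return f"{url}?{urlencode({'status': status})}"
```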
POST /process-duplicate-record/{duplicate_id}
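Parameter | Type | Description |
---|---|---|
duplicate_id | Integer | Business card record ID (corresponds to the main_card_id field in the DuplicateBusinessCard table) |
⚠️ Important:
- duplicate_id is the business card record ID (the primary key of the BusinessCard table)
- The endpoint looks up the duplicate record in the DuplicateBusinessCard table whose main_card_id field matches this ID
- It is not the primary key ID of the DuplicateBusinessCard table
{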
"action": "merge_to_suspected",
"selected_duplicate_id": 124,
"processed_by": "admin",
"notes": "确认为同一人,合并记录"
}
Action | Description |
---|---|
merge_to_suspected | Merge into the selected suspected duplicate record |
keep_main | Keep the main record without merging |
ignore | Ignore and mark as processed |
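The three actions can be checked on the client before the request is sent. The sketch below is illustrative only: `build_process_request` is a hypothetical helper, and requiring `selected_duplicate_id` only for `merge_to_suspected` is an assumption drawn from the action description, not a documented rule.

```python
VALID_ACTIONS = {"merge_to_suspected", "keep_main", "ignore"}


def build_process_request(action, processed_by, notes="", selected_duplicate_id=None):
    """Build the JSON body for /process-duplicate-record/{duplicate_id}."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action {action!r}")
    # Assumption: a merge needs to know which suspected duplicate to merge into
    if action == "merge_to_suspected" and selected_duplicate_id is None:
        raise ValueError("merge_to_suspected requires selected_duplicate_id")

    body = {"action": action, "processed_by": processed_by, "notes": notes}
    if selected_duplicate_id is not None:
        body["selected_duplicate_id"] = selected_duplicate_id
    return body
```

The returned dictionary can be sent as the JSON body of the POST request.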
Processed successfully:
{
"code": 200,
"success": true,
"message": "重复记录处理成功,操作: merge_to_suspected",
"data": {
"duplicate_record": {
"id": 1,
"main_card_id": 123,
"processing_status": "processed",
"processed_at": "2024-01-01 15:30:00",
"processed_by": "admin",
"processing_notes": "确认为同一人,合并记录"
},
"result": {
"id": 124,
"name_zh": "张三",
"name_en": "John Doe",
"birthday": "1990-01-01",
"residence": "上海市浦东新区张江高科技园区",
"updated_at": "2024-01-01 15:30:00",
...
}
}
}
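Status code | Description |
---|---|
200 | Processed successfully |
400 | Invalid parameters, or the duplicate record's status does not allow processing |
404 | Duplicate record or target record not found |
500 | Processing failed |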
curl -X POST \
http://your-domain/api/data_parse/process-duplicate-record/123 \
-H 'Content-Type: application/json' \
-d '{
"action": "merge_to_suspected",
"selected_duplicate_id": 124,
"processed_by": "admin",
"notes": "确认为同一人,合并记录"
}'
import requests
url = "http://your-domain/api/data_parse/process-duplicate-record/123"
data = {
"action": "merge_to_suspected",
"selected_duplicate_id": 124,
"processed_by": "admin",
"notes": "确认为同一人,合并记录"
}
response = requests.post(url, json=data)
print(response.json())
GET /get-duplicate-record-detail/{duplicate_id}
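Parameter | Type | Description |
---|---|---|
duplicate_id | Integer | Business card record ID (corresponds to the main_card_id field in the DuplicateBusinessCard table) |
⚠️ Important:
- duplicate_id is the business card record ID (the primary key of the BusinessCard table)
- The endpoint looks up the duplicate record in the DuplicateBusinessCard table whose main_card_id field matches this ID
{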
"code": 200,
"success": true,
"message": "获取重复记录详情成功",
"data": {
"id": 1,
"main_card_id": 123,
"suspected_duplicates": [
{
"id": 124,
"name_zh": "张三",
"name_en": "John Doe",
"mobile": "13900139000",
"hotel_zh": "北京希尔顿酒店",
"created_at": "2024-01-01 10:00:00"
},
{
"id": 125,
"name_zh": "张三",
"name_en": "John Doe",
"mobile": "13700137000",
"hotel_zh": "广州万豪酒店",
"created_at": "2024-01-01 09:00:00"
}
],
"duplicate_reason": "姓名相同但手机号码不同:张三,新手机号:13800138000,发现2条疑似重复记录",
"processing_status": "pending",
"created_at": "2024-01-01 12:00:00",
"processed_at": null,
"processed_by": null,
"processing_notes": null,
"main_card": {
"id": 123,
"name_zh": "张三",
"name_en": "John Doe",
"mobile": "13800138000",
"birthday": "1990-01-01",
"residence": "上海市浦东新区张江高科技园区",
"hotel_zh": "上海希尔顿酒店",
"created_at": "2024-01-01 12:00:00",
...
}
}
}
Status code | Description |
---|---|
200 | Retrieved successfully |
404 | No matching duplicate record found |
500 | Retrieval failed |
curl http://your-domain/api/data_parse/get-duplicate-record-detail/123
import requests
url = "http://your-domain/api/data_parse/get-duplicate-record-detail/123"
response = requests.get(url)
print(response.json())
GET /test-minio-connection
{
"success": true,
"message": "连接MinIO服务器成功,存储桶 dataops-bucket 存在",
"config": {
"host": "192.168.3.143:9000",
"bucket": "dataops-bucket",
"secure": false
}
}
POST /parse
{
"text": "这是测试数据"
}
Function: Repairs broken records in the duplicate_business_cards table whose main_card_id is null
POST /fix-broken-duplicate-records
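This endpoint repairs data integrity problems that can arise when duplicate records are merged. When a merge deletes the main record and the foreign key constraint is not handled properly, the main_card_id field in the duplicate-records table can become null, violating the database's NOT NULL constraint.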
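The failure mode described above can be reproduced in miniature. The sketch below is not the platform's implementation: it uses an in-memory SQLite database as a stand-in for PostgreSQL, a simplified two-column schema, and a nullable main_card_id with ON DELETE SET NULL (all assumptions) to show how deleting a main card orphans a duplicate record, and how a cleanup pass finds and removes such rows.

```python
import sqlite3


def demo_fix_broken_rows():
    """Orphan a duplicate record by deleting its main card, then clean it up."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.execute("CREATE TABLE business_cards (id INTEGER PRIMARY KEY, name_zh TEXT)")
    conn.execute(
        "CREATE TABLE duplicate_business_cards ("
        "id INTEGER PRIMARY KEY, "
        "main_card_id INTEGER REFERENCES business_cards(id) ON DELETE SET NULL)"
    )
    conn.execute("INSERT INTO business_cards (id, name_zh) VALUES (123, '张三')")
    conn.execute("INSERT INTO duplicate_business_cards (id, main_card_id) VALUES (1, 123)")

    # Deleting the main card nulls main_card_id in the duplicate record
    conn.execute("DELETE FROM business_cards WHERE id = 123")

    broken = conn.execute(
        "SELECT id FROM duplicate_business_cards WHERE main_card_id IS NULL"
    ).fetchall()

    # The repair step: remove every record whose main_card_id is null
    cursor = conn.execute("DELETE FROM duplicate_business_cards WHERE main_card_id IS NULL")
    fixed = cursor.rowcount
    conn.commit()
    conn.close()
    return {"fixed_count": fixed, "total_broken": len(broken)}
```

The returned dictionary mirrors the `fixed_count` and `total_broken` fields of the endpoint's response.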
- Broken records whose main_card_id is null
Successfully repaired:
{
"code": 200,
"success": true,
"message": "成功修复并删除了2条损坏的重复记录",
"data": {
"fixed_count": 2,
"total_broken": 2,
"deleted_records": [
{
"id": 1,
"duplicate_reason": "姓名相同但手机号码不同:洪松,新手机号:+86 ...",
"processing_status": "processed",
"created_at": "2025-06-10 11:35:35",
"processed_at": "2025-06-10 16:18:53"
}
]
}
}
Nothing to repair:
{
"code": 200,
"success": true,
"message": "没有发现需要修复的损坏记录",
"data": {
"fixed_count": 0,
"total_broken": 0
}
}
Status code | Description |
---|---|
200 | Repair succeeded |
500 | Repair failed |
curl -X POST \
http://your-domain/api/data_parse/fix-broken-duplicate-records
import requests
url = "http://your-domain/api/data_parse/fix-broken-duplicate-records"
response = requests.post(url)
print(response.json())
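⚠️ Important reminder:
- This endpoint deletes broken records from the duplicate_business_cards table
- The root cause has been fixed in the process_duplicate_record function
{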
"name_zh": "王经理",
"name_en": "Manager Wang",
"title_zh": "总经理",
"title_en": "General Manager",
"mobile": "13812345678",
"phone": "021-88888888",
"email": "wang.manager@hotelgroup.com",
"hotel_zh": "上海国际大酒店",
"hotel_en": "Shanghai International Hotel",
"address_zh": "上海市黄浦区南京东路100号",
"address_en": "100 Nanjing East Road, Huangpu District, Shanghai",
"postal_code_zh": "200001",
"postal_code_en": "200001",
"brand_zh": "国际酒店集团",
"brand_en": "International Hotel Group",
"birthday": "1985-03-15",
"residence": "上海市黄浦区南京西路88号",
"brand_group": "希尔顿,万豪,洲际"
}
{
"department_zh": "客房部",
"department_en": "Housekeeping",
"position_zh": "客房部经理",
"position_en": "Housekeeping Manager",
"position_abbr": "HKM",
"level_zh": "中层管理",
"level_en": "Middle Management"
}
{
"name": "奢华酒店经验",
"category": "人才经验",
"description": "具备奢华酒店运营管理经验,熟悉高端客户服务标准"
}
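The duplicate_id parameter takes the business card record ID, not the primary key ID of the duplicate-records table.
Document version: v1.1
Last updated: June 2025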