DataOps Platform - Business Rules & Validation Standards
Overview
This document defines the core business rules, validation standards, and processing workflows for the DataOps platform. These rules ensure data integrity, consistent API behavior, and reliable business logic execution.
1. Data Validation Rules
1.1 General Field Validation
Rule ID: VALIDATION_001
Format Validation Rules
# Email validation
- Must match pattern: ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
- Invalid format generates ERROR
# Array fields validation
- Must be array type if present
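The format rules above can be sketched as a single validation function. This is a minimal illustration, not the platform's actual validator: the function name `validate_record`, its parameters, and the error message wording are assumptions; only the email pattern comes from VALIDATION_001.

```python
import re

# Pattern copied from VALIDATION_001.
EMAIL_PATTERN = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def validate_record(record, email_field="email", array_fields=()):
    """Return a list of ERROR strings; an empty list means the record passed."""
    errors = []
    email = record.get(email_field)
    if email is not None:
        # fullmatch rejects trailing newlines that "$" alone would tolerate.
        if not (isinstance(email, str) and EMAIL_PATTERN.fullmatch(email)):
            errors.append(f"ERROR: {email_field} has an invalid format")
    for name in array_fields:
        # Array fields are optional, but must be a list when present.
        if name in record and not isinstance(record[name], list):
            errors.append(f"ERROR: {name} must be an array")
    return errors
```

Returning a list rather than raising on the first problem lets a caller report every failed field in one response.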
2. API Response Standards
2.1 Standard Response Format
Rule ID: API_RESPONSE_001
All API responses MUST follow this structure:
{
"success": boolean,
"message": string,
"data": any,
"code": number (optional)
}
Success Response Example
{
"success": true,
"message": "Operation successful",
"data": { ... }
}
Error Response Example
{
"success": false,
"message": "Detailed error description",
"data": null,
"code": 400
}
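One way to keep every endpoint on this structure is to build responses through a shared helper. The sketch below is framework-agnostic; `make_response` and `to_json` are hypothetical names, not functions taken from the platform.

```python
import json


def make_response(success, message, data=None, code=None):
    """Build a response body in the API_RESPONSE_001 shape."""
    body = {"success": bool(success), "message": str(message), "data": data}
    if code is not None:
        # "code" is optional, so it is only emitted when supplied.
        body["code"] = int(code)
    return body


def to_json(body):
    """Serialize a response body for a UTF-8 JSON response."""
    # ensure_ascii=False keeps non-ASCII messages readable in the output.
    return json.dumps(body, ensure_ascii=False)
```

For example, `make_response(False, "Missing parameter: name", code=400)` yields a body with "data" set to null and "code" set to 400.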
2.2 HTTP Status Code Rules
Rule ID: API_STATUS_001
- 200: Successful operation
- 400: Bad request (validation errors, missing parameters)
- 404: Resource not found
- 500: Internal server error
2.3 Content-Type Headers
Rule ID: API_HEADERS_001
- All API responses: application/json; charset=utf-8
- File downloads: Preserve original content-type
- CORS headers automatically configured
3. Database Rules
3.1 Data Integrity Rules
Rule ID: DB_INTEGRITY_001
Timestamp Management
# Use East Asia timezone for all timestamps
from datetime import datetime
import pytz
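The rule does not name a specific zone, so the following sketch assumes that "East Asia timezone" means UTC+8 (for example Asia/Shanghai). To stay dependency-free it uses a fixed offset from the standard library instead of pytz; with pytz, the equivalent zone object would be `pytz.timezone("Asia/Shanghai")`. The names `EAST_ASIA_TZ`, `now_east_asia`, and `to_east_asia` are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "East Asia timezone" means UTC+8 (for example Asia/Shanghai).
EAST_ASIA_TZ = timezone(timedelta(hours=8))


def now_east_asia():
    """Return the current time as a timezone-aware datetime in UTC+8."""
    return datetime.now(EAST_ASIA_TZ)


def to_east_asia(dt):
    """Convert a datetime to UTC+8; naive input is treated as UTC."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(EAST_ASIA_TZ)
```

Storing timezone-aware values avoids ambiguity when records are later compared with timestamps produced on servers in other zones.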
3.2 Data Model Rules
Rule ID: DB_MODEL_001
Field Constraints
# String fields
name: max_length=100, nullable=False
email: max_length=100, nullable=True
# JSON fields - use for structured data
4. File Processing Rules
4.1 File Upload Rules
Rule ID: FILE_UPLOAD_001
Allowed Extensions
ALLOWED_EXTENSIONS = {
'txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif',
'xlsx', 'xls', 'csv', 'sql', 'dll'
}
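A typical way to apply this set is to check the final extension of the uploaded filename, case-insensitively. The helper name `allowed_file` is an assumption; the set is repeated so the example runs on its own.

```python
ALLOWED_EXTENSIONS = {
    'txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif',
    'xlsx', 'xls', 'csv', 'sql', 'dll'
}


def allowed_file(filename):
    """Return True when the final extension of filename is allowed."""
    if '.' not in filename:
        return False
    # Only the last extension counts, so "archive.tar.gz" is judged as "gz".
    extension = filename.rsplit('.', 1)[1].lower()
    return extension in ALLOWED_EXTENSIONS
```

An extension check alone says nothing about the file's contents, so it is best treated as a first filter alongside the size and content-type checks required elsewhere in this document.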
Storage Rules
- Development: Local filesystem
- Production: MinIO object storage
- File path tracking in database
5. Business Logic Rules
5.1 Graph Processing Rules
Rule ID: BUSINESS_LOGIC_001
Neo4j Graph Processing
- Maximum traversal depth: 10 levels
- Duplicate node prevention
- Proper relationship management
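The depth limit and duplicate-node rule can be illustrated without a database. The sketch below walks an in-memory adjacency mapping breadth-first; it is a stand-in for the idea only and does not use the Neo4j driver. The function name `traverse` and the dict-based graph representation are assumptions.

```python
from collections import deque

MAX_DEPTH = 10  # maximum traversal depth from BUSINESS_LOGIC_001


def traverse(graph, start, max_depth=MAX_DEPTH):
    """Breadth-first walk returning {node: depth} for nodes within max_depth."""
    depths = {start: 0}  # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depths[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in graph.get(node, ()):
            if neighbor not in depths:  # duplicate node prevention
                depths[neighbor] = depths[node] + 1
                queue.append(neighbor)
    return depths
```

Because each node is recorded the first time it is reached, cycles in the graph cannot cause repeated visits or unbounded traversal.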
5.2 Query Processing Rules
Rule ID: BUSINESS_LOGIC_002
Graph Query Optimization
# Use variable-length path traversal for label-based queries
# Pattern: (start_node)-[*1..10]->(end_node)
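As a rough sketch of how the pattern could be turned into a query string, the helper below builds Cypher text for a given start label and depth. The function name `build_traversal_query`, the `RETURN DISTINCT` clause, and the label whitelist are assumptions. Because the label is interpolated into the query text here, it is validated first.

```python
import re

MAX_DEPTH = 10  # upper bound taken from the pattern above
_LABEL_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_traversal_query(label, max_depth=MAX_DEPTH):
    """Return Cypher text matching nodes reachable from nodes with `label`."""
    if not _LABEL_RE.fullmatch(label):
        raise ValueError(f"unsafe label: {label!r}")
    if not 1 <= max_depth <= MAX_DEPTH:
        raise ValueError(f"max_depth must be between 1 and {MAX_DEPTH}")
    return (
        f"MATCH (start_node:{label})-[*1..{max_depth}]->(end_node) "
        "RETURN DISTINCT end_node"
    )
```

Calling `build_traversal_query("Table", 3)` produces a query limited to paths of one to three relationships.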
6. Security Rules
6.1 Input Validation
Rule ID: SECURITY_001
Sanitization Requirements
- All user inputs MUST be validated
- SQL injection prevention through SQLAlchemy ORM
- XSS prevention through proper encoding
- File upload validation (extension, size, content-type)
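For the XSS item, "proper encoding" usually means escaping user-supplied text at the point where it is written into HTML. A minimal sketch using the standard library follows; the wrapper name `encode_for_html` is hypothetical.

```python
import html


def encode_for_html(value):
    """Escape user-supplied text before it is placed in an HTML page."""
    # quote=True also escapes double and single quotes for attribute contexts.
    return html.escape(str(value), quote=True)
```

Escaping at output time, rather than when the data is stored, keeps the stored value intact for non-HTML consumers such as JSON APIs.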
Authentication & Authorization
- Environment variables for sensitive data
- API key validation for external services
- CORS configuration for cross-origin requests
6.2 Error Handling
Rule ID: SECURITY_002
Information Disclosure Prevention
- Generic error messages for production
- Detailed logging for debugging
- No sensitive data in error responses
- Stack traces only in development mode
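The split between what is logged and what is returned can be sketched as follows. The function name `error_response`, the generic message text, and the `debug` flag are assumptions; the body shape follows the standard response format defined earlier.

```python
import logging
import traceback

logger = logging.getLogger(__name__)


def error_response(exc, debug=False, code=500):
    """Build an error body; details are exposed only when debug is True."""
    # Full details always go to the log for debugging.
    logger.error("Unhandled error: %s", exc, exc_info=exc)
    body = {
        "success": False,
        "message": "Internal server error",  # generic text for production
        "data": None,
        "code": code,
    }
    if debug:
        # Development mode only: include the real message and stack trace.
        body["message"] = str(exc)
        body["data"] = {
            "traceback": traceback.format_exception(type(exc), exc, exc.__traceback__)
        }
    return body
```

With `debug=False`, the caller sees only the generic message, while the exception and its traceback are still written to the log.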
7. Configuration Rules
7.1 Environment-Specific Rules
Rule ID: CONFIG_001
Development Environment
- Debug mode: ON
- Detailed logging: ON
- Local database connections
- Console logging: ON
Production Environment
- Debug mode: OFF
- Logging at INFO level and above only
- Remote database connections
- File logging only
- Security headers enforced
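These two profiles could be captured as immutable settings objects, for instance as below. The class and field names are assumptions, as is the choice to fall back to production for unknown environment names; whether security headers are enforced in development is not stated above, so the value used here is also an assumption.

```python
import logging
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    debug: bool
    log_level: int
    console_logging: bool
    security_headers: bool


DEVELOPMENT = Settings(
    debug=True,
    log_level=logging.DEBUG,
    console_logging=True,
    security_headers=False,  # assumption: not specified for development
)
PRODUCTION = Settings(
    debug=False,
    log_level=logging.INFO,
    console_logging=False,
    security_headers=True,
)


def load_settings(env_name):
    """Pick settings by name; unknown names fall back to production."""
    return DEVELOPMENT if env_name == "development" else PRODUCTION
```

Defaulting to the production profile means a misspelled environment name cannot accidentally switch debug mode on.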
7.2 Service Integration Rules
Rule ID: CONFIG_002
External Service Configuration
# LLM Services (Qwen API)
- API key from environment variables
- Fallback to a default value in development only
- Rate limiting and retry logic
# Database Services
- Connection pooling enabled
- Health check (pool_pre_ping: True)
- Connection recycling (300 seconds)
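A small sketch of the key lookup and retry behavior is given below, together with the database options expressed as keyword arguments accepted by SQLAlchemy's `create_engine`. The environment variable name `QWEN_API_KEY`, the placeholder default, and the backoff parameters are all assumptions; rate limiting is not covered here.

```python
import os
import time


def get_api_key(env_var="QWEN_API_KEY", development=False, default="dev-placeholder-key"):
    """Read the key from the environment; allow a default only in development."""
    key = os.environ.get(env_var)
    if key:
        return key
    if development:
        return default
    raise RuntimeError(f"{env_var} is not set")


def call_with_retry(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Health check and connection recycling from the database bullets above.
ENGINE_OPTIONS = {"pool_pre_ping": True, "pool_recycle": 300}
```

Passing `sleep` as a parameter keeps the retry helper easy to test without real delays.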
8. Logging & Monitoring Rules
8.1 Logging Standards
Rule ID: LOGGING_001
Log Format
LOG_FORMAT = '%(asctime)s - %(levelname)s - %(filename)s - %(funcName)s - %(lineno)s - %(message)s'
Log Levels
- DEBUG: Detailed diagnostic information for development
- INFO: General operational information
- WARNING: Validation warnings, non-critical issues
- ERROR: Error conditions, exceptions
- CRITICAL: System failures
Log Output
- Development: Console + file logging
- Production: File logging only
- UTF-8 encoding for Chinese character support
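The format, level, and output rules in this section can be combined into one setup function, sketched here with the standard `logging` module. The function name `configure_logging`, the logger name "dataops", and the `log_path` parameter are assumptions.

```python
import logging
import sys

LOG_FORMAT = '%(asctime)s - %(levelname)s - %(filename)s - %(funcName)s - %(lineno)s - %(message)s'


def configure_logging(log_path, production=False, name="dataops"):
    """Attach handlers to a named logger according to the environment."""
    logger = logging.getLogger(name)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    logger.setLevel(logging.INFO if production else logging.DEBUG)
    formatter = logging.Formatter(LOG_FORMAT)

    # File logging is used in both environments; UTF-8 keeps Chinese text intact.
    handlers = [logging.FileHandler(log_path, encoding="utf-8")]
    if not production:
        handlers.append(logging.StreamHandler(sys.stderr))  # console in development
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

In production mode the returned logger has a single file handler and drops DEBUG messages; in development it also writes to the console.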
9. Performance Rules
9.1 Database Performance
Rule ID: PERFORMANCE_001
Query Optimization
- Use proper indexing for frequently queried fields
- Batch processing for large datasets (batch_size: 1000)
- Connection pooling (pool_size: 10, max_overflow: 20)
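Batch processing with the stated batch size can be sketched as a generator that slices any iterable into lists of at most 1000 items. The name `batched` is illustrative, and `POOL_OPTIONS` simply restates the pooling values above in the form of keyword arguments accepted by SQLAlchemy's `create_engine`.

```python
from itertools import islice

BATCH_SIZE = 1000
POOL_OPTIONS = {"pool_size": 10, "max_overflow": 20}


def batched(iterable, batch_size=BATCH_SIZE):
    """Yield lists of at most batch_size items from any iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Because the input is consumed lazily, a large dataset never needs to be held in memory as a whole; each batch can be written and committed before the next one is read.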
Caching Strategy
- Session-based caching for Neo4j queries
- API response caching for static data
10. Compliance & Audit Rules
10.1 Data Tracking
Rule ID: AUDIT_001
Change Tracking
- All data modifications logged with timestamp
- User attribution for all operations
Data Retention
- Archive processed files
- Maintain processing history
Rule Enforcement
Implementation Guidelines
- Validation: Implement validation functions
- Error Handling: Use standardized error response format
- Testing: Create unit tests for each business rule
- Documentation: Update API documentation when rules change
Rule Violation Handling
- Critical violations: Return HTTP 400 or 500 with an error message in the standardized error response format (no sensitive data)
- Warning violations: Log warning, continue processing
- Data quality issues: Create audit records for manual review
Review Process
- Monthly review of business rules effectiveness
- Update rules based on operational feedback
- Version control for rule changes
- Impact assessment for rule modifications