DataOps Platform - Business Rules & Validation Standards

Overview

This document defines the core business rules, validation standards, and processing workflows for the DataOps platform. These rules ensure data integrity, consistent API behavior, and reliable business logic execution.

1. Data Validation Rules

1.1 General Field Validation

Rule ID: VALIDATION_001

Format Validation Rules

# Email validation  
- Must match pattern: ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
- Invalid format generates ERROR

# Array fields validation
- Must be array type if present
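
The two format rules above can be sketched as a single validator. This is a minimal illustration, not the platform's actual implementation: the function name, the default field names (`email`, `tags`), and the `(level, field, message)` tuple shape are hypothetical. It also assumes that a missing email is allowed and that a wrong array type is reported at ERROR level, neither of which the rule states.

```python
import re

# Pattern copied from VALIDATION_001.
EMAIL_PATTERN = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def validate_record(record, email_fields=("email",), array_fields=("tags",)):
    """Return a list of (level, field, message) tuples; empty means valid.

    Field names and the tuple shape are hypothetical.
    """
    issues = []
    for field in email_fields:
        value = record.get(field)
        # Assumption: a missing or null email is acceptable; only the format is checked.
        if value is not None and not (
            isinstance(value, str) and EMAIL_PATTERN.match(value)
        ):
            issues.append(("ERROR", field, "invalid email format"))
    for field in array_fields:
        # Array fields are optional, but must be lists when present.
        if field in record and not isinstance(record[field], list):
            issues.append(("ERROR", field, "must be an array"))
    return issues
```

A caller would treat an empty list as a pass, for example `validate_record({"email": "a.b@example.com", "tags": []})`.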

2. API Response Standards

2.1 Standard Response Format

Rule ID: API_RESPONSE_001

All API responses MUST follow this structure:

{
    "success": boolean,
    "message": string,
    "data": any,
    "code": number (optional)
}

Success Response Example

{
    "success": true,
    "message": "Operation successful",
    "data": { ... }
}

Error Response Example

{
    "success": false,
    "message": "Detailed error description",
    "data": null,
    "code": 400
}
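
One way to keep every endpoint on this structure is to build response bodies through two small helpers. The sketch below is illustrative only: the helper names and the default message text are hypothetical, and it assumes `code` mirrors the HTTP status, as in the error example above.

```python
import json


def success_response(data=None, message="Operation successful"):
    """Build a success body in the API_RESPONSE_001 shape."""
    return {"success": True, "message": message, "data": data}


def error_response(message, code=400):
    """Build an error body; `code` is assumed to mirror the HTTP status."""
    return {"success": False, "message": message, "data": None, "code": code}


# ensure_ascii=False keeps non-ASCII messages readable in the JSON output.
body = json.dumps(error_response("Missing parameter: name"), ensure_ascii=False)
```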

2.2 HTTP Status Code Rules

Rule ID: API_STATUS_001

  • 200: Successful operation
  • 400: Bad request (validation errors, missing parameters)
  • 404: Resource not found
  • 500: Internal server error

2.3 Content-Type Headers

Rule ID: API_HEADERS_001

  • All API responses: application/json; charset=utf-8
  • File downloads: Preserve original content-type
  • CORS headers automatically configured

3. Database Rules

3.1 Data Integrity Rules

Rule ID: DB_INTEGRITY_001

Timestamp Management

# Use East Asia timezone for all timestamps
from datetime import datetime
import pytz
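
The rule does not name a specific zone. The sketch below assumes "East Asia timezone" means UTC+8 and uses a fixed offset from the standard library so that it runs without third-party packages. With pytz, as imported above, the comparable call would be `datetime.now(pytz.timezone("Asia/Shanghai"))`. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "East Asia timezone" means UTC+8 (for example Asia/Shanghai).
EAST_ASIA_TZ = timezone(timedelta(hours=8), name="UTC+08:00")


def now_east_asia():
    """Return the current time as a timezone-aware datetime in UTC+8."""
    return datetime.now(EAST_ASIA_TZ)


def to_east_asia(dt):
    """Convert an aware datetime to UTC+8; naive values are rejected."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: timezone is unknown")
    return dt.astimezone(EAST_ASIA_TZ)
```

Rejecting naive datetimes is a design choice in this sketch: it avoids silently assuming the server's local zone when a timestamp arrives without one.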

3.2 Data Model Rules

Rule ID: DB_MODEL_001

Field Constraints

# String fields
name: max_length=100, nullable=False
email: max_length=100, nullable=True

# JSON fields - use for structured data
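
The string constraints above can also be checked in application code before a row reaches the database. The following is a hedged sketch using only the standard library; the `FIELD_CONSTRAINTS` table and `check_constraints` function are hypothetical names, and in an ORM model the same limits would normally be declared on the column definitions instead.

```python
# Constraints from DB_MODEL_001: field -> (max_length, nullable).
FIELD_CONSTRAINTS = {
    "name": (100, False),
    "email": (100, True),
}


def check_constraints(row):
    """Return a list of constraint violations for a dict-like row."""
    errors = []
    for field, (max_length, nullable) in FIELD_CONSTRAINTS.items():
        value = row.get(field)
        if value is None:
            if not nullable:
                errors.append(f"{field}: must not be null")
        elif len(value) > max_length:
            errors.append(f"{field}: longer than {max_length} characters")
    return errors
```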

4. File Processing Rules

4.1 File Upload Rules

Rule ID: FILE_UPLOAD_001

Allowed Extensions

ALLOWED_EXTENSIONS = {
    'txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif', 
    'xlsx', 'xls', 'csv', 'sql', 'dll'
}
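
A minimal sketch of how this set might be applied to an uploaded filename follows. The `allowed_file` helper is a hypothetical name; it assumes the check is case-insensitive and looks only at the final extension, so it does not replace the size and content-type checks required elsewhere in this document.

```python
import os

ALLOWED_EXTENSIONS = {
    'txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif',
    'xlsx', 'xls', 'csv', 'sql', 'dll'
}


def allowed_file(filename):
    """Return True if the final extension is in ALLOWED_EXTENSIONS."""
    # splitext returns the last extension including its leading dot, or "".
    _, ext = os.path.splitext(filename)
    return ext[1:].lower() in ALLOWED_EXTENSIONS
```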

Storage Rules

  • Development: Local filesystem
  • Production: MinIO object storage
  • File path tracking in database

5. Business Logic Rules

5.1 Graph Processing Rules

Rule ID: BUSINESS_LOGIC_001

Neo4j Graph Processing

  • Maximum traversal depth: 10 levels
  • Duplicate node prevention
  • Proper relationship management

5.2 Query Processing Rules

Rule ID: BUSINESS_LOGIC_002

Graph Query Optimization

# Use recursive traversal for label-based queries
# Pattern: (start_node)-[*1..10]->(end_node)
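
The pattern above can be generated by a small query builder that enforces the depth limit from BUSINESS_LOGIC_001. This is a sketch under stated assumptions: the function name is hypothetical, the label is interpolated into the query text and is therefore validated first, and `RETURN DISTINCT` is one possible way to avoid returning the same node twice, not necessarily the platform's approach.

```python
import re

MAX_DEPTH = 10  # BUSINESS_LOGIC_001: maximum traversal depth
_LABEL_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_traversal_query(label, max_depth=MAX_DEPTH):
    """Return a Cypher query walking 1..max_depth hops from nodes with `label`."""
    if not _LABEL_RE.match(label):
        # The label becomes part of the query text, so reject anything unusual.
        raise ValueError(f"invalid label: {label!r}")
    if not 1 <= max_depth <= MAX_DEPTH:
        raise ValueError(f"max_depth must be between 1 and {MAX_DEPTH}")
    return (
        f"MATCH (start_node:{label})-[*1..{max_depth}]->(end_node) "
        "RETURN DISTINCT end_node"
    )
```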

6. Security Rules

6.1 Input Validation

Rule ID: SECURITY_001

Sanitization Requirements

  • All user inputs MUST be validated
  • SQL injection prevention through the SQLAlchemy ORM (bound parameters, no string-built SQL)
  • XSS prevention through proper encoding
  • File upload validation (extension, size, content-type)

Authentication & Authorization

  • Environment variables for sensitive data
  • API key validation for external services
  • CORS configuration for cross-origin requests

6.2 Error Handling

Rule ID: SECURITY_002

Information Disclosure Prevention

  • Generic error messages for production
  • Detailed logging for debugging
  • No sensitive data in error responses
  • Stack traces only in development mode
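
The disclosure rules above might be applied in a single exception-to-response helper. The sketch below is illustrative: the logger name, the `debug` flag, and the generic message text are assumptions, and the response shape follows the standard format defined in section 2.1.

```python
import logging
import traceback

logger = logging.getLogger("dataops.errors")  # hypothetical logger name


def error_body(exc, debug=False):
    """Build an error response body for an unhandled exception."""
    # Full detail always goes to the log, regardless of environment.
    logger.error("unhandled exception", exc_info=exc)
    body = {
        "success": False,
        "message": "Internal server error",  # generic text for production
        "data": None,
        "code": 500,
    }
    if debug:
        # Development only: expose the message and the stack trace.
        body["message"] = str(exc)
        body["data"] = {
            "traceback": traceback.format_exception(
                type(exc), exc, exc.__traceback__
            )
        }
    return body
```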

7. Configuration Rules

7.1 Environment-Specific Rules

Rule ID: CONFIG_001

Development Environment

  • Debug mode: ON
  • Detailed logging: ON
  • Local database connections
  • Console logging: ON

Production Environment

  • Debug mode: OFF
  • Info-level logging only
  • Remote database connections
  • File logging only
  • Security headers enforced

7.2 Service Integration Rules

Rule ID: CONFIG_002

External Service Configuration

# LLM Services (Qwen API)
- API key from environment variables
- Fallback to a default key in development only
- Rate limiting and retry logic

# Database Services
- Connection pooling enabled
- Health check (pool_pre_ping: True)
- Connection recycling (300 seconds)
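
A hedged sketch of how these settings could be collected at startup follows. The environment variable names (`APP_ENV`, `QWEN_API_KEY`), the placeholder key, and the function name are hypothetical. `pool_pre_ping` and `pool_recycle` are the SQLAlchemy engine options that correspond to the health check and 300-second recycling listed above.

```python
import os


def load_service_config(env=None):
    """Collect LLM and database settings from the environment."""
    env = os.environ if env is None else env
    # Assumption: APP_ENV selects the environment and defaults to development.
    development = env.get("APP_ENV", "development") == "development"
    api_key = env.get("QWEN_API_KEY")
    if not api_key:
        if not development:
            raise RuntimeError("QWEN_API_KEY must be set outside development")
        api_key = "dev-placeholder-key"  # development-only fallback
    return {
        "llm_api_key": api_key,
        "engine_options": {
            "pool_pre_ping": True,  # test each connection before use
            "pool_recycle": 300,    # recycle connections after 300 seconds
        },
    }
```

Failing fast when the key is missing outside development keeps a placeholder from reaching production unnoticed.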

8. Logging & Monitoring Rules

8.1 Logging Standards

Rule ID: LOGGING_001

Log Format

LOG_FORMAT = '%(asctime)s - %(levelname)s - %(filename)s - %(funcName)s - %(lineno)s - %(message)s'

Log Levels

  • DEBUG: Development detailed information
  • INFO: General operational information
  • WARNING: Validation warnings, non-critical issues
  • ERROR: Error conditions, exceptions
  • CRITICAL: System failures

Log Output and Encoding

  • Development: Console + file logging
  • Production: File logging only
  • UTF-8 encoding for Chinese character support
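
The format, levels, and destinations above could be wired together as in the following sketch, which uses only the standard `logging` module. The logger name, the function name, and the choice of DEBUG for development are assumptions. The INFO level for production follows CONFIG_001.

```python
import logging
import sys

LOG_FORMAT = '%(asctime)s - %(levelname)s - %(filename)s - %(funcName)s - %(lineno)s - %(message)s'


def configure_logging(log_path, production):
    """Attach handlers per LOGGING_001 and return the configured logger."""
    logger = logging.getLogger("dataops")  # hypothetical logger name
    logger.setLevel(logging.INFO if production else logging.DEBUG)
    # Remove handlers from any earlier call so output is not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    formatter = logging.Formatter(LOG_FORMAT)
    # UTF-8 keeps Chinese characters intact in the log file.
    handlers = [logging.FileHandler(log_path, encoding="utf-8")]
    if not production:
        handlers.append(logging.StreamHandler(sys.stderr))  # console in development
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```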

9. Performance Rules

9.1 Database Performance

Rule ID: PERFORMANCE_001

Query Optimization

  • Use proper indexing for frequently queried fields
  • Batch processing for large datasets (batch_size: 1000)
  • Connection pooling (pool_size: 10, max_overflow: 20)
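
Batch processing with the stated batch size can be sketched as a generator that works on any iterable, so large datasets never need to be held in memory at once. The function name is hypothetical.

```python
from itertools import islice

BATCH_SIZE = 1000  # PERFORMANCE_001


def batched(items, batch_size=BATCH_SIZE):
    """Yield successive lists of at most batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Each yielded batch can then be written in a single transaction, for example `for rows in batched(records): save(rows)`, where `save` stands in for whatever bulk-write routine the platform uses.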

Caching Strategy

  • Session-based caching for Neo4j queries
  • API response caching for static data

10. Compliance & Audit Rules

10.1 Data Tracking

Rule ID: AUDIT_001

Change Tracking

  • All data modifications logged with timestamp
  • User attribution for all operations

Data Retention

  • Archive processed files
  • Maintain processing history

Rule Enforcement

Implementation Guidelines

  1. Validation: Implement each validation rule as a dedicated, testable function
  2. Error Handling: Use standardized error response format
  3. Testing: Create unit tests for each business rule
  4. Documentation: Update API documentation when rules change

Rule Violation Handling

  • Critical violations: Return HTTP 400 (client error) or 500 (server error) with a descriptive error message, kept generic in production per SECURITY_002
  • Warning violations: Log warning, continue processing
  • Data quality issues: Create audit records for manual review
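
The three handling paths above might be dispatched from one function, as in this sketch. The severity labels, the logger name, and the in-memory `audit_records` list (a stand-in for a persistent audit table) are all hypothetical.

```python
import logging

logger = logging.getLogger("dataops.rules")  # hypothetical logger name
audit_records = []  # stand-in for a persistent audit table


def handle_violation(severity, rule_id, detail, code=400):
    """Return an error body for critical violations, else None to continue."""
    if severity == "critical":
        return {
            "success": False,
            "message": f"{rule_id}: {detail}",
            "data": None,
            "code": code,
        }
    if severity == "warning":
        logger.warning("%s: %s", rule_id, detail)
        return None
    if severity == "data_quality":
        # Queue the issue for manual review instead of failing the request.
        audit_records.append({"rule_id": rule_id, "detail": detail})
        return None
    raise ValueError(f"unknown severity: {severity!r}")
```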

Review Process

  • Monthly review of business rule effectiveness
  • Update rules based on operational feedback
  • Version control for rule changes
  • Impact assessment for rule modifications