# DataOps Platform - Cursor Editor Rules

## Project Overview

This is a Flask-based DataOps platform for data management, processing, and analytics. The platform integrates with the Neo4j graph database for relationship management and supports n8n workflow automation.

---

# Python Coding Standards

## Code Style

- Use Ruff for linting and formatting (replaces Black + Flake8 + isort)
- Use Pyright for type checking (replaces MyPy)
- Line length limit: 88 characters
- Use double quotes as the default string quote style
- Use 4-space indentation; never use tabs

## Type Annotations

- All functions must include type annotations (parameters and return values)
- Use type syntax compatible with Python 3.8+
- Python 3.9+ syntax (such as `list[str]`) requires `from __future__ import annotations`
- Use the `typing` module for complex types

## Imports

- Organize imports in the order: standard library, third-party, local
- Use absolute imports; avoid relative imports
- One import per line
- Commonly used utility functions (such as `create_or_get_talent_node`) should be imported at the top of the file

## Naming Conventions

- Class names use PascalCase
- Functions and variables use snake_case
- Constants use UPPER_SNAKE_CASE
- Private members use a single leading underscore

## Docstrings

- All public functions, classes, and modules must include a docstring
- Use Google-style docstrings
- Include parameter descriptions, return value descriptions, and raised exceptions (where applicable)

## Error Handling

- Use specific exception types; avoid bare `except:` clauses
- Prefer context managers (`with` statements)
- Log exception details with Loguru for debugging

## Logging

- Use Loguru for logging
- Log levels: DEBUG, INFO, WARNING, ERROR
- Avoid `print()` statements in production

## Code Quality

- Avoid `type: ignore`; use it only when absolutely necessary, with an explanatory comment
- Keep functions short (50 lines or fewer is recommended)
- Avoid deep nesting (at most 3 levels)
- Use list comprehensions and generator expressions (but keep them readable)

## Example

```python
from __future__ import annotations

from loguru import logger


def process_data(
    items: list[str],
    max_length: int = 100,
    strict: bool = False,
) -> dict[str, int]:
    """Process a list of items and return statistics.

    Args:
        items: List of strings to process.
        max_length: Maximum allowed length for items.
        strict: Whether to raise an error on invalid items.

    Returns:
        Dictionary containing processing statistics.

    Raises:
        ValueError: If strict mode and invalid item found.
""" result: dict[str, int] = {} try: # Implementation here logger.info(f"Processing {len(items)} items") except ValueError as e: logger.error(f"Processing failed: {e}") raise return result ``` --- ## Architecture - Flask application with modular structure - SQLAlchemy for PostgreSQL database operations - Neo4j for graph database and relationship management - RESTful API design with Blueprint-based routing - Configuration-based environment management - n8n workflow integration via MCP servers ## File Organization - `app/` - Main application code - `app/api/` - API endpoints and routes (Blueprint-based) - `app/core/` - Core business logic and domain services - `app/models/` - SQLAlchemy database models - `app/services/` - Shared services (Neo4j driver, utilities) - `app/config/` - Configuration files - `app/scripts/` - Database initialization scripts - `database/` - SQL scripts and migrations - `docs/` - Documentation - `tests/` - Test files - `scripts/` - Automation scripts - `mcp-servers/` - MCP server implementations (e.g., task-manager) - `logs/` - Application logs ## Dependencies - Python >= 3.8 - Flask >= 2.3.0 - Flask-SQLAlchemy >= 3.1.0 - SQLAlchemy >= 2.0.0 - Neo4j Python Driver (for graph database) - PostgreSQL (via psycopg2-binary) - Loguru (for logging) - Pandas & NumPy (for data processing) ## Development Tools - Ruff (linting & formatting, replaces Black + Flake8 + isort) - Pyright (type checking, replaces MyPy) - Pytest (testing) ## Development Guidelines - Always use virtual environment - Test API endpoints before committing - Update documentation for API changes - Use Loguru for logging, avoid print() statements - Handle errors gracefully with proper logging ## API Conventions - Use snake_case for Python functions and variables - Use kebab-case for API endpoints - Return consistent JSON responses with `code`, `message`, `data` structure - Include proper HTTP status codes - Validate input data ## Database - PostgreSQL for relational data - Neo4j for 
graph data and relationships
- Use Flask-Migrate/Alembic for schema migrations
- Follow naming conventions for tables and columns
- Implement proper indexing
- Use transactions for data consistency

## Neo4j Graph Database

- Use `Neo4jDriverSingleton` for connection management
- Follow Cypher query best practices
- Use parameterized queries to prevent injection
- Close sessions properly after use

## Security

- Validate all user inputs
- Use environment variables for sensitive data (see `env.example`)
- Implement proper authentication
- Use parameterized queries for both SQL and Cypher
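The parameterized-query rule above can be illustrated with a minimal, self-contained sketch. It uses the standard-library `sqlite3` module as a stand-in (this project targets PostgreSQL via psycopg2 and Neo4j via the official driver, but the placeholder-binding principle is identical); the `users` table and the input value are hypothetical.

```python
import sqlite3

# Hypothetical in-memory table standing in for a real PostgreSQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice'), ('bob')")

# Untrusted input, e.g. taken from a request body.
user_input = "alice' OR '1'='1"

# BAD: string interpolation lets the input rewrite the query:
#   f"SELECT id FROM users WHERE name = '{user_input}'"
# would match every row.

# GOOD: a placeholder binds the input as data, never as SQL.
rows = conn.execute(
    "SELECT id FROM users WHERE name = ?", (user_input,)
).fetchall()
print(rows)  # the injection attempt matches nothing: []
```

The same rule applies to Cypher: pass values via the driver's `parameters` mechanism (e.g. `$name` placeholders) instead of formatting them into the query string.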
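The consistent `code` / `message` / `data` response structure from the API conventions above can be centralized in a small helper so every endpoint builds the envelope the same way. This is only a sketch: the helper name `api_response` and its default values are assumptions, not part of the project.

```python
from __future__ import annotations

from typing import Any


def api_response(
    data: Any = None,
    message: str = "success",
    code: int = 200,
) -> dict[str, Any]:
    """Build the platform's consistent JSON response envelope.

    Every endpoint returns the same three keys (`code`, `message`,
    `data`) so clients can handle success and failure uniformly.
    """
    return {"code": code, "message": message, "data": data}


# In a Flask view this dict would be returned via jsonify(), e.g.:
#   return jsonify(api_response(data=user)), 200
```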