@@ -2,60 +2,167 @@

## Project Overview
This is a Flask-based DataOps platform for data management, processing, and analytics.
+The platform integrates with the Neo4j graph database for relationship management and supports n8n workflow automation.

-## Code Style
-- Use Python 3.8+ syntax
-- Follow PEP 8 style guidelines
-- Use type hints where possible
-- Keep functions focused and single-purpose
-- Use descriptive variable and function names
+---
+
+# Python Coding Standards
+
+## Code Style
+- Use Ruff for linting and formatting (replaces Black + Flake8 + isort)
+- Use Pyright for type checking (replaces MyPy)
+- Line length limit: 88 characters
+- Use double quotes as the default string quote style
+- Indent with 4 spaces; do not use tabs
+
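+## Type Annotations
+- All functions must include type annotations (parameters and return values)
+- Use type syntax compatible with Python 3.8+
+- Python 3.9+ syntax in annotations (such as `list[str]`) requires `from __future__ import annotations`
+- Use the `typing` module for complex types
+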
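As a minimal sketch of the annotation rules above, the following spells generics through the `typing` module so that it runs on Python 3.8 without the future import. The function `group_by_length` and its parameters are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Optional


def group_by_length(
    words: List[str],
    min_length: Optional[int] = None,
) -> Dict[int, List[str]]:
    """Group words by their length.

    Args:
        words: Strings to group.
        min_length: If given, words shorter than this are skipped.

    Returns:
        Mapping from word length to the words of that length, in input order.
    """
    groups: Dict[int, List[str]] = {}
    for word in words:
        # Skip words below the optional threshold.
        if min_length is not None and len(word) < min_length:
            continue
        groups.setdefault(len(word), []).append(word)
    return groups
```

Both parameters and the return value are annotated, which is what Pyright needs in order to check call sites.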
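+## Import Conventions
+- Organize imports in this order: standard library, third-party, local
+- Use absolute imports; avoid relative imports
+- Put each import on its own line
+- Import commonly used utility functions (such as `create_or_get_talent_node`) at the top of the file
+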
+## Naming Conventions
+- Use PascalCase for class names
+- Use snake_case for functions and variables
+- Use UPPER_SNAKE_CASE for constants
+- Prefix private members with a single underscore
+
+## Docstrings
+- All public functions, classes, and modules must include a docstring
+- Use Google-style docstrings
+- Include descriptions of parameters, return values, and exceptions (where applicable)
+
+## Error Handling
+- Use specific exception types; avoid bare `except:`
+- Prefer context managers (`with` statements)
+- Use Loguru to log exception details for debugging
+
+## Logging
+- Use Loguru for logging
+- Log levels: DEBUG (debugging), INFO (information), WARNING (warnings), ERROR (errors)
+- Avoid print() statements in production
+
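+## Code Quality
+- Avoid `type: ignore` unless absolutely necessary, and add an explanation when it is used
+- Keep functions short (recommended maximum: 50 lines)
+- Avoid deep nesting (3 levels at most)
+- Use list comprehensions and generator expressions (while keeping them readable)
+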
+## Example
+
+```python
+from __future__ import annotations
+
+from typing import Optional
+
+from loguru import logger
+
+
+def process_data(
+    items: list[str],
+    max_length: int = 100,
+    strict: bool = False,
+) -> dict[str, int]:
+    """
+    Process a list of items and return statistics.
+
+    Args:
+        items: List of strings to process.
+        max_length: Maximum allowed length for items.
+        strict: Whether to raise an error on invalid items.
+
+    Returns:
+        Dictionary containing processing statistics.
+
+    Raises:
+        ValueError: If strict mode is enabled and an invalid item is found.
+    """
+    result: dict[str, int] = {}
+    try:
+        # Implementation here
+        logger.info(f"Processing {len(items)} items")
+    except ValueError as e:
+        logger.error(f"Processing failed: {e}")
+        raise
+    return result
+```
+
+---

## Architecture
- Flask application with modular structure
-- SQLAlchemy for database operations
-- RESTful API design
-- Blueprint-based routing
+- SQLAlchemy for PostgreSQL database operations
+- Neo4j for graph database and relationship management
+- RESTful API design with Blueprint-based routing
- Configuration-based environment management
+- n8n workflow integration via MCP servers

## File Organization
- `app/` - Main application code
-- `app/api/` - API endpoints and routes
-- `app/models/` - Database models
-- `app/services/` - Business logic
-- `app/config/` - Configuration files
-- `database/` - Database scripts and migrations
+  - `app/api/` - API endpoints and routes (Blueprint-based)
+  - `app/core/` - Core business logic and domain services
+  - `app/models/` - SQLAlchemy database models
+  - `app/services/` - Shared services (Neo4j driver, utilities)
+  - `app/config/` - Configuration files
+  - `app/scripts/` - Database initialization scripts
+- `database/` - SQL scripts and migrations
- `docs/` - Documentation
- `tests/` - Test files
+- `scripts/` - Automation scripts
+- `mcp-servers/` - MCP server implementations (e.g., task-manager)
+- `logs/` - Application logs

## Dependencies
-- Flask 2.3.3+
-- SQLAlchemy 2.0+
-- PostgreSQL database
-- Neo4j graph database
-- MinIO for file storage
+- Python >= 3.8
+- Flask >= 2.3.0
+- Flask-SQLAlchemy >= 3.1.0
+- SQLAlchemy >= 2.0.0
+- Neo4j Python Driver (for graph database)
+- PostgreSQL (via psycopg2-binary)
+- Loguru (for logging)
+- Pandas & NumPy (for data processing)
+
+## Development Tools
+- Ruff (linting & formatting, replaces Black + Flake8 + isort)
+- Pyright (type checking, replaces MyPy)
+- Pytest (testing)

## Development Guidelines
- Always use a virtual environment
- Test API endpoints before committing
- Update documentation for API changes
-- Use logging for debugging
-- Handle errors gracefully
+- Use Loguru for logging; avoid print() statements
+- Handle errors gracefully with proper logging

## API Conventions
- Use snake_case for Python functions and variables
- Use kebab-case for API endpoints
-- Return consistent JSON responses
+- Return consistent JSON responses with `code`, `message`, `data` structure
- Include proper HTTP status codes
- Validate input data

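The `code`, `message`, `data` structure named above could be produced by one small helper so every endpoint returns the same shape. This is a sketch under assumptions: the helper name `build_response`, its defaults, and the choice to mirror the HTTP status in `code` are illustrative and not taken from the project.

```python
import json
from typing import Any, Dict, Optional


def build_response(
    data: Optional[Any] = None,
    message: str = "success",
    code: int = 200,
) -> Dict[str, Any]:
    """Build a response body with the `code`, `message`, and `data` keys.

    Args:
        data: Payload to return; None when there is nothing to return.
        message: Human-readable status message.
        code: Status code (assumed here to mirror the HTTP status).

    Returns:
        Dictionary ready to be serialized as JSON.
    """
    return {"code": code, "message": message, "data": data}


ok_body = build_response(data={"id": 1, "name": "example"})
error_body = build_response(message="name is required", code=400)

print(json.dumps(ok_body))
print(json.dumps(error_body))
```

In a Flask view, such a body would typically be returned as `jsonify(body), status`, which keeps the HTTP status code and the envelope in step.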
## Database
-- Use migrations for schema changes
+- PostgreSQL for relational data
+- Neo4j for graph data and relationships
+- Use Flask-Migrate/Alembic for schema migrations
- Follow naming conventions for tables and columns
- Implement proper indexing
- Use transactions for data consistency

+## Neo4j Graph Database
+- Use `Neo4jDriverSingleton` for connection management
+- Follow Cypher query best practices
+- Use parameterized queries to prevent injection
+- Close sessions properly after use
+
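To illustrate the parameterized-query rule, the sketch below keeps user input out of the Cypher text and passes it as a separate parameter map. The `Talent` label, the `name` property, and the helper `build_find_by_name_query` are assumptions made for the example; the interface of `Neo4jDriverSingleton` is not shown in this document, so the driver call appears only as a comment.

```python
from typing import Any, Dict, Tuple


def build_find_by_name_query(
    name: str,
    limit: int = 10,
) -> Tuple[str, Dict[str, Any]]:
    """Return a Cypher query and its parameters for a lookup by name.

    Args:
        name: Value to match against the (assumed) `name` property.
        limit: Maximum number of nodes to return.

    Returns:
        The query text with `$` placeholders and the parameter map.
    """
    query = (
        "MATCH (n:Talent) "
        "WHERE n.name = $name "
        "RETURN n "
        "LIMIT $limit"
    )
    return query, {"name": name, "limit": limit}


# Hostile input stays in the parameter map and never reaches the query text.
query, params = build_find_by_name_query("x' OR 1=1 //")

# With the official Neo4j Python driver this would typically be run as:
#     with driver.session() as session:  # the session is closed on exit
#         records = list(session.run(query, params))
```

Using the session as a context manager also covers the "close sessions properly" rule, since the session is released even if the query raises.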
## Security
- Validate all user inputs
-- Use environment variables for sensitive data
+- Use environment variables for sensitive data (see `env.example`)
- Implement proper authentication
-- Sanitize database queries
+- Use parameterized queries for both SQL and Cypher
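On the SQL side, the same rule can be sketched with bound parameters. The example uses the standard-library `sqlite3` module as a stand-in so it runs without a PostgreSQL server; the `users` table and the `find_users` helper are hypothetical. Placeholder syntax differs by driver (`?` in `sqlite3`, `%s` in psycopg2, `:name` with SQLAlchemy `text()`), but the principle is the same.

```python
import sqlite3
from contextlib import closing
from typing import List, Tuple


def find_users(conn: sqlite3.Connection, name: str) -> List[Tuple[int, str]]:
    """Look up users by exact name using a bound parameter.

    Args:
        conn: Open database connection.
        name: Name to match; passed as data, never formatted into the SQL.

    Returns:
        Matching rows as (id, name) tuples.
    """
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cursor.fetchall()


# closing() guarantees the connection is closed when the block exits.
with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    matches = find_users(conn, "alice")
    # The injection attempt is compared as a literal string and matches nothing.
    injected = find_users(conn, "alice' OR '1'='1")
```

Because the input is bound rather than interpolated, the injection string is treated as an ordinary value to compare against, not as SQL.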