# llm_prompts.yaml

# Prompt configuration file
# Contains all prompt templates used for LLM interactions
# Used by customllm/base_llm_chat.py
sql_generation:
  # Initial prompt for SQL generation
  initial_prompt: |
    You are a {dialect} expert.
    Please generate a SQL query that answers the question. Your response should ONLY be based on the given context and follow the response guidelines and format instructions.
  # Response guidelines for SQL generation
  response_guidelines: |
    ===Response Guidelines
    **IMPORTANT**: All SQL queries MUST use Chinese aliases for ALL columns in the SELECT clause.
    1. If the provided context is sufficient, please generate a valid SQL query without any explanations for the question.
    2. If the provided context is almost sufficient but requires knowledge of a specific string in a particular column, please generate an intermediate SQL query to find the distinct strings in that column. Prepend the query with a comment saying intermediate_sql.
    3. If the provided context is insufficient, please explain why the query cannot be generated.
    4. **Context Understanding**: If the question follows the [CONTEXT]...[CURRENT] format, replace pronouns in [CURRENT] with specific entities from [CONTEXT].
       - Example: If the context mentions 'Nancheng Service Area has the most stalls', and the current question is 'How many dining stalls does this service area have?',
         interpret it as 'How many dining stalls does Nancheng Service Area have?'
    5. Please use the most relevant table(s).
    6. If the question has been asked and answered before, please repeat the answer exactly as it was given before.
    7. Ensure that the output SQL is {dialect}-compliant and executable, and free of syntax errors.
    8. Always add NULLS LAST to ORDER BY clauses to handle NULL values properly (e.g., ORDER BY total DESC NULLS LAST).
    9. **MANDATORY**: ALL columns in SELECT must have Chinese aliases. This is non-negotiable:
       - Every column MUST use AS with a Chinese alias
       - Raw column names without aliases are NOT acceptable
       - Examples:
         * CORRECT: SELECT service_name AS 服务区名称, SUM(pay_sum) AS 总收入
         * WRONG: SELECT service_name, SUM(pay_sum) AS total_revenue
         * WRONG: SELECT service_name AS service_area, SUM(pay_sum) AS 总收入
       - Common aliases: COUNT(*) AS 数量, SUM(...) AS 总计, AVG(...) AS 平均值, MAX(...) AS 最大值, MIN(...) AS 最小值
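  # Illustrative only, not part of the prompt: a sketch of a query that would
  # satisfy guidelines 8 and 9, assuming {dialect} resolves to a dialect that
  # supports NULLS LAST (PostgreSQL, for example). The table name
  # `example_table` is hypothetical; the column names are taken from the
  # CORRECT example in guideline 9.
  #   SELECT service_name AS 服务区名称, SUM(pay_sum) AS 总收入
  #   FROM example_table
  #   GROUP BY service_name
  #   ORDER BY SUM(pay_sum) DESC NULLS LAST;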

chart_generation:
  # Chart generation instructions
  chinese_chart_instructions: |
    Create charts with the following requirements:
    1. Generate meaningful titles based on user questions and data content
    2. Generate accurate labels for X-axis and Y-axis based on the actual meaning of data columns
    3. If there are legends, ensure legend labels are descriptive
    4. All text (including titles, axis labels, legends, data labels, etc.) must be clear and meaningful
    5. Titles should concisely summarize what the chart is showing
    6. Axis labels should accurately reflect the business meaning of corresponding data columns
    7. Choose the most suitable chart type for the data characteristics (bar chart, line chart, pie chart, etc.)
    8. All chart text must be in Chinese.
  # System message template
  system_message_template: |
    User question: '{question}'
    Here is the pandas DataFrame data to answer the user's question:
    {sql_part}
    DataFrame structure information:
    {df_metadata}
  # User message template
  user_message_template: |
    Please generate Python Plotly visualization code for this DataFrame. Requirements:
    1. Assume the data is stored in a pandas DataFrame named 'df'
    2. If the DataFrame has only one value, use an Indicator chart
    3. Return only Python code without any explanations
    4. The code must be directly executable
    {chinese_chart_instructions}
    Special notes:
    - Do not use generic labels like 'Chart Title', 'X-axis Label', 'Y-axis Label'
    - Generate specific, meaningful labels based on actual data content and user questions
    - For example: if it's gender statistics, X-axis might be 'Gender', Y-axis might be 'Count' or 'Percentage'
    - The title should summarize the main content of the chart, such as 'Gender Distribution of Cardholders'
    Data labels and hover information requirements:
    - Do not use placeholder variables like %{text}
    - Use specific data values and units, e.g.: text=df['column_name'].astype(str) + ' people'
    - Hover information should be clear and easy to understand
    - Ensure all displayed text is actual data values, not variable placeholders
    Please generate all text content in Chinese.
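  # Note, based on an assumption: this file does not show how
  # customllm/base_llm_chat.py fills in user_message_template. If it uses
  # Python's str.format, the literal braces in `%{text}` would be parsed as a
  # replacement field named `text`, and formatting would fail unless a `text`
  # argument is supplied. In that case the line would need doubled braces,
  # roughly like this:
  #     - Do not use placeholder variables like %{{text}}
  # If the loader uses plain string replacement instead, no change is needed.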

question_generation:
  # Prompt for generating a question from SQL
  system_prompt: |
    Based on the SQL statement below, infer the user's business question. Return only a clear natural language question without any explanations or SQL content. Do not include table names. The question should end with a question mark.
    Please respond in Chinese.

chat_with_llm:
  # Default system prompt for chat conversations
  default_system_prompt: |
    You are a friendly AI assistant. Please respond in Chinese.

question_merge:
  # Question merging system prompt
  system_prompt: |
    Your goal is to merge a series of related questions into a single question. If the second question is unrelated and completely independent from the first question, return the second question.
    Return only the new merged question without any additional explanations. The question should theoretically be answerable with a single SQL statement.
    Please respond in Chinese.

summary_generation:
  # Summary generation system message
  system_message_template: |
    You are a professional data analysis assistant. The user asked: '{question}'
    Here is the pandas DataFrame data from the query results:{df_markdown}
    Please analyze the data within the context provided and respond accordingly.
  # Summary generation user instructions
  user_instructions: |
    Based on the user's question, please briefly summarize this data. Requirements:
    1. Provide only a brief summary without adding extra explanations
    2. If there are numbers in the data, maintain appropriate precision
    Please respond in Chinese.