Search-R1 Tools
search_r1_query_extract
- Function: Extracts query content from model response.
- Logic: Uses the regex `r"<search>([^<]*)"` to extract the content inside the last `<search>` tag. If no tag is found, returns `"There is no query."`; if the query does not end with `?`, one is appended automatically.
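The logic above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's actual source: the function name is hypothetical, and trimming surrounding whitespace is an added assumption.

```python
import re


def search_r1_query_extract_sketch(response: str) -> str:
    """Hypothetical sketch of the extraction described above."""
    # findall returns one capture per <search> tag; the last is the newest.
    matches = re.findall(r"<search>([^<]*)", response)
    if not matches:
        return "There is no query."
    query = matches[-1].strip()  # assumption: whitespace is trimmed
    # Complete the query with a question mark when it is missing.
    if not query.endswith("?"):
        query += "?"
    return query
```

Because `[^<]*` stops at the next `<`, the closing `</search>` tag is never included in the captured query.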
r1_searcher_query_extract
- Function: Extracts query from R1-Searcher response.
- Logic: Uses the regex `r"<|begin_of_query|>([^<]*)"` to extract the content of the last tag.
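A hedged sketch of the same idea follows. Note that in Python's `re` module an unescaped `|` means alternation, so this sketch escapes the tag with `re.escape`; whether the actual tool does so is not stated here. The function name, the empty-string fallback, and the `<|end_of_query|>` closing tag in the test input are assumptions.

```python
import re

BEGIN_TAG = "<|begin_of_query|>"


def r1_searcher_query_extract_sketch(response: str) -> str:
    """Hypothetical sketch: return the content after the last begin tag."""
    # Escape the tag so the pipes are matched literally, not as alternation.
    pattern = re.escape(BEGIN_TAG) + r"([^<]*)"
    matches = re.findall(pattern, response)
    # Assumption: an empty string signals that no query was found.
    return matches[-1].strip() if matches else ""
```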
IRCoT & IterRetGen Tools
iterretgen_nextquery
- Function: Iterative retrieval generation.
- Logic: `next_query = f"{q} {ans}"`. Concatenates the original question and the generated answer to form the query for the next retrieval.
ircot_get_first_sent
- Function: Extracts the first sentence of the answer (up to period or question/exclamation mark).
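One way to take the first sentence is a lazy regex match up to the first terminator. This is a sketch with a hypothetical function name; returning the whole text when no terminator exists is an assumption.

```python
import re


def ircot_get_first_sent_sketch(answer: str) -> str:
    """Hypothetical sketch: text up to and including the first . ? or !"""
    text = answer.strip()
    # Lazy match: stop at the earliest sentence terminator.
    match = re.search(r".+?[.?!]", text, flags=re.DOTALL)
    # Assumption: with no terminator, the whole text is the sentence.
    return match.group(0) if match else text
```

A plain terminator match like this also splits at abbreviations and decimal points (for example `Mr.` or `3.14`), which may or may not matter for the answers being processed.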
ircot_extract_ans
- Function: Extracts the final answer.
- Logic: Matches the content after `so the answer is [...]`.
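The matching step might look like the sketch below. The function name is hypothetical, and the case-insensitive match, the trailing-period cleanup, and the empty-string fallback are all assumptions rather than documented behavior.

```python
import re


def ircot_extract_ans_sketch(text: str) -> str:
    """Hypothetical sketch: return what follows 'so the answer is'."""
    match = re.search(
        r"so the answer is[:\s]*(.+)", text, flags=re.IGNORECASE | re.DOTALL
    )
    if not match:
        return ""  # assumption: empty string when the phrase is absent
    # Assumption: drop surrounding whitespace and a trailing period.
    return match.group(1).strip().rstrip(".").strip()
```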
Search-o1 Tools
search_o1_init_list
- Function: Initializes the accumulation lists required by Search-o1 (sub-questions, reasoning, final information), each initially filled with `<PAD>`.
search_o1_combine_list
- Function: Appends the extracted Query and Reasoning of the current step to the total lists.
search_o1_query_extract
- Function: Extracts the content enclosed in `<|begin_search_query|>...<|end_search_query|>`.
search_o1_reasoning_extract
- Function: Extracts all text before `<|begin_search_query|>` as the reasoning process.
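The query and reasoning extraction can both be sketched with `str.partition`, which avoids regex escaping of the pipe characters. Function names are hypothetical; using the first tag occurrence and returning an empty query when no tag exists are assumptions.

```python
BEGIN = "<|begin_search_query|>"
END = "<|end_search_query|>"


def search_o1_query_extract_sketch(response: str) -> str:
    """Hypothetical sketch: text between the begin and end tags."""
    _, found, rest = response.partition(BEGIN)
    if not found:
        return ""  # assumption: empty string when no query tag exists
    query, _, _ = rest.partition(END)
    return query.strip()


def search_o1_reasoning_extract_sketch(response: str) -> str:
    """Hypothetical sketch: everything before the begin tag."""
    reasoning, _, _ = response.partition(BEGIN)
    return reasoning.strip()
```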
search_o1_extract_final_information
- Function: Extracts the content after the `**Final Information**` marker.
Utility Tools
output_extract_from_boxed
- Function: Extracts the answer from a LaTeX `\boxed{...}`. Supports nested bracket handling and format cleaning.
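Nested braces are the reason a simple regex is not enough here: `\boxed{\frac{1}{2}}` needs brace counting. The sketch below illustrates that idea with a hypothetical function name; choosing the last `\boxed{` and reducing "format cleaning" to whitespace trimming are assumptions.

```python
def output_extract_from_boxed_sketch(text: str) -> str:
    """Hypothetical sketch: content of the last \\boxed{...}, nesting-aware."""
    marker = r"\boxed{"
    start = text.rfind(marker)  # assumption: the last box holds the answer
    if start == -1:
        return ""
    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                break  # matching close of \boxed{ reached
        chars.append(ch)
    return "".join(chars).strip()
```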
merge_passages
- Function: Appends the `temp_psg` list to the `ret_psg` list.
evisrag_output_extract_from_special
- Function: Extracts the answer from `<answer>...</answer>` tags.
assign_citation_ids / assign_citation_ids_stateful
- `assign_citation_ids`: Assigns citation IDs of the form `[1]`, `[2]` to retrieved passages.
- `assign_citation_ids_stateful`: Uses the `CitationRegistry` class to maintain global citation IDs (cross-step deduplication).
- `init_citation_registry`: Resets the global citation registry.
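The difference between the stateless and stateful variants can be illustrated with a small sketch. `CitationRegistrySketch` is a hypothetical stand-in, not the real `CitationRegistry` class, and deduplicating on exact passage text is an assumption.

```python
class CitationRegistrySketch:
    """Hypothetical stand-in for a registry of global citation IDs."""

    def __init__(self):
        self._ids = {}

    def get_id(self, passage: str) -> int:
        # Reuse the existing ID if this passage was seen in an earlier step.
        if passage not in self._ids:
            self._ids[passage] = len(self._ids) + 1
        return self._ids[passage]

    def reset(self):
        self._ids.clear()


def assign_citation_ids_sketch(passages):
    # Stateless: numbering restarts at [1] on every call.
    return [f"[{i}] {p}" for i, p in enumerate(passages, start=1)]


def assign_citation_ids_stateful_sketch(registry, passages):
    # Stateful: IDs persist across calls, so repeats keep their number.
    return [f"[{registry.get_id(p)}] {p}" for p in passages]
```

In this sketch a passage retrieved in two different steps keeps the same number, which is what cross-step deduplication implies.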
SurveyCPM Tools
surveycpm_state_init
- Function: Initializes SurveyCPM state machine.
- Initial State: `state="search"`, `cursor="outline"`, `step=0`.
surveycpm_parse_search_response
- Function: Parses search instructions (JSON or XML format) generated by the model, extracts keyword list.
surveycpm_process_passages
- Function: Processes retrieved passages, deduplicates, limits quantity (Top-K), and concatenates into string.
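The deduplicate, truncate, and concatenate steps might be sketched as below. The function name, the `top_k` parameter name, the input shape (one passage list per keyword), and the newline separator are all assumptions.

```python
def surveycpm_process_passages_sketch(passage_lists, top_k=3):
    """Hypothetical sketch: dedupe, keep the first top_k, join as text."""
    seen = set()
    unique = []
    for passages in passage_lists:
        for passage in passages:
            key = passage.strip()
            # Skip empty passages and ones already collected.
            if key and key not in seen:
                seen.add(key)
                unique.append(key)
    # Assumption: passages are joined with newlines.
    return "\n".join(unique[:top_k])
```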
surveycpm_after_init_plan / after_write / after_extend
- Function: Parses Agent response for different stages (initialize outline, write content, extend plan).
- Logic:
  - Calls `surveycpm_parse_response` to validate format and content.
  - Updates `survey_ls` (outline structure) and `cursor_ls` (current cursor position) if successful.
  - Keeps the original state for retry if failed.
surveycpm_update_state
- Function: Core state machine logic.
- State Transition:
  - `search` -> `analyst-init_plan` (cursor="outline")
  - `search` -> `write` (cursor=section-X)
  - `write` -> `search` (continue writing) or `analyst-extend_plan` (finished current section)
  - `analyst-extend_plan` -> `search` (extend success) or `done` (no extension)
  - Exceeds max steps -> `done`
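The transitions listed above can be read as a small pure function. This is a sketch only: the function signature and the `section_finished` and `extended` flags are hypothetical, and states with no listed transition are left unchanged.

```python
def surveycpm_update_state_sketch(state, cursor, step, max_step,
                                  section_finished=False, extended=False):
    """Hypothetical sketch of the listed state transitions."""
    # Exceeding the step budget always ends the run.
    if step > max_step:
        return "done"
    if state == "search":
        # The outline cursor means no plan exists yet.
        return "analyst-init_plan" if cursor == "outline" else "write"
    if state == "write":
        return "analyst-extend_plan" if section_finished else "search"
    if state == "analyst-extend_plan":
        return "search" if extended else "done"
    # Assumption: any state not listed above is returned unchanged.
    return state
```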
surveycpm_format_output
- Function: Converts final Survey JSON to Markdown format.
- Processing: Automatically handles heading levels (`#`, `##`, `###`), citation formatting (`\cite{...}` to `[1]`), and text cleaning.
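The citation step of the conversion can be sketched with a regex substitution. The function name is hypothetical, and numbering keys by first appearance and splitting multi-key citations on commas are assumptions.

```python
import re


def convert_cites_sketch(text: str) -> str:
    """Hypothetical sketch: replace \\cite{key} with numeric markers."""
    ids = {}

    def replace(match):
        markers = []
        for key in match.group(1).split(","):
            key = key.strip()
            # Assumption: keys are numbered in order of first appearance.
            if key not in ids:
                ids[key] = len(ids) + 1
            markers.append(f"[{ids[key]}]")
        return "".join(markers)

    return re.sub(r"\\cite\{([^}]*)\}", replace, text)
```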
Configuration
| Parameter | Type | Description |
|---|---|---|
| `surveycpm_hard_mode` | bool | Whether to enable SurveyCPM's strict parsing mode (validates JSON field integrity) |
| `surveycpm_max_step` | int | Maximum total number of execution steps; the run is forced to end if exceeded |
| `surveycpm_max_extend_step` | int | Maximum number of plan extensions |