Data Flow

In UltraRAG, the Pipeline binds data by variable name: each tool declares its input parameters and output variables at registration, and the Pipeline uses those names to pass and share data between steps at run time. This mechanism is simple and intuitive, and it makes sequential data flows easy to build. In multi-turn calls or complex control structures, however, variable names can collide and data can be overwritten. To address this, UltraRAG provides a parameter renaming mechanism that lets developers rename variables in the Pipeline without modifying the source code.

How Does Data Flow?

Each tool declares its input and output variable names during registration, thereby determining the entry and exit of the data flow. For example:

def __init__(self, mcp_inst):
    mcp_inst.tool(
        self.retriever_search,
        output="q_ls,top_k->ret_psg",
    )

def retriever_search(self, q_ls, top_k) -> ...:
    ...
    return {"ret_psg": ...}

Here, the definition indicates:

The tool receives two input variables: q_ls and top_k
The tool returns one output variable: ret_psg
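The registration string above can be read as "inputs on the left of ->, outputs on the right". As a minimal sketch of that reading, the hypothetical helper parse_io_spec below splits such a string into two lists of names. The helper name and its behavior are assumptions made for illustration, not UltraRAG's actual parser.

```python
from typing import List, Tuple


def parse_io_spec(spec: str) -> Tuple[List[str], List[str]]:
    """Split a spec such as "q_ls,top_k->ret_psg" into input and output names.

    Hypothetical helper for illustration only; not part of UltraRAG.
    """
    inputs_part, sep, outputs_part = spec.partition("->")
    if not sep:
        raise ValueError(f"missing '->' in spec: {spec!r}")
    # Drop surrounding whitespace and ignore empty entries.
    inputs = [name.strip() for name in inputs_part.split(",") if name.strip()]
    outputs = [name.strip() for name in outputs_part.split(",") if name.strip()]
    return inputs, outputs


inputs, outputs = parse_io_spec("q_ls,top_k->ret_psg")
print(inputs)   # ['q_ls', 'top_k']
print(outputs)  # ['ret_psg']
```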

If you call the same tool (such as retriever_search) multiple times and want to pass in a different variable each time (e.g., q_ls on the first call and sub_q_ls on the second), you need a way to tell the Pipeline that these variables are effectively “synonyms” for the same parameter.

Parameter Renaming Mechanism

To resolve variable name conflicts and binding ambiguity, UltraRAG provides a flexible Parameter Renaming Mechanism. You can use the input: and output: fields in pipeline.yaml to explicitly map parameters to Pipeline variables, which redirects data binding without modifying the Server's internal code.

- module.tool:
    input:
      function_parameter_name: variable_name_in_pipeline
    output:
      tool_output_key: variable_name_in_pipeline

This mechanism follows the principle of “explicit binding by name”: input: maps the function’s input parameter names, and output: maps the output keys defined during tool registration.

The simplest approach is to use the same parameter names in the function definition and in the tool registration. Then the two binding rules above refer to the same names, and you do not need to tell them apart.
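To make the two binding rules concrete, here is a minimal sketch of how a step with input: and output: mappings could be resolved against a shared variable store. The run_step function, the store dictionary, and the toy retriever_search are all hypothetical stand-ins; UltraRAG's real executor may work differently.

```python
from typing import Any, Callable, Dict, List, Optional


def run_step(
    tool: Callable[..., Dict[str, Any]],
    param_names: List[str],
    output_keys: List[str],
    store: Dict[str, Any],
    input_map: Optional[Dict[str, str]] = None,
    output_map: Optional[Dict[str, str]] = None,
) -> None:
    """Run one step, honoring optional renames (hypothetical executor)."""
    input_map = input_map or {}
    output_map = output_map or {}
    # input: function parameter name -> variable name in the pipeline
    kwargs = {p: store[input_map.get(p, p)] for p in param_names}
    result = tool(**kwargs)
    # output: registered output key -> variable name in the pipeline
    for key in output_keys:
        store[output_map.get(key, key)] = result[key]


def retriever_search(q_ls: List[str], top_k: int) -> Dict[str, Any]:
    # Toy stand-in for a real retriever: top_k fake passages per query.
    return {"ret_psg": [[f"passage for {q}"] * top_k for q in q_ls]}


store: Dict[str, Any] = {"sub_q_ls": ["what is RAG?"], "top_k": 2}
run_step(
    retriever_search,
    param_names=["q_ls", "top_k"],
    output_keys=["ret_psg"],
    store=store,
    input_map={"q_ls": "sub_q_ls"},
    output_map={"ret_psg": "round1_result"},
)
print(store["round1_result"])
```

In this sketch, unmapped names fall back to themselves, which is why top_k is read from the store without any entry under input:.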

Example 1: Input Variable Renaming

Suppose the tool function is declared as follows:

async def retriever_search(
        self,
        query_list: List[str],
        top_k: Optional[int] | None = None,
        query_instruction: str = "",
        use_openai: bool = False,
    ) -> Dict[str, List[List[str]]]:

You can explicitly rename the input variable in the Pipeline:

- retriever.retriever_search:
    input:
      query_list: sub_q_ls

Here, the tool expects an input parameter named query_list. The input: field maps that parameter to the Pipeline variable sub_q_ls, so the tool reads its queries from sub_q_ls.

Input parameter mapping is performed based on the parameter names in the function declaration.
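Because the keys under input: must match the function's declared parameter names, a mapping can be checked against the signature with Python's standard inspect module. The sketch below uses a standalone copy of the declaration above (without self) and a hypothetical helper, unknown_input_keys. Whether UltraRAG itself performs such a check is not stated here.

```python
import inspect
from typing import Callable, Dict, List, Optional


async def retriever_search(
    query_list: List[str],
    top_k: Optional[int] = None,
    query_instruction: str = "",
    use_openai: bool = False,
) -> Dict[str, List[List[str]]]:
    # Toy body: one empty passage list per query.
    return {"ret_psg": [[] for _ in query_list]}


def unknown_input_keys(func: Callable, input_map: Dict[str, str]) -> List[str]:
    """Return input: keys that are not parameters of func (hypothetical check)."""
    params = set(inspect.signature(func).parameters)
    return sorted(key for key in input_map if key not in params)


# Matches the declared parameter name, so nothing is flagged.
print(unknown_input_keys(retriever_search, {"query_list": "sub_q_ls"}))  # []
# q_ls is not a parameter of this declaration, so it is flagged.
print(unknown_input_keys(retriever_search, {"q_ls": "sub_q_ls"}))  # ['q_ls']
```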

Example 2: Output Variable Renaming

Suppose the tool is defined as follows during registration:

mcp_inst.tool(
    self.retriever_search,
    output="q_ls,top_k,query_instruction,use_openai->ret_psg",
)

You can rewrite the output variable name in the Pipeline:

- retriever.retriever_search:
    output:
      ret_psg: round1_result

In this case, regardless of the variable names used inside the function, as long as the output key is registered as ret_psg, the result is stored in round1_result for use in subsequent steps.

Output variable mapping is performed based on the output key specified during tool registration.

If a downstream module depends on this output result:

@app.prompt(output="q_ls,ret_psg,template->prompt_ls")
def qa_rag_boxed(
    q_ls: List[str], ret_psg: List[str | Any], template: str | Path
) -> list[PromptMessage]:

Then you can explicitly complete input redirection in the Pipeline:

- prompt.qa_rag_boxed:
    input:
      ret_psg: round1_result

In this way, the input ret_psg expected by qa_rag_boxed will be read from round1_result of the previous step, achieving data transfer.

Example 3: Renaming Input and Output Simultaneously

- retriever.retriever_search:
    input:
      q_ls: round1_query
    output:
      ret_psg: round1_result

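This pattern is particularly common in loop structures: each round of retrieval can use its own input and output variables to avoid naming conflicts. Note that the key under input: (q_ls here) must match the parameter name in the function declaration, as in the first retriever_search definition above; if the function declares query_list instead, use query_list as the key.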
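The effect of per-round renaming can be simulated in a few lines. In the sketch below, the same toy tool runs twice and each round reads from and writes to its own variables, so the second round does not overwrite the first. The run_round helper, the store dictionary, and the roundN_query / roundN_result naming scheme are assumptions made for illustration.

```python
from typing import Any, Dict, List


def retriever_search(q_ls: List[str]) -> Dict[str, Any]:
    # Toy stand-in for a real retriever.
    return {"ret_psg": [f"passages for {q}" for q in q_ls]}


def run_round(store: Dict[str, Any], input_var: str, output_var: str) -> None:
    """Run one retrieval round with renamed input and output (hypothetical)."""
    result = retriever_search(q_ls=store[input_var])  # input: q_ls -> input_var
    store[output_var] = result["ret_psg"]             # output: ret_psg -> output_var


store: Dict[str, Any] = {
    "round1_query": ["original question"],
    "round2_query": ["follow-up question"],
}

for i in (1, 2):
    run_round(store, f"round{i}_query", f"round{i}_result")

# Both rounds are still available because each wrote to its own variable.
print(store["round1_result"])
print(store["round2_result"])
```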

Used judiciously, parameter renaming keeps your RAG process clean and controllable in complex scenarios such as multi-turn iterations and dynamic branches, without modifying the source code.
