Server Introduction

In a typical RAG system, the workflow is split across several functional modules, such as a Retriever and a Generator. Each module handles a different task, and pipeline orchestration coordinates them to carry out complex question answering and reasoning. UltraRAG builds on the MCP (Model Context Protocol) architecture to wrap these modules in one standardized unit: the Server.
A Server is a self-contained RAG component with its own independent function.
Each Server encapsulates one core task (such as retrieval, generation, or evaluation) and exposes it through standardized, function-level interfaces called Tools. Servers can therefore be combined, called, and reused freely within a complete Pipeline, which keeps the system modular and extensible.
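The composition idea can be illustrated without UltraRAG itself. The following is a minimal sketch in plain Python, not UltraRAG's implementation: MiniServer, run_pipeline, and the two toy tools are hypothetical names invented for this illustration. It shows two independent "servers" that each register one tool, and a pipeline that chains them by passing each tool's output into a shared context.

```python
import inspect
from typing import Any, Callable, Dict, List, Tuple


class MiniServer:
    """Hypothetical stand-in for a Server: a named collection of tools."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tools: Dict[str, Callable[..., Dict[str, Any]]] = {}

    def tool(self, fn: Callable[..., Dict[str, Any]]) -> Callable[..., Dict[str, Any]]:
        # Register the function under its own name and return it unchanged.
        self.tools[fn.__name__] = fn
        return fn


retriever = MiniServer("retriever")
generator = MiniServer("generator")


@retriever.tool
def search(query: str) -> Dict[str, Any]:
    # Toy retrieval: every query "matches" one fixed passage.
    return {"passages": [f"passage about {query}"]}


@generator.tool
def answer(query: str, passages: List[str]) -> Dict[str, Any]:
    # Toy generation: report how many passages were used.
    return {"answer": f"{query}: answered from {len(passages)} passage(s)"}


def run_pipeline(
    steps: List[Tuple[MiniServer, str]], context: Dict[str, Any]
) -> Dict[str, Any]:
    """Run (server, tool_name) steps in order, merging outputs into context."""
    for server, tool_name in steps:
        fn = server.tools[tool_name]
        # Pass only the context entries that the tool declares as parameters.
        wanted = inspect.signature(fn).parameters
        context.update(fn(**{key: context[key] for key in wanted}))
    return context


result = run_pipeline([(retriever, "search"), (generator, "answer")], {"query": "RAG"})
print(result["answer"])
```

Because each tool only declares what it needs and returns a dictionary, either server could be swapped for another one with the same input and output keys without touching the pipeline.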

Server Development

This section walks through the complete process of building a custom Server from scratch, using a simple example.

Step 1: Create Server File

First, create a folder named sayhello under the servers folder, and create a source code directory sayhello/src in it. Then, create a file sayhello.py in the src directory as the main program entry of the Server. In UltraRAG, all Servers are instantiated through the base class UltraRAG_MCP_Server. The example is as follows:
servers/sayhello/src/sayhello.py
from ultrarag.server import UltraRAG_MCP_Server

app = UltraRAG_MCP_Server("sayhello")

if __name__ == "__main__":
    # Start the sayhello server using stdio transport
    app.run(transport="stdio")

Step 2: Implement Tool Functions

Use the @app.tool decorator to register tool functions (Tools). The Pipeline calls these functions during execution to carry out specific logic. The following example defines the simplest possible greeting function, greet, which takes a name and returns the corresponding greeting:
servers/sayhello/src/sayhello.py
from typing import Dict
from ultrarag.server import UltraRAG_MCP_Server

app = UltraRAG_MCP_Server("sayhello")

@app.tool(output="name->msg")
def greet(name: str) -> Dict[str, str]:
    ret = f"Hello, {name}!"
    app.logger.info(ret)
    return {"msg": ret}

if __name__ == "__main__":
    # Start the sayhello server using stdio transport
    app.run(transport="stdio")
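
The string output="name->msg" in the decorator reads naturally as "takes name, produces msg". Assuming that reading (input names to the left of the arrow, output keys to the right; UltraRAG's actual parsing may differ), a tool function can be checked against such a string before it is wired into a Server. The helpers parse_io_spec and check_tool below are hypothetical and not part of UltraRAG.

```python
import inspect
from typing import Any, Callable, Dict, List, Tuple


def parse_io_spec(spec: str) -> Tuple[List[str], List[str]]:
    """Split 'a,b->c' into (['a', 'b'], ['c']). The format is an assumption."""
    left, sep, right = spec.partition("->")
    if not sep:
        raise ValueError(f"missing '->' in spec: {spec!r}")
    inputs = [part.strip() for part in left.split(",") if part.strip()]
    outputs = [part.strip() for part in right.split(",") if part.strip()]
    return inputs, outputs


def check_tool(fn: Callable[..., Dict[str, Any]], spec: str, **kwargs: Any) -> Dict[str, Any]:
    """Call fn and verify its parameters and returned keys against spec."""
    inputs, outputs = parse_io_spec(spec)
    params = list(inspect.signature(fn).parameters)
    if params != inputs:
        raise TypeError(f"{fn.__name__} takes {params}, spec declares {inputs}")
    result = fn(**kwargs)
    missing = [key for key in outputs if key not in result]
    if missing:
        raise KeyError(f"{fn.__name__} did not return keys: {missing}")
    return result


def greet(name: str) -> Dict[str, str]:
    # Same logic as the tool above, without the server or the logger.
    return {"msg": f"Hello, {name}!"}


print(check_tool(greet, "name->msg", name="UltraRAG v3"))
```

A check like this catches a mismatch between the declared string and the function signature early, before the Server is started.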

Step 3: Configure Parameter File

Next, create a parameter configuration file parameter.yaml under the sayhello folder. This file is used to declare the input parameters required by the Tool and their default values, facilitating automatic loading and passing during Pipeline runtime. The example is as follows:
servers/sayhello/parameter.yaml
name: UltraRAG v3
Here, the parameter name is defined with a default value of “UltraRAG v3”.

Parameter Registration Mechanism

If there are parameter naming conflicts between different Prompt Tools, please refer to the “Multi-Prompt Tool Calling Scenario” section in Prompt Server for solutions.
During the build phase, UltraRAG automatically reads the parameter.yaml file in each Server directory and uses it to detect and register the parameters that the tool functions require. Keep the following points in mind:
  • Parameter sharing: When multiple Tools need the same parameter (such as template or model_name_or_path), declare it once in parameter.yaml; every Tool reuses that single definition.
  • Field overwrite risk: If multiple Tools use the same parameter name with different meanings or default values, give each one a distinct name. Otherwise one definition overwrites the other in the automatically generated configuration file.
  • Automatic context inference: If an input parameter of a tool function does not appear in parameter.yaml, UltraRAG by default tries to infer it from the runtime context, that is, from the output of upstream Tools. A parameter therefore only needs an explicit entry in parameter.yaml when it cannot be passed through the context.
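The three points above can be sketched as a small resolver. This is an illustration only, not UltraRAG's code: find_conflicts, resolve_args, and the shout tool are hypothetical, plain dictionaries stand in for the parsed parameter.yaml, and the lookup order (declared parameter first, then runtime context) is an assumption drawn from the description above.

```python
import inspect
from typing import Any, Callable, Dict, List


def find_conflicts(per_tool: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return parameter names declared with different defaults by different tools."""
    first_seen: Dict[str, Any] = {}
    conflicts = set()
    for params in per_tool.values():
        for key, value in params.items():
            if key in first_seen and first_seen[key] != value:
                conflicts.add(key)
            first_seen.setdefault(key, value)
    return sorted(conflicts)


def resolve_args(
    fn: Callable[..., Any], parameters: Dict[str, Any], context: Dict[str, Any]
) -> Dict[str, Any]:
    """Fill each argument of fn from declared parameters, else from context."""
    args: Dict[str, Any] = {}
    for name in inspect.signature(fn).parameters:
        if name in parameters:      # declared explicitly (assumed to be checked first)
            args[name] = parameters[name]
        elif name in context:       # produced by an upstream tool
            args[name] = context[name]
        else:
            raise KeyError(f"no value for parameter {name!r}")
    return args


def greet(name: str) -> Dict[str, str]:
    return {"msg": f"Hello, {name}!"}


def shout(msg: str) -> Dict[str, str]:
    # Hypothetical downstream tool: 'msg' is never declared, only inferred.
    return {"loud": msg.upper()}


parameters = {"name": "UltraRAG v3"}   # stands in for parameter.yaml
context: Dict[str, Any] = {}
context.update(greet(**resolve_args(greet, parameters, context)))
context.update(shout(**resolve_args(shout, parameters, context)))
print(context["loud"])

# Two tools reusing one name with different defaults would be flagged.
print(find_conflicts({"tool_a": {"template": "a.jinja"}, "tool_b": {"template": "b.jinja"}}))
```

Here name comes from the declared parameters, while msg is picked up from the output of greet, so it needs no entry in the parameter file.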

Encapsulating Shared Variables via Class

In some scenarios, we may want to maintain shared state or variables within the same Server, such as model instances, cache objects, configurations, etc. In this case, the Server can be encapsulated as a class, and the definition of shared variables and Tool registration can be completed during the initialization phase of the class. The following example demonstrates how to encapsulate the sayhello Server as a class to achieve internal variable sharing:
servers/sayhello/src/sayhello.py
from typing import Dict
from ultrarag.server import UltraRAG_MCP_Server

app = UltraRAG_MCP_Server("sayhello")

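class Sayhello:
    def __init__(self, mcp_inst: UltraRAG_MCP_Server):
        # Keep a reference to the server so Tools can use its logger
        self.mcp = mcp_inst
        # Shared state, available to every Tool registered on this instance
        self.sen = "Nice to meet you"
        # Register the bound method as a Tool
        mcp_inst.tool(self.greet, output="name->msg")

    def greet(self, name: str) -> Dict[str, str]:
        ret = f"Hello, {name}! {self.sen}!"
        self.mcp.logger.info(ret)
        return {"msg": ret}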

if __name__ == "__main__":
    Sayhello(app)
    app.run(transport="stdio")
In this example, self.sen stands in for any variable that needs to be shared between different Tools. This approach is particularly suitable for Servers that load a model once or reuse the same configuration parameters across several Tools.
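To make the sharing concrete, the sketch below registers two Tools on one instance and confirms that the setup code runs only once. It runs without UltraRAG: FakeServer is a hypothetical stand-in that mimics only the tool(fn, output=...) call shape used above, and Greeter, farewell, and load_count are names invented for this illustration.

```python
from typing import Callable, Dict


class FakeServer:
    """Hypothetical stand-in that only records registered tools."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., Dict[str, str]]] = {}

    def tool(self, fn: Callable[..., Dict[str, str]], output: str = "") -> Callable[..., Dict[str, str]]:
        # The output string is accepted but ignored in this sketch.
        self.tools[fn.__name__] = fn
        return fn


class Greeter:
    load_count = 0  # counts how often the (pretend) expensive setup runs

    def __init__(self, server: FakeServer) -> None:
        Greeter.load_count += 1            # e.g. loading a model would go here
        self.sen = "Nice to meet you"      # state shared by both tools
        server.tool(self.greet, output="name->msg")
        server.tool(self.farewell, output="name->msg")

    def greet(self, name: str) -> Dict[str, str]:
        return {"msg": f"Hello, {name}! {self.sen}!"}

    def farewell(self, name: str) -> Dict[str, str]:
        return {"msg": f"Goodbye, {name}! {self.sen}!"}


server = FakeServer()
Greeter(server)
print(server.tools["greet"]("Ada")["msg"])
print(server.tools["farewell"]("Ada")["msg"])
print(Greeter.load_count)
```

Both registered Tools are bound methods of the same instance, so they read the same self.sen and the setup in __init__ is paid for once rather than once per Tool.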