
Pipeline Introduction

In UltraRAG, a Pipeline is a script that defines how an inference task is executed. It works like a task schedule, spelling out the operation the system performs at each step. Through a Pipeline you can flexibly combine functions (Tools) from different modules (Servers) to build a complete, reproducible, and controllable RAG inference process. For example:
  • Load data → Retrieve documents → Construct prompt → Call large model → Evaluate results;
  • Or in multi-turn generation, decide whether to re-retrieve or stop generation early based on the model’s intermediate performance.
With a single YAML file, you can define and run a complete RAG inference process.
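
To make the "task schedule" idea concrete, here is a minimal Python sketch of serial execution: an ordered list of step names, each looked up in a registry and applied to a shared state. It is an illustration only, not UltraRAG's implementation; the step functions, the `REGISTRY` mapping, and the `run_pipeline` helper are all hypothetical stand-ins.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Step = Callable[[State], State]


def load_data(state: State) -> State:
    # Stand-in for loading a benchmark: a single hard-coded question.
    return {**state, "question": "What is RAG?"}


def retrieve(state: State) -> State:
    # Stand-in for document retrieval: returns one fixed passage.
    return {**state, "passages": ["RAG combines retrieval with generation."]}


def build_prompt(state: State) -> State:
    # Combine the retrieved passages and the question into one prompt string.
    context = "\n".join(state["passages"])
    prompt = f"Context:\n{context}\n\nQuestion: {state['question']}"
    return {**state, "prompt": prompt}


# Hypothetical registry mapping step names to functions.
REGISTRY: Dict[str, Step] = {
    "load_data": load_data,
    "retrieve": retrieve,
    "build_prompt": build_prompt,
}


def run_pipeline(steps: List[str], registry: Dict[str, Step]) -> State:
    # Run the steps strictly in order, threading the state through each one.
    state: State = {}
    for name in steps:
        state = registry[name](state)
    return state


result = run_pipeline(["load_data", "retrieve", "build_prompt"], REGISTRY)
print(result["prompt"])
```

Because the step order lives in a plain list, reordering or swapping steps changes the data rather than the runner, which is the same separation a YAML Pipeline file provides.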

Writing Specifications

A Pipeline is written as a YAML file that defines the complete task execution process. The file usually has two top-level keys:
  • servers: Declares all MCP Server modules used in the current process. Each Server corresponds to a functional module (such as retrieval, generation, evaluation, etc.), where the key is the module name and the value is its path in the project.
  • pipeline: Defines the execution logic of the task. Each item is either an execution step or a flow-control node. Supported control structures include serial execution, loops, and conditional branches.
examples/rag_full.yaml
servers:
  benchmark: servers/benchmark
  retriever: servers/retriever
  prompt: servers/prompt
  generation: servers/generation
  evaluation: servers/evaluation
  custom: servers/custom

pipeline:
- benchmark.get_data
- retriever.retriever_init
- retriever.retriever_embed
- retriever.retriever_index
- retriever.retriever_search
- generation.generation_init
- prompt.qa_rag_boxed
- generation.generate
- custom.output_extract_from_boxed
- evaluation.evaluate
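
Each step in the listing above has the form `server.tool`, and the prefix before the dot matches a key under `servers`. The following Python sketch checks that relationship for a config that has already been parsed into a dictionary (for instance with a YAML parser such as PyYAML's `yaml.safe_load`). It is not UltraRAG's own validation logic; the `undeclared_servers` helper is hypothetical, and non-string entries such as loop or branch nodes are simply skipped.

```python
from typing import Any, Dict, List

# The example Pipeline, written out as the dictionary a YAML parser would produce.
config: Dict[str, Any] = {
    "servers": {
        "benchmark": "servers/benchmark",
        "retriever": "servers/retriever",
        "prompt": "servers/prompt",
        "generation": "servers/generation",
        "evaluation": "servers/evaluation",
        "custom": "servers/custom",
    },
    "pipeline": [
        "benchmark.get_data",
        "retriever.retriever_init",
        "retriever.retriever_embed",
        "retriever.retriever_index",
        "retriever.retriever_search",
        "generation.generation_init",
        "prompt.qa_rag_boxed",
        "generation.generate",
        "custom.output_extract_from_boxed",
        "evaluation.evaluate",
    ],
}


def undeclared_servers(cfg: Dict[str, Any]) -> List[str]:
    """Return server names used in `pipeline` but missing from `servers`."""
    declared = set(cfg["servers"])
    missing: List[str] = []
    for step in cfg["pipeline"]:
        if not isinstance(step, str):
            continue  # control nodes (loops, branches) are out of scope here
        server, _, _tool = step.partition(".")
        if server not in declared and server not in missing:
            missing.append(server)
    return missing


print(undeclared_servers(config))  # prints []
```

An empty list means every step refers to a declared Server. If the `generation` entry were removed from `servers`, the function would return `["generation"]`.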