Paper: https://arxiv.org/abs/2502.17888

RankCoT consists of two stages: a "knowledge refinement model" first refines the retrieved documents, then an "answer generation model" produces answers from the refined results. Both stages are plain LLM inferences, so in UR-2.0 the same Generation Server code can be reused through server aliases: assign each stage a different alias and configure different parameters/models in the Pipeline (see the Server Alias Reuse Mechanism for details).
servers/prompt/src/prompt.py:
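Each stage needs its own prompt. As a minimal, framework-agnostic sketch (the function names and templates below are assumptions for illustration, not the actual contents of servers/prompt/src/prompt.py):

```python
# Hypothetical prompt builders for the two RankCoT stages; the real
# functions in servers/prompt/src/prompt.py may differ in name and template.
def rankcot_refine_prompt(question: str, documents: list[str]) -> str:
    """Build the knowledge-refinement prompt: ask the model to distill
    the retrieved documents into a chain-of-thought for the question."""
    docs = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(documents))
    return (
        "Refine the following retrieved documents into a concise "
        "chain-of-thought for answering the question.\n"
        f"Question: {question}\nDocuments:\n{docs}\nChain-of-thought:"
    )

def rankcot_answer_prompt(question: str, refined_knowledge: str) -> str:
    """Build the answer-generation prompt from the refined knowledge."""
    return (
        "Answer the question using the refined knowledge.\n"
        f"Knowledge: {refined_knowledge}\nQuestion: {question}\nAnswer:"
    )
```

The first prompt feeds the cot alias, the second feeds the gen alias, so each stage receives input shaped for its role.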
Define the Pipeline in the examples/ directory, such as RankCoT.yaml:
Declare servers/generation under the two aliases cot and gen, and call them separately in the Pipeline.
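As a hypothetical sketch of such a Pipeline (the key names and step syntax here are assumptions and may not match the exact UR-2.0 schema), the same Generation Server is mounted twice under different aliases:

```yaml
# RankCoT.yaml — hypothetical sketch; field names are assumptions.
servers:
  prompt: servers/prompt
  cot: servers/generation   # alias 1: knowledge refinement
  gen: servers/generation   # alias 2: answer generation

pipeline:
  - prompt.rankcot_refine_prompt
  - cot.generate            # refine retrieved documents into a chain-of-thought
  - prompt.rankcot_answer_prompt
  - gen.generate            # answer based on the refined knowledge
```

Because cot and gen are just aliases of the same server code, no new server implementation is needed for the second stage.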
Create examples/parameter/RankCoT_parameter.yaml and assign different models or inference parameters to the two aliases:
cot and gen each have independent parameter blocks that do not overwrite each other.
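For illustration, the parameter file might be sketched as follows (a hypothetical layout; the key names and model names are assumptions, not values from the original document):

```yaml
# RankCoT_parameter.yaml — hypothetical sketch; keys and models are assumptions.
cot:
  model_name: Qwen2.5-7B-Instruct    # example: a smaller model for refinement
  temperature: 0.0                   # deterministic knowledge refinement
gen:
  model_name: Qwen2.5-72B-Instruct   # example: a larger model for final answers
  temperature: 0.7
```

Since each alias owns its own block, changing cot's model or sampling parameters never affects gen's.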