Skip to main content
AgentCPM-Report is an open-source large language model agent jointly developed by THUNLP, Renmin University of China RUCBM, and ModelBest. The model is built on the MiniCPM4.1 base (8 billion parameters) and can accept user instructions to autonomously conduct in-depth research and generate long-form reports. Key Highlights:
  • Extreme Performance, Minimal Footprint: Through an average of 40 rounds of deep retrieval and nearly 100 rounds of chain-of-thought reasoning, it achieves comprehensive information mining and restructuring, enabling edge-side models to produce logically rigorous, deeply insightful long-form articles with tens of thousands of words. With just 8 billion parameters, it delivers performance on par with top-tier closed-source systems in deep research tasks.
  • Physical Isolation, Local Security: Specifically designed for high-privacy scenarios, it supports fully offline and agile local deployment, completely eliminating the risk of cloud data leaks. Leveraging our UltraRAG framework, it efficiently mounts and understands your local private knowledge base, securely transforming core confidential data into highly valuable professional decision-making reports without ever leaving its domain.
This document aims to introduce how to configure and use AgentCPM-Report in the UltraRAG framework.
To obtain the best generation results, we strongly recommend downloading and using the supporting proprietary model AgentCPM-Report.

1. Pipeline Structure Overview

The AgentCPM-Report Pipeline adopts a dynamic loop architecture based on a state machine. After initialization, the system enters a core loop where an intelligent router (Router) autonomously decides the next action based on the current task progress—flexibly switching between four branches: information gathering (Search), initial planning (Analyst-Init), content writing (Write), and plan extension (Analyst-Extend). Through continuous “retrieval-planning-writing” iterations, it completes all research tasks and outputs the final formatted report.
https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3bexamples/AgentCPM-Report.yaml
# AgentCPM-Report Demo for UltraRAG UI

# MCP Server
servers:
  benchmark: servers/benchmark
  generation: servers/generation
  retriever: servers/retriever
  prompt: servers/prompt
  router: servers/router
  custom: servers/custom

# MCP Client Pipeline
pipeline:
- benchmark.get_data:
    output:
      q_ls: instruction_ls
- retriever.retriever_init
- generation.generation_init
- custom.surveycpm_init_citation_registry
- custom.surveycpm_state_init
- loop:
    times: 140
    steps:
    - branch:
        router:
        - router.surveycpm_state_router
        branches:
          search:
          - prompt.surveycpm_search:
              output:
                prompt_ls: search_prompt_ls          
          - generation.generate:
              input:
                prompt_ls: search_prompt_ls
              output:
                ans_ls: search_response_ls
          - custom.surveycpm_parse_search_response:
              input:
                response_ls: search_response_ls       
          - retriever.retriever_batch_search:
              input:
                batch_query_list: keywords_ls
          - custom.surveycpm_process_passages_with_citation
          - custom.surveycpm_update_state
          analyst-init_plan:
          - prompt.surveycpm_init_plan:
              output:
                prompt_ls: init_plan_prompt_ls
          - generation.generate:
              input:
                prompt_ls: init_plan_prompt_ls
              output:
                ans_ls: init_plan_response_ls
          - custom.surveycpm_after_init_plan:
              input:
                response_ls: init_plan_response_ls
          - custom.surveycpm_update_state
          write:
          - prompt.surveycpm_write:
              output:
                prompt_ls: write_prompt_ls
          - generation.generate:
              input:
                prompt_ls: write_prompt_ls
              output:
                ans_ls: write_response_ls
          - custom.surveycpm_after_write:
              input:
                response_ls: write_response_ls
          - custom.surveycpm_update_state
          analyst-extend_plan:
          - prompt.surveycpm_extend_plan:
              output:
                prompt_ls: extend_prompt_ls         
          - generation.generate:
              input:
                prompt_ls: extend_prompt_ls
              output:
                ans_ls: extend_response_ls
          - custom.surveycpm_after_extend:
              input:
                response_ls: extend_response_ls
          - custom.surveycpm_update_state
          done: []
- custom.surveycpm_format_output:
    output:
      ans_ls: final_survey_ls

2. Compile Pipeline File

Execute the following command to compile this workflow:
ultrarag build examples/AgentCPM-Report.yaml

3. Configure Running Parameters

Modify examples/parameter/AgentCPM-Report_parameter.yaml.
Want to adjust research depth? Please adjust as needed in the custom configuration: increase surveycpm_max_step to extend research time, and increase surveycpm_max_extend_step to obtain more detailed extended content. If you have extremely high quality requirements, be sure to enable surveycpm_hard_mode (hard mode).
https://mintcdn.com/ultrarag/T7GffHzZitf6TThi/images/yaml.svg?fit=max&auto=format&n=T7GffHzZitf6TThi&q=85&s=69b41e79144bc908039c2ee3abbb1c3bexamples/parameter/AgentCPM-Report_parameter.yaml
benchmark:
  benchmark:
    key_map:
      gt_ls: golden_answers
      q_ls: question
    limit: -1
    name: nq
    path: data/sample_nq_10.jsonl
    seed: 42
    shuffle: false
custom:
  surveycpm_hard_mode: false
  surveycpm_hard_mode: true
  surveycpm_max_extend_step: 12
  surveycpm_max_step: 140
generation:
  backend: vllm
  backend: openai
  backend_configs:
    hf:
      batch_size: 8
      gpu_ids: 2,3
      model_name_or_path: openbmb/AgentCPM-Report
      trust_remote_code: true
    openai:
      api_key: abc
      base_delay: 1.0
      base_url: http://localhost:8000/v1
      base_url: http://localhost:65506/v1
      concurrency: 8
      model_name: MiniCPM4-8B
      model_name: AgentCPM-Report
      retries: 3
    vllm:
      dtype: auto
      gpu_ids: 2,3
      gpu_memory_utilization: 0.9
      model_name_or_path: openbmb/AgentCPM-Report
      trust_remote_code: true
  extra_params:
    chat_template_kwargs:
      enable_thinking: false
  sampling_params:
    max_tokens: 2048
    temperature: 0.7
    top_p: 0.8
  system_prompt: ''
  system_prompt: 'You are a professional UltraRAG Q&A assistant.'
prompt:
  surveycpm_extend_plan_template: prompt/surveycpm_extend_plan.jinja
  surveycpm_init_plan_template: prompt/surveycpm_init_plan.jinja
  surveycpm_search_template: prompt/surveycpm_search.jinja
  surveycpm_write_template: prompt/surveycpm_write.jinja
retriever: 
  backend: sentence_transformers
  backend: openai
  backend_configs:
    bm25:
      lang: en
      save_path: index/bm25
    infinity:
      bettertransformer: false
      model_warmup: false
      pooling_method: auto
      trust_remote_code: true
    openai:
      api_key: abc
      base_url: https://api.openai.com/v1
      base_url: http://localhost:65504/v1
      model_name: text-embedding-3-small
      model_name: qwen-embedding
    sentence_transformers:
      sentence_transformers_encode:
        encode_chunk_size: 256
        normalize_embeddings: false
        psg_prompt_name: document
        psg_task: null
        q_prompt_name: query
        q_task: null
      trust_remote_code: true
  batch_size: 16
  collection_name: wiki
  corpus_path: data/corpus_example.jsonl
  gpu_ids: '1'
  index_backend: faiss
  index_backend_configs:
    faiss:
      index_chunk_size: 10000
      index_path: index/index.index
      index_use_gpu: true
    milvus:
      id_field_name: id
      id_max_length: 64
      index_chunk_size: 1000
      index_params:
        index_type: AUTOINDEX
        metric_type: IP
      metric_type: IP
      search_params:
        metric_type: IP
        params: {}
      text_field_name: contents
      text_max_length: 60000
      token: null
      uri: index/milvus_demo.db
      vector_field_name: vector
  is_demo: false
  is_multimodal: false
  model_name_or_path: openbmb/MiniCPM-Embedding-Light
  query_instruction: ''
  query_instruction: 'Query: '
  top_k: 5
  top_k: 20

4. Effect Demonstration

After configuration is complete, start the AgentCPM-Report Pipeline in UltraRAG UI.
Since the generation of a 10,000-word review involves a large amount of concurrent retrieval and multi-turn reasoning, it usually takes more than 10 minutes. You can use the UI’s background running function and come back to check the final report after the task is completed.