model_name
: The name or path of the generation model used.

base_url
: The HTTP interface address of the vLLM model service.

port
: The port that the local vLLM service listens on.

gpu_ids
: Specifies the GPU devices to use.

api_key
: The API key required to call the model service.

sampling_params
: Generation parameters supported by vLLM, such as temperature, top_p, and max_length.

initialize_local_vllm
: Starts a vLLM model service locally, waits for it to become ready, and finally returns the base_url of the service.

generate
: Receives prompt input provided by the Prompt Server, calls an LLM interface that supports the OpenAI API protocol for generation, and finally returns a list of response strings.
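The configuration fields above could be grouped into a single serving object. The sketch below is an illustration only, assuming nothing beyond the field list itself: the class name LocalVLLMServing and every default value are assumptions, not the project's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class LocalVLLMServing:
    """Illustrative container for the configuration fields described above.

    Class name and defaults are assumptions for the sketch, not the real API.
    """
    model_name: str                               # name or path of the generation model
    base_url: str = "http://localhost:8000/v1"    # HTTP address of the vLLM service
    port: int = 8000                              # port the local vLLM service listens on
    gpu_ids: list = field(default_factory=lambda: [0])  # GPU devices to use
    api_key: str = "EMPTY"                        # API key for the model service
    sampling_params: dict = field(default_factory=lambda: {
        "temperature": 0.7,                       # example vLLM-supported settings
        "top_p": 0.9,
        "max_tokens": 512,
    })
```

Grouping the fields this way keeps the service address, credentials, and sampling settings in one place, so both initialize_local_vllm and generate can read from the same object.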
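A launch-and-wait routine like initialize_local_vllm can be sketched with the standard library: start the server process, then poll it until it answers. This is a hypothetical sketch, assuming the `vllm serve` CLI is installed and that the server exposes a /health readiness endpoint; the function and parameter names are illustrative.

```python
import os
import subprocess
import time
import urllib.error
import urllib.request

def service_ready(url: str) -> bool:
    """Return True if the health endpoint answers, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=2):
            return True
    except (urllib.error.URLError, OSError):
        return False

def initialize_local_vllm(model_name: str, port: int = 8000,
                          gpu_ids: str = "0", timeout: float = 600.0) -> str:
    """Launch a local vLLM service and block until it is ready.

    Illustrative sketch: starts `vllm serve` on the given port, restricted to
    the requested GPUs, then polls /health; returns the service base_url.
    """
    env = {**os.environ, "CUDA_VISIBLE_DEVICES": gpu_ids}
    proc = subprocess.Popen(
        ["vllm", "serve", model_name, "--port", str(port)], env=env)
    deadline = time.time() + timeout
    while time.time() < deadline:
        if service_ready(f"http://localhost:{port}/health"):
            return f"http://localhost:{port}/v1"   # ready: hand back base_url
        time.sleep(2)                              # not up yet; retry
    proc.terminate()
    raise TimeoutError("vLLM service did not become ready in time")
```

Polling a health endpoint rather than sleeping a fixed interval lets callers proceed as soon as model loading finishes, which can take anywhere from seconds to minutes depending on model size.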
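Because generate targets the OpenAI API protocol, it can be sketched against any OpenAI-compatible endpoint using only the standard library. The function names and the chat-completions payload shape below are illustrative assumptions, not the project's actual implementation.

```python
import json
import urllib.request

def build_payload(prompt: str, model_name: str, sampling_params: dict) -> dict:
    """Assemble an OpenAI-protocol chat-completion request body."""
    return {
        "model": model_name,
        "messages": [{"role": "user", "content": prompt}],
        **sampling_params,          # e.g. temperature, top_p, max_tokens
    }

def generate(prompts: list, base_url: str, api_key: str,
             model_name: str, sampling_params: dict) -> list:
    """Send each prompt to the endpoint and return a list of response strings."""
    responses = []
    for prompt in prompts:
        req = urllib.request.Request(
            f"{base_url}/chat/completions",
            data=json.dumps(build_payload(prompt, model_name,
                                          sampling_params)).encode(),
            headers={"Authorization": f"Bearer {api_key}",
                     "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        # The OpenAI protocol returns candidates under "choices".
        responses.append(body["choices"][0]["message"]["content"])
    return responses
```

Because only the protocol is assumed, the same code works whether base_url points at the locally started vLLM service or at a remote OpenAI-compatible provider.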