VLLMShardArgs

Arguments for sharding a model using vLLM.

Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| tensor_parallel_size | int = 1 | Number of tensor parallel workers. |
| dtype | string = "auto" | Data type for model weights. |
| trust_remote_code | boolean = true | Whether to trust remote code from HuggingFace. |
| max_model_len | `int \| null` = null | |
| file_pattern | `string \| null` = DEFAULT_SHARD_PATTERN | |
| max_file_size | int = 5368709120 | Maximum size for each sharded file. |
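
To make the documented defaults concrete, here is a minimal stand-in written as a plain Python dataclass. It is a sketch, not the library's own class: the name `ShardArgsSketch` is hypothetical, `file_pattern` is left out because the value of `DEFAULT_SHARD_PATTERN` is not given on this page, and reading `max_file_size` as a byte count is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShardArgsSketch:
    """Hypothetical stand-in mirroring the documented attributes and defaults."""

    tensor_parallel_size: int = 1
    dtype: str = "auto"
    trust_remote_code: bool = True
    max_model_len: Optional[int] = None
    # Assumed to be bytes: 5368709120 == 5 * 1024**3, i.e. 5 GiB.
    max_file_size: int = 5 * 1024**3


# All defaults, as listed in the attribute table.
defaults = ShardArgsSketch()

# Override only the fields that differ from the defaults.
two_way = ShardArgsSketch(tensor_parallel_size=2, dtype="bfloat16")
```

If the byte interpretation holds, the default `max_file_size` corresponds to shard files of at most 5 GiB each.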

Methods


get_vllm_args()

@classmethod
def get_vllm_args(
    model_path: string
) -> dict[str, Any]

Get arguments dict for vLLM LLM constructor.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| model_path | string | The filesystem path or HuggingFace repository ID of the model to be loaded. |

Returns

| Type | Description |
| --- | --- |
| dict[str, Any] | A dictionary containing the model path, tensor parallelism settings, data type, and remote code trust settings required to initialize a vLLM LLM instance. |
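
The shape of the returned dictionary can be sketched as follows. This is an illustration, not the library's implementation: `build_vllm_args` is a hypothetical helper, and the key names are assumed to match the `model`, `tensor_parallel_size`, `dtype`, and `trust_remote_code` keyword arguments of vLLM's `LLM` constructor. The repository ID is a placeholder.

```python
from typing import Any


def build_vllm_args(
    model_path: str,
    tensor_parallel_size: int = 1,
    dtype: str = "auto",
    trust_remote_code: bool = True,
) -> dict[str, Any]:
    """Hypothetical helper: collect the four settings described under Returns."""
    return {
        "model": model_path,  # assumed key name for the model path
        "tensor_parallel_size": tensor_parallel_size,
        "dtype": dtype,
        "trust_remote_code": trust_remote_code,
    }


vllm_args = build_vllm_args("org/model-name")  # placeholder repository ID

# With vLLM installed, a dict of this shape could be unpacked into the constructor:
#   from vllm import LLM
#   llm = LLM(**vllm_args)
```

Returning a plain dict keeps the settings easy to inspect or log before any model weights are loaded.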