VLLMShardArgs

Arguments for sharding a model using vLLM.

Attributes

Attribute	Type	Description
tensor_parallel_size	`int` = 1	Number of tensor parallel workers.
dtype	`string` = "auto"	Data type for model weights.
trust_remote_code	`boolean` = true	Whether to trust remote code from HuggingFace.
max_model_len	`int | null` = null
file_pattern	`string | null` = DEFAULT_SHARD_PATTERN
max_file_size	`int` = 5368709120	Maximum size of each sharded file, in bytes (the default is 5 GiB).

@classmethod
def get_vllm_args(
    model_path: string
) -> dict[str, Any]

Get the arguments dict for the vLLM `LLM` constructor.

Parameters

Name	Type	Description
model_path	`string`	The filesystem path or HuggingFace repository ID of the model to be loaded.

Returns

Type	Description
`dict[str, Any]`	A dictionary containing the model path, tensor parallelism settings, data type, and remote code trust settings required to initialize a vLLM LLM instance.
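
Example

The following is a minimal sketch of the kind of dictionary described above, not the library's implementation. `ShardArgsSketch` and `to_vllm_args` are hypothetical stand-ins for `VLLMShardArgs` and `get_vllm_args`, and the key names are assumed to mirror the keyword arguments of vLLM's `LLM` constructor (`model`, `tensor_parallel_size`, `dtype`, `trust_remote_code`). The model ID is a placeholder.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ShardArgsSketch:
    """Hypothetical stand-in for VLLMShardArgs (illustration only)."""

    tensor_parallel_size: int = 1
    dtype: str = "auto"
    trust_remote_code: bool = True

    def to_vllm_args(self, model_path: str) -> dict[str, Any]:
        # Assumed key names: they follow vLLM's LLM(...) keyword arguments.
        return {
            "model": model_path,
            "tensor_parallel_size": self.tensor_parallel_size,
            "dtype": self.dtype,
            "trust_remote_code": self.trust_remote_code,
        }


# "org/model-name" is a placeholder, not a real repository ID.
vllm_args = ShardArgsSketch(tensor_parallel_size=2).to_vllm_args("org/model-name")

# With vLLM installed, the dict could then be unpacked into the constructor:
#   from vllm import LLM
#   llm = LLM(**vllm_args)
```

Returning a plain dict keeps the sharding arguments decoupled from vLLM itself, so the caller decides when the (potentially expensive) `LLM` instance is created.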