VLLMShardArgs
Arguments for sharding a model using vLLM.
Attributes
| Attribute | Type | Description |
|---|---|---|
| tensor_parallel_size | int = 1 | Number of tensor parallel workers. |
| dtype | string = "auto" | Data type for model weights. |
| trust_remote_code | boolean = true | Whether to trust remote code from HuggingFace. |
| max_model_len | `int \| null` = null | Maximum model context length. |
| file_pattern | `string \| null` = DEFAULT_SHARD_PATTERN | File name pattern for the sharded files. |
| max_file_size | int = 5368709120 | Maximum size, in bytes, of each sharded file (the default is 5 GiB). |
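To make the defaults above concrete, here is a minimal sketch of a stand-in class that mirrors the documented attributes. `ShardArgsSketch` and `GIB` are hypothetical names, not part of the library. The value of `DEFAULT_SHARD_PATTERN` is not shown in this reference, so the sketch uses `None` as a placeholder. Reading `max_file_size` as a byte count is an assumption based on the default equalling 5 × 1024³.

```python
from dataclasses import dataclass
from typing import Optional

GIB = 1024 ** 3  # bytes in one gibibyte (assumes max_file_size is in bytes)


@dataclass
class ShardArgsSketch:
    """Hypothetical stand-in mirroring the documented attributes and defaults."""

    tensor_parallel_size: int = 1
    dtype: str = "auto"
    trust_remote_code: bool = True
    max_model_len: Optional[int] = None
    # The real default is DEFAULT_SHARD_PATTERN; its value is not given in
    # this reference, so None is used here purely as a placeholder.
    file_pattern: Optional[str] = None
    max_file_size: int = 5 * GIB  # equals the documented default 5368709120


# Override only the field that differs from the defaults.
args = ShardArgsSketch(tensor_parallel_size=2)
```

Only the overridden field changes; every other attribute keeps the default listed in the table.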
Methods
get_vllm_args()
@classmethod
def get_vllm_args(
model_path: string
) -> dict[str, Any]
Returns the keyword-argument dict for the vLLM LLM constructor.
Parameters
| Name | Type | Description |
|---|---|---|
| model_path | string | The filesystem path or HuggingFace repository ID of the model to be loaded. |
Returns
| Type | Description |
|---|---|
| dict[str, Any] | A dictionary containing the model path, tensor parallelism settings, data type, and remote code trust settings required to initialize a vLLM LLM instance. |
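As a rough illustration of the return value described above, the following sketch assembles a dictionary of that shape. `build_vllm_args_sketch` is a hypothetical helper, not the library's implementation. The key names (`model`, `tensor_parallel_size`, `dtype`, `trust_remote_code`) are assumed to match the keyword arguments of vLLM's `LLM` constructor; the actual method may use different keys or include more of them.

```python
from typing import Any


def build_vllm_args_sketch(
    model_path: str,
    tensor_parallel_size: int = 1,
    dtype: str = "auto",
    trust_remote_code: bool = True,
) -> dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for vLLM's LLM constructor."""
    return {
        "model": model_path,  # filesystem path or HuggingFace repository ID
        "tensor_parallel_size": tensor_parallel_size,
        "dtype": dtype,
        "trust_remote_code": trust_remote_code,
    }


vllm_args = build_vllm_args_sketch("/path/to/model", tensor_parallel_size=2)

# With vLLM installed, the dict could then be unpacked into the constructor:
#   from vllm import LLM
#   llm = LLM(**vllm_args)
```

Returning a plain dict keeps the sharding arguments decoupled from vLLM itself, so the caller decides when the model is actually loaded.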