triton_wrapper
- class coffea.ml_tools.triton_wrapper(model_url: str, client_args: Dict | None = None, batch_size=-1)[source]
Bases: nonserializable_attribute, numpy_call_wrapper
Wrapper for running triton inference.
The goal of this class is that all triton-specific operations are wrapped and abstracted away from the users. The users should then only need to handle awkward-level operations to mangle the arrays into the input format expected by the model of interest.
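As a sketch of how the wrapper is addressed, the model_url is assumed here to take the form "triton+<protocol>://<address>/<model>/<version>" (the host, model name, and version below are hypothetical, not taken from this page):

```python
from urllib.parse import urlparse

# Hypothetical server and model; no connection is made here.
model_url = "triton+grpc://localhost:8001/mymodel/1"

parsed = urlparse(model_url)
protocol = parsed.scheme.split("+")[-1]  # "grpc": selects the protocol module
address = parsed.netloc                  # "localhost:8001"
model, version = parsed.path.strip("/").split("/")

print(protocol, address, model, version)
```

The protocol portion of the scheme is what the pmod attribute resolves into a tritonclient protocol module.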
Attributes Summary
- batch_size: Getting the batch size to be used for array splitting.
- client_args: Function for adding default arguments to the client constructor kwargs.
- pmod: Getting the protocol module based on the URL protocol string.
Methods Summary
- numpy_call(output_list, input_dict): Run inference, translating output_list into the required tritonclient.InferRequestedOutput objects and input_dict into a list of tritonclient.InferInput objects.
- validate_numpy_input(output_list, input_dict): Validate the arguments against the model configuration; tritonclient can return the expected input array dimensions and available output values.
Attributes Documentation
- batch_size
Getting the batch size to be used for array splitting. If it is explicitly set by the user, use that; otherwise, extract it from the model configuration hosted on the server.
- batch_size_fallback = 10
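The effect of a batch size on array splitting can be illustrated with plain numpy (a sketch of the idea only; the actual splitting inside the wrapper is an internal detail):

```python
import numpy as np

batch_size = 10  # e.g. the batch_size_fallback default
x = np.arange(25).reshape(25, 1)

# Split along the first axis into chunks of at most batch_size rows,
# mirroring how inputs would be sent to the server batch by batch.
batches = [x[i : i + batch_size] for i in range(0, len(x), batch_size)]
print([b.shape[0] for b in batches])
```

Concatenating the per-batch results along the first axis recovers an output aligned with the original input.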
- client_args
Function for adding default arguments to the client constructor kwargs.
- http_client_concurrency = 12
- pmod
Getting the protocol module based on the URL protocol string.
Methods Documentation
- numpy_call(output_list: List[str], input_dict: Dict[str, array]) → Dict[str, array] [source]
- Parameters:
output_list (List[str]) – A list of names of the model outputs of interest. These strings will be automatically translated into the required tritonclient.InferRequestedOutput objects.
input_dict (Dict[str, array]) – A dictionary with the model input name as the key and the appropriate numpy array as the dictionary value. This dictionary is automatically translated into a list of tritonclient.InferInput objects.
- Returns:
The dictionary of numpy arrays that have the output_list arguments as keys.
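The dict-in/dict-out contract of numpy_call can be sketched with a stand-in function (the input/output names and the doubling "model" here are hypothetical; no Triton server is contacted):

```python
import numpy as np

def fake_numpy_call(output_list, input_dict):
    # Stand-in for triton_wrapper.numpy_call: consume named numpy inputs,
    # produce named numpy outputs, and return only the requested ones.
    x = input_dict["input__0"]
    results = {"output__0": x * 2.0, "output__1": x.sum(axis=-1)}
    return {name: results[name] for name in output_list}

out = fake_numpy_call(["output__0"], {"input__0": np.ones((4, 3))})
print(sorted(out.keys()))
```

Only the names passed in output_list appear as keys of the returned dictionary, matching the Returns description above.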