Re: API for inference (OpenVINO)

TonyD1

In the OpenShift AI lab, I tried to deploy my own model using a saved ONNX model, with OpenVINO for model serving. I uploaded the .onnx file to an S3 bucket.

The deployment succeeded and the model shows a Ready status. When I call the REST API with a POST request, I get this error:

"The model with requested version is not found"

Name: mymodel, namespace: personalmodels

The model takes 4 numerical inputs and returns one numerical output.

As the host, I tried both options: https://mymodel-predictor-personalmodels.apps.ocp4.example.com/v2/models/mymodel/infer and https://mymodel-personalmodels.apps.ocp4.example.com/v2/models/mymodel/infer

Then I passed the input data:

curl -sk -X POST "<host>" \
  -H 'accept: application/json' -H 'Content-Type: application/json' \
  -d '{
    "inputs": [
      {
        "data": [0.2,4,2,5],
        "datatype": "FP32", "name": "inputs", "shape": [1,4]
      }
    ],
    "outputs": [{"name": "output0"}]
  }'

Which API calls should work for OpenVINO/ONNX? Why does it complain about the version?

jramcast

That "model with requested version is not found" error sounds like OpenVINO cannot find the model version in single-model serving.
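One way to check what the server has actually loaded is to query the model's readiness and metadata endpoints. This is a minimal sketch, assuming the route exposes the KServe v2 REST API (which the /v2/models/.../infer path in your request suggests). It reuses one of the two hosts and the model name from your post, and the helper function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: build the KServe v2 readiness and metadata URLs for a model.
# HOST and MODEL come from the post above; swap in your own values.
HOST="https://mymodel-personalmodels.apps.ocp4.example.com"
MODEL="mymodel"

# Illustrative helpers (not part of any tool): print the URL for each check.
ready_url()    { printf '%s/v2/models/%s/ready\n' "$1" "$2"; }
metadata_url() { printf '%s/v2/models/%s\n' "$1" "$2"; }

# Call this by hand against your cluster; nothing here runs it automatically.
check_model() {
  # Prints only the HTTP status code of the readiness check.
  curl -sk -o /dev/null -w '%{http_code}\n' "$(ready_url "$HOST" "$MODEL")"
  # Prints the model metadata (name, versions, inputs, outputs) as JSON.
  curl -sk "$(metadata_url "$HOST" "$MODEL")"
}
```

If the metadata call fails too, or lists no versions, the server most likely has not found a version directory under the configured path. When it does succeed, the response also shows the input and output names the model expects, which is worth comparing against the names in the infer request.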

Are you using single or multi-model serving? Single-model serving requires your model to be stored in a particular directory structure in S3. This directory structure must contain the version:

models/mymodel/1/model_files.onnx

Then, when you create the RHOAI single-model server, you have to specify the "models/mymodel/" directory as the path (without the version).
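As a rough sketch, the layout could be staged locally before uploading. The stage_model helper, the local file name, and the staging directory are all made up for illustration. The version directory is named 1 on the assumption that OVMS expects version directories to be positive integers, as its model repository docs describe.

```shell
#!/usr/bin/env bash
# Sketch: stage an ONNX file in the <root>/models/<name>/<version>/ layout.
# stage_model is an illustrative helper, not part of any tool.
stage_model() {
  local src="$1" root="$2" name="$3" version="$4"
  local dest="${root}/models/${name}/${version}"
  # Create the version directory, copy the model in, and print the final path.
  mkdir -p "$dest" && cp "$src" "$dest/" && printf '%s\n' "$dest/$(basename "$src")"
}

# Example (assumes ./model.onnx exists):
#   stage_model ./model.onnx ./staging mymodel 1
#   -> ./staging/models/mymodel/1/model.onnx
# Then upload ./staging/models/ to the bucket with your usual S3 tool and
# set the model path to models/mymodel/ when creating the model server.
```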

For more details, see: https://docs.openvino.ai/2024/openvino-workflow/model-server/ovms_docs_models_repository.html