Skip to content

在 S3 上使用保存的模型部署 InferenceService

有兩種方法可以為 AWS S3 存儲配置憑證:

  • AWS IAM Role for Service Account (建議)
  • AWS IAM User Credentials

S3 憑證的全局配置選項可以在 inferenceservice 配置裡的 configmap 中找到,如果在秘密或服務帳戶上找不到相關註釋,將用作備份。

創建具有 IAM 角色的服務帳戶

根據 AWS 文檔創建 IAM 角色並進行配置。 KServe 將讀取服務帳戶上的註釋,以便在存儲初始化器容器中註入適當的環境變量。

創建 Service Account

apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/s3access # replace with your IAM role ARN
    serving.kserve.io/s3-endpoint: s3.amazonaws.com # replace with your s3 endpoint e.g minio-service.kubeflow:9000
    serving.kserve.io/s3-usehttps: "1" # by default 1, if testing with minio you can set to 0
    serving.kserve.io/s3-region: "us-east-2"
    serving.kserve.io/s3-useanoncredential: "false" # omitting this is the same as false, if true will ignore provided credential and use anonymous credentials

執行下列命令來套用

kubectl apply -f create-s3-sa.yaml

創建 Secret 並附加到服務帳戶

使用您的 S3 用戶憑據創建一個秘密,KServe 讀取秘密註釋以將 S3 環境變量注入存儲初始化程序或模型代理以從 S3 存儲下載模型。

創建 Secret

apiVersion: v1
kind: Secret
metadata:
  name: s3creds
  annotations:
     serving.kserve.io/s3-endpoint: s3.amazonaws.com # replace with your s3 endpoint e.g minio-service.kubeflow:9000
     serving.kserve.io/s3-usehttps: "1" # by default 1, if testing with minio you can set to 0
     serving.kserve.io/s3-region: "us-east-2"
     serving.kserve.io/s3-useanoncredential: "false" # omitting this is the same as false, if true will ignore provided credential and use anonymous credentials
type: Opaque
stringData: # use `stringData` for raw credential string or `data` for base64 encoded string
  AWS_ACCESS_KEY_ID: XXXX
  AWS_SECRET_ACCESS_KEY: XXXXXXXX

將 Secret 附加到服務帳戶

apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa
secrets:
- name: s3creds

執行下列命令來套用

kubectl apply -f create-s3-secret.yaml

Info

如果您在啟用 istio sidecars 的情況下運行 kserve,則 istio 代理準備就緒和代理拉取模型之間可能存在競爭條件。當代理嘗試從 s3 下載時,這將導致 tcp 撥號連接被拒絕錯誤。

為了解決這個問題,istio 允許阻塞 pod 中的其他容器,直到代理容器準備就緒。

您可以通過在 istio-sidecar-injector configmap 中設置 proxy.holdApplicationUntilProxyStarts: true 來啟用此功能,proxy.holdApplicationUntilProxyStarts 標誌在 Istio 1.7 中作為實驗性功能引入,默認情況下處於關閉狀態。

使用 InferenceService 在 S3 上部署模型

使用 s3 storageUri 和附加了 s3 secret 的 service acccount 來創建 InferenceService

mnist-s3.yaml
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "mnist-s3"
spec:
  predictor:
    serviceAccountName: sa
    model:
      modelFormat:
        name: tensorflow
      storageUri: "s3://kserve-examples/mnist"

執行下列命令來套用

kubectl apply -f mnist-s3.yaml

運行模型推論

現在,可以通過 ${INGRESS_HOST}:${INGRESS_PORT} 訪問入口,或按照此說明查找入口 IP 和端口。

SERVICE_HOSTNAME=$(kubectl get inferenceservice mnist-s3 -o jsonpath='{.status.url}' | cut -d "/" -f 3)

MODEL_NAME=mnist-s3
INPUT_PATH=@./input.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH

結果:

Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 35.237.217.209...
* TCP_NODELAY set
* Connected to mnist-s3.default.35.237.217.209.xip.io (35.237.217.209) port 80 (#0)
> POST /v1/models/mnist-s3:predict HTTP/1.1
> Host: mnist-s3.default.35.237.217.209.xip.io
> User-Agent: curl/7.55.1
> Accept: */*
> Content-Length: 2052
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 200 OK
< content-length: 251
< content-type: application/json
< date: Sun, 04 Apr 2021 20:06:27 GMT
< x-envoy-upstream-service-time: 5
< server: istio-envoy
<
* Connection #0 to host mnist-s3.default.35.237.217.209.xip.io left intact
{
    "predictions": [
        {
            "predictions": [0.327352405, 2.00153053e-07, 0.0113353515, 0.203903764, 3.62863029e-05, 0.416683704, 0.000281196437, 8.36911859e-05, 0.0403052084, 1.82206513e-05],
            "classes": 5
        }
    ]
}