# Deploy InferenceService with a saved model on S3
There are two supported methods for configuring credentials for AWS S3 storage:
- AWS IAM Role for Service Account (recommended)
- AWS IAM User Credentials
Global configuration options for S3 credentials can be found in the `inferenceservice` ConfigMap; they are used as a fallback when the relevant annotations are not found on the Secret or Service Account.
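For illustration only, here is a sketch of what the `credentials` entry in that ConfigMap can look like. The exact key names and namespace vary by KServe release, so treat the fields below as assumptions and check the ConfigMap shipped with your installation:

```yaml
# Illustrative excerpt (assumed structure; verify against your KServe version).
# The s3 fields act as global fallbacks for the corresponding
# serving.kserve.io/s3-* annotations shown later in this guide.
apiVersion: v1
kind: ConfigMap
metadata:
  name: inferenceservice-config
  namespace: kserve
data:
  credentials: |-
    {
      "s3": {
        "s3AccessKeyIDName": "AWS_ACCESS_KEY_ID",
        "s3SecretAccessKeyName": "AWS_SECRET_ACCESS_KEY",
        "s3Endpoint": "",
        "s3UseHttps": "",
        "s3Region": ""
      }
    }
```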
## Create Service Account with IAM Role
Create an IAM role and configure it by following the AWS documentation. KServe reads the annotations on the service account to inject the appropriate environment variables into the storage initializer container.
### Create the Service Account
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/s3access # replace with your IAM role ARN
    serving.kserve.io/s3-endpoint: s3.amazonaws.com # replace with your s3 endpoint e.g minio-service.kubeflow:9000
    serving.kserve.io/s3-usehttps: "1" # by default 1, if testing with minio you can set to 0
    serving.kserve.io/s3-region: "us-east-2"
    serving.kserve.io/s3-useanoncredential: "false" # omitting this is the same as false, if true will ignore the provided credential and use anonymous credentials
```
Apply it with the following command:
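The file name `s3-sa.yaml` below is illustrative; use whatever name you saved the ServiceAccount manifest under:

```bash
kubectl apply -f s3-sa.yaml
```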
## Create a Secret and attach it to the Service Account
Create a secret with your S3 user credentials. KServe reads the secret annotations to inject the S3 environment variables into the storage initializer or model agent so it can download models from S3 storage.
Create the Secret:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: s3creds
  annotations:
    serving.kserve.io/s3-endpoint: s3.amazonaws.com # replace with your s3 endpoint e.g minio-service.kubeflow:9000
    serving.kserve.io/s3-usehttps: "1" # by default 1, if testing with minio you can set to 0
    serving.kserve.io/s3-region: "us-east-2"
    serving.kserve.io/s3-useanoncredential: "false" # omitting this is the same as false, if true will ignore the provided credential and use anonymous credentials
type: Opaque
stringData: # use `stringData` for raw credential string or `data` for base64 encoded string
  AWS_ACCESS_KEY_ID: XXXX
  AWS_SECRET_ACCESS_KEY: XXXXXXXX
```
Attach the secret to the service account:
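A minimal sketch of the attachment, referencing the secret from the service account's `secrets` list (the names `sa` and `s3creds` come from the manifests above):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa
secrets:
- name: s3creds
```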
Apply the Secret and the updated ServiceAccount with the following command:
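Again a sketch; `s3-secret.yaml` is an illustrative file name for the manifests above:

```bash
kubectl apply -f s3-secret.yaml
```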
**Info**

If you are running KServe with Istio sidecars enabled, there can be a race condition between the Istio proxy becoming ready and the agent pulling the model. This results in a `tcp dial connection refused` error when the agent tries to download from S3. To resolve it, Istio allows blocking the other containers in a pod until the proxy container is ready. You can enable this by setting `proxy.holdApplicationUntilProxyStarts: true` in the `istio-sidecar-injector` configmap. The `proxy.holdApplicationUntilProxyStarts` flag was introduced in Istio 1.7 as an experimental feature and is turned off by default.
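As an illustrative alternative (an assumption on our part, not taken from this guide): in recent Istio versions the same behavior can also be enabled mesh-wide through `meshConfig.defaultConfig`, for example with an IstioOperator overlay; verify the exact mechanism against your Istio version's documentation:

```yaml
# Sketch: hold application containers until the sidecar proxy is ready
# (assumes Istio >= 1.7; check your Istio version's documentation)
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      holdApplicationUntilProxyStarts: true
```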
## Deploy the model on S3 with InferenceService
Create the `InferenceService` with the s3 `storageUri` and the service account that has the s3 secret attached.
mnist-s3.yaml
```yaml
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "mnist-s3"
spec:
  predictor:
    serviceAccountName: sa
    model:
      modelFormat:
        name: tensorflow
      storageUri: "s3://kserve-examples/mnist"
```
Apply it with the following command:
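Assuming the manifest above is saved as `mnist-s3.yaml`:

```bash
kubectl apply -f mnist-s3.yaml
```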
## Run Model Inference
The ingress can now be accessed at `${INGRESS_HOST}:${INGRESS_PORT}`, or follow these instructions to find out the ingress IP and port.
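For illustration, one common way to determine them when using the default Istio ingress gateway (the service and namespace names below are assumptions; adapt them to your cluster):

```bash
# Assumes an istio-ingressgateway Service of type LoadBalancer in istio-system
export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')
```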
```bash
SERVICE_HOSTNAME=$(kubectl get inferenceservice mnist-s3 -o jsonpath='{.status.url}' | cut -d "/" -f 3)
MODEL_NAME=mnist-s3
INPUT_PATH=@./input.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```
Expected output:
```
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 35.237.217.209...
* TCP_NODELAY set
* Connected to mnist-s3.default.35.237.217.209.xip.io (35.237.217.209) port 80 (#0)
> POST /v1/models/mnist-s3:predict HTTP/1.1
> Host: mnist-s3.default.35.237.217.209.xip.io
> User-Agent: curl/7.55.1
> Accept: */*
> Content-Length: 2052
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 200 OK
< content-length: 251
< content-type: application/json
< date: Sun, 04 Apr 2021 20:06:27 GMT
< x-envoy-upstream-service-time: 5
< server: istio-envoy
<
* Connection #0 to host mnist-s3.default.35.237.217.209.xip.io left intact
{
    "predictions": [
        {
            "predictions": [0.327352405, 2.00153053e-07, 0.0113353515, 0.203903764, 3.62863029e-05, 0.416683704, 0.000281196437, 8.36911859e-05, 0.0403052084, 1.82206513e-05],
            "classes": 5
        }
    ]
}
```