Edge and Multicloud Deployed Model Inference

How to perform inferences on deployed models in edge and multicloud environments.

Edge Deployment Endpoints

The following endpoints are available for API calls to the edge deployed pipeline.

List Pipelines

The endpoint GET /pipelines returns:

  • id (String): The name of the pipeline.
  • status (String): The pipeline status: Running, or Error if there are any issues.

List Pipelines Example

curl localhost:8080/pipelines

List Models

The endpoint GET /models returns a list of models with the following fields:

  • name (String): The model name.
  • sha (String): The sha hash value of the ML model.
  • status (String): The model status: Running, or Error if there are any issues.
  • version (String): The model version. This matches the version designation used by Wallaroo to track model versions in UUID format.

List Models Example

curl localhost:8080/models
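
The two status endpoints above can be polled together as a simple health check. The following is a minimal Python sketch rather than an official client. It assumes the edge deployment listens on localhost:8080 as in the curl examples, and that each response is JSON whose entries carry the documented fields. The wrapper keys "pipelines" and "models" and all helper names are assumptions made for illustration.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:8080"  # assumed host and port, taken from the curl examples


def extract_items(payload, key):
    """Return the list of entries from a decoded response body.

    Assumption: the body is either a bare list or a dict that wraps the
    list under `key` (for example {"pipelines": [...]}).
    """
    if isinstance(payload, dict):
        return payload.get(key, [])
    return payload


def not_running(items, name_field):
    """Return the names of entries whose status is anything other than Running."""
    return [item[name_field] for item in items if item.get("status") != "Running"]


def check_edge(base_url=BASE_URL):
    """Query GET /pipelines and GET /models and collect entries that are not Running."""
    problems = {}
    for path, key, name_field in (("/pipelines", "pipelines", "id"),
                                  ("/models", "models", "name")):
        with urlopen(base_url + path) as response:
            payload = json.load(response)
        problems[key] = not_running(extract_items(payload, key), name_field)
    return problems
```

Under these assumptions, calling check_edge() against a healthy deployment returns empty lists for both keys.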

Edge Inference Endpoints

The inference endpoint takes the following patterns:

  • POST /infer: The static inference endpoint. If a model deployment is updated or a new pipeline publish replaces a previous one, the /infer endpoint always points to the current deployed pipeline. For more information, see Run Anywhere: In-Line Model Updates on Edge Devices.
  • POST /pipelines/{pipeline-name}: The pipeline-name is the id value returned by the /pipelines endpoint. This endpoint changes based on the pipeline publish deployed.

Organizations are encouraged to use the /infer endpoint for consistency.

Wallaroo inference endpoint URLs accept the following data inputs through the Content-Type header:

  • Content-Type: application/vnd.apache.arrow.file: For Apache Arrow tables.
  • Content-Type: application/json; format=pandas-records: For pandas DataFrame in record format.

Once deployed, we can perform an inference through the deployment URL.

The endpoint returns Content-Type: application/json; format=pandas-records by default with the following fields:

  • check_failures (List[Integer]): Whether any validation checks were triggered. For more information, see Wallaroo SDK Essentials Guide: Pipeline Management: Anomaly Testing.
  • elapsed (List[Integer]): A list of time in nanoseconds for:
    • [0] The time to serialize the input.
    • [1…n] How long each step took.
  • model_name (String): The name of the model used.
  • model_version (String): The version of the model in UUID format.
  • original_data: The original input data. Returns null if the input is too long to include in the response.
  • outputs (List): The outputs of the inference result separated by data type, where each data type includes:
    • data: The returned values.
    • dim (List[Integer]): The dimension shape returned.
    • v (Integer): The vector shape of the data.
  • pipeline_name (String): The name of the pipeline.
  • shadow_data: Any shadow deployed data inferences in the same format as outputs.
  • time (Integer): The time since UNIX epoch.
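
To make the layout of the elapsed field concrete, the sketch below splits one result record into the input serialization time and the per-step times, converting nanoseconds to milliseconds. It is illustrative only: the function name is hypothetical, and the sample record is a hand-built dict that borrows its elapsed values from the log sample later in this guide.

```python
NS_PER_MS = 1_000_000  # nanoseconds per millisecond


def summarize_elapsed(record):
    """Split the elapsed list of one result record into named timings in milliseconds.

    elapsed[0] is the time to serialize the input; elapsed[1:] holds one
    entry per pipeline step.
    """
    elapsed_ns = record["elapsed"]
    return {
        "serialize_ms": elapsed_ns[0] / NS_PER_MS,
        "step_ms": [ns / NS_PER_MS for ns in elapsed_ns[1:]],
        "total_ms": sum(elapsed_ns) / NS_PER_MS,
    }


# Hand-built record for illustration; a real response carries more fields.
sample_record = {
    "model_name": "rf-house-price-estimator",
    "elapsed": [15654000, 17385666],
}
summary = summarize_elapsed(sample_record)
```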

Edge Inference Endpoint Example

The following example demonstrates sending an Apache Arrow table to the Edge deployed pipeline, requesting the inference results back in a pandas DataFrame records format.

curl -X POST localhost:8080/infer -H "Content-Type: application/vnd.apache.arrow.file" -H "Accept: application/json; format=pandas-records" --data-binary @./data/image_224x224.arrow
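
The same endpoint also accepts pandas record format JSON. The following Python sketch builds such a request with the standard library only. It assumes the deployment is reachable at localhost:8080 as in the curl example. The helper names and the "tensor" input field are hypothetical, since the real field names depend on the deployed model's input schema.

```python
import json
from urllib.request import Request, urlopen


def build_infer_request(records, base_url="http://localhost:8080"):
    """Build a POST /infer request whose body is pandas-records JSON."""
    body = json.dumps(records).encode("utf-8")
    return Request(
        base_url + "/infer",
        data=body,
        headers={
            "Content-Type": "application/json; format=pandas-records",
            "Accept": "application/json; format=pandas-records",
        },
        method="POST",
    )


def infer(records, base_url="http://localhost:8080"):
    """Send the request and return the decoded list of result records."""
    with urlopen(build_infer_request(records, base_url)) as response:
        return json.load(response)


# Hypothetical input: one record with a single, shortened "tensor" field.
example_records = [{"tensor": [4.0, 2.5, 2900.0]}]
request = build_infer_request(example_records)
```

Building the request separately from sending it keeps the headers and body easy to inspect before anything is sent to the edge device.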



Log Retrieval from Edge Locations

Inference logs are retrieved from edge location deployments through the /logs endpoint.

  • Endpoint: /logs
  • Type: POST
  • Headers:
    • Content-Type: application/json: Submissions to the /logs endpoint in JSON text format.
    • Accept: application/json; format=pandas-records: The /logs endpoint returns JSON in pandas Record format.
  • Parameters:
    • {}: An empty set.

Inference logs are returned as JSON in pandas Record Format with the following fields:

  • time (DateTime): DateTime field in Epoch format.
  • in (Dict): The inputs in Dict format.
  • out (Dict): The outputs in Dict format with the model field outputs and values.
  • anomaly (Dict): Any anomalies detected; the count field is reserved for the total number of validations that returned True. See anomalies for more details.
  • metadata (Dict): Metadata of the transaction that includes:
    • last_model (Dict): The last model used in the inference request that includes:
      • model_name (String): The name assigned to the model when uploaded to Wallaroo Ops.
      • model_sha (String): The sha hash of the model.
    • pipeline_version (String): The version of the pipeline in UUID format.
    • elapsed (List[Integer]): A list of time in nanoseconds for:
      • [0] The time to serialize the input.
      • [1…n] How long each step took.
    • dropped (List[String]): Any inference input fields dropped to reduce log storage. Inference results always return the entire inference inputs and outputs; inference logs may drop input fields for space purposes.
    • partition (String): The partition the inference logs are assigned. On edge deployments, the partition matches the name of the edge added to the pipeline.
Log Retrieval from Edge Locations Example

The following shows retrieving logs from a model deployment on an edge location.

We will store the logs to a JSON file in pandas Record format, then display the edge logs as a DataFrame.

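import pandas as pd
# Save the edge logs as JSON in pandas Record format, then load them into a DataFrame.
!curl -X POST http://localhost:8080/logs \
    -H "Content-Type: application/json" \
    -H "Accept: application/json; format=pandas-records" \
    --data '{}' > ./edge-logs.df.json
df_logs = pd.read_json("./edge-logs.df.json", orient="records")
df_logs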
(index)  time  in  out  anomaly  metadata
0  1713880452318  {'tensor': [4.0, 2.5, 2900.0, 5505.0, 2.0, 0.0, 0.0, 3.0, 8.0, 2900.0, 0.0, 47.6063, -122.02, 2970.0, 5251.0, 12.0, 0.0, 0.0]}  {'variable': [718013.7]}  {'count': 0}  {'last_model': '{"model_name":"rf-house-price-estimator","model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}', 'pipeline_version': '76bec2c1-d93c-4941-b17b-3c6a6254d0b2', 'elapsed': [15654000, 17385666], 'dropped': [], 'partition': 'houseprice-low-connection-demonstration-01'}
1  1713880461579  {'tensor': [4.0, 2.75, 3010.0, 7215.0, 2.0, 0.0, 0.0, 3.0, 9.0, 3010.0, 0.0, 47.6952018738, -122.1780014038, 3010.0, 7215.0, 0.0, 0.0, 0.0]}  {'variable': [795841.06]}  {'count': 0}  {'last_model': '{"model_name":"rf-house-price-estimator","model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}', 'pipeline_version': '76bec2c1-d93c-4941-b17b-3c6a6254d0b2', 'elapsed': [6297000, 12721000], 'dropped': [], 'partition': 'houseprice-low-connection-demonstration-01'}
2  1713880461579  {'tensor': [4.0, 1.75, 1400.0, 7920.0, 1.0, 0.0, 0.0, 3.0, 7.0, 1400.0, 0.0, 47.465801239, -122.1839981079, 1910.0, 7700.0, 52.0, 0.0, 0.0]}  {'variable': [267013.97]}  {'count': 0}  {'last_model': '{"model_name":"rf-house-price-estimator","model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}', 'pipeline_version': '76bec2c1-d93c-4941-b17b-3c6a6254d0b2', 'elapsed': [6297000, 12721000], 'dropped': [], 'partition': 'houseprice-low-connection-demonstration-01'}
3  1713880461579  {'tensor': [4.0, 2.5, 3130.0, 13202.0, 2.0, 0.0, 0.0, 3.0, 10.0, 3130.0, 0.0, 47.5877990723, -121.9759979248, 2840.0, 10470.0, 19.0, 0.0, 0.0]}  {'variable': [879083.56]}  {'count': 0}  {'last_model': '{"model_name":"rf-house-price-estimator","model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}', 'pipeline_version': '76bec2c1-d93c-4941-b17b-3c6a6254d0b2', 'elapsed': [6297000, 12721000], 'dropped': [], 'partition': 'houseprice-low-connection-demonstration-01'}
4  1713880461579  {'tensor': [3.0, 2.25, 1620.0, 997.0, 2.5, 0.0, 0.0, 3.0, 8.0, 1540.0, 80.0, 47.5400009155, -122.0260009766, 1620.0, 1068.0, 4.0, 0.0, 0.0]}  {'variable': [544392.06]}  {'count': 0}  {'last_model': '{"model_name":"rf-house-price-estimator","model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}', 'pipeline_version': '76bec2c1-d93c-4941-b17b-3c6a6254d0b2', 'elapsed': [6297000, 12721000], 'dropped': [], 'partition': 'houseprice-low-connection-demonstration-01'}
...
995  1713880461579  {'tensor': [4.0, 2.5, 2040.0, 9225.0, 1.0, 0.0, 0.0, 5.0, 8.0, 1610.0, 430.0, 47.6360015869, -122.0970001221, 1730.0, 9225.0, 46.0, 0.0, 0.0]}  {'variable': [627853.3]}  {'count': 0}  {'last_model': '{"model_name":"rf-house-price-estimator","model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}', 'pipeline_version': '76bec2c1-d93c-4941-b17b-3c6a6254d0b2', 'elapsed': [6297000, 12721000], 'dropped': [], 'partition': 'houseprice-low-connection-demonstration-01'}
996  1713880461579  {'tensor': [3.0, 3.0, 1330.0, 1379.0, 2.0, 0.0, 0.0, 4.0, 8.0, 1120.0, 210.0, 47.6125984192, -122.31300354, 1810.0, 1770.0, 9.0, 0.0, 0.0]}  {'variable': [450867.7]}  {'count': 0}  {'last_model': '{"model_name":"rf-house-price-estimator","model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}', 'pipeline_version': '76bec2c1-d93c-4941-b17b-3c6a6254d0b2', 'elapsed': [6297000, 12721000], 'dropped': [], 'partition': 'houseprice-low-connection-demonstration-01'}
997  1713880461579  {'tensor': [3.0, 2.5, 1880.0, 4499.0, 2.0, 0.0, 0.0, 3.0, 8.0, 1880.0, 0.0, 47.5663986206, -121.9990005493, 2130.0, 5114.0, 22.0, 0.0, 0.0]}  {'variable': [553463.25]}  {'count': 0}  {'last_model': '{"model_name":"rf-house-price-estimator","model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}', 'pipeline_version': '76bec2c1-d93c-4941-b17b-3c6a6254d0b2', 'elapsed': [6297000, 12721000], 'dropped': [], 'partition': 'houseprice-low-connection-demonstration-01'}
998  1713880461579  {'tensor': [4.0, 1.5, 1200.0, 10890.0, 1.0, 0.0, 0.0, 5.0, 7.0, 1200.0, 0.0, 47.342300415, -122.0879974365, 1250.0, 10139.0, 42.0, 0.0, 0.0]}  {'variable': [241330.17]}  {'count': 0}  {'last_model': '{"model_name":"rf-house-price-estimator","model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}', 'pipeline_version': '76bec2c1-d93c-4941-b17b-3c6a6254d0b2', 'elapsed': [6297000, 12721000], 'dropped': [], 'partition': 'houseprice-low-connection-demonstration-01'}
999  1713880461579  {'tensor': [4.0, 3.25, 5180.0, 19850.0, 2.0, 0.0, 3.0, 3.0, 12.0, 3540.0, 1640.0, 47.5620002747, -122.1620025635, 3160.0, 9750.0, 9.0, 0.0, 0.0]}  {'variable': [1295531.8]}  {'count': 0}  {'last_model': '{"model_name":"rf-house-price-estimator","model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}', 'pipeline_version': '76bec2c1-d93c-4941-b17b-3c6a6254d0b2', 'elapsed': [6297000, 12721000], 'dropped': [], 'partition': 'houseprice-low-connection-demonstration-01'}

1000 rows × 5 columns
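
In the sample output above, metadata.last_model appears as a JSON-encoded string rather than a nested dict, so it may need decoding before its fields can be read. The sketch below flattens one log record into a single-level dict and handles both forms. The function name and the choice of output keys are hypothetical, and the sample record is abbreviated from the first row of the output.

```python
import json


def flatten_log_record(record):
    """Flatten one edge log record into a single-level dict."""
    metadata = dict(record["metadata"])
    last_model = metadata.get("last_model", {})
    # In the sample output, last_model is a JSON string; decode it if so.
    if isinstance(last_model, str):
        last_model = json.loads(last_model)
    return {
        "time": record["time"],
        "model_name": last_model.get("model_name"),
        "model_sha": last_model.get("model_sha"),
        "pipeline_version": metadata.get("pipeline_version"),
        "partition": metadata.get("partition"),
        "anomaly_count": record["anomaly"]["count"],
        "dropped": metadata.get("dropped", []),
    }


# Abbreviated copy of the first log row; the input tensor is shortened.
sample_log = {
    "time": 1713880452318,
    "in": {"tensor": [4.0, 2.5, 2900.0]},
    "out": {"variable": [718013.7]},
    "anomaly": {"count": 0},
    "metadata": {
        "last_model": '{"model_name":"rf-house-price-estimator",'
                      '"model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}',
        "pipeline_version": "76bec2c1-d93c-4941-b17b-3c6a6254d0b2",
        "elapsed": [15654000, 17385666],
        "dropped": [],
        "partition": "houseprice-low-connection-demonstration-01",
    },
}
flat = flatten_log_record(sample_log)
```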

Edge Bundle One Time Token

When an edge is added to a pipeline publish, the field docker_run_variables contains a JSON value for edge devices to connect to the Wallaroo Ops instance.

The settings are stored in the key EDGE_BUNDLE as a base64 encoded value that includes the following:

  • BUNDLE_VERSION: The current version of the bundled Wallaroo pipeline.
  • EDGE_NAME: The edge name as defined when created and added to the pipeline publish.
  • JOIN_TOKEN: The one time token for authenticating to the Wallaroo Ops instance.
  • OPSCENTER_HOST: The hostname of the Wallaroo Ops edge service. See Edge Deployment Registry Guide for full details on enabling pipeline publishing and edge observability to Wallaroo.
  • PIPELINE_URL: The OCI registry URL to the containerized pipeline.
  • WORKSPACE_ID: The numerical ID of the workspace.

For example, decoding the EDGE_BUNDLE value with base64 -D returns:

export BUNDLE_VERSION=1
export EDGE_NAME=xgb-ccfraud-edge-test
export JOIN_TOKEN=3148ada5-285a-4fca-b3b8-b50a48d7511b
export OPSCENTER_HOST=doc-test.edge.wallaroocommunity.ninja
export PIPELINE_URL=ghcr.io/wallaroolabs/doc-samples/pipelines/edge-pipeline:f388c109-8d57-4ed2-9806-aa13f854576b
export WORKSPACE_ID=5
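
The decoding step can also be done in Python. The sketch below is a minimal illustration that assumes the decoded bundle consists of export KEY=VALUE lines, as in the example above. The function name is hypothetical, and the sample bundle is built locally from a subset of the example values so the code runs without a Wallaroo instance.

```python
import base64


def decode_edge_bundle(encoded):
    """Decode a base64 EDGE_BUNDLE value into a dict of settings.

    Assumption: the decoded text is a series of `export KEY=VALUE` lines.
    """
    text = base64.b64decode(encoded).decode("utf-8")
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("export "):
            line = line[len("export "):]
        if "=" not in line:
            continue  # skip blank or unrecognized lines
        key, value = line.split("=", 1)
        settings[key] = value
    return settings


# Stand-in bundle built from the example values above.
sample_text = "\n".join([
    "export BUNDLE_VERSION=1",
    "export EDGE_NAME=xgb-ccfraud-edge-test",
    "export JOIN_TOKEN=3148ada5-285a-4fca-b3b8-b50a48d7511b",
    "export WORKSPACE_ID=5",
])
sample_bundle = base64.b64encode(sample_text.encode("utf-8")).decode("ascii")
settings = decode_edge_bundle(sample_bundle)
```

Because the decoded settings include the JOIN_TOKEN, it is sensible to avoid printing or logging them in full.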

The JOIN_TOKEN is a one time access token. Once used, a JOIN_TOKEN expires. The authentication session data is stored in persistent volumes. Persistent volumes must be specified for Docker and Docker Compose based deployments of Wallaroo pipelines; Helm based deployments automatically provide persistent volumes to store authentication credentials.

The JOIN_TOKEN has the following time to live (TTL) parameters:

  • Once created, the JOIN_TOKEN is valid for 24 hours. If it expires before the edge contacts the OpsCenter for the first time, the edge is not allowed to connect and a new edge bundle must be created.
  • After an edge joins Wallaroo Ops for the first time with persistent storage, the edge must contact the Wallaroo Ops instance at least once every 7 days.
    • If this period is exceeded, the authentication credentials expire and a new edge bundle must be created with a new and valid JOIN_TOKEN.
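
The two TTL rules above can be modeled as a small local check, for example to warn an operator before an edge loses access. This is only a sketch of the stated rules: the expiry itself is enforced by Wallaroo Ops, the function and parameter names are hypothetical, and treating exactly 24 hours or exactly 7 days as still valid is an assumption.

```python
from datetime import datetime, timedelta, timezone

JOIN_TOKEN_TTL = timedelta(hours=24)   # window for the first contact
CONTACT_INTERVAL = timedelta(days=7)   # maximum gap between later contacts


def needs_new_bundle(created_at, last_contact, now):
    """Return True when a new edge bundle would be required under the rules above.

    created_at: when the edge bundle and its JOIN_TOKEN were created.
    last_contact: time of the most recent contact with Wallaroo Ops,
        or None if the edge has never joined.
    now: the current time.
    """
    if last_contact is None:
        # The edge has not joined yet, so the 24 hour JOIN_TOKEN window applies.
        return now - created_at > JOIN_TOKEN_TTL
    # The edge has joined, so the 7 day contact window applies.
    return now - last_contact > CONTACT_INTERVAL


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
never_joined_late = needs_new_bundle(created, None, created + timedelta(hours=25))
```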

Wallaroo edges require unique names. To create a new edge bundle with the same name:

  • Use Remove Edge to remove the edge by name.
  • Use Add Edge to add the edge with the same name. A new EDGE_BUNDLE is generated with a new JOIN_TOKEN.