Aloha Quick Tutorial

The Aloha Quick Start Guide demonstrates how to use Wallaroo to identify malicious web sites from their URLs.

In this notebook we will walk through a simple pipeline deployment and inference on a model. For this example we will use an open source Aloha CNN LSTM model for classifying domain names as either legitimate or being used for nefarious purposes such as malware distribution.

For our example, we will perform the following:

  • Create a workspace for our work.
  • Upload the Aloha model.
  • Create a pipeline that can ingest our submitted data, submit it to the model, and export the results.
  • Run a sample inference through our pipeline by loading a file.
  • Run a sample inference through our pipeline’s URL and store the results in a file.

All sample data and models are available through the Wallaroo Quick Start Guide Samples repository.

Open a Connection to Wallaroo

The first step is to connect to Wallaroo through the Wallaroo client. The Python library is included in the Wallaroo install and available through the Jupyter Hub interface provided with your Wallaroo environment.

This is accomplished using the wallaroo.Client() command, which provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Store the connection into a variable that can be referenced later.

import wallaroo
wl = wallaroo.Client()

Create the Workspace

We will create a workspace named “aloha-workspace” to work in, then set it as our current workspace environment.

new_workspace = wl.create_workspace("aloha-workspace")
_ = wl.set_current_workspace(new_workspace)

We can verify that the workspace was created and is the current default workspace with the get_current_workspace() command.

wl.get_current_workspace()
Result

{
    'name': 'aloha-workspace',
    'id': 5,
    'archived': False,
    'created_by': '45e6b641-fe57-4fb2-83d2-2c2bd201efe8',
    'created_at': '2022-03-29T20:33:54.981917+00:00',
    'models': [],
    'pipelines': []
}

Wallaroo Engine Configuration

Before deploying an inference engine we will set the configuration of the Wallaroo engine. We will use the Wallaroo DeploymentConfigBuilder() and fill in the options listed below to determine what the properties of our inference engine will be. This way when we start building our pipeline and adding elements to the cluster, Kubernetes can allocate the resources we’ve defined here to the pipeline.

For our example, we’ll use the following settings:

  • replica_count: 1. When deployed, this will run a single inference engine.
  • cpus: 4. Each inference engine will have 4 CPU cores.
  • memory: 8Gi. Each inference engine will have 8 GiB of memory.

deployment_config = (wallaroo.DeploymentConfigBuilder()
                    .replica_count(1)
                    .cpus(4)
                    .memory("8Gi")
                    .build())

Upload the Models

Now we will upload our models. Note that for this example we are uploading the model from a .zip file. The Aloha model is a protobuf file that has been defined for evaluating web pages, and we will configure it to use data in the tensorflow format.

model = wl.upload_model("aloha-2", "./aloha-cnn-lstm.zip").configure("tensorflow")

Deploy a Model

Now that we have a model that we want to use we will create a deployment for it.

We will tell the deployment we are using a tensorflow model and give the deployment name and the configuration we want for the deployment.

To do this, we’ll create our pipeline that can ingest the data, pass the data to our Aloha model, and give us a final output. We’ll call our pipeline aloha-test-demo, then deploy it so it’s ready to receive data. The deployment process usually takes about 45 seconds.

  • Note: If you receive an error that the pipeline could not be deployed because there are not enough resources, undeploy any other pipelines and deploy this one again. This command can quickly undeploy all pipelines to regain resources. We recommend not running this command in a production environment since it will cancel any running pipelines:
for p in wl.list_pipelines(): p.undeploy()
aloha_pipeline = wl.build_pipeline('aloha-test-demo')
aloha_pipeline.add_model_step(model)
aloha_pipeline.deploy()
Result

Waiting for deployment - this will take up to 45s ...... ok

{
    'name': 'aloha-test-demo',
	'create_time': datetime.datetime(2022, 3, 29, 20, 34, 3, 960957, tzinfo=tzutc()),
	'definition': "[
        {
            'ModelInference': 
            {
                'models': 
                [
                    {
                        'name': 'aloha-2',
	                    'version': 'a8e8abdc-c22f-416c-a13c-5fe162357430',
	                    'sha': 'fd998cd5e4964bbbb4f8d29d245a8ac67df81b62be767afbceb96a03d1a01520'
                    }
                ]
            }
        }
    ]"
}

We can verify that the pipeline is running and list what models are associated with it.

aloha_pipeline.status()
Result

{'status': 'Running',
    'details': None,
    'engines': [{'ip': '10.12.1.38',
    'name': 'engine-65d774bb67-4sf5b',
    'status': 'Running',
    'reason': None,
    'pipeline_statuses': {'pipelines': [{'id': 'aloha-test-demo',
        'status': 'Running'}]},
    'model_statuses': {'models': [{'name': 'aloha-2',
        'version': 'a8e8abdc-c22f-416c-a13c-5fe162357430',
        'sha': 'fd998cd5e4964bbbb4f8d29d245a8ac67df81b62be767afbceb96a03d1a01520',
        'status': 'Running'}]}}],
    'engine_lbs': [{'ip': '10.12.1.37',
    'name': 'engine-lb-85846c64f8-qpr6g',
    'status': 'Running',
    'reason': None}]}
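Before sending traffic, the reported state can also be checked programmatically. Below is a minimal sketch that inspects the status dictionary shown above; the is_running helper is our own, not part of the Wallaroo SDK.

```python
# Minimal sketch: inspect the dictionary returned by aloha_pipeline.status().
# The helper name is ours, not the SDK's.
def is_running(status):
    """Return True when the pipeline and all its models report 'Running'."""
    if status.get('status') != 'Running':
        return False
    for engine in status.get('engines', []):
        for model in engine.get('model_statuses', {}).get('models', []):
            if model.get('status') != 'Running':
                return False
    return True

# Example using a trimmed copy of the status shown above:
sample = {
    'status': 'Running',
    'engines': [{'model_statuses': {'models': [{'name': 'aloha-2',
                                                'status': 'Running'}]}}],
}
print(is_running(sample))  # True
```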

Inferences

Infer 1 row

Now that the pipeline is deployed and our Aloha model is in place, we’ll perform a smoke test to verify the pipeline is up and running properly. We’ll use the infer_from_file command to load a single encoded URL into the inference engine and print the results back out.

aloha_pipeline.infer_from_file("data-1.json")
Result

[InferenceResult({'check_failures': [],
    'elapsed': 631348351,
    'model_name': 'aloha-2',
    'model_version': '496e6860-a658-4d35-8b55-0f8cc6ad6fde',
    'original_data': {'text_input': [[0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    0,
                                    28,
                                    16,
                                    32,
                                    23,
                                    29,
                                    32,
                                    30,
                                    19,
                                    26,
                                    17]]},
    'outputs': [{'Float': {'data': [0.001519620418548584], 'dim': [1, 1], 'v': 1}},
                {'Float': {'data': [0.9829147458076477], 'dim': [1, 1], 'v': 1}},
                {'Float': {'data': [0.012099534273147583], 'dim': [1, 1], 'v': 1}},
                {'Float': {'data': [4.7593468480044976e-05],
                            'dim': [1, 1],
                            'v': 1}},
                {'Float': {'data': [2.0289742678869516e-05],
                            'dim': [1, 1],
                            'v': 1}},
                {'Float': {'data': [0.0003197789192199707],
                            'dim': [1, 1],
                            'v': 1}},
                {'Float': {'data': [0.011029303073883057], 'dim': [1, 1], 'v': 1}},
                {'Float': {'data': [0.9975639581680298], 'dim': [1, 1], 'v': 1}},
                {'Float': {'data': [0.010341644287109375], 'dim': [1, 1], 'v': 1}},
                {'Float': {'data': [0.008038878440856934], 'dim': [1, 1], 'v': 1}},
                {'Float': {'data': [0.016155093908309937], 'dim': [1, 1], 'v': 1}},
                {'Float': {'data': [0.006236225366592407], 'dim': [1, 1], 'v': 1}},
                {'Float': {'data': [0.0009985864162445068],
                            'dim': [1, 1],
                            'v': 1}},
                {'Float': {'data': [1.7933435344117743e-26],
                            'dim': [1, 1],
                            'v': 1}},
                {'Float': {'data': [1.388984431455466e-27],
                            'dim': [1, 1],
                            'v': 1}}],
    'pipeline_name': 'aloha-test-demo',
    'time': 1648570552486})]
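Each entry in the 'outputs' field above is a single-element Float tensor. A small sketch for flattening them into plain Python floats (extract_scores is our own helper, not an SDK call), shown here on the first two values from the result:

```python
# Flatten the {'Float': {'data': [...]}} entries from an InferenceResult's
# raw outputs into plain floats.
def extract_scores(outputs):
    return [entry['Float']['data'][0] for entry in outputs]

outputs = [
    {'Float': {'data': [0.001519620418548584], 'dim': [1, 1], 'v': 1}},
    {'Float': {'data': [0.9829147458076477], 'dim': [1, 1], 'v': 1}},
]
scores = extract_scores(outputs)
print(scores)  # [0.001519620418548584, 0.9829147458076477]
```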

Batch Inference

Now that our smoke test was successful, let’s give it some real data. We have two inference files we can use:

  • data-1k.json: Contains 1,000 inferences
  • data-25k.json: Contains 25,000 inferences

We’ll pipe the data-25k.json file through the aloha_pipeline deployment URL and place the results in a file named curl_response.txt. We’ll also display the time this takes. Note that larger batches of 50,000 inferences or more can be difficult to view in Jupyter Hub because of their size.

When running this example, replace the URL in the curl command below with the one returned by the _deployment._url() command.

aloha_pipeline._deployment._url()
    'http://engine-lb.aloha-test-demo-5:29502/pipelines/aloha-test-demo'
!curl -X POST http://engine-lb.aloha-test-demo-5:29502/pipelines/aloha-test-demo -H "Content-Type:application/json" --data @data-25k.json > curl_response.txt
Result

% Total    % Received % Xferd  Average Speed   Time
100         12.9M      100     10.1M            0:00:19
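The curl call above can also be issued from Python. The sketch below uses only the standard library to build the equivalent POST request; the URL is the one printed earlier, build_inference_request is our own helper, and the actual send is commented out since it requires a live deployment.

```python
import urllib.request

def build_inference_request(url, data_path):
    """Build a POST request mirroring the curl command above."""
    with open(data_path, "rb") as f:
        payload = f.read()
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

url = "http://engine-lb.aloha-test-demo-5:29502/pipelines/aloha-test-demo"
# req = build_inference_request(url, "data-25k.json")
# with urllib.request.urlopen(req) as resp, open("curl_response.txt", "wb") as out:
#     out.write(resp.read())
```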

Undeploy Pipeline

When finished with our tests, we will undeploy the pipeline so we have the Kubernetes resources back for other tasks. Note that if the deployment variable is unchanged aloha_pipeline.deploy() will restart the inference engine in the same configuration as before.

aloha_pipeline.undeploy()