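Pipeline logs are retrieved through the Wallaroo MLOps API with the following request.
- Request URL: v1/api/pipelines/get_logs
- Accept header values:
  - application/json; format=pandas-records: For the logs returned as pandas DataFrame.
  - application/vnd.apache.arrow.file: For the logs returned as Apache Arrow.
- Parameters:
  - order (default: Desc): The order for log inserts returned. Valid values are:
    - Asc: In chronological order of inserts.
    - Desc: In reverse chronological order of inserts.
  - Page size (default: 1000): Max records per page.
  - start_time and end_time: The bounds of the time period to retrieve logs for, used together as in the examples below.
- Returns: The logs are returned by default in 'application/json; format=pandas-records' format. To request the logs as Apache Arrow tables, set the submission header Accept to application/vnd.apache.arrow.file.
- Response headers: If x-iteration-status is All, then x-iteration-cursor is not provided.

import pandas as pd
import requests
# wl (the Wallaroo client), APIURL, main_pipeline_name, and workspace_id are assumed to be set up earlier
# retrieve the authorization token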
headers = wl.auth.auth_header()
url = f"{APIURL}/v1/api/pipelines/get_logs"
# Standard log retrieval
data = {
'pipeline_id': main_pipeline_name,
'workspace_id': workspace_id
}
response = requests.post(url, headers=headers, json=data)
standard_logs = pd.DataFrame.from_records(response.json())
display(standard_logs.head(5).loc[:, ["time", "in", "out"]])
| | time | in | out |
---|---|---|---|
0 | 1684423875900 | {'tensor': [4.0, 2.5, 2900.0, 5505.0, 2.0, 0.0, 0.0, 3.0, 8.0, 2900.0, 0.0, 47.6063, -122.02, 2970.0, 5251.0, 12.0, 0.0, 0.0]} | {'variable': [718013.75]} |
1 | 1684423875900 | {'tensor': [2.0, 2.5, 2170.0, 6361.0, 1.0, 0.0, 2.0, 3.0, 8.0, 2170.0, 0.0, 47.7109, -122.017, 2310.0, 7419.0, 6.0, 0.0, 0.0]} | {'variable': [615094.56]} |
2 | 1684423875900 | {'tensor': [3.0, 2.5, 1300.0, 812.0, 2.0, 0.0, 0.0, 3.0, 8.0, 880.0, 420.0, 47.5893, -122.317, 1300.0, 824.0, 6.0, 0.0, 0.0]} | {'variable': [448627.72]} |
3 | 1684423875900 | {'tensor': [4.0, 2.5, 2500.0, 8540.0, 2.0, 0.0, 0.0, 3.0, 9.0, 2500.0, 0.0, 47.5759, -121.994, 2560.0, 8475.0, 24.0, 0.0, 0.0]} | {'variable': [758714.2]} |
4 | 1684423875900 | {'tensor': [3.0, 1.75, 2200.0, 11520.0, 1.0, 0.0, 0.0, 4.0, 7.0, 2200.0, 0.0, 47.7659, -122.341, 1690.0, 8038.0, 62.0, 0.0, 0.0]} | {'variable': [513264.7]} |
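The request above relies on the default pandas-records response. As a minimal sketch of switching formats through the Accept header, the helper below copies the authorization headers and sets one of the two Accept values listed in the request description. `LOG_FORMATS` and `with_log_format` are hypothetical names used only for illustration, and the authorization headers are assumed to behave like a plain dict.

```python
# Accept values taken from the request description above.
LOG_FORMATS = {
    "pandas": "application/json; format=pandas-records",
    "arrow": "application/vnd.apache.arrow.file",
}


def with_log_format(auth_headers, log_format="pandas"):
    """Return a copy of the auth headers with the Accept header set.

    Hypothetical helper, not part of the Wallaroo SDK.
    """
    if log_format not in LOG_FORMATS:
        raise ValueError(f"log_format must be one of {sorted(LOG_FORMATS)}")
    headers = dict(auth_headers)  # copy so the caller's dict is not mutated
    headers["Accept"] = LOG_FORMATS[log_format]
    return headers


# Stand-in for wl.auth.auth_header(); the token value is a placeholder.
example_headers = with_log_format({"Authorization": "Bearer <token>"}, "arrow")
print(example_headers["Accept"])
```

With the Arrow format the response body is binary, so `response.json()` does not apply. One option, assuming `pyarrow` is installed, is `pyarrow.ipc.open_file(io.BytesIO(response.content)).read_all()`, which reads the payload into an Arrow table.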
# Retrieve logs from a specific date/time range to only get the two DataFrame input inferences, in ascending order
# retrieve the authorization token
headers = wl.auth.auth_header()
url = f"{APIURL}/v1/api/pipelines/get_logs"
# Log retrieval limited to the shadow deployment time range
data = {
'pipeline_id': main_pipeline_name,
'workspace_id': workspace_id,
'order': 'Asc',
'start_time': f'{shadow_date_start.isoformat()}',
'end_time': f'{shadow_date_end.isoformat()}'
}
response = requests.post(url, headers=headers, json=data)
standard_logs = pd.DataFrame.from_records(response.json())
display(standard_logs.head(5).loc[:, ["time", "out", "out_logcontrolchallenger01", "out_logcontrolchallenger02"]])
| | time | out | out_logcontrolchallenger01 | out_logcontrolchallenger02 |
---|---|---|---|---|
0 | 1684427140394 | {'variable': [718013.75]} | {'variable': [659806.0]} | {'variable': [704901.9]} |
1 | 1684427140394 | {'variable': [615094.56]} | {'variable': [732883.5]} | {'variable': [695994.44]} |
2 | 1684427140394 | {'variable': [448627.72]} | {'variable': [419508.84]} | {'variable': [416164.8]} |
3 | 1684427140394 | {'variable': [758714.2]} | {'variable': [634028.8]} | {'variable': [655277.2]} |
4 | 1684427140394 | {'variable': [513264.7]} | {'variable': [427209.44]} | {'variable': [426854.66]} |
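The shadow deployment logs above carry the control output in `out` and each challenger output in an `out_<model name>` column. The sketch below shows one way to compare them by subtracting the control value from each challenger value per record. `challenger_deltas` is a hypothetical helper, and it assumes every output holds a single number under the `variable` key, as in the rows shown.

```python
def challenger_deltas(records, control="out", prefix="out_", field="variable"):
    """For each log record, subtract the control output from each challenger output."""
    rows = []
    for record in records:
        control_value = record[control][field][0]
        deltas = {}
        for key, value in record.items():
            # Challenger columns are named out_<model name>; "out" itself has no underscore suffix.
            if key.startswith(prefix):
                deltas[key[len(prefix):]] = round(value[field][0] - control_value, 2)
        rows.append(deltas)
    return rows


# First two rows of the shadow deployment output shown above.
sample = [
    {"time": 1684427140394,
     "out": {"variable": [718013.75]},
     "out_logcontrolchallenger01": {"variable": [659806.0]},
     "out_logcontrolchallenger02": {"variable": [704901.9]}},
    {"time": 1684427140394,
     "out": {"variable": [615094.56]},
     "out_logcontrolchallenger01": {"variable": [732883.5]},
     "out_logcontrolchallenger02": {"variable": [695994.44]}},
]
for deltas in challenger_deltas(sample):
    print(deltas)
```

With a live response, the same function could be applied to `response.json()` directly, since the pandas-records format is a list of per-inference records.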
# Retrieve logs from a specific date/time range to only get the two DataFrame input inferences, in ascending order
# retrieve the authorization token
headers = wl.auth.auth_header()
url = f"{APIURL}/v1/api/pipelines/get_logs"
# Log retrieval limited to the A/B testing time range
data = {
'pipeline_id': main_pipeline_name,
'workspace_id': workspace_id,
'order': 'Asc',
'start_time': f'{ab_date_start.isoformat()}',
'end_time': f'{ab_date_end.isoformat()}'
}
response = requests.post(url, headers=headers, json=data)
standard_logs = pd.DataFrame.from_records(response.json())
display(standard_logs.head(5).loc[:, ["time", "out"]])
| | time | out |
---|---|---|
0 | 1684427501820 | {'_model_split': ['{"name":"logcontrolchallenger02","version":"89dba25e-a11e-453d-9bcc-cddf8d6ddea0","sha":"ed6065a79d841f7e96307bb20d5ef22840f15da0b587efb51425c7ad60589d6a"}'], 'variable': [715947.75]} |
1 | 1684427502196 | {'_model_split': ['{"name":"logcontrolchallenger01","version":"07fe7686-9bd0-4fd3-9a7c-0e933a74003c","sha":"31e92d6ccb27b041a324a7ac22cf95d9d6cc3aa7e8263a229f7c4aec4938657c"}'], 'variable': [341386.34]} |
2 | 1684427503778 | {'_model_split': ['{"name":"logapicontrol","version":"70b76ecb-55c2-4d68-be9b-b440b11e6499","sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}'], 'variable': [1039781.2]} |
3 | 1684427504566 | {'_model_split': ['{"name":"logcontrolchallenger02","version":"89dba25e-a11e-453d-9bcc-cddf8d6ddea0","sha":"ed6065a79d841f7e96307bb20d5ef22840f15da0b587efb51425c7ad60589d6a"}'], 'variable': [411090.75]} |
4 | 1684427505094 | {'_model_split': ['{"name":"logcontrolchallenger02","version":"89dba25e-a11e-453d-9bcc-cddf8d6ddea0","sha":"ed6065a79d841f7e96307bb20d5ef22840f15da0b587efb51425c7ad60589d6a"}'], 'variable': [296175.66]} |
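In the A/B testing logs above, each `out` value includes a `_model_split` entry: a one-element list holding a JSON string that names the model that served the inference. As a rough sketch, the hypothetical helper below tallies how many inferences each model handled. The sample rows reuse the model names and outputs from the table, with `version` and `sha` omitted for brevity.

```python
import json
from collections import Counter


def count_model_splits(records):
    """Count inferences per model name using the _model_split field of each record."""
    counts = Counter()
    for record in records:
        # _model_split is a one-element list containing a JSON-encoded string.
        split = json.loads(record["out"]["_model_split"][0])
        counts[split["name"]] += 1
    return counts


# The five rows shown above, keeping only the model name in _model_split.
sample = [
    {"out": {"_model_split": ['{"name":"logcontrolchallenger02"}'], "variable": [715947.75]}},
    {"out": {"_model_split": ['{"name":"logcontrolchallenger01"}'], "variable": [341386.34]}},
    {"out": {"_model_split": ['{"name":"logapicontrol"}'], "variable": [1039781.2]}},
    {"out": {"_model_split": ['{"name":"logcontrolchallenger02"}'], "variable": [411090.75]}},
    {"out": {"_model_split": ['{"name":"logcontrolchallenger02"}'], "variable": [296175.66]}},
]
for name, count in count_model_splits(sample).most_common():
    print(name, count)
```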
Pipeline logs have a set allocation of storage space and data requirements.
To prevent storage and performance issues, inference result data may be dropped from pipeline logs by the following standards:
For example, Computer Vision ML models typically have large input and output values: a single pandas DataFrame inference request may be over 13 MB in size, and the inference results nearly as large. To prevent pipeline log storage issues, the input may be dropped from the pipeline logs, and if additional space is needed, the inference outputs would follow. The time column is preserved.
If a pipeline has dropped columns for space purposes, this will be displayed when a log request is made with the following warning, with {columns} replaced with the dropped columns.
The inference log is above the allowable limit and the following columns may have been suppressed for various rows in the logs: {columns}. To review the dropped columns for an individual inference’s suppressed data, include dataset=["metadata"] in the log request.
To review which columns are dropped from pipeline logs for storage reasons, include the dataset metadata in the request to view the column metadata.dropped. This metadata field displays a list of any columns dropped from the pipeline logs.
For example:
# retrieve the authorization token
headers = wl.auth.auth_header()
url = f"{APIURL}/v1/api/pipelines/get_logs"
# Standard log retrieval
data = {
'pipeline_id': main_pipeline_name,
'workspace_id': workspace_id
}
response = requests.post(url, headers=headers, json=data)
standard_logs = pd.DataFrame.from_records(response.json())
display(len(standard_logs))
display(standard_logs.head(5).loc[:, ["time", "metadata"]])
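# x-iteration-cursor is not provided when x-iteration-status is All, so use .get() to avoid a KeyError
cursor = response.headers.get('x-iteration-cursor')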
| | time | metadata |
---|---|---|
0 | 1688760035752 | {'last_model': '{"model_name":"logapicontrol","model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}', 'pipeline_version': '', 'elapsed': [112967, 267146], 'dropped': []} |
1 | 1688760036054 | {'last_model': '{"model_name":"logapicontrol","model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}', 'pipeline_version': '', 'elapsed': [37127, 594183], 'dropped': []} |
2 | 1688759857040 | {'last_model': '{"model_name":"logapicontrol","model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}', 'pipeline_version': '', 'elapsed': [111082, 253184], 'dropped': []} |
3 | 1688759857526 | {'last_model': '{"model_name":"logapicontrol","model_sha":"e22a0831aafd9917f3cc87a15ed267797f80e2afa12ad7d8810ca58f173b8cc6"}', 'pipeline_version': '', 'elapsed': [43962, 265740], 'dropped': []} |
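The example above reads `metadata.dropped` from one page of logs and saves the cursor for the next page. The sketch below combines the two: it follows the cursor until `x-iteration-status` is `All` and collects every dropped column along the way. Both function names are hypothetical, and `cursor_param` is an assumption: the name of the request field that carries the cursor is not given in the text above, so it should be checked against the API reference. The `post` argument is any callable with the `requests.post` calling convention.

```python
def iter_log_pages(post, url, headers, data, cursor_param="cursor"):
    """Yield each page of log records until x-iteration-status is All.

    post: callable used as post(url, headers=..., json=...), e.g. requests.post.
    cursor_param: ASSUMED name of the request field that carries the cursor.
    """
    payload = dict(data)  # copy so the caller's dict is not mutated
    while True:
        response = post(url, headers=headers, json=payload)
        yield response.json()
        cursor = response.headers.get("x-iteration-cursor")
        # No cursor is provided once the status is All, so stop there.
        if response.headers.get("x-iteration-status") == "All" or cursor is None:
            break
        payload[cursor_param] = cursor


def dropped_columns(pages):
    """Return the sorted names of all columns listed in metadata.dropped."""
    dropped = set()
    for page in pages:
        for record in page:
            dropped.update(record.get("metadata", {}).get("dropped", []))
    return sorted(dropped)
```

A call such as `dropped_columns(iter_log_pages(requests.post, url, headers, data))` would then return an empty list for the logs shown above, where every `dropped` entry is empty.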
Data elements that do not fit the supported data types below, such as None or Null values, are not supported in pipeline logs. When present, undefined data is written in place of the null value, typically zeroes. Any null list values are presented as an empty list.
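Because null values are not preserved in the logs, it can help to find them in the inference input before it is submitted. The following is a client-side sketch only, not a Wallaroo feature: `find_null_fields` is a hypothetical helper that scans a list of input records, and treating NaN as a null is an assumption borrowed from how pandas represents missing numbers.

```python
import math


def find_null_fields(records):
    """Return (row index, field name) pairs whose value is None or NaN."""
    nulls = []
    for index, record in enumerate(records):
        for field, value in record.items():
            # Assumption: NaN counts as a null, as it does in pandas.
            is_nan = isinstance(value, float) and math.isnan(value)
            if value is None or is_nan:
                nulls.append((index, field))
    return nulls


# Hypothetical input rows; the field names are placeholders.
rows = [
    {"tensor": [4.0, 2.5, 2900.0], "label": None},
    {"tensor": [2.0, 2.5, 2170.0], "label": "ok"},
]
print(find_null_fields(rows))
```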