Using Jupyter Notebooks in Production

How to go from Jupyter Notebooks to Production Systems

The following tutorials are available from the Wallaroo Tutorials Repository.

These tutorials provide an example of an organization moving from experimentation to production deployment, using Jupyter Notebooks as the basis for code research and use. For this example, we assume two main actors performing the following tasks.

| Number | Notebook Sample | Task | Actor | Description |
|--------|-----------------|------|-------|-------------|
| 01 | 01_explore_and_train.ipynb | Data Exploration and Model Selection | Data Scientist | The data scientist evaluates the data and determines the best model to use to solve the proposed problems. |
| 02 | 02_automated_training_process.ipynb | Training Process Automation Setup | Data Scientist | The data scientist has selected the model and tested how to train it. In this phase, the data scientist tests automating the training process based on a data store. |
| 03 | 03_deploy_model.ipynb | Deploy the Model in Wallaroo | MLOps Engineer | The MLOps engineer takes the trained model and deploys a Wallaroo pipeline with it, then performs inferences by feeding it data from a data store. |
| 04 | 04_regular_batch_inferences.ipynb | Regular Batch Inference | MLOps Engineer | With the pipeline deployed, regular inferences can be made and the results reported to a data store. |

The notebooks are arranged in order, with each one demonstrating one step of the process.
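
To make the last stage (regular batch inference) more concrete, the following is a minimal sketch of a batch job that reads unscored rows from a data store, runs them through a model, and writes the results back. It is not taken from the tutorial notebooks: the in-memory SQLite database stands in for the data store, and `make_demo_store`, `fake_predict`, `run_batch`, the table names, and the pricing rule are all hypothetical names and values chosen for illustration. In the tutorial, the prediction step would be an inference against the deployed Wallaroo pipeline instead.

```python
import sqlite3


def make_demo_store():
    """Create an in-memory SQLite database that stands in for the data store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE houses (id INTEGER PRIMARY KEY, sqft REAL)")
    conn.execute(
        "CREATE TABLE predictions (id INTEGER PRIMARY KEY, predicted_price REAL)"
    )
    # Hypothetical sample rows, not taken from data/seattle_housing.csv.
    conn.executemany("INSERT INTO houses VALUES (?, ?)", [(1, 1000.0), (2, 2000.0)])
    conn.commit()
    return conn


def fake_predict(rows):
    """Stand-in for the deployed pipeline: returns one number per input row."""
    # Made-up rule for illustration only: 150 per square foot.
    return [150.0 * sqft for (_id, sqft) in rows]


def run_batch(conn, predict):
    """Score every house that has no prediction yet; return how many were scored."""
    rows = conn.execute(
        "SELECT id, sqft FROM houses "
        "WHERE id NOT IN (SELECT id FROM predictions) ORDER BY id"
    ).fetchall()
    if not rows:
        return 0
    preds = predict(rows)
    conn.executemany(
        "INSERT INTO predictions (id, predicted_price) VALUES (?, ?)",
        [(row[0], pred) for row, pred in zip(rows, preds)],
    )
    conn.commit()
    return len(rows)
```

Calling `run_batch(make_demo_store(), fake_predict)` scores both sample rows. Calling `run_batch` a second time on the same connection scores nothing, because only rows without a stored prediction are selected. That property is what makes the job safe to run on a regular schedule.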

Resources

The following resources are provided as part of this tutorial:

  • data
    • data/seattle_housing_col_description.txt: Describes the columns used as part of the data analysis.
    • data/seattle_housing.csv: Sample data of the Seattle, Washington housing market between 2014 and 2015.
  • code
    • simdb.py: A simulated database to demonstrate sending and receiving queries.
    • preprocess.py and postprocess.py: Processes the data into a format the model accepts, and formats the model outputs for database use.
  • models
    • housing_model_xgb.onnx: Model created in Stage 2: Training Process Automation Setup.
    • ./models/preprocess_byop.zip: Formats the incoming data for the model.
    • ./models/postprocess_byop.zip: Formats the model's outgoing data.
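
As a rough illustration of what the preprocessing and postprocessing steps listed above do, the sketch below converts raw records into numeric feature vectors and turns raw model outputs into rows suitable for a database. It is not the code shipped in preprocess.py or postprocess.py. The feature names in `FEATURES` are hypothetical (the actual columns are described in data/seattle_housing_col_description.txt), and the sketch assumes, purely for illustration, that the model predicts the natural log of the price.

```python
import math

# Hypothetical feature names, chosen for illustration only.
FEATURES = ["bedrooms", "bathrooms", "sqft_living"]


def preprocess(records):
    """Turn a list of row dicts into a list of float feature vectors."""
    return [[float(rec[name]) for name in FEATURES] for rec in records]


def postprocess(ids, log_prices):
    """Pair each id with a price, assuming the model outputs log(price)."""
    return [
        {"id": i, "prediction": round(math.exp(lp), 2)}
        for i, lp in zip(ids, log_prices)
    ]
```

Keeping these two steps separate from the model means the same trained model can be fed from different data sources, as long as each source is mapped to the feature order the model expects.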