How to Use PyPMML: Your Guide to PMML Scoring in Python

Apr 14, 2021 | Educational

If you’re looking to harness the power of Predictive Model Markup Language (PMML) in Python, then PyPMML is your go-to library. This article will guide you step-by-step on how to get started with PyPMML, its prerequisites, dependencies, installation, usage, and troubleshooting.

Prerequisites

  • Java >= 8
  • Python 2.7 or >= 3.5
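
Before installing anything, it can help to confirm both prerequisites from Python itself. The snippet below is a minimal sketch and not part of PyPMML: the helper names parse_java_major, java_major_version, and python_is_supported are made up for illustration. It assumes Python 3.5 or later, that the java launcher is on your PATH, and that java -version prints a line such as version "1.8.0_281" or version "11.0.2".

```python
import re
import shutil
import subprocess
import sys


def parse_java_major(version_text):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_text)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example 1.8.0_281).
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def java_major_version():
    """Run `java -version` and return the major version, or None if unavailable."""
    if shutil.which("java") is None:
        return None
    try:
        # `java -version` writes to stderr, so merge it into stdout.
        result = subprocess.run(
            ["java", "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError:
        return None
    return parse_java_major(result.stdout)


def python_is_supported(version_info=sys.version_info):
    """True for Python 2.7 or 3.5+, the versions listed above."""
    major, minor = version_info[0], version_info[1]
    return (major, minor) == (2, 7) or (major, minor) >= (3, 5)


if __name__ == "__main__":
    java = java_major_version()
    print("Python OK:", python_is_supported())
    print("Java OK:", java is not None and java >= 8)
```

If the second line prints False, install or upgrade Java before moving on to the installation step.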

Dependencies

  • Py4J (the default Java gateway) or JPype
  • Pandas (optional, needed only for DataFrame input)

Installation

Installing PyPMML is straightforward. You have two options:

  • For a quick install via pip:

    pip install pypmml

  • Or you can install the latest version directly from GitHub:

    pip install --upgrade git+https://github.com/autodeployai/pypmml.git

Usage

Once installed, you can load your PMML models and start making predictions. Let’s break down the steps.

1. Load a Model

You can load a PMML model from several kinds of source: a file path, a readable file object, a string, or a byte array.

from pypmml import Model
model = Model.load('single_iris_dectree.xml')
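
Because scoring is handed off to a Java backend, a mistyped file name may surface as an error that is harder to read than a plain Python one. The wrapper below is a small sketch of one way to fail early. The function name load_model is hypothetical; only Model.load comes from PyPMML.

```python
import os


def load_model(path):
    """Load a PMML model from `path`, failing early if the file is missing."""
    if not os.path.isfile(path):
        raise FileNotFoundError("PMML file not found: " + path)
    # Imported lazily so the path check above works even without PyPMML installed.
    from pypmml import Model
    return Model.load(path)


# Example call (requires PyPMML, Java, and the model file on disk):
# model = load_model('single_iris_dectree.xml')
```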

2. Make Predictions

The real magic happens here! You can call the predict(data) method with several kinds of input: dictionaries, lists, JSON strings, NumPy arrays, or Pandas DataFrames. Let’s explore each option with a fun analogy.

Imagine you are a chef trying to create a new dish. The ingredients (data) you add to your recipe have to be in a specific order to ensure the final dish tastes great (predictable results). Similar to how a chef measures ingredients, you provide the model with specific data formats:

  • Data in Dict: Like specifying each ingredient by name with its exact measure.

    model.predict({'sepal_length': 5.1, 'sepal_width': 3.5, 'petal_length': 1.4, 'petal_width': 0.2})

  • Data in List: Think of this as lining up your ingredients in the order they are to be mixed. The values must follow the order of the model’s input fields.

    model.predict([5.1, 3.5, 1.4, 0.2])

  • Data in JSON: It’s like providing your ingredients in a take-out box for convenience. Pass the JSON as a string, shown here in the "split" layout with separate columns and data.

    model.predict('{"columns": ["sepal_length", "sepal_width", "petal_length", "petal_width"], "data": [[5.1, 3.5, 1.4, 0.2]]}')

  • Data in NumPy Arrays: Here, think of precise ingredient ratios prepared ahead of time.

    import numpy as np
    model.predict(np.array([[5.1, 3.5, 1.4, 0.2]]))

  • Data in Pandas DataFrame: This is akin to using a detailed recipe book with all measures listed.

    import pandas as pd
    data = pd.read_csv('Iris.csv')
    model.predict(data)
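
To see how these shapes relate to one another, the sketch below builds the list and JSON forms from a single dict using only the standard library. The helper names (as_list, as_json_records, as_json_split) and the INPUT_NAMES constant are illustrative, not part of PyPMML, and the field order is assumed to match the iris model used above. Whether your PyPMML version accepts each JSON layout is worth confirming against its README.

```python
import json

# Assumed input order for the iris model used in this article.
INPUT_NAMES = ["sepal_length", "sepal_width", "petal_length", "petal_width"]


def as_list(record, input_names):
    """Order a dict's values by the model's input names."""
    return [record[name] for name in input_names]


def as_json_records(records):
    """Serialize a list of dicts as a JSON array of objects."""
    return json.dumps(records)


def as_json_split(records, input_names):
    """Serialize records as JSON with separate 'columns' and 'data' keys."""
    return json.dumps({
        "columns": list(input_names),
        "data": [as_list(record, input_names) for record in records],
    })


if __name__ == "__main__":
    # Keys are deliberately out of order to show that as_list reorders them.
    record = {"petal_width": 0.2, "sepal_length": 5.1,
              "petal_length": 1.4, "sepal_width": 3.5}
    print(as_list(record, INPUT_NAMES))
    print(as_json_records([record]))
    print(as_json_split([record], INPUT_NAMES))
    # Each of these values could then be passed to model.predict(...).
```

Building the list from the dict, rather than typing it by hand, keeps the values in the declared order even when the dict keys arrive shuffled.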

Supported Java Gateways

PyPMML talks to Java through a gateway library. Two are supported, Py4J and JPype, with Py4J as the default. To switch to JPype, select it when creating the context:

from pypmml import PMMLContext
# The gateway name is passed as a string; a bare jpype would raise a NameError.
PMMLContext.getOrCreate(gateway='jpype')
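
If you are not sure which gateway library is available in a given environment, you can check before creating the context. The following is a sketch under two assumptions: the helper names pick_gateway and create_context are hypothetical, and each gateway name ('py4j', 'jpype') is taken to match the importable module name of its library.

```python
import importlib.util


def _module_available(name):
    """True if a top-level module called `name` can be imported."""
    return importlib.util.find_spec(name) is not None


def pick_gateway(preferred=("py4j", "jpype"), is_installed=None):
    """Return the first gateway in `preferred` whose module is importable."""
    check = is_installed or _module_available
    for name in preferred:
        if check(name):
            return name
    return None


def create_context(gateway):
    """Create (or reuse) the PMML context with the chosen gateway."""
    # Imported lazily so pick_gateway() can be used before PyPMML is installed.
    from pypmml import PMMLContext
    return PMMLContext.getOrCreate(gateway=gateway)


if __name__ == "__main__":
    gateway = pick_gateway()
    if gateway is None:
        print("Neither py4j nor jpype is installed")
    else:
        print("Using gateway:", gateway)
```

The is_installed parameter exists only so the selection logic can be exercised without either library present.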

Troubleshooting

If you encounter issues while working with PyPMML, here are some helpful troubleshooting tips:

  • Ensure that you have Java installed and that it is version 8 or higher.
  • Check if the required libraries, Py4J or JPype, are properly installed.
  • If you’re receiving unexpected results, double-check the order and format of your input data against what the model expects.
  • For additional resources or community support, feel free to reach out or check issues on the PyPMML repository.
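
When checking your input against what the model expects, the PMML file itself is a handy reference, since it is plain XML. The sketch below reads the declared fields with the standard library. The function names are illustrative, and SAMPLE_PMML is a trimmed, hypothetical fragment meant only to show the parsing; it is not a complete, loadable model. In PMML, a MiningField with no usageType attribute counts as an active (input) field.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical fragment for demonstration only (not a loadable model).
SAMPLE_PMML = """
<PMML xmlns="http://www.dmg.org/PMML-4_1" version="4.1">
  <DataDictionary numberOfFields="3">
    <DataField name="sepal_length" optype="continuous" dataType="double"/>
    <DataField name="petal_width" optype="continuous" dataType="double"/>
    <DataField name="class" optype="categorical" dataType="string"/>
  </DataDictionary>
  <TreeModel functionName="classification">
    <MiningSchema>
      <MiningField name="sepal_length"/>
      <MiningField name="petal_width"/>
      <MiningField name="class" usageType="predicted"/>
    </MiningSchema>
  </TreeModel>
</PMML>
"""


def _local_name(tag):
    """Strip the '{namespace}' prefix so any PMML version matches."""
    return tag.rsplit("}", 1)[-1]


def data_fields(pmml):
    """Map each declared field name to its (optype, dataType)."""
    root = ET.fromstring(pmml.strip())
    return {
        elem.get("name"): (elem.get("optype"), elem.get("dataType"))
        for elem in root.iter()
        if _local_name(elem.tag) == "DataField"
    }


def active_field_names(pmml):
    """List the input fields, in the order they appear in the mining schema."""
    root = ET.fromstring(pmml.strip())
    names = []
    for elem in root.iter():
        if _local_name(elem.tag) != "MiningField":
            continue
        # usageType defaults to "active" when the attribute is absent.
        if elem.get("usageType", "active") == "active" and elem.get("name") not in names:
            names.append(elem.get("name"))
    return names


if __name__ == "__main__":
    print(active_field_names(SAMPLE_PMML))
    print(data_fields(SAMPLE_PMML))
```

For a real model, read the file in binary mode and pass its contents to the same functions, then compare the names and types with the columns you send to predict.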


Final Thoughts

Getting started with PyPMML is mostly a matter of setup. With Java and the package installed, scoring a PMML model from Python comes down to two steps: load the model, then call predict with your data in whichever supported format is convenient. Dive into the world of PMML and see what you can build with it!
