How to use Azure OpenAI GPT-4o with Function calling
Published Jun 03 2024 09:21 AM 6,179 Views

Introduction

In this article we will demonstrate how we leverage GPT-4o capabilities, using images with function calling to unlock multimodal use cases.

We will simulate a package routing service that routes packages based on the shipping label using OCR with GPT-4o.

The model will identify the appropriate function to call based on the image analysis and the predefined actions for routing to the appropriate continent.

 

Background

The new GPT-4o (“o” for “omni”) can reason across audio, vision, and text in real time.

  • It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation.
  • It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API.
  • GPT-4o is especially better at vision and audio understanding compared to existing models.
  • GPT-4o now enables function calling.

 

The application

We will run a Jupyter notebook that connects to GPT-4o to sort packages based on the printed labels with the shipping address.

Here are some sample labels we will be using GPT-4o for OCR to get the country this is being shipped to and GPT-4o functions to route the packages.

Denise_Schlesinger_0-1717430848731.png

 

Denise_Schlesinger_1-1717430848739.png

 

Denise_Schlesinger_2-1717430848740.png

 

 

The environment

The code can be found here - Azure OpenAI code examples

Make sure you create your python virtual environment and fill the environment variables as stated in the README.md file.

 

The code

Connecting to Azure OpenAI GPT-4o deployment.

 

from dotenv import load_dotenv
from IPython.display import display, HTML, Image
import os
from openai import AzureOpenAI
import json

load_dotenv()

GPT4o_API_KEY = os.getenv("GPT4o_API_KEY")
GPT4o_DEPLOYMENT_ENDPOINT = os.getenv("GPT4o_DEPLOYMENT_ENDPOINT")
GPT4o_DEPLOYMENT_NAME = os.getenv("GPT4o_DEPLOYMENT_NAME")

client = AzureOpenAI(
  azure_endpoint = GPT4o_DEPLOYMENT_ENDPOINT,
  api_key=GPT4o_API_KEY,  
  api_version="2024-02-01"

)

 

 

Defining the functions to be called after GPT-4o answers.

 

# Defining the functions - in this case a toy example of a shipping function
def ship_to_Oceania(location):
     return f"Shipping to Oceania based on location {location}"

def ship_to_Europe(location):
    return f"Shipping to Europe based on location {location}"

def ship_to_US(location):
    return f"Shipping to Americas based on location {location}"

 

 

Defining the available functions to be called to send to GPT-4o.

It is very IMPORTANT to send the function's and parameters descriptions so GPT-4o will know which method to call.

 

tools = [
    {
        "type": "function",
        "function": {
            "name": "ship_to_Oceania",
            "description": "Shipping the parcel to any country in Oceania",
             "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The country to ship the parcel to.",
                    }
                },
                "required": ["location"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "ship_to_Europe",
            "description": "Shipping the parcel to any country in Europe",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The country to ship the parcel to.",
                    }
                },
                "required": ["location"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "ship_to_US",
            "description": "Shipping the parcel to any country in the United States",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The country to ship the parcel to.",
                    }
                },
                "required": ["location"],
            },
        },
    },
]

available_functions = {
    "ship_to_Oceania": ship_to_Oceania,
    "ship_to_Europe": ship_to_Europe,
    "ship_to_US": ship_to_US,

}

 

 

Function to base64 encode our images, this is the format accepted by GPT-4o.

 

# Encoding the images to send to GPT-4-O
import base64

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

 

 

The method to call GPT-4o.

Notice below that we send the parameter "tools" with the JSON describing the functions to be called.

 

def call_OpenAI(messages, tools, available_functions):
    # Step 1: send the prompt and available functions to GPT
    response = client.chat.completions.create(
        model=GPT4o_DEPLOYMENT_NAME,
        messages=messages,
        tools=tools,
        tool_choice="auto",
    )
    response_message = response.choices[0].message
    # Step 2: check if GPT wanted to call a function
    if response_message.tool_calls:
        print("Recommended Function call:")
        print(response_message.tool_calls[0])
        print()
        # Step 3: call the function
        # Note: the JSON response may not always be valid; be sure to handle errors
        function_name = response_message.tool_calls[0].function.name
        # verify function exists
        if function_name not in available_functions:
            return "Function " + function_name + " does not exist"
        function_to_call = available_functions[function_name]
        # verify function has correct number of arguments
        function_args = json.loads(response_message.tool_calls[0].function.arguments)
        if check_args(function_to_call, function_args) is False:
            return "Invalid number of arguments for function: " + function_name
        # call the function
        function_response = function_to_call(**function_args)
        print("Output of function call:")
        print(function_response)
        print()

 

 

Please note that WE and not GPT-4o call the methods in our code based on the answer by GTP4-o.

 

# call the function
        function_response = function_to_call(**function_args)

 

 

Iterate through all the images in the folder. 

Notice the system prompt where we ask GPT-4o what we need it to do, sort labels for packages routing calling functions.

 

# iterate through all the images in the data folder
import os

data_folder = "./data"
for image in os.listdir(data_folder):
    if image.endswith(".png"):
        IMAGE_PATH = os.path.join(data_folder, image)
        base64_image = encode_image(IMAGE_PATH)
        display(Image(IMAGE_PATH))
        messages = [
            {"role": "system", "content": "You are a customer service assistant for a delivery service, equipped to analyze images of package labels. Based on the country to ship the package to, you must always ship to the corresponding continent. You must always use tools!"},
            {"role": "user", "content": [
                {"type": "image_url", "image_url": {
                    "url": f"data:image/png;base64,{base64_image}"}
                }
            ]}
        ]
        call_OpenAI(messages, tools, available_functions)

 

 

Let’s run our notebook!!!

Denise_Schlesinger_3-1717430848748.png

 

Running our code for the label above produces the following output:

 

Recommended Function call:
ChatCompletionMessageToolCall(id='call_lH2G1bh2j1IfBRzZcw84wg0x', function=Function(arguments='{"location":"United States"}', name='ship_to_US'), type='function')


Output of function call:
Shipping to Americas based on location United States

 

 

 

That’s all folks!

Thanks

Denise

1 Comment