Jumpstart your integration with RAG-Buddy in proxy mode, designed to enhance your existing RAG setup with minimal effort. This Quickstart is built for a swift, straightforward experience, so you can see the improvements RAG-Buddy brings to your system right away. For alternative configurations and detailed instructions, see our comprehensive Guidebook.

The Quickstart below is designed to get you up and running with RAG-Buddy in the shortest time possible. We do, however, recommend the REST API version of RAG-Buddy, which is more reliable and offers additional advantages over the method described below. That said, do follow this Quickstart to get a feel for how RAG-Buddy works with minimal effort.

Assumptions

This Quickstart assumes the following prerequisites are met:

  • You have an active OpenAI API key. If you do not have one, you can obtain it from OpenAI.
  • You are using the OpenAI Python client. Installation and setup instructions can be found in the repository; a typical install command is shown after this list.
  • Your use case is:
    • Text Classification
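
If the OpenAI Python client is not installed yet, a typical setup (assuming you use pip) looks like this:

pip install --upgrade openai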

If your setup or requirements differ from the above, please consult our Guidebook for a path tailored to your needs.

Create a Project

To begin using RAG-Buddy, a new project must be created. If you haven’t set up a project yet, follow these steps:

  1. Visit the RAG-Buddy Console and sign up for an account if you don’t already have one.
The first project on your account is a trial project, meaning you get to experience the Business plan for a week. After the week passes, your project’s plan will be downgraded to the Free plan and the cache’s data will be reset. If you wish to keep your data, subscribe to the Business plan for your project.
  2. Once you’re signed in:

    • On your Console, click [Create a new project].
    • Name your project.
    • The OpenAI:LARGE3-1024 model is pre-selected as your embedding model.
    • OpenAI Chat Completions API is pre-selected as your LLM provider.
    • Select the Text-Classification cache type.
    • Click Next to move on to the next step.
    • Add the classes you wish to use in your Text-Classification requests in the Classes list.
    • Click Create Project to create your project.
  3. Next, you’ll need an API key for your cache:

    • Navigate to the Services tab.
    • Scroll down to the API Keys for Cache section.
    • Click on Create new key.
    • Assign a name to your key and select Create API Key.
    • Copy the newly generated API key for use in the next step.
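
Rather than hard-coding keys in your scripts, as the examples below do for brevity, you can keep them in environment variables. A minimal sketch, assuming the illustrative variable names OPENAI_API_KEY and RAG_BUDDY_API_KEY:

import os

# Illustrative variable names -- use whatever naming your environment prefers
openai_api_key = os.environ["OPENAI_API_KEY"]
rag_buddy_key = os.environ["RAG_BUDDY_API_KEY"]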

Integrate RAG-Buddy

Here, we demonstrate how to integrate RAG-Buddy into your existing LLM pipeline. Since this is the Quickstart guide, we build the simplest possible integration: the Proxy Endpoint for the Text-Classification use case. For more advanced configurations, please refer to our Guidebook.

For the RAG+Citation use case, please refer to the Proxy Endpoint section in the Guidebook. Below we continue exclusively with the Text-Classification use case.

Using the OpenAI API in a TC Context

In this section, we demonstrate a typical use of the OpenAI API in a Text Classification context. The focus is on having the LLM classify the user’s question into an appropriate intent/class. We start with a standard OpenAI API call in which the system instructions carry a set of pre-selected classes, giving the model the context it needs to classify the question into the appropriate class/intent without any prior training or feature extraction.

Code Example:

Typical TC OpenAI API call
import openai

# Your OpenAI API key
openai_api_key = "sk-abc123" # Replace with your actual API key

# System messages
system_instructions = """You are an expert assistant in the field of customer service. Your task is to help workers in the customer service department of a company.\nYour task is to classify the customer's question in order to help the customer service worker to answer the question. In order to help the worker, you MUST respond with the name of one of the following classes you know.\nIn case you reply with something else, you will be penalized.\nThe classes are the following:"""
    
# Classes
classes = [
        "activate_my_card",
        "age_limit",
        "apple_pay_or_google_pay",
        "atm_support",
        "automatic_top_up",
        "balance_not_updated_after_bank_transfer",
        "balance_not_updated_after_cheque_or_cash_deposit",
        "beneficiary_not_allowed",
        "cancel_transfer",
        "card_about_to_expire",
        "card_acceptance",
        "card_arrival",
        "card_delivery_estimate",
        "card_linking",
        "card_not_working",
        "card_payment_fee_charged",
        "card_payment_not_recognised",
        "card_payment_wrong_exchange_rate",
        "cash_withdrawal_charge",
        "cash_withdrawal_not_recognised",
        "change_pin",
        "compromised_card",
        "contactless_not_working",
        "country_support",
        "declined_card_payment",
        "declined_cash_withdrawal",
        "declined_transfer",
        "direct_debit_payment_not_recognised",
        "disposable_card_limits",
        "edit_personal_details",
        "exchange_charge",
        "exchange_rate",
        "exchange_via_app",
        "extra_charge_on_statement",
        "failed_transfer",
        "fiat_currency_support",
        "get_disposable_virtual_card",
        "get_physical_card",
        "getting_spare_card",
        "getting_virtual_card",
        "lost_or_stolen_card",
        "lost_or_stolen_phone",
        "order_physical_card",
        "passcode_forgotten",
        "pending_card_payment",
        "pending_cash_withdrawal",
        "pending_top_up",
        "pending_transfer",
        "pin_blocked",
        "receiving_money",
        "Refund_not_showing_up",
        "request_refund",
        "reverted_card_payment?",
        "supported_cards_and_currencies",
        "terminate_account",
        "top_up_by_bank_transfer_charge",
        "top_up_by_card_charge",
        "top_up_by_cash_or_cheque",
        "top_up_failed",
        "top_up_limits",
        "top_up_reverted",
        "topping_up_by_card",
        "transaction_charged_twice",
        "transfer_fee_charged",
        "transfer_into_account",
        "transfer_not_received_by_recipient",
        "transfer_timing",
        "unable_to_verify_identity",
        "verify_my_identity",
        "verify_source_of_funds",
        "verify_top_up",
        "virtual_card_not_working",
        "visa_or_mastercard",
        "why_verify_identity",
        "wrong_amount_of_cash_received",
        "wrong_exchange_rate_for_cash_withdrawal",
  ]

# User query (can be replaced with any relevant question)
user_query = "I lost my card!"

# Format the template
classes_string = "\n".join(classes)
template = f"{system_instructions}####\n{classes_string}\n####"

# Initialize OpenAI client
client = openai.OpenAI(
    api_key=openai_api_key,
    timeout=10,
)

# Prepare the messages
messages = [
    {
        "role": "system",
        "content": template,
    },
    {"role": "user", "content": user_query},
]

# Call OpenAI API
completion = client.chat.completions.create(
    model="gpt-4o",  # Replace with the specific model name
    messages=messages,
    temperature=0.0,
)

# Read the response
response = completion.choices[0].message
print(response.content)

Open this code example in a notebook: Colab notebook
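
With the query above ("I lost my card!"), the model should answer with a single class name from the list, most likely lost_or_stolen_card.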

Transitioning to RAG-Buddy Usage

To further enhance the efficiency and relevance of responses, we integrate the RAG-Buddy service. RAG-Buddy acts as a proxy, caching user queries and their classifications and thereby reducing the context size sent to the LLM. To use RAG-Buddy, only a few modifications to the existing code are necessary:

  1. Set the base URL to the RAG-Buddy Proxy Endpoint: Modify the base_url parameter to point to the RAG-Buddy Proxy Endpoint:

base_url = "https://api.ragbuddy.ai/proxy/tc/v1"

This directs the API calls to the RAG-Buddy Proxy instead of directly to OpenAI.

  2. Add the RAG-Buddy Key as a Header: Incorporate the RAG-Buddy key into the headers for authentication and tracking purposes. Your headers configuration will look like this:

headers = {"Helvia-RAG-Buddy-Token": "your-rag-buddy-key"}

Replace “your-rag-buddy-key” with the actual key provided during your project setup.

  3. Remove the classes provided in the request: RAG-Buddy uses the predefined classes you set when creating your project, so you don’t have to supply them with every request to your Text-Classification cache.

By making these changes, your API calls will be routed through the RAG-Buddy proxy, leveraging its caching capabilities to decrease the context size.

Complete Code with RAG-Buddy Integration:

TC Integration
import openai

# Your OpenAI API key
openai_api_key = "sk-abc123" # Replace with your actual API key

# Your RAG Buddy key
rag_buddy_key = "RAG_CA_abc123" # Replace with your actual RAG Buddy key

# Needed for RAG-Buddy proxy integration
base_url = "https://api.ragbuddy.ai/proxy/tc/v1"
headers = {"Helvia-RAG-Buddy-Token": rag_buddy_key}

# System messages
system_instructions = """You are an expert assistant in the field of customer service. Your task is to help workers in the customer service department of a company.\nYour task is to classify the customer's question in order to help the customer service worker to answer the question. In order to help the worker, you MUST respond with the name of one of the following classes you know.\nIn case you reply with something else, you will be penalized.\nThe classes are the following:"""

# User query (can be replaced with any relevant question)
user_query = "I lost my card!"

# Format the template (classes are no longer appended; the cache supplies them)
template = system_instructions

# Initialize OpenAI client
client = openai.OpenAI(
    api_key=openai_api_key,
    timeout=10,
    default_headers=headers,
    base_url=base_url,
)

# Prepare the messages
messages = [
    {
        "role": "system",
        "content": template,
    },
    {"role": "user", "content": user_query},
]

# Call OpenAI API
completion = client.chat.completions.create(
    model="gpt-4o",  # Replace with the specific model name
    messages=messages,
)

# Read the response
response = completion.choices[0].message
print(response.content)

Open this code example in a notebook: Colab notebook

What’s Next?

This Quickstart is designed to get you up and running with RAG-Buddy in the shortest time possible. For more advanced configurations and detailed instructions, please refer to our Guidebook, where you can:

  • Add cache control
  • Inspect the response header for cache hit information (see the sketch after this list)
  • Use the REST API instead of this proxy setup
  • Integrate for different use cases
  • And more…
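
As a taste of the second item above, the OpenAI Python client can expose raw response headers via with_raw_response. The sketch below assumes the same setup as the integration example; the cache-status header name is illustrative, so consult the Guidebook for the actual header exposed by RAG-Buddy.

import openai

client = openai.OpenAI(
    api_key="sk-abc123",  # Replace with your actual OpenAI API key
    base_url="https://api.ragbuddy.ai/proxy/tc/v1",
    default_headers={"Helvia-RAG-Buddy-Token": "RAG_CA_abc123"},
)

# with_raw_response returns the HTTP response alongside the parsed completion
raw = client.chat.completions.with_raw_response.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "I lost my card!"}],
)

# Illustrative header name -- check the Guidebook for the real one
print(raw.headers.get("Helvia-RAG-Buddy-Cache-Status"))

completion = raw.parse()  # The usual ChatCompletion object
print(completion.choices[0].message.content)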