REST API Endpoints
This section guides you through integrating RAG-Buddy into your LLM pipeline using the REST API endpoints, detailing the steps for a more customized and controlled implementation.
We distinguish between two primary use cases that are currently supported:
- RAG+Citation
- Text Classification
Select your use case from the tabs below to view the specific REST API endpoint details.
RAG+Citation Endpoint
The `/ragc` Endpoint
Endpoint URI: `https://api.ragbuddy.ai/ragc/v1`
Versioning of the `/ragc` Endpoint
The `/ragc` endpoint implements versioning in its URL path, ensuring stable and predictable interactions with the RAG-Buddy service. The current version, `/ragc/v1`, is backward-compatible: the `v1` interface, including its features and functionality, will remain consistent and unchanged.
As RAG-Buddy evolves, new versions may be introduced to add advanced features or improvements. These will be accessible under different version paths (e.g., `/ragc/v2`). New versions will not affect the existing `v1` endpoint, so your current integrations will continue to operate seamlessly.
New version releases and significant changes will be announced through email notifications and posts on our blog, so you can stay informed and plan your transition to newer versions at your convenience.
`/ragc` Endpoint - Request
The `/ragc` endpoint in RAG-Buddy's REST API is designed for handling requests specific to the RAG-with-citations use case.
A complete request to the `/ragc` endpoint consists of the following elements:
- RAG-C Request Fields
- Additional LLM Fields
- Request Headers
Read on for more detail on each of these components.
RAG-C Request Fields
These request fields specify the remote LLM and are used to construct the (system and user) `messages` field of the LLM request.
- `remote_llm_url` (required):
  - Type: string
  - Description: The URL of the remote Large Language Model (LLM) that RAG-Buddy will interact with.
  - For example: `https://api.openai.com/v1/chat/completions`
  - Currently, RAG-Buddy only supports the OpenAI Chat models. Please contact us if you would like to use a different LLM.
- `system_intro` (required):
  - Type: string
  - Description: A brief introduction or description that sets the context for the RAG interaction.
  - For example: `You are a customer support agent of the e-banking application called Piggy Bank Extraordinaire.`
- `system_instructions` (optional):
  - Type: string or null
  - Description: Custom instructions for how the system should handle the request. If not provided, default instructions are used. These instructions control the RAG-Buddy Citation Engine; it is recommended to use the default instructions unless you are familiar with the engine.
  - Default instructions: omit this field if you are not changing the defaults.
- `articles` (required):
  - Type: array of objects (Article)
  - Description: A collection of articles, each with an ID and content. These articles serve as the context for the LLM to generate a grounded response.
  - Article:
    - `ID`: string (required) - A unique identifier for the article.
    - `content`: string (required) - The content of the article.
  - For example: see the sketch after this list.
- `user_message` (required):
  - Type: string
  - Description: The user's query or message that the RAG+Citation pipeline will process.
  - For example: `What are the interest rates for the Gold Plus Savings Account?`
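As an illustration of the `articles` field, here is a minimal sketch of such an array. The field names follow the Article description above; the IDs and contents are hypothetical:

```json
[
  {
    "ID": "kb-001",
    "content": "The Gold Plus Savings Account offers an annual interest rate of 2.5% for balances above $1,000."
  },
  {
    "ID": "kb-002",
    "content": "Standard savings accounts at Piggy Bank Extraordinaire earn 1.1% annually."
  }
]
```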
`/ragc` Endpoint - Response
The response is essentially the response as it would have been returned by the LLM directly. The only difference is that the RAG-Buddy Citation Engine parses the result and provides the response content and the selected article (the citation) in explicit fields: `answer` and `article_id`.
`answer` and `article_id` are not returned explicitly for a streaming response. In that case, these values need to be parsed by the client from the `choices.message.content` response field. Here is an example response from the `/ragc` endpoint:
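Below is a minimal sketch of a non-streaming response, assuming the standard OpenAI chat-completion shape with `answer` and `article_id` added at the top level (the exact placement of these fields and all values shown are illustrative):

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "gpt-3.5-turbo",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The Gold Plus Savings Account offers an annual interest rate of 2.5% for balances above $1,000. [kb-001]"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 250,
    "completion_tokens": 25,
    "total_tokens": 275
  },
  "answer": "The Gold Plus Savings Account offers an annual interest rate of 2.5% for balances above $1,000.",
  "article_id": "kb-001"
}
```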
The content below is independent of the use case.
Additional LLM Fields
These fields are specific to the LLM and are forwarded to the LLM as-is. For the OpenAI API, these fields are documented in the OpenAI API reference.
The two required fields for this OpenAI endpoint are `model` and `messages`. The `messages` field is constructed from the fields in the previous section; don't send the `messages` field explicitly in the request body.
Next to the `model` field, you probably want to send the `temperature` field as well. This field controls the randomness of the LLM: the higher the temperature, the more random the response. To make use of the RAG-Buddy Cache, we recommend setting the `temperature` to 0.0.
RAG-Buddy supports streaming responses from the LLM. To enable this, set the `stream` field to `true`.
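Putting this together with the RAG-C fields from the previous section, a complete request body might look like the following sketch (all values are illustrative, and `system_instructions` is omitted to use the defaults):

```json
{
  "remote_llm_url": "https://api.openai.com/v1/chat/completions",
  "system_intro": "You are a customer support agent of the e-banking application called Piggy Bank Extraordinaire.",
  "articles": [
    {
      "ID": "kb-001",
      "content": "The Gold Plus Savings Account offers an annual interest rate of 2.5% for balances above $1,000."
    }
  ],
  "user_message": "What are the interest rates for the Gold Plus Savings Account?",
  "model": "gpt-3.5-turbo",
  "temperature": 0.0,
  "stream": false
}
```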
Request Headers
The following request headers are required:
- `Content-Type: application/json`
- `Authorization: Bearer $OPENAI_API_KEY`
- `Helvia-RAG-Buddy-Token: RAG_CA_**********`
Optional request headers:
- `Helvia-RAG-Buddy-Cache-Control: <...>`
Set this header to `no-cache` to disable reading from the RAG-Buddy Cache. To disable writing, set it to `no-write`; to disable both, set it to `no-cache, no-write`. By default, reading from and writing to the RAG-Buddy Cache are both enabled.
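As a sketch of a complete call, assuming the request body shown earlier is saved in a hypothetical file `request.json`:

```bash
curl https://api.ragbuddy.ai/ragc/v1 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Helvia-RAG-Buddy-Token: RAG_CA_**********" \
  -d @request.json
```

To disable caching for a single request, add `-H "Helvia-RAG-Buddy-Cache-Control: no-cache, no-write"`.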
Response Headers
The response headers are the headers as returned by the LLM, with one addition: `Helvia-RAG-Buddy-Cache-Status`. This header indicates whether the RAG-Buddy Cache was used for the request. If the header is returned, there was a cache hit; its value is the (internally used) cache key that was used to retrieve the context from the cache.
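To check for a cache hit from the command line, you can dump the response headers, for example (a sketch, reusing the hypothetical `request.json` from above):

```bash
curl -s -D - https://api.ragbuddy.ai/ragc/v1 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Helvia-RAG-Buddy-Token: RAG_CA_**********" \
  -d @request.json -o /dev/null | grep -i "helvia-rag-buddy-cache-status"
```

If the command prints the header, the response was served from the cache.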