Before integrating RAG-Buddy into your LLM pipeline, ensure you have set up a project in the SaaS portal and have an API key ready.
RAG-Buddy provides endpoints for two use cases:

- RAG+Citation
- Text Classification
RAG+Citation Endpoint
The `/ragc` Endpoint

Endpoint URI: `https://api.ragbuddy.ai/ragc/v1`
Note that this endpoint only supports the specific use case where the LLM returns the most relevant article from the context alongside the actual answer.
Make sure to check out this Playground to experiment with the RAG+Citation Endpoint.
Versioning of the `/ragc` Endpoint

The `/ragc` endpoint implements versioning in its URL path, ensuring stable and predictable interactions with the RAG-Buddy service. The current version, `/ragc/v1`, is backward-compatible: the `v1` interface, including its features and functionality, will remain consistent and unchanged.

As RAG-Buddy evolves, new versions may be introduced with advanced features or improvements. These will be accessible under different version paths (e.g., `/ragc/v2`). Importantly, new versions will not affect the existing `v1` endpoint, so your current integrations continue to operate seamlessly.

We are dedicated to keeping our users informed about updates. New version releases and significant changes will be communicated through email notifications and posts on our blog, so you can stay up to date and plan your transition to newer versions at your convenience.

`/ragc` Endpoint - Request
The `/ragc` endpoint in RAG-Buddy's REST API is designed for handling requests specific to a RAG-with-citations use case. A complete request to the `/ragc` endpoint consists of the following elements:

- RAG-C Request Fields
- Additional LLM Fields
- Request Headers
RAG-C Request Fields
These request fields are used to specify the remote LLM and to construct the (system and user) `messages` fields of the LLM.

- `remote_llm_url` (required):
- Type: string
- Description: The URL of the remote Large Language Model (LLM) that RAG-Buddy will interact with.
- For example: `https://api.openai.com/v1/chat/completions`
Currently, RAG-Buddy only supports the OpenAI Chat models. Please contact us if you would like to use a different LLM.

- `system_intro` (required):
- Type: string
- Description: A brief introduction or description that sets the context for the RAG interaction.
- For example: `You are a customer support agent of the e-banking application called Piggy Bank Extraordinaire.`
- `system_instructions` (optional):
- Type: string or null
- Description: Custom instructions for how the system should handle the request. If not provided, default instructions are used. These instructions control the RAG-Buddy Citation Engine; it is recommended to use the default instructions unless you are familiar with the engine.
- Default instructions (omit this field if you are not changing them):
- `articles` (required):
- Type: array of objects (Article)
- Description: A collection of articles, each with an ID and content. These articles serve as the context for the LLM to generate a grounded response.
- Article:
  - `ID`: string (required) - A unique identifier for the article.
  - `content`: string (required) - The content of the article.
- For example: see the request sketch following this list.
- `user_message` (required):
- Type: string
- Description: The user’s query or message that the RAG+Citation pipeline will process.
- For example: `What are the interest rates for the Gold Plus Savings Account?`
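Putting these fields together, the RAG-C portion of a request body might look like the following sketch (shown here as a Python dict; the article IDs and contents are hypothetical):

```python
# Sketch of the RAG-C request fields; article IDs and contents are hypothetical.
ragc_fields = {
    "remote_llm_url": "https://api.openai.com/v1/chat/completions",
    "system_intro": (
        "You are a customer support agent of the e-banking application "
        "called Piggy Bank Extraordinaire."
    ),
    # "system_instructions" is omitted here, so the default instructions apply.
    "articles": [
        {"ID": "art-001", "content": "The Gold Plus Savings Account offers ..."},
        {"ID": "art-002", "content": "The Silver Savings Account offers ..."},
    ],
    "user_message": "What are the interest rates for the Gold Plus Savings Account?",
}
```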
`/ragc` Endpoint - Response

The response is essentially the response as it would have been returned by the LLM directly. The only difference is that the RAG-Buddy Citation Engine parses the result and provides the response content and the selected article (the citation) in explicit fields: `answer` and `article_id`.

The `answer` and `article_id` fields are not returned explicitly for a streaming response. In that case, the client needs to parse these values from the `choices.message.content` response field.

An example response from the `/ragc` endpoint is sketched below.
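This is an illustration only, assuming the two extra fields sit at the top level of an otherwise standard OpenAI chat completion payload; the exact layout may differ:

```python
# Hypothetical sketch only: the placement of "answer" and "article_id" in the
# payload is an assumption, not a documented guarantee.
example_response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "model": "gpt-3.5-turbo",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "..."},
            "finish_reason": "stop",
        }
    ],
    # Fields added by the RAG-Buddy Citation Engine:
    "answer": "The Gold Plus Savings Account offers ...",
    "article_id": "art-001",  # the cited context article
}
```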
Additional LLM Fields
These fields are specific to the LLM and are forwarded to the LLM as-is. For the OpenAI API, these fields are documented here. The two required fields for this OpenAI endpoint are `model` and `messages`. The `messages` field is constructed from the fields in the previous section, so do not send the `messages` field explicitly in the request body.
RAG-Buddy currently only supports the OpenAI Chat models.
Along with the `model` field, you probably want to send the `temperature` field as well. This field controls the randomness of the LLM: the higher the temperature, the more random the response. To make use of the RAG-Buddy Cache, we recommend setting the `temperature` to 0.0.
RAG-Buddy supports streaming responses from the LLM. To enable this, set the `stream` field to `true`.
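For illustration, the LLM-specific portion of the body might look like this (the model name is an example, not a requirement):

```python
# LLM fields forwarded to OpenAI as-is. Do not include "messages";
# RAG-Buddy constructs it from the RAG-C fields.
llm_fields = {
    "model": "gpt-3.5-turbo",  # illustrative; any supported OpenAI Chat model
    "temperature": 0.0,        # recommended when using the RAG-Buddy Cache
    "stream": False,           # set to True for a streaming response
}
```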
Request Headers
The following request headers are required:

- `Content-Type: application/json`
- `Authorization: Bearer $OPENAI_API_KEY`
- `Helvia-RAG-Buddy-Token: RAG_CA_**********`

Optionally, send the `Helvia-RAG-Buddy-Cache-Control: <...>` header to control caching. Set it to `no-cache` to disable reading from the RAG-Buddy Cache; to disable writing, set it to `no-write`; to disable both, set it to `no-cache, no-write`. By default, both reading from and writing to the RAG-Buddy Cache are enabled.
A bit more detail on the Cache Control headers is provided here.
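A minimal end-to-end sketch in Python, assuming the request body is simply the union of the RAG-C and LLM field sketches above, and that `RAG_BUDDY_TOKEN` is an environment variable you define for your project token:

```python
import os

import requests

RAGC_URL = "https://api.ragbuddy.ai/ragc/v1"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    # Placeholder: use your project's token from the SaaS portal (RAG_CA_**********).
    "Helvia-RAG-Buddy-Token": os.environ["RAG_BUDDY_TOKEN"],
    # Optional: uncomment to disable cache reads and writes for this request.
    # "Helvia-RAG-Buddy-Cache-Control": "no-cache, no-write",
}

# Assumed body layout: RAG-C fields plus LLM fields from the sketches above.
body = {**ragc_fields, **llm_fields}

resp = requests.post(RAGC_URL, headers=headers, json=body)
resp.raise_for_status()
data = resp.json()
print(data["answer"], data["article_id"])
```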
Response Headers
The response headers are those returned by the LLM, with one extra header added: `Helvia-RAG-Buddy-Cache-Status`. This header indicates whether the RAG-Buddy Cache was used for the request. If the header is returned, there was a cache hit; the header's value is the (internally used) cache key that was used to retrieve the context from the cache.
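For instance, continuing the Python sketch above, you could detect a cache hit like this:

```python
# The header is present only on a cache hit; its value is the internal cache key.
cache_key = resp.headers.get("Helvia-RAG-Buddy-Cache-Status")
if cache_key is not None:
    print(f"Cache hit (key: {cache_key})")
else:
    print("Cache miss: the response was generated by the LLM")
```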