🧠LLMs
What are LLMs, aka Large Language Models?
An LLM, or Large Language Model, is a fundamental element of BeyondLLM. It is used in the generate function to produce a response. We support a variety of models, including ChatOpenAI, Gemini, HuggingFaceHub models, AzureChatOpenAI, and the Ollama wrapper, among others.
GeminiModel
Gemini is the default model used in BeyondLLM. This wrapper covers the Gemini family of models from Google.
Note: Currently only gemini-pro and gemini-1.0-pro are supported. There is no need to install the Google Generative AI package separately, because this is the default model.
Parameters
Google API Key: Key used to authenticate and access the Gemini API. Get an API key from here: https://ai.google.dev/
Model Name: Defines the Gemini chat model to be used, e.g. gemini-pro.
Code snippet
Import GeminiModel from the llms module, configure it according to your needs, and start using it.
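A minimal sketch, assuming the constructor accepts model_name and google_api_key and that the instance exposes a predict method (exact names may differ across BeyondLLM versions):

```python
from beyondllm.llms import GeminiModel

# Assumed parameter names; check your BeyondLLM version for the exact signature.
llm = GeminiModel(
    model_name="gemini-pro",
    google_api_key="your-google-api-key",  # obtain from https://ai.google.dev/
)

# Assumed helper for a quick standalone check; the model is typically passed
# to the generator alongside a retriever in a full BeyondLLM pipeline.
print(llm.predict("Summarize what a Large Language Model is in one sentence."))
```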
GPT-4o Multimodal Model
This LLM, GPT4OpenAIModel, harnesses the power of OpenAI's GPT-4o model with vision capabilities, enabling interactions that go beyond simple text. It seamlessly handles image, audio, and video inputs alongside text prompts, opening up a realm of multimodal possibilities within your BeyondLLM applications.
In order to harness the multimodal capabilities of this model, make sure to install the libraries below:
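As an assumption, the OpenAI SDK plus Whisper for audio transcription would cover the basics; additional packages may be needed for video handling:

```bash
pip install openai openai-whisper
```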
Parameters:
api_key (required): Your OpenAI API key. You can find this key on your OpenAI account page.
model (optional): Specifies the GPT-4 model to use. The default is "gpt-4o," which is GPT-4 with vision capabilities.
model_kwargs (optional): A dictionary of additional keyword arguments to pass to the OpenAI API call, such as max_tokens (to control response length) or temperature (to influence the randomness of the output).
media_paths (optional): The path or a list of paths to your multimedia files (images, audio, or video). You can pass either a single string representing a file path or a list of strings for multiple files. Supported formats include:
Images: JPG, PNG
Audio: MP3, WAV
Video: MP4, AVI, WEBM
Code Snippet:
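A sketch of text-only usage, following the parameter names listed above; the predict call is an assumption:

```python
from beyondllm.llms import GPT4OpenAIModel

# Text-only configuration; parameter names follow the list above.
llm = GPT4OpenAIModel(
    api_key="your-openai-api-key",
    model="gpt-4o",
    model_kwargs={"max_tokens": 512, "temperature": 0.2},
)

print(llm.predict("Describe what a multimodal model can do."))  # assumed method
```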
Example Usages:
1. Using a Single Image:
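Passing a single image path via media_paths as a plain string (the file name is illustrative, and predict is assumed):

```python
from beyondllm.llms import GPT4OpenAIModel

llm = GPT4OpenAIModel(
    api_key="your-openai-api-key",
    model="gpt-4o",
    media_paths="photo.jpg",  # a single path as a string
)

print(llm.predict("What is shown in this image?"))  # assumed method
```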
2. Using Multiple Media Files:
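Passing a list of media files; per the note below, audio would be transcribed with Whisper (file names are illustrative):

```python
from beyondllm.llms import GPT4OpenAIModel

llm = GPT4OpenAIModel(
    api_key="your-openai-api-key",
    model="gpt-4o",
    media_paths=["diagram.png", "meeting.mp3", "demo.mp4"],  # list of paths
)

print(llm.predict("Summarize the contents of these files."))  # assumed method
```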
NOTE: Whisper will be used for audio-to-text transcription.
BeyondLLM allows you to easily incorporate GPT-4o's multimodal abilities into your projects without having to manage the complexities of media encoding and transcription.
ChatOpenAIModel
ChatOpenAI is a chat model provided by OpenAI, trained on an instruction dataset over a large corpus.
Installation
In order to use ChatOpenAIModel, we first need to install the OpenAI package:
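The package name below is the standard OpenAI Python SDK, assumed to be the required dependency:

```bash
pip install openai
```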
Parameters
OpenAI API Key: Key used to authenticate and access the OpenAI API. Get your API key from here: https://platform.openai.com/
Model Name: Defines the OpenAI chat model to be used, e.g. the GPT-3.5 and GPT-4 series.
Max Tokens: The maximum length of the output sequence returned by the model.
Temperature: Controls the randomness or creativity in responses.
Code snippet
Import ChatOpenAIModel from the llms module, configure it according to your needs, and start using it.
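A minimal sketch, assuming the constructor takes api_key, model, and a model_kwargs dictionary for max tokens and temperature (names may differ):

```python
from beyondllm.llms import ChatOpenAIModel

# Assumed parameter names based on the list above.
llm = ChatOpenAIModel(
    api_key="your-openai-api-key",
    model="gpt-3.5-turbo",
    model_kwargs={"max_tokens": 512, "temperature": 0.1},
)

print(llm.predict("What is Retrieval Augmented Generation?"))  # assumed method
```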
HuggingFaceHubModel
The Hugging Face Hub is an online platform with over 350k models, 75k datasets, and 150k demo apps (Spaces), all open source and publicly available, where people can easily collaborate and build ML together.
Installation
In order to use HuggingFaceHubModel, we first need to install the Hugging Face Hub package:
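Assuming the Hugging Face Hub client is the required dependency:

```bash
pip install huggingface_hub
```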
Parameters
Token: HuggingFace Access Token to run the model on the Inference API. You can get your Access Token from here: https://huggingface.co/settings/tokens
Model: Model name from the HuggingFace Hub; defaults to zephyr-7b-beta.
Code snippet
Specify the model name from the HuggingFace Hub, add your token, and start using it.
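A sketch assuming token and model parameters and a predict-style call (all assumed):

```python
from beyondllm.llms import HuggingFaceHubModel

# Assumed parameter names; zephyr-7b-beta is the documented default model.
llm = HuggingFaceHubModel(
    token="your-huggingface-access-token",  # https://huggingface.co/settings/tokens
    model="HuggingFaceH4/zephyr-7b-beta",
)

print(llm.predict("Explain what an LLM is in two sentences."))  # assumed method
```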
GroqModel
Groq, a powerful language model API offering access to various chat models, excels at delivering exceptional speed, quality, and energy efficiency compared to traditional methods. If faster LLM inference is a priority, Groq is an excellent choice.
Installation
In order to use GroqModel, we first need to install the Groq package:
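Assuming the Groq Python SDK is the required dependency:

```bash
pip install groq
```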
Parameters
Groq API Key: Obtain your Groq API key from the Groq console (https://console.groq.com/keys) and set it up as an environment variable for security. This key authenticates your requests with the Groq API.
Model (Required): Specifies the Groq language model to use.
Optional Parameters:
temperature: Controls the response randomness (lower for predictable, higher for creative).
Code Snippet
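A sketch matching the description below, assuming the constructor accepts model, groq_api_key, and temperature (names and the prediction call are assumptions):

```python
import os
from beyondllm.llms import GroqModel

# Read the API key from an environment variable rather than hard-coding it.
groq_api_key = os.getenv("GROQ_API_KEY")

# Assumed parameter names; replace the model with the Groq model you want to use.
llm = GroqModel(
    model="mixtral-8x7b-32768",  # illustrative model name
    groq_api_key=groq_api_key,
    temperature=0.2,
)

print(llm.predict("Why does low-latency inference matter for chat applications?"))  # assumed method
```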
This code retrieves your Groq API key securely from an environment variable, creates a GroqModel instance with the specified model and retrieved API key, sets an optional temperature parameter, and demonstrates how to generate text with the model. Remember to replace model with the actual Groq model name you want to use.
Claude Model
The ClaudeModel class represents a language model from Anthropic. This model can be integrated into the BeyondLLM framework to utilize its capabilities in generating textual responses. Below is the detailed implementation and explanation of the ClaudeModel.
Installation
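The Anthropic Python SDK is presumably the dependency needed here (package name assumed):

```bash
pip install anthropic
```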
Parameters
Anthropic API Key: Obtain your Anthropic API key from the Anthropic console and set it up as an environment variable for security. This key authenticates your requests with the Anthropic API.
Model (Required): Specifies the Claude language model to use, such as claude-3-5-sonnet-20240620.
Optional Parameters:
temperature: Controls the response randomness (lower for predictable, higher for creative).
top_p: Controls nucleus sampling, i.e. the cumulative probability mass of the highest-probability tokens to sample from.
top_k: Limits the sampling pool to the top k tokens.
max_tokens: Specifies the maximum number of tokens in the generated response.
Code Snippet
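A sketch assuming the parameter names listed above and a predict-style call (assumed):

```python
import os
from beyondllm.llms import ClaudeModel

# Assumed parameter names; the API key is read from an environment variable.
llm = ClaudeModel(
    model="claude-3-5-sonnet-20240620",
    api_key=os.getenv("ANTHROPIC_API_KEY"),
    temperature=0.3,
    top_p=0.9,
    top_k=40,
    max_tokens=512,
)

print(llm.predict("Give a one-paragraph overview of Claude's strengths."))  # assumed method
```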
Ollama
Ollama lets you run models locally and use them in your application.
In order to get started with Ollama, we first need to download it and pull the model we need. Download Ollama: https://ollama.com/download
Basic Ollama Commands
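A few commonly used commands (llama2 is an illustrative model name):

```bash
ollama pull llama2    # download a model locally
ollama run llama2     # run the model and chat with it in the terminal
ollama list           # list models available locally
ollama serve          # start the Ollama server
```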
More commands: https://github.com/ollama/ollama
Installation
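The Ollama Python client is presumably the required dependency (package name assumed):

```bash
pip install ollama
```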
Parameters
Model: The name of the model you are using.
Code snippet
Before you run OllamaModel, make sure the model is running locally in your terminal: ollama run llama2
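A sketch assuming a single model parameter and a predict-style call (assumed):

```python
from beyondllm.llms import OllamaModel

# Assumed parameter name; the model must already be available locally via Ollama.
llm = OllamaModel(model="llama2")

print(llm.predict("What are the advantages of running an LLM locally?"))  # assumed method
```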
AzureOpenAIModel
Azure OpenAI Service provides REST API access to OpenAI’s powerful language models, including the GPT-4, GPT-3.5-Turbo, and Embeddings model series.
Installation
In order to use AzureOpenAIModel, we first need to install the OpenAI package:
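Azure OpenAI is accessed through the standard OpenAI Python SDK, which is assumed to be the required dependency:

```bash
pip install openai
```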
Parameters
AzureChatOpenAI API Key: Azure API key for the AzureChatOpenAI service.
Deployment Name: The deployment name created under Model deployments on Azure.
Endpoint URL: Your Azure endpoint URL.
Model Name: AzureChatOpenAI enables access to GPT-4 models.
Max Tokens: The maximum sequence length for the model response.
Temperature: Controls the randomness or creativity in responses.
Create your Azure account and get the Endpoint URL and Key from here: https://oai.azure.com/
Code snippet
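A sketch assuming constructor parameters that mirror the list above (all names are assumptions):

```python
from beyondllm.llms import AzureOpenAIModel

# Assumed parameter names; check your BeyondLLM version for the exact signature.
llm = AzureOpenAIModel(
    azure_key="your-azure-openai-key",
    deployment_name="your-deployment-name",
    endpoint_url="https://your-resource-name.openai.azure.com/",
    model_kwargs={"max_tokens": 512, "temperature": 0.1},
)

print(llm.predict("Summarize the benefits of the Azure OpenAI Service."))  # assumed method
```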
MistralModel
The MistralModel utilizes Mistral AI's robust capabilities, offering support for both text and multimodal inputs. It allows users to send text prompts alongside images for enhanced interaction, making it a versatile choice for BeyondLLM users. This model handles complex requests while ensuring flexibility in configuration. It is particularly useful for use cases requiring the combination of text and visual content.
Note: Ensure you have installed the Mistral AI library and obtained an API key for authentication. The model supports various customization parameters such as max_tokens and temperature.
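Assuming the mistralai package is the library referred to above:

```bash
pip install mistralai
```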
Parameters
Mistral API Key: Required for authenticating and accessing the Mistral API.
Model Name: Defines the Mistral model to be used, e.g. mistral-large.
Model Parameters: Optional parameters like max_tokens and temperature to fine-tune the model's response behavior.
Code snippet
Import the MistralModel, configure it with your API key and model parameters, and start generating responses with support for multimodal inputs.
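A sketch assuming api_key, model, and a model_kwargs dictionary (names and the predict call are assumptions):

```python
from beyondllm.llms import MistralModel

# Assumed parameter names based on the list above.
llm = MistralModel(
    api_key="your-mistral-api-key",
    model="mistral-large",  # model name taken from the parameter list above
    model_kwargs={"max_tokens": 512, "temperature": 0.2},
)

print(llm.predict("What kinds of inputs can a multimodal model handle?"))  # assumed method
```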