🗣️Multilingual RAG

Import the required libraries

from beyondllm import source, retrieve, embeddings, llms, generator

Setup API keys

import os
from getpass import getpass
os.environ['OPENAI_API_KEY'] = getpass("OpenAI API Key:")

Load the Source Data

Here we use a web page as the source data. Reference: https://www.christianitytoday.com/ct/2023/june-web-only/same-sex-attraction-not-threat-zh-hant.html

The source is a Chinese-language article from Christianity Today, "Same-sex attraction is not a threat" (the zh-hant in the URL indicates Traditional Chinese).

data = source.fit(path="https://www.christianitytoday.com/ct/2023/june-web-only/same-sex-attraction-not-threat-zh-hant.html", dtype="url", chunk_size=512, chunk_overlap=0)
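
To make chunk_size and chunk_overlap concrete, here is a minimal sketch of fixed-size chunking. It is an illustration under stated assumptions: it counts characters, whereas the library's own splitter may count tokens and respect sentence boundaries, and chunk_text is a hypothetical helper, not part of beyondllm.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 0) -> List[str]:
    """Split text into windows of chunk_size characters (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    # Each new window starts (chunk_size - chunk_overlap) characters after the last.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


sample = "字" * 1200  # stand-in for the article text, not real data
chunks = chunk_text(sample, chunk_size=512, chunk_overlap=0)
print([len(c) for c in chunks])
```

With chunk_overlap=0, as in the call above, consecutive chunks share no text, so a sentence that straddles a boundary is split between two chunks. A small overlap trades some duplication for more context at the edges.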

Embedding model

We use intfloat/multilingual-e5-large, a multilingual embedding model hosted on Hugging Face, so that Chinese text and queries are embedded in the same vector space.

embed_model = embeddings.HuggingFaceEmbeddings(model_name="intfloat/multilingual-e5-large")

Auto retriever to retrieve documents

retriever = retrieve.auto_retriever(data, embed_model=embed_model, type="normal", top_k=4)
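
Conceptually, a retriever with top_k=4 embeds the query, scores it against every chunk embedding, and returns the four best matches. The sketch below illustrates that idea on toy vectors. The vectors, the top_k_chunks helper, and the use of cosine similarity are assumptions for illustration, not a description of beyondllm internals.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    k: int = 4,
) -> List[Tuple[int, float]]:
    """Return (chunk index, score) pairs for the k most similar chunks."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-dimensional "embeddings"; real ones have hundreds of dimensions.
toy_chunks = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
hits = top_k_chunks([1.0, 0.0], toy_chunks, k=4)
print([index for index, _ in hits])
```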

Large Language Model

Define Custom System Prompt
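
The prompt used for this step is not shown on this page. As a stand-in, a system prompt for this use case might tell the model to answer in Chinese and only from the retrieved context. The wording below is an assumption for illustration, not the author's prompt.

```python
# Hypothetical system prompt; the wording is illustrative only.
system_prompt = (
    "You are an AI assistant that answers the user's question in Chinese, "
    "using only the provided CONTEXT. "
    "If the answer is not in the context, say that you do not know."
)
print(system_prompt)
```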

Run Generator Model
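
The code for this step is missing from the page. A minimal sketch, assuming the llms.ChatOpenAIModel and generator.Generate interfaces from the BeyondLLM quickstart, wraps the LLM and the generator in one function. answer_question is a hypothetical helper. The import is deferred so that defining the function requires neither the package nor an API key.

```python
def answer_question(question: str, retriever, system_prompt: str) -> str:
    """Run the RAG pipeline for one question (hypothetical helper)."""
    # Deferred import: beyondllm and the API key are only needed when called.
    from beyondllm import llms, generator

    # Assumed to pick up the OPENAI_API_KEY set earlier in this guide.
    llm = llms.ChatOpenAIModel()
    pipeline = generator.Generate(
        question=question,
        system_prompt=system_prompt,
        retriever=retriever,
        llm=llm,
    )
    return pipeline.call()
```

Called with a Chinese question, the retriever built earlier, and a system prompt string, this returns the model's answer as text, which can then be printed.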

Output
