SumRAG

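We propose a technique that combines compression through summary data with LLM-based retrieval, so that more effective data search leads to more accurate answers.
In addition, its hierarchical search pinpoints a smaller set of data to pass to generation than existing approaches do, which yields accurate results.
For the detailed development process and motivation, see this document.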
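
The hierarchical idea described above can be illustrated with a toy sketch. This is not SumRAG's implementation: the corpus, the `overlap` scoring function, and `hierarchical_retrieve` are hypothetical names invented for illustration, and simple word overlap stands in for the LLM or embedding scoring that the library uses. The first stage ranks documents by their summaries only; the second stage ranks sections inside the selected documents, so only a small slice of the data reaches generation.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical toy corpus: each document has a short summary and titled sections.
CORPUS: Dict[str, dict] = {
    "hangul": {
        "summary": "history and structure of the hangul writing system",
        "sections": {
            "who created hangul": "Hangul was created by King Sejong.",
            "structure of hangul": "Hangul is composed of consonants and vowels.",
        },
    },
    "refunds": {
        "summary": "refund policy for spoiled food and damaged goods",
        "sections": {
            "spoiled food refund": "Contact the seller's customer center for a refund.",
            "damaged goods": "Report damaged goods within seven days.",
        },
    },
}


def tokens(text: str) -> Set[str]:
    """Lowercased word set of a text."""
    return set(re.findall(r"\w+", text.lower()))


def overlap(query: str, text: str) -> int:
    """Stand-in relevance score: number of shared words (an assumption, not SumRAG's scoring)."""
    return len(tokens(query) & tokens(text))


def hierarchical_retrieve(
    query: str,
    corpus: Dict[str, dict],
    top_docs: int = 1,
    top_sections: int = 1,
) -> List[Tuple[str, str, str]]:
    """Return (document, section title, section body) triples for the best sections."""
    # Stage 1: rank whole documents using only their summaries.
    ranked_docs = sorted(
        corpus,
        key=lambda name: overlap(query, corpus[name]["summary"]),
        reverse=True,
    )[:top_docs]

    # Stage 2: rank sections, but only inside the documents selected in stage 1.
    candidates = [
        (name, title, body)
        for name in ranked_docs
        for title, body in corpus[name]["sections"].items()
    ]
    candidates.sort(key=lambda c: overlap(query, c[1] + " " + c[2]), reverse=True)
    return candidates[:top_sections]


if __name__ == "__main__":
    print(hierarchical_retrieve("Where do I ask about a refund for spoiled food?", CORPUS))
```

Because stage 2 never looks at sections of documents that were filtered out in stage 1, the number of passages scored and forwarded to the generator stays small even when the corpus grows.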

How to install

pip install git+https://github.com/kojunseo/SumRAG

How to use

1. Prepare data

-folderA
 ├---Document 1
 ├---Document 2
 └---Document 3

Document 1 format: the title follows #, and the actual content follows \n. Any number of \n may be used within the content.

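#What is Hangul?\nHangul is the writing system of the Republic of Korea...
#Who created Hangul?\nCreator of Hangul: King Sejong
#Structure of Hangul\nHangul consists of the following\nConsonants: N\nVowels: N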

See the example folder for a sample dataset.
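
To make the file format above concrete, here is a minimal parsing sketch. It assumes that the \n in the description stands for a real newline in the file and that every line beginning with # opens a new section; `parse_document` is a hypothetical helper written for illustration, not a function provided by SumRAG.

```python
from typing import List, Optional, Tuple


def parse_document(text: str) -> List[Tuple[str, str]]:
    """Split a document into (title, content) pairs.

    Assumption: a line starting with '#' is a title, and all following
    lines up to the next '#' line belong to that title's content.
    """
    sections: List[Tuple[str, str]] = []
    title: Optional[str] = None
    body: List[str] = []

    for line in text.splitlines():
        if line.startswith("#"):
            # Close the previous section before starting a new one.
            if title is not None:
                sections.append((title, "\n".join(body).strip()))
            title = line[1:].strip()
            body = []
        elif title is not None:
            # Content lines are kept as-is; lines before the first title are ignored.
            body.append(line)

    if title is not None:
        sections.append((title, "\n".join(body).strip()))
    return sections


sample = (
    "#What is Hangul?\n"
    "Hangul is the writing system of Korea...\n"
    "#Structure of Hangul\n"
    "Hangul consists of the following\n"
    "Consonants: N\n"
    "Vowels: N\n"
)

if __name__ == "__main__":
    for section_title, content in parse_document(sample):
        print(repr(section_title), repr(content))
```

A check like this can be run over each file in the data folder before the LLM-based preprocessing step, to catch files that contain no title line at all.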

2. Load Data & Save the data

Preprocess and load from folderA

from SumRAG import LLMs
from SumRAG.documents import SumInput

documents = SumInput.load_from_files("./folderA", LLMs.gpt3_5)

Save the Document

documents.save("./folderA_doc")

Load from saved data

documents.load("./folderA_doc")

3. Define the Retriever and Generator, and ask a question

For more details about each module, refer to its docstring.

from SumRAG import LLMs, EMBs
from SumRAG.retrieve import HierLLMRetriever, HierEMBMixRetriever, LLMRetriever, EMBRetriever
from SumRAG.generation import BasicGenerator

# choose one
retriever = LLMRetriever(llm=LLMs.gpt3_5, s_input=documents)
retriever = EMBRetriever(emb=EMBs.openai, s_input=documents)
retriever = HierLLMRetriever(llm=LLMs.gpt3_5, s_input=documents)
retriever = HierEMBMixRetriever(llm=LLMs.gpt3_5, emb=EMBs.openai, s_input=documents)

generator = BasicGenerator(llm=LLMs.gpt3_5, retriever_fn=retriever)
# Query in Korean: "Where should I ask about a refund for spoiled food?"
print(generator("상한 식품의 환불은 어디에 물어봐야 하나요?"))

Full Example Code

from SumRAG import LLMs, EMBs
from SumRAG.retrieve import HierLLMRetriever, LLMRetriever, EMBRetriever
from SumRAG.generation import BasicGenerator
from SumRAG.documents import SumInput

documents = SumInput.load_from_files("./folderA", LLMs.gpt3_5)
documents.save("./folderA_doc")
# documents.load("./folderA_doc")

retriever = HierLLMRetriever(llm=LLMs.gpt3_5, s_input=documents)
generator = BasicGenerator(llm=LLMs.gpt3_5, retriever_fn=retriever)

# Query in Korean: "Where should I ask about a refund for spoiled food?"
print(generator("상한 식품의 환불은 어디에 물어봐야 하나요?"))
