PiSchool/trallie

Trallie - Transfer Learning for Information Extraction

License: Apache 2.0 · Python: 3.10

Trallie (“Transfer learning for information extraction”) improves Information Extraction (IE) for search over textual asset descriptions. It does away with costly human annotation by leveraging the ability of LLMs to follow natural language (NL) guidelines, understand labels, and manipulate natural language much as they manipulate code.
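To make the idea of annotation-free extraction concrete, here is a minimal sketch of how plain-language label guidelines can replace annotated examples. It is not Trallie's API: the function names `build_extraction_prompt` and `parse_extraction`, the prompt wording, the example schema, and the stand-in model reply are all hypothetical and chosen only for illustration. No LLM is called; the reply is hard-coded.

```python
import json


def build_extraction_prompt(description: str, schema: dict[str, str]) -> str:
    """Build a zero-shot prompt: each label is explained in plain language,
    so no annotated examples are required. (Hypothetical helper.)"""
    field_lines = "\n".join(
        f'- "{name}": {guideline}' for name, guideline in schema.items()
    )
    return (
        "Extract the following fields from the asset description.\n"
        "Return a JSON object with exactly these keys; "
        "use null when a value is absent.\n\n"
        f"Fields:\n{field_lines}\n\n"
        f"Description:\n{description}\n"
    )


def parse_extraction(raw_reply: str, schema: dict[str, str]) -> dict:
    """Keep only the schema keys from a JSON reply; missing keys become None.
    (Hypothetical helper.)"""
    empty = {name: None for name in schema}
    # Models often wrap JSON in extra text, so take the outermost braces.
    start, end = raw_reply.find("{"), raw_reply.rfind("}")
    if start == -1 or end < start:
        return empty
    try:
        data = json.loads(raw_reply[start:end + 1])
    except json.JSONDecodeError:
        return empty
    return {name: data.get(name) for name in schema}


# Hypothetical schema: label names paired with natural language guidelines.
schema = {
    "manufacturer": "Company that produced the asset",
    "year": "Four-digit year of manufacture",
}
prompt = build_extraction_prompt("Used Komatsu excavator, built 2015.", schema)

# Stand-in for a model reply; a real reply would come from an LLM.
reply = 'Result: {"manufacturer": "Komatsu", "year": 2015, "color": "yellow"}'
record = parse_extraction(reply, schema)
print(record)
```

The resulting record contains only the keys named in the schema, which is what allows structured search queries to be run over the extracted fields.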

Problem: Natural language descriptions of assets and resources are here to stay, both as legacy data and as flexible catch-alls. Clustering and categorizing them so that structured search queries can be run traditionally requires information extraction (IE); RAG and dense embedding matching offer only partial solutions. IE is often bottlenecked by costly human annotation, even when annotation is needed only to provide few-shot examples for each category.

Ambition: Trallie brings the transfer learning and world understanding afforded by LLMs to make information extraction agile. We deliver multilingual, IE-fine-tuned checkpoints of various open model architectures and, for reproducibility, our full fine-tuning recipe, including prompt templates.

Impact: Because Trallie relies on transfer learning and natural language input, it is well suited to legacy and low-resource scenarios. Expected benefits are better discoverability of hidden asset collections, a greater plurality of sources through easier access to search tools, and improved trust and privacy.

Team: At Pi School, our experience in rapid AI prototyping, acquired over more than 100 AI projects, gives us an advantage in exploiting the rapidly moving state of the art.

Getting Started

  1. Install the "trallie" package.
       pip install trallie
  2. Run the demo script from the repository.
       python main_pipeline.py
  3. Alternatively, you can use the Demo Notebook on Google Colab.
