GLiNER

Install

gem "gliner"

Usage

Entities

require "gliner"

Gliner.configure do |config|
  config.threshold = 0.2
  # By default, the gem downloads the default model to .cache/
  # Or set a local path explicitly:
  # config.model = "/path/to/gliner2-multi-v1"
  config.variant = :fp16
end

text = "Apple CEO Tim Cook announced iPhone 15 in Cupertino yesterday."
labels = ["company", "person", "product", "location"]

model = Gliner[labels]
entities = model[text]

pp entities["person"]
# => [#<data Gliner::Entity ...>]

entities["person"].first.text
# => "Tim Cook"

entities["person"].first.probability
# => 92.4

entities["person"].first.offsets
# => [10, 18]
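
The offsets can be used to recover the matched span from the original text. The following is a minimal sketch that does not call the gem; it assumes the pair is a character start index and an exclusive end index, which matches the example output above but is not stated by the README. `slice_span` is a hypothetical helper name.

```ruby
# Hypothetical helper, not part of the gem.
# Assumes offsets are [start, exclusive_end] character positions.
def slice_span(text, offsets)
  start_offset, end_offset = offsets
  # A three-dot range excludes the end index.
  text[start_offset...end_offset]
end

text = "Apple CEO Tim Cook announced iPhone 15 in Cupertino yesterday."
puts slice_span(text, [10, 18])
```

If the offsets turn out to be byte-based or inclusive, the range in `slice_span` would need adjusting.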

You can also pass per-entity configs:

labels = {
  email: { description: "Email addresses", dtype: "list", threshold: 0.9 },
  person: { description: "Person names", dtype: "str" }
}

model = Gliner[labels]
entities = model["Email John Doe at john@example.com.", threshold: 0.5]

entities["person"].text
# => "John Doe"

entities["email"].map(&:text)
# => ["john@example.com"]
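
The example above mixes three thresholds: the global `config.threshold`, the `threshold:` passed at call time, and the per-label `threshold` in the config Hash. The README does not say which one wins, so the sketch below only illustrates one plausible resolution order (per-label, then per-call, then global). `effective_threshold` is a hypothetical helper, not the gem's API.

```ruby
# Hypothetical helper, not part of the gem.
# Assumed precedence: per-label threshold, then call-level, then global.
def effective_threshold(label_config, call_threshold: nil, global_threshold: 0.2)
  label_config[:threshold] || call_threshold || global_threshold
end

labels = {
  email: { description: "Email addresses", dtype: "list", threshold: 0.9 },
  person: { description: "Person names", dtype: "str" }
}

labels.each do |name, config|
  puts "#{name}: #{effective_threshold(config, call_threshold: 0.5)}"
end
```

Check the gem's source before relying on this order.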

Classification

model = Gliner.classify[
  { sentiment: %w[positive negative neutral] }
]

result = model["This laptop has amazing performance but terrible battery life!"]

pp result

# => { sentiment: #<data Gliner::Label ...> }

result["sentiment"].label
# => "negative"

result["sentiment"].probability
# => 87.1

Multiple classification tasks:

text = "Breaking: Tech giant announces major layoffs amid market downturn"

tasks = {
  sentiment: %w[positive negative neutral],
  urgency: %w[high medium low],
  category: { labels: %w[tech finance politics sports], multi_label: false }
}

results = Gliner.classify[tasks][text]

results.transform_values { |value| value.label }
# => { sentiment: "negative", urgency: "high", category: "tech" }
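
The tasks Hash above accepts two forms: a plain list of labels, or a Hash with `labels` and `multi_label`. As a rough illustration of how both forms can be reduced to one shape, here is a sketch that does not use the gem. `normalize_task` is a hypothetical name, and the `multi_label: false` default for the list form is an assumption.

```ruby
# Hypothetical normalizer, not the gem's code.
# Assumes a bare label list means single-label classification.
def normalize_task(spec)
  case spec
  when Array
    { labels: spec.map(&:to_s), multi_label: false }
  when Hash
    {
      labels: Array(spec.fetch(:labels)).map(&:to_s),
      multi_label: spec.fetch(:multi_label, false)
    }
  else
    raise ArgumentError, "unsupported task spec: #{spec.inspect}"
  end
end

tasks = {
  urgency: %w[high medium low],
  category: { labels: %w[tech finance politics sports], multi_label: false }
}

pp tasks.transform_values { |spec| normalize_task(spec) }
```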

Structured extraction

text = "iPhone 15 Pro Max with 256GB storage, A17 Pro chip, priced at $1199."

structure = {
  product: [
    "name::str::Full product name and model",
    "storage::str::Storage capacity",
    "processor::str::Chip or processor information",
    "price::str::Product price with currency"
  ]
}

result = Gliner[structure][text]
product = result.fetch("product").first

pp result

product["name"].text
# => "iPhone 15 Pro Max"

product["storage"].text
# => "256GB"

product["processor"].text
# => "A17 Pro"

product["price"].text
# => "1199"

Choices can be included in field specs:

result = Gliner[{ order: ["status::[pending|processing|shipped]::str"] }]["Status: shipped"]

result.fetch("order").first["status"].text
# => "shipped"
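
The field specs used in this section are `::`-separated strings made of a name followed by a dtype, an optional `[a|b|c]` choice list, and an optional description. To make that format concrete, here is a minimal parsing sketch. It is not the gem's parser: `FieldSpec`, `DTYPES`, and `parse_field_spec` are hypothetical names, and the set of dtypes is assumed from the `"str"` and `"list"` values that appear in this README.

```ruby
# Hypothetical sketch, not the gem's parser.
FieldSpec = Struct.new(:name, :dtype, :choices, :description, keyword_init: true)
DTYPES = %w[str list].freeze # assumed from the dtypes shown in this README

def parse_field_spec(spec)
  name, *rest = spec.split("::")
  field = FieldSpec.new(name: name, dtype: nil, choices: nil, description: nil)

  rest.each do |part|
    if (match = part.match(/\A\[(.*)\]\z/))
      # A bracketed part is a pipe-separated choice list.
      field.choices = match[1].split("|")
    elsif DTYPES.include?(part)
      field.dtype = part
    else
      # Anything else is treated as the free-text description.
      field.description = part
    end
  end

  field
end

pp parse_field_spec("name::str::Full product name and model")
pp parse_field_spec("status::[pending|processing|shipped]::str")
```

A description that happens to equal a dtype name would be misread by this sketch; a real parser would likely rely on position instead.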

Model files

This implementation expects a directory containing:

  • tokenizer.json
  • model.onnx, model_fp16.onnx, or model_int8.onnx
  • (optional) config.json with max_width and max_seq_len
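
Before pointing `config.model` at a directory, it can help to verify the layout listed above. The following is a sketch using only the Ruby standard library; `missing_model_files` is a hypothetical helper and not part of the gem. It creates a temporary directory purely to demonstrate the check.

```ruby
require "tmpdir"
require "fileutils"

MODEL_FILES = %w[model.onnx model_fp16.onnx model_int8.onnx].freeze

# Hypothetical check, not part of the gem: report what a model directory lacks.
# config.json is optional, so it is not checked.
def missing_model_files(dir)
  missing = []
  missing << "tokenizer.json" unless File.exist?(File.join(dir, "tokenizer.json"))
  has_model = MODEL_FILES.any? { |name| File.exist?(File.join(dir, name)) }
  missing << MODEL_FILES.join(" | ") unless has_model
  missing
end

Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, "tokenizer.json"))
  p missing_model_files(dir) # only the model file is missing at this point
  FileUtils.touch(File.join(dir, "model_int8.onnx"))
  p missing_model_files(dir) # nothing is missing now
end
```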

One publicly available ONNX export is cuerbot/gliner2-multi-v1 on Hugging Face. By default, model_fp16.onnx is used; set config.variant (or GLINER_MODEL_FILE) to override. Variants map to files as follows: :fp16 → model_fp16.onnx, :fp32 → model.onnx, :int8 → model_int8.onnx.
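
The variant-to-file mapping described above can be written as a small lookup. This is a sketch rather than the gem's implementation: `VARIANT_FILES` and `model_file_for` are hypothetical names, and the assumption that `GLINER_MODEL_FILE` takes precedence over the variant is not confirmed by the README.

```ruby
VARIANT_FILES = {
  fp16: "model_fp16.onnx",
  fp32: "model.onnx",
  int8: "model_int8.onnx"
}.freeze

# Hypothetical resolver, not the gem's code.
# Assumes GLINER_MODEL_FILE, when set and non-empty, wins over the variant.
def model_file_for(variant = :fp16, env = ENV)
  override = env["GLINER_MODEL_FILE"]
  return override unless override.nil? || override.empty?

  VARIANT_FILES.fetch(variant) do
    raise ArgumentError, "unknown variant: #{variant.inspect}"
  end
end

puts model_file_for(:int8, {})
puts model_file_for(:fp16, { "GLINER_MODEL_FILE" => "model.onnx" })
```

Passing the environment as an argument keeps the lookup easy to test without touching the real `ENV`.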

You can also configure the model source directly:

Gliner.configure do |config|
  config.model = "/path/to/model_dir"
  config.variant = :int8
end

Integration test

Downloads a public ONNX export and runs a real inference:

rake test:integration

To download the model separately (for console testing, etc.):

rake model:pull

To reuse an existing local download:

GLINER_MODEL_DIR=/path/to/model_dir rake test:integration

Console

Start an IRB session with the gem loaded:

rake console MODEL_DIR=/path/to/model_dir

If you omit MODEL_DIR, the console auto-downloads a public test model; the repository and model file can be set with GLINER_REPO_ID and GLINER_MODEL_FILE:

rake console
# or:
GLINER_REPO_ID=cuerbot/gliner2-multi-v1 GLINER_MODEL_FILE=model_fp16.onnx rake console

Or run the console script directly:

ruby -Ilib bin/console /path/to/model_dir

About

Ruby inference wrapper for the GLiNER2 ONNX model.
