Conversation
experiments/run_slicegpt.py
Outdated
import torch
import wandb
from transformers.models.llama.modeling_llama import LlamaConfig
I think we might be missing one abstraction. We shouldn't need any model-specific imports here. hf_utils.get_model_and_tokenizer abstracts the model type away; we want the same for saving HF-compatible sliced models. This save abstraction would also remove the need for the Sliced<Model>ForCausalLM imports here.
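One way such a save/load abstraction could look (a sketch only; `register_sliced_model` and `get_sliced_model_class` are hypothetical names, not the repository's actual API) is a registry keyed on the HF `model_type` string, so callers never import model-specific sliced classes directly:

```python
# Hypothetical sketch: a registry in hf_utils mapping an HF model_type
# string to its sliced class, so experiment scripts stay model-agnostic.
# The Sliced*ForCausalLM classes below are stand-ins for the real ones.

_SLICED_MODEL_REGISTRY = {}

def register_sliced_model(model_type):
    """Decorator that records the sliced class for a given model_type."""
    def wrap(cls):
        _SLICED_MODEL_REGISTRY[model_type] = cls
        return cls
    return wrap

def get_sliced_model_class(model_type):
    """Look up the sliced class for a model type, e.g. 'llama' or 'phi'."""
    try:
        return _SLICED_MODEL_REGISTRY[model_type]
    except KeyError:
        raise ValueError(f"No sliced model registered for type {model_type!r}")

@register_sliced_model("llama")
class SlicedLlamaForCausalLM:  # stand-in for the real class
    pass

@register_sliced_model("phi")
class SlicedPhiForCausalLM:  # stand-in for the real class
    pass
```

With this in place, a script would call `get_sliced_model_class(config.model_type)` instead of importing a concrete `Sliced<Model>ForCausalLM`.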
Since there is a single intermediate size, we can make this explicit by using the type ConstSlicingScheduler in the interface instead of SlicingScheduler. Otherwise, we can add support for all slicing schedulers and keep the base type SlicingScheduler. I don't mind how this is done - in this PR or a follow-up PR. We should update the PR title to reflect the decision and the contents of the PR.
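To illustrate the distinction being discussed (this is a minimal sketch; the class and method names are assumptions for discussion, not the repository's actual definitions): a `ConstSlicingScheduler` assigns every layer the same sliced dimension, whereas a general `SlicingScheduler` may vary it per layer.

```python
# Illustrative sketch only: a constant-size scheduler vs. the general base.

class SlicingScheduler:
    """Base type: maps each layer index to its sliced hidden dimension."""
    def get_embedding_dimensions(self) -> dict[int, int]:
        raise NotImplementedError

class ConstSlicingScheduler(SlicingScheduler):
    """Every layer is sliced to the same intermediate size."""
    def __init__(self, hidden_size: int, num_layers: int):
        self.hidden_size = hidden_size
        self.num_layers = num_layers

    def get_embedding_dimensions(self) -> dict[int, int]:
        # Constant schedule: identical dimension for all layers.
        return {i: self.hidden_size for i in range(self.num_layers)}
```

Typing the interface against `ConstSlicingScheduler` documents the single-size assumption; keeping `SlicingScheduler` leaves the door open to per-layer schedules.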
pashminacameron left a comment
Hmm :/ Tests are not working due to import issues.
…icrosoft/TransformerCompression into liana/make_model_HF_compatible
I would like to hold off merging this into main for a bit. I will work on this more (after the smaller fixes) and we can re-review.
Any update on when the changes will be merged?
    parallel_blocks=True,
)
sliced_model = SlicedPhiForCausalLM.from_pretrained(
The sliced_model can be of type SlicedPhiForCausalLM or SlicedLlamaForCausalLM at this point. Debugging..
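One way to resolve which sliced class to load (a sketch under the assumption that a sliced checkpoint keeps a standard HF `config.json` with a `model_type` field; the helper name is hypothetical) is to inspect the checkpoint before calling `from_pretrained`:

```python
# Hedged sketch: read the checkpoint's config.json to decide whether the
# saved model is a Phi or a Llama, instead of hard-coding the class.
import json
import pathlib

def detect_model_type(checkpoint_dir: str) -> str:
    """Return the HF model_type recorded in a checkpoint's config.json."""
    config_path = pathlib.Path(checkpoint_dir) / "config.json"
    config = json.loads(config_path.read_text())
    return config["model_type"]  # e.g. "phi" or "llama"
```

The result could then select between `SlicedPhiForCausalLM` and `SlicedLlamaForCausalLM`.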
model_adapter.use_cache = False
- tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True, token=token, local_files_only=local_model)
+ tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True, token=token, local_files_only=local_model)
This change breaks loading of local models.
This should probably remain model_path.
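The fix could be made explicit with a tiny helper (illustrative only; `tokenizer_source` is a hypothetical name): when `local_files_only` is set, the tokenizer must be loaded from the local checkpoint path, not the hub model name.

```python
# Hypothetical helper: pick the identifier to pass to
# AutoTokenizer.from_pretrained depending on whether the model is local.
def tokenizer_source(model_name: str, model_path: str, local_model: bool) -> str:
    """Local checkpoints load from model_path; hub models from model_name."""
    return model_path if local_model else model_name
```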
We will work on merging these changes by the end of this week.
Hi, please let me know when these changes will be merged.
Hi @pashminacameron and @LianaMikael, is there a timeline for merging these changes to main?
This PR adds the implementations for sliced Phi and Llama models to make it easy to save and load sliced models.
The models can be initialized with a given scheduler (or no scheduler for zero sparsity) and support save_pretrained and from_pretrained methods like standard HF models.
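The intended round trip can be sketched with a stand-in class (this is not the PR's implementation, just an illustration of the save_pretrained/from_pretrained contract: configuration written to disk on save, reconstructed on load):

```python
# Illustrative stand-in showing the save/load contract the sliced models
# are described as following; the real classes also persist weights.
import json
import pathlib

class TinySlicedModel:
    def __init__(self, hidden_size: int):
        self.hidden_size = hidden_size

    def save_pretrained(self, save_dir: str) -> None:
        """Write the model's configuration to save_dir, HF-style."""
        d = pathlib.Path(save_dir)
        d.mkdir(parents=True, exist_ok=True)
        (d / "config.json").write_text(json.dumps({"hidden_size": self.hidden_size}))

    @classmethod
    def from_pretrained(cls, save_dir: str) -> "TinySlicedModel":
        """Rebuild the model from a previously saved directory."""
        cfg = json.loads((pathlib.Path(save_dir) / "config.json").read_text())
        return cls(cfg["hidden_size"])
```

A saved sliced model can then be reloaded with the same two calls users already know from standard HF models.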