@Joao-Pedro-Cabral Joao-Pedro-Cabral commented Jan 20, 2026

The current implementation of vcompress is completely wrong.
I propose a "solution" that works for VREG_W >= 256. Since I only work with VREG_W >= 256, I am not trying to resolve the VREG_W = 128 case for now.

Bugs of the current implementation:

  • The data and mask registers are swapped;
  • The instruction completes much later than the Pipeline Controller expects;
  • There are synchronization bugs with the next instruction;
  • Extra cycles are spent shifting the result elements within the result buffer.

The last three are similar to the reduction bug, and their fix depends on this PR: #128.
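
For reference on the first bug: in the RISC-V V specification, `vcompress.vm vd, vs2, vs1` takes its data from `vs2` and its mask from `vs1`, and packs the selected elements contiguously at the start of `vd`. The sketch below is only a behavioural model of that definition, not the RTL in this PR. The function name `vcompress_model` is hypothetical, the mask is given as a list of 0/1 values rather than a packed mask register, and leaving the remaining elements undisturbed is an assumption (it is one of the policies the specification allows).

```python
def vcompress_model(vs2, vs1_mask, vd_old):
    """Behavioural model of vcompress.vm vd, vs2, vs1 (hypothetical helper).

    vs2      -- source data elements (length = vl)
    vs1_mask -- one 0/1 mask bit per element of vs2
    vd_old   -- previous contents of vd; elements that are not written
                are left undisturbed (an assumption of this sketch)
    Returns the new vd contents and the number of elements written.
    """
    if not (len(vs2) == len(vs1_mask) == len(vd_old)):
        raise ValueError("vs2, vs1_mask and vd_old must have the same length")

    vd = list(vd_old)
    out = 0  # next free position in vd
    for elem, bit in zip(vs2, vs1_mask):
        if bit:
            vd[out] = elem  # selected elements are packed contiguously
            out += 1
    return vd, out


data = [10, 20, 30, 40, 50, 60, 70, 80]
mask = [1, 0, 1, 1, 0, 0, 0, 1]
result, count = vcompress_model(data, mask, [0] * 8)
print(result, count)
```

Swapping the two operands in such a model (using the data as the mask and the mask as the data) generally gives a different result, so a comparison against a model like this is one way to check the operand order in simulation.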

"Solution":

  • The Pipeline Controller waits one extra “vreg cycle” (worst-case scenario) before asserting done;
  • The Unit Wrapper internally determines how many flushes/shifts it must perform for the current case.

@ParkerJones567
Contributor

This issue might still be present in vicuna2.0; please test with the current development branch.
