Implementing-Mamba-Network-from-Scratch

This repository implements linear sequence modeling architectures for both study and research. In this project, I will use JAX and PyTorch to build models, from basic RNNs up to the Mamba model, to solve Speech Enhancement and Text Classification tasks and then compare their performance. I will also try to build the Transformer architecture; however, this type of model is not yet within the scope of this project. Therefore, if I cannot finish it on time, I will use the predefined architecture in PyTorch.

Introduction.

Sequence modeling remains a fundamental part of Artificial Intelligence, as it allows AI to capture the structure of sequential data. While this field has various applications, the most well-known task today is Language Modeling.

Recently, a powerful architecture called the Transformer has been dominating this field. Its attention mechanism allows the model to capture complex relationships between tokens within the text. However, the attention mechanism faces a critical problem: its computational cost grows quadratically with sequence length. This limits the Transformer's ability to handle longer sequences.
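To make the quadratic growth concrete, here is a minimal sketch of single-head softmax attention in plain Python. It is not code from this repository: the function name `attention` and the `score_count` counter are hypothetical and exist only to illustrate the point. Each of the T queries is compared against all T keys, so the number of scores is T * T.

```python
import math

def attention(queries, keys, values):
    """Single-head softmax attention over lists of equal-length vectors.

    Illustrative only: returns the outputs and the number of
    query-key scores that were computed.
    """
    d = len(queries[0])
    outputs = []
    score_count = 0  # hypothetical counter, used to show the T * T growth
    for q in queries:
        # One scaled dot-product score per key: T scores for each query.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        score_count += len(scores)
        # Numerically stable softmax over the scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs, score_count

for T in (4, 8, 16):
    x = [[float(t), 1.0] for t in range(T)]  # toy sequence of length T
    _, n = attention(x, x, x)
    print(T, n)  # n equals T * T
```

Doubling the sequence length quadruples the number of scores, which is the cost that the linear models in this project try to avoid.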

As a result, much research is now focusing on linear models, which have shown performance comparable to the Transformer while maintaining a low computational cost.
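As a rough illustration of why such models are cheap, the sketch below runs a scalar linear recurrence h_t = a * h_{t-1} + b * x_t in plain Python. The function name `linear_recurrence` and the fixed scalars `a` and `b` are assumptions made for this example; the actual models in this project use learned, higher-dimensional parameters, and Mamba additionally makes them depend on the input.

```python
def linear_recurrence(xs, a=0.9, b=0.1):
    """Run h_t = a * h_{t-1} + b * x_t over a scalar sequence.

    `a` and `b` are hypothetical fixed scalars standing in for
    the learned parameters of a real model.
    """
    h = 0.0  # initial hidden state
    states = []
    for x in xs:
        # One constant-cost update per time step, so the total
        # work grows linearly with the sequence length.
        h = a * h + b * x
        states.append(h)
    return states

states = linear_recurrence([1.0, 2.0, 3.0, 4.0])
print(len(states))  # one hidden state per input step
```

Because each step only needs the previous hidden state, a sequence of length T costs T updates rather than the T * T comparisons that attention requires.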

This repository focuses on implementing these models for educational purposes. Throughout the implementation, I will try to explain the concepts behind these models as simply as possible. The repository is implemented in the simplest way possible; more efficient implementations will be placed in a separate repository.

II. Traditional RNNs

III. Linear Recurrence Unit (LRU)

IV. Structured State Space sequence model (S4)

V. Mamba
