Small Language Model from Scratch for AI-Terminal

Data Loading
Used the NL2Bash dataset.
Used the "Advanced Bash-Scripting Guide", a 900-page book about Bash: https://tldp.org/LDP/abs/html/
I downloaded this as text with wget.

Tokenization
A Byte-Pair Encoding (BPE) tokenizer is used here.
https://www.youtube.com/watch?v=HEikzVL-lZU&t=44s
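
To make the Byte-Pair Encoding step concrete, here is a minimal byte-level BPE sketch in Python. It is not this project's tokenizer code. The function names (`train_bpe`, `encode`, `decode`), the sample corpus, and the merge count are assumptions chosen for illustration. The idea is to start from raw UTF-8 bytes and repeatedly replace the most frequent adjacent pair of token ids with a new id.

```python
from collections import Counter


def get_pair_counts(ids):
    """Count how often each adjacent pair of token ids occurs."""
    return Counter(zip(ids, ids[1:]))


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges; returns {(id_a, id_b): new_id}."""
    ids = list(text.encode("utf-8"))  # ids 0..255 are the raw bytes
    merges = {}
    for step in range(num_merges):
        counts = get_pair_counts(ids)
        if not counts:
            break  # nothing left to merge
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        new_id = 256 + step
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges


def encode(text, merges):
    """Tokenize text by replaying the learned merges in training order."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():  # dicts preserve insertion order
        ids = merge(ids, pair, new_id)
    return ids


def decode(ids, merges):
    """Map token ids back to text by expanding each merge into bytes."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (a, b), new_id in merges.items():
        vocab[new_id] = vocab[a] + vocab[b]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")


# Hypothetical toy corpus standing in for the bash text used in training.
corpus = "ls -la\nls -l /tmp\ngrep -r foo .\nls -la /var\n"
merges = train_bpe(corpus, num_merges=10)
tokens = encode("ls -la /tmp", merges)
print(len(tokens), decode(tokens, merges))
```

Working on bytes rather than characters means any input string can be encoded without an unknown-token fallback, at the cost of a fixed base vocabulary of 256 ids before any merges are learned.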