The decoder-only model with RoPE, SwiGLU, and a BPE tokenizer is in assignment/assianment1-basics/cs336_basics. I only ran one experiment on my Mac because I do not ...
I completed 18.01SC from MIT OpenCourseWare, and this repo contains all of my notes from the lectures and from the chapters I studied in the textbook I used (Edwards and Penney, Calculus: Early Transcendentals, ...