The code is based on the Hugging Face Transformers repository. We modified src/transformers/models/llama/modeling_llama.py to integrate the MoE-fication process. The main ...