I’m back! I haven’t posted in a while, so I figured I would come back here. Recently, thanks to the help of the TensorFlow Research Cloud, I have been able to finetune Llama 2 on new languages, like French, Spanish, Italian, and Hindi, using the LoRA algorithm.

I naively tried to use Levanter with Python 3.8 to train the model, but I quickly learned that it needed Python 3.9 to work. So I used the Deadsnakes PPA to install Python 3.9. Then, when I tried to install Levanter, I found that my pip installation was broken. I tried for hours to fix it, to no avail. I gave up and spent a week working on other things, until I found a Stack Overflow post that used a virtual environment. That was my eureka moment. I SSHed into the TPU VM, installed python3-venv from Deadsnakes (I think), and created a new environment. Now pip finally worked and I could install Levanter.

However, I ran into another problem. My system kept running out of memory (OOMing) whenever I tried to finetune Phind 34B (we will get to Llama 2 later). I filed an issue on the Levanter repo and turned off my computer for the night. When I woke up the next morning, I found that David Hall (the author of the repo) had suggested that I lower the batch size. I lowered it to 4 (1 per device × 4 TPU devices) and it still OOMed. After a few weeks of back and forth, however, I was able to get it running and finetune Phind 34B to write Go code.

Later, I was watching an episode of the Lex Fridman podcast when I heard Yann LeCun (the guest speaker) say that the IT firm Infosys was finetuning Llama 2 70B on over 20 languages. This gave me the idea to do a similar thing on a smaller scale with the TPU. Later that day, I did just that, finetuning Llama 2 on over four languages. You can find my models here.

I want to thank David Hall, Stanford’s CRFM research lab, the Llama 2 team, and the TensorFlow Research Cloud for making this possible.
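
For anyone who hits the same broken-pip problem, here is a minimal sketch of the virtual environment fix described above, written with Python's standard `venv` module. It is not the exact set of commands I ran on the TPU VM. The helper names `create_env` and `pip_install` are hypothetical, and it assumes the script is run with the Python 3.9 interpreter and that the matching venv package is installed, since `with_pip=True` fails without it on Debian and Ubuntu.

```python
import subprocess
import sys
import venv
from pathlib import Path


def create_env(env_dir: Path, with_pip: bool = True) -> Path:
    """Create an isolated environment and return the path to its Python.

    Hypothetical helper for illustration only.
    """
    # clear=True empties the target directory first, so leftovers from an
    # earlier failed attempt cannot carry over into the new environment.
    builder = venv.EnvBuilder(with_pip=with_pip, clear=True)
    builder.create(env_dir)

    # The interpreter lives in "bin" on Linux/macOS and "Scripts" on Windows.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def pip_install(env_python: Path, package: str) -> None:
    """Install a package with the environment's own pip.

    Running pip as "python -m pip" through the environment's interpreter
    avoids whatever system-wide pip happens to be first on PATH.
    """
    subprocess.run(
        [str(env_python), "-m", "pip", "install", package],
        check=True,  # raise CalledProcessError if the install fails
    )
```

Usage would look something like `env_python = create_env(Path.home() / "levanter-env")` followed by `pip_install(env_python, "levanter")`. The directory name is just an example, and the package name assumes you install Levanter from PyPI rather than from a clone of the repo.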