AI2’s new model leverages a sparse mixture-of-experts (MoE) architecture: it has 7 billion total parameters but activates only 1 billion parameters per input token.
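To illustrate the general idea (this is a minimal sketch, not AI2's actual implementation, and the layer sizes and expert counts below are illustrative assumptions rather than the model's real configuration), a sparse MoE layer routes each token to only a few of many expert feed-forward networks, so the parameters exercised per token are a small fraction of the layer's total:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Minimal top-k mixture-of-experts layer (illustrative sketch only).

    Each token is routed to just `top_k` of `num_experts` feed-forward experts,
    which is how a model can hold many parameters in total while using only a
    small active subset per input token.
    """

    def __init__(self, d_model=512, d_hidden=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router scores every expert for every token.
        self.router = nn.Linear(d_model, num_experts)
        # Each expert is an ordinary feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (batch, seq_len, d_model)
        scores = self.router(x)                         # (B, T, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)            # normalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e                 # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

# Usage: 4 tokens pass through the layer, but only 2 of the 8 experts run per token.
layer = SparseMoELayer()
y = layer(torch.randn(1, 4, 512))
print(y.shape)  # torch.Size([1, 4, 512])
```

The same routing principle, scaled up, is what lets a 7-billion-parameter MoE model spend only about 1 billion parameters of compute on each token.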