
The Open Language Models List

πŸ“„ Introduction

This is a list of language models released under permissive licenses such as MIT or Apache 2.0. We use the term language model broadly here: it covers not only autoregressive models but also models trained with other objectives, such as masked language modeling (MLM).
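
As a quick illustration of that distinction, the sketch below loads one model of each kind. It assumes the Hugging Face `transformers` library is installed; the two checkpoints are examples from the table (GPT-2 under MIT, BERT under Apache 2.0), and any other causal or masked checkpoint would work the same way:

```python
from transformers import AutoModelForCausalLM, AutoModelForMaskedLM, AutoTokenizer

# Autoregressive (causal) objective: predict the next token from left context.
# "gpt2" (GPT-2-Small in the table below) is used as an example checkpoint.
causal_tokenizer = AutoTokenizer.from_pretrained("gpt2")
causal_model = AutoModelForCausalLM.from_pretrained("gpt2")

# Masked language modeling (MLM) objective: predict [MASK] tokens from
# bidirectional context. "bert-base-uncased" is the example checkpoint here.
mlm_tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm_model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
```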

This work was mostly inspired by Stella Biderman's Directory of Generative AI and The Foundation Model Development Cheatsheet. Unlike those two very comprehensive sources, however, this list is meant to be a quick, more focused reference.

Models in the table below are annotated with the following openness indicators:

  • πŸ‘‘: Model + Data + Code
  • ⭐: Model + Data
  • ⚑: Model + Code

Important

This is still a work in progress. Contributions, corrections, and feedback are very welcome!

πŸ€– Models

| Model | Parameters | Architecture | Encoder | Decoder | MoE | Year | Hugging Face | License |
|---|---|---|---|---|---|---|---|---|
| GPT-1 | 120M | Transformer | - | βœ… | - | 2018 | πŸ€— | MIT |
| BERT-Base-Cased | 110M | Transformer | βœ… | - | - | 2018 | πŸ€— | Apache 2.0 |
| BERT-Base-Uncased | 110M | Transformer | βœ… | - | - | 2018 | πŸ€— | Apache 2.0 |
| BERT-Large-Cased | 340M | Transformer | βœ… | - | - | 2018 | πŸ€— | Apache 2.0 |
| BERT-Large-Uncased | 340M | Transformer | βœ… | - | - | 2018 | πŸ€— | Apache 2.0 |
| GPT-2-Small | 124M | Transformer | - | βœ… | - | 2019 | πŸ€— | MIT |
| GPT-2-Medium | 355M | Transformer | - | βœ… | - | 2019 | πŸ€— | MIT |
| GPT-2-Large | 774M | Transformer | - | βœ… | - | 2019 | πŸ€— | MIT |
| GPT-2-XL | 1.5B | Transformer | - | βœ… | - | 2019 | πŸ€— | MIT |
| T5-SmallπŸ‘‘ | 60M | Transformer | βœ… | βœ… | - | 2019 | πŸ€— | Apache 2.0 |
| T5-BaseπŸ‘‘ | 220M | Transformer | βœ… | βœ… | - | 2019 | πŸ€— | Apache 2.0 |
| T5-LargeπŸ‘‘ | 770M | Transformer | βœ… | βœ… | - | 2019 | πŸ€— | Apache 2.0 |
| T5-3BπŸ‘‘ | 3B | Transformer | βœ… | βœ… | - | 2019 | πŸ€— | Apache 2.0 |
| T5-11BπŸ‘‘ | 11B | Transformer | βœ… | βœ… | - | 2019 | πŸ€— | Apache 2.0 |
| XLM-RoBERTa-Large | 560M | Transformer | βœ… | - | - | 2019 | πŸ€— | MIT |
| XLM-RoBERTa-Base | 250M | Transformer | βœ… | - | - | 2019 | πŸ€— | MIT |
| RoBERTa-Base | 125M | Transformer | βœ… | - | - | 2019 | πŸ€— | MIT |
| RoBERTa-Large | 355M | Transformer | βœ… | - | - | 2019 | πŸ€— | MIT |
| DistilBERT-Base-Cased | 66M | Transformer | βœ… | - | - | 2019 | πŸ€— | Apache 2.0 |
| DistilBERT-Base-Uncased | 66M | Transformer | βœ… | - | - | 2019 | πŸ€— | Apache 2.0 |
| ALBERT-Base | 12M | Transformer | βœ… | - | - | 2019 | πŸ€— | Apache 2.0 |
| ALBERT-Large | 18M | Transformer | βœ… | - | - | 2019 | πŸ€— | Apache 2.0 |
| ALBERT-XLarge | 60M | Transformer | βœ… | - | - | 2019 | πŸ€— | Apache 2.0 |
| ALBERT-XXLarge | 235M | Transformer | βœ… | - | - | 2019 | πŸ€— | Apache 2.0 |
| DeBERTa-Base | 134M | Transformer | βœ… | - | - | 2020 | πŸ€— | MIT |
| DeBERTa-Large | 350M | Transformer | βœ… | - | - | 2020 | πŸ€— | MIT |
| DeBERTa-XLarge | 750M | Transformer | βœ… | - | - | 2020 | πŸ€— | MIT |
| ELECTRA-Small-Discriminator | 14M | Transformer | βœ… | - | - | 2020 | πŸ€— | Apache 2.0 |
| ELECTRA-Base-Discriminator | 110M | Transformer | βœ… | - | - | 2020 | πŸ€— | Apache 2.0 |
| ELECTRA-Large-Discriminator | 335M | Transformer | βœ… | - | - | 2020 | πŸ€— | Apache 2.0 |
| GPT-Neo-125MπŸ‘‘ | 125M | Transformer | - | βœ… | - | 2021 | πŸ€— | MIT |
| GPT-Neo-1.3BπŸ‘‘ | 1.3B | Transformer | - | βœ… | - | 2021 | πŸ€— | MIT |
| GPT-Neo-2.7BπŸ‘‘ | 2.7B | Transformer | - | βœ… | - | 2021 | πŸ€— | MIT |
| GPT-JπŸ‘‘ | 6B | Transformer | - | βœ… | - | 2021 | πŸ€— | Apache 2.0 |
| XLM-RoBERTa-XL | 3.5B | Transformer | βœ… | - | - | 2021 | πŸ€— | MIT |
| XLM-RoBERTa-XXL | 10.7B | Transformer | βœ… | - | - | 2021 | πŸ€— | MIT |
| DeBERTa-v2-XLarge | 900M | Transformer | βœ… | - | - | 2021 | πŸ€— | MIT |
| DeBERTa-v2-XXLarge | 1.5B | Transformer | βœ… | - | - | 2021 | πŸ€— | MIT |
| DeBERTa-v3-XSmall | 22M | Transformer | βœ… | - | - | 2021 | πŸ€— | MIT |
| DeBERTa-v3-Small | 44M | Transformer | βœ… | - | - | 2021 | πŸ€— | MIT |
| DeBERTa-v3-Base | 86M | Transformer | βœ… | - | - | 2021 | πŸ€— | MIT |
| DeBERTa-v3-Large | 304M | Transformer | βœ… | - | - | 2021 | πŸ€— | MIT |
| mDeBERTa-v3-Base | 86M | Transformer | βœ… | - | - | 2021 | πŸ€— | MIT |
| GPT-NeoXπŸ‘‘ | 20B | Transformer | - | βœ… | - | 2022 | πŸ€— | Apache 2.0 |
| UL2πŸ‘‘ | 20B | Transformer | βœ… | βœ… | - | 2022 | πŸ€— | Apache 2.0 |
| YaLM⚑ | 100B | Transformer | - | βœ… | - | 2022 | πŸ€— | Apache 2.0 |
| Pythia-14MπŸ‘‘ | 14M | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Pythia-70MπŸ‘‘ | 70M | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Pythia-160MπŸ‘‘ | 160M | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Pythia-410MπŸ‘‘ | 410M | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Pythia-1BπŸ‘‘ | 1B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Pythia-1.4BπŸ‘‘ | 1.4B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Pythia-2.8BπŸ‘‘ | 2.8B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Pythia-6.9BπŸ‘‘ | 6.9B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Pythia-12BπŸ‘‘ | 12B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Cerebras-GPT-111M⭐ | 111M | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Cerebras-GPT-256M⭐ | 256M | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Cerebras-GPT-590M⭐ | 590M | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Cerebras-GPT-1.3B⭐ | 1.3B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Cerebras-GPT-2.7B⭐ | 2.7B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Cerebras-GPT-6.7B⭐ | 6.7B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Cerebras-GPT-13B⭐ | 13B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| BTLMπŸ‘‘ | 3B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Phi-1 | 1.3B | Transformer | - | βœ… | - | 2023 | πŸ€— | MIT |
| Phi-1.5 | 1.3B | Transformer | - | βœ… | - | 2023 | πŸ€— | MIT |
| Phi-2 | 2.7B | Transformer | - | βœ… | - | 2023 | πŸ€— | MIT |
| RedPajama-INCITE-3BπŸ‘‘ | 2.8B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| RedPajama-INCITE-7BπŸ‘‘ | 6.9B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| FLM | 101B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| MPT-7B | 7B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| MPT-7B-8K | 7B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| MPT-30B | 30B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Mistral-7B-v0.1 | 7B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Mistral-7B-v0.2 | 7B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Mistral-7B-v0.3 | 7B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Falcon-7B | 7B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Falcon-40B | 40B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| TinyLlama | 1.1B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| OpenLLaMA-3B-v1πŸ‘‘ | 3B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| OpenLLaMA-7B-v1πŸ‘‘ | 7B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| OpenLLaMA-13B-v1πŸ‘‘ | 13B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| OpenLLaMA-3B-v2πŸ‘‘ | 3B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| OpenLLaMA-7B-v2πŸ‘‘ | 7B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| DeciLM-7B | 7B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| AmberπŸ‘‘ | 7B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Solar | 10.7B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Mixtral-8x7B | 46.7B | Transformer | - | βœ… | βœ… | 2023 | πŸ€— | Apache 2.0 |
| OpenMoE-base-128B | 637M | Transformer | - | βœ… | βœ… | 2023 | πŸ€— | Apache 2.0 |
| Mamba-130M | 130M | SSM | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Mamba-370M | 370M | SSM | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Mamba-790M | 790M | SSM | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Mamba-1.4B | 1.4B | SSM | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Mamba-2.8B | 2.8B | SSM | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Mamba-2.8B-slimpj | 2.8B | SSM | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| OpenBA | 15B | Transformer | βœ… | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Yi-6B | 6B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Yi-6B-200K | 6B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Yi-9B | 9B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Yi-9B-200K | 9B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Yi-34B-200K | 34B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Persimmon-8B | 8B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Palmyra-3B | 3B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Palmyra-Small-128M | 128M | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Palmyra-Base-5B | 5B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| Palmyra-Large-20B | 20B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| SEA-LION-3B | 3B | Transformer | - | βœ… | - | 2023 | πŸ€— | MIT |
| SEA-LION-7B | 7B | Transformer | - | βœ… | - | 2023 | πŸ€— | MIT |
| PLaMo-13B | 13B | Transformer | - | βœ… | - | 2023 | πŸ€— | Apache 2.0 |
| LiteLlama | 460M | Transformer | - | βœ… | - | 2024 | πŸ€— | MIT |
| H2O-Danube | 1.8B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| H2O-Danube2 | 1.8B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Cosmo | 1.8B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| MobiLlama-0.5B | 0.5B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| MobiLlama-0.8B | 0.8B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| MobiLlama-1B | 1.2B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| OLMo-1BπŸ‘‘ | 1B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| OLMo-7BπŸ‘‘ | 7B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| OLMo-7B-Twin-2TπŸ‘‘ | 7B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| OLMo-1.7-7BπŸ‘‘ | 7B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Poro | 34B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Grok-1 | 314B | Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| OpenMoE-8B-1.1T | 8B | Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| OpenMoE-8B-1T | 8B | Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| OpenMoE-8B-800B | 8B | Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| OpenMoE-8B-600B | 8B | Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| OpenMoE-8B-400B | 8B | Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| OpenMoE-8B-200B | 8B | Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| OpenMoE-34B-200B | 34B | Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| Jamba | 52B | SSM-Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| JetMoE | 8B | Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| Mambaoutai | 1.6B | SSM | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Tele-FLM | 52B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Arctic-Base | 480B | Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| Zamba-7B | 7B | SSM-Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| Mixtral-8x22B-v0.1 | 141B | Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| Granite-7b-base | 7B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Chuxin-1.6B-BaseπŸ‘‘ | 1.6B | Transformer | - | βœ… | - | 2024 | πŸ€— | MIT |
| Chuxin-1.6B-1MπŸ‘‘ | 1.6B | Transformer | - | βœ… | - | 2024 | πŸ€— | MIT |
| NeoπŸ‘‘ | 7B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Yi-1.5-6B | 6B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Yi-1.5-9B | 9B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Yi-1.5-34B | 34B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| GECKO-7B | 7B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Qwen2-0.5B | 0.5B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Qwen2-1.5B | 1.5B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Qwen2-7B | 7B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Qwen2-57B-A14B | 57B | Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| K2πŸ‘‘ | 65B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Pile-T5-BaseπŸ‘‘ | 248M | Transformer | βœ… | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Pile-T5-LargeπŸ‘‘ | 783M | Transformer | βœ… | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Pile-T5-XLπŸ‘‘ | 2.85B | Transformer | βœ… | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| SmolLM-135MπŸ‘‘ | 135M | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| SmolLM-360MπŸ‘‘ | 360M | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| SmolLM-1.7BπŸ‘‘ | 1.7B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| GRIN | 42B | Transformer | - | βœ… | βœ… | 2024 | πŸ€— | MIT |
| OLMoE-1B-7BπŸ‘‘ | 7B | Transformer | - | βœ… | βœ… | 2024 | πŸ€— | Apache 2.0 |
| Zamba2-1.2B | 1.2B | SSM-Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Zamba2-2.7B | 2.7B | SSM-Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
| Fox-1-1.6B | 1.6B | Transformer | - | βœ… | - | 2024 | πŸ€— | Apache 2.0 |
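
Before adopting a model from the table, you can confirm its license programmatically: each Hub repo exposes a "license:<id>" tag in its metadata. A minimal sketch, assuming the `huggingface_hub` client library is installed (the repo ids below are illustrative; substitute any model above):

```python
from huggingface_hub import model_info

# Print the license tag each repo carries alongside its other metadata.
for repo_id in ["gpt2", "EleutherAI/pythia-1b", "mistralai/Mistral-7B-v0.1"]:
    tags = model_info(repo_id).tags
    print(repo_id, [t for t in tags if t.startswith("license:")])
```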

πŸ“š Resources

About Openness

  • [Blog post] What "Open" Means: A great blog post by John Shaughnessy discussing the many different incarnations of the word "open".
  • [Paper] Towards a Framework for Openness in Foundation Models: In this paper, Mozilla and Columbia University's Institute of Global Politics brought together over 40 leading scholars and practitioners working on openness and AI to discuss the highly debated definitions and benefits of open sourcing foundation models. Among this team are Victor Storchan, Yann LeCun, Justine Tunney, Nathan Lambert, and many others.
  • [Paper] Rethinking open source generative AI: This paper surveys over 45 generative AI models using an evidence-based framework that distinguishes 14 dimensions of openness, from training datasets to scientific and technical documentation and from licensing to access methods.
  • [Paper] Risks and Opportunities of Open-Source Generative AI: This paper analyzes the risks and opportunities of open-source generative AI models using a three-stage framework for Gen AI development (near-, mid-, and long-term), and argues that, overall, the benefits of open-source Gen AI outweigh its risks.

πŸ“Œ Citation

```bibtex
@misc{hamdy2024openlmlist,
  title = {The Open Language Models List},
  author = {Mohammed Hamdy},
  url = {https://github.com/mmhamdy/open-language-models},
  year = {2024},
}
```
