The UAE's Technology Innovation Institute (TII) has introduced Falcon 3, a family of small language models ranging from 1 billion to 10 billion parameters.
In benchmark tests, Falcon 3 outperformed models such as Qwen 2.5 7B and Llama 3.1 8B.
TII focuses on research in AI, quantum computing, robotics, and cryptography and is funded by the Abu Dhabi government.
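Falcon 3 uses Grouped Query Attention (GQA), a technique in which groups of query heads share a single set of key and value heads. This shrinks the key-value cache that must be held in memory, reducing memory demands and lowering latency during inference.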
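
To make the memory saving concrete, the sketch below estimates the size of the key-value cache with and without GQA and shows how query heads map onto shared key-value heads. The function names and every number in it (32 layers, 32 query heads, 8 key-value heads, head dimension 128, 8,192 tokens, 16-bit values) are hypothetical and chosen only for illustration; they are not Falcon 3's published configuration.

```python
def kv_head_for_query(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Return the index of the shared key/value head used by a query head.

    Under GQA, consecutive query heads are grouped, and each group reads
    from one key/value head.
    """
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Estimate key-value cache size in bytes for a single sequence.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical configuration, for illustration only (not Falcon 3's).
N_LAYERS, N_Q_HEADS, N_KV_HEADS = 32, 32, 8
HEAD_DIM, SEQ_LEN = 128, 8192

# Standard multi-head attention keeps one key/value head per query head.
mha_bytes = kv_cache_bytes(N_LAYERS, N_Q_HEADS, HEAD_DIM, SEQ_LEN)
# GQA keeps only N_KV_HEADS key/value heads, shared across query groups.
gqa_bytes = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, SEQ_LEN)

if __name__ == "__main__":
    print(f"MHA cache: {mha_bytes / 2**30:.1f} GiB")
    print(f"GQA cache: {gqa_bytes / 2**30:.1f} GiB")
    print(f"Reduction: {mha_bytes // gqa_bytes}x")
    print("Query head 5 reads KV head",
          kv_head_for_query(5, N_Q_HEADS, N_KV_HEADS))
```

In this estimate the cache shrinks by the ratio of query heads to key-value heads, which is why fewer key-value heads translate directly into lower memory use per generated token.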