mirror of
https://github.com/microsoft/BitNet.git
synced 2026-05-03 11:20:36 +00:00
update README
@@ -4,10 +4,8 @@
bitnet.cpp is the official inference framework for 1-bit LLMs (e.g., BitNet b1.58). It offers a suite of optimized kernels that support **fast** and **lossless** inference of 1.58-bit models on CPU (with NPU and GPU support coming next).
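Since BitNet b1.58 constrains every weight to {-1, 0, +1}, a matrix-vector product needs only additions and subtractions, no multiplications. A minimal NumPy sketch of that idea (illustrative only, not the optimized kernels bitnet.cpp actually ships):

```python
import numpy as np

def ternary_matvec(W, x):
    """Multiply a ternary weight matrix W (entries in {-1, 0, +1})
    by an activation vector x using only additions/subtractions."""
    # Add activations where the weight is +1, subtract where it is -1;
    # zero weights contribute nothing.
    return (x * (W == 1)).sum(axis=1) - (x * (W == -1)).sum(axis=1)

rng = np.random.default_rng(0)
W = rng.integers(-1, 2, size=(4, 8))   # random ternary weights
x = rng.standard_normal(8)

# Agrees exactly with the ordinary float matmul on the same weights.
assert np.allclose(ternary_matvec(W, x), W @ x)
```

The real kernels pack several ternary weights per byte and use lookup tables, but the arithmetic they avoid is the same.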
The first release of bitnet.cpp supports inference on CPUs. bitnet.cpp achieves speedups of **1.37x** to **5.07x** on ARM CPUs, with larger models seeing greater gains. It also reduces energy consumption by **55.4%** to **70.0%**, further boosting overall efficiency. On x86 CPUs, speedups range from **2.37x** to **6.17x**, with energy reductions between **71.9%** and **82.2%**. Furthermore, bitnet.cpp can run a 100B BitNet b1.58 model on a single CPU at speeds comparable to human reading (5-7 tokens per second), significantly enhancing the potential for running LLMs on local devices. Please refer to the [technical report](https://arxiv.org/abs/2410.16144) for more details.
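To put the reported 5-7 tokens per second in perspective, a quick back-of-the-envelope check (the 256-token reply length is a hypothetical example, not from the report):

```python
def generation_time_s(num_tokens, tokens_per_s):
    """Wall-clock seconds to generate num_tokens at a given decode rate."""
    return num_tokens / tokens_per_s

# A 256-token reply at the reported 5-7 tok/s for a 100B model on one CPU:
low, high = generation_time_s(256, 7), generation_time_s(256, 5)
print(f"{low:.1f}-{high:.1f} s")  # prints "36.6-51.2 s"
```

That is roughly the pace at which a person reads the same text, which is what makes single-CPU deployment of a 100B model practical.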
<img src="./assets/m2_performance.jpg" alt="m2_performance" width="800"/>
<img src="./assets/intel_performance.jpg" alt="intel_performance" width="800"/>
<img src="./assets/f_compa.png" alt="performance" width="800"/>
<!-- <img src="./assets/intel_performance.jpg" alt="m2_performance" width="800"/> -->
> The tested models are dummy setups used in a research context to demonstrate the inference performance of bitnet.cpp.
@@ -18,8 +16,8 @@ A demo of bitnet.cpp running a BitNet b1.58 3B model on Apple M2:
https://github.com/user-attachments/assets/7f46b736-edec-4828-b809-4be780a3e5b1
## What's New:
- 02/18/2025 [Bitnet.cpp: Efficient Edge Inference for Ternary LLMs](https://arxiv.org/abs/2502.11880) 
- 11/08/2024 [BitNet a4.8: 4-bit Activations for 1-bit LLMs](https://arxiv.org/abs/2411.04965)
- 10/21/2024 [1-bit AI Infra: Part 1.1, Fast and Lossless BitNet b1.58 Inference on CPUs](https://arxiv.org/abs/2410.16144)
- 10/17/2024 bitnet.cpp 1.0 released.
- 03/21/2024 [The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ](https://github.com/microsoft/unilm/blob/master/bitnet/The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ.pdf)
@@ -139,7 +137,7 @@ This project is based on the [llama.cpp](https://github.com/ggerganov/llama.cpp)
1. Clone the repo
```bash
git clone --recursive -b paper https://github.com/microsoft/BitNet.git
cd BitNet
```
2. Install the dependencies
Binary file not shown.
After Width: | Height: | Size: 84 KiB |