82 Commits

Author SHA1 Message Date
deva100 b68802ff17 [fix] embed-quant q6_k; [modify] README update 2026-01-20 04:56:50 +00:00
deva100 35b1c28585 [fix] correct README 2026-01-15 03:44:50 +00:00
deva100 53ffe5e92b [chore] update README 2026-01-15 03:37:16 +00:00
deva100 43da5e5f76 [fix] make demo_benchmark.sh faster 2025-12-23 07:23:14 +00:00
deva100 41cc304868 [chore] add some automation bash script for BitNet Tech Report 2025-12-23 06:48:33 +00:00
deva100 112f853414 [feat] I2S kernels for weight & activation parallel on Intel & ARM machine; [feat] I2S GEMV & GEMM(llama.cpp); [feat] quantize activation & dequantize embedding(llama.cpp); [fix] compile bug: cannot define __ARM_FEATURE_DOTPROD(llama.cpp) 2025-11-19 07:35:05 +00:00
Junhui He 404980eeca Merge pull request #290 from microsoft/gpu-readme-dev
Update readme for gpu kernels
2025-06-03 14:14:20 +08:00
Junhui He 088e607b25 Merge pull request #280 from microsoft/fix-convert-dev
Enable conversion from .safetensors checkpoints to gguf files
2025-06-03 13:59:47 +08:00
ZeonfaiHo c1e9a9a237 Update readme for gpu kernels 2025-05-31 21:41:41 +08:00
junhuihe 43e9b2d4a0 Enable conversion from .safetensors checkpoints to gguf files 2025-05-23 16:19:29 +08:00
tsong-ms 69a20459f5 Merge pull request #268 from younesbelkada/add-falcon-e-final
Add falcon-e support
2025-05-21 16:28:05 +08:00
younesbelkada 5c12850ed9 Merge branch 'add-falcon-e-final' of github.com:younesbelkada/BitNet into add-falcon-e-final 2025-05-21 11:53:40 +04:00
younesbelkada 765741d80b update submodule 2025-05-21 11:52:30 +04:00
Younes Belkada f314d18863 feat: add also base models 2025-05-21 04:11:07 +04:00
Younes Belkada 9e9575665e Merge branch 'microsoft:main' into add-falcon-e-final 2025-05-20 17:05:11 +04:00
tsong-ms 70285e0154 Merge pull request #276 from microsoft/readme-dev
refine readme for gpu kernel
2025-05-20 16:14:18 +08:00
tsong-ms 6197e9feb0 refine readme for gpu kernel 2025-05-20 12:29:56 +08:00
Junhui He 6c2c08f67e Merge pull request #266 from microsoft/gpu-dev 2025-05-19 12:46:20 +08:00
Junhui He 154c92b704 Init gpu branch 2025-05-19 04:34:00 +00:00
Younes Belkada 0015ad5201 Update README.md 2025-05-15 18:49:28 +04:00
younesbelkada de371b708d add falcon-e support 2025-05-14 17:07:05 +04:00
Benjamin Wegener c9e752c9d7 Fix build error with GCC by forcing Clang compiler in CMake on android/aarch64 (#242)
GCC does not recognize Clang-specific warning flags like
-Wunreachable-code-break and -Wunreachable-code-return, which are passed
by upstream submodules (e.g., ggml). This patch forces CMake to use Clang
via command-line arguments, avoiding the need to patch nested submodules.

This resolves compiler errors without modifying submodule source code.
2025-05-08 16:22:45 +08:00
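
A minimal sketch of the approach described in commit c9e752c9d7, not the patch itself: the compiler is selected on the CMake command line, so no submodule source has to change. `CMAKE_C_COMPILER` and `CMAKE_CXX_COMPILER` are standard CMake variables; the helper name `cmake_configure_args`, the `build` directory, and the aarch64 detection are assumptions made for illustration.

```python
import platform


def cmake_configure_args(build_dir="build", force_clang=None):
    """Return a CMake configure command as an argument list (hypothetical helper).

    If force_clang is None, Clang is forced only on aarch64/arm64 hosts,
    which is an assumed stand-in for the android/aarch64 case in the commit.
    """
    if force_clang is None:
        force_clang = platform.machine().lower() in ("aarch64", "arm64")

    args = ["cmake", "-B", build_dir]
    if force_clang:
        # Passing the compiler on the command line avoids patching
        # nested submodules that emit Clang-only warning flags.
        args += ["-DCMAKE_C_COMPILER=clang", "-DCMAKE_CXX_COMPILER=clang++"]
    return args
```

The returned list can be handed to `subprocess.run(...)` as-is; keeping it as a list avoids shell quoting problems.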
Benjamin Wegener 1792346223 Add run_inference_server.py for Running llama.cpp Built-in Server (#204)
* Update CMakeLists.txt

I added a CMake option to compile the Llama.cpp server. This update allows us to easily build and deploy the server using BitNet

* Create run_inference_server.py

same as run_inference, but for use with llama.cpp's built in server, for some extra comfort

In particular:
- The build directory is determined based on whether the system is running on Windows or not.
- A list of arguments (`--model`, `-m` etc.) is created.
- The main argument list is parsed and passed to the `subprocess.run()` method to execute the system command.
2025-05-08 16:22:12 +08:00
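
The three steps listed in the message of commit 1792346223 can be sketched as follows. This is an illustration under stated assumptions, not the contents of run_inference_server.py: the binary name `llama-server`, the `build/bin` and `build/bin/Release` paths, and the `--host`/`--port` defaults are assumed here and may differ from the actual script.

```python
import argparse
import os
import platform
import subprocess


def server_binary(build_dir="build"):
    """Pick the server executable path based on the host OS (assumed layout)."""
    if platform.system() == "Windows":
        return os.path.join(build_dir, "bin", "Release", "llama-server.exe")
    return os.path.join(build_dir, "bin", "llama-server")


def parse_args(argv=None):
    """Parse the wrapper's own command-line options."""
    parser = argparse.ArgumentParser(description="Launch the llama.cpp server (sketch)")
    parser.add_argument("-m", "--model", required=True, help="path to a .gguf model file")
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("--port", type=int, default=8080)
    return parser.parse_args(argv)


def build_command(args):
    """Translate the parsed options into the server's argument list."""
    return [
        server_binary(),
        "-m", args.model,
        "--host", args.host,
        "--port", str(args.port),
    ]


def main(argv=None):
    args = parse_args(argv)
    # check=True raises CalledProcessError if the server exits with an error.
    subprocess.run(build_command(args), check=True)
```

Calling `main(["-m", "path/to/model.gguf"])` would start the server with the assumed defaults, provided the binary exists at the assumed path.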
Junhui He c17d1c5d77 Merge pull request #212 from microsoft/arch-name-dev
Fix model architecture name
2025-04-23 11:20:15 +08:00
junhuihe 488dc1e876 Fix model architecture name 2025-04-22 17:28:59 +08:00
tsong-ms fd9f1d6e46 Merge pull request #176 from microsoft/readme-dev
refine readme
2025-04-16 12:35:53 +08:00
tsong 874e6bd5fb refine readme 2025-04-16 04:34:59 +00:00
tsong-ms 034b34cb70 Merge pull request #175 from microsoft/readme-dev
add third-party demo
2025-04-15 22:42:12 +08:00
tsong 71fdd9472f add third-party demo 2025-04-15 14:36:05 +00:00
Yan Xia 1c77bd8966 Update README.md 2025-04-15 17:11:23 +08:00
Yan Xia 8f75f99c72 Update README.md (#172)
add two FAQs for Windows build questions.
2025-04-15 17:07:20 +08:00
Yan Xia 0e7dadba1e Update README.md 2025-04-15 15:24:42 +08:00
Yan Xia fd3f355a0b update readme and setup script to support official BitNet b1.58 model (#171)
* update readme and setup file for new model.

* update model file name

---------

Co-authored-by: Yan Xia <yanxia@microsoft.com>
2025-04-15 14:53:56 +08:00
tsong-ms fa854cf8f8 Merge pull request #167 from potassiummmm/bitnet-25
add support for bitnet2b_2501 model
2025-04-15 14:27:46 +08:00
potassiummmm 09f91066d6 add conversion logic for new model 2025-03-12 18:34:05 +08:00
potassiummmm 4f2e41a514 add support for bitnet2b_2501 model 2025-03-12 18:16:45 +08:00
Eddie-Wang1120 caf17ec438 update README 2025-02-18 21:13:27 +08:00
potassiummmm 437b321dcf Merge pull request #145 from potassiummmm/readme-new-model
Update README
2024-12-20 16:22:10 +08:00
potassiummmm d0fc8c9a39 Update README.md 2024-12-20 14:58:53 +08:00
potassiummmm 1791e8eb1c Merge branch 'microsoft:main' into readme-new-model 2024-12-20 14:56:08 +08:00
potassiummmm 253954811b Merge pull request #130 from lfoppiano/patch-1
Make the coverage table more readable with both dark and light theme
2024-12-20 14:55:38 +08:00
potassiummmm 933e8950bd Update README.md 2024-12-19 18:47:53 +08:00
potassiummmm b441b76118 Update README.md 2024-12-19 18:35:31 +08:00
potassiummmm fa83380d99 Update README.md 2024-12-19 18:32:54 +08:00
potassiummmm 3e19f15cd0 Merge pull request #142 from microsoft/f3-fix-2
Fix model name in setup_env.py
2024-12-19 01:21:21 +08:00
potassiummmm c96c2499d6 Fix model name in setup_env.py 2024-12-19 01:20:14 +08:00
potassiummmm e255fef69b Merge pull request #141 from potassiummmm/f3-fix
F3 fix
2024-12-18 21:27:26 +08:00
potassiummmm 0a446952e1 fix readme issue and -cnv option issue 2024-12-18 21:20:26 +08:00
potassiummmm aa39c0cdcc fix version requirement of transformers pypi package and model list for codegen 2024-12-18 17:54:23 +08:00
tsong-ms 33ceabed0b Merge pull request #137 from younesbelkada/f3-changes
Feat: Add changes for Falcon3 release
2024-12-18 11:41:20 +08:00