Junhui He
6c2c08f67e
Merge pull request #266 from microsoft/gpu-dev
2025-05-19 12:46:20 +08:00
Junhui He
154c92b704
Init gpu branch
2025-05-19 04:34:00 +00:00
Benjamin Wegener
c9e752c9d7
Fix build error with GCC by forcing Clang compiler in CMake on android/aarch64 (#242)
GCC does not recognize Clang-specific warning flags like
-Wunreachable-code-break and -Wunreachable-code-return, which are passed
by upstream submodules (e.g., ggml). This patch forces CMake to use Clang
via command-line arguments, avoiding the need to patch nested submodules.
This resolves compiler errors without modifying submodule source code.
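
As a rough illustration of the approach described above, the sketch below assembles a CMake configure command that selects Clang through the standard `CMAKE_C_COMPILER` and `CMAKE_CXX_COMPILER` variables. The helper name `cmake_configure_args` and the default directories are hypothetical; they are not taken from the patch itself.

```python
def cmake_configure_args(source_dir=".", build_dir="build",
                         c_compiler="clang", cxx_compiler="clang++"):
    """Return a CMake configure command that forces the given compilers.

    Hypothetical helper for illustration only; the actual patch may pass
    these arguments differently.
    """
    return [
        "cmake",
        "-S", source_dir,   # source tree
        "-B", build_dir,    # build tree
        # Standard CMake variables that pick the C and C++ compilers.
        f"-DCMAKE_C_COMPILER={c_compiler}",
        f"-DCMAKE_CXX_COMPILER={cxx_compiler}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(cmake_configure_args()))
```

Because the compilers are chosen on the command line, submodules such as ggml can keep passing Clang-specific warning flags without being patched.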
2025-05-08 16:22:45 +08:00
Benjamin Wegener
1792346223
Add run_inference_server.py for Running llama.cpp Built-in Server (#204)
* Update CMakeLists.txt
I added a CMake option to compile the llama.cpp server. This update allows us to easily build and deploy the server using BitNet.
* Create run_inference_server.py
Same as run_inference, but for use with llama.cpp's built-in server, for some extra convenience.
In particular:
- The build directory is determined based on whether the system is running on Windows or not.
- A list of arguments (`--model`, `-m`, etc.) is created.
- The main argument list is parsed and passed to the `subprocess.run()` method to execute the system command.
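
The three steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the contents of `run_inference_server.py`: the function names, the `build/bin` layout, the `Release` subdirectory on Windows, and the default host and port are all hypothetical. Only the `llama-server` binary name and its `-m`, `--host`, and `--port` options come from llama.cpp.

```python
import platform
import subprocess
from pathlib import PurePosixPath, PureWindowsPath


def build_server_command(model, build_dir="build", host="127.0.0.1",
                         port=8080, system=None):
    """Assemble the llama.cpp server command line as a list of strings."""
    system = system or platform.system()
    if system == "Windows":
        # Assumed layout: multi-config generators place binaries under Release/.
        server = PureWindowsPath(build_dir) / "bin" / "Release" / "llama-server.exe"
    else:
        # Assumed layout for single-config builds on Linux and macOS.
        server = PurePosixPath(build_dir) / "bin" / "llama-server"
    return [str(server), "-m", model, "--host", host, "--port", str(port)]


def run_server(model, **kwargs):
    """Launch the server and block until it exits (raises on a non-zero exit)."""
    command = build_server_command(model, **kwargs)
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    # Show the command only; call run_server(...) to actually start the server.
    print(build_server_command("model.gguf"))
```

Passing the command as a list to `subprocess.run()` avoids shell quoting problems with model paths that contain spaces.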
2025-05-08 16:22:12 +08:00
Junhui He
c17d1c5d77
Merge pull request #212 from microsoft/arch-name-dev
Fix model architecture name
2025-04-23 11:20:15 +08:00
junhuihe
488dc1e876
Fix model architecture name
2025-04-22 17:28:59 +08:00
tsong-ms
fd9f1d6e46
Merge pull request #176 from microsoft/readme-dev
refine readme
2025-04-16 12:35:53 +08:00
tsong
874e6bd5fb
refine readme
2025-04-16 04:34:59 +00:00
tsong-ms
034b34cb70
Merge pull request #175 from microsoft/readme-dev
add third-party demo
2025-04-15 22:42:12 +08:00
tsong
71fdd9472f
add third-party demo
2025-04-15 14:36:05 +00:00
Yan Xia
1c77bd8966
Update README.md
2025-04-15 17:11:23 +08:00
Yan Xia
8f75f99c72
Update README.md (#172)
add two FAQs for Windows build questions.
2025-04-15 17:07:20 +08:00
Yan Xia
0e7dadba1e
Update README.md
2025-04-15 15:24:42 +08:00
Yan Xia
fd3f355a0b
update readme and setup script to support official BitNet b1.58 model (#171)
* update readme and setup file for new model.
* update model file name
---------
Co-authored-by: Yan Xia <yanxia@microsoft.com>
2025-04-15 14:53:56 +08:00
tsong-ms
fa854cf8f8
Merge pull request #167 from potassiummmm/bitnet-25
add support for bitnet2b_2501 model
2025-04-15 14:27:46 +08:00
potassiummmm
09f91066d6
add conversion logic for new model
2025-03-12 18:34:05 +08:00
potassiummmm
4f2e41a514
add support for bitnet2b_2501 model
2025-03-12 18:16:45 +08:00
Eddie-Wang1120
caf17ec438
update README
2025-02-18 21:13:27 +08:00
potassiummmm
437b321dcf
Merge pull request #145 from potassiummmm/readme-new-model
Update README
2024-12-20 16:22:10 +08:00
potassiummmm
d0fc8c9a39
Update README.md
2024-12-20 14:58:53 +08:00
potassiummmm
1791e8eb1c
Merge branch 'microsoft:main' into readme-new-model
2024-12-20 14:56:08 +08:00
potassiummmm
253954811b
Merge pull request #130 from lfoppiano/patch-1
Make the coverage table more readable with both dark and light theme
2024-12-20 14:55:38 +08:00
potassiummmm
933e8950bd
Update README.md
2024-12-19 18:47:53 +08:00
potassiummmm
b441b76118
Update README.md
2024-12-19 18:35:31 +08:00
potassiummmm
fa83380d99
Update README.md
2024-12-19 18:32:54 +08:00
potassiummmm
3e19f15cd0
Merge pull request #142 from microsoft/f3-fix-2
Fix model name in setup_env.py
2024-12-19 01:21:21 +08:00
potassiummmm
c96c2499d6
Fix model name in setup_env.py
2024-12-19 01:20:14 +08:00
potassiummmm
e255fef69b
Merge pull request #141 from potassiummmm/f3-fix
F3 fix
2024-12-18 21:27:26 +08:00
potassiummmm
0a446952e1
fix readme issue and -cnv option issue
2024-12-18 21:20:26 +08:00
potassiummmm
aa39c0cdcc
fix version requirement of transformers pypi package and model list for codegen
2024-12-18 17:54:23 +08:00
tsong-ms
33ceabed0b
Merge pull request #137 from younesbelkada/f3-changes
Feat: Add changes for Falcon3 release
2024-12-18 11:41:20 +08:00
younesbelkada
6a5134a6f0
change
2024-12-17 08:45:25 +00:00
younesbelkada
85c3247323
add changes on README
2024-12-17 07:05:35 +00:00
younesbelkada
c1c55417c2
fix issues
2024-12-16 15:26:33 +00:00
younesbelkada
de19627eef
add 10b model
2024-12-16 14:42:15 +00:00
younesbelkada
a838911a55
more changes to support chat models
2024-12-09 16:45:31 +00:00
Luca Foppiano
22566ab52e
Make the coverage table more readable with both dark and light theme
2024-12-05 12:02:16 +00:00
younesbelkada
7c57a5ae20
fix weird character issue
2024-11-23 14:49:44 +00:00
younesbelkada
22575a47cf
update submodule
2024-11-14 15:08:47 +00:00
younesbelkada
c1892d6818
updated submodule
2024-11-14 14:53:43 +00:00
younesbelkada
18cfa8af89
add fc3 support
2024-11-14 14:51:09 +00:00
potassiummmm
bf11a49f11
Add support for ios platform
2024-11-11 15:13:55 +08:00
Shuming Ma
4645960add
Update README.md
2024-11-08 17:23:21 +08:00
potassiummmm
37c247c4dc
Merge pull request #79 from MrEcco/main
Getting Errors When following Readme Instruction on ARM server with ubuntu24.04 #74
2024-11-07 00:30:10 +08:00
potassiummmm
338973feb8
Merge pull request #83 from JCGoran/jelic/gcc_fixes
Fix building on GCC toolchain
2024-11-07 00:29:17 +08:00
potassiummmm
80b94aecb2
Fix llama-bench path error on Windows
2024-10-31 16:50:47 +08:00
Yan Xia
9b29748910
Update README.md acknowledgement section
2024-10-29 21:25:19 +08:00
Goran Jelic-Cizmek
489b7c5abf
Add -fpermissive if using GCC
2024-10-26 17:10:54 +02:00
Goran Jelic-Cizmek
141ddfd4fe
Fix compiler errors on GCC
2024-10-25 16:23:01 +02:00
Goran Jelic-Cizmek
9d37b8692d
Add GCC to compiler check
2024-10-25 16:22:55 +02:00