3 Commits

Author SHA1 Message Date
Benjamin Wegener 1792346223 Add run_inference_server.py for Running llama.cpp Built-in Server (#204)
* Update CMakeLists.txt

I added a CMake option to compile the llama.cpp server. This update allows us to easily build and deploy the server with BitNet.
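The option the commit describes might look like the following sketch. The option name and default are assumptions; the actual name in BitNet's CMakeLists.txt may differ.

```cmake
# Hypothetical option name: gate the llama.cpp server target behind a flag
# so downstream builds can opt in or out of the HTTP server.
option(LLAMA_BUILD_SERVER "Build the llama.cpp built-in HTTP server" ON)

if(LLAMA_BUILD_SERVER)
    add_subdirectory(examples/server)  # assumed location of the server target
endif()
```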

* Create run_inference_server.py

Same as run_inference.py, but for use with llama.cpp's built-in server, for added convenience.

In particular:
- The build directory is chosen based on whether the system is running Windows.
- A list of server arguments (`-m`/`--model`, etc.) is assembled.
- The parsed command-line arguments are passed to `subprocess.run()` to execute the system command.
2025-05-08 16:22:12 +08:00
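The steps the commit message lists can be sketched roughly as below. This is a minimal illustration, not the actual run_inference_server.py: the binary paths, flag set, and defaults are assumptions, and the real script resolves them from the BitNet build.

```python
import os
import platform
import subprocess

def build_server_command(model_path, host="127.0.0.1", port=8080, threads=4):
    """Assemble the argument list for launching the llama.cpp server."""
    # Step 1: the build directory depends on whether we are on Windows
    # (hypothetical layout; the actual paths come from the BitNet build).
    if platform.system() == "Windows":
        server_bin = os.path.join("build", "bin", "Release", "llama-server.exe")
    else:
        server_bin = os.path.join("build", "bin", "llama-server")
    # Step 2: build the argument list (`-m`/`--model`, etc.).
    return [
        server_bin,
        "-m", model_path,
        "--host", host,
        "--port", str(port),
        "-t", str(threads),
    ]

def run_server(args):
    # Step 3: hand the argument list to subprocess.run(), which blocks
    # until the server process exits.
    subprocess.run(args)
```

A caller would do something like `run_server(build_server_command("model.gguf", port=8080))`; the server then stays in the foreground serving HTTP requests.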
Goran Jelic-Cizmek 489b7c5abf Add -fpermissive if using GCC 2024-10-26 17:10:54 +02:00
potassiummmm 6cfd8831fd initial commit 2024-10-17 21:21:10 +08:00