From fa83380d99e761cd2c06940299c935345979db53 Mon Sep 17 00:00:00 2001
From: potassiummmm <54800242+potassiummmm@users.noreply.github.com>
Date: Thu, 19 Dec 2024 18:32:54 +0800
Subject: [PATCH 1/4] Update README.md
---
 README.md | 103 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 98 insertions(+), 5 deletions(-)
diff --git a/README.md b/README.md
index 8b6fce8..95f6096 100644
--- a/README.md
+++ b/README.md
@@ -88,6 +88,104 @@ This project is based on the [llama.cpp](https://github.com/ggerganov/llama.cpp)
usage: run_inference.py [-h] [-m MODEL] [-n N_PREDICT] -p PROMPT [-t THREADS] [-c CTX_SIZE] [-temp TEMPERATURE] [-cnv]
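The usage line above can be mirrored with a small argparse sketch. This is illustrative only, not the parser from the repository's `run_inference.py`; the defaults shown here are assumptions:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Mirrors the usage line above; default values are illustrative guesses.
    parser = argparse.ArgumentParser(prog="run_inference.py")
    parser.add_argument("-m", "--model", default="models/model.gguf",
                        help="Path to the GGUF model file")
    parser.add_argument("-n", "--n-predict", type=int, default=128,
                        help="Number of tokens to predict")
    parser.add_argument("-p", "--prompt", required=True,
                        help="Prompt text (used as the system prompt when -cnv is set)")
    parser.add_argument("-t", "--threads", type=int, default=2,
                        help="Number of CPU threads")
    parser.add_argument("-c", "--ctx-size", type=int, default=2048,
                        help="Context window size")
    parser.add_argument("-temp", "--temperature", type=float, default=0.8,
                        help="Sampling temperature")
    parser.add_argument("-cnv", "--conversation", action="store_true",
                        help="Enable chat mode (for instruct models)")
    return parser

# Example: in chat mode, -p supplies the system prompt.
args = build_parser().parse_args(["-p", "You are a helpful assistant", "-cnv"])
```

Note that argparse resolves `-temp` and `-cnv` by exact match against the registered option strings, so these multi-character single-dash flags do not collide with `-t` and `-c`.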
From b441b7611855c2bb94436d499158e3606b56ea77 Mon Sep 17 00:00:00 2001
From: potassiummmm <54800242+potassiummmm@users.noreply.github.com>
Date: Thu, 19 Dec 2024 18:35:31 +0800
Subject: [PATCH 2/4] Update README.md
---
README.md | 1 +
1 file changed, 1 insertion(+)
diff --git a/README.md b/README.md
index 95f6096..b616916 100644
--- a/README.md
+++ b/README.md
@@ -252,6 +252,7 @@ optional arguments:
Quantization type
--quant-embd Quantize the embeddings to f16
--use-pretuned, -p Use the pretuned kernel parameters
+ (When this option is turned on, the specified prompt by -p will be used as the system prompt.)
## Usage
### Basic usage
From 933e8950bda04566893fe19194a35e46c5f29bd3 Mon Sep 17 00:00:00 2001
From: potassiummmm <54800242+potassiummmm@users.noreply.github.com>
Date: Thu, 19 Dec 2024 18:47:53 +0800
Subject: [PATCH 3/4] Update README.md
---
README.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/README.md b/README.md
index b616916..f1f6a23 100644
--- a/README.md
+++ b/README.md
@@ -252,7 +252,6 @@ optional arguments:
Quantization type
--quant-embd Quantize the embeddings to f16
--use-pretuned, -p Use the pretuned kernel parameters
- (When this option is turned on, the specified prompt by -p will be used as the system prompt.)
## Usage
### Basic usage
@@ -280,6 +279,7 @@ optional arguments:
-temp TEMPERATURE, --temperature TEMPERATURE
Temperature, a hyperparameter that controls the randomness of the generated text
-cnv, --conversation Whether to enable chat mode or not (for instruct models.)
+ (When this option is turned on, the prompt specified by -p will be used as the system prompt.)
### Benchmark
From d0fc8c9a39ccb4cc36b1a76b774eaa6b7f2a182b Mon Sep 17 00:00:00 2001
From: potassiummmm <54800242+potassiummmm@users.noreply.github.com>
Date: Fri, 20 Dec 2024 14:58:53 +0800
Subject: [PATCH 4/4] Update README.md
---
README.md | 100 +++++-------------------------------------------------
1 file changed, 8 insertions(+), 92 deletions(-)
diff --git a/README.md b/README.md
index a06844b..a439f0a 100644
--- a/README.md
+++ b/README.md
@@ -89,102 +89,18 @@ This project is based on the [llama.cpp](https://github.com/ggerganov/llama.cpp)