From fa83380d99e761cd2c06940299c935345979db53 Mon Sep 17 00:00:00 2001
From: potassiummmm <54800242+potassiummmm@users.noreply.github.com>
Date: Thu, 19 Dec 2024 18:32:54 +0800
Subject: [PATCH 1/4] Update README.md

---
 README.md | 103 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 98 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 8b6fce8..95f6096 100644
--- a/README.md
+++ b/README.md
@@ -88,6 +88,104 @@ This project is based on the [llama.cpp](https://github.com/ggerganov/llama.cpp)
 ✔
 ✘
+
+ Falcon3-1B-Instruct-1.58bit
+ 1.0B
+ x86
+ ✔
+ ✘
+ ✔
+
+
+ ARM
+ ✔
+ ✔
+ ✘
+
+
+ Falcon3-3B-1.58bit
+ 3.0B
+ x86
+ ✔
+ ✘
+ ✔
+
+
+ ARM
+ ✔
+ ✔
+ ✘
+
+
+ Falcon3-3B-Instruct-1.58bit
+ 3.0B
+ x86
+ ✔
+ ✘
+ ✔
+
+
+ ARM
+ ✔
+ ✔
+ ✘
+
+
+ Falcon3-7B-1.58bit
+ 7.0B
+ x86
+ ✔
+ ✘
+ ✔
+
+
+ ARM
+ ✔
+ ✔
+ ✘
+
+
+ Falcon3-7B-Instruct-1.58bit
+ 7.0B
+ x86
+ ✔
+ ✘
+ ✔
+
+
+ ARM
+ ✔
+ ✔
+ ✘
+
+
+ Falcon3-10B-1.58bit
+ 10.0B
+ x86
+ ✔
+ ✘
+ ✔
+
+
+ ARM
+ ✔
+ ✔
+ ✘
+
+
+ Falcon3-10B-Instruct-1.58bit
+ 10.0B
+ x86
+ ✔
+ ✘
+ ✔
+
+
+ ARM
+ ✔
+ ✔
+ ✘
+
@@ -160,11 +258,6 @@ optional arguments:
 ```bash
 # Run inference with the quantized model
 python run_inference.py -m models/Falcon3-7B-Instruct-1.58bit/ggml-model-i2_s.gguf -p "You are a helpful assistant" -cnv
-
-# Output:
-# Daniel went back to the the the garden. Mary travelled to the kitchen. Sandra journeyed to the kitchen. Sandra went to the hallway. John went to the bedroom. Mary went back to the garden. Where is Mary?
-# Answer: Mary is in the garden.
-
 ```
 usage: run_inference.py [-h] [-m MODEL] [-n N_PREDICT] -p PROMPT [-t THREADS] [-c CTX_SIZE] [-temp TEMPERATURE] [-cnv]

From b441b7611855c2bb94436d499158e3606b56ea77 Mon Sep 17 00:00:00 2001
From: potassiummmm <54800242+potassiummmm@users.noreply.github.com>
Date: Thu, 19 Dec 2024 18:35:31 +0800
Subject: [PATCH 2/4] Update README.md

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 95f6096..b616916 100644
--- a/README.md
+++ b/README.md
@@ -252,6 +252,7 @@ optional arguments:
                         Quantization type
   --quant-embd          Quantize the embeddings to f16
   --use-pretuned, -p    Use the pretuned kernel parameters
+                        (When this option is turned on, the specified prompt by -p will be used as the system prompt.)
 
 ## Usage
 ### Basic usage

From 933e8950bda04566893fe19194a35e46c5f29bd3 Mon Sep 17 00:00:00 2001
From: potassiummmm <54800242+potassiummmm@users.noreply.github.com>
Date: Thu, 19 Dec 2024 18:47:53 +0800
Subject: [PATCH 3/4] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index b616916..f1f6a23 100644
--- a/README.md
+++ b/README.md
@@ -252,7 +252,6 @@ optional arguments:
                         Quantization type
   --quant-embd          Quantize the embeddings to f16
   --use-pretuned, -p    Use the pretuned kernel parameters
-                        (When this option is turned on, the specified prompt by -p will be used as the system prompt.)
 ## Usage
 ### Basic usage
@@ -280,6 +279,7 @@ optional arguments:
   -temp TEMPERATURE, --temperature TEMPERATURE
                         Temperature, a hyperparameter that controls the randomness of the generated text
   -cnv, --conversation  Whether to enable chat mode or not (for instruct models.)
+                        (When this option is turned on, the prompt specified by -p will be used as the system prompt.)
 ### Benchmark

From d0fc8c9a39ccb4cc36b1a76b774eaa6b7f2a182b Mon Sep 17 00:00:00 2001
From: potassiummmm <54800242+potassiummmm@users.noreply.github.com>
Date: Fri, 20 Dec 2024 14:58:53 +0800
Subject: [PATCH 4/4] Update README.md

---
 README.md | 100 +++++-------------------------------------------------
 1 file changed, 8 insertions(+), 92 deletions(-)

diff --git a/README.md b/README.md
index a06844b..a439f0a 100644
--- a/README.md
+++ b/README.md
@@ -89,102 +89,18 @@ This project is based on the [llama.cpp](https://github.com/ggerganov/llama.cpp)
 ❌
- Falcon3-1B-Instruct-1.58bit
- 1.0B
+ Falcon3 Family
+ 1B-10B
 x86
- ✔
- ✘
- ✔
+ ✅
+ ❌
+ ✅
 ARM
- ✔
- ✔
- ✘
-
-
- Falcon3-3B-1.58bit
- 3.0B
- x86
- ✔
- ✘
- ✔
-
-
- ARM
- ✔
- ✔
- ✘
-
-
- Falcon3-3B-Instruct-1.58bit
- 3.0B
- x86
- ✔
- ✘
- ✔
-
-
- ARM
- ✔
- ✔
- ✘
-
-
- Falcon3-7B-1.58bit
- 7.0B
- x86
- ✔
- ✘
- ✔
-
-
- ARM
- ✔
- ✔
- ✘
-
-
- Falcon3-7B-Instruct-1.58bit
- 7.0B
- x86
- ✔
- ✘
- ✔
-
-
- ARM
- ✔
- ✔
- ✘
-
-
- Falcon3-10B-1.58bit
- 10.0B
- x86
- ✔
- ✘
- ✔
-
-
- ARM
- ✔
- ✔
- ✘
-
-
- Falcon3-10B-Instruct-1.58bit
- 10.0B
- x86
- ✔
- ✘
- ✔
-
-
- ARM
- ✔
- ✔
- ✘
+ ✅
+ ✅
+ ❌