author Tim Dettmers <tim.dettmers@gmail.com> 2022-08-16 19:03:19 -0700
committer Tim Dettmers <tim.dettmers@gmail.com> 2022-08-16 19:03:19 -0700
commit a6664de0720c7d8572a475a9c59f7dd85b5f83b0
tree 1342a757dfc60fe37f1d94371251fd17bc968bff /CHANGELOG.md
parent de354f7ded52bfa857089769225cdf1ee694bfd6
Enhanced error handling in CUDA SETUP failures.
Diffstat (limited to 'CHANGELOG.md')
-rw-r--r-- CHANGELOG.md | 23
1 files changed, 23 insertions, 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 285984e..1017721 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -67,3 +67,26 @@ Features:
Deprecated:
- Pre-compiled release for CUDA 9.2, 10.0, 10.2 no longer available
+
+### 0.31.0
+
+#### 8-bit Inference and Packaging Update
+
+Features:
+ - Added direct outlier extraction. This enables outlier extraction without fp16 weights and without performance degradation.
+ - Added an automatic CUDA SETUP procedure and packaged all binaries into a single bitsandbytes package.
+
+### 0.32.0
+
+#### 8-bit Inference Performance Enhancements
+
+We added performance enhancements for small models. This makes small models about 2x faster for LLM.int8() inference.
+
+Features:
+ - Int32 dequantization now supports fused biases.
+ - Linear8bitLt now uses a fused bias implementation.
+ - Change `.data.storage().data_ptr()` to `.data.data_ptr()` to enhance inference performance.
+
+Bug fixes:
+ - Now throws an error if LLM.int8() is used on a GPU that is not supported.
+ - Enhances error messaging if CUDA SETUP fails.
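
The "fused biases" entry under 0.32.0 can be illustrated with a minimal pure-Python sketch. It is not the library's CUDA kernel: the function name `dequant_int32_fused_bias` and its parameters are hypothetical, and the `127 * 127` scale assumes symmetric absmax quantization of both int8 operands. The point is only that the bias is added in the same pass that rescales the int32 accumulator, rather than in a separate pass over the output.

```python
def dequant_int32_fused_bias(acc, row_absmax, col_absmax, bias=None):
    """Dequantize an int32 accumulator matrix, optionally fusing a bias add.

    Assumptions (not taken from the changelog): both operands were quantized
    to int8 by dividing by their absmax and multiplying by 127, so one
    accumulator entry is rescaled by row_absmax[i] * col_absmax[j] / 127**2.
    """
    denom = 127.0 * 127.0
    out = []
    for i, row in enumerate(acc):
        out_row = []
        for j, value in enumerate(row):
            x = value * row_absmax[i] * col_absmax[j] / denom
            if bias is not None:
                # Fused step: the bias is added while the element is still
                # in hand, so no second pass over the output is needed.
                x += bias[j]
            out_row.append(x)
        out.append(out_row)
    return out
```

Calling it with `bias=None` returns the plain dequantized matrix; passing a per-column `bias` list gives the same result as dequantizing first and adding the bias afterwards, in a single loop.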