Enhanced error handling in CUDA SETUP failures.

author: Tim Dettmers <tim.dettmers@gmail.com> 2022-08-16 19:03:19 -0700
committer: Tim Dettmers <tim.dettmers@gmail.com> 2022-08-16 19:03:19 -0700
commit: a6664de0720c7d8572a475a9c59f7dd85b5f83b0 (patch)
tree: 1342a757dfc60fe37f1d94371251fd17bc968bff /CHANGELOG.md
parent: de354f7ded52bfa857089769225cdf1ee694bfd6 (diff)
1 files changed, 23 insertions, 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 285984e..1017721 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -67,3 +67,26 @@ Features:
 
 Deprecated:
  - Pre-compiled release for CUDA 9.2, 10.0, 10.2 no longer available
+
+### 0.31.0
+
+#### 8-bit Inference and Packaging Update
+
+Features:
+ - Added direct outlier extraction. This enables outlier extraction without fp16 weights and without performance degradation.
+ - Added an automatic CUDA SETUP procedure and packaged all binaries into a single bitsandbytes package.
+
+### 0.32.0
+
+#### 8-bit Inference Performance Enhancements
+
+We added performance enhancements for small models. This makes small models about 2x faster for LLM.int8() inference.
+
+Features:
+ - Int32 dequantization now supports fused biases.
+ - Linear8bitLt now uses a fused bias implementation.
+ - Changed `.data.storage().data_ptr()` to `.data.data_ptr()` to enhance inference performance.
+
+Bug fixes:
+ - Now throws an error if LLM.int8() is used on a GPU that is not supported.
+ - Enhances error messaging if CUDA SETUP fails.
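
The "fused biases" entry in 0.32.0 can be illustrated with a minimal pure-Python sketch. It is not the library's CUDA kernel: the function name `dequantize_with_bias`, its parameters, and the vector-wise scaling by `127 * 127` are assumptions made for illustration. The point it shows is that the bias is added in the same pass that rescales the int32 accumulator, rather than in a separate step afterwards.

```python
def dequantize_with_bias(acc, row_absmax, col_absmax, bias=None):
    """Dequantize an int32 accumulator matrix, adding the bias in the same pass.

    Hypothetical helper for illustration only (not a bitsandbytes API).
    acc        : rows of int32 values from an int8 x int8 matmul
    row_absmax : per-row absolute maxima used to quantize the input
    col_absmax : per-column absolute maxima used to quantize the weights
    bias       : optional per-column bias
    """
    denom = 127.0 * 127.0  # assumed int8 scaling on both operands
    out = []
    for i, row in enumerate(acc):
        out_row = []
        for j, value in enumerate(row):
            x = value * row_absmax[i] * col_absmax[j] / denom
            if bias is not None:
                x += bias[j]  # fused: no second pass over the output
            out_row.append(x)
        out.append(out_row)
    return out


if __name__ == "__main__":
    acc = [[16129, 0], [-16129, 8064]]
    print(dequantize_with_bias(acc, [2.0, 1.0], [3.0, 4.0], bias=[0.5, -1.0]))
```

An unfused version would write the dequantized output to memory and then read it back to add the bias. Doing both in one loop is the saving that the changelog entry refers to.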