From a6664de0720c7d8572a475a9c59f7dd85b5f83b0 Mon Sep 17 00:00:00 2001
From: Tim Dettmers
Date: Tue, 16 Aug 2022 19:03:19 -0700
Subject: Enhanced error handling in CUDA SETUP failures.

---
 CHANGELOG.md | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 285984e..1017721 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -67,3 +67,26 @@ Features:
 
 Deprecated:
  - Pre-compiled release for CUDA 9.2, 10.0, 10.2 no longer available
+
+### 0.31.0
+
+#### 8-bit Inference and Packaging Update
+
+Features:
+ - Added direct outlier extraction. This enables outlier extraction without fp16 weights and without performance degradation.
+ - Added an automatic CUDA SETUP procedure and packaged all binaries into a single bitsandbytes package.
+
+### 0.32.0
+
+#### 8-bit Inference Performance Enhancements
+
+We added performance enhancements for small models. This makes small models about 2x faster for LLM.int8() inference.
+
+Features:
+ - Int32 dequantization now supports fused biases.
+ - Linear8bitLt now uses a fused bias implementation.
+ - Changed `.data.storage().data_ptr()` to `.data.data_ptr()` to enhance inference performance.
+
+Bug fixes:
+ - Now throws an error if LLM.int8() is used on a GPU that is not supported.
+ - Enhances error messaging if CUDA SETUP fails.
-- 
cgit v1.2.3
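
The 0.32.0 entry mentions that Int32 dequantization now supports fused biases. As a rough illustration of what "fused" means here, the pure-Python sketch below rescales the int32 accumulator of an int8 matrix product and adds the bias in the same pass, rather than in a separate step afterwards. It assumes vector-wise absmax scaling with a 127 range on both operands; the function and parameter names are hypothetical and this is not the library's CUDA kernel.

```python
# Illustrative sketch only: names are hypothetical, not the bitsandbytes API.
# Assumes both int8 operands were quantized with absmax scaling to [-127, 127].

def dequantize_int32(acc, row_absmax, col_absmax, bias=None):
    """Rescale int32 matmul results to floats, adding the bias in the same pass.

    acc        : list of rows holding int32 sums from the int8 matmul
    row_absmax : per-row absolute maxima used to quantize the input
    col_absmax : per-column absolute maxima used to quantize the weight
    bias       : optional per-column bias, applied while each value is rescaled
    """
    out = []
    for i, row in enumerate(acc):
        out_row = []
        for j, value in enumerate(row):
            # Undo the two 127-range quantization scales.
            x = value * row_absmax[i] * col_absmax[j] / (127.0 * 127.0)
            # Fused bias: no second pass over the output is needed.
            if bias is not None:
                x += bias[j]
            out_row.append(x)
        out.append(out_row)
    return out


result = dequantize_int32(
    acc=[[127 * 127, 0]],
    row_absmax=[2.0],
    col_absmax=[3.0, 1.0],
    bias=[0.5, -1.0],
)
```

On a GPU, the benefit of fusing is that the output is written once instead of being read back and rewritten by a separate bias addition.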