author Tim Dettmers <tim.dettmers@gmail.com> 2022-07-25 14:02:14 -0700
committer Tim Dettmers <tim.dettmers@gmail.com> 2022-07-25 14:02:14 -0700
commit 8b1fd32e3e4f5073fd055cb5f9261ec585f8cc2c
tree 76044424d73d02e1026c996b22b9da5061188387 /CHANGELOG.md
parent 7d2ecd30c044840ba5f161ec73e5eaf30ac8131d
Fixed makefile; fixed Ampere igemmlt_8 bug.
Diffstat (limited to 'CHANGELOG.md')
-rw-r--r-- CHANGELOG.md | 11
1 file changed, 11 insertions, 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
index fa20b15..08adfce 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -53,3 +53,14 @@ Bug fixes:
Docs:
- Added instructions how to solve "\_\_fatbinwrap_" errors.
+
+
+### 0.30.0
+
+#### 8-bit Inference Update
+
+Features:
+ - Added 8-bit matrix multiplication from cuBLAS and cuBLASLt, as well as multiple GEMM kernels (GEMM, GEMMEx, GEMMLt)
+ - Added 8-bit Linear layers with 8-bit Params that perform memory-efficient inference, with an option for 8-bit mixed-precision matrix decomposition for inference without performance degradation
+ - Added quantization methods for "fake" quantization, as well as optimized kernels for vector-wise quantization and equalization, and optimized cuBLASLt transformations
+ - CPU-only build is now available (Thank you, @mryab)
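
For readers unfamiliar with the terms in the 0.30.0 entry, the following is a minimal pure-Python sketch of what vector-wise quantization followed by an 8-bit matrix multiplication could look like. It is an illustration under stated assumptions, not the library's implementation: the helper names `quantize_rows`, `transpose`, and `int8_matmul` are hypothetical, absmax scaling to the range [-127, 127] is assumed, and the cuBLASLt kernels, equalization, and mixed-precision decomposition mentioned in the changelog are not modeled.

```python
# Illustrative sketch only; none of these helper names come from the library.

def quantize_rows(matrix):
    """Quantize each row to integers in [-127, 127] using its own absmax scale."""
    quantized, scales = [], []
    for row in matrix:
        # Fall back to 1.0 for an all-zero row to avoid dividing by zero.
        absmax = max(abs(v) for v in row) or 1.0
        scales.append(absmax / 127.0)
        quantized.append([round(v * 127.0 / absmax) for v in row])
    return quantized, scales


def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def int8_matmul(a, b):
    """Approximate a @ b with int8-range operands and integer accumulation."""
    a_q, a_scales = quantize_rows(a)               # one scale per row of a
    bt_q, b_scales = quantize_rows(transpose(b))   # one scale per column of b
    out = []
    for a_row, scale_a in zip(a_q, a_scales):
        out_row = []
        for b_col, scale_b in zip(bt_q, b_scales):
            acc = sum(x * y for x, y in zip(a_row, b_col))  # integer dot product
            out_row.append(acc * scale_a * scale_b)          # dequantize
        out.append(out_row)
    return out


if __name__ == "__main__":
    a = [[1.0, -2.0], [0.5, 4.0]]
    b = [[3.0, 0.0], [1.0, -1.0]]
    print(int8_matmul(a, b))  # close to the exact product [[1, 2], [5.5, -4]]
```

Keeping one scale per row of the left operand and one per column of the right operand means each output element is rescaled by exactly the two scales that produced it, which is why this scheme tends to lose less precision than a single scale for the whole tensor.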