From 8b1fd32e3e4f5073fd055cb5f9261ec585f8cc2c Mon Sep 17 00:00:00 2001
From: Tim Dettmers
Date: Mon, 25 Jul 2022 14:02:14 -0700
Subject: Fixed makefile; fixed Ampere igemmlt_8 bug.

---
 CHANGELOG.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

(limited to 'CHANGELOG.md')

diff --git a/CHANGELOG.md b/CHANGELOG.md
index fa20b15..08adfce 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -53,3 +53,14 @@ Bug fixes:
 
 Docs:
  - Added instructions how to solve "\_\_fatbinwrap_" errors.
+
+
+### 0.30.0
+
+#### 8-bit Inference Update
+
+Features:
+ - Added 8-bit matrix multiplication form cuBLAS, and cuBLASLt as well as multiple GEMM kernels (GEMM, GEMMEx, GEMMLt)
+ - Added 8-bit Linear layers with 8-bit Params that perform memory efficient inference with an option for 8-bit mixed precision matrix decomposition for inference without performance degradation
+ - Added quantization methods for "fake" quantization as well as optimized kernels vector-wise quantization and equalization as well as optimized cuBLASLt transformations
+ - CPU only build now available (Thank you, @mryab)
-- 
cgit v1.2.3


From 9268dc9d887a3d54cd1f008dcb628aaa5b5bd90a Mon Sep 17 00:00:00 2001
From: Tim Dettmers
Date: Mon, 25 Jul 2022 19:30:37 -0700
Subject: Some progress on build script; added multi-cuda install script.
---
 CHANGELOG.md | 3 +++
 1 file changed, 3 insertions(+)

(limited to 'CHANGELOG.md')

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 08adfce..285984e 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -64,3 +64,6 @@ Features:
  - Added 8-bit Linear layers with 8-bit Params that perform memory efficient inference with an option for 8-bit mixed precision matrix decomposition for inference without performance degradation
  - Added quantization methods for "fake" quantization as well as optimized kernels vector-wise quantization and equalization as well as optimized cuBLASLt transformations
  - CPU only build now available (Thank you, @mryab)
+
+Deprecated:
+ - Pre-compiled release for CUDA 9.2, 10.0, 10.2 no longer available
-- 
cgit v1.2.3
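
Note on the 0.30.0 entry: the changelog names "vector-wise quantization" without describing it. The sketch below shows one common reading of the term, assuming symmetric absmax scaling applied per row to the signed 8-bit range. It is pure Python for illustration only and is not the library's code, which the entry says uses optimized kernels. The function names `quantize_rowwise` and `dequantize_rowwise` are hypothetical.

```python
def quantize_rowwise(matrix):
    """Quantize each row to integers in [-127, 127] with its own scale.

    Assumption: symmetric absmax scaling, one scale per row.
    """
    quantized, scales = [], []
    for row in matrix:
        absmax = max(abs(v) for v in row)
        # Guard against an all-zero row, where any scale works.
        scale = absmax / 127.0 if absmax > 0 else 1.0
        quantized.append([int(round(v / scale)) for v in row])
        scales.append(scale)
    return quantized, scales


def dequantize_rowwise(quantized, scales):
    """Map the integers back to floats using the per-row scales."""
    return [[q * s for q in row] for row, s in zip(quantized, scales)]


if __name__ == "__main__":
    weights = [[0.5, -1.0, 0.25], [100.0, 50.0, -25.0]]
    q, s = quantize_rowwise(weights)
    print(q)
    print(dequantize_rowwise(q, s))
```

Because each row carries its own scale, a row with large values does not reduce the resolution available to a row with small values, which is the usual motivation for scaling per vector rather than per tensor.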