Age | Commit message | Author
2022-08-06 | Removed faulty asserts. | Tim Dettmers
2022-08-05 | Added the case that all env variables are empty (CUDA docker). | Tim Dettmers
2022-08-05 | Bumping version for TestPyPi release. | Tim Dettmers
2022-08-05 | Now determining cuda version via libcudart.so call. | Tim Dettmers
2022-08-04 | Fixed bugs in cuda setup. | Tim Dettmers
2022-08-04 | Merge branch 'debug' into cuda-bin-switch-and-cli | Tim Dettmers
2022-08-04 | Added pre/post device call for extract outliers. | Tim Dettmers
2022-08-04 | Merge branch 'extract_outliers' into debug | Tim Dettmers
2022-08-04 | Added pre and post device call to transform. | Tim Dettmers
2022-08-03 | Removed print statement. | Tim Dettmers
2022-08-03 | Added fixes for the case that matmullt dim A is zero, e.g. [0, 768]. | Tim Dettmers
2022-08-03 | Added CUDA block assert and is_on_gpu check. | Tim Dettmers
2022-08-02 | tentative refactoring of the compute capabilities code | Titus von Koeller
2022-08-02 | factored cuda_setup.main out into smaller modules and functions | Titus von Koeller
2022-08-02 | move cuda_setup code into subpackage | Titus von Koeller
2022-08-01 | Fixed syntax error; bumped revision for beta release. | Tim Dettmers
2022-08-01 | Added some more docs and comments. | Tim Dettmers
2022-08-01 | Added full env variable search; CONDA_PREFIX priority. | Tim Dettmers
2022-08-01 | deleted function that was moved but accidentally not removed in commit | Titus von Koeller
2022-08-01 | reran black with linelength 80 for greater readability | Titus von Koeller
2022-08-01 | refactored subshell execution code for greater readability and moved it to utils | Titus von Koeller
2022-08-01 | flake8 found some stuff that needs fixing before the release | Titus von Koeller
2022-08-01 | ran black and isort for coherent code formatting | Titus von Koeller
2022-08-01 | fix typo | Titus von Koeller
2022-08-01 | minor refactor to more concise syntax | Titus von Koeller
2022-07-31 | Added adjusted build file. | Tim Dettmers
2022-07-31 | Initial build script changes (untested on PyPi). | Tim Dettmers
2022-07-31 | Full evaluate_cuda setup with integration test. | Tim Dettmers
2022-07-27 | adding CLI tool for CUDA install debugging - intermediate commit | Titus von Koeller
2022-07-27 | Fixed deployment script to check for LD_LIBRARY_PATH. | Tim Dettmers
2022-07-27 | Fixed direct extraction masking. | Tim Dettmers
2022-07-26 | Fixed make default to compile with cublaslt. | Tim Dettmers
2022-07-26 | Merge branch 'patch_merge' into extract_outliers | Tim Dettmers
2022-07-26 | Matmullt with direct outlier extraction for 8-bit inference. | Tim Dettmers
2022-07-26 | Added col_ampere outlier extraction kernel. | Tim Dettmers
2022-07-26 | Working outlier extraction for Turing. | Tim Dettmers
2022-07-26 | Boilerplate and test for extract_outliers. | Tim Dettmers
2022-07-26 | Changed setup.py; deployed on test pypi. | Tim Dettmers
2022-07-26 | Fixed cpuonly build. | Tim Dettmers
2022-07-25 | Added matmul build and flags. | Tim Dettmers
2022-07-25 | Some progress on build script; added multi-cuda install script. | Tim Dettmers
2022-07-25 | Removed rowscale (segfaults on ampere). | Tim Dettmers
2022-07-25 | Fixed makefile; fixed Ampere igemmlt_8 bug. | Tim Dettmers
2022-07-22 | Fixed rowcol synchronization bug. | Tim Dettmers
2022-07-22 | Most tests passing. | Tim Dettmers
2022-07-18 | Merge pull request #3 from TimDettmers/cpuonly | Tim Dettmers
2022-07-01 | Update README.md | Max Ryabinin
2022-07-01 | Reduce diff | Max Ryabinin
2022-07-01 | Reduce diff | Max Ryabinin
2022-07-01 | Reduce diff | Max Ryabinin