author: Tim Dettmers <tim.dettmers@gmail.com> 2021-10-07 09:54:34 -0700
committer: Tim Dettmers <tim.dettmers@gmail.com> 2021-10-07 09:54:34 -0700
commit: 5f95b5253f4936080479c909724601b342da1c18
tree: be264f2332348daba34cdaf14264e94eddec1b23 /howto_config_override.md
parent: 7439924891496025edf60c9da6a782f362a50c70
Updated readme with latest changes.
Diffstat (limited to 'howto_config_override.md')
-rw-r--r-- howto_config_override.md 26
1 files changed, 26 insertions, 0 deletions
diff --git a/howto_config_override.md b/howto_config_override.md
new file mode 100644
index 0000000..11e9d49
--- /dev/null
+++ b/howto_config_override.md
@@ -0,0 +1,26 @@
+# How to override config hyperparameters for particular weights/parameters
+
+If you want to optimize some unstable parameters with 32-bit Adam and others with 8-bit Adam, you can use the `GlobalOptimManager`. It also lets you configure specific hyperparameters for particular layers, such as embedding layers. This takes two steps: (1) register the parameters while they are still on the CPU, and (2) override the config with the desired hyperparameters (this can be done anytime, anywhere). See our [guide](howto_config_override.md) for more details.
+
+```python
+import torch
+import bitsandbytes as bnb
+
+mng = bnb.optim.GlobalOptimManager.get_instance()
+
+model = MyModel()
+mng.register_parameters(model.parameters()) # 1. register parameters while still on CPU
+
+model = model.cuda()
+# use 8-bit optimizer states for all parameters
+adam = bnb.optim.Adam(model.parameters(), lr=0.001, optim_bits=8)
+
+# 2a. override: the parameter model.fc1.weight now uses 32-bit Adam
+mng.override_config(model.fc1.weight, 'optim_bits', 32)
+
+# 2b. override: the two special layers use
+# sparse optimization + different learning rate + different Adam betas
+mng.override_config([model.special.weight, model.also_special.weight],
+                    key_value_dict={'is_sparse': True, 'lr': 1e-5, 'betas': (0.9, 0.98)})
+```
+Possible options for the config override are: `betas`, `eps`, `weight_decay`, `lr`, `optim_bits`, `min_8bit_size`, `percentile_clipping`, `block_wise`, `max_unorm`.
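
To make the override semantics in the how-to above concrete, here is a minimal, dependency-free sketch of how per-parameter overrides could be layered on top of an optimizer's default hyperparameters. It is not the bitsandbytes implementation: `OverrideRegistry`, `resolve`, and the placeholder parameter objects are hypothetical names introduced only for illustration, and the lookup by object identity is an assumption.

```python
# Hypothetical sketch of per-parameter config overrides.
# This is NOT the bitsandbytes implementation; all names here are illustrative.

class OverrideRegistry:
    """Stores hyperparameter overrides keyed by parameter identity."""

    def __init__(self):
        self._overrides = {}  # id(param) -> dict of overridden hyperparameters

    def override_config(self, params, key=None, value=None, key_value_dict=None):
        # Accept a single parameter or a list of parameters.
        if not isinstance(params, (list, tuple)):
            params = [params]
        # Accept either a single key/value pair or a dict of several overrides.
        updates = dict(key_value_dict or {})
        if key is not None:
            updates[key] = value
        for p in params:
            self._overrides.setdefault(id(p), {}).update(updates)

    def resolve(self, param, defaults):
        # Start from the optimizer-wide defaults, then apply any overrides.
        config = dict(defaults)
        config.update(self._overrides.get(id(param), {}))
        return config


# Optimizer-wide defaults (illustrative values).
defaults = {"lr": 1e-3, "betas": (0.9, 0.999), "optim_bits": 8}

# Stand-ins for model parameters; real code would use tensors.
fc1_weight, special_weight, other_weight = object(), object(), object()

registry = OverrideRegistry()
registry.override_config(fc1_weight, "optim_bits", 32)
registry.override_config([special_weight],
                         key_value_dict={"lr": 1e-5, "betas": (0.9, 0.98)})

fc1_config = registry.resolve(fc1_weight, defaults)
special_config = registry.resolve(special_weight, defaults)
other_config = registry.resolve(other_weight, defaults)
```

In this sketch, a parameter without an override simply inherits every default, while an overridden parameter changes only the keys that were explicitly set.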