Bitsandbytes documentation

You are viewing the main version, which requires installation from source. For a regular pip install, check out the latest stable version (v0.49.2).
Lion

Lion (EvoLved Sign Momentum) is an optimizer that takes the sign of an interpolation between the momentum and the current gradient as its update direction. Because Lion tracks only the momentum, it is more memory-efficient and faster than AdamW, which tracks and stores both the first and second-order moments.
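
Lion's update rule is short enough to sketch in plain Python: the update direction is the sign of a `betas[0]`-weighted blend of momentum and gradient, and the momentum itself is then refreshed with `betas[1]`. This is a minimal illustration of the published rule under those assumptions, not the bitsandbytes kernel; the helper names `sign` and `lion_step` are hypothetical, and lists of floats stand in for tensors.

```python
def sign(x: float) -> float:
    """Return -1.0, 0.0, or 1.0 according to the sign of x."""
    return float((x > 0) - (x < 0))


def lion_step(params, grads, momentum, lr=1e-4, betas=(0.9, 0.99), weight_decay=0.0):
    """Apply one Lion update to lists of floats and return (new_params, new_momentum)."""
    beta1, beta2 = betas
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        # Update direction: sign of the beta1-interpolation of momentum and gradient.
        direction = sign(beta1 * m + (1.0 - beta1) * g)
        # Decoupled weight decay, then a fixed-size step of lr in that direction.
        new_params.append(p * (1.0 - lr * weight_decay) - lr * direction)
        # The momentum is the only state kept; it decays with beta2.
        new_momentum.append(beta2 * m + (1.0 - beta2) * g)
    return new_params, new_momentum


params, momentum = [1.0, -2.0], [0.0, 0.0]
params, momentum = lion_step(params, grads=[0.5, -0.25], momentum=momentum, lr=0.1)
print(params)    # each entry moved by lr, regardless of the gradient's magnitude
print(momentum)  # the single state tensor Lion carries between steps
```

Only one state value per parameter is carried between steps, which is where the memory saving over AdamW comes from.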

Lion

class bitsandbytes.optim.Lion

( params lr = 0.0001 betas = (0.9, 0.99) weight_decay = 0 optim_bits = 32 args = None min_8bit_size = 4096 is_paged = False )

__init__

( params lr = 0.0001 betas = (0.9, 0.99) weight_decay = 0 optim_bits = 32 args = None min_8bit_size = 4096 is_paged = False )

Parameters

  • params (torch.Tensor) — The parameters to optimize.
  • lr (float, defaults to 1e-4) — The learning rate.
  • betas (tuple(float, float), defaults to (0.9, 0.99)) — The two Lion coefficients: the first interpolates between the momentum and the current gradient to form the update, and the second is the decay rate of the momentum.
  • weight_decay (float, defaults to 0) — The weight decay value for the optimizer.
  • optim_bits (int, defaults to 32) — The number of bits of the optimizer state.
  • args (object, defaults to None) — An object with additional arguments.
  • min_8bit_size (int, defaults to 4096) — The minimum number of elements of the parameter tensors for 8-bit optimization.
  • is_paged (bool, defaults to False) — Whether the optimizer is a paged optimizer or not.

Base Lion optimizer.

Lion8bit

class bitsandbytes.optim.Lion8bit

( params lr = 0.0001 betas = (0.9, 0.99) weight_decay = 0 args = None min_8bit_size = 4096 is_paged = False )

__init__

( params lr = 0.0001 betas = (0.9, 0.99) weight_decay = 0 args = None min_8bit_size = 4096 is_paged = False )

Parameters

  • params (torch.Tensor) — The parameters to optimize.
  • lr (float, defaults to 1e-4) — The learning rate.
  • betas (tuple(float, float), defaults to (0.9, 0.99)) — The two Lion coefficients: the first interpolates between the momentum and the current gradient to form the update, and the second is the decay rate of the momentum.
  • weight_decay (float, defaults to 0) — The weight decay value for the optimizer.
  • args (object, defaults to None) — An object with additional arguments.
  • min_8bit_size (int, defaults to 4096) — The minimum number of elements of the parameter tensors for 8-bit optimization.
  • is_paged (bool, defaults to False) — Whether the optimizer is a paged optimizer or not.

8-bit Lion optimizer.
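
To see how `min_8bit_size` interacts with 8-bit state, the sketch below estimates optimizer-state memory for a list of parameter tensor sizes. It assumes that Lion keeps one state value per parameter element, that tensors with fewer than `min_8bit_size` elements keep 32-bit state, and that the small overhead of quantization constants can be ignored. The function name `lion_state_bytes` and the example sizes are hypothetical.

```python
def lion_state_bytes(tensor_sizes, optim_bits=32, min_8bit_size=4096):
    """Estimate bytes of Lion momentum state for tensors with the given element counts."""
    total = 0
    for numel in tensor_sizes:
        # Assumption: small tensors fall back to 32-bit state even when optim_bits is 8.
        bits = 32 if (optim_bits == 8 and numel < min_8bit_size) else optim_bits
        total += numel * bits // 8
    return total


sizes = [1024 * 1024, 1024]  # e.g. a weight matrix and its bias (illustrative sizes)
print(lion_state_bytes(sizes, optim_bits=32))
print(lion_state_bytes(sizes, optim_bits=8))
```

Under these assumptions the large tensor dominates, so the 8-bit estimate is close to a quarter of the 32-bit one even though the small tensor stays at 32 bits.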

Lion32bit

class bitsandbytes.optim.Lion32bit

( params lr = 0.0001 betas = (0.9, 0.99) weight_decay = 0 args = None min_8bit_size = 4096 is_paged = False )

__init__

( params lr = 0.0001 betas = (0.9, 0.99) weight_decay = 0 args = None min_8bit_size = 4096 is_paged = False )

Parameters

  • params (torch.Tensor) — The parameters to optimize.
  • lr (float, defaults to 1e-4) — The learning rate.
  • betas (tuple(float, float), defaults to (0.9, 0.99)) — The two Lion coefficients: the first interpolates between the momentum and the current gradient to form the update, and the second is the decay rate of the momentum.
  • weight_decay (float, defaults to 0) — The weight decay value for the optimizer.
  • args (object, defaults to None) — An object with additional arguments.
  • min_8bit_size (int, defaults to 4096) — The minimum number of elements of the parameter tensors for 8-bit optimization.
  • is_paged (bool, defaults to False) — Whether the optimizer is a paged optimizer or not.

32-bit Lion optimizer.

PagedLion

class bitsandbytes.optim.PagedLion

( params lr = 0.0001 betas = (0.9, 0.99) weight_decay = 0 optim_bits = 32 args = None min_8bit_size = 4096 )

__init__

( params lr = 0.0001 betas = (0.9, 0.99) weight_decay = 0 optim_bits = 32 args = None min_8bit_size = 4096 )

Parameters

  • params (torch.Tensor) — The parameters to optimize.
  • lr (float, defaults to 1e-4) — The learning rate.
  • betas (tuple(float, float), defaults to (0.9, 0.99)) — The two Lion coefficients: the first interpolates between the momentum and the current gradient to form the update, and the second is the decay rate of the momentum.
  • weight_decay (float, defaults to 0) — The weight decay value for the optimizer.
  • optim_bits (int, defaults to 32) — The number of bits of the optimizer state.
  • args (object, defaults to None) — An object with additional arguments.
  • min_8bit_size (int, defaults to 4096) — The minimum number of elements of the parameter tensors for 8-bit optimization.

Paged Lion optimizer.

PagedLion8bit

class bitsandbytes.optim.PagedLion8bit

( params lr = 0.0001 betas = (0.9, 0.99) weight_decay = 0 args = None min_8bit_size = 4096 )

__init__

( params lr = 0.0001 betas = (0.9, 0.99) weight_decay = 0 args = None min_8bit_size = 4096 )

Parameters

  • params (torch.Tensor) — The parameters to optimize.
  • lr (float, defaults to 1e-4) — The learning rate.
  • betas (tuple(float, float), defaults to (0.9, 0.99)) — The two Lion coefficients: the first interpolates between the momentum and the current gradient to form the update, and the second is the decay rate of the momentum.
  • weight_decay (float, defaults to 0) — The weight decay value for the optimizer.
  • args (object, defaults to None) — An object with additional arguments.
  • min_8bit_size (int, defaults to 4096) — The minimum number of elements of the parameter tensors for 8-bit optimization.

Paged 8-bit Lion optimizer.

PagedLion32bit

class bitsandbytes.optim.PagedLion32bit

( params lr = 0.0001 betas = (0.9, 0.99) weight_decay = 0 args = None min_8bit_size = 4096 )

__init__

( params lr = 0.0001 betas = (0.9, 0.99) weight_decay = 0 args = None min_8bit_size = 4096 )

Parameters

  • params (torch.Tensor) — The parameters to optimize.
  • lr (float, defaults to 1e-4) — The learning rate.
  • betas (tuple(float, float), defaults to (0.9, 0.99)) — The two Lion coefficients: the first interpolates between the momentum and the current gradient to form the update, and the second is the decay rate of the momentum.
  • weight_decay (float, defaults to 0) — The weight decay value for the optimizer.
  • args (object, defaults to None) — An object with additional arguments.
  • min_8bit_size (int, defaults to 4096) — The minimum number of elements of the parameter tensors for 8-bit optimization.

Paged 32-bit Lion optimizer.
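
The six classes documented here differ along two axes visible in their names: the bit width of the optimizer state and whether the optimizer is paged. The helper below is a hypothetical convenience, not part of bitsandbytes; it maps two flags to a class name and looks that name up on `bitsandbytes.optim`. The import is deferred so the name mapping can be exercised without the library installed, and actually building an optimizer assumes a working bitsandbytes and PyTorch setup.

```python
def lion_class_name(eight_bit: bool = False, paged: bool = False) -> str:
    """Map the two flags to one of the documented Lion class names."""
    return ("Paged" if paged else "") + "Lion" + ("8bit" if eight_bit else "32bit")


def build_lion(params, eight_bit=False, paged=False, **kwargs):
    """Instantiate the matching Lion variant; kwargs pass through (lr, betas, ...)."""
    import bitsandbytes as bnb  # deferred: only needed when an optimizer is built

    optimizer_cls = getattr(bnb.optim, lion_class_name(eight_bit, paged))
    return optimizer_cls(params, **kwargs)


print(lion_class_name(eight_bit=True, paged=True))
```

With a model on a supported device, a call such as `build_lion(model.parameters(), eight_bit=True, lr=1e-4, weight_decay=0.01)` would return an optimizer that is stepped like any other PyTorch optimizer.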
