High-Level APIs
AutoModel
| AutoModel Variant | API |
|---|---|
| AutoModelForCausalLM | liger_kernel.transformers.AutoLigerKernelForCausalLM |
This API extends the implementation of the AutoModelForCausalLM within the transformers library from Hugging Face.
liger_kernel.transformers.AutoLigerKernelForCausalLM
Bases: AutoModelForCausalLM
This class is a drop-in replacement for AutoModelForCausalLM that applies the Liger Kernel to the model if applicable.
Source code in src/liger_kernel/transformers/auto_model.py
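A minimal sketch of the drop-in usage, assuming `liger_kernel` and `transformers` are installed. The helper names `liger_available`, `pick_causal_lm_class`, and `load_causal_lm` are hypothetical and exist only for illustration; the one library-specific assumption is that `from_pretrained` is called the same way as on AutoModelForCausalLM, as the "drop-in replacement" wording suggests.

```python
import importlib.util


def liger_available() -> bool:
    """Return True when the liger_kernel package can be imported."""
    return importlib.util.find_spec("liger_kernel") is not None


def pick_causal_lm_class():
    """Hypothetical helper: prefer the Liger class, else the stock one."""
    if liger_available():
        from liger_kernel.transformers import AutoLigerKernelForCausalLM

        return AutoLigerKernelForCausalLM
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM


def load_causal_lm(model_id: str, **kwargs):
    """Load a causal LM; kwargs are passed on to from_pretrained."""
    return pick_causal_lm_class().from_pretrained(model_id, **kwargs)
```

Calling `load_causal_lm("path/to/some/model")` would then return a patched model when Liger Kernel is available and an unpatched one otherwise; the path is a placeholder.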
Try it Out
You can experiment as shown in this example here.
Patching
You can also use the Patching APIs to use the kernels for a specific model architecture.
| Model | API | Supported Operations |
|---|---|---|
| LLaMA 2 & 3 | liger_kernel.transformers.apply_liger_kernel_to_llama | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| LLaMA 3.2-Vision | liger_kernel.transformers.apply_liger_kernel_to_mllama | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| Mistral | liger_kernel.transformers.apply_liger_kernel_to_mistral | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| Mixtral | liger_kernel.transformers.apply_liger_kernel_to_mixtral | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| Gemma1 | liger_kernel.transformers.apply_liger_kernel_to_gemma | RoPE, RMSNorm, GeGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| Gemma2 | liger_kernel.transformers.apply_liger_kernel_to_gemma2 | RoPE, RMSNorm, GeGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| Qwen2, Qwen2.5, & QwQ | liger_kernel.transformers.apply_liger_kernel_to_qwen2 | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| Qwen2-VL | liger_kernel.transformers.apply_liger_kernel_to_qwen2_vl | RMSNorm, LayerNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| Phi3 & Phi3.5 | liger_kernel.transformers.apply_liger_kernel_to_phi3 | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
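The table can be read as a lookup from architecture to patching function. The sketch below encodes it that way. The dictionary keys are assumed to match the Hugging Face `config.model_type` strings for each architecture, and `PATCH_FUNCTIONS` and `resolve_patch_function` are hypothetical names, not part of the library.

```python
import importlib

# Keys: assumed Hugging Face `config.model_type` values (an assumption).
# Values: patching function names taken from the table above.
PATCH_FUNCTIONS = {
    "llama": "apply_liger_kernel_to_llama",
    "mllama": "apply_liger_kernel_to_mllama",
    "mistral": "apply_liger_kernel_to_mistral",
    "mixtral": "apply_liger_kernel_to_mixtral",
    "gemma": "apply_liger_kernel_to_gemma",
    "gemma2": "apply_liger_kernel_to_gemma2",
    "qwen2": "apply_liger_kernel_to_qwen2",
    "qwen2_vl": "apply_liger_kernel_to_qwen2_vl",
    "phi3": "apply_liger_kernel_to_phi3",
}


def resolve_patch_function(model_type: str):
    """Return the patching function for a model type, importing lazily."""
    name = PATCH_FUNCTIONS.get(model_type)
    if name is None:
        raise ValueError(f"No patching API listed for model type {model_type!r}")
    # The import happens only for known types, so unknown ones fail fast.
    module = importlib.import_module("liger_kernel.transformers")
    return getattr(module, name)
```

With Liger Kernel installed, `resolve_patch_function("llama")()` would apply the LLaMA patch with its default arguments.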
Function Signatures
liger_kernel.transformers.apply_liger_kernel_to_llama
apply_liger_kernel_to_llama(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, swiglu=True, model=None)
Apply Liger kernels to replace original implementation in HuggingFace Llama models (2 and 3)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| swiglu | bool | Whether to apply Liger's SwiGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
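The defaults enable `fused_linear_cross_entropy` and leave `cross_entropy` off. The sketch below builds keyword arguments that keep exactly one of the two loss kernels on. It assumes the two are meant as alternatives, which the defaults suggest but the text above does not state. `llama_patch_kwargs` and `patch_llama` are hypothetical helpers.

```python
def llama_patch_kwargs(fused_loss: bool = True) -> dict:
    """Build arguments for apply_liger_kernel_to_llama with one loss kernel on."""
    return {
        "rope": True,
        # Assumption: use either the plain or the fused loss, never both.
        "cross_entropy": not fused_loss,
        "fused_linear_cross_entropy": fused_loss,
        "rms_norm": True,
        "swiglu": True,
    }


def patch_llama(fused_loss: bool = True) -> None:
    """Apply the LLaMA patch; requires liger_kernel to be installed."""
    from liger_kernel.transformers import apply_liger_kernel_to_llama

    apply_liger_kernel_to_llama(**llama_patch_kwargs(fused_loss))
```

`llama_patch_kwargs()` reproduces the documented defaults, while `llama_patch_kwargs(fused_loss=False)` swaps in the plain cross entropy kernel.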
liger_kernel.transformers.apply_liger_kernel_to_mllama
apply_liger_kernel_to_mllama(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, layer_norm=True, rms_norm=True, swiglu=True, model=None)
Apply Liger kernels to replace original implementation in HuggingFace MLlama models. NOTE: MLlama is not available in transformers<4.45.0
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| layer_norm | bool | Whether to apply Liger's LayerNorm. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| swiglu | bool | Whether to apply Liger's SwiGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
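Because MLlama is not available in transformers<4.45.0, it can help to check the installed version before patching. The sketch below does this with the standard library only. `parse_release`, `meets_minimum`, and `mllama_supported` are hypothetical helpers, and the parser is a simplification that ignores pre-release and dev suffixes.

```python
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Parse the leading numeric 'X.Y.Z' part of a version into three ints."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    # Pad short versions such as '4.45' so comparisons behave as expected.
    return tuple(parts + [0] * (3 - len(parts)))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when `installed` is at least `minimum`."""
    return parse_release(installed) >= parse_release(minimum)


def mllama_supported() -> bool:
    """Check the installed transformers version against the 4.45.0 floor."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return False
    return meets_minimum(installed, "4.45.0")
```

The same `meets_minimum` check applies to any other version floor stated in this reference.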
liger_kernel.transformers.apply_liger_kernel_to_mistral
apply_liger_kernel_to_mistral(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, swiglu=True, model=None)
Apply Liger kernels to replace original implementation in HuggingFace Mistral models
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| swiglu | bool | Whether to apply Liger's SwiGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
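The `model` parameter covers the case where a model instance already exists. A hedged sketch of that flow follows, with a small validation step so that a misspelled flag fails before the library is called. `MISTRAL_FLAGS`, `check_flags`, and `patch_loaded_mistral` are hypothetical names introduced for illustration.

```python
# Boolean flags taken from the signature above.
MISTRAL_FLAGS = (
    "rope",
    "cross_entropy",
    "fused_linear_cross_entropy",
    "rms_norm",
    "swiglu",
)


def check_flags(flags: dict) -> dict:
    """Reject unknown names and non-boolean values; return a copy."""
    unknown = sorted(set(flags) - set(MISTRAL_FLAGS))
    if unknown:
        raise TypeError(f"Unknown flags: {unknown}")
    non_bool = sorted(k for k, v in flags.items() if not isinstance(v, bool))
    if non_bool:
        raise TypeError(f"Flags must be bool: {non_bool}")
    return dict(flags)


def patch_loaded_mistral(model, **flags):
    """Apply Liger kernels to an existing Mistral model instance."""
    from liger_kernel.transformers import apply_liger_kernel_to_mistral

    apply_liger_kernel_to_mistral(model=model, **check_flags(flags))
    return model
```

For example, `patch_loaded_mistral(model, rope=False)` would patch an existing instance while leaving the rotary embedding untouched.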
liger_kernel.transformers.apply_liger_kernel_to_mixtral
apply_liger_kernel_to_mixtral(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, swiglu=True, model=None)
Apply Liger kernels to replace original implementation in HuggingFace Mixtral models
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| swiglu | bool | Whether to apply Liger's SwiGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
liger_kernel.transformers.apply_liger_kernel_to_gemma
apply_liger_kernel_to_gemma(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, geglu=True, model=None)
Apply Liger kernels to replace original implementation in HuggingFace Gemma (Gemma 1 and 1.1 supported; for Gemma2 please use apply_liger_kernel_to_gemma2) to make GPU go burrr.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| geglu | bool | Whether to apply Liger's GeGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
liger_kernel.transformers.apply_liger_kernel_to_gemma2
apply_liger_kernel_to_gemma2(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, geglu=True, model=None)
Apply Liger kernels to replace original implementation in HuggingFace Gemma2 (for Gemma1 please use apply_liger_kernel_to_gemma) to make GPU go burrr.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| geglu | bool | Whether to apply Liger's GeGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
liger_kernel.transformers.apply_liger_kernel_to_qwen2
apply_liger_kernel_to_qwen2(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, swiglu=True, model=None)
Apply Liger kernels to replace original implementation in HuggingFace Qwen2 models
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| swiglu | bool | Whether to apply Liger's SwiGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
liger_kernel.transformers.apply_liger_kernel_to_qwen2_vl
apply_liger_kernel_to_qwen2_vl(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, layer_norm=True, swiglu=True, model=None)
Apply Liger kernels to replace original implementation in HuggingFace Qwen2-VL models. NOTE: Qwen2-VL is not supported in transformers<4.52.4
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| layer_norm | bool | Whether to apply Liger's LayerNorm. Default is True. | True |
| swiglu | bool | Whether to apply Liger's SwiGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
liger_kernel.transformers.apply_liger_kernel_to_phi3
apply_liger_kernel_to_phi3(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, swiglu=True, model=None)
Apply Liger kernels to replace original implementation in HuggingFace Phi3 models.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| swiglu | bool | Whether to apply Liger's SwiGLU Phi3MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py