High-Level APIs
AutoModel
| AutoModel Variant | API |
|---|---|
| AutoModelForCausalLM | liger_kernel.transformers.AutoLigerKernelForCausalLM |
This API extends the AutoModelForCausalLM implementation in the Hugging Face transformers library.
liger_kernel.transformers.AutoLigerKernelForCausalLM
Bases: AutoModelForCausalLM
This class is a drop-in replacement for AutoModelForCausalLM that applies the Liger Kernel to the model if applicable.
Source code in src/liger_kernel/transformers/auto_model.py
Try it Out
You can experiment as shown in this example here.
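As a minimal sketch of the drop-in usage described above, the helper below loads a causal LM through AutoLigerKernelForCausalLM when the liger_kernel package is installed and falls back to AutoModelForCausalLM otherwise. The helper names `liger_available` and `load_causal_lm` are hypothetical, and the fallback is an assumption of this example rather than something the library provides.

```python
import importlib.util


def liger_available() -> bool:
    """Return True if the liger_kernel package can be found on this system."""
    return importlib.util.find_spec("liger_kernel") is not None


def load_causal_lm(model_id: str, **kwargs):
    """Load a causal LM, preferring the Liger drop-in class when installed.

    Imports are deferred to call time so that this module can be imported
    on machines where neither package is installed.
    """
    if liger_available():
        from liger_kernel.transformers import AutoLigerKernelForCausalLM as loader
    else:
        from transformers import AutoModelForCausalLM as loader
    # Both classes expose the same from_pretrained entry point.
    return loader.from_pretrained(model_id, **kwargs)
```

Calling `load_causal_lm("path/to/checkpoint")` (a placeholder path) would then return a model either way; only the Liger branch applies the kernels.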
Patching
You can also use the Patching APIs to apply the kernels to a specific model architecture.
| Model | API | Supported Operations |
|---|---|---|
| LLaMA 2 & 3 | liger_kernel.transformers.apply_liger_kernel_to_llama | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| LLaMA 3.2-Vision | liger_kernel.transformers.apply_liger_kernel_to_mllama | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| Mistral | liger_kernel.transformers.apply_liger_kernel_to_mistral | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| Mixtral | liger_kernel.transformers.apply_liger_kernel_to_mixtral | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| Gemma1 | liger_kernel.transformers.apply_liger_kernel_to_gemma | RoPE, RMSNorm, GeGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| Gemma2 | liger_kernel.transformers.apply_liger_kernel_to_gemma2 | RoPE, RMSNorm, GeGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| Qwen2, Qwen2.5, & QwQ | liger_kernel.transformers.apply_liger_kernel_to_qwen2 | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| Qwen2-VL | liger_kernel.transformers.apply_liger_kernel_to_qwen2_vl | RMSNorm, LayerNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
| Phi3 & Phi3.5 | liger_kernel.transformers.apply_liger_kernel_to_phi3 | RoPE, RMSNorm, SwiGLU, CrossEntropyLoss, FusedLinearCrossEntropy |
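One way to use the table above programmatically is a small lookup from a model family to the matching patching function. The sketch below is illustrative only: the dictionary keys are hypothetical labels chosen for this example (they are assumed, not guaranteed, to match a model's config type), and `resolve_patch_function` is not part of the library. The function names themselves are copied from the table.

```python
import importlib

# Hypothetical family labels mapped to the API names listed in the table.
PATCH_FUNCTION_NAMES = {
    "llama": "apply_liger_kernel_to_llama",
    "mllama": "apply_liger_kernel_to_mllama",
    "mistral": "apply_liger_kernel_to_mistral",
    "mixtral": "apply_liger_kernel_to_mixtral",
    "gemma": "apply_liger_kernel_to_gemma",
    "gemma2": "apply_liger_kernel_to_gemma2",
    "qwen2": "apply_liger_kernel_to_qwen2",
    "qwen2_vl": "apply_liger_kernel_to_qwen2_vl",
    "phi3": "apply_liger_kernel_to_phi3",
}


def resolve_patch_function(model_type: str):
    """Return the patching callable for a family label, importing lazily."""
    try:
        name = PATCH_FUNCTION_NAMES[model_type]
    except KeyError:
        raise ValueError(f"No Liger patching API listed for {model_type!r}") from None
    # liger_kernel is imported only after the label has been validated.
    module = importlib.import_module("liger_kernel.transformers")
    return getattr(module, name)
```

With liger_kernel installed, `resolve_patch_function("mistral")()` would apply the Mistral patches with their default settings.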
Function Signatures
liger_kernel.transformers.apply_liger_kernel_to_llama
apply_liger_kernel_to_llama(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, swiglu=True, model=None)
Apply Liger kernels to replace the original implementation in HuggingFace Llama models (2 and 3).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| swiglu | bool | Whether to apply Liger's SwiGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
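The signature above enables only one of the two loss kernels by default. The sketch below builds a keyword set for apply_liger_kernel_to_llama and rejects a configuration that turns both on; treating the two as mutually exclusive is an assumption of this example, and `llama_patch_kwargs` and `patch_llama` are hypothetical helper names.

```python
def llama_patch_kwargs(use_fused_loss: bool = True, **overrides) -> dict:
    """Build keyword arguments matching the documented defaults.

    use_fused_loss selects which of the two loss kernels is enabled.
    """
    kwargs = {
        "rope": True,
        "cross_entropy": not use_fused_loss,
        "fused_linear_cross_entropy": use_fused_loss,
        "rms_norm": True,
        "swiglu": True,
    }
    unknown = set(overrides) - set(kwargs)
    if unknown:
        raise TypeError(f"Unexpected options: {sorted(unknown)}")
    kwargs.update(overrides)
    # Assumption of this sketch: the two loss kernels are alternatives.
    if kwargs["cross_entropy"] and kwargs["fused_linear_cross_entropy"]:
        raise ValueError("Enable cross_entropy or fused_linear_cross_entropy, not both.")
    return kwargs


def patch_llama(model=None, **options) -> None:
    """Apply the Llama patches; pass model to patch an existing instance."""
    from liger_kernel.transformers import apply_liger_kernel_to_llama

    apply_liger_kernel_to_llama(model=model, **llama_patch_kwargs(**options))
```

Calling `patch_llama()` with no arguments corresponds to the defaults in the table; `patch_llama(use_fused_loss=False)` would switch to the plain cross entropy kernel instead.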
liger_kernel.transformers.apply_liger_kernel_to_mllama
apply_liger_kernel_to_mllama(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, layer_norm=True, rms_norm=True, swiglu=True, model=None)
Apply Liger kernels to replace the original implementation in HuggingFace MLlama models. NOTE: MLlama is not available in transformers<4.45.0.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| swiglu | bool | Whether to apply Liger's SwiGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
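Because MLlama requires transformers 4.45.0 or later, it can help to check the installed version before patching. The sketch below uses only the standard library; the helper names are hypothetical, and the version parsing is deliberately simplified (pre-release suffixes such as `.dev0` are ignored, so `4.45.0.dev0` is treated like `4.45.0`).

```python
from importlib import metadata
from itertools import takewhile

# Minimum version taken from the note in the function description.
MLLAMA_MIN_TRANSFORMERS = "4.45.0"


def parse_release(version: str) -> tuple:
    """Reduce a version string to a (major, minor, patch) tuple of ints."""
    numbers = []
    for piece in version.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    while len(numbers) < 3:  # pad short versions such as "4.46"
        numbers.append(0)
    return tuple(numbers)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if installed is at least minimum (release part only)."""
    return parse_release(installed) >= parse_release(minimum)


def transformers_supports_mllama() -> bool:
    """Check the installed transformers version; False if it is missing."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return False
    return meets_minimum(installed, MLLAMA_MIN_TRANSFORMERS)
```

A caller could guard apply_liger_kernel_to_mllama behind `transformers_supports_mllama()` and report a clear error otherwise.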
liger_kernel.transformers.apply_liger_kernel_to_mistral
apply_liger_kernel_to_mistral(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, swiglu=True, model=None)
Apply Liger kernels to replace the original implementation in HuggingFace Mistral models.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| swiglu | bool | Whether to apply Liger's SwiGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
liger_kernel.transformers.apply_liger_kernel_to_mixtral
apply_liger_kernel_to_mixtral(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, swiglu=True, model=None)
Apply Liger kernels to replace the original implementation in HuggingFace Mixtral models.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| swiglu | bool | Whether to apply Liger's SwiGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
liger_kernel.transformers.apply_liger_kernel_to_gemma
apply_liger_kernel_to_gemma(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, geglu=True, model=None)
Apply Liger kernels to replace the original implementation in HuggingFace Gemma (Gemma 1 and 1.1 supported; for Gemma2 please use apply_liger_kernel_to_gemma2) to make GPU go burrr.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| geglu | bool | Whether to apply Liger's GeGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
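The Gemma functions take a geglu keyword where the other patching functions on this page take swiglu, which is easy to trip over in shared configuration code. A minimal sketch of one way to handle it follows; the family labels and the `mlp_override` helper are hypothetical, while the keyword names come from the signatures on this page.

```python
# Hypothetical family labels mapped to the MLP keyword in each signature.
MLP_KEYWORD = {
    "llama": "swiglu",
    "mllama": "swiglu",
    "mistral": "swiglu",
    "mixtral": "swiglu",
    "gemma": "geglu",
    "gemma2": "geglu",
    "qwen2": "swiglu",
    "qwen2_vl": "swiglu",
    "phi3": "swiglu",
}


def mlp_override(model_type: str, enabled: bool) -> dict:
    """Return the keyword argument that toggles the MLP kernel for a family."""
    return {MLP_KEYWORD[model_type]: enabled}
```

For example, `apply_liger_kernel_to_gemma(**mlp_override("gemma", False))` would keep the other defaults and leave the MLP unpatched.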
liger_kernel.transformers.apply_liger_kernel_to_gemma2
apply_liger_kernel_to_gemma2(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, geglu=True, model=None)
Apply Liger kernels to replace the original implementation in HuggingFace Gemma2 (for Gemma1 please use apply_liger_kernel_to_gemma) to make GPU go burrr.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| geglu | bool | Whether to apply Liger's GeGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
liger_kernel.transformers.apply_liger_kernel_to_qwen2
apply_liger_kernel_to_qwen2(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, swiglu=True, model=None)
Apply Liger kernels to replace the original implementation in HuggingFace Qwen2 models.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| swiglu | bool | Whether to apply Liger's SwiGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
liger_kernel.transformers.apply_liger_kernel_to_qwen2_vl
apply_liger_kernel_to_qwen2_vl(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, layer_norm=True, swiglu=True, model=None)
Apply Liger kernels to replace the original implementation in HuggingFace Qwen2-VL models. NOTE: Qwen2-VL is not supported in transformers<4.52.4.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| layer_norm | bool | Whether to apply Liger's LayerNorm. Default is True. | True |
| swiglu | bool | Whether to apply Liger's SwiGLU MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py
liger_kernel.transformers.apply_liger_kernel_to_phi3
apply_liger_kernel_to_phi3(rope=True, cross_entropy=False, fused_linear_cross_entropy=True, rms_norm=True, swiglu=True, model=None)
Apply Liger kernels to replace the original implementation in HuggingFace Phi3 models.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope | bool | Whether to apply Liger's rotary position embedding. Default is True. | True |
| cross_entropy | bool | Whether to apply Liger's cross entropy loss. Default is False. | False |
| fused_linear_cross_entropy | bool | Whether to apply Liger's fused linear cross entropy loss. Default is True. | True |
| rms_norm | bool | Whether to apply Liger's RMSNorm. Default is True. | True |
| swiglu | bool | Whether to apply Liger's SwiGLU Phi3MLP. Default is True. | True |
| model | PreTrainedModel | The model instance to apply Liger kernels to, if the model has already been | None |
Source code in src/liger_kernel/transformers/monkey_patch.py