# Triton 3.2.0 AttrsDescriptor issue (leomaxwell973/Triton-3.3.0-UPDATE_FROM_3.2.0_and_FIXED-Windows-Nvidia-Prebuilt GitHub Wiki)
In 3.2.0, if you are not using a bleeding-edge or custom-built PyTorch, you may run into the following code failing, for some reason:

`torch\_inductor\codegen\triton.py`
```python
@lru_cache(None)
def gen_attr_descriptor_import():
    """
    import AttrsDescriptor if the triton version is new enough to have this
    class defined.
    """
    if not has_triton_package():
        return ""

    import triton.compiler.compiler

    # Note: this works because triton.compiler.compiler imports AttrsDescriptor from triton.backends.compiler
    # When support for the legacy AttrsDescriptor is removed then this import path should be changed.
    if hasattr(triton.compiler.compiler, "AttrsDescriptor"):
        return "from triton.compiler.compiler import AttrsDescriptor"
    else:
        return ""
```
As you can see, this code really shouldn't be failing in the first place: it checks `hasattr` before committing to the import. But it is failing. Possibly something changed in Python/Triton/PyTorch such that what used to be housekeeping and preventative bug coding, checking before committing to an action, now counts as committing by checking, and so it triggers "AttrsDescriptor not found" errors.

On the flip side of making no sense as a breakage, this much does make sense: `AttrsDescriptor` no longer exists. Neither do a lot of the Attrs and other properties that used to be present; they have all been shifted, absorbed, or otherwise deprecated in a big way on the backend, it seems.
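To see what your own environment reports, here is a quick diagnostic of mine (not PyTorch code, just a sketch) that probes the two locations different Triton versions used for `AttrsDescriptor`. It is safe to run even when Triton is not installed.

```python
import importlib
import importlib.util

def find_attrs_descriptor() -> str:
    """Return the module path that still exposes AttrsDescriptor, or ''."""
    if importlib.util.find_spec("triton") is None:
        return ""  # no Triton installed at all
    # Old (3.0.0) compiler location first, then the 3.2.0 backends location.
    for mod_name in ("triton.compiler.compiler", "triton.backends.compiler"):
        try:
            mod = importlib.import_module(mod_name)
        except ImportError:
            continue
        if hasattr(mod, "AttrsDescriptor"):
            return mod_name
    return ""

print(find_attrs_descriptor() or "AttrsDescriptor not found (3.3.0 behavior)")
```

If this prints "not found" on a machine where Triton imports fine, you are hitting exactly the removal described above.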
So what can you do?
I believe a bleeding-edge PyTorch, like the ones linked on Hugging Face at this time, MIGHT restore functionality. That build targets CUDA 12.8 and is "designed" for Blackwell, but it installed fine for me, and though I can't fully test it right now, I can say I have no reason to suspect needing to, at least not yet. HOWEVER, the more sure-fire way would be to build your own PyTorch from source; but building things in order to build things, I get that.
| CPython | PyTorch | Torchvision |
|---------|---------|-------------|
Alternative: Brute Force

If you are of a technical nature, you can try this: find the code above and change it to the following.
`torch\_inductor\codegen\triton.py`

```python
@lru_cache(None)
def gen_attr_descriptor_import():
    """
    import AttrsDescriptor if the triton version is new enough to have this
    class defined. DEPRECATED  <<< add if you like, or remove the comments
    """
    if not has_triton_package():
        return ""
    else:  # <<< either add this or move the one from the bottom up here, just to satisfy the scopes after the removed ifs
        import triton.compiler.compiler

        # Note: this works because triton.compiler.compiler imports AttrsDescriptor from triton.backends.compiler
        # When support for the legacy AttrsDescriptor is removed then this import path should be changed.
        # if hasattr(triton.compiler.compiler, "AttrsDescriptor"):           <<< <
        #     return "from triton.compiler.compiler import AttrsDescriptor"  <<< ^ < either comment out or delete everything here tagged as a comment
        # else:                                                              <<< ----
        return ""
```
OR

```python
@lru_cache(None)
def gen_attr_descriptor_import():
    if not has_triton_package():
        return ""
    else:
        import triton.compiler.compiler
        return ""
```
OR...

MY Own Experimental Trial & Discovery

While poking around for another issue that's barring me from testing in full swing and getting things done, I happened upon this class. It is not in the backends folder and is easily overlooked in the misc. tools folder. It has similar backend functionality from an attrs perspective; not a perfect match, and the two are different, but it's close, and it may be the replacement, or maybe not. Why wouldn't they update the imports? Why wouldn't torch be on point? Why is this in tools? Why is it experimental? If they are going to have attr descriptors anyway, why tuck this one away and nuke the others? So, you can append this code to the function above; I'll post more info on it if/when I can confirm.

>>> BUT ONLY IF YOU ARE BOTH SAVVY AND WILLING TO MESS WITH CODE YOURSELF IN A RISK-LIKELY ENVIRONMENT <<<
```python
    # When support for the legacy AttrsDescriptor is removed then this import path should be changed.
    # UPDATE: This is unofficially updated for Triton 3.2.0+, while retaining compatibility for 3.0.0 and before. - LeoMaxwell973 (Solo)
    if hasattr(triton.compiler.compiler, "AttrsDescriptor"):
        return "from triton.compiler.compiler import AttrsDescriptor"
    elif hasattr(triton.tools.experimental_descriptor, "TmaDescKernelParam"):
        return "from triton.tools.experimental_descriptor import TmaDescKernelParam"
    else:
        return ""
```
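The fallback chain above can also be sketched as a standalone function that probes each known location in order and returns the matching import line. This is my own sketch, not an official PyTorch API, and the candidate list is an assumption mirroring the patch above; it degrades gracefully when Triton is absent.

```python
import importlib
import importlib.util

# Candidate (module, class) pairs, tried in order: legacy compiler location,
# 3.2.0 backends location, then the experimental tools descriptor.
_CANDIDATES = [
    ("triton.compiler.compiler", "AttrsDescriptor"),
    ("triton.backends.compiler", "AttrsDescriptor"),
    ("triton.tools.experimental_descriptor", "TmaDescKernelParam"),
]

def gen_descriptor_import() -> str:
    """Return an import statement for the first descriptor class found, or ''."""
    if importlib.util.find_spec("triton") is None:
        return ""
    for mod_name, attr in _CANDIDATES:
        try:
            mod = importlib.import_module(mod_name)
        except ImportError:
            continue  # this submodule doesn't exist in this Triton version
        if hasattr(mod, attr):
            return f"from {mod_name} import {attr}"
    return ""
```

Structuring it as a data-driven list makes it easy to extend if the class moves again.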
! NEW !

A second location for missing Attrs values, and this time it's in torch... ? This can be hit by utilities trying to run torch.compile tests and inference: they first check Triton's version and dict handling here, in torch, instead of in Triton... because reasons... smh, devs.

`torch\_inductor\utils.py`
```python
class TritonAttrsDescriptorVersion(enum.Enum):
    V0_NO_TRITON = 0
    V1_COMPILER = 1  # triton.compiler.compiler.AttrsDescriptor
    V2_BACKENDS = 2  # triton.backends.compiler.AttrsDescriptor
    V3_BACKENDS_TUPLE = (
        3  # triton.backends.compiler.AttrsDescriptor, but with tuple support
    )
    V4_DICT = 4  # a raw dict


@functools.lru_cache(None)
def get_triton_attrs_descriptor_version() -> TritonAttrsDescriptorVersion:
    if importlib.util.find_spec("triton") is None:  # !!!! YOU MAY NEED TO ADD "import importlib.util"
        return TritonAttrsDescriptorVersion.V0_NO_TRITON

    import triton.backends.compiler
    import triton.compiler.compiler

    if hasattr(triton.backends.compiler, "AttrsDescriptor"):
        # Triton 3.2.0
        # AttrsDescriptor was moved from triton.compiler.compiler to triton.backends.compiler.
        # AttrsDescriptor and its serialization format were also changed.
        # TODO: implement V3_BACKENDS_TUPLE
        # On Dec 9, 2024, tuple support (triton #5220) was implemented and breaks handling.
        # We don't have a way to detect this (and haven't implemented this version)
        return TritonAttrsDescriptorVersion.V2_BACKENDS
    elif hasattr(triton.compiler.compiler, "AttrsDescriptor"):
        # Triton 3.0.0
        return TritonAttrsDescriptorVersion.V1_COMPILER
    else:
        # After Jan 1, 2025
        # AttrsDescriptor was removed and replaced with a raw dict.
        return TritonAttrsDescriptorVersion.V4_DICT


def triton_version_uses_attrs_dict() -> bool:
    return get_triton_attrs_descriptor_version() == TritonAttrsDescriptorVersion.V4_DICT
```
This gets appended at the bottom of the file, luckily, or so I'd assume; it was at the bottom in the raw torch source. Since I'm using the bleeding edge and it only solved SOME of the problems, the people that had/are having/will have NO problems are likely the ones building their own torches. The bleeding-edge 2.6.0+cu128 build only seemed to resolve the Triton side of the missing pieces, not so much the torch side, and well, what torch does just doesn't make sense sometimes; I'll leave it at that to save the rant. Also, to finish my point: as I said, this code was retrieved from the raw torch source for a build I have not gotten to do yet, because of atlas reasons and this issue stepping in on it more than I had anticipated. Even without being built, the source had the code. So if you're running into things you need to brute-force and already have the torch you want/need/should have, or just can't build one at this time like me, grab the raw source package anyway to pull patch/bypass code from it.
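As a sanity check, the detection logic can be run as a self-contained snippet outside of torch, so you can see which version your environment reports before patching anything. The `_module_has` helper is my own; the enum names mirror the `utils.py` snippet above.

```python
import enum
import functools
import importlib
import importlib.util

class TritonAttrsDescriptorVersion(enum.Enum):
    V0_NO_TRITON = 0
    V1_COMPILER = 1        # triton.compiler.compiler.AttrsDescriptor
    V2_BACKENDS = 2        # triton.backends.compiler.AttrsDescriptor
    V3_BACKENDS_TUPLE = 3  # same, but with tuple support
    V4_DICT = 4            # a raw dict

def _module_has(mod_name: str, attr: str) -> bool:
    """True if the module imports cleanly and exposes the attribute."""
    try:
        return hasattr(importlib.import_module(mod_name), attr)
    except ImportError:
        return False

@functools.lru_cache(None)
def get_triton_attrs_descriptor_version() -> TritonAttrsDescriptorVersion:
    if importlib.util.find_spec("triton") is None:
        return TritonAttrsDescriptorVersion.V0_NO_TRITON
    if _module_has("triton.backends.compiler", "AttrsDescriptor"):
        return TritonAttrsDescriptorVersion.V2_BACKENDS  # Triton 3.2.0
    if _module_has("triton.compiler.compiler", "AttrsDescriptor"):
        return TritonAttrsDescriptorVersion.V1_COMPILER  # Triton 3.0.0
    return TritonAttrsDescriptorVersion.V4_DICT          # 3.3.0+: raw dict

def triton_version_uses_attrs_dict() -> bool:
    return get_triton_attrs_descriptor_version() == TritonAttrsDescriptorVersion.V4_DICT

print(get_triton_attrs_descriptor_version())
```

On a Triton 3.3.0 install this should report `V4_DICT`; with no Triton at all, `V0_NO_TRITON`.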
TRITON KERNELS MISMATCH:

This looks like classic version creep. Change:

`torch\testing\_internal\triton_utils.py`
```python
@triton.jit
def inline_asm_kernel(X, Y, Z, n: "tl.constexpr", BLOCK: "tl.constexpr"):
    x = tl.load(X + tl.arange(0, BLOCK))
    y = tl.load(Y + tl.arange(0, BLOCK))
    s = tl.full([BLOCK], n, tl.int32)
    z = tl.inline_asm_elementwise(
        "shf.l.wrap.b32 $0, $1, $2, $3;",
        "=r,r, r, r",
        [x, y, s],
        dtype=tl.int32,
        is_pure=True,
        pack=1,
    )
    tl.store(Z + tl.arange(0, BLOCK), z)
```
To this, by copy-pasting the same kernel and changing the name on each copy (no suffix, `_is_pure_true`, `_is_pure_false`). Apart from the name and the `is_pure` flag, I couldn't find any difference, so just copy-paste it. It worked for me; at least, no errors.
```python
@triton.jit
def inline_asm_kernel(X, Y, Z, n: "tl.constexpr", BLOCK: "tl.constexpr"):
    x = tl.load(X + tl.arange(0, BLOCK))
    y = tl.load(Y + tl.arange(0, BLOCK))
    s = tl.full([BLOCK], n, tl.int32)
    z = tl.inline_asm_elementwise(
        "shf.l.wrap.b32 $0, $1, $2, $3;",
        "=r,r, r, r",
        [x, y, s],
        dtype=tl.int32,
        is_pure=True,
        pack=1,
    )
    tl.store(Z + tl.arange(0, BLOCK), z)


@triton.jit
def inline_asm_kernel_is_pure_true(X, Y, Z, n: "tl.constexpr", BLOCK: "tl.constexpr"):
    x = tl.load(X + tl.arange(0, BLOCK))
    y = tl.load(Y + tl.arange(0, BLOCK))
    s = tl.full([BLOCK], n, tl.int32)
    z = tl.inline_asm_elementwise(
        "shf.l.wrap.b32 $0, $1, $2, $3;",
        "=r,r, r, r",
        [x, y, s],
        dtype=tl.int32,
        is_pure=True,
        pack=1,
    )
    tl.store(Z + tl.arange(0, BLOCK), z)


@triton.jit
def inline_asm_kernel_is_pure_false(X, Y, Z, n: "tl.constexpr", BLOCK: "tl.constexpr"):
    x = tl.load(X + tl.arange(0, BLOCK))
    y = tl.load(Y + tl.arange(0, BLOCK))
    s = tl.full([BLOCK], n, tl.int32)
    z = tl.inline_asm_elementwise(
        "shf.l.wrap.b32 $0, $1, $2, $3;",
        "=r,r, r, r",
        [x, y, s],
        dtype=tl.int32,
        is_pure=False,
        pack=1,
    )
    tl.store(Z + tl.arange(0, BLOCK), z)
```
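For reference, the inline PTX these test kernels run, `shf.l.wrap.b32 d, a, b, c`, is a funnel shift: it concatenates `b` (high word) and `a` (low word) into a 64-bit value, shifts left by `c` mod 32, and keeps the upper 32 bits. A pure-Python model of that semantics (my own sketch, based on the PTX ISA description) makes it clear what each element of `Z` ends up holding:

```python
MASK32 = 0xFFFFFFFF

def shf_l_wrap_b32(a: int, b: int, c: int) -> int:
    """Model of PTX shf.l.wrap.b32: left funnel shift of the 64-bit value
    {b:a} by (c mod 32), returning the upper 32 bits."""
    n = c & 31  # .wrap mode: shift amount is taken mod 32
    if n == 0:
        return b & MASK32
    return ((b << n) | ((a & MASK32) >> (32 - n))) & MASK32

# With n = 1, the top bit of `a` funnels into the bottom of `b`:
print(hex(shf_l_wrap_b32(0x80000000, 0x00000001, 1)))  # 0x3
```

So the kernels compute `shf_l_wrap_b32(x[i], y[i], n)` per element; the three copies differ only in the `is_pure` flag passed to `tl.inline_asm_elementwise`.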