Mixtral is a MoE model, so its MLP layers go through layers/moe.py, where the function signature is forward(self, hidden_states, finished=None, workspace=None, ...) -- finished is an additional parameter specific to MOE that does not exist in MLP/GatedMLP/FusedGatedMLP, so the workspace should be passed as a keyword argument, workspace=all_reduce_workspace, at this line.
I will fix this internally and include it in the next main branch release. Please apply this local change in the meantime, thanks!
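A minimal sketch of that local change, assuming the call site looks like the MLP call in the decoder layer of the model definition (the attribute name `self.mlp` and the exact file/line are illustrative and may differ in your 0.7.1 checkout):

```python
# Hypothetical decoder-layer call site (exact file and line may differ).
# Before: the workspace is passed positionally. Against the MOE signature
# forward(self, hidden_states, finished=None, workspace=None, ...) it binds to
# `finished`, so `workspace` stays None and dereferencing it later raises
# AttributeError: 'NoneType' object has no attribute 'trt_tensor'.
hidden_states = self.mlp(hidden_states, all_reduce_workspace)

# After: pass the workspace by keyword so it reaches the right parameter for
# both the MLP/GatedMLP/FusedGatedMLP and MOE forward signatures.
hidden_states = self.mlp(hidden_states, workspace=all_reduce_workspace)
```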
Closing for now. If you still have issues after this fix, please feel free to re-open!
Env:
TRT-LLM 0.7.1
Host: p4d.24xlarge EC2 instance (A100)
Model: Mixtral-8x7B
Build args: TP=8, use_custom_all_reduce
Error log:
Fails with:
AttributeError: 'NoneType' object has no attribute 'trt_tensor'