Abstract: Transformers have shown great potential in computer vision tasks. A common belief is that their attention-based token mixer module contributes most to their competence. However, recent works show ...