In this paper, we propose an efficient multi-scale Transformer (EMSFormer) that employs learnable keys and values within a single-head attention mechanism, together with a dual-resolution structure, for ...
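The fragment above only names the architecture; it does not specify how the learnable keys and values are realized. As a purely illustrative sketch, one common way to build single-head attention with learnable (input-independent) keys and values is shown below. The class name `LearnableKVAttention` and the parameters `dim` and `num_kv` are assumptions for illustration, not the paper's actual design.

```python
import torch
import torch.nn as nn


class LearnableKVAttention(nn.Module):
    """Single-head attention where keys and values are learnable
    parameters rather than projections of the input. This is a
    hypothetical reading of 'learnable keys and values', not the
    EMSFormer implementation."""

    def __init__(self, dim: int, num_kv: int = 64):
        super().__init__()
        self.scale = dim ** -0.5
        self.to_q = nn.Linear(dim, dim, bias=False)            # queries come from the input
        self.keys = nn.Parameter(torch.randn(num_kv, dim))     # learnable keys
        self.values = nn.Parameter(torch.randn(num_kv, dim))   # learnable values
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim)
        q = self.to_q(x)                                                # (B, N, D)
        attn = torch.softmax(q @ self.keys.t() * self.scale, dim=-1)    # (B, N, num_kv)
        out = attn @ self.values                                        # (B, N, D)
        return self.proj(out)


if __name__ == "__main__":
    x = torch.randn(2, 196, 128)                 # e.g. 14x14 tokens with 128 channels
    attn = LearnableKVAttention(dim=128, num_kv=64)
    print(attn(x).shape)                         # torch.Size([2, 196, 128])
```

Because the keys and values here are parameter tensors shared across all inputs, the attention cost scales with `num_kv` instead of the token count, which is one plausible source of the claimed efficiency; how the dual-resolution structure interacts with this is not described in the fragment.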