CAMixerSR: Only Details Need More "Attention"

To satisfy the rapidly increasing demand for large-image (2K-8K) super-resolution (SR), prevailing methods follow two independent tracks: 1) accelerating existing networks with content-aware routing, and 2) designing better SR networks through token-mixer refinement. Despite their directness, both encounter unavoidable defects (e.g., inflexible routing or non-discriminative processing) that limit further improvement of the quality-complexity trade-off. To address these drawbacks, we integrate the two schemes by proposing a content-aware mixer (CAMixer), which assigns convolution to simple contexts and additional deformable window-attention to sparse textures. Specifically, CAMixer uses a learnable predictor to generate multiple bootstraps, including offsets for window warping, a mask for classifying windows, and convolutional attentions that endow the convolution with dynamic properties. This modulates the attention to include more useful textures self-adaptively and improves the representation capability of the convolution. We further introduce a global classification loss to improve the accuracy of the predictor. By simply stacking CAMixers, we obtain CAMixerSR, which achieves superior performance on large-image SR, lightweight SR, and omnidirectional-image SR.
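To make the routing idea in the abstract concrete, below is a minimal PyTorch sketch of content-aware window routing only: a lightweight predictor scores each non-overlapping window, every window goes through a cheap convolution branch, and only the top-scoring "textured" windows additionally receive window self-attention. All names (SimpleCAMixer, ratio, heads) are illustrative assumptions, not the authors' implementation; the deformable offsets, convolutional attentions, and global classification loss described in the paper are omitted for brevity.

```python
# Sketch of content-aware routing: conv for all windows, attention only for
# the hardest ones. Assumed/simplified relative to the paper (no offsets,
# no conv-attention modulation, no classification loss).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleCAMixer(nn.Module):
    def __init__(self, dim: int, window: int = 8, ratio: float = 0.5, heads: int = 4):
        super().__init__()
        self.window, self.ratio, self.heads = window, ratio, heads
        # Cheap path: depthwise convolution applied to every window.
        self.conv = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)
        # Predictor: one score per window (higher = more texture).
        self.predictor = nn.Sequential(nn.Conv2d(dim, dim // 4, 1), nn.GELU(),
                                       nn.Conv2d(dim // 4, 1, 1))
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, C, H, W = x.shape
        w = self.window
        out = self.conv(x)                                    # conv branch for all windows
        # Window-level scores by pooling the predictor map to the window grid.
        score = F.adaptive_avg_pool2d(self.predictor(x), (H // w, W // w)).flatten(1)
        k = max(1, int(self.ratio * score.shape[1]))
        idx = score.topk(k, dim=1).indices                    # hard windows per image
        # Partition into non-overlapping windows: (B, num_windows, w*w, C).
        wins = x.unfold(2, w, w).unfold(3, w, w)
        wins = wins.permute(0, 2, 3, 4, 5, 1).reshape(B, -1, w * w, C)
        gather_idx = idx[..., None, None].expand(-1, -1, w * w, C)
        sel = torch.gather(wins, 1, gather_idx)               # only selected windows
        # Plain multi-head self-attention inside each selected window.
        q, key, v = (t.reshape(B, k, w * w, self.heads, C // self.heads).transpose(2, 3)
                     for t in self.qkv(sel).chunk(3, dim=-1))
        attn = (q @ key.transpose(-2, -1)) * (C // self.heads) ** -0.5
        sel = (attn.softmax(-1) @ v).transpose(2, 3).reshape(B, k, w * w, C)
        sel = self.proj(sel)
        # Scatter attended windows back; simple windows keep their input values,
        # so fusing with the conv branch gives them a residual-style cheap path.
        wins = wins.scatter(1, gather_idx, sel)
        Hw, Ww = H // w, W // w
        attn_map = wins.reshape(B, Hw, Ww, w, w, C).permute(0, 5, 1, 3, 2, 4)
        return out + attn_map.reshape(B, C, H, W)


# Usage example (hypothetical shapes): feature map of 32 channels, 64x64.
x = torch.randn(1, 32, 64, 64)
y = SimpleCAMixer(32)(x)
print(y.shape)  # torch.Size([1, 32, 64, 64])
```

In this sketch the attention cost scales with the selected ratio rather than the full image, which is the complexity saving the abstract attributes to classifying windows before mixing.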

Media type:

Preprint

Year of publication:

2024

Published:

2024

Contained in:

arXiv.org (2024), 29 Feb. 2024

Language:

English

Contributors:

Wang, Yan [Author]
Zhao, Shijie [Author]
Liu, Yi [Author]
Li, Junlin [Author]
Zhang, Li [Author]

Links:

Full text [free of charge]

Subjects:

000
620
Computer Science - Computer Vision and Pattern Recognition
Electrical Engineering and Systems Science - Image and Video Processing

Funding institution / project title:

PPN (catalogue ID):

XCH04275237X