The use of vision transformers (ViT) in computer vision is increasing due to their limited inductive biases (e.g., locality and weight sharing) and increased scalability compared to other deep ...
This repository is based on SusungHong/Self-Attention-Guidance, which is based on openai/guided-diffusion. The environment setup and the pretrained models are the same as the original repository. The ...
This article presents a novel person reidentification model, named multihead self-attention network (MHSA-Net), to prune unimportant information and capture key local information from person images.