| Boosting Multi-Object Tracking Performance with Advanced Attention Mechanism based on Transformer | |
|---|---|
| DOI | |
| Creator | Pimpa Cheewaprakobkit |
| Title | Boosting Multi-Object Tracking Performance with Advanced Attention Mechanism based on Transformer |
| Publisher | Faculty of Informatics, Mahasarakham University |
| Publication Year | 2026 (B.E. 2569) |
| Journal Title | Journal of Applied Informatics and Technology |
| Journal Vol. | 8 |
| Journal No. | 1 |
| Page no. | 257934 |
| Keyword | Advanced Attention Mechanism, Attention Mechanisms, Multi-Object Tracking, Transformer |
| URL Website | https://ph01.tci-thaijo.org/index.php/jait |
| Website title | Journal of Applied Informatics and Technology |
| ISSN | 3088-1803 |
| Abstract | The Transformer architecture has been highly successful in natural language processing and is increasingly being applied to computer vision tasks, such as medical image analysis, traffic light monitoring, surveillance, and object tracking, because its self-attention mechanism enables global interactions between image patches. However, its quadratic complexity in time and memory limits its scalability for high-resolution images. To address this, we propose an advanced attention mechanism for multi-object tracking, incorporating Transposed Self-Attention (TSA) and Cross Patch Interaction (CPI) modules. TSA reduces computational complexity by capturing feature dependencies across the entire channel space instead of across image patches, resulting in linear complexity relative to the number of patches. CPI enhances cross-patch communication, improving the model’s learning efficiency. Our method reduces computational costs by approximately 13% and achieves state-of-the-art performance, with a multi-object tracking accuracy of 72.8% on MOT17 and 63.4% on MOT20. This represents a 10.3% improvement over the baseline method on MOT17, while also reducing training time per epoch by nearly 7 minutes and increasing frames per second from 7 to 8. These results demonstrate the effectiveness and efficiency of our approach for multi-object tracking. |
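
The abstract states that TSA computes attention across the channel space rather than across image patches, which makes the cost linear in the number of patches. The record does not give the formulation, so the following is only a minimal NumPy sketch of one common way to do channel-wise attention (a d x d attention map built from channel-normalised queries and keys). The function name `transposed_self_attention`, the weight matrices `w_q`, `w_k`, `w_v`, the `temperature` parameter, and the normalisation step are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def transposed_self_attention(x, w_q, w_k, w_v, temperature=1.0):
    """Channel-wise attention sketch (hypothetical, not the paper's code).

    x             : (N, d) array of N patch tokens with d channels
    w_q, w_k, w_v : (d, d) projection matrices (assumed, randomly set below)
    Returns the (N, d) output and the (d, d) attention map.
    """
    q = x @ w_q  # (N, d)
    k = x @ w_k  # (N, d)
    v = x @ w_v  # (N, d)

    # Assumption: L2-normalise every channel over the patch axis so the
    # d x d logits stay bounded regardless of how many patches there are.
    q = q / (np.linalg.norm(q, axis=0, keepdims=True) + 1e-6)
    k = k / (np.linalg.norm(k, axis=0, keepdims=True) + 1e-6)

    # Attention over channels: a (d, d) map instead of the usual (N, N) map.
    # Building it costs O(N * d^2), which is linear in the patch count N.
    attn = softmax((q.T @ k) / temperature, axis=-1)  # (d, d)

    # Mix channels for every patch; the output keeps the (N, d) layout.
    out = (attn @ v.T).T
    return out, attn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    weights = [rng.standard_normal((d, d)) for _ in range(3)]
    for n in (16, 64):  # the attention map size does not grow with N
        tokens = rng.standard_normal((n, d))
        out, attn = transposed_self_attention(tokens, *weights)
        print(n, out.shape, attn.shape)
```

In this sketch the attention map has a fixed size of d x d, so increasing the image resolution (more patches) only lengthens the matrix products with `x`, whereas standard self-attention would build an N x N map. Because channel attention alone does not exchange information between patches directly, a separate step such as the CPI module named in the abstract would be needed for that; it is not modelled here.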