Multimodal Deep Learning for Advertising Content Safety: A Comprehensive Study on Detection and Governance Strategies

Xin Lu

Abstract

The proliferation of digital advertising across multiple platforms has created unprecedented challenges for content safety and brand protection. This paper presents a comprehensive study of multimodal deep learning approaches for detecting unsafe advertising content, addressing both explicit violations and implicitly misleading information. We propose a novel framework that integrates visual, textual, and cross-modal features through attention-based fusion, combining pre-trained language models, vision transformers, and optical character recognition systems for comprehensive content analysis. Experimental results on a dataset of 45,000 advertising samples show that our approach achieves 92.3% accuracy in detecting policy violations, outperforming the best single-modality baseline by 15.5 percentage points and a strong late-fusion baseline by 8.6 percentage points. The framework is particularly strong at identifying implicitly misleading content, with an F1-score of 89.1%, and maintains a balanced precision-recall trade-off suitable for production deployment. This research also contributes practical governance strategies for human-AI collaboration in content moderation workflows, addressing the critical need for scalable and accurate advertising safety systems in the digital ecosystem.
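
The abstract names attention-based fusion of visual, textual, and OCR features but does not specify the architecture. The following is a minimal sketch of one common form of that idea, scaled dot-product attention over per-modality embeddings, written in plain Python. The function names, the four-dimensional toy embeddings, and the fixed query vector are all assumptions made for illustration; they are not taken from the paper, and in a trained model the query and embeddings would be learned.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(
    features: Dict[str, List[float]], query: List[float]
) -> Tuple[List[float], Dict[str, float]]:
    """Fuse per-modality vectors using scaled dot-product attention weights.

    Each modality is scored against the query, the scores are normalized
    with softmax, and the fused vector is the weighted sum of the inputs.
    """
    names = sorted(features)
    dim = len(query)
    scores = [
        sum(q * x for q, x in zip(query, features[name])) / math.sqrt(dim)
        for name in names
    ]
    weights = softmax(scores)
    fused = [
        sum(w * features[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Hypothetical toy embeddings; real ones would come from a vision
# transformer, a language model, and an OCR text encoder.
features = {
    "visual": [0.2, 0.9, 0.1, 0.4],
    "textual": [0.7, 0.1, 0.3, 0.5],
    "ocr": [0.6, 0.2, 0.8, 0.1],
}
query = [1.0, 0.0, 1.0, 0.0]  # hypothetical stand-in for a learned query

fused, weights = attention_fuse(features, query)
```

The returned weights sum to one, so they can be read as how much each modality contributes to the fused representation for a given ad, which is one reason attention-style fusion is often preferred over simple concatenation when reviewers need to inspect a decision.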

How to Cite

Multimodal Deep Learning for Advertising Content Safety: A Comprehensive Study on Detection and Governance Strategies. (2026). Journal of Science, Innovation & Social Impact, 2(1), 64-79. https://sagespress.com/index.php/JSISI/article/view/82
