🕵️ Robust NAS under adversarial training: benchmark, theory, and beyond


LIONS EPFL; University of Warwick; Mirelo AI; University of Wisconsin-Madison
*Partially done at LIONS-EPFL. Partially done at AWS Lablets.

Abstract

Recent developments in neural architecture search (NAS) emphasize the importance of architectures that are robust to malicious data. However, benchmark evaluations and theoretical guarantees for searching such robust architectures are notably absent, especially when adversarial training is considered. In this work, we address these two challenges with a twofold contribution. First, we release a comprehensive dataset that covers both clean accuracy and robust accuracy for a vast array of adversarially trained networks from the NAS-Bench-201 search space on image datasets. Second, leveraging the neural tangent kernel (NTK) tool from deep learning theory, we establish a generalization theory for architecture search in terms of clean accuracy and robust accuracy under multi-objective adversarial training. We firmly believe that our benchmark and theoretical insights will significantly benefit the NAS community through reliable reproducibility, efficient assessment, and a theoretical foundation, particularly in the pursuit of robust architectures.


NAS-Bench-201 Search Space


Figure: Visualization of the NAS-Bench-201 search space. Top left: A neural cell with 4 nodes and 6 edges. Top right: The 5 predefined operations, one of which is selected for each edge in the cell. Bottom: Macro structure of each candidate architecture in the benchmark.
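To make the size of this search space concrete, the following minimal sketch enumerates every cell: 6 edges with 5 candidate operations each give 5^6 = 15625 raw configurations. The operation names and the string encoding follow the public NAS-Bench-201 release and are assumptions here, since this page only states that there are 5 predefined operations. The helper name to_arch_string is hypothetical.

```python
from itertools import product

# Operation names as in the public NAS-Bench-201 release (assumed; this page
# only says that there are 5 predefined operations).
OPS = ("none", "skip_connect", "nor_conv_1x1", "nor_conv_3x3", "avg_pool_3x3")
NUM_NODES = 4

# One edge from every earlier node to every later node: 6 edges for 4 nodes.
EDGES = [(src, dst) for dst in range(1, NUM_NODES) for src in range(dst)]


def to_arch_string(ops):
    """Encode one operation per edge as a NAS-Bench-201-style cell string."""
    op_of = dict(zip(EDGES, ops))
    groups = []
    for dst in range(1, NUM_NODES):
        # Each group lists the incoming edges of node `dst` as "op~src".
        incoming = "|".join(f"{op_of[(src, dst)]}~{src}" for src in range(dst))
        groups.append(f"|{incoming}|")
    return "+".join(groups)


# Every assignment of one operation to each of the 6 edges.
all_cells = [to_arch_string(ops) for ops in product(OPS, repeat=len(EDGES))]
print(len(EDGES), len(all_cells))
```

This enumeration does not merge isomorphic cells, so it counts more configurations than the number of non-isomorphic architectures evaluated in the benchmark.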


Overview of the dataset


Figure: Boxplots of the clean and robust accuracy of all 6466 non-isomorphic architectures in the considered search space. The red line indicates the accuracy of a random guess.


Interactive Evaluation

You can change the operation on each edge and see the corresponding evaluation results.


                     CIFAR-10   CIFAR-100   ImageNet16-120
Clean                0.612      0.152       0.166
FGSM (e=3.0)         0.202      0.111       0.034
FGSM (e=8.0)         0.197      0.114       0.033
PGD (e=3.0)          0.139      0.109       0.022
PGD (e=8.0)          0.377      0.111       0.038
AutoAttack (e=8.0)   0.208      0.054
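The FGSM rows report robust accuracy, that is, accuracy measured on adversarially perturbed inputs rather than clean ones. As a rough illustration of the difference, the sketch below applies one FGSM step to a toy linear classifier with logistic loss and compares clean and robust accuracy. Everything in it is an assumption made for illustration: the linear model, the synthetic data, the function names, and the value of eps, which is expressed in the toy input units and is not the e used on this page. It is not the benchmark's evaluation code.

```python
import math
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def fgsm_linear(x, y, w, eps):
    """One FGSM step against the classifier sign(w . x); label y is +1 or -1."""
    margin = y * dot(w, x)
    # Gradient of log(1 + exp(-margin)) w.r.t. x is -y * sigmoid(-margin) * w.
    scale = -y / (1.0 + math.exp(margin))
    grad = [scale * wi for wi in w]
    # Move each coordinate by eps in the direction that increases the loss.
    return [xi + eps * ((gi > 0) - (gi < 0)) for xi, gi in zip(x, grad)]


def accuracy(xs, ys, w):
    return sum(1 for x, y in zip(xs, ys) if y * dot(w, x) > 0) / len(xs)


random.seed(0)
dim, n = 8, 200
w = [random.gauss(0, 1) for _ in range(dim)]  # toy "trained" weights
xs = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
# Noisy labels, so that clean accuracy is not trivially perfect.
ys = [1.0 if dot(w, x) + random.gauss(0, 0.5) > 0 else -1.0 for x in xs]

clean_acc = accuracy(xs, ys, w)
xs_adv = [fgsm_linear(x, y, w, eps=0.1) for x, y in zip(xs, ys)]
robust_acc = accuracy(xs_adv, ys, w)
print(f"clean={clean_acc:.3f} robust={robust_acc:.3f}")
```

For a linear model, the FGSM step lowers every sample's margin by eps times the L1 norm of w, so robust accuracy can never exceed clean accuracy in this toy setting. Deep networks offer no such closed form, which is why a benchmark has to measure robust accuracy empirically.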

BibTeX


@inproceedings{wu2024robust,
  title={Robust NAS under adversarial training: benchmark, theory, and beyond},
  author={Yongtao Wu and Fanghui Liu and Carl-Johann Simon-Gabriel and Grigorios Chrysos and Volkan Cevher},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/forum?id=cdUpf6t6LZ}
}