SNAC-Pack is an open-source AutoML framework that automates neural architecture design for FPGA deployment by combining hardware-aware search with quantization and pruning. It can cut design cycles from months to hours while matching or exceeding baseline performance on tasks such as jet classification and qubit readout.
SNAC-Pack addresses a gap in neural architecture search (NAS) by optimizing for hardware constraints rather than accuracy alone. Traditional NAS methods rely on proxy metrics such as bit operations, which poorly predict what an FPGA design actually costs: resource usage (lookup tables, DSPs, flip-flops, BRAM) and latency. The framework closes that gap with a hardware surrogate model that estimates these costs without running expensive synthesis for each candidate, which enables parallel multi-objective optimization across compute nodes. A global search combining Optuna and NSGA-II is followed by a local refinement stage that applies quantization-aware training and iterative pruning, and the resulting model is synthesized to FPGA firmware via hls4ml.
The practical impact shows up as shorter design cycles: for qubit readout, automated search compressed months of manual tuning into hours while discovering architectures that match or exceed baseline performance. This makes hardware-aware ML more accessible in specialized domains like particle physics and quantum computing, where FPGA deployment is critical but design expertise is scarce. The framework is open source, with YAML configuration and optional agentic interfaces, which lowers the barrier to adoption for researchers without deep hardware optimization knowledge.
The work reflects a broader industry trend toward co-design methodologies that treat hardware constraints as first-class optimization objectives rather than afterthoughts. For organizations deploying edge ML on resource-constrained devices, the toolkit offers efficiency gains and faster time-to-deployment.
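
To make the idea of surrogate-guided, multi-objective search concrete, here is a minimal sketch in plain Python. It is not SNAC-Pack's implementation: the `Candidate` fields, `surrogate_luts`, and `proxy_error` are hypothetical stand-ins with made-up formulas, and plain random sampling replaces the Optuna and NSGA-II search described above. What it does show is the core step of scoring each candidate on two objectives without synthesis and keeping only the Pareto-optimal ones.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical architecture description (illustrative fields only)."""
    layers: int
    width: int
    bits: int


def surrogate_luts(c: Candidate) -> float:
    # Stand-in for a learned hardware surrogate: a made-up cost that grows
    # with multiply-accumulate count and weight bit width.
    macs = c.layers * c.width * c.width
    return macs * c.bits / 8.0


def proxy_error(c: Candidate) -> float:
    # Made-up error proxy that shrinks with capacity; a real search would
    # train and validate each candidate instead.
    capacity = c.layers * c.width * c.bits
    return 1.0 / (1.0 + capacity / 256.0)


def dominates(a, b) -> bool:
    """True if objective tuple a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scored: dict) -> dict:
    """Keep candidates whose objectives are not dominated by any other candidate."""
    return {
        cand: obj
        for cand, obj in scored.items()
        if not any(dominates(other, obj) for other in scored.values())
    }


def random_search(n_trials: int = 200, seed: int = 0) -> dict:
    rng = random.Random(seed)
    scored = {}
    for _ in range(n_trials):
        c = Candidate(
            layers=rng.randint(1, 4),
            width=rng.choice([8, 16, 32, 64]),
            bits=rng.choice([2, 4, 6, 8]),
        )
        # Both objectives are minimized: (error, estimated LUTs).
        scored[c] = (proxy_error(c), surrogate_luts(c))
    return pareto_front(scored)


front = random_search()
for cand, (err, luts) in sorted(front.items(), key=lambda kv: kv[1][1]):
    print(f"{cand}  error~{err:.3f}  luts~{luts:.0f}")
```

Because every evaluation is a cheap function call rather than a synthesis run, trials like these are independent and can be distributed across compute nodes. An evolutionary sampler such as NSGA-II would replace the uniform sampling here with selection and mutation guided by the current front.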
- Hardware surrogate models estimate FPGA resource costs during neural architecture search without running expensive synthesis for each candidate.
- Multi-objective optimization discovers Pareto-optimal architectures that balance accuracy, FPGA utilization, and latency simultaneously.
- Cutting design cycles from months to hours demonstrates practical value for specialized applications like quantum computing and particle physics.
- The open-source framework's YAML configuration lowers implementation barriers for researchers without extensive hardware expertise.
- Combined quantization-aware training and pruning during local search further compress models for resource-constrained FPGA deployment.
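
The last takeaway, compression through pruning and quantization, can be illustrated with a small sketch. It assumes simple magnitude pruning and symmetric uniform quantization on a flat list of weights; the function names, the pruning fraction, and the bit width are all hypothetical, and the fine-tuning that quantization-aware training would perform between rounds is only marked by a comment. The actual local search in SNAC-Pack may differ.

```python
def fake_quantize(w: float, bits: int, max_abs: float = 1.0) -> float:
    """Snap w to a symmetric uniform grid with 2**(bits-1) - 1 positive levels."""
    levels = 2 ** (bits - 1) - 1
    scale = max_abs / levels
    q = round(w / scale)
    q = max(-levels, min(levels, q))  # clip values outside [-max_abs, max_abs]
    return q * scale


def prune_step(weights, fraction: float):
    """Zero out the smallest-magnitude `fraction` of the still-nonzero weights."""
    nonzero = sorted((abs(w), i) for i, w in enumerate(weights) if w != 0.0)
    n_prune = int(len(nonzero) * fraction)
    pruned = list(weights)
    for _, i in nonzero[:n_prune]:
        pruned[i] = 0.0
    return pruned


def iterative_prune(weights, fraction: float = 0.2, rounds: int = 3, bits: int = 6):
    current = list(weights)
    for _ in range(rounds):
        current = prune_step(current, fraction)
        # A real flow would fine-tune the surviving weights here (for example
        # with quantization-aware training) before the next pruning round.
    return [fake_quantize(w, bits) for w in current]


def sparsity(weights) -> float:
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


example_weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2, -0.3, 0.6, 0.15, -0.85]
compressed = iterative_prune(example_weights)
print(f"sparsity after pruning: {sparsity(compressed):.2f}")
```

Pruning a fixed fraction of the remaining weights each round, rather than all at once, gives the fine-tuning step a chance to recover accuracy before more weights are removed.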