Abstract
Gaze estimation requires balancing accuracy and efficiency for real-world deployment. We introduce LightGazeNet, a lightweight Graph Neural Network (GNN) framework that integrates multi-modal inputs (facial features, eye cues, 3D eye centers, head pose, and calibration data) within a compact graph-based architecture. Using multi-head attention for context-aware fusion, LightGazeNet achieves competitive or superior accuracy with significantly fewer parameters and strong cross-dataset generalization.
What’s new
- Graph modeling of heterogeneous gaze cues (appearance + geometry) for explicit relational reasoning.
- Multi-head attention GNN to adaptively weight modalities and improve interpretability.
- Lightweight design for practical deployment on resource-constrained devices.
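The attention-weighted fusion of modality nodes described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dimensions, the `multi_head_attention_fusion` helper, and the random projections standing in for learned weights are all assumptions.

```python
import numpy as np

def multi_head_attention_fusion(nodes, num_heads=4, seed=0):
    """Fuse modality node embeddings (e.g. face, eyes, 3D eye centers,
    head pose, calibration) via scaled dot-product multi-head attention.

    nodes: (N, D) array, one row per modality node; D must be divisible
    by num_heads. Returns a fused (N, D) array of context-aware embeddings.
    """
    n, d = nodes.shape
    dh = d // num_heads
    rng = np.random.default_rng(seed)
    # Random projections stand in for learned Q/K/V weights in this sketch.
    wq, wk, wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = nodes @ wq, nodes @ wk, nodes @ wv
    heads = []
    for h in range(num_heads):
        s = slice(h * dh, (h + 1) * dh)
        scores = q[:, s] @ k[:, s].T / np.sqrt(dh)        # (N, N) node affinities
        weights = np.exp(scores - scores.max(axis=1, keepdims=True))
        weights /= weights.sum(axis=1, keepdims=True)     # softmax over modalities
        heads.append(weights @ v[:, s])                   # per-head fused features
    return np.concatenate(heads, axis=1)

# Five modality nodes with 32-dim embeddings (illustrative sizes).
fused = multi_head_attention_fusion(
    np.random.default_rng(1).standard_normal((5, 32)))
print(fused.shape)  # (5, 32)
```

Each head lets a node weight the other modalities differently, which is what makes the learned attention weights inspectable for interpretability.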
Results
LightGazeNet is designed for strong accuracy–efficiency trade-offs and robust generalization. Below are key headline numbers from the paper.
| Benchmark | Error | Protocol |
|---|---|---|
| MPIIFaceGaze | 3.06° | Mean angular error (leave-one-subject-out); calibration further improves performance. |
| EyeDiap | 2.91° | Mean angular error under the standard evaluation protocol. |
| GazeCapture | 1.69 cm | Overall distance error across devices (phone + tablet). |
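Angular error in these benchmarks is conventionally the angle between the predicted and ground-truth 3D gaze directions. A minimal sketch (the `angular_error_deg` helper is illustrative, not from the paper):

```python
import numpy as np

def angular_error_deg(pred, gt):
    """Angle in degrees between predicted and ground-truth gaze vectors."""
    pred = pred / np.linalg.norm(pred, axis=-1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=-1, keepdims=True)
    cos = np.clip(np.sum(pred * gt, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

# A gaze direction offset by 3 degrees in yaw yields ~3 degrees of error.
print(angular_error_deg(
    np.array([0.0, 0.0, -1.0]),
    np.array([np.sin(np.radians(3)), 0.0, -np.cos(np.radians(3))])))  # ≈ 3.0
```

GazeCapture instead reports Euclidean distance (in cm) between predicted and true on-screen gaze points, which is why its headline number carries a length unit.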
Calibration (MPIIFaceGaze)
| Calibration samples (k) | Angular error (°) | Improvement |
|---|---|---|
| Uncalibrated | 3.39 | — |
| 1 | 3.28 | 3.24% |
| 9 | 3.15 | 7.08% |
| 16 | 3.06 | 9.73% |
| 32 | 2.99 | 11.80% |
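The improvement column follows directly from the uncalibrated baseline of 3.39°; a quick check reproduces the table:

```python
# Relative improvement over the uncalibrated MPIIFaceGaze baseline (3.39 deg).
baseline = 3.39
for k, err in [(1, 3.28), (9, 3.15), (16, 3.06), (32, 2.99)]:
    print(f"k={k}: {100 * (baseline - err) / baseline:.2f}%")
# k=1: 3.24%
# k=9: 7.08%
# k=16: 9.73%
# k=32: 11.80%
```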
Citation
```bibtex
@inproceedings{LightGazeNet2026,
  title     = {LightGazeNet: A Lightweight GNN-based Architecture for Gaze Estimation},
  author    = {Patel, Heena and Chowdhury, Anirban and Choksy, Pooja Jigar and Pachade, Samiksha Pradeep and Puar, Ajinkya},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year      = {2026},
  note      = {Accepted},
}
```
Contact
Questions, collaborations, or requests:
Email: eyelignai@akesoeyecare.com
Affiliation: Akeso Eyecare, Beijing, China