Abstract

Gaze estimation requires balancing accuracy and efficiency for real-world deployment. We introduce LightGazeNet, a lightweight Graph Neural Network (GNN) framework that integrates multi-modal inputs (facial features, eye cues, 3D eye centers, head pose, and calibration data) within a compact graph-based architecture. Using multi-head attention for context-aware fusion, LightGazeNet achieves competitive or superior accuracy with significantly fewer parameters and strong cross-dataset generalization.

What’s new

  • Graph modeling of heterogeneous gaze cues (appearance + geometry) for explicit relational reasoning.
  • Multi-head attention GNN to adaptively weight modalities and improve interpretability.
  • Lightweight design for practical deployment on resource-constrained devices.
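The fusion idea above can be sketched concretely: each modality (face, eyes, 3D eye centers, head pose, calibration) is a node in a small graph, and multi-head attention weights the nodes before a readout. This is an illustrative NumPy sketch under assumed dimensions, not the paper's implementation; all names and sizes are placeholders.

```python
import numpy as np

def multi_head_attention(nodes, num_heads=4):
    """Parameter-free self-attention over modality nodes (illustrative only)."""
    n, d = nodes.shape
    dh = d // num_heads
    out = np.zeros_like(nodes)
    for h in range(num_heads):
        x = nodes[:, h * dh:(h + 1) * dh]           # per-head feature slice
        scores = x @ x.T / np.sqrt(dh)              # scaled dot-product scores
        scores -= scores.max(axis=1, keepdims=True) # numerical stability
        w = np.exp(scores)
        w /= w.sum(axis=1, keepdims=True)           # softmax over graph nodes
        out[:, h * dh:(h + 1) * dh] = w @ x         # attention-weighted mix
    return out

# Five toy modality nodes (face, left/right eye, eye centers, head pose), 8-dim.
rng = np.random.default_rng(0)
nodes = rng.standard_normal((5, 8))
fused = multi_head_attention(nodes).mean(axis=0)    # mean readout -> gaze head
print(fused.shape)
```

In the full model these embeddings would come from learned per-modality encoders, and the attention projections would be trainable; the sketch only shows how attention lets each modality's contribution be weighted by context.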

Results

LightGazeNet is designed for strong accuracy–efficiency trade-offs and robust generalization. Below are key headline numbers from the paper.

  • MPIIFaceGaze: 3.06° mean angular error (leave-one-subject-out); calibration further improves performance.
  • EyeDiap: 2.91° mean angular error under the standard evaluation protocol.
  • GazeCapture: 1.69 cm overall distance error across devices (phone + tablet).
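Angular error, the metric quoted above, is the angle between the predicted and ground-truth 3D gaze direction vectors. A minimal sketch (the sample vectors are illustrative, not from the paper):

```python
import numpy as np

def angular_error_deg(pred, gt):
    """Mean angle in degrees between predicted and ground-truth gaze vectors."""
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=1, keepdims=True)
    cos = np.clip((pred * gt).sum(axis=1), -1.0, 1.0)  # clip guards arccos domain
    return np.degrees(np.arccos(cos)).mean()

pred = np.array([[0.0, 0.0, -1.0], [0.1, 0.0, -1.0]])
gt   = np.array([[0.0, 0.0, -1.0], [0.0, 0.0, -1.0]])
print(f"{angular_error_deg(pred, gt):.2f} deg")
```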

Calibration (MPIIFaceGaze)

  Calibration samples (k) | Angular error (°) | Improvement
  Uncalibrated            | 3.39              | -
  1                       | 3.28              | 3.24%
  9                       | 3.15              | 7.08%
  16                      | 3.06              | 9.73%
  32                      | 2.99              | 11.80%
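The improvement column is relative to the uncalibrated baseline, i.e. (3.39 - err) / 3.39. A quick consistency check over the table's rows:

```python
baseline = 3.39  # uncalibrated mean angular error (degrees)
rows = {1: 3.28, 9: 3.15, 16: 3.06, 32: 2.99}  # k calibration samples -> error

for k, err in rows.items():
    gain = (baseline - err) / baseline * 100  # relative improvement in percent
    print(f"k={k:2d}: {gain:.2f}% improvement")
```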

Citation



@inproceedings{LightGazeNet2026,
  title     = {LightGazeNet: A Lightweight GNN-based Architecture for Gaze Estimation},
  author    = {Patel, Heena and Chowdhury, Anirban and Choksy, Pooja Jigar and Pachade, Samiksha Pradeep and Puar, Ajinkya},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year      = {2026},
  note      = {Accepted},
}

Contact

Questions, collaborations, or requests:

Email: eyelignai@akesoeyecare.com

Affiliation: Akeso Eyecare, Beijing, China