Abstract

Gaze estimation requires balancing accuracy and efficiency for real-world deployment. We introduce LightGazeNet, a lightweight Graph Neural Network (GNN) framework that integrates multi-modal inputs (facial features, eye cues, 3D eye centers, head pose, and calibration data) within a compact graph-based architecture. Using multi-head attention for context-aware fusion, LightGazeNet achieves competitive or superior accuracy with significantly fewer parameters and strong cross-dataset generalization.

What’s new

  • Graph modeling of heterogeneous gaze cues (appearance + geometry) for explicit relational reasoning.
  • Multi-head attention GNN to adaptively weight modalities and improve interpretability.
  • Lightweight design for practical deployment on resource-constrained devices.

Method

LightGazeNet builds a fully connected graph of modality nodes (left/right eye, face, head rotation, left/right 3D eye position), projects all modalities into a shared embedding, and performs two-layer multi-head attention graph reasoning.

  1. Feature encoding & projection: lightweight MobileNetV3-Small for images; linear layers for geometric inputs.
  2. Graph construction: 6-node fully connected graph where edges represent inter-modal dependencies.
  3. GNN reasoning: multi-head attention updates node features; flattened graph embedding feeds regression head.
  4. Calibration: subject-specific embedding for efficient personalization (few-shot calibration).
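The two-layer multi-head attention reasoning over the 6-node fully connected graph (steps 2–3 above) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the embedding size, number of heads, and random weights are all placeholders.

```python
import numpy as np

def multi_head_attention_layer(H, Wq, Wk, Wv, num_heads=4):
    """One multi-head attention layer over a fully connected graph.

    H: (N, d) node features (N = 6 modality nodes).
    Wq, Wk, Wv: (d, d) projection matrices, split column-wise across heads.
    """
    N, d = H.shape
    dh = d // num_heads
    out = np.zeros_like(H)
    for h in range(num_heads):
        sl = slice(h * dh, (h + 1) * dh)
        Q, K, V = H @ Wq[:, sl], H @ Wk[:, sl], H @ Wv[:, sl]
        scores = Q @ K.T / np.sqrt(dh)              # (N, N): attention on every edge
        A = np.exp(scores - scores.max(axis=1, keepdims=True))
        A /= A.sum(axis=1, keepdims=True)           # row-wise softmax over neighbors
        out[:, sl] = A @ V                          # attention-weighted aggregation
    return out

rng = np.random.default_rng(0)
d = 32                                              # shared embedding size (illustrative)
H = rng.standard_normal((6, d))                     # 6 modality nodes after projection
for _ in range(2):                                  # two reasoning layers
    Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
    H = multi_head_attention_layer(H, Wq, Wk, Wv)
graph_embedding = H.reshape(-1)                     # flattened (6 * d,) vector fed to the regression head
```

Because every node attends to every other node, the attention weights give a per-edge measure of how much each modality contributes, which is what the interpretability claim rests on.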

Inputs

  • Face image (normalized crop)
  • Left eye crop, Right eye crop
  • Head rotation vector (9D)
  • 3D eye center positions (left/right)
  • Calibration embedding (optional)

Output

Regress pitch and yaw; optionally reconstruct the 3D gaze direction via a spherical-to-Cartesian transform.
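The spherical-to-Cartesian step can be written in a few lines. The sketch below uses a common gaze-estimation axis convention (as in MPIIGaze-style code); the paper's exact convention may differ.

```python
import math

def pitchyaw_to_vector(pitch, yaw):
    """Convert (pitch, yaw) in radians to a unit 3D gaze direction.

    Convention (assumed, common in gaze estimation): camera looks along -z,
    so zero pitch and yaw gives the vector (0, 0, -1).
    """
    x = -math.cos(pitch) * math.sin(yaw)
    y = -math.sin(pitch)
    z = -math.cos(pitch) * math.cos(yaw)
    return (x, y, z)
```

The output is unit-length by construction, so no normalization is needed before computing angular errors.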

Results

LightGazeNet is designed for strong accuracy–efficiency trade-offs and robust generalization. Below are key headline numbers from the paper.

  Dataset        Result    Notes
  MPIIFaceGaze   3.06°     Mean angular error (leave-one-subject-out); calibration improves this further.
  EyeDiap        2.91°     Mean angular error under the standard evaluation protocol.
  GazeCapture    1.69 cm   Overall distance error across devices (phone + tablet).
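The angular-error metric reported for MPIIFaceGaze and EyeDiap is the mean angle between predicted and ground-truth 3D gaze directions. A standard implementation (not the paper's code) looks like:

```python
import numpy as np

def mean_angular_error_deg(pred, gt):
    """Mean angle in degrees between predicted and ground-truth gaze vectors.

    pred, gt: (N, 3) arrays of 3D gaze directions (need not be unit length).
    """
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=1, keepdims=True)
    cos = np.clip(np.sum(pred * gt, axis=1), -1.0, 1.0)  # clip guards arccos against rounding
    return float(np.degrees(np.arccos(cos)).mean())
```

The GazeCapture number uses a different metric: Euclidean distance (in cm) between predicted and true on-screen gaze points.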

Calibration (MPIIFaceGaze)

  Calibration samples (k)   Angular error (°)   Improvement
  Uncalibrated              3.39
  1                         3.28                3.24%
  9                         3.15                7.08%
  16                        3.06                9.73%
  32                        2.99                11.80%
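The improvement column is the relative error reduction versus the uncalibrated baseline; a quick arithmetic check against the table:

```python
base = 3.39  # uncalibrated mean angular error (degrees)
for k, err in [(1, 3.28), (9, 3.15), (16, 3.06), (32, 2.99)]:
    improvement = 100.0 * (base - err) / base
    print(f"k={k:>2}: {err:.2f} deg, improvement {improvement:.2f}%")
```

This reproduces the reported 3.24%, 7.08%, 9.73%, and 11.80% figures.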

Read the paper

If GitHub Pages blocks embedded PDFs in some browsers, use the “View PDF” button above.

Citation



@InProceedings{Patel_2026_WACV,
    author    = {Patel, Heena and Chowdhury, Anirban and Choksy, Pooja Jigar and Pachade, Samiksha Pradeep and Puar, Ajinkya},
    title     = {LightGazeNet: A Lightweight GNN-based Architecture for Gaze Estimation},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {March},
    year      = {2026},
    pages     = {3710-3719}
}
            

Contact

Questions, collaborations, or requests:

Email: eyelignai@akesoeyecare.com

Affiliation: Akeso Eyecare, Beijing, China