Wearable egocentric visual context detection raises privacy concerns and is rarely personalized or on-device. We created PAL, a wearable system with on-device deep learning, so that user images never have to be sent to the cloud and can instead be processed in a real-time, offline, and privacy-preserving manner. PAL enables human-in-the-loop context labeling via wearable audio input/output and a mobile/web application. PAL runs on-device deep learning models for object and face detection, low-shot custom face recognition (~1 training image per person), low-shot custom context recognition (e.g., brushing teeth; ~10 training images per context), and custom context clustering for active learning. We tested PAL with 4 participants over 2 days each, collecting ~1000 in-the-wild images. Participants found PAL easy to use, and each model achieved ≥80% accuracy. PAL thus supports wearable, personalized, and privacy-preserving egocentric visual context detection using human-in-the-loop, low-shot, and on-device deep learning.
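As a rough illustration of how low-shot recognition from ~1 training image per person can work, the sketch below classifies a new face embedding by nearest-neighbor cosine similarity against enrolled prototype embeddings, falling back to "unknown" below a threshold. This is a hypothetical minimal example, not PAL's actual implementation: the class name `LowShotRecognizer`, the threshold value, and the assumption that a face-embedding model is available upstream are all illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors (plain Python lists)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)


class LowShotRecognizer:
    """Nearest-neighbor classifier over embeddings: one or a few examples per class.

    Hypothetical sketch; assumes embeddings come from an upstream
    (e.g., on-device) face or context encoder.
    """

    def __init__(self, threshold=0.5):
        self.threshold = threshold      # below this similarity, report "unknown"
        self.prototypes = {}            # label -> list of enrolled embeddings

    def enroll(self, label, embedding):
        """Add one labeled example; ~1 example per person suffices for this scheme."""
        self.prototypes.setdefault(label, []).append(embedding)

    def predict(self, embedding):
        """Return (best label, similarity), or ("unknown", score) under threshold."""
        best_label, best_score = None, -1.0
        for label, embs in self.prototypes.items():
            for e in embs:
                s = cosine(embedding, e)
                if s > best_score:
                    best_label, best_score = label, s
        if best_score < self.threshold:
            return "unknown", best_score
        return best_label, best_score
```

In use, the human-in-the-loop labeling step would call `enroll` with the user-provided label, and subsequent frames would be classified with `predict`; low-confidence ("unknown") results are natural candidates for active-learning queries back to the user.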

https://www.youtube.com/watch?v=rL2zv8zx17A&ab_channel=MinaKhan