Apple's Person Detection

Aug 2, 2021 07:13 PM
Read the full piece

Recognizing People in Photos Through Private On-Device Machine Learning.

A very brief research summary
  1. Apple will group together the important people in Photos to create a better, more personalized experience out of the photos and videos you keep.
  1. Automatic Person Recognition is the new "Face Detection"
      • Cross-referencing data to recognize people even when they don't face the camera
      • Face, location, context, and now upper body are taken into account
  1. All that information is processed on your iPhone, overnight. You shouldn't notice it.
      • It relies heavily on the Photos gallery's structure and on pixel similarity within time and proximity contexts.
      • Will everything break down if I change all the metadata and edit every single image differently (like really differently; isolating reds and desaturating all the rest, for example)?
  1. Photos filters out unclear faces to decrease the error rate. Some are blurred, and some are not faces at all but were detected as such by the face detection algorithm. That's a well-known problem, but I wonder what the indicator is for validating that a face is a face, and what the indicator's indicator is.
  1. Addressing gender and race bias is key for real-life success, and Apple is working towards it. Important work imo. I love that their work on it is discussed in a techy article and not in a prime-time TechCrunch piece. Good people do and don't tell.
  1. The visualization of the results looks Apple'y: smooth, clean, and containing only what matters.
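
To make two of the points above more concrete (dropping unclear faces, and falling back to the upper body when the face isn't visible), here is a minimal sketch in Python. It is not Apple's method: the `Observation` type, the `face_quality` score, the 0.5 threshold, and the equal weighting of face and body are all assumptions made up for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Observation:
    """One detected person in one photo (hypothetical structure)."""
    face: Optional[List[float]]  # face embedding, None when the face isn't visible
    body: List[float]            # upper-body embedding
    face_quality: float          # assumed 0..1 score; low means blurred or not a face


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def usable(obs: Observation, min_quality: float = 0.5) -> bool:
    """Keep observations with no face (body only) or with a clear enough face."""
    return obs.face is None or obs.face_quality >= min_quality


def similarity(a: Observation, b: Observation) -> float:
    """Average face and body similarity when both faces exist, else body only."""
    body = cosine(a.body, b.body)
    if a.face is not None and b.face is not None:
        # Equal weighting is an arbitrary choice for this sketch.
        return 0.5 * (cosine(a.face, b.face) + body)
    return body
```

With something like this, a photo of someone with their back turned can still be compared to a clear portrait through the body embedding alone, while a blurry "face" is dropped before it can pollute the grouping. How the quality score itself gets validated is exactly the open question from the list.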

2 cents is all I have
In reacting to this very good research by Apple, I chose not to dive deep into the How and to focus more on the What. The challenges of defining "accuracy", not to mention reaching it, in a chaotic environment like one person's Photos gallery only get deeper the more you look at them.
Working at PostPro, we reached a conclusion similar to Apple's in this research; the main one is that sacrifices have to be made across the board in order to reach just a fraction of your end goal.
Calculating what one might (or should) feel when externally using their own memories is indeed a dare, but I think Apple is taking the right direction here in the What, as well as in the How. The context of 'relatives', or a human-centric approach, is key at this point when trying to define what matters. What about John's cat? Will she be recognized by her upper body?