HCP Face Analysis: Architecture
Pipeline
A single photo runs through ten analyzers. Their outputs are merged
into one dictionary, and later analyzers can overwrite keys set by
earlier ones. Overwrites are intentional in only a couple of places;
_run_pipeline in app.py is the single source of truth for the order.
Photo (RGB ndarray)
│
├─► [1] InsightFaceAnalyzer (insightface buffalo_l, ONNX)
│     → face_bbox, face_confidence, face_embedding (512-d ArcFace),
│       age_estimate (piecewise-calibrated). Gender comes from
│       FairFace (step 3a) for a real softmax confidence.
│
├─► Build face crop from face_bbox + padding. Downstream analyzers
│     that benefit from a tighter input read the crop; MediaPipe gets
│     the full image because it has its own detector.
│
├─► [2] LandmarkAnalyzer (MediaPipe Face Landmarker)
│     478 landmarks + 52 blendshapes → all geometric features,
│       smiling, mouth_open (via blendshapes.jawOpen), eyes_open,
│       facial_asymmetry_score, smile_asymmetry, possible_dimples,
│       possible_unibrow.
│
├─► [3a] GenderAnalyzer (dima806/fairface_gender ViT)
│     → gender, gender_confidence, gender_distribution
│       (cropped input). Replaces the InsightFace gender head so
│       we get a real softmax confidence.
│
├─► [3b] EthnicityAnalyzer (cledoux42/Ethnicity_Test_v003 ViT)
│     → ethnicity, ethnicity_confidence, ethnicity_distribution
│       (cropped input).
│
├─► [4] ParsingAnalyzer (SegFormer-B5 human parsing)
│     → _skin_mask, _hair_mask, hat_detected, hair_length,
│       hair_present, wrinkle_level, skin_texture_score,
│       skin_uniformity, freckles_or_moles
│       (cropped input → cleaner masks).
│
├─► [5] EmotionAnalyzer (HSEmotion EfficientNet-B0)
│     → primary/secondary emotion, emotion_scores, valence,
│       arousal, mood (cropped input).
│
├─► [6] ColorAnalyzer (no ML; OpenCV LAB/HSV)
│     Reads SegFormer masks + MediaPipe lip/iris landmarks.
│     → skin_tone (Fitzpatrick + L*/a*/b* + hex), skin_undertone,
│       eye_color, hair_color (name + hex), hair_texture
│       (coarse, fallback), lip_color (shade + hex)
│
├─► [7] ObstructionAnalyzer (dima806/face_obstruction ViT-B/16)
│     → wearing_glasses, wearing_sunglasses, wearing_mask,
│       obstruction_scores (cropped input).
│
├─► [8] HairTypeAnalyzer (dima806/hair_type ViT-B/16)
│     → hair_type (curly/dreadlocks/kinky/straight/wavy),
│       hair_type_confidence (cropped input).
│
├─► [9] BeautyAnalyzer (ResNet-50 trained on SCUT-FBP5500)
│     Optional. Loads local weights or HF Hub; if absent, output
│     is None and AestheticAnalyzer falls back to rules.
│     → beauty_score (1.0–5.0), beauty_score_norm (0–100),
│       beauty_model_source.
│
└─► [10] AestheticAnalyzer (no model)
      Reads the merged dict from steps 1–9 and produces the
      final chopped_score (0–100) plus chopped_breakdown
      showing each factor's signed contribution.
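The "face crop from face_bbox + padding" step can be sketched as a small box computation. This is a minimal illustration, not the service's actual code: it assumes face_bbox is (x1, y1, x2, y2) in pixels, and both the function name padded_crop_box and the 25% padding fraction are hypothetical.

```python
from typing import Tuple


def padded_crop_box(
    bbox: Tuple[float, float, float, float],
    image_hw: Tuple[int, int],
    pad_frac: float = 0.25,  # assumption: the real padding value is not documented here
) -> Tuple[int, int, int, int]:
    """Expand an (x1, y1, x2, y2) box by pad_frac per side, clamped to the image."""
    x1, y1, x2, y2 = bbox
    height, width = image_hw
    pad_x = (x2 - x1) * pad_frac
    pad_y = (y2 - y1) * pad_frac
    left = max(0, int(round(x1 - pad_x)))
    top = max(0, int(round(y1 - pad_y)))
    right = min(width, int(round(x2 + pad_x)))
    bottom = min(height, int(round(y2 + pad_y)))
    return left, top, right, bottom


# A 200x240 face box inside a 640x480 (width x height) photo.
left, top, right, bottom = padded_crop_box((100, 80, 300, 320), image_hw=(480, 640))
# With an ndarray image, the crop would then be image[top:bottom, left:right].
```

Clamping matters for faces near the frame edge: without it the slice bounds could go negative and silently wrap around.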
Internal/scratch keys use a leading underscore (_skin_mask,
_hair_mask, _raw_landmarks, _insight_landmarks_2d). app.py
strips them before returning JSON.
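The merge-then-strip behaviour described above could look roughly like the sketch below. It is an assumption-laden illustration: run_pipeline, strip_scratch_keys, and the two stand-in analyzers are hypothetical names, and the real _run_pipeline in app.py may pass different arguments.

```python
from typing import Any, Callable, Dict, List

# Hypothetical analyzer shape: takes the merged dict so far, returns new keys.
Analyzer = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_pipeline(analyzers: List[Analyzer]) -> Dict[str, Any]:
    """Run analyzers in order; later outputs overwrite earlier keys on collision."""
    results: Dict[str, Any] = {}
    for analyze in analyzers:
        results.update(analyze(results))
    return results


def strip_scratch_keys(results: Dict[str, Any]) -> Dict[str, Any]:
    """Drop internal keys (leading underscore) before serialising to JSON."""
    return {key: value for key, value in results.items() if not key.startswith("_")}


def fake_parsing(results: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for ParsingAnalyzer: emits a scratch mask plus a public key.
    return {"_skin_mask": [[1, 0], [1, 1]], "hair_present": True}


def fake_color(results: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for ColorAnalyzer: reads the scratch mask left by the parser.
    has_mask = results.get("_skin_mask") is not None
    return {"skin_tone": "III" if has_mask else None}


merged = run_pipeline([fake_parsing, fake_color])
public = strip_scratch_keys(merged)
```

Keeping the masks in the shared dict lets later analyzers reuse them without recomputation, while the underscore convention keeps large arrays out of the JSON response.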
Attribute → source map
| Section | Field(s) | Source |
|---|---|---|
| Demographics | face_bbox, face_confidence, face_embedding (512-d), age_estimate (piecewise-calibrated), age_range | InsightFace buffalo_l |
| Demographics | gender, gender_confidence, gender_distribution | FairFace ViT |
| Demographics | ethnicity, ethnicity_confidence, ethnicity_distribution | cledoux42 ViT |
| Emotion | primary/secondary emotion, emotion_scores, valence, arousal, mood | HSEmotion EffNet-B0 |
| Face Structure | face_shape (+ 4 ratios), jawline_type/angle, chin_type, cheekbone_prominence, cheek_fullness, forehead_width, facial_asymmetry_score | MediaPipe Face Landmarker |
| Hair | hair_length, hair_present | SegFormer-B5 |
| Hair | hair_type (+ confidence) | HairTypeViT (dima806) |
| Hair | hair_color, hair hex | ColorAnalyzer |
| Eyes | eye_shape, eye_depth, eye_spacing, eye_size, eyes_open | MediaPipe |
| Eyes | eye_color | ColorAnalyzer |
| Eyebrows | eyebrow_shape, eyebrow_arch_height, eyebrow_thickness, possible_unibrow | MediaPipe |
| Nose | nose_shape, nose_bridge, nose_tip_shape, nostril_width | MediaPipe |
| Lips & Mouth | lip_fullness, lip_balance, mouth_width, cupids_bow, smile_asymmetry, possible_dimples, smiling, mouth_open | MediaPipe (last two via blendshapes) |
| Lips & Mouth | lip_color (shade + hex) | ColorAnalyzer (mask from MediaPipe) |
| Skin | skin_tone (Fitzpatrick, L*/a*/b*, hex), skin_undertone | ColorAnalyzer |
| Skin | wrinkle_level, skin_texture_score, skin_uniformity | SegFormer mask + OpenCV stats (freckles_or_moles is still computed server-side but no longer displayed because the detector was too noisy) |
| Accessories | wearing_glasses, wearing_sunglasses, wearing_mask | ObstructionViT (dima806) |
| Accessories | wearing_hat | SegFormer (hat class coverage) |
| Aesthetics | beauty_score (1–5), beauty_score_norm (0–100) | BeautyAnalyzer (SCUT-FBP5500 ResNet-50) |
| Aesthetics | chopped_score (0–100), chopped_breakdown | AestheticAnalyzer (rule + learned blend) |
Face matching
InsightFace's ArcFace head emits a 512-d L2-normalised recognition
embedding. We store it alongside each contact in
people.face_embedding (pgvector). On a new photo save, the client
queries Supabase for any contact with cosine similarity ≥ 0.55 to the
new embedding and prompts the user "this looks like {name}, add to
that profile?" before creating a new contact.
LFW accuracy is 99.83%; IJB-B at FAR=1e-4 is 96.21%. For grouping photos in a personal collection (similar lighting, same camera) this is excellent. Identical twins and close family members can still match, so a score at or above the 0.55 threshold only triggers the opt-in prompt and never auto-merges.
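The threshold check can be illustrated in memory with plain Python. This is only a sketch: in the service the comparison runs as a Supabase/pgvector query, and the names cosine_similarity, best_match, and the toy 3-d vectors below are hypothetical stand-ins for 512-d embeddings.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

MATCH_THRESHOLD = 0.55  # value taken from the text above


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; equals the dot product when both inputs are L2-normalised."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(
    query: Sequence[float],
    contacts: Dict[str, Sequence[float]],
    threshold: float = MATCH_THRESHOLD,
) -> Optional[Tuple[str, float]]:
    """Return the (name, score) of the most similar contact at or above threshold."""
    best: Optional[Tuple[str, float]] = None
    for name, embedding in contacts.items():
        score = cosine_similarity(query, embedding)
        if score >= threshold and (best is None or score > best[1]):
            best = (name, score)
    return best


contacts = {"Ada": [1.0, 0.0, 0.0], "Grace": [0.0, 1.0, 0.0]}
match = best_match([0.9, 0.1, 0.0], contacts)      # close to Ada's embedding
no_match = best_match([0.0, 0.0, 1.0], contacts)   # orthogonal to everyone
```

When translating this to pgvector, note that its cosine operator `<=>` returns a distance (1 minus similarity), so similarity ≥ 0.55 corresponds to distance ≤ 0.45.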
Training the beauty regressor
The live source is in training/beauty/. The training script
fine-tunes a timm ResNet-50 on SCUT-FBP5500. After training, drop the
resulting beauty_regressor.pt into face-service/models/ (or push
to HF Hub and set BEAUTY_HF_REPO_ID). BeautyAnalyzer picks it up
automatically on the next process boot.
Until weights exist, beauty_score returns None and the AestheticAnalyzer
gracefully falls back to a pure rule-based chopped score.
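One way to picture the weight lookup is the sketch below. It is an assumption about the behaviour, not BeautyAnalyzer's actual code: the function name resolve_beauty_weights is hypothetical, and so is the choice to prefer a local file over the Hub. Only the file name beauty_regressor.pt and the BEAUTY_HF_REPO_ID variable come from the text.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple


def resolve_beauty_weights(
    models_dir: Path,
    env: Mapping[str, str] = os.environ,
) -> Optional[Tuple[str, str]]:
    """Return (source, location) for the beauty weights, or None if unavailable.

    Assumed priority: local file first, then the HF Hub repo named in the env.
    """
    local_path = Path(models_dir) / "beauty_regressor.pt"
    if local_path.is_file():
        return ("local", str(local_path))
    repo_id = env.get("BEAUTY_HF_REPO_ID", "").strip()
    if repo_id:
        return ("hf_hub", repo_id)
    # No weights anywhere: the caller reports beauty_score as None and the
    # aesthetic aggregator uses its rule-based path.
    return None


source = resolve_beauty_weights(Path("face-service/models"), env={})
```

Returning None rather than raising keeps the optional model truly optional: the rest of the pipeline runs unchanged when no weights are present.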
Deployment
The service builds as a Docker image targeting Hugging Face Spaces
free tier (2 GB RAM, shared CPU). MediaPipe .task and the
InsightFace buffalo_l bundle are pulled at build time; all other
Hugging Face models lazy-download on first inference and cache under
/root/.cache/huggingface.
The Node/Express server forwards /analyze-face requests to
FACE_SERVICE_URL/analyze-base64. The React Native client never
talks to this service directly.
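For smoke-testing the service without going through the Node/Express server, a request to /analyze-base64 could be built as in the sketch below. The JSON field name "image", the helper name build_analyze_request, and the localhost URL are all assumptions; check app.py for the request schema the endpoint actually expects.

```python
import base64
import json
from urllib.request import Request


def build_analyze_request(image_bytes: bytes, base_url: str) -> Request:
    """Build (but do not send) a POST to <base_url>/analyze-base64."""
    # Assumption: the endpoint accepts JSON with a base64 string under "image".
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    return Request(
        url=base_url.rstrip("/") + "/analyze-base64",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL; in deployment this would be the value of FACE_SERVICE_URL.
request = build_analyze_request(b"abc", "http://localhost:7860/")
# To send it: urllib.request.urlopen(request, timeout=60)
```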
Adding a new analyzer
- Drop a new module under analyzers/ with a class exposing __init__() and analyze(...) -> dict.
- Import it and add a lazy-load block in app.py's get_analyzers().
- Add a results.update(...) call inside _run_pipeline at the right pipeline position.
- Surface the new keys in client/src/screens/EditProfileScreen.js and add a legend row.
Order matters: later analyzers overwrite earlier keys on collision. The aesthetic aggregator runs last so it can see everything.
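The steps above can be sketched with a toy analyzer. Everything named here is hypothetical: BrightnessAnalyzer, its output keys, the dark threshold, and the analyze(image) signature are illustrative choices, and this get_analyzers is a simplified stand-in for the one in app.py.

```python
from typing import Any, Dict, List


class BrightnessAnalyzer:
    """Hypothetical analyzer: mean pixel intensity of a grayscale image."""

    def __init__(self, dark_threshold: float = 60.0) -> None:
        # Real analyzers would load model weights here.
        self.dark_threshold = dark_threshold

    def analyze(self, image: List[List[float]]) -> Dict[str, Any]:
        pixels = [value for row in image for value in row]
        if not pixels:
            return {"brightness_mean": None, "is_dark": None}
        mean = sum(pixels) / len(pixels)
        return {"brightness_mean": mean, "is_dark": mean < self.dark_threshold}


_ANALYZERS: Dict[str, Any] = {}


def get_analyzers() -> Dict[str, Any]:
    """Lazy-load block: construct each analyzer once, on first use."""
    if "brightness" not in _ANALYZERS:
        _ANALYZERS["brightness"] = BrightnessAnalyzer()
    return _ANALYZERS


# Inside the pipeline, at the chosen position:
image = [[0, 255], [255, 0]]
results: Dict[str, Any] = {}
results.update(get_analyzers()["brightness"].analyze(image))
```

Choosing key names that no other analyzer emits avoids accidental overwrites, since the merge is a plain dict update.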