Annotation KPIs
Benchmark Study

See the Videos in the Demo Environment

Sign up to discover our tools and experience our automated annotation pipeline and R.PRO demos. Contact us to discuss your needs, or schedule a deep dive with us.

Key Facts

  • 157h to annotate 1h of egocentric video with Ramblr’s AI-powered annotation pipeline
  • Quality metrics: 0.94 IOU and 0.90 F1 score achieved
  • 97% of 109,370 frames annotated automatically, resulting in an avg. annotation time of 5.2s per frame
  • Diverse dataset with 14 different scene types
  • Open vocabulary with 529 unique object instances
  • Accurate mask and category annotations for all context-relevant objects on every frame

Test Datasets

  • Collection devices: Meta Quest 3, GoPro, Vuzix Blade, Project Aria
  • Datasets: Collected by Ramblr, Ego4D, Aria Pilot Dataset

  • Total: 1h video duration, 529 unique objects
  • Max per video: 5min video duration, 10 unique objects

Video Duration / Scene Type (% of total dataset)

  • Cooking: 23%
  • Sports: 18%
  • Office work: 12%
  • Crafting: 10%
  • Social interaction: 9%
  • Pet care: 7%
  • Household activity: 4%
  • Arts: 3%
  • Eating: 3%
  • Studying: 3%
  • Personal hygiene: 3%
  • Playing: 2%
  • Lab work: 1%
  • Flight training: <1%

Annotation Time

1 Hour of Egocentric Video

  • Annotation time: 111.0h (70.9%)
  • Review and correction: 26.3h (16.8%)
  • GT annotation for QC: 19.2h (12.3%)

  • Avg. annotation time per manually annotated frame: 2.5min
  • Avg. annotation time per single frame: 5.2sec

  • Video frames: 109,370
  • Auto-annotated frames: 105,568 (97%)
  • Manually annotated frames: 3,802 (3%)
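The time and frame figures above are consistent with simple arithmetic. The sketch below reproduces them under one assumption that the page does not state explicitly: both per-frame averages divide the total effort across all three phases (about 156.5h, reported as 157h) by the respective frame count. The variable and function names are illustrative only.

```python
# Figures copied from this page. The formulas are an assumption about how
# the published averages were derived, not a documented method.
PHASE_HOURS = {
    "Annotation time": 111.0,
    "Review and correction": 26.3,
    "GT annotation for QC": 19.2,
}
TOTAL_FRAMES = 109_370
MANUAL_FRAMES = 3_802


def phase_shares(hours):
    """Return each phase's share of the total effort in percent (1 decimal)."""
    total = sum(hours.values())
    return {name: round(100 * h / total, 1) for name, h in hours.items()}


total_hours = sum(PHASE_HOURS.values())
shares = phase_shares(PHASE_HOURS)

# Total effort spread over every frame, in seconds.
sec_per_frame = total_hours * 3600 / TOTAL_FRAMES
# Total effort spread over the manually annotated frames only, in minutes.
min_per_manual_frame = total_hours * 60 / MANUAL_FRAMES
# Share of frames that needed no manual annotation, in percent.
auto_share = 100 * (TOTAL_FRAMES - MANUAL_FRAMES) / TOTAL_FRAMES

print(f"total effort: {total_hours:.1f} h")
print(f"phase shares: {shares}")
print(f"per frame: {sec_per_frame:.1f} s")
print(f"per manually annotated frame: {min_per_manual_frame:.1f} min")
print(f"auto-annotated: {auto_share:.0f} %")
```

Under this reading, the 5.2s per frame is an amortized figure: manual effort on 3% of the frames is averaged over the whole video.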

Quality

Average IOU and F1 Score for Dataset

  • IOU: 0.94
  • F1 score: 0.90
  • Ground truth frames for quality control: 1,125 (1%)
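For readers unfamiliar with the two metrics, here is a minimal sketch of their standard definitions. The page does not say at which level they are computed or how they are averaged, so the sketch assumes IOU is measured between a predicted and a ground truth binary mask, and F1 is derived from true positive, false positive, and false negative counts. The toy masks and counts are made up for illustration.

```python
def mask_iou(pred, truth):
    """Intersection over union of two equally sized binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of raw counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 1.0


# Toy example: the prediction covers one pixel more than the ground truth.
pred = [[1, 1, 0],
        [1, 1, 0],
        [0, 0, 0]]
truth = [[1, 1, 0],
         [1, 0, 0],
         [0, 0, 0]]

print(mask_iou(pred, truth))         # 3 shared pixels / 4 pixels in the union
print(f1_score(tp=9, fp=1, fn=1))    # 18 / 20
```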

Annotations

  • Temporally consistent segmentation of object instances
  • Multi-object tracking for all context-relevant objects
  • AI-assisted annotation guidance to minimize subjective interpretation

Annotate Context-Relevant Objects

Open vocabulary

All objects the ego interacts with. Ramblr’s AI models detect hand-object interactions and gaze signals to provide annotation guidance.

Closed vocabulary

Objects always considered relevant, e.g., mobile phones and laptops. Ramblr’s custom-trained detection model provides annotation guidance.
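The two vocabularies can be read as a simple relevance rule. The sketch below is purely illustrative and does not reflect Ramblr's implementation: the `Candidate` class, its fields, and the `guidance_reason` function are hypothetical, the closed vocabulary contains only the two examples named above, and treating a gaze signal alone as sufficient is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical closed vocabulary: only the two examples given on this page.
CLOSED_VOCABULARY = {"mobile phone", "laptop"}


@dataclass
class Candidate:
    """A hypothetical object candidate proposed for annotation."""
    label: str
    hand_interaction: bool = False  # assumed output of a hand-object interaction model
    gazed_at: bool = False          # assumed output of a gaze signal


def guidance_reason(candidate: Candidate) -> Optional[str]:
    """Return why a candidate would be suggested for annotation, or None if it would not."""
    # Closed vocabulary: always relevant, regardless of interaction.
    if candidate.label in CLOSED_VOCABULARY:
        return "closed vocabulary"
    # Open vocabulary: any object the camera wearer interacts with.
    if candidate.hand_interaction or candidate.gazed_at:
        return "open vocabulary"
    return None


for candidate in (
    Candidate("laptop"),
    Candidate("whisk", hand_interaction=True),
    Candidate("poster"),
):
    print(candidate.label, "->", guidance_reason(candidate))
```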