Half a Million Data Points in 4 Weeks. Learn Why Media Monitors Trusted Takt for Its Annotation Needs

What does Media Monitors do?

Media Monitors is the nation’s leading local monitoring company, serving the media and advertising industries with near real-time intelligence on broadcast TV, cable, radio, print and digital using an AI-powered engine.


What was the Annotation Task?

Classify podcast transcriptions into different categories of ads based on their context, type and specificity. Ad types included Direct Response, Live Broadcast, Promotion, etc. The labeled data served as the training dataset for Media Monitors’ ad relevance and targeting AI.

Task Highlights

Complexity: Low
Accuracy: 99%
Volume: 504,000 data points
Price: $0.04 per data point
Task Type: Classification (5 categories)

"After running trials with a number of companies, Takt was by far the most accurate and reasonably priced for our needs. The team understood the level of detail necessary for >99% accuracy and we were able to virtually eliminate false-positives. I personally appreciated the crisp communication via Slack, allowing us to tackle uncertainties as they arose, ensuring high accuracy."
- Sacha Spitzer, Podcast Product Manager at Media Monitors

Problem

  • Complexity too high to crowdsource - Media Monitors had previously tried Amazon Mechanical Turk and other crowdsourcing platforms, which failed even with 4 majority votes. The labeling guidelines needed constant collaboration, and the operational workflow the task required could not be implemented through crowdsourcing.
  • Noisy data - The podcast transcriptions were unclear and required contextual understanding. For example, “.org” appeared as “doug” in the transcription but still needed to be marked as part of an ad. Hosts and guests with varying speaking styles produced a multitude of transcription errors.
  • Edge cases - The task required distinguishing promotions from advertisements. For example, if the podcast host is promoting his or her own show, it is not an ad. However, if it is the guest’s show that is being promoted, it is an ad. This was hard to judge from the transcription alone.
  • High Volume, Short Timeline - Needed 500,000 data points labeled in 4 weeks, including the hiring, training and setup period.
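
For context, majority voting is a common way to combine labels from several crowd workers. The sketch below is illustrative only: the source does not describe how the crowdsourced votes were aggregated, so the function name majority_label and the label strings are hypothetical.

```python
from collections import Counter


def majority_label(votes):
    """Return the label held by a strict majority of votes, or None if there is none.

    Hypothetical helper: the case study does not specify the aggregation rule.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    # Require more than half of the votes; ties therefore yield None.
    return label if count > len(votes) / 2 else None


# Example with 4 votes per item (label names are assumed for illustration).
clear_case = majority_label(["ad", "ad", "promotion", "ad"])
tied_case = majority_label(["ad", "ad", "promotion", "promotion"])
```

With an even number of votes, a tie like the second example leaves the item unresolved, which is one way noisy transcriptions and edge cases can defeat simple vote aggregation.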

Solution 

  • Takt hired full-time annotators with experience in marketing and sales who understood the context and application of the dataset. Dataset fit tests conducted during hiring ensured they understood the varied podcast lingo.
  • Concurrent training sessions, in which the transcription was reviewed alongside the podcast audio, helped annotators gain clarity on the dataset. Guidelines that broke the classification categories into easily understandable subcategories reduced the decision-making load on annotators.
  • Daily review processes ensured that errors did not propagate and that one annotator’s mistake became a lesson for every annotator. Takt reviewed 100% of the output in the 1st week, 75% in the 2nd week, 50% in the 3rd week and 25% in the 4th week.
  • Feedback loops let annotators flag issues and doubts during annotation rather than pick a label while uncertain.
  • Takt hired 12 annotators with high dataset fit within the first week and trained each annotator in a 4-day period of intense feedback and review. Daily output reached approximately 30,000 lines in the 3rd week, which ensured timely completion of the task.
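
The tapering review schedule described above can be sketched as a simple sampling step. This is a minimal illustration, not Takt's actual tooling: the source gives only the weekly review percentages, so the random selection of items, the function name select_for_review and the seed parameter are all assumptions.

```python
import random

# Weekly review rates as stated in the case study: 100%, 75%, 50%, 25%.
REVIEW_RATE_BY_WEEK = {1: 1.00, 2: 0.75, 3: 0.50, 4: 0.25}


def select_for_review(item_ids, week, seed=0):
    """Pick a random subset of annotated items to review for the given week.

    Assumption: items are chosen uniformly at random; the source does not
    say how the reviewed share was selected.
    """
    items = list(item_ids)
    sample_size = round(len(items) * REVIEW_RATE_BY_WEEK[week])
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    return sorted(rng.sample(items, sample_size))


# Example: from a batch of 1,000 annotated lines in week 3, half are reviewed.
week3_review = select_for_review(range(1000), week=3)
```

Reviewing everything in the first week catches guideline misunderstandings early, while the lower rates in later weeks free reviewer time once annotators are consistent.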

Learn about Media Monitors' experience and why they continue to partner with Takt