How would you describe Breaking Bad?

Would you call it a "dark thriller"? An "intense drama"? Would you say it's a "study in character development" or represents an "ethical quandary"?

The way catalog content is labeled, and how specific we're able to get in our descriptions, has a profound impact on the effectiveness of our recommender systems, and recommender systems have a huge impact on user engagement and conversions.

Some of our eCommerce customers report upwards of 60% of their total conversions coming from their recommender carousels alone. As a further point, Amazon claims that 35% of its almost $170Bn revenue comes from its recommender system.

A study on Netflix's metadata enrichment process

Netflix openly demonstrates their metadata enrichment through informative "tags" they add to their shows.

From the picture above, you can see that Netflix is already storing and exposing metadata such as the release date, rating ("TV-MA"), the broad categorization ("Drama"), actors/actresses, and creator(s).

Also, note this section farther down the page, where they have additional metadata for Breaking Bad:

Pay particular attention to the "This show is..." section, above.

In addition to the standard genre classifications, Netflix also describes Breaking Bad as "Violent, Gritty, and Dark."

Metadata tags

Netflix uses these descriptive tags throughout their catalog of over 6,000 movies and shows, and these tags play a big role in helping users find their next binge:

“Grey’s Anatomy” is “soapy” and “emotional.” “Emily in Paris” is “campy” and “quirky.” “Our Planet II” is “relaxing” and “captivating,” while “Gravity” is “suspenseful” and “visually striking.”

How big of an impact do these metadata tags actually make?

“Each time we've removed the tags as an experiment, engagement plummeted,” Eunice Kim, Netflix's CPO, told The New York Times.

There's no question: these tags are a pillar of Netflix's success.

How to create and maintain metadata enrichment tags

Having enriched metadata allows you to improve your search results, content recommender systems, and more, but creating this metadata can be a lot of work.

In Netflix's case, they report that they maintain a team of 30 dedicated “taggers” who manage their library of over 3,000 tags. In addition to tagging new content, the team periodically reviews the current tags to ensure they’re still effective, exciting, and not redundant.

At Aampe, we enable companies to use marketing messaging to enrich their internal metadata

This example is from a food delivery app we work with that has an audience of over 50 million users.

The app has thousands of restaurant and cuisine options, and tagging each of them individually would be incredibly tedious and time-consuming.

Instead of creating all of these tags manually, we take a three-step approach to metadata enrichment:

1. Load content library into Aampe

Aampe connects directly to an app's CMS system(s). We can ingest millions of items and properties.

2. Write and label 'message wrappers' around the content

Note in the example below on the left, our customer incorporated CRM data and CMS content ("[[first_name]]" and "{{item_cuisine_name}}," respectively) into the message with liquid tags.

The Aampe system then allows marketing users to write and tag message variations featuring descriptive labels (below, on the right):

Notice that "Effortless and delicious" appeals to a label of "Convenience," while "Treat yourself to a tasty delight" focuses on "Indulgence."

There is no limit on the number of tags or variations that can be included.
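To make the idea of a labeled message wrapper concrete, here is a minimal Python sketch. It is not Aampe's implementation: the `MessageVariant` class, the `render` helper, the sample copy, and the sample CRM/CMS values are all hypothetical, and the only things taken from the example above are the two placeholder styles and the two labels.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MessageVariant:
    label: str     # descriptive tag the copy appeals to, e.g. "Convenience"
    template: str  # copy with [[crm_field]] and {{cms_field}} placeholders


def render(variant: MessageVariant, crm: dict, cms: dict) -> str:
    """Fill [[...]] placeholders from CRM data and {{...}} placeholders from CMS content."""
    text = re.sub(r"\[\[(\w+)\]\]", lambda m: crm[m.group(1)], variant.template)
    return re.sub(r"\{\{(\w+)\}\}", lambda m: cms[m.group(1)], text)


# Hypothetical copy built around the two phrases quoted above.
variants = [
    MessageVariant(
        "Convenience",
        "Hi [[first_name]], effortless and delicious {{item_cuisine_name}} is a tap away.",
    ),
    MessageVariant(
        "Indulgence",
        "[[first_name]], treat yourself to a tasty delight: {{item_cuisine_name}} tonight?",
    ),
]

crm = {"first_name": "Asha"}            # made-up CRM record
cms = {"item_cuisine_name": "Biryani"}  # made-up CMS item

messages = {v.label: render(v, crm, cms) for v in variants}
for label, text in messages.items():
    print(f"{label}: {text}")
```

Keeping the label on the variant rather than on the content item is the point of the design: the same catalog item can be sent under many labels, and the responses show which label fits it.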

3. Collect user responses to messages

Aampe uses clustering and a reinforcement learning/multi-armed bandit (MAB) infrastructure to send these messages to users and measure their responses (not just in clicks, but also in downstream metrics like add-to-carts and conversions/purchases).

This methodology not only learns individual user preferences but also associates the app's content with various descriptive labels autonomously, minimizing the need for user intervention.
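For readers unfamiliar with multi-armed bandits, the sketch below shows one common variant, Beta-Bernoulli Thompson sampling, choosing among message labels. This is an illustration of the general technique only, not a description of Aampe's system; the `LabelBandit` class, the simulated response rates, and the seeds are assumptions made for the demo.

```python
import random


class LabelBandit:
    """Beta-Bernoulli Thompson sampling over message labels (illustrative only)."""

    def __init__(self, labels, seed=0):
        self.rng = random.Random(seed)
        # [alpha, beta] per label, starting from a uniform Beta(1, 1) prior.
        self.params = {label: [1.0, 1.0] for label in labels}

    def choose(self):
        # Draw a plausible conversion rate for each label, then send the best draw.
        draws = {
            label: self.rng.betavariate(a, b) for label, (a, b) in self.params.items()
        }
        return max(draws, key=draws.get)

    def update(self, label, converted):
        # alpha (index 0) counts conversions, beta (index 1) counts non-conversions.
        self.params[label][0 if converted else 1] += 1.0


# Made-up "true" response rates, hidden from the bandit and used only to simulate users.
true_rates = {"Convenience": 0.05, "Indulgence": 0.30, "Affordability": 0.10}

bandit = LabelBandit(true_rates, seed=42)
sim = random.Random(7)
sends = {label: 0 for label in true_rates}

for _ in range(2000):
    label = bandit.choose()
    sends[label] += 1
    bandit.update(label, sim.random() < true_rates[label])

print(sends)  # most sends should concentrate on the best-performing label
```

The bandit keeps exploring weaker labels occasionally, which is what lets it notice if preferences shift, while spending most sends on the label that is currently converting best.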

See below where the system has associated and scored different menu items and restaurants with different labels:

Notice that the restaurant "Raj Cabin" is most closely associated with the label "Taste," while "ChicKing" was associated more strongly with "Affordability."

"Parotta" (an Indian flatbread) is most closely associated with feelings of "Gratification," while "Hotel Orion" is most sought after for its "Variety."

Enriching your existing metadata with this user-generated nuance allows you to do everything from improving the relevance of your search results to increasing the effectiveness of your recommender system.
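One simple way item-to-label scores like the ones described above could be derived is by aggregating message responses per (item, label) pair. The sketch below assumes a smoothed conversion rate as the score; the response log is invented for illustration, and `label_scores` is a hypothetical helper, not Aampe's scoring method.

```python
from collections import defaultdict

# Hypothetical response log: (item, label used in the message, did the user convert?)
responses = [
    ("Raj Cabin", "Taste", True),
    ("Raj Cabin", "Taste", True),
    ("Raj Cabin", "Taste", False),
    ("Raj Cabin", "Affordability", False),
    ("ChicKing", "Affordability", True),
    ("ChicKing", "Affordability", True),
    ("ChicKing", "Affordability", False),
    ("ChicKing", "Taste", True),
    ("ChicKing", "Taste", False),
]


def label_scores(log, prior_success=1.0, prior_failure=1.0):
    """Smoothed conversion rate per (item, label); the priors damp tiny samples."""
    counts = defaultdict(lambda: [0, 0])  # (item, label) -> [conversions, sends]
    for item, label, converted in log:
        counts[(item, label)][0] += int(converted)
        counts[(item, label)][1] += 1
    scores = defaultdict(dict)
    for (item, label), (wins, sends) in counts.items():
        scores[item][label] = (wins + prior_success) / (
            sends + prior_success + prior_failure
        )
    return scores


scores = label_scores(responses)
# The strongest label per item becomes a candidate metadata tag.
top_label = {item: max(by_label, key=by_label.get) for item, by_label in scores.items()}
print(top_label)
```

The resulting `top_label` mapping (or the full score table) is the kind of enriched metadata that could then be written back to the catalog for search and recommendations.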

The results

Using this metadata to enhance the in-app experience, together with Aampe's machine-learning-optimized message personalization, enabled the app to sustain over 90% of its purchase volume while halving message sends. That produced substantial savings, particularly because these were WhatsApp messages.

Not only was this a cost saving, but the app also saw an improvement in user retention from sending fewer messages overall.

(These outcomes align with our typical performance at Aampe. Here's another instance where we aided an app in maintaining its conversion volume while decreasing SMS sends by over 75%.)
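To put the food delivery result in per-message terms: keeping 90% of purchases on half the sends means each message is doing roughly 1.8 times the work. The short calculation below shows the arithmetic; the baseline send and purchase counts are placeholders, and only the 90% and 50% ratios come from the result described above.

```python
# Placeholder baseline figures; only the two ratios come from the case study.
baseline_sends = 1_000_000
baseline_purchases = 20_000

new_sends = baseline_sends * 0.5          # message sends were halved
new_purchases = baseline_purchases * 0.9  # about 90% of purchase volume was kept

baseline_rate = baseline_purchases / baseline_sends
new_rate = new_purchases / new_sends
efficiency_gain = new_rate / baseline_rate  # purchases per message, relative to baseline

print(f"Purchases per message improved by about {efficiency_gain:.1f}x")
```

The ratio does not depend on the placeholder baseline: any starting volume gives the same 0.9 / 0.5 multiple.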

Back to Breaking Bad

Using this same methodology, instead of using a team of manual "taggers" who have to create and maintain these tags themselves, Netflix could launch a series of messages (SMS, Push Notifications, WhatsApp, In-App, Emails, etc.) advertising their new shows with a variety of labels or adjectives.

Using Aampe's Reinforcement Learning system, each user would receive the shows and language the model determines will be most appealing to them.

In addition to user-level optimization, Aampe's system would determine which adjectives/metadata tags were most correlated with "power users": not only which tags drive the most clicks, but also which correlate most strongly with conversions.

(For example, labeling Breaking Bad as a comedy might drive clicks out of curiosity, but the model would quickly learn that users who enter under that assumption don't continue to watch the show, presumably because of mismatched expectations, so it would deprioritize that particular show/label combination.)
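The comedy example comes down to how the reward is defined. The sketch below assumes a reward that weights continued viewing far more heavily than the click itself; the outcome counts, the `click_weight` value, and the `mean_reward` helper are all hypothetical and chosen only to show how a click-heavy label can lose once downstream behavior is counted.

```python
# Hypothetical outcomes per label for one show:
# (messages sent, clicks, clicks that led to continued viewing)
outcomes = {
    "Comedy": (1000, 200, 10),  # curiosity clicks, few people keep watching
    "Gritty": (1000, 120, 90),  # fewer clicks, most clickers keep watching
}


def mean_reward(sends, clicks, kept_watching, click_weight=0.1):
    """Average reward per send; continued viewing dominates the click itself."""
    return (click_weight * clicks + (1.0 - click_weight) * kept_watching) / sends


# Ranking by click-through rate alone versus by the downstream-weighted reward.
best_by_clicks = max(outcomes, key=lambda label: outcomes[label][1] / outcomes[label][0])
best_by_reward = max(outcomes, key=lambda label: mean_reward(*outcomes[label]))

print("Best by clicks:", best_by_clicks)
print("Best by reward:", best_by_reward)
```

Under these made-up numbers the two rankings disagree, which is the situation where a model optimizing the downstream reward would deprioritize the misleading label.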

This methodology would result in increased conversions, better product performance, and auto-metadata enrichment.
