

Understanding marketing video performance with metadata using a semi-automated tagging platform


I interned as a Product Designer / Manager at Tezign, a content-tech startup using AI to empower future content marketing. I led the design of an AI-powered video tagging platform that generates video metadata for smart video marketing, working in an 8-member cross-functional team through launch. The platform successfully reduced the manual labor of video tagging and generated useful insights for YSL China 🎉

2021.02 - 2021.05 (3 months)


Project Manager

Product Designer

Design Strategy

Stakeholder Research

User Research


2 AI Developers

4 Software Developers

2 Designers

1 Product Manager (Mentor)





Background: Metadata of video content

Short-form video is a popular marketing tool, as it provides easily consumed, detail-rich content that potential customers increasingly engage with, enjoy, and share. Given the large demand for marketing videos, leading brands have realized the importance of understanding video content performance so they can optimize and produce video content intelligently.


However, leading brands still lack a scientific way to understand how marketing video content affects its performance, e.g., what kind of content attracts more of the target audience.


The foundation of future smart marketing is rich and valid metadata of video content.


Through discussions with the marketing team, we identified three major issues in video metadata tagging:


Labor Intensive

Due to the unstructured and semantic nature of video content, video metadata generation currently relies heavily on manual tagging, which is time-consuming and labor-intensive.



Hard to Manage

It is hard for content managers to manage the tags and annotators, or to check the quality of tagging. Under such circumstances, insights generated from inaccurate metadata may be misleading.


Unstructured Tags

Metadata for videos in a specific vertical domain should fit a specific content architecture so it can be analyzed to generate useful insights. However, content architects currently lack standard tagging structures.


To decode video content effectively and generate meaningful insights, we started with the following initiatives:

1. Design scientific video content tagging workflow and methodology. 
2. Design a toolchain to support video content tagging and metadata generation.
Reduction in Time Costs

Our tagging workflow and tool resulted in an estimated 80% decrease in the time cost of video metadata generation.

Proof of Concept

We collaborated with YSL China, using our metadata analysis methods and tools to generate insights for their video marketing strategies. We successfully generated 30+ insights from the data of 130 of their video feeds.

80% decrease in time costs for video metadata generation


130 YSL China video feeds were analyzed


30+ insights were generated and pitched to the YSL China Marketing team


Marketing Research

Background Research

Co-design Workshops

Problem & Goal Definition

Contextual Inquiries

Job to be Done

Problem Prioritization




Wireframe (Lo-fi) 

Design Critique


Hi-fi Prototype


Feature Prioritization

Roadmap & Milestones

Agile Development

Usability Testing

Design Iterations

Proof of Concept

Agile Dev & Test

Initial Research


Types of metadata

Our first question was: what types of metadata do we need to understand video performance?

By collaborating with marketing content architects, we divided the metadata needed for attribution analysis into 4 categories: Business Information (e.g., price, brand), Video Making (e.g., music, scenes), Media Channel (e.g., TikTok, official website), and Performance (e.g., CTR, ROI).
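As a concrete sketch, the four categories could be modeled as one record per video. This is purely illustrative, not the platform's actual schema; all field names and values below are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical data model for the four metadata categories.
# Field names are illustrative, not the platform's real schema.
@dataclass
class VideoMetadata:
    business: dict      # Business Information, e.g., brand, price
    video_making: dict  # Video Making, e.g., music, scenes
    media_channel: str  # Media Channel, e.g., "tiktok", "official_web"
    performance: dict   # Performance, e.g., CTR, ROI

video = VideoMetadata(
    business={"brand": "YSL"},
    video_making={"scene": "studio"},
    media_channel="tiktok",
    performance={"ctr": 0.042},
)
```

Keeping content tags (the first two categories) separate from channel and performance data is what later enables attribution analysis: content features are the inputs, performance metrics are the outcomes.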

Workflow & Methodology

Second, we researched the overall picture of video analysis and future possibilities, so that we could better understand the role of metadata, how and when to generate it, and how and when to use it.

After rounds of brainstorming and co-design with marketing content architects, content operation managers, data scientists, and engineers at Tezign, we co-designed a standardized workflow for metadata & insights generation: Data Collection - Data Processing - Tagging Structure & Analysis Methodology Design - Video Tagging - Analysis - Insight - Proof of Insight. (Steps related to content tagging are highlighted in the following diagram.)

Video Content Tagging Process

Third, I interviewed content architects to understand how they decode and tag video content.


Generally, they would first segment the video into temporal sequences, and then tag each segment using labels that are organized into a tree-structured taxonomy.


The tagging structure ensures that the metadata can generate meaningful, generalizable, and understandable insights.
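To make this concrete, here is a minimal sketch of the two ideas above: timed segments, and tags drawn from a tree-structured taxonomy. The taxonomy contents and tag names are invented for illustration, not the architects' actual structure.

```python
# Illustrative sketch: a video is split into timed segments, and each
# segment is tagged only with labels from a tree-structured taxonomy.
# The taxonomy below is a made-up example, not a real tagging structure.
label_tree = {
    "visual": {"scene": ["studio", "outdoor"], "person": ["model", "voiceover"]},
    "audio": {"music": ["upbeat", "calm"]},
}

def flatten(tree, prefix=""):
    """Flatten the label tree into 'branch/leaf' tag paths."""
    tags = []
    for key, value in tree.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            tags.extend(flatten(value, path + "/"))
        else:
            tags.extend(f"{path}/{leaf}" for leaf in value)
    return tags

valid_tags = set(flatten(label_tree))

# A segment: (start_sec, end_sec, tags); tags must come from the taxonomy.
segments = [(0.0, 4.2, ["visual/scene/studio", "audio/music/upbeat"])]
assert all(t in valid_tags for _, _, tags in segments for t in tags)
```

Restricting tags to a fixed tree is what makes the metadata comparable across videos and annotators, which is the prerequisite for the analysis step.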

content tagging.jpg
Initial Problem Discovery
Through the first round of marketing research and stakeholder research with marketing experts, data scientists, and product managers, we identified that the most pressing issue was the lack of available solutions for marketing content architects to create tagging structures, and for annotators to efficiently tag videos using those structures.

This raised two questions:
  • How can we help marketing content architects build robust tagging structures?
  • How can we help annotators efficiently generate valid metadata for video analysis?
User Research


User Interviews

I conducted contextual inquiries with marketing content architects and 3 experienced annotators to understand their current situation in depth, including their workflow and the tools they use. I also collected data on the time cost of each step in video tagging. Click to see a video of the current tagging process using Elan, Excel, and Python.

It turned out that the current tagging process was very labor-intensive and time-consuming: a single 30-second video took more than 1.5 hours to annotate using Elan. Given the large quantity of videos needed for insight generation and the time cost of data processing, video analysis was too costly to afford.

90 min +

Time cost for one person to annotate a 30-second video using Elan

100 +

Videos needed to generate reliable insights for video marketing

3.5 days +

Time cost for data cleaning, processing, and analysis
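A back-of-envelope calculation from the figures above shows why the old process was prohibitive. This assumes a single annotator working sequentially and is purely illustrative:

```python
# Rough cost estimate for one full analysis cycle, using the figures
# above (90 min per 30-second video, 100 videos, 3.5 days of processing).
# Assumes one annotator working sequentially; purely illustrative.
minutes_per_video = 90
videos_needed = 100
annotation_hours = minutes_per_video * videos_needed / 60  # 150.0 hours
annotation_days = annotation_hours / 8                     # 18.75 eight-hour days
processing_days = 3.5
total_days = annotation_days + processing_days             # 22.25 days
print(total_days)
```

Roughly a month of one person's work per insight cycle, before any re-tagging or iteration, which is what motivated automating the annotation step.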

Job To Be Done

To understand how to provide a desirable infrastructure for video tagging, I analyzed the workflow, the task requirements of each step, and users' emotional needs in video tagging, based on the user interviews.

jbd annotator.png
Pain Points

Through affinity mapping, I identified the most severe pain points in each stage. By discussing with engineers and the product team, we also prioritized the problems to solve in the MVP stage, which are highlighted in orange.

pain points.png
Key Takeaway
The dilemma in marketing video content tagging methodology is that the many repetitive, labor-intensive tagging tasks should ultimately be performed by machines; however, state-of-the-art AI still lacks the capability to comprehensively "understand" video content, necessitating human input.
Next Step
We began investigating how AI techniques could reduce human labor in the tagging process, which is also a valuable business investment: by generating large amounts of valid datasets, we can train AI models to effectively augment future video content analysis, optimization, and production.
Design Considerations

With the understanding of the major pain points and needs, I categorized the design considerations into 3 categories:


1. Support a simple and smooth tagging process while ensuring the validity of the metadata


2. Provide a smart tagging experience that reduces manual labor while preserving user flexibility


3. Clearly show the current status & progress of tagging tasks, and clearly present the tags

Collaborative User Workflow

Based on the design considerations, I designed the end-to-end collaborative workflow for content architects & data scientists and for annotators & reviewers. I designed features that address the design considerations above, highlighted with the corresponding serial numbers.

Collaborative Workflow.png


Co-designing AI-assisted Tagging Techniques

I organized a co-design workshop with 2 marketing content architects, 1 data scientist, and 2 AI developers to brainstorm possible technical solutions to facilitate tagging. 

Marketing content architects shared their previous tagging structure for LANCOME (a well-known cosmetics brand) marketing videos, which had been refined after tagging 30+ short-form videos using the existing workflow and tools.

Structured Tagging

Their tagging structure and method helped us understand what information should be tagged and how, and helped us envision which AI techniques could assist users and where.


We brainstormed techniques based on the current situation, and evaluated the feasibility & ROI of each technique.

AI technique - Frame 1 (1) (1).jpg
Tagging Workflow Redesign

Based on previous research, I designed the workflow of the new tagging system.


Before annotation begins, AI techniques, including scene segmentation based on visual and audio features, OCR-based and ASR-based tagging, and potentially other recognition models, automatically annotate the videos to reduce manual labor. People can then add new tags and modify existing ones; using the AI segmentations, they can easily locate the positions for tagging. After tagging is complete, the content metadata, together with performance data, can be used to generate insights through attribution analysis.
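The AI-then-human flow above can be sketched as a two-pass pipeline. The recognizer functions here are stand-ins for the real models (scene segmentation, OCR, ASR), and every name and value is hypothetical:

```python
# Hedged sketch of the pre-annotation flow: an AI pass proposes
# segments and tags, then a human pass reviews and corrects them.
# All functions below are stubs, not the platform's actual API.
def segment_scenes(video):
    """Stub: split a video into (start_sec, end_sec) scene spans."""
    return [(0, 5), (5, 12)]

def auto_tag(span):
    """Stub: OCR/ASR-style tags a model might propose for one span."""
    return {"ocr": ["limited offer"], "asr": ["new lipstick"]}

def pre_annotate(video):
    """AI pass: segment the video and propose tags for human review."""
    return [{"span": span, "tags": auto_tag(span), "reviewed": False}
            for span in segment_scenes(video)]

def apply_human_edits(annotations, edits):
    """Human pass: annotators correct or extend the proposed tags."""
    for idx, extra_tags in edits.items():
        annotations[idx]["tags"].update(extra_tags)
        annotations[idx]["reviewed"] = True
    return annotations

draft = pre_annotate("campaign.mp4")
final = apply_human_edits(draft, {0: {"manual": ["model close-up"]}})
```

The key design point is that the AI output is a draft, never the final metadata: every segment carries a review flag, so validity still rests with the human annotators.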

new workflow.png


Wireframes and Iterations

I created the wireframes and quickly iterated on the design through usability testing, expert interviews, and discussions with designers, developers, and the lead product manager. Scroll below to see the 4 iterations.

Key Iterations

Coming Soon!

Lo-Fi Design


Demo Video

As the product manager, I managed the design, iterative testing, and development of the content tagging system. I made this video to give you a better idea of our final outcome:

Artboard (1).png
Tezign Smart Video Decoder
Final Design


YSL Content Insight Report

As a proof of concept for future smart video content platforms, which use AI techniques to intelligently decode video performance with metadata, manage video content, and optimize or generate video content, our team helped YSL China analyze 100+ of their marketing videos and delivered a report. After annotating with the Video Decoder Tool, we successfully generated 10,000+ tags and a tagging structure specially designed for similar short-form marketing videos. More importantly, we pitched 30+ content insights gained through metadata analysis to YSL China.


100+ YSL videos + performance data


10,000+ tags generated


30+ content insights








Proof of Concept


⛽️ Next Step

To enhance the quality of metadata, we are committed to expanding the tagging structure to cover more types of marketing videos. We will also delve deeper into analysis methods to generate reliable, valuable insights, and continue developing the toolchain for content analysis to support efficient, intelligent content management, tagging, and analysis. By doing so, we aim to enhance the future optimization and production of marketing videos.

🧠 ​Always Try to Think from A Higher Level

As a young startup, Tezign provided me with significant flexibility to explore diverse areas of interest such as research, design, and product management. I was fortunate to work alongside supportive and encouraging colleagues. The crucial lesson I learned was to approach problems from a higher level, question why we are focusing on a specific issue, and clearly define the problem before devising potential solutions. Additionally, I learned the importance of finding a simple way to test the viability of the solution.


Overall, this experience allowed me to develop a more comprehensive understanding of product strategies, task prioritization, and roadmapping. There is always an opportunity to expand our perspectives and gain a better understanding of the entire picture.

🤔 What Makes Good Videos?  - Wait. How Do We Define "Good"?

Although I find the content insights regarding which tags and sequences make videos "good" impressive, I remain skeptical of the current way we assess video quality. We currently rely on common, quantified, business-related metrics such as conversion rate, click-through rate, and ROI. However, these metrics fail to fully capture the essence of video quality. Moreover, video insights driven solely by commercial interests may lead to homogenized content that fails to positively impact users. For instance, we all know that good-looking models attract viewers and stimulate purchases, and that viewers tend to lose patience with videos longer than three minutes. But if social media platforms become flooded with videos optimized for such findings, users may end up making unnecessary purchases and finding the content tedious and shallow.


I am concerned about how smart technologies shape consumer attitudes, daily life, and spiritual life. While I firmly believe in the principles of "design for good" and "tech for good," it is crucial to first define what constitutes "good." I believe that designing effective metrics poses the biggest challenge for smart marketing.
