

Understanding marketing video performance with metadata using a semi-automated tagging platform


I interned as a Product Designer / Manager at Tezign, a content-tech startup using AI to empower future content marketing. I led the design of an AI-powered video tagging platform that generates video metadata for smart video marketing, working in an 8-member cross-functional team through launch. The platform successfully reduced the manual labor of video tagging and generated useful insights for YSL China 🎉

2021.02 - 2021.05 (3 months)


Project Manager

Product Designer

Design Strategy

Stakeholder Research

User Research


2 AI Developers

4 Software Developers

2 Designers

1 Product Manager (Mentor)





Background: Metadata of video content

Short-form video is a popular marketing tool, as it provides easily consumed, detail-rich content that potential customers increasingly engage with, enjoy, and share. Given the large demand for marketing videos, leading brands have realized the importance of understanding video content performance so they can optimize and produce video content intelligently.


However, leading brands still lack a scientific way to understand how marketing video content affects its performance, e.g., what kind of content attracts more of the target audience.


The foundation of future smart marketing is rich and valid metadata of video content.


Through discussions with the marketing team, we identified three major issues in video metadata tagging:


Labor Intensive

Due to the unstructured and semantic nature of video content, video metadata generation currently relies heavily on manual tagging, which is time-consuming and labor-intensive.



Hard to Manage

It is hard for content managers to manage the tags and annotators, or to check the quality of tagging. Under such circumstances, insights generated from inaccurate metadata may be misleading.


Unstructured Tags

Metadata for videos in a specific vertical domain should fit a specific content architecture so it can be analyzed to generate useful insights. However, content architects currently lack standard tagging structures.


To decode video content effectively and generate meaningful insights, we started with the following initiatives:

1. Design scientific video content tagging workflow and methodology. 
2. Design a toolchain to support video content tagging and metadata generation.
Reduction in Time Costs

Our tagging workflow and tool resulted in an estimated 80% decrease in the time cost of video metadata generation.

Proof of Concept

We collaborated with YSL China, using our metadata analysis methods and tools to generate insights for their video marketing strategies. We successfully generated 30+ insights from the data of 130 of their video feeds.

80% decrease in time costs for video metadata generation


130 YSL China video feeds were analyzed


30+ insights were generated and pitched to the YSL China Marketing team


Marketing Research

Background Research

Co-design Workshops

Problem & Goal Definition

Contextual Inquiries

Job to be Done

Problem Prioritization




Wireframe (Lo-fi) 

Design Critique


Hi-fi Prototype


Feature Prioritization

Roadmap & Milestones

Agile Development

Usability Testing

Design Iterations

Proof of Concept

Agile Dev & Test

Initial Research


Types of metadata

Our first question was: what types of metadata do we need to understand video performance?

By collaborating with marketing content architects, we divided the metadata needed for attribution analysis into 4 categories: Business Information (e.g., price, brand), Video Making (e.g., music, scenes), Media Channel (e.g., TikTok, official website), and Performance (e.g., CTR, ROI).
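As a concrete sketch, the four categories could be modeled as one record per video. This is purely illustrative, not the platform's actual schema; all field names and values below are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical data model for the four metadata categories.
# Field names are illustrative, not the platform's real schema.
@dataclass
class VideoMetadata:
    business: dict      # Business Information, e.g., brand, price
    video_making: dict  # Video Making, e.g., music, scenes
    media_channel: str  # Media Channel, e.g., "tiktok", "official_web"
    performance: dict   # Performance, e.g., CTR, ROI

video = VideoMetadata(
    business={"brand": "YSL"},
    video_making={"scene": "studio"},
    media_channel="tiktok",
    performance={"ctr": 0.042},
)
```

Keeping content tags (the first two categories) separate from channel and performance data is what later enables attribution analysis: content features are the inputs, performance metrics are the outcomes.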

Workflow & Methodology

Second, we researched the overall picture of video analysis and future possibilities, so that we could better understand the role of metadata, how and when to generate it, and how and when to use it.

After rounds of brainstorming and co-design with marketing content architects, content operation managers, data scientists, and engineers at Tezign, we co-designed a standardized workflow for metadata & insights generation: Data Collection - Data Processing - Tagging Structure & Analysis Methodology Design - Video Tagging - Analysis - Insight - Proof of Insight. (Steps related to content tagging are highlighted in the following diagram.)

Video Content Tagging Process

Third, I interviewed content architects to understand how they decode and tag video content.


Generally, they would first segment the video into temporal sequences, and then tag each segment using labels that are organized into a tree-structured taxonomy.


The tagging structure ensures that the metadata can generate meaningful, generalizable, and understandable insights.
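To make this concrete, here is a minimal sketch of the two ideas above: timed segments, and tags drawn from a tree-structured taxonomy. The taxonomy contents and tag names are invented for illustration, not the architects' actual structure.

```python
# Illustrative sketch: a video is split into timed segments, and each
# segment is tagged only with labels from a tree-structured taxonomy.
# The taxonomy below is a made-up example, not a real tagging structure.
label_tree = {
    "visual": {"scene": ["studio", "outdoor"], "person": ["model", "voiceover"]},
    "audio": {"music": ["upbeat", "calm"]},
}

def flatten(tree, prefix=""):
    """Flatten the label tree into 'branch/leaf' tag paths."""
    tags = []
    for key, value in tree.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            tags.extend(flatten(value, path + "/"))
        else:
            tags.extend(f"{path}/{leaf}" for leaf in value)
    return tags

valid_tags = set(flatten(label_tree))

# A segment: (start_sec, end_sec, tags); tags must come from the taxonomy.
segments = [(0.0, 4.2, ["visual/scene/studio", "audio/music/upbeat"])]
assert all(t in valid_tags for _, _, tags in segments for t in tags)
```

Restricting tags to a fixed tree is what makes the metadata comparable across videos and annotators, which is the prerequisite for the analysis step.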

content tagging.jpg
Initial Problem Discovery
Through the first round of marketing research and stakeholder research with marketing experts, data scientists, and product managers, we identified that the most pressing issue was the lack of available solutions for marketing content architects to create tagging structures, and for annotators to efficiently tag videos using those structures.

This raised two questions:
  • How can we help marketing content architects build robust tagging structures?
  • How can we help annotators efficiently generate valid metadata for video analysis?
User Research


User Interviews

I conducted contextual inquiries with marketing content architects and 3 experienced annotators to understand their current situation in depth, including their workflow and the tools they use. I also collected data on the time cost of each step in video tagging. Click to see a video of the current tagging process using Elan, Excel, and Python.

It turned out that the current tagging process was very labor-intensive and time-consuming: a single 30-second video took more than 1.5 hours to annotate using Elan. Given the large quantity of videos needed for insight generation and the time cost of data processing, video analysis was too costly to afford.

90 min +

Time cost for one person to annotate a 30-second video using Elan

100 +

Videos needed to generate reliable insights for video marketing

3.5 days +

Time cost for data cleaning, processing, and analysis
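A back-of-envelope calculation from the figures above shows why the old process was prohibitive. This assumes a single annotator working sequentially and is purely illustrative:

```python
# Rough cost estimate for one full analysis cycle, using the figures
# above (90 min per 30-second video, 100 videos, 3.5 days of processing).
# Assumes one annotator working sequentially; purely illustrative.
minutes_per_video = 90
videos_needed = 100
annotation_hours = minutes_per_video * videos_needed / 60  # 150.0 hours
annotation_days = annotation_hours / 8                     # 18.75 eight-hour days
processing_days = 3.5
total_days = annotation_days + processing_days             # 22.25 days
print(total_days)
```

Roughly a month of one person's work per insight cycle, before any re-tagging or iteration, which is what motivated automating the annotation step.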

Job To Be Done

To understand how to provide a desirable infrastructure for video tagging, I analyzed the workflow, the task requirements of each step, and users' emotional needs in video tagging, based on the user interviews.

jbd annotator.png
Pain Points

Through affinity mapping, I identified the most severe pain points in each stage. By discussing with engineers and the product team, we also prioritized the problems to solve in the MVP stage, which are highlighted in orange.

pain points.png
Key Takeaway
The dilemma in marketing video content tagging methodology is that the many repetitive, labor-intensive tagging tasks should ultimately be performed by machines; however, state-of-the-art AI still lacks the capability to comprehensively "understand" video content, necessitating human input.
Next Step
We began investigating how AI techniques could reduce human labor in the tagging process, which is also a valuable business investment: by generating large amounts of valid datasets, we can train AI models to effectively augment future video content analysis, optimization, and production.
Design Considerations

With the understanding of the major pain points and needs, I categorized the design considerations into 3 categories:


1. Support a simple and smooth tagging process while ensuring the validity of the metadata


2. Provide a smart tagging experience that reduces manual labor while preserving user flexibility


3. Clearly show the current status & progress of tagging tasks, and clearly present the tags

Collaborative User Workflow

Based on the design considerations, I designed the end-to-end collaborative workflow for content architects & data scientists and for annotators & reviewers. I designed features that address the design considerations above, highlighted with the corresponding serial numbers.

Collaborative Workflow.png


Co-designing AI-assisted Tagging Techniques

I organized a co-design workshop with 2 marketing content architects, 1 data scientist, and 2 AI developers to brainstorm possible technical solutions to facilitate tagging. 

Marketing content architects shared their previous tagging structure for LANCOME (a well-known cosmetics brand) marketing videos, which had been refined after tagging 30+ short-form videos using the existing workflow and tools.

Structured Tagging

Their tagging structure and method helped us understand what information should be tagged and how, and helped us envision which AI techniques could assist users and where.


We brainstormed techniques based on the current situation, and evaluated the feasibility & ROI of each technique.

AI technique - Frame 1 (1) (1).jpg
Tagging Workflow Redesign

Based on previous research, I designed the workflow of the new tagging system.


Before annotation begins, AI techniques, including scene segmentation based on visual and audio features, OCR-based and ASR-based tagging, and potentially other recognition models, automatically annotate the videos to reduce manual labor. People can then add new tags and modify existing ones; using the AI segmentations, they can easily locate the positions for tagging. After tagging is complete, the content metadata, together with performance data, can be used to generate insights through attribution analysis.
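The AI-then-human flow above can be sketched as a two-pass pipeline. The recognizer functions here are stand-ins for the real models (scene segmentation, OCR, ASR), and every name and value is hypothetical:

```python
# Hedged sketch of the pre-annotation flow: an AI pass proposes
# segments and tags, then a human pass reviews and corrects them.
# All functions below are stubs, not the platform's actual API.
def segment_scenes(video):
    """Stub: split a video into (start_sec, end_sec) scene spans."""
    return [(0, 5), (5, 12)]

def auto_tag(span):
    """Stub: OCR/ASR-style tags a model might propose for one span."""
    return {"ocr": ["limited offer"], "asr": ["new lipstick"]}

def pre_annotate(video):
    """AI pass: segment the video and propose tags for human review."""
    return [{"span": span, "tags": auto_tag(span), "reviewed": False}
            for span in segment_scenes(video)]

def apply_human_edits(annotations, edits):
    """Human pass: annotators correct or extend the proposed tags."""
    for idx, extra_tags in edits.items():
        annotations[idx]["tags"].update(extra_tags)
        annotations[idx]["reviewed"] = True
    return annotations

draft = pre_annotate("campaign.mp4")
final = apply_human_edits(draft, {0: {"manual": ["model close-up"]}})
```

The key design point is that the AI output is a draft, never the final metadata: every segment carries a review flag, so validity still rests with the human annotators.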

new workflow.png


Wireframes and Iterations

I created the wireframes and quickly iterated on the design through usability testing, expert interviews, and discussions with designers, developers, and the lead product manager. Scroll below to see the 4 iterations.

Key Iterations

Coming Soon!

Lo-Fi Design


Demo Video

As the product manager, I managed the design, iterative testing, and development of the content tagging system. I made this video to give you a better idea of our final outcome:

Artboard (1).png
Tezign Smart Video Decoder
Final Design


YSL Content Insight Report

As a proof of concept for future smart video content platforms, which use AI techniques to intelligently decode video performance with metadata, manage video content, and optimize or generate video content, our team helped YSL China analyze 100+ of their marketing videos and delivered a report. After annotating with the Video Decoder Tool, we successfully generated 10,000+ tags and a tagging structure specially designed for similar short-form marketing videos. More importantly, we pitched 30+ content insights gained through metadata analysis to YSL China.


100+ YSL videos + performance data


10,000+ tags generated


30+ content insights








Proof of Concept


⛽️ Next Step

To enhance the quality of metadata, we are committed to expanding the tagging structure to cover more types of marketing videos. We will also delve deeper into analysis methods to generate reliable, valuable insights, and continue developing the toolchain for content analysis to support efficient, intelligent content management, tagging, and analysis. By doing so, we aim to enhance the future optimization and production of marketing videos.

🧠 ​Always Try to Think from A Higher Level

As a young startup, Tezign provided me with significant flexibility to explore diverse areas of interest such as research, design, and product management. I was fortunate to work alongside supportive and encouraging colleagues. The crucial lesson I learned was to approach problems from a higher level, question why we are focusing on a specific issue, and clearly define the problem before devising potential solutions. Additionally, I learned the importance of finding a simple way to test the viability of the solution.


Overall, this experience allowed me to develop a more comprehensive understanding of product strategies, task prioritization, and roadmapping. There is always an opportunity to expand our perspectives and gain a better understanding of the entire picture.

🤔 What Makes Good Videos?  - Wait. How Do We Define "Good"?

Although I find the content insights regarding which tags and sequences make videos "good" impressive, I remain skeptical of the current way we assess video quality. We currently rely on common, quantified, business-related metrics such as conversion rate, click-through rate, and ROI. However, these metrics fail to fully capture the essence of video quality. Moreover, video insights driven solely by commercial interests may lead to homogenized content that fails to positively impact users. For instance, we all know that good-looking models attract viewers and stimulate purchases, and that viewers tend to lose patience with videos longer than three minutes. But if social media platforms become flooded with videos optimized for such findings, users may end up making unnecessary purchases and finding the content tedious and shallow.


I am concerned about how smart technologies shape consumer attitudes, daily life, and spiritual life. While I firmly believe in the principles of "design for good" and "tech for good," it is crucial to first define what constitutes "good." I believe that designing effective metrics poses the biggest challenge for smart marketing.
