Why this matters
On a large catalog, the home screen is the product — far more viewing starts from what the system surfaces than from what a viewer goes looking for, and Netflix has reported that its recommendations influence around 80% of the hours its members watch, leaving roughly 20% to search (Gomez-Uribe & Hunt, 2015). As the block's opening article argues, discovery is what decides retention, and merchandising is the surface where discovery actually meets the viewer's eyes. This article is for the founder, product manager, or streaming CTO who has to decide how much of the home screen to personalize, where to draw the line between algorithm and editor, and what artwork and ordering machinery is worth building. You will not write the ranking models by hand, but you do have to specify the behavior, choose build-vs-buy, and understand why a beautiful catalog presented through a generic, one-size-fits-all home screen quietly loses viewers every night. Merchandising sits directly downstream of the recommendation models and the metadata that feeds them — it is where their output becomes a screen a person looks at.
The home screen is a recommendation
Start with the idea that reframes everything that follows: the home screen of a streaming service is not a static storefront that everyone sees the same way. It is generated, per viewer, every time it loads — a recommendation in the shape of a page. The word for this craft is merchandising: the art of arranging what you have to sell so the right item catches the right shopper at the right moment. A physical store does it with shelf placement, end-caps, and window displays; a streaming service does it with rows, ordering, and artwork, except that the streaming store rebuilds itself from scratch for each person who walks in.
The stakes are set by a hard limit on attention. Netflix's own research found that a typical member who does not find something to watch in roughly 60 to 90 seconds is at risk of giving up the search and leaving — for a game, a book, or another app (Gomez-Uribe & Hunt, 2015). That short window is the whole game of merchandising. You are not trying to show the viewer everything; you are trying to put something they will actually start playing in front of them before the window closes. A catalog of fifty thousand titles is worthless if the dozen the viewer sees first are wrong for them.
A useful analogy: think of the home screen as a newspaper that lays itself out differently for each reader. The same day's stories exist for everyone, but the editor — here, an algorithm — decides which go above the fold, which get a big photo, and which are buried on page nine, based on what this reader tends to care about. The reader feels served; the work was done in the layout, not the writing. Streaming merchandising is that layout decision, made millions of times a night.
The four layers of a personalized home screen
Before the technology, get the anatomy clear, because most teams under-build by treating the home screen as one decision when it is really four stacked on top of each other. Each layer is personalized independently, and each can be done well or badly on its own.
The first layer is row selection — choosing which rows appear at all from a large pool of candidate rows. A row is a horizontal strip of titles with a theme: "Continue Watching," "Trending Now," "Critically Acclaimed Workplace Comedies," "Because You Watched Inception." The candidate pool is far larger than the screen can hold, so the system picks a subset.
The second layer is row ordering — the top-to-bottom sequence of the rows it picked. The row a viewer is most likely to act on goes near the top; the rest descend in rough order of expected usefulness. This matters more than it sounds, for a reason we will return to: viewers pay far more attention to the top of the page than the bottom.
The third layer is within-row ranking — the left-to-right order of titles inside each row. The strongest pick for that viewer, in that row, starts on the left, where the eye lands first, and relevance falls off to the right (Netflix Help Center, 2026).
The fourth layer sits on top of the other three: artwork selection — the single image shown for each title. Because one title can have many candidate images, the system can choose a different image for different viewers, so the show that appears in your row and mine is the same title wearing a different face.
Figure 1. The four layers of a personalized home screen. Three of them order things (which rows, the top-to-bottom order of rows, the left-to-right order of titles inside each row); the fourth chooses the single image each title wears. All four are personalized independently, and a generic home screen is one that builds none of them per viewer.
Keep these four layers in mind as separate decisions. A common failure is to nail one — say, a smart "Because You Watched" row — while leaving the others generic, so the page still feels off. Strong merchandising personalizes all four.
Layers 1 and 2: choosing and ordering the rows (page generation)
The first two layers — which rows to show and in what order — are handled together by what Netflix calls page generation, and the history of how it was built tells you why it matters. Before 2015, Netflix used a rule-based approach: a template defined what kind of row could appear in each vertical slot, the same skeleton for everyone, with only the titles inside personalized (Alvino & Basilico, 2015). The page was a fixed form with personalized blanks. The move that followed — the one most serious platforms have since copied — was to make the choice and order of the rows themselves personalized and learned from data, not fixed by a template (Alvino & Basilico, 2015; Gomez-Uribe & Hunt, 2015).
The page-generation algorithm balances two goals that pull against each other. The first is relevance: every row should be one this viewer is likely to engage with. Relevance is easy to over-optimize, and that is the trap. If you rank rows by relevance alone, you get a page that is ten variations on the viewer's single biggest interest (ten flavors of crime drama for the crime-drama fan), and the viewer's smaller tastes vanish. The second goal, diversity, exists to prevent exactly that: the rows should span the viewer's range of interests, not just hammer the top one.
The clean way to think about diversity here comes from a Netflix paper called Calibrated Recommendations (Steck, 2018). The idea is simple and worth internalizing: if a viewer's history is about 70% romance and 30% action, a calibrated home screen should be roughly 70% romance and 30% action too — it should reflect the proportions of someone's tastes, not let the majority interest crowd the minority one off the page entirely (Steck, 2018). A purely accuracy-chasing recommender tends to drop the 30% because piling on more romance scores marginally higher on a naive metric; calibration deliberately holds space for the minor interest. Steck's paper also gives the practical mechanism most teams use: a re-ranking step that takes the raw, relevance-sorted list and reshuffles it to restore the right proportions (Steck, 2018).
Figure 2. Page generation balances two forces. A large pool of candidate rows is selected and ordered by relevance — rows the viewer will engage with — and by diversity, calibrated to the proportions of the viewer's tastes, then a re-ranking step restores those proportions before the page is shown. Relevance alone collapses the page into clones of one taste; calibration holds space for the minor interests.
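The re-ranking idea can be sketched in a few lines. This is a minimal illustration in the spirit of Steck (2018), not Netflix's production code: it greedily builds a list that trades total relevance against the KL divergence between the viewer's taste proportions and the genre mix of the list. The one-genre-per-item simplification, the field names `genre` and `score`, and the weight `lam` and smoothing `alpha` values are all assumptions made for the example.

```python
import math


def kl_divergence(target, actual, alpha=0.01):
    """KL(target || actual); `actual` is smoothed toward `target` so it is never zero."""
    total = 0.0
    for genre, p in target.items():
        if p == 0:
            continue
        q = (1 - alpha) * actual.get(genre, 0.0) + alpha * p
        total += p * math.log(p / q)
    return total


def genre_mix(items):
    """Share of each genre among the chosen items (one genre per item, a simplification)."""
    counts = {}
    for item in items:
        counts[item["genre"]] = counts.get(item["genre"], 0) + 1
    return {genre: n / len(items) for genre, n in counts.items()}


def calibrated_rerank(candidates, target, k, lam=0.99):
    """Greedily pick k items, maximizing (1 - lam) * relevance - lam * miscalibration."""
    chosen = []
    remaining = list(candidates)
    while remaining and len(chosen) < k:
        def objective(item):
            trial = chosen + [item]
            relevance = sum(i["score"] for i in trial)
            return (1 - lam) * relevance - lam * kl_divergence(target, genre_mix(trial))

        best = max(remaining, key=objective)
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

With `lam=0.0` the function degenerates to a plain relevance sort, which is the "ten clones" failure described above. Given a 70% romance / 30% action target and ten made-up candidates of each genre where every romance title outscores every action title, the default weight should return seven romance and three action titles; how strongly to weight calibration in practice is a tuning decision.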
It helps to know the common row archetypes, because they are the building blocks of the candidate pool, and they mix different logics:
| Row type | What it personalizes | Logic behind it |
|---|---|---|
| Continue Watching | Order by likelihood you resume | Per-viewer; ranks unfinished titles by resume probability |
| Top 10 (in your country) | Position of the row, not its contents | Popularity-based, country-wide; the list is the same for everyone in a country, but where it sits is personalized (About Netflix, 2020) |
| Trending Now | Short-term popularity + some personalization | Blends what is spiking with what fits you |
| Because You Watched X | Titles similar to one you finished | Per-viewer; seeded by a specific title you engaged with |
| Genre / theme rows | Which genres, and their order | Per-viewer; drawn from your watch history and calibrated for range |
Table 1. Common home-screen row archetypes and what each one actually personalizes. Note that not everything on a personalized page is personalized to your taste — the Top 10 row shows the same country-wide list to everyone, and only its placement moves per viewer. Mixing popularity, freshness, and personal history is what makes a page feel both relevant and alive.
The "Top 10" row is worth a second look because it is a clean example of the editorial-vs-algorithmic blend we will get to. When Netflix launched it in February 2020, the contents were deliberately not personalized — it shows the most popular titles in your whole country, with a distinctive badge — yet the row's position on your page still moves based on how relevant it is to you (About Netflix, 2020). One row, two logics: a non-personalized list placed by a personalized rule.
Layer 3: ordering titles within a row, and the tyranny of position
The third layer — the left-to-right order inside a row — looks minor and is not, because of a well-documented quirk of how people look at lists. Attention is not spread evenly across the screen. Viewers look hard at the top-left and trail off quickly going right and going down; positions further from the start get dramatically fewer eyes regardless of what sits there. This is called position bias, and it has been measured carefully in the search-and-ranking literature: in the "cascade" model of how people scan a ranked list, a user examines items from the top, and the chance of even looking at an item falls sharply with its position (Craswell et al., 2008).
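To see how quickly attention decays under the cascade model, here is a small sketch. It assumes the simplest form of the model: the viewer scans from the first slot, stops at the first title that attracts them, and each title has a known probability of attracting them. The attractiveness values are hypothetical inputs, not measured data.

```python
def cascade_examination(attractiveness):
    """Probability that each slot is examined at all under a simple cascade model.

    attractiveness[i] is the (assumed) chance the viewer picks the title in
    slot i once they have looked at it. A slot is examined only if every
    earlier slot was examined and passed over.
    """
    probabilities = []
    still_scanning = 1.0
    for attract in attractiveness:
        probabilities.append(still_scanning)
        still_scanning *= 1.0 - attract
    return probabilities
```

If every title has a 50% chance of being picked when seen, `cascade_examination([0.5, 0.5, 0.5])` returns `[1.0, 0.5, 0.25]`: the third slot is examined only a quarter of the time, whatever sits in it.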
Position bias has two consequences for merchandising, one obvious and one subtle. The obvious one: ordering is high-stakes, because the title in slot one gets vastly more attention than the title in slot eight, so putting the right title first is most of the value of the row. The subtle one is a trap. Because top positions get more clicks purely for being on top, a naive system that learns "this title gets lots of clicks, promote it further" will mistake position for quality and create a rich-get-richer feedback loop — the title that happened to start on the left gets clicked, gets promoted, gets clicked more, while a better title two slots over never gets the chance. Mature systems correct for position bias when they learn from clicks, discounting a click on slot one because slot one would have gotten clicks anyway (Craswell et al., 2008). If you take one engineering caution from this article, make it this: never train a merchandising model on raw clicks without accounting for where the thing was shown.
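One common way to account for position is inverse propensity weighting: each click is divided by the probability that its slot was examined, so a click earned in a rarely seen slot counts for more. The sketch below is an illustration of that general technique, not the specific models compared in Craswell et al. (2008). The log format `(title, slot, clicked)` and the `examine_prob` table are assumptions; in practice the examination probabilities have to be estimated, for example from randomized placements.

```python
from collections import defaultdict


def debiased_click_rates(impressions, examine_prob):
    """Position-corrected click rate per title.

    impressions: iterable of (title, slot, clicked) tuples (hypothetical log format).
    examine_prob: dict mapping slot -> assumed probability the slot is looked at.
    """
    weighted_clicks = defaultdict(float)
    shown = defaultdict(int)
    for title, slot, clicked in impressions:
        shown[title] += 1
        if clicked:
            # A click in a low-visibility slot is up-weighted.
            weighted_clicks[title] += 1.0 / examine_prob[slot]
    return {title: weighted_clicks[title] / shown[title] for title in shown}
```

For example, a title with 20 clicks from 100 impressions in a slot that is always examined scores 0.20, while a title with only 8 clicks from 100 impressions in a slot examined a quarter of the time scores 0.32. The raw counts favor the first title; the corrected rates favor the second.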
Artwork and thumbnails: the same title, a different face
Now the fourth layer, and the one that most surprises people: the single image shown for a title — its artwork or thumbnail — does not have to be the same for everyone. A title can carry many candidate images, and the system can pick the one most likely to appeal to this viewer.
Why spend effort here at all? Because the image is the first and sometimes only thing a browsing viewer processes about a title. In the 60-to-90-second window, a viewer skims dozens of titles, and for each one the artwork is doing the persuading before a single word of synopsis is read. The same show can be sold honestly in very different ways: a romantic subplot foregrounded for the viewer who loves romance, an action beat for the viewer who loves action, a specific actor for the viewer who has watched that actor before. The title is identical; the framing meets the viewer where they are.
The mechanics, the way Netflix has described them, run like this (Netflix Technology Blog, 2017). First, a set of candidate images is produced for each title — different moments, different characters, different moods — sometimes dozens per title. Second, for each viewer in each context, the system chooses which candidate to show. "Context" here means the signals available at that moment: the viewer's history, the device, sometimes the time of day or the row the title sits in (Netflix Technology Blog, 2017). Crucially, the problem is far smaller than full recommendation: instead of choosing among tens of thousands of titles, the system is choosing among a handful of images for a title it has already decided to show — which is exactly what makes the next idea, the contextual bandit, practical.
Figure 3. How per-viewer artwork works. One title carries many candidate images. For each viewer and context, a contextual-bandit policy picks the image most likely to earn a watch, mostly exploiting what it has learned but occasionally exploring an untried image, and every outcome feeds back to improve the next pick. The lift is largest for lesser-known titles, where the right image does the most work.
Two honest cautions belong here, because artwork is where merchandising can quietly go wrong. The first is clickbait: an image that wins the click but oversells the title trains viewers to distrust your artwork and inflates a vanity metric while real engagement falls. The fix is to optimize artwork against did they watch and keep watching, not did they click — the same watch-time-over-clicks discipline that governs recommendations. The second is attribution: it is genuinely hard to prove how much a given image change caused, because the same viewer is reacting to the image, the row, the position, and the title all at once, and a play cannot be cleanly credited to the picture alone (Netflix Technology Blog, 2017). Netflix has noted that its artwork lift is largest for lesser-known titles — a famous show gets watched whatever its thumbnail, while a hidden gem lives or dies by the image (Netflix Technology Blog, 2017). That is where to spend the effort.
How the candidate images themselves are produced — the computer-vision models that scan footage for well-composed, expressive frames — is machine-learning work that belongs to a different section; the model internals live in the AI for Video Engineering section, and we stay at the product layer here. What a streaming team needs to decide is the behavior: how many candidate images per title, what contexts to personalize on, and how to measure success without fooling yourself.
Contextual bandits: why merchandising does not wait for an A/B test
The artwork problem exposes a deeper idea that runs through all of merchandising: how the system learns which choice is best. The obvious method is the classic A/B test — show image A to half the audience and image B to the other half for a few weeks, then keep the winner. It works, but it is slow and wasteful: for the whole test you are knowingly showing the losing image to half your viewers, and you only learn one comparison at a time.
Netflix's answer for artwork is a method called a contextual bandit, and the name comes from the "one-armed bandit," a slot machine (Netflix Technology Blog, 2017). Picture a row of slot machines where each pull pays off at a different, unknown rate; you want to win the most over many pulls, so you face a constant trade-off between exploiting the machine that has paid best so far and exploring the others in case one is secretly better. A bandit algorithm balances that trade-off automatically. The "contextual" part means the best choice depends on the situation — the right image depends on who is looking — so the algorithm learns a policy that maps context to choice rather than picking one global winner (Li et al., 2010, who introduced this approach for personalized content with the widely used LinUCB algorithm).
The payoff over an A/B test is that a bandit learns continuously and per context instead of in slow, global batches. It mostly shows the image it currently believes is best for each kind of viewer (exploit), but now and then tries an untested one to keep learning (explore), and it folds every outcome back in immediately (Netflix Technology Blog, 2017; Li et al., 2010). It both serves the best-known choice today and improves it tomorrow, with far less audience spent on losers. The same machinery generalizes beyond artwork to any merchandising choice with a modest set of options and a measurable outcome. Proving that a bigger change — a new row algorithm, a new page layout — actually helps still calls for careful online experimentation, which is its own discipline covered in the article on A/B testing and experimentation for streaming.
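The explore/exploit loop can be sketched with a small version of the LinUCB algorithm from Li et al. (2010), using one linear model per candidate image. This is an illustration of the technique, not a description of how Netflix implements artwork selection. The class and function names, the context encoding, and the `alpha` exploration weight are assumptions made for the example, and the reward should be a watch signal rather than a click, as discussed earlier.

```python
import math


class LinUCBArm:
    """One candidate image: a ridge-regression estimate of reward given context."""

    def __init__(self, dim):
        # Inverse of A (starts as the identity) and the reward-weighted context sum b.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def _a_inv_times(self, vec):
        return [sum(row[j] * vec[j] for j in range(len(vec))) for row in self.a_inv]

    def score(self, x, alpha):
        """Predicted reward plus an uncertainty bonus (the upper confidence bound)."""
        theta = self._a_inv_times(self.b)
        a_inv_x = self._a_inv_times(x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        variance = max(0.0, sum(xi * v for xi, v in zip(x, a_inv_x)))
        return mean + alpha * math.sqrt(variance)

    def update(self, x, reward):
        """Fold one observed outcome in, using the Sherman-Morrison rank-one update."""
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        for i in range(n):
            self.b[i] += reward * x[i]


def choose_image(arms, x, alpha=1.0):
    """Pick the image with the highest upper confidence bound for this context."""
    return max(arms, key=lambda name: arms[name].score(x, alpha))
```

A serving loop would call `choose_image(arms, context)` for each impression and then `arms[image].update(context, reward)` once the outcome is known. A larger `alpha` spends more impressions on exploration; setting it to zero makes the policy purely greedy.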
The editorial-versus-algorithmic balance
It is tempting to conclude that merchandising should be fully automated. It should not, and the reason is that an algorithm optimizes what it can measure, while a streaming business has goals and constraints it cannot hand to a model. The right design is a partnership: editors set the boundaries and the algorithm fills and orders within them.
Consider what only a human can reasonably own. A new release the company has paid heavily for needs guaranteed prominence on day one, before the algorithm has any data on how it performs: a cold start the model cannot solve alone. Brand and PR sensitivity (not placing a violent thriller next to a children's title, pulling a title whose star is in the news) is judgment, not a metric. Licensing and windowing constrain what can even appear: a title available only in some countries, or only until month's end, must be merchandised around its rights, a subject covered in the licensing block. And editorial point of view, such as a curated "Staff Picks" row or a themed collection for a cultural moment, is part of some brands' identity. Hulu and HBO Max have leaned more on human curation as a deliberate difference from Netflix's heavier algorithmic approach (Arts Management & Technology Lab, 2021).
The practical model that works is to let editors define the candidate pool and the rules — which titles are eligible, which rows are mandatory, what must be promoted or must never be adjacent — and let the algorithm choose and order within those rules. The editor draws the fences; the algorithm grazes freely inside them. This keeps the scale and personalization of automation while preserving the business control, legal compliance, and brand voice that automation cannot provide.
| Decision | Better owned by the algorithm | Better owned by editors |
|---|---|---|
| Which titles fit this viewer's taste | ✓ at scale, per viewer | |
| Ordering rows and titles by predicted engagement | ✓ continuously, per session | |
| Guaranteeing a paid new release its launch prominence | | ✓ no data exists yet (cold start) |
| Brand / PR-sensitive placement and adjacency | | ✓ judgment, not a metric |
| Honoring licensing windows and territory rights | | ✓ a hard legal constraint |
| Curated editorial collections and brand voice | | ✓ part of the product's identity |
Table 2. Who should own which merchandising decision. The algorithm wins on scale and per-viewer fit; editors win on cold-start launches, brand and legal judgment, and curation. Strong platforms do not choose one — editors set the rules and the candidate pool, and the algorithm personalizes within them.
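The "editors draw the fences, the algorithm works inside them" split can be sketched as a filter-then-rank step. This is a minimal illustration under stated assumptions: the `Title` type, the `build_row` function, and its parameters are hypothetical names, the scores stand in for whatever the ranking model produces, and real rule sets (adjacency rules, date windows) would be richer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Title:
    title_id: str
    regions: frozenset  # territories where the title is licensed (editorial/legal input)


def build_row(titles, scores, region, pinned_ids, blocked_ids, size):
    """Apply editorial rules first, then let model scores order what remains.

    pinned_ids: titles editors guarantee a leading slot (for example a launch).
    blocked_ids: titles editors have pulled for brand or PR reasons.
    scores: per-title relevance from the ranking model (assumed given).
    """
    # Hard constraints: licensing territory and editorial blocks.
    eligible = [
        t for t in titles
        if region in t.regions and t.title_id not in blocked_ids
    ]
    # Editorial pins go first, in the order editors listed them.
    pinned = [t for pid in pinned_ids for t in eligible if t.title_id == pid]
    pinned_set = {t.title_id for t in pinned}
    # Everything else is ordered by the model.
    ranked = sorted(
        (t for t in eligible if t.title_id not in pinned_set),
        key=lambda t: scores.get(t.title_id, 0.0),
        reverse=True,
    )
    return [t.title_id for t in (pinned + ranked)[:size]]
```

The design choice worth noting is the order of operations: constraints are applied before ranking, so the model can never promote a title the rules exclude, and a pinned launch title appears even when its score is low because no data exists yet.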
A worked example: the cost of a generic home screen
Make the stakes concrete with arithmetic, because "personalize the home screen" is easy to defer until you see what skipping it costs. Take a service with 2 million active viewers, each opening the app on average 30 times a month, for 60 million browsing sessions a month.
Anchor on the attention window. A meaningful share of sessions end with the viewer never starting anything — they browse, nothing grabs them inside the 60-to-90-second window, and they leave. Suppose a generic, weakly-merchandised home screen produces a browse-abandonment rate (the fraction of sessions that end with no play) of 25%. That is 60,000,000 × 0.25 = 15,000,000 abandoned sessions a month — fifteen million times someone opened your app, looked, and left with nothing.
Now suppose better merchandising — personalized rows, calibrated diversity so minor tastes are not buried, per-viewer artwork on lesser-known titles, the right title in slot one — cuts that abandonment rate by a quarter, from 25% to 18.75%. That is a 60,000,000 × (0.25 − 0.1875) = 3,750,000 reduction: nearly four million sessions a month that now end in a play instead of a shrug. The arithmetic is deliberately rough, but the shape is the point — abandonment is multiplied by your entire session volume, so a few points off it is an enormous number of saved sessions, and saved sessions are the leading indicator of the retention that decides whether subscribers churn. Merchandising is not screen decoration; it is a retention lever with a measurable hand on the business.
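The same arithmetic as a small calculation, so the inputs can be swapped for your own numbers. The function name is hypothetical, and the 25% baseline and one-quarter improvement are the illustrative assumptions from the example above, not benchmarks.

```python
def abandoned_sessions(viewers, opens_per_month, abandonment_rate):
    """Monthly sessions that end with no play."""
    return viewers * opens_per_month * abandonment_rate


sessions = 2_000_000 * 30                                   # 60,000,000 sessions a month
baseline = abandoned_sessions(2_000_000, 30, 0.25)          # 15,000,000 abandoned
improved = abandoned_sessions(2_000_000, 30, 0.25 * 0.75)   # rate cut by a quarter: 18.75%
saved = baseline - improved                                 # 3,750,000 sessions saved
```

Because the abandonment rate is multiplied by total session volume, the saved-session count scales linearly with both audience size and visit frequency.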
A common mistake: optimizing the home screen for clicks
The most expensive merchandising mistake is also the most natural: optimize every layer for the click. Clicks are easy to measure and feel like success, so teams tune rows, ordering, and especially artwork to maximize them — and quietly wreck the product. Clickbait artwork wins the tap and loses the trust when the title does not match the promise. Ordering tuned for clicks promotes whatever is sensational over whatever the viewer will actually finish. The fix is to optimize for watch time and retention, not clicks — did the viewer start and keep watching — which is the same discipline that governs recommendations, because a click that leads to a thirty-second bounce is a failure dressed as a win.
Three more traps recur. The first is the one home screen for everyone — building rows and ordering once, globally, so the page is identical for a horror fan and a documentary lover; this is the failure of not building the four layers per viewer at all. The second is letting the algorithm crowd out minor tastes — ranking by raw relevance until the page is ten clones of the viewer's top genre, the exact failure calibration exists to prevent (Steck, 2018). The third is learning from raw clicks without correcting for position — promoting whatever sat in slot one because it got the clicks slot one always gets, the rich-get-richer loop (Craswell et al., 2008). Each mistake has the same root: treating a measurable proxy as the goal instead of the watch the viewer actually wanted.
Where Fora Soft fits in
Personalized merchandising is a scale problem before it is a design problem: a home screen rebuilt per viewer, across phone, web, and a remote-driven television, in the fraction of a second before the page paints, multiplied by every session every night. Across 625+ shipped projects for 400+ clients since 2005 in video streaming, OTT/Internet TV, e-learning, and video surveillance, the recurring pattern we build is the full merchandising stack — page generation that selects and orders rows with relevance and calibrated diversity, within-row ranking that corrects for position bias, per-viewer artwork selection with the explore/exploit loop that learns continuously, and the editorial-control layer that lets your team guarantee launches, honor licensing windows, and keep brand-sensitive titles apart. Our approach is scalability-first and vendor-neutral: we start from your catalog size, session volume, and device mix, decide where a hosted personalization service is enough and where a custom home-page pipeline earns its cost, and wire merchandising into the same recommendation, metadata, and analytics systems so the storefront, the catalog, and the data behave as one product on every screen.
What to read next
- Recommendation Systems for Video
- A/B Testing and Experimentation for Streaming
- The Personalization Data Pipeline
Call to action
- Talk to a streaming engineer — book a 30-minute scoping call to talk through your personalized merchandising plan.
- See our case studies — 250+ shipped projects across video streaming, WebRTC, OTT, telemedicine, e-learning, surveillance, and AR/VR.
- Download the Personalized Merchandising Readiness Checklist — One Page — The four layers of a personalized home screen, the relevance-vs-diversity (calibration) balance, position bias, artwork and contextual bandits, the editorial-vs-algorithmic split, and the metrics to track — on a single sheet.
References
- The Netflix Recommender System: Algorithms, Business Value, and Innovation. Gomez-Uribe, C. A. & Hunt, N. ACM Transactions on Management Information Systems, 6(4), Article 13, 2015. DOI 10.1145/2843948. Tier 1 (peer-reviewed first-party engineering). Source of: recommendations influencing ~80% of streamed hours (search ~20%); the ~60–90-second window before a member gives up the search; the page-generation framing of the home page as personalized row selection and ordering; the relevance-and-diversity goals of page construction; the reported ~$1B/year value of personalization. https://dl.acm.org/doi/10.1145/2843948 — accessed 2026-06-18.
- Calibrated Recommendations. Steck, H. Proceedings of the 12th ACM Conference on Recommender Systems (RecSys '18): 154–162, 2018. DOI 10.1145/3240323.3240372. Tier 1 (peer-reviewed primary literature, Netflix). Source of: calibration — a recommendation list should reflect the proportions of a user's interests (the 70% romance / 30% action example); the tendency of accuracy-only recommenders to crowd out a user's minor interests; the post-processing re-ranking algorithm that restores calibrated proportions — the mechanism behind diversity in row selection. https://dl.acm.org/doi/10.1145/3240323.3240372 — accessed 2026-06-18.
- A Contextual-Bandit Approach to Personalized News Article Recommendation. Li, L., Chu, W., Langford, J. & Schapire, R. E. Proceedings of the 19th International Conference on World Wide Web (WWW '10): 661–670, 2010. DOI 10.1145/1772690.1772758 (arXiv:1003.0146). Tier 1 (peer-reviewed primary literature). Source of: the contextual-bandit formulation for personalized content selection; the explore/exploit trade-off; the LinUCB algorithm; choosing among a small set of options conditioned on context — the algorithmic basis of per-viewer artwork selection. https://arxiv.org/abs/1003.0146 — accessed 2026-06-18.
- An Experimental Comparison of Click Position-Bias Models. Craswell, N., Zoeter, O., Taylor, M. & Ramsey, B. Proceedings of the International Conference on Web Search and Data Mining (WSDM '08): 87–94, 2008. DOI 10.1145/1341531.1341545. Tier 1 (peer-reviewed primary literature). Source of: position bias — the probability of examining an item falls sharply with its rank; the cascade model of scanning a ranked list; the need to correct for position when learning from clicks (the rich-get-richer feedback loop). https://dl.acm.org/doi/10.1145/1341531.1341545 — accessed 2026-06-18.
- Artwork Personalization at Netflix. Chandrashekar, A., Amat, F., Basilico, J. & Jebara, T. Netflix Technology Blog, 2017. Tier 3 (first-party engineering, orientation for "what ships"). Source of: per-viewer artwork/thumbnail selection; the use of a contextual bandit rather than a batch A/B test; the candidate-image pool per title; context features (member history, device, etc.); the artwork problem being far smaller than full recommendation; the largest lift on lesser-known titles; the attribution/incrementality difficulty. https://netflixtechblog.com/artwork-personalization-c589f074ad76 — accessed 2026-06-18.
- Learning a Personalized Homepage. Alvino, C. & Basilico, J. Netflix Technology Blog, 2015-04-09. Tier 3 (first-party engineering). Source of: the pre-2015 rule-based/template approach to the homepage and the shift to a personalized row selection and ordering algorithm; balancing relevance with diversity and exploration in page construction. https://netflixtechblog.com/learning-a-personalized-homepage-aa8ec670359a — accessed 2026-06-18.
- How Netflix's recommendations system works. Netflix Help Center, 2026. Tier 3 (first-party documentation). Source of: the row archetypes (Continue Watching, Trending Now, Top 10, Because You Watched); within-row ranking placing the strongest titles on the left; rows ranked top-to-bottom by relevance; personalization of which rows appear. https://help.netflix.com/en/node/100639 — accessed 2026-06-18.
- Now — for the first time — you can see what's popular on Netflix (Top 10 launch). About Netflix (company blog), 2020-02-24. Tier 3 (first-party announcement). Source of: the Top 10 row launched February 2020; its contents are country-wide popularity (not personalized to taste) while its position on the page is personalized; the Top 10 badge. https://about.netflix.com/en/news/now-for-the-first-time-you-can-see-whats-popular-on-netflix — accessed 2026-06-18.
- Deep Learning for Recommender Systems: A Netflix Case Study. Steck, H., Baltrunas, L., Elahi, E., Liang, D., Raimond, Y. & Basilico, J. AI Magazine, 42(3): 7–18, 2021. DOI 10.1609/aimag.v42i3.18140. Tier 2 (peer-reviewed first-party overview). Source of: the multi-stage candidate-generation-then-ranking and re-ranking pattern that underlies page generation and merchandising at Netflix; the role of personalization across the member experience. https://ojs.aaai.org/aimagazine/index.php/aimagazine/article/view/18140 — accessed 2026-06-18.
- Algorithms in Streaming Services (editorial vs algorithmic curation). Arts Management & Technology Lab, 2021. Tier 6 (institutional/educational, orientation only). Source of: the contrast between Netflix's heavier algorithmic approach and the more human-curated approaches taken by services such as Hulu and HBO's Max — used only to illustrate the editorial-vs-algorithmic spectrum, not as a source for any algorithmic claim. https://amt-lab.org/blog/2021/8/algorithms-in-streaming-services — accessed 2026-06-18.
Where sources disagreed or popular explanations oversimplified, the primary literature was followed. The calibration, contextual-bandit, and position-bias claims are cited to the original peer-reviewed papers (refs 2–4) rather than to vendor paraphrases; the Netflix product behaviors (artwork, page generation, row types, Top 10) are cited to Netflix's own engineering blog, help center, and announcements (refs 5–8). The widely-repeated marketing figures that "artwork drives 80% of viewing decisions" and "viewers spend 1.8 seconds per thumbnail" were deliberately not stated as fact: they trace to secondary write-ups, not to a Netflix primary source, so the article uses only the peer-reviewed ~80%-of-hours-from-recommendations figure and the ~60–90-second window (ref 1). The editorial-vs-algorithmic contrast (ref 10) is used for orientation only.


