Components of a Successful Internet Portal

by Jose Saura

 

[draft]
In this document I use the word “portal” to refer to the entire ecosystem that is needed to provide information to people. Much has been discussed about the future of traditional web portals and whether or not search technologies will replace them. I believe there is always going to be a need for a well-known, trusted entry point for accessing information, regardless of the technologies behind the experience.

The main reasons traditional portals and entry pages are still successful today, and haven’t been replaced by algorithmic results, are trust and relevance. People place more trust in, and identify better with, sites that have humans doing the work of selecting and filtering what is important. As of today there are no automated solutions that offer the same level of relevance or generate the same level of emotional attachment from users.

That said, although it is not covered in detail in this document, a very important component of the platform is the content indexing service. This service needs to be able to categorize and rank content so that related content can be surfaced automatically. The technology only replaces humans in “tail” scenarios, when it isn’t cost effective for a human to program the experience. In most cases, it complements rather than replaces the work of a content editor.

In this document I cover the basic forces (users, content producers and advertisers) and the role of the platform. This analysis provides some of the basic requirements that guide the development of the egooge platform (see also the vision document and the description of storage services).

A portal is a broker that balances three powerful forces: users, content producers and advertisers. For a portal to be successful it needs to find the right model that optimizes results for each one of them.

Understanding the portal’s user base (personas) is critical. We want the right mix of users: a large number of users overall and, among them, users who actually consume advertisements and buy products. Large user volumes allow us to attract more advertisers. Users who purchase keep conversions high and advertisers happy.

For example, a Facebook-like model optimizes the experience for younger consumers, who are less affluent but high in number. A more traditional news-oriented portal might function better for older consumers, who are more affluent but fewer in number.

In many cases the content we present is similar but the way users get to it should be tailored to the user base. A single platform should be able to offer distinct entry points/portals that could be optimized for different populations while consuming the same content sources.

The platform should allow us to define and select the right advertisement product, in the right place, at the right time. The business rules that govern the optimal behaviors have to take into consideration the data model behind the web page/site, the content source, editorial/marketing input, the user history, and the advertisement inventory.

Marketing teams should be able to easily control advertisement rules and complement the advertisement services. For example, “This ad should show on all pages whose content source is Forbes.com” is a common type of rule that can’t be controlled by the advertisement service.
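As a rough illustration of the kind of rule described above, the following Python sketch layers marketing rules on top of the advertisement service. Everything named here (PageContext, AdRule, select_ad, the ad identifiers and the fallback function) is hypothetical and is not taken from any existing platform; it only shows a rule keyed on the content source taking precedence over the ad service's own choice.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class PageContext:
    """Hypothetical description of the page being rendered."""
    url: str
    content_source: str  # e.g. "Forbes.com"


@dataclass(frozen=True)
class AdRule:
    """Hypothetical marketing rule: if the page matches, show this ad."""
    name: str
    matches: Callable[[PageContext], bool]
    ad_id: str


def select_ad(
    page: PageContext,
    rules: Sequence[AdRule],
    ad_service: Callable[[PageContext], str],
) -> str:
    """Return the ad of the first matching marketing rule.

    If no rule matches, defer to the advertisement service, which is
    modelled here as a plain function (an assumption for the sketch).
    """
    for rule in rules:
        if rule.matches(page):
            return rule.ad_id
    return ad_service(page)


# The rule quoted in the text, expressed as data that marketing could own.
forbes_rule = AdRule(
    name="forbes-sponsorship",
    matches=lambda page: page.content_source.lower() == "forbes.com",
    ad_id="ad-forbes-sponsor",
)
```

Keeping the rules as data, separate from the ad service, is one way to let marketing teams change them without touching the service itself.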

In the content space, we need to optimize the content we deliver, taking into consideration the cost of the content and the user’s needs (news, research and entertainment).

There are terabytes of content available that is either low quality or appeals only to small segments of the population. There are fewer high quality sources that appeal broadly, and these are obviously more expensive. Creating a platform that helps optimize this experience is not trivial and still requires editorial input.

Premium content tends to bring premium advertisement fees, while other content commands low fees. Again, the key is to optimize revenues by achieving the right balance.

What the platform needs is to make it very easy to incorporate new sources of content, whether premium or not, and to classify and categorize this content using a common pipeline (similar in functionality to the local search pipeline). We need ONE pipeline, not the hundreds we have today, so we can easily incorporate new rules and new content sources.

The platform should have a content ingestion pipeline, a content store, a content classifier and indexer.
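A minimal sketch of how these four pieces could fit together is shown below. It assumes a toy keyword-based classifier and in-memory dictionaries for the store and index; the class names, the raw feed format and the keyword lists are all invented for illustration, and a real classifier and storage system would look very different.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    item_id: str
    source: str
    title: str
    body: str
    categories: list[str] = field(default_factory=list)


# Hypothetical keyword lists standing in for a real classifier.
CATEGORY_KEYWORDS = {
    "finance": {"market", "stocks", "earnings"},
    "sports": {"match", "league", "score"},
}


def normalize(raw: dict, source: str) -> ContentItem:
    """Map a raw feed entry (assumed keys: id, title, body) to one common shape."""
    return ContentItem(
        item_id=f"{source}:{raw['id']}",
        source=source,
        title=raw["title"].strip(),
        body=" ".join(raw["body"].split()),  # collapse runs of whitespace
    )


def classify(item: ContentItem) -> list[str]:
    """Assign every category whose keywords appear in the title or body."""
    words = set(re.findall(r"[a-z]+", f"{item.title} {item.body}".lower()))
    return sorted(c for c, kws in CATEGORY_KEYWORDS.items() if words & kws)


class ContentPlatform:
    """One ingestion path shared by every content source."""

    def __init__(self) -> None:
        self.store: dict[str, ContentItem] = {}                 # content store
        self.index: defaultdict[str, set[str]] = defaultdict(set)  # category -> ids

    def ingest(self, raw: dict, source: str) -> ContentItem:
        item = normalize(raw, source)
        item.categories = classify(item)
        self.store[item.item_id] = item
        for category in item.categories:
            self.index[category].add(item.item_id)
        return item

    def related(self, item_id: str) -> list[str]:
        """Ids of other items that share at least one category."""
        ids: set[str] = set()
        for category in self.store[item_id].categories:
            ids |= self.index[category]
        ids.discard(item_id)
        return sorted(ids)
```

Because every source goes through the same ingest call, adding a new source only requires mapping its raw entries to the common shape; classification, storage and indexing are shared.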

The content presentation pipeline is also critical. This pipeline understands the different devices (mobile, browser, toolbars, Silverlight, etc.) and it also applies the rules that control what content and what ads are shown. The rules are based on editorial input, marketing input, user requirements and user history.
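The presentation step might be sketched as follows. This is a simplified illustration only: it covers content selection and leaves ads out, models editorial input as a "pinned" flag and user history as a set of already seen story ids, and uses invented names and per-device limits throughout.

```python
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    device: str           # "mobile" or "browser" in this sketch
    seen_ids: frozenset   # user history: stories already shown to this user


@dataclass(frozen=True)
class Story:
    story_id: str
    title: str
    pinned: bool = False  # editorial input: promote this story


# Hypothetical per-device limits; real values would come from configuration.
MAX_ITEMS = {"mobile": 2, "browser": 5}
DEFAULT_MAX_ITEMS = 3


def select_stories(request: Request, stories: list[Story]) -> list[Story]:
    """Drop stories the user has seen, put pinned stories first, cap by device."""
    unseen = [s for s in stories if s.story_id not in request.seen_ids]
    # sorted() is stable, so the original order is kept within each group.
    ordered = sorted(unseen, key=lambda s: not s.pinned)
    return ordered[: MAX_ITEMS.get(request.device, DEFAULT_MAX_ITEMS)]


def render(request: Request, stories: list[Story]) -> str:
    """Render the selected stories as plain text for mobile, HTML list items otherwise."""
    chosen = select_stories(request, stories)
    if request.device == "mobile":
        return "\n".join(f"* {s.title}" for s in chosen)
    return "\n".join(f"<li>{html.escape(s.title)}</li>" for s in chosen)
```

The selection rules and the device-specific rendering are kept in separate functions so that the same rules can feed any number of device renderers.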

A great opportunity for a successful portal is to simplify the content publishing process for external providers. A platform that could be opened to third parties, with built-in support for advertisement, would lower our acquisition costs for semi-premium content and allow us to partner with influential producers.

To summarize, the platform requirements are:

·         A common content ingestion pipeline that classifies, indexes, interrelates and normalizes content so it can be further manipulated by common algorithms.

·         A common storage system that allows us to represent the complex data model behind modern web sites.

·         Editorial tools to generate content and control marketing rules.

·         A rendering pipeline that allows web developers to create and maintain web sites at a low cost.