Saturday, June 05, 2004

NotCon '04 Talk Notes: Gmail, Search and Content Discovery

In reverse order...

Content Discovery
Many criteria against which to assess the content of search results - try "consensus of opinion":

- At one extreme is agreed information/fact (e.g. "Battle of Hastings Date" - "1066")
- At the other extreme are differing tastes/opinions (e.g. "Jazz Music Great" - ?)
...and the whole range in between i.e. disputed information/facts & variable popularity of a piece of music (with correct metadata) etc.

Also, discovery may occur either from the pool of (filtered) content that has already been established (outside the Internet), or from the flood of unfiltered items that are being injected into the Internet.

Many ways to consider search, try "position above/in network":

- Can take an overview of the whole network (e.g. Google PageRank), which gives one ranking for a given set of search criteria
- A tailored/unique perspective from each node inside a network (such as peer-to-peer), where search results differ for the same search criteria

Suggest that Google PageRank is best suited for agreed information/facts, while peer-to-peer can tailor results on an individual basis in order to account for opinions/tastes. Peer-to-peer (P2P) consists of a network of nodes with connections between them representing some sort of relationship. Example: SoundRatings on top of Freenet.
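
To make the "above the network" vs. "inside the network" contrast concrete, here is a minimal Python sketch. It is not PageRank: a plain mean score stands in for the single overview ranking, and each node's view only counts the peers it is linked to. The peer names, tracks, scores and links are all hypothetical.

```python
from collections import defaultdict

# Hypothetical ratings: peer -> {item: score out of 10}
RATINGS = {
    "alice": {"track_a": 9, "track_b": 2},
    "bob": {"track_a": 3, "track_b": 8},
    "carol": {"track_a": 4, "track_b": 9},
}

# Hypothetical peer-to-peer links: node -> peers it is connected to
LINKS = {
    "dave": ["alice"],
    "erin": ["bob", "carol"],
}


def rank(ratings_by_peer, peers):
    """Rank items by mean score over the given peers, best first."""
    scores = defaultdict(list)
    for peer in peers:
        for item, score in ratings_by_peer.get(peer, {}).items():
            scores[item].append(score)
    means = {item: sum(s) / len(s) for item, s in scores.items()}
    # Sort by descending mean, then by name so ties are deterministic.
    return sorted(means, key=lambda item: (-means[item], item))


def global_ranking():
    """One ranking for everyone: the view from above the network."""
    return rank(RATINGS, RATINGS.keys())


def node_ranking(node):
    """A ranking that depends on where the node sits in the network."""
    return rank(RATINGS, LINKS.get(node, []))


print(global_ranking())      # same answer whoever asks
print(node_ranking("dave"))  # dave only hears from alice
print(node_ranking("erin"))  # erin hears from bob and carol
```

With this data the overview ranking puts track_b first, but dave, whose only connection is alice, sees track_a first.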

[Audioscrobbler is an example of search/collaborative filtering which gives unique results tailored to the user. This is possible through the generation of implicit preferences/ratings based on listening habits. Well suited to the pool of (filtered) content that has been established.]
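
As a rough illustration of implicit ratings from listening habits, the sketch below treats play counts as preferences and recommends artists played by similar listeners. This is a generic user-based collaborative filter written for these notes, not Audioscrobbler's actual method; the users and play counts are made up.

```python
import math

# Hypothetical play counts: user -> {artist: number of plays}
PLAYS = {
    "you": {"Miles Davis": 40, "John Coltrane": 25},
    "peer1": {"Miles Davis": 30, "John Coltrane": 20, "Thelonious Monk": 15},
    "peer2": {"Oasis": 50, "LL Cool J": 10},
}


def cosine(a, b):
    """Cosine similarity between two play-count profiles."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(user, plays):
    """Suggest artists the user has not played, weighted by peer similarity."""
    mine = plays[user]
    scores = {}
    for other, theirs in plays.items():
        if other == user:
            continue
        similarity = cosine(mine, theirs)
        for artist, count in theirs.items():
            if artist not in mine:
                scores[artist] = scores.get(artist, 0.0) + similarity * count
    # Drop artists that only come from listeners with nothing in common.
    wanted = [a for a, s in scores.items() if s > 0]
    return sorted(wanted, key=lambda a: (-scores[a], a))


print(recommend("you", PLAYS))
```

No explicit rating is ever entered here: the recommendation comes entirely from what people chose to play.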

But not all peers/users are equal; some will have abilities/characteristics that enable the discovery of new content tailored to individual preferences.

The Tipping Point - How little things can make a big difference
Mavens - Data banks
Connectors - Social glue
Salesmen - Persuaders in word-of-mouth epidemics

To me these people look like bloggers.

So how could (a subset of) blogger types use Gmail to facilitate the discovery of such content?

Firstly, deal with the issue of implicit vs. explicit ratings - some bloggers (mavens, connectors, and salesmen) are suited to providing ratings information against relevant metadata, and occasionally do so, e.g. in an email subscription:

"Reviews ---------------------------------------
Trawling the web for your pleasure

:: Mamas With Bushes, Build Original

Type:: Mash
Uses:: Mama Said Knock You Out, LL Cool J
Fuckin' in the Bushes, Oasis

One word - rocking.

Size:: 03.28MB Length:: 00:03:35
Format:: MP3 Quality:: 128Kbps

Score:: 9/10"

Download link was here

[Please note: As the review is of bootleggers the source is confidential]

...and signing up is as simple as going to soundhog.org.uk (no relation).


"I went to see Harry Potter on Monday with my younger brother and bloody Cory who did his 'taking photos of the "don't take photos" sign' thing to general hilarity from the rest of the auditorium. Again. And - of course - the Englishman dies quietly of shame inside (but it wouldn't have been the same without him). Personal verdict - still flawed, but better than anything else in the rest of the series so far. Four stars. Well done to all involved etc." Source: PlasticBag.org 4 June 2004.

How would it work? Subscription emails (automatically archived) would contain:
- Metadata (see example below)
- A rating on a stated scale (five-star, out of 4, 5, 10, 100?)

The archive would then allow:
- Collection of the emails in one place
- Active/automatic search for new content linked from the emails
- Adding the user's own explicit/implicit rating through email reply/Gmail labels
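
Since reviewers use different scales (the notes quote scores such as 9/10 and 87/100), an archive would need to bring them onto a common one before comparing anything. A minimal sketch, assuming ratings are written as "score/scale"; the function name is hypothetical and star ratings written in words are not handled.

```python
def normalise(rating: str) -> float:
    """Convert a 'score/scale' string such as '9/10' to a value in 0..1.

    Assumption: the rating is always written with an explicit scale.
    Raises ValueError for anything else.
    """
    score_text, _, scale_text = rating.partition("/")
    score, scale = float(score_text), float(scale_text)
    if scale <= 0 or not 0 <= score <= scale:
        raise ValueError(f"rating out of range: {rating!r}")
    return score / scale


for raw in ["9/10", "87/100", "4/5"]:
    print(raw, "->", normalise(raw))
```

Anything without an explicit scale is rejected rather than guessed at, since "4" could mean four stars out of five or four out of ten.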

For Example:
title: Oxygen
link: http://www.actsofvolition.com/steven/hc/hortonschoice_oxygen.mp3
creator: Hortons Choice
Rating: 87/100
Review: Excellent
genre: AlternRock
genre_id: 40
format: audio/mpeg
license: http://creativecommons.org/licenses/by-nc-sa/1.0/

...but RSS much better.
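
The "key: value" fields in the example above are simple enough to pull out of an archived email automatically. A minimal sketch, assuming one field per line and splitting on the first colon only, so that URLs survive intact; the function name is hypothetical and the sample text is copied from the example.

```python
EXAMPLE = """\
title: Oxygen
link: http://www.actsofvolition.com/steven/hc/hortonschoice_oxygen.mp3
creator: Hortons Choice
Rating: 87/100
Review: Excellent
genre: AlternRock
genre_id: 40
format: audio/mpeg
license: http://creativecommons.org/licenses/by-nc-sa/1.0/
"""


def parse_metadata(text):
    """Turn 'key: value' lines into a dict with lower-cased keys."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; later colons belong to the value.
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue  # skip blank lines and lines without a key
        fields[key.strip().lower()] = value.strip()
    return fields


metadata = parse_metadata(EXAMPLE)
print(metadata["title"], metadata["rating"], metadata["link"])
```

Keys are lower-cased because the example mixes "Rating" and "Review" with lower-case field names.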

Does such information in a (Gmail) archive facilitate the discovery of this type of content?

Well, there are going to be:
1. Bloggers, who provide ratings against metadata, that are popular (80:20 rule)
2. Users/peers (& bloggers?) who have a similar profile to you who will link to bloggers
3. Users/peers with archives of ratings that can be searched

Tools available include:
1. PageRank + similar algorithms
2. Collaborative filtering
3. P2P search (of others' archives) but...

Are we really searching for content or looking for the connectors/bloggers who can link us to it / deliver it to us?

