Why it’s hard to work on a feed reader

October 25th, 2007

It’s not the writing of the code. It’s not the fetching and parsing of the feeds. It’s not dealing with daylight saving time and encoding issues. It’s not any of that.

It’s because you’re always sitting in front of an abundance of interesting things to read. Every time you try to test the latest feature, an interesting headline flies in front of your face. And, more often than not, you manage to convince yourself that it’s “industry related and still counts as work”. Half an hour later, you get back to your code, change a few lines, reload the page, and bam, another interesting blog post catches your eye.

The damn thing just works too well. :)

Commons Question Time

October 23rd, 2007

I’ve recently begun DVR’ing the weekly Prime Minister Q&A session at the British House of Commons. If you don’t watch it, it’s half an hour long and it’s on C-SPAN Wednesday mornings at 7am (thus the need for the DVR’ing). I highly recommend it to anyone who either loves politics or enjoys watching skilled trash talkers go at each other in front of a rowdy crowd.

Here’s a particularly biting clip from last week:

America would be a better place if our president were subjected to such scrutiny on a weekly basis.

And, have you ever watched the House or Senate on TV? It’s like the turnout at a college fencing match. Everyone has better things to do. At the very least, the prospect of seeing the Prime Minister talk some trash seems to get butts in the seats. The same thing might get everyone together now and then in our Congress too.

RSS is still in its infancy and now there’s proof

October 15th, 2007

Google Reader recently made the number of subscribers of each feed available to anyone doing a feed search. The results are very interesting.

First I’d like to point out that the people who are frantically trying to put together leaderboards (like TechCrunch) are wasting their time. The numbers for certain feeds are highly inflated because Google has included them in the feed bundles that are offered when you click “browse” in the “Add Subscription” area. These preselected groups of feeds are also presented to people trying out Google Reader for the first time. As a result, most Google Reader users will have added one or more of these bundles, and many, many people who tried Google Reader once and never went back will have these feeds sitting idle in their list. This makes any leaderboard you come up with unfair and uninteresting. [#1]

However, if we can agree that a significant portion of GR users have added the news and tech bundles, then the number of subscribers of the top feeds will approximate the total number of people that have signed up for GR. The BBC news feed has about 200,000 subscribers and sits at or close to the top of the list. I think we can safely infer that the number of total GR users (all time, including inactive members) is somewhere in the ballpark of this number.

That number is TINY. GR has fewer than half a million registered users? With Google behind it? GR is definitely today’s most popular feed reader, so this means that the number of people using a feed reader every day is really still quite small.

Clearly feed reading is still in its infancy.

I say infancy because I think that anyone that’s ever used a reader regularly will agree that it’s really, really, really useful. It’s something you try out and can never go back. It’s a paradigm shift. Something this useful is destined to expand its audience.

As a person that just launched his own feed reader, I take the fact that the field is still immature to be great news. It means there’s room for more players. This is something I’ve suspected for quite some time and it’s part of the reason that I started Feed Each Other.

Another thing that confirms my suspicion is something that one of the Google Reader team members said in that leaked video not too long ago. He said something along the lines of “as our user base grows, Bloglines isn’t really losing much traffic. We’re not stealing users from them, but we’re getting people that are new to the game”. The potential audience for feed reading is massive. Anyone can benefit from this technology, young and old. Hundreds of millions of people. We’re just getting started, and I think people are ready for this. It’s the right time.

Whoever creates a feed reader that is fun, easy to use and easy to understand for regular folks is going to be very successful. Obviously, I think Feed Each Other is that reader.

Notes:

[#1] The ‘popular feeds’ list on Feed Each Other only takes into account subscriptions that have been recently active for just this reason.
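For the curious, that kind of filter could look roughly like the sketch below. This is not the actual Feed Each Other code. The function name, the shape of the subscription records and the 30-day activity window are all assumptions made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

def popular_feeds(subscriptions, now, active_within=timedelta(days=30)):
    """Count subscribers per feed, ignoring subscriptions that have gone idle.

    subscriptions: iterable of (feed_url, last_read_at) pairs, one per subscriber.
    active_within: hypothetical cutoff; anything not read within it is skipped.
    """
    cutoff = now - active_within
    counts = Counter(
        feed_url
        for feed_url, last_read_at in subscriptions
        if last_read_at >= cutoff
    )
    # List of (feed_url, active_subscriber_count), most popular first.
    return counts.most_common()

now = datetime(2007, 10, 15)
subscriptions = [
    ("bundled-news-feed", datetime(2007, 3, 1)),    # tried once, never came back
    ("bundled-news-feed", datetime(2007, 10, 14)),
    ("small-blog-feed", datetime(2007, 10, 13)),
    ("small-blog-feed", datetime(2007, 10, 12)),
]
print(popular_feeds(subscriptions, now))
```

With the idle subscription dropped, the bundled feed no longer gets credit for a reader who never came back.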

Going for the win-lose is a bad sign for Google

October 10th, 2007

Valleywag just wrote about something that’s been bothering me for a few weeks now.

http://www.google.com/url?sa=t&ct=res&cd=1&url=
http%3A%2F%2Fwww.smellypoop.com%2Fpoop.html&ei=7xAM
R62WC5mIhAPn3I0w&usg=AFQjCNFsw83KR3JmGnnBi_n89GtI
dgWgFw&sig2=Th-qrd3nEEx_368yjwK7cA

Google has stopped linking directly to sites in their main search results and is instead passing each url through a really long, gnarly redirect. This is something Yahoo! has been doing for a long time and it’s pretty lame. You can’t cut and paste links out of the search results any more, and it takes longer to get where you want to go because you have to go through a middleman. It’s a subtle thing, but it’s really annoying. I’ve always believed that the lack of such redirects was a big reason why Google always maintained a lead over Yahoo! in the search wars.
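For what it’s worth, the real destination is still sitting in the redirect’s url query parameter, so it can be pulled back out. Here is a minimal sketch using the example link above and Python’s standard urllib.parse module. The function name is just for illustration, and it assumes the redirect keeps carrying the target in a parameter named url.

```python
from urllib.parse import urlsplit, parse_qs

def extract_destination(redirect_url):
    """Return the target URL carried in the redirect's `url` parameter, or None."""
    query = urlsplit(redirect_url).query
    params = parse_qs(query)  # parse_qs also undoes the percent-encoding
    values = params.get("url")
    return values[0] if values else None

# The example redirect from above, joined back into a single string.
redirect = (
    "http://www.google.com/url?sa=t&ct=res&cd=1&url="
    "http%3A%2F%2Fwww.smellypoop.com%2Fpoop.html&ei=7xAM"
    "R62WC5mIhAPn3I0w&usg=AFQjCNFsw83KR3JmGnnBi_n89GtI"
    "dgWgFw&sig2=Th-qrd3nEEx_368yjwK7cA"
)
print(extract_destination(redirect))  # http://www.smellypoop.com/poop.html
```

That gets the link back, but it’s exactly the kind of extra step a search results page shouldn’t require.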

What’s the point of these redirects? They’re there so that Google/Yahoo! can accurately track clicks on their search results pages. I previously assumed that Google, because they weren’t beholden to archaic link tracking methodologies like Yahoo!, was using JavaScript click handlers and Ajax to do the tracking behind the scenes. I guess I overestimated them.

This is a bad, bad sign for Google. It means they are, perhaps for the first time, willing to sacrifice the quality of the user experience in order to feed more data to the reports read by upper management. Google wins here (short term), but the user loses. It’s very un-dude.

A change like this could only be made to their #1 money-maker if their fantastic culture and spirit are eroding and giving way to typical big company BS. It means that some VP managed to put his short term goals ahead of the company’s long term success. Perhaps they’ve let too many Paul Buchheits get away and hired ladder climbers in their place.

If I owned any Google stock, I’d sell it soon (although, to be fair, I’ve said that a few times before and I’ve been wrong every time).

Why Google cares about its Reader

October 2nd, 2007

Note: If this speculation is off the mark, then this is why they should care about their Reader.

The NY Times recently had an interesting piece with a behind the scenes look into the world of Google search and their ranking gurus Udi Manber and Amit Singhal [#1]. One part of this article that caught my eye was where they discuss the “freshness” problem.

Freshness, which describes how many recently created or changed pages are included in a search result, is at the center of a constant debate in search: Is it better to provide new information or to display pages that have stood the test of time and are more likely to be of higher quality? Until now, Google has preferred pages old enough to attract others to link to them.

To solve this problem, the article says that Mr. Singhal thought of a solution that he calls QDF, or “query deserves freshness”. They aim to show fresh results for queries that are topical to recent events.

The QDF solution revolves around determining whether a topic is “hot.” If news sites or blog posts are actively writing about a topic, the model figures that it is one for which users are more likely to want current information. The model also examines Google’s own stream of billions of search queries, which Mr. Singhal believes is an even better monitor of global enthusiasm about a particular subject.

Cool idea, right? But there’s a question that begs answering. Once you’ve figured out that a user is looking for fresh results, how do you figure out which ones to show to them?

Google’s web search index has a lag time of up to a few days. This may be due to the sheer size of the index and the technical problems that come with re-indexing the entire web very frequently. But I think it’s because updating it any more often than that provides little further benefit. If they don’t wait a few days, then people have no time to create links to the new web pages out there and provide the PageRank algorithm with its secret sauce. In the PageRank world, fresh pages have no value. People need time to link (or not link) to them and tell the algorithm whether or not they’re worthwhile.

This means that a new mode of thought is required to index and rank very fresh information. Information for which the usual trust metadata is not yet available.

To fill this void, so-called “blog search” has emerged. Blog search is actually just an arena in which “query deserves freshness” is always true. It’s the same problem. Technorati is the leader in this area, but others are catching up. Unfortunately, Technorati’s results are often littered with spammy and irrelevant links. They try to mitigate this by assigning an authority score to the source of each result, but it’s not granular enough to get at the real issue (Google doing more frequent indexing won’t really help web search for the same reason). All the other blog search providers have similar ranking issues. So, what are they to do?

The way to solve the ranking problem for fresh information is to analyze the attention streams of people that are consuming very fresh information.

And where do people consume very fresh information? That’s right, they do it in a feed reader.

Google Reader is a way for the big G to get extremely reliable data about which new web pages are worthwhile and which ones are not. If lots of people email a story in a feed to their friends, that’s a clue that it’s interesting. If lots of people star something, yet another. If they tag it, even better. This all happens very quickly, within minutes of an item’s publication. You see, if they get a large enough group of people to consume their fresh information inside Google Reader, then they are acquiring massive amounts of structured, valuable, implicit metadata that can help them to solve the freshness ranking problem.
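To make the idea concrete, here is a rough sketch of how those signals could be rolled up into a ranking. It is pure speculation on my part and not anything Google has described. The action names, the weights and the function name are all invented for illustration.

```python
from collections import Counter

# Hypothetical weights for each reader action. These numbers are guesses,
# chosen only so that every action counts for something.
WEIGHTS = {"email": 3.0, "tag": 2.0, "star": 1.0}

def rank_fresh_items(events):
    """Rank items by weighted reader attention, highest score first.

    events: iterable of (item_url, action) pairs, one per reader action.
    Actions missing from WEIGHTS contribute nothing to the score.
    """
    scores = Counter()
    for item_url, action in events:
        scores[item_url] += WEIGHTS.get(action, 0.0)
    return [item_url for item_url, _score in scores.most_common()]

events = [
    ("post-a", "star"),
    ("post-b", "email"),
    ("post-b", "tag"),
    ("post-a", "star"),
    ("post-c", "star"),
]
print(rank_fresh_items(events))
```

The point is only that a few cheap, explicit actions are enough to order brand new items long before anyone has had time to link to them.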

This attention data is infinitely more valuable to Google than the potential advertising dollars they could obtain from showing ads to Reader users. That’s why Reader is ad free and will remain that way. They just want to know what you’re looking at.

This attention data is why Technorati acquired the Personal Bee and why Ask bought Bloglines. They want to get at this type of information too. This is also why Yahoo! is really blowing it by not building a proper reader of their own. At this point, they should just buy one (nudge, nudge, wink, wink). [#2]

If there’s one thing I learned while working at Yahoo! it’s that search is the cash cow, the big prize. Everything else matters only with regard to how it can improve search and drive increased search market share. This is probably even more of a truism at Google. Google Reader may have started out as a 20% project, but all that fresh attention metadata means that it will eventually become an integral part of their search platform.

Notes:

[#1] These two get all the attention while all of their hard working, un-named minions get no recognition. I really hated when that happened with Yahoo! Answers in the media. Some big shot VP would get quoted as the “man behind Yahoo! Answers” and half the time it was someone I’d never even seen or heard of before. Oh, and I’m still annoyed with Mr. Manber for stealing my corp. username at Yahoo! and forcing me to use something clunky other than just udi@. When your name is as rare around here as mine is, you just don’t see these things coming. Damn you Manber! ;)

[#2] Yup, in case you didn’t already know, I’ve just built a new breed of feed reader. It’s pretty awesome. You should really check it out. Now. Yes, you. Right now. Seriously. Do it. http://feedeachother.com