Thursday, February 22, 2007

Larry and me, not on same Page

In 1998 Page and Brin wrote their now famous paper on what is popularly called "PageRank". I think that's great work but, contrary to popular belief, that does not make Larry Page God. Nor does it make him an expert on all things technological. Why the outburst? Well, recently John Battelle reported Larry Page talking about AI and

how the human brain compares to an operating system

What he says is inaccurate both biologically and technologically. That in itself is not surprising; what is shocking is that the majority of the media and the blogosphere seem to be accepting what he says without question. Frankly speaking, his argument is a joke.



Watch the video and you will see Page talking about DNA:
If you look at your DNA its just about 600 MB compressed, which is smaller than any operating system. Your Linux, windows any operating system. That includes booting up your brain, right ... by definition. So your algorithms are probably not that complicated, its probably about the overall computation.

I hate to be the one to break this to you, Page, but on this you have no idea what you are talking about.



First off, let's start with the technological inconsistencies in Page's argument. The assumption that AI will require huge amounts of computational power is seriously flawed. Many people working on AI explicitly state that this is not necessary.

How powerful are the best present-day supercomputers? A quick glance at top500.org lets us find out.

The computational power of many of these machines are rapidly approaching human equivalency, if not already surpassing it. (Depending on which estimate you use.) A common estimate for the human brain's processing power is 100 Tflops/sec (10^15 floating operations per second), but neuroscientific and evolutionary evidence suggests this may be an overestimate (Bostrom 1997). Values as low as 100 Gflops/sec (10^13 floating operations per second) have been proposed (Moravec 1998). The vast majority of computing experts, including the people at Intel, predict the continuous acceleration of available computing power through 2015 at least, at which point we would need to switch to nanocomputing or quantum computing to maintain continuous progress. The point at which human-equivalent computing power becomes available is highly significant because it puts the possibility of Artificial General Intelligence (AGI) within reach. The creation of AGI would signify the arrival of a new intelligent species, the greatest milestone in humanity's history, and either our extinction or salvation - depending on its motivations (Bostrom 2003).

Original Source

Points of note
  1. Powerful computers exist outside Google
  2. These may be "as fast as the human brain"
  3. Computing power reaching the speeds of that of the human brain is important


But how important is speed? Further down in the same article, the author talks about this:


The really interesting thing about engineering AGI, however, is that computer scientists won't necessarily need human-equivalent computing power to implement successful Artificial Intelligence. Implementing AGI with 1/100th, or even 1/1000th computing power may be possible. Biological evolution, being a nonforesightful process constrained by the inherent mechanics of biology, incremental adaptation, the constant necessity for an immediate fitness advantage, weakness with simultaneous dependencies, and so on, falls far short of the efficiency and foresight that human experts can muster.

Points of note:
  1. We don't necessarily need human-equivalent computing power to implement successful AI
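
To keep the magnitudes in the two quoted passages straight, here is a small sketch that just does the arithmetic. One thing it brings out: in the quote as I have it, the prefixes and the exponents do not agree (100 Tflops works out to 10^14 and 100 Gflops to 10^11), so the sketch takes the exponents at face value. That choice, and the helper names, are my assumptions and not part of the original article.

```python
# SI prefixes as powers of ten.
GIGA = 10**9
TERA = 10**12

# Exponent figures exactly as they appear in the quoted passage.
QUOTED_HIGH = 10**15  # the "common estimate" in the quote
QUOTED_LOW = 10**13   # the low-end value in the quote


def fraction_of(estimate, denominator):
    """Operations per second at 1/denominator of an estimate."""
    return estimate // denominator


def exponent(n):
    """Power of ten of an exact power-of-ten integer."""
    return len(str(n)) - 1


if __name__ == "__main__":
    # What the prefixes alone would give.
    print("100 Tflops = 10^%d" % exponent(100 * TERA))
    print("100 Gflops = 10^%d" % exponent(100 * GIGA))
    # The 1/100th and 1/1000th scenarios from the second quote,
    # applied to the higher quoted exponent.
    print("1/100th    = 10^%d" % exponent(fraction_of(QUOTED_HIGH, 100)))
    print("1/1000th   = 10^%d" % exponent(fraction_of(QUOTED_HIGH, 1000)))
```

Whichever reading of the units is right, the point of the second quote stands: the target may sit two or three orders of magnitude below the headline estimate.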




What is more disturbing is Page's comparison between the function of human DNA and that of operating systems. Let's go through what the actual role of DNA is. Now, I am not a biologist, but here is the gist of what one of my friends explained to me:






DNA contains the "instructions" for the development of living organisms. This does not imply that DNA is an OS. What, then, does DNA do?

The structural and functional aspects of a living being are determined by proteins and the interactions between proteins. DNA is used by cells to create these proteins through a fairly complex process:

The first step in this process is transcription, wherein the part of the DNA "describing" the protein that is currently needed ( called the gene ) is "copied". The "copy" takes the form of an RNA strand, which can basically be thought of as temporary storage for the protein's sequence.

Animations depicting Transcription:
Very Basic
Another basic
Technical


The next step is translation, wherein the information in the RNA is used to create the protein.

Animations depicting Translation:
Very Basic
Another Basic one
Technical
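
The two steps above can be sketched as a toy program. This is a deliberate simplification: real transcription reads the template strand and builds the complementary RNA, whereas the sketch takes the coding strand and swaps T for U, which gives the same result. The codon table is only a small, partial subset of the real 64-codon table, and the example gene is made up.

```python
# Partial codon table: only the codons used in the example below.
# The full genetic code has 64 codons.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "GCU": "Ala",
    "UGG": "Trp",
    "AAA": "Lys",
    "UAA": "STOP",
}


def transcribe(coding_strand):
    """DNA coding strand -> mRNA: same letters, with T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three letters at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


if __name__ == "__main__":
    gene = "ATGGCTTGGAAATAA"  # hypothetical gene, for illustration only
    mrna = transcribe(gene)
    print(mrna)
    print("-".join(translate(mrna)))
```

Note that all this produces is a flat chain of amino acids. Nothing in the sketch says how that chain will fold.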


One would think it stops here, but the way in which the protein then folds up also plays a fundamental role in its function. In fact there exist proteins called prions which, when folded one way, can be useful to the body but, when folded another way, cause disease. The protein's linear sequence ( primary structure ), its local folding patterns ( secondary structure ), its overall three dimensional shape ( tertiary structure ) and finally the way in which multiple chains club together ( quaternary structure ) all play a vital role in the functioning of the protein.

What is also interesting is that, given the primary structure of a protein and the environment it is likely to exist in, there is, as yet, no way of pre-calculating the final three dimensional structure with a hundred percent certainty. So the DNA sequence alone does not tell you what interactions are going to happen in a cell.
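
To put a number on "just the sequence": a back-of-the-envelope sketch, assuming roughly 3.1 billion base pairs and two bits per base (both figures are my assumptions, not something from the talk), lands in the same ballpark as the size Page quotes. That is only the length of the string. It says nothing about folding or about what the resulting proteins do to each other.

```python
# Assumed figures, for illustration only.
BASE_PAIRS = 3_100_000_000  # approximate length of one copy of the human genome
BITS_PER_BASE = 2           # four possible bases (A, C, G, T) fit in two bits


def raw_genome_bytes(base_pairs, bits_per_base=BITS_PER_BASE):
    """Uncompressed storage for a sequence, in bytes."""
    return base_pairs * bits_per_base // 8


if __name__ == "__main__":
    size = raw_genome_bytes(BASE_PAIRS)
    print("about %d MB before any compression" % (size // 10**6))
```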



Consider all of this and now consider what we started off with:
If you look at your DNA its just about 600 MB compressed, which is smaller than any operating system. Your Linux, windows any operating system. That includes booting up your brain, right ... by definition. So your algorithms are probably not that complicated, its probably about the overall computation.
Way to go Page ...



Friday, February 16, 2007

Through the eyes of The Google-bot.

If you were to see what the Google-bot sees, what would that be? Well, this seems to be something everyone is concerned about, mostly because that is what determines how "highly" Google thinks of you. There are a couple of online tools ( this and this ) which let you see what search engines actually read off your site.

Unfortunately this generates a huge pile of text that looks something like this:

Spidered Text :
Search Me skip to main | skip to sidebar Search Me Search - from Various Angles Friday, February 16, 2007 John Battelle on a Community driven Yahoo! Pipes project: Read Tim O'Reilly. Posted by Harish TM at 4:05 AM In my previous post I tried to start a community initiative for creating services based on Yahoo! pipes. I tried a couple of different approaches to kick start this project and when I failed, I decided to try to mail a couple of people

...

Yes - the plain text version of the site. So who cares?
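
Roughly what such a tool does can be sketched with Python's standard html.parser: throw away the tags and keep the text. This is a minimal sketch under my own assumptions (the class and function names are made up), and it is certainly not how the Google-bot itself works.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects page text, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace into single spaces.
        return " ".join(" ".join(self._chunks).split())


def spider_text(html):
    """Return the plain text a simple spider would read off a page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    page = ("<html><head><title>Search Me</title></head><body>"
            "<a href='#main'>skip to main</a> | "
            "<a href='#sidebar'>skip to sidebar</a></body></html>")
    print(spider_text(page))
```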

Maybe if you were paid to do some kind of SEO you would, but to the rest of us this is still just a long and cluttered piece of text. But this becomes interesting when you push it through some kind of a visualization tool such as TagCrowd.

TagCrowd is a web application for visualizing word frequencies in any user-supplied text by creating what is popularly known as a tag cloud.
About page here

So I ran this blog through a web spider and then the output of that through TagCrowd and here is what I got. I am not sure what exactly this gives us and it would have been great if TagCrowd included phrases, but it looks like there might be something in this:


created at TagCrowd.com
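
The counting that sits behind a tag cloud can be sketched in a few lines. This is not TagCrowd's actual implementation; the stop-word list, the minimum word length and the function name are all my own assumptions.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; real tools use much longer ones.
STOP_WORDS = {"and", "the", "for", "from", "that", "this", "with"}


def tag_counts(text, top_n=10, min_length=3):
    """Return the top_n most frequent words as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    kept = [w for w in words if len(w) >= min_length and w not in STOP_WORDS]
    return Counter(kept).most_common(top_n)


if __name__ == "__main__":
    spidered = "Search Me skip to main | skip to sidebar Search Me Search"
    for word, count in tag_counts(spidered, top_n=3):
        print(word, count)
```

A tag cloud then just draws each word at a size that grows with its count. Counting phrases instead of single words would mean counting adjacent word pairs in the same way.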






John Battelle on a Community driven Yahoo! Pipes project: Read Tim O'Reilly.

In my previous post I tried to start a community initiative for creating services based on Yahoo! Pipes. I tried a couple of different approaches to kick-start this project and, when those failed, I decided to mail a couple of people about it. One of the people I mailed was John Battelle, Chairman, Federated Media.

My mail:

Hey John,

I have a small blog ( http://search-search-and-searchme.blogspot.com/ ) and I am trying to pull together a community that could, together, come up with services based on Yahoo! Pipes.

I was hoping you could give me your ideas on an initiative such as this:


  1. Whats your take on the new service - Yahoo! pipes? What do you think its impact on user generated content will be?
  2. Do you think a it is possible to build a community that works on Yahoo! Pipes services? If so what do you think is the best way to build such a community?
  3. You seem to have written off Google Co-op, but at the same time seem to be quite happy with Alexa opening up their crawl ( a beta version of which was around for a while ). Why is that? Don't you think that the implementation ease of Google Coop can make it more useful than Alexa?
  4. How do you think Google Coop and Yahoo pipes can be used in conjunction?



Hoping to hear from you


Harish TM


And here is John's reply:

Hi Harish

I am not a developer, so I'm not the best person to answer. I've not written GC off, just waiting to see something neat. Pipes - read Tim O'Reilly on it. I like the idea a lot.
------------------

John Battelle

Chairman, Federated Media: http://federatedmedia.net



Great... So I have mailed Tim O'Reilly about it... Let's hope he replies...


I also sent a similar mail to Erin Brenner, Copy Chief & Associate Editor, ClickZ * Incisive Media Plc., and here is the response:

Hi, Harish. Thanks for your e-mail. However, you'll need to direct your
query to one of our Experts for an answer:
http://www.clickz.com/showPage.html?page=experts.

Kind regards,
Erin


Erin Brenner
Copy Chief & Associate Editor
ClickZ * Incisive Media Plc.
erin@clickz.com * www.clickz.com

Mailed them too... Let's see what they say...

And to Krishna Kumar, Project Manager, Ivesia Solutions, who gave some interesting insights:

Hi Harish
Nice to see your email. Hope you are doing fine. I think one way of building a community is to add more content to your blog about Yahoo! Pipes and then use Technorati, Digg and Google Blogsearch to drive traffic to your site. You could create a Yahoo or Google Group to create the developer community and provide a link on your home page.
Regards
Krish
Visit me at http://www.krishami.com & http://krishami.blogspot.com



Thanks Krishna.

So here is the ( so far empty ) Google group - http://groups.google.com/group/yahoo-pipes---community-development
Yahoo! Pipes does not require any technical expertise, so please sign up even if you have none of that...

I intend to add further content about Yahoo! Pipes on that group to avoid making this a Yahoo! Pipes site.

Also, if you are interested in co-managing that group, please let me know - additional perspectives will, I am sure, be an advantage.







Sunday, February 11, 2007

Yahoo! - Pipes

Yahoo! Pipes is an interesting new offering from Yahoo! that allows users to aggregate inputs from various sources. The Pipes intro page describes the service as:

Pipes is a hosted service that lets you remix feeds and create new data mashups in a visual programming environment. The name of the service pays tribute to Unix pipes, which let programmers do astonishingly clever things by making it easy to chain simple utilities together on the command line.
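
The "chain simple utilities together" idea can be sketched with Python generators, where each stage takes a stream of feed items and hands a new stream to the next stage. This only illustrates the concept and has nothing to do with how Yahoo! Pipes is built; the stage names and the feed items are made up.

```python
from itertools import chain


def merge(*feeds):
    """Source stage: combine several feeds into one stream of items."""
    return chain(*feeds)


def only_matching(items, keyword):
    """Filter stage: keep items whose title mentions the keyword."""
    for item in items:
        if keyword.lower() in item["title"].lower():
            yield item


def titles(items):
    """Output stage: reduce each item to its title."""
    for item in items:
        yield item["title"]


def run_pipe(source, *stages):
    """Feed the source through each stage in turn, like cmd1 | cmd2 | cmd3."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return list(stream)


if __name__ == "__main__":
    # Hypothetical feed items, for illustration only.
    feed_a = [{"title": "Yahoo! Pipes launches"}, {"title": "Search news"}]
    feed_b = [{"title": "First impressions of Pipes"}]
    result = run_pipe(merge(feed_a, feed_b),
                      lambda items: only_matching(items, "pipes"),
                      titles)
    for title in result:
        print(title)
```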

[update: More on this in this post]

You can read more about the initial reactions here, here and here. In fact the new service was so popular that it went down within the first couple of hours.

I played around a little with Yahoo! Pipes and it looks like a potentially powerful tool. Unfortunately most users seem to be more keen on having their own application than on creating something of real value. I guess we will have to wait for the initial buzz to die down before we start seeing really useful applications.

Consider this - one of the first and most popular Pipes. According to the creator:

This Pipe takes the New York Times homepage, passes it thru Content Analysis and uses the keywords to find Photos at Flickr.

If you actually run the Pipe, the only thing you notice is that it is possible to extract pictures of semi-naked women from Flickr, no matter what the context. This again might be due to a lack of innovation on the part of the creators.

The same thing goes for using other widgets - the problem that needs to be addressed is that the sheer volume of adult content on the web is likely to drown out any attempt to generate meaningful content without a higher degree of control.

I have created a Yahoo account:

publicpipes@yahoo.com
password: teststuff


It should be fun to have readers create pipes together.

So go ahead, create your pipe, see what others are working on, and let's hope a combined service comes out of this.



Saturday, February 10, 2007

Won't the real Web 2.0 please stand up

Don't you often get the feeling that an idea or a concept is just there on the tip of your tongue but you just can't put it into words? Well I guess Web 2.0 is one such concept. Here is a great video that brings out exactly what this is all about.





Reply: Blog Feedback OR How I learnt to stop Blogging and start Thinking

Now that I think about it, posting an entry with feedback without actually adding my comments to it was kind of lazy. I thought it best to add another post with my thoughts rather than just respond with a comment.


First of all, I must confess that this is not the kind of blogging style that I am comfortable with, and hence my comments are coloured by my personal discomfort. It is possible that for the target audiences that your blog is aimed at, the points that I raise are not applicable.
I have thought a little about what my target audience should be, and the only answer I seem to have is: everyone. Searching is something that all Internet users do at some point of time, and the idea is that this blog should ideally go into the different aspects of search, some technical and others not so technical, but all relevant. So your points are definitely applicable and much appreciated.


* Your blog posts don't clearly define, or explain, much of the new terms and jargon they introduce. Rather, they coolly and unceremoniously point to links. Moreover, it often happens that visiting the link pointed at only serves to confuse rather than clarify because it goes to a very general page.

The high linkedness of your page is to be appreciated (and it's probably one of the desirable features in modern blogging) but I am always more contented with a blog post that makes complete sense even without a person bothering to follow any of the links. Of course, this may further depend on the background of the person. For instance, a person who is anyway up-to-date on the latest products and services offered by Google and Microsoft may not have to follow many of the links. Depending on the kind of audience you are aiming at, I think you should seek to make the blog post self-contained for that audience (or at least, the person's understanding of one sentence should not depend on having gone to a link in an earlier sentence).

I think the problem here is that when you read about something day in and day out ( as I am forced to do about search ), you tend to assume that people know about the things you repeatedly read about. I will work on getting around this.

About links: very often I link to blogs which have different opinions, and to pages that go into a level of detail that I cannot possibly afford to go into. To me this is just a way to show where I am getting my facts from and what has led me to say the things that I do. In other words, I guess this blog is an attempt to talk about my understanding of and views on certain things, and I think it only fair to link to alternative views. However, if blog entries are incomplete content-wise, that is something I will have to work on.


* Your blog posts are too short. Again, a matter of personal taste. I prefer longer blog posts. Some of my own get too verbose, of course. What I'm really talking about in length is not the number of words but the internal structure: an introductory gambit that introduces the ideas to be discussed, a main body where several ideas are discussed, and then some conclusions (for longer ones, many iterations between ideas and conclusions). A few twists in between. If you're aiming at a "news blog" I think this one does the trick reasonably well. But I think that as a "trend analysis" blog this doesn't quite meet up to the mark. You seem largely keen to state some words, drop some names, point to some sources, make some grand statements and quit with a "wait-n-watch".
Frankly, I am not aiming at a news blog. That would just be redundant; I think there are enough news sources across the net. Also, I want to avoid repeating things that a whole lot of people have said already. When talking about a product, I could of course go into the details of what that product is - but then again, every new product results in thousands of such blog posts, so what is the point of yet another "summary of About pages" blog? I think the idea is that a reader is introduced to a product or a concept, pointed to places where he or she can get details of that product ( if they don't know about it ) and then exposed to what I think about it.



* I think you should completely avoid hyperlinks in your conclusion, and you should specially avoid conclusions like "I don't know ..." specially if you blog post started out with "We shall determine whether ..."
Some posts are observations regarding trends - such posts cannot end on anything but a "let's see" note. I don't think there will be too many instances wherein a blog post tries to determine anything. I guess I just have to be more careful with my wording.


* Avoid saying things like "anyone who has..." or "it is obvious that..." I know I do that sometimes too but that's my bad habit :)


Will do...

* A question: what kind of comments do you really seek? Do you hope that people will follow all the links, read all the articles/go through the websites, and get back with comments? For the kind of posts that you have written, the comments you get will largely be meaningless because the general class of people who try to comment sensibly will first try to follow all the links so they'll be exhausted by the time they return.

Again, one of the reasons I provide so many links is that when reading blogs I often come across things I have not previously heard of. Even if the rest of the blog post does not directly depend on that concept/product, I often like to have links that I can just open in another tab for later reading. So no, I do not expect one to follow all the links before commenting. The links are like, "I think A sucks, A was started by this guy called B and if you want to read about B here is a link", or "A is a product that allows you to do this and I think this product is <rest of the blog post>, if you want to know how to use the product here is a link that might help you out".

