Microsoft Busily Using Patent Threats to Increase Their Search Traffic

Yesterday, Microsoft and Linspire Inc. announced a deal involving the small Linux vendor. Linspire develops and sells the Linspire distribution of Linux, marketed as ‘The World’s Easiest Desktop Linux’.

Under the terms of the deal, Microsoft has required that Linspire set the Windows Live search engine as the default web search in all future copies and releases of the Linspire OS. In return, Microsoft will grant Linspire permission to use TrueType fonts and the Windows Media codecs, and has agreed to waive their right to patent litigation against Linspire Inc., along with the users of their operating system.

This move is bound to make some waves in the Search industry for the two things it signifies:

  1. Microsoft is beginning to really throw weight behind their search product. OK, this isn’t exactly earth-shattering, but the point is that through their patent-backed extortion practices, Microsoft could have gotten just about anything they wanted. However, all they asked for was the default search. What does this imply for the future?
  2. It’s an implicit recognition of the potential threat Linux poses to Microsoft. The searches generated by Linspire users before they reset the default search to Google probably will not make up an overly significant portion of Microsoft’s search volume. On the other hand, this seems like just the first in a series of battles Microsoft can fight against corporate-backed Linux distributions. Will Microsoft be cutting similar search deals with other corporate-backed Linux vendors?

Linspire is a distribution with a lot of potential. It is not aimed at the geek crowd; rather, it seeks to imitate Windows to a certain extent in order to make the switch to Linux as simple as possible for businesses and individuals. In real-world terms, that means its users are average business people - salesmen, secretaries, and the like. Such users have been using Internet Explorer and Microsoft’s search regularly; they would probably happily continue to use Windows Live search if it came as the default on the Linspire OS.

This is not Linspire Inc.’s first encounter with Microsoft. Back in 2002(?), Microsoft filed suit against the company, forcing them to change their name from ‘Lindows Inc.’ as part of a $20 million settlement. That likely made them an easy target, since they already knew the legal weight Microsoft could throw at them.

Dreamhost leaks 3,500 FTP passwords

**** Update: After all this Dreamhost mess, I’ve decided to abandon ship and move to Lighthouse Technologies for hosting, since I know the owner and can vouch that he is solid. His best plan is $16/mo, but it’s bound to be more reliable and secure. If you want to get hosting with Lighthouse, please consider using my affiliate link! ****

I just received this email from Dreamhost. It seems that they’ve leaked 3,500 FTP account passwords somehow.

That explains a lot - about two weeks ago, someone used my password to upload tons of spam links to my sites. At the time, I contacted Dreamhost about the problem, and they assured me that their servers were secure and that it *must* be my problem. Looks like it wasn’t me.

From: DreamHost Security Team
Subject: URGENT: FTP Account Security Concerns…

Hello -

This email is regarding a potential security concern related to your
‘XXXX’ FTP account.

We have detected what appears to be the exploit of a number of
accounts belonging to DreamHost customers, and it appears that your
account was one of those affected.

We’re still working to determine how this occurred, but it appears
that a 3rd party found a way to obtain the password information
associated with approximately 3,500 separate FTP accounts and has
used that information to append data to the index files of customer
sites using automated scripts (primarily for search engine
optimization purposes).

Our records indicate that only roughly 20% of the accounts accessed -
less than 0.15% of the total accounts that we host - actually had
any changes made to them. Most accounts were untouched.

We ask that you do the following as soon as possible:

1. Immediately change your FTP password, as well as that of any other
accounts that may share the same password. We recommend the use of
passwords containing 8 or more random letters and numbers. You may
change your FTP password from the web panel (“Users” section, “Manage
Users” sub-section).

2. Review your hosted accounts/sites and ensure that nothing has been
uploaded or changed that you did not do yourself. Many of the
unauthorized logins did not result in changes at all (the intruder
logged in, obtained a directory listing and quickly logged back out)
but to be sure you should carefully review the full contents of your
account.

Again, only about 20% of the exploited accounts showed any
modifications, and of those the only known changes have been to site
index documents (ie. ‘index.php’, ‘index.html’, etc - though we
recommend looking for other changes as well).

It appears that the same intruder also attempted to gain direct
access to our internal customer information database, but this was
thwarted by protections we have in place to prevent such access.
Similarly, we have seen no indication that the intruder accessed
other customer account services such as email or MySQL databases.

In the last 24 hours we have made numerous significant behind-the-
scenes changes to improve internal security, including the discovery
and patching to prevent a handful of possible exploits.

We will, of course, continue to investigate the source of this
particular security breach and keep customers apprised of what we
find. Once we learn more, we will be sure to post updates as they
become available to our status weblog:

http://www.dreamhoststatus.com/

Thank you for your patience. If you have any questions or concerns,
please let us know.
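If you want to follow their advice on random passwords, a quick way to do it is with Python’s standard secrets module (a minimal sketch - pick whatever length suits you, as long as it’s 8 or more):

```python
import secrets
import string

def random_password(length=12):
    """Generate a random password of letters and digits,
    per the email's '8 or more random letters and numbers' advice."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(random_password())  # prints a 12-character random password
```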

Ok, time for me to share a little hidden annoyance I have.

Occasionally, some of the people who subscribe to my blog via RSS decide to link to me in blog posts of their own. Thanks for that, all of you - I really appreciate that something I’ve written has given you something to ponder, and you thought it was worth the time to link to me, or write about the subject I brought up.

However, instead of using the Feedburner URL of the post you are linking to, why not just link back to the post itself? I.e., instead of linking to:

why not link to:

Yes, I know that the Feedburner URL redirects to the second one. Still, this is more or less using a ‘link condom’ when linking to the site; the Feedburner URL is not just a simple 30x redirect, and it performs a decent amount of processing for their feed statistics before sending you on to this site. Why not just link directly and spread the link love?

Of course, if there is a certain benefit, or thought behind it, let me know in the comments - I am curious about positive uses for such linking!

SeoQuake Extension for Firefox: Cool, but Broken

For the last year or so, I have been regularly using the Search Status and SEO for Firefox browser extensions as part of my daily routine for researching competition, SERPs, and sites.

On the recommendation of David Ogletree, I recently installed SeoQuake, just to test it out and see whether it fit my workflow better than the two extensions I am currently using. So far I am reasonably impressed, apart from a few small issues; namely, a number of the link checks performed by the tool are broken.

SeoQuake example of broken functions

As you can see in the image above, a few of the functions are broken. In both the site-specific bar and the SERP-checking functionality of SeoQuake, the Yahoo Link and Link Domain queries always return 0 results. This is clearly an error, since a quick check of the sources shows that both return a few thousand backlinks. Additionally, the MSN link check always returns an error.

This leaves SeoQuake severely handicapped until these bugs are fixed; hopefully we won’t have to wait too long for the next release.

On the positive side, once these bugs are fixed, I will likely include SeoQuake in my regular toolset. Although the SERP information returned by SeoQuake is not quite as comprehensive as that returned by SEO for Firefox, it loads quicker and has a useful per-site caching system: if the same site appears multiple times within different SERPs, its information and statistics are only fetched once. This leads to quicker results when researching similar SERPs within the same niche.
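That caching trick is simple but effective. As a rough sketch (the extension’s internals aren’t public, so the function and site names here are hypothetical), per-site caching works something like this:

```python
# Minimal sketch of per-site caching: each site's stats are fetched
# once and reused across SERPs. fetch_site_stats is a hypothetical
# stand-in for the real backlink/PageRank lookups.

fetch_count = 0

def fetch_site_stats(domain):
    """Pretend to query backlink counts, PageRank, etc. for a domain."""
    global fetch_count
    fetch_count += 1
    return {"domain": domain, "backlinks": 1234}  # placeholder data

_cache = {}

def site_stats(domain):
    """Return cached stats if this domain was already seen in an earlier SERP."""
    if domain not in _cache:
        _cache[domain] = fetch_site_stats(domain)
    return _cache[domain]

# Two SERPs for related queries share a domain; it is fetched only once.
serp1 = ["example.com", "foo.net"]
serp2 = ["example.com", "bar.org"]
for domain in serp1 + serp2:
    site_stats(domain)

print(fetch_count)  # 3 distinct domains, so 3 fetches instead of 4
```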

Overall, SeoQuake is quite a cool tool, and I think it will totally rock once they get those bugs worked out.

Every Photo Needs a Caption…

What would you suggest for this one? Hope the guy’s girlfriend / wife doesn’t use Google Street View!

Google Maps - Guy Checking Out Girl

For more detail and zoom, !

Killing Two Memes with One Stone

In the last few days, I have been tagged into two separate memes, so here I am, killing two birds with one stone:

My Nerd Score

CShel has just tagged me into the Nerd Score meme. Feeling somewhat apprehensive, I took the test, and it turns out that I am slightly less nerdy than CShel. At the same time, had some of the questions been asked differently, I am sure I would have come up higher - questions like ‘How many programming languages do you know?’ (6-10) and ‘What was your first computer, and at what age did you get it?’ (Tandy 2000, at age 10).

Google Search Trends

Jason Bartholme, the king nerd, has .

If you have a Google account and are signed in, you can view your search history by clicking on Web History, then Trends. There you will see three top-ten lists: your Top Queries, Top Sites, and Top Clicks. Below that is the fun stuff. Here are my numbers:

My Hourly Search Trends
My Daily Search Trends

My numbers are obviously lower than most others’; I suspect that is because I use the Google search bar in Firefox rather than the Google.com homepage. The results returned by that box are actually an AdSense for Search results page, in which the advertising revenue goes to the Mozilla Foundation.

Tagging others:

In return, I would like to tag the following for both memes:

  • Dave Zuls
  • Scott Horne
  • Shandy King
  • Abhishek Tripathi
  • David Ogletree - yes, I know JB also tagged him, but I’m just turning up the pressure!
  • Richard Ball

I could keep going, but I will spare the rest the pain of yet another meme :p

Perhaps Google’s algorithm isn’t as difficult as we all think?

No, I haven’t been sitting in front of the microwave for too long again. Before you rip me to pieces, give me a few seconds to explain myself!

Possible Technology Limitations

Now, we all know that Google has one of the largest server farms in the world, estimated at upwards of 250,000 individual servers spread worldwide. In spite of this, many people lose sight of the fact that Google only has a finite (albeit large) amount of resources.

If we estimate that Google crawls 100 million+ new pages per day, they are likely to encounter a billion or more new links on a daily basis. I think it is plausible that, given the ‘100 factors’ supposedly composing the algorithm, Google may find itself running short on server power while crunching all the incoming data. For example, many of the ‘factors’ assumed to influence an outgoing link’s value depend on characteristics of incoming links; this can continue recursively back through many layers of the page hierarchy. Links are only one example of hard-to-crunch data; undoubtedly there are more costly factors to take into account.

Additionally, one needs to consider the latency of transmitting data between server farms located on different continents. For instance, data transmitted from Eastern Asia would likely take 100 ms or more to reach the continental US. Since page information is likely distributed among the various server farms, there could be significant transport delays involved in gathering the data for a larger algorithm.

Remember that a certain proportion of Google’s server farm is not dedicated to the ranking algorithm; much of the hardware holds the finalized results that they serve out. Not only that, much of the hardware holds duplicate information: for instance, numerous data centers serve identical information to search requests in the United States, and a similar situation exists in most foreign countries.

Geniuses and ‘Good’ Algorithms

Cringely’s recent article on PBS once again brought one important fact to the forefront: Google is composed of genius engineers and computer scientists. Every computer scientist knows that the ‘best’ algorithms are the ones that solve the largest number of potential cases in the fewest steps, in the simplest fashion possible.

A well-designed algorithm conveys a sense of beauty to a computer scientist; there is nothing like taking a huge, ugly algorithm written quickly to solve a problem and refining it into a short, effective, fast piece of work. A simple but effective algorithm has an elegance about it that is recognized by all who work with it.

Conclusions

Given the makeup of Google’s employee body, I suspect work is constantly being done to simplify the Google algorithm while maintaining its current level of effectiveness, and I believe it is quite possible that the algorithm now in place is much simpler than we have been led to believe. There is also a financial benefit to a ‘simple’ algorithm: cutting down on machine time lets Google get more use out of the same hardware, which has obvious financial implications.

What are your thoughts? Personally, this is just a theory: until we know better, I am going to stick with my mental picture of the ‘big’ algorithm and all the various on-page and off-page factors we traditionally assume Google looks at.
