Matt Cutts #10: Lightning Round!

Here’s the tenth in the series of videos posted by Google’s Matt Cutts to Google Video over the past year. These are important for every web developer to see. Please start with Matt’s first video!

See the rest of the videos!

Transcription

Alright. This is Matt Cutts, coming to you on Monday, July 31st. This is probably the last one I’ll do tonight, so let’s try to do a lightning round.

Alright! Peter writes in and says:

“Is it possible to search for just home pages? I tried doing -inurl:html, -inurl:htm, blah blah blah, php, asp, but that doesn’t filter out enough.”

That’s a really good suggestion, Peter. I hadn’t thought about that.

Fast used to offer something like that, but I think all they did was look for a ~ in the URL. I’ll file that as a feature request and see if people are willing to prioritize it so we might be able to offer it. My guess is it would be relatively low on the priority list, because the syntax you mentioned, subtracting off a bunch of extensions, would probably work pretty well.
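For example, a query along the lines Peter describes (purely illustrative, with a made-up search term; the exact list of extensions you subtract is up to you) strips out URLs containing common page extensions, which tends to leave directory-style and home page URLs behind:

    widgets -inurl:html -inurl:htm -inurl:php -inurl:asp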

Ah, I get to clarify something about strong versus bold, and emphasis versus italic. There was a previous question where somebody asked whether it was better to use bold or better to use strong, because bold is what everybody used in the old days when the dinosaurs roamed the earth, and strong is what the W3C recommends. At that time, last night, I thought we just barely, barely, barely, like by an epsilon, preferred bold over strong, and I said, for the most part, don’t worry about it.

The nice thing is an engineer actually took me to the code, where I could see it for myself, and Google does treat bold and strong with exactly the same weight. So thank you for that, Paul, I really, really appreciate it. In addition, I checked the code, and it shows that em and italic are treated exactly the same as well. So there you have it: go forth and mark up the way the W3C would like you to, do it semantically, and don’t worry so much about crufty old tags, because Google will score it just the same either way.

Alright. In the lightning round, GoodmanAmanaHVAC asks:

“Will we see more kitty-posts in the future?”

I think we will. In fact, I tried to get my cats in on this show, but they are a li’l scared of the lights. Let’s see if I can get them used to it.

TomHTML asks:

“What are Google SSD, Google GAS, Google RS2, Google Global Marketplace, Google Weaver and other services discovered by Tony Rusco?”

I think it was very clever of Tony to try a dictionary attack against our services sign-in, but I’m not going to talk about what those services are.

What else have we got here?

Josef Humpkins asks:

“A preview of what many of the topics might be in the duplicate content session at SES.”

I gave a little bit of a preview in one of the other videos. But I think what we’ll basically talk about is this: Sherry will be there, a lot of people will be there, and we’ll talk about shingling.

What I’ll essentially say is, Google does a lot of duplicate detection, from the crawl all the way down to the very last millisecond, practically when the user sees things. We use exact duplicate detection and we use near-duplicate detection. So we do a pretty good job all the way along the line of trying to weed out duplicates and things like that. And the best advice I can give is to make sure that pages which might have nearly the same content look as different as possible, if they truly are different content.
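For readers wondering what “shingling” means here: it’s the standard near-duplicate trick of breaking a document into overlapping runs of words and measuring how much two documents’ sets of runs overlap. A minimal sketch in Python, purely illustrative and certainly not Google’s actual pipeline, might look like this:

    # Illustrative w-shingling near-duplicate check (not Google's implementation).
    def shingles(text, w=4):
        """Return the set of overlapping w-word runs ("shingles") in a document."""
        words = text.lower().split()
        if len(words) <= w:
            return {" ".join(words)}
        return {" ".join(words[i:i + w]) for i in range(len(words) - w + 1)}

    def jaccard(a, b):
        """Jaccard similarity of two shingle sets, from 0.0 (disjoint) to 1.0 (identical)."""
        if not a or not b:
            return 0.0
        return len(a & b) / len(a | b)

    def looks_like_near_duplicate(doc_a, doc_b, threshold=0.9):
        """Treat two documents as near-duplicates when their shingle sets mostly overlap."""
        return jaccard(shingles(doc_a), shingles(doc_b)) >= threshold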

A lot of people worry about printable versions, and somebody else asked about a .doc or Word file compared to an HTML file. Typically you don’t need to worry about that. If you have similar content on different domains, maybe one version in French and another version in English, you really don’t need to worry about that either.

Again, if you do have the exact same content, maybe for a Canadian site and for a .com site, it’s probably just the sort of thing where we will detect whichever one looks better to us and just show that, but it wouldn’t necessarily trigger any sort of penalty or anything like that. If you want to avoid it, you can try to make sure that the templates are very, very different. But in general, if the content is quite similar, it’s better just to let us show whichever representation we think is the best anyway.

And Thomas writes in and says:

“Does Google index or rank blog sites differently than regular websites?”

That’s a good question.

Not really. Somebody else asked about links from .govs and .edus, and whether links from two-level-deep govs and edus, like .gov.pl, are the same as .gov. And the fact is, we don’t really have much in the way of code that says, oh, this is a link from the ODP, or from .gov, or .edu, so give it some sort of special boost. It’s just that those sites tend to have higher PageRank, because more people link to them and reputable people link to them.

So for blog sites, there’s not really any distinction, unless you go off to Blog Search of course, and then it’s all constrained to blogs. In theory, we could rank them differently, but for the most part, just in general search, the way it crawls out, things are working out okay.

Alright! Thanks.

Transcription thanks to Peter T. Davis
