I filed a bug report about this 3 months ago. I completely agreed that this is not likely to be a serious security issue (only a few indexed emails will contain phone numbers or private information, and truly private content is even less likely, since Googlebot must have found these links publicly online, though some message bodies did hint at stuff like password resets).
It is worth looking into from an SEO perspective, site (index) health perspective and to ultimately prevent/minimize problems like these.
In my opinion these links do not belong in the index. https://mail.google.com/mail/ should have been the Canonical. Big companies expose the contents of SMS messages, who contacts who, and sometimes even what their users search for.
Now these pre-filled "to"-field links were picked up by accident with the automated sitelinks algorithm. It could have contained a pre-filled "body" too, maybe some spam or maybe a crude online link to make an appointment and have certain fields filled in.
Webmasters can prevent this by specifying a canonical URL and by cultivating search index quality: let bots index only unique, quality pages. For example, don't expose pages and pages of user-generated search results (consider noindexing /search/results/) or open redirect links (spam), and don't let every possible variation of your URL parameters get indexed just because someone posted such a link somewhere for the search bots to pick up.
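A minimal sketch of the canonicalization idea above, in Python. The parameter names in `PREFILL_PARAMS`, the `/search/results/` rule, and the example URLs are assumptions for illustration; this is not how Gmail or Googlebot actually handle these links.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical query parameters that only pre-fill a form and do not
# identify a distinct page (an assumption made for this sketch).
PREFILL_PARAMS = {"to", "cc", "bcc", "su", "subject", "body"}

# Hypothetical path prefixes that should stay out of the index.
NOINDEX_PREFIXES = ("/search/results/",)


def canonical_url(url: str) -> str:
    """Return the URL with pre-fill parameters and the fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in PREFILL_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


def robots_meta(url: str) -> str:
    """Return the robots meta value a page at this URL might serve."""
    path = urlsplit(url).path
    if path.startswith(NOINDEX_PREFIXES):
        return "noindex, follow"
    return "index, follow"


compose = "https://mail.google.com/mail/?to=someone%40example.com&body=hi"
print(canonical_url(compose))
print(robots_meta("https://example.com/search/results/?q=gmail"))
```

The value from `canonical_url` is what a page could put in its `<link rel="canonical">` tag, so that every pre-filled variant points back to one indexable URL.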
I have so many questions after finishing that article, but first to come to mind is: who puts an HTML FAQ in their e-mail signature? Seriously, who does that?
I was going to ask the same about googling "gmail", clicking the second link, and actually sending a blank email. But apparently the answer is "thousands of people" a day.
It's probably healthy for us "web people" to get a reminder every so often that vast swaths of humanity use computers in ways that are nearly incomprehensible to us.
The world's fastest typist, Sean Wrona, would abhor that - he uses capslock instead of shift(!) He can go 200WPM+ so, clearly he's doing something right...
I assume there's a point in overall speed where the slowdown from caps-up/down -> letter-up/down -> caps-up/down is worth it to avoid the reliability drop from coordinating shift-down -> letter-up/down -> shift-up. I think I hit that point sometimes entering passwords and other frequently entered info, but the reality is for most typing none of us get there.
Seriously, the number of wrong turns you had to take to get there... wtf? I kind of feel like from a UI perspective, you can try to make your site as usable as possible, but someone will always run into the dumbest edge case or contact you to ask questions which are clearly answered on the site.
Yikes! At first glance I thought those were questions to Mr. Peck and thought it was fine, but if that's the signature.... It's a little presumptuous, isn't it?
It's my email FAQ. I get a lot of email - it's my largest social network containing 1000's of connections. I came up with the idea of the FAQ to head off having to answer the same questions over and over again. I used to have it as a list of social networks (twitter/skype/phone, etc.) but people ignored that. And that didn't allow me to give the details the FAQ does. The FAQ catches people's attention better than the traditional signature. It has cut down on those kinds of questions by ~70-80%, I'd say. So maybe it's presumptuous, but it was the more successful of my A/B testing. And it's not as noticeable when there's a big ol' email on top of it. So WHO does something like that?!! Me. I do.
With respect, the very people you are covering for a living probably find that signature obnoxious. Speaking for myself, I certainly would; I never thought I'd see an e-mail signature worse than a paragraph of unenforceable legal disclaimer, but alas, I have.
One thing you've probably overlooked is that a significant majority of mail agents (a) discard HTML, making your carefully-formatted signature render in ways that you have not anticipated or dropping it altogether, and/or (b) do not thread and hide quotes the same way yours does. In addition, you're now making me download and potentially display that signature every time you reply to me. On mobile, this can be a big deal in a deep thread where 70% of the data in a message could end up being your repeated signature.
I'm assuming that your workflow is top-posting and letting Gmail hide the repeated signature from you. That gets really unworkable, really quickly in any client that is not Gmail, particularly with a huge HTML signature that is repeated on every message. Since one of the primary topics of your publication is engineering types, you might find that this signature is doing you more harm than good with them. (Though I grant that you probably correspond more with PR and management people, who love Comic Sans signatures.) I don't think your A/B testing can account for that. There is a far better solution in Gmail's canned responses, which really should be promoted from a fucking Lab already:
The other thing you probably haven't accounted for is that I very quickly pick up on someone being annoyed that I've e-mailed them. Your signature basically says "you are an inconvenience to me," by demonstrating that you've put a lot of thought into lowering the questions that you answer. I feel the same way about "Before you e-mail me, please read this:" and the other things overworked e-mailers ask me to do as a method of transferring work. I generally end up in a "oh, you don't want me to e-mail you? Okay!" attitude when dealing with this sort of thing.
The signature lightly betrays that you do not generally value engaging with the people that e-mail you and turns me off to you, which might be hurting you with sources. We all hate e-mail, yet the best e-mail correspondents find ways to manage the workload without betraying that their conversational partner is an inconvenience to their life. Seeing as dealing with people via correspondence is probably half of your chosen profession, I'd hope that's not the message that you intend to convey.
The new canned responses take 2-3 clicks to get one answer in. That would add a lot of unnecessary time wasted each day. Perhaps there is a middle ground for Sarah that contains a link to an updated page with this info?
I imagine far fewer people would get the answers if a click was required. Pretty basic user tendencies.
Honestly, I don't get the irritation of the parent comment. I'm drowning in email....and I'm not a public figure. If I saw something like that FAQ, I would just assume the person was flooded with mail, and doing me a favour by answering any common questions.
In Chrome and Firefox, if you lazily type "gmail" into the address bar rather than "gmail.com" then it will google it for you. I suspect it's rather common, I know I've done it a few times by accident.
I have watched people google 'gmail', ignore the autocomplete, and click on the link that comes up in the search. At some point people were told "do it this way" and to change that would require a 30-minute handholding and de-education session.
I consider myself an experienced internet and computer user, but sometimes I simply write "gmail" and then click on the first link instead of writing the ".com". Go figure.
Using search engines this way is also called "navigational search" and it's still how most of the world navigates the web (even sites they know well and use daily!)
It's basically using search as an additional name resolution layer on top of DNS. Because search is more human-convenient, with a layer of semantic processing free from the rigid technical constraints of DNS names. Which is all great, a useful abstraction, until the semantic layer guesses wrong on what the user actually wanted and the abstraction leaks.
While I'm on my phone it's much easier to just open the browser, type "h" in the search box, let auto-complete kick in, press the "hacker news" suggestion and then click on the first result in the SERPs, which gets me in here. Otherwise I would have to start typing "hn..." something in the browser bar, remember that's not what HN's URL actually looks like, curse and revert to using Google.
However, DNS also has an issue for this particular audience. A simple typo could land you on a completely different site, which I think is why navigational searching is more useful for those in a rush or those who can't type. I even find myself doing this at times. If I ever need to get to thesaurus.com, I usually just type t, h, e, s and then mash the letters a, u, and r a few times and then end it with an s. It isn't that it is hard to spell, but I just find it easier, and Google seems to almost always get what I am trying to type so quickly that spelling it correctly wouldn't really make it much faster.
This is truly hilarious. Just imagine: some guy, in the middle of nowhere, sitting in front of his computer, and suddenly his smartphone starts vibrating with a notification about a new email. And vibrating.... and vibrating... and after dozens of minutes, he gets a call. And the caller asks: Hey, haven't you been getting some emails lately? :D
What I really want to know is who the hell actually googles "gmail"? There's a link at the top of the homepage!
You'd be surprised at what some people do (on laptops and desktops). I've seen ones who refuse to type URLs in the address bar, with the sole exception of google.com, which they then use to type in the URL for the site they're looking for.
They don't even realize the search bar on the top right exists.
I know someone who uses the search bar in the top right to google for "google.com", which they then use to search for "yahoo", then having arrived at yahoo's front page, they search for "mail" to get to yahoo mail. Apparently typing "mail.yahoo.com" into the address bar (or just typing "mail" and having it auto-complete) is "too complicated"...
(I presume that this sort of person is who the GNOME UI team think of when they talk about "average user"? :|)
I've watched people go to the Firefox homepage (which is Google), google "google", then click on a sponsored ad. I've also seen people go to Bing, bing "google", go to Google, google "hotmail", and then go to Hotmail.
I have to say: I've done user testing with (probably hundreds of) users over the years, and I'd say the majority (anecdata) of users never type a URL or use a bookmark - typing 'facebook' or 'gmail' into Google is a very common thing. Incomprehensible as this may be to us nerds, it's a very real thing.
I never type website addresses in the browser :) I just type phrases like "gmail", "hacker news" or "stanchart india" in the Chrome/Firefox omnibox and then click the first/correct search result. I kind of use Google search as a way to avoid typos in addresses that may lead to phishing sites.
Finding the search box on google.com, typing gmail and typing enter and then clicking on a large link in bold is a lot easier than clicking successfully on a small subtle link on the top of page.
It's also a lot less error prone than typing gmail.com into the address bar, which could be mistyped as gail.com, mail.com, etc.
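To put a number on how close those typos are, here is a rough Python sketch that computes the Levenshtein (edit) distance between the intended domain and each mistyped one. The function name `edit_distance` is just a label chosen for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete ca
                    curr[j - 1] + 1,     # insert cb
                    prev[j - 1] + cost,  # substitute, or keep a match
                )
            )
        prev = curr
    return prev[-1]


for typo in ("gail.com", "mail.com"):
    print(typo, edit_distance("gmail.com", typo))
```

Both examples are a single dropped character away from gmail.com, which is exactly the kind of slip a search engine's spelling correction tends to absorb and the address bar does not.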
Different people have different habits. I generally google most sites unless it's an easy URL or autocompleted. I sometimes even type things like "hack new" on my phone to get here.
(I don't want to make it too easy to get here, if you're wondering why I haven't bookmarked it. Can be a time sink)
It really depends where the crawler is finding these links. I don't think this is a mistake. It's possible that people are adding these links to their site instead of "mailto:" links and the crawler is simply indexing them.
It's not the only one overall, but this one likely ranks the highest. It is also bundled with Gmail because it links to the Gmail domain (mail.google.com).
When I search it tries to link me directly to an attachment from an email, but tells me my account is suspended. I am already logged into Gmail though, so it might be different.
Well, hopefully nothing that dire - I can't imagine this would happen. Likely though those responsible for the incident will craft an incident report detailing what went wrong and why, as well as how to prevent it in the future - preferably in an automated fashion.
There is no way Google (or I assume any good employer) would fire an employee for such a mistake. Mistakes happen, they are a part of life in the technical world. You work really hard to make sure the mistakes are minor (which this one is) and fixable, but they are going to happen.
Losing data is really really bad, and if it happens it better be minor and there better be extreme circumstances. If a service goes down, that is bad but shit happens sometimes. Something minor like this is going to make for an entertaining post-mortem, but nothing more. I'm sure Google will do something nice for the person who was affected by it.
Easy rule of thumb for operations: Never make the same mistake twice. You learn from every mistake, but new and amazing ways for things to break are always going to happen.
Source: I worked at Google in operations and made mistakes that were far worse than this
Firing people for making mistakes is a surefire way to ensure that when someone makes a mistake they will do everything they can to cover it up rather than take appropriate action to correct it.
There's also an argument that it makes sure other people will be more careful with their work to try to avoid mistakes. I don't think it's the way to go, but there certainly are managers in the world who do. (Hopefully a small minority.)
Besides, this is very likely the result of several things all going wrong at once ("a perfect storm"). It's not like there's some dude that's responsible for the "Email" link, who looked at the link and said "yep, that's about right" and saved it to google.
It's not limited to a small number of Gmail accounts. If you use some additional search parameters you'll see a ton of compose URLs showing up in the SERPs
Searching:
site:mail.google.com gmail inurl:?to=
gave me nearly 25,000 results. Some have subject lines too.
I am seeing some private emails with subjects AND body, apparently spam but sometimes half-written content like "Dear John,". Did someone build a spam bot and leave the links out for indexing?
But, to be fair, it is only the 24th of January, so there might well be many more crazy things in store for us. Though Google linking directly to mail some poor Hotmail account will be hard to top, yes.
The search index is built in mysterious ways, not only from crawled links but also visited sites in Chrome I suppose. We'll probably get an in-depth article from an SEO blog.
...he contacted Hotmail support this morning to try to get help. Ironically, he asked them to contact him at his alternate email address, which is Gmail.
+200 for additional bounty to have Gmail do customer support quickly too. (Oh remember the time when everyone had an adsense account? Boy that could take up to two months to get an initial reply.)
For something like self-driving cars I'd have ten times more trust in a company that excels in humility than one like Google with success and experience in data mining.
Hubris and excessive confidence in engineering ability here are a recipe for disaster.
Off topic, but the one thing that I hate about Google search results is that they are all linked through google.com/url?*. Sometimes my internet will crap out just because of this, because Google's server is too slow to redirect.
They certainly do it for organic results sometimes too. A/B testing or measuring quality of results? Annoying to no end if you want to copy a URL directly. I remember looking into it a few years ago and they even went so far as to add JS setting the status bar text to the normal URL on mouseover.
Even if we redirect users to the search result sites, we don't track by the url, but track by number of hits and number of subscriptions as the performance indicators.
> I guess this is for CPC. But it should not be applied on the organic results.
It's been used on all SERPs for months now, ever since Google became extra privacy conscious. It prevents search queries from leaking to site owners via the referrer HTTP header.
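A small Python sketch of what that leak looks like from the site owner's side. The two referrer strings below are made-up shapes for illustration, and the sketch assumes the redirect URL carries no search terms; it does not claim to reproduce Google's real URL format.

```python
from typing import Optional
from urllib.parse import urlsplit, parse_qs


def query_from_referrer(referrer: str) -> Optional[str]:
    """Return the search terms a site owner could read from a Referer
    header value, or None if the URL has no `q` parameter."""
    values = parse_qs(urlsplit(referrer).query).get("q")
    return values[0] if values else None


# Hypothetical referrer when the result links straight to the site.
direct = "https://www.google.com/search?q=gmail+login+help"

# Hypothetical referrer when the click goes through a redirect URL
# that does not include the search terms.
via_redirect = "https://www.google.com/url?sa=t&url=https%3A%2F%2Fexample.com%2F"

print(query_from_referrer(direct))
print(query_from_referrer(via_redirect))
```

With the direct link the landing page can recover the whole query from one header; with the redirect in between there is nothing left to parse.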
I agree with you and 0x0 below. But is the number of query keywords so critical that it cannot be released to the owner? I remember that it was shown to everybody when we typed keywords, and I think that's helpful information for users and site owners.
I know that if the number is released, the site owners can cheat: "Create networks of fake sites to provide backlinks on popular keywords". This is really the problem with current SEO. I have a report discussing that: http://bit.ly/1fcOp8I
I think it has actually been in use for years. That's how they calculate click-through rates for organic results, the kind of report you see in Webmaster Tools.
I don't agree that Google should be the one doing this, most browsers have a setting for referer like off/on/same-domain-only and if someone wants privacy they can set it the way they like.
It's also extremely irritating if you're copying a link to the site from Google.
This reminds me of a story on an Australian comedy show where the host gave out an unusual telephone number, calling it "Satan's phone number". Turns out it was a valid number that belonged to some poor sap who got inundated with phone calls. Unfortunately, I can't seem to find a source after reading this years ago, so if anyone can confirm this as real or a hoax, step up.
But this really makes me wonder what kind of architectural mishap had to occur for this to happen. Either way, the poor guy just became a celebrity. Hope he gets some benefit out of it.
They're trying to reach Gmail for a comment, but they can't, 'cause Gmail is down. It's funny. It's also made me realize that I should switch to an email provider that I have more control over.
if one googles "gmail" and sees the top link opening a compose e-mail to your e-mail address, one probably starts to question reality or if they are in a dream, because it's so insane. it's not like winning the lottery or getting hit by lightning, it seems more rare than that because it's an unexpected event that is tied to you. basically the emotional equivalent of going to times square and seeing your picture on one of the big screens with no explanation. fucking crazy.
The actual behaviour (the link with a random email address) is so far from the expected behaviour I'm finding it hard to believe how it happened. Probably search indexing/ranking error as others have speculated.
Yesterday the GMail App for iOS glitched on me as well. At one point the number of unread emails went to over 2k (in reality about 200), and then notifications for the app got enabled (usually disabled). Nothing major, just minor annoyances.
I didn't think much about it until today's events. Coincidence?
Those are just valid email addresses (and names, I guess). High-quality leads would be a list of people who are already especially qualified or interested in whatever you're pitching.
If they are actually composing and sending an email without checking what's going on, I guess that says something about those users' personalities... :)
It's not a "Gmail Glitch"; it seems like an index error by Google search, where his email was indexed in the sitelinks for "Gmail". I assume that due to the outage searches for "Gmail" spiked, so his inbox was DDoS'd.