Stop obfuscating your email address - a geek mistake (mindscape.co.nz)
50 points by traskjd 2 days ago | 69 comments




27 points by RiderOfGiraffes 2 days ago | link

I have a pet peeve: people thinking that my situation is the same as theirs, and giving me advice that is wrong for me.

I get around 2400 spam a day. I've tried three different spam filters recommended to me by people I trust and who generally know about these things, and they get about 97% to 98% accuracy. Worse, they produce false positives. With 50 spam per day, plus having to trawl through the spam bin to look for the false positives, I decided to write my own.

My spam filter is highly tuned to my traffic and I get about 10 spam through per day. More, there are about 2 false positives per month, although that's very hard to quantify.

The problem with putting up a form is that people want a reply, and yet they can't type their own email address properly. About half the emails I get through my various forms have subtle and not-so-subtle misspellings of the return address, and it can cost me hours to track them down.

No, obfuscated email addresses are still my best tool in this situation.

reply

14 points by TheElder 2 days ago | link

I'm using Gmail for my domain and received zero spams in well over a year. My email address is all over the web. What are you using for your spam filter?

reply

5 points by araneae 2 days ago | link

I don't bother obfuscating either. I do get spam occasionally, but the gmail algorithm learns so once I mark a particular type of spam as spam it never comes up again.

reply

5 points by Tichy 2 days ago | link

I am less worried about receiving spam and more worried about not receiving good mails. Are you sure that Gmail's spam filter has zero false positives as well? If you are sure, how do you know? I can not check my spam folder manually because it contains thousands of mails.

reply

5 points by TheElder 2 days ago | link

At one point I was checking it religiously but I've come to trust it after never finding a legitimate mail in my spam folder after a long period of time. I still check it from time to time, and again, I don't find legitimate mail tagged as spam.

reply

3 points by ugh 2 days ago | link

Most confirmation mails (and the like) end up in my Gmail junk folder. E-mails from normal people never do. And for me it has just become an automatic process to look in the junk folder for those.

reply

1 point by RiderOfGiraffes 2 days ago | link

I can't use gmail because I have the requirement to create email addresses on-the-fly. I have my own domain and can do that. The result is I need good spam filtering.

My filtering now achieves around 99.6% filtering, and about 1 detected false positive per month. It would be interesting to see how gmail copes with my 2400 spam per day, and what accuracy it achieves, but it's a non-starter because of how I use email.
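
The figures quoted in this sub-thread can be checked with a little arithmetic. The sketch below assumes that "accuracy" means the fraction of spam the filter catches, and it ignores false positives; the function name is made up for illustration.

```python
def missed_per_day(spam_per_day: int, accuracy: float) -> float:
    """Spam messages per day that slip past a filter catching `accuracy` of all spam."""
    return spam_per_day * (1.0 - accuracy)


# 2400 spam a day, at the accuracies mentioned in the thread.
for accuracy in (0.97, 0.98, 0.996):
    print(f"{accuracy:.1%} caught -> {missed_per_day(2400, accuracy):.1f} missed per day")
```

At 97% to 98% that is roughly 50 to 70 messages a day reaching the inbox, and at 99.6% about 10, which lines up with the numbers given earlier in the thread.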

reply

11 points by JimmyL 2 days ago | link

I do this all the time with Google Apps to track who sells my email address, creating a new one for each place I sign up.

1. Create a catch-all address for the domain you're going to use that isn't the normal postmaster one.

2. Pick a three- or four-letter combination of letters that rarely appears in normal conversation (like dcj, for example).

3. Set a mail filter on the catch-all account to forward all mail that has your three-letter combination as a part of its recipient list to your real address.

This means that for every service you sign up for, you can create a new address (I always use domainname.code@mydomain) that is trackable and gets to you. For example, I just signed up with Via Rail's online system - using the address viarail.dcj@mydomain. It will get forwarded to my real address (since it's got the dcj in there), if I start getting spam on it I'll know where they got the email address from, and if it gets really bad I can just change my filters to block all email to viarail.dcj@mydomain.

You could do something with Gmail's plus-addressing, but I find that many services don't accept those email addresses.
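
The forwarding rule in the three steps above could be sketched roughly as follows. This is only an illustration of the logic, not how a Google Apps filter is actually configured; the function name, the `blocked` set and the `mydomain.example` domain are assumptions. It is also stricter than the filter described, since it requires the code to be the last dot-separated part of the local part.

```python
from typing import AbstractSet, Optional

TAG = "dcj"  # the rarely used letter combination from step 2


def route(recipient: str, blocked: AbstractSet[str] = frozenset()) -> Optional[str]:
    """Return the service a tagged address was created for, or None if the mail is not forwarded."""
    local, _, _domain = recipient.lower().partition("@")
    if local in blocked:
        # This address has been burned: drop everything sent to it.
        return None
    service, sep, tag = local.rpartition(".")
    if sep and service and tag == TAG:
        # Tagged address: forward it, and remember which signup it belongs to.
        return service
    # Untagged mail hitting the catch-all is not forwarded.
    return None


print(route("viarail.dcj@mydomain.example"))                           # viarail
print(route("viarail.dcj@mydomain.example", blocked={"viarail.dcj"}))  # None
```

The returned service name is what tells you who leaked the address once spam starts arriving.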

reply

1 point by lunchbox 2 days ago | link

Good idea. For those of us who use an @gmail.com address, here's what you can use in lieu of plus signs, which as JimmyL said often don't work:

Gmail lets you insert periods between the letters of your username. So if my email address is someusername@gmail.com, the following are valid variants of my email address:

some.username@gmail.com

some.user.name@gmail.com

s.omeusername@gmail.com

And so on. So you can use variants for different services you sign up for. Also note that you can substitute @googlemail.com for @gmail.com.
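
A small sketch of the dot trick, assuming only the behaviour described above (periods in the username are ignored, and @googlemail.com is interchangeable with @gmail.com). Both function names are hypothetical.

```python
from itertools import combinations
from typing import Iterator


def canonical_gmail(address: str) -> str:
    """Collapse a dotted Gmail variant back to the underlying mailbox."""
    local, _, domain = address.lower().partition("@")
    if domain in ("gmail.com", "googlemail.com"):
        local = local.replace(".", "")
        domain = "gmail.com"
    return f"{local}@{domain}"


def dotted_variants(username: str, max_dots: int = 1) -> Iterator[str]:
    """Yield variants of `username` with up to `max_dots` periods inserted between letters."""
    gaps = range(1, len(username))
    for n in range(1, max_dots + 1):
        for positions in combinations(gaps, n):
            parts, prev = [], 0
            for pos in positions:
                parts.append(username[prev:pos])
                prev = pos
            parts.append(username[prev:])
            yield ".".join(parts)


print(list(dotted_variants("abc", max_dots=2)))  # ['a.bc', 'ab.c', 'a.b.c']
print(canonical_gmail("some.user.name@googlemail.com"))  # someusername@gmail.com
```

Keeping a note of which variant went to which service gives the same tracking as plus-addressing, as long as the service does not normalize the address itself.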

reply

1 point by Gormo 2 days ago | link

Yahoo Mail offers a service called AddressGuard that does exactly this, but it's only available to paid accounts. You can also easily use these as your from address, so even in direct correspondence, you still shield your primary address from the recipient. (This allowed me to verify that eMusic sells their subscriber list to spammers.)

A free alternative is SneakEmail (http://www.sneakemail.com) which allows you to set up disposable addresses that forward to your primary account, and allows you to set up pre-forwarding filters. They also create a unique address for the sender of each email, and you can set up your SneakEmail filters to insert this as the reply-to address of each email you receive.

reply

1 point by JimmyL 2 days ago | link

SneakEmail is great; it's what I use when I'm seriously suspicious of who I'm sending mail to.

For 95% of the things I sign up for, however, I'm not that paranoid - the slight decrease in security is offset by the added convenience of not having to log into a third-party service (like SneakEmail) to get what I'm doing done.

reply

6 points by pmjordan 2 days ago | link

I suspect you already know about this, but the google apps/domain service lets you add aliases and mailboxes painlessly.

reply

2 points by dkokelley 2 days ago | link

I suspect 'on-the-fly' could mean automatically, as in the server generates them as needed. I'm not sure if Google Apps can do that.

reply

3 points by nrr 2 days ago | link

I do this all the time with plus addressing.

reply

1 point by holygoat 2 days ago | link

... except that about 80% of form validation code incorrectly believes that foo+bar@zog.com is invalid.

reply

2 points by petewarden 2 days ago | link

I use a custom domain, but have it set up to forward all addresses to Gmail, mainly to use their spam filter. I'm not sure what your particular use-case for the on-demand addresses is, but you might be able to set up some Gmail label rules to sort it for you on that end too.

The only downside is that you have to manually verify your custom domain's 'sent from' addresses in Gmail, so you can't easily reply from arbitrary addresses.

reply

10 points by mseebach 2 days ago | link

Your visitors that can't spell their own email address are likely to screw your obfuscated address up as well. I've seen email in my outbound queue to whatever@domain.dot.tld and variations. I wouldn't be surprised if someone never realizes that AT is @ (in non-English speaking locales, @=at is not at all obvious. In the Nordic languages it's known as (elephant's) trunk-A. ) and gives up before the mail leaves the client.

reCaptcha Mailhide (http://mailhide.recaptcha.net/) is a good solution to this.

It hides your e-mail behind a CAPTCHA, but once it's solved it's an actual, well-formed address that can be copied or clicked. Also, it's very clear if you've solved the CAPTCHA - it's not always clear if you've solved the "riddle" of an obfuscation correctly.

reply

4 points by RiderOfGiraffes 2 days ago | link

  > ... visitors that can't spell their own email address
  > ... likely to screw your obfuscated address up as well
  > ... it's not always clear if you've solved the "riddle" of an
  > obfuscation correctly.
They are likely to screw it up, but at least they get the feedback of a bounced email. There is a recovery mode, and it's their problem.

With an incorrectly spelled return address on a form there is no recovery mode at all, and they get no feedback that it hasn't worked.

reply

4 points by raganwald 2 days ago | link

it's their problem

Isn't that the point of the article? That you are offloading work onto the people you ostensibly want to get in touch with you?

(I'm not saying it's wrong for you to do so, just that your statement seems to line up perfectly with the author's premise that obfuscating your email address is taking your problem and making it their problem).

reply

6 points by RiderOfGiraffes 2 days ago | link

The author suggests that using a form makes the problem go away. My experience says that people who can't de-obfuscate an email address frequently can't type their return address correctly. The obfuscated address gives them feedback in the form of a bounced email, and thereby gives a recovery mode. The incorrect return address in the form gives no recovery mode at all.

To me, that's conclusive proof that the form is worse than the obfuscated address.

Further, there's a balance to be achieved. I've analysed the types of people I want to contact me, and of those who can't work out my address there are two types. One type is those that I really don't care about - nuisance, spam, content-free, or time-wasters. The others that I do care about usually have a different route to me, one that's specifically tailored to them and made as easy as possible. That one has a specially designed anti-spam measure built into it.

reply

2 points by pyre 2 days ago | link

How many servers bounce email anymore as opposed to just 'blackhole'ing it so that spammers can't probe a domain for valid email addresses?

reply

1 point by raganwald 2 days ago | link

I have to agree with your disdain for forms. I hope that when given the choice between obfuscating an email address and providing a form, our response will be to improve our spam filtering and offer a plaintext mailto: link.

Plaintext links have high usability. The mail is organized within the user's mail program where it can be retrieved, collated by subject, and so forth. It can be copied and pasted. It's the absolute best thing for them :-)

reply

1 point by mseebach 1 day ago | link

> but at least they get the feedback of a bounced email.

First, that depends. They may hit an existing domain with a catch-all mailbox configured, or they may hit a legitimate mailbox at your provider, where the receiver may or may not think to reply that they got the wrong address.

Second, what percentage of users (especially in the segment that might misspell their own email address) will know what to do with a bounce mail?

reply

1 point by daleharvey 2 days ago | link

If spam is still a problem, it's pretty obvious your spam filter isn't working properly; that was kinda the point of the article.

reply

11 points by edw519 2 days ago | link

OP has it all wrong.

  e d w 5 1 9   AT   g m a i l
is not meant to upset anybody.

It's an IQ test.

If you can't figure out how to contact me from that data then you'd probably be wasting my time anyway.

Self-solving problem.

reply

9 points by brent 2 days ago | link

Exactly.

The author cannot be bothered to decode such email addresses. If that is the case I probably did not want his email in the first place. Ipso facto, my filter worked.

reply

2 points by Herring 2 days ago | link

I especially like where he says the writer is lazy for expecting him to spend 2 seconds decoding that address.

reply

2 points by clintavo 1 day ago | link

You are EXACTLY right - we display our jobs email address this way. We figure if someone can't figure out an email address that is presented this way, then we don't want to hire them.

reply

6 points by jgilliam 2 days ago | link

This will create a pretty mailto link encrypted with javascript that the bots can't pick up: http://hivelogic.com/enkoder/form

I've used it for years with good results. People will even email me thinking that I've exposed my email address to spammers encouraging me to use the blah [at] blah dot com style.
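
To illustrate the general idea of a JavaScript-built mailto link, here is a minimal sketch that emits a script tag assembling the link from character codes, so the address never appears literally in the page source. This is not Enkoder's actual algorithm, which is not described in the thread; the function name and the fixed "Email me" link text are assumptions, and the address is assumed to be plain ASCII.

```python
def scrambled_mailto(address: str) -> str:
    """Return an HTML snippet whose JavaScript rebuilds a mailto: link at render time."""
    # Encode "mailto:<address>" as a comma-separated list of character codes.
    codes = ",".join(str(ord(ch)) for ch in "mailto:" + address)
    return (
        "<script>"
        "document.write('<a href=\"' + "
        f"String.fromCharCode({codes})"
        " + '\">Email me</a>');"
        "</script>"
    )


snippet = scrambled_mailto("bob@example.com")
print("bob@example.com" in snippet)  # False
```

A harvester that only greps the raw HTML for an @ sign finds nothing here, while a browser that runs the script shows a normal clickable link.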

reply

2 points by pavel_lishin 2 days ago | link

Do you ever reply that while you never see spam, you frequently have to deal with other kinds of unsolicited e-mail, hinting softly?

reply

1 point by jgilliam 2 days ago | link

There is also a rails plugin: http://github.com/dan/hivelogic-enkoder-rails/tree

reply

6 points by bantic 2 days ago | link

I always thought this sort of systematic obfuscation ( /@/at/s, /\./dot/s, etc) was as machine-readable as an actual email address. I've just gone on the assumption that there are spam harvesters out there using regexes that catch "bob at domain dot com" as well as bob@domain.com.
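
A rough sketch of the kind of regex a harvester could use for the "at"/"dot" style. The pattern is an assumption made for illustration, not taken from any real spam bot, and it will produce false positives on ordinary prose such as "look at example dot com".

```python
import re
from typing import List

# One "dot" separator, written as dot, (dot) or [dot], with optional surrounding spaces.
DOT = re.compile(r"\s*(?:\(dot\)|\[dot\]|\bdot\b)\s*", re.IGNORECASE)

# local part, then at / (at) / [at], then two or more labels joined by "dot".
OBFUSCATED = re.compile(
    r"\b([a-z0-9._+-]+)\s*(?:\(at\)|\[at\]|\bat\b)\s*"
    r"([a-z0-9-]+(?:\s*(?:\(dot\)|\[dot\]|\bdot\b)\s*[a-z0-9-]+)+)",
    re.IGNORECASE,
)


def deobfuscate(text: str) -> List[str]:
    """Return plain addresses recovered from 'bob at domain dot com' style text."""
    return [
        f"{m.group(1)}@{DOT.sub('.', m.group(2))}".lower()
        for m in OBFUSCATED.finditer(text)
    ]


print(deobfuscate("mail bob at domain dot com please"))  # ['bob@domain.com']
```

Spaced-out forms like "e d w 5 1 9 AT g m a i l" are not matched by this pattern and would need extra rules.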

reply

3 points by eli 2 days ago | link

Sure, there are. But given the huge number of unobfuscated addresses on the web how many bother? Ten percent?

EDIT: This experiment shows it to be much less than that (based on volume of spam received) http://techblog.tilllate.com/2008/07/20/ten-methods-to-obfus...

reply

3 points by acdha 2 days ago | link

Given the large number of blog, wiki and CMS engines which use the same cargo-cult security idea, it's probably considerably higher than 10%. If you're getting paid to harvest addresses, wouldn't you write a single regexp to increase the number of good addresses?

reply

3 points by eli 2 days ago | link

Sure I would, but like most half-way decent programmers, you couldn't pay me enough to code spam bots.

I've blocked a huge volume of comment spam on my sites by blocking certain malformed HTTP headers. The authors couldn't be bothered to check if they were getting it right. I don't think most spam bot authors are A) very well paid or B) very good.

reply

1 point by xinsight 2 days ago | link

Don't forget that it's a dynamic system. As more of the low-hanging fruit emails get picked by spam email harvesters, there is more value in the harder to decode emails since they haven't been spammed. There is a tipping point where it would be "worth it" for someone to start decoding the harder types of obfuscated emails.

reply

2 points by dkokelley 2 days ago | link

Possibly, but couldn't you argue that the low-hanging fruit email addresses are more likely to be profitable to spammers? Which of these two internet users is more likely to buy your replica rolex: danny@aol.com, or AOL: danny (or danny at aol, or danny+don'tspamme at teh a oh l's dot com)?

My point is that users smart enough to disguise their emails from spammers are more likely to be wary of their wares.

reply

2 points by yread 2 days ago | link

I don't think it even makes sense to harvest these addresses. The people who write email [AT] address [dot] com are MUCH less likely to purchase their Viagra online in your shop just because you send them an email:)

reply

1 point by pyre 2 days ago | link

Does that really matter? A botnet of '3 million users' sounds pretty impressive until you find out they're all on dial-up.

reply

5 points by Raphael 2 days ago | link

My objection to it is that it would be trivial for email crawlers to decode the obfuscations most people use.

reply

14 points by gojomo 2 days ago | link

And yet: many email crawlers don't -- they're lazy and get plenty of email addresses just scraping nonobfuscated addresses.

So even trivial obfuscation turns out to help a lot. Here's someone's tests from a year ago:

http://techblog.tilllate.com/2008/07/20/ten-methods-to-obfus...

Every technique reduced spam by 60% or more, and even the dumb-as-rocks replacement of '@' and '.' with 'AT' and 'DOT' reduced the volume by size by over 99%.

reply

4 points by mseebach 2 days ago | link

Based on that analysis, javascript "scrambling" is the preferable method: It is 100%(1) effective and has no usability implications.

1: The analysis runs 1.5 years until July 2008 -- one must assume that crawlers have become more sophisticated. I.e. building the DOM, executing any javascript and then searching all visible text isn't that difficult, and less so now than in 2007.

reply

2 points by pmjordan 2 days ago | link

It's still far from trivial if you're doing this on millions of pages, especially as you'll have to sandbox the JS in some way, which may or may not subtly break things in other ways. I suspect the effort isn't worth it.

reply

1 point by petsos 2 days ago | link

Email harvesters probably don't want to run javascript because then they would be open to traps (like infinite loops or other cpu consuming scripts) that could be targeted at them.

reply

1 point by dmoney 2 days ago | link

Wouldn't the traps also make the web page unusable to regular users' browsers?

reply

2 points by pyre 2 days ago | link

There are ways to trick bots. 4chan used to have a second field named 'email' in their submission form that was hidden with the value set to "DO NOT PUT ANYTHING HERE" (or something similar) and spammers would blindly fill both email fields (unless someone was specifically targeting 4chan). I'll bet there are plenty of ways to get an email-crawler to click on some link that a normal person would not.
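
The hidden-field trick could be checked on the server roughly like this. The sketch assumes the submitted form arrives as a plain dict; the decoy field name and its default value come from the description above, while the real field name `reply_to` and the function name are hypothetical.

```python
from typing import Mapping

HONEYPOT_FIELD = "email"  # hidden decoy field, as described above
HONEYPOT_DEFAULT = "DO NOT PUT ANYTHING HERE"


def looks_like_bot(form: Mapping[str, str]) -> bool:
    """True if the hidden decoy field was changed (or stripped) by the submitter."""
    # A real browser submits the hidden field untouched; a bot that blindly
    # fills every field called "email" overwrites the default value.
    return form.get(HONEYPOT_FIELD) != HONEYPOT_DEFAULT


human = {"email": "DO NOT PUT ANYTHING HERE", "reply_to": "alice@example.org"}
bot = {"email": "spam@example.net", "reply_to": "spam@example.net"}
print(looks_like_bot(human), looks_like_bot(bot))  # False True
```

Treating a missing decoy field as suspicious is a design choice in this sketch; a bot written specifically for the site would pass either way.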

reply

1 point by gojomo 2 days ago | link

Such traps can be placed on decoy pages that users and good robots are unlikely to visit. Or, that legitimate users execute rarely -- when clicking a link to send a single legitimate email -- but email harvesters execute in excess.

reply

1 point by selven 2 days ago | link

Especially obfuscations like "bob at gmail dot com". If I were a spambot I would just read until the first space and append "@gmail.com" and "@hotmail.com" and 5-6 other mail providers. This would break through 90% of people doing that, and they're high value targets too - they feel secure with their clever obfuscation and chances are their other anti-spam tools (and reflexes) are weaker.

reply

4 points by eli 2 days ago | link

But you're not a spam bot author. Based on what I've seen, most spam bot authors are pretty bad programmers.

Also, I would debate that these are high value targets. I'd wager that people who go out of their way to obscure their addresses are much less likely to purchase fake pills or fall for a phishing email than the average user.

reply

2 points by heyitsnick 2 days ago | link

With your method, you would have to get an email address after every word: in your own words "i would just read until the first space and append". At the first space ("bob "), you have no indication this is a valid prefix of an email address.

Secondly, I don't see how bob is a high value target. Someone who knows what spam is, and knows what a spam bot is, and knows they want to obfuscate it, is probably not someone who would make a purchase if they did receive spam.

I think that's one reason spam bot crawlers don't try that hard to de-obfuscate addresses: the recipients are of less value than those unobfuscated.

reply

1 point by motters 2 days ago | link

Yes. I've never bothered to obfuscate my email for exactly that reason, rather than for reasons of user convenience. If I were writing a program to crawl web pages for email addresses it would be extremely easy to include most of the common obfuscation patterns. If you really want to obfuscate then use some kind of image of the text, which is less easily machine readable.

reply

4 points by KevBurnsJr 2 days ago | link

Add a plus to your name whenever you put your email address into a form on a website. It will continue to show up as though the +whatever were not present.

Kev+myspace@gmail.com, Kev+facebook@gmail.com, Kev+untrustworthysite@gmail.com

If you start getting a bunch of spam to Kev+myspace@gmail.com, filter out all email to that address.

reply

7 points by ovi256 2 days ago | link

I try to use that all the time, except that everybody likes to roll out their own email validation, and they all get it wrong: they do not allow the + sign as they should.

The RFC is extremely permissive, even spaces (yeah, spaces) are allowed in email addresses.
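
One way to avoid rejecting plus-addresses is to validate as little as possible in the form and let a confirmation mail do the real check. The sketch below is deliberately loose and does not attempt to implement the RFC grammar; the function name and the requirement that the domain contain a dot are assumptions.

```python
def plausibly_valid(address: str) -> bool:
    """Loose sanity check: something@domain.tld, with no opinion on the local part."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False
    labels = domain.split(".")
    # Require at least two non-empty labels, e.g. "zog.com".
    return len(labels) >= 2 and all(labels)


print(plausibly_valid("foo+bar@zog.com"))  # True
print(plausibly_valid("foo+bar"))          # False
```

Because the local part is not inspected at all, plus signs and other unusual but legal characters pass through.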

reply

1 point by dkokelley 2 days ago | link

I tried doing something like that, but it ended up being a hassle using my multiple email address versions to log in to sites. Actually, it would be nice to have a browser smart enough to do something like this automatically. I'll bet there's a FF plugin for that... but I use Chrome.

reply

4 points by oliverkofoed 2 days ago | link

My spam avoidance scheme is to own my own domain, and then have *@mydomain.com go to my inbox, so i can create disposable addresses for any use at any time.

For instance, I always sign up with [servicename].account@mydomain.com and only ever give out my personal e-mail to people i meet in person.

The great thing is that if a [servicename].account address starts getting spam, i know which service sold my address and i can just blackhole that address.

That way, i never have to obfuscate my address, since i'll always just create a new one for the specific need. It's probably not for everyone though...

reply

3 points by dpcan 2 days ago | link

I'm in the same boat as many people here.

I just use Spamassassin with my domain name and I get about 2 spam messages in my inbox per day, but the spam box gets around 1000 per day. I post my email address on websites because I want to have zero barriers when a customer or lead needs to contact me. It's NOT THEIR PROBLEM that I get spam.

So I agree, just get a good spam filter.

reply

3 points by tomjen2 2 days ago | link

Unless you are really desperate for people to contact you, forcing them to "decode" the email address might be a nice velvet rope that keeps out those who aren't worthy of your time.

Obviously this might not work if you have a customer support email.

reply

3 points by Tichy 2 days ago | link

No, spam filters are not pretty good. Instead, email has now become an unreliable channel for me. I get so much spam that I have to enable automatic filtering. That means I probably miss non-spam emails occasionally. And some spam still slips through the automatic filter.

reply

2 points by indranil 2 days ago | link

If I display my email like the way this fellow detests, it obviously means I anticipate my readers to have enough reason to type an email address out!

reply

1 point by Encosia 2 days ago | link

I couldn't agree more with the post. I've had my primary email address sitting out on my contact page for years and have never spent a significant amount of time dealing with spam. At this point, obfuscated addresses are as archaic as animated "under construction" GIFs and the blink tag.

Using the de-obfuscation process as a gating mechanism is arbitrary and perhaps a bit arrogant. There's no reason to assume that familiarity with this convention equates to intelligence or value. Even now, I still run into plenty of people in businesses outside of the tech industry that confuse website and email address formats. That doesn't make them "unworthy" of contacting us; it just means they have a different skillset and knowledge than we do.

Meanwhile, email spammers and scammers are most likely to understand these conventions, since it's their "job" to do so.

reply

1 point by cnvogel 2 days ago | link

One possible solution to the problem is to use dynamically generated throw-away email-addresses. You can also encode some kind of signature. Then just hide the monstrously huge address behind a pretty "Please Email me" Link:

mailto:chris-hHz389aASKJkjhqweuiSHADKJweiuqzrq@example.com

If Spam increases (or, say, whenever more than 3 emails have been received at one particular address), you shut down the address.

I once implemented the receiving, hmac/signature checking, part for the exim mail-server and a general address-generator class in python and php. But never actually put it to use.
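
A minimal sketch of signed throw-away addresses, using Python's standard `hmac` module. This is not the commenter's implementation, which is not shown; the `user-token-signature` layout, the 16-character truncated signature, the secret and the function names are all assumptions, and the token is assumed to contain no hyphens.

```python
import hashlib
import hmac
import secrets
from typing import Optional

SECRET = b"replace-with-a-private-key"  # hypothetical; known only to the mail server


def _sign(user: str, token: str) -> str:
    """Truncated HMAC-SHA256 over the user and token."""
    digest = hmac.new(SECRET, f"{user}-{token}".encode(), hashlib.sha256).hexdigest()
    return digest[:16]


def make_address(user: str, token: Optional[str] = None, domain: str = "example.com") -> str:
    """Create a disposable address of the form user-token-signature@domain."""
    token = token or secrets.token_hex(4)
    return f"{user}-{token}-{_sign(user, token)}@{domain}"


def verify_address(address: str) -> bool:
    """Check on receipt that the address was really generated with SECRET."""
    local, _, _domain = address.partition("@")
    parts = local.rsplit("-", 2)
    if len(parts) != 3:
        return False
    user, token, signature = parts
    return hmac.compare_digest(signature.encode(), _sign(user, token).encode())


addr = make_address("chris")
print(verify_address(addr))                                               # True
print(verify_address("chris-deadbeef-0000000000000000@example.com"))      # False
```

The receiving side can reject any address whose signature does not verify without keeping a list of issued addresses; shutting down a single address would still need a small blocklist of tokens.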

reply

1 point by eli 2 days ago | link

I wouldn't obfuscate the support or billing contact address for my company, but on my personal site? Or code documentation posted on the web? You betcha.

There's a big difference between pushing work on your customers and pushing work on people who don't know you and are trying to contact you the first time.

reply

1 point by NathanKP 2 days ago | link

The two main suggestions to allow people to contact you without getting lots of spam:

1) Use Gmail. Google has written their search engine to know how to detect spam so they know how to stop spam from reaching your email address. I don't think I have ever had even a single spam message even reach my spam folder, and that's thanks to Google. I could put my non-obfuscated email address all over the place and not have to worry.

2) Write a custom contact form. It is not that hard if you know even a little bit of PHP. And if you don't you can always use Zoho forms or some other free online form creator. I never put up contact emails because they are, in my opinion, just as unprofessional as an obfuscated contact email. Contact forms are much more professional.

reply

2 points by RiderOfGiraffes 2 days ago | link

And what do you do when someone mis-types their return address?

reply

1 point by est 2 days ago | link

obfuscating is useless:

http://www.google.com/search?q=at+gmail+dot+com

Collect enough patterns and you can harvest tons of email addresses like it used to be.

reply

4 points by mseebach 2 days ago | link

There's a difference between what can be done, and what will be done. The point in the article is that simple obfuscation actually works.

reply

1 point by petsos 2 days ago | link

Yes, but that's like hiding behind your finger. It is so easy to de-obfuscate that it is just a matter of time.

reply

1 point by gloob 2 days ago | link

A matter of time is plenty good enough for the time being. It's an email account, not a bank account.

reply

0 points by rv77ax 2 days ago | link

It looks like we are missing an important question here: how do spammers get the email addresses? Do spam bots really crawl every web page?

Some notes I have picked up from the Internet on minimizing spam:

- Never use a third-party proxy or anonymizing network (e.g. Tor). I once worked for a company that blocked every port except 8080, so for several months I used Tor. After several weeks, my spam folder suddenly filled up with junk email.

- Make sure you clear all your caches and cookies _before_ and after browsing for pr0n. duh! :)

- Never use any third-party application on Facebook/MySpace/any social network, unless the account is registered with your non-private email address.

- Do not read spam email. If you know an email is spam, just select it and delete it, or let the system delete it automatically, as Gmail does. I do not know much about the SMTP protocol, but there is a feature that notifies the sender when you read a message; by opening the email you have just told the spammer that your address is, at least for now, active.

Spammers, in the sense of the people gathering addresses, are not stupid. They know what they're doing.

reply



