Subdomains With Hyphens - The HP Security Laboratory
Subdomains With Hyphens

I’ve been running a lightweight web crawler for a while just to look for interesting things. Recently I’ve noticed several web sites with hyphens at the beginning or end (or both) of their subdomain names/labels. The first time I saw it, I chalked it up to a link error, but after noticing it a few times it warranted investigation.

 

Here’s an example with both: http://-brainfreeze-.blogspot.com/

 

Obviously Blogspot allows that, but I got to wondering if that’s technically legitimate, and if so, what software accepts or barfs on it? So let’s look at the mess of RFCs that I found… RFC-608, RFC-952, RFC-1033, RFC-1034, RFC-1123, RFC-1738 and RFC-1912 all throw their hands into the host/domain name cookie jar (did I miss any?).

 

RFC-1912, which seems to be the latest, says (emphasis mine):

Allowable characters in a label for a host name are only ASCII letters, digits, and the `-' character. Labels may not be all numbers, but may have a leading digit (e.g., 3com.com). Labels must end and begin only with a letter or digit.
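
To make that rule concrete, here is a minimal sketch in Python that checks only what the quoted passage says (allowed characters, no all-numeric labels, no hyphen at either end) and nothing else, such as length limits. The names label_ok and hostname_ok are made up for illustration; they are not from any RFC or library.

```python
import re

# Letters, digits and hyphens only; first and last character must be a
# letter or digit. A single-character label is allowed.
_LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def label_ok(label: str) -> bool:
    """Check one label against the RFC-1912 wording quoted above."""
    if _LABEL_RE.fullmatch(label) is None:
        return False
    # "Labels may not be all numbers"
    return not label.isdigit()


def hostname_ok(hostname: str) -> bool:
    """Check every dot-separated label; one trailing dot is tolerated."""
    if hostname.endswith("."):
        hostname = hostname[:-1]
    return all(label_ok(label) for label in hostname.split("."))
```

With this sketch, hostname_ok("-brainfreeze-.blogspot.com") returns False, while hostname_ok("3com.com") returns True.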

 

So… my interpretation is that “-brainfreeze-” and others are in violation of this RFC. But we all know how this really works (it only matters what the browsers support), so what do they do?

 

Not surprisingly, IE, Firefox (Mac & Windows), Safari (Mac) and Opera (Mac) all open the web site without a problem. Safari on the iPhone, however, fails with the error “Safari can't open the page because it can't find the server.”

 

So that leads us outside the browser…

 

Some quick mail tests show Outlook, Thunderbird and Gmail will all accept addresses at such names in a “To” field. Apache seems to have no issues with them, either.

 

Plesk, a popular web server management package, disallows them, saying the subdomain field has an “improper value.”

 

For programming languages, Perl’s LWP module won’t load them (“bad hostname”), and Ruby’s Hpricot library won’t either (“the scheme http does not accept registry part”). PHP with include/require fails (“php_network_getaddresses: getaddrinfo failed: Name or service not known”). Python’s urllib2 also spits up on it (“IOError: [Errno socket error] (-2, 'Name or service not known')”).

 

However, these errors may not come from any particular language or program; they may instead stem from underlying name resolution issues. It’s important to note that nslookup on Windows, and nslookup/dig on OS X and CentOS 5.2, don’t have any problems with these names (on the same hosts those languages were tried on). I’ll try to look at the resolution libraries soon for some of those and post a follow-up.
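
One way to start telling the two cases apart is to call the system resolver directly from the language runtime and see which layer complains. The sketch below does that in Python with socket.getaddrinfo. The function name check_resolution, the result labels and the injectable resolver parameter are my own assumptions for illustration, and this is not how the tests above were run.

```python
import socket


def check_resolution(hostname, resolver=socket.getaddrinfo):
    """Ask the resolver about hostname and report which layer answered.

    `resolver` defaults to the real system lookup; it is a parameter
    (an assumption of this sketch) so a stand-in can be used offline.
    """
    try:
        results = resolver(hostname, 80)
    except socket.gaierror as exc:
        # The lookup itself failed, e.g. "Name or service not known".
        return ("resolver-error", str(exc))
    except UnicodeError as exc:
        # Python could not encode the name, so no lookup was attempted.
        return ("rejected-before-lookup", str(exc))
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    addresses = sorted({info[4][0] for info in results})
    return ("resolved", addresses)
```

Running check_resolution("-brainfreeze-.blogspot.com") on a host where dig succeeds would hint at whether the failure sits in the C library's resolver or higher up in the language's own URL handling.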

For web sites, Archive.org's Wayback Machine gives a vague error if you try to look up our friend "-brainfreeze-" (a different error from the one it gives for a nonexistent host), but Google's cache of it works. TinyURL doesn't have any issues either.

So is this a security issue? Probably not directly, but it may be a problem when running tools against names that fit this pattern. What happens if your proxy or filter fails to parse the name? What tools rely on a “broken” ping or name lookup before they do something? What tools use urllib2 or LWP under the hood? And what hostname parsing routines will simply say the names are invalid and return an error? Any one of those “minor” issues could cause a compliance failure, if not a real security issue.
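
If you want a tool to notice these names up front instead of tripping over them later, a small pre-check is enough. Here is a hedged sketch in Python: edge_hyphen_labels is a hypothetical helper name, and it leans on urllib.parse.urlsplit, which extracts the host without validating it against the RFC rule.

```python
from urllib.parse import urlsplit


def edge_hyphen_labels(url):
    """Return the hostname labels in url that begin or end with a hyphen."""
    # urlsplit() lower-cases the host and strips any port; it returns
    # None when the URL has no host part, hence the `or ""`.
    host = urlsplit(url).hostname or ""
    return [
        label
        for label in host.split(".")
        if label.startswith("-") or label.endswith("-")
    ]
```

A crawler or scanner could log any URL for which this returns a non-empty list, so a later “bad hostname” error is easier to explain.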

 

So to recap, what failed in testing?

- Perl LWP
- Ruby Hpricot
- Python urllib2
- PHP include/require
- Plesk
- Safari (iPhone)
- The Wayback Machine

 

Find others? Post in the comments, and give http://-brainfreeze-.blogspot.com/ some love for being a good test site.


Posted 12-02-2008 7:31 PM by Chris Sullo
