There’s more than one way to write an IP address (2019) (ma.ttias.be)
256 points by KomoD on April 28, 2023 | 100 comments



> Here’s another neat trick. You can overflow a digit. [...] PING 10.0.513 (10.0.2.1)

That's not exactly what's happening. You're omitting the fourth octet, so the last number is interpreted as a single decimal value covering the remaining octets (you can also have it be interpreted as hex or octal with the usual prefixes).

10.0.0.513 won't work because overflow isn't really what's happening. (For a minute you had me wondering if I missed something in my IP address variants tool because I didn't know that 9.256.0.1 would work as 10.0.0.1, but no, it can't and I've got the other case covered. Whew!)

The example can be written more succinctly as 10.513


Interesting, so that's also what's happening with 127.1 and 127.0.1: the zeroes are not being "inserted automatically", they come from expanding the last number (1) into the bits for the last bytes of the address.

To make it clearer:

- for "x", then "x" is all 4 bytes of the address

- for "x.y", then "x" is the first byte of the address and "y" the last 3 bytes

- for "x.y.z", then "x" is the first byte, "y" is the second, and "z" is the last 2

- for "x.y.z.w", then each of the numbers is its own byte


I guess the most useful way of describing the dot-decimal notation for IP(v4) addresses is that it consists of at least one and at most four numbers separated by dots, each of which represents one octet of the address (MSB on the left), save for the last number, which always represents all remaining octets. Each number is by default a decimal number but can also be an octal (prefix 0) or a hexadecimal number (prefix 0x). That should include all cases. Stuff like 10.05003 → 10.0.10.3.
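
The rule described above can be sketched in a few lines of Python. This is only an illustration of the description in this thread, not the code of any real `inet_aton`; the name `parse_ipv4` is made up, and real implementations may differ in edge cases (for example, how they treat a bare `0x`).

```python
def parse_ipv4(text: str) -> str:
    """Parse a classic inet_aton-style IPv4 string into dotted-quad form."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError(f"expected 1 to 4 parts, got {len(parts)}")

    def parse_number(part: str) -> int:
        # Prefix rules as described in the thread: 0x = hex, leading 0 = octal.
        if part.lower().startswith("0x"):
            base, digits = 16, part[2:]
        elif len(part) > 1 and part.startswith("0"):
            base, digits = 8, part[1:]
        else:
            base, digits = 10, part
        valid = "0123456789abcdef"[:base]
        if not digits or any(ch not in valid for ch in digits.lower()):
            raise ValueError(f"bad number: {part!r}")
        return int(digits, base)

    numbers = [parse_number(p) for p in parts]
    address = 0
    # Every number except the last fills exactly one octet.
    for n in numbers[:-1]:
        if n > 0xFF:
            raise ValueError(f"{n} does not fit in one octet")
        address = (address << 8) | n
    # The last number fills all remaining octets.
    remaining_bits = 8 * (4 - len(numbers) + 1)
    last = numbers[-1]
    if last >= 1 << remaining_bits:
        raise ValueError(f"{last} does not fit in {remaining_bits} bits")
    address = (address << remaining_bits) | last
    return ".".join(str((address >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(parse_ipv4("10.0.513"))   # 10.0.2.1
print(parse_ipv4("10.05003"))   # 10.0.10.3
```

With this sketch, `10.0.0.513` is rejected because 513 does not fit in the single remaining octet, which matches the point that no real overflow is going on.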


That is correct. The dots are there for our convenience to be able to spot individual bytes. I hope the author addresses and edits their article accordingly.


Their article from 3 years ago? I somehow doubt they're going to be making any edits now.


> The example can be written more succinctly as 10.513

The most useful example I use on a day to day basis is

  dig foo.com @1.1
or

  ping 1.1
Which expands to 1.0.0.1


0 for 0.0.0.0 is useful too. I often do things like ping 0 or ssh 0 (perhaps with a different port number) to ssh into localhost.

0 expands to "0.0.0.0" and can be used in place of localhost in many situations. (https://serverfault.com/questions/876698/whats-the-differenc...)


And this is why the second address for the 1.1.1.1 DNS service is 1.0.0.1.

`mtr 1.1` (with MTR_OPTIONS="--displaymode=2" in my environment, super useful display mode for general connectivity diagnostics) is my standard “my internet connection doesn’t seem to be working perfectly, where’s the problem?” tool.


What's happening is more obvious when you consider that you can equivalently write it as simply 167772673. Try `ping 167772673`!
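
For anyone who wants to check the arithmetic without pinging anything, here is a small Python sketch using the standard `ipaddress` module. The helper names `int_to_dotted` and `dotted_to_int` are made up for this example.

```python
import ipaddress


def int_to_dotted(value: int) -> str:
    # IPv4Address accepts a plain integer in the range 0..2**32-1.
    return str(ipaddress.IPv4Address(value))


def dotted_to_int(text: str) -> int:
    return int(ipaddress.IPv4Address(text))


print(int_to_dotted(167772673))   # 10.0.2.1
print(dotted_to_int("10.0.2.1"))  # 167772673
# The same number built by hand from the four octets:
print((10 << 24) | (0 << 16) | (2 << 8) | 1)  # 167772673
```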


No need to ping if all you want is the conversion :)

Just type it into the address bar: https://snipboard.io/kbLTso.jpg (previously posted that screenshot in 2021 https://news.ycombinator.com/item?id=29050936)


The address bar is also a bit excessive for doing conversions.

  $ getent hosts 127.1
  127.0.0.1       127.1


I would agree that a whole browser engine is overkill for any simple conversion of four simple bytes

...but you realize we're conversing via a website, right? :D Most people will already be in a browser!


And the ping utility.


How about 134744072, that hits my favorite am-I-connected site. I never knew you could put an int representing the 4 bytes together until today, this is really fun.

Even hex works, but doesn't hit a site that responds: (ping 0xcafebeef).


I enjoy using "ping 0x1010101"


Javascript seems to normalize these automatically:

    const a = document.createElement('a');
    
    a.href = 'http://032166163360';
    console.log(a.href);
    // "http://209.216.230.240/"
    
    a.href = 'http://127.1'
    console.log(a.href);
    // "http://127.0.0.1/"
    
    a.href = 'http://10.50.1'
    console.log(a.href);
    // "http://10.50.0.1/"
    
    a.href = 'http://10.0.513'
    console.log(a.href);
    // "http://10.0.2.1/"
    
    a.href = 'http://0xA000201'
    console.log(a.href);
    // "http://10.0.2.1/"
    
    a.href = 'http://10.0.2.010'
    console.log(a.href);
    // "http://10.0.2.8/"

and also:

    console.log(new URL('http://10.0.513').host)
    // "10.0.2.1"


This wasn't always the case. For years, my website had a numeric vhost configured with an easter egg, but I don't think anyone ever visited it. Then I noticed, maybe four years ago, that Firefox now translates the IP address into dotted-quad notation and uses that as the Host header instead of what the user typed, so it would never trigger now anyway.

The internet used to be more fun when it was all fun and games and we didn't need to worry about every possible type of user misleading :(


>The internet used to be more fun when it was all fun and games and we didn't need to worry about every possible type of user misleading :(

I remember in the 2000s, there was some site using a decimal IP address (as a single number, not dotted quad) that had hacking/crypto puzzles. Something with an "Alice in Wonderland" theme. Does that ring a bell for anyone?


> I remember in the 2000s, there was some site using a decimal IP address (as a single number, not dotted quad) that had hacking/crypto puzzles. Something with an "Alice in Wonderland" theme. Does that ring a bell for anyone?

Was it called ninebows? I was looking online for that a while ago to show a friend, and I remember in school reading pages and pages of forum threads about it: speculation of what this or that meant, ideas for solving the next puzzle, etc. It was a website with some hidden puzzle that, once solved, would leave you with another link somewhere else with another puzzle that needed solving.

Someone eventually won, too, but it looks like it just vanished off the face of the Internet like many things from that era. Either that, or my search engine skills aren't good enough.


It (like most other scripting languages) is likely just calling inet_aton under the hood.


And proper inet_aton allows even more formats (IPv4 only):

u32 undotted

u8.u24 dotted-signal

u8.u8.u16 dotted-triple

u8.u8.u8.u8 dotted-quad

^ where each of the above is allowed to be octal, hex, or decimal
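
To make the four shapes concrete, here is a short Python sketch that renders one address in each of them (decimal only). The function name `shapes` is invented for this example; it is not part of any library.

```python
def shapes(dotted_quad: str) -> list[str]:
    """Render a dotted-quad address in the four shapes listed above."""
    a, b, c, d = (int(octet) for octet in dotted_quad.split("."))
    return [
        str((a << 24) | (b << 16) | (c << 8) | d),  # u32
        f"{a}.{(b << 16) | (c << 8) | d}",          # u8.u24
        f"{a}.{b}.{(c << 8) | d}",                  # u8.u8.u16
        f"{a}.{b}.{c}.{d}",                         # u8.u8.u8.u8
    ]


for text in shapes("10.0.2.1"):
    print(text)
# 167772673
# 10.513
# 10.0.513
# 10.0.2.1
```

Each numeric part could just as well be printed with `hex()` or as octal with a leading 0, which is where the large number of combinations comes from.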


I think you mean browser (more specifically HTML standard), not JavaScript


In particular, IPv4 address parsing for URL hosts is specified in https://url.spec.whatwg.org/#concept-ipv4-parser, in WHATWG's URL Standard.


Here's a list of all the ways (and notation combinations) you can make with your IP address: https://lucb1e.com/randomprojects/php/funnip.php


The first time I ever saw a hexadecimal ip was in a spam text message, one of those "click here for your prize". I laughed to myself thinking whoever wrote their spambot had messed up the url but to my surprise the link worked. I didn't ever receive my prize...


It's a common trick to evade spam detection, because the writer of the spam detection software probably didn't think about those weird IP address formats and would fail to extract the URL.


the education you received was more valuable than any monetary reward. $2k you earn yourself is worth more than $100k given to you for free.


Worth it in an ideal world; in the real world it's bad advice to compare given vs earned. We wouldn't have the majority of the companies if the founders hadn't been given the wealth to bootstrap them.


I thought the article would mention ①②⑦.⓪.⓪.①

  $ ping ①②⑦.⓪.⓪.①
  PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.


That's neat, but it's a property of the OS, though, not of IP addresses themselves :)

(doesn't work on Windows btw)


Works in Firefox on Linux for me. Probably some form of unicode normalization?

http://xn--orhcp.xn--mvh.xn--mvh.xn--orh:8000 -> http://127.0.0.1:8000


Oh wow, I think HN normalized the unicode with punycode since it's a URL. I submitted it with ①②⑦.⓪.⓪.① between the http:// and the :8000. The punycode version also works for me :)


Just tried both versions in Firefox on Linux, they work for me too.


PaleMoon browser refuses to open the xn--orhcp link, and shows nothing in the status bar for it. Right click doesn't give the option to open it either, only bookmark.


Yup, same here too. Something cool I just noticed, Firefox converts that link to 127.0.0.1:8000 in the status bar when hovering over it.


>(doesn't work on Windows btw)

Works in Git Bash in Microsoft Terminal, which is cheating a bit, but doesn't introduce a different OS.

Edit: Works in PowerShell too.


We should be able to spell out IPv4 and IPv6 addresses in WingDings.

EDIT: WingDings is the grandparent of emojis.


Codepage 437 is the great grandparent: ♥♦♣♠

edit: 0x01="smiley face white on black", 0x02="smiley face black on white" were filtered by HN


It's so funny you say that.

I wanted to use "great grandparents" but I made the false assumption that the HN audience wouldn't have identified with it, but holy crap we do.

I think we need a knowledge archive asap.... we are going to start dying off soon.

The next decade is going to decimate some minds.


Wow, I thought HN filtered all emojis. TIL.


This requires a terminal or shell with Unicode normalization.


Doesn't work on macOS Ventura 13.3.


> $ ping 10.0.2.010

> PING 10.0.2.010 (10.0.2.8)...

It was all fun and games until they started mixing bases, decimal and octal in the same address.

That's just cursed.


Yep, and I learned to hate IP phones this way, because some manufacturer (Siemens) wrote (writes?) decimal IPs with a 0 prefix (so each octet is always three digits) for aesthetics.


Makes them sortable via standard text sorting algos, too.


I had a case 15 years ago where .255 was accepted in the web GUI and the stack really tried to get that IP; I found out only by running Ethereal on a directly connected patch.


Oh hell no!


http://032166163360 => news.ycombinator.com


Although the source has `href="http://032166163360"`, Firefox says the link is `209.216.230.240`.

Try also http://0xD1D8E6F0 and http://3520653040


Yes, noticed the same thing. It's still one number in the HTML source though, only in the status bar it shows the "normal" IP address.


~~HN or~~ Firefox rewrote in dotted-quad.


If you use "view source", you can see the link isn't rewritten there.


... but please don't.


I use this to store IP addresses in a database because you can operate on numbers (e.g. WHERE subnet_start < $thisip AND $thisip < subnet_end) but hardly on the unique dotted format that we normally display them as.

Also to specify a bind address when I don't care, like running `php -S 0:3000` (the silly thing wants a bind address rather than only a port number. There, have one!) or accessing localhost in a browser (just typing 0:3000 is enough). For 127.0.0.1, unfortunately the best you can do is writing 127.1. The numeric, hex, and octal variants are 2130706433, 0x7f000001, and 017700000001, which I personally don't find preferable to 127.1.
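
A minimal sketch of the integer-storage idea, using Python's built-in `sqlite3` and `ipaddress` modules. The table, column, and host names are invented for illustration, and the query uses an inclusive `BETWEEN` rather than the strict `<` comparison mentioned above.

```python
import ipaddress
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hosts (name TEXT, ip INTEGER)")
# A plain integer column can use an ordinary index for range queries.
conn.execute("CREATE INDEX hosts_ip ON hosts (ip)")

rows = [("router", "10.0.0.1"), ("laptop", "10.0.2.1"), ("dns", "1.1.1.1")]
conn.executemany(
    "INSERT INTO hosts VALUES (?, ?)",
    [(name, int(ipaddress.IPv4Address(ip))) for name, ip in rows],
)


def hosts_in(network: str) -> list[str]:
    """Return host names whose stored integer falls inside the network."""
    net = ipaddress.ip_network(network)
    first, last = int(net.network_address), int(net.broadcast_address)
    cursor = conn.execute(
        "SELECT name FROM hosts WHERE ip BETWEEN ? AND ? ORDER BY ip",
        (first, last),
    )
    return [name for (name,) in cursor]


print(hosts_in("10.0.0.0/8"))  # ['router', 'laptop']
```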


If you're using a database, use something like Postgres that has this functionality built in[0]. You can store IPs, networks, etc. in a native format that has all sorts of functionality available out of the box.

[0] https://www.postgresql.org/docs/current/functions-net.html


Yes, storing is different from displaying though.

In MySQL for example, that's what INET_ATON() and INET_NTOA() are for, to convert between binary and display.

Analogous to storing timestamps but displaying as datetimes in a timezone.


This is why I love this site. I've been doing networking for many years and I'm not a total novice in databases, but I had no idea MySQL had these functions.



I'd still much rather store something that can be indexed in a btree than something where you have to always call a function on and do full table scans. Of course, before displaying to the user you'd use long2ip again (or the database equivalent you mentioned; I usually avoid doing unnecessary computations on the database and, instead, let the application handle display logic).


Sorry if I wasn't clear, I was agreeing with you! Yes precisely for indexing (plus just a fixed column size that wastes no space).


Obviously the octal and hex and overflow are pretty cursed, but I do like using 10.0.0.* for home IPv4 just 'cause typing ssh 10.1 is so darn convenient.


Or set up local DNS and search domains, or even just add entries to your hosts file. `ssh fw` is easy, as is `ssh server`, which, while more characters, are more in the central typing plane.


I have those too, but I'm familiar with all the IPs and I just got tired of adding DNS entries. Esp for some predictable ones in the "dynamic" range.

Also there are times I don't have DNS working. Often times at some console where copy/paste also isn't working or where I don't even have a mouse, and I extra appreciate the simpler typing :)


I switched to the 10/8 block at home because it's less stupid than typing 192.168.whatever for everything local. I'll have to try this.


I'm not convinced we shouldn't have (originally) adopted using pure hex, e.g. 0x7F000001 instead of 127.0.0.1. Personally, I think it makes subnet masks, etc, a lot _more_ obvious.


Yes. The nibble boundary is at every 4 bits, not 8, too, making subnetting A LOT easier.


I think the IP address libraries should only accept the standard dotted decimal octet form. And let the others die as non-standard, historical forms.


Next you'll tell me to stop visiting the alternate timeline where everyone has a mustache and domain names end in a period.

https://news.ycombinator.com./item?id=35751030 ;-)

( warning: your cookies for news.ycombinator.com won't work in news.ycombinator.com. )


I "ping 1.1" as my go-to network availability test.

It checks to see if Cloudflare is responding, which 99.9% of the time is going to tell you if your internet is working :)


the behavior you observe is arbitrary


Discussed at the time:

There’s more than one way to write an IP address - https://news.ycombinator.com/item?id=20390759 - July 2019 (48 comments)


Hah, this reminds me of early firewall bypass techniques. A long time ago, getting to your destination via octal notation was a hack.


Personally, when I was trying to wrap my head around CIDRs for the first time, thinking about IPs as one 32-bit number (a la hex formatting) was super helpful, and makes it less annoying to leave behind the nice /8 /16 /24 chunks. Thinking in terms of just bitmasks is also pretty straightforward in the end.
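
As a small illustration of the bitmask view, here is a Python sketch that treats the address as a single 32-bit integer. The names `prefix_mask` and `network_of` are invented for this example; the standard `ipaddress` module already offers `ip_network` for real use.

```python
import ipaddress


def prefix_mask(prefix_len: int) -> int:
    """Return the 32-bit mask with the top prefix_len bits set."""
    return (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF


def network_of(address: str, prefix_len: int) -> str:
    """Clear the host bits of the address and return the network in CIDR form."""
    ip = int(ipaddress.IPv4Address(address))
    return f"{ipaddress.IPv4Address(ip & prefix_mask(prefix_len))}/{prefix_len}"


print(f"{prefix_mask(20):#010x}")      # 0xfffff000
print(network_of("192.168.37.5", 20))  # 192.168.32.0/20
```

In hex the /20 mask is five `f` nibbles followed by three zero nibbles, so a prefix that is a multiple of 4 can be read straight off the digits.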


I gave a class on TCP/IP to other consultants in my company 20 years ago. The day went well right up until the end when I covered bitmasks, at which point eyes universally glazed over.

Lesson learned: always save that topic for last, so the rest of the day isn’t a disaster.


In community college not 5 years ago, I took the first two of a series of Cisco networking classes designed to prep for certifications such as CCNA.

When we came upon CIDR and VLSM, our instructor (very knowledgeable, down-to-earth, pragmatic) introduced us to various calculators that could assist us, although he did also show us a manual way to graph out each bit. Then he admitted that the VLSM portion of the class had often driven his previous students to tears, and he didn't want to see anyone crying over this anymore.


Aren't there some other ones? IIRC the standard "IP address handler library" would do things like "try any possible way of interpreting it" and would work on words, etc.


There are no officially adopted text-representations (for IPv4), only binary ones. In other words: there are simultaneously infinitely more possibilities and exactly zero.

For the most part, applications tend to punt the job of interpreting such text address representations to the IP stack (usually embedded in the OS kernel). These vary in what they'll accept by implementation and version, but they tend to be extremely good at interpreting whatever arbitrary nonsense people have historically been likely to try. As a result, there are surprisingly few application-level libraries which even attempt to deal with that mess.


I only know of two operating systems where IP address parsing (and the address resolver in general) is part of the TCP/IP stack: ITS and z/OS. MS-DOS gets an honorable mention due to not having any architectural distinction between parts of the system at all, and z/OS only qualifies because it's not entirely clear where the boundaries of the "TCP/IP stack" are to begin with. (One would be forgiven for thinking that the TCP/IP stack is contained in the address space called "TCPIP". However, significant parts of it are in the LPA, which is part of every address space, and it's not clear to me yet where exactly the resolver is.)


You could represent each octet as an ASCII char reducing it to 4 characters.


> You could represent each octet as an ASCII char reducing it to 4 characters.

Except ASCII is 7-bit, and a number of those are control characters. So you couldn’t, and many that you could would be unreadable. <DEL><NUL><NUL><SOH> for localhost is... not ideal.
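
A quick Python sketch of what the four raw bytes look like as characters. The function name `as_ascii` and the bracketed placeholder format are made up for this example.

```python
import ipaddress


def as_ascii(address: str) -> str:
    """Show each octet as a printable character, or a marked hex code if not."""
    out = []
    for byte in ipaddress.IPv4Address(address).packed:
        if 0x20 <= byte <= 0x7E:
            out.append(chr(byte))            # printable ASCII
        elif byte < 0x80:
            out.append(f"<0x{byte:02X}>")    # ASCII control character
        else:
            out.append(f"<0x{byte:02X}!>")   # outside 7-bit ASCII
    return "".join(out)


print(as_ascii("127.0.0.1"))     # <0x7F><0x00><0x00><0x01>
print(as_ascii("72.105.33.63"))  # Hi!?
print(as_ascii("192.168.0.1"))   # <0xC0!><0xA8!><0x00><0x01>
```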


That sounds like good laptop sticker material. Could even give readers a hint by using the "there's no place like" saying that I've seen applied to 127.0.0.1 and ::1 already, or prefix some hacking tool. I wonder how many people would get it.

There's no place like <DEL><NUL><NUL><SOH>


You can type control codes (except for NUL, DEL, CR, LF?) on the Linux console by pressing Ctrl+letter. Might add some security to your disk encryption password, unless you need to enter it from a GUI :)


How do you write DEL, which Ctrl+key is that? I had written this actually as ^H^@^@^A, but realized that ^H may remove characters but is not DEL, so I edited that out of my previous comment.


Yes, I don't know if there is any key combination to type DEL. Ctrl just clears bit 6. ^S and ^Q are also problematic because of XON/XOFF. I did use ^G (BEEP!) in a password once.

edit ^H is also swallowed by the line editor of course.


Or 5 base85 digits if you wanted to ensure there were no control characters...
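
A short Python sketch of the base85 idea: four bytes encode to exactly five characters. It assumes the alphabet used by Python's `base64.b85encode`, which is only one of several base85 variants; the helper names are made up.

```python
import base64
import ipaddress


def to_base85(address: str) -> str:
    # 4 packed bytes become exactly 5 base85 characters.
    packed = ipaddress.IPv4Address(address).packed
    return base64.b85encode(packed).decode("ascii")


def from_base85(text: str) -> str:
    return str(ipaddress.IPv4Address(base64.b85decode(text)))


encoded = to_base85("127.0.0.1")
print(len(encoded))          # 5
print(from_base85(encoded))  # 127.0.0.1
```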


I remember realizing I could use that trick to get around a Blue Coat proxy back in 2004. I worked in a bank, and they had draconian web filtering. DNS lookups weren't blocked, but the web proxy blocked sites by domain name and by the IPv4 address it resolved to. However, it didn't block the IP address if you formatted it as a single 32-bit integer! Of course, that would break if there was a redirect or HTTPS, but back then many sites didn't have those.


I gotta say, I realize it's got a lot of historical weight to it, but I've always hated the "leading zero means octal" thing.

The fact that 12 and 012 mean drastically different things in many languages does not ever feel good.


What's the history behind this? I doubt there was legacy or backwards compat reasoning? Allowing for such a loose and wide interpretation makes for complicated parsing and numerous exploits.


Most of the time it's just banal shenanigans of strtoint-style conversions and of how exactly the dotted-decimal parser was written.

> for complicated parsing

Somewhat

> numerous exploits.

Nah. It's mostly localized to string-to-hostname processing, i.e. it never occurs in the network stack and happens on the user's side of things and permissions (think CLI and interpreted languages).


Not exploits of the OS, but I've used this to exploit web applications quite a bit. Tricks like these get you past a lot of input filters or validation logic. This allows me to trick these apps into making HTTP requests to internal or private IPs/hosts.

As an example, think of a cloud based web performance monitoring system. I trick it into making HTTP requests to 169.254.169.254, and I get access to data from their AWS metadata service...
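
To show why a naive string comparison against `169.254.169.254` is not enough, here is a hedged Python sketch of a filter that first normalizes the host the way the classic parsers do and only then checks the address class. The names `normalize_host` and `is_internal` are invented, the parsing rules follow the description in this thread rather than any specific library, and a real filter would also have to handle DNS names, redirects, and IPv6.

```python
import ipaddress


def normalize_host(host: str) -> ipaddress.IPv4Address:
    """Interpret 1 to 4 dot-separated numbers (decimal, 0 octal, 0x hex)."""
    parts = host.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError("not an IPv4 literal")
    numbers = []
    for part in parts:
        if not (part.isascii() and part.isalnum()):
            raise ValueError("not an IPv4 literal")
        if part.lower().startswith("0x"):
            numbers.append(int(part[2:], 16))
        elif len(part) > 1 and part[0] == "0":
            numbers.append(int(part[1:], 8))
        else:
            numbers.append(int(part, 10))
    value = 0
    for n in numbers[:-1]:
        if n > 0xFF:
            raise ValueError("octet out of range")
        value = (value << 8) | n
    # The last number covers all remaining octets.
    tail_bits = 8 * (5 - len(numbers))
    if numbers[-1] >= 1 << tail_bits:
        raise ValueError("last part out of range")
    return ipaddress.IPv4Address((value << tail_bits) | numbers[-1])


def is_internal(host: str) -> bool:
    try:
        ip = normalize_host(host)
    except ValueError:
        return False  # not an IP literal; hostnames need a separate check
    return ip.is_private or ip.is_loopback or ip.is_link_local


print(is_internal("0xA9FEA9FE"))  # True (169.254.169.254 written in hex)
print(is_internal("2852039166"))  # True (the same address as one decimal)
print(is_internal("0x08080808"))  # False (8.8.8.8)
```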


Yep. Skipped this part (not at the desktop RN), but honestly this is more about 'check what you accept' and input data validation/sanitization. Still a valid target for an exploit, but you really need a bunch of things ('web performance monitoring system') to happen before you can get any meaningful use (if at all) out of these exploits.


I don't know about the hex representation, but the binary representation is useful for figuring out CIDR ranges


Reminds me of this bug in OpenBSD: https://marc.info/?l=openbsd-bugs&m=124425026630958&w=2

Of course the reply was just a deflection, even though OpenBSD was the only buggy one.


There’s always more than one way to write an IP address, sure, but there is only one way to canonicalize them all: remember IPv4 is just storing a mere 32-bit integer. So why not just store the straight 4 bytes?


Reminds me of passing exponential notation to php scripts to bypass field validation and dump stack traces. Lots of security fun to be had in this space with libraries that everyone assumes have been fuzzed to death.


Unisys used commas in its presentation format. Major bummer in the late 80s commissioning a new library catalogue system.


Is this an artifact of ping or a standard?



inet_aton is its own thing, but at least for URLs in browsers the behavior is standardized (https://url.spec.whatwg.org/#concept-ipv4-parser).


I'm disappointed 0b doesn't work.



