When students ask: "What's an IP Address?"

As a support tutor of beginner level IT students of vastly different ages and backgrounds, I sometimes have to refresh and evaluate my own knowledge of, and approach to, a topic that arises. I find writing posts like this helpful for organising ideas on subjects that I had difficulty understanding myself, to see if I can explain or expand on things better, or a bit differently from how I learned or was taught.

I try to get a lot more across, more efficiently, so that a bigger picture emerges for a student – even if it is less detailed or less well understood in the short term – rather than forcing a parrot fashion "cop-out" approach along the lines of "you just need to accept and learn this, as it will become clearer later", and without overtaxing them or myself in the available time.

Recently I was asked: "What is an IP Address?"

A perfectly sensible question for any IT student to ask – but from a teacher's perspective, it is not so easy to answer, for many reasons.

Answering this (or anything really) mainly depends on how much available time there is and the relative knowledge level, curiosity and ability of the student.

For example, depending on the situation, do I mention the IPv4 address format related to 32 bit binary, memory and bus technology evolution?

Subnetting – why it's required?

Binary "ANDing" and 32 bit number space limitations, CIDR and – God forbid – IPv6?!

Why TCP/IP is used and not other methods, so ARPANET as the Internet's beginnings?

The most concise but least illuminating answer at this point is usually similar to:

"An Internet Protocol address is a unique numerical identifier for a computer on a given network."

This will probably be followed by a "how?" or "why?" type question.

For the "why" type, I give an analogy of the unique name/house number/street of a postal system address or similar.

For the "how" type, I mention a certain type of server (DHCP) as a special computer that hands out a unique number to a computer when it's attached to a network.

If I go down the route of explaining the format of an IP address, I may have to include the idea of a subnet. That can get messy without students getting perspective on the whole available address space of IPv4, because the number format itself does not make sense to the student at this point, in terms of why it's unique and what the dots mean. If they see an example with a /24 after it – you really have your work cut out…

To explain any of that in any meaningful, satisfying way, they need to understand the conversion of the decimal numbers to binary that multiply out the total number space, which is usually where I start, then gauge when and if eyes are starting to glaze over…

Starting with binary, I win either way – I find a student worth the time and effort, or they change the subject and/or run away.

Each of the four dot-delimited decimals in an IP address is an 8 bit (one byte) binary number, so each of the four decimal blocks can have a value between 0 and 255:

Now it can be seen that an 8 bit binary number can hold 2^8 = 256 different values, with a maximum value of (2^8 – 1) = 255, giving the range 0-255 in decimal.
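As a minimal sketch of the 8 bit range described above, the snippet below prints a few decimal values as 8 binary digits. The function name `octet_to_binary` is just an illustrative choice of mine, not anything standard.

```python
# Show one octet (one of the four dot-separated numbers) as 8 binary digits.
def octet_to_binary(value: int) -> str:
    """Return an octet (0-255) as an 8-character binary string."""
    if not 0 <= value <= 255:
        raise ValueError("an octet must be between 0 and 255")
    return format(value, "08b")


for value in (0, 10, 192, 255):
    print(value, "->", octet_to_binary(value))

print("distinct values in 8 bits:", 2 ** 8)      # 256
print("largest value in 8 bits:", 2 ** 8 - 1)    # 255
```

Trying a value such as 256 raises an error, which is a handy way to show a student why 255 is the ceiling for each block.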

Now the idea can be extended to an example IP address, so that the student can see that the total numerical address space of IPv4 (0-255.0-255.0-255.0-255) would comprise four blocks of 256 "addresses" (0-255), giving a total of 256 x 256 x 256 x 256 or 2^32 individual numbers = 4,294,967,296 unique numbers. I have to omit "network" and "host" separation concepts at this point.
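The arithmetic above can be checked in a couple of lines. This is only a sketch: `dotted_to_int` is a hypothetical helper name, and it treats the whole address as a single 32 bit number, ignoring network and host separation just as the text does at this stage.

```python
# Four blocks of 256 values each give the whole IPv4 number space.
OCTETS = 4
VALUES_PER_OCTET = 2 ** 8  # 256

total_addresses = VALUES_PER_OCTET ** OCTETS
print(total_addresses)  # 4294967296


def dotted_to_int(address: str) -> int:
    """Convert a dotted decimal address into one 32 bit number."""
    result = 0
    for part in address.split("."):
        result = result * 256 + int(part)
    return result


print(dotted_to_int("0.0.0.0"))          # 0
print(dotted_to_int("255.255.255.255"))  # 4294967295
```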

This enables a student to realise that, theoretically, there is a maximum number of computers that could be connected to the same theoretical network, and no more. This is a major technical shortcoming of IPv4, as the original designers never foresaw the explosion in Internet popularity that has occurred, which would require more address IDs than this 32 bit space (bit weights 2^0 to 2^31) allows. They probably didn't foresee 64 bit computers evolving so soon after about 1998 either?

As there are now more than 4.3 billion physical computers/phones on the planet that may need to connect to the Internet at the same time, they would not all get a unique ID if the Internet really was one big network. Instead, they are sub-divided into smaller, locally manageable sections, or "sub-networks", that use various clever methods to "extend" the seemingly finite available address space of 2^32 unique numbers.

The increase in Internet usage toward this 32 bit maximum number is shown nicely below – with about 1 billion IPv4 numbers left available:



World Internet Users and 2015 Population Stats
JUNE 30, 2015 – Mid-Year Update

World Regions                 (% Population)    Users % of Table
[not shown]                   27.0 %            9.6 %
[not shown]                   38.8 %            47.8 %
[not shown]                   73.5 %            18.5 %
Middle East                   49.0 %            3.5 %
North America                 87.9 %            9.6 %
Latin America / Caribbean     53.9 %            10.2 %
Oceania / Australia           72.9 %            0.8 %
[not shown]                   45.0 %            100.0 %

NOTES: (1) Internet Usage and World Population Statistics are preliminary for June 30, 2015.



Or as a live clock, here:


Up to this point, the student is usually still interested, as they have a much greater idea of the scale, history and workings of the Internet that they use every day, with a greater appreciation of some of the technical aspects required for its operation.

I have covered all that as perspective by only introducing 32 bit binary maths and IPv4 number space concepts, at about their most fundamental level that may suffice to give an overall idea of function to the student who asked that seemingly innocent initial question: "What is an IP Address?"

If they are still having trouble visualising the massive 32 bit number, and the amount of networked computers it represents in theory, the idea can be simplified by constraining the available numbers, one bit at a time, as if the computer was an early pre 1980s model with only a 4 bit memory. It can then only hold a maximum number range of 0-15, with which to identify and remember 16 computers including itself.

0000 – 1111 = 0-15 decimal

Now they should be able to grasp that the more memory a computer has, bit by bit, the more exponential capacity it has to store larger numbers, up to the seemingly massive 4.3 or so billion of a 32 bit memory space.
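To make the bit-by-bit growth concrete, here is a small sketch that prints the number range for a few memory widths. The function name `value_range` is an assumption of mine for illustration only.

```python
# Each extra bit doubles the count of numbers that can be held.
def value_range(bits: int) -> tuple[int, int]:
    """Return the lowest and highest unsigned number that fits in `bits` bits."""
    return 0, 2 ** bits - 1


for bits in (1, 2, 4, 8, 16, 32):
    low, high = value_range(bits)
    print(f"{bits:>2} bits: {low} to {high} ({2 ** bits} values)")
```

The 4 bit line gives 0 to 15 (16 values), and the 32 bit line gives 0 to 4294967295, the figure used for IPv4 above.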

If they don't understand this far, you'll know there is little point continuing with any more on the subject without frying both your and their brains, so refer them to simpler topics to study first, and/or a real world example of 8 bit memory limitations, as in computers circa 1980s:

Hopefully they have understood, but now they may ask "so what is the subnet mask for?" that they may have seen in a Windows dialogue box…

Having explained the limits of IPv4 as a finite number space, I approach the topic of masks as a numerical "filter" that separates groups of binary numbers by their powers of 2, as in the table above, so that they can be more easily organised as larger and larger groups. This relates, in a similar way, to easier administration of larger and larger groups of computers on given networks.

This can be achieved using a mask, because all 1s and 0s in a computer are held electronically, and there is an electronic circuit called an AND Gate that can have 2 inputs and 1 output that gives a binary digit 1 output ONLY if both the inputs A and B have a binary input 1 applied to them. All other combinations generate a 0 output. This circuit can act as a digital "sieve".
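The behaviour of the AND gate described above can be sketched with Python's bitwise `&` operator, which does the same thing on single bits. The name `and_gate` is just a label for this example.

```python
# A software stand-in for a 2-input AND gate.
def and_gate(a: int, b: int) -> int:
    """Return 1 only if both inputs are 1, otherwise 0."""
    return a & b


print("A B | Output")
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "|", and_gate(a, b))
```

Only the last row (A = 1, B = 1) gives an output of 1, which is the "sieve" effect the mask relies on.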

This has the effect of separating binary number ranges, which means computers can be separated from each other into smaller, more manageable and physically separated groups on a shared, connected medium like a co-axial cable or wifi unit, before other technical, physical, local power consumption, and electronic complications arise.


This device can separate a binary number by its component powers or columns in the table above, if used as a "mask" against the binary version of an IP address, depending on what inputs it has, where input A may be the original binary value of an IP address, and input B is the mask value.

For a simple example using the last 4 binary digits of the number 0 with a mask of 0: if you "AND" the binary number 0000 applied to input A with the same at B, you get binary 0 as the output of A AND B in the diagram above. The number input at A is the same as the output.

Laid out like an addition sum (but ANDing each column instead of adding):

A 0000
B 0000 AND
  ----
  0000



If you AND an IP address value of 1111 or decimal 15 to a mask of 1111, you ALSO get the same number at the output as was input at A, as per the "TRUTH" table in the graphic:





Again, the original IP address as input A is unchanged.

For a mixed case, where a 0 or 1 value of the IP address is ANDed with its opposite value in the B mask, you get a 0 output.





These are interesting observations, because they show that the mask decides the output column by column: a mask of all 0s and a mask of all 1s represent the two extremes for a given number range.

Where a mask bit is 1, the address bit goes through unchanged; where a mask bit is 0, the output is always 0.

This means IP addresses can be identified and separated by their binary block groups.

It may now be seen from that mixed case that, for all the numbers in the IPv4 address space of 2^32 (0 – 4,294,967,295), which in 32 bit binary are:

00000000.00000000.00000000.00000000 to 11111111.11111111.11111111.11111111

a variable mask could be used as an operator across the whole IP number range to identify and "sieve" blocks of individual powers of 2, so logically separating them into groups, with each computer retaining a unique number in that range.

Any mask of binary 1 at a particular column space retains any 0 or 1 in the input at its original value.

Any mask of binary 0 at a particular column space converts any 0 or 1 in the input to 0.
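The two rules above can be tried on a 4 bit number. This is only a sketch with a made-up helper, `apply_mask`, and an arbitrary input of 1010.

```python
# AND a 4 bit value with a 4 bit mask, column by column.
def apply_mask(value: int, mask: int, width: int = 4) -> str:
    """Return value AND mask as a zero-padded binary string."""
    return format(value & mask, f"0{width}b")


print(apply_mask(0b1010, 0b1111))  # 1010: every column kept
print(apply_mask(0b1010, 0b0000))  # 0000: every column cleared
print(apply_mask(0b1010, 0b1100))  # 1000: left two kept, right two cleared
```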

For a real mask described by a value like "/24" after the IP address, it means that the first 24 bits (from left to right) of the total 32 bits of the whole IP address have been "masked" by 24 binary digits of value 1:

11111111.11111111.11111111.00000000 mask

Looking at the bytes of a real example, 192.168.1.10, in binary:

11000000. 10101000. 00000001. 00001010

(192). (168). (1). (8+2 = 10)

If the mask is now applied to that address and ANDing performed:

1100 0000. 1010 1000. 0000 0001. 0000 1010 Host Address 10

1111 1111. 1111 1111. 1111 1111. 0000 0000 mask


1100 0000. 1010 1000. 0000 0001. 0000 0000 "sieved" range representing values 0-255

(192). (168). (1). (0)

|< /24 bit mask >|

It can be seen that where the mask values were 0, the range of "sieved" numbers after the 24th bit have also been changed to 0, even though the IP address had a value of 00001010 or 10 decimal.

The mask has the effect of splitting the IP address into two component parts, called the network address and the host address ranges.

This identifies the single IP address as being one part of the 192.168.1.(0-255) addresses.
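The /24 example above can be reproduced with plain integer arithmetic. The helper names (`to_int`, `to_dotted`, `prefix_to_mask`) are my own for this sketch, and it deliberately avoids any networking library so the ANDing stays visible.

```python
# Apply a /24 mask to 192.168.1.10 using bitwise AND.
def to_int(address: str) -> int:
    """Convert dotted decimal to a single 32 bit number."""
    result = 0
    for part in address.split("."):
        result = (result << 8) | int(part)
    return result


def to_dotted(number: int) -> str:
    """Convert a 32 bit number back to dotted decimal."""
    return ".".join(str((number >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def prefix_to_mask(prefix: int) -> int:
    """Build a mask with `prefix` leading 1 bits, e.g. 24 -> 255.255.255.0."""
    return (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF


address = to_int("192.168.1.10")
mask = prefix_to_mask(24)
network = address & mask              # the "sieved" network part
host = address & ~mask & 0xFFFFFFFF   # what the mask removed

print(to_dotted(mask))     # 255.255.255.0
print(to_dotted(network))  # 192.168.1.0
print(host)                # 10
```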

One result of this on a real network is that a computer would be able to communicate directly (without a router) only with other computers whose addresses are numbered in the range 0-255 and that have the same network prefix 192.168.1.xxx, but no others, such as those with a prefix 192.168.2.xxx or 4.4.xxx.xxx etc.

In terms of network administration, this "192.168.1.0/24" network could host individual computers numbered 0-255, which would be on the same "subnet" and so able to communicate, if all had the same 192.168.1 prefix with a /24 mask.
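As a rough illustration of the "same prefix" test, Python's standard `ipaddress` module can work out the network part for us. The function `same_subnet` is a hypothetical helper, and the second and third addresses are made up for the example.

```python
import ipaddress


def same_subnet(a: str, b: str, prefix: int) -> bool:
    """Return True if both addresses have the same network part under the mask."""
    net_a = ipaddress.ip_interface(f"{a}/{prefix}").network
    net_b = ipaddress.ip_interface(f"{b}/{prefix}").network
    return net_a == net_b


print(same_subnet("192.168.1.10", "192.168.1.200", 24))  # True
print(same_subnet("192.168.1.10", "192.168.2.10", 24))   # False
```

A real computer makes much the same comparison to decide whether to talk to another machine directly or hand the traffic to a router.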

As another example, that network above could be further sub-divided by using a longer mask, say /25, that would now halve the host addresses available for the computers.

An extra bit would be available to the network portion of the address space, and one less for host space.
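The halving effect of each extra mask bit can be sketched as below. Like the text at this point, it counts every number in the range and ignores reserved addresses; `addresses_in_prefix` is an illustrative name only.

```python
# Every bit moved from the host part to the network part halves the host range.
def addresses_in_prefix(prefix: int) -> int:
    """Return how many numbers fit in the host part for a given mask length."""
    if not 0 <= prefix <= 32:
        raise ValueError("an IPv4 mask length must be between 0 and 32")
    return 2 ** (32 - prefix)


for prefix in (24, 25, 26, 27):
    print(f"/{prefix}: {addresses_in_prefix(prefix)} addresses")
```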

Subnetwork 1

11000000. 10101000. 00000001. 00001010

(192). (168). (1). (8+2 = 10)

If the mask is now applied to that address and ANDing performed:

1100 0000. 1010 1000. 0000 0001. 0000 1010 Host Address 10

1111 1111. 1111 1111. 1111 1111. 1000 0000 mask


1100 0000. 1010 1000. 0000 0001. 0000 0000 "sieved" range representing values 0-127

(192). (168). (1). (0)

|< /25 bit mask >|

Subnetwork 2

11000000. 10101000. 00000001. 10001010

(192). (168). (1). (128+8+2 = 138)

If the mask is now applied to that address and ANDing performed:

1100 0000. 1010 1000. 0000 0001. 1000 1010 Address 138

1111 1111. 1111 1111. 1111 1111. 1000 0000 mask


1100 0000. 1010 1000. 0000 0001. 1000 0000 "sieved" range representing values 128-255

(192). (168). (1). (128)

|< /25 bit mask >|

The result is that the network is split in two to become 2 separate networks, 192.168.1.0/25 and 192.168.1.128/25, that can each host half as many computers as before the split.

The two new networks hold host numbers 0-127 and 128-255 respectively.
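The same split can be sketched with the standard `ipaddress` module, which can divide a network by adding one bit to the mask. The helper `which_half` is hypothetical and only there to show where the two example addresses land.

```python
import ipaddress

# Split the /24 into two /25 networks by lengthening the mask by one bit.
network = ipaddress.ip_network("192.168.1.0/24")
halves = list(network.subnets(prefixlen_diff=1))

for half in halves:
    print(half, "covers", half[0], "to", half[-1], f"({half.num_addresses} numbers)")


def which_half(address: str) -> str:
    """Return the /25 network that contains the given address."""
    ip = ipaddress.ip_address(address)
    for half in halves:
        if ip in half:
            return str(half)
    raise ValueError("address is outside 192.168.1.0/24")


print(which_half("192.168.1.10"))   # 192.168.1.0/25
print(which_half("192.168.1.138"))  # 192.168.1.128/25
```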

Now you may see the pattern emerging from mask usage:



You can read my summary of this pattern and my personal questions for its practical consequences under certain conditions here:


(It also includes some notes from my original article. Ignore the technicalities of "actual devices" and "broadcast addresses" at this point, and just try to understand the idea of subnetting by powers of 2.)

A general system of segregating the total IPv4 number space was historically applied by the relevant official controlling bodies, by classifying particular number ranges into 5 large "classful" sections: A, B, C, D, E.

These ranges and general use allocation are:




Class                Leading bits  Size of network bit field  Size of rest bit field  Number of networks  Addresses per network  Total addresses in class  Start address  End address
Class A              0             8 (mask)                   24                      128 (2^7)           16,777,216 (2^24)      2,147,483,648 (2^31)      0.0.0.0        127.255.255.255
Class B              10            16 (mask)                  16                      16,384 (2^14)       65,536 (2^16)          1,073,741,824 (2^30)      128.0.0.0      191.255.255.255
Class C              110           24 (mask)                  8                       2,097,152 (2^21)    256 (2^8)              536,870,912 (2^29)        192.0.0.0      223.255.255.255
Class D (multicast)  1110          not defined                not defined             not defined         not defined            268,435,456 (2^28)        224.0.0.0      239.255.255.255
Class E (reserved)   1111          not defined                not defined             not defined         not defined            268,435,456 (2^28)        240.0.0.0      255.255.255.255

You can see the similarity of network separation by binary power above, as used with masks earlier for classless networks, but for classful networks, binary power separation is by the first four bits of the 32 bit IPv4 address space.

It can be seen from the above mask example that this categorisation of A, B, C etc. is mainly redundant from an internal (non directly Internet connected) network viewpoint, as subnetting can be used to split networks in almost any way that suits an administrator, provided the legal internal addresses don't leak out onto the Internet. This subnetting is usually termed CIDR, or Classless Inter-Domain Routing, where the mask values don't have to be fixed to only those of /8, /16 or /24 that relate to classes A, B and C.

For the last aspect of ID "uniqueness" on a network – if a student is aware of a MAC address, but is still curious, I usually bail out as a tutor at this point, with an "I haven't time to cover that right now" (genuinely!) and refer them to Computerphile videos, such as:
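As a sketch of how the leading bits decide the class, the function below looks only at the first byte of an address. The name `address_class` and the sample addresses are my own choices for illustration.

```python
# Work out the historical class of an address from its leading bits.
def address_class(address: str) -> str:
    """Return 'A' to 'E' based on the first bits of the first byte."""
    first_octet = int(address.split(".")[0])
    bits = format(first_octet, "08b")
    if bits.startswith("0"):
        return "A"
    if bits.startswith("10"):
        return "B"
    if bits.startswith("110"):
        return "C"
    if bits.startswith("1110"):
        return "D"
    return "E"  # the only remaining pattern is 1111


for sample in ("10.0.0.1", "172.16.0.1", "192.168.1.10", "224.0.0.1", "240.0.0.1"):
    print(sample, "-> class", address_class(sample))
```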