IPv4 Components in APL
At a recent APL-focussed Meetup someone posed a challenge: slice up the components of an IPv4 address using an APL-family language. It prompted me to learn a bit more about how IPv4 addressing works in general, and how I could do the processing in APL myself.
The person who posed the challenge had approached it themselves using J, which I’m only vaguely familiar with, so it gave me an opportunity to learn a bit more about it. It’s not all that dissimilar from the Dyalog APL I know a bit better; it uses standard ASCII input with many of the same ideas. Take, for example, determining whether a year is a leap year, as I explored recently.
In Dyalog APL:
          ⍝ Dyalog APL
          leapyear ← {(80∨⍵) > 50∨⍵}
          years ← 1890+⍳30
          (leapyear years) ⌿ years
    1892 1896 1904 1908 1912 1916 1920

          ⍝ or tacit
          leapyear ← 80∘∨ > 50∘∨
compared to J:
       NB. J
       leapyear =: {{ (80 +. y) > (50 +. y) }}
       years =: 1890 + i.31
       (leapyear years) # years
    1892 1896 1904 1908 1912 1916 1920

       NB. or tacit
       leapyear =: 80&+. > 50&+.
It’s fairly straightforward to see the correlation between these two.
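For readers who know neither language, the same comparison of two greatest common divisors can be sketched in Python. This is only a cross-check and not part of the Meetup solutions; the function name `is_leap` is made up for illustration.

```python
import calendar
from math import gcd


def is_leap(year: int) -> bool:
    # Same idea as the APL and J one-liners: compare two GCDs.
    return gcd(80, year) > gcd(50, year)


# Same span of years as the J example (1890 to 1920 inclusive).
leap_years = [y for y in range(1890, 1921) if is_leap(y)]
print(leap_years)  # [1892, 1896, 1904, 1908, 1912, 1916, 1920]

# Cross-check the trick against the standard library.
assert all(is_leap(y) == calendar.isleap(y) for y in range(1, 4001))
```

Note that 1900 is correctly left out: it is divisible by 100 but not by 400.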
I don’t think we worked through the J solution to slicing up the components of an IPv4 address, but I did have a go during the meeting at a Dyalog APL solution, which we walked through and I’ve since improved.
The problem as posed was – given an IPv4 address, e.g. ‘192.0.2.63’ and a subnet mask in CIDR notation (e.g. /24), can we identify the different networking components?
This is a neat problem because it potentially involves arrays – maybe we should start with what this means. I’m no expert in this myself, but explaining things is a great way to learn more about them, so feel free to correct me at any point.
I started with this guide which I know already has a mistake in one of the graphics – can you find it?
An IPv4 address consists of four octets separated by dots, each number representing 8 bits (hence ‘octet’), which in binary means 8 1s or 0s for a maximum value of 255

    192 = 11000000 = 1x(2^7) + 1x(2^6) + 0x(2^5) + 0x(2^4) + ... = 128 + 64 + 0

so we have four of these sets of 8 binary values that represent an address.
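The place-value arithmetic for a single octet can be checked with a few lines of Python. This is a small illustrative sketch; the variable names are arbitrary.

```python
octet = 192

# Zero-padded 8-bit binary representation of the octet.
bits = format(octet, "08b")
print(bits)  # 11000000

# Place values run from 2^7 (leftmost bit) down to 2^0 (rightmost bit).
place_values = [2 ** power for power in range(7, -1, -1)]
print(place_values)  # [128, 64, 32, 16, 8, 4, 2, 1]

# Summing the place values of the set bits recovers the octet.
total = sum(int(bit) * value for bit, value in zip(bits, place_values))
print(total)  # 192

# With all eight bits set we get the largest value an octet can hold.
print(sum(place_values))  # 255
```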
The subnet mask is described by the CIDR block and it essentially represents how many 1s are at the start of the 32-bit mask, so if the mask is ‘255.0.0.0’ then it would be

    11111111 00000000 00000000 00000000

which is 8 1s, so it would be /8. Similarly /26 would have 26 1s and converting from binary to decimal would represent a mask of ‘255.255.255.192’.
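Going in the direction just described, from a dotted mask to its CIDR number, amounts to counting leading 1s. A minimal Python sketch, where the name `prefix_length` and the contiguity check are illustrative choices rather than anything from the original solution:

```python
def prefix_length(mask: str) -> int:
    """Count the leading 1 bits of a dotted-decimal subnet mask."""
    bits = "".join(format(int(octet), "08b") for octet in mask.split("."))
    ones = len(bits) - len(bits.lstrip("1"))
    # A valid mask is a run of 1s followed only by 0s.
    if "1" in bits[ones:]:
        raise ValueError(f"not a contiguous mask: {mask}")
    return ones


print(prefix_length("255.0.0.0"))        # 8
print(prefix_length("255.255.255.192"))  # 26
```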
So, given an address and a CIDR block, what is the mask?
First, we need to convert our address from a string to an array of binary digits. One way to partition a string at a character in APL is
          '.'(≠⊆⊢)'192.0.2.63'
    192 0 2 63

and we can convert this array of strings to numbers with ‘eval’ (APL’s ‘execute’, ⍎) applied to each string

          ⍎¨'.'(≠⊆⊢)'192.0.2.63'
    192 0 2 63
Converting to binary in APL is as easy as ‘encode’ (⊤) with a radix of 8 2s

          2 2 2 2 2 2 2 2 ⊤ ⍎¨'.'(≠⊆⊢)'192.0.2.63'
    1 0 0 0
    1 0 0 0
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 1 1
    0 0 0 1
but of course we can write all those 2s with either (8⍴2) (‘reshape’ a value of 2 to length 8) or (8/2) (‘repeat’ 2 8 times) so

          (8⍴2)⊤⍎¨'.'(≠⊆⊢)'192.0.2.63'
    1 0 0 0
    1 0 0 0
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 1 1
    0 0 0 1
That gives us the binary sequences for each of the octets as columns of an array. It’s a lot to type out each time, though, so we can create a function that takes a right argument
          asbin←{(8/2)⊤⍎¨'.'(≠⊆⊢)⍵}
          asbin '192.0.2.63'
    1 0 0 0
    1 0 0 0
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 1 1
    0 0 0 1
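To make the column layout concrete, here is a rough Python counterpart of asbin. It is a sketch only: `as_bits` is a made-up name, and it builds one row per octet first, then transposes to get the column-per-octet shape that APL’s encode produces.

```python
def as_bits(address: str) -> list[list[int]]:
    """Return one row of 8 bits per octet, most significant bit first."""
    octets = [int(part) for part in address.split(".")]
    return [[(octet >> shift) & 1 for shift in range(7, -1, -1)]
            for octet in octets]


rows = as_bits("192.0.2.63")
for row in rows:
    print(row)

# Transposing the 4x8 rows gives the 8x4 layout, one column per octet.
columns = [list(column) for column in zip(*rows)]
for line in columns:
    print(*line)
```

The second loop prints the same eight lines of four bits as the asbin output.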
Of course, if we want to go back the other way and see this as an IP address made of octets, we can ‘paste’ the values (converted back to integers) together with dots between them with
          asoct←2∘⊥
          asip←{∊(⍕¨⍵),¨'.' '.' '.' ''}

The first of these creates a “curried” (partially applied) ‘decode’ with radix 2, while the second ‘format’s the values in the specified pattern, so

          asoct asbin '192.0.2.63'
    192 0 2 63
          asip asoct asbin '192.0.2.63'
    192.0.2.63
Cool, we can round-trip this.
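The “partially applied decode” idea translates fairly directly to other languages. In this Python sketch, `decode`, `from_binary`, `as_bits` and `as_ip` are all illustrative names; `decode` evaluates digits in a given radix and `functools.partial` fixes the radix to 2, loosely mirroring 2∘⊥.

```python
from functools import partial, reduce


def decode(radix, digits):
    """Evaluate digits (most significant first) in the given radix."""
    return reduce(lambda acc, digit: acc * radix + digit, digits, 0)


# Fix the radix to 2, in the spirit of the curried decode.
from_binary = partial(decode, 2)


def as_bits(address):
    """One row of 8 bits per octet, most significant bit first."""
    return [[(int(part) >> shift) & 1 for shift in range(7, -1, -1)]
            for part in address.split(".")]


def as_ip(octets):
    """Join integer octets back into dotted notation."""
    return ".".join(str(octet) for octet in octets)


octets = [from_binary(row) for row in as_bits("192.0.2.63")]
print(octets)         # [192, 0, 2, 63]
print(as_ip(octets))  # 192.0.2.63
```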
The subnet mask is a series of 1s filled to 32 values with 0s which we can write as

          mask←{⍉4 8 ⍴ 1=⍸⍵ 0 (32-⍵)}

which creates a 4×8 array of values filled with the right number of 1s, then transposes it into the same 8×4 layout (one column per octet) that asbin produces

          mask 26
    1 1 1 1
    1 1 1 1
    1 1 1 0
    1 1 1 0
    1 1 1 0
    1 1 1 0
    1 1 1 0
    1 1 1 0
We can view this subnet mask with this new function, too
          asoct mask 26
    255 255 255 192
          asip asoct mask 26
    255.255.255.192
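The same construction (a run of 1s padded to 32 bits with 0s, then read back eight bits at a time) can be sketched in Python. The names `mask_bits` and `subnet_mask` are hypothetical and only loosely follow mask and smask.

```python
def mask_bits(prefix: int) -> list[int]:
    """A run of `prefix` ones padded with zeros to 32 bits."""
    return [1] * prefix + [0] * (32 - prefix)


def subnet_mask(prefix: int) -> str:
    """Dotted-decimal subnet mask for a CIDR prefix length."""
    bits = mask_bits(prefix)
    # Read the 32 bits back as four 8-bit numbers.
    octets = [int("".join(map(str, bits[i:i + 8])), 2)
              for i in range(0, 32, 8)]
    return ".".join(map(str, octets))


print(subnet_mask(8))   # 255.0.0.0
print(subnet_mask(26))  # 255.255.255.192
```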
The ‘network address’ for this address is found by a bitwise AND between this mask and the IP address, and APL has a builtin ‘and’
          (mask 26) ∧ asbin '192.0.2.63'
    1 0 0 0
    1 0 0 0
    0 0 0 0
    0 0 0 0
    0 0 0 0
    0 0 0 0
    0 0 1 0
    0 0 0 0
          asip asoct (mask 26) ∧ asbin '192.0.2.63'
    192.0.2.0
The ‘broadcast address’ is found by a bitwise OR between the inverse of the mask and the IP address
          (~mask 26) ∨ asbin '192.0.2.63'
    1 0 0 0
    1 0 0 0
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 1 1
    0 0 0 1
          asip asoct (~mask 26) ∨ asbin '192.0.2.63'
    192.0.2.63
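Outside of array languages, the same AND and OR steps are often applied to a single 32-bit integer instead of an 8×4 bit matrix. A minimal Python sketch under that assumption, with helper names (`to_int`, `to_ip`, `network_and_broadcast`) invented for illustration:

```python
def to_int(address: str) -> int:
    """Pack four dotted octets into one 32-bit integer."""
    a, b, c, d = (int(part) for part in address.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def to_ip(value: int) -> str:
    """Unpack a 32-bit integer into dotted notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def network_and_broadcast(address: str, prefix: int) -> tuple[str, str]:
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    ip = to_int(address)
    network = ip & mask                        # AND with the mask
    broadcast = ip | ((~mask) & 0xFFFFFFFF)    # OR with the inverted mask
    return to_ip(network), to_ip(broadcast)


print(network_and_broadcast("192.0.2.63", 26))
# ('192.0.2.0', '192.0.2.63')
```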
Looking at what we have so far, we can write out some functions
          asip←{∊(⍕¨⍵),¨'.' '.' '.' ''}
          asoct←2∘⊥
          mask←{⍉4 8 ⍴ 1=⍸⍵ 0 (32-⍵)}
          smask←{asip asoct mask ⍵}
          asbin←{(8/2)⊤⍎¨'.'(≠⊆⊢)⍵}
          netaddr←{asip asoct (mask ⍺) ∧ asbin ⍵}
          bcast←{asip asoct (~mask ⍺) ∨ asbin ⍵}
and try these on some different addresses
          ip←'192.168.0.1'
          smask 8
    255.0.0.0
          26 netaddr ip
    192.168.0.0
          26 bcast ip
    192.168.0.63

          ip←'142.250.70.174'
          16 netaddr ip
    142.250.0.0
          16 bcast ip
    142.250.255.255
Cool!
We could also calculate the number of hosts that can be assigned, since that’s just 2 to the power of the number of host bits (non-network bits), minus 2 for the network and broadcast addresses, which can’t be assigned to hosts

          nhosts←{¯2+2*(32-⍵)}
          nhosts 26
    62
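The host-count formula is easy to tabulate for a few common prefixes. This Python sketch applies exactly the same arithmetic as nhosts; the function name `n_hosts` is illustrative.

```python
def n_hosts(prefix: int) -> int:
    """Usable hosts: 2^(host bits) minus network and broadcast addresses."""
    return 2 ** (32 - prefix) - 2


for prefix in (8, 16, 24, 26):
    print(prefix, n_hosts(prefix))
# 8 16777214
# 16 65534
# 24 254
# 26 62
```

One thing the plain formula does not handle: for /31 and /32 it gives 0 and -1, so those prefixes would need special treatment in anything beyond a toy.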
We could list the entire range of host IPs, except that we need to step past the network address at the bottom and stop short of the broadcast address at the top. Time to make some utilities

          netutil←{asoct (mask ⍺) ∧ asbin ⍵}
          butil←{asoct (~mask ⍺) ∨ asbin ⍵}
          bcast1←{x←⍺ butil ⍵ ⋄ x[4]←x[4]-1 ⋄ asip x}
          netaddr1←{x←⍺ netutil ⍵ ⋄ x[4]←x[4]+1 ⋄ asip x}
          iprange←{n←⍺ netaddr1 ⍵ ⋄ b←⍺ bcast1 ⍵ ⋄ n,'-',b}
          26 iprange '192.0.2.63'
    192.0.2.1-192.0.2.62
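As a cross-check on iprange, the first and last usable hosts can also be computed by treating the address as one 32-bit integer and adding or subtracting 1 from the whole number, rather than adjusting only the fourth octet. This is a Python sketch with invented helper names, not a translation of the APL.

```python
def to_int(address: str) -> int:
    """Pack four dotted octets into one 32-bit integer."""
    a, b, c, d = (int(part) for part in address.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def to_ip(value: int) -> str:
    """Unpack a 32-bit integer into dotted notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def host_range(address: str, prefix: int) -> str:
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    ip = to_int(address)
    first = (ip & mask) + 1                      # one above the network address
    last = (ip | ((~mask) & 0xFFFFFFFF)) - 1     # one below the broadcast address
    return f"{to_ip(first)}-{to_ip(last)}"


print(host_range("192.0.2.63", 26))      # 192.0.2.1-192.0.2.62
print(host_range("142.250.70.174", 26))  # 142.250.70.129-142.250.70.190
```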
That seems like a good set of utilities – and a great opportunity to learn about how Dyalog APL packages up things into namespaces. One way is to write the functions to a file, say SubnetCalc.dyalog as

    :Namespace SubnetCalc
        asip←{∊(⍕¨⍵),¨'.' '.' '.' ''}
        asoct←{2⊥⍵}
        mask←{⍉4 8 ⍴ 1=⍸⍵ 0 (32-⍵)}
        nhosts←{¯2+2*(32-⍵)}
        smask←{asip asoct mask ⍵}
        asbin←{(8/2)⊤⍎¨'.'(≠⊆⊢)⍵}
        netutil←{asoct (mask ⍺)∧asbin ⍵}
        netaddr←{asip ⍺ netutil ⍵}
        netaddr1←{x←⍺ netutil ⍵ ⋄ x[4]←x[4]+1 ⋄ asip x}
        butil←{asoct (~mask ⍺)∨asbin ⍵}
        bcast←{asip ⍺ butil ⍵}
        bcast1←{x←⍺ butil ⍵ ⋄ x[4]←x[4]-1 ⋄ asip x}
        iprange←{n←⍺ netaddr1 ⍵ ⋄ b←⍺ bcast1 ⍵ ⋄ n,'-',b}
    :EndNamespace
(noting that I needed to use explicit dfns rather than just tacit definitions, e.g. asoct←{2⊥⍵} instead of asoct←2∘⊥) then load that into the RIDE editor session with
          ⎕FIX 'file:///path/to/project/SubnetCalc.dyalog'
and give it a shorter name, if desired
          'ip' ⎕NS SubnetCalc
Now I can call my functions even faster
          ip.smask 26
    255.255.255.192
          google←'142.250.70.174'
          26 ip.iprange google
    142.250.70.129-142.250.70.190
Dyalog has recently announced a proper package infrastructure, Tatin, which might come as a surprise to those more familiar with newer languages, but it’s actually one of the first package ecosystems for an APL language that I know of. I want to figure out whether my ‘toy’ package is too simplistic to be shared, or if it’s worth learning those ropes. At the moment all the packages in that system are internally sourced, but presumably that would open up to external users once it’s stabilised.
All of this was a lot of fun and I learned a lot. How would I go about this in another language? Well, there’s almost always an R package for something, and sure enough there’s an {ipaddress} package on CRAN that has all of this functionality plus more, though it does seem to rely on compiling some C++ code to do it.
    library(ipaddress)

    ip <- ip_address("192.0.2.44")
    ip_to_binary(ip) # c.f. asbin
    ## [1] "11000000000000000000001000101100"

    ipn <- ip_network("192.0.2.0/26")
    prefix_length(ipn) # the CIDR prefix itself; nhosts 26 would give 62
    ## [1] 26

    network_address(ipn) # c.f. netaddr
    ## <ip_address[1]>
    ## [1] 192.0.2.0

    broadcast_address(ipn) # c.f. bcast
    ## <ip_address[1]>
    ## [1] 192.0.2.63

    netmask(ipn) # c.f. smask
    ## <ip_address[1]>
    ## [1] 255.255.255.192

    hostmask(ipn)
    ## <ip_address[1]>
    ## [1] 0.0.0.63

    range(hosts(ipn)) # c.f. iprange
    ## <ip_address[2]>
    ## [1] 192.0.2.1  192.0.2.62

    is_within(ip, ipn)
    ## [1] TRUE
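R is not the only language with this built in. Python ships an `ipaddress` module in its standard library, and a rough equivalent of the calls above looks like this (a sketch for comparison; the variable names are arbitrary).

```python
import ipaddress

# An interface is an address plus its prefix; .network drops the host bits.
iface = ipaddress.ip_interface("192.0.2.63/26")
net = iface.network

print(net.network_address)    # 192.0.2.0        c.f. netaddr
print(net.broadcast_address)  # 192.0.2.63       c.f. bcast
print(net.netmask)            # 255.255.255.192  c.f. smask
print(net.hostmask)           # 0.0.0.63
print(net.prefixlen)          # 26
print(net.num_addresses - 2)  # 62               c.f. nhosts

hosts = list(net.hosts())
print(hosts[0], hosts[-1])    # 192.0.2.1 192.0.2.62  c.f. iprange

print(ipaddress.ip_address("192.0.2.44") in net)  # True
```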
One of the advantages of the APL approach, I feel, is that you can see exactly what the function is doing – often there’s no point naming a function because any useful name you might give it typically has more characters than the actual implementation. Digging into this package even slightly, it’s not immediately obvious where the processing is happening. I sometimes worry that we add too many layers to higher and higher abstractions, though I appreciate that sometimes gets us a lot of benefit.
I wouldn’t use my APL code in production – it has no checks or error handling, but building these helped me really understand what’s going on between all the ones and zeroes.
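To give a feel for what “checks” might mean here, this is a sketch of the sort of input validation a sturdier version could start with, written in Python. The function names and the exact rules (four parts, plain digits, 0 to 255, prefix 0 to 32) are assumptions for illustration, not a complete validator.

```python
def parse_ipv4(address: str) -> list[int]:
    """Split a dotted IPv4 string into four integers, or raise ValueError."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 octets, got {len(parts)}: {address!r}")
    octets = []
    for part in parts:
        # Reject empty parts, signs, spaces and non-ASCII digits.
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"octet is not a plain number: {part!r}")
        value = int(part)
        if value > 255:
            raise ValueError(f"octet out of range 0-255: {value}")
        octets.append(value)
    return octets


def check_prefix(prefix: int) -> int:
    """Return the prefix unchanged if it is a valid IPv4 CIDR length."""
    if not 0 <= prefix <= 32:
        raise ValueError(f"prefix out of range 0-32: {prefix}")
    return prefix


print(parse_ipv4("192.0.2.63"))  # [192, 0, 2, 63]
print(check_prefix(26))          # 26
```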
If you have comments, suggestions, or improvements, as always, feel free to use the comment section below, or hit me up on Mastodon.
    devtools::session_info()
    ## ─ Session info ───────────────────────────────────────────────────────────────
    ##  setting  value
    ##  version  R version 4.3.3 (2024-02-29)
    ##  os       Pop!_OS 22.04 LTS
    ##  system   x86_64, linux-gnu
    ##  ui       X11
    ##  language (EN)
    ##  collate  en_AU.UTF-8
    ##  ctype    en_AU.UTF-8
    ##  tz       Australia/Adelaide
    ##  date     2024-08-22
    ##  pandoc   3.2 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown)
    ##
    ## ─ Packages ───────────────────────────────────────────────────────────────────
    ##  package     * version date (UTC) lib source
    ##  blogdown      1.19    2024-02-01 [1] CRAN (R 4.3.3)
    ##  bookdown      0.36    2023-10-16 [1] CRAN (R 4.3.2)
    ##  bslib         0.6.1   2023-11-28 [3] CRAN (R 4.3.2)
    ##  cachem        1.0.8   2023-05-01 [3] CRAN (R 4.3.0)
    ##  callr         3.7.3   2022-11-02 [3] CRAN (R 4.2.2)
    ##  cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.3)
    ##  crayon        1.5.2   2022-09-29 [3] CRAN (R 4.2.1)
    ##  devtools      2.4.5   2022-10-11 [1] CRAN (R 4.3.2)
    ##  digest        0.6.34  2024-01-11 [3] CRAN (R 4.3.2)
    ##  ellipsis      0.3.2   2021-04-29 [3] CRAN (R 4.1.1)
    ##  evaluate      0.23    2023-11-01 [3] CRAN (R 4.3.2)
    ##  fastmap       1.1.1   2023-02-24 [3] CRAN (R 4.2.2)
    ##  fs            1.6.3   2023-07-20 [3] CRAN (R 4.3.1)
    ##  glue          1.7.0   2024-01-09 [1] CRAN (R 4.3.3)
    ##  htmltools     0.5.7   2023-11-03 [3] CRAN (R 4.3.2)
    ##  htmlwidgets   1.6.2   2023-03-17 [1] CRAN (R 4.3.2)
    ##  httpuv        1.6.12  2023-10-23 [1] CRAN (R 4.3.2)
    ##  icecream      0.2.1   2023-09-27 [1] CRAN (R 4.3.2)
    ##  ipaddress   * 1.0.2   2023-12-01 [1] CRAN (R 4.3.3)
    ##  jquerylib     0.1.4   2021-04-26 [3] CRAN (R 4.1.2)
    ##  jsonlite      1.8.8   2023-12-04 [3] CRAN (R 4.3.2)
    ##  knitr         1.45    2023-10-30 [3] CRAN (R 4.3.2)
    ##  later         1.3.1   2023-05-02 [1] CRAN (R 4.3.2)
    ##  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.3)
    ##  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.3)
    ##  memoise       2.0.1   2021-11-26 [3] CRAN (R 4.2.0)
    ##  mime          0.12    2021-09-28 [3] CRAN (R 4.2.0)
    ##  miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.3.2)
    ##  pkgbuild      1.4.2   2023-06-26 [1] CRAN (R 4.3.2)
    ##  pkgload       1.3.3   2023-09-22 [1] CRAN (R 4.3.2)
    ##  prettyunits   1.2.0   2023-09-24 [3] CRAN (R 4.3.1)
    ##  processx      3.8.3   2023-12-10 [3] CRAN (R 4.3.2)
    ##  profvis       0.3.8   2023-05-02 [1] CRAN (R 4.3.2)
    ##  promises      1.2.1   2023-08-10 [1] CRAN (R 4.3.2)
    ##  ps            1.7.6   2024-01-18 [3] CRAN (R 4.3.2)
    ##  purrr         1.0.2   2023-08-10 [3] CRAN (R 4.3.1)
    ##  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.3)
    ##  Rcpp          1.0.11  2023-07-06 [1] CRAN (R 4.3.2)
    ##  remotes       2.4.2.1 2023-07-18 [1] CRAN (R 4.3.2)
    ##  rlang         1.1.4   2024-06-04 [1] CRAN (R 4.3.3)
    ##  rmarkdown     2.25    2023-09-18 [3] CRAN (R 4.3.1)
    ##  rstudioapi    0.15.0  2023-07-07 [3] CRAN (R 4.3.1)
    ##  sass          0.4.8   2023-12-06 [3] CRAN (R 4.3.2)
    ##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.2)
    ##  shiny         1.7.5.1 2023-10-14 [1] CRAN (R 4.3.2)
    ##  stringi       1.8.3   2023-12-11 [3] CRAN (R 4.3.2)
    ##  stringr       1.5.1   2023-11-14 [3] CRAN (R 4.3.2)
    ##  urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.3.2)
    ##  usethis       3.0.0   2024-07-29 [1] CRAN (R 4.3.3)
    ##  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.3)
    ##  xfun          0.41    2023-11-01 [3] CRAN (R 4.3.2)
    ##  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.3.2)
    ##  yaml          2.3.8   2023-12-11 [3] CRAN (R 4.3.2)
    ##
    ##  [1] /home/jono/R/x86_64-pc-linux-gnu-library/4.3
    ##  [2] /usr/local/lib/R/site-library
    ##  [3] /usr/lib/R/site-library
    ##  [4] /usr/lib/R/library
    ##
    ## ──────────────────────────────────────────────────────────────────────────────