IPv4 Components in APL

[This article was first published on rstats on Irregularly Scheduled Programming, and kindly contributed to R-bloggers.]

At a recent APL-focussed Meetup, someone posed a challenge: slice up the components of an IPv4 address using an APL-family language. It prompted me to learn a bit more about how IPv4 addressing works in general and how I could do the processing in APL myself.

The person who posed the challenge had approached it themselves using J, which I’m only vaguely familiar with, but it gave me an opportunity to learn a bit more about it. It’s not all that dissimilar from the Dyalog APL I know a bit better; it uses standard ASCII input but shares many of the same ideas. Take, for example, determining whether a year is a leap year, which I explored recently.

In Dyalog APL:

⍝ Dyalog APL
      leapyear ← {(80∨⍵) > 50∨⍵}
      years ← 1890+⍳30
      (leapyear years) ⌿ years
1892 1896 1904 1908 1912 1916 1920

⍝ or tacit
      leapyear ← 80∘∨ > 50∘∨

compared to J:

NB. J
   leapyear =: {{ (80 +. y) > (50 +. y) }}
   years =: 1890 + i.31
   (leapyear years) # years
1892 1896 1904 1908 1912 1916 1920

NB. or tacit
   leapyear =: 80&+. > 50&+.

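It’s fairly straightforward to see the correspondence between these two: ‘∨’ and ‘+.’ are both GCD, ‘⌿’ and ‘#’ both select the years where the test is true, and the tacit forms are the same three-function train.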
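
The GCD comparison is not specific to array languages. As a cross-check (this is a minimal Python sketch, not part of the original comparison), the same test can be written with `math.gcd` and compared against the standard library's `calendar.isleap`:

```python
from calendar import isleap
from math import gcd


def leapyear(year: int) -> bool:
    # Same comparison as the APL and J versions: gcd(80, y) > gcd(50, y).
    return gcd(80, year) > gcd(50, year)


# 1891..1920, matching 1890+⍳30 under APL's default index origin of 1.
years = range(1891, 1921)
leap_years = [y for y in years if leapyear(y)]
print(leap_years)
```

This prints `[1892, 1896, 1904, 1908, 1912, 1916, 1920]`, the same years as the APL and J sessions above, with 1900 correctly excluded.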

I don’t think we worked through the J solution to slicing up the components of an IPv4 address, but I did have a go during the meeting at a Dyalog APL solution, which we walked through and I’ve since improved.

The problem as posed was – given an IPv4 address, e.g. ‘192.0.2.63’ and a subnet mask in CIDR notation (e.g. /24), can we identify the different networking components?

This is a neat problem because it potentially involves arrays – maybe we should start with what this means. I’m no expert in this myself, but explaining things is a great way to learn more about them, so feel free to correct me at any point.

I started with this guide which I know already has a mistake in one of the graphics – can you find it?

An IPv4 address consists of four octets separated by dots. Each number represents 8 bits (hence ‘octet’), which in binary means eight 1s or 0s, for a maximum value of 255 (that is, 2^8 - 1):

192 = 11000000
    = 1x(2^7) + 1x(2^6) + 0x(2^5) + 0x(2^4) + ... 
    = 128 + 64 + 0 + 0 + ...

so we have four of these sets of 8 binary values that represent an address.
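
The positional arithmetic for a single octet can be checked in a few lines. This is only an illustrative Python sketch; the variable names are made up for the example:

```python
octet = 192
bits = format(octet, "08b")  # zero-padded to 8 binary digits

# Weight of each position, most significant first: 128, 64, ..., 2, 1.
weights = [2 ** p for p in range(7, -1, -1)]
terms = [int(b) * w for b, w in zip(bits, weights)]

print(bits)        # the 8 binary digits
print(terms)       # contribution of each digit
print(sum(terms))  # adds back up to the octet
```

The weights sum to 255, which is why that is the largest value an octet can hold.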

The subnet mask is described by the CIDR prefix, which essentially says how many 1s are at the start of the 32-bit mask. If the mask is ‘255.0.0.0’ then in binary it would be

11111111 00000000 00000000 00000000

which is 8 1s, so it would be /8. Similarly /26 would have 26 1s and converting from binary to decimal would represent a mask of ‘255.255.255.192’.

So, given an address and a CIDR block, what is the mask?

First, we need to convert our address from a string to an array of binary digits. One way to partition a string at a character in APL is

      '.'(≠⊆⊢)'192.0.2.63'
192  0  2  63      

and we can convert this array of strings to numbers with ‘execute’ (⍎, APL’s ‘eval’) applied to each

      ⍎¨'.'(≠⊆⊢)'192.0.2.63'
192 0 2 63

Converting to binary in APL is as easy as ‘encode’ (⊤) with a radix of eight 2s

      2 2 2 2 2 2 2 2 ⊤ ⍎¨'.'(≠⊆⊢)'192.0.2.63'
1 0 0 0
1 0 0 0
0 0 0 1
0 0 0 1
0 0 0 1
0 0 0 1
0 0 1 1
0 0 0 1

but of course we can write all those 2s with either (8⍴2) (‘reshape’ a value of 2 to length 8) or (8/2) (‘repeat’ 2 8 times) so

      (8⍴2)⊤⍎¨'.'(≠⊆⊢)'192.0.2.63'
1 0 0 0
1 0 0 0
0 0 0 1
0 0 0 1
0 0 0 1
0 0 0 1
0 0 1 1
0 0 0 1

That gives us the binary sequences for each of the octets as columns of an array. It’s a lot to type out each time, though, so we can create a function that takes a right argument

      asbin←{(8/2)⊤⍎¨'.'(≠⊆⊢)⍵}
      asbin '192.0.2.63'
1 0 0 0
1 0 0 0
0 0 0 1
0 0 0 1
0 0 0 1
0 0 0 1
0 0 1 1
0 0 0 1

Of course, if we want to go back the other way and see this as an IP address made of octets, we can ‘paste’ the values (converted back to integers) together with dots between them with

      asoct←2∘⊥
      asip←{∊(⍕¨⍵),¨'.' '.' '.' ''}

The first of these creates a “curried” (partially applied) ‘decode’ (⊥) with radix 2, while the second formats (⍕) each value as text and appends a dot to all but the last, so

      asoct asbin '192.0.2.63'
192 0 2 63

      asip asoct asbin '192.0.2.63'
192.0.2.63

Cool, we can round-trip this.
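
For readers without an APL interpreter to hand, the round trip can be mirrored in Python. This is a sketch under my own assumptions: the names `asbin`, `asoct` and `asip` are borrowed from the APL above, and the bit matrix is a plain list of 8 rows with one column per octet, to match the APL layout.

```python
def asbin(address: str) -> list[list[int]]:
    """8 rows x 4 columns of bits, one column per octet, like (8/2)⊤ above."""
    octets = [int(part) for part in address.split(".")]
    return [[(o >> shift) & 1 for o in octets] for shift in range(7, -1, -1)]


def asoct(bits: list[list[int]]) -> list[int]:
    """Fold each column back into an integer, like 2⊥ above."""
    octets = [0] * len(bits[0])
    for row in bits:
        octets = [2 * acc + bit for acc, bit in zip(octets, row)]
    return octets


def asip(octets: list[int]) -> str:
    """Join the octets with dots."""
    return ".".join(str(o) for o in octets)


matrix = asbin("192.0.2.63")
for row in matrix:
    print(*row)
print(asip(asoct(matrix)))
```

The printed rows match the 8×4 matrix shown earlier, and the final line is `192.0.2.63` again.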

The subnet mask is a run of 1s (as many as the CIDR prefix says) padded with 0s to 32 bits, which we can write as

      mask←{⍉4 8 ⍴ 1=⍸⍵ 0 (32-⍵)}

which reshapes the 32 bits into a 4×8 array and transposes it, giving an 8×4 array (one column per octet, like asbin) filled with the right number of 1s

      mask 26
1 1 1 1
1 1 1 1
1 1 1 0
1 1 1 0
1 1 1 0
1 1 1 0
1 1 1 0
1 1 1 0

We can view this subnet mask with this new function, too

      asoct mask 26
255 255 255 192

      asip asoct mask 26
255.255.255.192
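
The same mask can also be built by treating the 32 bits as one integer and shifting. A minimal Python sketch follows; the name `smask` is borrowed from the function defined later in the post, and the shift-based construction is a swapped-in technique rather than a translation of the APL:

```python
def smask(prefix: int) -> str:
    """Dotted-decimal subnet mask for a CIDR prefix length (0..32)."""
    # Shift 32 ones left, then trim to 32 bits: leaves `prefix` leading 1s.
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    return ".".join(str(byte) for byte in mask.to_bytes(4, "big"))


print(smask(26))
print(smask(8))
```

This prints `255.255.255.192` and `255.0.0.0`, agreeing with the /26 and /8 examples above.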

The ‘network address’ for this address is found by a bitwise AND between this mask and the IP address, and APL has a builtin ‘and’

      (mask 26) ∧ asbin '192.0.2.63'
1 0 0 0
1 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 1 0
0 0 0 0

      asip asoct (mask 26) ∧ asbin '192.0.2.63'
192.0.2.0

The ‘broadcast address’ is found by a bitwise OR between the inverse of the mask and the IP address

      (~mask 26) ∨ asbin '192.0.2.63'
1 0 0 0
1 0 0 0
0 0 0 1
0 0 0 1
0 0 0 1
0 0 0 1
0 0 1 1
0 0 0 1

      asip asoct (~mask 26) ∨ asbin '192.0.2.63'
192.0.2.63

Looking at what we have so far, we can write out some functions

      asip←{∊(⍕¨⍵),¨'.' '.' '.' ''}
      asoct←2∘⊥
      mask←{⍉4 8 ⍴ 1=⍸⍵ 0 (32-⍵)}
      smask←{asip asoct mask ⍵}
      asbin←{(8/2)⊤⍎¨'.'(≠⊆⊢)⍵}
      netaddr←{asip asoct (mask ⍺) ∧ asbin ⍵}
      bcast←{asip asoct (~mask ⍺) ∨ asbin ⍵}

and try these on some different addresses

      ip←'192.168.0.1'
      smask 8
255.0.0.0

      26 netaddr ip
192.168.0.0

      26 bcast ip
192.168.0.63

      ip←'142.250.70.174'
      
      16 netaddr ip
142.250.0.0

      16 bcast ip
142.250.255.255

Cool!
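
The AND and OR steps do not depend on the bit matrix; they work octet by octet on ordinary integers too. A minimal Python sketch (function names borrowed from the APL, but taking the dotted mask rather than the prefix as an argument, which is my own simplification):

```python
def octets(address: str) -> list[int]:
    return [int(part) for part in address.split(".")]


def dotted(values: list[int]) -> str:
    return ".".join(map(str, values))


def netaddr(ip: str, mask: str) -> str:
    # Bitwise AND of each address octet with the matching mask octet.
    return dotted([a & m for a, m in zip(octets(ip), octets(mask))])


def bcast(ip: str, mask: str) -> str:
    # `m ^ 0xFF` flips the 8 bits of a mask octet (the inverse mask).
    return dotted([a | (m ^ 0xFF) for a, m in zip(octets(ip), octets(mask))])


print(netaddr("192.168.0.1", "255.255.255.192"))
print(bcast("192.168.0.1", "255.255.255.192"))
print(netaddr("142.250.70.174", "255.255.0.0"))
print(bcast("142.250.70.174", "255.255.0.0"))
```

The four lines printed are `192.168.0.0`, `192.168.0.63`, `142.250.0.0` and `142.250.255.255`, matching the APL session.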

We could also calculate the number of hosts that can be assigned, since that’s just 2 to the power of the number of host bits (non-network bits), minus 2 for the network and broadcast addresses, which can’t be assigned to hosts

      nhosts←{¯2+2*(32-⍵)}
      nhosts 26
62
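
To get a feel for how quickly the host count shrinks as the prefix grows, the same formula can be tabulated. A small Python sketch (the chosen prefixes are arbitrary examples):

```python
def nhosts(prefix: int) -> int:
    # 2^(host bits) addresses, minus the network and broadcast addresses.
    # Not meaningful for /31 and /32, where it would give 0 and -1.
    return 2 ** (32 - prefix) - 2


for prefix in (8, 16, 24, 26, 30):
    print(f"/{prefix}: {nhosts(prefix)} hosts")
```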

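We could list the entire range of host IPs, which runs from one above the network address to one below the broadcast address. Time to make some utilities that return the octets as numbers, so that the last octet can be adjusted before formatting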

      netutil←{asoct (mask ⍺) ∧ asbin ⍵}
      butil←{asoct (~mask ⍺) ∨ asbin ⍵}

      bcast1←{x←⍺ butil ⍵ ⋄ x[4]←x[4]-1 ⋄ asip x}
      netaddr1←{x←⍺ netutil ⍵ ⋄ x[4]←x[4]+1 ⋄ asip x}
      
      iprange←{n←⍺ netaddr1 ⍵ ⋄ b←⍺ bcast1 ⍵ ⋄ n,'-',b}
      
      26 iprange '192.0.2.63'
192.0.2.1-192.0.2.62
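
An alternative to adjusting the fourth octet is to hold the whole address as a single 32-bit integer, so that adding or subtracting 1 carries across octets naturally. A Python sketch of that variant (the helper names `to_int` and `to_ip` are hypothetical, and this is a different technique from the APL above, not a translation of it):

```python
def to_int(address: str) -> int:
    a, b, c, d = (int(part) for part in address.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def to_ip(value: int) -> str:
    return ".".join(str(byte) for byte in value.to_bytes(4, "big"))


def iprange(prefix: int, address: str) -> str:
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    network = to_int(address) & mask
    broadcast = network | (mask ^ 0xFFFFFFFF)  # set every host bit
    # First and last assignable hosts; assumes a prefix of 30 or less.
    return f"{to_ip(network + 1)}-{to_ip(broadcast - 1)}"


print(iprange(26, "192.0.2.63"))
print(iprange(16, "142.250.70.174"))
```

The first line printed is `192.0.2.1-192.0.2.62`, as in the APL session; the second is `142.250.0.1-142.250.255.254`.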

That seems like a good set of utilities – and a great opportunity to learn about how Dyalog APL packages up things into namespaces. One way is to write the functions to a file, say SubnetCalc.dyalog as

:Namespace SubnetCalc
    asip←{∊(⍕¨⍵),¨'.' '.' '.' ''}
    asoct←{2⊥⍵}
    mask←{⍉4 8 ⍴ 1=⍸⍵ 0 (32-⍵)}
    nhosts←{¯2+2*(32-⍵)}
    smask←{asip asoct mask ⍵}
    asbin←{(8/2)⊤⍎¨'.'(≠⊆⊢)⍵}
    netutil←{asoct (mask ⍺)∧asbin ⍵}
    netaddr←{asip ⍺ netutil ⍵}
    netaddr1←{x←⍺ netutil ⍵ ⋄ x[4]←x[4]+1 ⋄ asip x}
    butil←{asoct (~mask ⍺)∨asbin ⍵}
    bcast←{asip ⍺ butil ⍵}
    bcast1←{x←⍺ butil ⍵ ⋄ x[4]←x[4]-1 ⋄ asip x}
    iprange←{n←⍺ netaddr1 ⍵ ⋄ b←⍺ bcast1 ⍵ ⋄ n,'-',b}
:EndNamespace

(noting that I needed to use explicit dfns rather than tacit definitions, hence asoct←{2⊥⍵}) then load that into the RIDE editor session with

⎕FIX 'file:///path/to/project/SubnetCalc.dyalog'

and give it a shorter name, if desired

'ip' ⎕NS SubnetCalc

Now I can call my functions even faster

      ip.smask 26
255.255.255.192

      google←'142.250.70.174'

      26 ip.iprange google
142.250.70.129-142.250.70.190

Dyalog has recently announced a proper package infrastructure, Tatin. That this is recent might come as a surprise to those more familiar with newer languages, but it’s actually one of the first package ecosystems for an APL language that I know of. I want to figure out whether my ‘toy’ package is too simplistic to be shared, or if it’s worth learning those ropes. At the moment all the packages in that system are internally sourced, but presumably that would open up to external users once it’s stabilised.

All of this was a lot of fun and I learned a lot. How would I go about this in another language? Well, there’s almost always an R package for something, and sure enough there’s an {ipaddress} package on CRAN that has all of this functionality plus more, though it does seem to rely on compiling some C++ code to do it.

library(ipaddress)

ip <- ip_address("192.0.2.44")
ip_to_binary(ip)                     # c.f. asbin
## [1] "11000000000000000000001000101100"
ipn <- ip_network("192.0.2.0/26")
prefix_length(ipn)                   # the CIDR prefix (left argument above)
## [1] 26
network_address(ipn)                 # c.f. netaddr
## <ip_address[1]>
## [1] 192.0.2.0
broadcast_address(ipn)               # c.f. bcast
## <ip_address[1]>
## [1] 192.0.2.63
netmask(ipn)                         # c.f. smask
## <ip_address[1]>
## [1] 255.255.255.192
hostmask(ipn)
## <ip_address[1]>
## [1] 0.0.0.63
range(hosts(ipn))                    # c.f. iprange
## <ip_address[2]>
## [1] 192.0.2.1  192.0.2.62
is_within(ip, ipn)
## [1] TRUE
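
For comparison with one more language, Python ships an `ipaddress` module in its standard library that covers the same ground. A short sketch using the same example address and prefix as the R session (the `c.f.` comments map each call to the APL function names, and that mapping is my own):

```python
import ipaddress

# ip_interface keeps a host address together with its network.
iface = ipaddress.ip_interface("192.0.2.44/26")
net = iface.network
hosts = list(net.hosts())  # usable hosts: network and broadcast excluded

print(net.network_address)    # c.f. netaddr
print(net.broadcast_address)  # c.f. bcast
print(net.netmask)            # c.f. smask
print(net.hostmask)
print(len(hosts))             # c.f. nhosts
print(hosts[0], hosts[-1])    # c.f. iprange
print(iface.ip in net)
```

The values agree with the R output above: network `192.0.2.0`, broadcast `192.0.2.63`, netmask `255.255.255.192`, hostmask `0.0.0.63`, 62 hosts running from `192.0.2.1` to `192.0.2.62`, and `True` for membership.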

One of the advantages of the APL approach, I feel, is that you can see exactly what the function is doing. Often there’s no point naming a function because any useful name you might give it typically has more characters than the actual implementation. Digging into this package even slightly, it’s not immediately obvious where the processing is happening. I sometimes worry that we pile on too many layers of ever-higher abstraction, though I appreciate that doing so sometimes gets us a lot of benefit.

I wouldn’t use my APL code in production – it has no checks or error handling, but building these helped me really understand what’s going on between all the ones and zeroes.

If you have comments, suggestions, or improvements, as always, feel free to use the comment section below, or hit me up on Mastodon.


devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.3.3 (2024-02-29)
##  os       Pop!_OS 22.04 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  en_AU.UTF-8
##  ctype    en_AU.UTF-8
##  tz       Australia/Adelaide
##  date     2024-08-22
##  pandoc   3.2 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  blogdown      1.19    2024-02-01 [1] CRAN (R 4.3.3)
##  bookdown      0.36    2023-10-16 [1] CRAN (R 4.3.2)
##  bslib         0.6.1   2023-11-28 [3] CRAN (R 4.3.2)
##  cachem        1.0.8   2023-05-01 [3] CRAN (R 4.3.0)
##  callr         3.7.3   2022-11-02 [3] CRAN (R 4.2.2)
##  cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.3)
##  crayon        1.5.2   2022-09-29 [3] CRAN (R 4.2.1)
##  devtools      2.4.5   2022-10-11 [1] CRAN (R 4.3.2)
##  digest        0.6.34  2024-01-11 [3] CRAN (R 4.3.2)
##  ellipsis      0.3.2   2021-04-29 [3] CRAN (R 4.1.1)
##  evaluate      0.23    2023-11-01 [3] CRAN (R 4.3.2)
##  fastmap       1.1.1   2023-02-24 [3] CRAN (R 4.2.2)
##  fs            1.6.3   2023-07-20 [3] CRAN (R 4.3.1)
##  glue          1.7.0   2024-01-09 [1] CRAN (R 4.3.3)
##  htmltools     0.5.7   2023-11-03 [3] CRAN (R 4.3.2)
##  htmlwidgets   1.6.2   2023-03-17 [1] CRAN (R 4.3.2)
##  httpuv        1.6.12  2023-10-23 [1] CRAN (R 4.3.2)
##  icecream      0.2.1   2023-09-27 [1] CRAN (R 4.3.2)
##  ipaddress   * 1.0.2   2023-12-01 [1] CRAN (R 4.3.3)
##  jquerylib     0.1.4   2021-04-26 [3] CRAN (R 4.1.2)
##  jsonlite      1.8.8   2023-12-04 [3] CRAN (R 4.3.2)
##  knitr         1.45    2023-10-30 [3] CRAN (R 4.3.2)
##  later         1.3.1   2023-05-02 [1] CRAN (R 4.3.2)
##  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.3)
##  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.3)
##  memoise       2.0.1   2021-11-26 [3] CRAN (R 4.2.0)
##  mime          0.12    2021-09-28 [3] CRAN (R 4.2.0)
##  miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.3.2)
##  pkgbuild      1.4.2   2023-06-26 [1] CRAN (R 4.3.2)
##  pkgload       1.3.3   2023-09-22 [1] CRAN (R 4.3.2)
##  prettyunits   1.2.0   2023-09-24 [3] CRAN (R 4.3.1)
##  processx      3.8.3   2023-12-10 [3] CRAN (R 4.3.2)
##  profvis       0.3.8   2023-05-02 [1] CRAN (R 4.3.2)
##  promises      1.2.1   2023-08-10 [1] CRAN (R 4.3.2)
##  ps            1.7.6   2024-01-18 [3] CRAN (R 4.3.2)
##  purrr         1.0.2   2023-08-10 [3] CRAN (R 4.3.1)
##  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.3)
##  Rcpp          1.0.11  2023-07-06 [1] CRAN (R 4.3.2)
##  remotes       2.4.2.1 2023-07-18 [1] CRAN (R 4.3.2)
##  rlang         1.1.4   2024-06-04 [1] CRAN (R 4.3.3)
##  rmarkdown     2.25    2023-09-18 [3] CRAN (R 4.3.1)
##  rstudioapi    0.15.0  2023-07-07 [3] CRAN (R 4.3.1)
##  sass          0.4.8   2023-12-06 [3] CRAN (R 4.3.2)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.2)
##  shiny         1.7.5.1 2023-10-14 [1] CRAN (R 4.3.2)
##  stringi       1.8.3   2023-12-11 [3] CRAN (R 4.3.2)
##  stringr       1.5.1   2023-11-14 [3] CRAN (R 4.3.2)
##  urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.3.2)
##  usethis       3.0.0   2024-07-29 [1] CRAN (R 4.3.3)
##  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.3)
##  xfun          0.41    2023-11-01 [3] CRAN (R 4.3.2)
##  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.3.2)
##  yaml          2.3.8   2023-12-11 [3] CRAN (R 4.3.2)
## 
##  [1] /home/jono/R/x86_64-pc-linux-gnu-library/4.3
##  [2] /usr/local/lib/R/site-library
##  [3] /usr/lib/R/site-library
##  [4] /usr/lib/R/library
## 
## ──────────────────────────────────────────────────────────────────────────────

