The other day I stumbled upon this post on Mastodon and thought that it could be a nice evening challenge to implement. In the end, it took longer than an evening, but at least I learned a lot about UUIDs and Base58 encoding.
What are UUIDs?
UUIDs are 128-bit values used to uniquely identify information in computer systems. Unlike traditional incremental IDs, UUIDs are designed to be unique across all space and time, making them ideal for systems where data is created across different machines, environments, or even different points in time. The magic of UUIDs lies in their ability to reduce the risk of duplication to nearly zero, even when generated independently by multiple sources.
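To make the "128-bit" part concrete, here is a minimal base R sketch of how a random UUID could be assembled from 16 random bytes. It assumes the common version 4 layout (random bytes with fixed version and variant bits); `random_uuid_v4()` is a hypothetical helper for illustration and not necessarily how any particular package does it.

```r
# Sketch: build a random (version 4) UUID from 16 random bytes.
# Note: sample.int() is not a cryptographically secure source of randomness.
random_uuid_v4 <- function() {
  bytes <- sample.int(256L, 16L, replace = TRUE) - 1L  # 16 bytes = 128 bits
  bytes[7] <- bitwOr(bitwAnd(bytes[7], 0x0FL), 0x40L)  # set version nibble to 4
  bytes[9] <- bitwOr(bitwAnd(bytes[9], 0x3FL), 0x80L)  # set variant bits to 10
  hex <- sprintf("%02x", bytes)
  # Group the 32 hex digits as 8-4-4-4-12
  paste(
    paste(hex[1:4], collapse = ""),
    paste(hex[5:6], collapse = ""),
    paste(hex[7:8], collapse = ""),
    paste(hex[9:10], collapse = ""),
    paste(hex[11:16], collapse = ""),
    sep = "-"
  )
}

random_uuid_v4()
```

The result is a 36-character string: 32 hexadecimal digits plus four hyphens.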
What is Base58 encoding?
Base58 encoding is a specialized method of encoding binary data into a shorter, more readable string format, primarily designed to be more human-friendly. It uses a subset of 58 alphanumeric characters, intentionally omitting potentially confusing characters like “0” (zero), “O” (capital o), “I” (uppercase i), and “l” (lowercase L). This encoding is particularly popular in applications where clarity and brevity are essential, such as in cryptocurrencies (e.g., Bitcoin addresses) and compact data representations. By reducing the chance of transcription errors and producing shorter strings, Base58 encoding is a practical choice for creating cleaner, more user-friendly representations of data like UUIDs.
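The idea can be sketched in base R as a repeated division by 58 on the bytes of a UUID. This is an illustrative sketch only: `uuid_to_bytes()`, `encode_base58()` and `decode_base58()` are hypothetical helper names, the code treats the bytes as one big integer without any padding, and it is not claimed to match the output of any specific package character for character.

```r
bitcoin58 <- "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

# Convert a UUID string to its 16 bytes (integers 0-255, most significant first).
uuid_to_bytes <- function(uuid) {
  hex <- gsub("-", "", uuid, fixed = TRUE)
  strtoi(substring(hex, seq(1, 31, by = 2), seq(2, 32, by = 2)), base = 16L)
}

# Encode a big-endian byte vector by long division by 58.
encode_base58 <- function(bytes, alphabet) {
  chars <- strsplit(alphabet, "")[[1]]
  digits <- integer(0)
  num <- as.integer(bytes)
  while (length(num) > 0) {
    remainder <- 0L
    quotient <- integer(0)
    for (b in num) {
      acc <- remainder * 256L + b
      q <- acc %/% 58L
      remainder <- acc %% 58L
      # skip leading zeros of the quotient
      if (length(quotient) > 0 || q > 0) quotient <- c(quotient, q)
    }
    digits <- c(remainder, digits)  # remainders arrive least significant first
    num <- quotient
  }
  paste(chars[digits + 1L], collapse = "")
}

# Decode back into a fixed number of bytes (multiply by 58, add the digit).
decode_base58 <- function(string, alphabet, n_bytes) {
  chars <- strsplit(alphabet, "")[[1]]
  values <- match(strsplit(string, "")[[1]], chars) - 1L
  if (anyNA(values)) stop("string contains characters outside the alphabet")
  bytes <- integer(n_bytes)
  for (v in values) {
    carry <- v
    for (i in rev(seq_len(n_bytes))) {
      acc <- bytes[i] * 58L + carry
      bytes[i] <- acc %% 256L
      carry <- acc %/% 256L
    }
  }
  bytes
}

bytes <- uuid_to_bytes("efbcbdd5-251b-4335-b9b3-83c6ea6847a8")
short <- encode_base58(bytes, bitcoin58)
identical(decode_base58(short, bitcoin58, n_bytes = 16L), bytes)
```

Working on a byte vector avoids the need for a big-integer package, since a 128-bit value does not fit into R's native numeric types without loss.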
Installation
You can install the development version of shortuuid like so:
remotes::install_github("schochastics/shortuuid")
# or pak::pak("schochastics/shortuuid")
library(shortuuid)
Example
The package implements a method to generate valid random UUIDs and two encoder/decoder pairs that use slight variations of the same alphabet.
library(shortuuid)
# generate random uuids
ids <- generate_uuid(n = 5)
ids
[1] "efbcbdd5-251b-4335-b9b3-83c6ea6847a8"
[2] "dad2883c-c78b-4e38-b535-954fffcca362"
[3] "1cb47c00-76cf-49a9-ab37-e0cf2d42b702"
[4] "298b9628-d4aa-431e-a26b-0377d85ba37b"
[5] "4694d3b7-0748-4069-89d4-fef4751c4b74"
is.uuid(ids)
[1] TRUE TRUE TRUE TRUE TRUE
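For intuition, a format check of this kind can be approximated with a regular expression. The sketch below is an assumption about what "valid" might mean (the canonical 8-4-4-4-12 hexadecimal layout) and uses a hypothetical helper, `looks_like_uuid()`; the checks performed by the package's `is.uuid()` may be stricter or different.

```r
# Sketch: TRUE for strings in the canonical 8-4-4-4-12 hexadecimal layout.
looks_like_uuid <- function(x) {
  pattern <- "^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
  grepl(pattern, x, ignore.case = TRUE)
}

looks_like_uuid(c("efbcbdd5-251b-4335-b9b3-83c6ea6847a8", "not-a-uuid"))
# [1]  TRUE FALSE
```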
# alphabet: "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
b58 <- uuid_to_bitcoin58(ids)
b58
[1] "Wc27nxhH5r1BZYYDafPgUX" "U2E47RcfnA9enhGEUWSbZ3" "4Yb6VAtxvB1XyVjgs83R9K"
[4] "68YwpLNW4qVkZR3CTSDEdG" "9iWa4VN7qbZc88ftpx5PKR"
# alphabet: "123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ"
f58 <- uuid_to_flickr58(ids)
f58
[1] "vB27MXGh5R1byxxdzEoFtw" "t2e47qBEMa9DMGgetvrAy3" "4xA6uaTXVb1wYuJFS83q9j"
[4] "68xWPknv4QuKyq3csrdeCg" "9Hvz4un7QAyB88ETPX5ojq"
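Since both alphabets contain the same 58 characters in a different order, one encoding can be turned into the other by substituting characters position by position. A small base R sketch using `chartr()`, with the two alphabets and the first encoded ID copied from the output above:

```r
bitcoin58 <- "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
flickr58  <- "123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ"

# Map each Bitcoin-alphabet character to the Flickr character at the same position
chartr(bitcoin58, flickr58, "Wc27nxhH5r1BZYYDafPgUX")
# [1] "vB27MXGh5R1byxxdzEoFtw"
```

Note that this is not a plain case swap: the uppercase block omits two letters ("I" and "O") while the lowercase block omits only one ("l"), so the letters shift by one position relative to each other.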
# convert back
bitcoin58_to_uuid(b58)
[1] "efbcbdd5-251b-4335-b9b3-83c6ea6847a8"
[2] "dad2883c-c78b-4e38-b535-954fffcca362"
[3] "1cb47c00-76cf-49a9-ab37-e0cf2d42b702"
[4] "298b9628-d4aa-431e-a26b-0377d85ba37b"
[5] "4694d3b7-0748-4069-89d4-fef4751c4b74"
flickr58_to_uuid(f58)
[1] "efbcbdd5-251b-4335-b9b3-83c6ea6847a8"
[2] "dad2883c-c78b-4e38-b535-954fffcca362"
[3] "1cb47c00-76cf-49a9-ab37-e0cf2d42b702"
[4] "298b9628-d4aa-431e-a26b-0377d85ba37b"
[5] "4694d3b7-0748-4069-89d4-fef4751c4b74"
Addendum
The code to generate UUIDs is taken from @rkg8.
Citation
@online{schoch2024,
  author = {Schoch, David},
  title = {Shortuuid: {Generate} and {Translate} {Standard} {UUIDs}},
  date = {2024-08-24},
  url = {http://blog.schochastics.net/posts/2024-08-24_short-uuids},
  langid = {en}
}