
Expand your Bluesky network with R (repost)

[This article was first published on Getting Genetics Done, and kindly contributed to R-bloggers.]

This is reposted from the original at https://blog.stephenturner.us/p/expand-your-bluesky-network-with-r.

I’m encouraging everyone I know online to join the scientific community on Bluesky.

Bluesky for Science

In that post I link to several starter packs — lists of accounts posting about a topic that you can follow individually or all at once to start filling out your network.

I started by following accounts of people I knew from X and from a few starter packs I came across. One way to expand your network is to take all the accounts you follow and see who they follow that you don’t. You can rank this list in descending order by the number of your follows who follow each account, and use that list to fill out your network.

Let’s do this with just a few lines of code in R. The atrrr package (CRAN, GitHub, Docs) is one of several packages that wrap the AT protocol behind Bluesky, allowing you to interact with Bluesky through a set of R functions. It’s super easy to use and the docs are great.

The code below does this. It first authenticates with an app password, then retrieves all the accounts you follow. Next, it gets who all those accounts follow and removes the accounts you already follow.[1]

library(dplyr)
library(atrrr)

# Authenticate first (switch out with your username)
bsky_username <- "youraccount.bsky.social"

# If you already have an app password:
bsky_app_pw <- "change-me-change-me-123"
auth(user=bsky_username, password=bsky_app_pw)

# Or be guided through the process interactively instead
# (uncomment this and skip the auth() call above)
# auth()

# Get the people you follow
f <- get_follows(actor=bsky_username, limit=Inf)

# Get just their handles
fh <- f$actor_handle

# Get who your follows are following
ff <-
  fh |>
  lapply(get_follows, limit=Inf) |>
  setNames(fh)

# Make it a data frame
ffdf <- bind_rows(ff, .id="follow")

# Get counts, removing people you already follow
ffcounts <-
  ffdf |>
  count(actor_handle, sort=TRUE) |>
  anti_join(f, by="actor_handle") |>
  # Drop Bluesky's placeholder for handles that no longer resolve
  filter(actor_handle!="handle.invalid")

# Join back to account info, add URL
# (the `_` placeholder for the native pipe needs R >= 4.2)
ffcounts <-
  ffdf |>
  distinct(actor_handle, actor_name) |>
  inner_join(x=ffcounts, y=_, by="actor_handle") |>
  mutate(url=paste0("https://bsky.app/profile/",
                    actor_handle))

This returns a data frame of all the accounts that the people you follow are following but that you don’t already follow, sorted in descending order by the number of accounts you follow who follow them (mouthful right there).
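
To make the counting logic concrete, here is a minimal sketch of the same idea on a made-up follow graph. It uses only base R so it runs without a Bluesky account or any packages. All the handles are invented for illustration, and the `follows_of` list stands in for what the real code collects from the API.

```r
# Toy follow graph; every handle here is hypothetical.
# Each element lists the accounts followed by one of "my" follows.
my_follows <- c("alice.example", "bob.example", "carol.example")
follows_of <- list(
  alice.example = c("bob.example", "dave.example", "erin.example"),
  bob.example   = c("dave.example", "erin.example", "frank.example"),
  carol.example = c("dave.example", "alice.example", "handle.invalid")
)

# Flatten to one long edge list: who (follow) follows whom (actor_handle)
edges <- data.frame(
  follow       = rep(names(follows_of), lengths(follows_of)),
  actor_handle = unlist(follows_of, use.names = FALSE)
)

# Count how many of my follows follow each account
counts <- as.data.frame(table(actor_handle = edges$actor_handle),
                        responseName = "n", stringsAsFactors = FALSE)

# Drop accounts I already follow and the placeholder for broken handles
counts <- counts[!counts$actor_handle %in% c(my_follows, "handle.invalid"), ]

# Rank in descending order by count (ties broken alphabetically)
counts <- counts[order(-counts$n, counts$actor_handle), ]
rownames(counts) <- NULL
print(counts)
# dave.example is followed by 3 of my follows, erin.example by 2,
# frank.example by 1
```

The `%in%` filter plays the role of the `anti_join()` and `filter()` steps, and `table()` plays the role of `count()`.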


Optionally, you can use the gt package to turn this into a nicer table with clickable links.

# Optional, clean up and create a nice table
library(gt)
library(glue)
top <- 20L
ffcounts |>
  head(top) |>
  rename(Handle=actor_handle, N=n, Name=actor_name) |>
  mutate(Handle=glue("[{Handle}]({url})")) |>
  mutate(Handle=lapply(Handle, gt::md)) |>
  select(-url) |>
  gt() |>
  tab_header(
    title=md(glue("**My top {top} follows' follows**")),
    subtitle="Collected November 19, 2024") |>
  tab_style(
    style="font-weight:bold",
    locations=cells_column_labels()
  ) |>
  cols_align(align="left") |>
  opt_row_striping(row_striping = TRUE)

I can’t embed an HTML file here, but here’s what that output looks like. You can click any one of the names and follow the account if you find it useful.

[Image: screenshot of the rendered gt table]
Maybe you do this iteratively: add your top follows’ follows, then rerun the process a few times to possibly discover unknown second-degree connections.

The code here essentially replicates what @theo.io’s Bluesky Network Analyzer is doing, but all locally using R. That web app is faster and easier to use, and does some smart caching and throttling to avoid API rate limits. See the footnote for more.
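
If you do run this locally on a large network, a little caching and throttling of your own may help. The sketch below is one possible approach in base R, not how the Bluesky Network Analyzer works: `make_polite_fetcher()` is a hypothetical helper, the pause and retry values are arbitrary rather than Bluesky's actual limits, and `fake_fetch()` is a stand-in for a real call such as `function(h) get_follows(h, limit=Inf)`.

```r
# Wrap any fetch function with an in-memory cache, a pause after each
# real call, and a bounded retry with exponential backoff.
# Hypothetical helper; the default values are illustrative guesses.
make_polite_fetcher <- function(fetch, pause = 0.5, max_tries = 3) {
  cache <- new.env(parent = emptyenv())
  function(handle) {
    # Serve repeated requests from the cache without calling fetch again
    if (exists(handle, envir = cache, inherits = FALSE)) {
      return(get(handle, envir = cache, inherits = FALSE))
    }
    for (attempt in seq_len(max_tries)) {
      result <- tryCatch(fetch(handle), error = function(e) e)
      if (!inherits(result, "error")) {
        assign(handle, result, envir = cache)
        Sys.sleep(pause)  # throttle before the next real call
        return(result)
      }
      # Back off a little longer after each failed attempt
      if (attempt < max_tries) Sys.sleep(pause * 2^attempt)
    }
    stop("Giving up on ", handle, " after ", max_tries, " tries")
  }
}

# Demo with a fake fetcher that fails once, then succeeds
calls <- 0
fake_fetch <- function(handle) {
  calls <<- calls + 1
  if (calls == 1) stop("simulated rate limit")
  data.frame(actor_handle = paste0("friend.of.", handle))
}

polite <- make_polite_fetcher(fake_fetch, pause = 0.01)
first  <- polite("alice.example")  # one failure, then a successful retry
second <- polite("alice.example")  # served from the cache
```

In the main script, the wrapped function could be dropped into the `lapply()` step in place of `get_follows`. Because the cache lives only in memory, it would also save repeat calls when you rerun the process iteratively within one R session.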

Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.