[This article was first published on Rcrastinate, and kindly contributed to R-bloggers].
This is something I did a while ago using the Berlin Affective Word List (BAWL).
The BAWL contains ratings for 2902 German words (2107 nouns, 504 verbs, 291 adjectives). Ratings were collected for emotional valence (bad vs. good), arousal (how intense the emotional response is) and imageability (how well you can imagine the specific word). Please note that I cannot supply the BAWL here on my blog. You can get the password for the Excel file, however, if you write an e-mail to Melissa Võ.
In German, you can use the suffix “-los” with nouns to mark the non-existence or non-presence of the noun and get an adjective. Basically, it works just like the English suffix “-less”. If you want to express that no moon was visible during a specific night, you could call this night “mondlose Nacht” (moonless night). The adjective “mondlos” is derived from the noun “Mond” (moon). The same works for “machtlos” (powerless), “makellos” (flawless) or “lieblos” (loveless). The last case is actually a little tricky because the adjective is “lieblos” but the noun it is derived from is “Liebe” – so the adjective loses an “e” here.
Now, we want to see what happens to a word’s emotional valence and arousal ratings if we add a “-los”.
First, we need to load a pre-processed Berlin Affective Word List.
library(scales) # for alpha()
load(<place Rdata here>) # variable name "bawl"
The relevant columns of the dataframe look like this (there are also standard deviations available for all measures).
head(bawl[, c("low.word", "w.class", "emo.mean",
              "arousal.mean", "image.mean")])
  low.word w.class emo.mean arousal.mean image.mean
1      aal       N     -0.5     2.380952   6.555556
2      aas       N     -2.1     2.631579   5.444444
3    abart       N     -1.6     3.277778   2.333333
4    abbau       N     -1.0     3.000000   2.227273
5  abbauen       V     -0.8     2.105263   3.670000
6   abbild       N     -0.2     2.105263   3.777778
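If you do not have the BAWL file at hand, a small stand-in with the same column layout is enough to try out the steps that follow. The sketch below rebuilds only the six rows printed above; the name bawl_toy is my own placeholder, and the real object has more rows and columns.

```r
# Toy stand-in for the real BAWL data: only the six rows printed above.
# "bawl_toy" is a placeholder name; the real "bawl" object is much larger.
bawl_toy <- data.frame(
  low.word     = c("aal", "aas", "abart", "abbau", "abbauen", "abbild"),
  w.class      = c("N", "N", "N", "N", "V", "N"),
  emo.mean     = c(-0.5, -2.1, -1.6, -1.0, -0.8, -0.2),
  arousal.mean = c(2.380952, 2.631579, 3.277778, 3.000000, 2.105263, 2.105263),
  image.mean   = c(6.555556, 5.444444, 2.333333, 2.227273, 3.670000, 3.777778),
  stringsAsFactors = FALSE
)

# Quick structural checks before running any analysis on it
str(bawl_toy)
table(bawl_toy$w.class)
```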
Now, select all adjectives (word class is “A”) ending in “los”:
adj <- bawl[bawl$w.class == "A", ]
los.adj <- adj[grep(pattern = "los$", adj$low.word), ]
Next, we extract everything that precedes the “los”.
beforelos <- gsub(pattern = "los$", replacement = "", los.adj$low.word)
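To see what the end-of-string anchor in the pattern does, here is a minimal sketch on a made-up word list (the words are my own examples, not rows from BAWL). Only words that actually end in “los” are selected, and only that final “los” is stripped.

```r
# Hypothetical mini word list (not taken from BAWL), lower case like low.word
words <- c("mondlos", "machtlos", "makellos", "lieblos", "lose", "schloss")

# "los$" anchors the match at the end of the string, so "lose" and
# "schloss" are not selected even though they contain "los"
los_words <- words[grepl("los$", words)]

# Strip the suffix to get the candidate noun stems
stems <- sub("los$", "", los_words)
stems
```

Note that “lieblos” yields the stem “lieb” rather than the noun “liebe”, which is exactly the kind of case that needs a manual fix.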
There are a few problems like the one with “Liebe” I outlined above. The adjective for the absence of “Liebe” is not “liebelos” but “lieblos”, so stripping the “los” leaves “lieb” instead of the noun. There are a few other such cases in BAWL we have to take care of. I am just replacing the “wrong” nouns with the “correct” ones.
# Each pair is c(stem left over after stripping "los", correct noun)
repl.list <- list(c("freud", "freude"),
                  c("hilf", "hilfe"),
                  c("leb", "leben"),
                  c("lieb", "liebe"),
                  c("namen", "name"),
                  c("reg", ""),        # mapped to an empty string, excluded further below
                  c("sorg", "sorge"),
                  c("treu", "treue"))
for (el in repl.list) {
  beforelos <- gsub(pattern = el[1], replacement = el[2],
                    beforelos, fixed = TRUE)
}
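One thing to keep in mind: gsub() with fixed = TRUE replaces the pattern wherever it occurs inside a string, so a stem that merely contains “lieb” or “reg” would be changed as well. Whether that matters depends on the stems actually present in BAWL, which I cannot check here. As a hedged alternative, the sketch below replaces only exact matches using a named lookup vector; the stems and the name fixes are illustrative assumptions.

```r
# Stems as they might come out of the suffix stripping (illustrative only)
stems <- c("lieb", "macht", "sorg", "beliebt")

# Named vector: names are the raw stems, values are the noun forms
fixes <- c(lieb = "liebe", sorg = "sorge", hilf = "hilfe")

# Replace only stems that match a name exactly;
# "beliebt" contains "lieb" but is left untouched
hit <- stems %in% names(fixes)
stems[hit] <- unname(fixes[stems[hit]])
stems
```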
Next, we create a dataframe with the adjective ratings and the pre-“los” nouns.
df <- data.frame(adj = los.adj$low.word,
                 emo.adj = los.adj$emo.mean,
                 arousal.adj = los.adj$arousal.mean,
                 noun = beforelos,
                 stringsAsFactors = FALSE) # keep words as character strings
For some of the “-los” adjectives, there is no corresponding noun in BAWL. I first list those nouns and then exclude them.
df$noun[!(df$noun %in% bawl[bawl$w.class == "N", "low.word"])]
df <- df[!(df$noun %in% c("kopf", "rat", "",
                          "spur", "ufer", "zeit")), ]
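The list of excluded nouns above is typed in by hand from the output of the first line. A possible variation is to reuse the logical test itself as the filter, so the two steps cannot drift apart. This is only a sketch on made-up stand-ins (bawl_toy and df_toy are placeholder names, and the word classes in the toy data are invented for the example).

```r
# Made-up stand-ins for the post's objects (values are illustrative only)
bawl_toy <- data.frame(
  low.word = c("liebe", "macht", "mond", "kopf"),
  w.class  = c("N", "N", "N", "V"),  # "kopf" deliberately not listed as a noun
  stringsAsFactors = FALSE
)
df_toy <- data.frame(
  adj  = c("lieblos", "machtlos", "kopflos", "reglos"),
  noun = c("liebe", "macht", "kopf", ""),
  stringsAsFactors = FALSE
)

# TRUE where the stem exists as a noun in the word list
nouns    <- bawl_toy$low.word[bawl_toy$w.class == "N"]
has_noun <- df_toy$noun %in% nouns

df_toy$noun[!has_noun]        # inspect what gets dropped
df_toy <- df_toy[has_noun, ]  # keep only rows with a matching noun
```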
All we have to do now is pull the noun ratings (for emotional valence and arousal) from BAWL and put them in the same dataframe.
df$emo.noun <- sapply(df$noun, FUN = function (x) {
  bawl[bawl$w.class == "N" & bawl$low.word == x, "emo.mean"]
})
df$arousal.noun <- sapply(df$noun, FUN = function (x) {
  bawl[bawl$w.class == "N" & bawl$low.word == x, "arousal.mean"]
})
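The sapply() lookup returns a plain numeric vector only when every noun matches exactly one row; with a missing noun it returns a list instead. A vectorised alternative is match(), which yields NA for nouns that are not found. The sketch below uses made-up ratings (not BAWL values) and placeholder object names.

```r
# Made-up ratings for illustration; these are NOT real BAWL values
bawl_toy <- data.frame(
  low.word     = c("liebe", "macht", "lieben"),
  w.class      = c("N", "N", "V"),
  emo.mean     = c(2.9, 0.3, 2.5),
  arousal.mean = c(4.0, 3.5, 3.8),
  stringsAsFactors = FALSE
)
df_toy <- data.frame(
  adj  = c("lieblos", "machtlos", "mondlos"),
  noun = c("liebe", "macht", "mond"),  # "mond" is missing from bawl_toy
  stringsAsFactors = FALSE
)

# Restrict to nouns first, then look up each noun's row position
nouns_only <- bawl_toy[bawl_toy$w.class == "N", ]
idx <- match(df_toy$noun, nouns_only$low.word)

# Index once, reuse for every rating column; unmatched nouns become NA
df_toy$emo.noun     <- nouns_only$emo.mean[idx]
df_toy$arousal.noun <- nouns_only$arousal.mean[idx]
df_toy
```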
And finally: Plot the whole thing. Note the call to arrows(): it’s really easy to connect all nouns to their respective adjectives.
plot(df$emo.adj, df$arousal.adj, xlim = c(-3, 3), ylim = c(1, 4.2),
     pch = 15, col = "blue", cex = 2,
     xlab = "Emotional valence",
     ylab = "Arousal", bty = "n")
points(df$emo.noun, df$arousal.noun,
       pch = 17, col = "green", cex = 2)
arrows(x0 = df$emo.noun, y0 = df$arousal.noun,
       x1 = df$emo.adj, y1 = df$arousal.adj,
       length = 0.2, code = 2, angle = 20, lwd = 2,
       col = alpha("black", 0.5))
text(df$emo.adj, df$arousal.adj - 0.05,
     labels = df$adj, cex = 0.7)
text(df$emo.noun, df$arousal.noun - 0.05,
     labels = df$noun, cex = 0.7)
legend(x = "bottomleft", legend = c("Noun", "Noun+los"),
       pch = c(17, 15), col = c("green", "blue"), bty = "n")
abline(v = 0, lty = 2)
Several interesting things can be seen in the graph:
- Almost all the words “change sides” on the emotional valence scale. Take, for example, “Liebe” (love), which is rated very good and very arousing. “Lieblos” (loveless), in contrast, lands almost at the opposite end of the scale.
- Another nice example is “bodenlos” (bottomless). “bodenlos” is derived from “Boden” (bottom, but also soil). “Boden” is rated almost perfectly neutral and very unarousing (= boring?). If we add “-los” and get “bodenlos”, the word gets very arousing and slightly negative. That’s because “bodenlos” (just like bottomless) can be used in a metaphorical sense, as in “bodenloser Hass” (bottomless hatred).
- Other words get less arousing when a “-los” is added. One example is “sorglos” (carefree). While “Sorge” (worry) is rated quite arousing and negative, “sorglos” is rather neutral in terms of emotional valence and arousal.
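The observations above are read off the plot by eye. They could also be put into numbers by computing, for each pair, the shift in valence and arousal and whether the sign of the valence flips. The following is only a sketch: the numbers are invented placeholders chosen to mimic the three examples, not the actual BAWL ratings, and df_toy is a placeholder name.

```r
# Invented placeholder ratings (NOT real BAWL values), same columns as df
df_toy <- data.frame(
  adj          = c("lieblos", "bodenlos", "sorglos"),
  noun         = c("liebe", "boden", "sorge"),
  emo.noun     = c( 2.5,  0.0, -2.0),
  emo.adj      = c(-2.0, -0.5,  0.5),
  arousal.noun = c( 4.0,  1.5,  3.5),
  arousal.adj  = c( 3.0,  3.5,  2.5),
  stringsAsFactors = FALSE
)

# Shift from noun to adjective on both scales
df_toy$emo.shift     <- df_toy$emo.adj - df_toy$emo.noun
df_toy$arousal.shift <- df_toy$arousal.adj - df_toy$arousal.noun

# A pair "changes sides" if noun and adjective valence have opposite signs;
# a noun rated exactly 0 counts as not flipping
df_toy$flips.side <- sign(df_toy$emo.adj) * sign(df_toy$emo.noun) < 0

df_toy[, c("noun", "adj", "emo.shift", "arousal.shift", "flips.side")]
mean(df_toy$flips.side)  # share of pairs that change sides
```

On the real data, the same three lines applied to df would give the share of noun/adjective pairs that cross the neutral line.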