
Oce translations

[This article was first published on Dan Kelley Blog/R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

A new Oce user wondered how to get Spanish labels on axes. This is easy for some plot types because xlab and ylab arguments are obeyed, but in many cases the labels are generated by the code. That meant that it was finally time for me to start localizing Oce, using the GNU gettext scheme. Gettext being unfamiliar to me, I stumbled a bit at first, and decided to document the procedure here.

The start of this blog post is mostly for me, while the end is mostly for users who would like to help in the translation effort.

PART 1: developer notes

Initial work cycle

Step 1

Create a directory named oce/po/.

Step 2

The basic procedure is to change a code fragment like

ylab="Depth"

which is clearly not appropriate in all languages, into

ylab=gettext("Depth", domain="R-oce")

which yields an entry in a translation table that can then be tailored for any desired language. (The gettext() call should not really need to set the domain, but I found it necessary for some items, and decided to include it everywhere.)
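As a minimal sketch of how such a call might be wrapped for reuse in plotting code, consider the following. The helper name axisLabel() is hypothetical and is not part of Oce; the sketch assumes only that gettext() returns its argument unchanged when no catalog exists for the requested language.

```r
# Hypothetical helper (not part of oce): look up a label in the "R-oce"
# translation domain. If no catalog is installed for the current language,
# gettext() hands back the string it was given.
axisLabel <- function(name)
    gettext(name, domain="R-oce")

Sys.setenv(LANGUAGE="en") # request English, so the key itself is returned
label <- axisLabel("Depth")
print(label)
```

A plotting function could then use something like ylab=axisLabel("Depth") wherever a label is generated by the code.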

Step 3

Enter the oce directory, launch R, and type as follows, to insert a file named R-oce.pot in the po directory.

tools::update_pkg_po(".")

This creates a file that has entries for all errors, warnings, and messages, as well as for gettext() entries. The file will be named po/R-oce.pot, and will contain entry pairs such as the following.

msgid "Depth"
msgstr ""

The first of these is a key, and the second is a replacement string (discussed presently). The details of the file format, and much else relating to gettext are available at the GNU site.

Step 4

To start work on, say, a French translation table, type the following in the shell.

cd oce/po
msginit --locale=R-fr --input R-oce.pot

This creates an English-French translation table in a file named po/R-fr.po. This must be done just once for each language to be translated. The key is the fr part of the name, which is the ISO-639 code for French.

Once the file is created, look at it with a text editor and, if necessary, change the charset to UTF-8, which can handle most languages.

Importantly, running msginit is necessary only at the first stage of translation. As new entries are added, one need only follow the work cycle given below.

Step 5

Edit po/R-fr.po as desired, inserting translations. You will need a text editor that permits characters in a variety of languages; vim and emacs are ideal for this.

Simply change the msgstr item to the translated value, e.g. for French the R-fr.po file should contain

msgid "Depth"
msgstr "Profondeur"

for this entry.
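When a translation file grows, it can help to list the entries that still have an empty msgstr. The following is a rough sketch of one way to do that in R, using a few illustrative lines in place of a real file. The helper unquote() is hypothetical, and the sketch assumes single-line entries; a real po/R-fr.po also has a header entry and may have strings continued over several lines, neither of which is handled here.

```r
# Illustrative .po content, standing in for readLines("po/R-fr.po").
poLines <- c('msgid "Depth"',
             'msgstr "Profondeur"',
             '',
             'msgid "Pressure"',
             'msgstr ""')

# Hypothetical helper: strip the keyword and the surrounding quotes from a
# line such as 'msgid "Depth"'.
unquote <- function(x)
    sub('^[a-z]+ "(.*)"$', "\\1", x)

# Pair each key with its replacement string (assumes single-line entries).
catalog <- data.frame(msgid=unquote(grep("^msgid ", poLines, value=TRUE)),
                      msgstr=unquote(grep("^msgstr ", poLines, value=TRUE)),
                      stringsAsFactors=FALSE)

# Keys whose replacement string is still empty
untranslated <- catalog$msgid[catalog$msgstr == ""]
print(untranslated)
```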

Update work cycle

The work cycle is as follows.

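  1. Edit the R source code, replacing strings like "FOO" with gettext("FOO", domain="R-oce"). (The domain must be given by name, because unnamed arguments to gettext() are all treated as strings to translate.)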

  2. Launch R in the oce directory and invoke tools::update_pkg_po(".") to add an entry to the po/R-oce.pot file, with corresponding entries in po/R-fr.po and any other existing translation files.

  3. Edit po/R-fr.po and any other translation files, changing the msgstr entry that corresponds to FOO.

  4. Run tools::update_pkg_po(".") again, to update all relevant files in the inst directory.

  5. Build and test oce.

  6. Repeat from step 1 as required for other words.

Since step 5 is slow, it helps to be watching your country win a gold medal in the Olympics hockey game, while you are doing such work.
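Steps 2 and 4 of the work cycle are the same call, so they can be wrapped in a small function that checks a few preconditions first. This is only a sketch: the name updateTranslations() is hypothetical, and the choice of which GNU gettext programs to check for (msgmerge and msgfmt) is an assumption about what tools::update_pkg_po() needs on a given system.

```r
# Hypothetical wrapper (not part of oce or tools) around steps 2 and 4.
updateTranslations <- function(pkgDir=".") {
    # A package source directory should contain a DESCRIPTION file.
    if (!file.exists(file.path(pkgDir, "DESCRIPTION"))) {
        message("no DESCRIPTION file in ", pkgDir, "; not a package directory")
        return(invisible(FALSE))
    }
    # Assumption: these GNU gettext programs must be on the search path.
    needed <- c("msgmerge", "msgfmt")
    absent <- needed[!nzchar(Sys.which(needed))]
    if (length(absent) > 0) {
        message("GNU gettext tools not found: ", paste(absent, collapse=", "))
        return(invisible(FALSE))
    }
    # Refresh the .pot file, merge into existing .po files, and compile them.
    tools::update_pkg_po(pkgDir)
    invisible(TRUE)
}
```

Called as updateTranslations() from within the oce directory, this does what typing tools::update_pkg_po(".") does, but stops with a message if the directory or the tools are not as expected.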

Using translations

In most cases the system language will be set with system tools. Still, Sys.setenv() can be handy for switching the language used in a plot (e.g. a French user may have the computer set up to work in French, but may prefer to graph data using English, for publication). Commonly, the language is set either with Sys.setenv() in the R startup file or with an environment variable in the OS shell, e.g. as below for temporary use

LANG=es_ES.UTF-8 R --no-save < spanish.R
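For switching language within a running R session, one possible approach is to set LANGUAGE just for the duration of one expression and then restore the previous value. The helper withLanguage() below is hypothetical, not an Oce function, and the sketch assumes that the LANGUAGE environment variable is what controls the translation, as in the test code later in this post.

```r
# Hypothetical helper (not part of oce): evaluate an expression with the
# LANGUAGE environment variable temporarily set, then restore the old value.
withLanguage <- function(lang, expr) {
    old <- Sys.getenv("LANGUAGE", unset=NA)
    on.exit(if (is.na(old)) Sys.unsetenv("LANGUAGE") else Sys.setenv(LANGUAGE=old))
    Sys.setenv(LANGUAGE=lang)
    force(expr) # evaluated only now, i.e. after the language has been switched
}

# The expression sees the temporary setting.
inside <- withLanguage("fr", Sys.getenv("LANGUAGE"))
print(inside)
```

With Oce loaded, a call such as withLanguage("en", plot(ctd)) would be one way for a user working in French to produce a single English plot.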

PART 2: Helping me with the translation effort

At the moment, Oce works in English, with some support for Spanish, French and German. The support for Mandarin is crude, having come from online translation engines.

If you would like Oce graphs to work in another language with which you have high familiarity, please contact me. I will need you to write down a few relevant words in your language and send them to me via PDF or scanned hand-written document (MSword and OpenOffice formats are not useful). Minimally, the words should be the ones used on axes of the graphs you use, but it would help other users if you could translate the phrases listed below. Also, translate E, W, N, and S, as used in longitude and latitude, as well as any other unit abbreviations that differ between English and your language.

Absolute Salinity
Conservative Temperature
Depth
Distance
Elevation
Longitude
Latitude
Potential Temperature
Potential Density Anomaly
Practical Salinity
Pressure
Sea Level
Speed
Temperature
Velocity

The easiest way to provide translated items is in a UTF-8 file containing translated phrases that I can copy and paste into the source files.
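To illustrate what happens with such a file, here is a sketch of turning a pair of phrase lists into entries in the format used by the translation table. The function name makePoEntries() is hypothetical, and the French phrases are given only as an illustration; the sketch does not escape quotation marks within phrases.

```r
# Hypothetical helper (not part of oce): format English phrases and their
# translations as msgid/msgstr pairs, with translations marked as UTF-8.
makePoEntries <- function(english, translated) {
    stopifnot(length(english) == length(translated))
    paste0('msgid "', english, '"\nmsgstr "', enc2utf8(translated), '"\n')
}

english <- c("Depth", "Pressure", "Temperature")
french <- c("Profondeur", "Pression", "Temp\u00e9rature")
entries <- makePoEntries(english, french)
cat(entries, sep="\n")
```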

Table of results

The following code demonstrates translations with default plots for various data types. Mandarin requires a family specification in the pdf() call. The Mandarin translations came from online engines and may be laughably bad.

library(oce)
datasets <- c("adv","adp","cm","ctd","drifter","lisst","lobo","sealevel","section","tdr")
languages <- c("en","es","de","fr","zh")
for (d in datasets) {
    data(list=d)
    if (d == "section")
        section <- sectionGrid(section)
    for (l in languages) {
        pdf(paste(d,"-",l,".pdf",sep=""), family=if (l=="zh") "GB1" else "Helvetica")
        Sys.setenv(LANGUAGE=l)
        plot(get(d))
        dev.off()
    }
}

To make the above work, an up-to-date version of the translation branch must be installed, by executing the following in R:

library(devtools)
# current devtools expects "user/repo", with the branch given as ref
install_github("dankelley/ocedata", ref="master")
install_github("dankelley/oce", ref="languages")

The results of the test code given above were shown as figures, one each for English, French, German, and Mandarin.

Resources

To leave a comment for the author, please follow the link and comment on their blog: Dan Kelley Blog/R.
