Content from Translations in R


Last updated on 2022-06-21

Estimated time 5 minutes

Overview

Questions

  • What is the history of the translations process in R?
  • What are the areas that need work?
  • Are there any existing translations that need updating?

Objectives

  • Become familiar with the history of translations in R
  • Exploring potential areas that need more work in translations

Introduction


This lesson is inspired by the materials used during the Translations sessions of the Collaboration Campfires.

Content from Status of Translations


Last updated on 2022-06-21

Estimated time 32 minutes

Overview

Questions

  • What is the current status of translations in R?
  • How can you find out more about the translations in your language of interest?

Objectives

  • Demonstrate the current status of translations in R
  • Explore the process for localization (translation) in R
  • Explore the mirror of the R source code
  • Introduce .pot and .po files

Introduction


Two file types, with the extensions .pot and .po, are central to the translation process in R. A .pot file is a template that contains the error messages, warnings, and other similar messages in R. In the template, each message appears in Standard English after the keyword msgid, for example msgid "Standard English message is placed here". Below every msgid there is an entry for the translated message, msgstr "", which is always empty in the .pot file. The translation is filled in as the msgstr in the corresponding .po file. The template .pot file and the translated .po file should always be stored in the same directory.
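
As a rough illustration of the msgid/msgstr structure described above, the sketch below parses a small, made-up .po fragment in base R and flags which messages still have an empty msgstr. The example strings and the helper name parse_po_entries are hypothetical, and the parser only handles single-line msgid/msgstr pairs; real .po files also contain headers, comments, multi-line strings, and plural forms.

```r
# A made-up .po fragment: one translated and one untranslated message
po_lines <- c(
  'msgid "invalid argument"',
  'msgstr "argument invalide"',
  '',
  'msgid "file not found"',
  'msgstr ""'
)

# Hypothetical helper: extract single-line msgid/msgstr pairs
parse_po_entries <- function(lines) {
  ids  <- grep('^msgid "',  lines, value = TRUE)
  strs <- grep('^msgstr "', lines, value = TRUE)
  data.frame(
    msgid  = sub('^msgid "(.*)"$',  "\\1", ids),
    msgstr = sub('^msgstr "(.*)"$', "\\1", strs),
    stringsAsFactors = FALSE
  )
}

entries <- parse_po_entries(po_lines)
entries$translated <- nzchar(entries$msgstr)  # an empty msgstr means untranslated
print(entries)
```

In a .pot template every row would have translated equal to FALSE, because all msgstr entries are empty there.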

Challenge 1: Exploration

Go through the dataset and example script available in the translations directory. Explore the status of translations.

Three datasets are available for use with this lesson: metadata.csv, message_status.csv, and 4168b6fff27eafad68a4b134dba5c7d09e090fcb.csv. Each dataset is described below:

  • The metadata.csv data file has one record per .po file in the R sources. It contains 12 variables, described in the table below:

    | Variable name | Description |
    |---|---|
    | sha | The shortened SHA for the git commit of the r-svn repo that the data were obtained from |
    | date | The date of the git commit of the r-svn repo that the data were obtained from |
    | package | The name of a package containing messages to be translated |
    | po_file | The name of the .po file in the package sources |
    | component | The component of the package the .po file relates to: C, R, or RGui (the latter is only in the base package) |
    | language | The name of the language in English, with the region as a suffix if applicable (e.g. English_GB vs English) |
    | r_version | The R version the .po file relates to (does not always match pot_creation_date) |
    | bug_reports | Where to report bugs related to this .po file |
    | pot_creation_date | The date the PO template file was created (when messages were last updated), YYYY-MM-DD format |
    | po_revision_date | The date the .po file was revised, YYYY-MM-DD format |
    | last_translator | The name and email of the last translator |
    | team | The name and/or email of the translation team |

  • The message_status.csv data file has one record per message in each .po file in the R sources. It includes the variables sha, date, package, po_file, component, and language as above, plus:

    | Variable name | Description |
    |---|---|
    | message | A message in the .po file |
    | translated | A logical value indicating whether the message has been translated |
    | fuzzy | A logical value indicating whether the translation is flagged as fuzzy, i.e. a fuzzy match of an old translation to a message that has had a minor update |

  • The 4168b6fff27eafad68a4b134dba5c7d09e090fcb.csv data file contains the results for the r-svn commit with hash 4168b6fff27eafad68a4b134dba5c7d09e090fcb. This .csv has 6 columns, described below:

    | Variable name | Description |
    |---|---|
    | git_commit | The commit hash |
    | package | The name of a package in the R sources |
    | language | The ISO 639 code of the language, including variant |
    | type | Either “C” or “R” |
    | n_translated | The number of correctly translated messages |
    | n_untranslated | The number of incorrectly translated messages |

The code below uses the message_status.csv data file to draw a bar plot of the number of correctly translated messages (translated and not flagged as fuzzy) per language, from the C or R code. The code and the resulting plot are shown below.

R

# Load required packages
library(dplyr)

OUTPUT


Attaching package: 'dplyr'

OUTPUT

The following objects are masked from 'package:stats':

    filter, lag

OUTPUT

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

R

library(forcats)
library(ggplot2)
library(readr)

# Read the data set
message_status <- read.csv("https://raw.githubusercontent.com/r-devel/rcontribution/main/collaboration_campfires/translations/message_status.csv")

# Keep messages that are translated and not fuzzy, excluding the RGui component
translated_ok <- filter(message_status, translated, !fuzzy, component != "RGui")

# Plot the counts per language, most frequent language first
ggplot(translated_ok, aes(x = fct_infreq(language))) +
  geom_bar(fill = "steelblue") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)) +
  labs(x = NULL, y = "# translated messages")
Figure: Correctly translated messages in base and default packages

Challenge 2: Propose new metrics and plots

Using the datasets and the example code above, analyse the current state of translations in the R project for the language(s) that interest you. Propose metrics, tables, or plots to explore the translations data. You can modify the code above to produce them.

The datasets are available in the r-devel/rcontribution GitHub repository.
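
One possible starting point for this challenge, sketched in base R on a small made-up data frame rather than the real data: the share of messages per language that are translated and not fuzzy, alongside the share flagged as fuzzy. The columns package, language, translated, and fuzzy follow the description of message_status.csv above; the values and the names done, pct_done, and pct_fuzzy are hypothetical.

```r
# Made-up rows with the same column names as message_status.csv
messages <- data.frame(
  package    = c("base", "base", "base", "stats", "stats", "stats"),
  language   = c("Hindi", "Hindi", "French", "French", "Hindi", "French"),
  translated = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE),
  fuzzy      = c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE)
)

# A message counts as done only if it is translated and not flagged as fuzzy
messages$done <- messages$translated & !messages$fuzzy

# Percentage of done and fuzzy messages per language
metrics <- aggregate(cbind(done, fuzzy) ~ language, data = messages,
                     FUN = function(x) round(100 * mean(x), 1))
names(metrics) <- c("language", "pct_done", "pct_fuzzy")

# Least complete languages first, to highlight where work is needed
metrics <- metrics[order(metrics$pct_done), ]
print(metrics)
```

The same summary could be grouped by package or component instead of language to find the files that need the most attention.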

Content from Contributing to Translations


Last updated on 2022-06-21

Estimated time 51 minutes

Overview

Questions

  • How do you translate messages, errors, and warnings in R into another language?

Objectives

  • Explore the translations process in R
  • Explore the translation tools available

Challenge 1: Check the progress of translations for a given language

The check_progress.R script provides code to check the translation progress for a given language and returns the result as a table.

R

lang <- "de"  ## ISO code for the language of interest
library(devtools)

OUTPUT

Loading required package: usethis

R

po_status <- source_url("https://raw.githubusercontent.com/r-devel/translations-campfire/main/check_progress.R")

OUTPUT

ℹ SHA-1 hash of file is ce6513224e87563d09fb3da80183b3484939636d

R

knitr::kable(po_status$value)


Challenge 2: Exploring the translations tools

The following tools help during the translations process:

Challenge 3: Standard English to British English

This challenge demonstrates the conversion of a .pot file in Standard English to a .po file in British English. Follow the steps below to perform this conversion.

  • Check the TODO list for British English (en_GB) to see which files need work.
  • Switch to the en_GB branch.
  • Look at example .pot and existing .po files.
  • Open a .pot file of interest in RStudio.
  • Explore the metadata in this .pot file and update it wherever required.
  • When the .pot file is updated, one of the following happens:
    • A message is added: This would need translation.
    • A message is removed: Its translation would be deprecated.
    • A message is changed: If this is a big change, it is treated as a new message being added and the old message being deleted. Otherwise the existing translation is marked as “fuzzy”, which means that it needs updating.
  • If no .po file exists for a .pot file, a new one can be created using the following command (this example creates a .po file for French, fr):
msginit -i R-base.pot -o R-fr.po -l fr
  • A message with singular and plural forms looks like this in the .pot file:
  msgid        "Warning message:\n"
  msgid_plural "Warning messages:\n"
  msgstr[0]    ""
  msgstr[1]    ""

The message above has two source strings: msgid holds the singular form and msgid_plural holds the plural form. For languages with a singular and only one plural, msgstr[0] is for the singular form and msgstr[1] is for the plural form. For languages that do not have plurals, give only one line, starting with msgstr[0]. (Slovenian would need four lines.)
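
To make the role of the msgstr[i] index concrete, here is a minimal sketch in R of how a plural rule can map a count n to an index. It does not call R's own translation machinery. The two-form rule (n != 1) is the usual Plural-Forms setting for languages such as English; the Slovenian rule is assumed to follow the one listed in the GNU gettext manual. The helper choose_msgstr and the German strings are illustrative only.

```r
# Rule for languages with one singular and one plural form (e.g. English):
# Plural-Forms: nplurals=2; plural=(n != 1);
plural_index_two_forms <- function(n) as.integer(n != 1)

# Slovenian rule, assumed from the GNU gettext manual (four forms):
# nplurals=4; plural=(n%100==1 ? 0 : n%100==2 ? 1 : n%100==3 || n%100==4 ? 2 : 3);
plural_index_slovenian <- function(n) {
  r <- n %% 100
  ifelse(r == 1, 0L, ifelse(r == 2, 1L, ifelse(r %in% c(3, 4), 2L, 3L)))
}

# Hypothetical helper: msgstr[0] in the .po file is element 1 in R
choose_msgstr <- function(n, msgstr, index_fun) msgstr[index_fun(n) + 1L]

# Illustrative German entries for the message shown above
msgstr_de <- c("Warnmeldung:\n", "Warnmeldungen:\n")

cat(choose_msgstr(1, msgstr_de, plural_index_two_forms))  # uses msgstr[0]
cat(choose_msgstr(3, msgstr_de, plural_index_two_forms))  # uses msgstr[1]

# Index of the msgstr line used for several counts in Slovenian
print(plural_index_slovenian(c(1, 2, 3, 4, 5, 101)))
```

The four distinct indices returned by the Slovenian rule are why a Slovenian .po file needs the lines msgstr[0] to msgstr[3] for each plural message.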

Challenge 4: Standard English to Hindi

This challenge demonstrates the conversion of a .pot file in Standard English to a .po file in Hindi. Follow the steps below to perform this conversion.

  • Check the TODO list for Hindi (hi) to see which files need work.
  • Switch to the hi branch.
  • Look at example .pot and existing .po files.
  • Open a .pot file of interest in the Poedit tool for translations.
  • Translate the messages from Standard English to Hindi one by one. An online keyboard can help with typing the translations.
  • If a translation needs further work, click the Needs Work button in Poedit. The message is then shown in an orange font in Poedit and is marked as fuzzy in the corresponding .po file.
  • Download the corresponding .po file and view it in RStudio to see the changes.
  • Take care not to translate variable names, function names, commands, package names, or format specifiers such as %d, %s, and %H.
  • If a translation (any msgstr in the .po file) is left as "", the untranslated message (msgid) will be used.
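
The rule about leaving format specifiers untouched can be checked mechanically. The sketch below, in base R, compares the specifiers found in a msgid with those in its msgstr. The helper names and the example strings are hypothetical, and the pattern is deliberately simple, so it will not recognise every valid printf-style form.

```r
# Extract printf-style format specifiers such as %d, %s or %5.2f
# (a simplified pattern; it does not cover every valid form)
extract_specs <- function(x) {
  regmatches(x, gregexpr("%[-+#0-9.$]*[a-zA-Z]", x))[[1]]
}

# Hypothetical check: TRUE if msgstr is empty or uses the same specifiers
same_specs <- function(msgid, msgstr) {
  if (!nzchar(msgstr)) return(TRUE)  # untranslated: the msgid is used as-is
  # Sort before comparing, because word order may differ between languages
  identical(sort(extract_specs(msgid)), sort(extract_specs(msgstr)))
}

msgid <- "cannot open file '%s': %d errors"

# Made-up translations: the second one wrongly changes %d to %s
ok  <- same_specs(msgid, "Datei '%s' nicht lesbar: %d Fehler")
bad <- same_specs(msgid, "Datei '%s' nicht lesbar: %s Fehler")
print(c(ok = ok, bad = bad))
```

A check like this only catches changed or missing specifiers; names of functions, variables, and packages still need to be reviewed by eye.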

Challenge 5: Try your own translations!

Want to try translating into a language that interests you? Check the open issues for that language on GitHub. Switch to the corresponding language branch on GitHub (named according to the ISO code). Open the files that need work and try your own translations.

Challenge 6: Bonus!

Some bonus tasks to work on!

  1. Complete translations of the .po file(s) you started working on during this lesson.
  2. Check out any other package files that need a translation.
  3. Translate into any other language of interest.
  4. Check existing .po files that might need an update in the translations.