Content from Translations in R
Last updated on 2022-06-21
Estimated time 5 minutes
Overview
Questions
- What is the history of the Translations process in R?
- What are the areas that need work?
- Are there any existing translations that need updating?
Objectives
- Become familiar with the history of translations in R
- Explore potential areas that need more work in translations
Introduction
This lesson is inspired by the materials used during the Translations sessions of the Collaboration Campfires.
Content from Status of Translations
Last updated on 2022-06-21
Estimated time 32 minutes
Overview
Questions
- What is the current status of translations in R?
- How can you find out more about the translations in your language of interest?
Objectives
- Demonstrate the current status of translations in R
- Explore the Process for Localization (Translation) in R
- Explore mirror of R source code
- Introduction to .pot and .po files.
Introduction
There are two basic files, with extensions .pot and .po, that are usually required during the Translation process in R. The .pot files are the template files which contain the error messages, warnings, and other similar messages in R. In the template .pot file, these messages appear in standard English in the form msgid "Standard English message is placed here". Below every msgid there is a placeholder for the translated message, msgstr "", which is always empty (the default) in the .pot file. The msgstr is filled in the corresponding .po file with the appropriate translation. Both the template .pot file and the translated .po file should always be stored in the same directory.
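To make the msgid and msgstr pairing concrete, here is a minimal sketch in base R that reads a few lines in .po form and flags which messages are still untranslated. The fragment, the French translation, and the helper name extract_field are made up for illustration; real .po files also contain a header entry, comments, multi-line strings, and plural entries, which this sketch ignores.

```r
# A made-up .po fragment (assumption: single-line entries only).
po_lines <- c(
  'msgid "invalid argument"',
  'msgstr "argument invalide"',
  '',
  'msgid "file not found"',
  'msgstr ""'
)

# Hypothetical helper: pull the quoted text out of lines that start
# with the given keyword followed by a space.
extract_field <- function(lines, keyword) {
  hits <- grep(paste0("^", keyword, " "), lines, value = TRUE)
  sub(paste0("^", keyword, ' "(.*)"$'), "\\1", hits)
}

entries <- data.frame(
  msgid  = extract_field(po_lines, "msgid"),
  msgstr = extract_field(po_lines, "msgstr"),
  stringsAsFactors = FALSE
)

# An empty msgstr means the message has not been translated yet.
entries$translated <- nzchar(entries$msgstr)
print(entries)
```

In a .pot template every msgstr is empty, so the translated column would be FALSE for every row.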
Challenge 1: Exploration
Go through the dataset and example script available in the translations directory. Explore the status of translations.
There are three datasets available to use with this lesson: metadata.csv, message_status.csv, and 4168b6fff27eafad68a4b134dba5c7d09e090fcb.csv. Each of these datasets is described below:
- The metadata.csv data file is a .csv with one record per .po file in the R sources. There are 12 variables included in this dataset. The variable names and their descriptions are provided in the table below:
Variable name | Description |
---|---|
sha | The shortened SHA for the git commit of the r-svn repo that the data were obtained from |
date | The date of the git commit of the r-svn repo that the data were obtained from |
package | The name of a package containing messages to be translated |
po_file | The name of the .po file in the package sources |
component | The component of the package the .po file relates to: C, R, or RGui (the latter is only in the base package) |
language | The name of the language in English, with the region as a suffix if applicable (e.g. English_GB vs English) |
r_version | The R version the .po file relates to (does not always match pot_creation_date) |
bug_reports | Where to report bugs related to this .po file |
pot_creation_date | The date the PO template file was created (when messages were last updated), YYYY-MM-DD format |
po_revision_date | The date the .po file was revised, YYYY-MM-DD format |
last_translator | The name and email of the last translator |
team | The name and/or email of the translation team |
- The message_status.csv data file is a .csv with one record per message in each .po file in the R sources. It includes the variables sha, date, package, po_file, component, and language as above, plus:
Variable name | Description |
---|---|
message | A message in the .po file |
translated | A logical value indicating if the message has been translated |
fuzzy | A logical value indicating if the translation is flagged as fuzzy, i.e. a fuzzy match of an old translation to a message that has had a minor update |
- The 4168b6fff27eafad68a4b134dba5c7d09e090fcb.csv data file contains the results for the r-svn commit with hash 4168b6fff27eafad68a4b134dba5c7d09e090fcb. This .csv has 6 columns, described below:
Variable name | Description |
---|---|
git_commit | The commit hash |
package | The name of a package in the R sources |
language | The ISO 639 code of the language, including variant |
type | Either “C” or “R” |
n_translated | The number of correctly translated messages |
n_untranslated | The number of incorrectly translated messages |
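As a rough illustration of how the n_translated and n_untranslated columns can be combined, the sketch below computes the percentage of translated messages per language. The data frame is made up and only borrows the column names from the table above; the real file would be read with read.csv() instead.

```r
# Made-up counts (assumption: same column names as the per-commit dataset).
commit_status <- data.frame(
  package        = c("base", "base", "stats", "stats"),
  language       = c("fr", "fr", "fr", "de"),
  type           = c("C", "R", "R", "R"),
  n_translated   = c(90, 180, 40, 50),
  n_untranslated = c(10, 20, 60, 0),
  stringsAsFactors = FALSE
)

# Sum both counts per language across packages and types.
totals <- aggregate(cbind(n_translated, n_untranslated) ~ language,
                    data = commit_status, FUN = sum)

# Share of messages counted as translated, as a percentage.
totals$pct_translated <- round(
  100 * totals$n_translated / (totals$n_translated + totals$n_untranslated), 1)
print(totals)
```

Grouping by package or type instead of language only requires changing the right-hand side of the formula.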
Using the information available in the message_status.csv data file, a bar plot is created which shows, for each language, the count of correctly translated messages (translated and not fuzzy) from the C or R code. The code used to generate this bar plot and the bar plot itself are provided below.
R
# Load required packages
library(dplyr)
OUTPUT
Attaching package: 'dplyr'
OUTPUT
The following objects are masked from 'package:stats':
filter, lag
OUTPUT
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
R
library(forcats)
library(ggplot2)
library(readr)
# Read the data set
message_status <- read.csv("https://raw.githubusercontent.com/r-devel/rcontribution/main/collaboration_campfires/translations/message_status.csv")
# Plot the counts of translated, non-fuzzy messages per language (excluding RGui)
ggplot2::ggplot(filter(message_status, translated, !fuzzy, component != "RGui"),
                aes(x = fct_infreq(language))) +
  geom_bar(stat = "count", fill = "steelblue") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
        legend.position = "none") +
  labs(x = NULL, y = "# translated messages")
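The numbers behind such a bar plot can also be tabulated without any plotting packages. The sketch below applies the same filter as the plotting code to a small made-up data frame (toy_status is a hypothetical stand-in for message_status) and orders the languages by count, which is the ordering fct_infreq() gives the bars.

```r
# Made-up rows with the columns used by the plotting code.
toy_status <- data.frame(
  language   = c("French", "French", "French", "French", "German", "German"),
  component  = c("R", "C", "C", "RGui", "R", "R"),
  translated = c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE),
  fuzzy      = c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Same filter as the plot: translated, not fuzzy, and not from RGui.
kept <- subset(toy_status, translated & !fuzzy & component != "RGui")

# Count per language, most frequent first.
counts <- sort(table(kept$language), decreasing = TRUE)
print(counts)
```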
Challenge 2: Propose new metrics and plots
Using the datasets and the example code above, analyse the current state of translations in the R project for the language(s) of your interest. Propose metrics, tables, or plots to explore the translations data. You can modify the code above to produce the metrics, tables, or plots.
The datasets are available on the r-devel/rcontribution GitHub repository.
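One possible metric, offered only as a starting point, is to flag .po files that were last revised before their template was regenerated, since these may contain messages that changed after the translation was done. The sketch below uses made-up rows with the pot_creation_date and po_revision_date columns described for metadata.csv; the possibly_stale column name is an assumption introduced here.

```r
# Made-up rows (assumption: dates stored as YYYY-MM-DD strings).
metadata <- data.frame(
  package           = c("base", "stats", "utils"),
  language          = c("French", "French", "German"),
  pot_creation_date = c("2022-04-01", "2022-04-01", "2021-10-15"),
  po_revision_date  = c("2022-05-10", "2020-11-02", "2021-12-01"),
  stringsAsFactors = FALSE
)

# A .po file revised before its template was created may be out of date.
metadata$possibly_stale <-
  as.Date(metadata$po_revision_date) < as.Date(metadata$pot_creation_date)

# List the packages and languages worth a closer look.
print(metadata[metadata$possibly_stale, c("package", "language")])
```

A date comparison like this is only a hint: a recently revised file can still have untranslated or fuzzy messages, so it is best read alongside message_status.csv.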
Content from Contributing to Translations
Last updated on 2022-06-21
Estimated time 51 minutes
Overview
Questions
- How can messages, errors, and warnings in R be translated into another language?
Objectives
- Explore the translations process in R
- Explore the translation tools available
Challenge 1: Check the progress of translations for a given language
The check_progress.R script provides code to check progress for a given language, producing a table as follows.
R
lang <- "de" ## ISO code for the language of interest
library(devtools)
OUTPUT
Loading required package: usethis
R
po_status <- source_url("https://raw.githubusercontent.com/r-devel/translations-campfire/main/check_progress.R")
OUTPUT
ℹ SHA-1 hash of file is ce6513224e87563d09fb3da80183b3484939636d
R
knitr::kable(po_status$value)
[Table: translation progress for the selected language]
Challenge 2: Exploring the translations tools
The following tools help during the translations process:
- Download the Poedit tool for translations.
- Keep an online keyboard handy.
- Explore a mirror of the R source code, especially the .po and .pot (template) files in it.
Challenge 3: Standard English to British English
Demonstration of the conversion of a .pot file available in Standard English to a .po file in British English. Follow the steps below to perform this conversion.
- Check the TODO list for British English (en_GB), to know which files need work on the British English translations.
- Switch to the en_GB branch.
- Look at example .pot and existing .po files.
- Open a .pot file of interest in RStudio.
- Explore the metadata in this .pot file and update it wherever required.
- When the .pot file is updated, one of the following happens:
  - A message is added: This would need translation.
  - A message is removed: This would be deprecated.
  - A message is changed: If this is a big change, then it is treated as a new message added and the old message deleted. Otherwise it is marked as “fuzzy”, which means that the translation needs updating.
- If a .po file corresponding to a .pot file does not exist, then a new one can be created using the following command (the command demonstrates creating a .po file for the French (fr) language).
msginit -i R-base.pot -o R-fr.po -l fr
- Some messages have plural forms.
msgid "Warning message:\n"
msgid_plural "Warning messages:\n"
msgstr[0] ""
msgstr[1] ""
In the message above, there are two source messages, msgid and msgid_plural, and two msgstr lines. For languages with a singular and only one plural, msgstr[0] is for the singular form and msgstr[1] is for the plural form. For languages which do not have plurals, give only one line starting with msgstr[0]. (The Slovenian language would need four lines.)
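The relationship between the number of plural forms and the number of msgstr lines can be sketched in base R as follows. The function plural_entry is hypothetical and written only for this illustration; it simply prints an empty plural entry with one msgstr[i] line per form and is not part of R or gettext.

```r
# Hypothetical helper: build an empty plural entry with one msgstr[i]
# line per plural form of the target language (indices start at 0).
plural_entry <- function(msgid, msgid_plural, n_forms) {
  c(
    sprintf('msgid "%s"', msgid),
    sprintf('msgid_plural "%s"', msgid_plural),
    sprintf('msgstr[%d] ""', seq_len(n_forms) - 1L)
  )
}

# Four msgstr lines, as a language with four plural forms would need.
writeLines(plural_entry("Warning message:\\n", "Warning messages:\\n", 4))
```

Calling the function with n_forms = 1 gives the single msgstr[0] line used for languages without plurals, and n_forms = 2 reproduces the entry shown above.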
Challenge 4: Standard English to Hindi
Demonstration of the conversion of a .pot file available in Standard English to a .po file in Hindi. Follow the steps below to perform this conversion.
- Check the TODO list for Hindi (hi), to know which files need work on the Hindi translations.
- Switch to the hi branch.
- Look at example .pot and existing .po files.
- Open a .pot file of interest in the Poedit tool for translations.
- Translate the messages from Standard English to Hindi one by one. Use an online keyboard to assist in the process of translation.
- If any translation needs further work, click on the needs work button in Poedit. The message will be shown in an orange font in Poedit and will be marked as fuzzy in the corresponding .po file.
- Download the corresponding .po file and view it in RStudio to see the changes.
- Take care not to translate variable names, function names, commands, package names, or format specifiers such as %d, %s, %H, etc.
- If a translation (any msgstr in the .po file) is left as "", then the untranslated message (msgid) will be used.
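Two of the points in the list above, keeping format specifiers intact and falling back to msgid when msgstr is empty, can be checked with a few lines of base R. This is a minimal sketch: the helper names and the French example strings are made up, the pattern is deliberately simple, and it does not handle positional forms such as %1$s or reordered specifiers.

```r
# Hypothetical helper: extract printf-style specifiers such as %d or %s.
format_specs <- function(x) {
  regmatches(x, gregexpr("%[0-9.]*[a-zA-Z]", x))[[1]]
}

# TRUE when the translation keeps the same specifiers in the same order.
same_specs <- function(msgid, msgstr) {
  identical(format_specs(msgid), format_specs(msgstr))
}

# An empty msgstr falls back to the untranslated msgid.
shown_message <- function(msgid, msgstr) {
  if (nzchar(msgstr)) msgstr else msgid
}

msgid <- "%d files in %s"
same_specs(msgid, "%d fichiers dans %s")  # specifiers preserved
same_specs(msgid, "fichiers dans %s")     # the %d specifier was lost
shown_message(msgid, "")                  # falls back to msgid
```

A check of this kind could be run over all msgid and msgstr pairs of a .po file before submitting it, as a complement to reviewing the translations in Poedit.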
Challenge 5: Try your own translations!
Want to try translating to a language of your interest? Check the open issues for a specific language on GitHub. Switch to the corresponding language branch on GitHub (named according to the ISO code). Open the files that need work and try your own translations.
Challenge 6: Bonus!
Some bonus tasks to work on!
- Complete translations of the .po file(s) you started working on during this lesson.
- Check out any other package files that need a translation.
- Translate into any other language of interest.
- Check existing .po files that might need an update in the translations.