[MDBF-366] Create interface to translate server error messages Created: 2022-03-22  Updated: 2023-05-15

Status: In Progress
Project: MariaDB Foundation Development
Component/s: None
Affects Version/s: None
Fix Version/s: N/A

Type: Task Priority: Major
Reporter: Ian Gilfillan Assignee: Vlad Bogolin
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: 0d
Time Spent: 7d 4.5h
Original Estimate: Not Specified


 Description   

The problem
MariaDB Server has a large number of error messages, located in sql/share/errmsg-utf8.txt
The list is incomplete in English, and partially complete in other languages.

The goal is to create a web interface to allow users to easily translate/review strings, so that these could be contributed back to the server.

1) Extract the strings from the error message file, determine which strings are translated for each language, and get an overall percentage completeness for each language
2) Allow users to contribute suggestions for untranslated strings.
3) Potentially allow users to review suggestions by other users
4) Collate the results allowing easy merges back to the Server.
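
As a rough illustration of step 1, here is a minimal Python sketch of the extraction and per-language completeness count. It assumes the layout of sql/share/errmsg-utf8.txt is an error name at column 0 followed by indented lines of the form `<lang> "<text>"`. The sample text, the made-up entry ER_EXAMPLE_ONLY and the function names are invented for the example, and messages that span several lines are not handled.

```python
import re
from collections import defaultdict

# Invented sample in the assumed layout; the real file is much larger.
SAMPLE = '''\
languages english=eng, german=ger, spanish=spa;

ER_NO
        eng "NO"
        ger "Nein"
        spa "NO"
ER_YES
        eng "YES"
        ger "Ja"
ER_EXAMPLE_ONLY
        eng "Example message %d"
'''

# Assumption: error names start at column 0 with ER_ or WARN_.
NAME_RE = re.compile(r'^(ER_\w+|WARN_\w+)')
# Assumption: a translation line is indented: <lang> "<text>".
ENTRY_RE = re.compile(r'^\s+([a-z][a-z-]*)\s+"(.*)"\s*$')


def parse_messages(text):
    """Return {error_name: {lang: message}} for the assumed layout."""
    messages = {}
    current = None
    for line in text.splitlines():
        if not line.strip() or line.startswith('#'):
            continue  # skip blank lines and comments
        name = NAME_RE.match(line)
        if name:
            current = messages.setdefault(name.group(1), {})
            continue
        entry = ENTRY_RE.match(line)
        if entry and current is not None:
            current[entry.group(1)] = entry.group(2)
    return messages


def completeness(messages):
    """Return {lang: percentage of error names that have a translation}."""
    total = len(messages)
    counts = defaultdict(int)
    for translations in messages.values():
        for lang in translations:
            counts[lang] += 1
    return {lang: round(100.0 * n / total, 1) for lang, n in counts.items()}


if __name__ == "__main__":
    print(completeness(parse_messages(SAMPLE)))
```

The nested dictionary keeps the error name as the key, so untranslated strings for a language are simply the names whose inner dictionary lacks that language code.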

We don't want to reinvent the wheel, and many projects do something similar. There are open-source projects that facilitate this. I have used Pootle in the past (though this is no longer active).

The primary part of the project would be automating the extraction and insertion of the error messages from/into the Server as much as possible.

Secondary would be reviewing which method to use for displaying and handling the translations from users.

Some options I'm aware of - this should be investigated more carefully:



 Comments   
Comment by Daniel Black [ 2022-03-23 ]

i18next

An extensible framework for writing our own is https://www.i18next.com/overview/plugins-and-utils (there is also a blog post on writing server-side customisations).

Transifex

https://www.transifex.com/open-source/

  • free for open-source projects
  • GitHub integration, single branch only
  • format: ours is odd but simple; customization is possible, or we could maybe work with a supported format and git hooks
  • translators can auto-join the project
Comment by Daniel Black [ 2022-04-26 ]

Lessons from MDEV-28227:

  • The 10.4 -> 10.5 merge was hard because MySQL was changed to MariaDB in a number of error messages (so 10.4 wasn't a great choice).
  • Changing English messages requires test cases to be re-recorded. So either a) disallow such changes, or b) automatically trigger mtr --record (for both release and debug builds) to update the result files as part of the test.
  • It is not particularly well documented anywhere, but the language code is the ISO 639-2 *B*ibliographic code.
  • The format specifiers (%.*[a-z]) from the first translation need to match those in the other translations exactly and in the same order. We'll need a validation for this. The build fails to compile if any of these are incorrect (made more painful because the build stops on the first failure).
  • Add the language to sql/share/CMakeLists.txt.
  • nla was reordered because it was out of alphabetical order. This made parts of the merge harder, but it's resolved now.
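
The format-specifier validation mentioned above could look roughly like the following Python sketch. The regular expression is only an approximation of printf-style specifiers (it accepts any letter as the conversion so that server-specific ones such as %M pass through) and is not the check the server build performs. The function names, the sample strings and the language codes xx1 and xx2 are made up for illustration.

```python
import re

# Approximate printf-style specifier: optional positional argument,
# flags, width, precision, length modifier, then a conversion letter.
SPEC_RE = re.compile(
    r'%(?:\d+\$)?[-+#0]*(?:\d+|\*)?(?:\.(?:\d+|\*))?(?:hh|h|ll|l|z)?[a-zA-Z]'
)


def specifiers(message):
    """Return the format specifiers of a message, in order of appearance."""
    # Drop literal percent signs (%%) before matching.
    return SPEC_RE.findall(message.replace('%%', ''))


def check_translations(translations):
    """Compare every translation against the first one listed.

    `translations` maps language code to message, in file order.
    Returns {lang: specifiers_found} for each mismatching language.
    """
    items = list(translations.items())
    expected = specifiers(items[0][1])
    return {
        lang: specifiers(text)
        for lang, text in items[1:]
        if specifiers(text) != expected
    }


if __name__ == "__main__":
    entry = {
        "eng": "Can't create file '%-.200s' (errno: %M)",
        "ger": "Kann Datei '%-.200s' nicht erzeugen (Fehler: %M)",
        "xx1": "(errno: %M) file '%-.200s'",   # specifiers reordered
        "xx2": "file '%-.100s' (errno: %M)",   # precision differs
    }
    print(check_translations(entry))
```

Running a check like this on every suggestion before it is accepted would report all mismatches at once, instead of stopping at the first failure as the build does.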

marko, who did the 10.5 -> 10.6 merge, noted:

  • too many whitespace changes (since resolved, won't happen again)
  • ER_PARTITION_DOES_NOT_EXIST was renamed to ER_DROP_PARTITION_NON_EXISTENT

greenman notes that most of the Spanish translation was added in 10.6. Do we use this as a base?

With the whitespace changes resolved and nla re-sorted, 10.5 might be an acceptable base. A single added Spanish translation in 10.6 should just be a little fuzz to resolve, which I think git can manage by default. If there were a mechanism to allow a translation in 10.4 when the English text in 10.4 and 10.5 is the same, that could be acceptable.

It would be advantageous if a translation tool could validate translations to prevent malicious messages.

Comment by Daniel Black [ 2022-04-27 ]

Added languages need to be in debian/mariadb-server-core*.install too.

Comment by Daniel Black [ 2022-04-29 ]

If we are going for a non-latest branch, we'll need to see how the translation text changes over future branches, e.g. ER_JSON_PATH_NO_WILDCARD (10.9), ER_TOO_BIG_SCALE and ER_TOO_BIG_PRECISION.
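
One way to see how the English text moves between branches is to compare the extracted messages of two branches by error name. The Python sketch below assumes each branch has already been reduced to a mapping of error name to English text. The function name and the placeholder message texts are invented, and only the error names come from this ticket.

```python
def compare_branches(old, new):
    """Classify error names by how their English text differs.

    `old` and `new` map error name -> English message text.
    """
    report = {"added": [], "removed": [], "changed": [], "unchanged": []}
    for name in sorted(set(old) | set(new)):
        if name not in old:
            report["added"].append(name)
        elif name not in new:
            report["removed"].append(name)
        elif old[name] != new[name]:
            report["changed"].append(name)
        else:
            report["unchanged"].append(name)
    return report


if __name__ == "__main__":
    # Placeholder texts, not the real messages.
    older = {
        "ER_PARTITION_DOES_NOT_EXIST": "text A",
        "ER_TOO_BIG_SCALE": "text B",
        "ER_TOO_BIG_PRECISION": "text C",
    }
    newer = {
        "ER_DROP_PARTITION_NON_EXISTENT": "text A",
        "ER_TOO_BIG_SCALE": "text B, reworded",
        "ER_TOO_BIG_PRECISION": "text C",
        "ER_JSON_PATH_NO_WILDCARD": "text D",
    }
    print(compare_branches(older, newer))
```

Entries in "unchanged" are the ones where a translation made against one branch could be carried to the other without review. A rename shows up as one "removed" plus one "added" name and would need to be paired by hand or by matching the text.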

Comment by Marko Mäkelä [ 2022-04-29 ]

There is a shortcut that could help avoid running all mtr tests when any English error messages are changed. Note: this also affects merges of such changes to later-version branches. Later versions might contain additional tests that trigger the changed message.

The shortcut is: Run git grep to find any files that refer to the old message string. Adjust them and run any tests that depend on those files (git grep for the file name to find any source directives).
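
The shortcut could be scripted along these lines. This is only a sketch: the helper names are made up, it goes a single level deep when following file names, and it assumes you search for a fixed fragment of the old message without format specifiers, since recorded results contain the expanded text. It uses git grep with -l (list file names), -F (fixed string) and -e (pattern).

```python
import os
import subprocess


def git_grep_command(fragment):
    """Build a git grep call listing files that contain a fixed string."""
    return ["git", "grep", "-l", "-F", "-e", fragment]


def files_mentioning(fragment, repo_dir="."):
    """Return the tracked files under repo_dir that contain `fragment`."""
    result = subprocess.run(
        git_grep_command(fragment),
        cwd=repo_dir,
        capture_output=True,
        text=True,
    )
    # git grep exits with status 1 when nothing matches.
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr.strip())
    return result.stdout.splitlines()


def affected_files(old_fragment, repo_dir="."):
    """Files that mention the old message, plus files that mention those
    files by name (for example through a source directive)."""
    direct = files_mentioning(old_fragment, repo_dir)
    affected = set(direct)
    for path in direct:
        affected.update(files_mentioning(os.path.basename(path), repo_dir))
    return sorted(affected)


# Example call (run inside a server checkout; the fragment is hypothetical):
# affected_files("Old message fragment")
```

The resulting list still has to be mapped to test names by hand, and later-version branches need the same search repeated after a merge.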
