Some More About Translating Business Central Apps

I’ve written before about using Azure Cognitive Services to translate the captions in the .xlf file that is generated when you compile your Business Central app. For us, the motivation is to make our apps available in as many countries as possible in AppSource.

Since then Søren Alexandersen has announced that it will not be necessary to provide all of a country’s official languages to make your app available in that country.

If you going to provide translations you might be interested in how to improve upon a the approach of the last post.

The Problem

The problem of course is that we are relying on machine translation to translate very short phrases or single words. A single word can mean different things and be translated in many different ways into other languages depending on the context. Context that the machine translation doesn’t have. That’s what makes language and etymology simultaneously fascinating and infuriating.

The problem is compounded by abbreviations and acronyms. You and I know that “Prod. Order” is short for “Production Order”. But “Prod” is itself an English word that has nothing to do with manufacturing.

We know that FA is likely short for “fixed asset” but if you don’t know that the context is an ERP system it could mean a whole range of things. How is Azure supposed to translate it?

What we need is some domain-specific knowledge.

The Solution

When we think about it we know that we’ve already got thousands of translations of captions into the languages that we want – if only we can get them into a useful format. We’ve got Docker images of Business Central localisations. They contain the base app for the location complete with source/target pairs for each caption.

If you can get hold of the xlf file it’s a relatively simple job to search for a trans-unit that has a source node matching the caption that you want to translate and find the corresponding translated target node.

As an example, I’ve created a container called ch from the image mcr.microsoft.com/businesscentral/sandbox:ch – the Swiss localisation of Business Central.

Find and expand the base application source.

$ContainerName = 'ch'
$Script = {Expand-Archive "C:\Applications.*\Base Application.Source.zip" -DestinationPath 'C:\run\my\base'}

Invoke-ScriptInBCContainer -containerName $ContainerName -scriptblock $Script

This script will find the zip file containing the localised base application and extract it to the ‘C:\run\my\base’ folder. This will take a few minutes but when it is done you should see a Translations folder containing, in the case of Switzerland, four .xlf files.

The following script will load the fr-CH .xlf file into an Xml Document, search for a trans-unit node which has a child source node matching a given string and return the target i.e. the fr-CH translation.

$Language = 'fr'
$enUSCaption = 'Prod. Order Line No.'
[xml]$xlf = Get-Content (Get-ChildItem "C:\ProgramData\NavContainerHelper\Extensions\$ContainerName\my\base\Translations" -Filter "*$Language*").FullName

$NSMgr.AddNamespace('x',$xlf.DocumentElement.NamespaceURI)
$xlf.SelectSingleNode("/x:xliff/x:file/x:body/x:group/x:trans-unit[x:source='$enUSCaption']", $NSMgr).target

Which returns “N° ligne O.F.” – cool.

Some Obvious Points

I’m going to leave it there for this post, save for making a few obvious points.

  • This is hopelessly inefficient. Downloading the localised Docker image, creating the container, extracting the base app – all to get at the .xlf files. We’re going to want a smarter solution before using this approach in any volume and for more languages
  • Each .xlf file is 60+MB – that takes a while to load into memory – you’ll want to keep the variable in scope and reuse it for multiple searches rather than reloading the document
  • Not all of the US English captions you create in your app will exist in the base application – you’ll still want to send those off for translation.

Maybe we can start to address these points next time…