Description
By default, Franc returns ISO-639-3 three-letter language tags, as listed in the Supported Languages table.
We would like Franc to alternatively support outputting IANA language subtags as an option, in compliance with the W3C recommendation for specifying the value of the lang
attribute in HTML (and the xml:lang
attribute in XML) documents.
(Two- and three-letter) IANA language codes are used as the primary language subtags in the language tag syntax as defined by the IETF’s BCP 47, which may be further specified by adding subtags for “extended language”, script, region, dialect variants, etc. (RFC 5646 describes the syntax in full). The addition of such more fine-grained secondary qualifiers are, I guess, out of Franc’s scope, but it would be very helpful nevertheless when Franc would be able to at least return the IANA primary language tags, which suffice, if used stand-alone, to be still in compliance with the spec.
On the Web — as the IETF and W3C agree — IANA language subtags and BCP 47 seem to be the de facto industry standard (at least more so than ISO 639-3). Moreover, the naming convention for TeX hyphenation pattern files (such as used by i.a. OpenOffice) use ISO-8859-2 codes, which overlap better with IANA language subtags, too.
If Franc would output IANA language subtags, then the return values could be used as-is, and without any further post-processing or re-mapping, in, for example CSS rules, specifying hyphenation:
@media print {
:lang(nl) { hyphenate-patterns: url(hyphenation/hyph-nl.pat); }
}
@wooorm :
- What is the rationale for Franc to default on ISO-639-3 (only)? Is it a “better” standard, and, if so, why?
- If you would agree it would be a good idea for Franc to support BCP 47 and outputting IANA language subtags as an available option, then how would you prefer it to be implemented and accept a PR? (We’d happily contribute.) Would it suffice to add and map them in
data/support.json
?
Activity