Ruby-Locale HOWTO

Locale ID handling

Ruby-Locale manages Locale IDs and provides APIs for locale handling. The APIs are thread-safe, and each thread has its own locale.

The simplest usage of this library is:

require 'locale'

Locale.current  #=> Returns current locale IDs.

Language Tag

A Language Tag is used as the Locale ID in this library. You can use this tag to select your own localized procedure in your program.

There are several standard formats for language tags, such as "ja_JP.UTF-8", "ja-JP" and "ja-Hira-JP". Ruby-Locale supports most of these formats and can convert between them.

  • Simple - <language> and <REGION>. This style alone is enough for almost all programs.
    • (e.g.) "ja-JP", "ja_JP".
  • Common - <language>, <Script>, <REGION> and <VARIANT>.
    • (e.g.) "ja-Kana-JP-MOBILE", "ja_JP_MOBILE"
  • RFC - IETF (RFC 4646 (BCP 47)) language tag.
    • (e.g.) "ja-Kana-JP-MOBILE"
  • CLDR - CLDR (Unicode Common Locale Data Repository) locale identifier.
    • (e.g.) "en_US-POSIX@calendar=islamic"
  • POSIX - POSIX locale identifier.
    • (e.g.) "ja_JP.UTF-8"

With Ruby-Locale, a program can use whichever tag format it prefers and still work with other programs that use a different format.

require 'locale'

lang = Locale::Tag.parse("ja_JP.UTF-8")
puts lang.to_s    # => "ja_JP"
p lang.to_simple  # => #<Locale::Tag::Simple: ja_JP>
p lang.to_rfc     # => #<Locale::Tag::Rfc: ja-JP>
p lang.to_cldr    # => #<Locale::Tag::Cldr: ja_JP>
p lang.to_posix   # => #<Locale::Tag::Posix: ja_JP.UTF-8>

lang = Locale::Tag.parse("ja_Hira_JP")
puts lang.to_s    # => "ja_Hira_JP"
p lang.to_simple  # => #<Locale::Tag::Simple: ja_JP>
p lang.to_rfc     # => #<Locale::Tag::Rfc: ja-Hira-JP>
p lang.to_cldr    # => #<Locale::Tag::Cldr: ja_Hira_JP>
p lang.to_posix   # => #<Locale::Tag::Posix: ja_JP>
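
To make the conversions concrete, here is a minimal plain-Ruby sketch of what moving between the POSIX and RFC styles involves. The helper names posix_to_rfc and rfc_to_posix are hypothetical and are not part of Ruby-Locale. The sketch only handles the Simple <language> and <REGION> form; the library's Locale::Tag classes also deal with scripts, variants and modifiers.

```ruby
# Hypothetical helpers for illustration only; not part of Ruby-Locale.

# "ja_JP.UTF-8" -> "ja-JP": drop the charset/modifier, use "-" as separator.
def posix_to_rfc(tag)
  base = tag.split(/[.@]/, 2).first   # strip ".UTF-8" or "@modifier"
  language, region = base.split("_", 2)
  [language.downcase, region && region.upcase].compact.join("-")
end

# "ja-JP" -> "ja_JP.UTF-8": use "_" as separator, optionally append a charset.
def rfc_to_posix(tag, charset = nil)
  language, region = tag.split("-", 2)
  posix = [language.downcase, region && region.upcase].compact.join("_")
  charset ? "#{posix}.#{charset}" : posix
end

puts posix_to_rfc("ja_JP.UTF-8")     # prints ja-JP
puts rfc_to_posix("ja-JP", "UTF-8")  # prints ja_JP.UTF-8
```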

Language TagList

A locale is represented by a Locale::TagList, an Array of Locale::Tag objects ordered by priority. A program therefore needs to pick one of the Locale::Tag objects and then run the matching localized procedure.

require 'locale'
ENV["LANGUAGE"] = "en_CA:en_US"

p taglist = Locale.current #=> [#<Locale::Tag::Posix: en_CA>, #<Locale::Tag::Posix: en_US>]
p taglist[0].to_s  #=> "en_CA"
p taglist.to_s     #=> "en_CA" (the same as taglist[0].to_s)
p taglist[1].to_s  #=> "en_US"

A Locale::TagList can behave as its first Locale::Tag, so taglist.language is the same as taglist[0].language.
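
The "behaves as its first tag" idea can be pictured with a small plain-Ruby sketch. TagSketch and TagListSketch are hypothetical stand-ins written for this example; they are not the library's classes and only imitate the delegation described above.

```ruby
# Hypothetical stand-ins for illustration; not the library's classes.
TagSketch = Struct.new(:language, :region) do
  def to_s
    [language, region].compact.join("_")
  end
end

# An Array of tags that forwards tag-like calls to its first element.
class TagListSketch < Array
  def language
    first.language
  end

  def region
    first.region
  end

  def to_s
    first.to_s
  end
end

taglist = TagListSketch.new([TagSketch.new("en", "CA"), TagSketch.new("en", "US")])
puts taglist.language  # prints en
puts taglist.to_s      # prints en_CA
puts taglist[1].to_s   # prints en_US
```

Because to_s is forwarded as well, interpolating the whole list into a string yields the highest-priority tag.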

Auto detection of the Locale ID and the charset.

Programs can get the Locale ID and the charset requested by the user or the platform. A program can also set the locale itself, but in almost all cases that is not necessary.

  • POSIX (Unix/Linux/*BSD): Get the locales from environment variables.
    • language tag - Read from the environment variables in the priority order LANGUAGE > LC_ALL > LC_MESSAGES > LANG. LANGUAGE may hold several locales, such as "en_CA:en_US"; the others hold a single locale, such as "en_CA".
    • charset - the result of `locale charmap`
  • Win32:
    • language tag - Try the environment variables as on POSIX. If none is set, get the value from the Win32 API.
    • charset - Get the value that corresponds to the language tag.
  • JRuby:
    • language tag - Try the environment variables as on POSIX. If none is set, get the value from the Java API.
    • charset - Get the value from the Java API.
  • CGI: Get the locale from the HTTP request.
    • language tag - HTTP_ACCEPT_LANGUAGE > QUERY_STRING(lang), Cookies(lang) > (en). HTTP_ACCEPT_LANGUAGE may hold several locales; the others hold a single locale.
    • charset - HTTP_ACCEPT_CHARSET > UTF-8
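
The POSIX lookup order in the list above can be sketched in plain Ruby as follows. The helper locales_from_env is hypothetical (Ruby-Locale performs this detection internally), and it takes the environment as a parameter only so that the example can be tried without changing the real ENV.

```ruby
# Hypothetical helper for illustration; Ruby-Locale does this internally.
# Returns the locale strings found in +env+, highest priority first.
def locales_from_env(env = ENV)
  # LANGUAGE may hold several locales separated by ":".
  language = env["LANGUAGE"]
  return language.split(":") if language && !language.empty?

  # The other variables hold a single locale each.
  %w[LC_ALL LC_MESSAGES LANG].each do |name|
    value = env[name]
    return [value] if value && !value.empty?
  end

  []  # nothing set; the caller decides on a default
end

p locales_from_env({ "LANGUAGE" => "en_CA:en_US", "LANG" => "ja_JP.UTF-8" })
# prints ["en_CA", "en_US"]
p locales_from_env({ "LC_MESSAGES" => "fr_FR.UTF-8", "LANG" => "ja_JP.UTF-8" })
# prints ["fr_FR.UTF-8"]
```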

With CGI, call Locale.init first, then pass the CGI object to the library.

require 'locale'
require 'cgi'

# Initialize Locale library.
Locale.init(:driver => :cgi)
cgi = CGI.new
Locale.set_cgi(cgi)

Locale.current  # Returns the current locale IDs for this request.

Without CGI, you don't need to call Locale.init explicitly.
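
For the CGI case, HTTP_ACCEPT_LANGUAGE carries several tags with optional quality values, as defined by HTTP. The following plain-Ruby sketch shows one way such a header can be turned into a priority-ordered list. The helper parse_accept_language is hypothetical and is not how Ruby-Locale is necessarily implemented.

```ruby
# Hypothetical helper for illustration; not part of Ruby-Locale.
# Parses an Accept-Language header value into tags, best match first.
def parse_accept_language(header)
  entries = header.split(",").each_with_index.map do |item, index|
    tag, *params = item.strip.split(";")
    q = params.map(&:strip).find { |param| param.start_with?("q=") }
    quality = q ? q.delete_prefix("q=").to_f : 1.0  # a missing q means 1.0
    [tag.to_s.strip, quality, index]
  end

  # Sort by descending quality; keep header order for equal qualities.
  entries.reject { |tag, _, _| tag.empty? }
         .sort_by { |_, quality, index| [-quality, index] }
         .map(&:first)
end

p parse_accept_language("fr-CA,fr-BE;q=0.8,fr-FR;q=0.6,en;q=0.4")
# prints ["fr-CA", "fr-BE", "fr-FR", "en"]
```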

Select the "locale identifier string" from Language tags

For example, a Web application receives several language tags from the Web browser at once. (You can find this setting in your browser's preferences or options.)

If the browser is set to three languages, such as "fr-CA", "fr-BE" and "fr-FR", which "locale identifier string" should the Web application support?

Supporting all of them is, of course, the best answer, but that may be difficult.

So you might decide to support "fr" and "fr-CA" but not "fr-BE". If "fr-BE" is requested, "fr" is used instead.

  1. If fr-CA is requested, then use fr-CA.
  2. If fr-BE is requested, then use fr. (fallback to fr)
  3. If fr-FR is requested, then use fr. (fallback to fr)
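
The three rules above can be sketched in plain Ruby. The helper fallback_candidates is hypothetical and written only for this example; it handles just the <language>-<REGION> form and is not the library's algorithm.

```ruby
# Hypothetical helper for illustration only.
# Maps each requested tag to a supported tag, falling back to the bare
# language ("fr-BE" -> "fr") when the full tag is not supported.
def fallback_candidates(requested, supported)
  matches = requested.map do |tag|
    language = tag.split("-").first
    if supported.include?(tag)
      tag
    elsif supported.include?(language)
      language
    end
  end
  matches.compact.uniq  # drop unsupported tags and repeated fallbacks
end

p fallback_candidates(%w[fr-CA fr-BE fr-FR], %w[fr-CA fr])
# prints ["fr-CA", "fr"]
```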

Ruby-Locale supports this fallback with the Locale.candidates method. It is better to use this method than to use the result of Locale.current (a TagList) directly.

p Locale.candidates  #=>  [#<Locale::Tag::Common: fr_CA>, #<Locale::Tag::Common: fr_BE>,
                           #<Locale::Tag::Common: fr>, #<Locale::Tag::Common: en>]

# Use :type to choose the language tag style.
p Locale.candidates(:type => :rfc)  #=> [#<Locale::Tag::Rfc: fr-CA>, #<Locale::Tag::Rfc: fr-BE>,
                                         #<Locale::Tag::Rfc: fr>, #<Locale::Tag::Rfc: en>]

# Use :supported_language_tags to keep only the tags the application supports.
p Locale.candidates(:type => :rfc,
                    :supported_language_tags => ["fr-CA", "fr"])
                                    #=> [#<Locale::Tag::Rfc: fr-CA>, #<Locale::Tag::Rfc: fr>]

Sample to get localized file:

Locale.candidates(:type => :rfc,
                  :supported_language_tags => ["fr-CA", "fr"]).each do |lang|
  path = "/foo/bar/locale/#{lang}/foo.rb"
  if File.exist?(path)
    File.open(path) do |file|
      # ... work with the localized file ...
    end
    break
  end
end

Or:

lang = Locale.candidates(:type => :rfc,
                         :supported_language_tags => ["fr-CA", "fr"])[0]
File.open("/foo/bar/locale/#{lang}/foo.rb")

Because Locale.candidates returns a Locale::TagList, which behaves as its first tag, you can also write:

lang = Locale.candidates(:type => :rfc,
                         :supported_language_tags => ["fr-CA", "fr"])
File.open("/foo/bar/locale/#{lang}/foo.rb")

Setting Locales

Ruby-Locale has "default" and "current" values. The default value applies to the whole application and is used when no current value is found.

The current value applies to the current thread only; each thread can have its own current value.

require 'locale'

# Set the default locale. This value is used by all threads.
Locale.default = "ja_JP"

# Set the current locale. This value applies to the current thread only.
Locale.current = "ja-JP"
Locale.set_current("ja-JP", "en-US")  # Several language tags, ordered by priority.
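
The relationship between the two values can be illustrated without the library. LocaleSketch below is a hypothetical module written for this example, not Ruby-Locale's implementation: the default is shared by all threads, while the current value is stored per thread and falls back to the default.

```ruby
# Hypothetical module for illustration; not Ruby-Locale's implementation.
module LocaleSketch
  @default = "en"

  class << self
    attr_accessor :default  # shared by every thread

    # The current value is stored as a thread-local variable.
    def current=(tag)
      Thread.current.thread_variable_set(:locale_sketch_current, tag)
    end

    # Fall back to the default when this thread has no current value.
    def current
      Thread.current.thread_variable_get(:locale_sketch_current) || default
    end
  end
end

LocaleSketch.default = "ja_JP"
worker = Thread.new do
  LocaleSketch.current = "fr_FR"  # affects only this thread
  LocaleSketch.current
end

puts worker.value          # prints fr_FR
puts LocaleSketch.current  # prints ja_JP (the main thread uses the default)
```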

Clear Locales

Locale values are cached in each thread. If a program needs fresh locale values, call the clear methods.

require 'locale'

# Clear current locale
Locale.clear

# Clear all locales of all threads.
Locale.clear_all
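
Why clearing matters can be shown with a small plain-Ruby sketch. CachedLocaleSketch is a hypothetical class written for this example; it caches per object rather than per thread and reads only LANG, so it illustrates the caching idea rather than the library's behaviour.

```ruby
# Hypothetical class for illustration; not Ruby-Locale's implementation.
class CachedLocaleSketch
  def initialize(env)
    @env = env      # any Hash-like object, e.g. ENV
    @cached = nil
  end

  # The first call reads the environment; later calls reuse the result.
  def current
    @cached ||= @env["LANG"]
  end

  # Forget the cached value so the next call reads the environment again.
  def clear
    @cached = nil
  end
end

env = { "LANG" => "ja_JP.UTF-8" }
locale = CachedLocaleSketch.new(env)
puts locale.current  # prints ja_JP.UTF-8

env["LANG"] = "en_US.UTF-8"
puts locale.current  # still prints ja_JP.UTF-8 (cached)

locale.clear
puts locale.current  # prints en_US.UTF-8
```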

Gets the charset

To get the system/user charset, call Locale.charset. This is useful for converting strings from the internal charset to the output charset.

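require 'locale'

# Iconv was removed from Ruby's standard library in 2.0; String#encode
# performs the same conversion from UTF-8 to the output charset.
puts "こんにちは".encode(Locale.charset)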

Resources

Ruby-Locale provides lists of ISO 639-3 languages and ISO 3166 regions. Require 'locale/info' first.

require 'locale/info'

p Locale::Info.get_language("ja")
p Locale::Info.get_region("JP")

Samples

See locale-x.x.x/samples/.

Online samples are here.

Last modified:2008/12/04 01:18:10