Monday 26 November 2012

c sharp programming

Ordinal versus culture comparison



There are two basic algorithms for string comparison: ordinal and culture-sensitive.
Ordinal comparisons interpret characters simply as numbers (according to their
numeric Unicode value); culture-sensitive comparisons interpret characters with
reference to a particular alphabet. There are two special cultures: the “current
culture,” which is based on settings picked up from the computer’s control panel, and
the “invariant culture,” which is the same on every computer (and closely maps to
American culture).
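
As a rough illustration of the three options described above, here is a minimal sketch. The class and method names (ComparisonKinds, OrdinalSign, and so on) are made up for this example; the underlying calls are the standard string.CompareOrdinal and string.Compare overloads. Results are normalized with Math.Sign because only the sign of a comparison result is meaningful.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class for illustration only.
public static class ComparisonKinds
{
    // Ordinal: compares the numeric values of the characters.
    public static int OrdinalSign(string a, string b) =>
        Math.Sign(string.CompareOrdinal(a, b));

    // Invariant culture: the same rules on every computer.
    public static int InvariantSign(string a, string b) =>
        Math.Sign(string.Compare(a, b, StringComparison.InvariantCulture));

    // Current culture: rules depend on the machine's regional settings.
    public static int CurrentSign(string a, string b) =>
        Math.Sign(string.Compare(a, b, StringComparison.CurrentCulture));

    public static void Main()
    {
        // 'a' is U+0061 (97) and 'B' is U+0042 (66), so ordinal puts "B" first.
        Console.WriteLine(OrdinalSign("a", "B"));    // 1

        // With culture data available (i.e. not in invariant-globalization
        // mode), a culture-sensitive comparison puts "a" before "B".
        Console.WriteLine(InvariantSign("a", "B"));

        // The result here depends on the culture reported below.
        Console.WriteLine(CurrentSign("a", "B"));
        Console.WriteLine(CultureInfo.CurrentCulture.Name);
    }
}
```

Passing a StringComparison value explicitly, rather than relying on a default, keeps the choice of algorithm visible at the call site.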
For equality comparison, both ordinal and culture-specific algorithms are useful.
For ordering, however, culture-specific comparison is nearly always preferable: to
order strings alphabetically, you need an alphabet. Ordinal relies on the numeric
Unicode code point values, which happen to put English characters in alphabetical
order, but even then not exactly as you might expect. For example, assuming case
sensitivity, consider the strings “Atom”, “atom”, and “Zamia”. The invariant culture
puts them in the following order:
"atom", "Atom", "Zamia"
Ordinal arranges them instead as follows:
"Atom", "Zamia", "atom"
This is because the invariant culture encapsulates an alphabet, which considers
uppercase characters adjacent to their lowercase counterparts (aAbBcCdD…). The
ordinal algorithm, however, puts all the uppercase characters first, and then all
lowercase characters (A..Z, a..z). This is essentially a throwback to the ASCII
character set invented in the 1960s.
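
The two orderings can be checked directly by sorting the same three strings with each comparer. This is only a sketch: SortOrderDemo and SortWith are names invented for the example, while StringComparer.InvariantCulture, StringComparer.Ordinal and Array.Sort are standard .NET members. The invariant-culture line is left without an expected output, since it depends on culture data being available at runtime; running it is the quickest way to see the order on your own machine.

```csharp
using System;

// Hypothetical demo class for illustration only.
public static class SortOrderDemo
{
    // Returns a sorted copy of the input, leaving the original array untouched.
    public static string[] SortWith(string[] words, StringComparer comparer)
    {
        var copy = (string[])words.Clone();
        Array.Sort(copy, comparer);
        return copy;
    }

    public static void Main()
    {
        string[] words = { "Zamia", "atom", "Atom" };

        // Culture-sensitive: the two spellings of "atom" end up next to each other.
        Console.WriteLine(string.Join(", ", SortWith(words, StringComparer.InvariantCulture)));

        // Ordinal: every uppercase letter sorts before every lowercase letter.
        Console.WriteLine(string.Join(", ", SortWith(words, StringComparer.Ordinal)));
        // Atom, Zamia, atom
    }
}
```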
