[ANSI-Smalltalk] Behaviour of #collect:
Richard O'Keefe
ok at cs.otago.ac.nz
Thu Sep 25 00:30:45 BST 2008
Let me offer just one example to show that returning
a collection of a different class from the receiver is
already considered acceptable.
One huge can of worms that the revised standard MUST address
(and would be useful even if it ONLY addressed this and no
other issue) is international support, including localisation,
but specifically including Unicode.
Several Smalltalk systems now DO handle Unicode.
Ambrai is or was one of them. (Are Ambrai still in business?
It looked cool, but the web site seems to be dead.)
One way to handle Unicode in strings is rather like the way
Interlisp-D used to do it. You keep strings as narrow as you
can, only widening them when you store a wider character than
they can currently hold. You might have
    String              "abstract"
      Latin1String      "1 byte per character"
      BMPString         "2 bytes per character"
      UnicodeString     "3 or 4 bytes per character"
and when you try to store a 16-bit character into a
Latin1String the receiver becomes a BMPString;
when you try to store a 21-bit character into a Latin1String
or a BMPString the receiver becomes a UnicodeString.
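As a rough sketch of that widening rule, the choice of representation could be written as a block that maps a code point to the name of the narrowest class able to hold it. The limits (16rFF for one byte, 16rFFFF for the BMP) and the name narrowestFor are assumptions for illustration, and the block answers symbols rather than classes because none of these classes is part of any standard:

```smalltalk
| narrowestFor |
"Hypothetical helper: answer the name of the narrowest string class
 that can hold a character with the given code point."
narrowestFor := [:codePoint |
    codePoint <= 16rFF
        ifTrue: [#Latin1String]
        ifFalse: [
            codePoint <= 16rFFFF
                ifTrue: [#BMPString]
                ifFalse: [#UnicodeString]]].

narrowestFor value: 65281    "answers #BMPString"
```

A store into a string would compare this answer with the receiver's current class and widen only when the receiver is too narrow.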
Now suppose we want to convert the printable ASCII characters
(code points 33 to 126) to their "FullWidth ASCII" counterparts,
leaving others alone.
    aLatin1String collect: [:each |
        (each codePoint between: 33 and: 126)
            ifTrue: [Character codePoint: each codePoint + 65248]
            ifFalse: [each]]
If the receiver is a Latin1String, the result will be a BMPString,
which is a different class.
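To see why the result cannot stay narrow, here is a small check that does the same mapping on integers, so it does not depend on any particular Unicode string class. It assumes Character understands codePoint, as in the example; some dialects spell it value or asInteger. The names fullWidth and widest are made up for this sketch:

```smalltalk
| fullWidth widest |
"Same mapping as the collect: block, but on code points (integers)."
fullWidth := [:cp |
    (cp between: 33 and: 126)
        ifTrue: [cp + 65248]
        ifFalse: [cp]].

"Find the largest code point the converted string would contain."
widest := 'Hi!' inject: 0 into: [:max :each |
    max max: (fullWidth value: each codePoint)].

widest > 16rFF    "true: the result does not fit in 1 byte per character"
```

Every character of the receiver fits in one byte, yet every converted character needs two, so the result has to be a wider class.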
Does anyone find that objectionable?