[ANSI-Smalltalk] Behaviour of #collect:

Richard O'Keefe ok at cs.otago.ac.nz
Wed Sep 24 05:00:17 BST 2008


On 23 Sep 2008, at 7:48 am, cstb wrote:
>> Would adding #withAll:collect: to every class that has #withAll:
>> be considered a good move?
>
>
> Seems backwards to me, and suggests an implementation with an
> extraneous copy.  As opposed to

How does it suggest that?
That interpretation fails some Gricean maxim or other:
   if the author *meant* (Thingy withAll: (whatsIt collect: transmogrify))
   s/he would have written that; there'd be no point in a method just to
   save a pair of parentheses, so this *must* be something that gets a
   similar effect in some better way, or whoever wrote this method call
   was an unhelpful idiot.

>
>
>>> collect: a1block as: aCollectionClass
>>> collect: a1block asKeyed: aKeyedCollectionClass

There are several problems with these.
One is that it is always the receiver that is in charge.
If someone writes
	OrderedSet withAll: someCollection collect: [...]
then it is OrderedSet that decides what to answer.
If someone writes
	someCollection collect: [...] as: OrderedSet
then it is someCollection that decides what to answer,
and it need not be an OrderedSet at all.
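
To make the difference concrete, here is a minimal sketch of how a
class-side #withAll:collect: might look for one growable class.  It is an
illustration only: the selector is the proposal under discussion, not
part of ANSI, and the "Class class >> selector" header is just
file-out-style notation for where the method lives.

```smalltalk
OrderedCollection class >> withAll: aCollection collect: aBlock
    "Hypothetical sketch of the proposed method; not part of ANSI.
     The receiving class builds the result, so the receiving class
     decides what is answered."
    | result |
    result := self new.
    aCollection do: [:each | result add: (aBlock value: each)].
    ^result
```

With #collect:as: the same loop would live in someCollection's class,
which is free to ignore the class it was handed.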

Oddly enough, in the Haskell-Café mailing list, someone was
complaining just now about "map block collection" putting the
block somewhere other than the end, and I think the reasons
given were good ones.
	someCollection as: aCollectionClass collect: aBlock
would be better.  But of course, if #withAll:collect:
suggests an extraneous copy, #as:collect: would do so too.

There is another important reason to prefer #withAll:collect:
over #collect:as: as an addition to ANSI Smalltalk.
#withAll:collect: is a generalisation of
a method (#withAll:) that DOES exist in ANSI Smalltalk.
#collect:as: is a generalisation of the #as: method,
which I grant you is a handy method, but it does NOT
exist in ANSI Smalltalk.

Whatever the original reason for adding aClass withAll: coll
instead of the traditional coll as: aClass, that reason still
applies to the combination of converting and transforming.

Another problem is that I really don't know what
#collect:asKeyed: is supposed to do when aKeyedCollectionClass
is a collection class but is not keyed.
The nice thing about #withAll: coll collect: block is that it
does whatever #withAll: coll would have done *except for
transforming the elements*.

>>> collect: a1block into: anExistingCollection
>>> collect: a1block intoKeyed: existingKeyedCollection

I experimented with those (inspired by Common Lisp) and
rapidly found that they made no practical sense.  What
do you do if the existing collection
  - is read only?  (as well as Interval, I have ReadOnlyString,
    ReadOnlyByteArray, ReadOnlyArray, and some others).
  - is a fixed size, but a different size from the receiver?
  - is indirect, like the traditional MappedCollection, or my
    LazyKeySet (keys of a dictionary, acts like a set, but
    does not copy) or LazyAssociationSet (associations of a
    dictionary)?
  - is the receiver?

Having said all that, I *do* have
   anExistingCollection addAll: aCollection collect: aBlock
but this time it is the receiver that is in charge:
  + read only collections just plain don't have that method
  + neither do fixed size collections
  + there is _very_ carefully written code for Dictionary, Set,
    Bag, SequencedContractibleCollection, and about a dozen
    other classes to make sure that passing in the
    receiver (or in the case of Lazy...Set, the underlying
    dictionary) will actually work.  I won't say it was easy!
There is certainly no question of a single
#collect:into: that could work for all or even many classes.
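
As a rough illustration of why even the receiver-in-charge version needs
care, a minimal sketch for OrderedCollection alone might look like the
following.  It is not the library code described above; the method body
and the choice to snapshot with #copy are assumptions made purely for
illustration.

```smalltalk
OrderedCollection >> addAll: aCollection collect: aBlock
    "Hypothetical sketch; not part of ANSI.
     If the argument is the receiver, iterate over a snapshot, because
     adding to a collection while enumerating it is not something to
     rely on."
    | source |
    source := aCollection == self
        ifTrue: [self copy]
        ifFalse: [aCollection].
    source do: [:each | self addLast: (aBlock value: each)]
```

Every other class (Set, Bag, Dictionary, the lazy views) needs its own
answer to the same aliasing question, which is why no single generic
method will do.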
>
>> What exactly should #collect: do when sent to an object that
>> belongs to a class whose instances cannot hold the results that
>> the block returns?
>
>
> Great way to phrase the question.
>
> Some possible rules which strike me as sensible:
>
> 1) Same class as receiver, if possible, otherwise send error to self.

If "send error to self" means something like "self pvtCannotStoreThat"
it's no use unless you *also* standardise what that does, in which case
you only need to specify that behaviour, not the intermediate send.
If it means "raise an exception", we don't need yet more unspecified
exceptions.  We need things we can *portably* recover from.

"conforming to the same protocol as the receiver" is the way the
ANSI standard approximates "same class as receiver."
Been there, done that, got the scars.  Not useful.
>
> 2) Same species as receiver, if possible, otherwise send error to self.

There is no notion of "species" in the ANSI Smalltalk standard.
If there were, the description of "Interval" would change, but
that's all.

There is good reason for the standard not to talk about 'species'
(my PDF viewer can't find that word anywhere).
The concept is ambiguous.
There are at least two notions of species for a collection.
(1) What should I return in response to #select: or #reject:?
     (Interval cannot return Interval here, but must return Array.)
     (I have a DateInterval class that's similar.)
(2) What should I return in response to a #collect:?
There is an important distinction between them: the elements of
the results of a #select: or #reject: are always elements of the
receiver, so necessarily satisfy the constraints that that kind
of collection imposes.  The elements of the results of a
#collect: are *not* always elements of the receiver, so the
possibility exists that a different kind of collection (but quite
possibly one implementing the same protocol) might be needed.

>
> 3) (a1block value: self first) collectionSpecies

This is meaningless when the receiver is empty.
It relies on a new #collectionSpecies method which is not
part of the standard.
It gives really dumb answers.  Consider
	#[0 1 2] collect: [:each | each * 1000]
It would not be surprising if 0 collectionSpecies were
ByteArray, but then that would not work for the second
element.
It also assumes that the result does not depend on the
receiver at all, which doesn't strike me as sensible.

>
> 4) always anArray

We are talking about reforming the standard, not revolting against it.
>
> 5) always anOrderedCollection

Ditto.
>
> 6) always aList

What's "aList"?
>
> 7) best effort, via some predictable sequence of retries

That's pretty much what I proposed.
>
> 8) as is (essentially case-by-case ad-hoc)

I don't know what you mean by "as is".  At the moment "as is"
behaviour is generally *NOT* case-by-case ad-hoc, but standard.
"As is", in fact, is pretty much 1 or 2.
>

> I *think* I'd vote for #1, as a standard, even though it would likely
> break some currently working code, because it seems most
> consistant/predictable.

No, that's what we *have* in the standard.
"Predictable" isn't an unqualified virtue;
"predictably useless" is still useless.
Even when consistently spelled, "consistent" is not
an unqualified virtue either;
Emerson's "A foolish consistency is the hobgoblin of little minds,
adored by little statesmen and philosophers and divines."
applies here as well.
"Consistently USEFUL" beats "consistently DANGEROUS".

The problem is that if you want to map over some collection x using a
block b whose results cannot fit in a collection of the same type as x,
there is currently, in the standard and in practice,
  -1- NO portable way to detect that it has happened or recover.
  -2- NO portable way to program around it without making an extra copy.
  -3- NO portable way to add your own "safeCollect:" method
    (because there is no standard way to add a method to any existing
     class; the existing collection *factories* need not even be classes).
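
To make -2- concrete, the workaround available today looks something like
this sketch, which uses only #withAll: and #collect: and pays for one
intermediate Array.  The variable names are illustrative.

```smalltalk
| bytes scaled |
bytes := #[0 1 2].
"Copy into an Array first, since a ByteArray cannot hold 1000 or 2000."
scaled := (Array withAll: bytes) collect: [:each | each * 1000].
```

The Array built by #withAll: is discarded as soon as #collect: has
answered; that is the extra copy.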

I don't see a change to -3- as likely.
I've already said that I would like to see *standard* exceptions
when errors occur so that portable code can catch them (-1-), but
if it's this hard to reach agreement on small simple issues, how
long will that take?
-2- is precisely the job of #withAll:collect:, and since the caller
knows about the block, it's presumably OK for the caller to be
expected to know about what kind of collection will accept the
block's results.  So
	ShortArray withAll: #[0 1 2] collect: [:each | each * 1000]
is a viable alternative to "just make it work", *if* that goes in
the standard.
>

> Meanwhile, I'm using #collect:into:, as needed.

My library had #collect:into:, but I ripped it out.
I replaced it with
     aCollection addAll: coll collect: aBlock
for collection objects that can grow and
     SomeCollectionClass withAll: coll collect: aBlock
for all collections (except of course Interval).
I've never looked back.

Perhaps I was wrong.
Could you give us some examples of #collect:into:?

Of course the real issue here is whether the programmer can
reasonably be expected to KNOW that something other than #collect:
is needed.
