[ANSI-Smalltalk] Smalltalk file streams

Richard O'Keefe ok at cs.otago.ac.nz
Mon Oct 20 01:11:09 BST 2008


On 17 Oct 2008, at 7:00 pm, Paolo Bonzini wrote:

>
>> Again, an exceedingly unhelpful response.
>
> To explain my short answer of PEBKAC: the keyboard and the chair are the
> programmer's, if that's not clear.

It most definitely wasn't.

>  The standard *can* provide methods
> that sometimes make sense and sometimes don't.  If they don't, and the
> result could be infinite request of memory or an infinite loop, it's a
> problem of the user.

And now we are back to the ambiguous "user".

The thing is, if you persist in this attitude, *everything*
ends up being the fault of the woman in the ATM queue.
Everyone else just says "it's not MY problem, PEBKAC PEBKAC PEBKAC".

I suspect that you may be muddling up two distinguishable
interpretations of "sometimes make sense and sometimes don't".

(1) If the standard specifies a precondition
     and a method invocation violates that precondition,
     that's a method that "sometimes makes sense and sometimes doesn't".
     If the precondition is not spelled out, it's the standard's fault.
     If it is spelled out and violated anyway, it's the programmer's
     fault, PROVIDED the standard provides means for the programmer
     to detect the fault in advance.

(2) If the standard specifies a precondition and a postcondition
     and a method invocation satisfies the precondition
     but the postcondition is violated, then it's NOT the
     programmer's fault: if the standard has specified a result
     that cannot be achieved, it is the standard's fault; if the
     standard's result could be achieved but the implementation
     doesn't do that, it's the implementor's fault.

Let's restructure that and introduce some technical terms:

     programmer error:
	precondition violated, programmer could have known
     programmer incapacity:
	precondition violated, programmer could not have
	known
     implementation error:
	precondition ok, postcondition violated,
	implementation wrong
     implementation limit:
	precondition ok, postcondition not established,
	implementation basically ok but ran out of something
     standard error:
	precondition ok, postcondition violated,
	no reasonable implementation _could_ get it right.


The key discriminant here is the precondition.
I am not saying that the standard could or should specify
only things that work all the time (we'd have no floating
point arithmetic in that case).  But I am saying that the
standard should be more explicit about preconditions.  It
is explicit about argument types,  but that's pretty much
as far as it goes.

There's also the problem in something like Smalltalk that
the standard has two faces.  When 5.3.1.1 says that #=
only returns true and false and has no errors, it is
  - making a PROMISE that implementations will get this right
    for the standard "classes",
  - setting out a REQUIREMENT for users who want to define
    the same method.

Some claims clearly cannot be taken at face value because of
global resource limits:  5.3.1.6 says that #copy has no
errors, but despite that it's obvious that an attempt to
copy some object may fail because there is not enough memory
left.  But this is something that applies to *every* method:
#= could try to allocate arbitrary amounts of data on its
way to a true or false.  There are similar issues with
opening files.

The standard needs to say something about this.
If there's a section of the ANSI standard that talks about
it, I've been unable to find it.

Note that running out of memory is an implementation limit.
It's not PEBKAC.  (Hare PEBKAC, hare PEBKAC, hare hare.)

For an example of something that's a programmer error,
	nil perform: #==
because 5.3.1.15 says the behaviour is undefined if the
number of arguments (here 0) does not match that implicitly
required by the syntactic form of the selector (here 1), so
the precondition is violated, and it is practical to check,
so it's an error, not an incapacity.

Anyone designing a new language would almost surely have
designated a standard exception to be raised when #perform:
gets an arity clash.  It seems bizarre that ANSI Smalltalk
has one of the most elaborate exception handling schemes
around and makes so little use of it.  But that's another
issue.
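
To make the "practical to check" point concrete, here is a minimal
sketch of a guarded perform.  It assumes the dialect provides
Symbol>>numArgs (common in practice, but not something the ANSI
standard promises), and safePerform is a made-up name for
illustration, not part of any standard protocol.

```smalltalk
| safePerform outcome ok |
"safePerform is a hypothetical helper.  It checks the arity that the
 selector's syntactic form implies before calling #perform:.
 Assumes Symbol>>numArgs exists in the dialect at hand."
safePerform := [:receiver :selector |
    selector numArgs = 0
        ifTrue: [receiver perform: selector]
        ifFalse: [Error new signal: 'arity clash for ', selector asString]].

"#== needs one argument, so the precondition of #perform: is violated
 and we get a well-defined exception instead of undefined behaviour."
outcome := [safePerform value: nil value: #==]
    on: Error
    do: [:e | e return: e messageText].

"#negated is unary, so the precondition holds and the send goes ahead."
ok := safePerform value: 3 value: #negated.
```

Nothing here is beyond what an implementation of #perform: could do
for itself, which is the point.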

>
>
> Certainly, it would be *insane* to require that #contents detects device
> files where it cannot terminate.

You keep on talking about cases where #contents cannot
TERMINATE.  But I was, and still am, talking mainly about
cases where #contents cannot START.  And it is not only
not insane to detect that, it's quite easy.

  - if you only have write access to an external resource,
    you can *never* perform #contents on it.  So it doesn't
    *EVER* make sense to have #contents in the <writeFileStream>
    protocol.

  - if you cannot seek back to the beginning of an external
    resource, you can *never* find the full contents.
    This applies to sockets, pipes, ttys, ptys, and all that
    kind of thing.  And at least on Windows, VMS, MCP, and
    UNIX variants, it's dead simple to recognise such things.
    To a first approximation, it doesn't make sense to have
    #contents in the interface of anything but a plain disc
    file stream.

    Far from being insane, this is exactly what Squeak and
    VisualWorks do: for the sake of #position: and #contents
    they won't let me open anything where those operations
    would not work.  Why would anyone call insane what actual
    systems actually *do*?

  - if you are reading from a disc file, another process might
    be writing to it at the same time.  This is why the 'tail'
    command has the '-f' option.  Of course, in this case the
    whole notion of "THE contents" of the stream is problematic.
    However, the strategy of determining the size *now* and
    reading up to that will work (as indeed the 'tail' command
    normally does).

So #contents belongs in <readFileStream> only,
not in <writeFileStream>, and it requires a precondition
something like
	"#contents need not work for all kinds of external
	file; it is implementation-defined which."
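
The "determine the size *now* and read up to that" strategy can be
sketched using nothing but ANSI stream messages (#setToEnd, #position,
#reset, #next:).  An in-memory ReadStream stands in for a disc file
here, because the way a file stream is opened differs between
dialects; that substitution is an assumption of the sketch.

```smalltalk
| stream limit snapshot |
"Stand-in for a <readFileStream> on a plain disc file."
stream := ReadStream on: 'hello, world'.

"Find out where the end is right now."
stream setToEnd.
limit := stream position.

"Seek back to the start and read exactly that much, no more.
 Anything appended after this point is simply not part of the answer."
stream reset.
snapshot := stream next: limit.
```

Note that both steps need the stream to be seekable, which is exactly
what sockets, pipes, and ttys are not.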


>  And it would be even *impossible* to
> detect in advance whether #contents on a socket would terminate,

Not relevant.  Because we can't seek, we can't get the
*PAST* sequence values, so it is quite certain that #contents
does not make sense for a socket.

> but yet
> it would be useful to provide #contents on a socket.

I doubt it.  I suspect you are thinking of reading up to the
end, which is quite different from #contents.
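
A small sketch of the difference, again on an in-memory stream and
using only ANSI messages.  Reading to the end yields what is still
unread; #contents on a <ReadStream> yields the past sequence values
as well.

```smalltalk
| in out rest whole |
in := ReadStream on: 'abcdef'.
in next: 2.                          "consume 'ab'"

"Reading up to the end: only the elements not yet consumed."
out := WriteStream on: String new.
[in atEnd] whileFalse: [out nextPut: in next].
rest := out contents.                "'cdef'"

"#contents: everything, including what was already read."
whole := in contents.                "'abcdef'"
```

The first of these makes sense on a socket; the second needs the
past values, which a socket cannot give back.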

>  It's a problem
> that *programmers* sitting between a keyboard and a chair have to solve,
> not whoever writes the standard.

The standard needs to first give them the TOOLS.
>
>
>> there is no "THE ... way" that the Array new: call
>> would fail;
>
> But I'd expect #contents on a 500 GB data file to fail *that same* way.

I wouldn't.
>
>
>>> The erratum here is that #contents belongs in <gettableStream> and
>>> <WriteStream>, not in <collectionStream> and <FileStream>.  I sent
>>> another email on the subject.
>>
>> Except that for smallish files, #contents is quite useful.
>
> Indeed.  <FileStream> *is* a <gettableStream> isn't it?

We both made mistakes here.  No, <FileStream> isn't a
<gettableStream>, but <readFileStream>, which I think is what you
meant, _is_.

However, <gettableStream>s in general cannot support #contents
either.  Do we agree that #contents belongs in <ReadStream>,
<WriteStream>, and <readFileStream>?

>
>
> Paolo
>