
To: scheme-reports@x
Subject: [Scheme-reports] Errors (?) in 7.2, Formal Semantics
From: William Cushing <william.cushing@x>
Date: Wed, 26 Nov 2014 15:50:24 -0800

I'm trying to grok the formal semantics section, 7.2, of r7rs small.

It seems to me there are numerous small errors, but perhaps I'm just confused.

I don't see any errata for the formal section on the web page, and I couldn't find a better place to submit bugs... so, here goes:

;;;;

;; car-internal

;;;;

As written, the declared type of car-internal is "expression-value to expression-continuation to command-continuation" (E->K->C).

However, the definition starts with "\lambda \eps \omega \kappa", and results in calling hold on two args (which yields a command-continuation). So the type of the definition is "expression-value to dynamic-point to expression-continuation to command-continuation" (E->P->K->C).

Based on how car-internal is called, the fix appears to be: simply delete the \omega in the definition. In other words, the type statement is correct, but the definition is not.

;;;;

;; valueslist

;;;;

As-is the definition of valueslist resembles:

;; E->K->C
(define valueslist
  (lambda (e k)
    (if (pair? e)
        (cdr-internal e
          (lambda (e*)                       ; (0)
            (valueslist e*                   ; (1)
              (lambda (e**)
                (car-internal e
                  (single (lambda (e+)       ; (2)
                            (k (list* e+ e**)))))))))  ; (3) list* = cons
        ;; not a pair
        (if (null? e)
            (k (list))
            (error "non-list argument to values-list")))))

At (1), e* will be a (singleton) sequence of values, but valueslist does not accept a sequence of values as its first argument. Since car-internal and cdr-internal work by sending a singleton sequence to their expression-continuation, it seems that (1) should be:

(valueslist (first e*) ...) ; (A)

Alternatively, (0) could be:

(single (lambda (e++) (valueslist e++ ...))) ; (B)

That is perhaps more pleasing, since it would be symmetric with how

car-internal is called at (2).

On the other hand, calling internal functions (bypassing type-checks)

suggests complete confidence, and so perhaps bypassing

argument-count checking (from single) is desirable as well. If so,

then (A) is preferable, and for symmetry, likewise (2) should be:

(lambda (e***) (k (list* (first e***) e**))) ; (A') list* = cons

--

Incidentally, list->values seems more Scheme-y than valueslist.
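For what it's worth, a small executable model of valueslist with fix (A) applied behaves as expected (Python; pairs are modeled as 2-tuples, the empty list as (), and all helper names are invented for illustration):

```python
# Sketch of valueslist with fix (A): unwrap the singleton sequence that
# cdr-internal delivers before recurring.

def car_internal(e, k):
    return k([e[0]])          # delivers a singleton sequence of values

def cdr_internal(e, k):
    return k([e[1]])          # delivers a singleton sequence of values

def valueslist(e, k):
    if isinstance(e, tuple) and len(e) == 2:       # a pair
        return cdr_internal(
            e,
            lambda e_star:                         # (0): e_star is a singleton
                valueslist(
                    e_star[0],                     # (1) = (A): unwrap first
                    lambda e_2star:
                        car_internal(
                            e,
                            lambda e_plus:         # (2), bypassing `single`
                                k([e_plus[0]] + e_2star))))  # (3): cons
    if e == ():                                    # the empty list
        return k([])
    raise TypeError("non-list argument to values-list")

# (1 2 3) as nested pairs yields the sequence of values <1, 2, 3>.
scheme_list = (1, (2, (3, ())))
out = valueslist(scheme_list, lambda vals: vals)
```

Without the (A) unwrapping, the recursive call would receive the singleton sequence itself and immediately hit the non-list error.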

;;;;

;; list

;;;;

In the definition of list, both list and cons are called. Both are supposed to take a dynamic point as input, but neither use within list provides one.

Supposing that the dynamic point doesn't need to change, then an \omega has

to be inserted before (single ...) and before the last occurrence of \kappa.

I don't (yet?) grok dynamic points, so that suggested fix

may very well be wrong; it's just the minimal thing that ensures type

correctness.

;;;;

;; cons

;;;;

The definition of cons flips the order of \omega and \kappa in its use of

twoarg:

\twoarg(\lambda \eps_1 \eps_2 \kappa \omega ...

should be

\twoarg(\lambda \eps_1 \eps_2 \omega \kappa ...

;;;;

;; lambda (rest-arg)

;;;;

While it's perfectly clear what is meant, the line declaring that the meaning of "(\lambda I \gamma^* E_0)" is the meaning of "(\lambda (. I) \gamma^* E_0)" is more or less breaking the rules. That is, "(. I)" is not legal syntax. Or, well, the abstract syntax (7.2.1) declares it valid, but the concrete syntax (7.1.3) specifically contradicts it.

In an evaluation context, <> \concat <I> or (cons (list) (list I)) is

legitimate, in contrast with "(lambda (() . I) ...)"; that fact can be better

exploited to avoid needlessly cheating with the abstract syntax.

Namely, one can factor out the common meaning of the two cases at the

meta-level rather than at the syntactic level, something like:

\meaning{(\lambda (I* . I) \gamma^* E_0)} = \foo I* I \gamma^* E_0

\meaning{(\lambda I \gamma^* E_0)} = \foo <> I \gamma^* E_0

Obviously \foo needs a better name, perhaps \listify-args.

;;;;

;; Builtins

;;;;

"add" and "less" are orphans. "eqv" may also be. (They are unreferenced

in the rest of the formal semantics.) (When the semantics require such

operations, "+", "<", and so forth of the meta-language are used directly.)

Perhaps just delete them? One could posit that the initial environment

extends "\rho('+) = add" and "\rho('<) = less", which perhaps makes their

definitions more interesting to a reader. Still, there are many further

required primitives; why do these particular primitives get special

treatment?

Worse, the definitions given aren't (entirely) correct.

That is, implementations are specifically licensed (elsewhere in the

report) to implement less than the mathematical ideal; but the definitions

given don't appear to allow such restrictions.

Addition (object-level), in particular, will, in any implementation,

eventually run out of memory (for bignums), or otherwise fail to live up to

the mathematical ideal ([signalling?] errors on fixnum overflow, for

implementations without bignums). However, as written, the formal

semantics imply that object-level addition can only fail due to wrong

number (from twoarg) or wrong type of arguments.

I suppose one could take the stance that the meta-function "in" (injection

into an object-level domain, in this case, numbers) is where

implementations are allowed to hang restrictions. But then it wouldn't be

a meta-function; it would be a meta-procedure. (Understanding the meaning of such injection failures, as written, calls for a procedural reading of the meta-language. Contrast with "new": "in" lacks the "+ {error}" hook in its range, and its uses lack the explicit branch to implementation-dependent behaviour.)

It seems the formal semantics go to great lengths to provide a purely

functional meta-language; it seems a cop-out to attempt to gloss over

injection failures using a procedural mechanism.

From what I've seen of other treatments of formal semantics, the trick

seems to be to add a type of primitive object-level functions with

unspecified behaviour and constituents ("delta rules", I believe).

That is, one could have semantic types such as:

O_2 = E \times E \to E "primitive binary operations"

O_1 = E \to E "primitive unary operations"

which could be injected to be members of F using an appropriate

meta-function. Something like:

inject = \lambda o . \sequence{L_o}{\lambda e* \omega \kappa . \kappa (o e*)} in F

For example, L_+ is what the symbol '+ would be bound to in the base

environment, and the base store would then map L_+ to <inject +, false>.

(There isn't a need to dance around with "new" to get locations for the

builtins, of course.)

("false" here means that the initial bindings would be immutable, which I

think is correct for R7.)
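Sketching the proposal concretely (Python; every name here — inject, prim_add, L_add — is invented for illustration, and the store entry's second component models the mutability flag):

```python
# A primitive o : E* -> E is wrapped as a procedure value
# <location, \lambda e* \omega \kappa . \kappa (o e*)>, and the base store
# maps the location to <procedure, false> (false = immutable).

def inject(loc, o):
    # Procedure body in CPS: ignore the dynamic point w, apply o, send
    # the (singleton) result on to the expression-continuation k.
    return (loc, lambda e_star, w, k: k([o(e_star)]))

def prim_add(e_star):
    # A primitive binary operation; an implementation may restrict it
    # (fixnum overflow, memory exhaustion) without the surrounding
    # semantic machinery having to change.
    a, b = e_star
    return a + b

L_add = "loc:+"
base_env = {"+": L_add}                               # \rho('+) = L_+
base_store = {L_add: (inject(L_add, prim_add), False)}  # immutable binding

# Applying the injected procedure: look up '+', fetch it from the store,
# destructure the <location, body> pair, and call the body in CPS.
proc, mutable = base_store[base_env["+"]]
_, body = proc
result = body([3, 4], None, lambda vals: vals)
```

As the mail notes, no dance with "new" is needed: the locations for builtins can simply be fixed in advance.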

;;;;

;; Rebinding lambda (Keywords?!)

;;;;

The formal semantics do not capture the meaning of:

(let ((lambda list) (a 1) (b 2)) (lambda (list a b) (list a b)))

The proper meaning is '((1 2) (1 2)). But as written (after desugaring the let), the formal interpretation results in a procedure equivalent to the one defined by:

(define apply/2 (lambda (x y z) (x y z)))

That is because, according to the expression semantic function, the interpretations of the symbols "lambda", "if", and "set!" are immune to rebinding. Or rather, the interpretations of lambda-forms, if-forms, and set!-forms are immune --- the symbols themselves can be rebound. That results in a very mild case of lisp-2-ism:

(lambda (lambda) (lambda (x) lambda))

;; the K-combinator according to the current formal treatment.

One can try to wriggle out of this by arguing that the formal semantics are

at the level of entirely desugared scheme; that is, one could try to push

upon the macro-expander the responsibility to properly implement lexical

scope for lambda, if, and set!.

Since Scheme macro-expanders are nowadays hygienic by default, they

*can* be called upon to rewrite:

(define evil
  ((lambda (lambda a b) (lambda (list a b) (list a b))) list 1 2))

into something resembling:

(define evil
  ((lambda (lambda_ a_ b_) (lambda_ (list a_ b_) (list a_ b_))) list 1 2))

But that's strange. One of the most important things the formal semantics exist to explain is the meaning of lexical scope (and how that interacts with continuations, errors, ...).

As it stands, the definition of lambda does precisely that in a completely

general and correct fashion for all but three identifiers. For that

matter, given the chance, the existing formal implementation of lambda

would work in those cases, too; except that the meaning meta-function jumps

the gun.

If instead those three cases were written along the following lines, then

the formal semantics would completely capture the Scheme spirit:

\meaning{(lambda E*)} = \lambda \rho .
  \syntax{lambda} \notin \domain{\rho}
    \then \base-lambda E* \rho
    \else \lambda \omega \kappa . \hold (\lookup \rho \syntax{lambda}) ...

where

\base-lambda = \lambda E* . \lambda \rho \omega \kappa ...

(I.e., check for rebinding, then go to either the identifier case, or the

initial meaning.)

I would not be surprised if there was some more elegant way to implement the semantic branch.
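For concreteness, a toy evaluator (Python; all representation choices invented — expressions are tuples, identifiers are strings) implementing the proposed branch produces the proper meaning for the evil example:

```python
# Treat `lambda` as a special form only when the identifier is NOT bound
# in rho; otherwise fall through to the ordinary application case.

def ev(expr, rho):
    if isinstance(expr, str):                         # identifier
        return rho[expr]
    if isinstance(expr, (int, float)):                # literal
        return expr
    head = expr[0]
    if head == "lambda" and "lambda" not in rho:      # base meaning only if unbound
        params, body = expr[1], expr[2]
        def closure(*args):
            return ev(body, {**rho, **dict(zip(params, args))})
        return closure
    # ordinary application (this covers a rebound `lambda`)
    fn = ev(head, rho)
    return fn(*[ev(a, rho) for a in expr[1:]])

base = {"list": lambda *xs: list(xs)}

# (let ((lambda list) (a 1) (b 2)) (lambda (list a b) (list a b)))
# desugars to the application below.
desugared = (("lambda", ("lambda", "a", "b"),
              ("lambda", ("list", "a", "b"), ("list", "a", "b"))),
             "list", 1, 2)
evil = ev(desugared, base)
```

The outer lambda is evaluated in an environment where "lambda" is unbound, so it gets the base meaning; the inner occurrence sees "lambda" bound to list and is interpreted as an application, giving ((1 2) (1 2)).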


;;;;

;; Lambda (clarity)

;;;;

The definition of lambda uses "in E" to project meta-functions into

object-functions. "in F" would make it clearer what is going on, in

particular, what the significance of the pair is. Note that almost every other use of

symbolic meta-cons and friends (\langle\rangle, \down, \dag) has to do with

argument passing rather than procedure construction.

call/cc also builds an object-level closure, and likewise its use of "in E"

would be clearer as "in F".

A meta-function "make-closure: L \to (E* \to P \to K \to C) \to F" would be

clearer and better still, of course. (At least, having the code respect

some sort of encapsulation of a closure type rather than manipulating its

guts directly is often regarded as superior from a software engineering

standpoint.)
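A sketch of what such an encapsulation could look like (Python; Closure, make_closure, and apply_closure are invented names, standing in for make-closure : L \to (E* \to P \to K \to C) \to F):

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Closure:
    location: Any            # L: the identity used for eqv?
    body: Callable           # E* -> P -> K -> C

def make_closure(loc, body):
    # The only way to build a procedure value; callers never assemble
    # the <location, body> pair by hand.
    return Closure(loc, body)

def apply_closure(f, args, w, k):
    # The only sanctioned way to get at the body; the pair's components
    # are never manipulated directly.
    return f.body(args, w, k)

# An identity procedure in CPS, applied to the singleton sequence <42>.
ident = make_closure("loc0", lambda e_star, w, k: k(e_star))
out = apply_closure(ident, [42], None, lambda vals: vals)
```

The point is purely software-engineering: the semantics stay the same, but readers no longer need to decode raw pair manipulation to see that a procedure is being constructed.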

The only other uses of meta-cons and friends are in the continuation-walking code, which is not confusing, because there they manipulate sequences proper (rather than faking up a record type).

-Will

_______________________________________________
Scheme-reports mailing list
Scheme-reports@x
http://lists.scheme-reports.org/cgi-bin/mailman/listinfo/scheme-reports
