
FW: Regex syntax [+-]

From: Michael Kay <mike@saxonica.com>
Date: Mon, 15 Aug 2005 17:52:12 +0100
To: <xmlschema-dev@w3.org>
Message-ID: <E1E4iCa-0003Zz-V5@lisa.w3.org>

A couple of weeks ago I posted the message below to this list, and received
no reply.

Does this mean:

(a) that it will eventually be answered, and in the meantime I can enjoy
listening to piped Vivaldi, or

(b) that it's fallen down a black hole?

Michael Kay



-----Original Message-----
From: xmlschema-dev-request@w3.org [mailto:xmlschema-dev-request@w3.org] On
Behalf Of Michael Kay
Sent: 04 August 2005 22:51
To: xmlschema-dev@w3.org
Subject: Regex syntax [+-]


I'm busy trying to implement the anti-erratum that says [+-] in a regex is
now legal, and I'm therefore trying to understand exactly what the rules now
are.

In particular, what characters are allowed to appear as s and e in a range
[s-e]?

The production rules say

[18]   seRange     ::=   charOrEsc '-' charOrEsc
[20]   charOrEsc   ::=   XmlChar | SingleCharEsc
[21]   XmlChar     ::=   [^\#x2D#x5B#x5D]

which imply that [, ], \, and - are disallowed in both positions.
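
As a minimal sketch of that reading of production [21] (the names
EXCLUDED and is_xml_char are purely illustrative, and the check that the
character is legal in XML at all is left out):

```python
# Characters excluded by production [21]: backslash, #x2D '-', #x5B '[', #x5D ']'.
EXCLUDED = {"\\", "-", "[", "]"}


def is_xml_char(c: str) -> bool:
    """True if the single unescaped character c matches XmlChar (production 21)."""
    return len(c) == 1 and c not in EXCLUDED


# Going by the grammar alone, none of the four can be a bare range endpoint.
for c in "\\-[]":
    assert not is_xml_char(c)
assert is_xml_char("+")
```

On this reading a bare "-" can never be s or e, which is what makes the
status of [+-] worth pinning down.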

But the text then elaborates this by saying that 

s-e is a valid character range iff:

    * s is a ·single character escape·, or an XML character;
    * s is not \
    * If s is the first character in a ·character class expression·, then s
is not ^
    * e is a ·single character escape·, or an XML character;
    * e is not \ or [; and
    * The code point of e is greater than or equal to the code point of s;

Question: in this English text, what does "XML character" mean? Does it mean
any character allowed in XML, or does it mean XmlChar as defined in
production 21? (If it means XmlChar, why are bullets 2 and 5 there?)
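
To make the two readings concrete, here is a rough sketch of the bulleted
rules as a check on a candidate range. Everything named here
(prose_range_ok, the strict flag, the token convention of passing either
one character or a two-character escape such as \-) is an assumption made
for illustration. The set of escapable characters comes from the
SingleCharEsc production, which is not quoted in this message, so treat it
as an assumption too. With strict=False "XML character" means any
character; with strict=True it means XmlChar as in production 21.

```python
GRAMMAR_EXCLUDED = {"\\", "-", "[", "]"}   # excluded by production [21]
ESCAPABLE = set("nrt\\|.?*+(){}-[]^")      # assumed SingleCharEsc targets
DECODE = {"n": "\n", "r": "\r", "t": "\t"}


def code_point(tok):
    """Code point denoted by a token, or None if tok is not a valid token."""
    if len(tok) == 2 and tok[0] == "\\" and tok[1] in ESCAPABLE:
        return ord(DECODE.get(tok[1], tok[1]))
    if len(tok) == 1:
        return ord(tok)
    return None


def prose_range_ok(s, e, s_is_first=False, strict=False):
    """Apply the six bullets to the range s-e under one of the two readings."""
    cs, ce = code_point(s), code_point(e)
    if cs is None or ce is None:            # bullets 1 and 4: must be a token
        return False
    if strict and (s in GRAMMAR_EXCLUDED or e in GRAMMAR_EXCLUDED):
        return False                        # bullets 1 and 4, XmlChar reading
    if s == "\\":                           # bullet 2
        return False
    if s_is_first and s == "^":             # bullet 3
        return False
    if e in ("\\", "["):                    # bullet 5
        return False
    return ce >= cs                         # bullet 6


# A bare '[' as s and a bare '-' as e are where the two readings disagree.
for s, e in [("[", "a"), ("+", "-"), ("+", "\\-")]:
    print(repr(s), repr(e),
          prose_range_ok(s, e), prose_range_ok(s, e, strict=True))
```

Under the strict reading bullets 2 and 5 never reject anything that
bullets 1 and 4 have not already rejected, which is the redundancy the
question above is pointing at.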

The grammar rules say that \ and [ are disallowed in both positions, but the
English rules say \ is disallowed for the start of the range while both \
and [ are disallowed for the end. Why the inconsistency? Why is "-" not
mentioned?

I'm left more confused than ever!

Michael Kay
http://www.saxonica.com/
Received on Monday, 15 August 2005 16:52:55 GMT
