Aladdin - Scala Bugtracking
[#965]
project: compiler
priority: low
category: missing feature
submitter: Nikolay
assigned to: Burak
status: fixed
date submitted: 2007-02-23 15:32:46.0
subject: [contrib #351] Ignore whitespace nodes in literal XML
code:
what happened:
what expected: There should be a compiler option to ignore all whitespace-only text nodes in literal XML. I use whitespace for source code readability, but it screws up further processing, e.g., my pattern matching. Right now I have to dump my literal XML nodes to a String and reparse them to get rid of the whitespace.
Changes of this bug report
Nikolay edited on 2007-02-23 15:33:16.0
contribution #351
Lex edited on 2007-02-23 17:18:50.0

There is no one right answer about whitespace. Sometimes you want whitespace-only nodes. Additionally, compiler options are awkward to work with. They essentially define multiple languages, and if you work with a large code base you have to keep track of which files are in which dialect of Scala.

Here's another approach which works with the above forces: have library routines that clean up XML in various ways. See the methods of scala.runtime.RichString for examples. There could just as well be a removeSpaceOnlyNodes. With this approach, the XML literals should give you a node that drops as little information as possible, and then you call the correct cleanup routine for your application if the default is not good enough.
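A minimal sketch of what such a library routine could look like, assuming the scala-xml module is on the classpath. The name removeSpaceOnlyNodes comes from the comment above; the object XmlCleanup and the implementation are hypothetical, not an existing API.

```scala
import scala.xml.{Elem, Node, Text}

object XmlCleanup {
  // Hypothetical helper: recursively drops text nodes that contain
  // only whitespace and leaves every other node untouched.
  def removeSpaceOnlyNodes(node: Node): Node = node match {
    case e: Elem =>
      val kept = e.child.filterNot {
        case Text(s) => s.trim.isEmpty // whitespace-only text node
        case _       => false
      }
      e.copy(child = kept.map(removeSpaceOnlyNodes))
    case other => other
  }
}
```

The literal itself stays untouched and keeps all its information; an application that does not care about layout whitespace calls XmlCleanup.removeSpaceOnlyNodes(xml) before pattern matching.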

Burak edited on 2007-02-27 09:12:03.0
There is an "xml:space" kind of option we could support (in a cleverer way than how it is specified in XML) -- however this would also be implemented as a transformation of the XML at runtime.
Whitespace in XML hurts Scala programmers more than others, because we are used to structural equality "just working".
So here's a spec: if the node contains an attribute scala-xml:space="trim", then all nodes that are not marked scala-xml:space="default" will have their whitespace trimmed (all ws nodes disappear). We'd also have "collapse" which furthermore reduces all occurrences of ws(ws*) to a single space. This attribute only plays the role of a compiler directive and disappears after the whitespace handling.
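One possible reading of this spec, sketched on a toy node model rather than on scala.xml so that it stays self-contained. Everything here is an assumption made for illustration: the types Elem and Text, the space field standing in for the scala-xml:space attribute, and the choice that "trim" also strips leading and trailing whitespace of the remaining text.

```scala
object SpaceDirective {
  sealed trait Node
  final case class Text(data: String) extends Node
  // `space` models the proposed scala-xml:space attribute:
  // Some("trim"), Some("collapse"), Some("default"), or None (inherit).
  final case class Elem(label: String, space: Option[String], children: List[Node]) extends Node

  sealed trait Mode
  case object Default  extends Mode
  case object Trim     extends Mode
  case object Collapse extends Mode

  private def modeOf(space: Option[String], inherited: Mode): Mode = space match {
    case Some("trim")     => Trim
    case Some("collapse") => Collapse
    case Some("default")  => Default
    case _                => inherited
  }

  // Returns a list because a whitespace-only text node may disappear entirely.
  def process(node: Node, inherited: Mode = Default): List[Node] = node match {
    case Text(s) =>
      inherited match {
        case Default => List(Text(s))
        case Trim =>
          if (s.trim.isEmpty) Nil else List(Text(s.trim))
        case Collapse =>
          // Like Trim, and additionally ws(ws*) becomes a single space.
          val collapsed = s.trim.replaceAll("\\s+", " ")
          if (collapsed.isEmpty) Nil else List(Text(collapsed))
      }
    case Elem(label, space, children) =>
      val mode = modeOf(space, inherited)
      // The directive is dropped from the result, as the spec describes.
      List(Elem(label, None, children.flatMap(process(_, mode))))
  }
}
```

In this sketch an element marked "default" keeps its whitespace even inside a "trim" subtree, and unmarked elements inherit the mode of their nearest marked ancestor.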
Burak edited on 2007-02-27 10:41:57.0
OK, there is now a Utility.trim function that goes and kills whitespace, including within the text. I did not add any syntactic support (such as an XML attribute); that seems too nonstandard.
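A short usage sketch of scala.xml.Utility.trim, assuming the scala-xml module is on the classpath. The object TrimDemo and the sample data are made up for illustration.

```scala
import scala.xml.Utility

object TrimDemo {
  // Laid out for readability; the line breaks and indentation
  // between the tags are part of the literal's text content.
  val pretty = <person>
      <name>  Ada   Lovelace  </name>
    </person>

  // The same data written without any layout whitespace.
  val compact = <person><name>Ada Lovelace</name></person>

  // Utility.trim drops whitespace-only text nodes and also
  // normalizes whitespace inside the remaining text.
  val trimmed = Utility.trim(pretty)

  // Structural equality holds once the layout whitespace is gone.
  def sameAfterTrim: Boolean = trimmed == compact
}
```

Because trim also rewrites whitespace inside text, it is not suitable where the exact text content matters, for example preformatted text.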