xmerl newlines


Erik Reitsma (RY/ETM)-2
> I'm trying to output an XML file, and with xmerl 0.15
> I seem to recall newlines were put in for me.  Now I
> upgraded to 0.18 and the XML is one long string --
> fine for a word-wrap editor like emacs, otherwise a bit
> hard to use.

On the other hand, these newlines and tabs and spaces also appear in parsed XML. Therefore this is indeed more symmetrical.

It would be nice if the newlines and tabs and spaces would be removed from the parsed XML (if it is not already supported and I missed some option). Now I do a pass through the parsed XML to remove all xmlText records that contain nothing but whitespace.
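
A pass like the one described could look roughly as follows. This is only a sketch: the module name strip_ws and the helper names are made up for illustration, and it assumes that the xmlText values are flat strings, as xmerl_scan normally produces them.

```erlang
-module(strip_ws).   % hypothetical module name
-export([strip/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% Recursively drop child #xmlText{} records that contain nothing
%% but whitespace. Other node types are returned unchanged.
strip(#xmlElement{content = Content} = E) ->
    E#xmlElement{content = [strip(C) || C <- Content, not is_blank(C)]};
strip(Other) ->
    Other.

%% Assumption: the value is a flat string. Anything else (nested
%% lists, binaries) is treated as non-blank and therefore kept.
is_blank(#xmlText{value = V}) -> lists:all(fun is_ws/1, V);
is_blank(_) -> false.

is_ws(C) -> lists:member(C, " \t\r\n").
```

Note that the pos fields of the remaining siblings are not renumbered, so code that depends on positions may need an extra step.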

*Erik.




Ulf Wiger-4
On Tue, 17 Jun 2003, Erik Reitsma (ETM) wrote:

>> I'm trying to output an XML file, and with xmerl 0.15
>> I seem to recall newlines were put in for me.  Now I
>> upgraded to 0.18 and the XML is one long string --
>> fine for a word-wrap editor like emacs, otherwise a bit
>> hard to use.
>
>On the other hand, these newlines and tabs and spaces also
>appear in parsed XML. Therefore this is indeed more
>symmetrical.

There are many different requirements... (:

I understand the problem as being that you've built a
structure of tuples (or #xmlElement{}) in Erlang and want
them exported with some pretty-printing. Is that right?

A less than perfect pretty-print hack to xmerl_xml.erl
illustrates how it could be done. xmerl_xml.erl is quite a
small module, and easily modified into a local version of
export that fits your needs perfectly:

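%% In a local copy of xmerl_xml.erl: start every element on a new
%% line, indented four spaces per ancestor.
'#element#'(Tag, [], Attrs, Parents, _E) ->
    [pp(length(Parents)), empty_tag(Tag, Attrs)];
'#element#'(Tag, Data, Attrs, Parents, _E) ->
    [pp(length(Parents)), markup(Tag, Attrs, Data)].

%% A newline followed by N levels of indentation.
pp(N) ->
    ["\n", lists:duplicate(N*4, $\s)].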


This has the disadvantage of not putting the end tags where
you'd expect them to be. To fix that, you have to
copy xmerl_lib:markup/3 and modify it -- not that difficult
perhaps.
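
One way the end-tag placement might be handled, without copying all of xmerl_xml.erl, is a small callback module that inherits from xmerl_xml and builds the markup itself. This is a sketch, not tested against 0.18: the module name xmerl_xml_pp and the helpers indent/1 and has_child_elements/1 are invented here, and it assumes that xmerl_lib exports start_tag/2, end_tag/1 and empty_tag/2 and that the last argument to '#element#'/5 is the #xmlElement{} record.

```erlang
-module(xmerl_xml_pp).   % hypothetical module name
-export(['#xml-inheritance#'/0, '#element#'/5]).
-include_lib("xmerl/include/xmerl.hrl").

%% Fall back to xmerl_xml for '#root#'/4 and '#text#'/1.
'#xml-inheritance#'() -> [xmerl_xml].

'#element#'(Tag, [], Attrs, Parents, _E) ->
    [indent(length(Parents)), xmerl_lib:empty_tag(Tag, Attrs)];
'#element#'(Tag, Data, Attrs, Parents, E) ->
    Indent = indent(length(Parents)),
    %% Put the end tag on its own line only when there are child
    %% elements, so that <b>x</b> stays on a single line.
    Close = case has_child_elements(E) of
                true  -> [Indent, xmerl_lib:end_tag(Tag)];
                false -> xmerl_lib:end_tag(Tag)
            end,
    [Indent, xmerl_lib:start_tag(Tag, Attrs), Data, Close].

has_child_elements(#xmlElement{content = Content}) ->
    lists:any(fun(C) -> is_record(C, xmlElement) end, Content).

%% A newline followed by N levels of indentation.
indent(N) -> ["\n", lists:duplicate(N * 4, $\s)].
```

Calling xmerl:export_simple_content([{a, [{b, ["x"]}, {c, []}]}], xmerl_xml_pp) should then give an iolist in which every start tag begins a new line and the end tag of a sits on a line of its own. Mixed content (text next to child elements) gets extra whitespace this way, so the hack only suits data-style documents.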

To accompany the above hack, you may want to remove any
whitespace already in the structure:

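%% Drop text nodes that consist only of whitespace.
'#text#'(Text) ->
    case is_whitespace(Text) of
        true ->
            [];
        false ->
            export_text(Text)
    end.

%% True if the string is empty or holds only space, newline,
%% carriage return or tab characters.
is_whitespace(" " ++ T)  -> is_whitespace(T);
is_whitespace("\n" ++ T) -> is_whitespace(T);
is_whitespace("\r" ++ T) -> is_whitespace(T);
is_whitespace("\t" ++ T) -> is_whitespace(T);
is_whitespace([_|_])     -> false;
is_whitespace([])        -> true.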

Or scan with the {space, normalize} option (see below).


>It would be nice if the newlines and tabs and spaces would
>be removed from the parsed XML (if it is not already
>supported and I missed some option). Now I do a pass
>through the parsed XML to remove all xmlText records that
>contain nothing but whitespace.

I had implemented the {space,...} option wrongly in
xmerl-0.15 and was set straight by those who use and
understand XML. (: However, the option {space,normalize}
_should_ do almost what you want.  It will accumulate
consecutive whitespace and replace it with one space. Close
enough?
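
To illustrate, a sketch of scanning with that option and then dropping the text nodes that are left as a single space. The module name scan_normalized and both function names are invented for this example.

```erlang
-module(scan_normalized).   % hypothetical module name
-export([scan/1, drop_single_spaces/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% Parse a string with whitespace normalization switched on.
scan(Xml) ->
    {Doc, _Rest} = xmerl_scan:string(Xml, [{space, normalize}]),
    Doc.

%% After normalization, whitespace between elements is expected to
%% show up as text nodes holding exactly one space; remove those.
drop_single_spaces(#xmlElement{content = Content} = E) ->
    E#xmlElement{content = [drop_single_spaces(C)
                            || C <- Content, not is_single_space(C)]};
drop_single_spaces(Other) ->
    Other.

is_single_space(#xmlText{value = " "}) -> true;
is_single_space(_) -> false.
```

Whitespace inside real text is only collapsed, not removed, so "x   y" should come back as "x y".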

/Uffe
--
Ulf Wiger, Senior Specialist,
   / / /   Architecture & Design of Carrier-Class Software
  / / /    Strategic Product & System Management
 / / /     Ericsson AB, Connectivity and Control Nodes