Hi all,
I'm trying to use xmerl_scan to handle XMPP stanzas, and I think I've found a problem with it. I don't think it's intended for streaming XML in XMPP-like interleaved-request-and-response-document situations.

The attached program feeds the string "<opentag>" to xmerl_scan. I would have expected to receive the open-tag event before blocking for more data, but instead it requires at least one more character of input data before it will emit the open-tag event! If I instead pass "<opentag> ", with a space after the close-bracket, it emits the open-tag event as expected, plus the start of an xmlText, before blocking for more data.

My question, then, is:

Is this a bug? Should xmerl_scan supply the open-tag event before blocking, when it is fed "<opentag>"?

(The specific context of this problem is dealing with the <stream:stream> sent by the server at the handshake stage of XEP-114.)

Regards,
  Tony

P.S.: to run the attached program,
  $ erlc lookaheadbug.erl && erl -run lookaheadbug go

--
 [][][] Tony Garnock-Jones     | Mob: +44 (0)7905 974 211
   [][] LShift Ltd             | Tel: +44 (0)20 7729 7060
 []  [] http://www.lshift.net/ | Email: [hidden email]

-module(lookaheadbug).
-include_lib("xmerl/include/xmerl.hrl").

-export([go/0]).

go() ->
    xmerl_scan:string("<opentag>",
                      [{event_fun, fun handle_event/2},
                       {continuation_fun, fun get_more_data/3}]).

get_more_data(Continue, Exception, ScannerState) ->
    io:format("blocking for more data~n"),
    case get(tag_received) of
        true ->
            io:format("hooray!~n"),
            Exception(ScannerState);
        _ ->
            io:format("oh dear!~n"),
            throw(we_should_have_evented_a_tag_open_by_now)
    end.

handle_event(Event, ScannerState) ->
    io:format("parser got event:~n~p~n", [Event]),
    case {Event#xmerl_event.event, Event#xmerl_event.data} of
        {started, #xmlElement{}} ->
            put(tag_received, true);
        _ ->
            ok
    end,
    ScannerState.

________________________________________________________________
erlang-questions mailing list. See http://www.erlang.org/faq.html
erlang-questions (at) erlang.org
Hi
You may have seen it already :) But nevertheless, there is the exmpp [1] library from Process One. It supports such things:

1: http://support.process-one.net/doc/display/EXMPP/

HTH.
Gleb Peregud

On Wed, Sep 23, 2009 at 15:34, Tony Garnock-Jones <[hidden email]> wrote:
> I'm trying to use xmerl_scan to handle XMPP stanzas, and I think I've
> found a problem with it.
> [...]
> Is this a bug? Should xmerl_scan supply the open-tag event before
> blocking, when it is fed "<opentag>"?
Hi Gleb,
Yes, thanks, I've seen that already. We're already using it, in fact.

If exmpp didn't exist, though, the bug (?) in xmerl_scan would make it difficult (impossible?) to write a pure-Erlang XEP-114 implementation using xmerl.

Regards,
  Tony

Gleb Peregud wrote:
> Hi
>
> You could have seen it already :) But nevertheless there is exmpp [1]
> library from Process One. It supports such things:
>
> 1: http://support.process-one.net/doc/display/EXMPP/
>
> HTH.
> Gleb Peregud
In reply to this post by Gleb Peregud
Hello,
On 23 Sept. 2009 at 16:07, Gleb Peregud wrote:

> Hi
>
> You could have seen it already :) But nevertheless there is exmpp [1]
> library from Process One. It supports such things:
>
> 1: http://support.process-one.net/doc/display/EXMPP/

Yes, come on Tony, use a real tool dedicated to the task ;)
Our tests and all the reports we have show that it is very efficient.

Let us know if you have questions, reports or improvements,

--
Mickaël Rémond
 http://www.process-one.net/
In reply to this post by Tony Garnock-Jones-2
Tony,

The support for streaming in xmerl_scan is quite hackish, and is bound to be broken one way or another. I doubt that it is feasible to correctly handle streams given how xmerl_scan works. As it is, you could reasonably ask it to be a little bit more discerning - right now, xmerl_eventp only breaks at whitespace, which is very conservative.

The main problem is that xmerl_scan is undisciplined when it comes to ensuring that it has enough characters to pattern-match in the current function head. E.g.

    %% [75] ExternalID ::= 'SYSTEM' S SystemLiteral
    %%                   | 'PUBLIC' S PubidLiteral S SystemLiteral
    scan_doctype1([], S=#xmerl_scanner{continuation_fun = F}) ->
        F(fun(MoreBytes, S1) -> scan_doctype1(MoreBytes, S1) end,
          fun(S1) -> ?fatal(unexpected_end, S1) end,
          S);
    scan_doctype1("PUBLIC" ++ T, S0) ->
        ...

If given a stream fragment, like "PUBL", the matching above will fail, and xmerl_scan will derail. THIS is a serious bug - and I'm originally at fault. ;-)

Have you tried using xmerl_sax_parser instead? The plan is to replace xmerl_scan completely for stream parsing. And given this, xmerl_eventp is unlikely to see any major improvements.

BR,
Ulf W

Tony Garnock-Jones wrote:
> I'm trying to use xmerl_scan to handle XMPP stanzas, and I think I've
> found a problem with it.
> [...]
> Is this a bug? Should xmerl_scan supply the open-tag event before
> blocking, when it is fed "<opentag>"?

--
Ulf Wiger
CTO, Erlang Training & Consulting Ltd
http://www.erlang-consulting.com
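To make the "PUBL" failure mode concrete, here is a minimal sketch of the kind of check a streaming scanner would need before committing to a function head. It is not taken from xmerl; the module name prefix_match and the function match_keyword/2 are hypothetical, and the only library call used is lists:prefix/2. The idea is to distinguish three cases: the input starts with the keyword, the input is only a proper prefix of the keyword (so more bytes are needed), or the input cannot match at all.

```erlang
-module(prefix_match).
-export([match_keyword/2]).

%% Hypothetical helper, not part of xmerl.
%% Compares buffered input against a fixed keyword and returns:
%%   {match, Rest} - Input starts with Keyword; Rest is what follows it
%%   need_more     - Input is a proper prefix of Keyword (e.g. "PUBL"),
%%                   so no decision is possible until more bytes arrive
%%   nomatch       - Input can never match Keyword
match_keyword(Input, Keyword) when is_list(Input), is_list(Keyword) ->
    case lists:prefix(Keyword, Input) of
        true ->
            {match, lists:nthtail(length(Keyword), Input)};
        false ->
            case lists:prefix(Input, Keyword) of
                true -> need_more;
                false -> nomatch
            end
    end.
```

A scanner clause in the style quoted above would call its continuation function on need_more instead of falling through to a failed match. An empty input is treated as need_more as well, which corresponds to the [] clause of scan_doctype1.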
Ulf Wiger wrote:
> Have you tried using xmerl_sax_parser instead?
> The plan is to replace xmerl_scan completely for stream
> parsing. And given this, xmerl_eventp is unlikely to see
> any major improvements.

Aha! Thank you, Ulf, I will try that.

Regards,
  Tony
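For anyone following up on the xmerl_sax_parser suggestion, a rough sketch of the same experiment might look like the following. The module name sax_stream_sketch and the chunk contents are made up for illustration; the calls assume the documented xmerl_sax_parser:stream/2 options (event_fun, event_state, continuation_fun, continuation_state). Nothing in this thread confirms whether the startElement event is delivered before the first request for more data, so the printed order is something to check on your own OTP release.

```erlang
-module(sax_stream_sketch).
-export([go/0]).

%% Feeds "<opentag>" first, then supplies the rest of the document
%% chunk by chunk through the continuation function.
go() ->
    Chunks = [<<"hello">>, <<"</opentag>">>],
    xmerl_sax_parser:stream(<<"<opentag>">>,
                            [{event_fun, fun handle_event/3},
                             {event_state, []},
                             {continuation_fun, fun next_chunk/1},
                             {continuation_state, Chunks}]).

%% Continuation: hand over the next pending chunk; an empty binary
%% signals that no more input is available.
next_chunk([Chunk | Rest]) ->
    io:format("parser asked for more data~n"),
    {Chunk, Rest};
next_chunk([]) ->
    {<<>>, []}.

%% Event callback: log every SAX event and accumulate it in the state
%% (most recent event first).
handle_event(Event, _Location, Seen) ->
    io:format("parser got event: ~p~n", [Event]),
    [Event | Seen].
```

In a real XEP-114 connection the continuation function would block on the socket instead of walking a list, and the event state would carry whatever the stanza handler needs.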
