Does mochiweb's parse/1 ignore (some) white-space? #166

Open
@rrrene

Description

First off, thanks for an amazing library. I love the parser and how it let me concentrate on the "sanitizing" part while building an HTML sanitizer in Elixir. One thing struck me as odd: white-space between a closing tag and the next opening tag seems to be dropped from the parser's return value.

The following is in Elixir syntax and I hope it is understandable. I am sorry that my Erlang is not good enough to translate this for a better bug report. 😞

When we pass this binary to mochiweb:

:mochiweb_html.parse("<html>just <b>an</b> <b>other</b> test</html>")

the result is this:

{"html", [], ["just ", {"b", [], ["an"]}, {"b", [], ["other"]}, " test"]}

The space between the closing of the first </b> and the opening of the second <b> is somehow lost. I would have expected the following, where the space is preserved as a "text node":

{"html", [], ["just ", {"b", [], ["an"]}, " ", {"b", [], ["other"]}, " test"]}
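To make the practical effect visible, here is a minimal sketch that flattens both trees back to plain text. It does not call mochiweb at all; it only walks the two tuples quoted above. The `TextNodes` module and its `text/1` function are hypothetical helpers written for this illustration, and they assume a tree made only of `{tag, attrs, children}` tuples and binaries (other node types, such as comments, are not handled).

```elixir
defmodule TextNodes do
  # Hypothetical helper, not part of mochiweb.
  # Concatenates the text of a node in document order.
  # Element nodes are {tag, attrs, children} tuples; text nodes are binaries.
  def text({_tag, _attrs, children}), do: Enum.map_join(children, "", &text/1)
  def text(binary) when is_binary(binary), do: binary
end

# The tree reported above as the parser's actual return value.
actual = {"html", [], ["just ", {"b", [], ["an"]}, {"b", [], ["other"]}, " test"]}

# The tree I would have expected, with the space kept as a text node.
expected = {"html", [], ["just ", {"b", [], ["an"]}, " ", {"b", [], ["other"]}, " test"]}

IO.puts(TextNodes.text(actual))   # prints "justanother test"
IO.puts(TextNodes.text(expected)) # prints "just an other test"
```

With the space missing, the two words run together in the flattened text, which is why this matters for a sanitizer that re-serializes the tree.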

But maybe the described behaviour is intended? Or is this a bug?

Thanks again and keep up the good work! 👍
