Unexpected "malformed start tag" error with HTMLParser #715

Closed
Tungsteno74 opened this issue Apr 17, 2016 · 5 comments

@Tungsteno74

Tungsteno74 commented Apr 17, 2016

Used sdl2 toolchain.

traceback:

04-17 21:20:04.781 30484-30507/? I/python:  Traceback (most recent call last):
04-17 21:20:04.781 30484-30507/? I/python:    File "main.py", line 21, in <module>
04-17 21:20:04.781 30484-30507/? I/python:      mainApp().run()
04-17 21:20:04.781 30484-30507/? I/python:    File "/data/data/opentest.test.opentest/files/lib/python2.7/site-packages/kivy/app.py", line 802, in run
04-17 21:20:04.781 30484-30507/? I/python:      root = self.build()
04-17 21:20:04.781 30484-30507/? I/python:    File "main.py", line 16, in build
04-17 21:20:04.781 30484-30507/? I/python:      open_test(filepath, filepathout)
04-17 21:20:04.781 30484-30507/? I/python:    File "/data/data/opentest.test.opentest/files/opentest.py", line 34, in open_test
04-17 21:20:04.781 30484-30507/? I/python:      parser.feed(fli)
04-17 21:20:04.781 30484-30507/? I/python:    File "/home/valerio/.local/share/python-for-android/build/python-installs/opentest/lib/python2.7/HTMLParser.py", line 108, in feed
04-17 21:20:04.781 30484-30507/? I/python:    File "/home/valerio/.local/share/python-for-android/build/python-installs/opentest/lib/python2.7/HTMLParser.py", line 148, in goahead
04-17 21:20:04.781 30484-30507/? I/python:    File "/home/valerio/.local/share/python-for-android/build/python-installs/opentest/lib/python2.7/HTMLParser.py", line 229, in parse_starttag
04-17 21:20:04.791 30484-30507/? I/python:    File "/home/valerio/.local/share/python-for-android/build/python-installs/opentest/lib/python2.7/HTMLParser.py", line 304, in check_for_whole_start_tag
04-17 21:20:04.791 30484-30507/? I/python:    File "/home/valerio/.local/share/python-for-android/build/python-installs/opentest/lib/python2.7/HTMLParser.py", line 115, in error
04-17 21:20:04.791 30484-30507/? I/python:  HTMLParser.HTMLParseError: malformed start tag, at line 3, column 26

test program:

from kivy.app import App  # missing from the original listing; mainApp subclasses App
from kivy.uix.button import Button

from HTMLParser import HTMLParser  # Python 2 module name

filepath = "/data/data/opentest.test.opentest/files/page.html"  # tested on Android 4.4.2


class MyHTMLParser(HTMLParser):
    # Print every start and end tag; ignore text content.
    def handle_starttag(self, tag, attrs):
        print tag

    def handle_endtag(self, tag):
        print tag

    def handle_data(self, data):
        pass


class mainApp(App):
    def build(self):
        rdata = ""
        print filepath
        print "-----"
        with open(filepath, "rUb") as file:
            rdata = file.read()
        print "-----"
        parser = MyHTMLParser()
        parser.feed(rdata)  # raises HTMLParseError on the p4a build
        print "......"
        return Button(text='Hello World')


mainApp().run()

html test page:

<!DOCTYPE html>
<!-- This file is need as a test for Android distribution -->
<html class="test-class" "="" id="id-test"><head><title>test title</title></head><body>test body<div>test div</div>

<div>test div 2</div></body></html>

The problem appears when the parser encounters the unconventional "="" attribute. I added it to this test because some web pages (for reasons I do not know) include this syntax in their HTML.

I tested it with the desktop version of Python, and the error is not raised there.

So I checked the HTMLParser.py file bundled with python-for-android, compared it with the version in the standard Python distribution, and found some slight differences in the implementation.

The difference that produces this error in only one version of the code is in the check_for_whole_start_tag() method.

The troublesome piece of code in python-for-android is as follows:

            self.updatepos(i, j)
            self.error("malformed start tag")

whereas the desktop version has this code in its place:

            if j > i:
                return j
            else:
                return i + 1

So if I replace the HTMLParser.py file in python-for-android with the one from my Linux distribution, then build the test app with the "new" implementation and deploy it to my smartphone, it works great!
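For anyone wanting to check which behaviour a given interpreter has, a minimal sketch follows. It feeds a condensed copy of the test page to the standard parser and reports whether an exception was raised. The names `TagCollector`, `parser_tolerates`, and `TEST_PAGE` are hypothetical helpers introduced here, not part of this issue or of python-for-android.

```python
try:
    from HTMLParser import HTMLParser  # Python 2 module name
except ImportError:
    from html.parser import HTMLParser  # Python 3 module name

# A condensed copy of the test page from this issue, including the odd "="" attribute.
TEST_PAGE = (
    '<html class="test-class" "="" id="id-test">'
    '<head><title>test title</title></head>'
    '<body>test body<div>test div</div></body></html>'
)


class TagCollector(HTMLParser):
    """Hypothetical helper: records the name of every start tag it sees."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parser_tolerates(page):
    """Return (ok, tags); ok is False if the parser raised on the page."""
    parser = TagCollector()
    try:
        parser.feed(page)
        parser.close()
    except Exception:  # HTMLParseError on the older 2.7 releases
        return False, parser.tags
    return True, parser.tags


if __name__ == "__main__":
    ok, tags = parser_tolerates(TEST_PAGE)
    print("tolerant: %s, start tags seen: %s" % (ok, tags))
```

On an interpreter with the tolerant implementation, `ok` should be true. On the 2.7.2 build that produced the traceback above, it would be expected to be false.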

As mentioned before, there are some other minor differences between the two distributions whose purpose I don't know, so I ask the developers:

Is there any real reason to maintain these differences?

Thanks for your support, and sorry for my bad English.

@kived
Contributor

kived commented Apr 18, 2016

We do not make any changes to HTMLParser. You are comparing different Python versions (p4a still uses 2.7.2), which is why the code is different.

python/cpython@c10e39f#diff-a07dd7eb9cb779be7f57ea2282a94d96L356
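A quick way to confirm which Python version and which copy of the parser module an app is actually running is sketched below. `describe_runtime` is a hypothetical helper name. The import fallback is only there so the snippet also runs on Python 3, where the module is named `html.parser`.

```python
import sys

try:
    import HTMLParser as html_parser_module  # Python 2 module name
except ImportError:
    import html.parser as html_parser_module  # Python 3 module name


def describe_runtime():
    """Return (version, location) for the interpreter and its HTML parser module."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    # __file__ may be missing for frozen or otherwise unusual builds.
    location = getattr(html_parser_module, "__file__", "<unknown>")
    return version, location


if __name__ == "__main__":
    print("Python %s, parser module loaded from %s" % describe_runtime())
```

Running this both on the desktop and inside the packaged app would show whether the two environments use the same Python release.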

@kived kived closed this as completed Apr 18, 2016
@Tungsteno74
Author

Tungsteno74 commented Apr 19, 2016

@kived I'm sorry, I'm really mortified about that. I'm using buildozer to create the APK and I updated it about a week ago. I thought the packages inside it (including python-for-android) were updated automatically. Could you please tell me whether it is possible to update the Python version to the latest release without problems? Thanks!

@inclement
Member

inclement commented Apr 19, 2016

The old python-for-android toolchain doesn't currently support the latest Python (it supports 2.7.2 specifically and nothing else). Changing the version isn't simple, because the patching needed to make Python build and run on Android is version dependent.

The new toolchain has experimental support for 3.5 and will be able to use the same method for the latest version of 2.7. It also has a PR to use 2.7.11 when building Python itself, but I haven't tested that yet.

@kived
Contributor

kived commented Apr 19, 2016

As a workaround, you could include a copy of HTMLParser.py from 2.7.11 in your project itself, and import that instead. Or if you're using other libraries which use HTMLParser then you could monkey-patch it with the updated code.
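A rough sketch of both options, under stated assumptions: the bundled copy is assumed to be saved in the project as `vendored_HTMLParser.py` (a hypothetical file name), and the monkey-patch is done by registering that module in `sys.modules` under the standard name before any other library imports it. `load_html_parser_module` and `install_as_stdlib` are illustrative helper names, not part of python-for-android.

```python
import importlib
import sys


def load_html_parser_module(vendored_name="vendored_HTMLParser"):
    """Prefer the copy shipped with the project; fall back to the standard library.

    vendored_name is a hypothetical module name for the bundled HTMLParser.py.
    """
    try:
        return importlib.import_module(vendored_name)
    except ImportError:
        pass
    try:
        import HTMLParser as module  # Python 2 module name
    except ImportError:
        import html.parser as module  # Python 3 module name
    return module


def install_as_stdlib(module, stdlib_name="HTMLParser"):
    """Make later `import HTMLParser` statements resolve to the given module.

    Must run before any third-party library imports the standard module,
    because libraries that already imported it keep their own reference.
    """
    sys.modules[stdlib_name] = module


if __name__ == "__main__":
    parser_module = load_html_parser_module()
    install_as_stdlib(parser_module)
    print("HTMLParser now resolves to %r" % (parser_module,))
```

In the app itself, these two calls would go at the very top of main.py, ahead of the other imports.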

@Tungsteno74
Author

Thanks again, guys, for your tips!
I'm trying to include it in my buildozer build with PR #693.
