Old-style classes interfere with returning strings #22

Open
Slater-Victoroff opened this issue Sep 13, 2015 · 3 comments

Comments

@Slater-Victoroff

I have no idea how to solve this one, but when I try to return strings from a function decorated with @threads, the caller gets an object it cannot use as a string, and the write errors out. Code below to reproduce:

import base64
import json

import requests
from tomorrow import threads

def download_nm(delimiter, source, dump):
    images = [line.split(delimiter)[10] for line in open(source)][1:]  # avoid header row
    with open(dump, 'a') as sink:
        for i, image in enumerate(images):
            results = return_image_json(image)  # type(results) == <type 'instance'>
            sink.write(results)  # Error, expected string or buffer

@threads(16)
def return_image_json(image_link):
    response = requests.get(image_link)
    encoded = "data:%s;base64,%s" % (response.headers['Content-Type'], base64.b64encode(response.content))
    return json.dumps({image_link: encoded}) + '\n'

It looks like moving to new-style classes resolves this issue, but since you're relying on some of the syntax-hacks of old-style classes I'm not sure if this is solvable.

@Slater-Victoroff
Author

Did more digging into this error. It seems that tomorrow breaks whenever any type-checking is done on the resulting object, so passing the result into almost any library function is likely to fail.

@madisonmay
Owner

Super valid complaint. It's actually unrelated to the kind of class being used -- it's simply that calls to isinstance and type(obj) don't actually trigger a __getattr__ call.
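To make the point above concrete, here is a minimal Python 3 sketch of a proxy that forwards attribute access through __getattr__. The LazyResult class and make_payload function are hypothetical stand-ins written for this illustration, not tomorrow's actual implementation. It shows that attribute access resolves the underlying value, while isinstance and type only ever see the wrapper:

```python
from concurrent.futures import ThreadPoolExecutor


class LazyResult:
    """Hypothetical proxy around a future (illustration only)."""

    def __init__(self, future):
        self._future = future

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so _future itself
        # is found directly and every other name is forwarded to the result.
        return getattr(self._future.result(), name)


def make_payload():
    return "payload\n"


with ThreadPoolExecutor(max_workers=1) as pool:
    proxy = LazyResult(pool.submit(make_payload))

# Attribute access goes through __getattr__ and reaches the real string.
upper_via_proxy = proxy.upper()

# Type checks inspect the wrapper directly and never call __getattr__.
seen_as_str = isinstance(proxy, str)
wrapper_type = type(proxy)

# The resolved value, obtained by asking the future explicitly.
resolved = proxy._future.result()
```

Any library function that checks the argument's type before touching its attributes will therefore reject the wrapper, which matches the write failure reported above.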

You can fix this by doing literally anything to the result object that forces it to resolve before calling a function that checks the result's type. I would recommend calling _wait() on the result object explicitly in these circumstances. Simplified example below:

import base64
import json

from tomorrow import threads
import requests

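def save_image_json(images, filename):
    with open(filename, 'a') as sink:
        for image in images:
            results = return_image_json(image)
            # _wait() blocks until the thread finishes; write the value it
            # returns (assumed to be the resolved string), not the wrapper.
            sink.write(results._wait())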

@threads(4)
def return_image_json(image_link):
    response = requests.get(image_link)
    encoded = "data:%s;base64,%s" % (response.headers['Content-Type'], base64.b64encode(response.content))
    return json.dumps({image_link: encoded}) + '\n'

if __name__ == "__main__":
    # koala images
    images = [
        'http://i.imgur.com/1p5nKyZ.jpg',
        'http://i.imgur.com/lDt3nJ8.jpg', 
        'http://i.imgur.com/hkUeCBT.jpg', 
        'http://i.imgur.com/ZglSElj.jpg'
    ]
    save_image_json(images, filename='koalas.json')

I'm going to look into forcing isinstance and type to trigger resolution of the future, but I'm not entirely certain this is possible. If that doesn't work I'll probably just end up updating the README to make sure people are aware of this edge case.
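As a rough sketch of how far this can go, and not a description of what tomorrow does, Python 3 lets a wrapper expose a __class__ property, which isinstance consults as a fallback. The ResolvingProxy class below is hypothetical. Note that type() still reports the wrapper, and C-level checks such as the one in file.write are not fooled, so this would only be a partial answer:

```python
from concurrent.futures import ThreadPoolExecutor


class ResolvingProxy:
    """Hypothetical wrapper that reports the resolved value's class."""

    def __init__(self, future):
        self._future = future

    def _resolved_class(self):
        # Blocks until the future completes, then reports the result's type.
        return type(self._future.result())

    # isinstance() falls back to obj.__class__ when type(obj) does not match.
    __class__ = property(_resolved_class)


def make_payload():
    return "payload\n"


with ThreadPoolExecutor(max_workers=1) as pool:
    proxy = ResolvingProxy(pool.submit(make_payload))

passes_isinstance = isinstance(proxy, str)     # resolved through __class__
still_wrapper = type(proxy) is ResolvingProxy  # type() ignores the property
```

Because type() and built-in functions still see the wrapper, resolving the value explicitly before handing it to other code remains the dependable route.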

@980202006

I still have the error after running the code: everything I get back is a Tomorrow class instance. Is there any way to deal with this?


3 participants