__str__ and __repr__ are ignored for custom iterable objects #17695

dmyersturnbull · 2017-09-27T20:00:53Z

I posted a comment on ipython/ipython#2379, but I realized this may be specific to Pandas. I'm using IPython 6.2.0, Python 3.5, Jupyter 4.3.0, and Pandas 0.20.1.

Pandas won't use __str__ or __repr__ (or anything else I can find) to display an iterable object that defines __len__.

Code Sample, a copy-pastable example if possible

import pandas as pd

class WithLen:
    z = [1,2,3]
    def __getitem__(self, i): return self.z[i]
    def __str__(self): return "from str"
    def __repr__(self): return "from repr"
    def __len__(self): return len(self.z)

df = pd.DataFrame([
    pd.Series({'test': WithLen()})
])
print(df)  # PROBLEMATIC CASE
print(df['test'].iloc[0])  # prints fine

class WithoutLen:
    z = [1,2,3]
    def __getitem__(self, i): return self.z[i]
    def __str__(self): return "from str"
    def __repr__(self): return "from repr"
    # no __len__ here; this is the only difference from WithLen

print(pd.DataFrame([
    pd.Series({'test': WithoutLen()})
]))  # prints fine

Problem description

I've implemented __str__ and __repr__ on an iterable class with good reason, and it being iterable shouldn't override that behavior.
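A minimal sketch, in plain Python with no pandas involved, of what separates the two cases: both classes are iterable through the legacy `__getitem__` protocol, and only `len()` behaves differently. The names `Unsized`, `Sized`, and `has_len` are illustrative stand-ins, not part of the report above.

```python
class Unsized:
    """Iterable via __getitem__ only (stand-in for WithoutLen)."""
    z = [1, 2, 3]

    def __getitem__(self, i):
        # An IndexError past the end is what stops iteration.
        return self.z[i]

    def __repr__(self):
        return "from repr"


class Sized(Unsized):
    """Same object plus __len__ (stand-in for WithLen)."""

    def __len__(self):
        return len(self.z)


def has_len(obj):
    """Return True if len(obj) succeeds."""
    try:
        len(obj)
    except TypeError:
        return False
    return True


# Both objects can be iterated, so "iterable" alone does not distinguish them.
items_unsized = list(Unsized())
items_sized = list(Sized())

print(items_unsized, has_len(Unsized()))
print(items_sized, has_len(Sized()))
```

Any check of the form "can be iterated and has a length" would therefore treat only the class with `__len__` as a sequence.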

Expected Output

I expected and want to see this printed both with and without __len__:

       test
0  from str

Output of `pd.show_versions()`


INSTALLED VERSIONS
------------------
commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.11.6-201.fc25.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: en_US.UTF-8

pandas: 0.20.3
pytest: 3.0.7
pip: 9.0.1
setuptools: 36.5.0
Cython: 0.25.2
numpy: 1.13.1
scipy: 0.19.0
xarray: None
IPython: 6.2.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.5.3
html5lib: 0.999
sqlalchemy: 1.1.6
pymysql: 0.7.11.None
psycopg2: None
jinja2: 2.9.5
s3fs: None
pandas_gbq: None
pandas_datareader: None


TomAugspurger · 2017-09-27T20:14:07Z

You might try looking in pandas/io/formats/format.py, specifically

pandas/pandas/io/formats/format.py

Line 1765 in f9d88cd

def format_array(values, formatter, float_format=None, na_rep='NaN',

and I'm assuming

pandas/pandas/io/formats/format.py

Line 1803 in f9d88cd

class GenericArrayFormatter(object):

Presumably we cast to np.array somewhere in there, perhaps unnecessarily.

jreback · 2017-09-28T00:03:42Z

To be honest, this is not really well supported, nor intended. You are pretty much on your own.

Putting anything besides scalars inside a Series/DataFrame is non-idiomatic. There will be some support for containers (e.g. lists / arrays) in pandas2.

TomAugspurger · 2017-09-28T00:12:10Z

I think if we can support formatting without too much effort then we should. @dmyersturnbull if you can track down the issue and suggest a fix, it'd probably be accepted.

jorisvandenbossche · 2017-09-28T09:10:56Z

The reason is that our 'pretty print' implementation has a special method for sequences that can follow the max_seq_items option:

pandas/pandas/io/formats/printing.py

Lines 97 to 126 in f9d88cd

    
def _pprint_seq(seq, _nest_lvl=0, max_seq_items=None, **kwds):
    """
    internal. pprinter for iterables. you should probably use pprint_thing()
    rather then calling this directly.

    bounds length of printed sequence, depending on options
    """
    if isinstance(seq, set):
        fmt = u("{{{body}}}")
    else:
        fmt = u("[{body}]") if hasattr(seq, '__setitem__') else u("({body})")

    if max_seq_items is False:
        nitems = len(seq)
    else:
        nitems = max_seq_items or get_option("max_seq_items") or len(seq)

    s = iter(seq)
    r = []
    for i in range(min(nitems, len(seq))):  # handle sets, no slicing
        r.append(pprint_thing(
            next(s), _nest_lvl + 1, max_seq_items=max_seq_items, **kwds))
    body = ", ".join(r)

    if nitems < len(seq):
        body += ", ..."
    elif isinstance(seq, tuple) and len(seq) == 1:
        body += ','

    return fmt.format(body=body)

This more or less hardcodes the possible results: depending on whether __setitem__ exists, the output will look like a list or a tuple. I am not immediately sure of a way to both follow the repr of custom objects and still limit the number of items shown for normal reprs.
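To make the hardcoding concrete, here is a simplified, pandas-free sketch of the logic in `_pprint_seq`, together with one hypothetical way to respect a custom `__repr__`. The helpers `looks_like_sequence`, `has_custom_repr`, `pprint_seq_sketch`, and `pprint_sketch`, and the `respect_repr` flag, are made up for illustration and are not pandas API. The sequence check (iterable, has a length, not a string) is an assumption about how objects get routed to `_pprint_seq`.

```python
# Reprs that the generic sequence printer can safely replace.
BUILTIN_REPRS = (list.__repr__, tuple.__repr__, set.__repr__,
                 frozenset.__repr__, object.__repr__)


def looks_like_sequence(obj):
    """Assumed check: iterable, sized, and not a string."""
    try:
        iter(obj)
        len(obj)
    except TypeError:
        return False
    return not isinstance(obj, (str, bytes))


def has_custom_repr(obj):
    """Hypothetical test: does the type define its own __repr__?"""
    return type(obj).__repr__ not in BUILTIN_REPRS


def pprint_seq_sketch(seq, max_seq_items, respect_repr):
    """Brackets come from the type; __str__/__repr__ are never consulted."""
    if isinstance(seq, set):
        fmt = "{{{body}}}"
    elif hasattr(seq, "__setitem__"):
        fmt = "[{body}]"
    else:
        fmt = "({body})"
    n = len(seq)
    shown = min(max_seq_items, n)
    it = iter(seq)
    parts = [pprint_sketch(next(it), max_seq_items, respect_repr)
             for _ in range(shown)]
    body = ", ".join(parts)
    if shown < n:
        body += ", ..."
    elif isinstance(seq, tuple) and n == 1:
        body += ","
    return fmt.format(body=body)


def pprint_sketch(thing, max_seq_items=100, respect_repr=False):
    """Route sequences to the sequence printer, everything else to str()."""
    use_seq = looks_like_sequence(thing)
    if respect_repr and has_custom_repr(thing):
        use_seq = False  # hypothetical: let the object's own str/repr win
    if use_seq:
        return pprint_seq_sketch(thing, max_seq_items, respect_repr)
    return str(thing)


class WithLen:
    z = [1, 2, 3]
    def __getitem__(self, i): return self.z[i]
    def __str__(self): return "from str"
    def __repr__(self): return "from repr"
    def __len__(self): return len(self.z)


print(pprint_sketch(WithLen()))                     # sequence path
print(pprint_sketch(WithLen(), respect_repr=True))  # custom repr path
print(pprint_sketch([1, 2, 3, 4, 5], max_seq_items=2, respect_repr=True))
```

In this sketch, plain lists and tuples are still truncated by `max_seq_items`, while objects that define their own `__repr__` fall back to `str()`. One cost of this approach is that any sized iterable with its own `__repr__` (for example `range`) would no longer be expanded item by item.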

jreback closed this as completed Sep 28, 2017

jreback added this to the won't fix milestone Sep 28, 2017

jreback added the Usage Question label Sep 28, 2017

TomAugspurger mentioned this issue Dec 19, 2017

Series not honoring class __repr__ or __str__ #18843

Open

TomAugspurger modified the milestones: won't fix, No action Jul 6, 2018

jorisvandenbossche mentioned this issue Aug 14, 2018

is_sequence is too aggressive for determining how to print #22333

Closed

jamesmyatt mentioned this issue Jun 20, 2019

pprint_thing for sequences #26970

Closed

miba2020 mentioned this issue Dec 6, 2024

ENH: How about let pprint_thing print Real instance according to display.precision #60503

Open
