[Request] New open() method #64

Closed
pshelly opened this issue Nov 18, 2018 · 3 comments

@pshelly

pshelly commented Nov 18, 2018

Hello, and thanks for making these bindings!
An open() method like the one in the gzip module of the Python stdlib would be extremely useful.

@indygreg
Owner

I like the feature suggestion. A prerequisite is full implementations of io.RawIOBase on our compressor/decompressor stream readers/writers. We have partial coverage of that today. I'd like to get full coverage in the next release.
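
To illustrate why the io.RawIOBase interface matters for an open() helper, here is a minimal sketch using only the standard library. `UpperCaseReader` and `open_text` are hypothetical names made up for this example, and the upper-casing transform is just a toy stand-in for decompression; it is not how the zstandard stream readers are implemented. The point is that a class providing `readable()` and `readinto()` on top of io.RawIOBase can be wrapped by io.BufferedReader and io.TextIOWrapper, which is the kind of layering a gzip-style open() relies on for text mode.

```python
import io


class UpperCaseReader(io.RawIOBase):
    """Toy transform stream (hypothetical): upper-cases bytes read from a source."""

    def __init__(self, source):
        super().__init__()
        self._source = source

    def readable(self):
        return True

    def readinto(self, b):
        # Fill the caller's buffer with transformed bytes; 0 signals EOF.
        chunk = self._source.read(len(b))
        b[:len(chunk)] = chunk.upper()
        return len(chunk)


def open_text(source, encoding="utf-8"):
    """Hypothetical helper: layer buffering and text decoding over the raw stream."""
    raw = UpperCaseReader(source)
    return io.TextIOWrapper(io.BufferedReader(raw), encoding=encoding)


with open_text(io.BytesIO(b"first line\nsecond line\n")) as fh:
    lines = fh.readlines()
```

Because the raw class only has to supply `readinto()`, line iteration, buffering, and decoding all come from the standard wrappers.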

@ashwinvis

ashwinvis commented Jun 22, 2020

For *.tar.zst files, this works for me, at least for reading.

from contextlib import contextmanager
from tarfile import TarFile
import zstandard as zstd


@contextmanager
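def open_tar_zst(path_tar_zst):
    """Decompress and open a .tar.zst file for reading."""
    with open(path_tar_zst, 'rb') as fh:
        dctx = zstd.ZstdDecompressor()
        # The decompression stream is read front to back; it cannot seek backwards.
        with dctx.stream_reader(fh) as stream:
            # Close the TarFile when the caller's with-block exits.
            with TarFile(fileobj=stream) as tar:
                yield tar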


with open_tar_zst("my.tar.zst") as tar:
    tar.list()

@ashwinvis

FYI, the following methods of TarFile work:

  • next
  • getmember
  • getmembers
  • extractall

What does not work:

  • extract
In [75]: with open_tar_zst(tarball) as tar: 
    ...:     tar.extract('SIZE') 
    ...:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-75-8b5ebe57f6d8> in <module>
      1 with open_tar_zst(tarball) as tar:
----> 2     tar.extract('SIZE')
      3 

/usr/lib/python3.8/tarfile.py in extract(self, member, path, set_attrs, numeric_owner)
   2063 
   2064         try:
-> 2065             self._extract_member(tarinfo, os.path.join(path, tarinfo.name),
   2066                                  set_attrs=set_attrs,
   2067                                  numeric_owner=numeric_owner)

/usr/lib/python3.8/tarfile.py in _extract_member(self, tarinfo, targetpath, set_attrs, numeric_owner)
   2135 
   2136         if tarinfo.isreg():
-> 2137             self.makefile(tarinfo, targetpath)
   2138         elif tarinfo.isdir():
   2139             self.makedir(tarinfo, targetpath)

/usr/lib/python3.8/tarfile.py in makefile(self, tarinfo, targetpath)
   2174         """
   2175         source = self.fileobj
-> 2176         source.seek(tarinfo.offset_data)
   2177         bufsize = self.copybufsize
   2178         with bltn_open(targetpath, "wb") as target:

ValueError: cannot seek zstd decompression stream backwards
  • gettarinfo
In [98]: with open_tar_zst(tarball) as tar: 
    ...:     print(tar.gettarinfo()) 
    ...:
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-98-43c3d95a8702> in <module>
      1 with open_tar_zst(tarball, 'rb') as tar:
----> 2     print(tar.gettarinfo())
      3 

/usr/lib/python3.8/tarfile.py in gettarinfo(self, name, arcname, fileobj)
   1804            string.
   1805         """
-> 1806         self._check("awx")
   1807 
   1808         # When fileobj is given, replace name by

/usr/lib/python3.8/tarfile.py in _check(self, mode)
   2382             raise OSError("%s is closed" % self.__class__.__name__)
   2383         if mode is not None and self.mode not in mode:
-> 2384             raise OSError("bad operation for mode %r" % self.mode)
   2385 
   2386     def _find_link_target(self, tarinfo):

OSError: bad operation for mode 'r'
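
A possible workaround for the extract failure, sketched with the standard library only: tarfile has a stream mode (`"r|"`) meant for sources that cannot seek, in which members are read strictly in archive order. In the sketch below, `ForwardOnlyReader`, `read_member`, and `make_tar` are hypothetical names for illustration, and `ForwardOnlyReader` is only a stand-in for a forward-only decompression stream so the example runs without zstandard installed. The assumption, not verified here, is that the object returned by `dctx.stream_reader(fh)` could be passed to `read_member` in its place.

```python
import io
import tarfile


class ForwardOnlyReader(io.RawIOBase):
    """Hypothetical stand-in for a decompression stream: readable, not seekable."""

    def __init__(self, data):
        super().__init__()
        self._buf = io.BytesIO(data)

    def readable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, b):
        return self._buf.readinto(b)


def read_member(stream, name):
    """Return the bytes of regular file `name`, reading the tar front to back."""
    # Mode "r|" tells tarfile the source is a non-seekable stream.
    with tarfile.open(fileobj=stream, mode="r|") as tar:
        for member in tar:
            if member.name == name and member.isfile():
                # Read the data now, before the iterator moves past it.
                return tar.extractfile(member).read()
    raise KeyError(name)


def make_tar(files):
    """Build a small in-memory tar archive for the demonstration."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for member_name, payload in files.items():
            info = tarfile.TarInfo(member_name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


archive = make_tar({"SIZE": b"42\n", "README": b"hello\n"})
content = read_member(ForwardOnlyReader(archive), "SIZE")
```

In stream mode each member has to be consumed as it is encountered, so random access by name is traded for a single forward pass. As for gettarinfo, the traceback shows it checking for modes "awx": it builds a TarInfo for a file that is about to be added to an archive, so it is not expected to work on an archive opened for reading; getmember covers the read case.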

indygreg added a commit that referenced this issue Dec 27, 2020
Now that our file object types have a closefd to control behavior,
implementing open() is pretty easy.

This satisfied a longtime feature request.

Closes #64.