pull: fails on HDFS after removing .dvc/cache #10583

Open
zsaladin opened this issue Oct 8, 2024 · 2 comments
Labels
bug Did we break something? fs: hdfs Related to the HDFS filesystem help wanted upstream Issues which need to be resolved in an upstream dependency

Comments


zsaladin commented Oct 8, 2024

Bug Report

Description

dvc pull fails on HDFS after .dvc/cache is removed. This means that whenever someone freshly clones the repository, dvc pull always fails.
However, dvc pull -q succeeds, so the progress/log output seems to be what triggers the problem.

Here are some observations that will hopefully help with debugging:

  1. The variable total is not a number, which causes the error.
  2. The variable **d contains total, which comes from size.
  3. In this case, however, size is not a number. It is a bound method (see here).

Reproduce

  1. dvc init
  2. Copy dataset.zip to the directory
  3. dvc remote add -d storage hdfs://user/dvc/mystorage
  4. dvc add dataset.zip
  5. dvc push
  6. rm -rf dataset.zip .dvc/cache
  7. dvc pull

Expected

dvc pull and dvc fetch should complete successfully on HDFS.

Environment information

Output of dvc doctor:

$ dvc doctor

DVC version: 3.55.2 (pip)
-------------------------
Platform: Python 3.10.12 on Linux-6.10.4-linuxkit-x86_64-with-glibc2.28
Subprojects:
	dvc_data = 3.16.5
	dvc_objects = 5.1.0
	dvc_render = 1.0.2
	dvc_task = 0.4.0
	scmrepo = 3.3.7
Supports:
	azure (adlfs = 2024.7.0, knack = 0.12.0, azure-identity = 1.17.1),
	gdrive (pydrive2 = 1.20.0),
	gs (gcsfs = 2024.9.0.post1),
	hdfs (fsspec = 2024.9.0, pyarrow = 17.0.0),
	http (aiohttp = 3.10.5, aiohttp-retry = 2.8.3),
	https (aiohttp = 3.10.5, aiohttp-retry = 2.8.3),
	oss (ossfs = 2023.12.0),
	s3 (s3fs = 2024.9.0, boto3 = 1.35.16),
	ssh (sshfs = 2024.6.0),
	webdav (webdav4 = 0.10.0),
	webdavs (webdav4 = 0.10.0),
	webhdfs (fsspec = 2024.9.0)
Config:
	Global: /home/user/.config/dvc
	System: /etc/xdg/dvc
Cache types: symlink
Cache directory: fuse.osxfs on osxfs
Caches: local
Remotes: hdfs
Workspace directory: fuse.osxfs on osxfs
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/19c955812b0a09cd409a3779f4e4d774

Additional Information (if any):

The error log is attached below.

$ dvc pull -v

2024-10-08 15:51:47,388 DEBUG: v3.55.2 (pip), CPython 3.10.12 on Linux-6.10.4-linuxkit-x86_64-with-glibc2.28
2024-10-08 15:51:47,390 DEBUG: command: /home/user/.local/bin/dvc pull -v
Collecting                                                                                                                                                                                                                         |0.00 [00:00,    ?entry/s]
Fetching2024-10-08 15:51:49,343 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2024-10-08 15:51:50,297 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2024-10-08 15:51:50,625 DEBUG: Preparing to transfer data from 'hdfs://user/dvc/mystorage/files/md5' to '/home/user/repo/.dvc/cache/files/md5'
2024-10-08 15:51:50,625 DEBUG: Preparing to collect status from '/home/user/repo/.dvc/cache/files/md5'
2024-10-08 15:51:50,625 DEBUG: Collecting status from '/home/user/repo/.dvc/cache/files/md5'
2024-10-08 15:51:50,629 DEBUG: Preparing to collect status from '/user/dvc/mystorage/files/md5'
2024-10-08 15:51:50,630 DEBUG: Collecting status from '/user/dvc/mystorage/files/md5'
2024-10-08 15:51:50,691 DEBUG: Estimated remote size: 256 files
2024-10-08 15:51:50,692 DEBUG: Querying 2 oids via traverse
Fetching
  0%|          |Fetching from hdfs                                                                                                                                                                                                 0/1 [00:00<?,     ?file/s]
2024-10-08 15:51:51,217 DEBUG: Removing '/home/user/repo/.dvc/cache/files/md5/12/.bnFqV3d0PmZTKtbQoPM-8A.tmp'
2024-10-08 15:51:51,219 ERROR: failed to transfer '126a8a51b9d1bbd07fddc65819a542c3' - unsupported operand type(s) for +: 'method' and 'float'
Traceback (most recent call last):
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc_objects/fs/generic.py", line 349, in transfer
    _try_links(
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc_objects/fs/generic.py", line 281, in _try_links
    return copy(
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc_objects/fs/generic.py", line 97, in copy
    return _get(
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc_objects/fs/generic.py", line 227, in _get
    _get_one(from_paths[0], to_paths[0])
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc_objects/fs/generic.py", line 217, in _get_one
    return from_fs.get_file(
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc_objects/fs/base.py", line 645, in get_file
    self.fs.get_file(from_info, to_info, callback=callback, **kwargs)
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/fsspec/implementations/arrow.py", line 210, in get_file
    super().get_file(rpath, lpath, **kwargs)
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/fsspec/spec.py", line 904, in get_file
    callback.set_size(getattr(f1, "size", None))
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/fsspec/callbacks.py", line 97, in set_size
    self.call()
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/fsspec/callbacks.py", line 311, in call
    self.tqdm = self._tqdm_cls(total=self.size, **self._tqdm_kwargs)
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc_data/callbacks.py", line 92, in __init__
    super().__init__(
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/tqdm/std.py", line 1098, in __init__
    self.refresh(lock_args=self.lock_args)
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/tqdm/std.py", line 1347, in refresh
    self.display()
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/tqdm/std.py", line 1495, in display
    self.sp(self.__str__() if msg is None else msg)
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/tqdm/std.py", line 1151, in __str__
    return self.format_meter(**self.format_dict)
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc_data/callbacks.py", line 129, in format_dict
    meter = self.format_meter(  # type: ignore[call-arg]
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/tqdm/std.py", line 534, in format_meter
    if total and n >= (total + 0.5):  # allow float imprecision (#849)
TypeError: unsupported operand type(s) for +: 'method' and 'float'

Fetching                                                                                                                                                                                                                                                    Exception ignored in: <function tqdm.__del__ at 0x7ffffdaf53f0>                                                                                                                                                                     0/1 [00:00<?,     ?file/s]
Traceback (most recent call last):
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/tqdm/std.py", line 1148, in __del__
    self.close()
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc_data/callbacks.py", line 115, in close
    self.postfix["info"] = ""
TypeError: 'NoneType' object does not support item assignment
2024-10-08 15:51:51,224 DEBUG: failed to protect '/home/user/repo/.dvc/cache/files/md5/12/6a8a51b9d1bbd07fddc65819a542c3' - [Errno 2] No such file or directory: '/home/user/repo/.dvc/cache/files/md5/12/6a8a51b9d1bbd07fddc65819a542c3'
Traceback (most recent call last):
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc_data/hashfile/db/local.py", line 117, in protect
    os.chmod(path, self.CACHE_MODE)
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/repo/.dvc/cache/files/md5/12/6a8a51b9d1bbd07fddc65819a542c3'

Fetching
2024-10-08 15:51:51,227 ERROR: failed to pull data from the cloud - 1 files failed to download
Traceback (most recent call last):
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc/commands/data_sync.py", line 35, in run
    stats = self.repo.pull(
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc/repo/__init__.py", line 58, in wrapper
    return f(repo, *args, **kwargs)
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc/repo/pull.py", line 30, in pull
    processed_files_count = self.fetch(
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc/repo/__init__.py", line 58, in wrapper
    return f(repo, *args, **kwargs)
  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/dvc/repo/fetch.py", line 200, in fetch
    raise DownloadError(failed_count)
dvc.exceptions.DownloadError: 1 files failed to download

2024-10-08 15:51:51,234 DEBUG: Analytics is disabled.
Member

skshetry commented Oct 9, 2024

  File "/home/user/.local/share/uv/tools/dvc/lib/python3.10/site-packages/fsspec/spec.py", line 904, in get_file
    callback.set_size(getattr(f1, "size", None))

From the traceback above, this looks like a bug in fsspec. Could you please open an issue at https://github.com/fsspec/filesystem_spec?

I agree size should be a property, not a method.
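As a rough sketch of what "size as a property" could look like, the wrapper below normalizes either form to a plain value. SizedFileProxy, MethodSized, and AttrSized are hypothetical names for illustration only; none of this is the actual fsspec, pyarrow, or DVC code, and the real fix may look quite different.

```python
class SizedFileProxy:
    """Hypothetical wrapper; not part of fsspec, pyarrow, or DVC."""

    def __init__(self, raw):
        self._raw = raw

    @property
    def size(self):
        value = getattr(self._raw, "size", None)
        # Call it if it is a method, otherwise pass the value through.
        return value() if callable(value) else value

    def __getattr__(self, name):
        # Only reached for attributes not found on the proxy itself.
        return getattr(self._raw, name)


class MethodSized:
    """Hypothetical file object that reports its size through a method."""

    def size(self):
        return 2048


class AttrSized:
    """Hypothetical file object that reports its size as an attribute."""

    size = 512


# The same getattr(..., "size", None) lookup now yields a number or None.
method_total = getattr(SizedFileProxy(MethodSized()), "size", None)
attr_total = getattr(SizedFileProxy(AttrSized()), "size", None)
unknown_total = getattr(SizedFileProxy(object()), "size", None)
```

With a wrapper like this, a caller that does getattr(f, "size", None) gets a value that can safely be used in arithmetic, or None when the size is unknown.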

@skshetry skshetry added bug Did we break something? upstream Issues which need to be resolved in an upstream dependency fs: hdfs Related to the HDFS filesystem labels Oct 9, 2024
@zsaladin
Author

@skshetry

I opened the issue: fsspec/filesystem_spec#1711
