Describe the bug
Inference of the .txt file type fails if the file contains only whitespace.
To Reproduce
from tempfile import NamedTemporaryFile
from unstructured.partition.auto import partition
with NamedTemporaryFile(mode="w", suffix=".txt") as f:
    f.write(" \n")
    f.seek(0)
    elements = partition(filename=f.name)
Raises IndexError
Expected behavior
The file should be properly partitioned.
Environment Info
OS version: Linux-6.8.0-45-generic-x86_64-with-glibc2.35
Python version: 3.10.12
unstructured version: 0.15.13
unstructured-inference version: 0.7.36
pytesseract is not installed
Torch version: 2.4.1
Detectron2 is not installed
PaddleOCR is not installed
Libmagic version: file-5.41
magic file from /etc/magic:/usr/share/misc/magic
LibreOffice version: LibreOffice 7.3.7.2 30(Build:2)
This fixes [bug](#3674): auto partition fails on text files that are empty or contain only whitespace.
Inference of the .txt file type fails if the file contains only whitespace.
To Reproduce:
```
from tempfile import NamedTemporaryFile
from unstructured.partition.auto import partition
with NamedTemporaryFile(mode="w", suffix=".txt") as f:
    f.write(" \n")
    f.seek(0)
    elements = partition(filename=f.name)
```
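The traceback is not included here, so the exact failing line inside `unstructured` is unknown. As a purely illustrative sketch, assuming the failure comes from indexing into a token list that is empty for whitespace-only text, the pattern and one possible guard might look like the following. Both helper names are hypothetical and are not part of the `unstructured` API.

```python
from typing import Optional


def first_token_unsafe(text: str) -> str:
    # Hypothetical helper, not taken from unstructured.
    # str.split() returns [] for empty or whitespace-only text,
    # so indexing [0] raises IndexError in that case.
    return text.split()[0]


def first_token_safe(text: str) -> Optional[str]:
    # Guarded variant: treat empty or whitespace-only text as "no tokens"
    # and return None instead of raising.
    tokens = text.split()
    return tokens[0] if tokens else None
```

With a guard like this, a caller can treat whitespace-only input as an empty document rather than letting the exception propagate.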