Skip to content
This repository has been archived by the owner on Feb 14, 2025. It is now read-only.

chore(deps): bump unstructured from 0.15.7 to 0.15.9 #193

Closed
wants to merge 1 commit into from

Conversation

dependabot[bot]
Copy link
Contributor

@dependabot dependabot bot commented on behalf of github Sep 2, 2024

Bumps unstructured from 0.15.7 to 0.15.9.

Release notes

Sourced from unstructured's releases.

0.15.9

Enhancements

Features

  • Add support for encoding parameter in partition_csv

0.15.8

Enhancements

  • Bump unstructured.paddleocr to 2.8.1.0.

Features

  • Add MixedbreadAI embedder Adds MixedbreadAI embeddings to support embedding via Mixedbread AI.

Fixes

  • Replace pillow-heif with pi-heif. Replaces pillow-heif with pi-heif due to more permissive licensing on the wheel for pi-heif.
  • Minify text_as_html from DOCX. Previously .metadata.text_as_html for DOCX tables was "bloated" with whitespace and noise elements introduced by tabulate that produced over-chunking and lower "semantic density" of elements. Reduce HTML to minimum character count without preserving all text.
  • Fall back to filename extension-based file-type detection for unidentified OLE files. Resolves a problem where a DOC file that could not be detected as such by filetype was incorrectly identified as a MSG file.
Changelog

Sourced from unstructured's changelog.

0.15.9

Enhancements

Features

  • Add support for encoding parameter in partition_csv

Fixes

  • Check storage contents for OLE file type detection Updates detect_filetype to check the content of OLE files to more reliable differentiate DOC, PPT, XLS, and MSG files. As part of this, the "msg" extra was removed because the python-oxmsg package is now a base dependency.
  • Fix disk space leaks and Windows errors when accessing file.name on a NamedTemporaryFile Uses of NamedTemporaryFile(..., delete=False) and/or uses of file.name of NamedTemporaryFiles have been replaced with TemporaryFileDirectory to avoid a known issue: https://docs.python.org/3/library/tempfile.html#tempfile.NamedTemporaryFile

0.15.8

Enhancements

  • Bump unstructured.paddleocr to 2.8.1.0.

Features

  • Add MixedbreadAI embedder Adds MixedbreadAI embeddings to support embedding via Mixedbread AI.

Fixes

  • Replace pillow-heif with pi-heif. Replaces pillow-heif with pi-heif due to more permissive licensing on the wheel for pi-heif.
  • Minify text_as_html from DOCX. Previously .metadata.text_as_html for DOCX tables was "bloated" with whitespace and noise elements introduced by tabulate that produced over-chunking and lower "semantic density" of elements. Reduce HTML to minimum character count without preserving all text.
  • Fall back to filename extension-based file-type detection for unidentified OLE files. Resolves a problem where a DOC file that could not be detected as such by filetype was incorrectly identified as a MSG file.
Commits

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [unstructured](https://github.com/Unstructured-IO/unstructured) from 0.15.7 to 0.15.9.
- [Release notes](https://github.com/Unstructured-IO/unstructured/releases)
- [Changelog](https://github.com/Unstructured-IO/unstructured/blob/main/CHANGELOG.md)
- [Commits](Unstructured-IO/unstructured@0.15.7...0.15.9)

---
updated-dependencies:
- dependency-name: unstructured
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
@dependabot dependabot bot added the dependencies Pull requests that update a dependency file label Sep 2, 2024
Copy link
Contributor Author

dependabot bot commented on behalf of github Sep 15, 2024

Looks like unstructured is up-to-date now, so this is no longer needed.

@dependabot dependabot bot closed this Sep 15, 2024
@dependabot dependabot bot deleted the dependabot/pip/unstructured-0.15.9 branch September 15, 2024 23:51
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
dependencies Pull requests that update a dependency file
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0 participants