feat(indexing)!: Removed duplication of batch_size. Pipeline owns the default ba… #336

Merged
merged 7 commits into from
Sep 26, 2024

Conversation

devsprint
Contributor

@devsprint devsprint commented Sep 25, 2024

Fixes #233

BREAKING CHANGE: When indexing, the batch size of a batch transformer is now configured on the transformer itself. If the transformer sets no batch size and has no default of its own, the pipeline's configurable default is used. The default batch size is 256.
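The fallback order described above can be sketched as follows. This is a minimal, self-contained illustration and not the swiftide implementation: the `BatchTransformer` trait, the `Embed` and `Pipeline` structs, and the `with_batch_size`, `with_default_batch_size`, and `resolve_batch_size` methods are hypothetical stand-ins defined here. Only the value 256 comes from this PR.

```rust
/// Pipeline-wide fallback; 256 is the default stated in this PR.
const DEFAULT_BATCH_SIZE: usize = 256;

/// Hypothetical stand-in for a batch transformer.
trait BatchTransformer {
    /// `None` means "no preference, use the pipeline default".
    fn batch_size(&self) -> Option<usize> {
        None
    }
}

/// Hypothetical transformer that can carry its own batch size.
struct Embed {
    batch_size: Option<usize>,
}

impl Embed {
    fn new() -> Self {
        Self { batch_size: None }
    }

    /// Set a batch size on the transformer itself.
    fn with_batch_size(mut self, size: usize) -> Self {
        self.batch_size = Some(size);
        self
    }
}

impl BatchTransformer for Embed {
    fn batch_size(&self) -> Option<usize> {
        self.batch_size
    }
}

/// Hypothetical pipeline that owns only the default batch size.
struct Pipeline {
    default_batch_size: usize,
}

impl Pipeline {
    fn new() -> Self {
        Self { default_batch_size: DEFAULT_BATCH_SIZE }
    }

    /// Override the pipeline-wide default.
    fn with_default_batch_size(mut self, size: usize) -> Self {
        self.default_batch_size = size;
        self
    }

    /// The transformer's own setting wins; otherwise fall back to the pipeline default.
    fn resolve_batch_size(&self, transformer: &impl BatchTransformer) -> usize {
        transformer.batch_size().unwrap_or(self.default_batch_size)
    }
}
```

Under these assumptions, the batch size is defined in one place per transformer, and the pipeline only supplies a fallback. That is why `then_in_batch` no longer needs a separate batch size argument.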

…tch size value and Embed/SparseEmbed are able to modify it. Fixes bosun-ai#233
Member

@timonv timonv left a comment


This is excellent, thank you (again) for the contribution! Just one small comment on docs.

swiftide-indexing/src/pipeline.rs (resolved)
@timonv
Member

timonv commented Sep 25, 2024

fyi #334 relates but it shouldn't interfere

@timonv timonv changed the title feat: Removed duplication of batch_size. Pipeline owns the default ba… feat(indexing)!: Removed duplication of batch_size. Pipeline owns the default ba… Sep 26, 2024
@timonv timonv merged commit 7d8a57f into bosun-ai:master Sep 26, 2024
8 checks passed
This was referenced Sep 26, 2024
timonv added a commit that referenced this pull request Sep 27, 2024
## 🤖 New release
* `swiftide`: 0.12.3 -> 0.13.0 (✓ API compatible changes)
* `swiftide-core`: 0.12.3 -> 0.13.0 (✓ API compatible changes)
* `swiftide-indexing`: 0.12.3 -> 0.13.0 (⚠️ API breaking changes)
* `swiftide-macros`: 0.12.3 -> 0.13.0
* `swiftide-integrations`: 0.12.3 -> 0.13.0 (⚠️ API breaking changes)
* `swiftide-query`: 0.12.3 -> 0.13.0 (✓ API compatible changes)

### ⚠️ `swiftide-indexing` breaking changes

```
--- failure method_parameter_count_changed: pub method parameter count changed ---

Description:
A publicly-visible method now takes a different number of parameters.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#fn-change-arity
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.35.0/src/lints/method_parameter_count_changed.ron

Failed in:
  swiftide_indexing::Pipeline::then_in_batch now takes 2 parameters instead of 3, in /tmp/.tmpKWtCli/swiftide/swiftide-indexing/src/pipeline.rs:221
```

### ⚠️ `swiftide-integrations` breaking changes

```
--- failure enum_marked_non_exhaustive: enum marked #[non_exhaustive] ---

Description:
A public enum has been marked #[non_exhaustive]. Pattern-matching on it outside of its crate must now include a wildcard pattern like `_`, or it will fail to compile.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#attr-adding-non-exhaustive
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.35.0/src/lints/enum_marked_non_exhaustive.ron

Failed in:
  enum SupportedLanguages in /tmp/.tmpKWtCli/swiftide/swiftide-integrations/src/treesitter/supported_languages.rs:37

--- failure inherent_method_missing: pub method removed or renamed ---

Description:
A publicly-visible method or associated fn is no longer available under its prior name. It may have been renamed or removed entirely.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.35.0/src/lints/inherent_method_missing.ron

Failed in:
  FastEmbed::with_batch_size, previously in file /tmp/.tmpg5eaNs/swiftide-integrations/src/fastembed/mod.rs:98
```
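For downstream code that matches on `SupportedLanguages`, the `enum_marked_non_exhaustive` failure above means a catch-all arm is now required. A minimal sketch of the pattern follows. It uses a locally defined enum with illustrative variants rather than the real `swiftide-integrations` type, and `file_extension` is a hypothetical helper.

```rust
/// Illustrative stand-in for the real enum; the variants listed here are assumptions.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq)]
enum SupportedLanguages {
    Rust,
    Python,
    Java,
}

/// Hypothetical helper that maps a language to a file extension.
fn file_extension(lang: SupportedLanguages) -> Option<&'static str> {
    match lang {
        SupportedLanguages::Rust => Some("rs"),
        SupportedLanguages::Python => Some("py"),
        // Catch-all arm: required when matching from another crate,
        // so that variants added later do not break compilation.
        _ => None,
    }
}
```

The compiler only enforces the wildcard arm outside the crate that defines the enum, so this single-file sketch would also compile with every variant listed explicitly. Code that depends on `swiftide-integrations` would not.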

<details><summary><i><b>Changelog</b></i></summary><p>

## `swiftide`
<blockquote>

## [0.13.0](v0.12.3...v0.13.0) - 2024-09-26

### New features

- [7d8a57f](7d8a57f) *(indexing)* [**breaking**] Removed duplication of batch_size ([#336](#336))

**BREAKING CHANGE**: The batch size of batch transformers when indexing is now configured on the batch transformer. If no batch size or default is configured, a configurable default is used from the pipeline. The default batch size is 256.

---------

- [fd110c8](fd110c8) *(tree-sitter)* Add support for Java 22 ([#309](#309))

### Bug fixes

- [23b96e0](23b96e0) *(tree-sitter)* [**breaking**] SupportedLanguages are now non-exhaustive ([#331](#331))

**BREAKING CHANGE**: SupportedLanguages are now non-exhaustive. This means that matching on SupportedLanguages will now require a catch-all arm. This change was made to allow for future languages to be added without breaking changes.

### Miscellaneous

- [923a8f0](923a8f0) *(fastembed,qdrant)* Better batching defaults ([#334](#334))

```text
Qdrant and FastEmbed now have a default batch size, removing the need to set it manually. The default batch sizes are 50 and 256, respectively.
```

**Full Changelog**: 0.12.3...0.13.0



</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/MarcoIeni/release-plz/).

---------

Co-authored-by: Timon Vonk <[email protected]>
Development

Successfully merging this pull request may close these issues.

Single definition of batch size in indexing vs transformers
2 participants