Changelog

All notable changes to this project will be documented in this file.

0.13.1 - 2024-10-02

Bug fixes

  • e6d9ec2 (lancedb) Should not error if table exists (#349)

Full Changelog: https://github.com/bosun-ai/swiftide/compare/0.13.0...0.13.1

0.13.0 - 2024-09-26

New features

  • 7d8a57f (indexing) [breaking] Removed duplication of batch_size (#336)

BREAKING CHANGE: The batch size of batch transformers when indexing is now configured on the batch transformer. If no batch size or default is configured, a configurable default is used from the pipeline. The default batch size is 256.
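
The fallback order described above can be sketched as follows. This is a minimal, self-contained illustration and not the swiftide implementation: the function name `effective_batch_size` and the use of `Option<usize>` for "not configured" are assumptions made for the example. Only the value 256 comes from the entry itself.

```rust
/// Fallback used when neither the transformer nor the pipeline sets a size.
/// 256 is the default stated in the changelog entry.
const DEFAULT_BATCH_SIZE: usize = 256;

/// Hypothetical helper (not a swiftide API): picks the effective batch size
/// for a batch transformer.
fn effective_batch_size(transformer: Option<usize>, pipeline_default: Option<usize>) -> usize {
    transformer
        .or(pipeline_default)
        .unwrap_or(DEFAULT_BATCH_SIZE)
}

fn main() {
    // Set on the transformer: that value wins.
    assert_eq!(effective_batch_size(Some(64), Some(128)), 64);
    // Not set on the transformer: the pipeline default applies.
    assert_eq!(effective_batch_size(None, Some(128)), 128);
    // Nothing configured anywhere: fall back to 256.
    assert_eq!(effective_batch_size(None, None), 256);
}
```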

Bug fixes

  • 23b96e0 (tree-sitter) [breaking] SupportedLanguages are now non-exhaustive (#331)

BREAKING CHANGE: SupportedLanguages are now non-exhaustive. This means that matching on SupportedLanguages will now require a catch-all arm. This change was made to allow for future languages to be added without breaking changes.
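
A catch-all arm on a non-exhaustive enum looks roughly like the sketch below. The enum is a local stand-in with an illustrative variant list, and `file_extension` is a hypothetical helper; neither is copied from swiftide. Inside the defining crate `#[non_exhaustive]` has no effect, so the lint is silenced here; in a downstream crate the `_` arm is mandatory.

```rust
// Stand-in for the real enum; the variant list here is illustrative only.
#[non_exhaustive]
#[derive(Debug, Clone, Copy)]
pub enum SupportedLanguages {
    Rust,
    Python,
    Javascript,
}

/// Hypothetical helper mapping a language to a file extension.
#[allow(unreachable_patterns)] // the `_` arm is only needed by downstream crates
pub fn file_extension(lang: SupportedLanguages) -> Option<&'static str> {
    match lang {
        SupportedLanguages::Rust => Some("rs"),
        SupportedLanguages::Python => Some("py"),
        SupportedLanguages::Javascript => Some("js"),
        // Variants added in later releases land here instead of breaking the build.
        _ => None,
    }
}

fn main() {
    assert_eq!(file_extension(SupportedLanguages::Rust), Some("rs"));
    assert_eq!(file_extension(SupportedLanguages::Python), Some("py"));
    assert_eq!(file_extension(SupportedLanguages::Javascript), Some("js"));
}
```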

Miscellaneous

  • 923a8f0 (fastembed,qdrant) Better batching defaults (#334)
Qdrant and FastEmbed now have a default batch size, removing the need to set it manually. The defaults are 50 for Qdrant and 256 for FastEmbed.

Full Changelog: https://github.com/bosun-ai/swiftide/compare/0.12.3...0.13.0

0.12.3 - 2024-09-23

New features

  • da5df22 (tree-sitter) Implement Serialize and Deserialize for SupportedLanguages (#314)

Bug fixes

  • a756148 (tree-sitter) Fix javascript and improve tests (#313)
As learned from [#309](https://github.com/bosun-ai/swiftide/pull/309), test coverage for the refs defs transformer was
  not great. There _are_ more tests in code_tree. Turns out, with the
  latest treesitter update, javascript broke as it was the only language
  not covered at all.

Miscellaneous

  • e8e9d80 (docs) Add documentation to query module (#276)

Full Changelog: https://github.com/bosun-ai/swiftide/compare/0.12.2...0.12.3

v0.12.2 - 2024-09-20

Docs

  • d84814e Fix broken documentation links and other cargo doc warnings (#304) by @tinco
Running `cargo doc --all-features` resulted in a lot of warnings.

Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.12.1...v0.12.2

v0.12.1 - 2024-09-16

New features

  • ec227d2 (indexing,query) Add concise info log with transformation name by @timonv

  • 01cf579 (query) Add query_mut for reusable query pipelines by @timonv

  • 081a248 (query) Improve query performance similar to indexing in 0.12 by @timonv

  • 8029926 (query,indexing) Add duration in log output on pipeline completion by @timonv

Bug fixes

  • 39b6ecb (core) Truncate long strings safely when printing debug logs by @timonv

  • 8b8ceb9 (deps) Update redis by @timonv

  • 16e9c74 (openai) Reduce debug verbosity by @timonv

  • 6914d60 (qdrant) Reduce debug verbosity when storing nodes by @timonv

  • 3d13889 (query) Reduce and improve debugging verbosity by @timonv

  • 133cf1d (query) Remove verbose debug and skip self in instrumentation by @timonv

  • ce17981 Clippy by @timonv

  • a871c61 Fmt by @timonv

Miscellaneous

  • d62b047 (ci) Update testcontainer images and fix tests by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.12.0...v0.12.1

v0.12.0 - 2024-09-13

New features

  • e902cb7 (query) Add support for filters in SimilaritySingleEmbedding (#298) by @timonv
Adds support for filters for Qdrant and Lancedb in
  SimilaritySingleEmbedding. Also fixes several small bugs and brings
  improved tests.
  • f158960 Major performance improvements (#291) by @timonv
Futures that do not yield were not run in parallel properly. With this
  change, futures are spawned on a tokio worker thread by default.

  When embedding (fastembed) and storing an 85k row dataset, there's a
  ~1.35x performance improvement. With IO bound openai it's 1.5x.

Bug fixes

  • f8314cc (indexing) Limit logged chunk to max 100 chars (#292) by @timonv

  • f95f806 (indexing) Debugging nodes should respect utf8 char boundaries by @timonv

  • 8595553 Implement into_stream_boxed for all loaders by @timonv

  • 9464ca1 Bad embed error propagation (#293) by @timonv

- **fix(indexing): Limit logged chunk to max 100 chars**
  - **fix: Embed transformers must correctly propagate errors**
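
The two logging fixes above (limiting the logged chunk and respecting UTF-8 char boundaries) rely on a technique that can be sketched with the standard library alone. The helper name `truncate_for_log` is hypothetical, and this version counts bytes rather than chars; it is an illustration of the idea, not the code from the commits.

```rust
/// Hypothetical helper: returns at most `max_bytes` bytes of `s`,
/// cut back to the nearest UTF-8 character boundary so slicing cannot panic.
fn truncate_for_log(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a char boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

fn main() {
    // "é" takes two bytes in UTF-8, so a naive cut at byte 2 would split it.
    let text = "héllo";
    assert_eq!(truncate_for_log(text, 2), "h");
    assert_eq!(truncate_for_log(text, 3), "hé");
    assert_eq!(truncate_for_log(text, 100), "héllo");
}
```

Slicing with `&s[..max_bytes]` directly would panic whenever the cut falls inside a multi-byte character, which is the failure mode the boundary check avoids.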

Miscellaneous

  • 45d8a57 (ci) Use llvm-cov preview via nightly and improve test coverage (#289) by @timonv
Fix test coverage in CI. Simplified the trait bounds on the query
  pipeline for now to make it all work and fit together, and added more
  tests to assert boxed versions of trait objects work in tests.
  • 408f30a (deps) Update testcontainers (#295) by @timonv

  • 37c4bd9 (deps) Update treesitter (#296) by @timonv

  • 8d9e954 Cargo update by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.11.1...v0.12.0

v0.11.1 - 2024-09-10

New features

  • 3c9491b Implement traits T for Box for indexing and query traits (#285) by @timonv
When working with trait objects, some pipeline steps now allow for
  Box<dyn Trait> as well.
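
One common way to let `Box<dyn Trait>` stand in for a concrete step is a forwarding impl on `Box<T>`. The sketch below uses a local stand-in trait named `Transformer` with a made-up `transform` method; the real swiftide traits and signatures may differ.

```rust
/// Stand-in for a pipeline step trait; not the real swiftide trait.
trait Transformer {
    fn transform(&self, input: &str) -> String;
}

struct Uppercase;

impl Transformer for Uppercase {
    fn transform(&self, input: &str) -> String {
        input.to_uppercase()
    }
}

// Forward the trait through a Box, so `Box<dyn Transformer>` is itself a `Transformer`.
impl<T: Transformer + ?Sized> Transformer for Box<T> {
    fn transform(&self, input: &str) -> String {
        (**self).transform(input)
    }
}

/// A step that is generic over any `Transformer`, boxed or not.
fn run_step(step: impl Transformer, input: &str) -> String {
    step.transform(input)
}

fn main() {
    let boxed: Box<dyn Transformer> = Box::new(Uppercase);
    assert_eq!(run_step(boxed, "hello"), "HELLO");
    assert_eq!(run_step(Uppercase, "hello"), "HELLO");
}
```

Without the forwarding impl, the `run_step(boxed, ...)` call would not compile, because `Box<dyn Transformer>` would not satisfy the `impl Transformer` bound.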

Bug fixes

  • dfa546b Add missing parquet feature flag by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.11.0...v0.11.1

v0.11.0 - 2024-09-08

New features

  • bdf17ad (indexing) Parquet loader (#279) by @timonv
Ingest and index data from parquet files.
  • a98dbcb (integrations) Add ollama embeddings support (#278) by @ephraimkunz
Update to the most recent ollama-rs, which exposes the batch embedding
  API Ollama exposes (https://github.com/pepperoni21/ollama-rs/pull/61).
  This allows the Ollama struct in Swiftide to implement `EmbeddingModel`.

  Use the same pattern that the OpenAI struct uses to manage separate
  embedding and prompt models.

Miscellaneous

  • 873795b (ci) Re-enable coverage via Coveralls with tarpaulin (#280) by @timonv

  • 465de7f Update CHANGELOG.md with breaking change by @timonv

New Contributors

  • @ephraimkunz made their first contribution in #278

Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.10.0...v0.11.0

v0.10.0 - 2024-09-06

Bug fixes

  • 5a724df [breaking] Rust 1.81 support (#275) by @timonv
Fixing id generation properly as per #272, will be merged in together.

  - **Clippy**
  - **fix(qdrant)!: Default hasher changed in Rust 1.81**

BREAKING CHANGE: Rust 1.81 support (#275)

Docs

  • 3711f6f (readme) Fix date (#273) by @dzvon
I suppose this should be 09-02.

New Contributors

  • @dzvon made their first contribution in #273

Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.9.2...v0.10.0

v0.9.2 - 2024-09-04

New features

  • 84e9bae (indexing) Add chunker for text with text_splitter (#270) by @timonv

  • 387fbf2 (query) Hybrid search for qdrant in query pipeline (#260) by @timonv

Implement hybrid search for qdrant with their new Fusion search. The
  example in /examples includes an indexing and query pipeline, along
  with an example answer.

Docs

  • 064c7e1 (readme) Update intro by @timonv

  • 1dc4c90 (readme) Add new blog links by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.9.1...v0.9.2

v0.9.1 - 2024-09-01

New features

  • b891f93 (integrations) Add fluvio as loader support (#243) by @timonv
Adds Fluvio as a loader support, enabling Swiftide indexing streams to
  process messages from a Fluvio topic.
  • c00b6c8 (query) Ragas support (#236) by @timonv
Work in progress on support for ragas as per
  https://github.com/explodinggradients/ragas/issues/1165 and #232

  Add an optional evaluator to a pipeline. Evaluators need to handle
  transformation events in the query pipeline. The Ragas evaluator
  captures the transformations as per
  https://docs.ragas.io/en/latest/howtos/applications/data_preparation.html.

  You can find a working notebook here
  https://github.com/bosun-ai/swiftide-tutorial/blob/c510788a625215f46575415161659edf26fc1fd5/ragas/notebook.ipynb
  with a pipeline using it here
  https://github.com/bosun-ai/swiftide-tutorial/pull/1
  • a1250c1 LanceDB support (#254) by @timonv
Add LanceDB support for indexing and querying. LanceDB separates compute
  from storage, where storage can be local or hosted elsewhere.

Bug fixes

  • f92376d (deps) Update rust crate aws-sdk-bedrockruntime to v1.46.0 (#247) by @renovate[bot]

  • 732a166 Remove no default features from futures-util by @timonv

Miscellaneous

  • 9b257da Default features cleanup (#262) by @timonv
Integrations are messy and pull a lot in. A potential solution is to
  disable default features, only add what is actually required, and put
  the responsibility on users if they need anything specific. Feature
  unification should then take care of the rest.

Docs

  • fb381b8 (readme) Copy improvements (#261) by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.9.0...v0.9.1

v0.9.0 - 2024-08-15

New features

  • 2443933 (qdrant) Add access to inner client for custom operations (#242) by @timonv

  • 4fff613 (query) Add concurrency on query pipeline and add query_all by @timonv

Bug fixes

  • 4e31c0a (deps) Update rust crate aws-sdk-bedrockruntime to v1.44.0 (#244) by @renovate[bot]

  • 501321f (deps) Update rust crate spider to v1.99.37 (#230) by @renovate[bot]

  • 8a1cc69 (query) After retrieval current transformation should be empty by @timonv

Miscellaneous

  • e9d0016 (indexing,integrations) Move tree-sitter dependencies to integrations (#235) by @timonv
Removes the dependency of indexing on integrations, resulting in much
  faster builds when developing on indexing.

Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.8.0...v0.9.0

v0.8.0 - 2024-08-12

New features

  • 2e25ad4 (indexing) [breaking] Default LLM for indexing pipeline and boilerplate Transformer macro (#227) by @timonv
Add setting a default LLM for an indexing pipeline, avoiding the need to
  clone multiple times.

  More importantly, introduced `swiftide-macros` with
  `#[swiftide_macros::indexing_transformer]` that generates
  all boilerplate code used for internal transformers. This ensures all
  transformers are consistent and makes them
  easy to change in the future. This is a big win for maintainability and
  ease of extension. Users are encouraged to use the macro
  as well.

BREAKING CHANGE: Introduces WithIndexingDefaults and WithBatchIndexingDefaults trait constraints for transformers. They can be used as a marker with a noop (i.e. just impl WithIndexingDefaults for MyTransformer {}). However, when implemented fully, they can be used to provide defaults from the pipeline to your transformers.

  • 67336f1 (indexing) Sparse vector support with Splade and Qdrant (#222) by @timonv
Adds Sparse vector support to the indexing pipeline, enabling hybrid
  search for vector databases. The design should work for any form of
  Sparse embedding, and works with existing embedding modes and multiple
  named vectors. Additionally, added `try_default_sparse` to FastEmbed,
  using Splade, so it's fully usable.

  Hybrid search in the query pipeline coming soon.
  • e728a7c Code outlines in chunk metadata (#137) by @tinco
Added a transformer that generates outlines for code files using tree sitter. And another that compresses the outline to be more relevant to chunks. Additionally added a step to the metadata QA tool that uses the outline to improve the contextual awareness during QA generation.
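
The marker-versus-full implementation choice described in the `WithIndexingDefaults` breaking change above can be sketched as below. Everything except the trait name and `MyTransformer` is an assumption for illustration: the `IndexingDefaults` struct, the `with_indexing_defaults` method name and signature, and the `Summarize` type are hypothetical stand-ins, not the swiftide definitions.

```rust
/// Hypothetical defaults a pipeline could hand to its transformers.
#[derive(Debug, Clone, Default)]
struct IndexingDefaults {
    llm_name: Option<String>,
}

/// Stand-in trait: the provided method body is a no-op.
trait WithIndexingDefaults {
    fn with_indexing_defaults(&mut self, _defaults: &IndexingDefaults) {}
}

/// Opts in as a marker only; pipeline defaults are ignored.
struct MyTransformer;
impl WithIndexingDefaults for MyTransformer {}

/// Implements the trait fully and picks up the pipeline's default LLM
/// when none was set explicitly.
struct Summarize {
    llm_name: Option<String>,
}

impl WithIndexingDefaults for Summarize {
    fn with_indexing_defaults(&mut self, defaults: &IndexingDefaults) {
        if self.llm_name.is_none() {
            self.llm_name = defaults.llm_name.clone();
        }
    }
}

fn main() {
    let defaults = IndexingDefaults {
        llm_name: Some("default-llm".to_string()),
    };

    let mut marker = MyTransformer;
    marker.with_indexing_defaults(&defaults); // no-op

    let mut summarize = Summarize { llm_name: None };
    summarize.with_indexing_defaults(&defaults);
    assert_eq!(summarize.llm_name.as_deref(), Some("default-llm"));
}
```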

Bug fixes

  • dc7412b (deps) Update aws-sdk-rust monorepo (#223) by @renovate[bot]

Miscellaneous

  • 9613f50 (ci) Only show remote github url if present in changelog by @timonv

Docs

  • 73d1649 (readme) Add Ollama support to README by @timonv

  • b3f04de (readme) Add link to discord (#219) by @timonv

  • 4970a68 (readme) Fix discord links by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.7.1...v0.8.0

v0.7.1 - 2024-08-04

New features

  • b2d31e5 (integrations) Add ollama support (#214) by @tinco

  • 9eb5894 (query) Add support for closures in all steps (#215) by @timonv

Miscellaneous

  • 53e662b (ci) Add cargo deny to lint dependencies (#213) by @timonv

Docs

  • 1539393 (readme) Update README.md by @timonv

  • ba07ab9 (readme) Readme improvements by @timonv

  • f7accde (readme) Add 0.7 announcement by @timonv

  • 084548f (readme) Clarify on closures by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.7.0...v0.7.1

swiftide-v0.7.0 - 2024-07-28

New features

  • ec1fb04 (indexing) Metadata as first class citizen (#204) by @timonv
Adds our own implementation for metadata, internally still using a
  BTreeMap. The Value type is now a `serde_json::Value` enum. This allows
  us to store the metadata in the same format as the rest of the document,
  and also allows us to use values programmatically later.

  As is, all current metadata is still stored as Strings.
  • 16bafe4 (swiftide) [breaking] Rework workspace preparing for swiftide-query (#199) by @timonv
Splits up the project into multiple small, unpublished crates. Boosts
  compile times, makes the code a bit easier to grok and enables
  swiftide-query to be built separately.

BREAKING CHANGE: All indexing related tools are now in

  • 63694d2 (swiftide-query) Query pipeline v1 (#189) by @timonv

Bug fixes

  • ee3aad3 (deps) Update rust crate aws-sdk-bedrockruntime to v1.42.0 (#195) by @renovate[bot]

  • be0f31d (deps) Update rust crate spider to v1.99.11 (#190) by @renovate[bot]

  • dd04453 (swiftide) Update main lockfile by @timonv

  • bafd907 Update all cargo package descriptions by @timonv

Miscellaneous

  • e72641b (ci) Set versions in dependencies by @timonv

Docs

  • 2114aa4 (readme) Add copy on the query pipeline by @timonv

  • 573aff6 (indexing) Document the default prompt templates and their context (#206) by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.6.7...swiftide-v0.7.0

swiftide-v0.6.7 - 2024-07-23

New features

  • beea449 (prompt) Add Into for strings to PromptTemplate (#193) by @timonv

  • f3091f7 (transformers) References and definitions from code (#186) by @timonv

Docs

  • 97a572e (readme) Add blog posts and update doc link (#194) by @timonv

  • 504fe26 (pipeline) Add note that closures can also be used as transformers by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.6.6...swiftide-v0.6.7

swiftide-v0.6.6 - 2024-07-16

New features

  • d1c642a (groq) Add SimplePrompt support for Groq (#183) by @timonv
Adds simple prompt support for Groq by using async_openai. ~~Needs some
  double checks~~. Works great.

Bug fixes

  • 5d4a814 (deps) Update rust crate aws-sdk-bedrockruntime to v1.40.0 (#169) by @renovate[bot]

Docs

  • 143c7c9 (readme) Fix typo (#180) by @eltociear

  • d393181 (docsrs) Scrape examples and fix links (#184) by @timonv

New Contributors

  • @eltociear made their first contribution in #180

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.6.5...swiftide-v0.6.6

swiftide-v0.6.5 - 2024-07-15

New features

  • 0065c7a (prompt) Add extending the prompt repository (#178) by @timonv

Bug fixes

  • b54691f (prompts) Include default prompts in crate (#174) by @timonv
- **add prompts to crate**
  - **load prompts via cargo manifest dir**
  • 3c297bb (swiftide) Remove include from Cargo.toml by @timonv

Miscellaneous

  • 73d5fa3 (traits) Cleanup unused batch size in BatchableTransformer (#177) by @timonv

Docs

  • b95b395 (swiftide) Documentation improvements and cleanup (#176) by @timonv
- **chore: remove ingestion stream**
  - **Documentation and grammar**

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.6.3...swiftide-v0.6.5

swiftide-v0.6.3 - 2024-07-14

Bug fixes

  • 47418b5 (prompts) Fix breaking issue with prompts not found by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.6.2...swiftide-v0.6.3

swiftide-v0.6.2 - 2024-07-12

Miscellaneous

  • 2b682b2 (deps) Limit feature flags on qdrant to fix docsrs by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.6.1...swiftide-v0.6.2

swiftide-v0.6.1 - 2024-07-12

Miscellaneous

  • aae7ab1 (deps) Patch update all by @timonv

Docs

  • 085709f (docsrs) Disable unstable and rustdoc scraping by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.6.0...swiftide-v0.6.1

swiftide-v0.6.0 - 2024-07-12

New features

  • 70ea268 (prompts) Add prompts as first class citizens (#145) by @timonv
Adds Prompts as first class citizens. This is a breaking change as
  SimplePrompt with just a `&str` is no longer allowed.

  This introduces `Prompt` and `PromptTemplate`. A template uses jinja
  style templating built on tera. Templates can be converted into prompts,
  and have context added. A prompt is then sent to something that prompts,
  e.g. openai or bedrock.

  Additional prompts can be added either compiled or as one-offs.
  Additionally, it's perfectly fine to prompt with just a string as well,
  just provide an `.into()`.

  For future development, some LLMs really benefit from system prompts,
  which this would enable. For the query pipeline we can also take a much
  more structured approach with composed templates and conditionals.
  • 699cfe4 Embed modes and named vectors (#123) by @pwalski
Added named vector support to qdrant. A pipeline can now have its embed
  mode configured, either per field, chunk and metadata combined (default)
  or both. Vectors need to be configured on the qdrant client side.

  See `examples/store_multiple_vectors.rs` for an example.

  Shoutout to @pwalski for the contribution. Closes #62.
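
For the prompts entry above, the remark that a plain string still works via `.into()` can be illustrated with a `From<&str>` impl. The `Prompt` type and `send` function below are local stand-ins written for this sketch; they are not the swiftide types and do not involve templating.

```rust
/// Stand-in for a rendered prompt; not the real swiftide type.
#[derive(Debug, PartialEq)]
struct Prompt(String);

impl From<&str> for Prompt {
    fn from(text: &str) -> Self {
        Prompt(text.to_string())
    }
}

/// Stand-in for something that prompts; it simply echoes the prompt text.
fn send(prompt: Prompt) -> String {
    format!("sent: {}", prompt.0)
}

fn main() {
    // A plain string becomes a prompt through `.into()`.
    assert_eq!(send("Summarize this text".into()), "sent: Summarize this text");
}
```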

Bug fixes

  • 9334934 (chunkcode) Use correct chunksizes (#122) by @timonv

  • dfc76dd (deps) Update rust crate serde to v1.0.204 (#129) by @renovate[bot]

  • 28f5b04 (deps) Update rust crate tree-sitter-typescript to v0.21.2 (#128) by @renovate[bot]

  • 9c261b8 (deps) Update rust crate text-splitter to v0.14.1 (#127) by @renovate[bot]

  • ff92abd (deps) Update rust crate tree-sitter-javascript to v0.21.4 (#126) by @renovate[bot]

  • 7af97b5 (deps) Update rust crate spider to v1.98.7 (#124) by @renovate[bot]

  • adc4bf7 (deps) Update aws-sdk-rust monorepo (#125) by @renovate[bot]

  • dd32ef3 (deps) Update rust crate async-trait to v0.1.81 (#134) by @renovate[bot]

  • 2b13523 (deps) Update rust crate fastembed to v3.7.1 (#135) by @renovate[bot]

  • 8e22937 (deps) Update rust crate aws-sdk-bedrockruntime to v1.39.0 (#143) by @renovate[bot]

  • 353cd9e (qdrant) Upgrade and better defaults (#118) by @timonv

- **fix(deps): update rust crate qdrant-client to v1.10.1**
  - **fix(qdrant): upgrade to new qdrant with sensible defaults**
  - **feat(qdrant): safe to clone with internal arc**

  • b53636c Inability to store only some of EmbeddedFields (#139) by @pwalski

Performance

  • ea8f823 Improve local build performance and crate cleanup (#148) by @timonv
- **tune cargo for faster builds**
  - **perf(swiftide): increase local build performance**

Miscellaneous

  • eb8364e (ci) Try overriding the github repo for git cliff by @timonv

  • 5de6af4 (ci) Only add contributors if present by @timonv

  • 4c9ed77 (ci) Properly check if contributors are present by @timonv

  • c5bf796 (ci) Add clippy back to ci (#147) by @timonv

  • 7a8843a (deps) Update rust crate testcontainers to 0.20.0 (#133) by @renovate[bot]

  • 364e13d (swiftide) Loosen up dependencies (#140) by @timonv

Loosen up dependencies so swiftide is a bit more flexible to add to
  existing projects
  • 84dd65d [breaking] Rename all mentions of ingest to index (#130) by @timonv
Swiftide is not an ingestion pipeline (loading data), but an indexing
  pipeline (prepping for search).

  There is now a temporary, deprecated re-export to match the previous api.

BREAKING CHANGE: rename all mentions of ingest to index (#130)

  • 51c114c Various tooling & community improvements (#131) by @timonv
- **fix(ci): ensure clippy runs with all features**
  - **chore(ci): coverage using llvm-cov**
  - **chore: drastically improve changelog generation**
  - **chore(ci): add sanity checks for pull requests**
  - **chore(ci): split jobs and add typos**
  • d2a9ea1 Enable clippy pedantic (#132) by @timonv

Docs

  • 8405c9e (contributing) Add guidelines on code design (#113) by @timonv

  • 3e447fe (readme) Link to CONTRIBUTING (#114) by @timonv

  • 4c40e27 (readme) Add back coverage badge by @timonv

  • 5691ac9 (readme) Add preproduction warning by @timonv

  • 37af322 (rustdocs) Rewrite the initial landing page (#149) by @timonv

- **Add homepage and badges to cargo toml**
  - **documentation landing page improvements**
  • 7686c2d Templated prompts are now a major feature by @timonv

New Contributors

  • @pwalski made their first contribution in #139

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.5.0...swiftide-v0.6.0

swiftide-v0.5.0 - 2024-07-01

New features

  • 6a88651 (ingestion_pipeline) Implement filter (#109) by @timonv

  • 5aeb3a7 (ingestion_pipeline) Splitting and merging streams by @timonv

  • 8812fbf (ingestion_pipeline) Build a pipeline from a stream by @timonv

  • 6101bed AWS bedrock support (#92) by @timonv

Adds an integration with AWS Bedrock, implementing SimplePrompt for
  Anthropic and Titan models. More can be added if there is a need. Same
  for the embedding models.

Bug fixes

  • 17a2be1 (changelog) Add scope by @timonv

  • a12cce2 (openai) Add tests for builder by @timonv

  • 963919b (transformers) [breaking] Fix too small chunks being retained and api by @timonv

BREAKING CHANGE: Fix too small chunks being retained and api

  • 5e8da00 Fix oversight in ingestion pipeline tests by @timonv

  • e8198d8 Use git cliff manually for changelog generation by @timonv

  • 2c31513 Just use keepachangelog by @timonv

  • 6430af7 Use native cargo bench format and only run benchmarks crate by @timonv

  • cba981a Replace unwrap with expect and add comment on panic by @timonv

Miscellaneous

  • e243212 (ci) Enable continuous benchmarking and improve benchmarks (#98) by @timonv

  • 2dbf14c (ci) Fix benchmarks in ci by @timonv

  • b155de6 (ci) Fix naming of github actions by @timonv

  • 206e432 (ci) Add support for merge queues by @timonv

  • 46752db (ci) Add concurrency configuration by @timonv

  • 5f09c11 Add initial benchmarks by @timonv

  • 162c6ef Ensure feat is always in Added by @timonv

Docs

  • 929410c (readme) Add diagram to the readme (#107) by @timonv

  • b014f43 Improve documentation across the project (#112) by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.4.3...swiftide-v0.5.0

swiftide-v0.4.3 - 2024-06-28

Bug fixes

  • ab3dc86 (memory_storage) Fallback to incremental counter when missing id by @timonv

Docs

  • dad3e02 (readme) Add ci badge by @timonv

  • 4076092 (readme) Clean up and consistent badge styles by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.4.2...swiftide-v0.4.3

swiftide-v0.4.2 - 2024-06-26

New features

  • 926cc0c (ingestion_stream) Implement into for Result<Vec> by @timonv

Bug fixes

  • 3143308 (embed) Panic if number of embeddings and node are equal by @timonv

Miscellaneous

  • 5ed08bb Cleanup changelog by @timonv

Docs

  • 47aa378 Create CONTRIBUTING.md by @timonv

  • 0660d5b Readme updates by @timonv

Refactor

  • d285874 (ingestion_pipeline) Log_all combines other log helpers by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.4.1...swiftide-v0.4.2

swiftide-v0.4.1 - 2024-06-24

New features

  • 3898ee7 (memory_storage) Can be cloned safely preserving storage by @timonv

  • 92052bf (transformers) Allow for arbitrary closures as transformers and batchable transformers by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.4.0...swiftide-v0.4.1

swiftide-v0.4.0 - 2024-06-23

New features

  • 477a284 (benchmarks) Add benchmark for the file loader by @timonv

  • 1567940 (benchmarks) Add benchmark for simple local pipeline by @timonv

  • 2228d84 (examples) Example for markdown with all metadata by @timonv

  • 9a1e12d (examples,scraping) Add example scraping and ingesting a url by @timonv

  • 15deeb7 (ingestion_node) Add constructor with defaults by @timonv

  • 4d5c68e (ingestion_node) Improved human readable Debug by @timonv

  • a5051b7 (ingestion_pipeline) Optional error filtering and logging (#75) by @timonv

  • 062107b (ingestion_pipeline) Implement throttling a pipeline (#77) by @timonv

  • a2ffc78 (ingestion_stream) Improved stream developer experience (#81) by @timonv

Improves stream ergonomics by providing convenient helpers and `Into`
  for streams, vectors and iterators that match the internal type.

  This means that in many cases, trait implementers can simply call
  `.into()` instead of manually constructing a stream. In the case it's an
  iterator, they can now use `IngestionStream::iter(<IntoIterator>)`
  instead.
  • d260674 (integrations) [breaking] Support fastembed (#60) by @timonv
Adds support for FastEmbed with various models. Includes a breaking change, renaming the Embed trait to EmbeddingModel.

BREAKING CHANGE: support fastembed (#60)

  • 9004323 (integrations) [breaking] Implement Persist for Redis (#80) by @timonv

BREAKING CHANGE: implement Persist for Redis (#80)

  • eb84dd2 (integrations,transformers) Add transformer for converting html to markdown by @timonv

  • ef7dcea (loaders) File loader performance improvements by @timonv

  • 6d37051 (loaders) Add scraping using spider by @timonv

  • 2351867 (persist) In memory storage for testing, experimentation and debugging by @timonv

  • 4d5d650 (traits) Add automock for simpleprompt by @timonv

  • bd6f887 (transformers) Add transformers for title, summary and keywords by @timonv

Bug fixes

  • 7cbfc4e (ingestion_pipeline) Concurrency does not work when spawned (#76) by @timonv
Concurrency did not work as expected. When spawning via `tokio::spawn`
  the future would be polled directly, and any concurrency setting would
  not be respected. Because it had to be removed, tracing for each step
  was improved as well.

Miscellaneous

  • f4341ba (ci) Single changelog for all (future) crates in root (#57) by @timonv

  • 7dde8a0 (ci) Code coverage reporting (#58) by @timonv

Post test coverage to Coveralls

  Also enabled --all-features when running tests in ci, just to be sure
  • cb7a2cd (scraping) Exclude spider from test coverage by @timonv

  • 7767588 (transformers) Improve test coverage by @timonv

  • 3b7c0db Move changelog to root by @timonv

  • d6d0215 Properly quote crate name in changelog by @timonv

  • f251895 Documentation and feature flag cleanup (#69) by @timonv

With fastembed added, our dependencies become rather heavy. All
  integrations are now disabled by default; either enable 'all' or cherry
  pick integrations.

Docs

  • 53ed920 Hide the table of contents by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.3.3...swiftide-v0.4.0

swiftide-v0.3.3 - 2024-06-16

New features

  • bdaed53 (integrations) Clone and debug for integrations by @timonv

  • 318e538 (transformers) Builder and clone for chunk_code by @timonv

  • c074cc0 (transformers) Builder for chunk_markdown by @timonv

  • e18e7fa (transformers) Builder and clone for MetadataQACode by @timonv

  • fd63dff (transformers) Builder and clone for MetadataQAText by @timonv

Miscellaneous

  • 678106c (ci) Pretty names for pipelines (#54) by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.3.2...swiftide-v0.3.3

swiftide-v0.3.2 - 2024-06-16

New features

  • b211002 (integrations) Qdrant and openai builder should be consistent (#52) by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.3.1...swiftide-v0.3.2

swiftide-v0.3.1 - 2024-06-15

Docs

  • 6f63866 We love feedback <3 by @timonv

  • 7d79b64 Fixing some grammar typos on README.md (#51) by @hectorip

New Contributors

  • @hectorip made their first contribution in #51

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.3.0...swiftide-v0.3.1

swiftide-v0.3.0 - 2024-06-14

New features

  • 745b8ed (ingestion_pipeline) [breaking] Support chained storage backends (#46) by @timonv
Pipeline now supports multiple storage backends. This makes the order of adding storage important. Changed the name of the method to reflect that.

BREAKING CHANGE: support chained storage backends (#46)

  • cd055f1 (ingestion_pipeline) Concurrency improvements (#48) by @timonv

  • 1f0cd28 (ingestion_pipeline) Early return if any error encountered (#49) by @timonv

  • fa74939 Configurable concurrency for transformers and chunkers (#47) by @timonv

Docs

  • 473e60e Update linkedin link by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.2.1...swiftide-v0.3.0

swiftide-v0.2.1 - 2024-06-13

Docs

  • cb9b4fe Add link to bosun by @timonv

  • e330ab9 Fix documentation link by @timonv

Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.2.0...swiftide-v0.2.1

swiftide-v0.2.0 - 2024-06-13

New features

  • 9ec93be Api improvements with example (#10) by @timonv

Docs

  • 95a6200 (swiftide) Documented file swiftide/src/ingestion/ingestion_pipeline.rs (#14) by @bosun-ai[bot]

  • 7abccc2 (swiftide) Documented file swiftide/src/ingestion/ingestion_stream.rs (#16) by @bosun-ai[bot]

  • 755cd47 (swiftide) Documented file swiftide/src/ingestion/ingestion_node.rs (#15) by @bosun-ai[bot]

  • 2ea5a84 (swiftide) Documented file swiftide/src/integrations/openai/mod.rs (#21) by @bosun-ai[bot]

  • b319c0d (swiftide) Documented file swiftide/src/integrations/treesitter/splitter.rs (#30) by @bosun-ai[bot]

  • 29fce74 (swiftide) Documented file swiftide/src/integrations/redis/node_cache.rs (#29) by @bosun-ai[bot]

  • 7229af8 (swiftide) Documented file swiftide/src/integrations/qdrant/persist.rs (#24) by @bosun-ai[bot]

  • 6240a26 (swiftide) Documented file swiftide/src/integrations/redis/mod.rs (#23) by @bosun-ai[bot]

  • 7688c99 (swiftide) Documented file swiftide/src/integrations/qdrant/mod.rs (#22) by @bosun-ai[bot]

  • d572c88 (swiftide) Documented file swiftide/src/integrations/qdrant/ingestion_node.rs (#20) by @bosun-ai[bot]

  • 14e24c3 (swiftide) Documented file swiftide/src/ingestion/mod.rs (#28) by @bosun-ai[bot]

  • 502939f (swiftide) Documented file swiftide/src/integrations/treesitter/supported_languages.rs (#26) by @bosun-ai[bot]

  • a78e68e (swiftide) Documented file swiftide/tests/ingestion_pipeline.rs (#41) by @bosun-ai[bot]

  • 289687e (swiftide) Documented file swiftide/src/loaders/mod.rs (#40) by @bosun-ai[bot]

  • ebd0a5d (swiftide) Documented file swiftide/src/transformers/chunk_code.rs (#39) by @bosun-ai[bot]

  • fb428d1 (swiftide) Documented file swiftide/src/transformers/metadata_qa_text.rs (#36) by @bosun-ai[bot]

  • 305a641 (swiftide) Documented file swiftide/src/transformers/openai_embed.rs (#35) by @bosun-ai[bot]

  • c932897 (swiftide) Documented file swiftide/src/transformers/metadata_qa_code.rs (#34) by @bosun-ai[bot]

  • 090ef1b (swiftide) Documented file swiftide/src/integrations/openai/simple_prompt.rs (#19) by @bosun-ai[bot]

  • 7cfcc83 Update readme template links and fix template by @timonv

  • a717f3d Template links should be underscores by @timonv

New Contributors

  • @bosun-ai[bot] made their first contribution in #19

Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.1.0...swiftide-v0.2.0

v0.1.0 - 2024-06-13

New features

  • 2a6e503 (doc) Setup basic readme (#5) by @timonv

  • b8f9166 (fluyt) Significant tracing improvements (#368) by @timonv

* fix(fluyt): remove unnecessary cloning and unwraps

  * fix(fluyt): also set target correctly on manual spans

  * fix(fluyt): do not capture raw result

  * feat(fluyt): nicer tracing for ingestion pipeline

  * fix(fluyt): remove instrumentation on lazy methods

  * feat(fluyt): add useful metadata to the root span

  * fix(fluyt): fix dangling spans in ingestion pipeline

  * fix(fluyt): do not log codebase in rag utils
  • 0986136 (fluyt/code_ops) Add languages to chunker and range for chunk size (#334) by @timonv
* feat(fluyt/code_ops): add more treesitter languages

  * fix: clippy + fmt

  * feat(fluyt/code_ops): implement builder and support range

  * feat(fluyt/code_ops): implement range limits for code chunking

  * feat(fluyt/indexing): code chunking supports size
  • f10bc30 (ingestion_pipeline) Default concurrency is the number of cpus (#6) by @timonv

  • 7453ddc Replace databuoy with new ingestion pipeline (#322) by @timonv

  • 054b560 Fix build and add feature flags for all integrations by @timonv

Bug fixes

  • fdf4be3 (fluyt) Ensure minimal tracing by @timonv

  • 389b0f1 Add debug info to qdrant setup by @timonv

  • bb905a3 Use rustls on redis and log errors by @timonv

  • 458801c Properly connect to redis over tls by @timonv

Miscellaneous

  • ce6e465 (fluyt) Add verbose log on checking if index exists by @timonv

  • 6967b0d Make indexing extraction compile by @tinco

  • f595f3d Add rust-toolchain on stable by @timonv

  • da004c6 Start cleaning up dependencies by @timonv

  • cccdaf5 Remove more unused dependencies by @timonv

  • 7ee8799 Remove more crates and update by @timonv

  • 951f496 Clean up more crates by @timonv

  • 1f17d84 Cargo update by @timonv

  • 730d879 Create LICENSE by @timonv

  • 44524fb Restructure repository and rename (#3) by @timonv

* chore: move traits around

  * chore: move crates to root folder

  * chore: restructure and make it compile

  * chore: remove infrastructure

  * fix: make it compile

  * fix: clippy

  * chore: remove min rust version

  * chore: cargo update

  * chore: remove code_ops

  * chore: settle on swiftide
  • e717b7f Update issue templates by @timonv

  • 8e22e0e Cleanup by @timonv

  • 4d79d27 Tests, tests, tests (#4) by @timonv

  • 1036d56 Configure cargo toml (#7) by @timonv

  • 0ae98a7 Cleanup Cargo keywords by @timonv

Refactor

  • 0d342ea Models as first class citizens (#318) by @timonv
* refactor: refactor common datastructures to /models

  * refactor: promote to first class citizens

  * fix: clippy

  * fix: remove duplication in http handler

  * fix: clippy

  * fix: fmt

  * feat: update for latest change

  * fix(fluyt/models): doctest