All notable changes to this project will be documented in this file.
0.13.1 - 2024-10-02
Full Changelog: https://github.com/bosun-ai/swiftide/compare/0.13.0...0.13.1
0.13.0 - 2024-09-26
BREAKING CHANGE: The batch size of batch transformers when indexing is now configured on the batch transformer. If no batch size or default is configured, a configurable default is used from the pipeline. The default batch size is 256.
BREAKING CHANGE: SupportedLanguages are now non-exhaustive. This means that matching on SupportedLanguages will now require a catch-all arm. This change was made to allow for future languages to be added without breaking changes.
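As a rough illustration of what the catch-all requirement means for downstream code, here is a minimal sketch. The enum below is a local stand-in with made-up variants, not the crate's actual definition, and `file_extension` is a hypothetical helper. Note that `#[non_exhaustive]` is only enforced across crate boundaries, so within one file the wildcard arm is shown for illustration.

```rust
/// Hypothetical stand-in for swiftide's `SupportedLanguages`.
/// The variants listed here are illustrative, not the crate's actual set.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SupportedLanguages {
    Rust,
    Python,
    Javascript,
}

/// Hypothetical helper: maps a language to a file extension, returning
/// `None` for any variant this code does not handle explicitly.
pub fn file_extension(lang: SupportedLanguages) -> Option<&'static str> {
    match lang {
        SupportedLanguages::Rust => Some("rs"),
        SupportedLanguages::Python => Some("py"),
        // Downstream crates must now provide a catch-all arm like this one.
        // Here it also covers `Javascript`, which is deliberately unlisted,
        // and any variant added in a future release.
        _ => None,
    }
}
```

With the wildcard arm in place, adding a new language upstream no longer breaks the match; the new variant simply falls through to `None` until it is handled explicitly.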
Qdrant and FastEmbed now have a default batch size, removing the need to set it manually. The default batch sizes are 50 and 256, respectively.
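The batch size fallback described in this release could be sketched roughly as follows. All types and method names here (`BatchTransformer`, `Pipeline`, `with_batch_size`, `effective_batch_size`) are hypothetical stand-ins for illustration and are not taken from the Swiftide API; only the values 50 and 256 come from the notes above.

```rust
/// Pipeline-wide fallback, matching the default of 256 stated above.
const PIPELINE_DEFAULT_BATCH_SIZE: usize = 256;

/// Hypothetical batch transformer that may carry its own batch size.
#[derive(Debug, Default)]
pub struct BatchTransformer {
    batch_size: Option<usize>,
}

impl BatchTransformer {
    pub fn new() -> Self {
        Self::default()
    }

    /// Configures the batch size on the transformer itself.
    pub fn with_batch_size(mut self, size: usize) -> Self {
        self.batch_size = Some(size);
        self
    }
}

/// Hypothetical pipeline holding a configurable default batch size.
#[derive(Debug)]
pub struct Pipeline {
    default_batch_size: usize,
}

impl Pipeline {
    pub fn new() -> Self {
        Self { default_batch_size: PIPELINE_DEFAULT_BATCH_SIZE }
    }

    pub fn with_default_batch_size(mut self, size: usize) -> Self {
        self.default_batch_size = size;
        self
    }

    /// The transformer's own setting wins; otherwise the pipeline default applies.
    pub fn effective_batch_size(&self, transformer: &BatchTransformer) -> usize {
        transformer.batch_size.unwrap_or(self.default_batch_size)
    }
}
```

In this sketch a transformer that ships its own default (such as 50 for a storage backend) keeps it, while a transformer without one inherits whatever the pipeline is configured with.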
Full Changelog: https://github.com/bosun-ai/swiftide/compare/0.12.3...0.13.0
0.12.3 - 2024-09-23
As learned from [#309](https://github.com/bosun-ai/swiftide/pull/309), test coverage for the refs defs transformer was
not great. There _are_ more tests in code_tree. Turns out, with the
latest treesitter update, javascript broke as it was the only language
not covered at all.
Full Changelog: https://github.com/bosun-ai/swiftide/compare/0.12.2...0.12.3
v0.12.2 - 2024-09-20
- d84814e Fix broken documentation links and other cargo doc warnings (#304) by @tinco
Running `cargo doc --all-features` resulted in a lot of warnings.
Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.12.1...v0.12.2
v0.12.1 - 2024-09-16
- ec227d2 (indexing,query) Add concise info log with transformation name by @timonv
- 01cf579 (query) Add query_mut for reusable query pipelines by @timonv
- 081a248 (query) Improve query performance similar to indexing in 0.12 by @timonv
- 8029926 (query,indexing) Add duration in log output on pipeline completion by @timonv
- 39b6ecb (core) Truncate long strings safely when printing debug logs by @timonv
- 8b8ceb9 (deps) Update redis by @timonv
- 16e9c74 (openai) Reduce debug verbosity by @timonv
- 6914d60 (qdrant) Reduce debug verbosity when storing nodes by @timonv
- 3d13889 (query) Reduce and improve debugging verbosity by @timonv
- 133cf1d (query) Remove verbose debug and skip self in instrumentation by @timonv
- ce17981 Clippy by @timonv
- a871c61 Fmt by @timonv
- d62b047 (ci) Update testcontainer images and fix tests by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.12.0...v0.12.1
v0.12.0 - 2024-09-13
- e902cb7 (query) Add support for filters in SimilaritySingleEmbedding (#298) by @timonv
Adds support for filters for Qdrant and Lancedb in
SimilaritySingleEmbedding. Also fixes several small bugs and brings
improved tests.
- f158960 Major performance improvements (#291) by @timonv
Futures that do not yield were not run in parallel properly. With this
change, futures are spawned on a tokio worker thread by default.
When embedding (fastembed) and storing an 85k-row dataset, there's a
~1.35x performance improvement:
<img width="621" alt="image"
src="https://github.com/user-attachments/assets/ba2d4d96-8d4a-44f1-b02d-6ac2af0cedb7">
~~Need to do one more test with IO bound futures as well. Pretty huge,
not that it was slow.~~
With IO bound openai it's 1.5x.
- f8314cc (indexing) Limit logged chunk to max 100 chars (#292) by @timonv
- f95f806 (indexing) Debugging nodes should respect utf8 char boundaries by @timonv
- 8595553 Implement into_stream_boxed for all loaders by @timonv
- 9464ca1 Bad embed error propagation (#293) by @timonv
- **fix(indexing): Limit logged chunk to max 100 chars**
- **fix: Embed transformers must correctly propagate errors**
- 45d8a57 (ci) Use llvm-cov preview via nightly and improve test coverage (#289) by @timonv
Fix test coverage in CI. Simplified the trait bounds on the query
pipeline for now to make it all work and fit together, and added more
tests to assert boxed versions of trait objects work in tests.
- 408f30a (deps) Update testcontainers (#295) by @timonv
- 37c4bd9 (deps) Update treesitter (#296) by @timonv
- 8d9e954 Cargo update by @timonv
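The two logging fixes in this release (limiting logged chunks to 100 characters and respecting UTF-8 character boundaries) can be illustrated with a small sketch. The function name `truncate_chars` is hypothetical and this is not necessarily how Swiftide implements it; it only shows one way to cut a string by characters rather than bytes, so that a multi-byte character is never split.

```rust
/// Hypothetical helper: returns at most `max_chars` characters of `s`,
/// never splitting a multi-byte UTF-8 character.
pub fn truncate_chars(s: &str, max_chars: usize) -> &str {
    // `char_indices` yields the byte offset at which each character starts,
    // so the offset of character number `max_chars` is a valid slice end.
    match s.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &s[..byte_idx],
        // The string has `max_chars` characters or fewer: return it whole.
        None => s,
    }
}
```

Slicing by a raw byte count instead (for example `&s[..100]`) panics when the index falls inside a multi-byte character, which is the class of bug the boundary-aware approach avoids.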
Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.11.1...v0.12.0
v0.11.1 - 2024-09-10
- 3c9491b Implement traits T for Box for indexing and query traits (#285) by @timonv
When working with trait objects, some pipeline steps now allow for
Box<dyn Trait> as well.
- dfa546b Add missing parquet feature flag by @timonv
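The `Box<dyn Trait>` support mentioned in this release typically relies on a forwarding implementation. The sketch below uses a simplified, synchronous, hypothetical `Transformer` trait; the real Swiftide traits differ (they are async, among other things), so treat the names and signatures as assumptions.

```rust
/// Hypothetical, simplified stand-in for one of the pipeline traits.
pub trait Transformer {
    fn transform(&self, input: &str) -> String;
}

/// Forwarding impl: anything boxed (including `Box<dyn Transformer>`)
/// is itself a `Transformer`.
impl<T: Transformer + ?Sized> Transformer for Box<T> {
    fn transform(&self, input: &str) -> String {
        // Deref through the reference and the box to reach the inner value.
        (**self).transform(input)
    }
}

/// Example transformer used only for this illustration.
pub struct Uppercase;

impl Transformer for Uppercase {
    fn transform(&self, input: &str) -> String {
        input.to_uppercase()
    }
}

/// A pipeline step that is generic over the trait, not over trait objects.
pub fn run_step<T: Transformer>(step: T, input: &str) -> String {
    step.transform(input)
}
```

Because of the `?Sized` bound, `run_step` accepts both a concrete `Uppercase` and a `Box<dyn Transformer>` without any change to its signature.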
Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.11.0...v0.11.1
v0.11.0 - 2024-09-08
- bdf17ad (indexing) Parquet loader (#279) by @timonv
Ingest and index data from parquet files.
- a98dbcb (integrations) Add ollama embeddings support (#278) by @ephraimkunz
Update to the most recent ollama-rs, which exposes the batch embedding
API Ollama exposes (https://github.com/pepperoni21/ollama-rs/pull/61).
This allows the Ollama struct in Swiftide to implement `EmbeddingModel`.
Use the same pattern that the OpenAI struct uses to manage separate
embedding and prompt models.
- 873795b (ci) Re-enable coverage via Coveralls with tarpaulin (#280) by @timonv
- 465de7f Update CHANGELOG.md with breaking change by @timonv
- @ephraimkunz made their first contribution in #278
Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.10.0...v0.11.0
v0.10.0 - 2024-09-06
- 5a724df [breaking] Rust 1.81 support (#275) by @timonv
Fixing id generation properly as per #272, will be merged in together.
- **Clippy**
- **fix(qdrant)!: Default hasher changed in Rust 1.81**
BREAKING CHANGE: Rust 1.81 support (#275)
- 3711f6f (readme) Fix date (#273) by @dzvon
I suppose this should be 09-02.
- @dzvon made their first contribution in #273
Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.9.2...v0.10.0
v0.9.2 - 2024-09-04
- 84e9bae (indexing) Add chunker for text with text_splitter (#270) by @timonv
- 387fbf2 (query) Hybrid search for qdrant in query pipeline (#260) by @timonv
Implement hybrid search for qdrant with their new Fusion search. The example
in /examples includes an indexing and query pipeline, as well as the
example answer.
Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.9.1...v0.9.2
v0.9.1 - 2024-09-01
- b891f93 (integrations) Add fluvio as loader support (#243) by @timonv
Adds Fluvio as a loader support, enabling Swiftide indexing streams to
process messages from a Fluvio topic.
- c00b6c8 (query) Ragas support (#236) by @timonv
Work in progress on support for ragas as per
https://github.com/explodinggradients/ragas/issues/1165 and #232
Add an optional evaluator to a pipeline. Evaluators need to handle
transformation events in the query pipeline. The Ragas evaluator
captures the transformations as per
https://docs.ragas.io/en/latest/howtos/applications/data_preparation.html.
You can find a working notebook here
https://github.com/bosun-ai/swiftide-tutorial/blob/c510788a625215f46575415161659edf26fc1fd5/ragas/notebook.ipynb
with a pipeline using it here
https://github.com/bosun-ai/swiftide-tutorial/pull/1
- a1250c1 LanceDB support (#254) by @timonv
Add LanceDB support for indexing and querying. LanceDB separates compute
from storage, where storage can be local or hosted elsewhere.
- f92376d (deps) Update rust crate aws-sdk-bedrockruntime to v1.46.0 (#247) by @renovate[bot]
- 732a166 Remove no default features from futures-util by @timonv
- 9b257da Default features cleanup (#262) by @timonv
Integrations are messy and pull a lot in. A potential solution is to
disable default features, only add what is actually required, and put
the responsibility on users if they need anything specific. Feature
unification should then take care of the rest.
- fb381b8 (readme) Copy improvements (#261) by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.9.0...v0.9.1
v0.9.0 - 2024-08-15
- 2443933 (qdrant) Add access to inner client for custom operations (#242) by @timonv
- 4fff613 (query) Add concurrency on query pipeline and add query_all by @timonv
- 4e31c0a (deps) Update rust crate aws-sdk-bedrockruntime to v1.44.0 (#244) by @renovate[bot]
- 501321f (deps) Update rust crate spider to v1.99.37 (#230) by @renovate[bot]
- 8a1cc69 (query) After retrieval current transformation should be empty by @timonv
- e9d0016 (indexing,integrations) Move tree-sitter dependencies to integrations (#235) by @timonv
Removes the dependency of indexing on integrations, resulting in much
faster builds when developing on indexing.
Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.8.0...v0.9.0
v0.8.0 - 2024-08-12
- 2e25ad4 (indexing) [breaking] Default LLM for indexing pipeline and boilerplate Transformer macro (#227) by @timonv
Add setting a default LLM for an indexing pipeline, avoiding the need to
clone multiple times.
More importantly, introduced `swiftide-macros` with
`#[swiftide_macros::indexing_transformer]` that generates
all boilerplate code used for internal transformers. This ensures all
transformers are consistent and makes them
easy to change in the future. This is a big win for maintainability and
ease of extension. Users are encouraged to use the macro
as well.
BREAKING CHANGE: Introduces `WithIndexingDefaults` and `WithBatchIndexingDefaults` trait constraints for transformers. They can be used as a marker with a noop (i.e. just `impl WithIndexingDefaults for MyTransformer {}`). However, when implemented fully, they can be used to provide defaults from the pipeline to your transformers.
- 67336f1 (indexing) Sparse vector support with Splade and Qdrant (#222) by @timonv
Adds Sparse vector support to the indexing pipeline, enabling hybrid
search for vector databases. The design should work for any form of
Sparse embedding, and works with existing embedding modes and multiple
named vectors. Additionally, added `try_default_sparse` to FastEmbed,
using Splade, so it's fully usable.
Hybrid search in the query pipeline coming soon.
- e728a7c Code outlines in chunk metadata (#137) by @tinco
Added a transformer that generates outlines for code files using tree sitter. And another that compresses the outline to be more relevant to chunks. Additionally added a step to the metadata QA tool that uses the outline to improve the contextual awareness during QA generation.
- dc7412b (deps) Update aws-sdk-rust monorepo (#223) by @renovate[bot]
- 9613f50 (ci) Only show remote github url if present in changelog by @timonv
- 73d1649 (readme) Add Ollama support to README by @timonv
- b3f04de (readme) Add link to discord (#219) by @timonv
- 4970a68 (readme) Fix discord links by @timonv
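The `WithIndexingDefaults` breaking change in this release describes a trait that works either as an empty marker or as a hook for receiving pipeline defaults. A minimal sketch of that pattern follows. The method name `with_indexing_defaults`, the `IndexingDefaults` struct and its `llm_name` field, and `LlmTransformer` are assumptions made for illustration; the changelog does not show the actual signatures.

```rust
/// Hypothetical shape of the defaults a pipeline hands to its transformers.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct IndexingDefaults {
    pub llm_name: Option<String>,
}

/// Assumed signature: a hook with a no-op default body, so an empty
/// `impl` block is enough to satisfy the trait constraint.
pub trait WithIndexingDefaults {
    fn with_indexing_defaults(&mut self, _defaults: IndexingDefaults) {}
}

/// Marker usage: opts in to the constraint without using any defaults.
pub struct MyTransformer;

impl WithIndexingDefaults for MyTransformer {}

/// Full usage: picks up the pipeline default when nothing was set explicitly.
pub struct LlmTransformer {
    pub llm_name: Option<String>,
}

impl WithIndexingDefaults for LlmTransformer {
    fn with_indexing_defaults(&mut self, defaults: IndexingDefaults) {
        // An explicitly configured value takes precedence over the default.
        if self.llm_name.is_none() {
            self.llm_name = defaults.llm_name;
        }
    }
}
```

The default method body is what makes the marker form possible: existing transformers only need the one-line empty impl, while transformers that want pipeline defaults override the hook.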
Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.7.1...v0.8.0
v0.7.1 - 2024-08-04
- b2d31e5 (integrations) Add ollama support (#214) by @tinco
- 9eb5894 (query) Add support for closures in all steps (#215) by @timonv
- 53e662b (ci) Add cargo deny to lint dependencies (#213) by @timonv
- 1539393 (readme) Update README.md by @timonv
- ba07ab9 (readme) Readme improvements by @timonv
- f7accde (readme) Add 0.7 announcement by @timonv
- 084548f (readme) Clarify on closures by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.7.0...v0.7.1
swiftide-v0.7.0 - 2024-07-28
- ec1fb04 (indexing) Metadata as first class citizen (#204) by @timonv
Adds our own implementation for metadata, internally still using a
BTreeMap. The Value type is now a `serde_json::Value` enum. This allows
us to store the metadata in the same format as the rest of the document,
and also allows us to use values programmatically later.
As is, all current metadata is still stored as Strings.
- 16bafe4 (swiftide) [breaking] Rework workspace preparing for swiftide-query (#199) by @timonv
Splits up the project into multiple small, unpublished crates. Boosts
compile times, makes the code a bit easier to grok and enables
swiftide-query to be built separately.
BREAKING CHANGE: All indexing related tools are now in
- 63694d2 (swiftide-query) Query pipeline v1 (#189) by @timonv
- ee3aad3 (deps) Update rust crate aws-sdk-bedrockruntime to v1.42.0 (#195) by @renovate[bot]
- be0f31d (deps) Update rust crate spider to v1.99.11 (#190) by @renovate[bot]
- dd04453 (swiftide) Update main lockfile by @timonv
- bafd907 Update all cargo package descriptions by @timonv
- e72641b (ci) Set versions in dependencies by @timonv
- 2114aa4 (readme) Add copy on the query pipeline by @timonv
- 573aff6 (indexing) Document the default prompt templates and their context (#206) by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.6.7...swiftide-v0.7.0
swiftide-v0.6.7 - 2024-07-23
- beea449 (prompt) Add Into for strings to PromptTemplate (#193) by @timonv
- f3091f7 (transformers) References and definitions from code (#186) by @timonv
- 97a572e (readme) Add blog posts and update doc link (#194) by @timonv
- 504fe26 (pipeline) Add note that closures can also be used as transformers by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.6.6...swiftide-v0.6.7
swiftide-v0.6.6 - 2024-07-16
- d1c642a (groq) Add SimplePrompt support for Groq (#183) by @timonv
Adds simple prompt support for Groq by using async_openai. ~~Needs some
double checks~~. Works great.
- 5d4a814 (deps) Update rust crate aws-sdk-bedrockruntime to v1.40.0 (#169) by @renovate[bot]
- 143c7c9 (readme) Fix typo (#180) by @eltociear
- d393181 (docsrs) Scrape examples and fix links (#184) by @timonv
- @eltociear made their first contribution in #180
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.6.5...swiftide-v0.6.6
swiftide-v0.6.5 - 2024-07-15
- 0065c7a (prompt) Add extending the prompt repository (#178) by @timonv
- b54691f (prompts) Include default prompts in crate (#174) by @timonv
- **add prompts to crate**
- **load prompts via cargo manifest dir**
- 3c297bb (swiftide) Remove include from Cargo.toml by @timonv
- 73d5fa3 (traits) Cleanup unused batch size in `BatchableTransformer` (#177) by @timonv
- b95b395 (swiftide) Documentation improvements and cleanup (#176) by @timonv
- **chore: remove ingestion stream**
- **Documentation and grammar**
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.6.3...swiftide-v0.6.5
swiftide-v0.6.3 - 2024-07-14
- 47418b5 (prompts) Fix breaking issue with prompts not found by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.6.2...swiftide-v0.6.3
swiftide-v0.6.2 - 2024-07-12
- 2b682b2 (deps) Limit feature flags on qdrant to fix docsrs by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.6.1...swiftide-v0.6.2
swiftide-v0.6.1 - 2024-07-12
- aae7ab1 (deps) Patch update all by @timonv
- 085709f (docsrs) Disable unstable and rustdoc scraping by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.6.0...swiftide-v0.6.1
swiftide-v0.6.0 - 2024-07-12
- 70ea268 (prompts) Add prompts as first class citizens (#145) by @timonv
Adds Prompts as first class citizens. This is a breaking change as
SimplePrompt with just a `&str` is no longer allowed.
This introduces `Prompt` and `PromptTemplate`. A template uses jinja
style templating built on tera. Templates can be converted into prompts,
and have context added. A prompt is then sent to something that prompts,
i.e. openai or bedrock.
Additional prompts can be added either compiled or as one-offs.
Additionally, it's perfectly fine to prompt with just a string as well,
just provide an `.into()`.
For future development, some LLMs really benefit from system prompts,
which this would enable. For the query pipeline we can also take a much
more structured approach with composed templates and conditionals.
- 699cfe4 Embed modes and named vectors (#123) by @pwalski
Added named vector support to qdrant. A pipeline can now have its embed
mode configured, either per field, chunk and metadata combined (default)
or both. Vectors need to be configured on the qdrant client side.
See `examples/store_multiple_vectors.rs` for an example.
Shoutout to @pwalski for the contribution. Closes #62.
- 9334934 (chunkcode) Use correct chunksizes (#122) by @timonv
- dfc76dd (deps) Update rust crate serde to v1.0.204 (#129) by @renovate[bot]
- 28f5b04 (deps) Update rust crate tree-sitter-typescript to v0.21.2 (#128) by @renovate[bot]
- 9c261b8 (deps) Update rust crate text-splitter to v0.14.1 (#127) by @renovate[bot]
- ff92abd (deps) Update rust crate tree-sitter-javascript to v0.21.4 (#126) by @renovate[bot]
- 7af97b5 (deps) Update rust crate spider to v1.98.7 (#124) by @renovate[bot]
- adc4bf7 (deps) Update aws-sdk-rust monorepo (#125) by @renovate[bot]
- dd32ef3 (deps) Update rust crate async-trait to v0.1.81 (#134) by @renovate[bot]
- 2b13523 (deps) Update rust crate fastembed to v3.7.1 (#135) by @renovate[bot]
- 8e22937 (deps) Update rust crate aws-sdk-bedrockruntime to v1.39.0 (#143) by @renovate[bot]
- 353cd9e (qdrant) Upgrade and better defaults (#118) by @timonv
- **fix(deps): update rust crate qdrant-client to v1.10.1**
- **fix(qdrant): upgrade to new qdrant with sensible defaults**
- **feat(qdrant): safe to clone with internal arc**
- b53636c Inability to store only some of `EmbeddedField`s (#139) by @pwalski
- ea8f823 Improve local build performance and crate cleanup (#148) by @timonv
- **tune cargo for faster builds**
- **perf(swiftide): increase local build performance**
- eb8364e (ci) Try overriding the github repo for git cliff by @timonv
- 5de6af4 (ci) Only add contributors if present by @timonv
- 4c9ed77 (ci) Properly check if contributors are present by @timonv
- c5bf796 (ci) Add clippy back to ci (#147) by @timonv
- 7a8843a (deps) Update rust crate testcontainers to 0.20.0 (#133) by @renovate[bot]
- 364e13d (swiftide) Loosen up dependencies (#140) by @timonv
Loosen up dependencies so swiftide is a bit more flexible to add to
existing projects
- 84dd65d [breaking] Rename all mentions of ingest to index (#130) by @timonv
Swiftide is not an ingestion pipeline (loading data), but an indexing
pipeline (prepping for search).
There is now a temporary, deprecated re-export to match the previous api.
BREAKING CHANGE: rename all mentions of ingest to index (#130)
- 51c114c Various tooling & community improvements (#131) by @timonv
- **fix(ci): ensure clippy runs with all features**
- **chore(ci): coverage using llvm-cov**
- **chore: drastically improve changelog generation**
- **chore(ci): add sanity checks for pull requests**
- **chore(ci): split jobs and add typos**
- d2a9ea1 Enable clippy pedantic (#132) by @timonv
- 8405c9e (contributing) Add guidelines on code design (#113) by @timonv
- 3e447fe (readme) Link to CONTRIBUTING (#114) by @timonv
- 4c40e27 (readme) Add back coverage badge by @timonv
- 5691ac9 (readme) Add preproduction warning by @timonv
- 37af322 (rustdocs) Rewrite the initial landing page (#149) by @timonv
- **Add homepage and badges to cargo toml**
- **documentation landing page improvements**
- 7686c2d Templated prompts are now a major feature by @timonv
- @pwalski made their first contribution in #139
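The prompt changes in this release (70ea268) note that prompting with a plain string still works by calling `.into()`. A much-simplified sketch of how such a conversion can be wired up is shown below. This `Prompt` type, its fields, `with_context_value` and the naive `render` are hypothetical stand-ins; the real crate renders templates with tera and its API may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical, much-simplified stand-in for a prompt with context.
#[derive(Debug, Clone, PartialEq)]
pub struct Prompt {
    template: String,
    context: BTreeMap<String, String>,
}

impl Prompt {
    /// Adds a context value that `render` substitutes into the template.
    pub fn with_context_value(mut self, key: &str, value: &str) -> Self {
        self.context.insert(key.to_string(), value.to_string());
        self
    }

    /// Naive `{{ key }}` substitution, standing in for real template rendering.
    pub fn render(&self) -> String {
        let mut out = self.template.clone();
        for (key, value) in &self.context {
            out = out.replace(&format!("{{{{ {key} }}}}"), value);
        }
        out
    }
}

/// Lets callers pass a plain `&str` wherever a `Prompt` is expected.
impl From<&str> for Prompt {
    fn from(template: &str) -> Self {
        Prompt { template: template.to_string(), context: BTreeMap::new() }
    }
}

/// Same conversion for owned strings.
impl From<String> for Prompt {
    fn from(template: String) -> Self {
        Prompt { template, context: BTreeMap::new() }
    }
}
```

With these `From` impls, a call site that previously passed a `&str` only needs an added `.into()`, while callers that want templating can attach context before rendering.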
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.5.0...swiftide-v0.6.0
swiftide-v0.5.0 - 2024-07-01
- 6a88651 (ingestion_pipeline) Implement filter (#109) by @timonv
- 5aeb3a7 (ingestion_pipeline) Splitting and merging streams by @timonv
- 8812fbf (ingestion_pipeline) Build a pipeline from a stream by @timonv
- 6101bed AWS bedrock support (#92) by @timonv
Adds an integration with AWS Bedrock, implementing SimplePrompt for
Anthropic and Titan models. More can be added if there is a need. Same
for the embedding models.
- 17a2be1 (changelog) Add scope by @timonv
- a12cce2 (openai) Add tests for builder by @timonv
- 963919b (transformers) [breaking] Fix too small chunks being retained and api by @timonv
BREAKING CHANGE: Fix too small chunks being retained and api
- 5e8da00 Fix oversight in ingestion pipeline tests by @timonv
- e8198d8 Use git cliff manually for changelog generation by @timonv
- 2c31513 Just use keepachangelog by @timonv
- 6430af7 Use native cargo bench format and only run benchmarks crate by @timonv
- cba981a Replace unwrap with expect and add comment on panic by @timonv
- e243212 (ci) Enable continuous benchmarking and improve benchmarks (#98) by @timonv
- 2dbf14c (ci) Fix benchmarks in ci by @timonv
- b155de6 (ci) Fix naming of github actions by @timonv
- 206e432 (ci) Add support for merge queues by @timonv
- 46752db (ci) Add concurrency configuration by @timonv
- 5f09c11 Add initial benchmarks by @timonv
- 162c6ef Ensure feat is always in Added by @timonv
- 929410c (readme) Add diagram to the readme (#107) by @timonv
- b014f43 Improve documentation across the project (#112) by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.4.3...swiftide-v0.5.0
swiftide-v0.4.3 - 2024-06-28
- ab3dc86 (memory_storage) Fallback to incremental counter when missing id by @timonv
- bdebc24 Clippy by @timonv
- dad3e02 (readme) Add ci badge by @timonv
- 4076092 (readme) Clean up and consistent badge styles by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.4.2...swiftide-v0.4.3
swiftide-v0.4.2 - 2024-06-26
- 926cc0c (ingestion_stream) Implement into for Result<Vec> by @timonv
- 3143308 (embed) Panic if number of embeddings and node are equal by @timonv
- 5ed08bb Cleanup changelog by @timonv
- d285874 (ingestion_pipeline) Log_all combines other log helpers by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.4.1...swiftide-v0.4.2
swiftide-v0.4.1 - 2024-06-24
- 3898ee7 (memory_storage) Can be cloned safely preserving storage by @timonv
- 92052bf (transformers) Allow for arbitrary closures as transformers and batchable transformers by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.4.0...swiftide-v0.4.1
swiftide-v0.4.0 - 2024-06-23
- 477a284 (benchmarks) Add benchmark for the file loader by @timonv
- 1567940 (benchmarks) Add benchmark for simple local pipeline by @timonv
- 2228d84 (examples) Example for markdown with all metadata by @timonv
- 9a1e12d (examples,scraping) Add example scraping and ingesting a url by @timonv
- 15deeb7 (ingestion_node) Add constructor with defaults by @timonv
- 4d5c68e (ingestion_node) Improved human readable Debug by @timonv
- a5051b7 (ingestion_pipeline) Optional error filtering and logging (#75) by @timonv
- 062107b (ingestion_pipeline) Implement throttling a pipeline (#77) by @timonv
- a2ffc78 (ingestion_stream) Improved stream developer experience (#81) by @timonv
Improves stream ergonomics by providing convenient helpers and `Into`
for streams, vectors and iterators that match the internal type.
This means that in many cases, trait implementers can simply call
`.into()` instead of manually constructing a stream. In the case it's an
iterator, they can now use `IngestionStream::iter(<IntoIterator>)`
instead.
- d260674 (integrations) [breaking] Support fastembed (#60) by @timonv
Adds support for FastEmbed with various models. Includes a breaking change, renaming the Embed trait to EmbeddingModel.
BREAKING CHANGE: support fastembed (#60)
- 9004323 (integrations) [breaking] Implement Persist for Redis (#80) by @timonv
BREAKING CHANGE: implement Persist for Redis (#80)
- eb84dd2 (integrations,transformers) Add transformer for converting html to markdown by @timonv
- ef7dcea (loaders) File loader performance improvements by @timonv
- 6d37051 (loaders) Add scraping using `spider` by @timonv
- 2351867 (persist) In memory storage for testing, experimentation and debugging by @timonv
- 4d5d650 (traits) Add automock for simpleprompt by @timonv
- bd6f887 (transformers) Add transformers for title, summary and keywords by @timonv
- 7cbfc4e (ingestion_pipeline) Concurrency does not work when spawned (#76) by @timonv
Concurrency did not work as expected. When spawning via `tokio::spawn`,
the future would be polled directly, and any concurrency setting would
not be respected. Because it had to be removed, tracing for each step
was improved as well.
- f4341ba (ci) Single changelog for all (future) crates in root (#57) by @timonv
- 7dde8a0 (ci) Code coverage reporting (#58) by @timonv
Post test coverage to Coveralls
Also enabled --all-features when running tests in ci, just to be sure
- cb7a2cd (scraping) Exclude spider from test coverage by @timonv
- 7767588 (transformers) Improve test coverage by @timonv
- 3b7c0db Move changelog to root by @timonv
- d6d0215 Properly quote crate name in changelog by @timonv
- f251895 Documentation and feature flag cleanup (#69) by @timonv
With fastembed added, our dependencies became rather heavy. All
integrations are now disabled by default; either enable 'all' or
cherry-pick individual integrations.
- f6656be Cargo update by @timonv
- 53ed920 Hide the table of contents by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.3.3...swiftide-v0.4.0
swiftide-v0.3.3 - 2024-06-16
- bdaed53 (integrations) Clone and debug for integrations by @timonv
- 318e538 (transformers) Builder and clone for chunk_code by @timonv
- c074cc0 (transformers) Builder for chunk_markdown by @timonv
- e18e7fa (transformers) Builder and clone for MetadataQACode by @timonv
- fd63dff (transformers) Builder and clone for MetadataQAText by @timonv
- 678106c (ci) Pretty names for pipelines (#54) by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.3.2...swiftide-v0.3.3
swiftide-v0.3.2 - 2024-06-16
- b211002 (integrations) Qdrant and openai builder should be consistent (#52) by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.3.1...swiftide-v0.3.2
swiftide-v0.3.1 - 2024-06-15
- 6f63866 We love feedback <3 by @timonv
- 7d79b64 Fixing some grammar typos on README.md (#51) by @hectorip
- @hectorip made their first contribution in #51
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.3.0...swiftide-v0.3.1
swiftide-v0.3.0 - 2024-06-14
- 745b8ed (ingestion_pipeline) [breaking] Support chained storage backends (#46) by @timonv
Pipeline now supports multiple storage backends. This makes the order of adding storage important. Changed the name of the method to reflect that.
BREAKING CHANGE: support chained storage backends (#46)
- cd055f1 (ingestion_pipeline) Concurrency improvements (#48) by @timonv
- 1f0cd28 (ingestion_pipeline) Early return if any error encountered (#49) by @timonv
- fa74939 Configurable concurrency for transformers and chunkers (#47) by @timonv
- 473e60e Update linkedin link by @timonv
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.2.1...swiftide-v0.3.0
swiftide-v0.2.1 - 2024-06-13
Full Changelog: https://github.com/bosun-ai/swiftide/compare/swiftide-v0.2.0...swiftide-v0.2.1
swiftide-v0.2.0 - 2024-06-13
- 9ec93be Api improvements with example (#10) by @timonv
- 95a6200 (swiftide) Documented file swiftide/src/ingestion/ingestion_pipeline.rs (#14) by @bosun-ai[bot]
- 7abccc2 (swiftide) Documented file swiftide/src/ingestion/ingestion_stream.rs (#16) by @bosun-ai[bot]
- 755cd47 (swiftide) Documented file swiftide/src/ingestion/ingestion_node.rs (#15) by @bosun-ai[bot]
- 2ea5a84 (swiftide) Documented file swiftide/src/integrations/openai/mod.rs (#21) by @bosun-ai[bot]
- b319c0d (swiftide) Documented file swiftide/src/integrations/treesitter/splitter.rs (#30) by @bosun-ai[bot]
- 29fce74 (swiftide) Documented file swiftide/src/integrations/redis/node_cache.rs (#29) by @bosun-ai[bot]
- 7229af8 (swiftide) Documented file swiftide/src/integrations/qdrant/persist.rs (#24) by @bosun-ai[bot]
- 6240a26 (swiftide) Documented file swiftide/src/integrations/redis/mod.rs (#23) by @bosun-ai[bot]
- 7688c99 (swiftide) Documented file swiftide/src/integrations/qdrant/mod.rs (#22) by @bosun-ai[bot]
- d572c88 (swiftide) Documented file swiftide/src/integrations/qdrant/ingestion_node.rs (#20) by @bosun-ai[bot]
- 14e24c3 (swiftide) Documented file swiftide/src/ingestion/mod.rs (#28) by @bosun-ai[bot]
- 502939f (swiftide) Documented file swiftide/src/integrations/treesitter/supported_languages.rs (#26) by @bosun-ai[bot]
- a78e68e (swiftide) Documented file swiftide/tests/ingestion_pipeline.rs (#41) by @bosun-ai[bot]
- 289687e (swiftide) Documented file swiftide/src/loaders/mod.rs (#40) by @bosun-ai[bot]
- ebd0a5d (swiftide) Documented file swiftide/src/transformers/chunk_code.rs (#39) by @bosun-ai[bot]
- fb428d1 (swiftide) Documented file swiftide/src/transformers/metadata_qa_text.rs (#36) by @bosun-ai[bot]
- 305a641 (swiftide) Documented file swiftide/src/transformers/openai_embed.rs (#35) by @bosun-ai[bot]
- c932897 (swiftide) Documented file swiftide/src/transformers/metadata_qa_code.rs (#34) by @bosun-ai[bot]
- 090ef1b (swiftide) Documented file swiftide/src/integrations/openai/simple_prompt.rs (#19) by @bosun-ai[bot]
- 7cfcc83 Update readme template links and fix template by @timonv
- a717f3d Template links should be underscores by @timonv
- @bosun-ai[bot] made their first contribution in #19
Full Changelog: https://github.com/bosun-ai/swiftide/compare/v0.1.0...swiftide-v0.2.0
v0.1.0 - 2024-06-13
- 2a6e503 (doc) Setup basic readme (#5) by @timonv
- b8f9166 (fluyt) Significant tracing improvements (#368) by @timonv
* fix(fluyt): remove unnecessary cloning and unwraps
* fix(fluyt): also set target correctly on manual spans
* fix(fluyt): do not capture raw result
* feat(fluyt): nicer tracing for ingestion pipeline
* fix(fluyt): remove instrumentation on lazy methods
* feat(fluyt): add useful metadata to the root span
* fix(fluyt): fix dangling spans in ingestion pipeline
* fix(fluyt): do not log codebase in rag utils
- 0986136 (fluyt/code_ops) Add languages to chunker and range for chunk size (#334) by @timonv
* feat(fluyt/code_ops): add more treesitter languages
* fix: clippy + fmt
* feat(fluyt/code_ops): implement builder and support range
* feat(fluyt/code_ops): implement range limits for code chunking
* feat(fluyt/indexing): code chunking supports size
- f10bc30 (ingestion_pipeline) Default concurrency is the number of cpus (#6) by @timonv
- 7453ddc Replace databuoy with new ingestion pipeline (#322) by @timonv
- 054b560 Fix build and add feature flags for all integrations by @timonv
- fdf4be3 (fluyt) Ensure minimal tracing by @timonv
- 389b0f1 Add debug info to qdrant setup by @timonv
- bb905a3 Use rustls on redis and log errors by @timonv
- 458801c Properly connect to redis over tls by @timonv
- ce6e465 (fluyt) Add verbose log on checking if index exists by @timonv
- 6967b0d Make indexing extraction compile by @tinco
- f595f3d Add rust-toolchain on stable by @timonv
- da004c6 Start cleaning up dependencies by @timonv
- cccdaf5 Remove more unused dependencies by @timonv
- 7ee8799 Remove more crates and update by @timonv
- 951f496 Clean up more crates by @timonv
- 1f17d84 Cargo update by @timonv
- 730d879 Create LICENSE by @timonv
- 44524fb Restructure repository and rename (#3) by @timonv
* chore: move traits around
* chore: move crates to root folder
* chore: restructure and make it compile
* chore: remove infrastructure
* fix: make it compile
* fix: clippy
* chore: remove min rust version
* chore: cargo update
* chore: remove code_ops
* chore: settle on swiftide
- e717b7f Update issue templates by @timonv
- 8e22e0e Cleanup by @timonv
- 4d79d27 Tests, tests, tests (#4) by @timonv
- 1036d56 Configure cargo toml (#7) by @timonv
- 0ae98a7 Cleanup Cargo keywords by @timonv
- 0d342ea Models as first class citizens (#318) by @timonv
* refactor: refactor common datastructures to /models
* refactor: promote to first class citizens
* fix: clippy
* fix: remove duplication in http handler
* fix: clippy
* fix: fmt
* feat: update for latest change
* fix(fluyt/models): doctest