wip(#6): start implementing search with milli but stop and reconsider to maybe use tantivy
Toromyx committed May 1, 2024
1 parent cf7080d commit 2d9b999
Showing 16 changed files with 867 additions and 30 deletions.
84 changes: 84 additions & 0 deletions docs/decisions/0001-xyz-as-search-engine.md
@@ -0,0 +1,84 @@
# XYZ as Full-Text Search Engine

## Context and Problem Statement

{Describe the context and problem statement, e.g., in free form using two to three sentences or in the form of an
illustrative story.
You may want to articulate the problem in form of a question and add links to collaboration boards or issue management
systems.}

Which full-text search engine should this application use?

<!-- This is an optional element. Feel free to remove. -->

## Decision Drivers

- {decision driver 1, e.g., a force, facing concern, …}
- {decision driver 2, e.g., a force, facing concern, …}
-<!-- numbers of drivers can vary -->

## Considered Options

- {title of option 1}
- {title of option 2}
- {title of option 3}
-<!-- numbers of options can vary -->

## Decision Outcome

Chosen option: "{title of option 1}", because
{justification. e.g., only option, which meets k.o. criterion decision driver | which resolves force {force} | … | comes
out best (see below)}.

<!-- This is an optional element. Feel free to remove. -->

### Consequences

- Good, because {positive consequence, e.g., improvement of one or more desired qualities, …}
- Bad, because {negative consequence, e.g., compromising one or more desired qualities, …}
-<!-- numbers of consequences can vary -->

<!-- This is an optional element. Feel free to remove. -->

## Validation

{describe how the implementation of/compliance with the ADR is validated. E.g., by a review or an ArchUnit test}

<!-- This is an optional element. Feel free to remove. -->

## Pros and Cons of the Options

### {title of option 1}

<!-- This is an optional element. Feel free to remove. -->

{example | description | pointer to more information | …}

- Good, because {argument a}
- Good, because {argument b}

<!-- use "neutral" if the given argument weighs neither for good nor bad -->

- Neutral, because {argument c}
- Bad, because {argument d}
-<!-- numbers of pros and cons can vary -->

### {title of other option}

{example | description | pointer to more information | …}

- Good, because {argument a}
- Good, because {argument b}
- Neutral, because {argument c}
- Bad, because {argument d}
-

<!-- This is an optional element. Feel free to remove. -->

## More Information

{You might want to provide additional evidence/confidence for the decision outcome here and/or
document the team agreement on the decision and/or
define when and how the decision should be realized and if/when it should be re-visited and/or
how the decision is validated.
Links to other decisions and resources might appear here as well.}
5 changes: 5 additions & 0 deletions src-tauri/Cargo.toml
@@ -32,6 +32,11 @@ version = "^0.4.19"
[dependencies.log4rs]
version = "^1.2"

[dependencies.milli]
version = "1.8.0"
git = "https://github.com/meilisearch/meilisearch.git"
tag = "v1.8.0-rc.2"

[dependencies.mime_guess]
version = "^2.0"
default-features = false
113 changes: 104 additions & 9 deletions src-tauri/src/entity_crud.rs
@@ -1,19 +1,25 @@
//! This module implements create, read, update, delete, list, and count operations for the entities in [`crate::entity`].
use std::fmt::Debug;
use std::{collections::HashSet, fmt::Debug, sync::OnceLock};

use anyhow::Result;
use async_trait::async_trait;
use milli::Index;
use sea_orm::{
sea_query, sea_query::IntoCondition, ActiveModelBehavior, ActiveModelTrait, ColumnTrait,
EntityTrait, FromQueryResult, IntoActiveModel, ModelTrait, PrimaryKeyToColumn, PrimaryKeyTrait,
QueryFilter, QuerySelect, RelationTrait, Select, TransactionTrait, TryFromU64, TryGetable,
TryGetableMany,
EntityName, EntityTrait, FromQueryResult, IntoActiveModel, ModelTrait, PrimaryKeyToColumn,
PrimaryKeyTrait, QueryFilter, QuerySelect, RelationTrait, Select, TransactionTrait, TryFromU64,
TryGetable, TryGetableMany,
};
use sea_query::{FromValueTuple, IntoValueTuple};
use sea_query::{FromValueTuple, Iden, IntoValueTuple};
use serde::{Deserialize, Serialize};
use serde_json::Value;

use crate::{database, window::get_window};
use crate::{
database,
search_index::{search_index_add, search_index_delete, search_index_init},
window::get_window,
};

pub mod file;
pub mod ingredient;
@@ -48,6 +54,21 @@ where
}
}

pub trait EntityDocumentTrait: Serialize + Sized {
type Model;

fn from_model(model: Self::Model) -> Self;

fn into_object(self) -> milli::Object {
let value = serde_json::to_value(self)
.expect("Serializing a struct to a serde_json::Value should not fail.");
match value {
Value::Object(map) => map,
_ => unreachable!("A struct converted to a serde_json::Value should always be a map."),
}
}
}

/// This struct represents just getting the id column of a database table.
///
/// This is needed for listing entity ids.
@@ -115,7 +136,7 @@ pub trait EntityCrudTrait {
Column = Self::Column,
Relation = Self::Relation,
PrimaryKey = Self::PrimaryKey,
>;
> + Default;

/// the entity's model, implementing [`ModelTrait`]
type Model: ModelTrait<Entity = Self::Entity>
@@ -146,14 +167,18 @@ pub trait EntityCrudTrait {
+ TryFromU64
+ TryGetable
+ Serialize
+ Clone;
+ Clone
+ ToString;

/// the struct with which to create an entity, implementing [`TryIntoActiveModel<Self::ActiveModel>`]
type EntityCreate: TryIntoActiveModel<Self::ActiveModel> + Send;

/// the struct with which to update an entity, implementing [`TryIntoActiveModel<Self::ActiveModel>`]
type EntityUpdate: TryIntoActiveModel<Self::ActiveModel> + Send;

/// the struct with which the search index is built, implementing [`EntityDocumentTrait`]
type EntityDocument: EntityDocumentTrait<Model = Self::Model> + Send;

/// the struct with which to filter an entity list or count, implementing [`IntoCondition`]
type EntityCondition: IntoCondition + Send;

@@ -166,6 +191,7 @@ pub trait EntityCrudTrait {
///
/// - when there is any problem with the database
/// - when the tauri window can't be messaged about the created entity
/// - when there is any problem with the search index
async fn create(
create: Self::EntityCreate,
) -> Result<<Self::PrimaryKey as PrimaryKeyTrait>::ValueType> {
@@ -174,6 +200,9 @@ pub trait EntityCrudTrait {
let active_model = create.try_into_active_model().await?;
let model = active_model.insert(&txn).await?;
txn.commit().await?;
let search_index = Self::search_index();
let document = Self::EntityDocument::from_model(model.clone());
search_index_add(search_index, &document.into_object())?;
get_window().emit(Self::entity_action_created_channel(), ())?;
Ok(Self::primary_key_value(&model))
}
@@ -197,11 +226,15 @@ pub trait EntityCrudTrait {
///
/// - when there is any problem with the database
/// - when the tauri window can't be messaged about the updated entity
/// - when there is any problem with the search index
async fn update(update: Self::EntityUpdate) -> Result<Self::Model> {
let db = database::connect_writing().await;
let txn = db.begin().await?;
let model = update.try_into_active_model().await?.update(&txn).await?;
txn.commit().await?;
let search_index = Self::search_index();
let document = Self::EntityDocument::from_model(model.clone());
search_index_add(search_index, &document.into_object())?;
get_window().emit(
Self::entity_action_updated_channel(),
Self::primary_key_value(&model),
@@ -215,16 +248,19 @@ pub trait EntityCrudTrait {
///
/// - when there is any problem with the database
/// - when the tauri window can't be messaged about the deleted entity
/// - when there is an error in [`Self::pre_delete`]
/// - when there is any problem with the search index
async fn delete(id: <Self::PrimaryKey as PrimaryKeyTrait>::ValueType) -> Result<()> {
let db = database::connect_writing().await;
let txn = db.begin().await?;
let model_option = Self::Entity::find_by_id(id.clone()).one(&txn).await?;
let Some(model) = model_option else {
return Ok(());
};
let search_index_primary_key_value = Self::search_index_primary_key_value(&model);
model.delete(&txn).await?;
txn.commit().await?;
let search_index = Self::search_index();
search_index_delete(search_index, search_index_primary_key_value)?;
get_window().emit(Self::entity_action_deleted_channel(), id)?;
Ok(())
}
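
In the `delete` hunk above, the search index key is computed before `model.delete(&txn)` is called, presumably because that call takes the model by value. A minimal std-only sketch of that ordering follows. `Model`, `Store`, `find_by_id`, and `delete_model` are hypothetical stand-ins for illustration, not the sea-orm or milli APIs, and two in-memory sets play the roles of the database table and the search index.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a database model; only the id matters here.
pub struct Model {
    pub id: i64,
}

/// Hypothetical stand-in for "database plus search index".
pub struct Store {
    /// ids present in the "database"
    pub rows: HashSet<i64>,
    /// document keys present in the "search index"
    pub index: HashSet<String>,
}

impl Store {
    /// Looks up a row and returns an owned model, or `None` if it is missing.
    pub fn find_by_id(&self, id: i64) -> Option<Model> {
        self.rows.contains(&id).then_some(Model { id })
    }

    /// Consumes the model while deleting its row.
    pub fn delete_model(&mut self, model: Model) {
        self.rows.remove(&model.id);
    }

    /// Deletes the row first, then removes the matching index document.
    pub fn delete(&mut self, id: i64) {
        let Some(model) = self.find_by_id(id) else {
            return; // nothing to delete, mirroring the early `Ok(())`
        };
        // Capture the index key while the model is still available.
        let key = model.id.to_string();
        self.delete_model(model); // `model` is moved here
        self.index.remove(&key);
    }
}
```

Note that in this sketch, as in the hunk, the index is only touched after the row is gone, so a failure between the two steps would leave a stale index entry.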
@@ -275,6 +311,18 @@ pub trait EntityCrudTrait {
Ok(count)
}

/// Search the search index for entities.
///
/// # Errors
///
/// - when there is any problem with the search index
async fn search() -> Result<Vec<Self::PrimaryKeyValue>> {
// TODO search search_index, build parameters according to what milli can do
// https://www.meilisearch.com/docs/reference/api/search#customize-attributes-to-search-on-at-search-time
// https://www.meilisearch.com/docs/learn/fine_tuning_results/filtering
todo!()
}

/// Get the primary key value from the entity model.
fn primary_key_value(model: &Self::Model) -> <Self::PrimaryKey as PrimaryKeyTrait>::ValueType;

@@ -289,4 +337,51 @@ pub trait EntityCrudTrait {

/// Get the tauri event channel for a deleted entity.
fn entity_action_deleted_channel() -> &'static str;

/// Get the primary key name for the search index.
///
/// This is used in [`milli::update::Settings::set_primary_key`].
fn search_index_primary_key_field() -> String {
Self::primary_key_colum().to_string()
}

/// Get the primary key value for the search index.
///
/// This is used for [`milli::update::index_documents::IndexDocuments::remove_documents`].
fn search_index_primary_key_value(model: &Self::Model) -> String {
Self::primary_key_value(model).to_string()
}

/// Get the searchable fields for the search index.
///
/// This is used in [`milli::update::Settings::set_searchable_fields`].
/// The order is the [attribute ranking order](https://www.meilisearch.com/docs/learn/core_concepts/relevancy#attribute-ranking-order).
/// TODO what fields should be searchable?
fn searchable_fields() -> Vec<String>;

/// Get the filterable fields for the search index.
///
/// This is used in [`milli::update::Settings::set_filterable_fields`].
/// TODO what fields should be filterable?
fn filterable_fields() -> HashSet<String>;

/// Get the sortable fields for the search index.
///
/// This is used in [`milli::update::Settings::set_sortable_fields`].
/// TODO what fields should be sortable?
fn sortable_fields() -> HashSet<String>;

fn search_index_once() -> &'static OnceLock<Index>;

fn search_index() -> &'static Index {
Self::search_index_once().get_or_init(|| {
search_index_init(
Self::Entity::default().table_name(),
Self::search_index_primary_key_field(),
Self::searchable_fields(),
Self::filterable_fields(),
Self::sortable_fields(),
)
})
}
}
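
The `search_index_once` / `search_index` pair above has each implementor hand out its own `static OnceLock`, with the trait's default method doing the lazy initialization. A minimal std-only sketch of that pattern follows. `FakeIndex`, `HasSearchIndex`, `index_name`, and `INIT_CALLS` are hypothetical names introduced for illustration; `FakeIndex` stands in for `milli::Index`, and the counter exists only to show that the initializer runs once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for `milli::Index`; only a name is stored.
#[derive(Debug)]
pub struct FakeIndex {
    pub name: String,
}

/// Counts how often the initializer actually runs.
pub static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

pub trait HasSearchIndex {
    /// Name used to create the index, e.g. a table name.
    fn index_name() -> &'static str;

    /// Each implementor returns a reference to its own static cell.
    fn search_index_once() -> &'static OnceLock<FakeIndex>;

    /// Initializes the index on first access and reuses it afterwards.
    fn search_index() -> &'static FakeIndex {
        Self::search_index_once().get_or_init(|| {
            INIT_CALLS.fetch_add(1, Ordering::SeqCst);
            FakeIndex {
                name: Self::index_name().to_string(),
            }
        })
    }
}

/// One static cell per implementor, as in `file.rs`.
static FILE_INDEX_ONCE: OnceLock<FakeIndex> = OnceLock::new();

pub struct FileCrud;

impl HasSearchIndex for FileCrud {
    fn index_name() -> &'static str {
        "file"
    }

    fn search_index_once() -> &'static OnceLock<FakeIndex> {
        &FILE_INDEX_ONCE
    }
}
```

Calling `FileCrud::search_index()` repeatedly returns the same `&'static FakeIndex`; a second implementor would declare a second static and get an independent index.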
40 changes: 37 additions & 3 deletions src-tauri/src/entity_crud/file.rs
@@ -1,28 +1,34 @@
//! This module implements [`EntityCrudTrait`] for [`crate::entity::file`].
use std::{fs, str::FromStr};
use std::{collections::HashSet, fs, str::FromStr, sync::OnceLock};

use anyhow::Result;
use async_trait::async_trait;
use milli::Index;
use mime_guess::mime;
use reqwest::header;
use sea_orm::{
sea_query::IntoCondition, ActiveValue, ColumnTrait, Condition, IntoActiveModel, QueryOrder,
Select,
};
use serde::Deserialize;
use sea_query::IntoIden;
use serde::{Deserialize, Serialize};
use tempfile::NamedTempFile;
use url::Url;

use crate::{
entity::file::{ActiveModel, Column, Entity, Model, PrimaryKey, Relation},
entity_crud::{EntityCrudTrait, Filter, Order, OrderBy, TryIntoActiveModel},
entity_crud::{
EntityCrudTrait, EntityDocumentTrait, Filter, Order, OrderBy, TryIntoActiveModel,
},
event::channel::{
ENTITY_ACTION_CREATED_FILE, ENTITY_ACTION_DELETED_FILE, ENTITY_ACTION_UPDATED_FILE,
},
file_storage,
};

static SEARCH_INDEX_ONCE: OnceLock<Index> = OnceLock::new();

#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase")]
pub enum FileCreateUri {
@@ -99,6 +105,17 @@ impl IntoActiveModel<ActiveModel> for FileUpdate {
}
}

#[derive(Debug, Serialize)]
pub struct FileDocument {}

impl EntityDocumentTrait for FileDocument {
type Model = Model;

fn from_model(_model: Self::Model) -> Self {
todo!()
}
}

pub type FileFilter = Filter<FileCondition, FileOrderBy>;

#[derive(Debug, Deserialize)]
@@ -142,6 +159,7 @@ impl EntityCrudTrait for FileCrud {
type PrimaryKeyValue = i64;
type EntityCreate = FileCreate;
type EntityUpdate = FileUpdate;
type EntityDocument = FileDocument;
type EntityCondition = FileCondition;
type EntityOrderBy = FileOrderBy;

@@ -164,6 +182,22 @@ impl EntityCrudTrait for FileCrud {
fn entity_action_deleted_channel() -> &'static str {
ENTITY_ACTION_DELETED_FILE
}

fn searchable_fields() -> Vec<String> {
vec![Column::Name.into_iden().to_string()]
}

fn filterable_fields() -> HashSet<String> {
HashSet::from([])
}

fn sortable_fields() -> HashSet<String> {
HashSet::from([Column::Name.into_iden().to_string()])
}

fn search_index_once() -> &'static OnceLock<Index> {
&SEARCH_INDEX_ONCE
}
}
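
`searchable_fields` returns an ordered `Vec` while the filterable and sortable fields are `HashSet`s, because the order of searchable fields is the attribute ranking order: a match in an earlier field should rank higher. The following is a deliberately simplified std-only sketch of that idea, not milli's ranking algorithm. The function `attribute_rank` and the `description` field used in the usage note are hypothetical.

```rust
use std::collections::HashMap;

/// Returns the position of the first searchable field whose value contains
/// `query` (case-insensitively), or `None` if no searchable field matches.
/// A lower position means a better rank.
pub fn attribute_rank(
    searchable_fields: &[&str],
    document: &HashMap<&str, &str>,
    query: &str,
) -> Option<usize> {
    let needle = query.to_lowercase();
    searchable_fields.iter().position(|field| {
        document
            .get(*field)
            .is_some_and(|value| value.to_lowercase().contains(&needle))
    })
}
```

With fields ordered as `["name", "description"]`, a document matching in `name` gets rank 0 and one matching only in `description` gets rank 1, which is why the field order has to be preserved.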

#[cfg(test)]