-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds Search module and create_index command in python #868
Conversation
583bbdf
to
800300a
Compare
10e0e5a
to
3215278
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll continue tomorrow
python/python/glide/async_commands/redis_modules/search/commands.py
Outdated
Show resolved
Hide resolved
schema (List[Field]): The schema of the index, specifying the fields and their types. | ||
max_text_fields (bool): Forces RediSearch to encode indexes as if there were more than 32 text attributes. | ||
Additional attributes (beyond 32) can be added using FT.ALTER. | ||
temporary_seconds (Optional[int]): Creates a temporary index that automatically expires after a set period of inactivity, measured in seconds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use the description of this filed from redis site
python/python/glide/async_commands/redis_modules/search/commands.py
Outdated
Show resolved
Hide resolved
python/python/glide/async_commands/redis_modules/search/field.py
Outdated
Show resolved
Hide resolved
args = super().get_field_args() | ||
args.append("TEXT") | ||
if not self.sortable and self.no_index: | ||
raise ValueError("Cannot be both non-storable and non-indexed.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not doing it in the object init function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also you can make it more informative - "no_index
can only be set when sortable
is provided" (something like that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checking inside get_field_args
is better since user can do something like this:
index = Index(name="name" ,sortable=Sortable.SORTABLE , no_idex=false)
index.no_index = true
...
and we want to let user change some attributes because we want to make the object reusable, so the best thing to do is to check before executing the command.
python/python/glide/async_commands/redis_modules/search/field.py
Outdated
Show resolved
Hide resolved
python/python/glide/async_commands/redis_modules/search/field.py
Outdated
Show resolved
Hide resolved
156f1fc
to
c846677
Compare
python/python/glide/async_commands/redis_modules/search/commands.py
Outdated
Show resolved
Hide resolved
python/python/glide/async_commands/redis_modules/search/commands.py
Outdated
Show resolved
Hide resolved
temporary_seconds (Optional[int]): Creates a lightweight temporary index that automatically expires after a set period of inactivity, measured in seconds. | ||
The index's internal idle timer resets each time a search or addition operation occurs. | ||
Because such indexes are lightweight, you can create thousands of such indexes without negative performance implications and, therefore, | ||
you should consider using SKIPINITIALSCAN to avoid costly scanning. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what SKIPINITIALSCAN means? how to I set it in the client?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its the last argument in this function -> initial_scan
, can be set with InitialScan.SKIP_SCAN
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand but the description should say that
python/python/glide/async_commands/redis_modules/search/optional_params.py
Show resolved
Hide resolved
python/python/glide/async_commands/redis_modules/search/optional_params.py
Outdated
Show resolved
Hide resolved
python/python/glide/async_commands/redis_modules/search/commands.py
Outdated
Show resolved
Hide resolved
python/python/glide/async_commands/redis_modules/search/commands.py
Outdated
Show resolved
Hide resolved
python/python/glide/async_commands/redis_modules/search/commands.py
Outdated
Show resolved
Hide resolved
python/python/glide/async_commands/redis_modules/search/field.py
Outdated
Show resolved
Hide resolved
no_stem (bool): Indicate whether to disable stemming when indexing field values. | ||
no_index (bool): Indicate whether the field should be excluded from indexing. | ||
phonetic (Optional[PhoneticType]): Activate phonetic matching for text searches, specifies the phonetic algorithm and language used (e.g., English: dm:en). | ||
weight (Optional[float]): Specify attribute importance of this attribute when calculating result accuracy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rewrite the description, use the one from redis's
python/python/glide/async_commands/redis_modules/search/field.py
Outdated
Show resolved
Hide resolved
""" | ||
|
||
|
||
class Field: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand it right the Field class shouldn't be used with create_index, only the extended versions of it.
If so, please make it an abstract class, so the get_field_args would be an abstract function and instead have another function implemented only in the base class get_filed_identifier
which will return the filed name and alias
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got until GeoField. will continue tomorrow.
01d6b1a
to
53899f2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shohamazon very impressive work! I can see you've put a lot of thought into making this interface accessible to users.
approving, but @barshaul has the final word on this.
python/python/tests/test_search.py
Outdated
assert "search" in res["module"] | ||
|
||
@pytest.mark.parametrize("cluster_mode", [True, False]) | ||
async def test_json_module_loadded(self, redis_client: TRedisClient): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you testing the JSON module in the search module tests?
also - loadded -> loaded
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to create json indexes, json module needs to be loaded, maybe it would be better to combine both tests into a single test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should combine the modules tests, and I think that this test should belong to json tests only. You can have an assertion in the json indexes tests to verify that json module was loaded
GeoField(name="location"), | ||
] | ||
|
||
assert ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all of these tests only test that the call returns OK. Is this really the most meaningful test available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for now, when I add FT.SEARCH the tests will be more meaningful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please also add tests for edge cases - passing an empty string for prefix/filter, combining mutually exclusive args, passing nulls etc.. make sure we raise appropriate exceptions
Thank you 🙂 :) |
|
||
class Field(ABC): | ||
""" | ||
Base class for defining fields in a schema. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Abstract base class
args.extend(self.sortable.value.split(" ")) | ||
if self.no_index: | ||
args.append("NOINDEX") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove newline
|
||
Args: | ||
index (str): The name of the index. | ||
index_type (IndexType): The type of the index, supports HASH (default) and JSON. To index JSON, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't document what options IndexType has, as it might changed and then we will need to maintain it here too. but you can say it is default to HASH
prefix (Optional[List[str]]): The prefix for the index, tells the index which keys it should index. Multiple prefixes can be added. | ||
If not set, the default is '*' (all keys). | ||
filter (Optional[str]): A filter expression. It is possible to use "@__key" to access the key that was just added/changed. | ||
language (Optional[str]): The default language for documents in the index. Default is English. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove "Default is English. "-> this is a server configuration that might be changed by redis, and by default we're passing null - not specific value
List[str]: A list of index attributes. | ||
""" | ||
args = [self.name] | ||
if self.index_type: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how can it be false? it has a default value
args = [self.name] | ||
if self.index_type: | ||
args.extend(["ON", self.index_type.value]) | ||
if self.prefix and len(self.prefix) > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please raise a value error if len(self.prefix) == 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shohamazon Finally, I finished reviewing all. Please make the required fixes and let me know when it's ready for another round.
f72df1b
to
76f55be
Compare
@barshaul ready |
) -> TOK: | ||
""" | ||
Creates a new search index with the specified configuration. | ||
See https://redis.io/commands/ft.create/ for more details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://redis.io/commands/ft.create/ for more details. | |
See https://redis.io/commands/ft.create/ for more details. |
max_text_fields: bool = False, | ||
temporary_seconds: Optional[int] = None, | ||
offset: Offset = Offset.STORE_OFFSET, | ||
highlight: Highlights = Highlights.USE_HIGHLIGHTS, | ||
field_flag: FieldFlag = FieldFlag.USE_FIELDS, | ||
frequencies: Frequencies = Frequencies.SAVE_FREQUENCIES, | ||
stop_words: Optional[List[str]] = None, | ||
initial_scan: InitialScan = InitialScan.SCAN_INDEX, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider creating CreateIndexOptions
with these parameters inside
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then you can do options.toArgs()
below
sortable (Optional[SortableOption]): If not None, sets the field as sortable with the specified sortable option. | ||
no_stem (bool): Indicate whether to disable stemming when indexing field values. | ||
no_index (bool): Indicate whether the field should be excluded from indexing. | ||
phonetic (Optional[PhoneticType]): Activate phonetic matching for text searches, specifies the phonetic algorithm and language used (e.g., English: dm:en). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
phonetic (Optional[PhoneticType]): Activate phonetic matching for text searches, specifies the phonetic algorithm and language used (e.g., English: dm:en). | |
phonetic (Optional[PhoneticType]): Activate phonetic matching for text searches, specifies the phonetic algorithm and language used (e.g., `DM_ENGLISH`). |
closing for now, will reopen when modules are supported |
Description of changes:
Adding the
FT.CREATE
command of RediSearch in python.