Initial service commit #1
Conversation
README.md (Outdated)

```commandline
export ELASTIC_PASSWORD=PASSWORD_HERE
```
Could you also suggest sensible defaults for the other variables, or maybe provide an env file? E.g. for MEM_LIMIT, STACK_VERSION, CLUSTER_NAME, ES_PORT etc.
```commandline
export STACK_VERSION=8.3.3
export CLUSTER_NAME=elasticcluster
export MEM_LIMIT=1g
export ES_PORT=7777
```
seem to be enough to launch the docker containers
Absolutely - there was an .env file, but it wasn't committed due to the .gitignore. I'll add this now.
Note: I found that 2GB of memory led to better performance, so I'll put that in. It still works well with 1GB if we're memory constrained.
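For reference, a minimal set of exports along these lines (values collected from this thread; the 2g memory limit reflects the note above, and ELASTIC_PASSWORD stays a placeholder) might be:

```shell
# Sketch of the environment discussed in this thread -- adjust to taste.
export ELASTIC_PASSWORD=PASSWORD_HERE   # placeholder, set your own
export STACK_VERSION=8.3.3
export CLUSTER_NAME=elasticcluster
export MEM_LIMIT=2g                     # 1g also works if memory constrained
export ES_PORT=7777
```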
README.md (Outdated)

```console
uvicorn api:app --reload
```
Sorry, I'm a bit lost, as I know little Docker and Python unfortunately. Where do I need to run uvicorn? In one of the docker containers (elasticvue?), or in a Python virtual env outside of docker, where I should install everything listed in requirements.txt?
If you can put a bit more detail in the Helpful commands for newbies like me, it would be very helpful indeed! :)
Our Python and Docker expert is @alexgarel, but he's on vacation until August 16th.
Absolutely! So, I've now structured this so that the search service runs in Docker, and I've added commands both for using Docker and for running locally.
Note: as part of this I moved everything inside an app directory, so the PR change looks bigger than it is.
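For anyone landing here later, the two modes might look roughly like this (the compose setup and the `app.api:app` module path are assumptions based on the `app/` move, not verified against the final PR):

```console
# In Docker (assumed docker-compose setup):
docker compose up

# Or locally, outside Docker, in a virtual env:
pip install -r requirements.txt
uvicorn app.api:app --reload
```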
That looks awesome, thank you very much @simonj2! The CSV export is very partial, so at some point it will be better to switch to the MongoDB export.

There are a few things that we'll need to think about at some point. In particular, taxonomized fields like the categories are not stored with searchable textual names. The list of categories is in categories_tags, and it's an array with entries like "en:coffee". There is a field "categories" which can be in any language and should not be used for search. One solution could be to include values for one language (e.g. for a French product, if there's categories_tags: "en:coffee", we also include categories: "café"). Then search would work for one language (not so good for products sold in countries with multiple languages, like Belgium or Canada).

For language-specific fields like product_name, we also have entries like product_name_en, product_name_fr etc. (product_name contains a copy of the value for the main language of the product).
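To make the field layout described above concrete, a hypothetical product record (field names from the comment, values invented) could look like:

```python
# Hypothetical product record illustrating the fields described above
# (invented values, not real Open Food Facts data).
product = {
    "product_name": "Café moulu",       # copy of the main-language value
    "product_name_fr": "Café moulu",
    "product_name_en": "Ground coffee",
    "categories": "café",               # free text, main language only
    "categories_tags": ["en:coffee"],   # taxonomized, searchable ids
}

# The main-language copy matches the suffixed field:
main_lang = "fr"
assert product[f"product_name_{main_lang}"] == product["product_name"]
```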
Thanks for the feedback @stephanegigandet ! I've incorporated the comments you had about the code.
Makes sense - will look at this in the future.
Interesting! I think this comes back to the schema discussion. Fields aren't discoverable, documented, or consistent (i.e., sometimes the language is embedded in the field name, sometimes in the value). A unified schema for data representation could solve this, i.e., every field could be stored as a map of language code -> value:
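As a sketch of that unified shape (purely hypothetical, not the current schema):

```python
# Hypothetical unified representation: every localizable field becomes a
# map of language code -> value (a sketch, not the actual schema).
product = {
    "product_name": {"en": "Ground coffee", "fr": "Café moulu"},
    "categories": {"en": ["coffee"], "fr": ["café"]},
}

# Clients could then pick the language they need directly:
assert product["product_name"]["fr"] == "Café moulu"
```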
This could then be returned to the clients directly, and a convenience API could be offered to return the correct value if a language parameter is provided. The same logic could apply for tags (filtering if a language parameter is provided). I could write logic to do things like:
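A guess at the kind of consolidation logic meant here (illustrative only; the prefix argument and the two-letter suffix check are assumptions, not code from the PR):

```python
# Collapse suffixed fields such as product_name_fr into one map of
# language code -> value. The hard-coded two-letter check is crude: new
# field names or longer codes would slip through, which is part of the
# fragility mentioned below.
def localized_map(doc, prefix):
    """Collect {lang: value} from keys like product_name_fr."""
    result = {}
    for key, value in doc.items():
        suffix = key[len(prefix) + 1:]
        if key.startswith(prefix + "_") and len(suffix) == 2:
            result[suffix] = value
    return result

doc = {
    "product_name": "Café moulu",
    "product_name_fr": "Café moulu",
    "product_name_en": "Ground coffee",
}
assert localized_map(doc, "product_name") == {
    "fr": "Café moulu",
    "en": "Ground coffee",
}
```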
But I think that's probably fragile: new fields will break this, we don't make our API consumers' lives any easier by consolidating fields, etc. Perhaps let's see how the schema work lands; then it could be incorporated here quite easily?
It looks very good to me, thank you!
What
Initial commit of the search service.
Sorry for the large commit, but this allowed me to scope out the proposal better.
Contains: