Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #1 from weni-ai/products-index
Products index
Showing 23 changed files with 3,050 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
[report] | ||
exclude_lines = pass |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
name: CI | ||
|
||
on: | ||
push: | ||
branches: | ||
- '*' | ||
pull_request: | ||
branches: | ||
- '*' | ||
|
||
jobs: | ||
build: | ||
runs-on: ubuntu-latest | ||
|
||
steps: | ||
- name: Checkout code | ||
uses: actions/checkout@v2 | ||
|
||
- name: Set up Python | ||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: '3.10' | ||
|
||
- name: Install project dependencies | ||
run: | | ||
pip install poetry | ||
poetry install | ||
working-directory: ${{ github.workspace }} | ||
|
||
- name: Run tests | ||
run: | | ||
poetry run coverage run -m unittest discover ./app/tests/ | ||
poetry run coverage report | ||
poetry run coverage xml | ||
working-directory: ${{ github.workspace }} | ||
|
||
- name: Upload coverage report | ||
uses: codecov/codecov-action@v2 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
FROM python:3.10-slim | ||
|
||
WORKDIR /app | ||
|
||
RUN pip install poetry | ||
|
||
COPY pyproject.toml poetry.lock ./ | ||
|
||
RUN poetry config virtualenvs.create false && \ | ||
poetry install --only main | ||
|
||
COPY . . | ||
|
||
EXPOSE 8000 | ||
|
||
COPY entrypoint.sh /entrypoint.sh | ||
|
||
RUN chmod +x /entrypoint.sh | ||
|
||
CMD ["/entrypoint.sh"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,188 @@ | ||
# microservice-ia | ||
[![CI](https://github.com/weni-ai/SentenX/actions/workflows/ci.yaml/badge.svg)](https://github.com/weni-ai/SentenX/actions/workflows/ci.yaml) | ||
|
||
# SentenX | ||
|
||
microservice that uses a sentence transformer model to index and search records. | ||
|
||
## Table of Contents | ||
|
||
1. [Requirements](#requirements) | ||
2. [Quickstart](#quickstart) | ||
3. [Usage](#usage) | ||
4. [Test](#test) | ||
|
||
## Requirements | ||
|
||
* python 3.10 | ||
* elasticsearch 8.9.1 | ||
|
||
## Quickstart | ||
from the root directory of this project, run the following commands to: | ||
|
||
set up the required SageMaker keys and the Elasticsearch URL as environment variables | ||
|
||
``` | ||
export AWS_ACCESS_KEY_ID=YOUR_SAGEMAKER_AWS_ACCESS_KEY | ||
export AWS_SECRET_ACCESS_KEY=YOUR_SAGEMAKER_AWS_SECRET_ACCESS_KEY | ||
export ELASTICSEARCH_URL=YOUR_ELASTICSEARCH_URL | ||
``` | ||
|
||
install poetry | ||
``` | ||
pip install poetry | ||
``` | ||
|
||
create a python 3.10 virtual environment | ||
``` | ||
poetry env use 3.10 | ||
``` | ||
|
||
activate the environment | ||
``` | ||
poetry shell | ||
``` | ||
|
||
install dependencies | ||
``` | ||
poetry install | ||
``` | ||
|
||
start the microservice | ||
``` | ||
uvicorn app.main:main_app.api --reload | ||
``` | ||
|
||
### Docker compose | ||
|
||
to start SentenX together with Elasticsearch using Docker Compose: | ||
|
||
set `AWS_SECRET_ACCESS_KEY` and `AWS_ACCESS_KEY_ID` in `docker-compose.yml`, then run: | ||
``` | ||
docker compose up -d | ||
``` | ||
|
||
to stop: | ||
``` | ||
docker compose down | ||
``` | ||
|
||
to rebuild and start after any change to the source: | ||
``` | ||
docker compose up -d --build | ||
``` | ||
|
||
|
||
## Usage | ||
|
||
### To index a product | ||
|
||
request: | ||
```bash | ||
curl -X PUT http://localhost:8000/products/index \ | ||
-H 'Content-Type: application/json' \ | ||
-d '{ | ||
"catalog_id": "cat1", | ||
"product": { | ||
"facebook_id": "123456789", | ||
"title": "massa para bolo de baunilha", | ||
"org_id": "1", | ||
"channel_id": "5", | ||
"catalog_id": "cat1", | ||
"product_retailer_id": "pp1" | ||
} | ||
} | ||
' | ||
``` | ||
response: | ||
```json | ||
status: 200 | ||
{ | ||
"catalog_id": "cat1", | ||
"documents": [ | ||
"cac65148-8c1d-423c-a022-2a52cdedcd3c" | ||
] | ||
} | ||
``` | ||
|
||
### To index products in batch | ||
|
||
request: | ||
```bash | ||
|
||
curl -X PUT http://localhost:8000/products/batch \ | ||
-H 'Content-Type: application/json' \ | ||
-d '{ | ||
"catalog_id": "asdfgh", | ||
"products": [ | ||
{ | ||
"facebook_id": "1234567891", | ||
"title": "banana prata 1kg", | ||
"org_id": "1", | ||
"channel_id": "5", | ||
"catalog_id": "asdfgh", | ||
"product_retailer_id": "p1" | ||
}, | ||
{ | ||
"facebook_id": "1234567892", | ||
"title": "doce de banana 250g", | ||
"org_id": "1", | ||
"channel_id": "5", | ||
"catalog_id": "asdfgh", | ||
"product_retailer_id": "p2" | ||
} | ||
] | ||
}' | ||
``` | ||
|
||
response: | ||
```json | ||
status: 200 | ||
|
||
{ | ||
"catalog_id": "asdfgh", | ||
"documents": [ | ||
"f5b8d394-eb62-4c92-9501-51a8ebcf1380", | ||
"bcb551e8-0bd1-4ca7-825b-cf8aa8a3f0e0" | ||
] | ||
} | ||
``` | ||
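The two indexing calls above differ only in their path and in whether the body carries `product` or `products`. A minimal Python sketch of that difference, assuming the service listens on `http://localhost:8000` as in the curl examples. `build_index_request` is a hypothetical helper, not part of SentenX, and it only builds the request without sending it:

```python
import json
import urllib.request


def build_index_request(base_url, catalog_id, products):
    """Build a PUT request for /products/index (one product) or /products/batch (several).

    Hypothetical helper: the paths and body shapes are taken from the curl examples.
    """
    if not products:
        raise ValueError("at least one product is required")
    if len(products) == 1:
        path = "/products/index"
        body = {"catalog_id": catalog_id, "product": products[0]}
    else:
        path = "/products/batch"
        body = {"catalog_id": catalog_id, "products": products}
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it, which requires a running service.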
|
||
### To search for products | ||
|
||
request | ||
```bash | ||
curl http://localhost:8000/products/search \ | ||
-H 'Content-Type: application/json' \ | ||
-d '{ | ||
"search": "massa", | ||
"filter": { | ||
"catalog_id": "cat1" | ||
}, | ||
"threshold": 1.6 | ||
} | ||
' | ||
``` | ||
response: | ||
```json | ||
status: 200 | ||
{ | ||
"products": [ | ||
{ | ||
"facebook_id": "123456789", | ||
"title": "massa para bolo de baunilha", | ||
"org_id": "1", | ||
"channel_id": "5", | ||
"catalog_id": "cat1", | ||
"product_retailer_id": "pp1" | ||
} | ||
] | ||
} | ||
``` | ||
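A minimal Python sketch of the search call above, assuming the endpoint accepts POST (which is what curl sends when `-d` is given without `-X`) and that the response has the `products` shape shown. `build_search_request` and `titles` are hypothetical helpers, not part of SentenX:

```python
import json
import urllib.request


def build_search_request(base_url, search, catalog_id, threshold=1.6):
    """Build the search request; the default threshold is the value from the curl example."""
    body = {
        "search": search,
        "filter": {"catalog_id": catalog_id},
        "threshold": threshold,
    }
    # With data set and no explicit method, urllib issues a POST.
    return urllib.request.Request(
        base_url.rstrip("/") + "/products/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def titles(response_body):
    """Extract product titles from a response body shaped like the example above."""
    return [product["title"] for product in json.loads(response_body)["products"]]
```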
|
||
## Test | ||
|
||
the tests live in `./app/tests` and are run with unittest discovery, under coverage: | ||
``` | ||
coverage run -m unittest discover -s app/tests | ||
``` | ||
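`unittest` discovery collects files matching `test*.py` by default. A minimal sketch of a test module that the command above would pick up if saved as, say, `app/tests/test_example.py`. The file name and the `normalize_title` helper are hypothetical and exist only to give the test something to check:

```python
import unittest


def normalize_title(title):
    """Hypothetical helper: lowercase a title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


class TestNormalizeTitle(unittest.TestCase):
    def test_lowercases_and_collapses_whitespace(self):
        self.assertEqual(normalize_title("  Banana   Prata 1kg "), "banana prata 1kg")
```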
|
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
import os | ||
|
||
|
||
class AppConfig: | ||
def __init__(self): | ||
self.product_index_name = os.environ.get( | ||
"INDEX_PRODUCTS_NAME", "catalog_products" | ||
) | ||
self.es_url = os.environ.get("ELASTICSEARCH_URL", "http://localhost:9200") | ||
self.embedding_type = os.environ.get("EMBEDDING_TYPE", "sagemaker") | ||
self.sagemaker = { | ||
"endpoint_name": os.environ.get( | ||
"SAGEMAKER_ENDPOINT_NAME", | ||
"huggingface-pytorch-inference-2023-07-28-21-01-20-147", | ||
), | ||
"region_name": os.environ.get("SAGEMAKER_REGION_NAME", "us-east-1"), | ||
} | ||
self.huggingfacehub = { | ||
"repo_id": os.environ.get( | ||
"HUGGINGFACE_REPO_ID", "sentence-transformers/all-MiniLM-L6-v2" | ||
), | ||
"task": os.environ.get("HUGGINGFACE_TASK", "feature-extraction"), | ||
"huggingfacehub_api_token": os.environ.get( | ||
"HUGGINGFACE_API_TOKEN", ""  # no token in source; supply it through the environment | ||
), | ||
} |
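Every setting above is read from the environment once, when `AppConfig()` is constructed, so variables have to be set before the object is created. A small sketch of that behaviour, using a trimmed stand-in for the class (two settings only) and a hypothetical `load_config` helper. The value `"huggingfacehub"` for `EMBEDDING_TYPE` is an assumption, since the diff does not show which values the service accepts:

```python
import os


class AppConfig:
    """Trimmed stand-in for the class in the diff above (two of its settings)."""

    def __init__(self):
        self.es_url = os.environ.get("ELASTICSEARCH_URL", "http://localhost:9200")
        self.embedding_type = os.environ.get("EMBEDDING_TYPE", "sagemaker")


def load_config(overrides):
    """Hypothetical helper: build a config under temporary environment overrides."""
    saved = {key: os.environ.get(key) for key in overrides}
    os.environ.update(overrides)
    try:
        return AppConfig()
    finally:
        # Restore the environment exactly as it was before the call.
        for key, value in saved.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value


config = load_config(
    {"ELASTICSEARCH_URL": "http://es:9200", "EMBEDDING_TYPE": "huggingfacehub"}
)
```

Because the values are copied at construction time, `config` keeps the overridden settings even after the environment has been restored.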
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
from abc import ABC, abstractmethod | ||
|
||
|
||
class IDocumentHandler(ABC): | ||
@abstractmethod | ||
def index(self): | ||
pass | ||
|
||
@abstractmethod | ||
def batch_index(self): | ||
pass | ||
|
||
@abstractmethod | ||
def search(self): | ||
pass | ||
|
||
@abstractmethod | ||
def delete(self): | ||
pass | ||
|
||
@abstractmethod | ||
def delete_batch(self): | ||
pass |
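The interface above only fixes method names; parameters and return values are left to implementations. A minimal sketch of one possible implementation, assuming dict-shaped products with a `title` field as in the README examples. `InMemoryDocumentHandler`, its parameters, and the substring match are all hypothetical stand-ins, not the handler used by SentenX:

```python
import uuid
from abc import ABC, abstractmethod


class IDocumentHandler(ABC):
    """Same five abstract methods as the interface in the diff above."""

    @abstractmethod
    def index(self):
        pass

    @abstractmethod
    def batch_index(self):
        pass

    @abstractmethod
    def search(self):
        pass

    @abstractmethod
    def delete(self):
        pass

    @abstractmethod
    def delete_batch(self):
        pass


class InMemoryDocumentHandler(IDocumentHandler):
    """Hypothetical handler backed by a dict: no Elasticsearch, no embeddings."""

    def __init__(self):
        self._docs = {}

    def index(self, catalog_id, product):
        doc_id = str(uuid.uuid4())
        self._docs[doc_id] = {**product, "catalog_id": catalog_id}
        return doc_id

    def batch_index(self, catalog_id, products):
        return [self.index(catalog_id, product) for product in products]

    def search(self, text, catalog_id):
        # A plain substring match stands in for the real similarity search.
        return [
            doc
            for doc in self._docs.values()
            if doc["catalog_id"] == catalog_id and text.lower() in doc["title"].lower()
        ]

    def delete(self, doc_id):
        # True if the document existed and was removed.
        return self._docs.pop(doc_id, None) is not None

    def delete_batch(self, doc_ids):
        # Returns only the ids that were actually removed.
        return [doc_id for doc_id in doc_ids if self.delete(doc_id)]
```

A stand-in like this can be handy in unit tests, since any subclass that leaves one of the five methods undefined fails at instantiation with `TypeError`.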