Vector client release 0.4.0 #7

Merged
merged 6 commits into from
Apr 11, 2024
Changes from 5 commits
4 changes: 4 additions & 0 deletions .flake8
@@ -0,0 +1,4 @@
[flake8]
# Black compatibility
max-line-length = 88
ignore = E501, E203
DomPeliniAerospike marked this conversation as resolved.
8 changes: 8 additions & 0 deletions .gitignore
@@ -155,3 +155,11 @@ cython_debug/
# PyCharm
.idea/

# Vector search test files
tests/siftsmall/*
tests/siftsmall.tar.gz

# Notes
notes.txt

public-key.asc
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
36 changes: 36 additions & 0 deletions docs/conf.py
@@ -0,0 +1,36 @@
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
import os
import sys
sys.path.insert(0, os.path.abspath('../src/aerospike_vector'))
import sphinx.ext.autodoc

project = 'D'

Is 'D' right? I am unfamiliar with Sphinx.

Contributor Author

Yeah, this was just a placeholder while I was getting the hang of the Python API docs.
Will fix this in the next docs PR.

copyright = '2024, D'
author = 'D'
release = 'D'

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = [
    # Other extensions...
    'sphinx.ext.autodoc',
]


templates_path = ['_templates']
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']



# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = 'sphinx_rtd_theme'
html_static_path = ['_static']
20 changes: 20 additions & 0 deletions docs/index.rst
@@ -0,0 +1,20 @@
.. D documentation master file, created by
   sphinx-quickstart on Sun Apr 7 22:14:55 2024.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to D's documentation! SCOBEDY BABALOUIE

D again! :)

Contributor Author

I did not need to write this; will fix it soon, lol.

=============================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   vectordb_client

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
35 changes: 35 additions & 0 deletions docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)

if "%1" == "" goto help

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
4 changes: 4 additions & 0 deletions docs/vectodb_client.rst
@@ -0,0 +1,4 @@
.. automodule:: aerospike_vector.vectordb_client
   :members:
   :undoc-members:
   :show-inheritance:
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -8,7 +8,7 @@ description = "Aerospike Proximus Client Library for Python"
 authors = [
     { name = "Aerospike, Inc.", email = "[email protected]" }
 ]
-readme = "README.rst"
+readme = "README.md"
 license = { text = "Apache Software License" }
 keywords = ["aerospike", "vector", "database", "ANN"]
 classifiers = [
@@ -22,7 +22,7 @@ classifiers = [
     "Programming Language :: Python :: Implementation :: CPython",
     "Topic :: Database"
 ]
-version = "0.3.2.dev1"
+version = "0.4.0dev1"
 requires-python = ">3.8"
 dependencies = [
     "grpcio",
3 changes: 3 additions & 0 deletions src/aerospike_vector/__init__.py
@@ -1 +1,4 @@
import logging
name = "aerospike_vector"

logging.getLogger(__name__).addHandler(logging.NullHandler())
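
Attaching a `NullHandler` to the package logger means the library prints nothing unless the application configures logging itself. A minimal sketch of how an application might opt in, assuming only that the logger is named `aerospike_vector` (what `__name__` resolves to in `__init__.py`); the handler, format string, and level are illustrative choices, not part of this PR:

```python
import logging

# Mirror what the package's __init__.py does at import time.
library_logger = logging.getLogger("aerospike_vector")
library_logger.addHandler(logging.NullHandler())

# Application side: attach a real handler and pick a level to see output.
handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)
library_logger.addHandler(handler)
library_logger.setLevel(logging.DEBUG)

library_logger.debug("client log output is now visible")
```

Without the application-side lines, the `NullHandler` silently discards every record sent to this logger.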
35 changes: 22 additions & 13 deletions src/aerospike_vector/conversions.py
@@ -18,20 +18,28 @@ def toVectorDbValue(value: Any) -> types_pb2.Value:
         if isinstance(value[0], float):
             return types_pb2.Value(
                 vectorValue=types_pb2.Vector(
-                    floatData={"value": [float(x) for x in value]}))
+                    floatData={"value": [float(x) for x in value]}
+                )
+            )
         elif isinstance(value[0], bool):
             return types_pb2.Value(
                 vectorValue=types_pb2.Vector(
-                    boolData={"value": [True if x else False for x in value]}))
+                    boolData={"value": [True if x else False for x in value]}
+                )
+            )
         else:
             return types_pb2.Value(
-                listValue=types_pb2.List(
-                    entries=[toVectorDbValue(x) for x in value]))
+                listValue=types_pb2.List(entries=[toVectorDbValue(x) for x in value])
+            )
     elif isinstance(value, dict):
         d = types_pb2.Value(
-            mapValue=types_pb2.Map(entries=[
-                types_pb2.MapEntry(key=toMapKey(k), value=toVectorDbValue(v))
-                for k, v in value.items()]))
+            mapValue=types_pb2.Map(
+                entries=[
+                    types_pb2.MapEntry(key=toMapKey(k), value=toVectorDbValue(v))
+                    for k, v in value.items()
+                ]
+            )
+        )
         return d
     else:
         raise Exception("Invalid type " + str(type(value)))
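
The list branch of `toVectorDbValue` chooses a wire representation by inspecting only `value[0]`. A rough sketch of that dispatch in plain Python, without the generated protobuf types; `classify_list` and its return labels are hypothetical names, and the empty-list handling is an assumption, since the hunk does not show how an empty list is treated:

```python
from typing import Any, List


def classify_list(value: List[Any]) -> str:
    """Return a label for the representation a list would map to.

    Hypothetical helper that mirrors the first-element checks in the hunk.
    """
    if not value:
        # Assumption: treat an empty list as a generic list.
        return "list"
    if isinstance(value[0], float):
        return "float_vector"
    if isinstance(value[0], bool):
        return "bool_vector"
    return "list"


print(classify_list([0.1, 0.2]))      # float_vector
print(classify_list([True, False]))   # bool_vector
print(classify_list([1, 2, 3]))       # list
print(classify_list([1.0, "a"]))      # float_vector
```

The last call shows the consequence of checking only the first element: a mixed list that starts with a float is classified as a float vector, so in the real function every element would then go through `float(x)`.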
@@ -61,7 +69,9 @@ def fromVectorDbKey(key: types_pb2.Key) -> types.Key:
     elif key.HasField("bytesValue"):
         keyValue = key.bytesValue

-    return types.Key(key.namespace, key.set, key.digest, keyValue)
+    return types.Key(
+        namespace=key.namespace, set=key.set, digest=key.digest, key=keyValue
+    )


 def fromVectorDbRecord(record: types_pb2.Record) -> dict[str, Any]:
@@ -72,11 +82,10 @@ def fromVectorDbRecord(record: types_pb2.Record) -> dict[str, Any]:
     return bins


-def fromVectorDbNeighbor(input: types_pb2.Neighbor) -> (
-        types.Neighbor):
-    return types.Neighbor(fromVectorDbKey(input.key),
-                          fromVectorDbRecord(input.record),
-                          input.distance)
+def fromVectorDbNeighbor(input: types_pb2.Neighbor) -> types.Neighbor:
+    return types.Neighbor(
+        key=fromVectorDbKey(input.key), bins=fromVectorDbRecord(input.record), distance=input.distance
+    )


 def fromVectorDbValue(input: types_pb2.Value) -> Any:
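
The call sites in `conversions.py` switch to keyword arguments because the constructors in `types.py` become keyword-only (note the bare `*` in their signatures). A small sketch of what that enforces, using a stand-in `Key` class that copies the signature from this PR rather than importing the real module:

```python
from typing import Any


class Key:
    """Stand-in with the same keyword-only signature as types.Key in this PR."""

    def __init__(self, *, namespace: str, set: str, digest: bytearray, key: Any) -> None:
        self.namespace = namespace
        self.set = set
        self.digest = digest
        self.key = key


# Keyword arguments may be given in any order.
k = Key(key="user-1", namespace="test", set="demo", digest=bytearray(20))

# Positional arguments are rejected with TypeError.
try:
    Key("test", "demo", bytearray(20), "user-1")
except TypeError as exc:
    positional_error = exc
else:
    positional_error = None

print(k.namespace, k.key)
print(type(positional_error).__name__)
```

Keyword-only parameters make a call such as `Key(namespace, set, digest, key)` fail loudly instead of silently binding arguments in the wrong order.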
123 changes: 117 additions & 6 deletions src/aerospike_vector/types.py
@@ -1,29 +1,140 @@
-from typing import Any
+import enum
+from typing import Any, Optional
+
+from . import types_pb2


 class HostPort(object):
-    def __init__(self, address: str, port: int, isTls=False):
-        self.address = address
+    def __init__(self, *, host: str, port: int, isTls: Optional[bool] = False) -> None:
+        self.host = host
         self.port = port
         self.isTls = isTls


 class Key(object):
-    def __init__(self, namespace: str, set: str, digest: bytearray, key: Any):
+    def __init__(
+        self, *, namespace: str, set: str, digest: bytearray, key: Any
+    ) -> None:
         self.namespace = namespace
         self.set = set
         self.digest = digest
         self.key = key
+
+    def __str__(self):
+        return f"Key: namespace='{self.namespace}', set='{self.set}', digest={self.digest}, key={self.key}"


 class RecordWithKey(object):
-    def __init__(self, key: Key, bins: dict[str, Any]):
+    def __init__(self, *, key: Key, bins: dict[str, Any]) -> None:
         self.key = key
         self.bins = bins
+
+    def __str__(self):
+        bins_info = ""
+        for key, value in self.bins.items():
+            if isinstance(value, list):
+                if len(value) > 4:
+                    value_str = (
+                        "[\n"
+                        + ",\n".join("\t\t\t{}".format(val) for val in value[:3])
+                        + ",\n\t\t\t...\n\t\t]"
+                    )
+                else:
+                    value_str = str(value)
+            else:
+                value_str = str(value)
+            bins_info += "\n\t\t{}: {}".format(key, value_str)
+        return "{{\n\t{},\n\tbins: {{\n{}\n\t}}\n}}".format(self.key, bins_info)


 class Neighbor(object):
-    def __init__(self, key: Key, bins: dict[str, Any], distance: float):
+    def __init__(self, *, key: Key, bins: dict[str, Any], distance: float) -> None:
         self.key = key
         self.bins = bins
         self.distance = distance
+
+    def __str__(self):
+        bins_info = ""
+        for key, value in self.bins.items():
+            if isinstance(value, list):
+                if len(value) > 4:
+                    value_str = (
+                        "[\n"
+                        + ",\n".join("\t\t\t{}".format(val) for val in value[:3])
+                        + ",\n\t\t\t...\n\t\t]"
+                    )
+                else:
+                    value_str = str(value)
+            else:
+                value_str = str(value)
+            bins_info += "\n\t\t{}: {}".format(key, value_str)
+        return "{{\n\t{},\n\tdistance: {},\n\tbins: {{\n{}\n\t}}\n}}".format(
+            self.key, self.distance, bins_info
+        )
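
`RecordWithKey.__str__` and `Neighbor.__str__` apply the same truncation rule: a list bin with more than four entries prints its first three followed by `...`, which keeps long vectors readable. A sketch of that rule factored into one helper; `format_bin_value` is a hypothetical name that does not appear in the PR:

```python
from typing import Any


def format_bin_value(value: Any) -> str:
    """Render a bin value, abbreviating lists longer than four entries."""
    if isinstance(value, list) and len(value) > 4:
        head = ",\n".join("\t\t\t{}".format(val) for val in value[:3])
        return "[\n" + head + ",\n\t\t\t...\n\t\t]"
    return str(value)


print(format_bin_value([1, 2, 3, 4]))      # short list: printed in full
print(format_bin_value(list(range(100))))  # long list: first three entries, then ...
print(format_bin_value("plain string"))    # non-list values use str()
```

Both `__str__` methods could call such a helper instead of repeating the nested `if` blocks.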


+class VectorDistanceMetric(enum.Enum):
+    SQUARED_EUCLIDEAN: types_pb2.VectorDistanceMetric = (
+        types_pb2.VectorDistanceMetric.SQUARED_EUCLIDEAN
+    )
+    COSINE: types_pb2.VectorDistanceMetric = types_pb2.VectorDistanceMetric.COSINE
+    DOT_PRODUCT: types_pb2.VectorDistanceMetric = (
+        types_pb2.VectorDistanceMetric.DOT_PRODUCT
+    )
+    MANHATTAN: types_pb2.VectorDistanceMetric = types_pb2.VectorDistanceMetric.MANHATTAN
+    HAMMING: types_pb2.VectorDistanceMetric = types_pb2.VectorDistanceMetric.HAMMING
+
+
+class HnswBatchingParams(object):
+    def __init__(
+        self,
+        *,
+        max_records: Optional[int] = 10000,
+        interval: Optional[int] = 10000,
+        disabled: Optional[bool] = False,
+    ) -> None:
+        self.max_records = max_records
+        self.interval = interval
+        self.disabled = disabled
+
+    def to_pb2(self):
+        # Create an instance of HnswBatchingParams
+        params = types_pb2.HnswBatchingParams()
+        params.maxRecords = self.max_records
+        params.interval = self.interval
+        params.disabled = self.disabled
+        return params
+
+
+class HnswParams(object):
+    def __init__(
+        self,
+        *,
+        m: Optional[int] = 16,
+        ef_construction: Optional[int] = 100,
+        ef: Optional[int] = 100,
+        batching_params: Optional[HnswBatchingParams] = HnswBatchingParams(),
+    ) -> None:
+        self.m = m
+        self.ef_construction = ef_construction
+        self.ef = ef
+        self.batching_params = batching_params
+
+    def to_pb2(self):
+        # Create an instance of HnswParams
+        params = types_pb2.HnswParams()
+        params.m = self.m
+        params.efConstruction = self.ef_construction
+        params.ef = self.ef
+
+        # Assign HnswBatchingParams instance to HnswParams
+        params.batchingParams.CopyFrom(self.batching_params.to_pb2())
+        return params
+
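+
+class HnswSearchParams(object):
+    def __init__(self, *, ef: Optional[int] = None) -> None:
+        self.ef = ef
+
+    def to_pb2(self):
+        params = types_pb2.HnswSearchParams()
+        # Leave ef unset when no value was given; assigning None to a
+        # protobuf scalar field raises TypeError.
+        if self.ef is not None:
+            params.ef = self.ef
+        return params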
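
`HnswParams.__init__` uses `HnswBatchingParams()` as a default argument. Python evaluates default expressions once, when the function is defined, so every `HnswParams` created without `batching_params` refers to the same object. A sketch of that behavior next to the common `None`-sentinel alternative; `BatchingParams`, `SharedDefault`, and `FreshDefault` are hypothetical stand-ins so the example runs without the generated protobuf module:

```python
from typing import Optional


class BatchingParams:
    def __init__(self, *, max_records: int = 10000) -> None:
        self.max_records = max_records


class SharedDefault:
    # Same shape as the signature in the diff: the default is built once.
    def __init__(self, *, batching_params: BatchingParams = BatchingParams()) -> None:
        self.batching_params = batching_params


class FreshDefault:
    # None-sentinel variant: a new default object for each instance.
    def __init__(self, *, batching_params: Optional[BatchingParams] = None) -> None:
        self.batching_params = (
            batching_params if batching_params is not None else BatchingParams()
        )


a, b = SharedDefault(), SharedDefault()
a.batching_params.max_records = 5
print(b.batching_params.max_records)  # 5: a and b share one BatchingParams

c, d = FreshDefault(), FreshDefault()
c.batching_params.max_records = 5
print(d.batching_params.max_records)  # 10000: d has its own default
```

The shared default only matters if callers mutate the params object after construction; the sentinel form avoids the question entirely.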