Skip to content

Latest commit

 

History

History
188 lines (121 loc) · 6.22 KB

DEVELOPER_GUIDE.md

File metadata and controls

188 lines (121 loc) · 6.22 KB

OpenSearch-py-ml Developer Guide

This guide applies to the development within the OpenSearch-py-ml project

Getting started guide

This guide is for any developer who wants a running local development environment where you can make, see, and test changes. It's opinionated to get you running as quickly and easily as possible, but it's not the only way to set up a development environment.

If you're only interested in installing and running this project, you can install from pypi

If you're planning to contribute code (features or fixes) to this repository, great! Make sure to also read the contributing guide.

Key technologies

OpenSearch-py-ml is primarily a python based client plugin for machine learning in opensearch. To effectively contribute you should be familiar with PYTHON.

Prerequisites

To develop on OpenSearch-py-ml, you'll need:

  • A GitHub account
  • git for version control
  • Python, Package installer for python. for example:pip
  • A code editor of your choice, configured for Python. If you don't have a favorite editor, we suggest Pycharm

If you already have these installed or have your own preferences for installing them, skip ahead to the Fork and clone OpenSearch-py-ml section.

Install git

If you don't already have it installed (check with git --version) we recommend following the git installation guide for your OS.

Install python

You can install any version of python starting from 3.8.

Fork and clone OpenSearch-py-ml

All local development should be done in a forked repository. Fork OpenSearch-py-ml by clicking the "Fork" button at the top of the GitHub repository.

Clone your forked version of OpenSearch-py-ml to your local machine (replace opensearch-project in the command below with your GitHub username):

$ git clone [email protected]:opensearch-project/opensearch-py-ml.git

Install Opensearch-py-ml dependencies

If you haven't already, change directories to your cloned repository directory:

$ cd OpenSearch-py-ml

The pip install command will install the project's dependencies and build all internal packages and plugins.

$ pip install -r requirements-dev.txt

Run OpenSearch

OpenSearch-py-ml requires a running version of OpenSearch (from opensearch 2.5) to connect to.

You can install opensearch multiple ways:

  1. https://opensearch.org/downloads.html#docker-compose
  2. https://opensearch.org/docs/2.5/install-and-configure/install-opensearch/tar/

Next Steps

Now that you have a development environment to play with, there are a number of different paths you may take next.

Update the settings for the cluster

After navigating to OpenSearch Dashboards you should update the persistent settings for the cluster. The settings will update the behavior of the machine learning plugin, specifically the ml_commons plugin. ML Commons cluster settings: https://opensearch.org/docs/latest/ml-commons-plugin/cluster-settings/

You should paste this settings in the Dev Tools window and run it:

 PUT /_cluster/settings
 {
   "persistent" : {
     "plugins.ml_commons.only_run_on_ml_node" : false, 
     "plugins.ml_commons.native_memory_threshold" : 100, 
     "plugins.ml_commons.max_model_on_node": 20
   }
 }

Review user tutorials to understand the key features and workflows

  • These Notebook Examples will show you how to use opensearch-py-ml for data exploration and machine learning.
  • API references provides helpful guidance using different functionalities of opensearch-py-ml

To test code formatting and linting issue we can run

$ nox -s lint

To fix code formatting and linting issue we can run

$ nox -s format

To run tests

$ nox -s test

To test documentation

# New HTML pages will be created in build/html
$ cd docs
$ pip install -r requirements-docs.txt
$ make clean
$ make html

Default setup for opensearch

opensearch.hosts: ["https://localhost:9200"]
opensearch.username: "admin" # Default username
opensearch.password: "admin" # Default password

Code guidelines

Filenames

All filenames should use snake_case.

Right: opensearch_py_ml/ml_commons/ml_commons_client.py

Wrong: opensearch_py_ml/mlCommons/mlCommonsClient.py

Do not comment out code

We use a version management system. If a line of code is no longer needed, remove it, don't simply comment it out.

Avoid magic numbers/strings

These are numbers (or other values) simply used in line in your code. Do not use these, give them a variable name, so they can be understood and changed easily.

// good
minWidth = 300

if width < minWidth:
  ...

// bad
if width < 300:
  ...

Avoid global definitions

Don't do this. Everything should be wrapped in a module that can be depended on by other modules. Even things as simple as a single value should be a module.

Write small functions

Keep your functions short. A good function fits on a slide that the people in the last row of a big room can comfortably read. So don't count on them having perfect vision and limit yourself to ~15 lines of code per function.