Merge pull request #104 from SubramanyamChalla24/backend_ml
Integrated Cohere with Qdrant to get similarity scores.
srbhr authored Aug 19, 2023
2 parents 076125f + 8113b8f commit 370d802
Showing 5 changed files with 305 additions and 88 deletions.
89 changes: 53 additions & 36 deletions README.md
@@ -57,65 +57,65 @@ Follow these steps to set up the environment and run the application.

2. Clone the forked repository.

```bash
git clone https://github.com/<YOUR-USERNAME>/Resume-Matcher.git
cd Resume-Matcher
```
```bash
git clone https://github.com/<YOUR-USERNAME>/Resume-Matcher.git
cd Resume-Matcher
```

3. Create a Python Virtual Environment:

- Using [virtualenv](https://learnpython.com/blog/how-to-use-virtualenv-python/):
- Using [virtualenv](https://learnpython.com/blog/how-to-use-virtualenv-python/):

_Note_: Check how to install virtualenv on your system here [link](https://learnpython.com/blog/how-to-use-virtualenv-python/).
_Note_: Check how to install virtualenv on your system here [link](https://learnpython.com/blog/how-to-use-virtualenv-python/).

```bash
virtualenv env
```
```bash
virtualenv env
```

**OR**
**OR**

- Create a Python Virtual Environment:
- Create a Python Virtual Environment:

```bash
python -m venv env
```
```bash
python -m venv env
```

4. Activate the Virtual Environment.

- On Windows.
- On Windows.

```bash
env\Scripts\activate
```
```bash
env\Scripts\activate
```

- On macOS and Linux.
- On macOS and Linux.

```bash
source env/bin/activate
```
```bash
source env/bin/activate
```

5. Install Dependencies:

```bash
pip install -r requirements.txt
```
```bash
pip install -r requirements.txt
```

6. Prepare Data:

- Resumes: Place your resumes in PDF format in the `Data/Resumes` folder. Remove any existing contents in this folder.
- Job Descriptions: Place your job descriptions in PDF format in the `Data/JobDescription` folder. Remove any existing contents in this folder.
- Resumes: Place your resumes in PDF format in the `Data/Resumes` folder. Remove any existing contents in this folder.
- Job Descriptions: Place your job descriptions in PDF format in the `Data/JobDescription` folder. Remove any existing contents in this folder.

7. Parse Resumes to JSON:

```bash
python run_first.py
```
```bash
python run_first.py
```

8. Run the Application:

```bash
streamlit run streamlit_app.py
```
```bash
streamlit run streamlit_app.py
```

**Note**: For local versions, you do not need to run "streamlit_second.py" as it is specifically for deploying to Streamlit servers.

@@ -127,12 +127,29 @@ Follow these steps to set up the environment and run the application.

1. Build the image and start the application:

```bash
docker-compose up
```
```bash
docker-compose up
```

2. Open `localhost:80` in your browser.

### Cohere and Qdrant

1. Visit the [Cohere registration page](https://dashboard.cohere.ai/welcome/register) and create an account.
2. Go to API keys and copy your Cohere API key.
3. Visit the [Qdrant website](https://cloud.qdrant.io/) and create an account.
4. Get your Qdrant API key and cluster URL.
5. Create a YAML file named `config.yml` in the `Scripts/Similarity/` folder.
6. The config file should use the following format:
```yaml
cohere:
  api_key: cohere_key
qdrant:
  api_key: qdrant_api_key
  url: qdrant_cluster_url
```
7. Replace the placeholder values with your own, without any quotes.
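
A malformed `config.yml` (for example, keys that are not nested under `cohere` and `qdrant`) only surfaces later as a failed `config['cohere']['api_key']` lookup. The following is a minimal sketch of a pre-flight check, not part of this commit; the helper name `missing_config_keys` and the `REQUIRED_KEYS` table are hypothetical and simply mirror the format shown above. It takes the already parsed config (the dictionary returned by `yaml.safe_load`) and reports which settings are absent or empty.

```python
from typing import Any, List, Mapping

# Required sections and keys, mirroring the config.yml format above.
# (Hypothetical table, introduced for illustration only.)
REQUIRED_KEYS = {
    "cohere": ("api_key",),
    "qdrant": ("api_key", "url"),
}


def missing_config_keys(config: Any) -> List[str]:
    """Return dotted names of required settings that are absent or empty."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        # A section is usable only if the parsed config is a mapping
        # and the section itself is a mapping.
        values = config.get(section) if isinstance(config, Mapping) else None
        for key in keys:
            if not isinstance(values, Mapping) or not values.get(key):
                missing.append(f"{section}.{key}")
    return missing
```

Calling `missing_config_keys(config)` before building the search client would let the application print the missing names (such as `qdrant.url`) instead of failing with a `KeyError` or `TypeError`; an empty list means the three expected values are present.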

<br/>

<div align="center">
76 changes: 40 additions & 36 deletions archive/resume_matcher.ipynb
@@ -17,32 +17,11 @@
"cells": [
{
"cell_type": "code",
"execution_count": null,
"execution_count": 5,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "aHoRFk4LpFSZ",
"outputId": "0a950106-ea2a-498a-9dcc-e99458b1f139"
"id": "aHoRFk4LpFSZ"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m44.5/44.5 kB\u001b[0m \u001b[31m1.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.7/2.7 MB\u001b[0m \u001b[31m30.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m132.5/132.5 kB\u001b[0m \u001b[31m1.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.6/2.6 MB\u001b[0m \u001b[31m11.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m75.4/75.4 kB\u001b[0m \u001b[31m6.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m304.5/304.5 kB\u001b[0m \u001b[31m12.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m74.5/74.5 kB\u001b[0m \u001b[31m6.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m57.5/57.5 kB\u001b[0m \u001b[31m5.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.3/58.3 kB\u001b[0m \u001b[31m5.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25h"
]
}
],
"outputs": [],
"source": [
"!pip install cohere --quiet\n",
"!pip install qdrant-client --quiet"
@@ -55,14 +34,27 @@
"from qdrant_client import QdrantClient, models\n",
"from qdrant_client.http.models import Batch\n",
"import cohere\n",
"\n",
"def read_config(filepath):\n",
" with open(filepath) as f:\n",
" config = yaml.safe_load(f)\n",
" return config\n",
" try:\n",
" with open(filepath) as f:\n",
" config = yaml.safe_load(f)\n",
" return config\n",
" except FileNotFoundError as e:\n",
" print(f\"Configuration file {filepath} not found: {e}\")\n",
" except yaml.YAMLError as e:\n",
" print(f\"Error parsing YAML in configuration file {filepath}: {e}\", exc_info=True)\n",
" except Exception as e:\n",
" print(f\"Error reading configuration file {filepath}: {e}\")\n",
" return None\n",
"\n",
"\n",
"class QdrantSearch:\n",
" def __init__(self, resumes, jd):\n",
" config = read_config(\"config.yml\")\n",
"\n",
"\n",
"\n",
" self.cohere_key = config['cohere']['api_key']\n",
" self.qdrant_key = config['qdrant']['api_key']\n",
" self.qdrant_url = config['qdrant']['url']\n",
@@ -128,31 +120,34 @@
"metadata": {
"id": "SXOgwcCATtww"
},
"execution_count": null,
"execution_count": 6,
"outputs": []
},
{
"cell_type": "code",
"source": [
"resumes = [\"Professional Summary Highly skilled MERN Stack Developer with over 10 years of experience specializing in designing building and maintaining complex web applications Proficient in MongoDB Expressjs React and Nodejs Currently contributing to the development of AI technologies at OpenAI with a primary focus on the ChatGPT project Skills JavaScript and TypeScript MongoDB Expressjs React Nodejs MERN stack RESTful APIs Git and GitHub Docker and Kubernetes Agile and Scrum Python and Machine Learning basics Experience June 2020 PresentMERN Stack Developer OpenAI San Francisco USA Working on the development of the ChatGPT project using Nodejs Expressjs and React Implementing RESTful services for communication between frontend and backend Utilizing Docker and Kubernetes for deployment and management of applications Working in an Agile environment delivering highquality software every sprint Contributing to the design and implementation of machine learning algorithms for natural language processing tasks July 2015 May 2020Full Stack Developer Uber San Francisco USA Developed and maintained scalable web applications using MERN stack Ensured the performance quality and responsiveness of applications Successfully deployed solutions using Docker and Kubernetes Collaborated with a team of engineers product managers and UX designers Led a team of junior developers conducted code reviews and ensured adherence to best coding practices Worked closely with the data science team to optimize recommendation algorithms and enhance user experience June 2012 June 2015Software Developer Facebook Menlo Park USA Developed features for the Facebook web application using React Ensured the performance of the MongoDB databases Utilized RESTful APIs for communication between different parts of the application Worked in a fastpaced testdriven development environment Assisted in migrating the legacy system to a modern MERN stack architecture Education 2009 2012 PhD in Computer Science 
CalTech Pasadena USA 2007 2009 Master of Science in Computer Science MIT Cambridge USA 2003 2007 Bachelor of Science in Computer Science UC San Diego San Diego USA 1/2 Projects 2019 PresentPersonal Project Gotham Event Planner Created a fullfeatured web application to plan and organize events in Gotham city Used MERN stack for development and Docker for deployment The application allows users to create manage and share events and integrates with Google Maps API to display event locations 2/2\"]\n",
"job_description = \"Job Description Java Developer 3 Years of Experience Tech Solutions San Francisco CA USA About Us At Tech Solutions we believe in the power of technology to solve complex problems We are a dynamic forwardthinking tech company specializing in custom software solutions for various industries We are seeking a talented and experienced Java Developer to join our team Job Description We are seeking a skilled Java Developer with at least 3 years of experience in building highperforming scal able enterprisegrade applications You will be part of a talented software team that works on missioncritical applications Your roles and responsibilities will include managing Java/Java EE application development while providing expertise in the full software development lifecycle Responsibilities •Designing implementing and maintaining Java applications that are often highvolume and low latency required for missioncritical systems •Delivering high availability and performance •Contributing to all phases of the development lifecycle •Writing welldesigned efficient and testable code •Conducting software analysis programming testing and debugging •Ensuring designs comply with specifications •Preparing and producing releases of software components •Supporting continuous improvement by investigating alternatives and technologies and presenting these for architectural review Requirements •BS/MS degree in Computer Science Engineering or a related subject •Proven handson Software Development experience •Proven working experience in Java development •Handson experience in designing and developing applications using Java EE platforms •ObjectOriented Analysis and design using common design patterns •Profound insight of Java and JEE internals Classloading Memory Management Transaction man agement etc 1 •Excellent knowledge of Relational Databases SQL and ORM technologies JPA2 Hibernate •Experience in developing web applications using at least one popular web framework JSF 
Wicket GWT Spring MVC •Experience with testdriven development Benefits •Competitive salary package •Health dental and vision insurance •Retirement savings plan •Professional development opportunities •Flexible work hours Tech Solutions is proud to be an equal opportunity employer We celebrate diversity and are committed to creating an inclusive environment for all employees How to Apply To apply please submit your resume and a brief explanation of your relevant experience to 2\"\n",
"config = read_config(\"config.yml\")\n",
"if not config:\n",
" print(\"Cannot process this as there is no config.yml\")\n",
"else:\n",
" qdrant_search = QdrantSearch(resumes, job_description)\n",
"\n",
"qdrant_search = QdrantSearch(resumes, job_description)\n",
" qdrant_search.update_qdrant()\n",
"\n",
"qdrant_search.update_qdrant()\n",
"\n",
"results = qdrant_search.search()\n",
"for r in results:\n",
" print(r)"
" results = qdrant_search.search()\n",
" for r in results:\n",
" print(r)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "rlP3s5euo435",
"outputId": "3f4f15b6-d446-4491-d4d5-d9ba14a2a145"
"outputId": "389c00e7-8cd1-4dd6-f517-d923e3c4bf2a"
},
"execution_count": null,
"execution_count": 10,
"outputs": [
{
"output_type": "stream",
@@ -162,6 +157,15 @@
]
}
]
},
{
"cell_type": "code",
"source": [],
"metadata": {
"id": "WFdXngZkEyOm"
},
"execution_count": null,
"outputs": []
}
]
}
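
The notebook's `read_config` reports failures with `print` and returns `None`. If a traceback is wanted in the error branches, the standard `logging` module accepts `exc_info=True`, which `print` does not. The following is a minimal sketch under stated assumptions, not the code from this commit: the logger name `resume_matcher.config` is hypothetical, and the `parse` parameter is an illustrative stand-in for `yaml.safe_load` so the block needs only the standard library.

```python
import logging
from typing import Any, Callable, Optional

# Hypothetical logger name, chosen for illustration.
logger = logging.getLogger("resume_matcher.config")


def read_config(filepath: str, parse: Callable[[Any], Any]) -> Optional[Any]:
    """Return the parsed config, or None after logging the failure."""
    try:
        with open(filepath) as f:
            # `parse` receives the open file object, as yaml.safe_load would.
            return parse(f)
    except FileNotFoundError:
        logger.error("Configuration file %s not found", filepath)
    except Exception:
        # exc_info=True attaches the current traceback to the log record.
        logger.error(
            "Error reading configuration file %s", filepath, exc_info=True
        )
    return None
```

With PyYAML installed, the call would look like `read_config("config.yml", parse=yaml.safe_load)`. Callers still need to check for `None` before indexing into the result, as the last notebook cell does.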
3 changes: 2 additions & 1 deletion requirements.txt
@@ -108,4 +108,5 @@ wasabi==1.1.2
watchdog==3.0.0
zipp==3.16.2

cohere~=4.19.2
cohere~=4.19.2
qdrant-client