An intelligent recruitment tool that leverages OCR and Natural Language Processing (NLP) to automate the extraction of data from unstructured PDF resumes and rank candidates based on job description relevance.
- Intelligent OCR: Uses AWS Textract to extract text, tables, and key-value pairs from PDF resumes with high accuracy.
- "Fit Score" Algorithm: A custom Python ranking algorithm that computes a candidate's relevance score against a job description, reducing manual screening time by 70%.
- Responsive Dashboard: A modern React.js frontend for HR managers to upload resumes and view ranked results.
- Serverless Backend: Built on AWS Lambda, API Gateway, and DynamoDB, so it scales automatically with no servers to manage.
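
To illustrate the OCR step, here is a minimal sketch of turning a Textract response into plain resume text. It assumes the response follows Textract's documented shape (a `Blocks` list whose entries carry a `BlockType` and, for lines, a `Text` field). The helper name `extract_lines` and the sample data are hypothetical and not taken from this repository.

```python
from typing import Any


def extract_lines(textract_response: dict[str, Any]) -> list[str]:
    """Return the text of every LINE block, in the order Textract reported them."""
    return [
        block["Text"]
        for block in textract_response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


# Hand-written stand-in for a Textract response (illustrative data only).
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Jane Doe"},
        {"BlockType": "WORD", "Text": "Jane"},  # WORD blocks duplicate LINE text, so they are skipped
        {"BlockType": "LINE", "Text": "Python, AWS, React"},
    ]
}

resume_text = "\n".join(extract_lines(sample_response))
print(resume_text)
```

In a deployed Lambda the response would come from the boto3 Textract client rather than a literal dictionary. Tables and key-value pairs arrive as additional block types and would need their own handling.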
graph LR
User[HR Manager] -->|Upload PDF| Client[React Dashboard]
Client -->|API Request| APIG[API Gateway]
APIG -->|Trigger| Lambda1[Upload Lambda]
Lambda1 -->|Store File| S3[S3 Bucket]
S3 -->|Trigger Event| Lambda2[Processing Lambda]
Lambda2 -->|Extract Text| Textract[AWS Textract]
Lambda2 -->|Analyze & Rank| NLP["NLP Engine (Python)"]
NLP -->|Store Data| DB[(DynamoDB)]
Client -->|Fetch Results| DB
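
The S3-to-Lambda hop in the diagram can be sketched as follows. This is an assumption about how the Processing Lambda might read its trigger event, not the handler shipped in `backend/handlers/`. It relies on the standard S3 event notification layout (`Records[].s3.bucket.name` and `Records[].s3.object.key`, with the key URL-encoded). The function names and the `queued` return field are hypothetical.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event: dict) -> list[tuple[str, str]]:
    """Extract (bucket, key) pairs from an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded, with spaces sent as '+'.
        key = unquote_plus(s3_info["object"]["key"])
        objects.append((bucket, key))
    return objects


def handler(event, context):
    """Hypothetical entry point: select the uploaded PDFs for processing."""
    pdfs = [(b, k) for b, k in uploaded_objects(event) if k.lower().endswith(".pdf")]
    # Text extraction, ranking, and the DynamoDB write would follow here.
    return {"queued": [f"s3://{b}/{k}" for b, k in pdfs]}


# Hand-written sample event (illustrative bucket and keys).
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "resume-uploads"}, "object": {"key": "resumes/jane+doe.pdf"}}},
        {"s3": {"bucket": {"name": "resume-uploads"}, "object": {"key": "notes.txt"}}},
    ]
}
print(handler(sample_event, None))
```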
- Frontend: React.js, Tailwind CSS
- Backend: AWS Lambda (Python), API Gateway
- ML/AI: AWS Textract, spaCy / NLTK (for entity recognition)
- Database: Amazon DynamoDB
- Storage: Amazon S3
├── frontend/
│ ├── src/
│ │ ├── components/ # Upload, Dashboard, CandidateCard
│ │ └── services/ # API integration
│ └── package.json
├── backend/
│ ├── ranking_algorithm/ # Python Fit Score logic
│ ├── handlers/ # Lambda function handlers
│ └── serverless.yml # Serverless Framework config
├── data/
│ └── sample_resumes/ # Test data
└── README.md
- Efficiency: Cut down the initial resume screening process from days to minutes.
- Accuracy: Improved candidate matching accuracy using weighted keyword analysis and semantic matching.
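
For readers curious what weighted keyword analysis can look like, below is a minimal sketch. It is not the Fit Score implementation in `backend/ranking_algorithm/`. The function names, the weights, and the 0 to 100 scale are all assumptions made for illustration. Semantic matching is left out, and only single-word keywords are handled.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of simple word tokens."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))


def fit_score(resume_text: str, keyword_weights: dict[str, float]) -> float:
    """Percentage (0-100) of the total keyword weight that appears in the resume."""
    total = sum(keyword_weights.values())
    if total <= 0:
        return 0.0
    tokens = tokenize(resume_text)
    matched = sum(w for kw, w in keyword_weights.items() if kw.lower() in tokens)
    return round(100 * matched / total, 1)


# Hypothetical weights derived from a job description.
weights = {"python": 3.0, "aws": 2.0, "react": 1.0}

# Hypothetical extracted resume texts.
resumes = {
    "jane.pdf": "Python developer with AWS Lambda experience",
    "sam.pdf": "React and CSS frontend engineer",
}

# Rank candidates from highest to lowest score.
ranked = sorted(resumes, key=lambda name: fit_score(resumes[name], weights), reverse=True)
for name in ranked:
    print(name, fit_score(resumes[name], weights))
```

Normalizing by the total weight keeps scores comparable across job descriptions with different numbers of keywords.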
- Backend Deployment: `cd backend && sls deploy`
- Frontend Setup: `cd frontend && npm install && npm start`
Distributed under the MIT License.