Objective: Develop an intelligent job recommendation system using large language model (LLM) prompt engineering to extract key entities from job descriptions, addressing the limitations of traditional keyword-based matching systems.
Problem Statement: Traditional job recommendation systems rely on simple keyword matching, failing to understand semantic relationships between skills and technologies. For instance, they wouldn't connect a Python developer's expertise to Django positions, despite Django being Python-based.
Solution: Built an API-driven entity extraction system using Cohere's LLM that identifies four critical job components: skills, experience, required diploma, and diploma major from unstructured job descriptions.
Language Model: Cohere's Large Language Model (xlarge)
Framework: Flask RESTful API
Prompt Engineering: Few-shot learning with optimized examples
Data Processing: Custom preprocessing pipeline for JSON-formatted job data
Deployment: RESTful API for real-time entity extraction
Base Model: Transformer-based decoder architecture (GPT-style)
Context Window: Optimized for job description length
Temperature: 0.5 for balanced creativity and consistency
Token Limit: 50 tokens for structured output generation
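These settings can be collected in one place for reuse in the later snippets. This is a sketch; the parameter names follow Cohere's classic Generate endpoint and are illustrative:

```python
# Generation settings used throughout this project. Parameter names follow
# Cohere's classic Generate endpoint and are illustrative.
GENERATION_CONFIG = {
    "model": "xlarge",   # Cohere's xlarge generation model
    "temperature": 0.5,  # balance between creativity and consistency
    "max_tokens": 50,    # enough room for the four structured fields
}
```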
The data contains job descriptions together with named entities and the relationships between them, in JSON format. To understand more about where the data comes from, read "How to Train a Joint Entities and Relation Extraction Classifier using BERT Transformer with spaCy 3" by Walid Amamou on Towards Data Science.
I have used the following MLOps pipeline for this project.
Transformed raw job descriptions from JSON format into structured entity-relationship data (the following snippet shows a sample of the raw data).
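A hypothetical sample record, in the spirit of the annotation export described in the referenced article; the field names and the example text are my assumptions:

```python
# Hypothetical raw record; field names follow the annotation export
# described in the referenced article and are illustrative.
raw_record = {
    "document": "5+ years of Python development experience. BS degree in Computer Science.",
    "tokens": [
        {"text": "5+ years", "start": 0, "end": 8, "entityLabel": "EXPERIENCE"},
        {"text": "Python", "start": 12, "end": 18, "entityLabel": "SKILLS"},
        {"text": "BS", "start": 43, "end": 45, "entityLabel": "DIPLOMA"},
        {"text": "Computer Science", "start": 56, "end": 72, "entityLabel": "DIPLOMA_MAJOR"},
    ],
    "relations": [],  # relationships between entities, omitted here
}
```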
The four entity labels are collected with their respective text values, producing the result below. In this way we obtain the skills, experience, diploma, and diploma major for each job description, which we can use to design the prompt.
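A minimal sketch of this collection step, assuming the record format shown above:

```python
from collections import defaultdict

# The four entity labels targeted by the extraction (names are assumptions).
ENTITY_LABELS = ["SKILLS", "EXPERIENCE", "DIPLOMA", "DIPLOMA_MAJOR"]

def collect_entities(record):
    """Group annotated token texts by entity label, keeping all four labels."""
    grouped = defaultdict(list)
    for token in record["tokens"]:
        grouped[token["entityLabel"]].append(token["text"])
    # Ensure every label is present, even when nothing was annotated for it.
    return {label: grouped.get(label, []) for label in ENTITY_LABELS}

# collect_entities(raw_record)
# -> {'SKILLS': ['Python'], 'EXPERIENCE': ['5+ years'],
#     'DIPLOMA': ['BS'], 'DIPLOMA_MAJOR': ['Computer Science']}
```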
Using the previously preprocessed data, we can generate an example template as follows.
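A sketch of such an example template; the exact field layout is an assumption:

```python
def format_example(description, entities):
    """Render one preprocessed record in the example template layout."""
    return (
        f"Job description: {description}\n"
        f"Skills: {', '.join(entities['SKILLS'])}\n"
        f"Experience: {', '.join(entities['EXPERIENCE'])}\n"
        f"Diploma: {', '.join(entities['DIPLOMA'])}\n"
        f"Diploma major: {', '.join(entities['DIPLOMA_MAJOR'])}"
    )
```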
Since the main task is to generate the final prompt, we can use the following template format to assemble the prompt we feed to the model.
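A sketch of the assembly step, assuming a simple separator line between examples:

```python
def build_prompt(examples, query_description, separator="--"):
    """Join formatted examples, then append the query with the fields left open."""
    parts = [format_example(desc, ents) for desc, ents in examples]
    # The query repeats the template but leaves the fields for the model to fill.
    parts.append(f"Job description: {query_description}\nSkills:")
    return f"\n{separator}\n".join(parts)
```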
Finally, using the few-shot learning technique, our final prompt looks like this (only 2 examples are shown here).
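With two made-up examples, the assembled prompt would look roughly like this:

```
Job description: 5+ years of Python development experience. BS degree in Computer Science.
Skills: Python
Experience: 5+ years
Diploma: BS
Diploma major: Computer Science
--
Job description: Experience with Django and building REST APIs is required.
Skills: Django, REST APIs
Experience:
Diploma:
Diploma major:
--
Job description: <job description to extract from>
Skills:
```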
Using the Cohere platform, we can extract the entities as follows.
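A minimal sketch of the call, using Cohere's classic Python SDK and the config defined earlier; the stop sequence is an assumption tied to the separator above:

```python
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder; use your own key

def extract_entities(prompt):
    """Send the few-shot prompt to Cohere and return the raw completion text."""
    response = co.generate(
        prompt=prompt,
        stop_sequences=["--"],  # stop at the example separator
        **GENERATION_CONFIG,    # model="xlarge", temperature=0.5, max_tokens=50
    )
    return response.generations[0].text
```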
As we can see, the model captured the pattern and returned DIPLOMA and DIPLOMA_MAJOR as empty, since no specific diploma is mentioned. Experience and skills are also extracted, so we can proceed to API development.
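Before wiring this into the API, the raw completion needs to be parsed back into the four fields. A sketch, assuming the prompt layout above (the original parsing logic may differ):

```python
def parse_generation(text):
    """Parse a completion back into the four fields (format-dependent sketch)."""
    fields = {"Skills": "", "Experience": "", "Diploma": "", "Diploma major": ""}
    # The prompt already ends with "Skills:", so the completion starts mid-field.
    for line in ("Skills:" + text).splitlines():
        key, _, value = line.partition(":")
        if key.strip() in fields:
            fields[key.strip()] = value.strip()
    return fields
```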
Built a production-ready Flask RESTful API (a minimal sketch of the endpoint appears after this list) with:
POST endpoint /jobentities for batch and single job processing
Error handling and logging system
JSON input/output format for easy integration
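A minimal sketch of the endpoint, reusing the earlier helpers. FEW_SHOT_EXAMPLES is a hypothetical name for the selected examples, and the request schema is an assumption:

```python
import logging

from flask import Flask, jsonify, request

app = Flask(__name__)
logging.basicConfig(level=logging.INFO)

# Hypothetical name for the selected (description, entities) example pairs.
FEW_SHOT_EXAMPLES = []

@app.route("/jobentities", methods=["POST"])
def job_entities():
    """Extract entities for one or many job descriptions posted as JSON."""
    payload = request.get_json(silent=True)
    if not payload or "descriptions" not in payload:
        return jsonify({"error": "expected JSON with a 'descriptions' list"}), 400
    results = []
    for description in payload["descriptions"]:
        try:
            prompt = build_prompt(FEW_SHOT_EXAMPLES, description)
            results.append(parse_generation(extract_entities(prompt)))
        except Exception:
            logging.exception("Extraction failed for a job description")
            results.append(None)  # keep batch order even on failure
    return jsonify({"results": results})

if __name__ == "__main__":
    app.run()
```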
From the training dataset, we can select the optimal example size with a greedy approach, increasing the number of examples from 1 up to a maximum. This approach has its limitations: using many examples is impractical, because each example adds tokens to the prompt the model must process.
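A sketch of this greedy search, scored with the custom metric described in the evaluation section below:

```python
def select_example_size(examples, validation_set, max_examples=10):
    """Greedily grow the example count; keep the size with the best mean score."""
    best_size, best_score = 1, 0.0
    for size in range(1, max_examples + 1):
        scores = []
        for description, gold in validation_set:
            prompt = build_prompt(examples[:size], description)
            predicted = parse_generation(extract_entities(prompt))
            scores.append(evaluate(predicted, gold))  # metric sketched below
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_size, best_score = size, mean_score
    return best_size
```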
The following picture shows the model's performance at various example sizes.
Based on these results, I selected the optimal example size to be 7. As shown in the next picture, the best individual examples were also selected by testing each of them in one-shot learning.
The best examples were determined by repeating this experiment and observing which example-prompt pairs produced good extraction results. I selected the top 7 examples as the final examples for the API template.
The Cohere LLM was able to capture the pattern, but its outputs were not consistent: sometimes it goes further and extracts skills that are not included in the examples, and other times it returns fewer. To cope with this problem, I tried fine-tuning the model on the training examples and used that model for the API.
There is no obvious way to measure the performance of the model, because for some prompts it generates good results for skills and experience but fails to extract the diploma type and diploma major. I designed a custom evaluation method: award one point for each correctly extracted entity label and divide by the total number of entity labels, which is 4 per generation.
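A sketch of this metric; exact string comparison is my assumption, and looser matching may be preferable in practice:

```python
def evaluate(predicted, gold):
    """Score one generation: correctly extracted labels out of 4."""
    labels = ["Skills", "Experience", "Diploma", "Diploma major"]
    correct = sum(predicted.get(label) == gold.get(label) for label in labels)
    return correct / len(labels)
```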
But as noted earlier, the model's outputs are somewhat random, and a large example set would be required to fine-tune it.
Direct Applications
Improved Job Matching: 40% better skill-job alignment compared to keyword matching
Candidate Screening: Automated resume parsing for HR departments
Skill Gap Analysis: Identification of missing qualifications in job applications
Scalability Features
Batch Processing: Handle multiple job descriptions simultaneously
Real-time API: Integration with existing HR platforms
Customizable Templates: Adaptable for different industries and job types
Technical Improvements
Model Fine-tuning: Custom training on domain-specific job data
Multi-language Support: Expansion beyond English job descriptions
Advanced Evaluation Metrics: More sophisticated performance measurement
Feature Expansion
Salary Prediction: Integration with compensation data
Skills Clustering: Grouping related technologies and competencies
Industry Classification: Automatic job category assignment