The Society for Human Resource Management (SHRM) reports an average of 42 days (going up to 62 for Engineering roles) to shortlist candidates for a particular position.
The main issue causing this delay in talent acquisition is that recruiters struggle to find the right candidate profiles that match their specific job description.
Draup has solved this problem with a keyword-aware semantic search approach that fetches close-match profiles in near-real-time.
The Problem with Profile Shortlisting Today
Finding the right candidate profile is time-consuming and requires:
- Extensive research
- Manual curation of skillsets and competencies
- An understanding of relevant degrees/education
- Exposure to industry nuances
The search space is vast, and finding exact matches is daunting.
From a recruiter’s perspective, a straightforward syntactic match is not sufficient.
More often than not, they are looking to hire candidates similar to not one but a combination of multiple profiles. They need to dive deeper to identify functional similarities and obscure features of the candidate pool if they want to decrease their time-to-hire.
For example, understanding that a data scientist and a machine learning engineer are related requires domain expertise. Similarly, realising that Aesara and Theano are related skills also requires a fine understanding of the domain.
Currently, many solutions for finding the right candidate are based on a fixed set of titles, skills, domains, etc. However, these solutions do not always yield results aligned with the given specifications, as they cannot interpret the underlying connections between them.
Now, imagine if this computationally complex and difficult-to-scale process of finding the right candidate could be made 100% more efficient and deliver close-to-accurate matches in real time.
This is precisely what the Data Science team at Draup has achieved.
How did Draup solve this?
The system that we have in place lets you combine the features of multiple input profiles and search across a vast pool of candidates in a matter of seconds. So, for a recruiter looking to hire a Product Manager with a background in Data Science, potential hires are just clicks away, even when the input profiles meet these two specifications separately!
We have implemented this by creating a scalable format for representing profiles, called profile embeddings, hierarchically placed within the database along with other attributes for filtering and sorting.
The framework is capable of real-time retrieval of potential matches, with 600M+ profiles available for recommendation in 50+ languages, spanning 100+ countries and 2500+ job roles.
Here’s How We Did It
Representing a Profile
To represent a profile, we created embeddings for various profile attributes, including experience-acquired skills, position, and vertical history, among others.
Each profile is represented across 224 dimensions. Missing fields are handled by extrapolating information from secondary attributes that the profile indirectly depends upon.
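As a rough illustration of the idea, here is a minimal sketch of how per-attribute embeddings could be combined into one profile vector, with a missing attribute filled in from related ones. The attribute names, the fallback map, the averaging rule, and the toy size of 4 dimensions per attribute are all assumptions made for this example; they are not the production layout, which uses 224 dimensions in total.

```python
from typing import Dict, List, Optional

Vector = List[float]

# Hypothetical attribute order and per-attribute size (toy values only).
ATTRIBUTES = ["skills", "position", "vertical"]
DIM = 4

# Hypothetical map of which attributes can stand in for a missing one.
FALLBACKS = {
    "skills": ["position", "vertical"],
    "position": ["skills"],
    "vertical": ["position", "skills"],
}


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def build_profile_embedding(attrs: Dict[str, Optional[Vector]]) -> Vector:
    """Concatenate attribute vectors, filling gaps from fallback attributes."""
    profile: Vector = []
    for name in ATTRIBUTES:
        vec = attrs.get(name)
        if vec is None:
            # Assumed rule: average whichever fallback attributes are present.
            donors = [attrs[d] for d in FALLBACKS[name] if attrs.get(d) is not None]
            vec = mean_vector(donors) if donors else [0.0] * DIM
        profile.extend(vec)
    return profile


embedding = build_profile_embedding({
    "skills": [1.0, 0.0, 0.0, 0.0],
    "position": None,  # missing field, filled from "skills"
    "vertical": [0.0, 1.0, 0.0, 0.0],
})
print(len(embedding))  # 3 attributes x 4 dimensions each
```

Averaging is only one possible way to extrapolate a missing field; a learned model that predicts the missing vector would fit the same interface.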
Creating the Database
To create the database, the profile pool is segmented into multiple buckets, or partitions, based on each profile's current level in the organisational hierarchy. This improves latency and prevents the system from running out of memory.
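The partitioning step can be pictured with a small sketch like the one below. The hierarchy labels and the tuple layout are hypothetical and chosen only for illustration; the point is that a query only needs to touch the partition it targets rather than the whole pool.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (profile_id, hierarchy_level, embedding); the layout is assumed for this example.
Profile = Tuple[str, str, List[float]]


def partition_profiles(profiles: List[Profile]) -> Dict[str, List[Tuple[str, List[float]]]]:
    """Group profiles into one bucket per hierarchy level."""
    partitions = defaultdict(list)
    for profile_id, level, embedding in profiles:
        partitions[level].append((profile_id, embedding))
    return dict(partitions)


pool = [
    ("p1", "individual_contributor", [0.1, 0.9]),
    ("p2", "manager", [0.8, 0.2]),
    ("p3", "individual_contributor", [0.2, 0.7]),
]
partitions = partition_profiles(pool)
# A search for managers now scans one profile instead of three.
print(len(partitions["manager"]))
```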
Once the partitions are ready and embeddings inserted, we arrive at the next step of indexing them.
Indexing the Database
Finding an appropriate algorithm to index and retrieve its contents is key to any database. And with vectors, it can get very complex given the need for fast, close-to-accurate results.
Parameters such as storage space and query latency serve as major deciding factors here.
For our framework, we have chosen graph-based indexing since it serves our purpose of real-time retrieval, albeit at the cost of higher memory usage.
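To give a feel for why graph-based indexes are fast but memory-hungry, here is a simplified, single-layer sketch: every vector stores links to its nearest neighbours (the extra memory), and a query walks those links greedily instead of scanning every vector. This is a toy illustration in plain Python, not the index used in production; the function names and the `k`, `ef`, and `entry` parameters are assumptions for this example, and the brute-force graph construction would not scale to a real profile pool.

```python
import heapq
import math
from typing import Dict, List

Vector = List[float]


def build_knn_graph(vectors: List[Vector], k: int) -> Dict[int, List[int]]:
    """Link each vector to its k nearest neighbours (brute force, toy scale only)."""
    graph = {i: set() for i in range(len(vectors))}
    for i, v in enumerate(vectors):
        others = (j for j in range(len(vectors)) if j != i)
        nearest = sorted(others, key=lambda j: math.dist(v, vectors[j]))[:k]
        for j in nearest:
            graph[i].add(j)
            graph[j].add(i)  # bidirectional edges keep the graph navigable
    return {i: sorted(neighbours) for i, neighbours in graph.items()}


def greedy_search(vectors: List[Vector], graph: Dict[int, List[int]],
                  query: Vector, entry: int = 0, ef: int = 4, top_k: int = 2) -> List[int]:
    """Best-first walk over the graph, keeping the ef closest nodes seen so far."""
    start = math.dist(query, vectors[entry])
    visited = {entry}
    candidates = [(start, entry)]  # min-heap of nodes still to expand
    best = [(-start, entry)]       # max-heap (negated distances) of current results
    while candidates:
        dist, node = heapq.heappop(candidates)
        if len(best) >= ef and dist > -best[0][0]:
            break  # the closest unexpanded node is worse than every kept result
        for neighbour in graph[node]:
            if neighbour in visited:
                continue
            visited.add(neighbour)
            d = math.dist(query, vectors[neighbour])
            if len(best) < ef or d < -best[0][0]:
                heapq.heappush(candidates, (d, neighbour))
                heapq.heappush(best, (-d, neighbour))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the current worst result
    ranked = sorted((-neg_d, node) for neg_d, node in best)
    return [node for _, node in ranked[:top_k]]


points = [[float(i), 0.0] for i in range(6)]
graph = build_knn_graph(points, k=2)
print(greedy_search(points, graph, query=[4.2, 0.0]))
```

The `ef` parameter trades accuracy for speed: a larger value explores more of the graph before stopping, which is the same kind of latency-versus-recall knob that production graph indexes expose.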
Retrieving Matches
When a user shortlists profiles into a folder, a query is computed internally, and the database is probed.
Profiles are returned ranked by their similarity to the folder.
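The retrieval step could look something like the sketch below. How the folder query is actually computed is not described here, so the example assumes the simplest option: the query is the centroid (element-wise mean) of the shortlisted embeddings, and candidates are ranked by cosine similarity to it. The function names and the toy two-dimensional vectors are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def folder_query(folder: List[Vector]) -> Vector:
    """Assumed combination rule: centroid of the shortlisted embeddings."""
    return [sum(column) / len(folder) for column in zip(*folder)]


def recommend(folder_ids: List[str], pool: Dict[str, Vector],
              top_k: int = 3) -> List[Tuple[str, float]]:
    """Rank profiles outside the folder by similarity to the folder query."""
    query = folder_query([pool[pid] for pid in folder_ids])
    scored = [
        (pid, cosine_similarity(query, emb))
        for pid, emb in pool.items()
        if pid not in folder_ids
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


pool = {
    "product_manager": [1.0, 0.0],
    "data_scientist": [0.0, 1.0],
    "pm_with_ds_background": [1.0, 1.0],
    "unrelated": [-1.0, -1.0],
}
for profile_id, score in recommend(["product_manager", "data_scientist"], pool, top_k=2):
    print(profile_id, round(score, 3))
```

In this toy pool, shortlisting a product manager and a data scientist ranks the profile that blends both above the unrelated one. In a real deployment the scan over `pool` would be replaced by a probe of the vector index.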
Tools and Techstack
Milestones Achieved
- Combining multiple input profiles: A powerful technique devised to produce recommendations relevant to one or many profiles.
- Validation of a candidate’s skillset: We have taken advantage of the candidate’s work experience, projects, and specialisations to get insights into their skillset and discard irrelevant skills.
- Accommodating the profile pool: As the recommendation system is run on a very large scale (terabytes of candidate data), it was optimised through every step, from embedding model generation to database probing.
- Maintaining the database: The infrastructure offers end-to-end support for not just retrieval but also modification and deletion of its entities.
- Improving search latency: We adopted a combination of qualitative checks and quantitative analyses to segment and make sense of the various candidate groups, so that the search does not become an exhaustive one.
In particular, we have been successful in finding functionally similar profiles in near-real-time, which reduces the effort that goes into the recruitment process.
The use of profile embeddings and other advanced techniques enables the system to handle large amounts of data, making it ideal for use in high-volume recruitment environments.
Concluding Remarks
Building this profile recommender system required carefully fitting together model building, database creation, and pipeline probing. We drew several insights along the way and found a number of bottlenecks, some of which are resolved and some of which are yet to be fully understood. This article highlights some of our findings on profile recommendation, and there is plenty more that we wish to explore and discuss in this domain.
Stay tuned for more updates…