Storing all the job posting data in a Postgres database is useful, but we are leaving meat on the bone if we don't use that data to extract valuable insights.
To that end, I used Google's Gemma4 model to extract job skill keywords from every stored job description.
For example, a job posting for a "Billing Analyst" yielded the following: ['Data Analysis', 'Data Management',
'Data Modeling', 'Database Design', 'Data Mining Techniques', 'Data Cleaning', 'Querying', 'Database Management', 'Reporting Packages',
'Excel', 'XML Analysis', 'Formulas', 'Macros', 'Report Writing', 'Trend Identification']
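As a rough illustration of the extraction step, the sketch below builds a prompt for one job description and parses a list-style reply like the one above. The prompt wording and the helper names `build_prompt` and `parse_keywords` are assumptions made for this example, not the exact code used in the project, and the call to the model itself is deliberately left out.

```python
import ast

# Hypothetical prompt wording; the real prompt may differ.
PROMPT_TEMPLATE = (
    "Extract the job skill keywords from the job description below. "
    "Reply with a Python list of strings and nothing else.\n\n{description}"
)


def build_prompt(description: str) -> str:
    """Fill the prompt template with one job description."""
    return PROMPT_TEMPLATE.format(description=description.strip())


def parse_keywords(reply: str) -> list[str]:
    """Pull the outermost [...] list out of a model reply and clean it up."""
    start, end = reply.find("["), reply.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        items = ast.literal_eval(reply[start:end + 1])
    except (ValueError, SyntaxError):
        return []
    if not isinstance(items, list):
        return []

    seen, keywords = set(), []
    for item in items:
        if not isinstance(item, str):
            continue
        keyword = item.strip()
        # Drop empty strings and case-insensitive duplicates.
        if keyword and keyword.lower() not in seen:
            seen.add(keyword.lower())
            keywords.append(keyword)
    return keywords
```

Parsing defensively matters here because a language model will sometimes wrap the list in extra text or repeat a keyword with different casing.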
FAISS (Facebook AI Similarity Search) was used to group similar keywords. For example, "Macros" is one of the extracted skill keywords.
We want "VBA" to be counted in the same category as "Macros": the two are different keywords, but they refer to the same concept.
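To make the grouping idea concrete, here is a minimal sketch that merges keywords whose embedding vectors are close together. It is not the project's FAISS code: it swaps in a brute-force cosine similarity comparison in plain Python so that it runs without extra dependencies, and the vectors and the 0.8 threshold are made-up toy values. In a real pipeline the vectors would come from an embedding model, and the nearest-neighbor lookup could be handed to a FAISS index (for example an inner-product index over normalized vectors).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_keywords(embeddings, threshold=0.8):
    """Merge keywords whose similarity meets the threshold (union-find)."""
    keywords = list(embeddings)
    parent = {k: k for k in keywords}

    def find(k):
        # Follow parent links to the group's representative keyword.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for i, a in enumerate(keywords):
        for b in keywords[i + 1:]:
            if cosine(embeddings[a], embeddings[b]) >= threshold:
                parent[find(b)] = find(a)

    groups = {}
    for k in keywords:
        groups.setdefault(find(k), []).append(k)
    return list(groups.values())


# Toy 3-dimensional vectors, invented for illustration only.
toy_embeddings = {
    "Macros": [0.9, 0.1, 0.0],
    "VBA": [0.85, 0.2, 0.05],
    "SQL": [0.1, 0.9, 0.1],
    "Python": [0.0, 0.1, 0.95],
}

groups = group_keywords(toy_embeddings)
print(groups)
```

With these toy vectors, "Macros" and "VBA" land in one group while "SQL" and "Python" each stay on their own. Note that this is single-linkage grouping, so a chain of pairwise-similar keywords can end up in one large group if the threshold is set too low.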
The table below is the result.
Top 10 Job Skill Keywords
Comparison of job skill keywords across all job postings in the database.
| Skill keyword(s) | % of Job Postings |
|---|---|
| Python Programming | 9.58% |
| SQL Development | 9.54% |
| Insight Generation | 9.19% |
| Data Analysis | 8.48% |
| Business Intelligence Strategy | 6.66% |
| Platform Engineering | 6.59% |
| Network Engineering | 6.35% |
| Enterprise Risk Management | 6.27% |
| Project Management | 5.99% |
| Full Stack AI Engineering | 5.91% |
Table 1.1: Top job skill keywords derived from job posting data collected from approximately 60 companies over a span of two months.
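For readers curious how a "% of Job Postings" column like this can be computed, here is a small sketch. It assumes that a posting counts toward a group if it mentions at least one keyword in that group, and that each posting counts at most once per group; the function name `keyword_share`, the sample postings, and the `keyword_to_group` mapping are all hypothetical.

```python
def keyword_share(postings, keyword_to_group):
    """Return (group, percent of postings) pairs, highest share first."""
    total = len(postings)
    if total == 0:
        return []

    counts = {}
    for keywords in postings:
        # Map each keyword to its group; a set keeps one vote per posting.
        groups = {keyword_to_group.get(k, k) for k in keywords}
        for group in groups:
            counts[group] = counts.get(group, 0) + 1

    shares = {g: round(100 * c / total, 2) for g, c in counts.items()}
    return sorted(shares.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample data: one keyword list per job posting.
postings = [
    ["Macros", "Excel"],
    ["VBA", "Macros", "SQL"],
    ["SQL"],
    ["Python"],
]
keyword_to_group = {"VBA": "Macros"}  # "VBA" is counted under "Macros"

top = keyword_share(postings, keyword_to_group)
print(top)
```

Because a single posting can mention several skill groups, percentages computed this way do not need to add up to 100%.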
Turning Data Into Gold!
Data is very important to businesses, and analyzing that data to generate insights is even more valuable in today's world.
The top job skill keywords reflect this: the five most common ones (Python Programming, SQL Development, Insight Generation, Data Analysis, and Business Intelligence Strategy) all relate to working with data.
Important note: The data is based entirely on job postings from a curated company list. For example, I did not include
job postings from mining companies in this dataset since I have no interest in working for a mining company. The percentages therefore describe this curated list, not the job market as a whole.