This observation reflects the fact that skills in our dataset vary in how specific they are. Some skills are very broad, or transversal, such as communication skills, problem solving and business management. Other requirements are domain specific but still broad, such as molecular biology, Python and surgery. Finally, there are very specific skills, such as succession planning and visual merchandising.
Identifying and Removing Transversal Skills
Our initial attempts to detect the naturally existing communities in the graph showed that the presence of highly central skills produces one giant community containing most skills, alongside very few specialised communities. This makes sense, since transversal skills can connect vertices in the graph that have little else in common. To address this, we identify and remove highly transversal skills; we also use cosine similarity as the measure of edge strength.
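The effect of removing a transversal skill can be illustrated with a minimal sketch. It assumes, purely for illustration, that each skill is represented as a binary vector over job adverts (1 if the advert mentions the skill) and that an edge is kept whenever the cosine similarity of two such vectors is positive. The toy data and the helper names `cosine_similarity` and `build_weighted_edges` are hypothetical and are not taken from our pipeline.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a skill that never occurs is similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical toy data: one binary vector per skill, one position per job advert.
skill_vectors = {
    "communication skills": [1, 1, 1, 1],  # mentioned in every advert
    "python": [1, 1, 0, 0],
    "surgery": [0, 0, 1, 1],
}


def build_weighted_edges(vectors, exclude=()):
    """Return {(skill_a, skill_b): weight} for every pair with positive similarity."""
    skills = sorted(s for s in vectors if s not in exclude)
    edges = {}
    for i, s in enumerate(skills):
        for t in skills[i + 1:]:
            weight = cosine_similarity(vectors[s], vectors[t])
            if weight > 0:
                edges[(s, t)] = weight
    return edges


# With the transversal skill present, "python" and "surgery" are linked through it.
edges_all = build_weighted_edges(skill_vectors)

# With it removed, the two specialised skills share no edge at all.
edges_pruned = build_weighted_edges(skill_vectors, exclude={"communication skills"})
```

In this toy example the only path between the two specialised skills runs through the transversal one, so removing it leaves them disconnected. This is the behaviour that lets community detection separate specialised groups.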
To identify highly transversal skills, we consider two vertex properties: eigenvector centrality and the local clustering coefficient. Eigenvector centrality is a measure of vertex influence: a vertex that is connected to well-connected (high-degree) vertices will itself have a high centrality \citep*{Austin2006}. The local clustering coefficient reflects how embedded a vertex is in its neighbourhood. If one vertex has a lower local clustering coefficient than another, fewer of that vertex's neighbours are connected to each other \citep*{watts1998collective}. We argue that a highly transversal skill is likely to have a high eigenvector centrality and a low local clustering coefficient, since the vertices it connects have relatively few other connections in common.
We recognise that it is not simply the case that each skill is either transversal or not. Instead, each skill lies somewhere along a continuum, with transversal skills at one end and specialist skills at the other. Similarly, in the ESCO taxonomy, skills are divided into transversal, cross-sector, sector-specific and occupation-specific categories. To explore whether the skill requirements in the Burning Glass data fall into similar categories, we fit a Gaussian mixture model (GMM) to the skills' eigenvector centrality scores and look for naturally occurring concentrations of skills. After initial exploration, we identify 19 groups with varying levels of specialisation (Table 2).
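The two vertex properties can be illustrated on a small toy graph. The sketch below is not our implementation: it assumes an unweighted, undirected graph, computes eigenvector centrality by plain power iteration, and uses hypothetical skill names and helper functions (`build_graph`, `eigenvector_centrality`, `local_clustering`). One hub skill is attached to every member of two otherwise separate triangles.

```python
import math
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map {vertex: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def eigenvector_centrality(adj, max_iter=500, tol=1e-10):
    """Approximate eigenvector centrality by power iteration."""
    scores = {v: 1.0 for v in adj}
    for _ in range(max_iter):
        # A vertex's new score is the sum of its neighbours' current scores.
        new = {v: sum(scores[u] for u in adj[v]) for v in adj}
        norm = math.sqrt(sum(x * x for x in new.values())) or 1.0
        new = {v: x / norm for v, x in new.items()}
        if max(abs(new[v] - scores[v]) for v in adj) < tol:
            return new
        scores = new
    return scores


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours that are themselves connected."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


# Hypothetical toy graph: two specialised triangles joined only by one hub skill.
group_a = ["python", "sql", "statistics"]
group_b = ["surgery", "anaesthesia", "wound care"]
hub = "communication skills"

edges = list(combinations(group_a, 2)) + list(combinations(group_b, 2))
edges += [(hub, skill) for skill in group_a + group_b]

graph = build_graph(edges)
centrality = eigenvector_centrality(graph)
clustering = {v: local_clustering(graph, v) for v in graph}

# Rank skills from most to least central and show both properties side by side.
for skill in sorted(graph, key=centrality.get, reverse=True):
    print(f"{skill:22s} centrality={centrality[skill]:.3f} clustering={clustering[skill]:.2f}")
```

In this toy graph the hub has the highest centrality but a clustering coefficient of only 0.4, because just 6 of the 15 possible pairs of its neighbours are connected. Each specialised skill has a coefficient of 1.0. This is the combination of high centrality and low clustering that we associate with transversal skills.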