The emergence of the clean net around TOR

Despite the transient nature of hidden services, the multiplex network that emerges from human attention and transaction flows (across layers such as the regular web, blockchains, and so on) is often more stable.
We collected data from May 2017 to October 2018, focusing on desktop user activity worldwide. We first create a portrait of early TOR users' behavior on the surface web.

Distribution

The bar chart in Figure 1 compares the values of Proximity (a distance metric defined using semantic content and browsing patterns) across website categories. In this way, we cluster the sites visited by users to summarize the overall behavior of the population. The dominant website category is Shopping, while subcategories such as Internet and Telecom/File Sharing are closer than Internet and Telecom/Web Hosting, as one would expect.
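
As a rough illustration of the per-category comparison, the sketch below groups Proximity values by category and ranks the categories by their mean. It is only a minimal sketch under stated assumptions: the function names, the use of the mean as the summary statistic, and the numeric values are all hypothetical placeholders rather than the method or data behind Figure 1, and a smaller value is taken to mean "closer" because Proximity is described as a distance.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (category, proximity) records.
# The numbers are placeholders for illustration, not values from the study.
records = [
    ("Shopping", 0.125),
    ("Shopping", 0.375),
    ("Internet and Telecom/File Sharing", 0.25),
    ("Internet and Telecom/File Sharing", 0.5),
    ("Internet and Telecom/Web Hosting", 0.75),
]


def mean_proximity_by_category(rows):
    """Group proximity values by category and return the mean per category."""
    grouped = defaultdict(list)
    for category, proximity in rows:
        grouped[category].append(proximity)
    return {category: mean(values) for category, values in grouped.items()}


def rank_categories(rows):
    """Sort categories from closest (smallest mean distance) to farthest.

    Assumption: Proximity is a distance, so smaller means closer.
    """
    means = mean_proximity_by_category(rows)
    return sorted(means.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for category, value in rank_categories(records):
        print(f"{category}: {value}")
```

The ranked pairs could then be passed to any plotting library to draw a bar chart of the kind described above.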