By Softlanding

Latent security issue with Delve

September 29, 2017

While the Microsoft SharePoint search engine is a very powerful tool, my attention is currently drawn towards Delve and Office Graph. The two go hand in hand: Delve is the user-focused front end that visualizes data retrieved and sorted by Office Graph. Office Graph is the new and exciting search and index engine that uses artificial intelligence and fuzzy logic to create relations between data and people while crawling data.

While studying Office Graph I came across a latent security issue that Delve users should be aware of. To illustrate it, let's do a little thought experiment. Let's assume there is a company using an intranet based on SharePoint Online as part of Office 365. This company uses SharePoint extensively, resulting in a large structure of sites and subsites. Somewhere deep inside this subsite structure, another user accidentally grants you access to a confidential document that you should not be able to see. Although this is a major security risk, it may stay undiscovered for a long time because of the way the current SharePoint Search index works.

The current SharePoint Search index will index this document, its metadata and its security settings upon its next crawl, and this data will be stored in the SharePoint search index as usual. Unless you are actively searching for this document (which you probably don't know exists), there is a good chance that you'll never discover that a confidential document is hidden deep inside the subsite structure you have access to.

This kind of security risk changes drastically when Office Graph and Delve come into play. Of course, Office Graph will also find this document and add it to its internal database. In addition, Office Graph will create relations based on your coworkers' changes to the permissions and/or the document body, connect you with those coworkers, and weigh these relations internally.

If you have accidental access to a confidential document, there is a good chance that you will discover it very soon through Delve. Unlike classic search, Delve does not show results in response to an active request from the user. As soon as a user opens Delve, it shows what is currently trending around that user. If your colleagues are currently collaborating on this document, there is a good chance that Delve will show it as one of your most important documents, because it appears to be very important to your coworkers. That's how weighted relations work!
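To make the difference between the two models concrete, here is a minimal sketch in Python. It is a toy model only and does not reflect how Office Graph or Delve are actually implemented: the `Document` class, the scoring by summed colleague weights, and all names and numbers are assumptions made up for illustration.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    readers: set[str]                                         # users allowed to open the document
    recent_editors: list[str] = field(default_factory=list)   # one entry per recent edit


def keyword_search(docs, user, query):
    """Pull model: returns only permitted documents whose title matches the query."""
    return [d.title for d in docs
            if user in d.readers and query.lower() in d.title.lower()]


def trending_feed(docs, user, colleague_weights, top_n=3):
    """Push model: ranks permitted documents by weighted colleague activity.

    No query is involved; anything the user may read can surface.
    """
    scored = []
    for d in docs:
        if user not in d.readers:
            continue  # security trimming still applies
        score = sum(colleague_weights.get(e, 0.0) for e in d.recent_editors)
        if score > 0:
            scored.append((score, d.title))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # highest score first
    return [title for _, title in scored[:top_n]]


# Hypothetical data: "you" were granted access to the confidential plan by accident.
docs = [
    Document("Lunch menu", {"you", "anna", "ben"}, ["anna"]),
    Document("Restructuring plan (confidential)", {"you", "anna", "ben"}, ["anna", "ben", "ben"]),
    Document("Board minutes", {"anna"}, ["anna", "anna"]),
]
weights = {"anna": 0.9, "ben": 0.6}  # assumed strength of your relation to each colleague

print(keyword_search(docs, "you", "lunch"))
print(trending_feed(docs, "you", weights))
```

In this toy model, a keyword search for "lunch" never touches the confidential plan, while the trending feed puts it at the top because two close colleagues are editing it. The permission check is identical in both functions; only the way results reach the user differs.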

The same issue will occur if the document is saved to a file share that is crawled by SharePoint.

Don't get me wrong, there is nothing wrong with Delve or Office Graph. It is just a different way of presenting search results to a user, and we as users need to be aware of this!

To prevent this kind of accidental security issue, there is an option besides switching Delve off for your tenant in the Office 365 Admin Center. You can selectively stop Delve from showing certain documents or data in its results. All you need to do is declare a site column named 'HideFromDelve' and use it in lists or libraries to exclude specific documents and data from Delve. Mikael Svenson (MVP) explained this in detail in his post. If you want to know more, I encourage you to read his article. Keep in mind, however, that this only affects Delve as the front end of Office Graph. The mechanism does not affect Office Graph itself, which will still add the data to its index regardless of the site column. Third-party applications that also use search results derived from Office Graph will most likely ignore the column.
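The limitation described above can be sketched as a small toy model in Python. This is not the Office Graph API; `IndexedItem`, `delve_view` and `third_party_view` are hypothetical names, and modelling the 'HideFromDelve' column as a boolean flag is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class IndexedItem:
    title: str
    hide_from_delve: bool = False  # models the value of the 'HideFromDelve' column


def delve_view(index):
    """Front end that honors the flag: hidden items are filtered at display time."""
    return [item.title for item in index if not item.hide_from_delve]


def third_party_view(index):
    """Hypothetical consumer of the same index that never checks the flag."""
    return [item.title for item in index]


# Both items are in the index; the flag does not remove anything from it.
index = [
    IndexedItem("Team calendar"),
    IndexedItem("Salary review", hide_from_delve=True),
]

print(delve_view(index))
print(third_party_view(index))
```

The point of the sketch is that the flag is a display filter, not an access control: the hidden item is still indexed, and permissions on the document itself remain the only real protection.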

Despite this latent security issue, I still think that Office Graph is a great tool and an important step towards new search engines that rely on weighted relations to recognize metadata on their own.

Who knows, maybe one day users will be relieved of the tedious task of manual metadata entry.

The future remains exciting!
