NOTE: If you are interested in using ManifoldCF with Solr, you may want to look at our Datafari software, which combines Apache ManifoldCF with Solr and eases this kind of integration. The code is available on GitHub: https://github.com/francelabs/datafari
Manifold CF (MCF) provides an early-binding authorization mechanism for file search. The aim of this entry is to describe this mechanism, and then to show you the steps needed to configure MCF and Solr to use this functionality.
MCF extracts ACLs from files at crawling-time, and injects them into Solr as specific fields for the Solr document.
At query time, an external application can query Solr for documents that are available for a specific user profile. Solr handles the query, contacts the authority service of MCF in order to ask for information on the authenticated user, such as its group membership. Solr then performs the query and filters the query results with this information.
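The query-time filtering described above can be pictured with a small sketch. This is a simplified illustration in Python, not the plugin's actual code: the function names level_allows and is_visible are hypothetical, the example tokens are made up, and the field names and the __nosecurity__ default are the ones added to the Solr schema later in this tutorial.

```python
NOSECURITY = "__nosecurity__"  # default value of the ACL fields when no ACL was indexed


def level_allows(user_tokens, allow, deny):
    """Check one ACL level (share or document) against the user's access tokens."""
    if user_tokens & set(deny):
        return False  # an explicit deny always wins
    return NOSECURITY in allow or bool(user_tokens & set(allow))


def is_visible(user_tokens, doc):
    """A document is visible only if both the share level and the document level allow it."""
    tokens = set(user_tokens)
    return all(
        level_allows(
            tokens,
            doc.get(f"allow_token_{level}", [NOSECURITY]),
            doc.get(f"deny_token_{level}", [NOSECURITY]),
        )
        for level in ("share", "document")
    )


# Made-up document: one token is allowed and another is denied at document level.
doc = {
    "allow_token_document": ["S-1-5-21-1000"],
    "deny_token_document": ["S-1-5-21-2000"],
}
print(is_visible(["S-1-5-21-1000"], doc))                   # True
print(is_visible(["S-1-5-21-1000", "S-1-5-21-2000"], doc))  # False
print(is_visible([], doc))                                  # False
```

In the real setup this decision is not made in application code: Solr applies it as a filter on the indexed token fields, using the tokens returned by the MCF authority service.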
The global architecture is described in the following diagram:
Please note that MCF can take care of the authorization mechanism, but authentication is not its job. You need an external application that deals with authentication and also secures access to Solr. These parts, which are neither Solr nor MCF specific, won’t be addressed in this tutorial.
- Prerequisites:
Install and configure Manifold CF 1.1.1 and Solr 4.1 as explained for instance in our Tutorial for combining ManifoldCF and Solr for files search.
- Aim of the tutorial:
Starting with what was already done during the first tutorial, the steps to enable security in file search are the following:
In Solr, we will have to modify the schema.xml to allow indexing of file ACLs, and add the MCF search plugin to verify access rights for a specific user at query time.
In MCF, we will have to change the configuration of the Windows Share connector and configure a new AD connector.
- Modification to Solr:
We need to modify the schema.xml to index file ACLs. To do that, add the following lines to your Solr schema:
<field name="allow_token_document" type="string" indexed="true" stored="true" multiValued="true" required="false" default="__nosecurity__" />
<field name="allow_token_share" type="string" indexed="true" stored="true" multiValued="true" required="false" default="__nosecurity__" />
<field name="deny_token_document" type="string" indexed="true" stored="true" multiValued="true" required="false" default="__nosecurity__" />
<field name="deny_token_share" type="string" indexed="true" stored="true" multiValued="true" required="false" default="__nosecurity__" />
In solrconfig.xml, add the MCF query parser that is used to get the access tokens of a user:
<!-- ManifoldCF document security enforcement component -->
<queryParser name="manifoldCFSecurity" class="org.apache.solr.mcf.ManifoldCFQParserPlugin">
  <str name="AuthorityServiceBaseURL">http://localhost:8080/mcf-authority-service</str>
</queryParser>
The implementation class is contained in the file apache-solr-mcf-4.0-SNAPSHOT.jar. You have to copy this jar from
apache-manifoldcf-1.1.1\solr-integration\solr-4.x
to your Solr lib directory:
solr-4.1.0\example\solr\lib
In the parameter AuthorityServiceBaseURL, specify the URL of the MCF authority service (which is by default host:port/mcf-authority-service).
Then apply the new query parser as a filter query in your default search handler, so that every query is filtered with the user's access tokens:
<lst name="appends">
  <str name="fq">{!manifoldCFSecurity}</str>
</lst>
This will filter the documents and remove those that cannot be read by the specified user.
You should now have something like this for the configuration of your search handler:
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">text</str>
  </lst>
  <lst name="appends">
    <str name="fq">{!manifoldCFSecurity}</str>
  </lst>
</requestHandler>
- Configuration of MCF:
Create a new Authority, select the connection type “Active Directory”, and enter your AD configuration in the Domain Controller tab, as shown in the following screenshot:
Change the configuration of your Windows Share connector to link it with your AD connector:
Create a crawl job as explained in the first tutorial and crawl the files.
You can now perform a search. To filter the results for a given user, add the parameter AuthenticatedUserName=username@domain to the query.
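A secured search request can be sketched as follows. This is only an illustration: the helper name secured_search_url, the Solr address http://localhost:8983/solr/collection1, the query term, and the user jdoe@example.com are assumptions, so replace them with your own values. Only the AuthenticatedUserName parameter and the /select handler come from the configuration above.

```python
from urllib.parse import urlencode


def secured_search_url(solr_core_url, query, username):
    """Build a /select URL that carries the authenticated user name."""
    params = {"q": query, "AuthenticatedUserName": username}
    return f"{solr_core_url.rstrip('/')}/select?{urlencode(params)}"


# Hypothetical Solr core URL, query and user.
url = secured_search_url("http://localhost:8983/solr/collection1", "report", "jdoe@example.com")
print(url)
# http://localhost:8983/solr/collection1/select?q=report&AuthenticatedUserName=jdoe%40example.com
```

The resulting URL can be opened in a browser or fetched with any HTTP client. Since anyone who can reach Solr can set this parameter freely, the request should be built by the external application that has authenticated the user.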
For testing purposes, you can also query the MCF authority service directly to get the SID of a specific user and the SIDs of the groups that user belongs to:
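As a minimal sketch of such a test, the snippet below builds the request URL and extracts the tokens from a response. Several things here are assumptions to check against your MCF version: the endpoint name UserACLs with its username parameter, the line-based response format with TOKEN: prefixes, and the sample response itself, which is made up. The helper names user_acls_url and parse_tokens are hypothetical.

```python
from urllib.parse import urlencode


def user_acls_url(authority_base_url, username):
    """URL of the authority service's UserACLs endpoint for one user (assumed endpoint name)."""
    query = urlencode({"username": username})
    return f"{authority_base_url.rstrip('/')}/UserACLs?{query}"


def parse_tokens(response_text):
    """Extract access tokens from a response made of 'TOKEN:<value>' lines (assumed format)."""
    prefix = "TOKEN:"
    return [
        line[len(prefix):].strip()
        for line in response_text.splitlines()
        if line.startswith(prefix)
    ]


print(user_acls_url("http://localhost:8080/mcf-authority-service", "jdoe@example.com"))
# http://localhost:8080/mcf-authority-service/UserACLs?username=jdoe%40example.com

# Made-up response, for illustration only.
sample = "AUTHORIZED:ad\nTOKEN:S-1-5-21-1000\nTOKEN:S-1-5-21-2000\n"
print(parse_tokens(sample))
# ['S-1-5-21-1000', 'S-1-5-21-2000']
```

If the tokens returned for a user do not match the values indexed in the allow_token_* and deny_token_* fields, that user's searches will not return the secured documents.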
That’s it, you now have a secured file search system, thanks to Manifold CF and Solr.
Don’t forget that France Labs is specialised in Manifold CF, Apache Lucene/Solr and Constellio. Contact us if you have any specific needs!