Validating the Cloudera Search Deployment
After installing and deploying Cloudera Search, you can validate the deployment by indexing and querying sample documents. Before beginning this process, make sure you have access to the Apache Solr admin web console, as described in Creating Collections.

Creating a Solr Collection
- On a host running Solr Server, make sure that the SOLR_ZK_ENSEMBLE environment variable is set in /etc/solr/conf/solr-env.sh. For example:
$ cat /etc/solr/conf/solr-env.sh
export SOLR_ZK_ENSEMBLE=zk01.example.com:2181,zk02.example.com:2181,zk03.example.com:2181/solr
If you are using Cloudera Manager, this is automatically set on hosts with a Solr Server or Gateway role.
- Generate configuration files for the collection:
$ solrctl instancedir --generate $HOME/test_config
- Upload the configuration to ZooKeeper:
$ solrctl instancedir --create test_config $HOME/test_config
- Create a new collection with two shards by using the uploaded configuration directory:
$ solrctl collection --create test_collection -s 2 -c test_config
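The steps in this section can be collected into two small shell functions. This is a minimal sketch rather than a supported tool: the function names check_zk_ensemble and create_collection are hypothetical, and the final solrctl collection --list call is an added check that the new collection is registered. It assumes solrctl is on the PATH of a host running Solr Server.

```shell
# Print the SOLR_ZK_ENSEMBLE value exported in a solr-env.sh file,
# or fail with a message if no such export is present.
# Usage: check_zk_ensemble [path]  (defaults to the path used in this guide)
check_zk_ensemble() {
  local env_file="${1:-/etc/solr/conf/solr-env.sh}"
  local line
  # Keep the last matching export, since later assignments override earlier ones.
  line=$(grep -E '^[[:space:]]*export[[:space:]]+SOLR_ZK_ENSEMBLE=.+' "$env_file" 2>/dev/null | tail -n 1)
  if [ -z "$line" ]; then
    echo "SOLR_ZK_ENSEMBLE is not set in $env_file" >&2
    return 1
  fi
  # Print only the value after the first '='.
  echo "${line#*=}"
}

# Run the solrctl steps in order, stopping at the first failure.
# Arguments: collection name, config name, number of shards, local config directory.
create_collection() {
  local collection="$1" config="$2" shards="$3" config_dir="$4"
  solrctl instancedir --generate "$config_dir" &&
  solrctl instancedir --create "$config" "$config_dir" &&
  solrctl collection --create "$collection" -s "$shards" -c "$config" &&
  solrctl collection --list
}
```

To reproduce the commands above, run: check_zk_ensemble && create_collection test_collection test_config 2 "$HOME/test_config"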
Indexing Sample Data
Cloudera Search includes sample data for testing and validation. Run the following commands to index this data for searching. Replace search01.example.com in the example below with the name of any host running the Solr Server process.
- Parcel-based Installation:
$ cd /opt/cloudera/parcels/CDH/share/doc/solr-doc*/example/exampledocs
$ java -Durl=http://search01.example.com:8983/solr/test_collection/update -jar post.jar *.xml
- Package-based Installation:
$ cd /usr/share/doc/solr-doc*/example/exampledocs
$ java -Durl=http://search01.example.com:8983/solr/test_collection/update -jar post.jar *.xml
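Because the parcel-based and package-based commands differ only in the directory, the choice can be scripted. The following is a sketch under that assumption: the function names solr_update_url, find_exampledocs, and index_sample_data are hypothetical helpers, and the paths, port, and post.jar invocation are taken from the commands above.

```shell
# Build the update URL for a collection on a given Solr host (port 8983, as in this guide).
solr_update_url() {
  echo "http://$1:8983/solr/$2/update"
}

# Print the first existing exampledocs directory.
# With no arguments, check the parcel path first, then the package path.
find_exampledocs() {
  local dir
  if [ "$#" -eq 0 ]; then
    set -- /opt/cloudera/parcels/CDH/share/doc/solr-doc*/example/exampledocs \
           /usr/share/doc/solr-doc*/example/exampledocs
  fi
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      echo "$dir"
      return 0
    fi
  done
  echo "No exampledocs directory found" >&2
  return 1
}

# Index every sample XML file into the given collection.
# Arguments: Solr host name, collection name.
index_sample_data() {
  local host="$1" collection="$2" docs
  docs=$(find_exampledocs) || return 1
  # Run in a subshell so the caller's working directory is unchanged.
  (cd "$docs" && java -Durl="$(solr_update_url "$host" "$collection")" -jar post.jar *.xml)
}
```

For example: index_sample_data search01.example.com test_collection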
Querying Sample Data
Run a query to verify that the sample data is successfully indexed and that you are able to search it:
- Open the Solr admin web interface in a browser by accessing http://search01.example.com:8983/solr. Replace search01.example.com with the name of any host running the Solr Server process.
- Select Cloud from the left panel.
- Select one of the hosts listed for the test_collection collection.
- From the Core Selector drop-down menu in the left panel, select the test_collection shard.
- Select Query from the left panel and select Execute Query. If you see results such as the following, indexing was successful:
"response": {
  "numFound": 32,
  "start": 0,
  "maxScore": 1,
  "docs": [
    {
      "id": "SP2514N",
      "name": "Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133",
      "manu": "Samsung Electronics Co. Ltd.",
      "manu_id_s": "samsung",
      "cat": [
        "electronics",
        "hard drive"
      ],

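The same check can be made from the command line instead of the admin web interface. This sketch assumes an unsecured cluster (no Kerberos or TLS), that curl is installed, and that the collection is served by Solr's standard select handler on port 8983. The function names are hypothetical.

```shell
# Build a select URL that matches all documents but returns only the count (rows=0).
solr_count_url() {
  echo "http://$1:8983/solr/$2/select?q=*:*&rows=0&wt=json"
}

# Read a Solr JSON response on stdin and print the numFound value.
extract_num_found() {
  sed -n 's/.*"numFound":[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Query the collection and print how many documents it contains.
# Arguments: Solr host name, collection name.
count_documents() {
  curl -s "$(solr_count_url "$1" "$2")" | extract_num_found
}
```

For example, count_documents search01.example.com test_collection prints the document count. A nonzero value indicates that indexing succeeded. The sample response shown above reports 32 documents.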
Next Steps
After you have verified that Cloudera Search is installed and running properly, you can experiment with other methods of ingesting and indexing data:
- Preparing to Index Sample Tweets with Cloudera Search
- Using MapReduce Batch Indexing to Index Sample Tweets
- Near Real Time (NRT) Indexing Tweets Using Flume
- Using Hue with Cloudera Search
To learn more about Solr, see the Apache Solr Tutorial.
Page generated August 14, 2017.
©2016 Cloudera, Inc. All rights reserved