Wednesday, 14 May 2014

How to Query Apache Solr

In this post I will show how to query Apache Solr using its Dashboard screen. The same queries can also be sent with the Java HttpClient library or with curl. Since my last four posts introduced the Solr Dashboard, I will run the different kinds of queries from the Dashboard here. We will do the same with Java code in my next post.

Let's update your schema.xml file with the mappings given below and start your Apache Solr server:

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example core zero" version="1.1">
  <fields>
    <field name="_version_" type="long" indexed="true" stored="true"/>
    <field name="_root_" type="string" indexed="false" stored="false"/>
    <field name="id" type="string" indexed="true" stored="true" required="true" />
    <field name="name" type="string" indexed="true" stored="true" />
    <field name="address" type="string" indexed="true" stored="true" />
    <field name="comments" type="string" indexed="true" stored="true" />
    <field name="text" type="string" indexed="true" stored="false" multiValued="true"/>
    <field name="popularity" type="long" indexed="true" stored="true" multiValued="false"/>
    <field name="counts" type="long" indexed="true" stored="true" />
    <dynamicField name="*_i" type="string" indexed="true" stored="true" />
  </fields>

  <uniqueKey>id</uniqueKey>

  <copyField source="name" dest="text"/>
  <copyField source="address" dest="text"/>
  <copyField source="comments" dest="text"/>

  <types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
  </types>
</schema>

It's time to add more records to Apache Solr. Go to
Solr Dashboard -> Select Collection1 -> Documents
and save all of these records one by one.
{
"id": "Solr101",
"name": "Solr version 4.7.2",
"address": "House No - 100, LR Apache, 40702",
"comments": "Apache Solr It's Cool.",
"dynamicField_i": "It is dynamically generated field."
}
{
"id": "Solr102",
"name": "Solr SECOND RECORD",
"dynamicField_i": "It is dynamically generated field FOR SECOND RECORD."
}
{
"id": "Solr103",
"name": "Solr THIRD RECORD",
"dynamicField_i": "It is dynamically generated field FOR THIRD RECORD."
}
{
"id": "Solr104",
"name": "Solr FOURTH RECORD",
"dynamicField_i": "It is dynamically generated field FOR FOURTH RECORDS."
}

Go to the Query tab and click Execute Query. With the default query, Solr returns all the records you have indexed.
This screen has a lot of text fields, and I am going to introduce all of them.

q Field (stands for Query, default *:*)
The first * denotes the <field Name>.
The second * denotes the text to be searched for in that field. The default *:* therefore matches every value in every field, i.e. all documents.
Ex. Type id:Solr102 in this textbox and click the Execute Query button. Solr will search for the string "Solr102" in the <id> field and return all the results matching this criterion.
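
The same query can also be expressed as a request URL. Here is a minimal sketch that only builds that URL with Python's standard library and does not contact Solr. The host, port, and core name in `SOLR_BASE` are assumptions based on a default local Solr 4.x example install, and `build_select_url` is a helper name made up for this post.

```python
from urllib.parse import urlencode

# Assumed base URL of a default local Solr 4.x example core; change as needed.
SOLR_BASE = "http://localhost:8983/solr/collection1"


def build_select_url(query="*:*", base=SOLR_BASE):
    """Return a /select URL for the given q value, asking for JSON output."""
    params = {"q": query, "wt": "json"}
    # urlencode percent-encodes characters such as '*' and ':'.
    return base + "/select?" + urlencode(params)


# q=*:* matches every document; q=id:Solr102 matches only that id.
print(build_select_url())
print(build_select_url("id:Solr102"))
```

Note that the special characters of the query are percent-encoded in the URL, so `*:*` is sent as `%2A%3A%2A`.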

fq Field (stands for Filter Query)
This is used as a query filter: it restricts the results of the main query string without affecting their scores. Solr caches the result of each filter query separately from the main query, so a later request that reuses the same fq is answered from the filter cache.
The fq parameter can be specified multiple times by pressing the "+" sign at the right of the text box. The response contains only the documents that match every filter, i.e. the intersection of the filters. Ex. enter popularity:10 in one fq box and counts:140 in a second one.
It will fetch the records where popularity is 10 and counts is 140. The same restriction can be written as a single filter query:
fq=+popularity:10 +counts:140
At the top right of the result area the Dashboard shows the request URL as a link. Click this link and it will open a new browser tab and show the result there. This means you do not need the Dashboard screen to get the same result: you can write your query directly in the browser's address bar and Solr will return the result of your query.
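
To see how repeated fq values look in such a URL, here is a small sketch using only Python's standard library. The base URL is an assumption (a default local Solr 4.x example core) and `build_filtered_url` is a hypothetical helper. It also shows why the combined form needs care: a raw "+" in a URL means a space, so a literal plus sign has to be sent as `%2B`.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Assumed base URL of a default local Solr 4.x example core; change as needed.
SOLR_BASE = "http://localhost:8983/solr/collection1"


def build_filtered_url(query, filters, base=SOLR_BASE):
    """Return a /select URL with one fq parameter per entry in filters."""
    # A list of pairs lets the same key (fq) appear more than once.
    params = [("q", query)] + [("fq", f) for f in filters]
    return base + "/select?" + urlencode(params)


# Two separate filters, as entered in two fq boxes on the Dashboard.
separate = build_filtered_url("*:*", ["popularity:10", "counts:140"])
print(separate)

# Decoding the URL again shows that both filters survive as separate values.
print(parse_qs(urlsplit(separate).query)["fq"])

# The single-filter form; urlencode turns '+' into %2B and the space into '+'.
combined = build_filtered_url("*:*", ["+popularity:10 +counts:140"])
print(combined)
```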

sort Field
This field sets the order in which the matching documents are returned.
Ex. id desc
Note that here I am sorting the documents on the basis of id, in descending order.
Note :- the syntax is <fieldName><space><sorting order, i.e. asc or desc>
You can have multiple sort clauses, separated by commas. Say you have three clauses: the second is evaluated only when two documents tie on the first, and the third is evaluated only when they tie on both the first and the second.
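
The tie-breaking rule can be illustrated without Solr at all. The sketch below sorts a few hypothetical documents in plain Python the way a sort value of `popularity desc,id asc` would order them. The popularity values are invented for this illustration and are not part of the records saved earlier.

```python
# Hypothetical documents, invented only to illustrate the ordering rule.
docs = [
    {"id": "Solr103", "popularity": 5},
    {"id": "Solr101", "popularity": 10},
    {"id": "Solr104", "popularity": 5},
    {"id": "Solr102", "popularity": 10},
]

# Same idea as sort=popularity desc,id asc:
# id is consulted only when two documents share the same popularity.
ordered = sorted(docs, key=lambda d: (-d["popularity"], d["id"]))

print([d["id"] for d in ordered])
```

The two documents with popularity 10 come first, and within each popularity group the ids appear in ascending order.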

start, rows Fields
start is the offset from which the fetching of records begins; it counts from 0. rows is the number of records to be fetched. Ex.
if start=10, rows=20
then Solr skips the first 10 matching records and returns the next 20, i.e. the records at offsets 10 to 29.
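
Since start is an offset, paging through results is simple arithmetic. Here is a minimal sketch that mimics the window on a plain Python list. The helper `page_window` and the list of 100 stand-in results are assumptions made for illustration.

```python
def page_window(page, page_size):
    """Return the (start, rows) pair for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return (page - 1) * page_size, page_size


# Stand-in for 100 ranked matches, identified by their offsets 0..99.
results = list(range(100))

# The example from the text: start=10, rows=20.
start, rows = 10, 20
window = results[start:start + rows]
print(window[0], window[-1], len(window))

# Page 2 with 10 records per page starts at offset 10.
print(page_window(2, 10))
```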

fl Field (stands for Field List)
It restricts the response to the listed fields. The fields are separated by commas.
Ex. name,address
It will return only the name and address fields in the response.
Name aliasing can also be done:
Syntax- <Alias Name>:<fieldName>
Ex. id,UserName:name
Here Solr will return the result with two fields: one is id and the second is UserName, which is used as an alias of the <name> field.
You can also use the * wildcard when listing the fields to return:
Ex. id,add*
Description- It will return id and all those fields whose names start with the "add" string.
A function can also be returned in the response:
Ex. id,reviews:sum(popularity,counts)
Description- It will return two fields: one is id and the second is reviews, which is the sum of popularity and counts.
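
The fl variants above (plain names, a wildcard, an alias, and a function) can all be combined in one value. Here is a small sketch that assembles such a value in Python. `build_fl` is a hypothetical helper written for this post, not part of any Solr client library.

```python
from urllib.parse import urlencode


def build_fl(fields=(), aliases=None):
    """Join plain field names and alias:source pairs into one fl value."""
    parts = list(fields)
    for alias, source in (aliases or {}).items():
        # An alias is written as <alias>:<field or function>.
        parts.append(f"{alias}:{source}")
    return ",".join(parts)


fl = build_fl(
    ["id", "add*"],
    {"UserName": "name", "reviews": "sum(popularity,counts)"},
)
print(fl)

# The value as it would appear in a query string.
print(urlencode({"q": "*:*", "fl": fl}))
```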

df Field (stands for Default Field)
This field names the default search field. If you enter only text in the search query section (q section), without a field name, and name a field in the df textbox, then Solr searches for that text only in that field and not in any other field.
Ex. type Solr104 in the q section
and type id in the df field. Solr will then search for Solr104 in the id field.
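
omitHeader (default value false)
This parameter omits the header from the response returned by Apache Solr.
If omitHeader=true, Solr leaves the responseHeader section (status, query time, and the echoed parameters) out of the response.
Ex. add omitHeader=true to the query URL in your browser window and you will get a response without these additional details.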
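
debug (default false)
You can debug your query by using this parameter. With debug=true Solr adds a debug section to the response, which includes the parsed query and an explanation of how each returned document was scored.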
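
Client code that reads the response should not depend on the header being present. The sketch below parses two hypothetical JSON response bodies shaped like Solr's wt=json output, one with the header and one as it would look with omitHeader=true. The bodies and the helper `extract_docs` are assumptions written for illustration, not captured Solr output.

```python
import json

# Hypothetical response bodies shaped like Solr's JSON output (wt=json).
WITH_HEADER = (
    '{"responseHeader": {"status": 0, "QTime": 1},'
    ' "response": {"numFound": 1, "start": 0, "docs": [{"id": "Solr104"}]}}'
)
WITHOUT_HEADER = (
    '{"response": {"numFound": 1, "start": 0, "docs": [{"id": "Solr104"}]}}'
)


def extract_docs(body):
    """Return (header, docs); header is None when it was omitted."""
    data = json.loads(body)
    header = data.get("responseHeader")  # missing when omitHeader=true
    return header, data["response"]["docs"]


for body in (WITH_HEADER, WITHOUT_HEADER):
    header, docs = extract_docs(body)
    print(header is not None, [d["id"] for d in docs])
```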