Solr DataImportHandler ERROR DocBuilder Exception while processing - postgresql

I have been trying to get Solr DIH working with PostgreSQL for hours now and I cannot find the problem, as the logger doesn't tell me anything helpful.
My aim is simply to synchronize the data from the database with Solr (using the DIH).
My setup is as follows:
Jetty, Windows 8
solrconfig.xml (nothing changed except for the following)
[...]
<lib dir="../../../../dist/" regex="solr-dataimporthandler-.*\.jar" />
<lib dir="../../../../dist/" regex="sqljdbc4.*\.jar" />
<lib dir="../../../../dist/" regex="postgresql-.*\.jar" />
[...]
data-config.xml
<dataConfig>
<dataSource type="JdbcDataSource"
driver="org.postgresql.Driver"
url="jdbc:postgresql://localhost:5432/solrdih"
user="solrdih"
password="solrdih"
batchSize="100" />
<document>
<entity name="solrdih"
query="SELECT * FROM myTable">
<field column="id" name="id" />
</entity>
</document>
</dataConfig>
schema.xml (nothing changed except for the following)
[...]
<fields>
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="name" type="text" indexed="true" stored="true"/>
<field name="description" type="text" indexed="true" stored="true"/>
[...]
Calling http://localhost:8983/solr/solr/dataimport, I get the following error:
ERROR DocBuilder Exception while processing: solrdih document : SolrInputDocument[]:org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: SELECT * FROM myTable Processing Document # 1
ERROR DataImporter Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: SELECT * FROM myTable Processing Document # 1
Could someone please provide hints where to look for the error?
Thanks in advance!

So, this error came from all the way down in Postgres: everything works fine since I changed this line in pg_hba.conf from
host all all 127.0.0.1/32 md5
to
host all all 127.0.0.1/32 trust
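As a side note (my own illustration, not part of the original fix): each pg_hba.conf entry is just five whitespace-separated columns, and the change above only swaps the auth method in the last column. A minimal Python sketch of how such a line breaks down:

```python
# Illustrative only: split a pg_hba.conf entry into its five columns
# (connection type, database, user, client address, auth method).
def parse_hba_line(line):
    keys = ["type", "database", "user", "address", "method"]
    return dict(zip(keys, line.split()))

entry = parse_hba_line("host all all 127.0.0.1/32 trust")
print(entry["method"])  # -> trust
```

Note that trust disables password checking entirely for matching connections; it is convenient for local development, but keeping md5 and fixing the password configured in data-config.xml would be the safer route.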

Related

Solr with mongodb connection

Connect my mongodb database with solr cloud using mongo-connector?
Can anyone tell me the steps to connect to the MongoDB database and fetch data into Solr for indexing?
Step 1: Installing the Mongo Connector
To install the Mongo Connector, run:
pip install mongo-connector
Step 2: Creating a Solr Core
./bin/solr create -c <corename> -p 8983 -s 3 -rf 3
Step 3: Configuring Solr
The fields of the MongoDB documents to be indexed are specified in the schema.xml configuration file. Open schema.xml in an editor, e.g.:
vi /solr/solr-6.6.2/server/solr/configsets/data_driven_schema_configs/conf/schema.xml
Step 4: Mongo Connector also stores the metadata associated with each MongoDB document it indexes in the fields ns and _ts, so add the ns and _ts fields to schema.xml as well.
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.5">
<field name="time_stamp" type="string" indexed="true"  stored="true" 
multiValued="false" />
<field name="category" type="string" indexed="true"  stored="true" 
multiValued="false" />
<field name="type" type="string" indexed="true"  stored="true" 
multiValued="false" />
<field name="servername" type="string" indexed="true"  stored="true" 
multiValued="false" />
<field name="code" type="string" indexed="true"  stored="true" 
multiValued="false" />
<field name="msg" type="string" indexed="true"  stored="true" 
multiValued="false" />
<field name="_ts" type="long" indexed="true" stored="true" />
<field name="ns" type="string" indexed="true" stored="true"/>
<field name="_version_" type="long" indexed="true" stored="true"/>
</schema>
Step 5: We also need to configure the org.apache.solr.handler.admin.LukeRequestHandler request handler in solrconfig.xml.
Open solrconfig.xml in an editor:
vi ./solr-5.3.1/server/solr/configsets/basic_configs/conf/solrconfig.xml
Specify the request handler for the Mongo Connector.
<requestHandler name="/admin/luke"
class="org.apache.solr.handler.admin.LukeRequestHandler" />
Also configure autoCommit so that Solr commits the data from MongoDB automatically after the configured interval (here 15 seconds).
<autoCommit>
<maxTime>15000</maxTime>
<openSearcher>true</openSearcher>
</autoCommit>
Step 6: Restart Solr
bin/solr restart -force
Starting MongoDB Server
Mongo Connector requires a running MongoDB replica set in order to index MongoDB data in Solr. A replica set is a cluster of MongoDB servers that implements replication and automatic failover; it can consist of a single server. Below, the port is specified as 27017, the data directory for MongoDB as /data/db, and the replica set name rs0 is given with the --replSet option.
sudo mongod --port 27017 --dbpath /data/db --replSet rs0
Step 7: Starting MongoDB Shell
Start the MongoDB shell with the following command:
mongo
Once the shell starts, run the following command to initiate the replica set:
rs.initiate()
Step 8: Starting Mongo Connector and Indexing the MongoDB Database with Solr
Run the mongo-connector command as below:
mongo-connector --unique-key=id -n solr.wlslog -m localhost:27017 -t http://xx.xxx.xxx.xx:8983/solr/wlslog -d solr_doc_manager
In the above command:
solr.wlslog --> solr is the database name, wlslog is the collection name
solr/wlslog (in the Solr URL) --> wlslog is the core name
For future reference, see the link below:
https://www.toadworld.com/platforms/nosql/b/weblog/archive/2017/02/03/indexing-mongodb-data-in-apache-solr
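A note on the -n argument (my own illustration, not part of mongo-connector's documentation): its value is a MongoDB namespace of the form <database>.<collection>, which is why solr.wlslog above means database solr, collection wlslog. A small Python sketch of the split:

```python
# Illustrative only: a MongoDB namespace is "<database>.<collection>";
# only the first dot separates the two parts.
def split_namespace(ns):
    database, collection = ns.split(".", 1)
    return database, collection

db, coll = split_namespace("solr.wlslog")
print(db, coll)  # -> solr wlslog
```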

How to index data from mongodb to solr 4.7

Does anyone know how to index data from MongoDB into Solr? I've followed the previous procedure which was mentioned here.
Can anyone give a step-by-step procedure to fix this issue?
Here are the scripts.
In mydataconfig.xml:
<dataConfig>
<dataSource name="MyMongo" type="MongoDataSource" database="test" />
<document name="Products">
<entity processor="MongoEntityProcessor"
query="{'Active':1}"
collection="testusers"
datasource="MyMongo"
transformer="MongoMapperTransformer" >
<field column="name" name="name" mongoField="name"/>
<field column="position" name="position" mongoField="position"/>
</entity>
</document>
</dataConfig>
and in solrconfig.xml
<lib dir="../../lib/" regex="solr-mongo-importer.jar" />
<lib dir="../../lib/" regex="mongo.jar" />
......
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">mydataconfig.xml</str>
</lst> </requestHandler>
Note: when I checked the Solr admin console:
GET http://localhost:8983/solr/suggest/admin/ping?action=status&wt=json&_=1417082932028 : 503 service not available
and in the dataimport section:
TypeError: $(...).attr(...) is undefined in dataimport.js
It works well for the other cores I've created in the same Solr instance, which connect to a MySQL DB.

Writing Multiple Segments using BeanIO throws error as indeterminate size

Please help, as writing multiple segments throws the error
"A segment of indeterminate size may not follow another component of indeterminate size"
A sample XML config is:
<field name="noOfShipmentContents" type="Integer" />
<segment name="shipmentContentsPart2"
class="com.ShipmentContentsPart2"
collection="list" minOccurs="1" maxOccurs="unbounded">
<field name="shipmentContents" type="String" nillable="true" />
</segment>
<field name="noOfSpecialServices" type="Integer" />
<segment name="specialServicesPart3"
class="com.SpecialServicePart3"
collection="list" minOccurs="0" maxOccurs="unbounded">
<field name="chrgServCode" type="String" nillable="true" />
<field name="chrgAmt" type="String" nillable="true" />
</segment>
</record>
beanio.jar versions 2.0.7 and 2.1.0 both give the same error.
What JDK version?
1.6.0.35
I got an answer from Kevin, the developer of BeanIO (thanks): use occursRef="[name of field]" on segments whose occurrence count depends on a preceding field in the same record.
The trick is configuring:
<field name="noOfSpecialServices" type="Integer" />
<segment name="specialServicesPart3" class="com.SpecialServicePart3"
collection="list" occursRef="noOfSpecialServices">
This feature is available in BeanIO 2.1.x.
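To illustrate what occursRef achieves (a hand-rolled Python sketch, not BeanIO itself): the count read from the preceding field drives how many occurrences of the repeating segment are consumed from the record, so the parser never has to guess where an indeterminate-size segment ends:

```python
# Illustrative sketch of occursRef semantics (not BeanIO code):
# a preceding count field controls how many segment occurrences follow.
def read_record(fields):
    it = iter(fields)
    count = int(next(it))          # the "noOfSpecialServices" count field
    services = []
    for _ in range(count):         # occursRef-driven repetition
        services.append({"chrgServCode": next(it), "chrgAmt": next(it)})
    return services

print(read_record(["2", "FUEL", "10.00", "COD", "5.50"]))
# -> [{'chrgServCode': 'FUEL', 'chrgAmt': '10.00'},
#     {'chrgServCode': 'COD', 'chrgAmt': '5.50'}]
```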

Integrate MONGODB and SOLR

I have tried to integrate MongoDB and Solr using the Mongo Connector provided by MongoDB, which is running against a replica set configuration.
python2.7 mongo_connector.py -m localhost:27017 -t http://localhost:8983/solr -u _id -d ./doc_managers/solr_doc_manager.py
My output is
2013-06-19 16:19:10,943 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (post) with body 'u'<commit ' in 0.012 seconds.
But I could not configure Solr to get the documents from MongoDB. Please help me configure Solr to pull the documents from MongoDB. Should I use a SolrMongoImporter?
I was facing the same problem and was not able to solve it, but I found an interesting link:
http://derickrethans.nl/mongodb-and-solr.html
It connects Mongo with Solr through a PHP script.

Sharepoint 2010 list schema deployment

I have created a list schema definition and list instance in VS2010. I have a feature that deploys both the list definition and instance, plus a feature stapler which activates the new feature for each new sub site.
My list definition schema.xml is:
<Fields>
<Field Name="StartDate" Type="DateTime" Required="FALSE" DisplayName="Start Date" StaticName="StartDate" ID="9ea1256f-6b67-43b0-8ab7-1d643bf8a834" SourceID="http://schemas.microsoft.com/sharepoint/v3" ColName="datetime1" RowOrdinal="0" />
<Field Name="EndDate" Type="DateTime" Required="FALSE" DisplayName="End Date" StaticName="EndDate" ID="900503fa-4ab1-4938-be75-b40694ab97b6" SourceID="http://schemas.microsoft.com/sharepoint/v3" ColName="datetime2" RowOrdinal="0" />
</Fields>
I deploy successfully and create a new site using my site definition; the list gets created successfully and everything works.
Now I want to add another field to my list, so I go back to Visual Studio 2010, edit the list definition schema.xml, and add another field in the metadata fields section.
The schema.xml is now:
<Fields>
<Field Name="StartDate" Type="DateTime" Required="FALSE" DisplayName="Start Date" StaticName="StartDate" ID="9ea1256f-6b67-43b0-8ab7-1d643bf8a834" SourceID="http://schemas.microsoft.com/sharepoint/v3" ColName="datetime1" RowOrdinal="0" />
<Field Name="EndDate" Type="DateTime" Required="FALSE" DisplayName="End Date" StaticName="EndDate" ID="900503fa-4ab1-4938-be75-b40694ab97b6" SourceID="http://schemas.microsoft.com/sharepoint/v3" ColName="datetime2" RowOrdinal="0" />
<!-- New Field -->
<Field Name="TestRedeploy" Type="Text" Required="FALSE" DisplayName="TestRedeploy" StaticName="TestRedeploy" RichText="True" Sortable="FALSE" ID="A5656659-CD3E-4C84-AEAC-554DCE25434B" SourceID="http://schemas.microsoft.com/sharepoint/v3" ColName="ntext3" RowOrdinal="0" />
</Fields>
I build and deploy successfully, but when I go to list settings to check whether the new column was added, I find that all columns have been deleted. Can you help me figure out how to deploy new columns with schema.xml?
You should try reinstalling the feature that deploys your list.
Go to the SharePoint 2010 Management Shell and run:
Install-SPFeature -Path "<feature folder name in the 14 hive>" -Force
After this, do an IISRESET and reload the page. This should be enough for the field to become visible.
By the way, you should never include ColName and RowOrdinal values in your XML; SharePoint provides these automatically when the field is deployed. One problem you might face with the current deployment is that there is already a list field mapped to ColName="ntext3".