nutch 1.1 schema.xml - plugins

I recently downloaded the latest version of Nutch (nutch-1.1). While going through its code, I noticed that there is a conf/schema.xml file which defines the schema for the Solr part bundled with Nutch.
This schema.xml has fields for every plugin.
My question is: how do I find out what values a particular plugin is returning? In other words, if I use a third-party plugin (say plugin X) with Nutch and want to add a few fields to schema.xml, how do I figure out what "plugin X" is returning and whether each value is a string, an int, or an array?
My second question concerns conf/solrindex-mapping.xml, which is used by Nutch's SolrIndexer. This confuses me even more, since not all fields in schema.xml are in solrindex-mapping.xml.
To keep the answer simple, let's say plugin X is the feed plugin bundled with Nutch.

Access and inspect the Nutch Index in question

How to read .net assembly's meta data table

I'm reading Jeffrey Richter's book "CLR via C#". He mentions that the CLR metadata tables contain TypeRef and MemberRef sections.
I want to build a call graph by reading this MemberRef and TypeRef information. What is the best way to do so? After a search, I found that some people read the file directly using the PE/CLR file format, and others use the native IMetaDataImport interface. Are there any built-in .NET classes or third-party libraries for this?
Thanks!
If you want to view the metadata, you can use ILDasm.exe.
Example: ILDasm.exe D:\MyTestAsm.dll
Then: View -> MetaInfo -> Show!
If you want to read the data from code, you can look into the third-party library Mono.Cecil:
http://www.mono-project.com/docs/tools+libraries/libraries/Mono.Cecil/
Metadata is part of the IL code. If you want to see the IL code, see this question:
How to get access to embedded assembly's metadata using IMetaDataDispenser.OpenScope?

Birt Multi Sheet report using SpudSoft

I'm having some problems creating reports on my server without the default export engine.
I'm using SpudSoft to create them. I have the following configuration:
Tomcat 7
Birt 4.2.2
uk.co.spudsoft.birt.emitters.excel_0.8.0.201310230652.jar
And I followed this tutorial:
spudsoft-birt-excel-emitters
I haven't included this file:
lib/slf4j-api-1.6.2.jar
because it's not included in the *.jar file, and I haven't added this code either:
if( "XLS".equalsIgnoreCase( outputFileFormat ) ) {
    renderOption.setEmitterID( "uk.co.spudsoft.birt.emitters.excel.XlsEmitter" );
} else if( "XLSX".equalsIgnoreCase( outputFileFormat ) ) {
    renderOption.setEmitterID( "uk.co.spudsoft.birt.emitters.excel.XlsxEmitter" );
}
because I don't really know where to use it.
To run my report I use the following URL:
http://127.0.0.1:8090/birt-viewer/frameset?__format=xls&__report=informes/myReport.rptdesign&__emitterid=uk.co.spudsoft.birt.emitters.excel.XlsEmitter
and I get the following message:
org.eclipse.birt.report.service.api.ReportServiceException: EmitterID uk.co.spudsoft.birt.emitters.excel for render option is invalid.
What can I do to run a SpudSoft report? I've been reading for a week and I haven't found any solution!
Thanks a lot for all!
#Dominique,
I recommend upgrading from the emitter included with BIRT 4.3 (and given the lack of responses from the BIRT team I regret letting them put it in there).
Also, you don't need to use a specific IRenderOption type - they are all the same really anyway.
#Jota,
If you are getting that error it means that BIRT hasn't picked up the emitter correctly (you do have the correct emitter ID).
I don't use the BIRT war file, so my instructions aren't aimed at that approach (I just use the report engine in my own service).
The code snippet is no use for you, it's just a way to specify the emitter ID, which you are doing on the query string.
slf4j shouldn't be needed with the version of the emitter that you have - it uses JUL instead (I hate JUL, but it's one fewer dependency).
Can you post a complete listing of the jar files in your war?
It seems this is because you are using a generic IRenderOption. With the SpudSoft emitter you should instantiate your render options like this:
EXCELRenderOption excelOptions = new EXCELRenderOption();
Note that if you upgrade to BIRT 4.3 you no longer have to set the emitter; it is embedded.

Can't find Lucene.Net.Spatial.Tier namespace with the current version of Lucene.Net

While searching for a geolocation search implementation using Lucene.Net, I came across this article from leapinggorilla.com and downloaded the source code, but I have had no luck compiling it. I added the reference using NuGet, but still no luck, and if I browse the assembly in the Object Browser I can't find the namespace either.
Any suggestions to what I am missing?
Thanks
The spatial module in Lucene 3.x was found to be buggy and unmaintained, so it's gone as of Lucene 4.x. Lucene 4.x has a new spatial module that I developed with 2 others. If you download it, you should look at the "SpatialExample.java" in the tests (perhaps there's a .net equivalent). You also might want to watch the presentation I gave at Lucene/Solr Revolution, or simply flip through the slides:
http://www.lucenerevolution.org/2013/Lucene-Solr4-Spatial-Deep-Dive
Lucene.Net is at version 3.0.3, and the 3.x spatial module was dropped from it as well. The 4.x spatial module was backported from Java Lucene 4.x. You can view the source here and the unit tests here.
Unfortunately, that means that most of the older blog posts won't work directly with the new API. However, since most of the API calls should be the same as Java's, I would assume that any blog posts written for Java could be translated to .NET.
I have a Lucene.NET 3.0.3 solution which allows spatial search with ordering (from a centre point), within a circle of a given radius.
The answer is here on StackOverflow, and a full VS solution can be found on GitHub.
The key portion of code which drives the spatial search is this:
var spatialArgs = new SpatialArgs(SpatialOperation.Intersects, searchArea);
var spatialQuery = _strategy.MakeQuery(spatialArgs);
var valueSource = _strategy.MakeRecipDistanceValueSource(searchArea);
var valueSourceFilter = new ValueSourceFilter(new QueryWrapperFilter(spatialQuery), valueSource, 0, 1);
var filteredSpatial = new FilteredQuery(query, valueSourceFilter); // Restricts results to searchArea
var spatialRankingQuery = new FunctionQuery(valueSource); // Orders results by distance (closest first)
var bq = new BooleanQuery();
bq.Add(filteredSpatial, Occur.MUST);
bq.Add(spatialRankingQuery, Occur.MUST);
Please let me know if anything is unclear. I urge anyone curious to download and examine the full solution.

Replace éàçè... with equivalent "eace" In GWT

I tried
s=Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("[^\\p{ASCII}]", "");
But it seems that the GWT API doesn't provide such a function.
I also tried:
s=s.replace("é",e);
But it doesn't work either.
The scenario is that I am trying to generate a token from the clicked widget's text for history management.
You can take ASCII folding filter from Lucene and add to your project. You can just take foldToASCII() method from ASCIIFoldingFilter (the method does not have any dependencies). There is also a patch in Jira that has a full class for that without any dependencies - see here. It should be compiled by GWT without any problems. License should be also OK, since it is Apache License, but don't quote me on it - you should ask a real lawyer.
#okrasz, the foldToASCII() worked, but I found a shorter solution in: Transform a String to URL standard String in Java
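If only the common Latin-1 accents matter (as in the "éàçè" example), another option is a small hand-written lookup table that avoids java.text.Normalizer altogether. The following is only a minimal sketch: the class name AccentFolder and the method fold are hypothetical, not part of GWT or Lucene, and the table covers only a handful of Latin-1 letters, far fewer than foldToASCII(). It uses only String and StringBuilder, which I would expect GWT's emulated JRE to handle, but that is an assumption you should verify against your GWT version.

```java
public final class AccentFolder {
    // Pairs of (ASCII replacement, accented Latin-1 characters it stands for).
    // Escapes are used so the source does not depend on the file encoding.
    private static final String[] FOLDS = {
        "a", "\u00e0\u00e1\u00e2\u00e3\u00e4\u00e5",
        "c", "\u00e7",
        "e", "\u00e8\u00e9\u00ea\u00eb",
        "i", "\u00ec\u00ed\u00ee\u00ef",
        "n", "\u00f1",
        "o", "\u00f2\u00f3\u00f4\u00f5\u00f6",
        "u", "\u00f9\u00fa\u00fb\u00fc",
        "y", "\u00fd\u00ff",
        "A", "\u00c0\u00c1\u00c2\u00c3\u00c4\u00c5",
        "C", "\u00c7",
        "E", "\u00c8\u00c9\u00ca\u00cb",
        "I", "\u00cc\u00cd\u00ce\u00cf",
        "N", "\u00d1",
        "O", "\u00d2\u00d3\u00d4\u00d5\u00d6",
        "U", "\u00d9\u00da\u00db\u00dc",
        "Y", "\u00dd",
    };

    private AccentFolder() {}

    // Returns a copy of s with every character listed in FOLDS replaced by
    // its ASCII letter. Characters not in the table are left unchanged.
    public static String fold(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            out.append(foldChar(s.charAt(i)));
        }
        return out.toString();
    }

    private static char foldChar(char c) {
        if (c < 0x80) {
            return c; // plain ASCII needs no lookup
        }
        for (int i = 0; i < FOLDS.length; i += 2) {
            if (FOLDS[i + 1].indexOf(c) >= 0) {
                return FOLDS[i].charAt(0);
            }
        }
        return c; // unmapped character: keep it as-is
    }
}
```

With this, AccentFolder.fold("éàçè") gives "eace". Note that, unlike the regex in the question, unmapped non-ASCII characters are kept rather than stripped, so a history token may still need further cleaning.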

Does GemBox support column filters?

If I load an xlsx file that already has a filter, and then save the file using GemBox, it seems to throw my filtering cell away. Does GemBox support filtering at all? I know I can load the file in preserved mode but my intention is to create the filter in my C# app.
EDIT 2015-06-15:
The newer version of GemBox.Spreadsheet (version 3.9) does have API support for filters; see the version history page. You can also find an Excel AutoFilter example here.
ORIGINAL ANSWER:
Unfortunately, in the current version (3.5) GemBox.Spreadsheet supports filtering only through preservation.