Zend_Search_Lucene: handling queries - zend-framework

I am trying to implement a search engine on my site, using Zend_Search_Lucene.
The index is created like this:
public function create($config, $create = true)
{
$this->_config = $config;
// create a new index
if ($create) {
Zend_Search_Lucene_Analysis_Analyzer::setDefault(
new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive()
);
$this->_index = Zend_Search_Lucene::create(APPLICATION_PATH . $this->_config->index->path);
} else {
$this->_index = Zend_Search_Lucene::open(APPLICATION_PATH . $this->_config->index->path);
}
}
public function addToIndex($data)
{
$i = 0;
foreach ($data as $val) {
$scriptObj = new Sl_Model_Script();
$scriptObj->title = $val['title'];
$scriptObj->description = $val['description'];
$scriptObj->link = $val['link'];
$scriptObj->tutorials = $val['tutorials'];
$scriptObj->screenshot = $val['screenshot'];
$scriptObj->download = $val['download'];
$scriptObj->tags = $val['tags'];
$scriptObj->version = $val['version'];
$this->_dao->add($scriptObj);
$i++;
}
return $i;
}
/**
* Add to Index
*
* @param Sl_Interface_Model $scriptObj
*/
public function add(Sl_Interface_Model $scriptObj)
{
// UTF-8 for INDEX
$doc = new Zend_Search_Lucene_Document();
$doc->addField(Zend_Search_Lucene_Field::text('title', $scriptObj->title, 'utf-8'));
$doc->addField(Zend_Search_Lucene_Field::text('tags', $scriptObj->tags, 'utf-8'));
$doc->addField(Zend_Search_Lucene_Field::text('version', $scriptObj->version, 'utf-8'));
$doc->addField(Zend_Search_Lucene_Field::text('download', $scriptObj->download, 'utf-8'));
$doc->addField(Zend_Search_Lucene_Field::text('link', $scriptObj->link));
$doc->addField(Zend_Search_Lucene_Field::text('description', $scriptObj->description, 'utf-8'));
$doc->addField(Zend_Search_Lucene_Field::text('tutorials', $scriptObj->tutorials, 'utf-8'));
$doc->addField(Zend_Search_Lucene_Field::text('screenshot', $scriptObj->screenshot));
$this->_index->addDocument($doc);
}
But when I try to query the index with:
$index->find('Wordpress 2.8.1' . '*');
I get the following error:
"non-wildcard characters are required at the beginning of pattern."
Any ideas how to query for a string like mine? A query for "wordpress" works as expected.

Lucene cannot handle leading wildcards, only trailing ones. That is, it does not support queries like 'tell me everyone whose name ends with 'att'' which would be something like
first_name: *att
It only supports trailing wildcards, for example 'tell me everyone whose name starts with 'ma'':
first_name: ma*
See this Lucene FAQ entry:
http://wiki.apache.org/lucene-java/LuceneFAQ#head-4d62118417eaef0dcb87f4370583f809848ea695
There IS a workaround for Lucene 2.1 but the developers say it can be "expensive".
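A rough pre-check along these lines can catch such patterns before they reach find(). This is only a sketch: literalPrefixLength() and hasUsablePrefix() are hypothetical helpers, not part of Zend_Search_Lucene, the minimum prefix of 1 is an assumption, and the analyzer may split a term further than this check sees.

```php
<?php
// Hypothetical helpers, not part of Zend_Search_Lucene.

// Number of literal characters before the first wildcard ('*' or '?').
function literalPrefixLength(string $term): int
{
    return strcspn($term, '*?');
}

// True if the term has no wildcard, or enough literal characters before it.
function hasUsablePrefix(string $term, int $minPrefix = 1): bool
{
    $hasWildcard = strpbrk($term, '*?') !== false;
    return !$hasWildcard || literalPrefixLength($term) >= $minPrefix;
}

var_dump(hasUsablePrefix('ma*'));  // bool(true)
var_dump(hasUsablePrefix('*att')); // bool(false)
```

Raising $minPrefix makes the check stricter if your setup requires more literal characters before the wildcard.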

TYPO3 Extbase - Paginate through a large table (100000 records)

I have a fairly large table with about 100000 records. If I don't set the limit in the repository
Repository:
public function paginateRequest() {
$query = $this->createQuery();
$result = $query->setLimit(1000)->execute();
//$result = $query->execute();
return $result;
}
/**
* action list
*
* @return void
*/
public function listAction() {
$this->view->assign('records', $this->leiRepository->paginateRequest());
//$this->view->assign('records', $this->leiRepository->findAll());
}
... the query and the page break, although I'm using f:widget.paginate. As per the docs https://fluidtypo3.org/viewhelpers/fluid/master/Widget/PaginateViewHelper.html I was hoping that I could render only itemsPerPage records at a time and paginate through the rest ...
List.html
<f:if condition="{records}">
<f:widget.paginate objects="{records}" as="paginatedRecords" configuration="{itemsPerPage: 100, insertAbove: 0, insertBelow: 1, maximumNumberOfLinks: 10}">
<f:for each="{paginatedRecords}" as="record">
<tr>
<td><f:link.action action="show" pageUid="43" arguments="{record:record}"> {record.name}</f:link.action></td>
<td><f:link.action action="show" pageUid="43" arguments="{record:record}"> {record.lei}</f:link.action></td>
</tr>
</f:for>
</f:widget.paginate>
</f:if>
Model:
class Lei extends \TYPO3\CMS\Extbase\DomainObject\AbstractEntity {
...
/**
* abc
*
* @lazy
* @var string
*/
protected $abc = '';
...
In TYPO3 9.5 I use the following function in the repository:
public function paginated($page = 1, $size = 9){
$query = $this->createQuery();
$begin = ($page-1) * $size;
$query->setOffset($begin);
$query->setLimit($size);
return $query->execute();
}
In the controller I read the page to load from the request arguments in a REST action:
public function listRestAction()
{
$arguments = $this->request->getArguments();
$totalElements = $this->repository->total();
$pages = ceil($totalElements/9);
$next_page = '';
$prev_page = '';
#GET Page to load
if($arguments['page'] AND $arguments['page'] != ''){
$page_to_load = $arguments['page'];
} else {
$page_to_load = 1;
}
#Configuration of pagination
if($page_to_load == $pages){
$prev = $page_to_load - 1;
$prev_page = "http://example.com/rest/news/page/$prev";
} elseif($page_to_load == 1){
$next = $page_to_load + 1;
$next_page = "http://example.com/rest/news/page/$next";
} else {
$prev = $page_to_load - 1;
$prev_page = "http://example.com/rest/news/page/$prev";
$next = $page_to_load + 1;
$next_page = "http://example.com/rest/news/page/$next";
}
$jsonPreparedElements = array();
$jsonPreparedElements['info']['count'] = $totalElements;
$jsonPreparedElements['info']['pages'] = $pages;
$jsonPreparedElements['info']['next'] = $next_page;
$jsonPreparedElements['info']['prev'] = $prev_page;
$result = $this->repository->paginated($page_to_load);
$collection_parsed_results = array();
foreach ($result as $news) {
array_push($collection_parsed_results, $news->parsedArray());
}
$jsonPreparedElements['results'] = $collection_parsed_results;
$this->view->assign('value', $jsonPreparedElements);
}
The result is JSON like this:
{
"info": {
"count": 25,
"pages": 3,
"next": "",
"prev": "http://example.com/rest/news/page/2"
},
"results": [
{ ....}
] }
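The offset and next/prev arithmetic from the controller can be pulled into one small function, which also covers the case of a single page. This is a sketch only: paginationInfo() and its $baseUrl parameter are hypothetical names, and the array keys simply mirror the JSON above.

```php
<?php
// Hypothetical helper; keys mirror the "info" block of the JSON above.
function paginationInfo(int $total, int $page, int $size, string $baseUrl): array
{
    $pages = max(1, (int) ceil($total / $size));
    $page  = min(max(1, $page), $pages); // clamp out-of-range page numbers

    return [
        'count'  => $total,
        'pages'  => $pages,
        'offset' => ($page - 1) * $size, // value to pass to setOffset()
        'next'   => $page < $pages ? $baseUrl . '/' . ($page + 1) : '',
        'prev'   => $page > 1 ? $baseUrl . '/' . ($page - 1) : '',
    ];
}

$info = paginationInfo(25, 3, 9, 'http://example.com/rest/news/page');
echo json_encode($info, JSON_UNESCAPED_SLASHES), "\n";
```

With 25 records and 9 per page, page 3 is the last one, so "next" stays empty and "prev" points at page 2, as in the JSON above.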
How large / complex are the objects you want to paginate through? If they have subobjects that you don't need in the list view, add the @lazy annotation to those relations inside the model.
Given the large number of records, you should keep them as simple as possible in the list view. You can try passing the result to the list view as an array using $this->leiRepository->findAll()->toArray(), or return only the raw result from your repository by passing true to execute(), i.e. execute(true).
You can also create an array of list items yourself in a foreach in the controller and only add the properties you really need inside the list.
If your problem is performance, just use the default findAll() method.
The built-in defaultQuerySettings in \TYPO3\CMS\Extbase\Persistence\Repository set their offset and limit based on the Pagination widget, if not set otherwise.
If the performance issue persists, you may have to consider writing a custom query for your database request, that only requests the data your view actually displays. The process is described in the documentation: https://docs.typo3.org/typo3cms/ExtbaseFluidBook/6-Persistence/3-implement-individual-database-queries.html

How to prevent SQL injection in PhalconPHP when using sql in model?

Let's say I am building a search that finds all teachers, with an input where the user can enter the search term. I tried reading the Phalcon documentation but I only see things like binding parameters. I read the other thread about needing prepared statements; do I need those in Phalcon as well?
And my function in the model would be something like this:
public function findTeachers($q, $userId, $isUser, $page, $limit, $sort)
{
$sql = 'SELECT id FROM tags WHERE name LIKE "%' . $q . '%"';
$result = new Resultset(null, $this,
$this->getReadConnection()->query($sql, array()));
$tagResult = $result->toArray();
$tagList = array();
foreach ($tagResult as $key => $value) {
$tagList[] = $value['id'];
....
}
}
My question: in the Phalcon framework, are there any settings or formats I should use for this line: $sql = 'SELECT id FROM tags WHERE name LIKE "%' . $q . '%"';
Any general recommendations for preventing SQL injection in PhalconPHP controllers and index would also be appreciated.
For reference:
My controller:
public function searchAction()
{
$this->view->disable();
$q = $this->request->get("q");
$sort = $this->request->get("sort");
$searchUserModel = new SearchUsers();
$loginUser = $this->component->user->getSessionUser();
if (!$loginUser) {
$loginUser = new stdClass;
$loginUser->id = '';
}
$page = $this->request->get("page");
$limit = 2;
if (!$page){
$page = 1;
}
$list = $searchUserModel->findTeachers($q, $loginUser->id, ($loginUser->id)?true:false, $page, $limit, $sort);
if ($list){
$list['status'] = true;
}
echo json_encode($list);
}
My Ajax:
function(cb){
$.ajax({
url: '/search/search?q=' + mapObject.q + '&sort=<?php echo $sort;?>' + '&page=' + mapObject.page,
data:{},
success: function(res) {
//console.log(res);
var result = JSON.parse(res);
if (!result.status){
return cb(null, result.list);
}else{
return cb(null, []);
}
},
error: function(xhr, ajaxOptions, thrownError) {
cb(null, []);
}
});
with q being the user's search term.
You should bind the query parameter to avoid SQL injection. From what I can remember, Phalcon can be a bit funny about putting the '%' wildcard in the conditions string, so I put the wildcards in the bound value.
This would be better than just filtering the query.
$tags = Tags::find([
'conditions' => 'name LIKE :name:',
'bind' => [
'name' => "%" . $q . "%"
]
]);
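Binding keeps the input out of the SQL text, but a '%' or '_' typed by the user still acts as a LIKE wildcard inside the bound value. A minimal sketch of escaping them first: escapeLike() is a hypothetical helper, not a Phalcon function, and whether the backslash is honoured as the escape character depends on the database (some need an explicit ESCAPE clause).

```php
<?php
// Hypothetical helper, not part of Phalcon.
function escapeLike(string $term): string
{
    // Prefix '%', '_' and the backslash itself with a backslash.
    return addcslashes($term, '%_\\');
}

$q    = '100%_sure';
$bind = ['name' => '%' . escapeLike($q) . '%'];

echo $bind['name'], "\n"; // %100\%\_sure%
```

The resulting $bind array can be passed as the 'bind' option in the find() call above.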
Phalcon\Filter is helpful when interacting with the database.
In your controller you can say, remove everything except letters and numbers from $q.
$q = $this->request->get("q");
$q = $this->filter->sanitize($q, 'alphanum');
The shortest way for requests:
$q = $this->request->get('q', 'alphanum');

Yii Lucene Encoding

I can't find a solution here or anywhere else, so I'm asking another question about Zend Lucene. Everyone mentions setting some encoding for Lucene. Where should I set this encoding?
When I search (Polish language) I get
oprĂłcz wystÄ…pi reprezentacja Rosji. Mistrzowie
olimpijscy z Londynu powalczÄ…
This Ăł should be "ó" in Polish, Ä… (umlaut?) is "ą" and so on...
It works great with English of course.
Again searchController.php (actions create + search):
public function actionCreate()
{
$_indexFiles = 'runtime.search';
$index = Zend_Search_Lucene::create($_indexFiles);
$index = new Zend_Search_Lucene(Yii::getPathOfAlias('application.' . $this->_indexFiles), true);
$posts = News::model()->with('comment')->findAll();
foreach($posts as $news){
$doc = new Zend_Search_Lucene_Document();
$doc->addField(Zend_Search_Lucene_Field::Text('title',CHtml::encode($news->name), 'utf-8'));
$doc->addField(Zend_Search_Lucene_Field::Text('link',CHtml::encode($news->url), 'utf-8'));
$doc->addField(Zend_Search_Lucene_Field::Text('content',CHtml::encode($news->description), 'utf-8'));
$index->addDocument($doc);
}
setlocale(LC_CTYPE, 'pl_PL.utf-8');
$index->commit();
echo 'Lucene index created';
}
public function actionSearch()
{
Zend_Search_Lucene_Search_QueryParser::setDefaultEncoding('utf-8');
Zend_Search_Lucene_Analysis_Analyzer::setDefault(new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8_CaseInsensitive ());
$this->layout='column2';
if (($term = Yii::app()->getRequest()->getParam('q', null)) !== null) {
$index = new Zend_Search_Lucene(Yii::getPathOfAlias('application.' . $this->_indexFiles));
$results = $index->find($term);
$query = Zend_Search_Lucene_Search_QueryParser::parse($term);
$this->render('search', compact('results', 'term', 'query'));
}
}
Welcome to Zend_Lucene. When you get tired of it, you can start using a native search engine like Solr or Sphinx.
"Learn from the mistakes of others. You can't live long enough to make them all yourself."
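The garbled characters in the question are consistent with UTF-8 bytes being decoded as Windows-1250 somewhere between the index and the browser. That is an inference from the characters shown, not a confirmed diagnosis of this setup. A small sketch that reproduces the effect, assuming the iconv extension is available with CP1250 support:

```php
<?php
// "ó" and "ą" as UTF-8 byte sequences, separated by a space.
$utf8 = "\xC3\xB3 \xC4\x85";

// Misread those bytes as Windows-1250 and re-encode the result as UTF-8.
$garbled = iconv('CP1250', 'UTF-8', $utf8);
echo $garbled, "\n";

// Reversing the mistake restores the original bytes.
$restored = iconv('UTF-8', 'CP1250', $garbled);
var_dump($restored === $utf8);
```

If the output matches what the page shows, the place to look is wherever a single-byte charset is assumed (for example the page or connection charset), rather than the index itself.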

$this->fetchRow creates failure in phpunit in Zend framework

I followed Rob Allen's ZF 1 tutorial and wanted to pimp it up with some unit testing. But whenever I run the phpunit command, I get the message:
There was 1 failure:
1) IndexControllerTest::testDeleteAction
Failed asserting last controller used <"error"> was "Index"
/path/to/library/Zend/Test/PHPUnit/ControllerTestCase.php:1000
/path/to/tests/application/controllers/IndexControllerTest.php:55
FAILURES!
Tests: 4, Assertions: 9, Failures: 1.
The Action in question is the deleteAction and looks like this:
public function deleteAction() {
if ($this->getRequest()->isPost()) {
$del = $this->getRequest()->getPost('del');
if ($del == 'Yes') {
$id = $this->getRequest()->getPost('id');
$wishes = new Application_Model_DbTable_Wishes();
$wishes->deleteWish($id);
}
$this->_helper->redirector('index');
}
else {
$id = $this->_getParam('id', 0);
$wishes = new Application_Model_DbTable_Wishes();
$this->view->wish = $wishes->getWish($id);
}
}
I tracked the error down to $wishes->getWish($id); that function looks like this:
public function getWish($id) {
$id = (int) $id;
$row = $this->fetchRow('id = ' . $id);
if(!$row){
throw new Exception("Could not find row $id");
}
return $row->toArray();
}
It appears the line $row = $this->fetchRow('id = ' . $id); causes the problem, and I can't figure out why. All actions work just fine and do as expected. Any idea how to fix this?
Thanks!
Maybe try using the select() object instead of a plain string:
public function getWish($id) {
$id = (int) $id;
$select = $this->select();
$select->where('id = ?', $id);
$row = $this->fetchRow($select);
if(!$row){
throw new Exception("Could not find row $id");
}
return $row->toArray();
}
This is just a wild guess, but who knows. The only thing that looks at all odd is the lack of a placeholder (?) in the query string.
fetchRow() does like to work with the select() object; in fact, if you pass a string, the first thing fetchRow() does is build a select(). So maybe it just doesn't like the string.

Write config in Zend Framework with APPLICATION_PATH

For an application I'd like to create some kind of setup steps. In one of the steps the database configuration is written to the application.ini file. This all works, but something very strange happens: all the paths to the directories (library, layout, ...) are changed from paths using APPLICATION_PATH to full paths. As you can imagine, this isn't very system-friendly. Any idea how I can prevent that?
I update the application.ini with this code:
# read existing configuration
$config = new Zend_Config_Ini(
$location,
null,
array('skipExtends' => true,
'allowModifications' => true));
# add new values
$config->production->doctrine->connection = array();
$config->production->doctrine->connection->host = $data['server'];
$config->production->doctrine->connection->user = $data['username'];
$config->production->doctrine->connection->password = $data['password'];
$config->production->doctrine->connection->database = $data['database'];
# write new configuration
$writer = new Zend_Config_Writer_Ini(
array(
'config' => $config,
'filename' => $location));
$writer->write();
Since Zend_Config_Ini uses the default ini scanning mode (INI_SCANNER_NORMAL), it will parse all options and replace constants with their respective values. What you could do, is call parse_ini_file directly, using the INI_SCANNER_RAW mode, so the options aren't parsed.
i.e. use
$config = parse_ini_file('/path/to/your.ini', TRUE, INI_SCANNER_RAW);
You will get an associative array that you can manipulate as you see fit, and afterwards you can write that back with the following snippet (from the comments):
function write_ini_file($assoc_arr, $path, $has_sections=FALSE) {
$content = "";
if ($has_sections) {
foreach ($assoc_arr as $key=>$elem) {
$content .= "[".$key."]\n";
foreach ($elem as $key2=>$elem2) {
if(is_array($elem2))
{
for($i=0;$i<count($elem2);$i++)
{
$content .= $key2."[] = ".$elem2[$i]."\n";
}
}
else if($elem2=="") $content .= $key2." = \n";
else $content .= $key2." = ".$elem2."\n";
}
}
}
else {
foreach ($assoc_arr as $key=>$elem) {
if(is_array($elem))
{
for($i=0;$i<count($elem);$i++)
{
$content .= $key."[] = ".$elem[$i]."\n";
}
}
else if($elem=="") $content .= $key." = \n";
else $content .= $key." = ".$elem."\n";
}
}
if (!$handle = fopen($path, 'w')) {
return false;
}
// fwrite() returns 0 for empty content, so compare against false explicitly
$written = fwrite($handle, $content);
fclose($handle);
return $written !== false;
}
i.e. call it with:
write_ini_file($config, '/path/to/your.ini', TRUE);
after manipulating the $config array. Just make sure you add double quotes to the option values where needed...
Or alternatively - instead of using that function - you could try writing it back using Zend_Config_Writer_Ini, after converting the array back to a Zend_Config object, I guess that should work as well...
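To see the difference between the two scanner modes mentioned above, here is a small sketch that parses the same temporary file both ways. The path value and the includePaths.library key are example data only, not taken from the question's application.ini.

```php
<?php
define('APPLICATION_PATH', '/srv/app/application'); // example value only

// Write a one-option ini file to a temporary location.
$file = tempnam(sys_get_temp_dir(), 'ini');
file_put_contents(
    $file,
    "[production]\nincludePaths.library = APPLICATION_PATH \"/../library\"\n"
);

$normal = parse_ini_file($file, true, INI_SCANNER_NORMAL);
$raw    = parse_ini_file($file, true, INI_SCANNER_RAW);
unlink($file);

// Normal mode expands the constant; raw mode keeps its name in the value.
echo $normal['production']['includePaths.library'], "\n";
echo $raw['production']['includePaths.library'], "\n";
```

Because the raw value still contains the constant name, writing it back does not hard-code the absolute path.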
I guess you could iterate over the values, checking for a match with the value of APPLICATION_PATH, and replace it with the string literal APPLICATION_PATH.
That is if you know that APPLICATION_PATH contains the string '/home/david/apps/myapp/application' and you find a config value '/home/david/apps/myapp/application/views/helpers', then you do some kind of replacement of the leading string '/home/david/apps/myapp/application' with the string 'APPLICATION_PATH', ending up with 'APPLICATION_PATH "/views/helpers"'.
Kind of a kludge, but something like that might work.
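The replacement described above could be sketched like this. collapseApplicationPath() is a hypothetical helper, and the sketch only covers the string transformation; whether a given config writer then emits the value unquoted is not addressed here.

```php
<?php
// Hypothetical helper: turns an absolute path back into the
// APPLICATION_PATH "..." form used in application.ini.
function collapseApplicationPath(string $value, string $appPath): string
{
    $appPath = rtrim($appPath, '/');
    if ($appPath === '' || strpos($value, $appPath) !== 0) {
        return $value; // not below the application path, leave untouched
    }
    $rest = (string) substr($value, strlen($appPath));
    if ($rest !== '' && $rest[0] !== '/') {
        return $value; // only a partial match, e.g. ".../application2"
    }
    return $rest === '' ? 'APPLICATION_PATH' : 'APPLICATION_PATH "' . $rest . '"';
}

$appPath = '/home/david/apps/myapp/application';
echo collapseApplicationPath($appPath . '/views/helpers', $appPath), "\n";
// APPLICATION_PATH "/views/helpers"
```

Values that do not start with the application path, or that only share a partial directory name with it, are returned unchanged.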
This is a long shot - but have you tried running your Zend_Config_Writer_Ini code while the APPLICATION_PATH constant is not defined? It should interpret it as the literal string 'APPLICATION_PATH' and could possibly work.