Zend Framework 1.11 Gdata Spreadsheets insertRow very slow - zend-framework

I'm using insertRow to populate an empty spreadsheet. It starts off taking about 1 second to insert a row, then slows to around 5 seconds per row after 150 rows or so.
Has anyone experienced this kind of behaviour?
There aren't any calculations on the data in the spreadsheet that could be getting longer with more data.
Thanks!

I'll try to be brief.
If you take a look at the class Zend_Gdata_Spreadsheets, you'll see that the insertRow() method is written in a very suboptimal way. See:
public function insertRow($rowData, $key, $wkshtId = 'default')
{
    $newEntry = new Zend_Gdata_Spreadsheets_ListEntry();
    $newCustomArr = array();
    foreach ($rowData as $k => $v) {
        $newCustom = new Zend_Gdata_Spreadsheets_Extension_Custom();
        $newCustom->setText($v)->setColumnName($k);
        $newEntry->addCustom($newCustom);
    }
    $query = new Zend_Gdata_Spreadsheets_ListQuery();
    $query->setSpreadsheetKey($key);
    $query->setWorksheetId($wkshtId);
    $feed = $this->getListFeed($query);
    $editLink = $feed->getLink('http://schemas.google.com/g/2005#post');
    return $this->insertEntry($newEntry->saveXML(), $editLink->href, 'Zend_Gdata_Spreadsheets_ListEntry');
}
In short, on every call it fetches your entire list feed just to learn the value of $editLink->href before posting the new row, so each insert gets slower as the spreadsheet grows.
The cure is to avoid insertRow() altogether.
Instead, fetch $editLink->href once and then insert each new row by reproducing the rest of the method's behaviour yourself. That is, instead of $service->insertRow(), use the following:
// get your $editLink once:
$query = new Zend_Gdata_Spreadsheets_ListQuery();
$query->setSpreadsheetKey($key);
$query->setWorksheetId($wkshtId);
$query->setMaxResults(1);
$feed = $service->getListFeed($query);
$editLink = $feed->getLink('http://schemas.google.com/g/2005#post');
....
// instead of $service->insertRow():
$newEntry = new Zend_Gdata_Spreadsheets_ListEntry();
foreach ($rowData as $k => $v) {
    $newCustom = new Zend_Gdata_Spreadsheets_Extension_Custom();
    $newCustom->setText($v)->setColumnName($k);
    $newEntry->addCustom($newCustom);
}
$service->insertEntry($newEntry->saveXML(), $editLink->href, 'Zend_Gdata_Spreadsheets_ListEntry');
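To spell out the reuse, here is a sketch of the full loop; $rows is a hypothetical array of associative row arrays, and the edit link fetched above is reused for every insert:
// Reuse the cached edit link for each row instead of refetching the feed.
$editHref = $editLink->href;
foreach ($rows as $rowData) {
    $entry = new Zend_Gdata_Spreadsheets_ListEntry();
    foreach ($rowData as $k => $v) {
        $custom = new Zend_Gdata_Spreadsheets_Extension_Custom();
        $custom->setText($v)->setColumnName($k);
        $entry->addCustom($custom);
    }
    $service->insertEntry($entry->saveXML(), $editHref, 'Zend_Gdata_Spreadsheets_ListEntry');
}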
Don't forget to upvote this answer; it cost me a few days to figure out. I think ZF is great, but sometimes you don't want to rely on its code too much when it comes to resource optimization.

Related

Sort child objects by a specific value

I'm currently trying to sort all the child objects I can get by their validity time.
My current code is not sorting anything:
$children = $this->productRepository->findByParent($product->getNumber());
foreach ($children as $child) {
    //debug::debug($child->getValidityPeriod());
    $product->addChild($child);
}
As you can already see in the debug call, $child->getValidityPeriod() returns the needed info as an integer. I thought there must be a way to sort this object array by validityPeriod, but how do I manage that? Do I need to create a new method and filter there, or can I use something before the foreach loop?
Pseudocode along these lines:
$children = $this->productRepository->findByParent($product->getNumber());
ObjectSortFunction($children, 'getvalidityPeriod');
...and then for each...
Would be awesome if somebody would help me!
Okay, after several tries I have a workaround that can be used as an answer as well ;)
I wrote this method in my repository:
public function findByParentSorted($parentId)
{
    $query = $this->createQuery();
    $query->matching($query->equals('parent', $parentId));
    $query->setOrderings(
        [
            'validity_period' => \TYPO3\CMS\Extbase\Persistence\QueryInterface::ORDER_ASCENDING
        ]
    );
    return $query->execute();
}
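For completeness, the in-PHP sorting the question's pseudocode hinted at would also work once the children are loaded; a minimal sketch, assuming getValidityPeriod() returns an integer as described:
// Convert the Extbase QueryResult to a plain array, then sort it
// by validity period before adding the children.
$children = $this->productRepository->findByParent($product->getNumber())->toArray();
usort($children, function ($a, $b) {
    return $a->getValidityPeriod() - $b->getValidityPeriod();
});
foreach ($children as $child) {
    $product->addChild($child);
}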

Magento error when trying to duplicate a product

I am using Magento 1.7 and have got an issue I can't explain. I open a product in the backend for editing, click on Duplicate, and then I get the following error:
Warning: Illegal string offset 'new_file' in D:\wamp\www\easyshop\app\code\core\Mage\Catalog\Model\Product\Attribute\Backend\Media.php on line 158
When I try the following code to debug the file:
print_r($newImages);
die;
then I get the following data:
Array
(
    [/s/a/samsung_galaxy_s2_front1.jpg] => /s/a/samsung_galaxy_s2_front1_4.jpg
    [/s/g/sgs2p1.jpg] => /s/g/sgs2p1_4.jpg
    [/s/g/sgs2_11.jpg] => /s/g/sgs2_11_4.jpg
    [/s/g/sgs2-4386.jpg] => /s/g/sgs2-4386_4.jpg
)
I think the array keys are wrong. Can you please give a solution to this problem?
I had the same problem on 1.7.0.2. The solution I found was to change Magento's (IMHO) buggy code.
In Mage_Catalog_Model_Product_Attribute_Backend_Media I changed the lines where you find:
// For duplicating we need copy original images.
$duplicate = array();
foreach ($value['images'] as &$image) {
    if (!isset($image['value_id'])) {
        continue;
    }
    $duplicate[$image['value_id']] = $this->_copyImage($image['file']);
    $newImages[$image['file']] = $duplicate[$image['value_id']];
}
to:
// For duplicating we need copy original images.
$duplicate = array();
foreach ($value['images'] as &$image) {
    if (!isset($image['value_id'])) {
        continue;
    }
    $duplicate[$image['value_id']] = $this->_copyImage($image['file']);
    $newImages[$image['file']] = array();
    $newImages[$image['file']]['new_file'] = $duplicate[$image['value_id']];
    $newImages[$image['file']]['label'] = $image['label'];
}
It did the trick for me... images are now properly duplicated and enabled on the new product.
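A side note on applying this kind of fix: rather than editing the file under app/code/core in place, standard Magento practice is to copy it to the local code pool, which Magento's include path checks first, so the change survives upgrades:
app/code/local/Mage/Catalog/Model/Product/Attribute/Backend/Media.php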

Best practice for MongoDB bulk inserts in Symfony2

In my Symfony2 command, I am running a script that inserts hundreds of thousands of URLs (as strings) into documents.
Here are the basic structures of the two documents I'm using. Before the program runs, there are thousands of ParentDocuments already in MongoDB, but zero ChildDocuments:
ParentDocument:
    $id: id
    $subDocument: OneToManyReference(ChildDocument)
    $etc: everything else
ChildDocument:
    $id: id
    $url: string
    $parentDocument: ManyToOneReference(ParentDocument)
And my Command code:
$dm = $this->getContainer()->get('doctrine_mongodb.odm.document_manager');
$parentDocuments = $dm->getRepository('My:Bundle:ParentDocument')->findAll();
while ($parentDocument = $parentDocuments->getNext()) {
    // Returns an array of hundreds of thousands of urls
    $urls = $this->somehowFetchUrlsRelatedToTheParentDocument($parentDocument);
    foreach ($urls as $url) {
        $subDocument = new SubDocument();
        $subDocument->setUrl($url);
        $subDocument->setParentDocument($parentDocument);
        $dm->persist($subDocument);
    }
    $dm->flush();
}
When I run this simple command, the write speed is incredibly fast at first. However, when inserting millions of rows, the writes become significantly slower, as slow as one write per second after the command has been running for 10 minutes, which makes the code extremely inefficient.
My first attempt at fixing this was to clear the document manager right after each flush with $dm->clear().
But that meant the document manager would lose track of the current ParentDocument, so my solution was this:
$dm = $this->getContainer()->get('doctrine_mongodb.odm.document_manager');
$parentDocumentCursors = $dm->getRepository('My:Bundle:ParentDocument')->findAll();
$parentDocuments = array();
while ($parentDocument = $parentDocumentCursors->getNext()) {
    array_push($parentDocuments, $parentDocument);
}
$dm->clear();
unset($dm);
$dm = $this->getContainer()->get('doctrine_mongodb.odm.document_manager');
foreach ($parentDocuments as $parentDocument) {
    $urls = $this->somehowFetchUrlsRelatedToTheParentDocument($parentDocument);
    foreach ($urls as $url) {
        $subDocument = new SubDocument();
        $subDocument->setUrl($url);
        $subDocument->setParentDocument($parentDocument);
        $dm->persist($subDocument);
    }
    $dm->flush();
    $dm->clear();
}
This solved the problem. The write speeds were consistently fast throughout the program's entire execution, and millions of rows were inserted without any gradual slowdown.
However, this feels like bad practice and a quick hack. What is the best practice for inserting millions of rows in Symfony2 using the document manager without read/write speeds becoming slow?
I would avoid using Doctrine's document manager and use the batchInsert() function directly. This is described in the documentation at http://php.net/manual/en/mongocollection.batchinsert.php. It feels to me like Doctrine's ODM is actually hurting you here.
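For illustration, here is a minimal sketch of what that could look like with the legacy Mongo PHP driver; the database and collection names are placeholders, and note that the document layout (field names, reference format) must match what your ODM mappings produce:
// Sketch: bypass the ODM and insert one parent's sub-documents in one call.
// 'mydb' and 'SubDocument' are placeholder names.
$mongo = new MongoClient();
$collection = $mongo->selectDB('mydb')->selectCollection('SubDocument');
$batch = array();
foreach ($urls as $url) {
    $batch[] = array(
        'url' => $url,
        // Keep a reference back to the parent document.
        'parentDocument' => MongoDBRef::create('ParentDocument', $parentDocument->getId()),
    );
}
$collection->batchInsert($batch);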
In order to do a bulk insert in Doctrine, you need to move your flush outside of the loop. Consider the scenario below, where you persist inside the foreach and flush once the foreach has completed. The only catch is that you will not be able to query any of the data being inserted in the batch until after the flush.
$dm = $this->getContainer()->get('doctrine_mongodb.odm.document_manager');
foreach ($parentDocuments as $parentDocument) {
    $urls = $this->somehowFetchUrlsRelatedToTheParentDocument($parentDocument);
    foreach ($urls as $url) {
        $subDocument = new SubDocument();
        $subDocument->setUrl($url);
        $subDocument->setParentDocument($parentDocument);
        $dm->persist($subDocument);
    }
}
$dm->flush();
$dm->clear();
Another option is to do a push, pushAll, or addToSet.
One issue to consider is that you will need to use stdClass in PHP in order to add an object.
I find this to be the quickest way to update a subdocument.
For example:
$dm->createQueryBuilder('My:Bundle:ParentDocument')
    ->update()
    ->field('subDocument')->push((object) array('url' => $url))
    ->field('id')->equals($parentDocumentId)
    ->getQuery()
    ->execute();

How to get Zend_Lucene and Zend_Paginator to work together

I've been using Zend Framework for a few months now, so my knowledge is pretty good, but I'm not quite an expert yet. I am trying to use Zend_Search_Lucene with Zend_Paginator, and so far I have not been successful. I can use Zend_Search_Lucene to index data successfully on its own, and I can use Zend_Paginator when querying the database, but I can't seem to combine the two. Here is a sample of what I am doing:
try {
    $searchresults = $index->find($lucenequery);
} catch (Zend_Search_Lucene_Exception $e) {
    echo "Unable {$e->getMessage()}";
}
$page = $this->_getParam('page', 1);
$paginator = Zend_Paginator::factory($searchresults);
$paginator->setItemCountPerPage(20);
$paginator->setCurrentPageNumber($page);
$this->view->paginator = $paginator;
Is there a different step I need to take with Lucene and Zend_Paginator? The results for the first page display properly, but when I go to the second or third page my results are blank. I'm uncertain what might be wrong, as I can't find docs or tutorials on using the two together. Any help would be greatly appreciated.
I think this may work with the iterator adapter:
public function searchAction()
{
    $index = Zend_Search_Lucene::open('/path/to/lucene');
    $results = $index->find($this->_getParam('q'));
    $paginator = Zend_Paginator::factory($results);
    $paginator->setCurrentPageNumber($this->_getParam('page', 1));
    $paginator->setItemCountPerPage(10);
    $this->view->results = $paginator;
}
Perhaps the problem you are having is that $paginator doesn't know how many search results there are, so you may need to set that manually:
$paginator->setDefaultPageRange($results->count());
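If the factory is picking the wrong adapter for your result type, it may also be worth constructing the paginator explicitly; a sketch, assuming $searchresults is the array of hits returned by $index->find():
// Wrap the Lucene hits in an explicit array adapter so the paginator
// counts the full hit list rather than a single page of it.
$adapter = new Zend_Paginator_Adapter_Array($searchresults);
$paginator = new Zend_Paginator($adapter);
$paginator->setItemCountPerPage(20);
$paginator->setCurrentPageNumber($this->_getParam('page', 1));
$this->view->paginator = $paginator;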

YouTube API - How to limit results for pagination?

I want to grab a user's uploads (e.g. BBC) and limit the output to 10 per page.
I can use the following URL, and it works okay:
http://gdata.youtube.com/feeds/api/users/bbc/uploads/?start-index=1&max-results=10
However, I want to use the query method instead. The Zend Framework docs:
http://framework.zend.com/manual/en/zend.gdata.youtube.html
state that I can retrieve videos uploaded by a user, but ideally I want to use the query method to limit the results for pagination.
The query method is in the Zend Framework docs (same page as before, under the title 'Searching for videos by metadata') and is similar to this:
$yt = new Zend_Gdata_YouTube();
$query = $yt->newVideoQuery();
$query->setTime('today');
$query->setMaxResults(10);
$videoFeed = $yt->getUserUploads(NULL, $query);
print '<ol>';
foreach ($videoFeed as $video):
    print '<li>' . $video->title . '</li>';
endforeach;
print '</ol>';
The problem is I can't do $query->setUser('bbc').
I tried setAuthor(), but this returns a totally different result.
Ideally, I want to use the query method to grab the results in a paginated fashion.
How do I use the $query method to set my limits for pagination?
Thanks.
I've decided just to use the user uploads feed as a way of getting pagination to work:
http://gdata.youtube.com/feeds/api/users/bbc/uploads/?start-index=1&max-results=10
If there is a way to use the query/search method to do a similar job, it would be interesting to explore.
I basically solved this in the same way as worchyld with a slight twist:
$username = 'ignite';
$limit = 30; // YouTube will throw an exception if > 50
$offset = 1; // First video is 1 (silly non-programmers!)
$videoFeed = null;
$uploadCount = 0;
try {
    $yt = new Zend_Gdata_YouTube();
    $yt->setMajorProtocolVersion(2);
    $userProfile = $yt->getUserProfile($username);
    $uploadCount = $userProfile->getFeedLink('http://gdata.youtube.com/schemas/2007#user.uploads')->countHint;
    // The following is a dirty hack to get pagination with the YouTube API
    // without always starting from the first result; the snippet below was
    // copied from Zend_Gdata_YouTube->getUserUploads():
    $url = Zend_Gdata_YouTube::USER_URI . '/' . $username . '/' . Zend_Gdata_YouTube::UPLOADS_URI_SUFFIX;
    $location = new Zend_Gdata_YouTube_VideoQuery($url);
    $location->setStartIndex($offset);
    $location->setMaxResults($limit);
    $videoFeed = $yt->getVideoFeed($location);
} catch (Exception $e) {
    // Exception handling goes here!
    return;
}
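To turn a page number into the $offset above, the arithmetic is the usual one; a sketch, with $page coming from the request in a ZF controller action:
// Translate a 1-based page number into the 1-based start-index the API expects.
$page = max(1, (int) $this->_getParam('page', 1));
$offset = (($page - 1) * $limit) + 1;
$pages = (int) ceil($uploadCount / $limit); // total page count from countHint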
The Zend YouTube API seems silly here: the included getUserUploads() method never returns the VideoQuery instance before it actually fetches the feed, and while you can pass a location object as a second parameter, it's an either-or situation. It will either use only the username parameter to construct a basic URI, or use only the location, in which case you have to construct the whole thing yourself (as above).