Is there an implementation of the Penn Treebank Tokenizer in Perl?

I'm looking for a Perl module that is a port of this where I can basically create an object, call a tokenize() subroutine, pass in a pile of text and get back a list of tokens. Something to that effect. If it doesn't exist I'll do it, but no sense in reinventing the wheel, right? :) TIA.

I couldn't find an exact match, but could one of Lingua::EN::Tagger, Lingua::Treebank, Text::StemTagPOS, Lingua::Stem::Snowball or Treex::EN handle the problem?

How do I get a `Pipe` to the standard input/output/error in Swift?

When using Process in Swift, if you do not specify the .standardInput, .standardOutput, or .standardError properties, the Process inherits the standard input/output/error of the caller, which can be quite useful. But let's say that, for example, I want to tell a Process to send its output to the inherited standard error. It seems to me I would want to do something like process.standardOutput = ProcessInfo.processInfo.standardError, but ProcessInfo has no standardError property. How am I supposed to do something like this?
Of course I find the answer to this immediately after posting this question.
The desired objects are FileHandle.standardInput, FileHandle.standardOutput, and FileHandle.standardError.

Using boost's socket.async_send_to()

I've been stuck on this for a while now. I am trying to send the following:
boost::shared_ptr<uint8_t[]> m_data
over the wire using:
_socket.async_send_to(boost::asio::buffer(m_data), m_remote_endpoint,
boost::bind(&UDPServer::handle_send, this, message,
boost::asio::placeholders::error,
boost::asio::placeholders::bytes_transferred));
I get the error "no instance of overloaded function boost::asio::buffer matches the argument list boost::shared_ptr<uint8_t[]>"
m_data is filled in another function.
I suspect this is because I actually have to use the keyword new on m_data, but I can't quite figure out how to do it. I've tried a few different variations. Can anybody offer me some insight here? This is probably more a question of how to dereference a shared pointer than anything. Thanks in advance!
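boost::asio::buffer has an impressive list of overloads, but none of them takes a shared_ptr<T[]>. A shared_ptr<uint8_t[]> does not carry the length of the array, so the buffer size could not be deduced from it anyway.
In your case, you need to pass the raw pointer, obtained by calling get() on the shared_ptr, together with the size in bytes: boost::asio::buffer(m_data.get(), size), where size is the number of bytes you filled in.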
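As a rough sketch of the idea, the snippet below keeps the byte count next to the owning pointer and builds a (pointer, size) pair from it. It uses only the standard library so that it compiles on its own: ByteView, Message, make_message and as_buffer are hypothetical names made up for illustration, std::shared_ptr<uint8_t[]> (C++17) stands in for boost::shared_ptr<uint8_t[]>, and ByteView stands in for what boost::asio::buffer(pointer, size) would return.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for the (pointer, length) pair that
// boost::asio::buffer(const void*, std::size_t) builds; not a Boost type.
struct ByteView {
    const void* data;
    std::size_t size;
};

// Hypothetical message type: stores the byte count next to the owning
// pointer, because shared_ptr<uint8_t[]> does not remember the array length.
struct Message {
    std::shared_ptr<std::uint8_t[]> bytes;
    std::size_t size;
};

// Allocates a zero-initialised array of `size` bytes owned by a shared_ptr.
Message make_message(std::size_t size) {
    return Message{std::shared_ptr<std::uint8_t[]>(new std::uint8_t[size]()), size};
}

// Builds the non-owning view from the raw pointer and the stored size.
// With Asio this line would read: boost::asio::buffer(m.bytes.get(), m.size)
ByteView as_buffer(const Message& m) {
    return ByteView{m.bytes.get(), m.size};
}
```

Note that an Asio buffer does not own the memory it points to. For an asynchronous send, the array has to stay alive until the completion handler runs, for example by binding a copy of the shared_ptr into the handler.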

Get Multiple Locations of the Same String in Array

Is there any way to get the indices of the same string (that appears more than once) in a single array? I know I can find a specific string's location using:
[nameOfArray indexOfObject:@"apple"]
Of course, I could write a for loop that uses the same code and skips the "apples" already found. I can't help feeling that there is a simpler (built-in) way to do this in Objective-C. Am I right?
Thank you all in advance.
You could use indexesOfObjectsPassingTest: with a test block that checks each element for equality (for example with isEqualToString:). It returns an NSIndexSet containing the index of every match.

How to write a string starting with '=' to a cell using Spreadsheet::WriteExcel

I'm using the Perl package Spreadsheet::WriteExcel to write an Excel file. I want to write a string that starts with the equal sign, "=ABC()", to a cell.
$ws->write('A1', '=ABC()');
But I got an error message of
Unknown function ABC() in formula
Can someone advise?
Use the write_string method directly instead of using write:
$ws->write_string('A1', '=ABC()');
Spreadsheet::WriteExcel's write method is a convenience method that guesses what kind of data you're trying to store. If it guesses wrong, you should use one of the type-specific methods.
@Cjm already provided the best answer. Still, I'd like to point out that it's also possible to format a cell as text, so that whatever you put in it won't be interpreted as a formula.

What's the best way to escape vars in Zend_View automatically?

I started using Zend Framework seriously a week or two ago and have found escaping manually with Zend_View::escape() troublesome.
Does anyone know how to escape vars in Zend_View templates automatically, without calling $this->escape() and without tricks like output buffering or regex-replacing the *.phtml files?
I'd like to know the best practice in this case.
You can extend Zend_View to create a custom view class which autoescapes things, or you can use a view helper to turn autoescaping on/off.
I have written a blogpost about it, with example code for both approaches:
How to automatically escape template variables in Zend_View
Over at the PiKe project we built a custom stream wrapper that automatically escapes all view variables, with a MINIMAL performance hit! You can still get the RAW value with:
<?=~ $variable ?>
Notice the "~" character. Check out http://code.google.com/p/php-pike/wiki/Pike_View_Stream
I know you said that you want to avoid "tricky ways like output buffering and PREG replacing *.phtml files.", but I still think it's a very neat way to fix auto escaping in Zend Framework 1.
You said "automatically", so I believe that that means when you do echo $this->var; you want it escaped. Well, if that's the case, maybe you could do the escaping when the variable is set to the view. AFAIK it's done in the Zend_View_Abstract class' __set magic method* (around line 300). Changing the core ZF code is not recommended, so you could go by extending Z_V_A or Z_V and just override the __set method.
*I'm not 100% sure that Z_V_A::__set is the only place where the params are assigned to the view, but I think it should be. Can't think of any other place for that.
Edit: Personally, I'd avoid this and just stick with the good ol' $this->escape(). More typing but less magic going on in the background.
You asked for best practice, and what you are doing already is it.
Wait until you actually display your data before modifying it purely for output.
I understand you find writing ->escape() every time tedious, but it's still the way to go.
If you were to auto-escape everything, you would run into problems the day you want or need unescaped data.
ZendX_View_Autoescaping: this project provides a ViewRenderer that autoescapes all assigned view variables.
https://github.com/jensklose/ZendX_View_Autoescaping
Try it!
It supports:
escaping into deep data structures
escaping the array keys
possibility to switch the escaping context (html, json, nofilter)