I have the following code:
subDir = dir(frames_dir);
isub = [subDir(:).isdir];
nameSubFolds = {subDir(isub).name}';
I'd like to only choose items in "nameSubFolds" which contain the string "2017", with any characters after it (so, "2017*"). How can I do this?
you can use the builtin regular expressions of dir:
subDir = dir([frames_dir '/*2017*']);
isub = [subDir(:).isdir];
nameSubFolds = {subDir(isub).name}';
When using the rapid annotator tool brat, it appears that the created annotations file will present the annotation in the order that the annotations were performed by the user. If you start at the beginning of a document and go the end performing annotation, then the annotations will naturally be in the correct offset order. However, if you need to go earlier in the document and add another annotation, the offset order of the annotations in the output .ann file will be out of order.
How then can you rearrange the .ann file such that the annotations are in offset order when you are done? Is there some option within brat that allows you to do this or is it something that one has to write their own script to perform?
Hearing nothing, I did write a python script to accomplish what I had set out to do. First, I reorder all annotations by begin index. Secondly, I resequence the label numbers so that they are once again in ascending order.
import optparse, sys
splitchar1 = '\t'
splitchar2 = ' '
# for brat, overlapped is not permitted (or at least a warning is generated)
# we could use this simplification in sorting by simply sorting on begin. it is
# probably a good idea anyway.
class AnnotationRecord:
label = 'T0'
type = ''
begin = -1
end = -1
text = ''
def __repr__(self):
return self.label + splitchar1
+ self.type + splitchar2
+ str(self.begin) + splitchar2
+ str(self.end) + splitchar1 + self.text
def create_record(parts):
record = AnnotationRecord()
record.label = parts[0]
middle_parts = parts[1].split(splitchar2)
record.type = middle_parts[0]
record.begin = middle_parts[1]
record.end = middle_parts[2]
record.text = parts[2]
return record
def main(filename, out_filename):
fo = open(filename, 'r')
lines = fo.readlines()
fo.close()
annotation_records = []
for line in lines:
parts = line.split(splitchar1)
annotation_records.append(create_record(parts))
# sort based upon begin
sorted_records = sorted(annotation_records, key=lambda a: int(a.begin))
# now relabel based upon the sorted order
label_value = 1
for sorted_record in sorted_records:
sorted_record.label = 'T' + str(label_value)
label_value += 1
# now write the resulting file to disk
fo = open(out_filename, 'w')
for sorted_record in sorted_records:
fo.write(sorted_record.__repr__())
fo.close()
#format of .ann file is T# Type Start End Text
#args are input file, output file
if __name__ == '__main__':
parser = optparse.OptionParser(formatter=optparse.TitledHelpFormatter(),
usage=globals()['__doc__'],
version='$Id$')
parser.add_option ('-v', '--verbose', action='store_true',
default=False, help='verbose output')
(options, args) = parser.parse_args()
if len(args) < 2:
parser.error ('missing argument')
main(args[0], args[1])
sys.exit(0)
I have many text files that have 35 lines of header followed by a large matrix with data of an image (that info can be ignored and do not need to read it at the moment). I want to be able to read the header lines and extract information contained on those lines. For instance the first few lines of the header are..
File Version Number: 1.0
Date: 06/05/2015
Time: 10:33:44 AM
===========================================================
Beam Voltage (-kV) = 13.000
Filament (W) = 4.052
Cond. (-kV) = 8.885
CenterX1 (V) = 10.7
CenterY1 (V) = -45.9
Objective (%) = 71.40
OctupoleX = -0.4653
OctupoleY = -0.1914
Angle (deg) = 0.00
.
I would like to be able to open this text file and read the vulue of the day and time the file was created, filament power, the condenser voltage, the angle, etc.. and save these in variables or send them to a text box on a GUI program.
I have tried several things but since the values I want to extract some times are after a '=' or after a ':' or simply after a '' then I do not know how to approach this. Perhaps reading each line and look for a match of a word?
Any help would be much appreciated.
Thanks,
Alex
This is not particularly difficult, and one of the ways to do it would be to parse line-by-line as you suggested. Something like this:
MAX_LINES_TO_READ = 35;
fid = fopen('input.txt');
lineCount = 0;
dateString = '';
beamVoltage = 0;
while ~eof(fid)
line = fgetl(fid);
lineCount = lineCount + 1;
%//check conditions for skipping loop body
if isempty(line)
continue
elseif lineCount > MAX_LINES_TO_READ
break
end
%//find headers you are interested in
if strfind(line, 'Date')
%//find the first location of the header separator
idx = find(line, ':', 1);
%//extract substring starting from 1 char after separator
%//note: the trim is to get rid of leading/trailing whitespace
dateString = strtrim(line(idx + 1 : end));
elseif strfind(line, 'Beam Voltage')
idx = find(line, '=', 1);
beamVoltage = str2double(line(idx + 1 : end));
end
end
fclose(fid);
I have made a plugin where I store many images in the "media" field and jsut as many captions in the field "imagecaption".
Now my wish is to display it like this:
image1.png
caption 1
image2.png
caption 2
image3.png
caption 3
This is how ive been trying to do it, but its not working:
plugin.tx_myplugin_pi1 = COA
plugin.tx_myplugin_pi1{
10 = TEXT
10.field = header
10.wrap = <h1>|</h1>
20 = COA
20{
10 = TEXT
10{
field = media
split{
token = ,
cObjNum = 1
1.current = 1
}
}
20 = TEXT
20{
field = imagecaption
split{
token.char = 10
cObjNum = 1
1.current = 1
}
}
}
}
But its not really working, as it shows first all the filenames and then the caption.
How could I do it?
Split is a function which returns all elements. Within 20.10 you get the content of field image, splitted by an newline f.e. and after that, you get the content of 20.20 which has the imagecaption.
What you need to do (untested):
10 = TEXT
10{
field = media
split{
token = ,
cObjNum = 1
1.current = 1
# for each image, add the imagecaption
1.append = TEXT
1.append {
field = imagecaption
# split saves the index in REGISTER:SPLIT_COUNT
listNum.stdWrap.data = REGISTER:SPLIT_COUNT
listNum.splitChar = 10
}
}
}
I do not think that token = \n is correct. You properly need to use .char = 10.
Also you will need to nest your TS somehow, because the current solution does handle the fields one by one.
I can't remember at the moment but I've wrote an extension that adds a frame to a picture and caption. It can solve your problem with the captions: http://typo3.org/extensions/repository/view/ch_imgtext_renderengine/current/.
how we display fpdf multicell in equal heights having different amount of content
Great question.
I solved it by going through each cell of data in your row that will be multicelled and determine the largest height of all these cells. This happens before you create your first cell in the row and it becomes the new height of your row when you actually go to render it.
Here's some steps to achieve this for just one cell, but you'll need to do it for every multicell:
This assumes a $row of data with a String attribute called description.
First, get cell content and set the column width for the cell.
$description = $row['desciption']; // MultiCell (multi-line) content.
$column_width = 50;
Get the width of the description String by using the GetStringWidth() function in FPDF:
$total_string_width = $pdf->GetStringWidth($description);
Determine the number of lines this cell will be:
$number_of_lines = $total_string_width / ($column_width - 1);
$number_of_lines = ceil( $number_of_lines ); // Round it up.
I subtracted 1 from the $column_width as a kind of cell padding. It produced better results.
Determine the height of the resulting multi-line cell:
$line_height = 5; // Whatever your line height is.
$height_of_cell = $number_of_lines * $line_height;
$height_of_cell = ceil( $height_of_cell ); // Round it up.
Repeat this methodology for any other MultiCell cells, setting the $row_height to the largest $height_of_cell.
Finally, render your row using the $row_height for your row's height.
Dance.
I was also getting same issue when i have to put height equal to three lines evenif i have content of half or 1 and half line, i solved this by checking if in a string no of elements are less than no of possible letters in a line it is 60. If no of letters are less or equal to one line then ln() is called for two empty lines and if no of words are equal to or less than 2 line letters then one ln() is called.
My code is here:
$line_width = 60; // Line width (approx) in mm
if($pdf->GetStringWidth($msg1) < $line_width)
{
$pdf->MultiCell(75, 8, $msg1,' ', 'L');
$pdf->ln(3);
$pdf->ln(3.1);
}
else{
$pdf->MultiCell(76, 4, $msg1,' ', 'L');
$pdf->ln(3.1);
}
Ok I've done the recursive version. Hope this helps even more to the cause!
Take in account these variables:
$columnLabels: Labels for each column, so you'll store data behind each column)
$alturasFilas: Array in which you store the max height for each row.
//Calculate max height for each column for each row and store the value in //////an array
$height_of_cell = 0;
$alturasFilas = array();
foreach ( $data as $dataRow ) {
for ( $i=0; $i<count($columnLabels); $i++ ) {
$variable = $dataRow[$i];
$total_string_width = $pdf->GetStringWidth($variable);
$number_of_lines = $total_string_width / ($columnSizeWidth[$i] - 1);
$number_of_lines = ceil( $number_of_lines ); // Redondeo.
$line_height = 8; // Altura de fuente.
$height_of_cellAux = $number_of_lines * $line_height;
$height_of_cellAux = ceil( $height_of_cellAux );
if($height_of_cellAux > $height_of_cell){
$height_of_cell = $height_of_cellAux;
}
}
array_push($alturasFilas, $height_of_cell);
$height_of_cell = 0;
}
//--END--
Enjoy!
First, it's not the question to get height of Multicell. (to #Matias, #Josh Pinter)
I modified answer of #Balram Singh to use this code for more than 2 lines.
and I added new lines (\n) to lock up height of Multicell.
like..
$cell_width = 92; // Multicell width in mm
$max_line_number = 4; // Maximum line number of Multicell as your wish
$string_width = $pdf->GetStringWidth($data);
$line_number = ceil($string_width / $cell_width);
for($i=0; $i<$max_line_number-$line_number; $i++){
$data.="\n ";
}
Of course you have to assign SetFont() before use GetStringWidth().
My approach was pretty simple. You already need to override the header() and footer() in the fpdf class.
So i just added a function MultiCellLines($w, $h, $txt, $border=0, $align='J', $fill=false).
That's a very simple copy of the original MultiCell. But it won't output anything, just return the number of lines.
Multiply the lines with your line-height and you're safe ;)
Download for my code is here.
Just rename it back to ".php" or whatever you like. Have fun. It works definitely.
Mac
My personal solution to reduce the font size and line spacing according to a preset box:
// set box size and default font size
$textWidth = 100;
$textHeight = 100;
$fontsize = 12;
// if you get text from html div (using jquery and ajax) you must replace every <br> in a new line
$desc = utf8_decode(str_replace('<br>',chr(10),strip_tags($_POST['textarea'],'<br>')));
// count newline set in $desc variable
$countnl = substr_count($desc, "\n");
// Create a loop to reduce the font according to the contents
while($pdf->GetStringWidth($desc) > ($textWidth * (($textHeight-$fontsize*0.5*$countnl) / ($fontsize*0.5)))){
$fontsize--;
$pdf->SetFont('Arial','', $fontsize);
}
// print multicell and set line spacing
$pdf->MultiCell($textWidth, ($fontsize*0.5), "$desc", 0, 'L');
That's all!
A better solution would be to use your own word wrap function, as FPDF will not cut words, so you do not have to rely on the MultiCell line break.
I have wrote a method that checks for every substring of the text if the getStringWidth of FPDF is bigger than the column width, and splits accordingly.
This is great if you care more about a good looking PDF layout, than performance.
public function wordWrapMultiCell($text, $cellWidth = 80) {
$explode = explode("\n", $text);
array_walk($explode, 'trim');
$lines = [];
foreach($explode as $split) {
$sub = $split;
$char = 1;
while($char <= strlen($sub)) {
$substr = substr($sub, 0, $char);
if($this->pdf->getStringWidth($substr) >= $cellWidth - 1) { // -1 for better getStringWidth calculating
$pos = strrpos($substr, " ");
$lines[] = substr($sub, 0, ($pos !== FALSE ? $pos : $char)).($pos === FALSE ? '-' : '');
if($pos !== FALSE) { //if $pos returns FALSE, substr has no whitespace, so split word on current position
$char = $pos + 1;
$len = $char;
}
$sub = ltrim(substr($sub, $char));
$char = 0;
}
$char++;
}
if(!empty($sub)) {
$lines[] = $sub;
}
}
return $lines;
}
This returns an array of text lines, which you could merge by using implode/join:
join("\r\n", $lines);
And what it was used for in the first place, get the line height:
$lineHeight = count($lines) * $multiCellLineHeight;
This only works for strings with a space character, no white spacing like tabs. You could replace strpos with a regexp function then.