X and Y coordinates of text using iTextSharp - iText

When I extract text using iTextSharp, I get the x and y coordinates of the text. If I use these two coordinates to position the text when converting the PDF to HTML, the text position changes. To get the x, y coordinates I used:
Vector curBaseline = renderInfo.GetBaseline().GetStartPoint();
float x=curBaseline[Vector.I1];
float y= curBaseline[Vector.I2];
For example, when I extract text using the above method, say x=42 and y=659, and I emit
"<span style=\"left:{0}px;bottom:{1}px;position:relative;\">",curBaseline[Vector.I1],curBaseline[Vector.I2]);
the position changes. Can you help me position the text the same way as in the PDF?

Posted as answer...
If I recall correctly, PDF uses a coordinate system whose origin is the bottom-left corner of the page, not the top-left. So every y coordinate is wrong when you use it directly in HTML. You will have to convert the values.
Your PDF document should have something like document.actualheight; simply subtract your y value from that.
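The subtraction described above can be sketched as follows. This is a minimal illustration in Python, not iTextSharp code: the function names, the 792 pt page height (US Letter) and the optional scale factor are assumptions made for the example. In iTextSharp the page height is typically available from something like reader.GetPageSize(pageNumber).Height, but check that against your version. Note that the converted value marks the text baseline, so the glyphs will still sit slightly lower than in the PDF unless you also subtract the font ascent.

```python
from html import escape


def pdf_to_css(x_pt, y_pt, page_height_pt, scale=1.0):
    """Convert a PDF point (origin bottom-left) to CSS left/top (origin top-left).

    scale is an assumed points-to-pixels factor; 1.0 keeps PDF units,
    96/72 would map PDF points to CSS pixels.
    """
    left = x_pt * scale
    top = (page_height_pt - y_pt) * scale  # flip the y axis
    return left, top


def span_html(text, x_pt, y_pt, page_height_pt):
    """Build an absolutely positioned span for one text chunk."""
    left, top = pdf_to_css(x_pt, y_pt, page_height_pt)
    return (
        f'<span style="position:absolute;left:{left:g}px;top:{top:g}px">'
        f"{escape(text)}</span>"
    )


# Values from the question, on an assumed 792 pt tall page.
print(span_html("Hello", 42, 659, 792))
```

Using position:absolute with top measured from the page container avoids the drift that position:relative introduces, because each span is then placed independently of the text before it.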

Related

VISIO - How to set grid lines such that each character appears in 1 square and each square contains only 1 character using fixed font

What I am trying to do is to enable the user to fill a report design form using fixed font. Only older people have seen such forms for coding and report design :)
Question:
Using VISIO, how to set grid lines such that each character appears in 1 square and each square contains only 1 character using a fixed font such as Courier New? I am open to completely different approaches if any, even if it involves coding in VBA or C#, I know some.
I tried different values for "Minimum Spacing" but I can't get the desired result.
EDIT
I also attempted to place a grid image in the background and put the text box in front, but the monospaced font characters still overlap the lines.
EDIT
I also tried this VBA code:
Sub Test()
    Dim I As Integer
    Dim D As Double
    Dim X1 As Double
    Dim Y1 As Double
    D = 0.13
    X1 = 1.373
    Y1 = 10.04
    For I = 1 To 20
        Application.ActiveWindow.Page.DrawLine X1, Y1, X1, Y1 + 1
        X1 = X1 + D
    Next
End Sub
This produced the following "bad" result. All this suggests that the implementation of Courier New is not really fixed in width!
The Visio grid doesn't work in that way. You need to use a proper grid shape. There is one at More Shapes -> Business -> Charts and Graphs -> Charting Shapes. Set the columns and rows to what you need and then resize.
Example here - using font Liberation Mono.
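If you still want to draw the lines yourself, as in the VBA attempt from the question, the spacing has to equal the character advance of the font rather than a guessed value. A rough sketch of that arithmetic in Python follows. It assumes an advance width of 0.6 em (a common figure for Courier-style fonts that you should verify for your font) and no extra character spacing or text block margins in Visio. The function name and parameters are hypothetical.

```python
def grid_lines(x0_in, columns, font_size_pt, advance_em=0.6):
    """Return x positions (inches) of the vertical lines for a character grid.

    advance_em is an ASSUMED advance width as a fraction of the font size.
    Positions are computed by multiplication, so rounding errors do not
    accumulate the way they do with repeated addition.
    """
    pitch_in = font_size_pt * advance_em / 72.0  # 72 points per inch
    return [x0_in + i * pitch_in for i in range(columns + 1)]


# 20 columns of 12 pt text starting at x = 1.373 in (start value from the question)
for x in grid_lines(1.373, 20, 12):
    print(round(x, 4))
```

Under these assumptions 12 pt text gives a pitch of 0.1 inch (10 characters per inch), noticeably smaller than the 0.13 used in the question, which may explain the drift seen there.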

Calculate X position where string ends

I have a string with a given font size, and I draw that text on a UIView using the drawInRect method. I am able to calculate the approximate Y position of the string by calculating its height, but I want to know the X position as well.
Can I calculate the x location where my string ends?
The string length and font size can vary.
Thanks
Perhaps calling CGContextGetTextPosition after drawing the string will give you the coordinate you need.
For more information, read the “Text” chapter of Quartz 2D Programming Guide.

get the exact position of text from image in tesseract

Using the GetHOCRText(0) method in Tesseract I'm able to retrieve the text as HTML, and when I present the HTML in a web view I get the text, but the position of the text differs from its position in the image. Any idea would be very helpful.
tesseract->SetInputName("word");
tesseract->SetOutputName("xyz");
tesseract->Recognize(NULL);
char *utf8Text=tesseract->GetHOCRText(0);
and output image
If you have the hOCR output, you should have a tag for each word. These tags should have class="ocrx_word" and title="bbox x1 y1 x2 y2", where (x1, y1) and (x2, y2) are the top-left and bottom-right corners of the bounding box around the word. I don't think it's possible to use this information automatically to format a text document, since that would require translating pixel differences into a number of tabs or spaces. But you should be able to render the text at the given location.
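Rendering each word at its box could look roughly like this. It is a minimal Python sketch, assuming the word spans look like the sample below (class before title, integer pixel coordinates); the helper names are made up for the example, and a real HTML parser would be more robust than a regular expression.

```python
import re

# Matches <span class='ocrx_word' ... title='bbox x1 y1 x2 y2; ...'>text</span>
WORD_RE = re.compile(
    r"<span[^>]*class=['\"]ocrx_word['\"][^>]*"
    r"title=['\"]bbox (\d+) (\d+) (\d+) (\d+)[^'\"]*['\"][^>]*>(.*?)</span>",
    re.S,
)


def parse_words(hocr):
    """Return (text, x1, y1, x2, y2) for every ocrx_word span in the hOCR."""
    return [
        (text, int(x1), int(y1), int(x2), int(y2))
        for x1, y1, x2, y2, text in WORD_RE.findall(hocr)
    ]


def word_to_span(word):
    """Place one word at its bounding box using absolute positioning."""
    text, x1, y1, x2, y2 = word
    return (
        f'<span style="position:absolute;left:{x1}px;top:{y1}px;'
        f'width:{x2 - x1}px;height:{y2 - y1}px">{text}</span>'
    )


sample = "<span class='ocrx_word' id='word_1_1' title='bbox 36 92 96 116; x_wconf 90'>Hello</span>"
for w in parse_words(sample):
    print(word_to_span(w))
```

The spans need a container with position:relative that has the same pixel size as the source image; if the image is scaled in the web view, the coordinates must be scaled by the same factor.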
The GetBoxText() method returns the position of each character as box-file text, one character and its bounding box per line.
char *boxtext = _tesseract->GetBoxText(0);
NSString* aBoxText = [NSString stringWithUTF8String:boxtext];

OpenXml and Word: How to Calculate WrapPolygon Coordinates?

I am creating a Microsoft Word document using the OpenXml library. Most of what I need is already working correctly. However, I can't for the life of me find the following bit of information.
I'm displaying an image in an anchor, which causes text to wrap around the image. I used WrapSquare but this seems to affect the last line of the previous paragraph as shown in the image below. The image is anchored to the second paragraph but causes the last line of the first paragraph to also indent around the image.
Word Screenshot http://www.softcircuits.com/Client/Word.jpg
Experimenting within Word, I can make the text wrap how I want by changing the wrapping to WrapTight. However, this requires a WrapPolygon with several coordinates. And I can't find any way to determine the polygon coordinates so that they match the size of the image, which is in pixels.
The documentation doesn't even seem to indicate what units are used for these coordinates, let alone how to calculate them from pixels. I can only assume the calculation would involve a DPI value, but I have no idea how to determine what DPI will be used when the user eventually loads the document into Word.
I would also be satisfied if someone could explain why the issue described above is happening in the first place. I can shift the image down and the previous paragraph is no longer affected. But why is this necessary? (The Distance from text setting for both Left and Top is 0".)
The WrapPolygon element has two possible child elements, LineTo and StartPoint, each of which takes an x and a y coordinate. According to 2.1.1331 Part 1 Section 20.4.2.9, lineTo (Wrapping Polygon Line End Position) and 2.1.1334 Part 1 Section 20.4.2.14, start (Wrapping Polygon Start) found in the [MS-OI29500: Microsoft Office Implementation Information for ISO/IEC-29500 Standard Compliance]:
The standard states that the x and y attributes are represented in
EMUs. Office interprets the x and y attributes in a fixed coordinate
space of 21600x21600.
As far as converting pixels to EMUs (English Metric Units), take a look at this blog post for an example.
I finally resolved this. Despite what the standard says, the WrapPolygon coordinates are not EMUs (English Metric Units). The coordinates are relative to the fixed coordinate space (21600 x 21600, as mentioned in the quote provided by amurra).
More specifically, this means 0,0 is at the top, left corner of the image, and 21600,21600 is at the bottom, right corner of the image. This is the case no matter what the size of the image is. Coordinates greater than 21600 extend outside the image.
According to this article, "The 21600 value is a legacy artifact from the drawing layer of early versions of the Microsoft Office."
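Given that mapping, converting a point in image pixels to WrapPolygon coordinates is a simple proportion, independent of DPI. The following Python sketch shows the arithmetic only; the function names and the optional margin are illustrative assumptions, and repeating the first point to close the polygon is also an assumption about what Word expects.

```python
WRAP_SPACE = 21600  # fixed coordinate space described above


def px_to_wrap(x_px, y_px, width_px, height_px):
    """Map a pixel position inside the image to the 21600 x 21600 space."""
    return (
        round(x_px * WRAP_SPACE / width_px),
        round(y_px * WRAP_SPACE / height_px),
    )


def rect_polygon(width_px, height_px, margin_px=0):
    """Corners of a rectangle around the image, optionally grown by a margin.

    The first point would go into StartPoint and the rest into LineTo
    elements. Values below 0 or above 21600 lie outside the image.
    """
    m, w, h = margin_px, width_px, height_px
    corners = [(-m, -m), (w + m, -m), (w + m, h + m), (-m, h + m), (-m, -m)]
    return [px_to_wrap(x, y, w, h) for x, y in corners]


print(px_to_wrap(150, 100, 300, 200))  # centre of a 300 x 200 image
print(rect_polygon(300, 200))
```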

plotsymbol label

I'm playing with core-plot to generate a scatter plot. I've added a plot symbol and want to add a label near every symbol.
I am able to do that, but I want to change the position of this label. I can set the offset, but it only moves the label vertically; I need to move the label horizontally.
Is there any way to do that?
This picture shows what I'd like.
screen shot
Thanks
The automatic labels will always appear above the point. You can create annotations to label your data points. For each label, use CPTPlotSpaceAnnotation, anchor it to the coordinates of the data point, and set the displacement and content anchor.
To put the labels to the right of the point, set the displacement to (x, 0) where x is the number of pixels between the label and the center of the data point. Set the content anchor to (0, 0.5).
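To see why a content anchor of (0, 0.5) places the label to the right of the point, here is the geometry as a small Python sketch. It is not Core Plot code: it assumes the content anchor behaves like a layer anchor point in unit coordinates, with (0, 0) at the lower-left corner of the label and y increasing upward, and the function name is hypothetical.

```python
def label_origin(point, displacement, content_anchor, label_size):
    """Lower-left corner of a label placed relative to a data point.

    The spot of the label selected by content_anchor (unit coordinates)
    is pinned at point + displacement.
    """
    px, py = point
    dx, dy = displacement
    ax, ay = content_anchor
    w, h = label_size
    return (px + dx - ax * w, py + dy - ay * h)


# A 40 x 20 label, 10 units to the right of the point at (100, 50):
# its left edge starts at x = 110 and it is vertically centred on y = 50.
print(label_origin((100, 50), (10, 0), (0, 0.5), (40, 20)))
```

With a content anchor of (1, 0.5) and a displacement of (-x, 0), the same arithmetic puts the label to the left of the point.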