Here is the situation. I work in epidemiological research, and describing tables is something we have to do on a day-to-day basis. Content editing tools and their associated languages all offer ways to describe a table. Here are a few examples:
LaTeX
\begin{center}
\begin{tabular}{ c c c }
cell1 & cell2 & cell3 \\
cell4 & cell5 & cell6 \\
cell7 & cell8 & cell9
\end{tabular}
\end{center}
HTML
<table style="width:100%">
<tr>
<td>Jill</td>
<td>Eve</td>
</tr>
</table>
Markdown
| Tables | Are | Cool |
| ------------- |:-------------:| -----:|
| col 3 is | right-aligned | $1600 |
| col 2 is | centered | $12 |
| zebra stripes | are neat | $1 |
Some of these are clearer than others, but all of them start to look absolutely horrific once one tries to modify their formatting to something other than the most basic example.
My question is the following:
Given that this is one of the most fundamental tasks of text editing, what is the most elegant way/language we know to represent textually a non-trivial table and its content?
In Excel, I have a list with multiple rows of the same ID (column A), each with various dates recorded (Column B). I need to extract one row for each ID that contains the newest date. See below for example:
|Column A | Column B|
|(ID) | (Date) |
|-----------|-----------|
|00001 | 01/01/2022|
|00001 | 02/01/2022|
|00001 | 03/01/2022| <-- I Need this one
|00002 | 01/02/2022|
|00002 | 02/02/2022|
|00002 | 03/02/2022| <-- I Need this one
|00003 | 01/03/2022|
|00003 | 02/03/2022|
|00003 | 03/03/2022| <-- I Need this one
|00004 | 01/04/2022|
|00004 | 02/04/2022|
|00004 | 03/04/2022| <-- I Need this one
|00005 | 01/05/2022|
|00005 | 02/05/2022|
|00005 | 03/05/2022| <-- I Need this one
I need to extract the above rows, where the row with the newest date is extracted for each unique ID. It needs to look like this:
|Column A | Column B |
|(ID) | (Date) |
|----------|--------------|
|00001 | 03/01/2022 |
|00002 | 03/02/2022 |
|00003 | 03/03/2022 |
|00004 | 03/04/2022 |
|00005 | 03/05/2022 |
I'm totally stumped and I can't seem to find the right answer (probably because of how I'm wording the question!)
Thank you!
I tried Google searches for the answer, with no joy. I don't know where to start in Excel with this function; I thought perhaps DISTINCT or similar...
Assuming you have an Office 365-compatible version of Excel, you could do something like this
(see the screenshot):
=INDEX(SORTBY(A2:B11,B2#,-1),SEQUENCE(1,1,1,1),SEQUENCE(1,2,1,1))
Part of this formula is superfluous, albeit convenient: you don't really need the first SEQUENCE (only one row is being returned). However, as you can see in the screenshot, the very same formula with a leading 2 in the first argument of that SEQUENCE returns the top two dates (in descending order), and so forth.
For those with Office 365, you could also do something like this:
=LARGE(B2#+(ROW(B2#)-ROW(B2))/1000,1)
i.e. add a "little bit" to each date that we can subtract again later and use as a unique reference (the row number in the original, unsorted list).
As mentioned: reverse-engineer it, throw it into an INDEX, and voilà!
=INDEX(A2:A11,ROUND((H2-ROUND(H2,0))*1000,6))
Caveats:
The ROUND(<>,6) is purely there to work around Excel's irritating floating-point precision issue.
This can also work if you're looking up text strings (i.e. attempting to sort alphabetically), except that LARGE doesn't work with strings. No problem, just use Unicode code points, but good luck with expanding out the string etc. ☺ with mid(<>,row(a1:offset(a1,len(<>)-1)..,1)..
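For readers who want to check the expected result outside Excel, the "newest date per ID" selection the question describes can be sketched in Python. This is only an illustration under assumptions, not the Excel solution: the function name newest_per_id is made up, and the dates are assumed to be in dd/mm/yyyy format (the question does not say which format is used).

```python
from datetime import datetime


def newest_per_id(rows, date_format="%d/%m/%Y"):
    """Keep, for each ID, the row whose date is the most recent.

    rows is an iterable of (id, date_text) pairs.
    date_format is an assumption (dd/mm/yyyy); adjust it to the real data.
    """
    newest = {}
    for row_id, date_text in rows:
        parsed = datetime.strptime(date_text, date_format)
        # Replace the stored row only when this one is strictly newer.
        if row_id not in newest or parsed > newest[row_id][0]:
            newest[row_id] = (parsed, date_text)
    # Dicts preserve insertion order, so IDs come out in first-seen order.
    return [(row_id, date_text) for row_id, (_, date_text) in newest.items()]


rows = [
    ("00001", "01/01/2022"), ("00001", "02/01/2022"), ("00001", "03/01/2022"),
    ("00002", "01/02/2022"), ("00002", "02/02/2022"), ("00002", "03/02/2022"),
]
print(newest_per_id(rows))
```

Parsing the text into real dates before comparing matters here, because comparing the raw strings would only work by accident of the format.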
I've bumped into a seemingly simple problem that I'm unable to solve. I would like to determine whether the number of uppercase letters is greater than the number of lowercase letters (ignoring special characters, spaces, etc.).
Example
id | text | upper_greater_lower | note
------------------------------------------------------------------
1 | Hello World | False | because |HW| < |elloorld|
2 | The XYZ | True | because |TXYZ| > |he|
3 | Foo!!! | False | because |F| < |oo|
4 | BAr??? | True | because |BA| > |r|
My initial idea was to determine the number of lowercase letters, then the number of uppercase letters, and finally compare them. However, I'm unable to do so in any elegant and efficient way.
I expect to handle ~30M rows with ~300 characters each.
What would you suggest?
Thanks!
Using regular expression magic, that could be:
SELECT length(regexp_replace(textcol, '[^[:upper:]]', '', 'g'))
> length(regexp_replace(textcol, '[^[:lower:]]', '', 'g'))
FROM atable;
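To sanity-check the expected results from the example table outside the database, the same comparison can be sketched in Python. This is only an illustration and not a replacement for the query above: the function name upper_greater_lower is made up, and str.isupper()/str.islower() follow Unicode case rules, which may not match exactly what [:upper:] and [:lower:] match under a given database locale.

```python
def upper_greater_lower(text):
    """True when the text has more uppercase than lowercase letters.

    Characters that are neither (digits, spaces, punctuation) are ignored,
    mirroring the idea of stripping everything but one letter class.
    """
    upper = sum(1 for ch in text if ch.isupper())
    lower = sum(1 for ch in text if ch.islower())
    return upper > lower


# The four rows from the example table in the question.
for sample in ["Hello World", "The XYZ", "Foo!!!", "BAr???"]:
    print(sample, upper_greater_lower(sample))
```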
I can make text wrap around images by adding
#+CAPTION: foo
#+ATTR_LATEX: :float wrap
[[./my_img.png]]
If I do the same to
#+CAPTION: bar
#+ATTR_LATEX: :float wrap
| a | b |
| c | d |
The table will stay centered in its own part of the document, the text being broken above and below it, not auto-flowing around it.
I also tried to do something like
#+CAPTION: baz
#+ATTR_LATEX: :environment wraptable :options {l}{5cm}
| a | b |
| c | d |
or using :position instead of :options, with no results.
Basically I want the table to export to
\begin{wraptable}{l}{5cm}
\begin{tabular}
....
\end{tabular}
\end{wraptable}
in the tex file. The arguments {r|l}{width} are mandatory, so simply #+begin_wraptable won't work either. Is there any way to do that from inside org-mode without manually fiddling with the final .tex?
You have to explicitly add the LaTeX code for wraptable. Using the :center attribute with no value removes the automatic insertion of \begin{center} in the .tex file.
#+LaTeX: \begin{wraptable}{r|l}{width}
#+ATTR_LATEX: :center
| A | B | C |
|--------+-------+-------|
| a | b | c |
#+LaTeX: \end{wraptable}
A report contains a single central "Detail1" band that must fill the whole page height. Even when the datasource provides just one record, the footer must remain at the bottom of the A4-sized page:
 _______________
|    header     |
|               |
|     row1      |
|     row2      |
|               |
|               |
|               |
|               |
|               |
|    footer     |
|_______________|
YES!
 _______________
|    header     |
|               |
|     row1      |
|     row2      |
|    footer     |
|_______________|
NO!
I wonder whether this is related to the "stretching" options or to the background.
We also have this more general problem of vertically sizing some band or content relative to the page height.
In your case it could be done, e.g., by increasing either the row2 "bottom margin" or the footer "top margin" (or by introducing some "spacer element" in between):
 variant A      variant B      variant C
 _________      _________      _________
| header  |    | header  |    | header  |
|_________|    |_________|    |_________|
|  row1   |    |  row1   |    |  row1   |
|_________|    |_________|    |_________|
|  row2   |    |  row2   |    |  row2   |
|    .    |    |_________|    |_________|
|    .    |    |    .    |    | spacer  |
|_________|    |    .    |    |_________|
| footer  |    | footer  |    | footer  |
|_________|    |_________|    |_________|
The only way I can currently think of to solve this is kind of a hack, and it only works if "you know enough about the size of the to-be-stretched element's siblings":
We want to set the spacer.height (or margin depending on the variant chosen) like this:
spacer.height = page.height - header.height - row1.height - row2.height - footer.height
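As a minimal sketch of that calculation (in Python rather than in a report expression): the function name spacer_height and all band heights below are hypothetical, and the page height of 842 assumes an A4 page measured in points at 72 dpi.

```python
def spacer_height(page_height, sibling_heights):
    """Height left over for the spacer once all siblings are placed.

    Clamped at 0 so an overfull page never yields a negative height.
    """
    return max(0, page_height - sum(sibling_heights))


# Hypothetical band heights in points; 842 is an A4 page height at 72 dpi.
heights = {"header": 60, "row1": 20, "row2": 20, "footer": 40}
print(spacer_height(842, heights.values()))  # 702
```

The clamp to zero is a design choice for the sketch: when the content already fills the page, the spacer simply disappears instead of pulling the footer upwards.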
Example A
Let's assume, for the simplicity of this example, that
only row2.height is flexible (i.e. dynamic), and
the content of row2 is made of numbers separated by linefeeds, like this:
17
2
34
Using variant A, we could manually test after how many linefeeds/numbers the footer would be pushed onto the next page; let's say 5.
So all we would have to do is
either dynamically adjust the bottom margin of row2 to max( 0, ( 5 - row2LinefeedsCount ) * row2lineHeight ) (e.g. via Groovy or Java)
or if it is based on some SQL select, fill in some blank space lines like this:
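-- (Oracle SQL)
select
...,
dyn_row2_col
-- add additional linefeeds if necessary (when < 5 lines)
-- (counts by the content length after removing all digits;
-- nvl() covers the case where nothing is left, since '' is NULL in Oracle)
-- greatest() is used because max() is an aggregate function;
-- rpad() starts from a non-empty seed because padding '' (NULL) yields NULL
|| rpad(
CHR(13),
greatest( 0, 5 - nvl( length( regexp_replace( dyn_row2_col, '\d', '' )), 0 )),
CHR(13) )
as dyn_row2_col,
...
from ...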
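The padding idea itself is independent of SQL. A minimal Python sketch, assuming the limit of 5 linefeeds found by the manual test and a made-up helper name pad_to_min_linefeeds:

```python
def pad_to_min_linefeeds(text, min_linefeeds=5, newline="\n"):
    """Append blank lines until the text contains at least min_linefeeds.

    min_linefeeds=5 is the value assumed in Example A; it has to be
    determined by testing the actual report layout.
    """
    missing = max(0, min_linefeeds - text.count(newline))
    return text + newline * missing


padded = pad_to_min_linefeeds("17\n2\n34")
print(repr(padded))  # '17\n2\n34\n\n\n'
```

Text that already contains enough linefeeds is returned unchanged, so the footer is only pushed down, never up.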
Example B
Another example could be that the height of row2 depends on, and increases linearly with, the number of subrows of some query result (e.g. if it is a simple subreport with equally sized rows). Then we could maybe use variant C and fill the spacer (invisibly) with some dummy query based on the subrow query (calculating the remaining space similarly to the example above).
search engine tags/phrases
fit band height to page size/height, band height 100% of page size, max-height, stretch vertically to 100% parent container height or page size
implementation/usage thoughts of Jasper Reports
I find it a pain not to be able to easily use flexible positioning/sizing expressions (e.g. to let the footer stick to the bottom, to float elements horizontally (e.g. table columns), or spacer.height: 75%) with Jasper Reports in this modern world of flex and responsive HTML layouts. The absolute positioning philosophy should be enhanced here. Eclipse BIRT is much better at this, but has other disadvantages. Of course the implementation could be quite complex, more memory-intensive and slower, but I think the benefit in scenarios that require non-absolute positioning would be great.
I wrote a general Scriptlet (which I plan to provide in jasper-utils) that solves the floating of horizontal columns based on named component styles (similar to CSS classes) on related columns. It behaves very similarly to HTML tables, using the available horizontal space based on column-specific percentages and "underflow-stretch-calc" expressions, with the possibility to show/hide columns (over all related bands).
I keep my budget in org-mode and have been pleased with how simple it is. The simplicity breaks down, however, when I apply formulas to many cells; for instance, my year summary table performs the same grab-and-calculate formulas for each month. I end up with a massive line in my #+TBLFM. This would be dramatically shorter if I could programmatically pass arguments to the formula. I'm looking for something like this, but working:
| SEPT |
| #ERROR |
#+TBLFM: @2$1=remote(@1,$tf)
Elsewhere I have a table named SEPT, and it has a field named "tf". This formula works if I replace "@1" with "SEPT", but then I would need a new entry in the formula for every column.
Is there a way to get this working, where the table itself can specify what remote table to call (such as the SEPT in my example)?
Yes. You can't do this with the built-in remote; you need to use org-table-get-remote-range instead. Hopefully this suits your needs better than the answer given by artscan (I used his/her example):
| testname1 | testname2 |
|-----------+-----------|
| 1 | 2 |
#+TBLFM: @2='(org-table-get-remote-range @<$0 (string ?@ ?1 ?$ ?1))
#+TBLNAME: testname1
| 1 |
#+TBLNAME: testname2
| 2 |
Note the (string ?@ ?1 ?$ ?1): this is necessary because all substitutions are performed before a table formula is evaluated. If you used "@1$1" directly, it would trigger the substitution mechanism and be replaced by the contents of the first cell in this table.
There is an ugly hack that achieves the same effect without using remote:
1) it needs a named variable for the remote address
(setq eab/test-remote "@1$1")
2) it uses an elisp expression (from org-table.el) instead of remote(tablename,@1$1)
(defun eab/test-remote (x)
`(car (read
(org-table-make-reference
(org-table-get-remote-range ,x eab/test-remote)
't 't nil))))
3) worked example
| testname1 | testname2 |
|-----------+-----------|
| | |
#+TBLFM: @2='(eval (eab/test-remote @1))
#+TBLNAME: testname1
| 1 |
#+TBLNAME: testname2
| 2 |
4) result
| testname1 | testname2 |
|-----------+-----------|
| 1 | 2 |