Best way to present side-by-side text in Org Mode - org-mode

I seek a good way to present text side-by-side both in the Org Mode buffer and also in the exported document. Here is an example showing two approaches that I have tried:
#+options: html-postamble:nil
#+options: toc:nil num:nil tags:nil ^:{}
* if statements
- Presented as an Org mode =example=:
#+begin_example
Ada           | C            | Pascal       | Lisp
===           | =            | ======       | ====
if B1 then    | if (B1) {    | if B1 then   | (if B1 (progn
  S1;         |   S1;        |   S1         |   (S1))
elsif B2 then | } else {     | else         |   (if B2 (progn
  S2;         |   if (B2) {  |   if B2 then |     (S2))
else          |     S2;      |     S2       |     (S3)))
  S3;         |   } else {   |   else       |
end if;       |     S3;      |     S3       |
              |   }          |   end        |
              | }            | end          |
#+end_example
- Presented as an Org mode =table=:
#+macro: lf ##latex:\hspace{0pt}\\## ##html:<br>##
#+macro: nbsp ##latex:\hspace{1em}## ##html: ##
#+macro: indent {{{nbsp}}}{{{nbsp}}}
| Ada | C | Pascal | Lisp |
| <l> | <l> | <l> | <l> |
|---------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------|
| / | < | < | < |
| if B1 then{{{lf}}}{{{indent}}}S1;{{{lf}}}elsif B2 then{{{lf}}}{{{indent}}}S2;{{{lf}}}else{{{lf}}}{{{indent}}}S3;{{{lf}}}end if; | if (B1) {{{{lf}}}{{{indent}}}S1;} else {{{{lf}}}{{{indent}}}if (B2) {{{{lf}}}{{{indent}}}{{{indent}}}S2;{{{lf}}}{{{indent}}}} else {{{{lf}}}{{{indent}}}{{{indent}}}S3;{{{lf}}}{{{indent}}}}{{{lf}}}} | if B1 then{{{lf}}}{{{indent}}}S1{{{lf}}}else{{{lf}}}{{{indent}}}if B2 then{{{lf}}}{{{indent}}}{{{indent}}}S2{{{lf}}}{{{indent}}}else{{{lf}}}{{{indent}}}{{{indent}}}S3{{{lf}}}{{{indent}}}end{{{lf}}}end | (if B1 (progn{{{lf}}}{{{indent}}}(S1)){{{lf}}}{{{indent}}}(if B2 (progn{{{lf}}}{{{indent}}}{{{indent}}}(S2)){{{lf}}}{{{indent}}}{{{indent}}}(S3))) |
Surely no one would want to write that second version.
Here is how the document looks exported as HTML:
Is there a better way to do this?

Related

How to Decompose Global System Metrics to a Per Endpoint Basis on a Webserver

I'm implementing a metrics system for a backend API at scale and am running into a dilemma: using statsd, the application itself is logging request metrics on a per-endpoint basis, but the CPU metrics are at the global server level. Currently each server has 10 threads, meaning 10 requests can be processed at once (yeah, yeah, it's actually serial).
For example, if we have two endpoints, /user and /item, the statsd implementation is differentiating statistics (DB/Redis I/O, etc.) per endpoint. However, if we sample Linux system metrics every N seconds, those statistics inherently do not separate endpoints.
I believe that it would be possible, assuming that your polling time ("N seconds") is small enough and that you have enough diversity within your requests, to decompose the global system metrics to create an estimate at the endpoint level.
Imagine a scenario like this:
note: we'll say a represents a GET to /user and b represents a GET to /item
|----|----|----|----|----|----|----|----|----|-----|
| t1 | t2 | t3 | t4 | t5 | t6 | t7 | t8 | t9 | t10 |
|----|----|----|----|----|----|----|----|----|-----|
| a  | b  | b  | a  | a  | b  | b  | a  | b  | b   |
| b  | a  | b  |    | b  | a  | b  |    | b  |     |
| a  | b  | b  |    | a  | a  | b  |    | a  |     |
| a  |    | b  |    | b  | a  | a  |    | a  |     |
| a  |    | b  |    | a  | a  | b  |    |    |     |
|    |    |    |    | a  |    | a  |    |    |     |
|----|----|----|----|----|----|----|----|----|-----|
At every timestep, t (i.e. t1, t2, etc.), we also take a snapshot of our system metrics. I feel like there should be a way (possibly through a sort of signal decomposition) to estimate the avg load each a/b request takes. Now, in practice I have ~20 routes so it would be far more difficult to get an accurate estimate. But like I said before, provided your requests have enough diversity (but not too much) so that they overlap in certain places like above, it should be at the very least possible to get a rough estimate.
I have to imagine that there is some name for this kind of thing or at the very least some research or naive implementations of this method. In practice, are there any methods that can achieve these kinds of results?
Note: it may be more difficult when considering that requests may bleed over these timesteps, but almost all requests take <250ms. Even if our system stats polling rate is every 5 seconds (which is aggressive), this shouldn't really cause problems. It is also safe to assume that we would be achieving at the very least 50 requests/second on each server, so sparsity of data shouldn't cause problems.
I believe the answer is a sum decomposition via a system of linear equations. If we say that a system metric, for example CPU, is a function CPU(t1), then it is just a matter of solving the following set of equations for the posted example:
|----|----|----|----|----|----|----|----|----|-----|
| t1 | t2 | t3 | t4 | t5 | t6 | t7 | t8 | t9 | t10 |
|----|----|----|----|----|----|----|----|----|-----|
| a  | b  | b  | a  | a  | b  | b  | a  | b  | b   |
| b  | a  | b  |    | b  | a  | b  |    | b  |     |
| a  | b  | b  |    | a  | a  | b  |    | a  |     |
| a  |    | b  |    | b  | a  | a  |    | a  |     |
| a  |    | b  |    | a  | a  | b  |    |    |     |
|    |    |    |    | a  |    | a  |    |    |     |
|----|----|----|----|----|----|----|----|----|-----|
4a + b = CPU(t1)
a + 2b = CPU(t2)
5b = CPU(t3)
a = CPU(t4)
4a + 2b = CPU(t5)
4a + b = CPU(t6)
2a + 4b = CPU(t7)
a = CPU(t8)
2a + 2b = CPU(t9)
b = CPU(t10)
Now, the system is overdetermined, so there will be more than one estimate for each unknown (e.g. a = CPU(t8) and a = CPU(t4) need not agree exactly), but if you took the average of the a and b values from their corresponding solutions, you should get a pretty solid metric for this.
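The decomposition described above is an ordinary least-squares problem. Here is a minimal NumPy sketch: the per-timestep request counts are read off the example table, while the CPU readings are simulated from invented "true" per-request costs (an assumption, purely so the recovery can be checked):

```python
import numpy as np

# counts of (a, b) requests in flight at t1..t10, read off the table
counts = np.array([
    [4, 1], [1, 2], [0, 5], [1, 0], [4, 2],
    [4, 1], [2, 4], [1, 0], [2, 2], [0, 1],
], dtype=float)

true_cost = np.array([3.0, 7.0])   # hypothetical CPU cost per a/b request
rng = np.random.default_rng(0)
# simulated noisy global CPU readings at each timestep
cpu = counts @ true_cost + rng.normal(0.0, 0.1, size=10)

# solve the overdetermined system counts @ x ~= cpu in the least-squares sense
est, *_ = np.linalg.lstsq(counts, cpu, rcond=None)
print(est)  # close to [3.0, 7.0]
```

In practice, with ~20 routes you would have 20 unknowns, so you need enough timesteps with sufficiently diverse request mixes for the count matrix to be well conditioned.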

Iterate over Dataframe & Recursive filters

I have 2 dataframes. "MinNRule" & "SampleData"
MinNRule provides rule information based on which SampleData needs to be processed:
1. Aggregate SampleData on the columns defined in MinNRule.MinimumNPopulation, in MinNRule.OrderOfOperation order
2. Check whether Aggregate.Entity >= MinNRule.MinimumNValue
   a. All entities that do not meet MinNRule.MinimumNValue are removed from the population
   b. All entities that meet MinNRule.MinimumNValue are kept in the population
3. Perform steps 1 through 2 for the next MinNRule.OrderOfOperation, using the dataset from 2.b
MinNRule
| MinimumNGroupName | MinimumNPopulation | MinimumNValue | OrderOfOperation |
|:-----------------:|:------------------:|:-------------:|:----------------:|
| Group1 | People by Facility | 6 | 1 |
| Group1 | People by Project | 4 | 2 |
SampleData
| Facility | Project | PeopleID |
|:--------: |:-------: |:--------: |
| F1 | P1 | 166152 |
| F1 | P1 | 425906 |
| F1 | P1 | 332127 |
| F1 | P1 | 241630 |
| F1 | P2 | 373865 |
| F1 | P2 | 120672 |
| F1 | P2 | 369407 |
| F2 | P4 | 121705 |
| F2 | P4 | 211807 |
| F2 | P4 | 408041 |
| F2 | P4 | 415579 |
Proposed Steps:
1. Read MinNRule, read rule with OrderOfOperation=1
   a. GroupBy Facility, Count on People
   b. Aggregate SampleData by 1.a and compare to MinimumNValue=6
| Facility | Count | MinNPass |
|:--------: |:-------: |:--------: |
| F1 | 7 | Y |
| F2 | 4 | N |
2. Select MinNPass='Y' rows and filter the initial dataframe down to those entities (F2 gets dropped)
| Facility | Project | PeopleID |
|:--------: |:-------: |:--------: |
| F1 | P1 | 166152 |
| F1 | P1 | 425906 |
| F1 | P1 | 332127 |
| F1 | P1 | 241630 |
| F1 | P2 | 373865 |
| F1 | P2 | 120672 |
| F1 | P2 | 369407 |
3. Read MinNRule, read rule with OrderOfOperation=2
   a. GroupBy Project, Count on People
   b. Aggregate SampleData by 3.a and compare to MinimumNValue=4
| Project | Count | MinNPass |
|:--------: |:-------: |:--------: |
| P1 | 4 | Y |
| P2 | 3 | N |
4. Select MinNPass='Y' rows and filter the dataframe in 3 down to those entities (P2 gets dropped)
5. Print Final Result
| Facility | Project | PeopleID |
|:--------: |:-------: |:--------: |
| F1 | P1 | 166152 |
| F1 | P1 | 425906 |
| F1 | P1 | 332127 |
| F1 | P1 | 241630 |
Ideas:
I have been thinking of moving MinNRule to a LocalIterator and looping through it, "filtering" SampleData at each step
I am not sure how to pass the result at the end of one loop iteration over to the next
Still learning PySpark; unsure if this is the correct approach
I am using Azure Databricks
IIUC, since the rules dataframe defines the rules, it must be small and can be collected to the driver for performing the operations on the main data.
One approach to get the desired result is to collect the rules dataframe and pass it to reduce:
data = MinNRule.orderBy('OrderOfOperation').collect()

from pyspark.sql.functions import col, count
from functools import reduce

dfnew = reduce(
    lambda df, rules: df
        .groupBy(col(rules.MinimumNPopulation.split('by')[1].strip()))
        .agg(count(col({'People': 'PeopleID'}.get(
            rules.MinimumNPopulation.split('by')[0].strip()))).alias('count'))
        .filter(col('count') >= rules.MinimumNValue)
        .drop('count')
        .join(df, rules.MinimumNPopulation.split('by')[1].strip(), 'inner'),
    data,
    sampleData)
dfnew.show()
+-------+--------+--------+
|Project|Facility|PeopleID|
+-------+--------+--------+
| P1| F1| 166152|
| P1| F1| 425906|
| P1| F1| 332127|
| P1| F1| 241630|
+-------+--------+--------+
Alternatively, you can loop through the collected rules and get the same result; the performance remains the same in both cases:
import pyspark.sql.functions as f

mapped_cols = {'People': 'PeopleID'}
data = MinNRule.orderBy('OrderOfOperation').collect()

for i in data:
    cnt, grp = i.MinimumNPopulation.split('by')
    cnt = mapped_cols.get(cnt.strip())
    grp = grp.strip()
    sampleData = sampleData.groupBy(f.col(grp)).agg(f.count(f.col(cnt)).alias('count')) \
        .filter(f.col('count') >= i.MinimumNValue).drop('count') \
        .join(sampleData, grp, 'inner')
sampleData.show()
+-------+--------+--------+
|Project|Facility|PeopleID|
+-------+--------+--------+
| P1| F1| 166152|
| P1| F1| 425906|
| P1| F1| 332127|
| P1| F1| 241630|
+-------+--------+--------+
Note: you have to manually parse your rules grammar, as it is subject to change
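For testing the rule logic without a Spark cluster, the same iterative filtering can be sketched in plain pandas. The column names and the 'People' → 'PeopleID' mapping mirror the question; the data is the SampleData table:

```python
import pandas as pd

rules = pd.DataFrame({
    'MinimumNPopulation': ['People by Facility', 'People by Project'],
    'MinimumNValue': [6, 4],
    'OrderOfOperation': [1, 2],
})
sample = pd.DataFrame({
    'Facility': ['F1'] * 7 + ['F2'] * 4,
    'Project':  ['P1'] * 4 + ['P2'] * 3 + ['P4'] * 4,
    'PeopleID': [166152, 425906, 332127, 241630, 373865, 120672,
                 369407, 121705, 211807, 408041, 415579],
})

mapped_cols = {'People': 'PeopleID'}
# apply each rule in OrderOfOperation order, feeding the survivors forward
for _, rule in rules.sort_values('OrderOfOperation').iterrows():
    cnt, grp = [s.strip() for s in rule.MinimumNPopulation.split('by')]
    counts = sample.groupby(grp)[mapped_cols[cnt]].count()
    keep = counts[counts >= rule.MinimumNValue].index
    sample = sample[sample[grp].isin(keep)]

print(sample)  # the four F1/P1 rows remain
```

The loop carries the filtered frame into the next iteration, which is exactly the "pass 2.b forward" step the question asks about.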

Which Rule engine to use?

I have a requirement to handle multiple rules and select a value as per the matching criteria.
The rule could be
case-1
----------------------------------------
| A  | B  | C  | D  | priority | value |
----------------------------------------
| a1 | b1 |    | c1 | 1        | 250   |
----------------------------------------
|    | b2 | c2 | d2 | 3        | 200   |
----------------------------------------
| a3 | b3 | c3 | d3 | 2        | 100   |
----------------------------------------
As per the above defined rules, we look for the highest number of matching criteria first and select the value of that rule (i.e. the rule with value "100").
case-2
----------------------------------------
| A  | B  | C  | D  | priority | value |
----------------------------------------
| a1 | b1 |    | c1 | 1        | 100   |
----------------------------------------
|    | b2 | c2 | d2 | 2        | 200   |
----------------------------------------
If two conflicting rules are found with the same number of matching criteria, then look at the priority and select the rule with the highest priority; in this case, the rule with value "100".
case-3
----------------------------------------
| A  | B  | C  | D  | priority | value |
----------------------------------------
| a1 | b1 |    | c1 | 3        | 100   |
----------------------------------------
|    | b2 | c2 | d2 | 2        | 200   |
----------------------------------------
| a3 | b3 | c3 | d3 | 1        | 300   |
----------------------------------------
| a4 | b4 | c4 | d4 | 1        | 400   |
----------------------------------------
In this case, if more than one rule with the same number of matching criteria and the same priority is found, select the rule with the highest value (i.e. Rule 4 with value 400).
I know it looks very specific, but I tried to google and couldn't find any rule engine which can be used in this case.
Please help me out with some pointers and ideas to start with.
Like others have pointed out, any rule engine should do in your case. Since this seems at first glance to be a very lightweight use-case, you can use Rulette to do this almost trivially (Disclosure - I am the author). You could define your rules, and then use the getAllRules API to get the list of applicable rules on which you could do min/max as required.
I am curious, though, to understand why you would want to define conflicting rules and then apply a "priority" on them?
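For what it's worth, the selection logic described in the three cases is small enough to sketch without a rule engine at all. This is a minimal Python sketch, assuming a rule holds its non-wildcard criteria in a dict and that a lower priority number means higher priority (the question leaves that direction ambiguous):

```python
from dataclasses import dataclass

@dataclass
class Rule:
    criteria: dict   # non-wildcard attribute constraints, e.g. {'A': 'a3', 'B': 'b3'}
    priority: int    # lower number = higher priority (an assumption)
    value: int

def matches(rule, facts):
    # a rule applies when every one of its criteria equals the corresponding fact
    return all(facts.get(k) == v for k, v in rule.criteria.items())

def select(rules, facts):
    applicable = [r for r in rules if matches(r, facts)]
    if not applicable:
        return None
    # rank by: most matched criteria, then highest priority, then highest value
    return max(applicable, key=lambda r: (len(r.criteria), -r.priority, r.value))

# contrived tie: same criteria count, same priority -> highest value wins (case-3)
rules = [
    Rule({'A': 'a4', 'B': 'b4', 'C': 'c4', 'D': 'd4'}, priority=1, value=300),
    Rule({'A': 'a4', 'B': 'b4', 'C': 'c4', 'D': 'd4'}, priority=1, value=400),
]
best = select(rules, {'A': 'a4', 'B': 'b4', 'C': 'c4', 'D': 'd4'})
print(best.value)  # 400
```

A full rule engine earns its keep once rules need hot reloading, auditing, or non-equality predicates; for a pure ranking problem like this, a sort key is often enough.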

Merge orgmode tables vertically

Is it possible to append a table below another? I am looking for something like this but in the following form:
#+name: tbl1
| a | 1 |
| b | 2 |
#+name: tbl2
| c | 3 |
| d | 4 |
I am expecting to get this:
| a | 1 |
| b | 2 |
| c | 3 |
| d | 4 |
From my search I found lob-tables-operations but it seems to me that it's not well documented and likely not under maintenance.
It was quite straightforward based on this example; I just used mapcan instead of mapcar:
** append tables
:PROPERTIES:
:DATE: 2015-06-19
:END:
#+name: table-names
- first-table
- second-table
- third-table
#+name: first-table
| a | 1 |
| b | 2 |
|---+---|
#+name: second-table
| c | 3 |
| d | 4 |
|---+---|
#+name: third-table
| f | 5 |
| g | 6 |
|---+---|
#+BEGIN_SRC emacs-lisp :var table-names=table-names
(mapcan #'org-babel-ref-resolve table-names)
#+END_SRC
#+RESULTS:
| a | 1 |
| b | 2 |
|---+---|
| c | 3 |
| d | 4 |
|---+---|
| f | 5 |
| g | 6 |
|---+---|

conditional sum(sumif) in org-table

I have a table like this:
#+NAME: ENTRY
|------+--------|
| Item | Amount |
|------+--------|
| A | 100 |
| B | 20 |
| A | 120 |
| C | 40 |
| B | 50 |
| A | 20 |
| C | 16 |
|------+--------|
and then I need to sum each item in another table:
#+NAME: RESULT
|------+-----|
| Item | Sum |
|------+-----|
| A | 240 |
| B | 70 |
| C | 56 |
|------+-----|
I've tried using vlookup and remote references in this table, but I'm not able to sum the resulting list, e.g.:
#+TBLFM: $2=vsum((vconcat (org-lookup-all $1 '(remote(ENTRY,#2$1..#>$1)) '(remote(ENTRY,#2$2..#>$2)))))
But it does not give the right answer.
So I have to use a placeholder column to hold the resulting list and then sum it:
#+NAME: RESULT
|------+--------------+-----|
| Item | Placeholder | Sum |
|------+--------------+-----|
| A | [100 120 20] | 240 |
| B | [20 50] | 70 |
| C | [40 16] | 56 |
|------+--------------+-----|
#+TBLFM: $2='(vconcat (org-lookup-all $1 '(remote(ENTRY,#2$1..#>$1)) '(remote(ENTRY,#2$2..#>$2))))::$3=vsum($2)
Is there a better solution for this?
One way to do this is without vsum:
#+TBLFM: $2='(apply '+ (mapcar 'string-to-number (org-lookup-all $1 '(remote(ENTRY,#2$1..#>$1)) '(remote(ENTRY,#2$2..#>$2)))))
If you want to use a calc function, you can always use calc-eval:
#+TBLFM: $2='(calc-eval (format "vsum(%s)" (vconcat (org-lookup-all $1 '(remote(ENTRY,#2$1..#>$1)) '(remote(ENTRY,#2$2..#>$2))))))
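For comparison, the lookup-and-sum that these formulas perform is an ordinary grouped sum. A minimal Python analogue (data copied from the ENTRY table) can serve as a sanity check on the TBLFM results:

```python
from collections import defaultdict

# rows of the ENTRY table: (Item, Amount)
entry = [('A', 100), ('B', 20), ('A', 120), ('C', 40),
         ('B', 50), ('A', 20), ('C', 16)]

sums = defaultdict(int)
for item, amount in entry:
    sums[item] += amount   # accumulate per item, like org-lookup-all + vsum

print(dict(sums))  # {'A': 240, 'B': 70, 'C': 56}
```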