Inserting blank spaces at the end of a column name in a table using pander - knitr

I am trying to find a way of centering a column heading in a pander table using knitr to pdf in rmarkdwon, but keeping the column entries right justified.
---
title: "Table Doc"
output: pdf_document
---
```{r table, echo = FALSE}
table1 <- anova(lm(Petal.Length ~ Species*Petal.Width, iris))
names(table1) <- c("DF", "Sum Sq", "Mean Sq", "*F*", "*p*")
library(pander)
pander(table1, justify = c("left", rep("right", 5)))
```
There is no way to align individual cells inside a table in pandoc apparently. I want the entries to be to the right so they are all aligned properly but sit the column headings 'F' and 'p' in the center. So what I need to do is insert blank spaces after F and p to force them into the center. How do I do this? I tried simply inserting the blank spaces:
names(table1) <- c("DF", "Sum Sq", "Mean Sq", "*F* ", "*p* ")
but the spaces are not recognised by pander.
I also tried LaTex spacing characters
names(table1) <- c("DF", "Sum Sq", "Mean Sq", "*F*\\", "*p*\\")
but this didn't work either. Can anyone think of a workaround?

Related

How to strip extra spaces when writing from dataframe to csv

Read in multiple sheets (6) from an xlsx file and created individual dataframes. Want to write each one out to a pipe delimited csv.
ind_dim.to_csv (r'/mypath/ind_dim_out.csv', index = None, header=True, sep='|')
Currently outputs like this:
1|value1 |value2 |word1 word2 word3 etc.
Want to strip trailing blanks
Suggestion
Include the method .apply(lambda x: x.str.rstrip()) to your output string (prior to the .to_csv() call) to strip the right trailing blank from each field across the DataFrame. It would look like:
Change:
ind_dim.to_csv(r'/mypath/ind_dim_out.csv', index = None, header=True, sep='|')
To:
ind_dim.apply(lambda x: x.str.rstrip()).to_csv(r'/mypath/ind_dim_out.csv', index = None, header=True, sep='|')
It can be easily inserted to the output code string using '.' referencing. To handle multiple data types, we can enforce the 'object' dtype on import by including the argument dtype='str':
ind_dim = pd.read_excel('testing_xlsx_nums.xlsx', header=0, index_col=0, sheet_name=None, dtype='str')
Or on the DataFrame itself by:
df = pd.DataFrame(df, dtype='str')
Proof
I did a mock-up where the .xlsx document has 5 sheets, with each sheet having three columns: The first column with all numbers except an empty cell in row 2; the second column with both a leading blank and a trailing blank on strings, an empty cell in row 3, and a number in row 4; and the third column * with all strings having a leading blank, and an empty value in row 4*. Integer indexes and integer columns have been included. The text in each sheet is:
0 1 2
0 11111 valueB1 valueC1
1 valueB2 valueC2
2 33333 valueC3
3 44444 44444
4 55555 valueB5 valueC5
This code reads in our .xlsx testing_xlsx_dtype.xlsx to the DataFrame dictionary ind_dim.
Next, it loops through each sheet using a for loop to place the sheet name variable as a key to reference the individual sheet DataFrame. It applies the .str.rstrip() method to the entire sheet/DataFrame by passing the lambda x: x.str.rstrip() lambda function to the .apply() method called on the sheet/DataFrame.
Finally, it outputs the sheet/DataFrame as a .csv with the pipe delimiter using .to_csv() as seen in the OP post.
# reads xlsx in
ind_dim = pd.read_excel('testing_xlsx_nums.xlsx', header=0, index_col=0, sheet_name=None, dtype='str')
# loops through sheets, applies rstrip(), output as csv '|' delimit
for sheet in ind_dim:
ind_dim[sheet].apply(lambda x: x.str.rstrip()).to_csv(sheet + '_ind_dim_out.csv', sep='|')
Returns:
|0|1|2
0|11111| valueB1| valueC1
1|| valueB2| valueC2
2|33333|| valueC3
3|44444|44444|
4|55555| valueB5| valueC5
(Note our column 2 strings no longer have the trailing space).
We can also reference each sheet using a loop that cycles through the dictionary items; the syntax would look like for k, v in dict.items() where k and v are the key and value:
# reads xlsx in
ind_dim = pd.read_excel('testing_xlsx_nums.xlsx', header=0, index_col=0, sheet_name=None, dtype='str')
# loops through sheets, applies rstrip(), output as csv '|' delimit
for k, v in ind_dim.items():
v.apply(lambda x: x.str.rstrip()).to_csv(k + '_ind_dim_out.csv', sep='|')
Notes:
We'll still need to apply the correct arguments for selecting/ignoring indexes and columns with the header= and names= parameters as needed. For these examples I just passed =None for simplicity.
The other methods that strip leading and leading & trailing spaces are: .str.lstrip() and .str.strip() respectively. They can also be applied to an entire DataFrame using the .apply(lambda x: x.str.strip()) lambda function passed to the .apply() method called on the DataFrame.
Only 1 Column: If we only wanted to strip from one column, we can call the .str methods directly on the column itself. For example, to strip leading & trailing spaces from a column named column2 in DataFrame df we would write: df.column2.str.strip().
Data types not string: When importing our data, pandas will assume data types for columns with a similar data type. We can override this by passing dtype='str' to the pd.read_excel() call when importing.
pandas 1.0.1 documentation (04/30/2020) on pandas.read_excel:
"dtypeType name or dict of column -> type, default None
Data type for data or columns. E.g. {‘a’: np.float64, ‘b’: np.int32} Use object to preserve data as stored in Excel and not interpret dtype. If converters are specified, they will be applied INSTEAD of dtype conversion."
We can pass the argument dtype='str' when importing with pd.read_excel.() (as seen above). If we want to enforce a single data type on a DataFrame we are working with, we can set it equal to itself and pass it to pd.DataFrame() with the argument dtype='str like: df = pd.DataFrame(df, dtype='str')
Hope it helps!
The following trims left and right spaces fairly easily:
if (!require(dplyr)) {
install.packages("dplyr")
}
library(dplyr)
if (!require(stringr)) {
install.packages("stringr")
}
library(stringr)
setwd("~/wherever/you/need/to/get/data")
outputWithSpaces <- read.csv("CSVSpace.csv", header = FALSE)
print(head(outputWithSpaces), quote=TRUE)
#str_trim(string, side = c("both", "left", "right"))
outputWithoutSpaces <- outputWithSpaces %>% mutate_all(str_trim)
print(head(outputWithoutSpaces), quote=TRUE)
Starting Data:
V1 V2 V3 V4
1 "Something is interesting. " "This is also Interesting. " "Not " "Intereting "
2 " Something with leading space" " Leading" " Spaces with many words." " More."
3 " Leading and training Space. " " More " " Leading and trailing. " " Spaces. "
Resulting:
V1 V2 V3 V4
1 "Something is interesting." "This is also Interesting." "Not" "Intereting"
2 "Something with leading space" "Leading" "Spaces with many words." "More."
3 "Leading and training Space." "More" "Leading and trailing." "Spaces."

In orgmode ASCII export, omit blank lines between items in list

Consider the following:
#+OPTIONS: num:nil toc:nil
#+AUTHOR:
* h1
** h2
*** h3
**** item 1
**** item 2
**** item 3
When using ASCII export, orgmode renders that as:
h1
==
h2
~~
h3
--
* item 1
* item 2
* item 3
I don't like all of the blank lines (or empty lines, or newlines) between the itemized list items. I would prefer it to render as:
h1
==
h2
~~
h3
--
* item 1
* item 2
* item 3
How do I accomplish that?
Probably the simplest way is to customize org-ascii-headline-spacing and set it to nil. That will make the spacing of the output mirror the spacing of the input, so you can set it exactly as you want
The doc string of org-ascii-headline-spacing says:
Number of blank lines inserted around headlines.
This variable can be set to a cons cell. In that case, its car
represents the number of blank lines present before headline
contents whereas its cdr reflects the number of blank lines after
contents.
A nil value replicates the number of blank lines found in the
original Org buffer at the same place.

Formula won't display when certain fields are null

Similar to my first question. I want to show my address in a text box containing the fields as follows
{Company}
{AddLine1}
{AddLine2}
{ZIP}{State}{City}
{Country}
The line that concerns me (and hopefully some of you guys) is {ZIP} {City} {State}. What I want to produce is a consistent addressing format, so that there will be no blank space or indentation even if a ZIP, City or State field has been left blank in the DB. This line should still line up with the rest of the rows and not be indented. I also wish to insert commas between zip, state, city where they are relevant and leave them out where not. For this I have written a formula. Below:
If isnull({BILL_TO.ZIP}) or trim({BILL_TO.ZIP})= "" Then "" else {BILL_TO.ZIP}
+
(If isnull({BILL_TO.State}) or trim({BILL_TO.State})= "" Then ""
else(If not isnull({BILL_TO.ZIP}) and length(trim({BILL_TO.ZIP})) <> 0 Then ", " else "") + {BILL_TO.State})
+
(If isnull({BILL_TO.CITY}) or length(trim({BILL_TO.CITY})) = 0 Then ""
else(
If (not isnull({BILL_TO.State}) and length(trim({BILL_TO.State})) <> 0)
or
(not isnull({BILL_TO.Zip}) and length(trim({BILL_TO.Zip})) <> 0)
Then ", " else "")+ {BILL_TO.CITY}))
The problem is that when there is only a city (no state or zip entered) the formula itself will not display. It does however display when the others are present.
Can anyone see a bug in this code??? Its killing me
Thanks for the help.
Look forward to hearing from you guys!
There are a whole lot of if-then-elses going on in that formula so it's hard to tell, but if it's not displaying anything that probably means that you're using a field somewhere without handling its null condition first. Something simpler might be your best bet:
local stringvar output;
if not(isnull({Customer.Postal Code})) then output:=trim({Customer.Postal Code}) + ', ';
if not(isnull({Customer.Region})) then output:=output + trim({Customer.Region}) + ', ';
if not(isnull({Customer.City})) then output:=output + trim({Customer.City}) + ', ';
left(output,length(output)-2)
Create a formula field for each database field. For example:
// {#CITY}
If Isnull({BILL_TO.CITY}) Then
""
Else
Trim({BILL_TO.CITY})
// {#STATE}
If Isnull({BILL_TO.STATE}) Then
Space(2)
Else
Trim({BILL_TO.STATE})
// {#ZIP}
If Isnull({BILL_TO.ZIP}) Then
Space(5) //adjust to meet your needs
Else
Trim({BILL_TO.ZIP})
Embed each formula field in a text object.
Format text object and its fields to meet your need.
** edit **
I you have data-quality issues, address them in the STATE and ZIP formulas (because they will be a constant length). I would add the commas and spacing the text object.

How to efficiently transpose rows into columns in Vim?

I have a data file like the following:
----------------------------
a b c d e .............
A B C D E .............
----------------------------
But I want it to be in the following format:
----------------------------
a A
b B
c C
d D
e E
...
...
----------------------------
What is the quickest way to do the transformation in Vim or Perl?
Basically :.s/ /SpaceCtrl+vEnter/gEnterjma:.s/ /Ctrl+vEnter/gEnterCtrl+v'axgg$p'adG will do the trick. :)
OK, let's break that down:
:.s/ /Ctrl+vEnter/gEnter: On the current line (.), substitute (s) spaces (/ /) with a space followed by a carriage return (SpaceCtrl+vEnter/), in all positions (g). The cursor should now be on the last letter's line (e in the example).
j: Go one line down (to A B C D E).
ma: Set mark a to the current position... because we want to refer to this position later.
:.s/ /Ctrl+vEnter/gEnter: Do the same substitution as above, but without the Space. The cursor should now be on the last letter's line (E in the example).
Ctrl+v'a: Select from the current cursor position (E) to mark a (that we set in step 3 above), using the block select.
x: Cut the selection (into the " register).
gg: Move the cursor to the first line.
$: Move the cursor to the end of the line.
p: Paste the previously cut text after the cursor position.
'a: Move the cursor to the a mark (set in step 3).
dG: Delete everything (the empty lines left at the bottom) from the cursor position to the end of the file.
P.S. I was hoping to learn about a "built-in" solution, but until such time...
Simple re-map of the columns:
use strict;
use warnings;
my #a = map [ split ], <>; # split each line on whitespace and store in array
for (0 .. $#{$a[0]}) { # for each such array element
printf "%s %s\n", $a[0]->[$_], $a[1]->[$_]; # print elements in order
}
Usage:
perl script.pl input.txt
Assuming that the cursor is on the first of the two lines, I would use
the command
:s/ /\r/g|+&&|'[-;1,g/^/''+m.|-j

Displaying text from two fields, separated by a varying number of "." symbols, while preserving the total string length

I'm trying to create a Table of Contents for a small publication using Filemaker 10, since that's what the data has been stored in previously.
I'm able to generate page numbers, add heading to the TOC and pretty much everything else I've needed to do - one thing withstanding.
Our designer wants to fill each TOC line with "." to make it easier to read.
Currently:
Using Stack Overflow 1
Why Reddit is better than digg 7
Does Filemaker really suck this much 84
Ways to convince bosses 92
Ditching FileMaker 97
Wanted:
Using Stack Overflow..................................................1
Why Reddit is better than digg........................................7
Does Filemaker really suck this much.................................84
Ways to convince bosses..............................................92
Ditching FileMaker...................................................97
The item and page number are in different fields. Using a border is unsatisfactory because it underlines everything.
Solutions?
You can do this using tab stops in the Format -> Text menu
1) Create a calc field with the following definition (the character in the quotes is a tab):
title & " " & page
2) Add this field to your layout (it needs to be an actual field, not a merge field)
3) Highlight the field and choose format -> text -> paragraph -> tabs
4) Create a new Tab with a position of 6 inches and a Fill Character of "." or "…"
Now when viewed, any space from the end of the title up to the tab stop 6 inches away is filled with the fill character. No monospace font required.
You need to break it up into bits and then put it back with the right spacing. Something like this would do :
Let ( [
text = "Why Reddit is better than digg........................................7" ;
len = Length ( text ) ;
end = RightWords ( text ; 1 ) ;
lenEnd = Length ( end ) ;
lenStart = Length ( Trim ( Right ( text ; len - lenEnd ) ) ) ] ;
Left ( text ; lenStart ) &
Left ( "..........................................................................." ; len - lenStart - lenEnd ) &
end )
I've built the "text" variable into the calc for testing, but you could do this as a Custom Function or just inside a calculation with the field instead.
Also this assumes you're using a mono spaced font and the gap in the middle is a space character.