I can't figure out how to delete a table row in python-docx. Specifically, my tables come in with a header row and a row that has a special token stored in the text of the first cell. I search for tables with the token and then fill in a bunch of rows of the table. But how do I delete row 1, which has the token, before adding new rows?
I tried
table.rows[1].Delete()
and
table.rows = table.rows[0:1]
The first one fails with an unrecognized function (the documentation refers to this function in the Microsoft API, but I don't know what that means).
The second one fails because table.rows is read-only, as the documentation says.
So how do I do it?
This functionality is not built-in, which really shocks me. As I search forums I find many people asking for this over the last five years. However, a workaround exists and here it is:
def remove_row(table, row):
tbl = table._tbl
tr = row._tr
tbl.remove(tr)
row = table.rows[n]
remove_row(table, row)
The answer by All the Rage did not work for me unfortunately. However, I was able to remove rows and columns as follows:
from docx import Document
w=Document("path")
table_1=w.tables[0]
# delete row
print(len(table_1.rows))
row2=table_1.rows[1]
row2._element.getparent().remove(row2._element)
print(len(table_1.rows))
# delete column
col=table_1.table.columns[1]
for cell in col.cells:
cell._element.getparent().remove(cell._element)
You can use this function :
from docx import Document
document = Document('YOUR_DOCX')
def Delete_row_in_table(table, row):
document.tables[table]._tbl.remove(document.tables[table].rows[row]._tr)
Delete_row_in_table(0, 0)
document.save('OUT.docx')
I made this video to show how to do this easily because it confused me. https://www.youtube.com/watch?v=qA5QRXwAt2I
Table = document.tables[0]
RowA=Table.rows[0]
table_element = Table._tbl
table_element.remove(RowA._tr)
The code above removes the first row from the first table.
You have to alter the xml elements there isnt a way to do it through the API for docx.
Related
I am working on a solution that involves merging two queries in Power Query to retrieve a single data table back to Excel. The first query is always populated but the other query comes from an ERP and might be empty (empty table) from time to time.
Appending the two queries involves making the header names the same in the two queries before the appending takes place. As the second query sometimes results in an empty table, the error arises in the steps when Power Query is modifying the header names in the second table (it cannot modify the header names as there are no headers).
"Error message: Expression.Error: The column 'PartMtl_Company' of the table wasn't found.
Details: PartMtl_Company" where the PartMtl_Company is the leftmost column in my table.
I am kind of thinking that I would need to evaluate whether the second table is empty and skip the renaming steps if that is the case. I assume merging the populated first table with an empty table would cause no problem and would only result in the first table. I have tried to look around for a suitable M-code but have not come across such.
I'm thinking you might be able to use Table.RowCount to solve this. Something along the lines of:
= if Table.RowCount(Table2) > 0 then...
You would modify the headers only if there is data in the second table. Same goes for the appending of the tables: you would only append if there is data in the second table, since you won't have renamed any headers otherwise.
Thank you Marc! That did the trick.
In the end, I wrote some in the lines of
= if Table.RowCount(Table2) > 0 then... (code that works on a non-empty table) ...else Table2
, which returns the empty table if it is empty to begin with. Appending the second table into the first table did not throw an error but returned only the first table like planned.
I have a job where I am getting a flow into tOracleOutput where I am updating the table. Now, I have to update that table using an SQL statement, which I guess we have option in Advanced settings of tOracleOuptut, but I don't know how to use it or you can say that I am not getting the settings properly. I referred to official documentation but could not understand. Can any one explain the fields like Name, SQL expression, Position, Reference Column in a better way?
the SQL query which I am using is:
update set COL1=SOMETHING1
where COL2=SOMETHING2
Now, value for COL1 is coming from the flow but COL2 is some column in the table which is not coming from the flow.
Have a look to tOracleRow for such a case.
Hope this helps.
TRF
Using tOracleOutput is helpful when a ready data source (table or file (...) with same columns as destination) the more elaborate your query is, the more you should do as TRF said (and use tOracleRow), but here's an example to your question:
file contain 3 column,
DB table of destination contains 4 column, where the 4th is the date of update, (the first 3 are identical to the input)
so you add the destination's column's name in Name and put the SQL function for the date (eg: SYSDATE) and where to put it (Position) in reference to a column of your choice (Reference Column)
In my view it helps avoid using tMap for a miserable additional column when you want to Insert, but you want to Update, in which case the component doesn't offer the additional column section, plus I don't think you can add the WHERE clause here
Hope it helps
In matlab I have read in a table from a csv file, then moved two columns I am interested in into a new table. These columns are "ID" (of a person, 1-400) and then another ID to represent their occupation (1-12).
What I want to do is create a simple table with 12 records and 2 columns, there is a record for each job, and the number of user IDs who have this job must be aggregated/summed, such a table could be easily bar charted. At the moment I have 400 user records, all with their IDs and one of the 12 possible job IDs.
So much like an SQL aggregate/sum function, but I want to do it in Matlab, with a table object. The problem I am having is finding how to do this without using a cell array or something similar.
Thanks!
I know that you found an answer yourself, but I would like to mention the histc function, which avoids the loop (and is faster for larger matrices):
JobCounts = histc(OccupationTable(:,2), 1:NumberOfJobs);
Combining this with the job number gives the desired result:
result = [(1:NumberOfJobs)' JobCounts];
Nevermind, solved it. Just looped through the job numbers and ran "sum" where the ID was equal to what I wanted:
for i = 1:1:NumberOfJobs;
JobCounts(i,:) = sum(OccupationTable(:,2) == i);
end
Sorry if this has already been asked. I couldn't see it in previously asked questions.
I have a table - 'eightks'.
This file contains 1,000,000 text documents.
I only need those that mention the word 'other events'. So I am trying to do some text matching and then output these files into a new table.
My current code is;
SELECT * FROM eightks\d
WHERE to_tsvector(text) ## to_tsquery('other_events');
When I run this I get the following error
string is too long for tsvector (2368732 bytes, max 1048575 bytes)
Also How do I output the matching rows into a new table?
Any help is appreciated.
That's a documented limitation.
The length of a tsvector (lexemes + positions) must be less than 1 megabyte
It might be possible to change the source code and recompile. See ts_type.h. I suspect it won't be simple, though.
You might need to break the documents up into smaller pieces for searching, then combine the pieces for presentation to the user.
As for inserting the rows into another table, you can just insert a correct select statement. Basically . . .
insert into table_name
select ...
You might need to supply column names.
On a Mac, I have a txt file with two columns, one being an autoincrement in an sqlite table:
, "mytext1"
, "mytext2"
, "mytext3"
When I try to import this file, I get a datatype mismatch error:
.separator ","
.import mytextfile.txt mytable
How should the txt file be structured so that it uses the autoincrement?
Also, how do I enter in text that will have line breaks? For example:
"this is a description of the code below.
The text might have some line breaks and indents. Here's
the related code sample:
foreach (int i = 0; i < 5; i++){
//do some stuff here
}
this is a little more follow up text."
I need the above inserted into one row. Is there anything special I need to do to the formatting?
For one particular table, I want each of my rows as a file and import them that way. I'm guessing it is a matter of creating some sort of batch file that runs multiple imports.
Edit
That's exactly the syntax I posted, minus a tab since I'm using a comma. The missing line break in my post didn't make it as apparent. Anyways, that gives the mismatch error.
I was looking on the same problem. Looks like I've found an answer on the first part of your question — about importing a file into a table with ID field.
So yes, create a temporary table without ID, import your file into it, then do insert..select to copy its data into your target table. (Remove leading commas from mytextfile.txt).
-- assuming your table is called Strings and
-- was created like this:
-- create table Strings( ID integer primary key, Code text )
create table StringsImport( Code text );
.import mytextfile.txt StringsImport
insert into Strings ( Code ) select * from StringsImport;
drop table StringsImport;
Do not know what to do with newlines. I've read some mentions that importing in CSV mode will do the trick (.mode csv), but when I tried it did not seem to work.
In case anyone is still having issues with this you can download an SQLLite manager.
There are several that allow importing from a CSV file.
Here is one but a google search should reveal a few: http://sqlitemanager.en.softonic.com/
I'm in the process of moving data containing long text fields with various punctuation marks (they are actually articles on coding) into SQLite and I've been experimenting with various text imports.
I created a database in SQLite with a table:
CREATE TABLE test (id PRIMARY KEY AUTOINCREMENT, textfield TEXT);
then do a backup with .dump.
I then add the text below the "CREATE TABLE" line manually in the resulting .dump file as such:
INSERT INTO test textfield VALUES (1,'Is''t it great to have
really long text with various punctaution marks and
newlines');
Change any single quotes to two single quotes (change ' to ''). Note that an index number needs to be added manually (I'm sure there is an AWK/SED command to do it automatically). Change the auto increment number in the "sequence" line in the dump file to one above the last index number you added (I don't have SQLite in front of me to give you the exact line, but it should be obvious).
With the new file, I can then do a restore onto the database