This is the div content I want to get from a web page:
<div class="result clearfix table-responsive">
<table class="table table-striped">
<thead>
<tr>
<th>Giải thưởng</th>
<th>Trùng khớp</th>
<th>Số lượng giải</th>
<th style="text-align: left; width: 22%;">Giá trị giải (đồng)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Jackpot</td>
<td>Trùng 6 số</td>
<td>0</td>
<td style="text-align: left"><span>27.868.784.500</span></td>
</tr>
<tr>
<td>Giải nhất</td>
<td>Trùng 5 số</td>
<td>18</td>
<td style="text-align: left"><span>10.000.000</span></td>
</tr>
<tr>
<td>Giải nhì</td>
<td>Trùng 4 số</td>
<td>613</td>
<td style="text-align: left"><span>300.000</span></td>
</tr>
<tr>
<td>Giải ba</td>
<td>Trùng 3 số</td>
<td>11047</td>
<td style="text-align: left"><span>30.000</span></td>
</tr>
</tbody>
</table>
<p class="role-result">
<span>Thời hạn lĩnh thưởng của vé trúng thưởng: là 60 (sáu mươi) ngày, kể từ ngày xác định kết quả trúng thưởng hoặc kể từ ngày hết hạn lưu hành của vé xổ số tự chọn số điện toán (nếu có). Quá thời hạn trên, các vé trúng thưởng không còn giá trị lĩnh thưởng.</span>
</p>
<div>
<a class="view-more" href="winning-numbers">Các lần quay trước</a>
</div>
</div>
This is my code to get the div content and echo it to my site:
$kqxsmega = file_get_contents ("http://vietlott.vn/vi/trung-thuong/ket-qua-trung-thuong/mega-6-45/");
$dom = new DomDocument();
$dom->loadHTML($kqxsmega);
$finder = new DomXPath($dom);
$classname="result clearfix table-responsive";
$divContent = $finder->query("//*[contains(@class, '$classname')]");
My code runs without errors, but I want to convert $divContent to a string so that I can echo it.
Right now, echoing $divContent shows nothing:
echo $divContent ;
Please help me.
Thank you.
I am making a report for the invoice line, I have purchased a module in the third-party odoo store and it performs its function well.
But I can't see the discount on the invoice line.
I think the module is preventing it, but I no longer have developer support.
What I need is for the discount (price list) to be visible on the invoice line.
Which table or field holds the invoice line discount?
Here is the code that I have in the report:
<tbody class="invoice_tbody">
<tr t-foreach="invoice_lines[0]" t-as="line">
<td><b><span t-esc="line['client_ref']"/></b>
<span t-esc="line['description']"/></td>
<td class="text-right">
<span t-esc="line['qty']"/>
</td>
<td class="text-right">
<span t-esc="line['price_unit']"/>
</td>
<td t-if="display_discount" class="text-right">
</td>
<td class="text-right" id="subtotal">
<t t-if="line['price_subtotal']">
<span t-esc="line['price_subtotal']" t-options='{"widget": "monetary", "display_currency": o.currency_id}'/></t>
</td>
</tr>
<tr t-foreach="range(max(5-len(o.invoice_line_ids),0))" t-as="l">
<td t-translation="off">&amp;nbsp;</td>
<td class="hidden"/>
<td/>
<td/>
<td t-if="display_discount"/>
<td/>
<td/>
</tr>
</tbody>
</t>
Yes, this parameter is in the report
"view/report_invoice_document"
But the report that I am trying to modify is this one:
report_invoice_document_inherit
<?xml version="1.0"?>
<data inherit_id="account.report_invoice_document">
<xpath expr="//table[@name='invoice_line_table']/tbody" position="replace">
<t t-if="res_company.is_group_by_so">
<t t-set="invoice_lines" t-value="o.get_invoice_lines()"/>
<tbody class="invoice_tbody">
<tr t-foreach="invoice_lines[0]" t-as="line">
<td><b><span t-esc="line['client_ref']"/></b>
<span t-esc="line['description']"/></td>
<!-- <td class="hidden"><span t-esc="line['client_ref']"/></td> -->
<td class="text-right">
<span t-esc="line['qty']"/>
<!-- <span t-field="l.uom_id" groups="product.group_uom"/> -->
</td>
<td class="text-right">
<span t-esc="line['price_unit']"/>
</td>
<td t-if="display_discount" class="text-right">
<!-- <span t-esc="line['price_unit']"/> -->
</td>
<td class="text-right" id="subtotal">
<t t-if="line['price_subtotal']">
<span t-esc="line['price_subtotal']" t-options='{"widget": "monetary", "display_currency": o.currency_id}'/></t>
</td>
</tr>
<tr t-foreach="range(max(5-len(o.invoice_line_ids),0))" t-as="l">
<td t-translation="off"> </td>
<td class="hidden"/>
<td/>
<td/>
<td t-if="display_discount"/>
<td/>
<td/>
</tr>
</tbody>
</t>
<t t-else="">
<tbody class="invoice_tbody">
<tr t-foreach="o.invoice_line_ids" t-as="l">
<td><span t-field="l.name"/></td>
<td class="hidden"><span t-field="l.origin"/></td>
<td class="text-right">
<span t-field="l.quantity"/>
<span t-field="l.uom_id" groups="product.group_uom"/>
</td>
<td class="text-right">
<span t-field="l.price_unit"/>
</td>
<td t-if="display_discount" class="text-right">
<span t-field="l.discount"/>
</td>
<td class="text-right">
<span t-esc="', '.join(map(lambda x: (x.description or x.name), l.invoice_line_tax_ids))"/>
</td>
<td class="text-right" id="subtotal">
<span t-field="l.price_subtotal" t-options='{"widget": "monetary", "display_currency": o.currency_id}'/>
</td>
</tr>
<tr t-foreach="range(max(5-len(o.invoice_line_ids),0))" t-as="l">
<td t-translation="off"> </td>
<td class="hidden"/>
<td/>
<td/>
<td t-if="display_discount"/>
<td/>
<td/>
</tr>
</tbody>
</t>
</xpath>
</data>
I have tried to modify the second report, and I have also looked at the Python code in case something there is responsible:
invoice_report_grouped_by/report/account_invoice.py
# -*- coding: utf-8 -*-
from odoo import api, models
from datetime import datetime


class AccountInvoice(models.Model):
    _inherit = "account.invoice"

    def get_notation_amt(self, amt):
        '''This method help us to return the value of the product pricing'''
        amount = str(amt).split('.')
        if len(amount) == 2:
            amount = amount[0] + "," + amount[1]
            return amount
        return amt

    @api.multi
    def get_product_invoice_lines(self, client_ref=False):
        '''This method helps to get the data for the following Invoice Line.'''
        product_invoices = []
        client_order_ref = []
        for line in self.invoice_line_ids:
            sale_line = (False, line)
            if line.sale_line_ids:
                sale_line = (line.sale_line_ids[0].order_id, line)
            client_order_ref.append(sale_line)
        if client_order_ref:
            for ref in client_order_ref:
                if (client_ref == ref[0]):
                    product_invoices.append({'price_subtotal': ref[1].price_unit * ref[1].quantity,
                                             'default_code': ref[1].product_id.default_code,
                                             'client_ref': False,
                                             'discount': ref[1].discount,
                                             'taxes': ",".join(map(lambda x: (x.description or x.name), ref[1].invoice_line_tax_ids)),
                                             'description': ref[1].name,
                                             'qty': self.get_notation_amt(ref[1].quantity),
                                             'price_unit': self.get_notation_amt("{0:.3f}".format(ref[1].price_unit)),
                                             })
        else:
            for line in self.invoice_line_ids:
                product_invoices.append({'price_subtotal': line.price_unit * line.quantity,
                                         'default_code': line.product_id.default_code,
                                         'client_ref': False,
                                         'discount': line.discount,
                                         'taxes': ",".join(map(lambda x: (x.description or x.name), line.invoice_line_tax_ids)),
                                         'description': line.name,
                                         'qty': self.get_notation_amt(line.quantity),
                                         'price_unit': self.get_notation_amt("{0:.3f}".format(line.price_unit)),
                                         })
        return product_invoices

    @api.multi
    def get_invoice_lines(self):
        '''This method help to get the invoice line group by Sale order'''
        vals = []
        sale_order_lines = []
        false_sale_order_lines = []
        for line in self.invoice_line_ids:
            sale_line = False
            if line.sale_line_ids:
                sale_line = line.sale_line_ids[0].order_id
            if sale_line:
                sale_order_lines.append(sale_line)
            else:
                false_sale_order_lines.append(sale_line)
        sale_order_lines = list(set(sale_order_lines))
        false_sale_order_lines = list(set(false_sale_order_lines))
        for sale_order in sale_order_lines:
            if sale_order and self.origin:
                # NOTE: str() does not parse dates; datetime.strptime was probably intended here.
                confirmation_date = str(
                    sale_order.confirmation_date, '%d-%m-%Y %H:%M:%S').strftime('%d/%m/%Y')
                client_ref = sale_order.name + ' - ' + confirmation_date
                if sale_order.client_order_ref:
                    client_ref = client_ref + ' - ' + sale_order.client_order_ref
                vals.append({'price_subtotal': False, 'default_code': False,
                             'client_ref': client_ref, 'description': False,
                             'qty': False, 'price_unit': False, 'taxes': False, 'discount': False})
                vals.extend(self.get_product_invoice_lines(client_ref=sale_order))
        # for sort false sale order, display manually invoice line at last
        for so in false_sale_order_lines:
            vals.extend(self.get_product_invoice_lines(client_ref=so))
        return [vals, len(vals)]
You can see the default report here:
https://github.com/odoo/odoo/blob/06f9baae968674547cb2592b1c22147bfb2e8ba9/addons/account/views/report_invoice.xml#L49
<t t-set="display_discount" t-value="any([l.discount for l in o.invoice_line_ids])"/>
This means that if any line has a discount, it should display it.
I think there are two ways to disable it: remove that line from the report, or set display_discount to false.
Knowing the module that breaks your report, the problem should be easy to find.
But the exact reason is hard to tell without seeing your module.
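To illustrate how the display_discount expression quoted above evaluates, here is a minimal sketch in plain Python. The Line tuple and the sample values are hypothetical stand-ins for Odoo invoice lines; only the any(...) expression comes from the default report.

```python
from collections import namedtuple

# Hypothetical stand-in for an invoice line; only `discount` matters here.
Line = namedtuple("Line", ["name", "discount"])


def display_discount(lines):
    """Mirror the report expression any([l.discount for l in o.invoice_line_ids])."""
    return any([l.discount for l in lines])


no_discount = [Line("A", 0.0), Line("B", 0.0)]
one_discount = [Line("A", 0.0), Line("B", 10.0)]

print(display_discount(no_discount))   # no line has a discount
print(display_discount(one_discount))  # one line has a discount
print(display_discount([]))            # invoice without lines
```

A discount of 0.0 is falsy, so the column is only switched on when at least one line carries a non-zero discount.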
I have the following code to model a regression and print the summary to a log file
#Finding the model fit using the multiple regression
fit = smf.ols(self.formula_string, data=df_train).fit()
fit_parameters = str(fit.params)
fit_summary = str(fit.summary())
logger.info('fit_summary' + fit_summary)
As we know, the summary has a table followed by a grid. Can the grid part of the summary alone (shown in blue in the sample image) be converted to an HTML file?
The summary of OLS is built from 3 separate tables. Each of the tables can be converted separately to string/text, HTML or LaTeX.
In the following, res is an OLS results instance returned by the fit method:
>>> summ = res.summary()
>>> dir(summ)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__',
'__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__',
'__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
'__subclasshook__', '__weakref__', '_repr_html_', 'add_extra_txt',
'add_table_2cols', 'add_table_params', 'as_csv', 'as_html', 'as_latex',
'as_text', 'extra_txt', 'tables']
>>> len(summ.tables)
3
>>> summ.tables[1].as_html()
'<table class="simpletable">\n<tr>\n <td></td> <th>coef</th> <th>std err</th> <th>t</th> <th>P>|t|</th> <th>[0.025</th> <th>0.975]</th> \n</tr>\n<tr>\n <th>C(Region)[C]</th> <td> 38.6517</td> <td> 9.456</td> <td> 4.087</td> <td> 0.000</td> <td> 19.826</td> <td> 57.478</td>\n</tr>\n<tr>\n <th>C(Region)[E]</th> <td> 23.2239</td> <td> 14.931</td> <td> 1.555</td> <td> 0.124</td> <td> -6.501</td> <td> 52.949</td>\n</tr>\n<tr>\n <th>C(Region)[N]</th> <td> 28.6347</td> <td> 13.127</td> <td> 2.181</td> <td> 0.032</td> <td> 2.501</td> <td> 54.769</td>\n</tr>\n<tr>\n <th>C(Region)[S]</th> <td> 34.1034</td> <td> 10.370</td> <td> 3.289</td> <td> 0.002</td> <td> 13.459</td> <td> 54.748</td>\n</tr>\n<tr>\n <th>C(Region)[W]</th> <td> 28.5604</td> <td> 10.018</td> <td> 2.851</td> <td> 0.006</td> <td> 8.616</td> <td> 48.505</td>\n</tr>\n<tr>\n <th>Literacy</th> <td> -0.1858</td> <td> 0.210</td> <td> -0.886</td> <td> 0.378</td> <td> -0.603</td> <td> 0.232</td>\n</tr>\n<tr>\n <th>Wealth</th> <td> 0.4515</td> <td> 0.103</td> <td> 4.390</td> <td> 0.000</td> <td> 0.247</td> <td> 0.656</td>\n</tr>\n</table>'
>>> print(summ.tables[1])
================================================================================
coef std err t P>|t| [0.025 0.975]
--------------------------------------------------------------------------------
C(Region)[C] 38.6517 9.456 4.087 0.000 19.826 57.478
C(Region)[E] 23.2239 14.931 1.555 0.124 -6.501 52.949
C(Region)[N] 28.6347 13.127 2.181 0.032 2.501 54.769
C(Region)[S] 34.1034 10.370 3.289 0.002 13.459 54.748
C(Region)[W] 28.5604 10.018 2.851 0.006 8.616 48.505
Literacy -0.1858 0.210 -0.886 0.378 -0.603 0.232
Wealth 0.4515 0.103 4.390 0.000 0.247 0.656
================================================================================
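Building on the session above, a small helper could write that second table to an HTML file. This is only a sketch: the function name save_coef_table_html and the output file name are made up for illustration, and the helper merely assumes that the object passed in has a tables list whose entries offer as_html(), as shown above.

```python
from pathlib import Path


def save_coef_table_html(summary, path):
    """Write the second summary table (the coefficient grid) to an HTML file.

    `summary` is expected to behave like the object returned by
    res.summary(): it has `tables`, and each table has as_html().
    """
    html = summary.tables[1].as_html()
    Path(path).write_text(html, encoding="utf-8")
    return html


# Usage with the names from the question (assumes statsmodels is installed):
#   fit = smf.ols(self.formula_string, data=df_train).fit()
#   save_coef_table_html(fit.summary(), "coef_table.html")
```

The returned string can also be passed to the logger instead of str(fit.summary()) if only the grid is wanted in the log.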
I have tried to implement the FooTable plugin in an MVC application. It currently has 3000+ users, which fail to load, and the browser page times out. For the User Admin section I have created the following View:
@model IEnumerable<MyBudgetWeb.Models.ApplicationUser>
@{
ViewBag.Title = "Index";
}
<script src="~/Scripts/jquery-1.10.2.js"></script>
<script type="text/javascript">
$(function () {
$('.footable').footable();
});
</script>
<br /><br />
<h2>Bill Pay Users</h2>
<input id="filter" type="text" placeholder="Filter" />
<br /><br />
@if (Model.Count() > 0)
{
<table class="table" data-filter="#filter" data-page-size="50">
<thead>
<tr>
<th data-type="numeric">Customer Number</th>
<th data-type="numeric">Account Name</th>
<th data-type="numeric" data-hide="phone">Go Live Date</th>
<th data-type="numeric" data-hide="phone">Online Live Date</th>
<th>Actions</th>
</tr>
</thead>
<tbody>
@foreach (var item in Model)
{
<tr>
<td>@item.CustomerNumber</td>
<td>@item.AccountName</td>
<td>@item.GoLiveDate</td>
<td>@item.OnlineCreationTimestamp</td>
<td>
@Html.ActionLink("Edit", "Edit", new { id = item.Id }) |
@Html.ActionLink("Details", "Details", new { id = item.Id }) |
@*@Html.ActionLink("Delete", "Delete", new { id = item.Id })*@
</td>
</tr>
}
</tbody>
<tfoot class="hide-if-no-paging">
<tr>
<td colspan="12" class="text-center">
<ul class="pagination"></ul>
</td>
</tr>
</tfoot>
</table>
}
else
{
<h2>You are still awaiting users to be added. All users will be displayed here.</h2>
}
My controller looks as follows:
public async Task<ActionResult> Index()
{
return View(await UserManager.Users.ToListAsync());
}
Are there any workarounds to prevent this?
There are two options to solve this:
1. On the PHP side, remove the execution time limit. Put this at the top of your PHP script:
set_time_limit(0);
2. Use FooTable with AJAX, as described here:
http://fooplugins.github.io/FooTable/
I'm just starting to learn CoffeeScript, and I'm working with React.js. I'm trying to determine which header was clicked, and I've been advised not to use data attributes on each header. I have some ideas for how to handle this in the handleHeaderClick function, but I'm not exactly sure how they should be implemented. I'm also thinking about splitting the ContactsTable component into a ContactsTableHeader component and a ContactsTableRow component, but I would still have the same issue in ContactsTableHeader: determining which header was clicked.
#Application.cjsx
handleHeaderClick: ->
  # childComponent.props
  # childComponent.refs
  # React.findDOMNode(childComponent.refs.firstName)
  # React.findDOMNode(childComponent.refs.lastName)
  # React.findDOMNode(childComponent.refs.age)
render: ->
  <div>
    <ContactsTable contactList={@state.contacts} onClick={@handleHeaderClick} />
  </div>
#ContactsTable.cjsx
render: ->
  if @props.contactList
    contactsList = @props.contactList.map (contact) ->
      <tr><td>{"#{contact.firstName}"}</td><td>{"#{contact.lastName}"}</td><td>{contact.age}</td></tr>
  <table style={tableStyle}>
    <thead style={headerStyle} onClick={@props.onClick}>
      <tr>
        <th>FirstName</th>
        <th>Last Name</th>
        <th>Age</th>
      </tr>
    </thead>
    <tbody>
      {contactsList}
    </tbody>
  </table>
You can do something like this.
#Application.cjsx
handleHeaderClick: (header) ->
  @setState({clickedHeader: header})
  #do something else
render: ->
  <div>
    <ContactsTable contactList={@state.contacts} onClick={@handleHeaderClick} />
  </div>
#ContactsTable.cjsx
render: ->
  if @props.contactList
    contactsList = @props.contactList.map (contact) ->
      <tr><td>{"#{contact.firstName}"}</td><td>{"#{contact.lastName}"}</td><td>{contact.age}</td></tr>
  # bind() passes the header name without calling the handler during render
  <table style={tableStyle}>
    <thead style={headerStyle}>
      <tr>
        <th onClick={@props.onClick.bind(null, 'FirstName')}>FirstName</th>
        <th onClick={@props.onClick.bind(null, 'LastName')}>Last Name</th>
        <th onClick={@props.onClick.bind(null, 'Age')}>Age</th>
      </tr>
    </thead>
    <tbody>
      {contactsList}
    </tbody>
  </table>
Here clickedHeader is kept in the state of the component defined in Application.cjsx. You could also keep it inside ContactsTableHeader, which would look similar.
I need some help with sed. I'm using it on Windows and Mac OS X. I need sed to add
</tr>
<tr>
every 4 lines after the first <tr> found, and to stop doing so at </tr>.
I just can't find a way to do this.
Every file will have up to 20 tables, so I need to do it automatically,
changing from this
<div class="titulo"> TERMINAL CAPAO DA IMBUIA</div>
<div class="dataedia">
Válido a partir de: 30/07/2012 -
DIA ÚTIL</div>
<table>
<tr>
<td>05:50</td>
<td>05:58</td>
<td>06:04</td>
<td>06:08</td>
<td>06:12</td>
<td>06:15</td>
<td>06:17</td>
<td>06:20</td>
<td>06:22</td>
<td>06:25</td>
<td>06:27</td>
<td>06:30</td>
<td>06:32</td>
<td>06:35</td>
<td>06:37</td>
<td>06:39</td>
<td>06:42</td>
<td>06:44</td>
<td>06:47</td>
<td>06:49</td>
<td>06:52</td>
<td>06:54</td>
<td>06:57</td>
<td>06:59</td>
<td>07:01</td>
<td>07:04</td>
<td>07:06</td>
<td>07:09</td>
<td>07:11</td>
<td>07:14</td>
<td>07:16</td>
<td>07:18</td>
<td>07:21</td>
<td>07:23</td>
<td>07:26</td>
<td>07:28</td>
<td>07:31</td>
<td>07:33</td>
<td>07:36</td>
<td>07:38</td>
</tr>
</table>
</div>
to this
<div class="titulo"> TERMINAL CAPAO DA IMBUIA</div>
<div class="dataedia">
Válido a partir de: 30/07/2012 -
DIA ÚTIL</div>
<table>
<tr>
<td>05:50</td>
<td>05:58</td>
<td>06:04</td>
<td>06:08</td>
</tr>
<tr>
<td>06:12</td>
<td>06:15</td>
<td>06:17</td>
<td>06:20</td>
</tr>
<tr>
<td>06:22</td>
<td>06:25</td>
<td>06:27</td>
<td>06:30</td>
</tr>
<tr>
<td>06:32</td>
<td>06:35</td>
<td>06:37</td>
<td>06:39</td>
</tr>
<tr>
<td>06:42</td>
<td>06:44</td>
<td>06:47</td>
<td>06:49</td>
</tr>
<tr>
<td>06:52</td>
<td>06:54</td>
<td>06:57</td>
<td>06:59</td>
</tr>
<tr>
<td>07:01</td>
<td>07:04</td>
<td>07:06</td>
<td>07:09</td>
</tr>
<tr>
<td>07:11</td>
<td>07:14</td>
<td>07:16</td>
<td>07:18</td>
</tr>
<tr>
<td>07:21</td>
<td>07:23</td>
<td>07:26</td>
<td>07:28</td>
</tr>
<tr>
<td>07:31</td>
<td>07:33</td>
<td>07:36</td>
<td>07:38</td>
</tr>
</table>
</div>
Is it possible with sed? If not, what tool should I use?
Thanks
I don't like the idea of using sed to handle HTML code. That said, try this:
Content of script.sed:
## For every line between '<tr>' and '</tr>' do ...
/<tr>/,/<\/tr>/ {
## Omit range edges.
/<\/\?tr>/ b;
## Append '<td>...</td>' to Hold Space (HS).
H;
## Get HS to Pattern Space (PS) to work with it.
x;
## If there are at least four newline characters, there are four
## '<td>' tags too, so add a '<tr>' before them and a '</tr>' after them,
## print, and delete them (already processed).
/\(\n[^\n]*\)\{4\}/ {
s/^\(\n\)/<tr>\1/;
s/$/\n<\/tr>/;
p
s/^.*$//;
}
## Save the '<td>'s to HS again and read next line.
x;
b;
}
## Print all lines out of the range.
p;
Assuming infile with the data posted in the question, run the script like:
sed -nf script.sed infile
That yields:
<div class="titulo"> TERMINAL CAPAO DA IMBUIA</div>
<div class="dataedia">
Válido a partir de: 30/07/2012 -
DIA ÚTIL</div>
<table>
<tr>
<td>05:50</td>
<td>05:58</td>
<td>06:04</td>
<td>06:08</td>
</tr>
<tr>
<td>06:12</td>
<td>06:15</td>
<td>06:17</td>
<td>06:20</td>
</tr>
<tr>
<td>06:22</td>
<td>06:25</td>
<td>06:27</td>
<td>06:30</td>
</tr>
<tr>
<td>06:32</td>
<td>06:35</td>
<td>06:37</td>
<td>06:39</td>
</tr>
<tr>
<td>06:42</td>
<td>06:44</td>
<td>06:47</td>
<td>06:49</td>
</tr>
<tr>
<td>06:52</td>
<td>06:54</td>
<td>06:57</td>
<td>06:59</td>
</tr>
<tr>
<td>07:01</td>
<td>07:04</td>
<td>07:06</td>
<td>07:09</td>
</tr>
<tr>
<td>07:11</td>
<td>07:14</td>
<td>07:16</td>
<td>07:18</td>
</tr>
<tr>
<td>07:21</td>
<td>07:23</td>
<td>07:26</td>
<td>07:28</td>
</tr>
<tr>
<td>07:31</td>
<td>07:33</td>
<td>07:36</td>
<td>07:38</td>
</tr>
</table>
</div>
Try awk:
awk '{print}; /<td>/ && ++i==4 {print "</tr>\n<tr>"; i=0}' file
Print the line.
If it's a <td>, increase i.
If i is 4, print </tr><tr> and reset i.
Tested with the given input, this returns the desired output;
the only "problem" is that an extra <tr></tr> pair appears at the end of the list.
This is fixable, but I'm running out of time here.
When I get back I can look into it if you think it is needed.
The end of the result file looks like this:
<td>07:26</td>
<td>07:28</td>
</tr>
<tr>
<td>07:31</td>
<td>07:33</td>
<td>07:36</td>
<td>07:38</td>
</tr>
<tr> <-- extra <tr></tr> here
</tr>
</table>
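One way to avoid the trailing empty row is to emit the row break lazily, just before the fifth cell, instead of right after the fourth. Below is a minimal sketch of that idea in Python rather than awk; the function name regroup_rows and the sample data are made up for illustration, and it assumes one tag per line as in the question's input.

```python
def regroup_rows(lines, per_row=4):
    """Insert '</tr>' and '<tr>' after every `per_row` <td> lines of a row.

    The break is emitted just before the next <td>, so no empty
    <tr></tr> pair is left behind when a row ends on a full group.
    """
    out = []
    count = 0
    for line in lines:
        stripped = line.strip()
        if stripped == "<tr>":
            count = 0  # a new source row starts, so restart counting
        elif stripped.startswith("<td>"):
            if count == per_row:
                out.extend(["</tr>", "<tr>"])
                count = 0
            count += 1
        out.append(line)
    return out


# Hypothetical sample: one table row holding eight cells.
sample = ["<table>", "<tr>"] + ["<td>%02d</td>" % i for i in range(8)] + ["</tr>", "</table>"]
print("\n".join(regroup_rows(sample)))
```

Because the counter resets on every <tr>, the same function should cope with several tables in one file, as long as each tag sits on its own line.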
You can try regular expressions. You can test the following expression at:
http://gskinner.com/RegExr/
Catch expression:
(<td>.*?</td>.<td>.*?</td>.<td>.*?</td>.<td>.*?</td>)(?!.</tr>)
Replace expression:
$1\n</tr>\n<tr>
Flags checked:
global, ignorecase, dotall
Result:
<table>
<tr>
<td>05:50</td>
<td>05:58</td>
<td>06:04</td>
<td>06:08</td>
</tr>
<tr>
<td>06:12</td>
<td>06:15</td>
<td>06:17</td>
<td>06:20</td>
</tr>
<tr>
<td>06:22</td>
<td>06:25</td>
<td>06:27</td>
<td>06:30</td>
</tr>
<tr>
<td>06:32</td>
<td>06:35</td>
<td>06:37</td>
<td>06:39</td>
</tr>
<tr>
<td>06:42</td>
<td>06:44</td>
<td>06:47</td>
<td>06:49</td>
</tr>
<tr>
<td>06:52</td>
<td>06:54</td>
<td>06:57</td>
<td>06:59</td>
</tr>
<tr>
<td>07:01</td>
<td>07:04</td>
<td>07:06</td>
<td>07:09</td>
</tr>
<tr>
<td>07:11</td>
<td>07:14</td>
<td>07:16</td>
<td>07:18</td>
</tr>
<tr>
<td>07:21</td>
<td>07:23</td>
<td>07:26</td>
<td>07:28</td>
</tr>
<tr>
<td>07:31</td>
<td>07:33</td>
<td>07:36</td>
<td>07:38</td>
</tr>
</table>
</div>
You can use an editor like Notepad++ to batch-replace across many files at once (the syntax will be a little different).
sed '\!<td>!,\!</table!{N;N;N;i\
</tr>\
<tr>
}' input_file
Perl solution, still using regular expressions instead of parsing HTML:
perl -pe '
undef $inside if m{</tr>};
if ($inside and ($. % 4) == $tr_line) {
print "</tr>\n<tr>\n";
}
$inside = 1 if defined $tr_line;
$tr_line = ($. + 1) % 4 if /<tr>/;
' file
Using xsh:
open :F html file ; # Open as html.
while //table/tr[count(td)>4] wrap :U position()=8 tr //table/tr/td ; # Wrap four td's into a tr.
xmove :r //table/tr/tr before .. ; # Unwrap the extra tr.
remove //table/tr[last()] ; # Remove the extra tr.