Sed wraps some lines, not all - sed

Still dealing with quirky files (see my previous post), I am using sed to clean up some that are laid out like so:
....Receiver ID = 028912781755
Serial Number = WD-WCAUH0546786
Current temp = 50C
PowerOnHours = 13166h
Receiver ID = 028920310381
Serial Number = WD-WCAUH0898333
Current temp = 51C
PowerOnHours = 9099h...
My boss wants files like this one to be tab ("\t") delimited, like so:
Receiver ID = 028912781755 Serial Number = WD-WCAUH0546786 Current temp = 50C PowerOnHours = 13166h
This is my sed code:
sed -e '/.$/N; s/.\n/\t/'
It works, but strangely not everywhere. This is the output I get:
Receiver ID = 028920310381 Serial Number = WD-WCAUH0898333
Current temp = 51 PowerOnHours = 9099h
======================================================================
I need to be more specific. All the suggestions I got produce the same result: they append everything onto one single line. That is not what I am looking for.
I am looking for:
Receiver ID = ...<tab>Serial Number = ...<tab>Current temp = ...<tab>PowerOnHours = ...<tab><carriage return>
Receiver ID = ...<tab>Serial Number = ...<tab>Current temp = ...<tab>PowerOnHours = ...<tab>

Number of fields varies, but each record ends in PowerOnHours
awk 'ORS=/PowerOnHours/?RS:"\t"' ./infile
Proof of Concept
$ awk 'ORS=/PowerOnHours/?RS:"\t"' receiverid
Receiver ID = 028912781755 Special Field = foo bar baz Serial Number = WD-WCAUH0546786 Current temp = 50C PowerOnHours = 13166h
Receiver ID = 028920310381 Serial Number = WD-WCAUH0898333 Current temp = 51C PowerOnHours = 9099h
Receiver ID = 028912781755 Serial Number = WD-WCAUH0546786 Current temp = 50C PowerOnHours = 13166h
Receiver ID = 028920310381 Serial Number = WD-WCAUH0898333 Current temp = 51C PowerOnHours = 9099h
*Note the Special Field on the first line
Number of fields is the same in every record
awk 'ORS=NR%4?"\t":RS' ./infile
Proof of Concept
$ awk 'ORS=NR%4?"\t":RS' ./infile
Receiver ID = 028912781755 Serial Number = WD-WCAUH0546786 Current temp = 50C PowerOnHours = 13166h
Receiver ID = 028920310381 Serial Number = WD-WCAUH0898333 Current temp = 51C PowerOnHours = 9099h
Receiver ID = 028912781755 Serial Number = WD-WCAUH0546786 Current temp = 50C PowerOnHours = 13166h
Receiver ID = 028920310381 Serial Number = WD-WCAUH0898333 Current temp = 51C PowerOnHours = 9099h
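The edited question asks for a tab after every field, including the last one before the line break, which the ORS one-liners above do not produce. A minimal sketch of one way to get that, assuming each record ends with a PowerOnHours line; the function name join_with_trailing_tab is made up for illustration:

```shell
# Print every input line followed by a tab; end the output line
# only after a line that contains "PowerOnHours".
join_with_trailing_tab() {
  awk '{ printf "%s\t", $0 } /PowerOnHours/ { print "" }' "$@"
}

# Example with the first record from the question:
printf '%s\n' \
  'Receiver ID = 028912781755' \
  'Serial Number = WD-WCAUH0546786' \
  'Current temp = 50C' \
  'PowerOnHours = 13166h' | join_with_trailing_tab
```

Because the line break is tied to PowerOnHours rather than to a line count, this should also cope with records that carry extra fields.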

Give this a try:
sed '/^Receiver/N;N;N;s/\n/\t/g' inputfile
Explanation:
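/^Receiver/N;N;N; - When a line that begins with "Receiver" is read, append the next line; the two N commands without an address then append two more. Those two run on every cycle, so this relies on each record being exactly four lines.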
s/\n/\t/g - Replace the embedded newlines with tabs
Sample output:
Receiver ID = 028912781755 Serial Number = WD-WCAUH0546786 Current temp = 50C PowerOnHours = 13166h
Receiver ID = 028920310381 Serial Number = WD-WCAUH0898333 Current temp = 51C PowerOnHours = 9099h
(I exaggerated the tabs for effect.)
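In the command above, the /^Receiver/ address guards only the first N. A sketch of a variant that groups all three N commands and the substitution under that address, so lines outside a record pass through unchanged. It assumes four-line records as in the question; the function name and the literal-tab variable are illustrative choices, used so the script does not rely on sed interpreting \t:

```shell
# Literal tab character, so the script does not depend on sed understanding "\t".
tab=$(printf '\t')

# Join each four-line record that starts with "Receiver" into one tab-separated line.
join_receiver_records() {
  sed "/^Receiver/{N;N;N;s/\n/${tab}/g;}" "$@"
}

# Example with the second record from the question:
printf '%s\n' \
  'Receiver ID = 028920310381' \
  'Serial Number = WD-WCAUH0898333' \
  'Current temp = 51C' \
  'PowerOnHours = 9099h' | join_receiver_records
```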

You can use awk:
$ cat file
Receiver ID = 028912781755
Serial Number = WD-WCAUH0546786
Current temp = 50C
PowerOnHours = 13166h
Receiver ID = 028920310381
Serial Number = WD-WCAUH0898333
Current temp = 51C
PowerOnHours = 9099h...
$ awk 'BEGIN{RS="Receiver";OFS="\t"}NF>1{$1=$1;print "Receiver\t"$0}' file
Receiver ID = 028912781755 Serial Number = WD-WCAUH0546786 Current temp = 50C PowerOnHours = 13166h
Receiver ID = 028920310381 Serial Number = WD-WCAUH0898333 Current temp = 51C PowerOnHours = 9099h...

sed ':a
N;/\nReceiver/{
P;D
}
s/\n/X/;ta'

As written, that will join the second line to the first, then go to the third line and join the fourth to it, etc.
sed ':b; /^$/n; N; s/.\n\(.\)/\t\1/; tb'
should loop appending non-empty lines. (Corrected to actually catch blank lines in runs.)

cat file | tr '\n' '\t'
will work, too

Query with defined value after variable ${}

I have a script that retrieves data stored in a text file, then uses ${} interpolation to put the data into a query.
Example:
The data kept in the text file is abc.
The statement below will execute a query with productId = 'abc'.
Now I want to append a fixed value after abc, to make the query look like this:
productId = 'abc/NDC-1111'
What is the exact syntax I need to use?
//Read productId
def productId = new File(RunConfiguration.getProjectDir() + "/Data Files/productId.txt")
//SQL statement
dbQuery2 = /SELECT * FROM db.t1 where productId = '${productId.text}'/
You can put the suffix directly after the interpolation, inside the single quotes:
dbQuery2 = "SELECT * FROM db.t1 where productId = '${productId.text}/NDC-1111'"

skipping non-plain index rt (sphinx 2.1.6)

Here is my question. I am on Sphinx 2.1.6 and use an RT (real-time) index, but when I run the indexer, this message is displayed in the console:
using config file 'sphinx.conf'...
skipping non-plain index 'rt'...
But when I connect to Sphinx and run the query mysql> desc rt, it displays:
+------------+--------+
| Field      | Type   |
+------------+--------+
| id         | bigint |
| id         | field  |
| first_name | field  |
| last_name  | field  |
+------------+--------+
Is this default data? It does not match what I configured. How do I work with the rt index?
sphinx.conf:
source database
{
type = mysql
sql_host = 127.0.0.1
sql_user = test
sql_pass = test
sql_db = community
sql_port = 3306
mysql_connect_flags = 32 # enable compression
sql_query_pre = SET NAMES utf8
sql_query_pre = SET SESSION query_cache_type=OFF
}
source rt : database
{
sql_query_range = SELECT MIN(id),MAX(id) FROM mbt_accounts
sql_query = SELECT id AS 'accountId', first_name AS 'fname', last_name AS 'lname' FROM mbt_accounts WHERE id >= 0 AND id<= 1000
sql_range_step = 1000
sql_ranged_throttle = 1000 # milliseconds
}
index rt
{
source = rt
type = rt
path = /etc/sphinxsearch/rtindex
rt_mem_limit = 700M
rt_field = accountId
rt_field = fname
rt_field = lname
rt_attr_string = fname
rt_attr_string = lname
charset_type = utf-8
charset_table = 0..9, A..Z->a..z, _, -, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+401->U+451, U+451
}
searchd
{
listen = localhost:9312 # port for API
listen = localhost:9306:mysql41 #port for a SphinxQL
log = /var/log/sphinxsearch/searchd.log
binlog_path = /var/log/sphinxsearch/
query_log = /var/log/sphinxsearch/query.log
query_log_format = sphinxql
pid_file = /var/run/sphinxsearch/searchd.pid
workers = threads
max_matches = 1000
read_timeout = 5
client_timeout = 300
max_children = 30
max_packet_size = 8M
binlog_flush = 2
binlog_max_log_size = 90M
thread_stack = 8M
expansion_limit = 500
rt_flush_period = 1800
collation_server = utf8_general_ci
compat_sphinxql_magics = 0
prefork_rotation_throttle = 100
}
Thanks.
indexer only works with indexes that have a 'source', i.e. plain disk indexes: indexer runs the queries in the source to fetch the data and build the index.
RT (Real Time) indexes work very differently. indexer is not involved with RT indexes at all. They are handled totally by searchd.
To add data to an RT index, you need to run a bunch of SphinxQL commands (INSERT, UPDATE etc.) that actually add the data to the index.
(DESCRIBE works because searchd knows the 'structure' of the index (you told it via rt_field etc.), even if you never inserted any data.)
Ah, I think you are asking why the structure is different. That's probably because the index was created before you modified sphinx.conf. If you change the definition of an RT index, you need to 'destroy' the index to allow it to be recreated.
The simplest way is to shut down searchd, delete the index files, delete the binlog (it is no longer relevant) and then restart searchd.
searchd --stopwait
rm /etc/sphinxsearch/rtindex*
rm /path/to/binlog* #(you dont define a path, so it must be the default, which varies)
searchd #(starts searchd again)
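After the restart the RT index is empty, and rows are added through SphinxQL as described above. A rough sketch of what one INSERT could look like, assuming the index is named rt with the full-text fields accountId, fname and lname from the question's config, and that searchd listens for SphinxQL on port 9306 as configured there. The helper name rt_insert_sql is hypothetical, and the values are not escaped, so this is for illustration only:

```shell
# Build one SphinxQL INSERT statement for the RT index named "rt".
# Column names follow the rt_field lines in the question's config (an assumption);
# the document id is reused as the accountId text purely for illustration.
rt_insert_sql() {
  printf "INSERT INTO rt (id, accountId, fname, lname) VALUES (%d, '%s', '%s', '%s');\n" \
    "$1" "$1" "$2" "$3"
}

# Print the statement for one made-up row.
rt_insert_sql 1 Ada Lovelace

# To send it to searchd, pipe it into the mysql client on the SphinxQL port:
#   rt_insert_sql 1 Ada Lovelace | mysql -h 127.0.0.1 -P 9306
```

Running desc rt first shows which columns the running index actually has, which is worth checking before relying on the column list above.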

Sphinx stopped indexing

After rotating and regenerating the index, Sphinx doesn't include new records from the database. It doesn't give any error, and it still includes the old index data.
I removed the data files and tried again, but the result was the same.
I also applied a range query, but the result was the same.
So I cannot update my search index now :(
Here is my configuration. Thank you for your tips :)
source search_song
{
type = mysql
sql_host = localhost
sql_user = myusername
sql_pass = mypass
sql_db = mydb
sql_port = 3306 # optional, default is 3306
sql_query_pre = SET NAMES utf8
sql_query_pre = SET NAMES utf8 COLLATE utf8_turkish_ci
sql_query_pre = SET CHARACTER SET utf8
sql_query_pre = SET COLLATION_CONNECTION = utf8_turkish_ci
sql_query_range = SELECT MIN(song_ID), MAX(song_ID) FROM song
sql_range_step = 20000
sql_query = SELECT song.song_ID, artist.artist_ID, song.title, song_stats.total_read, IF(artist.flag_The = 1, CONCAT("The ", artist.name), artist.name) AS fullname \
FROM song \
INNER JOIN artist ON artist.artist_ID = song.artist_ID \
LEFT JOIN song_stats ON song_stats.song_ID = song.song_ID \
WHERE song.song_ID >= $start AND song.song_ID <= $end;
sql_attr_uint = total_read
}
index search_song
{
source = search_song
path = /var/lib/sphinxsearch/data/search_song
morphology = metaphone
min_word_len = 1
min_prefix_len = 2
enable_star = 1
charset_type = utf-8
# exceptions = /var/lib/sphinxsearch/exceptions.txt
charset_table = A->a, B->b, C->c, U+C7->c, U+E7->c, D..G->d..g, U+11E->g, U+11F->g, H->h, I->i, U+131->i, U+130->i, J..O->j..o, U+D6->o, U+F6->o, P..S->p..s, U+15E->s, U+15F->s, T..U->t..u, U+DC->u, U+FC->u, V..Z->v..z, _, a..z,[,],0..9
}
Does it work without the range? Just the sql_query, no range and step?
I suspect the SQL query might be limiting results.

Error after converting Sphinx original indexes to real-time indexes

I used this tutorial to convert my original sphinx indexes to real-time indexes: http://www.ivinco.com/blog/converting-sphinx-original-indexes-to-real-time-indexes/
I changed my sphinx.conf:
source movies_dev
{
type = mysql
sql_host = localhost
sql_user = ********
sql_pass = ********
sql_db = ********
sql_sock = /var/run/mysqld/mysqld.sock
sql_port = 3306
sql_query = \
SELECT \
CRC32(movie_id) AS id, movie_id, format_id, active, year, title \
FROM \
movie;
sql_attr_uint = format_id
sql_attr_uint = active
sql_attr_uint = year
sql_field_string = movie_id
sql_field_string = title
sql_query_info = SELECT * FROM movie WHERE CRC32(movie_id)=$id
sql_query_pre = SET NAMES utf8
}
index movies_dev
{
source = movies_dev
path = /var/data/sphinx/movies_dev
morphology = stem_en
enable_star = 1
min_word_len = 3
min_prefix_len = 0
min_infix_len = 3
charset_type = utf-8
charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+DC->U+FC, U+C4->U+E4, U+D6->U+F6, U+DF, U+E4, U+F6, U+FC
}
index rt_movies_dev
{
type = rt
rt_mem_limit = 32M
path = /var/data/sphinx/rt_movies_dev
charset_type = utf-8
rt_field = movie_id
rt_field = title
rt_attr_uint = format_id
rt_attr_uint = year
rt_attr_uint = active
}
source attach_movies_dev
{
type = mysql
sql_host = localhost
sql_user = ********
sql_pass = ********
sql_db = ********
sql_query = SELECT 1 FROM rt_movies_dev
sql_query_post = ATTACH INDEX movies_dev TO RTINDEX rt_movies_dev
}
index attach_movies_dev
{
source = attach_movies_dev
path = /var/data/sphinx/attach_movies_dev
docinfo = extern
charset_type = utf-8
}
I created the "rt_movies_dev" table:
SET NAMES utf8;
SET foreign_key_checks = 0;
SET time_zone = '+01:00';
SET sql_mode = 'NO_AUTO_VALUE_ON_ZERO';
DROP TABLE IF EXISTS `rt_movies_dev`;
CREATE TABLE `rt_movies_dev` (
`movie_id` varchar(20) NOT NULL,
`format_id` int(10) NOT NULL,
`title` varchar(255) NOT NULL,
`year` int(20) DEFAULT NULL,
`active` tinyint(1) NOT NULL,
PRIMARY KEY (`movie_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
After that, I run these three commands:
root#server:~# /usr/local/sphinx/bin/searchd --config /usr/local/sphinx/etc/sphinx.conf;
root#server:~# /usr/local/sphinx/bin/indexer --config /usr/local/sphinx/etc/sphinx.conf movies_dev --rotate;
root#server:~# /usr/local/sphinx/bin/indexer --config /usr/local/sphinx/etc/sphinx.conf attach_movies_dev;
No errors after the first two commands (except the warnings like in the tutorial).
But the last command throws this:
ERROR: index 'attach_movies_dev': No fields in schema - will not index.
I do not know exactly what the error says and I could find nothing useful. Can you say what's wrong? I'm stuck here.
Firstly, the attach_movies_dev source should connect to Sphinx (searchd), not to MySQL, so no MySQL table is required.
You are just using indexer to invoke SphinxQL commands.
But from what I can see, trying to index the attach index will always result in an error, because the RT index itself must be empty (so that a disk index can be attached to it).
So change your attach source to connect to searchd instead, and it should work better. An empty RT index is probably OK: indexer will just create an empty index, but importantly it will still run the sql_query_post command, which is the whole reason the index exists.
Also beware that your disk index and RT index have different fields. In your disk index you have two sql_field_string entries, which create both attributes AND fields, so your RT index should contain two string attributes to match (rather than just fields).

Powershell - how to obtain previous lines from text file?

I have a text file containing hundreds of lines of text containing database info.
I'm trying to extract the DatabaseIds for any database which is 35GB.
I would like to interrogate the file using Powershell and produce a text file containing all the matching databases.
So in essence, I would like to scan through the file, find a DatabaseSize which is 35 and then extract the corresponding DatabaseId from 3 lines previous.
I've looked all over the net but can't seem to find anything which can do this.
Example extract of text file:
ServerId = VMlinux-b71655e1
DatabaseId = db-ecb2e784
Status = completed
LocationId = 344067960796
DatabaseSize = 30
ServerId = VMlinux-0db8d45b
DatabaseId = db-cea2f7a6
Status = completed
LocationId = 344067960796
DatabaseSize = 35
ServerId = VMlinux-c5388693
DatabaseId = db-9a421bf2
Status = completed
LocationId = 344067960796
DatabaseSize = 8
etc
Try with something like this:
(( GC myfile.txt |
Select-String 'DatabaseSize = 35' -Context 3 ).context.precontext)[0]
In case of multiple matches:
(GC myfile.txt |
Select-String 'DatabaseSize = 35' -Context 3 ) | % { ($_.context.precontext)[0] }
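For comparison, the same lookup can be done without relying on a fixed line offset, by remembering the most recent DatabaseId and printing it when the size matches. A sketch in awk, assuming the "Key = value" layout shown in the question; the function name ids_with_size is made up for illustration:

```shell
# Print the DatabaseId of every record whose DatabaseSize equals the given number.
ids_with_size() {
  size=$1
  shift
  awk -F' = ' -v want="$size" '
    # Remember the most recent DatabaseId seen so far.
    $1 == "DatabaseId" { id = $2 }
    # Compare sizes numerically, so 35 does not also match 350.
    $1 == "DatabaseSize" && $2 + 0 == want + 0 { print id }
  ' "$@"
}

# Example with the middle record from the question:
printf '%s\n' \
  'ServerId = VMlinux-0db8d45b' \
  'DatabaseId = db-cea2f7a6' \
  'Status = completed' \
  'LocationId = 344067960796' \
  'DatabaseSize = 35' | ids_with_size 35
```

To collect the results in a text file, the function can be pointed at the input file and redirected, for example ids_with_size 35 myfile.txt > matches.txt.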