Scala split on \n - scala

I have a text file with the following contents:
"\n\n\n\n\n\n\n\n\t\n\t\t\t\n\t\t\t\t\t\n\t\t\t\t\n\t\t\n\n\n\t\n\t\t\n\t\t\t\t
Hotline: +49 40-300 51 701\n\t\n\t\n\t
Languages\n\t\n\t\n\t\t\n\t\t\n\t\t
Travel plan \n\t\n\t\n\n\n\n\t\t\n\n\t\t\n\t\t\t\n\n\n\n\n\n\n\n\n\n\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\n\n\t\t\n\n\t\t\n\t\t\t\t
Book\t
Packages from € 59\n
\tAccommodation and arrival\n
\tMusical packages\n
\tMaritime packages\n\t
Hamburg for Families\n\t
Experience Hamburg & Culture\n\n\n\n\n\t
Hotels from € 24\n\t
Book online now!\n\t
Theme hotels\n\t
Hotels by location\n\t
Special Offers\n\t
Hotels from A-Z\n\t
Other accommodation\n\n\n\n\n\t
Tickets from € 8\n\tBook online now!\n\t
Musicals Hamburg\n\tHamburg maritime\n\t
Sightseeing tours & city walks\n\tMuseums & Exhibitions\n\tHamburg for Families\n\n\n\n\n\t
Hamburg CARD\n\tBook online now!\n\tAll benefits at a glance\n\tFrequently asked questions\n\n\n\n\n\t
Group trips\n\tBooking request\n\tHamburg Guides and theme walks\n\n\n\n\n\n\n\t\n\t\tOffer\n\n\t\t\n\n\t\t\n\n\t\t
Hamburg CARD\n\t\tFree travel by bus, rail and ferry (HVV) and up to 50% discount on more than 150 tourist...\n\n\t\n\t\n\t\t\n\t\t\t\n\t\t\t\t
from 10,50 EUR\n\t\t\t\n\t\t\n\n\t\n\n\n\n\n\n\n\tAttractions\tBest of Hamburg\n\t
Town Hall\n\tThe \"Michel\"\n\tSt. Pauli & Reeperbahn\n\t
Elbphilharmonie\n\tJungfernstieg\n\tMiniatur Wunderland\n\tTierpark Hagenbeck\n\t
All about the Alster\n\tBlankenese\n\n\n\n\n\tHamburg Maritime\n\t
Urbanshore Hamburg\n\tPort of Hamburg\n\tLandungsbrücken\n\tFish Market\n\tSpeicherstadt\n\tOn the Elbe\n\tHafenCity\n\tWillkomm-Höft\n\tÖvelgönne\n\n\n\n\n\tHistoric Hamburg\n\tThe Old Elbe Tunnel\n\t"
I want to split it on the \n. I tried:
string.split("\n")
string.split('\n')
string.split("""\n""")
string.split("\\n")
None of these seem to work. How do I do this in Scala?

Split by \n, then \t, flatten, then remove empty strings.
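import scala.io.Source
// getLines() drops the real line breaks, so only the literal \n and \t text remains in the joined string.
val lines = Source.fromFile("/Users/rasika/Documents/example.txt").getLines().mkString
val result = lines.split("\\\\n").flatMap(_.split("\\\\t")).filter(_.nonEmpty).toList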
Result
Hotline: +49 40-300 51 701
Languages
Travel plan
Book
Packages from € 59
Accommodation and arrival
Musical packages
Maritime packages
Hamburg for Families
Experience Hamburg & Culture
Hotels from € 24
Book online now!
Theme hotels
Hotels by location
Special Offers
Hotels from A-Z
Other accommodation
Tickets from € 8
Book online now!
Musicals Hamburg
Hamburg maritime
Sightseeing tours & city walks
Museums & Exhibitions
Hamburg for Families
Hamburg CARD
Book online now!
All benefits at a glance
Frequently asked questions
Group trips
Booking request
Hamburg Guides and theme walks
Offer
Hamburg CARD
Free travel by bus, rail and ferry (HVV) and up to 50% discount on more than 150 tourist...
from 10,50 EUR
Output exceeds cutoff limit.

If you want to split on literal \n in your text (i.e. literal text, and not just a newline), then try this:
string.split("\\\\n")
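In a Java/Scala string literal used as a regex, matching one literal backslash takes four backslashes: string escaping reduces them to two, and the regex engine reads those two as a single literal backslash.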
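As a minimal sketch of the difference, assuming the text really contains the two characters \ and n rather than line breaks: the object name, the sample string, and the val names below are made up for illustration and are not from the question.

```scala
import java.util.regex.Pattern

object LiteralSplitSketch {
  // Triple-quoted string: the text holds the two characters '\' and 'n', not a line break.
  val sample: String = """Hotline\n\tLanguages\n\n\tBook"""

  // A real newline never occurs in the sample, so nothing is split.
  val unsplit: Array[String] = sample.split("\n")

  // Three equivalent ways to split on the literal two-character sequence \n.
  val fourBackslashes: Array[String] = sample.split("\\\\n")
  val tripleQuoted: Array[String] = sample.split("""\\n""")
  val quoted: Array[String] = sample.split(Pattern.quote("""\n"""))

  // Split on literal \n or \t in one pass and drop the empty pieces.
  val cleaned: List[String] = sample.split("""\\[nt]""").filter(_.nonEmpty).toList
}
```

Here LiteralSplitSketch.unsplit has a single element (the whole string), while LiteralSplitSketch.cleaned is List("Hotline", "Languages", "Book"). Pattern.quote is handy when you would rather not count backslashes at all.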

Since you're splitting on newlines, and io.Source.fromFile.getLines separates on newlines, you'll need to read the whole file in one go instead, with
val string = io.Source.fromFile(filepath).mkString
as per this answer. Then your attempts should work e.g.
string.split('\n')
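A rough sketch of that approach, assuming the file contains real line breaks: it writes a small temporary file so the example is self-contained. The object name, the helper names, and the sample content are hypothetical.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import scala.io.Source

object WholeFileSplitSketch {
  // Read the entire file into one string, keeping the line breaks, and close the source afterwards.
  def readAll(path: String): String = {
    val source = Source.fromFile(path, "UTF-8")
    try source.mkString finally source.close()
  }

  // Write a small sample file with real newlines and tabs, then split it on '\n'.
  def demo(): List[String] = {
    val tmp = Files.createTempFile("split-sketch", ".txt")
    try {
      Files.write(tmp, "Book\n\tPackages\n\tHotels".getBytes(StandardCharsets.UTF_8))
      readAll(tmp.toString).split('\n').toList
    } finally Files.delete(tmp)
  }
}
```

WholeFileSplitSketch.demo() gives three elements: "Book", then "Packages" and "Hotels" each still prefixed with a tab. With getLines.mkString the newlines would already be gone before split ever ran.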

Related

Google Maps 'Place Autocomplete' Not Finding Addresses with Apartment & Number Range

Google's 'Place Autocomplete' doesn't show any results for an address with an apartment number and street number range e.g. "2/61-69 Macquarie St, Sydney NSW 2000" (apartment 2, of the building located at 61-69 Macquarie Street).
However, it does show results for "61-69 Macquarie St, Sydney NSW 2000" and "2/69 Macquarie St, Sydney NSW 2000".
Is there a way to get it to find the full address? Or maybe some JavaScript could trim the "2/61-" off before Google searches for it, or Google could be made to ignore that part of the typed string when it looks up possible addresses and add it back to the matching results?

How to edit the email order template of OpenCart 1.5.6.1?

How can I edit the following order email template in OpenCart version 1.5.6.1?
Subject: Gifts - Order 804
You have received an order.
Order ID: 804
Date Added: 22/04/2014
Order Status: Pending
Products
1x Mother of Pearl Rosary (RO-21) $26.00
1x Holy Water & Oil From Holy Land (HWC-01) $15.00
1x Rosary with holy earth (RO-12) $15.00
1x Box with Jerusalem Cross (BO-08) $14.00
Order Totals
Sub-Total: $70.00
UPS Ground: $12.67
Total: $82.67
The comments for your order are:
sacramental gifts for first communion
Now I want to add the name of the recipient at the top of the message, above the sentence "You have received an order". What can I do? What code should I add, and in which files?
Please help me, I searched all over Google and didn't find a solution!
/catalog/model/checkout/order.php
but all the texts are translated, so you need to change them in this file:
/catalog/language/english/mail/order.php

Extract a person's full name from a block of text in Perl?

I need to extract names (including uncommon names) from blocks of text using Perl. I've looked into this module for extracting names, but it only has the top 1000 popular names and surnames in the US dating back to 1990; I need something a bit more comprehensive.
I've considered using the Social Security Index to make a database for comparison, but this seems very tedious and processing intensive. Is there a way to pull names from text in Perl using another method?
Example of text to parse:
LADNIER Louis Anthony Ladnier, [Louie] age 48, of Mobile, Alabama died at home Friday, November 16, 2012. Louie was born January 9, 1964 in Mobile, Alabama. He was the son of John E. Ladnier, Sr. and Gloria Bosarge Ladnier. He was a graduate of McGill-Toolen High School and attended University of South Alabama. He was employed up until his medical retirement as Communi-cations Supervisor with the Bayou La Batre Police Department. He is preceded in death by his father, John. Survived by his mother, Gloria, nephews, Dominic Ladnier and Christian Rubio, whom he loved and help raise as his own sons, sisters, Marj Ladnier and Morgan Gordy [Julian], and brother Eddie Ladnier [Cindy], and nephews, Jamie, Joey, Eddie, Will, Ben and nieces, Anna and Elisabeth. Memorial service will be held at St. Dominic's Catholic Church in Mobile on Wednesday at 1pm. Serenity Funeral Home is in charge of arrangements. In lieu of flowers, memorials may be sent to St. Dominic School, 4160 Burma Road Mobile, AL 36693, education fund for Christian Rubio and McGill-Toolen High School, 1501 Old Shell Road Mobile, AL 36604, education Fund for Dominic Ladnier. The family is grateful for all the prayers and support during this time. Louie was a rock and a joy to us all.
Use Stanford's NER (GPL). Demo:
http://nlp.stanford.edu:8080/ner/process
There is no surefire way to do this due to the nature of the English language. You either need lists to (fuzzy-)compare with, or will have to settle for significant accuracy penalties.
The Apache Foundation has a few projects that cover the topic of entity extraction with specific pre-trained models for English names (nameFinder). I would recommend OpenNLP or Stanbol. In the meantime, if you have just a few queries, I have an NLP tool I've implemented in C# in my apps section at http://www.augmentedintel.com/apps/csharpnlp/extract-names-from-text.aspx.
Best,
Don
You're trying to implement named-entity recognition. The bad news is that it's really hard.
You could try Lingua::EN::NamedEntity, however:
$ perl -MLingua::EN::NamedEntity -nE 'say $_ for map { $_->{class} eq "person" ? $_->{entity} : () } extract_entities($_)' names.txt
Louie
Louis Anthony Ladnier
Louie
John E
Bayou La Batre Police Department
Gloria
Julian
Cindy
Eddie Ladnier
Eddie
John
Catholic Church
Christian Rubio
Dominic Ladnier
Burma Road Mobile
Louie
You can also use Calais, a Reuters web service for natural language processing, which offers much better results.
I think you want to Google something like:
perl part of speech tagging

stored long description in Sqlite database manager in iphone

I want to store a long description in an SQLite database on the iPhone, like this data:
"The Golden Temple: The Golden Temple, popular as Sri Harmandir Sahib or Sri Darbar Sahib, is the sacred seat of Sikhism. Bathed in a quintessential golden hue that dazzles in the serene waters of the Amrit Sarovar that lace around it, the swarn mandir (Golden temple) is one that internalizes in the mindscape of its visitors, no matter what religion or creed, as one of the most magnificent House of Worship. On a jewel-studded platform is the Adi Grantha or the sacred scripture of Sikhs wherein are enshrined holy inscriptions by the ten Sikh gurus and various Hindu and Moslem saints. While visiting the Golden Temple you need to cover your head. Street sellers sell bandanas outside the temple at cheap prices."
I declared the description column as VARCHAR(5000), but when I execute the query it shows only half of the text followed by dots (....), like this: http://i.stack.imgur.com/gyMqi.png
Thanks
The ... surely indicate that the full text is present in the database. It also indicates that "Sqlite database browser" truncates past a certain length:
m_textWidthMarkSize = s.value("prefs/sqleditor/textWidthMarkSpinBox", 60).toInt();
Is there a way to change the settings?
Edit
You can verify that the text is fully saved with the following query (replace theTable with the correct table name):
select length(description) from theTable;

Parsing an Address with T-SQL

I am trying to figure out how to parse an address using T-SQL and I suck at T-SQL. My challenge is this:
I have a table called Locations defined as follows:
- City [varchar(100)]
- State [char(2)]
- PostalCode [char(5)]
My UI has a text box in which a user can enter an address. This address could be in essentially any form (yuck, I know). Unfortunately, I cannot change this UI either. Anyway, the value of the text box is passed into the stored procedure that is responsible for parsing the address. I need to take what the person enters and get the PostalCode from the Locations table associated with their input. For the life of me, I cannot figure out how to do this. There are so many cases. For instance, the user could enter one of the following:
Chicago, IL
Chicago, IL 60601
Chicago, IL, 60601
Chicago, IL 60601 USA
Chicago, IL, 60601 USA
Chicago IL 60601 USA
New York NY 10001 USA
New York, NY 10001, USA
You get the idea. There are a lot of cases. I can't find any parsers online either. I must not be looking correctly. Can someone please point me to a parser online or explain how to do this? I'm willing to pay for a solution to this problem, but I can't find anything, I'm shocked.
Perhaps a CLR function might be a better choice than tsql. Check out http://msdn.microsoft.com/en-us/magazine/cc163473.aspx for an example of using regular expressions to parse some pretty complex string inputs into table value results. Now you get to be as creative as you please with your regex matching but the following regex should get you started:
(.*?)([A-Z]{2}),? (\d+)( USA)?$
If you're reluctant to use CLR functions, perhaps you have regex functionality in the calling system, like ASP.NET or PHP.
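To see what that regex does and does not cover, here is a small sketch that applies it in a calling layer. Scala is used purely for illustration (an assumption, the question does not say what the caller is written in), and the object and function names are hypothetical.

```scala
object AddressRegexSketch {
  // The regex from the answer above, unchanged.
  private val AddressPattern = """(.*?)([A-Z]{2}),? (\d+)( USA)?$""".r

  // Returns (city, state, postalCode) when the input matches, None otherwise.
  def parse(input: String): Option[(String, String, String)] =
    AddressPattern.findFirstMatchIn(input.trim).map { m =>
      // Group 1 still carries the separator before the state, e.g. "Chicago, ".
      val city = m.group(1).trim.stripSuffix(",")
      (city, m.group(2), m.group(3))
    }
}
```

With the sample inputs from the question, parse("Chicago, IL 60601 USA") and parse("New York NY 10001 USA") return the city, state, and postal code, but parse("Chicago, IL") (no postal code) and parse("New York, NY 10001, USA") (comma before USA) return None. So the pattern is only a starting point: inputs without a postal code would still need a lookup by City and State against the Locations table.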