How to scrape a form out of a website using BeautifulSoup? - forms

So far we have written this code to scrape the website http://www.theft-alerts.com.
The website contains a form, and within that form is frmSFair. We need all of the stolen-artwork information. Can someone help?
If we scrape the form with:
import urllib2
from bs4 import BeautifulSoup
connection = urllib2.urlopen('http://www.theft-alerts.com')
soup = BeautifulSoup(connection, "html.parser")
form = soup.find_all(span="table")
for form in soup.form.stripped_strings:
    print(str(form.encode('utf-7')))
Output:
Sign up for our newsletter
Add email address below
See a sample eSalvo
The code picks up the newsletter table on the right side of the website, but we need the table in the middle. This is the information we need:
STOLEN : CHERUB IN MARBLE, PART OF A FOUNTAIN
Stolen from Canterbury, Kent, UK on 8 February 2016
Item : A copy of Verrocchio's cupid - winged cherub standing on one leg holding a dolphin - in white marble which formed the top part of a fountain. approximately 3 foot high. Item has discoloured due to weathering with some lichen growth.
Any info to : PC 12994 Canterbury. Tel 01622 690690
Messages : Send a message
Crime Ref : ZY - 4370 - 16
No of items stolen : 1
images:
Location : UK > Kent
Category : STATUARY
ID : 93578
User : 53329 ; Diyer/Homeowner/Private ; (Registered SalvoWEB user for 1 month)
Date Created : 10 Feb 2016 14:36:23
Date Modified : 11 Feb 2016 16:40:06;

To get the text from each:
for sp in soup.select("table div.itemspacingmodified"):
    for wd in sp.select("div.itemindentmodified"):
        text = wd.text
        if not text.startswith("Images :"):
            print(text)
Which gives you:
STOLEN : CHERUB IN MARBLE, PART OF A FOUNTAIN
Stolen from Canterbury, Kent, UK on 8 February 2016
Item : A copy of Verrocchio's cupid - winged cherub standing on one leg holding a dolphin - in white marble which formed the top part of a fountain. approximately 3 foot high. Item has discoloured due to weathering with some lichen growth.
Any info to : PC 12994 Canterbury. Tel 01622 690690
Messages : Send a message
Crime Ref : ZY - 4370 - 16
No of items stolen : 1
Location : UK > KentCategory : STATUARYID : 93578User : 53329 ; Diyer/Homeowner/Private ; (Registered SalvoWEB user for 1 month)Date Created : 10 Feb 2016 14:36:23Date Modified : 11 Feb 2016 16:40:06;
STOLEN : OVER 70 ANTIQUE YORK STONE PAVING SLABS
Stolen from Steyning on 30th October 2015
Item : Antique York Stone paving slabs stolen from historic landscaped garden overnight. Truck driven through electric gates to gain access.
Any info to : PCSO Stewart Metcalfe. Sussex Police mob. 07912 894151
Messages : Send a message
Web URL : https://stmarysbramber.co.uk
Crime Ref : 47150140173
No of items stolen : 70
Recovered Details : None
Location : UK > West SussexCategory : FLAGSTONES & FLOOR TILESID : 92311User : 52866 ; Diyer/Homeowner/Private ; (Registered SalvoWEB user for 3 months)Date Created : 05 Nov 2015 12:04:50Date Modified : 05 Nov 2015 12:15:10;
STOLEN : GARDEN STATUE OF BOY
Stolen from Bridgnorth, Shropshire on 20th / 21st Aug 2015
Item : Small lead,(I think ! ),statue of a boy standing on a stone plinth
Any info to : West Mercia Police - Crime No 22FJ59981W15
Messages : Send a message
Crime Ref : 22FJ59981W15
No of items stolen : 2
Recovered Details : NA
Location : UK > ShropshireCategory : STATUARYID : 91278User : 52457 ; Diyer/Homeowner/Private ; (Registered SalvoWEB user for 6 months)Date Reset : 10 Sep 2015 00:30:01Date Modified : 26 Aug 2015 15:07:31; EL from 26 Aug 2015 to 09 Sep 2015 26Aug15;EL History : 26 Aug 2015 10:52:49;
STOLEN : YORKSTONE FLAGSTONES
Stolen from Nr Sevenoaks, Kent, UK on 26 Aug 2015
Item : Flagstones from St Mary's Church, Sundridge, Sevenoaks, Kent TN14 6EA. 70 in total. They are old Yorkstone flagstones approx. 2" thick, the sizes are as follows:
24 x 3'x2'
14 x 2'x1.5'
10 x 2'x2'
10 x 2'x1'
12 x 1'x1'
Any info to : Maidstone, Kent Police station Tel: 101
Messages : Send a message
Crime Ref : YY/17519/15
No of items stolen : 70
Location : UK > KentCategory : FLAGSTONES & FLOOR TILESID : 91428User : 52513 ; Churches and Memorial Custodians ; (Registered SalvoWEB user for 6 months)Date Created : 03 Sep 2015 14:48:45Date Modified : 04 Sep 2015 09:34:13;
STOLEN : RED STANDSTONE BIRDBATH
Stolen from Watlington, Oxfordshire UK on 16 July 2015
Item : A red sandstone bird bath with applied bronze decoration and a central bronze figure of a young girl on a dolphin by Richard Garbe
Overall height 4'3" (129.5 cm), Figure height 1' 3" (38 cm)
Square at bowl 1'3" (38 cm) Square at base 1'4" (40.4 cm)
Any info to : R McIntyre, PC 0200, Wallingford
Messages : Send a message
Crime Ref : 4315016713
No of items stolen : 1
Location : UK > OxfordshireCategory : STATUARYID : 89824User : 52054 ; Diyer/Homeowner/Private ; (Registered SalvoWEB user for 6 months)Date Reset : 13 Jul 2015 00:30:02Date Modified : 21 Jun 2015 20:58:33; EL from 21 Jun 2015 to 12 Jul 2015 21Jun15;EL History : 21 Jun 2015 20:50:51;
STOLEN : STONE SCULPTURE - STATUE IS OF A PROPHET
Stolen from Belton, Loughbrough on 05/05/2015
Item : Statue is of a Prophet
5ft tall, this has been sympathetically restored by stonemason using a local stone which is very similar to the original stone.
The statue is 63" tall and weighs around ½ ton
Any info to : 3046515
Messages : Send a message
Crime Ref : 3046515
No of items stolen : 1
Location : UK > LeicestershireCategory : GARDENID : 89030User : 51750 ; Professional/Architect/Designer/Media/Film/TV ; (Registered SalvoWEB user for 6 months)Date Created : 06 May 2015 15:38:24Date Modified : 06 May 2015 16:18:28;
STOLEN : WOOL CARPET/RUG IN VARIOUS COLOURS GEOMETRIC DESIGN
Stolen from Kensal Green on 21 April 2015
Item : A knotted wool rug or carpet, approx 118 by 76 inches, with a geometric design in light and dark brown, cream, pink and black.
Any info to : Metropolitan Police. Tel: 101 and quote crime ref: 6518176/15
Messages : Send a message
Crime Ref : 6518176/15
No of items stolen : 1
Location : UK > London North WestCategory : FURNITURE & MIRRORSID : 88924User : 34 ; Antique/Reclamation/Salvage Trade ; (Salvo Code Dealer)Date Created : 29 Apr 2015 21:46:11Date Modified : 29 Apr 2015 22:24:45;
STOLEN : ANOTHER HISTORIC MILESTONE STOLEN AT REDBOURN, HERTS.
Stolen from Redbourn, Hertfordshire on 15th March 2015
Item : Very distinctive square section small Milepost, well-known to local residents and others, and featured in local publications. It was on the A5183 at St.Albans Road at Redbourn, opposite the Chequers Public House. It would originally have been installed by The Dunstable-St. Albans-London Turnpike Trust, established by Act of Parliament in 1722; after the abolition of this and other similar trusts responsibility would have fallen to the Parish Council, later passing to the County Council in the 20th Century.
This milestone has the Milestone Society Identity ref. HE_LH24. Its neighbour HE_LH23 was stolen two years ago.
Any info to : Hertfordshire Police 01707 354000
Messages : Send a message
Crime Ref : WCR/41/20187/15
No of items stolen : 1
Location : UK > WarwickshireCategory : Architectural STONE & TERRACOTTAID : 88606User : 46089 ; Charity/Government/Institution/Plc ; (Registered SalvoWEB user for 2 years or more)Date Created : 10 Apr 2015 14:15:35Date Modified : 28 Apr 2015 16:52:03;
STOLEN : COALBROOKDALE GARDEN BENCH
Stolen from near Lichfield, Staffordshire UK on 25 February 2015
Item : A beautiful (most likely Coalbrookdale) cast iron 'Oak and Ivy' pattern bench. White painted with wooden slatted seat - but could now be a different colour. Approximate size 155 cm wide. Taken from a garden near Lichfield but possibly now in the Kent or South East region or anywhere in the country. Unusual and striking pattern. A £100 reward for first information which leads to recovery.
Any info to : PC 3864 Lichfield Tel 0300 123 4455
Messages : Send a message
Crime Ref : 27th February 2015 No 644
No of items stolen : 1
Location : UK > North YorkshireCategory : GARDENID : 87846User : 51382 ; Diyer/Homeowner/Private ; (Registered SalvoWEB user for 1 year)Date Reset : 19 Mar 2015 00:30:01Date Modified : 11 Mar 2015 18:00:05; EL from 03 Mar 2015 to 10 Mar 2015 03Mar15; EL from 11 Mar 2015 to 18 Mar 2015 11Mar15;EL History : 02 Mar 2015 10:07:07;11 Mar 2015 00:30:01;
STOLEN : MILESTONE TAKEN FROM THE SIDE OF OLD LONDON ROAD, MALDON, ESSEX
Stolen from Maldon, Essex, UK on 31/1/15 - 11/3/15
Item : This is a milestone approximately 30 cms square but only around 90 total height including below surface. It was set into a concrete socket and would probably have needed lifting equipment to extract. It had received damage to one corner and face about six years ago, probably by carelessly operated grass-cutting machinery. It was situated opposite the cemetery in Old London Road (Grid Ref TL83740712) and was registered bythe Milestone Society with the ID ref EX_MGMN37
Any info to : PS 214 Maldon District Neighbourhood Policing Sergeant Direct Dial: 101 Ext 412104
Messages : Send a message
Crime Ref : CF0205920315
No of items stolen : 1
Location : UK > EssexCategory : Architectural STONE & TERRACOTTAID : 88150User : 46089 ; Charity/Government/Institution/Plc ; (Registered SalvoWEB user for 2 years or more)Date Created : 16 Mar 2015 11:31:40Date Modified : 16 Mar 2015 11:45:20;
STOLEN : MILESTONE PLATE STOLEN. A420 JUST WEST OF CHIPPENHAM, WILTSHIRE
Stolen from Chippenham, Wiltshire on Before Christmas 2014
Item : This stone was involved in a major traffic accident sometime before Christmas 2014 and is now not only leaning over at a worse angle than ever but is in three or more pieces. The Cast iron plate on its front has disappeared, presumed stolen.
Any info to : Crime reference number 5410002796 reported by Wiltshire Council. Tel: 101 to report any news, quoting the ref. number.
Messages : Send a message
Crime Ref : 5410002796 Wiltshire
No of items stolen : 1
Location : UK > WiltshireCategory : Architectural METALWORKID : 86859User : 46089 ; Charity/Government/Institution/Plc ; (Registered SalvoWEB user for 2 years or more)Date Created : 10 Jan 2015 17:26:33Date Modified : 28 Apr 2015 16:54:46;
STOLEN : A LARGE TAYLORS OF LOUGHBOROUGH BELL
Stolen from Bromyard on 7 August 2014
Item : The bell has a diameter of 37 1/2" is approx 3' tall weighs just shy of half a ton and was made by Taylor's of Loughborough in 1902. It is stamped with the numbers 232 and 11.
The bell had come from Co-operative Wholesale Society's Crumpsall Biscuit Works in Manchester.
Any info to : PC 2361. Tel 0300 333 3000
Messages : Send a message
Crime Ref : 22EJ / 50213D-14
No of items stolen : 1
Location : UK > Hereford & WorcsCategory : Shop, Pub, Church, Telephone Boxes & BygonesID : 84377User : 1 ; Antique/Reclamation/Salvage Trade ; (Administrator)Date Created : 11 Aug 2014 15:27:57Date Modified : 11 Aug 2014 15:37:21;
Each section is contained in a div with the class itemspacingmodified, and all the info is inside the divs with the class itemindentmodified, so you just need to pull the text from each.
The only problem is the line breaks: you can see that 91278User is stuck together. We can replace the <br> tags with newlines:
connection = urllib2.urlopen('http://www.theft-alerts.com')
soup = BeautifulSoup(connection.read().replace("<br>","\n"), "html.parser")
for sp in soup.select("table div.itemspacingmodified"):
    for wd in sp.select("div.itemindentmodified"):
        text = wd.text
        if not text.startswith("Images :"):
            print(text)
So now we get:
STOLEN : CHERUB IN MARBLE, PART OF A FOUNTAIN
Stolen from Canterbury, Kent, UK on 8 February 2016
Item : A copy of Verrocchio's cupid - winged cherub standing on one leg holding a dolphin - in white marble which formed the top part of a fountain. approximately 3 foot high. Item has discoloured due to weathering with some lichen growth.
Any info to : PC 12994 Canterbury. Tel 01622 690690
Messages : Send a message
Crime Ref : ZY - 4370 - 16
No of items stolen : 1
Location : UK > Kent
Category : STATUARY
ID : 93578
User : 53329 ; Diyer/Homeowner/Private ; (Registered SalvoWEB user for 1 month)
Date Created : 10 Feb 2016 14:36:23
Date Modified : 11 Feb 2016 16:40:06;
STOLEN : OVER 70 ANTIQUE YORK STONE PAVING SLABS
Stolen from Steyning on 30th October 2015
Item : Antique York Stone paving slabs stolen from historic landscaped garden overnight. Truck driven through electric gates to gain access.
Any info to : PCSO Stewart Metcalfe. Sussex Police mob. 07912 894151
Messages : Send a message
Web URL : https://stmarysbramber.co.uk
Crime Ref : 47150140173
No of items stolen : 70
Recovered Details : None
Location : UK > West Sussex
Category : FLAGSTONES & FLOOR TILES
ID : 92311
User : 52866 ; Diyer/Homeowner/Private ; (Registered SalvoWEB user for 3 months)
Date Created : 05 Nov 2015 12:04:50
Date Modified : 05 Nov 2015 12:15:10;
STOLEN : GARDEN STATUE OF BOY
Stolen from Bridgnorth, Shropshire on 20th / 21st Aug 2015
Item : Small lead,(I think ! ),statue of a boy standing on a stone plinth
Any info to : West Mercia Police - Crime No 22FJ59981W15
Messages : Send a message
Crime Ref : 22FJ59981W15
No of items stolen : 2
Recovered Details : NA
Location : UK > Shropshire
Category : STATUARY
ID : 91278
User : 52457 ; Diyer/Homeowner/Private ; (Registered SalvoWEB user for 6 months)
Date Reset : 10 Sep 2015 00:30:01
Date Modified : 26 Aug 2015 15:07:31; EL from 26 Aug 2015 to 09 Sep 2015 26Aug15;
EL History : 26 Aug 2015 10:52:49;
STOLEN : YORKSTONE FLAGSTONES
Stolen from Nr Sevenoaks, Kent, UK on 26 Aug 2015
Item : Flagstones from St Mary's Church, Sundridge, Sevenoaks, Kent TN14 6EA. 70 in total. They are old Yorkstone flagstones approx. 2" thick, the sizes are as follows:
24 x 3'x2'
14 x 2'x1.5'
10 x 2'x2'
10 x 2'x1'
12 x 1'x1'
Any info to : Maidstone, Kent Police station Tel: 101
Messages : Send a message
Crime Ref : YY/17519/15
No of items stolen : 70
Location : UK > Kent
Category : FLAGSTONES & FLOOR TILES
ID : 91428
User : 52513 ; Churches and Memorial Custodians ; (Registered SalvoWEB user for 6 months)
Date Created : 03 Sep 2015 14:48:45
Date Modified : 04 Sep 2015 09:34:13;
STOLEN : RED STANDSTONE BIRDBATH
Stolen from Watlington, Oxfordshire UK on 16 July 2015
Item : A red sandstone bird bath with applied bronze decoration and a central bronze figure of a young girl on a dolphin by Richard Garbe
Overall height 4'3" (129.5 cm), Figure height 1' 3" (38 cm)
Square at bowl 1'3" (38 cm) Square at base 1'4" (40.4 cm)
Any info to : R McIntyre, PC 0200, Wallingford
Messages : Send a message
Crime Ref : 4315016713
No of items stolen : 1
Location : UK > Oxfordshire
Category : STATUARY
ID : 89824
User : 52054 ; Diyer/Homeowner/Private ; (Registered SalvoWEB user for 6 months)
Date Reset : 13 Jul 2015 00:30:02
Date Modified : 21 Jun 2015 20:58:33; EL from 21 Jun 2015 to 12 Jul 2015 21Jun15;
EL History : 21 Jun 2015 20:50:51;
STOLEN : STONE SCULPTURE - STATUE IS OF A PROPHET
Stolen from Belton, Loughbrough on 05/05/2015
Item : Statue is of a Prophet
5ft tall, this has been sympathetically restored by stonemason using a local stone which is very similar to the original stone.
The statue is 63" tall and weighs around ½ ton
Any info to : 3046515
Messages : Send a message
Crime Ref : 3046515
No of items stolen : 1
Location : UK > Leicestershire
Category : GARDEN
ID : 89030
User : 51750 ; Professional/Architect/Designer/Media/Film/TV ; (Registered SalvoWEB user for 6 months)
Date Created : 06 May 2015 15:38:24
Date Modified : 06 May 2015 16:18:28;
STOLEN : WOOL CARPET/RUG IN VARIOUS COLOURS GEOMETRIC DESIGN
Stolen from Kensal Green on 21 April 2015
Item : A knotted wool rug or carpet, approx 118 by 76 inches, with a geometric design in light and dark brown, cream, pink and black.
Any info to : Metropolitan Police. Tel: 101 and quote crime ref: 6518176/15
Messages : Send a message
Crime Ref : 6518176/15
No of items stolen : 1
Location : UK > London North West
Category : FURNITURE & MIRRORS
ID : 88924
User : 34 ; Antique/Reclamation/Salvage Trade ; (Salvo Code Dealer)
Date Created : 29 Apr 2015 21:46:11
Date Modified : 29 Apr 2015 22:24:45;
STOLEN : ANOTHER HISTORIC MILESTONE STOLEN AT REDBOURN, HERTS.
Stolen from Redbourn, Hertfordshire on 15th March 2015
Item : Very distinctive square section small Milepost, well-known to local residents and others, and featured in local publications. It was on the A5183 at St.Albans Road at Redbourn, opposite the Chequers Public House. It would originally have been installed by The Dunstable-St. Albans-London Turnpike Trust, established by Act of Parliament in 1722; after the abolition of this and other similar trusts responsibility would have fallen to the Parish Council, later passing to the County Council in the 20th Century.
This milestone has the Milestone Society Identity ref. HE_LH24. Its neighbour HE_LH23 was stolen two years ago.
Any info to : Hertfordshire Police 01707 354000
Messages : Send a message
Crime Ref : WCR/41/20187/15
No of items stolen : 1
Location : UK > Warwickshire
Category : Architectural STONE & TERRACOTTA
ID : 88606
User : 46089 ; Charity/Government/Institution/Plc ; (Registered SalvoWEB user for 2 years or more)
Date Created : 10 Apr 2015 14:15:35
Date Modified : 28 Apr 2015 16:52:03;
STOLEN : COALBROOKDALE GARDEN BENCH
Stolen from near Lichfield, Staffordshire UK on 25 February 2015
Item : A beautiful (most likely Coalbrookdale) cast iron 'Oak and Ivy' pattern bench. White painted with wooden slatted seat - but could now be a different colour. Approximate size 155 cm wide. Taken from a garden near Lichfield but possibly now in the Kent or South East region or anywhere in the country. Unusual and striking pattern. A £100 reward for first information which leads to recovery.
Any info to : PC 3864 Lichfield Tel 0300 123 4455
Messages : Send a message
Crime Ref : 27th February 2015 No 644
No of items stolen : 1
Location : UK > North Yorkshire
Category : GARDEN
ID : 87846
User : 51382 ; Diyer/Homeowner/Private ; (Registered SalvoWEB user for 1 year)
Date Reset : 19 Mar 2015 00:30:01
Date Modified : 11 Mar 2015 18:00:05; EL from 03 Mar 2015 to 10 Mar 2015 03Mar15; EL from 11 Mar 2015 to 18 Mar 2015 11Mar15;
EL History : 02 Mar 2015 10:07:07;11 Mar 2015 00:30:01;
STOLEN : MILESTONE TAKEN FROM THE SIDE OF OLD LONDON ROAD, MALDON, ESSEX
Stolen from Maldon, Essex, UK on 31/1/15 - 11/3/15
Item : This is a milestone approximately 30 cms square but only around 90 total height including below surface. It was set into a concrete socket and would probably have needed lifting equipment to extract. It had received damage to one corner and face about six years ago, probably by carelessly operated grass-cutting machinery. It was situated opposite the cemetery in Old London Road (Grid Ref TL83740712) and was registered bythe Milestone Society with the ID ref EX_MGMN37
Any info to : PS 214 Maldon District Neighbourhood Policing Sergeant Direct Dial: 101 Ext 412104
Messages : Send a message
Crime Ref : CF0205920315
No of items stolen : 1
Location : UK > Essex
Category : Architectural STONE & TERRACOTTA
ID : 88150
User : 46089 ; Charity/Government/Institution/Plc ; (Registered SalvoWEB user for 2 years or more)
Date Created : 16 Mar 2015 11:31:40
Date Modified : 16 Mar 2015 11:45:20;
STOLEN : MILESTONE PLATE STOLEN. A420 JUST WEST OF CHIPPENHAM, WILTSHIRE
Stolen from Chippenham, Wiltshire on Before Christmas 2014
Item : This stone was involved in a major traffic accident sometime before Christmas 2014 and is now not only leaning over at a worse angle than ever but is in three or more pieces. The Cast iron plate on its front has disappeared, presumed stolen.
Any info to : Crime reference number 5410002796 reported by Wiltshire Council. Tel: 101 to report any news, quoting the ref. number.
Messages : Send a message
Crime Ref : 5410002796 Wiltshire
No of items stolen : 1
Location : UK > Wiltshire
Category : Architectural METALWORK
ID : 86859
User : 46089 ; Charity/Government/Institution/Plc ; (Registered SalvoWEB user for 2 years or more)
Date Created : 10 Jan 2015 17:26:33
Date Modified : 28 Apr 2015 16:54:46;
STOLEN : A LARGE TAYLORS OF LOUGHBOROUGH BELL
Stolen from Bromyard on 7 August 2014
Item : The bell has a diameter of 37 1/2" is approx 3' tall weighs just shy of half a ton and was made by Taylor's of Loughborough in 1902. It is stamped with the numbers 232 and 11.
The bell had come from Co-operative Wholesale Society's Crumpsall Biscuit Works in Manchester.
Any info to : PC 2361. Tel 0300 333 3000
Messages : Send a message
Crime Ref : 22EJ / 50213D-14
No of items stolen : 1
Location : UK > Hereford & Worcs
Category : Shop, Pub, Church, Telephone Boxes & Bygones
ID : 84377
User : 1 ; Antique/Reclamation/Salvage Trade ; (Administrator)
Date Created : 11 Aug 2014 15:27:57
Date Modified : 11 Aug 2014 15:37:21;
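To experiment with the same idea without fetching the live page, here is a minimal offline sketch. It is not the BeautifulSoup code above: it swaps in the standard-library html.parser module, and the HTML fragment and the ItemTextParser class are hypothetical stand-ins that only assume the div classes named in the answer. It collects the text of each itemindentmodified div, turns every <br> into a newline, and skips the "Images :" entry.

```python
from html.parser import HTMLParser

# Hypothetical fragment that mimics the structure described above.
SAMPLE_HTML = """
<table><tr><td>
<div class="itemspacingmodified">
  <div class="itemindentmodified">STOLEN : EXAMPLE ITEM</div>
  <div class="itemindentmodified">Images : </div>
  <div class="itemindentmodified">Location : UK &gt; Kent<br>Category : STATUARY<br>ID : 93578</div>
</div>
</td></tr></table>
"""


class ItemTextParser(HTMLParser):
    """Collect the text of every div whose class list contains itemindentmodified."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # div nesting depth inside a target div (0 means outside)
        self.parts = []   # text pieces of the div currently being read
        self.items = []   # finished text, one entry per target div

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.depth:
                self.depth += 1
            elif "itemindentmodified" in classes:
                self.depth = 1
                self.parts = []
        elif tag == "br" and self.depth:
            # Same effect as replacing "<br>" with "\n" before parsing.
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.items.append("".join(self.parts).strip())

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


parser = ItemTextParser()
parser.feed(SAMPLE_HTML)
items = [text for text in parser.items if not text.startswith("Images :")]
for text in items:
    print(text)
```

With the sample fragment, the Location, Category and ID fields come out on separate lines instead of being run together.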

Related

PowerShell script: ServiceNow

I have got the response from the ServiceNow API:
$geturl = $ParentURL + "/table/sc_req_item?sysparm_query=number="+$RITMNumber+"&sysparm_display_value=all&sysparm_fields=number,short_description,requested_for.first_name,requested_for.last_name,requested_for.user_name"
number : #{display_value=RITM2519394; value=RITMXXXX}
short_description : #{display_value=Login As ABC xyz on 19 Feb. Creating ticket for tracking purposes.; value=Login As ABC xyz on 19 Feb. Creating ticket for tracking purposes.}
requested_for.first_name : #{display_value=abc; value=abc}
requested_for.user_name : #{display_value=Exxxx; value=Exxxx}
requested_for.last_name : #{display_value=xyz; value=xyz}
I am able to get the short description using $ShortDescription = $($response.result.short_description.value)
Login As ABC xyz on 19 Feb. Creating ticket for tracking purposes.
But when I try to get the requested-for first name and last name, I get a blank value:
$RequestFor = $($response.result.requested_for.first_name)
Could you please help me sort this out?
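Judging only from the response printed above, one plausible cause is that requested_for.first_name is a single field whose name contains a dot, rather than a first_name field nested under requested_for. The Python sketch below illustrates that distinction with a hypothetical dictionary that mirrors the printed response; it is not ServiceNow or PowerShell code. In PowerShell the corresponding fix would presumably be to quote the property name, for example $response.result.'requested_for.first_name'.value, but that is an assumption to verify against the actual response object.

```python
# Hypothetical stand-in for the parsed result shown above.
result = {
    "number": {"display_value": "RITM2519394", "value": "RITMXXXX"},
    "requested_for.first_name": {"display_value": "abc", "value": "abc"},
    "requested_for.last_name": {"display_value": "xyz", "value": "xyz"},
}

# Treating the dot as nesting looks for a "requested_for" field, which does not exist.
nested = result.get("requested_for", {}).get("first_name")

# Using the whole dotted name as one key finds the values.
first_name = result["requested_for.first_name"]["value"]
last_name = result["requested_for.last_name"]["value"]

print(nested, first_name, last_name)
```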

Extract a date from a string that contains other numbers in R

I need to extract the date from this text:
Mellisoni 2014 Malbec (Columbia Valley (WA))
Okapi 2013 Estate Cabernet Sauvignon (Napa Valley)
Podere dal Nespoli 2015 Prugneto Sangiovese (Romagna)
Simonnet-Febvre 2015 Chablis
Lagler 2012 1000 Eimerberg Smaragd Neuburger (Wachau)
I use this code:
vino<-mutate(vino, year1=sub("^.*([0-9]{4}).*", "\\1", vino$title))
It works, but for the last value it extracts 1000 instead of 2012. How can I fix it when the string contains other numbers?
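The behaviour described comes from the greedy leading .*, which consumes as much of the string as it can and so leaves the capture group on the last four-digit run. The sketch below reproduces this in Python's re module (not R) with the titles from the question, and contrasts it with a lazy .*? that stops at the first four-digit run. It assumes the vintage is always the first four-digit number in the title. Whether R's sub accepts the same lazy quantifier with its default engine is an assumption to check.

```python
import re

titles = [
    "Mellisoni 2014 Malbec (Columbia Valley (WA))",
    "Okapi 2013 Estate Cabernet Sauvignon (Napa Valley)",
    "Podere dal Nespoli 2015 Prugneto Sangiovese (Romagna)",
    "Simonnet-Febvre 2015 Chablis",
    "Lagler 2012 1000 Eimerberg Smaragd Neuburger (Wachau)",
]

# Greedy ".*" takes as much as possible, so the group captures the LAST four-digit run.
greedy = [re.sub(r"^.*([0-9]{4}).*", r"\1", t) for t in titles]

# Lazy ".*?" takes as little as possible, so the group captures the FIRST four-digit run.
lazy = [re.sub(r"^.*?([0-9]{4}).*", r"\1", t) for t in titles]

print(greedy)  # last entry is '1000'
print(lazy)    # last entry is '2012'
```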

Save and use eigenvectors from PCA

I performed a Principal Component Analysis (PCA) in Stata.
My dataset includes eight financial indicators that vary across 9 countries.
For example:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str7 Country double(Investment Profit Income Tax Repayment Leverage Interest Liquidity) int Year
"France" -.1916055239385184 .046331346724579184 .16438012750896466 .073106839282063 30.373216652548326 4.116650784492168 3.222219873614461 .01453109309122077 2010
"UK" -.09287803170279468 .10772082765154019 .19475363707485557 .05803923583546618 31.746409646181174 9.669982727208433 1.2958094802269167 .014273374324088752 2010
"US" -.06262935107629553 .08674901201182428 .1241593221865416 .13387194413811226 25.336612638526013 11.14330064161111 1.954785887176916 .008355601163285917 2010
"Italy" -.038025847122363045 .1523162032749684 .23885658237030563 .2057478638900476 31.02007902336988 2.9660938817562292 6.12544787693943 .011694993164234125 2010
"Germany" -.05454795914578491 .06287079763890834 .09347194572148769 .08730237262847926 35.614342337621174 12.03770488195981 1.1958205191308358 .012467084153714813 2010
"Spain " -.09133982259799572 .1520056836126315 .20905656056324853 .21054797530580743 30.133833346916546 2.0623245902645073 5.122615899157435 .013545432336873187 2010
"Sweden" -.05403262462960799 .20463787181576967 .22924827352771968 .05655833155565016 20.30540887860061 10.392313613725324 .8634381995636089 .008030624504967313 2010
"Norway " -.07560184571862992 .08383822093909514 .15469418498932822 .06569716455818478 29.568228705840234 14.383460621594622 1.5561013535825234 .012843159364225464 2010
"Algeria" -.0494187835163535 .056252436429004446 .09174672864585759 .08143181185307143 34.74103858167055 15.045254276254616 1.2074942921860699 .011578038401820303 2010
"France" -.03831442432584342 .14722819896988698 .22035417794604084 .12183886462162773 28.44763045286005 12.727100288710087 1.405629911115614 .011186908059399987 2011
"UK" -.05002189329928202 .16833493262244398 .2288402623558823 .04977050186975224 27.640103129372747 11.17376089844228 1.1764542835994092 .008386726178729322 2011
"US" -.0871005985124144 .10270482619857023 .1523559355903486 .06775742210623094 26.840586700880362 10.783899184031576 1.454011947763254 .013501919089967212 2011
"Italy" -.1069324103590126 -.5877872620957578 -.47469302172710803 .2004436360021364 23.133243742952658 5.3936761686065875 4.532771849692548 .012586313916956204 2011
"Germany" -.05851794344524515 .09960345907923154 .136805115392161 .1373407846168154 32.6182637042919 14.109738344526052 1.5077699357228835 .013200993625042274 2011
"Spain " -.10650743527105216 -.015785638597076792 .1808727613216441 .05038848927405154 28.22206251292902 10.839614113486853 1.5021425852392374 .012076771099482617 2011
"Sweden" -.09678946710644694 .11801761803893955 .18569993056826523 .1481844716617448 27.439283362903794 5.771154420635893 5.493437819181101 .013820243145673811 2011
"Norway " -.04263379351591438 .09931719473864983 .14469611775596314 .0796835513869996 26.68561168581991 14.06385602832082 1.5200488174887825 .01029136242440406 2011
"Algeria" -.04871983526465598 .2139061303228528 .2728647845448156 .056537570099712456 22.50263575072073 16.919641035094685 .7539881754626142 .009734650338902404 2011
end
I called my first component "indebtedness" and my second one "profitability", after rotation.
I have the same data for 2011, 2012, 2013, 2014 and so on. I want to use the matrix of weights Stata computed for 2010 and apply it to 2011, 2012, 2013 separately. My goal is to compare the indebtedness and the profitability between countries over time.
To do this, I use the estimates save and estimates use commands (Chapter 20 of the Stata manual on estimates and the post-estimation PCA command help).
However, I can't understand what Stata is saving. Is it saving the scores computed for 2010 or the eigenvalues and eigenvectors?
This is the code I use:
tempfile pca
save `pca'
use `pca' if Year==2010
global xlist Investment Profit Income Tax Repayment Leverage Interest Liquidity
pca $xlist, components(2)
estimates save pcaest, replace
predict score
summarize score
use `pca' if Year==2011, clear
estimates use pcaest
predict score
summarize score
Does this method and code seem correct to you?
I'd also like to save the matrix of weights and create a new vector Z = b[1,1]*investment + ....
Using your toy example for year 2010:
clear
input str7 Country double(Investment Profit Income Tax Repayment Leverage Interest Liquidity) int Year
"France" -.1916055239385184 .046331346724579184 .16438012750896466 .073106839282063 30.373216652548326 4.116650784492168 3.222219873614461 .01453109309122077 2010
"UK" -.09287803170279468 .10772082765154019 .19475363707485557 .05803923583546618 31.746409646181174 9.669982727208433 1.2958094802269167 .014273374324088752 2010
"US" -.06262935107629553 .08674901201182428 .1241593221865416 .13387194413811226 25.336612638526013 11.14330064161111 1.954785887176916 .008355601163285917 2010
"Italy" -.038025847122363045 .1523162032749684 .23885658237030563 .2057478638900476 31.02007902336988 2.9660938817562292 6.12544787693943 .011694993164234125 2010
"Germany" -.05454795914578491 .06287079763890834 .09347194572148769 .08730237262847926 35.614342337621174 12.03770488195981 1.1958205191308358 .012467084153714813 2010
"Spain " -.09133982259799572 .1520056836126315 .20905656056324853 .21054797530580743 30.133833346916546 2.0623245902645073 5.122615899157435 .013545432336873187 2010
"Sweden" -.05403262462960799 .20463787181576967 .22924827352771968 .05655833155565016 20.30540887860061 10.392313613725324 .8634381995636089 .008030624504967313 2010
"Norway " -.07560184571862992 .08383822093909514 .15469418498932822 .06569716455818478 29.568228705840234 14.383460621594622 1.5561013535825234 .012843159364225464 2010
"Algeria" -.0494187835163535 .056252436429004446 .09174672864585759 .08143181185307143 34.74103858167055 15.045254276254616 1.2074942921860699 .011578038401820303 2010
end
I get the following results:
local xlist Investment Profit Income Tax Repayment Leverage Interest Liquidity
pca `xlist', components(2)
Principal components/correlation          Number of obs    =      9
                                          Number of comp.  =      2
                                          Trace            =      8
Rotation: (unrotated = principal)         Rho              = 0.7468
--------------------------------------------------------------------------
   Component |   Eigenvalue   Difference         Proportion   Cumulative
-------------+------------------------------------------------------------
       Comp1 |      3.43566      .896796             0.4295       0.4295
       Comp2 |      2.53887      1.23215             0.3174       0.7468
       Comp3 |      1.30672      .750756             0.1633       0.9102
       Comp4 |      .555959      .472866             0.0695       0.9797
       Comp5 |     .0830926     .0181769             0.0104       0.9900
       Comp6 |     .0649157     .0526462             0.0081       0.9982
       Comp7 |     .0122695    .00975098             0.0015       0.9997
       Comp8 |    .00251849            .             0.0003       1.0000
--------------------------------------------------------------------------
Principal components (eigenvectors)
------------------------------------------------
    Variable |    Comp1     Comp2 | Unexplained
-------------+--------------------+-------------
  Investment |   0.0004   -0.3837 |       .6262
      Profit |   0.3896   -0.3794 |       .1131
      Income |   0.4621   -0.1162 |        .232
         Tax |   0.4146    0.1236 |       .3706
   Repayment |  -0.1829    0.4747 |       .3131
    Leverage |  -0.4685   -0.2596 |      .07464
    Interest |   0.4580    0.2625 |       .1045
   Liquidity |  -0.0082    0.5643 |       .1913
------------------------------------------------
To see which items the pca command returns, type:
ereturn list
scalars:
e(N) = 9
e(f) = 2
e(rho) = .7468162625387222
e(trace) = 8
e(lndet) = -13.76082122673546
e(cond) = 36.93476257313668
macros:
e(cmdline) : "pca Investment Profit Income Tax Repayment Leverage Interest Liquidity, components(2)"
e(cmd) : "pca"
e(title) : "Principal components"
e(marginsnotok) : "_ALL"
e(estat_cmd) : "pca_estat"
e(rotate_cmd) : "pca_rotate"
e(predict) : "pca_p"
e(Ctype) : "correlation"
e(properties) : "nob noV eigen"
matrices:
e(sds) : 1 x 8
e(means) : 1 x 8
e(C) : 8 x 8
e(Psi) : 1 x 8
e(Ev) : 1 x 8
e(L) : 8 x 2
functions:
e(sample)
One way to save the returned matrix containing the eigenvectors as variables for the next year is to create a copy of the matrix and load the 2011 data:
matrix A = e(L)
clear
input str7 Country double(Investment Profit Income Tax Repayment Leverage Interest Liquidity) int Year
"France" -.03831442432584342 .14722819896988698 .22035417794604084 .12183886462162773 28.44763045286005 12.727100288710087 1.405629911115614 .011186908059399987 2011
"UK" -.05002189329928202 .16833493262244398 .2288402623558823 .04977050186975224 27.640103129372747 11.17376089844228 1.1764542835994092 .008386726178729322 2011
"US" -.0871005985124144 .10270482619857023 .1523559355903486 .06775742210623094 26.840586700880362 10.783899184031576 1.454011947763254 .013501919089967212 2011
"Italy" -.1069324103590126 -.5877872620957578 -.47469302172710803 .2004436360021364 23.133243742952658 5.3936761686065875 4.532771849692548 .012586313916956204 2011
"Germany" -.05851794344524515 .09960345907923154 .136805115392161 .1373407846168154 32.6182637042919 14.109738344526052 1.5077699357228835 .013200993625042274 2011
"Spain " -.10650743527105216 -.015785638597076792 .1808727613216441 .05038848927405154 28.22206251292902 10.839614113486853 1.5021425852392374 .012076771099482617 2011
"Sweden" -.09678946710644694 .11801761803893955 .18569993056826523 .1481844716617448 27.439283362903794 5.771154420635893 5.493437819181101 .013820243145673811 2011
"Norway " -.04263379351591438 .09931719473864983 .14469611775596314 .0796835513869996 26.68561168581991 14.06385602832082 1.5200488174887825 .01029136242440406 2011
"Algeria" -.04871983526465598 .2139061303228528 .2728647845448156 .056537570099712456 22.50263575072073 16.919641035094685 .7539881754626142 .009734650338902404 2011
end
Then you can simply use the svmat command:
svmat A
list A* if _n < 9
+-----------------------+
| A1 A2 |
|-----------------------|
1. | .0003921 -.383703 |
2. | .3895898 -.3793983 |
3. | .4621098 -.1162487 |
4. | .4146066 .1235683 |
5. | -.1828703 .4746658 |
|-----------------------|
6. | -.4685374 -.2596268 |
7. | .457974 .2624738 |
8. | -.0081538 .5643047 |
+-----------------------+
EDIT:
Revised according to comments:
use X1, clear
local xlist Investment Profit Income Tax Repayment Leverage Interest Liquidity
forvalues i = 1 / 5 {
pca `xlist' if year == 201`i', components(2)
matrix A201`i' = e(L)
svmat A201`i'
* weight each variable by its own loading: row j of e(L) belongs to the j-th variable in the varlist
generate B201`i'1 = (A201`i'[1,1] * Investment) + (A201`i'[2,1] * Profit) + ///
(A201`i'[3,1] * Income) + (A201`i'[4,1] * Tax) + ///
(A201`i'[5,1] * Repayment) + (A201`i'[6,1] * Leverage) + ///
(A201`i'[7,1] * Interest) + (A201`i'[8,1] * Liquidity)
generate B201`i'2 = (A201`i'[1,2] * Investment) + (A201`i'[2,2] * Profit) + ///
(A201`i'[3,2] * Income) + (A201`i'[4,2] * Tax) + ///
(A201`i'[5,2] * Repayment) + (A201`i'[6,2] * Leverage) + ///
(A201`i'[7,2] * Interest) + (A201`i'[8,2] * Liquidity)
}
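For each year, the loop builds one linear combination of the eight variables per retained component. Here is a minimal sketch of that arithmetic in Python, assuming each variable is weighted by the loading in its own row of e(L). The loadings are copied from the svmat listing above. The observation is a rounded version of the "France" 2011 row, and the function name component_scores is made up for illustration. Like the generate statements, it works on raw values; standardizing the variables first would be a separate choice.

```python
# Loadings for components 1 and 2, copied from the svmat listing above.
# Row j is assumed to belong to the j-th variable in the pca varlist.
LOADINGS = [
    (0.0003921, -0.383703),
    (0.3895898, -0.3793983),
    (0.4621098, -0.1162487),
    (0.4146066, 0.1235683),
    (-0.1828703, 0.4746658),
    (-0.4685374, -0.2596268),
    (0.457974, 0.2624738),
    (-0.0081538, 0.5643047),
]


def component_scores(row, loadings):
    """Return one score per component: sum over j of loadings[j][k] * row[j]."""
    if len(row) != len(loadings):
        raise ValueError("need exactly one loading row per variable")
    n_components = len(loadings[0])
    return [
        sum(load[k] * value for load, value in zip(loadings, row))
        for k in range(n_components)
    ]


# Illustrative observation: the "France" 2011 row, rounded to four decimals.
france_2011 = [-0.0383, 0.1472, 0.2204, 0.1218, 28.4476, 12.7271, 1.4056, 0.0112]
scores = component_scores(france_2011, LOADINGS)
print(scores)
```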

Mongodb find documents

I have a MongoDB instance which contains a translation of texts:
{
"_id" : ObjectId("57c68ba415f4d42b6ecd9ee7"),
"en" : "Adana (pronounced [aˈda.na]) is a major city in southern Turkey. The city is situated on the Seyhan river, 35 km (22 mi) inland from the Mediterranean Sea, in south-central Anatolia. It is the administrative seat of the Adana Province and has a population of 1.7 million,[1] making it the fifth most populous city in Turkey. Adana-Mersin polycentric metropolitan area, with a population of 3 million, stretches over 70 km (43 mi) east-west and 25 km (16 mi) north-south; encompassing the cities of Mersin, Tarsus and Adana.",
"sw" : "Adana (Kigiriki Άδανα) ni mji mkubwa katika nchi ya Uturuki. Kwa mujibu wa sensa iliyofanyika mwaka wa 2000, mji una wakazi wapatao 1,130,710 waishio huko,[2] na kuufanya kuwa mmoja kati ya miji mitano mikubwa ya Uturuku (baada ya Istanbul, Ankara, İzmir na Bursa). Mwaka wa 2006 mji wa Adana umekadiriwa kufikia iadadi ya wakazi wapatao 1,271,894. Huu ndiyo mji mkuu wa Mkoa wa Adana."
}
{
"_id" : ObjectId("57c68ba915f4d42b6ecd9eea"),
"en" : "Addis Ababa or Addis Abeba (the spelling used by the official Ethiopian Mapping Authority),(Amharic: አዲስ አበባ? Addis Abäba IPA: [adˈdis ˈabəba] ( listen), \"new flower\"; Oromo: Finfinne,[3][4] [fɪnˈfɪ́n.nɛ́] \"Natural Spring(s)\"), is the capital and largest city of Ethiopia. Finfinne is its Oromo name. It has a population of 3,384,569 according to the 2007 population census, with annual growth rate of 3.8%. This number has been increased from the originally published 2,738,248 figure and appears to be still largely underestimated.[2][5]",
"sw" : "Addis Ababa (pia Addis Abeba; kwa Kiamhara አዲስ አበባ, \"Ua Jipya\"; kwa Kioromo Finfinne) ni mji mkuu wa Ethiopia na wa Umoja wa Afrika."
}
{
"_id" : ObjectId("57c68bab15f4d42b6ecd9eec"),
"en" : "Adelaide of Italy (931 – 16 December 999), also called Adelaide of Burgundy, was the second wife of Holy Roman Emperor Otto the Great[2] and was crowned as the Holy Roman Empress with him by Pope John XII in Rome on February 2, 962. Empress Adelaide was perhaps the most prominent European woman of the 10th century; she was regent of the Holy Roman Empire as the guardian of her grandson in 991-995.[2]",
"sw" : "Adelaide wa Italia (takriban 931 – 16 Desemba, 999) alikuwa binti wa Rudolf II, mfalme wa Burgundia. Kwanza aliolewa na Lothar, mfalme wa Italia. Alipofariki Lothar, Adelaide aliolewa na Otto I, mfalme wa Ujerumani. Aliishi maisha matakatifu. Sikukuu yake ni 16 Desemba."
}
What I would like to do is to select one specific record. For example I expect to select the last record by doing this:
db.wiki.find({"sw": "Adelaide wa Italia"}).pretty();
But the mongo shell returns nothing.
Indeed, I know that I can create an index and do something like:
db.wiki.find({$text: {$search: "\"Adelaide wa Italia\""}}).pretty();
which indeed returns the record as expected.
What am I doing wrong in the search that does not use the index?
In this case you should search with a regular expression:
db.wiki.find({"sw": /Adelaide wa Italia/}).pretty();
With your original query:
db.wiki.find({"sw": "Adelaide wa Italia"}).pretty();
you are asking MongoDB for all documents whose sw field is exactly equal to "Adelaide wa Italia", whereas you want all documents whose sw field contains that phrase.
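The difference between the two queries can be pictured without a database. The sketch below is plain Python, not MongoDB or a driver API: the helper names find_equal and find_regex are hypothetical, and the documents are shortened stand-ins for the ones in the question.

```python
import re

# Shortened stand-ins for the documents in the question.
docs = [
    {"_id": 1, "sw": "Adana ni mji mkubwa katika nchi ya Uturuki."},
    {"_id": 2, "sw": "Addis Ababa ni mji mkuu wa Ethiopia na wa Umoja wa Afrika."},
    {"_id": 3, "sw": "Adelaide wa Italia alikuwa binti wa Rudolf II, mfalme wa Burgundia."},
]


def find_equal(collection, field, value):
    """Mimics {field: value}: the whole field must equal the value."""
    return [d for d in collection if d.get(field) == value]


def find_regex(collection, field, pattern):
    """Mimics {field: /pattern/}: the pattern may match anywhere in the field."""
    rx = re.compile(pattern)
    return [
        d for d in collection
        if isinstance(d.get(field), str) and rx.search(d[field])
    ]


phrase = "Adelaide wa Italia"
exact_hits = find_equal(docs, "sw", phrase)
regex_ids = [d["_id"] for d in find_regex(docs, "sw", re.escape(phrase))]
print(exact_hits)  # [] because no field is equal to the bare phrase
print(regex_ids)   # [3] because the third document contains the phrase
```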

how to change data such that graph is interrupted

I'd like to construct a graph like the following one, which was made in Excel:
I've entered my data into matlab using the following lines:
year = [1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010
]
fix_No = [-9.449167466 -11.19432509 -8.500517848 -5.644813211 -2.608063866 2.614370892 6.461752833 7.035549084 8.542521755 12.11070577 6.476900841 8.029225388 4.315820526 4.165349512 5.34593031 7.510812752 -2.629044124 -5.713139529 -8.626773532 -11.83226415 . -8.821345246 -6.396197293 -5.187823611 -1.79008821 3.34099288 5.545228048 7.013763711 6.580524638 4.256524275
]
fix_No_and_mean = [11.1610424 5.437315474 5.833032482 4.591658232 1.578021362 -1.572756298 -1.03351595 -2.250991302 -3.222969261 -5.734621837 . 12.96685642 10.95095066 10.2207684 5.654017602 1.753259697 -2.596143576 -7.155087995 -9.687001589 -8.700979283 -4.290434459 . 2.299711172 1.640802028 1.714407543 0.8360893 -0.425484303 -1.160053823 -0.858530711 0.123787867 0.782208621
]
The lines should be interrupted in 1985 and 1996, as in the Excel graph. How can I do this in MATLAB? I put a "." (period) where the interruptions should be, but MATLAB does not accept it.
The lines for the construction of the graph look like this:
plot(year,fix_No, 'color', 'k', 'LineWidth',2, 'LineSmoothing','on')
line(year,fix_No_and_mean, 'color', 'r', 'LineWidth',2, 'LineSmoothing','on')
xlabel('year')
legend('fixed number', 'fixed number')
Put NaN instead of "." at the positions in fix_No and fix_No_and_mean that correspond to 1985 and 1996.
plot does not draw NaN values, so each line is broken at those points, as in the figure you attached from Excel.
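One way to picture the effect of NaN is that the series gets split into separate runs, each drawn as its own line. The sketch below shows that idea in Python rather than MATLAB. The function name split_at_nan and the numbers are made up for illustration.

```python
import math


def split_at_nan(xs, ys):
    """Split (x, y) pairs into contiguous runs, dropping points where y is NaN."""
    segments, current = [], []
    for x, y in zip(xs, ys):
        if math.isnan(y):
            # A NaN closes the current run; the next valid point starts a new one.
            if current:
                segments.append(current)
            current = []
        else:
            current.append((x, y))
    if current:
        segments.append(current)
    return segments


nan = float("nan")
years = [1983, 1984, 1985, 1986, 1987]   # illustrative years
values = [1.0, 2.0, nan, 4.0, 3.0]       # made-up values with a gap in 1985
segments = split_at_nan(years, values)
print(segments)
# [[(1983, 1.0), (1984, 2.0)], [(1986, 4.0), (1987, 3.0)]]
```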