selecting cases based upon first few characters in spss? - select

i want to select cases with particular first 3 characters.
for example cases with first 3 characters containing "I22".
the length of whole value can vary. e,g "I228" or "I2279" but they have common first three characters "I22"
i usually use compute variable_name= "I228".
but this is tedious as i have to enter all variation of "I22" e.g "I228", "I229" and so on..
it would be much easier if i can just select cases based upon same first 3 characters

you can use the char.cubstr function to find out what the first three characters are in your string variable. For example:
if char.substr(variable_name,1,3)="I22" keep_this=1.
or:
select cases if char.substr(variable_name,1,3)="I22".

Related

Extracting Portions of String

I have a field with the following types of string
X000233756_9981900025_201901_EUR_/
I firstly need to take take the characters to the left of the first _
Secondly I need to take the characters between the first and 2nd _
First _ is CHARINDEX('_',[Line_Item_Text],1) AS Position_1
Second _ is CHARINDEX('_',[Line_Item_Text],CHARINDEX('_',[Line_Item_Text],1)+1) AS Position_2
I was then expecting to be able to do
left([Line_Item_Text],CHARINDEX('_',[Line_Item_Text],1)-1) AS Data_1
Substring([Line_Item_Text],CHARINDEX('_',[Line_Item_Text],1)+1),CHARINDEX('_',[Line_Item_Text],CHARINDEX('_',[Line_Item_Text],1)+1) - CHARINDEX('_',[Line_Item_Text],1)+1)) AS Data_2"
Which should give me
X000233756
9981900025
But getting errors with incorrect number of functions when I start adding and subtracting from CHARINDEX Function.
Any ideas where I am going wrong?
TIA
Geoff
Actually, using the base string functions here is going to be an ugly nightmare. You might find that STRING_SPLIT along with some clever logic might be easier:
SELECT value
FROM STRING_SPLIT('X000233756_9981900025_201901_EUR_', '_')
WHERE LEN(value) > 6 AND NOT value LIKE '[A-Z]%';
This answer assumes that the third and fourth components would always be a 6 digit date and 3 letter currency code, and that the first (but not second) component would always start with some letter.
Demo

Complex Regex for password field

Im having a difficulty to assemble as complex as this password validation rules:
Shall be at least 8-20 characters:
Shall have At least one number (0-9)
Shall contain At least one special characters (!,#,#,$,%...etc)
No repetition of single character more than 3 times
No repetition of sequence of characters/numbers more than 2 times.
No sequence of 4 or more characters (e.g. abcd)
No sequence of 4 or more numbers (e.g. 1234)
No sequence of 4 or more keyboard characters (e.g. qwer)
May not have any of the above spelled backwards.
That maximum i could achieve is the below, its not working as expected, could someone advice:
^[A-Za-z0-9\\S+](?=.*[$#$#!%*?&]){8,20}

officejs : Search Word document using regular expression

I want to search strings like "number 1" or "number 152" or "number 36985".
In all above strings "number " will be constant but digits will change and can have any length.
I tried Search option using wildcard but it doesn't seem to work.
basic regEx operators like + seem to not work.
I tried 'number*[1-9]*' and 'number*[1-9]+' but no luck.
This regular expression only selects upto one digit. e.g. If the string is 'number 12345' it only matches number 12345 (the part which is in bold).
Does anyone know how to do this?
Word doesn't use regular expressions in its search (Find) functionality. It has its own set of wildcard rules. These are very similar to RegEx, but not identical and not as powerful.
Using Word's wildcards, the search text below locates the examples given in the question. (Note that the semicolon separator in 1;100 may be soemthing else, depending on the list separator set in Windows (or on the Mac). My European locale uses a semicolon; the United States would use a comma, for example.
"number [0-9]{1;100}"
The 100 is an arbitrary number I chose for the maximum number of repeats of the search term just before it. Depending on how long you expect a number to be, this can be much smaller...
The logic of the search text is: number is a literal; the valid range of characters following the literal are 0 through 9; there may be one to one hundred of these characters - anything in that range is a match.
The only way RegEx can be used in Word is to extract a string and run the search on the string. But this dissociates the string from the document, meaning Word-specific content (formatting, fields, etc.) will be lost.
Try putting < and > on the ends of your search string to indicate the beginning and ending of the desired strings. This works for me: '<number [1-9]*>'. So does '<number [1-9]#>' which is probably what you want. Note that in Word wildcards the # is used where + is used in other RegEx systems.

select part of a file name with regex

i have a file u01/appl/wandl/import/AT/file.csv. i need to get the AT part only and store it in a variable using regex.it is always a 2 digit code but it could be BE UK etc etc.
\/([A-Z]{2})\/
That regex find exactly two big letters with / / around so it finds things like /AT/ or /UK/ or /BE/
As you can see parentheses are around 2 digit code so only code will be captures by grouping

How to fill a field with spaces until a length in Notepad++

I've prepared a macro in Notepad++ to transform a ldif file in a csv file with a few fields. Everything is OK but I have a final problem: I have to have 2 fields with a specific length and in this moment I cannot ensure that length because in the source file they are not coming so
For instance, I generate this line:
12345,namenamename,123456
And I have to ensure that the 2nd and 3rd fields have 30 (filling with spaces at right side) and 9 (filling with zeros at left) characters, so in this case I should generate:
12345,namenamename ,000123456
I haven't found how Notepad++ could match a pattern in order to add spaces/zeros, so I have though in to add 1 space/zero to the proper field and repeat this step so many times as needed to ensure the lengths (this is, 29 and 8, because they cannot come empty) and search with the length in the regex (for instance: \d{1,8} for the third field)
My question is: can I repeat only one step of the macro several times (and the rest of the macro only 1 repetition)?
I've read the wiki related to this point (http://sourceforge.net/apps/mediawiki/notepad-plus/index.php?title=Editing_Configuration_Files#.3CMacros.3E) and I don't found anything neither
If not possible, how could be a good solution? Create another 2 different macros and after execute the main one, execute this new 2 macros several times?
Thanks in advance!
A two pass solution with Notepad++ is possible. Find a pair of characters or two short sequence of characters that never occurs in your data file. I will use =#<= and =>#= here.
First pass, generate or convert the input text into the form 12345,=#<=namenamename______________________________,000000000123456=>#=. Ie add 30 spaces after the name and nine zeroes before the number (underscores used here just to make things clearer).
Second pass, do a regular expression search for =#<=(.{30})_*,0*(\d{9})=>#= and replace with \1,\2.
I have just suggested a similar solution in special timestamp format of csv