Count the number of occurrences of each character, i.e., how many times each letter, number, and punctuation character is used - intersystems-cache

I am trying to write a routine that counts the characters in a global.
These are the globals I set and the characters I would like counted.
s ^XA(1)="SYLVESTER STALLONE, BRUCE WILLIS, AND ARNOLD SCHWARZENEGGER WERE DISCUSSING THEIR "
s ^XA(2)="NEXT PROJECT, A BUDDY FILM IN WHICH BAROQUE COMPOSERS TEAM UP TO BATTLE BOX-OFFICE IRRELEVANCE "
s ^XA(3)="EVERY HAD BEEN SETTLED EXCEPT THE CASTING. "
s ^XA(4)="""ARNOLD CAN BE PACHELBEL,"" STALLONE. ""AND I WANT TO PLAY MOZART. """
s ^XA(5)="""NO WAY!"" SAID WILLIS. ""YOU'RE NOT REMOTELY MOZARTISH. """
s ^XA(6)="""I'LL PLAY MOZART. YOU CAN BE HANDEL. """
s ^XA(7)="""YOU BE HANDEL!"" YELLED STALONE. ""I'M PLAYING MOZART! """
s ^XA(8)="FINALLY, ARNOLD SPOKE ""YOU WILL PLAY HANDEL,"" HE SAID TO WILLIS. "
s ^XA(9)="""AND YOU,"" HE SAID TO STALLONE, ""THEN WHO ARE YOU GONNA PLAY? """
s ^XA(10)="""OH YEAH?"" SAID STALLONE, ""THEN WHO ARE YOU GONNA PLAY? """
s ^XA(11)="ARNOLD ROSE FROM THE TABLE AND DONNED A PAIR OF SUNGLASSES. "
s ^XA(12)="I'LL BE MOZART."

If I understood your question correctly and you just need a count of each character in the global, here you go:
 set key = ""
 for {
     set key = $Order(^XA(key))
     quit:key=""
     for i=1:1:$Length(^XA(key)) {
         set char = $Extract(^XA(key), i)
         set count(char) = $get(count(char)) + 1
     }
 }
 zwrite count // or just return count
As for your example, this will produce the following output:
count(" ")=112
count("!")=3
count("""")=24
count("'")=4
count(",")=9
count("-")=1
count(".")=11
count("?")=3
count("A")=54
count("B")=12
count("C")=13
count("D")=23
count("E")=60
count("F")=6
count("G")=8
count("H")=20
count("I")=28
count("J")=1
count("K")=1
count("L")=48
count("M")=11
count("N")=39
count("O")=44
count("P")=13
count("Q")=1
count("R")=28
count("S")=29
count("T")=33
count("U")=13
count("V")=3
count("W")=11
count("X")=3
count("Y")=21
count("Z")=6
Hope this helps!
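For anyone who wants to cross-check the tallies outside of Caché, the same loop can be sketched in Python. This is only an illustration of the counting logic; count_characters is a made-up name, and the list of strings stands in for the ^XA nodes.

```python
from collections import Counter


def count_characters(lines):
    """Tally every character across all lines (hypothetical helper name)."""
    counts = Counter()
    for line in lines:
        # Counter.update with a string adds 1 for each character in it
        counts.update(line)
    return counts


if __name__ == "__main__":
    sample = ["I'LL BE MOZART.", "NO WAY!"]
    # Print in sorted key order, similar to how zwrite lists subscripts
    for char, total in sorted(count_characters(sample).items()):
        print(repr(char), total)
```

Like the ObjectScript version, this treats spaces and punctuation as ordinary characters.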

Finding the three longest substrings in a string using SPARQL on the Wikidata Query Service, and ranking them across strings

I'm trying to identify the longest three substrings from a string using SPARQL and the Wikidata Query Service and then rank:
the substrings within a string by length, and
the strings by the lengths of any of those longest substrings.
I managed to identify the first and second substring from a string and could of course just create similar additional lines to tackle the problem, but this seems ugly and inefficient, so I am wondering if anyone here knows of a better way to get there.
This is a simplified version of the code, though I have left some auxiliary variables in that I am using for tracking progress on the way. You can try it here.
Clarification in response to this comment: if it is necessary to treat this query as a subquery and to feed it with results from another subquery, that's fine with me. To get an idea of the kinds of use I have in mind, see this demo.
SELECT * WHERE {
  {
    VALUES (?title) {
      ("What are the longest three words in this string?")
      ("A really complicated title")
      ("OneWordTitleInCamelCase")
      ("Thanks for your help!")
    }
  }
  BIND(STRLEN(REPLACE(?title, " ", "")) AS ?titlelength)
  BIND(STRBEFORE(?title, " ") AS ?substring1)
  BIND(STRLEN(REPLACE(?substring1, " ", "")) AS ?substring1length)
  BIND(STRAFTER(?title, " ") AS ?postfix)
  BIND(STRLEN(REPLACE(?postfix, " ", "")) AS ?postfixlength)
  BIND(STRBEFORE(?postfix, " ") AS ?substring2)
  BIND(STRLEN(REPLACE(?substring2, " ", "")) AS ?substring2length)
}
ORDER BY DESC(?substring1length)
Expected results:
longsubstring | substringlength
OneWordTitleInCamelCase | 23
complicated | 11
longest | 7
really | 6
string | 6
Thanks | 6
title | 5
three | 5
your | 4
help | 4
Actual results:
title | titlelength | substring1 | substring1length | postfix | postfixlength | substring2 | substring2length
Thanks for your help! | 18 | Thanks | 6 | for your help! | 12 | for | 3
What are the longest three words in this string? | 40 | What | 4 | are the longest three words in this string? | 36 | are | 3
A really complicated title | 23 | A | 1 | really complicated title | 22 | really | 6
OneWordTitleInCamelCase | 23 | (empty) | 0 | (empty) | 0 | (empty) | 0
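To pin down what the expected results above are asking for, here is a rough Python sketch of the target logic. It is not a SPARQL solution. It assumes that "substring" means a run of letters or digits (so trailing punctuation such as "?" is dropped, which matches "string 6" and "help 4" in the expected table) and that ties keep their original order; longest_words and rank_substrings are made-up names.

```python
import re


def longest_words(title, limit=3):
    """Return up to `limit` longest alphanumeric runs in the title."""
    words = re.findall(r"[A-Za-z0-9]+", title)
    # sorted() is stable, so equally long words keep their original order
    return sorted(words, key=len, reverse=True)[:limit]


def rank_substrings(titles, limit=3):
    """Collect each title's longest words and rank them all by length."""
    candidates = [
        (word, len(word))
        for title in titles
        for word in longest_words(title, limit)
    ]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    titles = [
        "What are the longest three words in this string?",
        "A really complicated title",
        "OneWordTitleInCamelCase",
        "Thanks for your help!",
    ]
    for word, length in rank_substrings(titles):
        print(word, length)
```

The order among substrings of equal length is arbitrary in the expected table, so the sketch makes no promise beyond descending length.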

Index of word in string 'covering' certain position

Not sure if this is the right place to ask but I couldn't find any related or similar questions.
Anyway: imagine you have a certain string like
val exampleString = "Hello StackOverflow this is my question, cool right?"
If given a position in this string, for example 23, return the word that 'occupies' this position in the string. If we look at the example string, we can see that the 23rd character is the letter 's' (the last character of 'this'), so we should return index = 5 (because 'this' is the 5th word). In my question spaces are counted as words. If, for example, we were given position 5, we land on the first space and thus we should return index = 1.
I'm implementing this in Scala (but this should be quite language-agnostic and I would love to see implementations in other languages).
Currently I have the following approach (assume exampleString is the given string and charPosition the given position):
exampleString.split("((?<= )|(?= ))").scanLeft(0)((a, b) => a + b.length()).drop(1).zipWithIndex.takeWhile(_._1 <= charPosition).last._2 + 1
This works, but to be honest it is far too complex. Is there a better (more efficient?) way to achieve this? I'm fairly new to functions like fold, scan, map, filter ... but I would love to learn more.
Thanks in advance.
def wordIndex(exampleString: String, index: Int): Int = {
  exampleString.take(index + 1).foldLeft((0, exampleString.head.isWhitespace)) {
    case ((n, isWhitespace), c) =>
      if (isWhitespace == c.isWhitespace) (n, isWhitespace)
      else (n + 1, !isWhitespace)
  }._1
}
This will fold over the string, keeping track of whether the previous character was a whitespace or not, and if it detects a change, it will flip the boolean and add 1 to the count (n).
This will be able to handle groups of spaces (e.g. in hello world, world would be at position 2), and also spaces at the start of the string would count as index 0 and the first word would be index 1.
Note that this can't handle an empty input string (exampleString.head throws in that case); I'll let you decide what you want to do there.
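Since the question also asked for other languages, the same idea can be sketched in Python: walk the characters up to the given position and bump a counter every time the whitespace/non-whitespace state flips. The name word_index and the choice to raise on empty input are my own assumptions, not part of the Scala answer.

```python
def word_index(text: str, position: int) -> int:
    """Index of the word (or run of spaces) covering `position`, 0-based."""
    if not text:
        # Assumption: reject empty input rather than guess a result
        raise ValueError("text must not be empty")

    index = 0
    in_whitespace = text[0].isspace()
    for char in text[: position + 1]:
        if char.isspace() != in_whitespace:
            # Crossed a boundary between a word and a run of spaces
            index += 1
            in_whitespace = not in_whitespace
    return index


if __name__ == "__main__":
    example = "Hello StackOverflow this is my question, cool right?"
    print(word_index(example, 5))
    print(word_index(example, 23))
```

As with the fold, a run of several spaces counts as a single token.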

How can I take a user input that may contain spaces and convert the spaces to a hyphen in Swift? [duplicate]

I'm trying to create a simple iOS app that takes user input (a city), searches a website for that city, and then displays the forecasts for that city.
What I'm currently stuck on, and can't find much documentation for that isn't overwhelming, is how to be sure that the user input will translate well to a URL if there is more than one word in the name of the city.
For example, if a user inputs Salt Lake City into my text field, how can I write an if/else statement that determines the number of spaces and, if that number is greater than 0, converts those spaces to "-"?
So far I've tried creating an array out of the string, but still can't figure out how I can append a - to each element in the array. I don't think it's possible.
Does anyone know how I can do what I'm trying to do? Or am I approaching it the incorrect way?
Here's a poor first attempt. I know this doesn't work, but hopefully it explains what I'm trying to accomplish a bit better than my text above.
var cityText = "Salt Lake City"
let cityArray = cityText.componentsSeparatedByString(" ")
let combineDashUrl = cityArray[0] + "-" + cityArray[1] + "-" + cityArray[2]
print(combineDashUrl)
Assuming there are never multiple spaces in a row, you should be able to use stringByReplacingOccurrencesOfString (spelled replacingOccurrences(of:with:) in Swift 3 and later).
let cityText = "Salt Lake City"
let newCityText = cityText.stringByReplacingOccurrencesOfString(
    " ",
    withString: "-")
Replacing variable numbers of spaces with a dash would be more complicated. I'd probably use regular expressions for that.
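To illustrate the regular-expression idea, here is a small sketch in Python rather than Swift. The pattern matches one or more whitespace characters and replaces each run with a single dash; hyphenate_city is a made-up name, and trimming leading and trailing spaces first is my own assumption about what a URL slug should look like.

```python
import re


def hyphenate_city(city: str) -> str:
    """Collapse each run of whitespace into a single hyphen."""
    # strip() first so leading/trailing spaces don't become stray hyphens
    return re.sub(r"\s+", "-", city.strip())


if __name__ == "__main__":
    print(hyphenate_city("Salt Lake City"))
    print(hyphenate_city("  Salt   Lake City "))
```

The same pattern should carry over to whichever regex API you use in Swift.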
You can use map over the array of characters to transform spaces into hyphens.
let city = "Salt Lake City"
let hyphenatedCity = String(city.characters.map{$0 == " " ? "-" : $0})

Power Query - remove characters from number values

I have a table field where the data contains our memberID numbers followed by character or character + number strings
For example:
My Data
1234567Z1
2345T10
222222T10Z1
111
111A
Should Become
1234567
2345
222222
111
111
I want to get just the member number (as shown in Should Become above), i.e. all the digits to the LEFT of the first letter.
As the length of the member number can be different for each person (the first 1 to 7 digits) and the letters used can be different (a to z, 0 to 8 characters long), I don't think I can SPLIT the field.
Right now, in Power Query, I do 27 search and replace commands to clean this data (e.g. find T10 replace with nothing, find T20 replace with nothing, etc)
Can anyone suggest a better way to achieve this?
I did successfully create a formula for this in Excel...but I am now trying to do this in Power Query and I don't know how to convert the formula - nor am I sure this is the most efficient solution.
=iferror(value(left([MEMBERID],7)),
  iferror(value(left([MEMBERID],6)),
    iferror(value(left([MEMBERID],5)),
      iferror(value(left([MEMBERID],4)),
        iferror(value(left([MEMBERID],3)),0)
      )
    )
  )
)
Thanks
There are likely several ways to do this. Here's one way:
Create a query Letters:
let
    Source = { "a" .. "z" } & { "A" .. "Z" }
in
    Source
Create a query GetFirstLetterIndex:
let
    Source = (text) => let
        // For each letter, find where it first appears in the text. Text.PositionOf returns -1 when the letter is absent; replace that with the text length so that missing letters never win the minimum below.
        firstLetterIndex = List.Transform(Letters, each let pos = Text.PositionOf(text, _), correctedPos = if pos < 0 then Text.Length(text) else pos in correctedPos),
        minimumIndex = List.Min(firstLetterIndex)
    in minimumIndex
in
    Source
In the table containing your data, add a custom column with this formula:
Text.Range([ColumnWithData], 0, GetFirstLetterIndex([ColumnWithData]))
That formula will take everything from your data text until the first letter.
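If it helps to see the logic outside of M, the same "cut at the first letter" idea can be sketched in Python. This is only an illustration, not Power Query code; member_number is a made-up name, and it assumes that anything str.isalpha() accepts counts as a letter.

```python
def member_number(raw: str) -> str:
    """Return everything before the first letter in `raw`."""
    # Index of the first alphabetic character, or len(raw) if there is none
    first_letter = next(
        (i for i, char in enumerate(raw) if char.isalpha()),
        len(raw),
    )
    return raw[:first_letter]


if __name__ == "__main__":
    for value in ["1234567Z1", "2345T10", "222222T10Z1", "111", "111A"]:
        print(value, "->", member_number(value))
```

Falling back to the full length when no letter is found plays the same role as the corrected position in GetFirstLetterIndex.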

How to groupBy groupBy?

I need to map through a List[(A,B,C)] to produce an html report. Specifically, a
List[(Schedule,GameResult,Team)]
Schedule contains a gameDate property that I need to group by on to get a
Map[JodaTime, List[(Schedule,GameResult,Team)]]
which I use to display gameDate table row headers. Easy enough:
val data = repo.games.findAllByDate(fooDate).groupBy(_._1.gameDate)
Now the tricky bit (for me) is, how to further refine the grouping in order to enable mapping through the game results as pairs? To clarify, each GameResult consists of a team's "version" of the game (i.e. score, location, etc.), sharing a common Schedule gameID with the opponent team.
Basically, I need to display a game result outcome on one row as:
3 London Dragons vs. Paris Frogs 2
Grouping on gameDate lets me do something like:
data.map{case(date,games) =>
  // game date row headers
  <tr><td>{date.toString("MMMM dd, yyyy")}</td></tr>
  // print out game result data rows
  games.map{case(schedule,result, team)=>
    ...
    // BUT (result,team) slice is ungrouped, need grouped by Schedule gameID
  }
}
In the old version of the existing application (PHP) I used to
for($x = 0; $x < $this->gameCnt; $x = $x + 2) {...}
but I'd prefer to refer to variable names and not the come-back-later-wtf-is-that-inducing:
games._._2(rowCnt).total games._._3(rowCnt).name games._._1(rowCnt).location games._._2(rowCnt+1).total games._._3(rowCnt+1).name
maybe zip or double up for(t1 <- data; t2 <- data) yield(?) or something else entirely will do the trick. Regardless, there's a concise solution, just not coming to me right now...
Maybe I'm misunderstanding your requirements, but it seems to me that all you need is an additional groupBy:
repo.games.findAllByDate(fooDate).groupBy(_._1.gameDate).mapValues(_.groupBy(_._1.gameID))
The result will be of type:
Map[JodaTime, Map[GameId, List[(Schedule,GameResult,Team)]]]
(where GameId is the return type of Schedule.gameID)
Update: if you want the results as pairs, then pattern matching is your friend, as shown by Arjan. This would give us:
val byDate = repo.games.findAllByDate(fooDate).groupBy(_._1.gameDate)
val data = byDate.mapValues(_.groupBy(_._1.gameID).values.map{ case List((sa, ra, ta), (sb, rb, tb)) => (sa, (ta, ra), (tb, rb)) })
This time the result is of type:
Map[JodaTime, Iterable[ (Schedule,(Team,GameResult),(Team,GameResult))]]
Note that this will throw a MatchError if there are not exactly 2 entries with the same gameId. In real code you will definitely want to check for this case.
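To make the "check for this case" point concrete, here is a sketch of the same group-then-pair idea in Python instead of Scala. Row is a hypothetical flattened stand-in for the (Schedule, GameResult, Team) tuple, and pair_games is a made-up name; it raises a ValueError instead of a MatchError when a game does not have exactly two entries.

```python
from collections import defaultdict
from typing import NamedTuple


class Row(NamedTuple):
    """Hypothetical stand-in for one (Schedule, GameResult, Team) tuple."""
    game_date: str
    game_id: int
    team: str
    score: int


def pair_games(rows):
    """Group rows by date, then by game id, and pair the two sides."""
    by_date = defaultdict(lambda: defaultdict(list))
    for row in rows:
        by_date[row.game_date][row.game_id].append(row)

    paired = {}
    for game_date, games in by_date.items():
        pairs = []
        for game_id, entries in games.items():
            # Explicit check instead of letting a pattern match fail
            if len(entries) != 2:
                raise ValueError(
                    f"game {game_id} has {len(entries)} entries, expected 2"
                )
            first, second = entries
            pairs.append((first, second))
        paired[game_date] = pairs
    return paired


if __name__ == "__main__":
    rows = [
        Row("2012-10-01", 1, "London Dragons", 3),
        Row("2012-10-01", 1, "Paris Frogs", 2),
    ]
    for game_date, pairs in pair_games(rows).items():
        for a, b in pairs:
            print(game_date, a.score, a.team, "vs.", b.team, b.score)
```

Whether to raise, skip, or log incomplete games is a design choice that depends on how trustworthy the data is.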
OK, the solution from Régis Jean-Gilles:
val data = repo.games.findAllByDate(fooDate).groupBy(_._1.gameDate).mapValues(_.groupBy(_._1.gameID))
You said it was not correct; maybe you just didn't use it the right way?
Every List in the result is a pair of games with the same GameId.
You could produce HTML like this:
data.map{case(date,games) =>
  // game date row headers
  <tr><td>{date.toString("MMMM dd, yyyy")}</td></tr>
  // print out game result data rows
  games.map{case (gameId, List((scheduleA, resultA, teamA), (scheduleB, resultB, teamB))) =>
    ...
  }
}
And since you don't need the gameId, you can return just the paired games:
val data = repo.games.findAllByDate(fooDate).groupBy(_._1.gameDate).mapValues(_.groupBy(_._1.gameID).values)
The type of the result is now:
Map[JodaTime, Iterable[List[(Schedule,GameResult,Team)]]]
Every list is again a pair of two games with the same GameId.