I was looking for a basic utility with two functions to convert IPv4 addresses to and from Long in Scala, for example "10.10.10.10" to its Long representation 168430090 and back. A basic utility like this exists in many languages (such as Python), but on the JVM everyone appears to end up rewriting the same code.
What is the recommended approach to unifying the IPv4ToLong and LongToIPv4 functions?
Combining the ideas from leifbatterman and Elesin Olalekan Fuad and avoiding multiplication and power operations:
import scala.util.Try
def ipv4ToLong(ip: String): Option[Long] = Try(
ip.split('.').ensuring(_.length == 4)
.map(_.toLong).ensuring(_.forall(x => x >= 0 && x < 256))
.reverse.zip(List(0,8,16,24)).map(xi => xi._1 << xi._2).sum
).toOption
To convert Long to String in dotted format:
def longToipv4(ip: Long): Option[String] = if ( ip >= 0 && ip <= 4294967295L) {
// Long masks: the Int literal 0xff000000 is negative and would be sign-extended when widened
Some(List(0x000000ffL, 0x0000ff00L, 0x00ff0000L, 0xff000000L).zip(List(0,8,16,24))
.map(mi => ((mi._1 & ip) >> mi._2)).reverse
.map(_.toString).mkString("."))
} else None
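For reference, the two bit-shift functions above can be condensed into one small object with a round-trip check. This is only a sketch: the names Ipv4Codec, toLong and fromLong are made up for illustration, and validation is limited to the octet count and range.

```scala
import scala.util.Try

// Hypothetical helper object; the names are illustrative, not from any library.
object Ipv4Codec {
  // Parses a dotted quad into its numeric value; None if the input is malformed.
  def toLong(ip: String): Option[Long] = Try {
    val octets = ip.split('.').map(_.toLong)
    require(octets.length == 4 && octets.forall(o => o >= 0 && o <= 255))
    // Shift the accumulator left one byte, then OR in the next octet.
    octets.foldLeft(0L)((acc, o) => (acc << 8) | o)
  }.toOption

  // Formats a value in [0, 2^32 - 1] as a dotted quad; None if out of range.
  def fromLong(n: Long): Option[String] =
    if (n < 0 || n > 0xFFFFFFFFL) None
    else Some((3 to 0 by -1).map(i => (n >> (i * 8)) & 0xFF).mkString("."))
}
```

Ipv4Codec.toLong("10.10.10.10") gives Some(168430090), and Ipv4Codec.fromLong(168430090L) gives Some("10.10.10.10").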
import java.net.InetAddress
def IPv4ToLong(dottedIP: String): Long = {
val addrArray: Array[String] = dottedIP.split("\\.")
var num: Long = 0
var i: Int = 0
while (i < addrArray.length) {
val power: Int = 3 - i
num = num + ((addrArray(i).toInt % 256) * Math.pow(256, power)).toLong
i += 1
}
num
}
def LongToIPv4 (ip : Long) : String = {
val bytes: Array[Byte] = new Array[Byte](4)
bytes(0) = ((ip & 0xff000000) >> 24).toByte
bytes(1) = ((ip & 0x00ff0000) >> 16).toByte
bytes(2) = ((ip & 0x0000ff00) >> 8).toByte
bytes(3) = (ip & 0x000000ff).toByte
InetAddress.getByAddress(bytes).getHostAddress()
}
scala> IPv4ToLong("10.10.10.10")
res0: Long = 168430090
scala> LongToIPv4(168430090L)
res1: String = 10.10.10.10
Try the ipaddr scala library. Create an IpAddress and get its long value like this:
val ip1: IpAddress = IpAddress("192.168.0.1")
val ip1Long = ip1.numerical // returns 3232235521L
This is pretty straightforward for ipv4:
def ipToLong(ip:String) = ip.split("\\.").foldLeft(0L)((c,n)=>c*256+n.toLong)
def longToIP(ip:Long) = (for(a<-3 to 0 by -1) yield ((ip>>(a*8))&0xff).toString).mkString(".")
I have a GitHub gist that solves this. It contains code that converts from IP to Long and back again. See https://gist.github.com/OElesin/f0f2c69530a315177b9e0227a140f9c1
Here is the code:
def ipToLong(ipAddress: String): Long = {
ipAddress.split("\\.").reverse.zipWithIndex.map(a=>a._1.toInt*math.pow(256,a._2).toLong).sum
}
def longToIP(long: Long): String = {
(0 until 4).map(a=>long / math.pow(256, a).floor.toInt % 256).reverse.mkString(".")
}
Enjoy
Adding to Elesin Olalekan Fuad's answer it can be made a little more robust like this:
def ipToLong(ip: String): Option[Long] = {
Try(ip.split('.').ensuring(_.length == 4)
.map(_.toLong).ensuring(_.forall(x => x >= 0 && x < 256))
.zip(Array(256L * 256L * 256L, 256L * 256L, 256L, 1L))
.map { case (x, y) => x * y }
.sum).toOption
}
def longToIp(ip: Long): Option[String] = {
if (ip >= 0 && ip <= 4294967295L)
Some((0 until 4)
.map(a => ip / math.pow(256, a).floor.toInt % 256)
.reverse.mkString("."))
else
None
}
I like @jwvh's comment on ipv4ToLong. As for longToIpv4, how about simply:
def longToIpv4(v:Long):String = (for (i <- 0 to 3) yield (v >> (i * 8)) & 0x000000FF ).reverse.mkString(".")
Related
I have this function which uses InetAddress, but the output is occasionally wrong. (example: "::ffff:49e7:a9b2" will give an incorrect result.)
def IPv6ToBigInteger(ip: String): BigInteger = {
val i = InetAddress.getByName(ip)
val a: Array[Byte] = i.getAddress
new BigInteger(1, a)
}
I also have this function
def IPv6ToBigInteger(ip: String): BigInteger = {
val fragments = ip.split(":|\\.|::").filter(_.nonEmpty)
require(fragments.length <= 8, "Bad IPv6")
var ipNum = new BigInteger("0")
for (i <-fragments.indices) {
val frag2Long = new BigInteger(s"${fragments(i)}", 16)
ipNum = frag2Long.or(ipNum.shiftLeft(16))
}
ipNum
}
which appears to have a parsing error, because it gives the wrong output unless the address is in the full 0:0:0:0:0:0:0:0 form. It is based on my IPv4ToLong function:
def IPv4ToLong(ip: String): Long = {
val fragments = ip.split('.')
var ipNum = 0L
for (i <- fragments.indices) {
val frag2Long = fragments(i).toLong
ipNum = frag2Long | ipNum << 8L
}
ipNum
}
In Scala an operator's precedence is set by its first character, and < binds tighter than |, so this
ipNum = frag2Long | ipNum << 8L
is
ipNum = frag2Long | (ipNum << 8L)
not
ipNum = (frag2Long | ipNum) << 8L
[ And please use foldLeft rather than var and while ]
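A quick way to see how the expression groups is to evaluate all three forms side by side. Per the Scala language specification, an infix operator's precedence comes from its first character, and < ranks above |. The sketch below also shows the foldLeft form of IPv4ToLong; the object name PrecedenceCheck and the sample values are made up for illustration.

```scala
// Hypothetical demo object; names and sample values are illustrative only.
object PrecedenceCheck {
  val a = 1L
  val b = 2L

  val unparenthesized = a | b << 8L   // how Scala parses the original line
  val shiftFirst      = a | (b << 8L) // shift, then OR
  val orFirst         = (a | b) << 8L // OR, then shift

  // Same logic as IPv4ToLong, written with foldLeft instead of var and a loop.
  def ipv4ToLong(ip: String): Long =
    ip.split('.').foldLeft(0L)((acc, frag) => frag.toLong | acc << 8L)
}
```

Here unparenthesized and shiftFirst are both 513, while orFirst is 768.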
Interesting challenge: transform IP address strings into BigInt values, allowing for all legal IPv6 address forms.
Here's my try.
import scala.util.Try
def iPv62BigInt(ip: String): Try[BigInt] = Try{
val fill = ":0:" * (8 - ip.split("[:.]").count(_.nonEmpty))
val fullArr =
raw"((?<=\.)(\d+)|(\d+)(?=\.))".r
.replaceAllIn(ip, _.group(1).toInt.toHexString)
.replace("::", fill)
.split("[:.]")
.collect{case s if s.nonEmpty => s"000$s".takeRight(4)}
if (fullArr.length == 8) BigInt(fullArr.mkString, 16)
else throw new NumberFormatException("wrong number of elements")
}
This is, admittedly, a bit lenient in that it won't catch all non-IPv6 forms, but that's not a trivial task using tools like regex.
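On the InetAddress version from the question: as far as I know, Java converts an IPv4-mapped literal such as ::ffff:49e7:a9b2 into an Inet4Address, so getAddress returns only 4 bytes and the ffff marker is lost. Here is a minimal sketch that puts it back, assuming that is the cause of the wrong results. The object and method names are made up for illustration, and the input is assumed to be an address literal rather than a host name.

```scala
import java.net.InetAddress

// Hypothetical helper; assumes the input is an IP literal (no DNS lookup wanted).
object Ipv6ToBigInt {
  def toBigInt(ip: String): BigInt = {
    val bytes = InetAddress.getByName(ip).getAddress
    val value = BigInt(1, bytes) // signum 1: treat the bytes as unsigned
    // A literal containing ':' that came back as 4 bytes was IPv4-mapped,
    // so restore the 0xffff marker that sits just above the low 32 bits.
    if (bytes.length == 4 && ip.contains(":")) value + (BigInt(0xFFFF) << 32)
    else value
  }
}
```

With this, Ipv6ToBigInt.toBigInt("::ffff:49e7:a9b2") matches BigInt("ffff49e7a9b2", 16).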
I'm getting logs from a firewall in CEF Format as a string which looks as:
ABC|XYZ|F123|1.0|DSE|DSE|4|externalId=e705265d0d9e4d4fcb218b cn2=329160 cn1=3053998 dhost=SRV2019 duser=admin msg=Process accessed NTDS fname=ntdsutil.exe filePath=\\Device\\HarddiskVolume2\\Windows\\System32 cs5="C:\\Windows\\system32\\ntdsutil.exe" "ac i ntds" ifm "create full ntdstest3" q q fileHash=80c8b68240a95 dntdom=adminDomain cn3=13311 rt=1610948650000 tactic=Credential Access technique=Credential Dumping objective=Gain Access patternDisposition=Detection. outcome=0
How can I create a DataFrame from this kind of string where I'm getting key-value pairs separated by = ?
My objective is to infer the schema from this string dynamically, i.e., to extract the keys from the left side of each = and build a schema from them.
What I am doing currently is pretty lame (IMHO) and not very dynamic, because the number of key-value pairs can change between different types of logs.
import scala.collection.mutable.ListBuffer
// Triple quotes keep the embedded double quotes and backslashes exactly as they appear in the log
val a: String = """ABC|XYZ|F123|1.0|DSE|DCE|4|externalId=e705265d0d9e4d4fcb218b cn2=329160 cn1=3053998 dhost=SRV2019 duser=admin msg=Process accessed NTDS fname=ntdsutil.exe filePath=\\Device\\HarddiskVolume2\\Windows\\System32 cs5="C:\\Windows\\system32\\ntdsutil.exe" "ac i ntds" ifm "create full ntdstest3" q q fileHash=80c8b68240a95 dntdom=adminDomain cn3=13311 rt=1610948650000 tactic=Credential Access technique=Credential Dumping objective=Gain Access patternDisposition=Detection. outcome=0"""
val ttype: String = "DCE"
type parseReturn = (String,String,List[String],Int)
def cefParser(a: String, ttype: String): parseReturn = {
  val firstPart = a.split("\\|")
  if (firstPart.size == 8 && firstPart(4) == ttype) {
    // The seven pipe-delimited header fields, followed by the parsed extension values
    val pD = firstPart.take(7).toList ++ parseSecondPart(firstPart(7), ttype)
    (firstPart(2), ttype, pD, pD.length)
  } else {
    val temp: List[String] = List(a)
    (firstPart(2), "IRRELEVANT", temp, temp.length)
  }
}
The method parseSecondPart is:
def parseSecondPart(m: String, ttype: String): ListBuffer[String] = ttype match {
case auditActivity.ttype => parseAuditEvent(m)
}
Another function call just replaces some text in the logs:
def parseAuditEvent(msg: String): ListBuffer[String] = {
val updated_msg = msg.replace("cat=", "metadata_event_type=")
.replace("destinationtranslatedaddress=", "event_user_ip=")
.replace("duser=", "event_user_id=")
.replace("deviceprocessname=", "event_service_name=")
.replace("cn3=", "metadata_offset=")
.replace("outcome=", "event_success=")
.replace("devicecustomdate1=", "event_utc_timestamp=")
.replace("rt=", "metadata_event_creation_time=")
parseEvent(updated_msg)
}
Final function to get only the values:
def parseEvent(msg: String): ListBuffer[String] = {
val newMsg = msg.replace("\\=", "$_equal_$")
val pD = new ListBuffer[String]()
val splitData = newMsg.split("=")
val mSize = splitData.size
for (i <- 1 until mSize) {
if(i < mSize-1) {
val a = splitData(i).split(" ")
val b = a.size-1
val c = a.slice(0,b).mkString(" ")
pD += c.replace("$_equal_$","=")
} else if(i == mSize-1) {
val a = splitData(i).replace("$_equal_$","=")
pD += a
} else {
logExceptions(newMsg)
}
}
pD
}
The returned tuple contains a List[String] in the third position, from which I create a DataFrame as follows:
val df = ss.sqlContext
.createDataFrame(tempRDD.filter(x => x._1 != "IRRELEVANT")
.map(x => Row.fromSeq(x._3)), schema)
People of Stack Overflow, I really need your help improving my code, both in performance and in approach.
Any kind of help and/or suggestions will be highly appreciated.
Thanks In Advance.
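For the dynamic key extraction asked about above, one possible starting point is to locate every key= token with a regex and take the text up to the next key as the value, which gives a Map instead of a positional list. This is a rough sketch in plain Scala, not a full CEF parser: the names CefSketch, parseExtension and parse are made up, and it assumes keys consist of word characters and are preceded by whitespace (escaped \= inside values is simply left alone).

```scala
// Hypothetical helper; not a complete CEF implementation.
object CefSketch {
  // A key is a run of word characters at the start of the text or after
  // whitespace, immediately followed by '='.
  private val KeyPattern = """(?:^|\s)(\w+)=""".r

  // Splits the extension part into key -> value, where each value runs
  // up to the start of the next key (or the end of the string).
  def parseExtension(ext: String): Map[String, String] = {
    val keys = KeyPattern.findAllMatchIn(ext).toVector
    keys.zipWithIndex.map { case (m, i) =>
      val valueEnd = if (i + 1 < keys.size) keys(i + 1).start else ext.length
      m.group(1) -> ext.substring(m.end, valueEnd).trim
    }.toMap
  }

  // Returns the seven header fields plus the extension map, or None if the
  // line does not have eight pipe-delimited parts.
  def parse(line: String): Option[(Vector[String], Map[String, String])] = {
    val parts = line.split("\\|", 8) // limit 8 keeps any '|' inside the extension
    if (parts.length == 8) Some((parts.take(7).toVector, parseExtension(parts(7))))
    else None
  }
}
```

The key set of the resulting Map could then serve as the column names when building the schema, so a log type with more or fewer pairs needs no code change.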
I'm new to the Scala programming language. In this bubble sort I need to generate 10 random integers instead of writing them down as in the code below.
Any suggestions?
object BubbleSort {
def bubbleSort(array: Array[Int]) = {
def bubbleSortRecursive(array: Array[Int], current: Int, to: Int): Array[Int] = {
println(array.mkString(",") + " current -> " + current + ", to -> " + to)
to match {
case 0 => array
case _ if(to == current) => bubbleSortRecursive(array, 0, to - 1)
case _ =>
if (array(current) > array(current + 1)) {
val temp = array(current + 1)
array(current + 1) = array(current)
array(current) = temp
}
bubbleSortRecursive(array, current + 1, to)
}
}
bubbleSortRecursive(array, 0, array.size - 1)
}
def main(args: Array[String]): Unit = {
val sortedArray = bubbleSort(Array(10,9,11,5,2))
println("Sorted Array -> " + sortedArray.mkString(","))
}
}
Try this:
import scala.util.Random
val randomArray = (1 to 10).map(_ => Random.nextInt()).toArray
You can use scala.util.Random for generation. The nextInt method takes an exclusive upper bound, so the code sample below generates a sequence of 10 Int values from 0 (inclusive) to 100 (exclusive).
val r = scala.util.Random
for (i <- 1 to 10) yield r.nextInt(100)
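You can also do it this way (solv1 picks 10 distinct values from 1 to 100; solv2 draws 10 arbitrary Ints):
import scala.util.Random
val solv1 = Random.shuffle((1 to 100).toList).take(10)
val solv2 = Array.fill(10)(Random.nextInt())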
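If repeatable input is useful while debugging the sort, a seeded generator is one option. A small sketch, where the object name RandomInput and its parameters are made up for illustration:

```scala
import scala.util.Random

// Hypothetical helper for producing test input.
object RandomInput {
  // Returns `count` pseudo-random Ints in [0, bound); the same seed always
  // yields the same array.
  def randomInts(count: Int, bound: Int, seed: Long): Array[Int] = {
    val rng = new Random(seed)
    Array.fill(count)(rng.nextInt(bound))
  }
}
```

In the question's main method this could replace the literal array, e.g. bubbleSort(RandomInput.randomInts(10, 100, 42L)).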
I am trying to rewrite this line of Scala + Figaro using my function sum_, but I get some errors.
val sum = Container(vars:_*).reduce(_+_)
It uses the reduce() method to calculate the sum. I want to rewrite this line, but I get errors because of the Chain return type [Double, Int]:
import com.cra.figaro.language._
import com.cra.figaro.library.atomic.continuous.Uniform
import com.cra.figaro.language.{Element, Chain, Apply}
import com.cra.figaro.library.collection.Container
object sum {
def sum_(arr: Int*) :Int={
var i=0
var sum: Int =0
while (i < arr.length) {
sum += arr(i)
i += 1
}
return sum
}
def fillarray(): Int = {
scala.util.Random.nextInt(10) match{
case 0 | 1 | 2 => 3
case 3 | 4 | 5 | 6 => 4
case _ => 5
}
}
def main(args: Array[String]): Unit = {
val par = Array.fill(18)(fillarray())
val skill = Uniform(0.0, 8.0/13.0)
val shots = Array.tabulate(18)((hole: Int) => Chain(skill, (s:Double) =>
Select(s/8.0 -> (par(hole)-2),
s/2.0 -> (par(hole)-1),
s -> par(hole),
(4.0/5.0) * (1.0 - (13.0 * s)/8.0)-> (par(hole)+1),
(1.0/5.0) * (1.0 - (13.0 * s)/8.0) -> (par(hole)+2))))
val vars = for { i <- 0 until 18} yield shots(i)
//this line I want to rewrite
val sum1 = Container(vars:_*).reduce(_+_)
//My idea was to implement in this way the line above
val sum2 = sum_(vars)
}
}
If you want to use your function, you can do it like this:
val sum2 = sum_(vars.map(chain => chain.generateValue()):_*)
or
val sum2 = sum_(vars.map(_.generateValue()):_*)
but I'd recommend digging deeper into your library and the functional paradigm.
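As a side note on the :_* part: a varargs parameter such as Int* accepts a whole sequence only when it is passed with that ascription, and the loop in sum_ can be written as a fold. A plain-Scala sketch with no Figaro involved; the object name and sample values are made up:

```scala
// Hypothetical demo; the values stand in for already-generated Int scores.
object VarargsSum {
  // Same result as the while-loop version, written as a fold.
  def sum_(arr: Int*): Int = arr.foldLeft(0)(_ + _)

  val scores: Seq[Int] = Vector(3, 4, 5)

  // `scores: _*` expands the sequence into the varargs parameter;
  // passing `scores` directly would not type-check.
  val total: Int = sum_(scores: _*)
}
```

Here total is 12.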
I have to analyze an email corpus to see how many individual sentences are dominated by leet speak (e.g., lol, brb, etc.).
For each sentence I am doing the following:
val words = sentence.split(" ")
for (word <- words) {
if (validWords.contains(word)) {
score += 1
} else if (leetWords.contains(word)) {
score -= 1
}
}
Is there a better way to calculate the scores using Fold?
Not a great deal different, but another option.
val words = List("one", "two", "three")
val valid = List("one", "two")
val leet = List("three")
def check(valid: List[String], invalid: List[String])(words:List[String]): Int = words.foldLeft(0){
case (x, word) if valid.contains(word) => x + 1
case (x, word) if invalid.contains(word) => x - 1
case (x, _ ) => x
}
val checkValidOrLeet = check(valid, leet)(_)
val count = checkValidOrLeet(words)
If not limited to fold, using sum would be more concise.
sentence.split(" ")
.iterator
.map(word =>
if (validWords.contains(word)) 1
else if (leetWords.contains(word)) -1
else 0
).sum
Here's a way to do it with fold and partial application. Could still be more elegant, I'll continue to think on it.
val sentence = // ...your data....
val validWords = // ... your valid words...
val leetWords = // ... your leet words...
def checkWord(goodList: List[String], badList: List[String])(c: Int, w: String): Int = {
if (goodList.contains(w)) c + 1
else if (badList.contains(w)) c - 1
else c
}
val count = sentence.split(" ").foldLeft(0)(checkWord(validWords, leetWords))
print(count)
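One more thought on performance: contains on a List is a linear scan, so for a whole corpus it may be worth holding the word lists as Sets. A sketch of the same fold under that assumption; the object name LeetScore is made up, and it splits on runs of whitespace rather than on a single space:

```scala
// Hypothetical helper; assumes the word lists can be held as Sets.
object LeetScore {
  // +1 for each valid word, -1 for each leet word, 0 for anything else.
  def score(sentence: String, validWords: Set[String], leetWords: Set[String]): Int =
    sentence.split("\\s+").foldLeft(0) { (acc, word) =>
      if (validWords(word)) acc + 1      // Set#apply is a membership test
      else if (leetWords(word)) acc - 1
      else acc
    }
}
```

For example, LeetScore.score("lol this is brb fine", Set("this", "is", "fine"), Set("lol", "brb")) gives 1.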