stringbuilder Scala drop duplicate chars - scala

class Buffer(s: String) {
import scala.collection.mutable.StringBuilder
import scala.io.StdIn
private var buffer: StringBuilder = new StringBuilder(s)
private var cursor: Int = 0 // cursor is in between characters
private var marker: Int = 0 // marker is in between characters
private var paste: String = ""
private def end: Int = buffer.length // the end of the line
private def lwr: Int = Math.min(marker, cursor)
private def upr: Int = Math.max(marker, cursor)
/*
* Accessor methods to return aspects of the state
*/
def getCursor: Int = cursor
def getMarker: Int = marker
def getString: String = buffer.toString
def getPaste: String = paste
/**
* Delete Duplicate characters. Within the defined region, for each character,
* if it occurs once then keep it, but if it occurs multiple times then keep
* only the first occurrence. The characters to the left and right of the
* defined region remain unchanged, but within the defined region the duplicates
* are removed. This operation does not affect the paste buffer. The cursor is
* placed finally at the lower end of the defined region and the marker is placed
* finally at the upper end of the (probably reduced) defined region. For example:
*
* m i s s i s s i p p i marker = 1
* ^ ^ cursor = 10
*
* Then perform dd()
*
* m i s p i marker = 1
* ^ ^ cursor = 4
*/
def dd()
{
var droppedchars: Int = 0;
for (x <- lwr until upr)
{
var c = buffer.charAt(x)
for (i <- lwr until upr)
{
if (buffer.charAt(i) == c)
{
buffer.deleteCharAt(i)
droppedchars += 1
}
}
marker = lwr
cursor = upr - droppedchars
}
}
I need some help with this one too; it doesn't appear to work.
The function needs to drop any duplicate characters it finds, move the marker back to the start of the new defined region, and move the cursor to the end of the new defined region. I'm not asking for somebody to write this for me, just to guide me in the right direction.

Why not just:
scala> "mississippi".distinct
res22: String = misp
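To restrict distinct to the defined region only, one option is to split the string into three parts and deduplicate just the middle one. This is a minimal sketch, not the asker's Buffer class: the object name DistinctRegion and the function dedupe are made up for illustration, and lo and hi stand for the lower and upper ends of the region.

```scala
object DistinctRegion {
  // Removes repeated characters inside [lo, hi), keeping first occurrences.
  // Returns the new string and the new upper end of the (possibly shorter) region.
  def dedupe(s: String, lo: Int, hi: Int): (String, Int) = {
    val kept = s.substring(lo, hi).distinct
    (s.substring(0, lo) + kept + s.substring(hi), lo + kept.length)
  }
}
```

With the question's example, DistinctRegion.dedupe("mississippi", 1, 10) gives ("mispi", 4).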

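What you're trying to do takes O(n^2) time, which is not very good from a performance perspective.
IMO, a better solution is to use a single loop over the region and, for each character c, check it against a set of characters already seen (declared outside the loop). Note that deleting shifts the following characters left, so the index must only advance when nothing was deleted:
val chars = scala.collection.mutable.Set[Char]()
...
var x = lwr
var hi = upr // upper bound shrinks with every deletion
while (x < hi) {
  val c = buffer.charAt(x)
  if (chars contains c) { buffer.deleteCharAt(x); hi -= 1 } // stay at x: the next character moved here
  else { chars += c; x += 1 }
}
cursor = marker + chars.size // assumes marker is the lower end of the region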

Related

Scala - Delete between defined region

I have a series of tests that need to pass. One of them involves deleting the buffer contents within the defined region (between the marker and the cursor) and storing the cut text in the paste buffer. I then need to set the cursor and marker to the beginning of the cut text (see the code below for a better understanding).
class Buffer(s: String) {
import scala.collection.mutable.StringBuilder
import scala.io.StdIn
private var buffer: StringBuilder = new StringBuilder(s)
private var cursor: Int = 0 // cursor is in between characters
private var marker: Int = 0 // marker is in between characters
private var paste: String = ""
private def end: Int = buffer.length // the end of the line
private def lwr: Int = Math.min(marker, cursor)
private def upr: Int = Math.max(marker, cursor)
/*
* Accessor methods to return aspects of the state
*/
def getCursor: Int = cursor
def getMarker: Int = marker
def getString: String = buffer.toString
def getPaste: String = paste
/**
Delete the contents of the defined region and save the cut string in the paste
buffer. This operation re-sets the cursor and the marker to the start of the
cut text. For example:
B U F F E R marker = 1
^ ^ cursor = 4
Then perform xd()
B E R marker = 1
^ cursor = 1
*/
I have written some code:
def xd() {
paste = buffer.substring(lwr, upr)
buffer = buffer.delete(lwr, upr)
cursor = end
marker = end
}
This seems to pass the other tests, but it does not set the marker and cursor as required.
Any suggestions, please?
First, in Scala you would try to avoid mutable state (var).
Here is a solution for cut and paste that is immutable:
case class Buffer(s: String, paste: String = "", private val cursor: Int = 0, private val marker: Int = 0) {
  def mark(str: String): Buffer = {
    val startIndex = s.indexOf(str)
    val endIndex = startIndex + str.length
    copy(cursor = startIndex, marker = endIndex) // keep s and paste, move cursor and marker
  }
  def cut(): Buffer = {
    Buffer(s.take(cursor) + s.drop(marker), // rest of the String
      s.take(marker).drop(cursor)) // paste of the String
  }
}
You can use it like this:
Buffer("hello there") // > Buffer(hello there,,0,0)
.mark("o t") // > Buffer(hello there,,4,7)
.cut() // > Buffer(hellhere,o t,0,0)
You see the result of each line.
Let me know if you need more support or if I misunderstood you.
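If the mutable design from the question has to stay, the likely culprit is that xd() assigns end (the buffer length) to cursor and marker instead of the start of the cut region. A minimal sketch of that fix follows; the class is a cut-down stand-in for the question's Buffer, and setRegion is a hypothetical helper added only so the example can set up a region.

```scala
object XdSketch {
  // Cut-down, hypothetical stand-in for the question's Buffer class.
  class Buffer(s: String) {
    private val buffer = new StringBuilder(s)
    private var cursor = 0
    private var marker = 0
    private var paste = ""
    private def lwr = math.min(marker, cursor)
    private def upr = math.max(marker, cursor)

    def getCursor: Int = cursor
    def getMarker: Int = marker
    def getString: String = buffer.toString
    def getPaste: String = paste

    // Hypothetical helper, not part of the original class.
    def setRegion(m: Int, c: Int): Unit = { marker = m; cursor = c }

    def xd(): Unit = {
      val start = lwr                  // capture both ends before anything moves
      val stop = upr
      paste = buffer.substring(start, stop)
      buffer.delete(start, stop)       // StringBuilder.delete mutates in place
      cursor = start                   // both end up at the start of the cut text
      marker = start
    }
  }
}
```

For the BUFFER example (marker = 1, cursor = 4), this leaves "BER" in the buffer, "UFF" in the paste buffer, and both cursor and marker at 1.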

Scala - Count number of adjacent repeated chars in String

I have this function that counts the number of adjacent repeated chars inside a String.
def adjacentCount( s: String ) : Int = {
var cont = 0
for (a <- s.sliding(2)) {
if (a(0) == a(1)) cont = cont + 1
}
cont
}
But I'm supposed to create a function that does exactly the same thing using only immutable variables and no loop instructions, in a "purely" functional way.
You can just use the count method on the Iterator:
val s = "aabcddd"
s.sliding(2).count(p => p(0) == p(1))
// res1: Int = 3
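One caveat: for a one-character string, sliding(2) yields a single window of length 1, so p(1) would fail. A variant that pairs each character with its successor avoids that case. This is a sketch of an alternative, not the answer's code; the object name AdjacentCount is made up for illustration.

```scala
object AdjacentCount {
  // Pairs every character with the one after it and counts the equal pairs.
  // Empty and one-character strings produce no pairs, so the result is 0.
  def adjacentCount(s: String): Int =
    s.zip(s.drop(1)).count { case (a, b) => a == b }
}
```

AdjacentCount.adjacentCount("aabcddd") returns 3, the same as the sliding version.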

Best way to read file content and find a pattern in a given list of files

I have a list of file names (nearly 400 000). I need to parse each file's content and find a given string pattern.
Can anyone suggest the best way to speed up my search? (At the moment I can process the content in 90 seconds.)
Here is the piece of code that needs to be optimised.
/**
* This method is called for each file in a list. The file is parsed character by character and compared with the pattern using a prefix table (as used in the KMP algorithm).
*
* @param pattern
* Pattern to be searched for.
*
* @param prefixTable
* Prefix table built using the KMP algorithm.
* Example: for a given pattern the resulting tables are { "ababaca" => 0012301, "abcdabca" => 00001231, "aababca" => 0101001, "aabaabaaa" => 010123452 }
*
* @param file
* File that needs to be parsed to find the string pattern.
*
* @return
* For a given file, a map from line number to all start positions of the pattern within that line.
*
*/
def contains(pattern:Array[Char],prefixTable:Array[Int], file:String):LinkedHashMap[Integer, ArrayList[Integer]]= {
val pat:String = pattern.toString()
//stores a line and char location of each occurrence
var returnValue:LinkedHashMap[Integer, ArrayList[Integer]] = new LinkedHashMap[Integer, ArrayList[Integer]]()
val source = scala.io.Source.fromFile(file,"iso-8859-1")
val lines = try source.mkString finally source.close()
var lineNumber=1
var i=0
var k=0
var j=0
while(i < lines.length()){
if(lines(i)=='\n')
{lineNumber+=1;k=0; j=0}
var charAt = new ArrayList[Integer]();
while( j<pattern.length && i < lines.length() && lines(i)==pattern(j)){
j+=1
i+=1
k+=1
}
if(j==pattern.length){charAt.add(k-pattern.length+1);j=0}
if(j==0) {i+=1;k+=1}
else{j=prefixTable(j-1)}
if(charAt.size()>0){returnValue.put(lineNumber, charAt)}
}
return returnValue;
}
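For reference, the prefix table described in the doc comment above, and a textbook KMP scan that uses it, can be sketched as below. This is a generic version of the algorithm written for illustration, not a drop-in replacement for contains; the names Kmp, prefixTable and search are assumptions, and positions are 0-based.

```scala
object Kmp {
  // table(i) = length of the longest proper prefix of pattern(0..i)
  // that is also a suffix of pattern(0..i).
  def prefixTable(pattern: String): Array[Int] = {
    val table = new Array[Int](pattern.length)
    var k = 0
    for (i <- 1 until pattern.length) {
      while (k > 0 && pattern(i) != pattern(k)) k = table(k - 1)
      if (pattern(i) == pattern(k)) k += 1
      table(i) = k
    }
    table
  }

  // 0-based start indices of every (possibly overlapping) occurrence.
  def search(text: String, pattern: String): List[Int] =
    if (pattern.isEmpty) Nil
    else {
      val table = prefixTable(pattern)
      val hits = List.newBuilder[Int]
      var k = 0 // number of pattern characters currently matched
      for (i <- 0 until text.length) {
        while (k > 0 && text(i) != pattern(k)) k = table(k - 1)
        if (text(i) == pattern(k)) k += 1
        if (k == pattern.length) {
          hits += i - pattern.length + 1
          k = table(k - 1) // fall back so overlapping matches are found
        }
      }
      hits.result()
    }
}
```

Kmp.prefixTable("ababaca").mkString gives "0012301", matching the first example in the doc comment. The scan never moves backwards in the text, so each character of the input is read once.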
With this code:
object HelloWorld {
def main(args: Array[String]) {
val name="""A""".r
val chaine="BCDARFA"
val res=name.findAllIn(chaine)
println("found?"+res)
println("1st place "+res.start)
}
}
you can find the position of the first occurrence of the regex in a string. I don't know if it is faster than yours, but in any case it could simplify your code.
EDIT:
Here's the final code:
object HelloWorld {
def main(args: Array[String]) {
val name="""A""".r
val chaine="BCDARFA"
val res=name.findAllIn(chaine)
println("found?"+res)
println("1st place "+res.start)
for (elt <- res.matchData) {
println ("position : "+elt.start)
}
}
}
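To get closer to the shape the question asks for (line numbers mapped to match positions), the same regex approach can be applied line by line. This is a rough sketch under assumptions: LineSearch and matchesByLine are made-up names, line numbers are 1-based, columns are 0-based, and no claim is made about how it compares in speed with the KMP version.

```scala
import scala.util.matching.Regex

object LineSearch {
  // Maps each 1-based line number to the 0-based start columns of all matches
  // on that line. Lines without a match are left out of the result.
  def matchesByLine(text: String, pattern: Regex): Map[Int, List[Int]] =
    text.linesIterator.zipWithIndex
      .map { case (line, idx) => (idx + 1) -> pattern.findAllMatchIn(line).map(_.start).toList }
      .filter { case (_, starts) => starts.nonEmpty }
      .toMap
}
```

For example, LineSearch.matchesByLine("BCDARFA\nxyz\nAA", "A".r) returns Map(1 -> List(3, 6), 3 -> List(0, 1)).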

How can I set the order in which discrete objects are instantiated?

I wrote an object PathGraph which implements a graph of Nodes and various useful functions, which I intend to use for pathfinding in a simple tower defense game. I also wrote a class Path which implements Dijkstra's algorithm, and each non-static in-game unit has a Path.
The problem I am running into is that when I run the application, the code executes the code to initialize the units and, in doing so, initialize a path for each creep before building the PathGraph object (confirmed using Eclipse Scala debugger and println statements). Unfortunately however, the code to generate a path requires that the PathGraph object, and specifically the path variable (var so that I can point to a new path if the map gets updated, etc.), be initialized.
How should I fix this problem with my code? PathGraph code pasted below for reference.
object PathGraph {
private val graph:Array[Array[Node]] = buildAndFillGraph()
//val nodeDist:Double = MainGame.pixelsPerIteration
val nodeDist = .5
val numXNodes = (MainGame.gamePanelWidth.toDouble / nodeDist).toInt
val numYNodes = (MainGame.gamePanelHeight.toDouble / nodeDist).toInt
val defaultInfinity = 99999
//build every Nodes adjacent nodes
val angle = 45
val minHeight = 0
val minWidth = 0
val maxHeight = MainGame.gamePanelSize.height //game panel y value starts at 0 at TOP
val maxWidth = MainGame.gamePanelSize.width
val numPossibleAdjacentNodes = 360 / angle //360 degrees, 45 degree angle between every potentially adjacent Node
val hypotenuseLength = math.sqrt((nodeDist * nodeDist) + (nodeDist * nodeDist))
def buildGraphArray(): Array[Array[Node]] = {
println("numXNodes/nodeDist.toInt: " + (numXNodes.toDouble / nodeDist).toInt + "\n")
//build every Node in the graph
val lgraph =
(for (x <- 0 until (numXNodes / nodeDist).toInt) yield {
(for (y <- 0 until (numYNodes / nodeDist).toInt) yield {
new Node(x.toDouble * nodeDist, y.toDouble * nodeDist)//gives lgraph(x,y) notation
}).toArray //convert IndexedSeqs to Arrays
}).toArray//again
lgraph
}
def buildAndFillGraph():Array[Array[Node]] = {
val lgraph = buildGraphArray()//Ar[Ar[Node]]
println("lgraph built")
lgraph.map(x => x.map(y => y.setAdjacentNodes(lgraph)))
//set the adjacent nodes for all nodes in the array
if (lgraph.size != numXNodes*numYNodes) println("numXNodes*numYNodes: " + numXNodes*numYNodes)
else MainGame.pathGraphBuilt = true
lgraph
}
def getGraph() = graph
def toBuffer(): mutable.Buffer[Node] = graph.flatten.toBuffer
def toArray(): Array[Node] = graph.flatten
}
There are a few things you can do to improve the code:
Do not use static variables. Your PathGraph should be a class, not an object. MainGame.pathGraphBuilt is also a static variable that you can replace with a builder (see the next point).
Use a Builder pattern to separate the things that build from the end result. Your PathGraph logic will mostly go into the builder. Something along these lines:
case class PathGraphBuilder(nodeDist: Double, numXNodes: Double /* and so on */) {
def apply: PathGraph = buildAndFillGraph
def buildGraphArray = ...
def buildAndFillGraph = ...
}
class PathGraph(underlyingGraph: Array[Array[Node]]) {
def toBuffer(): mutable.Buffer[Node] = underlyingGraph.flatten.toBuffer
def toArray(): Array[Node] = underlyingGraph.flatten
}
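A filled-in version of that idea might look like the sketch below. Node is a hypothetical stand-in for the question's class, and width and height are assumed parameter names. The point is only that every value the graph needs is passed in explicitly, so nothing relies on the order in which the fields of a singleton object are initialized. In the question's object, graph is the first val, so buildAndFillGraph() appears to run before nodeDist and numXNodes have been assigned.

```scala
object PathGraphSketch {
  // Hypothetical stand-in for the question's Node class.
  final case class Node(x: Double, y: Double)

  final class PathGraph(underlying: Array[Array[Node]]) {
    def toArray: Array[Node] = underlying.flatten
  }

  // The builder owns the construction logic; its inputs are plain parameters.
  final case class PathGraphBuilder(nodeDist: Double, width: Double, height: Double) {
    def build(): PathGraph = {
      val numX = (width / nodeDist).toInt
      val numY = (height / nodeDist).toInt
      // One node per grid point, indexed as grid(x)(y).
      val grid = Array.tabulate(numX, numY)((x, y) => Node(x * nodeDist, y * nodeDist))
      new PathGraph(grid)
    }
  }
}
```

The game's startup code would then call build() once, before any unit is created, and pass the resulting PathGraph to whatever constructs the paths.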

Union-Find (or Disjoint Set) data structure in Scala

I am looking for an existing implementation of a union-find or disjoint set data structure in Scala before I attempt to roll my own as the optimisations look somewhat complicated.
I mean this kind of thing - where the two operations union and find are optimised.
Does anybody know of anything existing? I've obviously tried googling around.
I had written one for myself some time back which I believe performs decently. Unlike other implementations, the find is O(1) and union is O(log(n)). If you have a lot more union operations than find, then this might not be very useful. I hope you find it useful:
package week2
import scala.collection.immutable.HashSet
import scala.collection.immutable.HashMap
/**
* Union Find implementation.
* Find is O(1)
* Union is O(log(n))
* Implementation is using a HashTable. Each wrap has a set which maintains the elements in that wrap.
* When 2 wraps are unioned, both sets are merged. O(log(n)) operation
* A HashMap is also maintained to find the Wrap associated with each node. O(log(n)) operation in maintaining it.
*
* If the input array is null at any index, it is ignored
*/
class UnionFind[T](all: Array[T]) {
private var dataStruc = new HashMap[T, Wrap]
for (a <- all if (a != null))
dataStruc = dataStruc + (a -> new Wrap(a))
var timeU = 0L
var timeF = 0L
/**
* The number of disjoint sets
*/
private var size = dataStruc.size
/**
* Unions the set containing a and b
*/
def union(a: T, b: T): Wrap = {
val st = System.currentTimeMillis()
val first: Wrap = dataStruc.get(a).get
val second: Wrap = dataStruc.get(b).get
if (first.contains(b) || second.contains(a))
first
else {
// below is to merge smaller with bigger rather than other way around
val firstIsBig = (first.set.size > second.set.size)
val ans = if (firstIsBig) {
first.set = first.set ++ second.set
second.set.foreach(a => {
dataStruc = dataStruc - a
dataStruc = dataStruc + (a -> first)
})
first
} else {
second.set = second.set ++ first.set
first.set.foreach(a => {
dataStruc = dataStruc - a
dataStruc = dataStruc + (a -> second)
})
second
}
timeU = timeU + (System.currentTimeMillis() - st)
size = size - 1
ans
}
}
/**
* true if they are in same set. false if not
*/
def find(a: T, b: T): Boolean = {
val st = System.currentTimeMillis()
val ans = dataStruc.get(a).get.contains(b)
timeF = timeF + (System.currentTimeMillis() - st)
ans
}
def sizeUnion: Int = size
class Wrap(e: T) {
var set = new HashSet[T]
set = set + e
def add(elem: T) {
set = set + elem
}
def contains(elem: T): Boolean = set.contains(elem)
}
}
Here is a simple, short and somewhat efficient mutable implementation of UnionFind:
import scala.collection.mutable
class UnionFind[T]:
  private val map = new mutable.HashMap[T, mutable.HashSet[T]]
  private var size = 0
  def distinct = size
  def addFresh(a: T): Unit =
    assert(!map.contains(a))
    val set = new mutable.HashSet[T]
    set += a
    map(a) = set
    size += 1
  def setEqual(a: T, b: T): Unit =
    val ma = map(a)
    val mb = map(b)
    if !ma.contains(b) then
      // redirect the elements of the smaller set to the bigger set
      if ma.size > mb.size
      then
        ma ++= mb
        mb.foreach { x => map(x) = ma }
      else
        mb ++= ma
        ma.foreach { x => map(x) = mb }
      size = size - 1
  def isEqual(a: T, b: T): Boolean =
    map(a).contains(b)
Remarks:
An immutable implementation of UnionFind can be useful when rollback, backtracking, or proofs are necessary.
A mutable implementation can avoid garbage collection for a speedup.
One could also consider a persistent data structure: it behaves like an immutable implementation but uses some mutable state internally for speed.
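For comparison, the classic array-based structure with union by rank and path compression (the optimisations the question alludes to) is fairly short. This is a generic textbook sketch, not the approach taken in either answer above; it assumes the elements are the integers 0 until n, and the names ClassicUnionFind and DisjointSets are made up for illustration.

```scala
object ClassicUnionFind {
  final class DisjointSets(n: Int) {
    private val parent = Array.range(0, n) // every element starts as its own root
    private val rank = new Array[Int](n)   // upper bound on the height of each tree
    private var sets = n

    def count: Int = sets

    def find(x: Int): Int = {
      var root = x
      while (parent(root) != root) root = parent(root)
      // Path compression: point every node on the path directly at the root.
      var cur = x
      while (parent(cur) != root) {
        val next = parent(cur)
        parent(cur) = root
        cur = next
      }
      root
    }

    // Returns true if a and b were in different sets and have now been merged.
    def union(a: Int, b: Int): Boolean = {
      val ra = find(a)
      val rb = find(b)
      if (ra == rb) false
      else {
        // Union by rank: attach the shallower tree under the deeper one.
        if (rank(ra) < rank(rb)) parent(ra) = rb
        else if (rank(ra) > rank(rb)) parent(rb) = ra
        else { parent(rb) = ra; rank(ra) += 1 }
        sets -= 1
        true
      }
    }

    def connected(a: Int, b: Int): Boolean = find(a) == find(b)
  }
}
```

Arbitrary element types can be supported by first mapping each element to an index, for example with a Map[T, Int] built once up front.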
