Scala Count Lines in a file VERY FAST - scala

Count lines in a file (With BufferedInputStream) in Scala.
object CountLinesScala {
def main(args: Array[String]) {
val c = countLines("C:/.../Motifs/mrr569.fasta")
println(c)
}
def countLines(filename: String): Int = {
val is = new BufferedInputStream(new FileInputStream(filename))
try {
val c = Array.ofDim[Byte](1024)
var count = 0
var readChars = 0
var empty = true
while ((readChars = is.read(c)) != -1) {
empty = false
for (i <- 0 until readChars if c(i) == '\n') {
count=count +1
}
}
if ((count == 0 && !empty)) 1 else count
} finally {
is.close()
}
}
}
It's not working. Why?
I click Run, but nothing happens and no errors are reported.

Your code is not working because an assignment in Scala has type Unit, so you are comparing Unit to an Int (-1). The two can never be equal, so the condition is always true and your while loop never exits.
More specifically, this expression has type Unit:
(readChars = is.read(c))
If you want to fix your version of the program, you could define an internal function doRead:
def doRead: Int = {
readChars = is.read(c)
readChars
}
and use that in your while loop:
while (doRead != -1) {
empty = false
for (i <- 0 until readChars if c(i) == '\n') {
count=count +1
}
}
The final code for countLines would look like this:
import java.io.{BufferedInputStream, FileInputStream}
def countLines(filename: String): Int = {
val is = new BufferedInputStream(new FileInputStream(filename))
try {
val c = Array.ofDim[Byte](1024)
var count = 0
var readChars = 0
var empty = true
def doRead: Int = {
readChars = is.read(c)
readChars
}
while (doRead!= -1) {
empty = false
for (i <- 0 until readChars if c(i) == '\n') {
count=count +1
}
}
if ((count == 0 && !empty)) 1 else count
} finally {
is.close()
}
}
However, I advise you not to write Scala code like this. As brian answered, the most idiomatic way to write this is to use the Scala standard library:
scala.io.Source.fromFile("C:/.../Motifs/mrr569.fasta").getLines.size
Then your original program would become
import scala.io.Source
object CountLinesScala {
def main(args: Array[String]): Unit = {
val c = Source.fromFile("C:/.../Motifs/mrr569.fasta").getLines().size
println(c)
}
}
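If you do want to keep the buffered byte-level read, here is a minimal sketch of the same loop without an assignment inside the while condition. The names CountNewlines and countNewlines are made up for this illustration. Note one assumption that differs from the original: it simply counts '\n' bytes, so a last line without a trailing newline is not counted.

```scala
import java.io.{BufferedInputStream, FileInputStream}

object CountNewlines {
  private val Newline: Byte = '\n'.toByte

  // Counts the '\n' bytes in the file, reading it in 8 KiB chunks.
  def countNewlines(filename: String): Int = {
    val in = new BufferedInputStream(new FileInputStream(filename))
    try {
      val buf = new Array[Byte](8192)
      Iterator
        .continually(in.read(buf))   // each step performs one read and yields the byte count
        .takeWhile(_ != -1)          // read returns -1 at end of stream
        .map(n => buf.iterator.take(n).count(_ == Newline))
        .sum
    } finally {
      in.close()
    }
  }
}
```

Iterator.continually re-evaluates in.read(buf) for every element, so the comparison with -1 is made against the Int returned by read and not against Unit.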

The standard library handles this nicely.
scala.io.Source.fromFile("C:/.../Motifs/mrr569.fasta").getLines.size

Related

Unable to compare lists and array in scala

My problem statement is as below:
1) I have a path to a folder and have to traverse it: check whether that path contains any subfolders and files. If it does, match the contents of the folder (a List) against an array. If an entry matches, take it as the new path.
2) Using the new path, list the files in that path.
I am able to do everything except compare the list with the array. Below is my code:
import java.text.SimpleDateFormat
import java.util.Calendar
import java.io.File
class ListingDirectories {
def getListOfDirectories(dir: String): List[File] = {
val d = new File(dir)
if (d.exists && d.isDirectory) {
d.listFiles().filter(_.isDirectory()).toList
} else {
List[File]()
}
}
def getListOfFiles(dir: String): List[File] = {
val d = new File(dir)
if (d.exists && d.isDirectory()) {
d.listFiles().filter(_.isFile()).toList
} else {
List[File]()
}
}
}
object FirstSample {
def main(args: Array[String]) {
val ld = new ListingDirectories()
val directoriesList = ld.getListOfDirectories("C:/Users/Siddheshk2/Desktop/CENSUS").toList
println(directoriesList + "\n")
val directoriesListReplaced = directoriesList.toString().replace("//", "/")
// println(directoriesListReplaced.indexOf("C:/Users/Siddheshk2/Desktop/CENSUS/SAMPLE"))
var finalString = ""
var s = Array("C:/Users/Siddheshk2/Desktop/CENSUS/SAMPLE")
for (x <- s) {
if (x.equals(directoriesListReplaced)) {
finalString = s(0)
} else {
println("No matching strings")
}
}
val filesList = ld.getListOfFiles(finalString)
println(filesList.toString())
}
}
I just need to compare the values from the list and the array and store the match as a new path in the finalString variable, so that I can pass it to the next method, getListOfFiles. I figured that since my methods return List[File], I am not able to access the elements inside it. Can anyone help me understand where I am going wrong? TIA
Your directoriesListReplaced will be a single string which looks like "List(C:/Users/Siddheshk2/Desktop/CENSUS/SAMPLE,C:/Users/Siddheshk2/Desktop/CENSUS/SAMPLE1,C:/Users/Siddheshk2/Desktop/CENSUS/SAMPLE2)", so no element of s will equal it. It isn't at all clear what you want to do with directoriesListReplaced; maybe the loop should just be
for (x <- s) {
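// directoriesList holds File objects, so compare their paths (not the Files themselves) with the String x
if (directoriesList.exists(_.getPath.replace("\\", "/") == x)) {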
...
} else {
println("No matching strings")
}
}
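To make the comparison concrete, here is a small sketch that matches a List[File] against an Array[String] by comparing path strings. The names MatchDirectories, normalise and firstMatch are hypothetical helpers, not part of the code in the question, and it assumes that the only difference between the two sides is the path separator.

```scala
import java.io.File

object MatchDirectories {
  // Normalise Windows separators so "C:\\a\\b" and "C:/a/b" compare equal.
  def normalise(path: String): String = path.replace("\\", "/")

  // Returns the first directory whose normalised path appears in `wanted`, if any.
  def firstMatch(dirs: List[File], wanted: Array[String]): Option[String] = {
    val wantedSet = wanted.map(normalise).toSet
    dirs.map(f => normalise(f.getPath)).find(wantedSet.contains)
  }
}
```

Returning an Option avoids the mutable finalString: the caller can pattern match on Some(path) to list the files, or on None to report that nothing matched.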
After a lot of struggle, I solved it using the code below:
import java.io.File
object PlayingWithLists {
def main(ar: Array[String]) {
var s = "C:/Users/Siddheshk2/Desktop/CENSUS/SAMPLE2"
var finalValue = ""
var valuesReplaced = ""
var filePath = ""
for (file <- new File("C:/Users/Siddheshk2/Desktop/CENSUS").listFiles) {
valuesReplaced = file.toString.replace("\\", "/")
if (valuesReplaced.contains(s.trim)) {
finalValue = file.toString
} else {
}
}
for (file <- new File(finalValue).listFiles) {
filePath = file.toString.trim
}
}
}

Circular Generators hanging indefinitely

I have a set of names in a file. I need to implement a Generator that iterates through them continually. However, the code is hanging indefinitely at if (iter.hasNext) after the first pass.
Gen code
var asStream = getClass.getResourceAsStream("/firstnames/female/en_US_sample.txt")
var source: BufferedSource = Source.fromInputStream(asStream)
var iter: Iterator[String] = Iterator.continually(source.getLines()).flatten
val genLastName: Gen[String] = {
genCannedData
}
def genCannedData: Gen[String] = {
println("Generating names: " + iter)
Gen.delay {
if (iter.hasNext) {
println("In if")
Gen.const(iter.next)
}
else {
println("In else")
Gen.const(iter.next)
}
}
}
Sample Property test
property("FirstNames") = {
forAll(genLastName) {
a => {
println(a)
a == a
}
}
}
en_US_sample.txt file contents
Abbie
Abby
Abigail
Ada
Adah
EDIT- Temporary working code
The following code works if I recreate the iterator but I was wondering why Iterator.continually is hanging?
def genCannedData: Gen[String] = {
Gen.delay {
if (iter.hasNext) {
Gen.const(iter.next)
}
else {
asStream = getClass.getResourceAsStream("/firstnames/female/en_US_sample.txt")
source = Source.fromInputStream(asStream)
iter = source.getLines()
Gen.const(iter.next)
}
}
}
After the first pass the underlying source is exhausted, so every iterator returned by source.getLines() is empty: its hasNext returns false.
Iterator.continually() keeps re-evaluating source.getLines() in search of a non-empty iterator, but it gets an empty one every time. The flattened iterator's hasNext therefore never returns, and you end up in an infinite loop.
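One way around this, sketched below, is to read the lines into an immutable collection once and cycle over that collection instead of over the one-shot source. CyclicNames and cycle are names invented for this sketch; in the question's code the Vector would be built from source.getLines().toVector.

```scala
object CyclicNames {
  // A Vector can be traversed any number of times, unlike the iterator
  // returned by Source.getLines(), which is exhausted after one pass.
  def cycle[A](items: Vector[A]): Iterator[A] = {
    require(items.nonEmpty, "cannot cycle over an empty collection")
    Iterator.continually(items).flatten
  }
}
```

The require guards against the same hang: flattening an endless supply of empty collections would loop forever in hasNext.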

Will values in a function be recomputed on every call?

object isValidUuid {
val sample = "f40473b8-9924-2a9a-bd82-7432191f2a75"
val len = sample.length
val dashIndices = sample.indices.filter(sample(_) == '-')
def apply(string: String) = {
string.length == len && dashIndices.forall(string(_) == '-')
}
}
def isValidUuid(string: String) = {
//f40473b8-9924-2a9a-bd82-7432191f2a75
val sample = "f40473b8-9924-2a9a-bd82-7432191f2a75"
val len = sample.length
val dashIndices = sample.indices.filter(sample(_) == '-')
string.length == len && dashIndices.forall(string(_) == '-')
}
Do the object and the function isValidUuid do exactly the same thing,
or will the object be faster because the function calculates len and dashIndices on every call?
This scala code:
object O {
val i = 1
def foo = {
i
}
def bar = {
val x = 1
x
}
}
Compiles to this java:
public class _$$anon$1$O$ {
private final int i;
public int i() {
return this.i;
}
public int foo() {
return this.i();
}
public int bar() {
final int x = 1;
return x;
}
{
this.i = 1;
}
}
// lazy object initialization omitted
As you can see, all values inside the function are compiled into local variables, while values inside the object are class fields that are initialized only once (when the object is initialized). I omitted the object initialization code for clarity.
Check my scala-to-java tool; it helps to understand how Scala works in such cases.
You can easily test this by adding a sleep to your length calculation:
object isValidUuidObj {
val sample = "f40473b8-9924-2a9a-bd82-7432191f2a75"
val len = {
Thread.sleep(1000)
sample.length
}
val dashIndices = sample.indices.filter(sample(_) == '-')
def apply(string: String) = {
string.length == len && dashIndices.forall(string(_) == '-')
}
}
def isValidUuid(string: String) = {
//f40473b8-9924-2a9a-bd82-7432191f2a75
val sample = "f40473b8-9924-2a9a-bd82-7432191f2a75"
val len = {
Thread.sleep(1000)
sample.length
}
val dashIndices = sample.indices.filter(sample(_) == '-')
string.length == len && dashIndices.forall(string(_) == '-')
}
Yes, the object will be faster, but I don't think the difference matters in your case.
You'd be better off keeping all constants in an object (magic numbers/strings are always a bad idea), while using an object for calculations is not a clean solution.
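The difference can also be observed without sleeping, by counting how often the initializer runs. The following is only an illustrative sketch: InitCounter, Cached, uncached and the two counters are invented names, and the literal 36 stands in for sample.length.

```scala
object InitCounter {
  var objectInits = 0
  var functionInits = 0

  object Cached {
    // Evaluated once, when Cached is first accessed.
    val len: Int = { objectInits += 1; 36 }
    def apply(s: String): Boolean = s.length == len
  }

  def uncached(s: String): Boolean = {
    // Evaluated again on every call.
    val len = { functionInits += 1; 36 }
    s.length == len
  }
}
```

After calling InitCounter.Cached(...) and InitCounter.uncached(...) three times each, objectInits is 1 and functionInits is 3.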

Call the neighbor actors in the game of life using scala akka actor model

I'm just a beginner in Scala but experienced in Java and C++. Now I want to use the Akka actor model to implement a parallel Conway's Game of Life. My idea is to create a 50*50 grid in which each cell is an actor, and to pass messages among the actors to update them. Here is how I create the actors:
class World(row: Int, col: Int, step: Int) extends Actor {
val random = new Random() //generate the alive or dead cell for the game
val cellgrid = new World(row, col, step)
def receive = {
case Start => {
for(gridrow <- 0 until row) {
for(gridcol <- 0 until col) {
context.actorOf(Props(new Grid(random.nextBoolean, gridrow, gridcol, row, col, step)))
}
}
for (gridrow <- 0 until row) {
for (gridcol <- 0 until col) {
sender ! Compare()
}
}
}
case Done(gridrow, gridcol, alive) => {
for(gridrow <- 0 until row) {
for(gridcol <- 0 until col) {
if(alive) print("* ")
else print(" ")
}
print("\n")
}
sender ! Stop
}
case Stop => {context.stop(self); println("Stopped the actor system!")}
case _ => context.stop(self)
}
}
But that causes a problem. Since I create so many Grid actors, I'm having trouble calling the neighbor actors. Here is the Grid class:
class Grid(var life: Boolean, gridrow: Int, gridcol: Int, row: Int, col: Int, step: Int) extends Actor {
val alive = life
var numNeighbors = 0
var currentStep = 0
val desiredSteps = step
val count = Array(0,0)
val countsuround = Array(0,0)
for (neighbourX <- gridrow - 1 to gridrow + 1) {
if (neighbourX >= 0 && neighbourX < col) {
for (neighbourY <- gridcol - 1 to gridcol + 1) {
if (neighbourY >= 0 && neighbourY < row && (neighbourX != row && neighbourY != col))
numNeighbors = numNeighbors + 1
}
}
}
def receive = {
case Compare() => {
for (neighbourX <- gridrow - 1 to gridrow + 1) {
if (neighbourX >= 0 && neighbourX < col) {
for (neighbourY <- gridcol - 1 to gridcol + 1) {
if (neighbourY >= 0 && neighbourY < row && (neighbourX != row && neighbourY != col))
sender ! Record(life, currentStep) //Here I need to pass messages to my neighbors
}
}
}
}
case Record(alive,step) => {
if(alive == true){
count(step%2) += 1
}
countsuround(step%2) += 1
self ! Check()
}
case Check() => {
if(countsuround(currentStep%2) == numNeighbors) {
if(count(currentStep%2) ==3 ||life == true && count(currentStep%2) == 2)
life = true
else
life = false
count(currentStep%2) =0; countsuround(currentStep%2) = count(currentStep%2)
currentStep += 1
if(desiredSteps <= currentStep + 1)
sender ! Stop
else {
sender ! Done(gridrow, gridcol, alive)
//context.stop(self)
}
}
}
}
}
Please take a look at the Compare case in the receive function. At the end of it I need to send Record messages to my neighbors, but I can't find a proper way to address them: I don't have any index of the neighbors (like (neighborX, neighborY).Record(life, currentStep)). Please help me, I've been stuck here for several weeks. Thanks!!!
Here is a full working example with some comments.
import akka.actor._
import scala.concurrent.duration._
object GameOfLife extends App {
val system = ActorSystem("game-of-life")
implicit val ec = system.dispatcher // implicit ExecutionContext for scheduler
val Width = 20
val Height = 20
// create view so we can send results to
val view = system.actorOf(Props(classOf[View], Width, Height), "view")
// create map of cells, key is coordinate, a tuple (Int, Int)
val cells = (for { i <- 0 until Width; j <- 0 until Height } yield {
val cellRef = system.actorOf(Props[Cell], s"cell_$i-$j") // correct usage of Props, see docs for details
((i,j), cellRef)
}).toMap
// we need some helpers to work with grid
val neighbours = (x:Int, y:Int) => Neighbours(
for (i <- x - 1 to x + 1; j <- y - 1 to y + 1; if ((i,j) != (x,y))) yield {
cells( ( (i + Width) % Width, (j + Height) % Height) )
}
)
for { i <- 0 until Width; j <- 0 until Height } { // notice that this loop doesn't have yield, so it is foreach loop
cells((i,j)) ! neighbours(i,j) // send each cell its neighbours
}
// now we need to synchronously update all cells,
// this is the main reason why suggested model (one actor - one cell) is probably not the most adequate
for { i <- 0 until Width; j <- 0 until Height } {
cells((i,j)) ! SetState(alive = util.Random.nextBoolean, x = i, y = j)
}
// for simplicity I assume that an update takes less than the refresh time
val refreshTime = 100.millis
system.scheduler.schedule(1.second, refreshTime) {
view ! Run
for { i <- 0 until Width; j <- 0 until Height } {
cells((i,j)) ! Run
}
}
}
class View(w:Int, h:Int) extends Actor {
var actorStates: Map[(Int,Int), Boolean] = Map()
def receive:Receive = {
case Run => actorStates = Map.empty
case UpdateView(alive, x, y) =>
actorStates = actorStates + (((x,y), alive))
if (actorStates.size == w * h) {
for { j <- 0 until h } {
for(i <- 0 until w) {
if(actorStates((i,j))) {
print("x ")
} else {
print(". ")
}
}
println()
}
}
}
}
class Cell extends Actor {
var neighbours:Seq[ActorRef] = Seq()
var neighbourStates: Map[ActorRef, Boolean] = Map() // Map.empty[ActorRef, Boolean] is better
var alive:Boolean = false
var previousState:Boolean = false
var x:Int = 0
var y:Int = 0
def receive : Receive = {
case Run =>
neighbourStates = Map.empty
previousState = alive
neighbours.foreach(_ ! QueryState)
case SetState(alive,x,y) =>
this.alive = alive
this.x = x
this.y = y
case Neighbours(xs) =>
neighbours = xs
case QueryState =>
sender ! NeighbourState(alive = previousState)
case NeighbourState(alive) =>
neighbourStates = neighbourStates + ((sender, alive))
// this is tricky: even when all neighbours have sent you their states, you cannot simply mutate your own,
// because they may still request your state; the previousState hack works around that
if (neighbourStates.size == 8) { // total of 8 neighbours sent their states, we are complete with update
val aliveMembers = neighbourStates.values.filter(identity).size
aliveMembers match {
case n if n < 2 => this.alive = false
case 3 => this.alive = true
case n if n > 3 => this.alive = false;
case _ => // 2, state doesn't change
}
context.actorSelection("/user/view") ! UpdateView(this.alive, x, y)
}
}
}
case class SetState(alive:Boolean, x:Int, y:Int)
case class Neighbours(xs:Seq[ActorRef])
case object QueryState
case object Run
case class NeighbourState(alive:Boolean)
case class UpdateView (alive:Boolean, x:Int, y:Int)
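The two pieces of pure logic in this answer, the wrap-around neighbour lookup and the life rule, can be checked in isolation from the actors. The sketch below restates them as plain functions; Torus, neighbourCoords and nextState are names chosen for this illustration and do not appear in the code above.

```scala
object Torus {
  // The 8 coordinates around (x, y) on a grid whose edges wrap around.
  def neighbourCoords(x: Int, y: Int, width: Int, height: Int): Seq[(Int, Int)] =
    for {
      i <- x - 1 to x + 1
      j <- y - 1 to y + 1
      if (i, j) != (x, y)
    } yield ((i + width) % width, (j + height) % height)

  // Same rule as the match in Cell: exactly 3 live neighbours gives life,
  // 2 keeps the current state, anything else gives death.
  def nextState(alive: Boolean, liveNeighbours: Int): Boolean =
    liveNeighbours == 3 || (alive && liveNeighbours == 2)
}
```

Adding width (or height) before taking the remainder keeps the result non-negative for the -1 index, which is why the corner cell (0, 0) sees (19, 19) as a neighbour on a 20 by 20 grid.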

Merge sort in Scala using Actors

So this is the code :
object Main {
def main(args: Array[String]){
var myArray = Array(5,2,7,6,8,1,15,5/*,9,10,56*/)
var startSorting = new Starter(myArray)
startSorting.start
startSorting ! Begin
var i = 0
for( i <- 0 to myArray.length - 1){
println(myArray(i))
}
}
}
import scala.actors.Actor
import scala.actors.Actor._
abstract class SortArray
case object Sort extends SortArray
case object FinishedSubArraySorting extends SortArray
case object Begin extends SortArray
class Starter(toBeSorted :Array[Int]) extends Actor{
def act(){
var first:Array[Int] = Array()
var second:Array[Int] = Array()
loop{
react{
case Begin =>
var SortActor = new MergeSort(toBeSorted)
SortActor.start
SortActor ! Sort
case sortedArray :Array[Int] =>
var i = 0
println("Sortat:")
for( i <- 0 to sortedArray.length - 1){
println(sortedArray(i))
}
}
}
}
}
class MergeSort(toBeSorted :Array[Int]) extends Actor{
def act(){
var finishedSorting = 0
var thisArray = toBeSorted
var first:Array[Int] = Array();
var second:Array[Int] = Array();
var sortedSubArrays = 0;
loop{
react{
case Sort =>
if(thisArray.length == 1){
finishedSorting = 1
sender ! thisArray
exit('stop)
}else{
first = thisArray.slice(0,thisArray.length/2)
second = thisArray.slice(thisArray.length/2,thisArray.length)
var firstSort = new MergeSort(first)
var secondSort = new MergeSort(second)
firstSort.start
secondSort.start
firstSort ! Sort
secondSort ! Sort
}
case subSortedArray:Array[Int] =>
sortedSubArrays = sortedSubArrays + 1
if(sortedSubArrays == 1){
first = subSortedArray
}else{
second = subSortedArray
thisArray = merge(first,second)
finishedSorting = 1
sender ! thisArray
exit('stop)
}
}
}
}
def merge(firstArray :Array[Int],secondArray :Array[Int]):Array[Int] = {
var result:Array[Int] = new Array[Int](firstArray.length + secondArray.length)
var i = 0
var j = 0
var k = 0
while(i < firstArray.length && j < secondArray.length){
if(firstArray(i) <= secondArray(j)){
result(k) = firstArray(i)
i = i + 1
}else{
result(k) = secondArray(j)
j = j + 1
}
k = k + 1
}
while (i < firstArray.length)
{
result(k) = firstArray(i)
i = i + 1
k = k + 1
}
while (j < secondArray.length)
{
result(k) = secondArray(j)
j = j + 1
k = k + 1
}
return result
}
}
It uses two actors: one handles the start and the final output, and the other has multiple instances that merge sort the array and its sub-arrays. An instance is first sent Sort, which makes it divide the array in two and create an actor for each sub-array; afterwards it receives the resulting sorted sub-array from each of those two actors.
The problem with this code is that it manages to sort and merge only at the lowest levels (sub-arrays of max length 2), and then for some reason the actors reply to different senders than the ones that called them, so nothing happens anymore.
Your problem is that when a MergeSort actor receives a sorted sub-array from one of its children, it sends the merged array to sender, which at that point is the child rather than the parent.
I added a parent field to MergeSort and changed the sends to use this parent. Here's the modified code:
object Main extends App {
var myArray = Array(5,2,7,6,8,1,15,5/*,9,10,56*/)
var startSorting = new Starter(myArray)
startSorting.start
startSorting ! Begin
var i = 0
for( i <- 0 to myArray.length - 1){
println(myArray(i))
}
}
import scala.actors.Actor
import scala.actors.Actor._
abstract class SortArray
case object Sort extends SortArray
case object FinishedSubArraySorting extends SortArray
case object Begin extends SortArray
class Starter(toBeSorted :Array[Int]) extends Actor{
def act(){
var first:Array[Int] = Array()
var second:Array[Int] = Array()
loop{
react{
case Begin =>
var SortActor = new MergeSort(toBeSorted,self)
SortActor.start
SortActor ! Sort
case sortedArray :Array[Int] =>
var i = 0
println("Sortat:")
for( i <- 0 to sortedArray.length - 1){
println(">"+sortedArray(i))
}
}
}
}
}
class MergeSort(toBeSorted :Array[Int],parent:Actor) extends Actor{
def act(){
var finishedSorting = 0
var thisArray = toBeSorted
var first:Array[Int] = Array();
var second:Array[Int] = Array();
var sortedSubArrays = 0;
loop{
react{
case Sort =>
if(thisArray.length == 1){
finishedSorting = 1
println(this + " sending up "+thisArray.mkString("[",",","]"))
parent ! thisArray
exit('stop)
}else{
first = thisArray.slice(0,thisArray.length/2)
second = thisArray.slice(thisArray.length/2,thisArray.length)
var firstSort = new MergeSort(first,self)
var secondSort = new MergeSort(second,self)
firstSort.start
secondSort.start
firstSort ! Sort
secondSort ! Sort
}
case subSortedArray:Array[Int] =>
println(this + " received " + subSortedArray.mkString("[",",","]"))
sortedSubArrays = sortedSubArrays + 1
if(sortedSubArrays == 1){
first = subSortedArray
}else{
second = subSortedArray
thisArray = merge(first,second)
finishedSorting = 1
parent ! thisArray
exit('stop)
}
}
}
}
def merge(firstArray :Array[Int],secondArray :Array[Int]):Array[Int] = {
var result:Array[Int] = new Array[Int](firstArray.length + secondArray.length)
var i = 0
var j = 0
var k = 0
while(i < firstArray.length && j < secondArray.length){
if(firstArray(i) <= secondArray(j)){
result(k) = firstArray(i)
i = i + 1
}else{
result(k) = secondArray(j)
j = j + 1
}
k = k + 1
}
while (i < firstArray.length)
{
result(k) = firstArray(i)
i = i + 1
k = k + 1
}
while (j < secondArray.length)
{
result(k) = secondArray(j)
j = j + 1
k = k + 1
}
return result
}
}
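For comparison, here is a sketch of the same divide-and-merge structure written with scala.concurrent.Future instead of actors. This is a different technique from the one in the answer, shown only because each result flows back to whoever requested it, so there is no sender to get wrong. FutureMergeSort and its methods are names invented for this sketch, and it uses the global ExecutionContext as an assumption.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object FutureMergeSort {
  // Standard two-way merge of two already sorted vectors.
  def merge(a: Vector[Int], b: Vector[Int]): Vector[Int] = {
    val out = Vector.newBuilder[Int]
    var i = 0
    var j = 0
    while (i < a.length && j < b.length) {
      if (a(i) <= b(j)) { out += a(i); i += 1 }
      else { out += b(j); j += 1 }
    }
    out ++= a.drop(i)
    out ++= b.drop(j)
    out.result()
  }

  // Splits on the calling thread; the merges run as Future callbacks.
  def sort(xs: Vector[Int]): Future[Vector[Int]] =
    if (xs.length <= 1) Future.successful(xs)
    else {
      val (left, right) = xs.splitAt(xs.length / 2)
      val sortedLeft = sort(left)
      val sortedRight = sort(right)
      for (a <- sortedLeft; b <- sortedRight) yield merge(a, b)
    }
}
```

A caller can wait for the result with Await.result(FutureMergeSort.sort(Vector(5, 2, 7, 6, 8, 1, 15, 5)), 10.seconds), which needs scala.concurrent.Await and scala.concurrent.duration._ in scope.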