How to read a binary file in chunks in Scala
This is what I was trying to do:
val fileInput = new FileInputStream("tokens")
val dis = new DataInputStream(fileInput)
var value = dis.readInt()
var i=0;
println(value)
The value that is printed is a huge number, whereas it should return 1 as the first output.
Because you're seeing 16777216 where you'd expect to have a 1, it sounds like the problem is the endianness of the file is different than the JVM is expecting. (That is, Java always expects big endian/network byte order and your file contains numbers in little endian.)
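To make the arithmetic concrete: a little-endian 1 is stored as the bytes 01 00 00 00, and reading those same bytes as big endian gives 0x01000000, which is 16777216. A minimal sketch (the object name EndianCheck is only for illustration):

```scala
object EndianCheck {
  // The four bytes of a little-endian 1, as they would sit in the file.
  val littleEndianOne: Array[Byte] = Array[Byte](1, 0, 0, 0)

  // ByteBuffer defaults to big endian, the same order DataInputStream.readInt uses.
  val asBigEndian: Int = java.nio.ByteBuffer.wrap(littleEndianOne).getInt()

  // Swapping the byte order recovers the intended value.
  val swapped: Int = java.lang.Integer.reverseBytes(asBigEndian)
}
```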
That's a problem with a well established gamut of solutions.
For example this page has a class that wraps the input stream and makes the problem go away.
Alternatively this page has functions that will read from a DataInputStream.
This StackOverflow answer has various snippets that will simply convert an int, if that's all you need to do.
Here's a Scala snippet that will add methods to read little endian numbers from the file.
The simplest answer to your question of how to fix it is to simply swap the bytes around as you read them. You could do that by replacing your line that looks like
var value = dis.readInt()
with
var value = java.lang.Integer.reverseBytes(dis.readInt())
If you wanted to make that a bit more concise, you could use either the approach of implicitly adding readXLE() methods to DataInput or you could override DataInputStream to have readXLE() methods. Unfortunately, the Java authors decided that the readX() methods should be final, so we can't override those to provide a transparent reader for little endian files.
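import java.io.{DataInput, DataInputStream, InputStream}
import scala.language.implicitConversions
object LittleEndianImplicits {
implicit def dataInputToLittleEndianWrapper(d: DataInput): DataInputLittleEndianWrapper = new DataInputLittleEndianWrapper(d)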
class DataInputLittleEndianWrapper(d: DataInput) {
def readLongLE(): Long = java.lang.Long.reverseBytes(d.readLong())
def readIntLE(): Int = java.lang.Integer.reverseBytes(d.readInt())
def readCharLE(): Char = java.lang.Character.reverseBytes(d.readChar())
def readShortLE(): Short = java.lang.Short.reverseBytes(d.readShort())
}
}
class LittleEndianDataInputStream(i: InputStream) extends DataInputStream(i) {
def readLongLE(): Long = java.lang.Long.reverseBytes(super.readLong())
def readIntLE(): Int = java.lang.Integer.reverseBytes(super.readInt())
def readCharLE(): Char = java.lang.Character.reverseBytes(super.readChar())
def readShortLE(): Short = java.lang.Short.reverseBytes(super.readShort())
}
object M {
def main(a: Array[String]): Unit = {
println("// Regular DIS")
val d = new DataInputStream(new java.io.FileInputStream("endian.bin"))
println("Int 1: " + d.readInt())
println("Int 2: " + d.readInt())
println("// Little Endian DIS")
val e = new LittleEndianDataInputStream(new java.io.FileInputStream("endian.bin"))
println("Int 1: " + e.readIntLE())
println("Int 2: " + e.readIntLE())
import LittleEndianImplicits._
println("// Regular DIS with readIntLE implicit")
val f = new DataInputStream(new java.io.FileInputStream("endian.bin"))
println("Int 1: " + f.readIntLE())
println("Int 2: " + f.readIntLE())
}
}
The "endian.bin" file mentioned above contains a big endian 1 followed by a little endian 1. Running the above M.main() prints:
// Regular DIS
Int 1: 1
Int 2: 16777216
// Little Endian DIS
Int 1: 16777216
Int 2: 1
// Regular DIS with readIntLE implicit
Int 1: 16777216
Int 2: 1
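Since the question asks about reading in chunks, another option is to let java.nio do the byte swapping: a ByteBuffer can be told to interpret its contents as little endian. The following is a rough sketch, not the approach from the answer above; the object name LittleEndianChunks, the method readInts and the default chunk size are assumptions made for illustration.

```scala
import java.nio.{ByteBuffer, ByteOrder}
import java.nio.channels.FileChannel
import java.nio.file.{Path, StandardOpenOption}

object LittleEndianChunks {
  // Reads every complete 4-byte little-endian Int from `path`,
  // filling a buffer of `chunkInts` values at a time.
  def readInts(path: Path, chunkInts: Int = 1024): Vector[Int] = {
    val channel = FileChannel.open(path, StandardOpenOption.READ)
    try {
      val buf = ByteBuffer.allocate(chunkInts * 4).order(ByteOrder.LITTLE_ENDIAN)
      val out = Vector.newBuilder[Int]
      while (channel.read(buf) != -1) {
        buf.flip()                                  // switch from filling to draining
        while (buf.remaining() >= 4) out += buf.getInt()
        buf.compact()                               // keep any partial Int for the next read
      }
      out.result()
    } finally channel.close()
  }
}
```

Trailing bytes that do not form a complete Int are silently ignored in this sketch; real code would probably want to report them.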
I have two case classes: AddSmall and AddBig.
AddSmall contains only one field.
AddBig contains several fields.
case class AddSmall(set: Set[Int] = Set.empty[Int]) {
def add(e: Int) = copy(set + e)
}
case class AddBig(set: Set[Int] = Set.empty[Int]) extends Foo {
def add(e: Int) = copy(set + e)
}
trait Foo {
val a = "a"; val b = "b"; val c = "c"; val d = "d"; val e = "e"
val f = "f"; val g = "g"; val h = "h"; val i = "i"; val j = "j"
val k = "k"; val l = "l"; val m = "m"; val n = "n"; val o = "o"
val p = "p"; val q = "q"; val r = "r"; val s = "s"; val t = "t"
}
A quick benchmark using JMH shows that copying AddBig objects is far more expensive, even if I change only one field.
import java.util.concurrent.TimeUnit
import org.openjdk.jmh.annotations._
@State(Scope.Benchmark)
class AddState {
var elem: Int = _
var addSmall: AddSmall = _
var addBig: AddBig = _
@Setup(Level.Trial)
def setup(): Unit = {
addSmall = AddSmall()
addBig = AddBig()
elem = 1
}
}
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@BenchmarkMode(Array(Mode.Throughput))
class SetBenchmark {
@Benchmark
def addSmall(state: AddState): AddSmall = {
state.addSmall.add(state.elem)
}
@Benchmark
def addBig(state: AddState): AddBig = {
state.addBig.add(state.elem)
}
}
And the results show that copying AddBig is more than 10 times slower than copying AddSmall!
> jmh:run -i 5 -wi 5 -f1 -t1
[info] Benchmark Mode Cnt Score Error Units
[info] LocalBenchmarks.Set.SetBenchmark.addBig thrpt 5 10732.569 ± 349.577 ops/ms
[info] LocalBenchmarks.Set.SetBenchmark.addSmall thrpt 5 126711.722 ± 10538.611 ops/ms
How come copying the object is so much slower for AddBig?
As far as I understand structural sharing, since all fields are immutable, copying the object should be very efficient, as it only needs to store the changes (the "delta"), which in this case is only the set, and should thus give the same performance as AddSmall.
EDIT: The same performance issue arises when the state is part of the case class.
case class AddBig(set: Set[Int] = Set.empty[Int], a: String = "a", b: String = "b", ...) {
def add(e: Int) = copy(set + e)
}
I guess that this is because the AddBig class extends the Foo trait, which has all those String fields, a to t. It seems that in the resulting object they are declared as regular instance fields rather than as static fields (to compare with Java), so allocating and initializing memory for them on every copy might be the root cause of the slower copy performance.
UPDATE:
In order to verify this theory you can try to use JOL (Java Object Layout) tool - openjdk.java.net/projects/code-tools/jol
Here is the simple code example:
import org.openjdk.jol.info.{ClassLayout, GraphLayout}
println(ClassLayout.parseClass(classOf[AddSmall]).toPrintable())
println(ClassLayout.parseClass(classOf[AddBig]).toPrintable())
println(GraphLayout.parseInstance(AddSmall()).toPrintable)
println(GraphLayout.parseInstance(AddBig()).toPrintable)
Which in my case produced the following output (shortened for readability):
example.AddSmall object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 12 (object header) N/A
12 4 scala.collection.immutable.Set AddSmall.set N/A
Instance size: 16 bytes
Space losses: 0 bytes internal + 0 bytes external = 0 bytes total
example.AddBig object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 12 (object header) N/A
12 4 scala.collection.immutable.Set AddBig.set N/A
16 4 java.lang.String AddBig.a N/A
20 4 java.lang.String AddBig.b N/A
24 4 java.lang.String AddBig.c N/A
Instance size: 96 bytes
Space losses: 0 bytes internal + 0 bytes external = 0 bytes total
example.AddSmall@ea1a8d5d object externals:
ADDRESS SIZE TYPE PATH VALUE
770940b28 16 example.AddSmall (object)
770940b38 470456 (something else) (somewhere else) (something else)
7709b38f0 16 scala.collection.immutable.Set$EmptySet$ .set (object)
example.AddBig@480bdb19d object externals:
ADDRESS SIZE TYPE PATH VALUE
770143658 24 java.lang.String .h (object)
770143670 24 [C .h.value [h]
770143688 15536 (something else) (somewhere else) (something else)
770147338 24 java.lang.String .m (object)
770147350 24 [C .m.value [m]
770147368 1104264 (something else) (somewhere else) (something else)
770254cf0 24 java.lang.String .r (object)
770254d08 24 [C .r.value [r]
770254d20 7140768 (something else) (somewhere else) (something else)
7709242c0 24 java.lang.String .a (object)
So, as you can see, the fields from the parent trait become class fields as well, so they are copied along with the object.
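One way to see the cost directly: copy calls the case class constructor, and the constructor re-runs the trait body, so every val in the trait is assigned again for each copy. The following sketch instruments a cut-down Foo with a counter; the counter fooInitRuns and the object name CopyInitDemo are hypothetical additions, not part of the benchmark above.

```scala
object CopyInitDemo {
  // Counts how often the trait body runs (instrumentation added for this sketch).
  var fooInitRuns = 0

  trait Foo {
    fooInitRuns += 1
    val a = "a"; val b = "b"; val c = "c"
  }

  case class AddBig(set: Set[Int] = Set.empty[Int]) extends Foo {
    def add(e: Int): AddBig = copy(set + e)
  }

  def run(): (Int, Boolean) = {
    fooInitRuns = 0
    val first = AddBig()        // trait body runs once
    val second = first.add(1)   // copy goes through the constructor, so it runs again
    // The String objects are shared; only the field slots are written again.
    (fooInitRuns, first.a eq second.a)
  }
}
```

So the strings themselves are shared between the copies, but each copy still allocates a larger object and stores into every field.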
Hope this helps!
Have you checked this question?
scala case class copy implementation
You can inspect the compiler-generated code to dig into this. It is likely that these vals became regular fields of the case class and are copied each time the class is copied.
Your Foo trait adds 20 members to every subclass even though they are constants. This is going to use more memory and make copying the class slower.
Consider
1) Making them def rather than val so they are no longer data members
OR
2) Moving them into a companion object for the trait and accessing them as Foo.a etc.
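A small sketch of option 1, using reflection to count the instance fields the JVM class ends up with. The names WithVals, WithDefs, Heavy, Light and instanceFields are made up for this illustration, and only three members are used instead of twenty.

```scala
object FieldCountDemo {
  trait WithVals { val a = "a"; val b = "b"; val c = "c" }
  trait WithDefs { def a = "a"; def b = "b"; def c = "c" }

  case class Heavy(set: Set[Int] = Set.empty[Int]) extends WithVals
  case class Light(set: Set[Int] = Set.empty[Int]) extends WithDefs

  // Number of instance (non-static) fields the class itself declares.
  def instanceFields(cls: Class[_]): Int =
    cls.getDeclaredFields.count(f => !java.lang.reflect.Modifier.isStatic(f.getModifiers))
}
```

Comparing FieldCountDemo.instanceFields(classOf[FieldCountDemo.Heavy]) with the same call for Light should show that the def-based variant declares fewer fields, which is what keeps the object small and the copy cheap.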
I'm trying to implement Huffman compression. After encoding the char into 0 and 1, how do I write it to a file so it'll be compressed? Obviously simply writing the characters 0,1 will only make the file larger.
Let's say I have a string "0101010" which represents bits.
I wish to write the string into file but as in binary, i.e. not the char '0' but the bit 0.
I tried using the getBytesArray() on the string but it seems to make no difference rather than simply writing the string.
Although Sarvesh Kumar Singh's code will probably work, it looks so Javish to me that I think this question needs one more answer. The way I imagine Huffman coding being implemented in Scala is something like this:
import scala.collection._
import scala.io.Source
type Bit = Boolean
type BitStream = Iterator[Bit]
type BitArray = Array[Bit]
type ByteStream = Iterator[Byte]
type CharStream = Iterator[Char]
case class EncodingMapping(charMapping: Map[Char, BitArray], eofCharMapping: BitArray)
def buildMapping(src: CharStream): EncodingMapping = {
def accumulateStats(src: CharStream): Map[Char, Int] = ???
def buildMappingImpl(stats: Map[Char, Int]): EncodingMapping = ???
val stats = accumulateStats(src)
buildMappingImpl(stats)
}
def convertBitsToBytes(bits: BitStream): ByteStream = {
bits.grouped(8).map(bits => {
val res = bits.foldLeft((0.toByte, 0))((acc, bit) => ((acc._1 * 2 + (if (bit) 1 else 0)).toByte, acc._2 + 1))
// last byte might be less than 8 bits
if (res._2 == 8)
res._1
else
(res._1 << (8 - res._2)).toByte
})
}
def encodeImpl(src: CharStream, mapping: EncodingMapping): ByteStream = {
val mainData = src.flatMap(ch => mapping.charMapping(ch))
val fullData = mainData ++ mapping.eofCharMapping
convertBitsToBytes(fullData)
}
// can be used to encode String as src. Thanks to StringLike/StringOps extension
def encode(src: Iterable[Char]): (EncodingMapping, ByteStream) = {
val mapping = buildMapping(src.iterator)
val encoded = encodeImpl(src.iterator, mapping)
(mapping, encoded)
}
def wrapClose[A <: java.io.Closeable, B](resource: A)(op: A => B): B = {
try {
op(resource)
}
finally {
resource.close()
}
}
def encodeFile(fileName: String): (EncodingMapping, ByteStream) = {
// note in real life you probably want to specify file encoding as well
val mapping = wrapClose(Source.fromFile(fileName))(buildMapping)
val encoded = wrapClose(Source.fromFile(fileName))(file => encodeImpl(file, mapping))
(mapping, encoded)
}
Here, accumulateStats counts how often each Char occurs in src, and buildMappingImpl (the main part of Huffman encoding) first builds a tree from those stats and then uses that tree to create a fixed EncodingMapping. eofCharMapping is the mapping for the pseudo-EOF char mentioned in one of the comments. Note that the high-level encode methods return both the EncodingMapping and the ByteStream, because in any real-life scenario you want to save both.
The piece of logic specifically being asked about is in the convertBitsToBytes method. Note that I use Boolean rather than Char to represent a single bit, and thus Iterator[Bit] (effectively Iterator[Boolean]) rather than String to represent a sequence of bits. The implementation is based on the grouped method, which converts a BitStream into a stream of Bits grouped into byte-sized groups (except possibly the last one).
IMHO the main advantage of this stream-oriented approach compared to Sarvesh Kumar Singh's answer is that you don't need to load the whole file into memory at once or keep the whole encoded file in memory. Note, however, that in this case you have to read the file twice: first to build the EncodingMapping and then to apply it. Obviously, if the file is small enough, you can load it into memory first and then convert the ByteStream to an Array[Byte] with a .toArray call. But if your file is big, you can stay with the stream-based approach and save the ByteStream to a file using something like .foreach(b => out.write(b))
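As a rough, self-contained illustration of that last step, here is a sketch that packs an Iterator[Boolean] into bytes and streams them to a file. The object name BitWriter and its methods are assumptions for this example; the packing follows the same idea as convertBitsToBytes (most significant bit first, last byte padded with zero bits).

```scala
import java.io.{BufferedOutputStream, FileOutputStream}

object BitWriter {
  // Packs bits most-significant-bit first; a final partial group is
  // padded with zero bits on the right.
  def pack(bits: Iterator[Boolean]): Iterator[Byte] =
    bits.grouped(8).map { group =>
      val value = group.foldLeft(0)((acc, bit) => (acc << 1) | (if (bit) 1 else 0))
      (value << (8 - group.size)).toByte
    }

  // Streams the packed bytes to a file without holding them all in memory.
  def writeBits(bits: Iterator[Boolean], fileName: String): Unit = {
    val out = new BufferedOutputStream(new FileOutputStream(fileName))
    try pack(bits).foreach(b => out.write(b.toInt)) finally out.close()
  }
}
```

For example, BitWriter.pack("0101010".iterator.map(_ == '1')) yields a single byte with the bit pattern 01010100.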
I don't think this will help you achieve your Huffman compression goal, but in answer to your question:
string-to-value
Converting a String to the value it represents is pretty easy.
val x: Int = "10101010".foldLeft(0)(_*2 + _.asDigit)
Note: You'll have to check for formatting (only ones and zeros) and overflow (strings too long) before conversion.
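A minimal sketch of those checks, assuming it is acceptable to return None for anything that is not a string of 1 to 31 binary digits (31 being the most that fits in a non-negative Int). The names BitString and toInt are made up for this example.

```scala
object BitString {
  // None for empty input, non-binary characters, or more than 31 bits.
  def toInt(bits: String): Option[Int] =
    if (bits.nonEmpty && bits.length <= 31 && bits.forall(c => c == '0' || c == '1'))
      Some(bits.foldLeft(0)(_ * 2 + _.asDigit))
    else None
}
```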
value-to-file
There are a number of ways to write data to a file. Here's a simple one.
import java.io.{FileOutputStream, File}
val fos = new FileOutputStream(new File("output.dat"))
fos.write(x)
fos.flush()
fos.close()
Note: You'll want to catch any errors thrown.
I will specify all the required imports first,
import java.io.{ File, FileInputStream, FileOutputStream}
import java.nio.file.Paths
import scala.collection.mutable.ArrayBuffer
Now, We are going to need following smaller units to achieve this whole thing,
1 - We need to be able to convert our binary string (eg. "01010") to Array[Byte],
def binaryStringToByteArray(binaryString: String) = {
val byteBuffer = ArrayBuffer.empty[Byte]
var byteStr = ""
for (binaryChar <- binaryString) {
if (byteStr.length < 7) {
byteStr = byteStr + binaryChar
}
else {
try{
val byte = java.lang.Byte.parseByte(byteStr + binaryChar, 2)
byteBuffer += byte
byteStr = ""
}
catch {
case ex: java.lang.NumberFormatException =>
val byte = java.lang.Byte.parseByte(byteStr, 2)
byteBuffer += byte
byteStr = "" + binaryChar
}
}
}
if (!byteStr.isEmpty) {
val byte = java.lang.Byte.parseByte(byteStr, 2)
byteBuffer += byte
byteStr = ""
}
byteBuffer.toArray
}
2 - We need to be able to open the file to serve in our little play,
def openFile(filePath: String): File = {
val path = Paths.get(filePath)
val file = path.toFile
if (file.exists()) file.delete()
if (!file.exists()) file.createNewFile()
file
}
3 - We need to be able to write bytes to a file,
def writeBytesToFile(bytes: Array[Byte], file: File): Unit = {
val fos = new FileOutputStream(file)
fos.write(bytes)
fos.close()
}
4 - We need to be able to read bytes back from the file,
def readBytesFromFile(file: File): Array[Byte] = {
val fis = new FileInputStream(file)
val bytes = new Array[Byte](file.length().toInt)
fis.read(bytes)
fis.close()
bytes
}
5 - We need to be able to convert bytes back to our binaryString,
def byteArrayToBinaryString(byteArray: Array[Byte]): String = {
byteArray.map(b => b.toBinaryString).mkString("")
}
Now, we are ready to do every thing we want,
// lets say we had this binary string,
scala> val binaryString = "00101110011010101010101010101"
// binaryString: String = 00101110011010101010101010101
// Now, we need to "pad" this with a leading "1" to avoid byte related issues
scala> val paddedBinaryString = "1" + binaryString
// paddedBinaryString: String = 100101110011010101010101010101
// The file which we will use for this,
scala> val file = openFile("/tmp/a_bit")
// file: java.io.File = /tmp/a_bit
// convert our padded binary string to bytes
scala> val bytes = binaryStringToByteArray(paddedBinaryString)
// bytes: Array[Byte] = Array(75, 77, 85, 85)
// write the bytes to our file,
scala> writeBytesToFile(bytes, file)
// read bytes back from file,
scala> val bytesFromFile = readBytesFromFile(file)
// bytesFromFile: Array[Byte] = Array(75, 77, 85, 85)
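// so now, we have a padded string back,
scala> val paddedBinaryStringFromFile = byteArrayToBinaryString(bytesFromFile)
// paddedBinaryStringFromFile: String = 1001011100110110101011010101
// note that this is NOT identical to paddedBinaryString: toBinaryString drops the leading zeros of each byte,
// and binaryStringToByteArray re-splits any group whose value does not fit in a signed Byte.
// removing the "1" from the front therefore does not give back the original binaryString,
scala> val binaryStringFromFile = paddedBinaryStringFromFile.tail
// binaryStringFromFile: String = 001011100110110101011010101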
NOTE :: you may have to make few changes if you want to deal with very large "binary strings" (more than few millions of characters long) to improve performance or even be usable. For example - You will need to start using Streams or Iterators instead of Array[Byte].
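If an exact round trip of the bit string matters, one option is to store the number of bits explicitly and pack exactly 8 bits per byte. The following is only a sketch of that idea, not the code from this answer: the layout (a 4-byte bit count followed by the packed bits) and the names BitCodec, encode and decode are assumptions made here.

```scala
import java.nio.ByteBuffer

object BitCodec {
  // Layout assumed by this sketch: 4-byte big-endian bit count,
  // then the bits packed 8 per byte, most significant bit first.
  def encode(bits: String): Array[Byte] = {
    require(bits.forall(c => c == '0' || c == '1'), "only 0 and 1 allowed")
    val payload = bits.grouped(8)
      .map(g => Integer.parseInt(g.padTo(8, '0'), 2).toByte) // pad the last group with zeros
      .toArray
    ByteBuffer.allocate(4 + payload.length).putInt(bits.length).put(payload).array()
  }

  def decode(bytes: Array[Byte]): String = {
    val buf = ByteBuffer.wrap(bytes)
    val bitCount = buf.getInt()
    val sb = new StringBuilder
    while (buf.hasRemaining) {
      val unsigned = buf.get() & 0xff
      // Left-pad to 8 digits so leading zeros survive.
      sb ++= ("0" * 8 + Integer.toBinaryString(unsigned)).takeRight(8)
    }
    sb.toString.take(bitCount) // drop the padding bits of the last byte
  }
}
```

The resulting array can be written and read with writeBytesToFile and readBytesFromFile as before; no leading "1" is needed because the bit count removes the ambiguity.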
Ok, so basically, if I am in the console (Intellij) and I type FileScramble.getRandomPW, I get an ASCII password. But if I run the command in the code, I don't. Instead, I get "org.jasypt.exceptions.EncryptionInitializationException: InvalidKeySpecException: Password is not ASCII."
Here is a screen shot of what I mean.
The fact that I've been up and down that block of code so many times leads me to believe that I'm missing something fundamental in the scala language. The try-catch of the getRandomPW block is never triggered. And, like I said, if I call it from the console, I get only ASCII.
The program is just going to scramble the contents of a file before deletion. It's by no means secure -- it's an exercise. It's me getting familiar with 1) scala, 2) encryption, and 3) sbt.
So here is the relevant code:
import java.io.{BufferedOutputStream, File, FileOutputStream, InputStream}
import java.nio.ByteBuffer
import java.security.SecureRandom
import org.jasypt.util.binary.BasicBinaryEncryptor
object FileScramble {
val base64chars = ('a' to 'z').union('A' to 'Z').union(0 to 9).union(List('/', '+'))
def byteArrayToBase64(x: java.nio.ByteBuffer) : String = {
// convert to string and filter out anything but base64chars
val nowString = new String(x.array.takeWhile(_ != 0), "UTF-8")
nowString.filter(base64chars.contains(_))
}
def writeBytes( data : Stream[Byte], file : File ) = {
val target = new BufferedOutputStream( new FileOutputStream(file) );
try data.foreach( target.write(_) ) finally target.close;
}
def getRandomPW : String = {
try {
var output : String = ""
while (output.length() < 10) {
// val r = scala.util.Random
val r = SecureRandom.getInstance("SHA1PRNG")
var bytePW : Array[Byte] = new Array[Byte](1000)
r.nextBytes(bytePW)
// get 1000 random bytes into a ByteBuffer
val preString = ByteBuffer.allocate(1000).put(bytePW)
// get a random base 64 password at least 10 chars long
output = byteArrayToBase64(preString)
}
output
}
catch {
case e : Exception => e.getMessage()
}
}
def main( args: Array[String] ): Unit = {
val fileHandle = new java.io.File(args(0))
// https://github.com/liufengyun/scala-bug
val source = scala.io.Source.fromFile(fileHandle, "ISO-8859-1")
// source = new MyInputStream(dataStream)
val byteArray = source.map(_.toByte).toArray
// val byteStream = source.map(_.toByte).toStream
source.close()
var binaryEncryptor = new BasicBinaryEncryptor();
val pw = getRandomPW
println("BEGIN: " + pw + ":END")
binaryEncryptor.setPassword(pw);
val encryptedOut = binaryEncryptor.encrypt(byteArray).toStream
writeBytes(encryptedOut, fileHandle)
}
}
Honestly, I've been up and down the block for a few hours and have not come up with any ideas as to what could be happening. It's by far the biggest head-scratcher I've had recently, to the point that I've asked SO a question for the first time in several years.
Your help is appreciated! I thank you in advance, whether you can help or not.
You have only one small, elusive mistake: when you're adding the numeric characters 0 - 9, you should use union('0' to '9') instead of union(0 to 9). Otherwise you're adding the Ints 0 - 9, which match the control characters with code points 0 - 9 rather than the digit characters, and thus you get the (justifiable) exception.
@TzachZohar has it exactly right.
What you might also consider, though, is letting the compiler help you out a bit more by adding your expected type.
val base64anys: Seq[Char] = ('a' to 'z').union('A' to 'Z').union(0 to 9).union(List('/', '+'))
does not compile. So you would have seen the error.
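Putting both suggestions together, a sketch of the corrected character set with an explicit type could look like the following. It also shows an alternative that is not in the question: letting the JDK's java.util.Base64 encoder produce the password directly, which avoids filtering random bytes altogether. The object name RandomPassword, the method generate and the default of 12 bytes are made up for this example.

```scala
import java.security.SecureRandom
import java.util.Base64

object RandomPassword {
  // With the explicit element type, mixing in `0 to 9` would be a compile error.
  val base64chars: Seq[Char] =
    ('a' to 'z') ++ ('A' to 'Z') ++ ('0' to '9') ++ Seq('/', '+')

  private val random = new SecureRandom()

  // Encodes `numBytes` random bytes with the standard Base64 alphabet, without '=' padding.
  def generate(numBytes: Int = 12): String = {
    val bytes = new Array[Byte](numBytes)
    random.nextBytes(bytes)
    Base64.getEncoder.withoutPadding.encodeToString(bytes)
  }
}
```

With the default of 12 bytes the result is 16 characters long, all of them printable ASCII.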
So I have an association that associates a pair of Ints with a Vector[Long] that can be up to size 10000, and I have anywhere from several hundred thousand to a million of such data. I would like to store this in a single file for later processing in Scala.
Clearly storing this in a plain-text format would take way too much space, so I've been trying to figure out how to do it by writing a Byte stream. However I'm not too sure if this will work since it seems to me that the byteValue() of a Long returns the Byte representation which is still 4 bytes long, and hence I won't save any space? I do not have much experience working with binary formats.
It seems the Scala standard library had a BytePickle that might have been what I was looking for, but has since been deprecated?
An arbitrary Long is about 19.5 ASCII digits long, but only 8 bytes long, so you'll gain a savings of a factor of ~2 if you write it in binary. Now, it may be that most of the values are not actually taking all 8 bytes, in which case you could define some compression scheme yourself.
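A quick check of those numbers, and of what byteValue() actually does: it does not return a byte representation of the whole Long, it truncates the value to its lowest 8 bits. The object name LongSizes is only for illustration.

```scala
object LongSizes {
  // Decimal digits needed for the largest Long.
  val decimalDigits: Int = Long.MaxValue.toString.length

  // Bytes needed for any Long in binary form.
  val binaryBytes: Int = java.lang.Long.BYTES

  // byteValue() keeps only the lowest 8 bits: 300 = 0x12C, so the result is 0x2C = 44.
  val truncated: Byte = java.lang.Long.valueOf(300L).byteValue()
}
```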
In any case, you are probably best off writing block data using java.nio.ByteBuffer and friends. Binary data is most efficiently read in blocks, and you might want your file to be randomly accessible, in which case you want your data to look something like so:
<some unique binary header that lets you check the file type>
<int saying how many records you have>
<offset of the first record>
<offset of the second record>
...
<offset of the last record>
<int><int><length of vector><long><long>...<long>
<int><int><length of vector><long><long>...<long>
...
<int><int><length of vector><long><long>...<long>
This is a particularly convenient format for reading and writing using ByteBuffer because you know in advance how big everything is going to be. So you can
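import java.io.FileOutputStream
import java.nio.ByteBuffer
val fos = new FileOutputStream(myFileName)
val fc = fos.getChannel // java.nio.channels.FileChannel
val header = ByteBuffer.allocate(28)
header.put("This is my cool header!!".getBytes) // 24 bytes
header.putInt(data.length) // 4 bytes
header.flip() // switch the buffer from filling to draining before writing it
fc.write(header)
val offsets = ByteBuffer.allocate(8*data.length)
// each record takes 12 bytes (three ints) plus 8 bytes per long
data.foldLeft(28L+8*data.length){ (n,d) =>
  offsets.putLong(n)
  n + 12 + d.vector.length*8
}
offsets.flip()
fc.write(offsets)
...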
and on the way back in
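import java.io.FileInputStream
import java.nio.ByteBuffer
val fis = new FileInputStream(myFileName)
val fc = fis.getChannel
val header = ByteBuffer.allocate(28)
fc.read(header) // a single read may return fewer bytes; loop until the buffer is full in real code
header.flip() // prepare the buffer for reading what was just filled
val hbytes = new Array[Byte](24)
header.get(hbytes)
if (new String(hbytes) != "This is my cool header!!") ???
val nrec = header.getInt
val offsets = ByteBuffer.allocate(8*nrec)
fc.read(offsets)
offsets.flip()
val offsetArray = offsets.getLongs(nrec) // See below!
...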
There are some handy methods on ByteBuffer that are absent, but you can add them on with implicits (here for Scala 2.10; with 2.9 make it a plain class, drop the extends AnyVal, and supply an implicit conversion from ByteBuffer to RichByteBuffer):
implicit class RichByteBuffer(val b: java.nio.ByteBuffer) extends AnyVal {
def getBytes(n: Int) = { val a = new Array[Byte](n); b.get(a); a }
def getShorts(n: Int) = { val a = new Array[Short](n); var i=0; while (i<n) { a(i)=b.getShort(); i+=1 } ; a }
def getInts(n: Int) = { val a = new Array[Int](n); var i=0; while (i<n) { a(i)=b.getInt(); i+=1 } ; a }
def getLongs(n: Int) = { val a = new Array[Long](n); var i=0; while (i<n) { a(i)=b.getLong(); i+=1 } ; a }
def getFloats(n: Int) = { val a = new Array[Float](n); var i=0; while (i<n) { a(i)=b.getFloat(); i+=1 } ; a }
def getDoubles(n: Int) = { val a = new Array[Double](n); var i=0; while (i<n) { a(i)=b.getDouble(); i+=1 } ; a }
}
Anyway, the reason to do things this way is that you'll end up with decent performance, which is also a consideration when you have tens of gigabytes of data (which it sounds like you have given hundreds of thousands of vectors of length up to ten thousand).
If your problem is actually much smaller, then don't worry so much about it--pack it into XML or use JSON or some custom text solution (or use DataOutputStream and DataInputStream, which don't perform as well and won't give you random access).
If your problem is actually bigger, you can define two lists of longs; first, the ones that will fit in an Int, say, and then the ones that actually need a full Long (with indices so you know where they are). Data compression is a very case-specific task--assuming you don't just want to use java.util.zip--so without a lot more knowledge about what the data looks like, it's hard to know what to recommend beyond just storing it as a weakly hierarchical binary file as I've described above.
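To show how the pieces of the layout described above fit together, here is a compact end-to-end sketch: it writes the header, the record count, the offset table and the records, and then reads one record back by seeking through the offset table. The names RecordFile, Record, write, readRecord and readFully are assumptions for this example, and for brevity it builds the whole file in one in-memory buffer, which would not be appropriate for tens of gigabytes.

```scala
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.nio.charset.StandardCharsets.US_ASCII
import java.nio.file.{Path, StandardOpenOption}

object RecordFile {
  final case class Record(a: Int, b: Int, vector: Vector[Long])

  private val Magic = "This is my cool header!!".getBytes(US_ASCII) // 24 bytes
  private val HeaderSize = Magic.length + 4                         // magic + record count

  def write(path: Path, data: Seq[Record]): Unit = {
    val sizes = data.map(r => 12 + 8 * r.vector.length) // three ints + the longs
    val buf = ByteBuffer.allocate(HeaderSize + 8 * data.length + sizes.sum)
    buf.put(Magic).putInt(data.length)
    var offset = (HeaderSize + 8 * data.length).toLong  // first record starts after the offset table
    sizes.foreach { size => buf.putLong(offset); offset += size }
    data.foreach { r =>
      buf.putInt(r.a).putInt(r.b).putInt(r.vector.length)
      r.vector.foreach(v => buf.putLong(v))
    }
    buf.flip()
    val ch = FileChannel.open(path, StandardOpenOption.CREATE,
      StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)
    try while (buf.hasRemaining) ch.write(buf) finally ch.close()
  }

  // Random access: only the header, one offset and one record are read.
  def readRecord(path: Path, index: Int): Record = {
    val ch = FileChannel.open(path, StandardOpenOption.READ)
    try {
      val header = readFully(ch, 0L, HeaderSize)
      val magic = new Array[Byte](Magic.length)
      header.get(magic)
      require(java.util.Arrays.equals(magic, Magic), "unexpected file type")
      val count = header.getInt()
      require(index >= 0 && index < count, "no such record")
      val offset = readFully(ch, HeaderSize + 8L * index, 8).getLong()
      val fixed = readFully(ch, offset, 12)
      val a = fixed.getInt(); val b = fixed.getInt(); val len = fixed.getInt()
      val body = readFully(ch, offset + 12, 8 * len)
      Record(a, b, Vector.fill(len)(body.getLong()))
    } finally ch.close()
  }

  // Reads exactly `size` bytes starting at `position`, looping over short reads.
  private def readFully(ch: FileChannel, position: Long, size: Int): ByteBuffer = {
    val buf = ByteBuffer.allocate(size)
    var pos = position
    while (buf.hasRemaining) {
      val n = ch.read(buf, pos)
      if (n < 0) throw new java.io.EOFException("file truncated")
      pos += n
    }
    buf.flip()
    buf
  }
}
```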
See Java's DataOutputStream. It allows easy and efficient writing of primitive types and Strings to byte streams. In particular, you want something like:
val stream = new DataOutputStream(new FileOutputStream("your_file.bin"))
You can then use the equivalent DataInputStream methods to read from that file to variables again.
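A small round-trip sketch of that approach for a single association. The record layout (two Ints, the vector length, then the Longs) and the names DataStreamDemo, write and read are assumptions made for this example.

```scala
import java.io.{BufferedInputStream, BufferedOutputStream, DataInputStream,
  DataOutputStream, File, FileInputStream, FileOutputStream}

object DataStreamDemo {
  // One record: the (Int, Int) key, the vector length, then the longs.
  def write(file: File, key: (Int, Int), values: Vector[Long]): Unit = {
    val out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(file)))
    try {
      out.writeInt(key._1)
      out.writeInt(key._2)
      out.writeInt(values.length)
      values.foreach(v => out.writeLong(v))
    } finally out.close()
  }

  def read(file: File): ((Int, Int), Vector[Long]) = {
    val in = new DataInputStream(new BufferedInputStream(new FileInputStream(file)))
    try {
      val key = (in.readInt(), in.readInt())
      val length = in.readInt()
      (key, Vector.fill(length)(in.readLong()))
    } finally in.close()
  }
}
```

Each Long takes exactly 8 bytes on disk this way, regardless of its value.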
I used scala-io and scala-arm to write a binary stream of Longs. The libraries themselves are supposed to be the Scala way of doing things, but they are not in the Scala master branch; maybe someone knows why? I use them from time to time.
1) Clone scala-io:
git clone https://github.com/scala-incubator/scala-io.git
Go to scala-io/package and change in Build.scala, val scalaVersion to yours
sbt package
2) Clone scala-arm:
git clone https://github.com/jsuereth/scala-arm.git
Go to scala-arm/package and change in build.scala, scalaVersion := to yours
sbt package
3) Copy somewhere not too far:
scala-io/core/target/scala-xxx/scala-io-core_xxx-0.5.0-SNAPSHOT.jar
scala-io/file/target/scala-xxx/scala-io-file_xxx-0.5.0-SNAPSHOT.jar
scala-arm/target/scala-xxx/scala-arm_xxx-1.3-SNAPSHOT.jar
4) Start REPL:
scala -classpath "/opt/scala-io/scala-io-core_2.10-0.5.0-SNAPSHOT.jar:
/opt/scala-io/scala-io-file_2.10-0.5.0-SNAPSHOT.jar:
/opt/scala-arm/scala-arm_2.10-1.3-SNAPSHOT.jar"
5) :paste actual code:
import scalax.io._
// create data stream
val EOData = Vector(0xffffffffffffffffL)
val data = List(
(0, Vector(0L,1L,2L,3L))
,(1, Vector(4L,5L))
,(2, Vector(6L,7L,8L))
,(3, Vector(9L))
)
var it = Iterator[Long]()
for (rec <- data) {
it = it ++ Vector(rec._1).iterator.map(_.toLong)
it = it ++ rec._2.iterator
it = it ++ EOData.iterator
}
// write data at once
val out: Output = Resource.fromFile("/tmp/data")
out.write(it)(OutputConverter.TraversableLongConverter)
I want to take input from the user. Can you please tell me how to ask for user input as a string in Scala?
In Scala 2.11 use
scala.io.StdIn.readLine()
instead of the deprecated Console.readLine.
Here is a standard way to read Integer values
val a = scala.io.StdIn.readInt()
println("The value of a is " + a)
similarly
def readBoolean(): Boolean
Reads a Boolean value from an entire line from stdin.
def readByte(): Byte
Reads a Byte value from an entire line from stdin.
def readChar(): Char
Reads a Char value from an entire line from stdin.
def readDouble(): Double
Reads a Double value from an entire line from stdin.
def readFloat(): Float
Reads a Float value from an entire line from stdin.
def readInt(): Int
Reads an Int value from an entire line from stdin.
def readLine(text: String, args: Any*): String
Prints formatted text to stdout and reads a full line from stdin.
def readLine(): String
Reads a full line from stdin.
def readLong(): Long
Reads a Long value from an entire line from stdin.
def readShort(): Short
Reads a Short value from an entire line from stdin.
def readf(format: String): List[Any]
Reads in structured input from stdin as specified by the format specifier.
def readf1(format: String): Any
Reads in structured input from stdin as specified by the format specifier, returning
only the first value extracted, according to the format specification.
def readf2(format: String): (Any, Any)
Reads in structured input from stdin as specified by the format specifier, returning
only the first two values extracted, according to the format specification.
def readf3(format: String): (Any, Any, Any)
Reads in structured input from stdin as specified by the format specifier, returning
only the first three values extracted, according to the format specification.
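One thing to keep in mind with the typed readers such as readInt is that they throw on malformed input (a NumberFormatException) and at end of input (an EOFException). If you would rather get an Option back, a small wrapper around readLine is enough. This is a sketch; the names SafeInput and readIntOption are made up here.

```scala
import scala.io.StdIn
import scala.util.Try

object SafeInput {
  // None at end of input (readLine returns null) or if the line is not a valid Int.
  def readIntOption(): Option[Int] =
    Option(StdIn.readLine()).flatMap(line => Try(line.trim.toInt).toOption)
}
```

Because StdIn reads from Console.in, this can be tried without typing anything by redirecting the input, for example Console.withIn(new java.io.StringReader("42\n")) { SafeInput.readIntOption() }.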
Similarly if you want to read multiple user inputs from the same line ex: name, age, weight you can use the Scanner object
import java.util.Scanner
// simulated input
val input = "Joe 33 200.0"
val line = new Scanner(input)
val name = line.next
val age = line.nextInt
val weight = line.nextDouble
abridged from Scala Cookbook: Recipes for Object-Oriented and Functional Programming by Alvin Alexander
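If you prefer to stay with plain Scala instead of Scanner, the same line can be split on whitespace and pattern matched. This is an alternative sketch, not taken from the book; Person, LineFields and parse are hypothetical names.

```scala
object LineFields {
  final case class Person(name: String, age: Int, weight: Double)

  // Parses "name age weight" separated by whitespace;
  // None if the number of fields or the numeric formats are wrong.
  def parse(line: String): Option[Person] =
    line.trim.split("\\s+") match {
      case Array(name, age, weight) =>
        scala.util.Try(Person(name, age.toInt, weight.toDouble)).toOption
      case _ => None
    }
}
```

In a real program the line would come from scala.io.StdIn.readLine() rather than a literal.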
From the Scala mailing list (formatting and links were updated):
Short answer:
readInt
Long answer:
If you want to read from the terminal, check out Console.scala.
You can use these functions like so:
Console.readInt
Also, for your convenience, Predef.scala
automatically defines some shortcuts to functions in Console. Since
stuff in Predef is always and everywhere imported automatically, you
can use them like so:
readInt
object InputTest extends App{
println("Type something : ")
val input = scala.io.StdIn.readLine()
println("Did you type this ? " + input)
}
This way you can ask for input.
scala.io.StdIn.readLine()
You can take a user String input using readLine().
import scala.io.StdIn._
object q1 {
def main(args:Array[String]):Unit={
println("Enter your name : ")
val a = readLine()
println("My name is : "+a)
}
}
Or you can use the Scanner class to take user input.
import java.util.Scanner;
object q1 {
def main(args:Array[String]):Unit={
val scanner = new Scanner(System.in)
println("Enter your name : ")
val a = scanner.nextLine()
println("My name is : "+a)
}
}
Simple Example for Reading Input from User
val scanner = new java.util.Scanner(System.in)
scala> println("What is your name")
What is your name
scala> val name = scanner.nextLine()
name: String = VIRAJ
scala> println(s"My Name is $name")
My Name is VIRAJ
We can also use readLine:
scala> val name = readLine("What is your name ")
What is your name name: String = Viraj
In Scala 2:
import java.io._
object Test {
// Read user input, output
def main(args: Array[String]): Unit = {
// create a file writer
var writer = new PrintWriter(new File("output.txt"))
// read an int from standard input
print("Enter the number of lines to read in: ")
val x: Int = scala.io.StdIn.readLine.toInt
// read in x number of lines from standard input
var i=0
while (i < x) {
var str: String = scala.io.StdIn.readLine
writer.write(str + "\n")
i = i + 1
}
// close the writer
writer.close
}
}
This code gets input from user and outputs it:
[input] Enter the number of lines to read in: 2
one
two
[output] output.txt
one
two
Using a thread to poll the input-readLine:
// keystop1.sc
// In Scala- or SBT console/Quick-REPL: :load keystop1.sc
// As Script: scala -savecompiled keystop1.sc
@volatile var isRunning = true
@volatile var isPause = false
val tInput: Thread = new Thread {
override def run: Unit = {
var status = ""
while (isRunning) {
this.synchronized {
status = scala.io.StdIn.readLine()
status match {
case "s" => isRunning = false
case "p" => isPause = true
case "r" => isRunning = true;isPause = false
case _ => isRunning = false;isPause = false
}
println(s"New status is: $status")
}
}
}
}
tInput.start
var count = 0
var pauseCount = 0
while (isRunning && count < 10){
println(s"still running long lasting job! $count")
if (count % 3 == 0) println("(Please press [each + ENTER]: s to stop, p to pause, r to run again!)")
count += 1
Thread sleep(2000) // simulating heavy computation
while (isPause){
println(s"Taking a break ... $pauseCount")
Thread sleep(1000)
pauseCount += 1
if (pauseCount >= 10){
isPause = false
pauseCount = 0
println(s"Taking a break ... timeout occurred!")
}
}
}
isRunning = false
println(s"Computation stopped, please press Enter!")
tInput.join()
println(s"Ok, thank you, good bye!")
readLine() lets you prompt the user and read their input as a String
val name = readLine("What's your name? ")
Please try this method:
scala> scala.io.StdIn.readInt()