java.lang.IllegalArgumentException when creating an HDFS file (Scala)

I have HDFS and some text, and I want to create a file containing that text. I tried the HDFS API with FSDataOutputStream but got an exception. Could you please help me resolve it?
The exception is:
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs:/user/user1, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(
at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(
at org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(
at org.apache.hadoop.fs.ChecksumFileSystem.create(
at org.apache.hadoop.fs.ChecksumFileSystem.create(
at org.apache.hadoop.fs.FileSystem.create(
at org.apache.hadoop.fs.FileSystem.create(
at org.apache.hadoop.fs.FileSystem.create(
at org.apache.hadoop.fs.FileSystem.create(
at com.example.FileBuilder$.buildFile(FileBuilder.scala:23)
The code is:
val fs = FileSystem.get(new Configuration())
val path = new Path(s"hdfs:////user/" + "fileName.sql")
val fsDataOutputStream = fs.create(path)
val outputStreamWriter = new OutputStreamWriter(fsDataOutputStream, "UTF-8")
val bufferedWriter = new BufferedWriter(outputStreamWriter)

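I think the problem is with how the file system is obtained for that path: the exception says "expected: file:///", so FileSystem.get(new Configuration()) returned the local file system, which rejects an hdfs: path. Can you test by replacing that part of your code with the following?
val configuration = new Configuration()
val fs = FileSystem.get(new URI("<url:port>"), configuration) // placeholder: the URI of your NameNode
val filePath = new Path("/user/fileName.sql")
val fsDataOutputStream = fs.create(filePath)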
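To check which file system a configuration actually resolves to, a minimal sketch along these lines may help. It assumes hadoop-client is on the classpath; the object name WhichFileSystem and the helper schemeFor are hypothetical, and a local file: URI stands in for the NameNode URI so the snippet runs without a cluster.

```scala
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem

object WhichFileSystem {
  /** Scheme of the file system that Hadoop resolves for the given URI. */
  def schemeFor(uri: URI, conf: Configuration): String =
    FileSystem.get(uri, conf).getUri.getScheme

  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    // fs.defaultFS decides what FileSystem.get(conf) returns;
    // Hadoop's built-in default (core-default.xml) is file:///
    println(s"fs.defaultFS = ${conf.get("fs.defaultFS")}")
    println(s"default file system = ${FileSystem.get(conf).getUri}")
    // Passing an explicit URI selects the file system by scheme instead.
    // Assumption: file:/// stands in for an hdfs: URI so this runs locally.
    println(s"scheme for explicit URI = ${schemeFor(new URI("file:///"), conf)}")
  }
}
```

If the second line prints a file: URI on your cluster node, the Hadoop configuration files are probably not on the application's classpath.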


Writing a file to S3 using FileSystem (Scala)

I'm using Scala and trying to write a file with string content to S3.
I tried to do that with FileSystem, but I'm getting this error:
"Wrong FS: s3a"
val content = "blabla"
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
val s3Path: Path = new Path("s3a://bucket/ha/fileTest.txt")
val localPath= new Path("/tmp/fileTest.txt")
val os = fs.create(localPath)
and I'm getting this error:
java.lang.IllegalArgumentException: Wrong FS: s3a://...txt, expected: file:///
What is wrong?
You need to ask for the specific filesystem for that scheme; then you can create a text file directly on the remote system.
val s3Path: Path = new Path("s3a://bucket/ha/fileTest.txt")
val fs = s3Path.getFileSystem(spark.sparkContext.hadoopConfiguration)
val os = fs.create(s3Path, true)
There's no need to write locally and upload; the s3a connector will buffer and upload as needed.
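A fuller sketch of that approach, writing the string through a writer and closing it, might look as follows. The helper name writeText is hypothetical, and main points at a local temp file as a stand-in for the s3a:// path (which would additionally need the hadoop-aws module and credentials).

```scala
import java.io.{BufferedWriter, OutputStreamWriter}
import java.nio.charset.StandardCharsets
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

object WriteText {
  /** Writes `content` to `path` on whichever file system the path's scheme names. */
  def writeText(path: Path, content: String, conf: Configuration): Unit = {
    val fs = path.getFileSystem(conf) // resolved from the path, not from fs.defaultFS
    val writer = new BufferedWriter(
      new OutputStreamWriter(fs.create(path, true), StandardCharsets.UTF_8)) // true = overwrite
    try writer.write(content)
    finally writer.close() // closing the writer also closes the underlying stream
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local temp file replaces the s3a:// path so this runs without S3.
    val tmp = java.nio.file.Files.createTempFile("fileTest", ".txt")
    writeText(new Path(tmp.toUri), "blabla", new Configuration())
    println(new String(java.nio.file.Files.readAllBytes(tmp), StandardCharsets.UTF_8))
  }
}
```

Closing matters for s3a in particular, since the object only becomes visible once the stream has been closed.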

Out of memory issue while using the multipart upload API of AWS S3

I am trying to do a multipart upload with the AWS SDK from Spark. The file is around 14 GB, but I get an out-of-memory error at this line: val bytes: Array[Byte] = IOUtils.toByteArray(is)
I have tried raising driver and executor memory to 100 GB and a few other Spark optimizations.
Below is the code I am using:
val tm = TransferManagerBuilder.standard.withS3Client(s3Client).build
val fs = FileSystem.get(new Configuration())
val filePath = new Path(hdfsFilePath)
val is:InputStream =
val om = new ObjectMetadata()
val bytes: Array[Byte] = IOUtils.toByteArray(is)
val byteArrayInputStream: ByteArrayInputStream = new ByteArrayInputStream(bytes)
val request = new PutObjectRequest(bucketName, keyName, byteArrayInputStream, om).withSSEAwsKeyManagementParams(new SSEAwsKeyManagementParams(kmsKey)).withCannedAcl(CannedAccessControlList.BucketOwnerFullControl)
val upload = tm.upload(request)
And this is the exception I am getting:
at com.amazonaws.util.IOUtils.toByteArray(
PutObjectRequest accepts File:
public PutObjectRequest(String bucketName, String key, File file)
Something like the following should work (I haven't checked though):
val result = TransferManagerBuilder.standard.withS3Client(s3Client).build
  .upload(
    new PutObjectRequest(bucketName, keyName, new File(hdfsFilePath)) // java.io.File: the path must be readable as a local file
      .withSSEAwsKeyManagementParams(new SSEAwsKeyManagementParams(kmsKey)))
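Whichever request type is used, the out-of-memory error comes from materialising the whole file as one array: a JVM array holds at most Int.MaxValue (about 2.1 billion) elements, so a 14 GB file cannot fit in a single Array[Byte] however large the heap is. A minimal sketch of the alternative, moving data through a fixed-size buffer, is shown here with plain java.io streams; the name copyInChunks and the 8 MB default buffer are illustrative assumptions, not part of the AWS SDK.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream, OutputStream}

object ChunkedCopy {
  /** Copies `in` to `out` through a fixed-size buffer; returns the number of bytes copied. */
  def copyInChunks(in: InputStream, out: OutputStream, bufferSize: Int = 8 * 1024 * 1024): Long = {
    val buffer = new Array[Byte](bufferSize) // the only large allocation, independent of file size
    var total = 0L
    var read = in.read(buffer)
    while (read != -1) {
      out.write(buffer, 0, read)
      total += read
      read = in.read(buffer)
    }
    total
  }

  def main(args: Array[String]): Unit = {
    // In-memory streams stand in for the HDFS input and the upload target.
    val source = new ByteArrayInputStream(Array.fill[Byte](1000)(1))
    val sink = new ByteArrayOutputStream()
    println(copyInChunks(source, sink, bufferSize = 64)) // number of bytes copied
  }
}
```

For the upload itself, the InputStream constructor of PutObjectRequest that the question already uses can take the HDFS stream directly; the SDK documentation notes that the content length should then be set on the ObjectMetadata, otherwise the stream contents are buffered to compute it.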

Cannot write a string to an HDFS file using Scala

I wrote some code to create a file in hdfs and write bytes to it. This is the code:
def write(uri: String, filePath: String, data: String): Unit = {
System.setProperty("HADOOP_USER_NAME", "hibou")
val path = new Path(filePath + "/hello.txt")
val conf = new Configuration()
conf.set("fs.defaultFS", uri)
val fs = FileSystem.get(conf)
val os = fs.create(path)
The code succeeds without error, but I only see that the file was created. When I examine its content with hdfs dfs -cat /.../hello.txt, I don't see anything. Why?
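The snippet stops right after fs.create(path), so it is not visible whether any data is written or the stream is ever closed. Assuming that is the cause, a sketch of the missing tail could look like this. The signature mirrors the question; the write, the try/finally and the read-back in main are additions, and a local temp directory with fs.defaultFS set to file:/// stands in for the HDFS URI.

```scala
import java.nio.charset.StandardCharsets
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object HdfsWrite {
  def write(uri: String, filePath: String, data: String): Unit = {
    val path = new Path(filePath + "/hello.txt")
    val conf = new Configuration()
    conf.set("fs.defaultFS", uri)
    val fs = FileSystem.get(conf)
    val os = fs.create(path)
    try os.write(data.getBytes(StandardCharsets.UTF_8)) // create() alone only makes an empty file
    finally os.close()                                  // close() flushes the data and completes the file
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local file system instead of HDFS, so the sketch runs without a cluster.
    val dir = java.nio.file.Files.createTempDirectory("hdfs-write-demo")
    write("file:///", dir.toString, "hello")
    println(java.nio.file.Files.size(dir.resolve("hello.txt"))) // size in bytes of the written file
  }
}
```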

Read data from HDFS

I'm using FSDataInputStream to read data from HDFS.
This is the snippet I'm using:
val fs = FileSystem.get(new,new Configuration())
val stream = Path(#PATH))
val reader = new BufferedReader(new InputStreamReader(stream))
val offset: String = reader.readLine() // reads the string "5432" stored in the file
Expected output is "5432".
But the actual output is "^#5^#4^#3^#2".
I am not able to trim the "^#" markers, since they are not treated as ordinary characters. Please help with an appropriate solution.
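One possible explanation, assuming the ^# markers are NUL bytes (0x00) in front of each digit: the file was written with two bytes per character (UTF-16 big-endian, which is also the layout java.io.DataOutputStream.writeChars produces), while InputStreamReader without a charset argument decodes with the platform default. The sketch below reproduces the effect with in-memory streams rather than HDFS; the helper name firstLine is hypothetical.

```scala
import java.io.{BufferedReader, ByteArrayInputStream, InputStream, InputStreamReader}
import java.nio.charset.{Charset, StandardCharsets}

object DecodeOffset {
  /** Reads the first line of `in`, decoding with an explicit charset. */
  def firstLine(in: InputStream, charset: Charset): String = {
    val reader = new BufferedReader(new InputStreamReader(in, charset))
    try reader.readLine() finally reader.close()
  }

  def main(args: Array[String]): Unit = {
    // "5432" stored as UTF-16BE: bytes 00 35 00 34 00 33 00 32
    val stored = "5432".getBytes(StandardCharsets.UTF_16BE)

    val wrong = firstLine(new ByteArrayInputStream(stored), StandardCharsets.UTF_8)
    println(wrong.length) // 8: every digit is preceded by a NUL character

    val right = firstLine(new ByteArrayInputStream(stored), StandardCharsets.UTF_16BE)
    println(right)        // 5432
  }
}
```

If the HDFS file really is UTF-16, the same fix applies to the snippet in the question: pass the charset as the second argument of InputStreamReader.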

Listing HDFS files in Scala

I am trying to list the files in an HDFS directory, but the code seems to expect a file as input when I run it:
val TestPath2="hdfs://localhost:8020/user/hdfs/QERESULTS1.csv"
val hdfs: org.apache.hadoop.fs.FileSystem = org.apache.hadoop.fs.FileSystem.get(sc.hadoopConfiguration)
val hadoopPath = new org.apache.hadoop.fs.Path(TestPath1)
val recursive = true
// val ri = hdfs.listFiles(hadoopPath, recursive)()
val ri=hdfs.listFiles(hadoopPath, true)
You should set your default filesystem to hdfs:// first. It seems like your default filesystem is file://.
val conf = sc.hadoopConfiguration
conf.set("fs.defaultFS", "hdfs://some-path")
val hdfs: org.apache.hadoop.fs.FileSystem = org.apache.hadoop.fs.FileSystem.get(conf)
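listFiles returns a Hadoop RemoteIterator rather than a Scala collection, so it has to be drained explicitly. A minimal sketch, assuming hadoop-client is on the classpath: the helper name listAll is hypothetical, and main lists a local temp directory as a stand-in for an hdfs:// directory. Resolving the file system from the path itself is an alternative to changing fs.defaultFS.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import scala.collection.mutable.ListBuffer

object ListFiles {
  /** Returns every file under `dir` (recursively) as a path string. */
  def listAll(dir: Path, conf: Configuration): List[String] = {
    val fs = dir.getFileSystem(conf) // file system chosen by the path's scheme
    val it = fs.listFiles(dir, true) // true = recurse; only files are returned, not directories
    val found = ListBuffer.empty[String]
    while (it.hasNext()) found += it.next().getPath.toString
    found.toList
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local temp directory replaces the HDFS directory so this runs without a cluster.
    val dir = java.nio.file.Files.createTempDirectory("list-demo")
    java.nio.file.Files.createFile(dir.resolve("a.csv"))
    java.nio.file.Files.createDirectory(dir.resolve("sub"))
    java.nio.file.Files.createFile(dir.resolve("sub").resolve("b.csv"))
    listAll(new Path(dir.toUri), new Configuration()).foreach(println)
  }
}
```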