09. Files Regex - RobertMakyla/scalaWiki GitHub Wiki
KEY POINTS:
Source.fromFile("c:\\sandbox\\clean_logs_intellij.bat").getLines.toArray yields all lines of a file.
Source.fromFile("c:\\sandbox\\clean_logs_intellij.bat").mkString yields the file contents as a string.
Use the Java PrintWriter to write text files.
"regex".r is a Regex object.
Use """...""" if your regular expression contains backslashes or quotes.
If a regex pattern has groups, you can extract their contents using the syntax: for (regex(var1, ..., varn) <- string).
Reading files by lines
import scala.io.Source
val source = Source.fromFile("c:\\sandbox\\clean_logs_intellij.bat")
val iter: Iterator[String] = source.getLines() // Iterator[String] = non-empty iterator
while (iter.hasNext) println(iter.next())
source.close() // must be closed
val source = Source.fromFile("c:\\sandbox\\clean_logs_intellij.bat")
val linesArray: Array[String] = source.getLines.toArray
source.close() // must be closed
val source = Source.fromFile("c:\\sandbox\\clean_logs_intellij.bat", "UTF-8")
val content: String = source.mkString
source.close() // must be closed
Overloads of Source.fromFile:
Source.fromFile(name: String, enc: String) // ("myFile.txt", "UTF-8")
Source.fromFile(name: String) // ("myFile.txt"), uses the default codec
Source.fromFile(file: java.io.File) // ( new File("myFile.txt") )
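Because every Source must be closed, the open/read/close steps above can be wrapped in try/finally so the file is released even if reading fails. This is only a minimal sketch: the object name ReadLinesSketch and the helper readLines are hypothetical, not part of scala.io.

```scala
import scala.io.Source

object ReadLinesSketch {
  // Hypothetical helper: reads all lines and closes the source even if reading fails.
  def readLines(path: String, encoding: String = "UTF-8"): List[String] = {
    val source = Source.fromFile(path, encoding)
    try {
      source.getLines().toList // materialize the lines before the source is closed
    } finally {
      source.close()
    }
  }
}
```

The lines are copied into a List before close() runs, because the iterator returned by getLines() reads lazily from the open file.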
Reading files by character
import scala.io.Source
val source = Source.fromFile("c:\\sandbox\\clean_logs_intellij.bat", "UTF-8")
// scala.io.BufferedSource = non-empty iterator
val iter = source.buffered
// scala.collection.BufferedIterator[Char] = non-empty iterator
while (iter.hasNext) println( iter.next )
source.close()
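The point of source.buffered is that a BufferedIterator lets you look at the next character with head without consuming it. A small sketch of that idea, assuming a hypothetical helper leadingDigits and using Source.fromString so it runs without a file:

```scala
import scala.io.Source

object PeekSketch {
  // Hypothetical helper: consumes the leading digits and returns (digits, remaining text).
  def leadingDigits(text: String): (String, String) = {
    val source = Source.fromString(text)
    try {
      val iter = source.buffered // BufferedIterator[Char]
      val digits = new StringBuilder
      // head peeks at the next character; next() actually consumes it
      while (iter.hasNext && iter.head.isDigit) digits += iter.next()
      (digits.toString, iter.mkString) // mkString drains whatever was not consumed
    } finally {
      source.close()
    }
  }
}
```

For example, PeekSketch.leadingDigits("123abc") splits the input into "123" and "abc".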
BUG (observed in Scala 2.9.0): iterating over source.getLines (Iterator[String]) directly is broken, while source.buffered (Iterator[Char]) is OK.
This does not work:
val source = Source.fromFile(...)
val iter = source.getLines // bug: the iterator comes back empty
while (iter.hasNext) println(iter.next) // the loop body never runs, iter is empty
Instead, convert it to an Array, List, etc. in one operation:
val arr: Array[String] = Source.fromFile(...).getLines.toArray
for (elem <- arr) println(elem) // iterates over every line
Reading Tokens
Splitting on whitespace characters:
val tokens: Array[String] = Source.fromFile("c:\\sandbox\\clean_logs_intellij.bat").mkString.split("\\s+")
Processing each element with a for comprehension (more verbose, equivalent to map):
for (elem <- tokens ) yield elem.toUpperCase
The idiomatic Scala way:
tokens.map( _.toUpperCase )
tokens.map( _.toUpperCase ).filter(_.startsWith("C"))
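The split/map/filter chain above can be tried without a file by reading from a string. A minimal sketch; the object TokenSketch and its method name are made up for illustration:

```scala
import scala.io.Source

object TokenSketch {
  // Splits text on runs of whitespace, upper-cases each token,
  // and keeps only the tokens that start with the given prefix.
  def upperTokensStartingWith(text: String, prefix: String): Array[String] = {
    val source = Source.fromString(text)
    val tokens =
      try source.mkString.split("\\s+")
      finally source.close()
    tokens.map(_.toUpperCase).filter(_.startsWith(prefix))
  }
}
```

Calling TokenSketch.upperTokensStartingWith("cat dog\ncow  bird", "C") gives an array containing "CAT" and "COW". Note that if the text starts with whitespace, split produces an empty first token.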
Reading from console
print("How old are you? ")
val age = scala.io.StdIn.readInt() // the bare readInt() from Predef is deprecated since Scala 2.11
Reading from URLs / Other Sources
val source1 = Source.fromURL("http://horstmann.com", "UTF-8")
val source2 = Source.fromString("Hello, World!") // Reads from the given string—useful for debugging
val source3 = Source.stdin // Reads from standard input
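Reading Binary Files (Scala has no provision for reading binary files, so use Java here: java.io.FileInputStream)
import java.io.{File, FileInputStream}
val file = new File(filename) // filename: path of the file to read
val in = new FileInputStream(file)
val bytes = new Array[Byte](file.length.toInt)
in.read(bytes) // may read fewer bytes than requested; check the returned count
in.close()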
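A single InputStream.read(bytes) call is not guaranteed to fill the whole array. As an alternative (not the approach used above), Java 7 and later offer java.nio.file.Files.readAllBytes, which reads the entire file in one call. A short sketch, with the hypothetical wrapper name readBytes:

```scala
import java.nio.file.{Files, Paths}

object BinarySketch {
  // Reads the whole file into memory; only suitable for files that fit in an array.
  def readBytes(path: String): Array[Byte] = Files.readAllBytes(Paths.get(path))
}
```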
Writing to files (Scala has no provision for writing to files, so use Java here: java.io.PrintWriter)
import java.io.PrintWriter
val out = new PrintWriter( "C:\\Users\\43714408\\Desktop\\deleteMe.txt" )
for (i <- 1 to 10) out.println(i)
out.close()
val source = Source.fromFile( "C:\\Users\\43714408\\Desktop\\deleteMe.txt" )
source.mkString // read the file back to check what was written
source.close()
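The write-then-read-back round trip above can be sketched with try/finally so that the writer and the source are closed even on failure. The object and method names here are hypothetical, and passing "UTF-8" explicitly is an assumption (the original relies on the platform default encoding):

```scala
import java.io.PrintWriter
import scala.io.Source

object WriteSketch {
  // Writes the numbers 1 to n, one per line.
  def writeNumbers(path: String, n: Int): Unit = {
    val out = new PrintWriter(path, "UTF-8")
    try {
      for (i <- 1 to n) out.println(i)
    } finally {
      out.close() // close() also flushes buffered output to disk
    }
  }

  // Reads the whole file back as a single string.
  def readBack(path: String): String = {
    val source = Source.fromFile(path, "UTF-8")
    try source.mkString
    finally source.close()
  }
}
```

println uses the platform line separator, so the text read back contains "\r\n" on Windows and "\n" elsewhere.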
Visiting Directories (again, using Java libraries)
Iterating through file tree:
import java.io.File
def subdirsIter(dir: File): Iterator[File] = {
val children = dir.listFiles.filter(_.isDirectory) // Array[File]
children.toIterator ++ children.toIterator.flatMap(subdirsIter _)
}
subdirsIter(new File("C:\\sandbox")).toArray
Alternatively, since Java 7 the java.nio.file package provides Files.walkFileTree, which walks a file tree and calls back a FileVisitor for each entry.
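A rough sketch of the walkFileTree approach, collecting the same subdirectories as subdirsIter. Files.walkFileTree, SimpleFileVisitor and FileVisitResult are real java.nio.file APIs; the object WalkSketch and method subdirs are names invented for this example:

```scala
import java.nio.file.{FileVisitResult, Files, Path, SimpleFileVisitor}
import java.nio.file.attribute.BasicFileAttributes
import scala.collection.mutable.ArrayBuffer

object WalkSketch {
  // Collects every directory below root, excluding root itself.
  def subdirs(root: Path): List[Path] = {
    val found = ArrayBuffer.empty[Path]
    Files.walkFileTree(root, new SimpleFileVisitor[Path] {
      // Called once per directory, before its entries are visited.
      override def preVisitDirectory(dir: Path, attrs: BasicFileAttributes): FileVisitResult = {
        if (dir != root) found += dir
        FileVisitResult.CONTINUE
      }
    })
    found.toList
  }
}
```

Usage would look like WalkSketch.subdirs(java.nio.file.Paths.get("C:\\sandbox")).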
Serialization (to transmit objects to other virtual machines or for short-term storage)
Java:
public class Person implements java.io.Serializable {
private static final long serialVersionUID = 42L;
}
public class Person implements java.io.Serializable { // Java computes a default serial version UID
}
Scala:
@SerialVersionUID(42L) class Person extends Serializable // the Serializable trait is in the scala package
class Person extends Serializable // a default serial version UID is computed
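To see serialization in action, an object can be written with java.io.ObjectOutputStream and read back with ObjectInputStream. The sketch below round-trips through a byte array instead of a file; the class Employee and the helpers toBytes and fromBytes are hypothetical names for illustration:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical class for illustration; its fields must themselves be serializable.
@SerialVersionUID(42L) class Employee(val name: String, val age: Int) extends Serializable

object SerializationSketch {
  // Serializes an object into a byte array.
  def toBytes(obj: AnyRef): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(obj)
    finally out.close()
    buffer.toByteArray
  }

  // Deserializes an object; the caller states the expected type (unchecked cast).
  def fromBytes[A](bytes: Array[Byte]): A = {
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try in.readObject().asInstanceOf[A]
    finally in.close()
  }
}
```

The copy returned by fromBytes is a new object with the same field values, not the original instance.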
Process Control
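import scala.sys.process._
import scala.language.postfixOps // enables the postfix ! and !! used below (Scala 2.10+)
import java.io.File
"ls -la .." ! // ! runs the command and returns its exit code
val result = "ls -al .." !! // !! runs the command and returns its output as a String
"ls -al .." #| "grep sec" ! // #| pipes the output into another command
"ls -al .." #> new File("output.txt") ! // #> redirects the output to a file
"ls -al .." #>> new File("output.txt") ! // #>> appends the output to a file
"grep sec" #< new File("output.txt") ! // #< takes the input from a file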
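The same operators can be called with dot notation, which avoids postfix syntax, and a command can be passed as a Seq of arguments so that arguments containing spaces are not split. A small sketch, assuming a Unix-like system where echo is available on the PATH; the wrapper names are hypothetical:

```scala
import scala.sys.process._

object ProcessSketch {
  // Runs the command and returns its trimmed standard output.
  // !! throws an exception if the exit code is non-zero.
  def output(command: Seq[String]): String = command.!!.trim

  // Runs the command and returns only its exit code.
  def exitCode(command: Seq[String]): Int = command.!
}
```

For instance, ProcessSketch.output(Seq("echo", "hello")) returns "hello" on such a system.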
Regular Expressions
import scala.util.matching.Regex
val numPattern: Regex = "[0-9]+".r // [0-9]+
val wsnumwsPattern = """\s+[0-9]+\s+""".r // \s+[0-9]+\s+
Iterating over all matches:
for (matchString <- numPattern.findAllIn("a1a b2b c3c")) println(matchString)
Collecting all matches into an array:
numPattern.findAllIn("a1a b2b c3c").toArray
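Besides findAllIn, Regex also offers findFirstIn, replaceFirstIn and replaceAllIn. A brief sketch using the same number pattern; the object and method names are made up for the example:

```scala
object RegexSketch {
  val numPattern = "[0-9]+".r

  // First match wrapped in an Option, or None when nothing matches.
  def first(text: String): Option[String] = numPattern.findFirstIn(text)

  // Replaces every match with "N".
  def maskAll(text: String): String = numPattern.replaceAllIn(text, "N")

  // Replaces only the first match with "N".
  def maskFirst(text: String): String = numPattern.replaceFirstIn(text, "N")
}
```

With the input "a1a b22b", first gives Some("1"), maskAll gives "aNa bNb", and maskFirst gives "aNa b22b".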
Regular Expression Groups
val numitemPattern: Regex = "([0-9]+) ([a-z]+)".r
val numitemPattern(num, item) = "99 bottles" // num = "99", item = "bottles"
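The for (regex(var1, ..., varn) <- ...) syntax from the key points can be sketched as follows, together with a match expression for input that may not fit the pattern (the val form above throws a MatchError in that case). The object GroupSketch and its method names are hypothetical:

```scala
object GroupSketch {
  val numitemPattern = "([0-9]+) ([a-z]+)".r

  // Extracts a (count, item) pair from every match found in the text.
  def items(text: String): List[(Int, String)] =
    (for (numitemPattern(num, item) <- numitemPattern.findAllIn(text))
      yield (num.toInt, item)).toList

  // Matches the whole string; returns None instead of throwing on a non-match.
  def parse(text: String): Option[(Int, String)] = text match {
    case numitemPattern(num, item) => Some((num.toInt, item))
    case _ => None
  }
}
```

For example, GroupSketch.items("99 bottles, 98 cans") yields List((99, "bottles"), (98, "cans")).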