09. Files Regex - RobertMakyla/scalaWiki GitHub Wiki

KEY POINTS: Source.fromFile("c:\\sandbox\\clean_logs_intellij.bat").getLines().toArray yields all lines of a file. Source.fromFile("c:\\sandbox\\clean_logs_intellij.bat").mkString yields the file contents as a single string. Use the Java PrintWriter to write text files. "regex".r is a Regex object. Use """...""" if your regular expression contains backslashes or quotes. If a regex pattern has groups, you can extract their contents using the syntax: for (regex(var1, ..., varn) <- string).

Reading files by lines

     import scala.io.Source

     val source = Source.fromFile("c:\\sandbox\\clean_logs_intellij.bat")
     val iter: Iterator[String]    = source.getLines()     // Iterator[String] = non-empty iterator
     while (iter.hasNext) println( iter.next() )
     source.close()                                        // must be closed

     val source = Source.fromFile("c:\\sandbox\\clean_logs_intellij.bat")
     val linesArray: Array[String] = source.getLines().toArray
     source.close()                                        // must be closed

     val source =  Source.fromFile("c:\\sandbox\\clean_logs_intellij.bat", "UTF-8")
     val content: String           = source.mkString
     source.close()                                        // must be closed


 Overloads of Source.fromFile:

     Source.fromFile(name: String, enc: String)            // ("myFile.txt", "UTF-8")
     Source.fromFile(name: String)                         // ("myFile.txt")
     Source.fromFile(file: java.io.File)                   // ( new File("myFile.txt") )
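
The repeated source.close() calls above can also be delegated to scala.util.Using. A minimal sketch, assuming Scala 2.13 or later; the readLines helper and the temporary file are hypothetical names used only for illustration:

```scala
import java.nio.file.{Files, Path}
import scala.io.Source
import scala.util.{Try, Using}

// Hypothetical helper: reads all lines and closes the source even if reading fails.
def readLines(path: Path): Try[List[String]] =
  Using(Source.fromFile(path.toFile, "UTF-8")) { source =>
    source.getLines().toList          // materialise the lines before the source is closed
  }

// Illustration only: a throwaway file instead of a fixed path.
val sample: Path = Files.createTempFile("lines", ".txt")
Files.write(sample, "first\nsecond\nthird".getBytes("UTF-8"))

val lines: List[String] = readLines(sample).get
lines.foreach(println)
Files.delete(sample)
```

Using returns a Try, so an I/O failure shows up as a Failure value instead of an exception escaping the block.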

Reading files by character

     import scala.io.Source

     val source = Source.fromFile("c:\\sandbox\\clean_logs_intellij.bat", "UTF-8")
                                                 // source: scala.io.BufferedSource = non-empty iterator
     val iter = source.buffered                  // scala.collection.BufferedIterator[Char] = non-empty iterator
     while (iter.hasNext)  println( iter.next() )
     source.close()
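
A buffered iterator also allows looking at the next character without consuming it, through head. A small sketch, using Source.fromString in place of a file so that it is self-contained; the input text and the variable names are made up for the example:

```scala
import scala.io.Source

// Source.fromString stands in for a file here.
val source = Source.fromString("42abc")
val iter = source.buffered                 // BufferedIterator[Char]

val digits = new StringBuilder
// head peeks at the next character; next() consumes it
while (iter.hasNext && iter.head.isDigit) digits.append(iter.next())

val rest: String = iter.mkString           // whatever was not consumed by the loop
source.close()
println(s"digits=$digits rest=$rest")
```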

BUG: val iter: Iterator[String] = Source.getLines is broken in 2.9.0 val iter: Iterator[Char] = Source.buffered is OK

 I cannot:

     val source = Source.fromFile(...)
     val iter = source.getLines                          // bug: iterator gets empty
     while(iter.hasNext) println(iter.next)              // no loops here, iter was empty

 but I must convert it to Array, List, etc, in one operation :

     val arr:Array[String] = Source.fromFile(...).getLines.toArray
     for(elem <- arr) println( elem )                    // many loops here

Reading Tokens

 splitting by whitespace chars:

     val tokens: Array[String] = Source.fromFile("c:\\sandbox\\clean_logs_intellij.bat").mkString.split("\\s+")

 processing each element with a for comprehension (desugars to map, just more verbose):

     for (elem <- tokens ) yield elem.toUpperCase

 idiomatic Scala:

     tokens.map( _.toUpperCase )
     tokens.map( _.toUpperCase ).filter(_.startsWith("C"))
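
The token pipeline above can be tried without a file. A minimal sketch, assuming an in-memory Source.fromString as input; the sample words are invented for the example:

```scala
import scala.io.Source

// In-memory input with mixed whitespace (spaces, newline, tab).
val source = Source.fromString("cat  dog\ncow\tcrab")
val tokens: Array[String] = source.mkString.split("\\s+")
source.close()

val viaFor = for (elem <- tokens) yield elem.toUpperCase   // for comprehension
val upper  = tokens.map(_.toUpperCase)                     // same result via map
val startingWithC = upper.filter(_.startsWith("C"))

println(viaFor.sameElements(upper))
println(startingWithC.mkString(", "))
```

Note that split("\\s+") produces an empty first token when the text starts with whitespace, so real input may need a trim first.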

Reading from console

     import scala.io.StdIn                       // Predef.readInt was deprecated in 2.11 and removed in 2.13

     print("How old are you? ")
     val age = StdIn.readInt()
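
For trying console input without typing, Console.withIn can redirect the input for the duration of a block. A small sketch; the canned answer "42" is only an assumption for the example:

```scala
import java.io.StringReader
import scala.io.StdIn

// Console.withIn swaps the console input inside the block,
// so StdIn.readInt reads from the StringReader instead of the keyboard.
val age: Int = Console.withIn(new StringReader("42\n")) {
  print("How old are you? ")
  StdIn.readInt()
}
println()
println(s"Next year you will be ${age + 1}")
```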

Reading from URLs / Other Sources

     import scala.io.Source

     val source1 = Source.fromURL("http://horstmann.com", "UTF-8")
     val source2 = Source.fromString("Hello, World!")   // Reads from the given string, useful for debugging
     val source3 = Source.stdin                         // Reads from standard input

Reading Binary Files (Scala has no provision for reading binary files, so use java here: java.io.FileInputStream)

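     import java.io.{File, FileInputStream}

     val file = new File(filename)                    // filename: path of the file to read
     val in = new FileInputStream(file)
     val bytes = new Array[Byte](file.length.toInt)   // assumes the file length fits in an Int
     in.read(bytes)                                   // may read fewer bytes than requested
     in.close()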
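
Since a single read call is not guaranteed to fill the array, a different technique is sometimes used: java.nio.file.Files.readAllBytes (Java 7+), which reads the whole file and closes it. This is a swapped-in alternative to the FileInputStream code above, not the same approach; the temporary file and its contents are invented for the example:

```scala
import java.nio.file.{Files, Path}

// Illustration only: create a small binary file to read back.
val binFile: Path = Files.createTempFile("sample", ".bin")
Files.write(binFile, Array[Byte](1, 2, 3, -1))

// Reads every byte of the file and closes it afterwards.
val bytes: Array[Byte] = Files.readAllBytes(binFile)
println(bytes.mkString(" "))
Files.delete(binFile)
```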

Writing into files (Scala has no provision for writing into files, so use java here: java.io.PrintWriter )

     import java.io.PrintWriter
     import scala.io.Source

     val out = new PrintWriter( "C:\\Users\\43714408\\Desktop\\deleteMe.txt" )
     for (i <- 1 to 10) out.println(i)
     out.close()

     val source = Source.fromFile( "C:\\Users\\43714408\\Desktop\\deleteMe.txt" )
     source.mkString                                       // file contents as one String
     source.close()
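
The same write-then-read round trip, sketched with try/finally so that the writer and the source are closed even when an exception occurs. The temporary file and the explicit UTF-8 encoding are assumptions made for the example:

```scala
import java.io.{File, PrintWriter}
import scala.io.Source

// Illustration only: a temp file instead of a fixed Desktop path.
val target: File = File.createTempFile("deleteMe", ".txt")

val out = new PrintWriter(target, "UTF-8")
try {
  for (i <- 1 to 3) out.println(i)
} finally out.close()                      // flushes and closes the writer

val source = Source.fromFile(target, "UTF-8")
val written: List[String] =
  try source.getLines().toList
  finally source.close()

println(written.mkString(","))
target.delete()
```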

Visiting Directories (again, java libs:)

 Iterating through file tree:

     import java.io.File
     def subdirsIter(dir: File): Iterator[File] = {
         // listFiles returns null when dir is not a readable directory
         val children = Option(dir.listFiles).getOrElse(Array.empty[File]).filter(_.isDirectory)
         children.iterator ++ children.iterator.flatMap(subdirsIter)
     }

     subdirsIter(new File("C:\\sandbox")).toArray


 Alternatively, since Java 7 the java.nio.file package offers Files.walkFileTree, which walks the tree with a visitor.
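
A possible walkFileTree version of the subdirectory listing, as a sketch. The subdirs helper and the temporary directory layout are hypothetical and exist only to make the example self-contained:

```scala
import java.nio.file.{FileVisitResult, Files, Path, SimpleFileVisitor}
import java.nio.file.attribute.BasicFileAttributes
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper: collects every directory below root (root itself excluded).
def subdirs(root: Path): List[Path] = {
  val found = ArrayBuffer.empty[Path]
  Files.walkFileTree(root, new SimpleFileVisitor[Path] {
    override def preVisitDirectory(dir: Path, attrs: BasicFileAttributes): FileVisitResult = {
      if (dir != root) found += dir        // called once per directory, before its entries
      FileVisitResult.CONTINUE
    }
  })
  found.toList
}

// Illustration only: root/a/b plus one regular file (left in the temp dir afterwards).
val root: Path = Files.createTempDirectory("walk")
Files.createDirectories(root.resolve("a").resolve("b"))
Files.createFile(root.resolve("a").resolve("file.txt"))

val dirs: List[String] =
  subdirs(root).map(p => root.relativize(p).toString.replace('\\', '/')).sorted
dirs.foreach(println)
```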

Serialization (to transmit objects to other virtual machines or for short-term storage)

 Java:
     public class Person implements java.io.Serializable {
         private static final long serialVersionUID = 42L;
     }

     public class Person implements java.io.Serializable {       // java will give default serial ID
     }

 Scala:

    @SerialVersionUID(42L) class Person extends Serializable     // trait Serializable is in scala pkg

    class Person extends Serializable                            // scala will give default serial ID
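
A round trip through Java object streams, as a minimal sketch. The name field is added only for illustration, an in-memory byte array stands in for a file or socket, and the sketch assumes Person is compiled as a top-level class:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// The name field is hypothetical; the wiki's Person has no members.
@SerialVersionUID(42L) class Person(val name: String) extends Serializable

val fred = new Person("Fred")

// Serialize into an in-memory buffer.
val buffer = new ByteArrayOutputStream()
val out = new ObjectOutputStream(buffer)
out.writeObject(fred)
out.close()

// Deserialize from the same bytes; readObject returns AnyRef, hence the cast.
val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
val savedFred = in.readObject().asInstanceOf[Person]
in.close()

println(savedFred.name)
```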

Process Control

     import java.io.File
     import scala.sys.process._

     "ls -la ..".!                                     //   !    - runs the command, returns the exit code
     val result = "ls -al ..".!!                       //   !!   - returns the output as a String
     ("ls -al .." #| "grep sec").!                     //   #|   - pipe
     ("ls -al .." #> new File("output.txt")).!         //   #>   - redirect to file
     ("ls -al .." #>> new File("output.txt")).!        //   #>>  - append to file
     ("grep sec" #< new File("output.txt")).!          //   #<   - redirect from file

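The difference between ! and !! is easier to see with commands whose results are predictable. A small sketch, assuming a Unix-like system with sh, echo and grep on the PATH; the Seq form passes each argument separately instead of splitting a string on spaces:

```scala
import scala.sys.process._

// Assumption: sh, echo and grep are available (Unix-like system).
val exitCode: Int  = Seq("sh", "-c", "exit 3").!      // ! returns the exit code
val output: String = Seq("echo", "hello").!!          // !! returns stdout as a String

// #| pipes the output of the first process into the second.
val piped: String = (Seq("echo", "a\nb\nc") #| Seq("grep", "b")).!!

println(s"exit=$exitCode output=${output.trim} piped=${piped.trim}")
```

Note that !! throws an exception when the command exits with a non-zero code, while ! just returns that code.
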
Regular Expressions

     import scala.util.matching.Regex

     val numPattern: Regex = "[0-9]+".r            //   [0-9]+
     val wsnumwsPattern = """\s+[0-9]+\s+""".r     //   \s+[0-9]+\s+

 iterator:

     for (matchString <- numPattern.findAllIn("a1a b2b c3c")) println(matchString)

 array:

     numPattern.findAllIn("a1a b2b c3c").toArray
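
Besides findAllIn, the Regex class has methods for finding the first match and for replacing matches. A short sketch; the sample text is made up for the example:

```scala
import scala.util.matching.Regex

val numPattern: Regex = "[0-9]+".r
val text = "99 bottles, 98 bottles"

val first: Option[String] = numPattern.findFirstIn(text)         // first match, if any
val all: List[String]     = numPattern.findAllIn(text).toList    // every match
val replacedFirst: String = numPattern.replaceFirstIn(text, "XX")
val replacedAll: String   = numPattern.replaceAllIn(text, "XX")

println(first)
println(all)
println(replacedFirst)
println(replacedAll)
```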

Regular Expression Groups

     val numitemPattern: Regex = "([0-9]+) ([a-z]+)".r

     val numitemPattern(num, item) = "99 bottles"        // num = "99", item = "bottles"
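
The for (regex(var1, ..., varn) <- string) syntax from the key points, together with a pattern match, sketched under the assumption of Scala 2.13. The describe helper and the sample text are hypothetical:

```scala
import scala.util.matching.Regex

val numitemPattern: Regex = "([0-9]+) ([a-z]+)".r
val text = "99 bottles, 98 bottles"

// Each match found by findAllIn is destructured into its two groups.
val pairs: List[(Int, String)] =
  (for (numitemPattern(n, i) <- numitemPattern.findAllIn(text)) yield (n.toInt, i)).toList

// Hypothetical helper: the extractor only succeeds when the whole string matches.
def describe(s: String): String = s match {
  case numitemPattern(n, i) => s"$n x $i"
  case _                    => "no match"
}

println(pairs)
println(describe("12 apples"))
println(describe("apples"))
```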