Perl - jwyx/ForFun GitHub Wiki
Credit
Learning Perl, 6th Edition
Introduction
Perl is sometimes called the “Practical Extraction and Report Language,” although it
has also been called a “Pathologically Eclectic Rubbish Lister,” among other expansions.
Perl is easy to use, but sometimes hard to learn.
Perl is optimized for problems which are about 90% working with text and about 10% everything else.
shebang line
#!/usr/bin/perl
#!/usr/bin/env perl
documentation
http://perldoc.perl.org/perldiag.html
The perl interpreter first compiles the whole program into internal bytecode, then its bytecode engine runs it.
To avoid recompiling on every run, the mod_perl extension to the Apache web server (http://perl.apache.org) or
Perl modules like CGI::Fast can keep your program in memory between invocations.
run external command with backquotes ``
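A minimal sketch of the backquote form. It assumes an `echo` command is available from the system shell; the variable name is only for illustration.

```perl
use strict;
use warnings;

# Backquotes run the command and return whatever it wrote to standard output.
my $output = `echo hello`;

# The captured text usually ends with a newline, so remove it.
chomp($output);

print "The command printed: $output\n";
```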
Scalar Data
either a number or a string of characters
integers and floating-point numbers are represented with double-precision floating point values
can use underscore in integer literal; e.g. 61_298_040_283_768
base 10 (decimal), base 8 (octal, 0377), base 16 (hexadecimal, 0xff), base 2 (binary, 0b111)
numeric operators
addition: +
subtraction: -
multiplication: *
division: /
modulus: %; both operands are first truncated to their integer values, e.g. 10.5 % 3.2 is computed as 10 % 3.
exponentiation: **, e.g. 2**3 is 8
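The literal forms and operators above can be tried out with a short script. This is only a sketch; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Underscores in an integer literal are ignored; they only aid readability.
my $big = 61_298_040_283_768;

# The same value, 255, written in decimal, octal, hexadecimal, and binary.
my @same = (255, 0377, 0xff, 0b11111111);

# Modulus truncates both operands to integers first, so this is 10 % 3.
my $mod = 10.5 % 3.2;

# Exponentiation.
my $pow = 2 ** 3;

print "big:  $big\n";
print "same: @same\n";
print "mod:  $mod\n";
print "pow:  $pow\n";
```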
String
contain any combination of any characters
there’s nothing special about the NUL character in Perl because Perl uses length counting
min: empty string, has no characters
max: no built-in limit (a string can fill all of your available memory)
This is in accordance with the principle of “no built-in limits” that Perl follows at every opportunity.
can be used to manipulate raw binary data
Perl fully supports Unicode; to use Unicode characters literally in your source, add the pragma 'use utf8;' and make sure that you save your files with the UTF-8 encoding
String Literal
Single-Quoted
Any character other than a single quote or a backslash between the quote marks
(including newline characters, if the string continues on to successive lines)
stands for itself inside a string.
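A small sketch of the single-quote rules. The only two escapes are \' and \\; everything else, including \n, is taken literally. The variable names are illustrative.

```perl
use strict;
use warnings;

my $plain = 'hello\n';     # backslash and n are two ordinary characters here
my $quote = 'don\'t';      # \' gives a literal single quote
my $slash = 'C:\\temp';    # \\ gives a single backslash
my $multi = 'line one
line two';                 # a real newline between the quotes is kept

print length($plain), "\n";   # 7 characters: h e l l o \ n
print "$quote\n";
print "$slash\n";
print "$multi\n";
```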
Double-Quoted
Here the backslash takes on its full power to specify certain control
characters (such as \n and \t), or even any character at all through octal and hex representations.
Variables are also interpolated.
\l Lowercase next letter
\L Lowercase all following letters until \E
\u Uppercase next letter
\U Uppercase all following letters until \E
\Q Quote nonword characters by adding a backslash until \E
\E End \L, \U, or \Q
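The case-shifting and quoting escapes can be combined with interpolation, as in this sketch. The sample string and variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $name = "fRED flintstone";

my $first_up  = "\u$name";         # FRED flintstone (only the first letter changes)
my $all_lower = "\L$name\E!";      # fred flintstone!
my $all_upper = "\U$name\E?";      # FRED FLINTSTONE?
my $capital   = "\u\L$name\E";     # Fred flintstone
my $quoted    = "\Qa.b*c\E";       # a\.b\*c
my $by_code   = "\x41\101";        # AA (hex 41 and octal 101 are both "A")

print "$_\n" for $first_up, $all_lower, $all_upper, $capital, $quoted, $by_code;
```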
String operators
concatenate strings with the . operator
concatenation must be explicit; merely placing two values next to each other does not join them
string repetition operator: x
"fred" x 3 # is "fredfredfred"
"barney" x (4+1) # is "barney" x 5, or "barneybarneybarneybarneybarney"
5 x 4.8 # is really "5" x 4, which is "5555"
The copy count (the right operand) is first truncated to an integer value (4.8 becomes 4)
before being used.
A copy count of less than one results in an empty (zero-length) string.
Perl automatically converts between numbers and strings as needed.
Trailing nonnumber stuff and leading whitespace are discarded, so "12fred34" * " 3" will also
give 36 without any complaints.
At the extreme end of this, something that isn’t a number at all converts to zero.
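The conversions described above can be checked with a short sketch. Numeric warnings are switched off here only so that the sloppy strings convert silently; with warnings on, Perl would complain about them.

```perl
use strict;
use warnings;
no warnings 'numeric';    # silence "isn't numeric" for this demonstration

# Trailing non-number text and leading whitespace are discarded: 12 * 3.
my $product = "12fred34" * " 3";

# Something that is not a number at all converts to zero.
my $zero = "fred" * 3;

# The operator decides the conversion: * wants numbers, . wants strings.
my $mixed = "Z" . 5 * 7;    # same as "Z" . 35

print "$product $zero $mixed\n";
```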
built-in warning
use warnings;
#!/usr/bin/perl -w
perl -w my_program
use diagnostics;
perl -Mdiagnostics ./my_program
(diagnostics prints the longer perldiag explanation of each message, in case you want to read the documentation as soon as Perl notices your mistakes)
Scalar variable
A scalar variable holds a single scalar value, as you’d expect.
Scalar variable names begin with a dollar sign, called the sigil, followed by a Perl identifier:
a letter or underscore, and then possibly more letters, or digits, or underscores.
uppercase and lowercase letters are distinct
Perl doesn’t restrict itself to ASCII for variable names, either.
If you enable the utf8 pragma, you can use a much wider range of alphabetic or
numeric characters in your identifiers
Perl uses the sigils to distinguish things that are variables from anything
else that you might type in the program. You don’t have to know the names of
all the Perl functions and operators to choose your variable name.
You can name your variables with all uppercase, but you might end up using a special variable
reserved for Perl. If you avoid all uppercase names you won’t have that problem.
assignment
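binary assignment operators: +=, -=, .= and so on; e.g. $fred += 5 is the same as $fred = $fred + 5
output with print: it accepts a comma-separated list of values
E.g. print "The answer is ", 6 * 7, ".\n"; # list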
Don’t bother with interpolating if you have just the one lone variable
E.g. print $fred; # better style
To put a real dollar sign into a double-quoted string, precede the dollar sign with a
backslash, which turns off the dollar sign’s special significance
Alternatively, you could avoid using double quotes around the problematic part of the string
The variable name will be the longest possible variable name that makes sense at that
part of the string. This can be a problem if you want to follow the replaced value immediately
with some constant text that begins with a letter, digit, or underscore
$what = "brontosaurus steak";
$n = 3;
print "fred ate $n $whats.\n"; # not the steaks, but the value of $whats
print "fred ate $n ${what}s.\n"; # now uses $what
print "fred ate $n $what" . "s.\n"; # another way to do it
print 'fred ate ' . $n . ' ' . $what . "s.\n"; # an especially difficult way
Create characters by their code point with the chr() function:
$alef = chr( 0x05D0 );
$alpha = chr( hex('03B1') );
$omega = chr( 0x03C9 );
Go the other way with the ord() function, which turns a character into its code point
That might be more work than interpolating them directly by putting the hexadecimal
representation in \x{}: "\x{03B1}\x{03C9}"
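A brief sketch showing that chr(), ord(), and \x{} agree with each other. It prints only code points, so no output encoding needs to be set up.

```perl
use strict;
use warnings;

my $alpha = chr( hex('03B1') );    # GREEK SMALL LETTER ALPHA
my $omega = chr( 0x03C9 );         # GREEK SMALL LETTER OMEGA

# ord() goes the other way, from character to code point.
my $code = ord($alpha);

# The same two characters written directly with \x{}.
my $direct = "\x{03B1}\x{03C9}";

printf "alpha is U+%04X\n", $code;
print "same string\n" if $direct eq $alpha . $omega;
```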
Operator Precedence and Associativity
Associativity   Operators
left            parentheses and arguments to list operators
left            ->
                ++ -- (autoincrement and autodecrement)
right           **
right           \ ! ~ + - (unary operators)
left            =~ !~
left            * / % x
left            + - . (binary operators)
left            << >>
                named unary operators (-X filetests, rand)
                < <= > >= lt le gt ge (the “unequal” ones)
                == != <=> eq ne cmp (the “equal” ones)
left            &
left            | ^
left            &&
left            ||
                .. ...
right           ?: (conditional operator)
right           = += -= .= (and similar assignment operators)
left            , =>
                list operators (rightward)
right           not
left            and
left            or xor
Better to use parentheses!
to compare numbers: < <= == >= > !=
to compare strings: lt, le, eq, ge, gt, ne
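The two operator families can give different answers for the same operands, which is why picking the right one matters. A small sketch with illustrative variable names:

```perl
use strict;
use warnings;

# Numerically, 35 is less than 200.
my $num_lt = ( 35 < 200 ) ? 'yes' : 'no';

# As strings, "35" sorts after "200" because "3" comes after "2".
my $str_lt = ( "35" lt "200" ) ? 'yes' : 'no';

# "1.0" and "1" are the same number but different strings.
my $num_eq = ( "1.0" == "1" ) ? 'yes' : 'no';
my $str_eq = ( "1.0" eq "1" ) ? 'yes' : 'no';

print "35 <  200:   $num_lt\n";
print "35 lt 200:   $str_lt\n";
print "1.0 == 1:    $num_eq\n";
print "1.0 eq 1:    $str_eq\n";
```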
If structure
format: if ( ) { } else { }
You must have those block curly braces around the conditional code,
unlike C (whether you know C or not).
Perl doesn’t have a separate Boolean datatype; instead, it uses a few simple rules:
• If the value is a number, 0 means false; all other numbers mean true.
• Otherwise, if the value is a string, the empty string ('') means false; all other strings
mean true.
• Otherwise (that is, if the value is another kind of scalar than a number or a string),
convert it to a number or a string and try again;
This means that undef (which you’ll see soon) means false, while all references
(which we cover in Intermediate Perl) are true
There’s one trick hidden in those rules. Because the string '0' is the exact same scalar
value as the number 0, Perl has to treat them both the same. That means that the string
'0' is the only non-empty string that is false.
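The rules above can be checked value by value. In this sketch, truth() is a hypothetical helper written just for the demonstration; it is not a Perl built-in.

```perl
use strict;
use warnings;

# Hypothetical helper: report how Perl treats a value in Boolean context.
sub truth {
    my ($value) = @_;
    return $value ? 'true' : 'false';
}

my @cases = (
    [ 'number 0',     0      ],
    [ 'number 42',    42     ],
    [ 'empty string', ''     ],
    [ "string '0'",   '0'    ],
    [ "string '00'",  '00'   ],    # not the same string as '0', so true
    [ "string '0.0'", '0.0'  ],    # also a non-empty string other than '0'
    [ "string 'fred'", 'fred' ],
    [ 'undef',        undef  ],
);

my @verdicts = map { truth( $_->[1] ) } @cases;

for my $i ( 0 .. $#cases ) {
    print "$cases[$i][0]: $verdicts[$i]\n";
}
```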
Since the ! changes true to false and false to true, and since Perl
doesn’t have a separate Boolean type, the ! has to return some scalar to represent true
and false. It turns out that 1 and 0 are good enough values, so some people like to
standardize their values to just those values.
Getting User Input
Use the line-input operator, <STDIN>
Each time you use <STDIN> in a place where Perl expects a scalar value, Perl reads the
next complete text line from standard input (up to the first newline), and uses that string
as the value of <STDIN>.
The string value of <STDIN> typically has a newline character on the end of it.
But in practice, you don’t often want to keep the newline, so you need the chomp() operator.
chomp($text = <STDIN>); # Read the text, without the newline character
It works on a variable. The variable has to hold a string, and if the string ends in a newline
character, chomp() removes the newline.
chomp() is actually a function. As a function, it has a return value,
which is the number of characters removed.
This is another general rule in Perl: except in cases where it changes the meaning to remove them,
parentheses are always optional.
If a line ends with two or more newlines, chomp() removes only one. If there’s no
newline, it does nothing, and returns zero.
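The behaviour of chomp() can be seen without reading any input by using fixed strings. A sketch, with made-up variable names:

```perl
use strict;
use warnings;

my $one = "hello\n";
my $removed_one = chomp($one);      # removes the newline, returns 1

my $two = "hello\n\n";
chomp($two);                        # removes only one newline

my $none = "hello";
my $removed_none = chomp($none);    # nothing to remove, returns 0

print "removed from first:  $removed_one\n";
print "second still ends with a newline\n" if $two eq "hello\n";
print "removed from third:  $removed_none\n";
```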
While structure:
the block curly braces are required
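A minimal while loop as a sketch. The counter and the running total are illustrative names; the loop repeats its block for as long as the condition is true.

```perl
use strict;
use warnings;

my $count = 0;
my $sum   = 0;

# The condition is tested before each pass through the block.
while ( $count < 10 ) {
    $count += 2;
    $sum   += $count;
    print "count is now $count\n";    # prints 2, 4, 6, 8, 10
}

print "sum is $sum\n";
```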
The undef value:
Variables have the special undef value before they are first
assigned, which is just Perl’s way of saying, “Nothing here to look at—move along,
move along.”
But undef is neither a number nor a string; it’s an entirely separate kind of scalar value.
undef automatically acts like zero when used as a number, and like the empty string when used as a string.
Many operators return undef when the arguments are out of range or don’t make sense.
To tell whether a value is undef and not the empty string, use the
defined function, which returns false for undef, and true for everything else.
e.g.
$madonna = <STDIN>;
if ( defined($madonna) ) {
    print "The input was $madonna";
} else {
    print "No input available!\n";
}
If you’d like to make your own undef values, you can use the obscurely named undef
operator:
e.g.
$madonna = undef; # As if it had never been touched
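Because undef acts like zero as a number and like the empty string as a string, an accumulator can start out unassigned. A sketch with illustrative variable names:

```perl
use strict;
use warnings;

my $n = 1;
my $sum;                                   # starts out undef
my $was_defined = defined($sum) ? 1 : 0;   # 0: nothing assigned yet

while ( $n < 10 ) {
    $sum += $n;    # the first pass treats undef as 0
    $n   += 2;     # on to the next odd number
}

my $string;                  # also undef
$string .= "more text";      # undef acts like the empty string here

print "The total was $sum.\n";    # 1 + 3 + 5 + 7 + 9
print "$string\n";
```

The += and .= operators are exempt from the "uninitialized value" warning, so this runs cleanly even with warnings enabled.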
Cont.
P43