Module shlex removing newline after comment - ebranca/owasp-pysec GitHub Wiki
Classification
- Affected Components : shlex
- Operating System : Linux
- Python Versions : 2.6.x, 2.7.x, 3.1.x, 3.2.x
- Reproducible : Yes
Source code
from shlex import shlex

# CASE1: no comment, default whitespace (newline counts as whitespace)
print("#################")
lexera = shlex("a \n b")
A = ",".join(lexera)
texta = repr('CASE1 ==> shlex("a \n b")')
print('%s' % (texta))
print(repr(A))
if A == 'a,b':
    print("CASE1 --> PASS")
else:
    print("CASE1 --> FAIL")

# CASE2: comment before the newline, default whitespace
print("#################")
lexerb = shlex("a # comment \n b")
B = ",".join(lexerb)
textb = repr('CASE2 ==> shlex("a # comment \n b")')
print('%s' % (textb))
print(repr(B))
if B == 'a,\n,b':
    print("CASE2 --> PASS")
else:
    print("CASE2 --> FAIL")

# CASE3: no comment, whitespace limited to the space character
print("#################")
lexerc = shlex("a \n b")
lexerc.whitespace = " "
C = ",".join(lexerc)
textc = repr('CASE3 ==> shlex("a \n b") + whitespace')
print('%s' % (textc))
print(repr(C))
if C == 'a,\n,b':
    print("CASE3 --> PASS")
else:
    print("CASE3 --> FAIL")

# CASE4: comment before the newline, whitespace limited to the space character
print("#################")
lexerd = shlex("a # comment \n b")
lexerd.whitespace = " "
D = ",".join(lexerd)
textd = repr('CASE4 ==> shlex("a # comment \n b") + whitespace')
print('%s' % (textd))
print(repr(D))
if D == 'a,\n,b':
    print("CASE4 --> PASS")
else:
    print("CASE4 --> FAIL")
Steps to Produce/Reproduce
To reproduce the problem, copy the source code into a file (for example test.py) and run the script with the following command:
$ python -OOBRtt test.py
Alternatively you can open python in interactive mode:
$ python -OOBRtt <press enter>
Then copy the lines of code into the interpreter.
Description
Test results are the same in every version of Python tested.
Test 1: no comment, newline, default whitespace
Test | Test String | Expected | Obtained | Status |
---|---|---|---|---|
1 | "a \n b" | 'a,b' | 'a,b' | PASS |
Test 2: comment, newline, default whitespace
Test | Test String | Expected | Obtained | Status |
---|---|---|---|---|
2 | "a # comment \n b" | 'a,\n,b' | 'a,b' | FAIL |
Test 3: no comment, newline, whitespace set to the space character only
Test | Test String | Expected | Obtained | Status |
---|---|---|---|---|
3 | "a \n b" | 'a,\n,b' | 'a,\n,b' | PASS |
Test 4: comment, newline, whitespace set to the space character only
Test | Test String | Expected | Obtained | Status |
---|---|---|---|---|
4 | "a # comment \n b" | 'a,\n,b' | 'a,b' | FAIL |
The shlex module is meant to tokenize input the way a shell does, but the Python implementation does not appear to follow the POSIX 2008 standard in this case.
Under POSIX 2008, the newline character that ends a comment is not part of the comment, so it should still be tokenized. shlex instead discards the newline together with the comment text.
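The difference is easiest to see when the newline is not treated as whitespace. The following is a minimal sketch, not part of the original test script; the helper name `tokens` is hypothetical. It compares the same input with and without a comment, using a lexer whose whitespace set contains only the space character.

```python
from shlex import shlex


def tokens(text, whitespace=None):
    """Return the list of tokens shlex yields for `text`.

    `tokens` is a hypothetical helper used only for this illustration.
    If `whitespace` is given, it replaces the lexer's default whitespace set.
    """
    lexer = shlex(text)
    if whitespace is not None:
        lexer.whitespace = whitespace
    return list(lexer)


# Without a comment, the newline is returned as its own token.
without_comment = tokens("a \n b", whitespace=" ")

# With a comment, the newline disappears together with the comment text.
with_comment = tokens("a # comment \n b", whitespace=" ")

print(repr(without_comment))
print(repr(with_comment))
```

In the CPython sources, the comment branch of the lexer skips the remainder of the line by calling `readline()` on the input stream, which would explain why the terminating newline is consumed along with the comment.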
Workaround
We are not aware of any easy solution other than avoiding shlex for cases like the one examined.
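When shlex cannot be avoided, one partial mitigation is to split the input on newlines first and tokenize each line separately, so that comment handling never sees the line terminator. The sketch below is an assumption-laden illustration, not a vetted fix: the function name `tokenize_keep_newlines` is hypothetical, and the approach assumes that no quoted string spans more than one line (such input would make shlex raise `ValueError`).

```python
from shlex import shlex


def tokenize_keep_newlines(text):
    """Tokenize `text` line by line, emitting "\n" between lines.

    Hypothetical helper: assumes quoted strings never span several lines.
    """
    tokens = []
    lines = text.split("\n")
    for index, line in enumerate(lines):
        # shlex drops the comment, but the newline is no longer in its input.
        tokens.extend(shlex(line))
        # Re-insert the newline that separated this line from the next one.
        if index < len(lines) - 1:
            tokens.append("\n")
    return tokens


print(repr(tokenize_keep_newlines("a # comment \n b")))
```

With this approach the newline token is produced by the caller rather than by shlex, so it survives whether or not the line ends in a comment.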
Secure Implementation
WORK IN PROGRESS
References
- [POSIX 2008 - Shell Command Language](http://pubs.opengroup.org/onlinepubs/9699919799/idx/shell.html)
- [POSIX 2008 - Shell Command Language - Section 2.10](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_10)
- [POSIX 2008 - Shell Command Language - Section 2.3](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03)
- [POSIX 2008 - Shell Command Language rationale](http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xcu_chap02.html)
- [Python bug 7089](http://bugs.python.org/issue7089)
- [Python bug 7611](http://bugs.python.org/issue7611)
- [Python bug 1521950](http://bugs.python.org/issue1521950)