Vectorrep. - sagr4019/ResearchProject GitHub Wiki
The identifier-to-security-class map is not part of the AST?
We want to encode the AST in a vector representation, because we already get ASTs from the data generation and extracting tokens from the ASTs seems more complex. Do we need one vector per program, or multiple vectors, one for every AST node?
If we use multiple vectors, how do we encode them, and how do we encode an identifier?
Omit unimportant content, or mark it everywhere with the same placeholder character.
[{'Identifier': 'Hdrw', 'Security': 'H'}, {'Identifier': 'hJzX', 'Security': 'H'}, {'Identifier': 'NWT', 'Security': 'H'}]
NWT := Null;
hJzX := Null;
Hdrw := Null;
while (-158202 == Hdrw) do {
hJzX := (612125 + NWT)
}
{
  'Kind': 'Seq',
  'Left': {
    'Kind': 'Assign',
    'Left': {
      'Kind': 'Var',
      'Name': 'NWT'
    },
    'Right': {
      'Kind': 'Null',
      'Value': 'Null'
    }
  },
  'Right': {
    'Kind': 'Seq',
    'Left': {
      'Kind': 'Assign',
      'Left': {
        'Kind': 'Var',
        'Name': 'hJzX'
      },
      'Right': {
        'Kind': 'Null',
        'Value': 'Null'
      }
    },
    'Right': {
      'Kind': 'Seq',
      'Left': {
        'Kind': 'Assign',
        'Left': {
          'Kind': 'Var',
          'Name': 'Hdrw'
        },
        'Right': {
          'Kind': 'Null',
          'Value': 'Null'
        }
      },
      'Right': {
        'Kind': 'While',
        'Condition': {
          'Kind': 'Equal',
          'Left': {
            'Kind': 'Int',
            'Value': -158202
          },
          'Right': {
            'Kind': 'Var',
            'Name': 'Hdrw'
          }
        },
        'Body': {
          'Kind': 'Assign',
          'Left': {
            'Kind': 'Var',
            'Name': 'hJzX'
          },
          'Right': {
            'Kind': 'Add',
            'Left': {
              'Kind': 'Int',
              'Value': 612125
            },
            'Right': {
              'Kind': 'Var',
              'Name': 'NWT'
            }
          }
        }
      }
    }
  }
}
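Since the security classes live in a separate list and not in the AST, one way to bring the two together is to walk the AST, collect the `Var` names, and look each one up in the map. The following is only a sketch on a shortened AST; the helper name `collect_var_names` and the lookup dictionary `classes` are made up for illustration and are not part of the project.

```python
def collect_var_names(node):
    """Return the names of all 'Var' nodes in pre-order, without duplicates."""
    names = []

    def visit(n):
        if not isinstance(n, dict):
            return  # leaf values such as 'Seq' or 612125
        if n.get('Kind') == 'Var' and n['Name'] not in names:
            names.append(n['Name'])
        for child in n.values():
            visit(child)

    visit(node)
    return names


# Shortened AST (hypothetical excerpt): NWT := Null; hJzX := (612125 + NWT)
ast = {
    'Kind': 'Seq',
    'Left': {'Kind': 'Assign',
             'Left': {'Kind': 'Var', 'Name': 'NWT'},
             'Right': {'Kind': 'Null', 'Value': 'Null'}},
    'Right': {'Kind': 'Assign',
              'Left': {'Kind': 'Var', 'Name': 'hJzX'},
              'Right': {'Kind': 'Add',
                        'Left': {'Kind': 'Int', 'Value': 612125},
                        'Right': {'Kind': 'Var', 'Name': 'NWT'}}},
}

security_map = [{'Identifier': 'hJzX', 'Security': 'H'},
                {'Identifier': 'NWT', 'Security': 'H'}]

# Turn the list of entries into a name -> class lookup.
classes = {entry['Identifier']: entry['Security'] for entry in security_map}

var_names = collect_var_names(ast)
var_classes = [(name, classes[name]) for name in var_names]
print(var_classes)
```

The traversal relies on dictionaries keeping insertion order (Python 3.7+), so identifiers come out in the order they first appear in the program.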
Identifier Vector:
[{'Identifier': 'Hdrw', 'Security': 'H'}, {'Identifier': 'hJzX', 'Security': 'H'}, {'Identifier': 'NWT', 'Security': 'H'}]
index 0
identifier 100
SecClass 1
identifier 101
SecClass 1
identifier 102
SecClass 1
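The listing above could be produced along the following lines. This is a sketch under assumptions: the leading `0` is read as the index of the vector itself, identifier codes are assigned in list order starting at 100, and only `'H' -> 1` comes from the notes. The `'L' -> 0` entry and the names `encode_identifiers`, `IDENTIFIER_BASE` and `SEC_CLASS_CODES` are hypothetical.

```python
IDENTIFIER_BASE = 100                 # first identifier code, as in the listing
SEC_CLASS_CODES = {'H': 1, 'L': 0}    # 'H' -> 1 is from the notes; 'L' -> 0 is assumed


def encode_identifiers(security_map, vector_index=0):
    """Return (flat identifier vector, identifier name -> code lookup)."""
    codes = {}
    vector = [vector_index]
    for offset, entry in enumerate(security_map):
        code = IDENTIFIER_BASE + offset
        codes[entry['Identifier']] = code
        # Each identifier contributes a pair: its code, then its security class.
        vector.extend([code, SEC_CLASS_CODES[entry['Security']]])
    return vector, codes


security_map = [{'Identifier': 'Hdrw', 'Security': 'H'},
                {'Identifier': 'hJzX', 'Security': 'H'},
                {'Identifier': 'NWT', 'Security': 'H'}]

identifier_vector, identifier_codes = encode_identifiers(security_map)
print(identifier_vector)   # [0, 100, 1, 101, 1, 102, 1]
print(identifier_codes)    # {'Hdrw': 100, 'hJzX': 101, 'NWT': 102}
```

Returning the name-to-code lookup alongside the vector lets a later program encoding refer to identifiers by the same codes.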
'Kind': 'Seq',
'Left': {
  'Kind': 'Assign',
  'Left': {
    'Kind': 'Var',
    'Name': 'NWT'
  },
  'Right': {
    'Kind': 'Null',
    'Value': 'Null'
  }
Program Vector:
index 1
Kind Seq 1000
Left 42
Kind Assign 1001
Left 42
Kind Var 00 //or delete
Var 102 // in Identifier Vector
Right 43
Kind 00 //or delete
Val 00 //or delete
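A pre-order traversal that emits these codes might look like the sketch below. It only covers the codes that appear in the notes (`Seq` 1000, `Assign` 1001, `Left` 42, `Right` 43, and `0` for omitted content) and takes the "keep a 00 placeholder" option instead of deleting. The function name `encode_node` and the constant names are hypothetical, and kinds without a code (for example `While` or `Add`) are simply collapsed to placeholders here.

```python
PROGRAM_VECTOR_INDEX = 1
KIND_CODES = {'Seq': 1000, 'Assign': 1001}   # codes taken from the listing
FIELD_CODES = {'Left': 42, 'Right': 43}      # codes taken from the listing
OMITTED = 0                                  # the "00" placeholder


def encode_node(node, identifier_codes):
    """Flatten one AST node (and its children) into a list of integer codes."""
    kind = node['Kind']
    if kind == 'Var':
        # The kind is omitted; the name is replaced by its identifier code.
        return [OMITTED, identifier_codes[node['Name']]]
    if kind not in KIND_CODES:
        # Unimportant content such as Null: placeholder for kind and value.
        return [OMITTED, OMITTED]
    encoded = [KIND_CODES[kind]]
    for field, field_code in FIELD_CODES.items():
        if field in node:
            encoded.append(field_code)
            encoded.extend(encode_node(node[field], identifier_codes))
    return encoded


# Same excerpt as above: only the left branch of the outer Seq.
fragment = {
    'Kind': 'Seq',
    'Left': {'Kind': 'Assign',
             'Left': {'Kind': 'Var', 'Name': 'NWT'},
             'Right': {'Kind': 'Null', 'Value': 'Null'}},
}
identifier_codes = {'Hdrw': 100, 'hJzX': 101, 'NWT': 102}

program_vector = [PROGRAM_VECTOR_INDEX] + encode_node(fragment, identifier_codes)
print(program_vector)   # [1, 1000, 42, 1001, 42, 0, 102, 43, 0, 0]
```

Choosing the "or delete" variant instead would mean returning only the identifier code for `Var` and an empty list for unimportant nodes, which shortens the vector but makes its layout depend on the program.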