EMME: a formal tool for ECMAScript Memory Model Evaluation |
Artifact Evaluation |
This package has been created for
the Artifact
Evaluation of TACAS 2018, which requires to be used on
Ubuntu 16.04 64bits, including: build-essential, cmake,
clang, mono-complete, openjdk-8-jdk, and ruby.
A
pre-configured virutal machine that satisfies these
requirements can be downloaded
from this
link.
License
The usage of EMME tool is regulated by modified BSD license (see EMME github page for more information).
For the additional tools included in this package (such as Google's V8, Mozilla's SpiderMonkey, and Apple's JSC) we refer to the information provided by their official websites.
Initial Setup
Download the package from this link and extract the ZIP file.
The package containts two folders: "EMME-master" containing the EMME program, and "TACAS18-tests" that collects all tests used in the paper.
In order to perform the initial setup we need to set some environment variables, thus execute the following commands:
$ cd EMME-master
$ source env_variables.source
Following we provide a user guide for the main analyses that can be performed with EMME.
Generate all valid executions of a given program
Providing a .bex file (Bounded Execution), EMME produces the valid executions. The syntax of a bex file can be is described below (see "Input Syntax" section).
The results of each analysis are stored in a folder with the same name as the input file e.g., folder "examples/program01" for the program "examples/program01.bex".
Example:
$ ./emme.py examples/single_var/sv_simple01.bex
Output:
** Processing file "examples/single_var/sv_simple01.bex" **
** Program Analysis **
Generating bounded execution... DONE
Generating model... DONE
Solving... DONE
-> Found 2 possible models
Computing expected outputs... DONE
Generating program... DONE
Generating expected outputs... DONE
-> Found 2 possible outputs
Exiting...
Generate the graphical representation of each valid execution
As described in the paper, EMME can produce a graphical representation of each possible valid execution. This can be done by relying on the parameter "-g", and the images will be stored in the program folder in eps format (i.e., gmm*.eps files). For a complete list of parameters we refer to the section "Command Options".
Example:
$ ./emme.py examples/single_var/sv_simple01.bex -g
$ xdg-open examples/single_var/sv_simple01
Force the computation of all valid executions
EMME stores previously computed results, and with the "-f" parameter we ask the tool to discharge the previous results.
Example:
$ ./emme.py examples/single_var/sv_simple01.bex -f
Enable the assertions in the memory model
As described in the paper, the Memory Model is enriched with assertions. These can be enabled by relying on the "--defines" parameter, which passes definitions to the cpp preprocessor.
Example:
$ ./emme.py examples/single_var/sv_simple01.bex --defines="en_ASSERTS=1"
Generate and run a litmus test
The generation of all valid executions produces also a JavaScript litmus test as "program.js" that can be run using a JavaScript engine.
Each engine requires small variations in the input formats, and EMME supports multiple definition of different JavaScript printers.
Different printers can be selected using the "-p" parameter.
A litmus test can be then run using "litmus.py", providing the number of executions with the parameter "-n", the litmus test with the parameter "-i", and the command used to run the Javascript engine with the parameter "-c".
With the distribution package we provide a pre-configured setup for:
- Google's V8, command "bash ext_tools/run_V8.sh"
- Mozilla's SpiderMonkey, command "bash ext_tools/run_SM.sh"
- Apple's JSC, command "bash ext_tools/run_JSC.sh"
Example:
$ ./emme.py examples/single_var/sv_simple01.bex -p=JS-TEST262-V8
$ ./litmus.py -n 1k -c="bash ext_tools/run_V8.sh" -i examples/single_var/sv_simple01/program.js
Output:
** Processing command "bash ext_tools/run_V8.sh examples/single_var/sv_simple01/program.js" **
Running...
.........
=== Results ===
+-----------+-------------+
| Frequency | Output 1    |
+-----------+-------------+
| 51.40%    | id3_R_t2: 1 |
| 48.60%    | id3_R_t2: 0 |
+-----------+-------------+
Coverage: 100.00%
Perform the Behavioral Coverage Constraints analysis
The behavioral coverage constraints analysis can be performed using EMME and providing the parameter "--unmatched", in addition to the command required to run the JavaScript engine (parameter "--jsengine"), and the correct printer with the parameter "-p". As described in the paper, for this analysis we need a program that produces a minimum set of different outputs (e.g., 5), and these tests are available in the folder "TACAS18-tests/ge5_outputs".
This analysis can be configured to provide different formulae expressing the relation between matched and unmatched outputs. This is possible by changing the file in "EMME-master/configs/unmatched.ini".
Examples:
$ ./emme.py ../TACAS18-tests/ge5_outputs/sv_6ev-100.bex --unmatched --jsengine="bash ext_tools/run_V8.sh" -p=JS-TEST262-V8
$ ./emme.py ../TACAS18-tests/ge5_outputs/sv_6ev-100.bex --unmatched --jsengine="bash ext_tools/run_SM.sh" -p=JS-TEST262-SM
Output:
** Processing file "../TACAS18-tests/ge5_outputs/sv_6ev-100.bex" **
** Program Analysis **
Generating bounded execution... DONE
Generating model... DONE
Solving... DONE
-> Found 21 possible models
Computing expected outputs... DONE
Generating program... DONE
Generating expected outputs... DONE
-> Found 21 possible outputs
** Unmatched Outputs Analysis **
Generating model... DONE
Running (x10) "bash ext_tools/run_V8.sh ../TACAS18-tests/ge5_outputs/sv_6ev-100/program.js"...ok
-> Found 6 matched models
-> Found 15 unmatched models
Matched models analysis
Solving... .. DONE
-> Found 2 solutions
(~L_RF_implies_HB & L_HB_implies_RF & ~L_RF_implies_SW & ~L_SW_implies_RF & ~L_SW4 & L_SW4c & L_SW4d & L_HB4a & L_HB4b & L_HB4c & L_HB4d) |
(L_RF_implies_HB & L_HB_implies_RF & L_RF_implies_SW & ~L_SW_implies_RF & L_SW4 & L_SW4c & L_SW4d & L_HB4a & L_HB4b & L_HB4c & L_HB4d)
Unmatched models analysis
Solving... .. DONE
-> Found 2 solutions
(~L_RF_implies_HB & L_HB_implies_RF & ~L_RF_implies_SW & ~L_SW_implies_RF & ~L_SW4 & L_SW4c & L_SW4d & L_HB4a & L_HB4b & L_HB4c & L_HB4d) |
(L_RF_implies_HB & L_HB_implies_RF & L_RF_implies_SW & ~L_SW_implies_RF & L_SW4 & L_SW4c & L_SW4d & L_HB4a & L_HB4b & L_HB4c & L_HB4d)
Difference analysis (exist support(matched) in unmatched)
VALID
Matched -> Unmatched
VALID
Unmatched -> Matched
VALID
Union analysis (matched | unmatched)
-> Found 2 solutions
(~L_RF_implies_HB & L_HB_implies_RF & ~L_RF_implies_SW & ~L_SW_implies_RF & ~L_SW4 & L_SW4c & L_SW4d & L_HB4a & L_HB4b & L_HB4c & L_HB4d) |
(L_RF_implies_HB & L_HB_implies_RF & L_RF_implies_SW & ~L_SW_implies_RF & L_SW4 & L_SW4c & L_SW4d & L_HB4a & L_HB4b & L_HB4c & L_HB4d)
Intersection analysis (matched & unmatched)
-> Found 2 solutions
(~L_RF_implies_HB & L_HB_implies_RF & ~L_RF_implies_SW & ~L_SW_implies_RF & ~L_SW4 & L_SW4c & L_SW4d & L_HB4a & L_HB4b & L_HB4c & L_HB4d) |
(L_RF_implies_HB & L_HB_implies_RF & L_RF_implies_SW & ~L_SW_implies_RF & L_SW4 & L_SW4c & L_SW4d & L_HB4a & L_HB4b & L_HB4c & L_HB4d)
Exiting...
Scalability analysis for all programs provided in the paper: all valid executions
The scalability analysis is performed on all programs in the folder "TACAS18-tests/all", and this evaluation is performed with EMME by relying on forcing the computation (parameter "-f"), recording the time computation (parameter "-t"), and reducing the size of the output produced (parameter "-l").
Command:
$ for file in ../TACAS18-tests/all/*.bex; do ./emme.py $file -tlf; done
Scalability analysis for all programs provided in the paper: behavioral coverage constraints analysis
Similarly for the scalability analysis, in this case we perform the analysis on all programs that produce at least 5 different outputs, but in this case we do not force the computation of all valid executions (parameter "-f").
Command:
$ for file in ../TACAS18-tests/ge5_outputs/*.bex; do ./emme.py $file -tl --unmatched --jsengine="bash ext_tools/run_V8.sh" -p=JS-TEST262-V8; done
Input Syntax
The syntax for a .bex file is reported below in BNF format, and the entry point is the "program" rule.
program ::= sab_defs + operations + threads + param_section
sab_defs ::= sab_def | (sab_des + sab_def)
threads ::= thread | (threads + thread)
thread ::= "Thread" + varname + "{" + operations + "}"
operations ::= (operation + ";") | (operations + ";" + operation)
operation ::= ifcond | forcond | read | write
bool_cond ::= (read | value) + (op_cond | pname) + (read | value)
op_cond ::= ("==" | "<" | "<=" | ">" | ">=")
ifcond ::= "if(" bool_cond ")" + "{" + operations + "}" + "else" + "{" + operations + "}"
forcond ::= "for(" + varname + "=" + value + ".." + value + ")" + "{" + operations + "}"
read ::= unordered_read | atomic_read
write ::= unordered_write | atomic_write
unordered_read ::= varname + type + "[" + address + "]"
unordered_write ::= varname + type + "[" + address + "]" + "=" + value
atomic_read ::= "Atomics.load" + "(" + varname + type + "," + address + ")"
atomic_write ::= "Atomics.store" + "(" + varname + type + "," + address + "," + value + ")"
param_section ::= "Params" + "{" + parameters + "}"
parameters ::= parameter | (parameters + ";" + parameter)
parameter ::= passign | prange
passign ::= "val_" + varname + "=" + (ivalue | fvalue)
prange ::= "val_" + varname + "=" + ivalue + ".." + ivalue
pset ::= "val_" + varname + "=" + (ivalue ",")* + ivalue
pcond ::= "op_" + varname + "=" + "[" + (op_cond ",")* + op_cond + "]"
type ::= "-" + ("I8" | "I16" | "I32" | "F32" | "F64")
varname ::= [a-zA-Z]+ [0-9]*
value ::= ivalue | fvalue | pname
ivalue ::= [0-9]+
fvalue ::= [0-9]+ "." [0-9]+
pname ::= ("<val_" | "<op_") + varname + ">"
Command Options for EMME
usage: emme.py [-h] [-p [jsprinter]] [-d [jsdir]] [-g] [-r [relations]]
[-x [prefix]] [--synth] [--unmatched] [--jsengine [jsengine]] [-n runs] [-c] [-b] [-v verbosity] [-e nexecs] [-f] [-k] [-l] [-s] [-m] [-j number] [--debug] [-t] [--no-exbounded] [--defines [defines]] [--preproc [preproc]] program EMME: ECMAScript Memory Model Evaluator positional arguments: program the input file describing the program optional arguments: -h, --help show this help message and exit -p [jsprinter], --jsprinter [jsprinter] select the JS printer between (Default is "JS-TEST262"): - "JS-TEST262": TEST262 format (Standard) - "BE": Bounded Execution format - "JSON": JSON format - "JS-TEST262-JSC": TEST262 format (Accepted by JSC) - "JS-TEST262-SM": TEST262 format (Accepted by SM) - "JS-TEST262-V8": TEST262 format (Accepted by V8) - "JS-TEST262-W-V8": TEST262 format with WASM (Accepted by V8) - "JS-TEST262-W-JSC": TEST262 format with WASM (Accepted by JSC) -d [jsdir], --jsdir [jsdir] directory where to store all JS programs. (Default is the same as the input file) -g, --graphviz generates the eps files of each execution (requires neato). (Default is "False") -r [relations], --relations [relations] a (comma separated) list of relations to consider in the graphviz file. Keyword "all" means all. -x [prefix], --prefix [prefix] directory where to store the results. (Default is the same as the input file) --synth enables equivalent programs synthesis. (Default is "False") --unmatched enables unmatched outputs analysis. (Default is "False") --jsengine [jsengine] the command used to call the JavaScript engine, to use with "--unmatched". -n runs, --runs runs number of runs for the unmatched outputs analysis, to use with "--unmatched". (Default is "10") -c, --use-cvc4 relies on CVC4 instead of Alloy Analyzer. (Default is "False") -b, --best relies on CVC4 or Alloy Analyzer for best performance. (Default is "False") -v verbosity verbosity level. (Default is "1") -e nexecs, --max-executions nexecs maximum number of executions. (Default is "unlimited") -f, --force-solving forces the solving part by discharging the previous models. (Default is "False") -k, --skip-solving skips the solving part. (Default is "False") -l, --silent silent mode. (Default is "False") -s, --only-sat performs only the satisfiability checking. (Default is "False") -m, --only-model exits right after the model generation. (Default is "False") -j number, --threads number number of threads - EXPERIMENTAL. (Default is "1") --debug enables debugging setup. (Default is "False") -t, --time enables time debugging setup. (Default is "False") --no-exbounded disables the bounded sets quantifier expansion. (Default is "False") --defines [defines] the set of preprocessor's defines. (Default is none) --preproc [preproc] the memory model preprocessor. (Default is "cpp")
Command Options for Litmus
usage: litmus.py [-h] -c command [-i input_file] [-o outputs] [-m] [-n number] [-j number] [-s] [-t] Litmus tests checking. optional arguments: -h, --help show this help message and exit -c command, --command command the command to be executed -i input_file, --input_file input_file input file -o outputs, --outputs outputs possible outputs -m, --models evaluates models -n number, --number number total executions -j number, --threads number number of threads -s, --silent show only PASS/FAIL -t, --total show the total number of matches