Data Files

There are three kinds of data files in MLSA:

  1. data files that contain a monolingual Abstract Syntax Tree (AST) in text format
  2. data files that contain a monolingual AST in JSON format
  3. data files in comma-separated values (CSV) format that contain the results of various kinds of static analysis

If a source code file is called NAME.X, where NAME is the root file name and X is the language suffix (e.g. test.cpp or analyze.py), then the data files are named using this root file name as follows:

  • AST files: NAME.X_ast.txt or NAME.X_ast.json
  • Monolingual procedure call graph files: NAME.X_call.csv
  • Monolingual procedure call graph files with API integration: NAME.X_finalcall.csv
  • Combined multilingual call graph file: NAME_callgraph.csv
  • Combined function file: NAME_funcs.csv
  • Forward control flow graph file: NAME.X_fcfg.csv
  • Reverse control flow graph file: NAME.X_rcfg.csv
  • Monolingual variable assignments: NAME.X_vars.csv
  • Monolingual reaching definitions analysis: NAME.X_rda.csv
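
The naming scheme above can be sketched in Python. This is a minimal illustration, not part of MLSA itself: the function name data_file_names and the dictionary keys are hypothetical, and it assumes that NAME is everything before the last dot of the source file name.

```python
from pathlib import Path


def data_file_names(source_file: str) -> dict[str, str]:
    """Return the expected data file names for one source file.

    Hypothetical helper: the keys are labels chosen for this sketch only.
    """
    name_x = Path(source_file).name  # NAME.X, e.g. "test.cpp"
    name = Path(source_file).stem    # NAME, e.g. "test" (assumed: text before the last dot)
    return {
        "ast_text": f"{name_x}_ast.txt",
        "ast_json": f"{name_x}_ast.json",
        "call": f"{name_x}_call.csv",
        "finalcall": f"{name_x}_finalcall.csv",
        "callgraph": f"{name}_callgraph.csv",
        "funcs": f"{name}_funcs.csv",
        "fcfg": f"{name_x}_fcfg.csv",
        "rcfg": f"{name_x}_rcfg.csv",
        "vars": f"{name_x}_vars.csv",
        "rda": f"{name_x}_rda.csv",
    }


names = data_file_names("test.cpp")
print(names["call"])       # test.cpp_call.csv
print(names["callgraph"])  # test_callgraph.csv
```

Note that the per-language files keep the language suffix (NAME.X), while the two combined files use only the root name (NAME).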

The CSV file formats are as follows:

  • NAME.X_call.csv
    • call id, class, scope, function name, argument1, argument2...
  • NAME.X_finalcall.csv
    • call id, class name, scope, function called, argument1, argument2...
  • NAME_callgraph.csv
    • call program name, call program type, function program name, call id, class name, scope, function called, argument1, argument2...
  • NAME_funcs.csv
    • program name, class name, function name, number of parameters
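
Because the call files end with a variable number of arguments, a reader has to treat the first four columns as fixed and collect the rest. The following Python sketch shows one way to do that for a NAME.X_call.csv file. It assumes a plain comma-delimited file with no header row; the names CallRecord and read_call_records and the sample rows are invented for illustration and do not come from MLSA.

```python
import csv
import io
from typing import Iterable, NamedTuple


class CallRecord(NamedTuple):
    """One row of a call file (field names chosen for this sketch)."""
    call_id: str
    class_name: str
    scope: str
    function: str
    arguments: list[str]


def read_call_records(lines: Iterable[str]) -> list[CallRecord]:
    """Parse rows of the form: call id, class, scope, function name, argument1, ..."""
    records = []
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) < 4:
            continue  # skip blank or incomplete rows
        call_id, class_name, scope, function, *arguments = row
        records.append(CallRecord(call_id, class_name, scope, function, arguments))
    return records


# Invented sample data, standing in for the contents of a NAME.X_call.csv file.
sample = io.StringIO("1,None,main,printf,msg\n2,None,main,compute,x,y\n")
records = read_call_records(sample)
for record in records:
    print(record.call_id, record.function, record.arguments)
```

To read a real file, an open file object can be passed in place of the StringIO sample, for example open("test.cpp_call.csv", newline="").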


-- (c) Fordham University Robotics and Computer Vision

