Hariharan Devarajan edited this page Jun 26, 2020 · 3 revisions

Application Programming Interface

The VaniDL library reads Darshan and TensorFlow logs and produces high-level I/O insights about the application through drill-up and drill-down analysis of its I/O traces.

Initialization of Analyzer

from src.vanidl import VaniDL
analyzer = VaniDL()
status = analyzer.Load(darshan_file, preprocessed_dir, data_paths_include)
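
The call above returns a status value, but this page does not say what it contains. As a minimal sketch, the helper below fails early when the log file is missing or the returned status is falsy. It assumes a truthy status means success; `load_or_raise` is a hypothetical helper and not part of VaniDL.

```python
import os


def load_or_raise(analyzer, darshan_file, preprocessed_dir, data_paths_include):
    """Call analyzer.Load and raise if the trace could not be loaded.

    Assumption: Load returns a truthy value on success. This helper is
    illustrative only and is not provided by VaniDL.
    """
    if not os.path.isfile(darshan_file):
        raise FileNotFoundError(f"Darshan log not found: {darshan_file}")
    status = analyzer.Load(darshan_file, preprocessed_dir, data_paths_include)
    if not status:
        raise RuntimeError(f"Analyzer could not load {darshan_file}")
    return analyzer
```

With VaniDL importable, this could be called as `load_or_raise(VaniDL(), darshan_file, preprocessed_dir, data_paths_include)` using the same three arguments as the call above.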

Darshan raw records as DataFrame

# Darshan basic trace records
darshan_normal_df = analyzer.GetTraceAsDF()

# Darshan extended trace (DXT) records
darshan_dxt_df = analyzer.GetDXTAsDF()

Summarization Calls

# I/O summarization for the given trace.
summary = analyzer.GetSummary()

# File summary 
file_summary = analyzer.GetFileSummary(filename)

I/O Analysis

# Get the I/O time, optionally filtered by filename and rank.
io_time = analyzer.GetIOTime(filename, rank)
# Get the I/O size for the whole application, optionally filtered by filename and rank.
io_size = analyzer.GetIOSize(filename, rank)
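
To illustrate how optional filename and rank filters narrow an aggregate, the sketch below sums bytes and time over a few hand-written records. The `IORecord` fields and the `io_totals` function are invented for this example and do not reflect VaniDL's internal data layout or implementation.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class IORecord(NamedTuple):
    """Hypothetical trace record; not VaniDL's actual schema."""
    filename: str
    rank: int
    size_bytes: int
    duration_s: float


def io_totals(records: Iterable[IORecord],
              filename: Optional[str] = None,
              rank: Optional[int] = None) -> Tuple[int, float]:
    """Return (total bytes, total seconds) for records matching the filters.

    A filter left as None matches every record.
    """
    total_bytes = 0
    total_time = 0.0
    for rec in records:
        if filename is not None and rec.filename != filename:
            continue
        if rank is not None and rec.rank != rank:
            continue
        total_bytes += rec.size_bytes
        total_time += rec.duration_s
    return total_bytes, total_time


records = [
    IORecord("/data/a.h5", 0, 4096, 0.5),
    IORecord("/data/a.h5", 1, 8192, 0.25),
    IORecord("/data/b.h5", 0, 1024, 0.125),
]
print(io_totals(records))                                 # whole application
print(io_totals(records, filename="/data/a.h5"))          # one file, all ranks
print(io_totals(records, filename="/data/a.h5", rank=1))  # one file, one rank
```

Note that this sketch simply adds up per-operation durations. When ranks perform I/O concurrently, that sum can exceed the wall-clock I/O time, so it should not be read as how VaniDL defines I/O time.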

Data Access Pattern Analysis

# Get the transfer size distribution, optionally filtered by filename and rank.
request_histogram = analyzer.GetIORequestDistribution(filename, rank, operation, bins, threshold)
# Get the data access pattern for the application or per file.
access_pattern = analyzer.GetAccessPattern(filepath)
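
For readers unfamiliar with these two analyses, the sketch below shows one simple way to compute a transfer size histogram and a sequential-access ratio from a list of `(offset, length)` requests. It is a generic illustration under stated assumptions: the input layout, the function names, and the definition of "sequential" (a request starts exactly where the previous one ended) are all chosen for this example and are not taken from VaniDL.

```python
import bisect
from typing import List, Sequence, Tuple


def size_histogram(sizes: Sequence[int], upper_bounds: List[int]) -> List[int]:
    """Count requests per size bin.

    upper_bounds must be sorted; bin i holds sizes <= upper_bounds[i] that
    did not fit an earlier bin. The final extra bin holds larger sizes.
    """
    counts = [0] * (len(upper_bounds) + 1)
    for size in sizes:
        counts[bisect.bisect_left(upper_bounds, size)] += 1
    return counts


def sequential_fraction(accesses: Sequence[Tuple[int, int]]) -> float:
    """Fraction of requests that start where the previous request ended."""
    if len(accesses) < 2:
        return 1.0
    sequential = sum(
        1
        for (prev_off, prev_len), (off, _) in zip(accesses, accesses[1:])
        if off == prev_off + prev_len
    )
    return sequential / (len(accesses) - 1)


# Hypothetical requests to one file, in the order they were issued.
accesses = [(0, 4096), (4096, 4096), (8192, 4096), (65536, 4096), (69632, 4096)]
sizes = [length for _, length in accesses] + [100, 65536, 1048576]

print(size_histogram(sizes, [4096, 65536, 1048576]))
print(sequential_fraction(accesses))
```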

Timeline of Application

# Create a timeline of I/O operations for the application. This can be filtered per file or per rank.
timeline = analyzer.CreateIOTimeline(filename, rank, time_step)
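
As a rough sketch of what a timeline with a `time_step` can look like, the code below groups I/O events into fixed-width time buckets and reports the operation count and bytes moved per bucket. The event layout, the unit of `time_step` (seconds), and the `build_timeline` function are assumptions made for illustration; the page does not document the structure that `CreateIOTimeline` returns.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def build_timeline(events: Iterable[Tuple[float, int]],
                   time_step: float) -> List[Tuple[float, int, int]]:
    """Bucket (start_time_s, size_bytes) events into time_step-wide windows.

    Returns (bucket_start_s, operation_count, total_bytes) tuples sorted by
    time. Buckets with no events are omitted.
    """
    if time_step <= 0:
        raise ValueError("time_step must be positive")
    buckets = defaultdict(lambda: [0, 0])
    for start, size in events:
        index = int(start // time_step)
        buckets[index][0] += 1
        buckets[index][1] += size
    return [(index * time_step, count, total)
            for index, (count, total) in sorted(buckets.items())]


# Hypothetical events: (start time in seconds, bytes transferred).
events = [(0.1, 100), (0.4, 200), (1.2, 50), (3.9, 10)]
for bucket_start, count, total in build_timeline(events, time_step=1.0):
    print(bucket_start, count, total)
```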

# Create a Darshan I/O timeline file (json.gz) that can be viewed in chrome://tracing
io_chrome_timeline = analyzer.CreateChromeTimeline()

# Merge the Darshan I/O timeline with the TensorBoard timeline for a holistic application view.
app_chrome_timeline = analyzer.CreateMergedTimeline(tensorboard_dir, output_dir, output_file)
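
Files that open in chrome://tracing follow Chrome's Trace Event format: a JSON object with a `traceEvents` list, where each complete event uses `"ph": "X"` and gives `ts` and `dur` in microseconds. The sketch below writes a small gzip-compressed trace in that format. The input event fields, the choice to map rank to `tid`, and the `write_chrome_trace` function are assumptions for illustration and are not claimed to match the files VaniDL generates.

```python
import gzip
import json
from typing import Dict, List


def write_chrome_trace(events: List[Dict], path: str) -> None:
    """Write events as a gzip-compressed Chrome Trace Event JSON file.

    Each input event is a dict with hypothetical keys:
    name (str), rank (int), start_s (float), duration_s (float).
    """
    trace_events = [
        {
            "name": event["name"],
            "ph": "X",                                   # complete event
            "ts": int(round(event["start_s"] * 1e6)),    # start, microseconds
            "dur": int(round(event["duration_s"] * 1e6)),
            "pid": 0,                                    # single process group
            "tid": event["rank"],                        # one row per rank
        }
        for event in events
    ]
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        json.dump({"traceEvents": trace_events}, handle)


def read_chrome_trace(path: str) -> List[Dict]:
    """Load the traceEvents list back from a gzip-compressed trace file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)["traceEvents"]
```

A file written this way, for example `write_chrome_trace(events, "timeline.json.gz")`, can be loaded through the Load button in chrome://tracing.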