First draft at FloatTracker → TrackedFloats rename
ashton314 committed Aug 1, 2024
1 parent 0f76599 commit 2f5e3f5
Showing 12 changed files with 126 additions and 45 deletions.
4 changes: 2 additions & 2 deletions CHANGELOG.md
@@ -6,7 +6,7 @@ That said, this is research software, so expect some instability as we aim first

## 1.0.1

Enable Inf injection (issue [#41](https://github.com/utahplt/FloatTracker.jl/issues/41))
Enable Inf injection (issue [#41](https://github.com/utahplt/TrackedFloats.jl/issues/41))

## 1.0.0

@@ -56,7 +56,7 @@ Improved the last-ditch effort to extract the module name from a stack frame. No

### Added

Event limit in logger works: set `maxLogs` to control how many events get logged. FloatTracker stops collecting stack traces after this, so should run much faster once the threshold has been hit. Defaults to `Unbounded()`.
Event limit in logger works: set `maxLogs` to control how many events get logged. TrackedFloats stops collecting stack traces after this, so should run much faster once the threshold has been hit. Defaults to `Unbounded()`.

## 0.2.0

2 changes: 1 addition & 1 deletion Project.toml
@@ -1,4 +1,4 @@
name = "FloatTracker"
name = "TrackedFloats"
uuid = "1722352a-da05-4a89-9ace-0f2f7ef630bb"
authors = ["Taylor Allred <[email protected]>", "Ashton Wiersdorf <[email protected]>", "Ben Greenman <[email protected]>"]
version = "1.0.1"
54 changes: 28 additions & 26 deletions README.md
@@ -1,30 +1,32 @@
# FloatTracker.jl
# TrackedFloats.jl (formerly FloatTracker.jl)

[![CI](https://github.com/utahplt/FloatTracker.jl/actions/workflows/CI.yml/badge.svg?branch=main)](https://github.com/utahplt/FloatTracker.jl/actions/workflows/CI.yml)
⚠️ **NOTICE** ⚠️ We are in the process of renaming FloatTracker → TrackedFloats to bring this in line with Julia package naming conventions. Please be patient.

[![CI](https://github.com/utahplt/TrackedFloats.jl/actions/workflows/CI.yml/badge.svg?branch=main)](https://github.com/utahplt/TrackedFloats.jl/actions/workflows/CI.yml)

Track `NaN` and `Inf` generation and propagation in your code.

Available on [JuliaHub](https://juliahub.com/ui/Packages/FloatTracker/dBXig/1.0.0)
Available on [JuliaHub](https://juliahub.com/ui/Packages/TrackedFloats/dBXig/1.0.0)

# Synopsis

```julia
# Pull in FloatTracker
using FloatTracker
# Pull in TrackedFloats
using TrackedFloats

# Wrap inputs in a TrackedFloat* type
num = TrackedFloat64(-42.0)

# Watch as a NaN gets born
should_be_nan = sqrt(num)

# Flush FloatTracker's logs
# Flush TrackedFloats's logs
ft_flush_logs()
```

# Description

`FloatTracker.jl` is a library that provides three new types: `TrackedFloat16`, `TrackedFloat32`, and `TrackedFloat64`.
`TrackedFloats.jl` is a library that provides three new types: `TrackedFloat16`, `TrackedFloat32`, and `TrackedFloat64`.
These behave just like their `FloatN` counterparts except that they detect and log instances of exceptional floating point values. (E.g. `NaN` or `Inf`)

There are three kinds of events that can happen during the lifetime of an exceptional floating point value:
@@ -41,12 +43,12 @@ JuliaCon 2023 talk by Ashton:

<a href="https://www.youtube.com/live/rMrHCM1Etng?si=fK0Y3WYiFOzJYQ4V&t=10147"><img src="https://github-production-user-asset-6210df.s3.amazonaws.com/1731829/266869983-661b7d36-ca5f-489f-bda4-30591ebb25d7.png" /></a>

<!-- https://github.com/utahplt/FloatTracker.jl/assets/1731829/661b7d36-ca5f-489f-bda4-30591ebb25d7 -->
<!-- https://github.com/utahplt/TrackedFloats.jl/assets/1731829/661b7d36-ca5f-489f-bda4-30591ebb25d7 -->


## Example
```julia
using FloatTracker
using TrackedFloats

config_logger(filename="max")

@@ -81,13 +83,13 @@ One uses the builtin `<` operator and the other uses Julia's `max` function. Whe

Note that the result of this program is *wrong*: instead of the true maximum value of the list (`5.0`) getting returned, the bad version returns `4.0`!
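
The NaN kill described above can be reproduced with plain `Float64` values, without TrackedFloats. The following is a minimal sketch, not the README's own example (which is elided in this diff): the function names and the input list are assumptions chosen so that the comparison-based version returns `4.0` even though the true maximum is `5.0`.

```julia
# Plain Float64 sketch; `maximum_with_leq` and `maximum_with_max` are
# hypothetical names, not part of TrackedFloats.

# Comparison-based maximum: every comparison against NaN is false, so a NaN
# is first picked up as the "maximum" and then silently replaced (killed).
function maximum_with_leq(xs)
    max_seen = 0.0
    for x in xs
        if !(x <= max_seen)
            max_seen = x
        end
    end
    return max_seen
end

# Fold with Julia's `max`, which propagates NaN instead of dropping it.
maximum_with_max(xs) = foldl(max, xs)

samples = [1.0, 5.0, NaN, 4.0]
killed = maximum_with_leq(samples)      # the NaN is lost; result is 4.0
propagated = maximum_with_max(samples)  # the NaN survives to the result

println("comparison-based: ", killed)
println("max-based:        ", propagated)
```

The comparison-based version gives no hint that a NaN was ever involved, which is the kind of event the log is meant to surface.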

We can see this in the log that is produced by FloatTracker when running this file.
We can see this in the log that is produced by TrackedFloats when running this file.

```
[NaN] check_error at /Users/ashton/.julia/dev/FloatTracker/src/TrackedFloat.jl:11
< at /Users/ashton/.julia/dev/FloatTracker/src/TrackedFloat.jl:214
maximum at /Users/ashton/Research/FloatTrackerExamples/max_example.jl:0
top-level scope at /Users/ashton/Research/FloatTrackerExamples/max_example.jl:20
[NaN] check_error at /Users/ashton/.julia/dev/TrackedFloats/src/TrackedFloat.jl:11
< at /Users/ashton/.julia/dev/TrackedFloats/src/TrackedFloat.jl:214
maximum at /Users/ashton/Research/TrackedFloatsExamples/max_example.jl:0
top-level scope at /Users/ashton/Research/TrackedFloatsExamples/max_example.jl:20
eval at ./boot.jl:370
include_string at ./loading.jl:1899
_include at ./loading.jl:1959
@@ -103,11 +105,11 @@ This tool may be useful for debugging those sorts of issues.

## Usage

1. Call `using FloatTracker`; you may want to include functions like `enable_nan_injection` or `config_logger` or the like. (See below for more details.)
1. Call `using TrackedFloats`; you may want to include functions like `enable_nan_injection` or `config_logger` or the like. (See below for more details.)
2. Add additional customization to logging and injection.
3. Wrap as many of your inputs in `TrackedFloatN` as you can.

FloatTracker should take care of the rest!
TrackedFloats should take care of the rest!

Digging into step 2, there are two things that you can customize after initialization:

@@ -163,7 +165,7 @@ Keyword arguments for `config_logger`:

### Configuring the injector

FloatTracker can *inject* `NaN`s at random points in your program to help you find places where you might not be handling exceptional values properly: this technique can help you find `NaN` kills before they happen in a production environment.
TrackedFloats can *inject* `NaN`s at random points in your program to help you find places where you might not be handling exceptional values properly: this technique can help you find `NaN` kills before they happen in a production environment.

```julia
# Inject 2 NaNs
@@ -218,7 +220,7 @@ Most of the time comparison operators are what kill a NaN. But `^` can kill NaNs

# Fuzzing and Recording injections

FloatTracker allows you to fuzz code and inject NaNs or Infs wherever a `TrackedFloat` type is used. Moreover, you can record these injections to rerun injections.
TrackedFloats allows you to fuzz code and inject NaNs or Infs wherever a `TrackedFloat` type is used. Moreover, you can record these injections to rerun injections.

**WARNING:** it is critical that inputs to the program be exactly the same for recording and replaying to be consistent. The recordings are sensitive to the number of times a floating point operation is hit.

@@ -235,7 +237,7 @@ The checks in the purple region cost the most time, so we do those as late as po

Sometimes we want to inject NaNs throughout the program. We can create a "recording session" that checks, before each injection, whether that point has been tried before. If it has, we move on and try again at the next injection point.

We can tell FloatTracker what we consider to be identical injection points. **TODO:** how *do* we tell FloatTracker what we consider to be the same and not the same? Function boundaries?
We can tell TrackedFloats what we consider to be identical injection points. **TODO:** how *do* we tell TrackedFloats what we consider to be the same and not the same? Function boundaries?
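
The "tried before" check can be sketched with a set of already-used injection points. This is only an illustration under stated assumptions: `RecordingSession` and `try_inject!` are hypothetical names rather than the TrackedFloats API, and keying a point by a `Module::function` string is a guess, since how identical points are defined is still the open TODO above.

```julia
# Hypothetical sketch; neither the type nor the function exists in TrackedFloats.
struct RecordingSession
    tried::Set{String}   # keys of injection points that were already used
end
RecordingSession() = RecordingSession(Set{String}())

# Returns true (and remembers the point) if `point_key` has not been tried
# yet; returns false so the caller can move on to the next injection point.
function try_inject!(session::RecordingSession, point_key::AbstractString)
    point_key in session.tried && return false
    push!(session.tried, String(point_key))
    return true
end

session = RecordingSession()
first_visit  = try_inject!(session, "OrdinaryDiffEq::solve")  # new point
second_visit = try_inject!(session, "OrdinaryDiffEq::solve")  # already tried
other_point  = try_inject!(session, "Finch::make_it_so")      # new point
```

A coarser key (for example, only the module name) would treat more points as identical and so inject at fewer places.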

## Recording internals

@@ -247,13 +249,13 @@ Injection points are saved to a *recording file*, where each line denotes an inj
42, solve.jl, OrdinaryDiffEq::solve OrdinaryDiffEq::do_it Finch::make_it_so
```

The first field `42` is the injection point, or the nth time a floating point operation was intercepted by FloatTracker. The second field `solve.jl` acts as a little sanity check: this is the first non-FloatTracker file off of the stack trace. After that comes a list of module names paired with the function on the call stack.
The first field `42` is the injection point, or the nth time a floating point operation was intercepted by TrackedFloats. The second field `solve.jl` acts as a little sanity check: this is the first non-TrackedFloats file off of the stack trace. After that comes a list of module names paired with the function on the call stack.
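
To make the three fields concrete, here is a small sketch that splits one recording line into its parts. It assumes the comma-and-space layout shown in the sample line above. `RecordedInjection` and `parse_recording_line` are hypothetical names, not the library's own reader.

```julia
# Hypothetical parser for one recording-file line; not TrackedFloats code.
struct RecordedInjection
    injection_point::Int                   # nth intercepted floating point operation
    check_file::String                     # first non-TrackedFloats file on the stack
    frames::Vector{Tuple{String,String}}   # (module, function) pairs from the call stack
end

function parse_recording_line(line::AbstractString)
    # Split into at most three fields so the frame list stays in one piece.
    fields = split(line, ","; limit=3)
    length(fields) == 3 || error("expected three comma-separated fields")
    point = parse(Int, strip(fields[1]))
    file = String(strip(fields[2]))
    frames = Tuple{String,String}[]
    for entry in split(fields[3])          # whitespace-separated Module::function
        parts = split(entry, "::")
        length(parts) == 2 || error("expected Module::function, got $entry")
        push!(frames, (String(parts[1]), String(parts[2])))
    end
    return RecordedInjection(point, file, frames)
end

rec = parse_recording_line("42, solve.jl, OrdinaryDiffEq::solve OrdinaryDiffEq::do_it Finch::make_it_so")
println(rec.injection_point, " ", rec.check_file, " ", rec.frames)
```

Because the first field counts intercepted operations, a replay only lines up if the program hits floating point operations in the same order, which is why the inputs must be identical between recording and replaying.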

# Generating CSTGs

Get the [CSTG](https://github.com/utahplt/cstg) code.

Run a program that uses TrackedFloats (e.g. from the [example repository](https://github.com/utahplt/FloatTrackerExamples)).
Run a program that uses TrackedFloats (e.g. from the [example repository](https://github.com/utahplt/TrackedFloatsExamples)).
By default, a file with `*error_log*` in its name should appear.

Generate a graph using the error log:
@@ -272,13 +274,13 @@ For more about CSTG, please see the original paper:
# Examples

Examples have been moved from this repository to an [example repository](https://github.com/utahplt/FloatTrackerExamples)—this allows us to keep the dependencies in this repository nice and light.
Examples have been moved from this repository to an [example repository](https://github.com/utahplt/TrackedFloatsExamples)—this allows us to keep the dependencies in this repository nice and light.

# Julia and GPU programming

FloatTracker works on the CPU. If a Julia function calls a GPU kernel, then you can track exceptions inside the GPU execution using our companion tool [GPU-FPX](https://github.com/LLNL/GPU-FPX) developed by Xinyi Li for her PhD. This will allow you to (1) see the exception flows inside the kernel, (2) whether the exceptions got killed inside the kernel, and if the exceptions were present in the return result of the Julia GPU call, then (3) FloatTracker will show how that exception further flows through the Julia code. You get this full effect by running your Julia Command under `LD_PRELOAD`.
TrackedFloats works on the CPU. If a Julia function calls a GPU kernel, then you can track exceptions inside the GPU execution using our companion tool [GPU-FPX](https://github.com/LLNL/GPU-FPX) developed by Xinyi Li for her PhD. This will allow you to (1) see the exception flows inside the kernel, (2) whether the exceptions got killed inside the kernel, and if the exceptions were present in the return result of the Julia GPU call, then (3) TrackedFloats will show how that exception further flows through the Julia code. You get this full effect by running your Julia Command under `LD_PRELOAD`.

For details of `LD_PRELOAD` and to obtain and install GPU-FPX, please visit the [GPU-FPX repository](https://github.com/LLNL/GPU-FPX) and ask its authors for assistance if needed. For help on using FloatTracker in conjunction with this tool, talk to us.
For details of `LD_PRELOAD` and to obtain and install GPU-FPX, please visit the [GPU-FPX repository](https://github.com/LLNL/GPU-FPX) and ask its authors for assistance if needed. For help on using TrackedFloats in conjunction with this tool, talk to us.

# Running tests

@@ -293,7 +295,7 @@ or via the Julia shell:
```
julia> ] # enter the package shell
pkg> activate .
(FloatTracker) pkg> test
(TrackedFloats) pkg> test
```

# License
@@ -304,7 +306,7 @@ MIT License

Inspired by [Sherlogs.jl](https://github.com/milankl/Sherlogs.jl).

This repository originally lived in [Taylor Allred's repository](https://github.com/tcallred/FloatTracker.jl).
This repository originally lived in [Taylor Allred's repository](https://github.com/tcallred/TrackedFloats.jl).

# Citation

12 changes: 6 additions & 6 deletions src/Config.jl
@@ -1,5 +1,5 @@
#
# Unified config for FloatTracker
# Unified config for TrackedFloats
# ===============================
#
# Contents:
@@ -218,7 +218,7 @@ function SessionConfig()
end

"""
FloatTracker config struct
TrackedFloats config struct
## Logger Config
## Injector Config
@@ -240,7 +240,7 @@ ft_config = nothing
"""
ft_init()
Initialize the global FloatTracker configuration. (Automatically called when using function by `__init__`)
Initialize the global TrackedFloats configuration. (Automatically called when using function by `__init__`)
We need to make this a function, otherwise it can cache the value of the
timestamp used for writing unique log files.
@@ -264,7 +264,7 @@ export ft__get_global_ft_config_for_test
config_logger(log::LoggerConfig)
config_logger(; args...)
Set the logger for the global FloatTracker configuration instance.
Set the logger for the global TrackedFloats configuration instance.
Takes either a `LoggerConfig` struct, or the same keyword arguments as the
`LoggerConfig` constructor.
@@ -280,7 +280,7 @@ config_logger(; args...) = patch_config!(ft_config.log; args...)
config_injector(log::InjectorConfig)
config_injector(; args...)
Set the injector for the global FloatTracker configuration instance.
Set the injector for the global TrackedFloats configuration instance.
Takes either a `InjectorConfig` struct, or the same keyword arguments as the
`InjectorConfig` constructor.
@@ -295,7 +295,7 @@ config_injector(; args...) = patch_config!(ft_config.inj; args...)
config_session(log::SessionConfig)
config_session(; args...)
Set the session for the global FloatTracker configuration instance.
Set the session for the global TrackedFloats configuration instance.
Takes either a `SessionConfig` struct, or the same keyword arguments as the
`SessionConfig` constructor.
4 changes: 2 additions & 2 deletions src/Injector.jl
@@ -92,7 +92,7 @@ end
end

@inline function drop_ft_frames(frames)
collect(Iterators.dropwhile((frame -> frame_library(frame) === "FloatTracker"), frames))
collect(Iterators.dropwhile((frame -> frame_library(frame) === "TrackedFloats"), frames))
end

"""
@@ -102,7 +102,7 @@ Returns whether or not the current point in the code (indicated by the
StackTrace) is a valid point to inject.
"""
function injectable_region(i::InjectorConfig, raw_frames::StackTraces.StackTrace)::Bool
# Drop FloatTracker frames
# Drop TrackedFloats frames
frames = drop_ft_frames(raw_frames)

# If neither functions nor libraries are specified, inject as long as we're
4 changes: 2 additions & 2 deletions src/FloatTracker.jl → src/TrackedFloats.jl
@@ -1,11 +1,11 @@
module FloatTracker
module TrackedFloats

export FtConfig, ft_init, TrackedFloat16, TrackedFloat32, TrackedFloat64, FunctionRef
export LoggerConfig, config_logger, exclude_stacktrace, print_log, ft_flush_logs
export InjectorConfig, config_injector, enable_nan_injection, disable_nan_injection, enable_inf_injection, disable_inf_injection, record_injection, replay_injection
export SessionConfig, config_session

include("SharedStructs.jl") # Structures used in multiple places throughout FloatTracker
include("SharedStructs.jl") # Structures used in multiple places throughout TrackedFloats
include("Config.jl") # Primary interface routines: routines to control injection, logging, etc.
include("Event.jl") # Routines for diagnosing exceptional events
include("Logger.jl") # Formatting and writing of error/event logs
79 changes: 79 additions & 0 deletions src/tf_counter.jl
@@ -0,0 +1,79 @@
abstract type AbstractTrackedFloat <: AbstractFloat end

for TrackedFloatN in (:TrackedFloat16, :TrackedFloat32, :TrackedFloat64)
# Helper functions for working with complex numbers
println("tf_to_complex")
println("tf_track_complex")
println("tf_untrack_complex")

# Use this where an int got wrapped with a TrackedFloat
println("trunc_if_int")

number_types = (:Number, :Integer, :Float16, :Float32, :Float64)
complex_types = (:ComplexF16, :ComplexF32, :ComplexF64)

for NumType in number_types
println("Base")
end

# Binary operators
for O in (:(+), :(-), :(*), :(/), :(^), :min, :max, :rem)
println("Base")
println("Base")
println("Base")
println("Base")

# Hack to appease type dispatch
for NumType in tuple(:Bool, number_types...)
println("Base")
println("Base")
end
end

# Base.decompose seems to be an internal function. Moreover, it always returns
# a tuple of integers. See function def:
# ~/.asdf/installs/julia/1.8.5/share/julia/base/float.jl
#
# Because of this, we treat any call to decompose with a NaN as a kill event.

println("Base")

# Unary operators
for O in (:(-), :(+),
:sign,
:prevfloat, :nextfloat,
:round, :trunc, :ceil, :floor,
:inv, :abs, :sqrt, :cbrt,
:exp, :expm1, :exp2, :exp10,
:exponent,
:log, :log1p, :log2, :log10,
:rad2deg, :deg2rad, :mod2pi, :rem2pi,
:sin, :cos, :tan, :csc, :sec, :cot,
:asin, :acos, :atan, :acsc, :asec, :acot,
:sinh, :cosh, :tanh, :csch, :sech, :coth,
:asinh, :acosh, :atanh, :acsch, :asech, :acoth,
:sinc, :sinpi, :cospi,
:sind, :cosd, :tand, :cscd, :secd, :cotd,
:asind, :acosd, :atand, :acscd, :asecd, :acotd,
)
println("$O")
end

# Type-based functions
for fn in (:floatmin, :floatmax, :eps)
println("$fn")
end

println("one")
println("Base")
println("Base")


for O in (:isnan, :isinf, :issubnormal)
println("Base")
end

for O in (:(<), :(<=), :(==))
println("Base")
end
end # for TrackedFloatN
2 changes: 1 addition & 1 deletion test/complex_test.jl
@@ -1,5 +1,5 @@
using Test
using FloatTracker
using TrackedFloats

@testset "Constructing tracked complex numbers" begin
scotty_tt = 0.0 + 1.0im # Our imaginary friend
2 changes: 1 addition & 1 deletion test/config_api_tests.jl
@@ -1,6 +1,6 @@
using Test

using FloatTracker
using TrackedFloats

@testset "config_* doesn't override everything" begin
global_config = ft__get_global_ft_config_for_test()
2 changes: 1 addition & 1 deletion test/injector_tests.jl
@@ -1,6 +1,6 @@
using Test

using FloatTracker
using TrackedFloats

@testset "should_inject basic behavior" begin
i1 = InjectorConfig(active=true, odds=1, n_inject=2)
2 changes: 1 addition & 1 deletion test/logger_perf_tests.jl
@@ -1,7 +1,7 @@
# Run this by entering the Julia REPL and running
# include("test/logger_perf_tests.jl")

using FloatTracker: TrackedFloat64, ft_flush_logs, exclude_stacktrace, set_logger
using TrackedFloats: TrackedFloat64, ft_flush_logs, exclude_stacktrace, set_logger
using FileIO, Profile, FlameGraphs, Plots, ProfileView

function track(loops)
4 changes: 2 additions & 2 deletions test/logger_tests.jl
@@ -1,8 +1,8 @@
using Test

using FloatTracker
using TrackedFloats

println("FloatTracker loaded")
println("TrackedFloats loaded")

f5(n) = n-2
