forked from google/mtail
-
Notifications
You must be signed in to change notification settings - Fork 0
/
TODO
128 lines (70 loc) · 5.54 KB
/
TODO
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
standard library, search path
no trailing newline in parser test, requires changes to expr stmt
parse tree/ast testing? - expected AST as result from parse/check instead of
merely getting a result
mapping between progs and logs to reduce wasted processing- issue #35
bytecode like
[{push 1} {push 0} {cmp 1} {jm 6} {push 0} {jmp 7} {push 1} {jnm 13}
{setmatched false} {mload 0} {dload 0} {inc <nil>} {setmatched true}]
can be expressed as
[{push 1} {push 0} {cmp 1} {jm 9} {setmatched false} {mload 0} {dload 0} {inc
<nil>} {setmatched true}]
but jnm 13 is from the condExpr and the previous is from a comparison binary
expr; an optimizer is needed to collapse the bytecode to undersand that
cmp, jm, push, jump, push, jnm in sequence like so is the same as a cmp, jm
and we need to worry about the jump table too
reversed casts: s2i,i2s pairs as well
count stack size and preallocate stack
-> counts of push/pop per instruction
-> test to keep p/p counts updated
: seems like a lot of work for not much return
# Won't do
X Use capture group references to feed back to declaring regular expression,
X noting unused caprefs,
X possibly flipping back to noncapturing (and renumbering the caprefs?)
X -> unlikely to implement, probably won't impact regexp speed
When using a const by itself as a match condition, then we get warnings about
the capture group names not existing.
const A /(?<a>.*)/
A {
x[$a]++
}
... => $a not defined in scope.
Can't define string constants, like const STRPTIME_FORMAT "Jan _2"
Multline const can't startwith a newline, must be const FOO // +\n...
Can't chain two matches in same expresison like getfilename() =~ 'name' &&
EXPR_RE because $0 is redefined
Can't set the timestamp when processing one log line and reuse it in another; must use the
caching state metric pattern, hidden gauge time.
Get a list of non-stdlib deps
go list -f "{{if not .Standard}}{{.ImportPath}}{{end}}" $(go list -f '{{join .Deps "\n"}}' ./...)
Programs may not use mtail_ as a metric prefix.
Theory: Implicitly cast Int shouldn't get the S2i conversion applied to them. Do we need to name Implicit Int separate from Int and then not create s2i or other conversions for implicits. (and we need to keep the runtime conversions?)
=~ isn't documented in the language or programming guide docs
if you comment out the MATCH_NETWORK clase in dhcpd.mtail it gets 30x faster... because the regexp no longer backtracks... why...
Avoid byte to string conversions in the tailer and vm FindStringSubmatch > https://dave.cheney.net/high-performance-go-workshop/dotgo-paris.html#strings_and_bytes
Use FindSubmatchIndex to avoid copies?
Why is strings.Builder slower than bytes.Buffer when the latter's docstring recommends the former?
ci: rerun failed tests to see if they're flaky
Find out if OpenTelemetry is better than OpenCensus when creating no-op trace spans.
run with gosec, and possibly annotate safe sections in code
Test that when path/* is the logpathpattern that we handle log rotation, e.g. log -> log.1
= how can this work, we can't tell the difference between log.1 being a rotation or a new log. This could work if we can have a tailer-level registry of filenames currently with a goroutine. But we don't know the name of the new file when filestream creates a new goroutine for the replacement; fd.Stat() doesn't return the new name of the file.
VM profiler, when enabled, times instructions so user gets feedback on where their program is slow.
Can we create a linter that checks for code patterns like 'path.Join' and warns against them? Can govet be made to do this?
Detect when a regular expression compiled doesn't have a onepass program, and report a compile warning. we can't do this today with the regexp API, because it's not an exported field, and the onepass compilation step is not an exported function. IF we can do this, we can warn the user that their regular expression has ambiguity and will backtrack. See MATCH_NETWORK above.
Do we have a precision problem that shold be solved by using math/big for literals in the AST. Extra credit: find out if the vm runtime should use big internally as well?
regular expression matching is expensive. prefilter on match prefix. for extra credit, filter on all substrings of the expressions, using aho-corasick.
once the vm refactoring has completed, move the VM execute method into per-opcode functions, and use the same state machine function as in lexer.NextToken() to simulate threaded code as we don't get tail recursion in Go. The plan is to see if execution speed is same or better -- expect moving to function calls to be slower unless inlined, but gain in readability and reuse.
refactor vm further to replace stack with registers, we need typed registers to remove the pop runtime type cast. new opcodes to do migration from stack to register based ops required
Once the tailer can read from sockets, I'll move it out of `internal/`.
Pass a Logger as an option to tailer and vm.
StatusHTML in vm reads expvars; can we not do that?
Move from expvar to OpenSomething metrics.
Should the exporter move into the metric package?
Should the waker move into the tailer package?
Benchmarks on GHA are too variable. Compute benchmarks old and new in same instance, per guidelines from "Software Microbenchmarking in the Cloud. How Bad is it Really?" Laaber et al.
Move loc and useCurrentYear out of VM and into Runtime env.
Move const folding into parser during AST build.
Const-fold identity functions.
Replace IndexedExpr with no ExprList with just the IdTerm.