Skip to content

bcsmith2846/RegexMatch

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 

Repository files navigation

I have implemeted the regex matcher using a finite state machine (FSM)
Using a FSM came about after I created a recursive grammar parser that was really slow at larger inputs
I remered creating a syntax parser in some programming language theory classes, and leveraged that experience here
The FSM was faster, but I didn't like the O(2^n) worst/average case runtime.
So, I allowed the FSM to be in multiple states at once, cutting the runtime down to O(n) worst/average case

To speed this up even more go's goroutine functons could be leveraged.
First, each line could be in a goroutine. With the file being outside and a worker pool being created
The worker pool would be an arbiter and allow for the line numbers to be leveraged.
Next, I would add the FSM parsing into a goroutine. Each move to a new character in the outermost parse loop would be a new goroutine.
Doing this, one could use channels to pass a successful parse back up the chain.

Usage:
    ./regex <filename> 'regex'
    ./regex <(echo 'sometext') 'regex'

Compiled for linux x86_64 arch.
I have renamed the binary for convience.

V2 notes:
- Added clearer transition logic
- Leveraged more of Go's features
- Got rid of a bug or two

After some testing and static code analysis I've noticed the code I have written is correct for all non ambigious  matches.
If the match is ambigious, a correct result will be displayed. The algorithm is deterministic, but which solution is chosen depends on how the FSM is compiled. 
For matching that is very specific on how ambigious matches are handled, a O(2^n) runtime is required.

A more Go like version could be created by creating an interface for nodes, type aliases for each node type, 
and object methods for each node type that advance to the next character correctly. 
This may be implemented Friday 8/17 in the evening if I haven't heard anything back about the code.

About

A go implementation of a regex match.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages