Observable parsing/extraction updates #4

kx499-zz · 2019-06-25T02:06:32Z

This employs both single-value matching and full-text extraction (think re.findall) to support pulling indicators out of blobs of text like email bodies. Additionally, it supports indicator validators to help remove false positives after regex extraction. It exposes the functions so you can call them separately from an analyzer or automatically from the iterable function. The iterable function first calls check_type and, if there is no match, goes on to process the full-text regex. This is an iteration of PR #1.
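A minimal sketch of the flow described above, to make the order of operations concrete. Only `check_type` is a name taken from this description; the patterns, the validator, `extract_from_text`, `extract`, and the result shape are hypothetical stand-ins and not the code in this PR.

```python
import ipaddress
import re

# Hypothetical patterns for illustration only; the PR ships its own regexes.
SINGLE_VALUE = {"ip": re.compile(r"^\d{1,3}(?:\.\d{1,3}){3}$")}
FULL_TEXT = {"ip": re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")}


def is_valid_ip(value):
    """Validator: reject regex hits that are not real IPv4 addresses."""
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


# Hypothetical validator registry, keyed by data type.
VALIDATORS = {"ip": is_valid_ip}


def check_type(value):
    """Return the data type if the whole string is one indicator, else None."""
    for data_type, pattern in SINGLE_VALUE.items():
        if pattern.match(value) and VALIDATORS[data_type](value):
            return data_type
    return None


def extract_from_text(text):
    """Full-text pass: findall over the text, then drop false positives."""
    results = []
    for data_type, pattern in FULL_TEXT.items():
        for candidate in pattern.findall(text):
            if VALIDATORS[data_type](candidate):
                results.append({"dataType": data_type, "data": candidate})
    return results


def extract(value):
    """Try the single-value match first; fall back to full-text extraction."""
    data_type = check_type(value)
    if data_type is not None:
        return [{"dataType": data_type, "data": value}]
    return extract_from_text(value)
```

Here `extract("8.8.8.8")` takes the single-value path, while a sentence containing `10.0.0.1` and `999.1.1.1` falls through to the full-text pass, where the validator discards the second hit.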

nadouani · 2019-06-26T08:16:48Z

Hello @kx499 thanks for the PR.

Can you please remove the .DS_Store file.
The code you are submitting is a good candidate for unit tests, could you add some of them to cover your changes?

Thanks

kx499-zz · 2019-06-27T02:52:38Z

Thanks, will do. I'm not very familiar with unit tests, but I'll work some up.

kx499-zz · 2019-06-28T01:46:30Z

@nadouani made the updates, let me know what you think and if any other updates are needed.

kx499-zz · 2019-08-19T19:08:13Z

Any word on this? It's been a few months so I figured I'd check in

kx499-zz · 2019-09-12T10:36:36Z

@nadouani is there anything else needed for this? I'm looking to develop/update some analyzers based on this code and was hoping it could either get committed or we could discuss other ways of accomplishing the same.

iwitz · 2019-10-03T13:53:19Z

@kx499 Thanks for the PR! Could you add a closing '>' after the opening '<' in the following line in extractor.py? Otherwise the closing angle bracket is captured by the regular expression:

ft_r = '(' + \
       '(?:(?:meows?|h[Xxt]{2}ps?)://)?(?:(?:(?:[a-zA-Z0-9\-]+\[?\.\]?)+[a-z]{2,8})' + \
       '|(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\[?\.\]?){3}(?:25[0-5]|2[0-4][0-9]' + \
       '|[01]?[0-9][0-9]?))/[^\s\<>"]+' + \
       ')'

(This is the modified line. If the URL itself contains an angle bracket, the match may stop early, though.)
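To illustrate the difference, here is a small sketch that compiles the pattern with and without '>' in the final character class. The "before" class (`[^\s\<"]+`) is an assumption inferred from the request above, and the names `HEAD`, `BEFORE`, and `AFTER` are hypothetical; the rest of the pattern is copied from the snippet, written as raw strings.

```python
import re

# Shared head of the pattern from the snippet above (raw strings avoid
# invalid-escape warnings in newer Python versions).
HEAD = (
    r'(?:(?:meows?|h[Xxt]{2}ps?)://)?(?:(?:(?:[a-zA-Z0-9\-]+\[?\.\]?)+[a-z]{2,8})'
    r'|(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\[?\.\]?){3}(?:25[0-5]|2[0-4][0-9]'
    r'|[01]?[0-9][0-9]?))/'
)

# Assumed original tail: '>' is not excluded, so it can end up in the match.
BEFORE = re.compile('(' + HEAD + r'[^\s\<"]+' + ')')
# Modified tail from the comment above: '>' is excluded as well.
AFTER = re.compile('(' + HEAD + r'[^\s\<>"]+' + ')')

text = 'see <http://example.com/path> for details'

print(BEFORE.findall(text))  # the trailing '>' is captured
print(AFTER.findall(text))   # the match stops before '>'

# The trade-off mentioned above: a URL that contains '>' is cut short.
print(AFTER.findall('http://example.com/a>b'))
```

With the modified class, a URL wrapped in angle brackets (common in email bodies) is extracted cleanly, at the cost of truncating the rare URL that contains a literal '>'.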

Also, @nadouani it'd be fantastic if you could have a look at this or at PR #1 😃

nadouani and others added 3 commits April 4, 2019 15:50

Merge branch 'release/2.0.0'

3978d77

updates to extractor

9cbdac3

Merge remote-tracking branch 'upstream/master'

c6340a2

kx499 added 4 commits June 26, 2019 23:36

fixed extractor unit test and removed .DS_Store file/updated .gitigno…

407e066

…re. still need to add unit test for new features

simplified recursive function and fixed bug that presented it self in…

0ddddf0

… list/dicts for full text

minor edits to extractor code

b22efdb

unit tests to cover code changes

26c5187

added requirements file

d7fd6da

nadouani changed the base branch from master to develop August 20, 2019 11:21
