Commit

- Increasing parallelism performance (real multiprocessing implementation, addressing #29)

- Better handling of config parser errors (addressing #22)
- Fixing typos
r3nt0n committed Aug 30, 2024
1 parent adb2050 commit c66f91a
Showing 9 changed files with 48 additions and 48 deletions.
14 changes: 9 additions & 5 deletions README.md
@@ -14,7 +14,7 @@ Thanks dude :)
[![Packaging status](https://repology.org/badge/tiny-repos/bopscrk.svg)](https://repology.org/project/bopscrk/versions)
![[GPL-3.0 License](https://github.com/r3nt0n)](https://img.shields.io/badge/license-GPL%203.0-brightgreen.svg)
![[Python 3](https://github.com/r3nt0n)](http://img.shields.io/badge/python-3-blue.svg)
![[Version 2.4.5](https://github.com/r3nt0n)](http://img.shields.io/badge/version-2.4.5-orange.svg)
![[Version 2.4.6](https://github.com/r3nt0n)](http://img.shields.io/badge/version-2.4.6-orange.svg)



@@ -108,7 +108,7 @@ Thanks dude :)

### What's new

**2.4.5 RELEASED**: Progress bar with ETA implemented
**2.4.6 RELEASED** (30/08/2024): Speed and performance dramatically increased, real multiprocessing implementation.

[//]: # (<p align="center"><img src="https://github.com/r3nt0n/bopscrk/blob/master/img/progressbar_example1.gif" /></p>)

@@ -200,7 +200,6 @@ It will retrieve all lyrics from all songs which belongs to artists that you pro

#### Customizing behaviour using .cfg file
+ In `bopscrk.cfg` file you can specify your own charsets and enable/disable options:
+ **threads**: number of threads to use in multithreaded operations
+ **extra_combinations** (like `(john, doe) => 123john, john123, 123doe, doe123, john123doe doe123john`) are *enabled by default*. You can disable it in the configuration file in order to get more focused wordlists.
+ **separators_chars**: characters to use in extra-combinations. *Can be a single char or a string of chars, e.g.: `!?-/&(`*
+ **separators_strings**: strings to use in extra-combinations. *Can be a single string or a list of strings space-separated, e.g.: `123` `34!@`*
@@ -214,7 +213,6 @@ It will retrieve all lyrics from all songs which belongs to artists that you pro
+ **lyric_space_replacement**: same with lyrics found
+ **space_replacement_chars**: characters to insert instead of spaces inside an artist name or a lyric phrase. *Can be a single char or a string of chars, e.g.: `!?-/&(`*
+ **space_replacement_strings**: strings to insert instead of spaces inside an artist name or a lyric phrase. *Can be a single string or a list of strings space-separated, e.g.: `123` `34!@`*
+ Some transforms have **extensive charsets** preincluded. To use it instead of the basic ones, just **comment and uncomment** the corresponding lines (It's important to comment the original one, if you let two lines with the same keyname uncommented, it will throw an error: `AttributeError: 'bool' object has no attribute 'split'`).

+ **Parameters configuration examples**
+ Combine all the words using dots as separator, and same using commas
@@ -232,11 +230,12 @@ It will retrieve all lyrics from all songs which belongs to artists that you pro
- [ ] Improve **memory management**
- [ ] Write wordlists into filesystem during execution and use it as cache (<a href="https://github.com/r3nt0n/bopscrk/issues">#12</a>)
- [ ] Improve **performance**
- [ ] Refactor and improve threads and transforms logic
- [x] Improve parallelism logic
- [ ] Extra features
- [x] Implement **progress bar** to keep user informed of the execution state
- [ ] Implement **session file** to keep track of the execution point and **be able to stop and resume sessions** (<a href="https://github.com/r3nt0n/bopscrk/issues">#12</a>)
- [ ] Create **config options** for customized **case transforms** (e.g.: disable pair/odd transforms)
- [ ] Implement "pipable" output to allow integration with other tools (`-q` flag will just output final wordlist to stdout)

See the [open issues](https://github.com/r3nt0n/bopscrk/issues) for a full list of proposed features (and known issues).

@@ -272,6 +271,11 @@ Thank you all!

## Changelist
[//]: # (+ `last development version &#40;available on Github&#41;`)
+ `2.4.6 version notes (30/08/2024)`
+ **Increasing parallelism performance** (real multiprocessing implementation)
+ Better handling of config parser errors
+ Fixing typos

+ `2.4.5 version notes (02/08/2022)`
+ **progress bar** implemented and working
+ `version` argument included
17 changes: 9 additions & 8 deletions bopscrk/bopscrk.cfg
@@ -10,24 +10,25 @@
###################################################################################

[GENERAL]
# Number of threads to use in multithreaded operations
threads=32
# Reserved for potential future uses

[COMBINATIONS]
# Enables extra combination and additions at begining and end of words
# Enables extra combination and additions at beginning and end of words
# example: (john, doe) => 123john, john123, 123doe, doe123, john123doe doe123john
extra_combinations=true
# SEPARATORS CHARSET - Characters to use in extra-combinations
separators_chars=._-$%%&#@
separators_strings=123 xXx !!
# To get an extensive charset, comment the previous line and uncomment the next one (having both enabled could cause an error)
# separators_chars=!"#$%%&'()*+,-./:;<=>?@[\]^_`{|}~
separators_strings=!! 123 xXx
# To get extensive charsets, uncomment the following lines:
#separators_chars=!"#$%%&'()*+,-./:;<=>?@[\]^_`{|}~
#separators_strings=!! ¡¡ !!! ¡¡¡ ¡!¡ !¡! 123 1234 xXx XxX WwW wWw


[TRANSFORMS]
# LEET REPLACEMENT CHARSET
# characters to replace and correspondent substitute in leet transforms
leet_charset=a:4 e:3 i:1 o:0 s:$
# To get an extensive charset, comment the previous line and uncomment the next one (having both enabled could cause an error)
# To get an extensive charset, uncomment the following line
# leet_charset=a:4 a:@ e:3 i:1 i:! i:¡ l:1 o:0 s:$ s:5 b:8 t:7 c:(

# RECURSIVE LEET TRANSFORMS - Enables a recursive call to leet_transforms() function
@@ -50,5 +51,5 @@ lyric_space_replacement=true
# Comment out the two lines above, or leave them empty, to remove spaces instead of replacing them
space_replacement_chars=!@+._-
space_replacement_strings=
# To get an extensive charset, comment the previous line and uncomment the next one (having both enabled cause an error)
# To get an extensive charset, uncomment the following line
#space_replacement_chars=!"#$%%&'()*+,-./:;<=>?@[\]^_`{|}~
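A note on two details of this file, offered as a minimal sketch rather than a description of bopscrk's own code. The charsets write `%%` because `configparser`'s default interpolation treats `%` as special, so `%%` yields a single literal `%`. The new comments also drop the advice to comment out the basic line before enabling the extensive one, which appears to rely on the parser being created with `strict=False` (see the `config.py` hunk in this commit): with a non-strict parser a repeated key is accepted and the last value wins. `SAMPLE`, `load` and `duplicate_is_rejected` are hypothetical names used only for illustration.

```python
import configparser

# Hypothetical snippet mirroring [COMBINATIONS], with the same key defined twice.
SAMPLE = """\
[COMBINATIONS]
separators_chars=._-$%%&#@
separators_chars=!?
"""


def load(text, strict):
    """Parse config text with the requested strictness and return the parser."""
    parser = configparser.ConfigParser(strict=strict)
    parser.read_string(text)
    return parser


def duplicate_is_rejected(text):
    """Return True if a strict parser refuses the text because of a repeated key."""
    try:
        load(text, strict=True)
    except configparser.DuplicateOptionError:
        return True
    return False


if __name__ == "__main__":
    # A strict parser rejects the duplicated key.
    print(duplicate_is_rejected(SAMPLE))
    # A non-strict parser keeps the last definition of the key.
    print(load(SAMPLE, strict=False).get("COMBINATIONS", "separators_chars"))
    # '%%' in the file is read back as a single '%'.
    single = "[COMBINATIONS]\nseparators_chars=._-$%%&#@\n"
    print(load(single, strict=True).get("COMBINATIONS", "separators_chars"))
```

Because the last definition wins, the order of the basic and extensive lines in the file matters once both are uncommented.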
2 changes: 1 addition & 1 deletion bopscrk/bopscrk.py
@@ -6,7 +6,7 @@

name = 'bopscrk.py'
desc = 'Generate smart and powerful wordlists'
__version__ = '2.4.5'
__version__ = '2.4.6'
__author__ = 'r3nt0n'
__status__ = 'Development'

4 changes: 2 additions & 2 deletions bopscrk/modules/banners.py
@@ -21,9 +21,9 @@ def banner(name, version, author="r3nt0n"):
name_rand_leet = name
name_rand_case = case_transforms(name)
name_rand_case = name_rand_case[randint((len(name_rand_case) - 3), (len(name_rand_case) - 1))]
version = version[:3]
#version = version[:3]
print(' ,----------------------------------------------------, ,------------,');sleep(interval)
print(' | [][][][][] [][][][][] [][][][] [][__] [][][][] | | v{}{}{} |'.format(color.BLUE, version, color.END));sleep(interval)
print(' | [][][][][] [][][][][] [][][][] [][__] [][][][] | | v{}{}{} |'.format(color.BLUE, version, color.END));sleep(interval)
print(' | | |------------|');sleep(interval)
print(' | [][][][][][][][][][][][][][_] [][][] [][][][] |===| {}{}{} |'.format(color.RED, name_rand_leet, color.END));sleep(interval)
print(' | [_][][][]{}[]{}[][][][]{}[][]{}[][][ | [][][] [][][][] |===| {}{}{}{} |'.format(color.KEY_HIGHL, color.END, color.KEY_HIGHL, color.END, color.BOLD, color.RED, name, color.END));sleep(interval)
14 changes: 5 additions & 9 deletions bopscrk/modules/config.py
@@ -10,13 +10,14 @@
class Config:
def __init__(self, cfg_file):
self.CFG_FILE = cfg_file
self.cfg = configparser.ConfigParser(strict=False)

def read_config(self, category, field):
cfg = configparser.ConfigParser()
try:
cfg.read([self.CFG_FILE])
value = cfg.get(category, field)
except:
self.cfg.read([self.CFG_FILE])
value = self.cfg.get(category, field)
except Exception as e:
print(e)
value = False
return value

@@ -36,12 +37,7 @@ def parse_booleans(self, value):
except AttributeError:
return None

def parse_threads(self, value):
try: value = int(value); return value
except ValueError: return 4 # default number of threads if error in config provided

def setup(self):
self.THREADS = self.parse_threads(self.read_config('GENERAL', 'threads'))
self.EXTRA_COMBINATIONS = self.parse_booleans(self.read_config('COMBINATIONS', 'extra_combinations'))
self.SEPARATORS_CHARSET = self.merge_settings(self.read_config('COMBINATIONS', 'separators_chars'),
self.read_config('COMBINATIONS', 'separators_strings'))
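In the hunk above, `read_config` still returns `False` when a lookup fails, which is presumably how the `'bool' object has no attribute 'split'` error mentioned in the README could surface later. The following is a hedged sketch of one alternative, not the project's implementation: parse the text once and use `ConfigParser.get(..., fallback=None)` so that missing sections or fields produce `None`. The class `SafeConfig` and its method `read_list` are hypothetical, and reading from a string instead of a file path is a simplification for the example.

```python
import configparser


class SafeConfig:
    """Hypothetical variant of Config: parse once, return None for missing fields."""

    def __init__(self, text):
        # strict=False tolerates repeated keys, as in the commit above.
        self.cfg = configparser.ConfigParser(strict=False)
        self.cfg.read_string(text)

    def read_config(self, category, field):
        # fallback=None covers both a missing section and a missing option.
        return self.cfg.get(category, field, fallback=None)

    def read_list(self, category, field):
        # Split only when a non-empty value was actually found.
        value = self.read_config(category, field)
        return value.split() if value else []


if __name__ == "__main__":
    cfg = SafeConfig("[COMBINATIONS]\nseparators_strings=123 xXx !!\n")
    print(cfg.read_list("COMBINATIONS", "separators_strings"))
    print(cfg.read_list("COMBINATIONS", "missing_field"))
```

Returning `None` or an empty list keeps callers from calling `.split()` on a boolean, at the cost of no longer printing the parser error.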
4 changes: 2 additions & 2 deletions bopscrk/modules/excluders.py
@@ -3,7 +3,7 @@
# https://github.com/r3nt0n/bopscrk
# bopscrk - transform functions module

from multiprocessing.dummy import Pool as ThreadPool
from multiprocessing import Pool, cpu_count
from collections import OrderedDict

from . import Config
@@ -15,7 +15,7 @@ def compare(word_to_exclude, word_in_wordlist):
# Remove word to exclude from final_wordlist
def multithread_exclude(word_to_exclude, wordlist):
diff_wordlist = []
with ThreadPool(Config.THREADS) as pool:
with Pool(cpu_count()) as pool:
#args = (word, words_to_exclude)
diff_wordlist += pool.starmap(compare, [(word_to_exclude, word) for word in wordlist])
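As a rough illustration of the `starmap` pattern used above: each tuple in the argument list is unpacked into the worker's parameters, and results come back in input order. The body of `compare` is not shown in this diff, so the worker below (`keep_unless_equal`) is a hypothetical stand-in, as are `exclude` and its `pool_factory` parameter, which only exists so the same code can run on a thread pool.

```python
from multiprocessing import Pool, cpu_count
from multiprocessing.dummy import Pool as ThreadPool


def keep_unless_equal(word_to_exclude, word_in_wordlist):
    # Hypothetical worker: return the word unless it matches the excluded one.
    return None if word_in_wordlist == word_to_exclude else word_in_wordlist


def exclude(word_to_exclude, wordlist, pool_factory=Pool, workers=None):
    """Drop word_to_exclude from wordlist, comparing entries in a worker pool."""
    with pool_factory(workers or cpu_count()) as pool:
        # starmap unpacks each (excluded, candidate) tuple into the worker's arguments.
        kept = pool.starmap(keep_unless_equal,
                            [(word_to_exclude, word) for word in wordlist])
    return [word for word in kept if word is not None]


if __name__ == "__main__":
    # The guard matters with process pools: on platforms that start workers by
    # re-importing the main module, unguarded pool creation would recurse.
    print(exclude("doe", ["john", "doe", "john123"]))
    print(exclude("doe", ["john", "doe", "john123"], pool_factory=ThreadPool, workers=2))
```

The worker must be a module-level function so it can be pickled and sent to the worker processes.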

26 changes: 13 additions & 13 deletions bopscrk/modules/main.py
@@ -10,7 +10,7 @@
from .auxiliars import clear, remove_duplicates_from_file
from . import banners
from .color import color
from .transforms import leet_transforms, case_transforms, artist_space_transforms, lyric_space_transforms, multithread_transforms, take_initials, transform_cached_wordlist_and_save
from .transforms import leet_transforms, case_transforms, artist_space_transforms, lyric_space_transforms, multiprocess_transforms, take_initials, transform_cached_wordlist_and_save
from .combinators import combinator, add_common_separators
from .excluders import remove_by_lengths, remove_duplicates, multithread_exclude

@@ -24,7 +24,7 @@ def run(name, version):
if args.print_version: print(name + '_' + version); sys.exit(0)

try:
# setting args whter interactive or not
# setting args whether interactive or not
if args.interactive:
clear()
banners.bopscrk_banner()
@@ -92,7 +92,7 @@ def run(name, version):
# Take just the initials on each phrase and add as a new word to FINAL wordlist
if Config.TAKE_INITIALS:
base_lyrics = lyrics[:]
ly_initials_wordlist = multithread_transforms(take_initials, base_lyrics)
ly_initials_wordlist = multiprocess_transforms(take_initials, base_lyrics)
final_wordlist += ly_initials_wordlist

# Make space transforms and add it too
@@ -102,7 +102,7 @@ def run(name, version):
elif Config.LYRIC_SPACE_REPLACEMENT:
print(' {}[+]{} Producing new words replacing spaces in {} phrases...'.format(color.BLUE, color.END, len(lyrics)))
base_lyrics = lyrics[:]
space_transformed_lyrics = multithread_transforms(lyric_space_transforms, base_lyrics)
space_transformed_lyrics = multiprocess_transforms(lyric_space_transforms, base_lyrics)
final_wordlist += space_transformed_lyrics

except ImportError:
@@ -121,16 +121,16 @@ def run(name, version):
if Config.EXTRA_COMBINATIONS:
if Config.SEPARATORS_CHARSET:
#print(' {}[+]{} Creating extra combinations (separators charset in {}{}{})...'.format(color.BLUE, color.END,color.CYAN, args.cfg_file,color.END))
print(' {}[+]{} Creating extra combinations with separators charset...'.format(color.BLUE,color.END))
print(' {}[+]{} Creating extra combinations using separators charset...'.format(color.BLUE,color.END))
final_wordlist += add_common_separators(base_wordlist)
print(' {}[*]{} Words produced: {}'.format(color.CYAN, color.END, len(final_wordlist)))
else:
print(' {}[!]{} Any separators charset specified in {}{}'.format(color.ORANGE, color.END, args.cfg_file,color.END))
print(' {}[!]{} No separators charset specified in {}{}'.format(color.ORANGE, color.END, args.cfg_file,color.END))

# Remove words by min-max length range established
print(' {}[-]{} Removing words by min and max length provided ({}-{})...'.format(color.PURPLE, color.END,args.min_length,args.max_length))
final_wordlist = remove_by_lengths(final_wordlist, args.min_length, args.max_length)
print(' {}[*]{} Words remained: {}'.format(color.CYAN, color.END, len(final_wordlist)))
print(' {}[*]{} Words remaining: {}'.format(color.CYAN, color.END, len(final_wordlist)))
# (!) Check for duplicates (is checked before return in combinator() and add_common_separators())
#final_wordlist = remove_duplicates(final_wordlist)

@@ -164,14 +164,14 @@ def run(name, version):
# ' max-length configured (now is {}{}{}) and the size of your\n'
# ' wordlist at this point (now contains {}{}{} words), this process\n'
# ' could take a long time{}\n'.format(color.ORANGE,color.END,args.max_length,color.ORANGE,color.END,len(final_wordlist),color.ORANGE,color.END))
recursive_msg = '{}recursive{} '.format(color.RED,color.END)
recursive_msg = '{}recursive{} '.format(color.ORANGE,color.END)
print(' {}[+]{} Applying {}leet transforms to {} words...'.format(color.BLUE, color.END, recursive_msg,len(final_wordlist)))

#transform_cached_wordlist_and_save(leet_transforms, args.outfile)
#remove_duplicates_from_file(args.outfile)

temp_wordlist = []
temp_wordlist += multithread_transforms(leet_transforms, final_wordlist)
temp_wordlist += multiprocess_transforms(leet_transforms, final_wordlist)
final_wordlist += temp_wordlist

# CASE TRANSFORMS
@@ -181,14 +181,14 @@ def run(name, version):
# transform_cached_wordlist_and_save(case_transforms, args.outfile) # not working yet, infinite loop ?¿?¿

temp_wordlist = []
temp_wordlist += multithread_transforms(case_transforms, final_wordlist)
temp_wordlist += multiprocess_transforms(case_transforms, final_wordlist)
final_wordlist += temp_wordlist

print(' {}[-]{} Removing duplicates...'.format(color.PURPLE, color.END))
final_wordlist = remove_duplicates(final_wordlist)
print(' {}[*]{} Words remained: {}'.format(color.CYAN, color.END, len(final_wordlist)))
print(' {}[*]{} Words remaining: {}'.format(color.CYAN, color.END, len(final_wordlist)))

# EXCLUDE FROM OTHER WORDLISTS
# EXCLUDE FROM OTHER WORDLISTS (deprecated)
#if args.exclude_wordlists:
# For each path to wordlist provided
# for wl_path in args.exclude_wordlists:
@@ -218,7 +218,7 @@ def run(name, version):
# PRINT RESULTS
############################################################################
print('\n {}[+]{} Words generated:\t{}{}{}'.format(color.GREEN, color.END, color.RED, len(final_wordlist),color.END))
print(' {}[+]{} Time elapsed:\t{}'.format(color.GREEN, color.END, total_time))
print(' {}[+]{} Elapsed time:\t{}'.format(color.GREEN, color.END, total_time))
print(' {}[+]{} Output file:\t{}{}{}{}'.format(color.GREEN, color.END, color.BOLD, color.BLUE, args.outfile, color.END))
#print(' {}[+]{} Words generated:\t{}{}{}\n'.format(color.GREEN, color.END, color.RED, str(sum(1 for line in open(args.outfile))), color.END))
sys.exit(0)
9 changes: 4 additions & 5 deletions bopscrk/modules/transforms.py
@@ -3,9 +3,8 @@
# https://github.com/r3nt0n/bopscrk
# bopscrk - transform functions module

from multiprocessing.dummy import Pool as ThreadPool
from multiprocessing import cpu_count, Pool

#from tqdm import tqdm
from alive_progress import alive_bar

from . import Config
@@ -136,10 +135,10 @@ def lyric_space_transforms(word):
return new_wordlist


def multithread_transforms(transform_type, wordlist):
def multiprocess_transforms(transform_type, wordlist):
# process the words in a pool of worker processes and return the results
new_wordlists = []
with ThreadPool(Config.THREADS) as pool:
with Pool(cpu_count()) as pool:
with alive_bar(bar=None,spinner='bubbles', monitor=False,elapsed=False,stats=False,receipt=False) as progressbar:
new_wordlists += pool.map(transform_type, wordlist)
progressbar()
@@ -173,7 +172,7 @@ def transform_cached_wordlist_and_save(transform_type, filepath):
counter += 1
last_position = f.tell() # save last_position

new_wordlist += multithread_transforms(transform_type, cached_wordlist)
new_wordlist += multiprocess_transforms(transform_type, cached_wordlist)
#cached_wordlist += new_wordlist
append_wordlist_to_file(filepath, new_wordlist)

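The core change in this file is swapping `multiprocessing.dummy.Pool` (threads, which share one interpreter and are limited by the GIL for CPU-bound work) for `multiprocessing.Pool` (separate processes). The sketch below illustrates that swap under stated assumptions and is not the project's function: it assumes each transform returns a list of new words, as `lyric_space_transforms` appears to with `return new_wordlist`, and flattens the per-word lists. `toy_case_transforms`, `parallel_transforms`, the `pool_factory` parameter and the chunk size heuristic are all hypothetical.

```python
from itertools import chain
from multiprocessing import Pool, cpu_count
from multiprocessing.dummy import Pool as ThreadPool


def toy_case_transforms(word):
    # Hypothetical stand-in for a transform: several variants of one word.
    return [word.upper(), word.capitalize()]


def parallel_transforms(transform, wordlist, pool_factory=Pool, workers=None):
    """Apply transform to every word in a pool and flatten the results."""
    workers = workers or cpu_count()
    # Assumed heuristic: batch words so each task carries more than one item,
    # which reduces inter-process communication overhead.
    chunksize = max(1, len(wordlist) // (workers * 4))
    with pool_factory(workers) as pool:
        # map preserves input order; each element is the list for one word.
        per_word = pool.map(transform, wordlist, chunksize)
    return list(chain.from_iterable(per_word))


if __name__ == "__main__":
    words = ["john", "doe"]
    print(parallel_transforms(toy_case_transforms, words))                           # processes
    print(parallel_transforms(toy_case_transforms, words, pool_factory=ThreadPool))  # threads
```

Both pools expose the same `map` interface, which is why the commit could change the import and the pool constructor without touching the call sites beyond the rename.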
6 changes: 3 additions & 3 deletions bopscrk/tests/transforms_tests.py
@@ -9,7 +9,7 @@
from os import path
sys.path.append(path.dirname(path.dirname(path.abspath(__file__))))

from ..modules.transforms import case_transforms, leet_transforms, multithread_transforms, \
from ..modules.transforms import case_transforms, leet_transforms, multiprocess_transforms, \
take_initials, artist_space_transforms, lyric_space_transforms


@@ -29,8 +29,8 @@ def test_case_transform(self):

def test_multithread_transform(self):
wordlist = ['hello', 'world', 'lorem', 'ipsum']
self.assertEqual(33, len(multithread_transforms(case_transforms, wordlist)))
self.assertEqual(10, len(multithread_transforms(leet_transforms, wordlist)))
self.assertEqual(33, len(multiprocess_transforms(case_transforms, wordlist)))
self.assertEqual(10, len(multiprocess_transforms(leet_transforms, wordlist)))

def test_take_initials(self):
word = 'hello world lorem ipsum'
