I created a POSIX shell script alternative to randomize.py #2

Open
wants to merge 1 commit into
base: master
11 changes: 6 additions & 5 deletions README.md
@@ -62,14 +62,15 @@ To see all available options type <code>typist --help</code> or just
### Tips

Doing the same test over and over again isn't good practice. I have provided
-a text file and a small python script to randomize the most used words in the
-English language. You can use this script on any text file, though it will
-remove punctuation marks and reformats the text.
+a text file and a small python script and a shell script to randomize the most
+used words in the English language. You can use this script on any text file,
+though it will remove punctuation marks and reformats the text.

```
-python3 utils/randomize.py most-used-words.txt
+python3 utils/randomize.py most_used_words.txt
+# or
+utils/randomize.sh most_used_words.txt
```

---

### [Issues / Bugs](https://github.com/ny64/typist/issues)
50 changes: 50 additions & 0 deletions utils/randomize.sh
@@ -0,0 +1,50 @@
#!/bin/sh

# use a POSIX shell shuf alternative if none on the system
command -v shuf > /dev/null 2> /dev/null || shuf() {
    awk '
        BEGIN {
            srand();
            OFMT = "%.17f"
        }
        {
            print rand(), $0
        }
    ' "$@" |
    sort -k1,1n |
    cut -d ' ' -f2- ;}

# first argument is the filename
filename="$1"
[ ! -f "$filename" ] && printf "%s\n" "You must provide a text file as argument." >&2 && exit 1

words=""
# read file line by line
while IFS= read -r line; do
    # split the line into words and set them as positional parameters
    # shellcheck disable=SC2086
    set -- $line
    for val; do
        # clean up each word: remove leading and trailing punctuation and quotes
        val=$(printf "%s" "$val" | sed -e 's/^[({'"'"'"]*//' -e 's/[].,!?;:)}'"'"'"]*$//')

Author commented:

I think this would do the same thing as this cryptic sed command, and do it better, but to be honest I haven't tested the punctuation cleaning with either command.

# delete leading and trailing punctuation and whitespace from each line
sed -e 's/^[[:punct:][:space:]]*//' -e 's/[[:punct:][:space:]]*$//'
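
Since neither command has been tested yet, a small side-by-side check may help. The sketch below is only an illustration: `strip_listed` and `strip_classes` are hypothetical helper names, the sample words are made up, and the explicit list is written with `]` as the first character of the bracket expression so that it is matched literally.

```shell
#!/bin/sh
# Hypothetical helpers for comparing the two cleanup approaches.

# Explicit list of characters; "]" comes first inside the bracket
# expression so that sed treats it as a literal character.
strip_listed() {
    printf '%s\n' "$1" | sed -e 's/^[({'"'"'"]*//' -e 's/[].,!?;:)}'"'"'"]*$//'
}

# POSIX character classes, as proposed in the comment above.
strip_classes() {
    printf '%s\n' "$1" | sed -e 's/^[[:punct:][:space:]]*//' -e 's/[[:punct:][:space:]]*$//'
}

# Made-up sample words covering quotes, brackets, dashes and an apostrophe.
for w in '"Hello,"' '(word)' "don't" '[note]' '--dash--'; do
    printf '%-10s listed=%-10s classes=%s\n' \
        "$w" "$(strip_listed "$w")" "$(strip_classes "$w")"
done
```

Both variants leave an apostrophe inside a word alone, because they are anchored to the start and end of the string. The character-class version is broader: it also strips square brackets, dashes and any other punctuation at the edges, which the explicit list leaves in place.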

        words="$words
$val" # one word per line in $words var

Author commented:

Would this be better?

words="${words}${words:+$'\n'}$val"

It seems to work the same way and fits on a single line, but it's less understandable.
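
One caveat with the one-liner: `$'\n'` is ANSI-C style quoting, which older or minimal `/bin/sh` implementations may not support, so it could undermine the POSIX goal of the script. A possible workaround, sketched here under the assumption that a helper variable is acceptable (`nl` is a hypothetical name, and the sample words are made up), is to store a literal newline once and reuse it:

```shell
#!/bin/sh
# Keep a literal newline in a variable ("nl" is a hypothetical name).
nl='
'

words=""
for val in alpha beta gamma; do
    # ${words:+$nl} expands to a newline only when $words is non-empty,
    # so the list never starts with an empty line.
    words="${words}${words:+$nl}${val}"
done

printf '%s\n' "$words"
```

This also differs slightly from the two-line version in the diff: that one prepends a newline even when `$words` is still empty, so the list begins with an empty entry, whereas the `${words:+...}` form adds the separator only between words.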

    done
done < "$filename"

line_length=0
{
    # loop over each shuffled word
    printf "%s\n" "$words" | shuf | while IFS= read -r word; do
        # update line length
        line_length=$((line_length + ${#word} + 1))
        # if line length exceeds 80 characters, make a new line
        if [ "$line_length" -gt 80 ]; then
            echo
            # update line length
            line_length=$(( ${#word} + 1 ))
        fi
        printf "%s " "$word"
    done
    echo
} > "$filename"
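
The `shuf` fallback at the top of the script uses a decorate, sort, undecorate pattern: awk prefixes every line with a random key, `sort` orders the lines by that key, and `cut` removes the key again. A minimal way to check that it only reorders lines, assuming a hypothetical name `awk_shuffle` (so it does not shadow a real `shuf`) and a made-up word list:

```shell
#!/bin/sh
# Same pipeline as the fallback in the script, under a hypothetical name.
awk_shuffle() {
    awk 'BEGIN { srand(); OFMT = "%.17f" } { print rand(), $0 }' "$@" |
        sort -k1,1n |
        cut -d ' ' -f2-
}

# Made-up sample input, one word per line.
input=$(printf '%s\n' the of and to in is that)
shuffled=$(printf '%s\n' "$input" | awk_shuffle)

# A shuffle must be a permutation: once both sides are sorted,
# the input and the output should be identical.
if [ "$(printf '%s\n' "$input" | sort)" = "$(printf '%s\n' "$shuffled" | sort)" ]; then
    echo "same words, new order:"
    printf '%s\n' "$shuffled"
else
    echo "shuffle lost or altered lines" >&2
fi
```

Note that `srand()` without an argument seeds from the time of day, so two runs within the same second will typically produce the same order.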