Introduction

Kaptsja is the Dutch phonetic spelling of "captcha". The software is meant to determine whether an online "user" is a human rather than a (ro)bot or any other type of automated program. The acronym CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart". When a user solves the presented captcha "puzzle", continuation to the next pages of the site is allowed; otherwise the user is refused further access. In Kaptsja this means the user is routed to a success page or a failure page, depending on the check result. This routing can easily be adapted to your needs.

NOTE 1: A second file, Readme2.rst, describes additional details and features.

NOTE 2: The configuration file and the source code also offer a wealth of documentation.

Release notes

10.1.2 First release

10.1.3 Added missing randomlist directory with pictures

10.1.4 HTML closing tag correction in modal page

10.1.5 Added a time limit to solve the Kaptsja within a defined time (in seconds). Removed some leftover "debug" print statements.

10.1.6 Corrected the rotation angle test: the value was always set to 30. The rotation range 0 - 360 is now possible, as intended. Bottle debug and reloader are set to False to prevent double generation of HTML pages (no functional change, however). Instructions on how to run on Android have been added.

10.2.0 Added configuration setting "redirect_after_captcha". Setting this to "True" avoids redirection to the success or failure page after submission and Kaptsja verification. Only the string True or False is returned, allowing JavaScript to handle the continuation in the HTML page itself (similar to Google's reCAPTCHA).

10.3.0 PIL (Pillow) was changed after version 9.5.0. Kaptsja has been upgraded to Pillow 10.2.0! That version is now required; run: pip uninstall Pillow followed by pip install Pillow.

All versions:

If you get font errors on Linux please run: sudo apt --reinstall install ttf-mscorefonts-installer

If you get a BeautifulSoup error in a virtual environment you might need to run: pip3 install beautifulsoup4

Clearing the browser cache might be required when False is returned incorrectly!

Quick Installation of Kaptsja

Important: use --target or -t to specify your desired location

pip install --target=<your_directory> Kaptsja

pip install -t <your_directory> Kaptsja

Example: pip install -t tempKaptsja Kaptsja

Example: pip install -t /home/Kaptsja Kaptsja

Features and Functions

  • Three (3) different Kaptsja models are supported and are selectable.
    1. default -> the Kaptsja is shown direct on a single full HTML page
    2. modal -> the Kaptsja is "injected" in a modal window generated by bootstrap.js at top of a full HTML page
    3. div -> the Kaptsja is "injected" in a single full HTML page using JQuery

For models 2 and 3 the user is asked to check a box labeled "I am not a Robot"

  • Two (2) modes of operation are possible:
    1. Each Kaptsja page is 100% unique per user request, as it is generated instantly on-the-fly
    2. A configurable set of 100% unique Kaptsja pages is pre-generated and randomly presented per user request
  • Numerous settings are available to change the look and feel, colors, morphing, rotation behavior, number of characters (circles) and so on.
  • All relevant displayed text on the Kaptsja and its generated web pages can be changed or translated in the configuration file.
  • Configuration settings are validated, defaults are set when needed, and in case of errors a message is written to the log file.
  • Support instructions and examples for usage in combination with Nginx web server are provided.
  • It has been tested on Windows and Ubuntu 20.04 Linux with Python 3.7.4 and 3.8.5. Other versions might work as well.
  • It has also been tested for fun on Android (v8.0.0) using Pydroid 3 App (v4.01) with the Pydroid Repository Plugin. See below "How to run Kaptsja on Android".
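
The validation behaviour mentioned in the list above (check each setting, fall back to a default, report the problem in the log) could look roughly like the sketch below. The names `DEFAULTS` and `validate_settings`, the logger name and the type-based rule are hypothetical and are not taken from KaptsjaConfiguration.py; only the setting names and example values come from this Readme.

```python
import logging

logger = logging.getLogger("kaptsja_sketch")  # hypothetical logger name

# Hypothetical defaults; the names and values are the examples from this Readme.
DEFAULTS = {"font_textzone": 20, "font_circle": 45, "siteport": 9081, "sitedebug": False}


def validate_settings(settings):
    """Return a checked copy of settings.

    Missing settings silently get their default value. Settings with the
    wrong type are replaced by the default and reported in the log.
    """
    validated = {}
    for name, default in DEFAULTS.items():
        value = settings.get(name, default)
        # bool is a subclass of int, so compare the exact types
        if type(value) is not type(default):
            logger.error("Setting %s=%r is invalid, using default %r", name, value, default)
            value = default
        validated[name] = value
    return validated


# font_circle has the wrong type, siteport is valid, the rest is missing
checked = validate_settings({"font_circle": "big", "siteport": 8080})
```

A real check would probably also test value ranges (for example the rotation range 0 - 360); the sketch only tests types.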

Summarized functionality / process overview

  • The online user is prompted with a picture with randomly positioned circles with characters (default uppercase letters and digits).
  • The user should click on each circle in the right sorting order, according to a generated instruction text inside the picture.
  • When the circles are correctly sorted, the user passes the captcha check; otherwise a failure page is shown.
  • For dynamically generated Kaptsjas a maximum number of seconds to solve the "puzzle" is set. This prevents the values of a solved Kaptsja from being resent after the defined time.
  • Before submitting the result, the user may retry (that is, correct a wrong sorting order) until the configurable number of retries has been exceeded.
  • The font type and size for characters and instruction text are configurable.
  • The number of circles and the characters to use are both configurable, as well as the transparency percentage of the circle and character.
  • The positioning of the instruction text adapts automatically to the picture size, the size of the text and the fonts used. More or fewer words, more lines or line breaks, other font types and other font sizes all lead to a different instruction text size, as do differently sized circles with a character: Kaptsja takes care of all that.
  • The diameters of the circles are randomly generated. Positioning of circles in the picture will automatically adapt to changes in the size of the instruction text.
  • A single picture or a subdirectory with a number of pictures (randomly picked) can be provided to serve as background for the captcha. A copyright notice text can optionally be put on the picture in the upper left or upper right corner. The copyright text, its color and its text size can be configured.
  • A full set of nice example pictures is provided for random display. The default files: Kaptsja_bg.jpg and Kaptsja.ico can be (re-)generated at will.
  • Background pictures that are too large are automatically resized to fit the width; the height is resized proportionally (resizing happens, and is needed, only once).
  • Pictures can be forced to rescale, for example to get a smaller modal window (see model 2, which creates a modal HTML page: a popup). The width of the modal window adapts automatically to the new width of the resized picture.
  • For too small background pictures a recommendation message with a minimum required picture size will be generated in the log file: so another input picture can be picked or the font sizes and number of circles can be reduced.
  • The characters are somewhat distorted to make it difficult for (ro)bots and OCR software to recognise them.
  • Chinese and other Unicode characters are supported and they can be used as "Chinese characters symbols" in the circles, but keep in mind that not everybody is familiar with their right sorting order.
  • All circles and characters get random colors. Increasing the transparency may make them more difficult to detect on the picture for scan or face recognition software, but also, to a lesser extent, for humans.
  • The morphing of the characters can be changed making detection even more difficult. In general humans are better in "reconstructing the missing pieces" of a character.
  • Also the sorting instructions are somewhat distorted for the same reason as above.
  • For (ro)bots and humans there are no hints inside the generated Kaptsja HTML page to find, for example, the correct sort order or to determine an algorithm for correct clicking in the circles. Visual interpretation of the picture and instructions is required to solve the captcha.
  • Automated detection software will be very complex as it needs to figure out:
    • What do the shown sorting instructions say? --> instruction changes randomly
    • Which characters are provided? --> changes randomly and they are morphed to hinder easy detection
    • Where are these characters located in the picture? --> the position changes per new Kaptsja picture
    • Clicking in the circle is a must, so the software needs to emulate this.
  • A click on a circle picture does not indicate the character itself. The clicks represent just a check number.
  • At the server side the Kaptsja puzzle will be solved by the software following the same process as the user, but with one difference: the check process knows (without server side storage) what the correct order should be, which is not the sorting order itself.
  • The server encrypts the clue to solve the puzzle using 256-bit AES encryption (AES is a subset of the Rijndael algorithm) and submits it within the Kaptsja page. The user re-submits the hidden encrypted code together with the code created by the clicks. Both codes are needed to validate the result. The server side software knows how to decrypt the "clue".
  • The Kaptsja software generates all: picture with circles plus characters and the required Kaptsja HTML pages including the JavaScript and CSS for processing.
  • From the HOME page three (3) models are presented. The activated model can be selected to show and test the generated captcha.
  • An external web server is advised for running Kaptsja in production. Nginx and uwsgi have been tested. Configuration instructions and example conf and ini files are included, as well as some hints for solving potential issues.
  • A pre-configured web site module using the Bottle web server is provided. Installation of the Python Bottle web server is required.
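
The server side check described in the list above (compare the check numbers produced by the clicks with the decrypted clue, and reject submissions that arrive after the time limit) might be sketched as follows. This is an illustration under assumptions, not the logic of KaptsjaGenerator.py: the clue layout (`issued_at`, `expected`), the function name `check_submission` and the 60 second limit are all hypothetical, and the AES decryption step is left out.

```python
import time

MAX_SOLVE_SECONDS = 60  # hypothetical time limit in seconds


def check_submission(clue, clicked, now=None):
    """Validate one submission.

    clue    -- dict recovered from the hidden encrypted field (decryption
               is not shown); the layout used here is an assumption
    clicked -- list of check numbers produced by the clicks, in click order
    """
    now = time.time() if now is None else now
    if now - clue["issued_at"] > MAX_SOLVE_SECONDS:
        return False  # too late: a solved Kaptsja cannot be resent afterwards
    return clicked == clue["expected"]


clue = {"issued_at": 1000.0, "expected": [7, 2, 9]}
in_time = check_submission(clue, [7, 2, 9], now=1030.0)      # right order, in time
wrong_order = check_submission(clue, [2, 7, 9], now=1030.0)  # wrong order
too_late = check_submission(clue, [7, 2, 9], now=1100.0)     # right order, too late
```

Because the expected values and the issue time travel inside the encrypted clue, a check of this kind needs no server side storage, which matches the description above.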

Dependencies

  • Python 3.7.4 was used to develop this software. It has been tested with Python 3.7.4 on Windows and Python 3.8.5 on Ubuntu 20.04 (other Python 3 versions might work).
  • Most imports of packages are from the standard Python distribution libraries.
  • The indicated versions below are additionally installed and used during development.
  • The additional Python packages can be installed with: pip install <package>; on Ubuntu: sudo apt-get install <package>
  • When Anaconda is installed use conda install -c anaconda <package>.

Required:
  • bottle 0.12.18
  • Pillow (Python Imaging Library, a fork of PIL): versions 7.0.0 and 8.0.0 were tested on Windows and version 8.0.0 on Ubuntu 20.04; as of release 10.3.0, Pillow 10.2.0 is required (see Release notes)
  • pycryptodome 3.9.8 (as alternative: pycrypto 2.6.1 will also work with Python 3.7, with Python 3.8 a small fix is needed)
  • BeautifulSoup4

Optional:

  • lxml 4.6.2
  • pytesseract 0.3.6 and Tesseract. They can be installed to read the generated images with OCR, in order to verify whether generated Kaptsjas can be recognised by OCR software.

Quick start

This instruction assumes that Python 3.7.4 is already installed. Copy Kaptsja and subdirectories to a directory (any directory will do). Unzip into your directory of choice if you have a zipped version of Kaptsja.

The structure should look like this (the rendering of tree structure might fail here; sorry for that. Please look into readme.rst file itself when it is unreadable here):

---<your directory>
|   |_Kaptsja
|   | |_css
|   |   |_bootstrap.min-3.3.7.css
|   | |_docs
|   |   |_Readme2.rst
|   | |_html
|   |   |_KaptsjaFailurePage.html
|   |   |_KaptsjaHome.html
|   |   |_KaptsjaSuccessPage.html
|   | |_js
|   |   |_bootstrap.min-3.3.7.js
|   |   |_jquery.min-3.5.1.js
|   | |_key
|   |   |_Kaptsja_secret_key.txt
|   | |_log
|   |   |_Kaptsja.log
|   | |_media
|   |   |_randomlist
|   |   | |_ ... A list of example input picture files has been provided (Courtesy of Margrhet Stamps, All Glass works are made by myself ;-)
|   |   | |_ ... Various input types may be used like jpg, png, tiff, bmp, ...
|   |   | |_ Glass_1.jpg
|   |   | |_ Glass_2.jpg
|   |   | |_ ...
|   |   | |_ Glass_7.jpg
|   |   | |_ Glass_8.jpg
|   |   | |_ Kaptsja_bg.jpg
|   |   |_ Kaptsja.ico     <-- the default favicon.ico file, which is presented in the web browser tab and served by Bottle.py
|   |   |_ Kaptsja_bg.jpg  <-- default input picture file plus copies of the files shown under randomlist
|   | |_scripts
|   | | |_KaptsjaConfiguration.py
|   | | |_KaptsjaEncDec.py
|   | | |_KaptsjaGenerator.py
|   | | |_KaptsjaHTMLpages.py
|   | | |_KaptsjaPictureIco.py
|   | | |_KaptsjaSite.py
|   | | |_secret_key.txt
|   | | |_Z__input.txt
|   | | |_Z__input_dec.txt
|   | | |_Z__input_enc.txt
|   | |_work
|   | |_ ... generated unique Kaptsja sets (html, png, js, css files)
|   | |_ ... See below the examples of generated file names.
|   | |_ ... KaptsjaDIV_1607460886.7940052.html,
|   | |_ ... KaptsjaDIV_1607460886.7940052.css
|   | |_ ... KaptsjaDIV_1607460886.7940052.js
|   | |_ ... KaptsjaPage_1607460908.852623
|   | |_ ... KaptsjaPicture_1607460888.4223156.png
|   | |_ ... KaptsjaModal_1607461539.9284627.html
|   |_Kaptsja Copyright Notice.txt
|   |_Kaptsja.zip     <-- Complete zipped Kaptsja directory, download this Zip and unzip. Kaptsja directory plus subdirectories and files will be created
|   |_Readme.rst
|   |_Start_Kaptsja_website.bat
|   |_Start_Kaptsja_website.sh

Installation of the additional Python packages

Use pip for installation. Pip is the package installer for Python packages.

  • pip install bottle

  • pip install Pillow

  • pip install pycryptodome

  • on Ubuntu 20.04 use: sudo apt-get install python3-bs4

  • on Windows use: pip install BeautifulSoup4

  • Optional: install lxml

    • on Ubuntu 20.04 use: sudo apt-get install python3-lxml
    • on Windows use: pip install lxml

    When lxml is installed it will automatically "replace" the default html.parser.

If an Anaconda distribution from anaconda.org has been installed use: conda install -c conda-forge <package name>

Some optional configuration changes for a quick customization

Play first with the Kaptsja software, consult Readme2.rst in ./docs and study the comments in the configuration file for more advanced configuration possibilities.

In the file KaptsjaConfiguration.py in the subdirectory ./scripts/, adapt some of the settings shown below (when needed).

These are the paths to where your fonts are installed and, if you want to change it, the default input picture. It is best to use Linux path notation, but Windows path notation will work as long as you put the letter r or R in front of the path string, like:

r"<Windows Path here>" or R"<Windows Path here>". This is Python raw string notation, in which all backslashes are left in the string. You do not need to use \\ as Windows path separator, unless the letter r is missing!

Be aware that this is normal Python code! Check these settings to begin with. The values are just examples and may be changed.

  • input_picture ="Kaptsja_bg.jpg"
  • font_textzone = 20
  • font_circle = 45
  • sitehost = "ubuntu2004.wsl"
  • siteport = 9081
  • siteserver = "python_server"
  • sitedebug = False
  • site_reloader = False
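
The raw string notation for Windows paths described above can be checked with a few lines of plain Python. The font path is a made-up example used only to show the notation; it is not a Kaptsja default.

```python
# Hypothetical font path, used only to show the notation.
raw_path = r"C:\Windows\Fonts\arial.ttf"        # raw string: backslashes kept as typed
escaped_path = "C:\\Windows\\Fonts\\arial.ttf"  # normal string: every backslash doubled
forward_path = "C:/Windows/Fonts/arial.ttf"     # Linux style notation

# Both notations produce exactly the same string.
same = raw_path == escaped_path

# Without the r prefix, "\t" is read as a tab character,
# so this path no longer contains a backslash at all.
broken = "C:\temp"
```

Here `same` is True, while `broken` silently contains a tab character, which is why the r prefix (or Linux style notation) is the safer choice in the configuration file.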

Startup commands

Open a command window and cd to <your directory>/Kaptsja/

On Windows enter command: Start_Kaptsja_website.bat or run python .\scripts\KaptsjaSite.py

On Linux enter command: Start_Kaptsja_website.sh or run python ./scripts/KaptsjaSite.py

Open a web browser and enter the URL as shown in the command window: Default: http://localhost:8080/

A web page opens with tabs. Click on the tab for the activated model to start the Kaptsja and try it!

Program KaptsjaGenerator.py which generates the shown Kaptsja page can be run directly from the command line as follows (needed when max_captcha_sets > 0):

Open a command window and cd to <your directory>/Kaptsja/

Enter command: python ./scripts/KaptsjaGenerator.py and follow the shown instructions.

If no default picture (Kaptsja_bg.jpg) or default icon (Kaptsja.ico) exists, run KaptsjaPictureIco.py.

Enter command: python ./scripts/KaptsjaPictureIco.py and the picture and ico will be created in media_dir.

Put any picture to be used as Kaptsja background in /Kaptsja/media or in /Kaptsja/media/randomlist.
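
Random selection of a background from a directory such as media/randomlist could be sketched as below. The function name `pick_background` and the list of accepted suffixes are assumptions made for this illustration; the actual selection code in Kaptsja may differ.

```python
import random
from pathlib import Path

# Hypothetical list of accepted picture types (this Readme names jpg, png, tiff, bmp).
PICTURE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tiff", ".bmp"}


def pick_background(directory):
    """Return one randomly chosen picture file from the given directory."""
    pictures = sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in PICTURE_SUFFIXES
    )
    if not pictures:
        raise FileNotFoundError(f"no background pictures found in {directory}")
    return random.choice(pictures)
```

A call such as `pick_background("media/randomlist")`, run from the Kaptsja home folder, would then return the path of one of the pictures in that folder.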

How to run Kaptsja on Android

For fun Kaptsja has also been tested on Android (v8.0.0) using Pydroid 3 App (v4.01) with the Pydroid Repository Plugin. Here is how to do it.

  • Install Pydroid 3 and Pydroid Repository plugin Apps (the free versions will be okay).
  • Remember to run both Pydroid 3 and your mobile web browser in split screen, otherwise Pydroid will close the session.
  • This is due to a bug in Pydroid; when it happens, Pydroid needs to be restarted to continue.
  • You need to install a font as well. Download the desired font file and when compressed (.zip format) unzip the file.
  • Free fonts can be downloaded here https://fonts.google.com/ e.g. https://fonts.google.com/specimen/Ubuntu. Ubuntu-Bold.ttf is a good choice.
  • A good place to put the font file (e.g. Ubuntu-Bold.ttf) is the Kaptsja home folder. Do not forget to adapt the two font paths in KaptsjaConfiguration.py.
  • Use a text editor App like QuickEdit (any will do) to change the settings.
  • Always install Kaptsja via the Pydroid Terminal option (from Menu). First change to your desired folder location and run the following command there: "pip install -t . Kaptsja".
  • Install the dependencies as listed above. Here you can either use the Pydroid pip menu or simply run in the Pydroid Terminal "pip install bottle Pillow pycryptodome BeautifulSoup4".
  • The whole Kaptsja folder structure as listed above will be installed as subdirectory at your desired location.
  • Run Kaptsja via the Pydroid Terminal option. Start the command in the new Kaptsja home folder! Run command "python ./scripts/KaptsjaSite.py".
  • Open a mobile browser in split screen and enter url: "http://localhost:8080/".
  • Enjoy solving Kaptsja's!

More details are documented in ./docs/Readme2.rst

For more installation and configuration details, see the Readme2.rst file located at "/Kaptsja/docs/Readme2.rst". For installation with Nginx and uwsgi refer to "/Kaptsja/docs/Installation of Kaptsja with Nginx and uwsgi.rst". Note that combining various Python versions in a Python virtual environment setup and/or with native Python installations on Linux can cause quite some headaches, especially when settings and binaries are mixed! Double check!

A Multipurpose AES 256-bit Encryption and Decryption module is included

Module KaptsjaEncDec.py contains an advanced encryption/decryption implementation.

In the Kaptsja HTML page it encrypts/decrypts the controlvalue sent to and returned from the browser.

This encryption/decryption module can be used universally in many projects!

Note: a fix when using pycrypto instead of pycryptodome

The suggestion is to use pycryptodome, but when that is not possible pycrypto can be used as well, taking into account the following remarks.

Instead of pycryptodome 3.9.8, package pycrypto 2.6.1 may be used as a drop-in replacement.
It has been tested with Python 3.7.4, but when combined with Python 3.8 the following error needs to be fixed first!

Solving: AttributeError: module 'time' has no attribute 'clock' in Python 3.8

        When using Python 3.8 or higher the following error will occur:
          File "/usr/local/lib/python3.8/dist-packages/Crypto/Random/_UserFriendlyRNG.py", line 77, in collect
                t = time.clock()
        AttributeError: module 'time' has no attribute 'clock'

        In Python 3.8 the function time.clock() has been removed, after having been deprecated since Python 3.3:
        use time.perf_counter() or time.process_time() instead, depending on your requirements, to have a well-defined behavior.
        (Contributed by Matthias Bussonnier in bpo-36895 https://bugs.python.org/issue36895)

        To fix this, change line 77 in module _UserFriendlyRNG.py as follows:
        77   t = time.clock()          <-- old
        77   t = time.process_time()   <-- new

        Ubuntu:
                sudo nano /usr/local/lib/python3.8/dist-packages/Crypto/Random/_UserFriendlyRNG.py
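
As an alternative to editing the installed file, the removed name can be restored at runtime before pycrypto is imported. This is a workaround sketch and not something the Kaptsja sources are known to do; it only helps if it runs before the first `Crypto` import.

```python
import time

# time.clock() was removed in Python 3.8. Point the old name at
# time.process_time(), the same replacement as in the manual fix above.
if not hasattr(time, "clock"):
    time.clock = time.process_time

t = time.clock()  # now works on old and new Python versions
```

Patching a standard library module this way affects the whole process, so switching to pycryptodome remains the cleaner option where possible.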