Rookout Log Scanner

The Rookout Log Scanner project makes it easy and straightforward to scan GitHub repositories for log lines.

About

Rookout is a data extraction and pipelining platform that lets you collect any piece of data from live code, on demand, using non-breaking breakpoints. (Learn more about Rookout on our website or our docs pages.)

The Log Scanner is a tool for scanning GitHub repositories, whether your own or someone else's, and analyzing their logging usage.

  • Currently supports: Python, Java, JavaScript, C#.

Getting Started

Prerequisites

Docker, or Python 3.7.4 or later.

Clone the Rookout Log Scanner repository.

git clone https://github.com/Rookout/log-scanner.git
cd log-scanner

Make sure you have a GitHub token. (If you're generating a new one, it doesn't need anything but repo access.)

Usage

Using Docker (recommended)

  1. Build the Docker image.
docker build . -t log-scanner
  2. Automatically collect all of your personal repositories using our Auto_Collector, or manually edit inputs/repositories.txt to list the repositories you wish to scan, one per line. For example:
https://github.com/Rookout/log-scanner
https://github.com/Rookout/tutorial-python
https://github.com/Rookout/explorook
  3. Run the scanner (don't forget to replace <Your_GitHub_Token> with your own token).
docker run \
    -v `pwd`/inputs:/app/inputs \
    -v `pwd`/outputs:/app/outputs \
    -e GITHUB_TOKEN=<Your_GitHub_Token> \
    log-scanner

Once the scanner is done, you will find the results in the outputs folder.
For full details about the outputs and a quick auto analysis tool, check out OUTPUTS.md.
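
If you edit inputs/repositories.txt by hand, a quick check of its contents can catch typos before a scan. The following is a minimal sketch, not part of the project: it assumes each line is a plain https://github.com/<owner>/<repo> URL, and the function name parse_repositories is hypothetical.

```python
import re

# Assumed format: one https://github.com/<owner>/<repo> URL per line.
REPO_URL = re.compile(r"^https://github\.com/([\w.-]+)/([\w.-]+?)(?:\.git)?/?$")


def parse_repositories(text):
    """Split the contents of a repositories file into valid and invalid entries.

    Returns (valid, invalid): valid is a list of (owner, repo) tuples,
    invalid is a list of the lines that did not look like GitHub URLs.
    """
    valid, invalid = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines are ignored
        match = REPO_URL.match(line)
        if match:
            valid.append((match.group(1), match.group(2)))
        else:
            invalid.append(line)
    return valid, invalid


if __name__ == "__main__":
    sample = (
        "https://github.com/Rookout/log-scanner\n"
        "https://github.com/Rookout/tutorial-python\n"
        "https://gitlab.com/someone/project\n"
    )
    valid, invalid = parse_repositories(sample)
    print("valid:", valid)
    print("invalid:", invalid)
```

Lines reported as invalid are worth fixing first, since the scanner only works with repositories hosted on GitHub.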

Running Locally

  1. Install the project requirements.
pip install -r requirements.txt
  2. Set <Your_GitHub_Token> as a temporary local environment variable.
# macOS / Linux
export GITHUB_TOKEN="<Your_GitHub_Token>"
# Windows (PowerShell)
$env:GITHUB_TOKEN="<Your_GitHub_Token>"
  3. Automatically collect all of your personal repositories using our Auto_Collector.
python inputs/collect_my_repos.py
  • Alternatively, you can manually edit inputs/repositories.txt to list any repositories you wish to scan, one per line. For example:
https://github.com/Rookout/log-scanner
https://github.com/Rookout/tutorial-python
https://github.com/Rookout/explorook
  4. Run the program. (macOS users: please notice)
python index.py

Once the scanner is done, you will find the results in the outputs folder.
For full details about the outputs and a quick auto analysis tool, check out OUTPUTS.md.
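
Before running locally, it can help to confirm that the two inputs described above are in place: the GITHUB_TOKEN environment variable and a non-empty inputs/repositories.txt. This is a rough sketch under those assumptions; the preflight helper is hypothetical and is not shipped with the project.

```python
import os
from pathlib import Path


def preflight(env, repo_file):
    """Return a list of problems; an empty list means the inputs look ready."""
    problems = []
    if not env.get("GITHUB_TOKEN"):
        problems.append("GITHUB_TOKEN is not set")
    path = Path(repo_file)
    if not path.is_file():
        problems.append(f"{repo_file} does not exist")
    elif not any(line.strip() for line in path.read_text().splitlines()):
        problems.append(f"{repo_file} lists no repositories")
    return problems


if __name__ == "__main__":
    # Run from the repository root so the relative path resolves.
    for problem in preflight(os.environ, "inputs/repositories.txt"):
        print("problem:", problem)
```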

Disclaimers

Rookout Log Scanner relies heavily on the GitHub API, so it currently doesn't support repositories hosted on other version control hosting services such as Bitbucket, GitLab, Coding, etc. For the same reason, a GitHub token is a prerequisite for running the project; it is also what gives the scanner access to the user's private repositories. The token is used only to connect to the GitHub API, by the local Python code you've cloned. Feel free to delete the token after running the scanner.
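
To illustrate what "used for connecting to the GitHub API" typically means, here is a small sketch of an authenticated request to GitHub's REST endpoint for listing the current user's repositories. It is an assumption about the general mechanism, not a copy of the project's code; the function name build_repo_list_request is hypothetical, and the request is only built, never sent.

```python
import os
import urllib.request


def build_repo_list_request(token, page=1):
    """Build (but do not send) a request for the authenticated user's repositories."""
    url = f"https://api.github.com/user/repos?per_page=100&page={page}"
    return urllib.request.Request(
        url,
        headers={
            # The token travels only in this header, to api.github.com.
            "Authorization": f"token {token}",
            "Accept": "application/vnd.github+json",
        },
    )


if __name__ == "__main__":
    token = os.environ.get("GITHUB_TOKEN", "<Your_GitHub_Token>")
    request = build_repo_list_request(token)
    print(request.full_url)
    # Sending it would be: urllib.request.urlopen(request)
```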

Rookout Log Scanner currently supports scanning Python, Java, JavaScript, and C#. As a result, all data extracted from the repositories relates only to files written in those languages. We are currently working to expand the scope of the project and support additional common languages.

During the scanning process, at any given moment, a few repositories will be cloned onto the user's computer and deleted as soon as they have been scanned. The exact number of simultaneously cloned repositories depends on the machine's CPU capabilities. For the process to succeed, users should make sure they have enough spare memory for the input repositories.
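
The clone, scan, delete cycle described above can be pictured with the sketch below. It is illustrative only: the function names are hypothetical, no real cloning happens, and the choice of a thread pool sized by os.cpu_count() is an assumption rather than the project's actual implementation.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def scan_one(repo_url):
    """Hypothetical per-repository step using a working directory that is removed afterwards."""
    with tempfile.TemporaryDirectory() as workdir:
        # A real scanner would clone repo_url into workdir and scan it here.
        existed_during_scan = os.path.isdir(workdir)
    # Leaving the `with` block deletes workdir and everything inside it.
    return repo_url, existed_during_scan, os.path.exists(workdir)


def scan_all(repo_urls, max_workers=None):
    """Process repositories in parallel, capped at the CPU count by default."""
    workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(scan_one, repo_urls))


if __name__ == "__main__":
    urls = [
        "https://github.com/Rookout/log-scanner",
        "https://github.com/Rookout/explorook",
    ]
    for url, existed, still_there in scan_all(urls):
        print(url, "| workdir existed:", existed, "| still on disk:", still_there)
```

With this shape, at most `workers` working directories exist at once, which is why the resources needed scale with both the CPU count and the size of the input repositories.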

Rookout Log Scanner does not guarantee full detection of every log in every file. Detection is done with regular expressions, which were set up based on research into the common syntax, conventions, tools and packages used in the market. Scanning a repository that uses a self-created logging system, or that deviates from common conventions, might lead to inaccurate results.
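
To show why regex-based detection can miss unconventional logging, here is a simplified sketch. The patterns below are illustrative guesses at common logging calls in each supported language; they are not the expressions the project actually uses, and the names LOG_PATTERNS and count_log_lines are hypothetical.

```python
import os
import re

# Illustrative patterns only; the project's real expressions are not shown in this README.
LOG_PATTERNS = {
    ".py": re.compile(r"\b(?:logging|logger|log)\.(?:debug|info|warning|error|exception|critical)\s*\("),
    ".java": re.compile(r"\b(?:logger|log|LOGGER|LOG)\.(?:trace|debug|info|warn|error)\s*\("),
    ".js": re.compile(r"\bconsole\.(?:log|debug|info|warn|error)\s*\("),
    ".cs": re.compile(r"\b(?:_logger|logger|Log)\.(?:LogDebug|LogInformation|LogWarning|LogError)\s*\("),
}


def count_log_lines(filename, source):
    """Count lines in `source` that match the pattern for the file's extension."""
    pattern = LOG_PATTERNS.get(os.path.splitext(filename)[1])
    if pattern is None:
        return 0  # unsupported language: nothing is counted
    return sum(1 for line in source.splitlines() if pattern.search(line))


if __name__ == "__main__":
    python_source = (
        "import logging\n"
        'logging.info("starting")\n'
        'audit("custom logging helper")\n'  # a self-made logger: not detected
        'logger.error("failed")\n'
    )
    print(count_log_lines("app.py", python_source))
```

In this example the call to the custom `audit` helper is not counted, which is the kind of inaccuracy the disclaimer above describes.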

Copyright (c) Rookout LTD. All rights reserved.

Licensed under the Apache 2.0 license.
