
Commit

Merge pull request #1 from Mudlet/dev
Merge working dev branch into main to have the source of truth as main branch
keneanung authored Sep 27, 2021
2 parents 53f81da + 1e6cf31 commit 469d640
Showing 18 changed files with 1,744 additions and 615 deletions.
30 changes: 28 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -29,13 +29,39 @@ CI Snapshots provides an endpoint at `/json.php` for fetching snapshot data as JSON
To use the JSON data, send a GET request to https://make.mudlet.org/snapshots/json.php

All data available will be returned by default. Optional arguments can be supplied to filter the returned data.
Optional URL Paramers are:
Optional URL Parameters are:
- `prid` -- a PR ID number from github.
- `commitid` -- a Commit ID from Git/Github.
- `platform` -- a string for the platform, which must be one of: `windows`, `linux`, or `macos`

The requested JSON list will show only entries which have matching values.
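For illustration, the optional filters above can be assembled into a request URL. The following Python helper is a hypothetical sketch; only the endpoint and the parameter names (`prid`, `commitid`, `platform`) come from this README:

```python
from urllib.parse import urlencode

BASE_URL = "https://make.mudlet.org/snapshots/json.php"

def build_snapshot_query(prid=None, commitid=None, platform=None):
    """Assemble a json.php request URL from the optional filter parameters."""
    params = {}
    if prid is not None:
        params["prid"] = prid
    if commitid is not None:
        params["commitid"] = commitid
    if platform is not None:
        # The README restricts platform to these three values.
        if platform not in ("windows", "linux", "macos"):
            raise ValueError("platform must be one of: windows, linux, macos")
        params["platform"] = platform
    # With no filters, all available data is returned by default.
    return BASE_URL + ("?" + urlencode(params) if params else "")
```

Sending a GET request to the resulting URL returns only the entries matching the supplied filters.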

## Usage for Github Artifacts
CI Snapshots supports fetching Artifacts from Github Action Storage.
Use `/github_artifacts.php` to submit a fetch request. If successful, the endpoint returns a URL, similar to how PUT uploads work.
Required URL Parameters are:
- `id` -- a github artifact ID number.
- `unzip` -- 0 to use the zip as-is, or 1 to extract a file from the zip.

When using the `unzip=1` parameter, the artifact zip must contain exactly one file. The filename stored in the zip is also used as the snapshot filename.
When using the `unzip=0` parameter, the snapshot filename is created by taking the artifact's `name` JSON property and appending `.zip` to it.

Optional settings via Headers `Max-Days` and `Max-Downloads` are also supported by this endpoint.
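As a sketch, a fetch request consists of the two required URL parameters plus the optional headers. This Python helper is hypothetical; only the endpoint, the `id`/`unzip` parameters, and the `Max-Days`/`Max-Downloads` headers come from this README (no request is actually sent here):

```python
from urllib.parse import urlencode

ENDPOINT = "https://make.mudlet.org/snapshots/github_artifacts.php"

def build_artifact_fetch(artifact_id, unzip, max_days=None, max_downloads=None):
    """Return (url, headers) for a github_artifacts.php fetch request."""
    if unzip not in (0, 1):
        raise ValueError("unzip must be 0 (keep zip as-is) or 1 (extract one file)")
    url = ENDPOINT + "?" + urlencode({"id": artifact_id, "unzip": unzip})
    headers = {}
    if max_days is not None:
        headers["Max-Days"] = str(max_days)
    if max_downloads is not None:
        headers["Max-Downloads"] = str(max_downloads)
    return url, headers
```

The returned URL and headers would then be sent as a GET request with your HTTP client of choice.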

### Using Artifact Queue
Due to limitations in Github Actions and Artifacts storage, artifacts cannot be downloaded while the associated job is still running.
To work around this, workflows can enqueue artifacts by name, provided the artifact name is unique among other artifacts.
Usage of `/gha_queue.php` is the same as `/github_artifacts.php`, with one exception:
the `id` parameter is replaced by an `artifact_name` parameter, which must contain a unique Artifact name.

Example usage:
`https://make.mudlet.org/snapshots/gha_queue.php?artifact_name=Mudlet-4.10.1-testing-pr1111-a770ad4d-linux-x64&unzip=1`

Optional settings via Headers are also supported.

**Note:** Queue services require a separate cron job!
**Note 2:** This feature requires an OAuth token with the `actions` scope or a personal access token with the `public_repos` scope.
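The cron job in this commit (`gha_cron.php`) later extracts PR IDs and short commit hashes from queued artifact names like the example above, using a regular expression. The following is an illustrative Python transliteration of that regex; the function wrapper is hypothetical:

```python
import re

# Same pattern as the preg_match() call in gha_cron.php, ported to Python.
ARTIFACT_NAME_RE = re.compile(r'(?:-PR([0-9]+))?-([a-f0-9]{5,9})[.-]', re.IGNORECASE)

def extract_pr_and_commit(artifact_name):
    """Return (pr_id or None, short commit hash) parsed from an artifact
    name, or None if the name does not follow the expected convention."""
    m = ARTIFACT_NAME_RE.search(artifact_name)
    if m is None:
        return None
    pr_id = int(m.group(1)) if m.group(1) is not None else None
    return pr_id, m.group(2)
```

Applied to the example artifact name above, this yields PR 1111 and commit `a770ad4d`.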

## Installation Requirements
This software is powered by PHP and Apache with mod_rewrite. Internationalization requires the Intl and gettext PHP extensions.
Most Apache configurations disable PUT method requests by default, so we need to make some configuration changes to Apache (in the Server or VirtualHost context) to enable PUT method requests and to configure the RewriteMap directive for mod_rewrite.
@@ -55,4 +81,4 @@ The required Apache Directives should be something similar to this:
Download and unpack, or clone, the software into a PHP-enabled server directory. Copy and rename the file `config.example.php` to `config.php` and edit the configuration.
Likewise, copy the `ip_list.example` to `ip_list` and edit the tab-separated list data to suit your needs.
Ensure that you have created an `.htaccess` file and are using the directives found within `.htaccess.example` to enable the rewrite rules and other security controls.
The software will automatically create the required tables in the database. Simply navigate to the index with a web browser before using `cron.php` or attempting to `PUT` files.
The software will automatically create the required tables in the database. Simply navigate to the index with a web browser before using `cron.php` or attempting to `PUT` files.
34 changes: 34 additions & 0 deletions config.example.php
@@ -37,6 +37,40 @@
define('STRANDED_FILE_WINDOW', 2);


// URL to the base artifact listing URL. All URLs are built from this one.
define('GHA_LIST_URL', 'https://api.github.com/repos/Mudlet/Mudlet/actions/artifacts');

// URL for sending notification about processed artifacts.
// Uses POST method and sends application/json list of PR IDs.
define('GHA_QUEUE_NOTICE_URL', '');

// Regular Expression used to validate and extract IDs from artifact URLs.
define('GHA_URL_REGEX', '#https?://api\.github\.com/repos/Mudlet/Mudlet/actions/artifacts/(\d+)(?:/|/zip/?)?#iu');

// Authentication token for Github API calls - OAuth or Personal Access Token.
// An OAuth token needs the "actions" scope
// A personal access token needs the "public_repos" scope
define('GHA_AUTH_TOKEN', 'token');

// Timeout in seconds for github artifact request execution.
// this value controls the max time a request/download can take.
define('GHA_CURL_TIMEOUT', 180);

// Enforce HTTP Auth for GHA requests, similarly to PUT_REQUIRES_AUTH
define('GHA_REQUIRES_AUTH', false);

// prefix for filenames used to store github artifact queue data.
// this must not be left empty!
define('GHA_QFILE_PREFIX', 'ghaq_');

// file extension to append to queue data filename.
define('GHA_QFILE_EXT', '.json');

// number of seconds a github artifact queue item may remain in the queue.
// a few hours would be reasonable.
define('GHA_QUEUE_TIMEOUT', 18000);


// Database connection details
define('DB_HOST', '');

6 changes: 5 additions & 1 deletion cron.php
@@ -165,7 +165,11 @@ function getTravisIps()
$data = array();
$io = popen('dig +short nat.travisci.net | sort', 'r');
while ($line = fgets($io)) {
$data[] = trim($line);
$line = trim($line);
$vIP = filter_var($line, FILTER_VALIDATE_IP);
if( $vIP !== false ) {
$data[] = $vIP;
}
}
pclose($io);
return $data;
286 changes: 286 additions & 0 deletions gha_cron.php
@@ -0,0 +1,286 @@
<?php
/***
* This file has one task: scan for, process, and remove 'enqueued' github artifact transfer requests.
* Processing a queue entry means generating a request to our own endpoint, because I'm lazy...
* Artifacts are available only after a workflow/job is complete, but can be queued long before that.
*
* As it takes some time to process multiple files, care should be taken when deciding on a schedule for this job.
* We need to accept a certain amount of 404s and errors, putting the queued file back until it is done or expired.
***/

require_once('config.php');

header('Content-Type: text/plain');

if (php_sapi_name() != 'cli') {
header("Location: " . SITE_URL);
exit();
}

define("CI_SNAPSHOTS", true);
require_once('lib/functions.common.php');
require_once('lib/functions.http.php');


$JobLockMaxTTL = 1200; // seconds from lock creation to consider the job expired.
$PRList_file = $ScriptPath . DIRECTORY_SEPARATOR . TEMP_DIR . '.pr-queue.json';


class GithubArtifactList {
private $data = array();
public $artifacts_total = 0;
public $artifacts = array();

public function __construct() {
$req_url = GHA_LIST_URL . '?per_page=100';

echo("Fetching artifact list - 100 per-page\n");
$req = do_curl_get($req_url, array('Authorization: token ' . GHA_AUTH_TOKEN));
if ($req === false) {
return false;
}
if ($req[2] !== 200) {
return false;
}

$jdata = json_decode($req[0], true);
if ($jdata === null) {
return false;
}
$this->data = $jdata;

if (isset($jdata['artifacts'])) {
$this->artifacts = $jdata['artifacts'];
} else {
return false;
}

if (isset($jdata['total_count'])) {
$this->artifacts_total = $jdata['total_count'];
} else {
return false;
}

}

function countArtifactsByName( $artifact_name ) {
$found=0;
foreach( $this->artifacts as $idx => $a ) {
if ( $artifact_name == $a['name'] ) {
$found = $found + 1;
}
}
return $found;
}

function getArtifactIdByName( $artifact_name ) {
$id=false;
foreach( $this->artifacts as $idx => $a ) {
if ( $artifact_name == $a['name'] ) {
$id = $a['id'];
}
}
return $id;
}
}


function processSnapshotFetch( $url, $headers ) {
echo("Fetching: $url \n");
$r = do_curl_get($url, $headers);

if ($r === false) {
echo("Error while fetching\n");
return false;
}

if ($r[2] !== 200) {
echo("Response not 200, is: {$r[2]}\n");
echo("Error Text: {$r[0]}\n");
return false;
}

echo("Response: {$r[0]}\n");

return true;
}

function processQueueFile($filepath) {
global $ghalObj;

$raw = file_get_contents($filepath);
$data = json_decode($raw, true);

// unlink in cases of invalid data.
if ($data === null) {
echo('Error decoding JSON! - ' . json_last_error_msg() . "\n");
return false;
}
if (!isset($data['name']) || !isset($data['unzip'])) {
echo("Data missing 'name' and 'unzip' required members!\n");
return false;
}

// we need to clean up carefully here...
// unlink if we have more than one file
// if no files, we may need to either search deeper,
// or we need to wait for a later check - artifact may not exist yet!
$c = $ghalObj->countArtifactsByName( $data['name'] );
if ($c > 1) {
$n = $data['name'];
echo("More than one artifact with the name: $n\n");
return false;
}

if ($c == 0) {
$n = $data['name'];
echo("No artifact with the name: $n \n");
// in this case we check the queue entry time for expiry, otherwise leave the item.
$ex_time = $data['qtime'] + GHA_QUEUE_TIMEOUT;
if (time() > $ex_time) {
echo("Queue item has expired.\n");
return false;
}
return true;
}


$queue_url = SITE_URL . 'github_artifacts.php';
$headers = array();
$unzip = ($data['unzip']) ? '1' : '0';

if (isset($data['id'])) {
$id = $data['id'];
} else {
$id = $ghalObj->getArtifactIdByName( $data['name'] );
}

if (isset($data['maxdays'])) {
$headers[] = 'Max-Days: ' . strval( $data['maxdays'] );
}
if (isset($data['maxdls'])) {
$headers[] = 'Max-Downloads: ' . strval( $data['maxdls'] );
}

$url = $queue_url . '?id=' . strval($id) . '&unzip=' . $unzip;

if ( processSnapshotFetch( $url, $headers ) ) {
unlink($filepath);
}

return true;
}

function loadPRIdsQueue() {
global $PRList_file;

$list = array();
if ( file_exists( $PRList_file ) ) {
$d = file_get_contents($PRList_file);
$decoded = json_decode($d, true);
if (is_array($decoded)) {
$list = $decoded;
}
}

return $list;
}

function savePRIdsQueue($PRList) {
global $PRList_file;

$data = json_encode($PRList);

file_put_contents($PRList_file, $data);
}

function clearPRIdsQueue() {
global $PRList_file;
if ( file_exists($PRList_file) ) {
unlink($PRList_file);
}
}


$job_lock = $ScriptPath . DIRECTORY_SEPARATOR . '.gha_cron.lock';
if ( file_exists( $job_lock ) ) {
$lock_time = filemtime($job_lock);
$lock_exp_time = $lock_time + $JobLockMaxTTL;
if (time() < $lock_exp_time) {
echo("Job already running, quitting. \n");
exit();
} else {
echo("Job already running, but lock has expired! \n");
unlink($job_lock);
exit();
}
} elseif( touch($job_lock) === false ) {
echo("Could not set job lock, quitting. \n");
exit();
}

$timer_start = microtime(true);

$ghalObj = new GithubArtifactList();

$TempDir = $ScriptPath . DIRECTORY_SEPARATOR . TEMP_DIR ;
$FileRegex = '/^' . preg_quote(GHA_QFILE_PREFIX, '/') . '(.+)' . preg_quote(GHA_QFILE_EXT, '/') . '$/iu';
$PRIdList = loadPRIdsQueue();
$files = scandir($TempDir);
foreach( $files as $idx => $file ) {
if ($file == '.' || $file == '..') {
continue;
}
$m = null;
if ( false == preg_match($FileRegex, $file, $m) ) {
continue;
}

$ts = microtime(true);
echo("Processing queue file: $file \n");

if ( !empty($m) ) {
$p = null;
$s = preg_match('/(?:-PR([0-9]+))?-([a-f0-9]{5,9})[\.-]{1}/i', $m[1], $p);
$c = $ghalObj->countArtifactsByName( $m[1] );

if ( $s === 1 && count($p) == 3 ) {
$prid = intval($p[1]);
if ( !in_array($prid, $PRIdList) && $prid !== 0 && $c > 0) {
$PRIdList[] = $prid;
}
}
}

$filepath = $TempDir . $file;
$status = processQueueFile( $filepath );

if ($status === false) {
unlink( $filepath );
}
$n = microtime(true) - $ts;
printf("Processing finished in %4.2f seconds.\n\n", $n);
}

if ( count($PRIdList) > 0 && !empty(GHA_QUEUE_NOTICE_URL) ) {
$post_data = json_encode($PRIdList);

echo("Sending PR link update notice for: \n - " . $post_data . "\n");
$res = do_curl_post(GHA_QUEUE_NOTICE_URL, $post_data, array('Content-Type: application/json'));
if ( $res === false ) {
echo(" - Failed \n");
savePRIdsQueue($PRIdList);
} elseif ( $res[2] == 204 ) {
echo(" - OK \n");
clearPRIdsQueue();
}
}

if ( file_exists($job_lock) ) {
echo("Removing job lock. \n");
unlink($job_lock);
}

$t = microtime(true) - $timer_start;
printf("Finished in %4.2f seconds\n\n\n", $t);
