Showing 44 changed files with 3,227 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
# AdAr - Another dumb Archive | ||
|
||
(sorry, German only at the moment) | ||
|
||
AdAr is a further development based on the system "DiBaS (Digitales Bildarchiv Saffig)", which was developed to archive the photo collection of the Geschichtsverein Saffig. In this project the system was extended with document-related features such as OCR, contact management, and others. | ||
|
||
For now, AdAr is available in German only. The PHP code is provided under the terms of the GPLv3 or later. Some libraries included in this repo are under other licenses, which can be found in the respective project folder. | ||
|
||
Warning: this is tinkered together, partly from historical code. Do not use it in production without a careful review. | ||
|
||
If the PHP EXIF extension is installed, it is used | ||
If pdftotext is installed, it is used | ||
|
||
## Usage | ||
I actively use the system for storing data. To do so, PDF files containing text are generated (see tools/) and then uploaded | ||
|
||
## Installation | ||
|
||
- Requires a web server with PHP >=5.6 and EXIF support | ||
- Requires a MySQL database | ||
- Requires [composer](https://getcomposer.org/) | ||
- tesseract >=3 | ||
- To run OCR on images | ||
- not really tested, language preset to German | ||
- pdftotext | ||
- To extract text from PDF files | ||
|
||
|
||
- Copy the files to the web server | ||
- The folders daten/* and tpl/cache/ must be writable by the web server | ||
- Create a MySQL database and import doc/mysql.sql | ||
- Add the credentials to config.php | ||
- Optional: adjust the name of the installation (ADAR_PROGNAME) | ||
- Optional: set an e-mail address in ADAR_INFOMAIL_TO; in that case an e-mail is sent to this address whenever a new item is created | ||
- Install dependencies: ```composer install``` | ||
- cron.php should be run regularly as the web server user; otherwise temporary files are not cleaned up and OCR is not executed | ||
- e.g. ```*/15 * * * * /usr/bin/php -f /var/www/cron.php > /var/log/adar.cron.log``` in the crontab | ||
- Log in with admin/admin | ||
|
||
## Notes | ||
- There is currently no graphical user management, so the password cannot be changed. In general, it is advisable to set up authentication at the web server level. Users can be edited in SQL; matching password hashes can be generated with the function [session_getNewPasswordHash](https://github.com/adlerweb/awtools/blob/master/session.php#L137). | ||
- Backups. | ||
- More backups. |
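The installation requirements listed in the README can be checked from a small script before the first login. The following is only a sketch under assumptions: `adar_check_prerequisites` is a hypothetical helper that is not part of this commit, and the folder names in the example call are taken from cron.php (`data/tmp`, `data/org`) and the README (`tpl/cache`), so adjust them to your layout.

```php
<?php
// Hypothetical helper, not part of AdAr: collects unmet prerequisites from the README.
function adar_check_prerequisites(array $writableDirs, $minPhp = '5.6.0')
{
    $problems = array();

    // README: web server with PHP >= 5.6
    if (version_compare(PHP_VERSION, $minPhp, '<')) {
        $problems[] = 'PHP ' . $minPhp . ' or newer is required, found ' . PHP_VERSION;
    }

    // README: EXIF support
    if (!extension_loaded('exif')) {
        $problems[] = 'PHP extension "exif" is not loaded';
    }

    // README: these folders must be writable by the web server
    foreach ($writableDirs as $dir) {
        if (!is_dir($dir) || !is_writable($dir)) {
            $problems[] = 'Not a writable directory: ' . $dir;
        }
    }

    return $problems;
}

// Example call; the paths are assumptions, see the note above.
foreach (adar_check_prerequisites(array('data/tmp', 'data/org', 'tpl/cache')) as $problem) {
    echo $problem, "\n";
}
```

Running the script as the web server user matters here, because `is_writable()` answers for the user that executes it.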
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,166 @@ | ||
<?PHP | ||
|
||
/** | ||
* AdAr - Another dumb Archive | ||
* | ||
* AJAX API | ||
* | ||
* @package adar | ||
* @author Florian Knodt <[email protected]> | ||
*/ | ||
|
||
if(!file_exists('config.php') ||!is_readable('config.php')) { | ||
die('Missing configuration'); | ||
} | ||
|
||
require_once('config.php'); //Config | ||
require_once('lib/mysql.wrapper.php'); //ATools->MySQL | ||
require_once('vendor/adlerweb/awtools/session.php'); //ATools->Session-Manager | ||
|
||
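if(!$GLOBALS['adlerweb']['session']->session_isloggedin()) { | ||
header('HTTP/1.0 403 Forbidden'); //Headers must be sent before any output | ||
die('Invalid session'); //Stop here so unauthenticated requests never reach the queries | ||
} | ||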
|
||
$requestData= $_REQUEST; | ||
|
||
$columns = array( | ||
// datatable column index => database column name | ||
0 => array(false, 'ItemID', array('<a href="?m=content_detail&id=%s">%s</a>', array('ItemID', 'ItemID'))), | ||
1 => array(false, 'Caption', false), | ||
2 => array(false, 'Format', false), | ||
3 => array(false, 'Date', false), | ||
4 => array('CONCAT(`Sender`.`FamilyName`,", ",`Sender`.`GivenName`)', 'S_Sender', array('<a href="?m=contact_create&id=%s">%s</a>', array('Sender', 'S_Sender'))), | ||
5 => array('CONCAT(`Receiver`.`FamilyName`,", ",`Receiver`.`GivenName`)', 'S_Receiver', array('<a href="?m=contact_create&id=%s">%s</a>', array('Receiver', 'S_Receiver'))) | ||
); | ||
|
||
$colout = array(); | ||
$colout_f = array(); | ||
$colout_done = array(); | ||
foreach($columns as $col) { | ||
$colout_done[] = $col[1]; | ||
if($col[0]) { | ||
$colout[] = $col[0].' AS `'.$col[1].'`'; | ||
$colout_f[] = $col[0].' AS `'.$col[1].'`'; | ||
}else{ | ||
$colout[] = '`'.$col[1].'`'; | ||
} | ||
|
||
if($col[2]) { | ||
foreach($col[2][1] as $tcol) { | ||
if(!in_array($tcol, $colout_done)) { | ||
$colout[] = $tcol; | ||
$colout_done[] = $tcol; | ||
} | ||
} | ||
} | ||
} | ||
|
||
$sql_data = "SELECT "; | ||
$sql_data .= implode(", ", $colout); | ||
$sql_data .= " FROM Items | ||
LEFT JOIN `Contacts` AS `Sender` ON `Items`.`Sender` = `Sender`.`CID` | ||
LEFT JOIN `Contacts` AS `Receiver` ON `Items`.`Receiver` = `Receiver`.`CID` "; | ||
|
||
$sql_anz = "SELECT COUNT(`Items`.`ItemID`) as anz "; | ||
|
||
$sql_anz .= " FROM Items | ||
LEFT JOIN `Contacts` AS `Sender` ON `Items`.`Sender` = `Sender`.`CID` | ||
LEFT JOIN `Contacts` AS `Receiver` ON `Items`.`Receiver` = `Receiver`.`CID` "; | ||
|
||
// getting total number records without any external filters | ||
//$anzq=$GLOBALS['adlerweb']['sql']->query($sql_anz.$sql_filter); | ||
$anzq=$GLOBALS['adlerweb']['sql']->query_single($sql_anz); | ||
if(!$anzq) { | ||
$totalData=0; | ||
}else{ | ||
$totalData=$anzq['anz']; | ||
} | ||
$totalFiltered = $totalData; | ||
|
||
$sql_filter_data = array(); | ||
$sql_filter = " WHERE 1 = ?"; | ||
$sql_filter_data[] = 1; | ||
|
||
// getting records as per search parameters | ||
for($i=0; $i<count($columns); $i++) { | ||
if( !empty($requestData['columns'][$i]['search']['value']) ){ | ||
if($columns[$i][0]) { | ||
$sql_filter.=" AND (".$columns[$i][0].") LIKE ? "; | ||
$sql_filter_data[] = '%'.$requestData['columns'][$i]['search']['value'].'%'; | ||
}else{ | ||
$sql_filter.=" AND `".$columns[$i][1]."` LIKE ? "; | ||
$sql_filter_data[] = '%'.$requestData['columns'][$i]['search']['value'].'%'; | ||
} | ||
} | ||
} | ||
|
||
if(!empty($requestData['search']['value'])) { | ||
$sql_filter.=" | ||
AND ( | ||
`ItemID` LIKE ? OR | ||
`Caption` LIKE ? OR | ||
`Description` LIKE ? OR | ||
`Format` LIKE ? OR | ||
CONCAT(`Sender`.`FamilyName`,\", \",`Sender`.`GivenName`) LIKE ? OR | ||
CONCAT(`Receiver`.`FamilyName`,\", \",`Receiver`.`GivenName`) LIKE ? | ||
) "; | ||
$sql_filter_data[] = '%'.$requestData['search']['value'].'%'; | ||
$sql_filter_data[] = '%'.$requestData['search']['value'].'%'; | ||
$sql_filter_data[] = '%'.$requestData['search']['value'].'%'; | ||
$sql_filter_data[] = '%'.$requestData['search']['value'].'%'; | ||
$sql_filter_data[] = '%'.$requestData['search']['value'].'%'; | ||
$sql_filter_data[] = '%'.$requestData['search']['value'].'%'; | ||
} | ||
|
||
if(count($sql_filter_data) > 1) { | ||
$anzq=$GLOBALS['adlerweb']['sql']->querystmt_single($sql_anz.$sql_filter, str_repeat('s', count($sql_filter_data)), $sql_filter_data); | ||
if(!$anzq) { | ||
$totalFiltered=0; | ||
}else{ | ||
$totalFiltered=$anzq['anz']; | ||
} | ||
} | ||
|
||
if(isset($requestData['order'][0]['column']) && isset($requestData['order'][0]['dir'])) { | ||
$orderCol = (int)$requestData['order'][0]['column']; //Index into $columns, must exist | ||
if(!isset($columns[$orderCol]) || !in_array($requestData['order'][0]['dir'], array('ASC', 'DESC', 'asc', 'desc'), true)) die('Errr?'); | ||
$sql_filter.=" ORDER BY `".$columns[$orderCol][1]."` ".$requestData['order'][0]['dir'].' '; | ||
} | ||
if(isset($requestData['start']) && isset($requestData['length']) && $requestData['length'] > 0) | ||
$sql_filter.="LIMIT ".(int)$requestData['start']." ,".(int)$requestData['length']." "; // adding length | ||
|
||
$query = $GLOBALS['adlerweb']['sql']->querystmt($sql_data.$sql_filter, str_repeat('s', count($sql_filter_data)), $sql_filter_data); | ||
$data = array(); | ||
if($query) { | ||
foreach($query as $row) { // preparing an array | ||
$nestedData=array(); | ||
|
||
foreach($columns as $col) { | ||
if($col[2]) { | ||
$argdata = array( | ||
$col[2][0] | ||
); | ||
foreach($col[2][1] as $in) { | ||
$argdata[] = $row[$in]; | ||
} | ||
$nestedData[] = call_user_func_array('sprintf', $argdata); | ||
}else{ | ||
$nestedData[] = $row[$col[1]]; | ||
} | ||
} | ||
|
||
$data[] = $nestedData; | ||
} | ||
} | ||
|
||
$json_data = array( | ||
"recordsTotal" => intval( $totalData ), // total number of records | ||
"recordsFiltered" => intval( $totalFiltered ), // total number of records after searching, if there is no searching then totalFiltered = totalData | ||
"data" => $data // total data array | ||
); | ||
|
||
if(isset($requestData['draw'])) $json_data['draw'] = $requestData['draw']; | ||
|
||
echo json_encode($json_data); // send data as json format | ||
|
||
?> |
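Review note on the cell rendering in the file above: database values are passed to `sprintf` and into the JSON response without HTML escaping, so a caption or contact name containing markup would reach the browser unchanged. A minimal sketch of escaped rendering follows. `adar_render_cell` is a hypothetical helper that is not part of this commit, and the sample row is invented for illustration; the template and the column-map layout are the ones from `$columns`.

```php
<?php
// Hypothetical helper, not part of api.php: renders one table cell with HTML escaping.
function adar_render_cell($format, array $row, $column)
{
    // Plain column: return the value, escaped for HTML.
    if (!$format) {
        return htmlspecialchars((string)$row[$column], ENT_QUOTES, 'UTF-8');
    }

    // Formatted column: $format is array(template, array(source columns)).
    $args = array();
    foreach ($format[1] as $source) {
        $args[] = htmlspecialchars((string)$row[$source], ENT_QUOTES, 'UTF-8');
    }
    return vsprintf($format[0], $args);
}

// Invented sample row for illustration only.
$row = array('ItemID' => '7', 'Caption' => 'Rechnung <Mai>');
$link = array('<a href="?m=content_detail&id=%s">%s</a>', array('ItemID', 'ItemID'));

echo adar_render_cell($link, $row, 'ItemID'), "\n";
echo adar_render_cell(false, $row, 'Caption'), "\n";
```

The templates themselves stay trusted because they come from the server-side column map; only the values taken from the database row are escaped.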
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
{ | ||
"name": "adlerweb/adar", | ||
"type": "project", | ||
"description": "AdAr - Another dumb Archive - Document archiving solution", | ||
"keywords": ["archive", "DMS", "PDF", "OCR"], | ||
"homepage": "https://github.com/adlerweb/adar", | ||
"license": "GPL-3.0", | ||
"authors": [ | ||
{ | ||
"name": "Florian Knodt", | ||
"email": "[email protected]" | ||
} | ||
], | ||
"support": { | ||
"issues": "https://github.com/adlerweb/adar/issues" | ||
}, | ||
"require": { | ||
"php": ">=5.6", | ||
"datatables/datatables": "1.10.*", | ||
"components/jquery": "3.2.*", | ||
"components/jqueryui": "1.12.*", | ||
"smarty/smarty": "3.1.*", | ||
"adlerweb/calender-date-input": "dev-master", | ||
"pixabay/jquery-tageditor": "dev-master", | ||
"adlerweb/awtools": "0.2.*", | ||
"koala-framework/library-silkicons": "1.3" | ||
}, | ||
"repositories": [ | ||
{ | ||
"type": "vcs", | ||
"url": "https://github.com/adlerweb/calendarDateInput" | ||
},{ | ||
"type": "vcs", | ||
"url": "https://github.com/adlerweb/jQuery-tagEditor" | ||
},{ | ||
"type": "vcs", | ||
"url": "https://github.com/adlerweb/awtools" | ||
} | ||
] | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
<?PHP | ||
|
||
/** | ||
* AdAr - Another dumb Archive | ||
* | ||
* Main Configuration | ||
* | ||
* @package adar | ||
* @author Florian Knodt <[email protected]> | ||
*/ | ||
|
||
error_reporting(E_ALL); | ||
|
||
define("AW_SQL_SERV", "localhost"); | ||
define("AW_SQL_USER", "adar"); | ||
define("AW_SQL_PASS", "testinstallation"); | ||
define("AW_SQL_DATB", "adar"); | ||
define("AW_SQL_DEBUG", true); | ||
define("AW_SQL_DEBUG_SHOW", false); | ||
|
||
define("SMARTY_CACHE", false); | ||
|
||
define("ADAR_PROGNAME", 'AdAr - Another dumb Archive'); | ||
|
||
define("ADAR_INFOMAIL_TO", ''); | ||
define("ADAR_INFOMAIL_FROM", 'ADAR <adar@localhost>'); | ||
?> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
<?PHP | ||
|
||
/** | ||
* AdAr - Another dumb Archive | ||
* | ||
* System for archiving photos and documents | ||
* | ||
* @package adar | ||
* @author Florian Knodt <[email protected]> | ||
*/ | ||
|
||
if(!file_exists('config.php') ||!is_readable('config.php')) { | ||
die('Missing configuration'); | ||
} | ||
require_once('config.php'); //Config | ||
require_once('lib/mysql.wrapper.php'); //ATools->MySQL | ||
require_once('lib/ocr.php'); | ||
|
||
//Step 1: Temp Cleanup | ||
echo "Temp Cleanup\n"; | ||
$dir = opendir('data/tmp/'); | ||
while (($file = readdir($dir)) !== false) { | ||
if(filetype('data/tmp/' . $file) == 'file' && filectime('data/tmp/' . $file) <= time()-(12*60*60)) { | ||
echo " Delete: data/tmp/".$file."\n"; | ||
unlink('data/tmp/' . $file); | ||
} | ||
} | ||
closedir($dir); | ||
echo "DONE!\n"; | ||
|
||
//Step 2: OCR | ||
echo "OCR...\n"; | ||
$list = $GLOBALS['adlerweb']['sql']->query('SELECT ItemID,Description FROM `Items` WHERE OCRStatus = 1'); | ||
if($list && $list->num_rows > 0) { | ||
while($item = $list->fetch_object()) { | ||
echo " OCR for ".$item->ItemID."\n"; | ||
$ocr=''; | ||
if(file_exists('data/org/'.$item->ItemID.'.png')) { | ||
echo " PNG OCR\n"; | ||
$ocr = ocr('data/org/'.$item->ItemID.'.png'); | ||
}elseif(file_exists('data/org/'.$item->ItemID.'.jpg')) { | ||
echo " JPG OCR\n"; | ||
$ocr = ocr('data/org/'.$item->ItemID.'.jpg'); | ||
}elseif(file_exists('data/org/'.$item->ItemID.'.pdf')) { | ||
echo " PDF TXT..."; | ||
exec('pdftotext -layout '.escapeshellarg('data/org/'.$item->ItemID.'.pdf').' '.escapeshellarg('data/tmp/'.$item->ItemID.'.txt')); | ||
if(!file_exists('data/tmp/'.$item->ItemID.'.txt') || !($text = file_get_contents('data/tmp/'.$item->ItemID.'.txt')) || strlen(trim($text)) < 100) { | ||
echo "FAILED\n PDF OCR\n"; | ||
//Fallback to optical method | ||
exec('convert -density 400 '.escapeshellarg('data/org/'.$item->ItemID.'.pdf').' '.escapeshellarg('data/tmp/'.$item->ItemID.'.png')); | ||
$page=0; | ||
//Skip pages that were not created, e.g. if convert failed | ||
while(file_exists('data/tmp/'.$item->ItemID.'-'.$page.'.png')) { | ||
$ocr .= ocr('data/tmp/'.$item->ItemID.'-'.$page.'.png'); | ||
unlink('data/tmp/'.$item->ItemID.'-'.$page.'.png'); | ||
$page++; | ||
} | ||
}else{ | ||
echo "OK\n"; | ||
$ocr = $text; | ||
} | ||
if(file_exists('data/tmp/'.$item->ItemID.'.txt')) unlink('data/tmp/'.$item->ItemID.'.txt'); | ||
}else{ | ||
echo "No original?!\n"; | ||
} | ||
|
||
if($ocr != '') { | ||
$desc = ''; | ||
if($item->Description != '') { | ||
$desc = $item->Description."\n\n---\n\n"; | ||
} | ||
$desc .= $ocr; | ||
$GLOBALS['adlerweb']['sql']->querystmt("UPDATE `Items` SET `Description` = ?, `OCRStatus` = 2 WHERE ItemID = ?;", 'ss', array($desc, $item->ItemID)); | ||
echo " Added ".strlen($ocr)." chars\n"; | ||
} | ||
} | ||
} | ||
echo "DONE!\n"; | ||
|
||
?> |
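The temp cleanup in step 1 of the file above deletes every regular file in `data/tmp/` whose change time is at least 12 hours old. The selection logic can be sketched as a separate function, which makes a dry run possible before anything is deleted. `adar_stale_files` is a hypothetical helper that is not part of this commit; the directory and the 12-hour threshold are the values used in cron.php.

```php
<?php
// Hypothetical helper, not part of cron.php: lists regular files older than $maxAge seconds.
function adar_stale_files($dir, $maxAge, $now)
{
    $stale = array();
    if (!is_dir($dir)) {
        return $stale;
    }

    foreach (scandir($dir) as $name) {
        $path = rtrim($dir, '/') . '/' . $name;
        // Skip ".", ".." and sub-directories; only regular files are candidates.
        if (filetype($path) !== 'file') {
            continue;
        }
        // filectime() is the inode change time, the same timestamp cron.php compares.
        if (filectime($path) <= $now - $maxAge) {
            $stale[] = $path;
        }
    }
    return $stale;
}

// Dry run: print what a 12-hour cleanup would remove, without deleting anything.
foreach (adar_stale_files('data/tmp', 12 * 60 * 60, time()) as $path) {
    echo $path, "\n";
}
```

Passing `$now` as a parameter instead of calling `time()` inside the function keeps the age check easy to test.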
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# Ignore everything in this directory | ||
* | ||
# Except this file | ||
!.gitignore |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# Ignore everything in this directory | ||
* | ||
# Except this file | ||
!.gitignore |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# Ignore everything in this directory | ||
* | ||
# Except this file | ||
!.gitignore |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
- search tags | ||
- user management | ||
- installer | ||
- combine API/Liveforms | ||
- contact management | ||
- add insert API | ||
- search: date ranges/datepicker | ||
- form recognition | ||
- more file types (LibreOffice API?) | ||
- GPG instead of SHA256 | ||
- gettext / translations | ||
|