Make iThenticate reports work with new TCA integration #172

Merged: 1 commit, Aug 10, 2023
30 changes: 30 additions & 0 deletions .copypatrol.ini.dist
@@ -0,0 +1,30 @@
; Used by CopyPatrolBot, which populates the database that gets fed into the frontend.
; For more information, see https://github.com/JJMC89/copypatrol-backend#configuration
[copypatrol]
version = 2023.08.10
url-ignore-list-title = example-url-title
user-ignore-list-title = example-user-title

[copypatrol:en.wikipedia.org]
enabled = true
namespaces = 0,2,118
pagetriage-namespaces = 0,118

[copypatrol:es.wikipedia.org]
enabled = true
namespaces = 0,2

[copypatrol:fr.wikipedia.org]
enabled = false

[client]
drivername = mysql+pymysql
username = example-db-user
password = example-db-password
database = example-db-name
host = localhost
port = 3306

[tca]
domain = example-tca-domain.com
key = example-tca-key
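
As a rough illustration of how a PHP consumer might read a file like this: `parse_ini_string()` and `parse_ini_file()` ignore section headers by default and merge every key into one flat array, so unique keys such as `domain`, `key` and `version` are reachable at the top level, while repeated keys such as `enabled` overwrite one another. Passing `true` as the second argument groups keys by section instead. The INI text below is a trimmed, hypothetical copy of the file above, not real configuration.

```php
<?php
// Hypothetical, trimmed-down copy of .copypatrol.ini, for illustration only.
$ini = <<<'INI'
[copypatrol]
version = 2023.08.10

[copypatrol:en.wikipedia.org]
enabled = true
namespaces = 0,2,118

[tca]
domain = example-tca-domain.com
key = example-tca-key
INI;

// Default mode: section headers are ignored and all keys land in one flat array.
$flat = parse_ini_string( $ini );

// With $process_sections = true, each section becomes its own sub-array.
$sections = parse_ini_string( $ini, true );

echo $flat['domain'], "\n";
echo $sections['tca']['domain'], "\n";
echo $sections['copypatrol:en.wikipedia.org']['namespaces'], "\n";
```

The flat form is enough when only the `[tca]` keys and `version` are needed; the sectioned form is the safer choice if per-wiki settings such as `enabled` have to be read.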
1 change: 1 addition & 0 deletions .gitignore
@@ -1,4 +1,5 @@
.idea/
.copypatrol.ini

###> symfony/framework-bundle ###
/.env.local
18 changes: 12 additions & 6 deletions README.md
@@ -6,7 +6,8 @@ A tool that allows you to see recent Wikipedia edits that are flagged as possibl

* User documentation: https://meta.wikimedia.org/wiki/Special:MyLanguage/CopyPatrol
* Issue tracker: https://phabricator.wikimedia.org/tag/copypatrol/
* Source code: https://gitlab.wikimedia.org/repos/commtech/copypatrol
* Frontend source code (this repo): https://github.com/wikimedia/CopyPatrol
* Bot source code: https://github.com/JJMC89/copypatrol-backend

## Installing manually

@@ -18,7 +19,11 @@ A tool that allows you to see recent Wikipedia edits that are flagged as possibl
* [Toolforge access](https://wikitech.wikimedia.org/wiki/Help:Toolforge/Quickstart)

This application makes use of the [Symfony framework](https://symfony.com/) and
the [ToolforgeBundle](https://github.com/wikimedia/ToolforgeBundle).
the [ToolforgeBundle](https://github.com/wikimedia/ToolforgeBundle). A [bot](https://meta.wikimedia.org/wiki/User:CopyPatrolBot)
continually checks recent changes against the Turnitin Core API (TCA) and records possible copyright
violations in a user database, which CopyPatrol then reads from. Unless you need to work
on the [bot code](https://github.com/JJMC89/copypatrol-backend), there's no reason to bother with bot
integration; instead, connect to the existing user database on Toolforge (more on this below).

### Instructions

@@ -123,10 +128,11 @@ docker run -ti -p 80:80 -v $(pwd)/.env.local:/app/.env.local wikimedia/copypatro
Any languages that are not regularly being used should be removed.
3. Make sure the corresponding `-wikipedia` message in [i18n/en.json](i18n/en.json) (and [qqq.json](i18n/qqq.json))
exists and is translated in the desired language.
4. Add the language code to `APP_ENABLED_LANGS` in the [.env](.env) files.
5. _We now use https://github.com/JJMC89/copypatrol-backend as the bot backend; technical instructions TBD_
4. Update the `.copypatrol.ini` file accordingly (which is used by the bot),
and add the new languages to the `APP_ENABLED_LANGS` variable in [.env](.env).

## Removing a language

1. Remove the language from `APP_ENABLED_LANGS`.
2. _TBD_: Remove the job from CopyPatrolBot.
1. In `.copypatrol.ini`, set the `enabled` key for the desired language to `false`,
or remove the definition entirely.
2. Remove the language from the `APP_ENABLED_LANGS` variable in [.env](.env).
1 change: 1 addition & 0 deletions composer.json
@@ -7,6 +7,7 @@
"php": ">=7.4",
"ext-ctype": "*",
"ext-iconv": "*",
"ext-json": "*",
"doctrine/annotations": "^2.0",
"doctrine/doctrine-bundle": "^2.2",
"phpxmlrpc/phpxmlrpc": "^4.10",
17 changes: 9 additions & 8 deletions composer.lock

Some generated files are not rendered by default.

85 changes: 79 additions & 6 deletions src/Controller/AppController.php
@@ -8,6 +8,7 @@
use DateTime;
use Doctrine\DBAL\Exception\DriverException;
use Exception;
use Krinkle\Intuition\Intuition;
use PhpXmlRpc\Client;
use PhpXmlRpc\Value;
use stdClass;
@@ -18,8 +19,10 @@
use Symfony\Component\HttpFoundation\RequestStack;
use Symfony\Component\HttpFoundation\Response;
use Symfony\Component\HttpFoundation\Session\SessionInterface;
use Symfony\Component\HttpKernel\Exception\HttpException;
// phpcs:ignore MediaWiki.Classes.UnusedUseStatement.UnusedUse
use Symfony\Component\Routing\Annotation\Route;
use Symfony\Contracts\HttpClient\HttpClientInterface;

class AppController extends AbstractController {

@@ -278,26 +281,39 @@ public function undoReviewAction(
}

/**
* @Route("/ithenticate/{id}", name="ithenticate", requirements={"id"="\d+|[a-z\d+\-]"})
* @Route("/ithenticate/{id}", name="ithenticate", requirements={"id"="\d+|[a-z\d+\-]+"})
* @param HttpClientInterface $httpClient
* @param RequestStack $requestStack
* @param CopyPatrolRepository $copyPatrolRepo
* @param Intuition $intuition
* @param string $iThenticateUser
* @param string $iThenticatePassword
* @param int $id
* @param string $id
* @return RedirectResponse
* @throws Exception
*/
public function iThenticateAction(
HttpClientInterface $httpClient,
RequestStack $requestStack,
CopyPatrolRepository $copyPatrolRepo,
Intuition $intuition,
string $iThenticateUser,
string $iThenticatePassword,
int $id
string $id
): RedirectResponse {
$record = $copyPatrolRepo->getRecordById( $id );
$v2DateTime = new DateTime( self::ITHENTICATE_V2_TIMESTAMP );
if ( new DateTime( $record['rev_timestamp'] ) > $v2DateTime ) {
// New system
// TODO: make this work
return new RedirectResponse( '' );
// New system.
// Use ?? so that a missing session user yields null rather than a property read on null.
$loggedInUser = $requestStack->getSession()->get( 'logged_in_user' )->username ?? null;
if ( $loggedInUser === null ) {
return $this->redirectToRoute( 'toolforge_login', [
'callback' => $this->generateUrl( 'toolforge_oauth_callback', [
'redirect' => $requestStack->getCurrentRequest()->getUri(),
] )
] );
}
return $this->redirectToTcaViewer( $httpClient, $loggedInUser, $intuition->getLang(), $id );
}

$client = new Client( 'https://api.ithenticate.com/rpc' );
@@ -315,6 +331,63 @@ public function iThenticateAction(
return $this->redirect( $response['view_only_url']->scalarval() );
}
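
The change to the route requirement (adding a trailing `+` to the character class) can be checked in isolation. The sketch below assumes that Symfony anchors a requirement to the whole parameter value, which is emulated here with `^(?:...)$`, and that new-style submission IDs are UUID-like strings; both sample IDs are made up for illustration. Note that inside the character class, `\d+` means "a digit or a literal plus sign", not "one or more digits".

```php
<?php
// Assumption: the router matches the requirement against the full parameter,
// which we emulate by wrapping each pattern in ^(?:...)$.
$old = '/^(?:\d+|[a-z\d+\-])$/';
$new = '/^(?:\d+|[a-z\d+\-]+)$/';

// Hypothetical IDs: a numeric legacy ID and a UUID-style ID.
$legacyId = '12345678';
$uuidId = '0f1e2d3c-4b5a-6978-8796-a5b4c3d2e1f0';

// The old pattern only accepts digits or exactly one character from the class.
$oldMatchesLegacy = preg_match( $old, $legacyId ) === 1;
$oldMatchesUuid = preg_match( $old, $uuidId ) === 1;

// The new pattern accepts one or more characters from the class.
$newMatchesLegacy = preg_match( $new, $legacyId ) === 1;
$newMatchesUuid = preg_match( $new, $uuidId ) === 1;

var_dump( $oldMatchesLegacy, $oldMatchesUuid, $newMatchesLegacy, $newMatchesUuid );
```

Under these assumptions, only the new pattern lets a multi-character, non-numeric ID reach the controller, which is consistent with `$id` changing from `int` to `string`.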

/**
* @param HttpClientInterface $client
* @param string $loggedInUser
* @param string $locale
* @param string $id
* @param int $retries
* @return RedirectResponse
*/
private function redirectToTcaViewer(

Review comment: TCA can be down (e.g., maintenance). Does this need any special handling for that?

Reply from MusikAnimal (Member, Author), Aug 9, 2023: I've added automatic retries up to 5 times. I'm only checking if the response from Turnitin is not 200, so it's possible we'll needlessly retry when it's something on our end (i.e. bad credentials), but regardless we'll get an email (once I set that up) so we'll be able to diagnose and fix accordingly.

HttpClientInterface $client,
string $loggedInUser,
string $locale,
string $id,
int $retries = 0
): RedirectResponse {
// Load config settings. This lives in .copypatrol.ini and is shared with the bot.
$projectDir = $this->getParameter( 'kernel.project_dir' );
$config = parse_ini_file( "$projectDir/.copypatrol.ini" );
// Request the viewer URL from Turnitin.
$response = $client->request(
'POST',
"https://{$config['domain']}/api/v1/submissions/$id/viewer-url",
[
'json' => [

Review comment: Consider whether any viewer_permissions should be set. Some could be useful to reviewers.

Reply from the author (Member): I've added may_view_submission_full_source, may_view_match_submission_info, may_view_document_details_panel and may_view_sections_exclusion_panel. The others didn't seem to do anything or didn't work.

'viewer_user_id' => $loggedInUser,
'locale' => $locale,
'viewer_permissions' => [
'may_view_submission_full_source' => true,
'may_view_match_submission_info' => true,
'may_view_document_details_panel' => true,
'may_view_sections_exclusion_panel' => true,
],
],
'headers' => [
'Authorization' => "Bearer {$config['key']}",
'From' => '[email protected]',
'User-Agent' => "copypatrol/{$config['version']}",
'X-Turnitin-Integration-Name' => 'CopyPatrol',
'X-Turnitin-Integration-Version' => $config['version'],
],
]
);

if ( $response->getStatusCode() !== Response::HTTP_OK ) {
if ( $retries >= 5 ) {
throw new HttpException(
Response::HTTP_BAD_GATEWAY,
'Failed to fetch URL from the Turnitin Core API'
);
}
sleep( $retries + 1 );
return $this->redirectToTcaViewer( $client, $loggedInUser, $locale, $id, $retries + 1 );
}

return $this->redirect( json_decode( $response->getContent() )->viewer_url );
}
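
The retry behaviour discussed in the review thread (retry on any non-200 response, wait a little longer each time, give up after a fixed number of retries) could also be written as a loop rather than recursion. This is a minimal sketch under stated assumptions, not the code in this PR: `fetchWithRetries`, its `$attempt` callback and the injectable `$sleep` are hypothetical names introduced here so the example runs without Symfony or a network call.

```php
<?php
/**
 * Hypothetical helper: call $attempt until it reports HTTP 200 or the retry budget is spent.
 * $attempt must return [ int $status, ?string $body ].
 * $sleep is injectable so the sketch can run without actually waiting.
 */
function fetchWithRetries( callable $attempt, int $maxRetries = 5, ?callable $sleep = null ): string {
	$sleep = $sleep ?? 'sleep';
	for ( $retries = 0; ; $retries++ ) {
		[ $status, $body ] = $attempt();
		if ( $status === 200 ) {
			return $body;
		}
		if ( $retries >= $maxRetries ) {
			throw new RuntimeException( 'Failed to fetch URL from the Turnitin Core API' );
		}
		// Linear backoff: wait 1s, 2s, 3s, ... before the next attempt.
		$sleep( $retries + 1 );
	}
}

// Usage with a fake endpoint that fails twice and then succeeds.
$calls = 0;
$waits = [];
$body = fetchWithRetries(
	function () use ( &$calls ) {
		$calls++;
		return $calls < 3
			? [ 503, null ]
			: [ 200, '{"viewer_url":"https://example.org/viewer"}' ];
	},
	5,
	function ( int $seconds ) use ( &$waits ) {
		$waits[] = $seconds;
	}
);
echo json_decode( $body )->viewer_url, "\n";
```

Keeping the retry count in a loop variable makes the upper bound explicit, and injecting the sleep function makes the backoff schedule testable without delays.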

/**
* ToolforgeBundle's logout apparently doesn't work :(
*
4 changes: 2 additions & 2 deletions src/Repository/CopyPatrolRepository.php
@@ -197,10 +197,10 @@ public function updateCopyvioAssessment(
/**
* Get a particular record by submission ID.
*
* @param int $submissionId ID of record.
* @param string $submissionId ID of record.
* @return array Query result.
*/
public function getRecordById( int $submissionId ) {
public function getRecordById( string $submissionId ) {
$sql = "SELECT * FROM diffs WHERE submission_id = :id";
return $this->client->fetchAssociative( $sql, [
'id' => $submissionId