Feature/jplag implementation #447

Open
wants to merge 15 commits into
base: development
Changes from 9 commits
4 changes: 4 additions & 0 deletions Gemfile.lock
@@ -168,6 +168,7 @@ GEM
faraday-net_http (3.1.0)
net-http
ffi (1.17.0-aarch64-linux-gnu)
ffi (1.17.0-x86_64-linux-gnu)
fugit (1.11.0)
et-orbi (~> 1, >= 1.2.11)
raabro (~> 1.4)
@@ -265,6 +266,8 @@ GEM
nio4r (2.7.3)
nokogiri (1.16.5-aarch64-linux)
racc (~> 1.4)
nokogiri (1.16.5-x86_64-linux)
racc (~> 1.4)
observer (0.1.2)
orm_adapter (0.5.0)
parallel (1.24.0)
@@ -490,6 +493,7 @@ GEM

PLATFORMS
aarch64-linux
x86_64-linux

DEPENDENCIES
better_errors
29 changes: 29 additions & 0 deletions app/api/task_definitions_api.rb
@@ -614,4 +614,33 @@ class TaskDefinitionsApi < Grape::API

stream_file path
end

desc 'Download the JPLAG report for a given task'
params do
requires :unit_id, type: Integer, desc: 'The unit to download JPLAG report for'
requires :task_def_id, type: Integer, desc: 'The task definition to get the JPLAG report of'
end
get '/units/:unit_id/task_definitions/:task_def_id/jplag_report' do
unit = Unit.find(params[:unit_id])
task_def = unit.task_definitions.find(params[:task_def_id])

unless authorise? current_user, unit, :download_jplag_report
error!({ error: 'Not authorised to download JPLAG reports of unit' }, 403)
end

file_loc = FileHelper.task_jplag_report_path(unit, task_def)
logger.debug "JPLAG report file location: #{file_loc}"

if file_loc.nil? || !File.exist?(file_loc)
file_loc = Rails.root.join('public', 'resources', 'FileNotFound.pdf')
header['Content-Disposition'] = 'attachment; filename=FileNotFound.pdf'
else
header['Content-Disposition'] = "attachment; filename=#{task_def.abbreviation}-jplag-report.zip"
end
header['Access-Control-Expose-Headers'] = 'Content-Disposition'

content_type 'application/octet-stream'

stream_file file_loc
end
end
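
A minimal sketch of the fallback branch in this endpoint, pulled out as a plain function so it can be exercised without Grape. The name `jplag_download_target`, the `fallback:` keyword, and the returned hash are hypothetical and not part of the PR; only the two filenames come from the diff.

```ruby
# Decide which file to stream and which download filename to advertise.
# Hypothetical helper: mirrors the if/else in the endpoint above.
def jplag_download_target(report_path, abbreviation,
                          fallback: 'public/resources/FileNotFound.pdf')
  if report_path.nil? || !File.exist?(report_path)
    # No report yet: serve the placeholder PDF instead.
    { path: fallback, filename: 'FileNotFound.pdf' }
  else
    { path: report_path, filename: "#{abbreviation}-jplag-report.zip" }
  end
end

target = jplag_download_target(nil, 'T1')
puts "attachment; filename=#{target[:filename]}"
```

Keeping the decision in one place would also make it easier to change the missing-report behaviour later, if the reviewers prefer something other than the placeholder PDF.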
12 changes: 12 additions & 0 deletions app/helpers/file_helper.rb
@@ -249,6 +249,16 @@ def student_portfolio_path(unit, username, create = true)
File.join(student_portfolio_dir(unit, username, create), FileHelper.sanitized_filename("#{username}-portfolio.pdf"))
end

def task_jplag_report_dir(unit)
file_server = Doubtfire::Application.config.jplag_report_dir
dst = "#{file_server}/#{unit.code}-#{unit.id}/" # trust the server config and passed in type for paths
dst
end

def task_jplag_report_path(unit, task)
File.join(task_jplag_report_dir(unit), FileHelper.sanitized_filename("#{task.id}-result.zip"))
end

def comment_attachment_path(task_comment, attachment_extension)
"#{File.join(student_work_dir(:comment, task_comment.task), "#{task_comment.id.to_s}#{attachment_extension}")}"
end
@@ -658,4 +668,6 @@ def line_wrap(path, width: 160)
module_function :known_extension?
module_function :pages_in_pdf
module_function :line_wrap
module_function :task_jplag_report_dir
module_function :task_jplag_report_path
end
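
To make the resulting report location concrete, here is a small standalone sketch of the two helpers. The `Unit` and `TaskDefinition` structs, the `REPORT_ROOT` value, and the sample unit code are stand-ins invented for illustration; the real helpers read `Doubtfire::Application.config.jplag_report_dir` and pass the filename through `FileHelper.sanitized_filename`, which is omitted here.

```ruby
# Hypothetical stand-ins for the Rails models.
Unit = Struct.new(:code, :id)
TaskDefinition = Struct.new(:id)

# Assumed value for config.jplag_report_dir; the real one comes from the app config.
REPORT_ROOT = '/srv/jplag_results'

# Directory holding all JPLAG reports for one unit: <root>/<code>-<id>
def task_jplag_report_dir(unit, root: REPORT_ROOT)
  File.join(root, "#{unit.code}-#{unit.id}")
end

# Zip file for one task definition inside the unit directory.
def task_jplag_report_path(unit, task, root: REPORT_ROOT)
  File.join(task_jplag_report_dir(unit, root: root), "#{task.id}-result.zip")
end

puts task_jplag_report_path(Unit.new('COS10001', 7), TaskDefinition.new(42))
```

Using `File.join` for the directory as well as the file avoids depending on a hand-written trailing slash, since `File.join` collapses duplicate separators at the join points.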
71 changes: 70 additions & 1 deletion app/models/similarity/unit_similarity_module.rb
@@ -14,17 +14,26 @@ def last_plagarism_scan
end

# Pass tasks on to plagiarism detection software and set up links between students
- def check_moss_similarity(force: false)
+ def check_similarity(force: false)
# Get each task...
return unless active

# need pwd to restore after cding into submission folder (so the files do not have full path)
pwd = FileUtils.pwd

# making temp directory for unit - jplag
root_work_dir = Rails.root.join("tmp", "jplag", "#{code}-#{id}")
unit_code = "#{code}-#{id}"
FileUtils.mkdir_p(root_work_dir)

begin
logger.info "Checking plagiarism for unit #{code} - #{name} (id=#{id})"

task_definitions.each do |td|
# making temp directory for each task - jplag
tasks_dir = root_work_dir.join(td.id.to_s)
FileUtils.mkdir_p(tasks_dir)

next if td.moss_language.nil? || td.upload_requirements.nil? || td.upload_requirements.select { |upreq| upreq['type'] == 'code' && upreq['tii_check'] }.empty?
Member:

Should this still be moss_language?

Author:

I'm using the upload requirement language from the task definition UI (task-definition-upload.component.html).
It makes sense to reuse the existing language definition, since the task creator has already decided whether plagiarism checks are necessary for MOSS.
So if there is no requirement for MOSS, there is no requirement for JPLAG.
Let me know if you don't think this is appropriate.

Member:

With this change we should remove the old MOSS code, so these could be renamed.


type_data = td.moss_language.split
@@ -35,6 +44,9 @@ def check_moss_similarity(force: false)
tasks = tasks_for_definition(td)
tasks_with_files = tasks.select(&:has_pdf)

# JPLAG
run_jplag_on_done_files(td, tasks_dir, tasks_with_files, unit_code)

JackSCarroll marked this conversation as resolved.
# Skip if not due yet
next if td.due_date > Time.zone.now

@@ -48,6 +60,7 @@ def check_moss_similarity(force: false)

# There are new tasks, check these


logger.debug 'Contacting MOSS for new checks'

# Create the MossRuby object
@@ -222,4 +235,60 @@ def add_done_files_for_plagiarism_check_of(task_definition, tmp_path, to_check,

self
end

# JPLAG Function - extracts "done" files for each task and packages them into a directory for JPLAG to run on
def run_jplag_on_done_files(task_definition, tasks_dir, tasks_with_files, unit_code)
type_data = task_definition.moss_language.split
return if type_data.nil? || (type_data.length != 2) || (type_data[0] != 'moss')
similarity_pct = task_definition.plagiarism_warn_pct
return if similarity_pct.nil?

# Check if the directory exists and create it if it doesn't
results_dir = "/jplag/results/#{unit_code}"
`sudo docker exec jplag sh -c 'if [ ! -d "#{results_dir}" ]; then mkdir -p "#{results_dir}"; fi'`

# Remove existing result file if it exists
result_file = "#{results_dir}/#{task_definition.id}-result.zip"
`sudo docker exec jplag sh -c 'if [ -f "#{result_file}" ]; then rm "#{result_file}"; fi'`

# get each code file for each task
task_definition.upload_requirements.each_with_index do |upreq, idx|
# only check code files marked for similarity checks
next unless upreq['type'] == 'code' && upreq['tii_check']

pattern = task_definition.glob_for_upload_requirement(idx)

tasks_with_files.each do |t|
t.extract_file_from_done(tasks_dir, pattern, ->(_task, to_path, name) { File.join(to_path.to_s, t.student.username.to_s, name.to_s) })
end

logger.info "Starting JPLAG container to run on #{tasks_dir}"
root_dir = Rails.root.to_s
tasks_dir_split = tasks_dir.to_s.split(root_dir)[1]

# Set the file language based on the type
Member:

We can remove MOSS with this change, so the languages can be updated to those we want from JPlag. This would be part of the front-end change as well.
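
One possible shape for that language handling, sketched as a lookup table rather than a `case`. The constant `JPLAG_LANGUAGE_FOR` and the method `jplag_language` are hypothetical names; the two mappings (`cc` to `cpp`, `python` to `python3`) are the ones in this diff, and the `"moss <lang>"` input format is taken from the existing `moss_language` checks.

```ruby
# Hypothetical lookup from the MOSS language name to the value passed to JPLAG's -l flag.
# Languages not listed are passed through unchanged, as in the diff.
JPLAG_LANGUAGE_FOR = {
  'cc' => 'cpp',
  'python' => 'python3'
}.freeze

# Returns the JPLAG language for a stored setting such as "moss cc",
# or nil when the setting is missing or not in the expected two-word form.
def jplag_language(moss_language)
  return nil if moss_language.nil?

  parts = moss_language.split
  return nil unless parts.length == 2 && parts[0] == 'moss'

  JPLAG_LANGUAGE_FOR.fetch(parts[1], parts[1])
end

puts jplag_language('moss cc')
```

If the front end later stores JPlag language names directly, the table would shrink to nothing and the helper could simply validate the value.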

# Currently only supporting C/C++/C#/Python
# MOSS and JPLAG use different names for some languages, so they need to be converted
# If new MOSS language options are added to task-definition-upload, this will need to be updated
file_lang = case type_data[1]
when 'cc'
'cpp'
when 'python'
'python3'
else
type_data[1]
end

# Run JPLAG on the extracted files
`sudo docker exec jplag java -jar /jplag/myJplag.jar #{tasks_dir_split} -l #{file_lang} --similarity-threshold=#{similarity_pct} -M RUN -r #{results_dir}/#{task_definition.id}-result`
end

# Delete the extracted code files from tmp
tmp_dir = Rails.root.join("tmp", "jplag")
logger.info "Deleting files in: #{tmp_dir}"
logger.info "Files to delete: #{Dir.glob("#{tmp_dir}/*")}"
FileUtils.rm_rf(Dir.glob("#{tmp_dir}/*"))

self
Member:

Where are the similarity matches created?

Member:

We need to parse the report and link related tasks.

Author:

I've added a function to create the similarity matches. Here's the commit.
It opens the overview.json file from the report and extracts the maximum similarity between two submissions.
It also finds the task ID by looking in /files/student_name/task_id and taking the task ID folder name; the student names are matched against the two submissions currently being read from overview.json.
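
A rough sketch of the extraction step described here, for readers who have not seen the commit. The JSON shape (`top_comparisons`, `first_submission`, `second_submission`, `similarities.MAX` as a 0 to 1 fraction) is an assumption made for this example and may not match the real overview.json; `pairs_over_threshold` is a hypothetical name, not the function from the commit.

```ruby
require 'json'

# Assumed report shape only; the real overview.json may differ.
SAMPLE_OVERVIEW = <<~JSON
  {
    "top_comparisons": [
      { "first_submission": "alice", "second_submission": "bob",
        "similarities": { "AVG": 0.61, "MAX": 0.82 } },
      { "first_submission": "alice", "second_submission": "carol",
        "similarities": { "AVG": 0.20, "MAX": 0.35 } }
    ]
  }
JSON

# Return [first, second, pct] for every pair whose MAX similarity,
# as a rounded percentage, is at or above warn_pct.
def pairs_over_threshold(overview_json, warn_pct)
  data = JSON.parse(overview_json)
  data.fetch('top_comparisons', []).filter_map do |cmp|
    pct = (cmp.dig('similarities', 'MAX').to_f * 100).round
    next if pct < warn_pct

    [cmp['first_submission'], cmp['second_submission'], pct]
  end
end

pairs_over_threshold(SAMPLE_OVERVIEW, 50).each do |first, second, pct|
  puts "#{first} <-> #{second}: #{pct}%"
end
```

Each returned pair would then need to be resolved to the two students' tasks before a similarity record is created, which is the linking step the reviewers ask about.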

end
end
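
The JPLAG invocation in this file is built by interpolating values into backtick strings. A hedged alternative is to assemble the command as an argument list, which `Kernel#system` runs without a shell when given more than one argument. The helper name `jplag_argv` and its keyword parameters are hypothetical; the container name, jar path, and flags are copied from the diff and not verified against JPlag's CLI.

```ruby
# Build the `docker exec` invocation as an argument list.
# Hypothetical helper: nothing is executed here.
def jplag_argv(submissions_dir:, language:, threshold:, result_path:)
  [
    'sudo', 'docker', 'exec', 'jplag',
    'java', '-jar', '/jplag/myJplag.jar',
    submissions_dir.to_s,
    '-l', language.to_s,
    "--similarity-threshold=#{threshold}",
    '-M', 'RUN',
    '-r', result_path.to_s
  ]
end

argv = jplag_argv(
  submissions_dir: '/tmp/jplag/COS10001-7/42', # sample values for illustration
  language: 'cpp',
  threshold: 50,
  result_path: '/jplag/results/COS10001-7/42-result'
)
p argv
# system(*argv) would run the command without shell interpretation of the values.
```

With this form, a unit code or path containing spaces or quotes is passed through as a single argument instead of being re-parsed by `sh`.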
4 changes: 4 additions & 0 deletions app/models/unit.rb
@@ -28,6 +28,7 @@ def self.permissions
:download_stats,
:download_unit_csv,
:download_grades,
:download_jplag_report,
:exceed_capacity
]

@@ -46,6 +47,7 @@ def self.permissions
:change_project_enrolment,
:download_stats,
:download_grades,
:download_jplag_report,
:rollover_unit,
:exceed_capacity,
:perform_overseer_assessment_test
@@ -66,6 +68,7 @@ def self.permissions
:download_stats,
:download_unit_csv,
:download_grades,
:download_jplag_report,
:exceed_capacity
]

@@ -177,6 +180,7 @@ def role_for(user)
scope :set_inactive, -> { where('active = ?', false) }

include UnitTiiModule

include UnitSimilarityModule

def detailed_name
6 changes: 6 additions & 0 deletions config/application.rb
@@ -31,6 +31,12 @@ class Application < Rails::Application
# variable.
config.student_work_dir = ENV['DF_STUDENT_WORK_DIR'] || "#{Rails.root}/student_work"

# ==> JPLAG report directory
# File server location for storing JPLAG reports. Defaults to `jplag_results`
# directory under root but is overridden using DF_JPLAG_REPORT_DIR environment
# variable.
config.jplag_report_dir = ENV['DF_JPLAG_REPORT_DIR'] || "#{Rails.root}/jplag_results"

# ==> Load credentials from env
credentials.secret_key_base = ENV.fetch('DF_SECRET_KEY_BASE', Rails.env.production? ? nil : '9e010ee2f52af762916406fd2ac488c5694a6cc784777136e657511f8bbc7a73f96d59c0a9a778a0d7cf6406f8ecbf77efe4701dfbd63d8248fc7cc7f32dea97')
credentials.secret_key_attr = ENV.fetch('DF_SECRET_KEY_ATTR', Rails.env.production? ? nil : 'e69fc5960ca0e8700844a3a25fe80373b41c0a265d342eba06950113f3766fd983bad9ec51bf36eb615d9711bfe1dd90b8e35f01841b323f604ffee857e32055')
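
The `DF_JPLAG_REPORT_DIR` fallback in this hunk can be sketched in isolation as follows. The function name `jplag_report_dir`, the `env:` and `root:` keywords, and the `/doubtfire` root are assumptions for illustration. This sketch also treats an empty value as unset, which differs from the `||` in the diff: `ENV['X'] || default` keeps an empty string.

```ruby
# Resolve the JPLAG report directory from an environment-like hash.
# `env` defaults to ENV; `root` stands in for Rails.root (an assumption here).
def jplag_report_dir(env: ENV, root: '/doubtfire')
  value = env['DF_JPLAG_REPORT_DIR']
  # Fall back when the variable is unset or set to an empty string.
  (value.nil? || value.empty?) ? File.join(root, 'jplag_results') : value
end

puts jplag_report_dir(env: {})
puts jplag_report_dir(env: { 'DF_JPLAG_REPORT_DIR' => '/data/jplag' })
```

Passing the environment in as a parameter keeps the fallback easy to check without touching the real process environment.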
2 changes: 1 addition & 1 deletion lib/tasks/checks.rake
@@ -69,7 +69,7 @@ namespace :submission do
puts ' ------------------------------------------------------------ '
puts " Starting Plagiarism Check for #{unit.name}"
puts ' ------------------------------------------------------------ '
- unit.check_moss_similarity
+ unit.check_similarity
end
puts ' ------------------------------------------------------------ '
puts ' done.'