Lectures: Show slide numbers to be removed in automatic unit processing #7350

Merged on Nov 3, 2023 (14 commits). Changes from 1 commit.
src/main/java/de/tum/in/www1/artemis/service/LectureUnitProcessingService.java
@@ -50,13 +50,13 @@
* Split units from given file according to given split information and saves them.
*
* @param lectureUnitInformationDTO The split information
* @param file The file (lecture slide) to be split
* @param fileBytes The byte content of the file (lecture slides) to be split
* @param lecture The lecture that the attachment unit belongs to
* @return The prepared units to be saved
*/
public List<AttachmentUnit> splitAndSaveUnits(LectureUnitInformationDTO lectureUnitInformationDTO, MultipartFile file, Lecture lecture) throws IOException {
public List<AttachmentUnit> splitAndSaveUnits(LectureUnitInformationDTO lectureUnitInformationDTO, byte[] fileBytes, Lecture lecture) throws IOException {

try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream(); PDDocument document = Loader.loadPDF(file.getBytes())) {
try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream(); PDDocument document = Loader.loadPDF(fileBytes)) {
List<AttachmentUnit> units = new ArrayList<>();
Splitter pdfSplitter = new Splitter();

@@ -100,6 +100,40 @@
}
}

/**
* Gets the slides that should be removed because they contain one of the given key phrases
*
* @param fileBytes The byte content of the file (lecture slides) to be split
* @param commaSeparatedKeyPhrases comma-separated key phrases that identify the slides to be removed
* @return list of the zero-based indices of the slides that will be removed
*/
public List<Integer> getSlidesToRemoveByKeyphrase(byte[] fileBytes, String commaSeparatedKeyPhrases) {
List<Integer> slidesToRemove = new ArrayList<>();
if (commaSeparatedKeyPhrases.isEmpty()) {
return slidesToRemove;
}
try (PDDocument document = Loader.loadPDF(fileBytes)) {
PDFTextStripper pdfTextStripper = new PDFTextStripper();
Splitter pdfSplitter = new Splitter();
List<PDDocument> pages = pdfSplitter.split(document);

for (int index = 0; index <= pages.size() - 1; index++) {
PDDocument currentPage = pages.get(index);
String slideText = pdfTextStripper.getText(currentPage);

if (slideContainsKeyphrase(slideText, commaSeparatedKeyPhrases)) {
slidesToRemove.add(index);
}
currentPage.close(); // make sure to close the document
}
}
catch (IOException e) {
log.error("Error while retrieving slides to remove from document", e);
throw new InternalServerErrorException("Error while retrieving slides to remove from document");

Check warning (Teamscale / teamscale-findings) on line 132 in src/main/java/de/tum/in/www1/artemis/service/LectureUnitProcessingService.java: [New] Exception stacktrace is lost https://teamscale.io/findings.html#details/GitHub-ls1intum-Artemis?t=feature%2Flecture%2Fshow-removed-slide-numbers%3AHEAD&id=9F93CF50D9129DBD6D0797FF4FD0E3C3
}
return slidesToRemove;
}
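The helper `slideContainsKeyphrase` is called in the method above but is not part of this hunk. A minimal sketch of how such a check could work, assuming case-insensitive substring matching on trimmed phrases. The class name `KeyphraseSketch` and both method bodies are illustrative assumptions, not the PR's implementation, and plain strings stand in for the text that PDFBox would extract per page:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeyphraseSketch {

    // Hypothetical stand-in for slideContainsKeyphrase: true if the slide text
    // contains at least one non-blank key phrase, ignoring case.
    public static boolean slideContainsKeyphrase(String slideText, String commaSeparatedKeyPhrases) {
        String lowerCaseSlideText = slideText.toLowerCase();
        return Arrays.stream(commaSeparatedKeyPhrases.split(","))
                .map(String::strip)
                .filter(phrase -> !phrase.isEmpty())
                .map(String::toLowerCase)
                .anyMatch(lowerCaseSlideText::contains);
    }

    // Mirrors the loop in the diff: collects the zero-based indices of matching slides.
    public static List<Integer> slidesToRemove(List<String> slideTexts, String commaSeparatedKeyPhrases) {
        List<Integer> result = new ArrayList<>();
        if (commaSeparatedKeyPhrases.isEmpty()) {
            return result;
        }
        for (int index = 0; index < slideTexts.size(); index++) {
            if (slideContainsKeyphrase(slideTexts.get(index), commaSeparatedKeyPhrases)) {
                result.add(index);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> slides = List.of("Introduction", "Break", "Sorting algorithms", "Questions?");
        System.out.println(slidesToRemove(slides, "break, questions")); // prints [1, 3]
    }
}
```

Because the indices are zero-based, a client that shows slide numbers to users would presumably need to add one before displaying them.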

/**
* Removes the slides containing any of the key phrases from the given document.
*
@@ -138,14 +172,14 @@
/**
* Prepare information of split units for client
*
* @param file The file (lecture slide) to be split
* @param fileBytes The byte content of the file (lecture slides) to be split
* @return The prepared information of split units LectureUnitInformationDTO
*/
public LectureUnitInformationDTO getSplitUnitData(MultipartFile file) {
public LectureUnitInformationDTO getSplitUnitData(byte[] fileBytes) {

try {
log.debug("Start preparing information of split units for the file {}", file);
Outline unitsInformation = separateIntoUnits(file);
log.debug("Start preparing information of split units.");
Outline unitsInformation = separateIntoUnits(fileBytes);
Map<Integer, LectureUnitSplit> unitsDocumentMap = unitsInformation.splits;
int numberOfPages = unitsInformation.totalPages;

@@ -166,11 +200,11 @@
* is going to be split. The map looks like the following:
* Map<OutlineNumber, (UnitName, StartPage, EndPage)>
*
* @param file The file (lecture pdf) to be split
* @param fileBytes The byte content of the file (lecture pdf) to be split
* @return The prepared map
*/
private Outline separateIntoUnits(MultipartFile file) throws IOException {
try (PDDocument document = Loader.loadPDF(file.getBytes())) {
private Outline separateIntoUnits(byte[] fileBytes) throws IOException {
try (PDDocument document = Loader.loadPDF(fileBytes)) {
Map<Integer, LectureUnitSplit> outlineMap = new HashMap<>();
Splitter pdfSplitter = new Splitter();
PDFTextStripper pdfStripper = new PDFTextStripper();
src/main/java/de/tum/in/www1/artemis/web/rest/lecture/AttachmentUnitResource.java
@@ -14,18 +14,16 @@
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

import com.google.gson.Gson;

import de.tum.in.www1.artemis.domain.Attachment;
import de.tum.in.www1.artemis.domain.Lecture;
import de.tum.in.www1.artemis.domain.lecture.AttachmentUnit;
import de.tum.in.www1.artemis.repository.AttachmentUnitRepository;
import de.tum.in.www1.artemis.repository.LectureRepository;
import de.tum.in.www1.artemis.security.Role;
import de.tum.in.www1.artemis.security.annotations.EnforceAtLeastEditor;
import de.tum.in.www1.artemis.service.AttachmentUnitService;
import de.tum.in.www1.artemis.service.AuthorizationCheckService;
import de.tum.in.www1.artemis.service.CompetencyProgressService;
import de.tum.in.www1.artemis.service.LectureUnitProcessingService;
import de.tum.in.www1.artemis.service.SlideSplitterService;
import de.tum.in.www1.artemis.service.*;

Check warning (Teamscale / teamscale-findings) on line 26 in src/main/java/de/tum/in/www1/artemis/web/rest/lecture/AttachmentUnitResource.java: [New] Star import of `de.tum.in.www1.artemis.service.*` should not be used https://teamscale.io/findings.html#details/GitHub-ls1intum-Artemis?t=feature%2Flecture%2Fshow-removed-slide-numbers%3AHEAD&id=B51213DBBDCCDFD0E8DA09036C8A372D
import de.tum.in.www1.artemis.service.notifications.GroupNotificationService;
import de.tum.in.www1.artemis.web.rest.dto.LectureUnitInformationDTO;
import de.tum.in.www1.artemis.web.rest.errors.BadRequestAlertException;
@@ -40,6 +38,8 @@

private static final String ENTITY_NAME = "attachmentUnit";

private static final Gson gson = new Gson();

Check warning (Teamscale / teamscale-findings) on line 41 in src/main/java/de/tum/in/www1/artemis/web/rest/lecture/AttachmentUnitResource.java: [New] Constant `gson` violates naming convention. Should be one of `[A-Z][_A-Z0-9]*` https://teamscale.io/findings.html#details/GitHub-ls1intum-Artemis?t=feature%2Flecture%2Fshow-removed-slide-numbers%3AHEAD&id=23F2D5C0E099F40CD19A29181F5480D3

private final AttachmentUnitRepository attachmentUnitRepository;

private final LectureRepository lectureRepository;
@@ -56,9 +56,13 @@

private final SlideSplitterService slideSplitterService;

private final FileService fileService;

private final FilePathService filePathService;

public AttachmentUnitResource(AttachmentUnitRepository attachmentUnitRepository, LectureRepository lectureRepository, LectureUnitProcessingService lectureUnitProcessingService,
AuthorizationCheckService authorizationCheckService, GroupNotificationService groupNotificationService, AttachmentUnitService attachmentUnitService,
CompetencyProgressService competencyProgressService, SlideSplitterService slideSplitterService) {
CompetencyProgressService competencyProgressService, SlideSplitterService slideSplitterService, FileService fileService, FilePathService filePathService) {
this.attachmentUnitRepository = attachmentUnitRepository;
this.lectureUnitProcessingService = lectureUnitProcessingService;
this.lectureRepository = lectureRepository;
@@ -67,6 +71,8 @@
this.attachmentUnitService = attachmentUnitService;
this.competencyProgressService = competencyProgressService;
this.slideSplitterService = slideSplitterService;
this.fileService = fileService;
this.filePathService = filePathService;
}

/**
@@ -161,17 +167,45 @@
}

/**
* POST lectures/:lectureId/attachment-units/split : creates new attachment units. The provided file must be a pdf file.
* POST lectures/:lectureId/process-units/upload : Temporarily uploads a file which will be processed into lecture units
*
* @param file the file that will be processed
* @param lectureId the id of the lecture to which the attachment units will be added
* @return the ResponseEntity with status 200 (ok) and with body filename of the uploaded file
*/
@PostMapping("lectures/{lectureId}/process-units/upload")
@EnforceAtLeastEditor
public ResponseEntity<String> uploadSlidesForProcessing(@PathVariable Long lectureId, @RequestPart("file") MultipartFile file) {
// Time until the temporary file gets deleted. Must be greater than or equal to MINUTES_UNTIL_DELETION in attachment-units.component.ts
// TODO: increase to 30 after testing

Check warning (Teamscale / teamscale-findings) on line 180 in src/main/java/de/tum/in/www1/artemis/web/rest/lecture/AttachmentUnitResource.java: [New] TODO: increase to 30 after testing https://teamscale.io/findings.html#details/GitHub-ls1intum-Artemis?t=feature%2Flecture%2Fshow-removed-slide-numbers%3AHEAD&id=E569F013F747C7BB57DB079EC3BE1747
int MINUTES_UNTIL_DELETION = 3;

Check warning (Teamscale / teamscale-findings) on line 181 in src/main/java/de/tum/in/www1/artemis/web/rest/lecture/AttachmentUnitResource.java: [New] Local variable `MINUTES_UNTIL_DELETION` violates naming convention. Should be one of `[a-z][a-zA-Z0-9]*` https://teamscale.io/findings.html#details/GitHub-ls1intum-Artemis?t=feature%2Flecture%2Fshow-removed-slide-numbers%3AHEAD&id=3FE34364AF509E3F644D1A8E9B1E037F
log.debug("REST request to upload file: {}", file.getOriginalFilename());

Lecture lecture = lectureRepository.findByIdWithLectureUnitsElseThrow(lectureId);
if (lecture.getCourse() == null) {
throw new ConflictException("Specified lecture is not part of a course", "AttachmentUnit", "courseMissing");
}
authorizationCheckService.checkHasAtLeastRoleInCourseElseThrow(Role.EDITOR, lecture.getCourse(), null);

URI fileURI = fileService.handleSaveFile(file, false, false);
fileService.schedulePathForDeletion(filePathService.actualPathForPublicPath(fileURI), MINUTES_UNTIL_DELETION);
String fileName = filePathService.actualPathForPublicPath(fileURI).getFileName().toString();

return ResponseEntity.ok().body(gson.toJson(fileName));
}
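The upload endpoint above stores the file and then calls `fileService.schedulePathForDeletion(...)`, whose internals are not shown in this diff. A minimal sketch of delayed deletion, assuming a `ScheduledExecutorService` does the work. The class `TempFileDeletionSketch` and the method `scheduleForDeletion` are hypothetical names for illustration only:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class TempFileDeletionSketch {

    // Daemon thread so a pending deletion never keeps the JVM alive.
    private static final ScheduledExecutorService EXECUTOR = Executors.newSingleThreadScheduledExecutor(runnable -> {
        Thread thread = new Thread(runnable, "temp-file-deletion");
        thread.setDaemon(true);
        return thread;
    });

    // Hypothetical helper: deletes the given path once the delay has elapsed.
    public static ScheduledFuture<?> scheduleForDeletion(Path path, long delay, TimeUnit unit) {
        return EXECUTOR.schedule(() -> {
            try {
                Files.deleteIfExists(path);
            }
            catch (IOException e) {
                throw new UncheckedIOException("Could not delete " + path, e);
            }
        }, delay, unit);
    }

    public static void main(String[] args) throws Exception {
        Path upload = Files.createTempFile("slides", ".pdf");
        ScheduledFuture<?> deletion = scheduleForDeletion(upload, 100, TimeUnit.MILLISECONDS);
        deletion.get(); // block until the deletion task has run
        System.out.println(Files.exists(upload)); // prints false
    }
}
```

With this kind of scheme, any follow-up request that arrives after the delay finds no file, which is why the server-side delay has to be at least as long as the client-side one mentioned in the comment.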

/**
* POST lectures/:lectureId/process-units/split : creates new attachment units from the given file and lecture unit information
*
* @param lectureId the id of the lecture to which the attachment units should be added
* @param lectureUnitInformationDTO the units that should be created
* @param file the file to be splitted
* @param lectureId the id of the lecture to which the attachment units will be added
* @param lectureUnitInformationDTO the units that will be created
* @param filename the name of the lecture file, located in the temp folder
* @return the ResponseEntity with status 200 (ok) and with body the newly created attachment units
*/
@PostMapping("lectures/{lectureId}/attachment-units/split")
@PostMapping("lectures/{lectureId}/process-units/split")
@EnforceAtLeastEditor
public ResponseEntity<List<AttachmentUnit>> createAttachmentUnits(@PathVariable Long lectureId, @RequestPart LectureUnitInformationDTO lectureUnitInformationDTO,
@RequestPart MultipartFile file) {
@RequestPart String filename) {
log.debug("REST request to create AttachmentUnits {} with lectureId {}", lectureUnitInformationDTO, lectureId);

Lecture lecture = lectureRepository.findByIdWithLectureUnitsElseThrow(lectureId);
@@ -181,10 +215,8 @@
authorizationCheckService.checkHasAtLeastRoleInCourseElseThrow(Role.EDITOR, lecture.getCourse(), null);

try {
if (!Objects.equals(FilenameUtils.getExtension(file.getOriginalFilename()), "pdf")) {
throw new BadRequestAlertException("The file must be a pdf", ENTITY_NAME, "wrongFileType");
}
List<AttachmentUnit> savedAttachmentUnits = lectureUnitProcessingService.splitAndSaveUnits(lectureUnitInformationDTO, file, lecture);
byte[] fileBytes = fileService.getFileForPath(FilePathService.getTempFilePath().resolve(filename));
List<AttachmentUnit> savedAttachmentUnits = lectureUnitProcessingService.splitAndSaveUnits(lectureUnitInformationDTO, fileBytes, lecture);
savedAttachmentUnits.forEach(attachmentUnitService::prepareAttachmentUnitForClient);
savedAttachmentUnits.forEach(competencyProgressService::updateProgressByLearningObjectAsync);
return ResponseEntity.ok().body(savedAttachmentUnits);
@@ -196,25 +228,56 @@
}

/**
* POST lectures/:lectureId/process-units : Prepare attachment units information
* GET lectures/:lectureId/process-units : Calculates lecture units by splitting up the given file
*
* @param file the file to get the units data
* @param lectureId the id of the lecture to which the file is going to be splitted
* @param lectureId the id of the lecture to which the file is going to be split
* @param filename the name of the lecture file to be split, located in the temp folder
* @return the ResponseEntity with status 200 (ok) and with body attachmentUnitsData
*/
@PostMapping("lectures/{lectureId}/process-units")
@GetMapping("lectures/{lectureId}/process-units")
@EnforceAtLeastEditor
public ResponseEntity<LectureUnitInformationDTO> getAttachmentUnitsData(@PathVariable Long lectureId, @RequestParam("file") MultipartFile file) {
log.debug("REST request to split lecture file : {}", file.getOriginalFilename());
public ResponseEntity<LectureUnitInformationDTO> getAttachmentUnitsData(@PathVariable Long lectureId, @RequestParam("filename") String filename) {
log.debug("REST request to split lecture file : {}", filename);

Lecture lecture = lectureRepository.findByIdWithLectureUnitsElseThrow(lectureId);
if (lecture.getCourse() == null) {
throw new ConflictException("Specified lecture is not part of a course", "AttachmentUnit", "courseMissing");
}
authorizationCheckService.checkHasAtLeastRoleInCourseElseThrow(Role.EDITOR, lecture.getCourse(), null);

LectureUnitInformationDTO attachmentUnitsData = lectureUnitProcessingService.getSplitUnitData(file);
return ResponseEntity.ok().body(attachmentUnitsData);
try {
byte[] fileBytes = fileService.getFileForPath(FilePathService.getTempFilePath().resolve(filename));
LectureUnitInformationDTO attachmentUnitsData = lectureUnitProcessingService.getSplitUnitData(fileBytes);
return ResponseEntity.ok().body(attachmentUnitsData);
}
catch (IOException e) {
log.error("Could not calculate lecture units automatically", e);
throw new InternalServerErrorException("Could not calculate lecture units automatically");

Check warning (Teamscale / teamscale-findings) on line 255 in src/main/java/de/tum/in/www1/artemis/web/rest/lecture/AttachmentUnitResource.java: [New] Exception stacktrace is lost https://teamscale.io/findings.html#details/GitHub-ls1intum-Artemis?t=feature%2Flecture%2Fshow-removed-slide-numbers%3AHEAD&id=28674A1A194CC4384435CA034253DBC6
}
}
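The "Exception stacktrace is lost" finding on the `catch` block above refers to the rethrown exception: the `IOException` is logged, but the new exception does not carry it as its cause. A minimal sketch of the usual remedy, assuming the exception type offers a `(message, cause)` constructor. `ProcessingFailedException` is a hypothetical stand-in, since the constructors of the project's `InternalServerErrorException` are not visible in this diff:

```java
import java.io.IOException;

public class ExceptionCauseSketch {

    // Hypothetical unchecked exception; stands in for whatever server error type the project uses.
    public static class ProcessingFailedException extends RuntimeException {
        public ProcessingFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Simulates reading the temp file; fails on demand to exercise the catch block.
    public static byte[] readSlides(boolean fail) {
        try {
            if (fail) {
                throw new IOException("temp file not found");
            }
            return new byte[0];
        }
        catch (IOException e) {
            // Passing e as the cause keeps the original stack trace attached.
            throw new ProcessingFailedException("Could not calculate lecture units automatically", e);
        }
    }

    public static void main(String[] args) {
        try {
            readSlides(true);
        }
        catch (ProcessingFailedException e) {
            System.out.println(e.getCause().getMessage()); // prints temp file not found
        }
    }
}
```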

/**
* GET lectures/:lectureId/process-units/slides-to-remove : gets the slides to be removed
*
* @param lectureId the id of the lecture to which the unit belongs
* @param filename the name of the file to be parsed, located in the temp folder
* @param commaSeparatedKeyPhrases the comma-separated key phrases that identify the slides to be removed
* @return the ResponseEntity with status 200 (OK) and with body the list of slides to be removed
*/
@GetMapping("lectures/{lectureId}/process-units/slides-to-remove")
@EnforceAtLeastEditor
public ResponseEntity<List<Integer>> getSlidesToRemove(@PathVariable Long lectureId, @RequestParam String filename, @RequestParam String commaSeparatedKeyPhrases) {
Lecture lecture = lectureRepository.findByIdWithLectureUnitsElseThrow(lectureId);
authorizationCheckService.checkHasAtLeastRoleInCourseElseThrow(Role.EDITOR, lecture.getCourse(), null);
try {
byte[] fileBytes = fileService.getFileForPath(FilePathService.getTempFilePath().resolve(filename));
List<Integer> slidesToRemove = this.lectureUnitProcessingService.getSlidesToRemoveByKeyphrase(fileBytes, commaSeparatedKeyPhrases);
return ResponseEntity.ok().body(slidesToRemove);
}
catch (IOException e) {
log.error("Could not calculate slides to remove", e);
throw new InternalServerErrorException("Could not calculate slides to remove");

Check warning (Teamscale / teamscale-findings) on line 279 in src/main/java/de/tum/in/www1/artemis/web/rest/lecture/AttachmentUnitResource.java: [New] Exception stacktrace is lost https://teamscale.io/findings.html#details/GitHub-ls1intum-Artemis?t=feature%2Flecture%2Fshow-removed-slide-numbers%3AHEAD&id=324CD4CC10344D134CD8EA0572843963
}
}
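The three `process-units` endpoints that take a `filename` resolve that client-supplied value against the temp folder via `FilePathService.getTempFilePath().resolve(filename)`. Whether and how the project validates the name is not visible in this diff. One common guard, sketched here under that caveat, normalizes the resolved path and rejects anything that leaves the base directory. `TempPathSketch` and `resolveInside` are hypothetical names:

```java
import java.nio.file.Path;

public class TempPathSketch {

    // Resolves a client-supplied file name against baseDir and rejects anything that escapes it.
    public static Path resolveInside(Path baseDir, String filename) {
        Path base = baseDir.toAbsolutePath().normalize();
        Path resolved = base.resolve(filename).normalize();
        // Reject "..", absolute paths, and an empty name that would resolve to the directory itself.
        if (!resolved.startsWith(base) || resolved.equals(base)) {
            throw new IllegalArgumentException("Invalid file name: " + filename);
        }
        return resolved;
    }

    public static void main(String[] args) {
        Path temp = Path.of("uploads", "temp");
        System.out.println(resolveInside(temp, "slides.pdf").getFileName()); // prints slides.pdf
        try {
            resolveInside(temp, "../../secret.txt");
        }
        catch (IllegalArgumentException e) {
            System.out.println("rejected"); // prints rejected
        }
    }
}
```

Normalizing before the `startsWith` check matters: without it, a path such as `temp/../../secret.txt` still starts with the base directory segment by segment and would pass the comparison.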

/**
@@ -79,17 +79,22 @@ <h2 id="page-heading">
<span class="px-1">{{ 'artemisApp.attachmentUnit.createAttachmentUnits.removeSlides' | artemisTranslate }}</span>
<div>
<div class="alert alert-warning mt-3 ml-3">
<span>
{{ 'artemisApp.attachmentUnit.createAttachmentUnits.removeSlidesInfo' | artemisTranslate }}
</span>
<ul>
<li>{{ 'artemisApp.attachmentUnit.createAttachmentUnits.removeSlidesInfo.firstLine' | artemisTranslate }}</li>
<li>{{ 'artemisApp.attachmentUnit.createAttachmentUnits.removeSlidesInfo.secondLine' | artemisTranslate }}</li>
<li>
{{ 'artemisApp.attachmentUnit.createAttachmentUnits.removeSlidesInfo.thirdLine' | artemisTranslate }}
{{ removedSlidesNumbers.length > 0 ? removedSlidesNumbers : '-' }}
</li>
</ul>
</div>
<input
type="text"
class="form-control"
id="removeSlidesCommaSeparatedKeyPhrases"
placeholder="{{ 'artemisApp.attachmentUnit.createAttachmentUnits.removeSlidesPlaceholder' | artemisTranslate }}"
autocomplete="off"
[(ngModel)]="removeSlidesCommaSeparatedKeyPhrases"
[(ngModel)]="searchTerm"
/>
</div>
</div>