Merge pull request #7 from lolaent/sync_branch
Sync branch
bogdanv96 authored Jul 9, 2020
2 parents b0d905e + 79cb110 commit 505ff85
Showing 27 changed files with 1,330 additions and 159 deletions.
26 changes: 26 additions & 0 deletions .github/workflows/gradle.yml
@@ -0,0 +1,26 @@
# This workflow will build a Java project with Gradle
# For more information see: https://help.github.com/actions/language-and-framework-guides/building-and-testing-java-with-gradle

name: Java CI with Gradle

on:
  push:
    branches: [ develop ]
  pull_request:
    branches: [ develop ]

jobs:
  build:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v2
    - name: Set up JDK 1.8
      uses: actions/setup-java@v1
      with:
        java-version: 1.8
    - name: Grant execute permission for gradlew
      run: chmod +x gradlew
    - name: Build with Gradle
      run: ./gradlew build
69 changes: 62 additions & 7 deletions README.md
@@ -2,19 +2,74 @@

## Connector Description

Drupal8 Connector fetches the entries of a json file.
The Drupal Connector fetches all the content available at the URL exposed by the Drupal module. It follows the links found on every page, then parses and normalises the content of those pages into a Fusion 5 Server instance through the connector SDK.

## System Diagram
The system diagram shows the three components and how they are connected. Fusion is the component to which the Java connector is uploaded as a plugin. The connector plugin is then used as a datasource in _Index Workbench_. One of the properties the connector expects is a URL provided by the Drupal module, from which all the content is fetched recursively and indexed.

![diagram](diagram.png)

## Quick start

1. Clone the repo:
### Clone the repo:
```
git clone https://github.com/lucidworks/drupal-connector.git
```

### Build the project:
```
git clone https://github.com/lolaent/fusion-connector-java.git
cd drupal8
./gradlew clean build assemblePlugin
```
2. This produces the zip file, named `drupal8.zip`, located in the `build` directory.
This artifact is now ready to be uploaded directly to Fusion as a connector plugin.
This produces the zip file, named `drupal8.zip`, located in the `build` directory.
This artifact is now ready to be uploaded directly into Fusion Server as a connector plugin.

### Connector properties
This connector uses `connector-plugin-sdk` version `2.0.1`, which is compatible with Fusion Server 5.

#### Properties required
1. **_Drupal URL_** - the URL from which this connector takes all the content.
2. **_Username_** - the username used to log in to Drupal in order to fetch a specific type of content. The Drupal module defines different roles for users.
3. **_Password_** - the password used to log in to Drupal.
4. **_Login Path_** - the path used for the login request - ```defaultValue = "/user/login"```
5. **_Logout Path_** - the path used for the logout request - ```defaultValue = "/user/logout"```
6. **_Entry Path_** - the page from which fetching the content begins - ```defaultValue = "/en/fusion"```
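
As a rough illustration of how these properties could combine into request URLs, the sketch below joins a base URL with the default paths listed above. The class name `DrupalUrls`, the `join` helper and the `drupal.example.com` host are assumptions made for this example; the trailing-slash handling mirrors the `normalizeUrl` helper in `Runner.java`, but the connector itself may build its URLs differently.

```java
final class DrupalUrls {
    // Default values quoted from the property list above.
    static final String DEFAULT_LOGIN_PATH = "/user/login";
    static final String DEFAULT_LOGOUT_PATH = "/user/logout";
    static final String DEFAULT_ENTRY_PATH = "/en/fusion";

    /** Removes a single trailing slash, if present. */
    static String stripTrailingSlash(String url) {
        return url.endsWith("/") ? url.substring(0, url.length() - 1) : url;
    }

    /** Joins a base URL and a path with exactly one slash between them. */
    static String join(String baseUrl, String path) {
        String base = stripTrailingSlash(baseUrl);
        String tail = path.startsWith("/") ? path : "/" + path;
        return base + stripTrailingSlash(tail);
    }

    public static void main(String[] args) {
        String drupalUrl = "http://drupal.example.com/"; // placeholder host, not a real site
        System.out.println(join(drupalUrl, DEFAULT_LOGIN_PATH));
        System.out.println(join(drupalUrl, DEFAULT_ENTRY_PATH));
        System.out.println(join(drupalUrl, DEFAULT_LOGOUT_PATH));
    }
}
```

Normalising both sides keeps the result stable whether or not the configured Drupal URL ends with a slash.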

#### Properties added in MANIFEST.MF
```
Plugin-Id: com.lucidworks.fusion.connector
Plugin-Type: connector
Plugin-Provider: Lucidworks
Plugin-Version: 1.0-SNAPSHOT
Plugin-Connectors-SDK-Version: 2.0.1
Plugin-Class: com.lucidworks.fusion.connector.ConnectorPlugin
```

### The Fetcher
The **_JsonContentFetcher_** class provides the methods that define how data is fetched and indexed for Fusion Server. The data is fetched from Drupal using the OkHttp client. Before the content can be requested, a login request is needed, because different types of users can see either the entire content or only a particular part of it.
The _Set-Cookie_ header is taken from the login response and sent as a header with the subsequent requests.
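
The cookie hand-off can be sketched as follows. This is a minimal illustration, assuming that only the leading `name=value` pair of each _Set-Cookie_ value is sent back in a `Cookie` request header; the class `SessionCookies` and its method are hypothetical and are not taken from the connector, which may forward the header in another way.

```java
import java.util.ArrayList;
import java.util.List;

final class SessionCookies {

    /**
     * Builds a Cookie request header value from the Set-Cookie values of a
     * login response. Attributes such as "path" or "HttpOnly" are dropped;
     * only the leading name=value pair of each value is kept.
     */
    static String toCookieHeader(List<String> setCookieValues) {
        List<String> pairs = new ArrayList<>();
        for (String value : setCookieValues) {
            int semicolon = value.indexOf(';');
            String pair = (semicolon >= 0 ? value.substring(0, semicolon) : value).trim();
            if (!pair.isEmpty()) {
                pairs.add(pair);
            }
        }
        return String.join("; ", pairs);
    }
}
```

With OkHttp, the returned string would typically be attached to each later request through `Request.Builder.header("Cookie", value)`.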

### Crawling process
The **_DrupalContentCrawler_** class resolves the data to be indexed. The `startCrawling` function sends a GET request to every URL saved in **drupalUrls** and prepares the next step of execution. All the content is saved in the **topLevelJsonapiMap** map. The process keeps running as long as **drupalUrls** has values.
To check whether the process is done, read the **processFinished** flag.
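
The loop described here can be pictured with the sketch below. It is an assumption-laden illustration, not the code of `DrupalContentCrawler`: the `CrawlSketch` and `Page` types are hypothetical, the HTTP call is replaced by a caller-supplied function, and a plain string stands in for the parsed JSON:API document.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

final class CrawlSketch {

    /** Hypothetical stand-in for one fetched page: its body and the links found in it. */
    static final class Page {
        final String body;
        final List<String> links;

        Page(String body, List<String> links) {
            this.body = body;
            this.links = links;
        }
    }

    private final Deque<String> pendingUrls = new ArrayDeque<>();      // plays the role of drupalUrls
    private final Set<String> seenUrls = new HashSet<>();              // avoids fetching a URL twice
    private final Map<String, String> contentByUrl = new LinkedHashMap<>();
    private boolean processFinished = false;

    CrawlSketch(String entryUrl) {
        pendingUrls.add(entryUrl);
        seenUrls.add(entryUrl);
    }

    /** Fetches pages while URLs are pending; newly discovered links are queued. */
    Map<String, String> crawl(Function<String, Page> fetcher) {
        while (!pendingUrls.isEmpty()) {
            String url = pendingUrls.poll();
            Page page = fetcher.apply(url);
            if (page == null) {
                continue; // nothing could be fetched for this URL
            }
            contentByUrl.put(url, page.body);
            for (String link : page.links) {
                if (seenUrls.add(link)) {
                    pendingUrls.add(link);
                }
            }
        }
        processFinished = true;
        return contentByUrl;
    }

    boolean isProcessFinished() {
        return processFinished;
    }
}
```

Tracking the URLs already seen is a design choice of this sketch: pages that link back to each other would otherwise keep the queue from ever draining.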

#### JSON:API
The content served at the Drupal URL is in JSON:API format.

JSON:API is a specification for how a client should request the resources to be fetched or modified, and how a server should respond to those requests.
JSON:API is designed to minimize both the number of requests and the amount of data transmitted between clients and servers. This efficiency is achieved without compromising readability, flexibility, or discoverability.

JSON:API requires use of the JSON:API media type `application/vnd.api+json` for exchanging data.
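
To make the media type requirement concrete, the sketch below prepares a GET request that advertises `application/vnd.api+json` in its `Accept` header. It deliberately uses the JDK's `HttpURLConnection` rather than OkHttp so that it runs without extra dependencies; the class name `JsonApiRequest` is hypothetical, and no connection is opened until the caller reads the response.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;

final class JsonApiRequest {
    static final String JSON_API_MEDIA_TYPE = "application/vnd.api+json";

    /** Prepares, but does not send, a GET request that asks for a JSON:API document. */
    static HttpURLConnection prepare(String url) {
        try {
            HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
            connection.setRequestMethod("GET");
            connection.setRequestProperty("Accept", JSON_API_MEDIA_TYPE);
            return connection;
        } catch (IOException e) {
            // Malformed URLs and unsupported protocols surface here.
            throw new UncheckedIOException(e);
        }
    }
}
```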

## Dependencies
The most important dependency is the SDK connector. The SDK connector used in this project can be found [here](https://github.com/lucidworks/connectors-sdk-resources/tree/master/java-sdk).

Besides this, an HTTP client is needed to connect to a third-party REST API, which in this project is a Drupal module.

### OkHttp
OkHttp is an HTTP client that is efficient by default. It supports HTTP/2 and allows all requests to the same host to share a socket. Its connection pooling reduces request latency. Response caching avoids the network completely for repeat requests.
Using OkHttp is easy. Its request/response API is designed with fluent builders and immutability. It supports both synchronous blocking calls and async calls with callbacks.

## Connector properties
### Lombok
Lombok is a Java library that plugs into your editor and build tools and uses annotations to replace most of the boilerplate code for getters, setters and even constructors.

...
8 changes: 6 additions & 2 deletions build.gradle
@@ -1,4 +1,3 @@

// Folder where the plugins are built
ext.pluginsDir = "${rootProject.buildDir}"

@@ -8,14 +7,18 @@ apply plugin: 'idea'
group 'com.lucidworks.fusion.connector'
version '1.0-SNAPSHOT'

sourceCompatibility = 1.8
compileJava {
sourceCompatibility = 1.8
targetCompatibility = 1.8
}

repositories {
mavenCentral()
mavenLocal()
maven {
url "https://artifactory.lucidworks.com/artifactory/public-artifacts/"
}
jcenter()
}

dependencies {
@@ -28,6 +31,7 @@ dependencies {
testCompile group: 'junit', name: 'junit', version: '4.12'
compile group: 'org.slf4j', name: 'slf4j-api', version: '1.6.1'
compile group: 'org.slf4j', name: 'slf4j-simple', version: '1.6.1'
testImplementation "org.mockito:mockito-core:${mockitoVersion}"
}

jar {
Binary file added diagram.png
5 changes: 3 additions & 2 deletions gradle.properties
@@ -1,7 +1,7 @@
# packaging properties
group=com.lucidworks.connector.plugins
version=2.0.0
#
# Connector SDK version
connectorsSDKVersion=2.0.1
# Deploy tasks
userPass=admin:a-very-secret-password
@@ -16,7 +16,8 @@ lombokVersion=1.18.12
jacksonVersion=2.9.10
jsonApiVersion=0.10
okHttpVersion=4.7.2
#plugins
mockitoVersion=2.7.22
# plugins
pluginClass=com.lucidworks.fusion.connector.ConnectorPlugin
pluginId=com.lucidworks.fusion.connector
pluginProvider=Lucidworks
36 changes: 28 additions & 8 deletions src/main/java/com/lucidworks/fusion/connector/Runner.java
@@ -1,30 +1,50 @@
package com.lucidworks.fusion.connector;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.lucidworks.fusion.connector.model.DrupalLoginRequest;
import com.lucidworks.fusion.connector.model.DrupalLoginResponse;
import com.lucidworks.fusion.connector.model.TopLevelJsonapi;
import com.lucidworks.fusion.connector.service.ConnectorService;
import com.lucidworks.fusion.connector.service.ContentService;
import com.lucidworks.fusion.connector.service.DrupalOkHttp;
import com.lucidworks.fusion.connector.util.DataUtil;

import java.util.Map;

public class Runner {

public static void main(String[] args) {
String baseUrl = "http://s5ee7c4bb7c413wcrxueduzw.devcloud.acquia-sites.com/";
DrupalOkHttp drupalOkHttp = new DrupalOkHttp();
ObjectMapper mapper = new ObjectMapper();
ContentService contentService = new ContentService(drupalOkHttp, mapper);

DrupalLoginResponse drupalLoginResponse = contentService.login(baseUrl, "authenticated", "authenticated");
DrupalOkHttp drupalOkHttp = new DrupalOkHttp(mapper);
ContentService contentService = new ContentService(mapper);

ConnectorService connectorService = new ConnectorService(baseUrl + "fusion", drupalLoginResponse, contentService);
DrupalLoginRequest drupalLoginRequest = new DrupalLoginRequest("authenticated", "authenticated");

DrupalLoginResponse drupalLoginResponse = drupalOkHttp.loginResponse(normalizeUrl(baseUrl) + normalizeUrl("/user/login"), drupalLoginRequest);

ConnectorService connectorService = new ConnectorService(normalizeUrl(baseUrl) + normalizeUrl("/en/fusion/node/article"), new DrupalLoginResponse(), contentService, mapper);

Map<String, String> response = connectorService.prepareDataToUpload();

response.forEach((currentUrl, content) -> {
System.out.println(currentUrl);
//System.out.println(content);
});
Map<String, TopLevelJsonapi> topLevelJsonapiMap = contentService.getTopLevelJsonapiDataMap();

Map<String, Map<String, Object>> objectMap = DataUtil.generateObjectMap(topLevelJsonapiMap);

for (String key : objectMap.keySet()) {
Map<String, Object> pageContentMap = objectMap.get(key);
System.out.println(pageContentMap.values().toString());
}

System.out.println("Logout is successful: " + drupalOkHttp.logout(normalizeUrl(baseUrl) + "/user/logout", drupalLoginResponse));
}


private static String normalizeUrl(String initialUrl) {
String normalizedUrl = initialUrl.endsWith("/") ?
initialUrl.substring(0, initialUrl.length() - 1) : initialUrl;

return normalizedUrl;
}
}
@@ -29,7 +29,7 @@ interface Properties extends ConnectorPluginProperties {

@Property(
title = "Drupal URL",
description = "Content URL location. If empty, the connector will generate entries (see 'Generate Properties')",
description = "Page URL.",
required = true,
order = 1
)
@@ -38,22 +38,47 @@ interface Properties extends ConnectorPluginProperties {

@Property(
title = "Username for login",
description = "Username to login into drupal to be able to fetch content from it",
required = true,
description = "Username to login into drupal to be able to fetch content from it.",
order = 2
)
@StringSchema
String getUsername();

@Property(
title = "Password for login",
description = "Password to login into drupal to be able to fetch content from it",
required = true,
description = "Password to login into drupal to be able to fetch content from it.",
order = 3
)
@StringSchema
@StringSchema()
String getPassword();

@Property(
title = "Login path",
description = "Login path.",
required = true,
order = 4
)
@StringSchema(defaultValue = "/user/login")
String getLoginPath();

@Property(
title = "Logout path",
description = "Logout path.",
required = true,
order = 5
)
@StringSchema(defaultValue = "/user/logout")
String getLogoutPath();

@Property(
title = "Drupal Content entry path",
description = "Drupal Content entry path from where the crawling should start.",
required = true,
order = 6
)
@StringSchema(defaultValue = "/en/fusion")
String getDrupalContentEntryPath();

}

}