diff --git a/docs/dev/solr.md b/docs/dev/solr.md new file mode 100644 index 0000000..aeb10ac --- /dev/null +++ b/docs/dev/solr.md @@ -0,0 +1,258 @@ +# Solr and YaCy Integration + +Hint: If you are not a developer, just don't care about this topic. Solr +is already inside YaCy, just do nothing. + +YaCy uses Solr (and other data structures) to store the local search +index (while the remote search index is a RWI data structure). The Solr +index is deeply/programmatically embedded into YaCy, but it is also +possible to use an external Solr index which can then be assigned to +YaCy as external storage. This can be activated with one single click if +you have a running Solr, configured for YaCy. + +The remote index scheme is similar (but extended) to SolrCell; see + We added some +more generic fields, added a second solr core and therefore we need to +use the solr.xml and schema.xml from a YaCy installation. + +## Use the deep-embedded Solr in YaCy and an external Solr concurrently + +This is the default setting. The assignment of a remote Solr and also +switching off of the embedded Solr is done in the servlet +/IndexFederated_p.html. The embedded Solr is switched on, if the flag +"Use deep-embedded local Solr" is switched on. + +## Use an external Solr or Solr Shards to have a distributed Solr-backend for a single YaCy + +In the "Solr URL(s)" field of the + servlet, you can enter +several Solr addresses. If there is more than one Solr assigned, these +are accessed as a 'Shard'. This will cause that each document is hashed +using the document id and stored only in one of the shards. If a query +to Solr is made, then all shards are queried concurrently and the +results are merged. + +## Concurrent usage of the embedded Solr and an external Solr or Solr Shard + +It is possible to leave the "Use deep-embedded local Solr" flag switched +on while using an external Solr. Then each document is stored in the +local and the remote Solr. 
If a document is searched, the search runs
+concurrently in the local and remote Solr, as if they were a Solr shard.
+
+# How to deploy an external Solr for YaCy
+
+The deployment needs two steps: (1) embed Solr into a servlet
+environment, (2) configure Solr for YaCy. Both steps are described in each of
+the following three options. You can choose between Jetty and Tomcat as
+servlet container; follow only one of the three:
+
+## Use the example-deployment in a Solr package
+
+This is probably the easiest and fastest way to test a YaCy-Solr
+connection. Don't do this for a production environment; one of the next
+two options is better for this. The following steps use Solr 4.1.0; you
+can use the most recent version as well.
+
+  - Download solr-4.1.0.tgz from
+  - Decompress solr-4.1.0.tgz (with 'tar xfz solr-4.1.0.tgz') and put
+    solr-4.1.0 into ~/
+  - We must define two cores for Solr: the default collection1 and an
+    additional 'webgraph' core. This is done by copying the YaCy solr.xml
+
+
+
+    cp ~/yacy/defaults/solr/solr.xml ~/solr-4.1.0/example/solr/collection1/conf/
+
+  - The webgraph core is basically a copy of the default collection1
+    core. Create a configuration for the webgraph as a clone of
+    collection1:
+
+
+
+    mkdir ~/solr-4.1.0/example/solr/webgraph
+    cp -R ~/solr-4.1.0/example/solr/collection1/conf ~/solr-4.1.0/example/solr/webgraph/conf
+
+  - A YaCy schema configuration must be copied to each core.
To do this, + we have two options: either copy a generic version of the schema.xml + as used by YaCy + + + + cp ~/yacy/defaults/solr/schema.xml ~/solr-4.1.0/example/solr/collection1/conf/ + cp ~/yacy/defaults/solr/schema.xml ~/solr-4.1.0/example/solr/webgraph/conf/ + +or, using the explicit schema definition which can be extracted from the +YaCy API; start YaCy (if not already running) and execute the following +commands: + + ~/yacy/bin/apicat.sh /api/schema.xml?core=collection1 > ~/solr-4.1.0/example/solr/collection1/conf/schema.xml + ~/yacy/bin/apicat.sh /api/schema.xml?core=webgraph > ~/solr-4.1.0/example/solr/webgraph/conf/schema.xml + + - Finally, start the external Solr with: + + + + cd ~/solr-4.1.0/example/ && java -jar start.jar + +Solr is then running at + + - Start YaCy (if not already running) and open + + - in the "Solr URL(s)" field, enter: (or + a remote address, if you want to run solr on a different server) + - uncheck the "Use deep-embedded local Solr" flag and check the "Use + remote Solr server(s)" flag + +## Deploy Solr in Tomcat + +First you must download and decompress tomcat 6. In this example you +install tomcat to your home directory at `~/tomcat/` + + cd ~ + wget http://archive.apache.org/dist/tomcat/tomcat-6/v6.0.37/bin/apache-tomcat-6.0.37.tar.gz + tar xfz apache-tomcat-6.0.37.tar.gz + mv apache-tomcat-6.0.37 tomcat + +To deploy a solr container, download a solr package and copy the +relevant files to the correct tomcat subdirectory: + + cd ~/tomcat + wget http://www.eu.apache.org/dist/lucene/solr/4.5.1/solr-4.5.1.tgz + tar xfz solr-4.5.1.tgz + cp solr-4.5.1/dist/solr-4.5.1.war . + cp -R solr-4.5.1/example/solr yacyindex + +We need to copy the YaCy schema and the definition of the second core +'webgraph'. 
Assuming you have installed a YaCy peer at ~/yacy,
+you can simply copy the generic schema file for collection1 to
+solr:
+
+    cp ~/yacy/defaults/solr/schema.xml ~/tomcat/yacyindex/collection1/conf/
+
+Clone collection1 to get the webgraph core:
+
+    cp -R ~/tomcat/yacyindex/collection1 ~/tomcat/yacyindex/webgraph
+
+Patch the core.properties in
+`~/tomcat/yacyindex/webgraph/core.properties` and replace the line
+`name=collection1` with `name=webgraph`. Then copy the solr.xml
+definition for two cores:
+
+    cp ~/yacy/defaults/solr/solr.xml ~/tomcat/yacyindex/
+
+Copy the solr logging libraries to the tomcat library folder because
+the Solr war file does not include them (the Jetty example loads them from `example/lib/ext`). In the
+~/tomcat directory, do
+
+    cp solr-4.5.1/example/lib/ext/* ~/tomcat/lib/
+
+To deploy Solr with the YaCy configuration you must create a Tomcat
+Context fragment. This is a file within a conf subdirectory which is
+created once tomcat has been started. Therefore we start tomcat now:
+
+    ~/tomcat/bin/startup.sh
+
+Look at the path `~/tomcat/conf/Catalina/localhost` which has now been
+created. That's the place where we create the Tomcat Context fragment.
+You need the absolute path to the tomcat installation directory, which we
+assume to be `/home/administrator/tomcat` in this example. Create a file at
+`/home/administrator/tomcat/conf/Catalina/localhost/solr4yacy.xml` with
+the following content:
+
+
+
+
+
+
+Restart tomcat to activate this configuration:
+
+    ~/tomcat/bin/shutdown.sh && ~/tomcat/bin/startup.sh
+
+Finished! You can now access Solr with the url
+This is the url which you can set in
+the "Use remote Solr server(s)" field of the /IndexFederated_p.html
+servlet in YaCy to attach the solr-in-tomcat to YaCy as remote storage
+server. When doing this you may want to uncheck the flag "Use
+deep-embedded local Solr" so this remote solr becomes the single storage
+point for the YaCy search index.
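The core.properties patch described above can also be scripted. This is a minimal sketch, assuming the file contains a line that reads exactly `name=collection1`; the function name `rename_core` is made up for this example and is not part of YaCy or Solr.

```shell
#!/bin/sh
# Hypothetical helper (not part of YaCy or Solr): rewrite the "name=" line
# of a core.properties file, e.g. from collection1 to webgraph.
rename_core() {
    props_file=$1
    old_name=$2
    new_name=$3
    # Write to a temporary file first so a failed sed leaves the original intact.
    sed "s/^name=$old_name\$/name=$new_name/" "$props_file" > "$props_file.tmp" &&
        mv "$props_file.tmp" "$props_file"
}
```

For the layout used in this section, the call would be `rename_core ~/tomcat/yacyindex/webgraph/core.properties collection1 webgraph`.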
+ +### User Administration and Search Index Access Protection + +Tomcat can add a password protection to web pages. There is i.e. a +default manager web application available at + which cannot be accessed without +setting a role and a user name for this. We will activate the manager to +test the password protection: write the new role "manager" to the file +`~/tomcat/conf/tomcat-users.xml` and set a password, i.e. + + + + + + + +Re-start tomcat, then open and +manage solr applications there. Log-in with the user name 'admin' and +the password 'tomcat'. We will use this now to access our YaCy search +index in Solr. To do this, we need access rules defined in the web.xml +configuration file to declare a role to be protected. We will call this +role 'user' and the paths to be all paths within tomcat. Open the file +`~/tomcat/conf/web.xml` and add the following lines at the end before the +closing tag \: + + + + /* + + + user + + + + BASIC + tomcat + + + user + + +To use the new role 'user', we add an account in the file +`~/tomcat/conf/tomcat-users.xml`. Add the following lines to +\: + +``` + + +``` + +and restart tomcat. You can now access Solr with the url + This is the url +which you can set in the "Use remote Solr server(s)" field of the +/IndexFederated_p.html servlet in YaCy. The account:password encoding +in the url is used by YaCy to access the solr index within tomcat. + +# Copy the deeply-embedded Solr Index to an external Solr + +This is easy, just copy the Solr directory in +DATA/INDEX/\/SEGMENTS/solr\_40 into the solr data directory of +your remote Solr installation. You can also do this using a script +during runtime. Call + + ~/yacy/bin/indexdump.sh + +which causes that YaCy creates a tar.gz file of the `solr_40` directory +during runtime and the indexdump.sh script returns the file path to this +tar.gz file. This filename can then be processed further with your own +copy-and-deploy script to fill a remote Solr with that. 
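A copy-and-deploy script of the kind mentioned above could start from the following sketch. It assumes that the dump is a gzip-compressed tar archive that can simply be unpacked into the target directory; the archive's internal layout and the function name `deploy_dump` are assumptions, not something YaCy documents here.

```shell
#!/bin/sh
# Hypothetical helper: unpack a YaCy index dump (tar.gz) into a Solr data directory.
deploy_dump() {
    dump_file=$1
    target_dir=$2
    if [ ! -f "$dump_file" ]; then
        echo "dump not found: $dump_file" >&2
        return 1
    fi
    mkdir -p "$target_dir" || return 1
    # x = extract, z = gunzip, f = archive file, -C = directory to extract into
    tar xzf "$dump_file" -C "$target_dir"
}
```

If `indexdump.sh` prints the archive path on standard output (an assumption), a call could look like `deploy_dump "$(~/yacy/bin/indexdump.sh)" /path/to/solr/data`, with the target path replaced by the data directory of your Solr installation.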
+ + + +For cluster solr usage, see [Solr Cloud instructions](./solrcloud.md) + + +_Converted from +, may be outdated_ + + + + diff --git a/docs/dev/solrcloud.md b/docs/dev/solrcloud.md new file mode 100644 index 0000000..c4d1fb8 --- /dev/null +++ b/docs/dev/solrcloud.md @@ -0,0 +1,286 @@ +# YaCy and Solr Cloud + + +This is an advanced [Solr for YaCy](./solr.md) +installation which uses the SolrCloud architecture. If you want to read +and understand this, you should be (at least a little bit) familiar with +debian, Solr and tomcat. + +In this example, we install a shard of 4 Solr instances within the same +server. + + + +## Software Installation + +We install tomcat, zookeeper and YaCy as standard debian packages and +Solr as web app for tomcat. + +### Tomcat Installation + +We will install tomcat as a standard debian system service using apt: + + apt-get install tomcat6 tomcat6-examples tomcat6-admin tomcat6-docs + +The tomcat web service on port 8080 will start automatically and you can +open the default page at The optional packages +tomcat6-examples tomcat6-admin tomcat6-docs are great to develop and +test applications, but it is also possible to omit them. If you +installed the optional packages, then you can test them: + + - is the online-documentation + - links to a set of example tomcat + applications + - and + are tomcat management + applications but their access is restricted. To use them you must + set a password in /etc/tomcat6/tomcat-users.xml, like + + + + + + + + + + + + +After setting this, you must restart tomcat with + + /etc/init.d/tomcat6 restart + +and then you can log in the +[manager](http://localhost:8080/manager/html) and +[host-manager](http://localhost:8080/host-manager/html) servlet with +the user 'admin' and the password 'tomcat'. Please replace the default +password 'tomcat' with your own. 
+
+The relevant paths for the result of this installation are:
+
+    tomcat users:     /etc/tomcat6/tomcat-users.xml
+    CATALINA_HOME:    /usr/share/tomcat6
+    CATALINA_BASE:    /var/lib/tomcat6
+    default web page: /var/lib/tomcat6/webapps/ROOT/index.html
+
+### Zookeeper Installation
+
+The SolrCloud peers need a common configuration system which is provided
+by zookeeper. Zookeeper can be installed with
+
+    apt-get install zookeeper zookeeperd
+
+This will create a new user named 'zookeeper'. The relevant paths are at
+
+    Zookeeper config: /etc/zookeeper/conf (linked to /etc/zookeeper/conf_example)
+    Zookeeper data:   /var/lib/zookeeper/
+    Zookeeper binary: /usr/share/zookeeper/
+
+To check if Zookeeper is running, start the Zookeeper shell:
+
+    /usr/share/zookeeper/bin/zkCli.sh
+
+and run shell commands like
+
+    ls /
+    ls /zookeeper
+
+Because solr is started within tomcat and needs to know the host address
+of zookeeper, we must pass this to tomcat as a jvm option. Open the
+file /usr/share/tomcat6/bin/catalina.sh and add the following lines at
+the beginning of the file (right after the comments):
+
+    # added zookeeper host information used by tomcat to find Solr shards for the SolrCloud
+    CATALINA_OPTS="$CATALINA_OPTS -DzkHost=localhost:2181"
+
+and restart tomcat:
+
+    /etc/init.d/tomcat6 restart
+
+### Solr Installation
+
+Download a solr release from (Solr
+4.5.1 worked while Solr 4.6.0 did not work!), e.g.
+
+    cd /opt
+    wget http://apache.mirrors.spacedump.net/lucene/solr/4.5.1/solr-4.5.1.tgz
+    tar xfz solr-4.5.1.tgz
+    ln -s solr-4.5.1 solr
+    ln -s solr-4.5.1/dist/solr-4.5.1.war solr.war
+
+Because the Solr war file does not include the logging libraries that the
+Jetty example provides, we must add slf4j adapters to the tomcat library
+
+    cd /usr/share/tomcat6/lib/
+    wget http://www.slf4j.org/dist/slf4j-1.6.6.zip
+    apt-get install unzip
+    unzip slf4j-1.6.6.zip
+    cp slf4j-1.6.6/{jcl-over-slf4j-1.6.6.jar,log4j-over-slf4j-1.6.6.jar,slf4j-api-1.6.6.jar,slf4j-jdk14-1.6.6.jar} .
+
+and restart tomcat:
+
+    /etc/init.d/tomcat6 restart
+
+### YaCy Installation
+
+Follow the [YaCy for Debian installation
+instructions](../installation/debianinstall.md)
+and select 'webportal' as network to join (we assume that you do
+this to create a standalone YaCy, not a peer-to-peer participant; you
+can of course also use this for a 'freeworld' peer). The
+relevant paths are at
+
+    YaCy data:          /var/lib/yacy
+    YaCy log:           /var/log/yacy
+    YaCy binary:        /usr/share/yacy/
+    Solr conf for YaCy: /usr/share/yacy/defaults/solr
+
+## Software Configuration
+
+The SolrCloud needs a common configuration of the index cores used by
+YaCy. YaCy uses two cores, 'collection1' and 'webgraph'. Both are
+defined with a generic index schema and they are exact clones of each
+other. It may also be possible to define these cores with non-generic,
+exactly defined schema.xml files, but we will not do that right now
+because it makes things much more complex.
+
+### Zookeeper Client for Solr
+
+First, we need a Zookeeper client for Solr because Solr provides its
+own client app to upload the relevant configuration files. We must
+build this client using the libraries inside the Solr war-file and
+additional libraries for logging.
We use the already installed war file;
+you must adapt the paths here if you used a more recent version of Solr:
+
+    unzip -q /opt/solr.war -d /tmp/solr-war/
+    mkdir /usr/share/zookeeper/solr-cli-lib
+    cp /tmp/solr-war/WEB-INF/lib/* /usr/share/zookeeper/solr-cli-lib/ # solr libs
+    cp /opt/solr/example/lib/ext/* /usr/share/zookeeper/solr-cli-lib/ # logger libs
+    rm -Rf /tmp/solr-war
+
+Now we can take advantage of the SolrCloud ZooKeeper CLI commands.
+
+### Create Solr Configuration of Solr Cores for YaCy Inside Zookeeper
+
+For a detailed description of the set-up of Solr Clusters and a
+SolrCloud configuration, see the [SolrCloud Wiki of
+apache.org](http://wiki.apache.org/solr/SolrCloud), the [SolrCloud
+Installation in Tomcat](http://wiki.apache.org/solr/SolrCloudTomcat),
+a [Guide to SolrCloud
+Configuration](http://systemsarchitect.net/painless-guide-to-solr-cloud-configuration/)
+and a [SolrCloud Cluster (Single Collection)
+Deployment](http://myjeeva.com/solrcloud-cluster-single-collection-deployment.html).
+To upload the solr configuration to Zookeeper, we build a config
+directory using the solr example config and the YaCy generic schema file
+schema.xml:
+
+    cp -R /opt/solr/example/solr/collection1/conf /opt/yacyconf
+    cp /usr/share/yacy/defaults/solr/schema.xml /opt/yacyconf/
+
+We can then use that to upload the configuration to zookeeper:
+
+    java -classpath ".:/usr/share/zookeeper/solr-cli-lib/*" org.apache.solr.cloud.ZkCLI -zkhost localhost:2181 -cmd upconfig -confdir /opt/yacyconf -confname yacygeneric
+
+That configuration is good for both collections, 'collection1' and
+'webgraph'.
We can link this configuration therefore to both +collections: + + java -classpath .:/usr/share/zookeeper/solr-cli-lib/* org.apache.solr.cloud.ZkCLI -zkhost localhost:2181 -cmd linkconfig -collection collection1 -confname yacygeneric + java -classpath .:/usr/share/zookeeper/solr-cli-lib/* org.apache.solr.cloud.ZkCLI -zkhost localhost:2181 -cmd linkconfig -collection webgraph -confname yacygeneric + +Lets see whats inside of zookeeper now, i.e. how the collection1 is +linked against the generic schema: + + /usr/share/zookeeper/bin/zkCli.sh get /collections/collection1 + +#### Create Tomcat Configuration of Solr Web Services + +We want to use four Solr servers as a SolrCloud, each with two cores +('collection1' and 'webgraph'). We create subdirectories for the servers +inside of /var/opt/solrcloud/: + + mkdir /var/opt/solrcloud/ + mkdir /var/opt/solrcloud/solr0 + mkdir /var/opt/solrcloud/solr1 + mkdir /var/opt/solrcloud/solr2 + mkdir /var/opt/solrcloud/solr3 + +In each of these directories, put a file named solr.xml. The description +for the creation of that file in the web is mainly void, since there is +a new [xml structure for solr.xml for Solr 4.4 and +beyond](http://wiki.apache.org/solr/Solr.xml%204.4%20and%20beyond), +especially for [Core Discovery with +SolrCloud](http://wiki.apache.org/solr/Core%20Discovery%20%284.4%20and%20beyond%29). +Put the following content into `/var/opt/solrcloud/solr0/solr.xml`: + + + + 4 + + localhost + 8080 + solr0 + localhost:2181 + ${solr.zkclienttimeout:30000} + ${shareSchema:false} + ${genericCoreNodeNames:true} + + + ${socketTimeout:0} + ${connTimeout:0} + + + +Finally, make the path `/var/opt/solrcloud/` writable for tomcat6: + + chown -R tomcat6 /var/opt/solrcloud/ + chgrp -R tomcat6 /var/opt/solrcloud/ + +To deploy Solr with the YaCy configuration you must create a Tomcat +Context fragment for each Solr instance. A Tomcat Context Fragment is a +file in `/var/lib/tomcat6/conf/Catalina/localhost`. 
Therefore, we must +create four files, one for each Solr server, in this directory: write a +file to `/var/lib/tomcat6/conf/Catalina/localhost/solr0.xml` with the +following content: + + + + + + +and copy this to `solr1.xml .. solr3.xml` and patch the solr/home +attribute to `solr1 .. solr3`. If you patch these files using emacs, make +sure that you delete all files ending with '~' because they will cause +an error. Finally, restart tomcat: + + /etc/init.d/tomcat6 restart + +### Create the SolrCloud + +We can now open the Solr web service at +Open this web page to check if the service is up and running. Then we +can use that web service to instantiate the SolrCloud: + + curl 'http://localhost:8080/solr0/admin/collections?action=CREATE&name=collection1&numShards=4&replicationFactor=1' + curl 'http://localhost:8080/solr0/admin/collections?action=CREATE&name=webgraph&numShards=4&replicationFactor=1' + +### Assign the SolrCloud to YaCy + +When the SolrCloud is ready and running, it can be assigned to YaCy as +storage server. Open the servlet at + and select the flag "Use +remote Solr server(s)". As server address, enter one of the Solr +servers, like Finally, uncheck the flag +"Use deep-embedded local Solr". 
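The copy-and-patch step for the Context fragments `solr1.xml .. solr3.xml` described above can be scripted, which also avoids editor backup files ending in '~'. This is a minimal sketch, assuming that the string `solr0` appears in `solr0.xml` only where the instance name is meant; the function name `clone_fragments` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: derive solr1.xml .. solr3.xml from solr0.xml by
# replacing every occurrence of "solr0" with the name of the target instance.
clone_fragments() {
    fragment_dir=$1
    for i in 1 2 3; do
        sed "s/solr0/solr$i/g" "$fragment_dir/solr0.xml" > "$fragment_dir/solr$i.xml" || return 1
    done
}
```

With the paths used in this guide, the call would be `clone_fragments /var/lib/tomcat6/conf/Catalina/localhost`, followed by the tomcat restart.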
+ + + + + +_Converted from +, may be outdated_ + + + + diff --git a/docs/docs.md b/docs/docs.md index 56d8077..436ba32 100644 --- a/docs/docs.md +++ b/docs/docs.md @@ -13,6 +13,7 @@ ## Operation * [Setting the ranking rules](operation/ranking.md) * [YaCy config settings](operation/yacy_conf.md) +* [Logging in YaCy](operation/logging.md) ## Developers * [How to contribute](contribute.md) @@ -47,6 +48,10 @@ may be outdated, you can help the community by checking and [improving](contribu * [YaCy and Tor](operation/yacy-tor.md) * [Network Definition](operation/network-definition.md) +### Developers +* [Solr and YaCy integration](dev/solr.md) +* [YaCy and Solr Cloud](dev/solrcloud.md) + ## Old and obsolete The original YaCy wiki is closed now (no new registration or editing) and will be abandoned in future, but still contains valuable information. You diff --git a/docs/download_installation.md b/docs/download_installation.md index e3346c7..d00fef7 100644 --- a/docs/download_installation.md +++ b/docs/download_installation.md @@ -23,7 +23,7 @@ If you don't hava Docker installed, get it from [https://docs.docker.com/get-doc * Download YaCy for Windows from [https://download.yacy.net/yacy_v1.924_20201214_10042.exe](https://download.yacy.net/yacy_v1.924_20201214_10042.exe) -* Download Yacy for Linux from [https://download.yacy.net/yacy_v1.930_202404051704_de941c6fe.tar.gz](https://download.yacy.net/yacy_v1.930_202404051704_de941c6fe.tar.gz) +* Download Yacy for Linux from [https://download.yacy.net/yacy_v1.930_202405130205_59c0cb0f3.tar.gz](https://download.yacy.net/yacy_v1.930_202405130205_59c0cb0f3.tar.gz) * Download YaCy for macOS from [https://download.yacy.net/yacy_v1.924_20201214_10042.dmg](https://download.yacy.net/yacy_v1.924_20201214_10042.dmg) * Download latest developer release for Linux from [https://release.yacy.net/](https://release.yacy.net/) @@ -75,8 +75,8 @@ Installing from start to finish would look something like this, depending on you sudo apt-get update 
sudo dpkg --configure -a sudo apt-get install -y openjdk-11-jre-headless -wget https://download.yacy.net/yacy_v1.930_202404051704_de941c6fe.tar.gz -tar xfz yacy_v1.930_202404051704_de941c6fe.tar.gz +wget https://download.yacy.net/yacy_v1.930_202405130205_59c0cb0f3.tar.gz +tar xfz yacy_v1.930_202405130205_59c0cb0f3.tar.gz cd yacy ./startYACY.sh ``` diff --git a/docs/faq.md b/docs/faq.md index 7d601ab..246be2e 100644 --- a/docs/faq.md +++ b/docs/faq.md @@ -144,13 +144,23 @@ Alternatively, another way to do this is through the configuration file httpProx ### Something seems not to be working properly ; what should I do? -YaCy is still undergoing development, so one should opt for a stable version for use. The latest stable version can be downloaded from the [YaCy homepage](https://yacy.net). If you are experiencing a strange behaviour of YaCy then you should search the [community forum](https://community.searchlab.eu/) for known issues. If the issue is unknown, then you can ask for help on the forum (and provide the YaCy version, details on the occurrence of the issue, and if possible an excerpt from the log file in order to help fix the bug) or [start an issue](https://github.com/yacy/yacy_search_server/issues/) on github. +YaCy is still undergoing development, so one should opt for a stable version +for use. The latest stable version can be downloaded from the [YaCy +homepage](https://yacy.net). If you are experiencing a strange behaviour of +YaCy then you should search the [community +forum](https://community.searchlab.eu/) for known issues. If the issue is +unknown, then you can ask for help on the forum (and provide the YaCy +version, details on the occurrence of the issue, and if possible an excerpt +from the [log file](operation/logging.md) in order to help fix the bug) or [start an +issue](https://github.com/yacy/yacy_search_server/issues/) on github. + First thing to see while experiencing some errors, is the log located at `DATA/LOG/yacy00.log`. 
You can monitor it live using `tail` command. While it flips around when certain size is reached, it's better to use -F option: ``` tail -F DATA/LOG/yacy00.log ``` -You can also filter lines by using `grep` command (eg. `tail -F DATA/LOG/yacy00.log | grep DHT` to show only DHT lines) or -v parameter of grep to ignore some lines (eg. `tail -F DATA/LOG/yacy00.log | grep -v DHT` to ignore DHT lines). +See more about setting and using the [yacy log](operation/logging.md). + ### YaCy is running terribly slow; what should I do? As an indexing and search host, YaCy is quite resource hungry. It's written in Java. Fast disks (SSD or RAID) and plenty of RAM will help. diff --git a/docs/installation/full-install.md b/docs/installation/full-install.md new file mode 100644 index 0000000..f9823ac --- /dev/null +++ b/docs/installation/full-install.md @@ -0,0 +1,85 @@ +# Full YaCy installation guide +##### (systemd or runit, nginx, let's encrypt) + +## Prerequisites +- Any UNIX-like system which has Nginx, Certbot and Wget in its repos (assuming your distribution is Debian, but actually you can install it even on StaLI. +- A domain name +- Systemd or Runit-powered UNIX-like system (for using on Runit system, see this: https://aur.archlinux.org/packages/yacy-runit) +- `also, if anyone reading it has some free time, please write services and instructions for their installation for sysvinit, openrc, dinit, etc...` + +## Installation +#### Note: `#` before the command means running as root. 
+**1.** Install needed packages:
+```
+# apt install nginx certbot python3-certbot-nginx wget openjdk-17-jdk-headless
+```
+
+**2.** Create the users needed for running YaCy
+```
+# useradd --system yacydm -m -d /home/yacy
+# useradd --system yacy -m -d /home/yacy
+```
+
+**3.** Download, unpack and fix permissions for YaCy\*. Please replace the download link with the current one **for *"Linux"*** located [here](https://yacy.net/download_installation/#download) and adjust the file name in the `tar` command to match.
+#### `$` here means running as the `yacy` user created in step 2, not your own user.
+```
+# su -l yacy
+
+$ wget https://release.yacy.net/yacy_latest.tar.gz
+$ tar -xf yacy_latest.tar.gz -C ..
+$ exit
+
+# chown -R yacydm:yacydm /home/yacy/
+# chown -R yacy:yacy /home/yacy/DATA/
+```
+
+**4.** Install the systemd service (runit instructions are missing; if anyone writes them, I would be grateful. Contribute by editing this page on GitHub.)
+```
+# cat > yacy.service << EOF
+[Unit]
+Description=YaCy P2P Search Server
+After=network.target
+
+[Service]
+Type=forking
+User=yacy
+ExecStart=/home/yacy/startYACY.sh
+ExecStop=/home/yacy/stopYACY.sh
+ExecRestart=/home/yacy/restartYACY.sh
+
+[Install]
+WantedBy=multi-user.target
+EOF
+
+# cp yacy.service /etc/systemd/system/
+# systemctl enable --now yacy.service
+```
+
+**5.** Add the Nginx site
+```
+# cat > yacy-nginx << 'EOF'
+server {
+    server_name [your domain name];
+    access_log /var/log/nginx/search-access.log;
+    error_log /var/log/nginx/search-error.log;
+
+    location / {
+        proxy_pass http://127.0.0.1:8090;
+        proxy_set_header Host $host;
+        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+        proxy_set_header X-Real-IP $remote_addr;
+    }
+}
+EOF
+# cp yacy-nginx /etc/nginx/sites-available/yacy
+# ln -s /etc/nginx/sites-{available,enabled}/yacy
+# nginx -t
+# nginx -s reload
+```
+
+**6.** Let's Encrypt!
+```
+# certbot --nginx -d [your domain name]
+```
+
+Now you can visit `https://[your domain name]/` from your phone or whatever you use and see the YaCy search page!
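One detail in step 5 deserves care: nginx variables such as `$host` must end up literally in the site file. When a heredoc delimiter is unquoted, the shell expands `$host` itself (usually to an empty string) before the text reaches the file. The following is a minimal sketch of writing an abbreviated version of the server block with a quoted delimiter; the function name `write_site` and the example domain are placeholders, not part of YaCy or nginx.

```shell
#!/bin/sh
# Hypothetical helper: write an abbreviated nginx site file for a given domain.
write_site() {
    domain=$1
    out_file=$2
    {
        # The domain is the only value the shell is supposed to fill in.
        printf 'server {\n    server_name %s;\n' "$domain"
        # The quoted delimiter ('EOF') keeps $host and friends literal.
        cat << 'EOF'
    location / {
        proxy_pass http://127.0.0.1:8090;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
EOF
    } > "$out_file"
}
```

A call such as `write_site search.example.org yacy-nginx` produces a file that can then be copied to `/etc/nginx/sites-available/yacy` and checked with `nginx -t`.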
diff --git a/docs/operation/logging.md b/docs/operation/logging.md
new file mode 100644
index 0000000..fe6d01f
--- /dev/null
+++ b/docs/operation/logging.md
@@ -0,0 +1,71 @@
+# Logging in YaCy
+
+Every YaCy instance keeps a log of its operation.
+
+Looking into the log is useful for debugging, when hunting an error or an
+inefficiency, or just for... train-spotting.
+
+A sneak peek into the log is available in the web UI at `/ViewLog_p.html`. You can filter the
+entries with a regular expression. A fragment of the log is also
+shown live on the Status page (`/Status.html`).
+
+## Logfiles
+
+The actual log is located in the file `DATA/LOG/yacy00.log`. You can watch it live with
+common unix tools like
+[tail](https://www.man7.org/linux/man-pages/man1/tail.1.html).
+
+`tail` has a parameter `-F`, which lets you monitor the logfile live:
+
+  `tail -F DATA/LOG/yacy00.log`
+
+You can use `grep` to show only the lines you want to see:
+
+  `tail -F DATA/LOG/yacy00.log | grep CRAWLER`
+
+or to hide the lines you don't want to read ('negative' grep):
+
+  `tail -F DATA/LOG/yacy00.log | grep -v CRAWLER`
+
+You can also use a colorizer like [ccze](https://github.com/cornet/ccze)
+to increase the readability of logs.
+
+Logs are rotated once they fill up, which can happen very quickly. The most recent log is always
+`yacy00.log`, and `tail`'s `-F` parameter helps you keep track of the current
+file.
+
+Queries searched by your instance are logged anonymously in
+`DATA/LOG/queries.log` with timestamps. Currently, there is a certain
+delay before an entry appears in queries.log.
+
+## Verbosity
+
+You can set the verbosity of logging in the `DATA/LOG/yacy.logging` file, for
+example when you hunt a bug or, conversely, when you want to unclutter the
+logfile and show only the information useful to you.
+
+For each component of YaCy, e.g. `CRAWLER`, `NETWORK`, etc., you can set the
+level of detail logged:
+
+  - `OFF` - no output at all
+
+  - `SEVERE` - system-level error, internal cause, critical and not fixable (e.g.
inconsistency)
+
+  - `WARNING` - uncritical service failure, may require user activity (e.g. input required, wrong authorization)
+
+  - `INFO` - regular action information (e.g. any httpd request URL)
+
+  - `CONFIG` - regular system status information (e.g. start-up messages)
+
+  - `FINE` - in-function status debug output
+
+  - `FINER` - more details in debug output
+
+  - `FINEST` - even more details in debug output
+
+For example, `NETWORK.level = WARNING` set in `yacy.logging` hides the regular p2p network
+traffic from the logs and shows only warnings and severe errors.
+
+Technically, the
+[ConcurrentLog](https://yacy.net/api/javadoc/net/yacy/cora/util/ConcurrentLog.html)
+class is used for logging in the YaCy Java source code.
\ No newline at end of file
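Changing the level of one component in `yacy.logging` can be scripted. This is a minimal sketch, assuming the file holds one `COMPONENT.level = LEVEL` entry per line, as in the `NETWORK.level = WARNING` example; the function name `set_log_level` is made up, and whether YaCy picks up the change without a restart is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: set "<COMPONENT>.level = <LEVEL>" in a logging config file.
set_log_level() {
    config_file=$1
    component=$2
    level=$3
    # Accept only the level names listed in this document.
    case $level in
        OFF|SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST) ;;
        *) echo "unknown level: $level" >&2; return 1 ;;
    esac
    if grep -q "^$component\.level" "$config_file"; then
        # Replace the existing entry; go through a temporary file for portability.
        sed "s/^$component\.level.*/$component.level = $level/" "$config_file" > "$config_file.tmp" &&
            mv "$config_file.tmp" "$config_file"
    else
        # No entry yet: append one.
        printf '%s.level = %s\n' "$component" "$level" >> "$config_file"
    fi
}
```

For example, `set_log_level DATA/LOG/yacy.logging NETWORK WARNING` writes the setting shown above.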