From 89eff207a7d3aa649c734bc0b189af0f84fa00e7 Mon Sep 17 00:00:00 2001 From: okybaca Date: Fri, 23 Feb 2024 13:34:21 +0100 Subject: [PATCH] converted first batch of old-wiki pages --- docs/docs.md | 21 + docs/installation/archinstall.md | 78 +++ docs/installation/debian_high_availability.md | 418 ++++++++++++ docs/installation/debianinstall.md | 151 +++++ docs/installation/freebsdinstall.md | 66 ++ docs/installation/gentooinstall.md | 21 + docs/installation/gnuinstall.md | 146 +++++ docs/{operation => installation}/headless.md | 0 docs/installation/obsinstall.md | 38 ++ docs/installation/raspberry_pi.md | 244 +++++++ docs/installation/requirements.md | 49 ++ docs/{operation => installation}/shrink.md | 0 docs/{operation => installation}/staticip.md | 0 docs/operation/portforwarding.md | 25 + docs/operation/yacy-tor.md | 605 ++++++++++++++++++ docs/operation/yacyoverhttps.md | 58 ++ 16 files changed, 1920 insertions(+) create mode 100644 docs/installation/archinstall.md create mode 100644 docs/installation/debian_high_availability.md create mode 100644 docs/installation/debianinstall.md create mode 100644 docs/installation/freebsdinstall.md create mode 100644 docs/installation/gentooinstall.md create mode 100644 docs/installation/gnuinstall.md rename docs/{operation => installation}/headless.md (100%) create mode 100644 docs/installation/obsinstall.md create mode 100644 docs/installation/raspberry_pi.md create mode 100644 docs/installation/requirements.md rename docs/{operation => installation}/shrink.md (100%) rename docs/{operation => installation}/staticip.md (100%) create mode 100644 docs/operation/portforwarding.md create mode 100644 docs/operation/yacy-tor.md create mode 100644 docs/operation/yacyoverhttps.md diff --git a/docs/docs.md b/docs/docs.md index b19c388..8613add 100644 --- a/docs/docs.md +++ b/docs/docs.md @@ -17,6 +17,27 @@ * [Crawler API](api/crawler.md) * [javadoc](https://yacy.net/api/javadoc/) - generated documentation for YaCy's java source 
code +## Converted from old-wiki +may be outdated, you can help the community by checking and [improving](contribute.md) the pages + +### Installation +* [System Requirements](installation/requirements.md) +* [Arch Install Guide](installation/archinstall.md) +* [Installation of YaCy on Debian](installation/debianinstall.md) +* [YaCy High-Availability Configuration on Debian](installation/debian_high_availability.md) +* [HowTo install YaCy on Gentoo](installation/gentooinstall.md) +* [GNU Install](installation/gnuinstall.md) +* [FreeBSD Install Guide](installation/freebsdinstall.md) +* [Install YaCy-packages from OpenSUSE Build Service](installation/obsinstall.md) +* [Set up Raspberry Pi with YaCy](installation/raspberry_pi.md) + +### Operation +* [YaCy and Tor](operation/yacy-tor.md) +* [Portforwarding](operation/portforwarding.md) +* [Using the YaCy Front-End over HTTPS](operation/yacyoverhttps.md) + + + ## Old and obsolete The original YaCy wiki is closed now (no new registration or editing) and diff --git a/docs/installation/archinstall.md b/docs/installation/archinstall.md new file mode 100644 index 0000000..af872e7 --- /dev/null +++ b/docs/installation/archinstall.md @@ -0,0 +1,78 @@ +# Arch Install Guide + +YaCy has a [PKGBUILD in the +AUR](https://aur.archlinux.org/packages/yacy/) which greatly simplifies +installation. You can install this PKGBUILD with an AUR helper or by +hand. + +## Installation + +### Using an AUR Helper + +If you are using a full-featured AUR helper like +[packer](https://aur.archlinux.org/packages/packer/), +[yaourt](https://aur.archlinux.org/packages/yaourt/), or another of that +ilk, you can run + + packer -S yacy + +or + + yaourt -S yacy + +. Adjust for your specific helper. + +### By hand + +First, ensure that you have installed all the dependencies; +specifically, you need `sudo`, `libcups`, `xorg-server`, and +`java-environment`. Install these with pacman. Once this is finished, +download and extract the PKGBUILD from the AUR. 
You can do this from +your web browser +([link](https://aur.archlinux.org/packages/ya/yacy/yacy.tar.gz)) and +extract it using your favorite GUI tool, or you can run the following +command: + + $ curl https://aur.archlinux.org/packages/ya/yacy/yacy.tar.gz | tar -xz + +Now `cd` into the folder that the tarball extracted to (most likely +called "yacy") and run + + makepkg + +. (Note: if you forgot to install the dependencies before, you should run +`makepkg -s`.) This will build us a nice .tar.xz package which `pacman` +knows how to handle. Once `makepkg` completes, install the package with +`pacman`: + + # pacman -U yacy-(YourVersion)-(YourArchitecture).tar.xz + +(replace YourVersion and YourArchitecture with the current version and +your chip architecture (either i686 or x86\_64). Tab completion helps a +lot here.) + +## Use + +To actually start YaCy, you need to start its daemon with systemd: + + # systemctl start yacy.service + +YaCy needs about a minute to initialize, bootstrap, and in general get +up and running. If you'd like the YaCy daemon to run automagically at +boot, enable its service with systemd: + + # systemctl enable yacy.service + +You're all set! Don't forget to open your firewall or set up port +forwarding to contribute your index to the network! + + + + + +_Converted from , may +be outdated_ + + + + diff --git a/docs/installation/debian_high_availability.md b/docs/installation/debian_high_availability.md new file mode 100644 index 0000000..d3587fb --- /dev/null +++ b/docs/installation/debian_high_availability.md @@ -0,0 +1,418 @@ +# YaCy High-Availability Configuration on Debian + +A high-availability configuration is usually ensured using a redundant +set-up of a product. 
We will do this with YaCy in the following way: + + - run two YaCy instances, one will be the master node and the other is + a replication node + - set-up index replication between both nodes + - add a full index back-up as a regular and automatic process + - add an automatic peer re-start on peer failure + - add automated software update without service downtime + +A picture of this set-up can be found here: + + +## Prepare Debian + +We will not use the [Debian +Installation](./debianinstall.md) process here +because we want to install two YaCy peers on the same computer. We will +use the tarball release of YaCy and put the same application in two +separate directories. We will use a Linux account `yacyappliance` for +the home of the application. Therefore we first create this user. You +can omit this step and use an existing user `{myuser}` instead; if you do +so, replace all `~yacyappliance` with `~{myuser}` and `/home/yacyappliance` +with `/home/{myuser}`. + +As root, do: + + adduser yacyappliance + +We will use some Debian packages: + + apt-get update + apt-get install ant openjdk-7-jdk openjdk-7-jre-headless git wget + +Then log in to the new account yacyappliance. + +## Install Redundant YaCy Applications + +We could use the official tarball from yacy.net here but to get full +availability of YaCy updates without the dependency on official update +servers we will use a self-generated YaCy update taken from the git +repository. + +As user yacyappliance do + + cd + git clone git://github.com/yacy/yacy_search_server.git yacy_deploy + cd yacy_deploy + ant clean all dist + +This writes a fresh YaCy tarball release to +`~yacyappliance/yacy_deploy/RELEASE/` + +We will unpack this tarball twice and create the two YaCy peers yacy0 +(the master) and yacy1 (the replication peer). 
+ + cd + mkdir yacy0 + mkdir yacy1 + cd yacy_deploy/RELEASE + tar xfz `ls -1tr | tail -1` -C ../../yacy0 --strip-components=1 + tar xfz `ls -1tr | tail -1` -C ../../yacy1 --strip-components=1 + +We can now update the peers yacy0 and yacy1 just by overwriting the +current code with new code that we create ourself (don't do this now\!): + + cd ~/yacy_deploy && git pull origin master && ant clean all dist + cd ~/yacy_deploy/RELEASE && tar xfz `ls -1tr | tail -1` -C ../../yacy0 --strip-components=1 + cd ~/yacy_deploy/RELEASE && tar xfz `ls -1tr | tail -1` -C ../../yacy1 --strip-components=1 + +Now prepare the peer yacy0 to use the 'allip' network which will allow +you to index both, intranet and internet addresses without connecting +the YaCy P2P network: edit the file `~/yacy0/defaults/yacy.init` and +search the line containing network.unit.definition, set the value of +that property to `defaults/yacy.network.allip.unit` + +You can now start YaCy the first time to set an administration account. +This is needed to configure the second peer yacy1 based on the content +of the first peer yacy0 + + ~/yacy0/startYACY.sh + ~/yacy0/bin/passwd.sh {newpassword} + ~/yacy0/stopYACY.sh + +Now the DATA directory in `~/yacy0` was created and we can clone this to +yacy1, run + + cp -R ~/yacy0/DATA ~/yacy1/DATA && sed "s/port=8090/port=8091/" -i ~/yacy1/DATA/SETTINGS/yacy.conf && rm -f ~/yacy1/DATA/WORK/api.bheap + +The second peer needs a different port. Therefore the sed command +replaces the port 8090 by 8091. We also do not want that the second peer +does the same crawling as configured in yacy0, therefore we also delete +the api file. 
We need this command later, therefore we create a file +`~/replicate_all.sh` with the following content: + + ~/yacy0/stopYACY.sh + cp -R ~/yacy0/DATA ~/yacy1/DATA_NEW + sed "s/port=8090/port=8091/" -i ~/yacy1/DATA_NEW/SETTINGS/yacy.conf + rm -f ~/yacy1/DATA_NEW/WORK/api.bheap + ~/yacy0/startYACY.sh + ~/yacy1/stopYACY.sh + rm -rf ~/yacy1/DATA + mv ~/yacy1/DATA_NEW ~/yacy1/DATA + ~/yacy1/startYACY.sh + +We will install a load balancer for the two peers later and because this +script ensures that one of the peers runs at any time, this is an update +replication of the complete peer configuration. But because we can do an +index-only replication as well during uptime of the peer, you should use +that script only if you do a reconfiguration of the primary peer. + +We can now start both peers + + ~/yacy0/startYACY.sh && ~/yacy1/startYACY.sh + +## Adding a https Gateway in Front of the Administration Interface + +Security is part of an availability strategy and therefore all +production modifications should be made through a secure interface. To +apply SSL encryption in front of the primary peer, follow the +instructions in +[YaCy Over HTTPS](../operation/yacyoverhttps.md) + +## Auto-Start for YaCy + +Because the two YaCy peers are not installed using the Debian package +manager, there is no autostart for these applications. We will create +the autostart manually. Become root and create the file +/etc/init.d/yacy with the following content: + + #! 
/bin/sh + ### BEGIN INIT INFO + # Provides: YaCy + # Required-Start: $local_fs $remote_fs $network $time + # Required-Stop: $local_fs $remote_fs $network $time + # Default-Start: 2 3 4 5 + # Default-Stop: 0 1 6 + # Short-Description: YaCy Search Engine + ### END INIT INFO + case "$1" in + start) + su - yacyappliance -c "/home/yacyappliance/yacy0/startYACY.sh" + su - yacyappliance -c "/home/yacyappliance/yacy1/startYACY.sh" + ;; + stop) + su - yacyappliance -c "/home/yacyappliance/yacy0/stopYACY.sh" + su - yacyappliance -c "/home/yacyappliance/yacy1/stopYACY.sh" + ;; + *) + exit 3 + ;; + esac + : + +and make it executable and linked with + + sudo chmod 755 /etc/init.d/yacy + sudo update-rc.d yacy defaults + +This will start and stop both YaCy instances automatically. + +## Fail-Over Access to Redundant YaCy Installations using a Reverse Proxy + +We will use an nginx http server and use it as a reverse proxy with +fail-over as load balancer for the two YaCy peers that we are running now. +Become root, run + + apt-get install nginx + +and edit the file /etc/nginx/nginx.conf. 
Locate the http section and +modify it: + + - add a comment in front of the "include /etc/nginx/sites-enabled/\*;" + line to disable it + - add the following at the end of the http section: + + + + upstream yacyappliance { + ip_hash; + server localhost:8090; + server localhost:8091; + } + server { + listen 8100; + server_name yacy-appliance; + location / { + proxy_pass http://yacyappliance; + proxy_redirect off; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + client_max_body_size 10m; + client_body_buffer_size 128k; + proxy_connect_timeout 3s; + proxy_send_timeout 10s; + proxy_read_timeout 10s; + proxy_buffer_size 4k; + proxy_buffers 4 32k; + proxy_busy_buffers_size 64k; + proxy_temp_file_write_size 64k; + } + location \.(jpg|png|gif|jpeg|css)$ { + proxy_buffering on; + proxy_cache_valid 200 120m; + expires 864000; + } + } + +This will run nginx at port 8100; if you want to run it at port 80 then +just replace the 8100 with 80. Finally run + + /etc/init.d/nginx restart + +and open [http://{yourhost}:8100](http://%7Byourhost%7D:8100) to check +that nginx is running and successfully routes the access from 8100 to +either 8090 or 8091. You can also stop one of the yacy0 or yacy1 and the +search interface at [http://{yourhost}:8100](http://%7Byourhost%7D:8100) +will still be available. + +## Add a Heartbeat Check for the YaCy Peers + +There is a check-alive script which checks if a YaCy server responds on +the web interface and restarts the application if it does not respond. +This script can be called on a regular basis using a cronjob. 
As root, +edit /etc/crontab and add the following lines + +``` + 0 * * * * yacyappliance cd /home/yacyappliance/yacy0/bin && ./checkalive.sh +15 * * * * yacyappliance cd /home/yacyappliance/yacy1/bin && ./checkalive.sh +30 * * * * yacyappliance cd /home/yacyappliance/yacy0/bin && ./checkalive.sh +45 * * * * yacyappliance cd /home/yacyappliance/yacy1/bin && ./checkalive.sh +``` + +This will cause that twice an hour each peer is checked and restarted, +if needed. This means a maximum downtime of 15 minutes if *both* peers +fail. + +## Test Indexing, Replication and Fail-Over Access + +We must put some document data to check fail-over after a replication. +Open +[http://{yourhost}:8090/CrawlStartExpert\_p.html](http://%7Byourhost%7D:8090/CrawlStartExpert_p.html) +and start a web crawl. When the crawl has created some data in the +search index, first test that a search request succeeds at +[http://{yourhost}:8090/](http://%7Byourhost%7D:8090/) + +Then call (as user yacyappliance) + + ~/yacy1/bin/indexrestore.sh `~/yacy0/bin/indexdump.sh` + +This replicates the index from yacy0 to yacy1. You can now open +[http://{yourhost}:8091/](http://%7Byourhost%7D:8091/) and get the same +search result as before. During the replication process the index in +yacy0 is first unmounted and during this process the search inteface +does not respond on search requests. At that time the load balancer on +[http://{yourhost}:8100/](http://%7Byourhost%7D:8100/) should switch to +the second peer. When the dump is done, the index in the second peer +becomes unavailable during reading of the dump and the load balancer +should switch to the first peer again. + +## Automatically do an Index Backup and Replication + +We want to have an automated and combined backup, replicate and stale +backup cleaning process to ensure that the replication peer is always +updated to the index of the primary peer while keeping dumps as a +back-up for emergency cases. + +For this, we create a backup-directory. 
As user yacyappliance do + + mkdir ~/indexbackup + +to create a storage directory for the index backup dumps. Now you can +create a backup with + + mv `~/yacy0/bin/indexdump.sh` ~/indexbackup/ + +To create such a backup automatically once every day during night-time, +add the following line to /etc/crontab + +``` + 5 3 * * * yacyappliance cd /home/yacyappliance/ && mv `yacy0/bin/indexdump.sh` indexbackup/ +``` + +Then, every night at 3:05h the index is dumped. This should not take more +than 25 minutes because the next checkalive ping happens at 3:30h +according to the heartbeat configuration. + +We use the index dump generated in the backup process to feed the index +dump to the replication peer yacy1. This is done manually by calling + + ~/yacy1/bin/indexrestore.sh ~/indexbackup/`ls -1tr ~/indexbackup | tail -1` + +and automatically using the following entry in /etc/crontab + + 50 3 * * * yacyappliance cd /home/yacyappliance/ && yacy1/bin/indexrestore.sh indexbackup/`ls -1tr indexbackup | tail -1` + +which will use the dump created 45 minutes before. Again this should not +take more than 25 minutes because the peer yacy1 gets the next heartbeat +ping at 4:15. Feel free to add more backup/replicate lines to the +crontab to get index backups from the primary peer to the replication +peer more frequently; e.g. add 6 hours three times to make a backup at +3:05h, 9:05h, 15:05h and 21:05h while doing a backup to the replication +peer at 3:50h, 9:50h, 15:50h and 21:50h. + +Finally we need a clean-up process to remove stale backups. This can be +done with + + cd ~/indexbackup && find * -type f -mtime +7 -delete + +to delete all backups which are older than seven days. To do this +automatically, put + + 20 4 * * * yacyappliance cd /home/yacyappliance/indexbackup && find * -type f -mtime +7 -delete + +into the crontab. This runs the clean-up at 4:20h, after the +replication has happened. 
Please modify the crontab lines if you wish to do +backup and replication more/less often. + +## Automatically Update Debian + +This is optional but recommended to install all new debian updates +automatically. As user root edit the file `/etc/crontab` and add the +following line: + + 20 3 * * * root apt-get update && apt-get -y upgrade + +This will update your server every night at 3:20h automatically. + +## Automatically Update to Latest Code Changes + +This optional function will cause that the YaCy Search Appliance will +always be up-to-date to the latest YaCy code changes. As user +yacyappliance do: + + cd ~/yacy_deploy + git pull --tags origin master && ant clean all dist && cd RELEASE + ../../yacy0/stopYACY.sh && tar xfz `ls -1tr | tail -1` -C ../../yacy0 --strip-components=1 && ../../yacy0/startYACY.sh + ../../yacy1/stopYACY.sh && tar xfz `ls -1tr | tail -1` -C ../../yacy1 --strip-components=1 && ../../yacy1/startYACY.sh + +We can run this also automatically, twice a day in a 12 hour distance by +alternating the peers to prevent that a bad release destroys both peers +at the same time. To do this, we wrap the commands above in shell +scripts and call them from the crontab. As user yacyappliance create the +following files: + +"update.sh" + + cd ~/yacy_deploy + git pull --tags origin master + ant clean all dist + +"upgrade.sh" + + timeout 120s $1/stopYACY.sh + $1/killYACY.sh + rm -f $1/lib/* + rm -Rf $1/htroot + cd yacy_deploy/RELEASE/ + tar xfz `ls -1tr | tail -1` -C ../../$1 --strip-components=1 + ../../$1/startYACY.sh + +Set the executable flag of update.sh and upgrade.sh. 
Then, in +/etc/crontab, add the following lines: + + 30 4 * * * yacyappliance cd /home/yacyappliance/yacy_deploy && ./update.sh + 40 4 * * * yacyappliance cd /home/yacyappliance/yacy_deploy && ./upgrade.sh yacy0 + 15 18 * * * yacyappliance cd /home/yacyappliance/yacy_deploy && ./update.sh + 25 18 * * * yacyappliance cd /home/yacyappliance/yacy_deploy && ./upgrade.sh yacy1 + +Finally we should clean up the generated releases the same way as we +delete old backup files. This can be done with + + cd ~/yacy_deploy/RELEASE && find * -type f -mtime +7 -delete + +and as a cron job in /etc/crontab with + + 20 8 * * * yacyappliance cd /home/yacyappliance/yacy_deploy/RELEASE && find * -type f -mtime +7 -delete + +## Usage of this High Availability Configuration + +The public search interface is at +[http://{yourhost}:8100](http://%7Byourhost%7D:8100) (or +[http://{yourhost}](http://%7Byourhost%7D) if you set the reverse proxy +port to 80). That's the URL you publish as the search interface. To +administer the search interface, use the address +[http://{yourhost}:8090](http://%7Byourhost%7D:8090), or better: +[https://{yourhost}](https://%7Byourhost%7D) if you followed the +instructions in [YaCy Over HTTPS](../operation/yacyoverhttps.md). + +Please note that [http://{yourhost}](http://%7Byourhost%7D) is the +gateway for the load balancer while +[https://{yourhost}](https://%7Byourhost%7D) is only the access point +for the yacy0-peer and administration tasks. + +Every time you change the search interface configuration (like a new +skin or different ranking), you must do a full (downtime) replication +using the script `~/replicate_all.sh` (shut down both peers before this) +but it would be easy to make this a high-availability task (shut down +yacy0 first, tar the DATA dir, start up yacy0, shut down yacy1, extract +DATA to yacy1, start up yacy1). There is no need to do this for index +updates after you started a web crawl, because the index replication +process does that once a day automatically. 
You might want to do this +more often by adjusting the line in crontab. + + + + + +_Converted from +, may +be outdated_ + + + + diff --git a/docs/installation/debianinstall.md b/docs/installation/debianinstall.md new file mode 100644 index 0000000..1bc7a48 --- /dev/null +++ b/docs/installation/debianinstall.md @@ -0,0 +1,151 @@ +# Installation of YaCy on Debian + +Installation on Debian-based GNU/Linux operating systems is easy using +our Debian repository: + + http://debian.yacy.net + +**TODO: not any more!** + +Create a debian source list file for YaCy sources: + + echo 'deb http://debian.yacy.net ./' > /etc/apt/sources.list.d/yacy.list + +Install the developer key with one of the two next methods + + wget http://debian.yacy.net/yacy_orbiter_key.asc -O- | apt-key add - + apt-key advanced --keyserver pgp.net.nz --recv-keys 03D886E7 + +And finally install YaCy itself. **Warning\!** If you will be using Tor, +it is important to read +[YaCy and Tor](../operation/yacy-tor.md) before taking +the next step\! Tor must be configured for YaCy before YaCy runs for the +time. Running "apt-get install yacy" before setting up Tor will create a +state for which there is no documentation to recover from. + + apt-get update + apt-get install openjdk-7-jre-headless # java 7 is sufficient, only a headless version is needed + apt-get install yacy + +## Important File Locations + +After the installation, the yacy application path is: + + /usr/share/yacy + +The DATA-path is: + + /var/lib/yacy + +The configuration files should be here: + + /etc/yacy/ + +/etc/yacy/yacy.conf is created using the +/usr/share/yacy/defaults/yacy.init file on the first run. + +The log files should be here: + + /var/log/yacy/ + +## Managing YaCy + +When you have installed YaCy using the Debian repository, YaCy is +started automatically after a OS startup, and stopped before shutdown. 
+ +You can also start and stop YaCy from the command line with: (must be +run as root) + + /etc/init.d/yacy stop + /etc/init.d/yacy start + /etc/init.d/yacy restart + +You can use systemctl (run as root) to enable or disable YaCy automatic +startup at boot. + +Enable automatic startup : + + systemctl enable yacy + +Disable automatic startup : + + systemctl disable yacy + +The YaCy web server runs on port 8090 by default. The administration +pages are at + + http://localhost:8090/ + +but you can also set any other port for the interface using the +administration pages of YaCy. A Port 80 is possible, but it is better to +get access to this port using a +[Portforwarding](../operation/portforwarding.md). + +**Changing password**: If you do not set a username and password during +installation, the username will be "admin" and the password will be +randomly generated. You can still run a yacy node this way, but it won't +have the potential to be as useful since you won't be able to change +password-protected settings. Access the password-protected parts by +changing the password manually in a terminal. You can do this by going +to /usr/share/yacy/bin and running `./passwd.sh ` (Note: +your new password will appear in plain text in the terminal). + + cd /usr/share/yacy/bin + ./passwd.sh + +## Automatic Updates + +When configured this way, the YaCy-internal auto-updater does not work. +An automatic update must be done with OS tools. i.e. with a crontab +command. An example for that is the following line, which you must write +into /etc/crontab + + 0 6 * * * root apt-get update && apt-get -y --force-yes install yacy + +In Ubuntu, the above line is only valid for the system crontab file +(located at /etc/crontab) - you can edit this file on newer Ubuntu OS +directly without using the crontab command. 
Below are comments from the +file in Ubuntu 12.04: + + # /etc/crontab: system-wide crontab + # Unlike any other crontab you don't have to run the `crontab' + # command to install the new version when you edit this file + # and files in /etc/cron.d. These files also have username fields, + # that none of the other crontabs do. + +If you want to use the **root user** crontab in Ubuntu instead, an +example would be: + + username@hostname:~$ sudo crontab -e + no crontab for root - using an empty one + + Select an editor. To change later, run 'select-editor'. + 1. /bin/ed + 2. /bin/nano <---- easiest + 3. /usr/bin/vim.basic + + Choose 1-3 [2]: + +Then add the following at the end of the file: + + 0 6 * * * /usr/bin/apt-get update && /usr/bin/apt-get -y --force-yes install yacy + +Please note, there is no user name on the line above, and absolute +(full) paths are used here to prevent binary location problems\! + +## Next Steps + +After you configured YaCy, you may also want to set a [static +IP](staticip.md) to get a unique IP to your YaCy +peer. + + + + + +_Converted from +, may be outdated_ + + + + diff --git a/docs/installation/freebsdinstall.md b/docs/installation/freebsdinstall.md new file mode 100644 index 0000000..8ec5cc2 --- /dev/null +++ b/docs/installation/freebsdinstall.md @@ -0,0 +1,66 @@ +# FreeBSD Install Guide + +## Requirements + +Next to Java YaCy requires bash and either wget or curl to work. These +are used for the scripts you find in ./bin. So we install those via pkg +or ports + + # Using pkg: + pkg install openjdk curl bash + +### Java Setup + +Java requires proc and fd to be mounted. Make sure to append these lines +to /etc/fstab + + fdesc /dev/fd fdescfs rw 0 0 + proc /proc procfs rw 0 0 + +Then reboot or mount them manually for the changes to take effect. + +## Install YaCy + +For security reasons you should run YaCy as its own user. So you might +want to add a user called YaCy and switch to it. + + # As root: + adduser + Username: yacy + ... 
+ # Switch to user yacy + su - yacy + +You need to download and extract the Linux version of YaCy. Luckily it +does not seem to be Linux specific at all. + + curl -o yacy.tar.gz http://release.yacy.net/release/yacy_latest.tar.gz + tar xzvf yacy.tar.gz + +That's it! You can now cd into the directory and start YaCy. + + cd yacy + ./startYACY.sh + +## Side notes for YaCy developers and advanced users + +To make the installation easier `/usr/bin/env bash` in various scripts +should be replaced with `/usr/bin/env sh`. Bash doesn't seem to be +required for scripts to work, so this would remove an explicit +dependency. FreeBSD comes with ftp (which also supports http) and fetch, +which could theoretically be used in bin/apicall.sh. However the way the +password is sent (using MD5:...) appears to be causing problems. I was +not yet able to work around this. If this could be fixed curl/wget would +not be required. + + + + + +_Converted from +, may be +outdated_ + + + + diff --git a/docs/installation/gentooinstall.md b/docs/installation/gentooinstall.md new file mode 100644 index 0000000..045262b --- /dev/null +++ b/docs/installation/gentooinstall.md @@ -0,0 +1,21 @@ +# HowTo install YaCy on Gentoo + +Gentoo Bug for yacy: + +Currently available ebuilds from overlays: + + +For example + +` layman -a flow && emerge net-misc/yacy ` + + + + + +_Converted from +, may be outdated_ + + + + diff --git a/docs/installation/gnuinstall.md b/docs/installation/gnuinstall.md new file mode 100644 index 0000000..c95d51a --- /dev/null +++ b/docs/installation/gnuinstall.md @@ -0,0 +1,146 @@ +# GNU Install + +YaCy needs a Java Runtime Environment to run. Since YaCy was operated +under Sun's proprietary [JRE](http://java.sun.com/) at first, it +currently works best with it. However efforts are underway to make YaCy +runnable with free software JVMs. 
+ + + + + +## YaCy with Kaffe + + - YaCy starts + - Network functions seem not to be implemented completely and/or + compatibly, see [forum + post](http://www.yacy-forum.de/viewtopic.php?t=2919) + +## YaCy with GCJ (compiler) + +To build with gcj it suffices to run Ant with the option +-Dbuild.compiler=gcj or create a file named ant.properties in your which +contains the line "build.compiler=gcj". With Debian GNU/Linux, Ubuntu & +Fedora Core it suffices to not install a proprietary Java Runtime (GCJ +is the default on these systems). + +### MacOS X + +Install [Darwinports](http://www.darwinports.com), [GCJ for +Darwinports](http://gcj34.darwinports.com) and [GNU Classpath for +Darwinports](http://gnu-classpath.darwinports.com). + +## YaCy with GIJ (runtime) + +YaCy starts and runs + +after approx. 2 hours an OutOfMemoryError occurs + + +**TODO:** Find out why + + + +### Instructions + +To run YaCy using [GNU](http://gcc.gnu.org/java)'s Java Virtual Machine +(suggested version: 4.1.1 which is roughly equivalent to JDK 1.4.2), 2 +preconditions must be met: + +1. YaCy should be compiled with ECJ (the commandline compiler from the + Eclipse project) +2. YaCy needs [Ant](http://ant.apache.org/) to compile + +#### Installation of ECJ + +ECJ is available for most distributions and can be installed via the +package management. + +#### Installation of Ant + +One can install Ant by following the instructions on the website above or +by using the distribution's package management system. + + + + + +[![](../images/thumb/galternatives.png/300px-galternatives.png)](./datei:galternatives.png.html) + + + + + +[](./datei:galternatives.png.html "enlarge") + + + +Selecting among different Java VMs with galternatives + + + + + + + +### Debian + +In Debian & Ubuntu JVMs are managed through the 'alternatives' system. +You can choose the desired virtual machine by running +"update-alternatives" or graphically through "galternatives". 
+ +### modify YaCy init script + +To change the VM for YaCy only, you can edit the init script since +version 0.38 or SVN revision 2696 by replacing the line containing +JAVA="\`which java\`" with JAVA=/usr/bin/gij. In older versions you need +to replace all occurrences of "java" with the name of your alternative +VM. + + + +**TODO:** Possibly a set `JAVA_HOME` must be cleared by calling `unset +JAVA_HOME`? + + + +#### Installation of YaCy + +Because of an uninvestigated inconsistency between Sun's JRE 1.4.2 and GIJ, +YaCy's function to test for a compatible Java version fails. To prevent +an immediate exit of YaCy, comment out the line containing `System.exit(-1);` +in source/yacy.java in method `startup(String homePath, long +startupMemFree, long startupMemTotal)`. + +If both ant and ecj are installed one can start YaCy's compilation by +calling: + + ant compileMain + +At the moment YaCy must be run on GIJ with the following command +line because it uses a different parameter syntax than the JDK: + + gij --mx=128m -cp classes:.:lib/commons-collections.jar:lib/commons-pool-1.2.jar:lib/svnRevNr.jar yacy + + + +**TODO:** AFAIK gij understands "-Xmx" as well + + + +## YaCy with JamVM / Classpath + +*not tested yet* + +## YaCy with Cacao / Classpath + +*not tested yet* + + +_Converted from +, may be +outdated_ + + + + diff --git a/docs/operation/headless.md b/docs/installation/headless.md similarity index 100% rename from docs/operation/headless.md rename to docs/installation/headless.md diff --git a/docs/installation/obsinstall.md b/docs/installation/obsinstall.md new file mode 100644 index 0000000..cf83e06 --- /dev/null +++ b/docs/installation/obsinstall.md @@ -0,0 +1,38 @@ +# Install YaCy-packages from OpenSUSE Build Service + +## openSUSE + +steps on commandline: + + zypper addrepo http://download.opensuse.org/repositories/home://f1ori/openSUSE_11.0_contrib/home:f1ori.repo + zypper install yacy + +You can do the same steps with yast and yast2. 
+ +## Fedora + +steps on commandline: + + wget http://download.opensuse.org/repositories/home://f1ori/Fedora_9/home:f1ori.repo -O /etc/yum.repos.d/home:f1ori.repo + yum install yacy + +## Mandriva + +Unfortunately, the default package manager of Mandriva doesn't support +the repository format "repomd". That's why you have to install another +package manager like smart. + +With smart, you can add the repo and install YaCy like this: + + smart channel --add http://download.opensuse.org/repositories/home://f1ori/ + smart update + smart install yacy + + + +_Converted from +, may be outdated_ + + + + diff --git a/docs/installation/raspberry_pi.md b/docs/installation/raspberry_pi.md new file mode 100644 index 0000000..862b40d --- /dev/null +++ b/docs/installation/raspberry_pi.md @@ -0,0 +1,244 @@ +## Set up Raspberry Pi with YaCy + +The [Raspberry Pi](http://www.raspberrypi.org/) ('RPi') is a +credit-card-sized single-board computer which can run Linux kernel-based +operating systems. We consider the usage of a 'Model B' with 512 MB +SDRAM. Since this computer consumes only 3.5 W it is an ideal platform +for private 'cloud' applications and also to run a YaCy peer 24x7. + +There is a wide range of [available operating systems for the +RPi](http://elinux.org/RPi_Distributions). There is also a 'default' +system recommended by the makers of the RPi, called "Raspbian" which is +based on Debian wheezy. + +There are several options for an OS on the Raspberry Pi ('RPi').
We +consider running YaCy on Raspbian but you might want to insert more +sections here for other OSes + +### Running YaCy on Raspbian + +We take the default Raspbian image but modify it to fit our +needs: + +#### Preparation of Raspbian + + - Download the soft-float ABI version of Raspbian called "wheezy-armel" + from + - Write the image to an SD card, a manual for Windows, Mac and Linux is + at + - When the RPi starts up the first time, enable the ssh server in the + Raspi-config menu and run expand-rootfs to use the full SD card + capacity. + - Log in to your RPi using ssh with the user 'pi' and the password + 'raspberry'. + - Assign a [Static IP](staticip.md) to your RPi. + This gives you a unique link to your YaCy peer on the + RPi in your intranet. If there is no conflict in the set-up of your + network, use the default IP 192.168.1.70 + - Enlarge swap space. It is not recommended to use swap space on SD + cards but Java crashes if for any reason more memory is needed than + we expected. We will configure YaCy to take up only + so much space that swapping does not happen. But to protect YaCy + from crashing, we enlarge the swap space: + + + + - open the file /etc/dphys-swapfile and replace the '100' with e.g. + '1024'. This will give you 1GB of swap space. This is available + after a 'sudo dphys-swapfile setup' or a restart. + - to protect Raspbian from swapping (while still having the option) we + need a low swappiness value. View the file /proc/sys/vm/swappiness + and check that this is low. By default, there is a 1, you can + replace it with a 0 with "sysctl -w vm.swappiness=0" + + + + - Optional/recommended: update the RPi firmware, follow instructions + in readme from + - Optional/recommended: shrink Raspbian with the [Headless + Debian](shrink.md) tutorial by + removing X11 and all dependencies.
After this, you can also remove +python and the Python Games on the RPi + + + + rm -Rf ~/python_games + sudo apt-get remove --purge python + +.. followed by the same deborphan-process as described in [Headless +Debian](shrink.md) + + - Optional/recommended: remove all programming languages that you + don't need when running YaCy: + + + + sudo apt-get remove --purge python python2.6 python2.7 python3 python3.2 perl + +You may want to remove orphaned and unneeded packages after this using +the [Headless Debian](shrink.md) +tutorial again. + + - Optional/recommended: get latest system updates + + + + sudo apt-get update + sudo apt-get upgrade + sudo apt-get dist-upgrade + +followed by a restart + + sudo shutdown -r now + +Repeat this until no updates appear. Then do a cleanup + + sudo apt-get clean + +then you should have at most 627M used on your SD card: + + df -h . + +#### Java Installation + + - primary option (recommended): install Oracle Headless JVM. This is + probably the fastest JVM. + + + + - Download an ARMv6/7 version from + + - Java 1.6 is sufficient + - copy the ejre\*.tar.gz file with scp to your RPi to ~pi/ e.g. + + + + scp ejre-1_6_0_38-fcs-b05-linux-arm-vfp-eabi-headless-13_nov_2012.tar.gz pi@192.168.1.70:~/ + + - untar the ejre\*.tar.gz file on your RPi, e.g. + + + + sudo tar xfz ejre-1_6_0_38-fcs-b05-linux-arm-vfp-eabi-headless-13_nov_2012.tar.gz -C /usr/lib/ + +which creates a directory ejre1.6.0\_38 in /usr/lib. To add the java +command to the execution path, do + + sudo ln -s /usr/lib/ejre1.6.0_38/bin/java /usr/bin/java + + - alternative option: install OpenJDK. This works fine but is a much + larger package and probably not as fast as the Oracle JVM. We need + only the headless JRE.
Simply do: + + + + sudo apt-get install openjdk-7-jre-headless + + - To test if java is now available, run + + + + java + + - **Note as of 09/11/2014**: an install on an ARM board (Olinuxino + A13) showed that openjdk-7-jre-headless would not provide what is + necessary to run the Java server as needed by YaCy, but would + provide only a 'Zero VM' instead of a 'Server VM'. Hence at the + moment installing the package from Java seems to be the only + solution. + +#### YaCy Installation + +There is the option to install YaCy like any other Debian package (see: +[Debian Installation](debianinstall.md)), but +then you cannot use the Oracle JVM as described above. We will just use +the YaCy tarball release. + + - install YaCy using a tarball from e.g. + + + + wget https://release.yacy.net/yacy_latest.tar.gz + tar xfz yacy_latest.tar.gz + +We should change some default settings in the yacy.init file (lower RAM +usage and lower disk space limit) + + sed "s/disk.free = 3000/disk.free = 1000/" -i ~/yacy/defaults/yacy.init + sed "s/javastart_Xmx=Xmx600m/javastart_Xmx=Xmx120m/" -i ~/yacy/defaults/yacy.init + +Now you can start YaCy: + + ~/yacy/startYACY.sh + + - If you set the default IP 192.168.1.70, then your YaCy peer will be + available (wait a bit) at + - YaCy will replace the default administration password, which is + empty, after some minutes by a random password. You should set your + own password by calling + + + + ~/yacy/bin/passwd.sh {yournewpassword} + +If you click on a protected page in YaCy, you must enter that password. + +#### YaCy Auto-Start and Watchdog + +We want our RPi to start YaCy at boot time and shut it down +properly. Create a file in /etc/init.d/yacy with the following content: + + #!
/bin/sh + ### BEGIN INIT INFO + # Provides: YaCy + # Required-Start: $local_fs $remote_fs $network $time + # Required-Stop: $local_fs $remote_fs $network $time + # Default-Start: 2 3 4 5 + # Default-Stop: 0 1 6 + # Short-Description: YaCy Search Engine + ### END INIT INFO + case "$1" in + start) + su - pi -c "/home/pi/yacy/startYACY.sh" + ;; + stop) + su - pi -c "/home/pi/yacy/stopYACY.sh" + ;; + *) + exit 3 + ;; + esac + : + +and make it executable and linked with + + sudo chmod 755 /etc/init.d/yacy + sudo update-rc.d yacy defaults + +This will start and stop YaCy automatically. We also want YaCy to be +supervised by a watchdog and automatically restarted if it has failed, +crashed or become unresponsive. Add the following line to /etc/crontab + + 0 * * * * pi cd /home/pi/yacy/bin && ./checkalive.sh + +This will check and if necessary restart YaCy once an hour. + +#### YaCy Search on (default http) Port 80 + +You need an iptables entry for that, just write the following line into +/etc/rc.local + + iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 8090 + +Then your YaCy peer on the RPi is available at + + + + + +_Converted from , may be +outdated_ + + + + diff --git a/docs/installation/requirements.md b/docs/installation/requirements.md new file mode 100644 index 0000000..e108370 --- /dev/null +++ b/docs/installation/requirements.md @@ -0,0 +1,49 @@ +# System Requirements + +To get YaCy running, the only thing you need is a somewhat recent +computer (everything that used to be up-to-date in this millennium +should be all right) and an operating system that supports Java. Apart +from this, you should allow for 25GB (or at least 1-2 GB for a start) of +disk space on your hard-drive for collected data on websites and the +index itself (keep in mind in which directory YaCy stores its data, e.g. +in /var/lib for Debian GNU/Linux).
The more [physical +memory](http://en.wikipedia.org/wiki/Random_access_memory) (RAM) you +have the better, but 256MB are a good start. Less is also acceptable, +but it will make work a bit slow. + +YaCy has been written in the +[Java](http://en.wikipedia.org/wiki/Java_\(programming_language\)) +programming language, which means that it is available for a great +number of computer systems, as Java is available for almost all systems +in use. An outstanding feature of Java programs is the fact that they +can be run on different systems without any modification. + +Java can be downloaded and used for free. If there is no Java +environment on your system yet, you must install it before installing +YaCy. GNU/Linux distributions may include, e.g. a free one called +[OpenJDK](http://openjdk.java.net/install/). Otherwise, Java is +available from the [Sun website](http://java.com/en/download/index.jsp). +The minimum Java version you need for YaCy is Java 7, which might +change in the future. + +Note that since Apache Solr is a YaCy core component, it is a good idea to +also follow the Solr recommendations. For example, YaCy 1.82 includes Solr +4.10.3: Java 1.7.0\_u55 or later is recommended (see [Solr System +Requirements](https://lucene.apache.org/solr/4_10_3/SYSTEM_REQUIREMENTS.html)) + +Because of this, and the fact that the newer Java version is more +powerful than the old one, you should choose Java 7 right from the +start. If the only thing you want to do is just run Java programs you +can make do with the [JRE (Java Runtime +Environment)](https://en.wikipedia.org/wiki/Java_Runtime_Environment). If you want to develop +Java programs or even help making YaCy better, you will need the [JDK +Java Development Kit](http://en.wikipedia.org/wiki/JDK).
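The Java 7 minimum mentioned above can be checked mechanically. The following is a minimal sketch, assuming version strings in either the legacy `1.x.y_z` scheme or the newer `x.y.z` scheme. The helper names `java_major` and `java_ok_for_yacy` are illustrative and not part of YaCy.

```sh
# Sketch: extract the major Java version from a version string.
# java_major and java_ok_for_yacy are hypothetical helper names.
java_major() {
    v=$1
    case "$v" in
        1.*) v=${v#1.}; echo "${v%%.*}" ;;   # legacy scheme: 1.7.0_55 -> 7
        *)   echo "${v%%[.-]*}" ;;           # newer scheme: 11.0.2 -> 11
    esac
}

# Succeeds if the given version string meets the Java 7 minimum
java_ok_for_yacy() {
    [ "$(java_major "$1")" -ge 7 ]
}
```

On many systems the version string can be obtained with something like `java -version 2>&1 | awk -F'"' '/version/ {print $2}'`, although the exact output format of `java -version` varies between vendors, so treat that pipeline as an assumption to verify locally.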
+ + + +_Converted from , may +be outdated_ + + + + diff --git a/docs/operation/shrink.md b/docs/installation/shrink.md similarity index 100% rename from docs/operation/shrink.md rename to docs/installation/shrink.md diff --git a/docs/operation/staticip.md b/docs/installation/staticip.md similarity index 100% rename from docs/operation/staticip.md rename to docs/installation/staticip.md diff --git a/docs/operation/portforwarding.md b/docs/operation/portforwarding.md new file mode 100644 index 0000000..a76291c --- /dev/null +++ b/docs/operation/portforwarding.md @@ -0,0 +1,25 @@ +# Portforwarding + + +You can configure any port for YaCy but port numbers below 1024 are +reserved for use by **root** on Linux. Because some firewalls block +port 8090 it can be useful to redirect the server port 80 to 8090. + +One solution for this is port forwarding using iptables. +As user **root** just do + + iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 8090 + +This is a temporary port redirection; if you want this to be permanent +just write this line into /etc/rc.local + + + + + +_Converted from +, may be outdated_ + + + + diff --git a/docs/operation/yacy-tor.md b/docs/operation/yacy-tor.md new file mode 100644 index 0000000..0b6b3f4 --- /dev/null +++ b/docs/operation/yacy-tor.md @@ -0,0 +1,605 @@ +# YaCy and Tor + + + +**Note:** YaCy is not yet able to recognise Tor URLs, therefore you should +use YaCy as "tor-only", otherwise useless URLs will be sent to normal peers. +Please follow all steps and read all hints carefully. + + +Peers in the public cluster should blacklist Tor-URLs to prevent seeding +.onion URLs sent to them by wrongly configured peers. The blacklist entry +should look like this: `*.onion/.*` + +**Note:** This How-To is divided into two parts. Please complete part 1 first before starting with part 2.
+ + + + + + +**Warning!** There is no whitelisting filter anymore, so you are not able to run tor-only YaCy + + + +Thread about Whitelisting feature: + + - + + + + +## Goal + +An independent YaCy network to index Tor hidden services is to be built. +No normal Internet sites should be indexed for that purpose. There is +also [another YaCy network](./en:yacy-tor.html#Similar_YaCy_networks) to +index both Tor hidden-services and normal Web sites. + +## Help + +Should you have questions or need help, go to the [English YaCy +forum](http://www.huzzaar.com/yacy-forum/) + +## Part 1 - Configuring Tor and Privoxy + +Please install Tor and Privoxy first. The installation depends on your +operating system. Read the OS specific manual. + +### Configuring Tor + +It's sufficient to run Tor as a client, though we are going to install a +hidden service later on. The Tor package comes fully configured to run +out of the box as a client. Nevertheless you should edit your Tor +configuration file (e.g. `/etc/tor/torrc`) to increase system security. + +First of all look for "SocksPort", which defaults to 9050: + + SocksPort 9050 + +Remember this port number. + +If you connect to Tor from the same system only, prevent other IPs from +connecting by binding Tor to localhost: + + SocksListenAddress 127.0.0.1:9050 + +Additionally you should restrict access to the SocksPort: + + SocksPolicy accept 127.0.0.0/8 + SocksPolicy reject * + +`ORPort`, `ORListenAddress`, `DirPort` or `DirListenAddress` only need +to be set if you run Tor as a server. + +`ControlPort` only needs to be set if you run a control application. + +Make sure to disable logging, otherwise sensitive information will be +logged. Using + + Log notice syslog + +only writes minimal information to syslog. (Apparently `Log notice` has +to be set, otherwise Tor won't start properly. The configuration may +vary for different operating systems.)
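The SOCKS restrictions configured above can be double-checked with a small script before Tor is started. This is only a sketch under stated assumptions: it uses the directive names exactly as they appear in this guide (your Tor version may expect different ones), and the function name `torrc_is_local_only` is made up for illustration.

```sh
# Sketch: succeed only if the given torrc binds the SOCKS listener to
# 127.0.0.1 and ends its SOCKS policy with a catch-all reject.
# torrc_is_local_only is a hypothetical helper name.
torrc_is_local_only() {
    # SocksListenAddress must point at 127.0.0.1 (port optional)
    grep -Eq '^[[:space:]]*SocksListenAddress[[:space:]]+127\.0\.0\.1(:[0-9]+)?[[:space:]]*$' "$1" &&
    # a "SocksPolicy reject *" line must be present
    grep -Eq '^[[:space:]]*SocksPolicy[[:space:]]+reject[[:space:]]+\*[[:space:]]*$' "$1"
}
```

A call such as `torrc_is_local_only /etc/tor/torrc` returns success only when both lines are present. It does not validate the rest of the file or the order of the policy lines.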
+ +Those who want to feel really safe can optionally set + + ExitPolicy reject *:* + +and + +``` + BandwidthRate 50 KB +``` + +This will limit damage in case of misconfiguration by reducing traffic +and restricting connections + +----- + +Here is the configuration as a whole (depends on your OS, this is for +Linux) + + ExitPolicy reject *:* + User tor + Group tor + PIDFile /var/run/tor.pid + SocksPort 9050 + SocksListenAddress 127.0.0.1:9050 + SocksPolicy accept 127.0.0.0/8 + SocksPolicy reject * + Log notice syslog + DataDirectory /var/lib/tor/data + # ControlPort 9051 + # RunAsDaemon 1 # has to be set depending on os + BandwidthRate 50 KB + +### Configuring Privoxy + +The following how-to assumes you will use Privoxy for Tor only. + +Edit privoxy's configuration file (e.g. `/etc/privoxy/config`). Check or +edit the following settings. + +Don't log every requested page. You only need startup and error +messages. Probably the best is to not log anything at all: + + debug 0 + +Make sure only localhost is allowed to connect and privoxy listens on +port 8118. + + listen-address 127.0.0.1:8118 + +Privoxy's filter should be switched off, since it just acts as a proxy +between YaCy and Tor. You can also switch off toggling: + + toggle 0 + enable-remote-toggle 0 + enable-remote-http-toggle 0 + +You may disable editing filters and rules, too. + + enable-edit-actions 0 + +The most important setting is to forward all connections to Tor (port 9050). +(Don't forget the dot at the end of the line) + + forward-socks4a / 127.0.0.1:9050 . + +`forwarded-connect-retries` should be slightly increased to improve +connections. I recommend 2 or 3: + + forwarded-connect-retries 2 + +----- + +This is a listing of all settings (depends on OS, here Linux): + + confdir /etc/privoxy + logdir /var/log/privoxy + actionsfile standard + actionsfile default + actionsfile user + filterfile default.filter + logfile privoxy.log + debug 0 + # debug 1 # make sure to uncomment!
+ listen-address 127.0.0.1:8118 + toggle 0 + enable-remote-toggle 0 + enable-remote-http-toggle 0 + enable-edit-actions 0 + buffer-limit 4096 + forward-socks4a / 127.0.0.1:9050 . + forwarded-connect-retries 2 + +### Check configuration + +Before you start to configure YaCy, you should test the configuration of +Tor and privoxy to make sure everything works fine. Wait some time to +let Tor connect to the Tor network. Start your browser and configure +it to use a proxy with proxyhost `localhost` and proxyport `8118`. Visit +a Tor URL, e.g.: + + - [Hidden Wiki](http://6sxoyfb3h2nvok2d.onion/) + - [Invisi Wiki](http://2qrww3nv5w3ue3ir.onion/) + - [APE Wiki](http://anegvjpd77xuxo45.onion/) + +When you are able to connect to an onion URL successfully, continue with +part 2 of the how-to. If you are having trouble, check your +configuration files and reread the documentation of Tor and privoxy +carefully. + +Don't forget to remove the proxy settings from your +browser configuration. + +## Part 2 - Configuring a hidden-service and YaCy + +**Note:** You should only continue with this part if Tor and Privoxy are running correctly + + + +### Configuring a hidden-service + +Shut down Tor. Modify the Tor configuration file and add an entry to +support YaCy as a hidden-service e.g.: + + HiddenServiceDir /var/lib/tor/yacy/ + HiddenServicePort 8181 127.0.0.1:8181 + +Port 8181 is the YaCy port we will use later. + +After restarting Tor you will find a file named `hostname` in the +directory `HiddenServiceDir`. The hostname in this file (e.g. +*1a2b3c4d5e6f7g89.onion*) will be needed later. + +### Configuring YaCy + +#### Preamble + +First of all, there are several ways to modify YaCy's configuration. One +is to edit the file yacy.init, another is to edit httpProxy.conf +directly. It's up to you which way you choose. + +It's recommended to download an up to date version of YaCy and to modify +the yacy.init **before** starting it the first time.
This way it is +ensured that YaCy hasn't made contacts and hasn't built an index yet. +The information in yacy.init will be written to +DATA/SETTINGS/httpProxy.conf on the first startup. + +There are also several ways to modify superseed.txt and here too I will +describe an unusual way to prevent superseed.txt from being +overwritten when updating. + +The recommended edits are optimal for my configuration. If you use +another configuration, make sure you know what you are doing. + +Under no circumstances should you try to modify an already used +(started) YaCy installation since there are several traps that are not +documented and which will cause YaCy to contact public YaCy clusters and +distribute onion URLs. + +Ok, let's start. First change into the YaCy directory. All following +pathnames are relative to the YaCy directory. + +#### Modifying the configuration files + +Now we will modify yacy.init. Only the settings we have to modify are +listed. + +First we have to set the port on which YaCy will be reachable and which +is different from the normal YaCy port. + + port = 8181 + +Then we need to set another location for the net definition files since +the standard ones will be overwritten with every update. + + network.unit.definition = ../yacy.network.unit.tor + network.group.definition = ../yacy.network.group.tor + +Automatic update should work, but it hasn't been tested sufficiently yet +and until we can be sure it won't destroy anything we had better disable it: + + update.process = manual + +It's also important to replace the blacklist with a whitelist so that +only the domains which are in the list will be indexed, instead of +indexing all domains which are *not* in the list. This way we make sure +that only hidden services will be indexed, since they are defined by the +onion domain. Later we will configure the whitelist.
+ + BlackLists.class=de.lulabad.blacklist.advancedWhiteURLPattern + +Now we make sure YaCy will only contact the Internet through privoxy: + + remoteProxyUse=true + remoteProxyHost=localhost + remoteProxyPort=8118 + +Since the DNS-resolution only delivers local network addresses, we have +to empty the IP address blocklist for the proxy, otherwise YaCy would +try to connect to sites directly without using the proxy and thus won't +be able to find them: + + remoteProxyNoProxy= + +The following settings make the seedfile available in the Tor network: + + seedUploadMethod=File + seedFilePath=htroot/seed.txt + seedURL=http://1a2b3c4d5e6f7g89.onion:8181/seed.txt + +Now we give our YaCy a freely selectable name: + + peerName=TorYaCy + +YaCy needs to run in debug mode to handle local addresses (as used by +Tor) correctly: + + yacyDebugMode=true + +To be able to make a connection, YaCy needs to be told from which +hostname (domain) it is reachable: + + staticIP=1a2b3c4d5e6f7g89.onion + +Should you want YaCy to open a browser window, just skip the following +option. Otherwise set: + + browserPopUpTrigger=false + +Since the Tor network is not the fastest, we set all timeouts to high +values: + + clientTimeout=90000 + crawler.clientTimeout=90000 + proxy.clientTimeout=90000 + indexControl.timeout = 180000 + indexDistribution.timeout = 180000 + indexTransfer.timeout = 360000 + +The following options are very important to ensure that our peer won't contact +any public clusters but only other Tor-YaCy peers: + + CRDistOn = false + CRDist1Target = + +For security reasons it is also important that the proxy isn't reachable +from the Tor network. The following configuration describes the scenario +that YaCy is running on the same computer as Tor.
Then you need to set +for example 192.168.1.2 as the address for the server in your browser +instead of localhost: + + proxyClient=192.168.*,10.* + +Finally we set several options to increase the anonymity in the Tor +network: + + proxy.sendViaHeader=false + proxy.sendXForwardedForHeader=false + useYacyReferer=false + useYacyReferer__pro = false + +Optionally we can set the following options to restrict the maximum file +size (here \~10MB) and to reduce the cache size to a minimum (here 4MB), +because the sites we browse are cached there: + + crawler.ftp.maxFileSize=10000000 + crawler.http.maxFileSize=10000000 + proxyCacheSize=4 + + + +**Note:** If you run YaCy using Linux or any similar OS: Don't forget to set +the right owner/group and the right file modes on files and YaCy +directories, especially `DATA/SETTINGS` and the file `httpProxy.conf` +located in there, e.g. `chown -R yacy: ./` + + + +#### Activate Whitelist + + +**Warning:** There is no whitelisting filter anymore, so you are not able to run tor-only YaCy + + + +Thread about Whitelisting feature: + + - + +~~YaCy only supports a blacklist by default, therefore you have to +download +[\[1\]](http://yacy-websuche.de/wiki/index.php/Benutzer:Lulabad#regex_Whitelist_.28erst_ab_0.3.29%7CadvancedBlacklist-0.3.jar) +(or higher) and copy it to libx. After that the previously configured +filter is available.~~ + +Sorry, but this Whitelist can't be used at this moment: + + - + +Now we just have to make an entry to only index *.onion* sites: + + - Create the subfolder DATA + - Create the sub-subfolder LISTS (DATA/LISTS) + - Create the file url.default.black (DATA/LISTS/url.default.black) + with the following content: + + + + *.onion/.* + +A possible workaround is to use a filtering proxy in front of YaCy that +accepts only \*.onion domains.
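The core decision such a filtering proxy has to make is whether a requested URL points at an `.onion` host. The following is a minimal sketch of that check only, not a proxy and not anything YaCy or Privoxy ships. The function name `is_onion_url` is hypothetical, and URLs containing user info (`user@host`) are not handled.

```sh
# Sketch: succeed only if the host part of a URL ends in ".onion".
# is_onion_url is a hypothetical helper name.
is_onion_url() {
    host=${1#*://}      # strip the scheme, if present
    host=${host%%/*}    # strip the path
    host=${host%%:*}    # strip the port
    case "$host" in
        *.onion) return 0 ;;
        *)       return 1 ;;
    esac
}
```

Note that the check looks at the host only, so a clearnet URL whose path merely ends in `.onion` is rejected, while a seed list URL such as `http://byi4akelnxrz5tab.onion:8081/seed.txt` is accepted.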
+ +#### Defining the YaCy-Tor-network + +YaCy is now able to build and define separate networks: +[network definition](./de:netzdefinition.html "De:Netzdefinition") + +The current definitions can be downloaded from +[\[2\]](http://byi4akelnxrz5tab.onion:8081/yacy.network.unit.tor) and +[\[3\]](http://byi4akelnxrz5tab.onion:8081/yacy.network.group.tor). + +`yacy.network.group.tor` is empty and `yacy.network.unit.tor` has the +following content: + + network.unit.name = torworld + network.unit.description = Yacy network for TOR https://www.torproject.org/ + network.unit.domain = any + network.unit.search.time = 4 + network.unit.dhtredundancy.junior = 1 + network.unit.dhtredundancy.senior = 3 + network.unit.bootstrap.seedlist0 = http://byi4akelnxrz5tab.onion:8081/seed.txt + network.unit.bootstrap.seedlist1 = http://pah22f4rpnz4hoyn.onion:8084/seed.txt + network.unit.bootstrap.seedlist2 = http://zxbagwypsfbicebv.onion:8091/seed.txt + network.unit.update.location0 = http://yacy.net/Download.html + network.unit.update.location1 = http://latest.yacy.de + network.unit.update.location2 = http://www.findenstattsuchen.info/YaCy/latest/index.php + network.unit.protocol.control = uncontrolled + +#### Starting YaCy + +Now you may start YaCy. Watch the log file and maybe the network graph, +since other Tor-YaCy peers should be visible within minutes. Public IPs shouldn't +appear in the log file. Error messages caused by the seedfiles may appear +in the beginning and can be ignored as soon as the first other Tor-YaCy +peers are found. + + + +**Warning:** +Visit and set an admin password when you start YaCy for the first time. + + + +#### Using YaCy + +Enter proxyhost `localhost` and proxyport `8181` into your +browser configuration. Now you should be able to visit Tor hidden +services using YaCy. + +# Notes + +Tor is a slow and sometimes unstable system and sometimes it can take a +while until the YaCy peers find each other and exchange data. Be patient.
+ + - Some Tor pages have to be reloaded several times + - DHT transfer to other tor-YaCy is working (untested) + - RankingDistribution to other tor-YaCy is working (untested) + - The status of a peer depends on the quality of the connection. Don't + be surprised if you are principal and some minutes later you are + junior. + - Don't index normal Internet sites using Tor-YaCy. That would destroy + the Tor-only index. You may find filter rules which block access + to the Internet at other tor-peers. Use them\! + - The German version of this article can be found at the + tor-hiddenwiki. Anyone who modifies this article should modify it there, + too: + +# Security Hints + + - logging should be disabled in all programs or the log-files should + be placed on a ramdisk, the same applies to YaCy. + - YaCy's HTCACHE should be placed on a ramdisk + - when starting crawls, these should be run locally and not distributed, + don't forget to set the filter rule to onion-domains + - wiki and blog should be used carefully + - public bookmarks shouldn't be used + - browser cache and browser history should be deactivated + - paranoid people can install YaCy on an encrypted filesystem + +# Seeds for Tor-YaCy + +Please only post reliably available seed files. These seed files can be +added to the unit file. + + - + - + - + +If your Tor-YaCy is up and running, please post the URL to your seedfile +here (the one from `seedURL`) + +# Demo-peers + +Beware, they may be down. + + - + - + + + + - + - + - + - + +# Similar YaCy networks + +Please always join already existing networks whenever possible. Tor's +resources are limited and should be spared. In addition it is hard +enough to connect enough servers to build a stable Tor-YaCy-network. + +## *freeworld* + +That is not a dedicated network, but *usual* YaCy servers that are +also provided as hidden services in the Tor network. + +So they do **not** crawl Tor hidden services, but enable Tor users to +use YaCy servers directly and anonymously.
+ +Servers: + + - + +Please announce your server in the Tor network, too: + + - [Announce new hidden services](http://eqt5g4fuenphqinx.onion/sites) + - [Discussion about YaCy and Tor and announcing some YaCy + servers](http://eqt5g4fuenphqinx.onion/talk/272) + +## anonymworld + + +**Warning:** This network has been shut down. It's more important to support the common *freeworld* and the special *torworld* network. It would be more useful to provide access to your node on Tor using hidden services than to build this network. + + + + + +### Goals + +anonymworld is a network that, in addition to Tor hidden services, also +indexes the normal web and is only reachable through Tor. + +### Differences + +The following changes from the above described procedure have to be +made: + +1. No whitelist: All whitelist/blacklist entries as well as the + additional Jar-file are not necessary. +2. Independent network: There are special unit and group files. + +**Important:** These are the differences from the general description\! + +#### Unit/Group-Files + +The current definitions can be downloaded from +[\[4\]](http://jarwf7lglg3lbujb.onion:8086/yacy.network.unit.tor) and +[\[5\]](http://jarwf7lglg3lbujb.onion:8086/yacy.network.group.tor). + +yacy.network.group.tor is empty and yacy.network.unit.tor has the +following content: + + network.unit.name = anonymworld + network.unit.description = Yacy network for TOR https://www.torproject.org/ indexing whole world + network.unit.domain = any + network.unit.search.time = 4 + network.unit.dhtredundancy.junior = 1 + network.unit.dhtredundancy.senior = 3 + network.unit.bootstrap.seedlist0 = http://jarwf7lglg3lbujb.onion:8086/seed.txt + +### Demo Peers + +Beware, peers might be down. + + - + +### Seed files + +Please only add reliably available seed files.
+ + - + +# External links + + - - Proxy page for using + and testing the Tor network (hidden services) without your own Tor + installation + - -- downloading Tor and + installation hints for several operating systems + - + -- hints on the DNS problem and solving it (German) + - -- another + description for installing yacy-tor (German) + - [core.onion - Build decentral search engine network to index hidden + services](http://eqt5g4fuenphqinx.onion/talk/272) + +## Documentation for newbies in the Tor world + + - -- Hidden-Wiki with a list of + Hidden Services + - -- How-To at Hidden + Wiki for using Tor and YaCy + +_Converted from +, may be outdated_ + + + + diff --git a/docs/operation/yacyoverhttps.md b/docs/operation/yacyoverhttps.md new file mode 100644 index 0000000..b307a81 --- /dev/null +++ b/docs/operation/yacyoverhttps.md @@ -0,0 +1,58 @@ +# Using the YaCy Front-End over HTTPS + +It is possible to put SSL encryption in front of YaCy to make the YaCy +interface accessible using https. This can easily be done using stunnel +and openssl. + +## Installation of openssl and stunnel on Debian + +As user root, call: + + apt-get install openssl stunnel + +Then create an SSL certificate: + + cd ~ + openssl req -new -x509 -keyout stunnel-key-pw.pem -out stunnel-cert.pem -days 3650 + +And get rid of the passphrase with + + openssl rsa -in stunnel-key-pw.pem -out stunnel-key.pem + +As user root, move the key and certificate files to `/etc/stunnel/` + + mv stunnel-key.pem stunnel-cert.pem /etc/stunnel/ + chmod 600 /etc/stunnel/*.pem + +To configure stunnel, create the file `/etc/stunnel/stunnel.conf` with the +following content: + + chroot = /var/lib/stunnel4/ + setuid = stunnel4 + setgid = stunnel4 + pid = /stunnel4.pid + cert = /etc/stunnel/stunnel-cert.pem + key = /etc/stunnel/stunnel-key.pem + output = stunnel.log + + [https] + accept = 443 + connect = 8090 + +To activate this service, edit the file `/etc/default/stunnel4` and change +the value of `ENABLED` to `1`.
Finally restart stunnel: + + /etc/init.d/stunnel4 restart + +Now the YaCy search page can be opened at + + + + + +_Converted from +, may be outdated_ + + + +