====== Container 'wiki' ======

===== Resources =====

  * 0.5 GB RAM
  * 2 cores
  * 8 GB HDD (root fs)

===== System =====

  * internal IPs
    * 10.2.0.2, fd00:10:2:0::2

===== Services =====

  * DokuWiki (via NGINX)

===== Installation =====

  * standard template with users

==== NGINX / PHP ====

  - Install NGINX and PHP-FPM
    * **sudo apt-get install php7.4 php7.4-cli php7.4-fpm php7.4-gd php7.4-xml php7.4-zip php7.4-mbstring nginx**
  - Adjust the default configuration<file|/etc/nginx/sites-available/default>
server {
    listen 80 default_server;
    listen [::]:80 default_server;

    root /var/www/public_html;

    # Add doku.php as index
    index doku.php;

    server_name _;

    # Block Bytespider
    if ($http_user_agent ~ (Bytespider|bytedance)) {
        return 403;
    }

    location / {
        try_files $uri $uri/ @dokuwiki;
    }

    # Do not serve config, data and .ht*
    # Comment this block out during installation
    location ~ /(data/|conf/|bin/|inc/|install\.php) {
        return 403;
    }
    location ~ /\.ht { deny all; }

    # Maximum POST size and body buffer
    client_max_body_size 20M;
    client_body_buffer_size 128K;

    # Rewrite rules
    location @dokuwiki {
        rewrite ^/_media/(.*) /lib/exe/fetch.php?media=$1 last;
        rewrite ^/_detail/(.*) /lib/exe/detail.php?media=$1 last;
        rewrite ^/_export/([^/]+)/(.*) /doku.php?do=export_$1&id=$2 last;
        rewrite ^/(.*) /doku.php?id=$1 last;
    }

    # Pass PHP scripts to the FastCGI server
    location ~ \.php$ {
        try_files $uri =404;
        fastcgi_pass unix:/var/run/php/php7.4-fpm.sock;
        fastcgi_index index.php;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    }
}
</file>
  - Enable and restart the services
    * **sudo systemctl enable nginx.service php7.4-fpm.service**
    * **sudo systemctl restart nginx.service php7.4-fpm.service**
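
A quick smoke test of the setup (optional; assumes the wiki answers on localhost and that curl is installed):

  * **curl -I http://localhost/** should return an HTTP status from DokuWiki
  * **curl -I -A "Bytespider" http://localhost/** should return 403 because of the user-agent block above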

==== DokuWiki ====

  - Create the DokuWiki directories
    * **sudo mkdir /var/www/{public_html,data}**
  - Install DokuWiki
    * following the official [[https://www.dokuwiki.org/install|instructions]]; a download sketch follows below this list
      * unpack into /var/www/public_html
      * data directory goes to /var/www/data (i.e. ../data relative to the web root)
  - Adjust permissions
    * **sudo chown -R www-data:www-data /var/www/{public_html,data}**
  - Set up the cache cleanup; a dry-run check follows below this list
    - Cleanup script<file|/usr/local/bin/cleanup_dokuwiki_cache.sh>
#!/bin/bash

cleanup()
{
    local data_path="$1"        # full path to data directory of wiki
    local retention_days="$2"   # number of days after which old files are to be removed

    # remove stale lock files (files which are 1-2 days old)
    find "${data_path}"/locks/ -name '*.lock' -type f -mtime +1 -delete

    # remove files older than ${retention_days} days from the cache
    find "${data_path}"/cache/ -type f -mtime +${retention_days} -delete
}

# cleanup DokuWiki installations (path to datadir, number of days)
cleanup /var/www/data    180
</file>
    - Make the cleanup script executable
      * **sudo chmod +x /usr/local/bin/cleanup_dokuwiki_cache.sh**
    - Set up the cleanup script as a cron job
      * **<nowiki>echo -e "0 2 * * *\troot\t/usr/local/bin/cleanup_dokuwiki_cache.sh" | sudo tee /etc/cron.d/dokuwiki > /dev/null</nowiki>**
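
A minimal download-and-unpack sketch for step 2. The stable-tarball URL is taken from the DokuWiki download page, and the data-directory move is an assumption to match the layout above, so verify both before use (the script name is illustrative):

<file|/tmp/install_dokuwiki.sh>
#!/bin/bash
# Fetch the current stable DokuWiki release (URL per the official download page)
cd /tmp
wget https://download.dokuwiki.org/src/dokuwiki/dokuwiki-stable.tgz
# Unpack directly into the web root, stripping the versioned top-level directory
sudo tar -xzf dokuwiki-stable.tgz --strip-components=1 -C /var/www/public_html
# Assumption: move the shipped data directory next to the web root, matching the layout above
sudo mv /var/www/public_html/data/* /var/www/data/
</file>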
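
Before relying on the cron job, the deletions can be rehearsed non-destructively by swapping **-delete** for **-print** in the same find expressions used by the script:

  * **find /var/www/data/locks/ -name '*.lock' -type f -mtime +1 -print**
  * **find /var/www/data/cache/ -type f -mtime +180 -print**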

=== robots.txt ===

Source: https://neil-clarke.com/block-the-bots-that-feed-ai-models-by-scraping-your-website/

  - Set the robots.txt<file|/var/www/public_html/robots.txt>
User-agent: CCBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Omgilibot
Disallow: /

User-agent: Omgili
Disallow: /

User-agent: FacebookBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: ImagesiftBot
Disallow: /
</file>
  - Adjust permissions
    * **sudo chown www-data:www-data /var/www/public_html/robots.txt**
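
To confirm that the file is actually served (assuming local access), **curl http://localhost/robots.txt** should print the rules above; since the static file exists, **try_files** serves it before the rewrite to doku.php applies.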

==== Backup with Borgmatic ====

  - For installation, see [[mariadb]]
  - Configuration<file|/etc/borgmatic/config.yaml>
...
location:
    # List of source directories to backup (required). Globs and
    # tildes are expanded.
    source_directories:
        - /etc
        - /home
        - /root
        - /usr/local
        - /var/log
        - /var/www
...
</file>
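
Optional checks (command names per the borgmatic documentation for the series that still uses the location: layout shown above; verify against the installed version):

  * **sudo validate-borgmatic-config** validates the YAML against borgmatic's schema
  * **sudo borgmatic --dry-run --verbosity 1** goes through a backup run without creating an archive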