Published on

TailScale+Headscale 局域网组网全记录

Authors
  • avatar
    Name
    NorthSecond
    Twitter

起因

最近准备回家过年了,但是在家里显然还是要做科研的。细细数了一下自己手上的设备发现几年攒下来也确实不在少数(虽然大部分都是电子垃圾),目前主要的解决方案还是 SSH + RDP 这一套,管理起来还是比较麻烦,于是想着能不能有一个优雅的解决方案。

当然在学校做科研不可避免的一个问题是需要连接校园网内部的设备。目前外网访问校园网用的是深信服出品的 SSLVPN,其一是由于出口服务器带宽的原因内网网速实在是太慢太慢,其二也是因为 Sangfor VPN “名声在外”,我也亲手抓到过它扫盘上传,卸载的时候怕卸载不干净还直接重装过,哪怕放docker里面因为我懒得做加固也怕它逃逸喽()。因为出于安全考量,校园网内部也是有种种限制的(比如我在宿舍就无法 SSH 到实验楼的同在 SECURE 下的机器),再加上工位台式机所连接的更是校园网下的路由器下的局域网,这就导致了我即使在校园网内,在没有穿透的情况下也访问不了我的台式机。

可是 RDP 真的太香了,远程到实验室桌面环境进行远程开发对我来说是刚需,向日葵什么的实现起来让人感觉到还是很不优雅(其实主要是延迟太高),后面白嫖了阿里云半年的河源的云服务器有个公网 IP ,暂时的实现就是 FRP 内网穿透了台式机的 RDP 端口到公网IP的一个大号端口。

这样做可以是可以,但是核心痛点还是没有解决——这台被我用作内网穿透的服务器带宽只有 1Mbps,也就比 SSLVPN 快那么一点点。校园网虽然骂它慢的人不少,但是连接公网好歹还有 30Mbps 的样子,最起码能做到不卡嘛。再加上还有一台 NAS 托管在盆友的内网穿透下面,也是想集中纳入管理+顺带用它做一个内网 SSH 到组里服务器的跳板机(安全考虑就不把组里的 LXC 服务器虚拟机放内网了),还可以把家里的电脑也组进来,父母年龄慢慢大了有时候电脑有啥问题不太会折腾可以远程桌面帮他们弄一下。

查了一下还真有相关的解决方案,Tailscale 能提供我需要的几乎所有功能,又顺藤摸瓜找到了 juanfont/headscale ,能把服务器放在我的河源服务器上,满足了我对于网络全过程关键节点可控的幻觉需求(突发奇想的是是不是什么时候我也可以搭建一个这个内网的 SSLVPN)。看了一遍[译] NAT 穿透是如何工作的:技术原理及企业级实践(Tailscale, 2020) (arthurchiao.art)发现看起来感觉很不错,于是就开始折腾。

本文记录折腾的全过程,留作备忘。

Headscale 的配置

Headscale 安装

Headscale 是开源的 Tailscale 管理服务器,本文主要通过 Docker 部署在阿里云的 Ubuntu 22.04 LTS 服务器上,当然,环境需要安装好 Docker 和开了相应的端口。

进入自己准备好部署的文件夹,创建对应的配置文件

mkdir -p ./headscale/config
cd ./headscale
touch ./config/db.sqlite

然后下载示例的配置文件:

curl https://raw.githubusercontent.com/juanfont/headscale/main/config-example.yaml -o ./config/config.yaml

由于阿里云服务器访问 GitHub 有那么一些些玄学,所以还是手动新建文件复制粘贴了一下。由于和 docker 配合可以干掉很多和路径相关的问题(也产生了一些新的和路径相关的问题),对配置文件的修改主要就是改了自己的服务器地址和相关路径,换用国内可以访问的 DNS,还有关掉 magicDNS 以及 DERP 功能。

update: 改了一下 ipv4 的默认网段,不然占用 CGNAT 的ipv4 网段会有一些和运营商冲突的问题(比如阿里云的镜像就不能用了)

View config.yaml
# headscale will look for a configuration file named `config.yaml` (or `config.json`) in the following order:
#
# - `/etc/headscale`
# - `~/.headscale`
# - current working directory

# The url clients will connect to.
# Typically this will be a domain like:
#
# https://myheadscale.example.com:443
#
server_url: http://公网IP:8080

# Address to listen to / bind to on the server
#
# For production:
listen_addr: 0.0.0.0:8080
# listen_addr: 127.0.0.1:8080

# Address to listen to /metrics, you may want
# to keep this endpoint private to your internal
# network
#
metrics_listen_addr: 0.0.0.0:9090

# Address to listen for gRPC.
# gRPC is used for controlling a headscale server
# remotely with the CLI
# Note: Remote access _only_ works if you have
# valid certificates.
#
# For production:
# grpc_listen_addr: 0.0.0.0:50443
grpc_listen_addr: 127.0.0.1:50443

# Allow the gRPC admin interface to run in INSECURE
# mode. This is not recommended as the traffic will
# be unencrypted. Only enable if you know what you
# are doing.
grpc_allow_insecure: false

# The Noise section includes specific configuration for the
# TS2021 Noise protocol
noise:
  # The Noise private key is used to encrypt the
  # traffic between headscale and Tailscale clients when
  # using the new Noise-based protocol.
  private_key_path: /etc/headscale/noise_private.key

# List of IP prefixes to allocate tailaddresses from.
# Each prefix consists of either an IPv4 or IPv6 address,
# and the associated prefix length, delimited by a slash.
# It must be within IP ranges supported by the Tailscale
# client - i.e., subnets of 100.64.0.0/10 and fd7a:115c:a1e0::/48.
# See below:
# IPv6: https://github.com/tailscale/tailscale/blob/22ebb25e833264f58d7c3f534a8b166894a89536/net/tsaddr/tsaddr.go#LL81C52-L81C71
# IPv4: https://github.com/tailscale/tailscale/blob/22ebb25e833264f58d7c3f534a8b166894a89536/net/tsaddr/tsaddr.go#L33
# Any other range is NOT supported, and it will cause unexpected issues.
ip_prefixes:
  - 10.255.0.0/10

# DERP is a relay system that Tailscale uses when a direct
  # - 100.64.0.0/10
  # - fd7a:115c:a1e0::/48
# connection cannot be established.
# https://tailscale.com/blog/how-tailscale-works/#encrypted-tcp-relays-derp
#
# headscale needs a list of DERP servers that can be presented
# to the clients.
derp:
  server:
    # If enabled, runs the embedded DERP server and merges it into the rest of the DERP config
    # The Headscale server_url defined above MUST be using https, DERP requires TLS to be in place
    enabled: false

    # Region ID to use for the embedded DERP server.
    # The local DERP prevails if the region ID collides with other region ID coming from
    # the regular DERP config.
    region_id: 999
    
    # Region code and name are displayed in the Tailscale UI to identify a DERP region
    region_code: "headscale"
    region_name: "Headscale Embedded DERP"
    
    # Listens over UDP at the configured address for STUN connections - to help with NAT traversal.
    # When the embedded DERP server is enabled stun_listen_addr MUST be defined.
    #
    # For more details on how this works, check this great article: https://tailscale.com/blog/how-tailscale-works/
    stun_listen_addr: "0.0.0.0:3478"
    
    # Private key used to encrypt the traffic between headscale DERP
    # and Tailscale clients.
    # The private key file will be autogenerated if it's missing.
    #
    private_key_path: /etc/headscale/derp_server_private.key

  # List of externally available DERP maps encoded in JSON
  urls:
    - https://controlplane.tailscale.com/derpmap/default

  # Locally available DERP map files encoded in YAML
  #
  # This option is mostly interesting for people hosting
  # their own DERP servers:
  # https://tailscale.com/kb/1118/custom-derp-servers/
  #
  # paths:
  #   - /etc/headscale/derp-example.yaml
  paths: []

  # If enabled, a worker will be set up to periodically
  # refresh the given sources and update the derpmap
  # will be set up.
  auto_update_enabled: true

  # How often should we check for DERP updates?
  update_frequency: 24h

# Disables the automatic check for headscale updates on startup
disable_check_updates: false

# Time before an inactive ephemeral node is deleted?
ephemeral_node_inactivity_timeout: 30m

# Period to check for node updates within the tailnet. A value too low will severely affect
# CPU consumption of Headscale. A value too high (over 60s) will cause problems
# for the nodes, as they won't get updates or keep alive messages frequently enough.
# In case of doubts, do not touch the default 10s.
node_update_check_interval: 10s

# SQLite config
db_type: sqlite3

# For production:
db_path: /etc/headscale/db.sqlite

# # Postgres config
# If using a Unix socket to connect to Postgres, set the socket path in the 'host' field and leave 'port' blank.
# db_type: postgres
# db_host: localhost
# db_port: 5432
# db_name: headscale
# db_user: foo
# db_pass: bar

# If other 'sslmode' is required instead of 'require(true)' and 'disabled(false)', set the 'sslmode' you need
# in the 'db_ssl' field. Refers to https://www.postgresql.org/docs/current/libpq-ssl.html Table 34.1.
# db_ssl: false

### TLS configuration
#
## Let's encrypt / ACME
#
# headscale supports automatically requesting and setting up
# TLS for a domain with Let's Encrypt.
#
# URL to ACME directory
acme_url: https://acme-v02.api.letsencrypt.org/directory

# Email to register with ACME provider
acme_email: ""

# Domain name to request a TLS certificate for:
tls_letsencrypt_hostname: ""

# Path to store certificates and metadata needed by
# letsencrypt
# For production:
tls_letsencrypt_cache_dir: /etc/headscale/cache

# Type of ACME challenge to use, currently supported types:
# HTTP-01 or TLS-ALPN-01
# See [docs/tls.md](docs/tls.md) for more information
tls_letsencrypt_challenge_type: HTTP-01
# When HTTP-01 challenge is chosen, letsencrypt must set up a
# verification endpoint, and it will be listening on:
# :http = port 80
tls_letsencrypt_listen: ":http = port 8080"

## Use already defined certificates:
tls_cert_path: ""
tls_key_path: ""

log:
  # Output formatting for logs: text or json
  format: text
  level: info

# Path to a file containg ACL policies.
# ACLs can be defined as YAML or HUJSON.
# https://tailscale.com/kb/1018/acls/
acl_policy_path: ""

# ## DNS
# #
# # headscale supports Tailscale's DNS configuration and MagicDNS.
# # Please have a look to their KB to better understand the concepts:
# #
# # - https://tailscale.com/kb/1054/dns/
# # - https://tailscale.com/kb/1081/magicdns/
# # - https://tailscale.com/blog/2021-09-private-dns-with-magicdns/
# #
# dns_config:
#   # Whether to prefer using Headscale provided DNS or use local.
#   override_local_dns: false

#   # List of DNS servers to expose to clients.
#   # nameservers:
#   #   - 1.1.1.1

#   # NextDNS (see https://tailscale.com/kb/1218/nextdns/).
#   # "abc123" is example NextDNS ID, replace with yours.
#   #
#   # With metadata sharing:
#   # nameservers:
#   #   - https://dns.nextdns.io/abc123
#   #
#   # Without metadata sharing:
#   # nameservers:
#   #   - 2a07:a8c0::ab:c123
#   #   - 2a07:a8c1::ab:c123

#   # Split DNS (see https://tailscale.com/kb/1054/dns/),
#   # list of search domains and the DNS to query for each one.
#   #
#   # restricted_nameservers:
#   #   foo.bar.com:
#   #     - 1.1.1.1
#   #   darp.headscale.net:
#   #     - 1.1.1.1
#   #     - 8.8.8.8

#   # Search domains to inject.
#   domains: []

#   # Extra DNS records
#   # so far only A-records are supported (on the tailscale side)
#   # See https://github.com/juanfont/headscale/blob/main/docs/dns-records.md#Limitations
#   # extra_records:
#   #   - name: "grafana.myvpn.example.com"
#   #     type: "A"
#   #     value: "100.64.0.3"
#   #
#   #   # you can also put it in one line
#   #   - { name: "prometheus.myvpn.example.com", type: "A", value: "100.64.0.3" }

#   # Whether to use [MagicDNS](https://tailscale.com/kb/1081/magicdns/).
#   # Only works if there is at least a nameserver defined.
#   magic_dns: false

#   # Defines the base domain to create the hostnames for MagicDNS.
#   # `base_domain` must be a FQDNs, without the trailing dot.
#   # The FQDN of the hosts will be
#   # `hostname.user.base_domain` (e.g., _myhost.myuser.example.com_).
#   base_domain: example.com

# Unix socket used for the CLI to connect without authentication
# Note: for production you will want to set this to something like:
unix_socket: /var/run/headscale/headscale.sock
unix_socket_permission: "0770"
#
# headscale supports experimental OpenID connect support,
# it is still being tested and might have some bugs, please
# help us test it.
# OpenID Connect
# oidc:
#   only_start_if_oidc_is_available: true
#   issuer: "https://your-oidc.issuer.com/path"
#   client_id: "your-oidc-client-id"
#   client_secret: "your-oidc-client-secret"
#   # Alternatively, set `client_secret_path` to read the secret from the file.
#   # It resolves environment variables, making integration to systemd's
#   # `LoadCredential` straightforward:
#   client_secret_path: "${CREDENTIALS_DIRECTORY}/oidc_client_secret"
#   # client_secret and client_secret_path are mutually exclusive.
#
#   # The amount of time from a node is authenticated with OpenID until it
#   # expires and needs to reauthenticate.
#   # Setting the value to "0" will mean no expiry.
#   expiry: 180d
#
#   # Use the expiry from the token received from OpenID when the user logged
#   # in, this will typically lead to frequent need to reauthenticate and should
#   # only been enabled if you know what you are doing.
#   # Note: enabling this will cause `oidc.expiry` to be ignored.
#   use_expiry_from_token: false
#
#   # Customize the scopes used in the OIDC flow, defaults to "openid", "profile" and "email" and add custom query
#   # parameters to the Authorize Endpoint request. Scopes default to "openid", "profile" and "email".
#
#   scope: ["openid", "profile", "email", "custom"]
#   extra_params:
#     domain_hint: example.com
#
#   # List allowed principal domains and/or users. If an authenticated user's domain is not in this list, the
#   # authentication request will be rejected.
#
#   allowed_domains:
#     - example.com
#   # Note: Groups from keycloak have a leading '/'
#   allowed_groups:
#     - /headscale
#   allowed_users:
#     - alice@example.com
#
#   # If `strip_email_domain` is set to `true`, the domain part of the username email address will be removed.
#   # This will transform `first-name.last-name@example.com` to the user `first-name.last-name`
#   # If `strip_email_domain` is set to `false` the domain part will NOT be removed resulting to the following
#   user: `first-name.last-name.example.com`
#
#   strip_email_domain: true

# Logtail configuration
# Logtail is Tailscales logging and auditing infrastructure, it allows the control panel
# to instruct tailscale nodes to log their activity to a remote server.
logtail:
  # Enable logtail for this headscales clients.
  # As there is currently no support for overriding the log server in headscale, this is
  # disabled by default. Enabling this will make your clients send logs to Tailscale Inc.
  enabled: false

# Enabling this option makes devices prefer a random port for WireGuard traffic over the
# default static port 41641. This option is intended as a workaround for some buggy
# firewall devices. See https://tailscale.com/kb/1181/firewalls/ for more information.
randomize_client_port: false

为了方便管理,这里我使用了 gurucomputing/headscale-ui,在根文件夹下创建 docker-compose.yml,此时文件夹结构应如下图所示:

配置docker-compose.yml 如下:

version: '3.5'
services:
  headscale:
    image: headscale/headscale:latest
    container_name: headscale
    volumes:
      - ./headscale/config:/etc/headscale/
    ports:
      - 8080:8080
      - 9090:9090
    command: headscale serve
    restart: always
  headscale-ui:
    image: ghcr.io/gurucomputing/headscale-ui:latest
    restart: always
    container_name: headscale-ui
    ports:
      - 443:443
networks:
  reverseproxy-nw:
    external: true

然后 docker compose up -d 就行了。

经过漫长的等待,看到 Docker 终于像预期一样跑起来了,这时我们就可以进行进一步的设置了。

Headscale 配置

图形界面配置

在 Docker 运行起来之后,我们应该就能直接访问对应公网 IP 的 443 端口,进入 ui 界面,提示设置 apikey,此时我们只需要执行:

docker exec headscale \
	headscale api create

拿到 api-key 填入即可。

apikey 只会显示这一次,但是 ui 界面填入的内容只保存在该浏览器的本地缓存中,所以请妥善保管好您的 apikey,或者干脆每次申请一个新的。

租户创立

租户创立只需要在容器中执行一行命令:

docker exec headscale \
	headscale user create <USERNAME>

就可以创建一个对应用户名的租户了。

节点绑定

在节点接入的提示中就会有对应的指令给出,在此备忘:

docker exec headscale \
	headscale nodes register --user <USERNAME> --key mkey:<key>

设备接入

在完成 Headscale 的配置之后,我们就可以将节点接入到我们的虚拟局域网中了。

Windows 接入

访问 <server_url>/windows 可以查到对应的接入方法。具体而言就是在下载好新版本的 tailscale 之后在带有管理员权限的 powershell 中运行:

tailscale login --login-server <server_url>

按照 powershell 中给出的指示拿到 <mkey> 完成节点接入即可。在连接上后不要忘了在设置中取消对使用 DNS 的勾选以及选择 run unattended 便于远程控制。

Linux 接入

首先按照指引完成 Linux 端的 tailscale 程序的安装 (可能要sudo)

curl -fsSL https://tailscale.com/install.sh | sh

默认安装程序在国内可能有网络问题,那就也可以在 官网 下载针对不同体系结构 build 好的二进制发行文件(想从 Github 自己源码编译也不是不行,但是我没试过)。下载之后按照官方教程进行安装并启动tailscaled.service就好。之后就是登录。

sudo tailscale up --login-server=<server_url> --accept-routes=true --accept-dns=false

之后就是和 Windows 接入一样的流程了。

群晖接入

首先要维持一个和群晖的 SSH 连接。还是在在 官网 下载了对应版本和体系结构的 headscale spk 文件(群晖的体系结构和系统版本可以 SSH 上去用 uname -a 看到)。传上去之后用

sudo synopkg install {{path/to/package.spk}}

验证密码即可安装。

接下来进行登录,登录和 Linux 就几乎一样了,只不过要多加一个 --reset 的 arg。

sudo tailscale up --login-server=<server_url> --accept-routes=true --accept-dns=false --reset

按照指引安装即可。

安卓接入

在 Google Play 中安装 headscale,打开之后暂时会找不到自定义服务器的入口。这时我们不用慌,快速点击右上角的菜单展开按钮,打开收起几次之后就有了。

输入 <server_url> 后返回点击 Sign in with other,之后的操作都是大同小异了。

不用服务端每次都确定: Pre-Authkey

生成 Authkey

docker exec headscale \
	headscale preauthkeys create -e <TIME> --user <USERNAME>

其中,<TIME> 指有效期长度,一般最好不要超过 24h,如果要使这个 Authkey 可被复用,请加入 --reusable 这个arg。

可以使用

docker exec headscale \
	headscale --user <USERNAME> preauthkeys list

使用 Authkey

在上面各种接入方式的命令行操作末尾加一句 --authkey <KEY>,即可,例如此时的 Linux 接入命令就是:

sudo tailscale up --login-server=<server_url> --accept-routes=true --accept-dns=false --authkey <KEY>

检查网络连通性

接入之后可以在服务器端使用

docker exec headscale \
	headscale nodes list

看到各个设备的上线状态和对应的 IP 地址,当然各个设备本身也能看到其他设备的 ip 地址(看不到上线状态但是可以ping)。

在 Linux 设备上我们执行:

ip route show table 52

可以看到新的路由表已经被建立了。之后就可以互相进行访问了。当然 tailscale 还有一些局域网打通之类的操作,但是由于人在校园网内网,为了安全性考虑还是不弄了。

本文参考了 Tailscale 基础教程:Headscale 的部署方法和使用教程 · 云原生实验室 (icloudnative.io)Headscale 搭建 P2P 内网穿透 - Kovacs (mritd.com)和相关官方文档中的一些内容,在此表示感谢。