$ whoami

Infrastructure & backend
engineer who builds, ships
and operates what he sells.

VPN fleets, self-hosted infra, Telegram apps, and the boring glue that keeps them running. I work hands-on in production — not from a deck, not behind a sales team.

VPN & anti-censorship: Remnawave · Xray-core · multi-node fleets
Self-hosted infra: Proxmox · NetBird mesh · PBS · observability
Backend & bots: Telegram Mini Apps · billing · automation
~/infra/mesh.txt
read-only
// production mesh — netbird overlay

        ru-bride  ──┐         ┌── exit-est-1 
                      │  ┌─────┐  │
        ru-white  ──┼──┤ ger ├──┼── exit-fin-1 
                      │  └─────┘  │
        mt-q      ──┘     │     └── exit-pl-1      routing
                       
            RU → direct      !RU → tunnel

$ nb status | head -3
peer ru-bride  3s ago
peer ru-white  7s ago
peer germany   2s ago
// availability · Europe (CET) · accepting work for Q3
// 01 · services · six things i do well

Six things I do well. For everything else, I'll tell you up front if I can't help.

VPN & anti-censorship infrastructure

[xray ▸ remnawave ▸ marzban]
01
  • Remnawave / Marzban / 3x-ui panel deployments & ops
  • Xray-core configs: VLESS-Reality, Trojan, SS-2022, mux
  • Bridge architectures: entry → exit via friendly jurisdictions
  • Custom subscription templates · split-tunnel RU→direct · DPI bypass
  • Mobile-operator diagnostics: IPv6 leaks, MTU/MSS, Happy Eyeballs
Xray · Remnawave · Marzban · VLESS-Reality · SS-2022

Self-hosted infrastructure

[proxmox ▸ netbird ▸ pbs]
02
  • Proxmox clusters, VMs/LXCs, PBS backups with retention & verify
  • NetBird / Tailscale / Wireguard mesh between nodes
  • Hardening: UFW, fail2ban, traffic-guard, SSH ProxyJump, lockout-protection
  • Observability: Prometheus / Grafana / Loki + Telegram alerts
  • OVH dedicated, Hetzner, bare-metal migrations without downtime
Proxmox · NetBird · PBS · UFW · Prometheus · Grafana

Custom backend & automation

[perl ▸ ts ▸ python]
03
  • Telegram bots & Mini Apps — React/TS/Vite over real billing
  • Billing: subscriptions, invoices, service provisioning via queues
  • ETL / log-analysis pipelines (Xray analyzer, SQLite, agents)
  • Scripting: Bash, Python, Perl, Node/TS — whatever fits the task
  • MTProto proxies with FakeTLS and tuned routing
Perl 5 · Node/TS · Python · MySQL · Redis · Bot API

Incident response & rescue work

[when it's already on fire]
04
  • When production is down — bring it up now, root-cause later
  • Mobile clients lagging on Telegram — find that it's IPv6, fix the route
  • Panel migrated, subscriptions broken — restore without losing customers
  • Post-mortem with a runbook, not a Slack message
on-call · forensics · runbooks

OSINT, intelligence & C2 systems

[search ▸ leak ▸ darkweb ▸ c2]
05
  • OSINT API platforms: search, leak, darkweb, person tracking, reverse-proxy
  • Identity / background-check pipelines over Elasticsearch + Mongo data lakes
  • C2 dashboards with real-time Mapbox, alert workflows, asset & UAV dispatch
  • SIGINT collection surfaces — IMSI/IMEI intercept logs, satcomm monitoring
  • Multi-source ingestion: API connectors, ES indexing, Postgres for relational
OSINT · Elasticsearch · MongoDB · Mapbox · SIGINT

AI-enabled products

[bots ▸ assistants ▸ mcps]
06
  • Telegram support bots with AI conversational context over a real ticket DB
  • B2B sales assistants over data lakes — schema, query, summarize, route
  • MCP servers for security tooling, observability and incident workflows
  • LLM gateways via Envoy AI Gateway, model-pool routing, cost guardrails
  • Product, not pitch — AI is a feature with a measurable job, not the headline
MCP · Hono · TypeScript · Envoy AI GW · pgvector
// 02 · selected work · ten things i shipped · click to expand

Real systems in production. Numbers, not adjectives.

CASE 01

Telegram lags on mobile, fixed at the routing layer

network-debugging · anti-censorship · xray · ipv6
problem: intermittent 30s Telegram stalls on LTE; Wi-Fi & desktop fine
fix: v6 black-hole via ::/0 → BLOCK on all 6 exit profiles + 2 bridges
stalls → 0
detail

// problem

Users on a private VPN service reported that Telegram was noticeably laggy on phones over LTE, while desktop and Wi-Fi felt fine. Symptoms didn't match a typical "bad node" story — exit nodes were healthy, Telegram-owned domains resolved and responded in 100–300ms from every exit.

// diagnosis

Captured packets on a real phone, not a synthetic test. The failure was not in the VPN path — it was Happy Eyeballs. Carriers handed dual-stack, the client opened the IPv6 socket through the tunnel, 6 of 8 exits had no IPv6, and the kernel reject + xray buffering added 200–500ms before fallback. Telegram opens dozens of concurrent sockets; the delay surfaced as "laggy UI".
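
That per-socket cost is easy to model offline. A minimal Python sketch of the failure mode (illustrative only — the real evidence was a packet capture on a phone; the `connect` callable is injected so nothing touches the network):

```python
import time

def first_usable_connect(connect, v6_addr, v4_addr, fallback_penalty=0.3):
    """Model of the Happy Eyeballs failure seen in the capture: the v6
    socket is tried first, dies inside the tunnel, and the client eats a
    fallback penalty (kernel reject + proxy buffering) before the v4
    path succeeds. `connect` stands in for a real TCP dial."""
    start = time.monotonic()
    try:
        connect(v6_addr)                 # v6-first, as dual-stack clients do
        used = "v6"
    except OSError:
        time.sleep(fallback_penalty)     # 200-500ms in the real trace
        connect(v4_addr)                 # fallback path that actually works
        used = "v4"
    return used, time.monotonic() - start
```

Multiply that per-socket penalty by the dozens of sockets Telegram opens concurrently and "laggy UI" falls out of the arithmetic.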

// solution

Two-layer fix. Server-side: instant ip6tables reject so clients fall back immediately. Proper fix in Xray routing.rules, applied across all 6 exit profiles and both RU bridges:

// xray routing
"rules": [
  { "type": "field", "ip": ["::/0"], "outboundTag": "BLOCK" }
],
"outbounds": [
  { "tag": "BLOCK", "protocol": "blackhole" }
]

Rollout strictly one node at a time with verify-after-each (sub-3s xray reload, client auto-reconnect).

// result

Mobile Telegram latency restored to feel parity with desktop on every tested LTE carrier. ~15 minutes per node including verification. No client-side changes required.

// stack

Xray-core · Remnawave · ip6tables · tcpdump · mobile profiles
CASE 02

6 subscription templates migrated — zero downtime, paying customers stayed up

vpn-infrastructure · migration · remnawave
problem: 6 templates drifting; RU traffic accidentally tunneled; 2-line SINGBOX bug
fix: minimum-diff sync per template, rollback documented, RU-bypass UX locked
0 downtime
detail

// problem

Live VPN service had 6 subscription templates (MIHOMO, CLASH, XRAY_JSON Default/Balance, SINGBOX, STASH) drifting from current best practices. RU traffic was getting routed through VPN in some templates (defeating local-bypass). One template had a 2-line bug. Couldn't break active subscriptions — paying customers were on these templates right now.

// diagnosis

Pulled all 6 from /api/subscription-templates and snapshot-saved them. Diffed each against the upstream community config to identify minimum-diff changes. Flagged the SINGBOX bug (RU bundle pointing to proxy instead of direct). Noticed the MIHOMO 🏠 RU group was select[DIRECT, VPN] — one tap from breaking RU-bypass.
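
The snapshot-and-diff step is mechanical. A rough Python sketch of its shape (names and the snapshot directory are mine; the real workflow talked to the Remnawave REST API with JWT auth):

```python
import difflib
import pathlib

def snapshot_and_diff(live: str, upstream: str, name: str,
                      snap_dir: str = "snapshots") -> str:
    """Save a rollback copy of the live template first, then return the
    minimum unified diff against the upstream config for review."""
    d = pathlib.Path(snap_dir)
    d.mkdir(parents=True, exist_ok=True)
    (d / f"{name}.bak").write_text(live)      # rollback artifact before anything else
    return "".join(difflib.unified_diff(
        live.splitlines(keepends=True),
        upstream.splitlines(keepends=True),
        fromfile=f"{name} (live)",
        tofile=f"{name} (upstream)",
    ))
```

Fetch each template from the panel API, run this, and only the reviewed hunks ever get applied back.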

// solution

Per-template minimum-diff:

· MIHOMO/CLASH — sync from upstream + lock 🏠 RU to [DIRECT] only
· XRAY_JSON Default — sync from Lite, keep inbounds/outbounds
· XRAY_JSON Balance — sync + auto_balancer + injectHosts + burstObservatory
· SINGBOX — fix 2-line bug, no other changes
· STASH — untouched (risk > reward)

Each applied via panel API with a per-template rollback command saved to README.

// result

All 6 migrated in one session, zero subscription disruption. Per-template rollbacks documented and tested. RU-bypass UX hardened against accidental misconfiguration.

// stack

Remnawave REST + JWT · Clash · Mihomo · SingBox · Xray · YAML · JSON · git
CASE 03

Built a fleet-wide Xray log analyzer for abuse detection & forensics

backend · observability · go · postgres · open-source
problem: 0 visibility on a multi-node fleet; bridge hides real users behind one UUID
built: Go agent → WebSocket → Postgres 17 → Next.js dashboard, with bridge-correlation
⭐ 20 · 8/8 nodes
detail

// problem

A multi-node VPN fleet had no visibility into what users were doing through it. No way to investigate abuse complaints against specific destinations. No automated detection of port-scan patterns. Bridge architecture made it worse — hundreds of real users hid behind one synthetic identity at the exit node.

// diagnosis

Reviewed off-the-shelf solutions — none handled bridge-correlation, none had Telegram-friendly ergonomics. Sized the data: ~50–500MB/day per node. Needed ≥7d active + ≥90d historical retention.

// solution

Two-tier system in Go. Per-node agent tails access.log, batches entries, ships over WebSocket to central. Server normalizes & stores in Postgres 17, exposes a Next.js dashboard. Key subsystems:

· bridge correlation — time-based fan-out (±15s) writes one row per candidate to bridged_flows; most-frequent over a window = most likely originator
· anomaly detectors — port_scan, abuse_port_flood, burst_scan, blacklist via threat_matches
· anti-hang patch — configurable pongWait / pingPeriod / writeWait + TCP keepalive, replaced "drop entries → OOM → restart" with auto-reconnect
· SQLite→Postgres 17 migration — hybrid copy, 6s cutover, rollback warm. 65+ storage tests on testcontainers-go.
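
The fan-out itself is small enough to sketch. Python here for illustration — the real agent is Go, and the tuple shapes below are mine, not the repo's:

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW = timedelta(seconds=15)  # the ±15s fan-out window from the write-up

def correlate(exit_flows, bridge_entries):
    """For each exit-side flow, every bridge entry within ±15s is a
    candidate originator (one row per candidate, like bridged_flows);
    the most frequent candidate across the set of flows is the most
    likely real user. Flows and entries are (timestamp, id) tuples."""
    candidates = Counter()
    for ts, _exit_uuid in exit_flows:
        for entry_ts, user_ip in bridge_entries:
            if abs(entry_ts - ts) <= WINDOW:
                candidates[user_ip] += 1
    return candidates.most_common(1)[0][0] if candidates else None
```

The production version does this in SQL over bridged_flows; the shape of the attribution logic is the same.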

// result

All 8 fleet nodes reporting within a week of agent rollout (one node at a time, strict verification). Single SQL query attributes abuse to a real user IP within seconds. Survives WebSocket proxy flakiness — no manual restarts since the anti-hang patch. Open-sourced — 20+ operators run it now.

github.com/qwertyhq/xray-analyzer

// stack

Go 1.25 · gorilla/websocket · pgx/v5 · Postgres 17 · Next.js · Docker · testcontainers-go
CASE 04

MTProto proxy with full observability — custom Erlang patches on top of seriyps

anti-censorship · observability · erlang · prometheus
problem: no metrics, no GeoIP, no real-session count — only raw TCP conns
built: 9 Erlang patches — 2 new modules + 7 mods; 22 Prometheus metrics exported
33/38 panels live
detail

// problem

Standard MTProto distributions (mtg) had dropped ad-tag support. The Erlang seriyps fork kept ad-tag + FakeTLS + Secure-dd — but no Prometheus, no GeoIP, no way to see unique real Telegram sessions vs raw TCP connection counts.

// diagnosis

Read the source. mtp_metric exposed counter callbacks but no backend was wired up. No GeoIP module. The auth_key_id (first 8 bytes of the first decoded MTProto packet) was already parsed and then discarded — free signal for unique-session counts.

// solution

9 patches — 2 new modules + 7 modifications:

· mtp_prometheus_metric.erl — metric backend, registers active/passive, runs prometheus_httpd on :9091
· mtp_unique_users.erl — gen_server + 2 ETS tables (IP→country, auth_key_id→last_seen), 5-min sliding window, GeoIP via locus
· mtp_handler.erl — captures auth_key_id from the first decoded packet
· mtp_full.erl / mtp_down_conn.erl — counters on CRC/seq mismatch, handshake timeout, keepalive

Hard lessons baked in: iptables must ACCEPT lo first or curl 127.0.0.1 times out. Em-dash in an Erlang docstring → 500 from prometheus_http_impl. Per-IP labels = cardinality blow-up. ulimits.nofile=65536 or Ranch chokes around 1k conns.

// result

33/38 dashboard panels render live data (87%) — the 5 [no data] panels are conceptually impossible without rewriting state machines. "Real TG sessions" via unique_auth_keys_5m correlates within expected variance with MTProxybot's own daily count. Per-country breakdown (RU/PK/UA/DE/US) via GeoLite2.

github.com/qwertyhq/telemt-install

// stack

Erlang/OTP · prometheus.erl · prometheus_httpd · locus (MaxMind) · Docker · Grafana
CASE 05

OVH dedicated build-out: Proxmox + NetBird mesh + PBS, end-to-end

self-hosted-infra · proxmox · netbird · backups
problem: mixed VPSes, inconsistent backups, no private mesh, exposed control plane
built: Ryzen 9700X + ZFS mirror, Proxmox 9, NetBird overlay, PBS in LXC
1 host, all VMs
detail

// problem

Multi-VM setup on a mixed bag of small VPSes was painful to operate: backups inconsistent, no private mesh between hosts, monitoring fragmented, deployments depended on hosting-provider quirks. Needed to consolidate onto owned hardware without losing small-VM ergonomics.

// diagnosis

Sized the workload: 3–4 production VMs, ~64GB RAM, snapshot-friendly FS, growth headroom. Chose OVH dedicated (Ryzen 9700X, 64GB, ZFS mirror) over hyperscaler — better $/RAM, full control over kernel/networking.

// solution

End-to-end build:

· Proxmox VE 9 on ZFS mirror, per-VM zvols, 15-minute ZFS snapshots
· PBS as an LXC on the same host, daily 4:00 backups across all VMs
· R2 offsite sync (WIP) for 3-2-1 backup policy
· NetBird mesh as data plane — VMs + Mac client all peers on 10.10.10.0/24 overlay; no public ports on app VMs
· External edge node terminates TLS (Caddy) and socats into the mesh — clean separation between exposed surface and private infra
· Prometheus + Grafana + Loki, alerts piped to Telegram

// result

One dedicated host runs SHM billing VM, VPN panel VM, frontend VM, and PBS — with capacity headroom. Backups verified end-to-end (PBS restore tested, ZFS rollback tested). Documented in a dedicated infra/ repo with runbooks.md + known-issues.md.

// stack

Proxmox VE 9 · ZFS · Proxmox Backup Server · NetBird · Caddy · Prometheus · Grafana · Loki
CASE 06

Locking down a dedicated server without locking myself out

security · network · incident-response · iptables
problem: public :22 & :8006 are liabilities; closing without lockout is the trick
built: systemd-run auto-rollback pattern; two independent access paths
surface → :51820/udp
detail

// problem

A scanning incident upstream made it clear that public :22 and Proxmox UI :8006 were liabilities. Closing them without breaking access was the actual challenge — the same firewall change that hides you from scanners can lock the operator out if the NetBird daemon hiccups.

// diagnosis

Inventoried external surface: :51820/udp NetBird mesh (the new control plane, keep), and :22 / :8006 / :25 / :111 / :8080 / :9100 / :9115 / :9874 / :3128 — all DROP candidates. Mapped existing iptables — NetBird daemon installs dynamic ACL chains that get flushed by iptables-restore. That's the foot-gun.

// solution

A safety-net pattern I now use for every iptables change on prod:

TS=$(date +%Y-%m-%d-%H%M)
iptables-save > /root/iptables.runtime.bak.$TS

# Schedule auto-rollback in 10 minutes
systemd-run --unit=fw-rollback --on-active=10min \
  /usr/sbin/iptables-restore /root/iptables.runtime.bak.$TS

# ...make changes... verify access works...
# If verification passes:
systemctl stop fw-rollback.timer fw-rollback.service

Then the lockdown itself: custom NETBIRD-SAFETY chain DROP-ing everything on vmbr0 except :51820/udp + ICMP. SSH alias switched to NetBird IP with ProxyJump via edge node as fallback.

// result

Public surface reduced to :51820/udp + ICMP. Scanners see nothing else. Two independent access paths (direct NetBird, ProxyJump fallback) — no single point of failure. Rollback pattern is reusable, documented, exercised.

// stack

iptables · nftables · NetBird · systemd-run · SSH ProxyJump
CASE 07

Realtime layer that survived a hostile reverse proxy

backend · realtime · perl · react
problem: WebSocket died after ~1s in prod — Caddy security plugin quietly cut the tunnel
fix: dual-protocol server on one port; SSE primary, WS fallback; transport swappable by env var
SSE through any proxy
detail

// problem

The Telegram Mini App needed realtime push for billing/service events. Initial WebSocket implementation worked locally but died on production: connections closed after ~1 second under the production reverse proxy (a Caddy build with a security plugin that quietly broke the WebSocket tunnel).

// diagnosis

Reproduced with Node's raw HTTP/1.1 client — WebSocket handshake completed, then the connection got cut mid-stream. Browser WebSocket API (HTTP/2 Extended CONNECT) failed even harder. The proxy was the problem, not the server or the client. Two options on the table: replace the proxy stack (high-risk for a single feature), or switch transport — drop to Server-Sent Events, which is just a long-lived HTTP response and works through every HTTP proxy in the world.
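
Why SSE passes where WebSocket dies: the wire format is an ordinary chunked HTTP body. A minimal Python sketch of the framing (headers and event names are illustrative — the production server is Perl):

```python
import json

# The response headers that make a long-lived HTTP response into an
# event stream; X-Accel-Buffering matches the "disable buffering" note.
SSE_HEADERS = {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "X-Accel-Buffering": "no",
}

def sse_frame(event: str, data: dict) -> bytes:
    """One SSE event: named event line + JSON data line, terminated by
    a blank line. To a proxy this is just more response body."""
    return (f"event: {event}\n"
            f"data: {json.dumps(data)}\n\n").encode()
```

No upgrade handshake, no tunnel for a middlebox to cut — which is the whole reason the transport swap worked without touching the proxy.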

// solution

Picked SSE and built it so WebSocket remained a fallback:

· Single realtime-server.pl handles both protocols on the same port
· SSE became the primary transport, WebSocket kept for future bidirectional needs
· Backend wired into Redis pub/sub (shm:events) — same notify path, different transport
· Frontend: shared event handler handleRealtimeEvent; separate transport hooks useSSEUpdates / useWebSocketUpdates; transport selectable via VITE_REALTIME_TRANSPORT
· nginx sse-location.conf disables buffering on /sse

// result

SSE works reliably through the production proxy — no patches to the proxy itself, no version pin. Transport is swappable by env var, so future-proof if the proxy stack changes. One realtime server, two protocols, single code path for events.

// stack

Perl 5 · Redis pub/sub · Server-Sent Events · React + TypeScript · Vite · nginx + Caddy
CASE 08

HQ VPN — cross-platform desktop client on Mihomo

vpn-infrastructure · desktop · electron · mihomo
problem: third-party Clash/Stash clients fight the RU-bypass UX and aren't branded for the service
built: Electron + Vite + React shell over a bundled Mihomo core; floating widget; SSID auto-pause; profile sync from Remnawave subscription
Mac + Win builds
detail

// problem

Subscribers were spread across Clash, Mihomo, Stash and SingBox clients. None of them are branded for the service, RU-bypass ergonomics are fragile (one tap from sending RU traffic through the tunnel), and support requests need to start with "which client are you on, what version, what mode". A single native client owned by us removes that whole class of friction.

// diagnosis

Mihomo (the Clash-fork proxy core) has a clean REST API and ships statically-linked binaries for Mac/Win/Linux — perfect to bundle. Electron + Vite gives one codebase for Mac and Windows, native window chrome, and access to OS-level signals (SSID, HWID, OS version) we need for support and auto-pause.

// solution

Electron app shape:

· Main process — owns the bundled mihomo binary, IPC handlers, system-proxy toggling, elevation for first-run admin tasks
· Renderer — React + Radix UI; pages for connections, proxies, rules, DNS, TUN, sniffer, logs, system proxy, resources, settings
· Profile sync — pulls subscription from the Remnawave URL, runs buildControlledOverlay + cleanProfile to merge app config and strip user-mutable bits (DNS / TUN / sniffer / external-controller / LAN IPs)
· Floating window — borderless traffic widget with context menu, lives outside the main window for menubar-style quick toggle
· SSID-aware auto-pause — reads current Wi-Fi SSID, pauses the tunnel on configured trusted networks (so home Wi-Fi traffic skips the VPN automatically)
· Device info — HWID / OS / version / model exposed for support correlation
· i18n + custom theme system (CSS custom properties, swappable at runtime)

// result

One codebase ships to electron-builder --mac and --win. Subscribers get a branded client with the bypass model locked correctly, support gets device fingerprints in one click, and the proxy core (Mihomo) is the same as the community standard — so debugging tracks back to upstream behaviour, not a custom fork.

// stack

Electron · electron-vite · React · TypeScript · Radix UI · Mihomo · Remnawave subscription API · electron-builder
CASE 09

Maritime C2 dashboard — real-time vessel tracking + SIGINT for an EEZ operator

c2-systems · intelligence · mapbox · sigint · frontend
problem: operator team context-switched across alerts / vessels / assets / SIGINT tools to run one EEZ shift
built: Next.js 15 + Mapbox GL console with alerts, dispatch (navy / CG / UAV), UAV telemetry, SIGINT IMSI/IMEI intercept, case lifecycle
one console
detail

// problem

A maritime client monitoring an exclusive economic zone (EEZ) ran their watch across half a dozen single-purpose tools: vessel registry in one place, AIS alerts in another, asset dispatch by phone, UAV telemetry in a separate viewer, SIGINT in a closed CLI. Investigations spanned hours of context-switch before someone could say this boat, this incident, this action.

// diagnosis

The fix wasn't more tools — it was one operator console that put the map first. Side panels for situational data (alerts, vessels, asset status), modals for time-critical actions (dispatch, contact SOP, case open), and a case surface that ties an alert all the way through to closure.

// solution

Built as a single-page operator workstation, deployed to the client's ops team:

· Map-first layout — Mapbox GL via react-map-gl, AIS-status colored vessel markers, EEZ polygon overlay, zoom-aware clustering
· Alert workflow — severity-graded panel, acknowledge / escalate / open-case actions, audit trail
· Asset dispatch — navy / coast guard / UAV inventory, dispatch dialog with VHF / SatComm contact SOP wired in
· UAV control — live feed slot, telemetry panel, remote command hand-off
· SIGINT collection — IMSI/IMEI intercept log, satcomm session metadata, attribution to vessel-of-interest
· Case management — full lifecycle: alert → investigation → dispatch → contact → closure with evidence pack

Built on Next.js 15 App Router, TypeScript 5.7, shadcn/ui primitives, Zustand state, Tailwind 4. Playwright for end-to-end coverage of the alert→case workflows.

// result

One operator surface replaces five. Alert acknowledgment to asset dispatch is two clicks. SIGINT and case history live in the same audit thread as the alert that triggered the investigation. Stack is plain web — no Java client, no installer, no platform lock.

// stack

Next.js 15 · TypeScript · Mapbox GL · react-map-gl · shadcn/ui · Zustand · Tailwind 4 · Playwright
CASE 10

MTBolt — high-performance MTProto proxy in C, scaling to 10M+ connections

anti-censorship · systems-programming · c · performance · open-source
problem: existing MTProto distributions ceiling out around 100k–1M conn/process; no built-in per-country observability
built: from-scratch C99: TCP_DEFER_ACCEPT, jemalloc, multi-worker shared-mem IP stats, native Prometheus, RELRO/PIE/FORTIFY hardening
GPLv2 · 10M+ conn target
detail

// problem

I already had an Erlang MTProto stack (see Case 04) tuned for observability — but observability and absolute scale are different problems. At provider-level concurrency (millions of live TCP sessions), Erlang runtime overhead becomes the bottleneck. I wanted a clean C implementation with kernel-level acceleration and no language-runtime baggage.

// diagnosis

The pieces needed: kernel-side filtering of bad TCP attempts before they hit userspace (TCP_DEFER_ACCEPT), a memory allocator that survives high churn without fragmentation death (jemalloc with background-thread GC), shared-memory IP stats so multi-worker counters don't need a coordinator, and a hardened build so a hostile peer can't buffer-overflow the proxy into something unpleasant.

// solution

C99 from scratch. The build-time decisions matter as much as the runtime ones:

· TCP_DEFER_ACCEPT — kernel drops zero-byte connections before they cost a syscall in userspace
· jemalloc with MALLOC_CONF=background_thread:true,dirty_decay_ms:5000,muzzy_decay_ms:30000 — survives the allocation pattern of millions of short-lived sessions
· Multi-worker, shared-memory IP table — workers count uniques without coordinator overhead; dynamic table scales to 10M+ entries
· Native Prometheus metrics — per-country unique IPs, Russian region breakdown, online/total — fed by libmaxminddb + GeoLite2-City
· Runtime config — TOML file + --cli-flag overrides; no recompile to retune workers, GeoIP path, stats endpoint
· Build hardening — stack-protector, FORTIFY_SOURCE=2, PIE, full RELRO, BIND_NOW
· systemd LimitNOFILE=10485760, TasksMax=infinity, BBR congestion control on the host
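
The TCP_DEFER_ACCEPT piece is a single socket option. A Linux-only Python sketch to show the shape (MTBolt does the equivalent in C at listener setup; port and backlog here are arbitrary):

```python
import socket

def deferred_listener(port: int, defer_secs: int = 5) -> socket.socket:
    """Listener where the kernel completes the TCP handshake but only
    wakes userspace once the peer actually sends data -- so zero-byte
    scanner connections never cost an accept() round. Linux-only."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # defer_secs: how long the kernel waits for first data before giving up
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_DEFER_ACCEPT, defer_secs)
    s.bind(("0.0.0.0", port))
    s.listen(1024)
    return s
```

At provider scale this is the difference between scanners costing kernel state and scanners costing syscalls.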

// result

Production-grade proxy with native observability and a hardened binary. Open-sourced under GPLv2 — the goal is for other operators running provider-scale Telegram traffic to have a sane baseline to start from.

github.com/dpibreaker/mtbolt

// stack

C99 · OpenSSL · jemalloc · libmaxminddb · Prometheus · BBR · systemd · cppcheck
root@poland-1 · ssh · 80×24 · NetBird mesh
tmux:0
root@poland-1 ~ $ xray version
Xray 1.8.24 (Xray, Penetrates Everything.)
Go 1.22.4 linux/amd64
A unified platform for anti-censorship.
root@poland-1 ~ $ systemctl is-active xray && uptime -p
active
up 47 days, 13 hours, 22 minutes
root@poland-1 ~ $ ss -H -tnp state established '( sport = :443 )' | wc -l
1284
root@poland-1 ~ $ ./xray-analyzer report --since=24h --top=3
region  share   p50    p95     err
EU      62.4%   18ms   41ms    0.02%
RU      21.0%   27ms   58ms    0.31%
ASIA    16.6%   73ms   142ms   0.07%
root@poland-1 ~ $ _
// 03 · open source · tools i build for the work i do

If you operate VPN infrastructure, you may already be running one of these.

github.com/qwertyhq  ·  18 public repos  ·  since 2021
// also active on forks: mtproto_proxy (seriyps), traffic-guard, remnawave-scripts, shm-fork, hexstrike-ai
// 04 · stack · tools in active rotation

What's currently on disk. No buzzwords, only things I've shipped with.

// networking

Xray-core · Mihomo · Wireguard · NetBird · Tailscale · iptables · nftables · UFW · BGP basics · MTProto · VLESS-Reality · Shadowsocks-2022

// infra

Proxmox · PBS · Docker · K8s · Helm · ArgoCD · GitOps · OVH · Hetzner · Caddy · nginx · HAProxy · gVisor · MinIO

// backend

Perl 5 · Node/TS · Python · Bash · C · Hono · Drizzle · Expo · MySQL · Postgres · Redis · MongoDB · Elasticsearch · SQLite · RabbitMQ

// frontend

React · TypeScript · Vite · Zustand · TailwindCSS · shadcn/ui · Radix UI · Mapbox GL · Telegram WebApp SDK · Electron · Next.js

// observability

Prometheus · Grafana · Loki · Alertmanager · Sentry · SignOz · OpenReplay · SonarQube · node_exporter · blackbox_exporter

// security & ai

SSH hardening · fail2ban · traffic-guard · MTProto engine · ProxyJump · WG-only mgmt · semgrep · trivy · MCP servers · Envoy AI Gateway · pgvector
// 05 · how i work · four rules, no exceptions

Boring on purpose. Nothing here is clever — it's just what keeps production up.

step.01

Diagnose first.

Logs and packets before hypotheses. No "let's try restarting" before we know what's actually happening.

step.02

One change at a time.

Roll out across the fleet one node at a time. Verify, then move on. No big-bang deploys.

step.03

Backups before changes.

Proxmox / PBS snapshots, configs in git, one-line rollback path. Always.

step.04

Document what stays.

Runbooks and architecture notes go to the client. Not "in my head" — on paper, in repo.

// 06 · about · plain text, first person

I'm a senior infrastructure & backend engineer. I run my own VPN service end-to-end — panel, fleet, billing, Mini App — and ship OSINT, intelligence and AI-enabled products outside of it. When I show up to a client's mess, I've already debugged the same class of problem at 3 a.m. on my own production.

I'm not a consultant, not an agency, not a strategy deck. I work hands-on: shell, configs, packet captures. AI ships when it earns its place — support bots that resolve, sales assistants over real data lakes, MCPs that wire actual tooling, not pitch slides. If I can't help, I'll say so up front instead of pretending.

I treat client systems the way I treat my own: backups before changes, one node at a time, runbooks left behind. Nothing clever — just the discipline that keeps things up at month six, not month one.

based: Europe · CET
availability: accepting Q3 work
engagement: contract · retainer · triage
response: under 24h on weekdays
languages: EN · RU · DE (B1)
not for: web3, growth hacking, AI strategy decks
// 07 · contact · three ways in

Tell me what's broken.

The more concrete you are — what panel, what version, what the logs say — the faster I can tell you whether I can help. If I can't, I'll point you at someone who can.

Quick triage call · 15–30 min · free. We get on a call, look at logs together, and figure out if this is something I should take on. No pitch, no slides.
// preferred: Telegram. Replies are usually same-day, CET hours.