12 KiB
Technical report — Parallel networking (Will-Executor anti-freeze) + UI feedback
Audience: external programmer / plugin maintainer
Author: AI refactoring work (on GitHub Bitcoin-after-life/test)
Date: 2026-06-15
Branch: feature/networking-parallelo (Pull Request #4)
Private Gitea repo kaibot/bal-plugin-ai: NOT modified (per explicit request; cloned read-only only to run the official tests).
1. Problem
When the plugin contacts the Will-Executor servers (pushing transactions, pinging/refreshing the inheritance, downloading the list, and checking transactions), it used to do it sequentially. If a server did not answer, the thread stayed blocked on the connection timeouts and, worse, on the retries:
send_requestretried up to 10 times withtime.sleep(3)on every timeout → roughly ~140 seconds per unreachable server, summed one after another.
Consequences:
- Noticeable already with a few servers; with 20 servers it became unusable.
- The user saw "Stay waiting — Not responding" with no idea what was happening.
- A single dead server blocked the whole operation.
2. Solution (overview)
- Parallelism with
ThreadPoolExecutor: servers are contacted concurrently. Total time ≈ the slowest server, not the sum. - Fast-fail for interactive operations (ping/info/download): no retry storm, a single short timeout and the server is marked "KO".
- Aggressive timeouts + global deadline for push/check: a short per-server retry budget is kept (a real transaction must survive a transient hiccup), but a wall-clock global deadline caps the whole batch so a dialog never freezes behind one unresponsive server.
- Live feedback + reliable elapsed-time counter:
on_each(...)updates the dialog as results arrive, andon_tick()refreshes an elapsed-time counter (Xs / DEADLINEs) so the user always knows progress and the maximum wait.
Why it is thread-safe
Network.send_http_on_proxy() uses asyncio.run_coroutine_threadsafe(coro, loop) and then coro.result(): every call schedules its own coroutine on
Electrum's shared asyncio loop and blocks only its own worker thread.
Multiple concurrent calls are therefore safe → ThreadPoolExecutor gives true
parallelism.
UI updates go through BalWaitingDialog.update() / the dialog's pyqtSignal,
which marshals to the GUI thread automatically. Callbacks from worker threads
can therefore update the dialog safely.
Why the counter is driven from the calling thread (important)
An earlier attempt refreshed the elapsed-time counter from a separate raw
threading.Thread heartbeat that emitted the pyqtSignal. That proved
unreliable: a pyqtSignal emitted from a raw (non-Qt) Python thread inside
the wizard's TaskThread was not reliably marshalled and the dialog never
repainted — the counter was invisible.
The fix: the parallel helpers accept an on_tick callback that is invoked
periodically from the CALLING thread (the same thread that already drives
on_each and successfully repaints). The helpers poll the futures in short
slices (concurrent.futures.wait(..., timeout=tick_interval)) and call
on_tick() between waits. No heartbeat thread is used anymore.
3. Modified files (all on GitHub Bitcoin-after-life/test, branch feature/networking-parallelo)
3.1 bal/core/willexecutors.py
Networking constants (module level, also exposed as Willexecutors class
attributes for a single source of truth in the GUI):
DEFAULT_TIMEOUT = 5 # interactive ops (ping/info/list)
PUSH_TIMEOUT = 8 # broadcast (pushtxs)
PUSH_MAX_RETRIES = 2
PUSH_RETRY_SLEEP = 1
PUSH_GLOBAL_DEADLINE = 30 # wall-clock cap for the whole parallel push
CHECK_TIMEOUT = 8 # check (searchtx)
CHECK_MAX_RETRIES = 1
CHECK_RETRY_SLEEP = 1
CHECK_GLOBAL_DEADLINE = 30 # wall-clock cap for the whole parallel check
Worst case per server is now ~26s (push) / ~17s (check) instead of ~140s, and the global deadline guarantees the dialog proceeds within 30s regardless.
send_request(...) — keyword-only retry controls:
def send_request(method, url, data=None, *, timeout=10, handle_response=None,
count_reply=0, max_retries=10, retry_sleep=3):
- Defaults unchanged → callers that need the historical behaviour are unaffected.
- Interactive callers pass
max_retries=0→ fast-fail.
get_info_task(...) — fast-fail by default (max_retries=0); a
timeout/empty response yields status="KO".
check_transaction(...) — now accepts timeout/max_retries/retry_sleep
(defaults from the CHECK_* constants) and forwards them to send_request,
replacing the old ~140s default storm.
NEW ping_servers_parallel(willexecutors, *, on_each=None, max_workers=8, timeout=DEFAULT_TIMEOUT, on_tick=None, tick_interval=1.0)
ThreadPoolExecutor; polls futures in slices and callson_tick()from the calling thread; mutateswillexecutorsin place; invokeson_each(url, we, ok)as results arrive; a worker exception never blocks the others (defensive try/except).
NEW push_transactions_parallel(willexecutors, *, on_each=None, max_workers=8, deadline=PUSH_GLOBAL_DEADLINE, on_timeout=None, on_tick=None, tick_interval=1.0)
- Parallel push only to entries that have a
"txs"key; each server keeps its short retry budget. on_each(url, we, ok, exc)per server;on_timeout(url, we)for servers still pending when the global deadline elapses;on_tick()for the counter.- Manual pool (no
with) soshutdown(wait=False, cancel_futures=True)does not block on a hung worker once the deadline is reached. - Returns
{url: (ok, exc)}for the servers that answered in time.
NEW check_transactions_parallel(items, *, on_each=None, max_workers=8, deadline=CHECK_GLOBAL_DEADLINE, on_timeout=None, on_tick=None, tick_interval=1.0)
- Same design as the push helper but for the Check (searchtx) operation.
itemsis an iterable of(wid, url)pairs;_check_onecallscheck_transaction.on_each(wid, url, result_or_None, exc),on_timeout(wid, url),on_tick().- Returns
{wid: (result_or_None, exc)}.
3.2 bal/gui/qt/window.py
ping_willexecutors_task(self, wes)rewritten onping_servers_parallelwith live feedback and a counterPing Will-Executors: 2/3 (3s / 30s)driven byon_tickfrom the calling thread.push_transactions_to_willexecutors(self, force=False)rewritten onpush_transactions_parallel;on_eachdoes thread-safe book-keeping + UI update; "already present" servers are verified afterwards (original check logic intact).check_transactions_task(self, will)rewritten oncheck_transactions_parallel; showsChecking transactions: 2/5 (4s / 30s), reusing the originalset_check_willexecutor(...)per-item logic insideon_each(andset_check_willexecutor(None)onon_timeout).fetch_will_executors_list(...)fast-fail download (timeout=10, max_retries=1, retry_sleep=1); the download dialog showsDownloading will-executors list... (Xs / 45s).
3.3 bal/gui/qt/dialogs.py
BalBuildWillDialog.loop_push(the "Building Will" wizard broadcast step) rewritten onpush_transactions_parallelwith theon_tickcounterBroadcasting 2/3 (5s / 30s). The previous raw heartbeat thread was removed.
3.4 bal/gui/qt/plugin.py — status-bar icon (restored)
create_status_bar re-adds the BAL StatusBarButton (bottom-right of the
Electrum status bar). It shows that the plugin is installed and opens the plugin
settings on click; it also de-duplicates the button per window. (Comments in
English.)
3.5 bal/gui/qt/lists.py and bal/gui/qt/widgets.py — GUI usability
- Tooltips (hover) on the Will toolbar icons, all in English:
Wizard (
Wizard - Build your will), Delivery time (truck), Check Alive (siren), Calendar, Check (refresh). - Toolbar order changed to:
Wizard | Delivery time | Check Alive | Calendar | Check; layout margins tightened so everything fits the Will window.
3.6 bal/core/util.py — BUGFIX (pre-existing regression)
In get_value_amount (line 324) Util.in_output(...) (returns bool) had been
used instead of Util.din_output(...) (returns the tuple
(same_amount, same_address)), causing:
TypeError: cannot unpack non-iterable bool object
Fixed by restoring din_output. Found by running the official Gitea tests
(tests/test_core_util.py::test_get_value_amount).
4. Verification (ruff + official tests)
4.1 ruff (lint / PEP8)
ruff checkon the new code: no new issues introduced. TheF403/F405/ F401warnings come from the originalfrom .common import *pattern; per-file counts are identical between HEAD and the working tree.- The new parallel functions add 0
E501(line-length) issues; inwindow.pythe count actually decreased after the rewrite. ruff check tests/parallel_ping_test.py→ no new issues.
4.2 Official tests from the Gitea repo kaibot/bal-plugin-ai/tests
Run against the refactored code (with all the networking + UI changes):
| Suite | Result |
|---|---|
test_core_* + test_gui_* (pytest) |
182 passed |
smoke_test.py |
OK |
external_zip_test.py |
OK |
windows_overflow_test.py |
OK |
gui_fixes_test.py |
OK |
parallel_ping_test.py (new) |
OK — parallel ping/push/check ~0.50s for 8 servers (sequential would be ~4.00s); global deadline enforced; on_tick fired from the calling thread; static checks that the dialogs use the parallel helpers + the Xs / Ns counter |
Commands (as per README):
QT_QPA_PLATFORM=offscreen PYTHONPATH=<electrum-src> \
python3 -m pytest tests/ -q
QT_QPA_PLATFORM=offscreen PYTHONPATH=<electrum-src> \
python3 tests/smoke_test.py electrum.plugins.bal
QT_QPA_PLATFORM=offscreen PYTHONPATH=<electrum-src> \
python3 tests/external_zip_test.py bal-electrum-plugin.zip
QT_QPA_PLATFORM=offscreen PYTHONPATH=<electrum-src> \
python3 tests/parallel_ping_test.py bal
5. Integration notes / risks
- No change to the server protocol: only the how (parallel) and the when (retries/deadline) of the calls changed, not the payloads.
- Push transactions: per-server retries are intentionally kept so a real
transaction is not lost to a transient hiccup; only ping/info/download use
fast-fail. The global deadline marks unanswered servers as failed (
on_timeout) so the user can retry later. max_workers=8is conservative; with many servers (e.g. 20) it can be raised, but 8 workers already collapse the total time to the slowest server.- Thread/UI: all UI updates from workers go through
pyqtSignal-based dialog updates; the periodic counter is driven byon_tickfrom the calling thread. Do not reintroduce a raw heartbeat thread emitting signals — it does not repaint reliably. - Compatibility: signatures are backward compatible (new parameters are keyword-only with defaults that preserve the old behaviour).
6. How to test
- Install
bal-electrum-plugin.zip(Tools → Plugins → install from file). Fully close and reopen Electrum to avoid the cached zip import. - Configure several Will-Executors, including at least one unreachable.
- Run push / ping / Check: each dialog shows per-server status plus a counter
N/total (Xs / 30s)and no longer freezes on the dead server — within the global deadline the operation reports the dead server and proceeds.
The SHA-256 of the zip is printed by build_zip.py at the end of the build
(use it to verify integrity).