Troubleshooting
This chapter is organized around the symptoms you actually see, not around an internal error taxonomy. If you're here, something isn't working right. Find your symptom in the list below, jump to the matching section, and follow the diagnostic procedure.
Most bench problems fall into three categories:
- Bus-level — the module doesn't answer over I²C, and reads raise RbAmpIOError.
- Data-level — the module answers, but the numbers are wrong: zeros, odd values, the wrong sign.
- Application-level — the link is unstable, the Wh counter drifts, or the script hangs / dies on a timeout.
Shortcut for the impatient: in steady state, the public methods
should return values / yield snapshots without raising. If you see
RbAmpIOError or RbAmpStaleError frequently in the logs, jump
straight to the "Module doesn't answer over I²C" section below.
Module doesn't answer over I²C
What you see: dev.begin() (or entering the
with RbAmp(bus, 0x50) as dev: context manager) raises
RbAmpIOError or RbAmpVersionError, or RT reads regularly raise
RbAmpIOError.
Step 1 — Bus scan
First, confirm the module is even present on the bus.
CPython (Linux SBC)
External tool:
i2cdetect -y 1 # bus 1 — the standard one on the RPi 40-pin header
Or in Python:
from smbus2 import SMBus
with SMBus(1) as bus:
    print("Scanning...")
    for addr in range(0x08, 0x78):
        try:
            bus.write_quick(addr)
            print(f"Found 0x{addr:02X}")
        except OSError:
            pass
Expected output: Found 0x50 (or another address, if you changed it).
MicroPython
from machine import I2C, Pin
i2c = I2C(0, scl=Pin(22), sda=Pin(21), freq=50_000)
print("Found:", [hex(a) for a in i2c.scan()])
Expected output: Found: ['0x50'] (or your address).
What to do if nothing is found
This points to a wiring or power problem. Verify each of the following:
- SDA / SCL aren't swapped (see 04 · Hardware connection for per-host GPIO defaults).
- Both lines have a pull-up to 3.3 V (the module board has built-in 4.7 kΩ resistors — no externals needed for a single module).
- The module's supply pin really has 5 V on it (4.5–5.5 V).
- No other master (debug probe, second controller) is hanging on the same lines.
- On MicroPython, check that freq=50_000 (not 100k or 400k — on ESP32 with v1 firmware the baseline NACK pattern will eat your retry budget).
- On CPython, check that the kernel I²C driver is loaded — lsmod | grep i2c_bcm on RPi.
What to do if the module shows up at an unexpected address
Someone re-addressed it earlier on the bench. Update the second
argument to RbAmp:
dev = RbAmp(bus, 0x52) # address from the bus scan
Step 2 — MicroPython + ESP32 baseline NACK pattern
If the bus scan finds the module but dev.read_voltage()
occasionally raises RbAmpIOError — and you're on MicroPython +
ESP32 — this is a documented baseline pattern: the I²C stack on
the ESP32 port of MicroPython inherits quirks from the ESP-IDF v5
driver, which, when talking to rbAmp on the current firmware,
produces a ~20 % NACK rate at 100 kHz.
The mitigation is already built into MachineI2CBackend (3 retries
× 5 ms gap by default), but under heavy workloads the retry budget
can run out.
What to do on ESP32:
- Lower the bus speed to 50 kHz:
  ```python
  i2c = I2C(0, scl=Pin(22), sda=Pin(21), freq=50_000)
  ```
- Raise the retry count for a heavy workload — this needs the advanced API, by creating the backend explicitly:
  ```python
  from rbamp._io_micropython import MachineI2CBackend
  from rbamp import RbAmp

  backend = MachineI2CBackend(i2c, retry_attempts=5, retry_gap_ms=5)
  dev = RbAmp(backend, 0x50)  # the address goes to RbAmp, not to the backend
  ```
- Monitor the retry-exhaustion counter — it should stay zero in steady state:
  ```python
  if backend.retry_exhaustion_count > 0:
      print("WARN: retry exhausted:", backend.retry_exhaustion_count)
  ```
Step 3 — Non-ESP32 platforms (RP2040 / STM32 / CPython)
On RP2040 / STM32 / Pyboard (via MicroPython) there is no
baseline-NACK pattern — these platforms show a ~0 % NACK rate with
rbAmp. MachineI2CBackend auto-detects the port and lowers the
retry default to a single attempt.
On CPython (via smbus2) there is no retry layer at all — the
Linux kernel I²C driver isn't subject to the NACK pattern; a single
OSError from smbus2 is translated into RbAmpIOError with no
retry.
If NACKs do occur on these platforms, look for:
- High bus capacitance — long wires plus many devices. Lower the speed or shorten the cable.
- Master contention — the package does not support multi-master.
- Floating SCL between transactions — a missing pull-up.
- On RPi: check dtparam=i2c_arm_baudrate in /boot/config.txt — 100000 is recommended.
Current reads zero on a working load
What you see: dev.current[0] (or dev.read_current(0))
returns 0.000 (or a very small value), even though a real
consumer is on (a kettle, a lamp, an iron). dev.power_factor[0]
may show an odd value (nan / 0 / ±1) — that's a side effect;
the root cause is the current.
Diagnostic procedure
- Check that the sensor class and CT model are configured:
  ```python
  from rbamp import RbAmp, RbAmpError, RbAmpModeError, RbAmpSensorClass

  try:
      dev.set_sensor_class(RbAmpSensorClass.SCT_013)
  except RbAmpError as e:  # base class of all rbAmp exceptions
      print("set_sensor_class failed:", e)

  try:
      dev.set_ct_model(3)  # or your model — see the table in 03
  except RbAmpModeError:
      print("Class must be set first!")
  except ValueError:
      print("Code out of range 1..5")
  ```
  On v1.2+ firmware, without these two calls the calibration coefficients aren't loaded, and current always reads as zero. This step is done once at first install — the choice is stored in the module's flash.
- Check that the CT model matches the load. A large CT clamp (for example SCT-013-100, 100 A) on a small load (a 50 W lamp = ~0.2 A) produces a signal right at the noise floor, and the readings will be zero. Pick the smallest CT model that covers your maximum expected current. The full table is in 03 · Current sensor selection.
If you have a multi-channel module (UI2 / UI3) and want to see both small loads and peak spikes, consider the dual-CT pattern: a small clamp (SCT-013-005) on one channel for the low range plus a large one (SCT-013-030/100) on another channel for the high range; the master picks the value by a threshold. The pattern is discussed in 03 · Current sensor selection, the "Dual-CT topology" section.
- Check the clamp orientation. The arrow on the clamp body should point in the direction of current flow toward the load. If the clamp is "backwards," dev.read_current() gives the right value in absolute terms, but dev.read_power() returns a negative value on a consuming load. Confirmation: dev.read_power_factor() will read exactly −1.0 on a resistive load (instead of +1.0). The fix is either to physically reinstall the clamp (unclip it, flip it so the arrow points correctly, clip it back) or to invert the sign on the application side (p = -p; pf = -pf; if you know the load isn't a PV inverter).
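The dual-CT selection mentioned in the second check ("the master picks the value by a threshold") could be sketched as follows. This is only an illustration and not part of the rbamp package: the helper name `pick_current` and the 4.5 A crossover are hypothetical, chosen on the assumption that the low-range clamp is an SCT-013-005 rated for 5 A.

```python
def pick_current(low_range_a, high_range_a, crossover_a=4.5):
    """Choose between two CT readings taken on the same conductor.

    low_range_a  -- RMS current from the small clamp, e.g. dev.read_current(0)
    high_range_a -- RMS current from the large clamp, e.g. dev.read_current(1)
    crossover_a  -- hypothetical threshold, kept below the small clamp's rating
    """
    # Inside its range the small clamp resolves loads that sit at the
    # noise floor of the large clamp, so prefer it.
    if low_range_a < crossover_a:
        return low_range_a
    # Above the crossover the small clamp may saturate; use the large one.
    return high_range_a


print(pick_current(0.21, 0.0))   # small load: the low-range reading is used
print(pick_current(4.9, 12.3))   # spike: the high-range reading is used
```

Tune the crossover to your own clamp pair; anything close to the small clamp's rated current is a reasonable starting point.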
If all three steps pass but the current is still zero
Then the signal at the ADC really is below the noise floor. Check:
- Whether current is actually flowing — measure with a multimeter (DC clamp / AC clamp meter) on the same wire.
- Whether the clamp is intact — its connector should carry an AC voltage proportional to the current (a few millivolts for consumer loads).
- Whether you're clamping the right wire — the line (phase) conductor, not the neutral (although a clamp on the neutral will also work — it measures current amplitude, not direction).
Readings jump around or raise RbAmpIOError
What you see: dev.voltage / dev.read_current(...)
occasionally raises RbAmpIOError. The property value wasn't
obtained — you need an application-side retry or a try/except guard.
RbAmpIOError from the package covers two classes of problem:
- NACK after retry exhaustion — the link is unstable.
- The sanity filter rejected a value — a float NaN / Inf / |x| > 10000 came off the bus (clearly not a physical value).
Tell them apart via the diagnostic counters (available on MicroPython):
from rbamp import RbAmp
from rbamp._io_micropython import MachineI2CBackend
# Created explicitly at the start of the program
backend = MachineI2CBackend(i2c, retry_attempts=5)  # the address goes to RbAmp, not to the backend
dev = RbAmp(backend, 0x50)
# After a period of operation:
print("retries succeeded:", backend.retry_count_total)
print("retries exhausted:", backend.retry_exhaustion_count)
print("sanity rejects:", dev.sanity_reject_count)
| Counter pattern | What it means |
|---|---|
| retry_count_total ↑, retry_exhaustion_count = 0, sanity_reject_count = 0 | Bus is healthy; the package quietly recovers the occasional NACK. Normal. |
| retry_count_total ↑, retry_exhaustion_count ↑ | NACK plateau — increase retry_attempts (see Step 2 above). |
| retry_count_total low, sanity_reject_count ↑ | The sanity filter is catching garbage after failed retries. Raise retry_attempts=5+. |
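The table can be turned into a small triage helper for periodic log lines. The sketch below is an assumption-laden illustration: `classify_link` and its return labels are hypothetical names, and the inputs are plain integers that you would read from `backend.retry_count_total`, `backend.retry_exhaustion_count` and `dev.sanity_reject_count`.

```python
def classify_link(retry_total, retry_exhausted, sanity_rejects):
    """Map the three diagnostic counters onto the rows of the table above."""
    if retry_exhausted > 0:
        # Retries ran out at least once: NACK plateau, raise retry_attempts.
        return "nack-plateau"
    if sanity_rejects > 0:
        # Frames arrived but carried NaN / Inf / out-of-range floats.
        return "sanity-rejects"
    # retry_total may be nonzero here; recovered NACKs are normal.
    return "healthy"


print(classify_link(12, 0, 0))
print(classify_link(40, 3, 0))
print(classify_link(1, 0, 5))
```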
On CPython (SMBusBackend) there is no retry layer — RbAmpIOError
is raised on the very first failure. The symptoms are rare (the
Linux kernel is usually reliable), but when they appear, check the
wiring / pull-ups / bus speed.
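The application-side retry mentioned at the top of this section can be quite small. The helper below is a generic sketch, not an rbamp API: `read_with_retry` and its parameters are hypothetical, and it catches `OSError` on the assumption (stated in the exception summary) that every rbAmp exception derives from it.

```python
import time


def read_with_retry(read_fn, attempts=3, gap_s=0.05, retry_on=(OSError,)):
    """Call read_fn() up to `attempts` times and re-raise the last error."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    last_exc = None
    for attempt in range(attempts):
        try:
            return read_fn()
        except retry_on as exc:
            last_exc = exc
            if attempt < attempts - 1:
                time.sleep(gap_s)  # give the bus a moment before retrying
    raise last_exc
```

A call site would look like `voltage = read_with_retry(lambda: dev.read_voltage())`. On CPython, where the backend has no retry layer of its own, this is the only retry the read gets.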
Power Factor looks wrong
What you see: dev.power_factor[0] (or
dev.read_power_factor(0)) returns a value that doesn't match the
load type.
Expectations by load:
| Load | Expected PF |
|---|---|
| Kettle, iron, incandescent lamp | +0.95 .. +1.0 (resistive) |
| Refrigerator, compressor motor | +0.6 .. +0.85 (inductive) |
| LED lamp, TV (switch-mode PSU) | +0.5 .. +0.95 (nonlinear) |
| PV inverter exporting power | negative PF |
PF = nan or 0 when I = 0
PF is defined as P / (U × I). At zero current I=0, the math is
undefined. The exact value returned depends on the firmware (it may
be nan, 0, or a placeholder) — that's normal as long as the
current really is zero. Once current appears, a valid PF appears.
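On the application side it is usually enough to ignore PF whenever the current is effectively zero. The helper below is a minimal sketch under that assumption; `usable_power_factor` and the 0.01 A cutoff are hypothetical and should be matched to the noise floor of your CT.

```python
import math


def usable_power_factor(pf, current_a, min_current_a=0.01):
    """Return pf, or None when the reading carries no information."""
    if current_a < min_current_a:
        # I ~ 0: P / (U * I) is undefined, so ignore whatever came back.
        return None
    if math.isnan(pf) or math.isinf(pf):
        return None
    return pf


print(usable_power_factor(float("nan"), 0.0))  # None
print(usable_power_factor(0.98, 8.7))          # 0.98
```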
PF is exactly −1.0 on a purely consuming load
The clamp is installed "backwards" — the arrow points away from the
direction of current flow to the load. The fix is either to
reinstall the clamp correctly or to handle the sign on the
application side (p = -p; pf = -pf; if you know the load can't be
a PV inverter).
PF floats between +0.3 and +0.7 on a resistive load
- Possible cause: the voltage reference is taken from a different phase. This applies to split-phase (240 V in the US) and 3-phase grids, where the module takes U from one phase while the CT clamp hangs on another. The resulting 120° or 180° phase shift between U and I distorts PF; on an ideal resistive load a 120° shift alone gives |PF| = |cos 120°| = 0.5, the middle of this range. The fix is to install the module so the U input and the CT are on the same phase.
- Alternative: the load really isn't purely resistive. Repeat the test with a known-resistive load (an electric kettle at full power).
Period snapshots are always RbAmpStaleError
What you see: dev.read_period_snapshot() raises
RbAmpStaleError on every call. This means the module hasn't
finished integrating the previous period by the time of the next read.
The package protects against double-counting Wh: on a stale read it records the master timestamp, so the next successful snapshot covers one period, not two.
Acceptable: rare stales (1–2 per hour at a 60 s cadence). Not acceptable: consecutive stales — that means the firmware is unresponsive or the master is polling too often.
Cadence-check procedure
- Check the cadence: 60 s between latches is comfortable; 30 s is marginal; < 10 s guarantees stales.
- Check the module's responsiveness between snapshots:

      if dev.probe():
          print("alive")

- Check the flag directly:

      if dev.is_period_valid():
          # safe to read avg_p[]
          ...

- Check the firmware version — dev.firmware_version >= 0x02 shows fewer stales than 0x01.
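To tell "rare stales" from "consecutive stales" in a long-running script, it helps to count them. The class below is a sketch and not part of the package: `StaleTracker` is a hypothetical name, the 1 % limit is taken from the counters summary table, and treating two stales in a row as unhealthy is an assumption.

```python
class StaleTracker:
    """Track the stale fraction and the current run of consecutive stales."""

    def __init__(self, max_fraction=0.01, max_consecutive=1):
        self.max_fraction = max_fraction
        self.max_consecutive = max_consecutive
        self.total = 0
        self.stale = 0
        self.consecutive = 0

    def record(self, was_stale):
        """Call once per snapshot attempt."""
        self.total += 1
        if was_stale:
            self.stale += 1
            self.consecutive += 1
        else:
            self.consecutive = 0

    @property
    def fraction(self):
        return self.stale / self.total if self.total else 0.0

    def healthy(self):
        return (self.fraction < self.max_fraction
                and self.consecutive <= self.max_consecutive)
```

In the polling loop, call `tracker.record(False)` after a successful `dev.read_period_snapshot()` and `tracker.record(True)` in the `except RbAmpStaleError:` branch, then log a warning when `tracker.healthy()` turns false.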
Special case — MicroPython deep-sleep wake
If you use a deep-sleep pattern on MicroPython, the default
dev.read_period_snapshot() after a context-manager entry will
always be stale (or give near-zero values) — __enter__ issues a
priming LATCH that resets the firmware accumulator. The canonical
pattern uses skip_latch=True plus a known sleep duration (the
machine.deepsleep(SLEEP_MS) argument) — see
06 · Examples, Scenario 9.
Wh accounting drifts from a reference
What you see: dev.energy.wh(0), after several hours / days of
operation, diverges from the utility meter or a reference meter
(Kill-A-Watt, etc.).
First, rule out the trivial
- Current sensor calibration: confirm that dev.set_sensor_class() and dev.set_ct_model() were called with the correct CT model (see 03 · Current sensor selection). Without this, RMS current is computed against the default floor and the power value will be systematically biased.
- Dropped stale snapshots: if dev.energy.wh(0) is consistently below the reference, those may be missed intervals — the snapshot came back stale, the package protected against double-counting, but the interval measurement was lost. Check the cadence (see above).
- Master clock drift: time.monotonic() (CPython) and time.ticks_ms() (MicroPython) are reliable under normal operation. But if the master goes into deep sleep, use the RTC-memory + known-sleep-duration pattern from Scenario 9.
⚠ ticks_ms() wrap-around on MicroPython. time.ticks_ms() wraps at 2**30 ms (~12.4 days) — a naïve t1 - t0 subtraction across the 12.4-day boundary gives a negative / huge number. Internally the package uses time.ticks_diff(t1, t0), which is wrap-safe. If you compute master_dt_ms yourself in user code (outside the package), use ticks_diff, not raw subtraction.
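The difference between raw subtraction and a wrap-safe difference is easy to see off-target. The snippet below is a CPython emulation for illustration only, assuming the 2**30 ms period stated above; on a real board, call `time.ticks_diff` rather than this stand-in.

```python
TICKS_PERIOD = 2 ** 30            # wrap point of ticks_ms(), as stated above
TICKS_HALF = TICKS_PERIOD // 2


def ticks_diff(t1, t0):
    """Emulate a wrap-safe tick difference with modular arithmetic."""
    return ((t1 - t0 + TICKS_HALF) & (TICKS_PERIOD - 1)) - TICKS_HALF


t0 = TICKS_PERIOD - 100   # sampled 100 ms before the counter wraps
t1 = 50                   # sampled 50 ms after the wrap

print(t1 - t0)              # raw subtraction: -1073741674
print(ticks_diff(t1, t0))   # wrap-safe: 150
```

The result is only meaningful while the true interval is shorter than half the period (about 6.2 days here).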
Accumulator precision
The Wh accumulator inside the handle:
| Runtime | Type | Long-term precision |
|---|---|---|
| CPython | 64-bit float (Python float) | drift < 1 LSB / year @ 60 s cadence |
| MicroPython (ESP32 N16R8 + double-precision build) | 64-bit float | same |
| MicroPython on ports without double-float | 32-bit float | ~0.01 % drift per day |
Checking precision on MicroPython (sys.float_info exists only on
CPython; on uPy we use an empirical check):
# On CPython:
import sys
print(sys.float_info.dig) # 15 → double (a single-precision float would report 6)
# On MicroPython — empirically:
import sys
print(sys.implementation.name) # 'micropython'
test = 1.0 + 1e-15
print("double" if test != 1.0 else "single") # 1.0 + 1e-15 != 1.0 only on double
For single-float MicroPython ports on long soak installs, periodically call dev.energy.reset(0) and store the long-term total in your own persistent store (for example, via MQTT-retain in HA).
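One way to keep that long-term total is to fold the handle's accumulator into an integer counter at every checkpoint. The class below is a sketch under stated assumptions: `WhLedger` is a hypothetical helper, the two callables stand in for `dev.energy.wh(0)` and `dev.energy.reset(0)`, and persisting `total_mwh` (file, NVS, MQTT retain) is left to the caller.

```python
class WhLedger:
    """Long-term energy total kept outside the device handle, in integer mWh."""

    def __init__(self, read_wh, reset_wh, total_mwh=0):
        self._read_wh = read_wh      # e.g. lambda: dev.energy.wh(0)
        self._reset_wh = reset_wh    # e.g. lambda: dev.energy.reset(0)
        self.total_mwh = total_mwh   # value restored from your own store

    def checkpoint(self):
        """Fold the handle's accumulator into the ledger and zero it."""
        wh = self._read_wh()
        self._reset_wh()
        # Integers do not lose small increments the way a large
        # single-precision float total does.
        self.total_mwh += round(wh * 1000)
        return self.total_mwh
```

Calling `checkpoint()` once an hour keeps the float accumulator small, so each increment stays well within single-precision resolution.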
The script hangs / dies on a timeout
What you see: the Python script starts normally, then goes into a hang / RuntimeError after a few minutes.
CPython: signal.SIGTERM / KeyboardInterrupt not handled
Cause: the loop only notices a shutdown request once a long time.sleep(60) returns, so the reaction to SIGTERM can lag by up to a minute. In a production deploy (systemd), systemctl stop rbamp waits for the SIGTERM response timeout (90 s default) and then kills via SIGKILL, at which point Wh can be lost.
Fix: a signal-aware loop:
import signal, time
shutdown = False
def on_sig(sig, frame):
    global shutdown
    shutdown = True
signal.signal(signal.SIGTERM, on_sig)
signal.signal(signal.SIGINT, on_sig)
while not shutdown:
    snap = dev.read_period_snapshot()
    # ... publish ...
    for _ in range(60): # wake on the signal every second
        if shutdown: break
        time.sleep(1)
Or use asyncio with a signal_handler — see the section below.
MicroPython (ESP32): Watchdog timeout on WiFi connect
Cause: an unbounded WiFi connection loop trips the task-WDT after ~5 s (default).
Fix: a bounded wait with a restart fallback:
import network, time, machine
wlan = network.WLAN(network.STA_IF)
wlan.active(True)
wlan.connect("ssid", "pass")
t0 = time.ticks_ms()
while not wlan.isconnected():
    time.sleep_ms(100)
    if time.ticks_diff(time.ticks_ms(), t0) > 30_000:
        machine.reset()
The MQTT broker disconnects
On CPython (paho-mqtt): the default keepalive is 60 s, but
mqtt.connect() is blocking and mqtt.loop() must be called in an
event loop. Use mqtt.loop_start() (a background thread) or the
asyncio variant aiomqtt.
On MicroPython (umqtt.simple): synchronous, and it sends no keepalive
pings on its own; you have to call mqtt.ping() manually every ~30 s. The
alternative is the async mqtt_as package with built-in keepalive.
TLS handshake fails (CPython cloud + RPi with little RAM)
Rarely a problem on Pis with 1 GB of RAM or more, but on a Pi Zero / Zero W (512 MB) a TLS handshake that loads the full system certificate store can get the process killed by the OOM killer.
Fix:
- Use a specific CA instead of the system ca-certificates — less
memory.
- Reduce the MQTT buffers.
- Move to a Pi 3/4/5 for a production cloud deploy.
set_sensor_class() / set_ct_model() raises RbAmpModeError
What you see: one of the setup calls raises RbAmpModeError
with the message:
REG_SENSOR_CLASS is UNSET on v1.2+ firmware;
call dev.set_sensor_class(RbAmpSensorClass.SCT_013) first
Cause: on v1.2+ firmware, set_ct_model[_ch] requires that
set_sensor_class() has been called before it. This is an
intentional guard.
What to do:
from rbamp import RbAmpSensorClass
# Class first — MANDATORY
dev.set_sensor_class(RbAmpSensorClass.SCT_013)
# Then the model
dev.set_ct_model(3) # or per-channel dev.set_ct_model_ch(ch, code)
More detail in 03 · Current sensor selection and 09 · API reference, the "Sensor configuration" section.
RbAmpModeError is also raised from prepare_address_change() /
commit_address_change() when the module isn't in develop mode. That is a
different situation; see the "prepare/commit_address_change() raises RbAmpModeError" section below.
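The class-then-model order can be wrapped in one setup helper so that it cannot be called the wrong way round. This is a sketch, not a package function: `configure_ct` is a hypothetical name, and the only calls it makes on `dev` are the two methods shown above.

```python
def configure_ct(dev, sensor_class, model_code):
    """Set the sensor class first, then the CT model (one-time setup)."""
    # Validate locally so a bad code fails before anything touches the bus.
    if not 1 <= model_code <= 5:
        raise ValueError("CT model code must be in 1..5, got %r" % (model_code,))
    dev.set_sensor_class(sensor_class)   # class first, mandatory on v1.2+
    dev.set_ct_model(model_code)         # then the model
```

Typical use: `configure_ct(dev, RbAmpSensorClass.SCT_013, 3)` once at first install.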
set_sensor_class() / set_ct_model*() raises RbAmpParamError or ValueError
What you see: the call raises RbAmpParamError (on CPython it
also inherits from ValueError through multiple inheritance).
Possible causes:
- Invalid model code — the valid range is 1..5 (see the table in 03). Values 0 and 6+ raise RbAmpParamError(ValueError).
- Invalid channel index in the per-channel form dev.set_ct_model_ch(channel, code) — it must be < dev.channels.
- A reserved RbAmpSensorClass value — RbAmpSensorClass.WIRED_CT or BUILTIN_CT aren't supported yet (not present on the current SKU); use SCT_013.
set_ct_model_ch() raises RbAmpVersionError
What you see: the per-channel form raises RbAmpVersionError.
Cause: the module's firmware is too old for per-channel commands
(the CMD_SET_CT_MODEL_CH0/1/2 opcodes appeared in v1.2). On v1.0 /
v1.1 this command doesn't exist.
What to do:
- Check the version: dev.firmware_version should return 0x03 or higher for the per-channel form.
- Use the legacy single-arg form dev.set_ct_model(code) — it works on every firmware version, but only configures channel 0.
- Update the module firmware to v1.2+ if you need full per-channel configuration.
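The version check and the legacy fallback combine naturally into one call. The helper below is a hypothetical sketch (`set_ct_model_compat` is not a package function); it relies only on `dev.firmware_version`, `dev.set_ct_model_ch()` and `dev.set_ct_model()` as described in the list above, and on the 0x03 threshold given there.

```python
PER_CHANNEL_MIN_VERSION = 0x03   # first firmware with per-channel opcodes


def set_ct_model_compat(dev, channel, code):
    """Use the per-channel form when available, else fall back for channel 0."""
    if dev.firmware_version >= PER_CHANNEL_MIN_VERSION:
        dev.set_ct_model_ch(channel, code)
        return "per-channel"
    if channel == 0:
        # The legacy single-argument form only configures channel 0.
        dev.set_ct_model(code)
        return "legacy"
    raise RuntimeError(
        "firmware 0x%02X cannot configure channel %d; update to v1.2+"
        % (dev.firmware_version, channel)
    )
```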
prepare/commit_address_change() raises RbAmpModeError
What you see: the I²C-address-change methods raise
RbAmpModeError.
⚠ Develop-mode-only operation. Changing the address requires the module to be in develop mode (an internal flag, set at the factory). On a standard production module this flag is not set, and these methods raise RbAmpModeError — that's expected behavior, not a bug. The dev.prepare_address_change() + dev.commit_address_change() method pair is intended for factory provisioning and integrator bench operations, not for user code. If a deployed module needs a different I²C address, the documented path is reconfiguration on the factory bench (outside the package's responsibility).
commit_address_change() raises RbAmpTimeoutError
If you have a module with develop mode enabled and prepare
succeeded, but commit raises RbAmpTimeoutError, the "arming"
window has expired (5 seconds after prepare). dev.probe()
won't help here (the module answers; the problem is in the
module's state machine). The fix is to call
dev.prepare_address_change(new_addr) again and immediately, in
the same function/iteration, with no network calls in between,
call dev.commit_address_change(). Any blocking I/O between
prepare and commit (WiFi, MQTT, HTTP) is the main cause of
window expiry.
wait_ready() raises RbAmpTimeoutError
The module's ready bit wasn't set within timeout_ms. Possible
causes:
- The module hasn't finished its cold start yet (~250 ms) — increase timeout_ms or retry the call.
- The module doesn't answer over I²C — see the "Module doesn't answer over I²C" section above.
- The supply level dropped below 4.5 V — check VCC with a multimeter.
For more on the public-with-warning methods, see
09 · API reference, the "Sensor
configuration" section (the address-change subsection lives there
along with save_gains and factory_reset).
Exception summary table
RbAmpError (a subclass of OSError) is the base class. An
existing except OSError: handler on the application side keeps
catching rbAmp errors with no changes — the standard Python
convention for hardware-I/O drivers.
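The effect of that hierarchy can be reproduced with stand-in classes. The names below mirror the ones used in this chapter, but the classes are local stand-ins written for illustration; the real ones come from the rbamp package and may differ in detail.

```python
class RbAmpError(OSError):
    """Stand-in for the package's base exception."""


class RbAmpIOError(RbAmpError):
    """Stand-in for a bus-level failure."""


class RbAmpParamError(RbAmpError, ValueError):
    """Stand-in for the CPython multi-base parameter error."""


def describe(exc):
    """Show which generic handler ends up catching the exception."""
    try:
        raise exc
    except OSError as caught:
        return "caught as OSError: " + type(caught).__name__


print(describe(RbAmpIOError("NACK")))
print(isinstance(RbAmpParamError("code out of range"), ValueError))  # True
```

Put the more specific handlers first (for example `except RbAmpStaleError:`) and keep `except OSError:` as the catch-all.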
| Exception | When | Where to look |
|---|---|---|
| RbAmpIOError | NACK after retry; sanity reject; bus-level failure | "Module doesn't answer over I²C" section |
| RbAmpTimeoutError | wait_ready() expired; the commit_address_change window (5 s) expired | "commit_address_change() raises RbAmpTimeoutError" section above |
| RbAmpNotReadyError | (reserved; not raised in v1.1.0, exported for forward-compat) | if you catch this type, open an issue with a repro |
| RbAmpStaleError | period snapshot stale | "Period snapshots are always RbAmpStaleError" section |
| RbAmpParamError (on CPython also ValueError) | bad argument: dev=None, ch out of range, code outside 1..5, a reserved cls. On v1.2+ also a precondition violation: set_ct_model*() without a preceding set_sensor_class() | check the call arguments; on v1.2+, check that set_sensor_class() was called |
| RbAmpModeError | develop mode not set for an address change | "prepare/commit_address_change() raises RbAmpModeError" section |
| RbAmpVersionError | REG_VERSION = 0/0xFF on begin() (the module doesn't answer or the firmware is corrupt). Per-channel set_ct_model_ch on v1.0/v1.1 may NACK (the opcode doesn't exist) and raise RbAmpIOError, not RbAmpVersionError: the version guard is NOT implemented client-side; check dev.firmware_version ≥ 0x03 before calling | check dev.firmware_version |
Every exception has a __str__ with a meaningful message — the
package carefully constructs error messages with context.
Diagnostic counters summary table
In a healthy soak run (the 1-hour long-soak harness) the counters stay at the steady-state values below:
| Counter | Steady-state | Reset | Availability |
|---|---|---|---|
| dev.sanity_reject_count | 0 | dev.reset_counters() | both runtimes |
| MachineI2CBackend.retry_exhaustion_count | 0 | backend.reset_counters() | MicroPython only |
| MachineI2CBackend.retry_count_total | low (~5-20 per hour is normal) | backend.reset_counters() | MicroPython only |
| stale fraction in period snapshots | < 1 % | (cumulative, no reset) | both runtimes |
If any of these leaves its steady-state range, go back to the matching section above.
The long-soak harness
tests/test_long_soak.py runs via pytest libs/python/rbamp/tests/test_long_soak.py --soak --bus N --addr 0x50. For the six acceptance criteria, see the README.md "Long-soak regression harness" section.
Bus-level debug with a logic analyzer
For deep debugging, when the package can no longer tell you what's happening on the wire, capture SDA + SCL with a logic analyzer (Saleae, DSLogic Plus, Sigrok-compatible):
- Sample rate ≥ 1 MS/s at 100 kHz I²C; ≥ 4 MS/s at 400 kHz.
- The I²C decoder in Sigrok / Saleae will show ACK / NACK on each byte plus the address phase.
- Compare your script's calls (dev.voltage, etc.) against the expected byte sequence — they should match exactly (single byte per address phase, no auto-increment).
If the package's behavior diverges from the capture, open an issue
with the capture file attached (.sal / .dsl / .csv).
When to contact support
If you've worked through the matching section above and the problem persists, open an issue:
github.com/rb-amp/rbamp-python/issues
In the issue, include:
- Runtime + version: python --version / mpremote eval 'import sys; print(sys.version, sys.implementation)'.
- Host platform: "RPi 4B Bookworm", "ESP32-S3 N16R8 MicroPython 1.22", etc.
- Package version: rbamp --version (CPython) or python -c "import rbamp; print(rbamp.__version__)".
- Module firmware version — dev.firmware_version from the logs.
- A minimal script (~30 LOC) that reproduces the problem.
- The exception traceback with full context.
- The counters at the time of failure: dev.sanity_reject_count, backend.retry_exhaustion_count (uPy only).
- Verbose library logs — the easiest way to collect them is dev.set_logger(print) before the operation that fails. The package will print its internal steps (the retry loop, settle timeouts, the latch primer, and so on) — attach stdout to the issue.
- (If you have one) a logic-analyzer capture file.
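On CPython, the runtime, platform, and package lines of that list can be gathered in one go. The function below is a convenience sketch, not something the package ships: `runtime_report` is a hypothetical name, it uses only the standard library plus `rbamp.__version__` as quoted above, and it falls back gracefully when rbamp is not importable.

```python
import platform
import sys


def runtime_report():
    """Collect the host details requested in the issue checklist (CPython)."""
    report = {
        "implementation": sys.implementation.name,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    try:
        import rbamp
        report["rbamp"] = getattr(rbamp, "__version__", "unknown")
    except ImportError:
        report["rbamp"] = "not installed"
    return report


for key, value in runtime_report().items():
    print(key + ":", value)
```

Paste the output at the top of the issue, followed by the firmware version and the counters.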
CLI — rbamp scan / rbamp read / rbamp address
The rbamp CLI is a thin wrapper over RbAmp.* for bench
operations (see 09 · API reference, the "CLI"
section). Because it uses the same RbAmp class, ALL the symptoms
above apply. CLI-specific errors:
- rbamp scan finds nothing — same as Step 1 in the "Module doesn't answer over I²C" section above. Check i2cdetect -y 1 directly. If it finds the module, it's a bug in our backend autodetect; open an issue with the output of rbamp --verbose scan.
- rbamp <subcommand> fails with PermissionError: /dev/i2c-1 — the user isn't in the i2c group. See README § Installation for sudo usermod -aG i2c $USER plus a re-login.
- rbamp: command not found after pip install — the [project.scripts] console entry isn't active. Check that pip show rbamp shows the installed location; make sure $HOME/.local/bin is on $PATH (for pip install --user).
Links
- 05 · Quickstart — your first working script
- 09 · API reference — the full API plus the warnings on public-with-WARNING methods
- 03 · Current sensor selection — the SCT-013 model table, the dual-CT topology for a wide dynamic range, and approaches to boosting sensitivity at low currents
Source & issues: rb-amp/rbamp-python · this page in the repo: docs/10_troubleshooting.md