
Rpi under-voltage

This is the script I use to check whether any of my Raspberry Pi 4 nodes has hit under-voltage. Mainly, I'm still trying to catch what is causing the USB drive holding my OS to switch to read-only mode.

I'm getting this in the log:

Dec 27 23:10:21 cube05 kernel: [1225968.218593] xhci_hcd 0000:01:00.0: xHCI host not responding to stop endpoint command.
Dec 27 23:10:21 cube05 kernel: [1225968.218604] xhci_hcd 0000:01:00.0: USBSTS:
Dec 27 23:10:21 cube05 kernel: [1225968.234615] xhci_hcd 0000:01:00.0: Host halt failed, -110
Dec 27 23:10:21 cube05 kernel: [1225968.234620] xhci_hcd 0000:01:00.0: xHCI host controller not responding, assume dead
Dec 27 23:10:21 cube05 kernel: [1225968.242480] xhci_hcd 0000:01:00.0: HC died; cleaning up
Dec 27 23:10:21 cube05 kernel: [1225968.247911] xhci_hcd 0000:01:00.0: xHCI host not responding to stop endpoint command.
Dec 27 23:10:21 cube05 kernel: [1225968.247921] xhci_hcd 0000:01:00.0: USBSTS:
Dec 27 23:10:21 cube05 kernel: [1225968.248063] usb 1-1: USB disconnect, device number 2
Dec 27 23:10:21 cube05 kernel: [1225968.263931] xhci_hcd 0000:01:00.0: Host halt failed, -110
Dec 27 23:10:21 cube05 kernel: [1225968.264496] usb 2-1: USB disconnect, device number 2
Dec 27 23:10:21 cube05 kernel: [1225968.278711] blk_update_request: I/O error, dev sda, sector 3251264 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0

There can be multiple causes for this: a faulty USB drive, a faulty Raspberry Pi, or a driver issue. Most often, though, it is under-voltage, where the power source drops its voltage during heavy use and the Raspberry Pi flips out...

Script

Author: https://gist.github.com/aallan/0b03f5dcc65756dde6045c6e96c26459

#!/bin/bash

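#Flag Bits from `vcgencmd get_throttled`
#Bits 0-3 are the current state, bits 16-19 mean the condition has occurred since boot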
UNDERVOLTED=0x1
CAPPED=0x2
THROTTLED=0x4
SOFT_TEMPLIMIT=0x8
HAS_UNDERVOLTED=0x10000
HAS_CAPPED=0x20000
HAS_THROTTLED=0x40000
HAS_SOFT_TEMPLIMIT=0x80000


#Text Colors
GREEN=$(tput setaf 2)
RED=$(tput setaf 1)
NC=$(tput sgr0)

#Output Strings
GOOD="${GREEN}NO${NC}"
BAD="${RED}YES${NC}"

#Get Status, extract hex
STATUS=$(vcgencmd get_throttled)
STATUS=${STATUS#*=}

echo -n "Status: "
(($STATUS!=0)) && echo "${RED}${STATUS}${NC}" || echo "${GREEN}${STATUS}${NC}"

echo "Undervolted:"
echo -n "   Now: "
((($STATUS&UNDERVOLTED)!=0)) && echo "${BAD}" || echo "${GOOD}"
echo -n "   Run: "
((($STATUS&HAS_UNDERVOLTED)!=0)) && echo "${BAD}" || echo "${GOOD}"

echo "Throttled:"
echo -n "   Now: "
((($STATUS&THROTTLED)!=0)) && echo "${BAD}" || echo "${GOOD}"
echo -n "   Run: "
((($STATUS&HAS_THROTTLED)!=0)) && echo "${BAD}" || echo "${GOOD}"

echo "Frequency Capped:"
echo -n "   Now: "
((($STATUS&CAPPED)!=0)) && echo "${BAD}" || echo "${GOOD}"
echo -n "   Run: "
((($STATUS&HAS_CAPPED)!=0)) && echo "${BAD}" || echo "${GOOD}"

echo "Softlimit:"
echo -n "   Now: "
((($STATUS&SOFT_TEMPLIMIT)!=0)) && echo "${BAD}" || echo "${GOOD}"
echo -n "   Run: "
((($STATUS&HAS_SOFT_TEMPLIMIT)!=0)) && echo "${BAD}" || echo "${GOOD}"
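
To get a feel for what a non-zero status would mean, here is a small sketch that decodes the same bitmask into plain text. The helper name `decode_throttled` and the wording of the labels are my own and not part of the original gist; the bit values are the same ones the script above defines.

```shell
#!/bin/bash
# decode_throttled is a hypothetical helper for illustration only.
# It takes the hex value that follows "throttled=" in the output of
# `vcgencmd get_throttled` and prints one line per flag that is set.
decode_throttled() {
    local status=$1 entry mask
    for entry in \
        "0x1:under-voltage now" \
        "0x2:frequency capped now" \
        "0x4:throttled now" \
        "0x8:soft temperature limit now" \
        "0x10000:under-voltage since boot" \
        "0x20000:frequency capped since boot" \
        "0x40000:throttled since boot" \
        "0x80000:soft temperature limit since boot"
    do
        mask=${entry%%:*}                 # part before the colon is the bit mask
        if (( status & mask )); then
            echo "${entry#*:}"            # part after the colon is the label
        fi
    done
}

# Example value: 0x50005 = 0x1 + 0x4 + 0x10000 + 0x40000
decode_throttled 0x50005
```

For `0x50005` this prints the two "now" lines for under-voltage and throttling, followed by the matching two "since boot" lines. A status of `0x0` prints nothing.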

Usage

I just created a simple voltage.sh on my control01 node and used Ansible to distribute it to all nodes.

ansible cube -b -m copy -a "src=/home/ubuntu/voltage.sh dest=/home/ubuntu/voltage.sh"

And then use Ansible to run it on all nodes.

ansible cube -b -m shell -a "bash /home/ubuntu/voltage.sh"

Result

control01 | CHANGED | rc=0 >>
Status: 0x0
Undervolted:
   Now: NO
   Run: NO
Throttled:
   Now: NO
   Run: NO
Frequency Capped:
   Now: NO
   Run: NO
Softlimit:
   Now: NO
   Run: NO
cube02 | CHANGED | rc=0 >>
Status: 0x0
Undervolted:
   Now: NO
   Run: NO
Throttled:
   Now: NO
   Run: NO
Frequency Capped:
   Now: NO
   Run: NO
Softlimit:
   Now: NO
   Run: NO
cube01 | CHANGED | rc=0 >>
Status: 0x0
Undervolted:
   Now: NO
   Run: NO
Throttled:
   Now: NO
   Run: NO
Frequency Capped:
   Now: NO
   Run: NO
Softlimit:
   Now: NO
   Run: NO
control02 | CHANGED | rc=0 >>
Status: 0x0
Undervolted:
   Now: NO
   Run: NO
Throttled:
   Now: NO
   Run: NO
Frequency Capped:
   Now: NO
   Run: NO
Softlimit:
   Now: NO
   Run: NO
control03 | CHANGED | rc=0 >>
Status: 0x0
Undervolted:
   Now: NO
   Run: NO
Throttled:
   Now: NO
   Run: NO
Frequency Capped:
   Now: NO
   Run: NO
Softlimit:
   Now: NO
   Run: NO
cube03 | CHANGED | rc=0 >>
Status: 0x0
Undervolted:
   Now: NO
   Run: NO
Throttled:
   Now: NO
   Run: NO
Frequency Capped:
   Now: NO
   Run: NO
Softlimit:
   Now: NO
   Run: NO
cube04 | CHANGED | rc=0 >>
Status: 0x0
Undervolted:
   Now: NO
   Run: NO
Throttled:
   Now: NO
   Run: NO
Frequency Capped:
   Now: NO
   Run: NO
Softlimit:
   Now: NO
   Run: NO
cube05 | CHANGED | rc=0 >>
Status: 0x0
Undervolted:
   Now: NO
   Run: NO
Throttled:
   Now: NO
   Run: NO
Frequency Capped:
   Now: NO
   Run: NO
Softlimit:
   Now: NO
   Run: NO
cube06 | CHANGED | rc=0 >>
Status: 0x0
Undervolted:
   Now: NO
   Run: NO
Throttled:
   Now: NO
   Run: NO
Frequency Capped:
   Now: NO
   Run: NO
Softlimit:
   Now: NO
   Run: NO

As you can see, no under-voltage or other limits were hit, even though node 05 (cube05) failed in this manner 2 days ago... Next is to run some read-write tests, and maybe a CPU load test, and have this script check again.
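
For checking while a load test is running, one option is to poll the status in a loop and print a line only when it changes. This is a rough sketch, not something I have run on the cluster; `read_status` and `poll_status` are made-up names, and the count and interval are arbitrary.

```shell
#!/bin/bash
# Hypothetical helpers for illustration only.

# Print the hex status from the firmware, e.g. 0x0.
# Kept as a separate function so it can be replaced on a machine without vcgencmd.
read_status() {
    local raw
    raw=$(vcgencmd get_throttled) || return 1
    echo "${raw#*=}"
}

# Sample the status COUNT times, INTERVAL seconds apart,
# and print a timestamped line whenever the value differs from the previous sample.
poll_status() {
    local count=$1 interval=$2 last="" current i
    for (( i = 0; i < count; i++ )); do
        current=$(read_status) || { echo "could not read status" >&2; return 1; }
        if [[ "$current" != "$last" ]]; then
            echo "$(date '+%H:%M:%S') status=${current}"
            last=$current
        fi
        sleep "$interval"
    done
}

# Example: sample once per second for ten minutes while the test runs elsewhere.
# poll_status 600 1
```

The first sample always prints, so a single line of output with `status=0x0` means nothing changed during the whole run.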

A possible improvement would be to log directly to syslog, since all my logs are collected on control01, and that still works when the OS switches to read-only mode and I can't log in anymore.
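
A minimal sketch of that idea, using the standard `logger` utility to hand the status to syslog. The tag `rpi-voltage`, the helper name `format_status` and the message text are assumptions made for this example; I have not wired this into the cluster yet.

```shell
#!/bin/bash
# Sketch only: send the throttling status to syslog instead of the terminal.
# The tag "rpi-voltage" and the helper name are made up for this example.

# Build the log message from a raw status value such as 0x0 or 0x50005.
format_status() {
    local status=$1
    if (( status != 0 )); then
        echo "throttled=${status} (non-zero: under-voltage or throttling flagged)"
    else
        echo "throttled=${status} (ok)"
    fi
}

# Only query the firmware and write to syslog when both tools are present.
if command -v vcgencmd >/dev/null 2>&1 && command -v logger >/dev/null 2>&1; then
    STATUS=$(vcgencmd get_throttled)
    STATUS=${STATUS#*=}
    if (( STATUS != 0 )); then
        # Non-zero status goes out at warning priority so it stands out
        logger -t rpi-voltage -p user.warning "$(format_status "$STATUS")"
    else
        logger -t rpi-voltage -p user.info "$(format_status "$STATUS")"
    fi
fi
```

Run from cron or a systemd timer on every node, this would leave a trail on the central log host even if the node itself later goes read-only.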


Last update: February 8, 2021
