CrankGPT is the Offline AI Voice Assistant That Runs on Muscle Power Alone

Builders from SqueezLabs set out to answer a simple question. What if an AI assistant did not need data centers, internet connections, or even a battery? Their answer, CrankGPT, sits inside a compact red enclosure with a hand crank mounted on one side. Turn the handle and the system wakes up. Keep turning and it stays alive while it listens, thinks, and speaks.
The device is built on a Raspberry Pi 5 with 8GB of RAM and a cooling fan for stability. A Seeed Studio ReSpeaker microphone HAT handles both audio input and output. Everything runs on DietPi, a stripped-down operating system chosen for its fast boot times. No cloud service touches the process at any stage: everything from speech recognition to voice output happens directly on the board.

The only power source is a hand-cranked generator. The device is designed around a power budget of approximately 20 watts, which allows some breathing room. When idle, it draws only 4 watts, while voice recognition draws approximately 8 watts, and full language inference and text-to-speech draw up to 15 watts each. To handle the power spikes, the team designed a bespoke capacitor board that serves as a 20-second buffer when the crank stops moving. When the model begins responding, you can feel the change in the handle, because the crank becomes much harder to turn.
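To put those numbers side by side, here is a minimal Python sketch of the arithmetic. It treats the roughly 20-watt figure as the overall budget and assumes the 20-second buffer is measured at a constant load; the article does not say which load the team had in mind, so the energy values are illustrative only. The names `BUDGET_W`, `PHASE_W`, `headroom_w`, and `buffer_energy_j` are hypothetical and not taken from the project.

```python
# Power figures quoted in the article, in watts.
BUDGET_W = 20.0
PHASE_W = {
    "idle": 4.0,
    "speech_recognition": 8.0,
    "inference": 15.0,       # quoted as "up to" 15 W
    "text_to_speech": 15.0,  # quoted as "up to" 15 W
}
BUFFER_S = 20.0  # how long the capacitor board is said to bridge a stopped crank


def headroom_w(phase: str) -> float:
    """Watts left over in the budget while the given phase is running."""
    return BUDGET_W - PHASE_W[phase]


def buffer_energy_j(phase: str, seconds: float = BUFFER_S) -> float:
    """Energy in joules needed to sustain a phase for `seconds` (E = P * t).

    Assumption: the load stays constant for the whole interval.
    """
    return PHASE_W[phase] * seconds


if __name__ == "__main__":
    for phase in PHASE_W:
        print(f"{phase}: headroom {headroom_w(phase):.0f} W, "
              f"{buffer_energy_j(phase):.0f} J for {BUFFER_S:.0f} s")
```

Under these assumptions, bridging 20 seconds at idle takes 80 joules, while bridging the same time at the peak inference load takes 300 joules, which helps explain why the crank stiffens once the model starts responding.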

The microphone records speech, Moonshine ASR turns it into text, and Silero’s voice activity detection determines when the speaker has finished speaking. A small language model then generates the response. You can choose from Liquid AI’s LFM2.5 series at 350 million or 1.2 billion parameters, or Google’s Gemma 3 with a billion parameters; all of them run through llama.cpp, which is optimized for speed on the limited hardware. The Piper text-to-speech engine then speaks the reply one sentence at a time, which makes the conversation feel significantly more responsive. From crank to conversation takes roughly 30 seconds on average.
If you’re stuck in the middle of nowhere or riding out an extended power outage, you can crank up this device (literally!) and get some practical help, such as step-by-step instructions, language translations, or explanations of how things work. The responses are in plain English and do not require an internet connection. Because the models are kept small, responses begin to flow quickly once a steady supply of power is established. Larger models would take far more effort to crank and far longer to respond, so the team kept the models small.
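The sentence-at-a-time hand-off to the speech engine can be sketched in Python as follows. This is not SqueezLabs’ code: it assumes the language model emits text in small fragments and shows one simple way to release each sentence as soon as it is complete. The function name `sentences_from_stream` and the punctuation-based splitting rule are hypothetical.

```python
import re
from typing import Iterable, Iterator

# A sentence is treated as complete once ., ! or ? is followed by whitespace.
# Assumption: this naive rule is good enough for short spoken replies.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(fragments: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text."""
    buffer = ""
    for fragment in fragments:
        buffer += fragment
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        # The last part may still be growing, so keep it in the buffer.
        buffer = parts[-1]
    # Flush whatever remains when the model stops generating.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    stream = ["Turn the ", "crank. Keep", " turning! Done"]
    for sentence in sentences_from_stream(stream):
        # A real pipeline would pass each sentence to the TTS engine here.
        print(sentence)
```

Because each sentence is released as soon as its closing punctuation arrives, speech can begin while the model is still generating the rest of the reply, instead of waiting for the full response.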

SqueezLabs designed this machine with equal parts technical expertise and cheeky attitude. The team noted that super-powerful systems frequently consume more power than the task at hand requires. They set out to demonstrate that smaller models running on a local system can perform valuable tasks, keep your private information private, and break free from reliance on remote infrastructure. On the project’s website, they make light of the problem with tongue-in-cheek product tiers that range from ‘hand crank only’ to ‘gym membership’ for power-hungry jobs. Beneath all of this, however, is a genuine belief that your tools should be appropriate for the task at hand.
The real world imposes some limits on the project, too. The Pi lacks a proper sleep mode, so any loss of power resets the entire system. The crank itself makes quite a bit of noise for the microphone to filter out. Despite this, the pipelined design keeps conversations relatively fluid. Latency is normally less than a second, although it can reach a few seconds depending on the model, prompt, and length. The entire stack runs on the Pi alone, without any additional hardware.