This weekend I decided to test OpenClaw, formerly Clawdbot, and before that MoltBot for about five minutes (which should already tell you something about the maturity curve)! Not a quick poke around, but leaving it running as an always-on AI assistant and seeing what broke first. After consuming a frankly unhealthy amount of opinions and hype about it, my aim was simple enough: to explore what it can really do without losing my mind, my money, or my data in the process…

If you somehow missed the hype train, OpenClaw positions itself as an AI assistant that behaves more like a human. It can act proactively, use multiple LLMs as sub-agents, orchestrate tasks, and interact with almost anything on your computer on your behalf. At least in theory. Cue the “beam me up” moment and a strong temptation to believe this might finally be the thing that liquefies your brain in the good way.
That said, there are some fairly substantial security questions still outstanding, particularly around prompt injection and unintended actions, so I approached this very deliberately (unlike some of the horror stories I’ve seen already!). Everything ran in a sandbox, with a carefully limited blast radius, no access to my personal files or accounts, and its own email and calendar. For now, I treated it like a teenager on work experience (keen, capable, occasionally overconfident, and absolutely not to be left unsupervised with anything sharp).
What surprised me most over the weekend wasn’t what it could do, but what it demanded from me in return…
What did I learn?
The first big lesson is that the core capabilities are already solid. File operations, shell commands, and basic system interactions work reliably as long as you give them the permissions they need (and actually understand what you’ve granted). When something went wrong, it was rarely because the system couldn’t do the thing, and far more often because the constraints, context, or guardrails weren’t clear enough.
That quickly leads to the real work, which is teaching. This is not a fire-and-forget setup. You have to approach it like a teacher, because the assistant doesn’t magically infer how you want to work. The upside is that once you explain your processes clearly and correct it when it gets things wrong, it tends to internalise those patterns very quickly (sometimes alarmingly so!). One explicit correction, especially when backed by documented preferences that it stores in workspace markdown files, often changed behaviour permanently.
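To make that concrete, here’s the sort of thing mine ended up storing. This is a purely illustrative sketch (the filename, headings, and wording are all mine, not OpenClaw’s canonical format):

```markdown
<!-- PREFERENCES.md: illustrative example, not OpenClaw's actual file format -->
## How I like to work
- Always ask before sending anything external (email, calendar invites, messages).
- Summaries as bullet points, five max, no preamble.
- Use 24-hour times and ISO dates.

## Corrections that stuck
- Don't auto-archive emails; flag them and wait for me.
- "Clean up" never means delete; it means move to the archive folder.
```

The point is less the exact format and more that corrections get written down somewhere the assistant re-reads, which is what makes them stick.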

Communicating with OpenClaw
Clarity matters more than cleverness. These systems are far better at technical problem solving than mind reading! When I was vague, I paid for it in wasted tokens, odd detours, and solutions that were technically ‘correct’, but practically useless. When I was precise about what I wanted and what I didn’t, the results improved dramatically and stayed that way.
Autonomy turned out to be another important dial. You need to give an always-on agent enough freedom to be useful, but not so much that it becomes risky or unpredictable. I had the best results when I defined (very!) explicit boundaries around what was safe, what required confirmation from me, and what was simply not allowed. I didn’t expect it to be perfect, but it adhered reasonably well to simple, clearly stated rules, which was enough to build confidence over time (and sleep slightly better!). There’s still a risk of it overstepping a boundary, so until this matures I’m keeping the blast radius as small as possible, even for the worst-case scenario.
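For what it’s worth, the mental model I settled on is roughly a three-tier policy. Here’s a hypothetical sketch in Python (this is not OpenClaw’s actual permission system, and the action names are my own inventions):

```python
# Hypothetical three-tier autonomy policy. This is not OpenClaw's real
# API; it just illustrates "safe / confirm-first / forbidden" boundaries
# for an always-on agent.

ALLOWED = {"read_file", "list_directory", "search_notes"}          # safe to do freely
CONFIRM = {"send_email", "create_calendar_event", "run_shell"}     # ask me first
FORBIDDEN = {"delete_file", "make_payment", "change_credentials"}  # never

def decide(action: str) -> str:
    """Classify a proposed action against explicit, clearly stated rules."""
    if action in FORBIDDEN:
        return "deny"
    if action in CONFIRM:
        return "ask_human"
    if action in ALLOWED:
        return "allow"
    # Anything unrecognised defaults to asking, which keeps the blast
    # radius small even when the agent invents a new kind of action.
    return "ask_human"

assert decide("read_file") == "allow"
assert decide("send_email") == "ask_human"
assert decide("format_disk") == "ask_human"  # unknown -> safe default
```

The unknown-action fallback is the important bit: new capabilities should inherit caution, not freedom.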
I also found that you have to accept a certain amount of trial and error. Things don’t always work on the first attempt, and that’s part of the deal. In one case, a system looked “healthy” while doing absolutely nothing useful because it was monitoring the wrong signal. In another, an automation kept hanging because the machine was quietly waiting for a human approval prompt I couldn’t see, despite being on the desktop! These weren’t exotic failures, just the kind you only discover by actually running the thing.
Pro tip: don’t /reset in the middle of a long conversation unless you have to, as it will forget some vital things. When you finish a piece of work, though, a /reset will shrink the context window and save you tokens!

Tokens, tokens everywhere…
Tokens deserve special attention, because if you’re not careful, OpenClaw will burn through them with impressive enthusiasm. What worked best for me was treating models as tools with different costs, not interchangeable brains. For routine work and exploratory steps, I have heard good things about cheaper models like Kimi K2.5, which reportedly hold up remarkably well. For me, most of the orchestration and day-to-day thinking ran on Anthropic’s Claude Sonnet 4.5, which struck a good balance between capability and cost. When I genuinely needed deep reasoning, I escalated deliberately to Opus 4.5, rather than by default. For code-heavy work, ChatGPT 5.2 performed well and, somewhat surprisingly, survived the token usage better than expected. One practical tip: avoid pay-as-you-go APIs unless you enjoy anxiety, and accept that if you are using this for anything more than extremely light testing, upgrading to the Anthropic Max plan is often the least painful option in the long run (I went for the $100-equivalent tier for this month and it’s holding up well so far).
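The escalation habit is easier to stick to if you make it mechanical. Here’s a rough sketch of the routing ladder I had in my head; the tier labels and model identifiers are illustrative stand-ins, not literal API model names or prices:

```python
# Illustrative model-routing ladder: cheap by default, escalate deliberately.
# The identifiers mirror the models mentioned above but are placeholders,
# not real API model names.

TIERS = [
    ("routine", "kimi-k2.5",         "exploratory steps, bulk work"),
    ("default", "claude-sonnet-4.5", "orchestration, day-to-day thinking"),
    ("deep",    "claude-opus-4.5",   "genuinely hard reasoning only"),
]

def pick_model(difficulty: str) -> str:
    """Map a (human-judged) difficulty label to a model tier."""
    for tier, model, _purpose in TIERS:
        if tier == difficulty:
            return model
    # Unknown difficulty falls back to the cheap tier: the expensive
    # failure mode is defaulting upwards, not downwards.
    return TIERS[0][1]

print(pick_model("deep"))  # -> claude-opus-4.5
print(pick_model("???"))   # -> kimi-k2.5 (cheap fallback)
```

Crude, but it captures the rule that saved me the most money: escalation is a decision, never a default.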
One habit that paid off faster than expected was documenting as I went, or better yet, having the bot do it for me! Notion worked particularly well for this, and having a written record of decisions, preferences, and fixes turned out to be just as valuable as the automation itself (future me will almost certainly agree).
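If you want the bot to do the documenting, the plumbing is straightforward. Here’s a minimal sketch using the official notion-client Python SDK; the database ID and the property names (“Name”, “Notes”) are assumptions about your own database schema, not fixed values:

```python
# Minimal sketch of logging a decision to Notion via the official SDK
# (pip install notion-client). "Name" (title) and "Notes" (rich text) are
# assumed properties of your database, not universal defaults.
import os
from notion_client import Client

notion = Client(auth=os.environ["NOTION_TOKEN"])  # integration token

def log_decision(database_id: str, title: str, notes: str) -> None:
    """Append one decision/preference/fix as a new page in a Notion database."""
    notion.pages.create(
        parent={"database_id": database_id},
        properties={
            "Name": {"title": [{"text": {"content": title}}]},
            "Notes": {"rich_text": [{"text": {"content": notes}}]},
        },
    )

log_decision(
    os.environ["DECISIONS_DB_ID"],  # placeholder env var for your database id
    "Routine tasks moved to the cheap model tier",
    "Quality held up; token spend dropped noticeably.",
)
```

Whether it’s Notion, markdown files, or something else entirely, the win is the same: decisions get written down once and re-read forever.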
The Verdict
Stepping back, OpenClaw is clearly bleeding-edge software, with some rough (bordering on paper-cut-level) edges! Some operations are fragile, the learning curve is real, and you are very much learning alongside the system rather than using a polished product. If you’re technical, patient, and comfortable in a terminal, it’s genuinely astounding and will surprise you regularly (in a good way!), but it’s not going to be a totally smooth ride, and it’s absolutely nowhere near ready for the general public to safely play with!
I also discovered more ways to use ChatGPT or Claude independently of OpenClaw that I wasn’t previously aware of. Simple things like scheduling reminders and having information sent to you later are entirely feasible, which actually removes some of the reasons for using OpenClaw (and therefore some of the risk), depending on your use case.
After three days, my takeaway is that an AI assistant isn’t magical or terrible; it’s extremely useful. It’s definitely more like that teenager on work experience, except they rarely forget anything, work continuously, and execute precisely once you’ve taken the time to teach them how you operate (but still under strict supervision!). The value isn’t in a single impressive feature, but in the compound effect of lots of small but amazing things.
I’m cautiously optimistic.
Ask me again in a couple of weeks.



