5 Things Running an OpenClaw Personal AI Agent Taught Me (The Hard Way)

The tool I’ve been tinkering with just made headlines. Peter Steinberger, creator of OpenClaw, is joining OpenAI to “drive the next generation of personal agents.” Sam Altman called him “a genius.” Not bad for an open source project only weeks old…

I’ve been running OpenClaw as a personal AI agent for several weeks now (in very strict isolation). It handles a standalone calendar, sends me reminders, processes emails I forward to it, manages my task lists, and writes code and makes commits to specific projects I’ve shared from my GitHub account to its own. Think Jarvis, but with more cron jobs and less Robert Downey Jr. Along the way I’ve learned a few things the hard way. I posted about some of those last week; here are five more.

AI agent setup

1. Your AI Will Forget Unless You Make It Remember

This one caught me off guard. Every time your agent starts a new session, it wakes up with absolutely no memory of what you did yesterday. None. It’s like an intelligent, funny, witty teenage child who wakes up every morning with no memory of you reminding them yesterday to clean their room…

The fix is decidedly low tech, namely markdown files! The agent reads a MEMORY.md file at the start of every session for long-term context, plus daily “summary” note files for recent history. Without these, every conversation starts from zero. You find yourself re-explaining the same decisions, the same preferences, the same project context. It can be quite frustrating, to say the least!

In short, if you want your AI to know something tomorrow (or even immediately following a /reset), write it down today. In a file. On disk. Like it’s 1995 (only not floppy)…
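If it helps to picture the pattern, here’s a minimal sketch. The MEMORY.md file name is real; the notes/ directory layout and the function itself are my own illustration, not OpenClaw’s actual code:

```python
from datetime import date, timedelta
from pathlib import Path

def load_context(workspace: Path, recent_days: int = 3) -> str:
    """Assemble long-term memory plus recent daily notes for a new session."""
    parts = []
    memory = workspace / "MEMORY.md"  # long-term facts, preferences, decisions
    if memory.exists():
        parts.append(memory.read_text())
    # Daily summary notes, oldest first, e.g. notes/2026-02-03.md (layout assumed)
    for days_ago in range(recent_days, 0, -1):
        note = workspace / "notes" / f"{date.today() - timedelta(days=days_ago)}.md"
        if note.exists():
            parts.append(note.read_text())
    return "\n\n---\n\n".join(parts)
```

The point is less the code than the habit: everything the agent should still know tomorrow has to land in one of those files today.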

2. Silent Fallbacks Will Eat Your Budget

Here’s a fun one. I set up a simple reminder cron job. Simple task, should cost fractions of a penny. I configured it to use Google Gemini 2.5 Flash Lite, a super-fast, super cheap model. Perfectly adequate for “tell Alex to [insert reminder here].” (side note: mine has permission to do so “with attitude” if need be!).

What I didn’t clock was that, by default, when Google rate-limited Gemini (I was using the free tier to start), the system silently fell back to Opus, Anthropic’s most expensive model. My bedtime reminder, a task that could run on a calculator, was burning through premium AI tokens! I only found out while looking at some failed, rate-limited tasks. The bot didn’t think this was worth proactively telling me about. No warning, no alert. Just a quiet, expensive upgrade.

Check your fallback chains, then check them again. Then check them after every upgrade (one upgrade broke the gateway’s contexts and channels so badly it forgot who it was, and I had to restore from a backup!).
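To make that concrete, here’s roughly the behaviour I wish I’d had from day one: an explicit fallback chain that only ever steps sideways or down in price, and that shouts whenever a fallback happens at all. The model names and shape are illustrative, not OpenClaw’s actual configuration:

```python
# Hypothetical fallback table: every chain stays cheap by construction,
# so a rate limit can never silently escalate to a premium model.
FALLBACKS = {
    "gemini-2.5-flash-lite": ["gemini-2.5-flash", "claude-haiku"],  # never Opus
}

def resolve_model(preferred: str, is_rate_limited) -> str:
    """Pick the first non-rate-limited model, warning loudly on any fallback."""
    for candidate in [preferred, *FALLBACKS.get(preferred, [])]:
        if not is_rate_limited(candidate):
            if candidate != preferred:
                print(f"WARNING: fell back from {preferred} to {candidate}")
            return candidate
    raise RuntimeError(f"All models in the chain for {preferred} are rate-limited")
```

The key design choice is that escalation is impossible by accident: an expensive model simply isn’t in the cheap task’s chain.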

Finally, set up a monitoring page on your “Mission Control”. You should definitely build one of these – a small bot-built webapp for managing and monitoring your bot. Here’s mine at the moment:

3. The Hidden Cost of “Good Enough” Model Defaults

Related to the above, but subtler. Not every task your agent performs needs the flagship model. Heartbeat checks, health pings, simple notifications: these can run on the cheapest model available. I’ve got simple jobs running on Gemini Flash Lite, which costs almost nothing. Meanwhile, many of my cron jobs were defaulting to models ten times the price for work that was just as simple.

Match the model to the task. Your “send me the weather” job doesn’t need the same brain as your “analyse this quarterly report” job. It sounds obvious, but so do many things!
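A sketch of the idea: a tiny task-to-model routing table where the default is the cheapest model and escalation is opt-in. The task types and model names here are my own examples, not anything OpenClaw ships with:

```python
# Illustrative routing: cheap by default, expensive only for named heavy tasks.
CHEAPEST = "gemini-2.5-flash-lite"
MODEL_FOR_TASK = {
    "reminder": CHEAPEST,
    "heartbeat": CHEAPEST,
    "daily-summary": "claude-sonnet-4-5",
    "quarterly-report-analysis": "claude-opus-4-5",
}

def pick_model(task_type: str) -> str:
    """Unknown task types fall through to the cheapest model, never the flagship."""
    return MODEL_FOR_TASK.get(task_type, CHEAPEST)
```

Inverting the default this way means a misconfigured or brand-new cron job costs pennies rather than burning the flagship model’s rates.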

4. Your AI Anchors on Context, Not Facts

This is the one that properly messed with my head. I spent a long evening debugging cron job issues. Hours of back and forth, pasting logs, tweaking configs. All the conversation context was about problems from Tuesday. By the time we finished, my agent was convinced it was still Tuesday.

It was Wednesday. 🤦

The model doesn’t always know what day it is from an internal clock (you know – those things that have been in computers for decades…). It appears to infer “reality” from the conversation window. If your context is full of Tuesday’s problems, Tuesday is reality. This has real consequences when you’re scheduling things, setting reminders, or asking “what’s happening tomorrow”. I’ve seen this happen many times in different scenarios, including pre-scheduled morning briefs based on the wrong day and scheduling cron jobs for the wrong day and time.

Your AI’s sense of the world is only as good as the context you’ve given it, and context can lie. Once again I recommend a Mission Control to easily eyeball things occasionally.
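One cheap defence I now rely on: pin the real date into the context at the start of every session, so the model never has to infer it from stale conversation. A minimal sketch of the idea:

```python
from datetime import datetime, timezone

def session_preamble() -> str:
    """Pin 'today' explicitly so the model can't anchor on yesterday's logs."""
    now = datetime.now(timezone.utc)
    return (
        f"Current date/time (UTC): {now:%A %Y-%m-%d %H:%M}. "
        "Trust this over any dates mentioned elsewhere in the conversation."
    )
```

Prepending this to every session (or heartbeat) means Tuesday’s debugging logs can no longer convince the agent it’s still Tuesday.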

5. Trust Logs Over Vibes

At one point I asked my agent which model it had used for a particular task. It confidently told me Sonnet, but the logs showed Opus (via fallback). The model wasn’t lying exactly… it just didn’t know. It reported what it thought was true based on its configuration, not what actually happened at the infrastructure level.

This applies broadly. Just like any chatbot you’ve been talking to for the last three years, your AI will sound confident about things it cannot possibly verify (or doesn’t want to, as verifying might be a wasteful activity), so it just goes with what it has in context. System-level behaviour and actual API calls live in logs, not in chat responses, so when it matters, always go to the source.

(I think of this as the “Did you really brush your teeth? Shall we go check if the brush is wet?” scenario).
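If your gateway writes the invoked model into its logs, a few lines of parsing beat asking the bot. This assumes a hypothetical log format with `model=…` fields; adapt the pattern to whatever your gateway actually emits:

```python
import re

def models_used(log_text: str) -> list[str]:
    """Extract the model actually invoked per request from gateway logs.

    Assumes (hypothetical) log lines like:
        2025-01-07 22:14 req=42 model=claude-opus-4-5 (fallback)
    """
    return re.findall(r"model=([\w.\-]+)", log_text)
```

Comparing this list against what the agent *claims* it used is the wet-toothbrush check in code form.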

What Next

Peter Steinberger’s very high profile move to OpenAI tells you where this is heading. Personal agents aren’t a nerdy hobbyist curiosity anymore, they’re absolutely going mainstream. OpenClaw will continue as open source via a foundation, which is great, but the bigger signal is that OpenAI wants this expertise in house. They’re betting that millions of people will be running agents like this, and you would imagine that, before long, they’ll run on Codex by default.

When that happens, every lesson I’ve learned will be demonstrated at global scale. People will inevitably burn money running premium models on cheap tasks, wonder why their agent forgot last week’s conversation, and trust a confident response over a log file.

If you’re thinking about running your own agent, start now while it’s still a bit rough around the edges. The lessons are cheaper to learn on an old laptop in your cupboard (on an isolated network, with isolated accounts!), than in production and connected to all your company’s systems!

Now I’m off to go and find some 10 year old memory DIMMs to sell for a 400% markup.

💻💰🎉

PS – If you got this far, thanks for reading, and I apologise for the rather click-baity title! Don’t hate the player, etc…

AI, Web

Testing OpenClaw Without Losing Your Mind, Money, or Data

This weekend I decided to test OpenClaw, formerly Clawdbot, formerly MoltBot for about five minutes (which should already tell you something about the maturity curve)! Not a quick poke around, but leaving it running as an always-on AI assistant and seeing what broke first. After consuming a frankly unhealthy amount of opinions and hype about it, my aim was simple enough: to explore what it can really do without losing my mind, my money, or my data in the process…

Of course I had to include a Red Dwarf meme!

If you somehow missed the hype train, OpenClaw positions itself as an AI assistant that behaves more like a human. It can act proactively, use multiple LLMs as sub-agents, orchestrate tasks, and interact with almost anything on your computer on your behalf. At least in theory. Cue the “beam me up” moment and a strong temptation to believe this might finally be the thing that liquefies your brain in the good way.

That said, there are some fairly substantial security questions still outstanding, particularly around prompt injection and unintended actions, so I approached this very deliberately (unlike some of the horror stories I’ve seen already!). Everything ran in a sandbox, with a carefully limited blast radius, no access to my personal files or accounts, and its own email and calendar. For now, I treated it like a teenager on work experience (keen, capable, occasionally overconfident, and absolutely not to be left unsupervised with anything sharp).

What surprised me most over the weekend wasn’t what it could do, but what it demanded from me in return…

What Did I Learn

The first big lesson is that the core capabilities are already solid. File operations, shell commands, and basic system interactions work reliably as long as you give them the permissions they need (and actually understand what you’ve granted). When something went wrong, it was rarely because the system couldn’t do the thing, and far more often because the constraints, context, or guardrails weren’t clear enough.

That quickly leads to the real work, which is teaching. This is not a fire-and-forget setup. You have to approach it like a teacher, because the assistant doesn’t magically infer how you want to work. The upside is that once you explain your processes clearly and correct it when it gets things wrong, it tends to internalise those patterns very quickly (sometimes alarmingly so!). One explicit correction, especially when backed by documented preferences that it stores in workspace md files, often changed behaviour permanently.

Communicating with OpenClaw

Clarity matters more than cleverness. These systems are far better at technical problem solving than mind reading! When I was vague, I paid for it in wasted tokens, odd detours, and solutions that were technically ‘correct’, but practically useless. When I was precise about what I wanted and what I didn’t, the results improved dramatically and stayed that way.

Autonomy turned out to be another important dial. It seems that you need to give an always-on agent enough freedom to be useful, but not so much that it becomes risky or unpredictable. I had the best results when I defined (very!) explicit boundaries around what was safe, what required confirmation from me, and what was simply not allowed. I didn’t expect it to be perfect, but it adhered reasonably well to simple, clearly stated rules, which was enough to build confidence over time (and sleep slightly better!). I’m sure there is a risk of it overstepping some boundaries, so again, until this matures, I’m keeping that blast radius as small as possible, even in the worst case scenario.

I also found that you have to accept a certain amount of trial and error. Things don’t always work on the first attempt, and that’s part of the deal. In one case, a system looked “healthy” while doing absolutely nothing useful because it was monitoring the wrong signal. In another, an automation kept hanging because the machine was quietly waiting for a human approval prompt I couldn’t see, despite being on the desktop! These weren’t exotic failures, just the kind you only discover by actually running the thing.

Pro tip: don’t /reset in the middle of a long conversation unless you have to, as it will forget some vital things, but when you finish a piece of work, using /reset will reduce the context window and save you tokens!

Tokens, tokens everywhere…

Tokens deserve special attention, because if you’re not careful, OpenClaw will burn through them with impressive enthusiasm. What worked best for me was treating models as tools with different costs, not interchangeable brains. For routine work and exploratory steps, I’ve heard good things about cheaper models like Kimi K2.5 holding up remarkably well. For me, most of the orchestration and day-to-day thinking ran on Anthropic’s Claude Sonnet 4.5, which struck a good balance between capability and cost. When I genuinely needed deep reasoning, I escalated to Opus 4.5 deliberately, rather than by default. For code-heavy work, ChatGPT 5.2 performed well and, somewhat surprisingly, survived the token usage better than expected.

One practical tip here: avoid pay-as-you-go APIs unless you enjoy anxiety, and accept that if you are using this for anything more than extremely light testing, upgrading to the Anthropic Max plan is often the least painful option in the long run (I went for the $100-equivalent tier for this month and it’s holding up well so far).
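As a back-of-envelope sanity check, it’s worth doing the arithmetic on any recurring job before letting it loose. The prices below are placeholders, not real published rates; plug in your provider’s current pricing:

```python
# Illustrative $ per million input tokens -- assumptions for the maths,
# not real published rates; check your provider's pricing page.
PRICE_PER_MTOK = {"flash-lite": 0.10, "sonnet": 3.00, "opus": 15.00}

def monthly_cost(model: str, tokens_per_run: int, runs_per_day: int) -> float:
    """Rough monthly spend for a recurring job on a given model."""
    return tokens_per_run * runs_per_day * 30 / 1_000_000 * PRICE_PER_MTOK[model]
```

At these placeholder rates, the same small job costs 150 times more on the flagship than on the budget model, which is exactly how a bedtime reminder quietly becomes a line item.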

One habit that paid off faster than expected was documenting as I went, or better yet, having the bot do it for me! Notion worked particularly well for this, and having a written record of decisions, preferences, and fixes turned out to be just as valuable as the automation itself (future me will almost certainly agree).

The Verdict

Stepping back, OpenClaw is clearly bleeding-edge software with some rough, bordering on paper-cut level edges! Some operations are fragile, the learning curve is real, and you are very much learning alongside the system rather than using a polished product. If you’re technical, patient, and comfortable in a terminal, it’s genuinely astounding, and will surprise you regularly (in a good way!), but it’s not going to be a totally smooth ride, and it’s absolutely nowhere near ready for the general public to safely play with!

I also discovered ways to use ChatGPT or Claude independently of OpenClaw that I wasn’t previously aware of. Simple things like scheduling reminders and information to be sent to you later are perfectly feasible, which actually negates some of the reasons (and therefore risk) for using OpenClaw, depending on your use case.

After three days, my takeaway is that an AI assistant isn’t magical or terrible; it’s extremely useful. It’s definitely more like that teenager on work experience, except they rarely forget anything, work continuously, and execute precisely once you’ve taken the time to teach them how you operate (but still under strict supervision!). The value isn’t in a single impressive feature, but in the compound effect of lots of small, but amazing things.

I’m cautiously optimistic.

Ask me again in a couple of weeks.

AI

Is the Cloud actually greener?

This week, I returned from an amazing family adventure holiday in Morocco, where the country’s wonderful culture and fascinating history made it (I hope!) an unforgettable experience for my kids. However, recent droughts there have had severe consequences on the country’s agriculture, economy, and water resources. Reduction in rainfall over the past two years has impacted crops, increased food prices, and water scarcity, affecting millions of people and raising concerns about long-term sustainability.

During one of many hours on the minibus, travelling between regions, my family asked me about the cloud and what impact it has on the environment. This has obviously been a massive topic over the past few years, prompting the hyperscalers to take a very public stance on the matter, for example, the re:Invent 2021 sustainability announcement by AWS.

We all know that cloud computing has become an essential part of modern life, changing the way we work, play, and communicate arguably faster than at any other time in history! I would suggest that there are a huge number of sustainability benefits to adopting the cloud, but that doesn’t mean its environmental impact is zero. As with all things, we should be looking at the pros, cons, and mitigations.

Just some of the Pros

The cloud allows businesses to reduce energy consumption and hardware waste significantly. By using shared cloud resources, organisations can get rid of their low-utilisation, on-premises hardware footprint, unused redundant kit for HA and DR, etc, all of which requires electricity, cooling, shipping, maintenance, etc. Cloud providers typically use state-of-the-art, energy-efficient data centres with huge economies of scale to minimise the overall carbon footprint.

Speaking of which – economies of scale! Hyperscalers benefit from massive economies of scale, making it more efficient for them to build, manage and maintain data centres. They have the budgets to invest in advanced technologies and energy-efficient infrastructure, leading to a lower environmental impact compared to small-scale, on-premises solutions (or even traditional colo).

On-demand scalability in the cloud allows organisations to optimise resource utilisation and remove the need for over-provisioning of hardware for peak demand or HA/DR. This not only reduces waste, ensuring only necessary compute resources are used, but reduces the TCO and frees up budget to be used elsewhere!

Something perhaps overlooked at times is that the cloud increasingly enables remote working, thereby providing better work/life balance for people and reducing the environmental impact of commuting. Greenhouse gas emissions from vehicles have a massive impact, which (especially in temperate countries) can be mitigated by more working from home. Furthermore, the ubiquity of 4G and 5G mobile communications provides access to compute resources from remote locations where they would not otherwise have been available. This will likely increase utilisation and impact, but it will help people all over the world improve their lives and will likely lead to further innovation that benefits the environment.

Lastly, as bonkers as it is to even need to remind people of this in 2023, cloud computing virtually forces users to adopt virtualisation, utilising resources far more efficiently than traditional full-fat tin. It’s mind-boggling how many companies are still uncomfortable virtualising heavy workloads such as databases today, despite all of the classic concerns having been mitigated.

Remote village in Moroccan mountains

A Few Risks

The largest risk, though possibly one with its own mitigations, comes from increased adoption. The growing popularity of cloud computing means that demand for data centre resources is rising massively. As more businesses move their operations to the cloud, the energy consumption of centralised cloud data centres will continue to grow (whilst reducing that of local ones). Beyond that, the innovation of all those very clever humans finding new ways to use this technology is likely driving utilisation further beyond our traditional baselines.

The location choice for data centres can have a significant impact on the environment. In regions where electricity is generated using fossil fuels, cloud computing indirectly contributes to higher greenhouse gas emissions, and cooling data centres in hot climates can be super energy-intensive. If data sovereignty is not an issue, then utilising compute regions close to natural energy / cooling can help to mitigate this.

Inefficient development practices and code bloat further add to the risk landscape. The availability of virtually unlimited resources in the cloud may inadvertently reduce the drive for developers to write efficient code. Promoting clean development practices and optimisation is essential to minimise energy consumption. We should be fostering a culture of efficiency and sustainability right from the early stages of developer education to ensure this issue doesn’t continue to creep into the cloud. The growing trend of microservices architectures may actually help here, encouraging developers to think in small, efficient modules, but that remains to be seen!

One of the fastest-growing consumers of energy and hardware is cryptocurrency. The massive amounts of power used not only to generate new coins, but also to manage transactions on the chain, are a significant concern. Dedicated crypto hardware, such as ASICs designed specifically for mining, is more energy-efficient than general-purpose hardware like GPUs and can help reduce energy consumption. I would hope the miners will adopt these more, if only for their own benefit, if not for the environment!

TLDR

So to respond to the question posed by the title of this post, I believe the answer is yes, but there are some key considerations to ensure it remains so.

To make cloud computing a truly sustainable solution, we need to advocate for the use of renewable energy sources by cloud providers and the drive for net-zero carbon emissions in our cloud platforms (not just through buying carbon credits, but through actual change). Harnessing solar, wind, and hydroelectric power can enable cloud providers to decrease their dependence on fossil fuels and shrink their carbon footprint, but this will always be region-specific and impacted by data sovereignty regulations.

As consumers of the cloud, we have a crucial role to play by opting for cloud service providers that prioritise eco-friendly practices as well as adopting those ourselves, from architecture to development, fostering a culture of well-architected sustainability in our own organisations.

AWS, Cloud

TekBytes #5: The Current State of Cloud Security

Discussing the concept of Cloud Security over breakfast with my kids (yup – poor kids I hear you say!), I was thinking about the current state as one of constant (and accelerating) evolution and improvement. As more businesses adopt cloud computing, the need for robust and effective security measures has become increasingly important. While cloud hyperscalers have made significant investments in securing their platforms, the responsibility for implementing and maintaining effective security measures ultimately falls on customers or those they entrust to manage their platforms on their behalf.

Challenges

There are many challenges that businesses face when it comes to cloud security and far too many to go into in a TekBytes thought of the day, but let’s look at a few.

One major challenge is the lack of visibility and control over the infrastructure and data that are hosted in the cloud. This can make it very difficult to identify and address security vulnerabilities and threats. Another challenge is the complexity of cloud security, which can be exacerbated by the use of multiple cloud providers, each with their own security protocols and standards. Finally, we have a huge lack of skills in the market, and those few people with the skills are constantly being tempted by offers of outrageous salaries, so retaining your talented teams is really tough!

Despite these, there have been really significant advancements in cloud security in recent years. The hyperscalers have implemented many new security measures, such as encryption, improved access controls and policies, and significantly better monitoring tools, to help protect their platforms and their customers’ data. Post-Covid, with customers moving to the cloud in even larger numbers, it’s also great to see that customers have become more aware of the importance of cloud security and are taking steps to prioritise it.

The threat landscape for cloud security continues to evolve, with new and extremely sophisticated attacks emerging all the time. Businesses need to keep up and be proactive in their approach to cloud security.

Tips

So, a couple of quick tips to think about if you haven’t already started taking your cloud security seriously?

  1. Implement multi-factor authentication (MFA). A bit like when you hear sports commentators or coaches talking about a losing team, the common thread is simply not doing the fundamentals / basics well. One of the most effective ways to improve cloud security is to require MFA for all users accessing cloud resources (not just root). Lack of MFA is like leaving your car door unlocked and crying out to have your vehicle taken for a Ferris Bueller-style joy ride!
  2. Regularly review and update security policies. It’s important for businesses to regularly review and update their security policies to ensure they are aligned with current best practices and standards, and these best practices are constantly evolving. Things like access controls, password policies, data encryption, and incident response plans. By keeping security policies up-to-date and ensuring that all employees are aware of them, businesses can significantly reduce the risk of security breaches.
  3. Investigate the use of third-party security tools and services. Tools (if properly implemented) provide additional layers of protection, such as threat detection and monitoring, vulnerability scanning, and data encryption. Consider engaging security experts, one-off or regularly, to recommend improvements to your security posture, or simply outsourcing management of your cloud estate.

I’m genuinely hopeful that the emerging (and frankly astounding) improvements in artificial intelligence will have a positive and significant impact on businesses that don’t or can’t spend the time and resources to protect themselves and their customers effectively. If they don’t, we’re only going to see a proliferation of more high-profile and high-impact cases in the news!

Cloud