Exactly my experience. I know they vibe code features and that’s fine, but it looks like they don’t do proper testing, which is surprising to me because all you need is a bunch of cheap interns to do some decent enough testing.
I have little doubt where things are going, but the irony of the way they communicate versus the quality of their actual product is palpable.
Claude Code (the product, not the underlying model) has been one of the buggiest, least polished products I have ever used. And it's not exactly rocket science to begin with. Maybe they should try writing slightly less than 100% of their code with AI?
More generally, Anthropic's reliability track record for a company which claims to have solved coding is astonishingly poor. Just look at their status page - https://status.claude.com/ - multiple severe incidents, every day. And that's to say nothing of the constant stream of bugs for simple behavior in the desktop app, Claude Code, their various IDE integrations, the tools they offer in the API, and so on.
Their models are so good that they make dealing with the rest all worth it. But if I were a non-research engineer at Anthropic, I wouldn't strut around gloating. I'd hide my head in a paper bag.
"Coding" is solved in the same way that "writing English language" is solved by LLMs. Given ideas, AI can generate acceptable output. It's not writing the next "Ulysses," though, and it's definitely not coming up with authentically creative ideas.
But the days of needing to learn esoteric syntax in order to write code are probably numbered.
That's a bummer. I was looking forward to testing this, but that seems pretty limiting.
My current solution uses Tailscale with Termius on iOS. It's a pretty robust solution so far, except for the actual difficulty of reading/working on a mobile screen. But for the most part, input controls work.
My one gripe with Termius is that I can't put text directly into stdin using the default iOS voice-to-text feature baked into the keyboard.
I’ve been doing this for a while [1], but ultimately settled on building a thin transport layer for Telegram to accept and return media, with persistent channels, vastly improved messaging UX, etc., and ended up turning this into a ‘claw with a heartbeat and SOUL [2].
I really enjoyed reading both posts. Thanks for sharing!
I, like many others, have written my own "claw" implementation, but it's stagnated a bit. I use it through Slack, but the idea of journaling with it is compelling. Especially when combined with the recent "two sentence" journaling article[1] that floated through HN not too long ago.
Great posts! So far [2] is the only "claw" that has caught my interest, mostly because it isn't trying to do everything itself in some bespoke, NIH way.
I've been using email and Cloudflare email router. You don't get the direct feedback of a terminal, but it's much easier to read what's happening in HTML-formatted email.
It also feels kind of nice to just fire off an email and let it do its thing.
You jest but I was flabbergasted when doing some AI backed feature that the fix was adding a "The result you send back MUST be accurate." to the already pretty clear prompt.
I'm willing to bet most of their libraries are definitely vibe coded. I'm using the claude-agent-sdk and there are quite a few bugs and some weird design decisions. And looking through the actual Python code, it's definitely not what I would classify as 'best practice'. Bunch of imports in functions, switching on strings instead of enums, etc.
I had to downgrade to an earlier release because an update introduced a regression where they weren't handling all of their own event types.
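To make the "strings instead of enums" point concrete, here is a minimal sketch of the enum-based alternative. The event names and handlers are hypothetical and are not taken from claude-agent-sdk or any other real library; the point is only that an enum plus a handler table lets you detect an unhandled event type up front instead of discovering it as a regression.

```python
from enum import Enum


class EventType(Enum):
    # Hypothetical event names, not taken from any real SDK.
    TEXT = "text"
    TOOL_CALL = "tool_call"
    DONE = "done"


def handle_text(payload: dict) -> str:
    return f"text: {payload['body']}"


def handle_tool_call(payload: dict) -> str:
    return f"tool: {payload['name']}"


def handle_done(payload: dict) -> str:
    return "done"


HANDLERS = {
    EventType.TEXT: handle_text,
    EventType.TOOL_CALL: handle_tool_call,
    EventType.DONE: handle_done,
}

# Fail at import time if an EventType member is added without a handler.
_missing = set(EventType) - set(HANDLERS)
if _missing:
    raise RuntimeError(f"unhandled event types: {_missing}")


def dispatch(raw_type: str, payload: dict) -> str:
    # EventType(raw_type) raises ValueError for strings that are not members,
    # so a typo or an unknown event cannot silently fall through.
    event_type = EventType(raw_type)
    return HANDLERS[event_type](payload)
```

With string comparisons scattered through if/elif chains, neither of those two checks comes for free.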
A few weeks ago the GitHub integration was completely broken on the Claude website for multiple days. It's very clear they vibe code everything, and while it's laudable that they eat their own dogfood, it really projects a very amateurish image of their infrastructure and implementation quality.
In theory, comments on Hacker News should advance discussion and meet a certain quality bar lest they be downvoted to make room for the ones that meet the criteria. I am not sure if this ever was true in practice, it certainly seems to have waned in the years I have been a reader of this forum (see one of the many pelican on a bike comments on any AI model release thread), but I'd expect some people still try to vote with this in mind.
Being sarcastic doesn't lower the bar for a comment to meet to not get downvoted, so I wouldn't go thinking people miss the sarcasm without first considering whether the comment adds to the discussion when wondering why a comment is downvoted.
I only understood it after reading some of co_king_5’s other comments. This is Poe’s law in action. I know several people who converted into AI coding cultists and they say the same things but seriously. Curiously none of them were coders before AI.
I'm willing to bet you don't full-on YOLO vibecode like the lead Claude Code developer, running 10 Claude Code sessions in parallel to push 259 pull requests that modify >40k lines of code in a month [0]? There is zero chance any of that code was rigorously reviewed.
I use Claude Code almost every day [1], and when used properly (i.e. with manual oversight), it's an amazing productivity booster. The issue is when it's used to produce far more code than can be rigorously reviewed.
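For scale, the figures quoted above can be turned into a rough per-day review load. This is only back-of-the-envelope arithmetic; the 30-day month with no days off is an assumption, and 40k lines is a lower bound since the claim was ">40k".

```python
# Figures quoted in the comment above; 40k is a lower bound (">40k").
PULL_REQUESTS = 259
LINES_CHANGED = 40_000
DAYS_IN_MONTH = 30  # assumption: a 30-day month, no days off

prs_per_day = PULL_REQUESTS / DAYS_IN_MONTH
lines_per_day = LINES_CHANGED / DAYS_IN_MONTH
lines_per_pr = LINES_CHANGED / PULL_REQUESTS

print(f"{prs_per_day:.1f} PRs/day")
print(f"{lines_per_day:.0f} lines/day")
print(f"{lines_per_pr:.0f} lines/PR")
```

Under those assumptions that is roughly 8.6 pull requests and at least 1,333 changed lines to review every single day, at about 154 lines per pull request.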
> - You can't interrupt Claude (you press stop and he keeps going!)
This is normal behavior on desktop. Sometimes it's in the middle of something? I also assume there's some latency.
> - At best it stops but just keeps spinning
Latency issues then?
> - It can get stuck in plan mode
I've had this happen from the desktop, and using Claude Code from mobile before remote control, I assume this has nothing to do with remote control but a partial outage of sorts with Claude Code sometimes?
I don't work for Anthropic, just basing off my anecdotal experience.
Yes. Doing the same. What is the advantage of this new feature? Tmux/Tailscale/Termius give you full control of your terminal.
Or is it mainly to save the end user the hassle of setting it up correctly?
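For anyone unfamiliar with the setup being compared here, the terminal side is roughly one command once the network path (Tailscale or otherwise) exists. A minimal sketch, assuming a host reachable as `devbox` and a session named `claude`; both names are made up for illustration.

```python
import shlex


def tmux_attach_command(host: str, session: str = "main") -> list[str]:
    """Build an ssh command that attaches to (or creates) a tmux session."""
    return [
        "ssh",
        "-t",  # force a TTY so tmux can draw its UI
        host,
        # -A attaches if the named session already exists, else creates it
        "tmux", "new-session", "-A", "-s", session,
    ]


# "devbox" and "claude" are hypothetical names.
cmd = tmux_attach_command("devbox", "claude")
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run` from a terminal would open the session; on a phone you would type the printed command into the SSH client instead. The hassle is mostly everything around it: keys, the VPN, and a usable keyboard.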
Oh, lots of people will not be comfortable with the tmux approach. The Anthropic feature makes sense. But it's Max only and doesn't work well according to other comments.
I was using this religiously but there’s a bug currently that makes the initialization fail and/or throws an error on the phone client.
Absolutely great piece of software otherwise, free, anonymous, encrypted and so on. Really hope the team can fix this soon - I would hate to switch back to tmux tunneling.
Opencode's 'web' command makes your local session run in the browser with the same access rights as the CLI. It's a pretty slick interface too. I sometimes use it instead of the CLI even when I can access both.
You can test it right now if you want with the included free models.
It's changing super fast. I am using it on the desktop mostly and when I tried on my phone there were issues yes. But do try it out again in a few weeks.
(I am actually using zellij on the remote and using various CLIs more than I am using only opencode on the web. I was using wezterm mux until about a week ago but the current state of the terminal is not very good for this scenario. It seems like almost all the CLIs are choking because of nodejs ink library)
I feel like a lot of folks are saying this kills the Code on your Phone opportunity some start-ups are building for. I don't agree. I feel like coding agents are like streaming services, we will subscribe to multiple and switch between them. So for one there's value in a universal control plane. The other is that mobile as a coding interface should offer more than a remote control to the desktop. I think there's still some space to cook, especially if people are investing 8 hours a day talking to agents, the interface surely matters.
I don't know a single person who is satisfied with the status quo on streaming services where you have to subscribe to multiple ones. Everyone is complaining that the landscape is 1) more fragmented than cable was, 2) costs more, 3) has even more ads than cable
I think people forgot how bad it was. It was much more fragmented before but instead of services it was fragmented by time. Sure you have access to Seinfeld, but you can watch one or two Seinfelds a night at 8pm and 11pm.
I also remember base cable without any movies was around $60 or something, and with some movie channels it was >$100. And that's not inflation adjusted. You can easily get 3 or 4 of the top services for $100 today.
Finally, claiming there are more ads on these services is a joke. There was ~20m of programming in every 30m slot, meaning 1/3 of the time you were watching commercials. And not just any commercials, the same commercials over and over. There was even a case of shows being sped up on cable to show more commercials.
I get it, everyone wants everything seamlessly and for next to nothing, but claiming that 90s cable was even comparable is absurd.
Not that it is particularly relevant to agentic coding but how can anyone truly argue streaming costs more? Average cable packages were exceeding 125-150 USD a month (in 2000 dollars). Under no circumstances would I be sympathetic to the argument that streaming costs more.
You can get all 7 of the major streaming subs for less without even shopping out deals. That is 100s of times the volume and quality of content that was delivered on cable for far less. It is so much content realistically that no one I have ever met has subscribed to all of them at once.
The argument really is empty. The fragmented experience is annoying, but it isn't more expensive... And it DEFINITELY has fewer ads.
I agree. I spend a lot of time working from my phone so I had to make my own workflow that works for me. I've been following all these bans and drama with the subscription keys and custom harnesses etc. I think there's room for a "universal control plane" that lets you leverage the CLI providers (and whatever crappy interfaces / APIs they give you).
Maybe it’s related to what I tend to use the agents for, but I guess I don’t understand what this is for. Typically I try to structure the tasks in a way that requires me to do or check something important when the agent gets back to me. If the agent's query is trivial enough that I can respond from my phone, it was likely not needed at all. If the agent finished - fine. It will have to wait until I get back in front of the computer anyway.
I've used similar things (omnara/happy) while taking walks. Sometimes I'll get an idea about the problem I'm working on and I can just dictate it into my phone and check in 15min later. I stopped being able to do that when Claude added those nice interview panes to clarify things, because it didn't work back then. But mostly it's really annoying when you think you've created the plan/prompt and that it's ready to go, but it gets stuck or decides to stop while you're away. I pretty often need to give Claude a "continue" kick. To be fair, this happens far less after Opus 4.6.
Also, I felt the need to use it far more when I was on Pro vs a Max plan. On Pro, when you hit the usage windows, it's nice to be able to kick Claude back into gear without scheduling your life around getting back to the terminal to type "continue".
- Plan mode -> answer questions/make corrections, continue planning
- Some of us don't do full yolo mode all the time, then tool approvals or code reviews are required, nice to do a quick review and decide if you need to go back to your computer or not
- Letting claude spin or handle a long-running task outside of normal work hours and being able to check in intermittently to see if something crashed
I don't dangerously accept permissions outside of a few scripts I have reviewed as safe. This means Claude gets stuck often when testing its work, but also means it doesn't uninstall production workloads from the Kubernetes cluster.
Weird all these companies struggle so much to support remote services, ssh has been working for me pretty seamlessly for like the 20 years I've been using it and has allowed me to remote-control any computer I own with relatively reliable authentication (with some hiccups that tend to be patched pretty rapidly when found) throughout that entire period. I hear tell it worked even before I was using computers professionally, too
Claude Code Team: Please fix the core experience instead of branching out into all these tertiary features. I know it is fun and profitable to release new features, but you need to go deeper into features, not broader into here-be-dragons territory.
The real question is whether remote control becomes the default interface for AI coding tools. Right now we have this awkward split between chat-based coding (Cursor, Copilot) and terminal-based agents (Claude Code, Aider). Remote control bridges that gap but introduces a new failure mode: you're now debugging the agent's environment management on top of the actual code problems.
I've found the terminal-based approach more reliable precisely because it's simpler - fewer layers of abstraction between the model and the filesystem. Adding a web UI on top reintroduces all the state synchronization issues we thought we'd left behind.
Can anyone recommend a tool that gives a 'mission control' overview of multiple agents, but also combines some basic project management functionality?
For example, maybe I have an idea for a feature and I want to spin up a new branch and have agents work on that. But then I get stuck or bored (I'm talking personal usage), so decide to park it. But maybe after a few days I have a shower thought and want to resume it.
The current method of listing sessions and resuming them can work, but you need to find the right session. If there were something that showed all the branches, a docs overview of what that feature is, and the current progress, it would make this workflow a lot more effective. Plus I switch LLMs when I hit rate limits.
I'm probably going to just build it myself, but wondering if anyone has something that does this already.
Maybe combine Claude Code + Obsidian, so Claude can use the node structure as a second brain for projects. I was just watching this video (not affiliated):
This new remote control handoff is neat but still requires you to remember to do the handoff. Oftentimes I’m waiting on an agent and then walk away.
I built Crabigator[1] and it's a wrapper around `claude` and `codex`, so it's ready for coding on the go on start and already streaming. Plus, crabigator shows many parallel windows, separated by repo/project/machine, so you can manage multiple agents seamlessly.
I don't think they're targeting the pros here, who already solved this problem with VPN/tmux/SSH, but those whose thrilled, serious reaction will be "whoaaa crazy i can command claude code now from my phone while on the toilet or on a date?" It's basically a defense attempt against Openclaw.
Well it DOES have less storage than a Nomad (hence lame), but this way you don't need to pay for a public IP address, or for a VPS to run Wireguard on, or for a commercial VPN solution, and then install a terminal emulator on your phone and set up SSH keys.
People tried reinventing terminals, SSH, and tmux for phones. It's a pretty terrible experience using your thumbs. And it takes significant know-how to set up.
And in modern stacks, it almost necessitates a man in the middle - tailscale is common but it's still a central provider. So is it really the most inefficient way possible?
I'm probably 10 years out of date. Are ethereum smart contracts still a thing? I'm sure you could deploy one of those for every agent session to handle the notifications
Fair point technically, but I think the value proposition isn't the persistent session; rather, it's the abstraction layer. Screen/tmux assumes you know what commands to run. This assumes you know what outcome you want. For someone like me who came to coding late and doesn't have 20 years of muscle memory with terminal tools, the inefficiency in transport is more than offset by the efficiency in intent. Different tools for different people.
If you want this to compete with tools in the OpenClaw space, I’d prioritize first class Telegram and Slack support. Push progress into a chat thread, and let me approve, retry, cancel from there. That’s where teams live. A separate mobile frontend will always feel clunky and fragile.
I've been running something similar for a few months, which is a voice-first interface for Claude Code running on a local Flask server. Instead of texting from my phone, I just talk to it. It spawns agents in tmux sessions, manages context with handoff notes between sessions, and has a card display for visual output.
The remote control feature is cool but the real unlock for me was voice. Typing on a phone is a terrible interface for coding conversations. Speaking is surprisingly natural for things like "check the test output" or "what did that agent do while I was away."
The tmux crowd in this thread is right that SSH + tmux gets you 90% of the way there. But adding voice on top changes the interaction model. You stop treating it like a terminal and start treating it like a collaborator.
Doesn't look like it has proper worktree management. UIs that abstract away worktrees are very powerful. I vibe coded my own (https://github.com/9cb14c1ec0/vibe-manager), which unfortunately doesn't have the remote component that hapi does.
My needs are very basic and it hasn’t failed me yet, I like that it doesn’t try to do much. I know it has voice capabilities through eleven labs but I haven’t used that feature.
Worth noting that this is currently broken for a number of users, I'm on a Max plan and I get the message "Error: Remote Control is not enabled for your account. Contact your administrator" which isn't helpful since I'm my administrator and ... this gets recursive quickly.
I would have hoped for them to at least support the "/clear" command or some form of it, especially to manage context if we're limited to a single session between the terminal and Claude iOS app. I like to work on things one at a time and /clear my way between them to get back to 0% context, which seems impossible with the current setup here?
Typing "/clear" in the terminal clears it, but the Claude iOS app just outputs raw XML instead and doesn't actually do anything:
This seems like an excellent thread to plug the TUI I've been working on that makes using bubblewrap relatively easy and somewhat pleasant. I have a recipe in the README for using it with Claude. Granted that Claude has --sandbox, but probably better that sandboxing be done by something outside of the Anthropic ecosystem.
Boggles my mind that this is actually a thing that still needs to be solved. Just remote into your computer (I prefer TeamViewer). That is it. One step.
I’ve been doing this with a tmux tunnel and an app on my laptop that connects sessions you select to a virtual terminal using sockets. I asked Claude to build it and it works great - full terminal functionality and Markdown review with comments so you don’t need to cross your eyes to review plans.
Excited to see how this matures so people without that inclination can also be constantly pestered by the nagging idea that someone, somewhere is being more productive than them :)
This kind of release shows Anthropic as a company is suffering from the same thing we all are right now. Removing the friction from having an idea and executing it stops you from remembering The Point. Yes, programming from your phone is an exciting modality and maybe even the future of how we work, but coding from your bedroom, AND the toilet, AND the woods AND your office is definitely (hopefully) not the future.
I wonder if anyone is working on an AI framework that encourages us to keep our eye on the big picture, then walk away when a reasonable amount of work is done for the day.
Yes, individuals are creating cool mobile coding solutions and Anthropic doesn't want to get left behind. I know I'm working my ass off at work right now because LLM coding makes it fun, but I also often don't prioritize what I'm doing for the big picture because I just try everything that comes into my inbox, in order, because it's so fast to do with Claude Code.
There are two types of software engineers: Those who do and then think, or those who think and then do. Claude Code seems to strictly be for the former, while typically the engineers who can maintain software long-term are the latter.
Not sure if we have any LLM-tooling for the latter, seems to be more about how you use the tools we have available, but they're all pulling us to be "do first, think later" so unless you're careful, they'll just assume you want to do more and think less, hence all the vibeslop floating around.
> Claude Code seems to strictly be for the former, while typically the engineers who can maintain software long-term are the latter.
Given the number of CC users I know who spend significant time on creating/iterating designs and specs before moving to the coding phase, I can tell you, your assumption is wrong. Check how different people actually use it before projecting your views.
Yeah, I wasn't trying to say "These are the people who use CC, for these purposes" but rather what the intention seems to be for Claude Code in the first place. I'm using CC from time to time, to keep up to date with what tooling is available, and I also know people who use CC every day and plan a lot up front. Sorry if I gave the impression that I meant that everyone using CC is doing that. I was trying to get at what the purpose of the tool seems to be, which seems to be true today too, as the models continuously seem to steer you to "doing" and moving faster, not stopping and thinking.
This seems like a real coarse and not particularly accurate binary, but even if it were true, the thing about Claude Code and agentic coding like this is the cost of making a mistake or the cost of not being happy with a design and having to back it out is getting smaller and smaller.
I would argue that rapidly iterating reveals more about the problem, even for the most thoughtful of us. It's not like you check your own reasoning at the door when you just dive head first into something.
i'd say claude code is designed for think then do - that's where it's different from other tools!
i think it still pulls to do then think because you can't tell what the agent understood of what you asked it to do from that first think, until it's actually produced something.
This isn't a binary thing - even if you prefer to build maintainable systems very often the trade-off is - you don't ship in time and there's no long term - the project gets scrapped.
So even if it comes at the expense of long term maintainability - everyone should have this in their toolbox.
I find it often helps me to see a feature before I evaluate if it was really a good idea in the first place. This is my failing--but one thing I like about Claude is that it's now possible to just try stuff and throw away whatever doesn't work out.
I usually have conversations with Claude for clearing my mind and forming the scope of a project. I usually use voice transcription from Claude app to take notes and explore all my options.
Same. When I can't be at my desk, my projects don't stop -- I just do the tasks that work well enough on the phone. Brainstorming, planning, etc. Or tasks that the agent can easily verify.
Having access to my local repository and my whole home folder is much easier than dealing with Claude or ChatGPT on the web. (Lots of manual markdown shuffling, passing in zipfiles of repositories, etc).
I agree with your basic framing but not your conclusion. I've met plenty of do-ers before thinkers who are self-aware enough to also maintain software long-term.
Claude Code and similar agents help me execute experiments, prototypes and full designs based on ideas that I have been refining in my head for years, but never had the time or resources to implement.
They also help get me past design paralysis driven by overthinking.
Perhaps the difference between acceleration and slop is the experience to know what to keep, what to throw away, and what to keep refining.
This is the real insight in this thread. The false binary of "rest OR work" is dissolving. I do some of my best problem-solving while walking my kid to school or making lunch...the context switch lets things percolate. Having a way to capture that momentum without needing to rush back to my desk and remember what I was thinking would be genuinely useful. The interface matters less than the latency between idea and execution.
"The false binary of "rest OR work" is dissolving."
If you're like most people in this forum, there are people who stand to gain financially if you convince yourself that you don't need boundaries between work and rest. You may even believe that you stand to gain financially, and that this will be best for you in the long term.
Please, take some time to rest for a day or two and really think about what you want your boundaries to be. Write them down.
> The false binary of "rest OR work" is dissolving
Sounds like someone hasn't yet worked multiple years with software engineering, or any job for that matter.
Your mind might trick you into believing it won't matter, but your body and mind NEED to be disconnected from work, 100%, at some point during your regular rhythms of life, otherwise you'll burn out much faster than the people you seemingly are trying to compete with.
Life has never been a sprint; it is a marathon, and if you spend all your young, experience-less years treating it as a sprint, you won't have any energy left for completing the marathon.
I’m guessing you’re suggesting it’s ok to lose time if you’re away from your computer enjoying life, and I agree. I also don’t see the issue in finding ways to save time with work.
If you mean something different, please elaborate.
I think a significant distinction between your approach and Claude’s approach is that your approach requires allowing your machine to accept inbound connections but Claude’s approach does not. Claude probably went with the latter to avoid a whole class of security issues and mitigate risk of users having their machines compromised. I’m not familiar with what the new vectors of attack are with Claude’s approach though.
That's not what vendor lock in means. If you sign up for a cloud hoster and then build your whole product on proprietary services that you can't get anywhere else instead of using an off the shelf database or open source software, that's vendor lock in.
If you'd have to switch to a different tool to do your coding that's not vendor lock in.
except there will be no dropbox moment. There is no startup that stands a chance, Openclaw is free, the foundation model providers basically won this space just by providing subscriptions cheaper than any competitor could ever do.
> Unlike Claude Code on the web, which runs on cloud infrastructure, Remote Control sessions run directly on your machine and interact with your local filesystem. The web and mobile interfaces are just a window into that local session.
For the vibe'y workflows, this would easily solve parallel long running work without skipping permissions: schedule 10 different tasks and go for a run. Occasionally review what the hallucination machine wants to do, smash yes a few times, occasionally tell it not to be silly, have a nice run. Essentially, solving remote development, though perhaps not quite in the way people usually think of it.
> Limitations
> One remote session at a time: each Claude Code session supports one remote connection.
Ehh, I think it's hardly different from the people who leave Claude Code working on problems overnight with really loose permissions - seemingly the chance of them returning to it mining crypto for Putin is low enough for it to not be a consideration (see the whole OpenClaw movement).
And people have been remoting into their machines for a while, so now having a pretty-UI-but-walled-garden variety doesn't ring that many alarm bells. If they manage to get it right, it wouldn't be that much different from running some CI stuff on your machine while you're making tea, or reviewing pull requests while lounging around.
One more step closer to a closed source system. I think their objective is to move all your code onto their systems so you can only modify the code through their AI, so they have a moat and it will be difficult to move away. They will “guard” your source code and you’ll never see it.
Imagine if tomorrow they make a 10x smarter AI, but they say: you can only use it if you upload your source code to us, and you can’t see the source code anymore.
So you either stay on lower end models or you give up and use a 10x model.
I only see one issue: it will be very difficult for them to “guard” the source code and not let you access it.
Seems like it could be problematic in the future since code can't be fixed by humans so the only source of future code for training is unedited Claudeslop.
Small UX note: the first time you run the command it only shows a URL. It's not until you run it again that you discover it also generates a QR code, which is actually the fastest way to open it on your phone. Would be nice if the QR showed up on the first run too, almost missed it.
Does anyone know if it caffeinates automatically? I sometimes see caffeinate appear in the terminal tab title so clearly they are using it, but I’m just curious if I have to run caffeinate separately if, for instance, the agent finishes its task and is waiting for a new one and I want to keep it alive.
I was just thinking about it two days ago - how nice it would be to use my local Claude code instead of the limited cloud version to make some ad hoc changes when I have a fresh idea on a hike. And two days later - here we go, a new release
Regular claude code is already a remote access door to your setup, once you've granted a few command execution permissions. (e.g. if it can edit your code and run the test suite)
This resonates hard. I'm a self-taught dev who started coding ~7 months ago, and honestly the conversational back-and-forth with Claude is how I built my entire first app. Not by reading docs cover to cover, but by describing what I wanted, getting code back, breaking it, asking why, and iterating. The idea of doing that untethered from my desk is genuinely exciting — not because I want to work more, but because some of my best thinking happens on walks, not in front of a screen.
How about they are pointing out a worrisome direction society might be taking, whereby work will infiltrate even more of what used to be family or personal time, thus accelerating burnout?
Anthropic is spitting out software in 2 weeks that took enterprises 24 months to ship 5 years ago (and was still buggy AF, let's please actually think about all the vmware citrix enterprise trash you tolerated). It'll get hardened over the next couple weeks.
You all can pretend the software dev cycle hasn't changed... get real.
have you gotten a terminal interface on your phone to be acceptably usable? I haven't - not without a real keyboard attached in any case. too many parts of the UX are designed for a true keyboard.
Someone is going to solve this with a non-buggy app, but it really needs to have all the features of Claude code. Everyone is a power user in this segment
Oh come on, now that I have a personal remote control already set up using hooks, specifically the PermissionRequest, and Home Assistant push notifications where I can allow or deny a specific action?
TIL that HA notifications can have associated actions. I have the exact same setup as you, except I only receive the notification and then walk over to the laptop to unblock the agent feeling like a human tool call. This will improve my workflow, thank you.
The notification payload for reference, you will also need a permission input_select (pending/allow/deny) and an automation that triggers upon mobile_app_notification_action:
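The payload itself did not survive in this comment, so the following is only a sketch of what such a payload could look like, assuming Home Assistant's actionable-notification shape (an `actions` list under `data`, with the tapped action reported in a `mobile_app_notification_action` event). The function name, the `CLAUDE_ALLOW` / `CLAUDE_DENY` identifiers, and the title text are hypothetical, not the commenter's actual values.

```python
import json
from typing import Dict


def build_permission_notification(tool_name: str, command: str) -> Dict:
    """Build service data for a Home Assistant mobile_app notify call.

    Assumption: the action identifiers are invented for illustration.
    An automation triggered by mobile_app_notification_action would match
    on them and set the permission input_select to allow or deny.
    """
    return {
        "title": "Claude Code permission request",
        # Say what is being approved, as the thread recommends.
        "message": f"{tool_name}: {command}",
        "data": {
            "actions": [
                {"action": "CLAUDE_ALLOW", "title": "Allow"},
                {"action": "CLAUDE_DENY", "title": "Deny"},
            ]
        },
    }


if __name__ == "__main__":
    payload = build_permission_notification("Bash", "npm test")
    print(json.dumps(payload, indent=2))
```

The hook side would then poll or wait on the input_select until it leaves the pending state; that part depends on the individual setup and is not shown.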
Exactly that. And the push notification includes what I am approving. I also send these pushes after a sensible delay, because otherwise I would be bombarded with push notifications for actions I have already approved manually.
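The delay logic can be sketched roughly like this. It is a minimal illustration, assuming the request status uses the same pending/allow/deny values mentioned in the thread; the function name and the 20-second default are made up.

```python
def should_send_push(requested_at: float, now: float, status: str,
                     delay_seconds: float = 20.0) -> bool:
    """Decide whether a push notification is still worth sending.

    A push goes out only if the request is still pending once the delay
    has elapsed, so anything approved at the keyboard in the meantime
    never reaches the phone.
    """
    if status != "pending":
        # Already allowed or denied manually: stay quiet.
        return False
    return (now - requested_at) >= delay_seconds
```

In practice `now` and `requested_at` would come from something like `time.time()`, and the check would run right before the notify call.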
Yep. Came to say the same thing. I'd only used Codex in VSCode and in the Codex app, and at least those have the same history, but my understanding is that the cloud and CLI versions have this hierarchy of 'visibility' [0]. Perhaps they'll need to change this design decision?
I honestly think this is definitely where (at least part of) the industry is heading, yes.
This is not to say engineers are getting replaced — but, certainly, they are changing their work. And, sure, maybe _some_ of them are being replaced. Not most of the ones I know, though. They are essential to orchestrate, curate, maintain, and drive all of this.
(Now, do they want to orchestrate it? Whole different story...)
WOW I had been using the Codex app (Claude/Anthropic have a few annoying problems) and wishing there was something like this!
I often get ideas while I'm in bed or outside away from my computer, and was thinking that the ability to code on your computer from your phone, through AI, would be such a killer app.
My favorite use case would be asking the AI to review code and going over its findings/suggestions while I'm away from the computer or trying to fall asleep.
Doesn't have to be. Before OpenClaw was a thing, people were experimenting with setups to allow them to drive their agent remotely.
And of course, OpenClaw is built to be a very generalist agent with a chat interface - same effective outcome as remotely controlling an AI harness, but not exactly what everyone wants.
Pretty happy to see this. I've previously tried happy.engineer for this, but that wanted my Anthropic API token for itself (!) which is a no-no.
Seeing how the labs tend to copy the best functionality in any FOSS developments, I decided to wait - happy I did, here's the official functionality for this that is much more trustworthy.
Right now:
- You can't interrupt Claude (you press stop and he keeps going!)
- At best it stops but just keeps spinning
- The UI disconnects intermittently
- It disconnects if you switch to other parts of Claude
- It can get stuck in plan mode
- Introspection is poor
- You see XML in the output instead of things like buttons
- One session at a time
- Sessions at times don't load
- Every time you navigate away from Code you need to wait for your session to reappear
I'm sure I'm missing a few things.
I thought coding was a solved problem Boris?
Claude Code (the product, not the underlying model) has been one of the buggiest, least polished products I have ever used. And it's not exactly rocket science to begin with. Maybe they should try writing slightly less than 100% of their code with AI?
Their models are so good that they make dealing with the rest all worth it. But if I were a non-research engineer at Anthropic, I wouldn't strut around gloating. I'd hide my head in a paper bag.
Maybe this was sarcasm, but it's a good point:
"Coding" is solved in the same way that "writing English language" is solved by LLMs. Given ideas, AI can generate acceptable output. It's not writing the next "Ulysses," though, and it's definitely not coming up with authentically creative ideas.
But the days of needing to learn esoteric syntax in order to write code are probably numbered.
My current solution uses Tailscale with Termius on iOS. It's a pretty robust solution so far, except for the actual difficulty of reading/working on a mobile screen. But for the most part, input controls work.
My one gripe with Termius is that I can't put text directly into stdin using the default iOS voice-to-text feature baked into the keyboard.
[1] https://elliotbonneville.com/phone-to-mac-persistent-termina...
[2] https://elliotbonneville.com/claude-code-is-all-you-need/
I, like many others, have written my own "claw" implementation, but it's stagnated a bit. I use it through Slack, but the idea of journaling with it is compelling. Especially when combined with the recent "two sentence" journaling article[1] that floated through HN not too long ago.
[1] https://alexanderbjoy.com/two-sentence-journal-approaches/
I’ll have to check out the journaling article. I’ve been journaling a lot more lately!
It also feels kind of nice to just fire off an email and let it do its thing.
I had to downgrade to an earlier release because an update introduced a regression where they weren't handling all of their own event types.
Being sarcastic doesn't lower the bar a comment has to meet to avoid downvotes. Before assuming people missed the sarcasm, consider whether the comment adds to the discussion.
I use Claude Code almost every day [1], and when used properly (i.e. with manual oversight), it's an amazing productivity booster. The issue is when it's used to produce far more code than can be rigorously reviewed.
[0] https://www.reddit.com/r/ClaudeAI/comments/1px44q0/claude_co...
[1] https://news.ycombinator.com/item?id=45511128
This is normal behavior on desktop; sometimes it's in the middle of something? I also assume there's some latency.
> - At best it stops but just keeps spinning
Latency issues then?
> - It can get stuck in plan mode
I've had this happen on desktop, and also when using Claude Code from mobile before remote control existed. I assume this has nothing to do with remote control and is sometimes a partial outage of sorts with Claude Code?
I don't work for Anthropic, I'm just basing this on my anecdotal experience.
- - -
get tailscale (free) and join on both devices
install tmux
get an ios/android terminal (echo / termius)
enable "remote login" if on mac (disable on public wifi)
mosh/ssh into computer
now you can do tmux then claude / codex / w/e on either device and reconnect freely via tmux ls and tmux attach -t <id>
- - -
You can name tmux and resume by name via tmux new -s <feature> and tmux attach -t <feature>
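As a rough sketch, the named-session commands above can be wrapped in a few helpers. The helper names are hypothetical. The `-A` flag on `new-session` (attach if the session already exists) is a standard tmux option but is not something the thread mentions.

```python
import shlex
from typing import List


def tmux_new_command(name: str) -> List[str]:
    """argv for starting a named session, as in `tmux new -s <feature>`."""
    return ["tmux", "new", "-s", name]


def tmux_attach_command(name: str) -> List[str]:
    """argv for reattaching by name, as in `tmux attach -t <feature>`."""
    return ["tmux", "attach", "-t", name]


def tmux_resume_or_create_command(name: str) -> List[str]:
    """argv that attaches if the session exists and creates it otherwise."""
    return ["tmux", "new-session", "-A", "-s", name]


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe anywhere.
    for argv in (tmux_new_command("feature-x"),
                 tmux_attach_command("feature-x"),
                 tmux_resume_or_create_command("feature-x")):
        print(shlex.join(argv))
```

Using one session name per feature or branch makes it easier to find the right agent again after reconnecting from another device.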
How do you deal with multiple concurrent sessions of CC with this setup?
How important is mosh? I wasn't able to get it set up the last time I tried... ran into a bunch of issues.
Could even use cc to check in on and/or "send-keys"
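A minimal sketch of that idea, assuming tmux is installed and a session with the given name already exists: `send-keys` types into the session and `capture-pane -p` prints what is currently on screen. The function names are illustrative only.

```python
import subprocess
from typing import List


def send_keys_command(session: str, text: str) -> List[str]:
    """argv that types `text` into the session and presses Enter."""
    return ["tmux", "send-keys", "-t", session, text, "Enter"]


def capture_pane_command(session: str) -> List[str]:
    """argv that prints the visible contents of the session's active pane."""
    return ["tmux", "capture-pane", "-p", "-t", session]


def check_in(session: str) -> str:
    """Return the current screen text of a running session.

    Raises CalledProcessError if tmux fails, e.g. when the session
    does not exist.
    """
    result = subprocess.run(capture_pane_command(session),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Another agent, or a small script, could call `check_in` periodically and use `send_keys_command` to nudge a stalled session with something like "continue".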
What wasn't working about mosh? Just install mosh and use mosh to connect
It's this.
Don't have a Dropbox moment ;) [1]
[1]: https://news.ycombinator.com/item?id=9224
What I posted "just works".
Based on my experience many people don't know this is a thing you can do.
You can test it right now if you want with the included free models.
https://opencode.ai/docs/web/
(I am actually using zellij on the remote, and using various CLIs more than I am using only opencode on the web. I was using wezterm mux until about a week ago, but the current state of the terminal is not very good for this scenario. It seems like almost all the CLIs are choking because of the Node.js Ink library.)
I also remember base cable without any movies was around $60 or something, and with some movie channels it was >$100. And that's not inflation adjusted. You can easily get 3 or 4 of the top services for $100 today.
Finally, claiming there are more ads on these services is a joke. There was ~20m of programming for every 30m slot, meaning 1/3 of the time you're watching commercials. And not just any commercials, the same commercials over and over. There was even a case of shows being sped up on cable to show more commercials.
I get it, everyone wants everything seamlessly and for next to nothing, but claiming that 90s cable was even comparable is absurd.
https://www.digitaltrends.com/home-theater/how-networks-spee...
I'm not sure what your point is.
You can get all 7 of the major streaming subs for less without even shopping out deals. That is 100s of times the volume and quality of content that was delivered on cable for far less. It is so much content realistically that no one I have ever met has subscribed to all of them at once.
The argument really is empty. The fragmented experience is annoying, but it isn't more expensive... And it DEFINITELY has fewer ads.
I literally see no ads on my streaming subscription for close to a tenth of the price of cable.
The results are enough for me, and I'm not doing things that allow me to differentiate the output between ChatGPT, Claude, and the others.
The agents are more like the radio in my car, whenever I want music, I switch channel until I find something good enough.
If I'm really in need of something special, I'll use Spotify on my phone.
And sometimes, I just drive with the radio off.
There's a comparison of the approaches as I see them here https://yepanywhere.com/subscription-access-approaches
Also, I felt the need to use it far more when I was on Pro vs a Max plan. On Pro when you hit the usage windows it's nice to be able to kick claude back into gear without scheduling your life around getting back to the terminal to type "continue".
- Some of us don't do full yolo mode all the time, then tool approvals or code reviews are required, nice to do a quick review and decide if you need to go back to your computer or not
- Letting claude spin or handle a long-running task outside of normal work hours and being able to check in intermittently to see if something crashed
I've found the terminal-based approach more reliable precisely because it's simpler - fewer layers of abstraction between the model and the filesystem. Adding a web UI on top reintroduces all the state synchronization issues we thought we'd left behind.
For example, maybe I have an idea for a feature and I want to spin up a new branch and have agents work on that. But then I get stuck or bored (I'm talking personal usage), so decide to park it. But maybe after a few days I have a shower thought and want to resume it.
The current method of listing sessions and resuming them can work, but you need to find the right session. If there were something that showed all the branches, a docs overview of what each feature is, and the current progress, it would make this workflow a lot more effective. Plus I switch LLMs when I hit rate limits.
I'm probably going to just build it myself, but wondering if anyone has something that does this already.
https://youtu.be/6MBq1paspVU
I built Crabigator[1]. It's a wrapper around `claude` and `codex`, so it's ready for coding on the go as soon as it starts, and it's already streaming. Plus, Crabigator shows many parallel windows, separated by repo/project/machine, so you can manage multiple agents seamlessly.
[1]: https://drinkcrabigator.com
The daily “what broke and changed now” with claude code is wearing me out fast.
And in modern stacks, it almost necessitates a man in the middle - tailscale is common but it's still a central provider. So is it really the most inefficient way possible?
we can upload snapshot of zip files to blockchain, then notify customer via servers
You need to learn to type less than a dozen total characters including the command.
Not to mention a lot of terminals automatically integrate with tmux so you don’t have to do anything but open the terminal.
Sure, different tools for different people. And if you want to use a newfangled triangular wheel they just invented, no one’s going to stop you.
It’s still a triangular wheel at the end of the day
The remote control feature is cool but the real unlock for me was voice. Typing on a phone is a terrible interface for coding conversations. Speaking is surprisingly natural for things like "check the test output" or "what did that agent do while I was away."
The tmux crowd in this thread is right that SSH + tmux gets you 90% of the way there. But adding voice on top changes the interaction model. You stop treating it like a terminal and start treating it like a collaborator.
Here is a demo of it controlling my smart lights: https://www.youtube.com/watch?v=HFmp9HFv50s
There's an open issue on github for it:
https://github.com/anthropics/claude-code/issues/28098
Why does the remote control need that? For what?
I'd rather use common developer tools like termux or mosh etc. on a phone if I need that functionality.
Typing "/clear" in the terminal clears it, but the Claude iOS app just outputs raw XML instead and doesn't actually do anything:
https://github.com/reubenfirmin/bubblewrap-tui
Excited to see how this matures so people without that inclination can also be constantly pestered by the nagging idea that someone, somewhere is being more productive than them :)
I wonder if anyone is working on an AI framework that encourages us to keep our eye on the big picture, then walk away when a reasonable amount of work is done for the day.
Yes, individuals are creating cool mobile coding solutions and Anthropic doesn't want to get left behind. I know I'm working my ass off at work right now because LLM coding makes it fun, but I also often don't prioritize what I'm doing for the big picture because I just try everything that comes into my inbox, in order, because it's so fast to do with Claude Code.
We all sense it!: <https://newsroom.haas.berkeley.edu/ai-promised-to-free-up-wo...> <https://ghuntley.com/teleport/> <https://steve-yegge.medium.com/the-ai-vampire-eda6e4f07163>
Not sure if we have any LLM-tooling for the latter, seems to be more about how you use the tools we have available, but they're all pulling us to be "do first, think later" so unless you're careful, they'll just assume you want to do more and think less, hence all the vibeslop floating around.
Given the number of CC users I know who spend significant time on creating/iterating designs and specs before moving to the coding phase, I can tell you, your assumption is wrong. Check how different people actually use it before projecting your views.
I would argue that rapidly iterating reveals more about the problem, even for the most thoughtful of us. It's not like you check your own reasoning at the door when you just dive head first into something.
I think it still pulls you to do first and think later, because you can't tell what the agent understood of your request from that first round of thinking until it has actually produced something.
So even if it comes at the expense of long term maintainability - everyone should have this in their toolbox.
Having access to my local repository and my whole home folder is much easier than dealing with Claude or ChatGPT on the web. (Lots of manual markdown shuffling, passing in zipfiles of repositories, etc).
Claude Code and similar agents help me execute experiments, prototypes and full designs based on ideas that I have been refining in my head for years, but never had the time or resources to implement.
They also help get me past design paralysis driven by overthinking.
Perhaps the difference between acceleration and slop is the experience to know what to keep, what to throw away, and what to keep refining.
My favorite way to vibe code is by voice while in the hot tub. Rest AND focus AND build.
If you're like most people in this forum, there are people who stand to gain financially if you convince yourself that you don't need boundaries between work and rest. You may even believe that you stand to gain financially, and that this will be best for you in the long term.
Please, take some time to rest for a day or two and really think about what you want your boundaries to be. Write them down.
Sounds like someone hasn't yet worked multiple years with software engineering, or any job for that matter.
Your mind might trick you into believing it won't matter, but your body and mind NEED to be disconnected from work, 100%, at some point during your regular rhythms of life, otherwise you'll burn out much faster than the people you seemingly are trying to compete with.
Life has never been a sprint; it is a marathon, and if you spend all your young, inexperienced years treating it as a sprint, you won't have any energy left to complete the marathon.
Take care of yourself, your mind and your body.
I’m guessing you’re suggesting it’s ok to lose time if you’re away from your computer enjoying life, and I agree. I also don’t see the issue in finding ways to save time with work.
If you mean something different, please elaborate.
The one feature drawback of tailscale/tmux/termius is no file upload. And ergonomics, ability to view files/diffs easily, though that's subjective.
With e.g. tmux you'll piggyback on decades of SSH development.
Or Mosh, just like OP said. Mosh handles interruptions much better than SSH does
If you'd have to switch to a different tool to do your coding that's not vendor lock in.
For the vibe'y workflows, this would easily solve parallel long-running work without skipping permissions: schedule 10 different tasks and go for a run. Occasionally review what the hallucination machine wants to do, smash yes a few times, occasionally tell it not to be silly, have a nice run. Essentially, solving remote development, though perhaps not quite in the way people usually think of it.
> Limitations
> One remote session at a time: each Claude Code session supports one remote connection.
Hmm. Give it 1-12 months.
And people have been remoting into their machines for a while, so now having a pretty-UI-but-walled-garden variety doesn't ring that many alarm bells. If they manage to get it right, it wouldn't be that much different from running some CI stuff on your machine while you're making tea, or reviewing pull requests while lounging around.
Imagine if tomorrow they make a 10x smarter AI, but they say: you can only use it if you upload your source code to us, and you can’t see the source code anymore.
So you either stay on lower-end models or you give up and use the 10x model.
I only see one issue: it will be very difficult for them to “guard” the source code and not let you access it.
I wonder if they would take away your ability to prompt, maybe only letting you run agentically.
Claude Code only supports logging out the current session via /logout
There's no logout all sessions equivalent unlike the web UI.
jfc no
Tmux is annoying with a mobile keyboard, so I vibe coded a little mobile-friendly wrapper https://github.com/zakandrewking/pocketbot
So your hook -> HA -> push notification? And then you just tap to approve?
[0] https://www.youtube.com/watch?v=cczkDMmmrEE