Why does AI tell you to use Terminal so much?

Posted by ingve 8 hours ago

Comments

Comment by littlecranky67 8 hours ago

Because it was not trained on screenshots or real rendered computer UIs, but on text. That is also why, in my experience, LLMs suck at describing click paths and are less helpful in UI development, as they never really "see" the result of the code as rendered HTML output.

Comment by Gigachad 8 hours ago

The terminal is also just the easier way to instruct someone to do things. "Just run this" is easier than a step by step guide through UIs which often change.

Comment by jadeopteryx 8 hours ago

When Windows 95 was introduced as a fully graphical operating system, every manual that came with Microsoft software instructed you to open the "Run" dialog and type your drive letter followed by "setup.exe" to install the software.

Comment by al_borland 1 hour ago

Getting people in the habit of running random commands they don’t understand in the terminal seems dangerous.

Comment by jasonfrost 2 hours ago

Like the past several decades of Linux problems: you find some Stack Overflow answer saying "just run this command". The terminal is eternal; UIs change.

Comment by ErroneousBosh 7 hours ago

I go into the shop, I walk up to the counter, and I say "Can I have a 1/2" drive T50 Torx bit please", and the person behind the counter says "Yes of course" and we go over to the small expensive tools cabinet and get one out.

I don't go into the shop and wander about until I find something that looks like it, then stand there pointing things going "THAT!" until someone figures out what I mean.

And now I have a T50 Torx bit that I can stick on a ratchet with a long extension and get the passenger seat out of the Range Rover so I can retrieve my daughter's favourite necklace from where it's gotten entangled with the wiring to the gearbox and suspension ECUs in a place where I can see it with a dentist's mirror but can't actually get a grabber onto to fish it out, worse luck.

So that's my afternoon sorted then. Because we're not just hacking on computers round here.

Comment by relaxing 6 hours ago

On the other hand, if you went and browsed the visual interface, you might discover you could purchase a 1/2” drive to 1/4” hex adapter, thereby opening up the possibility of using the entire set of impact driver bits you already own.

Comment by ErroneousBosh 3 hours ago

That doesn't solve the problem I have, because I already have a 1/4" ratchet and I don't have a 1/4" T50 bit.

Furthermore, a T50 bit with 1/4" drive would just snap instantly. If the bit didn't break, you'd twist the end off the extension bar.

I have a specific problem, which I already know how to solve, which has a specific solution, for which I need a specific component.

Comment by ixsploit 8 hours ago

The UI is also less stable than most CLI tools.

The enterprise tools I am currently working with often have outdated screenshots in their own documentation.

Comment by Jackson__ 8 hours ago

Alternative take: Because no designers are getting paid to move "rm" to "fileops rm" or otherwise between releases.

Comment by Myrmornis 8 hours ago

Getting them to take screenshots with playwright/puppeteer and look at them as part of their development iteration cycle works well.
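
A minimal sketch of that iteration step, assuming Playwright's Python API (installed via `pip install playwright` plus `playwright install chromium`); the function name and URL are illustrative:

```python
def capture(url: str, out_path: str) -> str:
    """Render a page in headless Chromium and save a screenshot
    the model can inspect on its next iteration."""
    # Imported lazily so this file still loads where Playwright
    # isn't installed yet.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        page.screenshot(path=out_path, full_page=True)
        browser.close()
    return out_path

# Usage (hypothetical local dev server):
# capture("http://localhost:3000", "after-change.png")
```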

Comment by littlecranky67 8 hours ago

For local inference, sure, but we simply lack the computing power to train them on all the images and HTML content that is available on the internet and in books. That will happen sometime in the future, though.

Comment by Myrmornis 1 hour ago

Ah right, sorry, you were making a much more interesting point than my reply! I read "UI development" and jumped to the conclusion that the point was just about inference-time modify-test cycles. Yes, agreed, if they trained on images, or even better (?) on (code, image) or (code-delta, image-delta) pairs, they would surely be better at UI development.

Comment by 6LLvveMx2koXfwn 7 hours ago

Yep, and also their 'click paths' (<- love that, by the way) are trained on READMEs, which are often out of date.

Comment by magnio 8 hours ago

I am not the most ardent supporter of LLMs, but the whole article reads like a critique of macOS idiosyncrasies and its aversion to the CLI and text formats. Why does macOS tell you to use the GUI so much?

Sure, the GUI is more accessible to the average user, but none of the tasks in the article are going to be done by the average user. And for the more technical users, having to navigate System Settings to find anything is like Dr. Sattler plunging her arms into a pile of dinosaur dung.

Comment by piva00 5 hours ago

Power users can use CLIs quite easily on macOS. The official documentation is geared towards non-power users, but information about most tasks a power user wants done in a CLI is available; it just requires the power-user skill of searching for it.

It's a good filter: keep it simple and easy for the vast majority of people, and have tools for the advanced ones to use.

Comment by coldtea 6 hours ago

>Why does macOS tell you to use the GUI so much?

Because its whole point is that it's a graphical OS.

If you just used the CLI Unix userland, you might as well use Linux.

Comment by shevy-java 7 hours ago

> macOS idiosyncrasies and its aversion to CLI

But people using OSX often also know the command line quite well - at least better than most Windows users. I saw this again and again in university.

Comment by kolinko 7 hours ago

It also helps that OSX has FreeBSD underneath (so, practically, Linux).

Comment by coldtea 6 hours ago

>FreeBSD underneath (so, practically, Linux).

BLASPHEMY

Comment by juancn 44 minutes ago

I would guess that it's because the second L in LLM is for Language, not Visual.

In any case, being technical myself I actually like that LLMs give me command line commands.

Those are unambiguous and composable, and it's much easier to check what they do; but for muggles, yeah, they could be dangerous.

Comment by sunaookami 8 hours ago

The main point of the article is not what the title claims but the fact that ChatGPT sucks big time for troubleshooting since even the terminal commands are nonsense.

Comment by properbrew 7 hours ago

This is something that interests me a lot. In my own personal experience, ChatGPT is awesome at troubleshooting, it's given me terminal commands that are perfect and use the exact flags needed to identify and then fix the problem.

Why is there this massive disparity in experience? Is it the automatic routing that ChatGPT auto is doing? Does it just so happen that I've hit all the "common" issues (one was flashing an ESP32 to play around with WiFi motion detection - https://github.com/francescopace/espectre) but even then, I just don't get this "ChatGPT is shit" output that even the author is seeing.

Comment by kolinko 7 hours ago

The author uses a free version of ChatGPT. They fail to mention that anywhere, but you can see it from the screenshots.

And they don’t provide the prompt, so you can’t really verify if a proper model has the same issues.

Comment by ChrisMarshallNY 7 hours ago

His guess is as good as mine, as to “why,” but the results can be terrible.

As noted, terminal commands can be ridiculously powerful, and can result in messy states.

The last time I asked an LLM for help was when I wanted to move an automounted disk image from the internal disk to an external one. If you do that, when the mount occurs is important.

It gave me a bunch of really crazy (and ineffective) instructions to create login items with timed bash commands, etc. To be fair, I did try to give it the benefit of the doubt, but each time its advice pooched, it would give even worse workarounds.

One of the insidious things was that it never instructed me to revert the previous attempt, like most online instruction posts do. This resulted in one attempt colliding with the previous ineffective one, when I neglected to do so on my own judgment.

Eventually, I decided the fox wasn’t worth the chase, and just left the image on the startup disk. It wasn’t that big, anyway. I made sure to remove all the litter from the LLM debacle.

Taught me a lesson.

> "A man who carries a cat by the tail learns something he can learn in no other way."

-Mark Twain

Comment by xnorswap 8 hours ago

The main reason I wouldn't tell someone command line options is that I'd be worried that either I'd make a mistake and mix something up, or the person I'm helping would.

UIs have better visual feedback for "Am I about to do the right thing?".

But with the AI, there's a good chance it has it correct, and a good chance it'll just be copy/pasted or even run directly. So the risk is reduced.

Comment by modo_mario 6 hours ago

Conversely, the main reason I wouldn't tell someone to use the UI is the reasons you listed.

At least, not if I'm unable to follow along step by step and point at their screen and the relative position of buttons. Even more so if the person I'm talking to is too clueless to provide and interpret context.

Comment by mgaunard 8 hours ago

The real question is why wouldn't you prefer the terminal way over silly GUIs?

Comment by OJFord 7 hours ago

Author seems to be a painter/journalist/art journalist—so the answer to that is the same as to the OP question: so far it's primarily been built out for programming, by and for software engineers, where it seems completely natural.

Comment by OliverM 7 hours ago

The author is an accomplished software engineer.

Comment by stavros 7 hours ago

Then why do they call it "Terminal" (ie the macOS app) instead of "the terminal" (the concept)? I was baffled.

Comment by matsemann 6 hours ago

It's an Apple-user thing. It's not "my phone", it's "my iPhone". It's not my laptop, it's my MacBook. It's not my headphones, it's my AirPods. It's not my smart watch, it's my Apple Watch Ultra 3 Sapphire Gold Plated. It's not my terminal, it's the Terminal, the one to rule them all. Only plebs use non-branded terminals!

Comment by kolinko 7 hours ago

and he’s using a free version of chatgpt? and not publishing source prompts - so there is no way to replicate?

Comment by llarsson 7 hours ago

Because it's been trained on decades of StackOverflow and forum posts. And because while some command line tools go in and out of fashion, quite a lot are very stable, so their use will show up all the time in the training material.

Since it's all statistics under the LLM hood, both of those cause proven CLI tools to have strong signals as being the right answer.

Comment by ErroneousBosh 7 hours ago

If you asked an AI about joinery, it'd tell you to use a measuring tape, a pencil, a saw, a level, and a hammer a lot.

I wonder why?

Maybe because that's where the basic tools live.

Comment by dkdbejwi383 8 hours ago

Because it’s text.

Comment by hinkley 8 hours ago

Wait, hold on.

Are you trying to tell me that a Large LANGUAGE Model is better at text than at pictures? What are you going to tell me next? That the sidewalk is hot on a sunny day?

Comment by fyredge 8 hours ago

TFA is short and only shows a single example, but it illuminated something for me. LLMs are a misnomer. These are Large Text Models, or better yet, Large Token Models. The appearance of Language is a result of embedding words or parts of words into Tokens, then identifying the relations between Tokens via Machine Learning.

This further solidifies my view that LLMs will not achieve AGI, by refuting the oft-repeated pop-sci argument that human brains predict the next word in a sentence just like LLMs.

Comment by tinco 7 hours ago

Why couldn't a machine that identifies relations between tokens be AGI? You're imposing an arbitrary constraint. It is either generally intelligent or it's not; whether it uses tokens or anything else is irrelevant.

Also, languages made up of tokens are still languages, in fact most academics would argue all languages are made up of tokens.

Anyway, it's not LLMs that achieve AGI; it's systems built around LLMs that achieved AGI quite some time ago.

Comment by Hard_Space 8 hours ago

This problem is chronic with GPT[N] dealing with a Windows environment. I have to constantly remind it to prefer the GUI option, though nothing really works. I don't know if agents make use of screenshots the way older automation routines have always done, but increasing use of that kind of data would help LLMs progress beyond CLI-addiction.

Comment by ZiiS 8 hours ago

Why does software with a text interface tell you to use a text interface?

Comment by randomtools 8 hours ago

Spending time debugging via a UI, especially since AI is growing exponentially, is quite a waste of time.

Comment by dewey 8 hours ago

This seems like one of those "why does my bad prompt give me bad results" kinds of complaints. Just tell the LLM something like "in your reply, prefer the macOS GUI and write the instructions for a non-technical user, similar to the Apple help pages" and it'll look much different.

It won't be as fast to go through them as just pasting some commands, but if that's what the user prefers…

Comment by Yie1cho 6 hours ago

Because real men use terminal, that's why. Really.

Comment by Markoff 8 hours ago

because the GUI can vary over time (or across Linux distros) much more than terminal commands

edit: ChatGPT recently talked me through a Linux Mint installation on two old laptops I have at home, where Mint didn't detect the existing Windows installation (which I wanted to keep). I don't think anyone on Reddit or elsewhere would have been as fast/patient as ChatGPT. It was mostly done with terminal commands; one computer was easy, the other already had 4 partitions and FAT32, so it took longer.

Comment by kolinko 7 hours ago

The blog author uses the free version of ChatGPT (not logged in on the screenshot) - so he's really talking about previous-generation models.

It would be nice if this had been mentioned transparently at the beginning of the article.

I mean - new models also tell you to use the terminal, but the quality is incomparable to what the author is using.

Comment by dude250711 7 hours ago

It wants us to travel back to the 1980s. The simpler times.

Comment by shevy-java 7 hours ago

My initial reaction was "because AI is so stupid".

However, I use the terminal all the time. It is the primary user interface for me to get computers to do what I want; in the most basic sense, I simply invoke various commands from the command line, often delegating to self-written Ruby scripts. For instance, "delem" is my command-line alias for delete_empty_files (kept in delete_empty_files.rb). I have tons of similar "actions"; oldschool UNIX people may use command-line flags for this. I also support command-line flags, of course, but my brain works best when I keep everything super-simple at all times. So I actually do not disagree with AI here; the terminal is efficient. I just don't need AI to tell me that - I knew that already. So AI may still be stupid. It's like a young, over-eager kid, but without a real ability to "learn".

Comment by dr_dshiv 8 hours ago

These days, command line offers way better usability and accessibility (because Claude Code can do it). Whenever I have to use a GUI I’m like uuughgh…

Am I the only one who thinks like this?

Comment by kleiba 8 hours ago

> Few understand the commands used...

By "few" you mean "few Gen-Zs?"

Comment by waffleiron 8 hours ago

It seems from the article that the LLM also doesn't understand the commands it uses, as they do things that are not what is described.

Comment by saagarjha 8 hours ago

They’re running laps around you.

Comment by Grimblewald 7 hours ago

Remains to be seen on my end. Nothing noteworthy about Gen Z in terms of tech/STEM. Not particularly smart or dumb, they just kind of exist. On average, they have mediocre tech skills but generally higher-than-average emotional intelligence. There are outliers of course, but "running laps" in a tech scene is a bit of a stretch.

Comment by saagarjha 7 hours ago

Noteworthy things are not done by average people.

Comment by theshrike79 8 hours ago

Because terminal commands are the only way to automate actions on an OS. Clicking through UIs is not it.

Comment by karel-3d 8 hours ago

Read the actual article. The AI recommended him five things that are all more easily done via the UI, and all of them accomplish something different from what they claim anyway.

Comment by mrkeen 7 hours ago

I read the article. Parent's comment about automation is spot on. TFA didn't describe any GUI interaction in detail, or even suggest that there was a way to achieve these goals without needing a meatbag to physically interact with the computer (and capture its output in /dev/meatbrain).

But at least TFA wrote up the criticism in text, even transcribing some of the screenshots.

Comment by theshrike79 7 hours ago

More easily, maybe, but the CLI command is deterministic and works as long as the user can successfully paste it into a terminal and run it.

For UI you need to figure out different locales, OS versions, etc.

Comment by 1718627440 8 hours ago

theshrike79 was talking about automating. Automating via a UI requires you to have a program which can simulate click events and a display server which allows this. It's also really brittle, because you are not actually depending on the action you want to invoke, but on the UI location that action is exposed at.

Automating terminal commands is easy, because that is how the OS works anyway. All programs invoke each other by issuing (arrays of) strings to the OS and telling it to exec them.
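
That argv-array contract is easy to see from any scripting language; here is a minimal sketch using Python's standard `subprocess` module (the `echo` command is just an example):

```python
import subprocess

# The OS-level contract is just an array of strings: the program
# name followed by its arguments. No display server, no widget
# positions - the same call works headless, in cron, or in CI.
result = subprocess.run(
    ["echo", "hello from the argv array"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```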

Comment by antonvs 8 hours ago

I’ll paraphrase a user named Bear from Usenet a few decades back: if all you know how to do is point at what you want, you’re operating at the level of a preverbal child.