
Hey I'm one of the Console devs who's been working on this feature for a while now. I'll be hanging around in the comments for a little while to try and answer any questions that people might have.

TL;DR of this announcement: We've added a new pseudoconsole feature to the Windows Console that will let people create "Terminal" applications on Windows very similarly to how they work on *nix. Terminals will be able to interact with the conpty using only a stream of characters, while commandline applications will be able to keep using the entire console API surface as they always have.



While it is nice that MS is focusing on console and command line now, it seems to me that you are mostly working on improving compatibility with legacy UNIXy stuff.

Do you have some vision or plans to go well beyond the classic UNIXy style of console and command line? I'm thinking along the lines of projects like DomTerm http://domterm.org/ which could have nice interactions with e.g. PowerShell.


PM for PowerShell here!

I haven't seen DomTerm before, but it looks pretty awesome. At a glance, it's basically a GUI-fied tmux hosted in Electron? It would be awesome to have in Windows, but wouldn't that just require that DomTerm add support for these ConPty APIs?

In any case, I'm more interested in your proposed interactions. Did you have anything cool in mind? Given that we ship PowerShell on Linux, we could theoretically do some stuff there (including within PowerShell on WSL) before it's hooked up to ConPty


I'm not the person you were asking, but this should interest you.

I've been working on a terminal emulator ( Extraterm http://extraterm.org/ ) with some novel features which would dovetail nicely with how PowerShell works. The first is the ability to send files to the terminal where they can be displayed as text, images, sound, etc or as an opaque download. Extraterm also adds a command `from` which lets you use previous terminal output or files, as input in a command pipeline. See http://extraterm.org/features.html "Reusing Command Output" for a demo. This opens up other, more interactive and iterative workflows. For example, you could show some initial data and then in later commands filter and refine it while visually checking the intermediate results.

What I would like to do sometime is integrate this idea with PowerShell and its approach of processing objects instead of "streams of bytes". It should then be possible to display a PowerShell list of objects directly in the terminal, and then reuse that list in a different command while preserving the "objectness" of the data. For example, you could show a list of user objects in one tab and then in another tab (possibly a different machine) grab that list and filter it the same way as any normal list of objects in PowerShell. You could also directly show tabular data in the terminal, let the user edit it "in place" in the terminal, and then use that edited data in a new command. It allows for more hybrid and interactive workflows in the terminal while still remaining centered around the command line.

Extraterm does these features using extra (custom) vt escape codes. ConPty should allow me to extend these features to Windows too.


Ooooh yeah, that sounds awesome! Going to share this with some people on our team (lots of folks love and use Hyper already, but we're always looking for new stuff to play with).

I would highly recommend you check out the excellent HistoryPx module[1]. Among (many) other things, it supports automatically saving the most recently emitted output to a magic `$__` variable. Theoretically, you could save a lot further back, but you may start to run into memory constraints (turns out .NET objects are a little heavier than text... ;) )

[1]: https://github.com/KirkMunro/HistoryPx


The ConPTY is not just about compatibility with *nix. It's about proper remoting of consoles. Unix got that right / Windows got that wrong, and now Windows will finally get it right too -- that it helps *nix compat seems like a happy accident (though obviously they want that too, so not so accidental).


Right. For a long time, the MS remoting philosophy was that applications should be remoted, not text streams. The stance goes all the way back to DCOM. That's why PowerShell remoting looks more like a local PowerShell executing commands on a remote machine than it looks like you just connecting to a PowerShell running elsewhere.

The difference is important, since in the traditional MS model, each program that wants to do the remote thing needs to essentially implement its own client-server setup, albeit with a massive amount of help from various runtimes. Named pipes and central authentication made this approach not quite as horrible as it sounds.

This new API is a departure from this model. It will make it possible to just remote via text streams. Perhaps that's uglier --- everyone knows in-band signaling is fragile. But long experience shows that just remoting the damn text streams is easily the more pragmatic option.


I mean, it could be a packet stream where some packets are control packets, and then the Console API could be used and remoted, but you'd need clients that understand it.

The fragility of in-band signaling in the TTY world is not *nix's fault or anything. VMS had that too, since VMS too had to deal with TTYs. The fact is that a) the system evolved from real, hardware TTYs of the 1970s, b) using a text stream with some in-band signaling doesn't even half suck -- mostly it rocks, and all you have to do for it to rock is get the right $TERM value and not output binary files to the tty.



The Windows Console was even more terminally dead. Terminals have their limitations, but honestly, I'll take a terminal, tty/pty, and terminfo, any day over the Windows Console.


You also have to sanitise untrusted data that you want to output.
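
For anyone wondering what that sanitising might look like, here is a minimal sketch, assuming the untrusted data is already decoded text and that dropping escape sequences outright (rather than rendering them visibly) is acceptable. The function name `sanitize_for_terminal` and the exact set of sequences matched are illustrative assumptions, not a complete filter for every terminal.

```python
import re

# Escape sequences to drop (an illustrative, not exhaustive, set):
#   CSI: ESC [ parameters intermediates final-byte
#   OSC: ESC ] ... terminated by BEL or ESC \
#   other two-character 7-bit escapes: ESC followed by @-Z or \-_
_ESCAPE_RE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"
    r"|\x1b[@-Z\\-_]"
)


def sanitize_for_terminal(text: str) -> str:
    """Remove escape sequences, then any leftover control characters.

    Newline and tab are kept; all other C0 controls, DEL, and C1 controls
    (0x80-0x9F) are dropped.
    """
    text = _ESCAPE_RE.sub("", text)
    return "".join(
        ch for ch in text
        if ch in "\n\t" or not (ord(ch) < 0x20 or 0x7F <= ord(ch) <= 0x9F)
    )
```

Something like `print(sanitize_for_terminal(filename))` would then be safe against a filename that tries to clear the screen or retitle the window.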


And remember to use less -R or else pipe to col(1) or whatever.


I want a pipe2(2) flag or an fcntl or something that lets me signal "the other end of this pipe understands ANSI escapes"
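
To make the gap concrete, here is a rough sketch of the guess programs usually make today, assuming a POSIX system. The helper names `wants_ansi` and `pipe_looks_like_terminal` are made up for illustration. A pipe never reports itself as a terminal, so the writer cannot tell whether the reader is `less -R` or a plain file.

```python
import os


def wants_ansi(fd: int) -> bool:
    """Heuristic: emit escapes only if fd is a terminal and TERM is not 'dumb'."""
    return os.isatty(fd) and os.environ.get("TERM", "dumb") != "dumb"


def pipe_looks_like_terminal() -> bool:
    """Show that the write end of a pipe always fails the check above."""
    read_fd, write_fd = os.pipe()
    try:
        return wants_ansi(write_fd)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

There is no way for the reading end to override this result, which is exactly the missing flag.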


First of all you have to define "ANSI escape".

Hints:

* You have the wrong country, on the wrong continent.

* It's not as simple in reality as your first answer will be. (-:


Yeah, I know


I've considered before that you could allow each end of a pipe to set name/value attributes, which can be read by the other end.


And now you're reinventing the Windows Console API... :)

I mean, both approaches have their pluses, but the API approach is only ever going to work well for remoting if it is standardized and interoperable. And the installed base of Unix termcap/terminfo programs is huge, so plain old text-with-in-band-controls is not going away anytime soon.


With things like SIGWINCH / ioctl(TIOCGWINSZ) we already know what kind of API it is, we are just haggling about the price.
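
For reference, a minimal sketch of that out-of-band API on a POSIX system. The function names `get_winsize` and `watch_resizes` are illustrative; the ioctl and signal are the real ones named above.

```python
import fcntl
import signal
import struct
import termios


def get_winsize(fd: int):
    """Return (rows, cols) for the terminal behind fd, or None if fd is not a terminal."""
    try:
        # struct winsize is four unsigned shorts: rows, cols, xpixel, ypixel.
        packed = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\0" * 8)
    except OSError:
        return None
    rows, cols, _xpixel, _ypixel = struct.unpack("HHHH", packed)
    return rows, cols


def watch_resizes(fd: int, callback) -> None:
    """Call callback((rows, cols)) each time SIGWINCH arrives (main thread only)."""
    signal.signal(
        signal.SIGWINCH,
        lambda signum, frame: callback(get_winsize(fd)),
    )
```

None of this travels in the character stream, which is the point: the size query is a side channel that a remoting protocol has to carry separately.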


It's a difference between making mv/move, cp/copy client-server aware and making the shell running these simple utilities client-server aware.


The Windows way would be to mount the remote filesystem and use local mv and cp. Which works fine, since Windows has excellent remote filesystem support. The Unix world only partially began to catch up to CIFS with the advent of NFSv4 and its lease infrastructure and it still isn't as seamless as the Windows stuff.


Have you given thought on how to solve the "unwanted console" problem? For example if you run a .py file (Python) under Windows then you get a console. That is fine for command line stuff, but beyond annoying if the file displays as a gui. So there are now two Python binaries - python.exe and pythonw.exe. The only difference is the latter ensures no console appears. Also good luck if the script printed a help message since often the console disappears before you even know that happened.

I presume many tools deal with this issue, and do it in different ways. Perhaps it is as simple as making the console itself only appear once there is any output, or a blocking read of input.


Technically, any executable that's compiled as a commandline application is going to get a console allocated for it, no matter what on Windows. I don't believe that's something we can fix retroactively unfortunately, that's just a part of how things have to be.

Now, I believe that python could have python.exe compiled as a win32 application, then call AllocConsole as soon as the script called print() or something. If the app was already running in a console, I believe (don't quote me) that AllocConsole won't allocate a new console for it, but if it doesn't yet have a console it'll spawn one.


I think the behaviour of cmd.exe is part of the problem here. When an interactive cmd.exe launches a console-subsystem app, it waits for the process to finish before showing the prompt again, but when it launches a GUI-subsystem app, cmd.exe writes the prompt again immediately, so even if the new process calls AttachConsole(ATTACH_PARENT_PROCESS) before it tries to write to the console, it will write over cmd.exe's prompt, which makes a poor user experience.

So, if someone wants to make a "dual-mode" app that works as a win32-subsystem app when launched from Explorer and a console-subsystem app when launched from a console, they have to choose between two bad options. They can make their app a console-subsystem app, which means a console will always briefly appear on screen when the app is started (no matter how quickly the app calls FreeConsole(),) or they can make their app a GUI-subsystem app (that opportunistically calls AttachConsole(),) which behaves sub-optimally in cmd.exe.

Maybe the solution is to add a flag (in the .manifest file?) that makes the console initially hidden for a console-subsystem app. That would prevent the brief appearance of a console window when launching a console-subsystem app from Explorer. Then there would be no need for pythonw.exe and python.exe could show the console window only after a message is printed.


Hmmm.

MSDN doesn't really say much about AllocConsole(), if that's the right function: https://docs.microsoft.com/en-us/windows/console/allocconsol...

If AllocConsole does behave in the way you say (which I understand it may{, not}), then the documentation sorely needs updating, because right now that bit of functionality (if it is there) is rather implicit.

It would be really cool to effectively deprecate the current console functionality and make it relatively straightforward to use the PTY API going forward, adding the bits people need to support use cases like this (allocating a console when/as it's needed).

Perhaps Visual Studio could introduce a new template for commandline applications that targets the PTY, and put "(Recommended)" next to that one? :D


> It would be really cool to effectively deprecate the current console functionality and make it relatively straightforward to use the PTY API going forward,

Please no!

The console subsystem in Windows is what terminal I/O evolved into, in the 1980s. Going back to terminal I/O is a massively retrograde step.

In any case, for addressing the problem at hand removing the console API is not the answer. rossy has explained what the problems actually are, which relate to the command interpreter waiting for processes to terminate and whether Win32 program image files are marked as GUI or console.

Personally I have always regarded that behaviour of Microsoft's command interpreter as a flaw, not a feature. I've always turned it off in JP Software command interpreters, which make it configurable. I didn't implement it in my command interpreter. However, I do appreciate that Microsoft's strong commitment to backwards compatibility hampers what can be done.

* https://jpsoft.com./help/waiting.htm


Hmmm. Good point. I can't say I think the PTY architecture is actually fun or anything, it's brain-numbing. Now I realize what I was asking I disagree too, hah!

IMO, the "real" answer - the viable one :) - is a redesigned console model that supersets/encapsulates what already exists in such a way that things remain backward compatible but incorporate features that allow for progressive enhancement. The question is in how to actually build that out; what to start with, what to do when and where, etc.

Probably the most important thing I'd start with is having each console be like an independent terminal server: make it so anything can watch all the stdio streams of everything attached to the PTY, make it possible to introspect the resulting terminal stream, etc. Then the TTY itself could be queried to get the character cell grid, query individual characters, etc. And also make it possible to change arbitrary PTY+TTY settings [out from under whatever's using a given console] as well.

By "anything can watch" I mean that there would be an actual "terminal server" somewhere, likely in a process that owned a bunch of ptys, and this would have an IPC API to do monitoring and so forth. Obviously security and permissions would need to be factored in.

But this would roughly take the best from the UNIX side (line disciplines are kind of cool, having the PTY architecture Just Work with RS232 is... I understand the history, but it makes for an interesting current-day status quo, IMO), and then combine this with the best of the Win32 side (reading what's on the screen!!!!! Yes please!!).

I'm not sure how to build something sane that could incorporate graphics though. ReGIS and Sixel are... no. 8-bit cleanness is an unfortunately-probable requirement for portability (at least UTF-8 can be shooed away in broken environments with a LC_ALL=C), but base64 encoding is also equally no. Referencing files (what Enlightenment's Terminology does) is a nononono. w3m-image's approach of taking over the X11 drawable associated with the terminal window is awesome in its hilarious terribleness. The best I can think of is a library that all image/UI operations would be delegated to, which would do some escape-sequence dances with the terminal (and whatever proxies were in the way?) to detect capabilities and either use out-of-channel communications (???) to send the image data over rapidly, or alternatively 8-bit or 7-bit encode the image data into the TTY stream (worst case scenario).

This is a bit hazy/sketchily laid out, but it's something I've been thinking about for several years. When I started out pondering all of this stuff circa 2006 I was most definitely all over the place :) I'm a bit better now but I still have a lot of unresolved ideas/things. I'm trying to build a from-scratch UX that provides a more flexible model to using terminals and browsing the web, but in a way that's backward-compatible and not "different to the point of being boring".

I've (very slowly...) come to understand that slow and progressive enhancement is the only viable path forward (that people will adopt), so I'm trying to understand the best way to do that.


Actually, extending it in a native Win32 way in line with the existing console model would have been quite a simple thing.

There are already CONIN and CONOUT objects. One simply needs to make the latter a synchronization object, the former already being waitable. The console screen buffer maintains a dirty rectangle of cells that have been dirtied by any output operation, be it high-level or low-level output. It becomes signalled when that dirty rectangle is not zero-sized, the cursor is moved, or the screen buffer is resized. And there's a new GetConsoleScreenBufferDirtyRect() call that atomically retrieves the current rectangle and resets it to zero.

With that, capturing console I/O is simply a matter of waiting for the console output buffer handle along with all of the other handles that one is waiting for, getting size/cursor info and clearing the dirty rectangle with GetConsoleCursorInfo()/GetConsoleScreenBufferInfo()/GetConsoleScreenBufferDirtyRect(), and reading the new cell values of the dirty rectangle with ReadConsoleOutput().
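
A toy model of that proposal, to show the shape of the idea. Everything here is hypothetical: `GetConsoleScreenBufferDirtyRect()` is the call proposed above and does not exist in the Console API, and the class and method names are invented for the sketch.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # left, top, right, bottom (inclusive)


class ScreenBuffer:
    """Character-cell grid that tracks the bounding box of modified cells."""

    def __init__(self, width: int, height: int) -> None:
        self.cells = [[" "] * width for _ in range(height)]
        self._dirty: Optional[Rect] = None

    def write_cell(self, x: int, y: int, ch: str) -> None:
        """Any output operation grows the dirty rectangle to cover (x, y)."""
        self.cells[y][x] = ch
        if self._dirty is None:
            self._dirty = (x, y, x, y)
        else:
            left, top, right, bottom = self._dirty
            self._dirty = (min(left, x), min(top, y), max(right, x), max(bottom, y))

    @property
    def signalled(self) -> bool:
        """Stand-in for the handle being signalled while the rectangle is non-empty."""
        return self._dirty is not None

    def take_dirty_rect(self) -> Optional[Rect]:
        """Retrieve the rectangle and reset it in one step (the proposed call)."""
        rect, self._dirty = self._dirty, None
        return rect

    def read_rect(self, rect: Rect) -> List[str]:
        """Rough analogue of reading the changed cells back out of the buffer."""
        left, top, right, bottom = rect
        return ["".join(row[left:right + 1]) for row in self.cells[top:bottom + 1]]
```

A capturing process would wait until `signalled`, call `take_dirty_rect()`, and read only those cells. The sketch ignores cursor moves and resizes, which the proposal would also signal.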


Why... can't you have I/O redirection to the null device and have it understand (and throw away) Console API messages? You might as well also add some pseudo-device to convert Console API messages into Unix-style text streams (with or without metadata converted to terminal control sequences), so that one could redirect console programs' output to files / pipes.

When the user's (programmer's) intent is to run a program with no console window, then that's what they should get: no console window.


If I remember correctly, this approach breaks if stdin/out is redirected.

There's also some funky stuff about explicit AllocConsole-allocated consoles; for example, when you attach a native debugger, all output from such console is automatically redirected to that debugger (i.e. the VS Output window or similar). This is very annoying in practice.


I'm unclear as to whether I'll be able to pipe binary data between a classic console application and a ConPTY application due to the VT translation and rendering components in ConHost.

So, for example if I was to pipe into 7z.exe, a classic console app, using something like "type mybinaryfile.bin | 7z.exe a -si c:\temp\myarchive.7z" from a ConPTY console, would the VT translation affect the piped stream?


Nope! We're only rendering the effect of any attached processes to the VT on the conpty side of things. On the client side (where cmd, type, 7z.exe are all running), they're going to keep working just the same as they always have. They're all running on the "slave" side of conpty, while the emitted VT is coming from the "master" side of the conpty.
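
For readers who haven't used ptys, a minimal sketch of the two sides on a POSIX system. This is the *nix analogue, not the ConPTY API itself, and the function name `roundtrip` is just for illustration.

```python
import os
import pty


def roundtrip(payload: bytes) -> bytes:
    """Write as a client app on the slave side, read as a terminal on the master side."""
    master_fd, slave_fd = pty.openpty()
    try:
        os.write(slave_fd, payload)       # what a commandline app prints
        return os.read(master_fd, 1024)   # what the terminal emulator receives
    finally:
        os.close(slave_fd)
        os.close(master_fd)
```

The terminal only ever sees the character stream coming out of the master side. Note that the line discipline may rewrite that stream on the way (for example, translating a newline into carriage return plus newline), so the bytes read are not guaranteed to equal the bytes written.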


Will there be the ability to disown, background, nohup processes and close the console, leaving the commands running?


Those sound like they'll be more like the responsibility of the terminal emulator, unfortunately.

Windows console applications aren't really able to live without being attached to a console. Now, a terminal might be able to implement those features...

actually now you've got me thinking. I'll play around with that idea. Definitely non-committal, but it might be possible in the future.


The sad thing is that this was all already implemented and done in Microsoft's second POSIX subsystem for Windows NT. It provided signals and process groups support for job control shells. It had a full control sequence interpreter for output and control sequence generator for extended keys. There were termcap/terminfo database records that people had added to other operating systems. It had a line discipline with "canonical" and "raw" modes. It had pseudo-terminals, with both BSD and System 5 access semantics.

* https://technet.microsoft.com/en-gb/library/bb497016.aspx

* https://technet.microsoft.com/en-gb/library/bb463219.aspx

* https://news.ycombinator.com/item?id=12866843

* http://jdebp.info./FGA/interix-terminal-type.html

And Microsoft owns it.


I wish people would stop pointing at the Linux subsystem and using it as an example of the stuff Windows in general could do. The Linux subsystem is a subsystem. In NT terminology, that makes it practically a container. It's disconnected from the rest of the system. It can interact with the win32 world in only a few constrained ways. You can't deliver SIGWINCH to a win32 process!


Actually, the second POSIX subsystem could deliver signals to Win32 processes. Who pointed at the Linux subsystem, by the way?

* https://news.ycombinator.com/item?id=11416392


I mean, the same is true on Unix. If an app is in the background and wants to write to (or read from) the terminal, it gets SIGTTOU (SIGTTIN) sent to it immediately. It might have to be impossible on Windows to ignore SIGTTOU/SIGTTIN/SIGTSTP, but I think that's just fine.

Mind you, ptys + tmux/similar is certainly very good, and if that's all we'll get that's still way way better than the current state of affairs, but if that's all that will be possible it should at least be possible to pause the console's output (and flow-control the console application).


This is awesome. Thank you for all y'all's hard work!


Can conhost still do anything that users of the new API can't?


Excellent question! There are a few limitations that we have to place on the ConPty to make it work quite right. Primarily, client apps running attached to the conpty will not be able to have separate viewport and buffer sizes. On *nix, the entire "console buffer" is just the size of the window, but on Windows, technically, the buffer is much larger than the window (as an example, when you open up a command prompt, there's a giant empty space at the bottom if you scroll down). Fortunately, we haven't come across any apps that _need_ the buffer to be a different size than the viewport, and it's a technically valid console configuration, so apps should have been able to support it before.

Input is also tricky - VT doesn't let you express input with as much fidelity as a console app might be expecting, though we're working on a solution for this :)


Now Windows just needs to ship with a decent pager. :-)

Programmatic access to scroll back is useful for a few things. For example, back when I was on Windows Phone, I wrote a compiler wrapper that would scroll back to the first error message.

It'd be nice for the POSIX terminal world to standardize on similar scrollback access. I know the zsh people would love it.


*nix consoles typically have two buffers - many full-screen terminal applications switch to the "alternate screen" on start and back to the principal screen on exit. That's why when you exit vim(1), you see the terminal state back as it was before you started it. Will ConPty support this?


Yep! It'll probably act a little different than you'd expect - the pseudoconsole itself will switch between the primary and alt buffers, leaving the terminal in the main buffer always. We'll "render" the contents of the alt buffer to the terminal, then when the client app switches back to the main buffer, conpty will re-render the contents of the main buffer to the terminal.
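
A toy model of the behaviour described here, in case it helps. All names are invented for the sketch, and the real conpty emits VT to repaint rather than copying lists. For context, full-screen apps on xterm-style terminals typically request the alternate screen with `ESC [ ? 1049 h` and leave it with `ESC [ ? 1049 l`.

```python
class PseudoConsoleModel:
    """The pseudoconsole owns both buffers; the terminal only ever shows one screen."""

    def __init__(self) -> None:
        self.buffers = {"main": [], "alt": []}
        self.active = "main"
        self.terminal_screen = []  # what the attached terminal currently displays

    def write_line(self, line: str) -> None:
        """Client output goes to whichever buffer is active, then gets rendered."""
        self.buffers[self.active].append(line)
        self._render()

    def use_alt_buffer(self) -> None:
        """Client app (e.g. an editor) switches to a fresh alternate buffer."""
        self.buffers["alt"] = []
        self.active = "alt"
        self._render()

    def use_main_buffer(self) -> None:
        """Client app exits; the main buffer is re-rendered to the terminal."""
        self.active = "main"
        self._render()

    def _render(self) -> None:
        # The terminal stays in its own main buffer and is simply repainted.
        self.terminal_screen = list(self.buffers[self.active])
```

The terminal never switches buffers itself in this model, which matches the description above and is also why it resembles how screen/tmux behave.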

It's not the most elegant solution, but we're still very early in on this project. We still have lots of improvements to be made to the infrastructure and translation, and even as I type this up, I'm thinking there's probably a better way of handling alt buffers.


That's pretty much how it works if you have a screen/tmux type program in your path as well, so that makes sense.


The Windows console subsystem itself has supported multiple output buffers from the start.

* https://docs.microsoft.com/en-gb/windows/console/console-scr...


I hate the alternate screen with a passion.


Will this replace Command Prompt or Powershell? Or is this more the back-end for these apps? I believe that we are heading for console overload (in a good way?!) with Command Prompt and Powershell installed on every computer and Debian/Ubuntu available on the MS store.


First off: command prompt (cmd.exe) and powershell are commandline client applications. They are shells just the same as bash is.

All commandline clients run attached to a console server, and that server is conhost.exe. Conhost is responsible not only for being the console server, but drawing the actual terminal window these apps run in. So when you launch cmd or powershell, what you're seeing is conhost.exe "hosting" these console applications.

What we're exposing here is the "master side" of conhost, which will let other applications act as Terminals, like how there is gnome-terminal, xterm, terminator, etc on linux.


Is there any chance cmd and powershell will improve from a user interface perspective? And perhaps become usable? Cmd has been garbage since its inception.


cmd is parked for pretty much everything except for major issues. It's a scary codebase that has a LOT of code that's dependent on it, and we can't really add any new features there without the possibility of breaking someone.

This feature is mostly focused on the other end of the communication, on being able to create new Terminal windows to run shells inside of them.


Any chance that you could just fork it? Create cmd2.exe, hell you could even open-source it.


Is the pay-off big enough? Who knows what garbage code they might want to hide in there, for both technical reasons (including security) or even legal ones.


Isn't that basically Powershell?


Is there a chance this will be connected to a functional shell interface. I get your point that cmd cannot be upgraded because of legacy issues and that is understandable and unfortunate, but windows needs a proper shell. This is obviously a great start for one side of the equation. But until there is a decent terminal app, windows will continue to be a nonstarter.


It sounds like you're asking for two different things here:

cmd.exe is a shell, and that's the guy that's parked.

conhost.exe is a terminal, and that's under active development, though it's slower than something like VsCode, because we can't just go adding features as we see fit, we have a LOT of back compat we still need to support.

Fortunately, conpty will allow for the creation of new terminal applications on Windows. If you're looking for a better shell experience on windows, I can point you to powershell or even [yori](http://www.malsmith.net/yori/), which looks pretty cool


I think u/paulie_a is just asking for a better shell. If cmd can't be made better, then make a new one.


That's Powershell.


Exactly, I understand there are different underlying concepts and systems to the front end and what it interacts with. It just seems incredible that windows is basically stuck with a windows 95 interface for a shell.


Powershell is open source, the 6.1 preview 4 is nice and fast, and you get real objects with keys rather than scraping for regexs all the time like bash.


Windows has a proper shell, it's called PowerShell and it is by far the most discoverable and consistent shell that exists.


It's also the most absurdly verbose shell that exists, which means it sucks as a shell even if it's a half-decent scripting language.


[flagged]


Powershell doesn't get any credit for the tab completion when the default behavior for it just makes the verbosity even more of a nuisance. Having tab completion scroll through all the possible completions one at a time doesn't save keystrokes in most cases, especially when there are dozens or hundreds of options that are really long so when you give up on tab completion and decide to type it out manually in full, you have to erase 10-20 characters. The bash-style completion behavior of completing any unambiguous characters then giving you a list of the possibilities would be even more useful for powershell than it is for bash. But Microsoft once again had to throw in gratuitous differences at the cost of usability.
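
To pin down what "bash-style" means here, a small sketch of the logic, assuming a flat list of candidate names and case-sensitive matching (PowerShell itself is case-insensitive). The function name `complete` is illustrative.

```python
import os
from typing import List, Tuple


def complete(prefix: str, candidates: List[str]) -> Tuple[str, List[str]]:
    """Return (text to leave on the command line, possibilities to list).

    Extends the input by every character that is unambiguous across all
    matches; if more than one match remains, they are returned for display.
    """
    matches = sorted(c for c in candidates if c.startswith(prefix))
    if not matches:
        return prefix, []
    common = os.path.commonprefix(matches)
    if len(matches) == 1:
        return common, []
    return common, matches
```

With candidates `Set-PSReadLineKeyHandler`, `Set-PSReadLineOption` and `Set-Location`, typing `Set-P` and pressing Tab would leave `Set-PSReadLine` on the line and list the two remaining options, so nothing ever has to be erased.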


I prefer the way bash did it too, so I choose to do the same on powershell:

http://stackoverflow.com/questions/39221953/can-i-make-power...

    Set-PSReadlineKeyHandler -Chord Tab -Function MenuComplete


That's subjective opinion. I much prefer the Windows way because I can either specify additional characters or keep hitting tab until what I want comes up. "Display all 1026 possibilities? (y or n)" certainly isn't an improvement.

Not to mention of course that it tab completes things bash can't and doesn't, and all the other reasons your initial comment was wrong.


> I much prefer the Windows way because I can either specify additional characters or keep hitting tab to until what I want comes up.

You can't specify additional characters until after you've erased all of the incorrect trailing characters that powershell filled in, and that's where powershell's completion method runs up the keystroke count unreasonably. Your comparison isn't valid if you ignore that aspect. It's also not very subjective at all. We're talking about objectively countable keystrokes.

I will concede that out of the box bash doesn't offer completion for anything other than file and command names, but it does include a programmable completion feature and many packages provide completion rules for their commands. It's up to the distro to determine whether to enable all of those completion rules by default or to stick with the more limited but predictable file-only completion behavior.

Aside from that, I can't see what "all the other reasons your initial comment was wrong" are; you appear to have only cited the existence of concise aliases for some commands as a refutation, and that obviously doesn't put powershell ahead of bash in any way, just lessens the severity of that downside.


> you appear to have only cited the existence of concise aliases for some commands as a refutation

You claim that PowerShell is overly verbose, I point out that it is only verbose if you intentionally make it verbose for readability, you proceed to pick nits and pretend you've done some elaborate study on keystroke counts or something, then resort to tired old Linux evangelism mainstays like blaming the distro.


PS doesn't just have concise aliases for some commands, it has a coherent system of concise aliases, based on the verb-noun convention.


I've attempted to use powershell numerous times, it's a piece of shit compared to zsh.


some other HN thread recommended cmder. I haven't touched cmd since

http://cmder.net/


Ah but see, now you're conflating two different things:

cmd.exe is a shell

conhost and cmder are terminals.

I believe cmder can come with git bash as well, which is also a shell.

The confusion comes from when you launch cmd, the window that appears by default is conhost, with cmd running attached to it. When you launch cmder, it's also running attached to cmd.


The back end. Basically people like ConEmu and Hyper and Terminus have been having to use various unreliable hacks for ages because there was no real console API. Now there is one.


Zadji I love your work. Which build is this going to land in? Do you know if any of the third party console apps are using the new API yet?


It's already available in current insider's builds, and will be landing officially in the next available Windows release some time later this year.

We're still working with ConEmu, VsCode, and OpenSSH to get them all over to the new API, with varying levels of adoption in the next few months likely.

Currently, WSL is also using the same functionality: if you open a WSL distro and run any Windows executables (eg `cmd.exe`), they'll run attached to a conpty. I use this as my daily driver.


You might want to talk to the Cygwin people to get their pty layer to use the new virtual console.


Maybe the guys at alacritty are interested in this as well.

https://github.com/jwilm/alacritty



That is an excellent suggestion, I'll get our PM to reach out :)


Out of curiosity, are you backporting onto Windows 10, or is Windows 10 the only release vehicle for everything in master? If the latter, how are you releasing piecemeal?


Pretty much any new features we release are only available on new Windows 10 releases. Unless there's a gigantic demand for a feature, or business impact, we're not really capable on our team of backporting anything.


What I meant is: are you building a Windows 11, or is Windows 10 the only release vehicle? I'm trying to imagine you pushing to a master branch to integrate into what becomes the insider builds, and then... either those become a future release of Windows 10, or you backport onto a Windows 10 branch. Neither option would be very appealing to me if it was up to me.


Windows 10 is the "last Windows ever". There's a pretty wide tree of branches under "winmain" that teams contribute to, and at each layer more branches are merged together, until they finally hit winmain, and then weekly they tag (more or less) a build from winmain as the Insider's build for the week.

It's actually a lot more elegant than you'd expect it to be for a project with as many developers as Windows has


Thanks. Got it. Having worked on Solaris, this sort of thing is very interesting to me.



