Ask HN: Has anyone here worked on the Windows kernel?

jsolson · on July 13, 2022

I think I can count the number of kernel changes I've submitted on one hand, but I work on core virtualization that involves a lot of pretending to be hardware and (these days) a lot of poking directly at hardware registers.

I would say James Mickens sums things up nicely in "The Night Watch[0]." For example, you mention debugging with logs and metrics -- this snippet came to mind:

     “Yeah, that sounds bad. Have you checked the log files for
     errors?” I said, “Indeed, I would do that if I hadn’t broken every
     component that a logging system needs to log data. I have a
     network file system, and I have broken the network, and I have
     broken the file system, and my machines crash when I make
     eye contact with them. I HAVE NO TOOLS BECAUSE I’VE
     DESTROYED MY TOOLS WITH MY TOOLS. My only logging
     option is to hire monks to transcribe the subjective experience
     of watching my machines die as I weep tears of blood.”

Mind you, I absolutely _love_ working on low-level stuff, and I wouldn't trade the time I get to spend actually doing that for anything. That said, the complexity of modern operating systems, CPU architectures, interconnects, and peripherals creates opportunities for frustration and confusion that honor no bounds of reasonability or decency.

[0]: https://www.usenix.org/system/files/1311_05-08_mickens.pdf

EvanAnderson · on July 13, 2022

As an aside: James Mickens is a treasure. If anybody reading this hasn't read any of his articles or watched any of his talks before, stop now and do it. You won't be sorry.

https://mickens.seas.harvard.edu/wisdom-james-mickens

sodapopcan · on July 13, 2022

I did as you suggested. I was not sorry. Thank you!

whimsicalism · on July 13, 2022

He is also an incredible professor.

retcon · on July 13, 2022

Reading this man's homepage, he comes across as an insufferable egotist. I couldn't find any trace of didactic value among the self-aggrandizing rhetoric that could not be conveyed with infinitely greater humility, persuasion, and concision.

Edit: my reference source is decades of being quite intolerable myself. Takes one to know one. And in my experience it takes incredible good fortune and more decades to repent.

vlmutolo · on July 13, 2022

The page is supposed to read as a joke. That's his schtick. I think his writing is pretty funny.

teekert · on July 13, 2022

Haha, I love this humor, it's the kind I write in a frenzy of enthusiasm and inspiration often fueled by a sudden realization of the crazy state some things are in.

I also have gotten a lot of flak for it. But I see it as a way to convey that new feeling of wonder and awe a child has upon entering a new world. Challenge the assumptions, make up some nice words, pretend very hard that your academic title is an actual thing that gives you some blessed superpowers (as is the way that people often treat it!). I love it. It's truth wrapped in humor to soften the blow to the ones that think themselves overly important. The "shocked ones" are the ones that I try to avoid in life anyway; nice discussions begin at the edge of your comfort zone.

x0n · on July 13, 2022

Dude. How can you read "I’ve been a legendary hacker for 98% of my life, but there was a brief period when I did not possess the sum totality of human knowledge." and not realize this is self-parody?

sodapopcan · on July 13, 2022

I think even with the edit that this comment is also satire.

RagnarD · on July 13, 2022

The page is extremely obviously full of satire.

doctor_eval · on July 13, 2022

Is that you, Mickens?

whimsicalism · on July 13, 2022

Having interacted with Mickens quite a bit, he is far from intolerable.

Veen · on July 13, 2022

https://en.m.wikipedia.org/wiki/Irony

saghm · on July 13, 2022

> I HAVE NO TOOLS BECAUSE I’VE DESTROYED MY TOOLS WITH MY TOOLS

This might be my all-time favorite quote that I never get to use in relevant situations because nobody is around who would get the reference. I think of it almost every time I hear the word "tool".

dtgriscom · on July 13, 2022

> I think I can count the number of kernel changes I've submitted on one hand

I know I can; I don't even need any of my fingers.

cperciva · on July 13, 2022

> poking directly at hardware registers

Such luxury! I just spent a couple weeks getting FreeBSD booting in the Firecracker VM and most of my debugging was performed by inserting hlt instructions into the FreeBSD kernel and looking at whether the virtual CPU halted or hit a triple fault.
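
For illustration, the hlt trick described above amounts to a binary search with a one-bit observable: each boot attempt tells you only "halted" or "triple-faulted". A minimal Python sketch of that search, assuming the boot path can be treated as a numbered sequence of steps; `find_faulting_step`, `make_simulated_boot`, and the step numbering are hypothetical and stand in for manually rebuilding and rebooting the kernel, not for any FreeBSD or Firecracker code:

```python
from typing import Callable, List, Tuple


def find_faulting_step(num_steps: int, halts_cleanly: Callable[[int], bool]) -> int:
    """Return the index of the first boot step that faults.

    halts_cleanly(k) models one boot attempt with a hlt placed just before
    step k: True means the vCPU reached the hlt (steps 0..k-1 ran fine),
    False means it triple-faulted before getting there.
    Returns num_steps if no step faults.
    """
    lo, hi = 0, num_steps  # a hlt before step 0 is always reached
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the range always shrinks
        if halts_cleanly(mid):
            lo = mid       # everything before mid is fine
        else:
            hi = mid - 1   # the fault happens before mid
    return lo


def make_simulated_boot(bad_step: int) -> Tuple[Callable[[int], bool], List[int]]:
    """Hypothetical stand-in for real boot attempts; records each attempt."""
    attempts: List[int] = []

    def halts_cleanly(k: int) -> bool:
        attempts.append(k)
        return k <= bad_step

    return halts_cleanly, attempts


boot, attempts = make_simulated_boot(bad_step=37)
print(find_faulting_step(100, boot))  # -> 37
print(len(attempts))                  # number of simulated reboots needed
```

Each probe costs a rebuild and a reboot, which is why halving the search range matters more here than it would with a real debugger attached.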

jsolson · on July 13, 2022

I take it the Firecracker folks haven't built out support for KVM_GUESTDBG_SINGLESTEP yet :)

jaclaz · on July 13, 2022

I would add, ... kids today ...

JFYI:

https://tinyapps.org/blog/200702250700_why_in_my_day.html

aasasd · on July 13, 2022

> whether virtual CPU halted or hit a triple fault

Ah, I see the gentleman works directly in binary.

darepublic · on July 13, 2022

> I HAVE NO TOOLS BECAUSE I'VE DESTROYED MY TOOLS WITH MY TOOLS

Only comparable experience I can think of is "breaking" the terminal via bash_profile and then not being able to fix bash_profile via the terminal. Or locking yourself out of a web server via security configuration update.

oblio · on July 13, 2022

Raise your hand if you broke the SSHD server config and then restarted it!

That: uh-oh! moment.

zrobotics · on July 13, 2022

Definitely didn't do that last week, no way. Thankfully, the server I didn't push the wrong credentials to before restarting wasn't the new instance of the test server.

doubled112 · on July 13, 2022

I've definitely never done that via Ansible on a bunch of machines at once, nope nope nope.
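
One way to make the uh-oh moment less likely is to validate the config before touching the running daemon. A minimal Python sketch, assuming OpenSSH's `sshd -t` test mode is available on the host; `reload_if_valid` is a hypothetical helper, and the service name passed to `systemctl` is an assumption that varies by distribution:

```python
import subprocess
from typing import Callable, List, Tuple


def reload_if_valid(
    validate_cmd: List[str],
    reload_cmd: List[str],
    run: Callable = subprocess.run,
) -> Tuple[bool, str]:
    """Run reload_cmd only if validate_cmd exits with status 0.

    Returns (True, "") after a reload, or (False, validator stderr)
    when validation fails and nothing was reloaded.
    """
    check = run(validate_cmd, capture_output=True, text=True)
    if check.returncode != 0:
        return False, check.stderr.strip()
    run(reload_cmd, check=True)  # raises CalledProcessError if the reload fails
    return True, ""


# Example call (not executed here; needs root and a real sshd):
# ok, error = reload_if_valid(["sshd", "-t"], ["systemctl", "reload", "sshd"])
```

This does not protect against a config that is syntactically valid but locks you out, so keeping a second session open, as mentioned elsewhere in the thread, is still the real safety net.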

philistine · on July 13, 2022

I’m an enormous idiot who’s done that recently. If not for the fact I had another window open, I don’t know what I would have done.

thowaway_pandat · on July 13, 2022

chroot from an OS whose terminal is not broken and a shell other than bash

philistine · on July 13, 2022

zsh was the one I broke, so bash would have been preferable.

tomrod · on July 13, 2022

This is breathtakingly brilliant, and I am grateful to now have the wisdomful humor of James Mickens in my life. Thanks!

CodeWriter23 · on July 13, 2022

So what you do in that case is add a monochrome adapter to the PC and write to the screen memory. Or send a [greatly abbreviated] trace out a serial port.
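
To illustrate the serial-port option: a polled 16550-style UART needs only a status read and a data write per byte, which is why it keeps working when almost nothing else does. The sketch below models that in Python with injected `inb`/`outb` callables (hypothetical stand-ins for real port I/O, which Python cannot do directly). 0x3F8 is the conventional COM1 base on PCs; `serial_trace`, `FakeUart`, and the two-byte trace code are invented for the example:

```python
COM1_BASE = 0x3F8       # conventional I/O base of the first PC serial port
LSR_OFFSET = 5          # 16550 Line Status Register
LSR_THR_EMPTY = 0x20    # bit 5: transmit holding register can accept a byte


def serial_trace(code: bytes, inb, outb, base: int = COM1_BASE) -> None:
    """Send a short trace code one byte at a time using polled I/O."""
    for byte in code:
        # Busy-wait until the UART says it can take another byte.
        while not (inb(base + LSR_OFFSET) & LSR_THR_EMPTY):
            pass
        outb(base, byte)


class FakeUart:
    """Test double: reports 'busy' on every other status poll."""

    def __init__(self) -> None:
        self.sent = bytearray()
        self._polls = 0

    def inb(self, port: int) -> int:
        self._polls += 1
        return LSR_THR_EMPTY if self._polls % 2 == 0 else 0

    def outb(self, port: int, value: int) -> None:
        self.sent.append(value)


uart = FakeUart()
serial_trace(b"A3", uart.inb, uart.outb)  # "A3" = hypothetical checkpoint code
print(bytes(uart.sent))  # -> b'A3'
```

Keeping the trace to a byte or two per checkpoint is the "greatly abbreviated" part: at typical serial speeds, a full printf in a hot path can change the timing enough to hide the bug.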

jsolson · on July 13, 2022

In at least one instance of breaking my tools with my tools, the machines in question were in a data-center. Ordinarily one might be able to grab debugging output over an NC-SI link, but that only works if the breakage doesn't also unexpectedly crash your NIC firmware...

CodeWriter23 · on July 14, 2022

If you can’t touch it, you might not be able to debug it. Kudos though for testing^H^H^H^H^H^H^Hdebugging in production.

CodeWriter23 · on July 14, 2022

Case in point. Had a client in the late 80’s with an X.25 connection between their office and a Mainframe across town. It was accumulating hundreds of CRC errors per minute. Couldn’t fix remotely so I flew out there. Had a scope hooked up on both sides.

We loaded the connection in varying patterns for about an hour. ZERO CRC errors. Scratching our heads, we decided to punt. Then suddenly the CRC error count started climbing on my scope. I shout into the speakerphone, excitedly, what did you all just do? They’re like “nothing but unplug the scope from the patch panel”. “Please plug it back in”. CRC error count stopped.

Diagnosis: faulty patch panel.

jsolson · on July 14, 2022

When your job is to build infrastructure, "in a data-center" does not necessarily mean it's a production server :)

TimSchumann · on July 14, 2022

Thank you for posting that. Epic read, amazing sense of humor, and I feel like I owe James a beer or three after reading it.

tarun_anand · on July 13, 2022

I worked in the kernel team many moons ago helping improve the performance and then built the .NET CLR later. Also worked on an NT filesystem called Cairo. So I guess I have experienced both kernel dev and backend framework dev.

Kernel code is amazing, especially the parts written by the early members like DaveC, MarkZ and others.

For me the biggest part was working with a group of extremely smart people who were very nearly the best programmers in the world.

I really miss that outside of Microsoft. I would imagine that you can get the same experience if you worked with the Linux kernel dev team or some of the other few places in the world like FAMGA where you can work.

My suggestion is to go for it. After leaving MS I found peace by working with the open source community and open source software.

Hope that helps, happy to discuss further.

beefman · on July 13, 2022

I would love to hear any Cairo stories you may be able to share!

virgulino · on July 13, 2022

Me too!

KerrAvon · on July 12, 2022

If MS or any FAANG wants to hire you for a kernel dev position, go for it; they clearly see a lot of potential in you that maybe you don't see yourself. I want to point out a couple of things that you should be prepared for:

> 2. Debugging exclusively via metrics and logs, since I can't just attach a debugger to a running server.

You often can't do this in kernel/OS development work either. Serial printf logging is often required. It can be a real "my tools to debug my tools are broken" slog.

> 3. Designing systems in general. Some people love the challenge of distributed transactions, eventual consistency and all that jazz, but it just rubs my brain the wrong way. I'm not interested at all in that problem space [1].

I'm not sure I understand this issue. I mean, kernels have subsystems for doing stuff; you might need to design one someday? But it won't be dull-as-dishwater web technology stacks, it'll be you writing data structures directly in C/C++ or Rust if MS goes there.

ccrush · on July 13, 2022

Can you no longer use windbg to attach to a running kernel on a remote box, and debug the remote box from your development machine? I believe that's how kernel debugging is generally done.

amluto · on July 13, 2022

(I’m a Linux kernel dev)

Tools like windbg probably work great until you are debugging the code that runs when windbg tries to take over. Or you’re debugging something sensitive to interrupts and it’s literally impossible to keep up with a tool like windbg. Or you are debugging something that overwrites windbg in memory because you have a corrupt pointer. Or you triple-fault the machine and it reboots with so much prejudice that windbg doesn’t have a chance. Or you’re poking at hardware that the debugger can’t usefully interact with. Etc.

Basically, in the kernel, a lot of the things that a working kernel does to hold your hand aren’t available. So there’s a lot of time spent just thinking or adding logging statements or otherwise using low tech tools. (Or using high tech instrumentation or sanitizers!)

Kernel development is great :)

muricula · on July 13, 2022

I have debugged the kernels of Windows, Linux and now iOS using internal tools. Windbg generally works, and is far superior to the equivalent tools for other kernels. Like vim it has a baroque syntax and a steep learning curve, but it also has a gui with big buttons to compensate. Yeah, you can get yourself in bad situations, especially in early boot, but I miss it.

amluto · on July 14, 2022

I confess that it’s been years since I used WinDbg. It was quite bare bones back then.

These days I mostly use gdb (which is incredibly buggy for such things) attached to QEMU’s gdbserver. It works except when it doesn’t.

DaiPlusPlus · on July 13, 2022

I thought everyone doing OS/kernel dev would be running code in a VM they can pause, snapshot, and single-step all without a guest debugger?

xmodem · on July 13, 2022

Virtual machines are used extensively by kernel developers but are not an adequate substitute for real hardware in a lot of situations. After all, one of a kernel's jobs is to abstract away differences in underlying hardware.

pjmlp · on July 13, 2022

Windows kernel can generate ETW profiling data, no need to manually add logging information after the fact.

https://docs.microsoft.com/en-us/windows-hardware/drivers/de...

surajrmal · on July 13, 2022

Having worked on both windows and Linux kernel development, I can easily say that Windows has superior debugging facilities. It's pretty rare to make the debugger fall over. Working in the kernel is certainly not without its own challenges, but I feel like Linux makes it harder than it needs to be.

starfleet_bop · on July 13, 2022

Does x86 have a JTAG like / hardware debugger?

vbitz · on July 13, 2022

You can get something like JTAG from Intel with a NDA.

https://designintools.intel.com/Silicon_View_Technology_Clos...

Most of the time you can do kernel dev with a serial port and printf though.

ace2358 · on July 13, 2022

That’s probably true, but sometimes you have to go one level deeper right? What happens when your debugger has crashed?

Arainach · on July 13, 2022

In 8 years on the Windows team I can't recall the debugger ever crashing. If it does, you open the debugger again and reconnect to the target machine.

hyperrail · on July 13, 2022

Heh, I wish that had ever been the case for me. Sometimes it feels like WinDbg freezes every other command. Plus, getting it to reconnect to a kd after it freezes and I kill it was such an exercise in frustration that if the target machine were my own repro box instead of a stress break remote, I would have just hard-reset the machine or (if it were a VM) restored it from snapshot.

maybekerneldev · on July 13, 2022

> > 3. Designing systems in general. Some people love the challenge of distributed transactions, eventual consistency and all that jazz, but it just rubs my brain the wrong way. I'm not interested at all in that problem space [1].

>I'm not sure I understand this issue. I mean, kernels have subsystems for doing stuff; you might need to design one someday? But it won't be dull-as-dishwater web technology stacks, it'll be you writing data structures directly in C/C++ or Rust if MS goes there.

Ah, I meant designing distributed systems. Like, figuring out the different services, storage requirements, databases, then planning out the infrastructure. It's really not my cup of tea. I do enjoy designing systems/subsystems/components as long as they're within a single host :)

shultays · on July 13, 2022

> Serial printf logging is often required

This reminded me of using "printk" in an incorrect place in an operating system course assignment. It was actually printk itself that was hanging the kernel!

Ombudsman · on July 13, 2022

Bad stuff:

1. There will be a lot of infrastructure complexity in the kernel, just prepare yourself for that. Even worse: bugs! You'll be fixing a lot of bugs, or looking at a lot of bugs, and most of these bugs are from other teams who are interacting with your component! Just order a copy of Windows Internals and get yourself familiar with how things work.

2. Old ass engineering systems. Just as the interviewers said, you will spend a lot of your time waiting for Windows builds.

Good stuff:

1. Work life balance is amazing actually. Most of the time there's very little pressure for you to get work done; as long as you're doing something, no one really bothers you.

2. Since you said you hate designing systems, good news, everything has been designed for you! Your job will mostly be implementing new features for a component.

3. Windbg is actually great and I will die on this hill. You might have to print some debug logs but you won't be looking at metrics because there's whole teams dedicated to doing that stuff. There's also tools which quickly spin up VMs for you to do some live kernel debugging.

Meh stuff:

1. The pay could be better. That might change soon though.

gavinray · on July 13, 2022

> Windbg is actually great and I will die on this hill.

Someone convinced me to try Windbg Preview about 2 years ago and I was in awe. I too will now die on the hill that it's an incredible cross-language debugger capable of a lot more than that.

asveikau · on July 13, 2022

> Windbg is actually great and I will die on this hill

Came here to say this. (Actually kd, but lots of overlap.)

criddell · on July 13, 2022

How did you learn it? I use it occasionally to debug memory dumps after an application crash but I know that WinDbg can do so much more than show me the stack just as the application crashed.

asveikau · on July 13, 2022

I learned it at Microsoft. When debugging issues in Windows, it was pretty common to leave a remote server open and pass it around among colleagues to find the right person who knows what code is failing. You could see what the remote party was typing and doing. It was a pretty social way to learn your way around.

matthewfcarlson · on July 13, 2022

This comment matches my experience. My new company tends to have worse work-life balance but higher pay. It’s a trade-off. I think the best play is to join Microsoft and leave after the nice sign-on bonus has fully vested. I can’t complain about the ride, just the drop at the end.

jvert · on July 13, 2022

I spent about ten years on the Windows Kernel team. In the 90's, so likely very different than what you would experience today, but probably still a lot of the same problems. (Funny to hear people are still complaining about windbg!)

I think you are coming at this from the wrong perspective. Rather than thinking about how to avoid work you DON'T like, think about what you DO like and then decide if the new job would offer more or less of that.

Personally, I've found that every five years I end up sick of working on the same kind of problem and I have to go work on something completely different. Maybe that's where you're at.

jackling · on July 13, 2022

Don't you find it difficult to transition from one domain to another?

syschick · on July 13, 2022

Disclaimer: I worked at MSFT for 14 years, but that was >10 years ago, in Servers (Exchange, IIS), not kernel. But I've debugged Windows stresslab crashes across the network, before windbg got good :)

Is it difficult to transition? Yes, in that if you spend 10 years learning SystemA, then it will take some time with SystemB to build up to the _same level of expertise_ that you enjoyed on SystemA.

The neat thing about learning one OS or system _deeply_ is that the deep knowledge gifts you with frameworks for learning the next-system.

I wrote software on various *nix'es during my university time, then worked on OS/2 (yes, "OS-who?"), Windows, and now I'm at Google, (my work involves 80% Linux 20% Win). It took me some time to re-build my skills on each environment/platform, but it's been a great ride and a lot of fun.

(P.S. earhart@ pointed me to this thread, and says hi to all y'all: Evan, JonO, JVert, LandyW? Dan?)

heleninboodler · on July 13, 2022

The most growth (and fun, in my opinion) is experienced when you hop to a new discipline where you have no idea what you're doing.

Source: 33 years experience hopping all over the spectrum

jvert · on July 13, 2022

I like to learn new things; I find it energizing, especially when I am feeling burned out. Sure, it can be difficult but that is also what makes it rewarding. It's also really interesting to see the different tradeoffs that have been made between different solutions. (Linux vs Windows for example)

Ultimately, it's all just code and problem solving. And you can usually find a way to leverage your expertise in one domain into a different domain.

hyperrail · on July 13, 2022

Here's my thoughts based on working on Windows in-box code (user mode only, though) across 2 different Windows component teams, totaling about 5 years with a break in the middle. (Split into multiple comments for readability. Some parts removed because other people said it better.)

Most importantly, MAKE SURE YOU KNOW WHAT THE JOB IS! Microsoft people don't try to be dishonest, but there can be misunderstandings between you and your future coworkers about your role, and if you take a job that turns out to be different from what you expected, you will be unhappy.

If you haven't already, you should ask the hiring manager more about what the team does. Try to get enough specifics that you might not know everything the manager refers to, but can easily Google what you don't know: "For example, in Windows 10 version 2004, we shipped the API and implementation for the Windows hypervisor feature that lets third-party VM host software like VirtualBox force their VM guests' virtualized RAM to be paged into the host machine's physical RAM all the time." (Not an actual feature, at least as far as I know.) At this level of detail, you'll be able to judge whether the work is really what you think it is.

Talk to your other interviewers to learn more about the work and the team, if they gave you their contact info or otherwise seemed inclined to hear from you. 3 out of 4 of them are likely going to be your peers, and the 4th is either the hiring manager, another mid-level to senior leader, or a team architect - all will be at least close enough to your team that they won't give you vague generalities.

hyperrail · on July 13, 2022

An advantage of working at Microsoft that only other huge tech companies can match is that you'll get the chance to interact with many different people, some of whom will inspire outright hero worship among you and your direct coworkers. Those interactions could be in email discussions (having to send endless emails to random people or unarchived mailing lists to get things done or find things out is the curse of Microsoft life), or in API review meetings, or just water cooler talk.

Getting the chance to work with people like that was one of the highlights of my Microsoft career. Some of them are famous or semi-famous outside Microsoft, like David Cutler (mentioned repeatedly in this comment thread), while others are not known outside MSFT at all but arguably should be, while others are respected among a small geeky community (I'm thinking here of 2 Linux kernel subsystem maintainers who joined MSFT after making their names in Linux, and continue Linux work today). If making those connections is something you'd want to do as well, I'd definitely see that as a big plus of an MSFT job.

hyperrail · on July 13, 2022

Even if you do work on Windows in-box code only, it might not be so easy. Here's another issue you raised with your old job:

> 2. Debugging exclusively via metrics and logs, since I can't just attach a debugger to a running server.

I find debugging Windows issues fun, but it might not be for you. On rare occasions, you might be lucky even to have telemetry and logs, for customer issues that can't be reliably reproduced. "We noticed that 30% of our module's crashes in the last Windows Insider Preview Dev build had this new stack, so we used Watson Portal to request more crash dumps from devices that ran into that particular crash, and a week later, we've accumulated this set of dump cabs...."

On the other hand, if you get an automated email saying "test case XYZ broke due to a crash in your module", you'll probably get a live kernel debugger remote - an email link or copy-paste command to open a kernel debugger on the dead VM, preserved for your debugging. But of course, bugs aren't necessarily caught that early, and even if they are, finding things via kernel debugging is a needle-in-a-haystack problem because you're debugging the entire computer, not just one process.

int_19h · on July 13, 2022

To be fair, Watson post-mortem debugging is something that pretty much every team that ships a product running on users' hardware has to deal with.

hyperrail · on July 13, 2022

I think what I meant to say is that sometimes even telemetry and logging is unavailable to you. In my example, you might ask for more dumps for your Windows component or Microsoft first-party app on Watson Portal / Get More Data, but even after a few days of waiting you don't get any more, or only one more, because the issue occurs in the wild too rarely to show up in telemetry, or repros often but doesn't reliably produce a useful dump.

As an aside, if you are a third-party Windows application developer, I highly recommend you register your apps with the Windows Desktop Application Program: https://docs.microsoft.com/windows/win32/appxpkg/windows-des... - this will give you access to the same app reliability telemetry, including user-mode crash dumps (.DMP / .CAB files), that Microsoft has. This is only the latest of several iterations of this data access over the years, but it and its predecessors are still badly underused in my non-Microsoft experience.

int_19h · on July 13, 2022

Yup, this all sounds very familiar to me - and I never worked on anything Windows.

The "repros often but doesn't reliably produce a useful dump" is particularly frustrating. Like, you're seeing all those crashes, and every one of them is likely to be some poor user who is at best annoyed, and at worst just lost some data. And you have no clue as to what the bug is or how to fix it to help them.

hyperrail · on July 13, 2022

One reason I specifically suggest you confirm what your job is is that the teams that work on Windows in-box kernel-mode components aren't just Windows-specific teams anymore. They are part of the Microsoft Azure Edge + Platform division.

That name is misleading, but only partly so: shipping the Windows desktop and Windows Server products is a major part of that division's mission, but so is building Microsoft's internal-use Linux distribution CBL-Mariner, all of Microsoft's embedded and Internet of Things software products (Azure ThreadX RTOS, Windows IoT, etc.), and various Microsoft-internal software and hardware products.

It's very possible that the team you'd be joining would be working on an Azure product or Windows kernel-mode code for an Azure product, which means all your 5 issues could be a concern, especially:

> 4. The insane amount of work required to stand up even the smallest microservice: infrastructure provisioning, certificates, security reviews, GDPR compliance, etc.

> 5. Anything I build will end up paging some poor soul at 3am some day when something is down or under heavy traffic.

(Point 5 is even worse in Azure because you will get paged yourself if you're on-call. You can't just assume that the operations or site reliability engineering teams will take care of problems without pulling in the original engineers, especially when the product is new and buggy :)

hyperrail · on July 14, 2022

It's also possible that you could be working in a department that focuses on writing bug fixes for released versions of Windows, instead of writing features and fixes for the next version. The bug-fix department will often be called something like "servicing" or "sustained engineering" (I believe it's currently called Windows Servicing and Delivery, or WSD).

oldmanhorton · on July 13, 2022

I've worked adjacent to windows for a few years, and it's not perfect. They have mountains of legacy requirements by the very nature of being windows and other comments about siloed culture can often be true, though I think this is true in most large companies. With that being said, windows kernel developers are by and large extremely skilled and passionate, and working on that team is pretty unique within most of software engineering - rarely do you have the ability to impact so many people.

The concern for me would be the fact that windows is no longer a fully offline experience running on people's desks. A large part of the kernel team is working on features that only benefit Azure and you may get some of the same services exposure there as you claim to be so burnt out on.

Asking to discuss the role more with the team to get a better sense of what they're doing, how they're doing it, and who their customers are seems like the path forward IMO.

jiggawatts · on July 13, 2022

> A large part of the kernel team is working on features that only benefit Azure

And the IIS team, and the networking team, and…

jpgvm · on July 13, 2022

Haven't worked on NT kernel, did implement a device driver for it once. Spent first 4 years of my career doing Linux kernel development.

In general, go for it. The level of understanding you obtain by working on the kernel about how things -really- work will make you a better engineer even if it turns out not to be for you.

Things I would caution you about based on the downsides of your current job:

1. Kernels make k8s look simple. Not just simple, child's play.

2. Debugging experience in kernel-land can vary widely depending on what layer of the stack you are working on and the nature of the bugs. Highly concurrent pieces of any kernel are a nightmare to debug because such bugs are normally race conditions and timings are incredibly sensitive when you get this close to the metal.

3. Whilst you won't have to worry about this initially, as you will probably need a few years' experience at this level before you design new kernel subsystems, I definitely wouldn't consider architecture at this level less complicated than distributed systems. This is because computers these days -are- distributed systems. NUMA essentially means you have all the same problems. You do have much more convenient tools for solving them though (at the cost of performance), like HW coherency, etc.

4. Ok yeah, you shouldn't need to worry about this one. You won't be standing up new build systems or anything.

5. Well.. this is the rough part of kernel bugs. When you fk something up you potentially fk over everyone, usually in a very subtle, hard to diagnose and even harder to workaround way.

So yeah, go do it but don't do it because you think you will be getting away from those things because you aren't really. Do it because you think it will be enriching/fun/whatever you want more of in your life.
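
As a toy model of why the race conditions in point 2 are so timing-sensitive, here is a lost-update race where the outcome depends only on how two read-modify-write sequences interleave. This is a deterministic Python sketch, not kernel code: generators stand in for threads, each `yield` marks a point where preemption could happen, and `increment`, `run`, and the schedule strings are hypothetical names for the example:

```python
from typing import Dict, Iterator


def increment(shared: Dict[str, int]) -> Iterator[None]:
    """Unsynchronized counter += 1, split into its read and write halves."""
    value = shared["counter"]       # read
    yield                           # another task may run here
    shared["counter"] = value + 1   # write back a possibly stale value


def run(schedule: str) -> int:
    """Run one increment per task name, stepping tasks in schedule order."""
    shared = {"counter": 0}
    tasks = {name: increment(shared) for name in set(schedule)}
    for name in schedule:
        next(tasks[name], None)     # advance that task by one step
    return shared["counter"]


print(run("AABB"))  # -> 2: A finishes before B starts
print(run("ABAB"))  # -> 1: both read 0, so one update is lost
```

Both schedules execute exactly the same instructions; only the order differs. Attaching a debugger or adding a log line tends to push a real system toward the "AABB" ordering, which is part of why these bugs vanish when observed.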

maybekerneldev · on July 13, 2022

Those are fair points. I should've expanded a bit on my list though, since a lot of my points could be interpreted in a way that I didn't mean.

1. Yeah, I have no doubt about that. My peeve with k8s is not its inherent complexity, but just how much worse it made our day-to-day work. When we ran things on managed services, everything was a lot more understandable and straightforward. k8s is piles and piles of hard-to-discover yamls and I (and other folks in the team) feel that it made our infra a lot harder to understand and change.

2. This was my most unclear point. What turns me off about my current experience debugging distributed systems is having to trudge through millions of log messages coming in constantly. I just can't say I like the ELK stack, despite its popularity.

3. Fair point, though I'm aware of the distributed nature of things even within a single host. I do expect it to be different though. Will I still have to account for "the response was lost, you don't know what happened" situations? My understanding is that being close to the metal I can at least trust that the wires are all still there and working, and if they aren't, well, I can say the machine (or something within it) is broken and needs to be replaced. Do you have to do things like sagas or deal with eventual consistency at the kernel level? Hardware guarantees seem a lot stronger than what you get in networked systems.

4. Yay!

5. The difference here is that even a distributed system that's built correctly will still page someone one day. There'll be more traffic than it was built for, or some network dependency will be down (like, not the things you accounted for, but something fundamental like a DNS server or whatever), etc. My expectation (and past experience) is that in code that's intended to run on a single host, I'll either get it right and it'll just work as intended, or there'll be bugs that I have to fix, and once I do things will just work. What I find extremely frustrating about services is that it seems that no matter how much effort you put into quality, there will still be times when it just goes down and people have to put out a fire.

Definitely going for it after all the responses here!

elankart · on July 13, 2022

I used to work in the Windows team before I moved onto a team in Azure. I’ve contributed to both user and kernel mode components. You’ll work with very smart people and they will be very technical. Hiring bar in the Windows kernel team is very high compared to other teams in the company. You learn a lot. Work life balance will also be great.

The other great part is the sheer amount of engineering that will go into every feature you will own. I say “engineering” because it’ll be different from any services work you have done.

The tools at your disposal as part of the Windows Core team are top notch.

Good luck! You will not regret this.

RamRodification · on July 13, 2022

Sooo... Is Azure any good? (the product, but I guess also the workplace)

elankart · on July 13, 2022

We solve a different set of problems. But people here have different skillsets. Cadence of release is also faster.

I enjoyed my time in the Windows team the most.

drewcoo · on July 12, 2022

The design is very different than *nix. The entire ecosystem is different. I think experience with both is eye-opening. There may even be a couple of grizzled old graybeards left who followed DaveC over from DEC. I don't have a clue now.

If you can read and enjoy these books, consider it:

https://docs.microsoft.com/en-us/sysinternals/resources/wind...

The ugh!: You will be a tiny cog in a big machine, spending very little time writing code, mostly banging your head over someone else's bugs. And M$FT still has a certain stink to it these days and that may cling to you.

jdsully · on July 12, 2022

When I was on the Excel team the kernel devs I talked to knew their stuff and were of the highest quality. More importantly they cared and stuck around to see the consequences of their actions.

The Windows org has a reputation for being dysfunctional, although I never personally observed this, and I think even if it were true it would be in the higher-level parts of the stack.

Kernel dev was my original dream but I fell in love with Excel during my internship. I definitely think you should go for it.

anaisbetts · on July 13, 2022

The Windows Kernel is a quite well-maintained and well-written piece of software. I would say one thing that can be frustrating is that the NT kernel simply does a ton of stuff, and breaking compatibility with all of that stuff is not an option. I once had a one-character fix in NLS that turned into two separate 500+ line AppCompat shims.

It leads to code being Complicated, and a lot of Kobayashi Maru situations where there is no good solution to an engineering problem, just a bunch of bad compromises. Depending on your area, you might run into this all the time, or you might not.

rot13xor · on July 13, 2022

"Write once, run for 20 years" is a strong value proposition. I dabbled in web dev and it's rare for third party libraries and services to go a year without breaking interface changes. Meanwhile a program written for Windows XP will likely still run on Windows 11 even without a recompile (perhaps the source code was lost). In the worst case you might have to run the binary in XP compatibility mode.

fassssst · on July 12, 2022

I work on Windows (not the kernel) and the best part is that you can work on critical infrastructure without having to be on call. Work life balance is great.

hthrone · on July 13, 2022

I work at Apple on the kernel team. Congratulations on the job! Getting into kernel development is very difficult.

You said you were looking for things that you might not like about kernel development, so here’s a few:

- Distributed systems. Everything is distributed these days with NUMA, cache coherency, PCI transactions, etc. You’ll have to know how to use the right kind of atomic memory ordering, debug lockless algorithms, and know how the OS scheduler works at a deep level. If this doesn’t excite you, the job might not be a good fit.

- Working with physical hardware. Although in most cases you can get away with a VM, you will almost certainly have to use an actual device for some of your work. The hardware you use might not be 100% functional or even correct- I’ve had to debug kernel issues that ultimately were hardware errata. At Apple it’s easy to get in contact with the silicon design teams, but I’m not sure about Microsoft and Intel/Qualcomm. Working with physical hardware also limits your ability to work remotely easily. Lugging around a bunch of laptops, phones, and tablets in a carry-on suitcase is no fun, especially when you have to take them all out at TSA checkpoints.

- Lack of user-visible impact. Kernel dev is vitally important but it doesn’t get much visibility, except when something goes wrong. New kernel features hardly make the headlines. It can be a little annoying to see your colleagues in app/web dev get recognized for the features they worked on at WWDC or Microsoft Build, while you’re toiling away working on kernel features that very few people care about.

- Debugging. I actually think debugging kernels is not too difficult if you have the right set of tools. At Apple we use lldb and hardware debuggers (JTAG/SWD), and have the ability to take a core dump of the kernel after a panic to analyze later. But since kernel dev is at the core of the operating system, you’ll have to learn to debug other parts of the stack too. For example, you might make a kernel change that breaks the file explorer but only when you visit a specific directory. So you’ll need to know how to debug both the user space process to know what’s going wrong, and the kernel to know why that is happening.

Although there are some negative aspects of the job, there are many more positives. I really like my job and can see myself staying at Apple for many years. I’m constantly learning and working with some of the smartest people I have ever met.

Give the Microsoft job a shot; you can always leave if it’s not a good fit but you will have left a better engineer than when you started.

maybekerneldev · on July 13, 2022

> - Distributed systems. Everything is distributed these days with NUMA, cache coherency, PCI transactions, etc. You’ll have to know how to use the right kind of atomic memory ordering, debug lockless algorithms, and know how the OS scheduler works at a deep level. If this doesn’t excite you, the job might not be a good fit.

I'm familiar with the things you mentioned, except PCI transactions. In this context, do you also have to deal with distributed transactions, lost responses, eventual consistency, and the like? Do you have to account for "response lost, don't know if it went through" situations?
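For anyone unfamiliar with the "response lost, don't know if it went through" case from the services side, here is a minimal Python sketch of it. It says nothing about how a kernel handles the equivalent problem. All names (`Server`, `LossyLink`, `deposit_with_retry`) are made up for illustration, and the "network" is just a counter that drops a fixed number of replies. The point is that a blind retry is only safe because the server deduplicates by request ID.

```python
class Server:
    """Toy service that applies each request ID at most once."""

    def __init__(self):
        self.balance = 0
        self.seen = {}  # request_id -> result that was already computed

    def deposit(self, request_id, amount):
        # A retry carrying a known ID gets the stored result back
        # instead of being applied a second time.
        if request_id not in self.seen:
            self.balance += amount
            self.seen[request_id] = self.balance
        return self.seen[request_id]


class LossyLink:
    """Hypothetical transport: every request arrives, but replies may be lost."""

    def __init__(self, server, drop_replies):
        self.server = server
        self.drop_replies = drop_replies  # how many replies to lose

    def call(self, request_id, amount):
        result = self.server.deposit(request_id, amount)  # request is applied
        if self.drop_replies > 0:
            self.drop_replies -= 1
            # The caller cannot tell whether the deposit happened.
            raise TimeoutError("reply lost")
        return result


def deposit_with_retry(link, request_id, amount, attempts=5):
    """Retry with the same request ID until a reply gets through."""
    for _ in range(attempts):
        try:
            return link.call(request_id, amount)
        except TimeoutError:
            continue  # safe only because the server deduplicates by ID
    raise TimeoutError("gave up after retries")
```

With two replies dropped, `deposit_with_retry(link, "req-1", 100)` still leaves the balance at 100 rather than 300, because the second and third attempts are recognised as duplicates.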

Definitely going for it after all the responses here :)

Joe_Boogz · on July 13, 2022

I work at MSFT on windows kernel drivers - not the kernel directly though. I’d say go for it. Knowing how to program in kernel mode is a lucrative skill set. There aren’t enough people that know how to do it.

In my experience the kernel devs are the best of the best developers. But know that working on the windows kernel will be complicated and require a lot of domain knowledge. Expect to learn about it for _years_ to come.

silentsea90 · on July 13, 2022

Part of me fears that you might run into an even worse class of problems as it relates to your points including:

> 1. Infrastructure complexity - Sure, but there would be a LOT of complexity in kernel dev

> 2. Debugging - :) MS kernel is ancient and must be full of cruft. Systems dev is notorious for all sorts of weird timing, coordination bugs.

> 3. Designing systems - You'd have transactions, concurrency, race condition kind of problems which imo tickle the same part of the brain as distributed txns, eventual consistency etc

That said, if you like it, you like it. Can't know without trying. If I were you, and if I had the opportunity - I'd go for a newer variant of this which could look like Apple's M1 team, Tesla's systems teams etc. simply to have a fresh slate to build on.

cschep · on July 12, 2022

Kinda seems like if you hate it you could go back to building services really easily. No real downside to giving it a shot! Unless I'm missing another trap door it feels very reversible. :)

You might even be able to return to your old company if you hate it. Tell them directly "hey I need to give this a shot but I'd love to check-in in a year".

remflight · on July 12, 2022

Why wouldn’t you take a chance, given how much you dislike what you’re doing right now? What have you got to lose?

nu11ptr · on July 13, 2022

I did not work on the kernel itself, but wrote NDIS IM kernel drivers (a shim between NIC driver and TCP/IP - like a firewall or load balancer). I found it lots of fun personally. Back in the day I recall using WinDbg and later SoftICE for debugging, so when you got stuck there was lots of rebooting to get into a debug session and then careful stepping, which could be tedious. It was very challenging but rewarding: every cycle/instruction counted, and every memory allocation had to be carefully made from paged or non-paged (never swapped out) pool. I specifically recall you could only call certain functions/syscalls at certain IRQLs or the kernel would bluescreen, so you always had to know which IRQL you were being run at. This was 20 years ago so I'm sure much has changed (and probably much hasn't).
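To make the IRQL rule concrete, here is a toy model in Python. Real drivers are written in C against the WDK, so treat this purely as a sketch of the idea. The three level names and their numeric values are the real ones; everything else (`raised_to`, `max_irql`, `BugCheck`, the two `touch_*` routines) is hypothetical. It models one well-known constraint: pageable memory may only be touched below DISPATCH_LEVEL, because a page fault cannot be serviced at or above that level.

```python
from contextlib import contextmanager

# Real IRQL values on Windows; lower numbers are less restrictive.
PASSIVE_LEVEL, APC_LEVEL, DISPATCH_LEVEL = 0, 1, 2


class BugCheck(Exception):
    """Stands in for a bluescreen in this toy model."""


# Mutable holder for the "current IRQL" of our single pretend CPU.
_cpu = {"irql": PASSIVE_LEVEL}


@contextmanager
def raised_to(level):
    """Hypothetical helper: run a block at the given IRQL, then restore."""
    previous = _cpu["irql"]
    _cpu["irql"] = level
    try:
        yield
    finally:
        _cpu["irql"] = previous


def max_irql(limit):
    """Illustrative decorator: the highest IRQL a routine may be called at."""
    def wrap(fn):
        def checked(*args, **kwargs):
            if _cpu["irql"] > limit:
                raise BugCheck(
                    f"{fn.__name__} called at IRQL {_cpu['irql']}, limit is {limit}"
                )
            return fn(*args, **kwargs)
        return checked
    return wrap


@max_irql(APC_LEVEL)
def touch_paged_memory():
    # Might page-fault, so it must run below DISPATCH_LEVEL.
    return "ok"


@max_irql(DISPATCH_LEVEL)
def touch_nonpaged_memory():
    # Always resident, so it is fine at DISPATCH_LEVEL too.
    return "ok"
```

Calling `touch_paged_memory()` inside `with raised_to(DISPATCH_LEVEL):` raises `BugCheck`, while `touch_nonpaged_memory()` succeeds there. In the real kernel nothing checks this for you at the call site; you find out when the machine bluescreens.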

layer8 · on July 12, 2022

As someone who shares your preferences, I would say you should absolutely go for it, even just for the learning experience. Many devs would metaphorically kill to work on an OS kernel team. If it turns out to not be that great after all, you’ll have no difficulty returning to Kubernetes-land ;), or rather more likely, finding a different low-level oriented dev job with that Windows kernel experience.

willcipriano · on July 13, 2022

Man I envy you, that sort of thing is the dream. If you don't try, it will always be a what if. The services side of things will always be there if you change your mind.

py_or_dy · on July 13, 2022

I'm pretty burned out too with my work (mostly web dev stuff). I've been meaning to get "AWS certified" or learn Kubernetes, but like you, it all seems so crazy to me. I used to love old school "linux administration", but this new wave of tech gives me no interest.

Curious how you get an offer such as this? I've thought about changing job roles, but I really suck at leet coding so I've never really bothered. I figured that as a 40 year old male, no one would hire me for a role unless I was already experienced. Is that not the case?

low_tech_punk · on July 13, 2022

Feeling the same. Tired of web stack. Don’t want to do Kubernetes. Don’t want to do ML. All I want is descending to lower level programming. Rust would be nice. C would do too. LLVM IR might be my ultimate language. I don’t waste time on Leetcode. I spend most of my free time on side projects on GitHub which helps me build up a portfolio for my next job. I also watch MIT courses to learn system programming. I don’t care about earning a degree. Just watching it for the pleasure of it.

Have you considered applying for entry level jobs in a different domain? As long as you are ok with a pay cut, there is no such thing as being over-qualified.

hansoolo · on July 13, 2022

Would you mind sharing a link or two to these courses?

low_tech_punk · on July 13, 2022

MIT 6.033 Computer System Engineering, Spring 2005 https://www.youtube.com/playlist?list=PL6535748F59DCA484

MIT 6.172 Performance Engineering of Software Systems, Fall 2018 https://www.youtube.com/playlist?list=PLUl4u3cNGP63VIBQVWguX...

I feel what is underpinning this thread is job dissatisfaction stemming from the ratio between "using other people's software" and "writing software for others". The former is inevitably frustrating, as I become a consumer of software rather than a creator. Kernel programming, and low-level system engineering in general, is more aligned with the latter, where you have more creative freedom and fewer constraints from other people's abstractions. It's like driving an automatic vs. a manual car. The latter is just more fun, despite its inefficiency.

Enjoy the courses!

hansoolo · on July 15, 2022

Thanks a lot. Will give this a shot pretty soon!

ineedasername · on July 13, 2022

If you're miserable in your job now then switching seems like there's lots of upside and only a little downside (it could potentially be worse than current).

The main thing I'd consider in this is the work/life balance. Will it allow the same stellar level of that? People vary in their priorities. For some it's interesting/fun problems to solve, for some it's impact, but for me it's work/life balance, presuming of course that the work side isn't a hateful stressfest. No amount of remote work or flex schedule will make up for an absolutely hellish work environment, and I had something like that once. (I ended up taking a 30% pay cut that I could just barely afford and got the heck out.)

So I'm sorry that I can't speak to the kernel work itself, but instead the framework for making the decision. If the work/life balance will remain constant then you're only risking the possibility that the work won't be any better, and hopefully no worse. If the work/life balance for the new job is also uncertain then (for me) that would be a bigger risk consideration.

Finally, you should consider the worst case scenario: The work is worse, the work/life balance is also worse. How easily can you shift to something better? If you have the chops to work on the Windows kernel then I'd wager that if you hit the worst case scenario you could get out of it without too much trouble, but you're the only one who knows whether or not that's true.

Best of luck to you, sláinte, and may the wind be always at your back.

ForgottenEnmity · on July 13, 2022

Is work/life balance a US thing? In my country, for example, the employer can't easily make you work for longer than 8 hours per day, and even then there are big big compensations like extra payment with time off, restrictions on how much overtime per year is allowed and so on. So when people speak of a bad life/work balance, how does that happen in the first place?

hyperrail · on July 14, 2022

Salaried office workers receive far poorer wage and hour protections in the U.S. than you may be used to. Overtime above 40 hours per week is often unpaid, as is the case at Microsoft, and no amount of paid vacation is guaranteed by law. (Workers paid hourly do get overtime pay at 1.5 times the normal wage or higher.)

Microsoft's work/life balance is good in my experience; managers respect your time, product deadlines rarely or never cause long-hour rushes, and if your role lacks an on-call rotation then you won't be woken up in the middle of the night. But other companies may be far worse, and Microsoft is big enough that one person's experience never truly generalizes.

For time off in particular, as of my last experience 2 years ago, MSFT's US policy on paid time off was stingy by American tech company standards:

* There is no scheduled period of the year where everyone gets paid time off. (LinkedIn has 2 such weeks a year but it manages its HR independently of the rest of Microsoft.)

* You accumulate 3 weeks of vacation PTO per year of work for your first 6 years at the company, then 4 weeks per year for years 7-12, then 5 weeks per year for years 13+.

* Unused vacation days expire at the end of the calendar year following the calendar year in which you accumulated them.

* When you leave the company, you get a "vacation cash-out:" you are paid a lump sum equal to all your unused, unexpired vacation days.

* In addition to vacation, each calendar year you get 2 weeks of paid sick leave and 2 "floating holidays" to use like vacation days.

For comparison, at Google you accumulate 4 weeks of vacation per year starting just in year 2, while at Netflix and many smaller companies you get unlimited vacation at the cost of no cash-out.

In practice, there is more leeway at Microsoft on PTO than this policy allows. Even when I had an otherwise poor relationship with my manager, they still looked the other way when I went on Christmas/New Year's week vacation without filing an absence request or using up my vacation days. And I have heard of divisions where after a particularly stressful and overtime-heavy year, the division Vice President told everyone to take such an off-the-books holiday vacation.

ForgottenEnmity · on July 15, 2022

Appreciate the info, thank you.

landr0id · on July 13, 2022

If you accept the job pay attention to what the Hyper-V team does. They were the best kernel team that I personally worked with and were eager to do things the best way possible.

g051051 · on July 13, 2022

As much as I loathe MSFT and Windows, I'd jump at the chance to be a low-level C programmer again.

sys_64738 · on July 13, 2022

Definitely go for it, if for no other reason than to say that you tried it, as you'll always regret it if you don't. The only thing worse than not doing it is leaving such a team for the outside world and realizing that's the biggest mistake of your working life.

waynesonfire · on July 12, 2022

i think it's more impressive you were able to land a job in such a vastly different ecosystem. what's your trick against being pigeon-holed?

colinfinck · on July 13, 2022

I have worked on a Windows-compatible kernel. Not at Microsoft, but within the ReactOS Project. And I can affirm that even after 15 years of low-level development, nowadays mostly within various Rust projects, it doesn't get boring. Can't imagine going back to high-level while there is still so much work to do on the underlying foundations.

Congrats on your kernel development offer from Microsoft! I imagine these to be very scarce these days, considering that today's Microsoft seems to be more into services and less about low-level software development.

If you like to chat more about this, just drop me a line. Always interested to hear from low-level devs and aspiring ones. My website/e-mail is in my HN profile.

erichocean · on July 13, 2022

Just a thought, you might like working at https://oxide.computer/. I'd at least interview there before you sign on at MSFT.

mwcampbell · on July 13, 2022

I thought the same thing, simply because Oxide have been very public about their systems programming work. But bcantrill said a few months ago that they're very, very over-subscribed from a hiring perspective [1].

[1]: https://news.ycombinator.com/item?id=30892773

rufius · on July 13, 2022

I have worked on it and debugged it in the past. It’s probably some of the cleanest code at Microsoft if I were to guess.

It’s very readable, if a bit idiosyncratic. I would work on it again, given the right opportunity.

danrocks · on July 13, 2022

I envy you. I would probably pay money to give up my Sr. Manager position at FAANG managing multiple teams doing critical infrastructure work (or rather, trying to) to become an IC embedded developer.

dboreham · on July 13, 2022

Seriously? You have to ask??

if (offered_nt_kernel_job) accept_right_away()

You can always quit if it doesn't turn out well.

sedatk · on July 13, 2022

As a former engineer on the Windows Core OS team, I can vouch for two things: the kernel code being of great quality and the emphasis Microsoft puts on work-life balance. These two are excellent.

You'll work with people great at their job whom you can learn a lot from.

I also loved having my own window-side private office. I've since heard that Microsoft was trying to adopt open offices, but if it's still there, awesome. Remote work is okay too of course.

ribtoks · on July 13, 2022

Especially in Microsoft, such old historical projects are a huge huge mess. For Microsoft, never join Windows (anything, not just kernel), Power BI, Office (desktop one). Even many parts of Azure that I witnessed when working there, were already a huge and messy legacy. Colleagues, code and development workflow would be your worst nightmare there.

quilombodigital · on July 13, 2022

I guess the real issue here is the "work-life balance is stellar". Humans just can't stand being in a comfortable position for a long time. People struggling in their jobs/life will say: "what are you talking about??? I would kill for this opportunity!", but the fact is that we know that growth only happens when there is change and when you are being challenged. This will affect not only your work, but also your relationships. Absolutely any job gets boring after some time, even if you are a Nascar racer. People get depressed when they have "everything". My formula to deal with this is to reinvent yourself many times, and look at your work/life from different angles each time. Understanding that you will be in the exact same position after some time makes you realize that eventually you will have to deal with it.

mandeepj · on July 12, 2022

Sharing because it might give you some Windows history at MS

https://www.amazon.com/Old-New-Thing-Development-Throughout/...

Also read - Showstopper

_e4mv · on July 12, 2022

I'm way less experienced than you are but sort of started out closer to the kernel world (work involves using C mostly) than the webdev world. It's fun if you're interested in that sort of stuff. The things that mostly make you go "ugh" would be testing your work (can't let the kernel slip up, quality is important), the processes involved at some companies in committing something as simple as one line of code, and the lower upper-bar for salary. Maybe Microsoft is a bit wealthier than where I work though.

keithnz · on July 13, 2022

I once interviewed a guy who worked on various parts of the core windows, naturally being curious I asked a lot of questions about it. I think it would be an interesting experience, but ultimately it seemed very constrained in terms of what you can do, decision making is quite constrained etc. It sounded like you would end up very very focused. Which might be your thing. The guy I interviewed was wanting a lot more freedom. In the embedded world, there's lots of opportunities for low level programming.

jve · on July 13, 2022

To all the Windows folks out there: Thank you for this wonderful OS :)

I know, it gets a lot of trashing all around, the ads, the UI, the whatever - I like the UI and I really like what it offers me as a technical person: ETW traces, Performance Monitor, Event Forwarding, PowerShell, Windows Terminal, GPO, Windows Defender with advanced features, Resource Monitor, ACLs (Well, auditing file access could have been better :) ), wf.msc, RSAT, Hyper-V and whatever features under the hood that enable Process Monitor! ...

fshbbdssbbgdd · on July 13, 2022

> 2. Debugging exclusively via metrics and logs, since I can't just attach a debugger to a running server.

I’m not sure this is really a difference between kernel and backend service development. You can run the server on a dev machine, attach a debugger, and reproduce the bug. If your problem is you can’t reproduce the bug, you may encounter the same issue when working on the kernel. If someone reports a kernel bug, you may not be able to attach a debugger to their computer, so you will need to reproduce it.

earhart · on July 13, 2022

I worked on the Windows kernel in the early 2000s. I really enjoyed it - I learned a lot about PC hardware, worked with some very smart people, and although a few projects were ambitious and ambiguous and didn’t really pan out, most of them were really solid; it felt great to be improving something that really made a difference for a huge number of people. I’m a much better developer, with a much better sense for how code serves business, for having spent time there.

robert_foss · on July 13, 2022

Kernel development is not a panacea. I'm a maintainer and you should set your expectations (if you end up upstreaming code) to:

- a lot of bikeshedding

- a mailing list driven workflow

- poor continuous integration

phendrenad2 · on July 13, 2022

I'm in the same boat. I've been a backend web developer for so long, I've almost become accustomed to the kubernetes churn. I sometimes dabble in kernel dev, and sometimes want to show off my skills to a company like Google and try to go work on Fuchsia or Android or something. But it feels like grass-is-greener thinking, and I'm wary of such things, having burned myself by falling for it before.

kramerger · on July 13, 2022

I recommend you have a look at some old timer blogs to understand the type of challenges you may encounter.

Raymond Chen is a favourite of mine:

https://devblogs.microsoft.com/oldnewthing/

Firmwarrior · on July 12, 2022

Microsoft is a friggin dumpster fire, dude. Some teams are great to work for, but a LOT of them are little crews of treacherous rogues who're only looking out for themselves and always ready to stab someone in the back.

You'll run into a lot of the same issues on the Windows kernel that you're getting irritated with in service development land. A lot of infrastructure is already out there, but figuring out how to use it effectively ends up being almost as much work as just rewriting it from scratch. Things are documented but poorly.

For Microsoft specifically you'll often get stuck because of problems with some other team's code, and then find yourself embroiled in multi-week battles with multiple engineers, managers, project managers, and product managers all at each other's throats.

All that said, the Windows kernel itself is a work of art IMO, and kernel development is a lot of fun. I think that if you want to make the big bucks but still work with c/c++, it's getting to the point where your choices are very limited: Microsoft, Google, Facebook, Apple, or Tableau.

Microsoft was terrible to work for, but they aren't as bad as a bad small company. At least there's HR to report gross violations to, they have a TON of great perks, the money is really good, they have good work-life balance, and it's not too hard to change teams (but if you find yourself wanting to jump to a different team, do it well before performance reviews come along, since your manager will definitely give you a bad review and hamstring your ability to move around within the company.. so you'll end up having to change companies. Not the worst thing if you do so, I guess)

EDIT: I just reread your list of complaints.. timing and consistency in the kernel are WAY more complicated than your average web services. There's also a ludicrous amount of red tape around getting anything done. I'm starting to think that you should look at Microsoft as a stepping stone in your career rather than its final destination.. it's still a good place to work for overall compared to random small companies, it'll give a boost to your resume, but it sounds like what you actually want to work on is game emulators or raspi home automation gizmos in your free time.

chasil · on July 13, 2022

The original kernel from which Windows evolved was written in assembler by Dave Cutler for the VAX architecture. After his DECWest team was hired by Microsoft, they reimplemented it in C. The MIPS architecture was the first fully complete port, I believe.

This is documented in Zachary's Showstopper! book:

https://www.flyingpigbooks.com/book/9780759285781

It appears that performance and architecture improvements are very hard to get accepted (and the kernel suffers for it):

https://blog.zorinaq.com/i-contribute-to-the-windows-kernel-...

It is a great piece of engineering, but it is likely for a developer that is more comfortable making small changes within the hierarchy than for any revolutionary ideas (unfortunately).

If Cutler hadn't found his way to Microsoft, then they probably would have ended up on a BSD kernel, as Apple did. As it was, Microsoft sold their Xenix business around the time of Cutler's arrival.

jiggawatts · on July 13, 2022

> think that if you want to make the big bucks but still work with c/c++

Judging by the presenters at CppCon, a significant fraction of highly-skilled and highly-paid C++ devs are employed in the High Frequency Trading (HFT) industry.

They meet OP's criteria in many ways as well, as HFT tends to be "very close to the metal", optimised to death, and often involves solving deep technical problems. It's diametrically opposed to modern web development practices.

int_19h · on July 13, 2022

The downside of working on HFT is that the value one provides to society is at best nil, and some would argue that it's negative. Some people care about things like that.

Firmwarrior · on July 13, 2022

I’ve been thinking about getting into HFT.. they’re always bragging about operating on millisecond/microsecond timescales, but they’re doing it on bespoke heavily-monitored systems. Meanwhile here I am shipping the same type of code to millions of people on commodity hardware..

I’m pretty sure that you could donate a big chunk of the million+ ill-gotten dollars to charity and make a lot of good things happen

greggsy · on July 12, 2022

You basically described any workplace in a large company. Organisations are systems made of loosely interconnected and dependent systems, made up of people. The complexity of business norms, personality and cultural differences, and unrealistic expectations around business processes can be difficult to rein in inside a global company. Some workplaces have worked out a sweet spot, balancing people, HR, tools, culture and purpose.

I’m not telling you or anyone else in these companies to suck it up, but there’s a point where employees and teams in these situations need to step back and realise what they’re dealing with, and adjust their mindset accordingly.

Personally, I like identifying and unfucking bottlenecks. Sometimes explaining why something is so fucked to the people being fucked can go a long way.

notJim · on July 13, 2022

I really disagree with this idea, we shouldn't normalize every workplace being full of back-stabbers and retaliatory jerk managers. Nor is horrible red tape a given, although some bureaucracy is of course required. None of this has been my experience at fairly large, well-known companies, although I'm definitely aware of cases where it has happened to friends.

greggsy · on July 13, 2022

It’s not an idea - it’s just a perspective that some people forget: the world is volatile, uncertain, complex and ambiguous, and our organisations are a reflection of that. I know that it’s fashionable to talk about not normalising things, but the reality doesn’t change.

I don’t like it either, but I can’t really do much about it. If you realise you’re in a difficult organisation, you can at least start to take action, or take a different tack (or leave). Being annoyed about the situation is probably just going to make you more and more unhappy because the system won’t (or can’t) change in your desired timeframe.

Firmwarrior · on July 14, 2022

Microsoft CEO Satya Nadella wrote a book right around the time I was planning to leave, and gave everyone a free copy. The book really perfectly nailed what a toxic dumpster fire the company had become, and then it outlined his plan to fix it.

What's funny is that the guy agreed with everything I thought, he saw the problems that were hamstringing me (and everyone else), he had complete authority to fix the problems.. and so far as I can tell, he wasn't having any luck with it

The best fix seems to be to tell people to quit being jerks to each other and focus on making good products and making customers happy, but good luck getting that to happen.

One of the main criteria for performance reviews was suddenly something along the lines of "Teamwork" that captured what a non-toxic, non-Ballmerized good person you were. Of course, all that happened is that the best backstabbers and the weaseliest scumbags manipulated their way into high "Teamwork" scores, and anyone who sat down and focused on productive work/helping others got screwed on that score just like everything else.

retcon · on July 13, 2022

Is enabling the same as normalization? Surely normalization is acceptance into culture whereas even in dysfunctional organizations the individual actors all think that they're solving problems and not taking any shit.

The pity is often how the most qualified talent for resolving poor behaviour is so often subordinated into slopping out the pigsty. There needs to be a foundational next step.

Edit: since Fairchild we've been scared of merely assembling raw talent under a thin layer of management capabilities, and from this developed the pseudo-matrix management system that e.g. Microsoft operates.

Firmwarrior · on July 13, 2022

I’ve been at other big companies where it wasn’t the case. The engineers were all there to get a job done and were willing to help other teams when necessary to make that happen, without lengthy negotiations ahead of time

There’s going to be SOME of that stuff any time you have more than one person in the same building, but a decade under the iron fist of Ballmer has permanently ruined Microsoft as an organization.

carabiner · on July 13, 2022

Ah, so it is true: https://www.androidpolice.com/wp-content/uploads/2021/07/08/...

teddyh · on July 13, 2022

The original link: https://bonkersworld.net/organizational-charts

Firmwarrior · on July 13, 2022

Man, it is so true, or at least was when I left in 2017

It’s a fractal shape too. You see the same sort of interpersonal weaponry at the division level, the org level, the group level, and even within teams

Maybe it even continues to a Herman’s Head-style inner conflict for good company men

pjmlp · on July 13, 2022

You forgot NVidia and Adobe, two big hires of well known names in the C++ community and ISO C++ working groups.

Firmwarrior · on July 13, 2022

That’s a good point, I’ve heard good things about NVidia

Presumably Tesla and Uber must have some “native” code being written somewhere in-house too, and someone mentioned the HFT industry

There are a lot of decent jobs out there that will keep you comfortable, too, even if you can’t bring in 300k+.

localhost · on July 13, 2022

What level (title is fine)? What team? I can try to help point you at some folks in Windows though it's been a while since I worked in that org. You can reach me via contact info in my profile if you don't mind decloaking.

rramadass · on July 13, 2022

Perhaps the book Advanced Windows Debugging by Mario Hewardt and Daniel Pravat will get you excited enough to take up "Kernel Development" work.

cglan · on July 12, 2022

Definitely do not join Microsoft and definitely do not join any Windows-related teams. Microsoft is hell. Every team is fully siloed. Culture-wise it’s a wasteland, and Teams as a communication platform is awful.

SBArbeit · on July 13, 2022

Sitting on the Microsoft campus at the moment... not a kernel developer. <looks around> yeah, this is not hell. Not even a bit.

Microsoft is great. It's an amazing place to make impacts in many different areas of technology. Work teams used to be more siloed, but that's changed a lot in the last 6-7 years. We regularly work across teams at GitHub and Microsoft to get things done with great cooperation.

As for Teams vs. Slack vs. Discord, it's a personal taste thing. I'd rather have Teams over Slack + Zoom 1000x.

Take the Windows kernel job! Worst case, it's not a fit and you move on, like any other job. Best case, you stick around for a long time and have a great career working on tech that 1B+ people use.

When Robert Downey Jr. was trying to talk Gwyneth Paltrow into joining "Iron Man", he said to her something like, "do you want to work on art house films the rest of your life or do you want to be in something that people actually see?"

Windows is massive... it's the largest code base that I'm aware of. It has a ton of process and procedures and test cycles because... it's Windows. Of course you can't just freelance there. But you can "be in something that people actually see", and most SV startups will never even get 1,000 users, much less 1,000,000,000+, and more if you count the users on Azure services using Windows indirectly.

Mandatum · on July 13, 2022

> When Robert Downey Jr. was trying to talk Gwyneth Paltrow into joining "Iron Man", he said to her something like, "do you want to work on art house films the rest of your life or do you want to be in something that people actually see?"

Sounds like a bastardization of Jobs convincing John Sculley, manufactured by a PR person:

Steve Jobs and John Sculley, then PepsiCo president, were sitting on a balcony overlooking New York’s Central Park. Jobs turned to Sculley and said, “Do you want to sell sugar water for the rest of your life or come with me and change the world?”

https://www.forbes.com/sites/carminegallo/2016/11/12/how-ste...

That's like asking someone who cooks at a fine-dining restaurant to work at McDonalds, "do you want to make pretentious food for 40 cashed up foodies and critics a day, or do you want to make food for hundreds of regular people a day?"

People pretending that Facebook, Google, Microsoft, et al are noble endeavours are deranged. It's like rooting for Walmart to win the World Series of Retail.

philwelch · on July 13, 2022

McDonalds corporate actually employs chefs to develop new menu items, and a large part of the process is figuring out how to work within the constraints of being able to consistently reproduce a menu item at thousands of locations, sourcing the necessary ingredients via the McD supply chain. I'm sure it's an interesting problem and a chef working at a fine-dining restaurant may very well find it a nice change of pace compared to feeding a small number of rich, pretentious jerks.

dekhn · on July 13, 2022

I worked at McDonalds some time ago. New managers were sent to "Hamburger University" in Illinois and came back talking like robots. https://en.wikipedia.org/wiki/Hamburger_University

marktangotango · on July 13, 2022

Say what you will, but McDonalds is pretty remarkable in their consistency, even internationally. A Big Mac in New York is the same as a Big Mac in San Francisco, as in the Midwest, as in Germany.

A pet peeve of mine is pizza joints who serve a different pie every time, depending on who's in the kitchen. If I have a great pizza, I expect to get it again, but rarely do. Consistency is hard.

darknavi · on July 13, 2022

Hard disagree about the "Hell" portion. Siloed, probably, but it really depends on the team/org.

The culture in games is very healthy in my experience, and the compensation is among the best in the games industry.

Someone1234 · on July 13, 2022

Do you care to contextualize this? Did you work for Microsoft and/or on Windows? For how long?

cglan · on July 13, 2022

Sure. I've worked at Microsoft primarily, and I'm not speaking for all teams but I have a sibling on another team in another org and he has the same issues

Microsoft is incredibly siloed. Each team is essentially their own mini company and they're mainly guided by large top line metrics but there's no top down overall vision on what something should look like. It's like those party games where everyone has to draw a portion of a drawing. It comes out looking like a disaster even if every individual portion is good. This also means that product management doesn't work with any particular team either. You essentially get some random big metric and are told "make this metric better" without any context. Everyone is duplicating work and there's nowhere to learn from

Microsoft Teams is a disaster in so many ways. Most obviously, it's very, very slow and a pain to use. This subtly hinders teamwork because no one wants to use Teams. Beyond that, there's no global search, so finding stuff in other orgs is impossible. The Teams "channels" are essentially shitty forums that are unintuitive to use. You'll never get a channel about hobbies and stuff, and even if you did, it would be hard to find, and such channels are usually dead. Everyone, including each team, uses private group chats, but these have zero discoverability. What this means is you'll never have a golang channel or something where people share and chat about stuff. Most people have a Facebook group where they chat about stuff (wild).

Every team has their own onboarding down to what hardware you should get. In theory this allows for some flexibility but what this actually means is that no one has any idea how you should be onboarded and you essentially are sent to flounder until you pick stuff up.

Everyone is doing everything. Every person is their own product manager, scrum master, manager, and also programmer. There's so much duplicated process overhead it's wild.

They have not handled remote well. They insisted on trying to send me a desktop computer. I asked for a laptop and they couldn't give me one, so they sent me a used intern laptop instead. They gave my sibling a used Surface tablet. This is a 2 trillion dollar company and they're unwilling to shell out 2k for a basic workstation computer with 6-8 cores and 32 GB of RAM. Not a huge ask. Also some stuff is only accessible through a direct hardline in the office. Whether you want to use a desktop or not is irrelevant. It's mainly how cheap they are when it comes to hardware.

EVERYTHING has to be Microsoft software for the most part. If you think NIH syndrome is bad at your company, imagine you're at a company where they've been making mostly mediocre versions of other software for the past 30 years. Yeah. I'm not a huge Splunk fan but trust me when I say the Azure equivalent is much shittier.

The pay isn't top notch. In fact it's pretty bottom-of-the-barrel for a big company, and if you can pass the Microsoft interviews you can pass somewhere else. Their interview process is also a nightmare. I went through 4 different recruiters, and it took 2 months between passing and getting an offer letter. The whole thing was insane. They initially offered me such a paltry amount it made me laugh.

Everyone is a lifer because everyone else has left. Imagine talking to your boss about Docker or IntelliJ and he's never heard of it because he's been at Microsoft for 20 years.

There's a lot of weird "not racism" but might as well be, where certain ethnic groups have taken over certain orgs and speak primarily in their mother tongue despite being in the US at a US-based company. It makes teamwork really hard.

You need a separate laptop to log into any production resource. Production resource is a loose term because that also includes int environments and anything on Azure. I have 2 laptops and a desktop that I'm forced to remote into. It's insane and so slow.

There is no one to talk to about anything. You literally cannot find what anyone is working on or what's happening. Discoverability at this company is literally 0

HR processes are completely broken, down to me not receiving healthcare for 3 weeks after I joined, which forced me to pay some very big expenses out of pocket that would otherwise have cost me 0. Getting reimbursed after the fact doesn't work because those expenses were only applied to my deductible, which wouldn't have been the case if I had paid with my healthcare initially.

Most things are a huge heaping pile of legacy crap that absolutely cannot be changed for backwards compatibility. Imagine working on a C++ code base with no local environment, no unit tests, and the only way it can be changed is to make a change and upload it to a build server (1hr+ build times) and then deploy it. Yeah. There's no room to improve things because it's all so delicate.

There's simply so much. Thankfully I'm leaving but it's been ass

b20000 · on July 13, 2022

The services world, what does that mean? Kernel development is highly specialized work, so I don't understand where you are coming from and how your previous experience is relevant.

jacobsenscott · on July 13, 2022

I've managed to just do full-stack Rails apps for years, so I've been avoiding most of those issues. But I feel you on the GDPR, and whatever other regulation is there or coming down the pipeline. It really sucks the fun out of programming. I used to be a systems programmer, and was just thinking the other day that would probably be more fun than building web apps these days.

itisme · on July 13, 2022

The answers you will get here at HN will mostly be misplaced hatred against Microsoft, from people who hate MS, because someone else here at HN hates MS.

I think what you should do is, if possible, arrange for a short call with some of the teammates that you are going to work with. Just a casual one about what the kernel team's day-to-day work looks like there. If not with the team, then maybe with your manager.

Unless someone from that team or some ex-kernel team member replies here, all the rest of these posts are just speculative and not worth basing your decision on.

tpmx · on July 13, 2022

I guess you're probably thinking about someone like me - someone who got started in the 90s, or perhaps even earlier.

I don't hate the competent developers at Microsoft (of course they exist, there have been source code leaks) - I just very strongly dislike the machine that is Microsoft. I don't think the machine has reformed.

beebmam · on July 12, 2022

[flagged]

jnwatson · on July 12, 2022

K8s is like 4-wheel drive. It allows one to get stuck further off the main road.

beebmam · on July 12, 2022

I don't see how the analogy is applicable. To illustrate what I mean, you could have also said: "K8s is like a train. It's big and heavy, will carry most things that you need it to", and it would be just as much of a non sequitur. Analogies are usually worthless to an argument (and therefore should be worthless to your opinions), unless one is able to demonstrate why thinking about an idea with an analogy is applicable.

There's zero reason to think that Kubernetes is like a 4-wheel driven car for the same reason to think that Kubernetes is like a train, or Kubernetes is like a virus, or Kubernetes is like a snowman; it's completely unrelated.

jpalawaga · on July 12, 2022

What pedantry is this? The only thing that you could take away from OP's comment is that they made a metaphor, and since you can make a metaphor about anything, the chosen metaphor says nothing?

That's like saying, well, you can use words to construct a lie, therefore, all words are untrustworthy.

He's saying that k8s can be powerful, but that that power can be destructive if not wielded correctly.

Personally? k8s is powerful. But you need a team to manage it, and if you're the poor sob who wants to write software but gets stuck managing k8s (due to its complexity) half the time, you're not going to be very happy.

Aloha · on July 12, 2022

I think you're right.

K8s (and other similar tools) allow you to push complexity into places that are harder to troubleshoot. It's another layer of abstraction, and too much abstraction is dangerous in complex systems.

mickael-kerjean · on July 13, 2022

> But you need a team to manage it,

I do run a Kubernetes cluster outside work as a side project and it hardly takes me more than 2 hours per month. It mostly runs cloud instances of my open source software for customers and saves a lot of time compared to the manual process of creating DNS, rp rules for everyone, handling of SSL, monitoring, ... Just thinking about the work that would be required to get all this done outside Kubernetes makes me sweat, and the maintenance aspect of such a solution would make it even worse.

goodpoint · on July 13, 2022

> side project ... 2 hours per month

That's your error: you are evaluating k8s on a use-case that is not representative.

Once you scale up the number of nodes, the number of applications, and the years of uptime, the complexity grows exponentially.

Now your team needs to be able to debug each component in the OS stack as well as k8s itself.

In the long run you always pay the price for unnecessary complexity - and with interest.

beebmam · on July 13, 2022

That's not what I'm saying.

I said that saying K8s is like X requires that you show how it is like X first, before you can draw any conclusions from that analogy. Otherwise, it is not an argument and should have zero value in a conversation, other than someone stating their (unsupported) opinion.

maxbond · on July 13, 2022

I think what you're missing is that they didn't give you an argument, they told you how they felt about Kubernetes. And that's a totally reasonable thing for them to do. You aren't owed an argument. But if you had asked politely you probably would've gotten one.

When it comes to political discussions, I think it is reasonable to demand that everyone who chooses to participate substantiate their claims and make arguments rather than stating opinions. Misinformation about politics is serious, it can be a matter of life and death.

That standard does not apply to talking smack about Kubernetes on the internet. I can tell you that programming in Python feels like a game of Operation and that programming in Rust feels like a breath of fresh air. I don't have to tell you why. You shouldn't try to argue against this, either - it isn't something I can possibly be wrong about, or that you could possibly understand better than I. You can tell me about how you feel like Rust is overhyped garbage (I have no idea how you feel about Rust, it's only an example). But why would you try to tell me how I feel?

What is the consequence if my statement goes unchallenged? Someone tries to learn Rust and is disappointed that it doesn't match the hype? Some engineer starts a new project in Go instead of Python because they don't want to play Operation? Life goes on.

If you look at it through that lens, you can make sense of it. They're not saying there's a relevant similarity between Kubernetes and vehicles. They're saying that with more complex tools you make more complex mistakes, that Kubernetes may have solved problems but it also gave us more rope to hang ourselves. And I bet you'd have something interesting and productive to say in response to that idea - you obviously have passionate opinions on this subject.

I'd also note, no one ever said not to use Kubernetes. The general vibe was that Kubernetes is stressful to work with. Again, no one is making an argument - they're venting to other engineers.

beebmam · on July 13, 2022

>I think what you're missing is that they didn't give you an argument

That's fine, and I'm glad that people are admitting that their comment shouldn't be taken as persuasive, or anything other than self-expression. Many people think analogies are persuasive, but they're often nothing more than illogical rhetoric, and it's a critical blunder to be persuaded by an unjustified analogy.

inkeddeveloper · on July 12, 2022

This rant is completely unrelated and definitely not wanted.

naikrovek · on July 12, 2022

I’ve written entire software packages which do what I need, exactly, rather than try to get Kubernetes to behave, and I am much happier for it.

nicce · on July 12, 2022

You can write many books about Kubernetes and still not know it perfectly.