Microsoft have a goal that states they want to get to "1 engineer, 1 month, 1 million lines of code." You can't do that if you write the code yourself. That means they'll always be chasing the best model. Right now, that's Opus 4.5.
> "Microsoft have a goal that states they want to get to "1 engineer, 1 month, 1 million lines of code.""
No. One researcher at Microsoft made a personal LinkedIn post saying his team was using that as their 'North Star' for porting and transpiling existing C and C++ code, not writing new code. When the internet hallucinated that he meant Windows and new code, and started copy-pasting this as "Microsoft's goal", the post was edited and Microsoft said it isn't the company's goal.
That's still writing new code. Also, it's kind of an extremely bad idea to do that, because how are you going to test it? If you have to rewrite anything (hint: you probably don't), it's best to do it incrementally over time because of the QA and stakeholder alignment overhead. You cannot push things into production unless they work the way users expect and do exactly what stakeholders expect as well.
No no, you're talking common sense and logic. You can't think like that. You have to think "How do I rush out as much code as possible?" After all, this is MS we're talking about, and Windows 11 is totally the shining example of amazing and completely stable code. /s
It is kind of funny that throughout my career, there has always been pretty much a consensus that lines of code are a bad metric, but now with all the AI hype, suddenly everybody is again like “Look at all the lines of code it writes!!”
I use LLMs all day every day, but measuring someone or something by the number of lines of code produced is still incredibly stupid, in my opinion.
It all comes from "if you can't measure it you can't improve it". The job of management is to improve things, and that means they need to measure things and in turn look for measures. When working on an assembly line there are lots of things to measure and improve, and improving many of those things has shown great value.
They want to expand that value into engineering and so are looking for something they can measure. I haven't seen anyone answer what can be measured to make a useful improvement though. I have a good "feeling" that some people I work with are better than others, but most are not so bad that we should fire them - but I don't know how to put that into something objective.
Yes, the problem of accurately measuring software "productivity" has stymied the entire industry for decades, but people keep trying. It's conceivable that you might be able to get some sort of more-usable metric out of some systematized AI analysis of code changes, which would be pretty ironic.
Ballmer hasn’t been around for a long long time. Not since the Red Ring of Death days. Ever since Satya took the reins, MBAs have filled upper and middle management to try to take over open source so that Sales guys had something to combat RedHat. Great for open source. Bad for Microsoft. However, Satya comes from the Cloud division so he knows how to Cloud and do it well. Azure is a hit with the enterprise. Then along comes AI…
Microsoft lost its way with Windows Phone, Zune, Xbox360 RRoD, and Kinect. They haven’t had relevance outside of Windows (Desktop) in the home for years. With the sole exception being Xbox.
They have pockets of excellence. Where great engineers are doing great work. But outside those little pockets, no one knows.
I believe the "look at all the lines of code" argument for LLMs is not a way to showcase intelligence, but more so a way to showcase time saved. Under the assumption that the output is the/a correct solution, it's a way to say "look at all the code I would have had to write, it saved so much time".
It's all contextual. Sometimes, particularly when it comes to modern frontends, you have inescapable boilerplate and lines of code to write. That's where it saves time. Another example is scaffolding out unit tests for a series of services. There are many such cases where it just objectively saves time.
I wonder if we can use the compression ratio that an LLM-driven compressor could generate to figure out how much entropy is actually in the system and how much is just boilerplate.
Of course then someone is just going to pregenerate a random number lookup table and get a few gigs of 'value' from pure garbage...
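To make the compression idea concrete, here is a minimal sketch. It assumes plain `zlib` as a stand-in for the LLM-driven compressor (a much weaker model of the data), and `compression_ratio` is a hypothetical helper name. It also shows the random-lookup-table problem: incompressible input looks like pure "entropy" even though it is garbage.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more redundancy)."""
    if not data:
        raise ValueError("need non-empty input")
    return len(zlib.compress(data, 9)) / len(data)


# Highly repetitive "boilerplate": the same getter repeated 1000 times.
boilerplate = b"public int getX() { return x; }\n" * 1000

# A pregenerated random lookup table of the same size (seeded for repeatability).
rng = random.Random(0)
random_table = bytes(rng.getrandbits(8) for _ in range(len(boilerplate)))

boilerplate_ratio = compression_ratio(boilerplate)
random_ratio = compression_ratio(random_table)

print(f"boilerplate ratio:  {boilerplate_ratio:.3f}")
print(f"random table ratio: {random_ratio:.3f}")
```

The boilerplate shrinks to a tiny fraction of its size while the random table barely compresses at all, so a ratio like this can only flag redundancy. It says nothing about whether the incompressible part is worth anything.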
it's still a bad metric and OP is also just being loose by repeating some marketing / LinkedIn post by a person who uses bad metrics about an overhyped subject
Ironically, AI may help get past that. In order to measure "value chunks" or some other metric where LoC is flexibly multiplied by some factor of feature accomplishment, quality, and/or architectural importance, an opinion of the section in question is needed, and an overseer AI could maybe do that.
Totally agreed. The numbers are silly. My only point is that you don't need 100k engineers if you're letting Claude dump all that code into production.
I used to work at a place that had the famous Antoine de Saint-Exupéry quote painted near the elevators where everyone would see it when they arrived for work:
Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.
Which is a bald-faced lie written in response to a PR disaster. The original claims were not ambiguous:
> My goal is to eliminate every line of C and C++ from Microsoft by 2030. Our strategy is to combine AI and Algorithms to rewrite Microsoft’s largest codebases. Our North Star is “1 engineer, 1 month, 1 million lines of code”.
Obviously, "every line of C and C++ from Microsoft" is not contained within a single research project, nor are "Microsoft's largest codebases".
The original claims were not ambiguous: it's "My" goal, not "Microsoft's goal".
The fact that it's a "PR disaster" for a researcher to have an ambitious project at one of the biggest tech companies on the planet, or to talk up their team on LinkedIn, is unbelievably ridiculous.
One supposes, when a highly senior employee publicly talks about project goals in recruitment material, that they are not fancifully daydreaming about something that can never happen but are in fact actually talking about the work they're doing that justifies their ~$1,000,000/yr compensation in the eyes of their employer.
Talking about rewriting Windows at a rate of 1 million lines of code per engineer per month with LLMs is absolutely going to garner negative publicity, no matter how much you spin it with words like "ambitious" (do you work in PR? it sounds like it's your calling).
You suppose that there are no highly-paid researchers on the planet working on AGI? Because there are, and that's less proven than "porting one codebase to another language" is. What about Quantum Computers, what about power-producing nuclear fusion? Both less proven than porting code. What about all other blue-sky research labs?
Why would you continue supposing such a thing when both the employee, and the employer, have said that your suppositions are wrong?
Sure, there are plenty of researchers working on fanciful daydreams. They pursue those goals at the behest of their employer. You attempted to make a distinction between the employer's and the employee's goals, as though a Distinguished Engineer at Microsoft was just playing around on a whim doing hobby projects for fun. If Microsoft is paying him $1m annually to work on this, plus giving him a team to pursue the goal of rewriting Windows, it is not inaccurate to state that Microsoft's goal is to completely rewrite Windows with LLMs, and they will earn negative publicity for making that fact public. The project will likely fail given how ridiculous it is, but it is still a goal they are funding.
Microsoft funded Simon Peyton-Jones (Haskell) and Don Syme (F#) and SP-J worked on Excel, but it would be inaccurate to say that their goal was to rewrite Excel, Windows, C#, .NET into functional programming. Yes, to an extent researchers "play around on a whim doing hobby projects for fun", or more formally as SP-J said in an interview "the mission statement that Microsoft Research had at that time which was to push forward the boundaries of knowledge; put Microsoft in a position to be agile when new stuff heaves over the horizon; provide a reservoir of expertise for the rest of the company to draw on".
Otherwise your position is that "blue-sky research" doesn't exist (it does) or that big companies don't fund it (they do). In particular, the LinkedIn post in question said nothing about "Windows"; that is something the internet has hallucinated to maximise ragebait.
Treating the authentic quote “1 engineer, 1 month, 1 million lines of code” as some kind of goal that makes sense, even just for porting/rewriting, is embarrassing enough from an OS vendor.
As @mrbungie says on this thread: "They took the stupidest metric ever and made a moronic target out of it"
Wow, such bad practice. Using lines of code as a performance metric was shown to be really bad practice decades ago. For a software company to do this now...
I mean, if 1% out of 8 billion is "top" and that applies to Lines of Code, too, then ... more code contains more quality, ... by their logic, I guess ...
I've not heard that goal before. If true, it makes me sad to hear that once again, people confuse "More LOC == More Customer Value == More Profit". Sigh.
I've written a C recompiler in an attempt to build homomorphic encryption. It doesn't work (it's not correct) but it can translate 5 lines of working code into 100,000 lines of almost-working code.
Any MBAs want to buy? For the right price I could even fix it ...
Why stop at Rust? Let's have one Windows version written in Python, another in Crystal, and another in Java. At least the generated code will be readable and maintainable!!! /s
1. Classic Coding (Traditional Development)
In the classic model, developers are the primary authors of every line.
Production Volume: A senior developer typically writes between 10,000 and 20,000 lines of code (LOC) per year.
Workflow: Manual logic construction, syntax memorization, and human-led debugging using tools like VS Code or JetBrains IDEs.
Focus: Writing the implementation details. Success is measured by the quality and maintainability of the hand-written code.
2. AI-Supported Coding (The Modern Workflow)
AI tools like GitHub Copilot and Cursor act as a "pair programmer," shifting the human role to a reviewer and architect.
Production Volume: Developers using full AI integration have seen a 14x increase in code output (e.g., from ~24k lines to over 810k lines in a single year).
Work Distribution: Major tech leaders like AWS report that AI now generates up to 75% of their production code.
The New Bottleneck: Developers now spend roughly 70% of their time reviewing AI-generated code rather than writing it.
I think a realistic 5x to 10x is possible. That's 50,000 to 200,000 LOC per YEAR !!!! Would it be good code? We will see.
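As a rough sanity check on the numbers in this thread, here is the arithmetic as a short sketch. It only uses figures quoted above (10,000 to 20,000 LOC per year, a 5x to 10x multiplier, the "1 million lines per month" slogan, and the ~24k to ~810k example); nothing here is measured, and the variable names are just for illustration.

```python
# Figures quoted in the thread; none of these are measurements.
classic_loc_per_year = (10_000, 20_000)  # senior developer, hand-written
multipliers = (5, 10)                    # the "realistic" AI speed-up range

# Low end: slowest developer with smallest multiplier; high end: the reverse.
low_estimate = classic_loc_per_year[0] * multipliers[0]
high_estimate = classic_loc_per_year[1] * multipliers[1]

# "1 engineer, 1 month, 1 million lines of code", annualised.
north_star_per_year = 1_000_000 * 12
north_star_vs_high = north_star_per_year / high_estimate

# Ratio implied by the quoted "~24k lines to over 810k lines" example.
quoted_example_ratio = 810_000 / 24_000

print(f"5x-10x range: {low_estimate:,} to {high_estimate:,} LOC/year")
print(f"North Star:   {north_star_per_year:,} LOC/year "
      f"({north_star_vs_high:.0f}x the high estimate)")
print(f"24k -> 810k implies about {quoted_example_ratio:.1f}x")
```

So even the optimistic 200,000 LOC per year is 60 times short of the slogan. The 24k to 810k example also works out to about 34x rather than 14x, so one of those quoted figures is presumably off.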