Introduction
AI-assisted programming has drawn a great deal of interest and discussion. That interest is understandable given the high salaries of software developers and backlogs that never seem to shrink, a problem made even more challenging in the federal space by clearances and other security considerations.
Outside of my role at Redhorse, I’ve been using a number of coding assistants on a near-daily basis for the past six months. This experience has been quite enlightening, so I wanted to provide some insights on the potential value and applicability of AI for programming within the federal space. But first, I’ll start by providing a brief overview of the types of AI code assistants.
What is an AI Code Assistant?
At the simplest level, an AI code assistant is an AI-based application or tool that helps software developers write code, ideally making them more efficient. The original ChatGPT was an effective code assistant, as one could see from the drop in visitors to Stack Overflow, a site developers frequented when they ran into an issue they needed help solving. As useful as ChatGPT was, it quickly became apparent that it lacked many of the features needed to truly unlock the value of AI. Some of its shortcomings were:
- Lack of awareness about the codebase a developer was working in. Since the model only knew about the code that was provided to it, it wouldn’t be able to easily diagnose issues that spanned multiple files or were specific to that codebase.
- Short context windows. The models could only hold a small amount of code in context at once.
- Context-switching. Developers had to switch from their development environment to ChatGPT, breaking their concentration.
- Prompting. As with any LLM task, how you interact with the model influences the quality of its results. A developer who didn’t write a good prompt might not get a good result.
Fast forward to 2025 and developers (and even non-developers, as we’ll see) have quite a few options that overcome these challenges. I’ll sort these applications into three groups, though there is quite a bit of overlap:
- IDE-based code assistants: An Integrated Development Environment (IDE) is the application, like VS Code, where most developers spend their time coding. Assistants in this group are either plug-ins for existing IDEs or forks of open source IDEs that companies have made their own.
- Full-featured text-to-application apps: These applications allow you to create a working application, complete with features like a database to store information, user login, and payment options, just by chatting in natural language with the app. Many now also let you provide a screenshot or URL as a starting point.
- AI Software Developers: These agents aim to act like coworkers. Users can give them a task, which they will attempt to go off and do, reporting back when the task is complete or they need further guidance.
The promise of these applications is that they can greatly increase the productivity of any developer. So let’s look beyond the hype and see what impact these applications could have in the federal space.
IDE-based code assistants
Note: Open source options are in italics
Example applications: Cursor, GitHub Copilot, Codeium, Continue.dev, Cody
By integrating directly into the environment where developers are coding, IDE-based code assistants address the problems I identified earlier. This integration lets developers get suggestions as they type or via a keyboard shortcut, so they don’t have to switch to other applications. While this workflow takes a little while to get used to, once a developer becomes comfortable with it, they can see significant productivity increases.
What exactly are those benefits? At the most basic level, it’s not having to look up a syntax detail or variable name that you might have forgotten. Whole functions can be written based on the existing context and the start of a function definition. While these improvements may seem trivial, the efficiency gains add up. In my experience, there are times when I’m using Cursor that it feels like it’s reading my mind.
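As a hypothetical illustration of writing a whole function from the start of its definition: the developer types only the signature and docstring, and the assistant proposes the body. The function name and its behavior are invented for this sketch and were not captured from any particular tool.

```python
import re


def slugify(title: str) -> str:
    """Turn a post title into a lowercase, hyphen-separated URL slug."""
    # The developer types only the signature and docstring above.
    # The lines below are the kind of body an assistant might propose
    # (hypothetical; not output from a real tool).
    slug = title.lower()
    slug = re.sub(r"[^a-z0-9]+", "-", slug)  # collapse runs of non-alphanumerics into one hyphen
    return slug.strip("-")  # drop leading and trailing hyphens


print(slugify("What is an AI Code Assistant?"))  # what-is-an-ai-code-assistant
```

The developer still has to review the suggestion, but accepting a few lines like these is usually faster than typing them out and looking up the regular-expression syntax.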
These applications are backed by closed (e.g. Anthropic, OpenAI, Google) and open (e.g. Meta Llama, Mistral) models, so developers benefit immediately as these models improve. While OpenAI garners a lot of the media attention, the model of choice for many developers has consistently been Anthropic’s Claude Sonnet, which ranks among the top coding models while being a relatively fast model.
Full-featured text-to-application apps
Example applications: Replit Agent, Lovable.dev, v0, Cursor (Agents), Codeium Windsurf, GitHub Copilot (Agents), Bolt.new
To make effective use of the IDE-based code assistants you still need to know how to code. That is starting to change with the move to agents.
Most of the IDEs now offer some variation on an agent, which does more of the work for developers. For example, in Cursor I’m able to ask the agent to make a change to my application, such as changing the color of a button or switching from OpenAI’s to Anthropic’s models. If I don’t specify the file, it will scan the codebase to find the relevant file, make the changes, and share those changes with me to review. Even for changes that span multiple files, Cursor will complete the edits in tens of seconds – the time it would usually have taken me just to locate the correct files and lines of code and start thinking about a solution.
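To give a feel for the "scan the codebase to find the relevant file" step, here is a deliberately naive sketch that ranks files by keyword hits. This is an assumption made for illustration only: it is not how Cursor or any other product works internally, and the function name `find_relevant_files` and the list of file suffixes are hypothetical.

```python
from pathlib import Path


def find_relevant_files(root, keywords, suffixes=(".py", ".js", ".ts", ".css", ".html")):
    """Rank source files under `root` by how often they mention any keyword."""
    scores = []
    for path in Path(root).rglob("*"):
        # Skip directories and file types we are not interested in.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        hits = sum(text.count(word.lower()) for word in keywords)
        if hits:
            scores.append((str(path), hits))
    # Most keyword hits first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Example request: "change the color of the submit button"
    for file_path, hits in find_relevant_files(".", ["button", "color"]):
        print(hits, file_path)
```

An agent would then open the top-ranked files, propose edits, and present them for review. A plain keyword count like this is only meant to show the shape of the step, not the retrieval quality a real tool would need.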
These models aren’t perfect, of course, so they do make errors. Often these are minor and can be corrected quickly, but every once in a while I’ll have to start over from scratch because the agent gets caught in a line of thinking that it can’t break out of. Despite their flaws, the agents allow me to build apps in programming languages I’m not familiar with, compressing weeks of trial and error into hours of focused work.
Applications like Replit Agent are even more powerful. I enjoy prototyping in Replit Agent because it handles a lot of the environment setup and deployment that can eat up a developer’s time (especially for someone like myself who isn’t coding full time!). With Replit Agent, you can create and deploy prototypes, and even full applications, solely with natural language in a matter of minutes or hours. As an example, we hosted a Winter Tech Challenge and our winners were employees who had little or no programming experience. Using Replit Agent they were able to build and deploy fully functional prototypes that included databases and connections to third-party APIs. I was honestly blown away when I started reviewing their submissions.
AI Software Developers
Example applications: Devin, OpenHands
Since AI agents are getting so good at quickly producing working code, it’s natural to wonder if we need humans at all. Enter the AI Software Developers, which aim to do whatever a human developer can do. As of now these applications are roughly on par with a junior developer for some tasks, while at other times they fail quite miserably. While the applications I discussed earlier work in roughly real time, AI Software Developers operate asynchronously. You task them as you would a junior developer and they’ll check in when a task is complete or when they need input.
While these applications, particularly Devin, have not lived up to the hype for many people, the trend is clear: as the underlying models become more competent, and we become more competent at directing them, these AI Software Developers will be able to complete many basic coding tasks. Sometime in the next couple of years their capabilities may be more on par with mid-level engineers. It will not be unusual for someone to manage a team of AI developers.
Concluding Thoughts
I see many people online make the mistake of focusing on some failure of these applications on a seemingly easy task and concluding that it’s all hype. These criticisms miss a few important points:
- The underlying models are getting better at a rapid pace
- As with any other tool, we need to learn how to use AI in software development properly
Even though I’ve been using many of these applications on a near-daily basis, I can see there is still so much for me to learn to better unlock their power. While I’m still early on the learning curve, these tools have enabled me to be much more capable than I otherwise would have been – literally building applications I never would have even attempted to create.
If you’re a developer, I encourage you to give some of these applications a fair shot – you might be surprised at how helpful they can be.
Public Sector Implications
While many developers and startups have readily embraced these applications, there has understandably been less traction in the federal space given the sensitivity of the data and the cleared environments in which many people work. However, these applications will start to find their way into the federal space because they are becoming too impactful to ignore. In my next post I will discuss the short- and long-term implications of these applications in the public sector.