So you want AI to write your code for you..
I don’t blame you, it’s so easy to just write a prompt and have AI write out your entire app for you. But here’s the truth: it’s not really that simple. I’ve spent the last few months working on several complex apps in Python using ChatGPT, Google Gemini and Claude.AI, and here is some advice for you if you’re interested in trying to do something similar.
Background…
I am not a developer or programmer by profession; I’m a UI/UX and graphic designer, so coding is not something I specialize in or do much of on my own. But having worked in web design for a long time, I’ve used a variety of languages on projects and have basic proficiency in most of them. That by no means qualifies me as a software developer, but it does give me enough experience to guide a project to completion.
Let’s write some code..
Here is some advice for writing code with AI..
- Use AI to write the prompts for code generation
- Start your project by giving the AI of your choice the best possible description of the project you want to create. Try to include as many steps and parts of the process as you can, in the order you want them to happen. Then ask it to take your description and turn it into step-by-step instructions that an LLM would be able to follow to reproduce the results you’ve described.
- Verify prompts follow your directions
- This is kind of obvious, but read through the prompts it just gave you and make sure they do indeed match up with your goal for the project. This is your chance to make clarifications and changes if you think something doesn’t make sense. The more detailed and precise you are in your steps and directions, the better your initial outcome is going to be.
- Go step by step through the prompts
- Start a new conversation and then go step by step through the prompts. If you’re using Claude.AI, be sure to tell it to break up large files into chunks and that you want each file in a new response. Claude has a short reply window, so you won’t always be able to get all your code in one reply, which can be very frustrating at times. It’s best to plan ahead for it not being able to give it all at once so you don’t have to constantly repeat steps.
- Initial results
- Once you’re done going through all your prompts, you should have a basically complete version of your project. It may contain multiple files and directories. You can also ask the AI to create a setup script that makes all the directories and files for you; that can be a helpful first step.
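The kind of setup script the AI gives you for that step usually looks something like this minimal sketch. The directory and file names here are hypothetical placeholders for a generic web app layout; substitute whatever your project actually needs.

```python
from pathlib import Path

# Hypothetical project layout -- replace with your own structure.
PROJECT_DIRS = ["app", "app/templates", "app/static", "tests"]
PROJECT_FILES = ["app/__init__.py", "app/routes.py", "config.py", "requirements.txt"]

def scaffold(root="."):
    """Create the project's directories and empty starter files."""
    base = Path(root)
    for d in PROJECT_DIRS:
        (base / d).mkdir(parents=True, exist_ok=True)
    for f in PROJECT_FILES:
        (base / f).touch(exist_ok=True)
```

Run `scaffold()` once from your project root and then have the AI fill in each file, one prompt at a time.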
Are you using APIs in your code?
If your app is going to use APIs to get data, there is a good chance the AI knows about the API you want to use, but it’s going to be very helpful if you include the schema and an example response from the calls you want to make as part of your prompts. Don’t let the AI guess at the endpoints it’s supposed to use; it will make up the names of ones it thinks should exist, which obviously won’t work with your app, causing more wasted time fixing bugs.
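Concretely, what you paste into the prompt is the real endpoint from the API’s docs plus a real example response, so the model sees the exact field names and types instead of inventing its own. The endpoint and fields below are made-up placeholders just to show the shape of what to include:

```python
import json

# Hypothetical endpoint -- copy the real one from your API's documentation.
ENDPOINT = "https://api.example.com/v1/weather?city={city}"

# An example response pasted verbatim from the docs, so the AI writes
# parsing code against these exact keys rather than guessed ones.
EXAMPLE_RESPONSE = """
{
  "city": "Portland",
  "temp_c": 18.5,
  "conditions": "cloudy",
  "updated_at": "2024-05-01T12:00:00Z"
}
"""

data = json.loads(EXAMPLE_RESPONSE)
print(sorted(data.keys()))  # ['city', 'conditions', 'temp_c', 'updated_at']
```

With the schema in front of it, the model has no reason to hallucinate a `temperature` or `weather_status` field that your real API never returns.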
Give it documentation before you start
If you can, copy and paste any related documentation about whatever you’re doing before you start. If you’re using APIs, copy and paste examples from the docs. Copy and paste anything and everything you can to give it as much background information as possible. This is going to help eliminate some of the guessing and making things up that LLMs like to do. It will still make things up occasionally, but if you give it documentation, it will at least make more of an effort to do the right thing.
Request the use of existing libraries
If you don’t tell it to use specific libraries that already exist, it will write its own to fill the use case you’re asking for. It will waste your time and resources reinventing the wheel, and trust me, it will make it square, not round. The code it writes will be buggy, won’t work, and will cause you more work and headaches.
You will need to do some initial research to find the libraries that you want to use. Search for things like: “Python library for <whatever it is you want to do>.”
Here’s an example: I use MariaDB for the backend of a lot of web apps, and MariaDB’s Python library has built-in connection pooling. But instead of using that, when I asked for connection pooling it wrote its own pooling library that was super buggy and wasted several days of my time before I realized that these functions were built into the Python MariaDB Connector.
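For reference, this is roughly what using the connector’s built-in pooling looks like, so you can recognize it when the AI tries to hand-roll one instead. This is a sketch, not a drop-in implementation: it assumes the `mariadb` package is installed, and the connection details are placeholders.

```python
def make_pool():
    """Create a pool using the MariaDB Connector's built-in pooling."""
    # Lazy import so the sketch can be read without the package installed.
    import mariadb  # pip install mariadb

    # Placeholder credentials -- use your own.
    return mariadb.ConnectionPool(
        pool_name="webapp_pool",
        pool_size=5,
        host="localhost",
        user="app_user",
        password="app_password",
        database="app_db",
    )

def run_query(pool, sql):
    """Borrow a pooled connection, run a query, return the rows."""
    conn = pool.get_connection()
    try:
        cur = conn.cursor()
        cur.execute(sql)
        return cur.fetchall()
    finally:
        # Closing a pooled connection returns it to the pool
        # rather than tearing it down.
        conn.close()
```

If the AI’s answer to “add connection pooling” doesn’t look like a thin wrapper around `mariadb.ConnectionPool`, it’s probably reinventing the wheel.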
Prompt Example:
Use the following existing libraries in this project: <list of libraries to use>
Ewwww… bugs!
So now you’ve got something, and it’s time to see if it works. Don’t be shocked when it doesn’t. In fact, there is a very high probability it’s going to crash, but don’t worry, it can be fixed.
However, the bug fixing can be the tricky part, and this is where the AI likes to take liberties and do its own thing with the changes. It will, I promise you, make up endpoints for the APIs you’re using that don’t exist. It will add placeholders and fake functions and take shortcuts to produce results. All of which will lead to more bugs and more headaches.
- Always ask for the full code
- Always ask for the full code when it does bug fixes. Otherwise it will give you a snippet and some vague advice about where in the code to replace the old version. That’s not super helpful, and it can be time consuming to do it manually; just have it produce the whole file, it’s easier.
Example of a GOOD Prompt:
Update the entire file, do not create partial updates or code snippets.
- DON’T use the word “rewrite” when asking for bug fixes
- The word “rewrite” implies a lot of things. It gives the AI permission to interpret the code and add what it thinks the code needs to work. This is how you end up with extraneous code and functions that don’t do anything.
Example of a BAD prompt:
Rewrite the script and fix the bugs.
- USE the word “update” instead of “rewrite” when asking for bug fixes
- “Update” is a lot less vague and much more specific about what you want the AI to do. Don’t give it the opportunity to “think” for itself; you just want it to follow your directions and fix the errors.
Example of a GOOD prompt:
Update the files to include the suggested changes.
- ALWAYS tell it not to make changes other than what you’re asking for
- When you ask for updates or changes, you absolutely need to state in your request that there should be no changes to existing functionality or business logic other than to fix the bugs. Otherwise you’re going to open the situation up to AI interpretation of your code, and again, you’re going to end up with changes you don’t want and that often don’t work.
Example of a GOOD prompt:
Do not change any other functionality or business logic when doing the updates. Only update the necessary sections of the code.
Copy and paste the entire error
When you come back to the AI for your first round of bug fixes, start by copying and pasting the entire output of the error you got. Most of the time this is enough to get the issue fixed, but it also doesn’t hurt to mention that you only want an update, not a rewrite.
But this is also when you’re going to start running into the limits of some of these AIs. Claude is extremely frustrating to work with; you will hit the chat limits quickly when doing bug fixes, and then it will completely cut you off and say you’ve reached your daily limit on responses. Google and ChatGPT have much larger limits and make this process a lot easier. However, a long conversation with ChatGPT can start to lose focus and go off the rails, so if it starts to seem like it’s not listening anymore, start a new conversation.
Upload your files!
When you start a new conversation after hitting a limit, or when you feel like it’s not listening anymore, always remember to attach the current versions of all the project files to the first message in the conversation. This will remind it of what you’re talking about and lead to better results.
The quirks of AI..
So which LLM should you use for your project?
Google Gemini
People like to say how good Gemini is at coding, and I agree it’s pretty good, but what it really excels at is doing what it wants to do and not what you ask it to do. It also loves to use pseudocode, take shortcuts and include placeholder functions. Gemini is great if you’ve got time to kill in a frustrating back-and-forth argument about how it shouldn’t be using shortcuts, placeholders and pseudocode in your app.
Especially with Gemini (and if you’re using AI Studio), make sure you add to the “system instructions” that it should not take shortcuts, use pseudocode or use placeholders. Also be sure to tell it to update the entire file and not just snippets; that’s a Gemini favorite, shortcutting you with crappy snippets that aren’t even complete.
Prompt example for Google Gemini:
Do not use pseudocode, take shortcuts, or use placeholder code. Always produce production ready code, in full, with no snippets or partial functions.
OpenAI’s ChatGPT
ChatGPT is also pretty good at coding, but it also has a tendency to do its own thing and not follow directions. It doesn’t use pseudocode or take shortcuts as often as Gemini does, but it does like to optimize things and make changes that it thinks are right, which will inevitably change your code and remove functionality you want in order to achieve its goals. You have to explicitly tell it not to make changes to existing functionality, or else it’s going to do whatever it wants to achieve its goal of writing some code that it thinks does what you want.
I think ChatGPT is much better at small coding tasks like fixing bugs. If you keep the task small and specific, it tends to do a better job of delivering results that work and are compatible with your existing code base.
Anthropic’s Claude.AI
Claude is the one I prefer. While it does suffer from some serious limitations (a short reply window and a limited number of responses), it tends to “stick to the script” a lot better than Gemini and ChatGPT. It still needs some guidance; it will make up endpoints for APIs, and sometimes it will forget to include functions that it references in other functions, but it’s pretty good about fixing its mistakes when asked.
Claude is good at UI and UX tasks. It defaults to using Bootstrap for CSS (but if you ask for something else, it has no problem using that instead). It does a good job of following best practices for page layouts and responsive design. I like to use it to make dashboards for web apps.
The biggest problem with Claude is that it has a very small reply window. You often can’t get a full file’s worth of code out of a single response. You have to specify that you want it to break the reply up into sections across multiple responses. This can be extremely frustrating and time consuming when you have to constantly start over to try to get a whole file’s worth of code.
Claude also has a limited number of inputs available per session; you’ll find that you’re right in the middle of something and all of a sudden it’ll say you’ve hit your limit for the day, and you’ll have to wait 5 hours before you can continue (Claude Sonnet 3.7).
Prompt example for Claude:
Break your response up into multiple replies. Do a section of the code in each response. When you’re done ask me to tell you to continue.
You’re going to get mad!
I’m going to tell you now that you’re going to experience a ton of frustration and disappointment in this process, so be prepared. It’s like working with a 5-year-old who’s a genius. They may be super smart, but they are still 5, and they still act like a 5-year-old. None of the LLMs are going to listen to you; they are not going to follow directions, they are going to do what they want to do, and you’re going to want to throw your computer out a window because of it. You have to remember to be precise in your instructions but also very detailed and repetitive. Always remind it to follow your instructions and not to “think” for itself. Remind it multiple times in the same prompt just to be sure. And when it makes a mistake, it’s going to apologize and say “oh my bad, I should have done what you said,” and then you’re going to want to say “that’s damn right you should,” but none of that matters, because in the next prompt it’s going to do the same thing again.
Overall I would say..
If you’re going to use an LLM for a coding project, they’re all good, so go ahead and pick any of them; just remember that they each have their own quirks and limitations, and once you’re familiar with how they work, you can pretty much guide them to the best results. It will take some trial and error, you will get frustrated, and you will want to throw up your hands and say “I quit.” But if you stick with it, it gets easier over time. Don’t expect perfect results; be prepared for bugs and a back-and-forth process to get them fixed.