7 LLM Projects to Boost Your Machine Learning Portfolio

7 LLM Projects to Boost Your Machine Learning Portfolio
Image by Author | Created on Canva

Large language models (LLMs) are super helpful in a variety of tasks. Building LLM-powered applications can seem quite daunting at first. But all you need are:

the ability to code, preferably in Python or TypeScript and
a few not-so-fun tasks or problems that you’d like to make simpler (I’m sure you have many!).

To build LLM applications, you should be able to run and interact with LLMs, connect to various data sources—files on your local machine, APIs , databases, and more. The following is not an exhaustive list, but here are some tools and frameworks you can use to build out applications with LLMs:

Programming languages: Python, TypeScript
Frameworks: LangChain, LlamaIndex
APIs: OpenAI API, Cohere API
Running LLMs: Ollama, Llamafile
Vector databases: ChromaDB, Weaviate, Pinecone, and more

This guide goes over seven interesting projects you can build with LLMs. Along the way, you’ll learn to work with vector databases, frameworks, and useful APIs. We also share learning resources and example projects to help you hit the ground running. Let’s get started.

1. Retrieval-Based Q&A App for Technical Documentation

Build a Q&A system for developers that uses RAG to pull from various technical documents, Stack Overflow answers, or internal docs and knowledge bases as needed. Such an app can summarize and clarify complex concepts or answer specific technical questions.

Key components in the project include:

RAG framework that retrieves relevant documents and snippets

Open-source LLMs for interpreting questions and generating answers
Integration with APIs for external sources such as Stack Overflow, Confluence

Assisting developers with instant, reliable answers to technical questions without manually searching through large docs. This can be especially helpful for frameworks like Django where the docs are extensive.

To learn all about RAG, check out LangChain: Chat with Your Data from DeepLearning.AI and Learn RAG From Scratch.

2. LLM-Powered Workflow Automation Agent

Create an agent that can simplify repetitive workflows and boring tasks based on natural language instructions. The agent should be able to work through a sequence of steps either predefined in advance or autonomously given the end goal.

Such an agent should be able to handle tasks like creating new project folders, setting up Git repositories, creating the project’s dependency files, and more.

Key components, besides the LLM, are:

API integrations for various tools such as Docker, Git, and AWS
Engine to execute the LLM-generated scripts

You can improve the first version you build to get a more helpful app that reduces manual setup and admin tasks for developers or teams, allowing them to focus on higher-value work.

3. Text-to-SQL Query Generator

It’s always intuitive and simpler to think of business questions in plain English. However, a straightforward question like “What is the quarterly sales of a specific product across various customer segments?” may translate to a fairly complex SQL query with joins and multiple subqueries. Which is why building a text-to-SQL generator can help.

You can build an app that translates natural language queries into SQL using LLMs. The app should:

Convert user input into SQL queries based on a predefined database schema
Executes them against a connected database to return relevant data

Here’s a sample project walkthrough: End-To-End Text-To-SQL LLM App by Krish Naik.

4. AI-Powered Documentation Generator for Codebases

Build a tool that uses an LLM to scan code repositories and automatically generate comprehensive documentation. including function summaries, module descriptions, and architecture overviews. You can build it out as a CLI tool or a GitHub Action.

You’ll need:

Integration with repository services to scan codebase files
Options to review and add feedback to refine or edit generated docs

A more advanced version of such a generator can actually be used to auto-generate technical documentation for development teams. Though getting perfect docs can be a challenge, such a tool will definitely save hours of work!

5. AI Coding Assistant

Build an LLM-powered coding assistant that can act as a real-time pair programmer. This tool should provide suggestions, write code snippets, debug existing code, and even offer real-time explanations on complex logic during a live coding session.

When building such an app, ensure:

Good choice of LLMs that are good at code generation
IDE integration, such as VS Code extension, for in-editor functionality.
Contextual awareness from the current coding environment—libraries used, files open, and the like

Check out ADVANCED Python AI Agent Tutorial – Using RAG for a complete walkthrough of building a coding assistant.

6. Text-Based Data Pipeline Builder

Develop an LLM app that allows users to describe data pipelines in natural language. Say: “Write an ETL script to ingest a CSV file from S3, clean the data, and load it into a PostgreSQL database”. The app should then generate the code for a complete ETL pipeline—using tools like Apache Airflow or Prefect.

So you’ll have to focus on:

Support for various data sources (S3, databases) and destinations.
Automation of pipeline creation and scheduling with tools like Airflow.

This should help you build and schedule complex data pipelines with minimal coding. Even if the code is not completely accurate, it should help you start many steps ahead when compared to writing the pipeline from scratch.

7. LLM-Powered Code Migration Tool

There are off-the-shelf solutions, but you can also try building code migration tools from scratch. Build a tool that can analyze code written in one programming language and convert it into another language, using LLMs to understand the original logic and reimplement it in the target language. For example, you may want to migrate Python code to Go or Rust.

You have to experiment with the following:

Choice of LLMs for code translation between languages
Static analysis tools to ensure logical correctness after translation
Support for different paradigms and language-specific constructs

Such an app can help migrate legacy codebases to newer, more performant languages with minimal manual rewriting.

Wrapping Up

That’s a wrap! I hope you found these project ideas interesting.

These should be a good starting point for more interesting and helpful ideas you may have. Once you’ve built a working app, you can explore other directions. For example, you may want to build out a financial statement analyzer or your personalized research assistant using RAG.

As mentioned, you only need a problem to solve, an interest to build things, and coffee?

Happy coding!

About Bala Priya C

Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she’s working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.

Source link