Everything you need to know about the Gemini API as a developer in less than 5 minutes

By: Logan Kilpatrick

Re-posted from: https://medium.com/around-the-prompt/everything-you-need-to-know-about-the-gemini-api-as-a-developer-in-less-than-5-minutes-5e75343ccff9?source=rss-2c8aac9051d3------2

Get started building with the Gemini API

Gemini is Google’s family of frontier generative AI models, built from the ground up to be multi-modal and long context (more on this later). Gemini is available across the entire Google suite, from Gmail to the Gemini App. For developers who want to build with Gemini, the Gemini API is the best place to get started.

In this article, we will explore what the Gemini API offers, how to get started using Gemini for free, and more advanced use cases like fine-tuning. As always, you are reading my personal blog, so you guessed it, these are my personal views. Let’s dive in!

How can I test the latest Gemini models?

If you want to first test the Gemini models (everything from the latest experimental models to production models) without writing or running any code, you can head to Google AI Studio. Once you are done testing there, you can also generate a Gemini API key in AI Studio (“Get API Key” in the top left corner). AI Studio is free and there is a generous free tier on the API as well, which includes 1,500 requests per day with Gemini 1.5 Flash.

Image captured by Author in aistudio.google.com

What does the Gemini API offer?

The Gemini API comes standard with most of the things developers are looking for. At a high level, it comes with:

Fine-tuning support for Gemini 1.5 Flash
Context caching, to help reduce production deployment costs
Code execution, to augment the model’s capabilities by running code
Structured outputs, to extract data from input sources
Video, image, and audio understanding
Document processing, supporting PDFs up to 1,000 pages long

And much more! In general, the Gemini API offers most if not all of the features developers have come to expect when building with large language model APIs, in addition to many things that are unique to Gemini (like long context, video understanding, and more).
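
As one illustration of the structured outputs feature listed above, here is a minimal sketch that asks the model for JSON and parses the reply. The `response_mime_type` setting is part of the Python SDK's generation config; the prompt wording, the `parse_people` and `extract_people` helpers, and the JSON shape are hypothetical choices made for this example, and the model is not guaranteed to follow the requested shape.

```python
import json
import os


def parse_people(json_text):
    """Turn a JSON reply into a list of (name, age) tuples.

    Assumes the reply is a JSON array of objects with "name" and "age"
    keys. That shape is only requested in the prompt below.
    """
    return [(item["name"], item["age"]) for item in json.loads(json_text)]


def extract_people(text):
    """Ask Gemini for JSON output. Needs the SDK installed and API_KEY set."""
    # Imported lazily so parse_people can be used without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=os.environ["API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-flash")
    prompt = (
        "List the people mentioned in the text as a JSON array of objects "
        'with "name" (string) and "age" (integer) keys.\n\n' + text
    )
    response = model.generate_content(
        prompt,
        # Ask for a JSON reply instead of free-form text.
        generation_config={"response_mime_type": "application/json"},
    )
    return parse_people(response.text)
```

Calling `extract_people("Ada is 36 and Alan is 41.")` sends a live request, so it needs a valid API key.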


What models does the Gemini API support?

By default, the two model variants available in the Gemini API as of September 21st, 2024 are Gemini 1.5 Flash and Gemini 1.5 Pro. There are different instances of these models available, some of which are newer and have performance updates. Each model also offers different features, such as its context length or whether it can be tuned. You can check out the Gemini models page for more details.
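
To see which models and features your key can reach, the Python SDK exposes `genai.list_models()`, and each returned model carries a `supported_generation_methods` list. The sketch below filters on that list. The `names_supporting` and `list_live_models` helpers and the stand-in model names in `demo_models` are hypothetical and exist only so the filter can be tried offline.

```python
import os
from types import SimpleNamespace


def names_supporting(models, method):
    """Return the names of models whose supported methods include `method`."""
    return [m.name for m in models if method in m.supported_generation_methods]


def list_live_models(method="generateContent"):
    """Query the real API. Needs the SDK installed and API_KEY set."""
    # Imported lazily so the filter above works without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=os.environ["API_KEY"])
    return names_supporting(genai.list_models(), method)


# Stand-in objects with made-up names, used only to demonstrate the filter.
demo_models = [
    SimpleNamespace(
        name="models/example-a",
        supported_generation_methods=["generateContent", "createTunedModel"],
    ),
    SimpleNamespace(
        name="models/example-b",
        supported_generation_methods=["generateContent"],
    ),
]

print(names_supporting(demo_models, "createTunedModel"))
```

Passing `"createTunedModel"` instead of `"generateContent"` narrows the list to models that can be tuned.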

Image captured by Author on ai.google.dev

Sending your first Gemini API request

With as few as 6 lines of code, you can send your first API request. Make sure to get your API key from Google AI Studio before running the code below:

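import google.generativeai as genai
import os

# Read the key generated in Google AI Studio from the API_KEY environment variable
genai.configure(api_key=os.environ["API_KEY"])

# Pick a model, send a single prompt, and print the text of the reply
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Explain how AI works")
print(response.text)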
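
If you would rather not install the SDK, the same request can be sketched against the REST endpoint using only the standard library. This is a minimal sketch, not the SDK's implementation: the `build_request` and `send` helpers are hypothetical names, and the code assumes the `v1beta` `generateContent` endpoint and the `x-goog-api-key` header.

```python
import json
import os
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_request(prompt, api_key, model="gemini-1.5-flash"):
    """Build (but do not send) a generateContent request for one text prompt."""
    url = f"{API_ROOT}/models/{model}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def send(prompt):
    """Send the prompt and return the first candidate's text. Needs API_KEY set."""
    request = build_request(prompt, os.environ["API_KEY"])
    with urllib.request.urlopen(request) as reply:
        payload = json.load(reply)
    return payload["candidates"][0]["content"]["parts"][0]["text"]
```

Keeping request construction separate from sending makes it easy to inspect the payload before spending any quota; `send("Explain how AI works")` performs the live call.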

The Gemini API SDKs also support creating a chat object, which lets you append messages to a simple structure:

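# Reuses the genai import and configuration from the previous example
model = genai.GenerativeModel("gemini-1.5-flash")
# Seed the conversation with earlier turns; roles are "user" and "model"
chat = model.start_chat(
    history=[
        {"role": "user", "parts": "Hello"},
        {"role": "model", "parts": "Great to meet you. What would you like to know?"},
    ]
)
# Each send_message call is added to the history, so the second question can build on the first message
response = chat.send_message("I have 2 dogs in my house.")
print(response.text)
response = chat.send_message("How many paws are in my house?")
print(response.text)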
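
To make the "simple structure" concrete, here is a toy stand-in that mimics how a chat object accumulates turns. It is an illustration only, not how the SDK is implemented: `TinyChat` and `count_user_turns` are hypothetical names, and the reply function is a local placeholder rather than a model call.

```python
class TinyChat:
    """Hypothetical stand-in for a chat session; not the SDK's implementation."""

    def __init__(self, reply_fn, history=None):
        self.reply_fn = reply_fn            # callable that maps history to a reply
        self.history = list(history or [])  # list of {"role": ..., "parts": ...}

    def send_message(self, text):
        # Append the user's turn, produce a reply from the full history,
        # then append the reply so later turns can see it.
        self.history.append({"role": "user", "parts": text})
        reply = self.reply_fn(self.history)
        self.history.append({"role": "model", "parts": reply})
        return reply


def count_user_turns(history):
    """Placeholder 'model' that reports how many user turns it has seen."""
    user_turns = sum(1 for turn in history if turn["role"] == "user")
    return f"You have sent {user_turns} messages."


chat = TinyChat(
    count_user_turns,
    history=[
        {"role": "user", "parts": "Hello"},
        {"role": "model", "parts": "Great to meet you. What would you like to know?"},
    ],
)
chat.send_message("I have 2 dogs in my house.")
print(chat.send_message("How many paws are in my house?"))
```

Because every turn lands in the same list, the reply to the second question can draw on the first message, which is what lets a real model answer the paws question.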

If you want a simple repo with a little more complexity to get started with, check out the official Gemini API quickstart repo on GitHub.

How much does the Gemini API cost?

There are two tiers in the Gemini API: free and paid. The former is, well, free, and the latter comes with an increased rate limit intended to support production workloads. Gemini 1.5 Flash is the most competitively priced large language model in its capability class and recently had its price decreased by 70%.

Image captured from Google Developers Blog

Or put another way, you can access 1.5 billion tokens for free with Gemini every single day.
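
One way to arrive at a figure of that size is to multiply the free tier's 1,500 daily requests by a full context window. The 1,000,000 tokens per request used below is an assumption made for this sketch and is not stated in the article, so treat the result as an upper bound under that assumption.

```python
REQUESTS_PER_DAY = 1_500        # free-tier figure quoted earlier in the article
TOKENS_PER_REQUEST = 1_000_000  # assumption: every request fills a 1M-token context

daily_tokens = REQUESTS_PER_DAY * TOKENS_PER_REQUEST
print(f"{daily_tokens:,} tokens per day")
```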

Fine-tuning Gemini 1.5 Flash

Gemini 1.5 Flash can be fine-tuned for free through Google AI Studio, and the tuned model does not cost more to use than the base model, a benefit that is rather unique in the AI ecosystem. Once you tune the model, it can be used as a drop-in replacement in your existing code. Google AI Studio also comes with sample datasets for testing tuning and a mode called “Structured prompting”, which is useful for creating fine-tuning datasets.
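
Tuning can also be started from code. The sketch below assumes the Python SDK's `genai.create_tuned_model` call and a tunable base model named `models/gemini-1.5-flash-001-tuning`; check the models page for the name that applies to you. The `to_training_data` and `tune_and_query` helpers and the hyperparameter values are hypothetical and chosen only for illustration.

```python
import os


def to_training_data(pairs):
    """Convert (input, output) pairs into the list-of-dicts shape used for tuning."""
    data = []
    for text_input, output in pairs:
        if not text_input or not output:
            raise ValueError("each example needs a non-empty input and output")
        data.append({"text_input": text_input, "output": output})
    return data


def tune_and_query(pairs, tuned_id, prompt):
    """Start a tuning job, wait for it, then query the tuned model.

    Needs the SDK installed and API_KEY set; tuning can take a while.
    """
    # Imported lazily so to_training_data works without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=os.environ["API_KEY"])
    operation = genai.create_tuned_model(
        source_model="models/gemini-1.5-flash-001-tuning",  # assumed base model name
        training_data=to_training_data(pairs),
        id=tuned_id,
        epoch_count=5,        # illustrative values, not recommendations
        batch_size=4,
        learning_rate=0.001,
    )
    tuned = operation.result()  # blocks until the tuning job finishes
    # The tuned model is used through the same GenerativeModel class as a base model.
    model = genai.GenerativeModel(model_name=tuned.name)
    return model.generate_content(prompt).text
```

The last two lines are where the drop-in behavior shows up: only the model name changes, and the rest of the calling code stays the same.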

Image captured by Author in Google AI Studio

Closing thoughts

The Gemini API continues to get better week over week, with a steady stream of new features that keep improving the developer experience. If you have feedback, suggestions, or questions, join the conversation on the Google AI developer forum. Happy building!

Everything you need to know about the Gemini API as a developer in less than 5 minutes was originally published in Around the Prompt on Medium, where people are continuing the conversation by highlighting and responding to this story.
