Text Generation Models

Unleashing the Power of Language with Friendli Serverless Endpoints

Welcome to the captivating world of Text Generation Models (TGMs)! These AI models learn from massive datasets of text and code, mimicking human language patterns to generate creative and informative outputs. Friendli Serverless Endpoints empowers you to harness the potential of several cutting-edge TGMs through its convenient interface, letting you unlock the magic of words with ease.

This guide lists the TGMs available on Friendli Serverless Endpoints and takes a closer look at two of them, Llama 3 70B Instruct and Mistral 7B Instruct:

Supported Models

meta-llama-3-70b-instruct
mistral-7b-instruct-v0-2
mixtral-8x7b-instruct-v0-1
gemma-7b-it

Please note that the pricing for each model can be found in the pricing section.

Llama 3 70B Instruct

Focus: Engaging dialogues and interactive experiences.
Strengths:
- Natural language understanding and human-like response generation in conversational settings.
- Maintains coherence and context throughout dialogues, fostering seamless interactions.
- Can adapt to different conversation styles and tones.
Example Use Cases:
- Building customer service chatbots that understand natural language and offer personalized support.
- Creating interactive storytelling experiences and AI companions.
- Developing game AI characters with engaging back-and-forth conversations.
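
Keeping context across a dialogue, as described above, generally means resending the earlier turns with each request. The sketch below is a minimal illustration that assumes the common `system`/`user`/`assistant` chat roles; the `Conversation` class and its method names are hypothetical helpers, not part of friendli-client.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Hypothetical helper that accumulates chat history for a multi-turn dialogue."""

    system_prompt: str = ""
    turns: list = field(default_factory=list)

    def add_user(self, text: str) -> None:
        self.turns.append({"role": "user", "content": text})

    def add_assistant(self, text: str) -> None:
        self.turns.append({"role": "assistant", "content": text})

    def messages(self) -> list:
        """Return the full history in the shape passed as `messages=`."""
        head = []
        if self.system_prompt:
            head.append({"role": "system", "content": self.system_prompt})
        return head + list(self.turns)


chat = Conversation(system_prompt="You are a friendly support agent.")
chat.add_user("My order has not arrived.")
chat.add_assistant("Sorry to hear that. Could you share the order number?")
chat.add_user("It is 12345.")
print(len(chat.messages()))  # 4 messages: the system prompt plus three turns
```

The list returned by `chat.messages()` has the same shape as the `messages` argument used in the SDK examples in this guide, so the whole history travels with every request.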

Examples

After installing friendli-client, you can generate chat responses with the Python SDK.

Note: Set the FRIENDLI_TOKEN environment variable before initializing the client instance with client = Friendli(). Alternatively, pass your personal access token as the token argument when creating the client, like so:

from friendli import Friendli

client = Friendli(token="YOUR PERSONAL ACCESS TOKEN")
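
It can help to fail early with a clear message when no token is configured. The helper below is a hypothetical sketch, not part of friendli-client: it assumes an explicitly passed token should take precedence over the FRIENDLI_TOKEN environment variable, and the name `resolve_token` is made up for illustration.

```python
import os


def resolve_token(explicit_token=None, env=None):
    """Return the token to use: the explicit argument first, then FRIENDLI_TOKEN."""
    env = os.environ if env is None else env
    token = explicit_token or env.get("FRIENDLI_TOKEN")
    if not token:
        raise RuntimeError(
            "No token found: set FRIENDLI_TOKEN or pass token=... to Friendli()."
        )
    return token


# Demo with a stand-in mapping instead of the real process environment.
print(resolve_token(env={"FRIENDLI_TOKEN": "demo-token"}))  # demo-token
```

The returned value could then be passed as `Friendli(token=resolve_token())`.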

The four examples below show, in order, the default, streaming, async, and async streaming variants.

from friendli import Friendli

client = Friendli()

chat_completion = client.chat.completions.create(
    model="meta-llama-3-70b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake"
        }
    ],
    stream=False,
)
print(chat_completion.choices[0].message.content)

from friendli import Friendli

client = Friendli()

stream = client.chat.completions.create(
    model="meta-llama-3-70b-instruct",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake"
        }
    ],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
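
When the full reply is needed as a single string (for logging or post-processing) rather than printed piece by piece, the streamed deltas can be joined. This is a minimal sketch that assumes chunks shaped like those in the streaming loop above; `collect_stream` and `fake_chunk` are hypothetical names, and the fake chunks stand in for a real stream so the snippet runs offline.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text deltas of a chat stream into one string."""
    parts = []
    for chunk in chunks:
        # delta.content may be None for some chunks, so fall back to "".
        parts.append(chunk.choices[0].delta.content or "")
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object shaped like a streamed chunk (illustration only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Mix "), fake_chunk("the batter"), fake_chunk(None)]
print(collect_stream(demo))  # Mix the batter
```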

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()


async def main() -> None:
    chat_completion = await client.chat.completions.create(
        model="meta-llama-3-70b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake"
            }
        ],
        stream=False,
    )
    print(chat_completion.choices[0].message.content)


asyncio.run(main())

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()


async def main() -> None:
    stream = await client.chat.completions.create(
        model="meta-llama-3-70b-instruct",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake"
            }
        ],
        stream=True,
    )
    async for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="")


asyncio.run(main())
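
One reason to reach for the async client is to keep several requests in flight at once. The sketch below shows that pattern with `asyncio.gather`; `ask_all` is a hypothetical helper, and `_StubCompletions` is a stand-in that only mimics the `client.chat.completions.create` call shape used above so the example runs without network access or a token.

```python
import asyncio
from types import SimpleNamespace


async def ask_all(client, model, prompts):
    """Send one chat request per prompt concurrently; replies keep the prompt order."""

    async def ask(prompt):
        completion = await client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            stream=False,
        )
        return completion.choices[0].message.content

    return await asyncio.gather(*(ask(p) for p in prompts))


class _StubCompletions:
    """Stand-in for client.chat.completions so the sketch runs offline."""

    async def create(self, model, messages, stream=False):
        await asyncio.sleep(0)  # yield control, as a real network call would
        text = "echo: " + messages[-1]["content"]
        message = SimpleNamespace(content=text)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


stub_client = SimpleNamespace(chat=SimpleNamespace(completions=_StubCompletions()))
replies = asyncio.run(
    ask_all(stub_client, "meta-llama-3-70b-instruct", ["pancakes", "waffles"])
)
print(replies)  # ['echo: pancakes', 'echo: waffles']
```

In real use, an `AsyncFriendli()` instance would take the place of `stub_client`.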

Mistral 7B Instruct

Focus: Clear and concise explanations for instructional settings.
Strengths:
- Builds upon Mistral 7B's factual accuracy with a focus on explaining complex concepts simply and effectively.
- Ideal for generating educational tools, tutorials, and insightful explanations.
- Empowers knowledge transfer and simplifies challenging topics.
Example Use Cases:
- Creating online tutorials and educational materials that break down complex concepts.
- Building AI-powered teaching assistants that personalize learning experiences.
- Generating clear explanations for technical documentation and user manuals.

Examples

After installing friendli-client, you can generate chat responses with the Python SDK.

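Note: Set the FRIENDLI_TOKEN environment variable before initializing the client instance with client = Friendli(). Alternatively, pass your personal access token as the token argument when creating the client, like so: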

from friendli import Friendli

client = Friendli(token="YOUR PERSONAL ACCESS TOKEN")

The four examples below show, in order, the default, streaming, async, and async streaming variants.

from friendli import Friendli

client = Friendli()

chat_completion = client.chat.completions.create(
    model="mistral-7b-instruct-v0-2",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake"
        }
    ],
    stream=False,
)
print(chat_completion.choices[0].message.content)

from friendli import Friendli

client = Friendli()

stream = client.chat.completions.create(
    model="mistral-7b-instruct-v0-2",
    messages=[
        {
            "role": "user",
            "content": "Tell me how to make a delicious pancake"
        }
    ],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()


async def main() -> None:
    chat_completion = await client.chat.completions.create(
        model="mistral-7b-instruct-v0-2",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake"
            }
        ],
        stream=False,
    )
    print(chat_completion.choices[0].message.content)


asyncio.run(main())

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli()


async def main() -> None:
    stream = await client.chat.completions.create(
        model="mistral-7b-instruct-v0-2",
        messages=[
            {
                "role": "user",
                "content": "Tell me how to make a delicious pancake"
            }
        ],
        stream=True,
    )
    async for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="")


asyncio.run(main())

Beyond the Models: Generation Settings

Friendli Serverless Endpoints offers further customization through generation settings that let you adjust the outputs of your Text Generation Model (TGM):

max_tokens: This defines the maximum number of tokens your TGM generates. A token is a unit of text that is often shorter than a word, so the limit does not map one-to-one to a word count. Lower values produce concise outputs, while higher values allow for longer narratives.
temperature: Think of temperature as a creativity knob. Higher values promote more imaginative and surprising outputs, while lower values favor safe and predictable responses.
top_p: This parameter governs the diversity of your output by restricting sampling to the most likely tokens whose combined probability reaches top_p. Lower values focus on the most likely continuations, while higher values encourage exploration of less probable but potentially interesting options.
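
To make the effect of temperature and top_p concrete, here is a toy illustration of the standard techniques (temperature-scaled softmax and nucleus filtering) on made-up scores for four candidate tokens. It is a sketch of the general idea only; the sampler that actually runs behind the endpoints may differ in its details, and the function names are hypothetical.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose total probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}  # renormalized over the kept tokens


logits = [2.0, 1.0, 0.5, 0.1]  # made-up scores for four candidate tokens
cool = apply_temperature(logits, 0.5)
warm = apply_temperature(logits, 1.5)
print(cool[0] > warm[0])  # True: the top token dominates more at low temperature

base = apply_temperature(logits, 1.0)
print(sorted(top_p_filter(base, 0.5)))  # [0]: only the most likely token survives
print(sorted(top_p_filter(base, 0.9)))  # [0, 1, 2]: a wider pool of candidates
```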

Unleashing the Full Potential

Friendli Serverless Endpoints removes the technical hurdles, letting you focus on exploring the magic of TGMs. Start experimenting with different models and settings, tailoring the outputs to your unique vision. Remember, practice makes perfect – the more you interact with these models, the more you'll understand their strengths and discover the incredible possibilities they hold.
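
One way to experiment systematically is to run the same prompt across a small grid of models and settings and compare the results. The sketch below only builds the request arguments; it assumes that `max_tokens`, `temperature`, and `top_p` are accepted as keyword arguments alongside `model` and `messages`, and the helper name and default values are hypothetical.

```python
from itertools import product

MODELS = ["meta-llama-3-70b-instruct", "mistral-7b-instruct-v0-2"]
TEMPERATURES = [0.2, 0.8]


def build_requests(prompt, models=MODELS, temperatures=TEMPERATURES,
                   max_tokens=200, top_p=0.9):
    """Return one keyword-argument dict per (model, temperature) combination."""
    return [
        {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens,    # assumed keyword name, per the settings above
            "temperature": temperature,  # assumed keyword name, per the settings above
            "top_p": top_p,              # assumed keyword name, per the settings above
        }
        for model, temperature in product(models, temperatures)
    ]


request_grid = build_requests("Tell me how to make a delicious pancake")
print(len(request_grid))  # 4 combinations: 2 models x 2 temperatures
# With a configured client, each entry could be sent as:
#   client.chat.completions.create(**kwargs)
```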

Ready to embark on your text generation journey? Friendli Serverless Endpoints is your gateway to a world of boundless creativity and innovative applications. Sign up today and let the words flow!
