Titan Takeoff Pro

TitanML helps businesses build and deploy better, smaller, cheaper, and faster NLP models through our training, compression, and inference optimization platform.

Note: These docs are for the Pro version of Titan Takeoff. For the community version, see the page for Titan Takeoff.

Our inference server, Titan Takeoff (Pro version), enables you to deploy LLMs locally on your own hardware with a single command. Most generative model architectures are supported, including Falcon, Llama 2, GPT-2, T5, and many more.

Example usage

Here are some helpful examples to get started with the Pro version of the Titan Takeoff server. No parameters are needed by default, but a base_url pointing to where your Takeoff instance is running can be specified, and generation parameters can be supplied.

from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.prompts import PromptTemplate
from langchain_community.llms import TitanTakeoffPro

# Example 1: Basic use
llm = TitanTakeoffPro()
output = llm.invoke("What is the weather in London in August?")
print(output)


# Example 2: Specifying a base URL (including port) and generation parameters
llm = TitanTakeoffPro(
    base_url="http://localhost:3000",
    min_new_tokens=128,
    max_new_tokens=512,
    no_repeat_ngram_size=2,
    sampling_topk=1,
    sampling_topp=1.0,
    sampling_temperature=1.0,
    repetition_penalty=1.0,
    regex_string="",
)
output = llm.invoke("What is the largest rainforest in the world?")
print(output)


# Example 3: Using generate for multiple inputs
llm = TitanTakeoffPro()
rich_output = llm.generate(["What is Deep Learning?", "What is Machine Learning?"])
print(rich_output.generations)


# Example 4: Streaming output
llm = TitanTakeoffPro(
    streaming=True,
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
)
prompt = "What is the capital of France?"
llm.invoke(prompt)

# Example 5: Using LCEL
llm = TitanTakeoffPro()
prompt = PromptTemplate.from_template("Tell me about {topic}")
chain = prompt | llm
chain.invoke({"topic": "the universe"})
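

# Example 6: Streaming with the standard .stream() interface
# A minimal sketch, assuming a recent LangChain release in which all LLMs
# expose .stream(); the prompt text here is illustrative. Tokens are yielded
# as strings as the server produces them, so no callback handler is needed.
llm = TitanTakeoffPro()
for token in llm.stream("Tell me about the universe"):
    print(token, end="", flush=True)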