How to Use the API
For reasoning models that produce longer, more detailed responses, we highly recommend streaming tokens to ensure the best user experience.
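As a minimal sketch, here is what streaming might look like with an OpenAI-compatible Python client; the `base_url`, API key, and model identifier are placeholders to replace with the values your provider documents.

```python
from openai import OpenAI

# Placeholder endpoint and credentials -- substitute your provider's values.
client = OpenAI(
    base_url="https://example.com/v1",  # assumption: an OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",
)

stream = client.chat.completions.create(
    model="llama-3.3-nemotron",  # assumption: replace with the exact model ID your provider exposes
    messages=[{"role": "user", "content": "Plan a migration from REST to gRPC for a payments service."}],
    stream=True,
)

# Print tokens as they arrive instead of waiting for the full response.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```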
Default Behavior
By default, the model provides a step-by-step thought process before the final answer.
Disabling Reasoning
To get a direct answer without the chain-of-thought process, add /no_think to the beginning of your system prompt.
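For example, reusing the placeholder client from the streaming sketch above, disabling reasoning only changes the system prompt:

```python
# Assumes the same placeholder `client` and model ID as the streaming sketch above.
response = client.chat.completions.create(
    model="llama-3.3-nemotron",
    messages=[
        # "/no_think" at the very start of the system prompt requests a direct answer
        # without the chain-of-thought.
        {"role": "system", "content": "/no_think You are a concise assistant."},
        {"role": "user", "content": "What is the time complexity of binary search?"},
    ],
)
print(response.choices[0].message.content)
```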
Best Practices
To get the best results from Llama 3.3 Nemotron, treat it like an expert problem-solver. Provide a clear, high-level objective and let the model determine the best steps to reach the solution.
- Strengths: Excels at open-ended reasoning, multi-step logic, and complex coding or mathematical problems.
- Avoid Over-prompting: Micromanaging each step can limit the model’s advanced reasoning capabilities. Give it the goal, not the exact path (see the sketch after this list).
- Provide Clear Objectives: Ensure your prompt is clear and unambiguous to get the most accurate and relevant response.
- Use Streaming: For complex queries, the reasoning process can generate a lot of text. Streaming the response provides a much better user experience.
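To illustrate "goal, not path", here is a hypothetical prompt pair: the first prescribes every step, the second states the objective and constraints and leaves the plan to the model. The wording is invented for illustration only.

```python
# Over-prompted: dictates each step, which can constrain the model's own reasoning.
over_prompted = (
    "Step 1: list all functions. Step 2: check each for null handling. "
    "Step 3: rewrite them one by one. Step 4: summarize."
)

# Goal-oriented: states the objective and constraints, then lets the model plan the steps.
goal_oriented = (
    "Review this module for null-handling bugs and propose fixes. "
    "Prioritize correctness over style, and explain the reasoning behind each change."
)

messages = [{"role": "user", "content": goal_oriented}]
```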
Use Cases
- Code Generation & Analysis: Analyze large codebases, suggest improvements, and generate complex code snippets.
- Strategic Planning: Develop multi-stage plans, reasoning about optimal approaches and potential obstacles.
- Complex Document Analysis: Process and summarize technical specifications, legal contracts, and research papers.
- Agentic Workflows: Build sophisticated AI agents that can perform complex, multi-step tasks.
- Scientific Research: Assist in hypothesis generation, experimental design, and data analysis.
- Advanced Problem Solving: Handle ambiguous requirements by inferring unstated assumptions and providing logical solutions.