Frequently Asked Questions about NetMind Model Library
Find answers to the most common questions about the NetMind Model Library, including integration, deployment, and best practices, to help you get started quickly and make the most of the platform.
Q: What is the NetMind Model Library?
A: A curated catalog of multimodal AI models—chat, embeddings, image, video, and audio—exposed through a unified, OpenAI-compatible API and SDK.
Q: Which modalities and AI models are supported?
A: Chat LLMs, embedding models for RAG and search, image and video generation, and audio (TTS/ASR). Each model card lists features, latency, and usage examples.
Q: How do I integrate with one API and a unified SDK?
A: Create an API key and call any model via our unified SDK or REST API. OpenAI-compatible request/response shapes simplify migration from other providers.
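As a sketch of what an OpenAI-compatible call looks like over plain HTTP, using only the standard library. The base URL, API key, and model name below are placeholders, not documented values; substitute the ones from your NetMind dashboard and model card.

```python
import json
import urllib.request

BASE_URL = "https://api.netmind.example/v1"  # placeholder -- use the base URL from your dashboard
API_KEY = "YOUR_API_KEY"                     # placeholder -- use your real API key

def build_chat_request(model: str, messages: list[dict]) -> urllib.request.Request:
    """Build an OpenAI-style chat-completion request for the endpoint above."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("my-model", [{"role": "user", "content": "Hello!"}])
print(req.full_url)  # → https://api.netmind.example/v1/chat/completions

# To actually send it (requires a valid key and network access):
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```

Because the request/response shapes are OpenAI-compatible, an existing OpenAI-style SDK client can usually be pointed at the same path by changing only its base URL and API key.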
Q: Do you offer serverless endpoints and dedicated GPU deployments?
A: Yes. Use elastic serverless endpoints for bursty traffic, or dedicated single-tenant GPUs for predictable latency, throughput, and isolation.
Q: How does pricing work (pay-as-you-go, volume discounts)?
A: Usage-based pricing per token, image, or minute, with transparent metering in the dashboard. Volume discounts and dedicated plans are available.
Q: What about reliability—uptime SLAs and automatic fallback?
A: Optional automatic fallback can switch to secondary models during incidents. Dedicated plans include uptime SLAs and priority capacity.
Q: What performance can I expect (latency, concurrency, streaming)?
A: Endpoints support high concurrency, streaming responses, and batching—designed for low-latency inference and production workloads.
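OpenAI-compatible streaming endpoints typically deliver output as server-sent events: one `data:` line per chunk, terminated by a `[DONE]` sentinel. A minimal parser for that wire format is sketched below; the chunk field names follow the common OpenAI chat-chunk shape, so verify them against the usage example on the model card.

```python
import json
from typing import Iterable, Iterator

def iter_stream_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content deltas from OpenAI-style SSE chat-completion chunks."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]

sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_deltas(sample)))  # → Hello
```

Rendering deltas as they arrive, rather than waiting for the full completion, is what keeps perceived latency low for chat UIs.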
Q: Do you provide a dashboard for usage, billing, and analytics?
A: Yes. Monitor requests, costs, latency, and error rates in real time. Export usage for finance or capacity planning.
Q: How is my data handled (privacy, retention, compliance)?
A: Enterprise controls cover encryption in transit/at rest and configurable retention. Compliance options are available—see our data and security docs.
Q: Can I bring my own model (BYOM) or fine-tune models?
A: You can deploy custom or fine-tuned models to dedicated endpoints and manage them via CLI or API.
Q: Are webhooks supported for long-running jobs (image/video)?
A: Yes. Use job IDs and webhooks for async tasks like image/video generation and large batch runs.
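Webhook deliveries are commonly authenticated with an HMAC signature computed over the raw request body. The sketch below shows that general pattern only: the HMAC-SHA256 hex-digest scheme, the example payload, and the idea of a signature header are assumptions, not a documented NetMind format, so confirm the actual header name and scheme in the webhook docs before relying on this.

```python
import hashlib
import hmac

def verify_webhook(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Constant-time check of an HMAC-SHA256 signature over a webhook body.

    Scheme is illustrative common practice, not a documented NetMind format.
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

# Hypothetical async-job payload carrying a job ID for correlation.
secret = b"whsec_demo"
body = b'{"job_id": "job_123", "status": "succeeded"}'
sig = hmac.new(secret, body, hashlib.sha256).hexdigest()
print(verify_webhook(secret, body, sig))  # → True
```

Verifying the signature before parsing the body ensures a completion callback for a job ID really came from the platform and not a third party.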
Q: How do I migrate from other OpenAI-compatible providers?
A: Point your existing OpenAI-style client to our base URL and update the model name. Most integrations need minimal code changes.
Q: What SDKs and tools are available (Python/JavaScript, REST)?
A: Use our Python SDK or raw HTTPS/REST. Examples cover chat, embeddings, image/video generation, and audio TTS/ASR.
Q: Do you support error inspection and observability?
A: The dashboard and logs include request IDs, error codes, and traces to help diagnose issues quickly.
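On the client side, a typical companion to this is retrying transient HTTP errors with jittered exponential backoff and logging the request ID on failure. The retryable status set and backoff parameters below are conventional defaults, and the `x-request-id` header name is an assumption; check the docs for the header NetMind actually returns.

```python
import random

RETRYABLE = {429, 500, 502, 503, 504}  # common transient HTTP statuses

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2^attempt)]."""
    return random.uniform(0, min(cap, base * 2 ** attempt))

def should_retry(status: int, attempt: int, max_attempts: int = 5) -> bool:
    """Retry only transient statuses, up to a bounded number of attempts."""
    return status in RETRYABLE and attempt < max_attempts

# On failure, capture the request ID from the response so it can be matched
# against the dashboard/logs (header name is an assumption -- check the docs):
#   request_id = resp.headers.get("x-request-id")
print(should_retry(503, attempt=1))  # → True
print(should_retry(400, attempt=1))  # → False
```

Client errors like 400 or 401 are not retried, since repeating an invalid request cannot succeed; only capacity and server-side errors benefit from backoff.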
Q: How often are models updated (SOTA, versioning)?
A: The library is updated regularly. Model cards show versions and release notes so you can track changes and upgrade safely.