27. Usage and pricing

One screen for what Daalu costs and what your own GPU saves — usage totals, your plan, and where the tokens went.

At a glance

What it is: The usage-and-pricing dashboard, covering your own-GPU status, period totals against your included bundle, the plan picker, and spend charts.
Where to find it: https://ops.daalu.io/billing (labelled Usage & Pricing in the sidebar).
Who can use it: Everyone can view; changing the plan is admin-only.

The Usage & Pricing page answers “what does running this cost, and what is my own hardware saving me?” in one place. The deeper pricing model, the SKU breakdown, and cross-cloud spend analysis live in Part VIII (Chapters 46–49); this page is the day-to-day operator view.


Your own GPU

The top of the page is a status card for local inference — the single biggest lever on what Daalu costs you per call.

Status pill

  • Online — your GPU answered a router probe in the last few minutes. Inference is running on your hardware.
  • Offline — the last response is stale. Usually the serving pod is unhealthy or the federation tunnel is down; the router has fallen back to the Daalu-hosted tier in the meantime.
  • Not configured — no GPU connected. Everything runs on the Daalu-hosted tier.
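
The three pill states can be read as a classification on how long ago the router last got a good probe response. The sketch below is illustrative only: the function name, its inputs, and the five-minute freshness window are assumptions (the page says only "the last few minutes"), not Daalu's actual implementation.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed value: the page says only "the last few minutes".
FRESHNESS_WINDOW = timedelta(minutes=5)


def gpu_status(gpu_connected: bool,
               last_probe_ok: Optional[datetime],
               now: datetime) -> str:
    """Map probe freshness to one of the three status-pill labels."""
    if not gpu_connected:
        # No GPU connected: everything runs on the hosted tier.
        return "Not configured"
    if last_probe_ok is not None and now - last_probe_ok <= FRESHNESS_WINDOW:
        # A recent successful probe: inference is on your hardware.
        return "Online"
    # Connected, but the last good response is stale or missing.
    return "Offline"


now = datetime.now(timezone.utc)
print(gpu_status(True, now - timedelta(minutes=2), now))   # fresh probe
print(gpu_status(True, now - timedelta(minutes=30), now))  # stale probe
print(gpu_status(False, None, now))                        # no GPU
```

Under these assumptions, a connected GPU that has never answered a probe reads as Offline rather than Not configured, which matches the distinction the pill draws between "no GPU" and "GPU not responding".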

When a GPU is healthy, the card shows its served model and a running estimate of what you saved this period by serving inference yourself instead of on the hosted tier.

Onboarding card

If you haven’t connected a GPU, the card becomes a call to action that drops you into the connect-a-GPU flow (Chapter 9). The why and how of own-GPU inference are covered in the AI Factory model (Chapter 16) and the router (Chapter 17).

Why it matters — Most inference calls by volume — classifier passes, routine Assistant turns — are cheap to serve on your own GPU and cost you nothing per call. The router sends them to your hardware first and only reaches for the hosted tier when it has to. This card is where you watch that pay off.


Period totals and bundle progress

Below the GPU card is the headline for the current billing period:

  • Period total — usage charged so far this period.
  • Bundle progress — a progress bar showing how much of your plan’s included allowance you’ve consumed. Green while you’re inside the bundle; it tips toward a warning as you approach the limit, so an overage never surprises you.
  • Period dates — when the current window started and when it resets.

The progress bar is the number to glance at: comfortably inside the bundle means no action; pinned near the top means it’s time to look at the plan picker.
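
As a rough illustration of how the bar relates to the numbers behind it, the sketch below turns usage and the included allowance into a fraction and a colour state. The function name, the 80% warning threshold, and the state labels are assumptions made for this example; the page does not state where the bar changes colour.

```python
from typing import Tuple


def bundle_progress(used: float, included: float,
                    warn_at: float = 0.8) -> Tuple[float, str]:
    """Return (fraction of bundle consumed, display state).

    `warn_at` is an assumed threshold, not a documented Daalu value.
    """
    if included <= 0:
        raise ValueError("included allowance must be positive")
    fraction = used / included
    if fraction >= 1.0:
        state = "over"      # past the bundle: overage territory
    elif fraction >= warn_at:
        state = "warning"   # approaching the limit
    else:
        state = "ok"        # comfortably inside the bundle
    return fraction, state


for used in (40, 85, 120):
    fraction, state = bundle_progress(used, included=100)
    print(f"{used} of 100 used: {fraction:.0%} ({state})")
```

The units are deliberately left abstract: the same arithmetic applies whether the allowance is counted in tokens or in currency.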


Plan and SKU

This section shows the plan (SKU) your tenant is on and lets an admin change it.

  • Current plan — its name, what’s included, and the price.
  • Plan picker — the catalog of available plans as cards, each with its included bundle and price. An admin selects one and confirms to switch; the change takes effect for the tenant immediately.

Note — Only tenant admins can change the plan. Everyone else sees the picker read-only.


Spend charts

The lower half of the page visualizes where usage actually went.

30-day trend

A daily series of usage across the last 30 days. Spot the day a new workload landed, or confirm that turning on your own GPU bent the curve down.

Tokens by tier

A breakdown of token usage by tier of call — classifier, chat, large-context chat, embeddings — split between your own GPU and the Daalu-hosted tier. The proportion of own-GPU green to hosted-tier grey is the at-a-glance answer to “how much of my inference am I actually serving myself?”
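
The green-to-grey proportion is simply the own-GPU share of all tokens. A minimal sketch of that arithmetic follows; the dictionary layout, the key names, and the token counts are made up for illustration and do not reflect any Daalu export format.

```python
from typing import Dict


def own_gpu_share(tokens_by_tier: Dict[str, Dict[str, int]]) -> float:
    """Fraction of all tokens that were served on your own GPU."""
    own = sum(tier["own_gpu"] for tier in tokens_by_tier.values())
    hosted = sum(tier["hosted"] for tier in tokens_by_tier.values())
    total = own + hosted
    return own / total if total else 0.0


# Hypothetical token counts for one period.
usage = {
    "classifier":         {"own_gpu": 900, "hosted": 100},
    "chat":               {"own_gpu": 600, "hosted": 400},
    "large_context_chat": {"own_gpu": 0,   "hosted": 500},
    "embeddings":         {"own_gpu": 500, "hosted": 0},
}

print(f"Own-GPU share: {own_gpu_share(usage):.1%}")
```

With these sample numbers, 2,000 of 3,000 tokens ran on the tenant's own GPU, so about two thirds of the bar would be green.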

Top sources

Which features and integrations drove the most usage this period. Useful for attributing cost to a workload — a chatty agent, a heavy briefing schedule, a busy Assistant — and deciding whether to tune it.


Spend alerts

Cost anomalies surface as alerts, raised by the cost-anomaly agent (Chapter 20). You tune the thresholds — daily deviation from the trailing average, per-source spikes, and which channel gets notified — in Settings → Briefings → Cost anomaly (Chapter 28). The defaults catch obvious “today is 3× yesterday” spikes; tighten them once you’ve seen a few false positives.
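
To make the "daily deviation from the trailing average" threshold concrete, here is a minimal sketch of one way such a check could work. It is not the cost-anomaly agent's actual logic: the function name, the 7-day window, and the 3× multiplier are assumptions chosen to mirror the "3×" default described above.

```python
from statistics import mean
from typing import Sequence


def is_cost_spike(daily_usage: Sequence[float], today: float,
                  window: int = 7, multiplier: float = 3.0) -> bool:
    """Flag `today` if it is at least `multiplier` times the trailing average.

    `window` and `multiplier` are assumed defaults for illustration.
    """
    recent = list(daily_usage[-window:])
    if not recent:
        return False  # no history yet, nothing to compare against
    baseline = mean(recent)
    return today > 0 and today >= multiplier * baseline


history = [10, 12, 11, 9, 10, 11, 10]  # made-up daily usage figures
print(is_cost_spike(history, today=35))  # well above 3x the average
print(is_cost_spike(history, today=15))  # elevated, but under the threshold
```

Lowering `multiplier` in a sketch like this corresponds to tightening the threshold: smaller deviations raise an alert, at the cost of more false positives.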


What’s not on this page

A couple of things live elsewhere on purpose:

  • Cross-cloud spend analysis — detailed cloud-cost attribution and forecasting is its own topic, in Part VIII (Chapter 48).
  • Per-user billing — Daalu doesn’t bill per action, so per-user cost is rarely meaningful. Plans are per-bundle, not per-seat-action.

Next: Chapter 28 — Settings