A / 01 Service line

AI & Data Intelligence.

Where data becomes decisions. We build sovereign AI pipelines that read your Arabic content, retrieve from your own documents, and reason at frontier scale — without sending a byte across the border.

Arabic-first: fuṣḥā · MSA · Gulf dialect
Sovereign: in-Kingdom inference
Governed: SDAIA · ISO 42001
SCHEMATIC · DATA PIPELINE
v1.0
INGEST: PDF · DOCX · DB
PROCESS: CHUNK · OCR · AR
EMBED: ARABIC-NATIVE
VECTOR: PG / WEAVIATE
POLICY: PDPL ✓
SOVEREIGN LLM: ALLAM · JAIS
FRONTIER: CLAUDE · GPT
AUDIT: ISO 42001
DECISION: CITED · GROUNDED · LOGGED
» 8-stage pipeline · arabic-native end to end
» policy gate before any frontier call
» every output cited · every call audited
01 / Capabilities

What we deliver under AI & Data.

Six capability areas, all production-grade, all built to operate inside the Kingdom's regulatory perimeter.

CAP / 01

Arabic RAG pipelines

Morphology-aware chunking, root-form normalization, and Gulf-dialect embeddings — so retrieval works on classical, modern, and dialect Arabic without losing context.

nuqta engine · Hybrid search · Re-ranking
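As an illustration of the kind of normalization that usually runs before Arabic chunking and embedding, here is a minimal sketch using only the Python standard library. It covers orthographic normalization only (diacritics, tatweel, alef variants), not root-form or morphological analysis, and it is not the nuqta engine's implementation. The function name `normalize_arabic` and the choice of which characters to fold are assumptions made for the example.

```python
import re
import unicodedata

# Assumed folding rules for illustration only:
# tashkeel U+064B..U+0652 plus superscript alef U+0670 are stripped,
# tatweel (U+0640) is dropped, and hamza/madda alef forms map to bare alef.
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
_TATWEEL = "\u0640"
_ALEF_VARIANTS = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above
    "\u0625": "\u0627",  # alef with hamza below
    "\u0622": "\u0627",  # alef with madda
})


def normalize_arabic(text: str) -> str:
    """Return an orthographically normalized copy of `text` for indexing."""
    text = unicodedata.normalize("NFC", text)  # compose decomposed forms first
    text = _DIACRITICS.sub("", text)           # remove short-vowel marks
    text = text.replace(_TATWEEL, "")          # remove elongation character
    return text.translate(_ALEF_VARIANTS)      # unify alef spellings


if __name__ == "__main__":
    # Vocalized and unvocalized spellings collapse to the same index key.
    print(normalize_arabic("\u0643\u064E\u062A\u064E\u0628\u064E"))
```

Applying the same function to both documents and queries keeps vocalized and unvocalized spellings of a word on the same retrieval key. How aggressively to fold characters is a design choice that depends on the corpus.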
CAP / 02

Fine-tuning & adaptation

LoRA / QLoRA / full fine-tuning of ALLaM, Jais, and Nemotron on your domain corpus. Evaluation suites for hallucination, refusal, and Arabic linguistic correctness.

ALLaM · Jais · Nemotron
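To make the difference between LoRA and full fine-tuning concrete, the sketch below counts trainable parameters for a single weight matrix. LoRA freezes the original d_out × d_in matrix and trains two low-rank factors of shapes d_out × r and r × d_in. The layer size and rank in the example are illustrative assumptions, not figures from any of the models named above.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for LoRA factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    d_out, d_in, rank = 4096, 4096, 8  # hypothetical layer shape and rank
    full = full_finetune_params(d_out, d_in)
    lora = lora_params(d_out, d_in, rank)
    print(full, lora, lora / full)
```

For this hypothetical 4096 × 4096 layer at rank 8, LoRA trains 65,536 parameters against 16,777,216 for the full matrix, roughly 0.39%.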
CAP / 03

MLOps & lifecycle

Continuous training, drift detection, A/B routing, and prompt versioning. Observability stack with token-level cost attribution and per-tenant audit trails.

MLflow · LangSmith · Custom telemetry
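A minimal sketch of per-tenant cost attribution from token counts follows. The record shape, the model names, and the prices are all hypothetical and chosen only to show the aggregation. A production telemetry pipeline would read these from call logs and a maintained price table.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    tenant: str
    model: str
    input_tokens: int
    output_tokens: int


# Hypothetical prices per 1,000 tokens as (input, output), arbitrary currency.
PRICES = {
    "sovereign-a": (0.5, 1.0),
    "frontier-b": (3.0, 15.0),
}


def cost_by_tenant(records):
    """Sum the token cost of every call, grouped by tenant."""
    totals = defaultdict(float)
    for r in records:
        price_in, price_out = PRICES[r.model]
        totals[r.tenant] += (r.input_tokens * price_in
                             + r.output_tokens * price_out) / 1000
    return dict(totals)


if __name__ == "__main__":
    calls = [
        CallRecord("tenant-1", "sovereign-a", 1000, 2000),
        CallRecord("tenant-1", "frontier-b", 1000, 0),
        CallRecord("tenant-2", "sovereign-a", 2000, 0),
    ]
    print(cost_by_tenant(calls))
```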
CAP / 04

AI governance

Risk classification, model cards, bias evaluation, and policy gates aligned to SDAIA Ethics Framework, ISO 42001, and the EU AI Act for export-ready engagements.

SDAIA Ethics · ISO 42001 · Model cards
CAP / 05

Data engineering

From spreadsheets to data lakehouses — ingestion, cleansing, semantic catalogs, and feature stores that make your AI systems actually useful in production.

Lakehouse design · NDMO classified · Lineage
CAP / 06

Evaluation & assurance

Independent red-teaming and benchmark suites. Arabic-specific test sets for fairness, factuality, and refusal behavior across regulated domains.

Red-teaming · Eval harness · Hallucination tests
02 / Technology

The stack we build on, and partner with.

Sovereign-first, with a tight allowlist of frontier models reachable only through a policy router — never by default.
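One way to picture a deny-by-default policy router is the sketch below. The model sets mirror the names on this page, but the `Request` fields, the `frontier_approved` flag, and the rule that only data labeled "public" may reach a frontier model are assumptions made for the example. The actual policy is co-authored per engagement.

```python
from dataclasses import dataclass

SOVEREIGN = {"allam", "jais", "nemotron"}
FRONTIER_ALLOWLIST = {"claude"}


@dataclass(frozen=True)
class Request:
    model: str
    data_class: str                  # hypothetical label, e.g. "public"
    frontier_approved: bool = False  # hypothetical per-policy approval flag


def route(req: Request) -> str:
    """Return 'sovereign', 'frontier', or 'deny'. Anything unmatched is denied."""
    name = req.model.lower()
    if name in SOVEREIGN:
        return "sovereign"
    # Assumed rule: a frontier call needs an allowlisted model, explicit
    # approval, and data labeled public.
    if (name in FRONTIER_ALLOWLIST
            and req.frontier_approved
            and req.data_class == "public"):
        return "frontier"
    return "deny"


if __name__ == "__main__":
    print(route(Request("ALLaM", "restricted")))
    print(route(Request("Claude", "public")))
    print(route(Request("Claude", "public", frontier_approved=True)))
```

The ordering matters: the function falls through to "deny", so a new model or a missing approval can never reach a frontier endpoint by accident.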

SOVEREIGN LLM: ALLaM · Jais · Nemotron
FRONTIER · GATED: Claude
COMPUTE: NVIDIA DGX · DGX Spark
CLOUD: AWS Bedrock · Azure AI
VECTOR: pgvector · Weaviate
ORCHESTRATION: LangGraph
OBSERVABILITY: LangSmith
03 / Methodology

How we engage on AI & Data.

PHASE / 01

Data & AI readiness

A focused diagnostic across data quality, infrastructure, regulatory posture, and use-case fit. Output: a prioritized backlog with a business case for each of the top three opportunities.

2–3 weeks · fixed scope
PHASE / 02

Pilot build

One narrow use case, deployed end-to-end on sovereign infrastructure. Measurable against your KPIs, with a go/no-go gate before scale.

4–8 weeks · milestone-based
PHASE / 03

Platform rollout

Hardening, integration, governance, and rollout across functions. Includes user training, change management, and the observability stack.

3–6 months · scoped to program
PHASE / 04

Managed AI operations

Embedded pod or 24×7 managed service. Continuous evaluation, retraining, drift response, and incident handling. SLA-backed.

Ongoing · SLA retainer
04 / Compliance

Standards we build to, not retrofit.

Every AI engagement is delivered against a baseline of Saudi and international AI/data regulation.

SDAIA AI Ethics
ISO/IEC 42001
PDPL
NDMO Data Classification
NCA ECC
SAMA CSF
ISO 27001
EU AI Act (export)
05 / Sector application

Where we apply AI & Data first.

The disciplines compose. AI & Data is the substrate every other practice draws on.

06 / FAQ

Common questions we answer first.

Will our data ever leave the Kingdom?

Not by default. Sovereign LLMs (ALLaM, Jais, Nemotron) run on in-Kingdom hardware. Frontier models are reachable only through a policy router, and only after we co-author the policy with you. Most engagements never need a frontier call.

Why Arabic-native instead of just translating English models?

Translation breaks on Arabic morphology, diacritics, and dialect. Tokenizers built for English split Arabic words mid-root, embedding models miss semantic similarity, and dialect (Najdi, Hejazi, Khaleeji) is treated as noise. Arabic-native pipelines retain meaning end to end — the difference shows up in retrieval recall and hallucination rates.

How do you measure success?

Up front, we co-define KPIs tied to a business outcome — cycle time saved, decision accuracy, deflection rate, regulatory finding closure. Pilots have a go/no-go gate against those KPIs before scaling.
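A go/no-go gate of this kind can be expressed as a small check of measured values against agreed thresholds. The sketch below is illustrative only. The KPI names, thresholds, and the "min"/"max" direction convention are hypothetical.

```python
def go_no_go(targets, measured):
    """Compare measured KPIs against targets.

    targets maps a KPI name to (direction, threshold), where "min" means the
    measured value must be at least the threshold and "max" means at most.
    Returns ("go" | "no-go", list of missed KPI names).
    """
    misses = []
    for kpi, (direction, threshold) in targets.items():
        value = measured.get(kpi)
        if value is None:
            misses.append(kpi)  # an unmeasured KPI counts as a miss
        elif direction == "min" and value < threshold:
            misses.append(kpi)
        elif direction == "max" and value > threshold:
            misses.append(kpi)
    return ("go" if not misses else "no-go", misses)


if __name__ == "__main__":
    targets = {
        "deflection_rate": ("min", 0.30),   # hypothetical target
        "cycle_time_hours": ("max", 24),    # hypothetical target
    }
    print(go_no_go(targets, {"deflection_rate": 0.35, "cycle_time_hours": 30}))
```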

What about hallucinations?

Every nuqta-grounded answer carries page-level citations. Outputs without retrievable sources are flagged or refused. We run domain-specific factuality benchmarks before deployment, and continuously after.
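The flag-or-refuse behavior can be sketched as a gate that runs after generation. The data shapes and the three outcomes ("pass", "flag", "refuse") are assumptions for illustration and do not describe nuqta's internals.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Citation:
    doc_id: str
    page: int


@dataclass
class Answer:
    text: str
    citations: list = field(default_factory=list)


def citation_gate(answer: Answer, retrievable_doc_ids: set) -> str:
    """Return 'pass', 'flag', or 'refuse' based on the answer's citations."""
    if not answer.citations:
        return "refuse"  # nothing to ground the answer on
    if all(c.doc_id in retrievable_doc_ids for c in answer.citations):
        return "pass"    # every cited document can be retrieved
    return "flag"        # at least one citation cannot be resolved


if __name__ == "__main__":
    index = {"policy-2023", "circular-17"}  # hypothetical document ids
    grounded = Answer("...", [Citation("policy-2023", 4)])
    print(citation_gate(grounded, index))
    print(citation_gate(Answer("..."), index))
```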

Can you fine-tune on classified or sensitive data?

Yes. Fine-tuning runs on air-gapped hardware. Datasets are classified per NDMO, access is logged, and model artifacts inherit the classification of their training data. Defence-grade engagements run under separate enclaves.
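Classification inheritance can be sketched as taking the strictest level among the training datasets. Both the level names below and the "highest level wins" rule for mixed corpora are assumptions for the example. The authoritative levels and handling rules come from the NDMO policy itself.

```python
from enum import IntEnum


class Classification(IntEnum):
    # Assumed ordering, least to most restrictive.
    PUBLIC = 0
    RESTRICTED = 1
    SECRET = 2
    TOP_SECRET = 3


def artifact_classification(dataset_levels):
    """Return the classification a model artifact inherits from its datasets."""
    levels = list(dataset_levels)
    if not levels:
        raise ValueError("at least one training dataset is required")
    return max(levels)  # assumed rule: the strictest dataset level wins


if __name__ == "__main__":
    mixed = [Classification.PUBLIC, Classification.SECRET]
    print(artifact_classification(mixed).name)
```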

Do you build, or only advise?

We build. Most of the firm is engineering. Advisory is the front door, not the product.

Move from data to decisions.

من البيانات إلى القرار.

Thirty-minute working session with our AI & Data lead. We'll map your three highest-leverage AI use cases and a sovereign deployment shape before the call ends.