title: AI Intelligence Training
category: Intelligence
tags: ai, training, intelligence, machine-learning, models
priority: Normal
AI Intelligence Training
IdentityCenter's Intelligence engine can be trained on your specific environment to improve the accuracy of anomaly detection, peer group analysis, and contextual insights. This article covers the training process, what data is used, how it improves your results, and configuration options.
Why Train the AI?
Out of the box, IdentityCenter applies general-purpose analysis rules -- standard thresholds for inactivity, common patterns for privileged access risk, and default peer groupings. While these defaults are effective for many environments, training the AI on your specific data improves each of the following capabilities:
| Capability | Before Training | After Training |
|---|---|---|
| Anomaly Detection | Flags based on generic thresholds | Flags based on your environment's actual baselines |
| Peer Group Analysis | Groups users by department/title only | Groups users by observed access patterns and behavior |
| False Positive Rate | Higher -- generic rules do not account for your unique patterns | Lower -- the model understands what is "normal" in your environment |
| Risk Scoring Accuracy | Reasonable estimates based on standard factors | Calibrated scores reflecting your organization's actual risk profile |
| Insight Relevance | General security advice | Environment-specific recommendations |
What Data Is Used for Training
The training process analyzes data that has already been synced into IdentityCenter. No external data sources are required, and no data is sent outside your environment.
Data Categories
| Data Category | Examples | Used For |
|---|---|---|
| Directory Objects | Users, computers, groups, OUs, contacts, service accounts | Understanding your identity landscape |
| Login Timestamps | lastLogon, lastLogonTimestamp | Establishing baseline activity patterns |
| Group Memberships | Direct and nested memberships | Mapping access patterns and peer relationships |
| Organizational Structure | Department, title, manager, location, division | Building peer groups and detecting org anomalies |
| Account Attributes | UAC flags, password policies, delegation settings | Learning your security configuration norms |
| Historical Changes | Audit log entries, attribute change history | Understanding change velocity and patterns |
Data Not Used
The following data is explicitly excluded from the training process:
- Password hashes or password content
- Authentication tokens or credentials
- Personal data not relevant to access patterns (e.g., home addresses, personal phone numbers)
- Data from disconnected or disabled source connections
The Training Process
Step 1: Initiate Training
Navigate to the Intelligence Center in IdentityCenter and select AI Intelligence Training. The training module is accessible to users with the Administrator role.
Before starting training, the system validates that:
- At least one source connection is configured and has completed a sync
- A minimum number of objects is present (recommended: 100+ user objects)
- The LLM provider is configured (see Configuring the LLM Provider)
Click Start Training to begin the process.
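The prerequisite checks above can be sketched as a simple validation routine. This is a minimal illustration; the `Environment` fields and `validate_training_prerequisites` are hypothetical names, not IdentityCenter's actual API, and the 100-object minimum mirrors the recommendation above:

```python
from dataclasses import dataclass

MIN_USER_OBJECTS = 100  # recommended minimum from the prerequisites above

@dataclass
class Environment:
    synced_connections: int        # source connections with a completed sync
    user_object_count: int         # user objects present in the data store
    llm_provider_configured: bool  # see "Configuring the LLM Provider"

def validate_training_prerequisites(env: Environment) -> list[str]:
    """Return human-readable problems; an empty list means training may start."""
    problems = []
    if env.synced_connections < 1:
        problems.append("No source connection has completed a sync")
    if env.user_object_count < MIN_USER_OBJECTS:
        problems.append(f"Fewer than {MIN_USER_OBJECTS} user objects present")
    if not env.llm_provider_configured:
        problems.append("LLM provider is not configured")
    return problems
```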
Step 2: Data Analysis Phase
During this phase, the IntelligenceDataRepository collects and preprocesses the training data:
| Activity | Description | Duration |
|---|---|---|
| Object Census | Counts and categorizes all directory objects | Seconds |
| Pattern Extraction | Identifies common access patterns, group structures, and organizational hierarchies | 1-5 minutes |
| Baseline Computation | Calculates baseline metrics for login frequency, group membership counts, and access levels per peer group | 2-10 minutes |
| Anomaly Calibration | Determines appropriate thresholds for anomaly detection based on your data's distribution | 1-5 minutes |
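The Baseline Computation activity above can be illustrated with a small sketch: compute per-peer-group baselines (mean and standard deviation) of group-membership counts. This is a simplified illustration in Python, not the product's actual implementation, and the record fields are assumed:

```python
import statistics
from collections import defaultdict

def compute_baselines(users):
    """Per-peer-group baseline of group-membership counts (mean and stdev)."""
    by_group = defaultdict(list)
    for user in users:
        by_group[user["peer_group"]].append(user["membership_count"])
    return {
        group: {
            "mean": statistics.mean(counts),
            "stdev": statistics.stdev(counts) if len(counts) > 1 else 0.0,
        }
        for group, counts in by_group.items()
    }
```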
Step 3: Model Calibration
The system uses the analyzed data to calibrate its detection models:
- Peer group boundaries are refined based on observed clustering in access patterns
- Inactivity thresholds are adjusted based on your environment's login distribution
- Risk score weights are fine-tuned to reflect the actual risk factors present in your data
- Anomaly sensitivity is calibrated to minimize false positives while catching genuine outliers
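Calibrating a threshold from the data's own distribution, as described above, might amount to a percentile cut: only the most extreme observed values exceed it. A minimal sketch using the nearest-rank method (the 99th-percentile default is illustrative, not IdentityCenter's actual setting):

```python
import math

def calibrate_threshold(values, percentile=99.0):
    """Nearest-rank percentile of the observed distribution, used as a
    flagging threshold so only the most extreme observations are flagged."""
    ordered = sorted(values)
    rank = math.ceil(percentile / 100 * len(ordered))
    return ordered[max(0, min(rank, len(ordered)) - 1)]
```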
Step 4: Completion
When training completes, the Intelligence Center displays:
- A summary of what was learned (number of peer groups identified, baseline metrics established)
- The estimated improvement in detection accuracy
- The timestamp of the training completion
- A recommendation for when to retrain
Progress Tracking
During training, a progress indicator shows:
| Phase | Progress Range | Description |
|---|---|---|
| Initializing | 0-5% | Validating prerequisites and preparing data pipeline |
| Collecting Data | 5-25% | Querying and assembling training data from the data store |
| Analyzing Patterns | 25-60% | Running pattern extraction and baseline computation |
| Calibrating Models | 60-90% | Adjusting detection thresholds and peer group definitions |
| Finalizing | 90-100% | Persisting results and updating the active models |
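Mapping a raw progress percentage back to a phase name from the table above is straightforward; a minimal sketch:

```python
# Upper bound of each phase's progress range, from the table above
PHASES = [
    (5, "Initializing"),
    (25, "Collecting Data"),
    (60, "Analyzing Patterns"),
    (90, "Calibrating Models"),
    (100, "Finalizing"),
]

def phase_for(percent: float) -> str:
    """Map a raw progress percentage to its phase name."""
    for upper_bound, name in PHASES:
        if percent <= upper_bound:
            return name
    return "Finalizing"
```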
You can continue using IdentityCenter normally while training runs in the background. Training does not lock any features or block sync operations.
How Training Improves Insights
More Accurate Anomaly Detection
Before training, a user with 45 group memberships might be flagged simply because 45 exceeds a generic threshold. After training, the system knows that engineers in your organization typically have 30-50 group memberships, so this user is within normal range -- while a marketing user with 45 memberships would still be flagged as an anomaly.
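The peer-group-aware flagging described above amounts to comparing a value against its own group's baseline rather than a global threshold. A hedged sketch using a z-score test (the 3.0 cutoff, the baseline shape, and the sample numbers are illustrative assumptions):

```python
def is_anomalous(value, baseline, z_cutoff=3.0):
    """Flag a value only when it deviates strongly from its peer group's
    baseline (illustrative z-score test; the cutoff is an assumption)."""
    if baseline["stdev"] == 0:
        return value != baseline["mean"]
    return abs(value - baseline["mean"]) / baseline["stdev"] > z_cutoff

engineering = {"mean": 40, "stdev": 5}  # engineers: 30-50 memberships is normal
marketing = {"mean": 12, "stdev": 3}    # marketing: far fewer memberships
```

With these baselines, 45 memberships is unremarkable for an engineer but a strong outlier for a marketing user, matching the example above.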
Better Peer Group Analysis
Training refines peer groups beyond simple department and title matching. The system may discover that:
- Users in "Engineering - Platform" and "Engineering - Infrastructure" share access patterns even though they are in different sub-departments
- Regional variations exist (e.g., EMEA users have different typical access than US users)
- Certain job titles in your organization carry different access expectations than industry norms
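One common way to discover the kind of cross-department similarity described above is set overlap over group memberships (Jaccard similarity). A small illustration with hypothetical group names -- not IdentityCenter's actual clustering algorithm:

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two users' group-membership sets (1.0 = identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical memberships for users in three teams
platform = {"vpn-users", "git-users", "k8s-prod", "ci-admins"}
infra = {"vpn-users", "git-users", "k8s-prod", "terraform-admins"}
marketing = {"vpn-users", "crm-users", "social-tools"}
```

Here the two engineering sub-departments overlap heavily (0.6) while platform and marketing barely overlap, so a behavioral peer grouping would merge the former despite their different department labels.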
Reduced False Positives
The most immediate benefit of training is a reduction in false positive alerts and findings. By understanding what is normal in your environment, the Intelligence engine stops flagging expected patterns and focuses on genuine anomalies.
| Metric | Before Training | After Training |
|---|---|---|
| Anomaly alerts per day | ~120 | ~35 |
| False positive rate | ~60% | ~15% |
| Mean time to investigate | 12 minutes per alert | 8 minutes per alert |
| Actionable finding rate | ~40% | ~85% |
(Values are illustrative and will vary based on environment size and data quality.)
When to Retrain
The AI model should be retrained periodically and in response to significant changes:
| Trigger | Reason |
|---|---|
| Quarterly schedule | Regular recalibration ensures the model stays current as your environment evolves |
| Major organizational restructure | Department changes, mergers, or reorganizations change what "normal" looks like |
| New source connection added | A new directory source introduces objects the model has not seen |
| Significant headcount change | Large onboarding or offboarding events shift baselines |
| After policy changes | New security policies may change expected access patterns |
| Rising false positive rate | If alert noise increases, the model may need recalibration |
Tip: Set a calendar reminder for quarterly retraining. The process is non-disruptive and typically completes within 15-30 minutes for environments with up to 50,000 objects.
Privacy and Data Security
All AI training in IdentityCenter is performed locally. No identity data, directory attributes, or training results are transmitted to external services as part of the training process.
| Concern | How It Is Addressed |
|---|---|
| Data residency | All training data stays within your IdentityCenter database |
| External API calls | The LLM is called for insight generation (not training). API calls contain structured prompts, not raw directory data |
| Data retention | Training results are stored in the IntelligenceDataRepository within your database |
| Access control | Only administrators can initiate training or view training results |
| Audit trail | Training initiation and completion are logged in the audit system |
Configuring the LLM Provider
The LLM provider is configured in the ChatAI Settings page, accessible from Administration > AI Settings.
Supported Providers
| Provider | Model | Configuration |
|---|---|---|
| Anthropic | Claude | API key, model selection |
Configuration Steps
1. Navigate to Administration > AI Settings (ChatAISettings.razor)
2. Select Anthropic as the LLM provider
3. Enter your Anthropic API key
4. Select the Claude model variant to use
5. Click Save and Test Connection to verify
Important: The API key is stored encrypted in the IdentityCenter database. It is used only for LLM inference calls (insight generation, chat responses) and is not transmitted to any other service.
API Usage
The LLM API is used for:
- Generating narrative insight text from structured analyzer findings
- Processing natural language chat messages in ChatHub
- Creating executive briefing summaries
The LLM is not used for the core training process (pattern analysis, baseline computation, model calibration). Those operations are performed locally using statistical analysis.
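To illustrate the distinction, a prompt for narrative insight generation could be assembled from structured analyzer findings only, with no raw directory attributes included. This is a sketch under assumptions -- the finding fields and function name are hypothetical, not IdentityCenter's actual prompt format:

```python
def build_insight_prompt(findings):
    """Assemble a structured prompt from analyzer findings. Only aggregate,
    non-identifying summaries are included -- no raw directory attributes."""
    lines = ["Summarize the following identity-security findings for an administrator:"]
    for f in findings:
        lines.append(f"- [{f['severity']}] {f['category']}: {f['summary']}")
    return "\n".join(lines)
```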
Monitoring Training Effectiveness
After training, monitor these metrics in the Intelligence Center to assess effectiveness:
| Metric | Where to Find It | Good Trend |
|---|---|---|
| False Positive Rate | Intelligence Dashboard | Decreasing |
| Alert Volume | Intelligence Dashboard | Decreasing (fewer noise alerts) |
| Actionable Finding Rate | Intelligence Dashboard | Increasing |
| Peer Group Coverage | Training Results Summary | >90% of users assigned to a peer group |
| Baseline Stability | Training Results Summary | Baselines converge (low variance between training runs) |
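Baseline stability between runs, as described in the table above, can be checked by comparing each peer group's baseline mean across two training results. A minimal sketch (the result shape is an assumption):

```python
def baseline_drift(previous, current):
    """Relative change in each peer group's baseline mean between two
    training runs; uniformly small values indicate convergence."""
    drift = {}
    for group in previous.keys() & current.keys():
        prev_mean = previous[group]["mean"]
        drift[group] = (
            abs(current[group]["mean"] - prev_mean) / prev_mean if prev_mean else 0.0
        )
    return drift
```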
Next Steps
- Risk Scoring -- Understand how trained models improve risk score accuracy
- Contextual Insights -- See how training enhances per-object analysis
- Intelligence Hub Overview -- Full overview of the analytics platform
- Using the AI Chat -- Interact with the trained AI through ChatHub
- Natural Language Queries -- How the LLM processes your questions
- Dashboard and Reporting -- Track Intelligence metrics over time
- Security Hardening -- Secure your AI configuration