Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
b216ad2
Add scripts for running samples and setting up environment for Azure …
changjian-wang Apr 22, 2026
a7f9a1c
Enhance SKILL.md files to include related skills for better user guid…
changjian-wang Apr 22, 2026
7c74788
Update README.md to enhance Table of Contents with new sections for b…
changjian-wang Apr 22, 2026
872dd16
Add setup and run scripts for Azure AI Content Understanding SDK samples
wangchangjian1130 Apr 22, 2026
3906706
Merge branch 'main' into changjian-wang/cu-sdk-skills
changjian-wang Apr 23, 2026
8146dd8
Update SKILL.md to enhance user guidance for Java SDK setup and confi…
wangchangjian1130 Apr 23, 2026
c44e5da
Update SKILL.md to clarify Maven installation commands and add troubl…
wangchangjian1130 Apr 23, 2026
e0332ec
Add setup script for Azure AI Content Understanding Java SDK environment
changjian-wang Apr 24, 2026
2aa38c2
Enhance setup_user_env.sh script for Azure AI Content Understanding SDK
wangchangjian1130 Apr 25, 2026
f96aa50
Merge branch 'main' into changjian-wang/cu-sdk-skills
changjian-wang Apr 25, 2026
fd5b9ce
fix: update environment variable handling and improve script comments
wangchangjian1130 Apr 25, 2026
b0275cf
fix: update logging link in README to point to the new documentation
wangchangjian1130 Apr 25, 2026
b1c4aff
refactor: remove setup_samples.sh script and update SKILL.md for envi…
wangchangjian1130 Apr 25, 2026
997e2bc
feat: add Sample07_ListAnalyzers to SKILL.md for analyzer enumeration
wangchangjian1130 Apr 26, 2026
efce265
Address PR review feedback for cu-sdk-sample-run skill
changjian-wang May 8, 2026
8aa71e7
Merge remote-tracking branch 'origin/main' into changjian-wang/cu-sdk…
changjian-wang May 8, 2026
d83299a
Fix cspell error in run_sample.sh
changjian-wang May 8, 2026
3fa0321
[Java CU SDK] Align cu-sdk-sample-run skill structure with Python; ad…
changjian-wang May 8, 2026
ab3d09e
Merge branch 'main' into changjian-wang/cu-sdk-skills
changjian-wang May 9, 2026
c29be38
Enhance Sample16_CreateAnalyzerWithLabelsAsync and related tests for …
changjian-wang May 9, 2026
a8687e3
Merge branch 'changjian-wang/cu-sdk-skills' of https://github.com/Azu…
changjian-wang May 9, 2026
db0c3ad
Merge branch 'main' into changjian-wang/cu-sdk-skills
changjian-wang May 9, 2026
691bde4
Bump azure-storage-blob to 12.33.4 to satisfy version_client.txt
changjian-wang May 9, 2026
32347f7
Fix cspell error: rename 'Knowledge srcs' to 'Knowledge sources'
changjian-wang May 9, 2026
f04c36a
Address Copilot review: storage-blob test scope, SAS clock skew
changjian-wang May 9, 2026
e95e2d1
Merge branch 'main' into changjian-wang/cu-sdk-skills
changjian-wang May 9, 2026
b6df03f
Merge branch 'main' into changjian-wang/cu-sdk-skills
changjian-wang May 12, 2026
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
---
name: cu-sdk-common-knowledge
description: Domain knowledge for Azure AI Content Understanding. Use this skill to answer questions about Content Understanding concepts, analyzers, field schemas, API operations, and Java SDK usage. Always consult official documentation before answering.
---

# Azure AI Content Understanding Domain Knowledge

This skill provides domain knowledge for Azure AI Content Understanding, a multimodal AI service that extracts semantic content from documents, video, audio, and image files.

> **[COPILOT GUIDANCE]:** Always consult the official documentation first before answering user questions. Use `fetch_webpage` to read the relevant doc page when the reference material below is insufficient or may be outdated.
>
> When a user's question is broad or ambiguous, ask them to clarify:
> - "Which modality are you working with — documents, images, audio, or video?"
> - "Are you using a prebuilt analyzer, or building a custom one?"
> - "Are you asking about the Java SDK specifically, or the service in general?"

## Official Documentation

The authoritative source for Content Understanding is: **https://learn.microsoft.com/azure/ai-services/content-understanding/**

Always read the relevant page (via `fetch_webpage`) before answering if the reference material below does not cover the topic.

### Key Documentation Pages

| Topic | URL |
|-------|-----|
| **Overview** | https://learn.microsoft.com/azure/ai-services/content-understanding/overview |
| **What's new** | https://learn.microsoft.com/azure/ai-services/content-understanding/whats-new |
| **Content Understanding Studio** | https://learn.microsoft.com/azure/ai-services/content-understanding/quickstart/content-understanding-studio?tabs=portal%2Ccu-studio |
| **Service limits** | https://learn.microsoft.com/azure/ai-services/content-understanding/service-limits |
| **Region & language support** | https://learn.microsoft.com/azure/ai-services/content-understanding/language-region-support |
| **Prebuilt analyzers** | https://learn.microsoft.com/azure/ai-services/content-understanding/concepts/prebuilt-analyzers |
| **Create custom analyzer** | https://learn.microsoft.com/azure/ai-services/content-understanding/tutorial/create-custom-analyzer?tabs=portal%2Cdocument&pivots=programming-language-java |
| **Document markdown** | https://learn.microsoft.com/azure/ai-services/content-understanding/document/markdown |
| **Document elements** | https://learn.microsoft.com/azure/ai-services/content-understanding/document/elements |
| **Video overview** | https://learn.microsoft.com/azure/ai-services/content-understanding/video/overview |
| **Video elements** | https://learn.microsoft.com/azure/ai-services/content-understanding/video/elements |
| **Audio overview** | https://learn.microsoft.com/azure/ai-services/content-understanding/audio/overview |
| **Image overview** | https://learn.microsoft.com/azure/ai-services/content-understanding/image/overview |
| **REST API reference** | https://learn.microsoft.com/rest/api/contentunderstanding/operation-groups |

### Java SDK Resources

| Resource | URL |
|----------|-----|
| **Maven Central** | https://central.sonatype.com/artifact/com.azure/azure-ai-contentunderstanding |
| **Java SDK README** | https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/contentunderstanding/azure-ai-contentunderstanding/README.md |
| **Java SDK Samples** | https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/contentunderstanding/azure-ai-contentunderstanding/src/samples |

> **Search tip:** If the above pages don't cover the user's question, search the doc tree at `https://learn.microsoft.com/azure/ai-services/content-understanding/`.

## Related Skills

- `cu-sdk-setup` — Set up environment variables for Java SDK samples
- `cu-sdk-sample-run` — Run specific Java SDK samples interactively

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
@@ -0,0 +1,259 @@
#!/usr/bin/env bash
set -euo pipefail
# cspell:ignore envfile esac

# run_sample.sh
# Run a specific Java sample for the Azure AI Content Understanding SDK.
# Compiles the samples module (if needed) and runs the specified sample class
# using mvn exec:java.
#
# Usage:
# run_sample.sh <SampleClassName> [--env <env-file>] [--dry-run]
# Examples:
# run_sample.sh Sample02_AnalyzeUrl
# run_sample.sh Sample02_AnalyzeUrlAsync
# run_sample.sh Sample02_AnalyzeUrl --env .env
# run_sample.sh --list

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# Package root is 4 levels up from scripts: .github/skills/cu-sdk-sample-run/scripts -> package root
PACKAGE_ROOT="$(cd "$SCRIPT_DIR/../../../.." && pwd)"
SAMPLES_DIR="$PACKAGE_ROOT/src/samples/java/com/azure/ai/contentunderstanding/samples"
PACKAGE="com.azure.ai.contentunderstanding.samples"

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Defaults
DRY_RUN=0
ENV_FILE=""
SAMPLE_NAME=""

print_info() { echo -e "${BLUE}$1${NC}"; }
print_success() { echo -e "${GREEN}$1${NC}"; }
print_warning() { echo -e "${YELLOW}$1${NC}"; }
print_error() { echo -e "${RED}$1${NC}"; }

print_help() {
cat <<EOF
Usage: $(basename "$0") <SampleClassName> [OPTIONS]

Run a specific Java sample for the Azure AI Content Understanding SDK.

Arguments:
<SampleClassName> Sample class name (e.g., Sample02_AnalyzeUrl).
The .java extension is optional.

Options:
--env <file> Load environment variables from the given .env file before running.
--dry-run Print what would be executed without running.
--list List available samples and exit.
--help, -h Show this help message.

Examples:
$(basename "$0") Sample02_AnalyzeUrl
$(basename "$0") Sample02_AnalyzeUrlAsync
$(basename "$0") Sample02_AnalyzeUrl --env .env
$(basename "$0") --list
EOF
}

list_samples() {
echo ""
print_info "=== Available Sync Samples ==="
for f in "$SAMPLES_DIR"/Sample*.java; do
[ -f "$f" ] || continue
local name
name="$(basename "$f" .java)"
# Skip async samples in this section
[[ "$name" == *Async ]] && continue
echo " $name"
done
echo ""
print_info "=== Available Async Samples ==="
for f in "$SAMPLES_DIR"/Sample*Async.java; do
[ -f "$f" ] || continue
local name
name="$(basename "$f" .java)"
echo " $name"
done
echo ""
}

# Load environment variables from a .env file.
#
# Only simple NAME=VALUE assignments are accepted (with an optional leading
# `export `). Names must be valid shell identifiers ([A-Za-z_][A-Za-z0-9_]*).
# A single matching pair of surrounding double or single quotes is stripped
# from the value. Anything else is skipped with a warning. We deliberately
# avoid `eval` so a malicious or malformed .env file cannot execute arbitrary
# commands or trigger command substitution.
load_env_file() {
local envfile="$1"
if [[ ! -f "$envfile" ]]; then
print_error "Error: .env file not found: $envfile"
exit 1
fi
print_info "Loading environment variables from: $envfile"
local line name value lineno=0
while IFS= read -r line || [[ -n "$line" ]]; do
lineno=$((lineno + 1))
# Skip empty lines and comments
[[ -z "$line" || "$line" =~ ^[[:space:]]*# ]] && continue
# Strip optional leading `export ` (with surrounding whitespace)
line="${line#"${line%%[![:space:]]*}"}" # strip leading whitespace
line="${line#export }"
# Require NAME=VALUE with a valid identifier on the left
if [[ ! "$line" =~ ^([A-Za-z_][A-Za-z0-9_]*)=(.*)$ ]]; then
print_warning " Skipping line $lineno (not a NAME=VALUE assignment)"
continue
fi
name="${BASH_REMATCH[1]}"
value="${BASH_REMATCH[2]}"
# Strip a single matching pair of surrounding double or single quotes
if [[ "$value" =~ ^\"(.*)\"$ ]]; then
value="${BASH_REMATCH[1]}"
elif [[ "$value" =~ ^\'(.*)\'$ ]]; then
value="${BASH_REMATCH[1]}"
fi
export "$name=$value"
done < "$envfile"
print_success "✓ Environment variables loaded"
}

# Parse arguments
while [[ $# -gt 0 ]]; do
case "$1" in
--help|-h)
print_help
exit 0
;;
--list|-l)
list_samples
exit 0
;;
--dry-run)
DRY_RUN=1
shift
;;
--env)
if [[ -z "${2:-}" ]]; then
print_error "Error: --env requires a file path argument"
exit 1
fi
ENV_FILE="$2"
shift 2
;;
-*)
print_error "Unknown option: $1"
print_help
exit 1
;;
*)
if [[ -z "$SAMPLE_NAME" ]]; then
SAMPLE_NAME="$1"
else
print_error "Error: Multiple samples specified. Only one sample is supported."
exit 1
fi
shift
;;
esac
done

if [[ -z "$SAMPLE_NAME" ]]; then
print_error "Error: No sample name provided"
echo ""
print_help
exit 1
fi

# Normalize: strip .java extension if provided
SAMPLE_NAME="${SAMPLE_NAME%.java}"

# Verify the sample file exists
SAMPLE_FILE="$SAMPLES_DIR/${SAMPLE_NAME}.java"
if [[ ! -f "$SAMPLE_FILE" ]]; then
print_error "Error: Sample not found: $SAMPLE_FILE"
echo ""
echo "Did you mean one of these?"
ls "$SAMPLES_DIR"/Sample*.java 2>/dev/null | xargs -n1 basename | sed 's/\.java$//' | grep -i "${SAMPLE_NAME}" | head -5 || true
echo ""
echo "Run '$(basename "$0") --list' to see all available samples"
exit 1
fi

FULL_CLASS="${PACKAGE}.${SAMPLE_NAME}"

echo ""
print_info "=== Run Java Sample ==="
echo "Package root: $PACKAGE_ROOT"
echo "Sample class: $FULL_CLASS"
echo "Sample file: $SAMPLE_FILE"
echo ""

# Navigate to package root
cd "$PACKAGE_ROOT"

# Load .env file if specified
if [[ -n "$ENV_FILE" ]]; then
# Resolve relative path from original cwd
if [[ "$ENV_FILE" != /* ]]; then
ENV_FILE="$PACKAGE_ROOT/$ENV_FILE"
fi
load_env_file "$ENV_FILE"
echo ""
fi

# Check for required environment variable
if [[ -z "${CONTENTUNDERSTANDING_ENDPOINT:-}" ]]; then
print_warning "⚠ CONTENTUNDERSTANDING_ENDPOINT is not set. Most samples will fail without it."
echo " Set it with: export CONTENTUNDERSTANDING_ENDPOINT=\"https://your-foundry.services.ai.azure.com/\""
echo " Or use: $(basename "$0") $SAMPLE_NAME --env .env"
echo ""
fi

# Sample16 demo-mode banner: warn if the user is about to run the labeled-data
# sample without configuring either Option A (SAS URL) or Option B (storage
# account + container) — the sample will still run but skip the labeled-data
# code path.
if [[ "$SAMPLE_NAME" == Sample16* ]]; then
if [[ -z "${CONTENTUNDERSTANDING_TRAINING_DATA_SAS_URL:-}" ]]; then
if [[ -z "${CONTENTUNDERSTANDING_TRAINING_DATA_STORAGE_ACCOUNT:-}" \
|| -z "${CONTENTUNDERSTANDING_TRAINING_DATA_CONTAINER:-}" ]]; then
print_warning "⚠ DEMO MODE: no training data configured for $SAMPLE_NAME."
echo " The analyzer will be created without labeled data ('Knowledge sources: 0')."
echo " To exercise the labeled-data API path, configure ONE of:"
echo " Option A: CONTENTUNDERSTANDING_TRAINING_DATA_SAS_URL=<container SAS URL>"
echo " Option B: CONTENTUNDERSTANDING_TRAINING_DATA_STORAGE_ACCOUNT=<account>"
echo " CONTENTUNDERSTANDING_TRAINING_DATA_CONTAINER=<container>"
echo " then re-run: set -a && source .env && set +a"
echo ""
fi
fi
fi

# Build command. Sample classes live under src/samples/java and are compiled
# as test sources, so we must run test-compile before exec:java; otherwise on
# a clean checkout the sample class will not exist on the classpath.
MVN_CMD="mvn -DskipTests test-compile exec:java -Dexec.mainClass=\"${FULL_CLASS}\" -Dexec.classpathScope=test"

if [[ $DRY_RUN -eq 1 ]]; then
echo "DRY RUN: would execute:"
echo " cd $PACKAGE_ROOT"
[[ -n "${ENV_FILE:-}" ]] && echo " (env loaded from $ENV_FILE)"
echo " $MVN_CMD"
exit 0
fi

# Run the sample
print_info "Running: $SAMPLE_NAME"
echo ""
mvn -DskipTests test-compile exec:java -Dexec.mainClass="${FULL_CLASS}" -Dexec.classpathScope=test

Comment thread
changjian-wang marked this conversation as resolved.
echo ""
print_success "✓ Sample completed: $SAMPLE_NAME"
Loading