A high-performance face detection and validation service built in Go that uses ONNX Runtime for inference. The service can detect and validate single/multiple faces in images with optimized processing using SIMD instructions.
- Fast face detection using YOLOv11n model (IR version 9)
- SIMD-optimized image preprocessing (AVX-512, AVX2, SSE4.1)
- Concurrent processing with efficient memory management
- Model session pooling for improved performance
- Support for multiple input formats (JSON, multipart/form-data, raw)
- Health monitoring and metrics endpoints
- Docker support with multi-stage builds
- Graceful shutdown handling
- Go 1.22+
- ONNX Runtime 1.20.0
- YOLOv11n neural network model
- SIMD optimizations (AVX-512, AVX2, SSE4.1)
- Docker
- Detection Engine
  - ONNX model inference
  - SIMD-optimized image preprocessing
  - Bounding box processing
  - Clustering algorithm for multiple detections
- Session Management
  - Thread-safe model session pool
  - Automatic session health checks
  - Configurable pool size and timeouts
- Image Processing Pipeline
  - Image decoding
  - Resizing
  - Channel processing
  - SIMD-accelerated preprocessing
- SIMD instructions for faster image processing
- Concurrent channel processing
- Memory pooling and reuse
- Efficient bounding box clustering
- Session pooling to reduce model loading overhead
Validates faces in the provided image.
Supported Input Formats:
- JSON with base64 encoded image
- Multipart form-data
- Raw image data
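One way to tell the three formats apart is to inspect the request's `Content-Type` header. The sketch below is an assumption about how such dispatch could look; `InputFormat` and `detectFormat` are hypothetical names, and treating every other content type as raw image data is a simplification.

```go
package main

import (
	"fmt"
	"mime"
)

// InputFormat identifies how the image arrives in the request body.
type InputFormat int

const (
	FormatRaw InputFormat = iota
	FormatJSON
	FormatMultipart
)

// detectFormat maps a Content-Type header value to an input format.
// Anything that is not JSON or multipart is treated as raw image bytes.
func detectFormat(contentType string) InputFormat {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return FormatRaw
	}
	switch mediaType {
	case "application/json":
		return FormatJSON
	case "multipart/form-data":
		return FormatMultipart
	default:
		return FormatRaw
	}
}

func main() {
	fmt.Println(detectFormat("application/json; charset=utf-8") == FormatJSON)
	fmt.Println(detectFormat("multipart/form-data; boundary=abc") == FormatMultipart)
	fmt.Println(detectFormat("image/jpeg") == FormatRaw)
}
```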
Response:
{
  "is_valid": boolean,
  "face_count": integer,
  "message": string
}
Health check endpoint.
Returns pool metrics and performance statistics.
Environment Variables:
- PORT: Server port (default: 8080)
- DEBUG: Enable debug logging
- MODEL_OUTPUT_CHANNELS: Model output channels
- MODEL_OUTPUT_GRID_SIZE: Model output grid size
- GOMAXPROCS: Go runtime thread limit
- GOMEMLIMIT: Go runtime memory limit
# Build the image
docker build -t face-validation-service .
# Run the container
docker run -p 8080:8080 face-validation-service
Requirements:
- Go 1.22+
- ONNX Runtime 1.20.0
- C compiler (for CGO)
# Clone the repository
git clone https://github.com/Tutortoise/face-validation-service
# Build the service
cd face-validation-service/web
go build -o server
# Run the service
./server
- Uses SIMD instructions when available
- Implements memory pooling to reduce GC pressure
- Concurrent processing of image channels
- Session pooling for better resource utilization
- Efficient clustering algorithm for multiple face detection
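The clustering step is not specified in detail here. One common way to collapse overlapping detections into one box per face is greedy grouping by intersection-over-union (IoU), which is effectively non-maximum suppression. The sketch below assumes that technique; `Box`, `iou`, and `clusterBoxes` are hypothetical names and the 0.5 threshold is an arbitrary example value.

```go
package main

import (
	"fmt"
	"sort"
)

// Box is an axis-aligned detection with a confidence score (hypothetical type).
type Box struct{ X1, Y1, X2, Y2, Score float32 }

func minf(a, b float32) float32 {
	if a < b {
		return a
	}
	return b
}

func maxf(a, b float32) float32 {
	if a > b {
		return a
	}
	return b
}

// iou returns the intersection-over-union of two boxes, or 0 if they do not overlap.
func iou(a, b Box) float32 {
	iw := minf(a.X2, b.X2) - maxf(a.X1, b.X1)
	ih := minf(a.Y2, b.Y2) - maxf(a.Y1, b.Y1)
	if iw <= 0 || ih <= 0 {
		return 0
	}
	inter := iw * ih
	union := (a.X2-a.X1)*(a.Y2-a.Y1) + (b.X2-b.X1)*(b.Y2-b.Y1) - inter
	if union <= 0 {
		return 0
	}
	return inter / union
}

// clusterBoxes keeps the highest-scoring box of each overlapping group.
// A box is absorbed when its IoU with an already kept box reaches threshold.
func clusterBoxes(boxes []Box, threshold float32) []Box {
	sorted := append([]Box(nil), boxes...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Score > sorted[j].Score })
	var kept []Box
	for _, b := range sorted {
		absorbed := false
		for _, k := range kept {
			if iou(b, k) >= threshold {
				absorbed = true
				break
			}
		}
		if !absorbed {
			kept = append(kept, b)
		}
	}
	return kept
}

func main() {
	boxes := []Box{
		{0, 0, 10, 10, 0.9},
		{1, 1, 11, 11, 0.8},
		{50, 50, 60, 60, 0.7},
	}
	fmt.Println("faces:", len(clusterBoxes(boxes, 0.5)))
}
```

The number of boxes kept is then the face count for the image.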
The service provides monitoring endpoints for:
- Session pool statistics
- Processing times
- Resource usage
- Error rates
- Health status
The service implements comprehensive error handling for:
- Invalid input formats
- Image decoding failures
- Model inference errors
- Resource exhaustion
- Timeout scenarios
face-validation-service/
├── web/
│ ├── clustering/ # Clustering algorithms
│ ├── detections/ # Face detection logic
│ ├── lib/ # ONNX Runtime libraries
│ ├── models/ # Data models
│ ├── onnx_model/ # Neural network model
│ ├── main.go # Main application
│ └── pool.go # Session pool implementation
└── Dockerfile
# Run tests
go test ./...
# Build with debug info
go build -gcflags="all=-N -l" -o server
# Run with debug logging
DEBUG=true ./server
The face detection model was trained on two datasets so that it detects human faces accurately while reducing false positives:
- Main Face Detection Dataset
  - Source: Face Detection Dataset from Kaggle
  - Purpose: Primary training data for human face detection
  - Features:
    - High-quality human face images
    - Various poses and lighting conditions
    - Different age groups and ethnicities
- Anime Face Dataset (Negative Examples)
  - Source: Annotated Anime Faces Dataset
  - Purpose: Used as negative examples to reduce false positives
  - Features:
    - Anime and cartoon-style faces
    - Helps model discriminate between real human faces and illustrated faces
    - Improves model's specificity for human face detection
- Data Preparation
  - Combined both datasets with appropriate labeling
  - Human faces labeled as positive examples
  - Anime faces labeled as negative examples
  - Applied data augmentation techniques
- Model Architecture
  - Based on YOLOv11n with IR version 9
  - Optimized for 256x256 input resolution
  - Modified to handle binary classification (human face vs. non-human face)
- Training Configuration
  - Trained on Kaggle's GPU environment
  - Used transfer learning from pre-trained weights
  - Implemented custom loss function to balance precision and recall
  - Employed hard negative mining to improve discrimination capability
This dual-dataset approach is designed so that the model:
- Accurately detects human faces
- Minimizes false positives from animated or illustrated faces
- Provides robust performance in real-world applications