Updates to saliency and recognize text
Knightro63 committed Feb 3, 2024
1 parent e9c518e commit c9ee4ee
Showing 18 changed files with 216 additions and 75 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -31,6 +31,7 @@ Apple Vision is a Flutter plugin that enables Flutter apps to use [Apple Vision]
|[Barcode Scanner](https://developer.apple.com/documentation/vision/vnbarcodeobservation) | [apple\_vision\_scanner](https://pub.dev/packages/apple_vision_scanner) [![Pub Version](https://img.shields.io/pub/v/apple_vision_scanner)](https://pub.dev/packages/apple_vision_scanner) | [![GitHub](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](https://github.com/Knightro63/apple_vision/tree/master/packages/apple_vision_scanner) |||
|[Animal Pose](https://developer.apple.com/documentation/vision/vndetectanimalbodyposerequest) | [apple\_vision\_animal\_pose](https://pub.dev/packages/apple_vision_animal_pose) [![Pub Version](https://img.shields.io/pub/v/apple_vision_animal_pose)](https://pub.dev/packages/apple_vision_animal_pose) | [![GitHub](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](https://github.com/Knightro63/apple_vision/tree/master/packages/apple_vision_animal_pose) |||
|[Pose 3D](https://developer.apple.com/documentation/vision/identifying_3d_human_body_poses_in_images) | [apple\_vision\_pose\_3d](https://pub.dev/packages/apple_vision_pose_3d) [![Pub Version](https://img.shields.io/pub/v/apple_vision_pose_3d)](https://pub.dev/packages/apple_vision_pose_3d) | [![GitHub](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](https://github.com/Knightro63/apple_vision/tree/master/packages/apple_vision_pose_3d) |||
|[Saliency](https://developer.apple.com/documentation/vision/vnsaliencyimageobservation) | [apple\_vision\_saliency](https://pub.dev/packages/apple_vision_saliency) [![Pub Version](https://img.shields.io/pub/v/apple_vision_saliency)](https://pub.dev/packages/apple_vision_saliency) | [![GitHub](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](https://github.com/Knightro63/apple_vision/tree/master/packages/apple_vision_saliency) |||

## Requirements

5 changes: 5 additions & 0 deletions packages/apple_vision/CHANGELOG.md
@@ -1,3 +1,8 @@
## 0.0.4

* Updated Recognize Text
* Added Saliency

## 0.0.3

* Added Human Pose 3D
1 change: 1 addition & 0 deletions packages/apple_vision/README.md
@@ -31,6 +31,7 @@ Apple Vision is a Flutter plugin that enables Flutter apps to use [Apple Vision]
|[Barcode Scanner](https://developer.apple.com/documentation/vision/vnbarcodeobservation) | [apple\_vision\_scanner](https://pub.dev/packages/apple_vision_scanner) [![Pub Version](https://img.shields.io/pub/v/apple_vision_scanner)](https://pub.dev/packages/apple_vision_scanner) | [![GitHub](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](https://github.com/Knightro63/apple_vision/tree/master/packages/apple_vision_scanner) |||
|[Animal Pose](https://developer.apple.com/documentation/vision/vndetectanimalbodyposerequest) | [apple\_vision\_animal\_pose](https://pub.dev/packages/apple_vision_animal_pose) [![Pub Version](https://img.shields.io/pub/v/apple_vision_animal_pose)](https://pub.dev/packages/apple_vision_animal_pose) | [![GitHub](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](https://github.com/Knightro63/apple_vision/tree/master/packages/apple_vision_animal_pose) |||
|[Pose 3D](https://developer.apple.com/documentation/vision/identifying_3d_human_body_poses_in_images) | [apple\_vision\_pose\_3d](https://pub.dev/packages/apple_vision_pose_3d) [![Pub Version](https://img.shields.io/pub/v/apple_vision_pose_3d)](https://pub.dev/packages/apple_vision_pose_3d) | [![GitHub](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](https://github.com/Knightro63/apple_vision/tree/master/packages/apple_vision_pose_3d) |||
|[Saliency](https://developer.apple.com/documentation/vision/vnsaliencyimageobservation) | [apple\_vision\_saliency](https://pub.dev/packages/apple_vision_saliency) [![Pub Version](https://img.shields.io/pub/v/apple_vision_saliency)](https://pub.dev/packages/apple_vision_saliency) | [![GitHub](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](https://github.com/Knightro63/apple_vision/tree/master/packages/apple_vision_saliency) |||

## Requirements

7 changes: 4 additions & 3 deletions packages/apple_vision/pubspec.yaml
@@ -1,6 +1,6 @@
name: apple_vision
description: A Flutter plugin to use all APIs from Apple Vision, made for macOS and iOS.
version: 0.0.3
version: 0.0.4
homepage: https://github.com/Knightro63/apple_vision/tree/main/packages/apple_vision

environment:
@@ -21,10 +21,11 @@ dependencies:
apple_vision_object_tracking: ^0.0.2
apple_vision_pose: ^0.0.3
apple_vision_pose_3d: ^0.0.1
apple_vision_recognize_text: ^0.0.2
apple_vision_recognize_text: ^0.0.3
apple_vision_scanner: ^0.0.2
apple_vision_selfie: ^0.0.2

apple_vision_saliency: ^0.0.1

dev_dependencies:
flutter_test:
sdk: flutter
6 changes: 6 additions & 0 deletions packages/apple_vision_recognize_text/CHANGELOG.md
@@ -1,3 +1,9 @@
## 0.0.3

* Added language support
* Added automatic language detection
* Added recognition level

## 0.0.2

* Added image orientation
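
Version 0.0.3 also changes the Dart-side processImage signature (see the controller and example diffs below); a minimal migration sketch, assuming the RecognizeTextData class introduced later in this commit:

// Before (apple_vision_recognize_text 0.0.2):
// final text = await visionController.processImage(image, imageSize);

// After (0.0.3): the arguments are wrapped in a RecognizeTextData object.
final text = await visionController.processImage(
  RecognizeTextData(image: image, imageSize: imageSize),
);
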
@@ -41,15 +41,71 @@ public class AppleVisionRecognizeTextPlugin: NSObject, FlutterPlugin {
let height = arguments["height"] as? Double ?? 0
let candidates = arguments["candidates"] as? Int ?? 1
let orientation = arguments["orientation"] as? String ?? "downMirrored"
let recognitionLevel = arguments["recognitionLevel"] as? String ?? "accurate"
let queueString = arguments["dispatch"] as? String ?? "defaultQueue"
let languages = arguments["languages"] as? [String] ?? nil
let automaticallyDetectsLanguage = arguments["automaticallyDetectsLanguage"] as? Bool ?? false

var dq:DispatchQoS.QoSClass = DispatchQoS.QoSClass.default
switch queueString{
case "background":
dq = DispatchQoS.QoSClass.background
case "unspecified":
dq = DispatchQoS.QoSClass.unspecified
case "userInitiated":
dq = DispatchQoS.QoSClass.userInitiated
case "userInteractive":
dq = DispatchQoS.QoSClass.userInteractive
case "utility":
dq = DispatchQoS.QoSClass.utility
default:
dq = DispatchQoS.QoSClass.default
}

#if os(iOS)
if #available(iOS 13.0, *) {
return result(convertImage(Data(data.data),CGSize(width: width , height: height),candidates,CIFormat.BGRA8,orientation))
let event = self.convertImage(
Data(data.data),
CGSize(width: width , height: height),
candidates,
CIFormat.BGRA8,
orientation,
recognitionLevel,
languages,
automaticallyDetectsLanguage
)
if dq == DispatchQoS.QoSClass.default{
return result(event)
}
else{
DispatchQueue.global(qos: dq).async {
DispatchQueue.main.async{result(event)}
}
}
} else {
return result(FlutterError(code: "INVALID OS", message: "requires iOS 13.0 or newer", details: nil))
}
#elseif os(macOS)
return result(convertImage(Data(data.data),CGSize(width: width , height: height),candidates,CIFormat.ARGB8,orientation))
let event = self.convertImage(
Data(data.data),
CGSize(width: width , height: height),
candidates,
CIFormat.ARGB8,
orientation,
recognitionLevel,
languages,
automaticallyDetectsLanguage
)
if dq == DispatchQoS.QoSClass.default{
return result(event)
}
else{
DispatchQueue.global(qos: dq).async {
DispatchQueue.main.async{
result(event)
}
}
}
#endif
default:
result(FlutterMethodNotImplemented)
@@ -60,41 +116,51 @@ public class AppleVisionRecognizeTextPlugin: NSObject, FlutterPlugin {
#if os(iOS)
@available(iOS 13.0, *)
#endif
func convertImage(_ data: Data,_ imageSize: CGSize, _ candidates: Int,_ format: CIFormat,_ oriString: String) -> [String:Any?]{
func convertImage(
_ data: Data,
_ imageSize: CGSize,
_ candidates: Int,
_ format: CIFormat,
_ oriString: String,
_ recognitionLevelString: String,
_ languages: [String]?,
_ automaticallyDetectsLanguage: Bool
) -> [String:Any?]{
let imageRequestHandler:VNImageRequestHandler

var orientation:CGImagePropertyOrientation = CGImagePropertyOrientation.downMirrored
switch oriString{
case "down":
orientation = CGImagePropertyOrientation.down
break
case "right":
orientation = CGImagePropertyOrientation.right
break
case "rightMirrored":
orientation = CGImagePropertyOrientation.rightMirrored
break
case "left":
orientation = CGImagePropertyOrientation.left
break
case "leftMirrored":
orientation = CGImagePropertyOrientation.leftMirrored
break
case "up":
orientation = CGImagePropertyOrientation.up
break
case "upMirrored":
orientation = CGImagePropertyOrientation.upMirrored
break
default:
orientation = CGImagePropertyOrientation.downMirrored
}

var recognitionLevel:VNRequestTextRecognitionLevel = VNRequestTextRecognitionLevel.accurate
switch recognitionLevelString{
case "fast":
recognitionLevel = VNRequestTextRecognitionLevel.fast
break
default:
recognitionLevel = VNRequestTextRecognitionLevel.accurate
break
}

if data.count == (Int(imageSize.height)*Int(imageSize.width)*4){
// Create a bitmap graphics context with the sample buffer data
let context = CIImage(bitmapData: data, bytesPerRow: Int(imageSize.width)*4, size: imageSize, format: format, colorSpace: nil)

imageRequestHandler = VNImageRequestHandler(ciImage:context,orientation: orientation)
}
else{
@@ -104,11 +170,9 @@ public class AppleVisionRecognizeTextPlugin: NSObject, FlutterPlugin {
var event:[String:Any?] = ["name":"noData"];

do {
try
imageRequestHandler.perform([VNRecognizeTextRequest { (request, error)in
let request = VNRecognizeTextRequest {(req, error)in
if error == nil {

if let results = request.results as? [VNRecognizedTextObservation] {
if let results = req.results as? [VNRecognizedTextObservation] {
var listText:[[String:Any?]] = []
for text in results {
listText.append(self.processObservation(text,imageSize,candidates))
@@ -126,7 +190,21 @@ public class AppleVisionRecognizeTextPlugin: NSObject, FlutterPlugin {
event = ["name":"error","code": "No Text Detected", "message": error!.localizedDescription]
print(error!.localizedDescription)
}
}])
}
if languages != nil {
request.recognitionLanguages = languages!
}
#if os(iOS)
if #available(iOS 16.0, *) {
request.automaticallyDetectsLanguage = automaticallyDetectsLanguage
}
#elseif os(macOS)
if #available(macOS 13.0, *) {
request.automaticallyDetectsLanguage = automaticallyDetectsLanguage
}
#endif
request.recognitionLevel = recognitionLevel
try imageRequestHandler.perform([request])
} catch {
event = ["name":"error","code": "Data Corropted", "message": error.localizedDescription]
print(error)
@@ -141,9 +219,8 @@ public class AppleVisionRecognizeTextPlugin: NSObject, FlutterPlugin {
func processObservation(_ observation: VNRecognizedTextObservation,_ imageSize: CGSize, _ candidates: Int) -> [String:Any?] {
// Retrieve all torso points.
let recognizedPoints = observation.boundingBox
let coord = VNImagePointForNormalizedPoint(recognizedPoints.origin,
Int(imageSize.width),
Int(imageSize.height))
let coord = VNImagePointForNormalizedPoint(recognizedPoints.origin,Int(imageSize.width),Int(imageSize.height))

return [
"minX":Double(recognizedPoints.minX),
"maxX":Double(recognizedPoints.maxX),
11 changes: 10 additions & 1 deletion packages/apple_vision_recognize_text/example/lib/main.dart
@@ -4,6 +4,7 @@ import '../camera/camera_insert.dart';
import 'package:flutter/foundation.dart';
import 'package:flutter/services.dart';
import 'camera/input_image.dart';
import 'package:apple_vision_commons/apple_vision_commons.dart';

void main() {
runApp(const MyApp());
@@ -61,7 +62,15 @@ class _VisionRT extends State<VisionRT>{
}
if(mounted) {
Uint8List? image = i.bytes;
visionController.processImage(image!, imageSize).then((data){
visionController.processImage(
RecognizeTextData(
image: image!,
imageSize: imageSize,
recognitionLevel: RecognitionLevel.accurate,
languages: [const Locale('en', 'US')],
automaticallyDetectsLanguage: true,
)
).then((data){
textData = data;
setState(() {

@@ -28,5 +28,7 @@
<string>MainMenu</string>
<key>NSPrincipalClass</key>
<string>NSApplication</string>
<key>NSCameraUsageDescription</key>
<string>This example needs camera access to capture images for text recognition.</string>
</dict>
</plist>
@@ -1,4 +1,4 @@
library apple_vision_recognize_text;

export 'src/recognize_text_controller.dart';

export 'src/recognize_text_info.dart';
@@ -1,5 +1,6 @@
import 'dart:async';

import 'package:apple_vision_recognize_text/src/recognize_text_info.dart';
import 'package:flutter/cupertino.dart';
import 'package:flutter/services.dart';
import 'package:apple_vision_commons/apple_vision_commons.dart';
@@ -19,25 +20,26 @@ class AppleVisionRecognizeTextController {

/// Process the image using apple vision and return the requested information or null value
///
/// [image] as Uint8List is the image that needs to be processed
/// this needs to be in an image format raw will not work.
///
/// [imageSize] as Size is the size of the image that is being processed
///
/// [orientation] The orientation of the image
Future<List<RecognizedText>?> processImage(Uint8List image, Size imageSize,[ImageOrientation orientation = ImageOrientation.down]) async{
/// [data] contains all the information to send to the method
Future<List<RecognizedText>?> processImage(RecognizeTextData data) async{
try {
final data = await _methodChannel.invokeMapMethod<String, dynamic>(
final returnedData = await _methodChannel.invokeMapMethod<String, dynamic>(
'process',
{'image':image,
'width': imageSize.width,
'height':imageSize.height,
{
'image': data.image,
'width': data.imageSize.width,
'height': data.imageSize.height,
'candidates': numberOfCandidates,
'orientation': orientation.name
//'languages': languages
'orientation': data.orientation.name,
'recognitionLevel': data.recognitionLevel.name,
'dispatch': data.dispatch.name,
'languages': data.languages?.map((locale) => locale.toLanguageTag()).toList(),
'automaticallyDetectsLanguage': data.automaticallyDetectsLanguage,
},
);
return _convertData(data);
return _convertData(returnedData);
} catch (e) {
debugPrint('$e');
}
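
The languages option is sent across the method channel as BCP-47 language tags rather than Locale objects; a small sketch of that conversion (variable names here are illustrative):

const locales = <Locale>[Locale('en', 'US'), Locale('de')];
final tags = locales.map((locale) => locale.toLanguageTag()).toList();
// tags == ['en-US', 'de'], which is the `languages` list the Swift handler receives.
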
@@ -0,0 +1,43 @@
import 'package:apple_vision_commons/apple_vision_commons.dart';
import 'package:flutter/cupertino.dart';
import 'package:flutter/services.dart';

/// A value that determines whether the request prioritizes accuracy or speed in text recognition.
enum RecognitionLevel {fast,accurate}
/// The dispatch queue quality of service requested for running the Vision request on the native side.
enum Dispatch {defaultQueue,background,unspecified,userInitiated,userInteractive,utility}

class RecognizeTextData{
/// Process the image using apple vision and return the requested information or null value
///
/// [image] as Uint8List is the image that needs to be processed
/// this needs to be in an image format; raw data will not work.
///
/// [imageSize] as Size is the size of the image that is being processed
///
/// [dispatch] the dispatch queue quality of service used when running the request, allowing the work to be handled in the background
///
/// [recognitionLevel] whether the request prioritizes speed or accuracy when recognizing text
///
/// [orientation] the orientation of the image being processed
///
/// [languages] An array of locales to detect, in priority order.
///
/// [automaticallyDetectsLanguage] A Boolean value that indicates whether to attempt detecting the language, so the appropriate model is used for recognition and language correction. (Only available on iOS 16.0 / macOS 13.0 or newer.)
RecognizeTextData({
required this.image,
required this.imageSize,
this.orientation = ImageOrientation.up,
this.dispatch = Dispatch.defaultQueue,
this.recognitionLevel = RecognitionLevel.fast,
this.automaticallyDetectsLanguage = false,
this.languages
});

Uint8List image;
Size imageSize;
Dispatch dispatch;
RecognitionLevel recognitionLevel;
ImageOrientation orientation;
bool automaticallyDetectsLanguage;
List<Locale>? languages;
}
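
A short usage sketch of the new options; field names and defaults come from the class above, while the image bytes and the visionController instance (an AppleVisionRecognizeTextController, as in the example app) are assumed:

final request = RecognizeTextData(
  image: imageBytes,                           // encoded image bytes, e.g. a PNG/JPEG frame (assumed source)
  imageSize: const Size(1280, 720),
  orientation: ImageOrientation.up,            // default
  dispatch: Dispatch.background,               // requested queue quality of service on the native side
  recognitionLevel: RecognitionLevel.accurate, // default is RecognitionLevel.fast
  languages: [const Locale('en', 'US')],       // priority order
  automaticallyDetectsLanguage: false,         // only honored on iOS 16+ / macOS 13+
);
final recognized = await visionController.processImage(request);
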
2 changes: 1 addition & 1 deletion packages/apple_vision_recognize_text/pubspec.yaml
@@ -1,6 +1,6 @@
name: apple_vision_recognize_text
description: A Flutter plugin to use Apple Vision Recognize Text in an image or live camera feed.
version: 0.0.2
version: 0.0.3
homepage: https://github.com/Knightro63/apple_vision/tree/main/packages/apple_vision_recognize_text

environment:
4 changes: 0 additions & 4 deletions packages/apple_vision_saliency/CHANGELOG.md
@@ -1,7 +1,3 @@
## 0.0.2

* Added image orientation

## 0.0.1

* Initial release
@@ -258,7 +258,7 @@
isa = PBXProject;
attributes = {
LastSwiftUpdateCheck = 0920;
LastUpgradeCheck = 1430;
LastUpgradeCheck = 1300;
ORGANIZATIONNAME = "";
TargetAttributes = {
331C80D4294CF70F00263BE5 = {