diff --git a/_posts/zmediumtomarkdown/2018-10-16-9a9aa892f9a9.md b/_posts/zmediumtomarkdown/2018-10-16-9a9aa892f9a9.md
index 35fd8ad00..9e507e9c4 100644
--- a/_posts/zmediumtomarkdown/2018-10-16-9a9aa892f9a9.md
+++ b/_posts/zmediumtomarkdown/2018-10-16-9a9aa892f9a9.md
@@ -2,7 +2,7 @@ title: "Vision 初探 — APP 頭像上傳 自動識別人臉裁圖 (Swift)"
 author: "ZhgChgLi"
 date: 2018-10-16T16:01:24.511+0000
-last_modified_at: 2024-04-13T07:14:30.996+0000
+last_modified_at: 2024-08-13T08:17:24.185+0000
 categories: "ZRealm Dev."
 tags: ["swift","machine-learning","facedetection","ios","ios-app-development"]
 description: "Vision 實戰應用"
@@ -15,6 +15,9 @@ render_with_liquid: false
 
 Vision 實戰應用
 
+### \[2024/08/13 Update\]
+- Please refer to the new article and the new API: "[iOS Vision framework x WWDC 24 Discover Swift enhancements in the Vision framework Session](../755509180ca8/)"
+
 #### 一樣不多說,先上一張成品圖:
 
diff --git a/_posts/zmediumtomarkdown/2024-08-13-755509180ca8.md b/_posts/zmediumtomarkdown/2024-08-13-755509180ca8.md
new file mode 100644
index 000000000..fa58172cb
--- /dev/null
+++ b/_posts/zmediumtomarkdown/2024-08-13-755509180ca8.md
@@ -0,0 +1,217 @@
---
title: "iOS Vision framework x WWDC 24 Discover Swift enhancements in the Vision framework Session"
author: "ZhgChgLi"
date: 2024-08-13T08:10:37.015+0000
last_modified_at: 2024-08-13T08:22:23.496+0000
categories: "KKday Tech Blog"
tags: ["ios-app-development","vision-framework","apple-intelligence","ai","machine-learning"]
description: "A review of Vision framework features & hands-on with the new iOS 18 Swift API"
image:
  path: /assets/755509180ca8/1*NqN-_MAE4tt11n6MnUQWxQ.jpeg
render_with_liquid: false
---

### iOS Vision framework x WWDC 24 Discover Swift enhancements in the Vision framework Session

A review of Vision framework features & hands-on with the new iOS 18 Swift API


![Photo by [BoliviaInteligente](https://unsplash.com/@boliviainteligente?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash){:target="_blank"}](/assets/755509180ca8/1*NqN-_MAE4tt11n6MnUQWxQ.jpeg)

Photo by [BoliviaInteligente](https://unsplash.com/@boliviainteligente?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash){:target="_blank"}
#### Topic


![Its relationship to Vision Pro is the same as a hot dog's relationship to a dog: none at all.](/assets/755509180ca8/1*ebqm2jzCK1GSrDDY0XtrUA.png)

Its relationship to Vision Pro is the same as a hot dog's relationship to a dog: none at all.
### Vision framework

The Vision framework is Apple's machine-learning-based image analysis framework, letting developers implement common image recognition features quickly and easily. Vision shipped back in iOS 11\.0\+ \(2017 / iPhone 8\) and has been iterated on and optimized ever since, including tighter integration with Swift Concurrency for better performance; starting with iOS 18\.0 it offers a brand-new Swift Vision framework API that gets the most out of Swift Concurrency.

**Vision framework highlights**
- Many built-in image recognition and motion tracking requests \(31 in total as of iOS 18\)
- On\-Device: runs purely on the device's chip, with no dependency on cloud services — fast and private
- Simple, easy-to-use API
- Supported on every Apple platform: iOS 11\.0\+, iPadOS 11\.0\+, Mac Catalyst 13\.0\+, macOS 10\.13\+, tvOS 11\.0\+, visionOS 1\.0\+
- Available for many years \(2017–present\) and still actively updated
- Integrates Swift language features to improve performance



> **_I played with it briefly six years ago: [Vision 初探 — APP 頭像上傳 自動識別人臉裁圖 \(Swift\)](../9a9aa892f9a9/)_** 




> _This time, following the [WWDC 24 Discover Swift enhancements in the Vision framework Session](https://developer.apple.com/videos/play/wwdc2024/10163/){:target="_blank"}, I revisit the framework and try it again with the new Swift features._ 




#### CoreML

Apple has another framework called [CoreML](https://developer.apple.com/documentation/coreml){:target="_blank"}, likewise an on\-device, chip-based machine learning framework; the difference is that it lets you train your own models for the objects or documents you want to recognize and ship them inside the app. If you're interested, give it a try. \(e\.g\. [real-time article classification](../793bf2cdda0f/), real-time [spam message detection](https://apps.apple.com/tw/app/%E7%86%8A%E7%8C%AB%E5%90%83%E7%9F%AD%E4%BF%A1-%E5%9E%83%E5%9C%BE%E7%9F%AD%E4%BF%A1%E8%BF%87%E6%BB%A4/id1319191852){:target="_blank"}…\)
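Below is a minimal sketch of how the two frameworks meet in practice — assuming a hypothetical `MyClassifier.mlmodel` bundled with the app \(the model name and label handling are placeholders of mine, not from the session\) — driving a custom Core ML model through Vision's `VNCoreMLRequest`:
```swift
import CoreML
import UIKit
import Vision

// Sketch: run a custom Core ML classifier through the (pre-iOS 18) Vision API.
// "MyClassifier" is a hypothetical model class generated by Xcode from a bundled .mlmodel.
func classify(_ image: UIImage) throws {
    guard let cgImage = image.cgImage else { return }

    let mlModel = try MyClassifier(configuration: MLModelConfiguration()).model
    let visionModel = try VNCoreMLModel(for: mlModel)

    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        // Classification results come back as VNClassificationObservation values.
        guard let top = (request.results as? [VNClassificationObservation])?.first else { return }
        print("\(top.identifier): \(top.confidence)")
    }

    try VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])
}
```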
#### p\.s\.

[**Vision**](https://developer.apple.com/documentation/vision/){:target="_blank"} **v\.s\. [VisionKit](https://developer.apple.com/documentation/visionkit){:target="_blank"}:**


> [**_Vision_**](https://developer.apple.com/documentation/vision/){:target="_blank"} _: mainly for image analysis tasks such as face detection, barcode detection, and text recognition. It provides a powerful API for processing and analyzing the visual content of still images or video._ 




> [**_VisionKit_**](https://developer.apple.com/documentation/visionkit){:target="_blank"} _: dedicated to document-scanning tasks. It provides a scanner view controller that can be used to scan documents and produce high-quality PDFs or images._ 




The Vision framework cannot run in the simulator on M1 Macs; you can only test against a physical device. Running it in the simulator throws a `Could not create Espresso context` error, and the [official forum discussion has no answer](https://forums.developer.apple.com/forums/thread/675806){:target="_blank"}.


> _Since I don't have a physical iOS 18 device to test with, all of the results in this article were produced with the old \(pre-iOS 18\) API; **if any of the new-style code turns out to be wrong, please let me know in the comments**._ 




### WWDC 2024 — Discover Swift enhancements in the Vision framework


![[Discover Swift enhancements in the Vision framework](https://developer.apple.com/videos/play/wwdc2024/10163/?time=45){:target="_blank"}](/assets/755509180ca8/1*8N5GtY1uqxP-4iAAAticOA.png)

[Discover Swift enhancements in the Vision framework](https://developer.apple.com/videos/play/wwdc2024/10163/?time=45){:target="_blank"}


> _This article is my notes on the WWDC 24 — [Discover Swift enhancements in the Vision framework](https://developer.apple.com/videos/play/wwdc2024/10163/?time=45){:target="_blank"} session, plus some findings from my own experiments._ 




### Introduction — Vision framework Features
#### Face detection & face-landmark recognition


![](/assets/755509180ca8/1*RNGfE_EeaQhiKAPdJeFYQw.png)



![](/assets/755509180ca8/1*iMdzeLm2aWjATVV6_Kvrjg.png)

#### Text recognition in images

As of iOS 18, 18 languages are supported.


![](/assets/755509180ca8/1*kU_OYn5w368h-ahDYU4lDw.png)

```swift
// List of supported recognition languages
if #available(iOS 18.0, *) {
    print(RecognizeTextRequest().supportedRecognitionLanguages.map { "\($0.languageCode!)-\(($0.region?.identifier ?? $0.script?.identifier)!)" })
} else {
    print(try! VNRecognizeTextRequest().supportedRecognitionLanguages())
}

// Treat this output as the source of truth for which languages are actually usable.
// Actual output on iOS 18:
// ["en-US", "fr-FR", "it-IT", "de-DE", "es-ES", "pt-BR", "zh-Hans", "zh-Hant", "yue-Hans", "yue-Hant", "ko-KR", "ja-JP", "ru-RU", "uk-UA", "th-TH", "vi-VT", "ar-SA", "ars-SA"]
// Swedish, mentioned at WWDC, is not in the list — unclear whether it hasn't shipped yet or depends on the device's region/language settings
```
#### Motion capture


![](/assets/755509180ca8/1*6TfyCcszdD1NdId0bdM16Q.gif)



![](/assets/755509180ca8/1*8y_XXdH36uKpfP0p6BCJQA.gif)

- Supports motion tracking of people and objects
- Hand-gesture capture can be used to implement signing in the air

#### What's new in Vision? \(iOS 18\) — Image aesthetics scoring \(quality & memorability\)
- Computes scores for an input image, making it easy to pick out the best photos
- The score covers multiple dimensions — not just image quality but also lighting, angle, subject, **whether the shot feels memorable**, and so on



![](/assets/755509180ca8/1*XwjeaHcB6arxJhIR7cFsWg.png)



![](/assets/755509180ca8/1*YdhZlZBlTaIZd4nLxhBtaQ.png)



![](/assets/755509180ca8/1*IhMDFdk6DWwTv1qIG0Gi0Q.png)


The WWDC session uses the three images above \(all of the same image quality\) as examples:
- High-scoring image: good framing, good lighting, memorable
- Low-scoring image: no clear subject, looks like a casual or accidental shot
- Utility image: technically well shot but not memorable, like a stock-library photo


**iOS ≥ 18 New API: [CalculateImageAestheticsScoresRequest](https://developer.apple.com/documentation/vision/calculateimageaestheticsscoresrequest){:target="_blank"}**
```swift
let request = CalculateImageAestheticsScoresRequest()
let result = try await request.perform(on: URL(string: "https://zhgchg.li/assets/cb65fd5ab770/1*yL3vI1ADzwlovctW5WQgJw.jpeg")!)

// Overall photo score
print(result.overallScore)

// Whether the image was judged to be a utility image
print(result.isUtility)
```
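As a usage sketch building on the request above — the batch-filtering idea and the 0\.5 threshold are example values of my own, not anything Apple prescribes — this is roughly how the scores could be used to keep only the memorable, non-utility photos:
```swift
import Vision

// Sketch: keep only non-utility photos whose overall score clears a chosen threshold.
// The threshold (0.5) is an arbitrary example value.
@available(iOS 18.0, *)
func filterMemorablePhotos(from urls: [URL]) async throws -> [URL] {
    let request = CalculateImageAestheticsScoresRequest()
    var keep: [URL] = []
    for url in urls {
        let observation = try await request.perform(on: url)
        if observation.overallScore > 0.5 && !observation.isUtility {
            keep.append(url)
        }
    }
    return keep
}
```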
#### What's new in Vision? \(iOS 18\) — Simultaneous body and hand pose detection


![](/assets/755509180ca8/1*A9320aRV-jdccgiXrmSrJw.png)


Previously, body pose and hand pose could only be detected separately; with this update developers can detect body pose \+ hand pose at the same time, combined into a single request and result, which makes it easier to build richer features on top.

**iOS ≥ 18 New API: [DetectHumanBodyPoseRequest](https://developer.apple.com/documentation/vision/detecthumanbodyposerequest){:target="_blank"}**
```swift
var request = DetectHumanBodyPoseRequest()
// Detect hand pose as well
request.detectsHands = true

guard let bodyPose = try await request.perform(on: image).first else { return }

// Body pose joints
let bodyJoints = bodyPose.allJoints()
// Left-hand pose joints
let leftHandJoints = bodyPose.leftHand.allJoints()
// Right-hand pose joints
let rightHandJoints = bodyPose.rightHand.allJoints()
```
### New Vision API

In this update Apple ships a new Swift Vision API wrapper for developers. Besides covering the existing functionality, it focuses on Swift 6 / Swift Concurrency support, offering better performance and a more Swift-like way of working with the API.
### Get started with Vision


![](/assets/755509180ca8/1*mv9g5jmqrS6YScxoGYJemQ.png)



![](/assets/755509180ca8/1*iidNN7nKHoskh_tcjfuHKQ.png)


Here the presenter walks through the basics of the Vision framework again: Apple provides [31](https://developer.apple.com/documentation/vision/visionrequest){:target="_blank"} \(as of iOS 18\) common image recognition "Request" types, each paired with an "Observation" object returned as the result.
1. **Request:** DetectFaceRectanglesRequest — face-region detection request
**Result:** FaceObservation
My earlier article "[Vision 初探 — APP 頭像上傳 自動識別人臉裁圖 \(Swift\)](../9a9aa892f9a9/)" uses exactly this pair.
2. **Request:** RecognizeTextRequest — text recognition request
**Result:** RecognizedTextObservation
3. **Request:** GenerateObjectnessBasedSaliencyImageRequest — objectness-based saliency \(main subject\) request
**Result:** SaliencyImageObservation

### All 31 Request types:

[VisionRequest](https://developer.apple.com/documentation/vision/visionrequest){:target="_blank"}.
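To wrap up, here is a small sketch of the new request/observation flow end to end using `DetectFaceRectanglesRequest` — written against the new iOS 18 API as documented; since I still don't have a physical iOS 18 device, treat it as untested:
```swift
import Foundation
import Vision

// Sketch of the new iOS 18 Swift Vision API flow:
// create a request, perform it on an image, then read the strongly typed observations.
@available(iOS 18.0, *)
func detectFaces(in imageURL: URL) async throws {
    let request = DetectFaceRectanglesRequest()
    let faceObservations = try await request.perform(on: imageURL)

    for face in faceObservations {
        // boundingBox is a normalized region; convert it to image coordinates as needed.
        print("Found a face at \(face.boundingBox)")
    }
}
```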
diff --git a/assets/755509180ca8/1*6TfyCcszdD1NdId0bdM16Q.gif b/assets/755509180ca8/1*6TfyCcszdD1NdId0bdM16Q.gif
new file mode 100644
index 000000000..4435ce4c1
Binary files /dev/null and b/assets/755509180ca8/1*6TfyCcszdD1NdId0bdM16Q.gif differ
diff --git a/assets/755509180ca8/1*8N5GtY1uqxP-4iAAAticOA.png b/assets/755509180ca8/1*8N5GtY1uqxP-4iAAAticOA.png
new file mode 100644
index 000000000..d5ae0883d
Binary files /dev/null and b/assets/755509180ca8/1*8N5GtY1uqxP-4iAAAticOA.png differ
diff --git a/assets/755509180ca8/1*8y_XXdH36uKpfP0p6BCJQA.gif b/assets/755509180ca8/1*8y_XXdH36uKpfP0p6BCJQA.gif
new file mode 100644
index 000000000..6aac2d908
Binary files /dev/null and b/assets/755509180ca8/1*8y_XXdH36uKpfP0p6BCJQA.gif differ
diff --git a/assets/755509180ca8/1*A9320aRV-jdccgiXrmSrJw.png b/assets/755509180ca8/1*A9320aRV-jdccgiXrmSrJw.png
new file mode 100644
index 000000000..e781f0b7e
Binary files /dev/null and b/assets/755509180ca8/1*A9320aRV-jdccgiXrmSrJw.png differ
diff --git a/assets/755509180ca8/1*IhMDFdk6DWwTv1qIG0Gi0Q.png b/assets/755509180ca8/1*IhMDFdk6DWwTv1qIG0Gi0Q.png
new file mode 100644
index 000000000..f1c354ca5
Binary files /dev/null and b/assets/755509180ca8/1*IhMDFdk6DWwTv1qIG0Gi0Q.png differ
diff --git a/assets/755509180ca8/1*NqN-_MAE4tt11n6MnUQWxQ.jpeg b/assets/755509180ca8/1*NqN-_MAE4tt11n6MnUQWxQ.jpeg
new file mode 100644
index 000000000..2b8891163
Binary files /dev/null and b/assets/755509180ca8/1*NqN-_MAE4tt11n6MnUQWxQ.jpeg differ
diff --git a/assets/755509180ca8/1*RNGfE_EeaQhiKAPdJeFYQw.png b/assets/755509180ca8/1*RNGfE_EeaQhiKAPdJeFYQw.png
new file mode 100644
index 000000000..48224a573
Binary files /dev/null and b/assets/755509180ca8/1*RNGfE_EeaQhiKAPdJeFYQw.png differ
diff --git a/assets/755509180ca8/1*XwjeaHcB6arxJhIR7cFsWg.png b/assets/755509180ca8/1*XwjeaHcB6arxJhIR7cFsWg.png
new file mode 100644
index 000000000..657f313e7
Binary files /dev/null and b/assets/755509180ca8/1*XwjeaHcB6arxJhIR7cFsWg.png differ
diff --git a/assets/755509180ca8/1*YdhZlZBlTaIZd4nLxhBtaQ.png b/assets/755509180ca8/1*YdhZlZBlTaIZd4nLxhBtaQ.png
new file mode 100644
index 000000000..0569111d5
Binary files /dev/null and b/assets/755509180ca8/1*YdhZlZBlTaIZd4nLxhBtaQ.png differ
diff --git a/assets/755509180ca8/1*ebqm2jzCK1GSrDDY0XtrUA.png b/assets/755509180ca8/1*ebqm2jzCK1GSrDDY0XtrUA.png
new file mode 100644
index 000000000..2e6da9a88
Binary files /dev/null and b/assets/755509180ca8/1*ebqm2jzCK1GSrDDY0XtrUA.png differ
diff --git a/assets/755509180ca8/1*iMdzeLm2aWjATVV6_Kvrjg.png b/assets/755509180ca8/1*iMdzeLm2aWjATVV6_Kvrjg.png
new file mode 100644
index 000000000..f2e6fd475
Binary files /dev/null and b/assets/755509180ca8/1*iMdzeLm2aWjATVV6_Kvrjg.png differ
diff --git a/assets/755509180ca8/1*iidNN7nKHoskh_tcjfuHKQ.png b/assets/755509180ca8/1*iidNN7nKHoskh_tcjfuHKQ.png
new file mode 100644
index 000000000..8c9219d0d
Binary files /dev/null and b/assets/755509180ca8/1*iidNN7nKHoskh_tcjfuHKQ.png differ
diff --git a/assets/755509180ca8/1*kU_OYn5w368h-ahDYU4lDw.png b/assets/755509180ca8/1*kU_OYn5w368h-ahDYU4lDw.png
new file mode 100644
index 000000000..e70827419
Binary files /dev/null and b/assets/755509180ca8/1*kU_OYn5w368h-ahDYU4lDw.png differ
diff --git a/assets/755509180ca8/1*mv9g5jmqrS6YScxoGYJemQ.png b/assets/755509180ca8/1*mv9g5jmqrS6YScxoGYJemQ.png
new file mode 100644
index 000000000..a80c22545
Binary files /dev/null and b/assets/755509180ca8/1*mv9g5jmqrS6YScxoGYJemQ.png differ